kernel_optimize_test

Author	SHA1	Message	Date
Jesse Gross	f0b128c1e2	openvswitch: Wrap struct ovs_key_ipv4_tunnel in a new structure. Currently, the flow information that is matched for tunnels and the tunnel data passed around with packets is the same. However, as additional information is added this is not necessarily desirable, as in the case of pointers. This adds a new structure for tunnel metadata which currently contains only the existing struct. This change is purely internal to the kernel since the current OVS_KEY_ATTR_IPV4_TUNNEL is simply a compressed version of OVS_KEY_ATTR_TUNNEL that is translated at flow setup. Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-06 00:32:20 -04:00
Jesse Gross	67fa034194	openvswitch: Add support for matching on OAM packets. Some tunnel formats have mechanisms for indicating that packets are OAM frames that should be handled specially (either as high priority or not forwarded beyond an endpoint). This provides support for allowing those types of packets to be matched. Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-06 00:32:20 -04:00
Jesse Gross	0714812134	openvswitch: Eliminate memset() from flow_extract. As new protocols are added, the size of the flow key tends to increase although few protocols care about all of the fields. In order to optimize this for hashing and matching, OVS uses a variable length portion of the key. However, when fields are extracted from the packet we must still zero out the entire key. This is no longer necessary now that OVS implements masking. Any fields (or holes in the structure) which are not part of a given protocol will be by definition not part of the mask and zeroed out during lookup. Furthermore, since masking already uses variable length keys this zeroing operation automatically benefits as well. In principle, the only thing that needs to be done at this point is remove the memset() at the beginning of flow. However, some fields assume that they are initialized to zero, which now must be done explicitly. In addition, in the event of an error we must also zero out corresponding fields to signal that there is no valid data present. These increase the total amount of code but very little of it is executed in non-error situations. Removing the memset() reduces the profile of ovs_flow_extract() from 0.64% to 0.56% when tested with large packets on a 10G link. Suggested-by: Pravin Shelar <pshelar@nicira.com> Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: Andy Zhou <azhou@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-06 00:32:20 -04:00
Andy Zhou	0b5e8b8eea	net: Add Geneve tunneling protocol driver This adds a device level support for Geneve -- Generic Network Virtualization Encapsulation. The protocol is documented at http://tools.ietf.org/html/draft-gross-geneve-01 Only protocol layer Geneve support is provided by this driver. Openvswitch can be used for configuring, set up and tear down functional Geneve tunnels. Signed-off-by: Jesse Gross <jesse@nicira.com> Signed-off-by: Andy Zhou <azhou@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-06 00:32:20 -04:00
Frank Li	c259c132ad	net: fec: fix build error at m68k platform reproduce: wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross git checkout `1b7bde6d65` make.cross ARCH=m68k m5275evb_defconfig make.cross ARCH=m68k All error/warnings: drivers/net/ethernet/freescale/fec_main.c: In function 'fec_enet_rx_queue': >> drivers/net/ethernet/freescale/fec_main.c:1470:3: error: implicit declaration of function 'prefetch' [-Werror=implicit-function-declaration] prefetch(skb->data - NET_IP_ALIGN); ^ cc1: some warnings being treated as errors missed included prefetch.h Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Frank Li <Frank.Li@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-06 00:19:13 -04:00
Petri Gynther	5ad6e6c508	net: bcmgenet: improve bcmgenet_mii_setup() bcmgenet_mii_setup() is called from the PHY state machine every 1-2 seconds when the PHYs are in PHY_POLL mode. Improve bcmgenet_mii_setup() so that it touches the MAC registers only when the link is up and there was a change to link, speed, duplex, or pause status. Signed-off-by: Petri Gynther <pgynther@google.com> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-06 00:18:09 -04:00
David S. Miller	f13909cdab	Merge branch 'altera_tse' Walter Lozano says: ==================== Altera TSE with no PHY In some scenarios there is no PHY chip present, for example in optical links. This serie of patches moves PHY get addr and MDIO create to a new function and avoids PHY and MDIO probing in these cases. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-05 21:39:58 -04:00
Walter Lozano	3354313e50	Altera TSE: Add support for no PHY This patch avoids PHY and MDIO probing if no PHY chip is present. This is the case mainly in optical links where there is no need for PHY chip, and therefore no need of MDIO. In this scenario Ethernet MAC is directly connected to an optical module through an external SFP transceiver. Signed-off-by: Walter Lozano <walter@vanguardiasur.com.ar> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-05 21:39:40 -04:00
Walter Lozano	004fa11861	Altera TSE: Move PHY get addr and MDIO create Move PHY get addr and MDIO create to a new function to improve readability and make it easier to avoid its usage. This will be useful for example in the case where there is no PHY chip. Signed-off-by: Walter Lozano <walter@vanguardiasur.com.ar> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-05 21:39:40 -04:00
David S. Miller	a4b4a2b7f9	Merge tag 'master-2014-10-02' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next John W. Linville says: ==================== pull request: wireless-next 2014-10-03 Please pull tihs batch of updates intended for the 3.18 stream! For the iwlwifi bits, Emmanuel says: "I have here a few things that depend on the latest mac80211's changes: RRM, TPC, Quiet Period etc... Eyal keeps improving our rate control and we have a new device ID. This last patch should probably have gone to wireless.git, but at that stage, I preferred to send it to -next and CC stable." For (most of) the Atheros bits, Kalle says: "The only new feature is testmode support from me. Ben added a new method to crash the firmware with an assert for debug purposes. As usual, we have lots of smaller fixes from Michal. Matteo fixed a Kconfig dependency with debugfs. I fixed some warnings recently added to checkpatch." For the NFC bits, Samuel says: "We've had major updates for TI and ST Microelectronics drivers, and a few NCI related changes. For TI's trf7970a driver: - Target mode support for trf7970a - Suspend/resume support for trf7970a - DT properties additions to handle different quirks - A bunch of fixes for smartphone IOP related issues For ST Microelectronics' ST21NFCA and ST21NFCB drivers: - ISO15693 support for st21nfcb - checkpatch and sparse related warning fixes - Code cleanups and a few minor fixes Finally, Marvell added ISO15693 support to the NCI stack, together with a couple of NCI fixes." For the Bluetooth bits, Johan says: "This 3.18 pull request replaces the one I did on Monday ("bluetooth-next 2014-09-22", which hasn't been pulled yet). The additions since the last request are: - SCO connection fix for devices not supporting eSCO - Cleanups regarding the SCO establishment logic - Remove unnecessary return value from logging functions - Header compression fix for 6lowpan - Cleanups to the ieee802154/mrf24j40 driver Here's a copy from previous request that this one replaces: ' Here are some more patches for 3.18. They include various fixes to the btusb HCI driver, a fix for LE SMP, as well as adding Jukka to the MAINTAINERS file for generic 6LoWPAN (as requested by Alexander Aring). I've held on to this pull request a bit since we were waiting for a SCO related fix to get sorted out first. However, since the merge window is getting closer I decided not to wait for it. If we do get the fix sorted out there'll probably be a second small pull request later this week. '" And, "Unless 3.17 gets delayed this will probably be our last -next pull request for 3.18. We've got: - New Marvell hardware supportr - Multicast support for 6lowpan - Several of 6lowpan fixes & cleanups - Fix for a (false-positive) lockdep warning in L2CAP - Minor btusb cleanup" On top of all that comes the usual sort of updates to ath5k, ath9k, ath10k, brcmfmac, mwifiex, and wil6210. This time around there are also a number of rtlwifi updates to enable some new hardware and to reconcile the in-kernel drivers with some newer releases of the Realtek vendor drivers. Also of note is some device tree work for the bcma bus. Please let me know if there are problems! ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-05 21:34:39 -04:00
David S. Miller	61b37d2f54	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next Pablo Neira Ayuso says: ==================== Netfilter/IPVS updates for net-next The following patchset contains another batch with Netfilter/IPVS updates for net-next, they are: 1) Add abstracted ICMP codes to the nf_tables reject expression. We introduce four reasons to reject using ICMP that overlap in IPv4 and IPv6 from the semantic point of view. This should simplify the maintainance of dual stack rule-sets through the inet table. 2) Move nf_send_reset() functions from header files to per-family nf_reject modules, suggested by Patrick McHardy. 3) We have to use IS_ENABLED(CONFIG_BRIDGE_NETFILTER) everywhere in the code now that br_netfilter can be modularized. Convert remaining spots in the network stack code. 4) Use rcu_barrier() in the nf_tables module removal path to ensure that we don't leave object that are still pending to be released via call_rcu (that may likely result in a crash). 5) Remove incomplete arch 32/64 compat from nft_compat. The original (bad) idea was to probe the word size based on the xtables match/target info size, but this assumption is wrong when you have to dump the information back to userspace. 6) Allow to filter from prerouting and postrouting in the nf_tables bridge. In order to emulate the ebtables NAT chains (which are actually simple filter chains with no special semantics), we have support filtering from this hooks too. 7) Add explicit module dependency between xt_physdev and br_netfilter. This provides a way to detect if the user needs br_netfilter from the configuration path. This should reduce the breakage of the br_netfilter modularization. 8) Cleanup coding style in ip_vs.h, from Simon Horman. 9) Fix crash in the recently added nf_tables masq expression. We have to register/unregister the notifiers to clean up the conntrack table entries from the module init/exit path, not from the rule addition / deletion path. From Arturo Borrero. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-05 21:32:37 -04:00
David S. Miller	ad9eef5208	Merge branch 'bridge_default_pvid' Vladislav Yasevich says: ==================== bridge: Add vlan filtering support for default pvid This series adds default pvid support to vlan filtering in the bridge. VLAN 1 (as recommended by 802.1q spec) is used as default pvid on ports. Then the user can over-ride this configuration by configuring their own vlan information. The user can additionally change the default value through the sysfs interface (netlink coming shortly). The user can turn off default pvid functionality by setting default pvid to 0. This series changes the default behavior of the bridge when vlan filtering is turned on. Currently, ports without any vlan filtering configured will not recevie any traffic at all. This patch changes the behavior of the above ports to receive only untagged traffic. Since v3: - allocated 'changed' bitmap on the heap and re-arrange code to clean it up. - remove extra blank lines. - Fix patch1 to build by itself. - Fix error recover to not add vlan 0. - Restructure nbp_vlan_init to remove uneeded variable. Since v2: - Fix handling of invalid values in sysfs interface. - Add some additional log messages. - Fix default_pvid handling when vlan filtering is compiled out. - Fix sparse issues with new code. - Fix how we located the old default pvid (added a helper function). Since v1: - Add ability to turn off default_pvid settings. - Drop the automiatic filtering support based on configured vlan devices (will be its own series) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-05 21:21:44 -04:00
Vlad Yasevich	5be5a2df40	bridge: Add filtering support for default_pvid Currently when vlan filtering is turned on on the bridge, the bridge will drop all traffic untill the user configures the filter. This isn't very nice for ports that don't care about vlans and just want untagged traffic. A concept of a default_pvid was recently introduced. This patch adds filtering support for default_pvid. Now, ports that don't care about vlans and don't define there own filter will belong to the VLAN of the default_pvid and continue to receive untagged traffic. This filtering can be disabled by setting default_pvid to 0. Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-05 21:21:37 -04:00
Vlad Yasevich	3df6bf45ec	bridge: Simplify pvid checks. Currently, if the pvid is not set, we return an illegal vlan value even though the pvid value is set to 0. Since pvid of 0 is currently invalid, just return 0 instead. This makes the current and future checks simpler. Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-05 21:21:36 -04:00
Vlad Yasevich	96a20d9d7f	bridge: Add a default_pvid sysfs attribute This patch allows the user to set and retrieve default_pvid value. A new value can only be stored when vlan filtering is disabled. Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-05 21:21:36 -04:00
Antoine Ténart	e885439f37	net: pxa168_eth: avoid using signed char for bitops Signedness bugs may occur when using signed char for bitops, depending on if the highest bit is ever used. Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-05 21:18:56 -04:00
David S. Miller	5555dfdc0f	Merge branch 'isdn-next' Tilman Schmidt says: ==================== ISDN patches for net-next Here's a series of patches for the ISDN CAPI subsystem and the Gigaset ISDN driver. Please merge via net-next. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-05 21:17:56 -04:00
Tilman Schmidt	7b0c67e495	isdn/gigaset: use USB API function usb_endpoint_num() Use function usb_endpoint_num() for the bulk endpoint and store the endpoint number in the cardstate structure instead of the raw endpoint address value. Signed-off-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-05 21:17:52 -04:00
Tilman Schmidt	434d13ba39	isdn/gigaset: drop unused cardstate structure member Field int_in_endpointAddr was set but never used. Drop it. Signed-off-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-05 21:17:52 -04:00
Tilman Schmidt	5dcd7d8439	isdn/gigaset: improve error handling when leaving DLE mode Avoid cascading warnings when leaving DLE mode fails by clearing the DLE flag before entering recovery. Signed-off-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-05 21:17:52 -04:00
Tilman Schmidt	51db998fb6	isdn/capi: drop two dead if branches The last branch in command_2_index() cannot be reached since c==0xff is already caught by the first "if". The empty second branch makes no difference since no other branch will be taken for c<0x0f. Signed-off-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-05 21:17:51 -04:00
John Fastabend	1e203c1a2c	net: sched: suspicious RCU usage in qdisc_watchdog Suspicious RCU usage in qdisc_watchdog call needs to be done inside rcu_read_lock/rcu_read_unlock. And then Qdisc destroy operations need to ensure timer is cancelled before removing qdisc structure. [ 3992.191339] =============================== [ 3992.191340] [ INFO: suspicious RCU usage. ] [ 3992.191343] 3.17.0-rc6net-next+ #72 Not tainted [ 3992.191345] ------------------------------- [ 3992.191347] include/net/sch_generic.h:272 suspicious rcu_dereference_check() usage! [ 3992.191348] [ 3992.191348] other info that might help us debug this: [ 3992.191348] [ 3992.191351] [ 3992.191351] rcu_scheduler_active = 1, debug_locks = 1 [ 3992.191353] no locks held by swapper/1/0. [ 3992.191355] [ 3992.191355] stack backtrace: [ 3992.191358] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.17.0-rc6net-next+ #72 [ 3992.191360] Hardware name: /DZ77RE-75K, BIOS GAZ7711H.86A.0060.2012.1115.1750 11/15/2012 [ 3992.191362] 0000000000000001 ffff880235803e48 ffffffff8178f92c 0000000000000000 [ 3992.191366] ffff8802322224a0 ffff880235803e78 ffffffff810c9966 ffff8800a5fe3000 [ 3992.191370] ffff880235803f30 ffff8802359cd768 ffff8802359cd6e0 ffff880235803e98 [ 3992.191374] Call Trace: [ 3992.191376] <IRQ> [<ffffffff8178f92c>] dump_stack+0x4e/0x68 [ 3992.191387] [<ffffffff810c9966>] lockdep_rcu_suspicious+0xe6/0x130 [ 3992.191392] [<ffffffff8167213a>] qdisc_watchdog+0x8a/0xb0 [ 3992.191396] [<ffffffff810f93f2>] __run_hrtimer+0x72/0x420 [ 3992.191399] [<ffffffff810f9bcd>] ? hrtimer_interrupt+0x7d/0x240 [ 3992.191403] [<ffffffff816720b0>] ? tc_classify+0xc0/0xc0 [ 3992.191406] [<ffffffff810f9c4f>] hrtimer_interrupt+0xff/0x240 [ 3992.191410] [<ffffffff8109e4a5>] ? __atomic_notifier_call_chain+0x5/0x140 [ 3992.191415] [<ffffffff8103577b>] local_apic_timer_interrupt+0x3b/0x60 [ 3992.191419] [<ffffffff8179c2b5>] smp_apic_timer_interrupt+0x45/0x60 [ 3992.191422] [<ffffffff8179a6bf>] apic_timer_interrupt+0x6f/0x80 [ 3992.191424] <EOI> [<ffffffff815ed233>] ? cpuidle_enter_state+0x73/0x2e0 [ 3992.191432] [<ffffffff815ed22e>] ? cpuidle_enter_state+0x6e/0x2e0 [ 3992.191437] [<ffffffff815ed567>] cpuidle_enter+0x17/0x20 [ 3992.191441] [<ffffffff810c0741>] cpu_startup_entry+0x3d1/0x4a0 [ 3992.191445] [<ffffffff81106fc6>] ? clockevents_config_and_register+0x26/0x30 [ 3992.191448] [<ffffffff81033c16>] start_secondary+0x1b6/0x260 Fixes: `b26b0d1e8b` ("net: qdisc: use rcu prefix and silence sparse warnings") Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Acked-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-04 20:45:54 -04:00
Florian Fainelli	f7d6b96f34	net: dsa: do not call phy_start_aneg Commit `f7f1de51ed` ("net: dsa: start and stop the PHY state machine") add calls to phy_start() in dsa_slave_open() respectively phy_stop() in dsa_slave_close(). We also call phy_start_aneg() in dsa_slave_create(), and this call is messing up with the PHY state machine, since we basically start the auto-negotiation, and later on restart it when calling phy_start(). phy_start() does not currently handle the PHY_FORCING or PHY_AN states properly, but such a fix would be too invasive for this window. Fixes: `f7f1de51ed` ("net: dsa: start and stop the PHY state machine") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-04 20:44:44 -04:00
Sébastien Barré	dd3619f2ed	Removed unused inet6 address state the inet6 state INET6_IFADDR_STATE_UP only appeared in its definition. Cc: Christoph Paasch <christoph.paasch@uclouvain.be> Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sébastien Barré <sebastien.barre@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-04 20:37:17 -04:00
Vijay Subramanian	c8753d55af	net: Cleanup skb cloning by adding SKB_FCLONE_FREE SKB_FCLONE_UNAVAILABLE has overloaded meaning depending on type of skb. 1: If skb is allocated from head_cache, it indicates fclone is not available. 2: If skb is a companion fclone skb (allocated from fclone_cache), it indicates it is available to be used. To avoid confusion for case 2 above, this patch replaces SKB_FCLONE_UNAVAILABLE with SKB_FCLONE_FREE where appropriate. For fclone companion skbs, this indicates it is free for use. SKB_FCLONE_UNAVAILABLE will now simply indicate skb is from head_cache and cannot / will not have a companion fclone. Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-04 20:34:25 -04:00
Eric Dumazet	9fab426de7	mlx4: add a new xmit_more counter ethtool -S reports a new counter, tracking number of time doorbell was not triggered, because skb->xmit_more was set. $ ethtool -S eth0 \| egrep "tx_packet\|xmit_more" tx_packets: 2413288400 xmit_more: 666121277 I merged the tso_packet false sharing avoidance in this patch as well. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-04 20:04:14 -04:00
David S. Miller	6106253e69	Merge branch 'gudp' Tom Herbert says: ==================== net: Generic UDP Encapsulation Generic UDP Encapsulation (GUE) is UDP encapsulation protocol which encapsulates packets of various IP protocols. The GUE protocol is described in http://tools.ietf.org/html/draft-herbert-gue-01. The receive path of GUE is implemented in the FOU over UDP module (FOU). This includes a UDP encap receive function for GUE as well as GUE specific GRO functions. Management and configuration of GUE ports shares most of the same code with FOU. For the transmit path, the previous FOU support for IPIP, sit, and GRE was simply extended for GUE (when GUE is enabled insert the GUE header on transmit in addition to UDP header inserted for FOU). Semantically GUE is the same as FOU in that the encapsulation (UDP and GUE headers) that are inserted on transmission and removed on reception so that IP packet is processed with the inner header. This patch set includes: - Some fixes to FOU, removal of IPv4,v6 specific GRO functions - Support to configure a GUE receive port - Implementation of GUE receive path (normal and GRO) - Additions to ip_tunnel netlink to configure GUE - GUE header inserion in ip_tunnel transmit path v2: - Include net/gue.h in patch set Testing: I ran performance numbers using netperf TCP_RR with 200 streams, comparing encapsulation without GUE, encapsulation with GUE, and encapsulation with FOU. GRE TCP_STREAM IPv4, FOU, UDP checksum enabled 14.04% TX CPU utilization 13.17% RX CPU utilization 9211 Mbps IPv4, GUE, UDP checksum enabled 14.99% TX CPU utilization 13.79% RX CPU utilization 9185 Mbps IPv4, FOU, UDP checksum disabled 13.14% TX CPU utilization 23.18% RX CPU utilization 9277 Mbps IPv4, GUE, UDP checksum disabled 13.66% TX CPU utilization 23.57% RX CPU utilization 9184 Mbps TCP_RR IPv4, FOU, UDP checksum enabled 94.2% CPU utilization 155/249/460 90/95/99% latencies 1.17018e+06 tps IPv4, GUE, UDP checksum enabled 93.9% CPU utilization 158/253/472 90/95/99% latencies 1.15045e+06 tps IPIP TCP_STREAM FOU, UDP checksum enabled 15.28% TX CPU utilization 13.92% RX CPU utilization 9342 Mbps GUE, UDP checksum enabled 13.99% TX CPU utilization 13.34% RX CPU utilization 9210 Mbps FOU, UDP checksum disabled 15.08% TX CPU utilization 24.64% RX CPU utilization 9226 Mbps GUE, UDP checksum disabled 15.90% TX CPU utilization 24.77% RX CPU utilization 9197 Mbps TCP_RR FOU, UDP checksum enabled 94.23% CPU utilization 149/237/429 90/95/99% latencies 1.19553e+06 tps GUE, UDP checksum enabled 93.75% CPU utilization 152/243/442 90/95/99% latencies 1.17027e+06 tps SIT TCP_STREAM FOU, UDP checksum enabled 14.47% TX CPU utilization 14.58% RX CPU utilization 9106 Mbps GUE, UDP checksum enabled 15.09% TX CPU utilization 14.84% RX CPU utilization 9080 Mbps FOU, UDP checksum disabled 15.70% TX CPU utilization 27.93% RX CPU utilization 9097 Mbps GUE, UDP checksum disabled 15.04% TX CPU utilization 27.54% RX CPU utilization 9073 Mbps TCP_RR FOU, UDP checksum enabled 96.9% CPU utilization 170/281/581 90/95/99% latencies 1.03372e+06 tps GUE, UDP checksum enabled 97.16% CPU utilization 172/286/576 90/95/99% latencies 1.00469e+06 tps ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-03 16:53:36 -07:00
Tom Herbert	bc1fc390e1	ip_tunnel: Add GUE support This patch allows configuring IPIP, sit, and GRE tunnels to use GUE. This is very similar to fou excpet that we need to insert the GUE header in addition to the UDP header on transmit. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-03 16:53:33 -07:00
Tom Herbert	37dd024779	gue: Receive side for Generic UDP Encapsulation This patch adds support receiving for GUE packets in the fou module. The fou module now supports direct foo-over-udp (no encapsulation header) and GUE. To support this a type parameter is added to the fou netlink parameters. For a GUE socket we define gue_udp_recv, gue_gro_receive, and gue_gro_complete to handle the specifics of the GUE protocol. Most of the code to manage and configure sockets is common with the fou. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-03 16:53:33 -07:00
Tom Herbert	efc98d08e1	fou: eliminate IPv4,v6 specific GRO functions This patch removes fou[46]_gro_receive and fou[46]_gro_complete functions. The v4 or v6 variants were chosen for the UDP offloads based on the address family of the socket this is not necessary or correct. Alternatively, this patch adds is_ipv6 to napi_gro_skb. This is set in udp6_gro_receive and unset in udp4_gro_receive. In fou_gro_receive the value is used to select the correct inet_offloads for the protocol of the outer IP header. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-03 16:53:32 -07:00
Tom Herbert	7371e0221c	ip_tunnel: Account for secondary encapsulation header in max_headroom When adjusting max_header for the tunnel interface based on egress device we need to account for any extra bytes in secondary encapsulation (e.g. FOU). Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-03 16:53:32 -07:00
Eric Dumazet	01291202ed	net: do not export skb_gro_receive() skb_gro_receive() is only called from tcp_gro_receive() which is not in a module. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-03 15:54:30 -07:00
Chen Gang	ad2a2a6d7c	drivers/net/irda/Kconfig: Let SH_IRDA depend on HAS_IOMEM SH_IRDA needs HAS_IOMEM, so depend on it. The related error(with allmodconfig under um): CC [M] drivers/net/irda/sh_irda.o drivers/net/irda/sh_irda.c: In function ‘sh_irda_probe’: drivers/net/irda/sh_irda.c:776:2: error: implicit declaration of function ‘ioremap_nocache’ [-Werror=implicit-function-declaration] self->membase = ioremap_nocache(res->start, resource_size(res)); ^ drivers/net/irda/sh_irda.c:776:16: warning: assignment makes pointer from integer without a cast [enabled by default] self->membase = ioremap_nocache(res->start, resource_size(res)); ^ drivers/net/irda/sh_irda.c:821:2: error: implicit declaration of function ‘iounmap’ [-Werror=implicit-function-declaration] iounmap(self->membase); ^ Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-03 15:52:04 -07:00
Chen Gang	65cb29a4f3	drivers/net/ethernet/marvell/Kconfig: Let PXA168_ETH depend on HAS_IOMEM PXA168_ETH need HAS_IOMEM, so depend on it, the related error (with allmodconfig under um): CC [M] drivers/net/ethernet/marvell/pxa168_eth.o drivers/net/ethernet/marvell/pxa168_eth.c: In function ‘pxa168_eth_probe’: drivers/net/ethernet/marvell/pxa168_eth.c:1605:2: error: implicit declaration of function ‘iounmap’ [-Werror=implicit-function-declaration] iounmap(pep->base); ^ Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-03 15:52:04 -07:00
Chen Gang	28b5533a6f	drivers/net/dsa/Kconfig: Let NET_DSA_BCM_SF2 depend on HAS_IOMEM NET_DSA_BCM_SF2 need HAS_IOMEM, so depend on it, the related error (with allmodconfig under um): CC [M] drivers/net/dsa/bcm_sf2.o drivers/net/dsa/bcm_sf2.c: In function ‘bcm_sf2_sw_setup’: drivers/net/dsa/bcm_sf2.c:487:3: error: implicit declaration of function ‘iounmap’ [-Werror=implicit-function-declaration] iounmap(*base); ^ Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-03 15:52:04 -07:00
Chen Gang	9dc8be2816	drivers/net/can/Kconfig: Let CAN_AT91 depend on HAS_IOMEM CAN_AT91 needs HAS_IOMEM, so depends on it. The related error (with allmodconfig under um): CC [M] drivers/net/can/at91_can.o drivers/net/can/at91_can.c: In function ‘at91_can_probe’: drivers/net/can/at91_can.c:1329:2: error: implicit declaration of function ‘ioremap_nocache’ [-Werror=implicit-function-declaration] addr = ioremap_nocache(res->start, resource_size(res)); ^ drivers/net/can/at91_can.c:1329:7: warning: assignment makes pointer from integer without a cast [enabled by default] addr = ioremap_nocache(res->start, resource_size(res)); ^ drivers/net/can/at91_can.c:1384:2: error: implicit declaration of function ‘iounmap’ [-Werror=implicit-function-declaration] iounmap(addr); ^ Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-03 15:52:03 -07:00
David S. Miller	579899a9ea	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2014-10-02 This series contains updates to fm10k, igb, ixgbe and i40e. Alex provides two updates to the fm10k driver. First reduces the buffer size to 2k for all page sizes, since most frames only have a 1500 MTU so supporting a buffer size larger than this is somewhat wasteful. Second fixes an issue where the number of transmit queues was not being updated, so added the lines necessary to update the number of transmit queues. Rick Jones provides two patches to convert ixgbe, igb and i40e to use dev_consume_skb_any(). Emil provides two patches for ixgbe, first cleans up a couple of wait loops on auto-negotiation that were not needed. Second fixes an issue reported by Fujitsu/Red Hat, which consolidates the logic behind the dynamically setting of TXDCTL.WTHRESH depending on interrupt throttle rate (ITR) setting regardless of BQL. Ethan Zhao provides a cleanup patch for ixgbe where he noticed a duplicate define. Bernhard Kaindl provides a patch for igb to remove a source of latency spikes by not calling code that uses mdelay() for feeding a PHY stat while being called with a spinlock held. Todd bumps the igb version based on the recent changes. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-03 15:43:50 -07:00
David S. Miller	48fea861c9	Merge branch 'mlx5-next' Eli Cohen says: ==================== mlx5 update for 3.18 This series integrates a new mechanism for populating and extracting field values used in the driver/firmware interaction around command mailboxes. Changes from V1: - Remove unused definition of memcpy_cpu_to_be32() - Remove definitions of non_existent_*() and use BUILD_BUG_ON() instead. - Added a patch one line patch to add support for ConnectX-4 devices. Changes from V0: - trimmed the auto-generated file to a minimum, as required by the reviewers. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-03 15:42:37 -07:00
Eli Cohen	f832dc820f	net/mlx5_core: Add ConnectX-4 to list of supported devices Add the upcoming ConnectX-4 device to the list of supported devices by then mlx5 driver. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-03 15:42:32 -07:00
Eli Cohen	5903325a64	net/mlx5_core: Identify resources by their type This patch puts a common part as the first field of mlx5_core_qp. This field is used to identify which resource generated an event. This is required since upcoming new resource types such as DC targets are allocated for the same numerical space as regular QPs and may generate the same events. By searching the resource in the same table we can then look at the common field to identify the resource. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-03 15:42:32 -07:00
Eli Cohen	b775516b04	net/mlx5_core: use set/get macros in device caps Transform device capabilities related commands to use set/get macros to manipulate command mailboxes. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-03 15:42:32 -07:00
Eli Cohen	d29b796ada	net/mlx5_core: Use hardware registers description header file Add an auto generated header file that describes hardware registers along with set of macros that set/get values. The macros do static checks to avoid overflow, handle endianess, and overall provide a clean way to code commands. Currently the header file is small and we will add structs as we make use of the macros. A few commands were removed from the commands enum since they are not supported currently and will be added when support is available. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-03 15:42:31 -07:00
Eli Cohen	c7a08ac7ee	net/mlx5_core: Update device capabilities handling Rearrange struct mlx5_caps so it has a "gen" field to represent the current capabilities configured for the device. Max capabilities can also be queried from the device. Also update capabilities struct to contain more fields as per the latest revision if firmware specification. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-03 15:42:31 -07:00
Eric Dumazet	55a93b3ea7	qdisc: validate skb without holding lock Validation of skb can be pretty expensive : GSO segmentation and/or checksum computations. We can do this without holding qdisc lock, so that other cpus can queue additional packets. Trick is that requeued packets were already validated, so we carry a boolean so that sch_direct_xmit() can validate a fresh skb list, or directly use an old one. Tested on 40Gb NIC (8 TX queues) and 200 concurrent flows, 48 threads host. Turning TSO on or off had no effect on throughput, only few more cpu cycles. Lock contention on qdisc lock disappeared. Same if disabling TX checksum offload. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-03 15:36:11 -07:00
Tobias Klauser	6a05880a8b	net: ethernet: Remove superfluous ether_setup after alloc_etherdev There is no need to call ether_setup after alloc_ethdev since it was already called there. Follow commits `c706471b26` ("net: axienet: remove unnecessary ether_setup after alloc_etherdev") and `3c87dcbfb3` ("net: ll_temac: Remove unnecessary ether_setup after alloc_etherdev") and fix the pattern in all remaining ethernet drivers. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-03 15:31:40 -07:00
David S. Miller	c2bf5ec204	Merge branch 'qdisc_bulk_dequeue' Jesper Dangaard Brouer says: ==================== qdisc: bulk dequeue support This patchset uses DaveM's recent API changes to dev_hard_start_xmit(), from the qdisc layer, to implement dequeue bulking. Patch01: "qdisc: bulk dequeue support for qdiscs with TCQ_F_ONETXQUEUE" - Implement basic qdisc dequeue bulking - This time, 100% relying on BQL limits, no magic safe-guard constants Patch02: "qdisc: dequeue bulking also pickup GSO/TSO packets" - Extend bulking to bulk several GSO/TSO packets - Seperate patch, as it introduce a small regression, see test section. We do have a patch03, which exports a userspace tunable as a BQL tunable, that can byte-cap or disable the bulking/bursting. But we could not agree on it internally, thus not sending it now. We basically strive to avoid adding any new userspace tunable. Testing patch01: ================ Demonstrating the performance improvement of qdisc dequeue bulking, is tricky because the effect only "kicks-in" once the qdisc system have a backlog. Thus, for a backlog to form, we need either 1) to exceed wirespeed of the link or 2) exceed the capability of the device driver. For practical use-cases, the measureable effect of this will be a reduction in CPU usage 01-TCP_STREAM: -------------- Testing effect for TCP involves disabling TSO and GSO, because TCP already benefit from bulking, via TSO and especially for GSO segmented packets. This patch view TSO/GSO as a seperate kind of bulking, and avoid further bulking of these packet types. The measured perf diff benefit (at 10Gbit/s) for a single netperf TCP_STREAM were 9.24% less CPU used on calls to _raw_spin_lock() (mostly from sch_direct_xmit). If my E5-2695v2(ES) CPU is tuned according to: http://netoptimizer.blogspot.dk/2014/04/basic-tuning-for-network-overload.html Then it is possible that a single netperf TCP_STREAM, with GSO and TSO disabled, can utilize all bandwidth on a 10Gbit/s link. This will then cause a standing backlog queue at the qdisc layer. Trying to pressure the system some more CPU util wise, I'm starting 24x TCP_STREAMs and monitoring the overall CPU utilization. This confirms bulking saves CPU cycles when it "kicks-in". Tool mpstat, while stressing the system with netperf 24x TCP_STREAM, shows: * Disabled bulking: sys:2.58% soft:8.50% idle:88.78% * Enabled bulking: sys:2.43% soft:7.66% idle:89.79% 02-UDP_STREAM ------------- The measured perf diff benefit for UDP_STREAM were 6.41% less CPU used on calls to _raw_spin_lock(). 24x UDP_STREAM with packet size -m 1472 (to avoid sending UDP/IP fragments). 03-trafgen driver test ---------------------- The performance of the 10Gbit/s ixgbe driver is limited due to updating the HW ring-queue tail-pointer on every packet. As previously demonstrated with pktgen. Using trafgen to send RAW frames from userspace (via AF_PACKET), and forcing it through qdisc path (with option --qdisc-path and -t0), sending with 12 CPUs. I can demonstrate this driver layer limitation: * 12.8 Mpps with no qdisc bulking * 14.8 Mpps with qdisc bulking (full 10G-wirespeed) Testing patch02: ================ Testing Bulking several GSO/TSO packets: Measuring HoL (Head-of-Line) blocking for TSO and GSO, with netperf-wrapper. Bulking several TSO show no performance regressions (requeues were in the area 32 requeues/sec for 10G while transmitting approx 813Kpps). Bulking several GSOs does show small regression or very small improvement (requeues were in the area 8000 requeues/sec, for 10G while transmitting approx 813Kpps). Using ixgbe 10Gbit/s with GSO bulking, we can measure some additional latency. Base-case, which is "normal" GSO bulking, sees varying high-prio queue delay between 0.38ms to 0.47ms. Bulking several GSOs together, result in a stable high-prio queue delay of 0.50ms. Corrosponding to: (1000010^6)((0.50-0.47)/10^3)/8 = 37500 bytes (1000010^6)((0.50-0.38)/10^3)/8 = 150000 bytes 37500/1500 = 25 pkts 150000/1500 = 100 pkts Using igb at 100Mbit/s with GSO bulking, shows an improvement. Base-case sees varying high-prio queue delay between 2.23ms to 2.35ms diff of 0.12ms corrosponding to 1500 bytes at 100Mbit/s. Bulking several GSOs together, result in a stable high-prio queue delay of 2.23ms. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-03 12:37:23 -07:00
Jesper Dangaard Brouer	808e7ac0bd	qdisc: dequeue bulking also pickup GSO/TSO packets The TSO and GSO segmented packets already benefit from bulking on their own. The TSO packets have always taken advantage of the only updating the tailptr once for a large packet. The GSO segmented packets have recently taken advantage of bulking xmit_more API, via merge commit `53fda7f7f9` ("Merge branch 'xmit_list'"), specifically via commit `7f2e870f2a` ("net: Move main gso loop out of dev_hard_start_xmit() into helper.") allowing qdisc requeue of remaining list. And via commit `ce93718fb7` ("net: Don't keep around original SKB when we software segment GSO frames."). This patch allow further bulking of TSO/GSO packets together, when dequeueing from the qdisc. Testing: Measuring HoL (Head-of-Line) blocking for TSO and GSO, with netperf-wrapper. Bulking several TSO show no performance regressions (requeues were in the area 32 requeues/sec). Bulking several GSOs does show small regression or very small improvement (requeues were in the area 8000 requeues/sec). Using ixgbe 10Gbit/s with GSO bulking, we can measure some additional latency. Base-case, which is "normal" GSO bulking, sees varying high-prio queue delay between 0.38ms to 0.47ms. Bulking several GSOs together, result in a stable high-prio queue delay of 0.50ms. Using igb at 100Mbit/s with GSO bulking, shows an improvement. Base-case sees varying high-prio queue delay between 2.23ms to 2.35ms Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-03 12:37:06 -07:00
Jesper Dangaard Brouer	5772e9a346	qdisc: bulk dequeue support for qdiscs with TCQ_F_ONETXQUEUE Based on DaveM's recent API work on dev_hard_start_xmit(), that allows sending/processing an entire skb list. This patch implements qdisc bulk dequeue, by allowing multiple packets to be dequeued in dequeue_skb(). The optimization principle for this is two fold, (1) to amortize locking cost and (2) avoid expensive tailptr update for notifying HW. (1) Several packets are dequeued while holding the qdisc root_lock, amortizing locking cost over several packet. The dequeued SKB list is processed under the TXQ lock in dev_hard_start_xmit(), thus also amortizing the cost of the TXQ lock. (2) Further more, dev_hard_start_xmit() will utilize the skb->xmit_more API to delay HW tailptr update, which also reduces the cost per packet. One restriction of the new API is that every SKB must belong to the same TXQ. This patch takes the easy way out, by restricting bulk dequeue to qdisc's with the TCQ_F_ONETXQUEUE flag, that specifies the qdisc only have attached a single TXQ. Some detail about the flow; dev_hard_start_xmit() will process the skb list, and transmit packets individually towards the driver (see xmit_one()). In case the driver stops midway in the list, the remaining skb list is returned by dev_hard_start_xmit(). In sch_direct_xmit() this returned list is requeued by dev_requeue_skb(). To avoid overshooting the HW limits, which results in requeuing, the patch limits the amount of bytes dequeued, based on the drivers BQL limits. In-effect bulking will only happen for BQL enabled drivers. Small amounts for extra HoL blocking (2x MTU/0.24ms) were measured at 100Mbit/s, with bulking 8 packets, but the oscillating nature of the measurement indicate something, like sched latency might be causing this effect. More comparisons show, that this oscillation goes away occationally. Thus, we disregard this artifact completely and remove any "magic" bulking limit. For now, as a conservative approach, stop bulking when seeing TSO and segmented GSO packets. They already benefit from bulking on their own. A followup patch add this, to allow easier bisect-ability for finding regressions. Jointed work with Hannes, Daniel and Florian. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-03 12:37:06 -07:00
Mark Einon	38df6492eb	et131x: Add PCIe gigabit ethernet driver et131x to drivers/net This adds the ethernet driver for Agere et131x devices to drivers/net/ethernet. The driver being added has been in the staging tree for some time, and will be removed from there in a seperate patch. This one merely disables the staging version to prevent two instances being built. Signed-off-by: Mark Einon <mark.einon@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-10-03 12:22:19 -07:00
Arturo Borrero	8da4cc1b10	netfilter: nft_masq: register/unregister notifiers on module init/exit We have to register the notifiers in the masquerade expression from the the module _init and _exit path. This fixes crashes when removing the masquerade rule with no ipt_MASQUERADE support in place (which was masking the problem). Fixes: 9ba1f72 ("netfilter: nf_tables: add new nft_masq expression") Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2014-10-03 14:24:35 +02:00

1 2 3 4 5 ...

471511 Commits