kernel_optimize_test

Author	SHA1	Message	Date
Harini Katakam	98b5a0f4a2	net: macb: Add support for jumbo frames Enable jumbo frame support for Zynq Ultrascale+ MPSoC. Update the NWCFG register and descriptor length masks accordingly. Jumbo max length register should be set according to support in SoC; it is set to 10240 for Zynq Ultrascale+ MPSoC. Signed-off-by: Harini Katakam <harinik@xilinx.com> Reviewed-by: Punnaiah Choudary Kalluri <punnaia@xilinx.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 17:41:54 -04:00
Harini Katakam	7b61f9c132	net: macb: Add compatible string for Zynq Ultrascale+ MPSoC Add compatible string and config structure for Zynq Ultrascale+ MPSoC Signed-off-by: Harini Katakam <harinik@xilinx.com> Reviewed-by: Punnaiah Choudary Kalluri <punnaia@xilinx.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 17:41:53 -04:00
Harini Katakam	988d6f07fc	devicetree: Add compatible string for Zynq Ultrascale+ MPSoC Add "cdns,zynqmp-gem" to be used for Zynq Ultrascale+ MPSoC. Signed-off-by: Harini Katakam <harinik@xilinx.com> Reviewed-by: Punnaiah Choudary Kalluri <punnaia@xilinx.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 17:41:53 -04:00
Jason Baron	790ba4566c	tcp: set SOCK_NOSPACE under memory pressure Under tcp memory pressure, calling epoll_wait() in edge triggered mode after -EAGAIN, can result in an indefinite hang in epoll_wait(), even when there is sufficient memory available to continue making progress. The problem is that when __sk_mem_schedule() returns 0 under memory pressure, we do not set the SOCK_NOSPACE flag in the tcp write paths (tcp_sendmsg() or do_tcp_sendpages()). Then, since SOCK_NOSPACE is used to trigger wakeups when incoming acks create sufficient new space in the write queue, all outstanding packets are acked, but we never wake up with the EPOLLOUT that we are expecting from epoll_wait(). This issue is currently limited to epoll() when used in edge trigger mode, since 'tcp_poll()', does in fact currently set SOCK_NOSPACE. This is sufficient for poll()/select() and epoll() in level trigger mode. However, in edge trigger mode, epoll() is relying on the write path to set SOCK_NOSPACE. EPOLL(7) says that in edge-trigger mode we can only call epoll_wait() after read/write return -EAGAIN. Thus, in the case of the socket write, we are relying on the fact that tcp_sendmsg()/network write paths are going to issue a wakeup for us at some point in the future when we get -EAGAIN. Normally, epoll() edge trigger works fine when we've exceeded the sk->sndbuf because in that case we do set SOCK_NOSPACE. However, when we return -EAGAIN from the write path b/c we are over the tcp memory limits and not b/c we are over the sndbuf, we are never going to get another wakeup. I can reproduce this issue, using SO_SNDBUF, since __sk_mem_schedule() will return 0, or failure more readily with SO_SNDBUF: 1) create socket and set SO_SNDBUF to N 2) add socket as edge trigger 3) write to socket and block in epoll on -EAGAIN 4) cause tcp mem pressure via: echo "<small val>" > net.ipv4.tcp_mem The fix here is simply to set SOCK_NOSPACE in sk_stream_wait_memory() when the socket is non-blocking. Note that SOCK_NOSPACE, in addition to waking up outstanding waiters is also used to expand the size of the sk->sndbuf. However, we will not expand it by setting it in this case because tcp_should_expand_sndbuf(), ensures that no expansion occurs when we are under tcp memory pressure. Note that we could still hang if sk->sk_wmem_queue is 0, when we get the -EAGAIN. In this case the SOCK_NOSPACE bit will not help, since we are waiting for an event that will never happen. I believe that this case is harder to hit (and did not hit in my testing), in that over the tcp 'soft' memory limits, we continue to guarantee a minimum write buffer size. Perhaps, we could return -ENOSPC in this case, or maybe we simply issue a wakeup in this case, such that we keep retrying the write. Note that this case is not specific to epoll() ET, but rather would affect blocking sockets as well. So I view this patch as bringing epoll() edge-trigger into sync with the current poll()/select()/epoll() level trigger and blocking sockets behavior. Signed-off-by: Jason Baron <jbaron@akamai.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 17:38:36 -04:00
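The missed wakeup described in 790ba4566c can be illustrated with a toy model. This is only a sketch in Python, not kernel code: ToySocket, try_send() and on_ack() are hypothetical names, the nospace field merely stands in for SOCK_NOSPACE, and the real wakeup conditions in the kernel are more involved than this.

```python
from dataclasses import dataclass


@dataclass
class ToySocket:
    """Toy model of a TCP socket's write-side state (not kernel code)."""
    sndbuf: int            # per-socket send buffer limit
    queued: int = 0        # bytes sitting in the write queue
    nospace: bool = False  # stands in for SOCK_NOSPACE
    wakeups: int = 0       # EPOLLOUT edges delivered to an edge-triggered waiter


def try_send(sk, nbytes, under_mem_pressure, fixed):
    """Return True on success, False to model -EAGAIN on a non-blocking socket."""
    if sk.queued + nbytes > sk.sndbuf:
        sk.nospace = True      # over sndbuf: the flag was already set in this case
        return False
    if under_mem_pressure:
        if fixed:
            sk.nospace = True  # the fix: also set the flag under memory pressure
        return False
    sk.queued += nbytes
    return True


def on_ack(sk, nbytes):
    """An incoming ACK frees queue space; the waiter is woken only if flagged."""
    sk.queued = max(0, sk.queued - nbytes)
    if sk.nospace:
        sk.nospace = False
        sk.wakeups += 1
```

With fixed=False, a send that fails under memory pressure leaves nospace clear, so the later on_ack() delivers no wakeup and an edge-triggered waiter would sleep forever. With fixed=True the same sequence yields one wakeup.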
Claudiu Manoil	3d23a05c75	gianfar: Enable changing mac addr when if up Use device flag IFF_LIVE_ADDR_CHANGE to signal that the device supports changing the hardware address when the device is running. This allows eth_mac_addr() to change the mac address also when the network device's interface is open. This capability is required by certain applications, like bonding mode 6 (Adaptive Load Balancing). Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 17:37:46 -04:00
Claudiu Manoil	bc60228087	gianfar: Move TxFIFO underrun handling to reset path Handle TxFIFO underrun exceptions outside the fast path. A controller reset is more reliable in this exceptional case, as opposed to re-enabling on-the-fly the Tx DMA. As the controller reset is handled outside the fast path by the reset_gfar() workqueue handler, the locking scheme on the Tx path is significantly simplified. Because the Tx processing (xmit queues and tx napi) is disabled during controller reset, tstat access from xmit does not require locking. So the scope of the txlock on the processing path is now reduced to num_txbdfree, which is shared only between process context (xmit) and softirq (clean_tx_ring). As a result, the txlock must not guard against interrupt context, and the spin_lock_irqsave() from xmit can be replaced by spin_lock_bh(). Likewise, the locking has been downgraded for clean_tx_ring(). Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 17:37:46 -04:00
David S. Miller	39d726b76c	Merge branch 'bpf_seccomp' Daniel Borkmann says: ==================== BPF updates This set gets rid of BPF special handling in seccomp filter preparation and provides generic infrastructure from BPF side, which eventually also allows for classic BPF JITs to add support for seccomp filters. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 17:35:05 -04:00
Daniel Borkmann	ac67eb2c53	seccomp, filter: add and use bpf_prog_create_from_user from seccomp Seccomp has always been a special candidate when it comes to preparation of its filters in seccomp_prepare_filter(). Due to the extra checks and filter rewrite it partially duplicates code and has BPF internals exposed. This patch adds a generic API inside the BPF code that seccomp can use and thus keep its filter preparation code minimal and better maintainable. The other side-effect is that now classic JITs can add seccomp support as well by only providing a BPF_LDX | BPF_W | BPF_ABS translation. Tested with seccomp and BPF test suites. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Nicolas Schichan <nschichan@freebox.fr> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Kees Cook <keescook@chromium.org> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 17:35:05 -04:00
Daniel Borkmann	658da9379d	net: filter: add __GFP_NOWARN flag for larger kmem allocs When seccomp BPF was added, it was discussed to add __GFP_NOWARN flag for their configuration path as f.e. up to 32K allocations are more prone to fail under stress. As we're going to reuse BPF API, add __GFP_NOWARN flags where larger kmalloc() and friends allocations could fail. It doesn't make much sense to pass around __GFP_NOWARN everywhere as an extra argument only for seccomp while we just as well could run into similar issues for socket filters, where it's not desired to have a user application throw a WARN() due to allocation failure. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Nicolas Schichan <nschichan@freebox.fr> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Kees Cook <keescook@chromium.org> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 17:35:05 -04:00
Nicolas Schichan	d9e12f42e5	seccomp: simplify seccomp_prepare_filter and reuse bpf_prepare_filter Remove the calls to bpf_check_classic(), bpf_convert_filter() and bpf_migrate_runtime() and let bpf_prepare_filter() take care of that instead. seccomp_check_filter() is passed to bpf_prepare_filter() so that it gets called from there, after bpf_check_classic(). We can now remove exposure of two internal classic BPF functions previously used by seccomp. The export of bpf_check_classic() symbol, previously known as sk_chk_filter(), was there since pre git times, and no in-tree module was using it, therefore remove it. Joint work with Daniel Borkmann. Signed-off-by: Nicolas Schichan <nschichan@freebox.fr> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Kees Cook <keescook@chromium.org> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 17:35:05 -04:00
Nicolas Schichan	4ae92bc77a	net: filter: add a callback to allow classic post-verifier transformations This is in preparation for use by the seccomp code, the rationale is not to duplicate additional code within the seccomp layer, but instead, have it abstracted and hidden within the classic BPF API. As an interim step, this now also makes bpf_prepare_filter() visible (not as exported symbol though), so that seccomp can reuse that code path instead of reimplementing it. Joint work with Daniel Borkmann. Signed-off-by: Nicolas Schichan <nschichan@freebox.fr> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Kees Cook <keescook@chromium.org> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 17:35:05 -04:00
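The callback arrangement described in 4ae92bc77a and d9e12f42e5 (generic check first, then an optional caller-supplied pass, then conversion) can be sketched as follows. This is a simplified Python model under stated assumptions: check_classic(), prepare_filter() and seccomp_like_check() are hypothetical stand-ins, and instructions are represented as plain strings rather than real BPF opcodes.

```python
def check_classic(insns):
    """Stand-in for the generic classic verifier: reject empty programs."""
    return len(insns) > 0


def prepare_filter(insns, trans=None):
    """Verify, run the optional post-verifier callback, then 'convert'.

    `trans` models the callback: it runs only after the generic check has
    passed, and it may veto the program (by raising) or rewrite it.
    """
    if not check_classic(insns):
        raise ValueError("classic check failed")
    if trans is not None:
        insns = trans(list(insns))
    return tuple(insns)  # placeholder for the converted program


def seccomp_like_check(insns):
    """Hypothetical caller-specific pass: allow only a small opcode set."""
    allowed = {"ld_abs", "jeq", "ret"}
    for op in insns:
        if op not in allowed:
            raise ValueError(f"opcode not allowed: {op}")
    return insns
```

The point of the design is that the caller-specific pass lives behind one hook, so a user such as seccomp does not need to duplicate the verify and convert steps itself.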
David S. Miller	0e00a0f73f	Merge tag 'mac80211-next-for-davem-2015-05-06' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Johannes Berg says: ==================== Lots of updates for net-next for this cycle. As usual, we have a lot of small fixes and cleanups, the bigger items are: * proper mac80211 rate control locking, to fix some random crashes (this required changing other locking as well) * mac80211 "fast-xmit", a mechanism to reduce, in most cases, the amount of code we execute while going from ndo_start_xmit() to the driver * this also clears the way for properly supporting S/G and checksum and segmentation offloads ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 17:27:25 -04:00
David S. Miller	82ae9c6060	Merge branch 'tcp-more-reliable-window-probes' Eric Dumazet says: ==================== tcp: more reliable window probes This series address a problem caused by small rto_min timers in DC, leading to either timer storms or early flow terminations. We also add two new SNMP counters for proper monitoring : TCPWinProbe and TCPKeepAlive v2: added TCPKeepAlive counter, as suggested by Yuchung & Neal ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 16:42:32 -04:00
Eric Dumazet	e520af48c7	tcp: add TCPWinProbe and TCPKeepAlive SNMP counters Diagnosing problems related to Window Probes has been hard because we lack a counter. TCPWinProbe counts the number of ACK packets a sender has to send at regular intervals to make sure a reverse ACK packet opening back a window had not been lost. TCPKeepAlive counts the number of ACK packets sent to keep TCP flows alive (SO_KEEPALIVE) Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Nandita Dukkipati <nanditad@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 16:42:32 -04:00
Eric Dumazet	21c8fe9915	tcp: adjust window probe timers to safer values With the advent of small rto timers in datacenter TCP, (ip route ... rto_min x), the following can happen : 1) Qdisc is full, transmit fails. TCP sets a timer based on icsk_rto to retry the transmit, without exponential backoff. With low icsk_rto, and a lot of sockets, all cpus are servicing timer interrupts like crazy. Intent of the code was to retry with a timer between 200 (TCP_RTO_MIN) and 500ms (TCP_RESOURCE_PROBE_INTERVAL) 2) Receivers can send zero windows if they don't drain their receive queue. TCP sends zero window probes, based on icsk_rto current value, with exponential backoff. With /proc/sys/net/ipv4/tcp_retries2 being 15 (or even smaller in some cases), sender can abort in less than one or two minutes ! If receiver stops the sender, it obviously doesn't care about very tight rto. Probability of dropping the ACK reopening the window is not worth the risk. Let's change the base timer to be at least 200ms (TCP_RTO_MIN) for these events (but not normal RTO based retransmits) A followup patch adds a new SNMP counter, as it would have helped a lot diagnosing this issue. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 16:42:32 -04:00
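The timer change in 21c8fe9915 amounts to clamping the probe base interval to at least 200ms before applying backoff. The arithmetic can be sketched in Python as below; probe_interval_ms() is a hypothetical helper, the 200ms floor is the value quoted in the commit message, and the 120 second upper cap is an assumption made here for illustration.

```python
TCP_RTO_MIN_MS = 200       # floor quoted in the commit message
TCP_RTO_MAX_MS = 120_000   # assumption: upper cap used for this illustration


def probe_interval_ms(rto_ms, backoff, max_ms=TCP_RTO_MAX_MS):
    """Interval before the next window probe, in milliseconds.

    The base is never smaller than TCP_RTO_MIN_MS, so a very small
    per-route rto_min no longer produces a storm of probe timers.
    """
    base = max(rto_ms, TCP_RTO_MIN_MS)
    return min(base << backoff, max_ms)
```

For example, with a 5ms rto the first probe interval is 200ms rather than 5ms, and after three backoff steps it is 1600ms rather than 40ms.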
Richard Alpe	b063bc5ea7	tipc: send explicit not supported error in nl compat The legacy netlink API treated EPERM (permission denied) as "operation not supported". Reported-by: Tomi Ollila <tomi.ollila@iki.fi> Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 16:40:03 -04:00
Richard Alpe	670f4f8818	tipc: add broadcast link window set/get to nl api Add the ability to get or set the broadcast link window through the new netlink API. The functionality was unintentionally missing from the new netlink API. Adding this means that we also fix the breakage in the old API when coming through the compat layer. Fixes: `37e2d4843f` (tipc: convert legacy nl link prop set to nl compat) Reported-by: Tomi Ollila <tomi.ollila@iki.fi> Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 16:40:02 -04:00
Richard Alpe	c3d6fb85b2	tipc: fix default link prop regression in nl compat Default link properties can be set for media or bearer. This functionality was missed when introducing the NL compatibility layer. This patch implements this functionality in the compat netlink layer. It works the same way as it did in the old API. We search for media and bearers matching the "link name". If we find a matching media or bearer the link tolerance, priority or window is used as default for new links on that media or bearer. Fixes: `37e2d4843f` (tipc: convert legacy nl link prop set to nl compat) Reported-by: Tomi Ollila <tomi.ollila@iki.fi> Signed-off-by: Richard Alpe <richard.alpe@ericsson.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 16:40:02 -04:00
Hariprasad Shenai	c035e183eb	cxgb4: Initialize RSS mode for all Ports Implements t4_init_rss_mode() to initialize the rss_mode for all the ports. If Tunnel All Lookup isn't specified in the global RSS Configuration, then we need to specify a default Ingress Queue for any ingress packets which aren't hashed. We'll use our first ingress queue. Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 16:33:10 -04:00
David S. Miller	b2a6c326b2	Merge branch 'be2net' Sathya Perla says: ==================== be2net: patch-set The following patch-set has two new feature additions, and a few minor fixes and cleanups. Pls consider applying to the net-next tree. Thanks. v2 changes: a) dropped the "don't enable pause by default" patch b) described how the "spoof check" works in patch 1's commit log c) I had to update our email addresses from "@emulex" to "@avagotech". I'll send a separate patch updating the maintainers Patch 1 adds support for the "spoofchk" knob for VFs. When it is enabled, "spoof checking" is done for both MAC-address and VLAN. For each VF, the HW ensures that the source MAC address (or vlan) of every outgoing packet of the VF exists in the MAC-list (or vlan-list) configured for RX filtering for that VF. If not, the packet is dropped and an error is reported to the driver in the TX completion. Patch 2 improves interrupt moderation on Skyhawk-R chip by using the EQ-DB mechanism to set a "re-arm to interrupt" delay. Currently interrupt moderation is adjusted by calculating and configuring an EQ-delay every second. This is done via a FW-cmd. This patch uses the EQ_DB facility to calculate and set the interrupt delay every 1ms. This helps moderating interrupts better when the traffic is bursty. Patch 3 adds L3/L4 error accounting to BE3 VFs, by passing L3/4 error packets to the network stack. Patch 4 adds an extra FW-cmd error value check in the driver to identify an "out of vlan filters" scenario. Patch 5 stops enabling pause by default as this setting fails in some HW-configs where priority pause is enabled in FW. If the user tries to do the same, an appropriate error is returned via ethtool. Patch 5 posts the full RXQ in be_open() to prevent packet drops due to bursty traffic when the interface is enabled. Patch 6 refactors the be_check_ufi_compatibility() routine, that checks to see if a UFI file meant for a lower rev of a chip is being flashed on a higher rev, to make it simpler. Patch 7 replaces the usage of !be_physfn() macro with be_virtfn() that is already available in the driver. Patch 8 updates the year in the copyright text to 2015. Patch 9 bumps up the driver version to 10.6.0.2. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 16:27:04 -04:00
Sathya Perla	029e9330dd	be2net: update the driver version to 10.6.0.2 Signed-off-by: Sathya Perla <sathya.perla@avagotech.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 16:27:04 -04:00
Vasundhara Volam	d19261b8ef	be2net: update copyright year to 2015 Signed-off-by: Vasundhara Volam <vasundhara.volam@avagotech.com> Signed-off-by: Sathya Perla <sathya.perla@avagotech.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 16:27:04 -04:00
Kalesh AP	18c57c74a1	be2net: use be_virtfn() instead of !be_physfn() Use be_virtfn() to determine a VF instead of !be_physfn() for better readability. Signed-off-by: Sathya Perla <sathya.perla@avagotech.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 16:27:04 -04:00
Vasundhara Volam	a6e6ff6eee	be2net: simplify UFI compatibility checking The code in be_check_ufi_compatibility() checks to see if a UFI file meant for a lower rev of a chip is being flashed on a higher rev, which is disallowed. This patch re-writes the code needed for this check in a much simpler manner. Signed-off-by: Vasundhara Volam <vasundhara.volam@avagotech.com> Signed-off-by: Sathya Perla <sathya.perla@avagotech.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 16:27:03 -04:00
Suresh Reddy	b02e60c86e	be2net: post full RXQ on interface enable When an RXQ is created in be_open(), the driver currently posts only 64 buffers. This sometimes results in packet drops when there is a traffic burst as soon as the interface is enabled. This patch fixes this problem by posting the full RXQ on interface enable. Signed-off-by: Suresh Reddy <Suresh.Reddy@avagotech.com> Signed-off-by: Sathya Perla <sathya.perla@avagotech.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 16:27:03 -04:00
Kalesh AP	77be8c1c4c	be2net: check for INSUFFICIENT_VLANS error When the FW runs out of vlan filters it can either return an INSUFFICIENT_RESOURCES error or an INSUFFICIENT_VLANS error. The driver currently checks only for the former error value. This patch adds a check for the latter value too. Signed-off-by: Kalesh AP <kalesh.purayil@emulex.com> Signed-off-by: Sathya Perla <sathya.perla@avagotech.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 16:27:03 -04:00
Somnath Kotur	0ed7d7498d	be2net: receive pkts with L3, L4 errors on VFs Currently pkts with L3 or L4 errors received on PFs are not dropped by the adapter, but instead sent to the stack. This helps the network stack to better reflect error statistics. This was not being done on BE3 VFs. This patch fixes this for BE3 VFs. Signed-off-by: Somnath Kotur <somnath.kotur@avagotech.com> Signed-off-by: Sathya Perla <sathya.perla@avagotech.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 16:27:03 -04:00
Padmanabh Ratnakar	2094777041	be2net: set interrupt moderation for Skyhawk-R using EQ-DB Currently adaptive interrupt moderation is set by calculating and configuring an EQ-delay every second. This is done via a FW-cmd. But, on Skyhawk-R a "re-arm to interrupt" delay can be set while ringing the EQ-DB. This patch uses this facility to calculate and set the interrupt delay every 1ms. This helps moderating interrupts better when the traffic is bursty. Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@avagotech.com> Signed-off-by: Sathya Perla <sathya.perla@avagotech.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 16:27:03 -04:00
Kalesh AP	e7bcbd7b81	be2net: add support for spoofchk setting This patch adds support for spoofchk configuration for VFs. When it is enabled, "spoof checking" is done for both MAC-address and VLAN. For each VF, the HW ensures that the source MAC address (or vlan) of every outgoing packet exists in the MAC-list (or vlan-list) configured for RX filtering for that VF. If not, the packet is dropped and an error is reported to the driver in the TX completion; this is reflected in the "tx_spoof_check_err" ethtool counter. This feature is supported in Skyhawk FW version 10.6.31.0 and above. Signed-off-by: Kalesh AP <kalesh.purayil@emulex.com> Signed-off-by: Sathya Perla <sathya.perla@avagotech.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 16:27:03 -04:00
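The spoof check described in e7bcbd7b81 is done in hardware, but its rule can be modelled in a few lines. The Python sketch below is illustrative only: VfFilter and tx_allowed() are hypothetical names, and treating an untagged packet (vlan=None) as exempt from the VLAN check is an assumption, not something the commit message states.

```python
from dataclasses import dataclass, field


@dataclass
class VfFilter:
    """Per-VF RX filter lists that the spoof check compares against."""
    macs: set = field(default_factory=set)
    vlans: set = field(default_factory=set)
    spoofchk: bool = True
    tx_spoof_check_err: int = 0  # mirrors the ethtool counter name


def tx_allowed(vf, src_mac, vlan=None):
    """Return True if an outgoing packet passes the spoof check.

    Assumption: untagged packets (vlan is None) skip the VLAN test.
    """
    if not vf.spoofchk:
        return True
    ok = src_mac in vf.macs and (vlan is None or vlan in vf.vlans)
    if not ok:
        vf.tx_spoof_check_err += 1  # dropped packet is reported as an error
    return ok
```

A packet whose source MAC is not in the VF's configured list is dropped and counted; with spoofchk disabled, every packet passes and the counter is untouched.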
David S. Miller	1a676d2b13	Merge branch 'sfc-next' Shradha Shah says: ==================== sfc: Enabling EF10 VF's, set up vswitching and bind the SFC driver to the VF's This set of patches makes way for the implementation of EF10 SR-IOV driver starting with some cleanup code. NIC specific SR-IOV functions are moved to their own header and netdev_ops are made generic instead of being NIC specific Next in line comes the patch to enable VF's using sriov_configure. VEB vswitching hierarchy is set up next followed by patches to prepare sfc driver to bind to enabled VF's This is followed by patch to support use of shared RSS contexts which makes VF's use shared RSS contexts in all cases. Patch series ends with a patch to bind the sfc driver to the enabled VF's which creates network interfaces corresponding to the VF's. Coming up soon are the patches to set_vf_mac, set_vf_config, set_vf_vlan, vf_spoofcheck, etc. These patches have been tested with and without CONFIG_SFC_SRIOV. In the case of CONFIG_SFC_SRIOV=y enabling of VF's using sriov_configure is also tested. The enabled VF's bind to the installed sfc driver successfully to create network interfaces. In the case of CONFIG_SFC_SRIOV=n enabling of VF's using sriov_configure returns the correct error message: "Function not implemented". ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 16:16:49 -04:00
Shradha Shah	6f7f8aa69a	sfc: Bind the sfc driver to any available VF's Add the device ID of the VF to the PCI device ID table. Added a boolean flag is_vf in efx_nic_type to differentiate between a VF and PF at probe time. This flag is useful in later patches while setting MAC address specially in the PCI-passthrough case. Signed-off-by: Shradha Shah <sshah@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 16:16:48 -04:00
Jon Cooper	267c01571b	sfc: Add use of shared RSS contexts. Allow PFs to allocate shared RSS contexts if we exhaust our exclusive RSS contexts. Make VFs use shared RSS contexts in all cases. Spruce up error handling so that the shadow copy of the RSS table is updated after successful update, rather than in all cases, so that we report the actual contents of the RSS table after a failure to set it, rather than what we'd like it to be. Populate context_size parameter when vacuously allocating RSS context of size 1. Signed-off-by: Shradha Shah <sshah@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 16:16:48 -04:00
Edward Cree	267d9d7387	sfc: Cope with permissions enforcement added to firmware for SR-IOV * Accept EPERM in some simple cases, the following cases are handled: 1) efx_mcdi_read_assertion() Unprivileged PCI functions aren't allowed to GET_ASSERTS. We return success as it's up to the primary PF to deal with asserts. 2) efx_mcdi_mon_probe() in efx_ef10_probe() Unprivileged PCI functions aren't allowed to read sensor info, and worrying about sensor data is the primary PF's job. 3) phy_op->reconfigure() in efx_init_port() and efx_reset_up() Unprivileged functions aren't allowed to MC_CMD_SET_LINK, they just have to accept the settings (including flow-control, which is what efx_init_port() is worried about) they've been given. 4) Fallback to GET_WORKAROUNDS in efx_ef10_probe() Unprivileged PCI functions aren't allowed to set workarounds. So if efx_mcdi_set_workaround() fails EPERM, use efx_mcdi_get_workarounds() to find out if workaround_35388 is enabled. 5) If DRV_ATTACH gets EPERM, try without specifying fw-variant Unprivileged PCI functions have to use a FIRMWARE_ID of 0xffffffff (MC_CMD_FW_DONT_CARE). 6) Don't try to exit_assertion unless one had fired Previously we called efx_mcdi_exit_assertion even if efx_mcdi_read_assertion had received MC_CMD_GET_ASSERTS_FLAGS_NO_FAILS. This is unnecessary, and the resulting MC_CMD_REBOOT, even if the AFTER_ASSERTION flag made it a no-op, would fail EPERM for unprivileged PCI functions. So make efx_mcdi_read_assertion return whether an assert happened, and only call efx_mcdi_exit_assertion if it has. Signed-off-by: Shradha Shah <sshah@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 16:16:48 -04:00
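Two of the patterns listed in 267d9d7387, tolerating EPERM where the primary PF is responsible anyway and retrying DRV_ATTACH without a firmware variant, can be sketched as below. This is a Python illustration, not driver code: tolerate_eperm(), drv_attach() and the rpc callable are hypothetical, and only the 0xffffffff value for MC_CMD_FW_DONT_CARE comes from the commit message.

```python
import errno

MC_CMD_FW_DONT_CARE = 0xFFFFFFFF  # value quoted in the commit message


def tolerate_eperm(rc):
    """Treat -EPERM as success for operations the primary PF handles anyway."""
    return 0 if rc == -errno.EPERM else rc


def drv_attach(rpc, fw_variant):
    """Attach with a specific firmware variant, falling back on EPERM.

    `rpc` is a hypothetical callable that takes a firmware ID and returns
    0 on success or a negative errno value.
    """
    rc = rpc(fw_variant)
    if rc == -errno.EPERM:
        # Unprivileged functions may not pick a variant: retry with "don't care".
        rc = rpc(MC_CMD_FW_DONT_CARE)
    return rc
```

Other errors are passed through unchanged in both helpers, so only the specific permission failure is handled specially.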
Shradha Shah	7b8c7b54f0	sfc: manually allocate and free vadaptors To be able to use MC_CMD_VADAPTOR_SET_MAC, vadaptors must be manually allocated and freed as automatic vadaptors will disappear when their reference_count reaches zero, which must happen before the MAC address is changed. Vadaptors are allocated and freed in the vswitching_probe/remove functions for PFs and VFs, and this means that vadaptors are restored correctly following an MC reboot or other reset when required. Signed-off-by: Shradha Shah <sshah@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 16:16:48 -04:00
Shradha Shah	3c5eb87605	sfc: create vports for VFs and assign random MAC addresses The parent PF creates vports for all its child VFs and adds MAC addresses to these. When the VF driver loads, it can make an MCDI call to get the MAC address that the parent PF assigned it. The parent PF also assigns a mac address to its own vport because implicit creation of a vAdaptor will only work on evb ports with MAC addresses assigned. The vport MAC address needs to be stored in the PF's nic_data struct as it can later be changed on the vadaptor (and its net_dev struct). When removing a vport the original MAC address must be deleted. A new flag is needed in the VF data structure to identify whether a vport has been assigned to the VF. This is to determine whether it needs to be un-assigned before freeing the vport. Also, attempting to un-assign a vport which is not assigned will result in an EALREADY error. Signed-off-by: Shradha Shah <sshah@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 16:16:48 -04:00
Shradha Shah	02246a7f96	sfc: Prepare to bind the sfc driver to the VF. Added efx_nic_type structure for VF. Mapped a different BAR for VF as it uses BAR 0 for memory. Added functions sriov_init and sriov_fini. Signed-off-by: Shradha Shah <sshah@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 16:16:47 -04:00
Daniel Pieczko	1cd9ecbbe6	sfc: get the PF number and record in nic_data Use MC_CMD_GET_FUNCTION_INFO to record the PF number in nic_data. This will be needed when assigning vports to VFs. Signed-off-by: Shradha Shah <sshah@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 16:16:47 -04:00
Daniel Pieczko	6d8aaaf6f7	sfc: create VEB vswitch and vport above default firmware setup Adds functions to allocate and free vswitches and vports; vadaptors are automatically allocated and freed when TX/RX queues are initialised and finalised. This vswitching structure is only created if the firmware supports it, so a check that full-featured firmware is running is performed first. If the MC resets, the vswitching infrastructure will need to be recreated, so mark the "must_probe_vswitching" flag when an MC reboot is detected. Don't try to create a vswitch if vf-count=0. This allocation of vswitches and vports does not currently support configuring VLAN tags, but that can be added in a future change. Signed-off-by: Shradha Shah <sshah@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 16:16:47 -04:00
Daniel Pieczko	45b2449e3f	sfc: record the PF's vport ID in nic_data The default port ID of EVB_PORT_ID_ASSIGNED is a "magic" number for the MCFW to select the physical port of the PF. If other vswitches and vports are created on top of the default firmware configuration, the ID of the newly created vport is then required when passed to MCDI commands. Currently, this doesn't happen so the vport_id is never changed, but a subsequent patch will change this behaviour so that other vswitches and vports are created. The vport_id recorded in nic_data is only relevant for PFs. VFs will have their vports created by their parent PF, and in that case the parent PF will record the vport ID of each VF. For a VF, nic_data->vport_id is expected to remain at the default value. Signed-off-by: Shradha Shah <sshah@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 16:16:47 -04:00
Daniel Pieczko	8d9f9dd448	sfc: Record [rt]x_dpcpu_fw_id in EF10 nic_data The (future) code to add/remove vswitches and vports will be dependent on the firmware variant. To simplify the checking of the firmware variant, record values for rx_dpcpu_fw_id and tx_dpcpu_fw_id in EF10 nic_data. There was only one place where this was previously used: efx_mcdi_print_fwver() in ethtool.c. The MC_CMD_GET_CAPABILITIES can be replaced and the values from nic_data used instead. Note that the printing of "?" if the MC command fails or if the outlength is incorrect no longer apply, because errors are returned in efx_ef10_init_datapath_caps() in both of these cases. Signed-off-by: Shradha Shah <sshah@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 16:16:47 -04:00
Shradha Shah	e3d3629387	sfc: Use MCDI to set FILTER_OP_IN_TX_DOMAIN The TX_DOMAIN field is currently reserved but it's safer to set it to 0 for future compatibility. Signed-off-by: Shradha Shah <sshah@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 16:16:46 -04:00
Shradha Shah	834e23dd0a	sfc: Enable VFs via a write to the sysfs file sriov_numvfs This patch adds support for the use of sriov_configure on EF10 to enable Virtual Functions while the driver is loaded. Signed-off-by: Shradha Shah <sshah@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 16:16:46 -04:00
Daniel Pieczko	bf3d0156c5	sfc: Move and rename efx_vf struct to siena_vf The efx_vf struct contains Siena-specific fields for VFs, so rename to siena_vf. Also move it into the siena_nic_data struct, as EF10 will track its VFs in its own ef10_nic_data, storing much less information about them since VFDI is no longer used. Signed-off-by: Shradha Shah <sshah@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 16:16:46 -04:00
Shradha Shah	7fa8d54704	sfc: Own header for nic-specific sriov functions, single instance of netdev_ops and sriov removed from Falcon code By putting all the efx_{siena,ef10}_sriov_* declarations in {siena,ef10}_sriov.h, ensure they cannot be called from nic-generic code. Also fixes up an instance of this, where mcdi.c was calling efx_siena_sriov_flr. The single instance of netdev_ops should call general high level functions that can then call something adapter specific in efx_nic_type. We should only do adapter specialisation via efx_nic_type. Removal of sriov functionality from the Falcon code means that tests are needed for the presence of some callbacks. Signed-off-by: Shradha Shah <sshah@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 16:16:46 -04:00
David S. Miller	f926204b8b	Merge branch 'dsa-next' Andrew Lunn says: ==================== More Marvell DSA refactoring and fixup This patchset continues the refactoring and cleanup of the Marvell DSA drivers. Patch #1 Centralizes the duplicated parts of port setup and global setup into the shared mv88e6xxx. Patch #2 Centralizes looping over the ports setting them up. Patch #3 Uses mnemonics for the remaining register access in the drivers. Patch #4 The 6172 is actually a member of the 6352 family. This moves the probe code into the correct driver. Patch #5 Adds more members of the 6171 family to the 6171 driver. The new devices are untested. Patch #6 The 6185 is a member of the 6131 family. Add it to the probe code of the 6131 driver. Patch #7 and Patch #8 Simplify the mutexes in mv88e6xxx.c. The SMI bus is the bottleneck, not the granularity of the mutexes, so simplify the code down to a single mutex. Patch #8 Fixes a false positive lockdep splat, due to nested uses of MDIO busses. Patch #9 Fixes another false positive lockdep splat with the transmit queue because of stacked Ethernet devices. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 16:05:54 -04:00
Andrew Lunn	448b4482c6	net: dsa: Add lockdep class to tx queues to avoid lockdep splat DSA stacks an Ethernet device on top of an Ethernet device. This can cause false positive lockdep splats for the transmit queue: Acked-by: Florian Fainelli <f.fainelli@gmail.com> ============================================= [ INFO: possible recursive locking detected ] 4.0.0-rc7-01838-g70621a215fc7 #386 Not tainted --------------------------------------------- kworker/0:0/4 is trying to acquire lock: (_xmit_ETHER#2){+.-...}, at: [<c040e95c>] sch_direct_xmit+0xa8/0x1fc but task is already holding lock: (_xmit_ETHER#2){+.-...}, at: [<c03f4208>] __dev_queue_xmit+0x4d4/0x56c other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(_xmit_ETHER#2); lock(_xmit_ETHER#2); To avoid this, walk the tx queues of the dsa slaves and set a lockdep class. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 16:05:54 -04:00
Andrew Lunn	16fe24fc43	net: dsa: mv88e6xxx: Fix false positive lockdep splat DSA can have nested MDIO busses, where the Ethernet MDIO bus is used to access an MDIO bus within the switch which has the PHYs connected to it. This nesting causes lockdep to give false positives. Use mutex_lock_nested() to avoid this. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 16:05:54 -04:00
Andrew Lunn	31888234b7	net: dsa: mv88e6xxx: Replace stats mutex with SMI mutex The SMI bus is the bottleneck in all switch operations, not the granularity of locks. Replace the stats mutex by the SMI mutex to make the locking concept simpler. The REG_READ/REG_WRITE macros cannot be used while holding the SMI mutex, since they try to acquire it. Replace with calls to the appropriate function which does not try to get the mutex. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 16:05:54 -04:00
Andrew Lunn	3898c14858	net: dsa: mv88e6xxx: Replace PHY mutex by SMI mutex The SMI bus is the bottleneck in all switch operations, not the granularity of locks. Replace the PHY mutex by the SMI mutex to make the locking concept simpler. The REG_READ/REG_WRITE macros cannot be used while holding the SMI mutex, since they try to acquire it. Replace with calls to the appropriate function which does not try to get the mutex. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 16:05:53 -04:00
Andrew Lunn	1441f4e596	net: dsa: mv88e6131: Add support for mv88e6185 The mv88e6185 is part of the family that the mv88e6131 driver supports. Add it to the probe function, and set the number of ports. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-05-09 16:05:53 -04:00


519149 Commits