Commit Graph

471511 Commits

Author SHA1 Message Date
Tom Herbert
54bc9bac30 gre: Set inner protocol in v4 and v6 GRE transmit
Call skb_set_inner_protocol to set inner Ethernet protocol to the
protocol being encapsulated by GRE before tunnel_xmit. This is
needed for GSO if UDP encapsulation (fou) is being done.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01 21:35:51 -04:00
Tom Herbert
077c5a0948 ipip: Set inner IP protocol in ipip
Call skb_set_inner_ipproto to set inner IP protocol to IPPROTO_IPV4
before tunnel_xmit. This is needed if UDP encapsulation (fou) is
being done.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01 21:35:51 -04:00
Tom Herbert
469471cdfc sit: Set inner IP protocol in sit
Call skb_set_inner_ipproto to set inner IP protocol to IPPROTO_IPV6
before tunnel_xmit. This is needed if UDP encapsulation (fou) is
being done.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01 21:35:51 -04:00
Tom Herbert
8bce6d7d0d udp: Generalize skb_udp_segment
skb_udp_segment is the function called from udp4_ufo_fragment to
segment a UDP tunnel packet. This function currently assumes
segmentation is transparent Ethernet bridging (i.e. VXLAN
encapsulation). This patch generalizes the function to
operate on either Ethertype or IP protocol.

The inner_protocol field must be set to the protocol of the inner
header. This can now be either an Ethertype or an IP protocol
(in a union). A new flag in the skbuff indicates which type is
effective. skb_set_inner_protocol and skb_set_inner_ipproto
helper functions were added to set the inner_protocol. These
functions are called from the point where the tunnel encapsulation
is occurring.

When skb_udp_tunnel_segment is called, the function to segment the
inner packet is selected based on the inner IP or Ethertype. In the
case of an IP protocol encapsulation, the function is derived from
inet[6]_offloads. In the case of Ethertype, skb->protocol is
set to the inner_protocol and skb_mac_gso_segment is called. (GRE
currently does this, but it might be possible to look up the protocol
in offload_base and call the appropriate segmentation function
directly).

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01 21:35:51 -04:00
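A minimal userspace sketch of the inner_protocol idea described in 8bce6d7d0d above: one union holding either an Ethertype or an IP protocol, a flag saying which is valid, and a selector that picks the segmentation path. Everything here (struct pkt, enum inner_kind, the pkt_* functions, enum seg_path) is a hypothetical stand-in and not the kernel's sk_buff layout or its helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag: which member of the union below is valid. */
enum inner_kind { INNER_NONE, INNER_ETHERTYPE, INNER_IPPROTO };

struct pkt {
    enum inner_kind kind;
    union {
        uint16_t ethertype;   /* e.g. 0x6558, transparent Ethernet bridging */
        uint8_t  ipproto;     /* e.g. 4 (IPv4 inside) or 41 (IPv6 inside) */
    } inner;
};

/* Called at the point of encapsulation, as the commit describes. */
void pkt_set_inner_protocol(struct pkt *p, uint16_t ethertype)
{
    assert(p != NULL);
    p->inner.ethertype = ethertype;
    p->kind = INNER_ETHERTYPE;
}

void pkt_set_inner_ipproto(struct pkt *p, uint8_t ipproto)
{
    assert(p != NULL);
    p->inner.ipproto = ipproto;
    p->kind = INNER_IPPROTO;
}

/* Which path a tunnel segmenter would take for the inner packet. */
enum seg_path { SEG_UNSUPPORTED, SEG_BY_MAC, SEG_BY_IP_OFFLOAD };

enum seg_path pkt_select_segmenter(const struct pkt *p)
{
    switch (p->kind) {
    case INNER_ETHERTYPE:
        return SEG_BY_MAC;          /* Ethertype: segment via the MAC layer */
    case INNER_IPPROTO:
        return SEG_BY_IP_OFFLOAD;   /* IP protocol: use the per-protocol offload */
    default:
        return SEG_UNSUPPORTED;
    }
}

/* Usage: a sit-style tunnel marks the inner packet as IPv6 (protocol 41). */
void inner_protocol_example(void)
{
    struct pkt p = { INNER_NONE, { 0 } };

    pkt_set_inner_ipproto(&p, 41);
    assert(pkt_select_segmenter(&p) == SEG_BY_IP_OFFLOAD);
}
```

The explicit flag is what makes the union safe to read: the selector never interprets the stored value without first checking which setter wrote it.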
David S. Miller
f44d61cdd3 Merge branch 'bpf-next'
Alexei Starovoitov says:

====================
bpf: add search pruning optimization and tests

patch #1 commit log explains why eBPF verifier has to examine some
instructions multiple times and describes the search pruning optimization
that improves verification speed for branchy programs and allows more
complex programs to be verified successfully.
This patch completes the core verifier logic.

patch #2 adds more verifier tests related to branches and search pruning

I'm still working on Andy's 'bitmask for stack slots' suggestion. It will be
done on top of this patch.

The current verifier algorithm is brute force depth first search with
state pruning. If anyone can come up with another algorithm that demonstrates
better results, we'll replace the algorithm without affecting user space.

Note verifier doesn't guarantee that all possible valid programs are accepted.
Overly complex programs may still be rejected.
Verifier improvements/optimizations will guarantee that if a program
was passing verification in the past, it will still be passing.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01 21:30:46 -04:00
Alexei Starovoitov
fd10c2ef3e bpf: add tests to verifier testsuite
add 4 extra tests to cover jump verification better

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01 21:30:33 -04:00
Alexei Starovoitov
f1bca824da bpf: add search pruning optimization to verifier
Consider a C program represented in eBPF:
int filter(int arg)
{
    int a, b, c, *ptr;

    if (arg == 1)
        ptr = &a;
    else if (arg == 2)
        ptr = &b;
    else
        ptr = &c;

    *ptr = 0;
    return 0;
}
The eBPF verifier has to follow all possible paths through the program
to recognize that the '*ptr = 0' instruction would be safe to execute
in all situations.
It does so by picking a path towards the end and observing changes
to registers and stack at every insn until it reaches bpf_exit.
Then it comes back to one of the previous branches and goes towards
the end again with potentially different values in registers.
When a program has a lot of branches, the number of possible combinations
of branches is huge, so the verifier has a hard limit of walking no more
than 32k instructions. This limit can be reached and complex (but valid)
programs could be rejected. Therefore it's important to recognize equivalent
verifier states to prune this depth first search.

The basic idea can be illustrated by the program (where .. are some eBPF insns):
    1: ..
    2: if (rX == rY) goto 4
    3: ..
    4: ..
    5: ..
    6: bpf_exit
In the first pass towards bpf_exit the verifier will walk insns: 1, 2, 3, 4, 5, 6.
Since insn#2 is a branch, the verifier will remember its state in the verifier stack
to come back to it later.
Since insn#4 is marked as 'branch target', the verifier will remember its state
in the explored_states[4] linked list.
Once it reaches insn#6 successfully, it will pop the state recorded at insn#2 and
continue.
Without the search pruning optimization the verifier would have to walk 4, 5, 6 again,
effectively simulating execution of insns 1, 2, 4, 5, 6.
With search pruning it will check whether the state at #4 after jumping from #2
is equivalent to one recorded in explored_states[4] during the first pass.
If there is an equivalent state, the verifier can prune the search at #4 and declare
this path to be safe as well.
In other words, two states at #4 are equivalent if execution of insns 1, 2, 3, 4
and insns 1, 2, 4 produces equivalent registers and stack.

Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01 21:30:33 -04:00
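The six-instruction walk-through in f1bca824da above can be reproduced with a toy model. This is only a sketch under strong assumptions: the whole verifier state is a single integer "register", branches only jump forward, and every name (struct insn, struct pending, walk) is hypothetical. The real verifier compares full register and stack state.

```c
#include <assert.h>
#include <stdbool.h>

enum op { OP_NOP, OP_SET, OP_BRANCH, OP_EXIT };

struct insn {
    enum op op;
    int arg;             /* OP_SET: new value; OP_BRANCH: target index */
    bool branch_target;  /* states reaching this insn are remembered */
};

#define MAX_INSNS 16
#define MAX_SEEN  8

struct pending { int pc; int reg; };  /* a branch to come back to later */

/* Returns how many instructions are simulated before every path has
 * reached OP_EXIT. Forward branches only, so the walk always ends. */
int walk(const struct insn *prog, int len, bool prune)
{
    struct pending stack[MAX_INSNS];
    int seen[MAX_INSNS][MAX_SEEN];
    int nseen[MAX_INSNS] = { 0 };
    int sp = 0, walked = 0;

    assert(len > 0 && len <= MAX_INSNS);
    stack[sp++] = (struct pending){ 0, 0 };

    while (sp > 0) {
        struct pending cur = stack[--sp];

        while (cur.pc < len) {
            const struct insn *in = &prog[cur.pc];

            if (in->branch_target) {
                bool hit = false;

                for (int i = 0; i < nseen[cur.pc]; i++)
                    if (seen[cur.pc][i] == cur.reg)
                        hit = true;
                if (hit && prune)
                    break;  /* equivalent state was already walked to exit */
                if (!hit) {
                    assert(nseen[cur.pc] < MAX_SEEN);
                    seen[cur.pc][nseen[cur.pc]++] = cur.reg;
                }
            }

            walked++;
            if (in->op == OP_EXIT)
                break;
            if (in->op == OP_SET)
                cur.reg = in->arg;
            if (in->op == OP_BRANCH) {
                assert(sp < MAX_INSNS);
                stack[sp++] = (struct pending){ in->arg, cur.reg };
            }
            cur.pc++;
        }
    }
    return walked;
}

/* Usage: insns 1..6 from the commit message are indices 0..5 here. */
void pruning_example(void)
{
    const struct insn prog[6] = {
        { OP_NOP, 0, false }, { OP_BRANCH, 3, false }, { OP_NOP, 0, false },
        { OP_NOP, 0, true },  { OP_NOP, 0, false },    { OP_EXIT, 0, false },
    };

    assert(walk(prog, 6, false) == 9);  /* 1,2,3,4,5,6 then 4,5,6 again */
    assert(walk(prog, 6, true) == 6);   /* second visit to insn 4 is pruned */
}
```

If insn 3 changed the register, the two states arriving at insn 4 would differ and the second path would have to be walked in full even with pruning enabled.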
Nimrod Andy
1b7bde6d65 net: fec: implement rx_copybreak to improve rx performance
- Copy short frames and keep the buffers mapped, re-allocate skb instead of
  memory copy for long frames.
- Add support for setting/getting rx_copybreak using generic ethtool tunable

Changes V3:
* Per Eric Dumazet's suggestion, remove the copybreak module parameter
  and keep only the ethtool API support for rx_copybreak.

Changes V2:
* Implements rx_copybreak
* Rx_copybreak provides module parameter to change this value
* Add tunable_ops support for rx_copybreak

Signed-off-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: Frank Li <Frank.Li@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01 21:28:21 -04:00
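The copybreak decision described in 1b7bde6d65 above comes down to one threshold test per received frame. The sketch below is an illustration only: the names are hypothetical, and treating a frame exactly at the threshold as "short" is an assumption, not something the commit message states.

```c
#include <assert.h>

enum rx_action {
    RX_COPY_KEEP_BUFFER,  /* short frame: copy it out, leave the buffer mapped */
    RX_SWAP_BUFFER        /* long frame: hand the buffer up, allocate a new one */
};

/* Assumption: frames of length <= copybreak count as short. */
enum rx_action rx_copybreak_action(unsigned int frame_len,
                                   unsigned int copybreak)
{
    return frame_len <= copybreak ? RX_COPY_KEEP_BUFFER : RX_SWAP_BUFFER;
}

/* Usage with an illustrative 256-byte threshold. */
void rx_copybreak_example(void)
{
    assert(rx_copybreak_action(64, 256) == RX_COPY_KEEP_BUFFER);
    assert(rx_copybreak_action(1500, 256) == RX_SWAP_BUFFER);
}
```

Copying is cheap for small frames and avoids remapping the receive buffer, while large frames are cheaper to pass up by reference.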
Eric Dumazet
ce1a4ea3f1 net: avoid one atomic operation in skb_clone()
Fast cloning can actually avoid an atomic_inc(), if we
guarantee the prior clone_ref value is 1.

This requires a change to kfree_skbmem(), to perform the
atomic_dec_and_test() on clone_ref before setting fclone to
SKB_FCLONE_UNAVAILABLE.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01 21:27:23 -04:00
Fabian Frederick
e500f488c2 net/dccp/ccid.c: add __init to ccid_activate
ccid_activate is only called by __init ccid_initialize_builtins in same module.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01 18:33:13 -04:00
Fabian Frederick
0c5b8a4629 net/dccp/proto.c: add __init to dccp_mib_init
dccp_mib_init is only called by __init dccp_init in same module.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01 18:33:13 -04:00
David S. Miller
0754476419 Merge branch 'r8152'
Hayes Wang says:

====================
r8152: patches about firmware

These patches fix issues that occur when firmware is already present.

On a multi-OS system, the firmware may have been loaded by the
driver of another OS, and the Linux driver can interfere
with it.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01 16:46:41 -04:00
hayeswang
49be17235c r8152: disable power cut for RTL8153
The firmware would be cleared when the power cut is enabled for
RTL8153.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01 16:46:34 -04:00
hayeswang
204c870412 r8152: remove clearing bp
The xxx_clear_bp() is used to halt the firmware. It is only necessary
when updating to new firmware. Besides, depending on the version of
the current firmware, halting the firmware directly may cause
problems. Finally, halting the firmware would make the firmware code
useless, and the bugs which are fixed by the firmware would occur.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01 16:46:34 -04:00
Vlad Yasevich
1b0ecb28b0 bnx2: Correctly receive full sized 802.1ad fragmes
This driver, similar to tg3, has a check that will
cause full sized 802.1ad frames to be dropped.  The
frame will be larger than the standard mtu due to the
presence of a vlan header that has not been stripped.
The driver should not drop this frame and should process
it just like it does for 802.1q.

CC: Sony Chacko <sony.chacko@qlogic.com>
CC: Dept-HSGLinuxNICDev@qlogic.com
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01 16:43:45 -04:00
Vlad Yasevich
7d3083ee36 tg3: Allow for recieve of full-size 8021AD frames
When receiving a vlan-tagged frame that still contains
a vlan header, the length of the packet will be greater
than MTU+ETH_HLEN since it will account for the extra
vlan header.  TG3 checks for this case for 802.1Q,
but not for 802.1ad.  As a result, full sized 802.1ad
frames get dropped by the card.

Add a check for 802.1ad protocol when receiving full
sized frames.

Suggested-by: Prashant Sreedharan <prashant@broadcom.com>
CC: Prashant Sreedharan <prashant@broadcom.com>
CC: Michael Chan <mchan@broadcom.com>
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01 16:43:45 -04:00
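The length check discussed in the bnx2 and tg3 commits above can be sketched as follows. This is a simplified model and not either driver's code: rx_should_drop is a hypothetical name and the protocol value is taken in host byte order. The three constants carry their standard values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ETH_HLEN      14      /* Ethernet header length in bytes */
#define ETH_P_8021Q   0x8100  /* 802.1Q VLAN tag */
#define ETH_P_8021AD  0x88A8  /* 802.1ad service VLAN tag */

/* Drop oversized frames unless the extra bytes are an unstripped VLAN header. */
bool rx_should_drop(unsigned int len, unsigned int mtu, uint16_t proto)
{
    if (len <= mtu + ETH_HLEN)
        return false;
    return proto != ETH_P_8021Q && proto != ETH_P_8021AD;
}

/* Usage: a full-size 802.1ad frame is 4 bytes over MTU+ETH_HLEN. */
void rx_vlan_example(void)
{
    assert(!rx_should_drop(1518, 1500, ETH_P_8021AD));
    assert(rx_should_drop(1518, 1500, 0x0800));
}
```

Before the fixes, only the 802.1Q comparison was present, so the 802.1ad case took the drop path.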
Florian Westphal
1e91887685 r8169: add support for Byte Queue Limits
tested on RTL8168d/8111d model using 'super_netperf 40' with TCP/UDP_STREAM.

Output of
while true; do
    for n in inflight limit; do
          echo -n $n\ ; cat $n;
    done;
    sleep 1;
done

during netperf run, 100mbit peer:

inflight 0
limit 3028
inflight 6056
limit 4542

[ trimmed output for brevity, no limit/inflight changes during
  test steady-state ]

limit 4542
inflight 3028
limit 6122
inflight 0
limit 6122
[ changed cable to 1gbit peer, restart netperf ]
inflight 37850
limit 36336
inflight 33308
limit 31794
inflight 33308
limit 31794
inflight 27252
limit 25738
[ again, no changes during test ]
inflight 27252
limit 25738
inflight 0
limit 28766
[ change cable to 100mbit peer, restart netperf ]
limit 28766
inflight 27370
limit 28766
inflight 4542
limit 5990
inflight 6056
limit 4542
[ .. ]
inflight 6056
limit 4542
inflight 0

[end of test]

Cc: Francois Romieu <romieu@fr.zoreil.com>
Cc: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01 16:35:43 -04:00
Eric Dumazet
d0bf4a9e92 net: cleanup and document skb fclone layout
Let's use a proper structure to clearly document and implement
skb fast clones.

Then, we might more easily experiment with alternative layouts.

This patch adds a new skb_fclone_busy() helper, used by tcp and xfrm,
to stop leaking implementation details.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01 16:34:25 -04:00
Yuchung Cheng
b248230c34 tcp: abort orphan sockets stalling on zero window probes
Currently we have two different policies for orphan sockets
that repeatedly stall on zero window ACKs. If a socket gets
a zero window ACK when it is transmitting data, the RTO is
used to probe the window. The socket is aborted after roughly
tcp_orphan_retries() retries (as in tcp_write_timeout()).

But if the socket was idle when it received the zero window ACK,
and later wants to send more data, we use the probe timer to
probe the window. If the receiver always returns zero window ACKs,
icsk_probes keeps getting reset in tcp_ack() and the orphan socket
can stall forever until the system reaches the orphan limit (as
commented in tcp_probe_timer()). This opens up a simple attack
to create lots of hanging orphan sockets to burn the memory
and the CPU, as demonstrated in the recent netdev post "TCP
connection will hang in FIN_WAIT1 after closing if zero window is
advertised." http://www.spinics.net/lists/netdev/msg296539.html

This patch follows the design in RTO-based probe: we abort an orphan
socket stalling on zero window when the probe timer reaches both
the maximum backoff and the maximum RTO. For example, a 100ms RTT
connection will time out after roughly 153 seconds (0.3 + 0.6 +
.... + 76.8) if the receiver keeps the window shut. If the orphan
socket passes this check, but the system already has too many orphans
(as in tcp_out_of_resources()), we still abort it but we'll also
send an RST packet as the connection may still be active.

In addition, we change TCP_USER_TIMEOUT to cover (live or dead)
sockets stalled on zero-window probes. This changes the semantics
of TCP_USER_TIMEOUT slightly because it previously only applied
when the socket had pending transmission.

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Reported-by: Andrey Dmitrov <andrey.dmitrov@oktetlabs.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01 16:27:52 -04:00
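The "roughly 153 seconds" figure in b248230c34 above is the sum of a doubling series. A small check, taking the first term (0.3 s) and last term (76.8 s) directly from the commit message; the function name is hypothetical and the series is summed in milliseconds to stay in integers:

```c
#include <assert.h>

/* Sum first_ms + 2*first_ms + 4*first_ms + ... up to and including last_ms. */
long backoff_total_ms(long first_ms, long last_ms)
{
    long total = 0;

    assert(first_ms > 0 && last_ms >= first_ms);
    for (long t = first_ms; t <= last_ms; t *= 2)
        total += t;
    return total;
}

/* Usage: 300 ms doubling up to 76800 ms is nine terms, 300 * 511 ms. */
void backoff_example(void)
{
    assert(backoff_total_ms(300, 76800) == 153300);
}
```

The nine terms add up to 153.3 seconds, matching the rounded value quoted in the message.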
Linus Torvalds
a44f867247 Merge branch 'for-3.17' of git://linux-nfs.org/~bfields/linux
Pull nfsd bugfix from Bruce Fields:
 "This fixes a data corruption bug introduced by the v3.16 xdr encoding
  rewrite.  I haven't managed to reproduce it myself yet, but it's
  apparently not hard to hit given the right workload"

* 'for-3.17' of git://linux-nfs.org/~bfields/linux:
  nfsd4: fix corruption of NFSv4 read data
2014-10-01 13:22:00 -07:00
Fabian Frederick
cb57659a15 cipso: add __init to cipso_v4_cache_init
cipso_v4_cache_init is only called by __init cipso_v4_init

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01 15:46:20 -04:00
Fabian Frederick
57a02c39c1 inet: frags: add __init to ip4_frags_ctl_register
ip4_frags_ctl_register is only called by __init ipfrag_init

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01 15:46:19 -04:00
Fabian Frederick
47d7a88c18 tcp: add __init to tcp_init_mem
tcp_init_mem is only called by __init tcp_init.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01 15:41:14 -04:00
Chun-Hao Lin
ee7a1beb97 r8169:call "rtl8168_driver_start" "rtl8168_driver_stop" only when hardware dash function is enabled
These two functions are used to inform the dash firmware that the driver is
being brought up or brought down. So call these two functions only when the
hardware dash function is enabled.

Signed-off-by: Chun-Hao Lin <hau@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01 15:33:18 -04:00
Chun-Hao Lin
2a9b4d9670 r8169:modify the behavior of function "rtl8168_oob_notify"
In function "rtl8168_oob_notify", use function "rtl_eri_write" to access
eri register 0xe8, instead of using MAC registers "ERIDR" and "ERIAR" to
access it.

To use function "rtl_eri_write" in function "rtl8168_oob_notify", the
"rtl8168_oob_notify" related functions need to move down below the function
"rtl_eri_write".

Signed-off-by: Chun-Hao Lin <hau@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01 15:33:18 -04:00
Chun-Hao Lin
2f8c040ce6 r8169:change the name of function "r8168dp_check_dash" to "r8168_check_dash"
The DASH function is supported not only by RTL8168DP, but also by RTL8168EP.
So change the name of function "r8168dp_check_dash" to "r8168_check_dash".

Signed-off-by: Chun-Hao Lin <hau@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01 15:33:18 -04:00
Chun-Hao Lin
706123d06c r8169:change the name of function"rtl_w1w0_eri"
Change the name of function "rtl_w1w0_eri" to "rtl_w0w1_eri".

In this function, the local variable "val" is updated by "write zeros then
write ones". Please see the code below.

(val & ~m) | p

In this patch, change the function name from "xx_w1w0_xx" to "xx_w0w1_xx".
The changed function name is more suitable for its behavior.

Signed-off-by: Chun-Hao Lin <hau@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01 15:33:18 -04:00
Chun-Hao Lin
7656442824 r8169:for function "rtl_w1w0_phy" change its name and behavior
Change function name from "rtl_w1w0_phy" to "rtl_w0w1_phy".
And change its behavior from "write ones then write zeros" to
"write zeros then write ones".

In the Realtek internal driver, bitwise operations are almost always "write
zeros then write ones". To make it easy to port hardware parameters from the
Realtek internal driver to the Linux kernel driver "r8169", we would like to
change this function's behavior and its name.

Signed-off-by: Chun-Hao Lin <hau@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01 15:33:17 -04:00
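The two orderings named in 706123d06c and 7656442824 above differ only when the bits to set (p) and the bits to clear (m) overlap. In the sketch below, the w0w1 expression is the one quoted in the commit message; the w1w0 expression is an assumption inferred from the phrase "write ones then write zeros". Function names and parameter order are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* "write zeros then write ones": clear mask m first, then set bits p. */
uint32_t w0w1(uint32_t val, uint32_t p, uint32_t m)
{
    return (val & ~m) | p;
}

/* "write ones then write zeros" (assumed form): set p first, then clear m. */
uint32_t w1w0(uint32_t val, uint32_t p, uint32_t m)
{
    return (val | p) & ~m;
}

/* Usage: with overlapping p and m, the operation applied last wins. */
void w0w1_example(void)
{
    assert(w0w1(0x00, 0x0F, 0x03) == 0x0F);  /* overlap bits end up set */
    assert(w1w0(0x00, 0x0F, 0x03) == 0x0C);  /* overlap bits end up cleared */
}
```

With disjoint p and m the two functions return the same value, so the rename matters only for callers that pass overlapping masks.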
Chun-Hao Lin
ac85bcdbc0 r8169:add more chips to support magic packet v2
For RTL8168F RTL8168FB RTL8168G RTL8168GU RTL8411 RTL8411B RTL8402 RTL8107E,
the magic packet enable bit is changed to eri 0xde bit0.

In this patch, change magic packet enable bit of these chips to eri 0xde bit0.

Signed-off-by: Chun-Hao Lin <hau@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01 15:33:17 -04:00
Chun-Hao Lin
89cceb2729 r8169:add support more chips to get mac address from backup mac address register
RTL8168FB RTL8168G RTL8168GU RTL8411 RTL8411B RTL8106EUS RTL8402 support
getting the mac address from the backup mac address register.

Signed-off-by: Chun-Hao Lin <hau@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01 15:33:17 -04:00
Chun-Hao Lin
42fde73710 r8169:add disable/enable RTL8411B pll function
RTL8411B supports the disable/enable pll function.

Signed-off-by: Chun-Hao Lin <hau@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01 15:33:17 -04:00
Chun-Hao Lin
b8e5e6ad71 r8169:add disable/enable RTL8168G pll function
RTL8168G also supports the disable/enable pll function.

Signed-off-by: Chun-Hao Lin <hau@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01 15:33:17 -04:00
Chun-Hao Lin
05b9687bb3 r8169:change uppercase number to lowercase number
Signed-off-by: Chun-Hao Lin <hau@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01 15:33:17 -04:00
David L Stevens
a29c9c43bb sunvnet: fix potential NULL pointer dereference
One of the error cases for vnet_start_xmit()'s "out_dropped" label
is port == NULL, so only mess with port->clean_timer when port is not NULL.

Signed-off-by: David L Stevens <david.stevens@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01 15:26:09 -04:00
Thierry Reding
e506d405ac net: dsa: Fix build warning for !PM_SLEEP
The dsa_switch_suspend() and dsa_switch_resume() functions are only used
when PM_SLEEP is enabled, so they need #ifdef CONFIG_PM_SLEEP protection
to avoid a compiler warning.

Signed-off-by: Thierry Reding <treding@nvidia.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01 15:24:00 -04:00
Subbaraya Sundeep Bhatta
3c87dcbfb3 net: ll_temac: Remove unnecessary ether_setup after alloc_etherdev
Calling ether_setup is redundant since alloc_etherdev calls it.

Signed-off-by: Subbaraya Sundeep Bhatta <sbhatta@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01 15:01:10 -04:00
Eric Dumazet
2c804d0f8f ipv4: mentions skb_gro_postpull_rcsum() in inet_gro_receive()
Proper CHECKSUM_COMPLETE support needs to adjust skb->csum
when we remove one header. It's done using skb_gro_postpull_rcsum()

In the case of IPv4, we know that the adjustment is not really needed,
because the checksum over IPv4 header is 0. Let's add a comment to
ease code comprehension and avoid copy/paste errors.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01 13:44:05 -04:00
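The statement in 2c804d0f8f above that "the checksum over IPv4 header is 0" can be checked with a plain RFC 1071 checksum: once the header's checksum field is filled in, the checksum computed over the whole header is zero, so removing the header leaves a running checksum unchanged. The function below is a userspace illustration with a hypothetical name, not the kernel helper, and the 20-byte header is sample data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Internet checksum (RFC 1071) over len bytes; len must be even here. */
uint16_t inet_csum(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;

    assert(len % 2 == 0);
    for (size_t i = 0; i < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
    while (sum >> 16)                  /* fold carries back into 16 bits */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Usage: fill in the checksum field, then checksum the whole header. */
void ipv4_csum_example(void)
{
    uint8_t hdr[20] = {
        0x45, 0x00, 0x00, 0x73, 0x00, 0x00, 0x40, 0x00,
        0x40, 0x11, 0x00, 0x00,              /* bytes 10..11: checksum field */
        0xc0, 0xa8, 0x00, 0x01, 0xc0, 0xa8, 0x00, 0xc7,
    };
    uint16_t c = inet_csum(hdr, sizeof(hdr));

    hdr[10] = (uint8_t)(c >> 8);
    hdr[11] = (uint8_t)(c & 0xff);
    assert(inet_csum(hdr, sizeof(hdr)) == 0);
}
```

Headers without this property do need the adjustment when they are pulled, which is the copy/paste risk the commit's comment warns about.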
Stephen Rothwell
eb51bbaf8d fm10k: using vmalloc requires including linux/vmalloc.h
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01 13:41:26 -04:00
Fabian Frederick
f0a0c1cedf ieee802154: fix __init functions
Commit 3243acd37f
("ieee802154: add __init to lowpan_frags_sysctl_register")

added __init to lowpan_frags_ns_sysctl_register instead of
lowpan_frags_sysctl_register

Suggested-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01 02:03:13 -04:00
Linus Torvalds
aad7fb916a Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm
Pull ARM fixes from Russell King:
 "Some further ARM fixes:
   - another build fix for the kprobes test code
   - a fix for no kuser helpers for the set_tls code, which oopsed on
     noMMU hardware
   - a fix for alignment handler with neon opcodes being misinterpreted
   - turning off the hardware access support, which is not implemented
   - a build fix for the v7 coherency exiting code, which can be built
     in non-v7 environments (but still only executed on v7 CPUs)"

* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
  ARM: 8179/1: kprobes-test: Fix compile error "bad immediate value for offset"
  ARM: 8178/1: fix set_tls for !CONFIG_KUSER_HELPERS
  ARM: 8177/1: cacheflush: Fix v7_exit_coherency_flush exynos build breakage on ARMv6
  ARM: 8165/1: alignment: don't break misaligned NEON load/store
  ARM: 8164/1: mm: clear SCTLR.HA instead of setting it for LPAE
2014-09-30 19:52:08 -07:00
David S. Miller
09bba1ca55 Merge branch 'sunvnet-jumbograms'
David L Stevens says:

====================
sunvnet: add jumbo frames support

This patch set updates the sunvnet driver to version 1.6 of the VIO protocol
to support per-port exchange of MTU information and allow non-standard MTU
sizes, including jumbo frames.

Using large MTUs shows a nearly 5X throughput improvement Linux-Solaris
and > 10X throughput improvement Linux-Linux.

Changes from v8:
	-add a short timeout to free pending skbs if a new transmit doesn't
	 do it first per Dave Miller <davem@davemloft.net>
Changes from v7:
	-handle skb allocation failures in vnet_skb_shape()
	 per Dave Miller <davem@davemloft.net>
Changes from v6:
	-made kernel transmit path zero-copy to remove memory n^2 scaling issue
	 raised by Raghuram Kothakota <Raghuram.Kothakota@oracle.com>
Changes from v5:
	- fixed comment per Sowmini Varadhan <sowmini.varadhan@oracle.com>
Changes from v4:
	- changed VNET_MAXPACKET per David Laight <David.Laight@ACULAB.COM>
	- added cookies to support non-contiguous buffers of max size
Changes from v3:
	- added version functions per Dave Miller <davem@davemloft.net>
	- moved rmtu to vnet_port per Dave Miller <davem@davemloft.net>
	- explicitly set options bits and capability flags to 0 per
		Raghuram Kothakota <Raghuram.Kothakota@oracle.com>
Changes from v2:
	- make checkpatch clean
Changes from v1:
	- fix brace formatting per Dave Miller <davem@davemloft.net>
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-30 17:10:47 -04:00
David L Stevens
a2b78e9b2c sunvnet: generate ICMP PTMUD messages for smaller port MTUs
This patch sends ICMP and ICMPv6 messages for Path MTU Discovery when a remote
port MTU is smaller than the device MTU. This allows mixing newer VIO protocol
devices that support MTU negotiation with older devices that do not on the
same vswitch. It also allows Linux-Linux LDOMs to use 64K-1 data packets even
though Solaris vswitch is limited to <16K MTU.

Signed-off-by: David L Stevens <david.stevens@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-30 17:10:39 -04:00
David L Stevens
42db672dca sunvnet: allow admin to set sunvnet MTU
This patch allows an admin to set the MTU on a sunvnet device to arbitrary
values between the minimum (68) and maximum (65535) IPv4 packet sizes.

Signed-off-by: David L Stevens <david.stevens@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-30 17:10:39 -04:00
David L Stevens
8e845f4cbb sunvnet: make transmit path zero-copy in the kernel
This patch removes pre-allocated transmit buffers and instead directly maps
pending packets on demand. This saves O(n^2) maximum-sized transmit buffers,
for n hosts on a vswitch, as well as a copy to those buffers.

Single-stream TCP throughput linux-solaris dropped ~5% for 1500-byte MTU,
but linux-linux at 1500-bytes increased ~20%.

Signed-off-by: David L Stevens <david.stevens@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-30 17:10:39 -04:00
David L Stevens
e4defc7754 sunvnet: upgrade to VIO protocol version 1.6
This patch upgrades the sunvnet driver to support VIO protocol version 1.6.
In particular, it adds per-port MTU negotiation, allowing MTUs other than
ETH_FRAMELEN with ports using newer VIO protocol versions.

Signed-off-by: David L Stevens <david.stevens@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-30 17:10:39 -04:00
Li RongQing
a12a601ed1 tcp: Change tcp_slow_start function to return void
No caller uses the return value, so make this function return void.

Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-30 17:09:16 -04:00
Fabian Frederick
3243acd37f ieee802154: add __init to lowpan_frags_sysctl_register
lowpan_frags_sysctl_register is only called by __init lowpan_net_frag_init
(part of the lowpan module).

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-30 17:08:06 -04:00
Fabian Frederick
0d4a2f9a33 irda: add __init to irlan_open
irlan_open is only called by __init irlan_init in same module.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-30 17:08:06 -04:00
Guenter Roeck
72d099e257 next: mips: bpf: Fix build failure
Fix:

arch/mips/net/bpf_jit.c: In function 'build_body':
arch/mips/net/bpf_jit.c:762:6: error: unused variable 'tmp'
cc1: all warnings being treated as errors
make[2]: *** [arch/mips/net/bpf_jit.o] Error 1

Seen when building mips:allmodconfig in -next since next-20140924.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-30 16:54:51 -04:00
David S. Miller
9ba10afe32 Merge branch 'pxa168_eth'
Antoine Tenart says:

====================
ARM: Berlin: Ethernet support

This series introduces support for the Ethernet controller on Berlin SoCs,
using the existing pxa168 Ethernet driver. In order to do this, DT
support is added to the driver alongside some other modifications and
fixes.

This has been tested on a Berlin BG2Q DMP board.

Changes since v5:
	- fixed the build when building the driver as a module

Changes since v4:
        - removed the phy-addr property and added a phy subnode
        - added COMPILE_TEST for the pxa168_eth driver

Changes since v3:
        - moved the addition of pxa168_eth_get_mac_address() to the patch
          using it first

Changes since v2:
        - reworked how the MAC address is configured
        - made the clock anonymous

Changes since v1:
        - removed custom Berlin Ethernet driver
        - used the pxa168 Ethernet driver instead
        - made modifications to the pxa168 driver (DT support, fixes)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-30 16:37:13 -04:00