Commit Graph

21079 Commits

Author SHA1 Message Date
David S. Miller
3830847396 ipv6: Various cleanups in route.c
1) x == NULL --> !x
2) x != NULL --> x
3) (x&BIT) --> (x & BIT)
4) (BIT1|BIT2) --> (BIT1 | BIT2)
5) proper argument and struct member alignment

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-03 18:02:47 -05:00
David S. Miller
507c9b1e07 ipv6: Various cleanups in ip6_route.c
1) x == NULL --> !x
2) x != NULL --> x
3) if() --> if ()
4) while() --> while ()
5) (x & BIT) == 0 --> !(x & BIT)
6) (x&BIT) --> (x & BIT)
7) x=y --> x = y
8) (BIT1|BIT2) --> (BIT1 | BIT2)
9) if ((x & BIT)) --> if (x & BIT)
10) proper argument and struct member alignment

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-03 17:50:45 -05:00
David S. Miller
340e8dc1fb atm: clip: Remove code commented out since eternity.
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-02 14:27:11 -05:00
David S. Miller
b3613118eb Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2011-12-02 13:49:21 -05:00
Linus Torvalds
5983fe2b29 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (73 commits)
  netfilter: Remove ADVANCED dependency from NF_CONNTRACK_NETBIOS_NS
  ipv4: flush route cache after change accept_local
  sch_red: fix red_change
  Revert "udp: remove redundant variable"
  bridge: master device stuck in no-carrier state forever when in user-stp mode
  ipv4: Perform peer validation on cached route lookup.
  net/core: fix rollback handler in register_netdevice_notifier
  sch_red: fix red_calc_qavg_from_idle_time
  bonding: only use primary address for ARP
  ipv4: fix lockdep splat in rt_cache_seq_show
  sch_teql: fix lockdep splat
  net: fec: Select the FEC driver by default for i.MX SoCs
  isdn: avoid copying too long drvid
  isdn: make sure strings are null terminated
  netlabel: Fix build problems when IPv6 is not enabled
  sctp: better integer overflow check in sctp_auth_create_key()
  sctp: integer overflow in sctp_auth_create_key()
  ipv6: Set mcast_hops to IPV6_DEFAULT_MCASTHOPS when -1 was given.
  net: Fix corruption in /proc/*/net/dev_mcast
  mac80211: fix race between the AGG SM and the Tx data path
  ...
2011-12-01 20:09:08 -08:00
David S. Miller
3ced1be549 netfilter: Remove ADVANCED dependency from NF_CONNTRACK_NETBIOS_NS
firewalld in Fedora 16 needs this.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-01 22:19:01 -05:00
Peter Pan(潘卫平)
d01ff0a049 ipv4: flush route cache after change accept_local
After reset ipv4_devconf->data[IPV4_DEVCONF_ACCEPT_LOCAL] to 0,
we should flush route cache, or it will continue receive packets with local
source address, which should be dropped.

Signed-off-by: Weiping Pan <panweiping3@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-01 21:46:12 -05:00
Eric Dumazet
1ee5fa1e99 sch_red: fix red_change
Le mercredi 30 novembre 2011 à 14:36 -0800, Stephen Hemminger a écrit :

> (Almost) nobody uses RED because they can't figure it out.
> According to Wikipedia, VJ says that:
>  "there are not one, but two bugs in classic RED."

RED is useful for high throughput routers, I doubt many linux machines
act as such devices.

I was considering adding Adaptative RED (Sally Floyd, Ramakrishna
Gummadi, Scott Shender), August 2001

In this version, maxp is dynamic (from 1% to 50%), and user only have to
setup min_th (target average queue size)
(max_th and wq (burst in linux RED) are automatically setup)

By the way it seems we have a small bug in red_change()

if (skb_queue_empty(&sch->q))
	red_end_of_idle_period(&q->parms);

First, if queue is empty, we should call
red_start_of_idle_period(&q->parms);

Second, since we dont use anymore sch->q, but q->qdisc, the test is
meaningless.

Oh well...

[PATCH] sch_red: fix red_change()

Now RED is classful, we must check q->qdisc->q.qlen, and if queue is empty,
we start an idle period, not end it.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-01 19:24:38 -05:00
David S. Miller
d984e6197e dccp: Fix compile warning in probe code.
Commit 1386be55e3 ("dccp: fix
auto-loading of dccp(_probe)") fixed a bug but created a new
compiler warning:

net/dccp/probe.c: In function ‘dccpprobe_init’:
net/dccp/probe.c:166:2: warning: the omitted middle operand in ?: will always be ‘true’, suggest explicit middle operand [-Wparentheses]

try_then_request_module() is built for situations where the
"existence" test is some lookup function that returns a non-NULL
object on success, and with a reference count of some kind held.

Here we're looking for a success return of zero from the jprobe
registry.

Instead of fighting the way try_then_request_module() works, simply
open code what we want to happen in a local helper function.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-01 14:45:49 -05:00
David S. Miller
59c2cdae27 Revert "udp: remove redundant variable"
This reverts commit 81d54ec847.

If we take the "try_again" goto, due to a checksum error,
the 'len' has already been truncated.  So we won't compute
the same values as the original code did.

Reported-by: paul bilke <fsmail@conspiracy.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-01 14:12:55 -05:00
Vitalii Demianets
b03b6dd58c bridge: master device stuck in no-carrier state forever when in user-stp mode
When in user-stp mode, bridge master do not follow state of its slaves, so
after the following sequence of events it can stuck forever in no-carrier
state:
1) turn stp off
2) put all slaves down - master device will follow their state and also go in
no-carrier state
3) turn stp on with bridge-stp script returning 0 (go to the user-stp mode)
Now bridge master won't follow slaves' state and will never reach running
state.

This patch solves the problem by making user-stp and kernel-stp behavior
similar regarding master following slaves' states.

Signed-off-by: Vitalii Demianets <vitas@nppfactor.kiev.ua>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-01 14:05:17 -05:00
David S. Miller
efbc368dcc ipv4: Perform peer validation on cached route lookup.
Otherwise we won't notice the peer GENID change.

Reported-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-01 13:38:59 -05:00
Eric Dumazet
84f9307c5d ipv4: use a 64bit load/store in output path
gcc compiler is smart enough to use a single load/store if we
memcpy(dptr, sptr, 8) on x86_64, regardless of
CONFIG_CC_OPTIMIZE_FOR_SIZE

In IP header, daddr immediately follows saddr, this wont change in the
future. We only need to make sure our flowi4 (saddr,daddr) fields wont
break the rule.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-01 13:28:54 -05:00
David S. Miller
898f73585b dccp: Evaluate ip_hdr() only once in dccp_v4_route_skb().
This also works around a bogus gcc warning generated by an
upcoming patch from Eric Dumazet that rearranges the layout
of struct flowi4.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-01 13:28:34 -05:00
Eric Dumazet
b536db9332 net: net_device flags is an unsigned int
commit b00055aacd ([NET] core: add RFC2863 operstate) changed
net_device flags from unsigned short to unsigned int.

Some core functions still assume its an unsigned short.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-01 11:41:48 -05:00
Eric Dumazet
fc33cc7242 netem: fix build error on 32bit arches
ERROR: "__udivdi3" [net/sched/sch_netem.ko] undefined!

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-12-01 11:40:19 -05:00
RongQing.Li
8f89148986 net/core: fix rollback handler in register_netdevice_notifier
Within nested statements, the break statement terminates only the
do, for, switch, or while statement that immediately encloses it,
So replace the break with goto.

Signed-off-by: RongQing.Li <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-30 23:43:07 -05:00
sjur.brandeland@stericsson.com
e977b4cf63 caif: Remove unused enum and parameter in cfserl
Remove unused enum cfcnfg_phy_type and the parameter to cfserl_create.

Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-30 23:30:48 -05:00
sjur.brandeland@stericsson.com
7c18d2205e caif: Restructure how link caif link layer enroll
Enrolling CAIF link layers are refactored.

Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-30 23:30:48 -05:00
sjur.brandeland@stericsson.com
200c5a3b38 caif: Allow cfpkt_extr_head to process empty message
Allow NULL pointer in cfpkt_extr_head in order to
skip past header data.

Signed-off-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-30 23:30:47 -05:00
Hagen Paul Pfeifer
7bc0f28c7a netem: rate extension
Currently netem is not in the ability to emulate channel bandwidth. Only static
delay (and optional random jitter) can be configured.

To emulate the channel rate the token bucket filter (sch_tbf) can be used.  But
TBF has some major emulation flaws. The buffer (token bucket depth/rate) cannot
be 0. Also the idea behind TBF is that the credit (token in buckets) fills if
no packet is transmitted. So that there is always a "positive" credit for new
packets. In real life this behavior contradicts the law of nature where
nothing can travel faster as speed of light. E.g.: on an emulated 1000 byte/s
link a small IPv4/TCP SYN packet with ~50 byte require ~0.05 seconds - not 0
seconds.

Netem is an excellent place to implement a rate limiting feature: static
delay is already implemented, tfifo already has time information and the
user can skip TBF configuration completely.

This patch implement rate feature which can be configured via tc. e.g:

	tc qdisc add dev eth0 root netem rate 10kbit

To emulate a link of 5000byte/s and add an additional static delay of 10ms:

	tc qdisc add dev eth0 root netem delay 10ms rate 5KBps

Note: similar to TBF the rate extension is bounded to the kernel timing
system. Depending on the architecture timer granularity, higher rates (e.g.
10mbit/s and higher) tend to transmission bursts. Also note: further queues
living in network adaptors; see ethtool(8).

Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@drr.davemloft.net>
2011-11-30 23:18:35 -05:00
Jun Zhao
99d2f47aa9 ipv6 : mcast : Delete useless parameter in ip6_mc_add1_src()
Need not to used 'delta' flag when add single-source to interface
filter source list.

Signed-off-by: Jun Zhao <mypopydev@gmail.com>
Signed-off-by: David S. Miller <davem@drr.davemloft.net>
2011-11-30 23:10:02 -05:00
Jun Zhao
5eb81e8916 ipv4 : igmp : Delete useless parameter in ip_mc_add1_src()
Need not to used 'delta' flag when add single-source to interface
filter source list.

Signed-off-by: Jun Zhao <mypopydev@gmail.com>
Signed-off-by: David S. Miller <davem@drr.davemloft.net>
2011-11-30 23:10:01 -05:00
David Miller
32092ecf06 atm: clip: Use device neigh support on top of "arp_tbl".
Instead of instantiating an entire new neigh_table instance
just for ATM handling, use the neigh device private facility.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-30 18:51:03 -05:00
David Miller
da6a8fa027 neigh: Add device constructor/destructor capability.
If the neigh entry has device private state, it will need
constructor/destructor ops.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-30 18:48:03 -05:00
David Miller
869759b9e4 atm: clip: Convert over to neighbour_priv()
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-30 18:46:44 -05:00
David Miller
76cc714ed5 neigh: Do not set tbl->entry_size in ipv4/ipv6 neigh tables.
Let the core self-size the neigh entry based upon the key length.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-30 18:46:43 -05:00
David Miller
596b9b68ef neigh: Add infrastructure for allocating device neigh privates.
netdev->neigh_priv_len records the private area length.

This will trigger for neigh_table objects which set tbl->entry_size
to zero, and the first instances of this will be forthcoming.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-30 18:46:43 -05:00
David Miller
5b8b0060cb neigh: Get rid of neigh_table->kmem_cachep
We are going to alloc for device specific private areas for
neighbour entries, and in order to do that we have to move
away from the fixed allocation size enforced by using
neigh_table->kmem_cachep

As a nice side effect we can now use kfree_rcu().

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-30 18:46:43 -05:00
Eric Dumazet
218fa90f07 ipv4: fix lockdep splat in rt_cache_seq_show
After commit f2c31e32b3 (fix NULL dereferences in check_peer_redir()),
dst_get_neighbour() should be guarded by rcu_read_lock() /
rcu_read_unlock() section.

Reported-by: Miles Lane <miles.lane@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-30 17:24:14 -05:00
Eric Dumazet
f7e57044ee sch_teql: fix lockdep splat
We need rcu_read_lock() protection before using dst_get_neighbour(), and
we must cache its value (pass it to __teql_resolve())

teql_master_xmit() is called under rcu_read_lock_bh() protection, its
not enough.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-30 17:10:49 -05:00
Eric Dumazet
d8a6e65f8b tcp: inherit listener congestion control for passive cnx
Rick Jones reported that TCP_CONGESTION sockopt performed on a listener
was ignored for its children sockets : right after accept() the
congestion control for new socket is the system default one.

This seems an oversight of the initial design (quoted from Stephen)

Based on prior investigation and patch from Rick.

Reported-by: Rick Jones <rick.jones2@hp.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Stephen Hemminger <shemminger@vyatta.com>
CC: Yuchung Cheng <ycheng@google.com>
Tested-by: Rick Jones <rick.jones2@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-30 16:55:26 -05:00
John W. Linville
3b95e9c089 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem 2011-11-30 14:14:42 -05:00
RongQing.Li
e92036a651 ipv4: remove useless codes in ipmr_device_event()
Commit 7dc00c82 added a 'notify' parameter for vif_delete() to
distinguish whether to unregister the device.

When notify=1 means we does not need to unregister the device,
so calling unregister_netdevice_many is useless.

Signed-off-by: RongQing.Li <roy.qing.li@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-29 18:32:33 -05:00
Igor Maravic
6977a79d36 net: Fix skb_update_prio RCU usage.
Change function rcu_dereference to rcu_dereference_bh to avoid warning

[ INFO: suspicious RCU usage. ]
-------------------------------
net/core/dev.c:2459 suspicious rcu_dereference_check() usage!

because we are locking with

rcu_read_lock_bh();

in function dev_queue_xmit(struct sk_buff *skb)

Signed-off-by: Igor Maravic <igorm@etf.rs>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-29 18:25:17 -05:00
Paul Moore
1281bc2565 netlabel: Fix build problems when IPv6 is not enabled
A recent fix to the the NetLabel code caused build problem with
configurations that did not have IPv6 enabled; see below:

 netlabel_kapi.c: In function 'netlbl_cfg_unlbl_map_add':
 netlabel_kapi.c:165:4:
  error: implicit declaration of function 'netlbl_af6list_add'

This patch fixes this problem by making the IPv6 specific code conditional
on the IPv6 configuration flags as we done in the rest of NetLabel and the
network stack as a whole.  We have to move some variable declarations
around as a result so things may not be quite as pretty, but at least it
builds cleanly now.

Some additional IPv6 conditionals were added to the NetLabel code as well
for the sake of consistency.

Reported-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-29 16:48:52 -05:00
Xi Wang
c89304b8ea sctp: better integer overflow check in sctp_auth_create_key()
The check from commit 30c2235c is incomplete and cannot prevent
cases like key_len = 0x80000000 (INT_MAX + 1).  In that case, the
left-hand side of the check (INT_MAX - key_len), which is unsigned,
becomes 0xffffffff (UINT_MAX) and bypasses the check.

However this shouldn't be a security issue.  The function is called
from the following two code paths:

 1) setsockopt()

 2) sctp_auth_asoc_set_secret()

In case (1), sca_keylength is never going to exceed 65535 since it's
bounded by a u16 from the user API.  As such, the key length will
never overflow.

In case (2), sca_keylength is computed based on the user key (1 short)
and 2 * key_vector (3 shorts) for a total of 7 * USHRT_MAX, which still
will not overflow.

In other words, this overflow check is not really necessary.  Just
make it more correct.

Signed-off-by: Xi Wang <xi.wang@gmail.com>
Cc: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-29 15:51:03 -05:00
Eric Dumazet
2bcc34bb98 sch_choke: use skb_flow_dissect()
Instead of using a custom flow dissector, use skb_flow_dissect() and
benefit from tunnelling support.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-29 13:17:03 -05:00
Eric Dumazet
11fca931d3 sch_sfq: use skb_flow_dissect()
Instead of using a custom flow dissector, use skb_flow_dissect() and
benefit from tunnelling support.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-29 13:17:03 -05:00
Eric Dumazet
f07d960df3 tcp: avoid frag allocation for small frames
tcp_sendmsg() uses select_size() helper to choose skb head size when a
new skb must be allocated.

If GSO is enabled for the socket, current strategy is to force all
payload data to be outside of headroom, in PAGE fragments.

This strategy is not welcome for small packets, wasting memory.

Experiments show that best results are obtained when using 2048 bytes
for skb head (This includes the skb overhead and various headers)

This patch provides better len/truesize ratios for packets sent to
loopback device, and reduce memory needs for in-flight loopback packets,
particularly on arches with big pages.

If a sender sends many 1-byte packets to an unresponsive application,
receiver rmem_alloc will grow faster and will stop queuing these packets
sooner, or will collapse its receive queue to free excess memory.

netperf -t TCP_RR results are improved by ~4 %, and many workloads are
improved as well (tbench, mysql...)

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-29 13:17:03 -05:00
Eric Dumazet
4d77d2b567 flow_dissector: use a 64bit load/store
Le lundi 28 novembre 2011 à 19:06 -0500, David Miller a écrit :
> From: Dimitris Michailidis <dm@chelsio.com>
> Date: Mon, 28 Nov 2011 08:25:39 -0800
>
> >> +bool skb_flow_dissect(const struct sk_buff *skb, struct flow_keys
> >> *flow)
> >> +{
> >> +	int poff, nhoff = skb_network_offset(skb);
> >> +	u8 ip_proto;
> >> +	u16 proto = skb->protocol;
> >
> > __be16 instead of u16 for proto?
>
> I'll take care of this when I apply these patches.

( CC trimmed )

Thanks David !

Here is a small patch to use one 64bit load/store on x86_64 instead of
two 32bit load/stores.

[PATCH net-next] flow_dissector: use a 64bit load/store

gcc compiler is smart enough to use a single load/store if we
memcpy(dptr, sptr, 8) on x86_64, regardless of
CONFIG_CC_OPTIMIZE_FOR_SIZE

In IP header, daddr immediately follows saddr, this wont change in the
future. We only need to make sure our flow_keys (src,dst) fields wont
break the rule.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-29 13:17:03 -05:00
Tom Herbert
114cf58021 bql: Byte queue limits
Networking stack support for byte queue limits, uses dynamic queue
limits library.  Byte queue limits are maintained per transmit queue,
and a dql structure has been added to netdev_queue structure for this
purpose.

Configuration of bql is in the tx-<n> sysfs directory for the queue
under the byte_queue_limits directory.  Configuration includes:
limit_min, bql minimum limit
limit_max, bql maximum limit
hold_time, bql slack hold time

Also under the directory are:
limit, current byte limit
inflight, current number of bytes on the queue

Signed-off-by: Tom Herbert <therbert@google.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-29 12:46:19 -05:00
Tom Herbert
927fbec13e xps: Add xps_queue_release function
This patch moves the xps specific parts in netdev_queue_release into
its own function which netdev_queue_release can call.  This allows
netdev_queue_release to be more generic (for adding new attributes
to tx queues).

Signed-off-by: Tom Herbert <therbert@google.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-29 12:46:19 -05:00
Tom Herbert
7346649826 net: Add queue state xoff flag for stack
Create separate queue state flags so that either the stack or drivers
can turn on XOFF.  Added a set of functions used in the stack to determine
if a queue is really stopped (either by stack or driver)

Signed-off-by: Tom Herbert <therbert@google.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-29 12:46:19 -05:00
David S. Miller
c1baa88431 Merge branch 'nf' of git://1984.lsi.us.es/net 2011-11-29 01:20:55 -05:00
Neal Cardwell
6b5a5c0dbb tcp: do not scale TSO segment size with reordering degree
Since 2005 (c1b4a7e695)
tcp_tso_should_defer has been using tcp_max_burst() as a target limit
for deciding how large to make outgoing TSO packets when not using
sysctl_tcp_tso_win_divisor. But since 2008
(dd9e0dda66) tcp_max_burst() returns the
reordering degree. We should not have tcp_tso_should_defer attempt to
build larger segments just because there is more reordering. This
commit splits the notion of deferral size used in TSO from the notion
of burst size used in cwnd moderation, and returns the TSO deferral
limit to its original value.

Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-29 00:29:41 -05:00
Pascal Hambourg
befc93fe76 atm: br2684: Avoid alignment issues
Use memcmp() instead of cast to u16 when checking the PAD field.

Signed-off-by: Pascal Hambourg <pascal@plouf.fr.eu.org>
Signed-off-by: chas williams - CONTRACTOR <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-29 00:28:06 -05:00
Pascal Hambourg
9e667b2988 atm: br2684: Make headroom and hard_header_len depend on the payload type
Routed payload requires less headroom than bridged payload.
So do not reallocate headroom if not needed.
Also, add worst case AAL5 overhead to netdev->hard_header_len.

Signed-off-by: Pascal Hambourg <pascal@plouf.fr.eu.org>
Signed-off-by: chas williams - CONTRACTOR <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-29 00:28:06 -05:00
Eric Dumazet
08e29af3a9 net: optimize socket timestamping
We can test/set multiple bits from sk_flags at once, to shorten a bit
socket setup/dismantle phase.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-29 00:27:11 -05:00
Eric Dumazet
b90e5794c5 net: dont call jump_label_dec from irq context
Igor Maravic reported an error caused by jump_label_dec() being called
from IRQ context :

 BUG: sleeping function called from invalid context at kernel/mutex.c:271
 in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper
 1 lock held by swapper/0:
  #0:  (&n->timer){+.-...}, at: [<ffffffff8107ce90>] call_timer_fn+0x0/0x340
 Pid: 0, comm: swapper Not tainted 3.2.0-rc2-net-next-mpls+ #1
Call Trace:
 <IRQ>  [<ffffffff8104f417>] __might_sleep+0x137/0x1f0
 [<ffffffff816b9a2f>] mutex_lock_nested+0x2f/0x370
 [<ffffffff810a89fd>] ? trace_hardirqs_off+0xd/0x10
 [<ffffffff8109a37f>] ? local_clock+0x6f/0x80
 [<ffffffff810a90a5>] ? lock_release_holdtime.part.22+0x15/0x1a0
 [<ffffffff81557929>] ? sock_def_write_space+0x59/0x160
 [<ffffffff815e936e>] ? arp_error_report+0x3e/0x90
 [<ffffffff810969cd>] atomic_dec_and_mutex_lock+0x5d/0x80
 [<ffffffff8112fc1d>] jump_label_dec+0x1d/0x50
 [<ffffffff81566525>] net_disable_timestamp+0x15/0x20
 [<ffffffff81557a75>] sock_disable_timestamp+0x45/0x50
 [<ffffffff81557b00>] __sk_free+0x80/0x200
 [<ffffffff815578d0>] ? sk_send_sigurg+0x70/0x70
 [<ffffffff815e936e>] ? arp_error_report+0x3e/0x90
 [<ffffffff81557cba>] sock_wfree+0x3a/0x70
 [<ffffffff8155c2b0>] skb_release_head_state+0x70/0x120
 [<ffffffff8155c0b6>] __kfree_skb+0x16/0x30
 [<ffffffff8155c119>] kfree_skb+0x49/0x170
 [<ffffffff815e936e>] arp_error_report+0x3e/0x90
 [<ffffffff81575bd9>] neigh_invalidate+0x89/0xc0
 [<ffffffff81578dbe>] neigh_timer_handler+0x9e/0x2a0
 [<ffffffff81578d20>] ? neigh_update+0x640/0x640
 [<ffffffff81073558>] __do_softirq+0xc8/0x3a0

Since jump_label_{inc|dec} must be called from process context only,
we must defer jump_label_dec() if net_disable_timestamp() is called
from interrupt context.

Reported-by: Igor Maravic <igorm@etf.rs>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-11-29 00:26:25 -05:00