Commit Graph

210432 Commits

Author SHA1 Message Date
Eric Dumazet
e71895a1be xfrm: dont assume rcu_read_lock in xfrm_output_one()
ip_local_out() is called with rcu_read_lock() held from ip_queue_xmit()
but not from other call sites.

Reported-and-bisected-by: Nick Bowler <nbowler@elliptictech.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-16 21:46:15 -07:00
Matthew Garrett
801e147cde r8169: Handle rxfifo errors on 8168 chips
The Thinkpad X100e seems to have some odd behaviour when the display is
powered off - the onboard r8169 starts generating rxfifo overflow errors.
The root cause of this has not yet been identified and may well be a
hardware design bug on the platform, but r8169 should be more resiliant to
this. This patch enables the rxfifo interrupt on 8168 devices and removes
the MAC version check in the interrupt handler, and the machine no longer
crashes when under network load while the screen turns off.

Signed-off-by: Matthew Garrett <mjg@redhat.com>
Acked-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-15 19:32:59 -07:00
Denis Kirjanov
84176b7b56 3c59x: Remove atomic context inside vortex_{set|get}_wol
There is no need to use spinlocks in vortex_{set|get}_wol.
This also fixes a bug:
[  254.214993] 3c59x 0000:00:0d.0: PME# enabled
[  254.215021] BUG: sleeping function called from invalid context at kernel/mutex.c:94
[  254.215030] in_atomic(): 0, irqs_disabled(): 1, pid: 4875, name: ethtool
[  254.215042] Pid: 4875, comm: ethtool Tainted: G        W   2.6.36-rc3+ #7
[  254.215049] Call Trace:
[  254.215050]  [] __might_sleep+0xb1/0xb6
[  254.215050]  [] mutex_lock+0x17/0x30
[  254.215050]  [] acpi_enable_wakeup_device_power+0x2b/0xb1
[  254.215050]  [] acpi_pm_device_sleep_wake+0x42/0x7f
[  254.215050]  [] acpi_pci_sleep_wake+0x5d/0x63
[  254.215050]  [] platform_pci_sleep_wake+0x1d/0x20
[  254.215050]  [] __pci_enable_wake+0x90/0xd0
[  254.215050]  [] acpi_set_WOL+0x8e/0xf5 [3c59x]
[  254.215050]  [] vortex_set_wol+0x4e/0x5e [3c59x]
[  254.215050]  [] dev_ethtool+0x1cf/0xb61
[  254.215050]  [] ? debug_mutex_free_waiter+0x45/0x4a
[  254.215050]  [] ? __mutex_lock_common+0x204/0x20e
[  254.215050]  [] ? __mutex_lock_slowpath+0x12/0x15
[  254.215050]  [] ? mutex_lock+0x23/0x30
[  254.215050]  [] dev_ioctl+0x42c/0x533
[  254.215050]  [] ? _cond_resched+0x8/0x1c
[  254.215050]  [] ? lock_page+0x1c/0x30
[  254.215050]  [] ? page_address+0x15/0x7c
[  254.215050]  [] ? filemap_fault+0x187/0x2c4
[  254.215050]  [] sock_ioctl+0x1d4/0x1e0
[  254.215050]  [] ? sock_ioctl+0x0/0x1e0
[  254.215050]  [] vfs_ioctl+0x19/0x33
[  254.215050]  [] do_vfs_ioctl+0x424/0x46f
[  254.215050]  [] ? selinux_file_ioctl+0x3c/0x40
[  254.215050]  [] sys_ioctl+0x40/0x5a
[  254.215050]  [] sysenter_do_call+0x12/0x22

vortex_set_wol protected with a spinlock, but nested  acpi_set_WOL acquires a mutex inside atomic context.
Ethtool operations are already serialized by RTNL mutex, so it is safe to drop the locks.

Signed-off-by: Denis Kirjanov <dkirjanov@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-15 14:32:39 -07:00
Alexey Kuznetsov
01f83d6984 tcp: Prevent overzealous packetization by SWS logic.
If peer uses tiny MSS (say, 75 bytes) and similarly tiny advertised
window, the SWS logic will packetize to half the MSS unnecessarily.

This causes problems with some embedded devices.

However for large MSS devices we do want to half-MSS packetize
otherwise we never get enough packets into the pipe for things
like fast retransmit and recovery to work.

Be careful also to handle the case where MSS > window, otherwise
we'll never send until the probe timer.

Reported-by: ツ Leandro Melo de Sales <leandroal@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-15 12:01:44 -07:00
David S. Miller
6dcbc12290 net: RPS needs to depend upon USE_GENERIC_SMP_HELPERS
You cannot invoke __smp_call_function_single() unless the
architecture sets this symbol.

Reported-by: Daniel Hellstrom <daniel@gaisler.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-14 21:42:22 -07:00
Simon Guinot
fddd91016d phylib: fix PAL state machine restart on resume
On resume, before starting the PAL state machine, check if the
adjust_link() method is well supplied. If not, this would lead to a
NULL pointer dereference in the phy_state_machine() function.

This scenario can happen if the Ethernet driver call manually the PHY
functions instead of using the PAL state machine. The mv643xx_eth driver
is a such example.

Signed-off-by: Simon Guinot <sguinot@lacie.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-14 14:31:03 -07:00
Eric Dumazet
ef885afbf8 net: use rcu_barrier() in rollback_registered_many
netdev_wait_allrefs() waits that all references to a device vanishes.

It currently uses a _very_ pessimistic 250 ms delay between each probe.
Some users reported that no more than 4 devices can be dismantled per
second, this is a pretty serious problem for some setups.

Most of the time, a refcount is about to be released by an RCU callback,
that is still in flight because rollback_registered_many() uses a
synchronize_rcu() call instead of rcu_barrier(). Problem is visible if
number of online cpus is one, because synchronize_rcu() is then a no op.

time to remove 50 ipip tunnels on a UP machine :

before patch : real 11.910s
after patch : real 1.250s

Reported-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reported-by: Octavian Purdila <opurdila@ixiacom.com>
Reported-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-14 14:27:29 -07:00
Andy Gospodarek
ab12811c89 bonding: correctly process non-linear skbs
It was recently brought to my attention that 802.3ad mode bonds would no
longer form when using some network hardware after a driver update.
After snooping around I realized that the particular hardware was using
page-based skbs and found that skb->data did not contain a valid LACPDU
as it was not stored there.  That explained the inability to form an
802.3ad-based bond.  For balance-alb mode bonds this was also an issue
as ARPs would not be properly processed.

This patch fixes the issue in my tests and should be applied to 2.6.36
and as far back as anyone cares to add it to stable.

Thanks to Alexander Duyck <alexander.h.duyck@intel.com> and Jesse
Brandeburg <jesse.brandeburg@intel.com> for the suggestions on this one.

Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
CC: Jesse Brandeburg <jesse.brandeburg@intel.com>
CC: stable@kerne.org
Signed-off-by: Jay Vosburgh <fubar@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-14 14:25:32 -07:00
Michael S. Tsirkin
ee05d6939e vhost-net: fix range checking in mrg bufs case
In mergeable buffer case, we use headcount, log_num
and seg as indexes in same-size arrays, and
we know that headcount <= seg and
log_num equals either 0 or seg.

Therefore, the right thing to do is range-check seg,
not headcount as we do now: these will be different
if guest chains s/g descriptors (this does not
happen now, but we can not trust the guest).

Long term, we should add BUG_ON checks to verify
two other indexes are what we think they should be.

Reported-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2010-09-14 15:22:41 +02:00
Michael Kerrisk
a89b47639f ipv4: enable getsockopt() for IP_NODEFRAG
While integrating your man-pages patch for IP_NODEFRAG, I noticed
that this option is settable by setsockopt(), but not gettable by
getsockopt(). I suppose this is not intended. The (untested,
trivial) patch below adds getsockopt() support.

Signed-off-by: Michael kerrisk <mtk.manpages@gmail.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-13 19:57:23 -07:00
Bob Arendt
7998156344 ipv4: force_igmp_version ignored when a IGMPv3 query received
After all these years, it turns out that the
    /proc/sys/net/ipv4/conf/*/force_igmp_version
parameter isn't fully implemented.

*Symptom*:
When set force_igmp_version to a value of 2, the kernel should only perform
multicast IGMPv2 operations (IETF rfc2236).  An host-initiated Join message
will be sent as a IGMPv2 Join message.  But if a IGMPv3 query message is
received, the host responds with a IGMPv3 join message.  Per rfc3376 and
rfc2236, a IGMPv2 host should treat a IGMPv3 query as a IGMPv2 query and
respond with an IGMPv2 Join message.

*Consequences*:
This is an issue when a IGMPv3 capable switch is the querier and will only
issue IGMPv3 queries (which double as IGMPv2 querys) and there's an
intermediate switch that is only IGMPv2 capable.  The intermediate switch
processes the initial v2 Join, but fails to recognize the IGMPv3 Join responses
to the Query, resulting in a dropped connection when the intermediate v2-only
switch times it out.

*Identifying issue in the kernel source*:
The issue is in this section of code (in net/ipv4/igmp.c), which is called when
an IGMP query is received  (from mainline 2.6.36-rc3 gitweb):
 ...
A IGMPv3 query has a length >= 12 and no sources.  This routine will exit after
line 880, setting the general query timer (random timeout between 0 and query
response time).  This calls igmp_gq_timer_expire():
...
.. which only sends a v3 response.  So if a v3 query is received, the kernel
always sends a v3 response.

IGMP queries happen once every 60 sec (per vlan), so the traffic is low.  A
IGMPv3 query *is* a strict superset of a IGMPv2 query, so this patch properly
short circuit's the v3 behaviour.

One issue is that this does not address force_igmp_version=1.  Then again, I've
never seen any IGMPv1 multicast equipment in the wild.  However there is a lot
of v2-only equipment. If it's necessary to support the IGMPv1 case as well:

837         if (len == 8 || IGMP_V2_SEEN(in_dev) || IGMP_V1_SEEN(in_dev)) {

Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-13 12:56:51 -07:00
Dan Carpenter
3429769bc6 ppp: potential NULL dereference in ppp_mp_explode()
Smatch complains because we check whether "pch->chan" is NULL and then
dereference it unconditionally on the next line.  Partly the reason this
bug was introduced is because code was too complicated.  I've simplified
it a little.

Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-13 12:44:11 -07:00
Dan Carpenter
339db11b21 net/llc: make opt unsigned in llc_ui_setsockopt()
The members of struct llc_sock are unsigned so if we pass a negative
value for "opt" it can cause a sign bug.  Also it can cause an integer
overflow when we multiply "opt * HZ".

CC: stable@kernel.org
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-13 12:44:10 -07:00
David S. Miller
a505b3b30f sch_atm: Fix potential NULL deref.
The list_head conversion unearther an unnecessary flow
check.  Since flow is always NULL here we don't need to
see if a matching flow exists already.

Reported-by: Jiri Slaby <jirislaby@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-12 11:56:44 -07:00
David S. Miller
053d8f6622 Merge branch 'vhost-net' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost 2010-09-09 21:59:51 -07:00
Dan Williams
c9cedbba0f ipheth: remove incorrect devtype to WWAN
The 'wwan' devtype is meant for devices that require preconfiguration
and *every* time setup before the ethernet interface can be used, like
cellular modems which require a series of setup commands on serial ports
or other mechanisms before the ethernet interface will handle packets.

As ipheth only requires one-per-hotplug pairing setup with no
preconfiguration (like APN, phone #, etc) and the network interface is
usable at any time after that initial setup, remove the incorrect
devtype wwan.

Signed-off-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-09 21:41:59 -07:00
Joe Perches
201b6bab67 MAINTAINERS: Add CAIF
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-09 21:41:59 -07:00
Joe Perches
123031c0ee sctp: fix test for end of loop
Add a list_has_sctp_addr function to simplify loop

Based on a patches by Dan Carpenter and David Miller

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-09 15:00:29 -07:00
David S. Miller
e199e6136c Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6 2010-09-08 23:49:04 -07:00
Eric Dumazet
972c40b5be KS8851: Correct RX packet allocation
Use netdev_alloc_skb_ip_align() helper and do correct allocation

Tested-by: Abraham Arce <x0066660@ti.com>
Signed-off-by: Abraham Arce <x0066660@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-08 21:47:13 -07:00
Eric Dumazet
719f835853 udp: add rehash on connect()
commit 30fff923 introduced in linux-2.6.33 (udp: bind() optimisation)
added a secondary hash on UDP, hashed on (local addr, local port).

Problem is that following sequence :

fd = socket(...)
connect(fd, &remote, ...)

not only selects remote end point (address and port), but also sets
local address, while UDP stack stored in secondary hash table the socket
while its local address was INADDR_ANY (or ipv6 equivalent)

Sequence is :
 - autobind() : choose a random local port, insert socket in hash tables
              [while local address is INADDR_ANY]
 - connect() : set remote address and port, change local address to IP
              given by a route lookup.

When an incoming UDP frame comes, if more than 10 sockets are found in
primary hash table, we switch to secondary table, and fail to find
socket because its local address changed.

One solution to this problem is to rehash datagram socket if needed.

We add a new rehash(struct socket *) method in "struct proto", and
implement this method for UDP v4 & v6, using a common helper.

This rehashing only takes care of secondary hash table, since primary
hash (based on local port only) is not changed.

Reported-by: Krzysztof Piotr Oledzki <ole@ans.pl>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Tested-by: Krzysztof Piotr Oledzki <ole@ans.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-08 21:45:01 -07:00
Jianzhao Wang
ae2688d59b net: blackhole route should always be recalculated
Blackhole routes are used when xfrm_lookup() returns -EREMOTE (error
triggered by IKE for example), hence this kind of route is always
temporary and so we should check if a better route exists for next
packets.
Bug has been introduced by commit d11a4dc18b.

Signed-off-by: Jianzhao Wang <jianzhao.wang@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-08 14:35:43 -07:00
Jarek Poplawski
f6b085b69d ipv4: Suppress lockdep-RCU false positive in FIB trie (3)
Hi,
Here is one more of these warnings and a patch below:

Sep  5 23:52:33 del kernel: [46044.244833] ===================================================
Sep  5 23:52:33 del kernel: [46044.269681] [ INFO: suspicious rcu_dereference_check() usage. ]
Sep  5 23:52:33 del kernel: [46044.277000] ---------------------------------------------------
Sep  5 23:52:33 del kernel: [46044.285185] net/ipv4/fib_trie.c:1756 invoked rcu_dereference_check() without protection!
Sep  5 23:52:33 del kernel: [46044.293627]
Sep  5 23:52:33 del kernel: [46044.293632] other info that might help us debug this:
Sep  5 23:52:33 del kernel: [46044.293634]
Sep  5 23:52:33 del kernel: [46044.325333]
Sep  5 23:52:33 del kernel: [46044.325335] rcu_scheduler_active = 1, debug_locks = 0
Sep  5 23:52:33 del kernel: [46044.348013] 1 lock held by pppd/1717:
Sep  5 23:52:33 del kernel: [46044.357548]  #0:  (rtnl_mutex){+.+.+.}, at: [<c125dc1f>] rtnl_lock+0xf/0x20
Sep  5 23:52:33 del kernel: [46044.367647]
Sep  5 23:52:33 del kernel: [46044.367652] stack backtrace:
Sep  5 23:52:33 del kernel: [46044.387429] Pid: 1717, comm: pppd Not tainted 2.6.35.4.4a #3
Sep  5 23:52:33 del kernel: [46044.398764] Call Trace:
Sep  5 23:52:33 del kernel: [46044.409596]  [<c12f9aba>] ? printk+0x18/0x1e
Sep  5 23:52:33 del kernel: [46044.420761]  [<c1053969>] lockdep_rcu_dereference+0xa9/0xb0
Sep  5 23:52:33 del kernel: [46044.432229]  [<c12b7235>] trie_firstleaf+0x65/0x70
Sep  5 23:52:33 del kernel: [46044.443941]  [<c12b74d4>] fib_table_flush+0x14/0x170
Sep  5 23:52:33 del kernel: [46044.455823]  [<c1033e92>] ? local_bh_enable_ip+0x62/0xd0
Sep  5 23:52:33 del kernel: [46044.467995]  [<c12fc39f>] ? _raw_spin_unlock_bh+0x2f/0x40
Sep  5 23:52:33 del kernel: [46044.480404]  [<c12b24d0>] ? fib_sync_down_dev+0x120/0x180
Sep  5 23:52:33 del kernel: [46044.493025]  [<c12b069d>] fib_flush+0x2d/0x60
Sep  5 23:52:33 del kernel: [46044.505796]  [<c12b06f5>] fib_disable_ip+0x25/0x50
Sep  5 23:52:33 del kernel: [46044.518772]  [<c12b10d3>] fib_netdev_event+0x73/0xd0
Sep  5 23:52:33 del kernel: [46044.531918]  [<c1048dfd>] notifier_call_chain+0x2d/0x70
Sep  5 23:52:33 del kernel: [46044.545358]  [<c1048f0a>] raw_notifier_call_chain+0x1a/0x20
Sep  5 23:52:33 del kernel: [46044.559092]  [<c124f687>] call_netdevice_notifiers+0x27/0x60
Sep  5 23:52:33 del kernel: [46044.573037]  [<c124faec>] __dev_notify_flags+0x5c/0x80
Sep  5 23:52:33 del kernel: [46044.586489]  [<c124fb47>] dev_change_flags+0x37/0x60
Sep  5 23:52:33 del kernel: [46044.599394]  [<c12a8a8d>] devinet_ioctl+0x54d/0x630
Sep  5 23:52:33 del kernel: [46044.612277]  [<c12aabb7>] inet_ioctl+0x97/0xc0
Sep  5 23:52:34 del kernel: [46044.625208]  [<c123f6af>] sock_ioctl+0x6f/0x270
Sep  5 23:52:34 del kernel: [46044.638046]  [<c109d2b0>] ? handle_mm_fault+0x420/0x6c0
Sep  5 23:52:34 del kernel: [46044.650968]  [<c123f640>] ? sock_ioctl+0x0/0x270
Sep  5 23:52:34 del kernel: [46044.663865]  [<c10c3188>] vfs_ioctl+0x28/0xa0
Sep  5 23:52:34 del kernel: [46044.676556]  [<c10c38fa>] do_vfs_ioctl+0x6a/0x5c0
Sep  5 23:52:34 del kernel: [46044.688989]  [<c1048676>] ? up_read+0x16/0x30
Sep  5 23:52:34 del kernel: [46044.701411]  [<c1021376>] ? do_page_fault+0x1d6/0x3a0
Sep  5 23:52:34 del kernel: [46044.714223]  [<c10b6588>] ? fget_light+0xf8/0x2f0
Sep  5 23:52:34 del kernel: [46044.726601]  [<c1241f98>] ? sys_socketcall+0x208/0x2c0
Sep  5 23:52:34 del kernel: [46044.739140]  [<c10c3eb3>] sys_ioctl+0x63/0x70
Sep  5 23:52:34 del kernel: [46044.751967]  [<c12fca3d>] syscall_call+0x7/0xb
Sep  5 23:52:34 del kernel: [46044.764734]  [<c12f0000>] ? cookie_v6_check+0x3d0/0x630

-------------->

This patch fixes the warning:
 ===================================================
 [ INFO: suspicious rcu_dereference_check() usage. ]
 ---------------------------------------------------
 net/ipv4/fib_trie.c:1756 invoked rcu_dereference_check() without protection!

 other info that might help us debug this:

 rcu_scheduler_active = 1, debug_locks = 0
 1 lock held by pppd/1717:
  #0:  (rtnl_mutex){+.+.+.}, at: [<c125dc1f>] rtnl_lock+0xf/0x20

 stack backtrace:
 Pid: 1717, comm: pppd Not tainted 2.6.35.4a #3
 Call Trace:
  [<c12f9aba>] ? printk+0x18/0x1e
  [<c1053969>] lockdep_rcu_dereference+0xa9/0xb0
  [<c12b7235>] trie_firstleaf+0x65/0x70
  [<c12b74d4>] fib_table_flush+0x14/0x170
  ...

Allow trie_firstleaf() to be called either under rcu_read_lock()
protection or with RTNL held. The same annotation is added to
node_parent_rcu() to prevent a similar warning a bit later.

Followup of commits 634a4b20 and 4eaa0e3c.

Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-08 14:14:20 -07:00
Ben Hutchings
ee9c5cfad2 niu: Fix kernel buffer overflow for ETHTOOL_GRXCLSRLALL
niu_get_ethtool_tcam_all() assumes that its output buffer is the right
size, and warns before returning if it is not.  However, the output
buffer size is under user control and ETHTOOL_GRXCLSRLALL is an
unprivileged ethtool command.  Therefore this is at least a local
denial-of-service vulnerability.

Change it to check before writing each entry and to return an error if
the buffer is already full.

Compile-tested only.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-08 14:01:52 -07:00
Julian Anastasov
6523ce1525 ipvs: fix active FTP
- Do not create expectation when forwarding the PORT
  command to avoid blocking the connection. The problem is that
  nf_conntrack_ftp.c:help() tries to create the same expectation later in
  POST_ROUTING and drops the packet with "dropping packet" message after
  failure in nf_ct_expect_related.

- Change ip_vs_update_conntrack to alter the conntrack
  for related connections from real server. If we do not alter the reply in
  this direction the next packet from client sent to vport 20 comes as NEW
  connection. We alter it but may be some collision happens for both
  conntracks and the second conntrack gets destroyed immediately. The
  connection stucks too.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-08 10:39:57 -07:00
Jarek Poplawski
64289c8e68 gro: Re-fix different skb headrooms
The patch: "gro: fix different skb headrooms" in its part:
"2) allocate a minimal skb for head of frag_list" is buggy. The copied
skb has p->data set at the ip header at the moment, and skb_gro_offset
is the length of ip + tcp headers. So, after the change the length of
mac header is skipped. Later skb_set_mac_header() sets it into the
NET_SKB_PAD area (if it's long enough) and ip header is misaligned at
NET_SKB_PAD + NET_IP_ALIGN offset. There is no reason to assume the
original skb was wrongly allocated, so let's copy it as it was.

bugzilla : https://bugzilla.kernel.org/show_bug.cgi?id=16626
fixes commit: 3d3be4333f

Reported-by: Plamen Petrov <pvp-lsts@fs.uni-ruse.bg>
Signed-off-by: Jarek Poplawski <jarkao2@gmail.com>
CC: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Tested-by: Plamen Petrov <pvp-lsts@fs.uni-ruse.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-08 10:32:15 -07:00
Linus Torvalds
d56557af19 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
  PCI: bus speed strings should be const
  PCI hotplug: Fix build with CONFIG_ACPI unset
  PCI: PCIe: Remove the port driver module exit routine
  PCI: PCIe: Move PCIe PME code to the pcie directory
  PCI: PCIe: Disable PCIe port services during port initialization
  PCI: PCIe: Ask BIOS for control of all native services at once
  ACPI/PCI: Negotiate _OSC control bits before requesting them
  ACPI/PCI: Do not preserve _OSC control bits returned by a query
  ACPI/PCI: Make acpi_pci_query_osc() return control bits
  ACPI/PCI: Reorder checks in acpi_pci_osc_control_set()
  PCI: PCIe: Introduce commad line switch for disabling port services
  PCI: PCIe AER: Introduce pci_aer_available()
  x86/PCI: only define pci_domain_nr if PCI and PCI_DOMAINS are set
  PCI: provide stub pci_domain_nr function for !CONFIG_PCI configs
2010-09-07 16:00:17 -07:00
Linus Torvalds
fa2925cf90 Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs
* 'for-linus' of git://oss.sgi.com/xfs/xfs:
  xfs: Make fiemap work with sparse files
  xfs: prevent 32bit overflow in space reservation
  xfs: Disallow 32bit project quota id
  xfs: improve buffer cache hash scalability
2010-09-07 15:44:28 -07:00
Linus Torvalds
98e52c373c Merge branch 'for-linus' of git://android.kernel.org/kernel/tegra
* 'for-linus' of git://android.kernel.org/kernel/tegra:
  [ARM] tegra: Add ZRELADDR default for ARCH_TEGRA
2010-09-07 14:48:44 -07:00
Linus Torvalds
add2b10f2b Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha-2.6:
  alpha: Fix printk format errors
  alpha: convert perf_event to use local_t
  Fix call to replaced SuperIO functions
  alpha: remove homegrown L1_CACHE_ALIGN macro
2010-09-07 14:38:54 -07:00
Linus Torvalds
3c5dff7b5e Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
  9p: potential ERR_PTR() dereference
2010-09-07 14:38:21 -07:00
Linus Torvalds
dc6f962eb5 Merge branch 'for-linus' of git://neil.brown.name/md
* 'for-linus' of git://neil.brown.name/md:
  md: resolve confusion of MD_CHANGE_CLEAN
  md: don't clear MD_CHANGE_CLEAN in md_update_sb() for external arrays
  Move .gitignore from drivers/md to lib/raid6
2010-09-07 14:37:34 -07:00
Linus Torvalds
61f953cbaa Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  Revert "hwrng: n2-drv - remove casts from void*"
  crypto: testmgr - Default to no tests
  crypto: testmgr - Fix test disabling option
  crypto: hash - Fix handling of small unaligned buffers
2010-09-07 14:35:16 -07:00
Linus Torvalds
a44a553f82 Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc/pseries: Correct rtas_data_buf locking in dlpar code
  powerpc/85xx: Add P1021 PCI IDs and quirks
  arch/powerpc/sysdev/qe_lib/qe.c: Add of_node_put to avoid memory leak
  arch/powerpc/platforms/83xx/mpc837x_mds.c: Add missing iounmap
  fsl_rio: fix compile errors
  powerpc/85xx: Fix compile issue with p1022_ds due to lmb rename to memblock
  powerpc/85xx: Fix compilation of mpc85xx_mds.c
  powerpc: Don't use kernel stack with translation off
  powerpc/perf_event: Reduce latency of calling perf_event_do_pending
  powerpc/kexec: Adds correct calling convention for kexec purgatory
2010-09-07 14:34:37 -07:00
Linus Torvalds
ce7db282a3 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
  percpu: fix a mismatch between code and comment
  percpu: fix a memory leak in pcpu_extend_area_map()
  percpu: add __percpu notations to UP allocator
  percpu: handle __percpu notations in UP accessors
2010-09-07 14:08:37 -07:00
Linus Torvalds
cd4d4fc413 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: use zalloc_cpumask_var() for gcwq->mayday_mask
  workqueue: fix GCWQ_DISASSOCIATED initialization
  workqueue: Add a workqueue chapter to the tracepoint docbook
  workqueue: fix cwq->nr_active underflow
  workqueue: improve destroy_workqueue() debuggability
  workqueue: mark lock acquisition on worker_maybe_bind_and_lock()
  workqueue: annotate lock context change
  workqueue: free rescuer on destroy_workqueue
2010-09-07 14:08:17 -07:00
Linus Torvalds
608307e6de Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (26 commits)
  pkt_sched: Fix lockdep warning on est_tree_lock in gen_estimator
  ipvs: avoid oops for passive FTP
  Revert "sky2: don't do GRO on second port"
  gro: fix different skb headrooms
  bridge: Clear INET control block of SKBs passed into ip_fragment().
  3c59x: Remove incorrect locking; correct documented lock hierarchy
  sky2: don't do GRO on second port
  ipv4: minor fix about RPF in help of Kconfig
  xfrm_user: avoid a warning with some compiler
  net/sched/sch_hfsc.c: initialize parent's cl_cfmin properly in init_vf()
  pxa168_eth: fix a mdiobus leak
  net sched: fix kernel leak in act_police
  vhost: stop worker only if created
  MAINTAINERS: Add ehea driver as Supported
  ath9k_hw: fix parsing of HT40 5 GHz CTLs
  ath9k_hw: Fix EEPROM uncompress block reading on AR9003
  wireless: register wiphy rfkill w/o holding cfg80211_mutex
  netlink: Make NETLINK_USERSOCK work again.
  irda: Correctly clean up self->ias_obj on irda_bind() failure.
  wireless extensions: fix kernel heap content leak
  ...
2010-09-07 14:06:10 -07:00
Linus Torvalds
96d4cbb6a9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6:
  Staging: wlan-ng: Explicitly set some fields in cfg80211 interface
  Staging: octeon: depends on NETDEVICES
  Staging: spectra: depend on X86_MRST
  Staging: zram: free device memory when init fails
  Staging: rt2870sta: Add more device IDs from vendor drivers
  staging: comedi das08_cs.c: Fix io_req_t conversion
  staging: spectra needs <linux/slab.h>
  staging: hv: Fixed lockup problem with bounce_buffer scatter list
  staging: hv: Increased storvsc ringbuffer and max_io_requests
  staging: hv: Fixed the value of the 64bit-hole inside ring buffer
  staging: hv: Fixed bounce kmap problem by using correct index
  staging: hv: Fix missing functions for net_device_ops
2010-09-07 14:05:22 -07:00
Linus Torvalds
d3de0eb164 Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6:
  sysfs: checking for NULL instead of ERR_PTR
2010-09-07 14:04:59 -07:00
Linus Torvalds
b06ac5a360 Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6:
  USB: ftdi_sio: Added custom PIDs for ChamSys products
  USB: cdc-acm: Fixing crash when ACM probing interfaces with no endpoint descriptors.
  USB: cdc-acm: Add pseudo modem without AT command capabilities
  USB: cxacru: Use a bulk/int URB to access the command endpoint
  usb: serial: mos7840: Add USB IDs to support more B&B USB/RS485 converters.
  USB: cdc-acm: Adding second ACM channel support for various Nokia and one Samsung phones
  usb: serial: mos7840: Add USB ID to support the B&B Electronics USOPTL4-2P.
  USB: ssu100: turn off debug flag
  usb: allow drivers to use allocated bandwidth until unbound
  USB: cp210x usb driver: add USB_DEVICE for Pirelli DP-L10 mobile.
  USB: cp210x: Add B&G H3000 link cable ID
  USB: CP210x Add new device ID
  USB: option: fix incorrect novatel entries
  USB: Fix kernel oops with g_ether and Windows
  USB: rndis: section mismatch fix
  USB: ehci-ppc-of: problems in unwind
  USB: s3c-hsotg: Remove DEBUG define
2010-09-07 14:04:34 -07:00
Linus Torvalds
608a5ffc3e Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6:
  tty: fix tty_line must not be equal to number of allocated tty pointers in tty driver
  serial: bfin_sport_uart: restore transmit frame sync fix
  serial: fix port type conflict between NS16550A & U6_16550A
  MAINTAINERS: orphan isicom
  vt: Fix console corruption on driver hand-over.
2010-09-07 14:04:09 -07:00
Linus Torvalds
78f220a84f Merge branch 'linux-next' of git://git.infradead.org/ubi-2.6
* 'linux-next' of git://git.infradead.org/ubi-2.6:
  UBI: do not oops when erroneous PEB is scheduled for scrubbing
  UBI: fix kconfig unmet dependency
  UBI: fix forward compatibility
  UBI: eliminate update of list_for_each_entry loop cursor
2010-09-07 14:02:09 -07:00
Linus Torvalds
4848d71569 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2:
  nilfs2: fix leak of shadow dat inode in error path of load_nilfs
2010-09-07 14:01:50 -07:00
Linus Torvalds
4eab8a5717 Merge branch 'drm-intel-fixes' of git://anongit.freedesktop.org/~ickle/drm-intel
* 'drm-intel-fixes' of git://anongit.freedesktop.org/~ickle/drm-intel: (25 commits)
  intel_agp,i915: Add more sandybridge graphics device ids
  drm/i915: Enable MI_FLUSH on Sandybridge
  agp/intel: Fix cache control for Sandybridge
  agp/intel: use #ifdef idiom for intel-agp.h
  agp/intel: fix physical address mask bits for sandybridge
  drm/i915: Prevent double dpms on
  drm/i915: Avoid use of uninitialised values when disabling panel-fitter
  drm/i915: Avoid pageflipping freeze when we miss the flip prepare interrupt
  drm/i915: Tightly scope intel_encoder to prevent invalid use
  drm/i915: Allocate the PCI resource for the MCHBAR
  drm/i915/dp: Really try 5 times before giving up.
  drm/i915/sdvo: Restore guess of the DDC bus in absence of VBIOS
  drm/i915/dp: Boost timeout for enabling transcoder to 100ms
  drm/i915: Re-use set_base_atomic to share setting of the display registers
  drm/i915: Fix offset page-flips on i965+
  drm/i915: Include a generation number in the device info
  i915: return -EFAULT if copy_to_user fails
  i915: return -EFAULT if copy_to_user fails
  agp/intel: Promote warning about failure to setup flush to error.
  drm/i915: overlay on gen2 can't address above 1G
  ...
2010-09-07 14:00:43 -07:00
Linus Torvalds
6300d6d755 Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6
* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
  drm: Do not force 1024x768 modes on unknown connectors
  drm/kms: Add a module parameter to disable polling
  drm/radeon/kms: fix tv-out on avivo asics
  drm/radeon/kms/evergreen: fix gpu hangs in userspace accel code
  drm/nv50: initialize ramht_refs list for faked 0 channel
  drm/nouveau: Don't take struct_mutex around the pushbuf IOCTL.
  drm/nouveau: Take fence spinlock before reading the last sequence.
  drm/radeon/kms/evergreen: work around bad data in some i2c tables
  drm/radeon/kms: properly set crtc high base on r7xx
  drm/radeon/kms: fix tv module parameter
  drm/radeon/kms: force legacy pll algo for RV515 LVDS
  drm/radeon/kms: remove useless clock code
  drm/radeon/kms: fix a regression on r7xx AGP due to the HDP flush fix
  drm/radeon/kms: use tracked values for sclk and mclk
2010-09-07 13:59:49 -07:00
David S. Miller
de2b96f121 via-velocity: Turn scatter-gather support back off.
It causes all kinds of DMA API debugging assertions and
all straight-forward attempts to fix it have failed.

So turn off SG, and we'll tackle making this work
properly in net-next-2.6

Reported-by: Dave Jones <davej@redhat.com>
Tested-by: Dave Jones <davej@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-07 13:57:24 -07:00
David S. Miller
6f86b32518 ipv4: Fix reverse path filtering with multipath routing.
Actually iterate over the next-hops to make sure we have
a device match.  Otherwise RP filtering is always elided
when the route matched has multiple next-hops.

Reported-by: Igor M Podlesny <for.poige@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-07 13:57:24 -07:00
Tetsuo Handa
8df73ff90f UNIX: Do not loop forever at unix_autobind().
We assumed that unix_autobind() never fails if kzalloc() succeeded.
But unix_autobind() allows only 1048576 names. If /proc/sys/fs/file-max is
larger than 1048576 (e.g. systems with more than 10GB of RAM), a local user can
consume all names using fork()/socket()/bind().

If all names are in use, those who call bind() with addr_len == sizeof(short)
or connect()/sendmsg() with setsockopt(SO_PASSCRED) will continue

  while (1)
        yield();

loop at unix_autobind() till a name becomes available.
This patch adds a loop counter in order to give up after 1048576 attempts.

Calling yield() for once per 256 attempts may not be sufficient when many names
are already in use, for __unix_find_socket_byname() can take long time under
such circumstance. Therefore, this patch also adds cond_resched() call.

Note that currently a local user can consume 2GB of kernel memory if the user
is allowed to create and autobind 1048576 UNIX domain sockets. We should
consider adding some restriction for autobind operation.

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-07 13:57:23 -07:00
Mark Lord
32737e934a PATCH: b44 Handle RX FIFO overflow better (simplified)
This patch is a simplified version of the original patch from James Courtier-Dutton.

>From: James Courtier-Dutton
>Subject: [PATCH] Fix b44 RX FIFO overflow recovery.
>Date: Wednesday, June 30, 2010 - 1:11 pm
>
>This patch improves the recovery after a RX FIFO overflow on the b44
>Ethernet NIC.
>Before it would do a complete chip reset, resulting is loss of link
>for a few seconds.
>This patch improves this to do recovery in about 20ms without loss of link.
>
>Signed off by: James@superbug.co.uk

Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-07 13:57:23 -07:00
Dan Carpenter
cf9b94f88b irda: off by one
This is an off by one.  We would go past the end when we NUL terminate
the "value" string at end of the function.  The "value" buffer is
allocated in irlan_client_parse_response() or
irlan_provider_parse_command().

CC: stable@kernel.org
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2010-09-07 13:57:22 -07:00