Commit Graph

98201 Commits

Author SHA1 Message Date
Stephen Hemminger
7be87351a1 tcp: /proc/net/tcp rto,ato values not scaled properly (v2)
I found another case where we are sending information to userspace
in the wrong HZ scale.  This should have been fixed back in 2.5 :-(

This means an ABI change but as it stands there is no way for an application
like ss to get the right value.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-27 20:00:19 -07:00
Adrian Bunk
c88e6f51c2 include/linux/netdevice.h: don't export MAX_HEADER to userspace
Due to the CONFIG_'s the value is anyway not correct in userspace.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-27 19:54:54 -07:00
Adrian Bunk
ede16af4cd pkt_sched: Remove CONFIG_NET_SCH_RR
Commit d62733c8e4
([SCHED]: Qdisc changes and sch_rr added for multiqueue)
added a NET_SCH_RR option that was unused since the code
went unconditionally into sch_prio.

Reported-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-27 19:54:05 -07:00
WANG Cong
01e123d79a pkt_sched: ERR_PTR() ususally encodes an negative errno, not positive.
Note, in the following patch, 'err' is initialized as:

int err = -ENOBUFS;

Signed-off-by: WANG Cong <wcong@critical-links.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-27 19:51:35 -07:00
Wang Chen
5dbaec5dc6 netdevice: Fix typo of dev_unicast_add() comment
Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-27 19:35:16 -07:00
Rainer Weikusat
ec0d215f94 af_unix: fix 'poll for write'/connected DGRAM sockets
For n:1 'datagram connections' (eg /dev/log), the unix_dgram_sendmsg
routine implements a form of receiver-imposed flow control by
comparing the length of the receive queue of the 'peer socket' with
the max_ack_backlog value stored in the corresponding sock structure,
either blocking the thread which caused the send-routine to be called
or returning EAGAIN. This routine is used by both SOCK_DGRAM and
SOCK_SEQPACKET sockets. The poll-implementation for these socket types
is datagram_poll from core/datagram.c. A socket is deemed to be
writeable by this routine when the memory presently consumed by
datagrams owned by it is less than the configured socket send buffer
size. This is always wrong for PF_UNIX non-stream sockets connected to
server sockets dealing with (potentially) multiple clients if the
abovementioned receive queue is currently considered to be full.
'poll' will then return, indicating that the socket is writeable, but
a subsequent write result in EAGAIN, effectively causing an (usual)
application to 'poll for writeability by repeated send request with
O_NONBLOCK set' until it has consumed its time quantum.

The change below uses a suitably modified variant of the datagram_poll
routines for both type of PF_UNIX sockets, which tests if the
recv-queue of the peer a socket is connected to is presently
considered to be 'full' as part of the 'is this socket
writeable'-checking code. The socket being polled is additionally
put onto the peer_wait wait queue associated with its peer, because the
unix_dgram_recvmsg routine does a wake up on this queue after a
datagram was received and the 'other wakeup call' is done implicitly
as part of skb destruction, meaning, a process blocked in poll
because of a full peer receive queue could otherwise sleep forever
if no datagram owned by its socket was already sitting on this queue.
Among this change is a small (inline) helper routine named
'unix_recvq_full', which consolidates the actual testing code (in three
different places) into a single location.

Signed-off-by: Rainer Weikusat <rweikusat@mssgmbh.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-27 19:34:18 -07:00
Octavian Purdila
db43a282d3 tcp: fix for splice receive when used with software LRO
If an skb has nr_frags set to zero but its frag_list is not empty (as
it can happen if software LRO is enabled), and a previous
tcp_read_sock has consumed the linear part of the skb, then
__skb_splice_bits:

(a) incorrectly reports an error and

(b) forgets to update the offset to account for the linear part

Any of the two problems will cause the subsequent __skb_splice_bits
call (the one that handles the frag_list skbs) to either skip data,
or, if the unadjusted offset is greater then the size of the next skb
in the frag_list, make tcp_splice_read loop forever.

Signed-off-by: Octavian Purdila <opurdila@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-27 17:27:21 -07:00
Miquel van Smoorenburg
57413ebc4e tcp: calculate tcp_mem based on low memory instead of all memory
The tcp_mem array which contains limits on the total amount of memory
used by TCP sockets is calculated based on nr_all_pages.  On a 32 bits
x86 system, we should base this on the number of lowmem pages.

Signed-off-by: Miquel van Smoorenburg <miquels@cistron.nl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-27 17:23:57 -07:00
Andre Haupt
4797982119 hamradio: remove unused variable
Signed-off-by: Andre Haupt <andre@bitwigglers.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-27 17:22:08 -07:00
Emmanuel Grumbach
00eb7fe77e mac80211: fix an oops in several failure paths in key allocation
This patch fixes an oops in several failure paths in key allocation. This
Oops occurs when freeing a key that has not been linked yet, so the
key->sdata is not set.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-27 14:49:52 -04:00
Harvey Harrison
5f4a6fae46 prism: islpci_eth.c endianness fix
clock is already cpu-endian (see le32_to_cpu slightly before), so
le64_to_cpu doesn't make much sense.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-27 14:49:52 -04:00
Ivo van Doorn
980dfcb932 rt2x00: Fix lock dependency errror
This fixes a circular locking dependency in the workqueue handling.
The interface work task uses the mac80211 function
ieee80211_iterate_active_interfaces() which grabs the RTNL lock.

However when the interface is brough down, this happens under the RTNL
lock as well, this causes problems because mac80211 will flush the workqueue
during the ifdown event. This causes mac80211 to wait until the driver has
completed all work which can't finish because it is waiting on the RTNL lock.

This is fixed by moving rt2x00 workqueue tasks on a different workqueue,
this workqueue can be flushed when the ieee80211_hw structure is removed
by the driver (when the driver is unloaded) which does not happen under the
RTNL lock.

Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-27 14:49:52 -04:00
David S. Miller
7ac3b02536 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/linville/wireless-2.6 2008-06-27 04:26:58 -07:00
Ben Hutchings
3e3cda96d0 Hold RTNL while calling dev_close()
dev_close() must be called holding the RTNL.  Compile-tested only.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-06-27 01:31:52 -04:00
Ben Hutchings
c81ec80bc8 qla3xxx: Hold RTNL while calling dev_close()
dev_close() must be called holding the RTNL.  Compile-tested only.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-06-27 01:31:50 -04:00
Andi Kleen
64c42f6976 [netdrvr] Fix IOMMU overflow checking in s2io.c
s2io has IOMMU overflow checking, but unfortunately it is wrong.

It didn't use the standard macros, which meant that it only worked
on POWER and SPARC because only those define DMA_ERROR_CODE. Convert it to
use the standard macros instead.

I also commented two more bugs in the IOMMU handling. It assumes
that 0 DMA addresses cannot happen, but that's not true in all IOMMU setups.
The information if a buffer has been already mapped needs to be stored
elsewhere.

Didn't fix those because it needs careful checking of the buffer handling
by the maintainers.

Cc: ram.vepa@neterion.com
Cc: santosh.rastapur@neterion.com
Cc: sivakumar.subramani@neterion.com
Cc: sreenivasa.honnur@neterion.com

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-06-27 01:31:22 -04:00
Andy Gospodarek
581abbc26a e1000: only enable TSO6 via ethtool when using correct hardware
When enabling TSO via ethool on e1000, it is possible to set
NETIF_F_TSO6 on hardware that does not support it.  Setting TSO via
ethtool now matches the settings used when the hardware is probed.

Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-06-27 01:31:08 -04:00
Kevin Hao
1923815d85 e100: Do pci_dma_sync after skb_alloc for proper operation on ixp4xx
The E100 device can't work on current kernel (2.6.26-rc6) and will cause
kernel corruption on intel ixdp4xx.

Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-06-27 01:30:59 -04:00
Al Viro
70081ac55d [netdrvr] netxen: fix netxen_pci_tbl[] breakage
PCI_DEVICE_CLASS sets .device and .vendor to PCI_ANY_DEV,
which overrides the effect of preceding PCI_DEVICE() and makes
all elements of netxen_pci_tbl[] identical.  Introduced in the
commit dcd56fdbae.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-06-27 01:30:46 -04:00
Ingo Molnar
c5643cab7b [netdrvr] 3c59x: remove irqs_disabled warning from local_bh_enable
Original Author: Michael Buesch <mb@bu3sch.de>

net, vortex: fix lockup

Ingo Molnar reported:

-tip testing found that Johannes Berg's "softirq: remove irqs_disabled
warning from local_bh_enable" enhancement to lockdep triggers a new
warning on an old testbox that uses 3c59x vortex and netlogging:

----->
    calling  vortex_init+0x0/0xb0
    PCI: Found IRQ 10 for device 0000:00:0b.0
    PCI: Sharing IRQ 10 with 0000:00:0a.0
    PCI: Sharing IRQ 10 with 0000:00:0b.1
    3c59x: Donald Becker and others.
    0000:00:0b.0: 3Com PCI 3c556 Laptop Tornado at e0800400.
    PCI: Enabling bus mastering for device 0000:00:0b.0
    initcall vortex_init+0x0/0xb0 returned 0 after 47 msecs
...
    calling  init_netconsole+0x0/0x1b0
    netconsole: local port 4444
    netconsole: local IP 10.0.1.9
    netconsole: interface eth0
    netconsole: remote port 4444
    netconsole: remote IP 10.0.1.16
    netconsole: remote ethernet address 00:19:xx:xx:xx:xx
    netconsole: device eth0 not up yet, forcing it
    eth0:  setting half-duplex.
    eth0:  setting full-duplex.
------------[ cut here ]------------
    WARNING: at kernel/softirq.c:137 local_bh_enable_ip+0xd1/0xe0()
    Pid: 1, comm: swapper Not tainted 2.6.26-rc6-tip #2091
     [<c0125ecf>] warn_on_slowpath+0x4f/0x70
     [<c0126834>] ? release_console_sem+0x1b4/0x1d0
     [<c0126d00>] ? vprintk+0x2a0/0x450
     [<c012fde5>] ? __mod_timer+0xa5/0xc0
     [<c046f7fd>] ? mdio_sync+0x3d/0x50
     [<c0160ef6>] ? marker_probe_cb+0x46/0xa0
     [<c0126ed7>] ? printk+0x27/0x50
     [<c046f4c3>] ? vortex_set_duplex+0x43/0xc0
     [<c046f521>] ? vortex_set_duplex+0xa1/0xc0
     [<c0471b92>] ? vortex_timer+0xe2/0x3e0
     [<c012b361>] local_bh_enable_ip+0xd1/0xe0
     [<c08d9f9f>] _spin_unlock_bh+0x2f/0x40
     [<c0471b92>] vortex_timer+0xe2/0x3e0
     [<c014743b>] ? trace_hardirqs_on+0xb/0x10
     [<c0147358>] ? trace_hardirqs_on_caller+0x88/0x160
     [<c012f8b2>] run_timer_softirq+0x162/0x1c0
     [<c0471ab0>] ? vortex_timer+0x0/0x3e0
     [<c012b361>] local_bh_enable_ip+0xd1/0xe0
     [<c08d9f9f>] _spin_unlock_bh+0x2f/0x40
     [<c0471b92>] vortex_timer+0xe2/0x3e0
     [<c014743b>] ? trace_hardirqs_on+0xb/0x10
     [<c0147358>] ? trace_hardirqs_on_caller+0x88/0x160
     [<c012f8b2>] run_timer_softirq+0x162/0x1c0
     [<c0471ab0>] ? vortex_timer+0x0/0x3e0
     [<c0471ab0>] ? vortex_timer+0x0/0x3e0
     [<c012b60a>] __do_softirq+0x9a/0x160
     [<c012b570>] ? __do_softirq+0x0/0x160
     [<c0106775>] call_on_stack+0x15/0x30
     [<c012b4f5>] ? irq_exit+0x55/0x60
     [<c0106e85>] ? do_IRQ+0x85/0xd0
     [<c0147391>] ? trace_hardirqs_on_caller+0xc1/0x160
     [<c0104888>] ? common_interrupt+0x28/0x30
     [<c08d8ac8>] ? mutex_unlock+0x8/0x10
     [<c08d8180>] ? _cond_resched+0x10/0x30
     [<c07a3be7>] ? netpoll_setup+0x117/0x390
     [<c0cbfcfe>] ? init_netconsole+0x14e/0x1b0
     [<c013d539>] ? ktime_get+0x19/0x40
     [<c0c9bab2>] ? kernel_init+0x1b2/0x2c0
     [<c0cbfbb0>] ? init_netconsole+0x0/0x1b0
     [<c0396aa4>] ? trace_hardirqs_on_thunk+0xc/0x10
     [<c0103f12>] ? restore_nocheck_notrace+0x0/0xe
     [<c0c9b900>] ? kernel_init+0x0/0x2c0
     [<c0c9b900>] ? kernel_init+0x0/0x2c0
     [<c0104aa7>] ? kernel_thread_helper+0x7/0x10
     =======================
---[ end trace 37f9c502aff112e0 ]---
    console [netcon0] enabled
    netconsole: network logging started
    initcall init_netconsole+0x0/0x1b0 returned 0 after 2914 msecs

looking at the driver I think the bug is real and the fix actually
is trivial.

vp->lock is also taken in hardware IRQ context, so we _have_ to always
use irqsafe locking. As we run in a timer with IRQs disabled,
we can simply use spin_lock.

Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-06-27 01:30:33 -04:00
Pekka Enberg
e8399fed7e ipg: use NULL, not zero, for pointers
Fixes a sparse warning in a code block that's hidden under JUMBO_FRAME #ifdef.

Tested-by: Andrew Savchenko <Bircoph@list.ru>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-06-27 01:28:31 -04:00
Pekka Enberg
ecfecfb5e3 ipg: fix jumbo frame compilation
Make jumbo frame support compile again. It was broken by the cleanup series
before the merge because the code is hidden under JUMBO_FRAME #ifdef.

Tested-by: Andrew Savchenko <Bircoph@list.ru>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-06-27 01:28:29 -04:00
Julia Lawall
3f6602ad56 drivers/net/r6040.c: Eliminate double sizeof
Taking sizeof the result of sizeof is quite strange and does not seem to be
what is wanted here.

This was fixed using the following semantic patch.
(http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@@
expression E;
@@

- sizeof (
  sizeof (E)
- )
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-06-27 01:28:25 -04:00
Komuro
54299ef7e9 pcnet_cs, axnet_cs: clear bogus interrupt before request_irq
Signed-off-by: Komuro <komurojun-mbn@nifty.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-06-27 01:28:21 -04:00
Jeff Kirsher
52cc30862a e1000e: fix EEH recovery during reset on PPC
EEH is not recovering in a reasonable amount of time on PPC during
e1000e_down().

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-06-27 01:27:48 -04:00
Jeff Kirsher
3023682e74 igb: fix EEH recovery during reset on PPC
EEH is not recovering in a reasonable amount of time on PPC during
igb_down().

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-06-27 01:27:47 -04:00
Paul Larson
6f4a0e45c6 ixgbe: fix EEH recovery during reset on PPC
EEh is not recovering in a resonable amount of time on PPC during
ixgbe_down().

Signed-off-by: Paul Larson <pl@us.ibm.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-06-27 01:27:46 -04:00
Atsushi Nemoto
ccc57aac9c tc35815: Fix receiver hangup on Rx FIFO overflow
On Rx FIFO overflow error, the controller consume a buffer descriptor
but currently the driver does not give it back to the controller.
This results unrecoverable 'Buffer List Exhausted' condition.  This
patch fix this problem by moving a "fbl_count--" line to proper place.

Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-06-27 01:27:43 -04:00
Atsushi Nemoto
59524a3744 tc35815: Mark carrier-off before starting PHY
Call netif_carrier_off() before starting PHY device.  This is a
behavior before converting to generic PHY layer.

Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-06-27 01:27:33 -04:00
Michal Schmidt
f471f92339 s2io: fix documentation about intr_type
The documentation for intr_type module parameter of the s2io driver is
not consistent with the code. The comments in drivers/net/s2io.c are
OK, but Documentation/networking/s2io.txt is wrong.

Pointed out by Andrew Hecox.

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-06-27 01:27:28 -04:00
Ron Rindjunsky
66b5004d85 iwlwifi: improve scanning band selection management
This patch modifies the band selection management when scanning, so
bands are now scanned according to HW band support.

Signed-off-by: Ron Rindjunsky <ron.rindjunsky@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-25 10:57:03 -04:00
Ivo van Doorn
99ade2597e rt2x00: Fix unbalanced mutex locking
The usb_cache_mutex was not correctly released
under all circumstances. Both rt73usb as rt2500usb
didn't release the mutex under certain conditions
when the register access failed. Obviously such
failure would lead to deadlocks.

In addition under similar circumstances when the
bbp register couldn't be read the value must be
set to 0xff to indicate that the value is wrong.
This too didn't happen under all circumstances.

Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-25 10:56:16 -04:00
Michael Buesch
2f9ec47d09 b43legacy: Fix possible NULL pointer dereference in DMA code
This fixes a possible NULL pointer dereference in an error path of the
DMA allocation error checking code. This is also necessary for a future
DMA API change that is on its way into the mainline kernel that adds
an additional dev parameter to dma_mapping_error().

Signed-off-by: Michael Buesch <mb@bu3sch.de>
Cc: stable <stable@kernel.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-25 10:56:16 -04:00
Michael Buesch
7b3abfc87e b43: Fix possible MMIO access while device is down
This fixes a possible MMIO access while the device is still down
from a suspend cycle. MMIO accesses with the device powered down
may cause crashes on certain devices.

Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-25 10:55:09 -04:00
Michael Buesch
664f200610 b43legacy: Do not return TX_BUSY from op_tx
Never return TX_BUSY from op_tx. It doesn't make sense to return
TX_BUSY, if we can not transmit the packet.
Drop the packet and return TX_OK.

Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-25 10:55:09 -04:00
Michael Buesch
c9e8eae093 b43: Do not return TX_BUSY from op_tx
Never return TX_BUSY from op_tx. It doesn't make sense to return
TX_BUSY, if we can not transmit the packet.
Drop the packet and return TX_OK.
This will fix the resume hang.

Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-25 10:51:51 -04:00
Tony Vroon
59d393ad92 mac80211: implement EU regulatory domain
Implement missing EU regulatory domain for mac80211. Based on the
information in IEEE 802.11-2007 (specifically pages 1142, 1143 & 1148)
and ETSI 301 893 (V1.4.1).
With thanks to Johannes Berg.

Signed-off-by: Tony Vroon <tony@linx.net>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-06-25 10:31:29 -04:00
Patrick McHardy
88a6f4ad76 netfilter: ip6table_mangle: don't reroute in LOCAL_IN
Rerouting should only happen in LOCAL_OUT, in INPUT its useless
since the packet has already chosen its final destination.

Noticed by Alexey Dobriyan <adobriyan@gmail.com>.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-24 13:30:45 -07:00
Eric W. Biederman
b9f75f45a6 netns: Don't receive new packets in a dead network namespace.
Alexey Dobriyan <adobriyan@gmail.com> writes:
> Subject: ICMP sockets destruction vs ICMP packets oops

> After icmp_sk_exit() nuked ICMP sockets, we get an interrupt.
> icmp_reply() wants ICMP socket.
>
> Steps to reproduce:
>
> 	launch shell in new netns
> 	move real NIC to netns
> 	setup routing
> 	ping -i 0
> 	exit from shell
>
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
> IP: [<ffffffff803fce17>] icmp_sk+0x17/0x30
> PGD 17f3cd067 PUD 17f3ce067 PMD 0 
> Oops: 0000 [1] PREEMPT SMP DEBUG_PAGEALLOC
> CPU 0 
> Modules linked in: usblp usbcore
> Pid: 0, comm: swapper Not tainted 2.6.26-rc6-netns-ct #4
> RIP: 0010:[<ffffffff803fce17>]  [<ffffffff803fce17>] icmp_sk+0x17/0x30
> RSP: 0018:ffffffff8057fc30  EFLAGS: 00010286
> RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff81017c7db900
> RDX: 0000000000000034 RSI: ffff81017c7db900 RDI: ffff81017dc41800
> RBP: ffffffff8057fc40 R08: 0000000000000001 R09: 000000000000a815
> R10: 0000000000000000 R11: 0000000000000001 R12: ffffffff8057fd28
> R13: ffffffff8057fd00 R14: ffff81017c7db938 R15: ffff81017dc41800
> FS:  0000000000000000(0000) GS:ffffffff80525000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: 0000000000000000 CR3: 000000017fcda000 CR4: 00000000000006e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process swapper (pid: 0, threadinfo ffffffff8053a000, task ffffffff804fa4a0)
> Stack:  0000000000000000 ffff81017c7db900 ffffffff8057fcf0 ffffffff803fcfe4
>  ffffffff804faa38 0000000000000246 0000000000005a40 0000000000000246
>  000000000001ffff ffff81017dd68dc0 0000000000005a40 0000000055342436
> Call Trace:
>  <IRQ>  [<ffffffff803fcfe4>] icmp_reply+0x44/0x1e0
>  [<ffffffff803d3a0a>] ? ip_route_input+0x23a/0x1360
>  [<ffffffff803fd645>] icmp_echo+0x65/0x70
>  [<ffffffff803fd300>] icmp_rcv+0x180/0x1b0
>  [<ffffffff803d6d84>] ip_local_deliver+0xf4/0x1f0
>  [<ffffffff803d71bb>] ip_rcv+0x33b/0x650
>  [<ffffffff803bb16a>] netif_receive_skb+0x27a/0x340
>  [<ffffffff803be57d>] process_backlog+0x9d/0x100
>  [<ffffffff803bdd4d>] net_rx_action+0x18d/0x250
>  [<ffffffff80237be5>] __do_softirq+0x75/0x100
>  [<ffffffff8020c97c>] call_softirq+0x1c/0x30
>  [<ffffffff8020f085>] do_softirq+0x65/0xa0
>  [<ffffffff80237af7>] irq_exit+0x97/0xa0
>  [<ffffffff8020f198>] do_IRQ+0xa8/0x130
>  [<ffffffff80212ee0>] ? mwait_idle+0x0/0x60
>  [<ffffffff8020bc46>] ret_from_intr+0x0/0xf
>  <EOI>  [<ffffffff80212f2c>] ? mwait_idle+0x4c/0x60
>  [<ffffffff80212f23>] ? mwait_idle+0x43/0x60
>  [<ffffffff8020a217>] ? cpu_idle+0x57/0xa0
>  [<ffffffff8040f380>] ? rest_init+0x70/0x80
> Code: 10 5b 41 5c 41 5d 41 5e c9 c3 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 53
> 48 83 ec 08 48 8b 9f 78 01 00 00 e8 2b c7 f1 ff 89 c0 <48> 8b 04 c3 48 83 c4 08
> 5b c9 c3 66 66 66 66 66 2e 0f 1f 84 00
> RIP  [<ffffffff803fce17>] icmp_sk+0x17/0x30
>  RSP <ffffffff8057fc30>
> CR2: 0000000000000000
> ---[ end trace ea161157b76b33e8 ]---
> Kernel panic - not syncing: Aiee, killing interrupt handler!

Receiving packets while we are cleaning up a network namespace is a
racy proposition. It is possible when the packet arrives that we have
removed some but not all of the state we need to fully process it.  We
have the choice of either playing wack-a-mole with the cleanup routines
or simply dropping packets when we don't have a network namespace to
handle them.

Since the check looks inexpensive in netif_receive_skb let's just
drop the incoming packets.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-20 22:16:51 -07:00
David S. Miller
735ce972fb sctp: Make sure N * sizeof(union sctp_addr) does not overflow.
As noticed by Gabriel Campana, the kmalloc() length arg
passed in by sctp_getsockopt_local_addrs_old() can overflow
if ->addr_num is large enough.

Therefore, enforce an appropriate limit.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-20 22:04:34 -07:00
Stephen Hemminger
2645a3c376 pppoe: warning fix
Fix warning:
drivers/net/pppoe.c: In function 'pppoe_recvmsg':
drivers/net/pppoe.c:945: warning: comparison of distinct pointer types lacks a cast
because skb->len is unsigned int and total_len is size_t

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-20 21:58:02 -07:00
YOSHIFUJI Hideaki
f630e43a21 ipv6: Drop packets for loopback address from outside of the box.
[ Based upon original report and patch by Karsten Keil.  Karsten
  has verified that this fixes the TAHI test case "ICMPv6 test
  v6LC.5.1.2 Part F". -DaveM ]

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-19 16:33:57 -07:00
Shan Wei
aea7427f70 ipv6: Remove options header when setsockopt's optlen is 0
Remove the sticky Hop-by-Hop options header by calling setsockopt()
for IPV6_HOPOPTS with a zero option length, per RFC3542.

Routing header and Destination options header does the same as
Hop-by-Hop options header.

Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-19 16:29:39 -07:00
Johannes Berg
ef3a62d272 mac80211: detect driver tx bugs
When a driver rejects a frame in it's ->tx() callback, it must also
stop queues, otherwise mac80211 can go into a loop here. Detect this
situation and abort the loop after five retries, warning about the
driver bug.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-18 15:39:48 -07:00
Patrick McHardy
6d1a3fb567 netlink: genl: fix circular locking
genetlink has a circular locking dependency when dumping the registered
families:

- dump start:
genl_rcv()            : take genl_mutex
genl_rcv_msg()        : call netlink_dump_start() while holding genl_mutex
netlink_dump_start(),
netlink_dump()        : take nlk->cb_mutex
ctrl_dumpfamily()     : try to detect this case and not take genl_mutex a
                        second time

- dump continuance:
netlink_rcv()         : call netlink_dump
netlink_dump          : take nlk->cb_mutex
ctrl_dumpfamily()     : take genl_mutex

Register genl_lock as callback mutex with netlink to fix this. This slightly
widens an already existing module unload race, the genl ops used during the
dump might go away when the module is unloaded. Thomas Graf is working on a
seperate fix for this.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-18 02:07:07 -07:00
David S. Miller
3a5be7d4b0 Revert "mac80211: Use skb_header_cloned() on TX path."
This reverts commit 608961a5ec.

The problem is that the mac80211 stack not only needs to be able to
muck with the link-level headers, it also might need to mangle all of
the packet data if doing sw wireless encryption.

This fixes kernel bugzilla #10903.  Thanks to Didier Raboud (for the
bugzilla report), Andrew Prince (for bisecting), Johannes Berg (for
bringing this bisection analysis to my attention), and Ilpo (for
trying to analyze this purely from the TCP side).

In 2.6.27 we can take another stab at this, by using something like
skb_cow_data() when the TX path of mac80211 ends up with a non-NULL
tx->key.  The ESP protocol code in the IPSEC stack can be used as a
model for implementation.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-18 01:19:51 -07:00
Rainer Weikusat
3c73419c09 af_unix: fix 'poll for write'/ connected DGRAM sockets
The unix_dgram_sendmsg routine implements a (somewhat crude)
form of receiver-imposed flow control by comparing the length of the
receive queue of the 'peer socket' with the max_ack_backlog value
stored in the corresponding sock structure, either blocking
the thread which caused the send-routine to be called or returning
EAGAIN. This routine is used by both SOCK_DGRAM and SOCK_SEQPACKET
sockets. The poll-implementation for these socket types is
datagram_poll from core/datagram.c. A socket is deemed to be writeable
by this routine when the memory presently consumed by datagrams
owned by it is less than the configured socket send buffer size. This
is always wrong for connected PF_UNIX non-stream sockets when the
abovementioned receive queue is currently considered to be full.
'poll' will then return, indicating that the socket is writeable, but
a subsequent write result in EAGAIN, effectively causing an
(usual) application to 'poll for writeability by repeated send request
with O_NONBLOCK set' until it has consumed its time quantum.

The change below uses a suitably modified variant of the datagram_poll
routines for both type of PF_UNIX sockets, which tests if the
recv-queue of the peer a socket is connected to is presently
considered to be 'full' as part of the 'is this socket
writeable'-checking code. The socket being polled is additionally
put onto the peer_wait wait queue associated with its peer, because the
unix_dgram_sendmsg routine does a wake up on this queue after a
datagram was received and the 'other wakeup call' is done implicitly
as part of skb destruction, meaning, a process blocked in poll
because of a full peer receive queue could otherwise sleep forever
if no datagram owned by its socket was already sitting on this queue.
Among this change is a small (inline) helper routine named
'unix_recvq_full', which consolidates the actual testing code (in three
different places) into a single location.

Signed-off-by: Rainer Weikusat <rweikusat@mssgmbh.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-17 22:28:05 -07:00
David S. Miller
4552e1198a Merge branch 'davem-fixes' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/netdev-2.6 2008-06-17 21:32:08 -07:00
Ang Way Chuang
f09f7ee20c tun: Proper handling of IPv6 header in tun driver when TUN_NO_PI is set
By default, tun.c running in TUN_TUN_DEV mode will set the protocol of
packet to IPv4 if TUN_NO_PI is set. My program failed to work when I
assumed that the driver will check the first nibble of packet,
determine IP version and set the appropriate protocol.

Signed-off-by: Ang Way Chuang <wcang@nav6.org>
Acked-by: Max Krasnyansky <maxk@qualcomm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-17 21:10:33 -07:00
Radu Cristescu
58c7821c42 atl1: relax eeprom mac address error check
The atl1 driver tries to determine the MAC address thusly:

	- If an EEPROM exists, read the MAC address from EEPROM and
	  validate it.
	- If an EEPROM doesn't exist, try to read a MAC address from
	  SPI flash.
	- If that fails, try to read a MAC address directly from the
	  MAC Station Address register.
	- If that fails, assign a random MAC address provided by the
	  kernel.

We now have a report of a system fitted with an EEPROM containing all
zeros where we expect the MAC address to be, and we currently handle
this as an error condition.  Turns out, on this system the BIOS writes
a valid MAC address to the NIC's MAC Station Address register, but we
never try to read it because we return an error when we find the all-
zeros address in EEPROM.

This patch relaxes the error check and continues looking for a MAC
address even if it finds an illegal one in EEPROM.

Signed-off-by: Radu Cristescu <advantis@gmx.net>
Signed-off-by: Jay Cliburn <jacliburn@bellsouth.net>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2008-06-17 23:09:21 -04:00