kernel_optimize_test/net
Eric W. Biederman b9f75f45a6 netns: Don't receive new packets in a dead network namespace.
Alexey Dobriyan <adobriyan@gmail.com> writes:
> Subject: ICMP sockets destruction vs ICMP packets oops

> After icmp_sk_exit() nuked ICMP sockets, we get an interrupt.
> icmp_reply() wants ICMP socket.
>
> Steps to reproduce:
>
> 	launch shell in new netns
> 	move real NIC to netns
> 	setup routing
> 	ping -i 0
> 	exit from shell
>
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
> IP: [<ffffffff803fce17>] icmp_sk+0x17/0x30
> PGD 17f3cd067 PUD 17f3ce067 PMD 0 
> Oops: 0000 [1] PREEMPT SMP DEBUG_PAGEALLOC
> CPU 0 
> Modules linked in: usblp usbcore
> Pid: 0, comm: swapper Not tainted 2.6.26-rc6-netns-ct #4
> RIP: 0010:[<ffffffff803fce17>]  [<ffffffff803fce17>] icmp_sk+0x17/0x30
> RSP: 0018:ffffffff8057fc30  EFLAGS: 00010286
> RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff81017c7db900
> RDX: 0000000000000034 RSI: ffff81017c7db900 RDI: ffff81017dc41800
> RBP: ffffffff8057fc40 R08: 0000000000000001 R09: 000000000000a815
> R10: 0000000000000000 R11: 0000000000000001 R12: ffffffff8057fd28
> R13: ffffffff8057fd00 R14: ffff81017c7db938 R15: ffff81017dc41800
> FS:  0000000000000000(0000) GS:ffffffff80525000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: 0000000000000000 CR3: 000000017fcda000 CR4: 00000000000006e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process swapper (pid: 0, threadinfo ffffffff8053a000, task ffffffff804fa4a0)
> Stack:  0000000000000000 ffff81017c7db900 ffffffff8057fcf0 ffffffff803fcfe4
>  ffffffff804faa38 0000000000000246 0000000000005a40 0000000000000246
>  000000000001ffff ffff81017dd68dc0 0000000000005a40 0000000055342436
> Call Trace:
>  <IRQ>  [<ffffffff803fcfe4>] icmp_reply+0x44/0x1e0
>  [<ffffffff803d3a0a>] ? ip_route_input+0x23a/0x1360
>  [<ffffffff803fd645>] icmp_echo+0x65/0x70
>  [<ffffffff803fd300>] icmp_rcv+0x180/0x1b0
>  [<ffffffff803d6d84>] ip_local_deliver+0xf4/0x1f0
>  [<ffffffff803d71bb>] ip_rcv+0x33b/0x650
>  [<ffffffff803bb16a>] netif_receive_skb+0x27a/0x340
>  [<ffffffff803be57d>] process_backlog+0x9d/0x100
>  [<ffffffff803bdd4d>] net_rx_action+0x18d/0x250
>  [<ffffffff80237be5>] __do_softirq+0x75/0x100
>  [<ffffffff8020c97c>] call_softirq+0x1c/0x30
>  [<ffffffff8020f085>] do_softirq+0x65/0xa0
>  [<ffffffff80237af7>] irq_exit+0x97/0xa0
>  [<ffffffff8020f198>] do_IRQ+0xa8/0x130
>  [<ffffffff80212ee0>] ? mwait_idle+0x0/0x60
>  [<ffffffff8020bc46>] ret_from_intr+0x0/0xf
>  <EOI>  [<ffffffff80212f2c>] ? mwait_idle+0x4c/0x60
>  [<ffffffff80212f23>] ? mwait_idle+0x43/0x60
>  [<ffffffff8020a217>] ? cpu_idle+0x57/0xa0
>  [<ffffffff8040f380>] ? rest_init+0x70/0x80
> Code: 10 5b 41 5c 41 5d 41 5e c9 c3 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 53
> 48 83 ec 08 48 8b 9f 78 01 00 00 e8 2b c7 f1 ff 89 c0 <48> 8b 04 c3 48 83 c4 08
> 5b c9 c3 66 66 66 66 66 2e 0f 1f 84 00
> RIP  [<ffffffff803fce17>] icmp_sk+0x17/0x30
>  RSP <ffffffff8057fc30>
> CR2: 0000000000000000
> ---[ end trace ea161157b76b33e8 ]---
> Kernel panic - not syncing: Aiee, killing interrupt handler!

Receiving packets while we are cleaning up a network namespace is a
racy proposition. It is possible when the packet arrives that we have
removed some but not all of the state we need to fully process it.  We
have the choice of either playing wack-a-mole with the cleanup routines
or simply dropping packets when we don't have a network namespace to
handle them.

Since the check looks inexpensive in netif_receive_skb let's just
drop the incoming packets.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-20 22:16:51 -07:00
..
9p 9p: fix error path during early mount 2008-05-14 19:23:27 -05:00
802
8021q vlan: Use bitmask of feature flags instead of seperate feature bits 2008-05-23 00:27:50 -07:00
appletalk
atm atm: [br2864] fix routed vcmux support 2008-06-16 17:18:18 -07:00
ax25 ax25: Fix NULL pointer dereference and lockup. 2008-06-03 14:53:46 -07:00
bluetooth bluetooth: rfcomm_dev_state_change deadlock fix 2008-06-03 14:27:17 -07:00
bridge bridge: Consolidate error paths in br_add_bridge(). 2008-05-04 17:58:07 -07:00
can Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 2008-05-08 19:03:26 -07:00
core netns: Don't receive new packets in a dead network namespace. 2008-06-20 22:16:51 -07:00
dccp dccp: Bug in initial acknowledgment number assignment 2008-06-11 11:19:10 +01:00
decnet ip: Use inline function dst_metric() instead of direct access to dst->metric[] 2008-05-04 22:14:42 -07:00
econet net: Allow netdevices to specify needed head/tailroom 2008-05-12 20:48:31 -07:00
ethernet
ieee80211
ipv4 xfrm: fix fragmentation for ipv4 xfrm tunnel 2008-06-17 16:38:23 -07:00
ipv6 ipv6: Drop packets for loopback address from outside of the box. 2008-06-19 16:33:57 -07:00
ipx
irda irda: Sock leak on error path in irda_create. 2008-06-03 15:18:36 -07:00
iucv
key ipsec: pfkey should ignore events when no listeners 2008-06-10 14:25:34 -07:00
lapb
llc llc: Fix double accounting of received packets 2008-05-30 02:57:29 -07:00
mac80211 mac80211: detect driver tx bugs 2008-06-18 15:39:48 -07:00
netfilter netfilter: nf_conntrack_h323: fix module unload crash 2008-06-17 15:52:32 -07:00
netlabel Audit: collect sessionid in netlink messages 2008-04-28 06:18:03 -04:00
netlink netlink: genl: fix circular locking 2008-06-18 02:07:07 -07:00
netrom
packet net: Allow netdevices to specify needed head/tailroom 2008-05-12 20:48:31 -07:00
rfkill
rose rose: Wrong list_lock argument in rose_node seqops 2008-05-02 17:03:22 -07:00
rxrpc net: Add missing braces to multi-statement if()s 2008-05-02 16:20:10 -07:00
sched pkt_sched: Change HTB_HYSTERESIS to a runtime parameter htb_hysteresis. 2008-06-16 16:39:32 -07:00
sctp sctp: Make sure N * sizeof(union sctp_addr) does not overflow. 2008-06-20 22:04:34 -07:00
sunrpc Merge branch 'for-2.6.26' of git://linux-nfs.org/~bfields/linux 2008-05-20 19:30:54 -07:00
tipc tipc: Increase buffer header to support worst-case device 2008-05-08 21:38:24 -07:00
unix af_unix: fix 'poll for write'/ connected DGRAM sockets 2008-06-17 22:28:05 -07:00
wanrouter
wireless netlink: Improve returned error codes 2008-06-03 16:36:54 -07:00
x25
xfrm xfrm: xfrm_algo: correct usage of RIPEMD-160 2008-06-04 12:04:55 -07:00
compat.c net: Add compat support for getsockopt (MCAST_MSFILTER) 2008-04-29 03:23:22 -07:00
Kconfig
Makefile
nonet.c
socket.c net: Unexport move_addr_to_{kernel,user} 2008-04-23 03:37:49 -07:00
sysctl_net.c net: fix returning void-valued expression warnings 2008-05-01 02:47:38 -07:00
TUNABLE