kernel_optimize_test/net
Sven Wegener aea9d711f3 ipvs: Add missing locking during connection table hashing and unhashing
The code that hashes and unhashes connections from the connection table
is missing locking of the connection being modified, which opens up a
race condition and results in memory corruption when this race condition
is hit.

Here is what happens in pretty verbose form:

CPU 0					CPU 1
------------				------------
An active connection is terminated and
we schedule ip_vs_conn_expire() on this
CPU to expire this connection.

					IRQ assignment is changed to this CPU,
					but the expire timer stays scheduled on
					the other CPU.

					New connection from same ip:port comes
					in right before the timer expires, we
					find the inactive connection in our
					connection table and get a reference to
					it. We proper lock the connection in
					tcp_state_transition() and read the
					connection flags in set_tcp_state().

ip_vs_conn_expire() gets called, we
unhash the connection from our
connection table and remove the hashed
flag in ip_vs_conn_unhash(), without
proper locking!

					While still holding proper locks we
					write the connection flags in
					set_tcp_state() and this sets the hashed
					flag again.

ip_vs_conn_expire() fails to expire the
connection, because the other CPU has
incremented the reference count. We try
to re-insert the connection into our
connection table, but this fails in
ip_vs_conn_hash(), because the hashed
flag has been set by the other CPU. We
re-schedule execution of
ip_vs_conn_expire(). Now this connection
has the hashed flag set, but isn't
actually hashed in our connection table
and has a dangling list_head.

					We drop the reference we held on the
					connection and schedule the expire timer
					for timeouting the connection on this
					CPU. Further packets won't be able to
					find this connection in our connection
					table.

					ip_vs_conn_expire() gets called again,
					we think it's already hashed, but the
					list_head is dangling and while removing
					the connection from our connection table
					we write to the memory location where
					this list_head points to.

The result will probably be a kernel oops at some other point in time.

This race condition is pretty subtle, but it can be triggered remotely.
It needs the IRQ assignment change or another circumstance where packets
coming from the same ip:port for the same service are being processed on
different CPUs. And it involves hitting the exact time at which
ip_vs_conn_expire() gets called. It can be avoided by making sure that
all packets from one connection are always processed on the same CPU and
can be made harder to exploit by changing the connection timeouts to
some custom values.

Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
Cc: stable@kernel.org
Acked-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2010-06-09 16:10:57 +02:00
..
9p kernel-wide: replace USHORT_MAX, SHORT_MAX and SHORT_MIN with USHRT_MAX, SHRT_MAX and SHRT_MIN 2010-05-25 08:07:02 -07:00
802 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2010-04-11 14:53:53 -07:00
8021q net: Remove unnecessary semicolons after switch statements 2010-05-17 17:44:35 -07:00
appletalk Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2010-04-11 14:53:53 -07:00
atm net: Remove unnecessary returns from void function()s 2010-05-17 23:23:14 -07:00
ax25 net: sk_sleep() helper 2010-04-20 16:37:13 -07:00
bluetooth net: Remove unnecessary returns from void function()s 2010-05-17 23:23:14 -07:00
bridge sysfs: add struct file* to bin_attr callbacks 2010-05-21 09:37:31 -07:00
caif caif: Bugfix - use MSG_TRUNC in receive 2010-05-23 23:57:43 -07:00
can net: Remove unnecessary returns from void function()s 2010-05-17 23:23:14 -07:00
core Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 2010-05-28 10:18:40 -07:00
dcb include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
dccp Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 2010-05-25 16:59:51 -07:00
decnet net: Remove unnecessary returns from void function()s 2010-05-17 23:23:14 -07:00
dsa Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2010-04-11 14:53:53 -07:00
econet include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
ethernet net: Inline skb_pull() in eth_type_trans(). 2010-05-02 02:21:44 -07:00
ieee802154 ieee802154: Fix possible NULL pointer dereference in wpan_phy_alloc 2010-05-23 23:11:07 -07:00
ipv4 netfilter: xtables: stackptr should be percpu 2010-05-31 16:41:35 +02:00
ipv6 netfilter: xtables: stackptr should be percpu 2010-05-31 16:41:35 +02:00
ipx include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
irda net: Remove unnecessary returns from void function()s 2010-05-17 23:23:14 -07:00
iucv Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 2010-05-28 10:18:40 -07:00
key pfkey: add severity to printk 2010-05-17 23:23:13 -07:00
l2tp l2tp_eth: fix memory allocation 2010-04-23 16:37:33 -07:00
lapb include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
llc Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2010-05-12 00:05:35 -07:00
mac80211 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 2010-05-25 16:59:51 -07:00
netfilter ipvs: Add missing locking during connection table hashing and unhashing 2010-06-09 16:10:57 +02:00
netlabel net: Remove unnecessary returns from void function()s 2010-05-17 23:23:14 -07:00
netlink netlink: Implment netlink_broadcast_filtered 2010-05-21 09:37:32 -07:00
netrom net: sk_sleep() helper 2010-04-20 16:37:13 -07:00
packet Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2010-04-21 01:14:25 -07:00
phonet Phonet: fix potential use-after-free in pep_sock_close() 2010-05-25 16:08:39 -07:00
rds net: Remove unnecessary semicolons after switch statements 2010-05-17 17:44:35 -07:00
rfkill Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2010-04-11 14:53:53 -07:00
rose net: sk_sleep() helper 2010-04-20 16:37:13 -07:00
rxrpc net: sock_def_readable() and friends RCU conversion 2010-05-01 15:00:15 -07:00
sched cls_cgroup: Store classid in struct sock 2010-05-24 00:12:34 -07:00
sctp net: Remove unnecessary returns from void function()s 2010-05-17 23:23:14 -07:00
sunrpc sunrpc: fix leak on error on socket xprt setup 2010-05-26 08:43:50 -04:00
tipc tipc: Reduce footprint by un-inlining tipc_msg_* routines 2010-05-12 23:02:29 -07:00
unix unix/garbage: kill copy of the skb queue walker 2010-05-03 15:39:58 -07:00
wanrouter
wimax Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6 2010-05-20 21:04:44 -07:00
wireless cfg80211: add missing braces 2010-05-21 14:40:01 -04:00
x25 X25: Remove bkl in sockopts 2010-05-17 17:39:28 -07:00
xfrm net: Remove unnecessary returns from void function()s 2010-05-17 23:23:14 -07:00
compat.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
Kconfig net/sysfs: Fix the bitrot in network device kobject namespace support 2010-05-21 09:37:32 -07:00
Makefile l2tp: Split pppol2tp patch into separate l2tp and ppp parts 2010-04-03 14:56:02 -07:00
nonet.c
socket.c cls_cgroup: Store classid in struct sock 2010-05-24 00:12:34 -07:00
sysctl_net.c net: Remove unnecessary returns from void function()s 2010-05-17 23:23:14 -07:00
TUNABLE