kernel_optimize_test/net
Daniel Borkmann 4e7c133068 netlink: make sure -EBUSY won't escape from netlink_insert
Linus reports the following deadlock on rtnl_mutex; triggered only
once so far (extract):

[12236.694209] NetworkManager  D 0000000000013b80     0  1047      1 0x00000000
[12236.694218]  ffff88003f902640 0000000000000000 ffffffff815d15a9 0000000000000018
[12236.694224]  ffff880119538000 ffff88003f902640 ffffffff81a8ff84 00000000ffffffff
[12236.694230]  ffffffff81a8ff88 ffff880119c47f00 ffffffff815d133a ffffffff81a8ff80
[12236.694235] Call Trace:
[12236.694250]  [<ffffffff815d15a9>] ? schedule_preempt_disabled+0x9/0x10
[12236.694257]  [<ffffffff815d133a>] ? schedule+0x2a/0x70
[12236.694263]  [<ffffffff815d15a9>] ? schedule_preempt_disabled+0x9/0x10
[12236.694271]  [<ffffffff815d2c3f>] ? __mutex_lock_slowpath+0x7f/0xf0
[12236.694280]  [<ffffffff815d2cc6>] ? mutex_lock+0x16/0x30
[12236.694291]  [<ffffffff814f1f90>] ? rtnetlink_rcv+0x10/0x30
[12236.694299]  [<ffffffff8150ce3b>] ? netlink_unicast+0xfb/0x180
[12236.694309]  [<ffffffff814f5ad3>] ? rtnl_getlink+0x113/0x190
[12236.694319]  [<ffffffff814f202a>] ? rtnetlink_rcv_msg+0x7a/0x210
[12236.694331]  [<ffffffff8124565c>] ? sock_has_perm+0x5c/0x70
[12236.694339]  [<ffffffff814f1fb0>] ? rtnetlink_rcv+0x30/0x30
[12236.694346]  [<ffffffff8150d62c>] ? netlink_rcv_skb+0x9c/0xc0
[12236.694354]  [<ffffffff814f1f9f>] ? rtnetlink_rcv+0x1f/0x30
[12236.694360]  [<ffffffff8150ce3b>] ? netlink_unicast+0xfb/0x180
[12236.694367]  [<ffffffff8150d344>] ? netlink_sendmsg+0x484/0x5d0
[12236.694376]  [<ffffffff810a236f>] ? __wake_up+0x2f/0x50
[12236.694387]  [<ffffffff814cad23>] ? sock_sendmsg+0x33/0x40
[12236.694396]  [<ffffffff814cb05e>] ? ___sys_sendmsg+0x22e/0x240
[12236.694405]  [<ffffffff814cab75>] ? ___sys_recvmsg+0x135/0x1a0
[12236.694415]  [<ffffffff811a9d12>] ? eventfd_write+0x82/0x210
[12236.694423]  [<ffffffff811a0f9e>] ? fsnotify+0x32e/0x4c0
[12236.694429]  [<ffffffff8108cb70>] ? wake_up_q+0x60/0x60
[12236.694434]  [<ffffffff814cba09>] ? __sys_sendmsg+0x39/0x70
[12236.694440]  [<ffffffff815d4797>] ? entry_SYSCALL_64_fastpath+0x12/0x6a

It seems so far plausible that the recursive call into rtnetlink_rcv()
looks suspicious. One way, where this could trigger is that the senders
NETLINK_CB(skb).portid was wrongly 0 (which is rtnetlink socket), so
the rtnl_getlink() request's answer would be sent to the kernel instead
to the actual user process, thus grabbing rtnl_mutex() twice.

One theory would be that netlink_autobind() triggered via netlink_sendmsg()
internally overwrites the -EBUSY error to 0, but where it is wrongly
originating from __netlink_insert() instead. That would reset the
socket's portid to 0, which is then filled into NETLINK_CB(skb).portid
later on. As commit d470e3b483 ("[NETLINK]: Fix two socket hashing bugs.")
also puts it, -EBUSY should not be propagated from netlink_insert().

It looks like it's very unlikely to reproduce. We need to trigger the
rhashtable_insert_rehash() handler under a situation where rehashing
currently occurs (one /rare/ way would be to hit ht->elasticity limits
while not filled enough to expand the hashtable, but that would rather
require a specifically crafted bind() sequence with knowledge about
destination slots, seems unlikely). It probably makes sense to guard
__netlink_insert() in any case and remap that error. It was suggested
that EOVERFLOW might be better than an already overloaded ENOMEM.

Reference: http://thread.gmane.org/gmane.linux.network/372676
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-10 10:59:10 -07:00
..
6lowpan
9p virtio/vhost: fixes for 4.2 2015-07-23 13:07:04 -07:00
802
8021q
appletalk
atm
ax25 NET: AX.25: Stop heartbeat timer on disconnect. 2015-07-15 15:59:58 -07:00
batman-adv batman-adv: initialize up/down values when adding a gateway 2015-08-05 00:31:47 +02:00
bluetooth Bluetooth: Fix NULL pointer dereference in smp_conn_security 2015-07-23 16:41:24 +02:00
bridge bridge: netlink: account for the IFLA_BRPORT_PROXYARP_WIFI attribute size and policy 2015-08-06 23:54:42 -07:00
caif caif: fix leaks and race in caif_queue_rcv_skb() 2015-07-21 00:02:44 -07:00
can can: replace timestamp as unique skb attribute 2015-07-12 21:13:22 +02:00
ceph libceph: treat sockaddr_storage with uninitialized family as blank 2015-07-09 20:30:34 +03:00
core net: pktgen: don't abuse current->state in pktgen_thread_worker() 2015-08-06 23:52:44 -07:00
dcb
dccp tcp: fix recv with flags MSG_WAITALL | MSG_PEEK 2015-07-27 01:06:53 -07:00
decnet
dns_resolver
dsa net: dsa: Fix off-by-one in switch address parsing 2015-07-11 23:25:16 -07:00
ethernet
hsr
ieee802154 inet: frag: change *_frag_mem_limit functions to take netns_frags as argument 2015-07-26 21:00:14 -07:00
ipv4 udp: fix dst races with multicast early demux 2015-08-03 22:16:50 -07:00
ipv6 ipv6: flush nd cache on IFF_NOARP change 2015-07-29 23:01:39 -07:00
ipx
irda
iucv
key Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2015-06-24 16:49:49 -07:00
l2tp
lapb
llc tcp: fix recv with flags MSG_WAITALL | MSG_PEEK 2015-07-27 01:06:53 -07:00
mac80211 cfg80211: use RTNL locked reg_can_beacon for IR-relaxation 2015-07-17 15:02:02 +02:00
mac802154 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2015-06-24 16:49:49 -07:00
mpls
netfilter netfilter: nf_conntrack: Support expectations in different zones 2015-07-22 17:00:47 +02:00
netlabel
netlink netlink: make sure -EBUSY won't escape from netlink_insert 2015-08-10 10:59:10 -07:00
netrom netfilter: Remove spurios included of netfilter.h 2015-06-18 21:14:32 +02:00
nfc
openvswitch openvswitch: Fix L4 checksum handling when dealing with IP fragments 2015-08-03 14:03:08 -07:00
packet packet: tpacket_snd(): fix signed/unsigned comparison 2015-07-29 00:09:58 -07:00
phonet
rds rds: fix an integer overflow test in rds_info_getsockopt() 2015-08-03 15:20:16 -07:00
rfkill
rose Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-06-24 02:58:51 -07:00
rxrpc
sched act_mirred: avoid calling tcf_hash_release() when binding 2015-08-03 14:13:28 -07:00
sctp net: sctp: stop spamming klog with rfc6458, 5.3.2. deprecation warnings 2015-07-26 16:32:41 -07:00
sunrpc NFS client bugfixes for Linux 4.2 2015-07-28 09:37:44 -07:00
switchdev net: switchdev: don't abort unsupported operations 2015-07-11 21:29:55 -07:00
tipc net/tipc: initialize security state for new connection socket 2015-07-08 16:08:23 -07:00
unix
vmw_vsock
wimax
wireless cfg80211: use RTNL locked reg_can_beacon for IR-relaxation 2015-07-17 15:02:02 +02:00
x25
xfrm Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2015-06-24 16:49:49 -07:00
compat.c
Kconfig
Makefile
socket.c
sysctl_net.c