kernel_optimize_test/include/net/netns
Kuniyuki Iwashima 4b01a96742 tcp: bind(0) remove the SO_REUSEADDR restriction when ephemeral ports are exhausted.
Commit aacd9289af ("tcp: bind() use stronger
condition for bind_conflict") introduced a restriction to forbid to bind
SO_REUSEADDR enabled sockets to the same (addr, port) tuple in order to
assign ports dispersedly so that we can connect to the same remote host.

The change results in accelerating port depletion so that we fail to bind
sockets to the same local port even if we want to connect to the different
remote hosts.

You can reproduce this issue by following instructions below.

  1. # sysctl -w net.ipv4.ip_local_port_range="32768 32768"
  2. set SO_REUSEADDR to two sockets.
  3. bind two sockets to (localhost, 0) and the latter fails.

Therefore, when ephemeral ports are exhausted, bind(0) should fallback to
the legacy behaviour to enable the SO_REUSEADDR option and make it possible
to connect to different remote (addr, port) tuples.

This patch allows us to bind SO_REUSEADDR enabled sockets to the same
(addr, port) only when net.ipv4.ip_autobind_reuse is set 1 and all
ephemeral ports are exhausted. This also allows connect() and listen() to
share ports in the following way and may break some applications. So the
ip_autobind_reuse is 0 by default and disables the feature.

  1. setsockopt(sk1, SO_REUSEADDR)
  2. setsockopt(sk2, SO_REUSEADDR)
  3. bind(sk1, saddr, 0)
  4. bind(sk2, saddr, 0)
  5. connect(sk1, daddr)
  6. listen(sk2)

If it is set 1, we can fully utilize the 4-tuples, but we should use
IP_BIND_ADDRESS_NO_PORT for bind()+connect() as possible.

The notable thing is that if all sockets bound to the same port have
both SO_REUSEADDR and SO_REUSEPORT enabled, we can bind sockets to an
ephemeral port and also do listen().

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-12 12:08:09 -07:00
..
can.h can: netns: remove "can_" prefix from members struct netns_can 2019-09-04 13:29:14 +02:00
conntrack.h netfilter: conntrack: limit sysctl setting for boolean options 2019-04-30 14:18:56 +02:00
core.h sock: Hide unused variable when !CONFIG_PROC_FS. 2017-12-19 09:58:14 -05:00
dccp.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
generic.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
hash.h netns: provide pure entropy for net_hash_mix() 2019-03-28 17:00:45 -07:00
ieee802154_6lowpan.h net: dynamically allocate fqdir structures 2019-05-26 14:08:05 -07:00
ipv4.h tcp: bind(0) remove the SO_REUSEADDR restriction when ephemeral ports are exhausted. 2020-03-12 12:08:09 -07:00
ipv6.h ipv6: keep track of routes using src 2019-11-21 14:45:55 -08:00
mib.h net/tls: add skeleton of MIB statistics 2019-10-05 16:29:00 -07:00
mpls.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
netfilter.h netfilter: don't allocate space for arp/bridge hooks unless needed 2018-01-08 18:01:11 +01:00
nexthop.h net: Initial nexthop code 2019-05-28 21:37:30 -07:00
nftables.h netfilter: nf_tables: autoload modules from the abort path 2020-01-24 20:54:29 +01:00
packet.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
sctp.h sctp: add support for Primary Path Switchover 2019-11-08 14:18:32 -08:00
unix.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
x_tables.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
xdp.h net: xsk: track AF_XDP sockets on a per-netns list 2019-01-25 01:50:03 +01:00
xfrm.h xfrm: policy: store inexact policies in an rhashtable 2018-11-09 11:57:47 +01:00