kernel_optimize_test/include/net
Florian Westphal 7223ecd466 netfilter: nat: switch to new rhlist interface
I got offlist bug report about failing connections and high cpu usage.
This happens because we hit 'elasticity' checks in rhashtable that
refuses bucket list exceeding 16 entries.

The nat bysrc hash unfortunately needs to insert distinct objects that
share same key and are identical (have same source tuple), this cannot
be avoided.

Switch to the rhlist interface which is designed for this.

The nulls_base is removed here, I don't think its needed:

A (unlikely) false positive results in unneeded port clash resolution,
a false negative results in packet drop during conntrack confirmation,
when we try to insert the duplicate into main conntrack hash table.

Tested by adding multiple ip addresses to host, then adding
iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE

... and then creating multiple connections, from same source port but
different addresses:

for i in $(seq 2000 2032);do nc -p 1234 192.168.7.1 $i > /dev/null  & done

(all of these then get hashed to same bysource slot)

Then, to test that nat conflict resultion is working:

nc -s 10.0.0.1 -p 1234 192.168.7.1 2000
nc -s 10.0.0.2 -p 1234 192.168.7.1 2000

tcp  .. src=10.0.0.1 dst=192.168.7.1 sport=1234 dport=2000 src=192.168.7.1 dst=192.168.7.10 sport=2000 dport=1024 [ASSURED]
tcp  .. src=10.0.0.2 dst=192.168.7.1 sport=1234 dport=2000 src=192.168.7.1 dst=192.168.7.10 sport=2000 dport=1025 [ASSURED]
tcp  .. src=192.168.7.10 dst=192.168.7.1 sport=1234 dport=2000 src=192.168.7.1 dst=192.168.7.10 sport=2000 dport=1234 [ASSURED]
tcp  .. src=192.168.7.10 dst=192.168.7.1 sport=1234 dport=2001 src=192.168.7.1 dst=192.168.7.10 sport=2001 dport=1234 [ASSURED]
[..]

-> nat altered source ports to 1024 and 1025, respectively.
This can also be confirmed on destination host which shows
ESTAB      0      0   192.168.7.1:2000      192.168.7.10:1024
ESTAB      0      0   192.168.7.1:2000      192.168.7.10:1025
ESTAB      0      0   192.168.7.1:2000      192.168.7.10:1234

Cc: Herbert Xu <herbert@gondor.apana.org.au>
Fixes: 870190a9ec ("netfilter: nat: convert nat bysrc hash to rhashtable")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-11-24 14:43:34 +01:00
..
9p
bluetooth
caif
irda
iucv
netfilter netfilter: nat: switch to new rhlist interface 2016-11-24 14:43:34 +01:00
netns
nfc
phonet
sctp sctp: hold transport instead of assoc when lookup assoc in rx path 2016-10-31 16:20:33 -04:00
tc_act
6lowpan.h
act_api.h
addrconf.h ipv6: fix a potential deadlock in do_ipv6_setsockopt() 2016-10-21 11:29:02 -04:00
af_ieee802154.h
af_rxrpc.h
af_unix.h
af_vsock.h
ah.h
arp.h
atmclip.h
ax25.h
ax88796.h
bond_3ad.h
bond_alb.h
bond_options.h
bonding.h
busy_poll.h
calipso.h
cfg80211-wext.h
cfg80211.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-10-29 20:33:20 -07:00
cfg802154.h
checksum.h
cipso_ipv4.h
cls_cgroup.h
codel_impl.h
codel_qdisc.h
codel.h
compat.h
datalink.h
dcbevent.h
dcbnl.h
devlink.h
dn_dev.h
dn_fib.h
dn_neigh.h
dn_nsp.h
dn_route.h
dn.h
dsa.h
dsfield.h
dst_cache.h
dst_metadata.h
dst_ops.h
dst.h
esp.h
ethoc.h
fib_rules.h
firewire.h
flow_dissector.h
flow.h
flowcache.h
fou.h
fq_impl.h
fq.h
garp.h
gen_stats.h
genetlink.h
geneve.h
gre.h
gro_cells.h gro_cells: mark napi struct as not busy poll candidates 2016-11-15 22:27:27 -05:00
gtp.h
gue.h
hwbm.h
icmp.h
ieee80211_radiotap.h
ieee802154_netdev.h
if_inet6.h IPv6: fix DESYNC_FACTOR 2016-10-14 10:59:15 -04:00
ila.h
inet6_connection_sock.h
inet6_hashtables.h
inet_common.h
inet_connection_sock.h
inet_ecn.h
inet_frag.h
inet_hashtables.h
inet_sock.h
inet_timewait_sock.h
inetpeer.h
ip6_checksum.h
ip6_fib.h net: ipv6: Fix processing of RAs in presence of VRF 2016-10-27 16:30:52 -04:00
ip6_route.h net: ipv6: Do not consider link state for nexthop validation 2016-10-27 16:33:12 -04:00
ip6_tunnel.h ip6_tunnel: Clear IP6CB in ip6tunnel_xmit() 2016-11-02 15:18:36 -04:00
ip_fib.h ipv4: Restore fib_trie_flush_external function and fix call ordering 2016-11-16 13:24:50 -05:00
ip_tunnels.h
ip_vs.h
ip.h ipv4: allow local fragmentation in ip_finish_output_gso() 2016-11-03 16:10:26 -04:00
ipcomp.h
ipconfig.h
ipv6.h
ipx.h
iw_handler.h
kcm.h
l3mdev.h net: ipv4: Do not drop to make_route if oif is l3mdev 2016-10-13 12:05:26 -04:00
lapb.h
lib80211.h
llc_c_ac.h
llc_c_ev.h
llc_c_st.h
llc_conn.h
llc_if.h
llc_pdu.h
llc_s_ac.h
llc_s_ev.h
llc_s_st.h
llc_sap.h
llc.h
lwtunnel.h
mac80211.h mac80211: fix some sphinx warnings 2016-10-26 08:01:07 +02:00
mac802154.h
mip6.h
mld.h
mpls_iptunnel.h
mpls.h
mrp.h
ncsi.h
ndisc.h
neighbour.h
net_namespace.h netns: fix get_net_ns_by_fd(int pid) typo 2016-11-18 14:01:58 -05:00
net_ratelimit.h
netevent.h
netlabel.h
netlink.h
netprio_cgroup.h
netrom.h
nexthop.h
nl802154.h
p8022.h
ping.h
pkt_cls.h
pkt_sched.h
pptp.h
protocol.h
psnap.h
raw.h
rawv6.h
red.h
regulatory.h
request_sock.h
rose.h
route.h
rtnetlink.h
sch_generic.h
scm.h
secure_seq.h
slhc_vj.h
snmp.h
sock_reuseport.h
sock.h dccp: do not release listeners too soon 2016-11-03 16:16:50 -04:00
Space.h
stp.h
strparser.h
switchdev.h
tcp_states.h
tcp.h tcp: take care of truncations done by sk_filter() 2016-11-13 12:30:02 -05:00
timewait_sock.h
transp_v6.h
tso.h
udp_tunnel.h
udp.h udp: must lock the socket in udp_disconnect() 2016-10-20 14:45:52 -04:00
udplite.h
vsock_addr.h
vxlan.h vxlan: avoid using stale vxlan socket. 2016-10-29 20:56:31 -04:00
wext.h
wimax.h
x25.h
x25device.h
xfrm.h