kernel_optimize_test/net/ipv4
Eric Dumazet 7433819a1e tcp: do not create inetpeer on SYNACK message
Another problem on SYNFLOOD/DDOS attack is the inetpeer cache getting
larger and larger, using lots of memory and cpu time.

tcp_v4_send_synack()
->inet_csk_route_req()
 ->ip_route_output_flow()
  ->rt_set_nexthop()
   ->rt_init_metrics()
    ->inet_getpeer( create = true)

This is a side effect of commit a4daad6b09 (net: Pre-COW metrics for
TCP) added in 2.6.39

Possible solution :

Instruct inet_csk_route_req() to remove FLOWI_FLAG_PRECOW_METRICS

Before patch :

# grep peer /proc/slabinfo
inet_peer_cache   4175430 4175430    192   42    2 : tunables    0    0    0 : slabdata  99415  99415      0

Samples: 41K of event 'cycles', Event count (approx.): 30716565122
+  20,24%      ksoftirqd/0  [kernel.kallsyms]           [k] inet_getpeer
+   8,19%      ksoftirqd/0  [kernel.kallsyms]           [k] peer_avl_rebalance.isra.1
+   4,81%      ksoftirqd/0  [kernel.kallsyms]           [k] sha_transform
+   3,64%      ksoftirqd/0  [kernel.kallsyms]           [k] fib_table_lookup
+   2,36%      ksoftirqd/0  [ixgbe]                     [k] ixgbe_poll
+   2,16%      ksoftirqd/0  [kernel.kallsyms]           [k] __ip_route_output_key
+   2,11%      ksoftirqd/0  [kernel.kallsyms]           [k] kernel_map_pages
+   2,11%      ksoftirqd/0  [kernel.kallsyms]           [k] ip_route_input_common
+   2,01%      ksoftirqd/0  [kernel.kallsyms]           [k] __inet_lookup_established
+   1,83%      ksoftirqd/0  [kernel.kallsyms]           [k] md5_transform
+   1,75%      ksoftirqd/0  [kernel.kallsyms]           [k] check_leaf.isra.9
+   1,49%      ksoftirqd/0  [kernel.kallsyms]           [k] ipt_do_table
+   1,46%      ksoftirqd/0  [kernel.kallsyms]           [k] hrtimer_interrupt
+   1,45%      ksoftirqd/0  [kernel.kallsyms]           [k] kmem_cache_alloc
+   1,29%      ksoftirqd/0  [kernel.kallsyms]           [k] inet_csk_search_req
+   1,29%      ksoftirqd/0  [kernel.kallsyms]           [k] __netif_receive_skb
+   1,16%      ksoftirqd/0  [kernel.kallsyms]           [k] copy_user_generic_string
+   1,15%      ksoftirqd/0  [kernel.kallsyms]           [k] kmem_cache_free
+   1,02%      ksoftirqd/0  [kernel.kallsyms]           [k] tcp_make_synack
+   0,93%      ksoftirqd/0  [kernel.kallsyms]           [k] _raw_spin_lock_bh
+   0,87%      ksoftirqd/0  [kernel.kallsyms]           [k] __call_rcu
+   0,84%      ksoftirqd/0  [kernel.kallsyms]           [k] rt_garbage_collect
+   0,84%      ksoftirqd/0  [kernel.kallsyms]           [k] fib_rules_lookup

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Hans Schillstrom <hans.schillstrom@ericsson.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-01 14:22:11 -04:00
..
netfilter net: Convert net_ratelimit uses to net_<level>_ratelimited 2012-05-15 13:45:03 -04:00
af_inet.c sock: Introduce named constants for sk_reuse 2012-04-21 15:52:25 -04:00
ah4.c net: ipv4 and ipv6: Convert printk(KERN_DEBUG to pr_debug 2012-05-16 01:01:03 -04:00
arp.c Merge branch 'delete-tokenring' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux 2012-05-16 01:02:40 -04:00
cipso_ipv4.c
datagram.c
devinet.c net: ipv4 and ipv6: Convert printk(KERN_DEBUG to pr_debug 2012-05-16 01:01:03 -04:00
esp4.c xfrm: take net hdr len into account for esp payload size calculation 2012-05-27 01:08:29 -04:00
fib_frontend.c net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
fib_lookup.h
fib_rules.c ipv4: Stop using NLA_PUT*(). 2012-04-02 04:33:43 -04:00
fib_semantics.c ipv4: fix the rcu race between free_fib_info and ip_route_output_slow 2012-05-24 00:28:21 -04:00
fib_trie.c ipv4: Do not use dead fib_info entries. 2012-05-10 22:16:32 -04:00
gre.c
icmp.c net: Convert net_ratelimit uses to net_<level>_ratelimited 2012-05-15 13:45:03 -04:00
igmp.c ipv4: fix checkpatch errors 2012-04-15 12:37:19 -04:00
inet_connection_sock.c tcp: do not create inetpeer on SYNACK message 2012-06-01 14:22:11 -04:00
inet_diag.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2012-05-07 23:35:40 -04:00
inet_fragment.c
inet_hashtables.c ipv4: fix checkpatch errors 2012-04-15 12:37:19 -04:00
inet_lro.c
inet_timewait_sock.c net: ipv4 and ipv6: Convert printk(KERN_DEBUG to pr_debug 2012-05-16 01:01:03 -04:00
inetpeer.c
ip_forward.c ipv4: fix checkpatch errors 2012-04-15 12:37:19 -04:00
ip_fragment.c ipv4: use skb coalescing in defragmentation 2012-05-19 18:34:57 -04:00
ip_gre.c ipv4: fix checkpatch errors 2012-04-15 12:37:19 -04:00
ip_input.c net: Convert net_ratelimit uses to net_<level>_ratelimited 2012-05-15 13:45:03 -04:00
ip_options.c net: Convert net_ratelimit uses to net_<level>_ratelimited 2012-05-15 13:45:03 -04:00
ip_output.c net: Convert net_ratelimit uses to net_<level>_ratelimited 2012-05-15 13:45:03 -04:00
ip_sockglue.c net: IP_MULTICAST_IF setsockopt now recognizes struct mreq 2012-05-07 23:03:22 -04:00
ipcomp.c
ipconfig.c net/ipv4/ipconfig: neaten __setup placement 2012-05-20 04:06:16 -04:00
ipip.c ipv4: fix checkpatch errors 2012-04-15 12:37:19 -04:00
ipmr.c net: Convert net_ratelimit uses to net_<level>_ratelimited 2012-05-15 13:45:03 -04:00
Kconfig net: delete all instances of special processing for token ring 2012-05-15 20:14:35 -04:00
Makefile
netfilter.c net: Delete all remaining instances of ctl_path 2012-04-20 21:22:30 -04:00
ping.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2012-05-23 17:42:39 -07:00
proc.c
protocol.c
raw.c ipv4: fix checkpatch errors 2012-04-15 12:37:19 -04:00
route.c mm: add a low limit to alloc_large_system_hash 2012-05-24 00:28:21 -04:00
syncookies.c
sysctl_net_ipv4.c tcp: early retransmit 2012-05-02 20:56:10 -04:00
tcp_bic.c
tcp_cong.c tcp: bool conversions 2012-05-17 14:59:59 -04:00
tcp_cubic.c
tcp_diag.c
tcp_highspeed.c
tcp_htcp.c
tcp_hybla.c tcp: bool conversions 2012-05-17 14:59:59 -04:00
tcp_illinois.c
tcp_input.c tcp: take care of overlaps in tcp_try_coalesce() 2012-05-24 00:28:21 -04:00
tcp_ipv4.c tcp: bool conversions 2012-05-17 14:59:59 -04:00
tcp_lp.c
tcp_memcontrol.c memcg: decrement static keys at real destroy time 2012-05-29 16:22:28 -07:00
tcp_minisocks.c tcp: bool conversions 2012-05-17 14:59:59 -04:00
tcp_output.c tcp: bool conversions 2012-05-17 14:59:59 -04:00
tcp_probe.c net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
tcp_scalable.c
tcp_timer.c tcp: early retransmit: delayed fast retransmit 2012-05-02 20:56:10 -04:00
tcp_vegas.c
tcp_vegas.h
tcp_veno.c
tcp_westwood.c
tcp_yeah.c
tcp.c mm: add a low limit to alloc_large_system_hash 2012-05-24 00:28:21 -04:00
tunnel4.c
udp_diag.c udp_diag: implement idiag_get_info for udp/udplite to get queue information 2012-04-25 20:43:01 -04:00
udp_impl.h ipv4: fix checkpatch errors 2012-04-15 12:37:19 -04:00
udp.c mm: add a low limit to alloc_large_system_hash 2012-05-24 00:28:21 -04:00
udplite.c
xfrm4_input.c
xfrm4_mode_beet.c
xfrm4_mode_transport.c
xfrm4_mode_tunnel.c
xfrm4_output.c
xfrm4_policy.c net: Convert all sysctl registrations to register_net_sysctl 2012-04-20 21:22:30 -04:00
xfrm4_state.c
xfrm4_tunnel.c