kernel_optimize_test/net/ipv4
Soheil Hassas Yeganeh d7722e8570 tcp: track application-limited rate samples
This commit adds code to track whether the delivery rate represented
by each rate_sample was limited by the application.

Upon each transmit, we store in the is_app_limited field in the skb a
boolean bit indicating whether there is a known "bubble in the pipe":
a point in the rate sample interval where the sender was
application-limited, and did not transmit even though the cwnd and
pacing rate allowed it.

This logic marks the flow app-limited on a write if *all* of the
following are true:

  1) There is less than 1 MSS of unsent data in the write queue
     available to transmit.

  2) There is no packet in the sender's queues (e.g. in fq or the NIC
     tx queue).

  3) The connection is not limited by cwnd.

  4) There are no lost packets to retransmit.

The tcp_rate_check_app_limited() code in tcp_rate.c determines whether
the connection is application-limited at the moment. If the flow is
application-limited, it sets the tp->app_limited field. If the flow is
application-limited then that means there is effectively a "bubble" of
silence in the pipe now, and this silence will be reflected in a lower
bandwidth sample for any rate samples from now until we get an ACK
indicating this bubble has exited the pipe: specifically, until we get
an ACK for the next packet we transmit.

When we send every skb we record in scb->tx.is_app_limited whether the
resulting rate sample will be application-limited.

The code in tcp_rate_gen() checks to see when it is safe to mark all
known application-limited bubbles of silence as having exited the
pipe. It does this by checking to see when the delivered count moves
past the tp->app_limited marker. At this point it zeroes the
tp->app_limited marker, as all known bubbles are out of the pipe.

We make room for the tx.is_app_limited bit in the skb by borrowing a
bit from the in_flight field used by NV to record the number of bytes
in flight. The receive window in the TCP header is 16 bits, and the
max receive window scaling shift factor is 14 (RFC 1323). So the max
receive window offered by the TCP protocol is 2^(16+14) = 2^30. So we
only need 30 bits for the tx.in_flight used by NV.

Signed-off-by: Van Jacobson <vanj@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Nandita Dukkipati <nanditad@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-09-21 00:23:00 -04:00
..
netfilter Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-09-12 15:52:44 -07:00
af_inet.c gso: Support partial splitting at the frag_list pointer 2016-09-19 20:59:34 -04:00
ah4.c
arp.c
cipso_ipv4.c Merge branch 'stable-4.8' of git://git.infradead.org/users/pcmoore/selinux into next 2016-07-07 10:15:34 +10:00
datagram.c
devinet.c netconf: add a notif when settings are created 2016-09-01 15:18:08 -07:00
esp4.c
fib_frontend.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-09-12 15:52:44 -07:00
fib_lookup.h
fib_rules.c net: flow: Add l3mdev flow update 2016-09-10 23:12:51 -07:00
fib_semantics.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-09-12 15:52:44 -07:00
fib_trie.c ipv4: fix value of ->nlmsg_flags reported in RTM_NEWROUTE events 2016-09-09 16:50:23 -07:00
fou.c fou: make nla_policy const 2016-09-01 14:09:00 -07:00
gre_demux.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-06-30 05:03:36 -04:00
gre_offload.c gso: Support partial splitting at the frag_list pointer 2016-09-19 20:59:34 -04:00
icmp.c
igmp.c net/multicast: should not send source list records when have filter mode change 2016-08-08 16:04:39 -07:00
inet_connection_sock.c timers, net/ipv4/inet: Initialize connection request timers as pinned 2016-07-07 10:35:06 +02:00
inet_diag.c net: inet: diag: expose the socket mark to privileged processes. 2016-09-08 16:13:09 -07:00
inet_fragment.c
inet_hashtables.c
inet_timewait_sock.c timers, net/ipv4/inet: Initialize connection request timers as pinned 2016-07-07 10:35:06 +02:00
inetpeer.c
ip_forward.c net/ipv4: Introduce IPSKB_FRAG_SEGS bit to inet_skb_parm.flags 2016-07-19 16:40:22 -07:00
ip_fragment.c
ip_gre.c net/ip_tunnels: Introduce tunnel_id_to_key32() and key32_to_tunnel_id() 2016-09-10 20:53:55 -07:00
ip_input.c
ip_options.c
ip_output.c net: l3mdev: remove redundant calls 2016-09-10 23:12:52 -07:00
ip_sockglue.c ipv4: accept u8 in IP_TOS ancillary data 2016-09-08 17:45:57 -07:00
ip_tunnel_core.c ip_tunnel: do not clear l4 hashes 2016-09-09 19:33:11 -07:00
ip_tunnel.c ip_tunnel: add collect_md mode to IPIP tunnel 2016-09-17 10:13:07 -04:00
ip_vti.c vti: flush x-netns xfrm cache when vti interface is removed 2016-08-09 12:57:49 -07:00
ipcomp.c
ipconfig.c net: ipconfig: Fix NULL pointer dereference on RARP/BOOTP/DHCP timeout 2016-08-22 21:04:41 -07:00
ipip.c ip_tunnel: add collect_md mode to IPIP tunnel 2016-09-17 10:13:07 -04:00
ipmr.c net: ipmr/ip6mr: update lastuse on entry change 2016-07-26 15:18:31 -07:00
Kconfig
Makefile tcp: track data delivery rate for a TCP connection 2016-09-21 00:23:00 -04:00
netfilter.c
ping.c
proc.c tcp: md5: add LINUX_MIB_TCPMD5FAILURE counter 2016-08-25 16:43:11 -07:00
protocol.c
raw.c net: ipv4: Remove l3mdev_get_saddr 2016-09-10 23:12:53 -07:00
route.c net: l3mdev: remove redundant calls 2016-09-10 23:12:52 -07:00
syncookies.c
sysctl_net_ipv4.c
tcp_bic.c
tcp_cdg.c tcp: cdg: rename struct minmax in tcp_cdg.c to avoid a naming conflict 2016-09-21 00:22:59 -04:00
tcp_cong.c
tcp_cubic.c
tcp_dctcp.c
tcp_diag.c net: diag: Fix refcnt leak in error path destroying socket 2016-08-23 23:11:36 -07:00
tcp_fastopen.c tcp: fastopen: avoid negative sk_forward_alloc 2016-09-08 16:08:10 -07:00
tcp_highspeed.c
tcp_htcp.c
tcp_hybla.c
tcp_illinois.c
tcp_input.c tcp: track data delivery rate for a TCP connection 2016-09-21 00:23:00 -04:00
tcp_ipv4.c tcp: use an RB tree for ooo receive queue 2016-09-08 17:25:58 -07:00
tcp_lp.c
tcp_metrics.c tcp: make nla_policy const 2016-09-01 14:09:01 -07:00
tcp_minisocks.c tcp: track application-limited rate samples 2016-09-21 00:23:00 -04:00
tcp_nv.c
tcp_offload.c gso: Support partial splitting at the frag_list pointer 2016-09-19 20:59:34 -04:00
tcp_output.c tcp: track data delivery rate for a TCP connection 2016-09-21 00:23:00 -04:00
tcp_probe.c
tcp_rate.c tcp: track application-limited rate samples 2016-09-21 00:23:00 -04:00
tcp_recovery.c
tcp_scalable.c
tcp_timer.c tcp_timer.c: Add kernel-doc function descriptions 2016-07-15 23:18:14 -07:00
tcp_vegas.c
tcp_vegas.h
tcp_veno.c
tcp_westwood.c
tcp_yeah.c tcp: cwnd does not increase in TCP YeAH 2016-09-08 17:16:12 -07:00
tcp.c tcp: track application-limited rate samples 2016-09-21 00:23:00 -04:00
tunnel4.c tunnels: correct conditional build of MPLS and IPv6 2016-07-11 13:27:06 -07:00
udp_diag.c net: inet: diag: expose the socket mark to privileged processes. 2016-09-08 16:13:09 -07:00
udp_impl.h
udp_offload.c gso: Support partial splitting at the frag_list pointer 2016-09-19 20:59:34 -04:00
udp_tunnel.c
udp.c net: ipv4: Remove l3mdev_get_saddr 2016-09-10 23:12:53 -07:00
udplite.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-08-30 00:54:02 -04:00
xfrm4_input.c
xfrm4_mode_beet.c
xfrm4_mode_transport.c
xfrm4_mode_tunnel.c
xfrm4_output.c
xfrm4_policy.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-09-12 15:52:44 -07:00
xfrm4_protocol.c
xfrm4_state.c
xfrm4_tunnel.c