kernel_optimize_test/net/sched
Vimalkumar 56b765b79e htb: improved accuracy at high rates
HTB (and TBF) currently use a rate table computed by the
"tc" userspace program, which has the following issue:

The rate table has 256 entries that map packet lengths
to tokens (time units).  With TSO-sized packets, the
256-entry granularity leads to loss/gain of rate,
making the token bucket inaccurate.

Thus, instead of relying on the rate table, this patch
explicitly computes packet transmission times and
accounts for them with nanosecond granularity.

This greatly improves the accuracy of HTB across a wide
range of packet sizes.
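
The difference between the two approaches can be sketched in a
few lines of Python. This is a simplified model, not the kernel
or tc code: the function names are hypothetical, the table is
assumed to round each length up to the end of its cell, and the
real implementation has further details (such as tick units) that
are not modeled here. It only shows how a 256-entry table sized
for a 64k mtu quantizes packet lengths, compared with a direct
nanosecond computation.

```python
NSEC_PER_SEC = 1_000_000_000

def build_rate_table(rate_bps, mtu, entries=256):
    """Model a fixed-size table: slot i holds the time for (i + 1) << cell_log bytes."""
    cell_log = 0
    # Grow the cell size until every length up to mtu fits in the table.
    while (mtu >> cell_log) > entries - 1:
        cell_log += 1
    table = [((i + 1) << cell_log) * 8 * NSEC_PER_SEC // rate_bps
             for i in range(entries)]
    return cell_log, table

def table_xmit_ns(length, cell_log, table):
    """Look up the transmission time; every length in a cell gets the same answer."""
    slot = min((length - 1) >> cell_log, len(table) - 1)
    return table[slot]

def exact_xmit_ns(length, rate_bps):
    """Compute the transmission time directly, in nanoseconds."""
    return length * 8 * NSEC_PER_SEC // rate_bps

if __name__ == "__main__":
    rate = 5_000_000_000          # 5Gbit, as in the example class
    cell_log, table = build_rate_table(rate, mtu=64 * 1024)
    print("cell size in bytes:", 1 << cell_log)
    for length in (66, 1514, 9014, 65226):
        t_tab = table_xmit_ns(length, cell_log, table)
        t_ns = exact_xmit_ns(length, rate)
        print(length, "bytes: table", t_tab, "ns, exact", t_ns, "ns")
```

In this model the table always charges for a whole cell, so small
packets are charged far more than their real transmission time.
The direction and size of the error in a real system depend on
details left out of the sketch.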

Example:

tc qdisc add dev $dev root handle 1: \
        htb default 1

tc class add dev $dev classid 1:1 parent 1: \
        htb rate 5Gbit mtu 64k

Here is an example of inaccuracy:

$ iperf -c host -t 10 -i 1

With old htb:
eth4:   34.76 Mb/s In  5827.98 Mb/s Out -  65836.0 p/s In  481273.0 p/s Out
[SUM]  9.0-10.0 sec   669 MBytes  5.61 Gbits/sec
[SUM]  0.0-10.0 sec  6.50 GBytes  5.58 Gbits/sec

With new htb:
eth4:   28.36 Mb/s In  5208.06 Mb/s Out -  53704.0 p/s In  430076.0 p/s Out
[SUM]  9.0-10.0 sec   594 MBytes  4.98 Gbits/sec
[SUM]  0.0-10.0 sec  5.80 GBytes  4.98 Gbits/sec

The rate on the wire is still about 5200Mb/s with the new HTB
because the qdisc accounts for packet length using skb->len,
which is smaller than the total bytes on the wire if GSO is
used.  That is a matter for another patch, regardless of how
time is accounted.
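
The gap between skb->len and bytes on the wire can be estimated
with a short Python sketch. The numbers below are assumptions for
illustration and are not taken from the test setup: an MSS of
1448 bytes and 66 bytes of Ethernet, IPv4, and TCP headers per
segment, with preamble, FCS, and inter-frame gap ignored. The
function names are hypothetical.

```python
def gso_lengths(payload_bytes, mss=1448, header_bytes=66):
    """Return (skb_len, wire_bytes) for one GSO packet under the assumed sizes."""
    segments = -(-payload_bytes // mss)      # ceiling division
    skb_len = header_bytes + payload_bytes   # headers are counted once
    wire_bytes = segments * header_bytes + payload_bytes  # headers repeat per segment
    return skb_len, wire_bytes

def wire_rate_bps(shaped_rate_bps, payload_bytes, mss=1448, header_bytes=66):
    """Estimate the on-wire rate when the shaper charges skb_len per packet."""
    skb_len, wire_bytes = gso_lengths(payload_bytes, mss, header_bytes)
    return shaped_rate_bps * wire_bytes / skb_len

if __name__ == "__main__":
    payload = 44 * 1448                      # one large GSO packet of 44 segments
    skb_len, wire_bytes = gso_lengths(payload)
    print("skb_len:", skb_len, "wire bytes:", wire_bytes)
    print("estimated wire rate (bit/s):", wire_rate_bps(5_000_000_000, payload))
```

Under these assumptions the estimate lands a few percent above
the configured 5Gbit, which is the same direction as the
difference reported above.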

Many thanks to Eric Dumazet for review and feedback.

Signed-off-by: Vimalkumar <j.vimal@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-11-03 15:24:01 -04:00
act_api.c netlink: Rename pid to portid to avoid confusion 2012-09-10 15:30:41 -04:00
act_csum.c ipv6: correct the ipv6 option name - Pad0 to Pad1 2012-05-17 15:49:51 -04:00
act_gact.c net_sched: gact: Fix potential panic in tcf_gact(). 2012-08-03 16:47:24 -07:00
act_ipt.c net_sched: act: Delete estimator in error path. 2012-08-06 13:30:01 -07:00
act_mirred.c act_mirred: do not drop packets when fails to mirror it 2012-08-16 14:54:44 -07:00
act_nat.c pkt_sched: Stop using NLA_PUT*(). 2012-04-01 18:11:37 -04:00
act_pedit.c net_sched: act: Delete estimator in error path. 2012-08-06 13:30:01 -07:00
act_police.c pkt_sched: Stop using NLA_PUT*(). 2012-04-01 18:11:37 -04:00
act_simple.c net_sched: act: Delete estimator in error path. 2012-08-06 13:30:01 -07:00
act_skbedit.c pkt_sched: Stop using NLA_PUT*(). 2012-04-01 18:11:37 -04:00
cls_api.c netlink: Rename pid to portid to avoid confusion 2012-09-10 15:30:41 -04:00
cls_basic.c net sched: Pass the skb into change so it can access NETLINK_CB 2012-08-14 21:55:28 -07:00
cls_cgroup.c cgroup: net_cls: Rework update socket logic 2012-10-26 03:40:51 -04:00
cls_flow.c userns: Convert cls_flow to work with user namespaces enabled 2012-08-14 21:55:28 -07:00
cls_fw.c net sched: Pass the skb into change so it can access NETLINK_CB 2012-08-14 21:55:28 -07:00
cls_route.c net sched: Pass the skb into change so it can access NETLINK_CB 2012-08-14 21:55:28 -07:00
cls_rsvp6.c
cls_rsvp.c
cls_rsvp.h net sched: Pass the skb into change so it can access NETLINK_CB 2012-08-14 21:55:28 -07:00
cls_tcindex.c net sched: Pass the skb into change so it can access NETLINK_CB 2012-08-14 21:55:28 -07:00
cls_u32.c net sched: Pass the skb into change so it can access NETLINK_CB 2012-08-14 21:55:28 -07:00
em_canid.c net: em_canid: Ematch rule to match CAN frames according to their identifiers 2012-07-04 13:07:05 +02:00
em_cmp.c
em_ipset.c net: sched: add ipset ematch 2012-07-12 07:54:46 -07:00
em_meta.c net: use a per task frag allocator 2012-09-24 16:31:37 -04:00
em_nbyte.c
em_text.c
em_u32.c
ematch.c net: Convert net_ratelimit uses to net_<level>_ratelimited 2012-05-15 13:45:03 -04:00
Kconfig net: sched: add ipset ematch 2012-07-12 07:54:46 -07:00
Makefile net: sched: add ipset ematch 2012-07-12 07:54:46 -07:00
sch_api.c pkt_sched: use ns_to_ktime() helper 2012-10-21 22:21:27 -04:00
sch_atm.c sch_atm.c: get rid of poinless extern 2012-06-01 10:37:18 -04:00
sch_blackhole.c
sch_cbq.c pkt_sched: use ns_to_ktime() helper 2012-10-21 22:21:27 -04:00
sch_choke.c net: sched: factorize code (qdisc_drop()) 2012-05-04 11:50:05 -04:00
sch_codel.c fq_codel: should use qdisc backlog as threshold 2012-05-16 15:30:26 -04:00
sch_drr.c pkt_sched: Fix warning false positives. 2012-09-27 18:35:47 -04:00
sch_dsmark.c net: sched: factorize code (qdisc_drop()) 2012-05-04 11:50:05 -04:00
sch_fifo.c pkt_sched: Stop using NLA_PUT*(). 2012-04-01 18:11:37 -04:00
sch_fq_codel.c fq_codel: dont reinit flow state 2012-09-03 14:36:50 -04:00
sch_generic.c net: qdisc busylock needs lockdep annotations 2012-09-05 17:49:27 -04:00
sch_gred.c net_sched: gred: actually perform idling in WRED mode 2012-09-13 16:10:13 -04:00
sch_hfsc.c net_sched: update bstats in dequeue() 2012-05-10 23:33:01 -04:00
sch_htb.c htb: improved accuracy at high rates 2012-11-03 15:24:01 -04:00
sch_ingress.c
sch_mq.c
sch_mqprio.c pkt_sched: Stop using NLA_PUT*(). 2012-04-01 18:11:37 -04:00
sch_multiq.c pkt_sched: Stop using NLA_PUT*(). 2012-04-01 18:11:37 -04:00
sch_netem.c netem: refine early skb orphaning 2012-07-16 23:08:33 -07:00
sch_plug.c net_sched: sch_plug: plug_qdisc_ops is static 2012-02-13 16:04:40 -05:00
sch_prio.c pkt_sched: Stop using NLA_PUT*(). 2012-04-01 18:11:37 -04:00
sch_qfq.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2012-09-28 14:40:49 -04:00
sch_red.c pkt_sched: Stop using NLA_PUT*(). 2012-04-01 18:11:37 -04:00
sch_sfb.c sch_sfb: Fix missing NULL check 2012-07-12 08:33:18 -07:00
sch_sfq.c pkt_sched: Stop using NLA_PUT*(). 2012-04-01 18:11:37 -04:00
sch_tbf.c pkt_sched: Stop using NLA_PUT*(). 2012-04-01 18:11:37 -04:00
sch_teql.c sch_teql: Convert over to dev_neigh_lookup_skb(). 2012-07-05 01:09:06 -07:00