kernel_optimize_test/net/sched
Eric Dumazet 46b1c18f9d net: sched: put back q.qlen into a single location
In the series fc8b81a598 ("Merge branch 'lockless-qdisc-series'")
John made the assumption that the data path had no need to read
the qdisc qlen (number of packets in the qdisc).

It is true when pfifo_fast is used as the root qdisc, or as direct MQ/MQPRIO
children.

But pfifo_fast can be used as leaf in class full qdiscs, and existing
logic needs to access the child qlen in an efficient way.

HTB breaks badly, since it uses cl->leaf.q->q.qlen in :
  htb_activate() -> WARN_ON()
  htb_dequeue_tree() to decide if a class can be htb_deactivated
  when it has no more packets.

HFSC, DRR, CBQ, QFQ have similar issues, and some calls to
qdisc_tree_reduce_backlog() also read q.qlen directly.

Using qdisc_qlen_sum() (which iterates over all possible cpus)
in the data path is a non starter.

It seems we have to put back qlen in a central location,
at least for stable kernels.

For all qdisc but pfifo_fast, qlen is guarded by the qdisc lock,
so the existing q.qlen{++|--} are correct.

For 'lockless' qdisc (pfifo_fast so far), we need to use atomic_{inc|dec}()
because the spinlock might be not held (for example from
pfifo_fast_enqueue() and pfifo_fast_dequeue())

This patch adds atomic_qlen (in the same location than qlen)
and renames the following helpers, since we want to express
they can be used without qdisc lock, and that qlen is no longer percpu.

- qdisc_qstats_cpu_qlen_dec -> qdisc_qstats_atomic_qlen_dec()
- qdisc_qstats_cpu_qlen_inc -> qdisc_qstats_atomic_qlen_inc()

Later (net-next) we might revert this patch by tracking all these
qlen uses and replace them by a more efficient method (not having
to access a precise qlen, but an empty/non_empty status that might
be less expensive to maintain/track).

Another possibility is to have a legacy pfifo_fast version that would
be used when used a a child qdisc, since the parent qdisc needs
a spinlock anyway. But then, future lockless qdiscs would also
have the same problem.

Fixes: 7e66016f2c ("net: sched: helpers to sum qlen and qlen for per cpu logic")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-03-02 14:10:18 -08:00
..
act_api.c net/sched: Remove egdev mechanism 2018-12-10 15:54:34 -08:00
act_bpf.c
act_connmark.c
act_csum.c
act_gact.c net/sched: act_gact: disallow 'goto chain' on fallback control action 2018-10-22 19:40:55 -07:00
act_ife.c
act_ipt.c net/sched: act_ipt: fix refcount leak when replace fails 2019-02-24 17:28:55 -08:00
act_meta_mark.c
act_meta_skbprio.c
act_meta_skbtcindex.c
act_mirred.c act_mirred: clear skb->tstamp on redirect 2018-11-11 10:21:31 -08:00
act_nat.c
act_pedit.c net/sched: act_pedit: fix memory leak when IDR allocation fails 2018-11-16 19:53:45 -08:00
act_police.c net/sched: act_police: fix memory leak in case of invalid control action 2018-11-30 17:14:06 -08:00
act_sample.c
act_simple.c
act_skbedit.c net/sched: act_skbedit: fix refcount leak when replace fails 2019-02-24 17:31:43 -08:00
act_skbmod.c
act_tunnel_key.c net: sched: act_tunnel_key: fix NULL pointer dereference during init 2019-02-25 10:13:38 -08:00
act_vlan.c net/core: use __vlan_hwaccel helpers 2018-11-08 20:45:04 -08:00
cls_api.c net_sched: refetch skb protocol for each filter 2019-01-16 13:25:11 -08:00
cls_basic.c
cls_bpf.c net_sched: fold tcf_block_cb_call() into tc_setup_cb_call() 2018-12-14 15:32:19 -08:00
cls_cgroup.c
cls_flow.c
cls_flower.c net: cls_flower: Remove filter from mask before freeing it 2019-02-04 09:19:14 -08:00
cls_fw.c
cls_matchall.c net_sched: fold tcf_block_cb_call() into tc_setup_cb_call() 2018-12-14 15:32:19 -08:00
cls_route.c
cls_rsvp.c
cls_rsvp.h
cls_rsvp6.c
cls_tcindex.c net_sched: fix two more memory leaks in cls_tcindex 2019-02-12 14:10:56 -05:00
cls_u32.c net_sched: fold tcf_block_cb_call() into tc_setup_cb_call() 2018-12-14 15:32:19 -08:00
em_canid.c
em_cmp.c
em_ipset.c
em_ipt.c
em_meta.c
em_nbyte.c
em_text.c
em_u32.c
ematch.c
Kconfig tc: Add support for configuring the taprio scheduler 2018-10-04 13:52:23 -07:00
Makefile tc: Add support for configuring the taprio scheduler 2018-10-04 13:52:23 -07:00
sch_api.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2018-12-27 13:04:52 -08:00
sch_atm.c
sch_blackhole.c
sch_cake.c sch_cake: Correctly update parent qlen when splitting GSO packets 2019-01-15 20:12:01 -08:00
sch_cbq.c
sch_cbs.c sched: Avoid dereferencing skb pointer after child enqueue 2019-01-15 20:12:00 -08:00
sch_choke.c
sch_codel.c
sch_drr.c sched: Fix detection of empty queues in child qdiscs 2019-01-15 20:12:00 -08:00
sch_dsmark.c sched: Avoid dereferencing skb pointer after child enqueue 2019-01-15 20:12:00 -08:00
sch_etf.c etf: Drop all expired packets 2018-11-16 20:39:34 -08:00
sch_fifo.c
sch_fq_codel.c
sch_fq.c net_sched: sch_fq: avoid calling ktime_get_ns() if not needed 2018-11-20 09:51:32 -08:00
sch_generic.c net: sched: put back q.qlen into a single location 2019-03-02 14:10:18 -08:00
sch_gred.c net: sched: gred: support reporting stats from offloads 2018-11-19 18:53:46 -08:00
sch_hfsc.c sched: Fix detection of empty queues in child qdiscs 2019-01-15 20:12:00 -08:00
sch_hhf.c
sch_htb.c sched: Avoid dereferencing skb pointer after child enqueue 2019-01-15 20:12:00 -08:00
sch_ingress.c
sch_mq.c net: sched: mq: offload a graft notification 2018-11-14 08:51:28 -08:00
sch_mqprio.c
sch_multiq.c
sch_netem.c net: netem: fix skb length BUG_ON in __skb_to_sgvec 2019-02-28 10:31:31 -08:00
sch_pie.c net: sched: pie: fix coding style issues 2018-10-07 20:39:01 -07:00
sch_plug.c
sch_prio.c sched: Avoid dereferencing skb pointer after child enqueue 2019-01-15 20:12:00 -08:00
sch_qfq.c sched: Fix detection of empty queues in child qdiscs 2019-01-15 20:12:00 -08:00
sch_red.c net: sched: red: notify drivers about RED's limit parameter 2018-11-14 08:51:28 -08:00
sch_sfb.c
sch_sfq.c
sch_skbprio.c
sch_taprio.c tc: Add support for configuring the taprio scheduler 2018-10-04 13:52:23 -07:00
sch_tbf.c sched: Avoid dereferencing skb pointer after child enqueue 2019-01-15 20:12:00 -08:00
sch_teql.c