tmp_suning_uos_patched/include
Vincent Ray 503a3fd646 net: sched: fixed barrier to prevent skbuff sticking in qdisc backlog
[ Upstream commit a54ce3703613e41fe1d98060b62ec09a3984dc28 ]

In qdisc_run_begin(), smp_mb__before_atomic() used before test_bit()
does not provide any ordering guarantee as test_bit() is not an atomic
operation. This, added to the fact that the spin_trylock() call at
the beginning of qdisc_run_begin() does not guarantee acquire
semantics if it does not grab the lock, makes it possible for the
following statement :

if (test_bit(__QDISC_STATE_MISSED, &qdisc->state))

to be executed before an enqueue operation called before
qdisc_run_begin().

As a result the following race can happen :

           CPU 1                             CPU 2

      qdisc_run_begin()               qdisc_run_begin() /* true */
        set(MISSED)                            .
      /* returns false */                      .
          .                            /* sees MISSED = 1 */
          .                            /* so qdisc not empty */
          .                            __qdisc_run()
          .                                    .
          .                              pfifo_fast_dequeue()
 ----> /* may be done here */                  .
|         .                                clear(MISSED)
|         .                                    .
|         .                                smp_mb __after_atomic();
|         .                                    .
|         .                                /* recheck the queue */
|         .                                /* nothing => exit   */
|   enqueue(skb1)
|         .
|   qdisc_run_begin()
|         .
|     spin_trylock() /* fail */
|         .
|     smp_mb__before_atomic() /* not enough */
|         .
 ---- if (test_bit(MISSED))
        return false;   /* exit */

In the above scenario, CPU 1 and CPU 2 both try to grab the
qdisc->seqlock at the same time. Only CPU 2 succeeds and enters the
bypass code path, where it emits its skb then calls __qdisc_run().

CPU1 fails, sets MISSED and goes down the traditionnal enqueue() +
dequeue() code path. But when executing qdisc_run_begin() for the
second time, after enqueuing its skbuff, it sees the MISSED bit still
set (by itself) and consequently chooses to exit early without setting
it again nor trying to grab the spinlock again.

Meanwhile CPU2 has seen MISSED = 1, cleared it, checked the queue
and found it empty, so it returned.

At the end of the sequence, we end up with skb1 enqueued in the
backlog, both CPUs out of __dev_xmit_skb(), the MISSED bit not set,
and no __netif_schedule() called made. skb1 will now linger in the
qdisc until somebody later performs a full __qdisc_run(). Associated
to the bypass capacity of the qdisc, and the ability of the TCP layer
to avoid resending packets which it knows are still in the qdisc, this
can lead to serious traffic "holes" in a TCP connection.

We fix this by replacing the smp_mb__before_atomic() / test_bit() /
set_bit() / smp_mb__after_atomic() sequence inside qdisc_run_begin()
by a single test_and_set_bit() call, which is more concise and
enforces the needed memory barriers.

Fixes: 89837eb4b246 ("net: sched: add barrier to ensure correct ordering for lockless qdisc")
Signed-off-by: Vincent Ray <vray@kalrayinc.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20220526001746.2437669-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-06-14 18:32:35 +02:00
..
acpi ACPICA: actypes.h: Expand the ACPI_ACCESS_ definitions 2022-01-27 10:54:18 +01:00
asm-generic tlb: hugetlb: Add more sizes to tlb_remove_huge_tlb_entry 2022-04-20 09:23:22 +02:00
clocksource
crypto crypto: drbg - make reseeding from get_random_bytes() synchronous 2022-06-06 08:42:42 +02:00
drm drm: fix EDID struct for old ARM OABI format 2022-06-09 10:20:59 +02:00
dt-bindings
keys
kunit
kvm
linux nodemask.h: fix compilation error with GCC12 2022-06-09 10:21:27 +02:00
math-emu
media media: subdev: disallow ioctl for saa6588/davinci 2021-07-19 09:45:02 +02:00
memory memory: renesas-rpc-if: Fix HF/OSPI data transfer in Manual Mode 2022-05-09 09:05:02 +02:00
misc
net net: sched: fixed barrier to prevent skbuff sticking in qdisc backlog 2022-06-14 18:32:35 +02:00
pcmcia
ras
rdma RDMA/netlink: Add __maybe_unused to static inline in C file 2021-11-26 10:39:21 +01:00
scsi scsi: fcoe: Fix Wstringop-overflow warnings in fcoe_wwn_from_mac() 2022-06-09 10:21:15 +02:00
soc firmware: raspberrypi: Keep count of all consumers 2021-09-15 09:50:41 +02:00
sound ALSA: jack: Access input_dev under mutex 2022-06-09 10:20:51 +02:00
target scsi: target: Fix ordered tag handling 2021-11-26 10:39:11 +01:00
trace rxrpc: Fix decision on when to generate an IDLE ACK 2022-06-09 10:21:12 +02:00
uapi include/uapi/linux/xfrm.h: Fix XFRM_MSG_MAPPING ABI breakage 2022-05-25 09:18:02 +02:00
vdso
video
xen xen/gnttab: fix gnttab_end_foreign_access() without page specified 2022-03-11 12:11:54 +01:00