kernel_optimize_test/net
Jesper Dangaard Brouer fd38d4e675 bpf: Remove MTU check in __bpf_skb_max_len
commit 6306c1189e77a513bf02720450bb43bd4ba5d8ae upstream.

Multiple BPF-helpers that can manipulate/increase the size of the SKB uses
__bpf_skb_max_len() as the max-length. This function limit size against
the current net_device MTU (skb->dev->mtu).

When a BPF-prog grow the packet size, then it should not be limited to the
MTU. The MTU is a transmit limitation, and software receiving this packet
should be allowed to increase the size. Further more, current MTU check in
__bpf_skb_max_len uses the MTU from ingress/current net_device, which in
case of redirects uses the wrong net_device.

This patch keeps a sanity max limit of SKB_MAX_ALLOC (16KiB). The real limit
is elsewhere in the system. Jesper's testing[1] showed it was not possible
to exceed 8KiB when expanding the SKB size via BPF-helper. The limiting
factor is the define KMALLOC_MAX_CACHE_SIZE which is 8192 for
SLUB-allocator (CONFIG_SLUB) in-case PAGE_SIZE is 4096. This define is
in-effect due to this being called from softirq context see code
__gfp_pfmemalloc_flags() and __do_kmalloc_node(). Jakub's testing showed
that frames above 16KiB can cause NICs to reset (but not crash). Keep this
sanity limit at this level as memory layer can differ based on kernel
config.

[1] https://github.com/xdp-project/bpf-examples/tree/master/MTU-tests

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/161287788936.790810.2937823995775097177.stgit@firesoul
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-07 15:00:08 +02:00
..
6lowpan
9p net: 9p: advance iov on empty read 2021-04-07 15:00:08 +02:00
802
8021q net: vlan: avoid leaks on register_vlan_dev() failures 2021-01-17 14:16:55 +01:00
appletalk appletalk: Fix skb allocation size in loopback case 2021-04-07 15:00:08 +02:00
atm net: atm: fix update of position index in lec_seq_next 2020-10-31 12:26:30 -07:00
ax25
batman-adv batman-adv: Don't always reallocate the fragmentation skb head 2020-11-27 08:02:55 +01:00
bluetooth Bluetooth: Fix null pointer dereference in amp_read_loc_assoc_final_data 2021-03-07 12:34:10 +01:00
bpf bpf: Reject too big ctx_size_in for raw_tp test run 2021-01-27 11:55:07 +01:00
bpfilter Revert "bpfilter: Fix build error with CONFIG_BPFILTER_UMH" 2020-10-15 12:33:24 -07:00
bridge net: bridge: don't notify switchdev for local FDB addresses 2021-03-30 14:32:04 +02:00
caif
can net: introduce CAN specific pointer in the struct net_device 2021-04-07 15:00:07 +02:00
ceph libceph: clear con->out_msg on Policy::stateful_server faults 2020-10-12 15:29:27 +02:00
core bpf: Remove MTU check in __bpf_skb_max_len 2021-04-07 15:00:08 +02:00
dcb net: dcb: Accept RTM_GETDCB messages carrying set-like DCB commands 2021-01-23 16:04:01 +01:00
dccp ipv6: weaken the v4mapped source check 2021-03-30 14:32:01 +02:00
decnet
dns_resolver
dsa net: dsa: tag_mtk: fix 802.1ad VLAN egress 2021-03-17 17:06:22 +01:00
ethernet
ethtool ethtool: fix the check logic of at least one channel for RX/TX 2021-03-17 17:06:16 +01:00
hsr net: hsr: add support for EntryForgetTime 2021-03-07 12:34:07 +01:00
ieee802154 genetlink: move to smaller ops wherever possible 2020-10-02 19:11:11 -07:00
ife
ipv4 Revert "netfilter: x_tables: Update remaining dereference to RCU" 2021-03-30 14:32:06 +02:00
ipv6 Revert "netfilter: x_tables: Update remaining dereference to RCU" 2021-03-30 14:32:06 +02:00
iucv net/af_iucv: remove WARN_ONCE on malformed RX packets 2021-03-07 12:34:05 +01:00
kcm
key af_key: relax availability checks for skb size calculation 2021-02-13 13:55:02 +01:00
l2tp net: l2tp: reduce log level of messages in receive path, add counter instead 2021-03-17 17:06:11 +01:00
l3mdev
lapb net: lapb: Copy the skb before sending a packet 2021-02-10 09:29:14 +01:00
llc
mac80211 mac80211: fix double free in ibss_leave 2021-03-30 14:32:08 +02:00
mac802154
mpls net: avoid infinite loop in mpls_gso_segment when mpls_hlen == 0 2021-03-17 17:06:11 +01:00
mptcp ipv6: weaken the v4mapped source check 2021-03-30 14:32:01 +02:00
ncsi net/ncsi: Use real net-device for response handler 2021-01-12 20:18:10 +01:00
netfilter netfilter: x_tables: Use correct memory barriers. 2021-03-30 14:32:06 +02:00
netlabel cipso,calipso: resolve a number of problems with the DOI refcounts 2021-03-17 17:06:15 +01:00
netlink netlink: export policy in extended ACK 2020-10-09 20:22:32 -07:00
netrom
nfc tty: convert tty_ldisc_ops 'read()' function to take a kernel pointer 2021-03-04 11:37:36 +01:00
nsh
openvswitch net: openvswitch: fix TTL decrement exception action execution 2021-02-23 15:53:23 +01:00
packet net: fix proc_fs init handling in af_packet and tls 2021-02-23 15:53:23 +01:00
phonet
psample net: psample: Fix netlink skb length with tunnel info 2021-03-07 12:34:07 +01:00
qrtr net: qrtr: fix a kernel-infoleak in qrtr_recvmsg() 2021-03-30 14:31:58 +02:00
rds RDMA: Lift ibdev_to_node from rds to common code 2021-02-26 10:12:59 +01:00
rfkill rfkill: Fix use-after-free in rfkill_resume() 2020-11-12 09:18:06 +01:00
rose rose: Fix Null pointer dereference in rose_send_frame() 2020-11-20 10:04:58 -08:00
rxrpc rxrpc: Fix clearance of Tx/Rx ring when releasing a call 2021-02-17 11:02:28 +01:00
sched net/sched: cls_flower: fix only mask bit check in the validate_ct_state 2021-03-30 14:32:01 +02:00
sctp net: fix iteration for sctp transport seq_files 2021-02-17 11:02:29 +01:00
smc net/smc: fix direct access to ib_gid_addr->ndev in smc_ib_determine_gid() 2020-11-19 10:59:19 -08:00
strparser
sunrpc rpc: fix NULL dereference on kmalloc failure 2021-04-07 15:00:04 +02:00
switchdev net: switchdev: don't set port_obj_info->handled true when -EOPNOTSUPP 2021-02-07 15:37:12 +01:00
tipc tipc: better validate user input in tipc_nl_retrieve_key() 2021-03-30 14:31:59 +02:00
tls net: fix proc_fs init handling in af_packet and tls 2021-02-23 15:53:23 +01:00
unix networking changes for the 5.10 merge window 2020-10-15 18:42:13 -07:00
vmw_vsock selinux: vsock: Set SID for socket returned by accept() 2021-03-30 14:32:03 +02:00
wimax genetlink: move to smaller ops wherever possible 2020-10-02 19:11:11 -07:00
wireless wext: fix NULL-ptr-dereference with cfg80211's lack of commit() 2021-02-03 23:28:38 +01:00
x25 net/x25: prevent a couple of overflows 2020-12-02 17:26:36 -08:00
xdp xsk: Clear pool even for inactive queues 2021-01-27 11:55:10 +01:00
xfrm xfrm: Fix wraparound in xfrm_policy_addr_delta() 2021-02-03 23:28:45 +01:00
compat.c iov_iter: transparently handle compat iovecs in import_iovec 2020-10-03 00:02:13 -04:00
devres.c
Kconfig drop_monitor: Convert to using devlink tracepoint 2020-09-30 18:01:26 -07:00
Makefile
socket.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-10-05 18:40:01 -07:00
sysctl_net.c