Commit Graph

663500 Commits

Author SHA1 Message Date
David Miller
89087c456f bpf: Fix values type used in test_maps
Maps of per-cpu type have their value element size adjusted to 8 if it
is specified smaller during various map operations.

This makes test_maps as a 32-bit binary fail, in fact the kernel
writes past the end of the value's array on the user's stack.

To be quite honest, I think the kernel should reject creation of a
per-cpu map that doesn't have a value size of at least 8 if that's
what the kernel is going to silently adjust to later.

If the user passed something smaller, it is a sizeof() calcualtion
based upon the type they will actually use (just like in this testcase
code) in later calls to the map operations.

Fixes: df570f5772 ("samples/bpf: unit test for BPF_MAP_TYPE_PERCPU_ARRAY")
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
2017-04-21 15:16:46 -04:00
David Ahern
557c44be91 net: ipv6: RTF_PCPU should not be settable from userspace
Andrey reported a fault in the IPv6 route code:

kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] SMP KASAN
Modules linked in:
CPU: 1 PID: 4035 Comm: a.out Not tainted 4.11.0-rc7+ #250
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
task: ffff880069809600 task.stack: ffff880062dc8000
RIP: 0010:ip6_rt_cache_alloc+0xa6/0x560 net/ipv6/route.c:975
RSP: 0018:ffff880062dced30 EFLAGS: 00010206
RAX: dffffc0000000000 RBX: ffff8800670561c0 RCX: 0000000000000006
RDX: 0000000000000003 RSI: ffff880062dcfb28 RDI: 0000000000000018
RBP: ffff880062dced68 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: ffff880062dcfb28 R14: dffffc0000000000 R15: 0000000000000000
FS:  00007feebe37e7c0(0000) GS:ffff88006cb00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000205a0fe4 CR3: 000000006b5c9000 CR4: 00000000000006e0
Call Trace:
 ip6_pol_route+0x1512/0x1f20 net/ipv6/route.c:1128
 ip6_pol_route_output+0x4c/0x60 net/ipv6/route.c:1212
...

Andrey's syzkaller program passes rtmsg.rtmsg_flags with the RTF_PCPU bit
set. Flags passed to the kernel are blindly copied to the allocated
rt6_info by ip6_route_info_create making a newly inserted route appear
as though it is a per-cpu route. ip6_rt_cache_alloc sees the flag set
and expects rt->dst.from to be set - which it is not since it is not
really a per-cpu copy. The subsequent call to __ip6_dst_alloc then
generates the fault.

Fix by checking for the flag and failing with EINVAL.

Fixes: d52d3997f8 ("ipv6: Create percpu rt6_info")
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-21 13:55:33 -04:00
Ilan Tayari
43170c4e0b gso: Validate assumption of frag_list segementation
Commit 07b26c9454 ("gso: Support partial splitting at the frag_list
pointer") assumes that all SKBs in a frag_list (except maybe the last
one) contain the same amount of GSO payload.

This assumption is not always correct, resulting in the following
warning message in the log:
    skb_segment: too many frags

For example, mlx5 driver in Striding RQ mode creates some RX SKBs with
one frag, and some with 2 frags.
After GRO, the frag_list SKBs end up having different amounts of payload.
If this frag_list SKB is then forwarded, the aforementioned assumption
is violated.

Validate the assumption, and fall back to software GSO if it not true.

Change-Id: Ia03983f4a47b6534dd987d7a2aad96d54d46d212
Fixes: 07b26c9454 ("gso: Support partial splitting at the frag_list pointer")
Signed-off-by: Ilan Tayari <ilant@mellanox.com>
Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-21 13:30:29 -04:00
David S. Miller
918b70244f Merge branch 'skb_cow_head'
Eric Dumazet says:

====================
net: use skb_cow_head() to deal with cloned skbs

James Hughes found an issue with smsc95xx driver. Same problematic code
is found in other drivers.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-21 13:24:07 -04:00
Eric Dumazet
39fba7835a kaweth: use skb_cow_head() to deal with cloned skbs
We can use skb_cow_head() to properly deal with clones,
especially the ones coming from TCP stack that allow their head being
modified. This avoids a copy.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: James Hughes <james.hughes@raspberrypi.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-21 13:24:06 -04:00
Eric Dumazet
6bc6895bdd ch9200: use skb_cow_head() to deal with cloned skbs
We need to ensure there is enough headroom to push extra header,
but we also need to check if we are allowed to change headers.

skb_cow_head() is the proper helper to deal with this.

Fixes: 4a476bd6d1 ("usbnet: New driver for QinHeng CH9200 devices")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: James Hughes <james.hughes@raspberrypi.org>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-21 13:24:06 -04:00
Eric Dumazet
d4ca735919 lan78xx: use skb_cow_head() to deal with cloned skbs
We need to ensure there is enough headroom to push extra header,
but we also need to check if we are allowed to change headers.

skb_cow_head() is the proper helper to deal with this.

Fixes: 55d7de9de6 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: James Hughes <james.hughes@raspberrypi.org>
Cc: Woojung Huh <woojung.huh@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-21 13:24:05 -04:00
Eric Dumazet
d532c1082f sr9700: use skb_cow_head() to deal with cloned skbs
We need to ensure there is enough headroom to push extra header,
but we also need to check if we are allowed to change headers.

skb_cow_head() is the proper helper to deal with this.

Fixes: c9b37458e9 ("USB2NET : SR9700 : One chip USB 1.1 USB2NET SR9700Device Driver Support")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: James Hughes <james.hughes@raspberrypi.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-21 13:24:05 -04:00
Eric Dumazet
a9e840a208 cx82310_eth: use skb_cow_head() to deal with cloned skbs
We need to ensure there is enough headroom to push extra header,
but we also need to check if we are allowed to change headers.

skb_cow_head() is the proper helper to deal with this.

Fixes: cc28a20e77 ("introduce cx82310_eth: Conexant CX82310-based ADSL router USB ethernet driver")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: James Hughes <james.hughes@raspberrypi.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-21 13:24:05 -04:00
Eric Dumazet
b7c6d26758 smsc75xx: use skb_cow_head() to deal with cloned skbs
We need to ensure there is enough headroom to push extra header,
but we also need to check if we are allowed to change headers.

skb_cow_head() is the proper helper to deal with this.

Fixes: d0cad87170 ("smsc75xx: SMSC LAN75xx USB gigabit ethernet adapter driver")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: James Hughes <james.hughes@raspberrypi.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-21 13:24:04 -04:00
David Lebrun
95b9b88d2d ipv6: sr: fix double free of skb after handling invalid SRH
The icmpv6_param_prob() function already does a kfree_skb(),
this patch removes the duplicate one.

Fixes: 1ababeba4a ("ipv6: implement dataplane support for rthdr type 4 (Segment Routing Header)")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-21 13:16:01 -04:00
David S. Miller
b0522e13b2 MAINTAINERS: Add "B:" field for networking.
We want people to report bugs to the netdev list.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-21 10:44:47 -04:00
Wolfgang Bumiller
e0535ce58b net sched actions: allocate act cookie early
Policing filters do not use the TCA_ACT_* enum and the tb[]
nlattr array in tcf_action_init_1() doesn't get filled for
them so we should not try to look for a TCA_ACT_COOKIE
attribute in the then uninitialized array.
The error handling in cookie allocation then calls
tcf_hash_release() leading to invalid memory access later
on.
Additionally, if cookie allocation fails after an already
existing non-policing filter has successfully been changed,
tcf_action_release() should not be called, also we would
have to roll back the changes in the error handling, so
instead we now allocate the cookie early and assign it on
success at the end.

CVE-2017-7979
Fixes: 1045ba77a5 ("net sched actions: Add support for user cookies")
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-20 16:32:07 -04:00
David S. Miller
43af390573 Merge branch 'qed-dcbx-fixes'
Sudarsana Reddy Kalluru says:

====================
qed: Dcbx bug fixes

The series has set of bug fixes for dcbx implementation of qed driver.
Please consider applying this to 'net' branch.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-20 16:29:41 -04:00
sudarsana.kalluru@cavium.com
c0c5dbe711 qed: Fix issue in populating the PFC config paramters.
Change ieee_setpfc() callback implementation to populate traffic class
count with the user provided value.

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-20 16:29:40 -04:00
sudarsana.kalluru@cavium.com
62289ba275 qed: Fix possible system hang in the dcbnl-getdcbx() path.
qed_dcbnl_get_dcbx() API uses kmalloc in GFT_KERNEL mode. The API gets
invoked in the interrupt context by qed_dcbnl_getdcbx callback. Need
to invoke this kmalloc in atomic mode.

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-20 16:29:40 -04:00
sudarsana.kalluru@cavium.com
6cf75f1ceb qed: Fix sending an invalid PFC error mask to MFW.
PFC error-mask value is not supported by MFW, but this bit could be
set in the pfc bit-map of the operational parameters if remote device
supports it. These operational parameters are used as basis for
populating the dcbx config parameters. User provided configs will be
applied on top of these parameters and then send them to MFW when
requested. Driver need to clear the error-mask bit before sending the
config parameters to MFW.

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-20 16:29:39 -04:00
sudarsana.kalluru@cavium.com
66367dab30 qed: Fix possible error in populating max_tc field.
Some adapters may not publish the max_tc value. Populate the default
value for max_tc field in case the mfw didn't provide one.

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-20 16:29:39 -04:00
James Hughes
e9156cd26a smsc95xx: Use skb_cow_head to deal with cloned skbs
The driver was failing to check that the SKB wasn't cloned
before adding checksum data.
Replace existing handling to extend/copy the header buffer
with skb_cow_head.

Signed-off-by: James Hughes <james.hughes@raspberrypi.org>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Woojung Huh <Woojung.Huh@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-20 16:27:15 -04:00
Sekhar Nori
74d209b835 MAINTAINERS: update entry for TI's CPSW driver
Mugunthan V N, who was reviewing TI's CPSW driver patches is
not working for TI anymore and wont be reviewing patches for
that driver.

Drop Mugunthan as the maintiainer for this driver.

Grygorii continues to be a reviewer. Dave Miller applies the
patches directly and adding a maintainer is actually
misleading since get_maintainer.pl script stops suggesting
that Dave Miller be copied.

Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-20 16:21:31 -04:00
David S. Miller
12a0a6c054 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:

====================
pull request (net): ipsec 2017-04-19

Two fixes for af_key:

1) Add a lock to key dump to prevent a NULL pointer dereference.
   From Yuejie Shi.

2) Fix slab-out-of-bounds in parse_ipsecrequests.
   From Herbert Xu.

Please pull or let me know if there are problems.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-20 16:19:46 -04:00
Dan Carpenter
9d386cd9a7 dp83640: don't recieve time stamps twice
This patch is prompted by a static checker warning about a potential
use after free.  The concern is that netif_rx_ni() can free "skb" and we
call it twice.

When I look at the commit that added this, it looks like some stray
lines were added accidentally.  It doesn't make sense to me that we
would recieve the same data two times.  I asked the author but never
recieved a response.

I can't test this code, but I'm pretty sure my patch is correct.

Fixes: 4b063258ab ("dp83640: Delay scheduled work.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Stefan Sørensen <stefan.sorensen@spectralink.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-20 16:00:48 -04:00
David Lebrun
2f3bb64247 ipv6: sr: fix out-of-bounds access in SRH validation
This patch fixes an out-of-bounds access in seg6_validate_srh() when the
trailing data is less than sizeof(struct sr6_tlv).

Reported-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: David Lebrun <david.lebrun@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-20 15:40:33 -04:00
Mike Maloney
c1f8d0f98c selftests/net: Fixes psock_fanout CBPF test case
'psock_fanout' has been failing since commit 4d7b9dc1f3 ("tools:
psock_lib: harden socket filter used by psock tests").  That commit
changed the CBPF filter to examine the full ethernet frame, and was
tested on 'psock_tpacket' which uses SOCK_RAW.  But 'psock_fanout' was
also using this same CBPF in two places, for filtering and fanout, on a
SOCK_DGRAM socket.

Change 'psock_fanout' to use SOCK_RAW so that the CBPF program used with
SO_ATTACH_FILTER can examine the entire frame.  Create a new CBPF
program for use with PACKET_FANOUT_DATA which ignores the header, as it
cannot see the ethernet header.

Tested: Ran tools/testing/selftests/net/psock_{fanout,tpacket} 10 times,
and they all passed.

Fixes: 4d7b9dc1f3 ("tools: psock_lib: harden socket filter used by psock tests")
Signed-off-by: 'Mike Maloney <maloneykernel@gmail.com>'
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-20 15:39:19 -04:00
Johannes Berg
3018e947d7 mac80211: reject ToDS broadcast data frames
AP/AP_VLAN modes don't accept any real 802.11 multicast data
frames, but since they do need to accept broadcast management
frames the same is currently permitted for data frames. This
opens a security problem because such frames would be decrypted
with the GTK, and could even contain unicast L3 frames.

Since the spec says that ToDS frames must always have the BSSID
as the RA (addr1), reject any other data frames.

The problem was originally reported in "Predicting, Decrypting,
and Abusing WPA2/802.11 Group Keys" at usenix
https://www.usenix.org/conference/usenixsecurity16/technical-sessions/presentation/vanhoef
and brought to my attention by Jouni.

Cc: stable@vger.kernel.org
Reported-by: Jouni Malinen <j@w1.fi>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
--
Dave, I didn't want to send you a new pull request for a single
commit yet again - can you apply this one patch as is?
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-20 15:37:46 -04:00
David S. Miller
6324805979 A single fix, for the MU-MIMO monitor mode, that fixes
bad SKB accesses if the SKB was paged, which is the case
 for the only driver supporting this - iwlwifi.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEExu3sM/nZ1eRSfR9Ha3t4Rpy0AB0FAlj1zD0ACgkQa3t4Rpy0
 AB0zsRAAgQ+rAwSjMfd4HtqiM80JLOhDzjkpw3wXvJvA3eG47DIEdPp3i2rV3FjB
 pfFmcn/iAVoNgGwxPoFXqsSevxZ+Kv5LPhvvYeqCyOB1H8xIRQw1pHN93jHc19JT
 BXa/UAuZLcLNvChTWBNF2jbWeiLiErkRTFqqdveGp6FJxmfFH0GhOi0eZPCWgXOH
 AK35A/59rngYRXc7lZTX47i2FE49IcxUSbp/0ZLtn74VgHyHElltiEhaUva9lUki
 YaOOLter25HzZgsB+LpxQYOeOS88oF+8vvYbT07Vn5UpjVu6Y7WpKfCEx4BqliP5
 H2VQflSXTSGH+wX5sQLmtXA5oidlNfeA3r9THsaPBXCfnqNUFTaszfs7zzAO+JNC
 BCtmGDiz1c6EYOjKdIpqglfsTehDHkCDnVtOdiUN4oPDqYACgXH/XdfzVwoZCQSg
 be5idiS6DPVFeQQ19DO+mK+Iwcn9PZ2H2NunLuAeaHd6mR1hdKN8wcSF/ipgtjpR
 yNyQbSFKIgtvTHiBq42IZZzLsASAI/qAgVXQFnrbR8uzYCJwVscnt7Cg0nSNIVV4
 c5McdytcJ8b1NY37hNrwS9e9XJ8oJmOjOOewxciRNLKJq2NjVkB7wh4oYiCL899z
 fQrCvXEcIqv4m0YtnT4On1ouEn1B3A0XFSUVuIBNdec1rJK7ovg=
 =r4GR
 -----END PGP SIGNATURE-----

Merge tag 'mac80211-for-davem-2017-04-18' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211

Johannes Berg says:

====================
A single fix, for the MU-MIMO monitor mode, that fixes
bad SKB accesses if the SKB was paged, which is the case
for the only driver supporting this - iwlwifi.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-20 13:36:42 -04:00
Sergei Shtylyov
1debdc8f9e sh_eth: unmap DMA buffers when freeing rings
The DMA API debugging (when enabled) causes:

WARNING: CPU: 0 PID: 1445 at lib/dma-debug.c:519 add_dma_entry+0xe0/0x12c
DMA-API: exceeded 7 overlapping mappings of cacheline 0x01b2974d

to be  printed after repeated initialization of the Ether device, e.g.
suspend/resume or 'ifconfig' up/down. This is because DMA buffers mapped
using dma_map_single() in sh_eth_ring_format() and sh_eth_start_xmit() are
never unmapped. Resolve this problem by unmapping the buffers when freeing
the descriptor  rings;  in order  to do it right, we'd have to add an extra
parameter to sh_eth_txfree() (we rename this function to sh_eth_tx_free(),
while at it).

Based on the commit a47b70ea86 ("ravb: unmap descriptors when freeing
rings").

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-18 22:04:32 -04:00
Linus Torvalds
005882e53d Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc
Pull sparc fixes from David Miller:
 "Two Sparc bug fixes from Daniel Jordan and Nitin Gupta"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc64: Fix hugepage page table free
  sparc64: Use LOCKDEP_SMALL, not PROVE_LOCKING_SMALL
2017-04-18 13:56:51 -07:00
Linus Torvalds
40d9018eb7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) BPF tail call handling bug fixes from Daniel Borkmann.

 2) Fix allowance of too many rx queues in sfc driver, from Bert
    Kenward.

 3) Non-loopback ipv6 packets claiming src of ::1 should be dropped,
    from Florian Westphal.

 4) Statistics requests on KSZ9031 can crash, fix from Grygorii
    Strashko.

 5) TX ring handling fixes in mediatek driver, from Sean Wang.

 6) ip_ra_control can deadlock, fix lock acquisition ordering to fix,
    from Cong WANG.

 7) Fix use after free in ip_recv_error(), from Willem de Buijn.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  bpf: fix checking xdp_adjust_head on tail calls
  bpf: fix cb access in socket filter programs on tail calls
  ipv6: drop non loopback packets claiming to originate from ::1
  net: ethernet: mediatek: fix inconsistency of port number carried in TXD
  net: ethernet: mediatek: fix inconsistency between TXD and the used buffer
  net: phy: micrel: fix crash when statistic requested for KSZ9031 phy
  net: vrf: Fix setting NLM_F_EXCL flag when adding l3mdev rule
  net: thunderx: Fix set_max_bgx_per_node for 81xx rgx
  net-timestamp: avoid use-after-free in ip_recv_error
  ipv4: fix a deadlock in ip_ra_control
  sfc: limit the number of receive queues
2017-04-18 13:24:42 -07:00
Nitin Gupta
544f8f9358 sparc64: Fix hugepage page table free
Make sure the start adderess is aligned to PMD_SIZE
boundary when freeing page table backing a hugepage
region. The issue was causing segfaults when a region
backed by 64K pages was unmapped since such a region
is in general not PMD_SIZE aligned.

Signed-off-by: Nitin Gupta <nitin.m.gupta@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-18 13:11:07 -07:00
Daniel Jordan
395102db44 sparc64: Use LOCKDEP_SMALL, not PROVE_LOCKING_SMALL
CONFIG_PROVE_LOCKING_SMALL shrinks the memory usage of lockdep so the
kernel text, data, and bss fit in the required 32MB limit, but this
option is not set for every config that enables lockdep.

A 4.10 kernel fails to boot with the console output

    Kernel: Using 8 locked TLB entries for main kernel image.
    hypervisor_tlb_lock[2000000:0:8000000071c007c3:1]: errors with f
    Program terminated

with these config options

    CONFIG_LOCKDEP=y
    CONFIG_LOCK_STAT=y
    CONFIG_PROVE_LOCKING=n

To fix, rename CONFIG_PROVE_LOCKING_SMALL to CONFIG_LOCKDEP_SMALL, and
enable this option with CONFIG_LOCKDEP=y so we get the reduced memory
usage every time lockdep is turned on.

Tested that CONFIG_LOCKDEP_SMALL is set to 'y' if and only if
CONFIG_LOCKDEP is set to 'y'.  When other lockdep-related config options
that select CONFIG_LOCKDEP are enabled (e.g. CONFIG_LOCK_STAT or
CONFIG_PROVE_LOCKING), verified that CONFIG_LOCKDEP_SMALL is also
enabled.

Fixes: e6b5f1be7a ("config: Adding the new config parameter CONFIG_PROVE_LOCKING_SMALL for sparc")
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Reviewed-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-18 13:11:07 -07:00
Linus Torvalds
fb5e2154b7 While testing my development branch, without the fix for the pid use
after free bug, the selftest that Namhyung added triggers it. I figured
 it would be good to add the test for the bug after the fix, such that
 it does not exist without the fix.
 
 I added another patch that lets the test only test part of the pid
 filtering, and ignores the function-fork (filtering on children as well)
 if the function-fork feature does not exist. This feature is added by
 Namhyung just before he added this test. But since the test tests both
 with and without the feature, it would be good to let it not fail if
 the feature does not exist.
 -----BEGIN PGP SIGNATURE-----
 
 iQExBAABCAAbBQJY9kVEFBxyb3N0ZWR0QGdvb2RtaXMub3JnAAoJEMm5BfJq2Y3L
 nZQIAMJN51sNAnJHodKieAx6NUdnFbih7XknFZePGsGX2CHaRpPJuYRTEMIJrtds
 FSGCKOWjmmZ57xB/WYsCdH2H4cqd2TCFIeCT+6Pglk4+L2Y97idg5tzJ0+QGnDqT
 zBMd1kcmLathH5OoNsUEO5FR0QplBTb+3kVRu9XaAUgJhIlLwbF58BdtOv0l0avb
 saV/cVLosUjb4TXxwPgRZnmH9YElQ7RElf0S60JKbFTHCzyvoG0U17seFAklZOQl
 Ux0nn+LFWM+M7e7LYR3nSXnOzofDMz9r1bGGo9bgkng0Csl2Op1MFttofcsi3PvT
 FUxUGPZSEjxj3XrxXrkzzK8pRuI=
 =NEh1
 -----END PGP SIGNATURE-----

Merge tag 'trace-v4.11-rc5-4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull ftrace testcase update from Steven Rostedt:
 "While testing my development branch, without the fix for the pid use
  after free bug, the selftest that Namhyung added triggers it. I
  figured it would be good to add the test for the bug after the fix,
  such that it does not exist without the fix.

  I added another patch that lets the test only test part of the pid
  filtering, and ignores the function-fork (filtering on children as
  well) if the function-fork feature does not exist. This feature is
  added by Namhyung just before he added this test. But since the test
  tests both with and without the feature, it would be good to let it
  not fail if the feature does not exist"

* tag 'trace-v4.11-rc5-4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  selftests: ftrace: Add check for function-fork before running pid filter test
  selftests: ftrace: Add a testcase for function PID filter
2017-04-18 10:19:47 -07:00
Steven Rostedt (VMware)
9ed19c7695 selftests: ftrace: Add check for function-fork before running pid filter test
Have the func-filter-pid test check for the function-fork option before
testing it. It can still test the pid filtering, but will stop before
testing the function-fork option for children inheriting the pids.
This allows the test to be added before the function-fork feature, but after
a bug fix that triggers one of the bugs the test can cause.

Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Shuah Khan <shuahkh@osg.samsung.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-04-18 12:46:11 -04:00
Linus Torvalds
0bad6d7e93 Namhyung Kim discovered a use after free bug. It has to do with adding
a pid filter to function tracing in an instance, and then freeing
 the instance.
 -----BEGIN PGP SIGNATURE-----
 
 iQExBAABCAAbBQJY9hO7FBxyb3N0ZWR0QGdvb2RtaXMub3JnAAoJEMm5BfJq2Y3L
 qBgIAJv+IH1zQTHqFn4gOtIkHJ0kxjTr9mzz4S5SgnHDMaCKOHTpuste02RmCvfo
 J+6F//bw3eM9CpEcQg/t41aFagXs+g3x1HmD0PN7Y1fKHXQ5xDdpjPpOsgprrx7q
 dvGLg4bolv6KaNMTJmJ8LhwPXJGMEqnbY6Ypz3qbnsziSeXe1zcrQKNA88ySJoh0
 V6QV9XPWNkPO4AknnqD88oZvJhz/H/fQuJYQZNBoTomD6SG3f7mYW1bxyoWc08yW
 W+Rg/YddGHk6Mmkqy0BaCPBjKjGiq20h9DOvLU6CFR0Gt4ZQ7sVZczYN4NkjEn7H
 qdFcqaHNSkjxs0JFvbWToIu4D8w=
 =Gv/C
 -----END PGP SIGNATURE-----

Merge tag 'trace-v4.11-rc5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull ftrace fix from Steven Rostedt:
 "Namhyung Kim discovered a use after free bug. It has to do with adding
  a pid filter to function tracing in an instance, and then freeing the
  instance"

* tag 'trace-v4.11-rc5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  ftrace: Fix function pid filter on instances
2017-04-18 09:31:51 -07:00
Linus Torvalds
5ee4c5a929 Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
 "This fixes the following problems:

   - regression in new XTS/LRW code when used with async crypto

   - long-standing bug in ahash API when used with certain algos

   - bogus memory dereference in async algif_aead with certain algos"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: algif_aead - Fix bogus request dereference in completion function
  crypto: ahash - Fix EINPROGRESS notification callback
  crypto: lrw - Fix use-after-free on EINPROGRESS
  crypto: xts - Fix use-after-free on EINPROGRESS
2017-04-18 09:03:50 -07:00
Namhyung Kim
093be89a12 selftests: ftrace: Add a testcase for function PID filter
Like event pid filtering test, add function pid filtering test with the
new "function-fork" option.  It also tests it on an instance directory
so that it can verify the bug related pid filtering on instances.

Link: http://lkml.kernel.org/r/20170417024430.21194-5-namhyung@kernel.org

Cc: Ingo Molnar <mingo@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Shuah Khan <shuahkh@osg.samsung.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-04-18 12:02:36 -04:00
Herbert Xu
096f41d3a8 af_key: Fix sadb_x_ipsecrequest parsing
The parsing of sadb_x_ipsecrequest is broken in a number of ways.
First of all we're not verifying sadb_x_ipsecrequest_len.  This
is needed when the structure carries addresses at the end.  Worse
we don't even look at the length when we parse those optional
addresses.

The migration code had similar parsing code that's better but
it also has some deficiencies.  The length is overcounted first
of all as it includes the header itself.  It also fails to check
the length before dereferencing the sa_family field.

This patch fixes those problems in parse_sockaddr_pair and then
uses it in parse_ipsecrequest.

Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2017-04-18 08:26:03 +02:00
Linus Torvalds
20bb78f6b3 Merge branch 'parisc-4.11-5' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc fix from Helge Deller:
 "One patch which fixes get_user() for 64-bit values on 32-bit kernels.

  Up to now we lost the upper 32-bits of the returned 64-bit value"

* 'parisc-4.11-5' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Fix get_user() for 64-bit value on 32-bit kernel
2017-04-17 15:06:34 -07:00
Namhyung Kim
d879d0b8c1 ftrace: Fix function pid filter on instances
When function tracer has a pid filter, it adds a probe to sched_switch
to track if current task can be ignored.  The probe checks the
ftrace_ignore_pid from current tr to filter tasks.  But it misses to
delete the probe when removing an instance so that it can cause a crash
due to the invalid tr pointer (use-after-free).

This is easily reproducible with the following:

  # cd /sys/kernel/debug/tracing
  # mkdir instances/buggy
  # echo $$ > instances/buggy/set_ftrace_pid
  # rmdir instances/buggy

  ============================================================================
  BUG: KASAN: use-after-free in ftrace_filter_pid_sched_switch_probe+0x3d/0x90
  Read of size 8 by task kworker/0:1/17
  CPU: 0 PID: 17 Comm: kworker/0:1 Tainted: G    B           4.11.0-rc3  #198
  Call Trace:
   dump_stack+0x68/0x9f
   kasan_object_err+0x21/0x70
   kasan_report.part.1+0x22b/0x500
   ? ftrace_filter_pid_sched_switch_probe+0x3d/0x90
   kasan_report+0x25/0x30
   __asan_load8+0x5e/0x70
   ftrace_filter_pid_sched_switch_probe+0x3d/0x90
   ? fpid_start+0x130/0x130
   __schedule+0x571/0xce0
   ...

To fix it, use ftrace_clear_pids() to unregister the probe.  As
instance_rmdir() already updated ftrace codes, it can just free the
filter safely.

Link: http://lkml.kernel.org/r/20170417024430.21194-2-namhyung@kernel.org

Fixes: 0c8916c342 ("tracing: Add rmdir to remove multibuffer instances")
Cc: Ingo Molnar <mingo@kernel.org>
Cc: stable@vger.kernel.org
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-04-17 16:44:23 -04:00
David S. Miller
acf167f3f2 Merge branch 'bpf-fixes'
Daniel Borkmann says:

====================
Two BPF fixes

The set fixes cb_access and xdp_adjust_head bits in struct bpf_prog,
that are used for requirement checks on the program rather than f.e.
heuristics. Thus, for tail calls, we cannot make any assumptions and
are forced to set them.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-17 15:51:58 -04:00
Daniel Borkmann
c2002f9837 bpf: fix checking xdp_adjust_head on tail calls
Commit 17bedab272 ("bpf: xdp: Allow head adjustment in XDP prog")
added the xdp_adjust_head bit to the BPF prog in order to tell drivers
that the program that is to be attached requires support for the XDP
bpf_xdp_adjust_head() helper such that drivers not supporting this
helper can reject the program. There are also drivers that do support
the helper, but need to check for xdp_adjust_head bit in order to move
packet metadata prepended by the firmware away for making headroom.

For these cases, the current check for xdp_adjust_head bit is insufficient
since there can be cases where the program itself does not use the
bpf_xdp_adjust_head() helper, but tail calls into another program that
uses bpf_xdp_adjust_head(). As such, the xdp_adjust_head bit is still
set to 0. Since the first program has no control over which program it
calls into, we need to assume that bpf_xdp_adjust_head() helper is used
upon tail calls. Thus, for the very same reasons in cb_access, set the
xdp_adjust_head bit to 1 when the main program uses tail calls.

Fixes: 17bedab272 ("bpf: xdp: Allow head adjustment in XDP prog")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Cc: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-17 15:51:57 -04:00
Daniel Borkmann
6b1bb01bcc bpf: fix cb access in socket filter programs on tail calls
Commit ff936a04e5 ("bpf: fix cb access in socket filter programs")
added a fix for socket filter programs such that in i) AF_PACKET the
20 bytes of skb->cb[] area gets zeroed before use in order to not leak
data, and ii) socket filter programs attached to TCP/UDP sockets need
to save/restore these 20 bytes since they are also used by protocol
layers at that time.

The problem is that bpf_prog_run_save_cb() and bpf_prog_run_clear_cb()
only look at the actual attached program to determine whether to zero
or save/restore the skb->cb[] parts. There can be cases where the
actual attached program does not access the skb->cb[], but the program
tail calls into another program which does access this area. In such
a case, the zero or save/restore is currently not performed.

Since the programs we tail call into are unknown at verification time
and can dynamically change, we need to assume that whenever the attached
program performs a tail call, that later programs could access the
skb->cb[], and therefore we need to always set cb_access to 1.

Fixes: ff936a04e5 ("bpf: fix cb access in socket filter programs")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-17 15:51:57 -04:00
Florian Westphal
0aa8c13eb5 ipv6: drop non loopback packets claiming to originate from ::1
We lack a saddr check for ::1. This causes security issues e.g. with acls
permitting connections from ::1 because of assumption that these originate
from local machine.

Assuming a source address of ::1 is local seems reasonable.
RFC4291 doesn't allow such a source address either, so drop such packets.

Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-17 15:09:23 -04:00
David S. Miller
71947f0f91 Merge branch 'mediatek-tx-bugs'
Sean Wang says:

====================
mediatek: Fix crash caused by reporting inconsistent skb->len to BQL

Changes since v1:
- fix inconsistent enumeration which easily causes the potential bug

The series fixes kernel BUG caused by inconsistent SKB length reported
into BQL. The reason for inconsistent length comes from hardware BUG which
results in different port number carried on the TXD within the lifecycle of
SKB. So patch 2) is proposed for use a software way to track which port
the SKB involving instead of hardware way. And patch 1) is given for another
issue I found which causes TXD and SKB inconsistency that is not expected
in the initial logic, so it is also being corrected it in the series.

The log for the kernel BUG caused by the issue is posted as below.

[  120.825955] kernel BUG at ... lib/dynamic_queue_limits.c:26!
[  120.837684] Internal error: Oops - BUG: 0 [#1] SMP ARM
[  120.842778] Modules linked in:
[  120.845811] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.11.0-rc1-191576-gdbcef47 #35
[  120.853488] Hardware name: Mediatek Cortex-A7 (Device Tree)
[  120.859012] task: c1007480 task.stack: c1000000
[  120.863510] PC is at dql_completed+0x108/0x17c
[  120.867915] LR is at 0x46
[  120.870512] pc : [<c03c19c8>]    lr : [<00000046>]    psr: 80000113
[  120.870512] sp : c1001d58  ip : c1001d80  fp : c1001d7c
[  120.881895] r10: 0000003e  r9 : df6b3400  r8 : 0ed86506
[  120.887075] r7 : 00000001  r6 : 00000001  r5 : 0ed8654c  r4 : df0135d8
[  120.893546] r3 : 00000001  r2 : df016800  r1 : 0000fece  r0 : df6b3480
[  120.900018] Flags: Nzcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[  120.907093] Control: 10c5387d  Table: 9e27806a  DAC: 00000051
[  120.912789] Process swapper/0 (pid: 0, stack limit = 0xc1000218)
[  120.918744] Stack: (0xc1001d58 to 0xc1002000)

....

121.085331] 1fc0: 00000000 c0a52a28 00000000 c10855d4 c1003c58 c0a52a24 c100885c 8000406a
[  121.093444] 1fe0: 410fc073 00000000 00000000 c1001ff8 8000807c c0a009cc 00000000 00000000
[  121.101575] [<c03c19c8>] (dql_completed) from [<c04cb010>] (mtk_napi_tx+0x1d0/0x37c)
[  121.109263] [<c04cb010>] (mtk_napi_tx) from [<c05e28cc>] (net_rx_action+0x24c/0x3b8)
[  121.116951] [<c05e28cc>] (net_rx_action) from [<c010152c>] (__do_softirq+0xe4/0x35c)
[  121.124638] [<c010152c>] (__do_softirq) from [<c012a624>] (irq_exit+0xe8/0x150)
[  121.131895] [<c012a624>] (irq_exit) from [<c017750c>] (__handle_domain_irq+0x70/0xc4)
[  121.139666] [<c017750c>] (__handle_domain_irq) from [<c0101404>] (gic_handle_irq+0x58/0x9c)
[  121.147953] [<c0101404>] (gic_handle_irq) from [<c010e18c>] (__irq_svc+0x6c/0x90)
[  121.155373] Exception stack(0xc1001ef8 to 0xc1001f40)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-17 13:33:59 -04:00
Sean Wang
134d21525f net: ethernet: mediatek: fix inconsistency of port number carried in TXD
Fix port inconsistency on TXD due to hardware BUG that would cause
different port number is carried on the same TXD between tx_map()
and tx_unmap() with the iperf test. It would cause confusing BQL
logic which leads to kernel panic when dual GMAC runs concurrently.

Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-17 13:33:58 -04:00
Sean Wang
81d2dd09ca net: ethernet: mediatek: fix inconsistency between TXD and the used buffer
Fix inconsistency between the TXD descriptor and the used buffer that
would cause unexpected logic at mtk_tx_unmap() during skb housekeeping.

Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-17 13:33:58 -04:00
Grygorii Strashko
bfe7244257 net: phy: micrel: fix crash when statistic requested for KSZ9031 phy
Now the command:
	ethtool --phy-statistics eth0
will cause system crash with meassage "Unable to handle kernel NULL pointer
dereference at virtual address 00000010" from:

 (kszphy_get_stats) from [<c069f1d8>] (ethtool_get_phy_stats+0xd8/0x210)
 (ethtool_get_phy_stats) from [<c06a0738>] (dev_ethtool+0x5b8/0x228c)
 (dev_ethtool) from [<c06b5484>] (dev_ioctl+0x3fc/0x964)
 (dev_ioctl) from [<c0679f7c>] (sock_ioctl+0x170/0x2c0)
 (sock_ioctl) from [<c02419d4>] (do_vfs_ioctl+0xa8/0x95c)
 (do_vfs_ioctl) from [<c02422c4>] (SyS_ioctl+0x3c/0x64)
 (SyS_ioctl) from [<c0107d60>] (ret_fast_syscall+0x0/0x44)

The reason: phy_driver structure for KSZ9031 phy has no .probe() callback
defined. As result, struct phy_device *phydev->priv pointer will not be
initializes (null).
This issue will affect also following phys:
 KSZ8795, KSZ886X, KSZ8873MLL, KSZ9031, KSZ9021, KSZ8061, KS8737

Fix it by:
- adding .probe() = kszphy_probe() callback to KSZ9031, KSZ9021
phys. The kszphy_probe() can be re-used as it doesn't do any phy specific
settings.
- removing statistic callbacks from other phys (KSZ8795, KSZ886X,
KSZ8873MLL, KSZ8061, KS8737) as they doesn't have corresponding
statistic counters.

Fixes: 2b2427d064 ("phy: micrel: Add ethtool statistics counters")
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-17 13:29:49 -04:00
David Ahern
426c87caa2 net: vrf: Fix setting NLM_F_EXCL flag when adding l3mdev rule
Only need 1 l3mdev FIB rule. Fix setting NLM_F_EXCL in the nlmsghdr.

Fixes: 1aa6c4f6b8 ("net: vrf: Add l3mdev rules on first device create")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-17 13:27:54 -04:00
George Cherian
b47a57a273 net: thunderx: Fix set_max_bgx_per_node for 81xx rgx
Add the PCI_SUBSYS_DEVID_81XX_RGX and use the same to set
the max bgx per node count.

This fixes the issue intoduced by following commit
78aacb6f6 net: thunderx: Fix invalid mac addresses for node1 interfaces
With this commit the max_bgx_per_node for 81xx is set as 2 instead of 3
because of which num_vfs is always calculated as zero.

Signed-off-by: George Cherian <george.cherian@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-17 13:03:48 -04:00
Willem de Bruijn
1862d6208d net-timestamp: avoid use-after-free in ip_recv_error
Syzkaller reported a use-after-free in ip_recv_error at line

    info->ipi_ifindex = skb->dev->ifindex;

This function is called on dequeue from the error queue, at which
point the device pointer may no longer be valid.

Save ifindex on enqueue in __skb_complete_tx_timestamp, when the
pointer is valid or NULL. Store it in temporary storage skb->cb.

It is safe to reference skb->dev here, as called from device drivers
or dev_queue_xmit. The exception is when called from tcp_ack_tstamp;
in that case it is NULL and ifindex is set to 0 (invalid).

Do not return a pktinfo cmsg if ifindex is 0. This maintains the
current behavior of not returning a cmsg if skb->dev was NULL.

On dequeue, the ipv4 path will cast from sock_exterr_skb to
in_pktinfo. Both have ifindex as their first element, so no explicit
conversion is needed. This is by design, introduced in commit
0b922b7a82 ("net: original ingress device index in PKTINFO"). For
ipv6 ip6_datagram_support_cmsg converts to in6_pktinfo.

Fixes: 829ae9d611 ("net-timestamp: allow reading recv cmsg on errqueue with origin tstamp")
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-17 12:59:22 -04:00