When calling keyctl_read() on a key of type "trusted", if the
user-supplied buffer was too small, the kernel ignored the buffer length
and just wrote past the end of the buffer, potentially corrupting
userspace memory. Fix it by instead returning the size required, as per
the documentation for keyctl_read().
We also don't even fill the buffer at all in this case, as this is
slightly easier to implement than doing a short read, and either
behavior appears to be permitted. It also makes it match the behavior
of the "encrypted" key type.
Fixes: d00a1c72f7 ("keys: add new trusted key-type")
Reported-by: Ben Hutchings <ben@decadent.org.uk>
Cc: <stable@vger.kernel.org> # v2.6.38+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Reviewed-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Commit e645016abc ("KEYS: fix writing past end of user-supplied buffer
in keyring_read()") made keyring_read() stop corrupting userspace memory
when the user-supplied buffer is too small. However it also made the
return value in that case be the short buffer size rather than the size
required, yet keyctl_read() is actually documented to return the size
required. Therefore, switch it over to the documented behavior.
Note that for now we continue to have it fill the short buffer, since it
did that before (pre-v3.13) and dump_key_tree_aux() in keyutils arguably
relies on it.
Fixes: e645016abc ("KEYS: fix writing past end of user-supplied buffer in keyring_read()")
Reported-by: Ben Hutchings <ben@decadent.org.uk>
Cc: <stable@vger.kernel.org> # v3.13+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Pull signal bugfix from Eric Biederman:
"When making the generic support for SIGEMT conditional on the presence
of SIGEMT I made a typo that causes it to fail to activate. It was
noticed comparatively quickly but the bug report just made it to me
today"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
signal: Fix name of SIGEMT in #if defined() check
Commit cc731525f2 ("signal: Remove kernel interal si_code magic")
added a check for SIGMET and NSIGEMT being defined. That SIGMET should
in fact be SIGEMT, with SIGEMT being defined in
arch/{alpha,mips,sparc}/include/uapi/asm/signal.h
This was actually pointed out by BenHutchings in a lwn.net comment
here https://lwn.net/Comments/734608/
Fixes: cc731525f2 ("signal: Remove kernel interal si_code magic")
Signed-off-by: Andrew Clayton <andrew@digital-domain.net>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Pull block fixes from Jens Axboe:
"A few fixes that should go into this series:
- Regression fix for ide-cd, ensuring that a request is fully
initialized. From Hongxu.
- Ditto fix for virtio_blk, from Bart.
- NVMe fix from Keith, ensuring that we set the right block size on
revalidation. If the block size changed, we'd be in trouble without
it.
- NVMe rdma fix from Sagi, fixing a potential hang while the
controller is being removed"
* 'for-linus' of git://git.kernel.dk/linux-block:
ide:ide-cd: fix kernel panic resulting from missing scsi_req_init
nvme: Fix setting logical block format when revalidating
virtio_blk: Fix an SG_IO regression
nvme-rdma: fix possible hang when issuing commands during ctrl removal
Pull networking fixes from David Miller:
1) Fix refcounting in xfrm_bundle_lookup() when using a dummy bundle,
from Steffen Klassert.
2) Fix crypto header handling in rx data frames in ath10k driver, from
Vasanthakumar Thiagarajan.
3) Fix use after free of qdisc when we defer tcp_chain_flush() to a
workqueue. From Cong Wang.
4) Fix double free in lapbether driver, from Pan Bian.
5) Sanitize TUNSETSNDBUF values, from Craig Gallek.
6) Fix refcounting when addrconf_permanent_addr() calls
ipv6_del_addr(). From Eric Dumazet.
7) Fix MTU probing bug in TCP that goes back to 2007, from Eric
Dumazet.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
tcp: fix tcp_mtu_probe() vs highest_sack
ipv6: addrconf: increment ifp refcount before ipv6_del_addr()
tun/tap: sanitize TUNSETSNDBUF input
mlxsw: i2c: Fix buffer increment counter for write transaction
mlxsw: reg: Add high and low temperature thresholds
MAINTAINERS: Remove Yotam from mlxfw
MAINTAINERS: Update Yotam's E-mail
net: hns: set correct return value
net: lapbether: fix double free
bpf: remove SK_REDIRECT from UAPI
net: phy: marvell: Only configure RGMII delays when using RGMII
xfrm: Fix GSO for IPsec with GRE tunnel.
tc-testing: fix arg to ip command: -s -> -n
net_sched: remove tcf_block_put_deferred()
l2tp: hold tunnel in pppol2tp_connect()
Revert "ath10k: fix napi_poll budget overflow"
ath10k: rebuild crypto header in rx data frames
wcn36xx: Remove unnecessary rcu_read_unlock in wcn36xx_bss_info_changed
xfrm: Clear sk_dst_cache when applying per-socket policy.
xfrm: Fix xfrm_dst_cache memleak
Syzkaller with KASAN has reported a use-after-free of vma->vm_flags in
__do_page_fault() with the following reproducer:
mmap(&(0x7f0000000000/0xfff000)=nil, 0xfff000, 0x3, 0x32, 0xffffffffffffffff, 0x0)
mmap(&(0x7f0000011000/0x3000)=nil, 0x3000, 0x1, 0x32, 0xffffffffffffffff, 0x0)
r0 = userfaultfd(0x0)
ioctl$UFFDIO_API(r0, 0xc018aa3f, &(0x7f0000002000-0x18)={0xaa, 0x0, 0x0})
ioctl$UFFDIO_REGISTER(r0, 0xc020aa00, &(0x7f0000019000)={{&(0x7f0000012000/0x2000)=nil, 0x2000}, 0x1, 0x0})
r1 = gettid()
syz_open_dev$evdev(&(0x7f0000013000-0x12)="2f6465762f696e7075742f6576656e742300", 0x0, 0x0)
tkill(r1, 0x7)
The vma should be pinned by mmap_sem, but handle_userfault() might (in a
return to userspace scenario) release it and then acquire again, so when
we return to __do_page_fault() (with other result than VM_FAULT_RETRY),
the vma might be gone.
Specifically, per Andrea the scenario is
"A return to userland to repeat the page fault later with a
VM_FAULT_NOPAGE retval (potentially after handling any pending signal
during the return to userland). The return to userland is identified
whenever FAULT_FLAG_USER|FAULT_FLAG_KILLABLE are both set in
vmf->flags"
However, since commit a3c4fb7c9c ("x86/mm: Fix fault error path using
unsafe vma pointer") there is a vma_pkey() read of vma->vm_flags after
that point, which can thus become use-after-free. Fix this by moving
the read before calling handle_mm_fault().
Reported-by: syzbot <bot+6a5269ce759a7bb12754ed9622076dc93f65a1f6@syzkaller.appspotmail.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Suggested-by: Kirill A. Shutemov <kirill@shutemov.name>
Fixes: 3c4fb7c9c2e ("x86/mm: Fix fault error path using unsafe vma pointer")
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQGcBAABAgAGBQJZ+SpeAAoJEIosvXAHck9R3wwL/RoTCiE9T6LNRovIhOAJU8Z9
gd0j5ysO0ZdeKul66zDQLHJ2KdNQ2cxhXJH6y8+YdpgZ9hKxsl/6FNAncvwsjnWG
76KM+QxsG1yxIRNynO130SYD8huS2dLFF4HmIYYj9JxaIDLtHf/7hEPvjjmmvbX2
70+7O2moo8ljR3ER5qLZ6wqL17gXdIDk5D3k52BTJ0OPMANqztNL23dWu7Cpvflg
O7QPEXX97NZmYYSnQ4AJFTY08A2Ya8FPEDnRYTuJlBeM02KrKRxrMyFNwXESQuVs
lERNcxyRIkRJL5rINdep8X7Xzp9LR5XlJuUMB1s4IORC/LagNFO4weJyfcvV3cez
lbdqYjug07ag321zyTex5ho4q4mr8sPzmYfBm4pv717DLLS5VPA/nFHb00ajWqF6
HLZMOP+bLhgPiH8OMs22+y1nI+DcKu1BVA0gV/ZSrn69xJsAr32ut893B7/VSiMO
CIwhFJTTVQvFSdSaNn6AIk95m8T80NwuYv/UGHq2AA==
=WOnZ
-----END PGP SIGNATURE-----
Merge tag 'smb3-file-name-too-long-fix' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fix from Steve French:
"smb3 file name too long fix"
* tag 'smb3-file-name-too-long-fix' of git://git.samba.org/sfrench/cifs-2.6:
cifs: check MaxPathNameComponentLength != 0 before using it
Since we split the scsi_request out of struct request, while the
standard prep_rq_fn builds 10 byte cmds, it missed to invoke
scsi_req_init() to initialize certain fields of a scsi_request
structure (.__cmd[], .cmd, .cmd_len and .sense_len but no other
members of struct scsi_request).
An example panic on virtual machines (qemu/virtualbox) to boot
from IDE cdrom:
...
[ 8.754381] Call Trace:
[ 8.755419] blk_peek_request+0x182/0x2e0
[ 8.755863] blk_fetch_request+0x1c/0x40
[ 8.756148] ? ktime_get+0x40/0xa0
[ 8.756385] do_ide_request+0x37d/0x660
[ 8.756704] ? cfq_group_service_tree_add+0x98/0xc0
[ 8.757011] ? cfq_service_tree_add+0x1e5/0x2c0
[ 8.757313] ? ktime_get+0x40/0xa0
[ 8.757544] __blk_run_queue+0x3d/0x60
[ 8.757837] queue_unplugged+0x2f/0xc0
[ 8.758088] blk_flush_plug_list+0x1f4/0x240
[ 8.758362] blk_finish_plug+0x2c/0x40
...
[ 8.770906] RIP: ide_cdrom_prep_fn+0x63/0x180 RSP: ffff92aec018bae8
[ 8.772329] ---[ end trace 6408481e551a85c9 ]---
...
Fixes: 82ed4db499 ("block: split scsi_request out of struct request")
Signed-off-by: Hongxu Jia <hongxu.jia@windriver.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Based on SNMP values provided by Roman, Yuchung made the observation
that some crashes in tcp_sacktag_walk() might be caused by MTU probing.
Looking at tcp_mtu_probe(), I found that when a new skb was placed
in front of the write queue, we were not updating tcp highest sack.
If one skb is freed because all its content was copied to the new skb
(for MTU probing), then tp->highest_sack could point to a now freed skb.
Bad things would then happen, including infinite loops.
This patch renames tcp_highest_sack_combine() and uses it
from tcp_mtu_probe() to fix the bug.
Note that I also removed one test against tp->sacked_out,
since we want to replace tp->highest_sack regardless of whatever
condition, since keeping a stale pointer to freed skb is a recipe
for disaster.
Fixes: a47e5a988a ("[TCP]: Convert highest_sack to sk_buff to allow direct access")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Reported-by: Roman Gushchin <guro@fb.com>
Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It fixes a problem for the last chunk where 'chunk_size' is smaller than
MLXSW_I2C_BLK_MAX and data is copied to the wrong offset, overriding
previous data.
Fixes: 6882b0aee1 ("mlxsw: Introduce support for I2C bus")
Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Steffen Klassert says:
====================
pull request (net): ipsec 2017-11-01
1) Fix a memleak when a packet matches a policy
without a matching state.
2) Reset the socket cached dst_entry when inserting
a socket policy, otherwise the policy might be
ignored. From Jonathan Basseri.
3) Fix GSO for a IPsec, GRE tunnel combination.
We reset the encapsulation field at the skb
too erly, as a result GRE does not segment
GSO packets. Fix this by resetting the the
encapsulation field right before the
transformation where the inner headers get
invalid.
Please pull or let me know if there are problems.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The ASIC has the ability to generate events whenever a sensor indicates
the temperature goes above or below its high or low thresholds,
respectively.
In new firmware versions the firmware enforces a minimum of 5
degrees Celsius difference between both thresholds. Make the driver
conform to this requirement.
Note that this is required even when the events are disabled, as in
certain systems interrupts are generated via GPIO based on these
thresholds.
Fixes: 85926f8770 ("mlxsw: reg: Add definition of temperature management registers")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Provide a mailing list for maintenance of the module instead.
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For the time being I will be available in my private mail. Update both the
MAINTAINERS file and the individual modules MODULE_AUTHOR directive with
the new address.
Signed-off-by: Yotam Gigi <yotam.gi@gmail.com>
Signed-off-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The function of_parse_phandle() returns a NULL pointer if it cannot
resolve a phandle property to a device_node pointer. In function
hns_nic_dev_probe(), its return value is passed to PTR_ERR to extract
the error code. However, in this case, the extracted error code will
always be zero, which is unexpected.
Signed-off-by: Pan Bian <bianpan2016@163.com>
Reviewed-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
The function netdev_priv() returns the private data of the device. The
memory to store the private data is allocated in alloc_netdev() and is
released in netdev_free(). Calling kfree() on the return value of
netdev_priv() after netdev_free() results in a double free bug.
Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that SK_REDIRECT is no longer a valid return code. Remove it
from the UAPI completely. Then do a namespace remapping internal
to sockmap so SK_REDIRECT is no longer externally visible.
Patchs primary change is to do a namechange from SK_REDIRECT to
__SK_REDIRECT
Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The fix 5987feb38a ("net: phy: marvell: logical vs bitwise OR typo")
uncovered another bug in the Marvell PHY driver, which broke the
Marvell OpenRD platform. It relies on the bootloader configuring the
RGMII delays and does not specify a phy-mode in its device tree. The
PHY driver should only configure RGMII delays if the phy mode
indicates it is using RGMII. Without anything in device tree, the
mv643xx Ethernet driver defaults to GMII.
Fixes: 5987feb38a ("net: phy: marvell: logical vs bitwise OR typo")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The most important here is the security vulnerabitility fix for
ath10k.
ath10k
* fix security vulnerability with missing PN check on certain hardware
* revert ath10k napi fix as it caused regressions on QCA6174
wcn36xx
* remove unnecessary rcu_read_unlock() from error path
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJZ+JEXAAoJEG4XJFUm622bxlYH/RVFveiBV/5v3Edj7bfQaonm
DrSO9fWNi+B3d2S0jpP/E9aPAZFfI6lrJ2zYbmyHEtIgLmWUsCgpkPbsTiE7Qa/7
VesLt5cDQyE25F9C3N2cNgaSI2VrHVml13GZ1f9329us/pUyPhzT7SKVO5c3SIjf
PcnyobgRtT42tJWYGDJ83tP9EaGQnIcxVcjrH5Bfp7SiOLlWh05owDBQKyxni+8Q
NAF/CAqaeXOTygL8f/A4mmp/qAolPzSp3XpyFWNOP3UdfI+sgLNbc8BDo1COzHMw
of9SF0RZi8Vy65x9xZKwxnzkmUmp0F7GX/MpkO2vI9Zhca8XlXCVb30t6pDqSLc=
=iqP1
-----END PGP SIGNATURE-----
Merge tag 'wireless-drivers-for-davem-2017-10-31' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers
Kalle Valo says:
====================
wireless-drivers fixes for 4.14
The most important here is the security vulnerabitility fix for
ath10k.
ath10k
* fix security vulnerability with missing PN check on certain hardware
* revert ath10k napi fix as it caused regressions on QCA6174
wcn36xx
* remove unnecessary rcu_read_unlock() from error path
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts two recent PM QoS commits one of which
introduced multiple problems and the other one fixed
some, but not all of them (Rafael Wysocki).
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJZ+Ld8AAoJEILEb/54YlRxC3oQAICi5txylcV65XRPu3HYZf6t
KfOJ8qQijAAtv6P/kx5H8iFPfEf9HQj86pBa0yTPAg7yWoG4u32kssmDYd3n9zbp
OX5Y90NJTkmvVZY6zAzHZtg3vGN7y/UNozkl40ENLhDfBafYRsLkOC93IQ9LNzJC
ZcUmq+8JjwhSRVL4M0F4WemJ78DbP0wxeU3V+wQRg/e+jJhiJlhhcTAEKGOL+4Lb
etmIa1zovAF2c8YbfOa7drvy6xuA0hL86yzbWTF4DOHk1314YzAPaNFM9+dzP8dN
Gk4PfBTyZqOzUZHzq0UHmNKiSMnFG0aDVR7lG98FXCu6h/XZUwa9P5vjIAxyIerf
iDeQaLgfr9GMiqzp3rOWZUPTbmjAY97JhtLnvSpKGQtveJ27Fmvv9Pw3N5skwM5m
CExHyppAa/bv5WEyqYQZQwFkRdzjMf3cDHQ+e7cO/FfUyTSirakdSBeHi448KYph
Y/21ETsogGjZWDYpsDWL9AUx/A4Mk+COopckDQc5MXmIr/9gW/AYKrIE0E3XQ2+i
ngmlBzjnxK1Q0cfYiOoiihblX8Zg9ex01ylUyDAEOjvaVI2HdjTRwoNSqDs0SZZA
KOETAsepDmZPJblZ/lW7z1ju/fCe+ma5BXDl7nfJEc7yBIJ1FXiR+eSsp56yvp1B
0KW+euUOpQzd0qqPc2Nq
=KfFO
-----END PGP SIGNATURE-----
Merge tag 'pm-reverts-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management commit reverts from Rafael Wysocki:
"Since Geert reports additional problems with my PM QoS fix from the
last week that have not been addressed by the most recent fixup on top
of it, they both should better be reverted now and let's fix the
original issue properly in 4.15.
This reverts two recent PM QoS commits one of which introduced
multiple problems and the other one fixed some, but not all of them
(Rafael Wysocki)"
* tag 'pm-reverts-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
Revert "PM / QoS: Fix device resume latency PM QoS"
Revert "PM / QoS: Fix default runtime_pm device resume latency"
IB device index is nldev's handler and it should be checked always.
Fixes: c3f66f7b00 ("RDMA/netlink: Implement nldev port doit callback")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Doug Ledford <dledford@redhat.com>
[ Applying directly, since Doug fried his SSD's and is rebuilding - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This reverts commit 0cc2b4e5a0 (PM / QoS: Fix device resume latency PM
QoS) as it introduced regressions on multiple systems and the fix-up
in commit 2a9a86d5c8 (PM / QoS: Fix default runtime_pm device resume
latency) does not address all of them.
The original problem that commit 0cc2b4e5a0 was attempting to fix
will be addressed later.
Fixes: 0cc2b4e5a0 (PM / QoS: Fix device resume latency PM QoS)
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This reverts commit 2a9a86d5c8 (PM / QoS: Fix default runtime_pm
device resume latency) as the commit it depends on is going to be
reverted.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
ath.git fixes for 4.14. Major changes:
ath10k
* fix security vulnerability with missing PN check on certain hardware
* revert ath10k napi fix as it caused regressions on QCA6174
wcn36xx
* remove unnecessary rcu_read_unlock() from error path
We reset the encapsulation field of the skb too early
in xfrm_output. As a result, the GRE GSO handler does
not segment the packets. This leads to a performance
drop down. We fix this by resetting the encapsulation
field right before we do the transformation, when
the inner headers become invalid.
Fixes: f1bd7d659e ("xfrm: Add encapsulation header offsets while SKB is not encrypted")
Reported-by: Vicente De Luca <vdeluca@zendesk.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Fixes: 31c2611b66 ("selftests: Introduce a new test case to tc testsuite")
Fixes: 76b903ee19 ("selftests: Introduce tc testsuite")
Signed-off-by: Brenda J. Butler <bjb@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In commit 7aa0045dad ("net_sched: introduce a workqueue for RCU callbacks of tc filter")
I defer tcf_chain_flush() to a workqueue, this causes a use-after-free
because qdisc is already destroyed after we queue this work.
The tcf_block_put_deferred() is no longer necessary after we get RTNL
for each tc filter destroy work, no others could jump in at this point.
Same for tcf_chain_hold(), we are fully serialized now.
This also reduces one indirection therefore makes the code more
readable. Note this brings back a rcu_barrier(), however comparing
to the code prior to commit 7aa0045dad we still reduced one
rcu_barrier(). For net-next, we can consider to refcnt tcf block to
avoid it.
Fixes: 7aa0045dad ("net_sched: introduce a workqueue for RCU callbacks of tc filter")
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use l2tp_tunnel_get() in pppol2tp_connect() to ensure the tunnel isn't
going to disappear while processing the rest of the function.
Fixes: fd558d186d ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes new breakage introduced by the most recent PM QoS
fix in which, embarrassingly enough, I forgot to update
dev_pm_qos_raw_read_value() to return the right default
for devices with no PM QoS constraints at all which prevents
runtime PM from suspending those devices (fix from Tero Kristo).
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJZ97EwAAoJEILEb/54YlRxAw8P/Rtl4qgVBGncDCDlmVK7uyaa
nFGSSAdgTVo+WlyZ/ubs1zXj8ra8AhkKIOa3rwtFI21e/DejlqTxBbUHQMqSDPZC
JUfS6F4hkLW86BmAMeDh+tma0oG5abXvhzx1F26/1LRFsAAORW2MQ02GUzQfe04j
iThjk1J+CmfdZ5Lkc6fIWnI5bQRo7WxuYlT1GfthJVPyorTgCviQ/p5I7012HgF1
ZDwCbUWyEereybIMYaXkwXdyP2RltBKDlxRRWe/tC6O3m3VsDHWuO9+HoS6ZFMCH
sicgpovEHhZagJSWdNuEK2HAvE3XoDTtWU3xSRkZpxJSarKxPD/aXZzwGkyg1x6N
8/KmuwsBuNThQLY6ODDxBP9+1ThpbpgtFbx8JMPy5/Usdm/HlIJSHZyeGSc6BWoj
TI8GLEzqhF61aHWM1u5Nxh8v1DsGvAXEXLWKgQlgfInUPaw1PxYv5lCa5r4fhRzr
te3ePtbaMh2bNu0fbJyYgTcteaOz/1DPSc/keYFhhGj/iU39232IzryDZKTbz0Cn
Zb0OVhkpQcr+1e3HPf1aTjHVQcWds4QtTmDmaoushxmBxhdBYnWX8qiwwmSnvW8S
+kkax7UY4EE7FH+8+3cW7OfqQ7vGLqqG+l3pQptf5yeGMb/hwdWidC669Be6g9Bz
tZ0sDywMg2MGcbeRPx/j
=xbGZ
-----END PGP SIGNATURE-----
Merge tag 'pm-urgent-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fix from Rafael Wysocki:
"This fixes new breakage introduced by the most recent PM QoS fix in
which, embarrassingly enough, I forgot to update
dev_pm_qos_raw_read_value() to return the right default for devices
with no PM QoS constraints at all which prevents runtime PM from
suspending those devices (fix from Tero Kristo)"
* tag 'pm-urgent-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM / QoS: Fix default runtime_pm device resume latency
It turns out that some drivers seem to think it's ok to remap page
ranges from within interrupts and even NMI's. That is definitely not
the case, since the page table build-up is simply not interrupt-safe.
This showed up in the zero-day robot that reported it for the ACPI APEI
GHES ("Generic Hardware Error Source") driver. Normally it had been
hidden by the fact that no page table operations had been needed because
the vmalloc area had been set up by other things.
Apparently due to a recent change to the GHEI driver: commit
77b246b32b ("acpi: apei: check for pending errors when probing GHES
entries") 0day actually caught a case during bootup whenthe ioremap
called down to page allocation. But that recent change only showed the
symptom, it wasn't the root cause of the problem.
Hopefully it is limited to just that one driver.
If you need to access random physical memory, you either need to ioremap
in process context, or you need to use the FIXMAP facility to set one
particular fixmap entry to the required mapping - that can be done safely.
Cc: Borislav Petkov <bp@suse.de>
Cc: Len Brown <lenb@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Tyler Baicar <tbaicar@codeaurora.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Revalidating the disk needs to set the logical block format and capacity,
otherwise it can't figure out if the users modified anything about
the namespace.
Fixes: cdbff4f26b ("nvme: remove nvme_revalidate_ns")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The recent change to the PM QoS framework to introduce a proper
no constraint value overlooked to handle the devices which don't
implement PM QoS OPS. Runtime PM is one of the more severely
impacted subsystems, failing every attempt to runtime suspend
a device. This leads into some nasty second level issues like
probe failures and increased power consumption among other
things.
Fix this by adding a proper return value for devices that don't
implement PM QoS.
Fixes: 0cc2b4e5a0 (PM / QoS: Fix device resume latency PM QoS)
Signed-off-by: Tero Kristo <t-kristo@ti.com>
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Thorsten reported on <fa6e3ee2-91b5-a54b-afe3-87f30aac7a48@leemhuis.info> that
commit c9353bf483 made ath10k unstable with QCA6174 on his Dell XPS13 (9360)
with an error message:
ath10k_pci 0000:3a:00.0: failed to extract amsdu: -11
It only seemed to happen with certain APs, not all, but when it happened the
only way to get ath10k working was to switch the wifi off and on with a hotkey.
As this commit made things even worse (a warning vs breaking the whole
connection) let's revert the commit for now and while the issue is being fixed.
Link: http://lists.infradead.org/pipermail/ath10k/2017-October/010227.html
Reported-by: Thorsten Leemhuis <linux@leemhuis.info>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Rx data frames notified through HTT_T2H_MSG_TYPE_RX_IND and
HTT_T2H_MSG_TYPE_RX_FRAG_IND expect PN/TSC check to be done
on host (mac80211) rather than firmware. Rebuild cipher header
in every received data frames (that are notified through those
HTT interfaces) from the rx_hdr_status tlv available in the
rx descriptor of the first msdu. Skip setting RX_FLAG_IV_STRIPPED
flag for the packets which requires mac80211 PN/TSC check support
and set appropriate RX_FLAG for stripped crypto tail. Hw QCA988X,
QCA9887, QCA99X0, QCA9984, QCA9888 and QCA4019 currently need the
rebuilding of cipher header to perform PN/TSC check for replay
attack.
Please note that removing crypto tail for CCMP-256, GCMP and GCMP-256 ciphers
in raw mode needs to be fixed. Since Rx with these ciphers in raw
mode does not work in the current form even without this patch and
removing crypto tail for these chipers needs clean up, raw mode related
issues in CCMP-256, GCMP and GCMP-256 can be addressed in follow up
patches.
Tested-by: Manikanta Pubbisetty <mpubbise@qti.qualcomm.com>
Signed-off-by: Vasanthakumar Thiagarajan <vthiagar@qti.qualcomm.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
And fix tcon leak in error path.
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
CC: Stable <stable@vger.kernel.org>
Reviewed-by: David Disseldorp <ddiss@samba.org>
Pull networking fixes from David Miller:
1) Fix route leak in xfrm_bundle_create().
2) In mac80211, validate user rate mask before configuring it. From
Johannes Berg.
3) Properly enforce memory limits in fair queueing code, from Toke
Hoiland-Jorgensen.
4) Fix lockdep splat in inet_csk_route_req(), from Eric Dumazet.
5) Fix TSO header allocation and management in mvpp2 driver, from Yan
Markman.
6) Don't take socket lock in BH handler in strparser code, from Tom
Herbert.
7) Don't show sockets from other namespaces in AF_UNIX code, from
Andrei Vagin.
8) Fix double free in error path of tap_open(), from Girish Moodalbail.
9) Fix TX map failure path in igb and ixgbe, from Jean-Philippe Brucker
and Alexander Duyck.
10) Fix DCB mode programming in stmmac driver, from Jose Abreu.
11) Fix err_count handling in various tunnels (ipip, ip6_gre). From Xin
Long.
12) Properly align SKB head before building SKB in tuntap, from Jason
Wang.
13) Avoid matching qdiscs with a zero handle during lookups, from Cong
Wang.
14) Fix various endianness bugs in sctp, from Xin Long.
15) Fix tc filter callback races and add selftests which trigger the
problem, from Cong Wang.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (73 commits)
selftests: Introduce a new test case to tc testsuite
selftests: Introduce a new script to generate tc batch file
net_sched: fix call_rcu() race on act_sample module removal
net_sched: add rtnl assertion to tcf_exts_destroy()
net_sched: use tcf_queue_work() in tcindex filter
net_sched: use tcf_queue_work() in rsvp filter
net_sched: use tcf_queue_work() in route filter
net_sched: use tcf_queue_work() in u32 filter
net_sched: use tcf_queue_work() in matchall filter
net_sched: use tcf_queue_work() in fw filter
net_sched: use tcf_queue_work() in flower filter
net_sched: use tcf_queue_work() in flow filter
net_sched: use tcf_queue_work() in cgroup filter
net_sched: use tcf_queue_work() in bpf filter
net_sched: use tcf_queue_work() in basic filter
net_sched: introduce a workqueue for RCU callbacks of tc filter
sctp: fix some type cast warnings introduced since very beginning
sctp: fix a type cast warnings that causes a_rwnd gets the wrong value
sctp: fix some type cast warnings introduced by transport rhashtable
sctp: fix some type cast warnings introduced by stream reconf
...
Cong Wang says:
====================
net_sched: fix races with RCU callbacks
Recently, the RCU callbacks used in TC filters and TC actions keep
drawing my attention, they introduce at least 4 race condition bugs:
1. A simple one fixed by Daniel:
commit c78e1746d3
Author: Daniel Borkmann <daniel@iogearbox.net>
Date: Wed May 20 17:13:33 2015 +0200
net: sched: fix call_rcu() race on classifier module unloads
2. A very nasty one fixed by me:
commit 1697c4bb52
Author: Cong Wang <xiyou.wangcong@gmail.com>
Date: Mon Sep 11 16:33:32 2017 -0700
net_sched: carefully handle tcf_block_put()
3. Two more bugs found by Chris:
https://patchwork.ozlabs.org/patch/826696/https://patchwork.ozlabs.org/patch/826695/
Usually RCU callbacks are simple, however for TC filters and actions,
they are complex because at least TC actions could be destroyed
together with the TC filter in one callback. And RCU callbacks are
invoked in BH context, without locking they are parallel too. All of
these contribute to the cause of these nasty bugs.
Alternatively, we could also:
a) Introduce a spinlock to serialize these RCU callbacks. But as I
said in commit 1697c4bb52 ("net_sched: carefully handle
tcf_block_put()"), it is very hard to do because of tcf_chain_dump().
Potentially we need to do a lot of work to make it possible (if not
impossible).
b) Just get rid of these RCU callbacks, because they are not
necessary at all, callers of these call_rcu() are all on slow paths
and holding RTNL lock, so blocking is allowed in their contexts.
However, David and Eric dislike adding synchronize_rcu() here.
As suggested by Paul, we could defer the work to a workqueue and
gain the permission of holding RTNL again without any performance
impact, however, in tcf_block_put() we could have a deadlock when
flushing workqueue while hodling RTNL lock, the trick here is to
defer the work itself in workqueue and make it queued after all
other works so that we keep the same ordering to avoid any
use-after-free. Please see the first patch for details.
Patch 1 introduces the infrastructure, patch 2~12 move each
tc filter to the new tc filter workqueue, patch 13 adds
an assertion to catch potential bugs like this, patch 14
closes another rcu callback race, patch 15 and patch 16 add
new test cases.
====================
Reported-by: Chris Mi <chrism@mellanox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In this patchset, we fixed a tc bug. This patch adds the test case
that reproduces the bug. To run this test case, user should specify
an existing NIC device:
# sudo ./tdc.py -d enp4s0f0
This test case belongs to category "flower". If user doesn't specify
a NIC device, the test cases belong to "flower" will not be run.
In this test case, we create 1M filters and all filters share the same
action. When destroying all filters, kernel should not panic. It takes
about 18s to run it.
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Lucas Bates <lucasb@mojatatu.com>
Signed-off-by: Chris Mi <chrism@mellanox.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
# ./tdc_batch.py -h
usage: tdc_batch.py [-h] [-n NUMBER] [-o] [-s] [-p] device file
TC batch file generator
positional arguments:
device device name
file batch file name
optional arguments:
-h, --help show this help message and exit
-n NUMBER, --number NUMBER
how many lines in batch file
-o, --skip_sw skip_sw (offload), by default skip_hw
-s, --share_action all filters share the same action
-p, --prio all filters have different prio
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Lucas Bates <lucasb@mojatatu.com>
Signed-off-by: Chris Mi <chrism@mellanox.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Similar to commit c78e1746d3
("net: sched: fix call_rcu() race on classifier module unloads"),
we need to wait for flying RCU callback tcf_sample_cleanup_rcu().
Cc: Yotam Gigi <yotamg@mellanox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After previous patches, it is now safe to claim that
tcf_exts_destroy() is always called with RTNL lock.
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Defer the tcf_exts_destroy() in RCU callback to
tc filter workqueue and get RTNL lock.
Reported-by: Chris Mi <chrism@mellanox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Defer the tcf_exts_destroy() in RCU callback to
tc filter workqueue and get RTNL lock.
Reported-by: Chris Mi <chrism@mellanox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Defer the tcf_exts_destroy() in RCU callback to
tc filter workqueue and get RTNL lock.
Reported-by: Chris Mi <chrism@mellanox.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>