We initialize a struct tipc_event allocated on the kernel stack to
zero to avert info leak to user space.
Reported-by: syzbot+057458894bc8cada4dee@syzkaller.appspotmail.com
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In ethtool_ioctl(), the ioctl command 'ethcmd' is checked through a switch
statement to see whether it is necessary to pre-process the ethtool
structure, because, as mentioned in the comment, the structure
ethtool_rxnfc is defined with padding. If yes, a user-space buffer 'rxnfc'
is allocated through compat_alloc_user_space(). One thing to note here is
that, if 'ethcmd' is ETHTOOL_GRXCLSRLALL, the size of the buffer 'rxnfc' is
partially determined by 'rule_cnt', which is actually acquired from the
user-space buffer 'compat_rxnfc', i.e., 'compat_rxnfc->rule_cnt', through
get_user(). After 'rxnfc' is allocated, the data in the original user-space
buffer 'compat_rxnfc' is then copied to 'rxnfc' through copy_in_user(),
including the 'rule_cnt' field. However, after this copy, no check is
re-enforced on 'rxnfc->rule_cnt'. So it is possible that a malicious user
race to change the value in the 'compat_rxnfc->rule_cnt' between these two
copies. Through this way, the attacker can bypass the previous check on
'rule_cnt' and inject malicious data. This can cause undefined behavior of
the kernel and introduce potential security risk.
This patch avoids the above issue via copying the value acquired by
get_user() to 'rxnfc->rule_cn', if 'ethcmd' is ETHTOOL_GRXCLSRLALL.
Signed-off-by: Wenwen Wang <wang6495@umn.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
When dumping classes by parent, kernel would return classes twice:
| # tc qdisc add dev lo root prio
| # tc class show dev lo
| class prio 8001:1 parent 8001:
| class prio 8001:2 parent 8001:
| class prio 8001:3 parent 8001:
| # tc class show dev lo parent 8001:
| class prio 8001:1 parent 8001:
| class prio 8001:2 parent 8001:
| class prio 8001:3 parent 8001:
| class prio 8001:1 parent 8001:
| class prio 8001:2 parent 8001:
| class prio 8001:3 parent 8001:
This comes from qdisc_match_from_root() potentially returning the root
qdisc itself if its handle matched. Though in that case, root's classes
were already dumped a few lines above.
Fixes: cb395b2010 ("net: sched: optimize class dumps")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
rtl_rx() and rtl_tx() are called only if the respective bits are set
in the interrupt status register. Under high load NAPI may not be
able to process all data (work_done == budget) and it will schedule
subsequent calls to the poll callback.
rtl_ack_events() however resets the bits in the interrupt status
register, therefore subsequent calls to rtl8169_poll() won't call
rtl_rx() and rtl_tx() - chip interrupts are still disabled.
Fix this by calling rtl_rx() and rtl_tx() independent of the bits
set in the interrupt status register. Both functions will detect
if there's nothing to do for them.
Fixes: da78dbff2e ("r8169: remove work from irq handler.")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Steffen Klassert says:
====================
pull request (net): ipsec 2018-10-18
1) Free the xfrm interface gro_cells when deleting the
interface, otherwise we leak it. From Li RongQing.
2) net/core/flow.c does not exist anymore, so remove it
from the MAINTAINERS file.
3) Fix a slab-out-of-bounds in _decode_session6.
From Alexei Starovoitov.
4) Fix RCU protection when policies inserted into
thei bydst lists. From Florian Westphal.
Please pull or let me know if there are problems.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
If the skb space ends in an unresolved entry while dumping we'll miss
some unresolved entries. The reason is due to zeroing the entry counter
between dumping resolved and unresolved mfc entries. We should just
keep counting until the whole table is dumped and zero when we move to
the next as we have a separate table counter.
Reported-by: Colin Ian King <colin.king@canonical.com>
Fixes: 8fb472c09b ("ipmr: improve hash scalability")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ocelot_vlant_wait_for_completion() function is very similar to the
ocelot_mact_wait_for_completion(). It seemed to have be copied but the
comment was not updated, so let's fix it.
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
sctp data size should be calculated by subtracting data chunk header's
length from chunk_hdr->length, not just data header.
Fixes: 668c9beb90 ("sctp: implement assign_number for sctp_stream_interleave")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 713a98d90c ("virtio-net: serialize tx routine during reset")
introduces netif_tx_disable() after netif_device_detach() in order to
avoid use-after-free of tx queues. However, there are two issues.
1) Its operation is redundant with netif_device_detach() in case the
interface is running.
2) In case of the interface is not running before suspending and
resuming, the tx does not get resumed by netif_device_attach().
This results in losing network connectivity.
It is better to use netif_tx_lock_bh()/netif_tx_unlock_bh() instead for
serializing tx routine during reset. This also preserves the symmetry
of netif_device_detach() and netif_device_attach().
Fixes commit 713a98d90c ("virtio-net: serialize tx routine during reset")
Signed-off-by: Ake Koomsin <ake@igel.co.jp>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The commit eb63f2964d ("udp6: add missing checks on edumux packet
processing") used the same return code convention of the ipv4 counterpart,
but ipv6 uses the opposite one: positive values means resubmit.
This change addresses the issue, using positive return value for
resubmitting. Also update the related comment, which was broken, too.
Fixes: eb63f2964d ("udp6: add missing checks on edumux packet processing")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the switch driver (e.g., mlxsw_spectrum) determines it needs to
flash a new firmware version it resets the ASIC after the flashing
process. The bus driver (e.g., mlxsw_pci) then registers itself again
with mlxsw_core which means (among other things) that the device
registers itself again with the hwmon subsystem again.
Since the device was registered with the hwmon subsystem using
devm_hwmon_device_register_with_groups(), then the old hwmon device
(registered before the flashing) was never unregistered and was
referencing stale data, resulting in a use-after free.
Fix by removing reliance on device managed APIs in mlxsw_hwmon_init().
Fixes: c86d62cc41 ("mlxsw: spectrum: Reset FW after flash")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Alexander Petrovskiy <alexpe@mellanox.com>
Tested-by: Alexander Petrovskiy <alexpe@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When sctp_wait_for_connect is called to wait for connect ready
for sp->strm_interleave in sctp_sendmsg_to_asoc, a panic could
be triggered if cpu is scheduled out and the new asoc is freed
elsewhere, as it will return err and later the asoc gets freed
again in sctp_sendmsg.
[ 285.840764] list_del corruption, ffff9f0f7b284078->next is LIST_POISON1 (dead000000000100)
[ 285.843590] WARNING: CPU: 1 PID: 8861 at lib/list_debug.c:47 __list_del_entry_valid+0x50/0xa0
[ 285.846193] Kernel panic - not syncing: panic_on_warn set ...
[ 285.846193]
[ 285.848206] CPU: 1 PID: 8861 Comm: sctp_ndata Kdump: loaded Not tainted 4.19.0-rc7.label #584
[ 285.850559] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[ 285.852164] Call Trace:
...
[ 285.872210] ? __list_del_entry_valid+0x50/0xa0
[ 285.872894] sctp_association_free+0x42/0x2d0 [sctp]
[ 285.873612] sctp_sendmsg+0x5a4/0x6b0 [sctp]
[ 285.874236] sock_sendmsg+0x30/0x40
[ 285.874741] ___sys_sendmsg+0x27a/0x290
[ 285.875304] ? __switch_to_asm+0x34/0x70
[ 285.875872] ? __switch_to_asm+0x40/0x70
[ 285.876438] ? ptep_set_access_flags+0x2a/0x30
[ 285.877083] ? do_wp_page+0x151/0x540
[ 285.877614] __sys_sendmsg+0x58/0xa0
[ 285.878138] do_syscall_64+0x55/0x180
[ 285.878669] entry_SYSCALL_64_after_hwframe+0x44/0xa9
This is a similar issue with the one fixed in Commit ca3af4dd28
("sctp: do not free asoc when it is already dead in sctp_sendmsg").
But this one can't be fixed by returning -ESRCH for the dead asoc
in sctp_wait_for_connect, as it will break sctp_connect's return
value to users.
This patch is to simply set err to -ESRCH before it returns to
sctp_sendmsg when any err is returned by sctp_wait_for_connect
for sp->strm_interleave, so that no asoc would be freed due to
this.
When users see this error, they will know the packet hasn't been
sent. And it also makes sense to not free asoc because waiting
connect fails, like the second call for sctp_wait_for_connect in
sctp_sendmsg_to_asoc.
Fixes: 668c9beb90 ("sctp: implement assign_number for sctp_stream_interleave")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
syzbot reported an use-after-free involving sctp_id2asoc. Dmitry Vyukov
helped to root cause it and it is because of reading the asoc after it
was freed:
CPU 1 CPU 2
(working on socket 1) (working on socket 2)
sctp_association_destroy
sctp_id2asoc
spin lock
grab the asoc from idr
spin unlock
spin lock
remove asoc from idr
spin unlock
free(asoc)
if asoc->base.sk != sk ... [*]
This can only be hit if trying to fetch asocs from different sockets. As
we have a single IDR for all asocs, in all SCTP sockets, their id is
unique on the system. An application can try to send stuff on an id
that matches on another socket, and the if in [*] will protect from such
usage. But it didn't consider that as that asoc may belong to another
socket, it may be freed in parallel (read: under another socket lock).
We fix it by moving the checks in [*] into the protected region. This
fixes it because the asoc cannot be freed while the lock is held.
Reported-by: syzbot+c7dd55d7aec49d48e49a@syzkaller.appspotmail.com
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Similar to d49c88d767 ("r8169: Enable MSI-X on RTL8106e") after
e9d0ba506ea8 ("PCI: Reprogram bridge prefetch registers on resume")
we can safely assume that this also fixes the root cause of
the issue worked around by 7c53a72245 ("r8169: don't use MSI-X on
RTL8168g"). So let's revert it.
Fixes: 7c53a72245 ("r8169: don't use MSI-X on RTL8168g")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
pid_task() dereferences rcu protected tasks array.
But there is no rcu_read_lock() in shutdown_umh() routine so that
rcu_read_lock() is needed.
get_pid_task() is wrapper function of pid_task. it holds rcu_read_lock()
then calls pid_task(). if task isn't NULL, it increases reference count
of task.
test commands:
%modprobe bpfilter
%modprobe -rv bpfilter
splat looks like:
[15102.030932] =============================
[15102.030957] WARNING: suspicious RCU usage
[15102.030985] 4.19.0-rc7+ #21 Not tainted
[15102.031010] -----------------------------
[15102.031038] kernel/pid.c:330 suspicious rcu_dereference_check() usage!
[15102.031063]
other info that might help us debug this:
[15102.031332]
rcu_scheduler_active = 2, debug_locks = 1
[15102.031363] 1 lock held by modprobe/1570:
[15102.031389] #0: 00000000580ef2b0 (bpfilter_lock){+.+.}, at: stop_umh+0x13/0x52 [bpfilter]
[15102.031552]
stack backtrace:
[15102.031583] CPU: 1 PID: 1570 Comm: modprobe Not tainted 4.19.0-rc7+ #21
[15102.031607] Hardware name: To be filled by O.E.M. To be filled by O.E.M./Aptio CRB, BIOS 5.6.5 07/08/2015
[15102.031628] Call Trace:
[15102.031676] dump_stack+0xc9/0x16b
[15102.031723] ? show_regs_print_info+0x5/0x5
[15102.031801] ? lockdep_rcu_suspicious+0x117/0x160
[15102.031855] pid_task+0x134/0x160
[15102.031900] ? find_vpid+0xf0/0xf0
[15102.032017] shutdown_umh.constprop.1+0x1e/0x53 [bpfilter]
[15102.032055] stop_umh+0x46/0x52 [bpfilter]
[15102.032092] __x64_sys_delete_module+0x47e/0x570
[ ... ]
Fixes: d2ba09c17a ("net: add skeleton of bpfilter kernel module")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
pin_index can be indirectly controlled by user-space, hence leading
to a potential exploitation of the Spectre variant 1 vulnerability.
This issue was detected with the help of Smatch:
drivers/ptp/ptp_chardev.c:253 ptp_ioctl() warn: potential spectre issue
'ops->pin_config' [r] (local cap)
Fix this by sanitizing pin_index before using it to index
ops->pin_config, and before passing it as an argument to
function ptp_set_pinfunc(), in which it is used to index
info->pin_config.
Notice that given that speculation windows are large, the policy is
to kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].
[1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2
Cc: stable@vger.kernel.org
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clang currently warns:
drivers/net/ethernet/qlogic/qla3xxx.c:384:24: warning: signed shift
result (0xF00000000) requires 37 bits to represent, but 'int' only has
32 bits [-Wshift-overflow]
((ISP_NVRAM_MASK << 16) | qdev->eeprom_cmd_data));
~~~~~~~~~~~~~~ ^ ~~
1 warning generated.
The warning is certainly accurate since ISP_NVRAM_MASK is defined as
(0x000F << 16) which is then shifted by 16, resulting in 64424509440,
well above UINT_MAX.
Given that this is the only location in this driver where ISP_NVRAM_MASK
is shifted again, it seems likely that ISP_NVRAM_MASK was originally
defined without a shift and during the move of the shift to the
definition, this statement wasn't properly removed (since ISP_NVRAM_MASK
is used in the statenent right above this). Only the maintainers can
confirm this since this statment has been here since the driver was
first added to the kernel.
Link: https://github.com/ClangBuiltLinux/linux/issues/127
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefano Brivio says:
====================
geneve, vxlan: Don't set exceptions if skb->len < mtu
This series fixes the exception abuse described in 2/2, and 1/2
is just a preparatory change to make 2/2 less ugly.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
We shouldn't abuse exceptions: if the destination MTU is already higher
than what we're transmitting, no exception should be created.
Fixes: 52a589d51f ("geneve: update skb dst pmtu on tx path")
Fixes: a93bf0ff44 ("vxlan: update skb dst pmtu on tx path")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit f15ca723c1 ("net: don't call update_pmtu unconditionally") avoids
that we try updating PMTU for a non-existent destination, but didn't clean
up cases where the check was already explicit. Drop those redundant checks.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
According to rfc7496 section 4.3 or 4.4:
sprstat_policy: This parameter indicates for which PR-SCTP policy
the user wants the information. It is an error to use
SCTP_PR_SCTP_NONE in sprstat_policy. If SCTP_PR_SCTP_ALL is used,
the counters provided are aggregated over all supported policies.
We change to dump pr_assoc and pr_stream all status by SCTP_PR_SCTP_ALL
instead, and return error for SCTP_PR_SCTP_NONE, as it also said "It is
an error to use SCTP_PR_SCTP_NONE in sprstat_policy. "
Fixes: 826d253d57 ("sctp: add SCTP_PR_ASSOC_STATUS on sctp sockopt")
Fixes: d229d48d18 ("sctp: add SCTP_PR_STREAM_STATUS sockopt for prsctp")
Reported-by: Ying Xu <yinxu@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski says:
====================
nfp: fix pedit set action offloads
Pieter says:
This set fixes set actions when using multiple pedit actions with
partial masks and with multiple keys per pedit action. Additionally
it fixes set ipv6 pedit action offloads when using it in combination
with other header keys.
The problem would only trigger if one combines multiple pedit actions
of the same type with partial masks, e.g.:
$ tc filter add dev netdev protocol ip parent ffff: \
flower indev netdev \
ip_proto tcp \
action pedit ex munge \
ip src set 11.11.11.11 retain 65535 munge \
ip src set 22.22.22.22 retain 4294901760 pipe \
csum ip and tcp pipe \
mirred egress redirect dev netdev
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Previously when populating the set ipv6 address action, we incorrectly
made use of pedit's key index to determine which 32bit word should be
set. We now calculate which word has been selected based on the offset
provided by the pedit action.
Fixes: 354b82bb32 ("nfp: add set ipv6 source and destination address")
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Previously we only allowed a single header key per pedit action to
change the header. This used to result in the last header key in the
pedit action to overwrite previous headers. We now keep track of them
and allow multiple header keys per pedit action.
Fixes: c0b1bd9a8b ("nfp: add set ipv4 header action flower offload")
Fixes: 354b82bb32 ("nfp: add set ipv6 source and destination address")
Fixes: f8b7b0a6b1 ("nfp: add set tcp and udp header action flower offload")
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Previously we did not correctly change headers when using multiple
pedit actions with partial masks. We now take this into account and
no longer just commit the last pedit action.
Fixes: c0b1bd9a8b ("nfp: add set ipv4 header action flower offload")
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix a missing call to rxrpc_put_peer() on the main path through the
rxrpc_error_report() function. This manifests itself as a ref leak
whenever an ICMP packet or other error comes in.
In commit f334430316, the hand-off of the ref to a work item was removed
and was not replaced with a put.
Fixes: f334430316 ("rxrpc: Fix error distribution")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Other than asoc pmtu sync from all transports, sctp_assoc_sync_pmtu
is also processing transport pmtu_pending by icmp packets. But it's
meaningless to use sctp_dst_mtu(t->dst) as new pmtu for a transport.
The right pmtu value should come from the icmp packet, and it would
be saved into transport->mtu_info in this patch and used later when
the pmtu sync happens in sctp_sendmsg_to_asoc or sctp_packet_config.
Besides, without this patch, as pmtu can only be updated correctly
when receiving a icmp packet and no place is holding sock lock, it
will take long time if the sock is busy with sending packets.
Note that it doesn't process transport->mtu_info in .release_cb(),
as there is no enough information for pmtu update, like for which
asoc or transport. It is not worth traversing all asocs to check
pmtu_pending. So unlike tcp, sctp does this in tx path, for which
mtu_info needs to be atomic_t.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit db65f35f50 ("net: fec: add support of ethtool get_regs") introduce
ethool "--register-dump" interface to dump all FEC registers.
But not all silicon implementations of the Freescale FEC hardware module
have the FRBR (FIFO Receive Bound Register) and FRSR (FIFO Receive Start
Register) register, so we should not be trying to dump them on those that
don't.
To fix it we create a quirk flag, FEC_QUIRK_HAS_RFREG, and check it before
dump those RX FIFO registers.
Signed-off-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Trivial fix to spelling mistake in DP_INFO message
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The binding table's 'cluster_scope' list is rcu protected to handle
races between threads changing the list and those traversing the list at
the same moment. We have now found that the function named_distribute()
uses the regular list_for_each() macro to traverse the said list.
Likewise, the function tipc_named_withdraw() is removing items from the
same list using the regular list_del() call. When these two functions
execute in parallel we see occasional crashes.
This commit fixes this by adding the missing _rcu() suffixes.
Signed-off-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The udpv6_encap_enable() function is part of the ipv6 code, and if that is
configured as a loadable module and rxrpc is built in then a build failure
will occur because the conditional check is wrong:
net/rxrpc/local_object.o: In function `rxrpc_lookup_local':
local_object.c:(.text+0x2688): undefined reference to `udpv6_encap_enable'
Use the correct config symbol (CONFIG_AF_RXRPC_IPV6) in the conditional
check rather than CONFIG_IPV6 as that will do the right thing.
Fixes: 5271953cad ("rxrpc: Use the UDP encap_rcv hook")
Reported-by: kbuild-all@01.org
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
When commit 270972554c ("[IPV6]: ROUTE: Add Router Reachability
Probing (RFC4191).") introduced router probing, the rt6_probe() function
required that a neighbour entry existed. This neighbour entry is used to
record the timestamp of the last probe via the ->updated field.
Later, commit 2152caea71 ("ipv6: Do not depend on rt->n in rt6_probe().")
removed the requirement for a neighbour entry. Neighbourless routes skip
the interval check and are not rate-limited.
This patch adds rate-limiting for neighbourless routes, by recording the
timestamp of the last probe in the fib6_info itself.
Fixes: 2152caea71 ("ipv6: Do not depend on rt->n in rt6_probe().")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On GENETv5, there is a hardware issue which prevents the GENET hardware
from generating a link UP interrupt when the link is operating at
10Mbits/sec. Since we do not have any way to configure the link
detection logic, fallback to polling in that case.
Fixes: 421380856d ("net: bcmgenet: add support for the GENETv5 hardware")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes gcc '-Wunused-but-set-variable' warning:
net/rxrpc/output.c: In function 'rxrpc_reject_packets':
net/rxrpc/output.c:527:11: warning:
variable 'ioc' set but not used [-Wunused-but-set-variable]
'ioc' is the correct kvec num when sending a BUSY (or an ABORT) response
packet.
Fixes: ece64fec16 ("rxrpc: Emit BUSY packets when supposed to rather than ABORTs")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix an uninitialised variable introduced by the last patch. This can cause
a crash when a new call comes in to a local service, such as when an AFS
fileserver calls back to the local cache manager.
Fixes: c1e15b4944 ("rxrpc: Fix the packet reception routine")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the commit referred to below we added link tolerance as an additional
criteria for declaring broadcast transmission "stale" and resetting the
unicast links to the affected node.
Unfortunately, this 'improvement' introduced two bugs, which each and
one alone cause only limited problems, but combined lead to seemingly
stochastic unicast link resets, depending on the amount of broadcast
traffic transmitted.
The first issue, a missing initialization of the 'tolerance' field of
the receiver broadcast link, was recently fixed by commit 047491ea33
("tipc: set link tolerance correctly in broadcast link").
Ths second issue, where we omit to reset the 'stale_cnt' field of
the same link after a 'stale' period is over, leads to this counter
accumulating over time, and in the absence of the 'tolerance' criteria
leads to the above described symptoms. This commit adds the missing
initialization.
Fixes: a4dc70d46c ("tipc: extend link reset criteria for stale packet retransmission")
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
WHen an llc sock is added into the sk_laddr_hash of an llc_sap,
it is not marked with SOCK_RCU_FREE.
This causes that the sock could be freed while it is still being
read by __llc_lookup_established() with RCU read lock. sock is
refcounted, but with RCU read lock, nothing prevents the readers
getting a zero refcnt.
Fix it by setting SOCK_RCU_FREE in llc_sap_add_socket().
Reported-by: syzbot+11e05f04c15e03be5254@syzkaller.appspotmail.com
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJbvqcMAAoJEEg/ir3gV/o+eFsH/2TbJH+i1BuGMVwCB8o+U1Rz
C01pJmR7Lb7WwQZ8ZKTOqQkS7BkGX1hNGyIlc4i6ZnP+4gsVJAbP6LKPjTvyD7e6
TNb8bvxTUCOovknrevKkGba8tzoTTsC4wwwbHLGHd1hkKSY1P5hXg8R7vpear+n6
/PFJwzpIXDAa8AHqeORCNYj7MneUm3kaahcmSOxOhvDbRx3UG9cgy7tEhPjZbRn5
jPFsxFCSPcGedtI+g8bzodmpneTcu1KF6QCunrl2bGt5EzgDrbaw1UUoctxD2CJR
Ch45W807EvBJoFiJXXCNf9N+p5020F/Q+mTmK7khPirUjdtoLdcT9Goswpjfbtk=
=vJIA
-----END PGP SIGNATURE-----
Merge tag 'mlx5-fixes-2018-10-10' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
Mellanox, mlx5 fixes 2018-10-10
This pull request includes some fixes to mlx5 driver,
Please pull and let me know if there's any problem.
For -stable v4.11:
('net/mlx5: Take only bit 24-26 of wqe.pftype_wq for page fault type')
For -stable v4.17:
('net/mlx5: Fix memory leak when setting fpga ipsec caps')
For -stable v4.18:
('net/mlx5: WQ, fixes for fragmented WQ buffers API')
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Similarly to what has been done in 8b4c3cdd9d ("net: sched: Add policy
validation for tc attributes"), fix classifier code to add validation of
TCA_CHAIN and TCA_KIND netlink attributes.
tested with:
# ./tdc.py -c filter
v2: Let sch_api and cls_api share nla_policy they have in common, thanks
to David Ahern.
v3: Avoid EXPORT_SYMBOL(), as validation of those attributes is not done
by TC modules, thanks to Cong Wang.
While at it, restore the 'Delete / get qdisc' comment to its orginal
position, just above tc_get_qdisc() function prototype.
Fixes: 5bc1701881 ("net: sched: introduce multichain support for filters")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In dev_ethtool(), the eth command 'ethcmd' is firstly copied from the
use-space buffer 'useraddr' and checked to see whether it is
ETHTOOL_PERQUEUE. If yes, the sub-command 'sub_cmd' is further copied from
the user space. Otherwise, 'sub_cmd' is the same as 'ethcmd'. Next,
according to 'sub_cmd', a permission check is enforced through the function
ns_capable(). For example, the permission check is required if 'sub_cmd' is
ETHTOOL_SCOALESCE, but it is not necessary if 'sub_cmd' is
ETHTOOL_GCOALESCE, as suggested in the comment "Allow some commands to be
done by anyone". The following execution invokes different handlers
according to 'ethcmd'. Specifically, if 'ethcmd' is ETHTOOL_PERQUEUE,
ethtool_set_per_queue() is called. In ethtool_set_per_queue(), the kernel
object 'per_queue_opt' is copied again from the user-space buffer
'useraddr' and 'per_queue_opt.sub_command' is used to determine which
operation should be performed. Given that the buffer 'useraddr' is in the
user space, a malicious user can race to change the sub-command between the
two copies. In particular, the attacker can supply ETHTOOL_PERQUEUE and
ETHTOOL_GCOALESCE to bypass the permission check in dev_ethtool(). Then
before ethtool_set_per_queue() is called, the attacker changes
ETHTOOL_GCOALESCE to ETHTOOL_SCOALESCE. In this way, the attacker can
bypass the permission check and execute ETHTOOL_SCOALESCE.
This patch enforces a check in ethtool_set_per_queue() after the second
copy from 'useraddr'. If the sub-command is different from the one obtained
in the first copy in dev_ethtool(), an error code EINVAL will be returned.
Fixes: f38d138a7d ("net/ethtool: support set coalesce per queue")
Signed-off-by: Wenwen Wang <wang6495@umn.edu>
Reviewed-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
In ethtool_get_rxnfc(), the eth command 'cmd' is compared against
'ETHTOOL_GRXFH' to see whether it is necessary to adjust the variable
'info_size'. Then the whole structure of 'info' is copied from the
user-space buffer 'useraddr' with 'info_size' bytes. In the following
execution, 'info' may be copied again from the buffer 'useraddr' depending
on the 'cmd' and the 'info.flow_type'. However, after these two copies,
there is no check between 'cmd' and 'info.cmd'. In fact, 'cmd' is also
copied from the buffer 'useraddr' in dev_ethtool(), which is the caller
function of ethtool_get_rxnfc(). Given that 'useraddr' is in the user
space, a malicious user can race to change the eth command in the buffer
between these copies. By doing so, the attacker can supply inconsistent
data and cause undefined behavior because in the following execution 'info'
will be passed to ops->get_rxnfc().
This patch adds a necessary check on 'info.cmd' and 'cmd' to confirm that
they are still same after the two copies in ethtool_get_rxnfc(). Otherwise,
an error code EINVAL will be returned.
Signed-off-by: Wenwen Wang <wang6495@umn.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Originally, we have an issue where r8169 MSI-X interrupt is broken after
S3 suspend/resume on RTL8106e of ASUS X441UAR.
02:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd.
RTL8101/2/6E PCI Express Fast/Gigabit Ethernet controller [10ec:8136]
(rev 07)
Subsystem: ASUSTeK Computer Inc. RTL810xE PCI Express Fast
Ethernet controller [1043:200f]
Flags: bus master, fast devsel, latency 0, IRQ 16
I/O ports at e000 [size=256]
Memory at ef100000 (64-bit, non-prefetchable) [size=4K]
Memory at e0000000 (64-bit, prefetchable) [size=16K]
Capabilities: [40] Power Management version 3
Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [70] Express Endpoint, MSI 01
Capabilities: [b0] MSI-X: Enable+ Count=4 Masked-
Capabilities: [d0] Vital Product Data
Capabilities: [100] Advanced Error Reporting
Capabilities: [140] Virtual Channel
Capabilities: [160] Device Serial Number 01-00-00-00-36-4c-e0-00
Capabilities: [170] Latency Tolerance Reporting
Kernel driver in use: r8169
Kernel modules: r8169
We found the all of the values in PCI BAR=4 of the ethernet adapter
become 0xFF after system resumes. That breaks the MSI-X interrupt.
Therefore, we can only fall back to MSI interrupt to fix the issue at
that time.
However, there is a commit which resolves the drivers getting nothing in
PCI BAR=4 after system resumes. It is 04cb3ae895d7 "PCI: Reprogram
bridge prefetch registers on resume" by Daniel Drake.
After apply the patch, the ethernet adapter works fine before suspend
and after resume. So, we can revert the workaround after the commit
"PCI: Reprogram bridge prefetch registers on resume" is merged into main
tree.
This patch reverts commit 7bb05b85bc
"r8169: don't use MSI-X on RTL8106e".
Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=201181
Fixes: 7bb05b85bc ("r8169: don't use MSI-X on RTL8106e")
Signed-off-by: Jian-Hong Pan <jian-hong@endlessm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann says:
====================
pull-request: bpf 2018-10-14
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) Fix xsk map update and delete operation to not call synchronize_net()
but to piggy back on SOCK_RCU_FREE for sockets instead as we are not
allowed to sleep under RCU, from Björn.
2) Do not change RLIMIT_MEMLOCK in reuseport_bpf selftest if the process
already has unlimited RLIMIT_MEMLOCK, from Eric.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Two last minute bugfixes, both for NXP platforms:
* The Layerscape 'qbman' infrastructure suffers from probe ordering
bugs in some configurations, a two-patch series adds a hotfix for
this. 4.20 will have a longer set of patches to rework it.
* The old imx53-qsb board regressed in 4.19 after the addition
of cpufreq support, adding a set of explicit operating points
fixes this.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJbwLz1AAoJEGCrR//JCVInfygP/2U3Uqt3Tr2QRROx4wIl0gGh
+sE5gfjBSrXPXq8SoLrJOOgxHYQwbAoWO93rYcmgjhn5a14YZuJzK9cgKiHaHOIc
fbkqjGrJ8Bbmy6gRCKrEVCvF8W6qV4pAwh/VT+HFaKDJK6pblvJEOysAAzFEbAP+
3fOxJcjKh5KaDrvWS/Y/wVBd5BUfpIpPWksWBaRONIfaO24gGs5Bp+OfVS/u2Ccz
iml59Pgg3KsaBr5u6PQSgZDfJ1CX4mMdJS0yLlqCdh+LdHduCWomH15OLIxiCEty
8hrDeSleMRW7MzIhbnxvGgkKGE3wa5yPr5ABMB6DR6wcWuI0V1K5TDr0GP4x53yK
li4+rFGVgenIkGqtFEogYerfbH7jLp3noC/7SYKPsT4wkSSoXegFgOw+tV9DPmVf
6CZbNP98HCNlPyu/pxizHv9PHPKOVmzO7k32Afens35/7/2oPrbeNnXbBRBaKbFF
2oUUNyHER6DOuELsXcMZGLeJpp5lwRa7+0/6pKOFvqkirbJji7N1o7EBNMG2ZL82
OiDXK4SKip9TWDfZ7ueKV8TznYnGlWmiGc1az1wrsOctB8Sk2tqxl7UADFOYM/5L
KBwDw9SVttRWfbKTBhEiaFIyHLY+X6JPSbEqbXU7HL/MNW9JlGVwZzFHtaq/AaYL
yPNvMgg7GFfGpEk8pQFV
=WY74
-----END PGP SIGNATURE-----
Merge tag 'armsoc-fixes-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Arnd writes:
"ARM: SoC fixes for 4.19
Two last minute bugfixes, both for NXP platforms:
* The Layerscape 'qbman' infrastructure suffers from probe ordering
bugs in some configurations, a two-patch series adds a hotfix for
this. 4.20 will have a longer set of patches to rework it.
* The old imx53-qsb board regressed in 4.19 after the addition
of cpufreq support, adding a set of explicit operating points
fixes this."
* tag 'armsoc-fixes-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
soc: fsl: qman_portals: defer probe after qman's probe
soc: fsl: qbman: add APIs to retrieve the probing status
ARM: dts: imx53-qsb: disable 1.2GHz OPP
Fix a leak of afs_server structs. The routine that installs them in the
various lookup lists and trees gets a ref on leaving the function, whether
it added the server or a server already exists. It shouldn't increment
the refcount if it added the server.
The effect of this that "rmmod kafs" will hang waiting for the leaked
server to become unused.
Fixes: d2ddc776a4 ("afs: Overhaul volume and server record caching and fileserver rotation")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Just drop the "linux" part of the path, it was never correct.
Reported-by: Joe Perches <joe@perches.com>
Fixes: 256ac03750 ("dt-bindings: document devicetree bindings for mux-controllers and gpio-mux")
Signed-off-by: Peter Rosin <peda@axentia.se>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The file is GPL v2 or later.
Acked-by: Mircea Caprioru <mircea.caprioru@analog.com>
Signed-off-by: Peter Rosin <peda@axentia.se>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Access to the list of cells by /proc/net/afs/cells has a couple of
problems:
(1) It should be checking against SEQ_START_TOKEN for the keying the
header line.
(2) It's only holding the RCU read lock, so it can't just walk over the
list without following the proper RCU methods.
Fix these by using an hlist instead of an ordinary list and using the
appropriate accessor functions to follow it with RCU.
Since the code that adds a cell to the list must also necessarily change,
sort the list on insertion whilst we're at it.
Fixes: 989782dcdc ("afs: Overhaul cell database management")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
- Avoid fragile multiblock reads for the last sector in SPI mode
WIFI/SDIO:
- libertas: Fixup suspend sequence for the SDIO card
-----BEGIN PGP SIGNATURE-----
iQJLBAABCgA1FiEEugLDXPmKSktSkQsV/iaEJXNYjCkFAlvAeBQXHHVsZi5oYW5z
c29uQGxpbmFyby5vcmcACgkQ/iaEJXNYjCnGZQ/+OYERfaKGxOWlfTyBEFUoVHid
OKoayYgtI4rHATsDYxo4h33dPneAxd1HL0sGa7U7ReY2whUfMlxzVw0Y1MhfVlwn
LrKXzVvuAu57FbckGIf4aROw6NHGSxYZALCTbFxN/Lz1I7rEsjrKytl5wV8o2Sgc
INevRqYQTQq7B4nkEGCLvvbKOfTSitvh8uYKAXGJRKxM9PEBc0JgZ/eOJe+iFmUV
T79Fj5Rv6Fqi2cKh25kzTD/RVQt/NOKBb7hblnKrAsMCNWOOPhXNacKsXWTpBI0u
w1TAy1rFbmHC1KGuZuVYiYxgYJTkfCqybEh3SQNO+4qxBT6PZjRUIzrQjZ24xAm2
lvXjkhRl0OtS+mAWYDa3CyX+2+sSx45QDjUtpaHdfwpCv3ykoxjpnTS1TWqkdKzh
7MKeeYMH5yF/ERTdPWKNqya11DF5vaI4I10dxux1vHLSVhHFRT4dq0h2XvXL/n8S
+d3Oli3Mn7osXyyCiQCv1vhkaSzRrp3mHKzcbColB2VJPqiHPqjZ8fj7cVdmm8IB
A+25FuLXOaty21SwNsJBwCnxsIGI3t7G0tnh6fu2ZuXEAQGeKlqq6i0Wsq9H2PRs
RGX/tUhphTFXw/E23BZwd3V5igJohy+9MKJttfGxBYJDuJvHAw+pPzv+uTzVsh+d
7D4fwEbVnC9TxCOea4g=
=s+ye
-----END PGP SIGNATURE-----
Merge tag 'mmc-v4.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Ulf writes:
"MMC core:
- Avoid fragile multiblock reads for the last sector in SPI mode
WIFI/SDIO:
- libertas: Fixup suspend sequence for the SDIO card"
* tag 'mmc-v4.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
libertas: call into generic suspend code before turning off power
mmc: block: avoid multiblock reads for the last sector in SPI mode