Ido Schimmel says:
====================
mlxsw: Offload TC action pedit munge dsfield
Petr says:
The Spectrum switches allow packet prioritization based on DSCP on ingress,
and update of DSCP on egress. This is configured through the DCB APP rules.
For some use cases, assigning a custom DSCP value based on an ACL match is
a better tool. To that end, offload FLOW_ACTION_MANGLE to permit changing
of dsfield as a whole, or DSCP and ECN values in isolation.
After fixing a commentary nit in patch #1, and mlxsw naming in patch #2,
patches #3 and #4 add the offload to mlxsw.
Patch #5 adds a forwarding selftest for pedit dsfield, applicable to SW as
well as HW datapaths. Patch #6 adds a mlxsw-specific test to verify DSCP
rewrite due to DCB APP rules is not performed on pedited packets.
The tests only cover IPv4 dsfield setting. We have tests for IPv6 as well,
but would like to postpone their contribution until the corresponding
iproute patches have been accepted.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When DSCP is updated through an offloaded pedit action, DSCP rewrite on
egress should be disabled. Add a test that check that it is so.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a test that runs packets with dsfield set, and test that pedit adjusts
the DSCP or ECN parts or the whole field.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Offload action pedit ex munge when used with a flower classifier. Only
allow setting of DSCP, ECN, or the whole DSField in IPv4 and IPv6 packets.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The QOS_ACTION is used for manipulating the QOS attributes of the packet.
Add the defines and helpers related to DSCP and ECN fields, and dscp_rw.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The original idea was to reuse this set of actions for ECN rewrite as well,
but on second look, it's not such a great idea. These two items should each
have its own command. Rename the existing enum to make it obvious that it
belongs to switch_prio_cmd.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This field references FLOW_ACTION_PACKET_EDIT. Such action does not exist
though. Instead the field is used for FLOW_ACTION_MANGLE and _ADD.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In qlcnic_83xx_get_reset_instruction_template, the variable
of null test is bad, so correct it.
Signed-off-by: Xu Wang <vulab@iscas.ac.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
1) Cleanups from Dan Carpenter and wenxu.
2) Paul and Roi, Some minor updates and fixes to E-Switch to address
issues introduced in the previous reg_c0 updates series.
3) Eli Cohen simplifies and improves flow steering matching group searches
and flow table entries version management.
4) Parav Pandit, improves devlink eswitch mode changes thread safety.
By making devlink rely on driver for thread safety and introducing mlx5
eswitch mode change protection.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAl58SW8ACgkQSD+KveBX
+j4AxQf8DdrFrBD0NFTcAILS4bnTJC0I3xKRPb/2oYtWLVyJ9G5XAZqHC0DAG7xs
jy8xhIFbeUxgLEdcx0la5vdR1mPlzs4XBHTe99YwzwK/jojrA7YXrlb3kv+RXWVY
uNVAby68wh4EnO61R51ahIBXLPNbiYpo/wAWKvvBKRkOcYMVTKIFiP157AnJWObY
fxnt06I0NFaIX8Va4MEqkrmUYrI4jJcqOJC9FwRBLDhFHcFkLh0Gav3vJJ7M4BCB
ggPJpuZ4pu43qX9TtSOm8V/GlWWN0RB7PdbvliFBEHYG21hf9MfE8bPPKRlB7CO+
B5+9ULhpvbjX7yRrkZ7fd4zlQ1siew==
=Flln
-----END PGP SIGNATURE-----
Merge tag 'mlx5-updates-2020-03-25' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2020-03-25
1) Cleanups from Dan Carpenter and wenxu.
2) Paul and Roi, Some minor updates and fixes to E-Switch to address
issues introduced in the previous reg_c0 updates series.
3) Eli Cohen simplifies and improves flow steering matching group searches
and flow table entries version management.
4) Parav Pandit, improves devlink eswitch mode changes thread safety.
By making devlink rely on driver for thread safety and introducing mlx5
eswitch mode change protection.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
drivers/net/ethernet/atheros/atlx/atl2.c:40:19: warning: ‘atl2_driver_string’ defined but not used [-Wunused-const-variable=]
static const char atl2_driver_string[] = "Atheros(R) L2 Ethernet Driver";
^~~~~~~~~~~~~~~~~~
commit ea97374214 ("net/atheros: Clean atheros code from driver version")
left behind this, remove it.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the commit f73b12812a
("tipc: improve throughput between nodes in netns"), we're missing a check
to handle TIPC_DIRECT_MSG type, it's still using old sending mechanism for
this message type. So, throughput improvement is not significant as
expected.
Besides that, when sending a large message with that type, we're also
handle wrong receiving queue, it should be enqueued in socket receiving
instead of multicast messages.
Fix this by adding the missing case for TIPC_DIRECT_MSG.
Fixes: f73b12812a ("tipc: improve throughput between nodes in netns")
Reported-by: Tuong Lien <tuong.t.lien@dektech.com.au>
Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Several MAINTAINERS updates
- Memory leak regression in ODP
- Several fixes for syzkaller related crashes. Google recently taught
syzkaller to create the software RDMA devices
- Crash fixes for HFI1
- Several fixes for mlx5 crashes
- Prevent unprivileged access to an unsafe mlx5 HW resource
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAl57oakACgkQOG33FX4g
mxqLQBAAgjBxExYAK51ImMoCOXlZ9uInoACes8+e1HjIJUPhS3/WVubCMXJxAYrc
x7xtW7DUVBjNBYdQajgzWsDRrENovTq4rPscIWVSqTffWkYYubJbGXeBKTa045J6
3rsKL5Kzd0XGN0/XrXjctXpADEvEb1mjUjEMvsVmggIHneLJ5JgUdRTISwlsntmD
BbFo67DsSmQqOCx5B1az3l/RYPuWeFEYgQOAcWUC/rzsFEvEkmdqDr6FcbtkTWe1
pSqSTVQ6knDTvadwjwbpJRJg9SR/X4cHtGLOy1sP8HQ6Do671cAtFrusLb4Zf+w6
ncOll8/YPjiTe9fkTCX4Gdhn278t8cZAn2W9vrzPF8lv8tGRtlO5nXdR044NmBst
CBid7JaxzGVq0QhEEiWpV72Q8O7wQ5mARHR2x/zR2A/fdD+lqN1rjfDAqldrgYWV
5D+IFRwwwp9J+Y1pQAmTkzixItraRLD6jjzYG+qk5zopUnHBJBk67wZpWcS1zlER
SJuG0ztYn5KBSjoTGzOXfPIXpe+N3UhWxHx5P17G7UhRjbLE6oaij3TY7et/Bfj/
7esEQCLDyzEQdXJHNojyUKuyUU88N1yAC9gQbgHWPrrJBa/Ibxukq5O8dwDBpX8g
zOtSrv9i9r8g7RrKccxy0BUkrq+0OSttSopE+ssnRFTiVX3sMX0=
=RhVe
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma fixes from Jason Gunthorpe:
"A small set of late-rc patches, mostly fixes for various crashers,
some syzkaller fixes and a mlx5 HW limitation:
- Several MAINTAINERS updates
- Memory leak regression in ODP
- Several fixes for syzkaller related crashes. Google recently taught
syzkaller to create the software RDMA devices
- Crash fixes for HFI1
- Several fixes for mlx5 crashes
- Prevent unprivileged access to an unsafe mlx5 HW resource"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/mlx5: Block delay drop to unprivileged users
RDMA/mlx5: Fix access to wrong pointer while performing flush due to error
RDMA/core: Ensure security pkey modify is not lost
MAINTAINERS: Clean RXE section and add Zhu as RXE maintainer
IB/hfi1: Ensure pq is not left on waitlist
IB/rdmavt: Free kernel completion queue when done
RDMA/mad: Do not crash if the rdma device does not have a umad interface
RDMA/core: Fix missing error check on dev_set_name()
RDMA/nl: Do not permit empty devices names during RDMA_NLDEV_CMD_NEWLINK/SET
RDMA/mlx5: Fix the number of hwcounters of a dynamic counter
MAINTAINERS: Update maintainers for HISILICON ROCE DRIVER
RDMA/odp: Fix leaking the tgid for implicit ODP
When a frame is transmitted via the nl80211 TX rather than as a
normal frame, IEEE80211_TX_CTRL_PORT_CTRL_PROTO wasn't set and
this will lead to wrong decisions (rate control etc.) being made
about the frame; fix this.
Fixes: 9118064914 ("mac80211: Add support for tx_control_port")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Link: https://lore.kernel.org/r/20200326155333.f183f52b02f0.I4054e2a8c11c2ddcb795a0103c87be3538690243@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
mac80211 used to check port authorization in the Data frame enqueue case
when going through start_xmit(). However, that authorization status may
change while the frame is waiting in a queue. Add a similar check in the
dequeue case to avoid sending previously accepted frames after
authorization change. This provides additional protection against
potential leaking of frames after a station has been disconnected and
the keys for it are being removed.
Cc: stable@vger.kernel.org
Signed-off-by: Jouni Malinen <jouni@codeaurora.org>
Link: https://lore.kernel.org/r/20200326155133.ced84317ea29.I34d4c47cd8cc8a4042b38a76f16a601fbcbfd9b3@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
When cfg80211_update_assoc_bss_entry() is called, there is a
verification that the BSS channel actually changed. As some APs use
CSA also for bandwidth changes, this would result with a kernel
warning.
Fix this by removing the WARN_ON().
Signed-off-by: Ilan Peer <ilan.peer@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20200326150855.96316ada0e8d.I6710376b1b4257e5f4712fc7ab16e2b638d512aa@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
If we know that we have an encrypted link (based on having had
a key configured for TX in the past) then drop all data frames
in the key selection handler if there's no key anymore.
This fixes an issue with mac80211 internal TXQs - there we can
buffer frames for an encrypted link, but then if the key is no
longer there when they're dequeued, the frames are sent without
encryption. This happens if a station is disconnected while the
frames are still on the TXQ.
Detecting that a link should be encrypted based on a first key
having been configured for TX is fine as there are no use cases
for a connection going from with encryption to no encryption.
With extended key IDs, however, there is a case of having a key
configured for only decryption, so we can't just trigger this
behaviour on a key being configured.
Cc: stable@vger.kernel.org
Reported-by: Jouni Malinen <j@w1.fi>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20200326150855.6865c7f28a14.I9fb1d911b064262d33e33dfba730cdeef83926ca@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
There is one one corner case at dma_fence_signal_locked
which will raise the NULL pointer problem just like below.
->dma_fence_signal
->dma_fence_signal_locked
->test_and_set_bit
here trigger dma_fence_release happen due to the zero of fence refcount.
->dma_fence_put
->dma_fence_release
->drm_sched_fence_release_scheduled
->call_rcu
here make the union fled “cb_list” at finished fence
to NULL because struct rcu_head contains two pointer
which is same as struct list_head cb_list
Therefore, to hold the reference of finished fence at drm_sched_process_job
to prevent the null pointer during finished fence dma_fence_signal
[ 732.912867] BUG: kernel NULL pointer dereference, address: 0000000000000008
[ 732.914815] #PF: supervisor write access in kernel mode
[ 732.915731] #PF: error_code(0x0002) - not-present page
[ 732.916621] PGD 0 P4D 0
[ 732.917072] Oops: 0002 [#1] SMP PTI
[ 732.917682] CPU: 7 PID: 0 Comm: swapper/7 Tainted: G OE 5.4.0-rc7 #1
[ 732.918980] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
[ 732.920906] RIP: 0010:dma_fence_signal_locked+0x3e/0x100
[ 732.938569] Call Trace:
[ 732.939003] <IRQ>
[ 732.939364] dma_fence_signal+0x29/0x50
[ 732.940036] drm_sched_fence_finished+0x12/0x20 [gpu_sched]
[ 732.940996] drm_sched_process_job+0x34/0xa0 [gpu_sched]
[ 732.941910] dma_fence_signal_locked+0x85/0x100
[ 732.942692] dma_fence_signal+0x29/0x50
[ 732.943457] amdgpu_fence_process+0x99/0x120 [amdgpu]
[ 732.944393] sdma_v4_0_process_trap_irq+0x81/0xa0 [amdgpu]
v2: hold the finished fence at drm_sched_process_job instead of
amdgpu_fence_process
v3: resume the blank line
Signed-off-by: Yintian Tao <yttao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The original single target IPI fastpath patch forgot to filter the
ICR destination shorthand field. Multicast IPI is not suitable for
this feature since wakeup the multiple sleeping vCPUs will extend
the interrupt disabled time, it especially worse in the over-subscribe
and VM has a little bit more vCPUs scenario. Let's narrow it down to
single target IPI.
Two VMs, each is 76 vCPUs, one running 'ebizzy -M', the other
running cyclictest on all vCPUs, w/ this patch, the avg score
of cyclictest can improve more than 5%. (pv tlb, pv ipi, pv
sched yield are disabled during testing to avoid the disturb).
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1585189202-1708-3-git-send-email-wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We add enable dynamic suspend (autosuspend) support in host driver, and
it can let platform cut down idle power consumption.
To support autosuspend feature in host driver, kernel need to be built
with CONFIG_USB_SUSPEND and autosuspend need to be turn on.
And we also replace wowl feature with adding "needs_remote_wakeup", so
that host still can be waken by wireless device.
Signed-off-by: Wright Feng <wright.feng@cypress.com>
Signed-off-by: Chi-Hsien Lin <chi-hsien.lin@cypress.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1585124429-97371-6-git-send-email-chi-hsien.lin@cypress.com
Will enable FMAC to push more packets to bus tx queue and help
improve throughput when fws queuing is enabled. This change is
required to tune the throughput for passing WMM CERT tests.
Signed-off-by: Madhan Mohan R <madhanmohan.r@cypress.com>
Signed-off-by: Chi-hsien Lin <chi-hsien.lin@cypress.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1585124429-97371-5-git-send-email-chi-hsien.lin@cypress.com
The function brcmf_inform_single_bss returns the value as success,
even when the length exceeds the maximum value.
The fix is to send appropriate code on this error.
This issue is observed when Cypress test group reported random fmac
crashes when running their tests and the path was identified from the
crash logs. With this fix the random failure issue in Cypress test group
was resolved.
Reviewed-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Signed-off-by: Raveendran Somu <raveendran.somu@cypress.com>
Signed-off-by: Chi-hsien Lin <chi-hsien.lin@cypress.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1585124429-97371-4-git-send-email-chi-hsien.lin@cypress.com
When the brcmf_fws_process_skb() fails to get hanger slot for
queuing the skb, it tries to free the skb.
But the caller brcmf_netdev_start_xmit() of that funciton frees
the packet on error return value.
This causes the double freeing and which caused the kernel crash.
Signed-off-by: Raveendran Somu <raveendran.somu@cypress.com>
Signed-off-by: Chi-hsien Lin <chi-hsien.lin@cypress.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1585124429-97371-3-git-send-email-chi-hsien.lin@cypress.com
When the control transfer gets timed out, the error status
was returned without killing that urb, this leads to using
the same urb. This issue causes the kernel crash as the same
urb is sumbitted multiple times. The fix is to kill the
urb for timeout transfer before returning error
Signed-off-by: Raveendran Somu <raveendran.somu@cypress.com>
Signed-off-by: Chi-hsien Lin <chi-hsien.lin@cypress.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1585124429-97371-2-git-send-email-chi-hsien.lin@cypress.com
The nl80211 commands such as 'iw link' can't get current txrate
information from the driver. This commit fills in the tx rate
information from the C2H RA report in the sta_statistics function.
Signed-off-by: Chris Chiu <chiu@endlessm.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200320063833.1058-3-chiu@endlessm.com
There's a data field in H2C and C2H commands which is used to
carry channel bandwidth information. Add enumeration to make it
more descriptive in code.
Signed-off-by: Chris Chiu <chiu@endlessm.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200320063833.1058-2-chiu@endlessm.com
Sometimes we need to stop the coex mechanism to debug, so that we
can manually control the device through various outer commands.
Hence, add a new debugfs coex_enable to allow us to enable/disable
the coex mechanism when driver is running.
To disable coex
echo 0 > /sys/kernel/debug/ieee80211/phyX/rtw88/coex_enable
To enable coex
echo 1 > /sys/kernel/debug/ieee80211/phyX/rtw88/coex_enable
To check coex dm is enabled or not
cat /sys/kernel/debug/ieee80211/phyX/rtw88/coex_enable
Signed-off-by: Yan-Hsuan Chuang <yhchuang@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200313033008.20070-3-yhchuang@realtek.com
Add a new entry "coex_info" in debugfs to dump coex's states for
us to debug on coex's issues.
The basic concept for co-existence (coex, usually for WiFi + BT)
is to decide a strategy based on the current status of WiFi and
BT. So, it means the WiFi driver requires to gather information
from BT side and choose a strategy (TDMA/table/HW settings).
Althrough we can easily check the current status of WiFi, e.g.,
from kernel log or just dump the hardware registers, it is still
very difficult for us to gather so many different types of WiFi
states (such as RFE config, antenna, channel/band, TRX, Power
save). Also we will need BT's information that is stored in
"struct rtw_coex". So it is necessary for us to have a debugfs
that can dump all of the WiFi/BT information required.
Note that to debug on coex related issues, we usually need a
longer period of time of coex_info dump every 2 seconds (for
example, 30 secs, so we should have 15 times of coex_info's
dump).
Signed-off-by: Yan-Hsuan Chuang <yhchuang@realtek.com>
Reviewed-by: Chris Chiu <chiu@endlessm.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20200313033008.20070-2-yhchuang@realtek.com
Currently eswitch mode change is occurring from 2 different execution
contexts as below.
1. sriov sysfs enable/disable
2. devlink eswitch set commands
Both of them need to access eswitch related data structures in
synchronized manner.
Without any synchronization below race condition exist.
SR-IOV enable/disable with devlink eswitch mode change:
cpu-0 cpu-1
----- -----
mlx5_device_disable_sriov() mlx5_devlink_eswitch_mode_set()
mlx5_eswitch_disable() esw_offloads_stop()
esw_offloads_disable() mlx5_eswitch_disable()
esw_offloads_disable()
Hence, they are synchronized using a new mode_lock.
eswitch's state_lock is not used as it can lead to a deadlock scenario
below and state_lock is only for vport and fdb exclusive access.
ip link set vf <param>
netlink rcv_msg() - Lock A
rtnl_lock
vfinfo()
esw->state_lock() - Lock B
devlink eswitch_set
devlink_mutex
esw->state_lock() - Lock B
attach_netdev()
register_netdev()
rtnl_lock - Lock A
Alternatives considered:
1. Acquiring rtnl lock before taking esw->state_lock to follow similar
locking sequence as ip link flow during eswitch mode set.
rtnl lock is not good idea for two reasons.
(a) Holding rtnl lock for several hundred device commands is not good
idea.
(b) It leads to below and more similar deadlocks.
devlink eswitch_set
devlink_mutex
rtnl_lock - Lock A
esw->state_lock() - Lock B
eswitch_disable()
reload()
ib_register_device()
ib_cache_setup_one()
rtnl_lock()
2. Exporting devlink lock may lead to undesired use of it in vendor
driver(s) in future.
3. Unloading representors outside of the mode_lock requires
serialization with other process trying to enable the eswitch.
4. Differing the representors life cycle to a different workqueue
requires synchronization with func_change_handler workqueue.
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Bodong Wang <bodong@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Subsequent patch protects eswitch mode changes across sriov and devlink
interfaces. It is desirable for eswitch to provide thread safe eswitch
enable and disable APIs.
Hence, extend eswitch enable API to optionally update num_vfs when
requested.
In subsequent patch, eswitch num_vfs are updated after all the eswitch
users eswitch drops its reference count.
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Bodong Wang <bodong@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
In order to check eswitch state under a lock, prepare code to split
capability check and eswitch state check into two helper functions.
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Bodong Wang <bodong@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
devlink_nl_cmd_eswitch_set_doit() doesn't hold devlink->lock mutex while
invoking driver callback. This is likely due to eswitch mode setting
involves adding/remove devlink ports, health reporters or
other devlink objects for a devlink device.
So it is driver responsiblity to ensure thread safe eswitch state
transition happening via either sriov legacy enablement or via devlink
eswitch set callback.
Therefore, get() callback should also be invoked without holding
devlink->lock mutex.
Vendor driver can use same internal lock which it uses during eswitch
mode set() callback.
This makes get() and set() implimentation symmetric in devlink core and
in vendor drivers.
Hence, remove holding devlink->lock mutex during eswitch get() callback.
Failing to do so results into below deadlock scenario when mlx5_core
driver is improved to handle eswitch mode set critical section invoked
by devlink and sriov sysfs interface in subsequent patch.
devlink_nl_cmd_eswitch_set_doit()
mlx5_eswitch_mode_set()
mutex_lock(esw->mode_lock) <- Lock A
[...]
register_devlink_port()
mutex_lock(&devlink->lock); <- lock B
mutex_lock(&devlink->lock); <- lock B
devlink_nl_cmd_eswitch_get_doit()
mlx5_eswitch_mode_get()
mutex_lock(esw->mode_lock) <- Lock A
In subsequent patch, mlx5_core driver uses its internal lock during
get() and set() eswitch callbacks.
Other drivers have been inspected which returns either constant during
get operations or reads the value from already allocated structure.
Hence it is safe to remove the lock in get( ) callback and let vendor
driver handle it.
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
mlx5_register_device() doesn't check for any error and always returns 0.
Simplify mlx5_register_device() to return void and its caller, remove
dead code related to it.
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Group version is used when modifying a rule is allowed
(FLOW_ACT_NO_APPEND is clear) to detect a case where the rule was found
but while the groups where unlocked a new FTE was added. In this case,
the added FTE could be one with identical match value so we need to
attempt again with group lock held.
Change the code so version is retrieved only when FLOW_ACT_NO_APPEND is
cleared. As result, later compare can also be avoided if FLOW_ACT_NO_APPEND
is cleared.
Also improve comments text.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
FTE version is not used anywhere in the code so avoid incrementing it.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
When adding a rule to a flow group we need increment the version of the
group. Current code fails to do that and as a result, when trying to add
a rule, we will fail to discover a case where an FTE with the same match
value was added while we scanned the groups of the same match criteria,
thus we may try to add an FTE that was already added.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Instead of using two different structs for searching groups with the
same match, use a single struct and thus simplify the code, make it more
readable and smaller size which means less code cache misses.
text data bss dec hex
before: 35524 2744 0 38268 957c
after: 35038 2744 0 37782 9396
When testing add 70000 rules, delete all the rules, and repeat three
times taking the average, we get (time in seconds):
Before the change: insert 16.80, delete 11.02
After the change: insert 16.55, delete 10.95
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The correct type is u32.
Fixes: d18296ffd9 ("net/mlx5: E-Switch, Introduce global tables")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
We allocate a temporary memory but forget to free it.
Fixes: 11b717d615 ("net/mlx5: E-Switch, Get reg_c0 value on CQE")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Register c0 loopback is needed to fully support chains and prios.
Enable chains and prio only if loopback (of reg c1 which came together
with c0), is enabled. To be able to check that, move enabling of loopback
before eswitch chains init.
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reg c0/c1 matching, rewrite of regs c0/c1, and copy header of regs c1,B
is needed for the restore table to function, might not be supported by
firmware, and creation of the restore table or the copy header will
fail.
Check reg_c1 loopback support, as firmware which supports this,
should have all of the above.
Fixes: 11b717d615 ("net/mlx5: E-Switch, Get reg_c0 value on CQE")
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The function mlx5e_rep_setup_ft_cb check chain_index is zero twice.
Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The actions_match_supported() function returns a bool, true for success
and false for failure. This error path is returning a negative which
is cast to true but it should return false.
Fixes: 4c3844d9e9 ("net/mlx5e: CT: Introduce connection tracking")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Overlapping header include additions in macsec.c
A bug fix in 'net' overlapping with the removal of 'version'
string in ena_netdev.c
Overlapping test additions in selftests Makefile
Overlapping PCI ID table adjustments in iwlwifi driver.
Signed-off-by: David S. Miller <davem@davemloft.net>
The imx SC api strongly assumes that messages are composed out of
4-bytes words but some of our message structs have odd sizeofs.
This produces many oopses with CONFIG_KASAN=y.
Fix by marking with __aligned(4).
Fixes: 666aed2d13 ("clk: imx: scu: add set parent support")
Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
Link: https://lkml.kernel.org/r/aad021e432b3062c142973d09b766656eec18fde.1582216144.git.leonard.crestez@nxp.com
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
The imx SC api strongly assumes that messages are composed out of
4-bytes words but some of our message structs have odd sizeofs.
This produces many oopses with CONFIG_KASAN=y.
Fix by marking with __aligned(4).
Fixes: fe37b48204 ("clk: imx: add scu clock common part")
Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
Link: https://lkml.kernel.org/r/10e97a04980d933b2cfecb6b124bf9046b6e4f16.1582216144.git.leonard.crestez@nxp.com
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
I copy/pasted these macros and forgot to update the argument
names and where they're passed to. Fix it so that these macros make
sense.
Reported-by: Maxime Ripard <maxime@cerno.tech>
Fixes: 194efb6e26 ("clk: gate: Add support for specifying parents via DT/pointers")
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Link: https://lkml.kernel.org/r/20200325022257.148244-1-sboyd@kernel.org
Tested-by: Maxime Ripard <mripard@kernel.org>
Pull networking fixes from David Miller:
1) Fix deadlock in bpf_send_signal() from Yonghong Song.
2) Fix off by one in kTLS offload of mlx5, from Tariq Toukan.
3) Add missing locking in iwlwifi mvm code, from Avraham Stern.
4) Fix MSG_WAITALL handling in rxrpc, from David Howells.
5) Need to hold RTNL mutex in tcindex_partial_destroy_work(), from Cong
Wang.
6) Fix producer race condition in AF_PACKET, from Willem de Bruijn.
7) cls_route removes the wrong filter during change operations, from
Cong Wang.
8) Reject unrecognized request flags in ethtool netlink code, from
Michal Kubecek.
9) Need to keep MAC in reset until PHY is up in bcmgenet driver, from
Doug Berger.
10) Don't leak ct zone template in act_ct during replace, from Paul
Blakey.
11) Fix flushing of offloaded netfilter flowtable flows, also from Paul
Blakey.
12) Fix throughput drop during tx backpressure in cxgb4, from Rahul
Lakkireddy.
13) Don't let a non-NULL skb->dev leave the TCP stack, from Eric
Dumazet.
14) TCP_QUEUE_SEQ socket option has to update tp->copied_seq as well,
also from Eric Dumazet.
15) Restrict macsec to ethernet devices, from Willem de Bruijn.
16) Fix reference leak in some ethtool *_SET handlers, from Michal
Kubecek.
17) Fix accidental disabling of MSI for some r8169 chips, from Heiner
Kallweit.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (138 commits)
net: Fix CONFIG_NET_CLS_ACT=n and CONFIG_NFT_FWD_NETDEV={y, m} build
net: ena: Add PCI shutdown handler to allow safe kexec
selftests/net/forwarding: define libs as TEST_PROGS_EXTENDED
selftests/net: add missing tests to Makefile
r8169: re-enable MSI on RTL8168c
net: phy: mdio-bcm-unimac: Fix clock handling
cxgb4/ptp: pass the sign of offset delta in FW CMD
net: dsa: tag_8021q: replace dsa_8021q_remove_header with __skb_vlan_pop
net: cbs: Fix software cbs to consider packet sending time
net/mlx5e: Do not recover from a non-fatal syndrome
net/mlx5e: Fix ICOSQ recovery flow with Striding RQ
net/mlx5e: Fix missing reset of SW metadata in Striding RQ reset
net/mlx5e: Enhance ICOSQ WQE info fields
net/mlx5_core: Set IB capability mask1 to fix ib_srpt connection failure
selftests: netfilter: add nfqueue test case
netfilter: nft_fwd_netdev: allow to redirect to ifb via ingress
netfilter: nft_fwd_netdev: validate family and chain type
netfilter: nft_set_rbtree: Detect partial overlaps on insertion
netfilter: nft_set_rbtree: Introduce and use nft_rbtree_interval_start()
netfilter: nft_set_pipapo: Separate partial and complete overlap cases on insertion
...