Zheyu Ma [Sat, 16 Oct 2021 04:02:59 +0000 (04:02 +0000)]
mwl8k: Fix use-after-free in mwl8k_fw_state_machine()
When the driver fails to request the firmware, it calls its error
handler. In the error handler, the driver detaches device from driver
first before releasing the firmware, which can cause a use-after-free bug.
Fix this by releasing firmware first.
The following log reveals it:
[ 9.007301 ] BUG: KASAN: use-after-free in mwl8k_fw_state_machine+0x320/0xba0
[ 9.010143 ] Workqueue: events request_firmware_work_func
[ 9.010830 ] Call Trace:
[ 9.010830 ] dump_stack_lvl+0xa8/0xd1
[ 9.010830 ] print_address_description+0x87/0x3b0
[ 9.010830 ] kasan_report+0x172/0x1c0
[ 9.010830 ] ? mutex_unlock+0xd/0x10
[ 9.010830 ] ? mwl8k_fw_state_machine+0x320/0xba0
[ 9.010830 ] ? mwl8k_fw_state_machine+0x320/0xba0
[ 9.010830 ] __asan_report_load8_noabort+0x14/0x20
[ 9.010830 ] mwl8k_fw_state_machine+0x320/0xba0
[ 9.010830 ] ? mwl8k_load_firmware+0x5f0/0x5f0
[ 9.010830 ] request_firmware_work_func+0x172/0x250
[ 9.010830 ] ? read_lock_is_recursive+0x20/0x20
[ 9.010830 ] ? process_one_work+0x7a1/0x1100
[ 9.010830 ] ? request_firmware_nowait+0x460/0x460
[ 9.010830 ] ? __this_cpu_preempt_check+0x13/0x20
[ 9.010830 ] process_one_work+0x9bb/0x1100
Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1634356979-6211-1-git-send-email-zheyuma97@gmail.com
Ziyang Xuan [Fri, 15 Oct 2021 04:03:35 +0000 (12:03 +0800)]
rsi: stop thread firstly in rsi_91x_init() error handling
When fail to init coex module, free 'common' and 'adapter' directly, but
common->tx_thread which will access 'common' and 'adapter' is running at
the same time. That will trigger the UAF bug.
==================================================================
BUG: KASAN: use-after-free in rsi_tx_scheduler_thread+0x50f/0x520 [rsi_91x]
Read of size 8 at addr
ffff8880076dc000 by task Tx-Thread/124777
CPU: 0 PID: 124777 Comm: Tx-Thread Not tainted 5.15.0-rc5+ #19
Call Trace:
dump_stack_lvl+0xe2/0x152
print_address_description.constprop.0+0x21/0x140
? rsi_tx_scheduler_thread+0x50f/0x520
kasan_report.cold+0x7f/0x11b
? rsi_tx_scheduler_thread+0x50f/0x520
rsi_tx_scheduler_thread+0x50f/0x520
...
Freed by task 111873:
kasan_save_stack+0x1b/0x40
kasan_set_track+0x1c/0x30
kasan_set_free_info+0x20/0x30
__kasan_slab_free+0x109/0x140
kfree+0x117/0x4c0
rsi_91x_init+0x741/0x8a0 [rsi_91x]
rsi_probe+0x9f/0x1750 [rsi_usb]
Stop thread before free 'common' and 'adapter' to fix it.
Fixes:
2108df3c4b18 ("rsi: add coex support")
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211015040335.1021546-1-william.xuanziyang@huawei.com
Ryder Lee [Wed, 13 Oct 2021 15:56:41 +0000 (23:56 +0800)]
MAINTAINERS: mt76: update MTK folks
Add more MTK folks to actively maintain the wireless chipsets across
segments. The work is becoming increasingly complicated and various
and we can provides hardware related perspectives to offload
Felix's workload, especially for the 11ax and upcoming 11be devices
Signed-off-by: Ryder Lee <ryder.lee@mediatek.com>
Acked-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/eb888ae0e43a980c2c1aaed372a9b5e8098ea4ef.1634107511.git.ryder.lee@mediatek.com
Colin Ian King [Fri, 15 Oct 2021 15:21:13 +0000 (16:21 +0100)]
rtw89: Remove redundant check of ret after call to rtw89_mac_enable_bb_rf
The function rtw89_mac_enable_bb_rf is a void return type, so there is
no return error code to ret, so the following check for an error in ret
is redundant dead code and can be removed.
Addresses-Coverity: ("Logically dead code")
Fixes:
e3ec7017f6a2 ("rtw89: add Realtek 802.11ax driver")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211015152113.33179-1-colin.king@canonical.com
Colin Ian King [Fri, 15 Oct 2021 10:50:04 +0000 (11:50 +0100)]
rtw89: Fix two spelling mistakes in debug messages
There are two spelling mistakes in rtw89_debug messages. Fix them.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211015105004.11817-1-colin.king@canonical.com
Ping-Ke Shih [Wed, 13 Oct 2021 09:28:27 +0000 (17:28 +0800)]
MAINTAINERS: add rtw89 wireless driver
Add maintainer and email to MAINTAINERS file.
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211013092827.43642-1-pkshih@realtek.com
Jonas Dreßler [Mon, 11 Oct 2021 13:32:24 +0000 (15:32 +0200)]
mwifiex: Try waking the firmware until we get an interrupt
It seems that the PCIe+USB firmware (latest version 15.68.19.p21) of the
88W8897 card sometimes ignores or misses when we try to wake it up by
writing to the firmware status register. This leads to the firmware
wakeup timeout expiring and the driver resetting the card because we
assume the firmware has hung up or crashed.
Turns out that the firmware actually didn't hang up, but simply "missed"
our wakeup request and didn't send us an interrupt with an AWAKE event.
Trying again to read the firmware status register after a short timeout
usually makes the firmware wake up as expected, so add a small retry
loop to mwifiex_pm_wakeup_card() that looks at the interrupt status to
check whether the card woke up.
The number of tries and timeout lengths for this were determined
experimentally: The firmware usually takes about 500 us to wake up
after we attempt to read the status register. In some cases where the
firmware is very busy (for example while doing a bluetooth scan) it
might even miss our requests for multiple milliseconds, which is why
after 15 tries the waiting time gets increased to 10 ms. The maximum
number of tries it took to wake the firmware when testing this was
around 20, so a maximum number of 50 tries should give us plenty of
safety margin.
Here's a reproducer for those firmware wakeup failures I've found:
1) Make sure wifi powersaving is enabled (iw dev wlp1s0 set power_save on)
2) Connect to any wifi network (makes firmware go into wifi powersaving
mode, not deep sleep)
3) Make sure bluetooth is turned off (to ensure the firmware actually
enters powersave mode and doesn't keep the radio active doing bluetooth
stuff)
4) To confirm that wifi powersaving is entered ping a device on the LAN,
pings should be a few ms higher than without powersaving
5) Run "while true; do iwconfig; sleep 0.0001; done", this wakes and
suspends the firmware extremely often
6) Wait until things explode, for me it consistently takes <5 minutes
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=109681
Cc: stable@vger.kernel.org
Signed-off-by: Jonas Dreßler <verdre@v0yd.nl>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211011133224.15561-3-verdre@v0yd.nl
Jonas Dreßler [Mon, 11 Oct 2021 13:32:23 +0000 (15:32 +0200)]
mwifiex: Read a PCI register after writing the TX ring write pointer
On the 88W8897 PCIe+USB card the firmware randomly crashes after setting
the TX ring write pointer. The issue is present in the latest firmware
version 15.68.19.p21 of the PCIe+USB card.
Those firmware crashes can be worked around by reading any PCI register
of the card after setting that register, so read the PCI_VENDOR_ID
register here. The reason this works is probably because we keep the bus
from entering an ASPM state for a bit longer, because that's what causes
the cards firmware to crash.
This fixes a bug where during RX/TX traffic and with ASPM L1 substates
enabled (the specific substates where the issue happens appear to be
platform dependent), the firmware crashes and eventually a command
timeout appears in the logs.
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=109681
Cc: stable@vger.kernel.org
Signed-off-by: Jonas Dreßler <verdre@v0yd.nl>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211011133224.15561-2-verdre@v0yd.nl
Christophe JAILLET [Sun, 10 Oct 2021 07:09:11 +0000 (09:09 +0200)]
wireless: Remove redundant 'flush_workqueue()' calls
'destroy_workqueue()' already drains the queue before destroying it, so
there is no need to flush it explicitly.
Remove the redundant 'flush_workqueue()' calls.
This was generated with coccinelle:
@@
expression E;
@@
- flush_workqueue(E);
destroy_workqueue(E);
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/0855d51423578ad019c0264dad3fe47a2e8af9c7.1633849511.git.christophe.jaillet@wanadoo.fr
Colin Ian King [Thu, 7 Oct 2021 23:41:53 +0000 (00:41 +0100)]
mt7601u: Remove redundant initialization of variable ret
The variable ret is being initialized with a value that is never read,
it is assigned later on with a different value. The initialization is
redundant and can be removed.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Jakub Kicinski <kubakici@wp.pl>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211007234153.31222-1-colin.king@canonical.com
Colin Ian King [Thu, 7 Oct 2021 16:37:22 +0000 (17:37 +0100)]
rtlwifi: rtl8192ee: Remove redundant initialization of variable version
The variable version is being initialized with a value that is
never read, it is being updated afterwards in both branches of
an if statement. The assignment is redundant and can be removed.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211007163722.20165-1-colin.king@canonical.com
Ping-Ke Shih [Mon, 11 Oct 2021 11:47:27 +0000 (14:47 +0300)]
rtw89: add Realtek 802.11ax driver
This driver named rtw89, which is the next generation of rtw88, supports
Realtek 8852AE 802.11ax 2x2 chip whose new features are OFDMA, DBCC,
Spatial reuse, TWT and BSS coloring; now some of them aren't implemented
though.
The chip architecture is entirely different from the chips supported by
rtw88 like RTL8822CE 802.11ac chip. First of all, register address ranges
are totally redefined, so it's impossible to reuse register definition. To
communicate with firmware, new H2C/C2H format is proposed. In order to have
better utilization, TX DMA flow is changed to two stages DMA. To provide
rich RX status information, additional RX PPDU packets are added.
Since there are so many differences mentioned above, we decide to propose
a new driver. It has many authors, they are listed in alphabetic order:
Chin-Yen Lee <timlee@realtek.com>
Ping-Ke Shih <pkshih@realtek.com>
Po Hao Huang <phhuang@realtek.com>
Tzu-En Huang <tehuang@realtek.com>
Vincent Fann <vincent_fann@realtek.com>
Yan-Hsuan Chuang <tony0620emma@gmail.com>
Zong-Zhe Yang <kevin_yang@realtek.com>
Tested-by: Aaron Ma <aaron.ma@canonical.com>
Tested-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211008035627.19463-1-pkshih@realtek.com
Dan Carpenter [Wed, 6 Oct 2021 07:36:22 +0000 (10:36 +0300)]
b43: fix a lower bounds test
The problem is that "channel" is an unsigned int, when it's less 5 the
value of "channel - 5" is not a negative number as one would expect but
is very high positive value instead.
This means that "start" becomes a very high positive value. The result
of that is that we never enter the "for (i = start; i <= end; i++) {"
loop. Instead of storing the result from b43legacy_radio_aci_detect()
it just uses zero.
Fixes:
ef1a628d83fc ("b43: Implement dynamic PHY API")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Michael Büsch <m@bues.ch>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211006073621.GE8404@kili
Dan Carpenter [Wed, 6 Oct 2021 07:35:42 +0000 (10:35 +0300)]
b43legacy: fix a lower bounds test
The problem is that "channel" is an unsigned int, when it's less 5 the
value of "channel - 5" is not a negative number as one would expect but
is very high positive value instead.
This means that "start" becomes a very high positive value. The result
of that is that we never enter the "for (i = start; i <= end; i++) {"
loop. Instead of storing the result from b43legacy_radio_aci_detect()
it just uses zero.
Fixes:
75388acd0cd8 ("[B43LEGACY]: add mac80211-based driver for legacy BCM43xx devices")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Michael Büsch <m@bues.ch>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211006073542.GD8404@kili
Subbaraya Sundeep [Sun, 10 Oct 2021 10:09:35 +0000 (15:39 +0530)]
octeontx2-pf: Simplify the receive buffer size calculation
This patch separates the logic of configuring hardware
maximum transmit frame size and receive frame size.
This simplifies the logic to calculate receive buffer
size and using cqe descriptor of different size.
Also additional size of skb_shared_info structure is
allocated for each receive buffer pointer given to
hardware which is not necessary. Hence change the
size calculation to remove the size of
skb_shared_info. Add a check for array out of
bounds while adding fragments to the network stack.
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christophe JAILLET [Sun, 10 Oct 2021 07:01:32 +0000 (09:01 +0200)]
ethernet: Remove redundant 'flush_workqueue()' calls
'destroy_workqueue()' already drains the queue before destroying it, so
there is no need to flush it explicitly.
Remove the redundant 'flush_workqueue()' calls.
This was generated with coccinelle:
@@
expression E;
@@
- flush_workqueue(E);
destroy_workqueue(E);
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com> #mlx*
Signed-off-by: David S. Miller <davem@davemloft.net>
Li RongQing [Sat, 9 Oct 2021 09:32:43 +0000 (17:32 +0800)]
virtio_net: skip RCU read lock by checking xdp_enabled of vi
networking benchmark shows that __rcu_read_lock and
__rcu_read_unlock takes some cpu cycles, and we can avoid
calling them partially in virtio rx path by check xdp_enabled
of vi, and xdp is disabled most of time
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Antoine Tenart [Fri, 8 Oct 2021 14:21:03 +0000 (16:21 +0200)]
net: make dev_get_port_parent_id slightly more readable
Cosmetic commit making dev_get_port_parent_id slightly more readable.
There is no need to split the condition to return after calling
devlink_compat_switch_id_get and after that 'recurse' is always true.
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ansuel Smith [Sat, 9 Oct 2021 22:46:18 +0000 (00:46 +0200)]
net: phy: at803x: better describe debug regs
Give a name to known debug regs from Documentation instead of using
unknown hex values.
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ansuel Smith [Sat, 9 Oct 2021 22:46:17 +0000 (00:46 +0200)]
net: phy: at803x: enable prefer master for 83xx internal phy
From original QCA source code the port was set to prefer master as port
type in 1000BASE-T mode. Apply the same settings also here.
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ansuel Smith [Sat, 9 Oct 2021 22:46:16 +0000 (00:46 +0200)]
net: phy: at803x: add DAC amplitude fix for 8327 phy
QCA8327 internal phy require DAC amplitude adjustement set to +6% with
100m speed. Also add additional define to report a change of the same
reg in QCA8337. (different scope it does set 1000m voltage)
Add link_change_notify function to set the proper amplitude adjustement
on PHY_RUNNING state and disable on any other state.
Fixes:
b4df02b562f4 ("net: phy: at803x: add support for qca 8327 A variant internal phy")
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ansuel Smith [Sat, 9 Oct 2021 22:46:15 +0000 (00:46 +0200)]
net: phy: at803x: fix resume for QCA8327 phy
From Documentation phy resume triggers phy reset and restart
auto-negotiation. Add a dedicated function to wait reset to finish as
it was notice a regression where port sometime are not reliable after a
suspend/resume session. The reset wait logic is copied from phy_poll_reset.
Add dedicated suspend function to use genphy_suspend only with QCA8337
phy and set only additional debug settings for QCA8327. With more test
it was reported that QCA8327 doesn't proprely support this mode and
using this cause the unreliability of the switch ports, especially the
malfunction of the port0.
Fixes:
15b9df4ece17 ("net: phy: at803x: add resume/suspend function to qca83xx phy")
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 10 Oct 2021 10:18:48 +0000 (11:18 +0100)]
Merge branch 'net-use-helpers'
Juhee Kang says:
====================
net-next: replace open code with helper functions
Currently, there are many helper functions on netdevice.h. However,
some code doesn't use the helper functions and remains open code.
So this patchset replaces open code with an appropriate helper function.
First patch modifies to use netif_is_rxfh_configured instead of
dev->priv_flags & IFF_RXFH_CONFIGURED.
Second patch replaces open code with netif_is_bond_master.
Last patch substitutes netif_is_macsec() for dev->priv_flags & IFF_MACSEC.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Juhee Kang [Sun, 10 Oct 2021 04:03:29 +0000 (13:03 +0900)]
mlxsw: spectrum: use netif_is_macsec() instead of open code
Open code which is dev->priv_flags & IFF_MACSEC has already defined as
netif_is_macsec(). So use netif_is_macsec() instead of open code.
This patch doesn't change logic.
Signed-off-by: Juhee Kang <claudiajkang@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Juhee Kang [Sun, 10 Oct 2021 04:03:28 +0000 (13:03 +0900)]
hv_netvsc: use netif_is_bond_master() instead of open code
Use netif_is_bond_master() function instead of open code, which is
((event_dev->priv_flags & IFF_BONDING) && (event_dev->flags & IFF_MASTER)).
This patch doesn't change logic.
Signed-off-by: Juhee Kang <claudiajkang@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Juhee Kang [Sun, 10 Oct 2021 04:03:27 +0000 (13:03 +0900)]
bnxt: use netif_is_rxfh_configured instead of open code
The open code which is dev->priv_flags & IFF_RXFH_CONFIGURED is defined as
a helper function on netdevice.h. So use netif_is_rxfh_configured()
function instead of open code. This patch doesn't change logic.
Signed-off-by: Juhee Kang <claudiajkang@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 10 Oct 2021 09:42:47 +0000 (10:42 +0100)]
Merge branch 'ionic-vlanid-mgmt'
Shannon Nelson says:
====================
ionic: add vlanid overflow management
Add vlans to the existing rx_filter_sync mechanics currently
used for managing mac filters.
Older versions of our firmware had no enforced limits on the
number of vlans that the driver could request, but requesting
large numbers of vlans caused issues in FW memory management,
so an arbitrary limit was added in the FW. The FW now
returns -ENOSPC when it hits that limit, which the driver
needs to handle.
Unfortunately, the FW doesn't advertise the vlan id limit,
as it does with mac filters, so the driver won't know the
limit until it bumps into it. We'll grab the current vlan id
count and use that as the limit from there on and thus prevent
getting any more -ENOSPC errors.
Just as is done for the mac filters, the device puts the device
into promiscuous mode when -ENOSPC is seen for vlan ids, and
the driver will track the vlans that aren't synced to the FW.
When vlans are removed, the driver will retry the un-synced
vlans. If all outstanding vlans are synced, the promiscuous
mode will be disabled.
The first 6 patches rework the existing filter management to
make it flexible enough for additional filter types. Next
we add the vlan ids into the management. The last 2 patches
allow us to catch the max vlan -ENOSPC error without adding
an unnecessary error message to the kernel log.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Shannon Nelson [Sat, 9 Oct 2021 18:45:23 +0000 (11:45 -0700)]
ionic: tame the filter no space message
Override the automatic AdminQ error message in order to
capture the potential No Space message when we hit the
max vlan limit, and add additional messaging to detail
what filter failed.
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shannon Nelson [Sat, 9 Oct 2021 18:45:22 +0000 (11:45 -0700)]
ionic: allow adminq requests to override default error message
The AdminQ handler has an error handler that automatically prints
an error message when the request has failed. However, there are
situations where the caller can expect that it might fail and has
an alternative strategy, thus may not want the error message sent
to the log, such as hitting -ENOSPC when adding a new vlan id.
We add a new interface to the AdminQ API to allow for override of
the default behavior, and an interface to the use standard error
message formatting.
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shannon Nelson [Sat, 9 Oct 2021 18:45:21 +0000 (11:45 -0700)]
ionic: handle vlan id overflow
Add vlans to the existing rx_filter_sync mechanics currently
used for managing mac filters.
Older versions of our firmware had no enforced limits on the
number of vlans that the LIF could request, but requesting large
numbers of vlans caused issues in FW memory management, so an
arbitrary limit was added in the FW. The FW now returns -ENOSPC
when it hits that limit, which the driver needs to handle.
Unfortunately, the FW doesn't advertise the vlan id limit,
as it does with mac filters, so the driver won't know the
limit until it bumps into it. We'll grab the current vlan id
count and use that as the limit from there on and thus prevent
getting any more -ENOSPC errors.
Just as is done for the mac filters, the device puts the device
into promiscuous mode when -ENOSPC is seen for vlan ids, and
the driver will track the vlans that aren't synced to the FW.
When vlans are removed, the driver will retry the un-synced
vlans. If all outstanding vlans are synced, the promiscuous
mode will be disabled.
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shannon Nelson [Sat, 9 Oct 2021 18:45:20 +0000 (11:45 -0700)]
ionic: generic filter delete
Similar to the filter add, make a generic filter delete.
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shannon Nelson [Sat, 9 Oct 2021 18:45:19 +0000 (11:45 -0700)]
ionic: generic filter add
In preparation for adding vlan overflow management, rework
the ionic_lif_addr_add() function to something a little more
generic that can be used for other filter types.
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shannon Nelson [Sat, 9 Oct 2021 18:45:18 +0000 (11:45 -0700)]
ionic: add generic filter search
In preparation for enhancing vlan filter management,
add a filter search routine that can figure out for
itself which type of filter search is needed.
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shannon Nelson [Sat, 9 Oct 2021 18:45:17 +0000 (11:45 -0700)]
ionic: remove mac overflow flags
The overflow flags really aren't useful and we don't need lif
struct elements to track them.
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shannon Nelson [Sat, 9 Oct 2021 18:45:16 +0000 (11:45 -0700)]
ionic: move lif mac address functions
The routines that add and delete mac addresses from the
firmware really should be in the file with the rest of
the filter management. This simply moves the functions
with no logic changes.
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shannon Nelson [Sat, 9 Oct 2021 18:45:15 +0000 (11:45 -0700)]
ionic: add filterlist to debugfs
Dump the filter list to debugfs - includes the device-assigned
filter id and the sync'd-to-hardware status.
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Sat, 9 Oct 2021 12:27:36 +0000 (15:27 +0300)]
dt-bindings: net: dsa: document felix family in dsa-tag-protocol
The Vitesse/Microsemi/Microchip family of switches supported by the
Felix DSA driver is capable of selecting between the native tagging
protocol and ocelot-8021q. This is necessary to enable flow control on
the CPU port.
Certain systems where these switches are integrated use the switch as a
port multiplexer, so the termination throughput is paramount. Changing
the tagging protocol at runtime is possible for these systems, but since
it is known beforehand that one tagging protocol will provide strictly
better performance than the other, just allow them to specify the
preference in the device tree.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Sat, 9 Oct 2021 12:27:35 +0000 (15:27 +0300)]
dt-bindings: net: dsa: fix typo in dsa-tag-protocol description
There is a trivial typo when spelling "protocol", fix it.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 8 Oct 2021 17:53:39 +0000 (10:53 -0700)]
net: use dev_addr_set()
Use dev_addr_set() instead of writing directly to netdev->dev_addr
in various misc and old drivers.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 9 Oct 2021 10:46:57 +0000 (11:46 +0100)]
Merge branch 'dev_addr-direct-writes'
Jakub Kicinski says:
====================
net: remove direct netdev->dev_addr writes
Commit
406f42fa0d3c ("net-next: When a bond have a massive amount
of VLANs...") introduced a rbtree for faster Ethernet address look
up. To maintain netdev->dev_addr in this tree we need to make all
the writes to it got through appropriate helpers.
This series contains top 5 conversions in terms of LoC required
to bring the driver into compliance.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 8 Oct 2021 17:59:13 +0000 (10:59 -0700)]
ethernet: 8390: remove direct netdev->dev_addr writes
8390 contains a lot of loops assigning netdev->dev_addr
byte by byte. Convert what's possible directly to
eth_hw_addr_set(), use local buf in other places.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 8 Oct 2021 17:59:12 +0000 (10:59 -0700)]
ethernet: sun: remove direct netdev->dev_addr writes
Consify temporary variables pointing to netdev->dev_addr.
A few places need local storage but pretty simple conversion
over all. Note that macaddr[] is an array of ints, so we need
to keep the loops.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 8 Oct 2021 17:59:11 +0000 (10:59 -0700)]
ethernet: tulip: remove direct netdev->dev_addr writes
Consify the casts of netdev->dev_addr.
Convert pointless to eth_hw_addr_set() where possible.
Use local buffers in a number of places.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 8 Oct 2021 17:59:10 +0000 (10:59 -0700)]
ethernet: tg3: remove direct netdev->dev_addr writes
tg3 does various forms of direct writes to netdev->dev_addr.
Use a local buffer. Make sure local buffer is aligned since
eth_platform_get_mac_address() may call ether_addr_copy().
tg3_get_device_address() returns whenever it finds a method
that found a valid address. Instead of modifying all the exit
points pass the buffer from the outside and commit the address
in the caller.
Constify the argument of the set addr helper.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 8 Oct 2021 17:59:09 +0000 (10:59 -0700)]
ethernet: forcedeth: remove direct netdev->dev_addr writes
forcedeth writes to dev_addr byte by byte, make it use
a local buffer instead. Commit the changes with
eth_hw_addr_set() at the end.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Zhu Yanjun <zyjzyj2000@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Fri, 8 Oct 2021 13:23:15 +0000 (16:23 +0300)]
mlxsw: item: Annotate item helpers with '__maybe_unused'
mlxsw is using helpers to get / set fields in messages exchanged with
the device. It is possible that some fields are only set or only get.
This causes LLVM to emit warnings such as the following when building
with W=1 [1]:
drivers/net/ethernet/mellanox/mlxsw/core_acl_flex_actions.c:2022:1: warning: unused function 'mlxsw_afa_sampler_mirror_agent_get'
The fact that some fields are only set or only get is very much
intentional and not indicative of functions that need to be removed.
Therefore, annotate the item helpers with '__maybe_unused' to suppress
these warnings.
[1] https://lkml.org/lkml/2021/9/29/685
Cc: Nathan Chancellor <nathan@kernel.org>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20211008132315.90211-1-idosch@idosch.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tianjia Zhang [Fri, 8 Oct 2021 09:17:45 +0000 (17:17 +0800)]
selftests/tls: add SM4 GCM/CCM to tls selftests
Add new cipher as a variant of standard tls selftests.
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Link: https://lore.kernel.org/r/20211008091745.42917-1-tianjia.zhang@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jean Sacren [Fri, 8 Oct 2021 06:31:47 +0000 (00:31 -0600)]
net: tg3: fix redundant check of true expression
Remove the redundant check of (tg3_asic_rev(tp) == ASIC_REV_5705) after
it is checked to be true.
Signed-off-by: Jean Sacren <sakiwit@gmail.com>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/20211008063147.1421-1-sakiwit@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Prabhakar Kushwaha [Thu, 7 Oct 2021 15:52:38 +0000 (18:52 +0300)]
qed: Fix compilation for CONFIG_QED_SRIOV undefined scenario
This patch fixes below compliation error in case CONFIG_QED_SRIOV not
defined.
drivers/net/ethernet/qlogic/qed/qed_dev.c: In function
‘qed_fw_err_handler’:
drivers/net/ethernet/qlogic/qed/qed_dev.c:2390:3: error: implicit
declaration of function ‘qed_sriov_vfpf_malicious’; did you mean
‘qed_iov_vf_task’? [-Werror=implicit-function-declaration]
qed_sriov_vfpf_malicious(p_hwfn, &data->err_data);
^~~~~~~~~~~~~~~~~~~~~~~~
qed_iov_vf_task
drivers/net/ethernet/qlogic/qed/qed_dev.c: In function
‘qed_common_eqe_event’:
drivers/net/ethernet/qlogic/qed/qed_dev.c:2410:10: error: implicit
declaration of function ‘qed_sriov_eqe_event’; did you mean
‘qed_common_eqe_event’? [-Werror=implicit-function-declaration]
return qed_sriov_eqe_event(p_hwfn, opcode, echo, data,
^~~~~~~~~~~~~~~~~~~
qed_common_eqe_event
Fixes:
fe40a830dcde ("qed: Update qed_hsi.h for fw 8.59.1.0")
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Cc: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: Shai Malin <smalin@marvell.com>
Signed-off-by: Omkar Kulkarni <okulkarni@marvell.com>
Signed-off-by: Prabhakar Kushwaha <pkushwaha@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Francesco Dolcini [Thu, 7 Oct 2021 16:45:35 +0000 (18:45 +0200)]
net: phy: micrel: ksz9131 led errata workaround
Micrel KSZ9131 PHY LED behavior is not correct when configured in
Individual Mode, LED1 (Activity LED) is in the ON state when there is
no-link.
Workaround this by setting bit 9 of register 0x1e after verifying that
the LED configuration is Individual Mode.
This issue is described in KSZ9131RNX Silicon Errata DS80000693B [*]
and according to that it will not be corrected in a future silicon
revision.
[*] https://ww1.microchip.com/downloads/en/DeviceDoc/KSZ9131RNX-Silicon-Errata-and-Data-Sheet-Clarification-
80000863B.pdf
Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com>
Reviewed-by: Philippe Schenker <philippe.schenker@toradex.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 8 Oct 2021 16:02:35 +0000 (17:02 +0100)]
Merge branch 'netdev-name-in-use'
Antoine Tenart says:
====================
net: introduce a function to check if a netdev name is in use
This was initially part of an RFC series[1] but has value on its own;
hence the standalone report. (It will also help in not having a series
too large).
From patch 1:
"""
__dev_get_by_name is currently used to either retrieve a net device
reference using its name or to check if a name is already used by a
registered net device (per ns). In the later case there is no need to
return a reference to a net device.
Introduce a new helper, netdev_name_in_use, to check if a name is
currently used by a registered net device without leaking a reference
the corresponding net device. This helper uses netdev_name_node_lookup
instead of __dev_get_by_name as we don't need the extra logic retrieving
a reference to the corresponding net device.
"""
Two uses[2] of __dev_get_by_name weren't converted to this new function,
as they are really looking for a net device, not only checking if a net
device name is in use. While checking one or the other currently has
the same result, that might change if the initial RFC series moves
forward. I'll convert them later depending on the outcome of the initial
series.
Thanks,
Antoine
[1] https://lore.kernel.org/all/
20210928125500.167943-1-atenart@kernel.org/
[2] drivers/net/Space.c:130 & drivers/nvme/host/tcp.c:2550
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Antoine Tenart [Thu, 7 Oct 2021 16:16:52 +0000 (18:16 +0200)]
ppp: use the correct function to check if a netdev name is in use
A new helper to detect if a net device name is in use was added. Use it
here as the return reference from __dev_get_by_name was discarded.
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Antoine Tenart [Thu, 7 Oct 2021 16:16:51 +0000 (18:16 +0200)]
bonding: use the correct function to check for netdev name collision
A new helper to detect if a net device name is in use was added. Use it
here as the return reference from __dev_get_by_name was discarded.
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Antoine Tenart [Thu, 7 Oct 2021 16:16:50 +0000 (18:16 +0200)]
net: introduce a function to check if a netdev name is in use
__dev_get_by_name is currently used to either retrieve a net device
reference using its name or to check if a name is already used by a
registered net device (per ns). In the later case there is no need to
return a reference to a net device.
Introduce a new helper, netdev_name_in_use, to check if a name is
currently used by a registered net device without leaking a reference
the corresponding net device. This helper uses netdev_name_node_lookup
instead of __dev_get_by_name as we don't need the extra logic retrieving
a reference to the corresponding net device.
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 8 Oct 2021 15:59:09 +0000 (16:59 +0100)]
Merge branch 'enetc-swtso'
Ioana Ciornei says:
====================
net: enetc: add support for software TSO
This series adds support for driver level TSO in the enetc driver.
Ever since the ENETC MDIO erratum workaround is in place, the Tx path is
incurring a penalty (enetc_lock_mdio/enetc_unlock_mdio) for each skb to
be sent out. On top of this, ENETC does not support Tx checksum
offloading. This means that software TSO would help performance just by
the fact that one single mdio lock/unlock sequence would cover multiple
packets sent. On the other hand, checksum needs to be computed in
software since the controller cannot handle it.
This is why, beside using the usual tso_build_hdr()/tso_build_data()
this specific implementation also has to compute the checksum, both IP
and L4, for each resulted segment.
Even with that, the performance improvement of a TCP flow running on a
single A72@1.3GHz of the LS1028A SoC (2.5Gbit/s port) is the following:
before: 1.63 Gbits/sec
after: 2.34 Gbits/sec
Changes in v2:
- declare NETIF_F_HW_CSUM instead of NETIF_F_IP_CSUM in 1/2
- add support for TSO over IPv6 (NETIF_F_TSO6 and csum compute) in 2/2
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ioana Ciornei [Thu, 7 Oct 2021 15:30:43 +0000 (18:30 +0300)]
net: enetc: add support for software TSO
This patch adds support for driver level TSO in the enetc driver using
the TSO API.
Beside using the usual tso_build_hdr(), tso_build_data() this specific
implementation also has to compute the checksum, both IP and L4, for
each resulted segment. This is because the ENETC controller does not
support Tx checksum offload which is needed in order to take advantage
of TSO.
With the workaround for the ENETC MDIO erratum in place the Tx path of
the driver is forced to lock/unlock for each skb sent. This is why, even
though we are computing the checksum by hand we see the following
improvement in TCP termination on the LS1028A SoC, on a single A72 core
running at 1.3GHz:
before: 1.63 Gbits/sec
after: 2.34 Gbits/sec
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ioana Ciornei [Thu, 7 Oct 2021 15:30:42 +0000 (18:30 +0300)]
net: enetc: declare NETIF_F_HW_CSUM and do it in software
This is just a preparation patch for software TSO in the enetc driver.
Unfortunately, ENETC does not support Tx checksum offload which would
normally render TSO, even software, impossible.
Declare NETIF_F_HW_CSUM as part of the feature set and do it at driver
level using skb_csum_hwoffload_help() so that we can move forward and
also add support for TSO in the next patch.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 8 Oct 2021 15:41:00 +0000 (16:41 +0100)]
Merge branch 'ip6gre-tests'
Ido Schimmel says:
====================
selftests: forwarding: Add ip6gre tests
This patchset adds forwarding selftests for ip6gre. The tests can be run
with veth pairs or with physical loopbacks.
Patch #1 adds a new config option to determine if 'skip_sw' / 'skip_hw'
flags are used when installing tc filters. By default, it is not set
which means the flags are not used. 'skip_sw' is useful to ensure
traffic is forwarded by the hardware data path.
Patch #2 adds a new helper function.
Patches #3-#4 add the forwarding selftests.
Patch #5 adds a mlxsw-specific selftest to validate correct behavior of
the 'decap_error' trap with IPv6 underlay.
Patches #6-#8 align the corresponding IPv4 underlay test to the IPv6
one.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Amit Cohen [Fri, 8 Oct 2021 13:12:41 +0000 (16:12 +0300)]
selftests: mlxsw: devlink_trap_tunnel_ipip: Send a full-length key
As part of adding same test for GRE tunnel with IPv6 underlay, missing
bytes for key were found.
mausezahn does not fill zeros between two colons, so send them
explicitly. For example, use "00:00:00:E9:" instead of ":E9:"
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Amit Cohen [Fri, 8 Oct 2021 13:12:40 +0000 (16:12 +0300)]
selftests: mlxsw: devlink_trap_tunnel_ipip: Remove code duplication
As part of adding same test for GRE tunnel with IPv6 underlay, an
optional improvement was found - call ipip_payload_get from
ecn_payload_get, so do not duplicate the code which creates the payload.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Amit Cohen [Fri, 8 Oct 2021 13:12:39 +0000 (16:12 +0300)]
selftests: mlxsw: devlink_trap_tunnel_ipip: Align topology drawing correctly
As part of adding same test for GRE tunnel with IPv6 underlay, wrong
alignments were found, fix them.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Amit Cohen [Fri, 8 Oct 2021 13:12:38 +0000 (16:12 +0300)]
selftests: mlxsw: devlink_trap_tunnel_ipip6: Add test case for IPv6 decap_error
IPv6 underlay support was added, add test to check that "decap_error" trap
is triggered under the right conditions and that devlink counters increase.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Amit Cohen [Fri, 8 Oct 2021 13:12:37 +0000 (16:12 +0300)]
selftests: forwarding: Add IPv6 GRE hierarchical tests
Add tests that check IPv6-in-IPv6, IPv4-in-IPv6 and MTU change of GRE
tunnel. The tests use hierarchical model - the tunnel is bound to a device
in a different VRF.
These tests can be run with TC_FLAG=skip_sw, so then they will verify
that packets go through hardware as part of enacp and decap phases.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Amit Cohen [Fri, 8 Oct 2021 13:12:36 +0000 (16:12 +0300)]
selftests: forwarding: Add IPv6 GRE flat tests
Add tests that check IPv6-in-IPv6, IPv4-in-IPv6 and MTU change of GRE
tunnel. The tests use flat model - overlay and underlay share the same VRF.
These tests can be run with TC_FLAG=skip_sw, so then they will verify
that packets go through hardware as part of enacp and decap phases.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Amit Cohen [Fri, 8 Oct 2021 13:12:35 +0000 (16:12 +0300)]
testing: selftests: tc_common: Add tc_check_at_least_x_packets()
Add function that checks that at least X packets hit the tc rule.
There are cases that it is not possible to catch only the interesting
packets, so then, it is possible to send many packets and verify that at
least this amount of packets hit the rule.
This function will be used in the next patch for general tc rule that
can be used to test both software and hardware.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Amit Cohen [Fri, 8 Oct 2021 13:12:34 +0000 (16:12 +0300)]
testing: selftests: forwarding.config.sample: Add tc flag
Add TC_FLAG value to tests topology.
This flag supposed to be skip_sw/skip_hw which means do not filter by
software/hardware.
This can be useful for adding tests to forwarding directory, and be able
to verify that packets go through the hardware.
When the flag is not set or set to 'skip_hw', tests can still be executed
with veth pairs.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Richard Palethorpe [Fri, 8 Oct 2021 10:00:53 +0000 (11:00 +0100)]
vsock: Enable y2038 safe timeval for timeout
Reuse the timeval compat code from core/sock to handle 32-bit and
64-bit timeval structures. Also introduce a new socket option define
to allow using y2038 safe timeval under 32-bit.
The existing behavior of sock_set_timeout and vsock's timeout setter
differ when the time value is out of bounds. vsocks current behavior
is retained at the expense of not being able to share the full
implementation.
This allows the LTP test vsock01 to pass under 32-bit compat mode.
Fixes:
fe0c72f3db11 ("socket: move compat timeout handling into sock.c")
Signed-off-by: Richard Palethorpe <rpalethorpe@suse.com>
Cc: Richard Palethorpe <rpalethorpe@richiejp.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Richard Palethorpe [Fri, 8 Oct 2021 10:00:52 +0000 (11:00 +0100)]
vsock: Refactor vsock_*_getsockopt to resemble sock_getsockopt
In preparation for sharing the implementation of sock_get_timeout.
Signed-off-by: Richard Palethorpe <rpalethorpe@suse.com>
Cc: Richard Palethorpe <rpalethorpe@richiejp.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kalle Valo [Fri, 8 Oct 2021 14:39:32 +0000 (17:39 +0300)]
ath11k: fix m68k and xtensa build failure in ath11k_peer_assoc_h_smps()
Stephen reported that ath11k was failing to build on m68k and xtensa:
In file included from <command-line>:0:0:
In function 'ath11k_peer_assoc_h_smps',
inlined from 'ath11k_peer_assoc_prepare' at drivers/net/wireless/ath/ath11k/mac.c:2362:2:
include/linux/compiler_types.h:317:38: error: call to '__compiletime_assert_650' declared with attribute error: FIELD_GET: type of reg too small for mask
_compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
^
include/linux/compiler_types.h:298:4: note: in definition of macro '__compiletime_assert'
prefix ## suffix(); \
^
include/linux/compiler_types.h:317:2: note: in expansion of macro '_compiletime_assert'
_compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
^
include/linux/build_bug.h:39:37: note: in expansion of macro 'compiletime_assert'
#define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
^
include/linux/bitfield.h:52:3: note: in expansion of macro 'BUILD_BUG_ON_MSG'
BUILD_BUG_ON_MSG((_mask) > (typeof(_reg))~0ull, \
^
include/linux/bitfield.h:108:3: note: in expansion of macro '__BF_FIELD_CHECK'
__BF_FIELD_CHECK(_mask, _reg, 0U, "FIELD_GET: "); \
^
drivers/net/wireless/ath/ath11k/mac.c:2079:10: note: in expansion of macro 'FIELD_GET'
smps = FIELD_GET(IEEE80211_HE_6GHZ_CAP_SM_PS,
Fix the issue by using le16_get_bits() to specify the size explicitly.
Fixes:
6f4d70308e5e ("ath11k: support SMPS configuration for 6 GHz")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Antoine Tenart [Thu, 7 Oct 2021 14:00:51 +0000 (16:00 +0200)]
net-sysfs: try not to restart the syscall if it will fail eventually
Due to deadlocks in the networking subsystem spotted 12 years ago[1],
a workaround was put in place[2] to avoid taking the rtnl lock when it
was not available and restarting the syscall (back to VFS, letting
userspace spin). The following construction is found a lot in the net
sysfs and sysctl code:
if (!rtnl_trylock())
return restart_syscall();
This can be problematic when multiple userspace threads use such
interfaces in a short period, making them to spin a lot. This happens
for example when adding and moving virtual interfaces: userspace
programs listening on events, such as systemd-udevd and NetworkManager,
do trigger actions reading files in sysfs. It gets worse when a lot of
virtual interfaces are created concurrently, say when creating
containers at boot time.
Returning early without hitting the above pattern when the syscall will
fail eventually does make things better. While it is not a fix for the
issue, it does ease things.
[1] https://lore.kernel.org/netdev/
49A4D5D5.5090602@trash.net/
https://lore.kernel.org/netdev/m14oyhis31.fsf@fess.ebiederm.org/
and https://lore.kernel.org/netdev/
20090226084924.
16cb3e08@nehalam/
[2] Rightfully, those deadlocks are *hard* to solve.
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Reviewed-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Russell King (Oracle) [Thu, 7 Oct 2021 13:23:32 +0000 (14:23 +0100)]
net: phylib: ensure phy device drivers do not match by DT
PHYLIB device drivers must match by either numerical PHY ID or by their
.match_phy_device method. Matching by DT is not permitted.
Link: https://lore.kernel.org/r/2b1dc053-8c9a-e3e4-b450-eecdfca3fe16@gmail.com
Tested-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Russell King (Oracle) [Thu, 7 Oct 2021 13:23:27 +0000 (14:23 +0100)]
net: mdio: ensure the type of mdio devices match mdio drivers
On the MDIO bus, we have PHYLIB devices and drivers, and we have non-
PHYLIB devices and drivers. PHYLIB devices are MDIO devices that are
wrapped with a struct phy_device.
Trying to bind a MDIO device with a PHYLIB driver results in out-of-
bounds accesses as we attempt to access struct phy_device members. So,
let's prevent this by ensuring that the type of the MDIO device
(indicated by the MDIO_DEVICE_FLAG_PHY flag) matches the type of the
MDIO driver (indicated by the MDIO_DEVICE_IS_PHY flag.)
Link: https://lore.kernel.org/r/2b1dc053-8c9a-e3e4-b450-eecdfca3fe16@gmail.com
Tested-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Davide Caratti [Thu, 7 Oct 2021 13:05:02 +0000 (15:05 +0200)]
net/sched: sch_ets: properly init all active DRR list handles
leaf classes of ETS qdiscs are served in strict priority or deficit round
robin (DRR), depending on the value of 'nstrict'. Since this value can be
changed while traffic is running, we need to be sure that the active list
of DRR classes can be updated at any time, so:
1) call INIT_LIST_HEAD(&alist) on all leaf classes in .init(), before the
first packet hits any of them.
2) ensure that 'alist' is not overwritten with zeros when a leaf class is
no more strict priority nor DRR (i.e. array elements beyond 'nbands').
Link: https://lore.kernel.org/netdev/YS%2FoZ+f0Nr8eQkzH@dcaratti.users.ipa.redhat.com
Suggested-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tim Gardner [Thu, 7 Oct 2021 12:04:13 +0000 (06:04 -0600)]
qed: Initialize debug string array
Coverity complains of an uninitialized variable.
CID 120847 (#1 of 1): Uninitialized scalar variable (UNINIT)
3. uninit_use_in_call: Using uninitialized value *sw_platform_str when calling qed_dump_str_param. [show details]
1344 offset += qed_dump_str_param(dump_buf + offset,
1345 dump, "sw-platform", sw_platform_str);
Fix this by removing dead code that references sw_platform_str.
Fixes:
6c95dd8f0aa1d ("qed: Update debug related changes")
Cc: Ariel Elior <aelior@marvell.com>
Cc: GR-everest-linux-l2@marvell.com
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Shai Malin <smalin@marvell.com>
Cc: Omkar Kulkarni <okulkarni@marvell.com>
Cc: Prabhakar Kushwaha <pkushwaha@marvell.com>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org (open list)
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Li [Fri, 8 Oct 2021 06:21:17 +0000 (14:21 +0800)]
net: dsa: rtl8366rb: remove unneeded semicolon
Eliminate the following coccicheck warning:
./drivers/net/dsa/rtl8366rb.c:1348:2-3: Unneeded semicolon
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 8 Oct 2021 13:54:34 +0000 (14:54 +0100)]
Merge branch 'dev_addr-helpers'
Jakub Kicinski says:
====================
net: add a helpers for loading netdev->dev_addr from platform
Similarly to recently added device_get_ethdev_address()
and of_get_ethdev_address() create a helper for drivers
loading mac addr from platform data.
nvmem_get_mac_address() does not have driver callers
so instead of adding a helper there unexport it.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 7 Oct 2021 18:18:47 +0000 (11:18 -0700)]
ethernet: use platform_get_ethdev_address()
Use the new platform_get_ethdev_address() helper for the cases
where dev->dev_addr is passed in directly as the destination.
@@
expression dev, net;
@@
- eth_platform_get_mac_address(dev, net->dev_addr)
+ platform_get_ethdev_address(dev, net)
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 7 Oct 2021 18:18:46 +0000 (11:18 -0700)]
eth: platform: add a helper for loading netdev->dev_addr
Commit
406f42fa0d3c ("net-next: When a bond have a massive amount
of VLANs...") introduced a rbtree for faster Ethernet address look
up. To maintain netdev->dev_addr in this tree we need to make all
the writes to it got through appropriate helpers.
There is a handful of drivers which pass netdev->dev_addr as
the destination buffer to eth_platform_get_mac_address().
Add a helper which takes a dev pointer instead, so it can call
an appropriate helper.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 7 Oct 2021 18:18:45 +0000 (11:18 -0700)]
ethernet: un-export nvmem_get_mac_address()
nvmem_get_mac_address() is only called from of_net.c
we don't need the export.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 8 Oct 2021 13:31:01 +0000 (14:31 +0100)]
Merge branch '100GbE' of git://git./linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
100GbE Intel Wired LAN Driver Updates 2021-10-07
Michal Swiatkowski says:
The following patch series introduces basic switchdev model
support in ice driver. Implement the following blocks of
switchdev framework:
- VF port representors creation
- control plane VSI definition
- exception path (a. k. a. "slow-path") - to allow a virtual
switch or linux bridge to receive any packet that doesn't match
any hw filter
- link state management of virtual ports
- query virtual port statistics
Hardware offload support in switchdev mode is out of scope of
this patchset. Devlink interface is used to toggle between
switchdev and legacy (the default) modes of the driver.
---
Note: This series includes the use enum ice_status, however, we have
patches in our queue to remove it from the driver [1]. We are working
through the patches that precede the removal series.
[1] https://patchwork.ozlabs.org/project/intel-wired-lan/list/?series=265957
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 7 Oct 2021 22:24:06 +0000 (15:24 -0700)]
Merge git://git./linux/kernel/git/netdev/net
No conflicts.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Linus Torvalds [Thu, 7 Oct 2021 21:11:40 +0000 (14:11 -0700)]
Merge tag 'nfsd-5.15-3' of git://git./linux/kernel/git/cel/linux
Pull nfsd fixes from Chuck Lever:
"Bug fixes for NFSD error handling paths"
* tag 'nfsd-5.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
NFSD: Keep existing listeners on portlist error
SUNRPC: fix sign error causing rpcsec_gss drops
nfsd: Fix a warning for nfsd_file_close_inode
nfsd4: Handle the NFSv4 READDIR 'dircount' hint being zero
nfsd: fix error handling of register_pernet_subsys() in init_nfsd()
Linus Torvalds [Thu, 7 Oct 2021 21:01:29 +0000 (14:01 -0700)]
Merge tag 'armsoc-fixes-5.15' of git://git./linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
"This is a larger than normal update for Arm SoC specific code, most of
it in device trees, but also drivers and the omap and at91/sama7
platforms:
- There are four new entries to the MAINTAINERS file: Sven Peter and
Alyssa Rosenzweig for Apple M1, Romain Perier for Mstar/sigmastar,
and Vignesh Raghavendra for TI K3
- Build fixes to address randconfig warnings in sharpsl, dove, omap1,
and qcom platforms as well as the scmi and op-tee subsystems
- Regression fixes for missing CONFIG_FB and other options for
several defconfigs
- Several bug fixes for the newly added Microchip SAMA7 platform,
mostly regarding power management
- Missing SMP barriers to protect accesses to SCMI virtio device
- Regression fixes for TI OMAP, including a boot-time hang on am335x.
- Lots of bug fixes for NXP i.MX, mostly addressing incorrect
settings in devicetree files, and one revert for broken suspend.
- Fixes for ARM Juno/Vexpress devicetree files, addressing a couple
of schema warnings.
- Regression fixes for qualcomm SoC specific drivers and devicetree
files, reverting an mdt_loader change and at least pastially
reverting some of the 5.15 DTS changes, plus some minor bugfixes"
* tag 'armsoc-fixes-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (64 commits)
MAINTAINERS: Add Sven Peter as ARM/APPLE MACHINE maintainer
MAINTAINERS: Add Alyssa Rosenzweig as M1 reviewer
firmware: arm_scmi: Add proper barriers to scmi virtio device
firmware: arm_scmi: Simplify spinlocks in virtio transport
ARM: dts: omap3430-sdp: Fix NAND device node
bus: ti-sysc: Use CLKDM_NOAUTO for dra7 dcan1 for errata i893
ARM: sharpsl_param: work around -Wstringop-overread warning
ARM: defconfig: gemini: Restore framebuffer
ARM: dove: mark 'putc' as inline
ARM: omap1: move omap15xx local bus handling to usb.c
MAINTAINERS: Add Vignesh to TI K3 platform maintainership
arm64: dts: imx8m*-venice-gw7902: fix M2_RST# gpio
ARM: imx6: disable the GIC CPU interface before calling stby-poweroff sequence
arm64: dts: ls1028a: fix eSDHC2 node
arm64: dts: imx8mm-kontron-n801x-som: do not allow to switch off buck2
ARM: dts: at91: sama7g5ek: to not touch slew-rate for SDMMC pins
ARM: dts: at91: sama7g5ek: use proper slew-rate settings for GMACs
ARM: at91: pm: preload base address of controllers in tlb
ARM: at91: pm: group constants and addresses loading
ARM: dts: at91: sama7g5ek: add suspend voltage for ddr3l rail
...
Arnd Bergmann [Thu, 7 Oct 2021 19:14:12 +0000 (21:14 +0200)]
Merge tag 'asahi-soc-fixes-5.15' of https://github.com/AsahiLinux/linux into arm/fixes
Apple SoC fixes for 5.15; just two MAINTAINERS updates.
- MAINTAINERS: Add Sven Peter as ARM/APPLE MACHINE maintainer
- MAINTAINERS: Add Alyssa Rosenzweig as M1 reviewer
* tag 'asahi-soc-fixes-5.15' of https://github.com/AsahiLinux/linux:
MAINTAINERS: Add Sven Peter as ARM/APPLE MACHINE maintainer
MAINTAINERS: Add Alyssa Rosenzweig as M1 reviewer
Link: https://lore.kernel.org/r/a50a9015-0e62-c451-4d0d-668233b35b85@marcan.st
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Arnd Bergmann [Thu, 7 Oct 2021 19:14:03 +0000 (21:14 +0200)]
Merge tag 'scmi-fixes-5.15' of git://git./linux/kernel/git/sudeep.holla/linux into arm/fixes
SCMI fixes for v5.15
A few fixes addressing:
- Kconfig dependency between VIRTIO and ARM_SCMI_PROTOCOL
- Link-time error with __exit annotation for virtio_scmi_exit
- Unnecessary nested irqsave/irqrestore spinlocks in virtio transport
- Missing SMP barriers to protect accesses to SCMI virtio device
* tag 'scmi-fixes-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/sudeep.holla/linux:
firmware: arm_scmi: Add proper barriers to scmi virtio device
firmware: arm_scmi: Simplify spinlocks in virtio transport
firmware: arm_scmi: Remove __exit annotation
firmware: arm_scmi: Fix virtio transport Kconfig dependency
Link: https://lore.kernel.org/r/20211007102822.27886-1-sudeep.holla@arm.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Arnd Bergmann [Thu, 7 Oct 2021 19:13:57 +0000 (21:13 +0200)]
Merge tag 'omap-for-v5.15/fixes-rc4' of git://git./linux/kernel/git/tmlind/linux-omap into arm/fixes
Fixes for omaps for v5.15
Few regression fixes for omaps for the v5.15-rc cycle. There is a fix
for boot time hangs that can happen on some am335x devices that started
when the pruss devicetree nodes were added. The other fixes are less
critical:
- Fix compiler warning for sysc_init_soc() that got recently introduced
- Fix external abort for am335x pruss as otherwise some am335x will hang
- Use CLKDM_NOAUTO quirk also for dra7 dcan1
- Fix older NAND device node regression for omap3-sdp
* tag 'omap-for-v5.15/fixes-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: dts: omap3430-sdp: Fix NAND device node
bus: ti-sysc: Use CLKDM_NOAUTO for dra7 dcan1 for errata i893
soc: ti: omap-prm: Fix external abort for am335x pruss
bus: ti-sysc: Add break in switch statement in sysc_init_soc()
Link: https://lore.kernel.org/r/pull-1633609552-789682@atomide.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Linus Torvalds [Thu, 7 Oct 2021 18:20:08 +0000 (11:20 -0700)]
Merge tag 'misc-fixes-
20211007' of git://git./linux/kernel/git/dhowells/linux-fs
Pull netfslib, cachefiles and afs fixes from David Howells:
- Fix another couple of oopses in cachefiles tracing stemming from the
possibility of passing in a NULL object pointer
- Fix netfs_clear_unread() to set READ on the iov_iter so that source
it is passed to doesn't do the wrong thing (some drivers look at the
flag on iov_iter rather than other available information to determine
the direction)
- Fix afs_launder_page() to write back at the correct file position on
the server so as not to corrupt data
* tag 'misc-fixes-
20211007' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
afs: Fix afs_launder_page() to set correct start file position
netfs: Fix READ/WRITE confusion when calling iov_iter_xarray()
cachefiles: Fix oops with cachefiles_cull() due to NULL object
Linus Torvalds [Thu, 7 Oct 2021 17:58:42 +0000 (10:58 -0700)]
Merge tag 'perf-tools-fixes-for-v5.15-2021-10-07' of git://git./linux/kernel/git/acme/linux
Pull perf tools fixes from Arnaldo Carvalho de Melo:
- Fix plugin static linking with libopencsd on ARM and ARM64
- Add missing -lstdc++ when linking with libopencsd
- Add missing topdown metrics events to 'perf test attr'
- Plug leak sys_event_tables list after processing JSON vendor events
entries
- Sync sound/asound.h copy with the kernel sources
* tag 'perf-tools-fixes-for-v5.15-2021-10-07' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
perf tests attr: Add missing topdown metrics events
tools include UAPI: Sync sound/asound.h copy with the kernel sources
perf build: Fix plugin static linking with libopencsd on ARM and ARM64
perf build: Add missing -lstdc++ when linking with libopencsd
perf jevents: Free the sys_event_tables list after processing entries
Wojciech Drewek [Fri, 20 Aug 2021 00:08:59 +0000 (17:08 -0700)]
ice: add port representor ethtool ops and stats
Introduce the following ethtool operations for VF's representor:
-get_drvinfo
-get_strings
-get_ethtool_stats
-get_sset_count
-get_link
In all cases, existing operations were used with minor
changes which allow us to detect if ethtool op was called for
representor. Only VF VSI stats will be available for representor.
Implement ndo_get_stats64 for port representor. This will update
VF VSI stats and read them.
Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Grzegorz Nitka [Fri, 20 Aug 2021 00:08:58 +0000 (17:08 -0700)]
ice: switchdev slow path
Slow path means allowing packet to go from uplink to representor
and from representor to correct VF on Rx site and from VF to
representor and to uplink on Tx site.
To accomplish this driver, has to set correct Tx descriptor. When
packet is sent from representor to VF, destination should be
set to VF VSI. When packet is sent from uplink port destination
should be uplink to bypass switch infrastructure and send packet
outside.
On Rx site driver should check source VSI field from Rx descriptor
and based on that forward packed to correct netdev. To allow
this there is a target netdevs table in control plane VSI
struct.
Co-developed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Grzegorz Nitka [Fri, 20 Aug 2021 00:08:57 +0000 (17:08 -0700)]
ice: rebuild switchdev when resetting all VFs
As resetting all VFs behaves mostly like creating new VFs also
eswitch infrastructure has to be recreated. The easiest way to
do that is to rebuild eswitch after resetting VFs.
Implement helper functions to start and stop all representors
queues. This is used to disable traffic on port representors.
In rebuild path:
- NAPI has to be disabled
- eswitch environment has to be set up
- new port representors have to be created, because the old
one had pointer to not existing VFs
- new control plane VSI ring should be remapped
- NAPI hast to be enabled
- rxdid has to be set to FLEX_NIC_2, because this descriptor id
support source_vsi, which is needed on control plane VSI queues
- port representors queues have to be started
Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Grzegorz Nitka [Fri, 20 Aug 2021 00:08:56 +0000 (17:08 -0700)]
ice: enable/disable switchdev when managing VFs
Only way to enable switchdev is to create VFs when the eswitch
mode is set to switchdev. Check if correct mode is set and
enable switchdev in function which creating VFs.
Disable switchdev when user change number of VFs to 0. Changing
eswitch mode back to legacy when VFs are created in switchdev
mode isn't allowed.
As switchdev takes care of managing filter rules, adding new
rules on VF is blocked.
In case of resetting VF driver has to update pointer in ice_repr
struct, because after reset VSI related things can change.
Co-developed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Grzegorz Nitka [Fri, 20 Aug 2021 00:08:55 +0000 (17:08 -0700)]
ice: introduce new type of VSI for switchdev
New type of VSI has to be defined for switchdev control plane
VSI. Number of allocated Tx and Rx queue has to be equal to
amount of VFs, because each port representor should have one
Tx and Rx queue.
Also to not increase number of used irqs too much, control plane
VSI uses only one q_vector and handle all queues in one irq.
To allow handling all queues in one irq , new function to clean
msix for eswitch was introduced. This function will schedule napi
for each representor instead of scheduling it only for one like in
normal clean irq function.
Only one additional msix has to be requested. Always try to request
it in ice_ena_msix_range function.
Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Grzegorz Nitka [Fri, 20 Aug 2021 00:08:54 +0000 (17:08 -0700)]
ice: set and release switchdev environment
Switchdev environment has to be set up when user create VFs
and eswitch mode is switchdev. Release is done when user
delete all VFs.
Data path in this implementation is based on control plane VSI.
This VSI is used to pass traffic from port representors to
corresponding VFs and vice versa. Default TX rule has to be
added to forward packet to control plane VSI. This will redirect
packets from VFs which don't match other rules to control plane
VSI.
On RX side default rule is added on uplink VSI to receive all
traffic that doesn't match other rules. When setting switchdev
environment all other rules from VFs should be removed. Packet to
VFs will be forwarded by control plane VSI.
As VF without any mac rules can't send any packet because of
antispoof mechanism, VSI antispoof should be turned off on each VFs.
To send packet from representor to correct VSI, destination VSI
field in TX descriptor will have to be filled. Allow that by
setting destination override bit in control plane VSI security config.
Packet from VFs will be received on control plane VSI. Driver
should decide to which netdev forward the packet. Decision is
made based on src_vsi field from descriptor. There is a target
netdev list in control plane VSI struct which choose netdev
based on src_vsi number.
Co-developed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Michal Swiatkowski [Fri, 20 Aug 2021 00:08:53 +0000 (17:08 -0700)]
ice: allow changing lan_en and lb_en on dflt rules
There is no way to change default lan_en and lb_en flags while
adding new rule. Add function that allows changing these flags
on ICE_SW_LKUP_DFLT recipe and any rule id.
lan_en allows packet to go outside if rule is matched. Clearing
this bit will block packet from sending it outside.
lb_en allows packet to be forwarded to other VSI. Clearing
this bit will block packet from forwarding it to other VSI.
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Michal Swiatkowski [Fri, 20 Aug 2021 00:08:52 +0000 (17:08 -0700)]
ice: manage VSI antispoof and destination override
Implement functions to make setting VSI security config easier.
Main function ice_update_security fills security section field and
checks against error in updating VSI. Reset functions are responsible
for correct filling config according to user expectations.
This helper is needed because destination override is located in
this section. Driver has to set this bit to allow strering Tx packet
on VSI based on value in Tx descriptors.
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Michal Swiatkowski [Fri, 20 Aug 2021 00:08:51 +0000 (17:08 -0700)]
ice: allow process VF opcodes in different ways
In switchdev driver shouldn't add MAC, VLAN and promisc
filters on iavf demand but should return success to not
break normal iavf flow.
Achieve that by creating table of functions pointer with
default functions used to parse iavf command. While parse
iavf command, call correct function from table instead of
calling function direct.
When port representors are being created change functions
in table to new one that behaves correctly for switchdev
puprose (ignoring new filters).
Change back to default ops when representors are being
removed.
Co-developed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Michal Swiatkowski [Fri, 20 Aug 2021 00:08:50 +0000 (17:08 -0700)]
ice: introduce VF port representor
Port representor is used to manage VF from host side. To allow
it each created representor registers netdevice with random hw
address. Also devlink port is created for all representors.
Port representor name is created based on switch id or managed
by devlink core if devlink port was registered with success.
Open and stop ndo ops are implemented to allow managing the VF
link state. Link state is tracked in VF struct.
Struct ice_netdev_priv is extended by pointer to representor
field. This is needed to get correct representor from netdev
struct mostly used in ndo calls.
Implement helper functions to check if given netdev is netdev of
port representor (ice_is_port_repr_netdev) and to get representor
from netdev (ice_netdev_to_repr).
As driver mostly will create or destroy port representors on all
VFs instead of on single one, write functions to add and remove
representor for each VF.
Representor struct contains pointer to source VSI, which is VSI
configured on VF, backpointer to VF, backpointer to netdev,
q_vector pointer and metadata_dst which will be used in data path.
Co-developed-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Wojciech Drewek [Fri, 20 Aug 2021 00:08:49 +0000 (17:08 -0700)]
ice: Move devlink port to PF/VF struct
Keeping devlink port inside VSI data structure causes some issues.
Since VF VSI is released during reset that means that we have to
unregister devlink port and register it again every time reset is
triggered. With the new changes in devlink API it
might cause deadlock issues. After calling
devlink_port_register/devlink_port_unregister devlink API is going to
lock rtnl_mutex. It's an issue when VF reset is triggered in netlink
operation context (like setting VF MAC address or VLAN),
because rtnl_lock is already taken by netlink. Another call of
rtnl_lock from devlink API results in dead-lock.
By moving devlink port to PF/VF we avoid creating/destroying it
during reset. Since this patch, devlink ports are created during
ice_probe, destroyed during ice_remove for PF and created during
ice_repr_add, destroyed during ice_repr_rem for VF.
Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Michal Swiatkowski [Fri, 20 Aug 2021 00:08:48 +0000 (17:08 -0700)]
ice: support basic E-Switch mode control
Write set and get eswitch mode functions used by devlink
ops. Use new pf struct member eswitch_mode to track current
eswitch mode in driver.
Changing eswitch mode is only allowed when there are no
VFs created.
Create new file for eswitch related code.
Add config flag ICE_SWITCHDEV to allow user to choose if
switchdev support should be enabled or disabled.
Use case examples:
- show current eswitch mode ('legacy' is the default one)
[root@localhost]# devlink dev eswitch show pci/0000:03:00.1
pci/0000:03:00.1: mode legacy
- move to 'switchdev' mode
[root@localhost]# devlink dev eswitch set pci/0000:03:00.1 mode
switchdev
[root@localhost]# devlink dev eswitch show pci/0000:03:00.1
pci/0000:03:00.1: mode switchdev
- create 2 VFs
[root@localhost]# echo 2 > /sys/class/net/ens4f1/device/sriov_numvfs
- unsuccessful attempt to change eswitch mode while VFs are created
[root@localhost]# devlink dev eswitch set pci/0000:03:00.1 mode legacy
devlink answers: Operation not supported
- destroy VFs
[root@localhost]# echo 0 > /sys/class/net/ens4f1/device/sriov_numvfs
- restore 'legacy' mode
[root@localhost]# devlink dev eswitch set pci/0000:03:00.1 mode legacy
[root@localhost]# devlink dev eswitch show pci/0000:03:00.1
pci/0000:03:00.1: mode legacy
Co-developed-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>