platform/kernel/linux-starfive.git
2 years agoMerge ath-next from git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git
Kalle Valo [Thu, 28 Oct 2021 13:23:52 +0000 (16:23 +0300)]
Merge ath-next from git://git./linux/kernel/git/kvalo/ath.git

ath.git patches for v5.16. Major changes:

ath11k

* fix QCA6390 A-MSDU handling (CVE-2020-24588)

wcn36xx

* enable hardware scan offload for 5Ghz band

* add missing 5GHz channels 136 and 144

2 years agoath6kl: fix division by zero in send path
Johan Hovold [Wed, 27 Oct 2021 08:08:18 +0000 (10:08 +0200)]
ath6kl: fix division by zero in send path

Add the missing endpoint max-packet sanity check to probe() to avoid
division by zero in ath10k_usb_hif_tx_sg() in case a malicious device
has broken descriptors (or when doing descriptor fuzz testing).

Note that USB core will reject URBs submitted for endpoints with zero
wMaxPacketSize but that drivers doing packet-size calculations still
need to handle this (cf. commit 2548288b4fb0 ("USB: Fix: Don't skip
endpoint descriptors with maxpacket=0")).

Fixes: 9cbee358687e ("ath6kl: add full USB support")
Cc: stable@vger.kernel.org # 3.5
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211027080819.6675-3-johan@kernel.org
2 years agoath10k: fix division by zero in send path
Johan Hovold [Wed, 27 Oct 2021 08:08:17 +0000 (10:08 +0200)]
ath10k: fix division by zero in send path

Add the missing endpoint max-packet sanity check to probe() to avoid
division by zero in ath10k_usb_hif_tx_sg() in case a malicious device
has broken descriptors (or when doing descriptor fuzz testing).

Note that USB core will reject URBs submitted for endpoints with zero
wMaxPacketSize but that drivers doing packet-size calculations still
need to handle this (cf. commit 2548288b4fb0 ("USB: Fix: Don't skip
endpoint descriptors with maxpacket=0")).

Fixes: 4db66499df91 ("ath10k: add initial USB support")
Cc: stable@vger.kernel.org # 4.14
Cc: Erik Stromdahl <erik.stromdahl@gmail.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211027080819.6675-2-johan@kernel.org
2 years agoath6kl: fix control-message timeout
Johan Hovold [Mon, 25 Oct 2021 12:05:20 +0000 (14:05 +0200)]
ath6kl: fix control-message timeout

USB control-message timeouts are specified in milliseconds and should
specifically not vary with CONFIG_HZ.

Fixes: 241b128b6b69 ("ath6kl: add back beginnings of USB support")
Cc: stable@vger.kernel.org # 3.4
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211025120522.6045-3-johan@kernel.org
2 years agoath10k: fix control-message timeout
Johan Hovold [Mon, 25 Oct 2021 12:05:19 +0000 (14:05 +0200)]
ath10k: fix control-message timeout

USB control-message timeouts are specified in milliseconds and should
specifically not vary with CONFIG_HZ.

Fixes: 4db66499df91 ("ath10k: add initial USB support")
Cc: stable@vger.kernel.org # 4.14
Cc: Erik Stromdahl <erik.stromdahl@gmail.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211025120522.6045-2-johan@kernel.org
2 years agowcn36xx: add missing 5GHz channels 136 and 144
Benjamin Li [Mon, 25 Oct 2021 17:53:58 +0000 (10:53 -0700)]
wcn36xx: add missing 5GHz channels 136 and 144

The official feature-complete WCN3680B driver (known as prima, open source
but not upstream) supports channels 136 and 144.

However, these channels are missing in upstream. Add them here to get
closer to feature parity with prima.

Signed-off-by: Benjamin Li <benl@squareup.com>
Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211025175359.3591048-3-benl@squareup.com
2 years agowcn36xx: switch on antenna diversity feature bit
Benjamin Li [Mon, 25 Oct 2021 17:53:57 +0000 (10:53 -0700)]
wcn36xx: switch on antenna diversity feature bit

The official feature-complete WCN3680B driver (known as prima, open source
but not upstream) sends this feature bit.

As we wish to support the antenna diversity feature in upstream, we need
to set this bit as well.

Signed-off-by: Benjamin Li <benl@squareup.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211025175359.3591048-2-benl@squareup.com
2 years agowcn36xx: Channel list update before hardware scan
Loic Poulain [Mon, 25 Oct 2021 15:22:08 +0000 (17:22 +0200)]
wcn36xx: Channel list update before hardware scan

The channel scan list must be updated before triggering a hardware scan
so that firmware takes into account the regulatory info for each single
channel such as active/passive config, power, DFS, etc... Without this
the firmware uses its own internal default channel configuration, which
is not aligned with mac80211 regulatory rules, and misses several
channels (e.g. 144).

Fixes: 2f3bef4b247e ("wcn36xx: Add hardware scan offload support")
Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1635175328-25642-1-git-send-email-loic.poulain@linaro.org
2 years agoMerge tag 'mt76-for-kvalo-2021-10-23' of https://github.com/nbd168/wireless
Kalle Valo [Wed, 27 Oct 2021 15:36:30 +0000 (18:36 +0300)]
Merge tag 'mt76-for-kvalo-2021-10-23' of https://github.com/nbd168/wireless

mt76 patches for 5.16

* fix a compile error with !CONFIG_PM
* cleanups
* MT7915 DBDC fixes
* endian warning fixes

2 years agoRevert "wcn36xx: Enable firmware link monitoring"
Bryan O'Donoghue [Mon, 25 Oct 2021 09:30:37 +0000 (10:30 +0100)]
Revert "wcn36xx: Enable firmware link monitoring"

Firmware link offload monitoring can be made to work in 3/4 cases by
switching on firmware feature bit WLANACTIVE_OFFLOAD

- Secure power-save on
- Secure power-save off
- Open power-save on

However, with an open AP if we switch off power-saving - thus never
entering Beacon Mode Power Save - BMPS, firmware never forwards loss
of beacon upwards.

We had hoped that WLANACTIVE_OFFLOAD and some fixes for sequence numbers
would unblock this but, it hasn't and further investigation is required.

Its possible to have a complete set of Secure power-save on/off and Open
power-save on/off provided we use Linux' link monitoring mechanism.

While we debug the Open AP failure we need to fix upstream.

This reverts commit c973fdad79f6eaf247d48b5fc77733e989eb01e1.

Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211025093037.3966022-2-bryan.odonoghue@linaro.org
2 years agowcn36xx: Fix packet drop on resume
Loic Poulain [Mon, 25 Oct 2021 08:28:16 +0000 (10:28 +0200)]
wcn36xx: Fix packet drop on resume

If the system is resumed because of an incoming packet, the wcn36xx RX
interrupts is fired before actual resuming of the wireless/mac80211
stack, causing any received packets to be simply dropped. E.g. a ping
request causes a system resume, but is dropped and so never forwarded
to the IP stack.

This change fixes that, disabling DMA interrupts on suspend to no pass
packets until mac80211 is resumed and ready to handle them.

Note that it's not incompatible with RX irq wake.

Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1635150496-19290-1-git-send-email-loic.poulain@linaro.org
2 years agowcn36xx: Fix discarded frames due to wrong sequence number
Loic Poulain [Mon, 25 Oct 2021 08:25:36 +0000 (10:25 +0200)]
wcn36xx: Fix discarded frames due to wrong sequence number

The firmware is offering features such as ARP offload, for which
firmware crafts its own (QoS)packets without waking up the host.
Point is that the sequence numbers generated by the firmware are
not in sync with the host mac80211 layer and can cause packets
such as firmware ARP reponses to be dropped by the AP (too old SN).

To fix this we need to let the firmware manages the sequence
numbers by its own (except for QoS null frames). There is a SN
counter for each QoS queue and one global/baseline counter for
Non-QoS.

Fixes: 84aff52e4f57 ("wcn36xx: Use sequence number allocated by mac80211")
Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Tested-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1635150336-18736-1-git-send-email-loic.poulain@linaro.org
2 years agowcn36xx: add proper DMA memory barriers in rx path
Benjamin Li [Sat, 23 Oct 2021 00:15:28 +0000 (17:15 -0700)]
wcn36xx: add proper DMA memory barriers in rx path

This is essentially exactly following the dma_wmb()/dma_rmb() usage
instructions in Documentation/memory-barriers.txt.

The theoretical races here are:

1. DXE (the DMA Transfer Engine in the Wi-Fi subsystem) seeing the
dxe->ctrl & WCN36xx_DXE_CTRL_VLD write before the dxe->dst_addr_l
write, thus performing DMA into the wrong address.

2. CPU reading dxe->dst_addr_l before DXE unsets dxe->ctrl &
WCN36xx_DXE_CTRL_VLD. This should generally be harmless since DXE
doesn't write dxe->dst_addr_l (no risk of freeing the wrong skb).

Fixes: 8e84c2582169 ("wcn36xx: mac80211 driver for Qualcomm WCN3660/WCN3680 hardware")
Signed-off-by: Benjamin Li <benl@squareup.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211023001528.3077822-1-benl@squareup.com
2 years agowcn36xx: Fix HT40 capability for 2Ghz band
Loic Poulain [Wed, 20 Oct 2021 13:38:53 +0000 (15:38 +0200)]
wcn36xx: Fix HT40 capability for 2Ghz band

All wcn36xx controllers are supposed to support HT40 (and SGI40),
This doubles the maximum bitrate/throughput with compatible APs.

Tested with wcn3620 & wcn3680B.

Cc: stable@vger.kernel.org
Fixes: 8e84c2582169 ("wcn36xx: mac80211 driver for Qualcomm WCN3660/WCN3680 hardware")
Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1634737133-22336-1-git-send-email-loic.poulain@linaro.org
2 years agoRevert "wcn36xx: Disable bmps when encryption is disabled"
Bryan O'Donoghue [Fri, 22 Oct 2021 14:04:47 +0000 (15:04 +0100)]
Revert "wcn36xx: Disable bmps when encryption is disabled"

This reverts commit c6522a5076e1a65877c51cfee313a74ef61cabf8.

Testing on tip-of-tree shows that this is working now. Revert this and
re-enable BMPS for Open APs.

Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211022140447.2846248-3-bryan.odonoghue@linaro.org
2 years agowcn36xx: Treat repeated BMPS entry fail as connection loss
Bryan O'Donoghue [Fri, 22 Oct 2021 14:04:46 +0000 (15:04 +0100)]
wcn36xx: Treat repeated BMPS entry fail as connection loss

On an open AP when you pull the plug on the AP, if we are not already in
BMPS mode then the firmware will not generate a disconnection event.

Instead we need to monitor for failure to enter BMPS and treat a string of
failures as connection loss.

Secure AP connections don't appear to demonstrate this behavior so the
work-around is limited to open APs only.

Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211022140447.2846248-2-bryan.odonoghue@linaro.org
2 years agowcn36xx: Add chained transfer support for AMSDU
Loic Poulain [Mon, 25 Oct 2021 13:26:10 +0000 (16:26 +0300)]
wcn36xx: Add chained transfer support for AMSDU

WCNSS RX DMA transfer support is limited to 3872 bytes, which is
enough for simple MPDUs (single MSDU), but not enough for cases
with A-MSDU (depending on max AMSDU size or max MPDU size).

In that case the MPDU is spread over multiple transfers, with the
first transfer containing the MPDU header and (at least) the first
A-MSDU subframe and additional transfer(s) containing the following
A-MSDUs. This can be handled with a series of flags to tagging the
first and last A-MSDU transfers.

In that case we have to bufferize and re-linearize the A-MSDU buffers
into a proper MPDU skb before forwarding to mac80211 (in the same way
as it is done in ath10k).

This change also includes sanity check of the buffer descriptor to
prevent skb overflow.

Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1634557705-11120-1-git-send-email-loic.poulain@linaro.org
2 years agowcn36xx: Enable hardware scan offload for 5Ghz band
Loic Poulain [Mon, 18 Oct 2021 10:57:58 +0000 (12:57 +0200)]
wcn36xx: Enable hardware scan offload for 5Ghz band

Until now, offload scanning for 5Ghz channels was considered broken.
However it was mostly a driver issue, caused by bad reporting of the
beacons/probe-resp bands and frequencies, which has been fixed.

We can now allow offload scan for 5GHz band, this reduces the scanning
time comparing to software driven scanning.

Note that offloaded scan is limited to 48 channels, check for this.

Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Tested-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1634554678-7993-2-git-send-email-loic.poulain@linaro.org
2 years agowcn36xx: Correct band/freq reporting on RX
Loic Poulain [Mon, 18 Oct 2021 10:57:57 +0000 (12:57 +0200)]
wcn36xx: Correct band/freq reporting on RX

For packets originating from hardware scan, the channel and band is
included in the buffer descriptor (bd->rf_band & bd->rx_ch).

For 2Ghz band the channel value is directly reported in the 4-bit
rx_ch field. For 5Ghz band, the rx_ch field contains a mapping
index (given the 4-bit limitation).

The reserved0 value field is also used to extend 4-bit mapping to
5-bit mapping to support more than 16 5Ghz channels.

This change adds correct reporting of the frequency/band, that is
used in scan mechanism. And is required for 5Ghz hardware scan
support.

Signed-off-by: Loic Poulain <loic.poulain@linaro.org>
Tested-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/1634554678-7993-1-git-send-email-loic.poulain@linaro.org
2 years agolibertas: replace snprintf in show functions with sysfs_emit
Ye Guojin [Fri, 22 Oct 2021 09:04:38 +0000 (09:04 +0000)]
libertas: replace snprintf in show functions with sysfs_emit

coccicheck complains about the use of snprintf() in sysfs show
functions:
WARNING  use scnprintf or sprintf

Use sysfs_emit instead of scnprintf or sprintf makes more sense.

Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Ye Guojin <ye.guojin@zte.com.cn>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211022090438.1065286-1-ye.guojin@zte.com.cn
2 years agortw89: Fix variable dereferenced before check 'sta'
Ping-Ke Shih [Fri, 22 Oct 2021 06:12:42 +0000 (14:12 +0800)]
rtw89: Fix variable dereferenced before check 'sta'

The pointer rtwsta is dereferencing pointer sta before sta is being null
checked. Fix this by assigning sta->drv_priv to rtwsta only if sta is not
NULL, otherwise just NULL.

Fixes: e3ec7017f6a2 ("rtw89: add Realtek 802.11ax driver")
Reported-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211022061242.8383-1-pkshih@realtek.com
2 years agortw89: fix return value in hfc_pub_cfg_chk
Kevin Lo [Thu, 21 Oct 2021 06:32:27 +0000 (14:32 +0800)]
rtw89: fix return value in hfc_pub_cfg_chk

It seems to me when pub_cfg->grp0 + pub_cfg->grp1 != pub_cfg->pub_max is true,
it should return -EFAULT rather than 0.  Otherwise, the function doesn't need
to exist.

Signed-off-by: Kevin Lo <kevlo@kevlo.org>
Acked-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/YXEJey8lKksAZif4@ns.kevlo.org
2 years agortw89: remove duplicate register definitions
Kevin Lo [Thu, 21 Oct 2021 05:44:08 +0000 (13:44 +0800)]
rtw89: remove duplicate register definitions

Remove duplicate register definitions.

Signed-off-by: Kevin Lo <kevlo@kevlo.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/YXD+KL+xzFsnGShb@ns.kevlo.org
2 years agortw89: fix error function parameter
Lv Ruyi [Thu, 21 Oct 2021 04:20:35 +0000 (04:20 +0000)]
rtw89: fix error function parameter

This patch fixes the following Coccinelle warning:
drivers/net/wireless/realtek/rtw89/rtw8852a.c:753:
WARNING  possible condition with no effect (if == else)

Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Lv Ruyi <lv.ruyi@zte.com.cn>
Acked-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211021042035.1042463-1-lv.ruyi@zte.com.cn
2 years agolibertas: Fix possible memory leak in probe and disconnect
Wang Hai [Wed, 20 Oct 2021 12:03:45 +0000 (20:03 +0800)]
libertas: Fix possible memory leak in probe and disconnect

I got memory leak as follows when doing fault injection test:

unreferenced object 0xffff88812c7d7400 (size 512):
  comm "kworker/6:1", pid 176, jiffies 4295003332 (age 822.830s)
  hex dump (first 32 bytes):
    00 68 1e 04 81 88 ff ff 01 00 00 00 00 00 00 00  .h..............
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff8167939c>] slab_post_alloc_hook+0x9c/0x490
    [<ffffffff8167f627>] kmem_cache_alloc_trace+0x1f7/0x470
    [<ffffffffa02c9873>] if_usb_probe+0x63/0x446 [usb8xxx]
    [<ffffffffa022668a>] usb_probe_interface+0x1aa/0x3c0 [usbcore]
    [<ffffffff82b59630>] really_probe+0x190/0x480
    [<ffffffff82b59a19>] __driver_probe_device+0xf9/0x180
    [<ffffffff82b59af3>] driver_probe_device+0x53/0x130
    [<ffffffff82b5a075>] __device_attach_driver+0x105/0x130
    [<ffffffff82b55949>] bus_for_each_drv+0x129/0x190
    [<ffffffff82b593c9>] __device_attach+0x1c9/0x270
    [<ffffffff82b5a250>] device_initial_probe+0x20/0x30
    [<ffffffff82b579c2>] bus_probe_device+0x142/0x160
    [<ffffffff82b52e49>] device_add+0x829/0x1300
    [<ffffffffa02229b1>] usb_set_configuration+0xb01/0xcc0 [usbcore]
    [<ffffffffa0235c4e>] usb_generic_driver_probe+0x6e/0x90 [usbcore]
    [<ffffffffa022641f>] usb_probe_device+0x6f/0x130 [usbcore]

cardp is missing being freed in the error handling path of the probe
and the path of the disconnect, which will cause memory leak.

This patch adds the missing kfree().

Fixes: 876c9d3aeb98 ("[PATCH] Marvell Libertas 8388 802.11b/g USB driver")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211020120345.2016045-3-wanghai38@huawei.com
2 years agolibertas_tf: Fix possible memory leak in probe and disconnect
Wang Hai [Wed, 20 Oct 2021 12:03:44 +0000 (20:03 +0800)]
libertas_tf: Fix possible memory leak in probe and disconnect

I got memory leak as follows when doing fault injection test:

unreferenced object 0xffff88810a2ddc00 (size 512):
  comm "kworker/6:1", pid 176, jiffies 4295009893 (age 757.220s)
  hex dump (first 32 bytes):
    00 50 05 18 81 88 ff ff 00 00 00 00 00 00 00 00  .P..............
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff8167939c>] slab_post_alloc_hook+0x9c/0x490
    [<ffffffff8167f627>] kmem_cache_alloc_trace+0x1f7/0x470
    [<ffffffffa02a1530>] if_usb_probe+0x60/0x37c [libertas_tf_usb]
    [<ffffffffa022668a>] usb_probe_interface+0x1aa/0x3c0 [usbcore]
    [<ffffffff82b59630>] really_probe+0x190/0x480
    [<ffffffff82b59a19>] __driver_probe_device+0xf9/0x180
    [<ffffffff82b59af3>] driver_probe_device+0x53/0x130
    [<ffffffff82b5a075>] __device_attach_driver+0x105/0x130
    [<ffffffff82b55949>] bus_for_each_drv+0x129/0x190
    [<ffffffff82b593c9>] __device_attach+0x1c9/0x270
    [<ffffffff82b5a250>] device_initial_probe+0x20/0x30
    [<ffffffff82b579c2>] bus_probe_device+0x142/0x160
    [<ffffffff82b52e49>] device_add+0x829/0x1300
    [<ffffffffa02229b1>] usb_set_configuration+0xb01/0xcc0 [usbcore]
    [<ffffffffa0235c4e>] usb_generic_driver_probe+0x6e/0x90 [usbcore]
    [<ffffffffa022641f>] usb_probe_device+0x6f/0x130 [usbcore]

cardp is missing being freed in the error handling path of the probe
and the path of the disconnect, which will cause memory leak.

This patch adds the missing kfree().

Fixes: c305a19a0d0a ("libertas_tf: usb specific functions")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20211020120345.2016045-2-wanghai38@huawei.com
2 years agowlcore: spi: Use dev_err_probe()
Geert Uytterhoeven [Tue, 19 Oct 2021 12:48:41 +0000 (14:48 +0200)]
wlcore: spi: Use dev_err_probe()

Use the existing dev_err_probe() helper instead of open-coding the same
operation.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/465e76901b801ac0755088998249928fd546c08a.1634647460.git.geert+renesas@glider.be
2 years agoMerge tag 'iwlwifi-next-for-kalle-2021-10-22' of git://git.kernel.org/pub/scm/linux...
Kalle Valo [Wed, 27 Oct 2021 07:21:11 +0000 (10:21 +0300)]
Merge tag 'iwlwifi-next-for-kalle-2021-10-22' of git://git./linux/kernel/git/iwlwifi/iwlwifi-next

iwlwifi patches for v5.16

* Support for 160MHz in ranging measurements;
* Some fixes in HE capabilities;
* Fixes in vendor specific capabilities;
* Add the PC of both processors in error dumps;
* Small fix in TDLS;
* Code to sanitize firmware dumps;
* Updates for new FW rate and flags format;
* Continue implementation of new rate and flags format in the FW APIs;
* Some fixes for BZ family initialization;
* Fix session protection in some scenarios;
* Some debugging improvements;
* Fix BT-coex priority;
* Improve PS-poll timeout detection;
* Some other small fixes, clean-ups and improvements.

# gpg: Signature made Fri 22 Oct 2021 11:28:43 AM EEST
# gpg:                using RSA key 1772CD7E06F604F5A6EBCB26A1479CA21A3CC5FA
# gpg: Good signature from "Luciano Roth Coelho (Luca) <luca@coelho.fi>" [full]
# gpg:                 aka "Luciano Roth Coelho (Intel) <luciano.coelho@intel.com>" [full]

2 years agonet: phy: fixed warning: Function parameter not described
Luo Jie [Tue, 26 Oct 2021 10:29:57 +0000 (18:29 +0800)]
net: phy: fixed warning: Function parameter not described

Fixed warning: Function parameter or member 'enable' not
described in 'genphy_c45_fast_retrain'

Signed-off-by: Luo Jie <luoj@codeaurora.org>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20211026102957.17100-1-luoj@codeaurora.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet/mlx5: remove the recent devlink params
Jakub Kicinski [Tue, 26 Oct 2021 15:29:39 +0000 (08:29 -0700)]
net/mlx5: remove the recent devlink params

revert commit 46ae40b94d88 ("net/mlx5: Let user configure io_eq_size param")
revert commit a6cb08daa3b4 ("net/mlx5: Let user configure event_eq_size param")
revert commit 554604061979 ("net/mlx5: Let user configure max_macs param")

The EQE parameters are applicable to more drivers, they should
be configured via standard API, probably ethtool. Example of
another driver needing something similar:

https://lore.kernel.org/all/1633454136-14679-3-git-send-email-sbhatta@marvell.com/

The last param for "max_macs" is probably fine but the documentation
is severely lacking. The meaning and implications for changing the
param need to be stated.

Link: https://lore.kernel.org/r/20211026152939.3125950-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch 'phy-supported-interfaces-bitmap'
David S. Miller [Tue, 26 Oct 2021 14:10:37 +0000 (15:10 +0100)]
Merge branch 'phy-supported-interfaces-bitmap'

Russell King says:

====================
Introduce supported interfaces bitmap

This series introduces a new bitmap to allow us to indicate which
phy_interface_t modes are supported.

Currently, phylink will call ->validate with PHY_INTERFACE_MODE_NA to
request all link mode capabilities from the MAC driver before choosing
an interface to use. This leads in some cases to some rather hairly
code. This can be simplified if phylink is aware of the interface modes
that  the MAC supports, and it can instead walk those modes, calling
->validate for each one, and combining the results.

This series merely introduces the support; there is no change of
behaviour until MAC drivers populate their supported_interfaces bitmap.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: phylink: use supported_interfaces for phylink validation
Russell King (Oracle) [Tue, 26 Oct 2021 10:06:11 +0000 (11:06 +0100)]
net: phylink: use supported_interfaces for phylink validation

If the network device supplies a supported interface bitmap, we can use
that during phylink's validation to simplify MAC drivers in two ways by
using the supported_interfaces bitmap to:

1. reject unsupported interfaces before calling into the MAC driver.
2. generate the set of all supported link modes across all supported
   interfaces (used mainly for SFP, but also some 10G PHYs.)

Suggested-by: Sean Anderson <sean.anderson@seco.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: phylink: add MAC phy_interface_t bitmap
Russell King [Tue, 26 Oct 2021 10:06:06 +0000 (11:06 +0100)]
net: phylink: add MAC phy_interface_t bitmap

Add a phy_interface_t bitmap so the MAC driver can specifiy which PHY
interface modes it supports.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: phy: add phy_interface_t bitmap support
Russell King (Oracle) [Tue, 26 Oct 2021 10:06:01 +0000 (11:06 +0100)]
net: phy: add phy_interface_t bitmap support

Add support for a bitmap for phy interface modes, which includes:
- a macro to declare the interface bitmap
- an inline helper to zero the interface bitmap
- an inline helper to detect an empty interface bitmap
- inline helpers to do a bitwise AND and OR operations on two interface
  bitmaps

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge branch 'dsa-isolation-prep'
David S. Miller [Tue, 26 Oct 2021 14:07:36 +0000 (15:07 +0100)]
Merge branch 'dsa-isolation-prep'

Vladimir Oltean says:

====================
DSA preparations for FDB isolation between bridges

This series makes 2 small changes to DSA's SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE
handler, which will make it possible to offer switch drivers a stable
association between a FDB entry and a bridge device in a future series.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: dsa: stop calling dev_hold in dsa_slave_fdb_event
Vladimir Oltean [Tue, 26 Oct 2021 09:25:56 +0000 (12:25 +0300)]
net: dsa: stop calling dev_hold in dsa_slave_fdb_event

Now that we guarantee that SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE events have
finished executing by the time we leave our bridge upper interface,
we've established a stronger boundary condition for how long the
dsa_slave_switchdev_event_work() might run.

As such, it is no longer possible for DSA slave interfaces to become
unregistered, since they are still bridge ports.

So delete the unnecessary dev_hold() and dev_put().

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: dsa: flush switchdev workqueue when leaving the bridge
Vladimir Oltean [Tue, 26 Oct 2021 09:25:55 +0000 (12:25 +0300)]
net: dsa: flush switchdev workqueue when leaving the bridge

DSA is preparing to offer switch drivers an API through which they can
associate each FDB entry with a struct net_device *bridge_dev. This can
be used to perform FDB isolation (the FDB lookup performed on the
ingress of a standalone, or bridged port, should not find an FDB entry
that is present in the FDB of another bridge).

In preparation of that work, DSA needs to ensure that by the time we
call the switch .port_fdb_add and .port_fdb_del methods, the
dp->bridge_dev pointer is still valid, i.e. the port is still a bridge
port.

This is not guaranteed because the SWITCHDEV_FDB_{ADD,DEL}_TO_DEVICE API
requires drivers that must have sleepable context to handle those events
to schedule the deferred work themselves. DSA does this through the
dsa_owq.

It can happen that a port leaves a bridge, del_nbp() flushes the FDB on
that port, SWITCHDEV_FDB_DEL_TO_DEVICE is notified in atomic context,
DSA schedules its deferred work, but del_nbp() finishes unlinking the
bridge as a master from the port before DSA's deferred work is run.

Fundamentally, the port must not be unlinked from the bridge until all
FDB deletion deferred work items have been flushed. The bridge must wait
for the completion of these hardware accesses.

An attempt has been made to address this issue centrally in switchdev by
making SWITCHDEV_FDB_DEL_TO_DEVICE deferred (=> blocking) at the switchdev
level, which would offer implicit synchronization with del_nbp:

https://patchwork.kernel.org/project/netdevbpf/cover/20210820115746.3701811-1-vladimir.oltean@nxp.com/

but it seems that any attempt to modify switchdev's behavior and make
the events blocking there would introduce undesirable side effects in
other switchdev consumers.

The most undesirable behavior seems to be that
switchdev_deferred_process_work() takes the rtnl_mutex itself, which
would be worse off than having the rtnl_mutex taken individually from
drivers which is what we have now (except DSA which has removed that
lock since commit 0faf890fc519 ("net: dsa: drop rtnl_lock from
dsa_slave_switchdev_event_work")).

So to offer the needed guarantee to DSA switch drivers, I have come up
with a compromise solution that does not require switchdev rework:
we already have a hook at the last moment in time when the bridge is
still an upper of ours: the NETDEV_PRECHANGEUPPER handler. We can flush
the dsa_owq manually from there, which makes all FDB deletions
synchronous.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoifb: Depend on netfilter alternatively to tc
Lukas Wunner [Tue, 26 Oct 2021 05:15:32 +0000 (07:15 +0200)]
ifb: Depend on netfilter alternatively to tc

IFB originally depended on NET_CLS_ACT for traffic redirection.
But since v4.5, that may be achieved with NFT_FWD_NETDEV as well.

Fixes: 39e6dea28adc ("netfilter: nf_tables: add forward expression to the netdev family")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Cc: <stable@vger.kernel.org> # v4.5+: bcfabee1afd9: netfilter: nft_fwd_netdev: allow to redirect to ifb via ingress
Cc: <stable@vger.kernel.org> # v4.5+
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agomctp: Implement extended addressing
Jeremy Kerr [Tue, 26 Oct 2021 01:57:28 +0000 (09:57 +0800)]
mctp: Implement extended addressing

This change allows an extended address struct - struct sockaddr_mctp_ext
- to be passed to sendmsg/recvmsg. This allows userspace to specify
output ifindex and physical address information (for sendmsg) or receive
the input ifindex/physaddr for incoming messages (for recvmsg). This is
typically used by userspace for MCTP address discovery and assignment
operations.

The extended addressing facility is conditional on a new sockopt:
MCTP_OPT_ADDR_EXT; userspace must explicitly enable addressing before
the kernel will consume/populate the extended address data.

Includes a fix for an uninitialised var:
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: ax88796c: Remove pointless check in ax88796c_open()
Nathan Chancellor [Mon, 25 Oct 2021 21:12:39 +0000 (14:12 -0700)]
net: ax88796c: Remove pointless check in ax88796c_open()

Clang warns:

drivers/net/ethernet/asix/ax88796c_main.c:851:24: error: address of
array 'ax_local->phydev->advertising' will always evaluate to 'true'
[-Werror,-Wpointer-bool-conversion]
        if (ax_local->phydev->advertising &&
            ~~~~~~~~~~~~~~~~~~^~~~~~~~~~~ ~~

advertising cannot be NULL here if ax_local is not NULL, which cannot
happen due to the check in ax88796c_probe(). Remove the check.

Link: https://github.com/ClangBuiltLinux/linux/issues/1492
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: ax88796c: Fix clang -Wimplicit-fallthrough in ax88796c_set_mac()
Nathan Chancellor [Mon, 25 Oct 2021 21:12:38 +0000 (14:12 -0700)]
net: ax88796c: Fix clang -Wimplicit-fallthrough in ax88796c_set_mac()

Clang warns:

drivers/net/ethernet/asix/ax88796c_main.c:696:2: error: unannotated fall-through between switch labels [-Werror,-Wimplicit-fallthrough]
        case SPEED_10:
        ^
drivers/net/ethernet/asix/ax88796c_main.c:696:2: note: insert 'break;' to avoid fall-through
        case SPEED_10:
        ^
        break;
drivers/net/ethernet/asix/ax88796c_main.c:706:2: error: unannotated fall-through between switch labels [-Werror,-Wimplicit-fallthrough]
        case DUPLEX_HALF:
        ^
drivers/net/ethernet/asix/ax88796c_main.c:706:2: note: insert 'break;' to avoid fall-through
        case DUPLEX_HALF:
        ^
        break;

Clang is a little more pedantic than GCC, which permits implicit
fallthroughs to cases that contain just break or return. Clang's version
is more in line with the kernel's own stance in deprecated.rst, which
states that all switch/case blocks must end in either break,
fallthrough, continue, goto, or return. Add the missing breaks to fix
the warning.

Link: https://github.com/ClangBuiltLinux/linux/issues/1491
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: mana: Allow setting the number of queues while the NIC is down
Haiyang Zhang [Mon, 25 Oct 2021 18:37:34 +0000 (11:37 -0700)]
net: mana: Allow setting the number of queues while the NIC is down

The existing code doesn't allow setting the number of queues while the
NIC is down.

Update the ethtool handler functions to support setting the number of
queues while the NIC is at down state.

Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: hsr: Add support for redbox supervision frames
Andreas Oetken [Mon, 25 Oct 2021 18:56:18 +0000 (20:56 +0200)]
net: hsr: Add support for redbox supervision frames

added support for the redbox supervision frames
as defined in the IEC-62439-3:2018.

Signed-off-by: Andreas Oetken <andreas.oetken@siemens-energy.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge branch 'tcp_stream_alloc_skb'
David S. Miller [Tue, 26 Oct 2021 13:45:12 +0000 (14:45 +0100)]
Merge branch 'tcp_stream_alloc_skb'

Eric Dumazet says:

====================
tcp: tcp_stream_alloc_skb() changes

sk_stream_alloc_skb() is only used by TCP.

Rename it to tcp_stream_alloc_skb() and apply small
optimizations.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotcp: remove unneeded code from tcp_stream_alloc_skb()
Eric Dumazet [Mon, 25 Oct 2021 22:13:42 +0000 (15:13 -0700)]
tcp: remove unneeded code from tcp_stream_alloc_skb()

Aligning @size argument to 4 bytes is not needed.

The header alignment has nothing to do with @size.

It really depends on skb->head alignment and MAX_TCP_HEADER.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotcp: use MAX_TCP_HEADER in tcp_stream_alloc_skb
Eric Dumazet [Mon, 25 Oct 2021 22:13:41 +0000 (15:13 -0700)]
tcp: use MAX_TCP_HEADER in tcp_stream_alloc_skb

Both IPv4 and IPv6 uses same reserve, no need risking
cache line misses to fetch its value.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotcp: rename sk_stream_alloc_skb
Eric Dumazet [Mon, 25 Oct 2021 22:13:40 +0000 (15:13 -0700)]
tcp: rename sk_stream_alloc_skb

sk_stream_alloc_skb() is only used by TCP.

Rename it to make this clear, and move its declaration
to include/net/tcp.h

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet: annotate data-race in neigh_output()
Eric Dumazet [Mon, 25 Oct 2021 18:15:55 +0000 (11:15 -0700)]
net: annotate data-race in neigh_output()

neigh_output() reads n->nud_state and hh->hh_len locklessly.

This is fine, but we need to add annotations and document this.

We evaluate skip_cache first to avoid reading these fields
if the cache has to by bypassed.

syzbot report:

BUG: KCSAN: data-race in __neigh_event_send / ip_finish_output2

write to 0xffff88810798a885 of 1 bytes by interrupt on cpu 1:
 __neigh_event_send+0x40d/0xac0 net/core/neighbour.c:1128
 neigh_event_send include/net/neighbour.h:444 [inline]
 neigh_resolve_output+0x104/0x410 net/core/neighbour.c:1476
 neigh_output include/net/neighbour.h:510 [inline]
 ip_finish_output2+0x80a/0xaa0 net/ipv4/ip_output.c:221
 ip_finish_output+0x3b5/0x510 net/ipv4/ip_output.c:309
 NF_HOOK_COND include/linux/netfilter.h:296 [inline]
 ip_output+0xf3/0x1a0 net/ipv4/ip_output.c:423
 dst_output include/net/dst.h:450 [inline]
 ip_local_out+0x164/0x220 net/ipv4/ip_output.c:126
 __ip_queue_xmit+0x9d3/0xa20 net/ipv4/ip_output.c:525
 ip_queue_xmit+0x34/0x40 net/ipv4/ip_output.c:539
 __tcp_transmit_skb+0x142a/0x1a00 net/ipv4/tcp_output.c:1405
 tcp_transmit_skb net/ipv4/tcp_output.c:1423 [inline]
 tcp_xmit_probe_skb net/ipv4/tcp_output.c:4011 [inline]
 tcp_write_wakeup+0x4a9/0x810 net/ipv4/tcp_output.c:4064
 tcp_send_probe0+0x2c/0x2b0 net/ipv4/tcp_output.c:4079
 tcp_probe_timer net/ipv4/tcp_timer.c:398 [inline]
 tcp_write_timer_handler+0x394/0x520 net/ipv4/tcp_timer.c:626
 tcp_write_timer+0xb9/0x180 net/ipv4/tcp_timer.c:642
 call_timer_fn+0x2e/0x1d0 kernel/time/timer.c:1421
 expire_timers+0x135/0x240 kernel/time/timer.c:1466
 __run_timers+0x368/0x430 kernel/time/timer.c:1734
 run_timer_softirq+0x19/0x30 kernel/time/timer.c:1747
 __do_softirq+0x12c/0x26e kernel/softirq.c:558
 invoke_softirq kernel/softirq.c:432 [inline]
 __irq_exit_rcu kernel/softirq.c:636 [inline]
 irq_exit_rcu+0x4e/0xa0 kernel/softirq.c:648
 sysvec_apic_timer_interrupt+0x69/0x80 arch/x86/kernel/apic/apic.c:1097
 asm_sysvec_apic_timer_interrupt+0x12/0x20
 native_safe_halt arch/x86/include/asm/irqflags.h:51 [inline]
 arch_safe_halt arch/x86/include/asm/irqflags.h:89 [inline]
 acpi_safe_halt drivers/acpi/processor_idle.c:109 [inline]
 acpi_idle_do_entry drivers/acpi/processor_idle.c:553 [inline]
 acpi_idle_enter+0x258/0x2e0 drivers/acpi/processor_idle.c:688
 cpuidle_enter_state+0x2b4/0x760 drivers/cpuidle/cpuidle.c:237
 cpuidle_enter+0x3c/0x60 drivers/cpuidle/cpuidle.c:351
 call_cpuidle kernel/sched/idle.c:158 [inline]
 cpuidle_idle_call kernel/sched/idle.c:239 [inline]
 do_idle+0x1a3/0x250 kernel/sched/idle.c:306
 cpu_startup_entry+0x15/0x20 kernel/sched/idle.c:403
 secondary_startup_64_no_verify+0xb1/0xbb

read to 0xffff88810798a885 of 1 bytes by interrupt on cpu 0:
 neigh_output include/net/neighbour.h:507 [inline]
 ip_finish_output2+0x79a/0xaa0 net/ipv4/ip_output.c:221
 ip_finish_output+0x3b5/0x510 net/ipv4/ip_output.c:309
 NF_HOOK_COND include/linux/netfilter.h:296 [inline]
 ip_output+0xf3/0x1a0 net/ipv4/ip_output.c:423
 dst_output include/net/dst.h:450 [inline]
 ip_local_out+0x164/0x220 net/ipv4/ip_output.c:126
 __ip_queue_xmit+0x9d3/0xa20 net/ipv4/ip_output.c:525
 ip_queue_xmit+0x34/0x40 net/ipv4/ip_output.c:539
 __tcp_transmit_skb+0x142a/0x1a00 net/ipv4/tcp_output.c:1405
 tcp_transmit_skb net/ipv4/tcp_output.c:1423 [inline]
 tcp_xmit_probe_skb net/ipv4/tcp_output.c:4011 [inline]
 tcp_write_wakeup+0x4a9/0x810 net/ipv4/tcp_output.c:4064
 tcp_send_probe0+0x2c/0x2b0 net/ipv4/tcp_output.c:4079
 tcp_probe_timer net/ipv4/tcp_timer.c:398 [inline]
 tcp_write_timer_handler+0x394/0x520 net/ipv4/tcp_timer.c:626
 tcp_write_timer+0xb9/0x180 net/ipv4/tcp_timer.c:642
 call_timer_fn+0x2e/0x1d0 kernel/time/timer.c:1421
 expire_timers+0x135/0x240 kernel/time/timer.c:1466
 __run_timers+0x368/0x430 kernel/time/timer.c:1734
 run_timer_softirq+0x19/0x30 kernel/time/timer.c:1747
 __do_softirq+0x12c/0x26e kernel/softirq.c:558
 invoke_softirq kernel/softirq.c:432 [inline]
 __irq_exit_rcu kernel/softirq.c:636 [inline]
 irq_exit_rcu+0x4e/0xa0 kernel/softirq.c:648
 sysvec_apic_timer_interrupt+0x69/0x80 arch/x86/kernel/apic/apic.c:1097
 asm_sysvec_apic_timer_interrupt+0x12/0x20
 native_safe_halt arch/x86/include/asm/irqflags.h:51 [inline]
 arch_safe_halt arch/x86/include/asm/irqflags.h:89 [inline]
 acpi_safe_halt drivers/acpi/processor_idle.c:109 [inline]
 acpi_idle_do_entry drivers/acpi/processor_idle.c:553 [inline]
 acpi_idle_enter+0x258/0x2e0 drivers/acpi/processor_idle.c:688
 cpuidle_enter_state+0x2b4/0x760 drivers/cpuidle/cpuidle.c:237
 cpuidle_enter+0x3c/0x60 drivers/cpuidle/cpuidle.c:351
 call_cpuidle kernel/sched/idle.c:158 [inline]
 cpuidle_idle_call kernel/sched/idle.c:239 [inline]
 do_idle+0x1a3/0x250 kernel/sched/idle.c:306
 cpu_startup_entry+0x15/0x20 kernel/sched/idle.c:403
 rest_init+0xee/0x100 init/main.c:734
 arch_call_rest_init+0xa/0xb
 start_kernel+0x5e4/0x669 init/main.c:1142
 secondary_startup_64_no_verify+0xb1/0xbb

value changed: 0x20 -> 0x01

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.15.0-rc6-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge branch 'mlxsw-rif-mac-prefixes'
David S. Miller [Tue, 26 Oct 2021 12:35:59 +0000 (13:35 +0100)]
Merge branch 'mlxsw-rif-mac-prefixes'

Ido Schimmel says:

====================
mlxsw: Support multiple RIF MAC prefixes

Currently, mlxsw enforces that all the netdevs used as router interfaces
(RIFs) have the same MAC prefix (e.g., same 38 MSBs in Spectrum-1).
Otherwise, an error is returned to user space with extack. This patchset
relaxes the limitation through the use of RIF MAC profiles.

A RIF MAC profile is a hardware entity that represents a particular MAC
prefix which multiple RIFs can reference. Therefore, the number of
possible MAC prefixes is no longer one, but the number of profiles
supported by the device.

The ability to change the MAC of a particular netdev is useful, for
example, for users who use the netdev to connect to an upstream provider
that performs MAC filtering. Currently, such users are either forced to
negotiate with the provider or change the MAC address of all other
netdevs so that they share the same prefix.

Patchset overview:

Patches #1-#3 are preparations.

Patch #4 adds actual support for RIF MAC profiles.

Patch #5 exposes RIF MAC profiles as a devlink resource, so that user
space has visibility into the maximum number of profiles and current
occupancy. Useful for debugging and testing (next 3 patches).

Patches #6-#8 add both scale and functional tests.

Patch #9 removes tests that validated the previous limitation. It is now
covered by patch #6 for devices that support a single profile.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoselftests: mlxsw: Remove deprecated test cases
Danielle Ratson [Tue, 26 Oct 2021 09:42:25 +0000 (12:42 +0300)]
selftests: mlxsw: Remove deprecated test cases

After adding the previous patches, the constraint that all the router
interface MAC addresses have the same prefix is no longer relevant.

Remove the test cases that validated that this constraint is honored.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoselftests: Add an occupancy test for RIF MAC profiles
Danielle Ratson [Tue, 26 Oct 2021 09:42:24 +0000 (12:42 +0300)]
selftests: Add an occupancy test for RIF MAC profiles

When all the RIF MAC profiles are in use, test that it is possible to
change the MAC of a netdev (i.e., a RIF) when its MAC profile is not
shared with other RIFs. Test that replacement fails when the MAC profile
is shared.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoselftests: mlxsw: Add forwarding test for RIF MAC profiles
Danielle Ratson [Tue, 26 Oct 2021 09:42:23 +0000 (12:42 +0300)]
selftests: mlxsw: Add forwarding test for RIF MAC profiles

Verify that MAC profile changes are indeed applied and that packets are
forwarded with the correct source MAC.

Output example:

$ ./rif_mac_profiles.sh
TEST: h1->h2: new mac profile                                       [ OK ]
TEST: h2->h1: new mac profile                                       [ OK ]
TEST: h1->h2: edit mac profile                                      [ OK ]
TEST: h2->h1: edit mac profile                                      [ OK ]

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoselftests: mlxsw: Add a scale test for RIF MAC profiles
Danielle Ratson [Tue, 26 Oct 2021 09:42:22 +0000 (12:42 +0300)]
selftests: mlxsw: Add a scale test for RIF MAC profiles

Query the maximum number of supported RIF MAC profiles using
devlink-resource and verify that all available MAC profiles can be utilized
and that an error is generated when user space tries to exceed this number.

Output example in Spectrum-2:

$ TESTS='rif_mac_profile' ./resource_scale.sh
TEST: 'rif_mac_profile' 4                                           [ OK ]
TEST: 'rif_mac_profile' overflow 5                                  [ OK ]

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agomlxsw: spectrum_router: Expose RIF MAC profiles to devlink resource
Danielle Ratson [Tue, 26 Oct 2021 09:42:21 +0000 (12:42 +0300)]
mlxsw: spectrum_router: Expose RIF MAC profiles to devlink resource

Expose via devlink-resource the maximum number of RIF MAC profiles and
their current occupancy, so it can be used for debug and writing generic
tests, like in the next patch.

Example for Spectrum-2 output:

$ devlink resource show pci/0000:06:00.0
...
  name rif_mac_profiles size 4 occ 0 unit entry dpipe_tables none

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agomlxsw: spectrum_router: Add RIF MAC profiles support
Danielle Ratson [Tue, 26 Oct 2021 09:42:20 +0000 (12:42 +0300)]
mlxsw: spectrum_router: Add RIF MAC profiles support

Currently, mlxsw enforces that all the router interfaces (RIFs) have the
same MAC prefix.

Relax this limitation by using RIF MAC profiles. Each profile is
associated with a particular MAC prefix and multiple RIFs can use the
same profile. Therefore, the number of possible MAC prefixes is no
longer one, but the number of profiles supported by the device.

Store the profiles in an IDR and reference count them according to the
number of RIFs using them.

Associate a RIF with a profile when the RIF is created and remove the
association when the RIF is deleted.

Change the association following 'NETDEV_CHANGEADDR' events, except when
only one RIF is using the profile. In which case, change the MAC prefix
of the profile itself instead of associating the RIF with a new profile.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agomlxsw: spectrum_router: Propagate extack further
Danielle Ratson [Tue, 26 Oct 2021 09:42:19 +0000 (12:42 +0300)]
mlxsw: spectrum_router: Propagate extack further

The next patch will set the MAC profile of a router interface (RIF) as
part of its configure() callback. The operation can fail in case the
maximum number of profiles was exceeded.

Add extack to mlxsw_sp_rif_ops::configure() in order to communicate such
failures to user space.

In addition, the MAC profile of a RIF can change following a
'NETDEV_CHANGEADDR' notification. Propagate extack to
mlxsw_sp_router_port_change_event() so that failures could be
communicated in this path as well.

No functional changes intended.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agomlxsw: resources: Add resource identifier for RIF MAC profiles
Danielle Ratson [Tue, 26 Oct 2021 09:42:18 +0000 (12:42 +0300)]
mlxsw: resources: Add resource identifier for RIF MAC profiles

Add a resource identifier for maximum RIF MAC profiles so that it could
be later used to query the information from firmware.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agomlxsw: reg: Add MAC profile ID field to RITR register
Danielle Ratson [Tue, 26 Oct 2021 09:42:17 +0000 (12:42 +0300)]
mlxsw: reg: Add MAC profile ID field to RITR register

Add MAC profile ID field to RITR register so that it could be used for
associating a RIF with a MAC profile ID by a later patch.

Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge branch 'netfilter-vrf-rework'
David S. Miller [Tue, 26 Oct 2021 12:21:10 +0000 (13:21 +0100)]
Merge branch 'netfilter-vrf-rework'

Florian Westphal says:

====================
vrf: rework interaction with netfilter/conntrack

V2:
- fix 'plain integer as null pointer' warning
- reword commit message in patch 2 to clarify loss of 'ct set untracked'

This patch series aims to solve the to-be-reverted change 09e856d54bda5f288e
("vrf: Reset skb conntrack connection on VRF rcv") in a different way.

Rather than have skbs pass through conntrack and nat hooks twice, suppress
conntrack invocation if the conntrack/nat hook is called from the vrf driver.

First patch deals with 'incoming connection' case:
1. suppress NAT transformations
2. skip conntrack confirmation

NAT and conntrack confirmation is done when ip/ipv6 stack calls
the postrouting hook.

Second patch deals with local packets:
in vrf driver, mark the skbs as 'untracked', so conntrack output
hook ignores them.  This skips all nat hooks as well.

Afterwards, remove the untracked state again so the second
round will pick them up.

One alternative to the chosen implementation would be to add a 'caller
id' field to 'struct nf_hook_state' and then use that, these patches
use the more straightforward check of VRF flag on the state->out device.

The two patches apply to both net and net-next, i am targeting -next
because I think that since snat did not work correctly for so long that
we can take the longer route.  If you disagree, apply to net at your
discretion.

The patches apply both with 09e856d54bda5f288e reverted or still
in-place, but only with the revert in place ingress conntrack settings
(zone, notrack etc) start working again.

I've already submitted selftests for vrf+nfqueue and conntrack+vrf.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agovrf: run conntrack only in context of lower/physdev for locally generated packets
Florian Westphal [Mon, 25 Oct 2021 14:14:00 +0000 (16:14 +0200)]
vrf: run conntrack only in context of lower/physdev for locally generated packets

The VRF driver invokes netfilter for output+postrouting hooks so that users
can create rules that check for 'oif $vrf' rather than lower device name.

This is a problem when NAT rules are configured.

To avoid any conntrack involvement in round 1, tag skbs as 'untracked'
to prevent conntrack from picking them up.

This gets cleared before the packet gets handed to the ip stack so
conntrack will be active on the second iteration.

One remaining issue is that a rule like

  output ... oif $vrfname notrack

won't propagate to the second round because we can't tell
'notrack set via ruleset' and 'notrack set by vrf driver' apart.
However, this isn't a regression: the 'notrack' removal happens
instead of unconditional nf_reset_ct().
I'd also like to avoid leaking more vrf specific conditionals into the
netfilter infra.

For ingress, conntrack has already been done before the packet makes it
to the vrf driver, with this patch egress does connection tracking with
lower/physical device as well.

Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonetfilter: conntrack: skip confirmation and nat hooks in postrouting for vrf
Florian Westphal [Mon, 25 Oct 2021 14:13:59 +0000 (16:13 +0200)]
netfilter: conntrack: skip confirmation and nat hooks in postrouting for vrf

The VRF driver invokes netfilter for output+postrouting hooks so that users
can create rules that check for 'oif $vrf' rather than lower device name.

Afterwards, ip stack calls those hooks again.

This is a problem when conntrack is used with IP masquerading.
masquerading has an internal check that re-validates the output
interface to account for route changes.

This check will trigger in the vrf case.

If the -j MASQUERADE rule matched on the first iteration, then round 2
finds state->out->ifindex != nat->masq_index: the latter is the vrf
index, but out->ifindex is the lower device.

The packet gets dropped and the conntrack entry is invalidated.

This change makes conntrack postrouting skip the nat hooks.
Also skip confirmation.  This allows the second round
(postrouting invocation from ipv4/ipv6) to create nat bindings.

This also prevents the second round from seeing packets that had their
source address changed by the nat hook.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge tag 'mlx5-updates-2021-10-25' of git://git.kernel.org/pub/scm/linux/kernel...
David S. Miller [Tue, 26 Oct 2021 12:17:45 +0000 (13:17 +0100)]
Merge tag 'mlx5-updates-2021-10-25' of git://git./linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2021-10-25

Misc updates for mlx5 driver:

1) Misc updates and cleanups:
 - Don't write directly to netdev->dev_addr, From Jakub Kicinski
 - Remove unnecessary checks for slow path flag in tc module
 - Fix unused function warning of mlx5i_flow_type_mask
 - Bridge, support replacing existing FDB entry

2) Sub Functions, Reduction in memory usage:
 - Reduce flow counters bulk query buffer size
 - Implement max_macs devlink parameter
 - Add devlink vendor params to control Event Queue sizes
 - Added SF life cycle trace points by Parav/

3) From Aya, Firmware health buffer reporting improvements
 - Print health buffer by log level and more missing information
 - Periodic update of host time to firmware
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotcp: don't free a FIN sk_buff in tcp_remove_empty_skb()
Jon Maxwell [Sun, 24 Oct 2021 23:59:03 +0000 (10:59 +1100)]
tcp: don't free a FIN sk_buff in tcp_remove_empty_skb()

v1: Implement a more general statement as recommended by Eric Dumazet. The
sequence number will be advanced, so this check will fix the FIN case and
other cases.

A customer reported sockets stuck in the CLOSING state. A Vmcore revealed that
the write_queue was not empty as determined by tcp_write_queue_empty() but the
sk_buff containing the FIN flag had been freed and the socket was zombied in
that state. Corresponding pcaps show no FIN from the Linux kernel on the wire.

Some instrumentation was added to the kernel and it was found that there is a
timing window where tcp_sendmsg() can run after tcp_send_fin().

tcp_sendmsg() will hit an error, for example:

1269 ▹       if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN))↩
1270 ▹       ▹       goto do_error;↩

tcp_remove_empty_skb() will then free the FIN sk_buff as "skb->len == 0". The
TCP socket is now wedged in the FIN-WAIT-1 state because the FIN is never sent.

If the other side sends a FIN packet the socket will transition to CLOSING and
remain that way until the system is rebooted.

Fix this by checking for the FIN flag in the sk_buff and don't free it if that
is the case. Testing confirmed that fixed the issue.

Fixes: fdfc5c8594c2 ("tcp: remove empty skb from write queue in error cases")
Signed-off-by: Jon Maxwell <jmaxwell37@gmail.com>
Reported-by: Monir Zouaoui <Monir.Zouaoui@mail.schwarz>
Reported-by: Simon Stier <simon.stier@mail.schwarz>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge branch 'small-fixes-for-true-expression-checks'
Jakub Kicinski [Tue, 26 Oct 2021 02:11:17 +0000 (19:11 -0700)]
Merge branch 'small-fixes-for-true-expression-checks'

Jean Sacren says:

====================
Small fixes for true expression checks

This series fixes checks of true !rc expression.
====================

Link: https://lore.kernel.org/r/cover.1634974124.git.sakiwit@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: qed_dev: fix check of true !rc expression
Jean Sacren [Sat, 23 Oct 2021 09:26:15 +0000 (03:26 -0600)]
net: qed_dev: fix check of true !rc expression

Remove the check of !rc in (!rc && !resc_lock_params.b_granted) since it
is always true.

Signed-off-by: Jean Sacren <sakiwit@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: qed_ptp: fix check of true !rc expression
Jean Sacren [Sat, 23 Oct 2021 09:26:14 +0000 (03:26 -0600)]
net: qed_ptp: fix check of true !rc expression

Remove the check of !rc in (!rc && !params.b_granted) since it is always
true.

We should also use constant 0 for return.

Signed-off-by: Jean Sacren <sakiwit@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch 'tcp-receive-path-optimizations'
Jakub Kicinski [Tue, 26 Oct 2021 01:02:16 +0000 (18:02 -0700)]
Merge branch 'tcp-receive-path-optimizations'

Eric Dumazet says:

====================
tcp: receive path optimizations

This series aims to reduce cache line misses in RX path.

I am still working on better cache locality in tcp_sock but
this will wait few more weeks.
====================

Link: https://lore.kernel.org/r/20211025164825.259415-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoipv6/tcp: small drop monitor changes
Eric Dumazet [Mon, 25 Oct 2021 16:48:25 +0000 (09:48 -0700)]
ipv6/tcp: small drop monitor changes

Two kfree_skb() calls must be replaced by consume_skb()
for skbs that are not technically dropped.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoipv4: guard IP_MINTTL with a static key
Eric Dumazet [Mon, 25 Oct 2021 16:48:24 +0000 (09:48 -0700)]
ipv4: guard IP_MINTTL with a static key

RFC 5082 IP_MINTTL option is rarely used on hosts.

Add a static key to remove from TCP fast path useless code,
and potential cache line miss to fetch inet_sk(sk)->min_ttl

Note that once ip4_min_ttl static key has been enabled,
it stays enabled until next boot.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoipv4: annotate data races arount inet->min_ttl
Eric Dumazet [Mon, 25 Oct 2021 16:48:23 +0000 (09:48 -0700)]
ipv4: annotate data races arount inet->min_ttl

No report yet from KCSAN, yet worth documenting the races.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoipv6: guard IPV6_MINHOPCOUNT with a static key
Eric Dumazet [Mon, 25 Oct 2021 16:48:22 +0000 (09:48 -0700)]
ipv6: guard IPV6_MINHOPCOUNT with a static key

RFC 5082 IPV6_MINHOPCOUNT is rarely used on hosts.

Add a static key to remove from TCP fast path useless code,
and potential cache line miss to fetch tcp_inet6_sk(sk)->min_hopcount

Note that once ip6_min_hopcount static key has been enabled,
it stays enabled until next boot.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoipv6: annotate data races around np->min_hopcount
Eric Dumazet [Mon, 25 Oct 2021 16:48:21 +0000 (09:48 -0700)]
ipv6: annotate data races around np->min_hopcount

No report yet from KCSAN, yet worth documenting the races.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: annotate accesses to sk->sk_rx_queue_mapping
Eric Dumazet [Mon, 25 Oct 2021 16:48:20 +0000 (09:48 -0700)]
net: annotate accesses to sk->sk_rx_queue_mapping

sk->sk_rx_queue_mapping can be modified locklessly,
add a couple of READ_ONCE()/WRITE_ONCE() to document this fact.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: avoid dirtying sk->sk_rx_queue_mapping
Eric Dumazet [Mon, 25 Oct 2021 16:48:19 +0000 (09:48 -0700)]
net: avoid dirtying sk->sk_rx_queue_mapping

sk_rx_queue_mapping is located in a cache line that should be kept read mostly.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: avoid dirtying sk->sk_napi_id
Eric Dumazet [Mon, 25 Oct 2021 16:48:18 +0000 (09:48 -0700)]
net: avoid dirtying sk->sk_napi_id

sk_napi_id is located in a cache line that can be kept read mostly.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoipv6: move inet6_sk(sk)->rx_dst_cookie to sk->sk_rx_dst_cookie
Eric Dumazet [Mon, 25 Oct 2021 16:48:17 +0000 (09:48 -0700)]
ipv6: move inet6_sk(sk)->rx_dst_cookie to sk->sk_rx_dst_cookie

Increase cache locality by moving rx_dst_coookie next to sk->sk_rx_dst

This removes one or two cache line misses in IPv6 early demux (TCP/UDP)

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agotcp: move inet->rx_dst_ifindex to sk->sk_rx_dst_ifindex
Eric Dumazet [Mon, 25 Oct 2021 16:48:16 +0000 (09:48 -0700)]
tcp: move inet->rx_dst_ifindex to sk->sk_rx_dst_ifindex

Increase cache locality by moving rx_dst_ifindex next to sk->sk_rx_dst

This is part of an effort to reduce cache line misses in TCP fast path.

This removes one cache line miss in early demux.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoax88796c: fix fetching error stats from percpu containers
Alexander Lobakin [Sat, 23 Oct 2021 12:19:16 +0000 (12:19 +0000)]
ax88796c: fix fetching error stats from percpu containers

rx_dropped, tx_dropped, rx_frame_errors and rx_crc_errors are being
wrongly fetched from the target container rather than source percpu
ones.
No idea if that goes from the vendor driver or was brainoed during
the refactoring, but fix it either way.

Fixes: a97c69ba4f30e ("net: ax88796c: ASIX AX88796C SPI Ethernet Adapter Driver")
Signed-off-by: Alexander Lobakin <alobakin@pm.me>
Acked-by: Łukasz Stelmach <l.stelmach@samsung.com>
Link: https://lore.kernel.org/r/20211023121148.113466-1-alobakin@pm.me
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet/mlx5: SF_DEV Add SF device trace points
Parav Pandit [Tue, 5 Oct 2021 08:26:05 +0000 (11:26 +0300)]
net/mlx5: SF_DEV Add SF device trace points

Add SF device add and delete specific trace points.

echo mlx5:mlx5_sf_dev_add >> /sys/kernel/debug/tracing/set_event
echo mlx5:mlx5_sf_dev_del >> /sys/kernel/debug/tracing/set_event
echo mlx5:mlx5_sf_vhca_event >> /sys/kernel/debug/tracing/set_event

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agonet/mlx5: SF, Add SF trace points
Parav Pandit [Tue, 21 Sep 2021 13:12:28 +0000 (16:12 +0300)]
net/mlx5: SF, Add SF trace points

Add support for trace events for SFs to improve debugging.
This covers
(a) port add and free trace points
(b) device level trace points
(c) SF hardware context add, free trace points.
(d) SF function activate/deacticate and state trace points

SF events examples:
echo mlx5:mlx5_sf_add >> /sys/kernel/debug/tracing/set_event
echo mlx5:mlx5_sf_free >> /sys/kernel/debug/tracing/set_event
echo mlx5:mlx5_sf_hwc_alloc >> /sys/kernel/debug/tracing/set_event
echo mlx5:mlx5_sf_hwc_free >> /sys/kernel/debug/tracing/set_event
echo mlx5:mlx5_sf_hwc_deferred_free >> /sys/kernel/debug/tracing/set_event
echo mlx5:mlx5_sf_update_state >> /sys/kernel/debug/tracing/set_event
echo mlx5:mlx5_sf_activate >> /sys/kernel/debug/tracing/set_event
echo mlx5:mlx5_sf_deactivate >> /sys/kernel/debug/tracing/set_event

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agonet/mlx5: Let user configure max_macs param
Shay Drory [Mon, 16 Aug 2021 05:41:08 +0000 (08:41 +0300)]
net/mlx5: Let user configure max_macs param

Currently, max_macs is taking 70Kbytes of memory per function. This
size is not needed in all use cases, and is critical with large scale.
Hence, allow user to configure the number of max_macs.

For example, to reduce the number of max_macs to 1, execute::
$ devlink dev param set pci/0000:00:0b.0 name max_macs value 1 \
              cmode driverinit
$ devlink dev reload pci/0000:00:0b.0

Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agonet/mlx5: Let user configure event_eq_size param
Shay Drory [Wed, 13 Oct 2021 06:57:54 +0000 (09:57 +0300)]
net/mlx5: Let user configure event_eq_size param

Event EQ is an EQ which received the notification of almost all the
events generated by the NIC.
Currently, each event EQ is taking 512KB of memory. This size is not
needed in most use cases, and is critical with large scale. Hence,
allow user to configure the size of the event EQ.

For example to reduce event EQ size to 64, execute::
$ devlink resource set pci/0000:00:0b.0 path /event_eq_size/ size 64
$ devlink dev reload pci/0000:00:0b.0

Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agonet/mlx5: Let user configure io_eq_size param
Shay Drory [Thu, 12 Aug 2021 08:53:34 +0000 (11:53 +0300)]
net/mlx5: Let user configure io_eq_size param

Currently, each I/O EQ is taking 128KB of memory. This size
is not needed in all use cases, and is critical with large scale.
Hence, allow user to configure the size of I/O EQs.

For example, to reduce I/O EQ size to 64, execute:
$ devlink resource set pci/0000:00:0b.0 path /io_eq_size/ size 64
$ devlink dev reload pci/0000:00:0b.0

Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agonet/mlx5: Bridge, support replacing existing FDB entry
Vlad Buslov [Tue, 19 Oct 2021 15:45:28 +0000 (18:45 +0300)]
net/mlx5: Bridge, support replacing existing FDB entry

The SWITCHDEV_FDB_ADD_TO_DEVICE is used for both adding new and replacing
existing entry. Implement support for replacing existing FDB entries in
mlx5 offload code.

Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agonet/mlx5: Bridge, extract code to lookup and del/notify entry
Vlad Buslov [Tue, 19 Oct 2021 15:17:19 +0000 (18:17 +0300)]
net/mlx5: Bridge, extract code to lookup and del/notify entry

Following two patterns in bridge code are used in multiple places where
similar code is duplicated:

- Lookup FDB entry from hashtable by address+vid pair.

- Notify software bridge and then delete existing FDB entry.

In order to improve code quality and prepare for following patch series
that also uses described patterns, extract the codes to dedicated helper
functions.

This commit doesn't change functionality.

Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agonet/mlx5: Add periodic update of host time to firmware
Aya Levin [Wed, 13 Oct 2021 06:45:22 +0000 (09:45 +0300)]
net/mlx5: Add periodic update of host time to firmware

Firmware logs its asserts also to non-volatile memory. In order to
reduce drift between the NIC and the host, the driver sets the host
epoch-time to the firmware every hour.

Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agonet/mlx5: Print health buffer by log level
Aya Levin [Mon, 11 Oct 2021 14:19:23 +0000 (17:19 +0300)]
net/mlx5: Print health buffer by log level

Add log macro which gets log level as a parameter. Use the severity
read from the health buffer and the new log macro to log the health buffer
with severity as log level.  Prior to this patch, health buffer was
printed in error log level regardless of its severity. Now the user may
filter dmesg (--level) or change kernel log level to focus on different
severity levels of firmware errors.

Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agonet/mlx5: Extend health buffer dump
Aya Levin [Mon, 11 Oct 2021 10:14:28 +0000 (13:14 +0300)]
net/mlx5: Extend health buffer dump

Enhance health buffer to include:
 - assert_var5: expose the 6'th assert variable.
 - time: error's time-stamp in seconds (epoch time).
 - rfr: Recovery Flow Requiered. When set, indicates that the error
        cannot be recovered without flow involving reset.
 - severity: error's severity value, ranging from emergency to debug.
Expose them in the health buffer dump (dmesg and devlink fw reporter).

Health buffer in dmesg:
mlx5_core 0000:08:00.0: print_health_info:425:(pid 912): Health issue observed, firmware internal error, severity(3) ERROR:
mlx5_core 0000:08:00.0: print_health_info:429:(pid 912): assert_var[0] 0x08040700
mlx5_core 0000:08:00.0: print_health_info:429:(pid 912): assert_var[1] 0x00000000
mlx5_core 0000:08:00.0: print_health_info:429:(pid 912): assert_var[2] 0x00000000
mlx5_core 0000:08:00.0: print_health_info:429:(pid 912): assert_var[3] 0x00000000
mlx5_core 0000:08:00.0: print_health_info:429:(pid 912): assert_var[4] 0x00000000
mlx5_core 0000:08:00.0: print_health_info:429:(pid 912): assert_var[5] 0x00000000
mlx5_core 0000:08:00.0: print_health_info:432:(pid 912): assert_exit_ptr 0x00aaf800
mlx5_core 0000:08:00.0: print_health_info:434:(pid 912): assert_callra 0x00aaf70c
mlx5_core 0000:08:00.0: print_health_info:436:(pid 912): fw_ver 16.32.492
mlx5_core 0000:08:00.0: print_health_info:437:(pid 912): time 1634819758
mlx5_core 0000:08:00.0: print_health_info:438:(pid 912): hw_id 0x0000020d
mlx5_core 0000:08:00.0: print_health_info:439:(pid 912): rfr 0
mlx5_core 0000:08:00.0: print_health_info:440:(pid 912): severity 3 (ERROR)
mlx5_core 0000:08:00.0: print_health_info:441:(pid 912): irisc_index 9
mlx5_core 0000:08:00.0: print_health_info:442:(pid 912): synd 0x1: firmware internal error
mlx5_core 0000:08:00.0: print_health_info:444:(pid 912): ext_synd 0x802b
mlx5_core 0000:08:00.0: print_health_info:445:(pid 912): raw fw_ver 0x102001ec

Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agonet/mlx5: Reduce flow counters bulk query buffer size for SFs
Avihai Horon [Wed, 6 Oct 2021 09:19:40 +0000 (12:19 +0300)]
net/mlx5: Reduce flow counters bulk query buffer size for SFs

Currently, the flow counters bulk query buffer takes a little more than
512KB of memory, which is aligned to the next power of 2, to 1MB.

The buffer size determines the maximum number of flow counters that can
be queried at a time. Thus, having a bigger buffer can improve
performance for users that need to query many flow counters.

SFs don't use many flow counters and don't need a big buffer. Since this
size is critical with large scale, reduce the size of the bulk query
buffer for SFs.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agonet/mlx5: Fix unused function warning of mlx5i_flow_type_mask
Shay Drory [Mon, 18 Oct 2021 06:18:39 +0000 (09:18 +0300)]
net/mlx5: Fix unused function warning of mlx5i_flow_type_mask

The cited commit is causing unused-function warning[1] when
CONFIG_MLX5_EN_RXNFC is not set.
Fix this by moving the function into the ifdef, where it's only used

[1]
warning: ‘mlx5i_flow_type_mask’ defined but not used [-Wunused-function]

Fixes: 9fbe1c25ecca ("net/mlx5i: Enable Rx steering for IPoIB via ethtool")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agonet/mlx5: Remove unnecessary checks for slow path flag
Paul Blakey [Sun, 24 Oct 2021 13:29:24 +0000 (16:29 +0300)]
net/mlx5: Remove unnecessary checks for slow path flag

After previous changes, caller (mlx5e_tc_offload_fdb_rules()) already
checks for the slow path flag, and if set won't call offload/unoffload
sample.

Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agonet/mlx5e: don't write directly to netdev->dev_addr
Jakub Kicinski [Wed, 13 Oct 2021 20:20:01 +0000 (13:20 -0700)]
net/mlx5e: don't write directly to netdev->dev_addr

Use a local buffer and eth_hw_addr_set()

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2 years agoMerge branch 'bluetooth-don-t-write-directly-to-netdev-dev_addr'
Jakub Kicinski [Mon, 25 Oct 2021 18:01:32 +0000 (11:01 -0700)]
Merge branch 'bluetooth-don-t-write-directly-to-netdev-dev_addr'

Jakub Kicinski says:

====================
bluetooth: don't write directly to netdev->dev_addr

The usual conversions.
====================

Link: https://lore.kernel.org/r/20211022231834.2710245-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agobluetooth: use dev_addr_set()
Jakub Kicinski [Fri, 22 Oct 2021 23:18:34 +0000 (16:18 -0700)]
bluetooth: use dev_addr_set()

Commit 406f42fa0d3c ("net-next: When a bond have a massive amount
of VLANs...") introduced a rbtree for faster Ethernet address look
up. To maintain netdev->dev_addr in this tree we need to make all
the writes to it go through appropriate helpers.

Reviewed-by: Marcel Holtmann <marcel@holtmann.org>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agobluetooth: use eth_hw_addr_set()
Jakub Kicinski [Fri, 22 Oct 2021 23:18:33 +0000 (16:18 -0700)]
bluetooth: use eth_hw_addr_set()

Commit 406f42fa0d3c ("net-next: When a bond have a massive amount
of VLANs...") introduced a rbtree for faster Ethernet address look
up. To maintain netdev->dev_addr in this tree we need to make all
the writes to it go through appropriate helpers.

Convert bluetooth from memcpy(... ETH_ADDR) to eth_hw_addr_set():

  @@
  expression dev, np;
  @@
  - memcpy(dev->dev_addr, np, ETH_ALEN)
  + eth_hw_addr_set(dev, np)

Reviewed-by: Marcel Holtmann <marcel@holtmann.org>
Acked-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agofddi: defza: add missing pointer type cast
Jakub Kicinski [Mon, 25 Oct 2021 16:00:00 +0000 (09:00 -0700)]
fddi: defza: add missing pointer type cast

hw_addr is a uint AKA unsigned int. dev_addr_set() takes
a u8 *.

  drivers/net/fddi/defza.c:1383:27: error: passing argument 2 of 'dev_addr_set' from incompatible pointer type [-Werror=incompatible-pointer-types]

Reported-by: kernel test robot <lkp@intel.com>
Fixes: 1e9258c389ee ("fddi: defxx,defza: use dev_addr_set()")
Acked-by: Maciej W. Rozycki <macro@orcam.me.uk>
Link: https://lore.kernel.org/r/20211025160000.2803818-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet/tls: getsockopt supports complete algorithm list
Tianjia Zhang [Mon, 25 Oct 2021 13:05:00 +0000 (21:05 +0800)]
net/tls: getsockopt supports complete algorithm list

AES_CCM_128 and CHACHA20_POLY1305 are already supported by tls,
similar to setsockopt, getsockopt also needs to support these
two algorithms.

Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet/tls: tls_crypto_context add supported algorithms context
Tianjia Zhang [Mon, 25 Oct 2021 13:04:39 +0000 (21:04 +0800)]
net/tls: tls_crypto_context add supported algorithms context

tls already supports the SM4 GCM/CCM algorithms. It is also necessary
to add support for these two algorithms in tls_crypto_context to avoid
potential issues caused by forced type conversion.

Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agomlxsw: spectrum: Use 'bitmap_zalloc()' when applicable
Christophe JAILLET [Sun, 24 Oct 2021 19:17:51 +0000 (21:17 +0200)]
mlxsw: spectrum: Use 'bitmap_zalloc()' when applicable

Use 'bitmap_zalloc()' to simplify code, improve the semantic and avoid
some open-coded arithmetic in allocator arguments.

Also change the corresponding 'kfree()' into 'bitmap_free()' to keep
consistency.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agousbb: catc: use correct API for MAC addresses
Oliver Neukum [Mon, 25 Oct 2021 14:11:21 +0000 (16:11 +0200)]
usbb: catc: use correct API for MAC addresses

Commit 406f42fa0d3c ("net-next: When a bond have a massive amount
of VLANs...") introduced a rbtree for faster Ethernet address look
up. To maintain netdev->dev_addr in this tree we need to make all
the writes to it got through appropriate helpers.

In the case of catc we need a new temporary buffer to conform
to the rules for DMA coherency. That in turn necessitates
a reworking of error handling in probe().

Signed-off-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>