platform/kernel/linux-starfive.git
13 months agowifi: rtw89: add tx_wake notify for 8851B
Chin-Yen Lee [Fri, 19 May 2023 03:14:59 +0000 (11:14 +0800)]
wifi: rtw89: add tx_wake notify for 8851B

8851B has the same issue: management frames get stuck when WiFi
chip enters low PS mode, so we also add notify wake function to
trigger WiFi chip wake before forwarding management frames.

Signed-off-by: Chin-Yen Lee <timlee@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230519031500.21087-7-pkshih@realtek.com
13 months agowifi: rtw89: enlarge supported length of read_reg debugfs entry
Ping-Ke Shih [Fri, 19 May 2023 03:14:58 +0000 (11:14 +0800)]
wifi: rtw89: enlarge supported length of read_reg debugfs entry

The register ranges of upcoming chips are different from current, and even
existing chips have different ranges, so support longer length to dump
registers. Then, user space can decide the ranges according to chip.

Since arbitrary length (e.g. 7) would be a little complicated, so simply
make length a multiple of 16. The output looks like

18620000h : 8580801f 82828282 82828282 080800fd

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230519031500.21087-6-pkshih@realtek.com
13 months agowifi: rtw89: 8851b: add RF configurations
Ping-Ke Shih [Fri, 19 May 2023 03:14:57 +0000 (11:14 +0800)]
wifi: rtw89: 8851b: add RF configurations

RF configurations include RF calibrations and getting thermal value.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230519031500.21087-5-pkshih@realtek.com
13 months agowifi: rtw89: 8851b: add MAC configurations to chip_info
Ping-Ke Shih [Fri, 19 May 2023 03:14:56 +0000 (11:14 +0800)]
wifi: rtw89: 8851b: add MAC configurations to chip_info

These configurations include path control, TX grant, TX scheduler,
register-based H2C/C2H and so on.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230519031500.21087-4-pkshih@realtek.com
13 months agowifi: rtw89: 8851b: fill BB related capabilities to chip_info
Ping-Ke Shih [Fri, 19 May 2023 03:14:55 +0000 (11:14 +0800)]
wifi: rtw89: 8851b: fill BB related capabilities to chip_info

These capabilities include helpers of BT coexistence, RX PPDU status
parser, DIG (dynamic initial gain) and CFO (center frequency offset)
settings.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230519031500.21087-3-pkshih@realtek.com
13 months agowifi: rtw89: 8851b: add TX power related functions
Ping-Ke Shih [Fri, 19 May 2023 03:14:54 +0000 (11:14 +0800)]
wifi: rtw89: 8851b: add TX power related functions

Get TX power value from tables according to selected country and channel,
and set proper power to registers.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230519031500.21087-2-pkshih@realtek.com
13 months agowifi: rtw89: refine packet offload handling under SER
Zong-Zhe Yang [Tue, 16 May 2023 08:24:41 +0000 (16:24 +0800)]
wifi: rtw89: refine packet offload handling under SER

H2C of packet offload needs to wait FW ACK by C2H. But, it's possible
that packet offload happens during SER (system error recovery), e.g.
SER L2 which restarts HW. More, packet offload flow isn't deferrable.
So, the H2C wait may get `ret == 1` (unreachable).

However, the logic FW deals with packet offload is simple enough, just
clone content. It means that as long as the H2C is issued successfully,
the thing will succeed sooner or later. Therefore, after we add a debug
log when receiving ACK to packet offload, it would be acceptable that
during SER, packet offload don't really wait for ACK. And, if debugging,
we can still check its debug logs. Besides, we can expect that if we see
SER before receiving ACK to packet offload, those debug logs of the ACK
have a time difference.

Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230516082441.11154-4-pkshih@realtek.com
13 months agowifi: rtw89: tweak H2C TX waiting function for SER
Zong-Zhe Yang [Tue, 16 May 2023 08:24:40 +0000 (16:24 +0800)]
wifi: rtw89: tweak H2C TX waiting function for SER

Some specific H2C (host to chip command) needs waiting until FW ACK by
C2H (chip to host event). However, during SER (system error recovery),
most interrupts are disabled, so we can't receive C2H immediately. It
causes this kind of H2C TX waits will always time out during SER.

To save time spent by SER, we don't do these redundant waits. And, to
make a difference from -ETIMEDOUT in other cases, we make the function
return 1 for SER case. When some H2C callers really catch `ret == 1` at
runtime, they can determine whether it's reasonable or not, and consider
how to resolve their flow if needed.

Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230516082441.11154-3-pkshih@realtek.com
13 months agowifi: rtw89: ser: reset total_sta_assoc and tdls_peer when L2
Zong-Zhe Yang [Tue, 16 May 2023 08:24:39 +0000 (16:24 +0800)]
wifi: rtw89: ser: reset total_sta_assoc and tdls_peer when L2

The total_sta_assoc and the tdls_peer are used for statistics accodring
to stations' information. L2 (Level 2) SER (system error recovery) will
call ieee80211_restart_hw() which re-invokes sta_state ops. And then,
the total_sta_assoc and tdls_peer will be re-increased. In case wrong
statistics results, we reset them in SER L2 handling.

Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230516082441.11154-2-pkshih@realtek.com
13 months agowifi: rtw88: Add support for the SDIO based RTL8723DS chipset
Martin Blumenstingl [Mon, 22 May 2023 20:24:25 +0000 (22:24 +0200)]
wifi: rtw88: Add support for the SDIO based RTL8723DS chipset

Wire up RTL8723DS chipset support using the rtw88 SDIO HCI code as well
as the existing RTL8723D chipset code.

Reviewed-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230522202425.1827005-5-martin.blumenstingl@googlemail.com
13 months agommc: sdio: Add/rename SDIO ID of the RTL8723DS SDIO wifi cards
Martin Blumenstingl [Mon, 22 May 2023 20:24:24 +0000 (22:24 +0200)]
mmc: sdio: Add/rename SDIO ID of the RTL8723DS SDIO wifi cards

RTL8723DS comes in two variant and each of them has their own SDIO ID:
- 0xd723 can connect two antennas. The WiFi part is still 1x1 so the
  second antenna can be dedicated to Bluetooth
- 0xd724 can only connect one antenna so it's shared between WiFi and
  Bluetooth

Add a new entry for the single antenna RTL8723DS (0xd724) which can be
found on the MangoPi MQ-Quad. Also rename the existing RTL8723DS entry
(0xd723) so it's name reflects that it's the variant with support for
two antennas.

Reviewed-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230522202425.1827005-4-martin.blumenstingl@googlemail.com
13 months agowifi: rtw88: rtw8723d: Implement RTL8723DS (SDIO) efuse parsing
Martin Blumenstingl [Mon, 22 May 2023 20:24:23 +0000 (22:24 +0200)]
wifi: rtw88: rtw8723d: Implement RTL8723DS (SDIO) efuse parsing

The efuse of the SDIO RTL8723DS chip has only one known member: the mac
address is at offset 0x11a. Add a struct rtw8723ds_efuse describing this
and use it for copying the mac address when the SDIO bus is used.

Reviewed-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230522202425.1827005-3-martin.blumenstingl@googlemail.com
13 months agowifi: rtw88: sdio: Check the HISR RX_REQUEST bit in rtw_sdio_rx_isr()
Martin Blumenstingl [Mon, 22 May 2023 20:24:22 +0000 (22:24 +0200)]
wifi: rtw88: sdio: Check the HISR RX_REQUEST bit in rtw_sdio_rx_isr()

rtw_sdio_rx_isr() is responsible for receiving data from the wifi chip
and is called from the SDIO interrupt handler when the interrupt status
register (HISR) has the RX_REQUEST bit set. After the first batch of
data has been processed by the driver the wifi chip may have more data
ready to be read, which is managed by a loop in rtw_sdio_rx_isr().

It turns out that there are cases where the RX buffer length (from the
REG_SDIO_RX0_REQ_LEN register) does not match the data we receive. The
following two cases were observed with a RTL8723DS card:
- RX length is smaller than the total packet length including overhead
  and actual data bytes (whose length is part of the buffer we read from
  the wifi chip and is stored in rtw_rx_pkt_stat.pkt_len). This can
  result in errors like:
    skbuff: skb_over_panic: text:ffff8000011924ac len:3341 put:3341
  (one case observed was: RX buffer length = 1536 bytes but
   rtw_rx_pkt_stat.pkt_len = 1546 bytes, this is not valid as it means
   we need to read beyond the end of the buffer)
- RX length looks valid but rtw_rx_pkt_stat.pkt_len is zero

Check if the RX_REQUEST is set in the HISR register for each iteration
inside rtw_sdio_rx_isr(). This mimics what the RTL8723DS vendor driver
does and makes the driver only read more data if the RX_REQUEST bit is
set (which seems to be a way for the card's hardware or firmware to
tell the host that data is ready to be processed).

For RTW_WCPU_11AC chips this check is not needed. The RTL8822BS vendor
driver for example states that this check is unnecessary (but still uses
it) and the RTL8822CS drops this check entirely.

Reviewed-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230522202425.1827005-2-martin.blumenstingl@googlemail.com
13 months agowifi: add HAS_IOPORT dependencies
Niklas Schnelle [Mon, 22 May 2023 10:50:48 +0000 (12:50 +0200)]
wifi: add HAS_IOPORT dependencies

In a future patch HAS_IOPORT=n will result in inb()/outb() and friends
not being declared. We thus need to add HAS_IOPORT as dependency for
those drivers using them.

Co-developed-by: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@kernel.org>
Acked-by: Kalle Valo <kvalo@kernel.org>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230522105049.1467313-44-schnelle@linux.ibm.com
14 months agoMerge ath-next from git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git
Kalle Valo [Wed, 17 May 2023 14:59:59 +0000 (17:59 +0300)]
Merge ath-next from git://git./linux/kernel/git/kvalo/ath.git

ath.git patches for v6.5. Major changes:

ath11k

* Multiple Basic Service Set Identifier (MBSSID) and Enhanced MBSSID
  Advertisement (EMA) support in AP mode

14 months agowifi: ath11k: Send HT fixed rate in WMI peer fixed param
Maharaja Kennadyrajan [Tue, 9 May 2023 17:07:24 +0000 (20:07 +0300)]
wifi: ath11k: Send HT fixed rate in WMI peer fixed param

Due to the firmware behavior with HT fixed rate setting,
HT fixed rate MCS with NSS > 1 are treated as NSS = 1
HT rates in the firmware and enables the HT fixed rate of
NSS = 1.

This leads to HT fixed rate is always configured for NSS = 1
even though the user sets NSS = 2 or > 1 HT fixed MCS in the
set bitrate command.

Currently HT fixed MCS is sent via WMI peer assoc command.
Fix this issue, by sending the HT fixed rate MCS in WMI peer
fixed param instead of sending in peer assoc command.

Tested-on: QCN9074 hw1.0 PCI WLAN.HK.2.7.0.1-01744-QCAHKSWPL_SILICONZ-1

Signed-off-by: Maharaja Kennadyrajan <quic_mkenna@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20230504092033.3542456-3-quic_mkenna@quicinc.com
14 months agowifi: ath11k: Relocate the func ath11k_mac_bitrate_mask_num_ht_rates() and change...
Maharaja Kennadyrajan [Tue, 9 May 2023 17:07:23 +0000 (20:07 +0300)]
wifi: ath11k: Relocate the func ath11k_mac_bitrate_mask_num_ht_rates() and change hweight16 to hweight8

Relocate the function ath11k_mac_bitrate_mask_num_ht_rates() definition
to call this function from other functions which helps to avoid the
compilation error (function not defined).

ht_mcs[] is 1 byte array and it is enough to use hweight8() instead
of hweight16(). Hence, fixed the same.

Tested on: Compile tested only.

Signed-off-by: Maharaja Kennadyrajan <quic_mkenna@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20230504092033.3542456-2-quic_mkenna@quicinc.com
14 months agowifi: ath12k: increase vdev setup timeout
Aishwarya R [Tue, 9 May 2023 17:07:23 +0000 (20:07 +0300)]
wifi: ath12k: increase vdev setup timeout

When vdev start/stop happens, response from firmware is received with delay
and hence there is a timeout before VDEV can be up/down.
Also, with maximum peers connected and when vdev stop occurs, firmware
will take time to clean up all the peers and vap queues.
In such cases as well, vdev start/stop response is sent by firmware with delay.

Increase the vdev setup timeout as recommended by firmware team.

Tested-on: QCN9274 hw2.0 PCI WLAN.WBE.1.0-02903-QCAHKSWPL_SILICONZ-1

Signed-off-by: Aishwarya R <quic_aisr@quicinc.com>
Signed-off-by: Ramya Gnanasekar <quic_rgnanase@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20230428091041.20033-1-quic_rgnanase@quicinc.com
14 months agowifi: rtw89: 8851b: rfk: add TSSI
Ping-Ke Shih [Sat, 13 May 2023 05:44:25 +0000 (13:44 +0800)]
wifi: rtw89: 8851b: rfk: add TSSI

TSSI is transmitter signal strength indication, which is a close-loop
hardware circuit to feedback actual transmitting power as a reference for
next transmission.

When we setup channel to connect an AP, it does full calibration. When
switching bands or channels, it needs to reset hardware status to prevent
use wrong feedback of previous transmission.

To do TX power compensation reflecting current temperature, it loads tables
of compensation values into registers according to channel and band group.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230513054425.9689-4-pkshih@realtek.com
14 months agowifi: rtw89: 8851b: rfk: add DPK
Ping-Ke Shih [Sat, 13 May 2023 05:44:24 +0000 (13:44 +0800)]
wifi: rtw89: 8851b: rfk: add DPK

DPK is short for digital pre-distortion calibration. It can adjusts digital
waveform according to PA linear characteristics dynamically to enhance
TX EVM.

Do this calibration when we are going to run on AP channel. To prevent
power offset out of boundary, it monitors thermal and set proper boundary
to register.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230513054425.9689-3-pkshih@realtek.com
14 months agowifi: rtw89: 8851b: rfk: add RX DCK
Ping-Ke Shih [Sat, 13 May 2023 05:44:23 +0000 (13:44 +0800)]
wifi: rtw89: 8851b: rfk: add RX DCK

RX DCK is receiver DC calibration. With this calibration, we have proper
DC offset to reflect correct received signal strength indicator. Do this
calibration when bringing up interface and going to run on AP channel.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230513054425.9689-2-pkshih@realtek.com
14 months agowifi: rtw89: 8851b: add to parse efuse content
Ping-Ke Shih [Fri, 12 May 2023 06:12:20 +0000 (14:12 +0800)]
wifi: rtw89: 8851b: add to parse efuse content

Parse efuse content to recognize MAC address, RFE type, XTAL offset and
so on. And, parse offset of PHY capability to retrieve TX power
calibration data.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230512061220.16544-7-pkshih@realtek.com
14 months agowifi: rtw89: 8851b: add set channel function
Ping-Ke Shih [Fri, 12 May 2023 06:12:19 +0000 (14:12 +0800)]
wifi: rtw89: 8851b: add set channel function

Set MAC/BB/RF registers according to channel we are going to set. In
additional, certain channels or bands need more deals, such as enable CCK
in 2 GHz band, spur elimination at certain frequencies.

The set channel helper is used to save/restore states before/after setting
channel, and does reset BB to prevent hardware getting stuck in abnormal
state during switching channel and receiving data.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230512061220.16544-6-pkshih@realtek.com
14 months agowifi: rtw89: 8851b: add basic power on function
Ping-Ke Shih [Fri, 12 May 2023 06:12:18 +0000 (14:12 +0800)]
wifi: rtw89: 8851b: add basic power on function

Add basic functions to power on chip and enable and access BB/RF, as
well as reset and hardware settings of BB.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230512061220.16544-5-pkshih@realtek.com
14 months agowifi: rtw89: 8851b: add BT coexistence support function
Ping-Ke Shih [Fri, 12 May 2023 06:12:17 +0000 (14:12 +0800)]
wifi: rtw89: 8851b: add BT coexistence support function

Add 8851B specific parameters of BT coexistence. Since 8851B has special
two antenna hardware module with antenna diversity, BT coexistence needs
to recognize this, so add some fields to store these information for
further use.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230512061220.16544-4-pkshih@realtek.com
14 months agowifi: rtw89: 8851b: configure GPIO according to RFE type
Ping-Ke Shih [Fri, 12 May 2023 06:12:16 +0000 (14:12 +0800)]
wifi: rtw89: 8851b: configure GPIO according to RFE type

Though 8851BE is a 1x1 chip, but it has two antenna hardware module that
needs additional configuration to help choose antenna we are going to use.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230512061220.16544-3-pkshih@realtek.com
14 months agowifi: rtw89: 8851b: add to read efuse version to recognize hardware version B
Ping-Ke Shih [Fri, 12 May 2023 06:12:15 +0000 (14:12 +0800)]
wifi: rtw89: 8851b: add to read efuse version to recognize hardware version B

8851B hardware version A and B use different firmware, but register version
code of these two are the same, so add this helper to read efuse version to
determine which version is installed.

Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230512061220.16544-2-pkshih@realtek.com
14 months agowifi: rtl8xxxu: Rename some registers
Bitterblue Smith [Sat, 13 May 2023 20:48:27 +0000 (23:48 +0300)]
wifi: rtl8xxxu: Rename some registers

Give proper names:

RF6052_REG_UNKNOWN_56 -> RF6052_REG_PAD_TXG
RF6052_REG_UNKNOWN_DF -> RF6052_REG_GAIN_CCA

And fix typos:

REG_OFDM0_AGCR_SSI_TABLE -> REG_OFDM0_AGC_RSSI_TABLE
REG_BB_ACCEESS_CTRL -> REG_BB_ACCESS_CTRL

Signed-off-by: Bitterblue Smith <rtl8821cerfe2@gmail.com>
Reviewed-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/40157253-76bd-8b23-06e0-3365139b5395@gmail.com
14 months agowifi: rtl8xxxu: Support new chip RTL8192FU
Bitterblue Smith [Sat, 13 May 2023 20:47:38 +0000 (23:47 +0300)]
wifi: rtl8xxxu: Support new chip RTL8192FU

This is a newer chip, similar to the RTL8710BU in that it uses the same
PHY status structs.

Features: 2.4 GHz, b/g/n mode, 2T2R, 300 Mbps.

It can allegedly have Bluetooth, but that's not implemented here.

This chip can have many RFE (RF front end) types, of which types 1
and 5 are the only ones tested. Many of the other types need different
initialisation tables. They can be added if someone wants them.

The vendor driver v5.8.6.2_35538.20191028_COEX20190910-0d02 from
https://github.com/BrightX/rtl8192fu was used as reference, with
additional device IDs taken from
https://github.com/kelebek333/rtl8192fu-dkms.

The vendor driver also claims to support devices with ID 0bda:a725,
but that is found in some bluetooth-only devices, so it's not supported
here.

Signed-off-by: Bitterblue Smith <rtl8821cerfe2@gmail.com>
Reviewed-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/7dcf9fb9-1c97-ac28-5286-2236e287a18c@gmail.com
14 months agowifi: brcmfmac: wcc: Add debug messages
Matthias Brugger [Tue, 9 May 2023 10:04:20 +0000 (12:04 +0200)]
wifi: brcmfmac: wcc: Add debug messages

The message is attach and detach function are merly for debugging,
change them from pr_err to pr_debug.

Signed-off-by: Matthias Brugger <mbrugger@suse.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230509100420.26094-1-matthias.bgg@kernel.org
14 months agonet: Remove low_thresh in ip defrag
Angus Chen [Fri, 12 May 2023 01:01:52 +0000 (09:01 +0800)]
net: Remove low_thresh in ip defrag

As low_thresh has no work in fragment reassembles,del it.
And Mark it deprecated in sysctl Document.

Signed-off-by: Angus Chen <angus.chen@jaguarmicro.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months agoMerge tag 'wireless-next-2023-05-12' of git://git.kernel.org/pub/scm/linux/kernel...
David S. Miller [Mon, 15 May 2023 07:37:17 +0000 (08:37 +0100)]
Merge tag 'wireless-next-2023-05-12' of git://git./linux/kernel/git/wireless/wireless-next

Kalle valo says:

====================
wireless-next patches for v6.5

The first pull request for v6.5 and only driver changes this time.
rtl8xxxu has been making lots of progress lately and now has AP mode
support.

Major changes:

rtl8xxxu

* AP mode support, initially only for rtl8188f

rtw89

* provide RSSI, EVN and SNR statistics via debugfs

* support U-NII-4 channels on 5 GHz band
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
14 months agosfc: fix use-after-free in efx_tc_flower_record_encap_match()
Edward Cree [Fri, 12 May 2023 15:35:58 +0000 (16:35 +0100)]
sfc: fix use-after-free in efx_tc_flower_record_encap_match()

When writing error messages to extack for pseudo collisions, we can't
 use encap->type as encap has already been freed.  Fortunately the
 same value is stored in local variable em_type, so use that instead.

Fixes: 3c9561c0a5b9 ("sfc: support TC decap rules matching on enc_ip_tos")
Reported-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months agonet: phylink: constify fwnode arguments
Russell King (Oracle) [Fri, 12 May 2023 16:58:37 +0000 (17:58 +0100)]
net: phylink: constify fwnode arguments

Both phylink_create() and phylink_fwnode_phy_connect() do not modify
the fwnode argument that they are passed, so lets constify these.

Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months agonet: fec: using the standard return codes when xdp xmit errors
Shenwei Wang [Fri, 12 May 2023 13:20:10 +0000 (08:20 -0500)]
net: fec: using the standard return codes when xdp xmit errors

This patch standardizes the inconsistent return values for unsuccessful
XDP transmits by using standardized error codes (-EBUSY or -ENOMEM).

Signed-off-by: Shenwei Wang <shenwei.wang@nxp.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months agonet: macb: Shorten max_tx_len to 4KiB - 56 on mpfs
Daire McNamara [Fri, 12 May 2023 12:20:32 +0000 (13:20 +0100)]
net: macb: Shorten max_tx_len to 4KiB - 56 on mpfs

On mpfs, with SRAM configured for 4 queues, setting max_tx_len
to GEM_TX_MAX_LEN=0x3f0 results multiple AMBA errors.
Setting max_tx_len to (4KiB - 56) removes those errors.

The details are described in erratum 1686 by Cadence

The max jumbo frame size is also reduced for mpfs to (4KiB - 56).

Signed-off-by: Daire McNamara <daire.mcnamara@microchip.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months agoping: Convert hlist_nulls to plain hlist.
Kuniyuki Iwashima [Wed, 10 May 2023 21:54:43 +0000 (14:54 -0700)]
ping: Convert hlist_nulls to plain hlist.

Since introduced in commit c319b4d76b9e ("net: ipv4: add IPPROTO_ICMP
socket kind"), ping socket does not use SLAB_TYPESAFE_BY_RCU nor check
nulls marker in loops.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months agoMerge branch 'skb_frag_fill_page_desc'
David S. Miller [Sat, 13 May 2023 18:47:56 +0000 (19:47 +0100)]
Merge branch 'skb_frag_fill_page_desc'

Yunsheng Lin says:

====================
net: introduce skb_frag_fill_page_desc()

Most users use __skb_frag_set_page()/skb_frag_off_set()/
skb_frag_size_set() to fill the page desc for a skb frag.
It does not make much sense to calling __skb_frag_set_page()
without calling skb_frag_off_set(), as the offset may depend
on whether the page is head page or tail page, so add
skb_frag_fill_page_desc() to fill the page desc for a skb
frag.

In the future, we can make sure the page in the frag is
head page of compound page or a base page, if not, we
may warn about that and convert the tail page to head
page and update the offset accordingly, if we see a warning
about that, we also fix the caller to fill the head page
in the frag. when the fixing is done, we may remove the
warning and converting.

In this way, we can remove the compound_head() or use
page_ref_*() like the below case:
https://elixir.bootlin.com/linux/latest/source/net/core/page_pool.c#L881
https://elixir.bootlin.com/linux/latest/source/include/linux/skbuff.h#L3383

It may also convert net stack to use the folio easier.

V1: repost with all the ack/review tags included.
RFC: remove a local variable as pointed out by Simon.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
14 months agonet: remove __skb_frag_set_page()
Yunsheng Lin [Thu, 11 May 2023 01:12:13 +0000 (09:12 +0800)]
net: remove __skb_frag_set_page()

The remaining users calling __skb_frag_set_page() with
page being NULL seems to be doing defensive programming,
as shinfo->nr_frags is already decremented, so remove
them.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months agonet: introduce and use skb_frag_fill_page_desc()
Yunsheng Lin [Thu, 11 May 2023 01:12:12 +0000 (09:12 +0800)]
net: introduce and use skb_frag_fill_page_desc()

Most users use __skb_frag_set_page()/skb_frag_off_set()/
skb_frag_size_set() to fill the page desc for a skb frag.

Introduce skb_frag_fill_page_desc() to do that.

net/bpf/test_run.c does not call skb_frag_off_set() to
set the offset, "copy_from_user(page_address(page), ...)"
and 'shinfo' being part of the 'data' kzalloced in
bpf_test_init() suggest that it is assuming offset to be
initialized as zero, so call skb_frag_fill_page_desc()
with offset being zero for this case.

Also, skb_frag_set_page() is not used anymore, so remove
it.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months agoselftests: net: vxlan: Add tests for vxlan nolocalbypass option.
Vladimir Nikishkin [Fri, 12 May 2023 03:40:34 +0000 (11:40 +0800)]
selftests: net: vxlan: Add tests for vxlan nolocalbypass option.

Add test to make sure that the localbypass option is on by default.

Add test to change vxlan localbypass to nolocalbypass and check
that packets are delivered to userspace.

Signed-off-by: Vladimir Nikishkin <vladimir@nikishkin.pw>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months agonet: vxlan: Add nolocalbypass option to vxlan.
Vladimir Nikishkin [Fri, 12 May 2023 03:40:33 +0000 (11:40 +0800)]
net: vxlan: Add nolocalbypass option to vxlan.

If a packet needs to be encapsulated towards a local destination IP, the
packet will undergo a "local bypass" and be injected into the Rx path as
if it was received by the target VXLAN device without undergoing
encapsulation. If such a device does not exist, the packet will be
dropped.

There are scenarios where we do not want to perform such a bypass, but
instead want the packet to be encapsulated and locally received by a
user space program for post-processing.

To that end, add a new VXLAN device attribute that controls whether a
"local bypass" is performed or not. Default to performing a bypass to
maintain existing behavior.

Signed-off-by: Vladimir Nikishkin <vladimir@nikishkin.pw>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months agoMerge branch 'broadcom-phy-wol'
David S. Miller [Sat, 13 May 2023 15:56:29 +0000 (16:56 +0100)]
Merge branch 'broadcom-phy-wol'

Florian Fainelli says:

====================
Support for Wake-on-LAN for Broadcom PHYs

This patch series adds support for Wake-on-LAN to the Broadcom PHY
driver. Specifically the BCM54210E/B50212E are capable of supporting
Wake-on-LAN using an external pin typically wired up to a system's GPIO.

These PHY operate a programmable Ethernet MAC destination address
comparator which will fire up an interrupt whenever a match is received.
Because of that, it was necessary to introduce patch #1 which allows the
PHY driver's ->suspend() routine to be called unconditionally. This is
necessary in our case because we need a hook point into the device
suspend/resume flow to enable the wake-up interrupt as late as possible.

Patch #2 adds support for the Broadcom PHY library and driver for
Wake-on-LAN proper with the WAKE_UCAST, WAKE_MCAST, WAKE_BCAST,
WAKE_MAGIC and WAKE_MAGICSECURE. Note that WAKE_FILTER is supportable,
however this will require further discussions and be submitted as a RFC
series later on.

Patch #3 updates the GENET driver to defer to the PHY for Wake-on-LAN if
the PHY supports it, thus allowing the MAC to be powered down to
conserve power.

Changes in v3:

- collected Reviewed-by tags
- explicitly use return 0 in bcm54xx_phy_probe() (Paolo)

Changes in v2:

- introduce PHY_ALWAYS_CALL_SUSPEND and only have the Broadcom PHY
  driver set this flag to minimize changes to the suspend flow to only
  drivers that need it

- corrected possibly uninitialized variable in bcm54xx_set_wakeup_irq
  (Simon)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
14 months agonet: bcmgenet: Add support for PHY-based Wake-on-LAN
Florian Fainelli [Thu, 11 May 2023 17:21:10 +0000 (10:21 -0700)]
net: bcmgenet: Add support for PHY-based Wake-on-LAN

If available, interrogate the PHY to find out whether we can use it for
Wake-on-LAN. This can be a more power efficient way of implementing
that feature, especially when the MAC is powered off in low power
states.

Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months agonet: phy: broadcom: Add support for Wake-on-LAN
Florian Fainelli [Thu, 11 May 2023 17:21:09 +0000 (10:21 -0700)]
net: phy: broadcom: Add support for Wake-on-LAN

Add support for WAKE_UCAST, WAKE_MCAST, WAKE_BCAST, WAKE_MAGIC and
WAKE_MAGICSECURE. This is only supported with the BCM54210E and
compatible Ethernet PHYs. Using the in-band interrupt or an out of band
GPIO interrupts are supported.

Broadcom PHYs will generate a Wake-on-LAN level low interrupt on LED4 as
soon as one of the supported patterns is being matched. That includes
generating such an interrupt even if the PHY is operated during normal
modes. If WAKE_UCAST is selected, this could lead to the LED4 interrupt
firing up for every packet being received which is absolutely
undesirable from a performance point of view.

Because the Wake-on-LAN configuration can be set long before the system
is actually put to sleep, we cannot have an interrupt service routine to
clear on read the interrupt status register and ensure that new packet
matches will be detected.

It is desirable to enable the Wake-on-LAN interrupt as late as possible
during the system suspend process such that we limit the number of
interrupts to be handled by the system, but also conversely feed into
the Linux's system suspend way of dealing with interrupts in and around
the points of no return.

Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months agonet: phy: Allow drivers to always call into ->suspend()
Florian Fainelli [Thu, 11 May 2023 17:21:08 +0000 (10:21 -0700)]
net: phy: Allow drivers to always call into ->suspend()

A few PHY drivers are currently attempting to not suspend the PHY when
Wake-on-LAN is enabled, however that code is not currently executing at
all due to an early check in phy_suspend().

This prevents PHY drivers from making an appropriate decisions and put
the hardware into a low power state if desired.

In order to allow the PHY drivers to opt into getting their ->suspend
routine to be called, add a PHY_ALWAYS_CALL_SUSPEND bit which can be
set. A boolean that tracks whether the PHY or the attached MAC has
Wake-on-LAN enabled is also provided for convenience.

If phydev::wol_enabled then the PHY shall not prevent its own
Wake-on-LAN detection logic from working and shall not prevent the
Ethernet MAC from receiving packets for matching.

Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months agoMerge branch 'sfc-decap'
David S. Miller [Fri, 12 May 2023 09:37:02 +0000 (10:37 +0100)]
Merge branch 'sfc-decap'

Edward Cree says:

====================
sfc: more flexible encap matches on TC decap rules

This series extends the TC offload support on EF100 to support optionally
 matching on the IP ToS and UDP source port of the outer header in rules
 performing tunnel decapsulation.  Both of these fields allow masked
 matches if the underlying hardware supports it (current EF100 hardware
 supports masking on ToS, but only exact-match on source port).
Given that the source port is typically populated from a hash of inner
 header entropy, it's not clear whether filtering on it is useful, but
 since we can support it we may as well expose the capability.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
14 months agosfc: support TC decap rules matching on enc_src_port
Edward Cree [Thu, 11 May 2023 19:47:31 +0000 (20:47 +0100)]
sfc: support TC decap rules matching on enc_src_port

Allow efx_tc_encap_match entries to include a udp_sport and a
 udp_sport_mask.  As with enc_ip_tos, use pseudos to enforce that all
 encap matches within a given <src_ip,dst_ip,udp_dport> tuple have
 the same udp_sport_mask.
Note that since we use a single layer of pseudos for both fields, two
 matches that differ in (say) udp_sport value aren't permitted to have
 different ip_tos_mask, even though this would technically be safe.
Current userland TC does not support setting enc_src_port; this patch
 was tested with an iproute2 patched to support it.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months agosfc: support TC decap rules matching on enc_ip_tos
Edward Cree [Thu, 11 May 2023 19:47:30 +0000 (20:47 +0100)]
sfc: support TC decap rules matching on enc_ip_tos

Allow efx_tc_encap_match entries to include an ip_tos and ip_tos_mask.
To avoid partially-overlapping Outer Rules (which can lead to undefined
 behaviour in the hardware), store extra "pseudo" entries in our
 encap_match hashtable, which are used to enforce that all Outer Rule
 entries within a given <src_ip,dst_ip,udp_dport> tuple (or IPv6
 equivalent) have the same ip_tos_mask.
The "direct" encap_match entry takes a reference on the "pseudo",
 allowing it to be destroyed when all "direct" entries using it are
 removed.
efx_tc_em_pseudo_type is an enum rather than just a bool because in
 future an additional pseudo-type will be added to support Conntrack
 offload.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months agosfc: populate enc_ip_tos matches in MAE outer rules
Edward Cree [Thu, 11 May 2023 19:47:29 +0000 (20:47 +0100)]
sfc: populate enc_ip_tos matches in MAE outer rules

Currently tc.c will block them before they get here, but following
 patch will change that.
Use the extack message from efx_mae_check_encap_match_caps() instead
 of writing a new one, since there's now more being fed in than just
 an IP version.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months agosfc: release encap match in efx_tc_flow_free()
Edward Cree [Thu, 11 May 2023 19:47:28 +0000 (20:47 +0100)]
sfc: release encap match in efx_tc_flow_free()

When force-freeing leftover entries from our match_action_ht, call
 efx_tc_delete_rule(), which releases all the rule's resources, rather
 than open-coding it.  The open-coded version was missing a call to
 release the rule's encap match (if any).
It probably doesn't matter as everything's being torn down anyway, but
 it's cleaner this way and prevents further error messages potentially
 being logged by efx_tc_encap_match_free() later on.
Move efx_tc_flow_free() further down the file to avoid introducing a
 forward declaration of efx_tc_delete_rule().

Fixes: 17654d84b47c ("sfc: add offloading of 'foreign' TC (decap) rules")
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months agonet: liquidio: lio_main: Remove unnecessary (void*) conversions
wuych [Fri, 12 May 2023 02:44:03 +0000 (10:44 +0800)]
net: liquidio: lio_main: Remove unnecessary (void*) conversions

Pointer variables of void * type do not require type cast.

Signed-off-by: wuych <yunchuan@nfschina.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months agosctp: add bpf_bypass_getsockopt proto callback
Alexander Mikhalitsyn [Thu, 11 May 2023 13:25:06 +0000 (15:25 +0200)]
sctp: add bpf_bypass_getsockopt proto callback

Implement ->bpf_bypass_getsockopt proto callback and filter out
SCTP_SOCKOPT_PEELOFF, SCTP_SOCKOPT_PEELOFF_FLAGS and SCTP_SOCKOPT_CONNECTX3
socket options from running eBPF hook on them.

SCTP_SOCKOPT_PEELOFF and SCTP_SOCKOPT_PEELOFF_FLAGS options do fd_install(),
and if BPF_CGROUP_RUN_PROG_GETSOCKOPT hook returns an error after success of
the original handler sctp_getsockopt(...), userspace will receive an error
from getsockopt syscall and will be not aware that fd was successfully
installed into a fdtable.

As pointed by Marcelo Ricardo Leitner it seems reasonable to skip
bpf getsockopt hook for SCTP_SOCKOPT_CONNECTX3 sockopt too.
Because internaly, it triggers connect() and if error is masked
then userspace will be confused.

This patch was born as a result of discussion around a new SCM_PIDFD interface:
https://lore.kernel.org/all/20230413133355.350571-3-aleksandr.mikhalitsyn@canonical.com/

Fixes: 0d01da6afc54 ("bpf: implement getsockopt and setsockopt hooks")
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Cc: Xin Long <lucien.xin@gmail.com>
Cc: linux-sctp@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: netdev@vger.kernel.org
Suggested-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
Acked-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months agoMerge branch 'selftests-fcnal'
David S. Miller [Fri, 12 May 2023 08:43:56 +0000 (09:43 +0100)]
Merge branch 'selftests-fcnal'

Guillaume Nault says:

====================
selftests: fcnal: Test SO_DONTROUTE socket option.

The objective is to cover kernel paths that use the RTO_ONLINK flag
in .flowi4_tos. This way we'll be able to safely remove this flag in
the future by properly setting .flowi4_scope instead. With these
selftests in place, we can make sure this won't introduce regressions.

For more context, the final objective is to convert .flowi4_tos to
dscp_t, to ensure that ECN bits don't influence route and fib-rule
lookups (see commit a410a0cf9885 ("ipv6: Define dscp_t and stop taking
ECN bits into account in fib6-rules")).

These selftests only cover IPv4, as SO_DONTROUTE has no effect on IPv6
sockets.

v2:
  - Use two different nettest options for setting SO_DONTROUTE either
    on the server or on the client socket.

  - Use the above feature to run a single 'nettest -B' instance per
    test (instead of having two nettest processes for server and
    client).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
14 months agoselftests: fcnal: Test SO_DONTROUTE on raw and ping sockets.
Guillaume Nault [Thu, 11 May 2023 14:39:46 +0000 (16:39 +0200)]
selftests: fcnal: Test SO_DONTROUTE on raw and ping sockets.

Use ping -r to test the kernel behaviour with raw and ping sockets
having the SO_DONTROUTE option.

Since ipv4_ping_novrf() is called with different values of
net.ipv4.ping_group_range, then it tests both raw and ping sockets
(ping uses ping sockets if its user ID belongs to ping_group_range
and raw sockets otherwise).

With both socket types, sending packets to a neighbour (on link) host,
should work. When the host is behind a router, sending should fail.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months agoselftests: fcnal: Test SO_DONTROUTE on UDP sockets.
Guillaume Nault [Thu, 11 May 2023 14:39:39 +0000 (16:39 +0200)]
selftests: fcnal: Test SO_DONTROUTE on UDP sockets.

Use nettest --client-dontroute to test the kernel behaviour with UDP
sockets having the SO_DONTROUTE option. Sending packets to a neighbour
(on link) host, should work. When the host is behind a router, sending
should fail.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months agoselftests: fcnal: Test SO_DONTROUTE on TCP sockets.
Guillaume Nault [Thu, 11 May 2023 14:39:32 +0000 (16:39 +0200)]
selftests: fcnal: Test SO_DONTROUTE on TCP sockets.

Use nettest --{client,server}-dontroute to test the kernel behaviour
with TCP sockets having the SO_DONTROUTE option. Sending packets to a
neighbour (on link) host, should work. When the host is behind a
router, sending should fail.

Client and server sockets are tested independently, so that we can
cover different TCP kernel paths.

SO_DONTROUTE also affects the syncookies path. So ipv4_tcp_dontroute()
is made to work with or without syncookies, to cover both paths.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months agoselftests: Add SO_DONTROUTE option to nettest.
Guillaume Nault [Thu, 11 May 2023 14:39:25 +0000 (16:39 +0200)]
selftests: Add SO_DONTROUTE option to nettest.

Add --client-dontroute and --server-dontroute options to nettest. They
allow to set the SO_DONTROUTE option to the client and server sockets
respectively. This will be used by the following patches to test
the SO_DONTROUTE kernel behaviour with TCP and UDP.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months agobonding: Always assign be16 value to vlan_proto
Simon Horman [Thu, 11 May 2023 15:07:12 +0000 (17:07 +0200)]
bonding: Always assign be16 value to vlan_proto

The type of the vlan_proto field is __be16.
And most users of the field use it as such.

In the case of setting or testing the field for the special VLAN_N_VID
value, host byte order is used. Which seems incorrect.

It also seems somewhat odd to store a VLAN ID value in a field that is
otherwise used to store Ether types.

Address this issue by defining BOND_VLAN_PROTO_NONE, a big endian value.
0xffff was chosen somewhat arbitrarily. What is important is that it
doesn't overlap with any valid VLAN Ether types.

I don't believe the problems described above are a bug because
VLAN_N_VID in both little-endian and big-endian byte order does not
conflict with any supported VLAN Ether types in big-endian byte order.

Reported by sparse as:

 .../bond_main.c:2857:26: warning: restricted __be16 degrades to integer
 .../bond_main.c:2863:20: warning: restricted __be16 degrades to integer
 .../bond_main.c:2939:40: warning: incorrect type in assignment (different base types)
 .../bond_main.c:2939:40:    expected restricted __be16 [usertype] vlan_proto
 .../bond_main.c:2939:40:    got int

No functional changes intended.
Compile tested only.

Signed-off-by: Simon Horman <horms@kernel.org>
Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months agoMerge branch 'net-handshake-fixes'
David S. Miller [Fri, 12 May 2023 08:24:08 +0000 (09:24 +0100)]
Merge branch 'net-handshake-fixes'

Chuck Lever says:

====================
Bug fixes for net/handshake

Please consider these for merge via net-next.

Paolo observed that there is a possible leak of sock->file. I
haven't looked into that yet, but it seems to be separate from
the fixes in this series, so no need to hold these up.

Changes since v2:
- Address Paolo comment regarding handshake_dup()

Changes since v1:
- Rework "Fix handshake_dup() ref counting"
- Unpin sock->file when a handshake is cancelled
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
14 months agonet/handshake: Enable the SNI extension to work properly
Chuck Lever [Thu, 11 May 2023 15:49:50 +0000 (11:49 -0400)]
net/handshake: Enable the SNI extension to work properly

Enable the upper layer protocol to specify the SNI peername. This
avoids the need for tlshd to use a DNS lookup, which can return a
hostname that doesn't match the incoming certificate's SubjectName.

Fixes: 2fd5532044a8 ("net/handshake: Add a kernel API for requesting a TLSv1.3 handshake")
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months agonet/handshake: Unpin sock->file if a handshake is cancelled
Chuck Lever [Thu, 11 May 2023 15:49:17 +0000 (11:49 -0400)]
net/handshake: Unpin sock->file if a handshake is cancelled

If user space never calls DONE, sock->file's reference count remains
elevated. Enable sock->file to be freed eventually in this case.

Reported-by: Jakub Kacinski <kuba@kernel.org>
Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months agonet/handshake: handshake_genl_notify() shouldn't ignore @flags
Chuck Lever [Thu, 11 May 2023 15:48:45 +0000 (11:48 -0400)]
net/handshake: handshake_genl_notify() shouldn't ignore @flags

Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests")
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months agonet/handshake: Fix uninitialized local variable
Chuck Lever [Thu, 11 May 2023 15:48:13 +0000 (11:48 -0400)]
net/handshake: Fix uninitialized local variable

trace_handshake_cmd_done_err() simply records the pointer in @req,
so initializing it to NULL is sufficient and safe.

Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests")
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months agonet/handshake: Fix handshake_dup() ref counting
Chuck Lever [Thu, 11 May 2023 15:47:40 +0000 (11:47 -0400)]
net/handshake: Fix handshake_dup() ref counting

If get_unused_fd_flags() fails, we ended up calling fput(sock->file)
twice.

Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months agonet/handshake: Remove unneeded check from handshake_dup()
Chuck Lever [Thu, 11 May 2023 15:47:09 +0000 (11:47 -0400)]
net/handshake: Remove unneeded check from handshake_dup()

handshake_req_submit() now verifies that the socket has a file.

Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests")
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months agoipvlan: Remove NULL check before dev_{put, hold}
Yang Li [Thu, 11 May 2023 07:21:19 +0000 (15:21 +0800)]
ipvlan: Remove NULL check before dev_{put, hold}

The call netdev_{put, hold} of dev_{put, hold} will check NULL,
so there is no need to check before using dev_{put, hold},
remove it to silence the warning:

./drivers/net/ipvlan/ipvlan_core.c:559:3-11: WARNING: NULL check before dev_{put, hold} functions is not needed.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=4930
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months agoocteontx2-pf: mcs: Offload extended packet number(XPN) feature
Subbaraya Sundeep [Thu, 11 May 2023 06:17:12 +0000 (11:47 +0530)]
octeontx2-pf: mcs: Offload extended packet number(XPN) feature

The macsec hardware block supports XPN cipher suites also.
Hence added changes to offload XPN feature. Changes include
configuring SecY policy to XPN cipher suite, Salt and SSCI values.
64 bit packet number is passed instead of 32 bit packet number.

Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months agonet: samsung: sxgbe: Make sxgbe_drv_remove() return void
Uwe Kleine-König [Wed, 10 May 2023 20:02:47 +0000 (22:02 +0200)]
net: samsung: sxgbe: Make sxgbe_drv_remove() return void

sxgbe_drv_remove() returned zero unconditionally, so it can be converted
to return void without losing anything. The upside is that it becomes
more obvious in its callers that there is no error to handle.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 months agonet: enc28j60: Use threaded interrupt instead of workqueue
Philipp Rosenberger [Tue, 9 May 2023 04:28:56 +0000 (06:28 +0200)]
net: enc28j60: Use threaded interrupt instead of workqueue

The Microchip ENC28J60 SPI Ethernet driver schedules a work item from
the interrupt handler because accesses to the SPI bus may sleep.

On PREEMPT_RT (which forces interrupt handling into threads) this
old-fashioned approach unnecessarily increases latency because an
interrupt results in first waking the interrupt thread, then scheduling
the work item.  So, a double indirection to handle an interrupt.

Avoid by converting the driver to modern threaded interrupt handling.

Signed-off-by: Philipp Rosenberger <p.rosenberger@kunbus.com>
Signed-off-by: Zhi Han <hanzhi09@gmail.com>
[lukas: rewrite commit message, linewrap request_threaded_irq() call]
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Piotr Raczynski <piotr.raczynski@intel.com>
Link: https://lore.kernel.org/r/342380d989ce26bc49f0e5d45fbb0416a5f7809f.1683606193.git.lukas@wunner.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Jakub Kicinski [Thu, 11 May 2023 16:06:26 +0000 (09:06 -0700)]
Merge git://git./linux/kernel/git/netdev/net

Cross-merge networking fixes. No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agoMerge tag 'net-6.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Thu, 11 May 2023 13:42:47 +0000 (08:42 -0500)]
Merge tag 'net-6.4-rc2' of git://git./linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from netfilter.

  Current release - regressions:

   - mtk_eth_soc: fix NULL pointer dereference

  Previous releases - regressions:

   - core:
      - skb_partial_csum_set() fix against transport header magic value
      - fix load-tearing on sk->sk_stamp in sock_recv_cmsgs().
      - annotate sk->sk_err write from do_recvmmsg()
      - add vlan_get_protocol_and_depth() helper

   - netlink: annotate accesses to nlk->cb_running

   - netfilter: always release netdev hooks from notifier

  Previous releases - always broken:

   - core: deal with most data-races in sk_wait_event()

   - netfilter: fix possible bug_on with enable_hooks=1

   - eth: bonding: fix send_peer_notif overflow

   - eth: xpcs: fix incorrect number of interfaces

   - eth: ipvlan: fix out-of-bounds caused by unclear skb->cb

   - eth: stmmac: Initialize MAC_ONEUS_TIC_COUNTER register"

* tag 'net-6.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (31 commits)
  af_unix: Fix data races around sk->sk_shutdown.
  af_unix: Fix a data race of sk->sk_receive_queue->qlen.
  net: datagram: fix data-races in datagram_poll()
  net: mscc: ocelot: fix stat counter register values
  ipvlan:Fix out-of-bounds caused by unclear skb->cb
  docs: networking: fix x25-iface.rst heading & index order
  gve: Remove the code of clearing PBA bit
  tcp: add annotations around sk->sk_shutdown accesses
  net: add vlan_get_protocol_and_depth() helper
  net: pcs: xpcs: fix incorrect number of interfaces
  net: deal with most data-races in sk_wait_event()
  net: annotate sk->sk_err write from do_recvmmsg()
  netlink: annotate accesses to nlk->cb_running
  kselftest: bonding: add num_grat_arp test
  selftests: forwarding: lib: add netns support for tc rule handle stats get
  Documentation: bonding: fix the doc of peer_notif_delay
  bonding: fix send_peer_notif overflow
  net: ethernet: mtk_eth_soc: fix NULL pointer dereference
  selftests: nft_flowtable.sh: check ingress/egress chain too
  selftests: nft_flowtable.sh: monitor result file sizes
  ...

14 months agoMerge tag 'media/v6.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab...
Linus Torvalds [Thu, 11 May 2023 13:35:52 +0000 (08:35 -0500)]
Merge tag 'media/v6.4-2' of git://git./linux/kernel/git/mchehab/linux-media

Pull media fixes from Mauro Carvalho Chehab:

 - fix some unused-variable warning in mtk-mdp3

 - ignore unused suspend operations in nxp

 - some driver fixes in rcar-vin

* tag 'media/v6.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  media: platform: mtk-mdp3: work around unused-variable warning
  media: nxp: ignore unused suspend operations
  media: rcar-vin: Select correct interrupt mode for V4L2_FIELD_ALTERNATE
  media: rcar-vin: Fix NV12 size alignment
  media: rcar-vin: Gen3 can not scale NV12

14 months agowifi: rtw89: suppress the log for specific SER called CMDPSR_FRZTO
Chin-Yen Lee [Mon, 8 May 2023 08:43:35 +0000 (16:43 +0800)]
wifi: rtw89: suppress the log for specific SER called CMDPSR_FRZTO

For 8852CE, there is abnormal state called CMDPSR_FRZTO,
which occasionally happens in some platforms, and could be
found by firmware and fixed in current SER flow, so we add
suppress function to avoid verbose message for this resolved case.

Signed-off-by: Chin-Yen Lee <timlee@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230508084335.42953-4-pkshih@realtek.com
14 months agowifi: rtw89: ser: L1 add pre-M0 and post-M0 states
Zong-Zhe Yang [Mon, 8 May 2023 08:43:34 +0000 (16:43 +0800)]
wifi: rtw89: ser: L1 add pre-M0 and post-M0 states

Newer FW re-design SER (syetem error recovery) L1 (level 1) flow.
New L1 flow will expect two extra states before original L1 flow.

* Before:
fw --- M1 --> driver
fw <-- M2 --- driver
fw --- M3 --> driver
fw <-- M4 --- driver
fw --- M5 --> driver

* After:
fw --- pre-M0  --> driver
fw <-- post-M0 --- driver
fw --- M1 --> driver
fw <-- M2 --- driver
fw --- M3 --> driver
fw <-- M4 --- driver
fw --- M5 --> driver

Then before M1, FW gets one more interval to deal with things that FW
should have handled well. To consider backward/forward compatibility,
FW and driver won't change flow from M1 to M5. (only except that halt
trigger control will change a little bit.) So, there will be two differnt
starting points of SER L1.
* old FW: SER L1 starts from M1
* new FW: SER L1 starts from pre-M0
Then, driver adds the new SER L1 entry and also keep the original one
instead of changing it.

Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230508084335.42953-3-pkshih@realtek.com
14 months agowifi: rtw89: pci: fix interrupt enable mask for HALT C2H of RTL8851B
Zong-Zhe Yang [Mon, 8 May 2023 08:43:33 +0000 (16:43 +0800)]
wifi: rtw89: pci: fix interrupt enable mask for HALT C2H of RTL8851B

RTL8851B keeps almost the same interrupt flow as RTL8852A and RTL8852B.
But, it uses a different bitmask for interrupt indicator of FW HALT C2H.
So, we make a chip judgement in pci when configuring interrupt mask.

Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230508084335.42953-2-pkshih@realtek.com
14 months agowifi: rtw89: support U-NII-4 channels on 5GHz band
Zong-Zhe Yang [Mon, 8 May 2023 08:12:11 +0000 (16:12 +0800)]
wifi: rtw89: support U-NII-4 channels on 5GHz band

U-NII-4 band, i.e 5.9GHz channels, can be supported by chip 8852C, 8852B
and 8851B. But, it is not supported by chip 8852A. Flag support_unii4 is
added in chip info and defined by chip accordingly to indicate that.
We reference this flag of runtime chip to decide whether to register
5.9GHz channels.

After that, we consider if U-NII-4 band is allowed by our regulatory
rule of U-NII-4. If chip::support_unii4 but not regd::allow_unii4,
we stll do not register 5.9GHz channels.

Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230508081211.38760-4-pkshih@realtek.com
14 months agowifi: rtw89: regd: judge UNII-4 according to BIOS and chip
Zong-Zhe Yang [Mon, 8 May 2023 08:12:10 +0000 (16:12 +0800)]
wifi: rtw89: regd: judge UNII-4 according to BIOS and chip

For realtek regulatory, there are following two kinds of configurations
to determine whether to allow UNII-4 band, i.e. 5.9GHz channels.

1. default setting according to whether chip support it or not
2. evaluate realtek ACPI DSM with RTW89_ACPI_DSM_FUNC_59G_EN (func. 6)

If (1) is false, we won't try (2) and just disallow UNII-4. Otherwise,
if (2) is not supported or returns a non-specific value, we follow the
default setting either. Besides, this commit aims to add decision logic
in rtw89 regulatory. Actually, driver doesn't register UNII-4 yet. That
will be handled by another commit.

Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230508081211.38760-3-pkshih@realtek.com
14 months agowifi: rtw89: introduce realtek ACPI DSM method
Zong-Zhe Yang [Mon, 8 May 2023 08:12:09 +0000 (16:12 +0800)]
wifi: rtw89: introduce realtek ACPI DSM method

Introduce realtek ACPI DSM method to get required BIOS
configurations. It will be used in the following commits.

And, enum rtw89_acpi_dsm_func is added for listing the functions
which are currently recognized.

Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230508081211.38760-2-pkshih@realtek.com
14 months agowifi: mwifiex: Fix the size of a memory allocation in mwifiex_ret_802_11_scan()
Christophe JAILLET [Sat, 6 May 2023 13:53:15 +0000 (15:53 +0200)]
wifi: mwifiex: Fix the size of a memory allocation in mwifiex_ret_802_11_scan()

The type of "mwifiex_adapter->nd_info" is "struct cfg80211_wowlan_nd_info",
not "struct cfg80211_wowlan_nd_match".

Use struct_size() to ease the computation of the needed size.

The current code over-allocates some memory, so is safe.
But it wastes 32 bytes.

Fixes: 7d7f07d8c5d3 ("mwifiex: add wowlan net-detect support")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/7a6074fb056d2181e058a3cc6048d8155c20aec7.1683371982.git.christophe.jaillet@wanadoo.fr
14 months agowifi: rtw88: unlock on error path in rtw_ops_add_interface()
Dan Carpenter [Wed, 3 May 2023 15:09:55 +0000 (18:09 +0300)]
wifi: rtw88: unlock on error path in rtw_ops_add_interface()

Call mutex_unlock(&rtwdev->mutex); before returning on this error path.

Fixes: f0e741e4ddbc ("wifi: rtw88: add bitmap for dynamic port settings")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/ddd10a74-5982-4f65-8c59-c1cca558d239@kili.mountain
14 months agowifi: wilc1000: Increase ASSOC response buffer
Amisha Patel [Tue, 9 May 2023 17:29:08 +0000 (17:29 +0000)]
wifi: wilc1000: Increase ASSOC response buffer

In recent access points, information element is longer as they include
additional data which exceeds 256 bytes. To accommodate longer
association response, increase the ASSOC response buffer.

Signed-off-by: Amisha Patel <amisha.patel@microchip.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230509172811.4953-1-amisha.patel@microchip.com
14 months agowifi: wilc1000: fix for absent RSN capabilities WFA testcase
Amisha Patel [Fri, 21 Apr 2023 18:10:20 +0000 (18:10 +0000)]
wifi: wilc1000: fix for absent RSN capabilities WFA testcase

Mandatory WFA testcase
CT_Security_WPA2Personal_STA_RSNEBoundsVerification-AbsentRSNCap,
performs bounds verfication on Beacon and/or Probe response frames. It
failed and observed the reason to be absence of cipher suite and AKM
suite in RSN information. To fix this, enable the RSN flag before extracting RSN
capabilities.

Fixes: cd21d99e595e ("wifi: wilc1000: validate pairwise and authentication suite offsets")
Signed-off-by: Amisha Patel <amisha.patel@microchip.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230421181005.4865-1-amisha.patel@microchip.com
14 months agoMerge branch 'net-mvneta-reduce-size-of-tso-header-allocation'
Paolo Abeni [Thu, 11 May 2023 11:05:19 +0000 (13:05 +0200)]
Merge branch 'net-mvneta-reduce-size-of-tso-header-allocation'

Russell King says:

====================
net: mvneta: reduce size of TSO header allocation

With reference to
https://forum.turris.cz/t/random-kernel-exceptions-on-hbl-tos-7-0/18865/
https://github.com/openwrt/openwrt/pull/12375#issuecomment-1528842334

It appears that mvneta attempts an order-6 allocation for the TSO
header memory. While this succeeds early on in the system's life time,
trying order-6 allocations later can result in failure due to memory
fragmentation.

Firstly, the reason it's so large is that we take the number of
transmit descriptors, and allocate a TSO header buffer for each, and
each TSO header is 256 bytes. The driver uses a simple mechanism to
determine the address - it uses the transmit descriptor index as an
index into the TSO header memory.

(The first obvious question is: do there need to be this
many? Won't each TSO header always have at least one bit
of data to go with it? In other words, wouldn't the maximum
number of TSO headers that a ring could accept be the number
of ring entries divided by 2?)

There is no real need for this memory to be an order-6 allocation,
since nothing in hardware requires this buffer to be contiguous.

Therefore, this series splits this order-6 allocation up into 32
order-1 allocations (8k pages on 4k page platforms), each giving
32 TSO headers per page.

In order to do this, these patches:

1) fix a horrible transmit path error-cleanup bug - the existing
   code unmaps from the first descriptor that was allocated at
   interface bringup, not the first descriptor that the packet
   is using, resulting in the wrong descriptors being unmapped.

2) since xdp support was added, we now have buf->type which indicates
   what this transmit buffer contains. Use this to mark TSO header
   buffers.

3) get rid of IS_TSO_HEADER(), instead using buf->type to determine
   whether this transmit buffer needs to be DMA-unmapped.

4) move tso_build_hdr() into mvneta_tso_put_hdr() to keep all the
   TSO header building code together.

5) split the TSO header allocation into chunks of order-1 pages.

This has now been tested by the Turris folk and has been found to fix
the allocation error.
====================

Link: https://lore.kernel.org/r/ZFtuhJOC03qpASt2@shell.armlinux.org.uk
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
14 months agonet: mvneta: allocate TSO header DMA memory in chunks
Russell King (Oracle) [Wed, 10 May 2023 10:16:03 +0000 (11:16 +0100)]
net: mvneta: allocate TSO header DMA memory in chunks

Now that we no longer need to check whether the DMA address is within
the TSO header DMA memory range for the queue, we can allocate the TSO
header DMA memory in chunks rather than one contiguous order-6 chunk,
which can stress the kernel's memory subsystems to allocate.

Instead, use order-1 (8k) allocations, which will result in 32 order-1
pages containing 32 TSO headers.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
14 months agonet: mvneta: move tso_build_hdr() into mvneta_tso_put_hdr()
Russell King (Oracle) [Wed, 10 May 2023 10:15:58 +0000 (11:15 +0100)]
net: mvneta: move tso_build_hdr() into mvneta_tso_put_hdr()

Move tso_build_hdr() into mvneta_tso_put_hdr() so that all the TSO
header building code is in one place.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
14 months agonet: mvneta: use buf->type to determine whether to dma-unmap
Russell King (Oracle) [Wed, 10 May 2023 10:15:53 +0000 (11:15 +0100)]
net: mvneta: use buf->type to determine whether to dma-unmap

Now that we use a different buffer type for TSO headers, we can use
buf->type to determine whether the original buffer was DMA-mapped or
not. The rules are:

MVNETA_TYPE_XDP_TX - from a DMA pool, no unmap is required
MVNETA_TYPE_XDP_NDO - dma_map_single()'d
MVNETA_TYPE_SKB - normal skbuff, dma_map_single()'d
MVNETA_TYPE_TSO - from the TSO buffer area

This means we only need to call dma_unmap_single() on the XDP_NDO and
SKB types of buffer, and we no longer need the private IS_TSO_HEADER()
which relies on the TSO region being contiguously allocated.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
14 months agonet: mvneta: mark mapped and tso buffers separately
Russell King (Oracle) [Wed, 10 May 2023 10:15:48 +0000 (11:15 +0100)]
net: mvneta: mark mapped and tso buffers separately

Mark dma-mapped skbs and TSO buffers separately, so we can use
buf->type to identify their differences.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
14 months agonet: mvneta: fix transmit path dma-unmapping on error
Russell King (Oracle) [Wed, 10 May 2023 10:15:42 +0000 (11:15 +0100)]
net: mvneta: fix transmit path dma-unmapping on error

The transmit code assumes that the transmit descriptors that are used
begin with the first descriptor in the ring, but this may not be the
case. Fix this by providing a new function that dma-unmaps a range of
numbered descriptor entries, and use that to do the unmapping.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
14 months agotcp: make the first N SYN RTO backoffs linear
David Morley [Tue, 9 May 2023 18:05:58 +0000 (18:05 +0000)]
tcp: make the first N SYN RTO backoffs linear

Currently the SYN RTO schedule follows an exponential backoff
scheme, which can be unnecessarily conservative in cases where
there are link failures. In such cases, it's better to
aggressively try to retransmit packets, so it takes routers
less time to find a repath with a working link.

We chose a default value for this sysctl of 4, to follow
the macOS and IOS backoff scheme of 1,1,1,1,1,2,4,8, ...
MacOS and IOS have used this backoff schedule for over
a decade, since before this 2009 IETF presentation
discussed the behavior:
https://www.ietf.org/proceedings/75/slides/tcpm-1.pdf

This commit makes the SYN RTO schedule start with a number of
linear backoffs given by the following sysctl:
* tcp_syn_linear_timeouts

This changes the SYN RTO scheme to be: init_rto_val for
tcp_syn_linear_timeouts, exp backoff starting at init_rto_val

For example if init_rto_val = 1 and tcp_syn_linear_timeouts = 2, our
backoff scheme would be: 1, 1, 1, 2, 4, 8, 16, ...

Signed-off-by: David Morley <morleyd@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Tested-by: David Morley <morleyd@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20230509180558.2541885-1-morleyd.kernel@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
14 months agonet: wwan: iosm: clean up unused struct members
M Chetan Kumar [Tue, 9 May 2023 16:36:35 +0000 (22:06 +0530)]
net: wwan: iosm: clean up unused struct members

Below members are unused.
- td_tag member defined in struct ipc_pipe.
- adb_finish_timer & params defined in struct iosm_mux.

Remove it to avoid unexpected usage.

Signed-off-by: M Chetan Kumar <m.chetan.kumar@linux.intel.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/92ee483d79dfc871ed7408da8fec60b395ff3a9c.1683649868.git.m.chetan.kumar@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agonet: wwan: iosm: remove unused enum definition
M Chetan Kumar [Tue, 9 May 2023 16:36:22 +0000 (22:06 +0530)]
net: wwan: iosm: remove unused enum definition

ipc_time_unit enum is defined but not used.
Remove it to avoid unexpected usage.

Signed-off-by: M Chetan Kumar <m.chetan.kumar@linux.intel.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/8295a6138f13c686590ee4021384ee992f717408.1683649868.git.m.chetan.kumar@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agonet: wwan: iosm: remove unused macro definition
M Chetan Kumar [Tue, 9 May 2023 16:35:55 +0000 (22:05 +0530)]
net: wwan: iosm: remove unused macro definition

IOSM_IF_ID_PAYLOAD is defined but not used.
Remove it to avoid unexpected usage.

Signed-off-by: M Chetan Kumar <m.chetan.kumar@linux.intel.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/0697e811cb7f10b4fd8f99e66bda1329efdd3d1d.1683649868.git.m.chetan.kumar@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agoMerge tag 'nf-23-05-10' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Jakub Kicinski [Thu, 11 May 2023 02:08:58 +0000 (19:08 -0700)]
Merge tag 'nf-23-05-10' of git://git./linux/kernel/git/netfilter/nf

Pablo Neira Ayuso says:

====================
Netfilter updates for net

The following patchset contains Netfilter fixes for net:

1) Fix UAF when releasing netnamespace, from Florian Westphal.

2) Fix possible BUG_ON when nf_conntrack is enabled with enable_hooks,
   from Florian Westphal.

3) Fixes for nft_flowtable.sh selftest, from Boris Sukholitko.

4) Extend nft_flowtable.sh selftest to cover integration with
   ingress/egress hooks, from Florian Westphal.

* tag 'nf-23-05-10' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  selftests: nft_flowtable.sh: check ingress/egress chain too
  selftests: nft_flowtable.sh: monitor result file sizes
  selftests: nft_flowtable.sh: wait for specific nc pids
  selftests: nft_flowtable.sh: no need for ps -x option
  selftests: nft_flowtable.sh: use /proc for pid checking
  netfilter: conntrack: fix possible bug_on with enable_hooks=1
  netfilter: nf_tables: always release netdev hooks from notifier
====================

Link: https://lore.kernel.org/r/20230510083313.152961-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agoMerge branch 'af_unix-fix-two-data-races-reported-by-kcsan'
Jakub Kicinski [Thu, 11 May 2023 02:06:55 +0000 (19:06 -0700)]
Merge branch 'af_unix-fix-two-data-races-reported-by-kcsan'

Kuniyuki Iwashima says:

====================
af_unix: Fix two data races reported by KCSAN.

KCSAN reported data races around these two fields for AF_UNIX sockets.

  * sk->sk_receive_queue->qlen
  * sk->sk_shutdown

Let's annotate them properly.
====================

Link: https://lore.kernel.org/r/20230510003456.42357-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agoaf_unix: Fix data races around sk->sk_shutdown.
Kuniyuki Iwashima [Wed, 10 May 2023 00:34:56 +0000 (17:34 -0700)]
af_unix: Fix data races around sk->sk_shutdown.

KCSAN found a data race around sk->sk_shutdown where unix_release_sock()
and unix_shutdown() update it under unix_state_lock(), OTOH unix_poll()
and unix_dgram_poll() read it locklessly.

We need to annotate the writes and reads with WRITE_ONCE() and READ_ONCE().

BUG: KCSAN: data-race in unix_poll / unix_release_sock

write to 0xffff88800d0f8aec of 1 bytes by task 264 on cpu 0:
 unix_release_sock+0x75c/0x910 net/unix/af_unix.c:631
 unix_release+0x59/0x80 net/unix/af_unix.c:1042
 __sock_release+0x7d/0x170 net/socket.c:653
 sock_close+0x19/0x30 net/socket.c:1397
 __fput+0x179/0x5e0 fs/file_table.c:321
 ____fput+0x15/0x20 fs/file_table.c:349
 task_work_run+0x116/0x1a0 kernel/task_work.c:179
 resume_user_mode_work include/linux/resume_user_mode.h:49 [inline]
 exit_to_user_mode_loop kernel/entry/common.c:171 [inline]
 exit_to_user_mode_prepare+0x174/0x180 kernel/entry/common.c:204
 __syscall_exit_to_user_mode_work kernel/entry/common.c:286 [inline]
 syscall_exit_to_user_mode+0x1a/0x30 kernel/entry/common.c:297
 do_syscall_64+0x4b/0x90 arch/x86/entry/common.c:86
 entry_SYSCALL_64_after_hwframe+0x72/0xdc

read to 0xffff88800d0f8aec of 1 bytes by task 222 on cpu 1:
 unix_poll+0xa3/0x2a0 net/unix/af_unix.c:3170
 sock_poll+0xcf/0x2b0 net/socket.c:1385
 vfs_poll include/linux/poll.h:88 [inline]
 ep_item_poll.isra.0+0x78/0xc0 fs/eventpoll.c:855
 ep_send_events fs/eventpoll.c:1694 [inline]
 ep_poll fs/eventpoll.c:1823 [inline]
 do_epoll_wait+0x6c4/0xea0 fs/eventpoll.c:2258
 __do_sys_epoll_wait fs/eventpoll.c:2270 [inline]
 __se_sys_epoll_wait fs/eventpoll.c:2265 [inline]
 __x64_sys_epoll_wait+0xcc/0x190 fs/eventpoll.c:2265
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x72/0xdc

value changed: 0x00 -> 0x03

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 222 Comm: dbus-broker Not tainted 6.3.0-rc7-02330-gca6270c12e20 #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014

Fixes: 3c73419c09a5 ("af_unix: fix 'poll for write'/ connected DGRAM sockets")
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agoaf_unix: Fix a data race of sk->sk_receive_queue->qlen.
Kuniyuki Iwashima [Wed, 10 May 2023 00:34:55 +0000 (17:34 -0700)]
af_unix: Fix a data race of sk->sk_receive_queue->qlen.

KCSAN found a data race of sk->sk_receive_queue->qlen where recvmsg()
updates qlen under the queue lock and sendmsg() checks qlen under
unix_state_sock(), not the queue lock, so the reader side needs
READ_ONCE().

BUG: KCSAN: data-race in __skb_try_recv_from_queue / unix_wait_for_peer

write (marked) to 0xffff888019fe7c68 of 4 bytes by task 49792 on cpu 0:
 __skb_unlink include/linux/skbuff.h:2347 [inline]
 __skb_try_recv_from_queue+0x3de/0x470 net/core/datagram.c:197
 __skb_try_recv_datagram+0xf7/0x390 net/core/datagram.c:263
 __unix_dgram_recvmsg+0x109/0x8a0 net/unix/af_unix.c:2452
 unix_dgram_recvmsg+0x94/0xa0 net/unix/af_unix.c:2549
 sock_recvmsg_nosec net/socket.c:1019 [inline]
 ____sys_recvmsg+0x3a3/0x3b0 net/socket.c:2720
 ___sys_recvmsg+0xc8/0x150 net/socket.c:2764
 do_recvmmsg+0x182/0x560 net/socket.c:2858
 __sys_recvmmsg net/socket.c:2937 [inline]
 __do_sys_recvmmsg net/socket.c:2960 [inline]
 __se_sys_recvmmsg net/socket.c:2953 [inline]
 __x64_sys_recvmmsg+0x153/0x170 net/socket.c:2953
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x72/0xdc

read to 0xffff888019fe7c68 of 4 bytes by task 49793 on cpu 1:
 skb_queue_len include/linux/skbuff.h:2127 [inline]
 unix_recvq_full net/unix/af_unix.c:229 [inline]
 unix_wait_for_peer+0x154/0x1a0 net/unix/af_unix.c:1445
 unix_dgram_sendmsg+0x13bc/0x14b0 net/unix/af_unix.c:2048
 sock_sendmsg_nosec net/socket.c:724 [inline]
 sock_sendmsg+0x148/0x160 net/socket.c:747
 ____sys_sendmsg+0x20e/0x620 net/socket.c:2503
 ___sys_sendmsg+0xc6/0x140 net/socket.c:2557
 __sys_sendmmsg+0x11d/0x370 net/socket.c:2643
 __do_sys_sendmmsg net/socket.c:2672 [inline]
 __se_sys_sendmmsg net/socket.c:2669 [inline]
 __x64_sys_sendmmsg+0x58/0x70 net/socket.c:2669
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x72/0xdc

value changed: 0x0000000b -> 0x00000001

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 49793 Comm: syz-executor.0 Not tainted 6.3.0-rc7-02330-gca6270c12e20 #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agonet: datagram: fix data-races in datagram_poll()
Eric Dumazet [Tue, 9 May 2023 17:31:31 +0000 (17:31 +0000)]
net: datagram: fix data-races in datagram_poll()

datagram_poll() runs locklessly, we should add READ_ONCE()
annotations while reading sk->sk_err, sk->sk_shutdown and sk->sk_state.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20230509173131.3263780-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agoMAINTAINERS: re-sort all entries and fields
Linus Torvalds [Thu, 11 May 2023 01:04:04 +0000 (20:04 -0500)]
MAINTAINERS: re-sort all entries and fields

It's been a few years since we've sorted this thing, and the end result
is that we've added MAINTAINERS entries in the wrong order, and a number
of entries have their fields in non-canonical order too.

So roll this boulder up the hill one more time by re-running

   ./scripts/parse-maintainers.pl --order

on it.

This file ends up being fairly painful for merge conflicts even
normally, since unlike almost all other kernel files it's one of those
"everybody touches the same thing", and re-ordering all entries is only
going to make that worse.  But the alternative is to never do it at all,
and just let it all rot..

The rc2 week is likely the quietest and least painful time to do this.

Requested-by: Randy Dunlap <rdunlap@infradead.org>
Requested-by: Joe Perches <joe@perches.com> # "Please use --order"
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 months agoMerge tag 'fsnotify_for_v6.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Wed, 10 May 2023 22:07:42 +0000 (17:07 -0500)]
Merge tag 'fsnotify_for_v6.4-rc2' of git://git./linux/kernel/git/jack/linux-fs

Pull inotify fix from Jan Kara:
 "A fix for possibly reporting invalid watch descriptor with inotify
  event"

* tag 'fsnotify_for_v6.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  inotify: Avoid reporting event with invalid wd