platform/kernel/linux-starfive.git
19 months ago octeontx2-af: cn10kb: Add RPM_USX MAC support
Hariprasad Kelam [Mon, 5 Dec 2022 07:05:19 +0000 (12:35 +0530)]
octeontx2-af: cn10kb: Add RPM_USX MAC support

CN10KB, OcteonTx2's next-generation platform, has an RPM_USX MAC whose
serdes differs from that of the RPM MAC. Although the underlying
HW is different, the CSR interface has been designed largely in line
with the RPM MAC, with a few exceptions. So we use the same
CGX driver for the RPM_USX MAC as well, with a different set of APIs
for RPM_USX wherever necessary.

The RPM and RPM_USX blocks support a different number of LMACs.
RPM_USX supports 8 LMACs per MAC block, whereas legacy RPM supports only 4
LMACs per MAC. As a result, RPM_USX supports double the number of DMAC
filters and double the FIFO size.

This patch adds initial support for CN10KB's RPM_USX MAC, i.e. registering
the driver and defining MAC operations (mac_ops). It also adds the logic to
configure internal loopback and pause frames and to assign FIFO length to
LMACs.

The kernel reads LMAC features such as LMAC type, autoneg, etc. from shared
firmware data. That structure supports only 4 LMACs per MAC, so this patch
extends it to accommodate 8 LMACs.

Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
19 months ago octeontx2-af: Support variable number of lmacs
Rakesh Babu Saladi [Mon, 5 Dec 2022 07:05:18 +0000 (12:35 +0530)]
octeontx2-af: Support variable number of lmacs

Most of the code in the CGX/RPM driver assumes that the maximum number of
LMACs per MAC is always 4 and that the number of MAC blocks is also 4.
With this assumption, the maximum number of interfaces supported is
hardcoded to 16. This creates a problem because the next-gen CN10KB silicon
MAC supports 8 LMACs per MAC block.

This patch solves the problem by using the "max lmac per MAC block"
value from the constant CSRs and the cgx_cnt_max value, which is
populated based on the number of MAC blocks supported by the silicon.
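
A minimal user-space sketch of the idea, not the driver's code: the
interface count and the flat LMAC index are derived from two runtime
values instead of the hardcoded 4 x 4 = 16. The struct and function names
here are hypothetical, and the CN10KB figure of 4 MAC blocks used in the
usage note is only an assumption for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the values the driver reads from hardware. */
struct mac_topology {
	uint8_t mac_blocks;    /* number of MAC blocks (cgx_cnt_max in the text) */
	uint8_t lmacs_per_mac; /* "max lmac per MAC block" from the constant CSRs */
};

/* Total interfaces, derived at runtime rather than hardcoded to 16. */
static unsigned int max_interfaces(const struct mac_topology *t)
{
	assert(t != NULL);
	return (unsigned int)t->mac_blocks * t->lmacs_per_mac;
}

/* Flatten (mac, lmac) into one index using the runtime stride. */
static unsigned int lmac_index(const struct mac_topology *t,
			       unsigned int mac_id, unsigned int lmac_id)
{
	assert(t != NULL);
	assert(mac_id < t->mac_blocks && lmac_id < t->lmacs_per_mac);
	return mac_id * t->lmacs_per_mac + lmac_id;
}
```

With { 4, 4 } this gives the legacy limit of 16 interfaces; with an
assumed { 4, 8 } it gives 32, and LMAC 0 of MAC 1 lands at index 8 rather
than 4.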

Signed-off-by: Rakesh Babu Saladi <rsaladi2@marvell.com>
Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
19 months ago Merge branch 'net-dsa-microchip-add-mtu-support-for-ksz8-series'
Paolo Abeni [Wed, 7 Dec 2022 10:58:00 +0000 (11:58 +0100)]
Merge branch 'net-dsa-microchip-add-mtu-support-for-ksz8-series'

Oleksij Rempel says:

====================
net: dsa: microchip: add MTU support for KSZ8 series
====================

Link: https://lore.kernel.org/r/20221205052232.2834166-1-o.rempel@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
19 months ago net: dsa: microchip: ksz8: move all DSA configurations to one location
Oleksij Rempel [Mon, 5 Dec 2022 05:22:32 +0000 (06:22 +0100)]
net: dsa: microchip: ksz8: move all DSA configurations to one location

To make the code more comparable to KSZ9477 code, move DSA
configurations to the same location.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
19 months ago net: dsa: microchip: enable MTU normalization for KSZ8795 and KSZ9477 compatible...
Oleksij Rempel [Mon, 5 Dec 2022 05:22:31 +0000 (06:22 +0100)]
net: dsa: microchip: enable MTU normalization for KSZ8795 and KSZ9477 compatible switches

KSZ8795 and KSZ9477 compatible series of switches use a global max frame
size configuration register, so enable MTU normalization for them.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
19 months ago net: dsa: microchip: ksz8: add MTU configuration support
Oleksij Rempel [Mon, 5 Dec 2022 05:22:30 +0000 (06:22 +0100)]
net: dsa: microchip: ksz8: add MTU configuration support

Make MTU configurable on KSZ87xx and KSZ88xx series of switches.

Before this patch, the pre-configured behavior differed between
switch series because the same bit has opposite meanings:
- KSZ87xx: Reg 4, Bit 1 - if 1, max frame size is 1532; if 0 - 1514
- KSZ88xx: Reg 4, Bit 1 - if 1, max frame size is 1514; if 0 - 1532

Since the code used "... SW_LEGAL_PACKET_DISABLE, true)", I
assume the idea was to set the max frame size to 1532.

With this patch, by setting MTU size 1500, both switch series will be
configured to the 1532 frame limit.
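
The opposite polarity of Reg 4, Bit 1 can be sketched in user-space C as
follows. This is only an illustration built from the two bullet points
above; the enum, macro, and function names are hypothetical and are not
taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the two switch series discussed above. */
enum ksz8_series { SERIES_KSZ87XX, SERIES_KSZ88XX };

#define FRAME_LIMIT_SMALL 1514
#define FRAME_LIMIT_LARGE 1532

/* Value to write to Reg 4, Bit 1 so that the requested limit applies. */
static bool reg4_bit1_for_limit(enum ksz8_series series, int limit)
{
	assert(limit == FRAME_LIMIT_SMALL || limit == FRAME_LIMIT_LARGE);

	bool want_large = (limit == FRAME_LIMIT_LARGE);

	/* KSZ87xx: 1 selects 1532. KSZ88xx: 1 selects 1514, so invert. */
	return (series == SERIES_KSZ87XX) ? want_large : !want_large;
}
```

Asking for the 1532 limit therefore yields a set bit on KSZ87xx and a
cleared bit on KSZ88xx, which is why writing the same constant to both
series produced different behavior.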

This patch was tested on KSZ8873.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
19 months ago net: dsa: microchip: add ksz_rmw8() function
Oleksij Rempel [Mon, 5 Dec 2022 05:22:29 +0000 (06:22 +0100)]
net: dsa: microchip: add ksz_rmw8() function

Add ksz_rmw8(), it will be used in the next patch.
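
For readers unfamiliar with the pattern, an 8-bit read-modify-write helper
can be sketched in user-space C as below. This is not the ksz_rmw8()
implementation; the fake register file and the rmw8() name are
hypothetical stand-ins, and the real helper goes through the driver's
register access layer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FAKE_REG_COUNT 256

/* Hypothetical register file standing in for the switch's 8-bit registers. */
struct fake_regs {
	uint8_t reg[FAKE_REG_COUNT];
};

/*
 * Read-modify-write: only the bits set in mask are changed, and they take
 * the corresponding bits of val. All other bits keep their old value.
 */
static void rmw8(struct fake_regs *r, size_t offset, uint8_t mask, uint8_t val)
{
	assert(r != NULL && offset < FAKE_REG_COUNT);

	uint8_t old = r->reg[offset];

	r->reg[offset] = (uint8_t)((old & ~mask) | (val & mask));
}
```

Starting from 0xF0, rmw8(&r, 4, 0x02, 0x02) sets only bit 1 and leaves
0xF2, so a single-bit setting such as the frame size bit can be changed
without disturbing its neighbours.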

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
19 months ago net: dsa: microchip: do not store max MTU for all ports
Oleksij Rempel [Mon, 5 Dec 2022 05:22:28 +0000 (06:22 +0100)]
net: dsa: microchip: do not store max MTU for all ports

If we have a global MTU configuration, it is enough to configure it on
the CPU port only.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
19 months ago net: dsa: microchip: move max mtu to one location
Oleksij Rempel [Mon, 5 Dec 2022 05:22:27 +0000 (06:22 +0100)]
net: dsa: microchip: move max mtu to one location

There are no HW specific registers, so we can process all of them
in one location.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Tested-by: Arun Ramadoss <arun.ramadoss@microchip.com> (KSZ9893 and LAN937x)
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
19 months ago net: ethernet: mtk_wed: Fix missing of_node_put() in mtk_wed_wo_hardware_init()
Yuan Can [Mon, 5 Dec 2022 03:43:39 +0000 (03:43 +0000)]
net: ethernet: mtk_wed: Fix missing of_node_put() in mtk_wed_wo_hardware_init()

The np needs to be released through of_node_put() in the error handling
path of mtk_wed_wo_hardware_init().

Fixes: 799684448e3e ("net: ethernet: mtk_wed: introduce wed wo support")
Signed-off-by: Yuan Can <yuancan@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20221205034339.112163-1-yuancan@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
19 months ago net: ethernet: mtk_wed: add reset to rx_ring_setup callback
Lorenzo Bianconi [Mon, 5 Dec 2022 11:34:42 +0000 (12:34 +0100)]
net: ethernet: mtk_wed: add reset to rx_ring_setup callback

This patch adds a reset parameter to the mtk_wed_rx_ring_setup signature
in order to align the rx_ring_setup callback with the tx_ring_setup one
introduced in commit 23dca7a90017 ("net: ethernet: mtk_wed: add reset to
tx_ring_setup callback").

Co-developed-by: Sujuan Chen <sujuan.chen@mediatek.com>
Signed-off-by: Sujuan Chen <sujuan.chen@mediatek.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/29c6e7a5469e784406cf3e2920351d1207713d05.1670239984.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
19 months ago net: microchip: vcap: Remove unneeded semicolons
zhang songyi [Mon, 5 Dec 2022 06:22:15 +0000 (14:22 +0800)]
net: microchip: vcap: Remove unneeded semicolons

Semicolons after "}" are not needed.

Signed-off-by: zhang songyi <zhang.songyi@zte.com.cn>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/202212051422158113766@zte.com.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
19 months ago sfc: use sysfs_emit() to instead of scnprintf()
ye xingchen [Mon, 5 Dec 2022 02:21:45 +0000 (10:21 +0800)]
sfc: use sysfs_emit() to instead of scnprintf()

Follow the advice in Documentation/filesystems/sysfs.rst: show()
should only use sysfs_emit() or sysfs_emit_at() when formatting the
value to be returned to user space.
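
The reason for the rule is that a show() buffer is one page long, and a
page-aware helper cannot overrun it. The following user-space C sketch
mimics that behavior under stated assumptions: emit_page() and
FAKE_PAGE_SIZE are hypothetical names, the page size of 4096 is assumed,
and the real sysfs_emit() additionally checks that the buffer is
page-aligned.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

#define FAKE_PAGE_SIZE 4096 /* assumed page size for this sketch */

/*
 * Format into a page-sized buffer and return the number of characters
 * actually stored (excluding the terminating NUL), never more than the
 * buffer can hold.
 */
static int emit_page(char *buf, const char *fmt, ...)
{
	va_list args;
	int len;

	assert(buf != NULL && fmt != NULL);

	va_start(args, fmt);
	len = vsnprintf(buf, FAKE_PAGE_SIZE, fmt, args);
	va_end(args);

	if (len < 0)
		return 0;
	if (len >= FAKE_PAGE_SIZE)
		len = FAKE_PAGE_SIZE - 1; /* report what was stored */
	return len;
}
```

Unlike scnprintf(), the caller does not pass a size, so there is no size
argument to get wrong.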

Signed-off-by: ye xingchen <ye.xingchen@zte.com.cn>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Link: https://lore.kernel.org/r/202212051021451139126@zte.com.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
19 months ago net: sfp: clean up i2c-bus property parsing
Russell King (Oracle) [Sat, 3 Dec 2022 17:25:15 +0000 (17:25 +0000)]
net: sfp: clean up i2c-bus property parsing

We currently have some complicated code in sfp_probe() which gets the
I2C bus depending on whether the sfp node is DT or ACPI, and we use
completely separate lookup functions.

This would be more readable in a separate function, so move it to a new
function, sfp_i2c_get(). We can also use fwnode_find_reference() to look
up the I2C bus fwnode before descending into fwnode-type specific
parsing.

A future cleanup would be to move the fwnode-type specific parsing into
the i2c layer, which is where it really should be.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1p1WGJ-0098wS-4w@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
19 months ago net/ncsi: Silence runtime memcpy() false positive warning
Kees Cook [Fri, 2 Dec 2022 21:24:22 +0000 (13:24 -0800)]
net/ncsi: Silence runtime memcpy() false positive warning

The memcpy() in ncsi_cmd_handler_oem deserializes nca->data into a
flexible array structure that intentionally overlaps with non-flex-array
members (mfr_id). Since the mem_to_flex() API is not finished,
temporarily silence this warning, which is a false positive, using
unsafe_memcpy().

Reported-by: Joel Stanley <joel@jms.id.au>
Link: https://lore.kernel.org/netdev/CACPK8Xdfi=OJKP0x0D1w87fQeFZ4A2DP2qzGCRcuVbpU-9=4sQ@mail.gmail.com/
Cc: Samuel Mendoza-Jonas <sam@mendozajonas.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20221202212418.never.837-kees@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
19 months ago Merge branch 'net-lan966x-enable-ptp-on-bridge-interfaces'
Paolo Abeni [Tue, 6 Dec 2022 12:26:45 +0000 (13:26 +0100)]
Merge branch 'net-lan966x-enable-ptp-on-bridge-interfaces'

Horatiu Vultur says:

====================
net: lan966x: Enable PTP on bridge interfaces

Previously it was not allowed to run PTP on ports that are part of a bridge
because, in the case of a transparent clock, the HW would still forward the
frames, so there would be duplicate frames.
Now that there is VCAP support, it is possible to add entries in the VCAP
to trap frames to the CPU, and the CPU will forward these frames.
The first part of the patch series extends the VCAP support to be able to
modify and get a rule, while the last patch uses the VCAP to trap the PTP
frames.
====================

Link: https://lore.kernel.org/r/20221203104348.1749811-1-horatiu.vultur@microchip.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
19 months ago net: lan966x: Add ptp trap rules
Horatiu Vultur [Sat, 3 Dec 2022 10:43:48 +0000 (11:43 +0100)]
net: lan966x: Add ptp trap rules

Currently lan966x doesn't allow running PTP over interfaces that are
part of the bridge. The reason is that when the lan966x received a
PTP frame (regardless of whether it was L2/IPv4/IPv6), the HW would flood
this frame.
Now that it is possible to add VCAP rules to the HW to trap these
frames to the CPU, it is possible to run PTP also over interfaces that
are part of the bridge.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
19 months ago net: microchip: vcap: Add vcap_rule_get_key_u32
Horatiu Vultur [Sat, 3 Dec 2022 10:43:47 +0000 (11:43 +0100)]
net: microchip: vcap: Add vcap_rule_get_key_u32

Add the function vcap_rule_get_key_u32, which gets the value and
the mask of a key that exists on the rule. If the key doesn't exist,
it returns an error.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
19 months ago net: microchip: vcap: Add vcap_mod_rule
Horatiu Vultur [Sat, 3 Dec 2022 10:43:46 +0000 (11:43 +0100)]
net: microchip: vcap: Add vcap_mod_rule

Add the function vcap_mod_rule, which updates an existing rule
in the vcap. The rule must already exist in the vcap to be
modified.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
19 months ago net: microchip: vcap: Add vcap_get_rule
Horatiu Vultur [Sat, 3 Dec 2022 10:43:45 +0000 (11:43 +0100)]
net: microchip: vcap: Add vcap_get_rule

Add the function vcap_get_rule, which returns a rule based on the
internal rule id.
The entire functionality of reading and decoding the rule from the VCAP
was inside the vcap_api_debugfs file, so move the entire implementation
into vcap_api, as it is also used by vcap_get_rule.

Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
19 months ago net: mtk_eth_soc: enable flow offload support for MT7986 SoC
Lorenzo Bianconi [Sat, 3 Dec 2022 13:20:37 +0000 (14:20 +0100)]
net: mtk_eth_soc: enable flow offload support for MT7986 SoC

Since Wireless Ethernet Dispatcher is now available for mt7986 in mt76,
enable hw flow support for MT7986 SoC.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/fdcaacd827938e6a8c4aa1ac2c13e46d2c08c821.1670072898.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
19 months ago ethtool: add netlink based get rss support
Sudheer Mogilappagari [Fri, 2 Dec 2022 00:25:55 +0000 (16:25 -0800)]
ethtool: add netlink based get rss support

Add netlink based support for "ethtool -x <dev> [context x]"
command by implementing ETHTOOL_MSG_RSS_GET netlink message.
This is equivalent to functionality provided via ETHTOOL_GRSSH
in ioctl path. It sends RSS table, hash key and hash function
of an interface to user space.

This patch implements existing functionality available
in ioctl path and enables addition of new RSS context
based parameters in future.

Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com>
Link: https://lore.kernel.org/r/20221202002555.241580-1-sudheer.mogilappagari@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
19 months ago net: phy: mxl-gpy: rename MMD_VEND1 macros to match datasheet
Michael Walle [Fri, 2 Dec 2022 14:49:00 +0000 (15:49 +0100)]
net: phy: mxl-gpy: rename MMD_VEND1 macros to match datasheet

Rename the temperature sensor macros to match the names in the
datasheet.

Signed-off-by: Michael Walle <michael@walle.cc>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 months ago nfp: add support for multicast filter
Diana Wang [Fri, 2 Dec 2022 09:42:14 +0000 (10:42 +0100)]
nfp: add support for multicast filter

Rewrite nfp_net_set_rx_mode() to implement an interface that delivers
mc addresses and operations to the firmware through the general mailbox,
for filtering multicast packets.

The operations are adding an mc address and deleting an mc address.
The limit is 1024 mc addresses per net device.

A user adds an mc address with the command below:
ip maddress add <mc address> dev <interface name>

A user deletes an mc address with the command below:
ip maddress del <mc address> dev <interface name>

Signed-off-by: Diana Wang <na.wang@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 months ago net: ipa: use sysfs_emit() to instead of scnprintf()
ye xingchen [Fri, 2 Dec 2022 08:42:14 +0000 (16:42 +0800)]
net: ipa: use sysfs_emit() to instead of scnprintf()

Follow the advice in Documentation/filesystems/sysfs.rst: show()
should only use sysfs_emit() or sysfs_emit_at() when formatting the
value to be returned to user space.

Signed-off-by: ye xingchen <ye.xingchen@zte.com.cn>
Reviewed-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 months ago Merge tag 'rxrpc-next-20221201-b' of git://git.kernel.org/pub/scm/linux/kernel/git...
David S. Miller [Mon, 5 Dec 2022 10:58:17 +0000 (10:58 +0000)]
Merge tag 'rxrpc-next-20221201-b' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

David Howells says:

====================
rxrpc: Increasing SACK size and moving away from softirq, parts 2 & 3

Here are the second and third parts of patches in the process of moving
rxrpc from doing a lot of its stuff in softirq context to doing it in an
I/O thread in process context and thereby making it easier to support a
larger SACK table.

The full description is in the description for the first part[1] which is
already in net-next.

The second part includes some cleanups, adds some testing and overhauls
some tracing:

 (1) Remove declaration of rxrpc_kernel_call_is_complete() as the
     definition is no longer present.

 (2) Remove the knet() and kproto() macros in favour of using tracepoints.

 (3) Remove handling of duplicate packets from recvmsg.  The input side
     isn't now going to insert overlapping/duplicate packets into the
     recvmsg queue.

 (4) Don't use the rxrpc_conn_parameters struct in the rxrpc_connection or
     rxrpc_bundle structs - rather put the members in directly.

 (5) Extract the abort code from a received abort packet right up front
     rather than doing it in multiple places later.

 (6) Use enums and symbol lists rather than __builtin_return_address() to
     indicate where a tracepoint was triggered for local, peer, conn, call
     and skbuff tracing.

 (7) Add a refcount tracepoint for the rxrpc_bundle struct.

 (8) Implement an in-kernel server for the AFS rxperf testing program to
     talk to (enabled by a Kconfig option).

This is tagged as rxrpc-next-20221201-a.

The third part introduces the I/O thread and switches various bits over to
running there:

 (1) Fix call timers and call and connection workqueues to not hold refs on
     the rxrpc_call and rxrpc_connection structs to thereby avoid messy
     cleanup when the last ref is put in softirq mode.

 (2) Split input.c so that the call packet processing bits are separate
     from the received packet distribution bits.  Call packet processing
     gets bumped over to the call event handler.

 (3) Create a per-local endpoint I/O thread.  Barring some tiny bits that
     still get done in softirq context, all packet reception, processing
     and transmission is done in this thread.  That will allow a load of
     locking to be removed.

 (4) Perform packet processing and error processing from the I/O thread.

 (5) Provide a mechanism to process call event notifications in the I/O
     thread rather than queuing a work item for that call.

 (6) Move data and ACK transmission into the I/O thread.  ACKs can then be
     transmitted at the point they're generated rather than getting
     delegated from softirq context to some process context somewhere.

 (7) Move call and local processor event handling into the I/O thread.

 (8) Move cwnd degradation to after packets have been transmitted so that
     they don't shorten the window too quickly.

A bunch of simplifications can then be done:

 (1) The input_lock is no longer necessary as exclusion is achieved by
     running the code in the I/O thread only.

 (2) Don't need to use sk->sk_receive_queue.lock to guard socket state
     changes as the socket mutex should suffice.

 (3) Don't take spinlocks in RCU callback functions as they get run in
     softirq context and thus need _bh annotations.

 (4) RCU is then no longer needed for the peer's error_targets list.

 (5) Simplify the skbuff handling in the receive path by dropping the ref
     in the basic I/O thread loop and getting an extra ref as and when we
     need to queue the packet for recvmsg or another context.

 (6) Get the peer address earlier in the input process and pass it to the
     users so that we only do it once.

This is tagged as rxrpc-next-20221201-b.

Changes:
========
ver #2)
 - Added a patch to change four assertions into warnings in rxrpc_read()
   and fixed a checker warning from a __user annotation that should have
   been removed.
 - Change a min() to min_t() in rxperf as PAGE_SIZE doesn't seem to match
   type size_t on i386.
 - Three error handling issues in rxrpc_new_incoming_call():
   - If not DATA or not seq #1, should drop the packet, not abort.
   - Fix a goto that went to the wrong place, dropping a non-held lock.
   - Fix an rcu_read_lock that should've been an unlock.

Tested-by: Marc Dionne <marc.dionne@auristor.com>
Tested-by: kafs-testing+fedora36_64checkkafs-build-144@auristor.com
Link: https://lore.kernel.org/r/166794587113.2389296.16484814996876530222.stgit@warthog.procyon.org.uk/
Link: https://lore.kernel.org/r/166982725699.621383.2358362793992993374.stgit@warthog.procyon.org.uk/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
19 months ago net: stmmac: tegra: Add MGBE support
Bhadram Varka [Thu, 1 Dec 2022 15:58:44 +0000 (15:58 +0000)]
net: stmmac: tegra: Add MGBE support

Add support for the Multi-Gigabit Ethernet (MGBE/XPCS) IP found on
NVIDIA Tegra234 SoCs.

Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Bhadram Varka <vbhadram@nvidia.com>
Co-developed-by: Revanth Kumar Uppala <ruppala@nvidia.com>
Signed-off-by: Revanth Kumar Uppala <ruppala@nvidia.com>
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 months ago net: stmmac: Power up SERDES after the PHY link
Revanth Kumar Uppala [Thu, 1 Dec 2022 15:58:43 +0000 (15:58 +0000)]
net: stmmac: Power up SERDES after the PHY link

The Tegra MGBE ethernet controller requires that the SERDES link is
powered-up after the PHY link is up, otherwise the link fails to
become ready following a resume from suspend. Add a variable to indicate
that the SERDES link must be powered-up after the PHY link.

Signed-off-by: Revanth Kumar Uppala <ruppala@nvidia.com>
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 months ago Merge branch 'r8169-irq-coalesce'
David S. Miller [Sat, 3 Dec 2022 21:49:23 +0000 (21:49 +0000)]
Merge branch 'r8169-irq-coalesce'

Heiner Kallweit says:

====================
net: add and use netdev_sw_irq_coalesce_default_on()

There are reports about r8169 not reaching full line speed on certain
systems (e.g. SBCs) with a 2.5Gbps link.
There was a time when hardware interrupt coalescing was enabled by
default, but this was changed due to ASPM-related issues on a few systems.

Meanwhile we have sysfs attributes for controlling a kind of
"software interrupt coalescing" on the GRO level. However, most distros
and users don't know about it. So let's set a conservative default for
both involved parameters. Users can still override the defaults via
sysfs. Don't enable these settings on the fast ethernet chip versions;
they are slow enough.

Even with these conservative settings, the interrupt load on my 1Gbps test
system was reduced significantly.

Follow Jakub's suggestion and put this functionality into net core
so that other MAC drivers can reuse it.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
19 months ago r8169: enable GRO software interrupt coalescing per default
Heiner Kallweit [Wed, 30 Nov 2022 22:30:15 +0000 (23:30 +0100)]
r8169: enable GRO software interrupt coalescing per default

There are reports about r8169 not reaching full line speed on certain
systems (e.g. SBCs) with a 2.5Gbps link.
There was a time when hardware interrupt coalescing was enabled by
default, but this was changed due to ASPM-related issues on a few systems.
So let's use software interrupt coalescing instead and enable it
using the new function netdev_sw_irq_coalesce_default_on().

Even with these conservative settings, the interrupt load on my 1Gbps test
system was reduced significantly.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 months ago net: add netdev_sw_irq_coalesce_default_on()
Heiner Kallweit [Wed, 30 Nov 2022 22:28:26 +0000 (23:28 +0100)]
net: add netdev_sw_irq_coalesce_default_on()

Add a helper for drivers wanting to set SW IRQ coalescing
by default. The related sysfs attributes can be used to
override the default values.

Follow Jakub's suggestion and put this functionality into
net core so that drivers wanting to use software interrupt
coalescing per default don't have to open-code it.

Note that this function needs to be called before the
netdevice is registered.
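
What such a helper does can be sketched in user-space C with a mock
device structure. Everything here is an assumption for illustration: the
struct is a hypothetical, trimmed-down stand-in for struct net_device,
the two fields are named after the sysfs attributes gro_flush_timeout and
napi_defer_hard_irqs as I understand them, and the default values 20000
and 1 are illustrative rather than quoted from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for struct net_device. */
struct mock_netdev {
	bool registered;                 /* true once the device is registered */
	unsigned long gro_flush_timeout; /* nanoseconds; 0 means disabled */
	int napi_defer_hard_irqs;        /* 0 means disabled */
};

/* Set conservative software IRQ coalescing defaults before registration. */
static void mock_sw_irq_coalesce_default_on(struct mock_netdev *dev)
{
	/* Mirrors the note above: must be called before registration. */
	assert(dev != NULL && !dev->registered);

	dev->gro_flush_timeout = 20000; /* assumed default */
	dev->napi_defer_hard_irqs = 1;  /* assumed default */
}
```

A driver would call the helper once in its probe path, before
registering the device, and users could still change both values through
sysfs afterwards.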

Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 months ago net: ethernet: mtk_wed: fix sleep while atomic in mtk_wed_wo_queue_refill
Lorenzo Bianconi [Thu, 1 Dec 2022 15:26:53 +0000 (16:26 +0100)]
net: ethernet: mtk_wed: fix sleep while atomic in mtk_wed_wo_queue_refill

In order to fix the following sleep-while-atomic bug, always allocate
pages with GFP_ATOMIC in mtk_wed_wo_queue_refill, since page_frag_alloc
runs in a spin_lock critical section.

[    9.049719] Hardware name: MediaTek MT7986a RFB (DT)
[    9.054665] Call trace:
[    9.057096]  dump_backtrace+0x0/0x154
[    9.060751]  show_stack+0x14/0x1c
[    9.064052]  dump_stack_lvl+0x64/0x7c
[    9.067702]  dump_stack+0x14/0x2c
[    9.071001]  ___might_sleep+0xec/0x120
[    9.074736]  __might_sleep+0x4c/0x9c
[    9.078296]  __alloc_pages+0x184/0x2e4
[    9.082030]  page_frag_alloc_align+0x98/0x1ac
[    9.086369]  mtk_wed_wo_queue_refill+0x134/0x234
[    9.090974]  mtk_wed_wo_init+0x174/0x2c0
[    9.094881]  mtk_wed_attach+0x7c8/0x7e0
[    9.098701]  mt7915_mmio_wed_init+0x1f0/0x3a0 [mt7915e]
[    9.103940]  mt7915_pci_probe+0xec/0x3bc [mt7915e]
[    9.108727]  pci_device_probe+0xac/0x13c
[    9.112638]  really_probe.part.0+0x98/0x2f4
[    9.116807]  __driver_probe_device+0x94/0x13c
[    9.121147]  driver_probe_device+0x40/0x114
[    9.125314]  __driver_attach+0x7c/0x180
[    9.129133]  bus_for_each_dev+0x5c/0x90
[    9.132953]  driver_attach+0x20/0x2c
[    9.136513]  bus_add_driver+0x104/0x1fc
[    9.140333]  driver_register+0x74/0x120
[    9.144153]  __pci_register_driver+0x40/0x50
[    9.148407]  mt7915_init+0x5c/0x1000 [mt7915e]
[    9.152848]  do_one_initcall+0x40/0x25c
[    9.156669]  do_init_module+0x44/0x230
[    9.160403]  load_module+0x1f30/0x2750
[    9.164135]  __do_sys_init_module+0x150/0x200
[    9.168475]  __arm64_sys_init_module+0x18/0x20
[    9.172901]  invoke_syscall.constprop.0+0x4c/0xe0
[    9.177589]  do_el0_svc+0x48/0xe0
[    9.180889]  el0_svc+0x14/0x50
[    9.183929]  el0t_64_sync_handler+0x9c/0x120
[    9.188183]  el0t_64_sync+0x158/0x15c

Fixes: 799684448e3e ("net: ethernet: mtk_wed: introduce wed wo support")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Link: https://lore.kernel.org/r/67ca94bdd3d9eaeb86e52b3050fbca0bcf7bb02f.1669908312.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
19 months ago tcp: use 2-arg optimal variant of kfree_rcu()
Eric Dumazet [Fri, 2 Dec 2022 05:28:47 +0000 (05:28 +0000)]
tcp: use 2-arg optimal variant of kfree_rcu()

kfree_rcu(1-arg) should be avoided as much as possible,
since it is only usable from sleepable contexts
and incurs extra rcu barriers.

I wish the 1-arg variant of kfree_rcu() would
get a distinct name, like kfree_rcu_slow()
to avoid it being abused.

Fixes: 459837b522f7 ("net/tcp: Disable TCP-MD5 static key on tcp_md5sig_info destruction")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Dmitry Safonov <dima@arista.com>
Link: https://lore.kernel.org/r/20221202052847.2623997-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
19 months ago Merge tag 'wireless-next-2022-12-02' of git://git.kernel.org/pub/scm/linux/kernel...
Jakub Kicinski [Sat, 3 Dec 2022 04:33:29 +0000 (20:33 -0800)]
Merge tag 'wireless-next-2022-12-02' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next

Kalle Valo says:

====================
wireless-next patches for v6.2

Third set of patches for v6.2. mt76 has a new driver for mt7996 Wi-Fi 7
devices and iwlwifi also got initial Wi-Fi 7 support. Otherwise
smaller features and fixes.

Major changes:

ath10k
 - store WLAN firmware version in SMEM image table

mt76
 - mt7996: new driver for MediaTek Wi-Fi 7 (802.11be) devices
 - mt7986, mt7915: enable Wireless Ethernet Dispatch (WED) offload support
 - mt7915: add ack signal support
 - mt7915: enable coredump support
 - mt7921: remain_on_channel support
 - mt7921: channel context support

iwlwifi
 - enable Wi-Fi 7 Extremely High Throughput (EHT) PHY capabilities
 - 320 MHz channels support

* tag 'wireless-next-2022-12-02' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (144 commits)
  wifi: ath10k: fix QCOM_SMEM dependency
  wifi: mt76: mt7921e: add pci .shutdown() support
  wifi: mt76: mt7915: mmio: fix naming convention
  wifi: mt76: mt7996: add support to configure spatial reuse parameter set
  wifi: mt76: mt7996: enable ack signal support
  wifi: mt76: mt7996: enable use_cts_prot support
  wifi: mt76: mt7915: rely on band_idx of mt76_phy
  wifi: mt76: mt7915: enable per bandwidth power limit support
  wifi: mt76: mt7915: introduce mt7915_get_power_bound()
  mt76: mt7915: Fix PCI device refcount leak in mt7915_pci_init_hif2()
  wifi: mt76: do not send firmware FW_FEATURE_NON_DL region
  wifi: mt76: mt7921: Add missing __packed annotation of struct mt7921_clc
  wifi: mt76: fix coverity overrun-call in mt76_get_txpower()
  wifi: mt76: mt7996: add driver for MediaTek Wi-Fi 7 (802.11be) devices
  wifi: mt76: mt76x0: remove dead code in mt76x0_phy_get_target_power
  wifi: mt76: mt7915: fix band_idx usage
  wifi: mt76: mt7915: enable .sta_set_txpwr support
  wifi: mt76: mt7915: add basedband Txpower info into debugfs
  wifi: mt76: mt7915: add support to configure spatial reuse parameter set
  wifi: mt76: mt7915: add missing MODULE_PARM_DESC
  ...
====================

Link: https://lore.kernel.org/r/20221202214254.D0D3DC433C1@smtp.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
19 months ago wifi: ath10k: fix QCOM_SMEM dependency
Kalle Valo [Fri, 2 Dec 2022 10:30:27 +0000 (12:30 +0200)]
wifi: ath10k: fix QCOM_SMEM dependency

Nathan noticed that when HWSPINLOCK is disabled there's a Kconfig warning:

  WARNING: unmet direct dependencies detected for QCOM_SMEM
    Depends on [n]: (ARCH_QCOM [=y] || COMPILE_TEST [=n]) && HWSPINLOCK [=n]
    Selected by [m]:
    - ATH10K_SNOC [=m] && NETDEVICES [=y] && WLAN [=y] && WLAN_VENDOR_ATH [=y] && ATH10K [=m] && (ARCH_QCOM [=y] || COMPILE_TEST [=n])

The problem here is that QCOM_SMEM depends on HWSPINLOCK, so we cannot select
QCOM_SMEM and instead we need to use 'depends on'.

Reported-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/all/Y4YsyaIW+CPdHWv3@dev-arch.thelio-3990X/
Fixes: 4d79f6f34bbb ("wifi: ath10k: Store WLAN firmware version in SMEM image table")
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20221202103027.25974-1-kvalo@kernel.org
19 months ago tsnep: Rework RX buffer allocation
Gerhard Engleder [Wed, 30 Nov 2022 19:37:08 +0000 (20:37 +0100)]
tsnep: Rework RX buffer allocation

Refill RX queue in batches of descriptors to improve performance. Refill
is allowed to fail as long as a minimum number of descriptors is active.
Thus, a limited number of failed RX buffer allocations is now allowed
for normal operation. Previously every failed allocation resulted in a
dropped frame.

If the minimum number of active descriptors is reached, then RX buffers
are still reused and frames are dropped. This ensures that the RX queue
never runs empty and always continues to operate.
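
The refill policy described above can be sketched as a toy model. This is
only an illustration in Python, not the tsnep driver code: the class RxRing,
the helper make_alloc() and the constants RING_SIZE, MIN_ACTIVE and
REFILL_BATCH are hypothetical names and values chosen for the example.

```python
from collections import deque

RING_SIZE = 8      # hypothetical number of descriptors in the ring
MIN_ACTIVE = 2     # hypothetical minimum of active descriptors
REFILL_BATCH = 4   # hypothetical batch size that triggers a refill


class RxRing:
    """Toy model of an RX ring; not the tsnep driver code."""

    def __init__(self, alloc):
        self.alloc = alloc     # returns a buffer, or None on failure
        self.active = deque()  # buffers currently posted to the hardware
        self.delivered = 0
        self.dropped = 0

    def refill(self):
        """Post buffers until the ring is full; a failed allocation just stops."""
        added = 0
        while len(self.active) < RING_SIZE:
            buf = self.alloc()
            if buf is None:
                break
            self.active.append(buf)
            added += 1
        return added

    def receive(self):
        """Process one received frame."""
        if len(self.active) <= MIN_ACTIVE:
            # At the minimum: reuse the oldest buffer and drop the frame,
            # so the ring never runs empty.
            self.active.rotate(-1)
            self.dropped += 1
        else:
            # Normal case: the buffer leaves the ring together with the frame.
            self.active.popleft()
            self.delivered += 1
        # Refill only once a whole batch of descriptors is free.
        if RING_SIZE - len(self.active) >= REFILL_BATCH:
            self.refill()


def make_alloc(budget):
    """Allocator that succeeds `budget` times and then fails."""
    remaining = [budget]

    def alloc():
        if remaining[0] == 0:
            return None
        remaining[0] -= 1
        return bytearray(2048)

    return alloc
```

With an allocator that starts failing, frames are still delivered until only
MIN_ACTIVE buffers remain; after that every frame is dropped and its buffer
is reused.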

Prework for future XDP support.

Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 months ago tsnep: Throttle interrupts
Gerhard Engleder [Wed, 30 Nov 2022 19:37:07 +0000 (20:37 +0100)]
tsnep: Throttle interrupts

Without interrupt throttling, iperf server mode generates a CPU load of
100% (A53 1.2GHz). Also the throughput suffers with less than 900Mbit/s
on a 1Gbit/s link. The reason is a high interrupt load with interrupts
every ~20us.

Reduce interrupt load by throttling of interrupts. Interrupt delay
default is 64us. For iperf server mode the CPU load is significantly
reduced to ~20% and the throughput reaches the maximum of 941MBit/s.
Interrupts are generated every ~140us.

RX and TX coalesce can be configured with ethtool. RX coalesce has
priority over TX coalesce if the same interrupt is used.

Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 months ago tsnep: Add ethtool::get_channels support
Gerhard Engleder [Wed, 30 Nov 2022 19:37:06 +0000 (20:37 +0100)]
tsnep: Add ethtool::get_channels support

Allow user space to read the number of TX and RX queues. This is useful for
device dependent qdisc configurations like TAPRIO with hardware offload.
Also ethtool::get_per_queue_coalesce / set_per_queue_coalesce requires
that interface.

Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 months ago tsnep: Consistent naming of struct net_device
Gerhard Engleder [Wed, 30 Nov 2022 19:37:05 +0000 (20:37 +0100)]
tsnep: Consistent naming of struct net_device

Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 months ago Documentation: bonding: correct xmit hash steps
Jonathan Toppins [Wed, 30 Nov 2022 20:12:07 +0000 (15:12 -0500)]
Documentation: bonding: correct xmit hash steps

Correct xmit hash steps for layer3+4 as introduced by commit
49aefd131739 ("bonding: do not discard lowest hash bit for non layer3+4
hashing").

Signed-off-by: Jonathan Toppins <jtoppins@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 months ago Documentation: bonding: update miimon default to 100
Jonathan Toppins [Wed, 30 Nov 2022 20:12:06 +0000 (15:12 -0500)]
Documentation: bonding: update miimon default to 100

With commit c1f897ce186a ("bonding: set default miimon value for non-arp
modes if not set") the miimon default was changed from zero to 100 if
arp_interval is also zero. Document this fact in bonding.rst.

Fixes: c1f897ce186a ("bonding: set default miimon value for non-arp modes if not set")
Signed-off-by: Jonathan Toppins <jtoppins@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 months ago net: thunderbolt: Use bitwise types in the struct thunderbolt_ip_frame_header
Andy Shevchenko [Wed, 30 Nov 2022 12:36:13 +0000 (14:36 +0200)]
net: thunderbolt: Use bitwise types in the struct thunderbolt_ip_frame_header

The main usage of the struct thunderbolt_ip_frame_header is to handle
the packets on the media layer. The header is bound to the protocol
in which the byte ordering is crucial. However the data type definition
doesn't use that and sparse is unhappy, for example (17 altogether):

  .../thunderbolt.c:718:23: warning: cast to restricted __le32

  .../thunderbolt.c:966:42: warning: incorrect type in assignment (different base types)
  .../thunderbolt.c:966:42:    expected unsigned int [usertype] frame_count
  .../thunderbolt.c:966:42:    got restricted __le32 [usertype]

Switch to the bitwise types in the struct thunderbolt_ip_frame_header to
reduce this, but not completely solving (9 left), because the same data
type is used for Rx header handled locally (in CPU byte order).

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 months ago net: thunderbolt: Switch from __maybe_unused to pm_sleep_ptr() etc
Andy Shevchenko [Wed, 30 Nov 2022 12:36:12 +0000 (14:36 +0200)]
net: thunderbolt: Switch from __maybe_unused to pm_sleep_ptr() etc

Letting the compiler remove these functions when the kernel is built
without CONFIG_PM_SLEEP support is simpler and lighter for builds
than the use of __maybe_unused attributes.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 months ago net: devlink: convert port_list into xarray
Jiri Pirko [Wed, 30 Nov 2022 08:52:50 +0000 (09:52 +0100)]
net: devlink: convert port_list into xarray

Some devlink instances may contain thousands of ports. Storing them in
a linked list and looking them up is not scalable. Convert the linked list
into an xarray.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
19 months ago Merge branch 'hsr'
Jakub Kicinski [Fri, 2 Dec 2022 04:26:24 +0000 (20:26 -0800)]
Merge branch 'hsr'

Sebastian Andrzej Siewior says:

====================
I started playing with HSR and ran into a problem. Tested latest
upstream -rc and noticed more problems. Now it appears to work.
For testing I have a small three node setup with iperf and ping. While
iperf doesn't complain, ping reports missing packets and duplicates.
====================

Link: https://lore.kernel.org/r/20221129164815.128922-1-bigeasy@linutronix.de/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
19 months ago selftests: Add a basic HSR test.
Sebastian Andrzej Siewior [Tue, 29 Nov 2022 16:48:15 +0000 (17:48 +0100)]
selftests: Add a basic HSR test.

This test adds a basic HSRv0 network with 3 nodes. In its current shape
it sends and forwards packets, announcements and so merges nodes based
on MAC A/B information.
It is able to detect duplicate packets and packet loss should any occur.

Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
19 months ago hsr: Use a single struct for self_node.
Sebastian Andrzej Siewior [Tue, 29 Nov 2022 16:48:14 +0000 (17:48 +0100)]
hsr: Use a single struct for self_node.

self_node_db is a list_head with one entry of struct hsr_node. The
purpose is to hold the two MAC addresses of the node itself.
It is convenient to recycle the structure. However having a list_head
and fetching always the first entry is not really optimal.

Create a new data structure containing the two MAC addresses named
hsr_self_node. Access that structure like an RCU protected pointer so
it can be replaced on the fly without blocking the reader.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
19 months ago hsr: Synchronize sequence number updates.
Sebastian Andrzej Siewior [Tue, 29 Nov 2022 16:48:13 +0000 (17:48 +0100)]
hsr: Synchronize sequence number updates.

hsr_register_frame_out() compares new sequence_nr vs the old one
recorded in hsr_node::seq_out and if the new sequence_nr is higher then
it will be written to hsr_node::seq_out as the new value.

This operation isn't locked so it is possible that two frames with the
same sequence number arrive (via the two slave devices) and are fed to
hsr_register_frame_out() at the same time. Both will pass the check and
update the sequence counter later to the same value. As a result the
content of the same packet is fed into the stack twice.

This was noticed by running ping and observing DUP being reported from
time to time.

Instead of using the hsr_priv::seqnr_lock for the whole receive path (as
it is for sending in the master node) add an additional lock that is only
used for sequence number checks and updates.

Add a per-node lock that is used during sequence number reads and
updates.
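
The check-and-update under a per-node lock can be sketched as follows. This
is a Python illustration rather than the kernel code: the class Node, the
lock name seq_out_lock and the helper deliver_from_both_slaves() are
assumptions made for the example, and the plain ">" comparison stands in for
the kernel's sequence number comparison.

```python
import threading


class Node:
    """Stands in for struct hsr_node; only the fields used here."""

    def __init__(self):
        self.seq_out = 0
        self.seq_out_lock = threading.Lock()  # hypothetical per-node lock


def register_frame_out(node, sequence_nr):
    """Return True if the frame is new, False if it is a duplicate.

    The comparison and the update happen under the same lock, so two
    frames with the same sequence number cannot both pass the check.
    """
    with node.seq_out_lock:
        if sequence_nr > node.seq_out:
            node.seq_out = sequence_nr
            return True
        return False


def deliver_from_both_slaves(node, sequence_nr):
    """Feed the same sequence number from two threads at the same time."""
    results = []
    barrier = threading.Barrier(2)

    def rx():
        barrier.wait()  # release both "slave devices" together
        results.append(register_frame_out(node, sequence_nr))

    threads = [threading.Thread(target=rx) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Whichever thread wins the lock accepts the frame; the other one sees the
already updated seq_out and reports a duplicate.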

Fixes: f421436a591d3 ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
19 months ago hsr: Synchronize sending frames to have always incremented outgoing seq nr.
Sebastian Andrzej Siewior [Tue, 29 Nov 2022 16:48:12 +0000 (17:48 +0100)]
hsr: Synchronize sending frames to have always incremented outgoing seq nr.

Sending frames via the hsr (master) device requires a sequence number
which is tracked in hsr_priv::sequence_nr and protected by
hsr_priv::seqnr_lock. Each time a new frame is sent, it will obtain a
new id and then send it via the slave devices.
Each time a packet is sent (via hsr_forward_do()) the sequence number is
checked via hsr_register_frame_out() to ensure that a frame is not
handled twice. This makes sense for the receiving side to ensure that the
frame is not injected into the stack twice after it has been received
from both slave ports.

There is no locking to cover the sending path which means the following
scenario is possible:

  CPU0                           CPU1
  hsr_dev_xmit(skb1)             hsr_dev_xmit(skb2)
   fill_frame_info()             fill_frame_info()
    hsr_fill_frame_info()         hsr_fill_frame_info()
     handle_std_frame()            handle_std_frame()
      skb1's sequence_nr = 1
                                    skb2's sequence_nr = 2
   hsr_forward_do()              hsr_forward_do()

                                   hsr_register_frame_out(, 2)  // okay, send

    hsr_register_frame_out(, 1) // stop, lower seq duplicate

Both skbs (or their struct hsr_frame_info) received a unique id.
However since skb2 was sent before skb1, the higher sequence number was
recorded in hsr_register_frame_out() and the late arriving skb1 was
dropped and never sent.

This scenario has been observed in a three node HSR setup, with node1 +
node2 having ping and iperf running in parallel. From time to time ping
reported a missing packet. Based on tracing that missing ping packet did
not leave the system.

It might be possible (didn't check) to drop the sequence number check on
the sending side. But if the higher sequence number leaves on wire
before the lower does and the destination receives them in that order,
then it will drop the packet with the lower sequence number and never
inject it into the stack.
Therefore it seems the only way is to lock the whole path from obtaining
the sequence number and sending via dev_queue_xmit() and assuming the
packets leave on wire in the same order (and don't get reordered by the
NIC).

Cover the whole path for the master interface from obtaining the ID
until after it has been forwarded via hsr_forward_skb() to ensure the
skbs are sent to the NIC in the order of the assigned sequence numbers.
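
A small Python model may help to see both the failure and the fix. It is a
sketch under assumptions, not the hsr code: replay_unlocked_race() replays
the interleaving from the diagram above step by step, and the class
HsrMaster with its xmit() method is a hypothetical stand-in for a master
device that holds seqnr_lock from obtaining the ID until the frame is
forwarded.

```python
import threading


def replay_unlocked_race():
    """Replay the bad interleaving: skb2 (seq 2) is registered before skb1."""
    seq_out = 0
    sent = []
    for seq, name in [(2, "skb2"), (1, "skb1")]:
        if seq > seq_out:      # same check as on the receiving side
            seq_out = seq
            sent.append(name)
        # else: treated as a duplicate and never sent
    return sent


class HsrMaster:
    """Toy master device; the lock covers ID assignment and sending."""

    def __init__(self):
        self.seqnr_lock = threading.Lock()
        self.sequence_nr = 0
        self.seq_out = 0
        self.wire = []         # (sequence number, payload) in sending order

    def xmit(self, payload):
        with self.seqnr_lock:
            self.sequence_nr += 1
            seq = self.sequence_nr
            if seq <= self.seq_out:
                return False   # would be dropped as a duplicate
            self.seq_out = seq
            self.wire.append((seq, payload))
            return True
```

In the replay only skb2 leaves the system. With the lock held across the
whole path, frames reach the wire list in the order of their sequence
numbers, no matter how many threads call xmit().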

Fixes: f421436a591d3 ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
19 months ago hsr: Disable netpoll.
Sebastian Andrzej Siewior [Tue, 29 Nov 2022 16:48:11 +0000 (17:48 +0100)]
hsr: Disable netpoll.

The hsr device is a software device. Its
net_device_ops::ndo_start_xmit() routine will process the packet and
then pass the resulting skb to dev_queue_xmit().
During processing, hsr acquires a lock with spin_lock_bh()
(hsr_add_node()) which needs to be promoted to the _irq() suffix in
order to avoid a potential deadlock.
Then there are the warnings in dev_queue_xmit() (due to
local_bh_disable() with disabled interrupts) left.

Instead of trying to address those (there is qdisc and…) for netpoll's sake,
just disable netpoll on hsr.

Disable netpoll on hsr and replace the _irqsave() locking with _bh().

Fixes: f421436a591d3 ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
19 months ago hsr: Avoid double remove of a node.
Sebastian Andrzej Siewior [Tue, 29 Nov 2022 16:48:10 +0000 (17:48 +0100)]
hsr: Avoid double remove of a node.

Due to the hashed-MAC optimisation one problem becomes visible:
hsr_handle_sup_frame() walks over the list of available nodes and merges
two node entries into one if based on the information in the supervision
both MAC addresses belong to one node. The list-walk happens on a RCU
protected list and delete operation happens under a lock.

If the supervision arrives on both slave interfaces at the same time
then this delete operation can occur simultaneously on two CPUs. The
result is that the first CPU deletes the node from the list and the second
CPU BUGs while attempting to dereference a poisoned list-entry. This happens
more likely with the optimisation because a new node for the mac_B entry
is created once a packet has been received and removed (merged) once the
supervision frame has been received.

Avoid removing/cleaning up a hsr_node twice by adding a `removed' field
which is set to true after the removal and checked before the removal.
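
The idea of the `removed' flag can be sketched in Python. This is an
illustration only: the classes HsrNode and NodeDb and the cleanups counter
are hypothetical, and a plain list stands in for the RCU protected list.

```python
import threading


class HsrNode:
    def __init__(self, mac):
        self.mac = mac
        self.removed = False   # set once the node has left the list


class NodeDb:
    """Toy node list; deletions happen under a lock."""

    def __init__(self):
        self.lock = threading.Lock()
        self.nodes = []
        self.cleanups = 0      # counts how often a node was really removed

    def add(self, node):
        with self.lock:
            self.nodes.append(node)

    def remove(self, node):
        """Remove a node once; a second caller finds the flag set and backs off."""
        with self.lock:
            if node.removed:
                return False
            node.removed = True
            self.nodes.remove(node)
            self.cleanups += 1
            return True
```

Two callers that both found the node during their list walk may both call
remove(); only the first one touches the list.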

Fixes: f266a683a4804 ("net/hsr: Better frame dispatch")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
19 months ago hsr: Add a rcu-read lock to hsr_forward_skb().
Sebastian Andrzej Siewior [Tue, 29 Nov 2022 16:48:09 +0000 (17:48 +0100)]
hsr: Add a rcu-read lock to hsr_forward_skb().

hsr_forward_skb() forwards a skb and keeps information in an on-stack
hsr_frame_info. hsr_get_node() assigns hsr_frame_info::node_src which is
from a RCU list. This pointer is used later in hsr_forward_do().
I don't see a reason why this pointer can't vanish midway since there is
no guarantee that hsr_forward_skb() is invoked from an RCU read section.

Use rcu_read_lock() to protect hsr_frame_info::node_src from its
assignment until it is no longer used.

Fixes: f266a683a4804 ("net/hsr: Better frame dispatch")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
19 months ago Revert "net: hsr: use hlist_head instead of list_head for mac addresses"
Sebastian Andrzej Siewior [Tue, 29 Nov 2022 16:48:08 +0000 (17:48 +0100)]
Revert "net: hsr: use hlist_head instead of list_head for mac addresses"

The hlist optimisation (which not only uses hlist_head instead of
list_head but also splits hsr_priv::node_db into an array of 256 slots)
does not consider the "node merge":
Upon starting the hsr network (with three nodes) a packet that is
sent from node1 to node3 will also be sent from node1 to node2 and then
forwarded to node3.
As a result node3 will receive 2 packets because it is not able
to filter out the duplicate. Each packet received will create a new
struct hsr_node with only macaddress_A set, to the MAC address it was
received from (the two MAC addresses of node1).
At some point (early in the process) two supervision frames will be
received from node1. They will be processed by hsr_handle_sup_frame()
and one frame will leave early ("Node has already been merged") and does
nothing. The other frame will be merged as portB and have its MAC
address written to macaddress_B and the hsr_node (that was created for
it as macaddress_A) will be removed.
From now on HSR is able to identify a duplicate because both packets
sent from one node will result in the same struct hsr_node because
hsr_get_node() will find the MAC address either on macaddress_A or
macaddress_B.

Things get tricky with the optimisation: If sender's MAC address is
saved as macaddress_A then the lookup will work as usual. If the MAC
address has been merged into macaddress_B of another hsr_node then the
lookup won't work because it is likely that the data structure is in
another bucket. This results in creating a new struct hsr_node and not
recognising a possible duplicate.
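
The lookup problem can be reproduced with a small Python model. It is a
sketch under assumptions: the hash function bucket_of() (last byte of the
MAC address), the class HashedNodeDb and the example MAC addresses are
invented for the illustration and are not taken from the reverted code.

```python
NUM_BUCKETS = 256


def bucket_of(mac):
    """Hypothetical hash: the last byte of the MAC address."""
    return mac[-1] % NUM_BUCKETS


class HsrNode:
    def __init__(self, mac_a):
        self.macaddress_a = mac_a
        self.macaddress_b = None   # filled in by the node merge


class HashedNodeDb:
    """Nodes are filed only under the bucket of macaddress_a."""

    def __init__(self):
        self.buckets = [[] for _ in range(NUM_BUCKETS)]

    def add(self, mac_a):
        node = HsrNode(mac_a)
        self.buckets[bucket_of(mac_a)].append(node)
        return node

    def lookup(self, mac):
        for node in self.buckets[bucket_of(mac)]:
            if mac in (node.macaddress_a, node.macaddress_b):
                return node
        return None


def flat_lookup(nodes, mac):
    """Lookup over one flat list, as with a single list_head."""
    for node in nodes:
        if mac in (node.macaddress_a, node.macaddress_b):
            return node
    return None
```

After a merge writes the second address into macaddress_b, the hashed lookup
for that address searches a different bucket and finds nothing, while the
flat list still finds the node.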

A way around it would be to add another hsr_node::mac_list_B and attach
it to the other bucket to ensure that this hsr_node will be looked up
either via macaddress_A _or_ macaddress_B.

I however prefer to revert it because it sounds like an academic problem
rather than a real-life workload, plus it adds complexity. I'm not an HSR
expert and don't know the usual size of a network, but I would guess 40 to
60 nodes. With 10,000 nodes and assuming 60us for pass-through (from node
to node), it would take almost 600ms for a packet to almost wrap
around, which sounds like a lot.

Revert the hash MAC addresses optimisation.

Fixes: 4acc45db71158 ("net: hsr: use hlist_head instead of list_head for mac addresses")
Cc: Juhee Kang <claudiajkang@gmail.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
19 months ago sctp: delete free member from struct sctp_sched_ops
Xin Long [Wed, 30 Nov 2022 23:04:31 +0000 (18:04 -0500)]
sctp: delete free member from struct sctp_sched_ops

After commit 9ed7bfc79542 ("sctp: fix memory leak in
sctp_stream_outq_migrate()"), sctp_sched_set_sched() is the only
place calling sched->free(), and it can actually be replaced by
sched->free_sid() on each stream, and yet there's already a loop
to traverse all streams in sctp_sched_set_sched().

This patch adds a function sctp_sched_free_sched() where it calls
sched->free_sid() for each stream to replace sched->free() calls
in sctp_sched_set_sched() and then deletes the unused free member
from struct sctp_sched_ops.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Link: https://lore.kernel.org/r/e10aac150aca2686cb0bd0570299ec716da5a5c0.1669849471.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
19 months ago Merge branch 'mptcp-pm-listener-events-selftests-cleanup'
Jakub Kicinski [Fri, 2 Dec 2022 04:06:10 +0000 (20:06 -0800)]
Merge branch 'mptcp-pm-listener-events-selftests-cleanup'

Matthieu Baerts says:

====================
mptcp: PM listener events + selftests cleanup

Thanks to the patch 6/11, the MPTCP path manager now sends Netlink events
when MPTCP listening sockets are created and closed. The reason why it is
needed is explained in the linked ticket [1]:

  MPTCP for Linux, when not using the in-kernel PM, depends on the
  userspace PM to create extra listening sockets before announcing
  addresses and ports. Let's call these "PM listeners".

  With the existing MPTCP netlink events, a userspace PM can create
  PM listeners at startup time, or in response to an incoming connection.
  Creating sockets in response to connections is not optimal: ADD_ADDRs
  can't be sent until the sockets are created and listen()ed, and if all
  connections are closed then it may not be clear to the userspace
  PM daemon that PM listener sockets should be cleaned up.

  Hence this feature request: to add MPTCP netlink events for listening
  socket close & create, so PM listening sockets can be managed based
  on application activity.

  [1] https://github.com/multipath-tcp/mptcp_net-next/issues/313

Selftests for these new Netlink events have been added in patches 9,11/11.

The remaining patches introduce different cleanups and small improvements
in MPTCP selftests to ease the maintenance and the addition of new tests.
====================

Link: https://lore.kernel.org/r/20221130140637.409926-1-matthieu.baerts@tessares.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
19 months ago selftests: mptcp: listener test for in-kernel PM
Geliang Tang [Wed, 30 Nov 2022 14:06:33 +0000 (15:06 +0100)]
selftests: mptcp: listener test for in-kernel PM

This patch adds test coverage for listening sockets created by the
in-kernel path manager in mptcp_join.sh.

It adds the listener event checking in the existing "remove single
address with port" test. The output looks like this:

 003 remove single address with port syn[ ok ] - synack[ ok ] - ack[ ok ]
                                     add[ ok ] - echo  [ ok ] - pt [ ok ]
                                     syn[ ok ] - synack[ ok ] - ack[ ok ]
                                     syn[ ok ] - ack   [ ok ]
                                     rm [ ok ] - rmsf  [ ok ]   invert
                                     CREATE_LISTENER 10.0.2.1:10100[ ok ]
                                     CLOSE_LISTENER 10.0.2.1:10100 [ ok ]

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
19 months ago selftests: mptcp: make evts global in mptcp_join
Geliang Tang [Wed, 30 Nov 2022 14:06:32 +0000 (15:06 +0100)]
selftests: mptcp: make evts global in mptcp_join

This patch moves evts_ns1 and evts_ns2 out of do_transfer() as two global
variables in mptcp_join.sh. Init them in init() and remove them in
cleanup().

Add a new helper reset_with_events() to save the outputs of 'pm_nl_ctl
events' command in them. And a new helper kill_events_pids() to kill
pids of 'pm_nl_ctl events' command. Use these helpers in userspace pm
tests.

Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
19 months ago selftests: mptcp: listener test for userspace PM
Geliang Tang [Wed, 30 Nov 2022 14:06:31 +0000 (15:06 +0100)]
selftests: mptcp: listener test for userspace PM

This patch adds test coverage for listening sockets created by userspace
processes.

It adds a new test named test_listener() and a new verifying helper
verify_listener_events(). The new output looks like this:

 CREATE_SUBFLOW 10.0.2.2 (ns2) => 10.0.2.1 (ns1)              [OK]
 DESTROY_SUBFLOW 10.0.2.2 (ns2) => 10.0.2.1 (ns1)             [OK]
 MP_PRIO TX                                                   [OK]
 MP_PRIO RX                                                   [OK]
 CREATE_LISTENER 10.0.2.2:37106       [OK]
 CLOSE_LISTENER 10.0.2.2:37106       [OK]

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
19 months ago selftests: mptcp: make evts global in userspace_pm
Geliang Tang [Wed, 30 Nov 2022 14:06:30 +0000 (15:06 +0100)]
selftests: mptcp: make evts global in userspace_pm

This patch makes server_evts and client_evts global in userspace_pm.sh,
then these two variables could be used in test_announce(), test_remove()
and test_subflows(). The local variable 'evts' in these three functions
then could be dropped.

Also make the local variable 'file' a global one.

Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
19 months ago selftests: mptcp: enhance userspace pm tests
Geliang Tang [Wed, 30 Nov 2022 14:06:29 +0000 (15:06 +0100)]
selftests: mptcp: enhance userspace pm tests

Some userspace pm tests failed since pm listener events have been added.
Now MPTCP_EVENT_LISTENER_CREATED event becomes the first item in the
events list like this:

 type:15,family:2,sport:10006,saddr4:0.0.0.0
 type:1,token:3701282876,server_side:1,family:2,saddr4:10.0.1.1,...

And there is no token value in this MPTCP_EVENT_LISTENER_CREATED event.

This patch fixes this by specifying the type 1 item to search for token
values.
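
The selection can be illustrated with a few lines of Python. This is not the
selftest code (the selftests are shell scripts); the helpers parse_event()
and find_token() are hypothetical, and the sample lines reuse only the
fields shown above.

```python
MPTCP_EVENT_CREATED = 1             # the "type 1 item" mentioned above
MPTCP_EVENT_LISTENER_CREATED = 15   # type of the new listener event


def parse_event(line):
    """Split 'key:value,key:value' into a dict; fields without ':' are skipped."""
    fields = line.strip().split(",")
    return dict(f.split(":", 1) for f in fields if ":" in f)


def find_token(lines, event_type=MPTCP_EVENT_CREATED):
    """Return the token of the first event of the given type, or None."""
    for line in lines:
        event = parse_event(line)
        if event.get("type") == str(event_type) and "token" in event:
            return int(event["token"])
    return None


events = [
    "type:15,family:2,sport:10006,saddr4:0.0.0.0",
    "type:1,token:3701282876,server_side:1,family:2,saddr4:10.0.1.1",
]
```

Taking the first line of the list would hit the listener event, which has no
token; filtering on the event type skips it.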

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
19 months ago mptcp: add pm listener events
Geliang Tang [Wed, 30 Nov 2022 14:06:28 +0000 (15:06 +0100)]
mptcp: add pm listener events

This patch adds two new MPTCP netlink event types for PM listening
socket create and close, named MPTCP_EVENT_LISTENER_CREATED and
MPTCP_EVENT_LISTENER_CLOSED.

Add a new function mptcp_event_pm_listener() to push the new events
with family, port and addr to userspace.

Invoke mptcp_event_pm_listener() with MPTCP_EVENT_LISTENER_CREATED in
mptcp_listen() and mptcp_pm_nl_create_listen_socket(), invoke it with
MPTCP_EVENT_LISTENER_CLOSED in __mptcp_close_ssk().

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
19 months ago selftests: mptcp: declare var as local
Matthieu Baerts [Wed, 30 Nov 2022 14:06:27 +0000 (15:06 +0100)]
selftests: mptcp: declare var as local

Just to avoid the classical Bash pitfall where variables are accidentally
overridden by other functions because the proper scope has not been
defined.

That's also what is done in other MPTCP selftests scripts where all non
local variables are defined at the beginning of the script and the
others are defined with the "local" keyword.

Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
19 months ago selftests: mptcp: clearly declare global ns vars
Matthieu Baerts [Wed, 30 Nov 2022 14:06:26 +0000 (15:06 +0100)]
selftests: mptcp: clearly declare global ns vars

It is clearer to declare these global variables at the beginning of the
file as it is done in other MPTCP selftests rather than in functions in
the middle of the script.

So for uniformity, we can do the same here in mptcp_sockopt.sh.

Suggested-by: Geliang Tang <geliang.tang@suse.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
19 months ago selftests: mptcp: uniform 'rndh' variable
Matthieu Baerts [Wed, 30 Nov 2022 14:06:25 +0000 (15:06 +0100)]
selftests: mptcp: uniform 'rndh' variable

The definition of 'rndh' was probably copied from one script to another
but sometimes, 'sec' was not defined, not used and/or not spelled
properly.

Here all the 'rndh' are now defined the same way.

Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
19 months ago selftests: mptcp: removed defined but unused vars
Matthieu Baerts [Wed, 30 Nov 2022 14:06:24 +0000 (15:06 +0100)]
selftests: mptcp: removed defined but unused vars

Some variables were set but never used.

This was not causing any issues except adding some confusion and having
shellcheck complaining about them.

Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
19 months ago selftests: mptcp: run mptcp_inq from a clean netns
Matthieu Baerts [Wed, 30 Nov 2022 14:06:23 +0000 (15:06 +0100)]
selftests: mptcp: run mptcp_inq from a clean netns

A new "sandbox" net namespace is available where no other netfilter
rules have been added.

Use this new netns instead of re-using "ns1" and clean it.

Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
19 months ago bnxt: report FEC block stats via standard interface
Jakub Kicinski [Wed, 30 Nov 2022 01:31:08 +0000 (17:31 -0800)]
bnxt: report FEC block stats via standard interface

I must have missed that these stats are only exposed
via the unstructured ethtool -S when they got merged.
Plumb in the structured form.

Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/20221130013108.90062-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
19 months ago Merge branch 'remove-label-cpu-from-dsa-dt-binding'
Jakub Kicinski [Thu, 1 Dec 2022 23:57:04 +0000 (15:57 -0800)]
Merge branch 'remove-label-cpu-from-dsa-dt-binding'

Arınç ÜNAL says:

====================
remove label = "cpu" from DSA dt-binding

With this patch series, we're completely getting rid of 'label = "cpu";'
which is not used by the DSA dt-binding at all.

Information for taking the patches for maintainers:
Patch 1: netdev maintainers (based off netdev/net-next.git main)
Patch 2-3: SoC maintainers (based off soc/soc.git soc/dt)
Patch 4: MIPS maintainers (based off mips/linux.git mips-next)
Patch 5: PowerPC maintainers (based off powerpc/linux.git next-test)

I've been meaning to submit this for a few months. Find the relevant
conversation here:
https://lore.kernel.org/netdev/20220913155408.GA3802998-robh@kernel.org/

Here's how I did it, for the interested (or suggestions):

Find the platforms which have got 'label = "cpu";' defined.
grep -rnw . -e 'label = "cpu";'

Remove the line where 'label = "cpu";' is included.
sed -i /'label = "cpu";'/,+d arch/arm/boot/dts/*
sed -i /'label = "cpu";'/,+d arch/arm64/boot/dts/freescale/*
sed -i /'label = "cpu";'/,+d arch/arm64/boot/dts/marvell/*
sed -i /'label = "cpu";'/,+d arch/arm64/boot/dts/mediatek/*
sed -i /'label = "cpu";'/,+d arch/arm64/boot/dts/rockchip/*
sed -i /'label = "cpu";'/,+d arch/mips/boot/dts/qca/*
sed -i /'label = "cpu";'/,+d arch/mips/boot/dts/ralink/*
sed -i /'label = "cpu";'/,+d arch/powerpc/boot/dts/turris1x.dts
sed -i /'label = "cpu";'/,+d Documentation/devicetree/bindings/net/qca,ar71xx.yaml

Restore the symlink files which typechange after running sed.
====================

Link: https://lore.kernel.org/r/20221130141040.32447-1-arinc.unal@arinc9.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
19 months ago dt-bindings: net: qca,ar71xx: remove label = "cpu" from examples
Arınç ÜNAL [Wed, 30 Nov 2022 14:10:36 +0000 (17:10 +0300)]
dt-bindings: net: qca,ar71xx: remove label = "cpu" from examples

This is not used by the DSA dt-binding, so remove it from the examples.

Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de>
Acked-by: Rob Herring <robh@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
19 months ago Merge branch 'net-tcp-dynamically-disable-tcp-md5-static-key'
Jakub Kicinski [Thu, 1 Dec 2022 23:53:08 +0000 (15:53 -0800)]
Merge branch 'net-tcp-dynamically-disable-tcp-md5-static-key'

Dmitry Safonov says:

====================
net/tcp: Dynamically disable TCP-MD5 static key

The static key introduced by commit 6015c71e656b ("tcp: md5: add
tcp_md5_needed jump label") is a fast-path optimization aimed at
avoiding a cache line miss.
Once an MD5 key is introduced in the system the static key is enabled
and never disabled. Address this by disabling the static key when
the last tcp_md5sig_info in system is destroyed.
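
The intended life-time can be modelled as a reference count. The following
Python sketch is only an analogy: StaticKey and Md5SigInfo are hypothetical
classes, and the model ignores everything a real kernel static key does
(code patching, sleeping while enabling).

```python
class StaticKey:
    """Toy reference-counted switch standing in for a kernel static key."""

    def __init__(self):
        self.count = 0

    @property
    def enabled(self):
        return self.count > 0

    def inc(self):
        self.count += 1

    def dec(self):
        if self.count == 0:
            raise RuntimeError("static key underflow")
        self.count -= 1


class Md5SigInfo:
    """Stands in for tcp_md5sig_info: holds one reference while it exists."""

    def __init__(self, key):
        self.key = key
        self.key.inc()
        self.destroyed = False

    def destroy(self):
        if not self.destroyed:
            self.destroyed = True
            self.key.dec()
```

The key turns on with the first Md5SigInfo and turns off again once the last
one has been destroyed, instead of staying enabled forever.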

Previously it was submitted as a part of the TCP-AO patch set [1].
Now, in an attempt to split the 36-patch submission, I send this independently.

Version 5:
https://lore.kernel.org/all/20221122185534.308643-1-dima@arista.com/T/#u
Version 4:
https://lore.kernel.org/all/20221115211905.1685426-1-dima@arista.com/T/#u
Version 3:
https://lore.kernel.org/all/20221111212320.1386566-1-dima@arista.com/T/#u
Version 2:
https://lore.kernel.org/all/20221103212524.865762-1-dima@arista.com/T/#u
Version 1:
https://lore.kernel.org/all/20221102211350.625011-1-dima@arista.com/T/#u

[1]: https://lore.kernel.org/all/20221027204347.529913-1-dima@arista.com/T/#u
====================

Link: https://lore.kernel.org/r/20221123173859.473629-1-dima@arista.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
19 months agonet/tcp: Separate initialization of twsk
Dmitry Safonov [Wed, 23 Nov 2022 17:38:59 +0000 (17:38 +0000)]
net/tcp: Separate initialization of twsk

Convert BUG_ON() to WARN_ON_ONCE(), and also warn on the unlikely
static key int overflow error path.

Signed-off-by: Dmitry Safonov <dima@arista.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
19 months agonet/tcp: Do cleanup on tcp_md5_key_copy() failure
Dmitry Safonov [Wed, 23 Nov 2022 17:38:58 +0000 (17:38 +0000)]
net/tcp: Do cleanup on tcp_md5_key_copy() failure

If the kernel is short on (atomic) memory and the allocation fails,
don't proceed to create the request socket. Otherwise the socket would
be unsigned, and userspace likely doesn't expect the TCP connection to
no longer be MD5-signed.

Signed-off-by: Dmitry Safonov <dima@arista.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
19 months agonet/tcp: Disable TCP-MD5 static key on tcp_md5sig_info destruction
Dmitry Safonov [Wed, 23 Nov 2022 17:38:57 +0000 (17:38 +0000)]
net/tcp: Disable TCP-MD5 static key on tcp_md5sig_info destruction

To do that, separate two scenarios:
- where it's the first MD5 key on the system, which means that enabling
  of the static key may need to sleep;
- copying of an existing key from a listening socket to the request
  socket upon receiving a signed TCP segment, where static key was
  already enabled (when the key was added to the listening socket).

Now the life-time of the static branch for TCP-MD5 is until:
- last tcp_md5sig_info is destroyed
- last socket in time-wait state with MD5 key is closed.

Which means that after all sockets with TCP-MD5 keys are gone, the
system gets back the performance of disabled md5-key static branch.

While at it, provide a static_key_fast_inc() helper that increments the
ref counter atomically (without grabbing cpus_read_lock() on
CONFIG_JUMP_LABEL=y). This is needed to add a new user for
a static_key when the caller controls the lifetime of another user.
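
A minimal userspace sketch of the "fast increment" idea described above,
using C11 atomics. It is not the kernel implementation: struct md5_key_ref,
md5_key_ref_fast_get() and md5_key_ref_put() are hypothetical names that
only model a counter which may gain a user without sleeping, provided
another user already holds a reference.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the static key's user counter. */
struct md5_key_ref {
	atomic_int users;
};

/*
 * Take an extra reference only if the counter is already positive,
 * i.e. some other user keeps the key enabled. Never sleeps.
 * Returns false if there is no existing user or the counter is full.
 */
static bool md5_key_ref_fast_get(struct md5_key_ref *ref)
{
	int v = atomic_load(&ref->users);

	do {
		if (v <= 0 || v == INT_MAX)
			return false;
		/* On failure the compare-exchange reloads v and we retry. */
	} while (!atomic_compare_exchange_weak(&ref->users, &v, v + 1));

	return true;
}

/* Drop one reference; returns true when the last user is gone. */
static bool md5_key_ref_put(struct md5_key_ref *ref)
{
	return atomic_fetch_sub(&ref->users, 1) == 1;
}
```

In this model, copying a key from a listening socket to a request socket
would use md5_key_ref_fast_get(), because the listener's reference keeps
the counter positive for the duration of the call.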

Signed-off-by: Dmitry Safonov <dima@arista.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
19 months agonet/tcp: Separate tcp_md5sig_info allocation into tcp_md5sig_info_add()
Dmitry Safonov [Wed, 23 Nov 2022 17:38:56 +0000 (17:38 +0000)]
net/tcp: Separate tcp_md5sig_info allocation into tcp_md5sig_info_add()

Add a helper to allocate tcp_md5sig_info. It will later be the place for
work that has to be done once per socket, when the info is allocated.

Signed-off-by: Dmitry Safonov <dima@arista.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
19 months agojump_label: Prevent key->enabled int overflow
Dmitry Safonov [Wed, 23 Nov 2022 17:38:55 +0000 (17:38 +0000)]
jump_label: Prevent key->enabled int overflow

1. With CONFIG_JUMP_LABEL=n static_key_slow_inc() doesn't have any
   protection against key->enabled refcounter overflow.
2. With CONFIG_JUMP_LABEL=y static_key_slow_inc_cpuslocked()
   still may turn the refcounter negative as (v + 1) may overflow.

key->enabled is indeed a ref-counter as it's documented in multiple
places: top comment in jump_label.h, Documentation/staging/static-keys.rst,
etc.

As -1 is reserved for static key that's in process of being enabled,
functions would break with negative key->enabled refcount:
- for CONFIG_JUMP_LABEL=n negative return of static_key_count()
  breaks static_key_false(), static_key_true()
- the ref counter may become 0 from negative side by too many
  static_key_slow_inc() calls and lead to use-after-free issues.

Because of these flaws, some users have to introduce an additional
mutex and prevent the reference counter from overflowing themselves;
see bpf_enable_runtime_stats(), which checks the counter against
INT_MAX / 2.

Prevent the reference counter overflow by checking if (v + 1) > 0.
Change functions API to return whether the increment was successful.
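
The checked increment can be sketched in userspace C as follows. This is
an illustration under assumptions, not the patch itself: struct demo_key
and demo_key_inc() are hypothetical, C11 atomics stand in for the kernel's
atomic_t helpers, and the sketch simply refuses negative values instead
of handling the reserved -1 state the way the kernel does.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct static_key; only the counter is modelled. */
struct demo_key {
	atomic_int enabled;
};

/*
 * Increment the counter unless that would overflow.
 * Returns true if the increment happened, false if the counter is
 * negative or already at INT_MAX.
 */
static bool demo_key_inc(struct demo_key *key)
{
	int v = atomic_load(&key->enabled);

	do {
		/*
		 * "v == INT_MAX" expresses the "(v + 1) > 0" condition
		 * without relying on signed overflow, which is undefined
		 * in standard C.
		 */
		if (v < 0 || v == INT_MAX)
			return false;
	} while (!atomic_compare_exchange_weak(&key->enabled, &v, v + 1));

	return true;
}
```

Returning a bool lets each caller decide how to handle a saturated
counter, rather than letting the count silently wrap to a negative value.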

Signed-off-by: Dmitry Safonov <dima@arista.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
19 months agoMerge branch 'locking/core' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Jakub Kicinski [Thu, 1 Dec 2022 23:36:27 +0000 (15:36 -0800)]
Merge branch 'locking/core' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull in locking/core from tip (just a single patch) to avoid a conflict
with a jump_label change needed by a TCP cleanup.

Link: https://lore.kernel.org/all/Y4B17nBArWS1Iywo@hirez.programming.kicks-ass.net/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
19 months agoMerge tag 'iwlwifi-next-for-kalle-2022-11-28' of http://git.kernel.org/pub/scm/linux...
Kalle Valo [Thu, 1 Dec 2022 18:03:07 +0000 (20:03 +0200)]
Merge tag 'iwlwifi-next-for-kalle-2022-11-28' of git./linux/kernel/git/iwlwifi/iwlwifi-next

This is the second pull request intended for v6.2

It contains two patch-sets sent before with the following content:
* iwlwifi EHT adjustments
* double-free fix in tx path
* iwlmei PLDR flow fixes
* iwlmei smatch fixes
* a logging data improvement

19 months agoMerge tag 'mt76-for-kvalo-2022-12-01' of https://github.com/nbd168/wireless
Kalle Valo [Thu, 1 Dec 2022 17:58:20 +0000 (19:58 +0200)]
Merge tag 'mt76-for-kvalo-2022-12-01' of https://github.com/nbd168/wireless

mt76 patches for 6.2

- fixes
- WED support for mt7986 + mt7915 for flow offloading
- new driver for the mt7996 wifi-7 chipset

19 months agowifi: mt76: mt7921e: add pci .shutdown() support
Leon Yen [Thu, 1 Dec 2022 10:38:42 +0000 (18:38 +0800)]
wifi: mt76: mt7921e: add pci .shutdown() support

Some combinations of hosts cannot detect mt7921e after reboot. The
interoperability issue is caused by the status mismatch between host
and chip fw. In such cases, the driver should stop chip activities
and reset chip to default state before reboot.

Suggested-by: angelogioacchino.delregno@collabora.com
Co-developed-by: Deren Wu <deren.wu@mediatek.com>
Signed-off-by: Deren Wu <deren.wu@mediatek.com>
Signed-off-by: Leon Yen <Leon.Yen@mediatek.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
19 months agowifi: mt76: mt7915: mmio: fix naming convention
Lorenzo Bianconi [Thu, 1 Dec 2022 08:51:55 +0000 (09:51 +0100)]
wifi: mt76: mt7915: mmio: fix naming convention

Rename mt7915_wed_release_rx_buf to mt7915_mmio_wed_release_rx_buf and
mt7915_wed_init_rx_buf to mt7915_mmio_wed_init_rx_buf.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
19 months agowifi: mt76: mt7996: add support to configure spatial reuse parameter set
Ryder Lee [Thu, 1 Dec 2022 08:03:32 +0000 (16:03 +0800)]
wifi: mt76: mt7996: add support to configure spatial reuse parameter set

The SPR parameter set comprises the OBSS PD thresholds for SRG and
non-SRG, and the bitmaps of BSS color and partial BSSID. This adds
support for passing the fields of the SPR element to firmware.

Users can disable firmware SR algorithms by turning sr_scene_detect off.

Signed-off-by: Ryder Lee <ryder.lee@mediatek.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
19 months agowifi: mt76: mt7996: enable ack signal support
Ryder Lee [Thu, 1 Dec 2022 03:44:43 +0000 (11:44 +0800)]
wifi: mt76: mt7996: enable ack signal support

This reports signal strength of ACK packets from the peer as measured
at each interface.

Signed-off-by: Ryder Lee <ryder.lee@mediatek.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
19 months agowifi: mt76: mt7996: enable use_cts_prot support
Ryder Lee [Thu, 1 Dec 2022 03:44:42 +0000 (11:44 +0800)]
wifi: mt76: mt7996: enable use_cts_prot support

This adds selectable RTS/CTS enablement for each interface.

Signed-off-by: Ryder Lee <ryder.lee@mediatek.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
19 months agowifi: mt76: mt7915: rely on band_idx of mt76_phy
Ryder Lee [Thu, 1 Dec 2022 03:44:41 +0000 (11:44 +0800)]
wifi: mt76: mt7915: rely on band_idx of mt76_phy

Commit dc44c45c8cd0 added band_idx to mt76_phy, so switch to relying
on it.

Signed-off-by: Ryder Lee <ryder.lee@mediatek.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
19 months agowifi: mt76: mt7915: enable per bandwidth power limit support
Ryder Lee [Wed, 23 Nov 2022 19:59:11 +0000 (03:59 +0800)]
wifi: mt76: mt7915: enable per bandwidth power limit support

This power should override the per bandwidth max power that the
device emits.

Signed-off-by: Ryder Lee <ryder.lee@mediatek.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
19 months agowifi: mt76: mt7915: introduce mt7915_get_power_bound()
Ryder Lee [Wed, 23 Nov 2022 19:59:10 +0000 (03:59 +0800)]
wifi: mt76: mt7915: introduce mt7915_get_power_bound()

Add a helper for common boundary check. This is a preliminary patch
to add per bandwidth power control.

Signed-off-by: Ryder Lee <ryder.lee@mediatek.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
19 months agomt76: mt7915: Fix PCI device refcount leak in mt7915_pci_init_hif2()
Xiongfeng Wang [Fri, 25 Nov 2022 02:58:31 +0000 (10:58 +0800)]
mt76: mt7915: Fix PCI device refcount leak in mt7915_pci_init_hif2()

As the comment on pci_get_device() says, it returns a PCI device with
its refcount increased, so pci_dev_put() must be called to drop that
reference. Save the return value of pci_get_device() and call
pci_dev_put() on it to decrease the refcount.

Fixes: 9093cfff72e3 ("mt76: mt7915: add support for using a secondary PCIe link for gen1")
Fixes: 2e30db0dde61 ("mt76: mt7915: add device id for mt7916")
Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
19 months agowifi: mt76: do not send firmware FW_FEATURE_NON_DL region
Deren Wu [Thu, 24 Nov 2022 14:20:38 +0000 (22:20 +0800)]
wifi: mt76: do not send firmware FW_FEATURE_NON_DL region

Skip the invalid section to avoid potential risks.

Fixes: 23bdc5d8cadf ("wifi: mt76: mt7921: introduce Country Location Control support")
Signed-off-by: Deren Wu <deren.wu@mediatek.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
19 months agowifi: mt76: mt7921: Add missing __packed annotation of struct mt7921_clc
Deren Wu [Mon, 28 Nov 2022 07:04:21 +0000 (15:04 +0800)]
wifi: mt76: mt7921: Add missing __packed annotation of struct mt7921_clc

Add the __packed annotation to avoid potential CLC parsing errors.
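
Why a missing packed annotation can break parsing of a firmware blob can
be illustrated with a small sketch. The struct layout below is
hypothetical (it is not the layout of struct mt7921_clc), and
__attribute__((packed)) is the GCC/Clang spelling that the kernel's
__packed macro expands to.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header without packing: the compiler may pad after 'type'. */
struct demo_hdr_unpacked {
	uint8_t type;
	uint32_t len;
};

/* Same fields, packed: 'len' sits directly after 'type', as in the blob. */
struct demo_hdr_packed {
	uint8_t type;
	uint32_t len;
} __attribute__((packed));

/* Read 'len' from a raw buffer that uses the packed on-wire layout. */
static uint32_t demo_read_len(const uint8_t *buf)
{
	struct demo_hdr_packed hdr;

	memcpy(&hdr, buf, sizeof(hdr));
	return hdr.len;
}
```

With the packed variant, offsetof(struct demo_hdr_packed, len) is 1 and
the struct is 5 bytes long. Overlaying the unpacked variant on the same
buffer would read 'len' from whatever offset the compiler's padding
chose, which is the kind of parsing error the annotation avoids.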

Fixes: 23bdc5d8cadf ("wifi: mt76: mt7921: introduce Country Location Control support")
Signed-off-by: Deren Wu <deren.wu@mediatek.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
19 months agowifi: mt76: fix coverity overrun-call in mt76_get_txpower()
Deren Wu [Sun, 27 Nov 2022 02:35:37 +0000 (10:35 +0800)]
wifi: mt76: fix coverity overrun-call in mt76_get_txpower()

Make sure nss is a valid index into the nss_delta array. Return zero
if the index is invalid.

Coverity message:
Event overrun-call: Overrunning callee's array of size 4 by passing
argument "n_chains" (which evaluates to 15) in call to
"mt76_tx_power_nss_delta".
int delta = mt76_tx_power_nss_delta(n_chains);
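
A minimal sketch of the bounds check described above, assuming a
four-entry lookup table indexed by nss - 1. demo_nss_delta() and the
table values are illustrative placeholders, not a copy of the driver
code.

```c
#include <stddef.h>

#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/*
 * Look up a per-chain-count delta. Any chain count outside 1..4
 * (including 0 and the value 15 from the Coverity report) yields 0
 * instead of reading past the end of the table.
 */
static int demo_nss_delta(unsigned int nss)
{
	/* Placeholder values chosen for illustration only. */
	static const unsigned char delta[4] = { 0, 6, 9, 12 };
	/* nss == 0 wraps to UINT_MAX here and fails the range check below. */
	unsigned int idx = nss - 1;

	return idx < DEMO_ARRAY_SIZE(delta) ? delta[idx] : 0;
}
```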

Fixes: 07cda406308b ("mt76: fix rounding issues on converting per-chain and combined txpower")
Signed-off-by: Deren Wu <deren.wu@mediatek.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
19 months agowifi: mt76: mt7996: add driver for MediaTek Wi-Fi 7 (802.11be) devices
Shayne Chen [Tue, 22 Nov 2022 08:45:46 +0000 (16:45 +0800)]
wifi: mt76: mt7996: add driver for MediaTek Wi-Fi 7 (802.11be) devices

The driver first supports Filogic 680 PCI device, which is a Wi-Fi 7
chipset supporting concurrent tri-band operation at 6 GHz, 5 GHz, and
2.4 GHz with 4x4 antennas on each band.

Currently, mt7996 only supports tri-band HE or older mode.
EHT mode and more variants of Filogic 680 support will be introduced
in further patches.

Reviewed-by: Ryder Lee <ryder.lee@mediatek.com>
Co-developed-by: Peter Chiu <chui-hao.chiu@mediatek.com>
Signed-off-by: Peter Chiu <chui-hao.chiu@mediatek.com>
Co-developed-by: Bo Jiao <Bo.Jiao@mediatek.com>
Signed-off-by: Bo Jiao <Bo.Jiao@mediatek.com>
Co-developed-by: Howard Hsu <howard-yh.hsu@mediatek.com>
Signed-off-by: Howard Hsu <howard-yh.hsu@mediatek.com>
Co-developed-by: MeiChia Chiu <meichia.chiu@mediatek.com>
Signed-off-by: MeiChia Chiu <meichia.chiu@mediatek.com>
Co-developed-by: StanleyYP Wang <StanleyYP.Wang@mediatek.com>
Signed-off-by: StanleyYP Wang <StanleyYP.Wang@mediatek.com>
Co-developed-by: Money Wang <Money.Wang@mediatek.com>
Signed-off-by: Money Wang <Money.Wang@mediatek.com>
Co-developed-by: Evelyn Tsai <evelyn.tsai@mediatek.com>
Signed-off-by: Evelyn Tsai <evelyn.tsai@mediatek.com>
Signed-off-by: Shayne Chen <shayne.chen@mediatek.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
19 months agowifi: mt76: mt76x0: remove dead code in mt76x0_phy_get_target_power
Lorenzo Bianconi [Tue, 22 Nov 2022 13:52:08 +0000 (14:52 +0100)]
wifi: mt76: mt76x0: remove dead code in mt76x0_phy_get_target_power

tx_rate can't be greater than 3 in mt76x0_phy_get_target_power routine
for cck rates. Get rid of dead code.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
19 months agowifi: mt76: mt7915: fix band_idx usage
Ryder Lee [Tue, 22 Nov 2022 07:53:12 +0000 (15:53 +0800)]
wifi: mt76: mt7915: fix band_idx usage

The commit 006b9d4ad5bf introduced phy->band_idx to accommodate the
band definition change for mt7986 so that the band_idx of main_phy
can be 0 or 1. Accordingly, the source of band_idx 1 has switched to
"phy != &dev->phy" or "dev->phy.band_idx = 1".

We still use ext_phy to represent band 1 in some places in the driver,
so fix that. Also, band_idx is a more fitting name than dbdc_idx, so
rename it.

Fixes: 006b9d4ad5bf ("mt76: mt7915: introduce band_idx in mt7915_phy")
Signed-off-by: Ryder Lee <ryder.lee@mediatek.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
19 months agowifi: mt76: mt7915: enable .sta_set_txpwr support
Ryder Lee [Tue, 22 Nov 2022 07:53:11 +0000 (15:53 +0800)]
wifi: mt76: mt7915: enable .sta_set_txpwr support

This adds support for adjusting the Txpower level while pushing
traffic to an associated station. The allowed range is from 0 to
the maximum power of the channel.

Signed-off-by: Ryder Lee <ryder.lee@mediatek.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
19 months agowifi: mt76: mt7915: add baseband Txpower info into debugfs
Ryder Lee [Tue, 22 Nov 2022 07:53:10 +0000 (15:53 +0800)]
wifi: mt76: mt7915: add baseband Txpower info into debugfs

This helps users debug the Txpower propagation path easily.

Signed-off-by: Ryder Lee <ryder.lee@mediatek.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
19 months agowifi: mt76: mt7915: add support to configure spatial reuse parameter set
Ryder Lee [Thu, 17 Nov 2022 17:09:47 +0000 (01:09 +0800)]
wifi: mt76: mt7915: add support to configure spatial reuse parameter set

The SPR parameter set comprises the OBSS PD thresholds for SRG and
non-SRG, and the bitmaps of BSS color and partial BSSID. This adds
support for passing the fields of the SPR element to firmware.

Users can disable firmware SR algorithms by turning sr_scene_detect off.

Signed-off-by: Ryder Lee <ryder.lee@mediatek.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
19 months agowifi: mt76: mt7915: add missing MODULE_PARM_DESC
Ryder Lee [Thu, 17 Nov 2022 17:09:46 +0000 (01:09 +0800)]
wifi: mt76: mt7915: add missing MODULE_PARM_DESC

Add descriptions for the module parameters so that they are visible
with the modinfo command.

Signed-off-by: Ryder Lee <ryder.lee@mediatek.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
19 months agowifi: mt76: mt7915: enable WED RX stats
Sujuan Chen [Sat, 12 Nov 2022 15:40:41 +0000 (16:40 +0100)]
wifi: mt76: mt7915: enable WED RX stats

Introduce the capability to report WED RX stats to mac80211.

Tested-by: Daniel Golle <daniel@makrotopia.org>
Co-developed-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Sujuan Chen <sujuan.chen@mediatek.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
19 months agowifi: mt76: mt7915: enable WED RX support
Lorenzo Bianconi [Sat, 12 Nov 2022 15:40:40 +0000 (16:40 +0100)]
wifi: mt76: mt7915: enable WED RX support

Enable RX Wireless Ethernet Dispatch, available on the MT7986 SoC, in
order to offload traffic received by the WLAN NIC and forwarded to the
LAN/WAN one.

Tested-by: Daniel Golle <daniel@makrotopia.org>
Co-developed-by: Sujuan Chen <sujuan.chen@mediatek.com>
Signed-off-by: Sujuan Chen <sujuan.chen@mediatek.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
19 months agowifi: mt76: connac: introduce mt76_connac_mcu_sta_wed_update utility routine
Sujuan Chen [Sat, 12 Nov 2022 15:40:39 +0000 (16:40 +0100)]
wifi: mt76: connac: introduce mt76_connac_mcu_sta_wed_update utility routine

This is a preliminary patch to introduce WED RX support for mt7915.

Tested-by: Daniel Golle <daniel@makrotopia.org>
Co-developed-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Sujuan Chen <sujuan.chen@mediatek.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>