platform/kernel/linux-starfive.git
11 months agonet: stmmac: dwmac-oxnas: remove obsolete dwmac glue driver
Neil Armstrong [Mon, 31 Jul 2023 14:41:10 +0000 (16:41 +0200)]
net: stmmac: dwmac-oxnas: remove obsolete dwmac glue driver

Due to lack of maintenance and stall of development for a few years now,
and since no new features will ever be added upstream, remove support
for OX810 and OX820 ethernet.

Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Daniel Golle <daniel@makrotopia.org>
Acked-by: Andy Shevchenko <andy@kernel.org>
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agoMerge branch 'selftests-mlxsw'
David S. Miller [Wed, 2 Aug 2023 08:18:18 +0000 (09:18 +0100)]
Merge branch 'selftests-mlxsw'

Petr Machata says:

====================
selftests: New selftests for out-of-order-operations patches in mlxsw

In the past, the mlxsw driver made the assumption that the user applies
configuration in a bottom-up manner. Thus netdevices needed to be added to
the bridge before IP addresses were configured on that bridge or SVI added
on top of it, because whatever happened before a netdevice was mlxsw upper
was generally ignored by mlxsw. Recently, several patch series were pushed
to introduce the bookkeeping and replays necessary to offload the full
state, not just the immediate configuration step.

In this patchset, introduce new selftests that directly exercise the out of
order code paths in mlxsw.

- Patch #1 adds new tests into the existing selftest router_bridge.sh.
- Patches #2-#5 add new generic selftests.
- Patches #6-#8 add new mlxsw-specific selftests.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agoselftests: mlxsw: rif_bridge: Add a new selftest
Petr Machata [Mon, 31 Jul 2023 15:47:22 +0000 (17:47 +0200)]
selftests: mlxsw: rif_bridge: Add a new selftest

This test verifies driver behavior with regards to creation of RIFs for a
bridge as LAGs are added or removed to/from it, and ports added or removed
to/from the LAG.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agoselftests: mlxsw: rif_lag_vlan: Add a new selftest
Petr Machata [Mon, 31 Jul 2023 15:47:21 +0000 (17:47 +0200)]
selftests: mlxsw: rif_lag_vlan: Add a new selftest

This test verifies driver behavior with regards to creation of RIFs for LAG
VLAN uppers as ports are added or removed to/from the LAG.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agoselftests: mlxsw: rif_lag: Add a new selftest
Petr Machata [Mon, 31 Jul 2023 15:47:20 +0000 (17:47 +0200)]
selftests: mlxsw: rif_lag: Add a new selftest

This test verifies driver behavior with regards to creation of RIFs for a
LAG as ports are added or removed to/from it.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agoselftests: router_bridge_1d_lag: Add a new selftest
Petr Machata [Mon, 31 Jul 2023 15:47:19 +0000 (17:47 +0200)]
selftests: router_bridge_1d_lag: Add a new selftest

Add a selftest to verify that routing through several bridges works when
LAG VLANs are used instead of physical ports, and that routing through LAG
VLANs themselves works as physical ports are de/enslaved.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agoselftests: router_bridge_lag: Add a new selftest
Petr Machata [Mon, 31 Jul 2023 15:47:18 +0000 (17:47 +0200)]
selftests: router_bridge_lag: Add a new selftest

Add a selftest to verify that routing through a bridge works when LAG is
used instead of physical ports.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agoselftests: router_bridge_vlan_upper: Add a new selftest
Petr Machata [Mon, 31 Jul 2023 15:47:17 +0000 (17:47 +0200)]
selftests: router_bridge_vlan_upper: Add a new selftest

Add a selftest that verifies routing through VLAN bridge uppers.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agoselftests: router_bridge_1d: Add a new selftest
Petr Machata [Mon, 31 Jul 2023 15:47:16 +0000 (17:47 +0200)]
selftests: router_bridge_1d: Add a new selftest

Add a selftest to verify that routing through a 1d bridge works when VLAN
upper of a physical port is used instead of a physical port. Also verify
that when a port is attached to an already-configured bridge, the
configuration is applied.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agoselftests: router_bridge: Add remastering tests
Petr Machata [Mon, 31 Jul 2023 15:47:15 +0000 (17:47 +0200)]
selftests: router_bridge: Add remastering tests

Add two tests to deslave a port from and reenslave to a bridge. This should
retain the ability of the system to forward traffic, but on an offloading
driver that is sensitive to ordering of operations, it might not.

The first test does this configuration in a way that relies on
vlan_default_pvid to assign the PVID. The second test disables that
autoconfiguration and configures PVID by hand in a separate step.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agonet: stmmac: XGMAC support for mdio C22 addr > 3
Rohan G Thomas [Mon, 31 Jul 2023 11:50:41 +0000 (19:50 +0800)]
net: stmmac: XGMAC support for mdio C22 addr > 3

For XGMAC versions < 2.2, the number of supported MDIO C22 addresses is
restricted to 3. From XGMAC version 2.2 onward there are no restrictions on
the C22 addresses; all valid MDIO addresses (0 to 31) are supported.
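
As an illustration only, the version check described above can be modelled in a few lines of Python. This is a sketch, not the driver code: the function name, the constants and the (major, minor) version tuple are hypothetical, and the boundary is taken from the subject line ("addr > 3"), which is an assumption about how the limit is applied.

```python
# Toy model of the C22 address check; all names here are hypothetical.
XGMAC_LEGACY_MAX_C22_ADDR = 3   # assumed highest usable C22 address before XGMAC 2.2
MDIO_MAX_ADDR = 31              # valid MDIO addresses are 0 to 31


def c22_addr_supported(core_version, addr):
    """Return True if a Clause 22 access to `addr` is allowed.

    `core_version` is a (major, minor) tuple such as (2, 1) or (2, 2).
    """
    if not 0 <= addr <= MDIO_MAX_ADDR:
        return False
    if core_version >= (2, 2):
        # No restriction from XGMAC 2.2 onward.
        return True
    # Older cores: only the low addresses are assumed to be usable.
    return addr <= XGMAC_LEGACY_MAX_C22_ADDR
```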

Signed-off-by: Rohan G Thomas <rohan.g.thomas@intel.com>
Acked-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agoMerge branch 'add-tja1120-support'
Jakub Kicinski [Wed, 2 Aug 2023 04:06:27 +0000 (21:06 -0700)]
Merge branch 'add-tja1120-support'

Radu Pirea says:

====================
Add TJA1120 support

This patch series got bigger than I expected. It cleans up the
next-c45-tja11xx driver and adds support for the TJA1120 (1000BaseT1
automotive PHY).

Master/slave custom implementation was replaced with the generic
implementation (genphy_c45_config_aneg/genphy_c45_read_status).

The TJA1120 and TJA1103 are a bit different when it comes to the PTP
interface. The timestamp read procedure was changed, some addresses were
changed and some bits were moved from one register to another. Adding
TJA1120 support was tricky, and I tried not to duplicate the code. If
something looks too hacky to you, I am open to suggestions.
====================

Link: https://lore.kernel.org/r/20230731091619.77961-1-radu-nicolae.pirea@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agonet: phy: nxp-c45-tja11xx: reset PCS if the link goes down
Radu Pirea (NXP OSS) [Mon, 31 Jul 2023 09:16:19 +0000 (12:16 +0300)]
net: phy: nxp-c45-tja11xx: reset PCS if the link goes down

During PTP testing on early TJA1120 engineering samples I observed that
if the link is lost and recovered, the tx timestamps will be randomly
lost. To avoid this HW issue, the PCS should be reset.

Resetting the PCS will break the link, so we should reset the PCS only on
a LINK UP -> LINK DOWN transition; otherwise we will trigger an infinite
loop of LINK UP -> LINK DOWN events.
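
The edge-triggered behaviour described above can be sketched as a small Python model. It is not the driver code: the class and attribute names are hypothetical, and the counter merely stands in for the hardware PCS reset.

```python
class LinkWatcher:
    """Toy model: reset the PCS only on a LINK UP -> LINK DOWN transition."""

    def __init__(self):
        self.prev_up = False   # last observed link state
        self.pcs_resets = 0    # stands in for the real PCS reset

    def update(self, link_up):
        # Act on the falling edge only. Resetting while the link is already
        # down would keep breaking the link and never let it settle.
        if self.prev_up and not link_up:
            self.pcs_resets += 1
        self.prev_up = link_up
```

Feeding the model a sequence of polled link states shows that repeated LINK DOWN readings do not cause repeated resets; only each UP -> DOWN edge does.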

Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20230731091619.77961-12-radu-nicolae.pirea@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agonet: phy: nxp-c45-tja11xx: read ext trig ts on TJA1120
Radu Pirea (NXP OSS) [Mon, 31 Jul 2023 09:16:18 +0000 (12:16 +0300)]
net: phy: nxp-c45-tja11xx: read ext trig ts on TJA1120

On TJA1120, the external trigger timestamp now has a VALID bit. This
changes the logic and we can't use the TJA1103 procedure.

For TJA1103, we can always read a valid timestamp from the registers,
compare the new timestamp with the old timestamp and, if they are not the
same, an event occurred. This logic cannot be applied for TJA1120 because
the timestamp is 0 if the VALID bit is not set.
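
A minimal sketch of the two detection schemes, written as a Python model rather than driver code. The function names and the returned (event, timestamp) tuple are hypothetical; only the logic (compare against the previous value versus trust the VALID bit) comes from the description above.

```python
def tja1103_style_event(old_ts, new_ts):
    """The register always holds a valid timestamp; a change means an event."""
    return (new_ts != old_ts, new_ts)


def tja1120_style_event(valid_bit, ts):
    """The register reads 0 unless VALID is set, so VALID decides."""
    if not valid_bit:
        return (False, None)
    return (True, ts)
```

The comparison scheme cannot be reused on TJA1120 because a read without VALID yields 0, which would look like a "changed" timestamp relative to any earlier non-zero value.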

Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Link: https://lore.kernel.org/r/20230731091619.77961-11-radu-nicolae.pirea@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agonet: phy: nxp-c45-tja11xx: run cable test with the PHY in test mode
Radu Pirea (NXP OSS) [Mon, 31 Jul 2023 09:16:17 +0000 (12:16 +0300)]
net: phy: nxp-c45-tja11xx: run cable test with the PHY in test mode

For TJA1120, the enable bit for cable test is not writable if the PHY is
not in test mode.

Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20230731091619.77961-10-radu-nicolae.pirea@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agonet: phy: nxp-c45-tja11xx: handle FUSA irq
Radu Pirea (NXP OSS) [Mon, 31 Jul 2023 09:16:16 +0000 (12:16 +0300)]
net: phy: nxp-c45-tja11xx: handle FUSA irq

TJA1120 and TJA1103 have a set of functional safety hardware tests
executed after every reset, and when the tests are done, the IRQ line is
asserted. For the moment, the purpose of these handlers is to acknowledge
the IRQ and not to check the FUSA tests status.

Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20230731091619.77961-9-radu-nicolae.pirea@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agonet: phy: nxp-c45-tja11xx: read egress ts on TJA1120
Radu Pirea (NXP OSS) [Mon, 31 Jul 2023 09:16:15 +0000 (12:16 +0300)]
net: phy: nxp-c45-tja11xx: read egress ts on TJA1120

The egress timestamp FIFO/circular buffer works differently on TJA1120
than on TJA1103.

For TJA1103 the new timestamp should be manually moved from the FIFO to
the hardware buffer before checking if the timestamp is valid.

For TJA1120 the hardware will automatically move the new timestamp
from the FIFO to the buffer and the user should check the valid bit, read
the timestamp and unlock the buffer by writing any of the buffer
registers (which are read only).

Another change for the TJA1120 is the behaviour of the EGR TS IRQ bit.
This bit was a self-clear bit for TJA1103, but now should be cleared
before reading the timestamp.

Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Link: https://lore.kernel.org/r/20230731091619.77961-8-radu-nicolae.pirea@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agonet: phy: nxp-c45-tja11xx: enable LTC sampling on both ext_ts edges
Radu Pirea (NXP OSS) [Mon, 31 Jul 2023 09:16:14 +0000 (12:16 +0300)]
net: phy: nxp-c45-tja11xx: enable LTC sampling on both ext_ts edges

The external trigger configuration for TJA1120 has changed. The PHY
supports sampling of the LTC on rising and on falling edge.

Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20230731091619.77961-7-radu-nicolae.pirea@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agonet: phy: nxp-c45-tja11xx: add TJA1120 support
Radu Pirea (NXP OSS) [Mon, 31 Jul 2023 09:16:13 +0000 (12:16 +0300)]
net: phy: nxp-c45-tja11xx: add TJA1120 support

Add TJA1120 driver entry and its driver_data.

Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20230731091619.77961-6-radu-nicolae.pirea@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agonet: phy: nxp-c45-tja11xx: use get_features
Radu Pirea (NXP OSS) [Mon, 31 Jul 2023 09:16:12 +0000 (12:16 +0300)]
net: phy: nxp-c45-tja11xx: use get_features

PHY_BASIC_T1_FEATURES are not the right features supported by TJA1103
anymore.
For example ethtool reports:
[root@alarm ~]# ethtool end0
Settings for end0:
        Supported ports: [ TP ]
        Supported link modes:   100baseT1/Full
                                10baseT1L/Full

10baseT1L/Full is not supported by TJA1103 and the supported ports list is
not complete. The PHY also has an MII port.

genphy_c45_pma_read_abilities implementation can detect the PHY features
and they look like this.
[root@alarm ~]# ethtool end0
Settings for end0:
        Supported ports: [ TP    MII ]
        Supported link modes:   100baseT1/Full
        Supported pause frame use: Symmetric
        Supports auto-negotiation: No
        Supported FEC modes: Not reported
        Advertised link modes:  100baseT1/Full
        Advertised pause frame use: Symmetric
        Advertised auto-negotiation: No
        Advertised FEC modes: Not reported
        Speed: 100Mb/s
        Duplex: Full
        Auto-negotiation: off
        master-slave cfg: forced master
        master-slave status: master
        Port: Twisted Pair
        PHYAD: 1
        Transceiver: external
        MDI-X: Unknown
        Supports Wake-on: g
        Wake-on: d
        Link detected: yes
        SQI: 7/7

Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20230731091619.77961-5-radu-nicolae.pirea@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agonet: phy: nxp-c45-tja11xx: prepare the ground for TJA1120
Radu Pirea (NXP OSS) [Mon, 31 Jul 2023 09:16:11 +0000 (12:16 +0300)]
net: phy: nxp-c45-tja11xx: prepare the ground for TJA1120

Between TJA1120 and TJA1103 the hardware was improved, but some register
addresses were changed and some bit fields were moved from one register
to another.

Introduce the nxp_c45_reg_field structure and its associated functions to
abstract the differences between the PHYs.

Remove the defined bits and register addresses that are not common
between TJA1103 and TJA1120 and replace them with reg_fields and
register addresses from phydev->drv->driver_data.

Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Link: https://lore.kernel.org/r/20230731091619.77961-4-radu-nicolae.pirea@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agonet: phy: nxp-c45-tja11xx: remove RX BIST frame counters
Radu Pirea (NXP OSS) [Mon, 31 Jul 2023 09:16:10 +0000 (12:16 +0300)]
net: phy: nxp-c45-tja11xx: remove RX BIST frame counters

Remove RX BIST frame counters from the PHY statistics.
In production mode, these counters are always read as 0.

Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20230731091619.77961-3-radu-nicolae.pirea@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agonet: phy: nxp-c45-tja11xx: use phylib master/slave implementation
Radu Pirea (NXP OSS) [Mon, 31 Jul 2023 09:16:09 +0000 (12:16 +0300)]
net: phy: nxp-c45-tja11xx: use phylib master/slave implementation

Remove the custom implementation of master/slave setup and read status
and use genphy_c45_config_aneg and genphy_c45_read_status since phylib
has support for master/slave setup and master/slave status.

Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20230731091619.77961-2-radu-nicolae.pirea@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agoMerge branch 'virtio_net-add-per-queue-interrupt-coalescing-support'
Jakub Kicinski [Wed, 2 Aug 2023 04:02:06 +0000 (21:02 -0700)]
Merge branch 'virtio_net-add-per-queue-interrupt-coalescing-support'

Gavin Li says:

====================
virtio_net: add per queue interrupt coalescing support

Currently, coalescing parameters are grouped for all transmit and receive
virtqueues. This patch series adds support to set or get the parameters for
a specified virtqueue.

When the traffic between virtqueues is unbalanced, for example, one virtqueue
is busy and another virtqueue is idle, then it will be very useful to
control coalescing parameters at the virtqueue granularity.

Example command:
$ ethtool -Q eth5 queue_mask 0x1 --coalesce tx-packets 10
Would set max_packets=10 to VQ 1.
$ ethtool -Q eth5 queue_mask 0x1 --coalesce rx-packets 10
Would set max_packets=10 to VQ 0.
$ ethtool -Q eth5 queue_mask 0x1 --show-coalesce
 Queue: 0
 Adaptive RX: off  TX: off
 stats-block-usecs: 0
 sample-interval: 0
 pkt-rate-low: 0
 pkt-rate-high: 0

 rx-usecs: 222
 rx-frames: 0
 rx-usecs-irq: 0
 rx-frames-irq: 256

 tx-usecs: 222
 tx-frames: 0
 tx-usecs-irq: 0
 tx-frames-irq: 256

 rx-usecs-low: 0
 rx-frame-low: 0
 tx-usecs-low: 0
 tx-frame-low: 0

 rx-usecs-high: 0
 rx-frame-high: 0
 tx-usecs-high: 0
 tx-frame-high: 0
====================
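
To make the example commands above easier to follow, here is a small Python sketch of how a queue mask selects queues and how a queue index relates to a virtqueue (VQ) number. The helper names are hypothetical. The rx = 2 * queue and tx = 2 * queue + 1 mapping is an assumption that matches the example (queue 0: rx on VQ 0, tx on VQ 1).

```python
def queues_in_mask(queue_mask):
    """Return the queue indices whose bit is set in `queue_mask`."""
    return [i for i in range(queue_mask.bit_length()) if (queue_mask >> i) & 1]


def rx_vq(queue):
    """Assumed mapping: receive virtqueues use the even VQ numbers."""
    return 2 * queue


def tx_vq(queue):
    """Assumed mapping: transmit virtqueues use the odd VQ numbers."""
    return 2 * queue + 1
```

With queue_mask 0x1 only queue 0 is selected, so an rx setting lands on VQ 0 and a tx setting on VQ 1, as in the cover letter.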

Link: https://lore.kernel.org/r/20230731070656.96411-1-gavinl@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agovirtio_net: enable per queue interrupt coalesce feature
Gavin Li [Mon, 31 Jul 2023 07:06:56 +0000 (10:06 +0300)]
virtio_net: enable per queue interrupt coalesce feature

Enable per queue interrupt coalesce feature bit in driver and validate its
dependency with control queue.

Signed-off-by: Gavin Li <gavinl@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Heng Qi <hengqi@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20230731070656.96411-4-gavinl@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agovirtio_net: support per queue interrupt coalesce command
Gavin Li [Mon, 31 Jul 2023 07:06:55 +0000 (10:06 +0300)]
virtio_net: support per queue interrupt coalesce command

Add interrupt_coalesce config in send_queue and receive_queue to cache user
config.

Send per virtqueue interrupt moderation config to underlying device in
order to have more efficient interrupt moderation and cpu utilization of
guest VM.

Additionally, address all the VQs when updating the global configuration,
as now the individual VQs configuration can diverge from the global
configuration.

Signed-off-by: Gavin Li <gavinl@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Heng Qi <hengqi@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20230731070656.96411-3-gavinl@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agovirtio_net: extract interrupt coalescing settings to a structure
Gavin Li [Mon, 31 Jul 2023 07:06:54 +0000 (10:06 +0300)]
virtio_net: extract interrupt coalescing settings to a structure

Extract interrupt coalescing settings to a structure so that it could be
reused in other data structures.

Signed-off-by: Gavin Li <gavinl@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Heng Qi <hengqi@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230731070656.96411-2-gavinl@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agoinet6: Remove unused function declaration udpv6_connect()
Yue Haibing [Mon, 31 Jul 2023 14:04:37 +0000 (22:04 +0800)]
inet6: Remove unused function declaration udpv6_connect()

This has never been implemented since the beginning of git history.

Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20230731140437.37056-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agonet: make sure we never create ifindex = 0
Jakub Kicinski [Mon, 31 Jul 2023 17:11:58 +0000 (10:11 -0700)]
net: make sure we never create ifindex = 0

Instead of allocating from 1, use the proper xa_init flag
to protect ourselves from IDs wrapping back to 0.
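
The wrap-around hazard can be illustrated with a toy cyclic ID allocator in Python. This is not the xarray implementation; the class and its parameters are hypothetical. It only shows why starting the search at 1 is not enough when the allowed range still begins at 0, whereas a floor of 1 can never hand out 0.

```python
class CyclicIdAllocator:
    """Toy cyclic allocator over [min_id, max_id]; not the kernel xarray."""

    def __init__(self, min_id, max_id, first=None):
        self.min_id = min_id
        self.max_id = max_id
        # Where the next search starts; defaults to the floor of the range.
        self.next_id = min_id if first is None else first
        self.used = set()

    def alloc(self):
        span = self.max_id - self.min_id + 1
        for offset in range(span):
            candidate = self.min_id + (self.next_id - self.min_id + offset) % span
            if candidate not in self.used:
                self.used.add(candidate)
                # After the top of the range, wrap back to the floor.
                self.next_id = candidate + 1 if candidate < self.max_id else self.min_id
                return candidate
        raise RuntimeError("no free IDs")

    def free(self, ident):
        self.used.discard(ident)
```

An allocator created with a floor of 0 but a first search position of 1 will return 0 once it wraps; one created with a floor of 1 wraps back to 1.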

Fixes: 759ab1edb56c ("net: store netdevs in an xarray")
Reported-by: Stephen Hemminger <stephen@networkplumber.org>
Link: https://lore.kernel.org/all/20230728162350.2a6d4979@hermes.local/
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20230731171159.988962-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agonet/macmace: Replace zero-length array with DECLARE_FLEX_ARRAY() helper
Atul Raut [Sun, 30 Jul 2023 23:14:42 +0000 (16:14 -0700)]
net/macmace: Replace zero-length array with DECLARE_FLEX_ARRAY() helper

Since zero-length arrays are deprecated, we are replacing
them with C99 flexible-array members. As a result, instead
of declaring a zero-length array, use the new
DECLARE_FLEX_ARRAY() helper macro.

This fixes warnings such as:
./drivers/net/ethernet/apple/macmace.c:80:4-8: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)

Signed-off-by: Atul Raut <rauji.raut@gmail.com>
Link: https://lore.kernel.org/r/20230730231442.15003-1-rauji.raut@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agonet: dsa: qca8k: use dsa_for_each macro instead of for loop
Christian Marangi [Sun, 30 Jul 2023 07:41:13 +0000 (09:41 +0200)]
net: dsa: qca8k: use dsa_for_each macro instead of for loop

Convert the for loop to the dsa_for_each macro to save some redundant writes
on unconnected/unused ports and tidy things up.

Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20230730074113.21889-5-ansuelsmth@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
11 months agonet: dsa: qca8k: move qca8xxx hol fixup to separate function
Christian Marangi [Sun, 30 Jul 2023 07:41:12 +0000 (09:41 +0200)]
net: dsa: qca8k: move qca8xxx hol fixup to separate function

Move qca8xxx hol fixup to a separate function to tidy things up and to
permit using a more efficient loop in a future patch.

Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20230730074113.21889-4-ansuelsmth@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
11 months agonet: dsa: qca8k: limit user ports access to the first CPU port on setup
Christian Marangi [Sun, 30 Jul 2023 07:41:11 +0000 (09:41 +0200)]
net: dsa: qca8k: limit user ports access to the first CPU port on setup

In preparation for multi-CPU support, set CPU port LOOKUP MEMBER outside
the port loop and setup the LOOKUP MEMBER mask for user ports only to
the first CPU port.

This is to handle flooding condition where every CPU port is set as
target and prevent packet duplication for unknown frames from user ports.

The secondary CPU port LOOKUP MEMBER mask will be set up later, when
port_change_master is implemented.

Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/20230730074113.21889-3-ansuelsmth@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
11 months agonet: dsa: qca8k: make learning configurable and keep off if standalone
Christian Marangi [Sun, 30 Jul 2023 07:41:10 +0000 (09:41 +0200)]
net: dsa: qca8k: make learning configurable and keep off if standalone

Address learning should initially be turned off by the driver for port
operation in standalone mode, then the DSA core handles changes to it
via ds->ops->port_bridge_flags().

Currently this is not the case for qca8k where learning is enabled
unconditionally in qca8k_setup for every user port.

Handle ports configured in standalone mode by making the learning
configurable and not enabling it by default.

Implement the .port_pre_bridge_flags and .port_bridge_flags DSA ops to
enable learning for bridges that request it, and tweak
.port_stp_state_set to correctly disable learning when the port is
configured in standalone mode.

Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20230730074113.21889-2-ansuelsmth@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
11 months agonet: dsa: tag_qca: return early if dev is not found
Christian Marangi [Sun, 30 Jul 2023 07:41:09 +0000 (09:41 +0200)]
net: dsa: tag_qca: return early if dev is not found

Currently the checksum is recalculated and the DSA tag stripped even if we
later don't find the dev.

To improve the code, exit early if we don't find the dev and skip additional
operations on the skb, since it will be freed anyway.

Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20230730074113.21889-1-ansuelsmth@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
11 months agoMerge branch 'net-sched-improve-class-lifetime-handling'
Paolo Abeni [Tue, 1 Aug 2023 08:47:28 +0000 (10:47 +0200)]
Merge branch 'net-sched-improve-class-lifetime-handling'

Pedro Tammela says:

====================
net/sched: improve class lifetime handling

Valis says[0]:
============
Three classifiers (cls_fw, cls_u32 and cls_route) always copy
tcf_result struct into the new instance of the filter on update.

This causes a problem when updating a filter bound to a class,
as tcf_unbind_filter() is always called on the old instance in the
success path, decreasing filter_cnt of the still referenced class
and allowing it to be deleted, leading to a use-after-free.
============

Turns out these could have been spotted easily with proper warnings.
Improve the current class lifetime with wrappers that check for
overflow/underflow.

While at it add an extack for when a class in use is deleted.

[0] https://lore.kernel.org/all/20230721174856.3045-1-sec@valis.email/
====================

Link: https://lore.kernel.org/r/20230728153537.1865379-1-pctammela@mojatatu.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
11 months agonet/sched: sch_qfq: warn about class in use while deleting
Pedro Tammela [Fri, 28 Jul 2023 15:35:37 +0000 (12:35 -0300)]
net/sched: sch_qfq: warn about class in use while deleting

Add extack to warn that delete was rejected because
the class is still in use

Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
11 months agonet/sched: sch_htb: warn about class in use while deleting
Pedro Tammela [Fri, 28 Jul 2023 15:35:36 +0000 (12:35 -0300)]
net/sched: sch_htb: warn about class in use while deleting

Add extack to warn that delete was rejected because
the class is still in use

Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
11 months agonet/sched: sch_hfsc: warn about class in use while deleting
Pedro Tammela [Fri, 28 Jul 2023 15:35:35 +0000 (12:35 -0300)]
net/sched: sch_hfsc: warn about class in use while deleting

Add extack to warn that delete was rejected because
the class is still in use

Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
11 months agonet/sched: sch_drr: warn about class in use while deleting
Pedro Tammela [Fri, 28 Jul 2023 15:35:34 +0000 (12:35 -0300)]
net/sched: sch_drr: warn about class in use while deleting

Add extack to warn that delete was rejected because
the class is still in use

Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
11 months agonet/sched: wrap open coded Qdics class filter counter
Pedro Tammela [Fri, 28 Jul 2023 15:35:33 +0000 (12:35 -0300)]
net/sched: wrap open coded Qdics class filter counter

The 'filter_cnt' counter is used to control a Qdisc class lifetime.
Each filter referencing this class by its id will eventually
increment/decrement this counter in their respective
'add/update/delete' routines.
As these operations are always serialized under rtnl lock, we don't
need an atomic type like 'refcount_t'.

It also means that we lose the overflow/underflow checks already
present in refcount_t, which are valuable for hunting down bugs
where the unsigned counter wraps around, as they help automated tools
like syzkaller scream in such situations.

Wrap the open coded increment/decrement into helper functions and
add overflow checks to the operations.
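
A minimal Python model of such wrappers is sketched below. The names are hypothetical and the model raises exceptions where kernel code would more likely emit a warning; the 32-bit bound is an assumption used to imitate an unsigned counter, since Python integers do not wrap.

```python
U32_MAX = 0xFFFFFFFF  # assumed width of the unsigned counter


class QdiscClassModel:
    """Toy stand-in for a Qdisc class with a plain (non-atomic) counter."""

    def __init__(self):
        self.filter_cnt = 0


def class_get(cl):
    # Refuse to wrap from the maximum value back to 0.
    if cl.filter_cnt == U32_MAX:
        raise OverflowError("filter_cnt overflow")
    cl.filter_cnt += 1


def class_put(cl):
    # Refuse to wrap from 0 to the maximum value.
    if cl.filter_cnt == 0:
        raise ArithmeticError("filter_cnt underflow")
    cl.filter_cnt -= 1


def class_in_use(cl):
    return cl.filter_cnt > 0
```

An unbalanced put, like the extra unbind in the reported use-after-free, is then caught at the point where it happens instead of silently wrapping the counter.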

Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
11 months agoMerge branch 'mptcp-cleanup-and-improvements-in-the-selftests'
Jakub Kicinski [Tue, 1 Aug 2023 03:11:55 +0000 (20:11 -0700)]
Merge branch 'mptcp-cleanup-and-improvements-in-the-selftests'

Matthieu Baerts says:

====================
mptcp: cleanup and improvements in the selftests

This small series of 4 patches adds some improvements in MPTCP
selftests:

- Patch 1 reworks the detailed report of mptcp_join.sh selftest to
  better display what went well or wrong per test.

- Patch 2 adds colours (if supported, forced and/or not disabled) in
  mptcp_join.sh selftest output to help spot issues.

- Patch 3 modifies an MPTCP selftest tool to interact with the
  path-manager via Netlink to always look for errors if any. This makes
  sure odd behaviours can be seen in the logs and errors can be caught
  later if needed.

- Patch 4 removes stdout and stderr redirections to /dev/null when using
  pm_nl_ctl if no errors are expected in order to log odd behaviours.
====================

Link: https://lore.kernel.org/r/20230730-upstream-net-next-20230728-mptcp-selftests-misc-v1-0-7e9cc530a9cd@tessares.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agoselftests: mptcp: userspace_pm: unmute unexpected errors
Matthieu Baerts [Sun, 30 Jul 2023 08:05:18 +0000 (10:05 +0200)]
selftests: mptcp: userspace_pm: unmute unexpected errors

All pm_nl_ctl commands were muted. If there was an unexpected error with
one of them, this was simply not visible in the logs, making the
analysis very hard. It could also hide misuse of commands by mistake.

Now the output is only muted when we do expect to have an error, e.g.
when giving invalid arguments on purpose.

Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Link: https://lore.kernel.org/r/20230730-upstream-net-next-20230728-mptcp-selftests-misc-v1-4-7e9cc530a9cd@tessares.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agoselftests: mptcp: pm_nl_ctl: always look for errors
Matthieu Baerts [Sun, 30 Jul 2023 08:05:17 +0000 (10:05 +0200)]
selftests: mptcp: pm_nl_ctl: always look for errors

If a Netlink command for the MPTCP path-managers is not valid, it is
important to check if there are errors. If yes, they need to be reported
instead of being ignored and exiting without errors.

Now if no replies are expected, userspace requests an ACK from the
kernel so that a reply is always expected. We can use the same
buffer, which is currently always >1024 bytes. Then we can check if there
is an error (err->error), print it if any and report the error.
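
The ACK check described above can be sketched in userspace C as follows.
This is only an illustration under stated assumptions, not the code from
pm_nl_ctl: the function name nl_reply_error() is hypothetical, and it
assumes the request was sent with NLM_F_ACK so that the kernel answers
with an NLMSG_ERROR message whose err->error is 0 on success and a
negative errno on failure.

```c
#include <linux/netlink.h>

/*
 * Hypothetical helper: walk a received Netlink buffer and return the
 * first non-zero err->error found, or 0 if the reply is a plain ACK
 * (or contains no error message at all).
 */
int nl_reply_error(void *buf, int len)
{
	struct nlmsghdr *nh = buf;

	for (; NLMSG_OK(nh, len); nh = NLMSG_NEXT(nh, len)) {
		struct nlmsgerr *err;

		if (nh->nlmsg_type != NLMSG_ERROR)
			continue;
		/* Skip messages too short to hold a struct nlmsgerr. */
		if (nh->nlmsg_len < NLMSG_LENGTH(sizeof(*err)))
			continue;

		err = NLMSG_DATA(nh);
		if (err->error)
			return err->error;	/* negative errno from the kernel */
	}

	return 0;
}
```

A caller would typically print strerror(-ret) and exit with a non-zero
status when the returned value is not 0.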

After this modification, it is required to mute expected errors in
mptcp_join.sh and pm_netlink.sh selftests:

- when trying to add a bad endpoint, e.g. duplicated
- when trying to set the two limits above the hard limit

Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Link: https://lore.kernel.org/r/20230730-upstream-net-next-20230728-mptcp-selftests-misc-v1-3-7e9cc530a9cd@tessares.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agoselftests: mptcp: join: colored results
Matthieu Baerts [Sun, 30 Jul 2023 08:05:16 +0000 (10:05 +0200)]
selftests: mptcp: join: colored results

Thanks to the parent commit, it is easy to change the output and add
some colours to help spotting issues.

The colours are not used if stdout is redirected or if NO_COLOR env var
is set to 1 as specified in https://no-color.org.

It is possible to force displaying the colours even if stdout is
redirected by setting this env var:

  SELFTESTS_MPTCP_LIB_COLOR_FORCE=1
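
The decision described above can be modelled with a small C function.
This is a sketch, not the selftest's shell code: use_colors() and
env_is_one() are hypothetical names, and letting the force variable take
precedence over NO_COLOR is an assumption, since the text only says that
forcing overrides a redirected stdout.

```c
#include <stdbool.h>
#include <string.h>

/* True only when the variable is set to exactly "1". */
static bool env_is_one(const char *value)
{
	return value && strcmp(value, "1") == 0;
}

/*
 * Hypothetical model of the colour decision:
 * - forced colours win (assumption: even over NO_COLOR);
 * - otherwise colours need a terminal on stdout and NO_COLOR != 1.
 */
bool use_colors(bool stdout_is_tty, const char *no_color, const char *force)
{
	if (env_is_one(force))
		return true;

	return stdout_is_tty && !env_is_one(no_color);
}
```

In a real program the arguments would come from isatty(STDOUT_FILENO),
getenv("NO_COLOR") and getenv("SELFTESTS_MPTCP_LIB_COLOR_FORCE").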

Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Link: https://lore.kernel.org/r/20230730-upstream-net-next-20230728-mptcp-selftests-misc-v1-2-7e9cc530a9cd@tessares.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agoselftests: mptcp: join: rework detailed report
Matthieu Baerts [Sun, 30 Jul 2023 08:05:15 +0000 (10:05 +0200)]
selftests: mptcp: join: rework detailed report

This patch modifies how the detailed results are printed, mainly to
improve what is displayed in case of issue:

- Now the test name (title) is printed earlier, when starting the test
  if it is not intentionally skipped: by doing that, errors linked to
  a test are printed after the test name, which avoids confusion.

- Due to the previous item, it is required to add a new line after
  having printed the test name because in case of error with a command,
  it is better not to have the output in the middle of the screen.

- Each check is printed on a dedicated line with an aligned status (ok,
  skip, fail): it is easier to spot which one has failed, simpler to
  manage in the code without having to deal with alignment case by case,
  and helpers can be used to unify what is done. These helpers can also
  be useful later to do more actions depending on the results or to
  change in one place what is printed.

- Info messages have been reduced and aligned as well. And info messages
  about the creation of the default test files of 1 KB are no longer
  printed.

Example:

  001 no JOIN
        syn                                 [ ok ]
        synack                              [ ok ]
        ack                                 [ ok ]

Or with a skip and a failure:

  001 no JOIN
        syn                                 [ ok ]
        synack                              [fail] got 42 JOIN[s] synack expected 0
  Server ns stats
  (...)
  Client ns stats
  (...)
        ack                                 [skip]

Or with info:

  104 Infinite map
        Test file (size 128 KB) for client
        Test file (size 128 KB) for server
        file received by server has inverted byte at 169
        5 corrupted pkts
        syn                                 [ ok ]
        synack                              [ ok ]
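
The aligned layout shown in these examples can be reproduced with a
fixed-width format string. The sketch below is in C rather than in the
selftest's shell, format_check() is a hypothetical name, and the column
widths (8 blanks of indent, 36 characters for the check name) are
assumptions chosen to resemble the sample output.

```c
#include <stdio.h>

enum check_status { CHECK_OK, CHECK_SKIP, CHECK_FAIL };

/*
 * Hypothetical helper: write one "check" line into buf.
 * The name is left-aligned in a 36-character column (assumed width),
 * so the status brackets line up whatever the name length is.
 * Returns the value of snprintf().
 */
int format_check(char *buf, size_t len, const char *name,
		 enum check_status status)
{
	static const char *const label[] = { " ok ", "skip", "fail" };

	return snprintf(buf, len, "%8s%-36s[%s]", "", name, label[status]);
}
```

Keeping the format in a single helper is what allows the output to be
changed in one place later.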

While at it, verify_listener_events() now also prints more info in case
of failure and, in pm_nl_check_endpoint(), the test is marked as failed
instead of skipped if no ID has been given (internal selftest issue).

Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Link: https://lore.kernel.org/r/20230730-upstream-net-next-20230728-mptcp-selftests-misc-v1-1-7e9cc530a9cd@tessares.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agonet/hsr: Remove unused function declarations
Yue Haibing [Sat, 29 Jul 2023 12:34:56 +0000 (20:34 +0800)]
net/hsr: Remove unused function declarations

commit f421436a591d ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)")
introduced these but never implemented them.

Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20230729123456.36340-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agoselftests: connector: Fix input argument error paths to skip
Shuah Khan [Sat, 29 Jul 2023 00:24:03 +0000 (18:24 -0600)]
selftests: connector: Fix input argument error paths to skip

Make the error legs of the input argument parsing paths exit with a
skip instead of a failure. This avoids reporting a false test failure
when the test has not actually been run.
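
As an illustration only, an argument check that skips instead of failing
might look like the sketch below. check_args() is a hypothetical helper
and the "-f" option is an assumption about the test's command line; the
exit codes follow the kselftest convention where 4 means "skip".

```c
#include <string.h>

#define KSFT_PASS 0
#define KSFT_SKIP 4	/* kselftest exit code for a skipped test */

/*
 * Hypothetical argument check: anything other than no argument or a
 * single "-f" (assumed option name) is treated as a reason to skip,
 * not as a test failure.
 */
int check_args(int argc, char *const argv[])
{
	if (argc > 2)
		return KSFT_SKIP;
	if (argc == 2 && strcmp(argv[1], "-f") != 0)
		return KSFT_SKIP;

	return KSFT_PASS;
}
```

A main() using this would call exit() with the returned value whenever
it is not KSFT_PASS, before running any part of the test.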

Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Reviewed-by: Anjali Kulkarni <anjali.k.kulkarni@oracle.com>
Link: https://lore.kernel.org/r/20230729002403.4278-1-skhan@linuxfoundation.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agotcx: Fix splat during dev unregister
Martin KaFai Lau [Fri, 28 Jul 2023 21:47:17 +0000 (23:47 +0200)]
tcx: Fix splat during dev unregister

During unregister_netdevice_many_notify(), the ordering of the relevant
function calls is:

  unregister_netdevice_many_notify
    dev_shutdown
      qdisc_put
        clsact_destroy
    tcx_uninstall

The syzbot reproducer triggered a case where the qdisc refcnt is not
zero during dev_shutdown().

tcx_uninstall() will then WARN_ON_ONCE(tcx_entry(entry)->miniq_active)
because the miniq is still active and the entry should not be freed.
The latter assumed that qdisc destruction happens before tcx teardown.

The fix is to avoid tcx_uninstall() doing tcx_entry_free() when the
miniq is still alive and to let clsact_destroy() do the free later, so
that we do not assume any specific ordering for either of them.

If the miniq is still active, tcx_uninstall() still clears the entry when
flushing out the prog/link. clsact_destroy() will then notice the
"!tcx_entry_is_active()" condition and eventually do the tcx_entry_free().
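
The ordering-independent teardown can be illustrated with a small
userspace model. This is not the kernel code: every name prefixed with
model_ is hypothetical, and the struct only tracks the two facts the text
relies on (whether the miniq is active and whether programs are still
attached). Whichever teardown step runs last performs the single free.

```c
#include <stdbool.h>

/* Hypothetical userspace model of a device's tcx entry. */
struct model_dev {
	bool has_entry;		/* entry still reachable from the device */
	bool miniq_active;	/* clsact qdisc still references the entry */
	int nprogs;		/* attached programs/links */
	int free_count;		/* how many times the entry was freed */
};

static void model_entry_free(struct model_dev *dev)
{
	dev->has_entry = false;
	dev->free_count++;
}

/* Models tcx_uninstall(): flush programs, free only if miniq is gone. */
void model_tcx_uninstall(struct model_dev *dev)
{
	if (!dev->has_entry)
		return;
	dev->nprogs = 0;
	if (!dev->miniq_active)
		model_entry_free(dev);
}

/* Models clsact_destroy(): drop the miniq, free only if no programs. */
void model_clsact_destroy(struct model_dev *dev)
{
	if (!dev->has_entry)
		return;
	dev->miniq_active = false;
	if (dev->nprogs == 0)
		model_entry_free(dev);
}
```

Calling the two functions in either order on a device with an active
miniq and attached programs ends with free_count equal to 1.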

Fixes: e420bed02507 ("bpf: Add fd-based tcx multi-prog infra with link support")
Reported-by: syzbot+376a289e86a0fd02b9ba@syzkaller.appspotmail.com
Reported-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Co-developed-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: syzbot+376a289e86a0fd02b9ba@syzkaller.appspotmail.com
Tested-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/222255fe07cb58f15ee662e7ee78328af5b438e4.1690549248.git.daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agonet: bcmgenet: Remove TX ring full logging
Florian Fainelli [Fri, 28 Jul 2023 18:39:45 +0000 (11:39 -0700)]
net: bcmgenet: Remove TX ring full logging

There is no need to spam the kernel log with such an indication, remove
this message.

Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20230728183945.760531-1-florian.fainelli@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agovsock: Remove unused function declarations
Yue Haibing [Sat, 29 Jul 2023 12:20:36 +0000 (20:20 +0800)]
vsock: Remove unused function declarations

These have never been implemented since their introduction in
commit d021c344051a ("VSOCK: Introduce VM Sockets").

Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20230729122036.32988-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agonet/smc: Remove unused function declarations
Yue Haibing [Sat, 29 Jul 2023 12:19:29 +0000 (20:19 +0800)]
net/smc: Remove unused function declarations

commit f9aab6f2ce57 ("net/smc: immediate freeing in smc_lgr_cleanup_early()")
left behind smc_lgr_schedule_free_work_fast() declaration.
And since commit 349d43127dac ("net/smc: fix kernel panic caused by race of smc_sock")
smc_ib_modify_qp_reset() is not used anymore.

Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Tony Lu <tonylu@linux.alibaba.com>
Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com>
Link: https://lore.kernel.org/r/20230729121929.17180-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agoMerge branch 'connector-proc_filter-test-fixes'
Jakub Kicinski [Mon, 31 Jul 2023 21:38:28 +0000 (14:38 -0700)]
Merge branch 'connector-proc_filter-test-fixes'

Shuah Khan says:

====================
Connector/proc_filter test fixes

The first patch fixes the LKFT reported compile error, second
one adds .gitignore.
====================

Applying the first 2 patches, third one resent separately.

Link: https://lore.kernel.org/r/cover.1690564372.git.skhan@linuxfoundation.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agoselftests: connector: Add .gitignore and poupulate it with test
Shuah Khan [Fri, 28 Jul 2023 17:29:27 +0000 (11:29 -0600)]
selftests: connector: Add .gitignore and poupulate it with test

Add .gitignore and populate it with the test name: proc_filter

Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/e3d04cc34e9af07909dc882b50fb1b6f1ce7705b.1690564372.git.skhan@linuxfoundation.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agoselftests: connector: Fix Makefile to include KHDR_INCLUDES
Shuah Khan [Fri, 28 Jul 2023 17:29:26 +0000 (11:29 -0600)]
selftests: connector: Fix Makefile to include KHDR_INCLUDES

Compiling the test fails with the following errors. Fix the Makefile
CFLAGS to include KHDR_INCLUDES to pull in uapi defines.

gcc -Wall     proc_filter.c  -o ../tools/testing/selftests/connector/proc_filter
proc_filter.c: In function ‘send_message’:
proc_filter.c:22:33: error: invalid application of ‘sizeof’ to incomplete type ‘struct proc_input’
   22 |                          sizeof(struct proc_input))
      |                                 ^~~~~~
proc_filter.c:42:19: note: in expansion of macro ‘NL_MESSAGE_SIZE’
   42 |         char buff[NL_MESSAGE_SIZE];
      |                   ^~~~~~~~~~~~~~~
proc_filter.c:22:33: error: invalid application of ‘sizeof’ to incomplete type ‘struct proc_input’
   22 |                          sizeof(struct proc_input))
      |                                 ^~~~~~
proc_filter.c:48:34: note: in expansion of macro ‘NL_MESSAGE_SIZE’
   48 |                 hdr->nlmsg_len = NL_MESSAGE_SIZE;
      |                                  ^~~~~~~~~~~~~~~

Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Link: https://lore.kernel.org/all/CA+G9fYt=6ysz636XcQ=-KJp7vJcMZ=NjbQBrn77v7vnTcfP2cA@mail.gmail.com/
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/d0055c8cdf18516db8ba9edec99cfc5c08f32a7c.1690564372.git.skhan@linuxfoundation.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agoi40e: remove i40e_status
Jan Sokolowski [Fri, 28 Jul 2023 17:13:36 +0000 (10:13 -0700)]
i40e: remove i40e_status

Replace uses of i40e_status with error codes that are as equivalent as
possible. Remove enum i40e_status as it is no longer needed.

Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20230728171336.2446156-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agotcp: Remove unused function declarations
Yue Haibing [Sat, 29 Jul 2023 12:26:44 +0000 (20:26 +0800)]
tcp: Remove unused function declarations

commit 8a59f9d1e3d4 ("sock: Introduce sk->sk_prot->psock_update_sk_prot()")
left behind the tcp_bpf_get_proto() declaration. The tcp_v4_tw_remember_stamp()
function was removed in commit ccb7c410ddc0 ("timewait_sock: Create and use getpeer op.").
Since commit 686989700cab ("tcp: simplify tcp_mark_skb_lost"), the
tcp_skb_mark_lost_uncond_verify() declaration is not used anymore.

Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20230729122644.10648-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agodevlink: Remove unused extern declaration devlink_port_region_destroy()
Yue Haibing [Fri, 28 Jul 2023 13:21:13 +0000 (21:21 +0800)]
devlink: Remove unused extern declaration devlink_port_region_destroy()

devlink_port_region_destroy() is never implemented since
commit 544e7c33ec2f ("net: devlink: Add support for port regions").

Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20230728132113.32888-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agonet: Use sockaddr_storage for getsockopt(SO_PEERNAME).
Kuniyuki Iwashima [Sat, 29 Jul 2023 00:48:13 +0000 (17:48 -0700)]
net: Use sockaddr_storage for getsockopt(SO_PEERNAME).

Commit df8fc4e934c1 ("kbuild: Enable -fstrict-flex-arrays=3") started
applying strict rules to standard string functions.

It does not work well with conventional socket code around each protocol-
specific sockaddr_XXX struct, which is cast from sockaddr_storage and has
a bigger size than fortified functions expect.  See these commits:

 commit 06d4c8a80836 ("af_unix: Fix fortify_panic() in unix_bind_bsd().")
 commit ecb4534b6a1c ("af_unix: Terminate sun_path when bind()ing pathname socket.")
 commit a0ade8404c3b ("af_packet: Fix warning of fortified memcpy() in packet_getname().")

We must cast the protocol-specific address back to sockaddr_storage
to call such functions.

However, in the case of getsockopt(SO_PEERNAME), the rationale is a bit
unclear as the buffer is defined by char[128] which is the same size as
sockaddr_storage.

Let's use sockaddr_storage explicitly.
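
The size equivalence mentioned above can be checked from userspace. The
sketch below is not the kernel change itself: store_addr() is a
hypothetical helper, and the 128-byte size is asserted for Linux libc
headers only.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* On Linux, the generic storage type is as large as the old char[128]. */
_Static_assert(sizeof(struct sockaddr_storage) == sizeof(char[128]),
	       "sockaddr_storage is expected to be 128 bytes");

/*
 * Hypothetical helper: copy a protocol-specific address into generic
 * storage. Returns the number of bytes stored, or 0 if it cannot fit.
 */
size_t store_addr(struct sockaddr_storage *ss, const void *addr, size_t len)
{
	if (len > sizeof(*ss))
		return 0;

	memset(ss, 0, sizeof(*ss));
	memcpy(ss, addr, len);
	return len;
}
```

Declaring the buffer with the explicit type documents the intent and
gives size-checking tools the real object size to work with.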

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agonet: flow_dissector: Use 64bits for used_keys
Ratheesh Kannoth [Fri, 28 Jul 2023 23:22:15 +0000 (04:52 +0530)]
net: flow_dissector: Use 64bits for used_keys

As the 32 bits of dissector->used_keys are exhausted,
increase the size to 64 bits.

This is the base change for the ESP/AH flow dissector patch.
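
To illustrate why the wider type matters, here is a userspace sketch of
a 64-bit key bitmap. The helper names are hypothetical; the point is
that key IDs of 32 and above need a 64-bit shift, which a 32-bit mask
cannot represent.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: bit for a key ID in a 64-bit used_keys mask. */
uint64_t key_bit(unsigned int id)
{
	return UINT64_C(1) << id;	/* valid for id 0..63 */
}

/* Mark a key as used in the mask. */
void key_mark_used(uint64_t *used_keys, unsigned int id)
{
	*used_keys |= key_bit(id);
}

/* Test whether a key is marked as used. */
bool key_is_used(uint64_t used_keys, unsigned int id)
{
	return (used_keys & key_bit(id)) != 0;
}
```

Code that previously tested the mask with a 32-bit constant has to
switch to a 64-bit one, otherwise keys above bit 31 are silently lost.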
Please find patch and discussions at
https://lore.kernel.org/netdev/ZMDNjD46BvZ5zp5I@corigine.com/T/#t

Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
Reviewed-by: Petr Machata <petrm@nvidia.com> # for mlxsw
Tested-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Martin Habets <habetsm.xilinx@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agoteam: Remove NULL check before dev_{put, hold}
Yang Li [Thu, 27 Jul 2023 00:57:41 +0000 (08:57 +0800)]
team: Remove NULL check before dev_{put, hold}

dev_{put, hold} call netdev_{put, hold}, which already check for NULL,
so there is no need to check before using dev_{put, hold}.
Remove the check to silence the warning:

./drivers/net/team/team.c:2325:3-10: WARNING: NULL check before dev_{put, hold} functions is not needed.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=5991
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
11 months agonet: ethernet: mtk_eth_soc: enable nft hw flowtable_offload for MT7988 SoC
Lorenzo Bianconi [Thu, 27 Jul 2023 07:07:28 +0000 (09:07 +0200)]
net: ethernet: mtk_eth_soc: enable nft hw flowtable_offload for MT7988 SoC

Enable hw Packet Process Engine (PPE) for MT7988 SoC.

Tested-by: Daniel Golle <daniel@makrotopia.org>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/5e86341b0220a49620dadc02d77970de5ded9efc.1690441576.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agonet: ethernet: mtk_eth_soc: enable page_pool support for MT7988 SoC
Lorenzo Bianconi [Thu, 27 Jul 2023 07:02:26 +0000 (09:02 +0200)]
net: ethernet: mtk_eth_soc: enable page_pool support for MT7988 SoC

In order to recycle pages, enable page_pool allocator for MT7988 SoC.

Tested-by: Daniel Golle <daniel@makrotopia.org>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/fd4e8693980e47385a543e7b002eec0b88bd09df.1690440675.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agonet: bcmasp: Clean up redundant dev_err_probe()
Chen Jiahao [Thu, 27 Jul 2023 11:55:51 +0000 (19:55 +0800)]
net: bcmasp: Clean up redundant dev_err_probe()

According to the definition of platform_get_irq(), the return value is
already checked there and an error message is already printed via
dev_err_probe() if ret < 0. Calling dev_err_probe() one more time
outside platform_get_irq() is therefore redundant.

Remove the dev_err_probe() call outside platform_get_irq() to clean
this up.

Signed-off-by: Chen Jiahao <chenjiahao16@huawei.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Justin Chen <justin.chen@broadcom.com>
Link: https://lore.kernel.org/r/20230727115551.2655840-1-chenjiahao16@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agobonding: 3ad: Remove unused declaration bond_3ad_update_lacp_active()
YueHaibing [Wed, 26 Jul 2023 14:38:16 +0000 (22:38 +0800)]
bonding: 3ad: Remove unused declaration bond_3ad_update_lacp_active()

This is not used since commit 3a755cd8b7c6 ("bonding: add new option lacp_active")

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/20230726143816.15280-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agoMerge branch 'r8152-reduce-control-transfer'
Jakub Kicinski [Sat, 29 Jul 2023 01:01:24 +0000 (18:01 -0700)]
Merge branch 'r8152-reduce-control-transfer'

Hayes Wang says:

====================
r8152: reduce control transfer

The two patches reduce the number of control transfers when
accessing the registers in bulk.
====================

Link: https://lore.kernel.org/r/20230726030808.9093-417-nic_swsd@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agor8152: set bp in bulk
Hayes Wang [Wed, 26 Jul 2023 03:08:08 +0000 (11:08 +0800)]
r8152: set bp in bulk

PLA_BP_0 ~ PLA_BP_15 (0xfc28 ~ 0xfc46) are contiguous registers, so we
can combine the control transfers into one control transfer.
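
The address arithmetic behind this can be checked with a short sketch.
Only PLA_BP_0 and the address range come from the text; the 2-byte
register width is inferred from the range (16 registers from 0xfc28 to
0xfc46), and the helper names are hypothetical.

```c
#include <stdint.h>

enum {
	PLA_BP_0 = 0xfc28,	/* first register, from the commit text */
	BP_COUNT = 16,		/* PLA_BP_0 .. PLA_BP_15 */
	BP_WIDTH = 2,		/* bytes per register (inferred) */
};

/* Address of PLA_BP_<i>, assuming evenly spaced registers. */
uint16_t bp_addr(unsigned int i)
{
	return (uint16_t)(PLA_BP_0 + BP_WIDTH * i);
}

/* Length of one bulk write covering all the registers. */
unsigned int bp_bulk_len(void)
{
	return BP_COUNT * BP_WIDTH;
}
```

Under these assumptions, a single 32-byte write starting at PLA_BP_0
replaces 16 separate register writes.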

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Link: https://lore.kernel.org/r/20230726030808.9093-419-nic_swsd@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agor8152: adjust generic_ocp_write function
Hayes Wang [Wed, 26 Jul 2023 03:08:07 +0000 (11:08 +0800)]
r8152: adjust generic_ocp_write function

Reduce the number of control transfers when all bytes of the first or
the last DWORD are written.

The original method is to split the control transfer into three parts
(the first DWORD, middle continuous data, and the last DWORD). However,
they could be combined if all bytes of the first DWORD or last DWORD
are written. That is, the first DWORD or the last DWORD could be combined
with the middle continuous data, if the byte_en is 0xff.
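
The effect on the number of transfers can be modelled as below. This is
a simplified sketch, not the driver function: count_transfers() is a
hypothetical name, the layout of byteen (low nibble for the first DWORD,
high nibble for the last DWORD) is an assumption, size must be a
non-zero multiple of 4, and any per-transfer size limit is ignored.

```c
#include <stdint.h>

/*
 * Hypothetical model: count the control transfers needed to write
 * `size` bytes with the given byte enables.
 */
unsigned int count_transfers(unsigned int size, uint8_t byteen)
{
	uint8_t start = byteen & 0x0f;	/* assumed: first DWORD enables */
	uint8_t end = byteen & 0xf0;	/* assumed: last DWORD enables */
	unsigned int transfers = 0;
	uint8_t en;

	/* Split off the first DWORD only if it is partially written. */
	en = (uint8_t)(start | (start << 4));
	if (en != 0xff) {
		transfers++;
		size -= 4;
	}

	if (size) {
		/* Split off the last DWORD only if it is partially written. */
		en = (uint8_t)(end | (end >> 4));
		if (en != 0xff) {
			transfers++;
			size -= 4;
		}
		/* Whatever is left goes out as one combined transfer. */
		if (size)
			transfers++;
	}

	return transfers;
}
```

With both DWORDs fully enabled, a 12-byte write needs one transfer in
this model instead of three.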

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Link: https://lore.kernel.org/r/20230726030808.9093-418-nic_swsd@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agonet: ethernet: slicoss: remove redundant increment of pointer data
Colin Ian King [Wed, 26 Jul 2023 16:45:22 +0000 (17:45 +0100)]
net: ethernet: slicoss: remove redundant increment of pointer data

The pointer data is being incremented but this change to the pointer
is not used afterwards. The increment is redundant and can be removed.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Acked-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Link: https://lore.kernel.org/r/20230726164522.369206-1-colin.i.king@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agoMerge branch 'in-kernel-support-for-the-tls-alert-protocol'
Jakub Kicinski [Fri, 28 Jul 2023 21:08:01 +0000 (14:08 -0700)]
Merge branch 'in-kernel-support-for-the-tls-alert-protocol'

Chuck Lever says:

====================
In-kernel support for the TLS Alert protocol

IMO the kernel doesn't need user space (ie, tlshd) to handle the TLS
Alert protocol. Instead, a set of small helper functions can be used
to handle sending and receiving TLS Alerts for in-kernel TLS
consumers.
====================

Merged on top of a tag in case it's needed in the NFS tree.

Link: https://lore.kernel.org/r/169047923706.5241.1181144206068116926.stgit@oracle-102.nfsv4bat.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agonet/handshake: Trace events for TLS Alert helpers
Chuck Lever [Thu, 27 Jul 2023 17:38:04 +0000 (13:38 -0400)]
net/handshake: Trace events for TLS Alert helpers

Add observability for the new TLS Alert infrastructure.

Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://lore.kernel.org/r/169047947409.5241.14548832149596892717.stgit@oracle-102.nfsv4bat.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agoSUNRPC: Use new helpers to handle TLS Alerts
Chuck Lever [Thu, 27 Jul 2023 17:37:37 +0000 (13:37 -0400)]
SUNRPC: Use new helpers to handle TLS Alerts

Use the helpers to parse the level and description fields in
incoming alerts. "Warning" alerts are discarded, and "fatal"
alerts mean the session is no longer valid.
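
A minimal userspace sketch of this policy is shown below. It is not the
kernel helper API: classify_alert() and the enum names are hypothetical.
The only facts relied on are from the TLS specification: an alert
carries a one-byte level followed by a one-byte description, with level
1 meaning warning and level 2 meaning fatal.

```c
#include <stddef.h>
#include <stdint.h>

enum { ALERT_LEVEL_WARNING = 1, ALERT_LEVEL_FATAL = 2 };

enum alert_action {
	ALERT_IGNORE,		/* warning: discard */
	ALERT_SESSION_DEAD,	/* fatal: session is no longer valid */
	ALERT_MALFORMED,	/* too short or unknown level */
};

/* Hypothetical classifier for the 2-byte alert payload. */
enum alert_action classify_alert(const uint8_t *payload, size_t len)
{
	if (len < 2)
		return ALERT_MALFORMED;

	switch (payload[0]) {
	case ALERT_LEVEL_WARNING:
		return ALERT_IGNORE;
	case ALERT_LEVEL_FATAL:
		return ALERT_SESSION_DEAD;
	default:
		return ALERT_MALFORMED;
	}
}
```

How a malformed alert is handled is a design choice of this sketch; the
text above only covers the warning and fatal cases.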

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://lore.kernel.org/r/169047944747.5241.1974889594004407123.stgit@oracle-102.nfsv4bat.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agonet/handshake: Add helpers for parsing incoming TLS Alerts
Chuck Lever [Thu, 27 Jul 2023 17:37:10 +0000 (13:37 -0400)]
net/handshake: Add helpers for parsing incoming TLS Alerts

Kernel TLS consumers can replace common TLS Alert parsing code with
these helpers.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://lore.kernel.org/r/169047942074.5241.13791647439480672048.stgit@oracle-102.nfsv4bat.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agoSUNRPC: Send TLS Closure alerts before closing a TCP socket
Chuck Lever [Thu, 27 Jul 2023 17:36:44 +0000 (13:36 -0400)]
SUNRPC: Send TLS Closure alerts before closing a TCP socket

Before closing a TCP connection, the TLS protocol wants peers to
send session close Alert notifications. Add those in both the RPC
client and server.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://lore.kernel.org/r/169047939404.5241.14392506226409865832.stgit@oracle-102.nfsv4bat.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agonet/handshake: Add API for sending TLS Closure alerts
Chuck Lever [Thu, 27 Jul 2023 17:36:17 +0000 (13:36 -0400)]
net/handshake: Add API for sending TLS Closure alerts

This helper sends an alert only if a TLS session was established.
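
The "only if a session was established" rule can be sketched as follows.
build_close_notify() is a hypothetical userspace helper, not the kernel
API. It assumes the closure alert is sent at warning level (1) with the
close_notify description (0), as defined by the TLS specification.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum {
	ALERT_LEVEL_WARNING = 1,
	ALERT_DESC_CLOSE_NOTIFY = 0,
};

/*
 * Hypothetical helper: fill `out` with a close_notify alert payload.
 * Returns the payload length, or 0 when no TLS session exists and
 * nothing should be sent.
 */
size_t build_close_notify(bool tls_established, uint8_t out[2])
{
	if (!tls_established)
		return 0;

	out[0] = ALERT_LEVEL_WARNING;
	out[1] = ALERT_DESC_CLOSE_NOTIFY;
	return 2;
}
```

The caller would send the two bytes as a TLS record of the alert content
type before closing the underlying TCP socket.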

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://lore.kernel.org/r/169047936730.5241.618595693821012638.stgit@oracle-102.nfsv4bat.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agonet/tls: Add TLS Alert definitions
Chuck Lever [Thu, 27 Jul 2023 17:35:50 +0000 (13:35 -0400)]
net/tls: Add TLS Alert definitions

I'm about to add support for kernel handshake API consumers to send
TLS Alerts, so introduce the needed protocol definitions in the new
header tls_prot.h.

This presages support for Closure alerts. Also, support for alerts
is a prerequisite for handling session re-keying, where one peer will
signal the need for a re-key by sending a TLS Alert.

Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://lore.kernel.org/r/169047934064.5241.8377890858495063518.stgit@oracle-102.nfsv4bat.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agonet/tls: Move TLS protocol elements to a separate header
Chuck Lever [Thu, 27 Jul 2023 17:35:23 +0000 (13:35 -0400)]
net/tls: Move TLS protocol elements to a separate header

Kernel TLS consumers will need definitions of various parts of the
TLS protocol, but often do not need the function declarations and
other infrastructure provided in <net/tls.h>.

Break out existing standardized protocol elements into a separate
header, and make room for a few more elements in subsequent patches.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://lore.kernel.org/r/169047931374.5241.7713175865185969309.stgit@oracle-102.nfsv4bat.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agoocteontx2-af: Initialize 'cntr_val' to fix uninitialized symbol error
Suman Ghosh [Thu, 27 Jul 2023 16:31:01 +0000 (22:01 +0530)]
octeontx2-af: Initialize 'cntr_val' to fix uninitialized symbol error

drivers/net/ethernet/marvell/octeontx2/nic/otx2_tc.c:860
otx2_tc_update_mcam_table_del_req()
error: uninitialized symbol 'cntr_val'.

Fixes: ec87f05402f5 ("octeontx2-af: Install TC filter rules in hardware based on priority")
Signed-off-by: Suman Ghosh <sumang@marvell.com>
Link: https://lore.kernel.org/r/20230727163101.2793453-1-sumang@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agoMerge branch 'eth-bnxt-fix-a-couple-of-w-1-c-1-warnings'
Jakub Kicinski [Fri, 28 Jul 2023 20:47:35 +0000 (13:47 -0700)]
Merge branch 'eth-bnxt-fix-a-couple-of-w-1-c-1-warnings'

Jakub Kicinski says:

====================
eth: bnxt: fix a couple of W=1 C=1 warnings

Fix a couple of build warnings.
====================

Link: https://lore.kernel.org/r/20230727190726.1859515-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agoeth: bnxt: fix warning for define in struct_group
Jakub Kicinski [Thu, 27 Jul 2023 19:07:26 +0000 (12:07 -0700)]
eth: bnxt: fix warning for define in struct_group

Fix C=1 warning with sparse 0.6.4:

drivers/net/ethernet/broadcom/bnxt/bnxt.c: note: in included file:
drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.h:30:1: warning: directive in macro's argument list

Don't put defines in a struct_group().

Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/20230727190726.1859515-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agoeth: bnxt: fix one of the W=1 warnings about fortified memcpy()
Jakub Kicinski [Thu, 27 Jul 2023 19:07:25 +0000 (12:07 -0700)]
eth: bnxt: fix one of the W=1 warnings about fortified memcpy()

Fix a W=1 warning with gcc 13.1:

In function ‘fortify_memcpy_chk’,
    inlined from ‘bnxt_hwrm_queue_cos2bw_cfg’ at drivers/net/ethernet/broadcom/bnxt/bnxt_dcb.c:133:3:
include/linux/fortify-string.h:592:25: warning: call to ‘__read_overflow2_field’ declared with attribute warning: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Wattribute-warning]
  592 |                         __read_overflow2_field(q_size_field, size);
      |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The field group is already defined and starts at queue_id:

  struct bnxt_cos2bw_cfg {
          u8 pad[3];
          struct_group_attr(cfg, __packed,
                  u8 queue_id;
                  __le32 min_bw;
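
The idea behind the warning can be shown with a userspace model of a
field group. This is a sketch under assumptions: struct model_cos2bw_cfg
only contains the members quoted above, the union of an anonymous and a
named packed struct imitates what a struct_group-style macro expands to,
and copy_cfg_group() is a hypothetical helper. Copying from the named
group gives memcpy() a source object that really spans all the bytes.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model: the same members, reachable directly or via .cfg */
struct model_cos2bw_cfg {
	uint8_t pad[3];
	union {
		struct {
			uint8_t queue_id;
			uint32_t min_bw;
		} __attribute__((packed));
		struct {
			uint8_t queue_id;
			uint32_t min_bw;
		} __attribute__((packed)) cfg;
	};
};

/*
 * Copy the whole group: the source is the group object itself, not
 * the address of its first (1-byte) member.
 */
void copy_cfg_group(void *dst, const struct model_cos2bw_cfg *src)
{
	memcpy(dst, &src->cfg, sizeof(src->cfg));
}
```

Using &src->queue_id as the source would copy the same bytes at run
time, but a fortified memcpy() would see a 1-byte field being read past
its end.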

Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/20230727190726.1859515-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agoMerge tag 'mlx5-updates-2023-07-24' of git://git.kernel.org/pub/scm/linux/kernel...
Jakub Kicinski [Fri, 28 Jul 2023 20:41:59 +0000 (13:41 -0700)]
Merge tag 'mlx5-updates-2023-07-24' of git://git./linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2023-07-24

1) Generalize devcom implementation to be independent of number of ports
   or device's GUID.

2) Save memory on command interface statistics.

3) General code cleanups

* tag 'mlx5-updates-2023-07-24' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5: Give esw_offloads_load/unload_rep() "mlx5_" prefix
  net/mlx5: Make mlx5_eswitch_load/unload_vport() static
  net/mlx5: Make mlx5_esw_offloads_rep_load/unload() static
  net/mlx5: Remove pointless devlink_rate checks
  net/mlx5: Don't check vport->enabled in port ops
  net/mlx5e: Make flow classification filters static
  net/mlx5e: Remove duplicate code for user flow
  net/mlx5: Allocate command stats with xarray
  net/mlx5: split mlx5_cmd_init() to probe and reload routines
  net/mlx5: Remove redundant cmdif revision check
  net/mlx5: Re-organize mlx5_cmd struct
  net/mlx5e: E-Switch, Allow devcom initialization on more vports
  net/mlx5e: E-Switch, Register devcom device with switch id key
  net/mlx5: Devcom, Infrastructure changes
  net/mlx5: Use shared code for checking lag is supported
====================

Link: https://lore.kernel.org/r/20230727183914.69229-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agoMerge branch 'mlxsw-avoid-non-tracker-helpers-when-holding-and-putting-netdevices'
Jakub Kicinski [Fri, 28 Jul 2023 20:38:47 +0000 (13:38 -0700)]
Merge branch 'mlxsw-avoid-non-tracker-helpers-when-holding-and-putting-netdevices'

Petr Machata says:

====================
mlxsw: Avoid non-tracker helpers when holding and putting netdevices

Using the tracking helpers, netdev_hold() and netdev_put(), makes it easier
to debug netdevice refcount imbalances when CONFIG_NET_DEV_REFCNT_TRACKER
is enabled. For example, the following traceback shows the callpath to the
point of an outstanding hold that was never put:

    unregister_netdevice: waiting for swp3 to become free. Usage count = 6
    ref_tracker: eth%d@ffff888123c9a580 has 1/5 users at
        mlxsw_sp_switchdev_event+0x6bd/0xcc0 [mlxsw_spectrum]
        notifier_call_chain+0xbf/0x3b0
        atomic_notifier_call_chain+0x78/0x200
        br_switchdev_fdb_notify+0x25f/0x2c0 [bridge]
        fdb_notify+0x16a/0x1a0 [bridge]
        [...]

In this patchset, get rid of all non-ref-tracking helpers in mlxsw.

- Patch #1 drops two functions that are not used anymore, but contain
  dev_hold() / dev_put() calls.

- Patch #2 avoids taking a reference in one function which is called
  under RTNL.

- The remaining patches convert individual hold/put sites one by one
  from trackerless to tracker-enabled.
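
Why tracked references are easier to debug can be shown with a small
userspace model. It is not the kernel's ref_tracker: all model_ names
are hypothetical, and a fixed-size table stands in for the real
bookkeeping. Each hold records where it was taken, so an outstanding
reference can be attributed to a call site instead of only showing up
as a non-zero count.

```c
#include <stddef.h>

#define MODEL_MAX_TRACKERS 8

/* One tracker per hold site, owned by the caller. */
struct model_tracker {
	const char *site;
};

struct model_netdev {
	int refcnt;
	const struct model_tracker *held[MODEL_MAX_TRACKERS];
};

/* Take a reference and remember which site took it. */
int model_netdev_hold(struct model_netdev *dev, struct model_tracker *t,
		      const char *site)
{
	for (size_t i = 0; i < MODEL_MAX_TRACKERS; i++) {
		if (!dev->held[i]) {
			t->site = site;
			dev->held[i] = t;
			dev->refcnt++;
			return 0;
		}
	}
	return -1;	/* table full in this model */
}

/* Drop the reference identified by its tracker. */
void model_netdev_put(struct model_netdev *dev, struct model_tracker *t)
{
	for (size_t i = 0; i < MODEL_MAX_TRACKERS; i++) {
		if (dev->held[i] == t) {
			dev->held[i] = NULL;
			dev->refcnt--;
			return;
		}
	}
}

/* Name the site of the first reference that was never put, if any. */
const char *model_first_leak(const struct model_netdev *dev)
{
	for (size_t i = 0; i < MODEL_MAX_TRACKERS; i++)
		if (dev->held[i])
			return dev->held[i]->site;
	return NULL;
}
```

With an untracked counter, the same leak would only be visible as a
usage count that never drops to zero.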

Suggested-by: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/netdev/4c056da27c19d95ffeaba5acf1427ecadfc3f94c.camel@redhat.com/
====================

Link: https://lore.kernel.org/r/cover.1690471774.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months agomlxsw: spectrum_router: IPv6 events: Use tracker helpers to hold & put netdevices
Petr Machata [Thu, 27 Jul 2023 15:59:25 +0000 (17:59 +0200)]
mlxsw: spectrum_router: IPv6 events: Use tracker helpers to hold & put netdevices

Using the tracking helpers makes it easier to debug netdevice refcount
imbalances when CONFIG_NET_DEV_REFCNT_TRACKER is enabled.

Convert dev_hold() / dev_put() to netdev_hold() / netdev_put() in the
router code that deals with IPv6 address events.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/f0af6ad4722b4ca6e598fd4fda8311a3041651ec.1690471775.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months ago mlxsw: spectrum_router: RIF: Use tracker helpers to hold & put netdevices
Petr Machata [Thu, 27 Jul 2023 15:59:24 +0000 (17:59 +0200)]
mlxsw: spectrum_router: RIF: Use tracker helpers to hold & put netdevices

Using the tracking helpers makes it easier to debug netdevice refcount
imbalances when CONFIG_NET_DEV_REFCNT_TRACKER is enabled.

Convert dev_hold() / dev_put() to netdev_hold() / netdev_put() in the
router code that deals with RIF allocation.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/8b7701a7b439ac268e4be4040eff99d01e27ae47.1690471775.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months ago mlxsw: spectrum_router: hw_stats: Use tracker helpers to hold & put netdevices
Petr Machata [Thu, 27 Jul 2023 15:59:23 +0000 (17:59 +0200)]
mlxsw: spectrum_router: hw_stats: Use tracker helpers to hold & put netdevices

Using the tracking helpers makes it easier to debug netdevice refcount
imbalances when CONFIG_NET_DEV_REFCNT_TRACKER is enabled.

Convert dev_hold() / dev_put() to netdev_hold() / netdev_put() in the
router code that deals with hw_stats events.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/b972314cfef4f4c24e66e60d13cffa5d606d1bf3.1690471774.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months ago mlxsw: spectrum_router: FIB: Use tracker helpers to hold & put netdevices
Petr Machata [Thu, 27 Jul 2023 15:59:22 +0000 (17:59 +0200)]
mlxsw: spectrum_router: FIB: Use tracker helpers to hold & put netdevices

Using the tracking helpers makes it easier to debug netdevice refcount
imbalances when CONFIG_NET_DEV_REFCNT_TRACKER is enabled.

Convert dev_hold() / dev_put() to netdev_hold() / netdev_put() in the
router code that deals with FIB events.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/5221a92e751c40447c55959f622267ccc999ed04.1690471774.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months ago mlxsw: spectrum_switchdev: Use tracker helpers to hold & put netdevices
Petr Machata [Thu, 27 Jul 2023 15:59:21 +0000 (17:59 +0200)]
mlxsw: spectrum_switchdev: Use tracker helpers to hold & put netdevices

Using the tracking helpers makes it easier to debug netdevice refcount
imbalances when CONFIG_NET_DEV_REFCNT_TRACKER is enabled.

Convert dev_hold() / dev_put() to netdev_hold() / netdev_put() in the
switchdev module.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/774c3d7b5b0231f1435df2ec9dd660192e382756.1690471774.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months ago mlxsw: spectrum_nve: Do not take reference when looking up netdevice
Petr Machata [Thu, 27 Jul 2023 15:59:20 +0000 (17:59 +0200)]
mlxsw: spectrum_nve: Do not take reference when looking up netdevice

mlxsw_sp_nve_fid_disable() is always called under RTNL. It is therefore
safe to call __dev_get_by_index() to get the netdevice pointer without
bumping the reference count, because we can be sure the netdevice is not
going away. That then obviates the need to put the netdevice later in the
function.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/341d1046f89d8d839d9d00e4a3d58cdc351e9397.1690471774.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months ago mlxsw: spectrum: Drop unused functions mlxsw_sp_port_lower_dev_hold/_put()
Petr Machata [Thu, 27 Jul 2023 15:59:19 +0000 (17:59 +0200)]
mlxsw: spectrum: Drop unused functions mlxsw_sp_port_lower_dev_hold/_put()

As of commit 151b89f6025a ("mlxsw: spectrum_router: Reuse work neighbor
initialization in work scheduler"), the functions
mlxsw_sp_port_lower_dev_hold() and mlxsw_sp_port_dev_put() have no users.
Drop them.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/d0adcd7cb4ea19416294a0f861100edba84c9f36.1690471774.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months ago net: change accept_ra_min_rtr_lft to affect all RA lifetimes
Patrick Rohr [Wed, 26 Jul 2023 23:07:01 +0000 (16:07 -0700)]
net: change accept_ra_min_rtr_lft to affect all RA lifetimes

accept_ra_min_rtr_lft only considered the lifetime of the default route
and discarded entire RAs accordingly.

This change renames accept_ra_min_rtr_lft to accept_ra_min_lft, and
applies the value to individual RA sections; in particular, router
lifetime, PIO preferred lifetime, and RIO lifetime. If any of those
lifetimes are lower than the configured value, the specific RA section
is ignored.
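
The per-section behaviour described above can be sketched in Python as
follows. This is only a model under stated assumptions, not the kernel
implementation: the function filter_ra(), the dictionary layout and the
example prefixes are hypothetical, and any special handling of zero
lifetimes is deliberately not modelled.

```python
def filter_ra(ra, min_lft):
    """Return the parts of a Router Advertisement that pass the threshold.

    Each section is judged on its own lifetime (in seconds); a section
    whose lifetime is below min_lft is ignored, the rest is kept.
    """
    kept = {"router_lifetime": None, "pios": [], "rios": []}

    # Default-route section: judged by the router lifetime.
    if ra["router_lifetime"] >= min_lft:
        kept["router_lifetime"] = ra["router_lifetime"]

    # Prefix Information Options: judged by the preferred lifetime.
    kept["pios"] = [p for p in ra["pios"] if p["preferred_lft"] >= min_lft]

    # Route Information Options: judged by the route lifetime.
    kept["rios"] = [r for r in ra["rios"] if r["lifetime"] >= min_lft]

    return kept


ra = {
    "router_lifetime": 30,
    "pios": [{"prefix": "2001:db8:1::/64", "preferred_lft": 7200}],
    "rios": [{"prefix": "2001:db8:2::/48", "lifetime": 20}],
}
result = filter_ra(ra, min_lft=180)
print(result)
```

In this example the 30s default route and the 20s RIO are ignored while
the PIO is still processed, which is the difference from dropping the
whole RA based on the router lifetime alone.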

In order for the sysctl to be useful to Android, it should really apply
to all lifetimes in the RA, since that is what determines the minimum
frequency at which RAs must be processed by the kernel. Android uses
hardware offloads to drop RAs for a fraction of the minimum of all
lifetimes present in the RA (some networks have very frequent RAs (5s)
with high lifetimes (2h)). Despite this, we have encountered networks
that set the router lifetime to 30s which results in very frequent CPU
wakeups. Instead of disabling IPv6 (and dropping IPv6 ethertype in the
WiFi firmware) entirely on such networks, it seems better to ignore the
misconfigured routers while still processing RAs from other IPv6 routers
on the same network (i.e. to support IoT applications).

The previous implementation dropped the entire RA based on router
lifetime. This turned out to be hard to expand to the other lifetimes
present in the RA in a consistent manner; dropping the entire RA based
on RIO/PIO lifetimes would essentially require parsing the whole thing
twice.

Fixes: 1671bcfd76fd ("net: add sysctl accept_ra_min_rtr_lft")
Cc: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: Patrick Rohr <prohr@google.com>
Reviewed-by: Maciej Żenczykowski <maze@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20230726230701.919212-1-prohr@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months ago Merge branch 'net-store-netdevs-in-an-xarray'
Jakub Kicinski [Fri, 28 Jul 2023 18:36:00 +0000 (11:36 -0700)]
Merge branch 'net-store-netdevs-in-an-xarray'

Jakub Kicinski says:

====================
net: store netdevs in an xarray

One of the more annoying developer experience gaps we have in netlink
is iterating over netdevs. It's painful. Add an xarray to make
it trivial.

v1: https://lore.kernel.org/all/20230722014237.4078962-1-kuba@kernel.org/
====================

Link: https://lore.kernel.org/r/20230726185530.2247698-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months ago net: convert some netlink netdev iterators to depend on the xarray
Jakub Kicinski [Wed, 26 Jul 2023 18:55:30 +0000 (11:55 -0700)]
net: convert some netlink netdev iterators to depend on the xarray

Reap the benefits of easier iteration thanks to the xarray.
Convert just the genetlink ones, those are easier to test.

Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20230726185530.2247698-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months ago net: store netdevs in an xarray
Jakub Kicinski [Wed, 26 Jul 2023 18:55:29 +0000 (11:55 -0700)]
net: store netdevs in an xarray

Iterating over the netdev hash table for netlink dumps is hard.
Dumps are done in "chunks" so we need to save the position
after each chunk, so we know where to restart from. Because
netdevs are stored in a hash table we remember which bucket
we were in and how many devices we dumped.

Since we don't hold any locks across the "chunks" - devices may
come and go while we're dumping. If that happens we may miss
a device (if device is deleted from the bucket we were in).
We indicate to user space that this may have happened by setting
NLM_F_DUMP_INTR. User space is supposed to dump again (I think)
if it sees that. Somehow I doubt most user space gets this right..

To illustrate let's look at an example:

               System state:
  start:       # [A, B, C]
  del:  B      # [A, C]

with the hash table we may dump [A, B], missing C completely even
tho it existed both before and after the "del B".
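
The miss can be reproduced with a small Python model of a chunked dump
that resumes from a saved (bucket, number of entries already dumped)
position. This is a sketch for illustration only; the function
dump_chunk_hash() and the single-bucket layout are assumptions, not the
kernel's dump code.

```python
def dump_chunk_hash(buckets, state, chunk):
    """Dump up to `chunk` devices, resuming from state = (bucket, skip).

    `skip` is how many entries of the current bucket were already dumped.
    Returns the dumped names and the state to resume from.
    """
    bucket, skip = state
    out = []
    while bucket < len(buckets):
        entries = buckets[bucket]
        for idx in range(skip, len(entries)):
            if len(out) == chunk:
                return out, (bucket, idx)
            out.append(entries[idx])
        bucket, skip = bucket + 1, 0
    return out, (bucket, 0)


buckets = [["A", "B", "C"]]          # all three devices hash to one bucket
first, state = dump_chunk_hash(buckets, (0, 0), chunk=2)
buckets[0].remove("B")               # "del B" between the two chunks
second, state = dump_chunk_hash(buckets, state, chunk=2)
dumped = first + second
print(dumped)
```

After the deletion C has moved into a slot the dump believes it already
covered, so the second chunk is empty and C never appears.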

Add an xarray and use it to allocate ifindexes. This way we
can iterate ifindexes in order, without the worry that we'll
skip one. We may still generate a dump of a state which "never
existed", for example for a set of values and sequence of ops:

               System state:
  start:       # [A, B]
  add:  C      # [A, C, B]
  del:  B      # [A, C]

we may generate a dump of [A], if C got an index between A and B.
System has never been in such state. But I'm 90% sure that's perfectly
fine, important part is that we can't _miss_ devices which exist before
and after. User space which wants to mirror kernel's state subscribes
to notifications and does periodic dumps so it will know that C exists
from the notification about its creation or from the next dump
(next dump is _guaranteed_ to include C, if it doesn't get removed).
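
For comparison, here is a Python sketch of a dump that resumes from the
last ifindex it reported instead of a position inside a bucket. An
ordinary dict iterated in sorted key order stands in for the xarray;
the function dump_chunk_ordered() and the ifindex values are
hypothetical.

```python
def dump_chunk_ordered(devs, after, chunk):
    """Dump up to `chunk` devices with ifindex greater than `after`.

    `devs` maps ifindex -> name. Returns the dumped names and the last
    ifindex reported, which is the position to resume from.
    """
    pending = [ifindex for ifindex in sorted(devs) if ifindex > after]
    batch = pending[:chunk]
    names = [devs[ifindex] for ifindex in batch]
    return names, (batch[-1] if batch else after)


devs = {1: "A", 2: "B", 3: "C"}
first, pos = dump_chunk_ordered(devs, after=0, chunk=2)
del devs[2]                          # "del B" between the two chunks
second, pos = dump_chunk_ordered(devs, after=pos, chunk=2)
dumped = first + second
print(dumped)
```

Because the resume point is an ifindex rather than a count, removing B
does not shift C, and C is still reported by the second chunk.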

To avoid any perf regressions keep the hash table for now. Most
net namespaces have very few devices and microbenchmarking 1M lookups
on Skylake I get the following results (loopback is not counted in
the number of devs):

 #devs | hash |  xa  | delta
    2  | 18.3 | 20.1 | + 9.8%
   16  | 18.3 | 20.1 | + 9.5%
   64  | 18.3 | 26.3 | +43.8%
  128  | 20.4 | 26.3 | +28.6%
  256  | 20.0 | 26.4 | +32.1%
 1024  | 26.6 | 26.7 | + 0.2%
 8192  |541.3 | 33.5 | -93.8%

No surprises since the hash table has 256 entries.
The microbenchmark scans indexes in order, if the pattern is more
random xa starts to win at 512 devices already. But that's a lot
of devices, in practice.

Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20230726185530.2247698-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months ago Merge branch 'ynl-couple-of-unrelated-fixes'
Jakub Kicinski [Fri, 28 Jul 2023 16:33:14 +0000 (09:33 -0700)]
Merge branch 'ynl-couple-of-unrelated-fixes'

Stanislav Fomichev says:

====================
ynl: couple of unrelated fixes

- spelling of xdp-features
- s/xdp_zc_max_segs/xdp-zc-max-segs/
- expose xdp-zc-max-segs
- add /* private: */
- regenerate headers
- print xdp_zc_max_segs from sample
====================

Link: https://lore.kernel.org/r/20230727163001.3952878-1-sdf@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months ago ynl: print xdp-zc-max-segs in the sample
Stanislav Fomichev [Thu, 27 Jul 2023 16:30:01 +0000 (09:30 -0700)]
ynl: print xdp-zc-max-segs in the sample

Technically we don't have to keep extending the sample, but it
feels useful to run these tools locally to confirm everything
is working.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230727163001.3952878-5-sdf@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months ago ynl: regenerate all headers
Stanislav Fomichev [Thu, 27 Jul 2023 16:30:00 +0000 (09:30 -0700)]
ynl: regenerate all headers

Also add support to pass topdir to ynl-regen.sh (Jakub) and call
it from the makefile to update the UAPI headers.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Co-developed-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230727163001.3952878-4-sdf@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months ago ynl: mark max/mask as private for kdoc
Stanislav Fomichev [Thu, 27 Jul 2023 16:29:59 +0000 (09:29 -0700)]
ynl: mark max/mask as private for kdoc

Simon mentioned in another thread that it makes kdoc happy
and Jakub confirms that commit e27cb89a22ad ("scripts: kernel-doc: support
private / public marking for enums") actually added the needed
support.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230727163001.3952878-3-sdf@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months ago ynl: expose xdp-zc-max-segs
Stanislav Fomichev [Thu, 27 Jul 2023 16:29:58 +0000 (09:29 -0700)]
ynl: expose xdp-zc-max-segs

Also rename it to dashes, to match the rest. And fix unrelated
spelling error while we're at it.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230727163001.3952878-2-sdf@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 months ago Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/nex
David S. Miller [Fri, 28 Jul 2023 10:03:57 +0000 (11:03 +0100)]
Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue

Tony Nguyen says:

====================
ice: Implement support for SRIOV + LAG

Dave Ertman says:

Implement support for SRIOV VF's on interfaces that are in an
aggregate interface.

The first interface added into the aggregate will be flagged as
the primary interface, and this primary interface will be
responsible for managing the VF's resources.  VF's created on the
primary are the only VFs that will be supported on the aggregate.
Only Active-Backup mode will be supported and only aggregates whose
primary interface is in switchdev mode will be supported.

The ice-lag DDP must be loaded to support this feature.

Additional restrictions on what interfaces can be added to the aggregate
and still support SRIOV VFs are:
- interfaces have to all be on the same physical NIC
- all interfaces have to have the same QoS settings
- interfaces have to have the FW LLDP agent disabled
- only the primary interface is to be put into switchdev mode
- no more than two interfaces in the aggregate
---
v2:
- Move NULL check for q_ctx in ice_lag_qbuf_recfg() earlier (patch 6)

v1: https://lore.kernel.org/netdev/20230726182141.3797928-1-anthony.l.nguyen@intel.com/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>