platform/kernel/linux-rpi.git
15 months ago net: sparx5: convert to ndo_hwtstamp_get() and ndo_hwtstamp_set()
Vladimir Oltean [Tue, 1 Aug 2023 14:28:20 +0000 (17:28 +0300)]
net: sparx5: convert to ndo_hwtstamp_get() and ndo_hwtstamp_set()

The hardware timestamping through ndo_eth_ioctl() is going away.
Convert the sparx5 driver to the new API before that can be removed.

After removing the timestamping logic from sparx5_port_ioctl(), the rest
is equivalent to phy_do_ioctl().

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Link: https://lore.kernel.org/r/20230801142824.1772134-9-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months ago net: fec: delete fec_ptp_disable_hwts()
Vladimir Oltean [Tue, 1 Aug 2023 14:28:19 +0000 (17:28 +0300)]
net: fec: delete fec_ptp_disable_hwts()

Commit 340746398b67 ("net: fec: fix hardware time stamping by external
devices") was overly cautious with calling fec_ptp_disable_hwts() when
cmd == SIOCSHWTSTAMP and use_fec_hwts == false, because use_fec_hwts is
based on a runtime invariant (phy_has_hwtstamp()). Thus, if use_fec_hwts
is false, then fep->hwts_tx_en and fep->hwts_rx_en cannot be changed at
runtime; their values depend on the initial memory allocation, which
already sets them to zeroes.

If the core ever gains support for switching timestamping layers,
it will arrange for a more organized calling convention and disable
timestamping in the previous layer as a first step. This means that the
code in the FEC driver is not necessary in any case.

The purpose of this change is to arrange the phy_has_hwtstamp() code in
a way in which it can be refactored away into generic logic.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Wei Fang <wei.fang@nxp.com>
Link: https://lore.kernel.org/r/20230801142824.1772134-8-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months ago net: fec: convert to ndo_hwtstamp_get() and ndo_hwtstamp_set()
Vladimir Oltean [Tue, 1 Aug 2023 14:28:18 +0000 (17:28 +0300)]
net: fec: convert to ndo_hwtstamp_get() and ndo_hwtstamp_set()

The hardware timestamping through ndo_eth_ioctl() is going away.
Convert the FEC driver to the new API before that can be removed.

After removing the timestamping logic from fec_enet_ioctl(), the rest
is equivalent to phy_do_ioctl_running().

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Wei Fang <wei.fang@nxp.com>
Link: https://lore.kernel.org/r/20230801142824.1772134-7-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months ago net: bonding: convert to ndo_hwtstamp_get() / ndo_hwtstamp_set()
Maxim Georgiev [Tue, 1 Aug 2023 14:28:17 +0000 (17:28 +0300)]
net: bonding: convert to ndo_hwtstamp_get() / ndo_hwtstamp_set()

bonding is one of the stackable net devices which pass the hardware
timestamping ops to the real device through ndo_eth_ioctl(). This
prevents converting any device driver to the new hwtimestamping API
without regressions.

Remove that limitation in bonding by using the newly introduced helpers
for timestamping through lower devices, that handle both the new and the
old driver API.

Signed-off-by: Maxim Georgiev <glipus@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20230801142824.1772134-6-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months ago net: macvlan: convert to ndo_hwtstamp_get() / ndo_hwtstamp_set()
Maxim Georgiev [Tue, 1 Aug 2023 14:28:16 +0000 (17:28 +0300)]
net: macvlan: convert to ndo_hwtstamp_get() / ndo_hwtstamp_set()

macvlan is one of the stackable net devices which pass the hardware
timestamping ops to the real device through ndo_eth_ioctl(). This
prevents converting any device driver to the new hwtimestamping API
without regressions.

Remove that limitation in macvlan by using the newly introduced helpers
for timestamping through lower devices, that handle both the new and the
old driver API.

macvlan only implements ndo_eth_ioctl() for these 2 operations, so
delete that method.

Signed-off-by: Maxim Georgiev <glipus@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20230801142824.1772134-5-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months ago net: vlan: convert to ndo_hwtstamp_get() / ndo_hwtstamp_set()
Maxim Georgiev [Tue, 1 Aug 2023 14:28:15 +0000 (17:28 +0300)]
net: vlan: convert to ndo_hwtstamp_get() / ndo_hwtstamp_set()

8021q is one of the stackable net devices which pass the hardware
timestamping ops to the real device through ndo_eth_ioctl(). This
prevents converting any device driver to the new hwtimestamping API
without regressions.

Remove that limitation in the vlan driver by using the newly introduced
helpers for timestamping through lower devices, that handle both the new
and the old driver API.

Signed-off-by: Maxim Georgiev <glipus@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20230801142824.1772134-4-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months ago net: add hwtstamping helpers for stackable net devices
Maxim Georgiev [Tue, 1 Aug 2023 14:28:14 +0000 (17:28 +0300)]
net: add hwtstamping helpers for stackable net devices

The stackable net devices with hwtstamping support (vlan, macvlan,
bonding) only pass the hwtstamping ops to the lower (real) device.

These drivers are the first that need to be converted to the new
timestamping API, because if they aren't prepared to handle that,
then no real device driver can be converted to the new API either.

After studying what vlan_dev_ioctl(), macvlan_eth_ioctl() and
bond_eth_ioctl() have in common, here we propose two generic
implementations of ndo_hwtstamp_get() and ndo_hwtstamp_set() which
can be called by those 3 drivers, with "dev" being their lower device.

These helpers cover both cases, when the lower driver is converted to
the new API or unconverted.

We need some hacks in case of an unconverted driver, namely to stuff
some pointers in struct kernel_hwtstamp_config which shouldn't have
been there (since the new API isn't supposed to need it). These will
be removed once all drivers have been converted to the new API.

Signed-off-by: Maxim Georgiev <glipus@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20230801142824.1772134-3-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
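
The fallback described above can be illustrated with a small userspace
model. This is only a sketch: struct toy_dev, TOY_SIOCGHWTSTAMP and the
toy_*() functions are hypothetical stand-ins, not the kernel's types or
helper names. It shows a helper that prefers the lower device's new-style
op and falls back to the legacy ioctl path when the lower driver is
unconverted.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_SIOCGHWTSTAMP 1	/* hypothetical command number */

/* Hypothetical model of a lower (real) device. */
struct toy_dev {
	/* New-style op; NULL if the driver is unconverted. */
	int (*hwtstamp_get)(struct toy_dev *dev, int *rx_filter);
	/* Legacy ioctl-style op; NULL if the driver has none. */
	int (*eth_ioctl)(struct toy_dev *dev, int cmd, int *rx_filter);
	int rx_filter;
};

/* A converted driver answers through the new op. */
static int toy_new_get(struct toy_dev *dev, int *rx_filter)
{
	*rx_filter = dev->rx_filter;
	return 0;
}

/* An unconverted driver still decodes the command itself. */
static int toy_legacy_ioctl(struct toy_dev *dev, int cmd, int *rx_filter)
{
	if (cmd != TOY_SIOCGHWTSTAMP)
		return -EOPNOTSUPP;
	*rx_filter = dev->rx_filter;
	return 0;
}

/* Called by a stacked device (vlan, macvlan, bonding) on its lower device. */
static int toy_hwtstamp_get_lower(struct toy_dev *lower, int *rx_filter)
{
	assert(lower != NULL && rx_filter != NULL);

	if (lower->hwtstamp_get)
		return lower->hwtstamp_get(lower, rx_filter);
	if (lower->eth_ioctl)
		return lower->eth_ioctl(lower, TOY_SIOCGHWTSTAMP, rx_filter);
	return -EOPNOTSUPP;
}
```

With this shape, a stacked driver does not need to know which API its
lower device implements.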
15 months ago net: add NDOs for configuring hardware timestamping
Maxim Georgiev [Tue, 1 Aug 2023 14:28:13 +0000 (17:28 +0300)]
net: add NDOs for configuring hardware timestamping

Current hardware timestamping API for NICs requires implementing
.ndo_eth_ioctl() for SIOCGHWTSTAMP and SIOCSHWTSTAMP.

That API has some boilerplate such as request parameter translation
between user and kernel address spaces, handling possible translation
failures correctly, etc. Since it is the same all across the board, it
would be desirable to handle it through generic code.

Here we introduce .ndo_hwtstamp_get() and .ndo_hwtstamp_set(), which
implement that boilerplate and allow drivers to just act upon requests.

Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Maxim Georgiev <glipus@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Link: https://lore.kernel.org/r/20230801142824.1772134-2-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
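
As a rough illustration of the split described above, the following
userspace sketch separates the generic boilerplate from the driver
callback. All names (toy_netdev, toy_kernel_cfg, toy_generic_hwtstamp_set
and so on) are hypothetical, and a NULL request pointer merely stands in
for a failed user-to-kernel copy; this is not the kernel implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the user-visible and in-kernel structures. */
struct toy_user_cfg { int flags; int tx_type; int rx_filter; };
struct toy_kernel_cfg { int flags; int tx_type; int rx_filter; };

struct toy_netdev;

struct toy_netdev_ops {
	int (*hwtstamp_set)(struct toy_netdev *dev, struct toy_kernel_cfg *cfg);
};

struct toy_netdev {
	const struct toy_netdev_ops *ops;
	struct toy_kernel_cfg active;	/* driver-private state in a real driver */
};

/* Generic layer: translate the request once, then call the driver. */
static int toy_generic_hwtstamp_set(struct toy_netdev *dev,
				    struct toy_user_cfg *ureq)
{
	struct toy_kernel_cfg cfg;
	int err;

	assert(dev != NULL);
	if (!dev->ops || !dev->ops->hwtstamp_set)
		return -EOPNOTSUPP;
	if (!ureq)
		return -EFAULT;		/* models a failed copy from userspace */

	cfg.flags = ureq->flags;
	cfg.tx_type = ureq->tx_type;
	cfg.rx_filter = ureq->rx_filter;

	err = dev->ops->hwtstamp_set(dev, &cfg);
	if (err)
		return err;

	/* Report back what the driver actually configured. */
	ureq->flags = cfg.flags;
	ureq->tx_type = cfg.tx_type;
	ureq->rx_filter = cfg.rx_filter;
	return 0;
}

/* Driver side: no address-space translation, just act on the request. */
static int toy_drv_hwtstamp_set(struct toy_netdev *dev,
				struct toy_kernel_cfg *cfg)
{
	if (cfg->tx_type < 0 || cfg->tx_type > 1)
		return -ERANGE;
	dev->active = *cfg;
	return 0;
}

static const struct toy_netdev_ops toy_ops = {
	.hwtstamp_set = toy_drv_hwtstamp_set,
};
```

The driver callback only validates and applies the configuration; the
translation and its failure handling live in one place.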
15 months ago Merge branch 'net-extend-alloc_skb_with_frags-max-size'
Jakub Kicinski [Thu, 3 Aug 2023 01:44:59 +0000 (18:44 -0700)]
Merge branch 'net-extend-alloc_skb_with_frags-max-size'

Eric Dumazet says:

====================
net: extend alloc_skb_with_frags() max size

alloc_skb_with_frags(), while being able to use high order allocations,
limits the payload size to PAGE_SIZE * MAX_SKB_FRAGS.

Reviewing Tahsin Erdogan's patch [1], it was clear to me we need
to remove this limitation.

[1] https://lore.kernel.org/netdev/20230731230736.109216-1-trdgn@amazon.com/

v2: Addressed Willem's feedback on 1st patch.
====================

Link: https://lore.kernel.org/r/20230801205254.400094-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months ago net: tap: change tap_alloc_skb() to allow bigger paged allocations
Eric Dumazet [Tue, 1 Aug 2023 20:52:54 +0000 (20:52 +0000)]
net: tap: change tap_alloc_skb() to allow bigger paged allocations

tap_alloc_skb() is currently calling sock_alloc_send_pskb()
forcing order-0 page allocations.

Switch to PAGE_ALLOC_COSTLY_ORDER, to increase max size by 8x.

Also add logic to increase the linear part if needed.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tahsin Erdogan <trdgn@amazon.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20230801205254.400094-5-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
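
The "8x" figure can be checked with a little arithmetic. The sketch below
assumes a 4096-byte page, 17 fragments per skb and a costly order of 3;
these TOY_* constants are assumptions chosen for illustration and are
configuration dependent in a real kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for illustration only; real kernels may differ. */
#define TOY_PAGE_SIZE		4096u
#define TOY_MAX_SKB_FRAGS	17u
#define TOY_COSTLY_ORDER	3u

/* Upper bound on paged data when each fragment may be up to 2^max_order pages. */
static size_t toy_max_paged_bytes(unsigned int max_order)
{
	assert(max_order < 16);
	return ((size_t)TOY_PAGE_SIZE << max_order) * TOY_MAX_SKB_FRAGS;
}
```

Under these assumptions, order-0 fragments cap the paged part at 69632
bytes, while order-3 fragments raise the cap to 557056 bytes, which is 8
times larger.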
15 months ago net/packet: change packet_alloc_skb() to allow bigger paged allocations
Eric Dumazet [Tue, 1 Aug 2023 20:52:53 +0000 (20:52 +0000)]
net/packet: change packet_alloc_skb() to allow bigger paged allocations

packet_alloc_skb() is currently calling sock_alloc_send_pskb()
forcing order-0 page allocations.

Switch to PAGE_ALLOC_COSTLY_ORDER, to increase max size by 8x.

Also add logic to increase the linear part if needed.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tahsin Erdogan <trdgn@amazon.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20230801205254.400094-4-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months ago net: tun: change tun_alloc_skb() to allow bigger paged allocations
Eric Dumazet [Tue, 1 Aug 2023 20:52:52 +0000 (20:52 +0000)]
net: tun: change tun_alloc_skb() to allow bigger paged allocations

tun_alloc_skb() is currently calling sock_alloc_send_pskb()
forcing order-0 page allocations.

Switch to PAGE_ALLOC_COSTLY_ORDER, to increase max allocation size by 8x.

Also add logic to increase the linear part if needed.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tahsin Erdogan <trdgn@amazon.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20230801205254.400094-3-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months ago net: allow alloc_skb_with_frags() to allocate bigger packets
Eric Dumazet [Tue, 1 Aug 2023 20:52:51 +0000 (20:52 +0000)]
net: allow alloc_skb_with_frags() to allocate bigger packets

Refactor alloc_skb_with_frags() to allow bigger packet allocations.

Instead of assuming that only order-0 allocations will be attempted,
use the caller supplied max order.

v2: try harder to use high-order pages, per Willem's feedback.

Link: https://lore.kernel.org/netdev/CANn89iJQfmc_KeUr3TeXvsLQwo3ZymyoCr7Y6AnHrkWSuz0yAg@mail.gmail.com/
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tahsin Erdogan <trdgn@amazon.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20230801205254.400094-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
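
To make the effect of a caller-supplied max order concrete, here is a
simplified userspace model that only counts fragments. It is a sketch
under stated assumptions, not the kernel code: toy_count_frags() and the
TOY_PAGE_SIZE constant are hypothetical, and allocation failures (where
the real code would retry with a smaller order) are not modelled.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096u	/* assumed page size */

/* Round a length up to a whole number of pages. */
static size_t toy_page_align(size_t len)
{
	return (len + TOY_PAGE_SIZE - 1) & ~((size_t)TOY_PAGE_SIZE - 1);
}

/*
 * Count the fragments needed for data_len bytes when each fragment may
 * use up to 2^max_order pages. Returns -1 if max_frags is not enough.
 */
static int toy_count_frags(size_t data_len, unsigned int max_order,
			   unsigned int max_frags)
{
	unsigned int order = max_order;
	unsigned int nr_frags = 0;

	assert(max_frags > 0);

	while (data_len) {
		size_t chunk;

		if (nr_frags == max_frags)
			return -1;

		/* Shrink the order while a block would exceed what remains. */
		while (order && toy_page_align(data_len) <
				((size_t)TOY_PAGE_SIZE << order))
			order--;

		chunk = (size_t)TOY_PAGE_SIZE << order;
		if (chunk > data_len)
			chunk = data_len;

		data_len -= chunk;
		nr_frags++;
	}
	return (int)nr_frags;
}
```

In this model a 100000-byte payload does not fit in 17 order-0 fragments,
but fits in 4 fragments once order-3 blocks are allowed.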
15 months ago sctp: Remove unused function declarations
Yue Haibing [Mon, 31 Jul 2023 14:10:30 +0000 (22:10 +0800)]
sctp: Remove unused function declarations

These declarations have never been implemented since the beginning of
git history.

Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Xin Long <lucien.xin@gmail.com>
Link: https://lore.kernel.org/r/20230731141030.32772-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months ago Merge branch 'mlx5-ipsec-packet-offload-support-in-eswitch-mode'
Jakub Kicinski [Thu, 3 Aug 2023 01:37:38 +0000 (18:37 -0700)]
Merge branch 'mlx5-ipsec-packet-offload-support-in-eswitch-mode'

Leon Romanovsky says:

====================
mlx5 IPsec packet offload support in eswitch mode

This series from Jianbo adds mlx5 IPsec packet offload support in eswitch
offloaded mode.

It works exactly like "regular" IPsec, nothing special, except
now users can switch to switchdev before adding IPsec rules.

 devlink dev eswitch set pci/0000:06:00.0 mode switchdev

Same configurations as here:

https://lore.kernel.org/netdev/cover.1670005543.git.leonro@nvidia.com/

Packet offload mode:
  ip xfrm state offload packet dev <if-name> dir <in|out>
  ip xfrm policy .... offload packet dev <if-name>
Crypto offload mode:
  ip xfrm state offload crypto dev <if-name> dir <in|out>
or (backward compatibility)
  ip xfrm state offload dev <if-name> dir <in|out>

v0: https://lore.kernel.org/all/cover.1689064922.git.leonro@nvidia.com
====================

Link: https://lore.kernel.org/r/cover.1690802064.git.leon@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months ago net/mlx5e: Make TC and IPsec offloads mutually exclusive on a netdev
Jianbo Liu [Mon, 31 Jul 2023 11:28:24 +0000 (14:28 +0300)]
net/mlx5e: Make TC and IPsec offloads mutually exclusive on a netdev

For IPsec packet offload mode, the order of TC offload and IPsec
offload on the same netdevice is not aligned with the order in the
non-offload software. For example, for RX, the software performs TC
first and then IPsec transformation, but the implementation for
offload does that in the opposite way.

To resolve the difference for now, either IPsec offload or TC offload,
not both, is allowed for a specific interface.

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/8e2e5e3b0984d785066e8663aaf97b3ba1bb873f.1690802064.git.leon@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
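
One simple way to picture the "either IPsec offload or TC offload, not
both" rule is a pair of per-netdev counters, as in the sketch below. This
is a hypothetical model (struct toy_offload_state and the toy_*()
functions are invented for illustration) and does not claim to mirror how
the mlx5 driver tracks this state.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-netdev bookkeeping. */
struct toy_offload_state {
	unsigned int tc_rules;
	unsigned int ipsec_rules;
};

/* A TC rule is refused while any IPsec rule is offloaded. */
static int toy_tc_rule_add(struct toy_offload_state *s)
{
	assert(s != NULL);
	if (s->ipsec_rules)
		return -EBUSY;
	s->tc_rules++;
	return 0;
}

/* An IPsec rule is refused while any TC rule is offloaded. */
static int toy_ipsec_rule_add(struct toy_offload_state *s)
{
	assert(s != NULL);
	if (s->tc_rules)
		return -EBUSY;
	s->ipsec_rules++;
	return 0;
}

/* Removing the last IPsec rule makes TC offload possible again. */
static void toy_ipsec_rule_del(struct toy_offload_state *s)
{
	assert(s != NULL && s->ipsec_rules > 0);
	s->ipsec_rules--;
}
```

A real driver would also need locking around these checks; the sketch
leaves that out.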
15 months ago net/mlx5e: Add get IPsec offload stats for uplink representor
Jianbo Liu [Mon, 31 Jul 2023 11:28:23 +0000 (14:28 +0300)]
net/mlx5e: Add get IPsec offload stats for uplink representor

As IPsec offload is supported in switchdev mode, HW stats can be
obtained from the uplink rep.

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/b43c91c452f1db9c35c10639a029aa10fd8b7895.1690802064.git.leon@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months ago net/mlx5e: Modify and restore TC rules for IPSec TX rules
Jianbo Liu [Mon, 31 Jul 2023 11:28:22 +0000 (14:28 +0300)]
net/mlx5e: Modify and restore TC rules for IPSec TX rules

After IPsec policy/state TX rules are added, any TC flow rule that
forwards packets to the uplink is modified to forward to the IPsec TX
tables. As these tables are destroyed dynamically whenever there is no
reference to them, the destinations of such rules must then be
restored to the uplink.

There is a special case for packet encapsulation, as the
packet_reformat_id in the extended destination is used to reformat
packets, but only for the VPORT destination. To forward a packet to
the IPsec table and do encapsulation in one FTE, move the
packet_reformat_id to the flow context instead of using the extended
destination. As a limitation, multiple encapsulations with table
forwarding, and one together with other VPORT destinations, are not
allowed, so add a check when offloading TC rules.

TC rules are not allowed before the IPsec TX rule is added, so TC rules
only need to be restored after the IPsec TX rules are flushed. As they
are saved in the vport_rep rhashtables, we walk all the rules in the
rhashtables, find the TC rules with destinations pointing to IPsec
tables, and modify them one by one. To avoid concurrency issues, this
handling is done under the protection of the eswitch mode_lock.

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/7bcb2c7e2ecf0e0d06b095c8dcc6a37ea7f02faf.1690802064.git.leon@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months ago net/mlx5e: Make IPsec offload work together with eswitch and TC
Jianbo Liu [Mon, 31 Jul 2023 11:28:21 +0000 (14:28 +0300)]
net/mlx5e: Make IPsec offload work together with eswitch and TC

The eswitch mode is not allowed to change if there are any IPsec rules.
Besides, by using mlx5_esw_try_lock() to get eswitch mode lock, IPsec
rules are not allowed to be offloaded if there are any TC rules.

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/e442b512b21a931fbdfb87d57ae428c37badd58a.1690802064.git.leon@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months ago net/mlx5: Compare with old_dest param to modify rule destination
Jianbo Liu [Mon, 31 Jul 2023 11:28:20 +0000 (14:28 +0300)]
net/mlx5: Compare with old_dest param to modify rule destination

The rule destination must be compared with the old_dest passed in.

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/24adc60d05d7492359ba343c6da1ebbe9fe284f6.1690802064.git.leon@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months ago net/mlx5e: Support IPsec packet offload for TX in switchdev mode
Jianbo Liu [Mon, 31 Jul 2023 11:28:19 +0000 (14:28 +0300)]
net/mlx5e: Support IPsec packet offload for TX in switchdev mode

IPsec encryption is done last, so add a new prio for IPsec
offload in FDB, and put it just lower than the slow path prio and
higher than the per-vport prio.
Three levels are added for TX. The first one is for ip xfrm policy.
The SA table is created in the second level for ip xfrm state. The
status table is created last to count the number of packets
encrypted.
The rules that forward packets to the uplink are changed to forward
them to the IPsec TX tables first. These rules are restored after those
tables are destroyed, which is done immediately when there is no
reference to them, just as in legacy mode. Support for the
slow path is added here, by refreshing the uplink's channels. The
handling for the TC fast path, which is more complicated, will be added
later. Besides, reg c4 is used instead to match reqid.

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/cfd0e6ffaf0b8c55ebaa9fb0649b7c504b6b8ec6.1690802064.git.leon@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months ago net/mlx5e: Refactor IPsec TX tables creation
Jianbo Liu [Mon, 31 Jul 2023 11:28:18 +0000 (14:28 +0300)]
net/mlx5e: Refactor IPsec TX tables creation

Add attribute for IPsec TX creation, pass all needed parameters in it,
so tx_create() can be used by eswitch.

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/24d5ab988b0db2d39b7fde321b44ffe885d47828.1690802064.git.leon@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months ago net/mlx5e: Handle IPsec offload for RX datapath in switchdev mode
Jianbo Liu [Mon, 31 Jul 2023 11:28:17 +0000 (14:28 +0300)]
net/mlx5e: Handle IPsec offload for RX datapath in switchdev mode

Reuse the tun opts bits in reg c1 to pass the IPsec obj id to the
datapath. As this is only for RX SA and there are only 11 bits, an xarray
is used to map the IPsec obj id to an index between 1 and 0x7ff, and that
index is written to reg c1 in place of the obj id.

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/43d60fbcc9cd672a97d7e2a2f7fe6a3d9e9a776d.1690802064.git.leon@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
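
The id-to-index mapping can be pictured with a small fixed-size table.
The commit uses an xarray; the sketch below swaps in a plain array purely
for illustration, and struct toy_id_map and the toy_id_*() functions are
hypothetical names. Index 0 is left unused so that every valid index lies
between 1 and 0x7ff and fits in 11 bits.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MIN_IDX	1u
#define TOY_MAX_IDX	0x7ffu	/* largest value that fits in 11 bits */

/* Hypothetical map from a compact index to a full-width object id. */
struct toy_id_map {
	uint32_t obj_id[TOY_MAX_IDX + 1];
	unsigned char used[TOY_MAX_IDX + 1];
};

/* Store obj_id in the lowest free slot; return its index or -ENOSPC. */
static int toy_id_alloc(struct toy_id_map *m, uint32_t obj_id)
{
	assert(m != NULL);
	for (unsigned int i = TOY_MIN_IDX; i <= TOY_MAX_IDX; i++) {
		if (!m->used[i]) {
			m->used[i] = 1;
			m->obj_id[i] = obj_id;
			return (int)i;
		}
	}
	return -ENOSPC;
}

/* Translate an index seen in the datapath back to the object id. */
static int toy_id_lookup(const struct toy_id_map *m, unsigned int idx,
			 uint32_t *obj_id)
{
	assert(m != NULL && obj_id != NULL);
	if (idx < TOY_MIN_IDX || idx > TOY_MAX_IDX || !m->used[idx])
		return -ENOENT;
	*obj_id = m->obj_id[idx];
	return 0;
}

/* Release a slot so that its index can be reused. */
static void toy_id_free(struct toy_id_map *m, unsigned int idx)
{
	assert(m != NULL);
	if (idx >= TOY_MIN_IDX && idx <= TOY_MAX_IDX)
		m->used[idx] = 0;
}
```

The object id itself may be wider than 11 bits; only the allocated index
has to fit in the register field.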
15 months ago net/mlx5e: Support IPsec packet offload for RX in switchdev mode
Jianbo Liu [Mon, 31 Jul 2023 11:28:16 +0000 (14:28 +0300)]
net/mlx5e: Support IPsec packet offload for RX in switchdev mode

As decryption must be done first, add a new prio for IPsec offload in
FDB, and put it just lower than BYPASS prio and higher than TC prio.
Three levels are added for RX. The first one is for ip xfrm policy. The
SA table is created in the second level for ip xfrm state. The status
table is created last to check the decryption result. On success,
packets continue with the next process; otherwise they are dropped.
For now, the set of reg c1 is removed for switchdev mode, and the
datapath process will be added in the next patch.

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/c91063554cf643fb50b99cf093e8a9bf11729de5.1690802064.git.leon@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months ago net/mlx5e: Refactor IPsec RX tables creation and destruction
Jianbo Liu [Mon, 31 Jul 2023 11:28:15 +0000 (14:28 +0300)]
net/mlx5e: Refactor IPsec RX tables creation and destruction

Add attribute for IPsec RX creation, so rx_create() can be used by
eswitch in a later patch. Also move the code for TTC dest
connect/disconnect, which is needed only in NIC mode, to individual
functions.

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/87478d928479b6a4eee41901204546ea05741815.1690802064.git.leon@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months ago net/mlx5e: Prepare IPsec packet offload for switchdev mode
Jianbo Liu [Mon, 31 Jul 2023 11:28:14 +0000 (14:28 +0300)]
net/mlx5e: Prepare IPsec packet offload for switchdev mode

As the uplink representor is created only in switchdev mode, add a local
variable for IPsec to indicate the device is in this mode.
In this mode, IPsec ROCE is disabled, and crypto offload is kept
as it is. However, as the tables for packet offload are created in FDB,
ipsec->rx_esw and ipsec->tx_esw are added.

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/ee242398f3b0a18007749fe79ff6ff19445a0280.1690802064.git.leon@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months ago net/mlx5e: Change the parameter of IPsec RX skb handle function
Jianbo Liu [Mon, 31 Jul 2023 11:28:13 +0000 (14:28 +0300)]
net/mlx5e: Change the parameter of IPsec RX skb handle function

Refactor the function to pass in reg B value only.

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/3b3c53f64660d464893eaecc41298b1ce49c6baa.1690802064.git.leon@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months ago net/mlx5e: Add function to get IPsec offload namespace
Jianbo Liu [Mon, 31 Jul 2023 11:28:12 +0000 (14:28 +0300)]
net/mlx5e: Add function to get IPsec offload namespace

Add function to get namespace in different directions. It will be
extended for switchdev mode in a later patch, but there is no functional
change for now.

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/ac2982c34f1ed3288d4670cacfd7e1b87a8c96d9.1690802064.git.leon@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months ago pds_core: Fix documentation for pds_client_register
Brett Creeley [Tue, 1 Aug 2023 16:58:33 +0000 (09:58 -0700)]
pds_core: Fix documentation for pds_client_register

The documentation above pds_client_register states that it returns 0 on
success and negative on error. However, it actually returns a positive
client ID on success and negative on error. Fix the documentation to
state exactly that.

Signed-off-by: Brett Creeley <brett.creeley@amd.com>
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Link: https://lore.kernel.org/r/20230801165833.1622-1-brett.creeley@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
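
The return convention matters to callers: a check written as "if (ret)"
would treat a valid positive client ID as a failure. The sketch below
uses a hypothetical stub, toy_client_register(), that follows the same
convention (positive ID on success, negative errno on error); it is not
the pds_core function or its signature.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stub: positive client ID on success, negative on error. */
static int toy_client_register(const char *name)
{
	static int next_id = 1;

	if (!name || name[0] == '\0')
		return -EINVAL;
	return next_id++;
}

/* Caller: only negative values are errors; anything else is the ID. */
static int toy_attach(const char *name, unsigned int *client_id)
{
	int ret;

	assert(client_id != NULL);

	ret = toy_client_register(name);
	if (ret < 0)
		return ret;

	*client_id = (unsigned int)ret;
	return 0;
}
```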
15 months ago net: switchdev: Remove unused typedef switchdev_obj_dump_cb_t()
Yue Haibing [Tue, 1 Aug 2023 14:42:09 +0000 (22:42 +0800)]
net: switchdev: Remove unused typedef switchdev_obj_dump_cb_t()

Commit 29ab586c3d83 ("net: switchdev: Remove bridge bypass support from switchdev")
left this unused.

Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20230801144209.27512-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months ago netlabel: Remove unused declaration netlbl_cipsov4_doi_free()
Yue Haibing [Tue, 1 Aug 2023 14:34:53 +0000 (22:34 +0800)]
netlabel: Remove unused declaration netlbl_cipsov4_doi_free()

Since commit b1edeb102397 ("netlabel: Replace protocol/NetLabel linking with refrerence counts")
this declaration is unused and can be removed.

Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Link: https://lore.kernel.org/r/20230801143453.24452-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months ago ila: Remove unnecessary file net/ila.h
Yue Haibing [Tue, 1 Aug 2023 14:31:29 +0000 (22:31 +0800)]
ila: Remove unnecessary file net/ila.h

Commit 642c2c95585d ("ila: xlat changes") removed ila_xlat_outgoing()
and ila_xlat_incoming() functions, after which this file became unnecessary.

Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20230801143129.40652-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months ago udp: Remove unused function declaration udp_bpf_get_proto()
Yue Haibing [Tue, 1 Aug 2023 13:39:02 +0000 (21:39 +0800)]
udp: Remove unused function declaration udp_bpf_get_proto()

Commit 8a59f9d1e3d4 ("sock: Introduce sk->sk_prot->psock_update_sk_prot()")
left this behind.

Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20230801133902.3660-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months ago cirrus: cs89x0: fix the return value handle and remove redundant dev_warn() for platf...
Ruan Jinjie [Tue, 1 Aug 2023 13:31:21 +0000 (21:31 +0800)]
cirrus: cs89x0: fix the return value handle and remove redundant dev_warn() for platform_get_irq()

It is not possible for platform_get_irq() to return 0, and
propagating its return value is a more sensible way
to show the error reason.

There is also no need to call dev_warn() directly to print
a custom message when handling an error from platform_get_irq(), as
that function already prints an appropriate error message on failure.

Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Alex Elder <elder@linaro.org>
Link: https://lore.kernel.org/r/20230801133121.416319-1-ruanjinjie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
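
The resulting error-handling shape can be sketched in userspace as
follows. toy_get_irq() is a hypothetical stub that mimics the convention
described above (a positive IRQ number on success, a negative errno on
failure, never 0); the IRQ number 42 and the probe function are invented
for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stub: positive IRQ on success, negative errno on failure. */
static int toy_get_irq(int index)
{
	if (index == 0)
		return 42;
	return -ENXIO;
}

/* Probe path: test only for irq < 0 and hand the error code back as-is. */
static int toy_probe(int index, int *irq_out)
{
	int irq;

	assert(irq_out != NULL);

	irq = toy_get_irq(index);
	if (irq < 0)
		return irq;	/* no "<= 0" test, no extra message, no remapped errno */

	*irq_out = irq;
	return 0;
}
```

Returning the original error code keeps the real failure reason visible
to the caller instead of replacing it with a generic one.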
15 months ago net: dsa: hellcreek: Replace bogus comment
Kurt Kanzenbach [Tue, 1 Aug 2023 13:16:47 +0000 (15:16 +0200)]
net: dsa: hellcreek: Replace bogus comment

Replace bogus comment about matching the latched timestamp to one of the
received frames. That comment was probably copied from mv88e6xxx and is true
for those switches. However, the hellcreek switch is configured to insert the
timestamp directly into the PTP packets.

While here, remove the other comments regarding the list splicing and locking as
well, because they don't add any value.

Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/20230801131647.84697-1-kurt@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months ago bnx2x: Remove unnecessary ternary operators
Ruan Jinjie [Tue, 1 Aug 2023 11:19:28 +0000 (19:19 +0800)]
bnx2x: Remove unnecessary ternary operators

There are a few ternary operators whose true or false result
is redundant in C language semantics.

Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20230801111928.300231-1-ruanjinjie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
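
For readers unfamiliar with this cleanup, the sketch below shows the kind
of redundancy meant. The TOY_FLAG_UP flag and the function names are
hypothetical and are not taken from the bnx2x code.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_FLAG_UP 0x4u	/* hypothetical flag bit */

/* Before: the ternary only restates the truth value of its condition. */
static bool toy_is_up_verbose(unsigned int flags)
{
	return (flags & TOY_FLAG_UP) ? true : false;
}

/* After: conversion to bool already yields true for any non-zero value. */
static bool toy_is_up(unsigned int flags)
{
	return flags & TOY_FLAG_UP;
}

/* Before and after for a comparison, which already evaluates to 0 or 1. */
static bool toy_same_verbose(int a, int b)
{
	return (a == b) ? true : false;
}

static bool toy_same(int a, int b)
{
	return a == b;
}

/* Both spellings agree for every combination of the low three flag bits. */
static void toy_selfcheck(void)
{
	for (unsigned int flags = 0; flags < 8; flags++)
		assert(toy_is_up(flags) == toy_is_up_verbose(flags));
}
```

Note that the simplification of toy_is_up() relies on the bool return
type; with an int return type the masked value would be returned
unchanged rather than normalized to 0 or 1.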
15 months ago octeontx2: Remove unnecessary ternary operators
Ruan Jinjie [Tue, 1 Aug 2023 11:26:38 +0000 (19:26 +0800)]
octeontx2: Remove unnecessary ternary operators

There are a few ternary operators whose true or false result
is redundant in C language semantics, so remove them
to clean up the code.

Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Sunil Goutham <sgoutham@marvell.com>
Link: https://lore.kernel.org/r/20230801112638.317149-1-ruanjinjie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months ago net: hisilicon: fix the return value handle and remove redundant netdev_err() for...
Ruan Jinjie [Mon, 31 Jul 2023 07:38:58 +0000 (15:38 +0800)]
net: hisilicon: fix the return value handle and remove redundant netdev_err() for platform_get_irq()

It is not possible for platform_get_irq() to return 0, and
propagating its return value is a more sensible way
to show the error reason.

There is also no need to call netdev_err() directly to print
a custom message when handling an error from platform_get_irq(), as
that function already prints an appropriate error message on failure.

Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://lore.kernel.org/r/20230731073858.3633193-1-ruanjinjie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months ago net: Remove duplicated include in mac.c
Yang Li [Tue, 1 Aug 2023 00:50:41 +0000 (08:50 +0800)]
net: Remove duplicated include in mac.c

./drivers/net/ethernet/freescale/fman/mac.c: linux/of_platform.h is included more than once.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=6039
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 months ago selftests/net: report rcv_mss in tcp_mmap
Willem de Bruijn [Mon, 31 Jul 2023 18:08:09 +0000 (14:08 -0400)]
selftests/net: report rcv_mss in tcp_mmap

tcp_mmap tests TCP_ZEROCOPY_RECEIVE. If 0% of data is received using
mmap, this may be due to mss. Report rcv_mss to identify this cause.

Output of a run that failed due to a too-small mss:

    received 32768 MB (0 % mmap'ed) in 8.40458 s, 32.7057 Gbit
      cpu usage user:0.027922 sys:8.21126, 251.44 usec per MB, 3252 c-switches, rcv_mss 1428

Output on a successful run:

    received 32768 MB (99.9507 % mmap'ed) in 4.69023 s, 58.6064 Gbit
      cpu usage user:0.029172 sys:2.56105, 79.0473 usec per MB, 57591 c-switches, rcv_mss 4096

Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
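
One way to read rcv_mss from userspace is the TCP_INFO socket option on
Linux, whose struct tcp_info carries a tcpi_rcv_mss field. The helper
below is a minimal sketch under that assumption; toy_get_rcv_mss() is a
hypothetical name, and the sketch does not claim to reproduce what the
selftest itself does.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Fetch the receive MSS of a TCP socket; returns 0 or a negative errno. */
static int toy_get_rcv_mss(int fd, unsigned int *rcv_mss)
{
	struct tcp_info info;
	socklen_t len = sizeof(info);

	assert(rcv_mss != NULL);

	memset(&info, 0, sizeof(info));
	if (getsockopt(fd, IPPROTO_TCP, TCP_INFO, &info, &len) < 0)
		return -errno;

	*rcv_mss = info.tcpi_rcv_mss;
	return 0;
}
```

In the two sample outputs above, the failed run reports rcv_mss 1428 and
the successful run reports rcv_mss 4096.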
15 months ago Merge branch 'icssg-driver'
David S. Miller [Wed, 2 Aug 2023 09:38:12 +0000 (10:38 +0100)]
Merge branch 'icssg-driver'

MD Danish Anwar says:

====================
Introduce ICSSG based ethernet Driver

The Programmable Real-time Unit and Industrial Communication Subsystem
Gigabit (PRU_ICSSG) is a low-latency microcontroller subsystem in the TI
SoCs. This subsystem is provided for the use cases like the implementation
of custom peripheral interfaces, offloading of tasks from the other
processor cores of the SoC, etc.

The subsystem includes many accelerators for data processing like
multiplier and multiplier-accumulator. It also has peripherals like
UART, MII/RGMII, MDIO, etc. Every ICSSG core includes two 32-bit
load/store RISC CPU cores called PRUs.

The above features allow it to be used for implementing custom firmware
based peripherals like ethernet.

This series adds the YAML documentation and the driver with basic EMAC
support for TI AM654 Silicon Rev 2 SoC with the PRU_ICSSG Sub-system
running dual-EMAC firmware.
This currently supports basic EMAC with 1Gbps and 100Mbps link. 10M and
half-duplex modes are not yet supported because they require the support
of an IEP, which will be added later.
Advanced features like switch-dev and timestamping will be added later.

This is the v13 of the patch series [v1]. This version of the patchset
addresses comments made on v12.

This series doesn't have any dependencies.

Changes from v12 to v13 :
*) Rebased the series on latest net-next.
*) Addressed Jakub's comments on ndo_xmit API. Now we will only stop queues
   based on occupancy not on dma errors.
*) Removed limiting the number of serviced packets to budget for Tx NAPI.
   Now Tx NAPI will keep servicing packets.
*) Removed netif_running() check when packet arrives.
*) Introduced prototypes of APIs in the same patch where these APIs are added.
   Dropped __maybe_unused tags as compiler only cares about prototypes
   existing, not whether actual callers are in place. Now prototypes of these
   APIs are present in the same patch where they are introduced but these APIs
   are called later (in patch 6).

Changes from v11 to v12 :
*) Rebased the series on latest net-next.
*) Addressed Jakub's comments on ndo_xmit API.
*) Added hooks to .get_rmon_stats for the driver. Now tx / rx bucket size
   and frame counts per bucket will be fetched by ethtool_rmon_stats instead
   of ethtool -S.
*) Added __maybe_unused tags to unused config and classifier APIs in patch
   2,3 and 4. These tags are later removed in patch 6.

Changes from v10 to v11 :
*) Rebased the series on latest net-next.
*) Split the ICSSG driver introduction patch into 9 different patches as
   asked by Jakub.
*) Introduced new patch(patch 8/10) to dump Standard network interface
   statistics via ndo_get_stats64. Now certain stats that are reported by
   ICSSG hardware and are also part of struct rtnl_link_stats64, will be
   reported by ndo_get_stats64. While other stats that are not part of the
   struct rtnl_link_stats64 will be reported by ethtool -S. These stats
   are not duplicated.

Changes from v9 to v10 :
*) Rebased the series on latest net-next.
*) Moved 'ndev prueth->emac[mac] == emac' assignment to the end of function
   prueth_netdev_init().
*) In unsupported phy_mode switch case instead of returning -EINVAL, store
   the error code in ret and 'goto free'

Changes from v8 to v9 :
*) Rebased the series on latest net-next.
*) Fixed smatch and sparse warnings as pointed by Simon.
*) Fixed leaky ndev in prueth_netdev_init() as asked by Simon.

Changes from v7 to v8 :
*) Rebased the series on 6.5-rc1.
*) Fixed a few formatting issues.

Changes from v6 to v7 :
*) Added RB tag of Rob in patch 1 of this series.
*) Addressed Simon's comment on patch 2 of the series.
*) Rebased patchset on next-20230428 linux-next.

Changes from v5 to v6 :
*) Added RB tag of Andrew Lunn in patch 2 of this series.
*) Addressed Rob's comment on patch 1 of the series.
*) Rebased patchset on next-20230421 linux-next.

Changes from v4 to v5 :
*) Re-arranged properties section in ti,icssg-prueth.yaml file.
*) Added requirement for minimum one ethernet port.
*) Fixed some minor formatting errors as asked by Krzysztof.
*) Dropped SGMII mode from enum mii_mode as SGMII mode is not currently
   supported by the driver.
*) Added switch-case block to handle different phy modes by ICSSG driver.

Changes from v3 to v4 :
*) Addressed Krzysztof's comments and fixed dt_binding_check errors in
   patch 1/2.
*) Added interrupt-extended property in ethernet-ports properties section.
*) Fixed comments in file icssg_switch_map.h according to the Linux coding
   style in patch 2/2. Added Documentation of structures in patch 2/2.

Changes from v2 to v3 :
*) Addressed Rob and Krzysztof's comments on patch 1 of this series.
   Fixed indentation. Removed description and pinctrl section from
   ti,icssg-prueth.yaml file.
*) Addressed Krzysztof, Paolo, Randy, Andrew and Christophe's comments on
   patch 2 of this series.
*) Fixed blanklines in Kconfig and Makefile. Changed structures to const
   as suggested by Krzysztof.
*) Fixed while loop logic in emac_tx_complete_packets() API as suggested
   by Paolo. Previously in the loop's last iteration 'budget' was 0 and
   napi_consume_skb would wrongly assume the caller is not in NAPI context.
   Now, budget won't be zero in last iteration of loop.
*) Removed inline functions addr_to_da1() and addr_to_da0() as asked by
   Andrew.
*) Added dev_err_probe() instead of dev_err() as suggested by Christophe.
*) In ti,icssg-prueth.yaml file, in the patternProperties section of
   ethernet-ports, kept the port name as "port" instead of "ethernet-port"
   as all other drivers were using "port". Will change it if it is compulsory
   to use "ethernet-port".
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
15 months agonet: ti: icssg-prueth: Add Power management support
MD Danish Anwar [Tue, 1 Aug 2023 09:14:28 +0000 (14:44 +0530)]
net: ti: icssg-prueth: Add Power management support

Add suspend / resume APIs to support power management in ICSSG ethernet
driver.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: MD Danish Anwar <danishanwar@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 months agonet: ti: icssg-prueth: Add ethtool ops for ICSSG Ethernet driver
MD Danish Anwar [Tue, 1 Aug 2023 09:14:27 +0000 (14:44 +0530)]
net: ti: icssg-prueth: Add ethtool ops for ICSSG Ethernet driver

Add icssg_ethtool.c file. This file will be used for dumping statistics
via ethtool for ICSSG ethernet driver.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: MD Danish Anwar <danishanwar@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 months agonet: ti: icssg-prueth: Add Standard network staticstics
MD Danish Anwar [Tue, 1 Aug 2023 09:14:26 +0000 (14:44 +0530)]
net: ti: icssg-prueth: Add Standard network staticstics

Implement .ndo_get_stats64 to dump standard network interface
statistics for ICSSG ethernet driver.

Signed-off-by: MD Danish Anwar <danishanwar@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 months agonet: ti: icssg-prueth: Add ICSSG Stats
MD Danish Anwar [Tue, 1 Aug 2023 09:14:25 +0000 (14:44 +0530)]
net: ti: icssg-prueth: Add ICSSG Stats

Add icssg_stats.c to help dump ICSSG-related driver statistics.

ICSSG has hardware registers for providing statistics like total rx bytes,
total tx bytes, etc. These registers are 32 bits wide and hence, on a 1G
link, they overflow in around 32 seconds. On overflow these registers
don't roll back to 0 but rather stay at UINT_MAX.

These registers support a feature where the value written to them is
subtracted from the register. This feature can be utilized to fix the
overflowing of stats.

This patch uses a workqueue-based solution where a function gets
called before the registers overflow (every 25 seconds in 1G link, 25000
seconds in 100M link). This function saves the register
values in local variables and writes the last read value back to the
register. So any update during the read will be taken care of.
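
The scheme described above can be modelled in a few lines. This is a
hypothetical simulation of the register behaviour as this commit message
describes it (saturate at UINT_MAX, subtract on write), not driver code;
the class and function names are invented for illustration.

```python
# Hypothetical model of the counter behaviour described above.
# Nothing here is taken from icssg_stats.c; names are illustrative only.

UINT32_MAX = 0xFFFFFFFF

class SaturatingCounter:
    """32-bit hardware counter that sticks at UINT32_MAX instead of wrapping.
    Writing a value subtracts that value from the counter."""

    def __init__(self):
        self.value = 0

    def hw_add(self, n):
        # Hardware side: count traffic, saturating at UINT32_MAX.
        self.value = min(self.value + n, UINT32_MAX)

    def read(self):
        return self.value

    def write(self, n):
        # Software side: a write subtracts n rather than storing it.
        self.value = max(self.value - n, 0)

class StatsAccumulator:
    """Software total fed by periodic polls of the hardware counter."""

    def __init__(self, reg):
        self.reg = reg
        self.total = 0  # Python ints are unbounded; a driver would use u64

    def poll(self):
        seen = self.reg.read()
        self.reg.write(seen)  # subtract exactly what was read
        self.total += seen

def seconds_to_saturate(link_bps):
    """Time for a byte counter to reach UINT32_MAX at full line rate."""
    return UINT32_MAX * 8 / link_bps
```

Because the write subtracts the value that was read instead of zeroing the
register, bytes counted between the read and the write stay in the register
and are picked up by the next poll. For a byte counter at 1 Gbit/s the model
gives a bound of roughly 34 seconds, the same order as the figure quoted
above, so a 25 second poll interval leaves some margin.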

Signed-off-by: MD Danish Anwar <danishanwar@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 months agonet: ti: icssg-prueth: Add ICSSG ethernet driver
Roger Quadros [Tue, 1 Aug 2023 09:14:24 +0000 (14:44 +0530)]
net: ti: icssg-prueth: Add ICSSG ethernet driver

This is the Ethernet driver for TI AM654 Silicon rev. 2
with the ICSSG PRU Sub-system running dual-EMAC firmware.

The Programmable Real-time Unit and Industrial Communication Subsystem
Gigabit (PRU_ICSSG) is a low-latency microcontroller subsystem in the TI
SoCs. This subsystem is provided for the use cases like implementation of
custom peripheral interfaces, offloading of tasks from the other
processor cores of the SoC, etc.

Every ICSSG core has two Programmable Real-Time Units (PRUs),
two auxiliary Real-Time Transfer Units (RT_PRUs), and
two Transmit Real-Time Transfer Units (TX_PRUs). Each one of these runs
its own firmware. Every ICSSG core has two MII ports connected to these
PRUs and also an MDIO port.

The cores can run different firmwares to support different protocols and
features like switch-dev, timestamping, etc.

It uses System DMA to transfer and receive packets and
shared memory register emulation between the firmware and
driver for control and configuration.

This patch adds support for basic EMAC functionality with 1Gbps
and 100Mbps link speed. 10M and half duplex mode are not supported
currently as they require IEP, the support for which will be added later.
Support for switch-dev, timestamp, etc. will be added later
by subsequent patch series.

Signed-off-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: MD Danish Anwar <danishanwar@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 months agodt-bindings: net: Add ICSSG Ethernet
MD Danish Anwar [Tue, 1 Aug 2023 09:14:23 +0000 (14:44 +0530)]
dt-bindings: net: Add ICSSG Ethernet

Add a YAML binding document for the ICSSG Programmable real time unit
based Ethernet hardware. The ICSSG driver uses the PRU and PRUSS consumer
APIs to interface the PRUs and load/run the firmware for supporting
ethernet functionality.

Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: MD Danish Anwar <danishanwar@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 months agonet: ti: icssg-prueth: Add icssg queues APIs and macros
MD Danish Anwar [Tue, 1 Aug 2023 09:14:22 +0000 (14:44 +0530)]
net: ti: icssg-prueth: Add icssg queues APIs and macros

Add icssg_queue.c file. This file introduces macros and APIs related to
ICSSG queues. These will be used by ICSSG Ethernet driver.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: MD Danish Anwar <danishanwar@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 months agonet: ti: icssg-prueth: Add Firmware config and classification APIs.
MD Danish Anwar [Tue, 1 Aug 2023 09:14:21 +0000 (14:44 +0530)]
net: ti: icssg-prueth: Add Firmware config and classification APIs.

Add icssg_config.h / .c and icssg_classifier.c files. These are firmware
configuration and classification related files. These will be used by
ICSSG ethernet driver.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: MD Danish Anwar <danishanwar@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 months agonet: ti: icssg-prueth: Add mii helper apis and macros
MD Danish Anwar [Tue, 1 Aug 2023 09:14:20 +0000 (14:44 +0530)]
net: ti: icssg-prueth: Add mii helper apis and macros

Add MII helper APIs and MACROs. These APIs and MACROs will be later used
by ICSSG Ethernet driver. Also introduce icssg_prueth.h which has
definition of prueth related structures.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: MD Danish Anwar <danishanwar@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 months agonet: ti: icssg-prueth: Add Firmware Interface for ICSSG Ethernet driver.
MD Danish Anwar [Tue, 1 Aug 2023 09:14:19 +0000 (14:44 +0530)]
net: ti: icssg-prueth: Add Firmware Interface for ICSSG Ethernet driver.

Add firmware interface related headers and macros for ICSSG Ethernet
driver. These macros will be later used by the ICSSG ethernet driver.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: MD Danish Anwar <danishanwar@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 months agonet: dsa: mv88e6xxx: Add erratum 3.14 for 88E6390X and 88E6190X
Ante Knezic [Tue, 1 Aug 2023 06:48:15 +0000 (08:48 +0200)]
net: dsa: mv88e6xxx: Add erratum 3.14 for 88E6390X and 88E6190X

Fixes XAUI/RXAUI lane alignment errors.
Issue causes dropped packets when trying to communicate over
fiber via SERDES lanes of port 9 and 10.
Errata document applies only to 88E6190X and 88E6390X devices.
Requires poking in undocumented registers.

Signed-off-by: Ante Knezic <ante.knezic@helmholz.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 months agoMerge branch 'tc-flower-SPI'
David S. Miller [Wed, 2 Aug 2023 09:09:32 +0000 (10:09 +0100)]
Merge branch 'tc-flower-SPI'

Ratheesh Kannoth says:

====================
Packet classify by matching against SPI

1.  net: flow_dissector: Add IPSEC dissector.
Flow dissector patch reads the IPSEC header (ESP or AH)
from the packet and retrieves the SPI field.

2. tc: flower: support for SPI.
TC control path changes to pass SPI field from userspace to
kernel.

3. tc: flower: Enable offload support IPSEC SPI field.
This patch enables the HW offload control for classification
of ESP/AH packets.

4. octeontx2-pf: TC flower offload support for SPI field.
HW offload support for classification in octeontx2 driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
15 months agoocteontx2-pf: TC flower offload support for SPI field
Ratheesh Kannoth [Tue, 1 Aug 2023 01:41:01 +0000 (07:11 +0530)]
octeontx2-pf: TC flower offload support for SPI field

Driver support to offload TC flower rules which match
against the SPI field of IPSEC packets (AH/ESP).

Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 months agotc: flower: Enable offload support IPSEC SPI field.
Ratheesh Kannoth [Tue, 1 Aug 2023 01:41:00 +0000 (07:11 +0530)]
tc: flower: Enable offload support IPSEC SPI field.

This patch enables offload for TC classifier
flower rules which match against the SPI field.

Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 months agotc: flower: support for SPI
Ratheesh Kannoth [Tue, 1 Aug 2023 01:40:59 +0000 (07:10 +0530)]
tc: flower: support for SPI

tc flower rules support to classify ESP/AH
packets matching SPI field.

Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 months agonet: flow_dissector: Add IPSEC dissector
Ratheesh Kannoth [Tue, 1 Aug 2023 01:40:58 +0000 (07:10 +0530)]
net: flow_dissector: Add IPSEC dissector

Support for dissecting the IPSEC SPI field (which is
32 bits in size) for ESP and AH packets.
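
As an illustration of where the SPI sits in each header, here is a small
userspace sketch. It is not the kernel dissector; extract_spi() is a
made-up helper. The offsets follow the ESP and AH header layouts: ESP
starts with the SPI, while AH carries next header, payload length and a
reserved field before it.

```python
import struct

# IP protocol numbers for the two IPsec header types.
IPPROTO_ESP = 50
IPPROTO_AH = 51

def extract_spi(ip_proto, payload):
    """Return the 32-bit SPI from an ESP or AH header, or None.

    `payload` is the bytes that follow the IP header. This helper is
    hypothetical and only mirrors the header layouts, not kernel code.
    """
    if ip_proto == IPPROTO_ESP:
        offset = 0  # ESP: SPI is the first 32-bit word
    elif ip_proto == IPPROTO_AH:
        offset = 4  # AH: next header (1), payload len (1), reserved (2), SPI
    else:
        return None
    if len(payload) < offset + 4:
        return None  # truncated header
    (spi,) = struct.unpack_from("!I", payload, offset)  # network byte order
    return spi
```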

Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 months agoMerge branch 'oxnas=dwmac-removal'
David S. Miller [Wed, 2 Aug 2023 09:01:05 +0000 (10:01 +0100)]
Merge branch 'oxnas=dwmac-removal'

Neil Armstrong says:

====================
net: ethernet: dwmac: oxnas glue removal

With [1] removing MPCore SMP support, the OX820 becomes barely usable.
Combined with a clear lack of maintenance, development and migration to
dt-schema, it's clear that Linux support for OX810 and OX820 should be removed.

In addition, the OX810 hasn't been booted for years and isn't even present
in an ARM config file.

For the OX820, lack of USB and SATA support makes the platform unusable
with the current Linux support; it relies on out-of-tree drivers hacked
from the vendor (defunct for years) sources.

The last users are in the OpenWRT distribution, and today's removal means
support will still be in stable 6.1 LTS kernel until end of 2026.

If someone wants to take over the development even with lack of SMP, I'll
be happy to hand off maintenance.

It has been a fun time adding support for this architecture, but it's time
to get over!

This patchset only removes net changes, and is derived from:
https://lore.kernel.org/r/20230630-topic-oxnas-upstream-remove-v2-0-fb6ab3dea87c@linaro.org

---
Changes in v3:
- Removed applied changes
- Added Andy's tags
- Reduced for net
- Link to v2: https://lore.kernel.org/r/20230630-topic-oxnas-upstream-remove-v2-0-fb6ab3dea87c@linaro.org

Changes in v2:
- s/maintainance/maintenance/
- added acked/review tags
- dropped already applied patches
- drop RFC
- Link to v1: https://lore.kernel.org/r/20230331-topic-oxnas-upstream-remove-v1-0-5bd58fd1dd1f@linaro.org
====================

Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 months agodt-bindings: net: oxnas-dwmac: remove obsolete bindings
Neil Armstrong [Mon, 31 Jul 2023 14:41:11 +0000 (16:41 +0200)]
dt-bindings: net: oxnas-dwmac: remove obsolete bindings

Due to lack of maintenance and stall of development for a few years now,
and since no new features will ever be added upstream, remove the
OX810 and OX820 dwmac glue.

Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Daniel Golle <daniel@makrotopia.org>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 months agonet: stmmac: dwmac-oxnas: remove obsolete dwmac glue driver
Neil Armstrong [Mon, 31 Jul 2023 14:41:10 +0000 (16:41 +0200)]
net: stmmac: dwmac-oxnas: remove obsolete dwmac glue driver

Due to lack of maintenance and stall of development for a few years now,
and since no new features will ever be added upstream, remove support
for OX810 and OX820 ethernet.

Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Daniel Golle <daniel@makrotopia.org>
Acked-by: Andy Shevchenko <andy@kernel.org>
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 months agoMerge branch 'selftests-mlxsw'
David S. Miller [Wed, 2 Aug 2023 08:18:18 +0000 (09:18 +0100)]
Merge branch 'selftests-mlxsw'

Petr Machata says:

====================
selftests: New selftests for out-of-order-operations patches in mlxsw

In the past, the mlxsw driver made the assumption that the user applies
configuration in a bottom-up manner. Thus netdevices needed to be added to
the bridge before IP addresses were configured on that bridge or SVI added
on top of it, because whatever happened before a netdevice was mlxsw upper
was generally ignored by mlxsw. Recently, several patch series were pushed
to introduce the bookkeeping and replays necessary to offload the full
state, not just the immediate configuration step.

In this patchset, introduce new selftests that directly exercise the out of
order code paths in mlxsw.

- Patch #1 adds new tests into the existing selftest router_bridge.sh.
- Patches #2-#5 add new generic selftests.
- Patches #6-#8 add new mlxsw-specific selftests.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
15 months agoselftests: mlxsw: rif_bridge: Add a new selftest
Petr Machata [Mon, 31 Jul 2023 15:47:22 +0000 (17:47 +0200)]
selftests: mlxsw: rif_bridge: Add a new selftest

This test verifies driver behavior with regards to creation of RIFs for a
bridge as LAGs are added or removed to/from it, and ports added or removed
to/from the LAG.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 months agoselftests: mlxsw: rif_lag_vlan: Add a new selftest
Petr Machata [Mon, 31 Jul 2023 15:47:21 +0000 (17:47 +0200)]
selftests: mlxsw: rif_lag_vlan: Add a new selftest

This test verifies driver behavior with regards to creation of RIFs for LAG
VLAN uppers as ports are added or removed to/from the LAG.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 months agoselftests: mlxsw: rif_lag: Add a new selftest
Petr Machata [Mon, 31 Jul 2023 15:47:20 +0000 (17:47 +0200)]
selftests: mlxsw: rif_lag: Add a new selftest

This test verifies driver behavior with regards to creation of RIFs for a
LAG as ports are added or removed to/from it.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 months agoselftests: router_bridge_1d_lag: Add a new selftest
Petr Machata [Mon, 31 Jul 2023 15:47:19 +0000 (17:47 +0200)]
selftests: router_bridge_1d_lag: Add a new selftest

Add a selftest to verify that routing through several bridges works when
LAG VLANs are used instead of physical ports, and that routing through LAG
VLANs themselves works as physical ports are de/enslaved.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 months agoselftests: router_bridge_lag: Add a new selftest
Petr Machata [Mon, 31 Jul 2023 15:47:18 +0000 (17:47 +0200)]
selftests: router_bridge_lag: Add a new selftest

Add a selftest to verify that routing through a bridge works when LAG is
used instead of physical ports.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 months agoselftests: router_bridge_vlan_upper: Add a new selftest
Petr Machata [Mon, 31 Jul 2023 15:47:17 +0000 (17:47 +0200)]
selftests: router_bridge_vlan_upper: Add a new selftest

Add a selftest that verifies routing through VLAN bridge uppers.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 months agoselftests: router_bridge_1d: Add a new selftest
Petr Machata [Mon, 31 Jul 2023 15:47:16 +0000 (17:47 +0200)]
selftests: router_bridge_1d: Add a new selftest

Add a selftest to verify that routing through a 1d bridge works when VLAN
upper of a physical port is used instead of a physical port. Also verify
that when a port is attached to an already-configured bridge, the
configuration is applied.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 months agoselftests: router_bridge: Add remastering tests
Petr Machata [Mon, 31 Jul 2023 15:47:15 +0000 (17:47 +0200)]
selftests: router_bridge: Add remastering tests

Add two tests to deslave a port from and reenslave to a bridge. This should
retain the ability of the system to forward traffic, but on an offloading
driver that is sensitive to ordering of operations, it might not.

The first test does this configuration in a way that relies on
vlan_default_pvid to assign the PVID. The second test disables that
autoconfiguration and configures PVID by hand in a separate step.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 months agonet: stmmac: XGMAC support for mdio C22 addr > 3
Rohan G Thomas [Mon, 31 Jul 2023 11:50:41 +0000 (19:50 +0800)]
net: stmmac: XGMAC support for mdio C22 addr > 3

For XGMAC versions < 2.2 the number of supported mdio C22 addresses is
restricted to 3. From XGMAC version 2.2 there are no restrictions on
the C22 addresses; it supports all valid mdio addresses (0 to 31).
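
The rule can be sketched as a simple version check. This is an
illustration only, not the stmmac code: the function name and the
(major, minor) version tuple are hypothetical, and the limit of address 3
for older cores is an assumption taken from the "addr > 3" wording in the
subject line.

```python
# Illustrative sketch of the version-dependent C22 address rule.
# Assumption: cores older than 2.2 accept C22 addresses up to 3 only,
# as suggested by the subject line; newer cores accept 0..31.

LEGACY_MAX_C22_ADDR = 3
MDIO_MAX_ADDR = 31

def c22_addr_supported(core_version, addr):
    """core_version is a hypothetical (major, minor) tuple, e.g. (2, 1)."""
    if not 0 <= addr <= MDIO_MAX_ADDR:
        return False  # not a valid MDIO address at all
    if core_version < (2, 2):
        return addr <= LEGACY_MAX_C22_ADDR
    return True
```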

Signed-off-by: Rohan G Thomas <rohan.g.thomas@intel.com>
Acked-by: Jose Abreu <Jose.Abreu@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
15 months agoMerge branch 'add-tja1120-support'
Jakub Kicinski [Wed, 2 Aug 2023 04:06:27 +0000 (21:06 -0700)]
Merge branch 'add-tja1120-support'

Radu Pirea says:

====================
Add TJA1120 support

This patch series got bigger than I expected. It cleans up the
next-c45-tja11xx driver and adds support for the TJA1120 (1000BaseT1
automotive phy).

Master/slave custom implementation was replaced with the generic
implementation (genphy_c45_config_aneg/genphy_c45_read_status).

The TJA1120 and TJA1103 are a bit different when it comes to the PTP
interface. The timestamp read procedure was changed, some addresses were
changed and some bits were moved from one register to another. Adding
TJA1120 support was tricky, and I tried not to duplicate the code. If
something looks too hacky to you, I am open to suggestions.
====================

Link: https://lore.kernel.org/r/20230731091619.77961-1-radu-nicolae.pirea@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months agonet: phy: nxp-c45-tja11xx: reset PCS if the link goes down
Radu Pirea (NXP OSS) [Mon, 31 Jul 2023 09:16:19 +0000 (12:16 +0300)]
net: phy: nxp-c45-tja11xx: reset PCS if the link goes down

During PTP testing on early TJA1120 engineering samples I observed that
if the link is lost and recovered, the tx timestamps will be randomly
lost. To avoid this HW issue, the PCS should be reset.

Resetting the PCS will break the link, so we should reset the PCS only on
the LINK UP -> LINK DOWN transition; otherwise we will trigger an infinite
loop of LINK UP -> LINK DOWN events.
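
The edge-triggered condition can be sketched as follows. This is a
hypothetical model, not the driver's code; LinkWatcher and the reset_pcs
callback are names invented for illustration.

```python
# Hypothetical model of "reset only on the UP -> DOWN edge".

class LinkWatcher:
    def __init__(self, reset_pcs):
        self.reset_pcs = reset_pcs  # callback standing in for the PCS reset
        self.was_up = False

    def update(self, link_up):
        """Feed the latest link state; reset the PCS only on UP -> DOWN."""
        if self.was_up and not link_up:
            self.reset_pcs()
        self.was_up = link_up
```

Resetting on every poll that reports link down, rather than on the edge,
would keep breaking the link before it can come back up, which is the
loop the message above warns about.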

Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20230731091619.77961-12-radu-nicolae.pirea@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months agonet: phy: nxp-c45-tja11xx: read ext trig ts on TJA1120
Radu Pirea (NXP OSS) [Mon, 31 Jul 2023 09:16:18 +0000 (12:16 +0300)]
net: phy: nxp-c45-tja11xx: read ext trig ts on TJA1120

On TJA1120, the external trigger timestamp now has a VALID bit. This
changes the logic and we can't use the TJA1103 procedure.

For TJA1103, we can always read a valid timestamp from the registers,
compare the new timestamp with the old timestamp and, if they are not the
same, an event occurred. This logic cannot be applied for TJA1120 because
the timestamp is 0 if the VALID bit is not set.
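
The two detection strategies can be contrasted in a short sketch. The
function names and argument shapes are made up for illustration; the
real register accesses are not shown.

```python
# Hypothetical sketch contrasting the two event-detection strategies.

def tja1103_style_event(prev_ts, new_ts):
    """Registers always hold a timestamp; a changed value means a new event."""
    return new_ts != prev_ts

def tja1120_style_event(valid_bit, ts):
    """Trust the timestamp only when VALID is set; otherwise it reads as 0.

    Returns the timestamp for a new event, or None if nothing happened.
    """
    return ts if valid_bit else None
```

Applying the comparison approach to a device whose timestamp reads 0
without VALID would report an event every time the value falls back to 0,
which is why the VALID bit has to be checked instead.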

Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Link: https://lore.kernel.org/r/20230731091619.77961-11-radu-nicolae.pirea@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months agonet: phy: nxp-c45-tja11xx: run cable test with the PHY in test mode
Radu Pirea (NXP OSS) [Mon, 31 Jul 2023 09:16:17 +0000 (12:16 +0300)]
net: phy: nxp-c45-tja11xx: run cable test with the PHY in test mode

For TJA1120, the enable bit for cable test is not writable if the PHY is
not in test mode.

Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20230731091619.77961-10-radu-nicolae.pirea@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months agonet: phy: nxp-c45-tja11xx: handle FUSA irq
Radu Pirea (NXP OSS) [Mon, 31 Jul 2023 09:16:16 +0000 (12:16 +0300)]
net: phy: nxp-c45-tja11xx: handle FUSA irq

TJA1120 and TJA1103 have a set of functional safety hardware tests
executed after every reset, and when the tests are done, the IRQ line is
asserted. For the moment, the purpose of these handlers is to acknowledge
the IRQ and not to check the FUSA tests status.

Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20230731091619.77961-9-radu-nicolae.pirea@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months agonet: phy: nxp-c45-tja11xx: read egress ts on TJA1120
Radu Pirea (NXP OSS) [Mon, 31 Jul 2023 09:16:15 +0000 (12:16 +0300)]
net: phy: nxp-c45-tja11xx: read egress ts on TJA1120

The egress timestamp FIFO/circular buffer works differently on TJA1120 than
on TJA1103.

For TJA1103 the new timestamp should be manually moved from the FIFO to
the hardware buffer before checking if the timestamp is valid.

For TJA1120 the hardware will automatically move the new timestamp
from the FIFO to the buffer and the user should check the valid bit, read
the timestamp and unlock the buffer by writing any of the buffer
registers (which are read only).

Another change for the TJA1120 is the behaviour of the EGR TS IRQ bit.
This bit was a self-clear bit for TJA1103, but now should be cleared
before reading the timestamp.

Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Link: https://lore.kernel.org/r/20230731091619.77961-8-radu-nicolae.pirea@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months agonet: phy: nxp-c45-tja11xx: enable LTC sampling on both ext_ts edges
Radu Pirea (NXP OSS) [Mon, 31 Jul 2023 09:16:14 +0000 (12:16 +0300)]
net: phy: nxp-c45-tja11xx: enable LTC sampling on both ext_ts edges

The external trigger configuration for TJA1120 has changed. The PHY
supports sampling of the LTC on rising and on falling edge.

Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20230731091619.77961-7-radu-nicolae.pirea@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months agonet: phy: nxp-c45-tja11xx: add TJA1120 support
Radu Pirea (NXP OSS) [Mon, 31 Jul 2023 09:16:13 +0000 (12:16 +0300)]
net: phy: nxp-c45-tja11xx: add TJA1120 support

Add TJA1120 driver entry and its driver_data.

Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20230731091619.77961-6-radu-nicolae.pirea@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months agonet: phy: nxp-c45-tja11xx: use get_features
Radu Pirea (NXP OSS) [Mon, 31 Jul 2023 09:16:12 +0000 (12:16 +0300)]
net: phy: nxp-c45-tja11xx: use get_features

PHY_BASIC_T1_FEATURES are not the right features supported by TJA1103
anymore.
For example ethtool reports:
[root@alarm ~]# ethtool end0
Settings for end0:
        Supported ports: [ TP ]
        Supported link modes:   100baseT1/Full
                                10baseT1L/Full

10baseT1L/Full is not supported by TJA1103 and the supported ports list is
not complete. The PHY also has an MII port.

genphy_c45_pma_read_abilities implementation can detect the PHY features
and they look like this.
[root@alarm ~]# ethtool end0
Settings for end0:
        Supported ports: [ TP    MII ]
        Supported link modes:   100baseT1/Full
        Supported pause frame use: Symmetric
        Supports auto-negotiation: No
        Supported FEC modes: Not reported
        Advertised link modes:  100baseT1/Full
        Advertised pause frame use: Symmetric
        Advertised auto-negotiation: No
        Advertised FEC modes: Not reported
        Speed: 100Mb/s
        Duplex: Full
        Auto-negotiation: off
        master-slave cfg: forced master
        master-slave status: master
        Port: Twisted Pair
        PHYAD: 1
        Transceiver: external
        MDI-X: Unknown
        Supports Wake-on: g
        Wake-on: d
        Link detected: yes
        SQI: 7/7

Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20230731091619.77961-5-radu-nicolae.pirea@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months agonet: phy: nxp-c45-tja11xx: prepare the ground for TJA1120
Radu Pirea (NXP OSS) [Mon, 31 Jul 2023 09:16:11 +0000 (12:16 +0300)]
net: phy: nxp-c45-tja11xx: prepare the ground for TJA1120

Between TJA1120 and TJA1103 the hardware was improved, but some register
addresses were changed and some bit fields were moved from one register
to another.

Introduce the nxp_c45_reg_field structure and its associated functions to
abstract the differences between the PHYs.

Remove the defined bits and register addresses that are not common
between TJA1103 and TJA1120 and replace them with reg_fields and
register addresses from phydev->drv->driver_data.
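
A rough userspace sketch of the idea, not the driver's actual code: a field
descriptor records the register address, bit offset and width, and a per-chip
table places the same logical field at different locations. The names
reg_field, chip_regmap and counter_en, the register addresses, and the
simulated register file are all assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical field descriptor: where a bit field lives in a register. */
struct reg_field {
	uint16_t reg;    /* register address (made up for this sketch) */
	uint8_t  offset; /* position of the least significant bit */
	uint8_t  size;   /* width of the field in bits */
};

/* Per-chip table: the same logical field at different locations. */
struct chip_regmap {
	struct reg_field counter_en;
};

static const struct chip_regmap chip_a = { .counter_en = { 0x10, 2, 1 } };
static const struct chip_regmap chip_b = { .counter_en = { 0x20, 0, 1 } };

/* Simulated 16-bit register file standing in for real bus accesses. */
static uint16_t regs[0x40];

static uint16_t field_mask(const struct reg_field *f)
{
	return (uint16_t)(((1u << f->size) - 1u) << f->offset);
}

/* Read a field without knowing which chip it belongs to. */
static unsigned int field_read(const struct reg_field *f)
{
	return (unsigned int)(regs[f->reg] & field_mask(f)) >> f->offset;
}

/* Read-modify-write of a single field; other bits are preserved. */
static void field_write(const struct reg_field *f, unsigned int val)
{
	uint16_t mask = field_mask(f);

	regs[f->reg] = (uint16_t)((regs[f->reg] & ~mask) |
				  ((val << f->offset) & mask));
}
```

Common code would then call field_write(&chip->counter_en, 1) and never
hard-code a register address or bit position.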

Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Link: https://lore.kernel.org/r/20230731091619.77961-4-radu-nicolae.pirea@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months ago net: phy: nxp-c45-tja11xx: remove RX BIST frame counters
Radu Pirea (NXP OSS) [Mon, 31 Jul 2023 09:16:10 +0000 (12:16 +0300)]
net: phy: nxp-c45-tja11xx: remove RX BIST frame counters

Remove RX BIST frame counters from the PHY statistics.
In production mode, these counters are always read as 0.

Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20230731091619.77961-3-radu-nicolae.pirea@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months ago net: phy: nxp-c45-tja11xx: use phylib master/slave implementation
Radu Pirea (NXP OSS) [Mon, 31 Jul 2023 09:16:09 +0000 (12:16 +0300)]
net: phy: nxp-c45-tja11xx: use phylib master/slave implementation

Remove the custom implementation of master/slave setup and read status
and use genphy_c45_config_aneg and genphy_c45_read_status, since phylib
has support for master/slave setup and master/slave status.

Signed-off-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20230731091619.77961-2-radu-nicolae.pirea@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months ago Merge branch 'virtio_net-add-per-queue-interrupt-coalescing-support'
Jakub Kicinski [Wed, 2 Aug 2023 04:02:06 +0000 (21:02 -0700)]
Merge branch 'virtio_net-add-per-queue-interrupt-coalescing-support'

Gavin Li says:

====================
virtio_net: add per queue interrupt coalescing support

Currently, coalescing parameters are grouped for all transmit and receive
virtqueues. This patch series adds support to set or get the parameters for
a specified virtqueue.

When the traffic between virtqueues is unbalanced, for example, one virtqueue
is busy and another virtqueue is idle, then it will be very useful to
control coalescing parameters at the virtqueue granularity.

Example command:
$ ethtool -Q eth5 queue_mask 0x1 --coalesce tx-packets 10
Would set max_packets=10 to VQ 1.
$ ethtool -Q eth5 queue_mask 0x1 --coalesce rx-packets 10
Would set max_packets=10 to VQ 0.
$ ethtool -Q eth5 queue_mask 0x1 --show-coalesce
 Queue: 0
 Adaptive RX: off  TX: off
 stats-block-usecs: 0
 sample-interval: 0
 pkt-rate-low: 0
 pkt-rate-high: 0

 rx-usecs: 222
 rx-frames: 0
 rx-usecs-irq: 0
 rx-frames-irq: 256

 tx-usecs: 222
 tx-frames: 0
 tx-usecs-irq: 0
 tx-frames-irq: 256

 rx-usecs-low: 0
 rx-frame-low: 0
 tx-usecs-low: 0
 tx-frame-low: 0

 rx-usecs-high: 0
 rx-frame-high: 0
 tx-usecs-high: 0
 tx-frame-high: 0
====================

Link: https://lore.kernel.org/r/20230731070656.96411-1-gavinl@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months ago virtio_net: enable per queue interrupt coalesce feature
Gavin Li [Mon, 31 Jul 2023 07:06:56 +0000 (10:06 +0300)]
virtio_net: enable per queue interrupt coalesce feature

Enable per queue interrupt coalesce feature bit in driver and validate its
dependency with control queue.

Signed-off-by: Gavin Li <gavinl@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Heng Qi <hengqi@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20230731070656.96411-4-gavinl@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months ago virtio_net: support per queue interrupt coalesce command
Gavin Li [Mon, 31 Jul 2023 07:06:55 +0000 (10:06 +0300)]
virtio_net: support per queue interrupt coalesce command

Add interrupt_coalesce config in send_queue and receive_queue to cache user
config.

Send the per-virtqueue interrupt moderation config to the underlying device
for more efficient interrupt moderation and CPU utilization in the
guest VM.

Additionally, address all the VQs when updating the global configuration,
as now the individual VQs configuration can diverge from the global
configuration.
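
The caching and the global-update rule can be sketched roughly as follows.
This is an illustration only, not the virtio_net code: the names intr_coal,
dev_state, set_queue_coal and set_global_coal, and the fixed queue count, are
assumptions, and no command is actually sent to a device.

```c
#include <stddef.h>
#include <stdint.h>

/* Coalescing settings grouped in one structure so they can be reused. */
struct intr_coal {
	uint32_t max_usecs;
	uint32_t max_packets;
};

#define NUM_QUEUES 4 /* arbitrary queue count for this sketch */

struct dev_state {
	struct intr_coal global_rx;        /* device-wide setting */
	struct intr_coal rx_q[NUM_QUEUES]; /* cached per-queue settings */
};

/* Per-queue update: touches only one queue's cached settings. */
static int set_queue_coal(struct dev_state *d, size_t q, struct intr_coal c)
{
	if (q >= NUM_QUEUES)
		return -1;
	d->rx_q[q] = c;
	return 0;
}

/*
 * Global update: every queue is overwritten as well, because individual
 * queues may have diverged from the previous global setting.
 */
static void set_global_coal(struct dev_state *d, struct intr_coal c)
{
	d->global_rx = c;
	for (size_t q = 0; q < NUM_QUEUES; q++)
		d->rx_q[q] = c;
}
```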

Signed-off-by: Gavin Li <gavinl@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Heng Qi <hengqi@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20230731070656.96411-3-gavinl@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months ago virtio_net: extract interrupt coalescing settings to a structure
Gavin Li [Mon, 31 Jul 2023 07:06:54 +0000 (10:06 +0300)]
virtio_net: extract interrupt coalescing settings to a structure

Extract interrupt coalescing settings to a structure so that it could be
reused in other data structures.

Signed-off-by: Gavin Li <gavinl@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Heng Qi <hengqi@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230731070656.96411-2-gavinl@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months ago inet6: Remove unused function declaration udpv6_connect()
Yue Haibing [Mon, 31 Jul 2023 14:04:37 +0000 (22:04 +0800)]
inet6: Remove unused function declaration udpv6_connect()

This has never been implemented since the beginning of git history.

Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20230731140437.37056-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months ago net: make sure we never create ifindex = 0
Jakub Kicinski [Mon, 31 Jul 2023 17:11:58 +0000 (10:11 -0700)]
net: make sure we never create ifindex = 0

Instead of allocating from 1, use the proper xa_init flag
to protect ourselves from IDs wrapping back to 0.
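
The failure mode is easiest to see in a toy cyclic allocator. The sketch below
is plain userspace C, not the XArray API; id_alloc, id_alloc_cyclic and the
tiny ID_MAX are assumptions made so the wrap-around is visible. The kernel's
XArray documents an XA_FLAGS_ALLOC1 flag for allocating from 1, but this
message does not name the flag, so treat that as background only.

```c
#include <stdbool.h>
#include <stdint.h>

#define ID_MAX 7u /* tiny ID space so the wrap-around is easy to see */

struct id_alloc {
	bool used[ID_MAX + 1];
	uint32_t next; /* where the next search starts */
};

/* Cyclic allocation over [1, ID_MAX]; returns 0 only when the space is full. */
static uint32_t id_alloc_cyclic(struct id_alloc *a)
{
	uint32_t id = a->next;

	for (uint32_t tries = 0; tries < ID_MAX; tries++) {
		if (id < 1 || id > ID_MAX)
			id = 1; /* wrap to 1, never to 0 */
		if (!a->used[id]) {
			a->used[id] = true;
			a->next = id + 1;
			return id;
		}
		id++;
	}
	return 0;
}

static void id_free(struct id_alloc *a, uint32_t id)
{
	if (id >= 1 && id <= ID_MAX)
		a->used[id] = false;
}
```

Because the lower bound is a property of the allocator itself rather than of
the starting hint, a wrapped search can never hand out ID 0.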

Fixes: 759ab1edb56c ("net: store netdevs in an xarray")
Reported-by: Stephen Hemminger <stephen@networkplumber.org>
Link: https://lore.kernel.org/all/20230728162350.2a6d4979@hermes.local/
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20230731171159.988962-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months ago net/macmace: Replace zero-length array with DECLARE_FLEX_ARRAY() helper
Atul Raut [Sun, 30 Jul 2023 23:14:42 +0000 (16:14 -0700)]
net/macmace: Replace zero-length array with DECLARE_FLEX_ARRAY() helper

Since zero-length arrays are deprecated, we are replacing
them with C99 flexible-array members. As a result, instead
of declaring a zero-length array, use the new
DECLARE_FLEX_ARRAY() helper macro.

This fixes warnings such as:
./drivers/net/ethernet/apple/macmace.c:80:4-8: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays)
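
For readers unfamiliar with the construct, here is a minimal userspace sketch
of a C99 flexible-array member. It does not use the kernel's
DECLARE_FLEX_ARRAY() helper, which, as far as I know, exists for places where
the plain syntax is not permitted (such as inside a union); the frame type and
frame_new() are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* C99 flexible-array member: no declared size, and it must come last. */
struct frame {
	size_t len;
	unsigned char data[]; /* would be "data[0]" in the deprecated style */
};

/* Allocate the header plus len payload bytes in a single allocation. */
static struct frame *frame_new(const void *src, size_t len)
{
	struct frame *f = malloc(sizeof(*f) + len);

	if (!f)
		return NULL;
	f->len = len;
	memcpy(f->data, src, len);
	return f;
}
```

Unlike a zero-length array, the flexible-array form lets the compiler reject
misuse such as placing the array in the middle of a structure.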

Signed-off-by: Atul Raut <rauji.raut@gmail.com>
Link: https://lore.kernel.org/r/20230730231442.15003-1-rauji.raut@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
15 months ago net: dsa: qca8k: use dsa_for_each macro instead of for loop
Christian Marangi [Sun, 30 Jul 2023 07:41:13 +0000 (09:41 +0200)]
net: dsa: qca8k: use dsa_for_each macro instead of for loop

Convert the for loop to the dsa_for_each macro to avoid redundant writes to
unconnected/unused ports and tidy things up.
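
A sketch of why a filtering iterator saves writes, under heavy assumptions:
the types and the sw_for_each_used_port macro below are invented for
illustration and are not the DSA core's definitions.

```c
#include <stddef.h>

enum port_type { PORT_UNUSED, PORT_USER, PORT_CPU };

struct port {
	int index;
	enum port_type type;
};

struct sw {
	struct port ports[6];
	size_t num_ports;
};

/* Hypothetical iterator: the loop body runs only for ports that are in use. */
#define sw_for_each_used_port(p, s) \
	for ((p) = (s)->ports; (p) < (s)->ports + (s)->num_ports; (p)++) \
		if ((p)->type == PORT_UNUSED) {} else

/* Counts how many ports a setup pass would write to. */
static int count_port_writes(struct sw *s)
{
	struct port *p;
	int writes = 0;

	sw_for_each_used_port(p, s)
		writes++;
	return writes;
}
```

A plain index loop over all six ports would issue six writes; the filtered
loop issues one per port that is actually connected.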

Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20230730074113.21889-5-ansuelsmth@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
15 months ago net: dsa: qca8k: move qca8xxx hol fixup to separate function
Christian Marangi [Sun, 30 Jul 2023 07:41:12 +0000 (09:41 +0200)]
net: dsa: qca8k: move qca8xxx hol fixup to separate function

Move the qca8xxx hol fixup to a separate function to tidy things up and to
permit using a more efficient loop in a future patch.

Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20230730074113.21889-4-ansuelsmth@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
15 months ago net: dsa: qca8k: limit user ports access to the first CPU port on setup
Christian Marangi [Sun, 30 Jul 2023 07:41:11 +0000 (09:41 +0200)]
net: dsa: qca8k: limit user ports access to the first CPU port on setup

In preparation for multi-CPU support, set the CPU port LOOKUP MEMBER outside
the port loop and set up the LOOKUP MEMBER mask for user ports only to
the first CPU port.

This handles the flooding condition where every CPU port is set as a
target, and prevents packet duplication for unknown frames from user ports.

The secondary CPU port LOOKUP MEMBER mask will be set up later, when
port_change_master is implemented.

Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/20230730074113.21889-3-ansuelsmth@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
15 months ago net: dsa: qca8k: make learning configurable and keep off if standalone
Christian Marangi [Sun, 30 Jul 2023 07:41:10 +0000 (09:41 +0200)]
net: dsa: qca8k: make learning configurable and keep off if standalone

Address learning should initially be turned off by the driver for port
operation in standalone mode, then the DSA core handles changes to it
via ds->ops->port_bridge_flags().

Currently this is not the case for qca8k where learning is enabled
unconditionally in qca8k_setup for every user port.

Handle ports configured in standalone mode by making the learning
configurable and not enabling it by default.

Implement the .port_pre_bridge_flags and .port_bridge_flags dsa ops to
enable learning for bridges that request it, and tweak
.port_stp_state_set to correctly disable learning when the port is
configured in standalone mode.

Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20230730074113.21889-2-ansuelsmth@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
15 months ago net: dsa: tag_qca: return early if dev is not found
Christian Marangi [Sun, 30 Jul 2023 07:41:09 +0000 (09:41 +0200)]
net: dsa: tag_qca: return early if dev is not found

Currently the checksum is recalculated and the dsa tag stripped even if we
later don't find the dev.

To improve the code, exit early if we don't find the dev and skip additional
operations on the skb, since it will be freed anyway.

Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20230730074113.21889-1-ansuelsmth@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
15 months ago Merge branch 'net-sched-improve-class-lifetime-handling'
Paolo Abeni [Tue, 1 Aug 2023 08:47:28 +0000 (10:47 +0200)]
Merge branch 'net-sched-improve-class-lifetime-handling'

Pedro Tammela says:

====================
net/sched: improve class lifetime handling

Valis says[0]:
============
Three classifiers (cls_fw, cls_u32 and cls_route) always copy
tcf_result struct into the new instance of the filter on update.

This causes a problem when updating a filter bound to a class,
as tcf_unbind_filter() is always called on the old instance in the
success path, decreasing filter_cnt of the still referenced class
and allowing it to be deleted, leading to a use-after-free.
============

Turns out these could have been spotted easily with proper warnings.
Improve the current class lifetime with wrappers that check for
overflow/underflow.

While at it add an extack for when a class in use is deleted.

[0] https://lore.kernel.org/all/20230721174856.3045-1-sec@valis.email/
====================

Link: https://lore.kernel.org/r/20230728153537.1865379-1-pctammela@mojatatu.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
15 months ago net/sched: sch_qfq: warn about class in use while deleting
Pedro Tammela [Fri, 28 Jul 2023 15:35:37 +0000 (12:35 -0300)]
net/sched: sch_qfq: warn about class in use while deleting

Add extack to warn that delete was rejected because
the class is still in use
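
The pattern, roughly sketched in userspace C: ext_ack below is a stand-in for
the kernel's extended-ack object, and the structure names, the message text
and the choice of -EBUSY are assumptions for illustration rather than the
scheduler's actual code.

```c
#include <errno.h>
#include <stddef.h>

/* Stand-in for an extended-ack object: it only carries a message. */
struct ext_ack {
	const char *msg;
};

struct sched_class {
	unsigned int filter_cnt; /* filters still bound to this class */
	int deleted;
};

/* Refuse to delete a class that filters still reference, and say why. */
static int class_delete(struct sched_class *cl, struct ext_ack *extack)
{
	if (cl->filter_cnt > 0) {
		if (extack)
			extack->msg = "Class is still in use";
		return -EBUSY;
	}
	cl->deleted = 1;
	return 0;
}
```

With a message attached, user space can report the reason for the rejection
instead of a bare error code.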

Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
15 months ago net/sched: sch_htb: warn about class in use while deleting
Pedro Tammela [Fri, 28 Jul 2023 15:35:36 +0000 (12:35 -0300)]
net/sched: sch_htb: warn about class in use while deleting

Add extack to warn that delete was rejected because
the class is still in use

Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
15 months ago net/sched: sch_hfsc: warn about class in use while deleting
Pedro Tammela [Fri, 28 Jul 2023 15:35:35 +0000 (12:35 -0300)]
net/sched: sch_hfsc: warn about class in use while deleting

Add extack to warn that delete was rejected because
the class is still in use

Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
15 months ago net/sched: sch_drr: warn about class in use while deleting
Pedro Tammela [Fri, 28 Jul 2023 15:35:34 +0000 (12:35 -0300)]
net/sched: sch_drr: warn about class in use while deleting

Add extack to warn that delete was rejected because
the class is still in use

Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
15 months ago net/sched: wrap open coded Qdics class filter counter
Pedro Tammela [Fri, 28 Jul 2023 15:35:33 +0000 (12:35 -0300)]
net/sched: wrap open coded Qdics class filter counter

The 'filter_cnt' counter is used to control a Qdisc class lifetime.
Each filter referencing this class by its id will eventually
increment/decrement this counter in their respective
'add/update/delete' routines.
As these operations are always serialized under rtnl lock, we don't
need an atomic type like 'refcount_t'.

It also means that we lose the overflow/underflow checks already
present in refcount_t, which are valuable for hunting down bugs
where the unsigned counter wraps around, since they help automated tools
like syzkaller flag such situations.

Wrap the open coded increment/decrement into helper functions and
add overflow checks to the operations.
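
A minimal sketch of such helpers, assuming a plain unsigned counter. The names
qclass, class_get and class_put are hypothetical, and the sketch prints to
stderr where kernel code would more likely emit a warning; it is not the
patch's implementation.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>

struct qclass {
	unsigned int filter_cnt; /* number of filters bound to the class */
};

/* Bind a filter; refuse and warn if the counter would overflow. */
static bool class_get(struct qclass *cl)
{
	if (cl->filter_cnt == UINT_MAX) {
		fprintf(stderr, "filter_cnt overflow\n");
		return false;
	}
	cl->filter_cnt++;
	return true;
}

/* Unbind a filter; refuse and warn if the counter would underflow. */
static bool class_put(struct qclass *cl)
{
	if (cl->filter_cnt == 0) {
		fprintf(stderr, "filter_cnt underflow\n");
		return false;
	}
	cl->filter_cnt--;
	return true;
}
```

An unbalanced unbind, like the extra tcf_unbind_filter() call described in the
series cover letter, would then produce a visible warning instead of silently
wrapping the counter.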

Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>