Sabrina Dubroca [Fri, 25 Aug 2023 21:35:13 +0000 (23:35 +0200)]
tls: extend tls_cipher_desc to fully describe the ciphers
- add nonce, usually equal to iv_size but not for chacha
- add offsets into the crypto_info for each field
- add algorithm name
- add offloadable flag
Also add helpers to access each field of a crypto_info struct
described by a tls_cipher_desc.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://lore.kernel.org/r/39d5f476d63c171097764e8d38f6f158b7c109ae.1692977948.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sabrina Dubroca [Fri, 25 Aug 2023 21:35:12 +0000 (23:35 +0200)]
tls: rename tls_cipher_size_desc to tls_cipher_desc
We're going to add other fields to it to fully describe a cipher, so
the "_size" name won't match the contents.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://lore.kernel.org/r/76ca6c7686bd6d1534dfa188fb0f1f6fabebc791.1692977948.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sabrina Dubroca [Fri, 25 Aug 2023 21:35:11 +0000 (23:35 +0200)]
tls: reduce size of tls_cipher_size_desc
tls_cipher_size_desc indexes ciphers by their type, but we're not
using indices 0..50 of the array. Each struct tls_cipher_size_desc is
20B, so that's a lot of unused memory. We can reindex the array
starting at the lowest used cipher_type.
Introduce the get_cipher_size_desc helper to find the right item and
avoid out-of-bounds accesses, and make tls_cipher_size_desc's size
explicit so that gcc reminds us to update TLS_CIPHER_MIN/MAX when we
add a new cipher.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://lore.kernel.org/r/5e054e370e240247a5d37881a1cd93a67c15f4ca.1692977948.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sabrina Dubroca [Fri, 25 Aug 2023 21:35:10 +0000 (23:35 +0200)]
tls: add TLS_CIPHER_ARIA_GCM_* to tls_cipher_size_desc
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://lore.kernel.org/r/b2e0fb79e6d0a4478be9bf33781dc9c9281c9d56.1692977948.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sabrina Dubroca [Fri, 25 Aug 2023 21:35:09 +0000 (23:35 +0200)]
tls: move tls_cipher_size_desc to net/tls/tls.h
It's only used in net/tls/*, no need to bloat include/net/tls.h.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://lore.kernel.org/r/dd9fad80415e5b3575b41f56b331871038362eab.1692977948.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sabrina Dubroca [Fri, 25 Aug 2023 21:35:08 +0000 (23:35 +0200)]
selftests: tls: test some invalid inputs for setsockopt
This test will need to be updated if new ciphers are added.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://lore.kernel.org/r/bfcfa9cffda56d2064296ab7c99a05775dd4c28e.1692977948.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sabrina Dubroca [Fri, 25 Aug 2023 21:35:07 +0000 (23:35 +0200)]
selftests: tls: add getsockopt test
The kernel accepts fetching either just the version and cipher type,
or exactly the per-cipher struct. Also check that getsockopt returns
what we just passed to the kernel.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://lore.kernel.org/r/81a007ca13de9a74f4af45635d06682cdb385a54.1692977948.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sabrina Dubroca [Fri, 25 Aug 2023 21:35:06 +0000 (23:35 +0200)]
selftests: tls: add test variants for aria-gcm
Only supported for TLS1.2.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://lore.kernel.org/r/ccf4a4d3f3820f8ff30431b7629f5210cb33fa89.1692977948.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Mon, 28 Aug 2023 00:17:19 +0000 (17:17 -0700)]
Merge branch 'tools-net-ynl-add-support-for-netlink-raw-families'
Donald Hunter says:
====================
tools/net/ynl: Add support for netlink-raw families
This patchset adds support for netlink-raw families such as rtnetlink.
Patch 1 fixes a typo in existing schemas
Patch 2 contains the schema definition
Patches 3 & 4 update the schema documentation
Patches 5 - 9 extends ynl
Patches 10 - 12 add several netlink-raw specs
The netlink-raw schema is very similar to genetlink-legacy and I thought
about making the changes there and symlinking to it. On balance I
thought that might be problematic for accurate schema validation.
rtnetlink doesn't seem to fit into unified or directional message
enumeration models. It seems like an 'explicit' model would be useful,
to force the schema author to specify the message ids directly.
There is not yet support for notifications because ynl currently doesn't
support defining 'event' properties on a 'do' operation. The message ids
are shared so ops need to be both sync and async. I plan to look at this
in a future patch.
The link and route messages contain different nested attributes
dependent on the type of link or route. Decoding these will need some
kind of attr-space selection that uses the value of another attribute as
the selector key. These nested attributes have been left with type
'binary' for now.
====================
Link: https://lore.kernel.org/r/20230825122756.7603-1-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Donald Hunter [Fri, 25 Aug 2023 12:27:55 +0000 (13:27 +0100)]
doc/netlink: Add spec for rt route messages
Add schema for rt route with support for getroute, newroute and
delroute.
Routes can be dumped with filter attributes like this:
./tools/net/ynl/cli.py \
--spec Documentation/netlink/specs/rt_route.yaml \
--dump getroute --json '{"rtm-family": 2, "rtm-table": 254}'
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20230825122756.7603-13-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Donald Hunter [Fri, 25 Aug 2023 12:27:54 +0000 (13:27 +0100)]
doc/netlink: Add spec for rt link messages
Add schema for rt link with support for newlink, dellink, getlink,
setlink and getstats.
A dummy link can be created like this:
sudo ./tools/net/ynl/cli.py \
--spec Documentation/netlink/specs/rt_link.yaml \
--do newlink --create \
--json '{"ifname": "dummy0", "linkinfo": {"kind": "dummy"}}'
For example, offload stats can be fetched like this:
./tools/net/ynl/cli.py \
--spec Documentation/netlink/specs/rt_link.yaml \
--dump getstats --json '{ "filter-mask": 8 }'
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20230825122756.7603-12-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Donald Hunter [Fri, 25 Aug 2023 12:27:53 +0000 (13:27 +0100)]
doc/netlink: Add spec for rt addr messages
Add schema for rt addr with support for:
- newaddr, deladdr, getaddr (dump)
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20230825122756.7603-11-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Donald Hunter [Fri, 25 Aug 2023 12:27:52 +0000 (13:27 +0100)]
tools/net/ynl: Add support for create flags
Add support for using NLM_F_REPLACE, _EXCL, _CREATE and _APPEND flags
in requests.
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20230825122756.7603-10-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Donald Hunter [Fri, 25 Aug 2023 12:27:51 +0000 (13:27 +0100)]
tools/net/ynl: Implement nlattr array-nest decoding in ynl
Add support for the 'array-nest' attribute type that is used by several
netlink-raw families.
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20230825122756.7603-9-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Donald Hunter [Fri, 25 Aug 2023 12:27:50 +0000 (13:27 +0100)]
tools/net/ynl: Add support for netlink-raw families
Refactor the ynl code to encapsulate protocol specifics into
NetlinkProtocol and GenlProtocol.
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://lore.kernel.org/r/20230825122756.7603-8-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Donald Hunter [Fri, 25 Aug 2023 12:27:49 +0000 (13:27 +0100)]
tools/net/ynl: Fix extack parsing with fixed header genlmsg
Move decode_fixed_header into YnlFamily and add a _fixed_header_size
method to allow extack decoding to skip the fixed header.
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20230825122756.7603-7-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Donald Hunter [Fri, 25 Aug 2023 12:27:48 +0000 (13:27 +0100)]
tools/ynl: Add mcast-group schema parsing to ynl
Add a SpecMcastGroup class to the nlspec lib.
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20230825122756.7603-6-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Donald Hunter [Fri, 25 Aug 2023 12:27:47 +0000 (13:27 +0100)]
doc/netlink: Document the netlink-raw schema extensions
Add a doc page for netlink-raw that describes the schema attributes
needed for netlink-raw.
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20230825122756.7603-5-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Donald Hunter [Fri, 25 Aug 2023 12:27:46 +0000 (13:27 +0100)]
doc/netlink: Update genetlink-legacy documentation
Add documentation for recently added genetlink-legacy schema attributes.
Remove statements about 'work in progress' and 'todo'.
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20230825122756.7603-4-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Donald Hunter [Fri, 25 Aug 2023 12:27:45 +0000 (13:27 +0100)]
doc/netlink: Add a schema for netlink-raw families
This schema is largely a copy of the genetlink-legacy schema with the
following modifications:
- change the schema id to netlink-raw
- add a top-level protonum property, e.g. 0 (for NETLINK_ROUTE)
- change the protocol enumeration to netlink-raw, removing the
genetlink options.
- replace doc references to generic netlink with raw netlink
- add a value property to mcast-group definitions
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20230825122756.7603-3-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Donald Hunter [Fri, 25 Aug 2023 12:27:44 +0000 (13:27 +0100)]
doc/netlink: Fix typo in genetlink-* schemas
Fix typo verion -> version in genetlink-c and genetlink-legacy.
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20230825122756.7603-2-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Mon, 28 Aug 2023 00:08:47 +0000 (17:08 -0700)]
Merge branch 'devlink-mlx5-add-port-function-attributes-for-ipsec'
Saeed Mahameed says:
====================
{devlink,mlx5}: Add port function attributes for ipsec
From Dima:
Introduce hypervisor-level control knobs to set the functionality of PCI
VF devices passed through to guests. The administrator of a hypervisor
host may choose to change the settings of a port function from the
defaults configured by the device firmware.
The software stack has two types of IPsec offload - crypto and packet.
Specifically, the ip xfrm command has sub-commands for "state" and
"policy" that have an "offload" parameter. With ip xfrm state, both
crypto and packet offload types are supported, while ip xfrm policy can
only be offloaded in packet mode.
The series introduces two new boolean attributes of a port function:
ipsec_crypto and ipsec_packet. The goal is to provide a similar level of
granularity for controlling VF IPsec offload capabilities, which would
be aligned with the software model. This will allow users to decide if
they want both types of offload enabled for a VF, just one of them, or
none at all (which is the default).
At a high level, the difference between the two knobs is that with
ipsec_crypto, only XFRM state can be offloaded. Specifically, only the
crypto operation (Encrypt/Decrypt) is offloaded. With ipsec_packet, both
XFRM state and policy can be offloaded. Furthermore, in addition to
crypto operation offload, IPsec encapsulation is also offloaded. For
XFRM state, choosing between crypto and packet offload types is
possible. From the HW perspective, different resources may be required
for each offload type.
Examples of when a user prefers to enable IPsec packet offload for a VF
when using switchdev mode:
$ devlink port show pci/0000:06:00.0/1
pci/0000:06:00.0/1: type eth netdev enp6s0pf0vf0 flavour pcivf pfnum 0 vfnum 0
function:
hw_addr 00:00:00:00:00:00 roce enable migratable disable ipsec_crypto disable ipsec_packet disable
$ devlink port function set pci/0000:06:00.0/1 ipsec_packet enable
$ devlink port show pci/0000:06:00.0/1
pci/0000:06:00.0/1: type eth netdev enp6s0pf0vf0 flavour pcivf pfnum 0 vfnum 0
function:
hw_addr 00:00:00:00:00:00 roce enable migratable disable ipsec_crypto disable ipsec_packet enable
This enables the corresponding IPsec capability of the function before
it's enumerated, so when the driver reads the capability from the device
firmware, it is enabled. The driver is then able to configure
corresponding features and ops of the VF net device to support IPsec
state and policy offloading.
v2: https://lore.kernel.org/netdev/
20230421104901.897946-1-dchumak@nvidia.com/
====================
Link: https://lore.kernel.org/r/20230825062836.103744-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Dima Chumak [Fri, 25 Aug 2023 06:28:36 +0000 (23:28 -0700)]
net/mlx5: Implement devlink port function cmds to control ipsec_packet
Implement devlink port function commands to enable / disable IPsec
packet offloads. This is used to control the IPsec capability of the
device.
When ipsec_offload is enabled for a VF, it prevents adding IPsec packet
offloads on the PF, because the two cannot be active simultaneously due
to HW constraints. Conversely, if there are any active IPsec packet
offloads on the PF, it's not allowed to enable ipsec_packet on a VF,
until PF IPsec offloads are cleared.
Signed-off-by: Dima Chumak <dchumak@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20230825062836.103744-9-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Dima Chumak [Fri, 25 Aug 2023 06:28:35 +0000 (23:28 -0700)]
net/mlx5: Implement devlink port function cmds to control ipsec_crypto
Implement devlink port function commands to enable / disable IPsec
crypto offloads. This is used to control the IPsec capability of the
device.
When ipsec_crypto is enabled for a VF, it prevents adding IPsec crypto
offloads on the PF, because the two cannot be active simultaneously due
to HW constraints. Conversely, if there are any active IPsec crypto
offloads on the PF, it's not allowed to enable ipsec_crypto on a VF,
until PF IPsec offloads are cleared.
Signed-off-by: Dima Chumak <dchumak@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20230825062836.103744-8-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Leon Romanovsky [Fri, 25 Aug 2023 06:28:34 +0000 (23:28 -0700)]
net/mlx5: Provide an interface to block change of IPsec capabilities
mlx5 HW can't perform IPsec offload operation simultaneously both on PF
and VFs at the same time. While the previous patches added devlink knobs
to change IPsec capabilities dynamically, there is a need to add a logic
to block such IPsec capabilities for the cases when IPsec is already
configured.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20230825062836.103744-7-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Leon Romanovsky [Fri, 25 Aug 2023 06:28:33 +0000 (23:28 -0700)]
net/mlx5: Add IFC bits to support IPsec enable/disable
Add hardware definitions to allow to control IPSec capabilities.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20230825062836.103744-6-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Leon Romanovsky [Fri, 25 Aug 2023 06:28:32 +0000 (23:28 -0700)]
net/mlx5e: Rewrite IPsec vs. TC block interface
In the commit
366e46242b8e ("net/mlx5e: Make IPsec offload work together
with eswitch and TC"), new API to block IPsec vs. TC creation was introduced.
Internally, that API used devlink lock to avoid races with userspace, but it is
not really needed as dev->priv.eswitch is stable and can't be changed. So remove
dependency on devlink lock and move block encap code back to its original place.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20230825062836.103744-5-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Leon Romanovsky [Fri, 25 Aug 2023 06:28:31 +0000 (23:28 -0700)]
net/mlx5: Drop extra layer of locks in IPsec
There is no need in holding devlink lock as it gives nothing
compared to already used write mode_lock.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Link: https://lore.kernel.org/r/20230825062836.103744-4-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Dima Chumak [Fri, 25 Aug 2023 06:28:30 +0000 (23:28 -0700)]
devlink: Expose port function commands to control IPsec packet offloads
Expose port function commands to enable / disable IPsec packet offloads,
this is used to control the port IPsec capabilities.
When IPsec packet is disabled for a function of the port (default),
function cannot offload IPsec packet operations (encapsulation and XFRM
policy offload). When enabled, IPsec packet operations can be offloaded
by the function of the port, which includes crypto operation
(Encrypt/Decrypt), IPsec encapsulation and XFRM state and policy
offload.
Example of a PCI VF port which supports IPsec packet offloads:
$ devlink port show pci/0000:06:00.0/1
pci/0000:06:00.0/1: type eth netdev enp6s0pf0vf0 flavour pcivf pfnum 0 vfnum 0
function:
hw_addr 00:00:00:00:00:00 roce enable ipsec_packet disable
$ devlink port function set pci/0000:06:00.0/1 ipsec_packet enable
$ devlink port show pci/0000:06:00.0/1
pci/0000:06:00.0/1: type eth netdev enp6s0pf0vf0 flavour pcivf pfnum 0 vfnum 0
function:
hw_addr 00:00:00:00:00:00 roce enable ipsec_packet enable
Signed-off-by: Dima Chumak <dchumak@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20230825062836.103744-3-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Dima Chumak [Fri, 25 Aug 2023 06:28:29 +0000 (23:28 -0700)]
devlink: Expose port function commands to control IPsec crypto offloads
Expose port function commands to enable / disable IPsec crypto offloads,
this is used to control the port IPsec capabilities.
When IPsec crypto is disabled for a function of the port (default),
function cannot offload any IPsec crypto operations (Encrypt/Decrypt and
XFRM state offloading). When enabled, IPsec crypto operations can be
offloaded by the function of the port.
Example of a PCI VF port which supports IPsec crypto offloads:
$ devlink port show pci/0000:06:00.0/1
pci/0000:06:00.0/1: type eth netdev enp6s0pf0vf0 flavour pcivf pfnum 0 vfnum 0
function:
hw_addr 00:00:00:00:00:00 roce enable ipsec_crypto disable
$ devlink port function set pci/0000:06:00.0/1 ipsec_crypto enable
$ devlink port show pci/0000:06:00.0/1
pci/0000:06:00.0/1: type eth netdev enp6s0pf0vf0 flavour pcivf pfnum 0 vfnum 0
function:
hw_addr 00:00:00:00:00:00 roce enable ipsec_crypto enable
Signed-off-by: Dima Chumak <dchumak@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20230825062836.103744-2-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
David S. Miller [Sun, 27 Aug 2023 06:13:24 +0000 (07:13 +0100)]
Merge branch 'iep-drver-timestamping-support'
MD Danish Anwar says:
====================
Introduce IEP driver and packet timestamping support
This series introduces Industrial Ethernet Peripheral (IEP) driver to
support timestamping of ethernet packets and thus support PTP and PPS
for PRU ICSSG ethernet ports.
This series also adds 10M full duplex support for ICSSG ethernet driver.
There are two IEP instances. IEP0 is used for packet timestamping while IEP1
is used for 10M full duplex support.
This is v7 of the series [v1]. It addresses comments made on [v6].
This series is based on linux-next(#next-
20230823).
Changes from v6 to v7:
*) Dropped blank line in example section of patch 1.
*) Patch 1 previously had three examples, removed two examples and kept only
one example as asked by Krzysztof.
*) Added Jacob Keller's RB tag in patch 5.
*) Dropped Roger's RB tags from the patches that he has authored (Patch 3 and 4)
Changes from v5 to v6:
*) Added description of IEP in commit messages of patch 2 as asked by Rob.
*) Described the items constraints properly for iep property in patch 2 as
asked by Rob.
*) Added Roger and Simon's RB tags.
Changes from v4 to v5:
*) Added comments on why we are using readl / writel instead of regmap_read()
/ write() in icss_iep_gettime() / settime() APIs as asked by Roger.
*) Added Conor's RB tag in patch 1 and 2.
Change from v3 to v4:
*) Changed compatible in iep dt bindings. Now each SoC has their own compatible
in the binding with "ti,am654-icss-iep" as a fallback as asked by Conor.
*) Addressed Andew's comments and removed helper APIs icss_iep_readl() /
writel(). Now the settime/gettime APIs directly use readl() / writel().
*) Moved selecting TI_ICSS_IEP in Kconfig from patch 3 to patch 4.
*) Removed forward declaration of icss_iep_of_match in patch 3.
*) Replaced use of of_device_get_match_data() to device_get_match_data() in
patch 3.
*) Removed of_match_ptr() from patch 3 as it is not needed.
Changes from v2 to v3:
*) Addressed Roger's comment and moved IEP1 related changes in patch 5.
*) Addressed Roger's comment and moved icss_iep.c / .h changes from patch 4
to patch 3.
*) Added support for multiple timestamping in patch 4 as asked by Roger.
*) Addressed Andrew's comment and added comment in case SPEED_10 in
icssg_config_ipg() API.
*) Kept compatible as "ti,am654-icss-iep" for all TI K3 SoCs
Changes from v1 to v2:
*) Addressed Simon's comment to fix reverse xmas tree declaration. Some APIs
in patch 3 and 4 were not following reverse xmas tree variable declaration.
Fixed it in this version.
*) Addressed Conor's comments and removed unsupported SoCs from compatible
comment in patch 1.
*) Addded patch 2 which was not part of v1. Patch 2, adds IEP node to dt
bindings for ICSSG.
[v1] https://lore.kernel.org/all/
20230803110153.3309577-1-danishanwar@ti.com/
[v2] https://lore.kernel.org/all/
20230807110048.2611456-1-danishanwar@ti.com/
[v3] https://lore.kernel.org/all/
20230809114906.21866-1-danishanwar@ti.com/
[v4] https://lore.kernel.org/all/
20230814100847.3531480-1-danishanwar@ti.com/
[v5] https://lore.kernel.org/all/
20230817114527.1585631-1-danishanwar@ti.com/
[v6] https://lore.kernel.org/all/
20230823113254.292603-1-danishanwar@ti.com/
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Grygorii Strashko [Thu, 24 Aug 2023 11:46:18 +0000 (17:16 +0530)]
net: ti: icssg-prueth: am65x SR2.0 add 10M full duplex support
For AM65x SR2.0 it's required to enable IEP1 in raw 64bit mode which is
used by PRU FW to monitor the link and apply w/a for 10M link issue.
Note. No public errata available yet.
Without this w/a the PRU FW will stuck if link state changes under TX
traffic pressure.
Hence, add support for 10M full duplex for AM65x SR2.0:
- add new IEP API to enable IEP, but without PTP support
- add pdata quirk_10m_link_issue to enable 10M link issue w/a.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: MD Danish Anwar <danishanwar@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Roger Quadros [Thu, 24 Aug 2023 11:46:17 +0000 (17:16 +0530)]
net: ti: icssg-prueth: add packet timestamping and ptp support
Add packet timestamping TS and PTP PHC clock support.
For AM65x and AM64x:
- IEP1 is not used
- IEP0 is configured in shadow mode with 1ms cycle and shared between
Linux and FW. It provides time and TS in number cycles, so special
conversation in ns is required.
- IEP0 shared between PRUeth ports.
- IEP0 supports PPS, periodic output.
- IEP0 settime() and enabling PPS required FW interraction.
- RX TS provided with each packet in CPPI5 descriptor.
- TX TS returned through separate ICSSG hw queues for each port. TX TS
readiness is signaled by INTC IRQ. Only one packet at time can be requested
for TX TS.
Signed-off-by: Roger Quadros <rogerq@ti.com>
Co-developed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: MD Danish Anwar <danishanwar@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Roger Quadros [Thu, 24 Aug 2023 11:46:16 +0000 (17:16 +0530)]
net: ti: icss-iep: Add IEP driver
Add a driver for Industrial Ethernet Peripheral (IEP) block of PRUSS to
support timestamping of ethernet packets and thus support PTP and PPS
for PRU ethernet ports.
Signed-off-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: MD Danish Anwar <danishanwar@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
MD Danish Anwar [Thu, 24 Aug 2023 11:46:15 +0000 (17:16 +0530)]
dt-bindings: net: Add IEP property in ICSSG
Add IEP property in ICSSG hardware DT binding document.
ICSSG uses IEP (Industrial Ethernet Peripheral) to support timestamping
of ethernet packets, PTP and PPS.
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: MD Danish Anwar <danishanwar@ti.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
MD Danish Anwar [Thu, 24 Aug 2023 11:46:14 +0000 (17:16 +0530)]
dt-bindings: net: Add ICSS IEP
Add a DT binding document for the ICSS Industrial Ethernet Peripheral(IEP)
hardware. IEP supports packet timestamping, PTP and PPS.
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: MD Danish Anwar <danishanwar@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 27 Aug 2023 05:56:54 +0000 (06:56 +0100)]
Merge branch 'sfc-pedit-offloads'
Pieter Jansen van Vuuren says:
====================
sfc: introduce eth, ipv4 and ipv6 pedit offloads
This set introduces mac source and destination pedit set action offloads.
It also adds offload for ipv4 ttl and ipv6 hop limit pedit set action as
well pedit add actions that would result in the same semantics as
decrementing the ttl and hop limit.
v2:
- fix 'efx_tc_mangle' kdoc which was orphaned when adding 'efx_tc_pedit_add'.
- add description of 'match' in 'efx_tc_mangle' kdoc.
- correct some inconsistent kdoc indentation.
v1: https://lore.kernel.org/netdev/
20230823111725.28090-1-pieter.jansen-van-vuuren@amd.com/
====================
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pieter Jansen van Vuuren [Thu, 24 Aug 2023 11:28:42 +0000 (12:28 +0100)]
sfc: extend pedit add action to handle decrement ipv6 hop limit
Extend the pedit add actions to handle this case for ipv6. Similar to ipv4
dec ttl, decrementing ipv6 hop limit can be achieved by adding 0xff to the
hop limit field.
Co-developed-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pieter Jansen van Vuuren [Thu, 24 Aug 2023 11:28:41 +0000 (12:28 +0100)]
sfc: introduce pedit add actions on the ipv4 ttl field
Introduce pedit add actions and use it to achieve decrement ttl offload.
Decrement ttl can be achieved by adding 0xff to the ttl field.
Co-developed-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pieter Jansen van Vuuren [Thu, 24 Aug 2023 11:28:40 +0000 (12:28 +0100)]
sfc: add decrement ipv6 hop limit by offloading set hop limit actions
Offload pedit set ipv6 hop limit, where the hop limit has already been
matched and the new value is one less, by translating it to a decrement.
Co-developed-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pieter Jansen van Vuuren [Thu, 24 Aug 2023 11:28:39 +0000 (12:28 +0100)]
sfc: add decrement ttl by offloading set ipv4 ttl actions
Offload pedit set ipv4 ttl field, where the ttl field has already been
matched and the new value is one less, by translating it to a decrement.
Co-developed-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pieter Jansen van Vuuren [Thu, 24 Aug 2023 11:28:38 +0000 (12:28 +0100)]
sfc: add mac source and destination pedit action offload
Introduce the first pedit set offload functionality for the sfc driver.
In addition to this, add offload functionality for both mac source and
destination pedit set actions.
Co-developed-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pieter Jansen van Vuuren [Thu, 24 Aug 2023 11:28:37 +0000 (12:28 +0100)]
sfc: introduce ethernet pedit set action infrastructure
Introduce the initial ethernet pedit set action infrastructure in
preparation for adding mac src and dst pedit action offloads.
Co-developed-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Sat, 26 Aug 2023 02:09:45 +0000 (19:09 -0700)]
Merge branch '1GbE' of git://git./linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2023-08-24 (igc, e1000e)
This series contains updates to igc and e1000e drivers.
Vinicius adds support for utilizing multiple PTP registers on igc.
Sasha reduces interval time for PTM on igc and adds new device support
on e1000e.
* '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
e1000e: Add support for the next LOM generation
igc: Decrease PTM short interval from 10 us to 1 us
igc: Add support for multiple in-flight TX timestamps
====================
Link: https://lore.kernel.org/r/20230824204418.1551093-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Donald Hunter [Thu, 24 Aug 2023 14:22:21 +0000 (15:22 +0100)]
doc/netlink: Add delete operation to ovs_vport spec
Add del operation to the spec to help with testing.
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://lore.kernel.org/r/20230824142221.71339-1-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 24 Aug 2023 21:24:31 +0000 (14:24 -0700)]
tools: ynl-gen: fix uAPI generation after tempfile changes
We use a tempfile for code generation, to avoid wiping the target
file out if the code generator crashes. File contents are copied
from tempfile to actual destination at the end of main().
uAPI generation is relatively simple so when generating the uAPI
header we return from main() early, and never reach the "copy code
over" stage. Since commit under Fixes uAPI headers are not updated
by ynl-gen.
Move the copy/commit of the code into CodeWriter, to make it
easier to call at any point in time. Hook it into the destructor
to make sure we don't miss calling it.
Fixes:
f65f305ae008 ("tools: ynl-gen: use temporary file for rendering")
Link: https://lore.kernel.org/r/20230824212431.1683612-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Sat, 26 Aug 2023 01:55:21 +0000 (18:55 -0700)]
Merge branch 'stmmac-cleanups'
Russell King says:
====================
stmmac cleanups
One of the comments I had on Feiyang Chen's series was concerning the
initialisation of phylink... and so I've decided to do something about
it, cleaning it up a bit.
This series:
1) adds a new phylink function to limit the MAC capabilities according
to a maximum speed. This allows us to greatly simplify stmmac's
initialisation of phylink's mac capabilities.
2) everywhere that uses priv->plat->phylink_node first converts this
to a fwnode before doing anything with it. This is silly. Let's
instead store it as a fwnode to eliminate these conversions in
multiple places.
3) clean up passing the fwnode to phylink - it might as well happen
at the phylink_create() callsite, rather than being scattered
throughout the entire function.
4) same for mdio_bus_data
5) use phylink_limit_mac_speed() to handle the priv->plat->max_speed
restriction.
6) add a method to get the MAC-specific capabilities from the code
dealing with the MACs, and arrange to call it at an appropriate
time.
7) convert the gmac4 users to use the MAC specific method.
8) same for xgmac.
9) group all the simple phylink_config initialisations together.
10) convert half-duplex logic to being positive logic.
While looking into all of this, this raised eyebrows:
if (priv->plat->tx_queues_to_use > 1)
priv->phylink_config.mac_capabilities &=
~(MAC_10HD | MAC_100HD | MAC_1000HD);
priv->plat->tx_queues_to_use is initialised by platforms to either 1,
4 or 8, and can be controlled from userspace via the --set-channels
ethtool op. The implementation of this op in this driver limits the
number of channels to priv->dma_cap.number_tx_queues, which is derived
from the DMA hwcap.
So, the obvious questions are:
1) what guarantees that the static initialisation of tx_queues_to_use
will always be less than or equal to number_tx_queues from the DMA hw
cap?
2) tx_queues_to_use starts off as 1, but number_tx_queues is larger,
we will leave the half-duplex capabilities in place, but userspace can
increase tx_queues_to_use above 1. Does that mean half-duplex is then
not supported?
3) Should we be basing the decision whether half-duplex is supported
off the DMA capabilities?
4) What about priv->dma_cap.half_duplex? Doesn't that get a say in
whether half-duplex is supported or not? Why isn't this used? Why is
it only reported via debugfs? If it's not being used by the driver,
what's the point of reporting it via debugfs?
====================
Link: https://lore.kernel.org/r/ZOddFH22PWmOmbT5@shell.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Russell King (Oracle) [Thu, 24 Aug 2023 13:38:39 +0000 (14:38 +0100)]
net: stmmac: convert half-duplex support to positive logic
Rather than detecting when half-duplex is not supported, and clearing
the MAC capabilities, reverse the if() condition and use it to set the
capabilities instead.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1qZAXn-005pUb-SP@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Russell King (Oracle) [Thu, 24 Aug 2023 13:38:34 +0000 (14:38 +0100)]
net: stmmac: move priv->phylink_config.mac_managed_pm
Move priv->phylink_config.mac_managed_pm to be along side the other
phylink initialisations.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1qZAXi-005pUV-Nq@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Russell King (Oracle) [Thu, 24 Aug 2023 13:38:29 +0000 (14:38 +0100)]
net: stmmac: move xgmac specific phylink caps to dwxgmac2 core
Move the xgmac specific phylink capabilities to the dwxgmac2 support
core.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1qZAXd-005pUP-JL@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Russell King (Oracle) [Thu, 24 Aug 2023 13:38:24 +0000 (14:38 +0100)]
net: stmmac: move gmac4 specific phylink capabilities to gmac4
Move the setup of gmac4 speicifc phylink capabilities into gmac4 code.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1qZAXY-005pUJ-Ez@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Russell King (Oracle) [Thu, 24 Aug 2023 13:38:19 +0000 (14:38 +0100)]
net: stmmac: provide stmmac_mac_phylink_get_caps()
Allow MACs to provide their own capabilities via the MAC operations
struct.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1qZAXT-005pUD-Aj@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Russell King (Oracle) [Thu, 24 Aug 2023 13:38:14 +0000 (14:38 +0100)]
net: stmmac: use phylink_limit_mac_speed()
Use phylink_limit_mac_speed() to limit the MAC capabilities rather
than coding this for each speed.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1qZAXO-005pU7-61@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Russell King (Oracle) [Thu, 24 Aug 2023 13:38:09 +0000 (14:38 +0100)]
net: stmmac: use "mdio_bus_data" local variable
We have a local variable for priv->plat->mdio_bus_data, which we use
later in the conditional if() block, but we evaluate the above within
the conditional expression. Use mdio_bus_data instead. Since these
will be the only two users of this local variable, move its assignment
just before the if().
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1qZAXJ-005pU1-1z@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Russell King (Oracle) [Thu, 24 Aug 2023 13:38:03 +0000 (14:38 +0100)]
net: stmmac: clean up passing fwnode to phylink
Move the initialisation of the fwnode variable closer to its use
site, rather than scattered throughout stmmac_phy_setup().
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1qZAXD-005pTv-TN@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Russell King (Oracle) [Thu, 24 Aug 2023 13:37:58 +0000 (14:37 +0100)]
net: stmmac: convert plat->phylink_node to fwnode
All users of plat->phylink_node first convert it to a fwnode. Rather
than repeatedly convert to a fwnode, store it as a fwnode. To reflect
this change, call it plat->port_node instead - it is used for more
than just phylink.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1qZAX8-005pTo-OT@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Russell King (Oracle) [Thu, 24 Aug 2023 13:37:53 +0000 (14:37 +0100)]
net: phylink: add phylink_limit_mac_speed()
Add a function which can be used to limit the phylink MAC capabilities
to an upper speed limit.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1qZAX3-005pTi-K1@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Liang Chen [Thu, 24 Aug 2023 12:31:31 +0000 (20:31 +0800)]
veth: Avoid NAPI scheduling on failed SKB forwarding
When an skb fails to be forwarded to the peer(e.g., skb data buffer
length exceeds MTU), it will not be added to the peer's receive queue.
Therefore, we should schedule the peer's NAPI poll function only when
skb forwarding is successful to avoid unnecessary overhead.
Signed-off-by: Liang Chen <liangchen.linux@gmail.com>
Link: https://lore.kernel.org/r/20230824123131.7673-1-liangchen.linux@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Sat, 26 Aug 2023 01:40:14 +0000 (18:40 -0700)]
Merge tag 'for-netdev' of https://git./linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:
====================
pull-request: bpf-next 2023-08-25
We've added 87 non-merge commits during the last 8 day(s) which contain
a total of 104 files changed, 3719 insertions(+), 4212 deletions(-).
The main changes are:
1) Add multi uprobe BPF links for attaching multiple uprobes
and usdt probes, which is significantly faster and saves extra fds,
from Jiri Olsa.
2) Add support BPF cpu v4 instructions for arm64 JIT compiler,
from Xu Kuohai.
3) Add support BPF cpu v4 instructions for riscv64 JIT compiler,
from Pu Lehui.
4) Fix LWT BPF xmit hooks wrt their return values where propagating
the result from skb_do_redirect() would trigger a use-after-free,
from Yan Zhai.
5) Fix a BPF verifier issue related to bpf_kptr_xchg() with local kptr
where the map's value kptr type and locally allocated obj type
mismatch, from Yonghong Song.
6) Fix BPF verifier's check_func_arg_reg_off() function wrt graph
root/node which bypassed reg->off == 0 enforcement,
from Kumar Kartikeya Dwivedi.
7) Lift BPF verifier restriction in networking BPF programs to treat
comparison of packet pointers not as a pointer leak,
from Yafang Shao.
8) Remove unmaintained XDP BPF samples as they are maintained
in xdp-tools repository out of tree, from Toke Høiland-Jørgensen.
9) Batch of fixes for the tracing programs from BPF samples in order
to make them more libbpf-aware, from Daniel T. Lee.
10) Fix a libbpf signedness determination bug in the CO-RE relocation
handling logic, from Andrii Nakryiko.
11) Extend libbpf to support CO-RE kfunc relocations. Also follow-up
fixes for bpf_refcount shared ownership implementation,
both from Dave Marchevsky.
12) Add a new bpf_object__unpin() API function to libbpf,
from Daniel Xu.
13) Fix a memory leak in libbpf to also free btf_vmlinux
when the bpf_object gets closed, from Hao Luo.
14) Small error output improvements to test_bpf module, from Helge Deller.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (87 commits)
selftests/bpf: Add tests for rbtree API interaction in sleepable progs
bpf: Allow bpf_spin_{lock,unlock} in sleepable progs
bpf: Consider non-owning refs to refcounted nodes RCU protected
bpf: Reenable bpf_refcount_acquire
bpf: Use bpf_mem_free_rcu when bpf_obj_dropping refcounted nodes
bpf: Consider non-owning refs trusted
bpf: Ensure kptr_struct_meta is non-NULL for collection insert and refcount_acquire
selftests/bpf: Enable cpu v4 tests for RV64
riscv, bpf: Support unconditional bswap insn
riscv, bpf: Support signed div/mod insns
riscv, bpf: Support 32-bit offset jmp insn
riscv, bpf: Support sign-extension mov insns
riscv, bpf: Support sign-extension load insns
riscv, bpf: Fix missing exception handling and redundant zext for LDX_B/H/W
samples/bpf: Add note to README about the XDP utilities moved to xdp-tools
samples/bpf: Cleanup .gitignore
samples/bpf: Remove the xdp_sample_pkts utility
samples/bpf: Remove the xdp1 and xdp2 utilities
samples/bpf: Remove the xdp_rxq_info utility
samples/bpf: Remove the xdp_redirect* utilities
...
====================
Link: https://lore.kernel.org/r/20230825194319.12727-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Sat, 26 Aug 2023 01:35:08 +0000 (18:35 -0700)]
Merge tag 'wireless-next-2023-08-25' of git://git./linux/kernel/git/wireless/wireless-next
Kalle Valo says:
====================
wireless-next patches for v6.6
The second pull request for v6.6, this time with both stack and driver
changes. Unusually we have only one major new feature but lots of
small cleanup all over, I guess this is due to people have been on
vacation the last month.
Major changes:
rtw89
- Introduce Time Averaged SAR (TAS) support
* tag 'wireless-next-2023-08-25' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (114 commits)
wifi: rtlwifi: rtl8723: Remove unused function rtl8723_cmd_send_packet()
wifi: rtw88: usb: kill and free rx urbs on probe failure
wifi: rtw89: Fix clang -Wimplicit-fallthrough in rtw89_query_sar()
wifi: rtw89: phy: modify register setting of ENV_MNTR, PHYSTS and DIG
wifi: rtw89: phy: add phy_gen_def::cr_base to support WiFi 7 chips
wifi: rtw89: mac: define register address of rx_filter to generalize code
wifi: rtw89: mac: define internal memory address for WiFi 7 chip
wifi: rtw89: mac: generalize code to indirectly access WiFi internal memory
wifi: rtw89: mac: add mac_gen_def::band1_offset to map MAC band1 register address
wifi: wlcore: sdio: Use module_sdio_driver macro to simplify the code
wifi: rtw89: initialize multi-channel handling
wifi: rtw89: provide functions to configure NoA for beacon update
wifi: rtw89: call rtw89_chan_get() by vif chanctx if aware of vif
wifi: rtw89: sar: let caller decide the center frequency to query
wifi: rtw89: refine rtw89_correct_cck_chan() by rtw89_hw_to_nl80211_band()
wifi: rtw89: add function prototype for coex request duration
Fix nomenclature for USB and PCI wireless devices
wifi: ath: Use is_multicast_ether_addr() to check multicast Ether address
wifi: ath12k: Remove unused declarations
wifi: ath12k: add check max message length while scanning with extraie
...
====================
Link: https://lore.kernel.org/r/20230825132230.A0833C433C8@smtp.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Sat, 26 Aug 2023 01:30:59 +0000 (18:30 -0700)]
Merge tag 'for-net-next-2023-08-24' of git://git./linux/kernel/git/bluetooth/bluetooth-next
Luiz Augusto von Dentz says:
====================
bluetooth-next pull request for net-next:
- Introduce HCI_QUIRK_BROKEN_LE_CODED
- Add support for PA/BIG sync
- Add support for NXP IW624 chipset
- Add support for Qualcomm WCN7850
* tag 'for-net-next-2023-08-24' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next:
Bluetooth: btusb: Do not call kfree_skb() under spin_lock_irqsave()
Bluetooth: btusb: Fix quirks table naming
Bluetooth: HCI: Introduce HCI_QUIRK_BROKEN_LE_CODED
Bluetooth: btintel: Send new command for PPAG
Bluetooth: ISO: Add support for periodic adv reports processing
Bluetooth: hci_conn: fail SCO/ISO via hci_conn_failed if ACL gone early
Bluetooth: hci_core: Fix missing instances using HCI_MAX_AD_LENGTH
Bluetooth: ISO: Use defer setup to separate PA sync and BIG sync
Bluetooth: qca: add support for WCN7850
Bluetooth: qca: use switch case for soc type behavior
dt-bindings: net: bluetooth: qualcomm: document WCN7850 chipset
Bluetooth: hci_conn: Fix sending BT_HCI_CMD_LE_CREATE_CONN_CANCEL
Bluetooth: hci_sync: Fix UAF in hci_disconnect_all_sync
Bluetooth: btnxpuart: Improve inband Independent Reset handling
Bluetooth: btnxpuart: Add support for IW624 chipset
Bluetooth: btnxpuart: Remove check for CTS low after FW download
====================
Link: https://lore.kernel.org/r/20230824201458.2577-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alexei Starovoitov [Fri, 25 Aug 2023 16:23:17 +0000 (09:23 -0700)]
Merge branch 'bpf-refcount-followups-3-bpf_mem_free_rcu-refcounted-nodes'
Dave Marchevsky says:
====================
BPF Refcount followups 3: bpf_mem_free_rcu refcounted nodes
This series is the third of three (or more) followups to address issues
in the bpf_refcount shared ownership implementation discovered by Kumar.
This series addresses the use-after-free scenario described in [0]. The
first followup series ([1]) also attempted to address the same
use-after-free, but only got rid of the splat without addressing the
underlying issue. After this series the underyling issue is fixed and
bpf_refcount_acquire can be re-enabled.
The main fix here is migration of bpf_obj_drop to use
bpf_mem_free_rcu. To understand why this fixes the issue, let us consider
the example interleaving provided by Kumar in [0]:
CPU 0 CPU 1
n = bpf_obj_new
lock(lock1)
bpf_rbtree_add(rbtree1, n)
m = bpf_rbtree_acquire(n)
unlock(lock1)
kptr_xchg(map, m) // move to map
// at this point, refcount = 2
m = kptr_xchg(map, NULL)
lock(lock2)
lock(lock1) bpf_rbtree_add(rbtree2, m)
p = bpf_rbtree_first(rbtree1) if (!RB_EMPTY_NODE) bpf_obj_drop_impl(m) // A
bpf_rbtree_remove(rbtree1, p)
unlock(lock1)
bpf_obj_drop(p) // B
bpf_refcount_acquire(m) // use-after-free
...
Before this series, bpf_obj_drop returns memory to the allocator using
bpf_mem_free. At this point (B in the example) there might be some
non-owning references to that memory which the verifier believes are valid,
but where the underlying memory was reused for some other allocation.
Commit
7793fc3babe9 ("bpf: Make bpf_refcount_acquire fallible for
non-owning refs") attempted to fix this by doing refcount_inc_non_zero
on refcount_acquire in instead of refcount_inc under the assumption that
preventing erroneous incr-on-0 would be sufficient. This isn't true,
though: refcount_inc_non_zero must *check* if the refcount is zero, and
the memory it's checking could have been reused, so the check may look
at and incr random reused bytes.
If we wait to reuse this memory until all non-owning refs that could
point to it are gone, there is no possibility of this scenario
happening. Migrating bpf_obj_drop to use bpf_mem_free_rcu for refcounted
nodes accomplishes this.
For such nodes, the validity of their underlying memory is now tied to
RCU critical section. This matches MEM_RCU trustedness
expectations, so the series takes the opportunity to more explicitly
mark this trustedness state.
The functional effects of trustedness changes here are rather small.
This is largely due to local kptrs having separate verifier handling -
with implicit trustedness assumptions - than arbitrary kptrs.
Regardless, let's take the opportunity to move towards a world where
trustedness is more explicitly handled.
Changelog:
v1 -> v2: https://lore.kernel.org/bpf/
20230801203630.3581291-1-davemarchevsky@fb.com/
Patch 1 ("bpf: Ensure kptr_struct_meta is non-NULL for collection insert and refcount_acquire")
* Spent some time experimenting with a better approach as per convo w/
Yonghong on v1's patch. It started getting too complex, so left unchanged
for now. Yonghong was fine with this approach being shipped.
Patch 2 ("bpf: Consider non-owning refs trusted")
* Add Yonghong ack
Patch 3 ("bpf: Use bpf_mem_free_rcu when bpf_obj_dropping refcounted nodes")
* Add Yonghong ack
Patch 4 ("bpf: Reenable bpf_refcount_acquire")
* Add Yonghong ack
Patch 5 ("bpf: Consider non-owning refs to refcounted nodes RCU protected")
* Undo a nonfunctional whitespace change that shouldn't have been included
(Yonghong)
* Better logging message when complaining about rcu_read_{lock,unlock} in
rbtree cb (Alexei)
* Don't invalidate_non_owning_refs when processing bpf_rcu_read_unlock
(Yonghong, Alexei)
Patch 6 ("[RFC] bpf: Allow bpf_spin_{lock,unlock} in sleepable prog's RCU CS")
* preempt_{disable,enable} in __bpf_spin_{lock,unlock} (Alexei)
* Due to this we can consider spin_lock CS an RCU-sched read-side CS (per
RCU/Design/Requirements/Requirements.rst). Modify in_rcu_cs accordingly.
* no need to check for !in_rcu_cs before allowing bpf_spin_{lock,unlock}
(Alexei)
* RFC tag removed and renamed to "bpf: Allow bpf_spin_{lock,unlock} in
sleepable progs"
Patch 7 ("selftests/bpf: Add tests for rbtree API interaction in sleepable progs")
* Remove "no explicit bpf_rcu_read_lock" failure test, add similar success
test (Alexei)
Summary of patch contents, with sub-bullets being leading questions and
comments I think are worth reviewer attention:
* Patches 1 and 2 are moreso documententation - and
enforcement, in patch 1's case - of existing semantics / expectations
* Patch 3 changes bpf_obj_drop behavior for refcounted nodes such that
their underlying memory is not reused until RCU grace period elapses
* Perhaps it makes sense to move to mem_free_rcu for _all_
non-owning refs in the future, not just refcounted. This might
allow custom non-owning ref lifetime + invalidation logic to be
entirely subsumed by MEM_RCU handling. IMO this needs a bit more
thought and should be tackled outside of a fix series, so it's not
attempted here.
* Patch 4 re-enables bpf_refcount_acquire as changes in patch 3 fix
the remaining use-after-free
* One might expect this patch to be last in the series, or last
before selftest changes. Patches 5 and 6 don't change
verification or runtime behavior for existing BPF progs, though.
* Patch 5 brings the verifier's understanding of refcounted node
trustedness in line with Patch 4's changes
* Patch 6 allows some bpf_spin_{lock, unlock} calls in sleepable
progs. Marked RFC for a few reasons:
* bpf_spin_{lock,unlock} haven't been usable in sleepable progs
since before the introduction of bpf linked list and rbtree. As
such this feels more like a new feature that may not belong in
this fixes series.
* Patch 7 adds tests
[0]: https://lore.kernel.org/bpf/atfviesiidev4hu53hzravmtlau3wdodm2vqs7rd7tnwft34e3@xktodqeqevir/
[1]: https://lore.kernel.org/bpf/
20230602022647.1571784-1-davemarchevsky@fb.com/
====================
Link: https://lore.kernel.org/r/20230821193311.3290257-1-davemarchevsky@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Dave Marchevsky [Mon, 21 Aug 2023 19:33:11 +0000 (12:33 -0700)]
selftests/bpf: Add tests for rbtree API interaction in sleepable progs
Confirm that the following sleepable prog states fail verification:
* bpf_rcu_read_unlock before bpf_spin_unlock
* RCU CS will last at least as long as spin_lock CS
Also confirm that correct usage passes verification, specifically:
* Explicit use of bpf_rcu_read_{lock, unlock} in sleepable test prog
* Implied RCU CS due to spin_lock CS
None of the selftest progs actually attach to bpf_testmod's
bpf_testmod_test_read.
Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/r/20230821193311.3290257-8-davemarchevsky@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Dave Marchevsky [Mon, 21 Aug 2023 19:33:10 +0000 (12:33 -0700)]
bpf: Allow bpf_spin_{lock,unlock} in sleepable progs
Commit
9e7a4d9831e8 ("bpf: Allow LSM programs to use bpf spin locks")
disabled bpf_spin_lock usage in sleepable progs, stating:
Sleepable LSM programs can be preempted which means that allowng spin
locks will need more work (disabling preemption and the verifier
ensuring that no sleepable helpers are called when a spin lock is
held).
This patch disables preemption before grabbing bpf_spin_lock. The second
requirement above "no sleepable helpers are called when a spin lock is
held" is implicitly enforced by current verifier logic due to helper
calls in spin_lock CS being disabled except for a few exceptions, none
of which sleep.
Due to above preemption changes, bpf_spin_lock CS can also be considered
a RCU CS, so verifier's in_rcu_cs check is modified to account for this.
Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/r/20230821193311.3290257-7-davemarchevsky@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Dave Marchevsky [Mon, 21 Aug 2023 19:33:09 +0000 (12:33 -0700)]
bpf: Consider non-owning refs to refcounted nodes RCU protected
An earlier patch in the series ensures that the underlying memory of
nodes with bpf_refcount - which can have multiple owners - is not reused
until RCU grace period has elapsed. This prevents
use-after-free with non-owning references that may point to
recently-freed memory. While RCU read lock is held, it's safe to
dereference such a non-owning ref, as by definition RCU GP couldn't have
elapsed and therefore underlying memory couldn't have been reused.
From the perspective of verifier "trustedness" non-owning refs to
refcounted nodes are now trusted only in RCU CS and therefore should no
longer pass is_trusted_reg, but rather is_rcu_reg. Let's mark them
MEM_RCU in order to reflect this new state.
Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/r/20230821193311.3290257-6-davemarchevsky@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Dave Marchevsky [Mon, 21 Aug 2023 19:33:08 +0000 (12:33 -0700)]
bpf: Reenable bpf_refcount_acquire
Now that all reported issues are fixed, bpf_refcount_acquire can be
turned back on. Also reenable all bpf_refcount-related tests which were
disabled.
This a revert of:
* commit
f3514a5d6740 ("selftests/bpf: Disable newly-added 'owner' field test until refcount re-enabled")
* commit
7deca5eae833 ("bpf: Disable bpf_refcount_acquire kfunc calls until race conditions are fixed")
Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20230821193311.3290257-5-davemarchevsky@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Dave Marchevsky [Mon, 21 Aug 2023 19:33:07 +0000 (12:33 -0700)]
bpf: Use bpf_mem_free_rcu when bpf_obj_dropping refcounted nodes
This is the final fix for the use-after-free scenario described in
commit
7793fc3babe9 ("bpf: Make bpf_refcount_acquire fallible for
non-owning refs"). That commit, by virtue of changing
bpf_refcount_acquire's refcount_inc to a refcount_inc_not_zero, fixed
the "refcount incr on 0" splat. The not_zero check in
refcount_inc_not_zero, though, still occurs on memory that could have
been free'd and reused, so the commit didn't properly fix the root
cause.
This patch actually fixes the issue by free'ing using the recently-added
bpf_mem_free_rcu, which ensures that the memory is not reused until
RCU grace period has elapsed. If that has happened then
there are no non-owning references alive that point to the
recently-free'd memory, so it can be safely reused.
Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20230821193311.3290257-4-davemarchevsky@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Dave Marchevsky [Mon, 21 Aug 2023 19:33:06 +0000 (12:33 -0700)]
bpf: Consider non-owning refs trusted
Recent discussions around default kptr "trustedness" led to changes such
as commit
6fcd486b3a0a ("bpf: Refactor RCU enforcement in the
verifier."). One of the conclusions of those discussions, as expressed
in code and comments in that patch, is that we'd like to move away from
'raw' PTR_TO_BTF_ID without some type flag or other register state
indicating trustedness. Although PTR_TRUSTED and PTR_UNTRUSTED flags mark
this state explicitly, the verifier currently considers trustedness
implied by other register state. For example, owning refs to graph
collection nodes must have a nonzero ref_obj_id, so they pass the
is_trusted_reg check despite having no explicit PTR_{UN}TRUSTED flag.
This patch makes trustedness of non-owning refs to graph collection
nodes explicit as well.
By definition, non-owning refs are currently trusted. Although the ref
has no control over pointee lifetime, due to non-owning ref clobbering
rules (see invalidate_non_owning_refs) dereferencing a non-owning ref is
safe in the critical section controlled by bpf_spin_lock associated with
its owning collection.
Note that the previous statement does not hold true for nodes with shared
ownership due to the use-after-free issue that this series is
addressing. True shared ownership was disabled by commit
7deca5eae833
("bpf: Disable bpf_refcount_acquire kfunc calls until race conditions are fixed"),
though, so the statement holds for now. Further patches in the series will change
the trustedness state of non-owning refs before re-enabling
bpf_refcount_acquire.
Let's add NON_OWN_REF type flag to BPF_REG_TRUSTED_MODIFIERS such that a
non-owning ref reg state would pass is_trusted_reg check. Somewhat
surprisingly, this doesn't result in any change to user-visible
functionality elsewhere in the verifier: graph collection nodes are all
marked MEM_ALLOC, which tends to be handled in separate codepaths from
"raw" PTR_TO_BTF_ID. Regardless, let's be explicit here and document the
current state of things before changing it elsewhere in the series.
Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20230821193311.3290257-3-davemarchevsky@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Dave Marchevsky [Mon, 21 Aug 2023 19:33:05 +0000 (12:33 -0700)]
bpf: Ensure kptr_struct_meta is non-NULL for collection insert and refcount_acquire
It's straightforward to prove that kptr_struct_meta must be non-NULL for
any valid call to these kfuncs:
* btf_parse_struct_metas in btf.c creates a btf_struct_meta for any
struct in user BTF with a special field (e.g. bpf_refcount,
{rb,list}_node). These are stored in that BTF's struct_meta_tab.
* __process_kf_arg_ptr_to_graph_node in verifier.c ensures that nodes
have {rb,list}_node field and that it's at the correct offset.
Similarly, check_kfunc_args ensures bpf_refcount field existence for
node param to bpf_refcount_acquire.
* So a btf_struct_meta must have been created for the struct type of
node param to these kfuncs
* That BTF and its struct_meta_tab are guaranteed to still be around.
Any arbitrary {rb,list} node the BPF program interacts with either:
came from bpf_obj_new or a collection removal kfunc in the same
program, in which case the BTF is associated with the program and
still around; or came from bpf_kptr_xchg, in which case the BTF was
associated with the map and is still around
Instead of silently continuing with NULL struct_meta, which caused
confusing bugs such as those addressed by commit
2140a6e3422d ("bpf: Set
kptr_struct_meta for node param to list and rbtree insert funcs"), let's
error out. Then, at runtime, we can confidently say that the
implementations of these kfuncs were given a non-NULL kptr_struct_meta,
meaning that special-field-specific functionality like
bpf_obj_free_fields and the bpf_obj_drop change introduced later in this
series are guaranteed to execute.
This patch doesn't change functionality, just makes it easier to reason
about existing functionality.
Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20230821193311.3290257-2-davemarchevsky@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Kalle Valo [Fri, 25 Aug 2023 10:15:26 +0000 (13:15 +0300)]
Merge ath-next from git://git./linux/kernel/git/kvalo/ath.git
ath.git patches for v6.6. No major changes, only smaller fixes and
cleanups this time.
Jinjie Ruan [Thu, 24 Aug 2023 06:23:39 +0000 (14:23 +0800)]
wifi: rtlwifi: rtl8723: Remove unused function rtl8723_cmd_send_packet()
The function rtl8723_cmd_send_packet() is not used anywhere, so remove it.
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Acked-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230824062339.1885385-1-ruanjinjie@huawei.com
Sascha Hauer [Wed, 23 Aug 2023 07:50:21 +0000 (09:50 +0200)]
wifi: rtw88: usb: kill and free rx urbs on probe failure
After rtw_usb_alloc_rx_bufs() has been called rx urbs have been
allocated and must be freed in the error path. After rtw_usb_init_rx()
has been called they are submitted, so they also must be killed.
Add these forgotten steps to the probe error path.
Besides the lost memory this also fixes a problem when the driver
fails to download the firmware in rtw_chip_info_setup(). In this
case it can happen that the completion of the rx urbs handler runs
at a time when we already freed our data structures resulting in
a kernel crash.
Fixes:
a82dfd33d123 ("wifi: rtw88: Add common USB chip support")
Cc: stable@vger.kernel.org
Reported-by: Ilgaz Öcal <ilgaz@ilgaz.gen.tr>
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Acked-by: Larry Finger <Larry.Finger@lwfinger.net>
Acked-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230823075021.588596-1-s.hauer@pengutronix.de
Nathan Chancellor [Tue, 22 Aug 2023 15:27:16 +0000 (08:27 -0700)]
wifi: rtw89: Fix clang -Wimplicit-fallthrough in rtw89_query_sar()
clang warns (or errors with CONFIG_WERROR=y):
drivers/net/wireless/realtek/rtw89/sar.c:216:3: error: unannotated fall-through between switch labels [-Werror,-Wimplicit-fallthrough]
216 | case RTW89_TAS_STATE_DPR_FORBID:
| ^
drivers/net/wireless/realtek/rtw89/sar.c:216:3: note: insert 'break;' to avoid fall-through
216 | case RTW89_TAS_STATE_DPR_FORBID:
| ^
| break;
1 error generated.
Clang is a little more pedantic than GCC, which does not warn when
falling through to a case that is just break or return. Clang's version
is more in line with the kernel's own stance in deprecated.rst, which
states that all switch/case blocks must end in either break,
fallthrough, continue, goto, or return. Add the missing break to silence
the warning.
Closes: https://github.com/ClangBuiltLinux/linux/issues/1921
Fixes:
eb2624f55ad1 ("wifi: rtw89: Introduce Time Averaged SAR (TAS) feature")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230822-rtw89-tas-clang-implicit-fallthrough-v1-1-5cb73f0fa976@kernel.org
Cheng-Chieh Hsieh [Tue, 22 Aug 2023 12:58:22 +0000 (20:58 +0800)]
wifi: rtw89: phy: modify register setting of ENV_MNTR, PHYSTS and DIG
The ENV_MNTR(environment monitor) is the dynamic mechanism which based on
the HW of CCX(Cisco Compatible Extensions) which provide the channel
loading and noisy level indicator to debug or support the 802.11k. The
PHYSTS provide the detail PHY information per packet we received for
debugging. The DIG(dynamic initial gain) is the dynamic mechanism to
adjust the packet detect power level by received signal strength to avoid
false detection of the WiFi packet.
The address of registers used for ENV_MNTR, PHYSTS and DIG of WiFi 7 IC
are different with WiFi 6 series, so we modify the method to access the
register address in order to compatible with all WiFi 7 and 6 ICs.
Signed-off-by: Cheng-Chieh Hsieh <cj.hsieh@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230822125822.23817-7-pkshih@realtek.com
Ping-Ke Shih [Tue, 22 Aug 2023 12:58:21 +0000 (20:58 +0800)]
wifi: rtw89: phy: add phy_gen_def::cr_base to support WiFi 7 chips
cr_base is base address of PHY control register. The base of WiFi 6 and 7
chips are 0x1_0000 and 0x2_0000 respectively, so define them accordingly.
For example, if PHY address is 0x1330, absolute address is 0x1_1330 for
WiFi 6 chips, and 0x2_1330 for WiFi 7 chips.
Meanwhile, there are two copies of PHY hardware named PHY0 and PHY1. The
offset between them is 0x2_0000, so the base address of PHY0 and PHY1 are
0x2_0000 and 0x4_0000 respectively.
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230822125822.23817-6-pkshih@realtek.com
Ping-Ke Shih [Tue, 22 Aug 2023 12:58:20 +0000 (20:58 +0800)]
wifi: rtw89: mac: define register address of rx_filter to generalize code
rx_filter is used to decide which kind of packets are received to driver,
or just dropped by MAC layer to reduce bus traffic.
The bit definitions of old and new chips are the sames, but only address
is changed, so define a field to generalize usage.
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230822125822.23817-5-pkshih@realtek.com
Ping-Ke Shih [Tue, 22 Aug 2023 12:58:19 +0000 (20:58 +0800)]
wifi: rtw89: mac: define internal memory address for WiFi 7 chip
Define base address of WiFi 7 internal memory according to design to
provide the same functions as existing WiFi 6 chips.
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230822125822.23817-4-pkshih@realtek.com
Ping-Ke Shih [Tue, 22 Aug 2023 12:58:18 +0000 (20:58 +0800)]
wifi: rtw89: mac: generalize code to indirectly access WiFi internal memory
To diagnose abnormal behavior, we need to dump certain internal memory.
For example, dump security CAM when debugging encryption/decryption
problems, or dump BA CAM when debugging abnormal BlockAck.
Since the indirect address and internal memory base address are different
between WiFi 6 and 7 chips, add fields to reuse codes.
Also, only WiFi 6 chips initialize DMAC and CMAC tables via this indirect
interface, so no need to change the constant register address, and
new firmware will help to initialize these tables.
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230822125822.23817-3-pkshih@realtek.com
Ping-Ke Shih [Tue, 22 Aug 2023 12:58:17 +0000 (20:58 +0800)]
wifi: rtw89: mac: add mac_gen_def::band1_offset to map MAC band1 register address
There are two copies of MAC hardware called band0 and band1. Basically,
the only difference between them is base address, so we can share functions
with a 'band' (or 'mac_idx') argument.
The offset of base address of WiFi 6 and 7 are 0x2000 and 0x4000
respectively, so add band1_offset field to new introduced struct
mac_gen_def to possibly reuse functions.
Using below spatch script to convert callers:
@@
expression reg, band;
@@
- rtw89_mac_reg_by_idx(reg, band)
+ rtw89_mac_reg_by_idx(rtwdev, reg, band)
@@
expression reg, port, band;
@@
- rtw89_mac_reg_by_port(reg, port, band)
+ rtw89_mac_reg_by_port(rtwdev, reg, port, band)
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230822125822.23817-2-pkshih@realtek.com
Li Zetao [Mon, 21 Aug 2023 14:03:45 +0000 (22:03 +0800)]
wifi: wlcore: sdio: Use module_sdio_driver macro to simplify the code
Use the module_sdio_driver macro to simplify the code, which is the
same as declaring with module_init() and module_exit().
Signed-off-by: Li Zetao <lizetao1@huawei.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230821140345.3140493-1-lizetao1@huawei.com
Zong-Zhe Yang [Wed, 16 Aug 2023 08:21:33 +0000 (16:21 +0800)]
wifi: rtw89: initialize multi-channel handling
We prepare to deal with multiple channels via new entity modes.
* MCC_PREPARE: Transitional mode before MCC
* MCC: Multi-Channel Concurrent mode
And, enum of sub-entity is extended for second channel context.
We add the entry flow of multi-channel handling and the core stuffs
for extended index of sub-entity. And, we now deal with the filling
of entity channels' info in entity recalc where we know the number
of active chanctx. However, the other detail coding of MCC start/stop
will be implemented in the following.
Besides, chanctx listener struct is pre-added in chip info. Each
component can add callback type in chanctx listener and configure
its callback function to react according to chanctx states. We know
at least RFK (RF calibration) and BTC (BT coexistence) will require
such callbacks.
Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230816082133.57474-7-pkshih@realtek.com
Zong-Zhe Yang [Wed, 16 Aug 2023 08:21:32 +0000 (16:21 +0800)]
wifi: rtw89: provide functions to configure NoA for beacon update
Callers call renew function when wanting to generate a new P2P NoA
information element, and call append function to append NoA attribute
one by one. Then, updating beacon work will fetch the P2P NoA information
element configured by callers and add it to beacon.
The use case of MCC (multi-channel concurrent) <GO + STA> for example:
* start MCC - GO part
renew P2P NoA
append period NoA after calculation
* download beacon for GO
fetch P2P NoA and add to beacon content
* stop MCC - GO part
renew P2P NoA (reset)
Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230816082133.57474-6-pkshih@realtek.com
Zong-Zhe Yang [Wed, 16 Aug 2023 08:21:31 +0000 (16:21 +0800)]
wifi: rtw89: call rtw89_chan_get() by vif chanctx if aware of vif
We adjust these processes which can work accodrding to vif but call
rtw89_chan_get() with static RTW89_SUB_ENTITY_0. After multi-channel
support, chanctx of vif won't always be on RTW89_SUB_ENTITY_0. So,
we make them call rtw89_chan_get() with rtwvif->sub_entity_idx.
Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230816082133.57474-5-pkshih@realtek.com
Zong-Zhe Yang [Wed, 16 Aug 2023 08:21:30 +0000 (16:21 +0800)]
wifi: rtw89: sar: let caller decide the center frequency to query
If multiple channels, SAR will be hard to determine the center frequency
to query. Therefore, we move this decision out of SAR.
Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230816082133.57474-4-pkshih@realtek.com
Zong-Zhe Yang [Wed, 16 Aug 2023 08:21:29 +0000 (16:21 +0800)]
wifi: rtw89: refine rtw89_correct_cck_chan() by rtw89_hw_to_nl80211_band()
In rtw89_correct_cck_chan(), we turn to use rtw89_hw_to_nl80211_band().
The difference between rtw89_hw_to_nl80211_band() and the original raw
judgement is the case on 6 GHz. Since rtw89_correct_cck_chan() is common
code independent on chip, if runtime chip doesn't support 6 GHz, it is
probably safe. Otherwise, it might not.
Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230816082133.57474-3-pkshih@realtek.com
Zong-Zhe Yang [Wed, 16 Aug 2023 08:21:28 +0000 (16:21 +0800)]
wifi: rtw89: add function prototype for coex request duration
The request duration comes from coex mechanism, indicating the
length of time that should be reserved for BT in each time division.
It is required to handle update notification when channel concurrency
processes. Since it will involve in both coex and wifi code flow, this
commit ahead adds the prototype for required function interfaces to
split the implementation of coex and wifi in the following.
The follow-up are expected be add afterwards.
1. coex mechanism call rtw89_core_ntfy_btc_event() once bt req len changes
2. channel concurrency flow updates related stuffs when notified
Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20230816082133.57474-2-pkshih@realtek.com
Alan Stern [Wed, 9 Aug 2023 00:44:48 +0000 (20:44 -0400)]
Fix nomenclature for USB and PCI wireless devices
A mouse that uses a USB connection is called a "USB mouse" device (or
"USB mouse" for short), not a "mouse USB" device. By analogy, a WiFi
adapter that connects to the host computer via USB is a "USB wireless"
device, not a "wireless USB" device. (The latter term more properly
refers to a defunct Wireless USB specification, which described a
technology for sending USB protocol messages over an ultra wideband
radio link.)
Similarly for a WiFi adapter card that plugs into a PCIe slot: It is a
"PCIe wireless" device, not a "wireless PCIe" device.
Rephrase the text in the kernel source where the word ordering is
wrong.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/57da7c80-0e48-41b5-8427-884a02648f55@rowland.harvard.edu
Christophe Leroy [Wed, 23 Aug 2023 13:21:43 +0000 (15:21 +0200)]
kunit: Fix checksum tests on big endian CPUs
On powerpc64le checksum kunit tests work:
[ 2.011457][ T1] KTAP version 1
[ 2.011662][ T1] # Subtest: checksum
[ 2.011848][ T1] 1..3
[ 2.034710][ T1] ok 1 test_csum_fixed_random_inputs
[ 2.079325][ T1] ok 2 test_csum_all_carry_inputs
[ 2.127102][ T1] ok 3 test_csum_no_carry_inputs
[ 2.127202][ T1] # checksum: pass:3 fail:0 skip:0 total:3
[ 2.127533][ T1] # Totals: pass:3 fail:0 skip:0 total:3
[ 2.127956][ T1] ok 1 checksum
But on powerpc64 and powerpc32 they fail:
[ 1.859890][ T1] KTAP version 1
[ 1.860041][ T1] # Subtest: checksum
[ 1.860201][ T1] 1..3
[ 1.861927][ T58] # test_csum_fixed_random_inputs: ASSERTION FAILED at lib/checksum_kunit.c:243
[ 1.861927][ T58] Expected result == expec, but
[ 1.861927][ T58] result == 54991 (0xd6cf)
[ 1.861927][ T58] expec == 33316 (0x8224)
[ 1.863742][ T1] not ok 1 test_csum_fixed_random_inputs
[ 1.864520][ T60] # test_csum_all_carry_inputs: ASSERTION FAILED at lib/checksum_kunit.c:267
[ 1.864520][ T60] Expected result == expec, but
[ 1.864520][ T60] result == 255 (0xff)
[ 1.864520][ T60] expec == 65280 (0xff00)
[ 1.868820][ T1] not ok 2 test_csum_all_carry_inputs
[ 1.869977][ T62] # test_csum_no_carry_inputs: ASSERTION FAILED at lib/checksum_kunit.c:306
[ 1.869977][ T62] Expected result == expec, but
[ 1.869977][ T62] result == 64515 (0xfc03)
[ 1.869977][ T62] expec == 0 (0x0)
[ 1.872060][ T1] not ok 3 test_csum_no_carry_inputs
[ 1.872102][ T1] # checksum: pass:0 fail:3 skip:0 total:3
[ 1.872458][ T1] # Totals: pass:0 fail:3 skip:0 total:3
[ 1.872791][ T1] not ok 3 checksum
This is because all expected values were calculated for X86 which
is little endian. On big endian systems all precalculated 16 bits
halves must be byte swapped.
And this is confirmed by a huge amount of sparse errors when building
with C=2
So fix all sparse errors and it will naturally work on all endianness.
Fixes:
688eb8191b47 ("x86/csum: Improve performance of `csum_partial`")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wei Fang [Thu, 24 Aug 2023 06:11:50 +0000 (14:11 +0800)]
net: fec: add statistics for XDP_TX
The FEC driver supports the statistics for XDP actions except for
XDP_TX before, because the XDP_TX was not supported when adding
the statistics for XDP. Now the FEC driver has supported XDP_TX
since commit
f601899e4321 ("net: fec: add XDP_TX feature support").
So it's reasonable and necessary to add statistics for XDP_TX.
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ruan Jinjie [Mon, 14 Aug 2023 12:42:11 +0000 (20:42 +0800)]
wifi: ath: Use is_multicast_ether_addr() to check multicast Ether address
Use is_multicast_ether_addr() to perform the Checking.
Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com>
Reviewed-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20230814124212.302738-2-ruanjinjie@huawei.com
Yue Haibing [Wed, 16 Aug 2023 13:05:50 +0000 (21:05 +0800)]
wifi: ath12k: Remove unused declarations
Commit
d889913205cf ("wifi: ath12k: driver for Qualcomm Wi-Fi 7 devices")
declared but never implemented these, remove it.
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20230816130550.50896-1-yuehaibing@huawei.com
Wen Gong [Wed, 9 Aug 2023 08:16:57 +0000 (04:16 -0400)]
wifi: ath12k: add check max message length while scanning with extraie
Currently the extraie length is directly used to allocate skb buffer. When
the length of skb is greater than the max message length which firmware
supports, error will happen in firmware side.
Hence add check for the skb length and drop extraie when overflow and
print a message.
Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0-03427-QCAHMTSWPL_V1.0_V2.0_SILICONZ-1.15378.4
Signed-off-by: Wen Gong <quic_wgong@quicinc.com>
Reviewed-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20230809081657.13858-1-quic_wgong@quicinc.com
Wang Ming [Thu, 13 Jul 2023 03:03:44 +0000 (11:03 +0800)]
wifi: ath9k: use IS_ERR() with debugfs_create_dir()
The debugfs_create_dir() function returns error pointers,
it never returns NULL. Most incorrect error checks were fixed,
but the one in ath9k_htc_init_debug() was forgotten.
Fix the remaining error check.
Fixes:
e5facc75fa91 ("ath9k_htc: Cleanup HTC debugfs")
Signed-off-by: Wang Ming <machel@vivo.com>
Acked-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20230713030358.12379-1-machel@vivo.com
David S. Miller [Fri, 25 Aug 2023 06:43:20 +0000 (07:43 +0100)]
Merge branch 'txgbe-link-modes'
Jiawen Wu says:
====================
support more link mode for TXGBE
There are three new interface mode support for Wangxun 10Gb NICs:
1000BASE-X, SGMII and XAUI.
Specific configurations are added to XPCS. And external PHY attaching
is added for copper NICs.
v2 -> v3:
- add device identifier read
- restrict pcs soft reset
- add firmware version warning
v1 -> v2:
- use the string "txgbe_pcs_mdio_bus" directly
- use dev_err() instead of pr_err()
- add device quirk flag
- add more macro definitions to explain PMA registers
- move txgbe_enable_sec_tx_path() to mac_finish()
- implement phylink for copper NICs
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiawen Wu [Wed, 23 Aug 2023 06:19:35 +0000 (14:19 +0800)]
net: ngbe: move mdio access registers to libwx
Registers of mdio accessing are common defined in libwx, remove the
redundant macro definitions in ngbe driver.
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiawen Wu [Wed, 23 Aug 2023 06:19:34 +0000 (14:19 +0800)]
net: txgbe: support copper NIC with external PHY
Wangxun SP chip supports to connect with external PHY (marvell 88x3310),
which links to 10GBASE-T/1000BASE-T/100BASE-T. Add the identification of
media types from subsystem device IDs. For sp_media_copper, register mdio
bus for the external PHY.
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiawen Wu [Wed, 23 Aug 2023 06:19:33 +0000 (14:19 +0800)]
net: txgbe: support switching mode to 1000BASE-X and SGMII
Disable data path before PCS VR reset while switching PCS mode, to prevent
the blocking of data path. Enable AN interrupt for CL37 auto-negotiation.
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiawen Wu [Wed, 23 Aug 2023 06:19:32 +0000 (14:19 +0800)]
net: txgbe: add FW version warning
Since XPCS device identifier is implemented in the firmware version
0x20010 and above, so add a warning to prompt the users to upgrade the
firmware to make sure the driver works.
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiawen Wu [Wed, 23 Aug 2023 06:19:31 +0000 (14:19 +0800)]
net: pcs: xpcs: adapt Wangxun NICs for SGMII mode
Wangxun NICs support the connection with SFP to RJ45 module. In this case,
PCS need to be configured in SGMII mode.
According to chapter 6.11.1 "SGMII Auto-Negitiation" of DesignWare Cores
Ethernet PCS (version 3.20a) and custom design manual, do the following
configuration when the interface mode is SGMII.
1. program VR_MII_AN_CTRL bit(3) [TX_CONFIG] = 1b (PHY side SGMII)
2. program VR_MII_AN_CTRL bit(8) [MII_CTRL] = 1b (8-bit MII)
3. program VR_MII_DIG_CTRL1 bit(0) [PHY_MODE_CTRL] = 1b
Also CL37 AN in backplane configurations need to be enabled because of the
special hardware design. Another thing to note is that PMA needs to be
reconfigured before each CL37 AN configuration for SGMII, otherwise AN will
fail, although we don't know why.
On this device, CL37_ANSGM_STS (bit[4:1] of VR_MII_AN_INTR_STS) indicates
the status received from remote link during the auto-negotiation, and
self-clear after the auto-negotiation is complete.
Meanwhile, CL37_ANCMPLT_INTR will be set to 1, to indicate CL37 AN is
complete. So add another way to get the state for CL37 SGMII.
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiawen Wu [Wed, 23 Aug 2023 06:19:30 +0000 (14:19 +0800)]
net: pcs: xpcs: add 1000BASE-X AN interrupt support
Enable CL37 AN complete interrupt for DW XPCS. It requires to clear the
bit(0) [CL37_ANCMPLT_INTR] of VR_MII_AN_INTR_STS after AN completed.
And there is a quirk for Wangxun devices to enable CL37 AN in backplane
configurations because of the special hardware design.
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>