Lorenzo Bianconi [Mon, 11 Apr 2022 10:13:25 +0000 (12:13 +0200)]
net: ethernet: mtk_eth_soc: use standard property for cci-control-port
Rely on standard cci-control-port property to identify CCI port
reference.
Update mt7622 dts binding.
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 13 Apr 2022 10:57:55 +0000 (11:57 +0100)]
Merge branch '40GbE' of git://git./linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
40GbE Intel Wired LAN Driver Updates 2022-04-12
This series contains updates to i40e and ice drivers.
Joe Damato adds TSO support for MPLS packets on i40e and ice drivers. He
also adds tracking and reporting of tx_stopped statistic for i40e.
Nabil S. Alramli adds reporting of tx_restart to ethtool for i40e.
Mateusz adds new device id support for i40e.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 13 Apr 2022 10:45:39 +0000 (11:45 +0100)]
Merge branch 'tls-rx-refactor-part-3'
Jakub Kicinski says:
====================
tls: rx: random refactoring part 3
TLS Rx refactoring. Part 3 of 3. This set is mostly around rx_list
and async processing. The last two patches are minor optimizations.
A couple of features to follow.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 11 Apr 2022 19:19:17 +0000 (12:19 -0700)]
tls: rx: only copy IV from the packet for TLS 1.2
TLS 1.3 and ChaChaPoly don't carry IV in the packet.
The code before this change would copy out iv_size
worth of whatever followed the TLS header in the packet
and then for TLS 1.3 | ChaCha overwrite that with
the sequence number. Waste of cycles especially
with TLS 1.2 being close to dead and TLS 1.3 being
the common case.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 11 Apr 2022 19:19:16 +0000 (12:19 -0700)]
tls: rx: use MAX_IV_SIZE for allocations
IVs are 8 or 16 bytes, no point reading out the exact value
for quantities this small.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 11 Apr 2022 19:19:15 +0000 (12:19 -0700)]
tls: rx: use async as an in-out argument
Propagating EINPROGRESS thru multiple layers of functions is
error prone. Use darg->async as an in/out argument, like we
use darg->zc today. On input it tells the code if async is
allowed, on output if it took place.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 11 Apr 2022 19:19:14 +0000 (12:19 -0700)]
tls: rx: return the already-copied data on crypto error
async crypto handler will report the socket error no need
to report it again. We can, however, let the data we already
copied be reported to user space but we need to make sure
the error will be reported next time around.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 11 Apr 2022 19:19:13 +0000 (12:19 -0700)]
tls: rx: treat process_rx_list() errors as transient
process_rx_list() only fails if it can't copy data to user
space. There is no point recording the error onto sk->sk_err
or giving up on the data which was read partially. Treat
the return value like a normal socket partial read.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 11 Apr 2022 19:19:12 +0000 (12:19 -0700)]
tls: rx: assume crypto always calls our callback
If crypto didn't always invoke our callback for async
we'd not be clearing skb->sk and would crash in the
skb core when freeing it. This if must be dead code.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 11 Apr 2022 19:19:11 +0000 (12:19 -0700)]
tls: rx: don't handle TLS 1.3 in the async crypto callback
Async crypto never worked with TLS 1.3 and was explicitly disabled in
commit
8497ded2d16c ("net/tls: Disable async decrytion for tls1.3").
There's no need for us to handle TLS 1.3 padding in the async cb.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 11 Apr 2022 19:19:10 +0000 (12:19 -0700)]
tls: rx: move counting TlsDecryptErrors for sync
Move counting TlsDecryptErrors to tls_do_decryption()
where differences between sync and async crypto are
reconciled.
No functional changes, this code just always gave
me a pause.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 11 Apr 2022 19:19:09 +0000 (12:19 -0700)]
tls: rx: reuse leave_on_list label for psock
The code is identical, we can save a few LoC.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 11 Apr 2022 19:19:08 +0000 (12:19 -0700)]
tls: rx: consistently use unlocked accessors for rx_list
rx_list is protected by the socket lock, no need to take
the built-in spin lock on accesses.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Lv Ruyi [Tue, 12 Apr 2022 08:51:26 +0000 (08:51 +0000)]
ixp4xx_eth: fix error check return value of platform_get_irq()
platform_get_irq() return negative value on failure, so null check of
return value is incorrect. Fix it by comparing whether it is less than
zero.
Fixes:
9055a2f59162 ("ixp4xx_eth: make ptp support a platform driver")
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Lv Ruyi <lv.ruyi@zte.com.cn>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20220412085126.2532924-1-lv.ruyi@zte.com.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Minghao Chi [Tue, 12 Apr 2022 08:28:47 +0000 (08:28 +0000)]
net: ethernet: ti: cpsw: using pm_runtime_resume_and_get instead of pm_runtime_get_sync
Using pm_runtime_resume_and_get() to replace pm_runtime_get_sync and
pm_runtime_put_noidle. This change is just to simplify the code, no
actual functional changes.
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Link: https://lore.kernel.org/r/20220412082847.2532584-1-chi.minghao@zte.com.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Coco Li [Mon, 11 Apr 2022 21:37:17 +0000 (14:37 -0700)]
fou: Remove XRFM from NET_FOU Kconfig
XRFM is no longer needed for configuring FOU tunnels
(CONFIG_NET_FOU_IP_TUNNELS), remove from Kconfig.
Also remove the xrfm.h dependency in fou.c. It was
added in '
23461551c006 ("fou: Support for foo-over-udp RX path")'
for depencies of udp_del_offload and udp_offloads, which were removed in
'
d92283e338f6 ("fou: change to use UDP socket GRO")'.
Built and installed kernel and setup GUE/FOU tunnels.
Signed-off-by: Coco Li <lixiaoyan@google.com>
Link: https://lore.kernel.org/r/20220411213717.3688789-1-lixiaoyan@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Mateusz Palczewski [Tue, 29 Mar 2022 07:35:43 +0000 (09:35 +0200)]
i40e: Add Ethernet Connection X722 for 10GbE SFP+ support
Add support for Ethernet Connection X722 for 10GbE SFP+ cards.
Make possible for the driver to bind to the card.
Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Nabil S. Alramli [Thu, 24 Mar 2022 20:03:38 +0000 (20:03 +0000)]
i40e: Add vsi.tx_restart to i40e ethtool stats
Add vsi.tx_restart to the i40e driver ethtool statistics
Signed-off-by: Nabil S. Alramli <dev@nalramli.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Joe Damato [Thu, 24 Mar 2022 19:46:58 +0000 (12:46 -0700)]
i40e: Add tx_stopped stat
Track TX queue stop events and export the new stat with ethtool.
Signed-off-by: Joe Damato <jdamato@fastly.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Joe Damato [Fri, 18 Mar 2022 04:12:12 +0000 (21:12 -0700)]
ice: Add mpls+tso support
Attempt to add mpls+tso support.
I don't have ice hardware available to test myself, but I just implemented
this feature in i40e and thought it might be useful to implement for ice
while this is fresh in my brain.
Hoping some one at intel will be able to test this on my behalf.
Signed-off-by: Joe Damato <jdamato@fastly.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Jakub Kicinski [Tue, 12 Apr 2022 16:51:43 +0000 (09:51 -0700)]
Merge branch 'mlxsw-extend-device-registers-for-line-cards-support'
Ido Schimmel says:
====================
mlxsw: Extend device registers for line cards support
This patch set prepares mlxsw for line cards support by extending device
registers with a slot index, which allows accessing components found on
a line card at a given slot. Currently, only slot index 0 (main board)
is used.
No user visible changes that I am aware of.
====================
Link: https://lore.kernel.org/r/20220411144657.2655752-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vadim Pasternak [Mon, 11 Apr 2022 14:46:57 +0000 (17:46 +0300)]
mlxsw: reg: Add new field to Management General Peripheral Information Register
Add new field 'max_modules_per_slot' to provide maximum number of
modules that can be connected per slot. This field will always be zero,
if 'slot_index' in query request is set to non-zero value, otherwise
value in this field will provide maximum modules number, which can be
equipped on device inserted at any slot.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vadim Pasternak [Mon, 11 Apr 2022 14:46:56 +0000 (17:46 +0300)]
mlxsw: core_env: Pass slot index during PMAOS register write call
Pass the slot index down to PMAOS pack helper alongside with the module.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vadim Pasternak [Mon, 11 Apr 2022 14:46:55 +0000 (17:46 +0300)]
mlxsw: reg: Extend MGPIR register with new slot fields
Extend MGPIR (Management General Peripheral Information Register) with
new fields specifying the slot number and number of the slots available
on system. The purpose of these fields is:
- to support access to MPGIR register on modular system for getting the
number of cages, equipped on the line card, inserted at specified
slot. In case slot number is set zero, MGPIR will provide the
information for the main board. For Top of the Rack (non-modular)
system it will provide the same as before.
- to provide the number of slots supported by system. This data is
relevant only in case slot number is set zero.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vadim Pasternak [Mon, 11 Apr 2022 14:46:54 +0000 (17:46 +0300)]
mlxsw: reg: Extend PMMP register with new slot number field
Extend PMMP (Port Module Memory Map Properties Register) with new
field specifying the slot number. The purpose of this field is to
enable overriding the cable/module memory map advertisement.
For non-modular systems the 'module' number uniquely identifies the
transceiver location. For modular systems the transceivers are
identified by two indexes:
- 'slot_index', specifying the slot number, where line card is located;
- 'module', specifying cage transceiver within the line card.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vadim Pasternak [Mon, 11 Apr 2022 14:46:53 +0000 (17:46 +0300)]
mlxsw: reg: Extend MCION register with new slot number field
Extend MCION (Management Cable IO and Notifications Register) with new
field specifying the slot number. The purpose of this field is to
support access to MCION register for query cage transceiver on modular
system.
For non-modular systems the 'module' number uniquely identifies the
transceiver location. For modular systems the transceivers are
identified by two indexes:
- 'slot_index', specifying the slot number, where line card is located;
- 'module', specifying cage transceiver within the line card.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vadim Pasternak [Mon, 11 Apr 2022 14:46:52 +0000 (17:46 +0300)]
mlxsw: reg: Extend MCIA register with new slot number field
Extend MCIA (Management Cable Info Access Register) with new field
specifying the slot number. The purpose of this field is to support
access to MCIA register for reading cage cable information on modular
system. For non-modular systems the 'module' number uniquely identifies
the transceiver location. For modular systems the transceivers are
identified by two indexes:
- 'slot_index', specifying the slot number, where line card is located;
- 'module', specifying cage transceiver within the line card.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vadim Pasternak [Mon, 11 Apr 2022 14:46:51 +0000 (17:46 +0300)]
mlxsw: reg: Extend MTBR register with new slot number field
Extend MTBR (Management Temperature Bulk Register) with new field
specifying the slot number. The purpose of this field is to support
access to MTBR register for reading temperature sensors on modular
system. For non-modular systems the 'sensor_index' uniquely identifies
the cage sensors. For modular systems the sensors are identified by two
indexes:
- 'slot_index', specifying the slot number, where line card is located;
- 'sensor_index', specifying cage sensor within the line card.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vadim Pasternak [Mon, 11 Apr 2022 14:46:50 +0000 (17:46 +0300)]
mlxsw: reg: Extend MTMP register with new slot number field
Extend MTMP (Management Temperature Register) with new field specifying
the slot index. The purpose of this field is to support access to MTMP
register for reading temperature sensors on modular systems.
For non-modular systems the 'sensor_index' uniquely identifies the cage
sensors, while 'slot_index' is always 0. For modular systems the
sensors are identified by:
- 'slot_index', specifying the slot index, where line card is located;
- 'sensor_index', specifying cage sensor within the line card.
Signed-off-by: Vadim Pasternak <vadimp@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Joe Damato [Thu, 3 Mar 2022 01:29:07 +0000 (17:29 -0800)]
i40e: Add support for MPLS + TSO
This change adds support for TSO of MPLS packets.
In my tests with tcpdump it seems to work. Note this test setup has
a 9000 byte MTU:
MPLS (label 100, exp 0, [S], ttl 64) IP srcip.50086 > dstip.1234:
Flags [P.], seq 593345:644401, ack 0, win 420,
options [nop,nop,TS val
45022534 ecr
1722291395], length 51056
IP dstip.1234 > srcip.50086: Flags [.], ack 593345, win 122,
options [nop,nop,TS val
1722291395 ecr
45022534], length 0
IP dstip.1234 > srcip.50086: Flags [.], ack 602289, win 105,
options [nop,nop,TS val
1722291395 ecr
45022534], length 0
IP dstip.1234 > srcip.50086: Flags [.], ack 620177, win 71,
options [nop,nop,TS val
1722291395 ecr
45022534], length 0
MPLS (label 100, exp 0, [S], ttl 64) IP srcip.50086 > dstip.1234:
Flags [P.], seq 644401:655953, ack 0, win 420,
options [nop,nop,TS val
45022534 ecr
1722291395], length 11552
IP dstip.1234 > srcip.50086: Flags [.], ack 638065, win 37,
options [nop,nop,TS val
1722291395 ecr
45022534], length 0
IP dstip.1234 > srcip.50086: Flags [.], ack 644401, win 25,
options [nop,nop,TS val
1722291395 ecr
45022534], length 0
IP dstip.1234 > srcip.50086: Flags [.], ack 653345, win 8,
options [nop,nop,TS val
1722291395 ecr
45022534], length 0
IP dstip.1234 > srcip.50086: Flags [.], ack 655953, win 3,
options [nop,nop,TS val
1722291395 ecr
45022534], length 0
Signed-off-by: Joe Damato <jdamato@fastly.com>
Co-developed-by: Mike Gallo <mgallo@fastly.com>
Signed-off-by: Mike Gallo <mgallo@fastly.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Lorenzo Bianconi [Mon, 11 Apr 2022 14:05:26 +0000 (16:05 +0200)]
page_pool: Add recycle stats to page_pool_put_page_bulk
Add missing recycle stats to page_pool_put_page_bulk routine.
Reviewed-by: Joe Damato <jdamato@fastly.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Link: https://lore.kernel.org/r/3712178b51c007cfaed910ea80e68f00c916b1fa.1649685634.git.lorenzo@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Oliver Hartkopp [Mon, 11 Apr 2022 12:49:55 +0000 (14:49 +0200)]
net: remove noblock parameter from recvmsg() entities
The internal recvmsg() functions have two parameters 'flags' and 'noblock'
that were merged inside skb_recv_datagram(). As a follow up patch to commit
f4b41f062c42 ("net: remove noblock parameter from skb_recv_datagram()")
this patch removes the separate 'noblock' parameter for recvmsg().
Analogue to the referenced patch for skb_recv_datagram() the 'flags' and
'noblock' parameters are unnecessarily split up with e.g.
err = sk->sk_prot->recvmsg(sk, msg, size, flags & MSG_DONTWAIT,
flags & ~MSG_DONTWAIT, &addr_len);
or in
err = INDIRECT_CALL_2(sk->sk_prot->recvmsg, tcp_recvmsg, udp_recvmsg,
sk, msg, size, flags & MSG_DONTWAIT,
flags & ~MSG_DONTWAIT, &addr_len);
instead of simply using only flags all the time and check for MSG_DONTWAIT
where needed (to preserve for the formerly separated no(n)block condition).
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Link: https://lore.kernel.org/r/20220411124955.154876-1-socketcan@hartkopp.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Daniele Palmas [Mon, 11 Apr 2022 13:59:43 +0000 (15:59 +0200)]
net: usb: qmi_wwan: add Telit 0x1057 composition
Add the following Telit FN980 composition:
0x1057: tty, adb, rmnet, tty, tty, tty, tty, tty
Signed-off-by: Daniele Palmas <dnlplm@gmail.com>
Acked-by: Bjørn Mork <bjorn@mork.no>
Link: https://lore.kernel.org/r/20220411135943.4067264-1-dnlplm@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Paolo Abeni [Tue, 12 Apr 2022 10:13:32 +0000 (12:13 +0200)]
Merge branch 'sfc-remove-some-global-definitions'
Martin Habets says:
====================
sfc: Remove some global definitions
These are some small cleanups to remove definitions that need not
be defined in .h files.
====================
Link: https://lore.kernel.org/r/164967635861.17602.16525009567130361754.stgit@palantir17.mph.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Martin Habets [Mon, 11 Apr 2022 11:27:20 +0000 (12:27 +0100)]
sfc: Remove global definition of efx_reset_type_names
The strings are only used in efx_common.c so the definitions
can be static in there.
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Martin Habets [Mon, 11 Apr 2022 11:27:15 +0000 (12:27 +0100)]
sfc: Remove duplicate definition of efx_xmit_done
It is defined both in efx.h and tx_common.h.
Remove the definition in efx.h.
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Martin Habets [Mon, 11 Apr 2022 11:27:10 +0000 (12:27 +0100)]
sfc: efx_default_channel_type APIs can be static
This means we can remove them from efx_channel.h and avoid
naming conflicts later.
efx_channel_dummy_op_void() cannot be static as it is
used in ef100_nic.c.
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Paolo Abeni [Tue, 12 Apr 2022 08:33:17 +0000 (10:33 +0200)]
Merge branch 'net-dsa-mt7530-updates-for-phylink-changes'
Russell King says:
====================
net: dsa: mt7530: updates for phylink changes
This revised series is a partial conversion of the mt7530 DSA driver to
the modern phylink infrastructure. This driver has some exceptional
cases which prevent - at the moment - its full conversion (particularly
with the Autoneg bit) to using phylink_generic_validate().
Patch 1 fixes the incorrect test highlighted in the first RFC series.
Patch 2 fixes the incorrect assumption that RGMII is unable to support
1000BASE-X.
Patch 3 populates the supported_interfaces for each port
Patch 4 removes the interface checks that become unnecessary as a result
of patch 3.
Patch 5 removes use of phylink_helper_basex_speed() which is no longer
required by phylink.
Patch 6 becomes possible after patch 5, only indicating the ethtool
modes that can be supported with a particular interface mode - this
involves removing some modes and adding others as per phylink
documentation.
Patch 7 switches the driver to use phylink_get_linkmodes(), which moves
the driver as close as we can to phylink_generic_validate() due to the
Autoneg bit issue mentioned above.
Patch 8 converts the driver to the phylink pcs support, removing a bunch
of driver private indirected methods. We include TRGMII as a PCS even
though strictly TRGMII does not have a PCS. This is convenient to allow
the change in patch 9 to be made.
Patch 9 moves the special autoneg handling to the PCS validate method,
which means we can convert the MAC side to the generic validator.
Patch 10 marks the driver as non-legacy.
The series was posted on 23 February, and a ping sent on 3 March, but
no feedback has been received. The previous posting also received no
feedback on the actual patches either.
v2:
- fix build issue in patch 5
- add Marek's tested-by
drivers/net/dsa/mt7530.c | 330 +++++++++++++++++++++--------------------------
drivers/net/dsa/mt7530.h | 26 ++--
2 files changed, 159 insertions(+), 197 deletions(-)
====================
Link: https://lore.kernel.org/r/YlP4vGKVrlIJUUHK@shell.armlinux.org.uk
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Russell King (Oracle) [Mon, 11 Apr 2022 09:46:37 +0000 (10:46 +0100)]
net: dsa: mt7530: mark as non-legacy
The mt7530 driver does not make use of the speed, duplex, pause or
advertisement in its phylink_mac_config() implementation, so it can be
marked as a non-legacy driver.
Tested-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Russell King (Oracle) [Mon, 11 Apr 2022 09:46:32 +0000 (10:46 +0100)]
net: dsa: mt7530: move autoneg handling to PCS validation
Move the autoneg bit handling to the PCS validation, which allows us to
get rid of mt753x_phylink_validate() and rely on the default
phylink_generic_validate() implementation for the MAC side.
Tested-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Russell King (Oracle) [Mon, 11 Apr 2022 09:46:27 +0000 (10:46 +0100)]
net: dsa: mt7530: partially convert to phylink_pcs
Partially convert the mt7530 driver to use phylink's PCS support. This
is a partial implementation as we don't move anything into the
pcs_config method yet - this driver supports SGMII or 1000BASE-X
without in-band.
Tested-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Russell King (Oracle) [Mon, 11 Apr 2022 09:46:22 +0000 (10:46 +0100)]
net: dsa: mt7530: switch to use phylink_get_linkmodes()
Switch mt7530 to use phylink_get_linkmodes() to generate the ethtool
linkmodes that can be supported. We are unable to use the generic
helper for this as pause modes are dependent on the interface as
the Autoneg bit depends on the interface mode.
Tested-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Russell King (Oracle) [Mon, 11 Apr 2022 09:46:17 +0000 (10:46 +0100)]
net: dsa: mt7530: only indicate linkmodes that can be supported
Now that mt7530 is not using the basex helper, it becomes unnecessary to
indicate support for both 1000baseX and 2500baseX when one of the 803.3z
PHY interface modes is being selected. Ensure that the driver indicates
only those linkmodes that can actually be supported by the PHY interface
mode.
Tested-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Russell King (Oracle) [Mon, 11 Apr 2022 09:46:12 +0000 (10:46 +0100)]
net: dsa: mt7530: drop use of phylink_helper_basex_speed()
Now that we have a better method to select SFP interface modes, we
no longer need to use phylink_helper_basex_speed() in a driver's
validation function.
Tested-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Russell King (Oracle) [Mon, 11 Apr 2022 09:46:06 +0000 (10:46 +0100)]
net: dsa: mt7530: remove interface checks
As phylink checks the interface mode against the supported_interfaces
bitmap, we no longer need to validate the interface mode, nor handle
PHY_INTERFACE_MODE_NA in the validation function. Remove these to
simplify the implementation.
Tested-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Russell King (Oracle) [Mon, 11 Apr 2022 09:46:01 +0000 (10:46 +0100)]
net: dsa: mt7530: populate supported_interfaces and mac_capabilities
Populate the supported interfaces and MAC capabilities for mt7530,
mt7531 and mt7621 DSA switches. Filling this in will enable phylink
to pre-check the PHY interface mode against the the supported
interfaces bitmap prior to calling the validate function, and will
eventually allow us to convert to using the generic validation.
Tested-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Russell King (Oracle) [Mon, 11 Apr 2022 09:45:56 +0000 (10:45 +0100)]
net: dsa: mt7530: 1G can also support 1000BASE-X link mode
When using an external PHY connected using RGMII to mt7531 port 5, the
PHY can be used to used support 1000BASE-X connections. Moreover, if
1000BASE-T is supported, then we should allow 1000BASE-X as well, since
which are supported is a property of the PHY.
Therefore, it makes no sense to exclude this from the linkmodes when
1000BASE-T is supported.
Fixes:
c288575f7810 ("net: dsa: mt7530: Add the support of MT7531 switch")
Tested-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Paolo Abeni [Tue, 12 Apr 2022 08:06:57 +0000 (10:06 +0200)]
Merge branch 'net-bridge-add-support-for-host-l2-mdb-entries'
Joachim Wiberg says:
====================
net: bridge: add support for host l2 mdb entries
Fix to an obvious omissions for layer-2 host mdb entries, this v2 adds
the missing selftest and some minor style fixes.
Note: this patch revealed some worrying problems in how the bridge
forwards unknown BUM traffic and also how unknown multicast is
forwarded when a IP multicast router is known, which a another
(RFC) patch series intend to address. That series will build
on this selftest, hence the name of the test.
v2:
- Add braces to other if/else clauses (Jakub)
- Add selftest to verify add/del of mac/ipv4/ipv6 mdb entries (Jakub)
====================
Link: https://lore.kernel.org/r/20220411084054.298807-1-troglobit@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Joachim Wiberg [Mon, 11 Apr 2022 08:40:54 +0000 (10:40 +0200)]
selftests: forwarding: new test, verify host mdb entries
Boiler plate for testing static mdb entries. This first test verifies
adding and removing host mdb entries for all supported types: IPv4,
IPv6, and MAC multicast.
Signed-off-by: Joachim Wiberg <troglobit@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Joachim Wiberg [Mon, 11 Apr 2022 08:40:53 +0000 (10:40 +0200)]
net: bridge: add support for host l2 mdb entries
This patch expands on the earlier work on layer-2 mdb entries by adding
support for host entries. Due to the fact that host joined entries do
not have any flag field, we infer the permanent flag when reporting the
entries to userspace, which otherwise would be listed as 'temp'.
Before patch:
~# bridge mdb add dev br0 port br0 grp 01:00:00:c0:ff:ee permanent
Error: bridge: Flags are not allowed for host groups.
~# bridge mdb add dev br0 port br0 grp 01:00:00:c0:ff:ee
Error: bridge: Only permanent L2 entries allowed.
After patch:
~# bridge mdb add dev br0 port br0 grp 01:00:00:c0:ff:ee permanent
~# bridge mdb show
dev br0 port br0 grp 01:00:00:c0:ff:ee permanent vid 1
Signed-off-by: Joachim Wiberg <troglobit@gmail.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Lv Ruyi [Mon, 11 Apr 2022 03:25:46 +0000 (03:25 +0000)]
sfc: Fix spelling mistake "writting" -> "writing"
There are some spelling mistakes in the comment. Fix it.
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Lv Ruyi <lv.ruyi@zte.com.cn>
Link: https://lore.kernel.org/r/20220411032546.2517628-1-lv.ruyi@zte.com.cn
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Minghao Chi [Mon, 11 Apr 2022 01:38:12 +0000 (01:38 +0000)]
net/cadence: using pm_runtime_resume_and_get instead of pm_runtime_get_sync
Using pm_runtime_resume_and_get is more appropriate
for simplifing code
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn>
Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Link: https://lore.kernel.org/r/20220411013812.2517212-1-chi.minghao@zte.com.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Haowen Bai [Mon, 11 Apr 2022 01:32:37 +0000 (09:32 +0800)]
sfc: ef10: Fix assigning negative value to unsigned variable
fix warning reported by smatch:
251 drivers/net/ethernet/sfc/ef10.c:2259 efx_ef10_tx_tso_desc()
warn: assigning (-208) to unsigned variable 'ip_tot_len'
Signed-off-by: Haowen Bai <baihaowen@meizu.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://lore.kernel.org/r/1649640757-30041-1-git-send-email-baihaowen@meizu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Arınç ÜNAL [Sun, 10 Apr 2022 13:42:27 +0000 (16:42 +0300)]
net: bridge: offload BR_HAIRPIN_MODE, BR_ISOLATED, BR_MULTICAST_TO_UNICAST
Add BR_HAIRPIN_MODE, BR_ISOLATED and BR_MULTICAST_TO_UNICAST port flags to
BR_PORT_FLAGS_HW_OFFLOAD so that switchdev drivers which have an offloaded
data plane have a chance to reject these bridge port flags if they don't
support them yet.
It makes the code path go through the
SWITCHDEV_ATTR_ID_PORT_PRE_BRIDGE_FLAGS driver handlers, which return
-EINVAL for everything they don't recognize.
For drivers that don't catch SWITCHDEV_ATTR_ID_PORT_PRE_BRIDGE_FLAGS at
all, switchdev will return -EOPNOTSUPP for those which is then ignored, but
those are in the minority.
Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/20220410134227.18810-1-arinc.unal@arinc9.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 12 Apr 2022 03:50:04 +0000 (20:50 -0700)]
Merge branch 'net-lan966x-add-support-for-fdma'
Horatiu Vultur says:
====================
net: lan966x: Add support for FDMA
Currently when injecting or extracting a frame from CPU, the frame
is given to the HW each word at a time. There is another way to
inject/extract frames from CPU using FDMA(Frame Direct Memory Access).
In this way the entire frame is given to the HW. This improves both
RX and TX bitrate.
====================
Tested-by: Michael Walle <michael@walle.cc> # on kontron-kswitch-d10
Link: https://lore.kernel.org/r/20220408070357.559899-1-horatiu.vultur@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Horatiu Vultur [Fri, 8 Apr 2022 07:03:57 +0000 (09:03 +0200)]
net: lan966x: Update FDMA to change MTU.
When changing the MTU, it is required to change also the size of the
DBs. In case those frames will arrive to CPU.
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Horatiu Vultur [Fri, 8 Apr 2022 07:03:56 +0000 (09:03 +0200)]
net: lan966x: Add FDMA functionality
Ethernet frames can be extracted or injected to or from the device's
DDR memory. There is one channel for injection and one channel for
extraction. Each of these channels contain a linked list of DCBs which
contains DB. The DCB contains only 1 DB for both the injection and
extraction. Each DB contains a frame. Every time when a frame is received
or transmitted an interrupt is generated.
It is not possible to use both the FDMA and the manual
injection/extraction of the frames. Therefore the FDMA has priority over
the manual because of better performance values.
FDMA:
iperf -c 192.168.1.1
[ 5] 0.00-10.02 sec 420 MBytes 352 Mbits/sec 0 sender
[ 5] 0.00-10.03 sec 420 MBytes 351 Mbits/sec receiver
iperf -c 192.168.1.1 -R
[ 5] 0.00-10.01 sec 528 MBytes 442 Mbits/sec 0 sender
[ 5] 0.00-10.00 sec 524 MBytes 440 Mbits/sec receiver
Manual:
iperf -c 192.168.1.1
[ 5] 0.00-10.02 sec 93.8 MBytes 78.5 Mbits/sec 0 sender
[ 5] 0.00-10.03 sec 93.8 MBytes 78.4 Mbits/sec receiver
ipers -c 192.168.1.1 -R
[ 5] 0.00-10.03 sec 121 MBytes 101 Mbits/sec 0 sender
[ 5] 0.00-10.01 sec 118 MBytes 99.0 Mbits/sec receiver
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Horatiu Vultur [Fri, 8 Apr 2022 07:03:55 +0000 (09:03 +0200)]
net: lan966x: Expose functions that are needed by FDMA
Expose the following functions 'lan966x_hw_offload',
'lan966x_ifh_get_src_port' and 'lan966x_ifh_get_timestamp' in
lan966x_main.h so they can be accessed by FDMA.
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Horatiu Vultur [Fri, 8 Apr 2022 07:03:54 +0000 (09:03 +0200)]
net: lan966x: Add registers that are used for FDMA.
Add the registers that are used to configure the FDMA.
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jonathan Neuschäfer [Sat, 9 Apr 2022 18:21:45 +0000 (20:21 +0200)]
net: calxedaxgmac: Fix typo (doubled "the")
Fix a doubled word in the comment above xgmac_poll.
Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Link: https://lore.kernel.org/r/20220409182147.2509788-1-j.neuschaefer@gmx.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
YueHaibing [Sat, 9 Apr 2022 10:59:31 +0000 (18:59 +0800)]
net: ethernet: ti: am65-cpsw: Fix build error without PHYLINK
If PHYLINK is n, build fails:
drivers/net/ethernet/ti/am65-cpsw-ethtool.o: In function `am65_cpsw_set_link_ksettings':
am65-cpsw-ethtool.c:(.text+0x118): undefined reference to `phylink_ethtool_ksettings_set'
drivers/net/ethernet/ti/am65-cpsw-ethtool.o: In function `am65_cpsw_get_link_ksettings':
am65-cpsw-ethtool.c:(.text+0x138): undefined reference to `phylink_ethtool_ksettings_get'
drivers/net/ethernet/ti/am65-cpsw-ethtool.o: In function `am65_cpsw_set_eee':
am65-cpsw-ethtool.c:(.text+0x158): undefined reference to `phylink_ethtool_set_eee'
Select PHYLINK for TI_K3_AM65_CPSW_NUSS to fix this.
Fixes:
e8609e69470f ("net: ethernet: ti: am65-cpsw: Convert to PHYLINK")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/20220409105931.9080-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 12 Apr 2022 03:34:01 +0000 (20:34 -0700)]
Merge branch 'mlx5-next' of https://git./linux/kernel/git/mellanox/linux
Leon Romanovsky says:
====================
Mellanox shared branch that includes:
* Removal of FPGA TLS code https://lore.kernel.org/all/cover.
1649073691.git.leonro@nvidia.com
Mellanox INNOVA TLS cards are EOL in May, 2018 [1]. As such, the code
is unmaintained, untested and not in-use by any upstream/distro oriented
customers. In order to reduce code complexity, drop the kernel code,
clean build config options and delete useless kTLS vs. TLS separation.
[1] https://network.nvidia.com/related-docs/eol/LCR-000286.pdf
* Removal of FPGA IPsec code https://lore.kernel.org/all/cover.
1649232994.git.leonro@nvidia.com
Together with FPGA TLS, the IPsec went to EOL state in the November of
2019 [1]. Exactly like FPGA TLS, no active customers exist for this
upstream code and all the complexity around that area can be deleted.
[2] https://network.nvidia.com/related-docs/eol/LCR-000535.pdf
* Fix to undefined behavior from Borislav https://lore.kernel.org/all/
20220405151517.29753-11-bp@alien8.de
* 'mlx5-next' of https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux: (23 commits)
net/mlx5: Remove not-implemented IPsec capabilities
net/mlx5: Remove ipsec_ops function table
net/mlx5: Reduce kconfig complexity while building crypto support
net/mlx5: Move IPsec file to relevant directory
net/mlx5: Remove not-needed IPsec config
net/mlx5: Align flow steering allocation namespace to common style
net/mlx5: Unify device IPsec capabilities check
net/mlx5: Remove useless IPsec device checks
net/mlx5: Remove ipsec vs. ipsec offload file separation
RDMA/core: Delete IPsec flow action logic from the core
RDMA/mlx5: Drop crypto flow steering API
RDMA/mlx5: Delete never supported IPsec flow action
net/mlx5: Remove FPGA ipsec specific statistics
net/mlx5: Remove XFRM no_trailer flag
net/mlx5: Remove not-used IDA field from IPsec struct
net/mlx5: Delete metadata handling logic
net/mlx5_fpga: Drop INNOVA IPsec support
IB/mlx5: Fix undefined behavior due to shift overflowing the constant
net/mlx5: Cleanup kTLS function names and their exposure
net/mlx5: Remove tls vs. ktls separation as it is the same
...
====================
Link: https://lore.kernel.org/r/20220409055303.1223644-1-leon@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Minghao Chi [Fri, 8 Apr 2022 08:12:50 +0000 (08:12 +0000)]
net: stmmac: using pm_runtime_resume_and_get instead of pm_runtime_get_sync
Using pm_runtime_resume_and_get is more appropriate
for simplifing code
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn>
Link: https://lore.kernel.org/r/20220408081250.2494588-1-chi.minghao@zte.com.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Haiyang Zhang [Thu, 7 Apr 2022 20:21:34 +0000 (13:21 -0700)]
hv_netvsc: Add support for XDP_REDIRECT
Handle XDP_REDIRECT action in netvsc driver.
Also, transparently pass ndo_xdp_xmit to VF when available.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://lore.kernel.org/r/1649362894-20077-1-git-send-email-haiyangz@microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 12 Apr 2022 00:38:02 +0000 (17:38 -0700)]
Merge branch 'ipv4-convert-several-tos-fields-to-dscp_t'
Guillaume Nault says:
====================
ipv4: Convert several tos fields to dscp_t
Continue the work started with commit
a410a0cf9885 ("ipv6: Define
dscp_t and stop taking ECN bits into account in fib6-rules") and
convert more structure fields and variables to dscp_t. This series
focuses on struct fib_rt_info, struct fib_entry_notifier_info and their
users (networking drivers).
The purpose of dscp_t is to ensure that ECN bits don't influence IP
route lookups. It does so by ensuring that dscp_t variables have the
ECN bits cleared.
Notes:
* This series is entirely about type annotation and isn't supposed
to have any user visible effect.
* The first two patches have to introduce a few dsfield <-> dscp
conversions in the affected drivers, but those are then removed when
converting the internal driver structures (patches 3-5). In the end,
drivers don't have to handle any conversion.
====================
Link: https://lore.kernel.org/r/cover.1649445279.git.gnault@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Guillaume Nault [Fri, 8 Apr 2022 20:08:50 +0000 (22:08 +0200)]
net: marvell: prestera: Use dscp_t in struct prestera_kern_fib_cache
Use the new dscp_t type to replace the kern_tos field of struct
prestera_kern_fib_cache. This ensures ECN bits are ignored and makes it
compatible with the dscp fields of struct fib_entry_notifier_info and
struct fib_rt_info.
This also allows sparse to flag potential incorrect uses of DSCP and
ECN bits.
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Yevhen Orlov <yevhen.orlov@plvision.eu>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Guillaume Nault [Fri, 8 Apr 2022 20:08:46 +0000 (22:08 +0200)]
mlxsw: Use dscp_t in struct mlxsw_sp_fib4_entry
Use the new dscp_t type to replace the tos field of struct
mlxsw_sp_fib4_entry. This ensures ECN bits are ignored and makes it
compatible with the dscp fields of fib_entry_notifier_info and
fib_rt_info.
This also allows sparse to flag potential incorrect uses of DSCP and
ECN bits.
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Guillaume Nault [Fri, 8 Apr 2022 20:08:43 +0000 (22:08 +0200)]
netdevsim: Use dscp_t in struct nsim_fib4_rt
Use the new dscp_t type to replace the tos field of struct
nsim_fib4_rt. This ensures ECN bits are ignored and makes it compatible
with the dscp fields of struct fib_entry_notifier_info and struct
fib_rt_info.
This also allows sparse to flag potential incorrect uses of DSCP and
ECN bits.
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Guillaume Nault [Fri, 8 Apr 2022 20:08:40 +0000 (22:08 +0200)]
ipv4: Use dscp_t in struct fib_entry_notifier_info
Use the new dscp_t type to replace the tos field of struct
fib_entry_notifier_info. This ensures ECN bits are ignored and makes it
compatible with the dscp field of struct fib_rt_info.
This also allows sparse to flag potential incorrect uses of DSCP and
ECN bits.
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Guillaume Nault [Fri, 8 Apr 2022 20:08:37 +0000 (22:08 +0200)]
ipv4: Use dscp_t in struct fib_rt_info
Use the new dscp_t type to replace the tos field of struct fib_rt_info.
This ensures ECN bits are ignored and makes it compatible with the
fa_dscp field of struct fib_alias.
This also allows sparse to flag potential incorrect uses of DSCP and
ECN bits.
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Grygorii Strashko [Fri, 8 Apr 2022 13:48:38 +0000 (16:48 +0300)]
net: ethernet: ti: cpsw: drop CPSW_HEADROOM define
Since commit
1771afd47430 ("net: cpsw: avoid alignment faults by taking
NET_IP_ALIGN into account") the TI CPSW driver was switched to use correct
define CPSW_HEADROOM_NA to avoid alignment faults, but there are two places
left where CPSW_HEADROOM is still used (without causing issues).
Hence, completely drop CPSW_HEADROOM define and use CPSW_HEADROOM_NA
everywhere to avoid further mistakes in code.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 11 Apr 2022 10:55:54 +0000 (11:55 +0100)]
Merge branch 'mptcp-next'
Mat Martineau says:
====================
mptcp: Miscellaneous changes for 5.19
Four separate groups of patches here:
Patch 1 optimizes flag checking when releasing mptcp socket locks.
Patches 2 and 3 update the packet scheduler when subflow priorities
change.
Patch 4 adds some pernet helper functions for MPTCP.
Patches 5-8 add diag support for MPTCP listeners, including a selftest.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Fri, 8 Apr 2022 19:46:01 +0000 (12:46 -0700)]
selftests/mptcp: add diag listen tests
Check dumping of mptcp listener sockets:
1. filter by dport should not return any results
2. filter by sport should return listen sk
3. filter by saddr+sport should return listen sk
4. no filter should return listen sk
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Fri, 8 Apr 2022 19:46:00 +0000 (12:46 -0700)]
mptcp: listen diag dump support
makes 'ss -Ml' show mptcp listen sockets.
Iterate over the tcp listen sockets and pick those that have mptcp ulp
info attached.
mptcp_diag_get_info() is modified to prefer msk->first for mptcp sockets
in listen state. This reports accurate number for recv and send queue
(pending / max connection backlog counters).
Sample output:
ss -Mil
State Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0 20 127.0.0.1:12000 0.0.0.0:*
subflows_max:2
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Fri, 8 Apr 2022 19:45:59 +0000 (12:45 -0700)]
mptcp: remove locking in mptcp_diag_fill_info
Problem is that listener iteration would call this from atomic context
so this locking is not allowed.
One way is to drop locks before calling the helper, but afaics the lock
isn't really needed, all values are fetched via READ_ONCE().
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Fri, 8 Apr 2022 19:45:58 +0000 (12:45 -0700)]
mptcp: diag: switch to context structure
Raw access to cb->arg[] is deprecated, use a context structure.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Geliang Tang [Fri, 8 Apr 2022 19:45:57 +0000 (12:45 -0700)]
mptcp: add pm_nl_pernet helpers
This patch adds two pm_nl_pernet related helpers, named pm_nl_get_pernet()
and pm_nl_get_pernet_from_msk() to get pm_nl_pernet from 'net' or 'msk'.
Use these helpers instead of using net_generic() directly.
Suggested-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Paolo Abeni [Fri, 8 Apr 2022 19:45:56 +0000 (12:45 -0700)]
mptcp: reset the packet scheduler on PRIO change
Similar to the previous patch, for priority changes
requested by the local PM.
Reported-and-suggested-by: Davide Caratti <dcaratti@redhat.com>
Fixes:
067065422fcd ("mptcp: add the outgoing MP_PRIO support")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Paolo Abeni [Fri, 8 Apr 2022 19:45:55 +0000 (12:45 -0700)]
mptcp: reset the packet scheduler on incoming MP_PRIO
When an incoming MP_PRIO option changes the backup
status of any subflow, we need to reset the packet
scheduler status, or the next send could keep using
the previously selected subflow, without taking in account
the new priorities.
Reported-by: Davide Caratti <dcaratti@redhat.com>
Fixes:
40453a5c61f4 ("mptcp: add the incoming MP_PRIO support")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Paolo Abeni [Fri, 8 Apr 2022 19:45:54 +0000 (12:45 -0700)]
mptcp: optimize release_cb for the common case
The mptcp release callback checks several flags in atomic
context, but only MPTCP_CLEAN_UNA can be up frequently.
Reorganize the code to avoid multiple conditionals in the
most common scenarios.
Additional clarify a related comment.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 11 Apr 2022 10:47:58 +0000 (11:47 +0100)]
Merge git://git./linux/kernel/git/netfilter/nf-next
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains Netfilter updates for net-next:
1) Replace unnecessary list_for_each_entry_continue() in nf_tables,
from Jakob Koschel.
2) Add struct nf_conntrack_net_ecache to conntrack event cache and
use it, from Florian Westphal.
3) Refactor ctnetlink_dump_list(), also from Florian.
4) Bump module reference counter on cttimeout object addition/removal,
from Florian.
5) Consolidate nf_log MAC printer, from Phil Sutter.
6) Add basic logging support for unknown ethertype, from Phil Sutter.
7) Consolidate check for sysctl nf_log_all_netns toggle, also from Phil.
8) Replace hardcode value in nft_bitwise, from Jeremy Sowden.
9) Rename BASIC-like goto tags in nft_bitwise to more meaningful names,
also from Jeremy.
10) nft_fib support for reverse path filtering with policy-based routing
on iif. Extend selftests to cover for this new usecase, from Florian.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal [Thu, 31 Mar 2022 13:46:52 +0000 (15:46 +0200)]
selftests: netfilter: add fib expression forward test case
Its now possible to use fib expression in the forward chain (where both
the input and output interfaces are known).
Add a simple test case for this.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Thu, 31 Mar 2022 15:14:47 +0000 (17:14 +0200)]
netfilter: nft_fib: reverse path filter for policy-based routing on iif
If policy-based routing using the iif selector is used, then the fib
expression fails to look up for the reverse path from the prerouting
hook because the input interface cannot be inferred. In order to support
this scenario, extend the fib expression to allow to use after the route
lookup, from the forward hook.
This patch also adds support for the input hook for usability reasons.
Since the prerouting hook cannot be used for the scenario described
above, users need two rules: one for the forward chain and another rule
for the input chain to check for the reverse path check for locally
targeted traffic.
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Lv Ruyi [Fri, 8 Apr 2022 09:49:01 +0000 (09:49 +0000)]
bnx2x: Fix spelling mistake "regiser" -> "register"
There are some spelling mistakes in the comments for macro. Fix it.
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Lv Ruyi <lv.ruyi@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Felix Fietkau [Fri, 8 Apr 2022 08:59:45 +0000 (10:59 +0200)]
net: ethernet: mtk_eth_soc/wed: fix sparse endian warnings
Descriptor fields are little-endian
Fixes:
804775dfc288 ("net: ethernet: mtk_eth_soc: add support for Wireless Ethernet Dispatch (WED)")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Yingliang [Fri, 8 Apr 2022 03:22:46 +0000 (11:22 +0800)]
net: ethernet: mtk_eth_soc: fix return value check in mtk_wed_add_hw()
If syscon_regmap_lookup_by_phandle() fails, it never return NULL pointer,
change the check to IS_ERR().
Fixes:
804775dfc288 ("net: ethernet: mtk_eth_soc: add support for Wireless Ethernet Dispatch (WED)")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 11 Apr 2022 09:38:38 +0000 (10:38 +0100)]
Merge branch 'icmp-skb-reason'
Menglong Dong says:
====================
net: icmp: add skb drop reasons to icmp
In the commit
c504e5c2f964 ("net: skb: introduce kfree_skb_reason()"),
we added the support of reporting the reasons of skb drops to kfree_skb
tracepoint. And in this series patches, reasons for skb drops are added
to ICMP protocol.
In order to report the reasons of skb drops in 'sock_queue_rcv_skb()',
the function 'sock_queue_rcv_skb_reason()' is introduced in the 1th
patch, which is used in the 3th patch.
As David Ahern suggested, the reasons for skb drops should be more
general and not be code based. Therefore, in the 2th patch,
SKB_DROP_REASON_PTYPE_ABSENT is renamed to
SKB_DROP_REASON_UNHANDLED_PROTO, which is used for the cases of no
L3 protocol handler, no L4 protocol handler, version extensions, etc.
In the 3th patch, we introduce the new function __ping_queue_rcv_skb()
to report drop reasons by its return value and keep the return value of
ping_queue_rcv_skb() still.
In the 4th patch, we make ICMP message handler functions return drop
reasons, which means we change the return type of 'handler()' in
'struct icmp_control' from 'bool' to 'enum skb_drop_reason'. This
changed its original intention, as 'false' means failure, but
'SKB_NOT_DROPPED_YET', which is 0, means success now. Therefore, we
have to change all usages of these handler. Following "handler"
functions are involved:
icmp_unreach()
icmp_redirect()
icmp_echo()
icmp_timestamp()
icmp_discard()
And following drop reasons are added(what they mean can be see
in the document for them):
SKB_DROP_REASON_ICMP_CSUM
SKB_DROP_REASON_INVALID_PROTO
The reason 'INVALID_PROTO' is introduced for the case that the packet
doesn't follow rfc 1122 and is dropped. I think this reason is different
from the 'UNHANDLED_PROTO', as the 'UNHANDLED_PROTO' means the packet is
fine, and it is just not supported. This is not a common case, and I
believe we can locate the problem from the data in the packet. For now,
this 'INVALID_PROTO' is used for the icmp broadcasts with wrong types.
Maybe there should be a document file for these reasons. For example,
list all the case that causes the 'INVALID_PROTO' drop reason. Therefore,
users can locate their problems according to the document.
Changes since v4:
- rename SKB_DROP_REASON_RFC_1122 to SKB_DROP_REASON_INVALID_PROTO
Changes since v3:
- rename SKB_DROP_REASON_PTYPE_ABSENT to SKB_DROP_REASON_UNHANDLED_PROTO
in the 2th patch
- fix the return value problem of ping_queue_rcv_skb() in the 3th patch
- remove SKB_DROP_REASON_ICMP_TYPE and SKB_DROP_REASON_ICMP_BROADCAST
and introduce the SKB_DROP_REASON_RFC_1122 in the 4th patch
Changes since v2:
- fix aliegnment problem in the 2th patch
Changes since v1:
- introduce __ping_queue_rcv_skb() instead of change the return value
of ping_queue_rcv_skb() in the 2th patch, as Paolo suggested
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Menglong Dong [Thu, 7 Apr 2022 06:20:52 +0000 (14:20 +0800)]
net: icmp: add skb drop reasons to icmp protocol
Replace kfree_skb() used in icmp_rcv() and icmpv6_rcv() with
kfree_skb_reason().
In order to get the reasons of the skb drops after icmp message handle,
we change the return type of 'handler()' in 'struct icmp_control' from
'bool' to 'enum skb_drop_reason'. This may change its original
intention, as 'false' means failure, but 'SKB_NOT_DROPPED_YET' means
success now. Therefore, all 'handler' and the call of them need to be
handled. Following 'handler' functions are involved:
icmp_unreach()
icmp_redirect()
icmp_echo()
icmp_timestamp()
icmp_discard()
And following new drop reasons are added:
SKB_DROP_REASON_ICMP_CSUM
SKB_DROP_REASON_INVALID_PROTO
The reason 'INVALID_PROTO' is introduced for the case that the packet
doesn't follow rfc 1122 and is dropped. This is not a common case, and
I believe we can locate the problem from the data in the packet. For now,
this 'INVALID_PROTO' is used for the icmp broadcasts with wrong types.
Maybe there should be a document file for these reasons. For example,
list all the case that causes the 'UNHANDLED_PROTO' and 'INVALID_PROTO'
drop reason. Therefore, users can locate their problems according to the
document.
Reviewed-by: Hao Peng <flyingpeng@tencent.com>
Reviewed-by: Jiang Biao <benbjiang@tencent.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Menglong Dong [Thu, 7 Apr 2022 06:20:51 +0000 (14:20 +0800)]
net: icmp: introduce __ping_queue_rcv_skb() to report drop reasons
In order to avoid to change the return value of ping_queue_rcv_skb(),
introduce the function __ping_queue_rcv_skb(), which is able to report
the reasons of skb drop as its return value, as Paolo suggested.
Meanwhile, make ping_queue_rcv_skb() a simple call to
__ping_queue_rcv_skb().
The kfree_skb() and sock_queue_rcv_skb() used in ping_queue_rcv_skb()
are replaced with kfree_skb_reason() and sock_queue_rcv_skb_reason()
now.
Reviewed-by: Hao Peng <flyingpeng@tencent.com>
Reviewed-by: Jiang Biao <benbjiang@tencent.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Menglong Dong [Thu, 7 Apr 2022 06:20:50 +0000 (14:20 +0800)]
net: skb: rename SKB_DROP_REASON_PTYPE_ABSENT
As David Ahern suggested, the reasons for skb drops should be more
general and not be code based.
Therefore, rename SKB_DROP_REASON_PTYPE_ABSENT to
SKB_DROP_REASON_UNHANDLED_PROTO, which is used for the cases of no
L3 protocol handler, no L4 protocol handler, version extensions, etc.
From previous discussion, now we have the aim to make these reasons
more abstract and users based, avoiding code based.
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Menglong Dong [Thu, 7 Apr 2022 06:20:49 +0000 (14:20 +0800)]
net: sock: introduce sock_queue_rcv_skb_reason()
In order to report the reasons of skb drops in 'sock_queue_rcv_skb()',
introduce the function 'sock_queue_rcv_skb_reason()'.
As the return value of 'sock_queue_rcv_skb()' is used as the error code,
we can't make it as drop reason and have to pass extra output argument.
'sock_queue_rcv_skb()' is used in many places, so we can't change it
directly.
Introduce the new function 'sock_queue_rcv_skb_reason()' and make
'sock_queue_rcv_skb()' an inline call to it.
Reviewed-by: Hao Peng <flyingpeng@tencent.com>
Reviewed-by: Jiang Biao <benbjiang@tencent.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 10 Apr 2022 16:32:12 +0000 (17:32 +0100)]
Merge branch 'tls-rx-refactoring-part-2'
Jakub Kicinski says:
====================
tls: rx: random refactoring part 2
TLS Rx refactoring. Part 2 of 3. This one focusing on the main loop.
A couple of features to follow.
====================
Jakub Kicinski [Fri, 8 Apr 2022 18:31:34 +0000 (11:31 -0700)]
tls: rx: jump out for cases which need to leave skb on list
The current invese logic is harder to follow (and adds extra
tests to the fast path). We have to enumerate all cases which
need to keep the skb before consuming it. It's simpler to
jump out of the full record flow as we detect those cases.
This makes it clear that partial consumption and peek can
only reach end of the function thru the !zc case so move
the code up there.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 8 Apr 2022 18:31:33 +0000 (11:31 -0700)]
tls: rx: clear ctx->recv_pkt earlier
Whatever we do in the loop the skb should not remain on as
ctx->recv_pkt afterwards. We can clear that pointer and
restart strparser earlier.
This adds overhead of extra linking and unlinking to rx_list
but that's not large (upcoming change will switch to unlocked
skb list operations).
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 8 Apr 2022 18:31:32 +0000 (11:31 -0700)]
tls: rx: inline consuming the skb at the end of the loop
tls_sw_advance_skb() always consumes the skb at the end of the loop.
To fall here the following must be true:
!async && !is_peek && !retain_skb
retain_skb => !zc && rxm->full_len > len
# but non-full record implies !zc, so above can be simplified as
retain_skb => rxm->full_len > len
!async && !is_peek && !(rxm->full_len > len)
!async && !is_peek && rxm->full_len <= len
tls_sw_advance_skb() returns false if len < rxm->full_len
which can't be true given conditions above.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 8 Apr 2022 18:31:31 +0000 (11:31 -0700)]
tls: rx: pull most of zc check out of the loop
Most of the conditions deciding if zero-copy can be used
do not change throughout the iterations, so pre-calculate
them.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 8 Apr 2022 18:31:30 +0000 (11:31 -0700)]
tls: rx: don't track the async count
We track both if the last record was handled by async crypto
and how many records were async. This is not necessary. We
implicitly assume once crypto goes async it will stay that
way, otherwise we'd reorder records. So just track if we're
in async mode, the exact number of records is not necessary.
This change also forces us into "async" mode more consistently
in case crypto ever decided to interleave async and sync.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 8 Apr 2022 18:31:29 +0000 (11:31 -0700)]
tls: rx: don't handle async in tls_sw_advance_skb()
tls_sw_advance_skb() caters to the async case when skb argument
is NULL. In that case it simply unpauses the strparser.
These are surprising semantics to a person reading the code,
and result in higher LoC, so inline the __strp_unpause and
only call tls_sw_advance_skb() when we actually move past
an skb.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 8 Apr 2022 18:31:28 +0000 (11:31 -0700)]
tls: rx: factor out writing ContentType to cmsg
cmsg can be filled in during rx_list processing or normal
receive. Consolidate the code.
We don't need to keep the boolean to track if the cmsg was
created. 0 is an invalid content type.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 8 Apr 2022 18:31:27 +0000 (11:31 -0700)]
tls: rx: simplify async wait
Since we are protected from async completions by decrypt_compl_lock
we can drop the async_notify and reinit the completion before we
start waiting.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>