Michael Walle [Fri, 18 Mar 2022 20:13:22 +0000 (21:13 +0100)]
dt-bindings: net: mscc-miim: add lan966x compatible
The MDIO controller has support to release the internal PHYs from reset
by specifying a second memory resource. This is different between the
currently supported SparX-5 and the LAN966x. Add a new compatible to
distinguish between these two.
Signed-off-by: Michael Walle <michael@walle.cc>
Acked-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tobias Waldekranz [Sat, 19 Mar 2022 11:03:45 +0000 (12:03 +0100)]
net: dsa: mv88e6xxx: Fill in STU support for all supported chips
Some chips using the split VTU/STU design will not accept VTU entries
whose SID points to an invalid STU entry. Therefore, mark all those
chips with either the mv88e6352_g1_stu_* or mv88e6390_g1_stu_* ops as
appropriate.
Notably, chips for the Opal Plus (6085/6097) era seem to use a
different implementation than those from Agate (6352) and onwards,
even though their external interface is the same. The former happily
accepts VTU entries referencing invalid STU entries, while the latter
does not.
This fixes an issue where the driver would fail to probe switch trees
that contained chips of the Agate/Topaz generation which did not
declare STU support, as loaded VTU entries would be read back as
invalid.
Fixes: 49c98c1dc7d9 ("net: dsa: mv88e6xxx: Disentangle STU from VTU")
Reported-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Tested-by: Marek Behún <kabel@kernel.org>
Link: https://lore.kernel.org/r/20220319110345.555270-1-tobias@waldekranz.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Guo Zhengkui [Sat, 19 Mar 2022 07:37:30 +0000 (15:37 +0800)]
selftests: net: change fprintf format specifiers
`cur64`, `start64` and `ts_delta` are int64_t. Change format
specifiers in fprintf from `"%lu"` to `"%" PRId64` to adapt
to 32-bit and 64-bit systems.
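A minimal sketch of the portable specifier, assuming a hypothetical helper format_delta() that is not part of the selftest. PRId64 from <inttypes.h> expands to the correct conversion for int64_t on both 32-bit and 64-bit targets, whereas "%lu" is only right where long happens to be 64 bits wide (and it is an unsigned conversion as well).

```c
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical helper, for illustration only: formats an int64_t value
 * portably. PRId64 expands to "ld", "lld", etc. as the platform requires.
 */
static int format_delta(char *buf, size_t len, int64_t ts_delta)
{
	return snprintf(buf, len, "delta=%" PRId64, ts_delta);
}
```

The string literal concatenation ("delta=%" followed by PRId64) is what makes the macro usable inside a format string.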
Signed-off-by: Guo Zhengkui <guozhengkui@vivo.com>
Link: https://lore.kernel.org/r/20220319073730.5235-1-guozhengkui@vivo.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vladimir Oltean [Fri, 18 Mar 2022 19:58:12 +0000 (21:58 +0200)]
net: dsa: felix: allow PHY_INTERFACE_MODE_INTERNAL on port 5
The Felix switch has 6 ports, 2 of which are internal.
Due to some misunderstanding, my initial suggestion for
vsc9959_port_modes[]:
https://patchwork.kernel.org/project/netdevbpf/patch/20220129220221.2823127-10-colin.foster@in-advantage.com/#24718277
got translated by Colin into a 5-port array, leading to an all-zero port
mode mask for port 5.
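The failure mode can be sketched as follows. The array names, mode bits and helper below are illustrative stand-ins, not the driver's actual definitions: in C, an array with an explicit size but fewer initializers leaves the trailing elements zero, so the last port ends up with an empty mode mask.

```c
#include <stdint.h>

#define NUM_PORTS      6          /* the Felix switch has 6 ports */
#define MODE_SERDES    (1u << 0)  /* hypothetical mode bits */
#define MODE_INTERNAL  (1u << 1)

/* Only 5 initializers for 6 ports: element 5 is implicitly zero. */
static const uint32_t port_modes_short[NUM_PORTS] = {
	MODE_SERDES, MODE_SERDES, MODE_SERDES, MODE_SERDES,
	MODE_INTERNAL,
};

/* All 6 ports listed: the second internal port (port 5) gets its mode. */
static const uint32_t port_modes_full[NUM_PORTS] = {
	MODE_SERDES, MODE_SERDES, MODE_SERDES, MODE_SERDES,
	MODE_INTERNAL, MODE_INTERNAL,
};

/* Returns nonzero if the given mode is allowed on the port. */
static int port_mode_allowed(const uint32_t *modes, int port, uint32_t mode)
{
	return (modes[port] & mode) != 0;
}
```

With the short table, port_mode_allowed(port_modes_short, 5, MODE_INTERNAL) is false even though port 5 is an internal port.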
Fixes: acf242fc739e ("net: dsa: felix: remove prevalidate_phy_mode interface")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220318195812.276276-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Mon, 21 Mar 2022 22:51:53 +0000 (15:51 -0700)]
Merge branch 'net-dsa-mv88e6xxx-mst-fixes'
Tobias Waldekranz says:
====================
net: dsa: mv88e6xxx: MST Fixes
1/2 fixes the issue reported by Marek here:
https://lore.kernel.org/netdev/20220318182817.5ade8ecd@dellmb/
2/2 adds a missing capability check to the new .vlan_msti_set
callback.
====================
Link: https://lore.kernel.org/r/20220318201321.4010543-1-tobias@waldekranz.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tobias Waldekranz [Fri, 18 Mar 2022 20:13:21 +0000 (21:13 +0100)]
net: dsa: mv88e6xxx: Ensure STU support in VLAN MSTI callback
In the same way that we check for STU support in the MST state
callback, we should also verify it before trying to change a VLAN's
MSTI membership.
Fixes: acaf4d2e36b3 ("net: dsa: mv88e6xxx: MST Offloading")
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tobias Waldekranz [Fri, 18 Mar 2022 20:13:20 +0000 (21:13 +0100)]
net: dsa: mv88e6xxx: Require ops be implemented to claim STU support
Simply having a physical STU table in the device doesn't do us any
good if there's no implementation of the relevant ops to access that
table. So ensure that chips that claim STU support can also talk to
the hardware.
This fixes an issue where chips that had their ->info->max_sid
set (due to their family membership), but no implementation (due to
their chip-specific ops struct) would fail to probe.
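The check described here can be sketched roughly as below. The structures are simplified, hypothetical stand-ins for the driver's chip info and ops, and stu_stub() exists only so the usage example has something to point at; the idea is that a nonzero max_sid alone is not enough, both accessors must be present too.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical stand-ins for the driver's structures. */
struct chip_ops {
	int (*stu_getnext)(void);
	int (*stu_loadpurge)(void);
};

struct chip_info {
	unsigned int max_sid;        /* nonzero if the family has an STU table */
	const struct chip_ops *ops;  /* chip-specific operations */
};

/* Placeholder accessor, used only to populate an ops struct in examples. */
static int stu_stub(void)
{
	return 0;
}

/* A chip only "has" an STU if the table exists and both accessors are set. */
static bool chip_has_stu(const struct chip_info *info)
{
	return info->max_sid > 0 &&
	       info->ops->stu_getnext != NULL &&
	       info->ops->stu_loadpurge != NULL;
}
```

A chip whose family sets max_sid but whose ops struct leaves the STU callbacks NULL is then treated as having no STU, instead of failing later when the table is accessed.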
Fixes: 49c98c1dc7d9 ("net: dsa: mv88e6xxx: Disentangle STU from VTU")
Reported-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Mon, 21 Mar 2022 21:58:20 +0000 (14:58 -0700)]
Merge branch 'net-tls-some-optimizations-for-tls'
Ziyang Xuan says:
====================
net/tls: some optimizations for tls
Make some small optimizations to tls: remove unnecessary jump
instructions and simplify the condition checks.
====================
Link: https://lore.kernel.org/r/cover.1647658604.git.william.xuanziyang@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ziyang Xuan [Sat, 19 Mar 2022 03:15:20 +0000 (11:15 +0800)]
net/tls: optimize judgement processes in tls_set_device_offload()
When TLS TX/RX offload is configured via setsockopt(), HW offload is
tried first. However, tls_set_device_offload() only checks whether the
netdevice supports NETIF_F_HW_TLS_TX at a late stage, after some memory
allocations have already been made. If the netdevice turns out not to
support NETIF_F_HW_TLS_TX, that memory has to be released before
returning an error, which is redundant work.
Move the NETIF_F_HW_TLS_TX check earlier, and move the
start_marker_record and offload_ctx allocations slightly later. This
simplifies the error handling.
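The reordering can be illustrated with a small userspace sketch. All types and names here are hypothetical stand-ins rather than the kernel's; the point is only that checking the capability before allocating leaves the early exit with nothing to free.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not the kernel's types. */
struct fake_netdev {
	bool hw_tls_tx;   /* models the NETIF_F_HW_TLS_TX feature bit */
};

struct fake_offload_ctx {
	void *start_marker_record;
};

/* Check the capability first, so the early exit has nothing to free. */
static int set_device_offload(const struct fake_netdev *dev,
			      struct fake_offload_ctx **out)
{
	struct fake_offload_ctx *ctx;

	if (!dev->hw_tls_tx)
		return -EOPNOTSUPP;   /* no allocations made yet */

	ctx = calloc(1, sizeof(*ctx));
	if (!ctx)
		return -ENOMEM;

	ctx->start_marker_record = calloc(1, 64);
	if (!ctx->start_marker_record) {
		free(ctx);
		return -ENOMEM;
	}

	*out = ctx;
	return 0;
}
```

Had the feature check come after the two allocations, the unsupported-device path would need the same cleanup as the allocation-failure paths.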
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ziyang Xuan [Sat, 19 Mar 2022 03:14:33 +0000 (11:14 +0800)]
net/tls: remove unnecessary jump instructions in do_tls_setsockopt_conf()
Avoid unconditional "goto" jumps where we can return directly.
Remove the unnecessary jump instructions in
do_tls_setsockopt_conf().
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sebastian Andrzej Siewior [Mon, 21 Mar 2022 09:22:37 +0000 (10:22 +0100)]
net: Revert the softirq will run annotation in ____napi_schedule().
The lockdep annotation lockdep_assert_softirq_will_run() expects that
either hard or soft interrupts are disabled because both guarantee that
the "raised" soft-interrupts will be processed once the context is left.
This triggers in flush_smp_call_function_from_idle() but in this case it
explicitly calls do_softirq() in case of pending softirqs.
Revert the "softirq will run" annotation in ____napi_schedule() and move
the check back to __netif_rx() as it was. Keep the IRQ-off assert in
____napi_schedule() because this is always required.
Fixes: fbd9a2ceba5c7 ("net: Add lockdep asserts to ____napi_schedule().")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/YjhD3ZKWysyw8rc6@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
David S. Miller [Mon, 21 Mar 2022 14:11:38 +0000 (14:11 +0000)]
Merge branch 'devlink-locking'
Jakub Kicinski says:
====================
devlink: hold the instance lock in eswitch callbacks
Series number 2 in the effort to hold the devlink instance lock
in all driver callbacks. We have the following drivers using
this API:
- bnxt, nfp, netdevsim - their own locking is removed / simplified
by this series; all of them needed a lock to protect from changes
to the number of VFs while switching modes, now the VF config bus
callback takes the devlink instance lock via devl_lock();
- ice - appears not to allow changing modes while SR-IOV enabled,
so nothing to do there;
- liquidio - does not contain any locking;
- octeontx2/af - is very special but at least doesn't have locking
so doesn't get in the way either;
- mlx5 has a wealth of locks - I chickened out and dropped the lock
in the callbacks so that I can leave the driver be, for now.
The last one is obviously not ideal, but I would prefer to transition
the API already as it may take longer.
v2: use a wrapper in mlx5 and extend the comment
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 18 Mar 2022 19:23:44 +0000 (12:23 -0700)]
devlink: hold the instance lock during eswitch_mode callbacks
Make the devlink core hold the instance lock during eswitch_mode
callbacks. Cheat in case of mlx5 (see the cover letter).
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 18 Mar 2022 19:23:43 +0000 (12:23 -0700)]
netdevsim: replace vfs_lock with devlink instance lock
Similarly to the previous commit, use the devlink instance
lock and let it replace the vfs_lock.
nsim_esw_legacy_enable() was locked by both port lock and
vfs lock so one set of lock/unlocks goes away.
netdevsim's .eswitch_mode_set callback is now ready for
the callback to take the instance lock.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 18 Mar 2022 19:23:42 +0000 (12:23 -0700)]
netdevsim: replace port_list_lock with devlink instance lock
Take advantage of the devlink instance lock for protecting
the port list. This will simplify locking even more once
all devlink callbacks hold the instance lock.
We need to add locking in nsim_dev_port_add_all() which used
to assume higher layer protection when accessing the list.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 18 Mar 2022 19:23:41 +0000 (12:23 -0700)]
devlink: add explicitly locked flavor of the rate node APIs
We'll need an explicitly locked rate node API for netdevsim
to switch eswitch mode setting to locked.
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 18 Mar 2022 19:23:40 +0000 (12:23 -0700)]
bnxt: use the devlink instance lock to protect sriov
In prep for .eswitch_mode_set being called with the devlink instance
lock held use that lock explicitly instead of creating a local mutex
just for the sriov reconfig.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 21 Mar 2022 13:26:38 +0000 (13:26 +0000)]
Merge branch 'too-short'
Tong Zhang says:
====================
fix typos: "to short" -> "too short"
While doing some code review, I found a couple of places
where "too short" is misspelled as "to short".
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Tong Zhang [Mon, 21 Mar 2022 07:18:18 +0000 (00:18 -0700)]
mISDN: fix typo "frame to short" -> "frame too short"
"frame to short" -> "frame too short"
Signed-off-by: Tong Zhang <ztong0001@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tong Zhang [Mon, 21 Mar 2022 07:18:08 +0000 (00:18 -0700)]
i825xx: fix typo "Frame to short" -> "Frame too short"
"Frame to short" -> "Frame too short"
Signed-off-by: Tong Zhang <ztong0001@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tong Zhang [Mon, 21 Mar 2022 07:17:57 +0000 (00:17 -0700)]
s390/ctcm: fix typo "length to short" -> "length too short"
"packet length to short" -> "packet length too short"
Signed-off-by: Tong Zhang <ztong0001@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tong Zhang [Mon, 21 Mar 2022 07:17:28 +0000 (00:17 -0700)]
ar5523: fix typo "to short" -> "too short"
"RX USB to short" -> "RX USB too short"
Signed-off-by: Tong Zhang <ztong0001@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 21 Mar 2022 13:24:28 +0000 (13:24 +0000)]
Merge branch 'sparx5-mcast'
Casper Andersson says:
====================
net: sparx5: Add multicast support
Add multicast support to Sparx5.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Casper Andersson [Mon, 21 Mar 2022 10:14:46 +0000 (11:14 +0100)]
net: sparx5: Add mdb handlers
Adds mdb handlers. Uses the PGID arbiter to
find a free entry in the PGID table for the
multicast group port mask.
Signed-off-by: Casper Andersson <casper.casan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Casper Andersson [Mon, 21 Mar 2022 10:14:45 +0000 (11:14 +0100)]
net: sparx5: Add arbiter for managing PGID table
The PGID (Port Group ID) table holds port masks
for different purposes. The first 72 are reserved
for port destination masks, flood masks, and CPU
forwarding. The rest are shared between multicast,
link aggregation, and virtualization profiles. The
GLAG area is reserved to not be used by anything
else, since it is a subset of the MCAST area.
The arbiter keeps track of which entries are in
use. You can ask for a free ID or give back one
you are done using.
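A minimal sketch of such an arbiter, under stated assumptions: the table size and the function names are hypothetical, only the reserved count of 72 comes from the text, and the GLAG reservation is left out for brevity. It tracks usage in a flat array and hands out the lowest free index.

```c
#include <stdbool.h>

#define PGID_TABLE_SIZE  128   /* hypothetical table size for the sketch */
#define PGID_RESERVED     72   /* first 72 entries are reserved (per the text) */

static bool pgid_in_use[PGID_TABLE_SIZE];

/* Returns a free PGID index outside the reserved range, or -1 if none. */
static int pgid_alloc(void)
{
	for (int i = PGID_RESERVED; i < PGID_TABLE_SIZE; i++) {
		if (!pgid_in_use[i]) {
			pgid_in_use[i] = true;
			return i;
		}
	}
	return -1;
}

/* Gives an entry back; rejects reserved, out-of-range, or unused IDs. */
static int pgid_free(int id)
{
	if (id < PGID_RESERVED || id >= PGID_TABLE_SIZE || !pgid_in_use[id])
		return -1;
	pgid_in_use[id] = false;
	return 0;
}
```

A multicast handler would call pgid_alloc() when a new group needs a port mask and pgid_free() when the last member leaves.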
Signed-off-by: Casper Andersson <casper.casan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 21 Mar 2022 13:21:17 +0000 (13:21 +0000)]
Merge branch 'nfp3800'
Simon Horman says:
====================
nfp: support for NFP-3800
Yinjun Zhang says:
This is the second of a two part series to support the NFP-3800 device.
To utilize the new hardware features of the NFP-3800, the driver adds support
for a new data path, NFDK. This series mainly refactors the
data path related implementations. The data path specific implementations
are now separated into nfd3 and nfdk directories respectively, and the
common part is also moved into a new file.
* The series starts with a small refinement in Patch 1/10. Patches 2/10 and
3/10 are the main refactoring of data path implementation, which prepares
for adding the NFDK data path.
* Before the introduction of NFDK, there's some more preparation work
for NFP-3800 features, such as multi-descriptor per-packet and write-back
mechanism of TX pointer, which is done in patches 4/10, 5/10, 6/10, 7/10.
* Patch 8/10 allows the driver to select data path according
to firmware version. Finally, patches 9/10 and 10/10 introduce the new
NFDK data path.
Changes between v1 and v2
* Correct kdoc for nfp_nfdk_tx()
* Correct build warnings on 32-bit
Thanks to everyone who contributed to this work.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Yinjun Zhang [Mon, 21 Mar 2022 10:42:09 +0000 (11:42 +0100)]
nfp: nfdk: implement xdp tx path for NFDK
Because txbuf is defined differently in NFDK than in NFD3, there are
no pre-allocated txbufs for XDP use in NFDK's implementation; we
simply reuse the existing rxbuf and recycle it when the XDP TX completes.
For each packet transmitted in the XDP path, we cannot use more than
`NFDK_TX_DESC_PER_SIMPLE_PKT` txbufs: one stashes the virtual address
and another holds the DMA address, so currently the number of
transmitted bytes is not accumulated. We also borrow the last bit of
the virtual address to indicate a new transmitted packet, which is
possible because of the address's alignment.
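The "borrowed bit" idea is ordinary pointer tagging and can be sketched as follows. The helper names are hypothetical and this is not the driver's code; it only assumes the buffer address is at least 2-byte aligned, so its lowest bit is always zero and free to carry a flag.

```c
#include <stdbool.h>
#include <stdint.h>

/* Set the lowest bit of an aligned pointer to mark "start of a new packet". */
static uintptr_t tag_new_packet(void *vaddr)
{
	return (uintptr_t)vaddr | 1u;
}

/* True if the stashed value carries the "new packet" flag. */
static bool is_new_packet(uintptr_t tagged)
{
	return (tagged & 1u) != 0;
}

/* Recover the original pointer by clearing the flag bit. */
static void *untag(uintptr_t tagged)
{
	return (void *)(tagged & ~(uintptr_t)1);
}
```

The flag must always be cleared with untag() before the address is dereferenced or handed back to an allocator.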
Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Signed-off-by: Fei Qin <fei.qin@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 21 Mar 2022 10:42:08 +0000 (11:42 +0100)]
nfp: add support for NFDK data path
Add new data path. The TX is completely different, each packet
has multiple descriptor entries (between 2 and 32). TX ring is
divided into blocks of 32 descriptors, and descriptors of one packet
can't cross block bounds. The RX side is the same for now.
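The block constraint can be made concrete with a small sketch. The helper name is hypothetical and the driver may handle the boundary differently; this only shows the arithmetic for deciding how many descriptors must be skipped so that one packet's descriptors stay inside a single 32-descriptor block.

```c
#include <stdint.h>

#define BLOCK_DESCS 32u   /* descriptors per TX ring block (from the text) */

/* Number of filler descriptors needed so that a packet of n_descs
 * descriptors starting at ring position wr_p does not straddle a block.
 * Assumes 1 <= n_descs <= BLOCK_DESCS.
 */
static uint32_t descs_to_skip(uint32_t wr_p, uint32_t n_descs)
{
	uint32_t left_in_block = BLOCK_DESCS - (wr_p % BLOCK_DESCS);

	return n_descs > left_in_block ? left_in_block : 0;
}
```

For example, a 3-descriptor packet at position 30 has only 2 slots left in its block, so 2 slots are skipped and the packet starts at position 32.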
ABI version 5 or later is required. There is no support for
VLAN insertion on TX. XDP_TX action and AF_XDP zero-copy is not
implemented in NFDK path.
Changes to Jakub's work:
* Move statistics of hw_csum_tx after jumbo packet's segmentation.
* Set L3_CSUM flag to enable recalculating of L3 header checksum
in ipv4 case.
* Mark the case of TSO of a packet with prepended metadata as
unsupported.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Xingfeng Hu <xingfeng.hu@corigine.com>
Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Signed-off-by: Dianchao Wang <dianchao.wang@corigine.com>
Signed-off-by: Fei Qin <fei.qin@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 21 Mar 2022 10:42:07 +0000 (11:42 +0100)]
nfp: choose data path based on version
Prepare for choosing data path based on the firmware version field.
Exploit one bit from the reserved byte in the firmware version field
as the data path type. We need the firmware version right after
vNIC is allocated, so it has to be read inside nfp_net_alloc(),
callers don't have to set it afterwards.
Following patches will bring the implementation of the second data
path.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Fei Qin <fei.qin@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 21 Mar 2022 10:42:06 +0000 (11:42 +0100)]
nfp: add per-data path feature mask
Make sure that features supported only by some of the data paths
are not enabled for all. Add a mask of supported features into
the data path op structure.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Fei Qin <fei.qin@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 21 Mar 2022 10:42:05 +0000 (11:42 +0100)]
nfp: use TX ring pointer write back
Newer versions of the PCIe microcode support writing the
position of the TX pointer back into host memory. This speeds
up TX completions, because we avoid a read from device memory
(replacing PCIe read with DMA coherent read).
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Fei Qin <fei.qin@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 21 Mar 2022 10:42:04 +0000 (11:42 +0100)]
nfp: move tx_ring->qcidx into cold data
QCidx is not used on fast path, move it to the lower cacheline.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Fei Qin <fei.qin@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 21 Mar 2022 10:42:03 +0000 (11:42 +0100)]
nfp: prepare for multi-part descriptors
New datapaths may use multiple descriptor units to describe
a single packet. Prepare for that by adding a descriptors
per simple frame constant into ring size calculations.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Fei Qin <fei.qin@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 21 Mar 2022 10:42:02 +0000 (11:42 +0100)]
nfp: use callbacks for slow path ring related functions
To reduce the coupling of slow path ring implementations and their
callers, use callbacks instead.
Changes to Jakub's work:
* Also use callbacks for xmit functions
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 21 Mar 2022 10:42:01 +0000 (11:42 +0100)]
nfp: move the fast path code to separate files
In preparation for support for a new datapath format move all
ring and fast path logic into separate files. It is basically
a verbatim move with some wrapping functions, no new structures
and functions added.
The current data path is called NFD3 from the initial version
of the driver ABI it used. The non-fast path, but ring related
functions are moved to nfp_net_dp.c file.
Changes to Jakub's work:
* Rebase on xsk related code.
* Split the patch, move the callback changes to next commit.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Fei Qin <fei.qin@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Mon, 21 Mar 2022 10:42:00 +0000 (11:42 +0100)]
nfp: calculate ring masks without conditionals
Ring enable masks are 64 bits long. Replace mask calculation from:
block_cnt == 64 ? 0xffffffffffffffffULL : (1 << block_cnt) - 1
with:
(U64_MAX >> (64 - block_cnt))
to simplify the code.
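The equivalence of the two forms can be checked in userspace with the sketch below. The function names are hypothetical; UINT64_MAX stands in for the kernel's U64_MAX, and the conditional form uses 1ULL (an assumption made here so the shift is performed in 64 bits). The branch-free form is only defined for block_cnt between 1 and 64, since a shift by 64 is undefined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Conditional form, with a 64-bit constant so large shifts are valid. */
static uint64_t ring_mask_cond(unsigned int block_cnt)
{
	return block_cnt == 64 ? 0xffffffffffffffffULL
			       : (1ULL << block_cnt) - 1;
}

/* Branch-free form; valid for 1 <= block_cnt <= 64 (block_cnt == 0 would
 * shift by 64, which is undefined, so it is excluded).
 */
static uint64_t ring_mask(unsigned int block_cnt)
{
	assert(block_cnt >= 1 && block_cnt <= 64);
	return UINT64_MAX >> (64 - block_cnt);
}
```

Both functions return the same value for every block_cnt from 1 to 64, for instance 0x1 for 1 and all ones for 64.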
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Fei Qin <fei.qin@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 21 Mar 2022 12:36:03 +0000 (12:36 +0000)]
Merge git://git./linux/kernel/git/netfilter/nf-next
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains Netfilter updates for net-next.
This patchset contains updates for the nf_tables register tracking
infrastructure, disable bogus warning when attaching ct helpers,
one namespace pollution fix and few cleanups for the flowtable.
1) Revisit conntrack gc routine to reduce chances of overrunning
the netlink buffer from the event path. From Florian Westphal.
2) Disable warning on explicit ct helper assignment, from Phil Sutter.
3) Read-only expressions do not update registers, mark them as
NFT_REDUCE_READONLY. Add helper functions to update the register
tracking information. This patch re-enables the register tracking
infrastructure.
4) Cancel register tracking in case an expression fully/partially
clobbers existing data.
5) Add register tracking support for remaining expressions: ct,
lookup, meta, numgen, osf, hash, immediate, socket, xfrm, tunnel,
fib, exthdr.
6) Rename init and exit functions for the conntrack h323 helper,
from Randy Dunlap.
7) Remove redundant field in struct flow_offload_work.
8) Update nf_flow_table_iterate() to pass flowtable to callback.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Casper Andersson [Fri, 18 Mar 2022 12:53:31 +0000 (13:53 +0100)]
net: sparx5: Use vid 1 when bridge default vid 0 to avoid collision
Standalone ports use vid 0. Let the bridge use vid 1 when
"vlan_default_pvid 0" is set to avoid collisions. Since no
VLAN is created when default pvid is 0 this is set
at "PORT_ATTR_SET" and handled in the Switchdev fdb handler.
Signed-off-by: Casper Andersson <casper.casan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wan Jiabing [Fri, 18 Mar 2022 09:31:53 +0000 (17:31 +0800)]
qed: remove unnecessary memset in qed_init_fw_funcs
allocated_mem is allocated by kcalloc(). The memory is set to zero.
It is unnecessary to call memset again.
Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wang Yufen [Fri, 18 Mar 2022 06:35:08 +0000 (14:35 +0800)]
netlabel: fix out-of-bounds memory accesses
In calipso_map_cat_ntoh(), in the for loop, if the return value of
netlbl_bitmap_walk() is equal to (net_clen_bits - 1), then the next call
to netlbl_bitmap_walk() causes an out-of-bounds memory access of
bitmap[byte_offset].
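The commit text does not spell out the change, so the following is only a sketch of the kind of guard that prevents this class of bug, using a simplified, hypothetical bit walker rather than netlbl_bitmap_walk() itself. The walker reads bitmap[byte_offset] before entering its loop, so a caller that passes an offset equal to the bitmap length would read one byte past the end unless the offset is checked first.

```c
#include <stdint.h>

/* Returns the index of the first set bit at or after 'offset' (MSB-first
 * within each byte), or -1 if there is none. 'len_bits' is the bitmap
 * length in bits.
 */
static int bitmap_next_set(const uint8_t *bitmap, uint32_t len_bits,
			   uint32_t offset)
{
	uint32_t byte_offset, bit;
	uint8_t byte, mask;

	/* Guard: reject offsets at or beyond the end before touching memory. */
	if (offset >= len_bits)
		return -1;

	byte_offset = offset / 8;
	byte = bitmap[byte_offset];   /* out of bounds without the guard */
	mask = 0x80u >> (offset % 8);

	for (bit = offset; bit < len_bits; bit++) {
		if (byte & mask)
			return (int)bit;
		mask >>= 1;
		/* Load the next byte only if more bits remain. */
		if (mask == 0 && bit + 1 < len_bits) {
			byte = bitmap[++byte_offset];
			mask = 0x80;
		}
	}
	return -1;
}
```

A caller that loops with "offset = last_result + 1" hits exactly this case when the last set bit is the final bit of the bitmap.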
The bug was found during fuzzing. The following is the fuzzing report:
BUG: KASAN: slab-out-of-bounds in netlbl_bitmap_walk+0x3c/0xd0
Read of size 1 at addr ffffff8107bf6f70 by task err_OH/252
CPU: 7 PID: 252 Comm: err_OH Not tainted 5.17.0-rc7+ #17
Hardware name: linux,dummy-virt (DT)
Call trace:
dump_backtrace+0x21c/0x230
show_stack+0x1c/0x60
dump_stack_lvl+0x64/0x7c
print_address_description.constprop.0+0x70/0x2d0
__kasan_report+0x158/0x16c
kasan_report+0x74/0x120
__asan_load1+0x80/0xa0
netlbl_bitmap_walk+0x3c/0xd0
calipso_opt_getattr+0x1a8/0x230
calipso_sock_getattr+0x218/0x340
calipso_sock_getattr+0x44/0x60
netlbl_sock_getattr+0x44/0x80
selinux_netlbl_socket_setsockopt+0x138/0x170
selinux_socket_setsockopt+0x4c/0x60
security_socket_setsockopt+0x4c/0x90
__sys_setsockopt+0xbc/0x2b0
__arm64_sys_setsockopt+0x6c/0x84
invoke_syscall+0x64/0x190
el0_svc_common.constprop.0+0x88/0x200
do_el0_svc+0x88/0xa0
el0_svc+0x128/0x1b0
el0t_64_sync_handler+0x9c/0x120
el0t_64_sync+0x16c/0x170
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Yufen <wangyufen@huawei.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso [Fri, 18 Mar 2022 12:11:24 +0000 (13:11 +0100)]
netfilter: flowtable: pass flowtable to nf_flow_table_iterate()
The flowtable object is already passed as argument to
nf_flow_table_iterate(), do not use the data pointer to pass the flowtable.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Fri, 18 Mar 2022 12:11:23 +0000 (13:11 +0100)]
netfilter: flowtable: remove redundant field in flow_offload_work struct
Already available through the flowtable object, remove it.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Randy Dunlap [Wed, 16 Mar 2022 19:20:05 +0000 (12:20 -0700)]
netfilter: nf_nat_h323: eliminate anonymous module_init & module_exit
Eliminate anonymous module_init() and module_exit(), which can lead to
confusion or ambiguity when reading System.map, crashes/oops/bugs,
or an initcall_debug log.
Give each of these init and exit functions unique driver-specific
names to eliminate the anonymous names.
Example 1: (System.map)
ffffffff832fc78c t init
ffffffff832fc79e t init
ffffffff832fc8f8 t init
Example 2: (initcall_debug log)
calling init+0x0/0x12 @ 1
initcall init+0x0/0x12 returned 0 after 15 usecs
calling init+0x0/0x60 @ 1
initcall init+0x0/0x60 returned 0 after 2 usecs
calling init+0x0/0x9a @ 1
initcall init+0x0/0x9a returned 0 after 74 usecs
Fixes: f587de0e2feb ("[NETFILTER]: nf_conntrack/nf_nat: add H.323 helper port")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Mon, 14 Mar 2022 17:23:13 +0000 (18:23 +0100)]
netfilter: nft_exthdr: add reduce support
Check if we can elide the load. Cancel if the new candidate
isn't identical to previous store.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Mon, 14 Mar 2022 17:23:12 +0000 (18:23 +0100)]
netfilter: nft_fib: add reduce support
The fib expression stores to a register, so we can't add an empty stub.
Check that the register that is being written is in fact redundant.
In most cases, this is expected to cancel tracking as re-use is
unlikely.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Mon, 14 Mar 2022 17:23:11 +0000 (18:23 +0100)]
netfilter: nft_tunnel: track register operations
Check if the destination register already contains the data that this
tunnel expression would store. This allows skipping the redundant operation.
If the destination contains a different selector, update the register
tracking information. This patch does not perform bitwise tracking.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Mon, 14 Mar 2022 17:23:10 +0000 (18:23 +0100)]
netfilter: nft_xfrm: track register operations
Check if the destination register already contains the data that this
xfrm expression would store. This allows skipping the redundant operation.
If the destination contains a different selector, update the register
tracking information.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Mon, 14 Mar 2022 17:23:09 +0000 (18:23 +0100)]
netfilter: nft_socket: track register operations
Check if the destination register already contains the data that this
socket expression would store. This allows skipping the redundant
operation. If the destination contains a different selector, update the
register tracking information.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Mon, 14 Mar 2022 17:23:08 +0000 (18:23 +0100)]
netfilter: nft_immediate: cancel register tracking for data destination register
The immediate expression might clobber existing data on the registers,
cancel register tracking for the destination register.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Mon, 14 Mar 2022 17:23:07 +0000 (18:23 +0100)]
netfilter: nft_hash: track register operations
Check if the destination register already contains the data that this
hash expression would store. Always cancel register tracking for jhash since
this requires tracking multiple source registers in case of
concatenations. Perform register tracking (without bitwise) for symhash
since input does not come from source register.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Mon, 14 Mar 2022 17:23:06 +0000 (18:23 +0100)]
netfilter: nft_osf: track register operations
Allow recycling the previous output of the OS fingerprint expression
if flags and ttl are the same.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Mon, 14 Mar 2022 17:23:05 +0000 (18:23 +0100)]
netfilter: nft_numgen: cancel register tracking
Random and increment are stateful, each invocation results in fresh output.
Cancel register tracking for these two expressions.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Mon, 14 Mar 2022 17:23:04 +0000 (18:23 +0100)]
netfilter: nft_meta: extend reduce support to bridge family
It's enough to export the meta get reduce helper and then call it
from nft_meta_bridge too.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Mon, 14 Mar 2022 17:23:03 +0000 (18:23 +0100)]
netfilter: nft_lookup: only cancel tracking for clobbered dregs
In most cases, nft_lookup will be read-only, i.e. won't clobber
registers. In case of map, we need to cancel the registers that will
see stores.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Mon, 14 Mar 2022 17:23:02 +0000 (18:23 +0100)]
netfilter: nft_ct: track register operations
Check if the destination register already contains the data that this ct
expression would store. This allows skipping the redundant operation. If
the destination contains a different selector, update the register
tracking information.
Export nft_expr_reduce_bitwise as a symbol since nft_ct might be
compiled as a module.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso [Mon, 14 Mar 2022 17:23:01 +0000 (18:23 +0100)]
netfilter: nf_tables: cancel tracking for clobbered destination registers
The output of an expression might be larger than a single register, which
might clobber existing data. Reset tracking for all destination registers
that are required to store the expression output.
This patch adds three new helper functions:
- nft_reg_track_update: cancel previous register tracking and update it.
- nft_reg_track_cancel: cancel any previous register tracking info.
- __nft_reg_track_cancel: cancel only one single register tracking info.
Partial register clobbering detection is also supported by checking the
.num_reg field which describes the number of registers that are used.
This patch updates the following expressions:
- meta_bridge
- bitwise
- byteorder
- meta
- payload
to use these helper functions.
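To illustrate the three roles (update, cancel over a range, cancel a single register), here is a deliberately simplified model. Everything in it is hypothetical: the register count, the 4-byte register size, the struct layout and the function names are assumptions for the sketch, it does no bounds checking, and it omits the partial clobbering detection mentioned above. It is not the nf_tables implementation.

```c
#include <stddef.h>

#define NUM_REGS  20   /* hypothetical register count for the sketch */
#define REG_SIZE   4   /* bytes per register in this model */

struct reg_track {
	const void *selector;   /* expression that last wrote the register */
	unsigned int num_reg;   /* how many registers that write covered */
};

static struct reg_track regs[NUM_REGS];

/* Number of registers needed to hold 'len' bytes. */
static unsigned int regs_for_len(unsigned int len)
{
	return (len + REG_SIZE - 1) / REG_SIZE;
}

/* Forget what a single register holds. */
static void track_cancel_one(unsigned int reg)
{
	regs[reg].selector = NULL;
	regs[reg].num_reg = 0;
}

/* Forget every register that an output of 'len' bytes at 'dreg' clobbers. */
static void track_cancel(unsigned int dreg, unsigned int len)
{
	unsigned int n = regs_for_len(len);

	for (unsigned int i = 0; i < n; i++)
		track_cancel_one(dreg + i);
}

/* Cancel the old state, then record the new writer for all covered regs. */
static void track_update(const void *expr, unsigned int dreg, unsigned int len)
{
	unsigned int n = regs_for_len(len);

	track_cancel(dreg, len);
	for (unsigned int i = 0; i < n; i++) {
		regs[dreg + i].selector = expr;
		regs[dreg + i].num_reg = n;
	}
}
```

An 8-byte output written at register 2 therefore covers registers 2 and 3, and a later write to register 3 replaces the tracking for that register.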
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
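The cancel/update behaviour described above can be pictured with a small user-space model. This is only a sketch: the toy_* names, the 20-register file and the 4-byte register size are assumptions made for illustration, not the kernel's definitions, and the model does not attempt the partial-clobber check that the commit ties to .num_reg.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_NUM_REGS 20u	/* assumed size of the register file */
#define TOY_REG_SIZE 4u		/* assumed bytes per register */

struct toy_reg {
	const void *selector;	/* expression whose output lives here, or NULL */
	uint8_t num_reg;	/* number of registers that output spans */
};

struct toy_track {
	struct toy_reg regs[TOY_NUM_REGS];
};

/* Number of registers needed to hold len bytes, rounded up. */
static unsigned int toy_regs_for_len(unsigned int len)
{
	return (len + TOY_REG_SIZE - 1) / TOY_REG_SIZE;
}

/* Forget what one single register holds. */
static void toy_track_cancel_one(struct toy_track *t, unsigned int reg)
{
	if (reg < TOY_NUM_REGS) {
		t->regs[reg].selector = NULL;
		t->regs[reg].num_reg = 0;
	}
}

/* Forget every register that a len-byte store starting at dreg touches. */
static void toy_track_cancel(struct toy_track *t, unsigned int dreg,
			     unsigned int len)
{
	unsigned int n = toy_regs_for_len(len);
	unsigned int i;

	for (i = 0; i < n; i++)
		toy_track_cancel_one(t, dreg + i);
}

/* Cancel the previous contents, then record the new producer. */
static void toy_track_update(struct toy_track *t, const void *expr,
			     unsigned int dreg, unsigned int len)
{
	unsigned int n = toy_regs_for_len(len);
	unsigned int i;

	toy_track_cancel(t, dreg, len);
	for (i = 0; i < n && dreg + i < TOY_NUM_REGS; i++) {
		t->regs[dreg + i].selector = expr;
		t->regs[dreg + i].num_reg = (uint8_t)n;
	}
}
```

In this model a 16-byte output stored at register 2 marks registers 2 to 5, so a later expression can only be treated as redundant if all of them still point at the same producer.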
Pablo Neira Ayuso [Mon, 14 Mar 2022 17:23:00 +0000 (18:23 +0100)]
netfilter: nf_tables: do not reduce read-only expressions
Skip register tracking for expressions that perform read-only operations
on the registers. Define and use a cookie pointer NFT_REDUCE_READONLY to
avoid defining stubs for these expressions.
This patch re-enables register tracking which was disabled in
ed5f85d42290
("netfilter: nf_tables: disable register tracking"). Follow up patches
add remaining register tracking for existing expressions.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Phil Sutter [Wed, 2 Mar 2022 21:02:55 +0000 (22:02 +0100)]
netfilter: conntrack: Add and use nf_ct_set_auto_assign_helper_warned()
The function sets the pernet boolean to avoid the spurious warning from
nf_ct_lookup_helper() when assigning conntrack helpers via nftables.
Fixes:
1a64edf54f55 ("netfilter: nft_ct: add helper set support")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Florian Westphal [Wed, 16 Feb 2022 15:43:05 +0000 (16:43 +0100)]
netfilter: conntrack: revisit gc autotuning
As of commit
4608fdfc07e1
("netfilter: conntrack: collect all entries in one cycle")
conntrack gc was changed to run every 2 minutes.
On systems where the conntrack hash table is set to a large value, most evictions
happen from gc worker rather than the packet path due to hash table
distribution.
This causes netlink event overflows when events are collected.
This change collects the average expiry of scanned entries and
reschedules to the average remaining value, within a 1 to 60 second interval.
To avoid event overflows, reschedule after each bucket and add a
limit for both run time and number of evictions per run.
If more entries have to be evicted, reschedule and restart 1 jiffy
into the future.
Reported-by: Karel Rericha <karel@maxtel.cz>
Cc: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Cc: Eyal Birger <eyal.birger@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
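The rescheduling rule described above (average remaining lifetime, clamped to the 1 to 60 second window) can be sketched in user space as follows. The helper name, the millisecond units, the plain arithmetic mean and the choice to return the maximum interval for an empty scan are all assumptions of this sketch; the commit itself may compute the average differently.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_GC_MIN_INTERVAL_MS  1000u	/* 1 second lower bound from the text */
#define TOY_GC_MAX_INTERVAL_MS 60000u	/* 60 second upper bound from the text */

/*
 * Hypothetical helper: given the remaining lifetimes (in ms) of the
 * entries seen during one scan, return the delay before the next run.
 */
static uint32_t toy_next_gc_delay_ms(const uint32_t *remaining_ms, size_t n)
{
	uint64_t sum = 0;
	uint64_t avg;
	size_t i;

	/* Assumption: with nothing scanned, wait the maximum interval. */
	if (n == 0)
		return TOY_GC_MAX_INTERVAL_MS;

	for (i = 0; i < n; i++)
		sum += remaining_ms[i];
	avg = sum / n;

	/* Clamp the average into the [1 s, 60 s] window. */
	if (avg < TOY_GC_MIN_INTERVAL_MS)
		return TOY_GC_MIN_INTERVAL_MS;
	if (avg > TOY_GC_MAX_INTERVAL_MS)
		return TOY_GC_MAX_INTERVAL_MS;
	return (uint32_t)avg;
}
```

The limits on run time and evictions per run mentioned in the text are separate safeguards and are not modelled here.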
David S. Miller [Sat, 19 Mar 2022 14:50:19 +0000 (14:50 +0000)]
Merge tag 'mlx5-updates-2022-03-18' of git://git./linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2022-03-18
1) XDP multi buffer support
This series enables XDP on non-linear legacy RQ in multi buffer mode.
When XDP is enabled, fragmentation scheme on non-linear legacy RQ is
adjusted to comply with the limitations of XDP multi buffer (fragments of the
same size). DMA addresses of fragments are stored in struct page for the
completion handler to be able to unmap them. XDP_TX is supported.
XDP_REDIRECT is not yet supported; the XDP core blocks it for multi
buffer packets at the moment.
2) Trivial cleanups
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 19 Mar 2022 14:49:08 +0000 (14:49 +0000)]
Merge branch 'master' of git://git./linux/kernel/git/klassert/ipsec-next
Steffen Klassert says:
====================
pull request (net-next): ipsec-next 2022-03-19
1) Delete duplicated functions that call the same xfrm_api_check.
From Leon Romanovsky.
2) Align userland API of the default policy structure to the
internal structures. From Nicolas Dichtel.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Fri, 18 Mar 2022 07:47:23 +0000 (10:47 +0300)]
ptp: ocp: use snprintf() in ptp_ocp_verify()
This code is fine, but it's easier to review if we use snprintf()
instead of sprintf().
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Link: https://lore.kernel.org/r/20220318074723.GA6617@kili
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
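The reviewability argument above can be illustrated with a generic bounded-formatting helper. This is a stand-alone sketch, not the code in ptp_ocp_verify(); the helper name and the "prefix plus index" format are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: write "<prefix><index>" into buf.
 * Returns 0 on success, -1 if the output did not fit (or on error).
 * With snprintf() the bound is visible at the call site, so a reviewer
 * does not have to prove that the buffer is large enough.
 */
static int toy_format_label(char *buf, size_t size, const char *prefix,
			    int index)
{
	int n = snprintf(buf, size, "%s%d", prefix, index);

	if (n < 0 || (size_t)n >= size)
		return -1;
	return 0;
}
```

Even when the result is truncated, snprintf() leaves the buffer NUL-terminated as long as size is non-zero, which sprintf() cannot guarantee.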
Yang Yingliang [Fri, 18 Mar 2022 07:27:28 +0000 (15:27 +0800)]
nfc: st21nfca: remove unnecessary skb check before kfree_skb()
The skb will be checked in kfree_skb(), so remove the outside check.
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org>
Link: https://lore.kernel.org/r/20220318072728.2659578-1-yangyingliang@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Sat, 19 Mar 2022 04:38:20 +0000 (21:38 -0700)]
Merge branch '40GbE' of git://git./linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
40GbE Intel Wired LAN Driver Updates 2022-03-17
This series contains updates to i40e and igb drivers.
Tom Rix moves a conversion to little endian to occur only when the
value is used for i40e. He also zeros out a structure to resolve
possible use of garbage value for igb as reported by clang.
* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
igb: zero hwtstamp by default
i40e: little endian only valid checksums
====================
Link: https://lore.kernel.org/r/20220317160236.3534321-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Sat, 19 Mar 2022 00:17:19 +0000 (17:17 -0700)]
Merge tag 'for-net-next-2022-03-18' of git://git./linux/kernel/git/bluetooth/bluetooth-next
Luiz Augusto von Dentz says:
====================
bluetooth-next pull request for net-next:
- Add support for Asus TF103C
- Add support for Realtek RTL8852B
- Add support for Realtek RTL8723BE
- Add WBS support to mt7921s
* tag 'for-net-next-2022-03-18' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next: (24 commits)
Bluetooth: ath3k: remove superfluous header files
Bluetooth: bcm203x: remove superfluous header files
Bluetooth: hci_bcm: Add the Asus TF103C to the bcm_broken_irq_dmi_table
Bluetooth: mt7921s: Add WBS support
Bluetooth: mt7921s: Add .btmtk_get_codec_config_data
Bluetooth: mt7921s: Add .get_data_path_id
Bluetooth: mt7921s: Set HCI_QUIRK_VALID_LE_STATES
Bluetooth: btmtksdio: Fix kernel oops in btmtksdio_interrupt
Bluetooth: btmtkuart: fix error handling in mtk_hci_wmt_sync()
Bluetooth: call hci_le_conn_failed with hdev lock in hci_le_conn_failed
Bluetooth: Send AdvMonitor Dev Found for all matched devices
Bluetooth: msft: Clear tracked devices on resume
Bluetooth: fix incorrect nonblock bitmask in bt_sock_wait_ready()
Bluetooth: Don't assign twice the same value
Bluetooth: btrtl: Add support for RTL8852B
Bluetooth: hci_uart: add missing NULL check in h5_enqueue
Bluetooth: Fix use after free in hci_send_acl
Bluetooth: btusb: Use quirk to skip HCI_FLT_CLEAR_ALL on fake CSR controllers
Bluetooth: hci_sync: Add a new quirk to skip HCI_FLT_CLEAR_ALL
Bluetooth: btmtkuart: fix the conflict between mtk and msft vendor event
...
====================
Link: https://lore.kernel.org/r/20220318224752.1477292-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Colin Ian King [Fri, 18 Mar 2022 01:20:35 +0000 (01:20 +0000)]
qlcnic: remove redundant assignment to variable index
Variable index is being assigned a value that is never read, it is being
re-assigned later in a following for-loop. The assignment is redundant
and can be removed.
Cleans up clang scan build warning:
drivers/net/ethernet/qlogic/qlcnic/qlcnic_ethtool.c:1358:17: warning:
Although the value stored to 'index' is used in the enclosing expression,
the value is never actually read from 'index' [deadcode.DeadStores]
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Link: https://lore.kernel.org/r/20220318012035.89482-1-colin.i.king@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Colin Ian King [Fri, 18 Mar 2022 00:50:21 +0000 (00:50 +0000)]
atl1c: remove redundant assignment to variable size
Variable size is being assigned a value that is never read. The
assignment is redundant and can be removed.
Cleans up clang scan build warning:
drivers/net/ethernet/atheros/atl1c/atl1c_main.c:1054:22: warning:
Although the value stored to 'size' is used in the enclosing
expression, the value is never actually read from 'size'
[deadcode.DeadStores]
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Link: https://lore.kernel.org/r/20220318005021.82073-1-colin.i.king@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Yonglong Li [Thu, 17 Mar 2022 22:14:44 +0000 (15:14 -0700)]
mptcp: send ADD_ADDR echo before create subflows
In some corner cases, the peer handling an incoming ADD_ADDR option can
receive a retransmitted ADD_ADDR for the same address before the subflow
creation completes.
We can avoid the above issue by generating and sending the ADD_ADDR echo
before starting the MPJ subflow connection.
This slightly changes the behaviour of the packetdrill tests as the
ADD_ADDR echo packet is sent earlier.
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Yonglong Li <liyonglong@chinatelecom.cn>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Link: https://lore.kernel.org/r/20220317221444.426335-1-mathew.j.martineau@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Kuniyuki Iwashima [Thu, 17 Mar 2022 03:23:08 +0000 (12:23 +0900)]
af_unix: Remove unnecessary brackets around CONFIG_AF_UNIX_OOB.
Let's remove unnecessary brackets around CONFIG_AF_UNIX_OOB.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Link: https://lore.kernel.org/r/20220317032308.65372-1-kuniyu@amazon.co.jp
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Saeed Mahameed [Tue, 9 Feb 2021 22:41:05 +0000 (14:41 -0800)]
net/mlx5e: HTB, remove unused function declaration
There is no function mlx5e_get_sq(), remove the declaration.
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Moshe Tal <moshet@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Tariq Toukan [Wed, 9 Mar 2022 08:44:24 +0000 (10:44 +0200)]
net/mlx5e: Statify function mlx5_cmd_trigger_completions
Starting from commit
4cab346bcf74 ("net/mlx5: No command allowed when command interface is not ready"),
no calls to mlx5_cmd_trigger_completions() are external to cmd.c anymore.
Make it a static function.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Maxim Mikityanskiy [Tue, 15 Feb 2022 19:01:22 +0000 (21:01 +0200)]
net/mlx5e: Remove MLX5E_XDP_TX_DS_COUNT
After introducing multi-buffer XDP_TX, the MLX5E_XDP_TX_DS_COUNT define
became misleading. It's no longer the DS count of an XDP_TX WQE, this
WQE can be longer because of fragments.
As this define is only used in one place, in mlx5e_open_xdpsq(), it's
also not very useful anymore. This commit removes the define and puts
the calculation of ds_count for prefilled single-fragment WQEs inline.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Maxim Mikityanskiy [Tue, 1 Feb 2022 12:21:26 +0000 (14:21 +0200)]
net/mlx5e: Permit XDP with non-linear legacy RQ
Now that legacy RQ implements XDP in the non-linear mode, stop blocking
this configuration. Allow non-linear mode only for programs aware of
multi buffer.
XDP performance with linear mode RQ hasn't changed.
Baseline (MTU 1500, TX MPWQE, legacy RQ, single core):
60-byte packets, XDP_DROP: 11.25 Mpps
60-byte packets, XDP_TX: 9.0 Mpps
60-byte packets, XDP_PASS: 668 kpps
Multi buffer (MTU 9000, TX MPWQE, legacy RQ, single core):
60-byte packets, XDP_DROP: 10.1 Mpps
60-byte packets, XDP_TX: 6.6 Mpps
60-byte packets, XDP_PASS: 658 kpps
8900-byte packets, XDP_DROP: 769 kpps (100% of sent packets)
8900-byte packets, XDP_TX: 674 kpps (100% of sent packets)
8900-byte packets, XDP_PASS: 637 kpps
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Maxim Mikityanskiy [Mon, 31 Jan 2022 17:43:53 +0000 (19:43 +0200)]
net/mlx5e: Support multi buffer XDP_TX
This commit enables passing multi buffer XDP frames to the TX handlers
on XDP_TX. Fragments are DMA synchronized to the device and queued to
the xdpi_fifo for a subsequent unmapping.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Maxim Mikityanskiy [Thu, 10 Mar 2022 16:16:17 +0000 (18:16 +0200)]
net/mlx5e: Unindent the else-block in mlx5e_xmit_xdp_buff
The next commit will add more indentation levels to mlx5e_xmit_xdp_buff.
To keep indentation minimal, unindent the else-block of the if-statement
by doing an early return.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Maxim Mikityanskiy [Mon, 31 Jan 2022 17:27:15 +0000 (19:27 +0200)]
net/mlx5e: Implement sending multi buffer XDP frames
xmit_xdp_frame is extended to support sending fragmented XDP frames. The
next commit will start using this functionality.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Maxim Mikityanskiy [Mon, 31 Jan 2022 15:41:31 +0000 (17:41 +0200)]
net/mlx5e: Don't prefill WQEs in XDP SQ in the multi buffer mode
When MPWQE is disabled, mlx5e_open_xdpsq() prefills the common fields of
WQEs in the XDP SQ to save time when sending packets.
mlx5e_xmit_xdp_frame() runs on the prefilled fields; however, sending
multi buffer XDP frames would require changing some of these fields on a
per-packet basis. Besides that, mlx5e_xmit_xdp_frame() will be used as a
fallback to send multi buffer XDP frames when MPWQE is enabled (MPWQE
can only handle linear packets).
In order to prepare for XDP multi buffer support, this commit introduces
a mode for mlx5e_xmit_xdp_frame() that fills all the fields itself.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Maxim Mikityanskiy [Mon, 31 Jan 2022 15:28:40 +0000 (17:28 +0200)]
net/mlx5e: Remove assignment of inline_hdr.sz on XDP TX
When MPWQE is disabled, mlx5e_open_xdpsq prefills the common fields of
WQEs in the XDP SQ to save time when sending packets. One such field
is eseg->inline_hdr.sz, which can be either 0 or MLX5E_XDP_MIN_INLINE,
depending on the inline mode of the SQ.
The inline mode can't change during the lifetime of the SQ, so setting
this field again in mlx5e_xmit_xdp_frame is redundant. Moreover, the
xmit function only sets it to MLX5E_XDP_MIN_INLINE, but not to 0 in the
other case.
This commit removes the redundant assignment in mlx5e_xmit_xdp_frame.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Maxim Mikityanskiy [Fri, 28 Jan 2022 14:43:14 +0000 (16:43 +0200)]
net/mlx5e: Move mlx5e_xdpi_fifo_push out of xmit_xdp_frame
The implementations of xmit_xdp_frame get the xdpi parameter of type
struct mlx5e_xdp_info for the sole purpose of calling
mlx5e_xdpi_fifo_push() on success.
This commit moves this call outside of xmit_xdp_frame, shifting this
responsibility to the caller. It will allow more fine-grained handling
of XDP info for cases when an xdp_frame is fragmented.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Maxim Mikityanskiy [Fri, 28 Jan 2022 11:13:11 +0000 (13:13 +0200)]
net/mlx5e: Store DMA address inside struct page
Use page_pool_set_dma_addr() to store the DMA address of a page inside
struct page, in order to avoid passing struct mlx5e_dma_info to XDP
handlers. Previously, struct mlx5e_dma_info was used to pass both the
DMA address and the page, and it worked well for the single-fragment
case.
When XDP multi buffer is in use, and a fragmented xdp_frame has to be
transmitted, the driver needs to know the DMA addresses of fragments;
however, the array of fragments in struct skb_shared_info doesn't
contain them. In order to pass the DMA addresses, the driver puts them
into struct page itself, which is accessible from the array of fragments
in struct skb_shared_info. The existing XDP handlers are modified to
remove the dependency on struct mlx5e_dma_info.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Maxim Mikityanskiy [Thu, 27 Jan 2022 15:01:05 +0000 (17:01 +0200)]
net/mlx5e: Add XDP multi buffer support to the non-linear legacy RQ
This commit adds XDP multi buffer support to the RX path in the
non-linear legacy RQ mode. mlx5e_xdp_handle is called from
mlx5e_skb_from_cqe_nonlinear.
XDP_TX action for fragmented XDP frames is not yet supported and
blocked.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Maxim Mikityanskiy [Wed, 9 Feb 2022 13:32:35 +0000 (15:32 +0200)]
net/mlx5e: Use page-sized fragments with XDP multi buffer
The implementation of XDP in mlx5e assumes that the frame size is equal
to the page size. Force this limitation in the non-linear mode for XDP
multi buffer.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Maxim Mikityanskiy [Thu, 27 Jan 2022 12:14:53 +0000 (14:14 +0200)]
net/mlx5e: Use fragments of the same size in non-linear legacy RQ with XDP
XDP multi buffer implementation in the kernel assumes that all fragments
have the same size. bpf_xdp_frags_increase_tail uses this assumption to
get the size of the last fragment, and __xdp_build_skb_from_frame uses
it to calculate truesize as nr_frags * xdpf->frame_sz.
The current implementation of mlx5e uses fragments of different size in
non-linear legacy RQ. Specifically, the last fragment can be larger than
the others. It's an optimization for packets smaller than MTU.
This commit adapts mlx5e to the kernel limitations and makes it use
fragments of the same size, in order to add support for XDP multi
buffer. The change is applied only if XDP is active, otherwise the old
optimization still applies.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
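The truesize assumption mentioned above (nr_frags * xdpf->frame_sz) only matches reality when every fragment buffer has the same size. The following user-space sketch compares the estimate with the real sum; the toy_* names and the buffer sizes used with it are illustrative assumptions, not values taken from mlx5e.

```c
#include <stddef.h>
#include <stdint.h>

/* Estimate as described in the text: every fragment occupies frame_sz. */
static uint32_t toy_estimated_truesize(uint32_t nr_frags, uint32_t frame_sz)
{
	return nr_frags * frame_sz;
}

/* Ground truth: sum of the buffer sizes actually backing each fragment. */
static uint32_t toy_actual_truesize(const uint32_t *frag_buf_sz,
				    size_t nr_frags)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i < nr_frags; i++)
		sum += frag_buf_sz[i];
	return sum;
}
```

With three 4096-byte fragments both functions agree on 12288 bytes. With fragments of 2048, 2048 and 4096 bytes and frame_sz taken as 2048, the estimate is 6144 while the buffers really occupy 8192, which is the kind of mismatch a larger last fragment would cause.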
Maxim Mikityanskiy [Wed, 26 Jan 2022 15:43:33 +0000 (17:43 +0200)]
net/mlx5e: Prepare non-linear legacy RQ for XDP multi buffer support
mlx5e_skb_from_cqe_nonlinear creates an xdp_buff first, putting the
first fragment as the linear part, and the rest of fragments as
fragments to struct skb_shared_info in the tailroom. Then it creates an
SKB in place, based on the xdp_buff. The XDP program is not called in
this commit yet.
This commit contains no functional change, except the SKB is built over
the whole frag_stride of the first fragment, instead of the minimal size
required (headroom, data and skb_shared_info).
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Íñigo Huguet [Tue, 15 Mar 2022 09:18:32 +0000 (10:18 +0100)]
net: set default rss queues num to physical cores / 2
Network drivers can call netif_get_num_default_rss_queues to get the
default number of receive queues to use. Right now, this default number
is min(8, num_online_cpus()).
Instead, as suggested by Jakub, use the number of physical cores divided
by 2 as a way to avoid wasting CPU resources and to avoid using both CPU
threads, while still scaling for high-end processors with many
cores.
As an exception, select 2 queues for processors with 2 cores, because
otherwise it won't take any advantage of RSS despite being SMP capable.
Tested: Processor Intel Xeon E5-2620 (2 sockets, 6 cores/socket, 2
threads/core). NIC Broadcom NetXtreme II BCM57810 (10 Gbps). Ran some
tests with `perf stat iperf3 -R`, with parallelisms of 1, 8 and 24,
getting the following results:
- Number of queues: 6 (instead of 8)
- Network throughput: not affected
- CPU usage: utilized 0.05-0.12 CPUs more than before (having 24 CPUs
this is only 0.2-0.5% higher)
- Reduced the number of context switches by 7-50%, being more noticeable
when using a higher number of parallel threads.
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
Link: https://lore.kernel.org/r/20220315091832.13873-1-ihuguet@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
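The old and new defaults described above can be compared with a short sketch. The function names are hypothetical, and rounding up for odd core counts is an assumption of this sketch, since the text only says "divided by 2".

```c
/* Old default from the text: min(8, num_online_cpus()). */
static unsigned int toy_old_default_rss_queues(unsigned int online_cpus)
{
	return online_cpus < 8 ? online_cpus : 8;
}

/*
 * New default from the text: physical cores / 2, except that a
 * 2-core processor keeps 2 queues so it still benefits from RSS.
 * Assumption: odd core counts round up.
 */
static unsigned int toy_new_default_rss_queues(unsigned int physical_cores)
{
	if (physical_cores <= 2)
		return physical_cores;
	return (physical_cores + 1) / 2;
}
```

For the test machine in the text (12 physical cores, 24 online CPUs) this gives 6 queues instead of 8, matching the reported result.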
Mianhan Liu [Tue, 28 Sep 2021 19:47:30 +0000 (03:47 +0800)]
Bluetooth: ath3k: remove superfluous header files
ath3k.c doesn't use any macro or function declared in linux/device.h.
Thus, these files can be removed from ath3k.c safely without
affecting the compilation of the ./drivers/bluetooth module.
Signed-off-by: Mianhan Liu <liumh1@shanghaitech.edu.cn>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Mianhan Liu [Tue, 28 Sep 2021 19:51:08 +0000 (03:51 +0800)]
Bluetooth: bcm203x: remove superfluous header files
bcm203x.c doesn't use any macro or function declared in linux/atomic.h.
Thus, these files can be removed from bcm203x.c safely without
affecting the compilation of the ./drivers/bluetooth module.
Signed-off-by: Mianhan Liu <liumh1@shanghaitech.edu.cn>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Hans de Goede [Mon, 28 Feb 2022 11:38:41 +0000 (12:38 +0100)]
Bluetooth: hci_bcm: Add the Asus TF103C to the bcm_broken_irq_dmi_table
The DSDT for the Asus TF103C specifies an IOAPIC IRQ for the HCI -> host IRQ
but this is not correct. Unlike the previous entries in the table, this
time the correct GPIO to use instead is known; and the TF103C is battery
powered making runtime-pm support more important.
Extend the bcm_broken_irq_dmi_table mechanism to allow specifying the right
GPIO instead of just always disabling runtime-pm and add an entry to it for
the Asus TF103C.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Yake Yang [Wed, 16 Mar 2022 23:15:23 +0000 (07:15 +0800)]
Bluetooth: mt7921s: Add WBS support
It is time to add wide band speech (WBS) support.
Reviewed-by: Mark Chen <markyawenchen@gmail.com>
Co-developed-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: Yake Yang <yake.yang@mediatek.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Yake Yang [Wed, 16 Mar 2022 23:15:22 +0000 (07:15 +0800)]
Bluetooth: mt7921s: Add .btmtk_get_codec_config_data
Add .btmtk_get_codec_config_data to get codec configuration data.
In the HFP offload use case, controllers need to be given codec details before
opening SCO. This callback function is used to fetch vendor specific codec
config data.
This is a preliminary patch to add the WBS support to the MT7921 driver.
Reviewed-by: Mark Chen <markyawenchen@gmail.com>
Co-developed-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: Yake Yang <yake.yang@mediatek.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Yake Yang [Wed, 16 Mar 2022 23:15:21 +0000 (07:15 +0800)]
Bluetooth: mt7921s: Add .get_data_path_id
Add .get_data_path_id to fetch data_path_id for MT7921 to support HFP
offload use case.
This is a preliminary patch to add the WBS support to the MT7921 driver.
Reviewed-by: Mark Chen <markyawenchen@gmail.com>
Co-developed-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: Yake Yang <yake.yang@mediatek.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Yake Yang [Wed, 16 Mar 2022 23:15:20 +0000 (07:15 +0800)]
Bluetooth: mt7921s: Set HCI_QUIRK_VALID_LE_STATES
The patch sets HCI_QUIRK_VALID_LE_STATES to be consistent with btusb for
MT7921; the quirk is required for the likes of experimental LE simultaneous roles.
Reviewed-by: Mark Chen <markyawenchen@gmail.com>
Co-developed-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: Yake Yang <yake.yang@mediatek.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Yake Yang [Wed, 16 Mar 2022 23:15:19 +0000 (07:15 +0800)]
Bluetooth: btmtksdio: Fix kernel oops in btmtksdio_interrupt
Fix the following kernel oops in btmtksdio_interrupt:
[ 14.339134] btmtksdio_interrupt+0x28/0x54
[ 14.339139] process_sdio_pending_irqs+0x68/0x1a0
[ 14.339144] sdio_irq_work+0x40/0x70
[ 14.339154] process_one_work+0x184/0x39c
[ 14.339160] worker_thread+0x228/0x3e8
[ 14.339168] kthread+0x148/0x3ac
[ 14.339176] ret_from_fork+0x10/0x30
This happened because hdev->power_on can run before the driver data set by
sdio_set_drvdata, which the btmtksdio_interrupt handler relies on, is
properly set up.
In detail: hci_register_dev runs
queue_work(hdev->req_workqueue, &hdev->power_on) on a WQ_HIGHPRI
workqueue_struct to complete the power-on sequence, and thus hci_power_on
may run before sdio_set_drvdata is done in btmtksdio_probe.
The hci_dev_do_open in hci_power_on initializes the device and enables
the interrupt, so it is possible that btmtksdio_interrupt is
called right before sdio_set_drvdata has filled in the driver data.
When btmtksdio_interrupt is called and the driver data is not yet set,
the kernel oops happens because btmtksdio_interrupt accesses an
uninitialized pointer.
Fixes:
9aebfd4a2200 ("Bluetooth: mediatek: add support for MediaTek MT7663S and MT7668S SDIO devices")
Reviewed-by: Mark Chen <markyawenchen@gmail.com>
Co-developed-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: Yake Yang <yake.yang@mediatek.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Dan Carpenter [Thu, 17 Mar 2022 07:57:40 +0000 (10:57 +0300)]
Bluetooth: btmtkuart: fix error handling in mtk_hci_wmt_sync()
This code has an uninitialized variable warning:
drivers/bluetooth/btmtkuart.c:184 mtk_hci_wmt_sync()
error: uninitialized symbol 'wc'.
But it also has error paths which have memory leaks.
Fixes:
8f550f55b155 ("Bluetooth: btmtkuart: rely on BT_MTK module")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Niels Dossche [Wed, 16 Mar 2022 15:33:50 +0000 (16:33 +0100)]
Bluetooth: call hci_le_conn_failed with hdev lock in hci_le_conn_failed
hci_le_conn_failed function's documentation says that the caller must
hold hdev->lock. The only callsite that does not hold that lock is
hci_le_conn_failed. The other 3 callsites hold the hdev->lock very
locally. The solution is to hold the lock during the call to
hci_le_conn_failed.
Fixes:
3c857757ef6e ("Bluetooth: Add directed advertising support through connect()")
Signed-off-by: Niels Dossche <dossche.niels@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Manish Mandlik [Sat, 12 Mar 2022 10:08:59 +0000 (02:08 -0800)]
Bluetooth: Send AdvMonitor Dev Found for all matched devices
When an Advertisement Monitor is configured with SamplingPeriod 0xFF,
the controller reports only one adv report along with the MSFT Monitor
Device event.
When an advertiser matches multiple monitors, some controllers send one
adv report for each matched monitor; whereas, some controllers send just
one adv report for all matched monitors.
In such a case, report Adv Monitor Device Found event for each matched
monitor.
Signed-off-by: Manish Mandlik <mmandlik@google.com>
Reviewed-by: Miao-chen Chou <mcchou@chromium.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Manish Mandlik [Sat, 12 Mar 2022 10:08:58 +0000 (02:08 -0800)]
Bluetooth: msft: Clear tracked devices on resume
Clear already tracked devices on system resume. Once the monitors are
reregistered after resume, matched devices in range will be found again.
Signed-off-by: Manish Mandlik <mmandlik@google.com>
Reviewed-by: Miao-chen Chou <mcchou@chromium.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Gavin Li [Mon, 14 Mar 2022 22:42:52 +0000 (15:42 -0700)]
Bluetooth: fix incorrect nonblock bitmask in bt_sock_wait_ready()
Callers pass msg->msg_flags as flags, which contains MSG_DONTWAIT
instead of O_NONBLOCK.
Signed-off-by: Gavin Li <gavin@matician.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
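The flag mix-up described above comes down to testing the wrong bit: per-message flags carry MSG_DONTWAIT, while O_NONBLOCK belongs to file status flags. A minimal user-space sketch of the corrected check follows; the helper name and the "zero budget means do not wait" convention are assumptions of this sketch, not the kernel function.

```c
#include <sys/socket.h>	/* MSG_DONTWAIT */

/*
 * Hypothetical helper: how long a send-side wait may block.
 * msg_flags holds per-message flags (as in msg->msg_flags), so the
 * non-blocking request must be tested with MSG_DONTWAIT.
 */
static long toy_wait_budget(unsigned int msg_flags, long sndtimeo)
{
	return (msg_flags & MSG_DONTWAIT) ? 0 : sndtimeo;
}
```

Testing the same flags word against O_NONBLOCK would silently ignore a caller's MSG_DONTWAIT wherever the two constants have different values.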
Christophe JAILLET [Wed, 2 Mar 2022 20:18:35 +0000 (21:18 +0100)]
Bluetooth: Don't assign twice the same value
data.pid is set twice with the same value. Remove one of these redundant
calls.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Max Chou [Mon, 14 Mar 2022 06:54:22 +0000 (14:54 +0800)]
Bluetooth: btrtl: Add support for RTL8852B
Add support for the RTL8852B BT controller on the USB interface.
The necessary firmware file will be submitted to linux-firmware.
Signed-off-by: Max Chou <max.chou@realtek.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>