Yue Haibing [Fri, 4 Aug 2023 12:55:25 +0000 (20:55 +0800)]
i40e: Remove unused function declarations
Commit
f62b5060d670 ("i40e: fix mac address checking") left behind
i40e_validate_mac_addr() declaration.
Also the other declarations are declared but never implemented in
commit
56a62fc86895 ("i40e: init code and hardware support").
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20230804125525.20244-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Yue Haibing [Fri, 4 Aug 2023 12:52:03 +0000 (20:52 +0800)]
ixgbe: Remove unused function declarations
Commit
dc166e22ede5 ("ixgbe: DCB remove ixgbe_fcoe_getapp routine")
leave ixgbe_fcoe_getapp() unused.
Commit
ffed21bcee7a ("ixgbe: Don't bother clearing buffer memory for descriptor rings")
leave ixgbe_unmap_and_free_tx_resource() declaration unused.
And commit
3b3bf3b92b31 ("ixgbe: remove unused fcoe.tc field and fcoe_setapp()")
removed the ixgbe_fcoe_setapp() implementation.
Commit
c44ade9ef8ff ("ixgbe: update to latest common code module")
declared but never implemented ixgbe_init_ops_generic().
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20230804125203.30924-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 8 Aug 2023 02:18:29 +0000 (19:18 -0700)]
Merge branch 'net-remove-redundant-initialization-owner'
Li Zetao says:
====================
net: Remove redundant initialization owner
This patch set removes redundant initialization owner when register a
fsl_mc_driver driver
====================
Link: https://lore.kernel.org/r/20230804095946.99956-1-lizetao1@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Li Zetao [Fri, 4 Aug 2023 09:59:46 +0000 (17:59 +0800)]
net: dpaa2-switch: Remove redundant initialization owner in dpaa2_switch_drv
The fsl_mc_driver_register() will set "THIS_MODULE" to driver.owner when
register a fsl_mc_driver driver, so it is redundant initialization to set
driver.owner in dpaa2_switch_drv statement. Remove it for clean code.
Signed-off-by: Li Zetao <lizetao1@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20230804095946.99956-3-lizetao1@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Li Zetao [Fri, 4 Aug 2023 09:59:45 +0000 (17:59 +0800)]
net: dpaa2-eth: Remove redundant initialization owner in dpaa2_eth_driver
The fsl_mc_driver_register() will set "THIS_MODULE" to driver.owner when
register a fsl_mc_driver driver, so it is redundant initialization to set
driver.owner in dpaa2_eth_driver statement. Remove it for clean code.
Signed-off-by: Li Zetao <lizetao1@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20230804095946.99956-2-lizetao1@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 8 Aug 2023 02:16:00 +0000 (19:16 -0700)]
Merge branch 'octeontx2-af-tc-flower-offload-changes'
Suman Ghosh says:
====================
octeontx2-af: TC flower offload changes
This patchset includes minor code restructuring related to TC
flower offload for outer vlan and adding support for TC inner
vlan offload.
Patch #1 Code restructure to handle TC flower outer vlan offload
Patch #2 Add TC flower offload support for inner vlan
====================
Link: https://lore.kernel.org/r/20230804045935.3010554-1-sumang@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Suman Ghosh [Fri, 4 Aug 2023 04:59:35 +0000 (10:29 +0530)]
octeontx2-af: TC flower offload support for inner VLAN
Extend the current TC flower offload support to enable filters matching
inner VLAN, and support offload of those filters to hardware.
Signed-off-by: Suman Ghosh <sumang@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20230804045935.3010554-3-sumang@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Suman Ghosh [Fri, 4 Aug 2023 04:59:34 +0000 (10:29 +0530)]
octeontx2-af: Code restructure to handle TC outer VLAN offload
Moved the TC outer VLAN offload support to a separate function.
This change is done to handle all VLAN related changes cleanly from
a dedicated function.
Signed-off-by: Suman Ghosh <sumang@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20230804045935.3010554-2-sumang@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Mon, 7 Aug 2023 19:28:54 +0000 (12:28 -0700)]
Merge branch 'page_pool-a-couple-of-assorted-optimizations'
Alexander Lobakin says:
====================
page_pool: a couple of assorted optimizations
That initially was a spin-off of the IAVF PP series[0], but has grown
(and shrunk) since then a bunch. In fact, it consists of three
semi-independent blocks:
* #1-2: Compile-time optimization. Split page_pool.h into 2 headers to
not overbloat the consumers not needing complex inline helpers and
then stop including it in skbuff.h at all. The first patch is also
prereq for the whole series.
* #3: Improve cacheline locality for users of the Page Pool frag API.
* #4-6: Use direct cache recycling more aggressively, when it is safe
obviously. In addition, make sure nobody wants to use Page Pool API
with disabled interrupts.
Patches #1 and #5 are authored by Yunsheng and Jakub respectively, with
small modifications from my side as per ML discussions.
For the perf numbers for #3-6, please see individual commit messages.
Also available on my GH with many more Page Pool goodies[1].
[0] https://lore.kernel.org/netdev/
20230530150035.
1943669-1-aleksander.lobakin@intel.com
[1] https://github.com/alobakin/linux/commits/iavf-pp-frag
====================
Link: https://lore.kernel.org/r/20230804180529.2483231-1-aleksander.lobakin@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alexander Lobakin [Fri, 4 Aug 2023 18:05:29 +0000 (20:05 +0200)]
net: skbuff: always try to recycle PP pages directly when in softirq
Commit
8c48eea3adf3 ("page_pool: allow caching from safely localized
NAPI") allowed direct recycling of skb pages to their PP for some cases,
but unfortunately missed a couple of other majors.
For example, %XDP_DROP in skb mode. The netstack just calls kfree_skb(),
which unconditionally passes `false` as @napi_safe. Thus, all pages go
through ptr_ring and locks, although most of time we're actually inside
the NAPI polling this PP is linked with, so that it would be perfectly
safe to recycle pages directly.
Let's address such. If @napi_safe is true, we're fine, don't change
anything for this path. But if it's false, check whether we are in the
softirq context. It will most likely be so and then if ->list_owner
is our current CPU, we're good to use direct recycling, even though
@napi_safe is false -- concurrent access is excluded. in_softirq()
protection is needed mostly due to we can hit this place in the
process context (not the hardirq though).
For the mentioned xdp-drop-skb-mode case, the improvement I got is
3-4% in Mpps. As for page_pool stats, recycle_ring is now 0 and
alloc_slow counter doesn't change most of time, which means the
MM layer is not even called to allocate any new pages.
Suggested-by: Jakub Kicinski <kuba@kernel.org> # in_softirq()
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://lore.kernel.org/r/20230804180529.2483231-7-aleksander.lobakin@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Fri, 4 Aug 2023 18:05:28 +0000 (20:05 +0200)]
page_pool: add a lockdep check for recycling in hardirq
Page pool use in hardirq is prohibited, add debug checks
to catch misuses. IIRC we previously discussed using
DEBUG_NET_WARN_ON_ONCE() for this, but there were concerns
that people will have DEBUG_NET enabled in perf testing.
I don't think anyone enables lockdep in perf testing,
so use lockdep to avoid pushback and arguing :)
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://lore.kernel.org/r/20230804180529.2483231-6-aleksander.lobakin@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alexander Lobakin [Fri, 4 Aug 2023 18:05:27 +0000 (20:05 +0200)]
net: skbuff: avoid accessing page_pool if !napi_safe when returning page
Currently, pp->p.napi is always read, but the actual variable it gets
assigned to is read-only when @napi_safe is true. For the !napi_safe
cases, which yet is still a pack, it's an unneeded operation.
Moreover, it can lead to premature or even redundant page_pool
cacheline access. For example, when page_pool_is_last_frag() returns
false (with the recent frag improvements).
Thus, read it only when @napi_safe is true. This also allows moving
@napi inside the condition block itself. Constify it while we are
here, because why not.
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://lore.kernel.org/r/20230804180529.2483231-5-aleksander.lobakin@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alexander Lobakin [Fri, 4 Aug 2023 18:05:26 +0000 (20:05 +0200)]
page_pool: place frag_* fields in one cacheline
On x86_64, frag_* fields of struct page_pool are scattered across two
cachelines despite the summary size of 24 bytes. All three fields are
used in pretty much the same places, but the last field, ::frag_users,
is pushed out to the next CL, provoking unwanted false-sharing on
hotpath (frags allocation code).
There are some holes and cold members to move around. Move frag_* one
block up, placing them right after &page_pool_params perfectly at the
beginning of CL2. This doesn't do any meaningful to the second block, as
those are some destroy-path cold structures, and doesn't do anything to
::alloc_stats, which still starts at 200-byte offset, 8 bytes after CL3
(still fitting into 1 cacheline).
On my setup, this yields 1-2% of Mpps when using PP frags actively.
When it comes to 32-bit architectures with 32-byte CL: &page_pool_params
plus ::pad is 44 bytes, the block taken care of is 16 bytes within one
CL, so there should be at least no regressions from the actual change.
::pages_state_hold_cnt is not related directly to that triple, but is
paired currently with ::frags_offset and decoupling them would mean
either two 4-byte holes or more invasive layout changes.
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://lore.kernel.org/r/20230804180529.2483231-4-aleksander.lobakin@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Alexander Lobakin [Fri, 4 Aug 2023 18:05:25 +0000 (20:05 +0200)]
net: skbuff: don't include <net/page_pool/types.h> to <linux/skbuff.h>
Currently, touching <net/page_pool/types.h> triggers a rebuild of more
than half of the kernel. That's because it's included in
<linux/skbuff.h>. And each new include to page_pool/types.h adds more
[useless] data for the toolchain to process per each source file from
that pile.
In commit
6a5bcd84e886 ("page_pool: Allow drivers to hint on SKB
recycling"), Matteo included it to be able to call a couple of functions
defined there. Then, in commit
57f05bc2ab24 ("page_pool: keep pp info as
long as page pool owns the page") one of the calls was removed, so only
one was left. It's the call to page_pool_return_skb_page() in
napi_frag_unref(). The function is external and doesn't have any
dependencies. Having very niche page_pool_types.h included only for that
looks like an overkill.
As %PP_SIGNATURE is not local to page_pool.c (was only in the
early submissions), nothing holds this function there. Teleport
page_pool_return_skb_page() to skbuff.c, just next to the main consumer,
skb_pp_recycle(), and rename it to napi_pp_put_page(), as it doesn't
work with skbs at all and the former name tells nothing. The #if guards
here are only to not compile and have it in the vmlinux when not needed
-- both call sites are already guarded.
Now, touching page_pool_types.h only triggers rebuilding of the drivers
using it and a couple of core networking files.
Suggested-by: Jakub Kicinski <kuba@kernel.org> # make skbuff.h less heavy
Suggested-by: Alexander Duyck <alexanderduyck@fb.com> # move to skbuff.c
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://lore.kernel.org/r/20230804180529.2483231-3-aleksander.lobakin@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Yunsheng Lin [Fri, 4 Aug 2023 18:05:24 +0000 (20:05 +0200)]
page_pool: split types and declarations from page_pool.h
Split types and pure function declarations from page_pool.h
and add them in page_page/types.h, so that C sources can
include page_pool.h and headers should generally only include
page_pool/types.h as suggested by jakub.
Rename page_pool.h to page_pool/helpers.h to have both in
one place.
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://lore.kernel.org/r/20230804180529.2483231-2-aleksander.lobakin@intel.com
[Jakub: change microsoft/mana, fix kdoc paths in Documentation]
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Mon, 7 Aug 2023 19:25:46 +0000 (12:25 -0700)]
Merge tag 'linux-can-next-for-6.6-
20230807' of git://git./linux/kernel/git/mkl/linux-can-next
Marc Kleine-Budde says:
====================
pull-request: can-next 2023-08-07
The patch is from me and reverts the addition of the CAN controller
nodes in the allwinner d1 SoC.
* tag 'linux-can-next-for-6.6-
20230807' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next:
Revert "riscv: dts: allwinner: d1: Add CAN controller nodes"
====================
Link: https://lore.kernel.org/r/20230807074222.1576119-1-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Mon, 7 Aug 2023 19:17:14 +0000 (12:17 -0700)]
Merge branch 'net-stmmac-correct-mac-propagation-delay'
Johannes Zink says:
====================
net: stmmac: correct MAC propagation delay
Changes in v3:
- work in Richard's review feedback. Thank you for reviewing my patch:
- as some of the hardware may have no or invalid correction value
registers: introduce feature switch which can be enabled in the glue
code drivers depending on the actual hardware support
- only enable the feature on the i.MX8MP for the time being, as the patch
improves timing accuracy and is tested for this hardware
- Link to v2: https://lore.kernel.org/r/
20230719-stmmac_correct_mac_delay-v2-1-
3366f38ee9a6@pengutronix.de
Changes in v2:
- fix builds for 32bit, this was found by the kernel build bot
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202307200225.B8rmKQPN-lkp@intel.com/
- while at it also fix an overflow by shifting a u32 constant from macro by 10bits
by casting the constant to u64
- Link to v1: https://lore.kernel.org/r/
20230719-stmmac_correct_mac_delay-v1-1-
768aa4d09334@pengutronix.de
Tested-by: Kurt Kanzenbach <kurt@linutronix.de> # imx8mp
====================
Link: https://lore.kernel.org/r/20230719-stmmac_correct_mac_delay-v3-0-61e63427735e@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Johannes Zink [Tue, 1 Aug 2023 15:44:30 +0000 (17:44 +0200)]
net: stmmac: dwmac-imx: enable MAC propagation delay correction for i.MX8MP
As the i.MX8MP supports reading MAC propagation delay and correcting the
Hardware timestamp counter for additional delays [1], enable the feature
for this SoC.
This reduces phase error of the PPS output from the PTP Hardware Clock
from approx 150ns to 100ns.
[1] i.MX8MP Reference Manual, rev.1 Section 11.7.2.5.3 "Timestamp
correction"
Signed-off-by: Johannes Zink <j.zink@pengutronix.de>
Link: https://lore.kernel.org/r/20230719-stmmac_correct_mac_delay-v3-2-61e63427735e@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Johannes Zink [Tue, 1 Aug 2023 15:44:29 +0000 (17:44 +0200)]
net: stmmac: correct MAC propagation delay
The IEEE1588 Standard specifies that the timestamps of Packets must be
captured when the PTP message timestamp point (leading edge of first
octet after the start of frame delimiter) crosses the boundary between
the node and the network. As the MAC latches the timestamp at an
internal point, the captured timestamp must be corrected for the
additional data transmission latency, as described in the publicly
available datasheet [1].
This patch only corrects for the MAC-Internal delay, which can be read
out from the MAC_Ingress_Timestamp_Latency register on DWMAC version 5,
since the Phy framework currently does not support querying the Phy
ingress and egress latency. The Closs Domain Crossing Circuits errors as
indicated in [1] are already being accounted in the
stmmac_get_tx_hwtstamp() function and are not corrected here.
As the Latency varies for different link speeds and MII
modes of operation, the correction value needs to be updated on each
link state change.
As the delay also causes a phase shift in the timestamp counter compared
to the rest of the network, this correction will also reduce phase error
when generating PPS outputs from the timestamp counter.
Since the correction registers may be unavailable on some hardware and
no feature bits are documented for dynamically detection of the MAC
propagation delay readout, introduce a feature bit to explicitely enable
MAC delay Correction in the gluecode driver.
[1] i.MX8MP Reference Manual, rev.1 Section 11.7.2.5.3 "Timestamp
correction"
Signed-off-by: Johannes Zink <j.zink@pengutronix.de>
Link: https://lore.kernel.org/r/20230719-stmmac_correct_mac_delay-v2-1-3366f38ee9a6@pengutronix.de
Link: https://lore.kernel.org/r/20230719-stmmac_correct_mac_delay-v3-1-61e63427735e@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Yue Haibing [Sat, 5 Aug 2023 11:00:09 +0000 (19:00 +0800)]
udp/udplite: Remove unused function declarations udp{,lite}_get_port()
Commit
6ba5a3c52da0 ("[UDP]: Make full use of proto.h.udp_hash innovation.")
removed these implementations but leave declarations.
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yue Haibing [Sat, 5 Aug 2023 10:57:40 +0000 (18:57 +0800)]
net: sfp: Remove unused function declaration sfp_link_configure()
Commit
ce0aa27ff3f6 ("sfp: add sfp-bus to bridge between network devices and sfp cages")
declared but never implemented it.
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yue Haibing [Sat, 5 Aug 2023 10:53:54 +0000 (18:53 +0800)]
ndisc: Remove unused ndisc_ifinfo_sysctl_strategy() declaration
Commit
f8572d8f2a2b ("sysctl net: Remove unused binary sysctl code")
left behind this declaration.
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yue Haibing [Sat, 5 Aug 2023 10:52:08 +0000 (18:52 +0800)]
net: pkt_cls: Remove unused inline helpers
Commit
acb674428c3d ("net: sched: introduce per-block callbacks")
implemented these but never used it.
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yue Haibing [Sat, 5 Aug 2023 10:50:33 +0000 (18:50 +0800)]
neighbour: Remove unused function declaration pneigh_for_each()
pneigh_for_each() is never implemented since the beginning of git history.
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yue Haibing [Sat, 5 Aug 2023 10:48:11 +0000 (18:48 +0800)]
net/tls: Remove unused function declarations
Commit
3c4d7559159b ("tls: kernel TLS support") declared but never implemented
these functions.
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marc Kleine-Budde [Mon, 7 Aug 2023 07:28:50 +0000 (09:28 +0200)]
Revert "riscv: dts: allwinner: d1: Add CAN controller nodes"
It turned out the dtsi changes were not quite ready, revert them for
now.
This reverts commit
6ea1ad888f5900953a21853e709fa499fdfcb317.
Link: https://lore.kernel.org/all/2690764.mvXUDI8C0e@jernej-laptop
Suggested-by: Jernej Å krabec <jernej.skrabec@gmail.com>
Link: https://lore.kernel.org/all/20230807-riscv-allwinner-d1-revert-can-controller-nodes-v1-1-eb3f70b435d9@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Vladimir Oltean [Fri, 4 Aug 2023 13:49:39 +0000 (16:49 +0300)]
net: omit ndo_hwtstamp_get() call when possible in dev_set_hwtstamp_phylib()
Setting dev->priv_flags & IFF_SEE_ALL_HWTSTAMP_REQUESTS is only legal
for drivers which were converted to ndo_hwtstamp_get() and
ndo_hwtstamp_set(), and it is only there that we call ndo_hwtstamp_set()
for a request that otherwise goes to phylib (for stuff like packet traps,
which need to be undone if phylib failed, hence the old_cfg logic).
The problem is that we end up calling ndo_hwtstamp_get() when we don't
need to (even if the SIOCSHWTSTAMP wasn't intended for phylib, or if it
was, but the driver didn't set IFF_SEE_ALL_HWTSTAMP_REQUESTS). For those
unnecessary conditions, we share a code path with virtual drivers (vlan,
macvlan, bonding) where ndo_hwtstamp_get() is implemented as
generic_hwtstamp_get_lower(), and may be resolved through
generic_hwtstamp_ioctl_lower() if the lower device is unconverted.
I.e. this situation:
$ ip link add link eno0 name eno0.100 type vlan id 100
$ hwstamp_ctl -i eno0.100 -t 1
We are unprepared to deal with this, because if ndo_hwtstamp_get() is
resolved through a legacy ndo_eth_ioctl(SIOCGHWTSTAMP) lower_dev
implementation, that needs a non-NULL old_cfg.ifr pointer, and we don't
have it.
But we don't even need to deal with it either. In the general case,
drivers may not even implement SIOCGHWTSTAMP handling, only SIOCSHWTSTAMP,
so it makes sense to completely avoid a SIOCGHWTSTAMP call if we can.
The solution is to split the single "if" condition into 3 smaller ones,
thus separating the decision to call ndo_hwtstamp_get() from the
decision to call ndo_hwtstamp_set(). The third "if" condition is
identical to the first one, and both are subsets of the second one.
Thus, the "cfg" argument of kernel_hwtstamp_config_changed() is always
valid.
Reported-by: Eric Dumazet <edumazet@google.com>
Closes: https://lore.kernel.org/netdev/CANn89iLOspJsvjPj+y8jikg7erXDomWe8sqHMdfL_2LQSFrPAg@mail.gmail.com/
Fixes: fd770e856e22 ("net: remove phy_has_hwtstamp() -> phy_mii_ioctl() decision from converted drivers")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yang Yingliang [Fri, 4 Aug 2023 09:35:31 +0000 (17:35 +0800)]
net: ethernet: adi: adin1110: use eth_broadcast_addr() to assign broadcast address
Use eth_broadcast_addr() to assign broadcast address instead
of memset().
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yu Liao [Fri, 4 Aug 2023 09:21:43 +0000 (17:21 +0800)]
ibmvnic: remove unused rc variable
gcc with W=1 reports
drivers/net/ethernet/ibm/ibmvnic.c:194:13: warning: variable 'rc' set but not used [-Wunused-but-set-variable]
^
This variable is not used so remove it.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202308040609.zQsSXWXI-lkp@intel.com/
Signed-off-by: Yu Liao <liaoyu15@huawei.com>
Reviewed-by: Nick Child <nnac123@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Haiyang Zhang [Fri, 4 Aug 2023 20:33:53 +0000 (13:33 -0700)]
net: mana: Add page pool for RX buffers
Add page pool for RX buffers for faster buffer cycle and reduce CPU
usage.
The standard page pool API is used.
With iperf and 128 threads test, this patch improved the throughput
by 12-15%, and decreased the IRQ associated CPU's usage from 99-100% to
10-50%.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 6 Aug 2023 07:34:37 +0000 (08:34 +0100)]
Merge branch 'gve-desc'
Rushil Gupta says:
====================
gve: Add QPL mode for DQO descriptor format
GVE supports QPL ("queue-page-list") mode where
all data is communicated through a set of pre-registered
pages. Adding this mode to DQO.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Rushil Gupta [Fri, 4 Aug 2023 21:34:44 +0000 (21:34 +0000)]
gve: update gve.rst
Add a note about QPL and RDA mode
Signed-off-by: Rushil Gupta <rushilg@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com>
Signed-off-by: Bailey Forrest <bcf@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rushil Gupta [Fri, 4 Aug 2023 21:34:43 +0000 (21:34 +0000)]
gve: RX path for DQO-QPL
The RX path allocates the QPL page pool at queue creation, and
tries to reuse these pages through page recycling. This patch
ensures that on refill no non-QPL pages are posted to the device.
When the driver is running low on free buffers, an ondemand
allocation step kicks in that allocates a non-qpl page for
SKB business to free up the QPL page in use.
gve_try_recycle_buf was moved to gve_rx_append_frags so that driver does
not attempt to mark buffer as used if a non-qpl page was allocated
ondemand.
Signed-off-by: Rushil Gupta <rushilg@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com>
Signed-off-by: Bailey Forrest <bcf@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rushil Gupta [Fri, 4 Aug 2023 21:34:42 +0000 (21:34 +0000)]
gve: Tx path for DQO-QPL
Each QPL page is divided into GVE_TX_BUFS_PER_PAGE_DQO buffers.
When a packet needs to be transmitted, we break the packet into max
GVE_TX_BUF_SIZE_DQO sized chunks and transmit each chunk using a TX
descriptor.
We allocate the TX buffers from the free list in dqo_tx.
We store these TX buffer indices in an array in the pending_packet
structure.
The TX buffers are returned to the free list in dqo_compl after
receiving packet completion or when removing packets from miss
completions list.
Signed-off-by: Rushil Gupta <rushilg@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com>
Signed-off-by: Bailey Forrest <bcf@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rushil Gupta [Fri, 4 Aug 2023 21:34:41 +0000 (21:34 +0000)]
gve: Control path for DQO-QPL
GVE supports QPL ("queue-page-list") mode where
all data is communicated through a set of pre-registered
pages. Adding this mode to DQO descriptor format.
Add checks, abi-changes and device options to support
QPL mode for DQO in addition to GQI. Also, use
pages-per-qpl supplied by device-option to control the
size of the "queue-page-list".
Signed-off-by: Rushil Gupta <rushilg@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com>
Signed-off-by: Bailey Forrest <bcf@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 6 Aug 2023 07:24:56 +0000 (08:24 +0100)]
Merge branch 'tcp-options-lockless'
Eric Dumazet says:
====================
tcp: set few options locklessly
This series is avoiding the socket lock for six TCP options.
They are not heavily used, but this exercise can give
ideas for other parts of TCP/IP stack :)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 4 Aug 2023 14:46:16 +0000 (14:46 +0000)]
tcp: set TCP_DEFER_ACCEPT locklessly
rskq_defer_accept field can be read/written without
the need of holding the socket lock.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 4 Aug 2023 14:46:15 +0000 (14:46 +0000)]
tcp: set TCP_LINGER2 locklessly
tp->linger2 can be set locklessly as long as readers
use READ_ONCE().
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 4 Aug 2023 14:46:14 +0000 (14:46 +0000)]
tcp: set TCP_KEEPCNT locklessly
tp->keepalive_probes can be set locklessly, readers
are already taking care of this field being potentially
set by other threads.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 4 Aug 2023 14:46:13 +0000 (14:46 +0000)]
tcp: set TCP_KEEPINTVL locklessly
tp->keepalive_intvl can be set locklessly, readers
are already taking care of this field being potentially
set by other threads.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 4 Aug 2023 14:46:12 +0000 (14:46 +0000)]
tcp: set TCP_USER_TIMEOUT locklessly
icsk->icsk_user_timeout can be set locklessly,
if all read sides use READ_ONCE().
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 4 Aug 2023 14:46:11 +0000 (14:46 +0000)]
tcp: set TCP_SYNCNT locklessly
icsk->icsk_syn_retries can safely be set without locking the socket.
We have to add READ_ONCE() annotations in tcp_fastopen_synack_timer()
and tcp_write_timeout().
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Sat, 5 Aug 2023 01:34:25 +0000 (18:34 -0700)]
Merge tag 'wireless-next-2023-08-04' of git://git./linux/kernel/git/wireless/wireless-next
Kalle Valo says:
====================
wireless-next patches for v6.6
The first pull request for v6.6 and only driver patches this time.
Nothing special really standing out, it has been quiet most likely due
to vacations.
Major changes:
rtl8xxxu
- enable AP mode for: RTL8192FU, RTL8710BU (RTL8188GU),
RTL8192EU and RTL8723BU
mwifiex
- allow moving to a different namespace
mt76
- preparation for mt7925 support
- mt7981 support
ath12k
- Extremely High Throughput (EHT) PHY support for Wi-Fi 7
* tag 'wireless-next-2023-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (172 commits)
wifi: rtw89: return failure if needed firmware elements are not recognized
wifi: rtw89: add to parse firmware elements of BB and RF tables
wifi: rtw89: introduce infrastructure of firmware elements
wifi: rtw89: add firmware suit for BB MCU 0/1
wifi: rtw89: add firmware parser for v1 format
wifi: rtw89: introduce v1 format of firmware header
wifi: rtw89: support firmware log with formatted text
wifi: rtw89: recognize log format from firmware file
wifi: ath12k: avoid deadlock by change ieee80211_queue_work for regd_update_work
wifi: ath12k: add handler for scan event WMI_SCAN_EVENT_DEQUEUED
wifi: ath12k: relax list iteration in ath12k_mac_vif_unref()
wifi: ath12k: configure puncturing bitmap
wifi: ath12k: parse WMI service ready ext2 event
wifi: ath12k: add MLO header in peer association
wifi: ath12k: peer assoc for 320 MHz
wifi: ath12k: add WMI support for EHT peer
wifi: ath12k: prepare EHT peer assoc parameters
wifi: ath12k: add EHT PHY modes
wifi: ath12k: propagate EHT capabilities to userspace
wifi: ath12k: WMI support to process EHT capabilities
...
====================
Link: https://lore.kernel.org/r/87msz7j942.fsf@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Sat, 5 Aug 2023 01:28:38 +0000 (18:28 -0700)]
Merge branch 'tcp-disable-header-prediction-for-md5'
Kuniyuki Iwashima says:
====================
tcp: Disable header prediction for MD5.
The 1st patch disable header prediction for MD5 flow and the 2nd
patch updates the stale comment in tcp_parse_options().
====================
Link: https://lore.kernel.org/r/20230803224552.69398-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Kuniyuki Iwashima [Thu, 3 Aug 2023 22:45:52 +0000 (15:45 -0700)]
tcp: Update stale comment for MD5 in tcp_parse_options().
Since commit
9ea88a153001 ("tcp: md5: check md5 signature without socket
lock"), the MD5 option is checked in tcp_v[46]_rcv().
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20230803224552.69398-3-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Kuniyuki Iwashima [Thu, 3 Aug 2023 22:45:51 +0000 (15:45 -0700)]
tcp: Disable header prediction for MD5 flow.
TCP socket saves the minimum required header length in tcp_header_len
of struct tcp_sock, and later the value is used in __tcp_fast_path_on()
to generate a part of TCP header in tcp_sock(sk)->pred_flags.
In tcp_rcv_established(), if the incoming packet has the same pattern
with pred_flags, we enter the fast path and skip full option parsing.
The MD5 option is parsed in tcp_v[46]_rcv(), so we need not parse it
again later in tcp_rcv_established() unless other options exist. We
add TCPOLEN_MD5SIG_ALIGNED to tcp_header_len in two paths to avoid the
slow path.
For passive open connections with MD5, we add TCPOLEN_MD5SIG_ALIGNED
to tcp_header_len in tcp_create_openreq_child() after 3WHS.
On the other hand, we do it in tcp_connect_init() for active open
connections. However, the value is overwritten while processing
SYN+ACK or crossed SYN in tcp_rcv_synsent_state_process().
These two cases will have the wrong value in pred_flags and never go
into the fast path.
We could update tcp_header_len in tcp_rcv_synsent_state_process(), but
a test with slightly modified netperf which uses MD5 for each flow shows
that the slow path is actually a bit faster than the fast path.
On c5.4xlarge EC2 instance (16 vCPU, 32 GiB mem)
$ for i in {1..10}; do
./super_netperf $(nproc) -H localhost -l 10 -- -m 256 -M 256;
done
Avg of 10
*
36e68eadd303 : 10.376 Gbps
* all fast path : 10.374 Gbps (patch v2, See Link)
* all slow path : 10.394 Gbps
The header prediction is not worth adding complexity for MD5, so let's
disable it for MD5.
Link: https://lore.kernel.org/netdev/20230803042214.38309-1-kuniyu@amazon.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20230803224552.69398-2-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Russell King (Oracle) [Thu, 3 Aug 2023 15:56:24 +0000 (16:56 +0100)]
net: phy: move marking PHY on SFP module into SFP code
Move marking the PHY as being on a SFP module into the SFP code between
getting the PHY device (and thus initialising the phy_device structure)
and registering the discovered device.
This means that PHY drivers can use phy_on_sfp() in their match and
get_features methods.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1qRaga-001vKt-8X@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Yue Haibing [Thu, 3 Aug 2023 14:20:47 +0000 (22:20 +0800)]
mlxsw: spectrum: Remove unused function declarations
Commit
c3d2ed93b14d ("mlxsw: Remove old parsing depth infrastructure")
left behind mlxsw_sp_nve_inc_parsing_depth_get()/mlxsw_sp_nve_inc_parsing_depth_put().
And commit
532b49e41e64 ("mlxsw: spectrum_span: Derive SBIB from maximum port speed & MTU")
remove mlxsw_sp_span_port_mtu_update()/mlxsw_sp_span_speed_update_work() but leave the
declarations.
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/20230803142047.42660-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Yue Haibing [Thu, 3 Aug 2023 14:19:04 +0000 (22:19 +0800)]
ixgbevf: Remove unused function declarations
ixgbe_napi_add_all()/ixgbe_napi_del_all() are declared but never implemented in
commit
92915f71201b ("ixgbevf: Driver main and ethool interface module and main header")
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20230803141904.15316-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Yue Haibing [Thu, 3 Aug 2023 13:45:07 +0000 (21:45 +0800)]
af_vsock: Remove unused declaration vsock_release_pending()/vsock_init_tap()
Commit
d021c344051a ("VSOCK: Introduce VM Sockets") declared but never implemented
vsock_release_pending(). Also vsock_init_tap() never implemented since introduction
in commit
531b374834c8 ("VSOCK: Add vsockmon tap functions").
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20230803134507.22660-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Yue Haibing [Thu, 3 Aug 2023 13:54:24 +0000 (21:54 +0800)]
net: 802: Remove unused function declarations
Commit
d8d9ba8dc9c7 ("net: 802: remove dead leftover after ipx driver removal")
remove these implementations but leave the declarations.
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20230803135424.41664-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Thu, 3 Aug 2023 13:54:16 +0000 (13:54 +0000)]
tcp_metrics: hash table allocation cleanup
After commit
098a697b497e ("tcp_metrics: Use a single hash table
for all network namespaces.") we can avoid calling tcp_net_metrics_init()
for each new netns.
Instead, rename tcp_net_metrics_init() to tcp_metrics_hash_alloc(),
and move it to __init section.
Also move tcpmhash_entries to __initdata section.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20230803135417.2716879-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Yue Haibing [Thu, 3 Aug 2023 13:51:38 +0000 (21:51 +0800)]
net: hns3: Remove unused function declarations
Commit
1e6e76101fd9 ("net: hns3: configure promisc mode for VF asynchronously")
left behind hclge_inform_vf_promisc_info() declaration.
And commit
68c0a5c70614 ("net: hns3: Add HNS3 IMP(Integrated Mgmt Proc) Cmd Interface Support")
declared but never implemented hclge_cmd_mdio_write() and hclge_cmd_mdio_read().
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20230803135138.37456-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Yue Haibing [Thu, 3 Aug 2023 13:47:47 +0000 (21:47 +0800)]
net: llc: Remove unused function declarations
llc_conn_ac_send_i_rsp_as_ack() and llc_conn_ev_sendack_tmr_exp()
are never implemented since beginning of git history.
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20230803134747.41512-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Fri, 4 Aug 2023 21:03:03 +0000 (14:03 -0700)]
Merge branch 'devlink-use-spec-to-generate-split-ops'
Jiri Pirko says:
====================
devlink: use spec to generate split ops
This is an outcome of the discussion in the following thread:
https://lore.kernel.org/netdev/
20230720121829.566974-1-jiri@resnulli.us/
It serves as a dependency on the linked selector patchset.
There is an existing spec for devlink used for userspace part
generation. There are two commands supported there.
This patchset extends the spec so kernel split ops code could
be generated from it.
====================
Link: https://lore.kernel.org/r/20230803111340.1074067-1-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko [Thu, 3 Aug 2023 11:13:40 +0000 (13:13 +0200)]
devlink: use generated split ops and remove duplicated commands from small ops
Do the switch and use generated split ops for get and info_get commands.
Remove those from small ops array.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230803111340.1074067-13-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko [Thu, 3 Aug 2023 11:13:39 +0000 (13:13 +0200)]
devlink: include the generated netlink header
Put the newly added generated header to the include list. Remove the
duplicated temporary function prototypes.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230803111340.1074067-12-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko [Thu, 3 Aug 2023 11:13:38 +0000 (13:13 +0200)]
devlink: add split ops generated according to spec
Improve the existing devlink spec in order to serve as a source for
generation of valid devlink split ops for the existing commands.
Add the generated sources.
Node that the policies are narrowed down only to the attributes that
are actually parsed. The dont-validate-strict parsing policy makes sure
that other possibly passed garbage attributes from userspace are
ignored during validation.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230803111340.1074067-11-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko [Thu, 3 Aug 2023 11:13:37 +0000 (13:13 +0200)]
netlink: specs: devlink: add info-get dump op
Add missing dump op for info-get command and re-generate related
devlink-user.[ch] code.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230803111340.1074067-10-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko [Thu, 3 Aug 2023 11:13:36 +0000 (13:13 +0200)]
devlink: un-static devlink_nl_pre/post_doit()
To be prepared for the follow-up generated split ops addition,
make the functions devlink_nl_pre_doit() and devlink_nl_post_doit()
usable outside of netlink.c. Introduce temporary prototypes which are
going to be removed once the generated header will be included.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230803111340.1074067-9-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko [Thu, 3 Aug 2023 11:13:35 +0000 (13:13 +0200)]
devlink: introduce couple of dumpit callbacks for split ops
Introduce couple of dumpit callbacks for generated split ops. Have them
as a thin wrapper around iteration function and allow to pass dump_one()
function pointer directly without need to store in devlink_cmd structs.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230803111340.1074067-8-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko [Thu, 3 Aug 2023 11:13:34 +0000 (13:13 +0200)]
devlink: rename couple of doit netlink callbacks to match generated names
The generated names of the doit netlink callback are missing "cmd" in
their names. Change names to be ready to switch to generated split ops
header.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230803111340.1074067-7-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko [Thu, 3 Aug 2023 11:13:33 +0000 (13:13 +0200)]
devlink: rename devlink_nl_ops to devlink_nl_small_ops
In order to avoid name collision with the generated split ops array
which is going to be introduced as a follow-up patch, rename
the existing ops array to devlink_nl_small_ops.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230803111340.1074067-6-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko [Thu, 3 Aug 2023 11:13:32 +0000 (13:13 +0200)]
ynl-gen-c.py: render netlink policies static for split ops
When policies are rendered for split ops, they are consumed in the same
file. No need to expose them for user outside, make them static.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230803111340.1074067-5-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko [Thu, 3 Aug 2023 11:13:31 +0000 (13:13 +0200)]
ynl-gen-c.py: allow directional model for kernel mode
Directional model limitation is only applicable for uapi mode.
For kernel mode, the code is generated correctly using right cmd values
for do/dump requests. Lift the limitation.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230803111340.1074067-4-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko [Thu, 3 Aug 2023 11:13:30 +0000 (13:13 +0200)]
ynl-gen-c.py: filter rendering of validate field values for split ops
For split ops, do and dump has different meaningful values in
validate field.
Fix the rendering to allow the values per op type as follows:
do: strict
dump: dump, strict-dump
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230803111340.1074067-3-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko [Thu, 3 Aug 2023 11:13:29 +0000 (13:13 +0200)]
netlink: specs: add dump-strict flag for dont-validate property
Allow user to specify GENL_DONT_VALIDATE_DUMP_STRICT flag for validation
and add this flag to netlink spec schema.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230803111340.1074067-2-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Zhu Wang [Thu, 3 Aug 2023 08:29:00 +0000 (16:29 +0800)]
net: lan966x: Do not check 0 for platform_get_irq_byname()
Since platform_get_irq_byname() never returned zero, so it need not to
check whether it returned zero, it returned -EINVAL or -ENXIO when
failed, so we replace the return error code with the result it returned.
Signed-off-by: Zhu Wang <wangzhu9@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 3 Aug 2023 07:14:26 +0000 (07:14 +0000)]
net: vlan: update wrong comments
vlan_insert_tag() and friends do not allocate a new skb.
However they might allocate a new skb->head.
Update their comments to better describe their behavior.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Thu, 3 Aug 2023 07:53:34 +0000 (07:53 +0000)]
tcp/dccp: cache line align inet_hashinfo
I have seen tcp_hashinfo starting at a non optimal location,
forcing input handlers to pull two cache lines instead of one,
and sharing a cache line that was dirtied more than necessary:
ffffffff83680600 b tcp_orphan_timer
ffffffff83680628 b tcp_orphan_cache
ffffffff8368062c b tcp_enable_tx_delay.__tcp_tx_delay_enabled
ffffffff83680630 B tcp_hashinfo
ffffffff83680680 b tcp_cong_list_lock
After this patch, ehash, ehash_locks, ehash_mask and ehash_locks_mask
are located in a read-only cache line.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Souradeep Chakrabarti [Wed, 2 Aug 2023 11:07:40 +0000 (04:07 -0700)]
net: mana: Configure hwc timeout from hardware
At present hwc timeout value is a fixed value. This patch sets the hwc
timeout from the hardware. It now uses a new hardware capability
GDMA_DRV_CAP_FLAG_1_HWC_TIMEOUT_RECONFIG to query and set the value
in hwc_timeout.
Signed-off-by: Souradeep Chakrabarti <schakrabarti@linux.microsoft.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Li Zetao [Wed, 2 Aug 2023 09:31:56 +0000 (17:31 +0800)]
net: microchip: vcap api: Use ERR_CAST() in vcap_decode_rule()
There is a warning reported by coccinelle:
./drivers/net/ethernet/microchip/vcap/vcap_api.c:2399:9-16: WARNING:
ERR_CAST can be used with ri
Use ERR_CAST instead of ERR_PTR + PTR_ERR to simplify the
conversion process.
Signed-off-by: Li Zetao <lizetao1@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 4 Aug 2023 07:46:07 +0000 (08:46 +0100)]
Merge tag 'linux-can-next-for-6.6-
20230803' of git://git./linux/kernel/git/mkl/linux-can-next
Marc Kleine-Budde says:
====================
pull-request: can-next 2023-08-03
This is a pull request of 9 patches for net-next/master.
The 1st patch is by Ruan Jinjie, targets the flexcan driver, and
cleans up the error handling of platform_get_irq() in the
flexcan_probe() function.
Markus Schneider-Pargmann contributes 6 patches for the tcan4x5x M_CAN
driver, consisting of some cleanups, and adding support for the
tcan4552/4553 chips.
Another patch by Ruan Jinjie, that cleans up the error path of
platform_get_irq() in the c_can_plat_probe() function of the C_CAN
platform driver.
The last patch is by Frank Jungclaus and adds support for the
CAN-USB/3 and CAN FD to the ESD USB CAN driver.
================
Signed-off-by: David S. Miller <davem@davemloft.net>
Yue Haibing [Wed, 2 Aug 2023 13:07:16 +0000 (21:07 +0800)]
net: Space.h: Remove unused function declarations
Commit
5aa83a4c0a15 (" [PATCH] remove two obsolete net drivers") remove fmv18x_probe().
And commmit
01f4685797a5 ("eth: amd: remove NI6510 support (ni65)") leave ni65_probe().
Commit
a10079c66290 ("staging: remove hp100 driver") remove hp100 driver and hp100_probe()
declaration is not used anymore.
sonic_probe() and iph5526_probe() are never implemented since the beginning of git history.
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Link: https://lore.kernel.org/r/20230802130716.37308-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 3 Aug 2023 23:00:07 +0000 (16:00 -0700)]
eth: dpaa: add missing net/xdp.h include
Add missing include for DPAA (fix aarch64 build).
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202308040620.ty8oYNOP-lkp@intel.com/
Fixes: 680ee0456a57 ("net: invert the netdevice.h vs xdp.h dependency")
Reviewed-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20230803230008.362214-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 3 Aug 2023 22:34:36 +0000 (15:34 -0700)]
Merge tag 'for-netdev' of https://git./linux/kernel/git/bpf/bpf-next
Martin KaFai Lau says:
====================
pull-request: bpf-next 2023-08-03
We've added 54 non-merge commits during the last 10 day(s) which contain
a total of 84 files changed, 4026 insertions(+), 562 deletions(-).
The main changes are:
1) Add SO_REUSEPORT support for TC bpf_sk_assign from Lorenz Bauer,
Daniel Borkmann
2) Support new insns from cpu v4 from Yonghong Song
3) Non-atomically allocate freelist during prefill from YiFei Zhu
4) Support defragmenting IPv(4|6) packets in BPF from Daniel Xu
5) Add tracepoint to xdp attaching failure from Leon Hwang
6) struct netdev_rx_queue and xdp.h reshuffling to reduce
rebuild time from Jakub Kicinski
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (54 commits)
net: invert the netdevice.h vs xdp.h dependency
net: move struct netdev_rx_queue out of netdevice.h
eth: add missing xdp.h includes in drivers
selftests/bpf: Add testcase for xdp attaching failure tracepoint
bpf, xdp: Add tracepoint to xdp attaching failure
selftests/bpf: fix static assert compilation issue for test_cls_*.c
bpf: fix bpf_probe_read_kernel prototype mismatch
riscv, bpf: Adapt bpf trampoline to optimized riscv ftrace framework
libbpf: fix typos in Makefile
tracing: bpf: use struct trace_entry in struct syscall_tp_t
bpf, devmap: Remove unused dtab field from bpf_dtab_netdev
bpf, cpumap: Remove unused cmap field from bpf_cpu_map_entry
netfilter: bpf: Only define get_proto_defrag_hook() if necessary
bpf: Fix an array-index-out-of-bounds issue in disasm.c
net: remove duplicate INDIRECT_CALLABLE_DECLARE of udp[6]_ehashfn
docs/bpf: Fix malformed documentation
bpf: selftests: Add defrag selftests
bpf: selftests: Support custom type and proto for client sockets
bpf: selftests: Support not connecting client socket
netfilter: bpf: Support BPF_F_NETFILTER_IP_DEFRAG in netfilter link
...
====================
Link: https://lore.kernel.org/r/20230803174845.825419-1-martin.lau@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 3 Aug 2023 21:29:50 +0000 (14:29 -0700)]
Merge git://git./linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.
Conflicts:
net/dsa/port.c
9945c1fb03a3 ("net: dsa: fix older DSA drivers using phylink")
a88dd7538461 ("net: dsa: remove legacy_pre_march2020 detection")
https://lore.kernel.org/all/
20230731102254.
2c9868ca@canb.auug.org.au/
net/xdp/xsk.c
3c5b4d69c358 ("net: annotate data-races around sk->sk_mark")
b7f72a30e9ac ("xsk: introduce wrappers and helpers for supporting multi-buffer in Tx path")
https://lore.kernel.org/all/
20230731102631.
39988412@canb.auug.org.au/
drivers/net/ethernet/broadcom/bnxt/bnxt.c
37b61cda9c16 ("bnxt: don't handle XDP in netpoll")
2b56b3d99241 ("eth: bnxt: handle invalid Tx completions more gracefully")
https://lore.kernel.org/all/
20230801101708.
1dc7faac@canb.auug.org.au/
Adjacent changes:
drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c
62da08331f1a ("net/mlx5e: Set proper IPsec source port in L4 selector")
fbd517549c32 ("net/mlx5e: Add function to get IPsec offload namespace")
drivers/net/ethernet/sfc/selftest.c
55c1528f9b97 ("sfc: fix field-spanning memcpy in selftest")
ae9d445cd41f ("sfc: Miscellaneous comment removals")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Linus Torvalds [Thu, 3 Aug 2023 21:00:02 +0000 (14:00 -0700)]
Merge tag 'net-6.5-rc5' of git://git./linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from bpf and wireless.
Nothing scary here. Feels like the first wave of regressions from v6.5
is addressed - one outstanding fix still to come in TLS for the
sendpage rework.
Current release - regressions:
- udp: fix __ip_append_data()'s handling of MSG_SPLICE_PAGES
- dsa: fix older DSA drivers using phylink
Previous releases - regressions:
- gro: fix misuse of CB in udp socket lookup
- mlx5: unregister devlink params in case interface is down
- Revert "wifi: ath11k: Enable threaded NAPI"
Previous releases - always broken:
- sched: cls_u32: fix match key mis-addressing
- sched: bind logic fixes for cls_fw, cls_u32 and cls_route
- add bound checks to a number of places which hand-parse netlink
- bpf: disable preemption in perf_event_output helpers code
- qed: fix scheduling in a tasklet while getting stats
- avoid using APIs which are not hardirq-safe in couple of drivers,
when we may be in a hard IRQ (netconsole)
- wifi: cfg80211: fix return value in scan logic, avoid page
allocator warning
- wifi: mt76: mt7615: do not advertise 5 GHz on first PHY of MT7615D
(DBDC)
Misc:
- drop handful of inactive maintainers, put some new in place"
* tag 'net-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (98 commits)
MAINTAINERS: update TUN/TAP maintainers
test/vsock: remove vsock_perf executable on `make clean`
tcp_metrics: fix data-race in tcpm_suck_dst() vs fastopen
tcp_metrics: annotate data-races around tm->tcpm_net
tcp_metrics: annotate data-races around tm->tcpm_vals[]
tcp_metrics: annotate data-races around tm->tcpm_lock
tcp_metrics: annotate data-races around tm->tcpm_stamp
tcp_metrics: fix addr_same() helper
prestera: fix fallback to previous version on same major version
udp: Fix __ip_append_data()'s handling of MSG_SPLICE_PAGES
net/mlx5e: Set proper IPsec source port in L4 selector
net/mlx5: fs_core: Skip the FTs in the same FS_TYPE_PRIO_CHAINS fs_prio
net/mlx5: fs_core: Make find_closest_ft more generic
wifi: brcmfmac: Fix field-spanning write in brcmf_scan_params_v2_to_v1()
vxlan: Fix nexthop hash size
ip6mr: Fix skb_under_panic in ip6mr_cache_report()
s390/qeth: Don't call dev_close/dev_open (DOWN/UP)
net: tap_open(): set sk_uid from current_fsuid()
net: tun_chr_open(): set sk_uid from current_fsuid()
net: dcb: choose correct policy to parse DCB_ATTR_BCN
...
Jakub Kicinski [Wed, 2 Aug 2023 18:28:43 +0000 (11:28 -0700)]
MAINTAINERS: update TUN/TAP maintainers
Willem and Jason have agreed to take over the maintainer
duties for TUN/TAP, thank you!
There's an existing entry for TUN/TAP which only covers
the user mode Linux implementation.
Since we haven't heard from Maxim on the list for almost
a decade, extend that entry and take it over, rather than
adding a new one.
Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20230802182843.4193099-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 3 Aug 2023 18:22:53 +0000 (11:22 -0700)]
Merge tag 'for-netdev' of https://git./linux/kernel/git/bpf/bpf
Martin KaFai Lau says:
====================
pull-request: bpf 2023-08-03
We've added 5 non-merge commits during the last 7 day(s) which contain
a total of 3 files changed, 37 insertions(+), 20 deletions(-).
The main changes are:
1) Disable preemption in perf_event_output helpers code,
from Jiri Olsa
2) Add length check for SK_DIAG_BPF_STORAGE_REQ_MAP_FD parsing,
from Lin Ma
3) Multiple warning splat fixes in cpumap from Hou Tao
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
bpf, cpumap: Handle skb as well when clean up ptr_ring
bpf, cpumap: Make sure kthread is running before map update returns
bpf: Add length check for SK_DIAG_BPF_STORAGE_REQ_MAP_FD parsing
bpf: Disable preemption in bpf_event_output
bpf: Disable preemption in bpf_perf_event_output
====================
Link: https://lore.kernel.org/r/20230803181429.994607-1-martin.lau@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Kalle Valo [Thu, 3 Aug 2023 18:16:27 +0000 (21:16 +0300)]
Merge ath-next from git://git./linux/kernel/git/kvalo/ath.git
ath.git patches for v6.6. Major changes:
ath12k
* Extremely High Throughput (EHT) PHY support for Wi-Fi 7
Jakub Kicinski [Thu, 3 Aug 2023 18:05:46 +0000 (11:05 -0700)]
Merge tag 'wireless-2023-08-03' of git://git./linux/kernel/git/wireless/wireless
Kalle Valo says:
====================
wireless fixes for v6.5
We did some house cleaning in MAINTAINERS file so several patches
about that. Few regressions fixed and also fix some recently enabled
memcpy() warnings. Only small commits and nothing special standing
out.
* tag 'wireless-2023-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
wifi: brcmfmac: Fix field-spanning write in brcmf_scan_params_v2_to_v1()
wifi: ray_cs: Replace 1-element array with flexible array
MAINTAINERS: add Jeff as ath10k, ath11k and ath12k maintainer
MAINTAINERS: wifi: mark mlw8k as orphan
MAINTAINERS: wifi: mark b43 as orphan
MAINTAINERS: wifi: mark zd1211rw as orphan
MAINTAINERS: wifi: mark wl3501 as orphan
MAINTAINERS: wifi: mark rndis_wlan as orphan
MAINTAINERS: wifi: mark ar5523 as orphan
MAINTAINERS: wifi: mark cw1200 as orphan
MAINTAINERS: wifi: atmel: mark as orphan
MAINTAINERS: wifi: rtw88: change Ping as the maintainer
Revert "wifi: ath6k: silence false positive -Wno-dangling-pointer warning on GCC 12"
wifi: cfg80211: Fix return value in scan logic
Revert "wifi: ath11k: Enable threaded NAPI"
MAINTAINERS: Update mwifiex maintainer list
wifi: mt76: mt7615: do not advertise 5 GHz on first phy of MT7615D (DBDC)
====================
Link: https://lore.kernel.org/r/20230803140058.57476C433C9@smtp.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stefano Garzarella [Thu, 3 Aug 2023 08:54:54 +0000 (10:54 +0200)]
test/vsock: remove vsock_perf executable on `make clean`
We forgot to add vsock_perf to the rm command in the `clean`
target, so now we have a left over after `make clean` in
tools/testing/vsock.
Fixes: 8abbffd27ced ("test/vsock: vsock_perf utility")
Cc: AVKrasnov@sberdevices.ru
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Simon Horman <horms@kernel.org> # build-tested
Link: https://lore.kernel.org/r/20230803085454.30897-1-sgarzare@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 3 Aug 2023 17:58:27 +0000 (10:58 -0700)]
Merge branch 'tcp_metrics-series-of-fixes'
Eric Dumazet says:
====================
tcp_metrics: series of fixes
This series contains a fix for addr_same() and various
data-race annotations.
We still have to address races over tm->tcpm_saddr and
tm->tcpm_daddr later.
====================
Link: https://lore.kernel.org/r/20230802131500.1478140-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Wed, 2 Aug 2023 13:15:00 +0000 (13:15 +0000)]
tcp_metrics: fix data-race in tcpm_suck_dst() vs fastopen
Whenever tcpm_new() reclaims an old entry, tcpm_suck_dst()
would overwrite data that could be read from tcp_fastopen_cache_get()
or tcp_metrics_fill_info().
We need to acquire fastopen_seqlock to maintain consistency.
For newly allocated objects, tcpm_new() can switch to kzalloc()
to avoid an extra fastopen_seqlock acquisition.
Fixes: 1fe4c481ba63 ("net-tcp: Fast Open client - cookie cache")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20230802131500.1478140-7-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Wed, 2 Aug 2023 13:14:59 +0000 (13:14 +0000)]
tcp_metrics: annotate data-races around tm->tcpm_net
tm->tcpm_net can be read or written locklessly.
Instead of changing write_pnet() and read_pnet() and potentially
hurt performance, add the needed READ_ONCE()/WRITE_ONCE()
in tm_net() and tcpm_new().
Fixes: 849e8a0ca8d5 ("tcp_metrics: Add a field tcpm_net and verify it matches on lookup")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20230802131500.1478140-6-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Wed, 2 Aug 2023 13:14:58 +0000 (13:14 +0000)]
tcp_metrics: annotate data-races around tm->tcpm_vals[]
tm->tcpm_vals[] values can be read or written locklessly.
Add needed READ_ONCE()/WRITE_ONCE() to document this,
and force use of tcp_metric_get() and tcp_metric_set()
Fixes: 51c5d0c4b169 ("tcp: Maintain dynamic metrics in local cache.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Wed, 2 Aug 2023 13:14:57 +0000 (13:14 +0000)]
tcp_metrics: annotate data-races around tm->tcpm_lock
tm->tcpm_lock can be read or written locklessly.
Add needed READ_ONCE()/WRITE_ONCE() to document this.
Fixes: 51c5d0c4b169 ("tcp: Maintain dynamic metrics in local cache.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20230802131500.1478140-4-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Wed, 2 Aug 2023 13:14:56 +0000 (13:14 +0000)]
tcp_metrics: annotate data-races around tm->tcpm_stamp
tm->tcpm_stamp can be read or written locklessly.
Add needed READ_ONCE()/WRITE_ONCE() to document this.
Also constify tcpm_check_stamp() dst argument.
Fixes: 51c5d0c4b169 ("tcp: Maintain dynamic metrics in local cache.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20230802131500.1478140-3-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Wed, 2 Aug 2023 13:14:55 +0000 (13:14 +0000)]
tcp_metrics: fix addr_same() helper
Because v4 and v6 families use separate inetpeer trees (respectively
net->ipv4.peers and net->ipv6.peers), inetpeer_addr_cmp(a, b) assumes
a & b share the same family.
tcp_metrics use a common hash table, where entries can have different
families.
We must therefore make sure to not call inetpeer_addr_cmp()
if the families do not match.
Fixes: d39d14ffa24c ("net: Add helper function to compare inetpeer addresses")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20230802131500.1478140-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jonas Gorski [Wed, 2 Aug 2023 09:23:56 +0000 (11:23 +0200)]
prestera: fix fallback to previous version on same major version
When both supported and previous version have the same major version,
and the firmwares are missing, the driver ends in a loop requesting the
same (previous) version over and over again:
[ 76.327413] Prestera DX 0000:01:00.0: missing latest mrvl/prestera/mvsw_prestera_fw-v4.1.img firmware, fall-back to previous 4.0 version
[ 76.339802] Prestera DX 0000:01:00.0: missing latest mrvl/prestera/mvsw_prestera_fw-v4.0.img firmware, fall-back to previous 4.0 version
[ 76.352162] Prestera DX 0000:01:00.0: missing latest mrvl/prestera/mvsw_prestera_fw-v4.0.img firmware, fall-back to previous 4.0 version
[ 76.364502] Prestera DX 0000:01:00.0: missing latest mrvl/prestera/mvsw_prestera_fw-v4.0.img firmware, fall-back to previous 4.0 version
[ 76.376848] Prestera DX 0000:01:00.0: missing latest mrvl/prestera/mvsw_prestera_fw-v4.0.img firmware, fall-back to previous 4.0 version
[ 76.389183] Prestera DX 0000:01:00.0: missing latest mrvl/prestera/mvsw_prestera_fw-v4.0.img firmware, fall-back to previous 4.0 version
[ 76.401522] Prestera DX 0000:01:00.0: missing latest mrvl/prestera/mvsw_prestera_fw-v4.0.img firmware, fall-back to previous 4.0 version
[ 76.413860] Prestera DX 0000:01:00.0: missing latest mrvl/prestera/mvsw_prestera_fw-v4.0.img firmware, fall-back to previous 4.0 version
[ 76.426199] Prestera DX 0000:01:00.0: missing latest mrvl/prestera/mvsw_prestera_fw-v4.0.img firmware, fall-back to previous 4.0 version
...
Fix this by inverting the check to that we aren't yet at the previous
version, and also check the minor version.
This also catches the case where both versions are the same, as it was
after commit
bb5dbf2cc64d ("net: marvell: prestera: add firmware v4.0
support").
With this fix applied:
[ 88.499622] Prestera DX 0000:01:00.0: missing latest mrvl/prestera/mvsw_prestera_fw-v4.1.img firmware, fall-back to previous 4.0 version
[ 88.511995] Prestera DX 0000:01:00.0: failed to request previous firmware: mrvl/prestera/mvsw_prestera_fw-v4.0.img
[ 88.522403] Prestera DX: probe of 0000:01:00.0 failed with error -2
Fixes: 47f26018a414 ("net: marvell: prestera: try to load previous fw version")
Signed-off-by: Jonas Gorski <jonas.gorski@bisdn.de>
Acked-by: Elad Nachman <enachman@marvell.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Acked-by: Taras Chornyi <taras.chornyi@plvision.eu>
Link: https://lore.kernel.org/r/20230802092357.163944-1-jonas.gorski@bisdn.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 3 Aug 2023 16:54:25 +0000 (09:54 -0700)]
Merge branch 'docs-net-page_pool-sync-dev-and-kdoc'
Jakub Kicinski says:
====================
docs: net: page_pool: sync dev and kdoc
Document PP_FLAG_DMA_SYNC_DEV based on recent conversation.
Use kdoc to document structs and functions, to avoid duplication.
Olek, this will conflict with your work, but I think that trying
to make progress in parallel is the best course of action...
Retargetting at net-next to make it a little less bad.
====================
Link: https://lore.kernel.org/r/20230802161821.3621985-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Wed, 2 Aug 2023 16:18:21 +0000 (09:18 -0700)]
docs: net: page_pool: use kdoc to avoid duplicating the information
All struct members of the driver-facing APIs are documented twice,
in the code and under Documentation. This is a bit tedious.
I also get the feeling that a lot of developers will read the header
when coding, rather than the doc. Bring the two a little closer
together by using kdoc for structs and functions.
Using kdoc also gives us links (mentioning a function or struct
in the text gets replaced by a link to its doc).
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Link: https://lore.kernel.org/r/20230802161821.3621985-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Wed, 2 Aug 2023 16:18:20 +0000 (09:18 -0700)]
docs: net: page_pool: document PP_FLAG_DMA_SYNC_DEV parameters
Using PP_FLAG_DMA_SYNC_DEV is a bit confusing. It was perhaps
more obvious when it was introduced but the page pool use
has grown beyond XDP and beyond packet-per-page so now
making the heads and tails out of this feature is not
trivial.
Obviously making the API more user friendly would be
a better fix, but until someone steps up to do that
let's at least document what the parameters are.
Relevant discussion in the first Link.
Link: https://lore.kernel.org/all/20230731114427.0da1f73b@kernel.org/
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Link: https://lore.kernel.org/r/20230802161821.3621985-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Linus Torvalds [Thu, 3 Aug 2023 16:26:34 +0000 (09:26 -0700)]
Merge tag 'nfsd-6.5-3' of git://git./linux/kernel/git/cel/linux
Pull nfsd fix from Chuck Lever:
- Fix tmpfs splice read support
* tag 'nfsd-6.5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
nfsd: Fix reading via splice
Linus Torvalds [Thu, 3 Aug 2023 16:20:50 +0000 (09:20 -0700)]
Merge tag 'erofs-for-6.5-rc5-fixes' of git://git./linux/kernel/git/xiang/erofs
Pull erofs fixes from Gao Xiang:
- Fix data corruption caused by insufficient decompression on
deduplicated compressed extents
- Drop a useless s_magic checking in erofs_kill_sb()
* tag 'erofs-for-6.5-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
erofs: drop unnecessary WARN_ON() in erofs_kill_sb()
erofs: fix wrong primary bvec selection on deduplicated extents
Linus Torvalds [Thu, 3 Aug 2023 16:06:38 +0000 (09:06 -0700)]
Merge tag 's390-6.5-4' of git://git./linux/kernel/git/s390/linux
Pull s390 fixes from Heiko Carstens:
- Split kernel large page mappings into 4k mappings in case debug
pagealloc is enabled again. This got accidentally removed by commit
bb1520d581a3 ("s390/mm: start kernel with DAT enabled")
- Fix error handling in KVM's sthyi handling
- Add missing include to s390's uapi ptrace.h
- Update defconfigs
* tag 's390-6.5-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/ptrace: add missing linux/const.h include
KVM: s390: fix sthyi error handling
s390: update defconfigs
s390/vmem: split pages when debug pagealloc is enabled
Martin KaFai Lau [Thu, 3 Aug 2023 15:38:07 +0000 (08:38 -0700)]
Merge branch 'net: struct netdev_rx_queue and xdp.h reshuffling'
Jakub Kicinski says:
====================
While poking at struct netdev_rx_queue I got annoyed by
the huge rebuild times. I split it out from netdevice.h
and then realized that it was the main reason we included
xdp.h in there. So I removed that dependency as well.
This gives us very pleasant build times for both xdp.h
and struct netdev_rx_queue changes.
I'm sending this for bpf-next because I think it'd be easiest
if it goes in there, and then bpf-next gets flushed soon after?
I can also make a branch on merge-base for net-next and bpf-next..
v2:
- build fix
- reorder some includes
v1: https://lore.kernel.org/all/
20230802003246.
2153774-1-kuba@kernel.org/
====================
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Jakub Kicinski [Thu, 3 Aug 2023 01:02:30 +0000 (18:02 -0700)]
net: invert the netdevice.h vs xdp.h dependency
xdp.h is far more specific and is included in only 67 other
files vs netdevice.h's 1538 include sites.
Make xdp.h include netdevice.h, instead of the other way around.
This decreases the incremental allmodconfig builds size when
xdp.h is touched from 5947 to 662 objects.
Move bpf_prog_run_xdp() to xdp.h, seems appropriate and filter.h
is a mega-header in its own right so it's nice to avoid xdp.h
getting included there as well.
The only unfortunate part is that the typedef for xdp_features_t
has to move to netdevice.h, since its embedded in struct netdevice.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Link: https://lore.kernel.org/r/20230803010230.1755386-4-kuba@kernel.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Jakub Kicinski [Thu, 3 Aug 2023 01:02:29 +0000 (18:02 -0700)]
net: move struct netdev_rx_queue out of netdevice.h
struct netdev_rx_queue is touched in only a few places
and having it defined in netdevice.h brings in the dependency
on xdp.h, because struct xdp_rxq_info gets embedded in
struct netdev_rx_queue.
In prep for removal of xdp.h from netdevice.h move all
the netdev_rx_queue stuff to a new header.
We could technically break the new header up to avoid
the sysfs.h include but it's so rarely included it
doesn't seem to be worth it at this point.
Reviewed-by: Amritha Nambiar <amritha.nambiar@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Link: https://lore.kernel.org/r/20230803010230.1755386-3-kuba@kernel.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>