Toke Høiland-Jørgensen [Sat, 9 Apr 2022 21:30:53 +0000 (23:30 +0200)]
bpf: Fix release of page_pool in BPF_PROG_RUN in test runner
The live packet mode in BPF_PROG_RUN allocates a page_pool instance for
each test run instance and uses it for the packet data. On setup it creates
the page_pool, and calls xdp_reg_mem_model() to allow pages to be returned
properly from the XDP data path. However, xdp_reg_mem_model() also raises
the reference count of the page_pool itself, so the single
page_pool_destroy() count on teardown was not enough to actually release
the pool. To fix this, add an additional xdp_unreg_mem_model() call on
teardown.
Fixes: b530e9e1063e ("bpf: Add "live packet" mode for XDP in BPF_PROG_RUN")
Reported-by: Freysteinn Alfredsson <freysteinn.alfredsson@kau.se>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20220409213053.3117305-1-toke@redhat.com
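The reference-count imbalance described in the commit above can be illustrated with a minimal userspace sketch. It is not the kernel code: `toy_pool` and every `toy_*` function are hypothetical stand-ins, and the only behavior assumed is what the commit message states, namely that registering the memory model takes an extra reference on the pool.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a reference-counted pool. */
struct toy_pool {
	int refcnt;	/* outstanding references */
	bool released;	/* set once the last reference is dropped */
};

/* Drop one reference and release the pool when none remain. */
void toy_pool_put(struct toy_pool *p)
{
	if (--p->refcnt == 0)
		p->released = true;
}

/* Creation hands the caller one reference. */
void toy_pool_create(struct toy_pool *p)
{
	p->refcnt = 1;
	p->released = false;
}

/* Registering the memory model takes an extra reference on the pool. */
void toy_reg_mem_model(struct toy_pool *p) { p->refcnt++; }
void toy_unreg_mem_model(struct toy_pool *p) { toy_pool_put(p); }
void toy_pool_destroy(struct toy_pool *p) { toy_pool_put(p); }

/* Teardown before the fix: only the creation reference is dropped. */
bool toy_teardown_old(void)
{
	struct toy_pool p;

	toy_pool_create(&p);
	toy_reg_mem_model(&p);
	toy_pool_destroy(&p);
	return p.released;
}

/* Teardown after the fix: the registration reference is dropped as well. */
bool toy_teardown_fixed(void)
{
	struct toy_pool p;

	toy_pool_create(&p);
	toy_reg_mem_model(&p);
	toy_unreg_mem_model(&p);
	toy_pool_destroy(&p);
	return p.released;
}
```

In this model the old teardown leaves the pool with one reference still held, while the fixed teardown releases it.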
Maciej Fijalkowski [Wed, 6 Apr 2022 15:58:04 +0000 (17:58 +0200)]
xsk: Fix l2fwd for copy mode + busy poll combo
While checking AF_XDP copy mode combined with busy poll, strange
results were observed. rxdrop and txonly scenarios worked fine, but
l2fwd broke immediately.
After a deeper look, it turned out that for l2fwd, Tx side was exiting
early due to xsk_no_wakeup() returning true and in the end
xsk_generic_xmit() was never called. Note that AF_XDP Tx in copy mode
is syscall steered, so the current behavior is broken.
Txonly scenario only worked due to the fact that
sk_mark_napi_id_once_xdp() was never called - since Rx side is not in
the picture for this case and mentioned function is called in
xsk_rcv_check(), sk::sk_napi_id was never set, which in turn meant that
xsk_no_wakeup() was returning false (see the sk->sk_napi_id >=
MIN_NAPI_ID check in there).
To fix this, prefer busy poll in xsk_sendmsg() only when zero copy is
enabled on a given AF_XDP socket. By doing so, busy poll in copy mode
would not exit early on Tx side and eventually xsk_generic_xmit() will
be called.
Fixes: a0731952d9cd ("xsk: Add busy-poll support for {recv,send}msg()")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220406155804.434493-1-maciej.fijalkowski@intel.com
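The early-exit decision discussed in the commit above can be sketched in userspace as follows. This is a simplified model, not the kernel's xsk_sendmsg(): `toy_xsk`, `TOY_MIN_NAPI_ID` (with an arbitrary value) and the `toy_*` functions are hypothetical, and the model only captures the idea that the no-wakeup shortcut should apply to zero-copy sockets alone.

```c
#include <stdbool.h>

#define TOY_MIN_NAPI_ID 2u	/* hypothetical threshold standing in for MIN_NAPI_ID */

struct toy_xsk {
	bool zc;		/* zero-copy enabled on this socket */
	bool prefer_busy_poll;	/* busy poll preferred by the application */
	unsigned int napi_id;	/* nonzero once the Rx path has marked the socket */
};

/* Models the no-wakeup test: busy poll preferred and a NAPI id recorded. */
bool toy_no_wakeup(const struct toy_xsk *xs)
{
	return xs->prefer_busy_poll && xs->napi_id >= TOY_MIN_NAPI_ID;
}

/* Before the fix: the early exit applied to copy mode as well. */
bool toy_tx_path_reached_old(const struct toy_xsk *xs)
{
	if (toy_no_wakeup(xs))
		return false;	/* exits before any transmit work */
	return true;
}

/* After the fix: only zero-copy sockets take the early exit. */
bool toy_tx_path_reached_fixed(const struct toy_xsk *xs)
{
	if (xs->zc && toy_no_wakeup(xs))
		return false;
	return true;
}
```

With a copy-mode socket whose NAPI id was set by the Rx side (the l2fwd case), the old variant never reaches the transmit path, while the fixed variant does. A txonly socket, whose NAPI id stays unset, reaches it in both variants.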
Duoming Zhou [Tue, 5 Apr 2022 13:22:06 +0000 (21:22 +0800)]
drivers: net: slip: fix NPD bug in sl_tx_timeout()
When a slip driver is detaching, slip_close() cleans up the
necessary resources and sets sl->tty to NULL. Meanwhile, if
the packet being transmitted is blocked, sl_tx_timeout() will
be called. Although slip_close() and sl_tx_timeout() use
sl->lock to synchronize, sl_tx_timeout() does not check
whether sl->tty is NULL, so a null pointer dereference can
happen.
   (Thread 1)                 |      (Thread 2)
                              | slip_close()
                              |   spin_lock_bh(&sl->lock)
                              |   ...
...                           |   sl->tty = NULL //(1)
sl_tx_timeout()               |   spin_unlock_bh(&sl->lock)
  spin_lock(&sl->lock);       |
  ...                         |   ...
  tty_chars_in_buffer(sl->tty)|
    if (tty->ops->..) //(2)   |
    ...                       |   synchronize_rcu()
sl->tty is set to NULL at position (1) and dereferenced at
position (2).
This patch adds a check to sl_tx_timeout(). If sl->tty is
NULL, sl_tx_timeout() jumps to out.
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Reviewed-by: Jiri Slaby <jirislaby@kernel.org>
Link: https://lore.kernel.org/r/20220405132206.55291-1-duoming@zju.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
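A minimal userspace sketch of the guard described in the commit above follows. It is only an illustration: `toy_slip`, `toy_tty` and the `toy_*` functions are hypothetical, and the locking that the real handler performs around this check is omitted to keep the example self-contained.

```c
#include <stddef.h>

struct toy_tty {
	int chars_in_buffer;
};

struct toy_slip {
	struct toy_tty *tty;	/* set to NULL once the line is closed */
};

/* Models the close path clearing the tty pointer. */
void toy_close(struct toy_slip *sl)
{
	sl->tty = NULL;
}

/*
 * Models the timeout handler. Returns the number of pending characters,
 * or -1 when the line was already closed. The real handler holds
 * sl->lock here; that is left out of this sketch.
 */
int toy_tx_timeout(struct toy_slip *sl)
{
	int ret = -1;

	if (!sl->tty)		/* the added check */
		goto out;
	ret = sl->tty->chars_in_buffer;
out:
	return ret;
}
```

Calling toy_tx_timeout() after toy_close() takes the early exit instead of dereferencing a NULL pointer.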
Jakub Kicinski [Thu, 7 Apr 2022 04:58:49 +0000 (21:58 -0700)]
Merge https://git./linux/kernel/git/bpf/bpf
Alexei Starovoitov says:
====================
pull-request: bpf 2022-04-06
We've added 8 non-merge commits during the last 8 day(s) which contain
a total of 9 files changed, 139 insertions(+), 36 deletions(-).
The main changes are:
1) rethook related fixes, from Jiri and Masami.
2) Fix the case when tracing bpf prog is attached to struct_ops, from Martin.
3) Support dual-stack sockets in bpf_tcp_check_syncookie, from Maxim.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
bpf: Adjust bpf_tcp_check_syncookie selftest to test dual-stack sockets
bpf: Support dual-stack sockets in bpf_tcp_check_syncookie
bpf: selftests: Test fentry tracing a struct_ops program
bpf: Resolve to prog->aux->dst_prog->type only for BPF_PROG_TYPE_EXT
rethook: Fix to use WRITE_ONCE() for rethook:: Handler
selftests/bpf: Fix warning comparing pointer to 0
bpf: Fix sparse warnings in kprobe_multi_resolve_syms
bpftool: Explicit errno handling in skeletons
====================
Link: https://lore.kernel.org/r/20220407031245.73026-1-alexei.starovoitov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Maxim Mikityanskiy [Wed, 6 Apr 2022 12:41:13 +0000 (15:41 +0300)]
bpf: Adjust bpf_tcp_check_syncookie selftest to test dual-stack sockets
The previous commit fixed support for dual-stack sockets in
bpf_tcp_check_syncookie. This commit adjusts the selftest to verify the
fixed functionality.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Arthur Fabre <afabre@cloudflare.com>
Link: https://lore.kernel.org/bpf/20220406124113.2795730-2-maximmi@nvidia.com
Maxim Mikityanskiy [Wed, 6 Apr 2022 12:41:12 +0000 (15:41 +0300)]
bpf: Support dual-stack sockets in bpf_tcp_check_syncookie
bpf_tcp_gen_syncookie looks at the IP version in the IP header and
validates the address family of the socket. It supports IPv4 packets in
AF_INET6 dual-stack sockets.
On the other hand, bpf_tcp_check_syncookie looks only at the address
family of the socket, ignoring the real IP version in headers, and
validates only the packet size. This implementation has some drawbacks:
1. Packets are not validated properly, allowing a BPF program to trick
bpf_tcp_check_syncookie into handling an IPv6 packet on an IPv4
socket.
2. Dual-stack sockets fail the checks on IPv4 packets. IPv4 clients end
up receiving a SYNACK with the cookie, but the following ACK gets
dropped.
This patch fixes these issues by changing the checks in
bpf_tcp_check_syncookie to match the ones in bpf_tcp_gen_syncookie. IP
version from the header is taken into account, and it is validated
properly with address family.
Fixes: 399040847084 ("bpf: add helper to check for a valid SYN cookie")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Acked-by: Arthur Fabre <afabre@cloudflare.com>
Link: https://lore.kernel.org/bpf/20220406124113.2795730-1-maximmi@nvidia.com
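The validation rule described in the commit above (take the IP version from the header, then check it against the socket's address family) might look roughly like the following userspace sketch. The types, constants and `toy_check_ip_header()` are hypothetical and do not reproduce the helper's real signature; header lengths are the minimum IPv4 and fixed IPv6 header sizes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum toy_family { TOY_AF_INET, TOY_AF_INET6 };

struct toy_sock {
	enum toy_family family;
	bool v6only;	/* true when an AF_INET6 socket refuses IPv4 */
};

#define TOY_IPV4_HDR_MIN 20	/* minimum IPv4 header length in bytes */
#define TOY_IPV6_HDR_LEN 40	/* fixed IPv6 header length in bytes */

/* Accept a header only if its IP version fits the socket's family. */
bool toy_check_ip_header(const struct toy_sock *sk,
			 const uint8_t *iph, size_t len)
{
	if (len < 1)
		return false;

	switch (iph[0] >> 4) {	/* version nibble of the first byte */
	case 4:
		if (len < TOY_IPV4_HDR_MIN)
			return false;
		/* IPv4 is valid on AF_INET and on dual-stack AF_INET6. */
		return sk->family == TOY_AF_INET || !sk->v6only;
	case 6:
		if (len < TOY_IPV6_HDR_LEN)
			return false;
		/* IPv6 is never valid on an AF_INET socket. */
		return sk->family == TOY_AF_INET6;
	default:
		return false;
	}
}
```

Under these assumptions an IPv4 packet on a dual-stack socket passes, while an IPv6 packet on an IPv4 socket is rejected, which corresponds to the two drawbacks listed in the commit message.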
Xiaomeng Tong [Wed, 6 Apr 2022 03:55:56 +0000 (11:55 +0800)]
myri10ge: fix an incorrect free for skb in myri10ge_sw_tso
All remaining skbs should be released when myri10ge_xmit fails to
transmit a packet. Fix it within another skb_list_walk_safe.
Signed-off-by: Xiaomeng Tong <xiam0nd.tong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marcin Kozlowski [Wed, 6 Apr 2022 08:05:37 +0000 (10:05 +0200)]
net: usb: aqc111: Fix out-of-bounds accesses in RX fixup
aqc111_rx_fixup() contains several out-of-bounds accesses that can be
triggered by a malicious (or defective) USB device, in particular:
- The metadata array (desc_offset..desc_offset+2*pkt_count) can be out of bounds,
causing OOB reads and (on big-endian systems) OOB endianness flips.
- A packet can overlap the metadata array, causing a later OOB
endianness flip to corrupt data used by a cloned SKB that has already
been handed off into the network stack.
- A packet SKB can be constructed whose tail is far beyond its end,
causing out-of-bounds heap data to be considered part of the SKB's
data.
Found doing variant analysis. Tested it with another driver (ax88179_178a), since
I don't have an aqc111 device to test it, but the code looks very similar.
Signed-off-by: Marcin Kozlowski <marcinguy@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
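The kind of bounds checking that the commit above calls for can be sketched generically. The two helpers below are hypothetical and are not taken from the driver; they only show an overflow-safe way to test that a metadata array lies inside a buffer and that a packet ends before that array begins. The entry size is a parameter because this sketch makes no claim about the device's descriptor layout.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * True when `count` entries of `entry_size` bytes starting at `offset`
 * fit inside a buffer of `len` bytes. Division is used instead of
 * multiplication so that a huge `count` cannot overflow.
 */
bool toy_metadata_in_bounds(size_t len, size_t offset,
			    size_t count, size_t entry_size)
{
	if (offset > len)
		return false;
	if (entry_size != 0 && count > (len - offset) / entry_size)
		return false;
	return true;
}

/*
 * True when the packet [pkt_start, pkt_start + pkt_len) ends at or
 * before `desc_offset`, so it cannot overlap the metadata array.
 */
bool toy_packet_before_metadata(size_t pkt_start, size_t pkt_len,
				size_t desc_offset)
{
	return pkt_start <= desc_offset &&
	       pkt_len <= desc_offset - pkt_start;
}
```

Both checks treat every length and offset reported by the device as untrusted input.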
Jamie Bainbridge [Wed, 6 Apr 2022 11:19:19 +0000 (21:19 +1000)]
qede: confirm skb is allocated before using
qede_build_skb() assumes build_skb() always works and goes straight
to skb_reserve(). However, build_skb() can fail under memory pressure.
This results in a kernel panic because the skb to reserve is NULL.
Add a check in case build_skb() failed to allocate and return NULL.
The NULL return is handled correctly in callers to qede_build_skb().
Fixes: 8a8633978b842 ("qede: Add build_skb() support.")
Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
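The pattern in the commit above, checking an allocation before using it and passing NULL on to the caller, is shown below as a userspace sketch. `toy_skb`, `toy_build_skb()` and `toy_qede_build_skb()` are hypothetical stand-ins; the `fail` argument exists only to force the failure path in this illustration.

```c
#include <stddef.h>
#include <stdlib.h>

struct toy_skb {
	size_t headroom;
};

/* Stand-in for an allocator that can fail; `fail` forces the failure. */
struct toy_skb *toy_build_skb(int fail)
{
	if (fail)
		return NULL;
	return calloc(1, sizeof(struct toy_skb));
}

/* Builds a buffer and reserves `pad` bytes of headroom, or returns NULL. */
struct toy_skb *toy_qede_build_skb(int fail, size_t pad)
{
	struct toy_skb *skb = toy_build_skb(fail);

	if (!skb)		/* the added check: let the caller handle it */
		return NULL;
	skb->headroom += pad;	/* stands in for skb_reserve() */
	return skb;
}
```

The caller is expected to test the return value, as the commit message says the existing callers already do.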
Florian Westphal [Wed, 6 Apr 2022 10:04:45 +0000 (12:04 +0200)]
net: ipv6mr: fix unused variable warning with CONFIG_IPV6_PIMSM_V2=n
net/ipv6/ip6mr.c:1656:14: warning: unused variable 'do_wrmifwhole'
Move it to the CONFIG_IPV6_PIMSM_V2 scope where it's used.
Fixes: 4b340a5a726d ("net: ip6mr: add support for passing full packet on wrong mif")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 6 Apr 2022 14:03:50 +0000 (15:03 +0100)]
Merge branch '100GbE' of git://git./linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2022-04-05
Maciej Fijalkowski says:
We were solving issues around AF_XDP busy poll's not-so-usual scenarios,
such as very big busy poll budgets applied to very small HW rings. This
set carries the things that were found during that work that apply to
net tree.
One thing that was fixed for all in-tree ZC drivers was missing on ice
side all the time - it's about syncing RCU before destroying XDP
resources. Next one fixes the bit that is checked in ice_xsk_wakeup and
third one avoids false setting of DD bits on Tx descriptors.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Walle [Tue, 5 Apr 2022 12:02:33 +0000 (14:02 +0200)]
net: phy: mscc-miim: reject clause 45 register accesses
The driver doesn't support clause 45 register access yet, but doesn't
check if the access is a c45 one either. This leads to spurious register
reads and writes. Add the check.
Fixes: 542671fe4d86 ("net: phy: mscc-miim: Add MDIO driver")
Signed-off-by: Michael Walle <michael@walle.cc>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
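A rough userspace sketch of the check added by the commit above follows. `TOY_ADDR_C45` is a hypothetical flag bit standing in for whatever marks a clause 45 access in the register number, and `toy_miim_read()` is not the driver's real read routine; the value it returns for accepted accesses is made up for the example.

```c
#include <errno.h>
#include <stdint.h>

#define TOY_ADDR_C45 (1u << 30)	/* hypothetical marker for clause 45 accesses */

/*
 * Rejects clause 45 accesses up front. For clause 22 accesses the sketch
 * simply echoes the 5-bit register number as a placeholder value.
 */
int toy_miim_read(uint32_t regnum)
{
	if (regnum & TOY_ADDR_C45)
		return -EOPNOTSUPP;
	return (int)(regnum & 0x1f);
}
```

Rejecting the access early means an unsupported request is reported as an error instead of being issued as a spurious clause 22 transaction.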
David S. Miller [Wed, 6 Apr 2022 12:54:52 +0000 (13:54 +0100)]
Merge branch 'axienet-broken-link'
Andy Chiu says:
====================
Fix broken link on Xilinx's AXI Ethernet in SGMII mode
The Ethernet driver uses phy-handle to reference the PCS/PMA PHY. This
could be a problem if one wants to configure an external PHY via phylink,
since it uses the same phandle to get the PHY. To fix this, introduce a
dedicated pcs-handle to point to the PCS/PMA PHY and deprecate the use
of pointing it with phy-handle. A similar use case of pcs-handle can be
seen on dpaa2 as well.
--- patch v5 ---
- Re-apply the v4 patch on the net tree.
- Describe the pcs-handle DT binding at ethernet-controller level.
--- patch v6 ---
- Remove "preferrably" to clarify usage of pcs_handle.
--- patch v7 ---
- Rebase the patch on latest net/master
--- patch v8 ---
- Rebase the patch on net-next/master
- Add "reviewed-by" tag in PATCH 3/4: dt-bindings: net: add pcs-handle
attribute
- Remove "fix" tag in last commit message since this is not a critical
bug and will not be back ported to stable.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Andy Chiu [Tue, 5 Apr 2022 09:19:29 +0000 (17:19 +0800)]
net: axiemac: use a phandle to reference pcs_phy
In some SGMII use cases where both a fixed link external PHY and the
internal PCS/PMA PHY need to be configured, we should explicitly use a
phandle "pcs-phy" to get the reference to the PCS/PMA PHY. Otherwise, the
driver would use "phy-handle" in the DT as the reference to both the
external and the internal PCS/PMA PHY.
In other cases where the core is connected to a SFP cage, we could still
point phy-handle to the internal PCS/PMA PHY, and let the driver connect
to the SFP module, if one exists, via phylink.
Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Greentime Hu <greentime.hu@sifive.com>
Reviewed-by: Robert Hancock <robert.hancock@calian.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andy Chiu [Tue, 5 Apr 2022 09:19:28 +0000 (17:19 +0800)]
dt-bindings: net: add pcs-handle attribute
Document the new pcs-handle attribute to support connecting to an
external PHY. For Xilinx's AXI Ethernet, this is used when the core
operates in SGMII or 1000Base-X modes and links through the internal
PCS/PMA PHY.
Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Greentime Hu <greentime.hu@sifive.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andy Chiu [Tue, 5 Apr 2022 09:19:27 +0000 (17:19 +0800)]
net: axienet: factor out phy_node in struct axienet_local
The struct member `phy_node` of struct axienet_local is not used by the
driver anymore after initialization. It might be a remnant of old code
and could be removed.
Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Greentime Hu <greentime.hu@sifive.com>
Reviewed-by: Robert Hancock <robert.hancock@calian.com>
Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andy Chiu [Tue, 5 Apr 2022 09:19:26 +0000 (17:19 +0800)]
net: axienet: setup mdio unconditionally
The call to axienet_mdio_setup should not depend on whether "phy-node"
is present in the DT. Besides, since `lp->phy_node` is used if the PHY is in
SGMII or 1000Base-X modes, move it into the if statement. The next patch
will remove `lp->phy_node` from the driver's private structure and do an
of_node_put on it right after use, since it is not used elsewhere.
Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Greentime Hu <greentime.hu@sifive.com>
Reviewed-by: Robert Hancock <robert.hancock@calian.com>
Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Taehee Yoo [Tue, 5 Apr 2022 08:45:44 +0000 (08:45 +0000)]
net: sfc: fix using uninitialized xdp tx_queue
In some cases, the xdp tx_queue can be used before it is initialized:
1. interface up/down
2. ring buffer size change
When there are fewer CPU cores than the maximum number of channels of the
sfc driver, it creates new channels only for XDP.
When an interface is brought up or the ring buffer size is changed, all
channels are initialized, but the xdp channels are always initialized last.
So the following scenario is possible:
Packets are received on the rx queue of a normal channel, the XDP_TX
action is taken, and the tx_queue of an xdp channel gets used.
But these tx_queues are not initialized yet, so a TX DMA or queue error
occurs.
To avoid this problem:
1. initialize the xdp tx_queues earlier than the other rx_queues in
efx_start_channels().
2. check whether the tx_queue is initialized in efx_xdp_tx_buffers().
Splat looks like:
sfc 0000:08:00.1 enp8s0f1np1: TX queue 10 spurious TX completion id 250
sfc 0000:08:00.1 enp8s0f1np1: resetting (RECOVER_OR_ALL)
sfc 0000:08:00.1 enp8s0f1np1: MC command 0x80 inlen 100 failed rc=-22 (raw=22) arg=789
sfc 0000:08:00.1 enp8s0f1np1: has been disabled
Fixes: f28100cb9c96 ("sfc: fix lack of XDP TX queues - error XDP TX failed (-22)")
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 4 Apr 2022 18:34:39 +0000 (11:34 -0700)]
rxrpc: fix a race in rxrpc_exit_net()
Current code can lead to the following race:
CPU0                                                 CPU1
rxrpc_exit_net()
                                                     rxrpc_peer_keepalive_worker()
                                                       if (rxnet->live)
  rxnet->live = false;
  del_timer_sync(&rxnet->peer_keepalive_timer);
                                                         timer_reduce(&rxnet->peer_keepalive_timer, jiffies + delay);
  cancel_work_sync(&rxnet->peer_keepalive_work);
rxrpc_exit_net() exits while peer_keepalive_timer is still armed,
leading to use-after-free.
syzbot report was:
ODEBUG: free active (active state 0) object type: timer_list hint: rxrpc_peer_keepalive_timeout+0x0/0xb0
WARNING: CPU: 0 PID: 3660 at lib/debugobjects.c:505 debug_print_object+0x16e/0x250 lib/debugobjects.c:505
Modules linked in:
CPU: 0 PID: 3660 Comm: kworker/u4:6 Not tainted 5.17.0-syzkaller-13993-g88e6c0207623 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: netns cleanup_net
RIP: 0010:debug_print_object+0x16e/0x250 lib/debugobjects.c:505
Code: ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 af 00 00 00 48 8b 14 dd 00 1c 26 8a 4c 89 ee 48 c7 c7 00 10 26 8a e8 b1 e7 28 05 <0f> 0b 83 05 15 eb c5 09 01 48 83 c4 18 5b 5d 41 5c 41 5d 41 5e c3
RSP: 0018:ffffc9000353fb00 EFLAGS: 00010082
RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000000
RDX: ffff888029196140 RSI: ffffffff815efad8 RDI: fffff520006a7f52
RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
R10: ffffffff815ea4ae R11: 0000000000000000 R12: ffffffff89ce23e0
R13: ffffffff8a2614e0 R14: ffffffff816628c0 R15: dffffc0000000000
FS:  0000000000000000(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fe1f2908924 CR3: 0000000043720000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
__debug_check_no_obj_freed lib/debugobjects.c:992 [inline]
debug_check_no_obj_freed+0x301/0x420 lib/debugobjects.c:1023
kfree+0xd6/0x310 mm/slab.c:3809
ops_free_list.part.0+0x119/0x370 net/core/net_namespace.c:176
ops_free_list net/core/net_namespace.c:174 [inline]
cleanup_net+0x591/0xb00 net/core/net_namespace.c:598
process_one_work+0x996/0x1610 kernel/workqueue.c:2289
worker_thread+0x665/0x1080 kernel/workqueue.c:2436
kthread+0x2e9/0x3a0 kernel/kthread.c:376
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:298
</TASK>
Fixes: ace45bec6d77 ("rxrpc: Fix firewall route keepalive")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Marc Dionne <marc.dionne@auristor.com>
Cc: linux-afs@lists.infradead.org
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ilya Maximets [Mon, 4 Apr 2022 15:43:45 +0000 (17:43 +0200)]
net: openvswitch: fix leak of nested actions
While parsing user-provided actions, openvswitch module may dynamically
allocate memory and store pointers in the internal copy of the actions.
So this memory has to be freed while destroying the actions.
Currently there are only two such actions: ct() and set(). However,
there are many actions that can hold nested lists of actions and
ovs_nla_free_flow_actions() just jumps over them leaking the memory.
For example, removal of the flow with the following actions will lead
to a leak of the memory allocated by nf_ct_tmpl_alloc():
actions:clone(ct(commit),0)
Non-freed set() action may also leak the 'dst' structure for the
tunnel info including device references.
Under certain conditions with a high rate of flow rotation, this may
cause a significant memory leak (2MB per second in the reporter's
case). The problem is also hard to mitigate, because the user doesn't
have direct control over the datapath flows generated by OVS.
Fix that by iterating over all the nested actions and freeing
everything that needs to be freed recursively.
A new build-time assertion should protect us from this problem if new
actions are added in the future.
Unfortunately, the openvswitch module doesn't use NLA_F_NESTED, so all
attributes have to be explicitly checked. sample() and clone() actions
are mixing extra attributes into the user-provided action list. That
prevents some code generalization too.
Fixes: 34ae932a4036 ("openvswitch: Make tunnel set action attach a metadata dst")
Link: https://mail.openvswitch.org/pipermail/ovs-dev/2022-March/392922.html
Reported-by: Stéphane Graber <stgraber@ubuntu.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Tue, 5 Apr 2022 00:04:04 +0000 (02:04 +0200)]
net: ethernet: mv643xx: Fix over zealous checking of_get_mac_address()
There is often not a MAC address available in an EEPROM accessible by
Linux with Marvell devices. Instead the bootloader has the MAC address
and directly programs it into the hardware. So don't consider an error
from of_get_mac_address() as fatal. However, the check was added for
the case where there is a MAC address in the EEPROM, but the EEPROM
has not probed yet, and -EPROBE_DEFER is returned. In that case the
error should be returned. So make the check specific to this error
code.
Cc: Mauri Sandberg <maukka@ext.kapsi.fi>
Reported-by: Thomas Walther <walther-it@gmx.de>
Fixes: 42404d8f1c01 ("net: mv643xx_eth: process retval from of_get_mac_address")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20220405000404.3374734-1-andrew@lunn.ch
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
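The narrowed error handling described in the commit above can be sketched in userspace as follows. `toy_handle_mac_lookup()` is a hypothetical helper, and `TOY_EPROBE_DEFER` is defined locally because the kernel-internal probe-deferral code is not part of the userspace errno set; both are assumptions made for this illustration.

```c
#include <errno.h>

#define TOY_EPROBE_DEFER 517	/* local stand-in for the kernel-internal code */

/*
 * Decides what to do with the result of a MAC address lookup.
 * Returns 0 to continue probing, or a negative error to abort.
 */
int toy_handle_mac_lookup(int ret)
{
	if (ret == -TOY_EPROBE_DEFER)
		return ret;	/* EEPROM not probed yet: try again later */
	return 0;		/* any other error: the address may already be in hardware */
}
```

Only the deferral case aborts the probe. Every other lookup failure is tolerated because the bootloader may have programmed the address directly.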
Ilya Maximets [Mon, 4 Apr 2022 10:41:50 +0000 (12:41 +0200)]
net: openvswitch: don't send internal clone attribute to the userspace.
'OVS_CLONE_ATTR_EXEC' is an internal attribute that is used for
performance optimization inside the kernel. It's added by the kernel
while parsing user-provided actions and should not be sent during the
flow dump as it's not part of the uAPI.
The issue doesn't cause any significant problems to the ovs-vswitchd
process, because reported actions are not really used in the
application lifecycle and are only supposed to be shown to a human via
ovs-dpctl flow dump. However, the action list is still incorrect
and causes the following error if the user wants to look at the
datapath flows:
# ovs-dpctl add-dp system@ovs-system
# ovs-dpctl add-flow "<flow match>" "clone(ct(commit),0)"
# ovs-dpctl dump-flows
<flow match>, packets:0, bytes:0, used:never,
actions:clone(bad length 4, expected -1 for: action0(01 00 00 00),
ct(commit),0)
With the fix:
# ovs-dpctl dump-flows
<flow match>, packets:0, bytes:0, used:never,
actions:clone(ct(commit),0)
Additionally fixed an incorrect attribute name in the comment.
Fixes: b233504033db ("openvswitch: kernel datapath clone action")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Link: https://lore.kernel.org/r/20220404104150.2865736-1-i.maximets@ovn.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Horatiu Vultur [Tue, 5 Apr 2022 06:59:36 +0000 (08:59 +0200)]
net: micrel: Fix KS8851 Kconfig
KS8851 selects MICREL_PHY, which depends on PTP_1588_CLOCK_OPTIONAL, so
make KS8851 also depend on PTP_1588_CLOCK_OPTIONAL.
Fixes kconfig warning and build errors:
WARNING: unmet direct dependencies detected for MICREL_PHY
Depends on [m]: NETDEVICES [=y] && PHYLIB [=y] && PTP_1588_CLOCK_OPTIONAL [=m]
Selected by [y]:
- KS8851 [=y] && NETDEVICES [=y] && ETHERNET [=y] && NET_VENDOR_MICREL [=y] && SPI [=y]
ld.lld: error: undefined symbol: ptp_clock_register referenced by micrel.c
net/phy/micrel.o:(lan8814_probe) in archive drivers/built-in.a
ld.lld: error: undefined symbol: ptp_clock_index referenced by micrel.c
net/phy/micrel.o:(lan8814_ts_info) in archive drivers/built-in.a
Reported-by: kernel test robot <lkp@intel.com>
Fixes: ece19502834d ("net: phy: micrel: 1588 support for LAN8814 phy")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/r/20220405065936.4105272-1-horatiu.vultur@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 5 Apr 2022 20:04:03 +0000 (13:04 -0700)]
Merge git://git./linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
1) Incorrect comparison in bitmask .reduce, from Jeremy Sowden.
2) Missing GFP_KERNEL_ACCOUNT for dynamically allocated objects,
from Vasily Averin.
* git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: nf_tables: memcg accounting for dynamically allocated objects
netfilter: bitwise: fix reduce comparisons
====================
Link: https://lore.kernel.org/r/20220405100923.7231-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Maciej Fijalkowski [Thu, 17 Mar 2022 18:36:29 +0000 (19:36 +0100)]
ice: clear cmd_type_offset_bsz for TX rings
Currently when XDP rings are created, each descriptor gets its DD bit
set, which turns out to be the wrong approach as it can lead to a
situation where more descriptors get cleaned than intended,
e.g. when AF_XDP busy poll is run with a large batch size. In this
situation, the driver would request more buffers than it is able to
handle.
Fix this by not setting the DD bits in ice_xdp_alloc_setup_rings(). They
should be initialized to zero instead.
Fixes: 9610bd988df9 ("ice: optimize XDP_TX workloads")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Shwetha Nagaraju <shwetha.nagaraju@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Maciej Fijalkowski [Thu, 17 Mar 2022 18:36:28 +0000 (19:36 +0100)]
ice: xsk: fix VSI state check in ice_xsk_wakeup()
ICE_DOWN is dedicated for pf->state. Check for ICE_VSI_DOWN being set on
vsi->state in ice_xsk_wakeup().
Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Shwetha Nagaraju <shwetha.nagaraju@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Maciej Fijalkowski [Thu, 17 Mar 2022 18:36:27 +0000 (19:36 +0100)]
ice: synchronize_rcu() when terminating rings
Unfortunately, the ice driver doesn't respect the RCU critical section that
XSK wakeup is surrounded with. To fix this, add synchronize_rcu() calls to
paths that destroy resources that might be in use.
This was addressed in other AF_XDP ZC enabled drivers, for reference see
for example commit b3873a5be757 ("net/i40e: Fix concurrency issues
between config flow and XSK")
Fixes: efc2214b6047 ("ice: Add support for XDP")
Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Shwetha Nagaraju <shwetha.nagaraju@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
David Ahern [Mon, 4 Apr 2022 15:09:08 +0000 (09:09 -0600)]
ipv6: Fix stats accounting in ip6_pkt_drop
VRF devices are the loopbacks for VRFs, and a loopback cannot be
assigned to a VRF. Accordingly, the condition in ip6_pkt_drop should
be '||' not '&&'.
Fixes: 1d3fd8a10bed ("vrf: Use orig netdev to count Ip6InNoRoutes and a fresh route lookup when sending dest unreach")
Reported-by: Pudak, Filip <Filip.Pudak@windriver.com>
Reported-by: Xiao, Jiguang <Jiguang.Xiao@windriver.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20220404150908.2937-1-dsahern@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Paolo Abeni [Tue, 5 Apr 2022 10:50:28 +0000 (12:50 +0200)]
Merge branch 'ice-bug-fixes'
Tony Nguyen says:
====================
ice bug fixes
Alice Michael says:
There were a couple of bugs that have been found and
fixed by Anatolii in the ice driver. First he fixed
a bug on ring creation by setting the default value
for the teid. Anatolii also fixed a bug with deleting
queues in ice_vc_dis_qs_msg based on their enablement.
---
v2: Remove empty lines between tags
The following are changes since commit
458f5d92df4807e2a7c803ed928369129996bf96:
sfc: Do not free an empty page_ring
and are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue 100GbE
====================
Link: https://lore.kernel.org/r/20220404183548.3422851-1-anthony.l.nguyen@intel.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Anatolii Gerasymenko [Mon, 4 Apr 2022 18:35:48 +0000 (11:35 -0700)]
ice: Do not skip not enabled queues in ice_vc_dis_qs_msg
Disable check for queue being enabled in ice_vc_dis_qs_msg, because
there could be a case when queues were created, but were not enabled.
We still need to delete those queues.
Normal workflow for VF looks like:
Enable path:
VIRTCHNL_OP_ADD_ETH_ADDR (opcode 10)
VIRTCHNL_OP_CONFIG_VSI_QUEUES (opcode 6)
VIRTCHNL_OP_ENABLE_QUEUES (opcode 8)
Disable path:
VIRTCHNL_OP_DISABLE_QUEUES (opcode 9)
VIRTCHNL_OP_DEL_ETH_ADDR (opcode 11)
The issue appears only in stress conditions when VF is enabled and
disabled very fast.
Eventually there will be a case, when queues are created by
VIRTCHNL_OP_CONFIG_VSI_QUEUES, but are not enabled by
VIRTCHNL_OP_ENABLE_QUEUES.
In turn, these queues are not deleted by VIRTCHNL_OP_DISABLE_QUEUES,
because there is a check whether queues are enabled in
ice_vc_dis_qs_msg.
When we bring up the VF again, we will see the "Failed to set LAN Tx queue
context" error during the VIRTCHNL_OP_CONFIG_VSI_QUEUES step. This
happens because the 16 old queues were not deleted and the VF requests to create
16 more, but ice_sched_get_free_qparent in ice_ena_vsi_txq fails to
find a parent node for the first newly requested queue (because all nodes
are allocated to the 16 old queues).
Testing Hints:
Just enable and disable VF fast enough, so it would be disabled before
reaching VIRTCHNL_OP_ENABLE_QUEUES.
while true; do
ip link set dev ens785f0v0 up
sleep 0.065 # adjust delay value for your machine
ip link set dev ens785f0v0 down
done
Fixes: 77ca27c41705 ("ice: add support for virtchnl_queue_select.[tx|rx]_queues bitmap")
Signed-off-by: Anatolii Gerasymenko <anatolii.gerasymenko@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Alice Michael <alice.michael@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Anatolii Gerasymenko [Mon, 4 Apr 2022 18:35:47 +0000 (11:35 -0700)]
ice: Set txq_teid to ICE_INVAL_TEID on ring creation
When VF is freshly created, but not brought up, ring->txq_teid
value is by default set to 0.
But 0 is a valid TEID. On some platforms the Root Node of
Tx scheduler has a TEID = 0. This can cause issues as shown below.
The proper way is to set ring->txq_teid to ICE_INVAL_TEID (0xFFFFFFFF).
Testing Hints:
echo 1 > /sys/class/net/ens785f0/device/sriov_numvfs
ip link set dev ens785f0v0 up
ip link set dev ens785f0v0 down
If we have a freshly created VF and quickly turn it on and off, so that
there is no time to reach the VIRTCHNL_OP_CONFIG_VSI_QUEUES stage, then
the VIRTCHNL_OP_DISABLE_QUEUES stage will fail with error:
[ 639.531454] disable queue 89 failed 14
[ 639.532233] Failed to disable LAN Tx queues, error: ICE_ERR_AQ_ERROR
[ 639.533107] ice 0000:02:00.0: Failed to stop Tx ring 0 on VSI 5
The reason for the failure is that we are trying to send an AQ command to
delete queue 89, which has never been created, and receive an "invalid
argument" error from firmware.
As this queue has never been created, its teid and ring->txq_teid
have the default value 0.
ice_dis_vsi_txq has a check against non-existent queues:
node = ice_sched_find_node_by_teid(pi->root, q_teids[i]);
if (!node)
continue;
But on some platforms the Root Node of Tx scheduler has a teid = 0.
Hence, ice_sched_find_node_by_teid finds a node with teid = 0 (it is
pi->root), and we go further to submit an erroneous request to firmware.
Fixes: 37bb83901286 ("ice: Move common functions out of ice_main.c part 7/7")
Signed-off-by: Anatolii Gerasymenko <anatolii.gerasymenko@intel.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Alice Michael <alice.michael@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Miaoqian Lin [Mon, 4 Apr 2022 12:53:36 +0000 (12:53 +0000)]
dpaa2-ptp: Fix refcount leak in dpaa2_ptp_probe
This node pointer is returned by of_find_compatible_node() with
refcount incremented. Call of_node_put() to avoid the refcount leak.
Fixes:
d346c9e86d86 ("dpaa2-ptp: reuse ptp_qoriq driver")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Link: https://lore.kernel.org/r/20220404125336.13427-1-linmq006@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Vasily Averin [Sat, 2 Apr 2022 09:50:37 +0000 (12:50 +0300)]
netfilter: nf_tables: memcg accounting for dynamically allocated objects
nft_*.c files that have the NFT_EXPR_STATEFUL flag set need to
use the __GFP_ACCOUNT flag for objects that are dynamically
allocated from the packet path.
Such objects are allocated inside nft_expr_ops->init() callbacks
executed in task context while processing netlink messages.
In addition, this patch adds accounting to nft_set_elem_expr_clone()
used for the same purposes.
Signed-off-by: Vasily Averin <vvs@openvz.org>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Jamie Bainbridge [Sun, 3 Apr 2022 23:47:48 +0000 (09:47 +1000)]
sctp: count singleton chunks in assoc user stats
Singleton chunks (INIT, HEARTBEAT PMTU probes, and SHUTDOWN-
COMPLETE) are not counted in SCTP_GET_ASSOC_STATS "sas_octrlchunks"
counter available to the assoc owner.
These are all control chunks so they should be counted as such.
Add counting of singleton chunks so they are properly accounted for.
Fixes:
196d67593439 ("sctp: Add support to per-association statistics via a new SCTP_GET_ASSOC_STATS call")
Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Link: https://lore.kernel.org/r/c9ba8785789880cf07923b8a5051e174442ea9ee.1649029663.git.jamie.bainbridge@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Martin Habets [Mon, 4 Apr 2022 10:48:51 +0000 (11:48 +0100)]
sfc: Do not free an empty page_ring
When the page_ring is not used page_ptr_mask is 0.
Do not dereference page_ring[0] in this case.
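A minimal sketch of why the guard matters, assuming a teardown loop that runs from index 0 up to and including page_ptr_mask. The struct and release_page_ring() below are hypothetical and only model the idea; they are not the sfc code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced RX queue: a ring of page pointers and its index mask. */
struct rx_queue {
    void **page_ring;           /* NULL when page recycling is not used */
    unsigned int page_ptr_mask; /* ring size - 1; 0 when the ring is unused */
};

/* Clears every slot and returns how many pages were released. */
static unsigned int release_page_ring(struct rx_queue *q)
{
    unsigned int i, released = 0;

    assert(q != NULL);
    /* Without this check, a mask of 0 still runs the loop once and
     * reads page_ring[0] through a NULL pointer. */
    if (!q->page_ring)
        return 0;

    for (i = 0; i <= q->page_ptr_mask; i++) {
        if (q->page_ring[i]) {
            q->page_ring[i] = NULL;
            released++;
        }
    }
    return released;
}
```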
Fixes:
2768935a4660 ("sfc: reuse pages to avoid DMA mapping/unmapping costs")
Reported-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Rix [Sun, 3 Apr 2022 14:02:02 +0000 (10:02 -0400)]
stmmac: dwmac-loongson: change loongson_dwmac_driver from global to static
Smatch reports this issue
dwmac-loongson.c:208:19: warning: symbol
'loongson_dwmac_driver' was not declared.
Should it be static?
loongson_dwmac_driver is only used in dwmac-loongson.c.
File scope variables used only in one file should
be static. Change loongson_dwmac_driver's
storage-class-specifier from global to static.
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 4 Apr 2022 11:44:50 +0000 (12:44 +0100)]
Merge branch 'bnxt_en-fixes'
Michael Chan says:
====================
bnxt_en: XDP redirect fixes
This series includes 3 fixes related to the XDP redirect code path in
the driver. The first one adds locking when the number of TX XDP rings
is less than the number of CPUs. The second one adjusts the maximum MTU
that can support XDP with enough tail room in the buffer. The 3rd one
fixes a race condition between TX ring shutdown and the XDP redirect path.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ray Jui [Sat, 2 Apr 2022 00:21:12 +0000 (20:21 -0400)]
bnxt_en: Prevent XDP redirect from running when stopping TX queue
Add checks in the XDP redirect callback to prevent XDP from running when
the TX ring is undergoing shutdown.
Also remove redundant checks in the XDP redirect callback to validate the
txr and the flag that indicates the ring supports XDP. The modulo
arithmetic on 'tx_nr_rings_xdp' already guarantees the derived TX
ring is an XDP ring. txr is also guaranteed to be valid after checking
BNXT_STATE_OPEN and within RCU grace period.
Fixes:
f18c2b77b2e4 ("bnxt_en: optimized XDP_REDIRECT support")
Reviewed-by: Vladimir Olovyannikov <vladimir.olovyannikov@broadcom.com>
Signed-off-by: Ray Jui <ray.jui@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andy Gospodarek [Sat, 2 Apr 2022 00:21:11 +0000 (20:21 -0400)]
bnxt_en: reserve space inside receive page for skb_shared_info
Insufficient space was being reserved in the page used for packet
reception, so the interface MTU could be set too large to still have
room for the contents of the packet when doing XDP redirect. This
resulted in the following message when redirecting a packet between
3520 and 3822 bytes with an MTU of 3822:
[311815.561880] XDP_WARN: xdp_update_frame_from_buff(line:200): Driver BUG: missing reserved tailroom
Fixes:
f18c2b77b2e4 ("bnxt_en: optimized XDP_REDIRECT support")
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pavan Chebbi [Sat, 2 Apr 2022 00:21:10 +0000 (20:21 -0400)]
bnxt_en: Synchronize tx when xdp redirects happen on same ring
If there are more CPUs than the number of TX XDP rings, multiple XDP
redirects can select the same TX ring based on the CPU on which
XDP redirect is called. Add locking when needed and use static
key to decide whether to take the lock.
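As a rough illustration, assuming the ring is picked as the CPU number modulo the number of XDP TX rings (the modulo on 'tx_nr_rings_xdp' is mentioned elsewhere in this series), the collision condition looks like this. Both helper names are hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: XDP TX ring used by a redirect running on 'cpu'. */
static unsigned int xdp_tx_ring_for_cpu(unsigned int cpu, unsigned int nr_xdp_rings)
{
    assert(nr_xdp_rings > 0);
    return cpu % nr_xdp_rings;
}

/* Two CPUs can only share a ring when there are more CPUs than rings,
 * so that is the only case in which the lock has to be taken. */
static int xdp_tx_needs_lock(unsigned int nr_cpus, unsigned int nr_xdp_rings)
{
    return nr_cpus > nr_xdp_rings;
}
```

With 8 CPUs and 4 rings, CPUs 1 and 5 both map to ring 1, so the lock is required. With 4 CPUs and 4 rings each CPU owns its ring and the lock can be skipped, which is the decision the static key encodes.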
Fixes:
f18c2b77b2e4 ("bnxt_en: optimized XDP_REDIRECT support")
Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Manish Chopra [Fri, 1 Apr 2022 18:53:04 +0000 (11:53 -0700)]
qed: fix ethtool register dump
To fix a Coverity complaint, commit
d5ac07dfbd2b ("qed: Initialize debug string array") removed "sw-platform"
(one of the common global parameters) from the dump, as it
was used in the dump with an uninitialized string. However,
it did not reduce the number of common global parameters,
which resulted in an incorrect (unparseable) register dump.
This patch fixes it by reducing NUM_COMMON_GLOBAL_PARAMS
by one.
Cc: stable@vger.kernel.org
Cc: Tim Gardner <tim.gardner@canonical.com>
Cc: "David S. Miller" <davem@davemloft.net>
Fixes:
d5ac07dfbd2b ("qed: Initialize debug string array")
Signed-off-by: Prabhakar Kushwaha <pkushwaha@marvell.com>
Signed-off-by: Alok Prasad <palok@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: Manish Chopra <manishc@marvell.com>
Reviewed-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 4 Apr 2022 11:40:42 +0000 (12:40 +0100)]
Merge branch 'micrel-lan8814-remove-latencies'
Horatiu Vultur says:
====================
net: phy: micrel: Remove latencies support lan8814
Remove the latencies support both from the PHY driver and from the DT.
The IP already has some default latency values which can be used to get
decent results. It has the following values (defined in ns):
rx-1000mbit: 429
tx-1000mbit: 201
rx-100mbit: 2346
tx-100mbit: 705
v0->v1:
- fix the split of the patches, there was a compiling error between patch 2 and
patch 3.
---
But to get better results the following values need to be set:
rx-1000mbit: 459
tx-1000mbit: 171
rx-100mbit: 1706
tx-100mbit: 1345
We are proposing to use ethtool to set these latencies, the RFC can be found
here[1]
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Horatiu Vultur [Fri, 1 Apr 2022 11:05:22 +0000 (13:05 +0200)]
net: phy: micrel: Remove DT option lan8814,ignore-ts
When the PHY and the MAC are capable of doing timestamping, the PHY has
priority. Therefore the DT option lan8814,ignore-ts was added such that
the PHY will not expose a PHC so then the timestamping was done in the
MAC. This is not the correct approach of doing it, therefore remove
this.
Fixes:
ece19502834d84 ("net: phy: micrel: 1588 support for LAN8814 phy")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Horatiu Vultur [Fri, 1 Apr 2022 11:05:21 +0000 (13:05 +0200)]
net: phy: micrel: Remove latency from driver
Based on the discussions here[1], the PHY driver is the wrong place
to set the latencies, therefore remove them.
[1] https://lkml.org/lkml/2022/3/4/325
Fixes:
ece19502834d84 ("net: phy: micrel: 1588 support for LAN8814 phy")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Horatiu Vultur [Fri, 1 Apr 2022 11:05:20 +0000 (13:05 +0200)]
dt-bindings: net: micrel: Revert latency support and timestamping check
Revert latency support from binding.
Based on the discussion[1], the DT is the wrong place to have the
latencies for the PHY.
[1] https://lkml.org/lkml/2022/3/4/325
Fixes:
2358dd3fd325fc ("dt-bindings: net: micrel: Configure latency values and timestamping check for LAN8814 phy")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nikolay Aleksandrov [Fri, 1 Apr 2022 15:54:27 +0000 (18:54 +0300)]
selftests: net: fix nexthop warning cleanup double ip typo
I made a stupid typo when adding the nexthop route warning selftest and
added both $IP and ip after it (double ip) on the cleanup path. The
error doesn't show up when running the test, but obviously it doesn't
clean up properly afterwards.
Fixes:
392baa339c6a ("selftests: net: add delete nexthop route warning test")
Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bjorn Helgaas [Sat, 2 Apr 2022 14:46:23 +0000 (09:46 -0500)]
docs: net: dsa: fix minor grammar and punctuation issues
Fix a few typos and minor grammatical issues.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Chen-Yu Tsai [Thu, 31 Mar 2022 18:48:32 +0000 (02:48 +0800)]
net: stmmac: Fix unset max_speed difference between DT and non-DT platforms
In commit
9cbadf094d9d ("net: stmmac: support max-speed device tree
property"), when DT platforms don't set "max-speed", max_speed is set to
-1; for non-DT platforms, it stays the default 0.
Prior to commit
eeef2f6b9f6e ("net: stmmac: Start adding phylink support"),
the check for a valid max_speed setting was to check if it was greater
than zero. This commit got it right, but subsequent patches just checked
for non-zero, which is incorrect for DT platforms.
In commit
92c3807b9ac3 ("net: stmmac: convert to phylink_get_linkmodes()")
the conversion switched completely to checking for non-zero value as a
valid value, which caused 1000base-T to stop getting advertised by
default.
Instead of trying to fix all the checks, simply leave max_speed alone if
DT property parsing fails.
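The difference between the two validity checks, and the effect of leaving max_speed alone, can be sketched as follows. This is a simplified model with hypothetical helper names, not the stmmac code; 'found' stands in for whether the "max-speed" property was parsed successfully.

```c
#include <assert.h>
#include <stddef.h>

/* Original check: only a positive value counts as a configured limit. */
static int max_speed_valid_gt_zero(int max_speed)
{
    return max_speed > 0;
}

/* Later check: any non-zero value counts, so -1 is mistaken for a limit. */
static int max_speed_valid_nonzero(int max_speed)
{
    return max_speed != 0;
}

/* Models the fix: write *max_speed only when the property was found. */
static void parse_max_speed(int found, int dt_value, int *max_speed)
{
    assert(max_speed != NULL);
    if (found)
        *max_speed = dt_value;
}
```

With -1 the two checks disagree, which is the DT-only breakage described above. With the value left at its default 0, both checks agree that no limit is set.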
Fixes:
9cbadf094d9d ("net: stmmac: support max-speed device tree property")
Fixes:
92c3807b9ac3 ("net: stmmac: convert to phylink_get_linkmodes()")
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Link: https://lore.kernel.org/r/20220331184832.16316-1-wens@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Dimitris Michailidis [Fri, 1 Apr 2022 23:24:11 +0000 (16:24 -0700)]
net/fungible: Fix reference to __udivdi3 on 32b builds
32b builds with CONFIG_PHYS_ADDR_T_64BIT=y, such as i386 PAE,
raise a linker error due to a 64b division:
ld: drivers/net/ethernet/fungible/funcore/fun_dev.o: in function
`fun_dev_enable':
(.text+0xe1a): undefined reference to `__udivdi3'
The divisor in the offending expression is a power of 2. Change it to
use an explicit right shift.
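For an unsigned 64-bit value, dividing by 2^n and shifting right by n give the same result, and the shift never needs a library division helper on 32-bit targets. A minimal sketch with a hypothetical helper name:

```c
#include <assert.h>
#include <stdint.h>

/* Equivalent to value / (1ULL << log2_divisor) for unsigned 64-bit values,
 * but expressed as a shift so no 64-bit division routine is required. */
static uint64_t div_pow2(uint64_t value, unsigned int log2_divisor)
{
    assert(log2_divisor < 64);
    return value >> log2_divisor;
}
```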
Fixes:
e1ffcc66818f ("net/fungible: Add service module for Fungible drivers")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Dimitris Michailidis <dmichail@fungible.com>
Link: https://lore.kernel.org/r/20220401232411.313881-1-dmichail@fungible.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
David S. Miller [Fri, 1 Apr 2022 11:09:17 +0000 (12:09 +0100)]
Merge branch 'nexthop-route-deletye-warning'
Nikolay Aleksandrov says:
====================
net: ipv4: fix nexthop route delete warning
The first patch fixes a warning that can be triggered by deleting a
nexthop route and specifying a device (more info in its commit msg).
And the second patch adds a selftest for that case.
Chose this way to fix it because we should match when deleting without
nh spec and should fail when deleting a nexthop route with old-style nh
spec because nexthop objects are managed separately, e.g.:
$ ip r show 1.2.3.4/32
1.2.3.4 nhid 12 via 192.168.11.2 dev dummy0
$ ip r del 1.2.3.4/32
$ ip r del 1.2.3.4/32 nhid 12
<both should work>
$ ip r del 1.2.3.4/32 dev dummy0
<should fail with ESRCH>
v2: added more to patch 01's commit message
adjusted the test comment in patch 02
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Nikolay Aleksandrov [Fri, 1 Apr 2022 07:33:43 +0000 (10:33 +0300)]
selftests: net: add delete nexthop route warning test
Add a test which causes a WARNING on kernels which treat a
nexthop route like a normal route when comparing for deletion and a
device is specified. That is, a route is found but we hit a warning while
matching it. The warning is from fib_info_nh() in include/net/nexthop.h
because we run it on a fib_info with nexthop object. The call chain is:
inet_rtm_delroute -> fib_table_delete -> fib_nh_match (called with a
nexthop fib_info and also with fc_oif set thus calling fib_info_nh on
the fib_info and triggering the warning).
Repro steps:
$ ip nexthop add id 12 via 172.16.1.3 dev veth1
$ ip route add 172.16.101.1/32 nhid 12
$ ip route delete 172.16.101.1/32 dev veth1
Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Nikolay Aleksandrov [Fri, 1 Apr 2022 07:33:42 +0000 (10:33 +0300)]
net: ipv4: fix route with nexthop object delete warning
FRR folks have hit a kernel warning[1] while deleting routes[2] which is
caused by trying to delete a route pointing to a nexthop id without
specifying nhid but matching on an interface. That is, a route is found
but we hit a warning while matching it. The warning is from
fib_info_nh() in include/net/nexthop.h because we run it on a fib_info
with nexthop object. The call chain is:
inet_rtm_delroute -> fib_table_delete -> fib_nh_match (called with a
nexthop fib_info and also with fc_oif set thus calling fib_info_nh on
the fib_info and triggering the warning). The fix is to not do any
matching in that branch if the fi has a nexthop object because those are
managed separately. I.e. we should match when deleting without nh spec and
should fail when deleting a nexthop route with old-style nh spec because
nexthop objects are managed separately, e.g.:
$ ip r show 1.2.3.4/32
1.2.3.4 nhid 12 via 192.168.11.2 dev dummy0
$ ip r del 1.2.3.4/32
$ ip r del 1.2.3.4/32 nhid 12
<both should work>
$ ip r del 1.2.3.4/32 dev dummy0
<should fail with ESRCH>
[1]
[ 523.462226] ------------[ cut here ]------------
[ 523.462230] WARNING: CPU: 14 PID: 22893 at include/net/nexthop.h:468 fib_nh_match+0x210/0x460
[ 523.462236] Modules linked in: dummy rpcsec_gss_krb5 xt_socket nf_socket_ipv4 nf_socket_ipv6 ip6table_raw iptable_raw bpf_preload xt_statistic ip_set ip_vs_sh ip_vs_wrr ip_vs_rr ip_vs xt_mark nf_tables xt_nat veth nf_conntrack_netlink nfnetlink xt_addrtype br_netfilter overlay dm_crypt nfsv3 nfs fscache netfs vhost_net vhost vhost_iotlb tap tun xt_CHECKSUM xt_MASQUERADE xt_conntrack 8021q garp mrp ipt_REJECT nf_reject_ipv4 ip6table_mangle ip6table_nat iptable_mangle iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter bridge stp llc rfcomm snd_seq_dummy snd_hrtimer rpcrdma rdma_cm iw_cm ib_cm ib_core ip6table_filter xt_comment ip6_tables vboxnetadp(OE) vboxnetflt(OE) vboxdrv(OE) qrtr bnep binfmt_misc xfs vfat fat squashfs loop nvidia_drm(POE) nvidia_modeset(POE) nvidia_uvm(POE) nvidia(POE) intel_rapl_msr intel_rapl_common snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi btusb btrtl iwlmvm uvcvideo btbcm snd_hda_intel edac_mce_amd
[ 523.462274] videobuf2_vmalloc videobuf2_memops btintel snd_intel_dspcfg videobuf2_v4l2 snd_intel_sdw_acpi bluetooth snd_usb_audio snd_hda_codec mac80211 snd_usbmidi_lib joydev snd_hda_core videobuf2_common kvm_amd snd_rawmidi snd_hwdep snd_seq videodev ccp snd_seq_device libarc4 ecdh_generic mc snd_pcm kvm iwlwifi snd_timer drm_kms_helper snd cfg80211 cec soundcore irqbypass rapl wmi_bmof i2c_piix4 rfkill k10temp pcspkr acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc drm zram ip_tables crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel nvme sp5100_tco r8169 nvme_core wmi ipmi_devintf ipmi_msghandler fuse
[ 523.462300] CPU: 14 PID: 22893 Comm: ip Tainted: P OE 5.16.18-200.fc35.x86_64 #1
[ 523.462302] Hardware name: Micro-Star International Co., Ltd. MS-7C37/MPG X570 GAMING EDGE WIFI (MS-7C37), BIOS 1.C0 10/29/2020
[ 523.462303] RIP: 0010:fib_nh_match+0x210/0x460
[ 523.462304] Code: 7c 24 20 48 8b b5 90 00 00 00 e8 bb ee f4 ff 48 8b 7c 24 20 41 89 c4 e8 ee eb f4 ff 45 85 e4 0f 85 2e fe ff ff e9 4c ff ff ff <0f> 0b e9 17 ff ff ff 3c 0a 0f 85 61 fe ff ff 48 8b b5 98 00 00 00
[ 523.462306] RSP: 0018:ffffaa53d4d87928 EFLAGS: 00010286
[ 523.462307] RAX: 0000000000000000 RBX: ffffaa53d4d87a90 RCX: ffffaa53d4d87bb0
[ 523.462308] RDX: ffff9e3d2ee6be80 RSI: ffffaa53d4d87a90 RDI: ffffffff920ed380
[ 523.462309] RBP: ffff9e3d2ee6be80 R08: 0000000000000064 R09: 0000000000000000
[ 523.462310] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000031
[ 523.462310] R13: 0000000000000020 R14: 0000000000000000 R15: ffff9e3d331054e0
[ 523.462311] FS: 00007f245517c1c0(0000) GS:ffff9e492ed80000(0000) knlGS:0000000000000000
[ 523.462313] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 523.462313] CR2: 000055e5dfdd8268 CR3: 00000003ef488000 CR4: 0000000000350ee0
[ 523.462315] Call Trace:
[ 523.462316] <TASK>
[ 523.462320] fib_table_delete+0x1a9/0x310
[ 523.462323] inet_rtm_delroute+0x93/0x110
[ 523.462325] rtnetlink_rcv_msg+0x133/0x370
[ 523.462327] ? _copy_to_iter+0xb5/0x6f0
[ 523.462330] ? rtnl_calcit.isra.0+0x110/0x110
[ 523.462331] netlink_rcv_skb+0x50/0xf0
[ 523.462334] netlink_unicast+0x211/0x330
[ 523.462336] netlink_sendmsg+0x23f/0x480
[ 523.462338] sock_sendmsg+0x5e/0x60
[ 523.462340] ____sys_sendmsg+0x22c/0x270
[ 523.462341] ? import_iovec+0x17/0x20
[ 523.462343] ? sendmsg_copy_msghdr+0x59/0x90
[ 523.462344] ? __mod_lruvec_page_state+0x85/0x110
[ 523.462348] ___sys_sendmsg+0x81/0xc0
[ 523.462350] ? netlink_seq_start+0x70/0x70
[ 523.462352] ? __dentry_kill+0x13a/0x180
[ 523.462354] ? __fput+0xff/0x250
[ 523.462356] __sys_sendmsg+0x49/0x80
[ 523.462358] do_syscall_64+0x3b/0x90
[ 523.462361] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 523.462364] RIP: 0033:0x7f24552aa337
[ 523.462365] Code: 0e 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b9 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10
[ 523.462366] RSP: 002b:00007fff7f05a838 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[ 523.462368] RAX: ffffffffffffffda RBX: 000000006245bf91 RCX: 00007f24552aa337
[ 523.462368] RDX: 0000000000000000 RSI: 00007fff7f05a8a0 RDI: 0000000000000003
[ 523.462369] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
[ 523.462370] R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000001
[ 523.462370] R13: 00007fff7f05ce08 R14: 0000000000000000 R15: 000055e5dfdd1040
[ 523.462373] </TASK>
[ 523.462374] ---[ end trace ba537bc16f6bf4ed ]---
[2] https://github.com/FRRouting/frr/issues/6412
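The matching rule described in this message (match without an nh spec or with the right nhid, fail for an old-style spec against a nexthop-object route) can be sketched as below. The structs and nh_match() are hypothetical reductions for illustration; the real fib_nh_match() compares more fields and uses different return conventions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical delete request: 0 means "not specified". */
struct del_req {
    unsigned int nhid;
    int oif;
};

/* Hypothetical route entry: nhid 0 means "no nexthop object". */
struct fib_entry {
    unsigned int nhid;
    int oif;
};

/* Returns 0 when the route matches the request, -ESRCH otherwise. */
static int nh_match(const struct del_req *req, const struct fib_entry *fi)
{
    assert(req != NULL && fi != NULL);

    if (req->nhid)  /* explicit nhid: compare ids only */
        return fi->nhid == req->nhid ? 0 : -ESRCH;

    if (fi->nhid)   /* nexthop-object route: an old-style spec never matches */
        return req->oif ? -ESRCH : 0;

    if (req->oif && req->oif != fi->oif)
        return -ESRCH;
    return 0;
}
```

For a route with nhid 12, a request without any nh spec and a request with nhid 12 both match, while a request that names only the device fails with -ESRCH, mirroring the three `ip r del` cases above.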
Fixes:
4c7e8084fd46 ("ipv4: Plumb support for nexthop object in a fib_info")
Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Randy Dunlap [Fri, 1 Apr 2022 05:42:44 +0000 (22:42 -0700)]
net: micrel: fix KS8851_MLL Kconfig
KS8851_MLL selects MICREL_PHY, which depends on PTP_1588_CLOCK_OPTIONAL,
so make KS8851_MLL also depend on PTP_1588_CLOCK_OPTIONAL since
'select' does not follow any dependency chains.
Fixes kconfig warning and build errors:
WARNING: unmet direct dependencies detected for MICREL_PHY
Depends on [m]: NETDEVICES [=y] && PHYLIB [=y] && PTP_1588_CLOCK_OPTIONAL [=m]
Selected by [y]:
- KS8851_MLL [=y] && NETDEVICES [=y] && ETHERNET [=y] && NET_VENDOR_MICREL [=y] && HAS_IOMEM [=y]
ld: drivers/net/phy/micrel.o: in function `lan8814_ts_info':
micrel.c:(.text+0xb35): undefined reference to `ptp_clock_index'
ld: drivers/net/phy/micrel.o: in function `lan8814_probe':
micrel.c:(.text+0x2586): undefined reference to `ptp_clock_register'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 1 Apr 2022 11:04:15 +0000 (12:04 +0100)]
Merge branch 'MCTP-fixes'
Matt Johnston says:
====================
MCTP fixes
The following are fixes for the mctp core and mctp-i2c driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Matt Johnston [Fri, 1 Apr 2022 02:48:44 +0000 (10:48 +0800)]
mctp: Use output netdev to allocate skb headroom
Previously the skb was allocated with headroom MCTP_HEADER_MAXLEN,
but that isn't sufficient if we are using devs that are not MCTP
specific.
This also adds a check that the smctp_halen provided to sendmsg for
extended addressing is the correct size for the netdev.
Fixes:
833ef3b91de6 ("mctp: Populate socket implementation")
Reported-by: Matthew Rinaldi <mjrinal@g.clemson.edu>
Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Matt Johnston [Fri, 1 Apr 2022 02:48:43 +0000 (10:48 +0800)]
mctp i2c: correct mctp_i2c_header_create result
header_ops.create should return the length of the header, but
mctp_i2c_header_create() returned 0 instead.
This didn't cause any problem because the MCTP stack accepted
0 as success.
Fixes:
f5b8abf9fc3d ("mctp i2c: MCTP I2C binding driver")
Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Matt Johnston [Fri, 1 Apr 2022 02:48:42 +0000 (10:48 +0800)]
mctp: Fix check for dev_hard_header() result
dev_hard_header() returns the length of the header, so
we need to test for negative errors rather than non-zero.
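A small sketch of the two checks, using a hypothetical build_header() that follows the same convention (header length on success, negative errno on failure). It is not the MCTP code.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in: returns the header length, or a negative errno. */
static int build_header(int hdr_len, int fail)
{
    assert(hdr_len >= 0);
    return fail ? -EINVAL : hdr_len;
}

/* Old check: a valid, non-zero header length is treated as a failure. */
static int failed_old(int rc)
{
    return rc != 0;
}

/* Fixed check: only negative values are errors. */
static int failed_new(int rc)
{
    return rc < 0;
}
```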
Fixes:
889b7da23abf ("mctp: Add initial routing framework")
Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 1 Apr 2022 11:01:38 +0000 (12:01 +0100)]
Merge branch 'ice-fixups'
Tony Nguyen says:
====================
ice-fixups
This series handles a handful of cleanups for the ice
driver. Ivan fixed a problem on the VSI during a release,
a MAC address setting issue, and broken IFF_ALLMULTI
handling.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ivan Vecera [Thu, 31 Mar 2022 16:20:08 +0000 (09:20 -0700)]
ice: Fix broken IFF_ALLMULTI handling
Handling of the all-multicast flag and the associated multicast
promiscuous mode is broken in the ice driver. When a user switches the
allmulticast flag on or off, the driver checks whether any VLANs are
configured over the interface (except default VLAN 0).
If any extra VLANs are registered it enables multicast promiscuous
mode for all these VLANs (including default VLAN 0) using
ICE_SW_LKUP_PROMISC_VLAN look-up type. In this situation all
multicast packets tagged with known VLAN ID or untagged are received
and multicast packets tagged with unknown VLAN ID ignored.
If no extra VLANs are registered (so only VLAN 0 exists) it enables
multicast promiscuous mode for VLAN 0 and uses ICE_SW_LKUP_PROMISC
look-up type. In this situation any multicast packets including
tagged ones are received.
The driver handles IFF_ALLMULTI in ice_vsi_sync_fltr() this way:
ice_vsi_sync_fltr() {
    ...
    if (changed_flags & IFF_ALLMULTI) {
        if (netdev->flags & IFF_ALLMULTI) {
            if (vsi->num_vlans > 1)
                ice_set_promisc(..., ICE_MCAST_VLAN_PROMISC_BITS);
            else
                ice_set_promisc(..., ICE_MCAST_PROMISC_BITS);
        } else {
            if (vsi->num_vlans > 1)
                ice_clear_promisc(..., ICE_MCAST_VLAN_PROMISC_BITS);
            else
                ice_clear_promisc(..., ICE_MCAST_PROMISC_BITS);
        }
    }
    ...
}
The code above depends on the value of vsi->num_vlan, which specifies the
number of VLANs configured over the interface (including VLAN 0), and
this is a problem because that value is modified in the NDO callbacks
ice_vlan_rx_add_vid() and ice_vlan_rx_kill_vid().
Scenario 1:
1. ip link set ens7f0 allmulticast on
2. ip link add vlan10 link ens7f0 type vlan id 10
3. ip link set ens7f0 allmulticast off
4. ip link set ens7f0 allmulticast on
[1] In this scenario IFF_ALLMULTI is enabled and the driver calls
ice_set_promisc(..., ICE_MCAST_PROMISC_BITS) that installs
multicast promisc rule with non-VLAN look-up type.
[2] Then VLAN with ID 10 is added and vsi->num_vlan incremented to 2
[3] Command switches IFF_ALLMULTI off and the driver calls
ice_clear_promisc(..., ICE_MCAST_VLAN_PROMISC_BITS) but this
call is effectively NOP because it looks for multicast promisc
rules for VLAN 0 and VLAN 10 with VLAN look-up type but no such
rules exist. So the all-multicast remains enabled silently
in hardware.
[4] Command tries to switch IFF_ALLMULTI on and the driver calls
ice_set_promisc(..., ICE_MCAST_PROMISC_BITS) but this call
fails (-EEXIST) because non-VLAN multicast promisc rule already
exists.
Scenario 2:
1. ip link add vlan10 link ens7f0 type vlan id 10
2. ip link set ens7f0 allmulticast on
3. ip link add vlan20 link ens7f0 type vlan id 20
4. ip link del vlan10 ; ip link del vlan20
5. ip link set ens7f0 allmulticast off
[1] VLAN with ID 10 is added and vsi->num_vlan==2
[2] Command switches IFF_ALLMULTI on and driver installs multicast
promisc rules with VLAN look-up type for VLAN 0 and 10
[3] VLAN with ID 20 is added and vsi->num_vlan==3 but no multicast
promisc rule is added for this new VLAN so the interface does
not receive MC packets from VLAN 20
[4] Both VLANs are removed but multicast rule for VLAN 10 remains
installed so interface receives multicast packets from VLAN 10
[5] Command switches IFF_ALLMULTI off and because vsi->num_vlan is 1
the driver tries to remove multicast promisc rule for VLAN 0
with non-VLAN look-up that does not exist.
All-multicast looks disabled from user point of view but it
is partially enabled in HW (interface receives all multicast
packets either untagged or tagged with VLAN ID 10)
To resolve these issues the patch introduces these changes:
1. Adds handling for IFF_ALLMULTI to ice_vlan_rx_add_vid() and
ice_vlan_rx_kill_vid() callbacks. So when VLAN is added/removed
and IFF_ALLMULTI is enabled an appropriate multicast promisc
rule for that VLAN ID is added/removed.
2. In ice_vlan_rx_add_vid() when first VLAN besides VLAN 0 is added
so (vsi->num_vlan == 2) and IFF_ALLMULTI is enabled then look-up
type for existing multicast promisc rule for VLAN 0 is updated
to ICE_MCAST_VLAN_PROMISC_BITS.
3. In ice_vlan_rx_kill_vid() when last VLAN besides VLAN 0 is removed
so (vsi->num_vlan == 1) and IFF_ALLMULTI is enabled then look-up
type for existing multicast promisc rule for VLAN 0 is updated
to ICE_MCAST_PROMISC_BITS.
4. Both ice_vlan_rx_{add,kill}_vid() have to run under ICE_CFG_BUSY
bit protection to avoid races with ice_vsi_sync_fltr() that runs
in ice_service_task() context.
5. Bit ICE_VSI_VLAN_FLTR_CHANGED is useless and can be removed.
6. Error messages added to ice_fltr_*_vsi_promisc() helper functions
to avoid them in their callers
7. Small improvements to increase readability
Fixes:
5eda8afd6bcc ("ice: Add support for PF/VF promiscuous mode")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Alice Michael <alice.michael@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ivan Vecera [Thu, 31 Mar 2022 16:20:07 +0000 (09:20 -0700)]
ice: Fix MAC address setting
Commit
2ccc1c1ccc671b ("ice: Remove excess error variables") merged
the usage of 'status' and 'err' variables into single one in
function ice_set_mac_address(). Unfortunately this causes
a regression when the call to ice_fltr_add_mac() returns -EEXIST,
because this return value does not indicate an error in this case,
yet the value of 'err' stays -EEXIST until the end of the function
and is returned to the caller.
Prior to the mentioned commit this did not happen, because the return
value of ice_fltr_add_mac() was stored in the 'status' variable first
and, if it was -EEXIST, 'err' remained zero.
Fix the problem by resetting 'err' to zero when ice_fltr_add_mac()
returns -EEXIST.
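The regression pattern can be reproduced in a few lines. This is a sketch under the assumption that every step after the filter add succeeds; add_mac_filter() and both set_mac_*() functions are hypothetical stand-ins, not the ice functions.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the filter add: -EEXIST if the address is present. */
static int add_mac_filter(int already_present)
{
    assert(already_present == 0 || already_present == 1);
    return already_present ? -EEXIST : 0;
}

/* Single 'err' variable without a reset: -EEXIST leaks out to the caller. */
static int set_mac_buggy(int already_present)
{
    int err = add_mac_filter(already_present);

    if (err && err != -EEXIST)
        return err;
    /* ... remaining steps assumed to succeed ... */
    return err;
}

/* Same flow with 'err' reset, since "already exists" is not a failure here. */
static int set_mac_fixed(int already_present)
{
    int err = add_mac_filter(already_present);

    if (err == -EEXIST)
        err = 0;
    if (err)
        return err;
    /* ... remaining steps assumed to succeed ... */
    return err;
}
```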
Fixes:
2ccc1c1ccc671b ("ice: Remove excess error variables")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Signed-off-by: Alice Michael <alice.michael@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ivan Vecera [Thu, 31 Mar 2022 16:20:06 +0000 (09:20 -0700)]
ice: Clear default forwarding VSI during VSI release
VSI is set as default forwarding one when promisc mode is set for
PF interface, when PF is switched to switchdev mode or when VF
driver asks to enable allmulticast or promisc mode for the VF
interface (when vf-true-promisc-support priv flag is off).
The third case is buggy because in that case VSI associated with
VF remains as default one after VF removal.
Reproducer:
1. Create VF
echo 1 > /sys/class/net/ens7f0/device/sriov_numvfs
2. Enable allmulticast or promisc mode on VF
ip link set ens7f0v0 allmulticast on
ip link set ens7f0v0 promisc on
3. Delete VF
echo 0 > /sys/class/net/ens7f0/device/sriov_numvfs
4. Try to enable promisc mode on PF
ip link set ens7f0 promisc on
Although it looks as if promisc mode is enabled on the PF, the opposite
is true, because ice_vsi_sync_fltr(), which is responsible for
IFF_PROMISC handling, first checks whether any other VSI is set as the
default forwarding one and, if so, does nothing. At this point
it is not possible to enable promisc mode on the PF without re-probing
the device.
To resolve the issue this patch clears the default forwarding VSI
during ice_vsi_release() when the VSI to be released is the default
one.
Fixes:
01b5e89aab49 ("ice: Add VF promiscuous support")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Alice Michael <alice.michael@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Thu, 31 Mar 2022 13:28:54 +0000 (16:28 +0300)]
Revert "net: dsa: stop updating master MTU from master.c"
This reverts commit
a1ff94c2973c43bc1e2677ac63ebb15b1d1ff846.
Switch drivers that don't implement ->port_change_mtu() will cause the
DSA master to remain with an MTU of 1500, since we've deleted the other
code path. In turn, this causes a regression for those systems, where
MTU-sized traffic can no longer be terminated.
Revert the change taking into account the fact that rtnl_lock() is now
taken top-level from the callers of dsa_master_setup() and
dsa_master_teardown(). Also add a comment in order for it to be
absolutely clear why it is still needed.
Fixes:
a1ff94c2973c ("net: dsa: stop updating master MTU from master.c")
Reported-by: Luiz Angelo Daros de Luca <luizluca@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Luiz Angelo Daros de Luca <luizluca@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jean-Philippe Brucker [Thu, 31 Mar 2022 10:24:41 +0000 (11:24 +0100)]
skbuff: fix coalescing for page_pool fragment recycling
Fix a use-after-free when using page_pool with page fragments. We
encountered this problem during normal RX in the hns3 driver:
(1) Initially we have three descriptors in the RX queue. The first one
allocates PAGE1 through page_pool, and the other two allocate one
half of PAGE2 each. Page references look like this:
RX_BD1 _______ PAGE1
RX_BD2 _______ PAGE2
RX_BD3 _________/
(2) Handle RX on the first descriptor. Allocate SKB1, eventually added
to the receive queue by tcp_queue_rcv().
(3) Handle RX on the second descriptor. Allocate SKB2 and pass it to
netif_receive_skb():
netif_receive_skb(SKB2)
ip_rcv(SKB2)
SKB3 = skb_clone(SKB2)
SKB2 and SKB3 share a reference to PAGE2 through
skb_shinfo()->dataref. The other ref to PAGE2 is still held by
RX_BD3:
      SKB2 ---+- PAGE2
      SKB3 __/   /
RX_BD3 _________/
(3b) Now while handling TCP, coalesce SKB3 with SKB1:
tcp_v4_rcv(SKB3)
tcp_try_coalesce(to=SKB1, from=SKB3) // succeeds
kfree_skb_partial(SKB3)
skb_release_data(SKB3) // drops one dataref
      SKB1 _____ PAGE1
           \____
      SKB2 _____ PAGE2
                 /
RX_BD3 _________/
In skb_try_coalesce(), __skb_frag_ref() takes a page reference to
PAGE2, where it should instead have increased the page_pool frag
reference, pp_frag_count. Without coalescing, when releasing both
SKB2 and SKB3, a single reference to PAGE2 would be dropped. Now
when releasing SKB1 and SKB2, two references to PAGE2 will be
dropped, resulting in underflow.
(3c) Drop SKB2:
af_packet_rcv(SKB2)
consume_skb(SKB2)
skb_release_data(SKB2) // drops second dataref
page_pool_return_skb_page(PAGE2) // drops one pp_frag_count
      SKB1 _____ PAGE1
           \____
                 PAGE2
                 /
RX_BD3 _________/
(4) Userspace calls recvmsg()
Copies SKB1 and releases it. Since SKB3 was coalesced with SKB1, we
release the SKB3 page as well:
tcp_eat_recv_skb(SKB1)
skb_release_data(SKB1)
page_pool_return_skb_page(PAGE1)
page_pool_return_skb_page(PAGE2) // drops second pp_frag_count
(5) PAGE2 is freed, but the third RX descriptor was still using it!
In our case this causes IOMMU faults, but it would silently corrupt
memory if the IOMMU was disabled.
Change the logic that checks whether pp_recycle SKBs can be coalesced.
We still reject differing pp_recycle between 'from' and 'to' SKBs, but
in order to avoid the situation described above, we also reject
coalescing when both 'from' and 'to' are pp_recycled and 'from' is
cloned.
The new logic allows coalescing a cloned pp_recycle SKB into a page
refcounted one, because in this case the release (4) will drop the right
reference, the one taken by skb_try_coalesce().
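The coalescing rule described above can be condensed into a single predicate. The following is a user-space sketch under stated assumptions: struct skb_model and pp_can_coalesce() are hypothetical names, and the one-line condition is one way to express the rule, not a quote of the kernel source.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two sk_buff properties the rule uses. */
struct skb_model {
    bool pp_recycle;  /* pages are owned by a page_pool */
    bool cloned;      /* data is shared with a clone */
};

/* Returns true when coalescing 'from' into 'to' keeps the reference
 * counts consistent under the rule described above. */
static bool pp_can_coalesce(const struct skb_model *to,
                            const struct skb_model *from)
{
    assert(to && from);
    /* 'from' keeps page_pool semantics only if it is pp_recycle and
     * not cloned; a cloned pp_recycle SKB is treated as page
     * refcounted, because coalescing takes full page references. */
    bool from_recyclable = from->pp_recycle && !from->cloned;

    return to->pp_recycle == from_recyclable;
}
```

With this form, two pp_recycle SKBs are rejected when 'from' is cloned, while a cloned pp_recycle 'from' may still be merged into a page refcounted 'to'.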
Fixes:
53e0961da1c7 ("page_pool: add frag page recycling support in page pool")
Suggested-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eyal Birger [Thu, 31 Mar 2022 07:26:43 +0000 (10:26 +0300)]
vrf: fix packet sniffing for traffic originating from ip tunnels
In commit 048939088220 ("vrf: add mac header for tunneled packets when sniffer is attached")
an Ethernet header was cooked for traffic originating from tunnel devices.
However, the header is added based on whether the mac_header is unset
and ignores cases where the device doesn't expose a mac header to upper
layers, such as in ip tunnels like ipip and gre.
Traffic originating from such devices still appears garbled when capturing
on the vrf device.
Fix by observing whether the original device exposes a header to upper
layers, similar to the logic done in af_packet.
In addition, skb->mac_len needs to be adjusted after adding the Ethernet
header for the skb_push/pull() surrounding dev_queue_xmit_nit() to work
on these packets.
Fixes:
048939088220 ("vrf: add mac header for tunneled packets when sniffer is attached")
Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ziyang Xuan [Thu, 31 Mar 2022 07:04:28 +0000 (15:04 +0800)]
net/tls: fix slab-out-of-bounds bug in decrypt_internal
The memory allocated for tls_ctx->rx.iv for AES128-CCM is 12 bytes, as set
in tls_set_sw_offload(). The return value of crypto_aead_ivsize()
for "ccm(aes)" is 16. So a memcpy() that reads 16 bytes from the 12-byte
buffer triggers a slab-out-of-bounds bug, as follows:
==================================================================
BUG: KASAN: slab-out-of-bounds in decrypt_internal+0x385/0xc40 [tls]
Read of size 16 at addr ffff888114e84e60 by task tls/10911
Call Trace:
<TASK>
dump_stack_lvl+0x34/0x44
print_report.cold+0x5e/0x5db
? decrypt_internal+0x385/0xc40 [tls]
kasan_report+0xab/0x120
? decrypt_internal+0x385/0xc40 [tls]
kasan_check_range+0xf9/0x1e0
memcpy+0x20/0x60
decrypt_internal+0x385/0xc40 [tls]
? tls_get_rec+0x2e0/0x2e0 [tls]
? process_rx_list+0x1a5/0x420 [tls]
? tls_setup_from_iter.constprop.0+0x2e0/0x2e0 [tls]
decrypt_skb_update+0x9d/0x400 [tls]
tls_sw_recvmsg+0x3c8/0xb50 [tls]
Allocated by task 10911:
kasan_save_stack+0x1e/0x40
__kasan_kmalloc+0x81/0xa0
tls_set_sw_offload+0x2eb/0xa20 [tls]
tls_setsockopt+0x68c/0x700 [tls]
__sys_setsockopt+0xfe/0x1b0
Replace crypto_aead_ivsize() with prot->iv_size + prot->salt_size as the
memcpy() length for the IV in the TLS_1_3_VERSION case.
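The size mismatch can be reproduced in miniature in user space. This is an illustrative sketch, not the kernel code: struct prot_model and copy_iv() are hypothetical, and the 8 + 4 split of the 12 bytes used in the usage note is an assumption. Only the totals of 12 and 16 come from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Sizes from the description above: the IV buffer holds 12 bytes,
 * while the AEAD reports a 16-byte IV. */
enum { RX_IV_ALLOC = 12, AEAD_IVSIZE = 16 };

/* Hypothetical stand-in for the per-protocol size fields. */
struct prot_model { size_t iv_size; size_t salt_size; };

/* Copy the IV using the protocol's own sizes so that the read stays
 * inside the 12-byte source buffer. Returns the bytes copied. */
static size_t copy_iv(unsigned char *dst, size_t dst_len,
                      const unsigned char *rx_iv,
                      const struct prot_model *prot)
{
    size_t n = prot->iv_size + prot->salt_size;

    assert(n <= RX_IV_ALLOC && n <= dst_len);
    memcpy(dst, rx_iv, n);
    return n;
}
```

Calling copy_iv() with prot = { 8, 4 } copies 12 bytes; using AEAD_IVSIZE as the length instead would read 4 bytes past the end of the source buffer.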
Fixes:
f295b3ae9f59 ("net/tls: Add support of AES128-CCM based ciphers")
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Taehee Yoo [Wed, 30 Mar 2022 16:37:03 +0000 (16:37 +0000)]
net: sfc: add missing xdp queue reinitialization
After the rx/tx ring buffer size is changed, a kernel panic occurs when
the driver performs XDP_TX or XDP_REDIRECT.
When the tx/rx ring buffer size is changed (ethtool -G), the sfc driver
reallocates and reinitializes the rx and tx queues and their buffers
(tx_queue->buffer).
But it does not reinitialize the xdp queues (efx->xdp_tx_queues).
So, while performing XDP_TX or XDP_REDIRECT, it uses the uninitialized
tx_queue->buffer.
A new function, efx_set_xdp_channels(), is split out of efx_set_channels()
to handle only the xdp queues.
Splat looks like:
BUG: kernel NULL pointer dereference, address: 000000000000002a
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
PGD 0 P4D 0
Oops: 0002 [#4] PREEMPT SMP NOPTI
RIP: 0010:efx_tx_map_chunk+0x54/0x90 [sfc]
CPU: 2 PID: 0 Comm: swapper/2 Tainted: G D 5.17.0+ #55 e8beeee8289528f11357029357cf
Code: 48 8b 8d a8 01 00 00 48 8d 14 52 4c 8d 2c d0 44 89 e0 48 85 c9 74 0e 44 89 e2 4c 89 f6 48 80
RSP: 0018:ffff92f121e45c60 EFLAGS: 00010297
RIP: 0010:efx_tx_map_chunk+0x54/0x90 [sfc]
RAX: 0000000000000040 RBX: ffff92ea506895c0 RCX: ffffffffc0330870
RDX: 0000000000000001 RSI: 00000001139b10ce RDI: ffff92ea506895c0
RBP: ffffffffc0358a80 R08: 00000001139b110d R09: 0000000000000000
R10: 0000000000000001 R11: ffff92ea414c0088 R12: 0000000000000040
R13: 0000000000000018 R14: 00000001139b10ce R15: ffff92ea506895c0
FS: 0000000000000000(0000) GS:ffff92f121ec0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Code: 48 8b 8d a8 01 00 00 48 8d 14 52 4c 8d 2c d0 44 89 e0 48 85 c9 74 0e 44 89 e2 4c 89 f6 48 80
CR2: 000000000000002a CR3: 00000003e6810004 CR4: 00000000007706e0
RSP: 0018:ffff92f121e85c60 EFLAGS: 00010297
PKRU: 55555554
RAX: 0000000000000040 RBX: ffff92ea50689700 RCX: ffffffffc0330870
RDX: 0000000000000001 RSI: 00000001145a90ce RDI: ffff92ea50689700
RBP: ffffffffc0358a80 R08: 00000001145a910d R09: 0000000000000000
R10: 0000000000000001 R11: ffff92ea414c0088 R12: 0000000000000040
R13: 0000000000000018 R14: 00000001145a90ce R15: ffff92ea50689700
FS: 0000000000000000(0000) GS:ffff92f121e80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000000002a CR3: 00000003e6810005 CR4: 00000000007706e0
PKRU: 55555554
Call Trace:
<IRQ>
efx_xdp_tx_buffers+0x12b/0x3d0 [sfc 84c94b8e32d44d296c17e10a634d3ad454de4ba5]
__efx_rx_packet+0x5c3/0x930 [sfc 84c94b8e32d44d296c17e10a634d3ad454de4ba5]
efx_rx_packet+0x28c/0x2e0 [sfc 84c94b8e32d44d296c17e10a634d3ad454de4ba5]
efx_ef10_ev_process+0x5f8/0xf40 [sfc 84c94b8e32d44d296c17e10a634d3ad454de4ba5]
? enqueue_task_fair+0x95/0x550
efx_poll+0xc4/0x360 [sfc 84c94b8e32d44d296c17e10a634d3ad454de4ba5]
Fixes:
3990a8fffbda ("sfc: allocate channels for XDP tx queues")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Thu, 31 Mar 2022 18:23:31 +0000 (11:23 -0700)]
Merge tag 'net-5.18-rc1' of git://git./linux/kernel/git/netdev/net
Pull more networking updates from Jakub Kicinski:
"Networking fixes and rethook patches.
Features:
- kprobes: rethook: x86: replace kretprobe trampoline with rethook
Current release - regressions:
- sfc: avoid null-deref on systems without NUMA awareness in the new
queue sizing code
Current release - new code bugs:
- vxlan: do not feed vxlan_vnifilter_dump_dev with non-vxlan devices
- eth: lan966x: fix null-deref on PHY pointer in timestamp ioctl when
interface is down
Previous releases - always broken:
- openvswitch: correct neighbor discovery target mask field in the
flow dump
- wireguard: ignore v6 endpoints when ipv6 is disabled and fix a leak
- rxrpc: fix call timer start racing with call destruction
- rxrpc: fix null-deref when security type is rxrpc_no_security
- can: fix UAF bugs around echo skbs in multiple drivers
Misc:
- docs: move netdev-FAQ to the 'process' section of the
documentation"
* tag 'net-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (57 commits)
vxlan: do not feed vxlan_vnifilter_dump_dev with non vxlan devices
openvswitch: Add recirc_id to recirc warning
rxrpc: fix some null-ptr-deref bugs in server_key.c
rxrpc: Fix call timer start racing with call destruction
net: hns3: fix software vlan talbe of vlan 0 inconsistent with hardware
net: hns3: fix the concurrency between functions reading debugfs
docs: netdev: move the netdev-FAQ to the process pages
docs: netdev: broaden the new vs old code formatting guidelines
docs: netdev: call out the merge window in tag checking
docs: netdev: add missing back ticks
docs: netdev: make the testing requirement more stringent
docs: netdev: add a question about re-posting frequency
docs: netdev: rephrase the 'should I update patchwork' question
docs: netdev: rephrase the 'Under review' question
docs: netdev: shorten the name and mention msgid for patch status
docs: netdev: note that RFC postings are allowed any time
docs: netdev: turn the net-next closed into a Warning
docs: netdev: move the patch marking section up
docs: netdev: minor reword
docs: netdev: replace references to old archives
...
Linus Torvalds [Thu, 31 Mar 2022 18:17:39 +0000 (11:17 -0700)]
Merge tag 'v5.18-p1' of git://git./linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
- Missing Kconfig dependency on arm that leads to boot failure
- x86 SLS fixes
- Reference leak in the stm32 driver
* tag 'v5.18-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: x86/sm3 - Fixup SLS
crypto: x86/poly1305 - Fixup SLS
crypto: x86/chacha20 - Avoid spurious jumps to other functions
crypto: stm32 - fix reference leak in stm32_crc_remove
crypto: arm/aes-neonbs-cbc - Select generic cbc and aes
Eric Dumazet [Wed, 30 Mar 2022 19:46:43 +0000 (12:46 -0700)]
vxlan: do not feed vxlan_vnifilter_dump_dev with non vxlan devices
vxlan_vnifilter_dump_dev() assumes it is called only
for vxlan devices. Make sure it is the case.
BUG: KASAN: slab-out-of-bounds in vxlan_vnifilter_dump_dev+0x9a0/0xb40 drivers/net/vxlan/vxlan_vnifilter.c:349
Read of size 4 at addr ffff888060d1ce70 by task syz-executor.3/17662
CPU: 0 PID: 17662 Comm: syz-executor.3 Tainted: G W 5.17.0-syzkaller-12888-g77c9387c0c5b #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
print_address_description.constprop.0.cold+0xeb/0x495 mm/kasan/report.c:313
print_report mm/kasan/report.c:429 [inline]
kasan_report.cold+0xf4/0x1c6 mm/kasan/report.c:491
vxlan_vnifilter_dump_dev+0x9a0/0xb40 drivers/net/vxlan/vxlan_vnifilter.c:349
vxlan_vnifilter_dump+0x3ff/0x650 drivers/net/vxlan/vxlan_vnifilter.c:428
netlink_dump+0x4b5/0xb70 net/netlink/af_netlink.c:2270
__netlink_dump_start+0x647/0x900 net/netlink/af_netlink.c:2375
netlink_dump_start include/linux/netlink.h:245 [inline]
rtnetlink_rcv_msg+0x70c/0xb80 net/core/rtnetlink.c:5953
netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2496
netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline]
netlink_unicast+0x543/0x7f0 net/netlink/af_netlink.c:1345
netlink_sendmsg+0x904/0xe00 net/netlink/af_netlink.c:1921
sock_sendmsg_nosec net/socket.c:705 [inline]
sock_sendmsg+0xcf/0x120 net/socket.c:725
____sys_sendmsg+0x6e2/0x800 net/socket.c:2413
___sys_sendmsg+0xf3/0x170 net/socket.c:2467
__sys_sendmsg+0xe5/0x1b0 net/socket.c:2496
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0x80 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f87b8e89049
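The kind of guard this calls for can be sketched as follows. This is a hypothetical user-space model: struct netdev_model, its 'kind' string and vxlan_priv_or_null() are assumptions for illustration and do not show how the kernel identifies a vxlan device.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical device model; 'kind' stands in for however the real
 * code identifies the link type. */
struct netdev_model { const char *kind; void *priv; };
struct vxlan_priv { unsigned int vni_count; };

/* Returns the vxlan private data, or NULL when the device is not a
 * vxlan device, so callers never interpret foreign private data. */
static struct vxlan_priv *vxlan_priv_or_null(const struct netdev_model *dev)
{
    assert(dev);
    if (!dev->kind || strcmp(dev->kind, "vxlan") != 0)
        return NULL;
    return dev->priv;
}
```

A dump loop built on this helper would skip every device for which it returns NULL instead of reading past the end of an unrelated private area.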
Fixes:
f9c4bb0b245c ("vxlan: vni filtering support on collect metadata device")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Acked-by: Roopa Prabhu <roopa@nvidia.com>
Link: https://lore.kernel.org/r/20220330194643.2706132-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stéphane Graber [Wed, 30 Mar 2022 19:42:45 +0000 (15:42 -0400)]
openvswitch: Add recirc_id to recirc warning
When hitting the recirculation limit, the kernel would currently log
something like this:
[ 58.586597] openvswitch: ovs-system: deferred action limit reached, drop recirc action
Which isn't all that useful to debug as we only have the interface name
to go on but can't track it down to a specific flow.
With this change, we now instead get:
[ 58.586597] openvswitch: ovs-system: deferred action limit reached, drop recirc action (recirc_id=0x9e)
Which can now be correlated with the flow entries from OVS.
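As a rough illustration of the new message format, the sketch below builds the same text in user space. format_recirc_warning() is a hypothetical helper, and the "openvswitch: " prefix and timestamp shown in the log above are left out because the logging infrastructure would add them.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>   /* strcmp(), used when checking the result */

/* Format the warning with the recirculation ID appended in hex.
 * Returns the value of snprintf(). */
static int format_recirc_warning(char *buf, size_t len,
                                 const char *dev, unsigned int recirc_id)
{
    assert(buf && dev);
    return snprintf(buf, len,
                    "%s: deferred action limit reached, "
                    "drop recirc action (recirc_id=%#x)",
                    dev, recirc_id);
}
```

Called with "ovs-system" and 0x9e, the helper produces the message body shown in the second log line above.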
Suggested-by: Frode Nordahl <frode.nordahl@canonical.com>
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Tested-by: Stephane Graber <stgraber@ubuntu.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Link: https://lore.kernel.org/r/20220330194244.3476544-1-stgraber@ubuntu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 31 Mar 2022 15:36:17 +0000 (08:36 -0700)]
Merge tag 'linux-can-fixes-for-5.18-20220331' of git://git./linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2022-03-31
The first patch is by Oliver Hartkopp and fixes MSG_PEEK feature in
the CAN ISOTP protocol (broken in net-next for v5.18 only).
Tom Rix's patch for the mcp251xfd driver fixes the propagation of an
error value in case of an error.
A patch by me for the m_can driver fixes a use-after-free in the xmit
handler for m_can IP cores v3.0.x.
Hangyu Hua contributes 3 patches fixing the same double free in the
error path of the xmit handler in the ems_usb, usb_8dev and mcba_usb
USB CAN drivers.
Pavel Skripkin contributes a patch for the mcba_usb driver to properly
check the endpoint type.
The last patch is by me and fixes a mem leak in the gs_usb, which was
introduced in net-next for v5.18.
* tag 'linux-can-fixes-for-5.18-20220331' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
can: gs_usb: gs_make_candev(): fix memory leak for devices with extended bit timing configuration
can: mcba_usb: properly check endpoint type
can: mcba_usb: mcba_usb_start_xmit(): fix double dev_kfree_skb in error path
can: usb_8dev: usb_8dev_start_xmit(): fix double dev_kfree_skb() in error path
can: ems_usb: ems_usb_start_xmit(): fix double dev_kfree_skb() in error path
can: m_can: m_can_tx_handler(): fix use after free of skb
can: mcp251xfd: mcp251xfd_register_get_dev_id(): fix return of error value
can: isotp: restore accidentally removed MSG_PEEK feature
====================
Link: https://lore.kernel.org/r/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Xiaolong Huang [Wed, 30 Mar 2022 14:22:14 +0000 (15:22 +0100)]
rxrpc: fix some null-ptr-deref bugs in server_key.c
Some hooks are not implemented in rxrpc_no_security, namely
preparse_server_key, free_preparse_server_key and destroy_server_key.
When the rxrpc security type is rxrpc_no_security, a user can easily trigger a
null-ptr-deref bug via ioctl, so checks should be added to prevent it.
The crash log:
user@syzkaller:~$ ./rxrpc_preparse_s
[ 37.956878][T15626] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 37.957645][T15626] #PF: supervisor instruction fetch in kernel mode
[ 37.958229][T15626] #PF: error_code(0x0010) - not-present page
[ 37.958762][T15626] PGD 4aadf067 P4D 4aadf067 PUD 4aade067 PMD 0
[ 37.959321][T15626] Oops: 0010 [#1] PREEMPT SMP
[ 37.959739][T15626] CPU: 0 PID: 15626 Comm: rxrpc_preparse_ Not tainted 5.17.0-01442-gb47d5a4f6b8d #43
[ 37.960588][T15626] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1 04/01/2014
[ 37.961474][T15626] RIP: 0010:0x0
[ 37.961787][T15626] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
[ 37.962480][T15626] RSP: 0018:ffffc9000d9abdc0 EFLAGS: 00010286
[ 37.963018][T15626] RAX: ffffffff84335200 RBX: ffff888012a1ce80 RCX: 0000000000000000
[ 37.963727][T15626] RDX: 0000000000000000 RSI: ffffffff84a736dc RDI: ffffc9000d9abe48
[ 37.964425][T15626] RBP: ffffc9000d9abe48 R08: 0000000000000000 R09: 0000000000000002
[ 37.965118][T15626] R10: 000000000000000a R11: f000000000000000 R12: ffff888013145680
[ 37.965836][T15626] R13: 0000000000000000 R14: ffffffffffffffec R15: ffff8880432aba80
[ 37.966441][T15626] FS: 00007f2177907700(0000) GS:ffff88803ec00000(0000) knlGS:0000000000000000
[ 37.966979][T15626] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 37.967384][T15626] CR2: ffffffffffffffd6 CR3: 000000004aaf1000 CR4: 00000000000006f0
[ 37.967864][T15626] Call Trace:
[ 37.968062][T15626] <TASK>
[ 37.968240][T15626] rxrpc_preparse_s+0x59/0x90
[ 37.968541][T15626] key_create_or_update+0x174/0x510
[ 37.968863][T15626] __x64_sys_add_key+0x139/0x1d0
[ 37.969165][T15626] do_syscall_64+0x35/0xb0
[ 37.969451][T15626] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 37.969824][T15626] RIP: 0033:0x43a1f9
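The missing check amounts to testing an optional hook before calling it. The sketch below is a minimal user-space model: struct sec_ops, preparse_checked() and example_hook() are hypothetical, and -EINVAL is an assumed error code for the missing-hook case.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical security-class table; optional hooks may be NULL, as
 * they are for the no-security class described above. */
struct sec_ops {
    int (*preparse_server_key)(void *prep);
};

/* Example hook used only to show the non-NULL path. */
static int example_hook(void *prep)
{
    (void)prep;
    return 0;
}

/* Call the hook only when the security class provides one, instead
 * of jumping through a NULL function pointer. */
static int preparse_checked(const struct sec_ops *sec, void *prep)
{
    assert(sec);
    if (!sec->preparse_server_key)
        return -EINVAL;   /* assumed error code for this sketch */
    return sec->preparse_server_key(prep);
}
```

The "RIP: 0010:0x0" line in the log above is what the unchecked call looks like: the CPU fetches its next instruction from address 0.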
Signed-off-by: Xiaolong Huang <butterflyhuangxx@gmail.com>
Tested-by: Xiaolong Huang <butterflyhuangxx@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Link: http://lists.infradead.org/pipermail/linux-afs/2022-March/005069.html
Fixes:
12da59fcab5a ("rxrpc: Hand server key parsing off to the security class")
Link: https://lore.kernel.org/r/164865013439.2941502.8966285221215590921.stgit@warthog.procyon.org.uk
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
David Howells [Wed, 30 Mar 2022 14:39:16 +0000 (15:39 +0100)]
rxrpc: Fix call timer start racing with call destruction
The rxrpc_call struct has a timer used to handle various timed events
relating to a call. This timer can get started from the packet input
routines that are run in softirq mode with just the RCU read lock held.
Unfortunately, because only the RCU read lock is held - and neither a ref nor
any other lock is taken - the call can start getting destroyed at the same time
a packet comes in addressed to that call. This causes the timer - which
was already stopped - to get restarted. Later, the timer dispatch code may
then oops if the timer got deallocated first.
Fix this by trying to take a ref on the rxrpc_call struct and, if
successful, passing that ref along to the timer. If the timer was already
running, the ref is discarded.
The timer completion routine can then pass the ref along to the call's work
item when it queues it. If the timer or work item were already
queued/running, the extra ref is discarded.
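The ref-passing scheme can be sketched with C11 atomics. This is a simplified user-space model, not the rxrpc code: struct call_model and the helpers are hypothetical, and a boolean stands in for the pending state of a real timer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct call_model {
    atomic_int usage;     /* reference count; 0 means teardown has begun */
    bool timer_pending;   /* stand-in for a real timer's pending state */
};

/* Take a reference unless the count has already dropped to zero. */
static bool call_try_get(struct call_model *call)
{
    int old = atomic_load(&call->usage);

    while (old > 0) {
        /* On failure 'old' is reloaded and the loop re-checks it. */
        if (atomic_compare_exchange_weak(&call->usage, &old, old + 1))
            return true;
    }
    return false;
}

static void call_put(struct call_model *call)
{
    atomic_fetch_sub(&call->usage, 1);
}

/* Arm the timer only while holding a ref. An idle timer keeps the
 * ref; if the timer was already pending, the extra ref is dropped. */
static bool call_set_timer(struct call_model *call)
{
    assert(call);
    if (!call_try_get(call))
        return false;     /* call is being destroyed: do not restart */
    if (call->timer_pending)
        call_put(call);
    else
        call->timer_pending = true;
    return true;
}
```

The point of the design is that a call whose count has reached zero can no longer have its timer restarted, because call_try_get() fails before the timer is touched.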
Fixes:
a158bdd3247b ("rxrpc: Fix call timeouts")
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
Tested-by: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Link: http://lists.infradead.org/pipermail/linux-afs/2022-March/005073.html
Link: https://lore.kernel.org/r/164865115696.2943015.11097991776647323586.stgit@warthog.procyon.org.uk
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Paolo Abeni [Thu, 31 Mar 2022 09:40:02 +0000 (11:40 +0200)]
Merge branch 'net-hns3-add-two-fixes-for-net'
Guangbin Huang says:
====================
net: hns3: add two fixes for -net
This series adds two fixes for the HNS3 ethernet driver.
====================
Link: https://lore.kernel.org/r/20220330134506.36635-1-huangguangbin2@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Guangbin Huang [Wed, 30 Mar 2022 13:45:06 +0000 (21:45 +0800)]
net: hns3: fix software vlan talbe of vlan 0 inconsistent with hardware
When the user deletes vlan 0, the driver does not delete vlan 0 from hardware
in function hclge_set_vlan_filter_hw(), so vlan 0 should not be deleted from
the software vlan table either.
Fixes:
fe4144d47eef ("net: hns3: sync VLAN filter entries when kill VLAN ID failed")
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Yufeng Mo [Wed, 30 Mar 2022 13:45:05 +0000 (21:45 +0800)]
net: hns3: fix the concurrency between functions reading debugfs
Currently, the debugfs mechanism is that all functions share a
global variable to save the pointer for obtaining data. When
different functions concurrently access the same file node,
repeated release exceptions occur. Therefore, the granularity
of the pointer for storing the obtained data is adjusted to be
private for each function.
Fixes:
5e69ea7ee2a6 ("net: hns3: refactor the debugfs process")
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Paolo Abeni [Thu, 31 Mar 2022 08:49:42 +0000 (10:49 +0200)]
Merge branch 'docs-update-and-move-the-netdev-faq'
Jakub Kicinski says:
====================
docs: update and move the netdev-FAQ
A section of documentation for tree-specific process quirks had
been created a while back. There's only one tree in it, so far,
the tip tree, but the contents seem to answer similar questions
as we answer in the netdev-FAQ. Move the netdev-FAQ.
Take this opportunity to touch up and update a few sections.
v3: remove some confrontational? language from patch 7
v2: remove non-git in patch 3
add patch 5
====================
Link: https://lore.kernel.org/r/20220330042505.2902770-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jakub Kicinski [Wed, 30 Mar 2022 04:25:05 +0000 (21:25 -0700)]
docs: netdev: move the netdev-FAQ to the process pages
The documentation for the tip tree is really in quite a similar
spirit to the netdev-FAQ. Move the netdev-FAQ to the process docs
as well.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jakub Kicinski [Wed, 30 Mar 2022 04:25:04 +0000 (21:25 -0700)]
docs: netdev: broaden the new vs old code formatting guidelines
Convert the "should I use new or old comment formatting" to cover
all formatting. This makes the question itself shorter.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jakub Kicinski [Wed, 30 Mar 2022 04:25:03 +0000 (21:25 -0700)]
docs: netdev: call out the merge window in tag checking
Add the most important case to the question about "where are we
in the cycle" - the case of net-next being closed.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jakub Kicinski [Wed, 30 Mar 2022 04:25:02 +0000 (21:25 -0700)]
docs: netdev: add missing back ticks
I think double back ticks are more correct. Add where they are missing.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jakub Kicinski [Wed, 30 Mar 2022 04:25:01 +0000 (21:25 -0700)]
docs: netdev: make the testing requirement more stringent
These days we often ask for selftests so let's update our
testing requirements.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jakub Kicinski [Wed, 30 Mar 2022 04:25:00 +0000 (21:25 -0700)]
docs: netdev: add a question about re-posting frequency
We have to tell people to stop reposting too often lately,
or not to repost while the discussion is ongoing.
Document this.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jakub Kicinski [Wed, 30 Mar 2022 04:24:59 +0000 (21:24 -0700)]
docs: netdev: rephrase the 'should I update patchwork' question
Make the question shorter and adjust the start of the answer accordingly.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jakub Kicinski [Wed, 30 Mar 2022 04:24:58 +0000 (21:24 -0700)]
docs: netdev: rephrase the 'Under review' question
The semantics of "Under review" have shifted. Reword the question
about it a bit and focus it on the response time.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jakub Kicinski [Wed, 30 Mar 2022 04:24:57 +0000 (21:24 -0700)]
docs: netdev: shorten the name and mention msgid for patch status
Cut down the length of the question so it renders better in docs.
Mention that Message-ID can be used to search patchwork.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jakub Kicinski [Wed, 30 Mar 2022 04:24:56 +0000 (21:24 -0700)]
docs: netdev: note that RFC postings are allowed any time
Document that RFCs are allowed during the merge window.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jakub Kicinski [Wed, 30 Mar 2022 04:24:55 +0000 (21:24 -0700)]
docs: netdev: turn the net-next closed into a Warning
Use the sphinx Warning box to make the net-next being closed
stand out more.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jakub Kicinski [Wed, 30 Mar 2022 04:24:54 +0000 (21:24 -0700)]
docs: netdev: move the patch marking section up
We want people to mark their patches with net and net-next in the subject.
Many miss doing that. Move the FAQ section which points that out up, and
place it after the section which enumerates the trees, that seems like
a pretty logical place for it. Since the two sections are together we
can remove a little bit (not too much) of the repetition.
v2: also remove the text for non-git setups, we want people to use git.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jakub Kicinski [Wed, 30 Mar 2022 04:24:53 +0000 (21:24 -0700)]
docs: netdev: minor reword
that -> those
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Jakub Kicinski [Wed, 30 Mar 2022 04:24:52 +0000 (21:24 -0700)]
docs: netdev: replace references to old archives
Most people use (or should use) lore at this point.
Replace the pointers to older archiving systems.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Marc Kleine-Budde [Tue, 29 Mar 2022 19:29:43 +0000 (21:29 +0200)]
can: gs_usb: gs_make_candev(): fix memory leak for devices with extended bit timing configuration
Some CAN-FD capable devices offer extended bit timing information for
the data bit timing. The information must be read with a USB control
message. The memory for this message is allocated but not free()ed (in
the non-error case). This patch adds the missing free.
Fixes:
6679f4c5e5a6 ("can: gs_usb: add extended bt_const feature")
Link: https://lore.kernel.org/all/20220329193450.659726-1-mkl@pengutronix.de
Reported-by: syzbot+4d0ae90a195b269f102d@syzkaller.appspotmail.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Pavel Skripkin [Sun, 13 Mar 2022 10:09:03 +0000 (13:09 +0300)]
can: mcba_usb: properly check endpoint type
Syzbot reported a warning in usb_submit_urb() which is caused by a wrong
endpoint type. We should check that the IN endpoint is actually present to
prevent this warning.
Found pipes are now saved to struct mcba_priv and code uses them
directly instead of making pipes in place.
Fail log:
| usb 5-1: BOGUS urb xfer, pipe 3 != type 1
| WARNING: CPU: 1 PID: 49 at drivers/usb/core/urb.c:502 usb_submit_urb+0xed2/0x18a0 drivers/usb/core/urb.c:502
| Modules linked in:
| CPU: 1 PID: 49 Comm: kworker/1:2 Not tainted 5.17.0-rc6-syzkaller-00184-g38f80f42147f #0
| Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
| Workqueue: usb_hub_wq hub_event
| RIP: 0010:usb_submit_urb+0xed2/0x18a0 drivers/usb/core/urb.c:502
| ...
| Call Trace:
| <TASK>
| mcba_usb_start drivers/net/can/usb/mcba_usb.c:662 [inline]
| mcba_usb_probe+0x8a3/0xc50 drivers/net/can/usb/mcba_usb.c:858
| usb_probe_interface+0x315/0x7f0 drivers/usb/core/driver.c:396
| call_driver_probe drivers/base/dd.c:517 [inline]
Fixes:
51f3baad7de9 ("can: mcba_usb: Add support for Microchip CAN BUS Analyzer")
Link: https://lore.kernel.org/all/20220313100903.10868-1-paskripkin@gmail.com
Reported-and-tested-by: syzbot+3bc1dce0cc0052d60fde@syzkaller.appspotmail.com
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Hangyu Hua [Fri, 11 Mar 2022 08:02:08 +0000 (16:02 +0800)]
can: mcba_usb: mcba_usb_start_xmit(): fix double dev_kfree_skb in error path
There is no need to call dev_kfree_skb() when usb_submit_urb() fails
because can_put_echo_skb() deletes the original skb and
can_free_echo_skb() deletes the cloned skb.
Fixes:
51f3baad7de9 ("can: mcba_usb: Add support for Microchip CAN BUS Analyzer")
Link: https://lore.kernel.org/all/20220311080208.45047-1-hbh25y@gmail.com
Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Hangyu Hua [Fri, 11 Mar 2022 08:06:14 +0000 (16:06 +0800)]
can: usb_8dev: usb_8dev_start_xmit(): fix double dev_kfree_skb() in error path
There is no need to call dev_kfree_skb() when usb_submit_urb() fails
because can_put_echo_skb() deletes the original skb and
can_free_echo_skb() deletes the cloned skb.
Fixes:
0024d8ad1639 ("can: usb_8dev: Add support for USB2CAN interface from 8 devices")
Link: https://lore.kernel.org/all/20220311080614.45229-1-hbh25y@gmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Hangyu Hua [Mon, 28 Feb 2022 08:36:39 +0000 (16:36 +0800)]
can: ems_usb: ems_usb_start_xmit(): fix double dev_kfree_skb() in error path
There is no need to call dev_kfree_skb() when usb_submit_urb() fails
because can_put_echo_skb() deletes the original skb and
can_free_echo_skb() deletes the cloned skb.
Link: https://lore.kernel.org/all/20220228083639.38183-1-hbh25y@gmail.com
Fixes:
702171adeed3 ("ems_usb: Added support for EMS CPC-USB/ARM7 CAN/USB interface")
Cc: stable@vger.kernel.org
Cc: Sebastian Haas <haas@ems-wuensche.com>
Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Marc Kleine-Budde [Thu, 17 Mar 2022 07:57:35 +0000 (08:57 +0100)]
can: m_can: m_can_tx_handler(): fix use after free of skb
can_put_echo_skb() will clone skb then free the skb. Move the
can_put_echo_skb() for the m_can version 3.0.x directly before the
start of the xmit in hardware, similar to the 3.1.x branch.
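
The reordering can be sketched in userspace as follows. This is a hedged
illustration, not the driver code: the toy_* and tx_* names are hypothetical,
and a "valid" flag stands in for the buffer's lifetime so that the late access
is detected instead of touching freed memory.

```c
#include <assert.h>

/* Toy frame: 'valid' is cleared once ownership has been handed off. */
struct toy_frame {
	int valid;
	unsigned int id;
};

/* Models the hand-off (clone, then free): the caller must not touch f afterwards. */
static void toy_hand_off(struct toy_frame *f)
{
	f->valid = 0;
}

/* Models copying frame data into hardware; fails if the frame is gone. */
static int toy_write_hw(const struct toy_frame *f, unsigned int *hw_id)
{
	if (!f->valid)
		return -1;	/* would be a use after free in real code */
	*hw_id = f->id;
	return 0;
}

/* Buggy order: hand off first, read the frame afterwards. */
static int tx_buggy(struct toy_frame *f, unsigned int *hw_id)
{
	toy_hand_off(f);
	return toy_write_hw(f, hw_id);
}

/* Fixed order: finish all reads, hand off directly before starting the xmit. */
static int tx_fixed(struct toy_frame *f, unsigned int *hw_id)
{
	int err = toy_write_hw(f, hw_id);

	toy_hand_off(f);
	return err;
}
```

The point of the fixed ordering is that nothing reads the frame after the
hand-off, so it no longer matters whether the hand-off frees it.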
Fixes:
80646733f11c ("can: m_can: update to support CAN FD features")
Link: https://lore.kernel.org/all/20220317081305.739554-1-mkl@pengutronix.de
Cc: stable@vger.kernel.org
Reported-by: Hangyu Hua <hbh25y@gmail.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Tom Rix [Sat, 19 Mar 2022 15:31:28 +0000 (08:31 -0700)]
can: mcp251xfd: mcp251xfd_register_get_dev_id(): fix return of error value
Clang static analysis reports this issue:
| mcp251xfd-core.c:1813:7: warning: The left operand
| of '&' is a garbage value
| FIELD_GET(MCP251XFD_REG_DEVID_ID_MASK, dev_id),
| ^ ~~~~~~
dev_id is set in a successful call to mcp251xfd_register_get_dev_id().
Though the statuses of the calls made by mcp251xfd_register_get_dev_id() are
checked and handled, they are not returned. So return err.
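
A minimal sketch of this bug class, assuming a helper that jumps to a common
exit on failure but ends in a hard-coded "return 0". The function names and
the register value below are hypothetical and do not come from the driver.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical register read: writes *val only on success. */
static int toy_read_reg(int fail, unsigned int *val)
{
	if (fail)
		return -EIO;
	*val = 0x55;
	return 0;
}

/* Buggy shape: the error is checked and handled, but never reported. */
static int get_dev_id_buggy(int fail, unsigned int *dev_id)
{
	int err;

	err = toy_read_reg(fail, dev_id);
	if (err)
		goto out;
out:
	return 0;	/* caller believes *dev_id was set */
}

/* Fixed shape: propagate the status so the caller can skip *dev_id. */
static int get_dev_id_fixed(int fail, unsigned int *dev_id)
{
	int err;

	err = toy_read_reg(fail, dev_id);
	if (err)
		goto out;
out:
	return err;
}
```

With the buggy variant a caller sees success even though the output was never
written, which is the garbage value the analyzer warns about.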
Fixes:
55e5b97f003e ("can: mcp25xxfd: add driver for Microchip MCP25xxFD SPI CAN")
Link: https://lore.kernel.org/all/20220319153128.2164120-1-trix@redhat.com
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Oliver Hartkopp [Mon, 28 Mar 2022 11:36:11 +0000 (13:36 +0200)]
can: isotp: restore accidentally removed MSG_PEEK feature
In commit
42bf50a1795a ("can: isotp: support MSG_TRUNC flag when
reading from socket") a new check for recvmsg flags has been
introduced that only checked for the flags that are handled in
isotp_recvmsg() itself.
This accidentally removed the MSG_PEEK feature flag which is processed
later in the call chain in __skb_try_recv_from_queue().
Add MSG_PEEK to the set of valid flags to restore the feature.
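
The effect of such a flag check can be sketched in userspace. The two helper
functions are hypothetical, and the exact set of flags accepted before the fix
(MSG_DONTWAIT and MSG_TRUNC) is an assumption made for illustration; only the
MSG_* constants are real.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>

/* Assumed check before the fix: only locally handled flags are accepted. */
static int check_flags_before(int flags)
{
	if (flags & ~(MSG_DONTWAIT | MSG_TRUNC))
		return -EINVAL;
	return 0;
}

/* Check after the fix: MSG_PEEK, handled further down the call chain, passes. */
static int check_flags_after(int flags)
{
	if (flags & ~(MSG_DONTWAIT | MSG_TRUNC | MSG_PEEK))
		return -EINVAL;
	return 0;
}
```

A mask of this kind rejects every flag it does not list, so flags that are
consumed by lower layers have to be listed explicitly as well.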
Fixes:
42bf50a1795a ("can: isotp: support MSG_TRUNC flag when reading from socket")
Link: https://github.com/linux-can/can-utils/issues/347#issuecomment-1079554254
Link: https://lore.kernel.org/all/20220328113611.3691-1-socketcan@hartkopp.net
Reported-by: Derek Will <derekrobertwill@gmail.com>
Suggested-by: Derek Will <derekrobertwill@gmail.com>
Tested-by: Derek Will <derekrobertwill@gmail.com>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Martin KaFai Lau [Wed, 30 Mar 2022 01:15:02 +0000 (18:15 -0700)]
bpf: selftests: Test fentry tracing a struct_ops program
This patch tests attaching an fentry prog to a struct_ops prog.
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220330011502.2985292-1-kafai@fb.com