platform/kernel/linux-rpi.git
9 months agobpf: tcp_read_skb needs to pop skb regardless of seq
John Fastabend [Tue, 26 Sep 2023 03:52:58 +0000 (20:52 -0700)]
bpf: tcp_read_skb needs to pop skb regardless of seq

Before fix e5c6de5fa0258 tcp_read_skb() would increment the tp->copied-seq
value. This (as described in the commit) would cause an error for apps
because once that is incremented the application might believe there is no
data to be read. Then some apps would stall or abort believing no data is
available.

However, the fix is incomplete because it introduces another issue in
the skb dequeue. The loop does tcp_recv_skb() in a while loop to consume
as many skbs as possible. The problem is the call is ...

  tcp_recv_skb(sk, seq, &offset)

... where 'seq' is:

  u32 seq = tp->copied_seq;

Now we can hit a case where we've yet incremented copied_seq from BPF side,
but then tcp_recv_skb() fails this test ...

 if (offset < skb->len || (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_FIN))

... so that instead of returning the skb we call tcp_eat_recv_skb() which
frees the skb. This is because the routine believes the SKB has been collapsed
per comment:

 /* This looks weird, but this can happen if TCP collapsing
  * splitted a fat GRO packet, while we released socket lock
  * in skb_splice_bits()
  */

This can't happen here we've unlinked the full SKB and orphaned it. Anyways
it would confuse any BPF programs if the data were suddenly moved underneath
it.

To fix this situation do simpler operation and just skb_peek() the data
of the queue followed by the unlink. It shouldn't need to check this
condition and tcp_read_skb() reads entire skbs so there is no need to
handle the 'offset!=0' case as we would see in tcp_read_sock().

Fixes: e5c6de5fa0258 ("bpf, sockmap: Incorrectly handling copied_seq")
Fixes: 04919bed948dc ("tcp: Introduce tcp_read_skb()")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20230926035300.135096-2-john.fastabend@gmail.com
9 months agobpf: unconditionally reset backtrack_state masks on global func exit
Andrii Nakryiko [Mon, 18 Sep 2023 21:01:10 +0000 (14:01 -0700)]
bpf: unconditionally reset backtrack_state masks on global func exit

In mark_chain_precision() logic, when we reach the entry to a global
func, it is expected that R1-R5 might be still requested to be marked
precise. This would correspond to some integer input arguments being
tracked as precise. This is all expected and handled as a special case.

What's not expected is that we'll leave backtrack_state structure with
some register bits set. This is because for subsequent precision
propagations backtrack_state is reused without clearing masks, as all
code paths are carefully written in a way to leave empty backtrack_state
with zeroed out masks, for speed.

The fix is trivial, we always clear register bit in the register mask, and
then, optionally, set reg->precise if register is SCALAR_VALUE type.

Reported-by: Chris Mason <clm@meta.com>
Fixes: be2ef8161572 ("bpf: allow precision tracking for programs with subprogs")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230918210110.2241458-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
9 months agobpf: Fix tr dereferencing
Leon Hwang [Sun, 17 Sep 2023 15:38:46 +0000 (23:38 +0800)]
bpf: Fix tr dereferencing

Fix 'tr' dereferencing bug when CONFIG_BPF_JIT is turned off.

When CONFIG_BPF_JIT is turned off, 'bpf_trampoline_get()' returns NULL,
which is same as the cases when CONFIG_BPF_JIT is turned on.

Closes: https://lore.kernel.org/r/202309131936.5Nc8eUD0-lkp@intel.com/
Fixes: f7b12b6fea00 ("bpf: verifier: refactor check_attach_btf_id()")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Leon Hwang <hffilwlqm@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230917153846.88732-1-hffilwlqm@gmail.com
10 months agoMerge branch 's390-bpf-fix-arch_prepare_bpf_trampoline'
Alexei Starovoitov [Tue, 19 Sep 2023 09:57:30 +0000 (02:57 -0700)]
Merge branch 's390-bpf-fix-arch_prepare_bpf_trampoline'

Song Liu says:

====================
s390/bpf: Fix arch_prepare_bpf_trampoline

While working on trampoline, I found s390's arch_prepare_bpf_trampoline
returns 0 on success, which breaks struct_ops. However, the CI doesn't
catch this issue. Turns out test_progs:bpf_tcp_ca doesn't really test
members of a struct_ops are actually called via the trampolines.

1/2 fixes arch_prepare_bpf_trampoline for s390.
2/2 adds a check to test_progs:bpf_tcp_ca to verify bpf_cubic_acked() is
indeed called by the trampoline. Without 1/2, this check would fail on
s390.
====================

Link: https://lore.kernel.org/r/20230919060258.3237176-1-song@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
10 months agoselftests/bpf: Check bpf_cubic_acked() is called via struct_ops
Song Liu [Tue, 19 Sep 2023 06:02:58 +0000 (23:02 -0700)]
selftests/bpf: Check bpf_cubic_acked() is called via struct_ops

Test bpf_tcp_ca (in test_progs) checks multiple tcp_congestion_ops.
However, there isn't a test that verifies functions in the
tcp_congestion_ops is actually called. Add a check to verify that
bpf_cubic_acked is actually called during the test.

Suggested-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Song Liu <song@kernel.org>
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230919060258.3237176-3-song@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
10 months agos390/bpf: Let arch_prepare_bpf_trampoline return program size
Song Liu [Tue, 19 Sep 2023 06:02:57 +0000 (23:02 -0700)]
s390/bpf: Let arch_prepare_bpf_trampoline return program size

arch_prepare_bpf_trampoline() for s390 currently returns 0 on success. This
is not a problem for regular trampoline. However, struct_ops relies on the
return value to advance "image" pointer:

bpf_struct_ops_map_update_elem() {
    ...
    for_each_member(i, t, member) {
        ...
        err = bpf_struct_ops_prepare_trampoline();
        ...
        image += err;
    }
}

When arch_prepare_bpf_trampoline returns 0 on success, all members of the
struct_ops will point to the same trampoline (the last one).

Fix this by returning the program size in arch_prepare_bpf_trampoline (on
success). This is the same behavior as other architectures.

Signed-off-by: Song Liu <song@kernel.org>
Fixes: 528eb2cb87bc ("s390/bpf: Implement arch_prepare_bpf_trampoline()")
Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230919060258.3237176-2-song@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
10 months agonet: stmmac: fix incorrect rxq|txq_stats reference
Jisheng Zhang [Sun, 17 Sep 2023 16:53:28 +0000 (00:53 +0800)]
net: stmmac: fix incorrect rxq|txq_stats reference

commit 133466c3bbe1 ("net: stmmac: use per-queue 64 bit statistics
where necessary") caused one regression as found by Uwe, the backtrace
looks like:

INFO: trying to register non-static key.
The code is fine but needs lockdep annotation, or maybe
you didn't initialize this object before use?
turning off the locking correctness validator.
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.5.0-rc1-00449-g133466c3bbe1-dirty #21
Hardware name: STM32 (Device Tree Support)
 unwind_backtrace from show_stack+0x18/0x1c
 show_stack from dump_stack_lvl+0x60/0x90
 dump_stack_lvl from register_lock_class+0x98c/0x99c
 register_lock_class from __lock_acquire+0x74/0x293c
 __lock_acquire from lock_acquire+0x134/0x398
 lock_acquire from stmmac_get_stats64+0x2ac/0x2fc
 stmmac_get_stats64 from dev_get_stats+0x44/0x130
 dev_get_stats from rtnl_fill_stats+0x38/0x120
 rtnl_fill_stats from rtnl_fill_ifinfo+0x834/0x17f4
 rtnl_fill_ifinfo from rtmsg_ifinfo_build_skb+0xc0/0x144
 rtmsg_ifinfo_build_skb from rtmsg_ifinfo+0x50/0x88
 rtmsg_ifinfo from __dev_notify_flags+0xc0/0xec
 __dev_notify_flags from dev_change_flags+0x50/0x5c
 dev_change_flags from ip_auto_config+0x2f4/0x1260
 ip_auto_config from do_one_initcall+0x70/0x35c
 do_one_initcall from kernel_init_freeable+0x2ac/0x308
 kernel_init_freeable from kernel_init+0x1c/0x138
 kernel_init from ret_from_fork+0x14/0x2c

The reason is the rxq|txq_stats structures are not what expected
because stmmac_open() -> __stmmac_open() the structure is overwritten
by "memcpy(&priv->dma_conf, dma_conf, sizeof(*dma_conf));"
This causes the well initialized syncp member of rxq|txq_stats is
overwritten unexpectedly as pointed out by Johannes and Uwe.

Fix this issue by moving rxq|txq_stats back to stmmac_extra_stats. For
SMP cache friendly, we also mark stmmac_txq_stats and stmmac_rxq_stats
as ____cacheline_aligned_in_smp.

Fixes: 133466c3bbe1 ("net: stmmac: use per-queue 64 bit statistics where necessary")
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Tested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/20230917165328.3403-1-jszhang@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
10 months agoMerge branch 'ax25-project-links'
David S. Miller [Mon, 18 Sep 2023 11:56:58 +0000 (12:56 +0100)]
Merge branch 'ax25-project-links'

Peter Lafreniere says:

====================
ax25: Update link for linux-ax25.org

http://linux-ax25.org has been down for nearly a year. Its official
replacement is https://linux-ax25.in-berlin.de.

Update all references to the dead link to its replacement.

As the three touched files are in different areas of the tree, this is
being sent with one patch per file.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoax25: Kconfig: Update link for linux-ax25.org
Peter Lafreniere [Sun, 17 Sep 2023 15:30:21 +0000 (15:30 +0000)]
ax25: Kconfig: Update link for linux-ax25.org

http://linux-ax25.org has been down for nearly a year. Its official
replacement is https://linux-ax25.in-berlin.de. Change all references to
the old site in the ax25 Kconfig to its replacement.

Link: https://marc.info/?m=166792551600315
Signed-off-by: Peter Lafreniere <peter@n8pjl.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoMAINTAINERS: Update link for linux-ax25.org
Peter Lafreniere [Sun, 17 Sep 2023 15:30:10 +0000 (15:30 +0000)]
MAINTAINERS: Update link for linux-ax25.org

http://linux-ax25.org has been down for nearly a year. Its official
replacement is https://linux-ax25.in-berlin.de. Update all links to the
new URL.

Link: https://marc.info/?m=166792551600315
Signed-off-by: Peter Lafreniere <peter@n8pjl.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoDocumentation: netdev: fix dead link in ax25.rst
Peter Lafreniere [Sun, 17 Sep 2023 15:29:58 +0000 (15:29 +0000)]
Documentation: netdev: fix dead link in ax25.rst

http://linux-ax25.org has been down for nearly a year. Its official
replacement is https://linux-ax25.in-berlin.de.

Update the documentation to point there instead. And acknowledge that
while the linux-hams list isn't entirely dead, it isn't what most would
call 'active'. Remove that word.

Link: https://marc.info/?m=166792551600315
Signed-off-by: Peter Lafreniere <peter@n8pjl.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoMerge branch 'mptcp-stalled-connections-fix'
David S. Miller [Mon, 18 Sep 2023 11:47:56 +0000 (12:47 +0100)]
Merge branch 'mptcp-stalled-connections-fix'

Matthieu Baerts says:

====================
mptcp: fix stalled connections

Daire reported a few issues with MPTCP where some connections were
stalled in different states. Paolo did a great job fixing them.

Patch 1 fixes bogus receive window shrinkage with multiple subflows. Due
to a race condition and unlucky circumstances, that may lead to
TCP-level window shrinkage, and the connection being stalled on the
sender end.

Patch 2 is a preparation for patch 3 which processes pending subflow
errors on close. Without that and under specific circumstances, the
MPTCP-level socket might not switch to the CLOSE state and stall.

Patch 4 is also a preparation patch for the next one. Patch 5 fixes
MPTCP connections not switching to the CLOSE state when all subflows
have been closed but no DATA_FIN have been exchanged to explicitly close
the MPTCP connection. Now connections in such state will switch to the
CLOSE state after a timeout, still allowing the "make-after-break"
feature but making sure connections don't stall forever. It will be
possible to modify this timeout -- currently matching TCP TIMEWAIT value
(60 seconds) -- in a future version.
====================

Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
10 months agomptcp: fix dangling connection hang-up
Paolo Abeni [Sat, 16 Sep 2023 10:52:49 +0000 (12:52 +0200)]
mptcp: fix dangling connection hang-up

According to RFC 8684 section 3.3:

  A connection is not closed unless [...] or an implementation-specific
  connection-level send timeout.

Currently the MPTCP protocol does not implement such timeout, and
connection timing-out at the TCP-level never move to close state.

Introduces a catch-up condition at subflow close time to move the
MPTCP socket to close, too.

That additionally allows removing similar existing inside the worker.

Finally, allow some additional timeout for plain ESTABLISHED mptcp
sockets, as the protocol allows creating new subflows even at that
point and making the connection functional again.

This issue is actually present since the beginning, but it is basically
impossible to solve without a long chain of functional pre-requisites
topped by commit bbd49d114d57 ("mptcp: consolidate transition to
TCP_CLOSE in mptcp_do_fastclose()"). When backporting this current
patch, please also backport this other commit as well.

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/430
Fixes: e16163b6e2b7 ("mptcp: refactor shutdown and close")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agomptcp: rename timer related helper to less confusing names
Paolo Abeni [Sat, 16 Sep 2023 10:52:48 +0000 (12:52 +0200)]
mptcp: rename timer related helper to less confusing names

The msk socket uses to different timeout to track close related
events and retransmissions. The existing helpers do not indicate
clearly which timer they actually touch, making the related code
quite confusing.

Change the existing helpers name to avoid such confusion. No
functional change intended.

This patch is linked to the next one ("mptcp: fix dangling connection
hang-up"). The two patches are supposed to be backported together.

Cc: stable@vger.kernel.org # v5.11+
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agomptcp: process pending subflow error on close
Paolo Abeni [Sat, 16 Sep 2023 10:52:47 +0000 (12:52 +0200)]
mptcp: process pending subflow error on close

On incoming TCP reset, subflow closing could happen before error
propagation. That in turn could cause the socket error being ignored,
and a missing socket state transition, as reported by Daire-Byrne.

Address the issues explicitly checking for subflow socket error at
close time. To avoid code duplication, factor-out of __mptcp_error_report()
a new helper implementing the relevant bits.

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/429
Fixes: 15cc10453398 ("mptcp: deliver ssk errors to msk")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agomptcp: move __mptcp_error_report in protocol.c
Paolo Abeni [Sat, 16 Sep 2023 10:52:46 +0000 (12:52 +0200)]
mptcp: move __mptcp_error_report in protocol.c

This will simplify the next patch ("mptcp: process pending subflow error
on close").

No functional change intended.

Cc: stable@vger.kernel.org # v5.12+
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agomptcp: fix bogus receive window shrinkage with multiple subflows
Paolo Abeni [Sat, 16 Sep 2023 10:52:45 +0000 (12:52 +0200)]
mptcp: fix bogus receive window shrinkage with multiple subflows

In case multiple subflows race to update the mptcp-level receive
window, the subflow losing the race should use the window value
provided by the "winning" subflow to update it's own tcp-level
rcv_wnd.

To such goal, the current code bogusly uses the mptcp-level rcv_wnd
value as observed before the update attempt. On unlucky circumstances
that may lead to TCP-level window shrinkage, and stall the other end.

Address the issue feeding to the rcv wnd update the correct value.

Fixes: f3589be0c420 ("mptcp: never shrink offered window")
Cc: stable@vger.kernel.org
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/427
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoMerge branch 'tsnep-napi-fixes'
David S. Miller [Mon, 18 Sep 2023 09:42:37 +0000 (10:42 +0100)]
Merge branch 'tsnep-napi-fixes'

Gerhard Engleder says:

====================
tsnep: Fixes based on napi.rst

Based on the documentation networking/napi.rst some fixes have been
done. tsnep driver should be in line with this new documentation after
these fixes.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agotsnep: Fix NAPI polling with budget 0
Gerhard Engleder [Fri, 15 Sep 2023 21:01:26 +0000 (23:01 +0200)]
tsnep: Fix NAPI polling with budget 0

According to the NAPI documentation networking/napi.rst, Rx specific
APIs like page pool and XDP cannot be used at all when budget is 0.
skb Tx processing should happen regardless of the budget.

Stop NAPI polling after Tx processing and skip Rx processing if budget
is 0.

Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agotsnep: Fix ethtool channels
Gerhard Engleder [Fri, 15 Sep 2023 21:01:25 +0000 (23:01 +0200)]
tsnep: Fix ethtool channels

According to the NAPI documentation networking/napi.rst, for the ethtool
API a channel is a IRQ/NAPI which services queues of a given type.

tsnep uses a single IRQ/NAPI instance for every TX/RX queue pair.
Therefore, combined channels shall be returned instead of separate tx/rx
channels.

Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agotsnep: Fix NAPI scheduling
Gerhard Engleder [Fri, 15 Sep 2023 21:01:24 +0000 (23:01 +0200)]
tsnep: Fix NAPI scheduling

According to the NAPI documentation networking/napi.rst, drivers which
have to mask interrupts explicitly should use the napi_schedule_prep()
and __napi_schedule() calls.

No problem seen so far with current implementation. Nevertheless, let's
align the implementation with documentation.

Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoMerge branch 'hsr-supervisor-frames'
David S. Miller [Mon, 18 Sep 2023 07:26:20 +0000 (08:26 +0100)]
Merge branch 'hsr-supervisor-frames'

Sebastian Andrzej Siewior says:

====================
net: hsr: Properly parse HSRv1 supervisor frames.

this is a follow-up to
https://lore.kernel.org/all/20230825153111.228768-1-lukma@denx.de/
replacing
https://lore.kernel.org/all/20230914124731.1654059-1-lukma@denx.de/

by grabing/ adding tags and reposting with a commit message plus a
missing __packed to a struct (#2) plus extending the testsuite to sover
HSRv1 which is what broke here (#3-#5).

HSRv0 is (was) not affected.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoselftests: hsr: Extend the testsuite to also cover HSRv1.
Sebastian Andrzej Siewior [Fri, 15 Sep 2023 18:10:06 +0000 (20:10 +0200)]
selftests: hsr: Extend the testsuite to also cover HSRv1.

The testsuite already has simply tests for HSRv0. The testuite would
have been able to notice the v1 breakage if it was there at the time.

Extend the testsuite to also cover HSRv1.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoselftests: hsr: Reorder the testsuite.
Sebastian Andrzej Siewior [Fri, 15 Sep 2023 18:10:05 +0000 (20:10 +0200)]
selftests: hsr: Reorder the testsuite.

Move the code and group into functions so it will be easier to extend
the test to HSRv1 so that both versions are covered.

Move the ping/test part into do_complete_ping_test() and the interface
setup into setup_hsr_interfaces().

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoselftests: hsr: Use `let' properly.
Sebastian Andrzej Siewior [Fri, 15 Sep 2023 18:10:04 +0000 (20:10 +0200)]
selftests: hsr: Use `let' properly.

The timeout in the while loop is never subtracted due wrong usage of
`let' leading to an endless loop if the former condition never gets
true.

Put the statement for let in quotes so it is parsed as a single
statement.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agonet: hsr: Add __packed to struct hsr_sup_tlv.
Sebastian Andrzej Siewior [Fri, 15 Sep 2023 18:10:03 +0000 (20:10 +0200)]
net: hsr: Add __packed to struct hsr_sup_tlv.

Struct hsr_sup_tlv describes HW layout and therefore it needs a __packed
attribute to ensure the compiler does not add any padding.
Due to the size and __packed attribute of the structs that use
hsr_sup_tlv it has no functional impact.

Add __packed to struct hsr_sup_tlv.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agonet: hsr: Properly parse HSRv1 supervisor frames.
Lukasz Majewski [Fri, 15 Sep 2023 18:10:02 +0000 (20:10 +0200)]
net: hsr: Properly parse HSRv1 supervisor frames.

While adding support for parsing the redbox supervision frames, the
author added `pull_size' and `total_pull_size' to track the amount of
bytes that were pulled from the skb during while parsing the skb so it
can be reverted/ pushed back at the end.
In the process probably copy&paste error occurred and for the HSRv1 case
the ethhdr was used instead of the hsr_tag. Later the hsr_tag was used
instead of hsr_sup_tag. The later error didn't matter because both
structs have the size so HSRv0 was still working. It broke however HSRv1
parsing because struct ethhdr is larger than struct hsr_tag.

Reinstate the old pulling flow and pull first ethhdr, hsr_tag in v1 case
followed by hsr_sup_tag.

[bigeasy: commit message]

Fixes: eafaa88b3eb7 ("net: hsr: Add support for redbox supervision frames")'
Suggested-by: Tristram.Ha@microchip.com
Signed-off-by: Lukasz Majewski <lukma@denx.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agodccp: fix dccp_v4_err()/dccp_v6_err() again
Eric Dumazet [Fri, 15 Sep 2023 19:00:35 +0000 (19:00 +0000)]
dccp: fix dccp_v4_err()/dccp_v6_err() again

dh->dccph_x is the 9th byte (offset 8) in "struct dccp_hdr",
not in the "byte 7" as Jann claimed.

We need to make sure the ICMP messages are big enough,
using more standard ways (no more assumptions).

syzbot reported:
BUG: KMSAN: uninit-value in pskb_may_pull_reason include/linux/skbuff.h:2667 [inline]
BUG: KMSAN: uninit-value in pskb_may_pull include/linux/skbuff.h:2681 [inline]
BUG: KMSAN: uninit-value in dccp_v6_err+0x426/0x1aa0 net/dccp/ipv6.c:94
pskb_may_pull_reason include/linux/skbuff.h:2667 [inline]
pskb_may_pull include/linux/skbuff.h:2681 [inline]
dccp_v6_err+0x426/0x1aa0 net/dccp/ipv6.c:94
icmpv6_notify+0x4c7/0x880 net/ipv6/icmp.c:867
icmpv6_rcv+0x19d5/0x30d0
ip6_protocol_deliver_rcu+0xda6/0x2a60 net/ipv6/ip6_input.c:438
ip6_input_finish net/ipv6/ip6_input.c:483 [inline]
NF_HOOK include/linux/netfilter.h:304 [inline]
ip6_input+0x15d/0x430 net/ipv6/ip6_input.c:492
ip6_mc_input+0xa7e/0xc80 net/ipv6/ip6_input.c:586
dst_input include/net/dst.h:468 [inline]
ip6_rcv_finish+0x5db/0x870 net/ipv6/ip6_input.c:79
NF_HOOK include/linux/netfilter.h:304 [inline]
ipv6_rcv+0xda/0x390 net/ipv6/ip6_input.c:310
__netif_receive_skb_one_core net/core/dev.c:5523 [inline]
__netif_receive_skb+0x1a6/0x5a0 net/core/dev.c:5637
netif_receive_skb_internal net/core/dev.c:5723 [inline]
netif_receive_skb+0x58/0x660 net/core/dev.c:5782
tun_rx_batched+0x83b/0x920
tun_get_user+0x564c/0x6940 drivers/net/tun.c:2002
tun_chr_write_iter+0x3af/0x5d0 drivers/net/tun.c:2048
call_write_iter include/linux/fs.h:1985 [inline]
new_sync_write fs/read_write.c:491 [inline]
vfs_write+0x8ef/0x15c0 fs/read_write.c:584
ksys_write+0x20f/0x4c0 fs/read_write.c:637
__do_sys_write fs/read_write.c:649 [inline]
__se_sys_write fs/read_write.c:646 [inline]
__x64_sys_write+0x93/0xd0 fs/read_write.c:646
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

Uninit was created at:
slab_post_alloc_hook+0x12f/0xb70 mm/slab.h:767
slab_alloc_node mm/slub.c:3478 [inline]
kmem_cache_alloc_node+0x577/0xa80 mm/slub.c:3523
kmalloc_reserve+0x13d/0x4a0 net/core/skbuff.c:559
__alloc_skb+0x318/0x740 net/core/skbuff.c:650
alloc_skb include/linux/skbuff.h:1286 [inline]
alloc_skb_with_frags+0xc8/0xbd0 net/core/skbuff.c:6313
sock_alloc_send_pskb+0xa80/0xbf0 net/core/sock.c:2795
tun_alloc_skb drivers/net/tun.c:1531 [inline]
tun_get_user+0x23cf/0x6940 drivers/net/tun.c:1846
tun_chr_write_iter+0x3af/0x5d0 drivers/net/tun.c:2048
call_write_iter include/linux/fs.h:1985 [inline]
new_sync_write fs/read_write.c:491 [inline]
vfs_write+0x8ef/0x15c0 fs/read_write.c:584
ksys_write+0x20f/0x4c0 fs/read_write.c:637
__do_sys_write fs/read_write.c:649 [inline]
__se_sys_write fs/read_write.c:646 [inline]
__x64_sys_write+0x93/0xd0 fs/read_write.c:646
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

CPU: 0 PID: 4995 Comm: syz-executor153 Not tainted 6.6.0-rc1-syzkaller-00014-ga747acc0b752 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/04/2023

Fixes: 977ad86c2a1b ("dccp: Fix out of bounds access in DCCP error handler")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jann Horn <jannh@google.com>
Reviewed-by: Jann Horn <jannh@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoncsi: Propagate carrier gain/loss events to the NCSI controller
Johnathan Mantey [Fri, 15 Sep 2023 16:12:35 +0000 (09:12 -0700)]
ncsi: Propagate carrier gain/loss events to the NCSI controller

Report the carrier/no-carrier state for the network interface
shared between the BMC and the passthrough channel. Without this
functionality the BMC is unable to reconfigure the NIC in the event
of a re-cabling to a different subnet.

Signed-off-by: Johnathan Mantey <johnathanx.mantey@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoMerge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-
David S. Miller [Sun, 17 Sep 2023 16:48:15 +0000 (17:48 +0100)]
Merge branch '40GbE' of git://git./linux/kernel/git/tnguy/net-
queue

Tony Nguyen says:

====================
This series contains updates to iavf and i40e drivers.

Radoslaw prevents admin queue operations being added when the driver is
being removed for iavf.

Petr Oros immediately starts reconfiguration on changes to VLANs on
iavf.

Ivan Vecera moves reset of VF to occur after port VLAN values are set
on i40e.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoscsi: iscsi_tcp: restrict to TCP sockets
Eric Dumazet [Fri, 15 Sep 2023 17:11:11 +0000 (17:11 +0000)]
scsi: iscsi_tcp: restrict to TCP sockets

Nothing prevents iscsi_sw_tcp_conn_bind() to receive file descriptor
pointing to non TCP socket (af_unix for example).

Return -EINVAL if this is attempted, instead of crashing the kernel.

Fixes: 7ba247138907 ("[SCSI] open-iscsi/linux-iscsi-5 Initiator: Initiator code")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Lee Duncan <lduncan@suse.com>
Cc: Chris Leech <cleech@redhat.com>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: open-iscsi@googlegroups.com
Cc: linux-scsi@vger.kernel.org
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoipv4: fix null-deref in ipv4_link_failure
Kyle Zeng [Fri, 15 Sep 2023 05:12:57 +0000 (22:12 -0700)]
ipv4: fix null-deref in ipv4_link_failure

Currently, we assume the skb is associated with a device before calling
__ip_options_compile, which is not always the case if it is re-routed by
ipvs.
When skb->dev is NULL, dev_net(skb->dev) will become null-dereference.
This patch adds a check for the edge case and switch to use the net_device
from the rtable when skb->dev is NULL.

Fixes: ed0de45a1008 ("ipv4: recompile ip options in ipv4_link_failure")
Suggested-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Kyle Zeng <zengyhkyle@gmail.com>
Cc: Stephen Suryaputra <ssuryaextr@gmail.com>
Cc: Vadim Fedorenko <vfedorenko@novek.ru>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoigc: Fix infinite initialization loop with early XDP redirect
Vinicius Costa Gomes [Wed, 13 Sep 2023 18:06:15 +0000 (11:06 -0700)]
igc: Fix infinite initialization loop with early XDP redirect

When an XDP redirect happens before the link is ready, that
transmission will not finish and will timeout, causing an adapter
reset. If the redirects do not stop, the adapter will not stop
resetting.

Wait for the driver to signal that there's a carrier before allowing
transmissions to proceed.

Previous code was relying that when __IGC_DOWN is cleared, the NIC is
ready to transmit as all the queues are ready, what happens is that
the carrier presence will only be signaled later, after the watchdog
workqueue has a chance to run. And during this interval (between
clearing __IGC_DOWN and the watchdog running) if any transmission
happens the timeout is emitted (detected by igc_tx_timeout()) which
causes the reset, with the potential for the infinite loop.

Fixes: 4ff320361092 ("igc: Add support for XDP_REDIRECT action")
Reported-by: Ferenc Fejes <ferenc.fejes@ericsson.com>
Closes: https://lore.kernel.org/netdev/0caf33cf6adb3a5bf137eeaa20e89b167c9986d5.camel@ericsson.com/
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Tested-by: Ferenc Fejes <ferenc.fejes@ericsson.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Naama Meir <naamax.meir@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agonet: core: Use the bitmap API to allocate bitmaps
Andy Shevchenko [Wed, 13 Sep 2023 11:09:57 +0000 (14:09 +0300)]
net: core: Use the bitmap API to allocate bitmaps

Use bitmap_zalloc() and bitmap_free() instead of hand-writing them.
It is less verbose and it improves the type checking and semantic.

While at it, add missing header inclusion (should be bitops.h,
but with the above change it becomes bitmap.h).

Suggested-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20230911154534.4174265-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoionic: fix 16bit math issue when PAGE_SIZE >= 64KB
David Christensen [Thu, 14 Sep 2023 22:02:52 +0000 (18:02 -0400)]
ionic: fix 16bit math issue when PAGE_SIZE >= 64KB

The ionic device supports a maximum buffer length of 16 bits (see
ionic_rxq_desc or ionic_rxq_sg_elem).  When adding new buffers to
the receive rings, the function ionic_rx_fill() uses 16bit math when
calculating the number of pages to allocate for an RX descriptor,
given the interface's MTU setting. If the system PAGE_SIZE >= 64KB,
and the buf_info->page_offset is 0, the remain_len value will never
decrement from the original MTU value and the frag_len value will
always be 0, causing additional pages to be allocated as scatter-
gather elements unnecessarily.

A similar math issue exists in ionic_rx_frags(), but no failures
have been observed here since a 64KB page should not normally
require any scatter-gather elements at any legal Ethernet MTU size.

Fixes: 4b0a7539a372 ("ionic: implement Rx page reuse")
Signed-off-by: David Christensen <drc@linux.vnet.ibm.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
David S. Miller [Sat, 16 Sep 2023 10:16:00 +0000 (11:16 +0100)]
Merge git://git./pub/scm/linux/kernel/git/bpf/bpf

Alexei Starovoitov says:

====================

The following pull-request contains BPF updates for your *net* tree.

We've added 21 non-merge commits during the last 8 day(s) which contain
a total of 21 files changed, 450 insertions(+), 36 deletions(-).

The main changes are:

1) Adjust bpf_mem_alloc buckets to match ksize(), from Hou Tao.

2) Check whether override is allowed in kprobe mult, from Jiri Olsa.

3) Fix btf_id symbol generation with ld.lld, from Jiri and Nick.

4) Fix potential deadlock when using queue and stack maps from NMI, from Toke Høiland-Jørgensen.

Please consider pulling these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git

Thanks a lot!

Also thanks to reporters, reviewers and testers of commits in this pull-request:

Alan Maguire, Biju Das, Björn Töpel, Dan Carpenter, Daniel Borkmann,
Eduard Zingerman, Hsin-Wei Hung, Marcus Seyfarth, Nathan Chancellor,
Satya Durga Srinivasu Prabhala, Song Liu, Stephen Rothwell
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agobpf: Fix BTF_ID symbol generation collision in tools/
Nick Desaulniers [Fri, 15 Sep 2023 17:34:28 +0000 (10:34 -0700)]
bpf: Fix BTF_ID symbol generation collision in tools/

Marcus and Satya reported an issue where BTF_ID macro generates same
symbol in separate objects and that breaks final vmlinux link.

  ld.lld: error: ld-temp.o <inline asm>:14577:1: symbol
  '__BTF_ID__struct__cgroup__624' is already defined

This can be triggered under specific configs when __COUNTER__ happens to
be the same for the same symbol in two different translation units,
which is already quite unlikely to happen.

Add __LINE__ number suffix to make BTF_ID symbol more unique, which is
not a complete fix, but it would help for now and meanwhile we can work
on better solution as suggested by Andrii.

Cc: stable@vger.kernel.org
Reported-by: Satya Durga Srinivasu Prabhala <quic_satyap@quicinc.com>
Reported-by: Marcus Seyfarth <m.seyfarth@gmail.com>
Closes: https://github.com/ClangBuiltLinux/linux/issues/1913
Debugged-by: Nathan Chancellor <nathan@kernel.org>
Co-developed-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/CAEf4Bzb5KQ2_LmhN769ifMeSJaWfebccUasQOfQKaOd0nQ51tw@mail.gmail.com/
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lore.kernel.org/r/20230915-bpf_collision-v3-2-263fc519c21f@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
10 months agobpf: Fix BTF_ID symbol generation collision
Jiri Olsa [Fri, 15 Sep 2023 17:34:27 +0000 (10:34 -0700)]
bpf: Fix BTF_ID symbol generation collision

Marcus and Satya reported an issue where BTF_ID macro generates same
symbol in separate objects and that breaks final vmlinux link.

ld.lld: error: ld-temp.o <inline asm>:14577:1: symbol
'__BTF_ID__struct__cgroup__624' is already defined

This can be triggered under specific configs when __COUNTER__ happens to
be the same for the same symbol in two different translation units,
which is already quite unlikely to happen.

Add __LINE__ number suffix to make BTF_ID symbol more unique, which is
not a complete fix, but it would help for now and meanwhile we can work
on better solution as suggested by Andrii.

Cc: stable@vger.kernel.org
Reported-by: Satya Durga Srinivasu Prabhala <quic_satyap@quicinc.com>
Reported-by: Marcus Seyfarth <m.seyfarth@gmail.com>
Closes: https://github.com/ClangBuiltLinux/linux/issues/1913
Debugged-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/bpf/CAEf4Bzb5KQ2_LmhN769ifMeSJaWfebccUasQOfQKaOd0nQ51tw@mail.gmail.com/
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20230915-bpf_collision-v3-1-263fc519c21f@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
10 months agobpf: Fix uprobe_multi get_pid_task error path
Jiri Olsa [Fri, 15 Sep 2023 10:14:20 +0000 (12:14 +0200)]
bpf: Fix uprobe_multi get_pid_task error path

Dan reported Smatch static checker warning due to missing error
value set in uprobe multi link's get_pid_task error path.

Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/bpf/c5ffa7c0-6b06-40d5-aca2-63833b5cd9af@moroto.mountain/
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Reviewed-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20230915101420.1193800-1-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
10 months agobpf: Skip unit_size checking for global per-cpu allocator
Hou Tao [Wed, 13 Sep 2023 13:59:43 +0000 (21:59 +0800)]
bpf: Skip unit_size checking for global per-cpu allocator

For global per-cpu allocator, the size of free object in free list
doesn't match with unit_size and now there is no way to get the size of
per-cpu pointer saved in free object, so just skip the checking.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Closes: https://lore.kernel.org/bpf/20230913133436.0eeec4cb@canb.auug.org.au/
Signed-off-by: Hou Tao <houtao1@huawei.com>
Tested-by: Biju Das <biju.das.jz@bp.renesas.com>
Link: https://lore.kernel.org/r/20230913135943.3137292-1-houtao@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
10 months agonetfilter, bpf: Adjust timeouts of non-confirmed CTs in bpf_ct_insert_entry()
Ilya Leoshkevich [Wed, 30 Aug 2023 01:07:43 +0000 (03:07 +0200)]
netfilter, bpf: Adjust timeouts of non-confirmed CTs in bpf_ct_insert_entry()

bpf_nf testcase fails on s390x: bpf_skb_ct_lookup() cannot find the entry
that was added by bpf_ct_insert_entry() within the same BPF function.

The reason is that this entry is deleted by nf_ct_gc_expired().

The CT timeout starts ticking after the CT confirmation; therefore
nf_conn.timeout is initially set to the timeout value, and
__nf_conntrack_confirm() sets it to the deadline value.

bpf_ct_insert_entry() sets IPS_CONFIRMED_BIT, but does not adjust the
timeout, making its value meaningless and causing false positives.

Fix the problem by making bpf_ct_insert_entry() adjust the timeout,
like __nf_conntrack_confirm().

Fixes: 2cdaa3eefed8 ("netfilter: conntrack: restore IPS_CONFIRMED out of nf_conntrack_hash_check_insert()")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Florian Westphal <fw@strlen.de>
Link: https://lore.kernel.org/bpf/20230830011128.1415752-3-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
10 months agoi40e: Fix VF VLAN offloading when port VLAN is configured
Ivan Vecera [Thu, 7 Sep 2023 15:44:57 +0000 (17:44 +0200)]
i40e: Fix VF VLAN offloading when port VLAN is configured

If port VLAN is configured on a VF then any other VLANs on top of this VF
are broken.

During i40e_ndo_set_vf_port_vlan() call the i40e driver reset the VF and
iavf driver asks PF (using VIRTCHNL_OP_GET_VF_RESOURCES) for VF capabilities
but this reset occurs too early, prior setting of vf->info.pvid field
and because this field can be zero during i40e_vc_get_vf_resources_msg()
then VIRTCHNL_VF_OFFLOAD_VLAN capability is reported to iavf driver.

This is wrong because iavf driver should not report VLAN offloading
capability when port VLAN is configured as i40e does not support QinQ
offloading.

Fix the issue by moving VF reset after setting of vf->port_vlan_id
field.

Without this patch:
$ echo 1 > /sys/class/net/enp2s0f0/device/sriov_numvfs
$ ip link set enp2s0f0 vf 0 vlan 3
$ ip link set enp2s0f0v0 up
$ ip link add link enp2s0f0v0 name vlan4 type vlan id 4
$ ip link set vlan4 up
...
$ ethtool -k enp2s0f0v0 | grep vlan-offload
rx-vlan-offload: on
tx-vlan-offload: on
$ dmesg -l err | grep iavf
[1292500.742914] iavf 0000:02:02.0: Failed to add VLAN filter, error IAVF_ERR_INVALID_QP_ID

With this patch:
$ echo 1 > /sys/class/net/enp2s0f0/device/sriov_numvfs
$ ip link set enp2s0f0 vf 0 vlan 3
$ ip link set enp2s0f0v0 up
$ ip link add link enp2s0f0v0 name vlan4 type vlan id 4
$ ip link set vlan4 up
...
$ ethtool -k enp2s0f0v0 | grep vlan-offload
rx-vlan-offload: off [requested on]
tx-vlan-offload: off [requested on]
$ dmesg -l err | grep iavf

Fixes: f9b4b6278d51 ("i40e: Reset the VF upon conflicting VLAN configuration")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
10 months agoiavf: schedule a request immediately after add/delete vlan
Petr Oros [Thu, 7 Sep 2023 15:02:51 +0000 (17:02 +0200)]
iavf: schedule a request immediately after add/delete vlan

When the iavf driver wants to reconfigure the VLAN filters
(iavf_add_vlan, iavf_del_vlan), it sets a flag in
aq_required:
  adapter->aq_required |= IAVF_FLAG_AQ_ADD_VLAN_FILTER;
or:
  adapter->aq_required |= IAVF_FLAG_AQ_DEL_VLAN_FILTER;

This is later processed by the watchdog_task, but it runs periodically
every 2 seconds, so it can be a long time before it processes the request.

In the worst case, the interface is unable to receive traffic for more
than 2 seconds for no objective reason.

Fixes: 5eae00c57f5e ("i40evf: main driver core")
Signed-off-by: Petr Oros <poros@redhat.com>
Co-developed-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Co-developed-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Ahmed Zaki <ahmed.zaki@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
10 months agoiavf: add iavf_schedule_aq_request() helper
Petr Oros [Thu, 7 Sep 2023 15:02:50 +0000 (17:02 +0200)]
iavf: add iavf_schedule_aq_request() helper

Add helper for set iavf aq request AVF_FLAG_AQ_* and immediately
schedule watchdog_task. Helper will be used in cases where it is
necessary to run aq requests asap

Signed-off-by: Petr Oros <poros@redhat.com>
Co-developed-by: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Co-developed-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
10 months agoiavf: do not process adminq tasks when __IAVF_IN_REMOVE_TASK is set
Radoslaw Tyl [Mon, 7 Aug 2023 12:59:40 +0000 (14:59 +0200)]
iavf: do not process adminq tasks when __IAVF_IN_REMOVE_TASK is set

Prevent schedule operations for adminq during device remove and when
__IAVF_IN_REMOVE_TASK flag is set. Currently, the iavf_down function
adds operations for adminq that shouldn't be processed when the device
is in the __IAVF_REMOVE state.

Reproduction:

echo 4 > /sys/bus/pci/devices/0000:17:00.0/sriov_numvfs
ip link set dev ens1f0 vf 0 trust on
ip link set dev ens1f0 vf 1 trust on
ip link set dev ens1f0 vf 2 trust on
ip link set dev ens1f0 vf 3 trust on

ip link set dev ens1f0 vf 0 mac 00:22:33:44:55:66
ip link set dev ens1f0 vf 1 mac 00:22:33:44:55:67
ip link set dev ens1f0 vf 2 mac 00:22:33:44:55:68
ip link set dev ens1f0 vf 3 mac 00:22:33:44:55:69

echo 0000:17:02.0 > /sys/bus/pci/devices/0000\:17\:02.0/driver/unbind
echo 0000:17:02.1 > /sys/bus/pci/devices/0000\:17\:02.1/driver/unbind
echo 0000:17:02.2 > /sys/bus/pci/devices/0000\:17\:02.2/driver/unbind
echo 0000:17:02.3 > /sys/bus/pci/devices/0000\:17\:02.3/driver/unbind
sleep 10
echo 0000:17:02.0 > /sys/bus/pci/drivers/iavf/bind
echo 0000:17:02.1 > /sys/bus/pci/drivers/iavf/bind
echo 0000:17:02.2 > /sys/bus/pci/drivers/iavf/bind
echo 0000:17:02.3 > /sys/bus/pci/drivers/iavf/bind

modprobe vfio-pci
echo 8086 154c > /sys/bus/pci/drivers/vfio-pci/new_id

qemu-system-x86_64 -accel kvm -m 4096 -cpu host \
-drive file=centos9.qcow2,if=none,id=virtio-disk0 \
-device virtio-blk-pci,drive=virtio-disk0,bootindex=0 -smp 4 \
-device vfio-pci,host=17:02.0 -net none \
-device vfio-pci,host=17:02.1 -net none \
-device vfio-pci,host=17:02.2 -net none \
-device vfio-pci,host=17:02.3 -net none \
-daemonize -vnc :5

Current result:
There is a probability that the mac of VF in guest is inconsistent with
it in host

Expected result:
When passthrough NIC VF to guest, the VF in guest should always get
the same mac as it in host.

Fixes: 14756b2ae265 ("iavf: Fix __IAVF_RESETTING state usage")
Signed-off-by: Radoslaw Tyl <radoslawx.tyl@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
10 months agoMerge tag 'nf-23-09-13' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
David S. Miller [Fri, 15 Sep 2023 12:56:58 +0000 (13:56 +0100)]
Merge tag 'nf-23-09-13' of git://git./linux/kernel/git/netfilter/nf

netfilter pull request 23-09-13

====================

The following patchset contains Netfilter fixes for net:

1) Do not permit to remove rules from chain binding, otherwise
   double rule release is possible, triggering UaF. This rule
   deletion support does not make sense and userspace does not use
   this. Problem exists since the introduction of chain binding support.

2) rbtree GC worker only collects the elements that have expired.
   This operation is not destructive, therefore, turn write into
   read spinlock to avoid datapath contention due to GC worker run.
   This was not fixed in the recent GC fix batch in the 6.5 cycle.

3) pipapo set backend performs sync GC, therefore, catchall elements
   must use sync GC queue variant. This bug was introduced in the
   6.5 cycle with the recent GC fixes.

4) Stop GC run if memory allocation fails in pipapo set backend,
   otherwise access to NULL pointer to GC transaction object might
   occur. This bug was introduced in the 6.5 cycle with the recent
   GC fixes.

5) rhash GC run uses an iterator that might hit EAGAIN to rewind,
   triggering double-collection of the same element. This bug was
   introduced in the 6.5 cycle with the recent GC fixes.

6) Do not permit to remove elements in anonymous sets, this type of
   sets are populated once and then bound to rules. This fix is
   similar to the chain binding patch coming first in this batch.
   API permits since the very beginning but it has no use case from
   userspace.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoocteon_ep: fix tx dma unmap len values in SG
Shinas Rasheed [Wed, 13 Sep 2023 08:41:56 +0000 (01:41 -0700)]
octeon_ep: fix tx dma unmap len values in SG

Lengths of SG pointers are kept in the following order in
the SG entries in hardware.
 63      48|47     32|31     16|15       0
 -----------------------------------------
 |  Len 0  |  Len 1  |  Len 2  |  Len 3  |
 -----------------------------------------
 |                Ptr 0                  |
 -----------------------------------------
 |                Ptr 1                  |
 -----------------------------------------
 |                Ptr 2                  |
 -----------------------------------------
 |                Ptr 3                  |
 -----------------------------------------
Dma pointers have to be unmapped based on their
respective lengths given in this format.

Fixes: 37d79d059606 ("octeon_ep: add Tx/Rx processing and interrupt support")
Signed-off-by: Shinas Rasheed <srasheed@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agonet: thunderbolt: Fix TCPv6 GSO checksum calculation
Mika Westerberg [Wed, 13 Sep 2023 05:26:47 +0000 (08:26 +0300)]
net: thunderbolt: Fix TCPv6 GSO checksum calculation

Alex reported that running ssh over IPv6 does not work with
Thunderbolt/USB4 networking driver. The reason for that is that driver
should call skb_is_gso() before calling skb_is_gso_v6(), and it should
not return false after calculates the checksum successfully. This probably
was a copy paste error from the original driver where it was done properly.

Reported-by: Alex Balcanquall <alex@alexbal.com>
Fixes: e69b6c02b4c3 ("net: Add support for networking over Thunderbolt cable")
Cc: stable@vger.kernel.org
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agonet/core: Fix ETH_P_1588 flow dissector
Sasha Neftin [Wed, 13 Sep 2023 06:39:05 +0000 (09:39 +0300)]
net/core: Fix ETH_P_1588 flow dissector

When a PTP ethernet raw frame with a size of more than 256 bytes followed
by a 0xff pattern is sent to __skb_flow_dissect, nhoff value calculation
is wrong. For example: hdr->message_length takes the wrong value (0xffff)
and it does not replicate real header length. In this case, 'nhoff' value
was overridden and the PTP header was badly dissected. This leads to a
kernel crash.

net/core: flow_dissector
net/core flow dissector nhoff = 0x0000000e
net/core flow dissector hdr->message_length = 0x0000ffff
net/core flow dissector nhoff = 0x0001000d (u16 overflow)
...
skb linear:   00000000: 00 a0 c9 00 00 00 00 a0 c9 00 00 00 88
skb frag:     00000000: f7 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff

Using the size of the ptp_header struct will allow the corrected
calculation of the nhoff value.

net/core flow dissector nhoff = 0x0000000e
net/core flow dissector nhoff = 0x00000030 (sizeof ptp_header)
...
skb linear:   00000000: 00 a0 c9 00 00 00 00 a0 c9 00 00 00 88 f7 ff ff
skb linear:   00000010: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
skb linear:   00000020: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
skb frag:     00000000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff

Kernel trace:
[   74.984279] ------------[ cut here ]------------
[   74.989471] kernel BUG at include/linux/skbuff.h:2440!
[   74.995237] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[   75.001098] CPU: 4 PID: 0 Comm: swapper/4 Tainted: G     U            5.15.85-intel-ese-standard-lts #1
[   75.011629] Hardware name: Intel Corporation A-Island (CPU:AlderLake)/A-Island (ID:06), BIOS SB_ADLP.01.01.00.01.03.008.D-6A9D9E73-dirty Mar 30 2023
[   75.026507] RIP: 0010:eth_type_trans+0xd0/0x130
[   75.031594] Code: 03 88 47 78 eb c7 8b 47 68 2b 47 6c 48 8b 97 c0 00 00 00 83 f8 01 7e 1b 48 85 d2 74 06 66 83 3a ff 74 09 b8 00 04 00 00 eb ab <0f> 0b b8 00 01 00 00 eb a2 48 85 ff 74 eb 48 8d 54 24 06 31 f6 b9
[   75.052612] RSP: 0018:ffff9948c0228de0 EFLAGS: 00010297
[   75.058473] RAX: 00000000000003f2 RBX: ffff8e47047dc300 RCX: 0000000000001003
[   75.066462] RDX: ffff8e4e8c9ea040 RSI: ffff8e4704e0a000 RDI: ffff8e47047dc300
[   75.074458] RBP: ffff8e4704e2acc0 R08: 00000000000003f3 R09: 0000000000000800
[   75.082466] R10: 000000000000000d R11: ffff9948c0228dec R12: ffff8e4715e4e010
[   75.090461] R13: ffff9948c0545018 R14: 0000000000000001 R15: 0000000000000800
[   75.098464] FS:  0000000000000000(0000) GS:ffff8e4e8fb00000(0000) knlGS:0000000000000000
[   75.107530] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   75.113982] CR2: 00007f5eb35934a0 CR3: 0000000150e0a002 CR4: 0000000000770ee0
[   75.121980] PKRU: 55555554
[   75.125035] Call Trace:
[   75.127792]  <IRQ>
[   75.130063]  ? eth_get_headlen+0xa4/0xc0
[   75.134472]  igc_process_skb_fields+0xcd/0x150
[   75.139461]  igc_poll+0xc80/0x17b0
[   75.143272]  __napi_poll+0x27/0x170
[   75.147192]  net_rx_action+0x234/0x280
[   75.151409]  __do_softirq+0xef/0x2f4
[   75.155424]  irq_exit_rcu+0xc7/0x110
[   75.159432]  common_interrupt+0xb8/0xd0
[   75.163748]  </IRQ>
[   75.166112]  <TASK>
[   75.168473]  asm_common_interrupt+0x22/0x40
[   75.173175] RIP: 0010:cpuidle_enter_state+0xe2/0x350
[   75.178749] Code: 85 c0 0f 8f 04 02 00 00 31 ff e8 39 6c 67 ff 45 84 ff 74 12 9c 58 f6 c4 02 0f 85 50 02 00 00 31 ff e8 52 b0 6d ff fb 45 85 f6 <0f> 88 b1 00 00 00 49 63 ce 4c 2b 2c 24 48 89 c8 48 6b d1 68 48 c1
[   75.199757] RSP: 0018:ffff9948c013bea8 EFLAGS: 00000202
[   75.205614] RAX: ffff8e4e8fb00000 RBX: ffffb948bfd23900 RCX: 000000000000001f
[   75.213619] RDX: 0000000000000004 RSI: ffffffff94206161 RDI: ffffffff94212e20
[   75.221620] RBP: 0000000000000004 R08: 000000117568973a R09: 0000000000000001
[   75.229622] R10: 000000000000afc8 R11: ffff8e4e8fb29ce4 R12: ffffffff945ae980
[   75.237628] R13: 000000117568973a R14: 0000000000000004 R15: 0000000000000000
[   75.245635]  ? cpuidle_enter_state+0xc7/0x350
[   75.250518]  cpuidle_enter+0x29/0x40
[   75.254539]  do_idle+0x1d9/0x260
[   75.258166]  cpu_startup_entry+0x19/0x20
[   75.262582]  secondary_startup_64_no_verify+0xc2/0xcb
[   75.268259]  </TASK>
[   75.270721] Modules linked in: 8021q snd_sof_pci_intel_tgl snd_sof_intel_hda_common tpm_crb snd_soc_hdac_hda snd_sof_intel_hda snd_hda_ext_core snd_sof_pci snd_sof snd_sof_xtensa_dsp snd_soc_acpi_intel_match snd_soc_acpi snd_soc_core snd_compress iTCO_wdt ac97_bus intel_pmc_bxt mei_hdcp iTCO_vendor_support snd_hda_codec_hdmi pmt_telemetry intel_pmc_core pmt_class snd_hda_intel x86_pkg_temp_thermal snd_intel_dspcfg snd_hda_codec snd_hda_core kvm_intel snd_pcm snd_timer kvm snd mei_me soundcore tpm_tis irqbypass i2c_i801 mei tpm_tis_core pcspkr intel_rapl_msr tpm i2c_smbus intel_pmt thermal sch_fq_codel uio uhid i915 drm_buddy video drm_display_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm fuse configfs
[   75.342736] ---[ end trace 3785f9f360400e3a ]---
[   75.347913] RIP: 0010:eth_type_trans+0xd0/0x130
[   75.352984] Code: 03 88 47 78 eb c7 8b 47 68 2b 47 6c 48 8b 97 c0 00 00 00 83 f8 01 7e 1b 48 85 d2 74 06 66 83 3a ff 74 09 b8 00 04 00 00 eb ab <0f> 0b b8 00 01 00 00 eb a2 48 85 ff 74 eb 48 8d 54 24 06 31 f6 b9
[   75.373994] RSP: 0018:ffff9948c0228de0 EFLAGS: 00010297
[   75.379860] RAX: 00000000000003f2 RBX: ffff8e47047dc300 RCX: 0000000000001003
[   75.387856] RDX: ffff8e4e8c9ea040 RSI: ffff8e4704e0a000 RDI: ffff8e47047dc300
[   75.395864] RBP: ffff8e4704e2acc0 R08: 00000000000003f3 R09: 0000000000000800
[   75.403857] R10: 000000000000000d R11: ffff9948c0228dec R12: ffff8e4715e4e010
[   75.411863] R13: ffff9948c0545018 R14: 0000000000000001 R15: 0000000000000800
[   75.419875] FS:  0000000000000000(0000) GS:ffff8e4e8fb00000(0000) knlGS:0000000000000000
[   75.428946] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   75.435403] CR2: 00007f5eb35934a0 CR3: 0000000150e0a002 CR4: 0000000000770ee0
[   75.443410] PKRU: 55555554
[   75.446477] Kernel panic - not syncing: Fatal exception in interrupt
[   75.453738] Kernel Offset: 0x11c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[   75.465794] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---

Fixes: 4f1cc51f3488 ("net: flow_dissector: Parse PTP L2 packet header")
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agonet: ti: icssg-prueth: add PTP dependency
Arnd Bergmann [Tue, 12 Sep 2023 18:54:51 +0000 (20:54 +0200)]
net: ti: icssg-prueth: add PTP dependency

The driver can now use PTP if enabled but fails to link built-in
if PTP is a loadable module:

aarch64-linux-ld: drivers/net/ethernet/ti/icssg/icss_iep.o: in function `icss_iep_get_ptp_clock_idx':
icss_iep.c:(.text+0x200): undefined reference to `ptp_clock_index'

Add the usual dependency to avoid this.

Fixes: 186734c158865 ("net: ti: icssg-prueth: add packet timestamping and ptp support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: MD Danish Anwar <danishanwar@ti.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoselftests: tls: swap the TX and RX sockets in some tests
Sabrina Dubroca [Tue, 12 Sep 2023 14:16:25 +0000 (16:16 +0200)]
selftests: tls: swap the TX and RX sockets in some tests

tls.sendmsg_large and tls.sendmsg_multiple are trying to send through
the self->cfd socket (only configured with TLS_RX) and to receive through
the self->fd socket (only configured with TLS_TX), so they're not using
kTLS at all. Swap the sockets.

Fixes: 7f657d5bf507 ("selftests: tls: add selftests for TLS sockets")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoMerge branch 'sparx5-leaks'
David S. Miller [Fri, 15 Sep 2023 06:32:35 +0000 (07:32 +0100)]
Merge branch 'sparx5-leaks'

Jinjie Ruan says:

====================
net: microchip: sparx5: Fix some memory leaks in vcap_api_kunit

There are some memory leaks in vcap_api_kunit, this patchset
fixes them.

Changes in v3:
- Fix the typo in patch 3, from "export" to "vcap enabled port".
- Fix the typo in patch 4, from "vcap_dup_rule" to "vcap_alloc_rule".

Changes in v2:
- Adhere to the 80 character limit in vcap_free_caf()
- Fix kernel test robot reported warnings in test_vcap_xn_rule_creator()
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agonet: microchip: sparx5: Fix possible memory leaks in vcap_api_kunit
Jinjie Ruan [Tue, 12 Sep 2023 11:03:10 +0000 (19:03 +0800)]
net: microchip: sparx5: Fix possible memory leaks in vcap_api_kunit

Inject fault while probing kunit-example-test.ko, the duprule which
is allocated by kzalloc in vcap_dup_rule() of
test_vcap_xn_rule_creator() is not freed, and it cause the memory leaks
below. Use vcap_del_rule() to free them as other functions do it.

unreferenced object 0xffff6eb4846f6180 (size 192):
  comm "kunit_try_catch", pid 405, jiffies 4294895522 (age 880.004s)
  hex dump (first 32 bytes):
    10 27 00 00 04 00 00 00 0a 00 00 00 f4 01 00 00  .'..............
    00 00 00 00 00 00 00 00 98 61 6f 84 b4 6e ff ff  .........ao..n..
  backtrace:
    [<00000000f1b5b86e>] slab_post_alloc_hook+0xb8/0x368
    [<00000000c56cdd9a>] __kmem_cache_alloc_node+0x174/0x290
    [<0000000046ef1b64>] kmalloc_trace+0x40/0x164
    [<000000008565145b>] vcap_dup_rule+0x38/0x210
    [<00000000bd9e1f12>] vcap_add_rule+0x29c/0x32c
    [<0000000070a539b1>] test_vcap_xn_rule_creator.constprop.43+0x120/0x330
    [<00000000d2ac4ccb>] vcap_api_rule_insert_in_order_test+0xa4/0x114
    [<000000000f88f9cb>] kunit_try_run_case+0x50/0xac
    [<00000000e848de5a>] kunit_generic_run_threadfn_adapter+0x20/0x2c
    [<0000000058a88b6b>] kthread+0x124/0x130
    [<00000000891cf28a>] ret_from_fork+0x10/0x20
unreferenced object 0xffff6eb4846f6240 (size 192):
  comm "kunit_try_catch", pid 405, jiffies 4294895524 (age 879.996s)
  hex dump (first 32 bytes):
    10 27 00 00 04 00 00 00 14 00 00 00 90 01 00 00  .'..............
    00 00 00 00 00 00 00 00 58 62 6f 84 b4 6e ff ff  ........Xbo..n..
  backtrace:
    [<00000000f1b5b86e>] slab_post_alloc_hook+0xb8/0x368
    [<00000000c56cdd9a>] __kmem_cache_alloc_node+0x174/0x290
    [<0000000046ef1b64>] kmalloc_trace+0x40/0x164
    [<000000008565145b>] vcap_dup_rule+0x38/0x210
    [<00000000bd9e1f12>] vcap_add_rule+0x29c/0x32c
    [<0000000070a539b1>] test_vcap_xn_rule_creator.constprop.43+0x120/0x330
    [<0000000052e6ad35>] vcap_api_rule_insert_in_order_test+0xbc/0x114
    [<000000000f88f9cb>] kunit_try_run_case+0x50/0xac
    [<00000000e848de5a>] kunit_generic_run_threadfn_adapter+0x20/0x2c
    [<0000000058a88b6b>] kthread+0x124/0x130
    [<00000000891cf28a>] ret_from_fork+0x10/0x20
unreferenced object 0xffff6eb4846f6300 (size 192):
  comm "kunit_try_catch", pid 405, jiffies 4294895524 (age 879.996s)
  hex dump (first 32 bytes):
    10 27 00 00 04 00 00 00 1e 00 00 00 2c 01 00 00  .'..........,...
    00 00 00 00 00 00 00 00 18 63 6f 84 b4 6e ff ff  .........co..n..
  backtrace:
    [<00000000f1b5b86e>] slab_post_alloc_hook+0xb8/0x368
    [<00000000c56cdd9a>] __kmem_cache_alloc_node+0x174/0x290
    [<0000000046ef1b64>] kmalloc_trace+0x40/0x164
    [<000000008565145b>] vcap_dup_rule+0x38/0x210
    [<00000000bd9e1f12>] vcap_add_rule+0x29c/0x32c
    [<0000000070a539b1>] test_vcap_xn_rule_creator.constprop.43+0x120/0x330
    [<000000001b0895d4>] vcap_api_rule_insert_in_order_test+0xd4/0x114
    [<000000000f88f9cb>] kunit_try_run_case+0x50/0xac
    [<00000000e848de5a>] kunit_generic_run_threadfn_adapter+0x20/0x2c
    [<0000000058a88b6b>] kthread+0x124/0x130
    [<00000000891cf28a>] ret_from_fork+0x10/0x20
unreferenced object 0xffff6eb4846f63c0 (size 192):
  comm "kunit_try_catch", pid 405, jiffies 4294895524 (age 880.012s)
  hex dump (first 32 bytes):
    10 27 00 00 04 00 00 00 28 00 00 00 c8 00 00 00  .'......(.......
    00 00 00 00 00 00 00 00 d8 63 6f 84 b4 6e ff ff  .........co..n..
  backtrace:
    [<00000000f1b5b86e>] slab_post_alloc_hook+0xb8/0x368
    [<00000000c56cdd9a>] __kmem_cache_alloc_node+0x174/0x290
    [<0000000046ef1b64>] kmalloc_trace+0x40/0x164
    [<000000008565145b>] vcap_dup_rule+0x38/0x210
    [<00000000bd9e1f12>] vcap_add_rule+0x29c/0x32c
    [<0000000070a539b1>] test_vcap_xn_rule_creator.constprop.43+0x120/0x330
    [<00000000134c151f>] vcap_api_rule_insert_in_order_test+0xec/0x114
    [<000000000f88f9cb>] kunit_try_run_case+0x50/0xac
    [<00000000e848de5a>] kunit_generic_run_threadfn_adapter+0x20/0x2c
    [<0000000058a88b6b>] kthread+0x124/0x130
    [<00000000891cf28a>] ret_from_fork+0x10/0x20
unreferenced object 0xffff6eb4845fc180 (size 192):
  comm "kunit_try_catch", pid 407, jiffies 4294895527 (age 880.000s)
  hex dump (first 32 bytes):
    10 27 00 00 04 00 00 00 14 00 00 00 c8 00 00 00  .'..............
    00 00 00 00 00 00 00 00 98 c1 5f 84 b4 6e ff ff  .........._..n..
  backtrace:
    [<00000000f1b5b86e>] slab_post_alloc_hook+0xb8/0x368
    [<00000000c56cdd9a>] __kmem_cache_alloc_node+0x174/0x290
    [<0000000046ef1b64>] kmalloc_trace+0x40/0x164
    [<000000008565145b>] vcap_dup_rule+0x38/0x210
    [<00000000bd9e1f12>] vcap_add_rule+0x29c/0x32c
    [<0000000070a539b1>] test_vcap_xn_rule_creator.constprop.43+0x120/0x330
    [<00000000fa5f64d3>] vcap_api_rule_insert_reverse_order_test+0xc8/0x600
    [<000000000f88f9cb>] kunit_try_run_case+0x50/0xac
    [<00000000e848de5a>] kunit_generic_run_threadfn_adapter+0x20/0x2c
    [<0000000058a88b6b>] kthread+0x124/0x130
    [<00000000891cf28a>] ret_from_fork+0x10/0x20
unreferenced object 0xffff6eb4845fc240 (size 192):
  comm "kunit_try_catch", pid 407, jiffies 4294895527 (age 880.000s)
  hex dump (first 32 bytes):
    10 27 00 00 04 00 00 00 1e 00 00 00 2c 01 00 00  .'..........,...
    00 00 00 00 00 00 00 00 58 c2 5f 84 b4 6e ff ff  ........X._..n..
  backtrace:
    [<00000000f1b5b86e>] slab_post_alloc_hook+0xb8/0x368
    [<00000000c56cdd9a>] __kmem_cache_alloc_node+0x174/0x290
    [<0000000046ef1b64>] kmalloc_trace+0x40/0x164
    [<000000008565145b>] vcap_dup_rule+0x38/0x210
    [<00000000453dcd80>] vcap_add_rule+0x134/0x32c
    [<0000000070a539b1>] test_vcap_xn_rule_creator.constprop.43+0x120/0x330
    [<00000000a7db42de>] vcap_api_rule_insert_reverse_order_test+0x108/0x600
    [<000000000f88f9cb>] kunit_try_run_case+0x50/0xac
    [<00000000e848de5a>] kunit_generic_run_threadfn_adapter+0x20/0x2c
    [<0000000058a88b6b>] kthread+0x124/0x130
    [<00000000891cf28a>] ret_from_fork+0x10/0x20
unreferenced object 0xffff6eb4845fc300 (size 192):
  comm "kunit_try_catch", pid 407, jiffies 4294895527 (age 880.000s)
  hex dump (first 32 bytes):
    10 27 00 00 04 00 00 00 28 00 00 00 90 01 00 00  .'......(.......
    00 00 00 00 00 00 00 00 18 c3 5f 84 b4 6e ff ff  .........._..n..
  backtrace:
    [<00000000f1b5b86e>] slab_post_alloc_hook+0xb8/0x368
    [<00000000c56cdd9a>] __kmem_cache_alloc_node+0x174/0x290
    [<0000000046ef1b64>] kmalloc_trace+0x40/0x164
    [<000000008565145b>] vcap_dup_rule+0x38/0x210
    [<00000000453dcd80>] vcap_add_rule+0x134/0x32c
    [<0000000070a539b1>] test_vcap_xn_rule_creator.constprop.43+0x120/0x330
    [<00000000ea416c94>] vcap_api_rule_insert_reverse_order_test+0x150/0x600
    [<000000000f88f9cb>] kunit_try_run_case+0x50/0xac
    [<00000000e848de5a>] kunit_generic_run_threadfn_adapter+0x20/0x2c
    [<0000000058a88b6b>] kthread+0x124/0x130
    [<00000000891cf28a>] ret_from_fork+0x10/0x20
unreferenced object 0xffff6eb4845fc3c0 (size 192):
  comm "kunit_try_catch", pid 407, jiffies 4294895527 (age 880.020s)
  hex dump (first 32 bytes):
    10 27 00 00 04 00 00 00 32 00 00 00 f4 01 00 00  .'......2.......
    00 00 00 00 00 00 00 00 d8 c3 5f 84 b4 6e ff ff  .........._..n..
  backtrace:
    [<00000000f1b5b86e>] slab_post_alloc_hook+0xb8/0x368
    [<00000000c56cdd9a>] __kmem_cache_alloc_node+0x174/0x290
    [<0000000046ef1b64>] kmalloc_trace+0x40/0x164
    [<000000008565145b>] vcap_dup_rule+0x38/0x210
    [<00000000453dcd80>] vcap_add_rule+0x134/0x32c
    [<0000000070a539b1>] test_vcap_xn_rule_creator.constprop.43+0x120/0x330
    [<00000000764a39b4>] vcap_api_rule_insert_reverse_order_test+0x198/0x600
    [<000000000f88f9cb>] kunit_try_run_case+0x50/0xac
    [<00000000e848de5a>] kunit_generic_run_threadfn_adapter+0x20/0x2c
    [<0000000058a88b6b>] kthread+0x124/0x130
    [<00000000891cf28a>] ret_from_fork+0x10/0x20
unreferenced object 0xffff6eb484cd4240 (size 192):
  comm "kunit_try_catch", pid 413, jiffies 4294895543 (age 879.956s)
  hex dump (first 32 bytes):
    10 27 00 00 04 00 00 00 1e 00 00 00 2c 01 00 00  .'..........,...
    00 00 00 00 00 00 00 00 58 42 cd 84 b4 6e ff ff  ........XB...n..
  backtrace:
    [<00000000f1b5b86e>] slab_post_alloc_hook+0xb8/0x368
    [<00000000c56cdd9a>] __kmem_cache_alloc_node+0x174/0x290
    [<0000000046ef1b64>] kmalloc_trace+0x40/0x164
    [<000000008565145b>] vcap_dup_rule+0x38/0x210
    [<00000000bd9e1f12>] vcap_add_rule+0x29c/0x32c
    [<0000000070a539b1>] test_vcap_xn_rule_creator.constprop.43+0x120/0x330
    [<0000000023976dd4>] vcap_api_rule_remove_in_front_test+0x158/0x658
    [<000000000f88f9cb>] kunit_try_run_case+0x50/0xac
    [<00000000e848de5a>] kunit_generic_run_threadfn_adapter+0x20/0x2c
    [<0000000058a88b6b>] kthread+0x124/0x130
    [<00000000891cf28a>] ret_from_fork+0x10/0x20
unreferenced object 0xffff6eb484cd4300 (size 192):
  comm "kunit_try_catch", pid 413, jiffies 4294895543 (age 879.956s)
  hex dump (first 32 bytes):
    10 27 00 00 04 00 00 00 28 00 00 00 c8 00 00 00  .'......(.......
    00 00 00 00 00 00 00 00 18 43 cd 84 b4 6e ff ff  .........C...n..
  backtrace:
    [<00000000f1b5b86e>] slab_post_alloc_hook+0xb8/0x368
    [<00000000c56cdd9a>] __kmem_cache_alloc_node+0x174/0x290
    [<0000000046ef1b64>] kmalloc_trace+0x40/0x164
    [<000000008565145b>] vcap_dup_rule+0x38/0x210
    [<00000000bd9e1f12>] vcap_add_rule+0x29c/0x32c
    [<0000000070a539b1>] test_vcap_xn_rule_creator.constprop.43+0x120/0x330
    [<000000000b4760ff>] vcap_api_rule_remove_in_front_test+0x170/0x658
    [<000000000f88f9cb>] kunit_try_run_case+0x50/0xac
    [<00000000e848de5a>] kunit_generic_run_threadfn_adapter+0x20/0x2c
    [<0000000058a88b6b>] kthread+0x124/0x130
    [<00000000891cf28a>] ret_from_fork+0x10/0x20

Fixes: dccc30cc4906 ("net: microchip: sparx5: Add KUNIT test of counters and sorted rules")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agonet: microchip: sparx5: Fix possible memory leaks in test_vcap_xn_rule_creator()
Jinjie Ruan [Tue, 12 Sep 2023 11:03:09 +0000 (19:03 +0800)]
net: microchip: sparx5: Fix possible memory leaks in test_vcap_xn_rule_creator()

Inject fault while probing kunit-example-test.ko, the rule which
is allocated by kzalloc in vcap_alloc_rule(), the field which is
allocated by kzalloc in vcap_rule_add_action() and
vcap_rule_add_key() is not freed, and it cause the memory leaks
below. Use vcap_free_rule() to free them as other drivers do it.

And since the return rule of test_vcap_xn_rule_creator() is not
used, remove it and switch to void.

unreferenced object 0xffff058383334240 (size 192):
  comm "kunit_try_catch", pid 309, jiffies 4294894222 (age 639.800s)
  hex dump (first 32 bytes):
    10 27 00 00 04 00 00 00 14 00 00 00 90 01 00 00  .'..............
    00 00 00 00 00 00 00 00 00 81 93 84 83 05 ff ff  ................
  backtrace:
    [<000000008585a8f7>] slab_post_alloc_hook+0xb8/0x368
    [<00000000795eba12>] __kmem_cache_alloc_node+0x174/0x290
    [<0000000061886991>] kmalloc_trace+0x40/0x164
    [<00000000648fefae>] vcap_alloc_rule+0x17c/0x26c
    [<000000004da16164>] test_vcap_xn_rule_creator.constprop.43+0xac/0x328
    [<00000000231b1097>] vcap_api_rule_insert_in_order_test+0xcc/0x184
    [<00000000548b559e>] kunit_try_run_case+0x50/0xac
    [<00000000663f0105>] kunit_generic_run_threadfn_adapter+0x20/0x2c
    [<00000000e646f120>] kthread+0x124/0x130
    [<000000005257599e>] ret_from_fork+0x10/0x20
unreferenced object 0xffff0583849380c0 (size 64):
  comm "kunit_try_catch", pid 309, jiffies 4294894222 (age 639.800s)
  hex dump (first 32 bytes):
    40 81 93 84 83 05 ff ff 68 42 33 83 83 05 ff ff  @.......hB3.....
    22 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00  "...............
  backtrace:
    [<000000008585a8f7>] slab_post_alloc_hook+0xb8/0x368
    [<00000000795eba12>] __kmem_cache_alloc_node+0x174/0x290
    [<0000000061886991>] kmalloc_trace+0x40/0x164
    [<00000000ee41df9e>] vcap_rule_add_action+0x104/0x178
    [<000000001cc1bb38>] test_vcap_xn_rule_creator.constprop.43+0xd8/0x328
    [<00000000231b1097>] vcap_api_rule_insert_in_order_test+0xcc/0x184
    [<00000000548b559e>] kunit_try_run_case+0x50/0xac
    [<00000000663f0105>] kunit_generic_run_threadfn_adapter+0x20/0x2c
    [<00000000e646f120>] kthread+0x124/0x130
    [<000000005257599e>] ret_from_fork+0x10/0x20
unreferenced object 0xffff058384938100 (size 64):
  comm "kunit_try_catch", pid 309, jiffies 4294894222 (age 639.800s)
  hex dump (first 32 bytes):
    80 81 93 84 83 05 ff ff 58 42 33 83 83 05 ff ff  ........XB3.....
    7d 00 00 00 01 00 00 00 02 00 00 00 ff 00 00 00  }...............
  backtrace:
    [<000000008585a8f7>] slab_post_alloc_hook+0xb8/0x368
    [<00000000795eba12>] __kmem_cache_alloc_node+0x174/0x290
    [<0000000061886991>] kmalloc_trace+0x40/0x164
    [<0000000043c78991>] vcap_rule_add_key+0x104/0x180
    [<00000000ba73cfbe>] vcap_add_type_keyfield+0xfc/0x128
    [<000000002b00f7df>] vcap_val_rule+0x274/0x3e8
    [<00000000e67d2ff5>] test_vcap_xn_rule_creator.constprop.43+0xf0/0x328
    [<00000000231b1097>] vcap_api_rule_insert_in_order_test+0xcc/0x184
    [<00000000548b559e>] kunit_try_run_case+0x50/0xac
    [<00000000663f0105>] kunit_generic_run_threadfn_adapter+0x20/0x2c
    [<00000000e646f120>] kthread+0x124/0x130
    [<000000005257599e>] ret_from_fork+0x10/0x20

unreferenced object 0xffff0583833b6240 (size 192):
  comm "kunit_try_catch", pid 311, jiffies 4294894225 (age 639.844s)
  hex dump (first 32 bytes):
    10 27 00 00 04 00 00 00 1e 00 00 00 2c 01 00 00  .'..........,...
    00 00 00 00 00 00 00 00 40 91 8f 84 83 05 ff ff  ........@.......
  backtrace:
    [<000000008585a8f7>] slab_post_alloc_hook+0xb8/0x368
    [<00000000795eba12>] __kmem_cache_alloc_node+0x174/0x290
    [<0000000061886991>] kmalloc_trace+0x40/0x164
    [<00000000648fefae>] vcap_alloc_rule+0x17c/0x26c
    [<000000004da16164>] test_vcap_xn_rule_creator.constprop.43+0xac/0x328
    [<00000000509de3f4>] vcap_api_rule_insert_reverse_order_test+0x10c/0x654
    [<00000000548b559e>] kunit_try_run_case+0x50/0xac
    [<00000000663f0105>] kunit_generic_run_threadfn_adapter+0x20/0x2c
    [<00000000e646f120>] kthread+0x124/0x130
    [<000000005257599e>] ret_from_fork+0x10/0x20
unreferenced object 0xffff0583848f9100 (size 64):
  comm "kunit_try_catch", pid 311, jiffies 4294894225 (age 639.844s)
  hex dump (first 32 bytes):
    80 91 8f 84 83 05 ff ff 68 62 3b 83 83 05 ff ff  ........hb;.....
    22 00 00 00 01 00 00 00 00 00 00 00 a5 b4 ff ff  "...............
  backtrace:
    [<000000008585a8f7>] slab_post_alloc_hook+0xb8/0x368
    [<00000000795eba12>] __kmem_cache_alloc_node+0x174/0x290
    [<0000000061886991>] kmalloc_trace+0x40/0x164
    [<00000000ee41df9e>] vcap_rule_add_action+0x104/0x178
    [<000000001cc1bb38>] test_vcap_xn_rule_creator.constprop.43+0xd8/0x328
    [<00000000509de3f4>] vcap_api_rule_insert_reverse_order_test+0x10c/0x654
    [<00000000548b559e>] kunit_try_run_case+0x50/0xac
    [<00000000663f0105>] kunit_generic_run_threadfn_adapter+0x20/0x2c
    [<00000000e646f120>] kthread+0x124/0x130
    [<000000005257599e>] ret_from_fork+0x10/0x20
unreferenced object 0xffff0583848f9140 (size 64):
  comm "kunit_try_catch", pid 311, jiffies 4294894225 (age 639.844s)
  hex dump (first 32 bytes):
    c0 91 8f 84 83 05 ff ff 58 62 3b 83 83 05 ff ff  ........Xb;.....
    7d 00 00 00 01 00 00 00 02 00 00 00 ff 00 00 00  }...............
  backtrace:
    [<000000008585a8f7>] slab_post_alloc_hook+0xb8/0x368
    [<00000000795eba12>] __kmem_cache_alloc_node+0x174/0x290
    [<0000000061886991>] kmalloc_trace+0x40/0x164
    [<0000000043c78991>] vcap_rule_add_key+0x104/0x180
    [<00000000ba73cfbe>] vcap_add_type_keyfield+0xfc/0x128
    [<000000002b00f7df>] vcap_val_rule+0x274/0x3e8
    [<00000000e67d2ff5>] test_vcap_xn_rule_creator.constprop.43+0xf0/0x328
    [<00000000509de3f4>] vcap_api_rule_insert_reverse_order_test+0x10c/0x654
    [<00000000548b559e>] kunit_try_run_case+0x50/0xac
    [<00000000663f0105>] kunit_generic_run_threadfn_adapter+0x20/0x2c
    [<00000000e646f120>] kthread+0x124/0x130
    [<000000005257599e>] ret_from_fork+0x10/0x20

unreferenced object 0xffff05838264e0c0 (size 192):
  comm "kunit_try_catch", pid 313, jiffies 4294894230 (age 639.864s)
  hex dump (first 32 bytes):
    10 27 00 00 04 00 00 00 0a 00 00 00 f4 01 00 00  .'..............
    00 00 00 00 00 00 00 00 40 3a 97 84 83 05 ff ff  ........@:......
  backtrace:
    [<000000008585a8f7>] slab_post_alloc_hook+0xb8/0x368
    [<00000000795eba12>] __kmem_cache_alloc_node+0x174/0x290
    [<0000000061886991>] kmalloc_trace+0x40/0x164
    [<00000000648fefae>] vcap_alloc_rule+0x17c/0x26c
    [<000000004da16164>] test_vcap_xn_rule_creator.constprop.43+0xac/0x328
    [<00000000a29794d8>] vcap_api_rule_remove_at_end_test+0xbc/0xb48
    [<00000000548b559e>] kunit_try_run_case+0x50/0xac
    [<00000000663f0105>] kunit_generic_run_threadfn_adapter+0x20/0x2c
    [<00000000e646f120>] kthread+0x124/0x130
    [<000000005257599e>] ret_from_fork+0x10/0x20
unreferenced object 0xffff058384973a80 (size 64):
  comm "kunit_try_catch", pid 313, jiffies 4294894230 (age 639.864s)
  hex dump (first 32 bytes):
    e8 e0 64 82 83 05 ff ff e8 e0 64 82 83 05 ff ff  ..d.......d.....
    22 00 00 00 01 00 00 00 00 00 00 00 00 80 ff ff  "...............
  backtrace:
    [<000000008585a8f7>] slab_post_alloc_hook+0xb8/0x368
    [<00000000795eba12>] __kmem_cache_alloc_node+0x174/0x290
    [<0000000061886991>] kmalloc_trace+0x40/0x164
    [<00000000ee41df9e>] vcap_rule_add_action+0x104/0x178
    [<000000001cc1bb38>] test_vcap_xn_rule_creator.constprop.43+0xd8/0x328
    [<00000000a29794d8>] vcap_api_rule_remove_at_end_test+0xbc/0xb48
    [<00000000548b559e>] kunit_try_run_case+0x50/0xac
    [<00000000663f0105>] kunit_generic_run_threadfn_adapter+0x20/0x2c
    [<00000000e646f120>] kthread+0x124/0x130
    [<000000005257599e>] ret_from_fork+0x10/0x20
unreferenced object 0xffff058384973a40 (size 64):
  comm "kunit_try_catch", pid 313, jiffies 4294894230 (age 639.880s)
  hex dump (first 32 bytes):
    80 39 97 84 83 05 ff ff d8 e0 64 82 83 05 ff ff  .9........d.....
    7d 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00  }...............
  backtrace:
    [<000000008585a8f7>] slab_post_alloc_hook+0xb8/0x368
    [<00000000795eba12>] __kmem_cache_alloc_node+0x174/0x290
    [<0000000061886991>] kmalloc_trace+0x40/0x164
    [<0000000043c78991>] vcap_rule_add_key+0x104/0x180
    [<0000000094335477>] vcap_add_type_keyfield+0xbc/0x128
    [<000000002b00f7df>] vcap_val_rule+0x274/0x3e8
    [<00000000e67d2ff5>] test_vcap_xn_rule_creator.constprop.43+0xf0/0x328
    [<00000000a29794d8>] vcap_api_rule_remove_at_end_test+0xbc/0xb48
    [<00000000548b559e>] kunit_try_run_case+0x50/0xac
    [<00000000663f0105>] kunit_generic_run_threadfn_adapter+0x20/0x2c
    [<00000000e646f120>] kthread+0x124/0x130
    [<000000005257599e>] ret_from_fork+0x10/0x20

unreferenced object 0xffff0583832fa240 (size 192):
  comm "kunit_try_catch", pid 315, jiffies 4294894233 (age 639.920s)
  hex dump (first 32 bytes):
    10 27 00 00 04 00 00 00 14 00 00 00 90 01 00 00  .'..............
    00 00 00 00 00 00 00 00 00 a1 8b 84 83 05 ff ff  ................
  backtrace:
    [<000000008585a8f7>] slab_post_alloc_hook+0xb8/0x368
    [<00000000795eba12>] __kmem_cache_alloc_node+0x174/0x290
    [<0000000061886991>] kmalloc_trace+0x40/0x164
    [<00000000648fefae>] vcap_alloc_rule+0x17c/0x26c
    [<000000004da16164>] test_vcap_xn_rule_creator.constprop.43+0xac/0x328
    [<00000000be638a45>] vcap_api_rule_remove_in_middle_test+0xc4/0xb80
    [<00000000548b559e>] kunit_try_run_case+0x50/0xac
    [<00000000663f0105>] kunit_generic_run_threadfn_adapter+0x20/0x2c
    [<00000000e646f120>] kthread+0x124/0x130
    [<000000005257599e>] ret_from_fork+0x10/0x20
unreferenced object 0xffff0583848ba0c0 (size 64):
  comm "kunit_try_catch", pid 315, jiffies 4294894233 (age 639.920s)
  hex dump (first 32 bytes):
    40 a1 8b 84 83 05 ff ff 68 a2 2f 83 83 05 ff ff  @.......h./.....
    22 00 00 00 01 00 00 00 00 00 00 00 00 80 ff ff  "...............
  backtrace:
    [<000000008585a8f7>] slab_post_alloc_hook+0xb8/0x368
    [<00000000795eba12>] __kmem_cache_alloc_node+0x174/0x290
    [<0000000061886991>] kmalloc_trace+0x40/0x164
    [<00000000ee41df9e>] vcap_rule_add_action+0x104/0x178
    [<000000001cc1bb38>] test_vcap_xn_rule_creator.constprop.43+0xd8/0x328
    [<00000000be638a45>] vcap_api_rule_remove_in_middle_test+0xc4/0xb80
    [<00000000548b559e>] kunit_try_run_case+0x50/0xac
    [<00000000663f0105>] kunit_generic_run_threadfn_adapter+0x20/0x2c
    [<00000000e646f120>] kthread+0x124/0x130
    [<000000005257599e>] ret_from_fork+0x10/0x20
unreferenced object 0xffff0583848ba100 (size 64):
  comm "kunit_try_catch", pid 315, jiffies 4294894233 (age 639.920s)
  hex dump (first 32 bytes):
    80 a1 8b 84 83 05 ff ff 58 a2 2f 83 83 05 ff ff  ........X./.....
    7d 00 00 00 01 00 00 00 02 00 00 00 ff 00 00 00  }...............
  backtrace:
    [<000000008585a8f7>] slab_post_alloc_hook+0xb8/0x368
    [<00000000795eba12>] __kmem_cache_alloc_node+0x174/0x290
    [<0000000061886991>] kmalloc_trace+0x40/0x164
    [<0000000043c78991>] vcap_rule_add_key+0x104/0x180
    [<00000000ba73cfbe>] vcap_add_type_keyfield+0xfc/0x128
    [<000000002b00f7df>] vcap_val_rule+0x274/0x3e8
    [<00000000e67d2ff5>] test_vcap_xn_rule_creator.constprop.43+0xf0/0x328
    [<00000000be638a45>] vcap_api_rule_remove_in_middle_test+0xc4/0xb80
    [<00000000548b559e>] kunit_try_run_case+0x50/0xac
    [<00000000663f0105>] kunit_generic_run_threadfn_adapter+0x20/0x2c
    [<00000000e646f120>] kthread+0x124/0x130
    [<000000005257599e>] ret_from_fork+0x10/0x20

unreferenced object 0xffff0583827d2180 (size 192):
  comm "kunit_try_catch", pid 317, jiffies 4294894238 (age 639.956s)
  hex dump (first 32 bytes):
    10 27 00 00 04 00 00 00 14 00 00 00 90 01 00 00  .'..............
    00 00 00 00 00 00 00 00 00 e1 06 83 83 05 ff ff  ................
  backtrace:
    [<000000008585a8f7>] slab_post_alloc_hook+0xb8/0x368
    [<00000000795eba12>] __kmem_cache_alloc_node+0x174/0x290
    [<0000000061886991>] kmalloc_trace+0x40/0x164
    [<00000000648fefae>] vcap_alloc_rule+0x17c/0x26c
    [<000000004da16164>] test_vcap_xn_rule_creator.constprop.43+0xac/0x328
    [<00000000e1ed8350>] vcap_api_rule_remove_in_front_test+0x144/0x6c0
    [<00000000548b559e>] kunit_try_run_case+0x50/0xac
    [<00000000663f0105>] kunit_generic_run_threadfn_adapter+0x20/0x2c
    [<00000000e646f120>] kthread+0x124/0x130
    [<000000005257599e>] ret_from_fork+0x10/0x20
unreferenced object 0xffff05838306e0c0 (size 64):
  comm "kunit_try_catch", pid 317, jiffies 4294894238 (age 639.956s)
  hex dump (first 32 bytes):
    40 e1 06 83 83 05 ff ff a8 21 7d 82 83 05 ff ff  @........!}.....
    22 00 00 00 01 00 00 00 00 00 00 00 00 80 ff ff  "...............
  backtrace:
    [<000000008585a8f7>] slab_post_alloc_hook+0xb8/0x368
    [<00000000795eba12>] __kmem_cache_alloc_node+0x174/0x290
    [<0000000061886991>] kmalloc_trace+0x40/0x164
    [<00000000ee41df9e>] vcap_rule_add_action+0x104/0x178
    [<000000001cc1bb38>] test_vcap_xn_rule_creator.constprop.43+0xd8/0x328
    [<00000000e1ed8350>] vcap_api_rule_remove_in_front_test+0x144/0x6c0
    [<00000000548b559e>] kunit_try_run_case+0x50/0xac
    [<00000000663f0105>] kunit_generic_run_threadfn_adapter+0x20/0x2c
    [<00000000e646f120>] kthread+0x124/0x130
    [<000000005257599e>] ret_from_fork+0x10/0x20
unreferenced object 0xffff05838306e180 (size 64):
  comm "kunit_try_catch", pid 317, jiffies 4294894238 (age 639.968s)
  hex dump (first 32 bytes):
    98 21 7d 82 83 05 ff ff 00 e1 06 83 83 05 ff ff  .!}.............
    67 00 00 00 00 00 00 00 01 01 00 00 ff 00 00 00  g...............
  backtrace:
    [<000000008585a8f7>] slab_post_alloc_hook+0xb8/0x368
    [<00000000795eba12>] __kmem_cache_alloc_node+0x174/0x290
    [<0000000061886991>] kmalloc_trace+0x40/0x164
    [<0000000043c78991>] vcap_rule_add_key+0x104/0x180
    [<000000006ce4945d>] test_add_def_fields+0x84/0x8c
    [<00000000507e0ab6>] vcap_val_rule+0x294/0x3e8
    [<00000000e67d2ff5>] test_vcap_xn_rule_creator.constprop.43+0xf0/0x328
    [<00000000e1ed8350>] vcap_api_rule_remove_in_front_test+0x144/0x6c0
    [<00000000548b559e>] kunit_try_run_case+0x50/0xac
    [<00000000663f0105>] kunit_generic_run_threadfn_adapter+0x20/0x2c
    [<00000000e646f120>] kthread+0x124/0x130
    [<000000005257599e>] ret_from_fork+0x10/0x20

Fixes: dccc30cc4906 ("net: microchip: sparx5: Add KUNIT test of counters and sorted rules")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202309090950.uOTEKQq3-lkp@intel.com/
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agonet: microchip: sparx5: Fix possible memory leak in vcap_api_encode_rule_test()
Jinjie Ruan [Tue, 12 Sep 2023 11:03:08 +0000 (19:03 +0800)]
net: microchip: sparx5: Fix possible memory leak in vcap_api_encode_rule_test()

Inject fault while probing kunit-example-test.ko, the duprule which
is allocated in vcap_dup_rule() and the vcap enabled port which
is allocated in vcap_enable() of vcap_enable_lookups in
vcap_api_encode_rule_test() is not freed, and it cause the memory
leaks below.

Use vcap_enable_lookups() with false arg to free the vcap enabled
port as other drivers do it. And use vcap_del_rule() to
free the duprule.

unreferenced object 0xffff677a0278bb00 (size 64):
  comm "kunit_try_catch", pid 388, jiffies 4294895987 (age 1101.840s)
  hex dump (first 32 bytes):
    18 bd a5 82 00 80 ff ff 18 bd a5 82 00 80 ff ff  ................
    40 fe c8 0e be c6 ff ff 00 00 00 00 00 00 00 00  @...............
  backtrace:
    [<000000007d53023a>] slab_post_alloc_hook+0xb8/0x368
    [<0000000076e3f654>] __kmem_cache_alloc_node+0x174/0x290
    [<0000000034d76721>] kmalloc_trace+0x40/0x164
    [<00000000013380a5>] vcap_enable_lookups+0x1c8/0x70c
    [<00000000bbec496b>] vcap_api_encode_rule_test+0x2f8/0xb18
    [<000000002c2bfb7b>] kunit_try_run_case+0x50/0xac
    [<00000000ff74642b>] kunit_generic_run_threadfn_adapter+0x20/0x2c
    [<000000004af845ca>] kthread+0x124/0x130
    [<0000000038a000ca>] ret_from_fork+0x10/0x20
unreferenced object 0xffff677a027803c0 (size 192):
  comm "kunit_try_catch", pid 388, jiffies 4294895988 (age 1101.836s)
  hex dump (first 32 bytes):
    00 12 7a 00 05 00 00 00 0a 00 00 00 64 00 00 00  ..z.........d...
    00 00 00 00 00 00 00 00 d8 03 78 02 7a 67 ff ff  ..........x.zg..
  backtrace:
    [<000000007d53023a>] slab_post_alloc_hook+0xb8/0x368
    [<0000000076e3f654>] __kmem_cache_alloc_node+0x174/0x290
    [<0000000034d76721>] kmalloc_trace+0x40/0x164
    [<00000000c1010131>] vcap_dup_rule+0x34/0x14c
    [<00000000d43c54a4>] vcap_add_rule+0x29c/0x32c
    [<0000000073f1c26d>] vcap_api_encode_rule_test+0x304/0xb18
    [<000000002c2bfb7b>] kunit_try_run_case+0x50/0xac
    [<00000000ff74642b>] kunit_generic_run_threadfn_adapter+0x20/0x2c
    [<000000004af845ca>] kthread+0x124/0x130
    [<0000000038a000ca>] ret_from_fork+0x10/0x20

Fixes: c956b9b318d9 ("net: microchip: sparx5: Adding KUNIT tests of key/action values in VCAP API")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agonet: microchip: sparx5: Fix memory leak for vcap_api_rule_add_actionvalue_test()
Jinjie Ruan [Tue, 12 Sep 2023 11:03:07 +0000 (19:03 +0800)]
net: microchip: sparx5: Fix memory leak for vcap_api_rule_add_actionvalue_test()

Inject fault while probing kunit-example-test.ko, the field which
is allocated by kzalloc in vcap_rule_add_action() of
vcap_rule_add_action_bit/u32() is not freed, and it cause
the memory leaks below.

unreferenced object 0xffff0276c496b300 (size 64):
  comm "kunit_try_catch", pid 286, jiffies 4294894224 (age 920.072s)
  hex dump (first 32 bytes):
    68 3c 62 82 00 80 ff ff 68 3c 62 82 00 80 ff ff  h<b.....h<b.....
    3c 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  <...............
  backtrace:
    [<0000000028f08898>] slab_post_alloc_hook+0xb8/0x368
    [<00000000514b9b37>] __kmem_cache_alloc_node+0x174/0x290
    [<000000004620684a>] kmalloc_trace+0x40/0x164
    [<000000008b41c84d>] vcap_rule_add_action+0x104/0x178
    [<00000000ae66c16c>] vcap_api_rule_add_actionvalue_test+0xa4/0x990
    [<00000000fcc5326c>] kunit_try_run_case+0x50/0xac
    [<00000000f5f45b20>] kunit_generic_run_threadfn_adapter+0x20/0x2c
    [<0000000026284079>] kthread+0x124/0x130
    [<0000000024d4a996>] ret_from_fork+0x10/0x20
unreferenced object 0xffff0276c496b2c0 (size 64):
  comm "kunit_try_catch", pid 286, jiffies 4294894224 (age 920.072s)
  hex dump (first 32 bytes):
    68 3c 62 82 00 80 ff ff 68 3c 62 82 00 80 ff ff  h<b.....h<b.....
    3c 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00  <...............
  backtrace:
    [<0000000028f08898>] slab_post_alloc_hook+0xb8/0x368
    [<00000000514b9b37>] __kmem_cache_alloc_node+0x174/0x290
    [<000000004620684a>] kmalloc_trace+0x40/0x164
    [<000000008b41c84d>] vcap_rule_add_action+0x104/0x178
    [<00000000607782aa>] vcap_api_rule_add_actionvalue_test+0x100/0x990
    [<00000000fcc5326c>] kunit_try_run_case+0x50/0xac
    [<00000000f5f45b20>] kunit_generic_run_threadfn_adapter+0x20/0x2c
    [<0000000026284079>] kthread+0x124/0x130
    [<0000000024d4a996>] ret_from_fork+0x10/0x20
unreferenced object 0xffff0276c496b280 (size 64):
  comm "kunit_try_catch", pid 286, jiffies 4294894224 (age 920.072s)
  hex dump (first 32 bytes):
    68 3c 62 82 00 80 ff ff 68 3c 62 82 00 80 ff ff  h<b.....h<b.....
    3c 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  <...............
  backtrace:
    [<0000000028f08898>] slab_post_alloc_hook+0xb8/0x368
    [<00000000514b9b37>] __kmem_cache_alloc_node+0x174/0x290
    [<000000004620684a>] kmalloc_trace+0x40/0x164
    [<000000008b41c84d>] vcap_rule_add_action+0x104/0x178
    [<000000004e640602>] vcap_api_rule_add_actionvalue_test+0x15c/0x990
    [<00000000fcc5326c>] kunit_try_run_case+0x50/0xac
    [<00000000f5f45b20>] kunit_generic_run_threadfn_adapter+0x20/0x2c
    [<0000000026284079>] kthread+0x124/0x130
    [<0000000024d4a996>] ret_from_fork+0x10/0x20
unreferenced object 0xffff0276c496b240 (size 64):
  comm "kunit_try_catch", pid 286, jiffies 4294894224 (age 920.092s)
  hex dump (first 32 bytes):
    68 3c 62 82 00 80 ff ff 68 3c 62 82 00 80 ff ff  h<b.....h<b.....
    5a 00 00 00 01 00 00 00 32 54 76 98 00 00 00 00  Z.......2Tv.....
  backtrace:
    [<0000000028f08898>] slab_post_alloc_hook+0xb8/0x368
    [<00000000514b9b37>] __kmem_cache_alloc_node+0x174/0x290
    [<000000004620684a>] kmalloc_trace+0x40/0x164
    [<000000008b41c84d>] vcap_rule_add_action+0x104/0x178
    [<0000000011141bf8>] vcap_api_rule_add_actionvalue_test+0x1bc/0x990
    [<00000000fcc5326c>] kunit_try_run_case+0x50/0xac
    [<00000000f5f45b20>] kunit_generic_run_threadfn_adapter+0x20/0x2c
    [<0000000026284079>] kthread+0x124/0x130
    [<0000000024d4a996>] ret_from_fork+0x10/0x20
unreferenced object 0xffff0276c496b200 (size 64):
  comm "kunit_try_catch", pid 286, jiffies 4294894224 (age 920.092s)
  hex dump (first 32 bytes):
    68 3c 62 82 00 80 ff ff 68 3c 62 82 00 80 ff ff  h<b.....h<b.....
    28 00 00 00 01 00 00 00 dd cc bb aa 00 00 00 00  (...............
  backtrace:
    [<0000000028f08898>] slab_post_alloc_hook+0xb8/0x368
    [<00000000514b9b37>] __kmem_cache_alloc_node+0x174/0x290
    [<000000004620684a>] kmalloc_trace+0x40/0x164
    [<000000008b41c84d>] vcap_rule_add_action+0x104/0x178
    [<00000000d5ed3088>] vcap_api_rule_add_actionvalue_test+0x22c/0x990
    [<00000000fcc5326c>] kunit_try_run_case+0x50/0xac
    [<00000000f5f45b20>] kunit_generic_run_threadfn_adapter+0x20/0x2c
    [<0000000026284079>] kthread+0x124/0x130
    [<0000000024d4a996>] ret_from_fork+0x10/0x20

Fixes: c956b9b318d9 ("net: microchip: sparx5: Adding KUNIT tests of key/action values in VCAP API")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agonet: microchip: sparx5: Fix memory leak for vcap_api_rule_add_keyvalue_test()
Jinjie Ruan [Tue, 12 Sep 2023 11:03:06 +0000 (19:03 +0800)]
net: microchip: sparx5: Fix memory leak for vcap_api_rule_add_keyvalue_test()

Inject fault while probing kunit-example-test.ko, the field which
is allocated by kzalloc in vcap_rule_add_key() of
vcap_rule_add_key_bit/u32/u128() is not freed, and it cause
the memory leaks below.

unreferenced object 0xffff0276c14b7240 (size 64):
  comm "kunit_try_catch", pid 284, jiffies 4294894220 (age 920.072s)
  hex dump (first 32 bytes):
    28 3c 61 82 00 80 ff ff 28 3c 61 82 00 80 ff ff  (<a.....(<a.....
    67 00 00 00 00 00 00 00 00 01 37 2b af ab ff ff  g.........7+....
  backtrace:
    [<0000000028f08898>] slab_post_alloc_hook+0xb8/0x368
    [<00000000514b9b37>] __kmem_cache_alloc_node+0x174/0x290
    [<000000004620684a>] kmalloc_trace+0x40/0x164
    [<0000000059ad6bcd>] vcap_rule_add_key+0x104/0x180
    [<00000000ff8002d3>] vcap_api_rule_add_keyvalue_test+0x100/0xba8
    [<00000000fcc5326c>] kunit_try_run_case+0x50/0xac
    [<00000000f5f45b20>] kunit_generic_run_threadfn_adapter+0x20/0x2c
    [<0000000026284079>] kthread+0x124/0x130
    [<0000000024d4a996>] ret_from_fork+0x10/0x20
unreferenced object 0xffff0276c14b7280 (size 64):
  comm "kunit_try_catch", pid 284, jiffies 4294894221 (age 920.068s)
  hex dump (first 32 bytes):
    28 3c 61 82 00 80 ff ff 28 3c 61 82 00 80 ff ff  (<a.....(<a.....
    67 00 00 00 00 00 00 00 01 01 37 2b af ab ff ff  g.........7+....
  backtrace:
    [<0000000028f08898>] slab_post_alloc_hook+0xb8/0x368
    [<00000000514b9b37>] __kmem_cache_alloc_node+0x174/0x290
    [<000000004620684a>] kmalloc_trace+0x40/0x164
    [<0000000059ad6bcd>] vcap_rule_add_key+0x104/0x180
    [<00000000f5ac9dc7>] vcap_api_rule_add_keyvalue_test+0x168/0xba8
    [<00000000fcc5326c>] kunit_try_run_case+0x50/0xac
    [<00000000f5f45b20>] kunit_generic_run_threadfn_adapter+0x20/0x2c
    [<0000000026284079>] kthread+0x124/0x130
    [<0000000024d4a996>] ret_from_fork+0x10/0x20
unreferenced object 0xffff0276c14b72c0 (size 64):
  comm "kunit_try_catch", pid 284, jiffies 4294894221 (age 920.068s)
  hex dump (first 32 bytes):
    28 3c 61 82 00 80 ff ff 28 3c 61 82 00 80 ff ff  (<a.....(<a.....
    67 00 00 00 00 00 00 00 00 00 37 2b af ab ff ff  g.........7+....
  backtrace:
    [<0000000028f08898>] slab_post_alloc_hook+0xb8/0x368
    [<00000000514b9b37>] __kmem_cache_alloc_node+0x174/0x290
    [<000000004620684a>] kmalloc_trace+0x40/0x164
    [<0000000059ad6bcd>] vcap_rule_add_key+0x104/0x180
    [<00000000c918ae7f>] vcap_api_rule_add_keyvalue_test+0x1d0/0xba8
    [<00000000fcc5326c>] kunit_try_run_case+0x50/0xac
    [<00000000f5f45b20>] kunit_generic_run_threadfn_adapter+0x20/0x2c
    [<0000000026284079>] kthread+0x124/0x130
    [<0000000024d4a996>] ret_from_fork+0x10/0x20
unreferenced object 0xffff0276c14b7300 (size 64):
  comm "kunit_try_catch", pid 284, jiffies 4294894221 (age 920.084s)
  hex dump (first 32 bytes):
    28 3c 61 82 00 80 ff ff 28 3c 61 82 00 80 ff ff  (<a.....(<a.....
    7d 00 00 00 01 00 00 00 32 54 76 98 ab ff 00 ff  }.......2Tv.....
  backtrace:
    [<0000000028f08898>] slab_post_alloc_hook+0xb8/0x368
    [<00000000514b9b37>] __kmem_cache_alloc_node+0x174/0x290
    [<000000004620684a>] kmalloc_trace+0x40/0x164
    [<0000000059ad6bcd>] vcap_rule_add_key+0x104/0x180
    [<0000000003352814>] vcap_api_rule_add_keyvalue_test+0x240/0xba8
    [<00000000fcc5326c>] kunit_try_run_case+0x50/0xac
    [<00000000f5f45b20>] kunit_generic_run_threadfn_adapter+0x20/0x2c
    [<0000000026284079>] kthread+0x124/0x130
    [<0000000024d4a996>] ret_from_fork+0x10/0x20
unreferenced object 0xffff0276c14b7340 (size 64):
  comm "kunit_try_catch", pid 284, jiffies 4294894221 (age 920.084s)
  hex dump (first 32 bytes):
    28 3c 61 82 00 80 ff ff 28 3c 61 82 00 80 ff ff  (<a.....(<a.....
    51 00 00 00 07 00 00 00 17 26 35 44 63 62 71 00  Q........&5Dcbq.
  backtrace:
    [<0000000028f08898>] slab_post_alloc_hook+0xb8/0x368
    [<00000000514b9b37>] __kmem_cache_alloc_node+0x174/0x290
    [<000000004620684a>] kmalloc_trace+0x40/0x164
    [<0000000059ad6bcd>] vcap_rule_add_key+0x104/0x180
    [<000000001516f109>] vcap_api_rule_add_keyvalue_test+0x2cc/0xba8
    [<00000000fcc5326c>] kunit_try_run_case+0x50/0xac
    [<00000000f5f45b20>] kunit_generic_run_threadfn_adapter+0x20/0x2c
    [<0000000026284079>] kthread+0x124/0x130
    [<0000000024d4a996>] ret_from_fork+0x10/0x20

Fixes: c956b9b318d9 ("net: microchip: sparx5: Adding KUNIT tests of key/action values in VCAP API")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoMerge tag 'net-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Thu, 14 Sep 2023 17:03:34 +0000 (10:03 -0700)]
Merge tag 'net-6.6-rc2' of git://git./linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Quite unusually, this does not contains any fix coming from subtrees
  (nf, ebpf, wifi, etc).

  Current release - regressions:

   - bcmasp: fix possible OOB write in bcmasp_netfilt_get_all_active()

  Previous releases - regressions:

   - ipv4: fix one memleak in __inet_del_ifa()

   - tcp: fix bind() regressions for v4-mapped-v6 addresses.

   - tls: do not free tls_rec on async operation in
     bpf_exec_tx_verdict()

   - dsa: fixes for SJA1105 FDB regressions

   - veth: update XDP feature set when bringing up device

   - igb: fix hangup when enabling SR-IOV

  Previous releases - always broken:

   - kcm: fix memory leak in error path of kcm_sendmsg()

   - smc: fix data corruption in smcr_port_add

   - microchip: fix possible memory leak for vcap_dup_rule()"

* tag 'net-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (37 commits)
  kcm: Fix error handling for SOCK_DGRAM in kcm_sendmsg().
  net: renesas: rswitch: Add spin lock protection for irq {un}mask
  net: renesas: rswitch: Fix unmasking irq condition
  igb: clean up in all error paths when enabling SR-IOV
  ixgbe: fix timestamp configuration code
  selftest: tcp: Add v4-mapped-v6 cases in bind_wildcard.c.
  selftest: tcp: Move expected_errno into each test case in bind_wildcard.c.
  selftest: tcp: Fix address length in bind_wildcard.c.
  tcp: Fix bind() regression for v4-mapped-v6 non-wildcard address.
  tcp: Fix bind() regression for v4-mapped-v6 wildcard address.
  tcp: Factorise sk_family-independent comparison in inet_bind2_bucket_match(_addr_any).
  ipv6: fix ip6_sock_set_addr_preferences() typo
  veth: Update XDP feature set when bringing up device
  net: macb: fix sleep inside spinlock
  net/tls: do not free tls_rec on async operation in bpf_exec_tx_verdict()
  net: ethernet: mtk_eth_soc: fix pse_port configuration for MT7988
  net: ethernet: mtk_eth_soc: fix uninitialized variable
  kcm: Fix memory leak in error path of kcm_sendmsg()
  r8152: check budget for r8152_poll()
  net: dsa: sja1105: block FDB accesses that are concurrent with a switch reset
  ...

10 months agokcm: Fix error handling for SOCK_DGRAM in kcm_sendmsg().
Kuniyuki Iwashima [Tue, 12 Sep 2023 02:27:53 +0000 (19:27 -0700)]
kcm: Fix error handling for SOCK_DGRAM in kcm_sendmsg().

syzkaller found a memory leak in kcm_sendmsg(), and commit c821a88bd720
("kcm: Fix memory leak in error path of kcm_sendmsg()") suppressed it by
updating kcm_tx_msg(head)->last_skb if partial data is copied so that the
following sendmsg() will resume from the skb.

However, we cannot know how many bytes were copied when we get the error.
Thus, we could mess up the MSG_MORE queue.

When kcm_sendmsg() fails for SOCK_DGRAM, we should purge the queue as we
do so for UDP by udp_flush_pending_frames().

Even without this change, when the error occurred, the following sendmsg()
resumed from a wrong skb and the queue was messed up.  However, we have
yet to get such a report, and only syzkaller stumbled on it.  So, this
can be changed safely.

Note this does not change SOCK_SEQPACKET behaviour.

Fixes: c821a88bd720 ("kcm: Fix memory leak in error path of kcm_sendmsg()")
Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20230912022753.33327-1-kuniyu@amazon.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
10 months agoMerge branch 'net-renesas-rswitch-fix-a-lot-of-redundant-irq-issue'
Paolo Abeni [Thu, 14 Sep 2023 08:26:42 +0000 (10:26 +0200)]
Merge branch 'net-renesas-rswitch-fix-a-lot-of-redundant-irq-issue'

Yoshihiro Shimoda says:

====================
net: renesas: rswitch: Fix a lot of redundant irq issue

After this patch series was applied, a lot of redundant interrupts
no longer occur.

For example: when "iperf3 -c <ipaddr> -R" on R-Car S4-8 Spider
 Before the patches are applied: about 800,000 times happened
 After the patches were applied: about 100,000 times happened
====================

Link: https://lore.kernel.org/r/20230912014936.3175430-1-yoshihiro.shimoda.uh@renesas.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
10 months agonet: renesas: rswitch: Add spin lock protection for irq {un}mask
Yoshihiro Shimoda [Tue, 12 Sep 2023 01:49:36 +0000 (10:49 +0900)]
net: renesas: rswitch: Add spin lock protection for irq {un}mask

Add spin lock protection for irq {un}mask registers' control.

After napi_complete_done() and this protection were applied,
a lot of redundant interrupts no longer occur.

For example: when "iperf3 -c <ipaddr> -R" on R-Car S4-8 Spider
 Before the patches are applied: about 800,000 times happened
 After the patches were applied: about 100,000 times happened

Fixes: 3590918b5d07 ("net: ethernet: renesas: Add support for "Ethernet Switch"")
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
10 months agonet: renesas: rswitch: Fix unmasking irq condition
Yoshihiro Shimoda [Tue, 12 Sep 2023 01:49:35 +0000 (10:49 +0900)]
net: renesas: rswitch: Fix unmasking irq condition

Fix unmasking irq condition by using napi_complete_done(). Otherwise,
redundant interrupts happen.

Fixes: 3590918b5d07 ("net: ethernet: renesas: Add support for "Ethernet Switch"")
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
10 months agoMerge tag 'pmdomain-v6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh...
Linus Torvalds [Wed, 13 Sep 2023 21:18:19 +0000 (14:18 -0700)]
Merge tag 'pmdomain-v6.6-rc1' of git://git./linux/kernel/git/ulfh/linux-pm

Pull genpm / pmdomain rename from Ulf Hansson:
 "This renames the genpd subsystem to pmdomain.

  As discussed on LKML, using 'genpd' as the name of a subsystem isn't
  very self-explanatory and the acronym itself that means Generic PM
  Domain, is known only by a limited group of people.

  The suggestion to improve the situation is to rename the subsystem to
  'pmdomain', which there seems to be a good consensus around using.

  Ideally it should indicate that its purpose is to manage Power Domains
  or 'PM domains' as we often also use within the Linux Kernel
  terminology"

* tag 'pmdomain-v6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm:
  pmdomain: Rename the genpd subsystem to pmdomain

10 months agoselftests: netfilter: Test nf_tables audit logging
Phil Sutter [Wed, 13 Sep 2023 13:51:37 +0000 (15:51 +0200)]
selftests: netfilter: Test nf_tables audit logging

Compare NETFILTER_CFG type audit logs emitted from kernel upon ruleset
modifications against expected output.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
10 months agonetfilter: nf_tables: Fix entries val in rule reset audit log
Phil Sutter [Wed, 13 Sep 2023 13:51:36 +0000 (15:51 +0200)]
netfilter: nf_tables: Fix entries val in rule reset audit log

The value in idx and the number of rules handled in that particular
__nf_tables_dump_rules() call is not identical. The former is a cursor
to pick up from if multiple netlink messages are needed, so its value is
ever increasing. Fixing this is not just a matter of subtracting s_idx
from it, though: When resetting rules in multiple chains,
__nf_tables_dump_rules() is called for each and cb->args[0] is not
adjusted in between. Introduce a dedicated counter to record the number
of rules reset in this call in a less confusing way.

While being at it, prevent the direct return upon buffer exhaustion: Any
rules previously dumped into that skb would evade audit logging
otherwise.

Fixes: 9b5ba5c9c5109 ("netfilter: nf_tables: Unbreak audit log reset")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
10 months agonetfilter: conntrack: fix extension size table
Florian Westphal [Tue, 12 Sep 2023 08:56:07 +0000 (10:56 +0200)]
netfilter: conntrack: fix extension size table

The size table is incorrect due to copypaste error,
this reserves more size than needed.

TSTAMP reserved 32 instead of 16 bytes.
TIMEOUT reserved 16 instead of 8 bytes.

Fixes: 5f31edc0676b ("netfilter: conntrack: move extension sizes into core")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
10 months agoselftests/bpf: Fix kprobe_multi_test/attach_override test
Jiri Olsa [Wed, 13 Sep 2023 11:47:11 +0000 (13:47 +0200)]
selftests/bpf: Fix kprobe_multi_test/attach_override test

We need to deny the attach_override test for arm64, denying the
whole kprobe_multi_test suite. Also making attach_override static.

Fixes: 7182e56411b9 ("selftests/bpf: Add kprobe_multi override test")
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230913114711.499829-1-jolsa@kernel.org
10 months agoMerge tag 'tpmdd-v6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko...
Linus Torvalds [Wed, 13 Sep 2023 18:44:20 +0000 (11:44 -0700)]
Merge tag 'tpmdd-v6.6-rc2' of git://git./linux/kernel/git/jarkko/linux-tpmdd

Pull tpm fix from Jarkko Sakkinen.

* tag 'tpmdd-v6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
  tpm: Fix typo in tpmrm class definition

10 months agoMerge tag 'parisc-for-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/delle...
Linus Torvalds [Wed, 13 Sep 2023 18:35:53 +0000 (11:35 -0700)]
Merge tag 'parisc-for-6.6-rc2' of git://git./linux/kernel/git/deller/parisc-linux

Pull parisc architecture fixes from Helge Deller:

 - fix reference to exported symbols for parisc64 [Masahiro Yamada]

 - Block-TLB (BTLB) support on 32-bit CPUs

 - sparse and build-warning fixes

* tag 'parisc-for-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  linux/export: fix reference to exported functions for parisc64
  parisc: BTLB: Initialize BTLB tables at CPU startup
  parisc: firmware: Simplify calling non-PA20 functions
  parisc: BTLB: _edata symbol has to be page aligned for BTLB support
  parisc: BTLB: Add BTLB insert and purge firmware function wrappers
  parisc: BTLB: Clear possibly existing BTLB entries
  parisc: Prepare for Block-TLB support on 32-bit kernel
  parisc: shmparam.h: Document aliasing requirements of PA-RISC
  parisc: irq: Make irq_stack_union static to avoid sparse warning
  parisc: drivers: Fix sparse warning
  parisc: iosapic.c: Fix sparse warnings
  parisc: ccio-dma: Fix sparse warnings
  parisc: sba-iommu: Fix sparse warnigs
  parisc: sba: Fix compile warning wrt list of SBA devices
  parisc: sba_iommu: Fix build warning if procfs if disabled

10 months agoMerge tag 'trace-v6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace...
Linus Torvalds [Wed, 13 Sep 2023 18:30:11 +0000 (11:30 -0700)]
Merge tag 'trace-v6.6-rc1' of git://git./linux/kernel/git/trace/linux-trace

Pull tracing fixes from Steven Rostedt:

 - Add missing LOCKDOWN checks for eventfs callers

   When LOCKDOWN is active for tracing, it causes inconsistent state
   when some functions succeed and others fail.

 - Use dput() to free the top level eventfs descriptor

   There was a race between accesses and freeing it.

 - Fix a long standing bug that eventfs exposed due to changing timings
   by dynamically creating files. That is, If a event file is opened for
   an instance, there's nothing preventing the instance from being
   removed which will make accessing the files cause use-after-free
   bugs.

 - Fix a ring buffer race that happens when iterating over the ring
   buffer while writers are active. Check to make sure not to read the
   event meta data if it's beyond the end of the ring buffer sub buffer.

 - Fix the print trigger that disappeared because the test to create it
   was looking for the event dir field being filled, but now it has the
   "ef" field filled for the eventfs structure.

 - Remove the unused "dir" field from the event structure.

 - Fix the order of the trace_dynamic_info as it had it backwards for
   the offset and len fields for which one was for which endianess.

 - Fix NULL pointer dereference with eventfs_remove_rec()

   If an allocation fails in one of the eventfs_add_*() functions, the
   caller of it in event_subsystem_dir() or event_create_dir() assigns
   the result to the structure. But it's assigning the ERR_PTR and not
   NULL. This was passed to eventfs_remove_rec() which expects either a
   good pointer or a NULL, not ERR_PTR. The fix is to not assign the
   ERR_PTR to the structure, but to keep it NULL on error.

 - Fix list_for_each_rcu() to use list_for_each_srcu() in
   dcache_dir_open_wrapper(). One iteration of the code used RCU but
   because it had to call sleepable code, it had to be changed to use
   SRCU, but one of the iterations was missed.

 - Fix synthetic event print function to use "as_u64" instead of passing
   in a pointer to the union. To fix big/little endian issues, the u64
   that represented several types was turned into a union to define the
   types properly.

* tag 'trace-v6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  eventfs: Fix the NULL pointer dereference bug in eventfs_remove_rec()
  tracefs/eventfs: Use list_for_each_srcu() in dcache_dir_open_wrapper()
  tracing/synthetic: Print out u64 values properly
  tracing/synthetic: Fix order of struct trace_dynamic_info
  selftests/ftrace: Fix dependencies for some of the synthetic event tests
  tracing: Remove unused trace_event_file dir field
  tracing: Use the new eventfs descriptor for print trigger
  ring-buffer: Do not attempt to read past "commit"
  tracefs/eventfs: Free top level files on removal
  ring-buffer: Avoid softlockup in ring_buffer_resize()
  tracing: Have event inject files inc the trace array ref count
  tracing: Have option files inc the trace array ref count
  tracing: Have current_trace inc the trace array ref count
  tracing: Have tracing_max_latency inc the trace array ref count
  tracing: Increase trace array ref count on enable and filter files
  tracefs/eventfs: Use dput to free the toplevel events directory
  tracefs/eventfs: Add missing lockdown checks
  tracefs: Add missing lockdown check to tracefs_create_dir()

10 months agoigb: clean up in all error paths when enabling SR-IOV
Corinna Vinschen [Mon, 11 Sep 2023 20:28:49 +0000 (13:28 -0700)]
igb: clean up in all error paths when enabling SR-IOV

After commit 50f303496d92 ("igb: Enable SR-IOV after reinit"), removing
the igb module could hang or crash (depending on the machine) when the
module has been loaded with the max_vfs parameter set to some value != 0.

In case of one test machine with a dual port 82580, this hang occurred:

[  232.480687] igb 0000:41:00.1: removed PHC on enp65s0f1
[  233.093257] igb 0000:41:00.1: IOV Disabled
[  233.329969] pcieport 0000:40:01.0: AER: Multiple Uncorrected (Non-Fatal) err0
[  233.340302] igb 0000:41:00.0: PCIe Bus Error: severity=Uncorrected (Non-Fata)
[  233.352248] igb 0000:41:00.0:   device [8086:1516] error status/mask=00100000
[  233.361088] igb 0000:41:00.0:    [20] UnsupReq               (First)
[  233.368183] igb 0000:41:00.0: AER:   TLP Header: 40000001 0000040f cdbfc00c c
[  233.376846] igb 0000:41:00.1: PCIe Bus Error: severity=Uncorrected (Non-Fata)
[  233.388779] igb 0000:41:00.1:   device [8086:1516] error status/mask=00100000
[  233.397629] igb 0000:41:00.1:    [20] UnsupReq               (First)
[  233.404736] igb 0000:41:00.1: AER:   TLP Header: 40000001 0000040f cdbfc00c c
[  233.538214] pci 0000:41:00.1: AER: can't recover (no error_detected callback)
[  233.538401] igb 0000:41:00.0: removed PHC on enp65s0f0
[  233.546197] pcieport 0000:40:01.0: AER: device recovery failed
[  234.157244] igb 0000:41:00.0: IOV Disabled
[  371.619705] INFO: task irq/35-aerdrv:257 blocked for more than 122 seconds.
[  371.627489]       Not tainted 6.4.0-dirty #2
[  371.632257] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this.
[  371.641000] task:irq/35-aerdrv   state:D stack:0     pid:257   ppid:2      f0
[  371.650330] Call Trace:
[  371.653061]  <TASK>
[  371.655407]  __schedule+0x20e/0x660
[  371.659313]  schedule+0x5a/0xd0
[  371.662824]  schedule_preempt_disabled+0x11/0x20
[  371.667983]  __mutex_lock.constprop.0+0x372/0x6c0
[  371.673237]  ? __pfx_aer_root_reset+0x10/0x10
[  371.678105]  report_error_detected+0x25/0x1c0
[  371.682974]  ? __pfx_report_normal_detected+0x10/0x10
[  371.688618]  pci_walk_bus+0x72/0x90
[  371.692519]  pcie_do_recovery+0xb2/0x330
[  371.696899]  aer_process_err_devices+0x117/0x170
[  371.702055]  aer_isr+0x1c0/0x1e0
[  371.705661]  ? __set_cpus_allowed_ptr+0x54/0xa0
[  371.710723]  ? __pfx_irq_thread_fn+0x10/0x10
[  371.715496]  irq_thread_fn+0x20/0x60
[  371.719491]  irq_thread+0xe6/0x1b0
[  371.723291]  ? __pfx_irq_thread_dtor+0x10/0x10
[  371.728255]  ? __pfx_irq_thread+0x10/0x10
[  371.732731]  kthread+0xe2/0x110
[  371.736243]  ? __pfx_kthread+0x10/0x10
[  371.740430]  ret_from_fork+0x2c/0x50
[  371.744428]  </TASK>

The reproducer was a simple script:

  #!/bin/sh
  for i in `seq 1 5`; do
    modprobe -rv igb
    modprobe -v igb max_vfs=1
    sleep 1
    modprobe -rv igb
  done

It turned out that this could only be reproduce on 82580 (quad and
dual-port), but not on 82576, i350 and i210.  Further debugging showed
that igb_enable_sriov()'s call to pci_enable_sriov() is failing, because
dev->is_physfn is 0 on 82580.

Prior to commit 50f303496d92 ("igb: Enable SR-IOV after reinit"),
igb_enable_sriov() jumped into the "err_out" cleanup branch.  After this
commit it only returned the error code.

So the cleanup didn't take place, and the incorrect VF setup in the
igb_adapter structure fooled the igb driver into assuming that VFs have
been set up where no VF actually existed.

Fix this problem by cleaning up again if pci_enable_sriov() fails.

Fixes: 50f303496d92 ("igb: Enable SR-IOV after reinit")
Signed-off-by: Corinna Vinschen <vinschen@redhat.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoixgbe: fix timestamp configuration code
Vadim Fedorenko [Mon, 11 Sep 2023 20:28:14 +0000 (13:28 -0700)]
ixgbe: fix timestamp configuration code

The commit in fixes introduced flags to control the status of hardware
configuration while processing packets. At the same time another structure
is used to provide configuration of timestamper to user-space applications.
The way it was coded makes this structures go out of sync easily. The
repro is easy for 82599 chips:

[root@hostname ~]# hwstamp_ctl -i eth0 -r 12 -t 1
current settings:
tx_type 0
rx_filter 0
new settings:
tx_type 1
rx_filter 12

The eth0 device is properly configured to timestamp any PTPv2 events.

[root@hostname ~]# hwstamp_ctl -i eth0 -r 1 -t 1
current settings:
tx_type 1
rx_filter 12
SIOCSHWTSTAMP failed: Numerical result out of range
The requested time stamping mode is not supported by the hardware.

The error is properly returned because HW doesn't support all packets
timestamping. But the adapter->flags is cleared of timestamp flags
even though no HW configuration was done. From that point no RX timestamps
are received by user-space application. But configuration shows good
values:

[root@hostname ~]# hwstamp_ctl -i eth0
current settings:
tx_type 1
rx_filter 12

Fix the issue by applying new flags only when the HW was actually
configured.

Fixes: a9763f3cb54c ("ixgbe: Update PTP to support X550EM_x devices")
Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agopmdomain: Rename the genpd subsystem to pmdomain
Ulf Hansson [Tue, 12 Sep 2023 22:11:27 +0000 (00:11 +0200)]
pmdomain: Rename the genpd subsystem to pmdomain

It has been pointed out that naming a subsystem "genpd" isn't very
self-explanatory and the acronym itself that means Generic PM Domain, is
known only by a limited group of people.

In a way to improve the situation, let's rename the subsystem to pmdomain,
which ideally should indicate that this is about so called Power Domains or
"PM domains" as we often also use within the Linux Kernel terminology.

Suggested-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Heiko Stuebner <heiko@sntech.de>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20230912221127.487327-1-ulf.hansson@linaro.org
10 months agoMerge branch 'tcp-bind-fixes'
David S. Miller [Wed, 13 Sep 2023 06:18:05 +0000 (07:18 +0100)]
Merge branch 'tcp-bind-fixes'

Kuniyuki Iwashima says:

====================
tcp: Fix bind() regression for v4-mapped-v6 address

Since bhash2 was introduced, bind() is broken in two cases related
to v4-mapped-v6 address.

This series fixes the regression and adds test to cover the cases.

Changes:
  v2:
    * Added patch 1 to factorise duplicated comparison (Eric Dumazet)

  v1: https://lore.kernel.org/netdev/20230911165106.39384-1-kuniyu@amazon.com/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoselftest: tcp: Add v4-mapped-v6 cases in bind_wildcard.c.
Kuniyuki Iwashima [Mon, 11 Sep 2023 18:37:00 +0000 (11:37 -0700)]
selftest: tcp: Add v4-mapped-v6 cases in bind_wildcard.c.

We add these 8 test cases in bind_wildcard.c to check bind() conflicts.

  1st bind()          2nd bind()
  ---------           ---------
  0.0.0.0             ::FFFF:0.0.0.0
  ::FFFF:0.0.0.0      0.0.0.0
  0.0.0.0             ::FFFF:127.0.0.1
  ::FFFF:127.0.0.1    0.0.0.0
  127.0.0.1           ::FFFF:0.0.0.0
  ::FFFF:0.0.0.0      127.0.0.1
  127.0.0.1           ::FFFF:127.0.0.1
  ::FFFF:127.0.0.1    127.0.0.1

All test passed without bhash2 and with bhash2 and this series.

 Before bhash2:
  $ uname -r
  6.0.0-rc1-00393-g0bf73255d3a3
  $ ./bind_wildcard
  ...
  # PASSED: 16 / 16 tests passed.

 Just after bhash2:
  $ uname -r
  6.0.0-rc1-00394-g28044fc1d495
  $ ./bind_wildcard
  ...
  ok 15 bind_wildcard.v4_local_v6_v4mapped_local.v4_v6
  not ok 16 bind_wildcard.v4_local_v6_v4mapped_local.v6_v4
  # FAILED: 15 / 16 tests passed.

 On net.git:
  $ ./bind_wildcard
  ...
  not ok 14 bind_wildcard.v4_local_v6_v4mapped_any.v6_v4
  not ok 16 bind_wildcard.v4_local_v6_v4mapped_local.v6_v4
  # FAILED: 13 / 16 tests passed.

 With this series:
  $ ./bind_wildcard
  ...
  # PASSED: 16 / 16 tests passed.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoselftest: tcp: Move expected_errno into each test case in bind_wildcard.c.
Kuniyuki Iwashima [Mon, 11 Sep 2023 18:36:59 +0000 (11:36 -0700)]
selftest: tcp: Move expected_errno into each test case in bind_wildcard.c.

This is a preparation patch for the following patch.

Let's define expected_errno in each test case so that we can add other test
cases easily.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agoselftest: tcp: Fix address length in bind_wildcard.c.
Kuniyuki Iwashima [Mon, 11 Sep 2023 18:36:58 +0000 (11:36 -0700)]
selftest: tcp: Fix address length in bind_wildcard.c.

The selftest passes the IPv6 address length for an IPv4 address.
We should pass the correct length.

Note inet_bind_sk() does not check if the size is larger than
sizeof(struct sockaddr_in), so there is no real bug in this
selftest.

Fixes: 13715acf8ab5 ("selftest: Add test for bind() conflicts.")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agotcp: Fix bind() regression for v4-mapped-v6 non-wildcard address.
Kuniyuki Iwashima [Mon, 11 Sep 2023 18:36:57 +0000 (11:36 -0700)]
tcp: Fix bind() regression for v4-mapped-v6 non-wildcard address.

Since bhash2 was introduced, the example below does not work as expected.
These two bind() should conflict, but the 2nd bind() now succeeds.

  from socket import *

  s1 = socket(AF_INET6, SOCK_STREAM)
  s1.bind(('::ffff:127.0.0.1', 0))

  s2 = socket(AF_INET, SOCK_STREAM)
  s2.bind(('127.0.0.1', s1.getsockname()[1]))

During the 2nd bind() in inet_csk_get_port(), inet_bind2_bucket_find()
fails to find the 1st socket's tb2, so inet_bind2_bucket_create() allocates
a new tb2 for the 2nd socket.  Then, we call inet_csk_bind_conflict() that
checks conflicts in the new tb2 by inet_bhash2_conflict().  However, the
new tb2 does not include the 1st socket, thus the bind() finally succeeds.

In this case, inet_bind2_bucket_match() must check if AF_INET6 tb2 has
the conflicting v4-mapped-v6 address so that inet_bind2_bucket_find()
returns the 1st socket's tb2.

Note that if we bind two sockets to 127.0.0.1 and then ::FFFF:127.0.0.1,
the 2nd bind() fails properly for the same reason mentinoed in the previous
commit.

Fixes: 28044fc1d495 ("net: Add a bhash2 table hashed by port and address")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Andrei Vagin <avagin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agotcp: Fix bind() regression for v4-mapped-v6 wildcard address.
Kuniyuki Iwashima [Mon, 11 Sep 2023 18:36:56 +0000 (11:36 -0700)]
tcp: Fix bind() regression for v4-mapped-v6 wildcard address.

Andrei Vagin reported bind() regression with strace logs.

If we bind() a TCPv6 socket to ::FFFF:0.0.0.0 and then bind() a TCPv4
socket to 127.0.0.1, the 2nd bind() should fail but now succeeds.

  from socket import *

  s1 = socket(AF_INET6, SOCK_STREAM)
  s1.bind(('::ffff:0.0.0.0', 0))

  s2 = socket(AF_INET, SOCK_STREAM)
  s2.bind(('127.0.0.1', s1.getsockname()[1]))

During the 2nd bind(), if tb->family is AF_INET6 and sk->sk_family is
AF_INET in inet_bind2_bucket_match_addr_any(), we still need to check
if tb has the v4-mapped-v6 wildcard address.

The example above does not work after commit 5456262d2baa ("net: Fix
incorrect address comparison when searching for a bind2 bucket"), but
the blamed change is not the commit.

Before the commit, the leading zeros of ::FFFF:0.0.0.0 were treated
as 0.0.0.0, and the sequence above worked by chance.  Technically, this
case has been broken since bhash2 was introduced.

Note that if we bind() two sockets to 127.0.0.1 and then ::FFFF:0.0.0.0,
the 2nd bind() fails properly because we fall back to using bhash to
detect conflicts for the v4-mapped-v6 address.

Fixes: 28044fc1d495 ("net: Add a bhash2 table hashed by port and address")
Reported-by: Andrei Vagin <avagin@google.com>
Closes: https://lore.kernel.org/netdev/ZPuYBOFC8zsK6r9T@google.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agotcp: Factorise sk_family-independent comparison in inet_bind2_bucket_match(_addr_any).
Kuniyuki Iwashima [Mon, 11 Sep 2023 18:36:55 +0000 (11:36 -0700)]
tcp: Factorise sk_family-independent comparison in inet_bind2_bucket_match(_addr_any).

This is a prep patch to make the following patches cleaner that touch
inet_bind2_bucket_match() and inet_bind2_bucket_match_addr_any().

Both functions have duplicated comparison for netns, port, and l3mdev.
Let's factorise them.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
10 months agobpf, cgroup: fix multiple kernel-doc warnings
Randy Dunlap [Tue, 12 Sep 2023 06:08:12 +0000 (23:08 -0700)]
bpf, cgroup: fix multiple kernel-doc warnings

Fix missing or extra function parameter kernel-doc warnings
in cgroup.c:

kernel/bpf/cgroup.c:1359: warning: Excess function parameter 'type' description in '__cgroup_bpf_run_filter_skb'
kernel/bpf/cgroup.c:1359: warning: Function parameter or member 'atype' not described in '__cgroup_bpf_run_filter_skb'
kernel/bpf/cgroup.c:1439: warning: Excess function parameter 'type' description in '__cgroup_bpf_run_filter_sk'
kernel/bpf/cgroup.c:1439: warning: Function parameter or member 'atype' not described in '__cgroup_bpf_run_filter_sk'
kernel/bpf/cgroup.c:1467: warning: Excess function parameter 'type' description in '__cgroup_bpf_run_filter_sock_addr'
kernel/bpf/cgroup.c:1467: warning: Function parameter or member 'atype' not described in '__cgroup_bpf_run_filter_sock_addr'
kernel/bpf/cgroup.c:1512: warning: Excess function parameter 'type' description in '__cgroup_bpf_run_filter_sock_ops'
kernel/bpf/cgroup.c:1512: warning: Function parameter or member 'atype' not described in '__cgroup_bpf_run_filter_sock_ops'
kernel/bpf/cgroup.c:1685: warning: Excess function parameter 'type' description in '__cgroup_bpf_run_filter_sysctl'
kernel/bpf/cgroup.c:1685: warning: Function parameter or member 'atype' not described in '__cgroup_bpf_run_filter_sysctl'
kernel/bpf/cgroup.c:795: warning: Excess function parameter 'type' description in '__cgroup_bpf_replace'
kernel/bpf/cgroup.c:795: warning: Function parameter or member 'new_prog' not described in '__cgroup_bpf_replace'

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: bpf@vger.kernel.org
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230912060812.1715-1-rdunlap@infradead.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
10 months agoselftests/bpf: fix unpriv_disabled check in test_verifier
Artem Savkov [Tue, 12 Sep 2023 12:06:31 +0000 (14:06 +0200)]
selftests/bpf: fix unpriv_disabled check in test_verifier

Commit 1d56ade032a49 changed the function get_unpriv_disabled() to
return its results as a bool instead of updating a global variable, but
test_verifier was not updated to keep in line with these changes. Thus
unpriv_disabled is always false in test_verifier and unprivileged tests
are not properly skipped on systems with unprivileged bpf disabled.

Fixes: 1d56ade032a49 ("selftests/bpf: Unprivileged tests for test_loader.c")
Signed-off-by: Artem Savkov <asavkov@redhat.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230912120631.213139-1-asavkov@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
10 months agobpf: Fix a erroneous check after snprintf()
Christophe JAILLET [Fri, 8 Sep 2023 16:33:35 +0000 (18:33 +0200)]
bpf: Fix a erroneous check after snprintf()

snprintf() does not return negative error code on error, it returns the
number of characters which *would* be generated for the given input.

Fix the error handling check.

Fixes: 57539b1c0ac2 ("bpf: Enable annotating trusted nested pointers")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/393bdebc87b22563c08ace094defa7160eb7a6c0.1694190795.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
10 months agotpm: Fix typo in tpmrm class definition
Justin M. Forbes [Tue, 12 Sep 2023 17:02:47 +0000 (12:02 -0500)]
tpm: Fix typo in tpmrm class definition

Commit d2e8071bed0be ("tpm: make all 'class' structures const")
unfortunately had a typo for the name on tpmrm.

Fixes: d2e8071bed0b ("tpm: make all 'class' structures const")
Signed-off-by: Justin M. Forbes <jforbes@fedoraproject.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
10 months agoMerge tag 'for-6.6-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave...
Linus Torvalds [Tue, 12 Sep 2023 18:28:00 +0000 (11:28 -0700)]
Merge tag 'for-6.6-rc1-tag' of git://git./linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:

 - several fixes for handling directory item (inserting, removing,
   iteration, error handling)

 - fix transaction commit stalls when auto relocation is running and
   blocks other tasks that want to commit

 - fix a build error when DEBUG is enabled

 - fix lockdep warning in inode number lookup ioctl

 - fix race when finishing block group creation

 - remove link to obsolete wiki in several files

* tag 'for-6.6-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  MAINTAINERS: remove links to obsolete btrfs.wiki.kernel.org
  btrfs: assert delayed node locked when removing delayed item
  btrfs: remove BUG() after failure to insert delayed dir index item
  btrfs: improve error message after failure to add delayed dir index item
  btrfs: fix a compilation error if DEBUG is defined in btree_dirty_folio
  btrfs: check for BTRFS_FS_ERROR in pending ordered assert
  btrfs: fix lockdep splat and potential deadlock after failure running delayed items
  btrfs: do not block starts waiting on previous transaction commit
  btrfs: release path before inode lookup during the ino lookup ioctl
  btrfs: fix race between finishing block group creation and its item update

10 months agoMerge tag 'platform-drivers-x86-v6.6-2' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Tue, 12 Sep 2023 18:19:31 +0000 (11:19 -0700)]
Merge tag 'platform-drivers-x86-v6.6-2' of git://git./linux/kernel/git/pdx86/platform-drivers-x86

Pull x86 platform driver fixes from Hans de Goede:

 - various platform/mellanox fixes

 - one new DMI quirk for asus-wmi

* tag 'platform-drivers-x86-v6.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
  platform/x86: asus-wmi: Support 2023 ROG X16 tablet mode
  platform/mellanox: NVSW_SN2201 should depend on ACPI
  platform/mellanox: mlxbf-bootctl: add NET dependency into Kconfig
  platform/mellanox: mlxbf-pmc: Fix reading of unprogrammed events
  platform/mellanox: mlxbf-pmc: Fix potential buffer overflows
  platform/mellanox: mlxbf-tmfifo: Drop jumbo frames
  platform/mellanox: mlxbf-tmfifo: Drop the Rx packet if no more descriptors

10 months agoipv6: fix ip6_sock_set_addr_preferences() typo
Eric Dumazet [Mon, 11 Sep 2023 15:42:13 +0000 (15:42 +0000)]
ipv6: fix ip6_sock_set_addr_preferences() typo

ip6_sock_set_addr_preferences() second argument should be an integer.

SUNRPC attempts to set IPV6_PREFER_SRC_PUBLIC were
translated to IPV6_PREFER_SRC_TMP

Fixes: 18d5ad623275 ("ipv6: add ip6_sock_set_addr_preferences")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20230911154213.713941-1-edumazet@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
10 months agoMerge tag 'linux-kselftest-next-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Tue, 12 Sep 2023 16:10:36 +0000 (09:10 -0700)]
Merge tag 'linux-kselftest-next-6.6-rc2' of git://git./linux/kernel/git/shuah/linux-kselftest

Pull kselftest fixes from Shuah Khan:

 - kselftest runner script to propagate SIGTERM to runner child
   to avoid kselftest hang

 - install symlinks required for test execution to avoid test
   failures

 - kselftest dependency checker script argument parsing

* tag 'linux-kselftest-next-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests: Keep symlinks, when possible
  selftests: fix dependency checker script
  kselftest/runner.sh: Propagate SIGTERM to runner child
  selftests/ftrace: Correctly enable event in instance-event.tc

10 months agoMerge tag 'linux-kselftest-kunit-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kerne...
Linus Torvalds [Tue, 12 Sep 2023 16:05:49 +0000 (09:05 -0700)]
Merge tag 'linux-kselftest-kunit-6.6-rc2' of git://git./linux/kernel/git/shuah/linux-kselftest

Pull kunit fixes from Shuah Khan:
 "Fixes to possible memory leak, null-ptr-deref, wild-memory-access, and
  error path bugs"

* tag 'linux-kselftest-kunit-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  kunit: Fix possible memory leak in kunit_filter_suites()
  kunit: Fix possible null-ptr-deref in kunit_parse_glob_filter()
  kunit: Fix the wrong err path and add goto labels in kunit_filter_suites()
  kunit: Fix wild-memory-access bug in kunit_free_suite_set()
  kunit: test: Make filter strings in executor_test writable

10 months agoMerge tag 'ovl-fixes-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/overla...
Linus Torvalds [Tue, 12 Sep 2023 16:00:25 +0000 (09:00 -0700)]
Merge tag 'ovl-fixes-6.6-rc2' of git://git./linux/kernel/git/overlayfs/vfs

Pull overlayfs fixes from Amir Goldstein:
 "Two fixes for pretty old regressions"

* tag 'ovl-fixes-6.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs:
  ovl: fix incorrect fdput() on aio completion
  ovl: fix failed copyup of fileattr on a symlink

10 months agolinux/export: fix reference to exported functions for parisc64
Masahiro Yamada [Tue, 5 Sep 2023 18:46:57 +0000 (03:46 +0900)]
linux/export: fix reference to exported functions for parisc64

John David Anglin reported parisc has been broken since commit
ddb5cdbafaaa ("kbuild: generate KSYMTAB entries by modpost").

Like ia64, parisc64 uses a function descriptor. The function
references must be prefixed with P%.

Also, symbols prefixed $$ from the library have the symbol type
STT_LOPROC instead of STT_FUNC. They should be handled as functions
too.

Fixes: ddb5cdbafaaa ("kbuild: generate KSYMTAB entries by modpost")
Reported-by: John David Anglin <dave.anglin@bell.net>
Tested-by: John David Anglin <dave.anglin@bell.net>
Tested-by: Helge Deller <deller@gmx.de>
Closes: https://lore.kernel.org/linux-parisc/1901598a-e11d-f7dd-a5d9-9a69d06e6b6e@bell.net/T/#u
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Helge Deller <deller@gmx.de>
10 months agoselftests/bpf: ensure all CI arches set CONFIG_BPF_KPROBE_OVERRIDE=y
Andrii Nakryiko [Tue, 12 Sep 2023 05:59:28 +0000 (22:59 -0700)]
selftests/bpf: ensure all CI arches set CONFIG_BPF_KPROBE_OVERRIDE=y

Turns out CONFIG_BPF_KPROBE_OVERRIDE=y is only enabled in x86-64 CI, but
is not set on aarch64, causing CI failures ([0]).

Move CONFIG_BPF_KPROBE_OVERRIDE=y to arch-agnostic CI config.

  [0] https://github.com/kernel-patches/bpf/actions/runs/6122324047/job/16618390535

Fixes: 7182e56411b9 ("selftests/bpf: Add kprobe_multi override test")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230912055928.1704269-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
10 months agoveth: Update XDP feature set when bringing up device
Toke Høiland-Jørgensen [Mon, 11 Sep 2023 13:58:25 +0000 (15:58 +0200)]
veth: Update XDP feature set when bringing up device

There's an early return in veth_set_features() if the device is in a down
state, which leads to the XDP feature flags not being updated when enabling
GRO while the device is down. Which in turn leads to XDP_REDIRECT not
working, because the redirect code now checks the flags.

Fix this by updating the feature flags after bringing the device up.

Before this patch:

NETDEV_XDP_ACT_BASIC: yes
NETDEV_XDP_ACT_REDIRECT: yes
NETDEV_XDP_ACT_NDO_XMIT: no
NETDEV_XDP_ACT_XSK_ZEROCOPY: no
NETDEV_XDP_ACT_HW_OFFLOAD: no
NETDEV_XDP_ACT_RX_SG: yes
NETDEV_XDP_ACT_NDO_XMIT_SG: no

After this patch:

NETDEV_XDP_ACT_BASIC: yes
NETDEV_XDP_ACT_REDIRECT: yes
NETDEV_XDP_ACT_NDO_XMIT: yes
NETDEV_XDP_ACT_XSK_ZEROCOPY: no
NETDEV_XDP_ACT_HW_OFFLOAD: no
NETDEV_XDP_ACT_RX_SG: yes
NETDEV_XDP_ACT_NDO_XMIT_SG: yes

Fixes: fccca038f300 ("veth: take into account device reconfiguration for xdp_features flag")
Fixes: 66c0e13ad236 ("drivers: net: turn on XDP features")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20230911135826.722295-1-toke@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
10 months agoeventfs: Fix the NULL pointer dereference bug in eventfs_remove_rec()
Jinjie Ruan [Tue, 12 Sep 2023 13:47:52 +0000 (21:47 +0800)]
eventfs: Fix the NULL pointer dereference bug in eventfs_remove_rec()

Inject fault while probing btrfs.ko, if kstrdup() fails in
eventfs_prepare_ef() in eventfs_add_dir(), it will return ERR_PTR
to assign file->ef. But the eventfs_remove() check NULL in
trace_module_remove_events(), which causes the below NULL
pointer dereference.

As both Masami and Steven suggest, allocater side should handle the
error carefully and remove it, so fix the places where it failed.

 Could not create tracefs 'raid56_write' directory
 Btrfs loaded, zoned=no, fsverity=no
 Unable to handle kernel NULL pointer dereference at virtual address 000000000000001c
 Mem abort info:
   ESR = 0x0000000096000004
   EC = 0x25: DABT (current EL), IL = 32 bits
   SET = 0, FnV = 0
   EA = 0, S1PTW = 0
   FSC = 0x04: level 0 translation fault
 Data abort info:
   ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
 user pgtable: 4k pages, 48-bit VAs, pgdp=0000000102544000
 [000000000000001c] pgd=0000000000000000, p4d=0000000000000000
 Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
 Dumping ftrace buffer:
    (ftrace buffer empty)
 Modules linked in: btrfs(-) libcrc32c xor xor_neon raid6_pq cfg80211 rfkill 8021q garp mrp stp llc ipv6 [last unloaded: btrfs]
 CPU: 15 PID: 1343 Comm: rmmod Tainted: G                 N 6.5.0+ #40
 Hardware name: linux,dummy-virt (DT)
 pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
 pc : eventfs_remove_rec+0x24/0xc0
 lr : eventfs_remove+0x68/0x1d8
 sp : ffff800082d63b60
 x29: ffff800082d63b60 x28: ffffb84b80ddd00c x27: ffffb84b3054ba40
 x26: 0000000000000002 x25: ffff800082d63bf8 x24: ffffb84b8398e440
 x23: ffffb84b82af3000 x22: dead000000000100 x21: dead000000000122
 x20: ffff800082d63bf8 x19: fffffffffffffff4 x18: ffffb84b82508820
 x17: 0000000000000000 x16: 0000000000000000 x15: 000083bc876a3166
 x14: 000000000000006d x13: 000000000000006d x12: 0000000000000000
 x11: 0000000000000001 x10: 00000000000017e0 x9 : 0000000000000001
 x8 : 0000000000000000 x7 : 0000000000000000 x6 : ffffb84b84289804
 x5 : 0000000000000000 x4 : 9696969696969697 x3 : ffff33a5b7601f38
 x2 : 0000000000000000 x1 : ffff800082d63bf8 x0 : fffffffffffffff4
 Call trace:
  eventfs_remove_rec+0x24/0xc0
  eventfs_remove+0x68/0x1d8
  remove_event_file_dir+0x88/0x100
  event_remove+0x140/0x15c
  trace_module_notify+0x1fc/0x230
  notifier_call_chain+0x98/0x17c
  blocking_notifier_call_chain+0x4c/0x74
  __arm64_sys_delete_module+0x1a4/0x298
  invoke_syscall+0x44/0x100
  el0_svc_common.constprop.1+0x68/0xe0
  do_el0_svc+0x1c/0x28
  el0_svc+0x3c/0xc4
  el0t_64_sync_handler+0xa0/0xc4
  el0t_64_sync+0x174/0x178
 Code: 5400052c a90153b3 aa0003f3 aa0103f4 (f9401400)
 ---[ end trace 0000000000000000 ]---
 Kernel panic - not syncing: Oops: Fatal exception
 SMP: stopping secondary CPUs
 Dumping ftrace buffer:
    (ftrace buffer empty)
 Kernel Offset: 0x384b00c00000 from 0xffff800080000000
 PHYS_OFFSET: 0xffffcc5b80000000
 CPU features: 0x88000203,3c020000,1000421b
 Memory Limit: none
 Rebooting in 1 seconds..

Link: https://lore.kernel.org/linux-trace-kernel/20230912134752.1838524-1-ruanjinjie@huawei.com
Link: https://lore.kernel.org/all/20230912025808.668187-1-ruanjinjie@huawei.com/
Link: https://lore.kernel.org/all/20230911052818.1020547-1-ruanjinjie@huawei.com/
Link: https://lore.kernel.org/all/20230909072817.182846-1-ruanjinjie@huawei.com/
Link: https://lore.kernel.org/all/20230908074816.3724716-1-ruanjinjie@huawei.com/
Cc: Ajay Kaher <akaher@vmware.com>
Fixes: 5bdcd5f5331a ("eventfs: Implement removal of meta data from eventfs")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Suggested-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
10 months agonet: macb: fix sleep inside spinlock
Sascha Hauer [Fri, 8 Sep 2023 11:29:13 +0000 (13:29 +0200)]
net: macb: fix sleep inside spinlock

macb_set_tx_clk() is called under a spinlock but itself calls clk_set_rate()
which can sleep. This results in:

| BUG: sleeping function called from invalid context at kernel/locking/mutex.c:580
| pps pps1: new PPS source ptp1
| in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 40, name: kworker/u4:3
| preempt_count: 1, expected: 0
| RCU nest depth: 0, expected: 0
| 4 locks held by kworker/u4:3/40:
|  #0: ffff000003409148
| macb ff0c0000.ethernet: gem-ptp-timer ptp clock registered.
|  ((wq_completion)events_power_efficient){+.+.}-{0:0}, at: process_one_work+0x14c/0x51c
|  #1: ffff8000833cbdd8 ((work_completion)(&pl->resolve)){+.+.}-{0:0}, at: process_one_work+0x14c/0x51c
|  #2: ffff000004f01578 (&pl->state_mutex){+.+.}-{4:4}, at: phylink_resolve+0x44/0x4e8
|  #3: ffff000004f06f50 (&bp->lock){....}-{3:3}, at: macb_mac_link_up+0x40/0x2ac
| irq event stamp: 113998
| hardirqs last  enabled at (113997): [<ffff800080e8503c>] _raw_spin_unlock_irq+0x30/0x64
| hardirqs last disabled at (113998): [<ffff800080e84478>] _raw_spin_lock_irqsave+0xac/0xc8
| softirqs last  enabled at (113608): [<ffff800080010630>] __do_softirq+0x430/0x4e4
| softirqs last disabled at (113597): [<ffff80008001614c>] ____do_softirq+0x10/0x1c
| CPU: 0 PID: 40 Comm: kworker/u4:3 Not tainted 6.5.0-11717-g9355ce8b2f50-dirty #368
| Hardware name: ... ZynqMP ... (DT)
| Workqueue: events_power_efficient phylink_resolve
| Call trace:
|  dump_backtrace+0x98/0xf0
|  show_stack+0x18/0x24
|  dump_stack_lvl+0x60/0xac
|  dump_stack+0x18/0x24
|  __might_resched+0x144/0x24c
|  __might_sleep+0x48/0x98
|  __mutex_lock+0x58/0x7b0
|  mutex_lock_nested+0x24/0x30
|  clk_prepare_lock+0x4c/0xa8
|  clk_set_rate+0x24/0x8c
|  macb_mac_link_up+0x25c/0x2ac
|  phylink_resolve+0x178/0x4e8
|  process_one_work+0x1ec/0x51c
|  worker_thread+0x1ec/0x3e4
|  kthread+0x120/0x124
|  ret_from_fork+0x10/0x20

The obvious fix is to move the call to macb_set_tx_clk() out of the
protected area. This seems safe as rx and tx are both disabled anyway at
this point.
It is however not entirely clear what the spinlock shall protect. It
could be the read-modify-write access to the NCFGR register, but this
is accessed in macb_set_rx_mode() and macb_set_rxcsum_feature() as well
without holding the spinlock. It could also be the register accesses
done in mog_init_rings() or macb_init_buffers(), but again these
functions are called without holding the spinlock in macb_hresp_error_task().
The locking seems fishy in this driver and it might deserve another look
before this patch is applied.

Fixes: 633e98a711ac0 ("net: macb: use resolved link config in mac_link_up()")
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Link: https://lore.kernel.org/r/20230908112913.1701766-1-s.hauer@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
10 months agonet/tls: do not free tls_rec on async operation in bpf_exec_tx_verdict()
Liu Jian [Sat, 9 Sep 2023 08:14:34 +0000 (16:14 +0800)]
net/tls: do not free tls_rec on async operation in bpf_exec_tx_verdict()

I got the below warning when do fuzzing test:
BUG: KASAN: null-ptr-deref in scatterwalk_copychunks+0x320/0x470
Read of size 4 at addr 0000000000000008 by task kworker/u8:1/9

CPU: 0 PID: 9 Comm: kworker/u8:1 Tainted: G           OE
Hardware name: linux,dummy-virt (DT)
Workqueue: pencrypt_parallel padata_parallel_worker
Call trace:
 dump_backtrace+0x0/0x420
 show_stack+0x34/0x44
 dump_stack+0x1d0/0x248
 __kasan_report+0x138/0x140
 kasan_report+0x44/0x6c
 __asan_load4+0x94/0xd0
 scatterwalk_copychunks+0x320/0x470
 skcipher_next_slow+0x14c/0x290
 skcipher_walk_next+0x2fc/0x480
 skcipher_walk_first+0x9c/0x110
 skcipher_walk_aead_common+0x380/0x440
 skcipher_walk_aead_encrypt+0x54/0x70
 ccm_encrypt+0x13c/0x4d0
 crypto_aead_encrypt+0x7c/0xfc
 pcrypt_aead_enc+0x28/0x84
 padata_parallel_worker+0xd0/0x2dc
 process_one_work+0x49c/0xbdc
 worker_thread+0x124/0x880
 kthread+0x210/0x260
 ret_from_fork+0x10/0x18

This is because the value of rec_seq of tls_crypto_info configured by the
user program is too large, for example, 0xffffffffffffff. In addition, TLS
is asynchronously accelerated. When tls_do_encryption() returns
-EINPROGRESS and sk->sk_err is set to EBADMSG due to rec_seq overflow,
skmsg is released before the asynchronous encryption process ends. As a
result, the UAF problem occurs during the asynchronous processing of the
encryption module.

If the operation is asynchronous and the encryption module returns
EINPROGRESS, do not free the record information.

Fixes: 635d93981786 ("net/tls: free record only on encryption error")
Signed-off-by: Liu Jian <liujian56@huawei.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://lore.kernel.org/r/20230909081434.2324940-1-liujian56@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
10 months agoMerge branch 'Avoid dummy bpf_offload_netdev in __bpf_prog_dev_bound_init'
Martin KaFai Lau [Tue, 12 Sep 2023 05:06:06 +0000 (22:06 -0700)]
Merge branch 'Avoid dummy bpf_offload_netdev in __bpf_prog_dev_bound_init'

Eduard Zingerman says:

====================
For a device bound BPF program with flag BPF_F_XDP_DEV_BOUND_ONLY,
in case if device does not support offload, __bpf_prog_dev_bound_init()
creates a dummy bpf_offload_netdev struct with .offdev field set to NULL.

This dummy struct might be reused for programs without this flag
bound to the same device. However, bpf_prog_offload_verifier_prep()
that uses bpf_offload_netdev assumes that .offdev field cannot be NULL.

This bug was reported by syzbot in [1].

[1] https://lore.kernel.org/bpf/000000000000d97f3c060479c4f8@google.com/
====================

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
10 months agoselftests/bpf: Offloaded prog after non-offloaded should not cause BUG
Eduard Zingerman [Tue, 12 Sep 2023 00:55:38 +0000 (03:55 +0300)]
selftests/bpf: Offloaded prog after non-offloaded should not cause BUG

Check what happens if non-offloaded dev bound BPF
program is followed by offloaded dev bound program.
Test case adapated from syzbot report [1].

[1] https://lore.kernel.org/bpf/000000000000d97f3c060479c4f8@google.com/

Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230912005539.2248244-3-eddyz87@gmail.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
10 months agobpf: Avoid dummy bpf_offload_netdev in __bpf_prog_dev_bound_init
Eduard Zingerman [Tue, 12 Sep 2023 00:55:37 +0000 (03:55 +0300)]
bpf: Avoid dummy bpf_offload_netdev in __bpf_prog_dev_bound_init

Fix for a bug observable under the following sequence of events:
1. Create a network device that does not support XDP offload.
2. Load a device bound XDP program with BPF_F_XDP_DEV_BOUND_ONLY flag
   (such programs are not offloaded).
3. Load a device bound XDP program with zero flags
   (such programs are offloaded).

At step (2) __bpf_prog_dev_bound_init() associates with device (1)
a dummy bpf_offload_netdev struct with .offdev field set to NULL.
At step (3) __bpf_prog_dev_bound_init() would reuse dummy struct
allocated at step (2).
However, downstream usage of the bpf_offload_netdev assumes that
.offdev field can't be NULL, e.g. in bpf_prog_offload_verifier_prep().

Adjust __bpf_prog_dev_bound_init() to require bpf_offload_netdev
with non-NULL .offdev for offloaded BPF programs.

Fixes: 2b3486bc2d23 ("bpf: Introduce device-bound XDP programs")
Reported-by: syzbot+291100dcb32190ec02a8@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/bpf/000000000000d97f3c060479c4f8@google.com/
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20230912005539.2248244-2-eddyz87@gmail.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
10 months agotracefs/eventfs: Use list_for_each_srcu() in dcache_dir_open_wrapper()
Steven Rostedt (Google) [Tue, 12 Sep 2023 00:06:54 +0000 (20:06 -0400)]
tracefs/eventfs: Use list_for_each_srcu() in dcache_dir_open_wrapper()

The eventfs files list is protected by SRCU. In earlier iterations it was
protected with just RCU, but because it needed to also call sleepable
code, it had to be switch to SRCU. The dcache_dir_open_wrapper()
list_for_each_rcu() was missed and did not get converted over to
list_for_each_srcu(). That needs to be fixed.

Link: https://lore.kernel.org/linux-trace-kernel/20230911120053.ca82f545e7f46ea753deda18@kernel.org/
Link: https://lore.kernel.org/linux-trace-kernel/20230911200654.71ce927c@gandalf.local.home
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Ajay Kaher <akaher@vmware.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Reported-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Fixes: 63940449555e7 ("eventfs: Implement eventfs lookup, read, open functions")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>