platform/kernel/linux-starfive.git
2 years agobpf, docs: Fix typo "BFP_ALU" to "BPF_ALU"
Kosuke Fujimoto [Thu, 9 Jun 2022 08:39:37 +0000 (04:39 -0400)]
bpf, docs: Fix typo "BFP_ALU" to "BPF_ALU"

"BFP" should be "BPF"

Signed-off-by: Kosuke Fujimoto <fujimotokosuke0@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220609083937.245749-1-fujimotoksouke0@gmail.com
2 years agobpftool: Fix bootstrapping during a cross compilation
Shahab Vahedi [Wed, 8 Jun 2022 14:29:28 +0000 (14:29 +0000)]
bpftool: Fix bootstrapping during a cross compilation

This change adjusts the Makefile to use "HOSTAR" as the archive tool
to keep the sanity of the build process for the bootstrap part in
check. For the rationale, please continue reading.

When cross compiling bpftool with buildroot, it leads to an invocation
like:

$ AR="/path/to/buildroot/host/bin/arc-linux-gcc-ar" \
  CC="/path/to/buildroot/host/bin/arc-linux-gcc"    \
  ...
  make

Which in return fails while building the bootstrap section:

----------------------------------8<----------------------------------

  make: Entering directory '/src/bpftool-v6.7.0/src'
  ...                        libbfd: [ on  ]
  ...        disassembler-four-args: [ on  ]
  ...                          zlib: [ on  ]
  ...                        libcap: [ OFF ]
  ...               clang-bpf-co-re: [ on  ] <-- triggers bootstrap

  .
  .
  .

    LINK     /src/bpftool-v6.7.0/src/bootstrap/bpftool
  /usr/bin/ld: /src/bpftool-v6.7.0/src/bootstrap/libbpf/libbpf.a:
               error adding symbols: archive has no index; run ranlib
               to add one
  collect2: error: ld returned 1 exit status
  make: *** [Makefile:211: /src/bpftool-v6.7.0/src/bootstrap/bpftool]
            Error 1
  make: *** Waiting for unfinished jobs....
    AR       /src/bpftool-v6.7.0/src/libbpf/libbpf.a
    make[1]: Leaving directory '/src/bpftool-v6.7.0/libbpf/src'
    make: Leaving directory '/src/bpftool-v6.7.0/src'

---------------------------------->8----------------------------------

This occurs because setting "AR" confuses the build process for the
bootstrap section and it calls "arc-linux-gcc-ar" to create and index
"libbpf.a" instead of the host "ar".

Signed-off-by: Shahab Vahedi <shahab@synopsys.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
Link: https://lore.kernel.org/bpf/8d297f0c-cfd0-ef6f-3970-6dddb3d9a87a@synopsys.com
2 years agoMerge branch 'bpf: Add 64bit enum value support'
Alexei Starovoitov [Tue, 7 Jun 2022 17:20:44 +0000 (10:20 -0700)]
Merge branch 'bpf: Add 64bit enum value support'

Yonghong Song says:

====================

Currently, btf only supports upto 32bit enum value with BTF_KIND_ENUM.
But in kernel, some enum has 64bit values, e.g., in uapi bpf.h, we have
  enum {
        BPF_F_INDEX_MASK                = 0xffffffffULL,
        BPF_F_CURRENT_CPU               = BPF_F_INDEX_MASK,
        BPF_F_CTXLEN_MASK               = (0xfffffULL << 32),
  };
With BTF_KIND_ENUM, the value for BPF_F_CTXLEN_MASK will be encoded
as 0 which is incorrect.

To solve this problem, BTF_KIND_ENUM64 is proposed in this patch set
to support enum 64bit values. Also, since sometimes there is a need
to generate C code from btf, e.g., vmlinux.h, btf kflag support
is also added for BTF_KIND_ENUM and BTF_KIND_ENUM64 to indicate
signedness, helping proper value printout.

Changelog:
  v4 -> v5:
    - skip newly-added enum64 C test if clang version <= 14.
  v3 -> v4:
    - rename btf_type_is_any_enum() to btf_is_any_enum() to favor
      consistency in libbpf.
    - fix sign extension issue in btf_dump_get_enum_value().
    - fix BPF_CORE_FIELD_SIGNED signedness issue in bpf_core_calc_field_relo().
  v2 -> v3:
    - Implement separate btf_equal_enum()/btf_equal_enum64() and
      btf_compat_enum()/btf_compat_enum64().
    - Add a new enum64 placeholder type dynamicly for enum64 sanitization.
    - For bpftool output and unit selftest, printed out signed/unsigned
      encoding as well.
    - fix some issues with BTF_KIND_ENUM is doc and clarified sign extension
      rules for enum values.
  v1 -> v2:
    - Changed kflag default from signed to unsigned
    - Fixed sanitization issue
    - Broke down libbpf related patches for easier review
    - Added more tests
    - More code refactorization
    - Corresponding llvm patch (to support enum64) is also updated
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years agodocs/bpf: Update documentation for BTF_KIND_ENUM64 support
Yonghong Song [Tue, 7 Jun 2022 06:27:24 +0000 (23:27 -0700)]
docs/bpf: Update documentation for BTF_KIND_ENUM64 support

Add BTF_KIND_ENUM64 documentation in btf.rst.
Also fixed a typo for section number for BTF_KIND_TYPE_TAG
from 2.2.17 to 2.2.18, and fixed a type size issue for
BTF_KIND_ENUM.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220607062724.3728215-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years agoselftests/bpf: Add a test for enum64 value relocations
Yonghong Song [Tue, 7 Jun 2022 06:27:18 +0000 (23:27 -0700)]
selftests/bpf: Add a test for enum64 value relocations

Add a test for enum64 value relocations.
The test will be skipped if clang version is 14 or lower
since enum64 is only supported from version 15.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220607062718.3726307-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years agoselftests/bpf: Test BTF_KIND_ENUM64 for deduplication
Yonghong Song [Tue, 7 Jun 2022 06:27:13 +0000 (23:27 -0700)]
selftests/bpf: Test BTF_KIND_ENUM64 for deduplication

Add a few unit tests for BTF_KIND_ENUM64 deduplication.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220607062713.3725409-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years agoselftests/bpf: Add BTF_KIND_ENUM64 unit tests
Yonghong Song [Tue, 7 Jun 2022 06:27:08 +0000 (23:27 -0700)]
selftests/bpf: Add BTF_KIND_ENUM64 unit tests

Add unit tests for basic BTF_KIND_ENUM64 encoding.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220607062708.3724845-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years agoselftests/bpf: Test new enum kflag and enum64 API functions
Yonghong Song [Tue, 7 Jun 2022 06:27:03 +0000 (23:27 -0700)]
selftests/bpf: Test new enum kflag and enum64 API functions

Add tests to use the new enum kflag and enum64 API functions
in selftest btf_write.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220607062703.3724287-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years agoselftests/bpf: Fix selftests failure
Yonghong Song [Tue, 7 Jun 2022 06:26:57 +0000 (23:26 -0700)]
selftests/bpf: Fix selftests failure

The kflag is supported now for BTF_KIND_ENUM.
So remove the test which tests verifier failure
due to existence of kflag.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220607062657.3723737-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years agobpftool: Add btf enum64 support
Yonghong Song [Tue, 7 Jun 2022 06:26:52 +0000 (23:26 -0700)]
bpftool: Add btf enum64 support

Add BTF_KIND_ENUM64 support.
For example, the following enum is defined in uapi bpf.h.
  $ cat core.c
  enum A {
        BPF_F_INDEX_MASK                = 0xffffffffULL,
        BPF_F_CURRENT_CPU               = BPF_F_INDEX_MASK,
        BPF_F_CTXLEN_MASK               = (0xfffffULL << 32),
  } g;
Compiled with
  clang -target bpf -O2 -g -c core.c
Using bpftool to dump types and generate format C file:
  $ bpftool btf dump file core.o
  ...
  [1] ENUM64 'A' encoding=UNSIGNED size=8 vlen=3
        'BPF_F_INDEX_MASK' val=4294967295ULL
        'BPF_F_CURRENT_CPU' val=4294967295ULL
        'BPF_F_CTXLEN_MASK' val=4503595332403200ULL
  $ bpftool btf dump file core.o format c
  ...
  enum A {
        BPF_F_INDEX_MASK = 4294967295ULL,
        BPF_F_CURRENT_CPU = 4294967295ULL,
        BPF_F_CTXLEN_MASK = 4503595332403200ULL,
  };
  ...

Note that for raw btf output, the encoding (UNSIGNED or SIGNED)
is printed out as well. The 64bit value is also represented properly
in BTF and C dump.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220607062652.3722649-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years agolibbpf: Add enum64 relocation support
Yonghong Song [Tue, 7 Jun 2022 06:26:47 +0000 (23:26 -0700)]
libbpf: Add enum64 relocation support

The enum64 relocation support is added. The bpf local type
could be either enum or enum64 and the remote type could be
either enum or enum64 too. The all combinations of local enum/enum64
and remote enum/enum64 are supported.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220607062647.3721719-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years agolibbpf: Add enum64 support for bpf linking
Yonghong Song [Tue, 7 Jun 2022 06:26:42 +0000 (23:26 -0700)]
libbpf: Add enum64 support for bpf linking

Add BTF_KIND_ENUM64 support for bpf linking, which is
very similar to BTF_KIND_ENUM.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220607062642.3721494-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years agolibbpf: Add enum64 sanitization
Yonghong Song [Tue, 7 Jun 2022 06:26:36 +0000 (23:26 -0700)]
libbpf: Add enum64 sanitization

When old kernel does not support enum64 but user space btf
contains non-zero enum kflag or enum64, libbpf needs to
do proper sanitization so modified btf can be accepted
by the kernel.

Sanitization for enum kflag can be achieved by clearing
the kflag bit. For enum64, the type is replaced with an
union of integer member types and the integer member size
must be smaller than enum64 size. If such an integer
type cannot be found, a new type is created and used
for union members.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220607062636.3721375-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years agolibbpf: Add enum64 support for btf_dump
Yonghong Song [Tue, 7 Jun 2022 06:26:31 +0000 (23:26 -0700)]
libbpf: Add enum64 support for btf_dump

Add enum64 btf dumping support. For long long and unsigned long long
dump, suffixes 'LL' and 'ULL' are added to avoid compilation errors
in some cases.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220607062631.3720526-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years agolibbpf: Add enum64 deduplication support
Yonghong Song [Tue, 7 Jun 2022 06:26:26 +0000 (23:26 -0700)]
libbpf: Add enum64 deduplication support

Add enum64 deduplication support. BTF_KIND_ENUM64 handling
is very similar to BTF_KIND_ENUM.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220607062626.3720166-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years agolibbpf: Add enum64 parsing and new enum64 public API
Yonghong Song [Tue, 7 Jun 2022 06:26:21 +0000 (23:26 -0700)]
libbpf: Add enum64 parsing and new enum64 public API

Add enum64 parsing support and two new enum64 public APIs:
  btf__add_enum64
  btf__add_enum64_value

Also add support of signedness for BTF_KIND_ENUM. The
BTF_KIND_ENUM API signatures are not changed. The signedness
will be changed from unsigned to signed if btf__add_enum_value()
finds any negative values.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220607062621.3719391-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years agolibbpf: Refactor btf__add_enum() for future code sharing
Yonghong Song [Tue, 7 Jun 2022 06:26:15 +0000 (23:26 -0700)]
libbpf: Refactor btf__add_enum() for future code sharing

Refactor btf__add_enum() function to create a separate
function btf_add_enum_common() so later the common function
can be used to add enum64 btf type. There is no functionality
change for this patch.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220607062615.3718063-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years agolibbpf: Fix an error in 64bit relocation value computation
Yonghong Song [Tue, 7 Jun 2022 06:26:10 +0000 (23:26 -0700)]
libbpf: Fix an error in 64bit relocation value computation

Currently, the 64bit relocation value in the instruction
is computed as follows:
  __u64 imm = insn[0].imm + ((__u64)insn[1].imm << 32)

Suppose insn[0].imm = -1 (0xffffffff) and insn[1].imm = 1.
With the above computation, insn[0].imm will first sign-extend
to 64bit -1 (0xffffffffFFFFFFFF) and then add 0x1FFFFFFFF,
producing incorrect value 0xFFFFFFFF. The correct value
should be 0x1FFFFFFFF.

Changing insn[0].imm to __u32 first will prevent 64bit sign
extension and fix the issue. Merging high and low 32bit values
also changed from '+' to '|' to be consistent with other
similar occurences in kernel and libbpf.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220607062610.3717378-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years agolibbpf: Permit 64bit relocation value
Yonghong Song [Tue, 7 Jun 2022 06:26:05 +0000 (23:26 -0700)]
libbpf: Permit 64bit relocation value

Currently, the libbpf limits the relocation value to be 32bit
since all current relocations have such a limit. But with
BTF_KIND_ENUM64 support, the enum value could be 64bit.
So let us permit 64bit relocation value in libbpf.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220607062605.3716779-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years agobpf: Add btf enum64 support
Yonghong Song [Tue, 7 Jun 2022 06:26:00 +0000 (23:26 -0700)]
bpf: Add btf enum64 support

Currently, BTF only supports upto 32bit enum value with BTF_KIND_ENUM.
But in kernel, some enum indeed has 64bit values, e.g.,
in uapi bpf.h, we have
  enum {
        BPF_F_INDEX_MASK                = 0xffffffffULL,
        BPF_F_CURRENT_CPU               = BPF_F_INDEX_MASK,
        BPF_F_CTXLEN_MASK               = (0xfffffULL << 32),
  };
In this case, BTF_KIND_ENUM will encode the value of BPF_F_CTXLEN_MASK
as 0, which certainly is incorrect.

This patch added a new btf kind, BTF_KIND_ENUM64, which permits
64bit value to cover the above use case. The BTF_KIND_ENUM64 has
the following three fields followed by the common type:
  struct bpf_enum64 {
    __u32 nume_off;
    __u32 val_lo32;
    __u32 val_hi32;
  };
Currently, btf type section has an alignment of 4 as all element types
are u32. Representing the value with __u64 will introduce a pad
for bpf_enum64 and may also introduce misalignment for the 64bit value.
Hence, two members of val_hi32 and val_lo32 are chosen to avoid these issues.

The kflag is also introduced for BTF_KIND_ENUM and BTF_KIND_ENUM64
to indicate whether the value is signed or unsigned. The kflag intends
to provide consistent output of BTF C fortmat with the original
source code. For example, the original BTF_KIND_ENUM bit value is 0xffffffff.
The format C has two choices, printing out 0xffffffff or -1 and current libbpf
prints out as unsigned value. But if the signedness is preserved in btf,
the value can be printed the same as the original source code.
The kflag value 0 means unsigned values, which is consistent to the default
by libbpf and should also cover most cases as well.

The new BTF_KIND_ENUM64 is intended to support the enum value represented as
64bit value. But it can represent all BTF_KIND_ENUM values as well.
The compiler ([1]) and pahole will generate BTF_KIND_ENUM64 only if the value has
to be represented with 64 bits.

In addition, a static inline function btf_kind_core_compat() is introduced which
will be used later when libbpf relo_core.c changed. Here the kernel shares the
same relo_core.c with libbpf.

  [1] https://reviews.llvm.org/D124641

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220607062600.3716578-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2 years agoselftests/bpf: Add drv mode testing for xdping
Hangbin Liu [Thu, 2 Jun 2022 03:25:07 +0000 (11:25 +0800)]
selftests/bpf: Add drv mode testing for xdping

As subject, we only test SKB mode for xdping at present.
Now add DRV mode for xdping.

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20220602032507.464453-1-liuhangbin@gmail.com
2 years agolibbpf: Fix is_pow_of_2
Yuze Chi [Fri, 3 Jun 2022 05:51:56 +0000 (22:51 -0700)]
libbpf: Fix is_pow_of_2

Move the correct definition from linker.c into libbpf_internal.h.

Fixes: 0087a681fa8c ("libbpf: Automatically fix up BPF_MAP_TYPE_RINGBUF size, if necessary")
Reported-by: Yuze Chi <chiyuze@google.com>
Signed-off-by: Yuze Chi <chiyuze@google.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220603055156.2830463-1-irogers@google.com
2 years agoselftests/bpf: Fix tc_redirect_dtime
Martin KaFai Lau [Wed, 1 Jun 2022 23:40:50 +0000 (16:40 -0700)]
selftests/bpf: Fix tc_redirect_dtime

tc_redirect_dtime was reported flaky from time to time.  It
always fails at the udp test and complains about the bpf@tc-ingress
got a skb->tstamp when handling udp packet.  It is unexpected
because the skb->tstamp should have been cleared when crossing
different netns.

The most likely cause is that the skb is actually a tcp packet
from the earlier tcp test.  It could be the final TCP_FIN handling.

This patch tightens the skb->tstamp check in the bpf prog.  It ensures
the skb is the current testing traffic.  First, it checks that skb
matches the IPPROTO of the running test (i.e. tcp vs udp).
Second, it checks the server port (dst_ns_port).  The server
port is unique for each test (50000 + test_enum).

Also fixed a typo in test_udp_dtime(): s/P100/P101/

Fixes: c803475fd8dd ("bpf: selftests: test skb->tstamp in redirect_neigh")
Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20220601234050.2572671-1-kafai@fb.com
2 years agobpf, test_run: Remove unnecessary prog type checks
Daniel Xu [Sun, 29 May 2022 20:15:41 +0000 (15:15 -0500)]
bpf, test_run: Remove unnecessary prog type checks

These checks were effectively noops b/c there's only one way these
functions get called: through prog_ops dispatching. And since there's no
other callers, we can be sure that `prog` is always the correct type.

Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/0a9aaac329f76ddb17df1786b001117823ffefa5.1653855302.git.dxu@dxuuu.xyz
2 years agolibbpf: Fix a couple of typos
Daniel Müller [Wed, 1 Jun 2022 15:40:25 +0000 (15:40 +0000)]
libbpf: Fix a couple of typos

This change fixes a couple of typos that were encountered while studying
the source code.

Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20220601154025.3295035-1-deso@posteo.net
2 years agobpf, sockmap: Fix sk->sk_forward_alloc warn_on in sk_stream_kill_queues
Wang Yufen [Tue, 24 May 2022 07:53:11 +0000 (15:53 +0800)]
bpf, sockmap: Fix sk->sk_forward_alloc warn_on in sk_stream_kill_queues

During TCP sockmap redirect pressure test, the following warning is triggered:

WARNING: CPU: 3 PID: 2145 at net/core/stream.c:205 sk_stream_kill_queues+0xbc/0xd0
CPU: 3 PID: 2145 Comm: iperf Kdump: loaded Tainted: G        W         5.10.0+ #9
Call Trace:
 inet_csk_destroy_sock+0x55/0x110
 inet_csk_listen_stop+0xbb/0x380
 tcp_close+0x41b/0x480
 inet_release+0x42/0x80
 __sock_release+0x3d/0xa0
 sock_close+0x11/0x20
 __fput+0x9d/0x240
 task_work_run+0x62/0x90
 exit_to_user_mode_prepare+0x110/0x120
 syscall_exit_to_user_mode+0x27/0x190
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

The reason we observed is that:

When the listener is closing, a connection may have completed the three-way
handshake but not accepted, and the client has sent some packets. The child
sks in accept queue release by inet_child_forget()->inet_csk_destroy_sock(),
but psocks of child sks have not released.

To fix, add sock_map_destroy to release psocks.

Signed-off-by: Wang Yufen <wangyufen@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20220524075311.649153-1-wangyufen@huawei.com
2 years agosample: bpf: xdp_router_ipv4: Allow the kernel to send arp requests
Lorenzo Bianconi [Wed, 25 May 2022 09:44:27 +0000 (11:44 +0200)]
sample: bpf: xdp_router_ipv4: Allow the kernel to send arp requests

Forward the packet to the kernel if the gw router mac address is missing
in to trigger ARP discovery.

Fixes: 85bf1f51691c ("samples: bpf: Convert xdp_router_ipv4 to XDP samples helper")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/60bde5496d108089080504f58199bcf1143ea938.1653471558.git.lorenzo@kernel.org
2 years agolibbpf: Fix determine_ptr_size() guessing
Douglas Raillard [Tue, 24 May 2022 09:44:47 +0000 (10:44 +0100)]
libbpf: Fix determine_ptr_size() guessing

One strategy employed by libbpf to guess the pointer size is by finding
the size of "unsigned long" type. This is achieved by looking for a type
of with the expected name and checking its size.

Unfortunately, the C syntax is friendlier to humans than to computers
as there is some variety in how such a type can be named. Specifically,
gcc and clang do not use the same names for integer types in debug info:

    - clang uses "unsigned long"
    - gcc uses "long unsigned int"

Lookup all the names for such a type so that libbpf can hope to find the
information it wants.

Signed-off-by: Douglas Raillard <douglas.raillard@arm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220524094447.332186-1-douglas.raillard@arm.com
2 years agobpf: Fix KASAN use-after-free Read in compute_effective_progs
Tadeusz Struk [Tue, 17 May 2022 18:04:20 +0000 (11:04 -0700)]
bpf: Fix KASAN use-after-free Read in compute_effective_progs

Syzbot found a Use After Free bug in compute_effective_progs().
The reproducer creates a number of BPF links, and causes a fault
injected alloc to fail, while calling bpf_link_detach on them.
Link detach triggers the link to be freed by bpf_link_free(),
which calls __cgroup_bpf_detach() and update_effective_progs().
If the memory allocation in this function fails, the function restores
the pointer to the bpf_cgroup_link on the cgroup list, but the memory
gets freed just after it returns. After this, every subsequent call to
update_effective_progs() causes this already deallocated pointer to be
dereferenced in prog_list_length(), and triggers KASAN UAF error.

To fix this issue don't preserve the pointer to the prog or link in the
list, but remove it and replace it with a dummy prog without shrinking
the table. The subsequent call to __cgroup_bpf_detach() or
__cgroup_bpf_detach() will correct it.

Fixes: af6eea57437a ("bpf: Implement bpf_link-based cgroup BPF program attachment")
Reported-by: <syzbot+f264bffdfbd5614f3bb2@syzkaller.appspotmail.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@linaro.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Cc: <stable@vger.kernel.org>
Link: https://syzkaller.appspot.com/bug?id=8ebf179a95c2a2670f7cf1ba62429ec044369db4
Link: https://lore.kernel.org/bpf/20220517180420.87954-1-tadeusz.struk@linaro.org
2 years agobpftool: Check for NULL ptr of btf in codegen_asserts
Michael Mullin [Mon, 23 May 2022 19:49:17 +0000 (15:49 -0400)]
bpftool: Check for NULL ptr of btf in codegen_asserts

bpf_object__btf() can return a NULL value.  If bpf_object__btf returns
null, do not progress through codegen_asserts(). This avoids a null ptr
dereference at the call btf__type_cnt() in the function find_type_for_map()

Signed-off-by: Michael Mullin <masmullin@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220523194917.igkgorco42537arb@jup
2 years agoselftests/bpf: Fix test_run logic in fexit_stress.c
Yuntao Wang [Sat, 21 May 2022 15:13:29 +0000 (23:13 +0800)]
selftests/bpf: Fix test_run logic in fexit_stress.c

In the commit da00d2f117a0 ("bpf: Add test ops for BPF_PROG_TYPE_TRACING"),
the bpf_fentry_test1 function was moved into bpf_prog_test_run_tracing(),
which is the test_run function of the tracing BPF programs.

Thus calling 'bpf_prog_test_run_opts(filter_fd, &topts)' will not trigger
bpf_fentry_test1 function as filter_fd is a sk_filter BPF program.

Fix it by replacing filter_fd with fexit_fd in the bpf_prog_test_run_opts()
function.

Fixes: da00d2f117a0 ("bpf: Add test ops for BPF_PROG_TYPE_TRACING")
Signed-off-by: Yuntao Wang <ytcoode@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220521151329.648013-1-ytcoode@gmail.com
2 years agoMerge branch 'libbpf: Textual representation of enums'
Andrii Nakryiko [Tue, 31 May 2022 21:13:01 +0000 (14:13 -0700)]
Merge branch 'libbpf: Textual representation of enums'

Daniel Müller says:

====================

This patch set introduces the means for querying a textual representation of
the following BPF related enum types:
- enum bpf_map_type
- enum bpf_prog_type
- enum bpf_attach_type
- enum bpf_link_type

To make that possible, we introduce a new public function for each of the types:
libbpf_bpf_<type>_type_str.

Having a way to query a textual representation has been asked for in the past
(by systemd, among others). Such representations can generally be useful in
tracing and logging contexts, among others. At this point, at least one client,
bpftool, maintains such a mapping manually, which is prone to get out of date as
new enum variants are introduced. libbpf is arguably best situated to keep this
list complete and up-to-date. This patch series adds BTF based tests to ensure
that exhaustiveness is upheld moving forward.

The libbpf provided textual representation can be inferred from the
corresponding enum variant name by removing the prefix and lowercasing the
remainder. E.g., BPF_PROG_TYPE_SOCKET_FILTER -> socket_filter. Unfortunately,
bpftool does not use such a programmatic approach for some of the
bpf_attach_type variants. We decided in favor of changing its behavior to work
with libbpf representations. However, for user inputs, specifically, we do
maintain support for the traditionally used names around (please see patch
"bpftool: Use libbpf_bpf_attach_type_str").

The patch series is structured as follows:
- for each enumeration type in {bpf_prog_type, bpf_map_type, bpf_attach_type,
  bpf_link_type}:
  - we first introduce the corresponding public libbpf API function
  - we then add BTF based self-tests
  - we lastly adjust bpftool to use the libbpf provided functionality

Signed-off-by: Daniel Müller <deso@posteo.net>
Cc: Quentin Monnet <quentin@isovalent.com>
---
Changelog:
v3 -> v4:
- use full string comparison for newly added attach types
- switched away from erroneously used kdoc-style comments
- removed unused prog_types variable and containing section from
  test_bpftool_synctypes.py
- adjusted wording in documentation of get_types_from_array function
- split various test_bpftool_synctypes.py changes into commits where they are
  required to eliminate temporary failures of this test

v2 -> v3:
- use LIBBPF_1.0.0 section in libbpf.map for newly added exports

v1 -> v2:
- adjusted bpftool to work with algorithmically determined attach types as
  libbpf now uses (just removed prefix from enum name and lowercased the rest)
  - adjusted tests, man page, and completion script to work with the new names
  - renamed bpf_attach_type_str -> bpf_attach_type_input_str
  - for input: added special cases that accept the traditionally used strings as
    well
- changed 'char const *' -> 'const char *'
====================

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2 years agobpftool: Use libbpf_bpf_link_type_str
Daniel Müller [Mon, 23 May 2022 23:04:28 +0000 (23:04 +0000)]
bpftool: Use libbpf_bpf_link_type_str

This change switches bpftool over to using the recently introduced
libbpf_bpf_link_type_str function instead of maintaining its own string
representation for the bpf_link_type enum.

Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220523230428.3077108-13-deso@posteo.net
2 years agoselftests/bpf: Add test for libbpf_bpf_link_type_str
Daniel Müller [Mon, 23 May 2022 23:04:27 +0000 (23:04 +0000)]
selftests/bpf: Add test for libbpf_bpf_link_type_str

This change adds a test for libbpf_bpf_link_type_str. The test retrieves
all variants of the bpf_link_type enumeration using BTF and makes sure
that the function under test works as expected for them.

Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220523230428.3077108-12-deso@posteo.net
2 years agolibbpf: Introduce libbpf_bpf_link_type_str
Daniel Müller [Mon, 23 May 2022 23:04:26 +0000 (23:04 +0000)]
libbpf: Introduce libbpf_bpf_link_type_str

This change introduces a new function, libbpf_bpf_link_type_str, to the
public libbpf API. The function allows users to get a string
representation for a bpf_link_type enum variant.

Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220523230428.3077108-11-deso@posteo.net
2 years agobpftool: Use libbpf_bpf_attach_type_str
Daniel Müller [Mon, 23 May 2022 23:04:25 +0000 (23:04 +0000)]
bpftool: Use libbpf_bpf_attach_type_str

This change switches bpftool over to using the recently introduced
libbpf_bpf_attach_type_str function instead of maintaining its own
string representation for the bpf_attach_type enum.

Note that contrary to other enum types, the variant names that bpftool
maps bpf_attach_type to do not adhere a simple to follow rule. With
bpf_prog_type, for example, the textual representation can easily be
inferred by stripping the BPF_PROG_TYPE_ prefix and lowercasing the
remaining string. bpf_attach_type violates this rule for various
variants.
We decided to fix up this deficiency with this change, meaning that
bpftool uses the same textual representations as libbpf. Supporting
tests, completion scripts, and man pages have been adjusted accordingly.
However, we did add support for accepting (the now undocumented)
original attach type names when they are provided by users.

For the test (test_bpftool_synctypes.py), I have removed the enum
representation checks, because we no longer mirror the various enum
variant names in bpftool source code. For the man page, help text, and
completion script checks we are now using enum definitions from
uapi/linux/bpf.h as the source of truth directly.

Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20220523230428.3077108-10-deso@posteo.net
2 years agoselftests/bpf: Add test for libbpf_bpf_attach_type_str
Daniel Müller [Mon, 23 May 2022 23:04:24 +0000 (23:04 +0000)]
selftests/bpf: Add test for libbpf_bpf_attach_type_str

This change adds a test for libbpf_bpf_attach_type_str. The test
retrieves all variants of the bpf_attach_type enumeration using BTF and
makes sure that the function under test works as expected for them.

Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220523230428.3077108-9-deso@posteo.net
2 years agolibbpf: Introduce libbpf_bpf_attach_type_str
Daniel Müller [Mon, 23 May 2022 23:04:23 +0000 (23:04 +0000)]
libbpf: Introduce libbpf_bpf_attach_type_str

This change introduces a new function, libbpf_bpf_attach_type_str, to
the public libbpf API. The function allows users to get a string
representation for a bpf_attach_type variant.

Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220523230428.3077108-8-deso@posteo.net
2 years agobpftool: Use libbpf_bpf_map_type_str
Daniel Müller [Mon, 23 May 2022 23:04:22 +0000 (23:04 +0000)]
bpftool: Use libbpf_bpf_map_type_str

This change switches bpftool over to using the recently introduced
libbpf_bpf_map_type_str function instead of maintaining its own string
representation for the bpf_map_type enum.

Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20220523230428.3077108-7-deso@posteo.net
2 years agoselftests/bpf: Add test for libbpf_bpf_map_type_str
Daniel Müller [Mon, 23 May 2022 23:04:21 +0000 (23:04 +0000)]
selftests/bpf: Add test for libbpf_bpf_map_type_str

This change adds a test for libbpf_bpf_map_type_str. The test retrieves
all variants of the bpf_map_type enumeration using BTF and makes sure
that the function under test works as expected for them.

Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220523230428.3077108-6-deso@posteo.net
2 years agolibbpf: Introduce libbpf_bpf_map_type_str
Daniel Müller [Mon, 23 May 2022 23:04:20 +0000 (23:04 +0000)]
libbpf: Introduce libbpf_bpf_map_type_str

This change introduces a new function, libbpf_bpf_map_type_str, to the
public libbpf API. The function allows users to get a string
representation for a bpf_map_type enum variant.

Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220523230428.3077108-5-deso@posteo.net
2 years agobpftool: Use libbpf_bpf_prog_type_str
Daniel Müller [Mon, 23 May 2022 23:04:19 +0000 (23:04 +0000)]
bpftool: Use libbpf_bpf_prog_type_str

This change switches bpftool over to using the recently introduced
libbpf_bpf_prog_type_str function instead of maintaining its own string
representation for the bpf_prog_type enum.

Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20220523230428.3077108-4-deso@posteo.net
2 years agoselftests/bpf: Add test for libbpf_bpf_prog_type_str
Daniel Müller [Mon, 23 May 2022 23:04:18 +0000 (23:04 +0000)]
selftests/bpf: Add test for libbpf_bpf_prog_type_str

This change adds a test for libbpf_bpf_prog_type_str. The test retrieves
all variants of the bpf_prog_type enumeration using BTF and makes sure
that the function under test works as expected for them.

Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220523230428.3077108-3-deso@posteo.net
2 years agolibbpf: Introduce libbpf_bpf_prog_type_str
Daniel Müller [Mon, 23 May 2022 23:04:17 +0000 (23:04 +0000)]
libbpf: Introduce libbpf_bpf_prog_type_str

This change introduces a new function, libbpf_bpf_prog_type_str, to the
public libbpf API. The function allows users to get a string
representation for a bpf_prog_type variant.

Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220523230428.3077108-2-deso@posteo.net
2 years agobpf: Correct the comment about insn_to_jit_off
Pu Lehui [Mon, 30 May 2022 09:28:12 +0000 (17:28 +0800)]
bpf: Correct the comment about insn_to_jit_off

The insn_to_jit_off passed to bpf_prog_fill_jited_linfo should be the
first byte of the next instruction, or the byte off to the end of the
current instruction.

Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220530092815.1112406-4-pulehui@huawei.com
2 years agobpf, riscv: Support riscv jit to provide bpf_line_info
Pu Lehui [Mon, 30 May 2022 09:28:11 +0000 (17:28 +0800)]
bpf, riscv: Support riscv jit to provide bpf_line_info

Add support for riscv jit to provide bpf_line_info. We need to
consider the prologue offset in ctx->offset, but unlike x86 and
arm64, ctx->offset of riscv does not provide an extra slot for
the prologue, so here we just calculate the len of prologue and
add it to ctx->offset at the end. Both RV64 and RV32 have been
tested.

Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220530092815.1112406-3-pulehui@huawei.com
2 years agobpf: Unify data extension operation of jited_ksyms and jited_linfo
Pu Lehui [Mon, 30 May 2022 09:28:10 +0000 (17:28 +0800)]
bpf: Unify data extension operation of jited_ksyms and jited_linfo

We found that 32-bit environment can not print BPF line info due to a data
inconsistency between jited_ksyms[0] and jited_linfo[0].

For example:

  jited_kyms[0] = 0xb800067c, jited_linfo[0] = 0xffffffffb800067c

We know that both of them store BPF func address, but due to the different
data extension operations when extended to u64, they may not be the same.
We need to unify the data extension operations of them.

Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/CAEf4BzZ-eDcdJZgJ+Np7Y=V-TVjDDvOMqPwzKjyWrh=i5juv4w@mail.gmail.com
Link: https://lore.kernel.org/bpf/20220530092815.1112406-2-pulehui@huawei.com
2 years agoxdp: Directly use ida_alloc()/free() APIs
Ke Liu [Fri, 27 May 2022 06:46:09 +0000 (06:46 +0000)]
xdp: Directly use ida_alloc()/free() APIs

Use ida_alloc() / ida_free() instead of the deprecated ida_simple_get() /
ida_simple_remove().

Signed-off-by: Ke Liu <liuke94@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/bpf/20220527064609.2358482-1-liuke94@huawei.com
2 years agoMerge tag 'net-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Thu, 2 Jun 2022 19:50:16 +0000 (12:50 -0700)]
Merge tag 'net-5.19-rc1' of git://git./linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from bpf and netfilter.

  Current release - new code bugs:

   - af_packet: make sure to pull the MAC header, avoid skb panic in GSO

   - ptp_clockmatrix: fix inverted logic in is_single_shot()

   - netfilter: flowtable: fix missing FLOWI_FLAG_ANYSRC flag

   - dt-bindings: net: adin: fix adi,phy-output-clock description syntax

   - wifi: iwlwifi: pcie: rename CAUSE macro, avoid MIPS build warning

  Previous releases - regressions:

   - Revert "net: af_key: add check for pfkey_broadcast in function
     pfkey_process"

   - tcp: fix tcp_mtup_probe_success vs wrong snd_cwnd

   - nf_tables: disallow non-stateful expression in sets earlier

   - nft_limit: clone packet limits' cost value

   - nf_tables: double hook unregistration in netns path

   - ping6: fix ping -6 with interface name

  Previous releases - always broken:

   - sched: fix memory barriers to prevent skbs from getting stuck in
     lockless qdiscs

   - neigh: set lower cap for neigh_managed_work rearming, avoid
     constantly scheduling the probe work

   - bpf: fix probe read error on big endian in ___bpf_prog_run()

   - amt: memory leak and error handling fixes

  Misc:

   - ipv6: expand & rename accept_unsolicited_na to accept_untracked_na"

* tag 'net-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (80 commits)
  net/af_packet: make sure to pull mac header
  net: add debug info to __skb_pull()
  net: CONFIG_DEBUG_NET depends on CONFIG_NET
  stmmac: intel: Add RPL-P PCI ID
  net: stmmac: use dev_err_probe() for reporting mdio bus registration failure
  tipc: check attribute length for bearer name
  ice: fix access-beyond-end in the switch code
  nfp: remove padding in nfp_nfdk_tx_desc
  ax25: Fix ax25 session cleanup problems
  net: usb: qmi_wwan: Add support for Cinterion MV31 with new baseline
  sfc/siena: fix wrong tx channel offset with efx_separate_tx_channels
  sfc/siena: fix considering that all channels have TX queues
  socket: Don't use u8 type in uapi socket.h
  net/sched: act_api: fix error code in tcf_ct_flow_table_fill_tuple_ipv6()
  net: ping6: Fix ping -6 with interface name
  macsec: fix UAF bug for real_dev
  octeontx2-af: fix error code in is_valid_offset()
  wifi: mac80211: fix use-after-free in chanctx code
  bonding: guard ns_targets by CONFIG_IPV6
  tcp: tcp_rtx_synack() can be called from process context
  ...

2 years agomodule: Fix prefix for module.sig_enforce module param
Saravana Kannan [Thu, 2 Jun 2022 03:56:52 +0000 (20:56 -0700)]
module: Fix prefix for module.sig_enforce module param

Commit cfc1d277891e ("module: Move all into module/") changed the prefix
of the module param by moving/renaming files.  A later commit also moves
the module_param() into a different file, thereby changing the prefix
yet again.

This would break kernel cmdline compatibility and also userspace
compatibility at /sys/module/module/parameters/sig_enforce.

So, set the prefix back to "module.".

Fixes: cfc1d277891e ("module: Move all into module/")
Link: https://lore.kernel.org/lkml/20220602034111.4163292-1-saravanak@google.com/
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Aaron Tomlin <atomlin@redhat.com>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Saravana Kannan <saravanak@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agoMerge tag 'pci-v5.19-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaa...
Linus Torvalds [Thu, 2 Jun 2022 19:11:25 +0000 (12:11 -0700)]
Merge tag 'pci-v5.19-fixes-1' of git://git./linux/kernel/git/helgaas/pci

Pull pci fixes from Bjorn Helgaas:

 - Revert brcmstb patches that broke booting on Raspberry Pi Compute
   Module 4 (Bjorn Helgaas)

 - Fix bridge_d3_blacklist[] error that overwrote the existing Gigabyte
   X299 entry instead of adding a new one (Bjorn Helgaas)

 - Update Lorenzo Pieralisi's email address in MAINTAINERS (Lorenzo
   Pieralisi)

* tag 'pci-v5.19-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  MAINTAINERS: Update Lorenzo Pieralisi's email address
  PCI/PM: Fix bridge_d3_blacklist[] Elo i2 overwrite of Gigabyte X299
  Revert "PCI: brcmstb: Split brcm_pcie_setup() into two funcs"
  Revert "PCI: brcmstb: Add mechanism to turn on subdev regulators"
  Revert "PCI: brcmstb: Add control of subdevice voltage regulators"
  Revert "PCI: brcmstb: Do not turn off WOL regulators on suspend"

2 years agoMerge branch 'net-af_packet-be-careful-when-expanding-mac-header-size'
Jakub Kicinski [Thu, 2 Jun 2022 17:15:07 +0000 (10:15 -0700)]
Merge branch 'net-af_packet-be-careful-when-expanding-mac-header-size'

Eric Dumazet says:

====================
net: af_packet: be careful when expanding mac header size

A recent regression in af_packet needed a preliminary debug patch,
which will presumably be useful for next bugs hunting.

The af_packet fix is to make sure MAC headers are contained in
skb linear part, as GSO stack requests.

v2: CONFIG_DEBUG_NET depends on CONFIG_NET to avoid compile
   errors found by kernel bots.
====================

Link: https://lore.kernel.org/r/20220602161859.2546399-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet/af_packet: make sure to pull mac header
Eric Dumazet [Thu, 2 Jun 2022 16:18:59 +0000 (09:18 -0700)]
net/af_packet: make sure to pull mac header

GSO assumes skb->head contains link layer headers.

tun device in some case can provide base 14 bytes,
regardless of VLAN being used or not.

After blamed commit, we can end up setting a network
header offset of 18+, we better pull the missing
bytes to avoid a posible crash in GSO.

syzbot report was:
kernel BUG at include/linux/skbuff.h:2699!
invalid opcode: 0000 [#1] PREEMPT SMP KASAN
CPU: 1 PID: 3601 Comm: syz-executor210 Not tainted 5.18.0-syzkaller-11338-g2c5ca23f7414 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:__skb_pull include/linux/skbuff.h:2699 [inline]
RIP: 0010:skb_mac_gso_segment+0x48f/0x530 net/core/gro.c:136
Code: 00 48 c7 c7 00 96 d4 8a c6 05 cb d3 45 06 01 e8 26 bb d0 01 e9 2f fd ff ff 49 c7 c4 ea ff ff ff e9 f1 fe ff ff e8 91 84 19 fa <0f> 0b 48 89 df e8 97 44 66 fa e9 7f fd ff ff e8 ad 44 66 fa e9 48
RSP: 0018:ffffc90002e2f4b8 EFLAGS: 00010293
RAX: 0000000000000000 RBX: 0000000000000012 RCX: 0000000000000000
RDX: ffff88805bb58000 RSI: ffffffff8760ed0f RDI: 0000000000000004
RBP: 0000000000005dbc R08: 0000000000000004 R09: 0000000000000fe0
R10: 0000000000000fe4 R11: 0000000000000000 R12: 0000000000000fe0
R13: ffff88807194d780 R14: 1ffff920005c5e9b R15: 0000000000000012
FS:  000055555730f300(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000200015c0 CR3: 0000000071ff8000 CR4: 0000000000350ee0
Call Trace:
 <TASK>
 __skb_gso_segment+0x327/0x6e0 net/core/dev.c:3411
 skb_gso_segment include/linux/netdevice.h:4749 [inline]
 validate_xmit_skb+0x6bc/0xf10 net/core/dev.c:3669
 validate_xmit_skb_list+0xbc/0x120 net/core/dev.c:3719
 sch_direct_xmit+0x3d1/0xbe0 net/sched/sch_generic.c:327
 __dev_xmit_skb net/core/dev.c:3815 [inline]
 __dev_queue_xmit+0x14a1/0x3a00 net/core/dev.c:4219
 packet_snd net/packet/af_packet.c:3071 [inline]
 packet_sendmsg+0x21cb/0x5550 net/packet/af_packet.c:3102
 sock_sendmsg_nosec net/socket.c:714 [inline]
 sock_sendmsg+0xcf/0x120 net/socket.c:734
 ____sys_sendmsg+0x6eb/0x810 net/socket.c:2492
 ___sys_sendmsg+0xf3/0x170 net/socket.c:2546
 __sys_sendmsg net/socket.c:2575 [inline]
 __do_sys_sendmsg net/socket.c:2584 [inline]
 __se_sys_sendmsg net/socket.c:2582 [inline]
 __x64_sys_sendmsg+0x132/0x220 net/socket.c:2582
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x46/0xb0
RIP: 0033:0x7f4b95da06c9
Code: 28 c3 e8 4a 15 00 00 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffd7defc4c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007ffd7defc4f0 RCX: 00007f4b95da06c9
RDX: 0000000000000000 RSI: 0000000020000140 RDI: 0000000000000003
RBP: 0000000000000003 R08: bb1414ac00000050 R09: bb1414ac00000050
R10: 0000000000000004 R11: 0000000000000246 R12: 0000000000000000
R13: 00007ffd7defc4e0 R14: 00007ffd7defc4d8 R15: 00007ffd7defc4d4
 </TASK>

Fixes: dfed913e8b55 ("net/af_packet: add VLAN support for AF_PACKET SOCK_RAW GSO")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Acked-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: add debug info to __skb_pull()
Eric Dumazet [Thu, 2 Jun 2022 16:18:58 +0000 (09:18 -0700)]
net: add debug info to __skb_pull()

While analyzing yet another syzbot report, I found the following
patch very useful. It allows to better understand what went wrong.

This debug info is only enabled if CONFIG_DEBUG_NET=y,
which is the case for syzbot builds.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: CONFIG_DEBUG_NET depends on CONFIG_NET
Eric Dumazet [Thu, 2 Jun 2022 16:18:57 +0000 (09:18 -0700)]
net: CONFIG_DEBUG_NET depends on CONFIG_NET

It makes little sense to debug networking stacks
if networking is not compiled in.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agostmmac: intel: Add RPL-P PCI ID
Michael Sit Wei Hong [Thu, 2 Jun 2022 07:35:07 +0000 (15:35 +0800)]
stmmac: intel: Add RPL-P PCI ID

Add PCI ID for Ethernet TSN Controller on RPL-P.

Signed-off-by: Michael Sit Wei Hong <michael.wei.hong.sit@intel.com>
Link: https://lore.kernel.org/r/20220602073507.3955721-1-michael.wei.hong.sit@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: stmmac: use dev_err_probe() for reporting mdio bus registration failure
Rasmus Villemoes [Thu, 2 Jun 2022 07:48:40 +0000 (09:48 +0200)]
net: stmmac: use dev_err_probe() for reporting mdio bus registration failure

I have a board where these two lines are always printed during boot:

   imx-dwmac 30bf0000.ethernet: Cannot register the MDIO bus
   imx-dwmac 30bf0000.ethernet: stmmac_dvr_probe: MDIO bus (id: 1) registration failed

It's perfectly fine, and the device is successfully (and silently, as
far as the console goes) probed later.

Use dev_err_probe() instead, which will demote these messages to debug
level (thus removing the alarming messages from the console) when the
error is -EPROBE_DEFER, and also has the advantage of including the
error code if/when it happens to be something other than -EPROBE_DEFER.

While here, add the missing \n to one of the format strings.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Link: https://lore.kernel.org/r/20220602074840.1143360-1-linux@rasmusvillemoes.dk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agotipc: check attribute length for bearer name
Hoang Le [Thu, 2 Jun 2022 06:30:53 +0000 (13:30 +0700)]
tipc: check attribute length for bearer name

syzbot reported uninit-value:
=====================================================
BUG: KMSAN: uninit-value in string_nocheck lib/vsprintf.c:644 [inline]
BUG: KMSAN: uninit-value in string+0x4f9/0x6f0 lib/vsprintf.c:725
 string_nocheck lib/vsprintf.c:644 [inline]
 string+0x4f9/0x6f0 lib/vsprintf.c:725
 vsnprintf+0x2222/0x3650 lib/vsprintf.c:2806
 vprintk_store+0x537/0x2150 kernel/printk/printk.c:2158
 vprintk_emit+0x28b/0xab0 kernel/printk/printk.c:2256
 vprintk_default+0x86/0xa0 kernel/printk/printk.c:2283
 vprintk+0x15f/0x180 kernel/printk/printk_safe.c:50
 _printk+0x18d/0x1cf kernel/printk/printk.c:2293
 tipc_enable_bearer net/tipc/bearer.c:371 [inline]
 __tipc_nl_bearer_enable+0x2022/0x22a0 net/tipc/bearer.c:1033
 tipc_nl_bearer_enable+0x6c/0xb0 net/tipc/bearer.c:1042
 genl_family_rcv_msg_doit net/netlink/genetlink.c:731 [inline]

- Do sanity check the attribute length for TIPC_NLA_BEARER_NAME.
- Do not use 'illegal name' in printing message.

Reported-by: syzbot+e820fdc8ce362f2dea51@syzkaller.appspotmail.com
Fixes: cb30a63384bc ("tipc: refactor function tipc_enable_bearer()")
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
Link: https://lore.kernel.org/r/20220602063053.5892-1-hoang.h.le@dektech.com.au
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge tag 'ceph-for-5.19-rc1' of https://github.com/ceph/ceph-client
Linus Torvalds [Thu, 2 Jun 2022 15:59:39 +0000 (08:59 -0700)]
Merge tag 'ceph-for-5.19-rc1' of https://github.com/ceph/ceph-client

Pull ceph updates from Ilya Dryomov:
 "A big pile of assorted fixes and improvements for the filesystem with
  nothing in particular standing out, except perhaps that the fact that
  the MDS never really maintained atime was made official and thus it's
  no longer updated on the client either.

  We also have a MAINTAINERS update: Jeff is transitioning his
  filesystem maintainership duties to Xiubo"

* tag 'ceph-for-5.19-rc1' of https://github.com/ceph/ceph-client: (23 commits)
  MAINTAINERS: move myself from ceph "Maintainer" to "Reviewer"
  ceph: fix decoding of client session messages flags
  ceph: switch TASK_INTERRUPTIBLE to TASK_KILLABLE
  ceph: remove redundant variable ino
  ceph: try to queue a writeback if revoking fails
  ceph: fix statfs for subdir mounts
  ceph: fix possible deadlock when holding Fwb to get inline_data
  ceph: redirty the page for writepage on failure
  ceph: try to choose the auth MDS if possible for getattr
  ceph: disable updating the atime since cephfs won't maintain it
  ceph: flush the mdlog for filesystem sync
  ceph: rename unsafe_request_wait()
  libceph: use swap() macro instead of taking tmp variable
  ceph: fix statx AT_STATX_DONT_SYNC vs AT_STATX_FORCE_SYNC check
  ceph: no need to invalidate the fscache twice
  ceph: replace usage of found with dedicated list iterator variable
  ceph: use dedicated list iterator variable
  ceph: update the dlease for the hashed dentry when removing
  ceph: stop retrying the request when exceeding 256 times
  ceph: stop forwarding the request when exceeding 256 times
  ...

2 years agoMerge tag 'livepatching-for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Thu, 2 Jun 2022 15:55:01 +0000 (08:55 -0700)]
Merge tag 'livepatching-for-5.19' of git://git./linux/kernel/git/livepatching/livepatching

Pull livepatching cleanup from Petr Mladek:

 - Remove duplicated livepatch code [Christophe]

* tag 'livepatching-for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching:
  livepatch: Remove klp_arch_set_pc() and asm/livepatch.h

2 years agoMerge tag 'printk-for-5.19-fixup' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Thu, 2 Jun 2022 15:49:54 +0000 (08:49 -0700)]
Merge tag 'printk-for-5.19-fixup' of git://git./linux/kernel/git/printk/linux

Pull printk fixup from Petr Mladek:

 - Revert inappropriate use of wake_up_interruptible_all() in printk()

* tag 'printk-for-5.19-fixup' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
  Revert "printk: wake up all waiters"

2 years agoMerge tag 'memblock-v5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt...
Linus Torvalds [Thu, 2 Jun 2022 15:46:30 +0000 (08:46 -0700)]
Merge tag 'memblock-v5.19-rc1' of git://git./linux/kernel/git/rppt/memblock

Pull memblock test suite updates from Mike Rapoport:
 "Comment updates for memblock test suite

  Update comments in the memblock tests so that they will have
  consistent style"

* tag 'memblock-v5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
  memblock tests: remove completed TODO item
  memblock tests: update style of comments for memblock_free_*() functions
  memblock tests: update style of comments for memblock_remove_*() functions
  memblock tests: update style of comments for memblock_reserve_*() functions
  memblock tests: update style of comments for memblock_add_*() functions

2 years agoi2c: ismt: prevent memory corruption in ismt_access()
Dan Carpenter [Thu, 2 Jun 2022 11:02:18 +0000 (14:02 +0300)]
i2c: ismt: prevent memory corruption in ismt_access()

The "data->block[0]" variable comes from the user and is a number
between 0-255.  It needs to be capped to prevent writing beyond the end
of dma_buffer[].

Fixes: 5e9a97b1f449 ("i2c: ismt: Adding support for I2C_SMBUS_BLOCK_PROC_CALL")
Reported-and-tested-by: Zheyu Ma <zheyuma97@gmail.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agoice: fix access-beyond-end in the switch code
Alexander Lobakin [Wed, 1 Jun 2022 10:59:24 +0000 (12:59 +0200)]
ice: fix access-beyond-end in the switch code

Global `-Warray-bounds` enablement revealed some problems, one of
which is the way we define and use AQC rules messages.
In fact, they have a shared header, followed by the actual message,
which can be of one of several different formats. So it is
straightforward enough to define that header as a separate struct
and then embed it into message structures as needed, but currently
all the formats reside in one union coupled with the header. Then,
the code allocates only the memory needed for a particular message
format, leaving the union potentially incomplete.
There are no actual reads or writes beyond the end of an allocated
chunk, but at the same time, the whole implementation is fragile and
backed by an equilibrium rather than strong type and memory checks.

Define the structures the other way around: one for the common
header and the rest for the actual formats with the header embedded.
There are no places where several union members would be used at the
same time anyway. This allows to use proper struct_size() and let
the compiler know what is going to be done.
Finally, unsilence `-Warray-bounds` back for ice_switch.c.

Other little things worth mentioning:
* &ice_sw_rule_vsi_list_query is not used anywhere, remove it. It's
  weird anyway to talk to hardware with purely kernel types
  (bitmaps);
* expand the ICE_SW_RULE_*_SIZE() macros to pass a structure
  variable name to struct_size() to let it do strict typechecking;
* rename ice_sw_rule_lkup_rx_tx::hdr to ::hdr_data to keep ::hdr
  for the header structure to have the same name for it constistenly
  everywhere;
* drop the duplicate of %ICE_SW_RULE_RX_TX_NO_HDR_SIZE residing in
  ice_switch.h.

Fixes: 9daf8208dd4d ("ice: Add support for switch filter programming")
Fixes: 66486d8943ba ("ice: replace single-element array used for C struct hack")
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Reviewed-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Acked-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20220601105924.2841410-1-alexandr.lobakin@intel.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonfp: remove padding in nfp_nfdk_tx_desc
Fei Qin [Wed, 1 Jun 2022 08:34:49 +0000 (10:34 +0200)]
nfp: remove padding in nfp_nfdk_tx_desc

NFDK firmware supports 48-bit dma addressing and
parses 16 high bits of dma addresses.

In nfp_nfdk_tx_desc, dma related structure and tso
related structure are union. When "mss" be filled
with nonzero value due to enable tso, the memory used
by "padding" may be also filled. Then, firmware may
parse wrong dma addresses which causes TX watchdog
timeout problem.

This patch removes padding and unifies the dma_addr_hi
bits with the one in firmware. nfp_nfdk_tx_desc_set_dma_addr
is also added to match this change.

Fixes: c10d12e3dce8 ("nfp: add support for NFDK data path")
Signed-off-by: Fei Qin <fei.qin@corigine.com>
Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20220601083449.50556-1-simon.horman@corigine.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agoax25: Fix ax25 session cleanup problems
Duoming Zhou [Mon, 30 May 2022 15:21:58 +0000 (23:21 +0800)]
ax25: Fix ax25 session cleanup problems

There are session cleanup problems in ax25_release() and
ax25_disconnect(). If we setup a session and then disconnect,
the disconnected session is still in "LISTENING" state that
is shown below.

Active AX.25 sockets
Dest       Source     Device  State        Vr/Vs    Send-Q  Recv-Q
DL9SAU-4   DL9SAU-3   ???     LISTENING    000/000  0       0
DL9SAU-3   DL9SAU-4   ???     LISTENING    000/000  0       0

The first reason is caused by del_timer_sync() in ax25_release().
The timers of ax25 are used for correct session cleanup. If we use
ax25_release() to close ax25 sessions and ax25_dev is not null,
the del_timer_sync() functions in ax25_release() will execute.
As a result, the sessions could not be cleaned up correctly,
because the timers have stopped.

In order to solve this problem, this patch adds a device_up flag
in ax25_dev in order to judge whether the device is up. If there
are sessions to be cleaned up, the del_timer_sync() in
ax25_release() will not execute. What's more, we add ax25_cb_del()
in ax25_kill_by_device(), because the timers have been stopped
and there are no functions that could delete ax25_cb if we do not
call ax25_release(). Finally, we reorder the position of
ax25_list_lock in ax25_cb_del() in order to synchronize among
different functions that call ax25_cb_del().

The second reason is caused by improper check in ax25_disconnect().
The incoming ax25 sessions which ax25->sk is null will close
heartbeat timer, because the check "if(!ax25->sk || ..)" is
satisfied. As a result, the session could not be cleaned up properly.

In order to solve this problem, this patch changes the improper
check to "if(ax25->sk && ..)" in ax25_disconnect().

What`s more, the ax25_disconnect() may be called twice, which is
not necessary. For example, ax25_kill_by_device() calls
ax25_disconnect() and sets ax25->state to AX25_STATE_0, but
ax25_release() calls ax25_disconnect() again.

In order to solve this problem, this patch add a check in
ax25_release(). If the flag of ax25->sk equals to SOCK_DEAD,
the ax25_disconnect() in ax25_release() should not be executed.

Fixes: 82e31755e55f ("ax25: Fix UAF bugs in ax25 timers")
Fixes: 8a367e74c012 ("ax25: Fix segfault after sock connection timeout")
Reported-and-tested-by: Thomas Osterried <thomas@osterried.de>
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Link: https://lore.kernel.org/r/20220530152158.108619-1-duoming@zju.edu.cn
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agoassoc_array: Fix BUG_ON during garbage collect
Stephen Brennan [Thu, 19 May 2022 08:50:30 +0000 (09:50 +0100)]
assoc_array: Fix BUG_ON during garbage collect

A rare BUG_ON triggered in assoc_array_gc:

    [3430308.818153] kernel BUG at lib/assoc_array.c:1609!

Which corresponded to the statement currently at line 1593 upstream:

    BUG_ON(assoc_array_ptr_is_meta(p));

Using the data from the core dump, I was able to generate a userspace
reproducer[1] and determine the cause of the bug.

[1]: https://github.com/brenns10/kernel_stuff/tree/master/assoc_array_gc

After running the iterator on the entire branch, an internal tree node
looked like the following:

    NODE (nr_leaves_on_branch: 3)
      SLOT [0] NODE (2 leaves)
      SLOT [1] NODE (1 leaf)
      SLOT [2..f] NODE (empty)

In the userspace reproducer, the pr_devel output when compressing this
node was:

    -- compress node 0x5607cc089380 --
    free=0, leaves=0
    [0] retain node 2/1 [nx 0]
    [1] fold node 1/1 [nx 0]
    [2] fold node 0/1 [nx 2]
    [3] fold node 0/2 [nx 2]
    [4] fold node 0/3 [nx 2]
    [5] fold node 0/4 [nx 2]
    [6] fold node 0/5 [nx 2]
    [7] fold node 0/6 [nx 2]
    [8] fold node 0/7 [nx 2]
    [9] fold node 0/8 [nx 2]
    [10] fold node 0/9 [nx 2]
    [11] fold node 0/10 [nx 2]
    [12] fold node 0/11 [nx 2]
    [13] fold node 0/12 [nx 2]
    [14] fold node 0/13 [nx 2]
    [15] fold node 0/14 [nx 2]
    after: 3

At slot 0, an internal node with 2 leaves could not be folded into the
node, because there was only one available slot (slot 0). Thus, the
internal node was retained. At slot 1, the node had one leaf, and was
able to be folded in successfully. The remaining nodes had no leaves,
and so were removed. By the end of the compression stage, there were 14
free slots, and only 3 leaf nodes. The tree was ascended and then its
parent node was compressed. When this node was seen, it could not be
folded, due to the internal node it contained.

The invariant for compression in this function is: whenever
nr_leaves_on_branch < ASSOC_ARRAY_FAN_OUT, the node should contain all
leaf nodes. The compression step currently cannot guarantee this, given
the corner case shown above.

To fix this issue, retry compression whenever we have retained a node,
and yet nr_leaves_on_branch < ASSOC_ARRAY_FAN_OUT. This second
compression will then allow the node in slot 1 to be folded in,
satisfying the invariant. Below is the output of the reproducer once the
fix is applied:

    -- compress node 0x560e9c562380 --
    free=0, leaves=0
    [0] retain node 2/1 [nx 0]
    [1] fold node 1/1 [nx 0]
    [2] fold node 0/1 [nx 2]
    [3] fold node 0/2 [nx 2]
    [4] fold node 0/3 [nx 2]
    [5] fold node 0/4 [nx 2]
    [6] fold node 0/5 [nx 2]
    [7] fold node 0/6 [nx 2]
    [8] fold node 0/7 [nx 2]
    [9] fold node 0/8 [nx 2]
    [10] fold node 0/9 [nx 2]
    [11] fold node 0/10 [nx 2]
    [12] fold node 0/11 [nx 2]
    [13] fold node 0/12 [nx 2]
    [14] fold node 0/13 [nx 2]
    [15] fold node 0/14 [nx 2]
    internal nodes remain despite enough space, retrying
    -- compress node 0x560e9c562380 --
    free=14, leaves=1
    [0] fold node 2/15 [nx 0]
    after: 3

Changes
=======
DH:
 - Use false instead of 0.
 - Reorder the inserted lines in a couple of places to put retained before
   next_slot.

ver #2)
 - Fix typo in pr_devel, correct comparison to "<="

Fixes: 3cb989501c26 ("Add a generic associative array implementation.")
Cc: <stable@vger.kernel.org>
Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Andrew Morton <akpm@linux-foundation.org>
cc: keyrings@vger.kernel.org
Link: https://lore.kernel.org/r/20220511225517.407935-1-stephen.s.brennan@oracle.com/
Link: https://lore.kernel.org/r/20220512215045.489140-1-stephen.s.brennan@oracle.com/
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agonet: usb: qmi_wwan: Add support for Cinterion MV31 with new baseline
Slark Xiao [Wed, 1 Jun 2022 04:05:31 +0000 (12:05 +0800)]
net: usb: qmi_wwan: Add support for Cinterion MV31 with new baseline

Adding support for Cinterion device MV31 with Qualcomm
new baseline. Use different PIDs to separate it from
previous base line products.
All interfaces settings keep same as previous.

T:  Bus=03 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#=  7 Spd=480 MxCh= 0
D:  Ver= 2.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs=  1
P:  Vendor=1e2d ProdID=00b9 Rev=04.14
S:  Manufacturer=Cinterion
S:  Product=Cinterion PID 0x00B9 USB Mobile Broadband
S:  SerialNumber=90418e79
C:  #Ifs= 4 Cfg#= 1 Atr=a0 MxPwr=500mA
I:  If#=0x0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan
I:  If#=0x1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option
I:  If#=0x2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=60 Driver=option
I:  If#=0x3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option

Signed-off-by: Slark Xiao <slark_xiao@163.com>
Acked-by: Bjørn Mork <bjorn@mork.no>
Link: https://lore.kernel.org/r/20220601040531.6016-1-slark_xiao@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge tag 'mlx5-fixes-2022-05-31' of git://git.kernel.org/pub/scm/linux/kernel/git...
Jakub Kicinski [Thu, 2 Jun 2022 01:04:40 +0000 (18:04 -0700)]
Merge tag 'mlx5-fixes-2022-05-31' of git://git./linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5 fixes 2022-05-31

This series provides bug fixes to mlx5 driver.
Please pull and let me know if there is any problem.

* tag 'mlx5-fixes-2022-05-31' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5: Fix mlx5_get_next_dev() peer device matching
  net/mlx5e: Update netdev features after changing XDP state
  net/mlx5: correct ECE offset in query qp output
  net/mlx5e: Disable softirq in mlx5e_activate_rq to avoid race condition
  net/mlx5: CT: Fix header-rewrite re-use for tupels
  net/mlx5e: TC NIC mode, fix tc chains miss table
  net/mlx5: Don't use already freed action pointer
====================

Link: https://lore.kernel.org/r/20220531205447.99236-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch 'sfc-siena-fix-some-efx_separate_tx_channels-errors'
Jakub Kicinski [Thu, 2 Jun 2022 00:47:19 +0000 (17:47 -0700)]
Merge branch 'sfc-siena-fix-some-efx_separate_tx_channels-errors'

Íñigo Huguet says:

====================
sfc/siena: fix some efx_separate_tx_channels errors

Trying to load sfc driver with modparam efx_separate_tx_channels=1
resulted in errors during initialization and not being able to use the
NIC. This patches fix a few bugs and make it work again.

This has been already done for sfc, do it also for sfc_siena.
====================

Link: https://lore.kernel.org/r/20220601063603.15362-1-ihuguet@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agosfc/siena: fix wrong tx channel offset with efx_separate_tx_channels
Íñigo Huguet [Wed, 1 Jun 2022 06:36:03 +0000 (08:36 +0200)]
sfc/siena: fix wrong tx channel offset with efx_separate_tx_channels

tx_channel_offset is calculated in efx_allocate_msix_channels, but it is
also calculated again in efx_set_channels because it was originally done
there, and when efx_allocate_msix_channels was introduced it was
forgotten to be removed from efx_set_channels.

Moreover, the old calculation is wrong when using
efx_separate_tx_channels because now we can have XDP channels after the
TX channels, so n_channels - n_tx_channels doesn't point to the first TX
channel.

Remove the old calculation from efx_set_channels, and add the
initialization of this variable if MSI or legacy interrupts are used,
next to the initialization of the rest of the related variables, where
it was missing.

This has been already done for sfc, do it also for sfc_siena.

Fixes: 3990a8fffbda ("sfc: allocate channels for XDP tx queues")
Reported-by: Tianhao Zhao <tizhao@redhat.com>
Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agosfc/siena: fix considering that all channels have TX queues
Martin Habets [Wed, 1 Jun 2022 06:36:02 +0000 (08:36 +0200)]
sfc/siena: fix considering that all channels have TX queues

Normally, all channels have RX and TX queues, but this is not true if
modparam efx_separate_tx_channels=1 is used. In that cases, some
channels only have RX queues and others only TX queues (or more
preciselly, they have them allocated, but not initialized).

Fix efx_channel_has_tx_queues to return the correct value for this case
too.

This has been already done for sfc, do it also for sfc_siena.

Messages shown at probe time before the fix:
 sfc 0000:03:00.0 ens6f0np0: MC command 0x82 inlen 544 failed rc=-22 (raw=0) arg=0
 ------------[ cut here ]------------
 netdevice: ens6f0np0: failed to initialise TXQ -1
 WARNING: CPU: 1 PID: 626 at drivers/net/ethernet/sfc/ef10.c:2393 efx_ef10_tx_init+0x201/0x300 [sfc]
 [...] stripped
 RIP: 0010:efx_ef10_tx_init+0x201/0x300 [sfc]
 [...] stripped
 Call Trace:
  efx_init_tx_queue+0xaa/0xf0 [sfc]
  efx_start_channels+0x49/0x120 [sfc]
  efx_start_all+0x1f8/0x430 [sfc]
  efx_net_open+0x5a/0xe0 [sfc]
  __dev_open+0xd0/0x190
  __dev_change_flags+0x1b3/0x220
  dev_change_flags+0x21/0x60
 [...] stripped

Messages shown at remove time before the fix:
 sfc 0000:03:00.0 ens6f0np0: failed to flush 10 queues
 sfc 0000:03:00.0 ens6f0np0: failed to flush queues

Fixes: 8700aff08984 ("sfc: fix channel allocation with brute force")
Reported-by: Tianhao Zhao <tizhao@redhat.com>
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Tested-by: Íñigo Huguet <ihuguet@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Jakub Kicinski [Thu, 2 Jun 2022 00:44:03 +0000 (17:44 -0700)]
Merge branch 'master' of git://git./linux/kernel/git/klassert/ipsec

Steffen Klassert says:

====================
ipsec 2022-06-01

1) Revert "net: af_key: add check for pfkey_broadcast in function pfkey_process"
   From Michal Kubecek.

2) Don't set IPv4 DF bit when encapsulating IPv6 frames below 1280 bytes.
   From Maciej Żenczykowski.

* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec:
  xfrm: do not set IPv4 DF flag when encapsulating IPv6 frames <= 1280 bytes.
  Revert "net: af_key: add check for pfkey_broadcast in function pfkey_process"
====================

Link: https://lore.kernel.org/r/20220601103349.2297361-1-steffen.klassert@secunet.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge tag 'wireless-2022-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git...
Jakub Kicinski [Thu, 2 Jun 2022 00:34:22 +0000 (17:34 -0700)]
Merge tag 'wireless-2022-06-01' of git://git./linux/kernel/git/wireless/wireless

Kalle Valo says:

====================
wireless fixes for v5.19

First set of fixes for v5.19. Build fixes for iwlwifi and libertas, a
scheduling while atomic fix for rtw88 and use-after-free fix for
mac80211.

* tag 'wireless-2022-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
  wifi: mac80211: fix use-after-free in chanctx code
  wifi: rtw88: add a work to correct atomic scheduling warning of ::set_tim
  wifi: iwlwifi: pcie: rename CAUSE macro
  wifi: libertas: use variable-size data in assoc req/resp cmd
====================

Link: https://lore.kernel.org/r/20220601110741.90B28C385A5@smtp.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge tag 'xfs-5.19-for-linus-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Linus Torvalds [Thu, 2 Jun 2022 00:23:53 +0000 (17:23 -0700)]
Merge tag 'xfs-5.19-for-linus-2' of git://git./fs/xfs/xfs-linux

Pull more xfs updates from Dave Chinner:
 "This update is largely bug fixes and cleanups for all the code merged
  in the first pull request. The majority of them are to the new logged
  attribute code, but there are also a couple of fixes for other log
  recovery and memory leaks that have recently been found.

  Summary:

   - fix refcount leak in xfs_ifree()

   - fix xfs_buf_cancel structure leaks in log recovery

   - fix dquot leak after failed quota check

   - fix a couple of problematic ASSERTS

   - fix small aim7 perf regression in from new btree sibling validation

   - clean up log incompat feature marking for new logged attribute
     feature

   - disallow logged attributes on legacy V4 filesystem formats.

   - fix da state leak when freeing attr intents

   - improve validation of the attr log items in recovery

   - use slab caches for commonly used attr structures

   - fix leaks of attr name/value buffer and reduce copying overhead
     during intent logging

   - remove some dead debug code from log recovery"

* tag 'xfs-5.19-for-linus-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (33 commits)
  xfs: fix xfs_ifree() error handling to not leak perag ref
  xfs: move xfs_attr_use_log_assist usage out of libxfs
  xfs: move xfs_attr_use_log_assist out of xfs_log.c
  xfs: warn about LARP once per mount
  xfs: implement per-mount warnings for scrub and shrink usage
  xfs: don't log every time we clear the log incompat flags
  xfs: convert buf_cancel_table allocation to kmalloc_array
  xfs: don't leak xfs_buf_cancel structures when recovery fails
  xfs: refactor buffer cancellation table allocation
  xfs: don't leak btree cursor when insrec fails after a split
  xfs: purge dquots after inode walk fails during quotacheck
  xfs: assert in xfs_btree_del_cursor should take into account error
  xfs: don't assert fail on perag references on teardown
  xfs: avoid unnecessary runtime sibling pointer endian conversions
  xfs: share xattr name and value buffers when logging xattr updates
  xfs: do not use logged xattr updates on V4 filesystems
  xfs: Remove duplicate include
  xfs: reduce IOCB_NOWAIT judgment for retry exclusive unaligned DIO
  xfs: Remove dead code
  xfs: fix typo in comment
  ...

2 years agosocket: Don't use u8 type in uapi socket.h
Tobias Klauser [Tue, 31 May 2022 09:43:45 +0000 (11:43 +0200)]
socket: Don't use u8 type in uapi socket.h

Use plain 255 instead, which also avoid introducing an additional header
dependency on <linux/types.h>

Fixes: 26859240e4ee ("txhash: Add socket option to control TX hash rethink behavior")
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Link: https://lore.kernel.org/r/20220531094345.13801-1-tklauser@distanz.ch
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge tag 'rtc-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux
Linus Torvalds [Wed, 1 Jun 2022 21:48:13 +0000 (14:48 -0700)]
Merge tag 'rtc-5.19' of git://git./linux/kernel/git/abelloni/linux

Pull RTC updates from Alexandre Belloni:
 "A new driver represents the bulk of the changes and then we get the
  usual small fixes.

  New driver:

   - Renesas RZN1 rtc

  Drivers:

   - sun6i: Add nvmem support"

* tag 'rtc-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux:
  rtc: mxc: Silence a clang warning
  rtc: rzn1: Fix a variable type
  rtc: rzn1: Fix error code in probe
  rtc: rzn1: Avoid mixing variables
  rtc: ftrtc010: Fix error handling in ftrtc010_rtc_probe
  rtc: mt6397: check return value after calling platform_get_resource()
  rtc: rzn1: fix platform_no_drv_owner.cocci warning
  rtc: gamecube: Add missing iounmap in gamecube_rtc_read_offset_from_sram
  rtc: meson: Fix email address in MODULE_AUTHOR
  rtc: simplify the return expression of rx8025_set_offset()
  rtc: pcf85063: Add a compatible entry for pca85073a
  dt-binding: pcf85063: Add an entry for pca85073a
  MAINTAINERS: Add myself as maintainer of the RZN1 RTC driver
  rtc: rzn1: Add oscillator offset support
  rtc: rzn1: Add alarm support
  rtc: rzn1: Add new RTC driver
  dt-bindings: rtc: rzn1: Describe the RZN1 RTC
  rtc: sun6i: Add NVMEM provider

2 years agoMerge tag 'i3c/for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux
Linus Torvalds [Wed, 1 Jun 2022 21:44:01 +0000 (14:44 -0700)]
Merge tag 'i3c/for-5.19' of git://git./linux/kernel/git/i3c/linux

Pull i3c updates from Alexandre Belloni:
 "Only clean ups and no functional change this cycle. A couple of yaml
  conversions of the DT bindings, and a couple of code cleanups"

* tag 'i3c/for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux:
  MAINTAINERS: rectify entries for some i3c drivers after dt conversion
  i3c: master: svc: fix returnvar.cocci warning
  i3c/master: simplify the return expression of i3c_hci_remove()
  dt-bindings: i3c: Convert snps,dw-i3c-master to DT schema
  dt-bindings: i3c: Convert cdns,i3c-master to DT schema

2 years agoMerge tag 'for-5.19/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device...
Linus Torvalds [Wed, 1 Jun 2022 21:25:04 +0000 (14:25 -0700)]
Merge tag 'for-5.19/dm-fixes' of git://git./linux/kernel/git/device-mapper/linux-dm

Pull device mapper fixes from Mike Snitzer:

 - Fix DM core's dm_table_supports_poll to return false if target has no
   data devices.

 - Fix DM verity target so that it cannot be switched to a different DM
   target type (e.g. dm-linear) via DM table reload.

* tag 'for-5.19/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm verity: set DM_TARGET_IMMUTABLE feature flag
  dm table: fix dm_table_supports_poll to return false if no data devices

2 years agortc: mxc: Silence a clang warning
Fabio Estevam [Thu, 26 May 2022 01:14:59 +0000 (22:14 -0300)]
rtc: mxc: Silence a clang warning

Change the of_device_get_match_data() cast to (uintptr_t)
to silence the following clang warning:

drivers/rtc/rtc-mxc.c:315:19: warning: cast to smaller integer type 'enum imx_rtc_type' from 'const void *' [-Wvoid-pointer-to-enum-cast]

Reported-by: kernel test robot <lkp@intel.com>
Fixes: ba7aa63000f2 ("rtc: mxc: use of_device_get_match_data")
Signed-off-by: Fabio Estevam <festevam@gmail.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20220526011459.1167197-1-festevam@gmail.com
2 years agoMerge tag 'for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux...
Linus Torvalds [Wed, 1 Jun 2022 21:13:41 +0000 (14:13 -0700)]
Merge tag 'for-v5.19' of git://git./linux/kernel/git/sre/linux-power-supply

Pull power supply and reset updates from Sebastian Reichel:
 "Not much from the power-supply subsystem this time around, since I was
  busy most of the cycle. This also contains some fixes that I
  originally planned to send for 5.18. Apart from this there is nothing
  noteworthy.

  Power-supply core:

   - init power_supply_info struct to zero

  Drivers:

   - bq27xxx: expose data for uncalibrated battery

   - bq24190-charger: use pm_runtime_resume_and_get

   - ab8500_fg: allocate wq in probe

   - axp288_fuel_gauge: drop BIOS version from 'T3 MRD' quirk

   - axp288_fuel_gauge: modify 'T3 MRD' quirk to also fix 'One Mix 1'"

* tag 'for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply:
  power: supply: bq24190_charger: using pm_runtime_resume_and_get instead of pm_runtime_get_sync
  power: supply: bq27xxx: expose battery data when CI=1
  power: supply: ab8500_fg: Allocate wq in probe
  power: supply: axp288_fuel_gauge: Drop BIOS version check from "T3 MRD" DMI quirk
  power: supply: axp288_fuel_gauge: Fix battery reporting on the One Mix 1
  power: supply: core: Initialize struct to zero

2 years agoMerge tag 'linux-watchdog-5.19-rc1' of git://www.linux-watchdog.org/linux-watchdog
Linus Torvalds [Wed, 1 Jun 2022 21:05:16 +0000 (14:05 -0700)]
Merge tag 'linux-watchdog-5.19-rc1' of git://linux-watchdog.org/linux-watchdog

Pull watchdog updates from Wim Van Sebroeck:

 - Add MediaTek MT8186 support

 - Add Mediatek MT7986 reset-controller support

 - Add i.MX93 support

 - Add watchdog driver for Sunplus SP7021

 - Add SC8180X and SC8280XP compatibles

 - Add Renesas RZ/N1 Watchdog driver and support for RZ/N1

 - rzg2l_wdt improvements and fixes

 - Several other improvements and fixes

* tag 'linux-watchdog-5.19-rc1' of git://www.linux-watchdog.org/linux-watchdog: (38 commits)
  watchdog: ts4800_wdt: Fix refcount leak in ts4800_wdt_probe
  dt-bindings: watchdog: renesas,wdt: R-Car V3U is R-Car Gen4
  watchdog: Add Renesas RZ/N1 Watchdog driver
  dt-bindings: watchdog: renesas,wdt: Add support for RZ/N1
  watchdog: wdat_wdt: Stop watchdog when uninstalling module
  watchdog: wdat_wdt: Stop watchdog when rebooting the system
  watchdog: wdat_wdt: Using the existing function to check parameter timeout
  dt-bindings: watchdog: da9062: add watchdog timeout mode
  dt-bindings: watchdog: renesas,wdt: Document RZ/G2UL SoC
  watchdog: iTCO_wdt: Using existing macro define covers more scenarios
  watchdog: rti-wdt: Fix pm_runtime_get_sync() error checking
  dt-bindings: watchdog: Add SC8180X and SC8280XP compatibles
  watchdog: rti_wdt: Fix calculation and evaluation of preset heartbeat
  dt-bindings: watchdog: uniphier: Use unevaluatedProperties
  watchdog: sp805: disable watchdog on remove
  watchdog: da9063: optionally disable watchdog during suspend
  dt-bindings: mfd: da9063: watchdog: add suspend disable option
  dt-bindings: watchdog: sunxi: clarify clock support
  dt-bindings: watchdog: sunxi: fix F1C100s compatible
  watchdog: Add watchdog driver for Sunplus SP7021
  ...

2 years agoMerge tag 'vfio-v5.19-rc1' of https://github.com/awilliam/linux-vfio
Linus Torvalds [Wed, 1 Jun 2022 20:49:15 +0000 (13:49 -0700)]
Merge tag 'vfio-v5.19-rc1' of https://github.com/awilliam/linux-vfio

Pull vfio updates from Alex Williamson:

 - Improvements to mlx5 vfio-pci variant driver, including support for
   parallel migration per PF (Yishai Hadas)

 - Remove redundant iommu_present() check (Robin Murphy)

 - Ongoing refactoring to consolidate the VFIO driver facing API to use
   vfio_device (Jason Gunthorpe)

 - Use drvdata to store vfio_device among all vfio-pci and variant
   drivers (Jason Gunthorpe)

 - Remove redundant code now that IOMMU core manages group DMA ownership
   (Jason Gunthorpe)

 - Remove vfio_group from external API handling struct file ownership
   (Jason Gunthorpe)

 - Correct typo in uapi comments (Thomas Huth)

 - Fix coccicheck detected deadlock (Wan Jiabing)

 - Use rwsem to remove races and simplify code around container and kvm
   association to groups (Jason Gunthorpe)

 - Harden access to devices in low power states and use runtime PM to
   enable d3cold support for unused devices (Abhishek Sahu)

 - Fix dma_owner handling of fake IOMMU groups (Jason Gunthorpe)

 - Set driver_managed_dma on vfio-pci variant drivers (Jason Gunthorpe)

 - Pass KVM pointer directly rather than via notifier (Matthew Rosato)

* tag 'vfio-v5.19-rc1' of https://github.com/awilliam/linux-vfio: (38 commits)
  vfio: remove VFIO_GROUP_NOTIFY_SET_KVM
  vfio/pci: Add driver_managed_dma to the new vfio_pci drivers
  vfio: Do not manipulate iommu dma_owner for fake iommu groups
  vfio/pci: Move the unused device into low power state with runtime PM
  vfio/pci: Virtualize PME related registers bits and initialize to zero
  vfio/pci: Change the PF power state to D0 before enabling VFs
  vfio/pci: Invalidate mmaps and block the access in D3hot power state
  vfio: Change struct vfio_group::container_users to a non-atomic int
  vfio: Simplify the life cycle of the group FD
  vfio: Fully lock struct vfio_group::container
  vfio: Split up vfio_group_get_device_fd()
  vfio: Change struct vfio_group::opened from an atomic to bool
  vfio: Add missing locking for struct vfio_group::kvm
  kvm/vfio: Fix potential deadlock problem in vfio
  include/uapi/linux/vfio.h: Fix trivial typo - _IORW should be _IOWR instead
  vfio/pci: Use the struct file as the handle not the vfio_group
  kvm/vfio: Remove vfio_group from kvm
  vfio: Change vfio_group_set_kvm() to vfio_file_set_kvm()
  vfio: Change vfio_external_check_extension() to vfio_file_enforced_coherent()
  vfio: Remove vfio_external_group_match_file()
  ...

2 years agoMAINTAINERS: rectify entries for some i3c drivers after dt conversion
Lukas Bulwahn [Wed, 1 Jun 2022 07:42:12 +0000 (09:42 +0200)]
MAINTAINERS: rectify entries for some i3c drivers after dt conversion

Commit 4bd69ecfa672 ("dt-bindings: i3c: Convert cdns,i3c-master to DT
schema") and commit 6742ca620bd9 ("dt-bindings: i3c: Convert
snps,dw-i3c-master to DT schema") convert some i3c dt-bindings to yaml,
but miss to adjust its reference in MAINTAINERS.

Hence, ./scripts/get_maintainer.pl --self-test=patterns complains about
broken references.

Repair these file references in I3C DRIVER FOR CADENCE I3C MASTER IP and
I3C DRIVER FOR SYNOPSYS DESIGNWARE.

Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Link: https://lore.kernel.org/r/20220601074212.19984-1-lukas.bulwahn@gmail.com
2 years agoMerge tag 'erofs-for-5.19-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Wed, 1 Jun 2022 18:54:29 +0000 (11:54 -0700)]
Merge tag 'erofs-for-5.19-rc1-fixes' of git://git./linux/kernel/git/xiang/erofs

Pull more erofs updates from Gao Xiang:
 "This is a follow-up to the main updates, including some fixes of
  fscache mode related to compressed inodes and a cachefiles tracepoint.
  There is also a patch to fix an unexpected decompression strategy
  change due to a cleanup in the past. All the fixes are quite small.

  Apart from these, documentation is also updated for a better
  description of recent new features.

  In addition, this has some trivial cleanups without actual code logic
  changes, so I could have a more recent codebase to work on folios and
  avoiding the PG_error page flag for the next cycle.

  Summary:

   - Leave compressed inodes unsupported in fscache mode for now

   - Avoid crash when using tracepoint cachefiles_prep_read

   - Fix `backmost' behavior due to a recent cleanup

   - Update documentation for better description of recent new features

   - Several decompression cleanups w/o logical change"

* tag 'erofs-for-5.19-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
  erofs: fix 'backmost' member of z_erofs_decompress_frontend
  erofs: simplify z_erofs_pcluster_readmore()
  erofs: get rid of label `restart_now'
  erofs: get rid of `struct z_erofs_collection'
  erofs: update documentation
  erofs: fix crash when enable tracepoint cachefiles_prep_read
  erofs: leave compressed inodes unsupported in fscache mode for now

2 years agoMerge tag '5.19-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbd
Linus Torvalds [Wed, 1 Jun 2022 18:17:24 +0000 (11:17 -0700)]
Merge tag '5.19-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbd

Pull ksmbd server updates from Steve French:

 - rdma (smbdirect) fixes, cleanup and optimizations

 - crediting (flow control) fix for mounts from Windows client

 - ACL fix

 - Windows client query dir fix

 - write validation fix

 - cleanups

* tag '5.19-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
  ksmbd: smbd: relax the count of sges required
  ksmbd: fix outstanding credits related bugs
  ksmbd: smbd: fix connection dropped issue
  ksmbd: Fix some kernel-doc comments
  ksmbd: fix wrong smbd max read/write size check
  ksmbd: add smbd max io size parameter
  ksmbd: handle smb2 query dir request for OutputBufferLength that is too small
  ksmbd: smbd: handle multiple Buffer descriptors
  ksmbd: smbd: change the return value of get_sg_list
  ksmbd: smbd: simplify tracking pending packets
  ksmbd: smbd: introduce read/write credits for RDMA read/write
  ksmbd: smbd: change prototypes of RDMA read/write related functions
  ksmbd: validate length in smb2_write()
  ksmbd: fix reference count leak in smb_check_perm_dacl()

2 years agoafs: Fix infinite loop found by xfstest generic/676
David Howells [Tue, 31 May 2022 08:30:40 +0000 (09:30 +0100)]
afs: Fix infinite loop found by xfstest generic/676

In AFS, a directory is handled as a file that the client downloads and
parses locally for the purposes of performing lookup and getdents
operations.  The in-kernel afs filesystem has a number of functions that
do this.

A directory file is arranged as a series of 2K blocks divided into
32-byte slots, where a directory entry occupies one or more slots, plus
each block starts with one or more metadata blocks.

When parsing a block, if the last slots are occupied by a dirent that
occupies more than a single slot and the file position points at a slot
that's not the initial one, the logic in afs_dir_iterate_block() that
skips over it won't advance the file pointer to the end of it.  This
will cause an infinite loop in getdents() as it will keep retrying that
block and failing to advance beyond the final entry.

Fix this by advancing the file pointer if the next entry will be beyond
it when we skip a block.

This was found by the generic/676 xfstest but can also be triggered with
something like:

~/xfstests-dev/src/t_readdir_3 /xfstest.test/z 4000 1

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
Tested-by: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Link: http://lore.kernel.org/r/165391973497.110268.2939296942213894166.stgit@warthog.procyon.org.uk/
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agoMerge tag 'pwm/for-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry...
Linus Torvalds [Wed, 1 Jun 2022 17:49:11 +0000 (10:49 -0700)]
Merge tag 'pwm/for-5.19-rc1' of git://git./linux/kernel/git/thierry.reding/linux-pwm

Pull pwm updates from Thierry Reding:
 "Quite a large number of conversions this time around, courtesy of Uwe
  who has been working tirelessly on these. No drivers of the legacy API
  are left at this point, so as a next step the old API can be removed.

  Support is added for a few new devices such as the Xilinx AXI timer-
  based PWMs and the PWM IP found on Sunplus SoCs.

  Other than that, there's a number of fixes, cleanups and optimizations"

* tag 'pwm/for-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm: (43 commits)
  pwm: pwm-cros-ec: Add channel type support
  dt-bindings: google,cros-ec-pwm: Add the new -type compatible
  dt-bindings: Add mfd/cros_ec definitions
  pwm: Document that the pinstate of a disabled PWM isn't reliable
  pwm: twl-led: Implement .apply() callback
  pwm: lpc18xx: Implement .apply() callback
  pwm: mediatek: Implement .apply() callback
  pwm: lpc32xx: Implement .apply() callback
  pwm: tegra: Implement .apply() callback
  pwm: stmpe: Implement .apply() callback
  pwm: sti: Implement .apply() callback
  pwm: pwm-mediatek: Add support for MediaTek Helio X10 MT6795
  dt-bindings: pwm: pwm-mediatek: Add documentation for MT6795 SoC
  pwm: tegra: Optimize period calculation
  pwm: renesas-tpu: Improve precision of period and duty_cycle calculation
  pwm: renesas-tpu: Improve maths to compute register settings
  pwm: renesas-tpu: Rename variables to match the usual naming
  pwm: renesas-tpu: Implement .apply() callback
  pwm: renesas-tpu: Make use of devm functions
  pwm: renesas-tpu: Make use of dev_err_probe()
  ...

2 years agoMerge tag 'rpmsg-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc...
Linus Torvalds [Wed, 1 Jun 2022 17:39:58 +0000 (10:39 -0700)]
Merge tag 'rpmsg-v5.19' of git://git./linux/kernel/git/remoteproc/linux

Pull rpmsg updates from Bjorn Andersson:
 "This corrects the check for irq_of_parse_and_map() failures in the
  Qualcomm SMD driver and fixes unregistration and a couple of double
  free in the virtio rpmsg driver"

* tag 'rpmsg-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux:
  rpmsg: qcom_smd: Fix returning 0 if irq_of_parse_and_map() fails
  rpmsg: virtio: Fix the unregistration of the device rpmsg_ctrl
  rpmsg: virtio: Fix possible double free in rpmsg_virtio_add_ctrl_dev()
  rpmsg: virtio: Fix possible double free in rpmsg_probe()
  rpmsg: qcom_smd: Fix irq_of_parse_and_map() return value

2 years agoMerge tag 'rproc-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc...
Linus Torvalds [Wed, 1 Jun 2022 17:35:22 +0000 (10:35 -0700)]
Merge tag 'rproc-v5.19' of git://git./linux/kernel/git/remoteproc/linux

Pull remoteproc updates from Bjorn Andersson:
 "This fixes a race condition in the user space interface for starting
  and stopping remote processors, it makes the ELF loader properly skip
  zero memsz segments and it cleans up the debugfs tracefile code a bit
  by not checking for errors.

  It introduces support for controlling the audio DSP on Qualcomm
  MSM8226, as well as audio and compute DSPs on Qualcomm SC8280XP.

  It makes it possible to specify the firmware path for Mediatek's
  remote processors, fixes a double free in the SCP driver and addresses
  an issue with the SRAM initialization on MT8195.

  Lastly it deprecates the custom ELF loader in the iMX remoteproc
  driver, in favor of using the shared one"

* tag 'rproc-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux: (21 commits)
  dt-bindings: remoteproc: mediatek: Add optional memory-region to mtk,scp
  dt-bindings: remoteproc: mediatek: Make l1tcm reg exclusive to mt819x
  dt-bindings: remoteproc: st,stm32-rproc: Fix phandle-array parameters description
  remoteproc: imx_rproc: Support i.MX93
  dt-bindings: remoteproc: imx_rproc: Support i.MX93
  remoteproc: qcom: pas: Add MSM8226 ADSP support
  dt-bindings: remoteproc: qcom: pas: Add MSM8226 adsp
  remoteproc: mediatek: Allow reading firmware-name from DT
  dt-bindings: remoteproc: mediatek: Add firmware-name property
  remoteproc: qcom: pas: Add sc8280xp remoteprocs
  dt-bindings: remoteproc: qcom: pas: Add sc8280xp adsp and nsp pair
  dt-bindings: remoteproc: mediatek: Add interrupts property to mtk,scp
  remoteproc: imx_rproc: Ignore create mem entry for resource table
  remoteproc: core: Move state checking to remoteproc_core
  remoteproc: core: Remove state checking before calling rproc_boot()
  remoteproc: imx_dsp_rproc: Make rsc_table optional
  remoteproc: imx_dsp_rproc: use common rproc_elf_load_segments
  remoteproc: elf_loader: skip segment with memsz as zero
  remoteproc: mtk_scp: Fix a potential double free
  remoteproc: Don't bother checking the return value of debugfs_create*
  ...

2 years agoMerge tag 'spi-fix-v5.19-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/brooni...
Linus Torvalds [Wed, 1 Jun 2022 17:30:18 +0000 (10:30 -0700)]
Merge tag 'spi-fix-v5.19-rc0' of git://git./linux/kernel/git/broonie/spi

Pull spi fixes from Mark Brown:
 "A couple of fixes that came in during the merge window: a driver fix
  for spurious timeouts in the fsi driver and an improvement to make the
  core display error messages for transfer_one_message() to help people
  debug things"

* tag 'spi-fix-v5.19-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: core: Display return code when failing to transfer message
  spi: fsi: Fix spurious timeout

2 years agoMerge branch 'pcmcia-next' of git://git.kernel.org/pub/scm/linux/kernel/git/brodo...
Linus Torvalds [Wed, 1 Jun 2022 17:25:06 +0000 (10:25 -0700)]
Merge branch 'pcmcia-next' of git://git./linux/kernel/git/brodo/linux

Pull pcmcia updates from Dominik Brodowski:
 "A few odd cleanups and fixes, including a Kconfig fix to add a
  required dependency on MIPS"

* 'pcmcia-next' of git://git.kernel.org/pub/scm/linux/kernel/git/brodo/linux:
  pcmcia: Use platform_get_irq() to get the interrupt
  pcmcia: db1xxx_ss: restrict to MIPS_DB1XXX boards
  drivers/pcmcia: Fix typo in comment

2 years agonet/sched: act_api: fix error code in tcf_ct_flow_table_fill_tuple_ipv6()
Dan Carpenter [Tue, 31 May 2022 12:10:05 +0000 (15:10 +0300)]
net/sched: act_api: fix error code in tcf_ct_flow_table_fill_tuple_ipv6()

The tcf_ct_flow_table_fill_tuple_ipv6() function is supposed to return
false on failure.  It should not return negatives because that means
succes/true.

Fixes: fcb6aa86532c ("act_ct: Support GRE offload")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Toshiaki Makita <toshiaki.makita1@gmail.com>
Link: https://lore.kernel.org/r/YpYFnbDxFl6tQ3Bn@kili
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet: ping6: Fix ping -6 with interface name
Aya Levin [Tue, 31 May 2022 08:45:44 +0000 (11:45 +0300)]
net: ping6: Fix ping -6 with interface name

When passing interface parameter to ping -6:
$ ping -6 ::11:141:84:9 -I eth2
Results in:
PING ::11:141:84:10(::11:141:84:10) from ::11:141:84:9 eth2: 56 data bytes
ping: sendmsg: Invalid argument
ping: sendmsg: Invalid argument

Initialize the fl6's outgoing interface (OIF) before triggering
ip6_datagram_send_ctl. Don't wipe fl6 after ip6_datagram_send_ctl() as
changes in fl6 that may happen in the function are overwritten explicitly.
Update comment accordingly.

Fixes: 13651224c00b ("net: ping6: support setting basic SOL_IPV6 options via cmsg")
Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20220531084544.15126-1-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agomacsec: fix UAF bug for real_dev
Ziyang Xuan [Tue, 31 May 2022 07:45:00 +0000 (15:45 +0800)]
macsec: fix UAF bug for real_dev

Create a new macsec device but not get reference to real_dev. That can
not ensure that real_dev is freed after macsec. That will trigger the
UAF bug for real_dev as following:

==================================================================
BUG: KASAN: use-after-free in macsec_get_iflink+0x5f/0x70 drivers/net/macsec.c:3662
Call Trace:
 ...
 macsec_get_iflink+0x5f/0x70 drivers/net/macsec.c:3662
 dev_get_iflink+0x73/0xe0 net/core/dev.c:637
 default_operstate net/core/link_watch.c:42 [inline]
 rfc2863_policy+0x233/0x2d0 net/core/link_watch.c:54
 linkwatch_do_dev+0x2a/0x150 net/core/link_watch.c:161

Allocated by task 22209:
 ...
 alloc_netdev_mqs+0x98/0x1100 net/core/dev.c:10549
 rtnl_create_link+0x9d7/0xc00 net/core/rtnetlink.c:3235
 veth_newlink+0x20e/0xa90 drivers/net/veth.c:1748

Freed by task 8:
 ...
 kfree+0xd6/0x4d0 mm/slub.c:4552
 kvfree+0x42/0x50 mm/util.c:615
 device_release+0x9f/0x240 drivers/base/core.c:2229
 kobject_cleanup lib/kobject.c:673 [inline]
 kobject_release lib/kobject.c:704 [inline]
 kref_put include/linux/kref.h:65 [inline]
 kobject_put+0x1c8/0x540 lib/kobject.c:721
 netdev_run_todo+0x72e/0x10b0 net/core/dev.c:10327

After commit faab39f63c1f ("net: allow out-of-order netdev unregistration")
and commit e5f80fcf869a ("ipv6: give an IPv6 dev to blackhole_netdev"), we
can add dev_hold_track() in macsec_dev_init() and dev_put_track() in
macsec_free_netdev() to fix the problem.

Fixes: 2bce1ebed17d ("macsec: fix refcnt leak in module exit routine")
Reported-by: syzbot+d0e94b65ac259c29ce7a@syzkaller.appspotmail.com
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Link: https://lore.kernel.org/r/20220531074500.1272846-1-william.xuanziyang@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agoocteontx2-af: fix error code in is_valid_offset()
Dan Carpenter [Tue, 31 May 2022 07:28:45 +0000 (10:28 +0300)]
octeontx2-af: fix error code in is_valid_offset()

The is_valid_offset() function returns success/true if the call to
validate_and_get_cpt_blkaddr() fails.

Fixes: ecad2ce8c48f ("octeontx2-af: cn10k: Add mailbox to configure reassembly timeout")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/YpXDrTPb8qV01JSP@kili
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agowifi: mac80211: fix use-after-free in chanctx code
Johannes Berg [Wed, 1 Jun 2022 07:19:36 +0000 (09:19 +0200)]
wifi: mac80211: fix use-after-free in chanctx code

In ieee80211_vif_use_reserved_context(), when we have an
old context and the new context's replace_state is set to
IEEE80211_CHANCTX_REPLACE_NONE, we free the old context
in ieee80211_vif_use_reserved_reassign(). Therefore, we
cannot check the old_ctx anymore, so we should set it to
NULL after this point.

However, since the new_ctx replace state is clearly not
IEEE80211_CHANCTX_REPLACES_OTHER, we're not going to do
anything else in this function and can just return to
avoid accessing the freed old_ctx.

Cc: stable@vger.kernel.org
Fixes: 5bcae31d9cb1 ("mac80211: implement multi-vif in-place reservations")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220601091926.df419d91b165.I17a9b3894ff0b8323ce2afdb153b101124c821e5@changeid
2 years agobonding: guard ns_targets by CONFIG_IPV6
Hangbin Liu [Tue, 31 May 2022 06:37:27 +0000 (14:37 +0800)]
bonding: guard ns_targets by CONFIG_IPV6

Guard ns_targets in struct bond_params by CONFIG_IPV6, which could save
256 bytes if IPv6 not configed. Also add this protection for function
bond_is_ip6_target_ok() and bond_get_targets_ip6().

Remove the IS_ENABLED() check for bond_opts[] as this will make
BOND_OPT_NS_TARGETS uninitialized if CONFIG_IPV6 not enabled. Add
a dummy bond_option_ns_ip6_targets_set() for this situation.

Fixes: 4e24be018eb9 ("bonding: add new parameter ns_targets")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Jonathan Toppins <jtoppins@redhat.com>
Link: https://lore.kernel.org/r/20220531063727.224043-1-liuhangbin@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agotcp: tcp_rtx_synack() can be called from process context
Eric Dumazet [Mon, 30 May 2022 21:37:13 +0000 (14:37 -0700)]
tcp: tcp_rtx_synack() can be called from process context

Laurent reported the enclosed report [1]

This bug triggers with following coditions:

0) Kernel built with CONFIG_DEBUG_PREEMPT=y

1) A new passive FastOpen TCP socket is created.
   This FO socket waits for an ACK coming from client to be a complete
   ESTABLISHED one.
2) A socket operation on this socket goes through lock_sock()
   release_sock() dance.
3) While the socket is owned by the user in step 2),
   a retransmit of the SYN is received and stored in socket backlog.
4) At release_sock() time, the socket backlog is processed while
   in process context.
5) A SYNACK packet is cooked in response of the SYN retransmit.
6) -> tcp_rtx_synack() is called in process context.

Before blamed commit, tcp_rtx_synack() was always called from BH handler,
from a timer handler.

Fix this by using TCP_INC_STATS() & NET_INC_STATS()
which do not assume caller is in non preemptible context.

[1]
BUG: using __this_cpu_add() in preemptible [00000000] code: epollpep/2180
caller is tcp_rtx_synack.part.0+0x36/0xc0
CPU: 10 PID: 2180 Comm: epollpep Tainted: G           OE     5.16.0-0.bpo.4-amd64 #1  Debian 5.16.12-1~bpo11+1
Hardware name: Supermicro SYS-5039MC-H8TRF/X11SCD-F, BIOS 1.7 11/23/2021
Call Trace:
 <TASK>
 dump_stack_lvl+0x48/0x5e
 check_preemption_disabled+0xde/0xe0
 tcp_rtx_synack.part.0+0x36/0xc0
 tcp_rtx_synack+0x8d/0xa0
 ? kmem_cache_alloc+0x2e0/0x3e0
 ? apparmor_file_alloc_security+0x3b/0x1f0
 inet_rtx_syn_ack+0x16/0x30
 tcp_check_req+0x367/0x610
 tcp_rcv_state_process+0x91/0xf60
 ? get_nohz_timer_target+0x18/0x1a0
 ? lock_timer_base+0x61/0x80
 ? preempt_count_add+0x68/0xa0
 tcp_v4_do_rcv+0xbd/0x270
 __release_sock+0x6d/0xb0
 release_sock+0x2b/0x90
 sock_setsockopt+0x138/0x1140
 ? __sys_getsockname+0x7e/0xc0
 ? aa_sk_perm+0x3e/0x1a0
 __sys_setsockopt+0x198/0x1e0
 __x64_sys_setsockopt+0x21/0x30
 do_syscall_64+0x38/0xc0
 entry_SYSCALL_64_after_hwframe+0x44/0xae

Fixes: 168a8f58059a ("tcp: TCP Fast Open Server - main code path")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Laurent Fasnacht <laurent.fasnacht@proton.ch>
Acked-by: Neal Cardwell <ncardwell@google.com>
Link: https://lore.kernel.org/r/20220530213713.601888-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Jakub Kicinski [Wed, 1 Jun 2022 03:58:24 +0000 (20:58 -0700)]
Merge git://git./linux/kernel/git/netfilter/nf

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

1) Missing proper sanitization for nft_set_desc_concat_parse().

2) Missing mutex in nf_tables pre_exit path.

3) Possible double hook unregistration from clean_net path.

4) Missing FLOWI_FLAG_ANYSRC flag in flowtable route lookup.
   Fix incorrect source and destination address in case of NAT.
   Patch from wenxu.

* git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  netfilter: flowtable: fix nft_flow_route source address for nat case
  netfilter: flowtable: fix missing FLOWI_FLAG_ANYSRC flag
  netfilter: nf_tables: double hook unregistration in netns path
  netfilter: nf_tables: hold mutex on netns pre_exit path
  netfilter: nf_tables: sanitize nft_set_desc_concat_parse()
====================

Link: https://lore.kernel.org/r/20220531215839.84765-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>