Michael Petlan [Mon, 4 Apr 2022 22:15:41 +0000 (00:15 +0200)]
perf tools: Add external commands to list-cmds
The `perf --list-cmds` output prints only internal commands, although
there is no reason for that from users' perspective.
Adding the external commands to commands array with NULL function
pointer allows printing all perf commands while not changing the logic
of command handler selection.
Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20220404221541.30312-2-mpetlan@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Michael Petlan [Mon, 4 Apr 2022 22:15:40 +0000 (00:15 +0200)]
perf docs: Add perf-iostat link to manpages
Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20220404221541.30312-1-mpetlan@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Denis Nikitin [Wed, 30 Mar 2022 03:11:30 +0000 (20:11 -0700)]
perf session: Remap buf if there is no space for event
If a perf event doesn't fit into remaining buffer space return NULL to
remap buf and fetch the event again.
Keep the logic to error out on inadequate input from fuzzing.
This fixes perf failing on ChromeOS (with 32b userspace):
$ perf report -v -i perf.data
...
prefetch_event: head=0x1fffff8 event->header_size=0x30, mmap_size=0x2000000: fuzzed or compressed perf.data?
Error:
failed to process sample
Fixes:
57fc032ad643ffd0 ("perf session: Avoid infinite loop when seeing invalid header.size")
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Denis Nikitin <denik@chromium.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220330031130.2152327-1-denik@chromium.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Athira Rajeev [Wed, 6 Apr 2022 17:51:11 +0000 (23:21 +0530)]
perf bench: Fix epoll bench to correct usage of affinity for machines with #CPUs > 1K
The 'perf bench epoll' testcase fails on systems with more than 1K CPUs.
Testcase: perf bench epoll all
Result snippet:
<<>>
Run summary [PID 106497]: 1399 threads monitoring on 64 file-descriptors for 8 secs.
perf: pthread_create: No such file or directory
<<>>
In epoll benchmarks (ctl, wait) pthread_create is invoked in do_threads
from respective bench_epoll_* function. Though the logs shows direct
failure from pthread_create, the actual failure is from
"sched_setaffinity" returning EINVAL (invalid argument).
This happens because the default mask size in glibc is 1024. To overcome
this 1024 CPUs mask size limitation of cpu_set_t, change the mask size
using the CPU_*_S macros.
Patch addresses this by fixing all the epoll benchmarks to use CPU_ALLOC
to allocate cpumask, CPU_ALLOC_SIZE for size, and CPU_SET_S to set the
mask.
Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20220406175113.87881-3-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Athira Rajeev [Wed, 6 Apr 2022 17:51:10 +0000 (23:21 +0530)]
perf bench: Fix futex bench to correct usage of affinity for machines with #CPUs > 1K
The 'perf bench futex' testcase fails on systems with more than 1K CPUs.
Testcase: perf bench futex all
Failure snippet:
<<>>Running futex/hash benchmark...
perf: pthread_create: No such file or directory
<<>>
All the futex benchmarks (ie hash, lock-api, requeue, wake,
wake-parallel), pthread_create is invoked in respective bench_futex_*
function. Though the logs shows direct failure from pthread_create,
strace logs showed that actual failure is from "sched_setaffinity"
returning EINVAL (invalid argument).
This happens because the default mask size in glibc is 1024. To overcome
this 1024 CPUs mask size limitation of cpu_set_t, change the mask size
using the CPU_*_S macros.
Patch addresses this by fixing all the futex benchmarks to use CPU_ALLOC
to allocate cpumask, CPU_ALLOC_SIZE for size, and CPU_SET_S to set the
mask.
Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20220406175113.87881-2-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Fri, 8 Apr 2022 13:26:25 +0000 (16:26 +0300)]
perf tools: Fix perf's libperf_print callback
eprintf() does not expect va_list as the type of the 4th parameter.
Use veprintf() because it does.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Fixes:
428dab813a56ce94 ("libperf: Merge libperf_set_print() into libperf_init()")
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20220408132625.2451452-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
James Clark [Fri, 8 Apr 2022 14:40:56 +0000 (15:40 +0100)]
perf: arm-spe: Fix perf report --mem-mode
Since commit
bb30acae4c4dacfa ("perf report: Bail out --mem-mode if mem
info is not available") "perf mem report" and "perf report --mem-mode"
don't allow opening the file unless one of the events has
PERF_SAMPLE_DATA_SRC set.
SPE doesn't have this set even though synthetic memory data is generated
after it is decoded. Fix this issue by setting DATA_SRC on SPE events.
This has no effect on the data collected because the SPE driver doesn't
do anything with that flag and doesn't generate samples.
Fixes:
bb30acae4c4dacfa ("perf report: Bail out --mem-mode if mem info is not available")
Signed-off-by: James Clark <james.clark@arm.com>
Tested-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220408144056.1955535-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
James Clark [Wed, 6 Apr 2022 14:56:51 +0000 (15:56 +0100)]
perf unwind: Don't show unwind error messages when augmenting frame pointer stack
Commit Fixes:
b9f6fbb3b2c29736 ("perf arm64: Inject missing frames when
using 'perf record --call-graph=fp'") intended to add a 'best effort'
DWARF unwind that improved the frame pointer stack in most scenarios.
It's expected that the unwind will fail sometimes, but this shouldn't be
reported as an error. It only works when the return address can be
determined from the contents of the link register alone.
Fix the error shown when the unwinder requires extra registers by adding
a new flag that suppresses error messages. This flag is not set in the
normal --call-graph=dwarf unwind mode so that behavior is not changed.
Fixes:
b9f6fbb3b2c29736 ("perf arm64: Inject missing frames when using 'perf record --call-graph=fp'")
Reported-by: John Garry <john.garry@huawei.com>
Signed-off-by: James Clark <james.clark@arm.com>
Tested-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Truong <alexandre.truong@arm.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220406145651.1392529-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Sat, 9 Apr 2022 14:48:15 +0000 (11:48 -0300)]
tools headers arm64: Sync arm64's cputype.h with the kernel sources
To get the changes in:
83bea32ac7ed37bb ("arm64: Add part number for Arm Cortex-A78AE")
That addresses this perf build warning:
Warning: Kernel ABI header at 'tools/arch/arm64/include/asm/cputype.h' differs from latest version at 'arch/arm64/include/asm/cputype.h'
diff -u tools/arch/arm64/include/asm/cputype.h arch/arm64/include/asm/cputype.h
Cc: Ali Saidi <alisaidi@amazon.com>
Cc: Andrew Kilroy <andrew.kilroy@arm.com>
Cc: Chanho Park <chanho61.park@samsung.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Will Deacon <will@kernel.org>
Link: http://lore.kernel.org/lkml/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Chengdong Li [Fri, 8 Apr 2022 08:47:48 +0000 (16:47 +0800)]
perf test tsc: Fix error message when not supported
By default `perf test tsc` does not return the error message when the
child process detected kernel does not support it. Instead, the child
process prints an error message to stderr, unfortunately stderr is
redirected to /dev/null when verbose <= 0.
This patch does:
- return TEST_SKIP to the parent process instead of TEST_OK when
perf_read_tsc_conversion() is not supported.
- Add a new subtest of testing if TSC is supported on current
architecture by moving exist code to a separate function.
It avoids two places in test__perf_time_to_tsc() that return
TEST_SKIP by doing this.
- Extend the test suite definition to contain above two subtests.
Current test_suite and test_case structs do not support printing skip
reason when the number of subtest less than 1. To print skip reason, it
is necessary to extend current test suite definition.
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Chengdong Li <chengdongli@tencent.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: likexu@tencent.com
Link: https://lore.kernel.org/r/20220408084748.43707-1-chengdongli@tencent.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Thu, 7 Apr 2022 14:04:20 +0000 (11:04 -0300)]
perf build: Don't use -ffat-lto-objects in the python feature test when building with clang-13
Using -ffat-lto-objects in the python feature test when building with
clang-13 results in:
clang-13: error: optimization flag '-ffat-lto-objects' is not supported [-Werror,-Wignored-optimization-argument]
error: command '/usr/sbin/clang' failed with exit code 1
cp: cannot stat '/tmp/build/perf/python_ext_build/lib/perf*.so': No such file or directory
make[2]: *** [Makefile.perf:639: /tmp/build/perf/python/perf.so] Error 1
Noticed when building on a docker.io/library/archlinux:base container.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Fangrui Song <maskray@google.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Keeping <john@metanate.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Fri, 8 Apr 2022 13:08:07 +0000 (10:08 -0300)]
perf python: Fix probing for some clang command line options
The clang compiler complains about some options even without a source
file being available, while others require one, so use the simple
tools/build/feature/test-hello.c file.
Then check for the "is not supported" string in its output, in addition
to the "unknown argument" already being looked for.
This was noticed when building with clang-13 where -ffat-lto-objects
isn't supported and since we were looking just for "unknown argument"
and not providing a source code to clang, was mistakenly assumed as
being available and not being filtered to set of command line options
provided to clang, leading to a build failure.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Fangrui Song <maskray@google.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Keeping <john@metanate.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Link: http://lore.kernel.org/lkml/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Tue, 5 Apr 2022 13:33:21 +0000 (10:33 -0300)]
tools build: Filter out options and warnings not supported by clang
These make the feature check fail when using clang, so remove them just
like is done in tools/perf/Makefile.config to build perf itself.
Adding -Wno-compound-token-split-by-macro to tools/perf/Makefile.config
when building with clang is also necessary to avoid these warnings
turned into errors (-Werror):
CC /tmp/build/perf/util/scripting-engines/trace-event-perl.o
In file included from util/scripting-engines/trace-event-perl.c:35:
In file included from /usr/lib64/perl5/CORE/perl.h:4085:
In file included from /usr/lib64/perl5/CORE/hv.h:659:
In file included from /usr/lib64/perl5/CORE/hv_func.h:34:
In file included from /usr/lib64/perl5/CORE/sbox32_hash.h:4:
/usr/lib64/perl5/CORE/zaphod32_hash.h:150:5: error: '(' and '{' tokens introducing statement expression appear in different macro expansion contexts [-Werror,-Wcompound-token-split-by-macro]
ZAPHOD32_SCRAMBLE32(state[0],0x9fade23b);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/lib64/perl5/CORE/zaphod32_hash.h:80:38: note: expanded from macro 'ZAPHOD32_SCRAMBLE32'
#define ZAPHOD32_SCRAMBLE32(v,prime) STMT_START { \
^~~~~~~~~~
/usr/lib64/perl5/CORE/perl.h:737:29: note: expanded from macro 'STMT_START'
# define STMT_START (void)( /* gcc supports "({ STATEMENTS; })" */
^
/usr/lib64/perl5/CORE/zaphod32_hash.h:150:5: note: '{' token is here
ZAPHOD32_SCRAMBLE32(state[0],0x9fade23b);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/lib64/perl5/CORE/zaphod32_hash.h:80:49: note: expanded from macro 'ZAPHOD32_SCRAMBLE32'
#define ZAPHOD32_SCRAMBLE32(v,prime) STMT_START { \
^
/usr/lib64/perl5/CORE/zaphod32_hash.h:150:5: error: '}' and ')' tokens terminating statement expression appear in different macro expansion contexts [-Werror,-Wcompound-token-split-by-macro]
ZAPHOD32_SCRAMBLE32(state[0],0x9fade23b);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/lib64/perl5/CORE/zaphod32_hash.h:87:41: note: expanded from macro 'ZAPHOD32_SCRAMBLE32'
v ^= (v>>23); \
^
/usr/lib64/perl5/CORE/zaphod32_hash.h:150:5: note: ')' token is here
ZAPHOD32_SCRAMBLE32(state[0],0x9fade23b);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/lib64/perl5/CORE/zaphod32_hash.h:88:3: note: expanded from macro 'ZAPHOD32_SCRAMBLE32'
} STMT_END
^~~~~~~~
/usr/lib64/perl5/CORE/perl.h:738:21: note: expanded from macro 'STMT_END'
# define STMT_END )
^
Please refer to the discussion on the Link: tag below, where Nathan
clarifies the situation:
<quote>
acme> And then get to the problems at the end of this message, which seem
acme> similar to the problem described here:
acme>
acme> From Nathan Chancellor <>
acme> Subject [PATCH] mwifiex: Remove unnecessary braces from HostCmd_SET_SEQ_NO_BSS_INFO
acme>
acme> https://lkml.org/lkml/2020/9/1/135
acme>
acme> So perhaps in this case its better to disable that
acme> -Werror,-Wcompound-token-split-by-macro when building with clang?
Yes, I think that is probably the best solution. As far as I can tell,
at least in this file and context, the warning appears harmless, as the
"create a GNU C statement expression from two different macros" is very
much intentional, based on the presence of PERL_USE_GCC_BRACE_GROUPS.
The warning is fixed in upstream Perl by just avoiding creating GNU C
statement expressions using STMT_START and STMT_END:
https://github.com/Perl/perl5/issues/18780
https://github.com/Perl/perl5/pull/18984
If I am reading the source code correctly, an alternative to disabling
the warning would be specifying -DPERL_GCC_BRACE_GROUPS_FORBIDDEN but it
seems like that might end up impacting more than just this site,
according to the issue discussion above.
</quote>
Based-on-a-patch-by: Sedat Dilek <sedat.dilek@gmail.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # Debian/Selfmade LLVM-14 (x86-64)
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Fangrui Song <maskray@google.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Keeping <john@metanate.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Link: http://lore.kernel.org/lkml/YkxWcYzph5pC1EK8@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Mon, 4 Apr 2022 20:28:48 +0000 (17:28 -0300)]
tools build: Use $(shell ) instead of `` to get embedded libperl's ccopts
Just like its done for ldopts and for both in tools/perf/Makefile.config.
Using `` to initialize PERL_EMBED_CCOPTS somehow precludes using:
$(filter-out SOMETHING_TO_FILTER,$(PERL_EMBED_CCOPTS))
And we need to do it to allow for building with versions of clang where
some gcc options selected by distros are not available.
Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # Debian/Selfmade LLVM-14 (x86-64)
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Fangrui Song <maskray@google.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Keeping <john@metanate.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Link: http://lore.kernel.org/lkml/YktYX2OnLtyobRYD@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Tue, 14 Apr 2020 12:12:55 +0000 (09:12 -0300)]
tools include UAPI: Sync linux/vhost.h with the kernel sources
To get the changes in:
b04d910af330b55e ("vdpa: support exposing the count of vqs to userspace")
a61280ddddaa45f9 ("vdpa: support exposing the config size to userspace")
Silencing this perf build warning:
Warning: Kernel ABI header at 'tools/include/uapi/linux/vhost.h' differs from latest version at 'include/uapi/linux/vhost.h'
diff -u tools/include/uapi/linux/vhost.h include/uapi/linux/vhost.h
$ diff -u tools/include/uapi/linux/vhost.h include/uapi/linux/vhost.h
--- tools/include/uapi/linux/vhost.h 2021-07-15 16:17:01.
840818309 -0300
+++ include/uapi/linux/vhost.h 2022-04-02 18:55:05.
702522387 -0300
@@ -150,4 +150,11 @@
/* Get the valid iova range */
#define VHOST_VDPA_GET_IOVA_RANGE _IOR(VHOST_VIRTIO, 0x78, \
struct vhost_vdpa_iova_range)
+
+/* Get the config size */
+#define VHOST_VDPA_GET_CONFIG_SIZE _IOR(VHOST_VIRTIO, 0x79, __u32)
+
+/* Get the count of all virtqueues */
+#define VHOST_VDPA_GET_VQS_COUNT _IOR(VHOST_VIRTIO, 0x80, __u32)
+
#endif
$ tools/perf/trace/beauty/vhost_virtio_ioctl.sh > before
$ cp include/uapi/linux/vhost.h tools/include/uapi/linux/vhost.h
$ tools/perf/trace/beauty/vhost_virtio_ioctl.sh > after
$ diff -u before after
--- before 2022-04-04 14:52:25.
036375145 -0300
+++ after 2022-04-04 14:52:31.
906549976 -0300
@@ -38,4 +38,6 @@
[0x73] = "VDPA_GET_CONFIG",
[0x76] = "VDPA_GET_VRING_NUM",
[0x78] = "VDPA_GET_IOVA_RANGE",
+ [0x79] = "VDPA_GET_CONFIG_SIZE",
+ [0x80] = "VDPA_GET_VQS_COUNT",
};
$
Cc: Longpeng <longpeng2@huawei.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Link: https://lore.kernel.org/lkml/YksxoFcOARk%2Fldev@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Linus Torvalds [Sat, 9 Apr 2022 04:58:03 +0000 (18:58 -1000)]
Merge tag 'block-5.18-2022-04-08' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"Nothing major in here, just a few small fixes:
- Small series of neglected drbd patches (Christoph, Lv, Xiaomeng)
- Remove dead variable in cdrom (Enze)"
* tag 'block-5.18-2022-04-08' of git://git.kernel.dk/linux-block:
drbd: set QUEUE_FLAG_STABLE_WRITES
drbd: fix an invalid memory access caused by incorrect use of list iterator
drbd: Fix five use after free bugs in get_initial_state
cdrom: remove unused variable
Linus Torvalds [Sat, 9 Apr 2022 04:50:14 +0000 (18:50 -1000)]
Merge tag 'io_uring-5.18-2022-04-08' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
"A bit bigger than usual post merge window, largely due to a revert and
a fix of at what point files are assigned for requests.
The latter fixing a linked request use case where a dependent link can
rely on what file is assigned consistently.
Summary:
- 32-bit compat fix for IORING_REGISTER_IOWQ_AFF (Eugene)
- File assignment fixes (me)
- Revert of the NAPI poll addition from this merge window. The author
isn't available right now to engage on this, so let's revert it and
we can retry for the 5.19 release (me, Jakub)
- Fix a timeout removal race (me)
- File update and SCM fixes (Pavel)"
* tag 'io_uring-5.18-2022-04-08' of git://git.kernel.dk/linux-block:
io_uring: fix race between timeout flush and removal
io_uring: use nospec annotation for more indexes
io_uring: zero tag on rsrc removal
io_uring: don't touch scm_fp_list after queueing skb
io_uring: nospec index for tags on files update
io_uring: implement compat handling for IORING_REGISTER_IOWQ_AFF
Revert "io_uring: Add support for napi_busy_poll"
io_uring: drop the old style inflight file tracking
io_uring: defer file assignment
io_uring: propagate issue_flags state down to file assignment
io_uring: move read/write file prep state into actual opcode handler
io_uring: defer splice/tee file validity check until command issue
io_uring: don't check req->file in io_fsync_prep()
Linus Torvalds [Sat, 9 Apr 2022 04:29:02 +0000 (18:29 -1000)]
Merge tag 'for-linus' of git://git./linux/kernel/git/rdma/rdma
Pull rdma fixes from Jason Gunthorpe:
"Several bug fixes for old bugs:
- Welcome Leon as co-maintainer for RDMA so we are back to having two
people
- Some corner cases are fixed in mlx5's MR code
- Long standing CM bug where a DREQ at the wrong time can result in a
long timeout
- Missing locking and refcounting in hf1"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/hfi1: Fix use-after-free bug for mm struct
IB/rdmavt: add lock to call to rvt_error_qp to prevent a race condition
IB/cm: Cancel mad on the DREQ event when the state is MRA_REP_RCVD
RDMA/mlx5: Add a missing update of cache->last_add
RDMA/mlx5: Don't remove cache MRs when a delay is needed
MAINTAINERS: Update qib and hfi1 related drivers
MAINTAINERS: Add Leon Romanovsky to RDMA maintainers
Linus Torvalds [Sat, 9 Apr 2022 04:23:02 +0000 (18:23 -1000)]
Merge tag 'acpi-5.18-rc2' of git://git./linux/kernel/git/rafael/linux-pm
Pull ACPI updates from Rafael Wysocki:
"These revert a problematic commit from the 5.17 development cycle and
finalize the elimination of acpi_bus_get_device() that mostly took
place during the recent merge window.
Specifics:
- Revert an ACPI processor driver change related to cache
invalidation in acpi_idle_play_dead() that clearly was a mistake
and introduced user-visible regressions (Akihiko Odaki).
- Replace the last instance of acpi_bus_get_device() added during the
recent merge window and drop the function to prevent more users of
it from being added (Rafael Wysocki)"
* tag 'acpi-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: bus: Eliminate acpi_bus_get_device()
Revert "ACPI: processor: idle: Only flush cache on entering C3"
Linus Torvalds [Sat, 9 Apr 2022 01:06:11 +0000 (15:06 -1000)]
Merge tag 'linux-kselftest-kunit-fixes-5.18-rc2' of git://git./linux/kernel/git/shuah/linux-kselftest
Pull KUnit fix from Shuah Khan:
"A single documentation fix to incorrect and outdated usage
information"
* tag 'linux-kselftest-kunit-fixes-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
Documentation: kunit: fix path to .kunitconfig in start.rst
Linus Torvalds [Sat, 9 Apr 2022 00:48:35 +0000 (14:48 -1000)]
Merge tag 'linux-kselftest-fixes-5.18-rc2' of git://git./linux/kernel/git/shuah/linux-kselftest
Pull Kselftest fixes from Shuah Khan:
"Build and run-times fixes to tests:
- header dependencies
- missing tear-downs to release allocated resources in assert paths
- missing error messages when build fails
- coccicheck and unused variable warnings"
* tag 'linux-kselftest-fixes-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests/harness: Pass variant to teardown
selftests/harness: Run TEARDOWN for ASSERT failures
selftests: fix an unused variable warning in pidfd selftest
selftests: fix header dependency for pid_namespace selftests
selftests: x86: add 32bit build warnings for SUSE
selftests/proc: fix array_size.cocci warning
selftests/vDSO: fix array_size.cocci warning
Linus Torvalds [Sat, 9 Apr 2022 00:31:41 +0000 (14:31 -1000)]
Merge branch 'akpm' (patches from Andrew)
Merge fixes from Andrew Morton:
"9 patches.
Subsystems affected by this patch series: mm (migration, highmem,
sparsemem, mremap, mempolicy, and memcg), lz4, mailmap, and
MAINTAINERS"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
MAINTAINERS: add Tom as clang reviewer
mm/list_lru.c: revert "mm/list_lru: optimize memcg_reparent_list_lru_node()"
mailmap: update Vasily Averin's email address
mm/mempolicy: fix mpol_new leak in shared_policy_replace
mmmremap.c: avoid pointless invalidate_range_start/end on mremap(old_size=0)
mm/sparsemem: fix 'mem_section' will never be NULL gcc 12 warning
lz4: fix LZ4_decompress_safe_partial read out of bound
highmem: fix checks in __kmap_local_sched_{in,out}
mm: migrate: use thp_order instead of HPAGE_PMD_ORDER for new page allocation.
Tom Rix [Fri, 8 Apr 2022 20:09:16 +0000 (13:09 -0700)]
MAINTAINERS: add Tom as clang reviewer
I have been helping with build breaks and other clang things and would
like to help with the reviews.
Link: https://lkml.kernel.org/r/20220407175715.3378998-1-trix@redhat.com
Signed-off-by: Tom Rix <trix@redhat.com>
Acked-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrew Morton [Fri, 8 Apr 2022 20:09:13 +0000 (13:09 -0700)]
mm/list_lru.c: revert "mm/list_lru: optimize memcg_reparent_list_lru_node()"
Commit
405cc51fc104 ("mm/list_lru: optimize memcg_reparent_list_lru_node()")
has subtle races which are proving ugly to fix. Revert the original
optimization. If quantitative testing indicates that we have a
significant problem here then other implementations can be looked at.
Fixes:
405cc51fc104 ("mm/list_lru: optimize memcg_reparent_list_lru_node()")
Acked-by: Shakeel Butt <shakeelb@google.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Waiman Long <longman@redhat.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vasily Averin [Fri, 8 Apr 2022 20:09:10 +0000 (13:09 -0700)]
mailmap: update Vasily Averin's email address
I'm moving to a @linux.dev account. Map my old addresses.
Link: https://lkml.kernel.org/r/737c7c2b-cdab-63ee-be90-cb33316c9657@linux.dev
Signed-off-by: Vasily Averin <vasily.averin@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Miaohe Lin [Fri, 8 Apr 2022 20:09:07 +0000 (13:09 -0700)]
mm/mempolicy: fix mpol_new leak in shared_policy_replace
If mpol_new is allocated but not used in restart loop, mpol_new will be
freed via mpol_put before returning to the caller. But refcnt is not
initialized yet, so mpol_put could not do the right things and might
leak the unused mpol_new. This would happen if mempolicy was updated on
the shared shmem file while the sp->lock has been dropped during the
memory allocation.
This issue could be triggered easily with the below code snippet if
there are many processes doing the below work at the same time:
shmid = shmget((key_t)5566, 1024 * PAGE_SIZE, 0666|IPC_CREAT);
shm = shmat(shmid, 0, 0);
loop many times {
mbind(shm, 1024 * PAGE_SIZE, MPOL_LOCAL, mask, maxnode, 0);
mbind(shm + 128 * PAGE_SIZE, 128 * PAGE_SIZE, MPOL_DEFAULT, mask,
maxnode, 0);
}
Link: https://lkml.kernel.org/r/20220329111416.27954-1-linmiaohe@huawei.com
Fixes:
42288fe366c4 ("mm: mempolicy: Convert shared_policy mutex to spinlock")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: <stable@vger.kernel.org> [3.8]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Paolo Bonzini [Fri, 8 Apr 2022 20:09:04 +0000 (13:09 -0700)]
mmmremap.c: avoid pointless invalidate_range_start/end on mremap(old_size=0)
If an mremap() syscall with old_size=0 ends up in move_page_tables(), it
will call invalidate_range_start()/invalidate_range_end() unnecessarily,
i.e. with an empty range.
This causes a WARN in KVM's mmu_notifier. In the past, empty ranges
have been diagnosed to be off-by-one bugs, hence the WARNing. Given the
low (so far) number of unique reports, the benefits of detecting more
buggy callers seem to outweigh the cost of having to fix cases such as
this one, where userspace is doing something silly. In this particular
case, an early return from move_page_tables() is enough to fix the
issue.
Link: https://lkml.kernel.org/r/20220329173155.172439-1-pbonzini@redhat.com
Reported-by: syzbot+6bde52d89cfdf9f61425@syzkaller.appspotmail.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Waiman Long [Fri, 8 Apr 2022 20:09:01 +0000 (13:09 -0700)]
mm/sparsemem: fix 'mem_section' will never be NULL gcc 12 warning
The gcc 12 compiler reports a "'mem_section' will never be NULL" warning
on the following code:
static inline struct mem_section *__nr_to_section(unsigned long nr)
{
#ifdef CONFIG_SPARSEMEM_EXTREME
if (!mem_section)
return NULL;
#endif
if (!mem_section[SECTION_NR_TO_ROOT(nr)])
return NULL;
:
It happens with CONFIG_SPARSEMEM_EXTREME off. The mem_section definition
is
#ifdef CONFIG_SPARSEMEM_EXTREME
extern struct mem_section **mem_section;
#else
extern struct mem_section mem_section[NR_SECTION_ROOTS][SECTIONS_PER_ROOT];
#endif
In the !CONFIG_SPARSEMEM_EXTREME case, mem_section is a static
2-dimensional array and so the check "!mem_section[SECTION_NR_TO_ROOT(nr)]"
doesn't make sense.
Fix this warning by moving the "!mem_section[SECTION_NR_TO_ROOT(nr)]"
check up inside the CONFIG_SPARSEMEM_EXTREME block and adding an
explicit NR_SECTION_ROOTS check to make sure that there is no
out-of-bound array access.
Link: https://lkml.kernel.org/r/20220331180246.2746210-1-longman@redhat.com
Fixes:
3e347261a80b ("sparsemem extreme implementation")
Signed-off-by: Waiman Long <longman@redhat.com>
Reported-by: Justin Forbes <jforbes@redhat.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Rafael Aquini <aquini@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Guo Xuenan [Fri, 8 Apr 2022 20:08:58 +0000 (13:08 -0700)]
lz4: fix LZ4_decompress_safe_partial read out of bound
When partialDecoding, it is EOF if we've either filled the output buffer
or can't proceed with reading an offset for following match.
In some extreme corner cases when compressed data is suitably corrupted,
UAF will occur. As reported by KASAN [1], LZ4_decompress_safe_partial
may lead to read out of bound problem during decoding. lz4 upstream has
fixed it [2] and this issue has been disscussed here [3] before.
current decompression routine was ported from lz4 v1.8.3, bumping
lib/lz4 to v1.9.+ is certainly a huge work to be done later, so, we'd
better fix it first.
[1] https://lore.kernel.org/all/
000000000000830d1205cf7f0477@google.com/
[2] https://github.com/lz4/lz4/commit/
c5d6f8a8be3927c0bec91bcc58667a6cfad244ad#
[3] https://lore.kernel.org/all/
CC666AE8-4CA4-4951-B6FB-
A2EFDE3AC03B@fb.com/
Link: https://lkml.kernel.org/r/20211111105048.2006070-1-guoxuenan@huawei.com
Reported-by: syzbot+63d688f1d899c588fb71@syzkaller.appspotmail.com
Signed-off-by: Guo Xuenan <guoxuenan@huawei.com>
Reviewed-by: Nick Terrell <terrelln@fb.com>
Acked-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Cc: Yann Collet <cyan@fb.com>
Cc: Chengyang Fan <cy.fan@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Max Filippov [Fri, 8 Apr 2022 20:08:55 +0000 (13:08 -0700)]
highmem: fix checks in __kmap_local_sched_{in,out}
When CONFIG_DEBUG_KMAP_LOCAL is enabled __kmap_local_sched_{in,out} check
that even slots in the tsk->kmap_ctrl.pteval are unmapped. The slots are
initialized with 0 value, but the check is done with pte_none. 0 pte
however does not necessarily mean that pte_none will return true. e.g.
on xtensa it returns false, resulting in the following runtime warnings:
WARNING: CPU: 0 PID: 101 at mm/highmem.c:627 __kmap_local_sched_out+0x51/0x108
CPU: 0 PID: 101 Comm: touch Not tainted 5.17.0-rc7-00010-gd3a1cdde80d2-dirty #13
Call Trace:
dump_stack+0xc/0x40
__warn+0x8f/0x174
warn_slowpath_fmt+0x48/0xac
__kmap_local_sched_out+0x51/0x108
__schedule+0x71a/0x9c4
preempt_schedule_irq+0xa0/0xe0
common_exception_return+0x5c/0x93
do_wp_page+0x30e/0x330
handle_mm_fault+0xa70/0xc3c
do_page_fault+0x1d8/0x3c4
common_exception+0x7f/0x7f
WARNING: CPU: 0 PID: 101 at mm/highmem.c:664 __kmap_local_sched_in+0x50/0xe0
CPU: 0 PID: 101 Comm: touch Tainted: G W 5.17.0-rc7-00010-gd3a1cdde80d2-dirty #13
Call Trace:
dump_stack+0xc/0x40
__warn+0x8f/0x174
warn_slowpath_fmt+0x48/0xac
__kmap_local_sched_in+0x50/0xe0
finish_task_switch$isra$0+0x1ce/0x2f8
__schedule+0x86e/0x9c4
preempt_schedule_irq+0xa0/0xe0
common_exception_return+0x5c/0x93
do_wp_page+0x30e/0x330
handle_mm_fault+0xa70/0xc3c
do_page_fault+0x1d8/0x3c4
common_exception+0x7f/0x7f
Fix it by replacing !pte_none(pteval) with pte_val(pteval) != 0.
Link: https://lkml.kernel.org/r/20220403235159.3498065-1-jcmvbkbc@gmail.com
Fixes:
5fbda3ecd14a ("sched: highmem: Store local kmaps in task struct")
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Zi Yan [Fri, 8 Apr 2022 20:08:52 +0000 (13:08 -0700)]
mm: migrate: use thp_order instead of HPAGE_PMD_ORDER for new page allocation.
Fix a VM_BUG_ON_FOLIO(folio_nr_pages(old) != nr_pages) crash.
With folios support, it is possible to have other than HPAGE_PMD_ORDER
THPs, in the form of folios, in the system. Use thp_order() to correctly
determine the source page order during migration.
Link: https://lkml.kernel.org/r/20220404165325.1883267-1-zi.yan@sent.com
Link: https://lore.kernel.org/linux-mm/20220404132908.GA785673@u2004/
Fixes:
d68eccad3706 ("mm/filemap: Allow large folios to be added to the page cache")
Reported-by: Naoya Horiguchi <naoya.horiguchi@linux.dev>
Signed-off-by: Zi Yan <ziy@nvidia.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jens Axboe [Fri, 8 Apr 2022 17:08:58 +0000 (11:08 -0600)]
io_uring: fix race between timeout flush and removal
io_flush_timeouts() assumes the timeout isn't in progress of triggering
or being removed/canceled, so it unconditionally removes it from the
timeout list and attempts to cancel it.
Leave it on the list and let the normal timeout cancelation take care
of it.
Cc: stable@vger.kernel.org # 5.5+
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Douglas Miller [Fri, 8 Apr 2022 13:35:23 +0000 (09:35 -0400)]
RDMA/hfi1: Fix use-after-free bug for mm struct
Under certain conditions, such as MPI_Abort, the hfi1 cleanup code may
represent the last reference held on the task mm.
hfi1_mmu_rb_unregister() then drops the last reference and the mm is freed
before the final use in hfi1_release_user_pages(). A new task may
allocate the mm structure while it is still being used, resulting in
problems. One manifestation is corruption of the mmap_sem counter leading
to a hang in down_write(). Another is corruption of an mm struct that is
in use by another task.
Fixes:
3d2a9d642512 ("IB/hfi1: Ensure correct mm is used at all times")
Link: https://lore.kernel.org/r/20220408133523.122165.72975.stgit@awfm-01.cornelisnetworks.com
Cc: <stable@vger.kernel.org>
Signed-off-by: Douglas Miller <doug.miller@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Rafael J. Wysocki [Fri, 8 Apr 2022 17:50:44 +0000 (19:50 +0200)]
Merge branch 'acpi-bus'
* acpi-bus:
ACPI: bus: Eliminate acpi_bus_get_device()
Linus Torvalds [Fri, 8 Apr 2022 17:39:17 +0000 (07:39 -1000)]
Merge tag 'nfs-for-5.18-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client fixes from Trond Myklebust:
"Stable fixes:
- SUNRPC: Ensure we flush any closed sockets before xs_xprt_free()
Bugfixes:
- Fix an Oopsable condition due to SLAB_ACCOUNT setting in the
NFSv4.2 xattr code.
- Fix for open() using an file open mode of '3' in NFSv4
- Replace readdir's use of xxhash() with hash_64()
- Several patches to handle malloc() failure in SUNRPC"
* tag 'nfs-for-5.18-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
SUNRPC: Move the call to xprt_send_pagedata() out of xprt_sock_sendmsg()
SUNRPC: svc_tcp_sendmsg() should handle errors from xdr_alloc_bvec()
SUNRPC: Handle allocation failure in rpc_new_task()
NFS: Ensure rpc_run_task() cannot fail in nfs_async_rename()
NFSv4/pnfs: Handle RPC allocation errors in nfs4_proc_layoutget
SUNRPC: Handle low memory situations in call_status()
SUNRPC: Handle ENOMEM in call_transmit_status()
NFSv4.2: Fix missing removal of SLAB_ACCOUNT on kmem_cache allocation
SUNRPC: Ensure we flush any closed sockets before xs_xprt_free()
NFS: Replace readdir's use of xxhash() with hash_64()
SUNRPC: handle malloc failure in ->request_prepare
NFSv4: fix open failure with O_ACCMODE flag
Revert "NFSv4: Handle the special Linux file open access mode"
Linus Torvalds [Fri, 8 Apr 2022 17:09:17 +0000 (07:09 -1000)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"The two main things to note are:
(1) The bulk of the diffstat is us reverting a horrible bodge we had
in place to ease the merging of maple tree during the merge
window (which turned out not to be needed, but anyway)
(2) The TLB invalidation fix is done in core code, as suggested by
(and Acked-by) Peter.
Summary:
- Revert temporary bodge in MTE coredumping to ease maple tree integration
- Fix stack frame size warning reported with 64k pages
- Fix stop_machine() race with instruction text patching
- Ensure alternatives patching routines are not instrumented
- Enable Spectre-BHB mitigation for Cortex-A78AE
- Fix hugetlb TLB invalidation when contiguous hint is used
- Minor perf driver fixes
- Fix some typos"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
perf/imx_ddr: Fix undefined behavior due to shift overflowing the constant
arm64: Add part number for Arm Cortex-A78AE
arm64: patch_text: Fixup last cpu should be master
tlb: hugetlb: Add more sizes to tlb_remove_huge_tlb_entry
arm64: alternatives: mark patch_alternative() as `noinstr`
perf: MARVELL_CN10K_DDR_PMU should depend on ARCH_THUNDER
perf: qcom_l2_pmu: fix an incorrect NULL check on list iterator
arm64: Fix comments in macro __init_el2_gicv3
arm64: fix typos in comments
arch/arm64: Fix topology initialization for core scheduling
arm64: mte: Fix the stack frame size warning in mte_dump_tag_range()
Revert "arm64: Change elfcore for_each_mte_vma() to use VMA iterator"
Linus Torvalds [Fri, 8 Apr 2022 16:52:50 +0000 (06:52 -1000)]
Merge tag 'folio-5.18e' of git://git.infradead.org/users/willy/pagecache
Pull folio fixes from Matthew Wilcox:
"Fewer bug reports than I was expecting from enabling large folios.
One that doesn't show up on x86 but does on arm64, one that shows up
with hugetlbfs memory failure testing and one that shows up with page
migration, which it turns out I wasn't testing because my last NUMA
machine died. Need to set up a qemu fake NUMA machine so I don't skip
testing that in future.
Summary:
- Remove the migration code's assumptions about large pages being PMD
sized
- Don't call pmd_page() on a non-leaf PMD
- Fix handling of hugetlbfs pages in page_vma_mapped_walk"
* tag 'folio-5.18e' of git://git.infradead.org/users/willy/pagecache:
mm/rmap: Fix handling of hugetlbfs pages in page_vma_mapped_walk
mm/mempolicy: Use vma_alloc_folio() in new_page()
mm: Add vma_alloc_folio()
mm/migrate: Use a folio in migrate_misplaced_transhuge_page()
mm/migrate: Use a folio in alloc_migration_target()
mm/huge_memory: Avoid calling pmd_page() on a non-leaf PMD
Linus Torvalds [Fri, 8 Apr 2022 16:45:38 +0000 (06:45 -1000)]
Merge tag 'spi-fix-v5.18-rc1' of git://git./linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"A small collection of fixes that have arrived since the merge window,
the most noticable one is a fix for unmapping messages when the
mapping was done with the struct device supplied to do the mapping
overridden"
* tag 'spi-fix-v5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: bcm-qspi: fix MSPI only access with bcm_qspi_exec_mem_op()
spi: cadence-quadspi: fix protocol setup for non-1-1-X operations
spi: core: add dma_map_dev for __spi_unmap_msg()
spi: mxic: Fix an error handling path in mxic_spi_probe()
spi: rpc-if: Fix RPM imbalance in probe error path
Linus Torvalds [Fri, 8 Apr 2022 16:42:03 +0000 (06:42 -1000)]
Merge tag 'regulator-fix-v5.18-rc1' of git://git./linux/kernel/git/broonie/regulator
Pull regulator fixes from Mark Brown:
"A few small driver specific fixes for v5.18, plus an update to the
MAINTAINERS file"
* tag 'regulator-fix-v5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
MAINTAINERS: Fix reviewer info for a few ROHM ICs
regulator: atc260x: Fix missing active_discharge_on setting
regulator: rtq2134: Fix missing active_discharge_on setting
regulator: wm8994: Add an off-on delay for WM8994 variant
Linus Torvalds [Fri, 8 Apr 2022 16:37:11 +0000 (06:37 -1000)]
Merge tag 'mmc-v5.18-rc1' of git://git./linux/kernel/git/ulfh/mmc
Pull MMC updates from Ulf Hansson:
"MMC core:
- Improve API to make it clear that mmc_hw_reset() is for cards
- Fixup support for writeback-cache for eMMC and SD
- Check for errors after writes on SPI
MMC host:
- renesas_sdhi: A couple of fixes of TAP settings for eMMC HS400 mode
- mmci_stm32: Fixup check of all elements in sg list
- sdhci-xenon: Revert unnecessary fix for annoying 1.8V regulator warning"
* tag 'mmc-v5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: core: improve API to make clear mmc_hw_reset is for cards
mmc: renesas_sdhi: don't overwrite TAP settings when HS400 tuning is complete
mmc: renesas_sdhi: special 4tap settings only apply to HS400
mmc: core: Fixup support for writeback-cache for eMMC and SD
mmc: block: Check for errors after write on SPI
mmc: mmci: stm32: correctly check all elements of sg list
Revert "mmc: sdhci-xenon: fix annoying 1.8V regulator warning"
Linus Torvalds [Fri, 8 Apr 2022 16:29:25 +0000 (06:29 -1000)]
Merge tag 'iommu-fix-v5.18-rc1' of git://git./linux/kernel/git/joro/iommu
Pull iommu fix from Joerg Roedel:
- Fix boot regression due to a NULL-ptr dereference on OMAP machines
* tag 'iommu-fix-v5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/omap: Fix regression in probe for NULL pointer dereference
Borislav Petkov [Tue, 5 Apr 2022 15:15:15 +0000 (17:15 +0200)]
perf/imx_ddr: Fix undefined behavior due to shift overflowing the constant
Fix:
In file included from <command-line>:0:0:
In function ‘ddr_perf_counter_enable’,
inlined from ‘ddr_perf_irq_handler’ at drivers/perf/fsl_imx8_ddr_perf.c:651:2:
././include/linux/compiler_types.h:352:38: error: call to ‘__compiletime_assert_729’ \
declared with attribute error: FIELD_PREP: mask is not constant
_compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
...
See https://lore.kernel.org/r/YkwQ6%2BtIH8GQpuct@zn.tnic for the gory
details as to why it triggers with older gccs only.
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Frank Li <Frank.li@nxp.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Shawn Guo <shawnguo@kernel.org>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Pengutronix Kernel Team <kernel@pengutronix.de>
Cc: Fabio Estevam <festevam@gmail.com>
Cc: NXP Linux Team <linux-imx@nxp.com>
Cc: linux-arm-kernel@lists.infradead.org
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220405151517.29753-10-bp@alien8.de
Signed-off-by: Will Deacon <will@kernel.org>
Matti Vaittinen [Fri, 8 Apr 2022 08:32:00 +0000 (11:32 +0300)]
MAINTAINERS: Fix reviewer info for a few ROHM ICs
The email backend used by ROHM keeps labeling patches as spam.
Additionally, there have been reports of some emails been completely
dropped. Finally also the email list (or shared inbox)
linux-power@fi.rohmeurope.com inadvertly stopped working and has not
been reviwed during the past few weeks.
Remove no longer working list 'linux-power' list-entry and switch my
email to use the personal gmail account instead of the company account.
Signed-off-by: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com>
Link: https://lore.kernel.org/r/Yk/zAHusOdf4+h06@dc73szyh141qn5ck3nwqy-3.rev.dnainternet.fi
Signed-off-by: Mark Brown <broonie@kernel.org>
Chanho Park [Thu, 7 Apr 2022 09:11:28 +0000 (18:11 +0900)]
arm64: Add part number for Arm Cortex-A78AE
Add the MIDR part number info for the Arm Cortex-A78AE[1] and add it to
spectre-BHB affected list[2].
[1]: https://developer.arm.com/Processors/Cortex-A78AE
[2]: https://developer.arm.com/Arm%20Security%20Center/Spectre-BHB
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: James Morse <james.morse@arm.com>
Signed-off-by: Chanho Park <chanho61.park@samsung.com>
Link: https://lore.kernel.org/r/20220407091128.8700-1-chanho61.park@samsung.com
Signed-off-by: Will Deacon <will@kernel.org>
Guo Ren [Thu, 7 Apr 2022 07:33:20 +0000 (15:33 +0800)]
arm64: patch_text: Fixup last cpu should be master
These patch_text implementations are using stop_machine_cpuslocked
infrastructure with atomic cpu_count. The original idea: When the
master CPU patch_text, the others should wait for it. But current
implementation is using the first CPU as master, which couldn't
guarantee the remaining CPUs are waiting. This patch changes the
last CPU as the master to solve the potential risk.
Fixes:
ae16480785de ("arm64: introduce interfaces to hotpatch kernel and module code")
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220407073323.743224-2-guoren@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
Tony Lindgren [Thu, 31 Mar 2022 06:23:01 +0000 (09:23 +0300)]
iommu/omap: Fix regression in probe for NULL pointer dereference
Commit
3f6634d997db ("iommu: Use right way to retrieve iommu_ops") started
triggering a NULL pointer dereference for some omap variants:
__iommu_probe_device from probe_iommu_group+0x2c/0x38
probe_iommu_group from bus_for_each_dev+0x74/0xbc
bus_for_each_dev from bus_iommu_probe+0x34/0x2e8
bus_iommu_probe from bus_set_iommu+0x80/0xc8
bus_set_iommu from omap_iommu_init+0x88/0xcc
omap_iommu_init from do_one_initcall+0x44/0x24
This is caused by omap iommu probe returning 0 instead of ERR_PTR(-ENODEV)
as noted by Jason Gunthorpe <jgg@ziepe.ca>.
Looks like the regression already happened with an earlier commit
6785eb9105e3 ("iommu/omap: Convert to probe/release_device() call-backs")
that changed the function return type and missed converting one place.
Cc: Drew Fustini <dfustini@baylibre.com>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Suman Anna <s-anna@ti.com>
Suggested-by: Jason Gunthorpe <jgg@ziepe.ca>
Fixes:
6785eb9105e3 ("iommu/omap: Convert to probe/release_device() call-backs")
Fixes:
3f6634d997db ("iommu: Use right way to retrieve iommu_ops")
Signed-off-by: Tony Lindgren <tony@atomide.com>
Tested-by: Drew Fustini <dfustini@baylibre.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20220331062301.24269-1-tony@atomide.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Wolfram Sang [Fri, 8 Apr 2022 08:00:42 +0000 (10:00 +0200)]
mmc: core: improve API to make clear mmc_hw_reset is for cards
To make it unambiguous that mmc_hw_reset() is for cards and not for
controllers, we make the function argument mmc_card instead of mmc_host.
Also, all users are converted.
Suggested-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Acked-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20220408080045.6497-2-wsa+renesas@sang-engineering.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Linus Torvalds [Fri, 8 Apr 2022 05:27:39 +0000 (19:27 -1000)]
Merge tag 'drm-fixes-2022-04-08' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Main set of fixes for rc2, mostly amdgpu, but some dma-fence fixups as
well, along with some other misc ones.
dma-fence:
- fix warning about fence containers
- fix logic error in new fence merge code
- handle empty dma_fence_arrays gracefully
bridge:
- Try all possible cases for bridge/panel detection.
bindings:
- Don't require input port for MIPI-DSI, and make width/height mandatory.
fbdev:
- Fix unregistering of framebuffers without device.
nouveau:
- Fix a crash when booting with nouveau on tegra.
amdgpu:
- GFX 10.3.7 fixes
- noretry updates
- VCN fixes
- TMDS fix
- zstate fix for freesync video
- DCN 3.1.5 fix
- Display stack size fix
- Audio fix
- DCN 3.1 pstate fix
- TMZ VCN fix
- APU passthrough fix
- Misc other fixes
- VCN 3.0 fixes
- Misc display fixes
- GC 10.3 golden register fix
- Suspend fix
- SMU 10 fix
amdkfd:
- Error handling fix
- xgmi p2p fix
- HWS VMIDs fix
- Event fix
panel:
- ili9341: Fix optional regulator handling
imx:
- Catch an EDID allocation failure in imx-ldb
- fix a leaked drm display mode on DT parsing error in parallel-display
- properly remove the dw_hdmi bridge in case the component_add fails in dw_hdmi-imx
- fix the IPU clock frequency debug printout in ipu-di"
* tag 'drm-fixes-2022-04-08' of git://anongit.freedesktop.org/drm/drm: (61 commits)
dt-bindings: display: panel: mipi-dbi-spi: Make width-mm/height-mm mandatory
fbdev: Fix unregistering of framebuffers without device
drm/amdgpu/smu10: fix SoC/fclk units in auto mode
drm/amd/display: update dcn315 clock table read
drm/amdgpu/display: change pipe policy for DCN 2.1
drm/amd/display: Add configuration options for AUX wake work around.
drm/amd/display: remove assert for odm transition case
drm/amdgpu: don't use BACO for reset in S3
drm/amd/display: Fix by adding FPU protection for dcn30_internal_validate_bw
drm/amdkfd: Create file descriptor after client is added to smi_clients list
drm/amdgpu: Sync up header and implementation to use the same parameter names
drm/amdgpu: fix incorrect GCR_GENERAL_CNTL address
amd/display: set backlight only if required
drm/amd/display: Fix allocate_mst_payload assert on resume
drm/amd/display: Revert FEC check in validation
drm/amd/display: Add work around for AUX failure on wake.
drm/amd/display: Clear optc false state when disable otg
drm/amd/display: Enable power gating before init_pipes
drm/amd/display: Remove redundant dsc power gating from init_hw
drm/amd/display: Correct Slice reset calculation
...
Linus Torvalds [Fri, 8 Apr 2022 05:16:49 +0000 (19:16 -1000)]
Merge tag '5.18-rc1-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs client fixes from Steve French:
- reconnect fixes: one for DFS and one to avoid a reconnect race
- small change to deal with upcoming behavior change of list iterators
* tag '5.18-rc1-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: update internal module number
cifs: force new session setup and tcon for dfs
cifs: remove check of list iterator against head past the loop body
cifs: fix potential race with cifsd thread
Linus Torvalds [Fri, 8 Apr 2022 05:01:47 +0000 (19:01 -1000)]
Merge tag 'net-5.18-rc2' of git://git./linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from bpf and netfilter.
Current release - new code bugs:
- mctp: correct mctp_i2c_header_create result
- eth: fungible: fix reference to __udivdi3 on 32b builds
- eth: micrel: remove latencies support lan8814
Previous releases - regressions:
- bpf: resolve to prog->aux->dst_prog->type only for BPF_PROG_TYPE_EXT
- vrf: fix packet sniffing for traffic originating from ip tunnels
- rxrpc: fix a race in rxrpc_exit_net()
- dsa: revert "net: dsa: stop updating master MTU from master.c"
- eth: ice: fix MAC address setting
Previous releases - always broken:
- tls: fix slab-out-of-bounds bug in decrypt_internal
- bpf: support dual-stack sockets in bpf_tcp_check_syncookie
- xdp: fix coalescing for page_pool fragment recycling
- ovs: fix leak of nested actions
- eth: sfc:
- add missing xdp queue reinitialization
- fix using uninitialized xdp tx_queue
- eth: ice:
- clear default forwarding VSI during VSI release
- fix broken IFF_ALLMULTI handling
- synchronize_rcu() when terminating rings
- eth: qede: confirm skb is allocated before using
- eth: aqc111: fix out-of-bounds accesses in RX fixup
- eth: slip: fix NPD bug in sl_tx_timeout()"
* tag 'net-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (61 commits)
drivers: net: slip: fix NPD bug in sl_tx_timeout()
bpf: Adjust bpf_tcp_check_syncookie selftest to test dual-stack sockets
bpf: Support dual-stack sockets in bpf_tcp_check_syncookie
myri10ge: fix an incorrect free for skb in myri10ge_sw_tso
net: usb: aqc111: Fix out-of-bounds accesses in RX fixup
qede: confirm skb is allocated before using
net: ipv6mr: fix unused variable warning with CONFIG_IPV6_PIMSM_V2=n
net: phy: mscc-miim: reject clause 45 register accesses
net: axiemac: use a phandle to reference pcs_phy
dt-bindings: net: add pcs-handle attribute
net: axienet: factor out phy_node in struct axienet_local
net: axienet: setup mdio unconditionally
net: sfc: fix using uninitialized xdp tx_queue
rxrpc: fix a race in rxrpc_exit_net()
net: openvswitch: fix leak of nested actions
net: ethernet: mv643xx: Fix over zealous checking of_get_mac_address()
net: openvswitch: don't send internal clone attribute to the userspace.
net: micrel: Fix KS8851 Kconfig
ice: clear cmd_type_offset_bsz for TX rings
ice: xsk: fix VSI state check in ice_xsk_wakeup()
...
Dave Airlie [Thu, 7 Apr 2022 23:22:07 +0000 (09:22 +1000)]
Merge tag 'drm-misc-fixes-2022-04-07' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
drm-misc-fixes for v5.18-rc2:
- Fix a crash when booting with nouveau on tegra.
- Don't require input port for MIPI-DSI, and make width/height mandatory.
- Fix unregistering of framebuffers without device.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/58fa2262-3eb6-876d-7157-ab7a135696b7@linux.intel.com
Dave Airlie [Thu, 7 Apr 2022 23:13:33 +0000 (09:13 +1000)]
Merge tag 'drm-misc-next-fixes-2022-04-07' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
drm-misc-next-fixes for v5.18-rc2:
- fix warning about fence containers
- fix logic error in new fence merge code
- handle empty dma_fence_arrays gracefully
- Try all possible cases for bridge/panel detection.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/3b8e6439-612e-f640-e380-51e834393e94@linux.intel.com
Trond Myklebust [Thu, 7 Apr 2022 02:51:58 +0000 (22:51 -0400)]
SUNRPC: Move the call to xprt_send_pagedata() out of xprt_sock_sendmsg()
The client and server have different requirements for their memory
allocation, so move the allocation of the send buffer out of the socket
send code that is common to both.
Reported-by: NeilBrown <neilb@suse.de>
Fixes:
b2648015d452 ("SUNRPC: Make the rpciod and xprtiod slab allocation modes consistent")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Trond Myklebust [Thu, 7 Apr 2022 18:10:23 +0000 (14:10 -0400)]
SUNRPC: svc_tcp_sendmsg() should handle errors from xdr_alloc_bvec()
The allocation is done with GFP_KERNEL, but it could still fail in a low
memory situation.
Fixes:
4a85a6a3320b ("SUNRPC: Handle TCP socket sends with kernel_sendpage() again")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Trond Myklebust [Thu, 7 Apr 2022 02:36:19 +0000 (22:36 -0400)]
SUNRPC: Handle allocation failure in rpc_new_task()
If the call to rpc_alloc_task() fails, then ensure that the calldata is
released, and that rpc_run_task() and rpc_run_bc_task() bail out early.
Reported-by: NeilBrown <neilb@suse.de>
Fixes:
910ad38697d9 ("NFS: Fix memory allocation in rpc_alloc_task()")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Trond Myklebust [Thu, 7 Apr 2022 02:34:35 +0000 (22:34 -0400)]
NFS: Ensure rpc_run_task() cannot fail in nfs_async_rename()
Ensure the call to rpc_run_task() cannot fail by preallocating the
rpc_task.
Fixes:
910ad38697d9 ("NFS: Fix memory allocation in rpc_alloc_task()")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Trond Myklebust [Thu, 7 Apr 2022 02:33:19 +0000 (22:33 -0400)]
NFSv4/pnfs: Handle RPC allocation errors in nfs4_proc_layoutget
If rpc_run_task() fails due to an allocation error, then bail out early.
Fixes:
910ad38697d9 ("NFS: Fix memory allocation in rpc_alloc_task()")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Trond Myklebust [Thu, 7 Apr 2022 13:50:19 +0000 (09:50 -0400)]
SUNRPC: Handle low memory situations in call_status()
We need to handle ENFILE, ENOBUFS, and ENOMEM, because
xprt_wake_pending_tasks() can be called with any one of these due to
socket creation failures.
Fixes:
b61d59fffd3e ("SUNRPC: xs_tcp_connect_worker{4,6}: merge common code")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Trond Myklebust [Thu, 7 Apr 2022 03:18:57 +0000 (23:18 -0400)]
SUNRPC: Handle ENOMEM in call_transmit_status()
Both call_transmit() and call_bc_transmit() can now return ENOMEM, so
let's make sure that we handle the errors gracefully.
Fixes:
0472e4766049 ("SUNRPC: Convert socket page send code to use iov_iter()")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Muchun Song [Fri, 1 Apr 2022 02:59:05 +0000 (10:59 +0800)]
NFSv4.2: Fix missing removal of SLAB_ACCOUNT on kmem_cache allocation
The commit
5c60e89e71f8 ("NFSv4.2: Fix up an invalid combination of memory
allocation flags") has stripped GFP_KERNEL_ACCOUNT down to GFP_KERNEL,
however, it forgot to remove SLAB_ACCOUNT from kmem_cache allocation.
It means that memory is still limited by kmemcg. This patch also fix a
NULL pointer reference issue [1] reported by NeilBrown.
Link: https://lore.kernel.org/all/164870069595.25542.17292003658915487357@noble.neil.brown.name/
Fixes:
5c60e89e71f8 ("NFSv4.2: Fix up an invalid combination of memory allocation flags")
Fixes:
5abc1e37afa0 ("mm: list_lru: allocate list_lru_one only when needed")
Reported-by: NeilBrown <neilb@suse.de>
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Trond Myklebust [Sun, 3 Apr 2022 19:58:11 +0000 (15:58 -0400)]
SUNRPC: Ensure we flush any closed sockets before xs_xprt_free()
We must ensure that all sockets are closed before we call xprt_free()
and release the reference to the net namespace. The problem is that
calling fput() will defer closing the socket until delayed_fput() gets
called.
Let's fix the situation by allowing rpciod and the transport teardown
code (which runs on the system wq) to call __fput_sync(), and directly
close the socket.
Reported-by: Felix Fu <foyjog@gmail.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Fixes:
a73881c96d73 ("SUNRPC: Fix an Oops in udp_poll()")
Cc: stable@vger.kernel.org # 5.1.x: 3be232f11a3c: SUNRPC: Prevent immediate close+reconnect
Cc: stable@vger.kernel.org # 5.1.x: 89f42494f92f: SUNRPC: Don't call connect() more than once on a TCP socket
Cc: stable@vger.kernel.org # 5.1.x
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Trond Myklebust [Thu, 31 Mar 2022 00:00:07 +0000 (20:00 -0400)]
NFS: Replace readdir's use of xxhash() with hash_64()
Both xxhash() and hash_64() appear to give similarly low collision
rates with a standard linearly increasing readdir offset. They both give
similarly higher collision rates when applied to ext4's offsets.
So switch to using the standard hash_64().
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Pavel Begunkov [Thu, 7 Apr 2022 13:05:05 +0000 (14:05 +0100)]
io_uring: use nospec annotation for more indexes
There are still several places that using pre array_index_nospec()
indexes, fix them up.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/b01ef5ee83f72ed35ad525912370b729f5d145f4.1649336342.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Thu, 7 Apr 2022 13:05:04 +0000 (14:05 +0100)]
io_uring: zero tag on rsrc removal
Automatically default rsrc tag in io_queue_rsrc_removal(), it's safer
than leaving it there and relying on the rest of the code to behave and
not use it.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/1cf262a50df17478ea25b22494dcc19f3a80301f.1649336342.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Wed, 6 Apr 2022 11:43:58 +0000 (12:43 +0100)]
io_uring: don't touch scm_fp_list after queueing skb
It's safer to not touch scm_fp_list after we queued an skb to which it
was assigned, there might be races lurking if we screw subtle sync
guarantees on the io_uring side.
Fixes:
6b06314c47e14 ("io_uring: add file set registration")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Wed, 6 Apr 2022 11:43:57 +0000 (12:43 +0100)]
io_uring: nospec index for tags on files update
Don't forget to array_index_nospec() for indexes before updating rsrc
tags in __io_sqe_files_update(), just use already safe and precalculated
index @i.
Fixes:
c3bdad0271834 ("io_uring: add generic rsrc update with tags")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Eugene Syromiatnikov [Wed, 6 Apr 2022 11:55:33 +0000 (13:55 +0200)]
io_uring: implement compat handling for IORING_REGISTER_IOWQ_AFF
Similarly to the way it is done im mbind syscall.
Cc: stable@vger.kernel.org # 5.14
Fixes:
fe76421d1da1dcdb ("io_uring: allow user configurable IO thread CPU affinity")
Signed-off-by: Eugene Syromiatnikov <esyr@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Tue, 5 Apr 2022 16:31:43 +0000 (10:31 -0600)]
Revert "io_uring: Add support for napi_busy_poll"
This reverts commit
adc8682ec69012b68d5ab7123e246d2ad9a6f94b.
There's some discussion on the API not being as good as it can be.
Rather than ship something and be stuck with it forever, let's revert
the NAPI support for now and work on getting something sorted out
for the next kernel release instead.
Link: https://lore.kernel.org/io-uring/b7bbc124-8502-0ee9-d4c8-7c41b4487264@kernel.dk/
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Thu, 31 Mar 2022 18:38:46 +0000 (12:38 -0600)]
io_uring: drop the old style inflight file tracking
io_uring tracks requests that are referencing an io_uring descriptor to
be able to cancel without worrying about loops in the references. Since
we now assign the file at execution time, the easier approach is to drop
a potentially problematic reference before we punt the request. This
eliminates the need to special case these types of files beyond just
marking them as such, and simplifies cancelation quite a bit.
This also fixes a recent issue where an async punted tee operation would
with the io_uring descriptor as the output file would crash when
attempting to get a reference to the file from the io-wq worker. We
could have worked around that, but this is the much cleaner fix.
Fixes:
6bf9c47a3989 ("io_uring: defer file assignment")
Reported-by: syzbot+c4b9303500a21750b250@syzkaller.appspotmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Tue, 29 Mar 2022 16:10:08 +0000 (10:10 -0600)]
io_uring: defer file assignment
If an application uses direct open or accept, it knows in advance what
direct descriptor value it will get as it picks it itself. This allows
combined requests such as:
sqe = io_uring_get_sqe(ring);
io_uring_prep_openat_direct(sqe, ..., file_slot);
sqe->flags |= IOSQE_IO_LINK | IOSQE_CQE_SKIP_SUCCESS;
sqe = io_uring_get_sqe(ring);
io_uring_prep_read(sqe,file_slot, buf, buf_size, 0);
sqe->flags |= IOSQE_FIXED_FILE;
io_uring_submit(ring);
where we prepare both a file open and read, and only get a completion
event for the read when both have completed successfully.
Currently links are fully prepared before the head is issued, but that
fails if the dependent link needs a file assigned that isn't valid until
the head has completed.
Conversely, if the same chain is performed but the fixed file slot is
already valid, then we would be unexpectedly returning data from the
old file slot rather than the newly opened one. Make sure we're
consistent here.
Allow deferral of file setup, which makes this documented case work.
Cc: stable@vger.kernel.org # v5.15+
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Mon, 4 Apr 2022 23:18:43 +0000 (17:18 -0600)]
io_uring: propagate issue_flags state down to file assignment
We'll need this in a future patch, when we could be assigning the file
after the prep stage. While at it, get rid of the io_file_get() helper,
it just makes the code harder to read.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Linus Torvalds [Thu, 7 Apr 2022 16:35:34 +0000 (06:35 -1000)]
Merge tag 'hyperv-fixes-signed-
20220407' of git://git./linux/kernel/git/hyperv/linux
Pull hyperv fixes from Wei Liu:
- Correctly propagate coherence information for VMbus devices (Michael
Kelley)
- Disable balloon and memory hot-add on ARM64 temporarily (Boqun Feng)
- Use barrier to prevent reording when reading ring buffer (Michael
Kelley)
- Use virt_store_mb in favour of smp_store_mb (Andrea Parri)
- Fix VMbus device object initialization (Andrea Parri)
- Deactivate sysctl_record_panic_msg on isolated guest (Andrea Parri)
- Fix a crash when unloading VMbus module (Guilherme G. Piccoli)
* tag 'hyperv-fixes-signed-
20220407' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
Drivers: hv: vmbus: Replace smp_store_mb() with virt_store_mb()
Drivers: hv: balloon: Disable balloon and hot-add accordingly
Drivers: hv: balloon: Support status report for larger page sizes
Drivers: hv: vmbus: Prevent load re-ordering when reading ring buffer
PCI: hv: Propagate coherence from VMbus device to PCI device
Drivers: hv: vmbus: Propagate VMbus coherence to each VMbus device
Drivers: hv: vmbus: Fix potential crash on module unload
Drivers: hv: vmbus: Fix initialization of device object in vmbus_device_register()
Drivers: hv: vmbus: Deactivate sysctl_record_panic_msg by default in isolated guests
Linus Torvalds [Thu, 7 Apr 2022 16:02:55 +0000 (06:02 -1000)]
Merge tag 'random-5.18-rc2-for-linus' of git://git./linux/kernel/git/crng/random
Pull random number generator fixes from Jason Donenfeld:
- Another fixup to the fast_init/crng_init split, this time in how much
entropy is being credited, from Jan Varho.
- As discussed, we now opportunistically call try_to_generate_entropy()
in /dev/urandom reads, as a replacement for the reverted commit. I
opted to not do the more invasive wait_for_random_bytes() change at
least for now, preferring to do something smaller and more obvious
for the time being, but maybe that can be revisited as things evolve
later.
- Userspace can use FUSE or userfaultfd or simply move a process to
idle priority in order to make a read from the random device never
complete, which breaks forward secrecy, fixed by overwriting
sensitive bytes early on in the function.
- Jann Horn noticed that /dev/urandom reads were only checking for
pending signals if need_resched() was true, a bug going back to the
genesis commit, now fixed by always checking for signal_pending() and
calling cond_resched(). This explains various noticeable signal
delivery delays I've seen in programs over the years that do long
reads from /dev/urandom.
- In order to be more like other devices (e.g. /dev/zero) and to
mitigate the impact of fixing the above bug, which has been around
forever (users have never really needed to check the return value of
read() for medium-sized reads and so perhaps many didn't), we now
move signal checking to the bottom part of the loop, and do so every
PAGE_SIZE-bytes.
* tag 'random-5.18-rc2-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
random: check for signals every PAGE_SIZE chunk of /dev/[u]random
random: check for signal_pending() outside of need_resched() check
random: do not allow user to keep crng key around on stack
random: opportunistically initialize on /dev/urandom reads
random: do not split fast init input in add_hwgenerator_randomness()
Linus Torvalds [Thu, 7 Apr 2022 15:56:54 +0000 (05:56 -1000)]
Merge tag 'ata-5.18-rc2' of git://git./linux/kernel/git/dlemoal/libata
Pull ata fixes from Damien Le Moal:
- Fix a compilation warning due to an uninitialized variable in
ata_sff_lost_interrupt(), from me.
- Fix invalid internal command tag handling in the sata_dwc_460ex
driver, from Christian.
- Disable READ LOG DMA EXT with Samsung 840 EVO SSDs as this command
causes the drives to hang, from Christian.
- Change the config option CONFIG_SATA_LPM_POLICY back to its original
name CONFIG_SATA_LPM_MOBILE_POLICY to avoid potential problems with
users losing their configuration (as discussed during the merge
window), from Mario.
* tag 'ata-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
ata: ahci: Rename CONFIG_SATA_LPM_POLICY configuration item back
ata: libata-core: Disable READ LOG DMA EXT for Samsung 840 EVOs
ata: sata_dwc_460ex: Fix crash due to OOB write
ata: libata-sff: Fix compilation warning in ata_sff_lost_interrupt()
zhenwei pi [Thu, 7 Apr 2022 06:40:08 +0000 (14:40 +0800)]
mm/rmap: Fix handling of hugetlbfs pages in page_vma_mapped_walk
page_mapped_in_vma() sets nr_pages to 1, which is usually correct as we
only want to know about the precise page and not about other pages in
the folio. However, hugetlbfs does want to know about the entire hpage,
and using nr_pages to get the size of the hpage is wrong. We could
change page_mapped_in_vma() to special-case hugetlbfs pages, but it's
better to ignore nr_pages in page_vma_mapped_walk() and get the size
from the VMA instead.
Fixes:
2aff7a4755bed ("mm: Convert page_vma_mapped_walk to work on PFNs")
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
[edit commit message, use hstate directly]
Matthew Wilcox (Oracle) [Mon, 4 Apr 2022 19:23:39 +0000 (15:23 -0400)]
mm/mempolicy: Use vma_alloc_folio() in new_page()
Simplify new_page() by unifying the THP and base page cases, and
handle orders other than 0 and HPAGE_PMD_ORDER correctly.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
Matthew Wilcox (Oracle) [Mon, 4 Apr 2022 19:11:04 +0000 (15:11 -0400)]
mm: Add vma_alloc_folio()
This wrapper around alloc_pages_vma() calls prep_transhuge_page(),
removing the obligation from the caller. This is in the same spirit
as __folio_alloc().
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
Matthew Wilcox (Oracle) [Tue, 6 Jul 2021 14:50:39 +0000 (10:50 -0400)]
mm/migrate: Use a folio in migrate_misplaced_transhuge_page()
Unify alloc_misplaced_dst_page() and alloc_misplaced_dst_page_thp().
Removes an assumption that compound pages are HPAGE_PMD_ORDER.
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
Matthew Wilcox (Oracle) [Mon, 4 Apr 2022 18:35:04 +0000 (14:35 -0400)]
mm/migrate: Use a folio in alloc_migration_target()
This removes an assumption that a large folio is HPAGE_PMD_ORDER
as well as letting us remove the call to prep_transhuge_page()
and a few hidden calls to compound_head().
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
Matthew Wilcox (Oracle) [Wed, 6 Apr 2022 12:41:39 +0000 (08:41 -0400)]
mm/huge_memory: Avoid calling pmd_page() on a non-leaf PMD
Calling try_to_unmap() with TTU_SPLIT_HUGE_PMD and a folio that's not
mapped by a PMD causes oopses on arm64 because we now call page_folio()
on an invalid page. pmd_page() returns a valid page for non-leaf PMDs on
some architectures, so this bug escaped testing before now. Fix this bug
by delaying the call to pmd_page() until after we know the PMD is a leaf.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215804
Fixes:
af28a988b313 ("mm/huge_memory: Convert __split_huge_pmd() to take a folio")
Reported-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Tested-by: Zorro Lang <zlang@redhat.com>
Wolfram Sang [Mon, 4 Apr 2022 11:49:02 +0000 (13:49 +0200)]
mmc: renesas_sdhi: don't overwrite TAP settings when HS400 tuning is complete
When HS400 tuning is complete and HS400 is going to be activated, we
have to keep the current number of TAPs and should not overwrite them
with a hardcoded value. This was probably a copy&paste mistake when
upporting HS400 support from the BSP.
Fixes:
26eb2607fa28 ("mmc: renesas_sdhi: add eMMC HS400 mode support")
Reported-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220404114902.12175-1-wsa+renesas@sang-engineering.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Duoming Zhou [Tue, 5 Apr 2022 13:22:06 +0000 (21:22 +0800)]
drivers: net: slip: fix NPD bug in sl_tx_timeout()
When a slip driver is detaching, the slip_close() will act to
cleanup necessary resources and sl->tty is set to NULL in
slip_close(). Meanwhile, the packet we transmit is blocked,
sl_tx_timeout() will be called. Although slip_close() and
sl_tx_timeout() use sl->lock to synchronize, we don`t judge
whether sl->tty equals to NULL in sl_tx_timeout() and the
null pointer dereference bug will happen.
(Thread 1) | (Thread 2)
| slip_close()
| spin_lock_bh(&sl->lock)
| ...
... | sl->tty = NULL //(1)
sl_tx_timeout() | spin_unlock_bh(&sl->lock)
spin_lock(&sl->lock); |
... | ...
tty_chars_in_buffer(sl->tty)|
if (tty->ops->..) //(2) |
... | synchronize_rcu()
We set NULL to sl->tty in position (1) and dereference sl->tty
in position (2).
This patch adds check in sl_tx_timeout(). If sl->tty equals to
NULL, sl_tx_timeout() will goto out.
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Reviewed-by: Jiri Slaby <jirislaby@kernel.org>
Link: https://lore.kernel.org/r/20220405132206.55291-1-duoming@zju.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 7 Apr 2022 04:58:49 +0000 (21:58 -0700)]
Merge https://git./linux/kernel/git/bpf/bpf
Alexei Starovoitov says:
====================
pull-request: bpf 2022-04-06
We've added 8 non-merge commits during the last 8 day(s) which contain
a total of 9 files changed, 139 insertions(+), 36 deletions(-).
The main changes are:
1) rethook related fixes, from Jiri and Masami.
2) Fix the case when tracing bpf prog is attached to struct_ops, from Martin.
3) Support dual-stack sockets in bpf_tcp_check_syncookie, from Maxim.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
bpf: Adjust bpf_tcp_check_syncookie selftest to test dual-stack sockets
bpf: Support dual-stack sockets in bpf_tcp_check_syncookie
bpf: selftests: Test fentry tracing a struct_ops program
bpf: Resolve to prog->aux->dst_prog->type only for BPF_PROG_TYPE_EXT
rethook: Fix to use WRITE_ONCE() for rethook:: Handler
selftests/bpf: Fix warning comparing pointer to 0
bpf: Fix sparse warnings in kprobe_multi_resolve_syms
bpftool: Explicit errno handling in skeletons
====================
Link: https://lore.kernel.org/r/20220407031245.73026-1-alexei.starovoitov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Marek Vasut [Mon, 4 Apr 2022 19:21:05 +0000 (21:21 +0200)]
dt-bindings: display: panel: mipi-dbi-spi: Make width-mm/height-mm mandatory
Make the width-mm/height-mm panel properties mandatory
to correctly report the panel dimensions to the OS.
Fixes:
2f3468b82db97 ("dt-bindings: display: add bindings for MIPI DBI compatible SPI panels")
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Christoph Niedermaier <cniedermaier@dh-electronics.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Dmitry Osipenko <digetx@gmail.com>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Noralf Trønnes <noralf@tronnes.org>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Robert Foss <robert.foss@linaro.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: devicetree@vger.kernel.org
To: dri-devel@lists.freedesktop.org
Acked-by: Noralf Trønnes <noralf@tronnes.org>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Rob Herring <robh@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20220404192105.12547-1-marex@denx.de
Dave Airlie [Thu, 7 Apr 2022 00:23:22 +0000 (10:23 +1000)]
Merge tag 'amd-drm-fixes-5.18-2022-04-06' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-5.18-2022-04-06:
amdgpu:
- VCN 3.0 fixes
- DCN 3.1.5 fix
- Misc display fixes
- GC 10.3 golden register fix
- Suspend fix
- SMU 10 fix
amdkfd:
- Event fix
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220406170441.5779-1-alexander.deucher@amd.com
Dave Airlie [Thu, 7 Apr 2022 00:23:03 +0000 (10:23 +1000)]
Merge tag 'imx-drm-fixes-2022-04-06' of git://git.pengutronix.de/pza/linux into drm-fixes
drm/imx: error handling and debug output fixes
Catch an EDID allocation failure in imx-ldb, fix a leaked drm display
mode on DT parsing error in parallel-display, properly remove the
dw_hdmi bridge in case the component_add fails in dw_hdmi-imx, and
fix the IPU clock frequency debug printout in ipu-di.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Philipp Zabel <p.zabel@pengutronix.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20220406155101.1271845-1-p.zabel@pengutronix.de
Dave Airlie [Thu, 7 Apr 2022 00:22:31 +0000 (10:22 +1000)]
Merge tag 'drm-misc-fixes-2022-03-24' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
* drm/panel/ili9341: Fix optional regulator handling
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/YjwkvPp6UnePy4Q8@linux-uq9g.fritz.box
Dave Airlie [Thu, 7 Apr 2022 00:21:46 +0000 (10:21 +1000)]
Merge tag 'amd-drm-next-5.18-2022-03-25' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-next-5.18-2022-03-25:
amdgpu:
- GFX 10.3.7 fixes
- noretry updates
- VCN fixes
- TMDS fix
- zstate fix for freesync video
- DCN 3.1.5 fix
- Display stack size fix
- Audio fix
- DCN 3.1 pstate fix
- TMZ VCN fix
- APU passthrough fix
- Misc other fixes
amdkfd:
- Error handling fix
- xgmi p2p fix
- HWS VMIDs fix
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220325183602.5718-1-alexander.deucher@amd.com
Jason A. Donenfeld [Wed, 6 Apr 2022 00:36:16 +0000 (02:36 +0200)]
random: check for signals every PAGE_SIZE chunk of /dev/[u]random
In
1448769c9cdb ("random: check for signal_pending() outside of
need_resched() check"), Jann pointed out that we previously were only
checking the TIF_NOTIFY_SIGNAL and TIF_SIGPENDING flags if the process
had TIF_NEED_RESCHED set, which meant in practice, super long reads to
/dev/[u]random would delay signal handling by a long time. I tried this
using the below program, and indeed I wasn't able to interrupt a
/dev/urandom read until after several megabytes had been read. The bug
he fixed has always been there, and so code that reads from /dev/urandom
without checking the return value of read() has mostly worked for a long
time, for most sizes, not just for <= 256.
Maybe it makes sense to keep that code working. The reason it was so
small prior, ignoring the fact that it didn't work anyway, was likely
because /dev/random used to block, and that could happen for pretty
large lengths of time while entropy was gathered. But now, it's just a
chacha20 call, which is extremely fast and is just operating on pure
data, without having to wait for some external event. In that sense,
/dev/[u]random is a lot more like /dev/zero.
Taking a page out of /dev/zero's read_zero() function, it always returns
at least one chunk, and then checks for signals after each chunk. Chunk
sizes there are of length PAGE_SIZE. Let's just copy the same thing for
/dev/[u]random, and check for signals and cond_resched() for every
PAGE_SIZE amount of data. This makes the behavior more consistent with
expectations, and should mitigate the impact of Jann's fix for the
age-old signal check bug.
---- test program ----
#include <unistd.h>
#include <signal.h>
#include <stdio.h>
#include <sys/random.h>
static unsigned char x[~0U];
static void handle(int) { }
int main(int argc, char *argv[])
{
pid_t pid = getpid(), child;
signal(SIGUSR1, handle);
if (!(child = fork())) {
for (;;)
kill(pid, SIGUSR1);
}
pause();
printf("interrupted after reading %zd bytes\n", getrandom(x, sizeof(x), 0));
kill(child, SIGTERM);
return 0;
}
Cc: Jann Horn <jannh@google.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Thomas Zimmermann [Mon, 4 Apr 2022 19:44:02 +0000 (21:44 +0200)]
fbdev: Fix unregistering of framebuffers without device
OF framebuffers do not have an underlying device in the Linux
device hierarchy. Do a regular unregister call instead of hot
unplugging such a non-existing device. Fixes a NULL dereference.
An example error message on ppc64le is shown below.
BUG: Kernel NULL pointer dereference on read at 0x00000060
Faulting instruction address: 0xc00000000080dfa4
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
[...]
CPU: 2 PID: 139 Comm: systemd-udevd Not tainted 5.17.0-
ae085d7f9365 #1
NIP:
c00000000080dfa4 LR:
c00000000080df9c CTR:
c000000000797430
REGS:
c000000004132fe0 TRAP: 0300 Not tainted (5.17.0-
ae085d7f9365)
MSR:
8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE> CR:
28228282 XER:
20000000
CFAR:
c00000000000c80c DAR:
0000000000000060 DSISR:
40000000 IRQMASK: 0
GPR00:
c00000000080df9c c000000004133280 c00000000169d200 0000000000000029
GPR04:
00000000ffffefff c000000004132f90 c000000004132f88 0000000000000000
GPR08:
c0000000015658f8 c0000000015cd200 c0000000014f57d0 0000000048228283
GPR12:
0000000000000000 c00000003fffe300 0000000020000000 0000000000000000
GPR16:
0000000000000000 0000000113fc4a40 0000000000000005 0000000113fcfb80
GPR20:
000001000f7283b0 0000000000000000 c000000000e4a588 c000000000e4a5b0
GPR24:
0000000000000001 00000000000a0000 c008000000db0168 c0000000021f6ec0
GPR28:
c0000000016d65a8 c000000004b36460 0000000000000000 c0000000016d64b0
NIP [
c00000000080dfa4] do_remove_conflicting_framebuffers+0x184/0x1d0
[
c000000004133280] [
c00000000080df9c] do_remove_conflicting_framebuffers+0x17c/0x1d0 (unreliable)
[
c000000004133350] [
c00000000080e4d0] remove_conflicting_framebuffers+0x60/0x150
[
c0000000041333a0] [
c00000000080e6f4] remove_conflicting_pci_framebuffers+0x134/0x1b0
[
c000000004133450] [
c008000000e70438] drm_aperture_remove_conflicting_pci_framebuffers+0x90/0x100 [drm]
[
c000000004133490] [
c008000000da0ce4] bochs_pci_probe+0x6c/0xa64 [bochs]
[...]
[
c000000004133db0] [
c00000000002aaa0] system_call_exception+0x170/0x2d0
[
c000000004133e10] [
c00000000000c3cc] system_call_common+0xec/0x250
The bug [1] was introduced by commit
27599aacbaef ("fbdev: Hot-unplug
firmware fb devices on forced removal"). Most firmware framebuffers
have an underlying platform device, which can be hot-unplugged
before loading the native graphics driver. OF framebuffers do not
(yet) have that device. Fix the code by unregistering the framebuffer
as before without a hot unplug.
Tested with 5.17 on qemu ppc64le emulation.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Fixes:
27599aacbaef ("fbdev: Hot-unplug firmware fb devices on forced removal")
Reported-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Tested-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
Cc: Zack Rusin <zackr@vmware.com>
Cc: Javier Martinez Canillas <javierm@redhat.com>
Cc: Hans de Goede <hdegoede@redhat.com>
Cc: stable@vger.kernel.org # v5.11+
Cc: Helge Deller <deller@gmx.de>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Zheyu Ma <zheyuma97@gmail.com>
Cc: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: linux-fbdev@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Link: https://lore.kernel.org/all/YkHXO6LGHAN0p1pq@debian/
Link: https://patchwork.freedesktop.org/patch/msgid/20220404194402.29974-1-tzimmermann@suse.de
Christoph Böhmwalder [Wed, 6 Apr 2022 19:04:45 +0000 (21:04 +0200)]
drbd: set QUEUE_FLAG_STABLE_WRITES
We want our pages not to change while they are being written.
Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Xiaomeng Tong [Wed, 6 Apr 2022 19:04:44 +0000 (21:04 +0200)]
drbd: fix an invalid memory access caused by incorrect use of list iterator
The bug is here:
idr_remove(&connection->peer_devices, vnr);
If the previous for_each_connection() don't exit early (no goto hit
inside the loop), the iterator 'connection' after the loop will be a
bogus pointer to an invalid structure object containing the HEAD
(&resource->connections). As a result, the use of 'connection' above
will lead to a invalid memory access (including a possible invalid free
as idr_remove could call free_layer).
The original intention should have been to remove all peer_devices,
but the following lines have already done the work. So just remove
this line and the unneeded label, to fix this bug.
Cc: stable@vger.kernel.org
Fixes:
c06ece6ba6f1b ("drbd: Turn connection->volumes into connection->peer_devices")
Signed-off-by: Xiaomeng Tong <xiam0nd.tong@gmail.com>
Reviewed-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Reviewed-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Lv Yunlong [Wed, 6 Apr 2022 19:04:43 +0000 (21:04 +0200)]
drbd: Fix five use after free bugs in get_initial_state
In get_initial_state, it calls notify_initial_state_done(skb,..) if
cb->args[5]==1. If genlmsg_put() failed in notify_initial_state_done(),
the skb will be freed by nlmsg_free(skb).
Then get_initial_state will goto out and the freed skb will be used by
return value skb->len, which is a uaf bug.
What's worse, the same problem goes even further: skb can also be
freed in the notify_*_state_change -> notify_*_state calls below.
Thus 4 additional uaf bugs happened.
My patch lets the problem callee functions: notify_initial_state_done
and notify_*_state_change return an error code if errors happen.
So that the error codes could be propagated and the uaf bugs can be avoid.
v2 reports a compilation warning. This v3 fixed this warning and built
successfully in my local environment with no additional warnings.
v2: https://lore.kernel.org/patchwork/patch/1435218/
Fixes:
a29728463b254 ("drbd: Backport the "events2" command")
Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn>
Reviewed-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Maxim Mikityanskiy [Wed, 6 Apr 2022 12:41:13 +0000 (15:41 +0300)]
bpf: Adjust bpf_tcp_check_syncookie selftest to test dual-stack sockets
The previous commit fixed support for dual-stack sockets in
bpf_tcp_check_syncookie. This commit adjusts the selftest to verify the
fixed functionality.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Arthur Fabre <afabre@cloudflare.com>
Link: https://lore.kernel.org/bpf/20220406124113.2795730-2-maximmi@nvidia.com
Maxim Mikityanskiy [Wed, 6 Apr 2022 12:41:12 +0000 (15:41 +0300)]
bpf: Support dual-stack sockets in bpf_tcp_check_syncookie
bpf_tcp_gen_syncookie looks at the IP version in the IP header and
validates the address family of the socket. It supports IPv4 packets in
AF_INET6 dual-stack sockets.
On the other hand, bpf_tcp_check_syncookie looks only at the address
family of the socket, ignoring the real IP version in headers, and
validates only the packet size. This implementation has some drawbacks:
1. Packets are not validated properly, allowing a BPF program to trick
bpf_tcp_check_syncookie into handling an IPv6 packet on an IPv4
socket.
2. Dual-stack sockets fail the checks on IPv4 packets. IPv4 clients end
up receiving a SYNACK with the cookie, but the following ACK gets
dropped.
This patch fixes these issues by changing the checks in
bpf_tcp_check_syncookie to match the ones in bpf_tcp_gen_syncookie. IP
version from the header is taken into account, and it is validated
properly with address family.
Fixes:
399040847084 ("bpf: add helper to check for a valid SYN cookie")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Acked-by: Arthur Fabre <afabre@cloudflare.com>
Link: https://lore.kernel.org/bpf/20220406124113.2795730-1-maximmi@nvidia.com
Alex Deucher [Fri, 1 Apr 2022 15:08:48 +0000 (11:08 -0400)]
drm/amdgpu/smu10: fix SoC/fclk units in auto mode
SMU takes clock limits in Mhz units. socclk and fclk were
using 10 khz units in some cases. Switch to Mhz units.
Fixes higher than required SoC clocks.
Fixes:
97cf32996c46d9 ("drm/amd/pm: Removed fixed clock in auto mode DPM")
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Dmytro Laktyushkin [Wed, 30 Mar 2022 20:05:50 +0000 (16:05 -0400)]
drm/amd/display: update dcn315 clock table read
[Why & How]
Make dcn315 base its clock table off dcfclk rather than fclk.
This change also adds some sanity checking to make sure an
empty pmfw table does not result in invalid dal clocks.
Reviewed-by: Charlene Liu <Charlene.Liu@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Benjamin Marty [Wed, 23 Mar 2022 21:08:26 +0000 (22:08 +0100)]
drm/amdgpu/display: change pipe policy for DCN 2.1
Fixes crash on MST Hub disconnect.
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1849
Fixes:
ee2698cf79cc ("drm/amd/display: Changed pipe split policy to allow for multi-display pipe split")
Signed-off-by: Benjamin Marty <info@benjaminmarty.ch>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Jimmy Kizito [Tue, 22 Mar 2022 23:12:47 +0000 (19:12 -0400)]
drm/amd/display: Add configuration options for AUX wake work around.
[Why]
Work around to try to wake unresponsive DP sinks may need to be adjusted
for certain sinks.
[How]
Add options to disable work around or adjust time spent trying to wake
unresponsive DPRX.
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Martin Leung <Martin.Leung@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Jimmy Kizito <Jimmy.Kizito@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Eric Bernstein [Mon, 21 Mar 2022 14:42:34 +0000 (10:42 -0400)]
drm/amd/display: remove assert for odm transition case
Remove assert that will hit during odm transition case,
since this is a valid case.
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Eric Bernstein <eric.bernstein@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>