James Clark [Fri, 6 Aug 2021 13:41:02 +0000 (14:41 +0100)]
perf cs-etm: Initialise architecture based on TRCIDR1
Currently the architecture is hard coded as ARCH_V8, but from ETMv4.4
onwards this should be ARCH_AA64.
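A minimal sketch of the selection described above, assuming the trace unit minor version sits in bits [7:4] of TRCIDR1. The enum, macro and function names are hypothetical stand-ins, not the OpenCSD or perf definitions.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the decoder's architecture constants. */
enum sketch_arch { SKETCH_ARCH_V8, SKETCH_ARCH_AA64 };

/* Assumed layout: TRCIDR1[7:4] holds the trace unit minor version. */
#define SKETCH_TRCIDR1_ARCHMIN(idr1) (((idr1) >> 4) & 0xfu)

/* ETMv4.4 and later report a minor version of 4 or more. */
static enum sketch_arch sketch_arch_from_trcidr1(uint32_t idr1)
{
	return SKETCH_TRCIDR1_ARCHMIN(idr1) >= 4 ? SKETCH_ARCH_AA64
						 : SKETCH_ARCH_V8;
}
```

Keeping the check in one helper means both the packet printer and the packet decoder can share it.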
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20210806134109.1182235-3-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
James Clark [Fri, 6 Aug 2021 13:41:01 +0000 (14:41 +0100)]
perf cs-etm: Refactor initialisation of decoder params.
The initialisation of the decoder params is duplicated between
creation of the packet printer and packet decoder. Put them both
into one function so that future changes only need to be made in one
place.
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20210806134109.1182235-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
James Clark [Mon, 16 Aug 2021 13:07:05 +0000 (14:07 +0100)]
tools build: Fix feature detect clean for out of source builds
Currently the clean target when using O= isn't cleaning the feature
detect output. This is because O= and OUTPUT= are set to canonical
paths. For example in tools/perf/Makefile:
FULL_O := $(shell cd $(PWD); readlink -f $(O) || echo $(O))
This means that OUTPUT ends in a / and most usages prepend it to a file
without adding an extra /. This line that was changed adds an extra /
before the 'feature' folder but not to the end, resulting in a clean
command like this:
rm -f /tmp/build//featuretest-all.bin ...
After the change the clean command looks like this:
rm -f /tmp/build/feature/test-all.bin ...
Fixes: 762323eb39a257c3 ("perf build: Move feature cleanup under tools/build")
Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lore.kernel.org/lkml/20210816130705.1331868-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Riccardo Mancini [Sat, 21 Aug 2021 09:19:33 +0000 (11:19 +0200)]
perf evlist: Add evlist__for_each_entry_from() macro
This patch adds a new iteration macro for evlist that resumes iteration
from a given evsel in the evlist.
This macro will be used in the workqueue series.
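A minimal sketch of a "resume from here" iteration macro, assuming a simple singly linked list in place of the real evlist. All names are hypothetical; the actual macro and list type in perf may look different.

```c
#include <stddef.h>

/* Hypothetical singly linked list standing in for an evlist. */
struct sketch_evsel {
	int idx;
	struct sketch_evsel *next;
};

/* Continue iterating from 'pos' (inclusive) instead of from the list head. */
#define sketch_for_each_entry_from(pos) \
	for (; (pos) != NULL; (pos) = (pos)->next)

/* Sum the idx of every entry from 'start' to the end of the list. */
static int sketch_sum_from(struct sketch_evsel *start)
{
	struct sketch_evsel *pos = start;
	int sum = 0;

	sketch_for_each_entry_from(pos)
		sum += pos->idx;
	return sum;
}
```

The caller sets the cursor before the loop, so an interrupted walk can pick up where it stopped.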
Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/2386505f8b598adf0dbcd04ec21804c6bcf00826.1629490974.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Riccardo Mancini [Sat, 21 Aug 2021 09:19:30 +0000 (11:19 +0200)]
perf evsel: Handle precise_ip fallback in evsel__open_cpu()
This is another patch in the effort to separate the fallback mechanisms
from the open itself.
In case of precise_ip fallback, the original precise_ip will be stored
in the evsel (it was stored in a local variable) and the open will be
retried. Since the precise_ip fallback will be the first in the chain of
fallbacks, there should be no functional change with this patch.
Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/74208c433d2024a6c4af9c0b140b54ed6b5ea810.1629490974.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Riccardo Mancini [Sat, 21 Aug 2021 09:19:29 +0000 (11:19 +0200)]
perf evsel: Move bpf_counter__install_pe() to success path in evsel__open_cpu()
I don't see why bpf_counter__install_pe() should get called even if
fd = -1, so I'm moving it to the success path.
This will be useful in following patches to separate the actual open and
the related operations from the fallback mechanisms.
Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Link: http://lore.kernel.org/lkml/64f8a1b0a838a6e6049cd43c1beafd432999ae57.1629490974.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Riccardo Mancini [Sat, 21 Aug 2021 09:19:28 +0000 (11:19 +0200)]
perf evsel: Move test_attr__open() to success path in evsel__open_cpu()
test_attr__open() ignores the fd if -1, therefore it is safe to move it to
the success path (fd >= 0).
Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/b3baf11360ca96541c9631730614fd7d217496fc.1629490974.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Riccardo Mancini [Sat, 21 Aug 2021 09:19:27 +0000 (11:19 +0200)]
perf evsel: Move ignore_missing_thread() to fallback code
This patch moves ignore_missing_thread outside the perf_event_open loop.
Doing so, we need to move the retry_open flag a few places higher, with
minimal impact. Furthermore, thread need not be decreased since it won't
get increased by the for loop (since we're jumping back inside), but we
need to check that the nthreads decrease didn't put thread out of range.
The goal is to have fallbacks handled in one place only, since in the
future parallel code, these would be handled separately.
Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/4eca51443c786baaf6811b7cd8e73aafd97f7606.1629490974.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Riccardo Mancini [Sat, 21 Aug 2021 09:19:26 +0000 (11:19 +0200)]
perf evsel: Separate rlimit increase from evsel__open_cpu()
This is a preparatory patch for the workqueue patches. The goal is to
separate, in evsel__open_cpu(), the actual opening (which could be
performed in parallel) from the existing fallback mechanisms, which
should be handled sequentially.
This patch separates the rlimit increase from evsel__open_cpu().
Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/2f256de8ec37b9809a5cef73c2fa7bce416af5d3.1629490974.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Riccardo Mancini [Sat, 21 Aug 2021 09:19:25 +0000 (11:19 +0200)]
perf evsel: Separate missing feature detection from evsel__open_cpu()
This is a preparatory patch for the workqueue patches. The goal is to
separate, in evsel__open_cpu(), the actual opening, which could be
performed in parallel, from the existing fallback mechanisms, which
should be handled sequentially.
This patch separates the missing feature detection in evsel__open_cpu()
into a new evsel__detect_missing_features() function.
Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/cba0b7d939862473662adeedb0f9c9b69566ee9a.1629490974.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Riccardo Mancini [Sat, 21 Aug 2021 09:19:24 +0000 (11:19 +0200)]
perf evsel: Add evsel__prepare_open()
This function will prepare the evsel and disable the missing features.
It will be used in one of the following patches.
Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/fa5e78bbb92c848226f044278fdcf777b3ce4583.1629490974.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Riccardo Mancini [Sat, 21 Aug 2021 09:19:23 +0000 (11:19 +0200)]
perf evsel: Separate missing feature disabling from evsel__open_cpu
This is a preparatory patch for the patches in the workqueue series. The
goal is to separate, in evsel__open_cpu(), the actual opening, which could
be performed in parallel, from the existing fallback mechanisms, which
should be handled sequentially.
This patch separates the disabling of missing features from
evsel__open_cpu() into a new function, evsel__disable_missing_features().
Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/48138bd2932646dde315505da733c2ca635ad2ee.1629490974.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Riccardo Mancini [Sat, 21 Aug 2021 09:19:22 +0000 (11:19 +0200)]
perf evsel: Save open flags in evsel in prepare_open()
This patch caches the flags used in perf_event_open() inside evsel, so
that they can be set in __evsel__prepare_open() (this will be useful in
patches in the workqueue series, when the fallback mechanisms will be
handled outside the open itself).
This also optimizes the code, by not having to recompute them every time.
Since flags are now saved in evsel, the flags argument in
perf_event_open() is removed.
Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/d9f63159098e56fa518eecf25171d72e6f74df37.1629490974.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Riccardo Mancini [Sat, 21 Aug 2021 09:19:21 +0000 (11:19 +0200)]
perf evsel: Separate open preparation from open itself
This is a preparatory patch for the following patches. The goal is to
separate, in evsel__open_cpu(), the actual perf_event_open, which could be
performed in parallel, from the existing fallback mechanisms, which
should be handled sequentially.
This patch separates the first lines of evsel__open_cpu into a new
__evsel__prepare_open function.
Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/e14118b934c338dbbf68b8677f20d0d7dbf9359a.1629490974.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Riccardo Mancini [Sat, 21 Aug 2021 09:19:20 +0000 (11:19 +0200)]
perf evsel: Remove retry_sample_id goto label
As far as I can tell, there is no good reason, apart from optimization,
to have the retry_sample_id separate from fallback_missing_features.
Probably, this label was added to avoid reapplying patches for missing
features that had already been applied.
However, missing features that have been added later have not used this
optimization, always jumping to fallback_missing_features and reapplying
all missing features.
This patch removes that label, replacing it with
fallback_missing_features.
Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/340af0d03408d6621fd9c742e311db18b3585b3b.1629490974.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Riccardo Mancini [Sat, 21 Aug 2021 09:19:11 +0000 (11:19 +0200)]
perf mmap: Add missing bitops.h header
MMAP_CPU_MASK_BYTES uses the BITS_TO_LONGS macro, which is defined in
linux/bitops.h.
However, this header is not included directly, but gets imported
indirectly in files using the macro.
This patch adds the missing include.
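A minimal sketch of the size computation the macro relies on, assuming the usual round-up definition of BITS_TO_LONGS. The SKETCH_ names are hypothetical re-creations for illustration, not the definitions from linux/bitops.h.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical re-creation of the helpers used by the mask size macro. */
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_BITS_TO_LONGS(nr) \
	(((nr) + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG)

/* Bytes needed to store a mask of 'nbits' bits as an array of unsigned long. */
static size_t sketch_mask_bytes(size_t nbits)
{
	return SKETCH_BITS_TO_LONGS(nbits) * sizeof(unsigned long);
}
```

A header that uses such a macro should include its definition itself, so it does not depend on what its users happen to include first.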
Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/c5b91ee432a2e28e7f16337c740b43b4d0b0e86c.1629490974.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Riccardo Mancini [Sat, 21 Aug 2021 09:19:08 +0000 (11:19 +0200)]
libperf cpumap: Take into advantage it is sorted to optimize perf_cpu_map__max()
Since commit 7074674e7338863e ("perf cpumap: Maintain cpumaps ordered and
without dups"), perf_cpu_map elements are sorted in ascending order.
This patch improves the perf_cpu_map__max function by returning the last
element.
Committer notes:
Do it as a ternary to keep it in just one return line, add a comment
explaining that it is sorted and which functions do the sorting.
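A minimal sketch of the idea, assuming a simplified map that is just a sorted array plus a count. The struct and function names are hypothetical, and returning -1 for an empty map is an assumption of this sketch.

```c
/* Hypothetical, simplified stand-in for a CPU map kept in ascending order. */
struct sketch_cpu_map {
	int nr;
	const int *cpus;
};

static int sketch_cpu_map_max(const struct sketch_cpu_map *map)
{
	/* Sorted ascending, so the last element is the maximum; -1 if empty. */
	return map->nr > 0 ? map->cpus[map->nr - 1] : -1;
}
```

This replaces a linear scan with a constant-time lookup, which only holds as long as every constructor keeps the array sorted.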
Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/fb79f02e7b86ea8044d563adb1e9890c906f982f.1629490974.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Riccardo Mancini [Sat, 21 Aug 2021 09:19:37 +0000 (11:19 +0200)]
libsubcmd: add OPT_UINTEGER_OPTARG option type
This patch adds OPT_UINTEGER_OPTARG, which is the same as OPT_UINTEGER,
but also makes it possible to use the option without any value, setting
the variable to a default value, d.
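A minimal sketch of the behaviour described above, assuming the option parser hands the callback either the value text or NULL when the option was given bare. The helper name and its signature are hypothetical and are not the libsubcmd API.

```c
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper: 'arg' is the value text, or NULL when the option
 * was given without a value. Returns 0 on success, -1 on a bad value.
 */
static int sketch_parse_uint_optarg(const char *arg, unsigned int defval,
				    unsigned int *out)
{
	char *end;
	unsigned long v;

	if (arg == NULL) {		/* bare option: use the default */
		*out = defval;
		return 0;
	}
	if (*arg < '0' || *arg > '9')	/* rejects "", signs and whitespace */
		return -1;
	v = strtoul(arg, &end, 10);
	if (*end != '\0' || v > UINT_MAX)
		return -1;
	*out = (unsigned int)v;
	return 0;
}
```

With a default of 8, a bare option stores 8 while an explicit value of "4" stores 4.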
Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/c46749b3dff796729078352ff164d363457a3587.1629490974.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
James Clark [Tue, 31 Aug 2021 14:55:01 +0000 (15:55 +0100)]
perf tools: Fix LLVM download hint link
http://llvm.org/apt returns 404; it has moved to https://apt.llvm.org/
Signed-off-by: James Clark <james.clark@arm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: clang-built-linux@googlegroups.com
Link: http://lore.kernel.org/lkml/20210831145501.2135754-3-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
James Clark [Tue, 31 Aug 2021 14:55:00 +0000 (15:55 +0100)]
perf tools: Fix LLVM test failure when running in verbose mode
A CI system might want to run all tests in verbose mode so that there is
enough information to diagnose issues. This LLVM test is the only test
that uses "-v" to signify to not skip the test if the preconditions
aren't met (LLVM isn't installed). This means that running the test in
verbose mode without LLVM installed causes a test failure.
For consistency with the other tests, remove this verbose/skip check. An
alternate solution would be to make _all_ tests not skip when run in
verbose mode, but I don't think that would be intuitive.
Also change the search_program() call to search_program_and_warn().
Previously the hint about installing LLVM was only printed by the actual
test because this check was skipped in verbose mode. To maintain the old
behaviour, the precondition check must also print the full warning.
Previous output:
$ ./perf test llvm
40: LLVM search and compile :
40.1: Basic BPF llvm compile : Skip
$ ./perf test -v llvm
40: LLVM search and compile :
40.1: Basic BPF llvm compile :
--- start ---
test child forked, pid 2085835
ERROR: unable to find clang.
Hint: Try to install latest clang/llvm to support BPF. Check your $PATH
...
test child finished with -1
---- end ----
LLVM search and compile subtest 1: FAILED!
New output (non verbose mode is identical, verbose changes from fail to
skip):
$ ./perf test llvm
40: LLVM search and compile :
40.1: Basic BPF llvm compile : Skip
$ ./perf test -v llvm
40: LLVM search and compile :
40.1: Basic BPF llvm compile :
--- start ---
test child forked, pid 2087680
ERROR: unable to find clang.
Hint: Try to install latest clang/llvm to support BPF. Check your $PATH
...
No clang, skip this test
test child finished with -2
---- end ----
LLVM search and compile subtest 1: Skip
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: clang-built-linux@googlegroups.com
Link: http://lore.kernel.org/lkml/20210831145501.2135754-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
James Clark [Tue, 31 Aug 2021 14:54:59 +0000 (15:54 +0100)]
perf tools: Refactor LLVM test warning for missing binary
The same warning is duplicated in two places so refactor it into a
single function "search_program_and_warn". This will be used a third
time in a later commit.
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: clang-built-linux@googlegroups.com
Link: http://lore.kernel.org/lkml/20210831145501.2135754-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Leo Yan [Sun, 29 Aug 2021 10:22:38 +0000 (18:22 +0800)]
perf auxtrace arm: Support compat_auxtrace_mmap__{read_head|write_tail}
When the tool runs in compat mode on an Arm platform, the kernel is in
64-bit mode and user space is in 32-bit mode; user space can use the
"ldrd" and "strd" instructions for 64-bit value atomicity.
This patch adds compat_auxtrace_mmap__{read_head|write_tail} for arm
builds; they use the "ldrd" and "strd" instructions to ensure atomic
access to the AUX head and tail. The file arch/arm/util/auxtrace.c is
built for both arm and arm64. These two functions are not needed for
arm64, so check the compiler macro "__arm__" to include them only for
arm builds.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: James Clark <james.clark@arm.com>
Tested-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Russell King (oracle)" <linux@armlinux.org.uk>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20210829102238.19693-3-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Leo Yan [Sun, 29 Aug 2021 10:22:37 +0000 (18:22 +0800)]
perf auxtrace: Add compat_auxtrace_mmap__{read_head|write_tail}
When perf runs in compat mode (the kernel is in 64-bit mode and perf is in
32-bit mode), 64-bit value atomicity cannot be assured in user space. E.g.
on some architectures, a 64-bit access is split into two instructions, one
for the low 32-bit word and another for the high 32-bit word.
This patch introduces the weak functions compat_auxtrace_mmap__read_head()
and compat_auxtrace_mmap__write_tail(). As their names indicate, the perf
tool uses these two functions to access the AUX head and tail when it works
in compat mode. They allow the perf tool to work properly in certain
conditions, e.g. when it works in snapshot mode and uses only the AUX head
pointer, or when it uses the AUX buffer and the incremented tail is not
bigger than 4GB.
When the AUX tail is bigger than 4GB, which the perf tool cannot handle,
compat_auxtrace_mmap__write_tail() returns -1 to tell the caller to bail
out with an error.
These two functions are declared with the weak attribute, which allows an
arch to provide its own implementations if it can support 64-bit value
atomicity in compat mode.
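A minimal sketch of how such generic fallbacks could behave, assuming a simplified control page with only the AUX head and tail. The struct and function names are hypothetical. The read strategy shown (retry until the high word is stable) is one possible approach and an assumption of this sketch; a real implementation would also need memory barriers.

```c
#include <stdint.h>

/* Hypothetical stand-in for the AUX fields of the mmap control page. */
struct sketch_aux_page {
	volatile uint64_t aux_head;
	volatile uint64_t aux_tail;
};

/*
 * Read the head three times and retry until the high 32-bit word is the
 * same in the first and last read, then return the middle read.
 */
static uint64_t sketch_compat_read_head(const struct sketch_aux_page *pg)
{
	const uint64_t high = (uint64_t)UINT32_MAX << 32;
	uint64_t first, second, last;

	do {
		first = pg->aux_head;
		second = pg->aux_head;
		last = pg->aux_head;
	} while ((first & high) != (last & high));
	return second;
}

/* Refuse tails that do not fit in 32 bits; the caller must bail out. */
static int sketch_compat_write_tail(struct sketch_aux_page *pg, uint64_t tail)
{
	if (tail >> 32)
		return -1;
	pg->aux_tail = tail;
	return 0;
}
```

Rejecting large tails keeps the high word at zero, so a torn 64-bit store cannot publish an inconsistent value.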
Suggested-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Russell King (oracle)" <linux@armlinux.org.uk>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20210829102238.19693-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ian Rogers [Thu, 26 Aug 2021 18:48:33 +0000 (11:48 -0700)]
perf bpf: Fix memory leaks relating to BTF.
BTF needs to be freed with btf__free().
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210826184833.408563-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Joshua Martinez [Tue, 24 Aug 2021 20:58:29 +0000 (13:58 -0700)]
perf data: Correct -h output
There is currently only 1 'perf data' command, but supporting extra
commands was breaking the help output. Simplify for now so that the help
output is correct.
Before:
$ perf data -h
Usage: perf data [<common options>] <command> [<options>]
$ perf data
Usage:
perf data [<common options>] <command> [<options>]
Available commands:
convert - converts data file between formats
After:
$ perf data
Usage: perf data convert [<options>]
-f, --force don't complain, do it
-i, --input <file> input file name
-v, --verbose be more verbose
--all Convert all events
--to-ctf ... Convert to CTF format
--to-json ... Convert to JSON format
--tod Convert time to wall clock time
$ perf data -h
Usage: perf data convert [<options>]
-f, --force don't complain, do it
-i, --input <file> input file name
-v, --verbose be more verbose
--all Convert all events
--to-ctf ... Convert to CTF format
--to-json ... Convert to JSON format
--tod Convert time to wall clock time
Signed-off-by: Joshua Martinez <joshuamart@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210824205829.52822-1-irogers@google.com
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Colin Ian King [Thu, 26 Aug 2021 12:18:01 +0000 (13:18 +0100)]
perf header: Fix spelling mistake "cant'" -> "can't"
There is a spelling mistake in a warning message. Fix it.
Signed-off-by: Colin King <colin.king@canonical.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-janitors@vger.kernel.org
Link: http://lore.kernel.org/lkml/20210826121801.13281-1-colin.king@canonical.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Mon, 30 Aug 2021 21:20:11 +0000 (18:20 -0300)]
perf dlfilters: Fix build on environments with a --sysroot gcc arg
Such as cross building on Android, so just add EXTRA_CFLAGS to the
dlfilters rules as it is where --sysroot= has been specified.
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/YS1JwIMTNNWcbGdT@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Andreas Gerstmayr [Mon, 30 Aug 2021 16:47:27 +0000 (18:47 +0200)]
perf flamegraph: flamegraph.py script improvements
* display perf.data header
* display PIDs of user stacks
* added option to change color scheme
* default to blue/green color scheme to improve accessibility
* correctly identify kernel stacks when kernel-debuginfo is installed
Signed-off-by: Andreas Gerstmayr <agerstmayr@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210830164729.116049-1-agerstmayr@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Fri, 27 Aug 2021 23:32:12 +0000 (16:32 -0700)]
perf record: Fix wrong comm in system-wide mode with delay
Stephane found that the name of the forked process in system-wide
mode is wrong when the --delay option is used. For example,
# perf record -a --delay=1000 noploop 3
The noploop process will run a busy loop for 3 seconds, and on an idle
machine it should show up at the top of the perf report. It works
well without the --delay option. But if I add the option, it shows
'perf', not 'noploop'.
# perf report -s comm -q | head -3
52.94% perf
16.65% swapper
12.04% chrome
It turned out that the dummy event didn't work at all and it missed
COMM and MMAP events for the noploop process (and others too). We
should enable the dummy event immediately in system-wide mode, as the
enable-on-exec would work only for task events.
With this change,
# perf report -s comm -q | head -3
52.75% noploop
17.03% swapper
12.83% chrome
Reported-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210827233212.3121037-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Mon, 30 Aug 2021 17:02:00 +0000 (10:02 -0700)]
perf stat: Do not allow --for-each-cgroup without cpu
The cgroup mode should work with cpu events. Warn if the --for-each-cgroup
option is used with a task target, as is done for the existing -G option.
# perf stat --for-each-cgroup . sleep 1
both cgroup and no-aggregation modes only available in system-wide mode
Usage: perf stat [<options>] [<command>]
-G, --cgroup <name> monitor event in cgroup name only
-A, --no-aggr disable CPU count aggregation
-a, --all-cpus system-wide collection from all CPUs
--for-each-cgroup <name>
expand events for each cgroup
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210830170200.55652-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Mon, 30 Aug 2021 18:42:57 +0000 (15:42 -0300)]
perf bench evlist-open-close: Use PRIu64 with u64 to fix build on 32-bit architectures
73 9.00 ubuntu:18.04-x-powerpc : FAIL gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)
bench/evlist-open-close.c: In function 'bench_evlist_open_close__run':
bench/evlist-open-close.c:173:12: error: format '%ld' expects argument of type 'long int', but argument 5 has type 'u64 {aka long long unsigned int}' [-Werror=format=]
pr_debug("Iteration %d took:\t%ldus\n", i, runtime_us);
^
bench/../util/debug.h:18:21: note: in definition of macro 'pr_fmt'
#define pr_fmt(fmt) fmt
^~~
bench/evlist-open-close.c:173:3: note: in expansion of macro 'pr_debug'
pr_debug("Iteration %d took:\t%ldus\n", i, runtime_us);
^~~~~~~~
cc1: all warnings being treated as errors
/git/perf-5.14.0/tools/build/Makefile.build:139: recipe for target 'bench' failed
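A minimal sketch of the portable form of that format string, assuming the value is a uint64_t and the message goes to a caller-supplied buffer. The helper name is hypothetical; only the PRIu64 macro from inttypes.h is the point here.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* PRIu64 expands to the right conversion for uint64_t on every target. */
static int sketch_format_runtime(char *buf, size_t len, int i,
				 uint64_t runtime_us)
{
	return snprintf(buf, len, "Iteration %d took:\t%" PRIu64 "us\n",
			i, runtime_us);
}
```

A hard-coded "%ld" only matches where long happens to be 64 bits wide, which is why the 32-bit build failed.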
Cc: Riccardo Mancini <rickyman7@gmail.com>
Fixes: 4241eabf59d5b7e9 ("perf bench: Add benchmark for evlist open/close operations")
Link: http://lore.kernel.org/lkml/YS0oTcA9Zuy8Wjm9@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
James Clark [Wed, 25 Aug 2021 16:42:59 +0000 (17:42 +0100)]
perf tests: Fix *probe_vfs_getname.sh test failures
Commit 4d6101f5fd5d9960 ("perf probe: Clarify error message about
not finding kernel modules debuginfo") changed the error message "Failed
to find the path for kernel" to "Failed to find the path for the
kernel".
Update the regex so that the tests still skip rather than fail when
kernel debug symbols aren't present.
Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Link: http://lore.kernel.org/lkml/20210825164259.833222-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Wed, 25 Aug 2021 14:50:37 +0000 (11:50 -0300)]
perf bench inject-buildid: Handle writen() errors
The build on fedora:35 and fedora:rawhide with clang is failing with:
49 41.00 fedora:35 : FAIL clang version 13.0.0 (Fedora 13.0.0~rc1-1.fc35)
bench/inject-buildid.c:351:6: error: variable 'len' set but not used [-Werror,-Wunused-but-set-variable]
u64 len = 0;
^
1 error generated.
make[3]: *** [/git/perf-5.14.0-rc7/tools/build/Makefile.build:139: bench] Error 2
50 41.11 fedora:rawhide : FAIL clang version 13.0.0 (Fedora 13.0.0~rc1-1.fc35)
bench/inject-buildid.c:351:6: error: variable 'len' set but not used [-Werror,-Wunused-but-set-variable]
u64 len = 0;
^
1 error generated.
make[3]: *** [/git/perf-5.14.0-rc7/tools/build/Makefile.build:139: bench] Error 2
That 'len' variable is not used at all, so just make sure all the
synthesize_RECORD() routines return ssize_t to propagate the writen()
return, as it may fail, ditch the 'ret' var and bail out if those
routines fail.
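The error-propagation pattern described above can be sketched in plain userspace C. This is only an illustration, not the code in bench/inject-buildid.c: write_all() is a hypothetical stand-in for writen(), and emit_record() is a hypothetical stand-in for one of the synthesize routines.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for writen(): write the whole buffer or fail. */
static ssize_t write_all(int fd, const void *buf, size_t n)
{
	const char *p = buf;
	size_t left = n;

	while (left) {
		ssize_t ret = write(fd, p, left);

		if (ret < 0 && errno == EINTR)
			continue;	/* interrupted, retry */
		if (ret <= 0)
			return -1;	/* propagate the failure */
		p += ret;
		left -= (size_t)ret;
	}
	return (ssize_t)n;
}

/* Hypothetical record emitter: returns bytes written, or -1 on error. */
static ssize_t emit_record(int fd, const void *hdr, size_t hdr_len,
			   const void *payload, size_t payload_len)
{
	ssize_t a, b;

	a = write_all(fd, hdr, hdr_len);
	if (a < 0)
		return a;
	b = write_all(fd, payload, payload_len);
	if (b < 0)
		return b;
	return a + b;
}
```

A caller would check the return value of emit_record() and bail out on a negative result instead of accumulating an otherwise unused length.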
Fixes: 0bf02a0d80427f26 ("perf bench: Add build-id injection benchmark")
Acked-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/CAM9d7cgEZNSor+B+7Y2C+QYGme_v5aH0Zn0RLfxoQ+Fy83EHrg@mail.gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Li Huafei [Mon, 23 Aug 2021 13:43:40 +0000 (21:43 +0800)]
perf unwind: Do not overwrite FEATURE_CHECK_LDFLAGS-libunwind-{x86,aarch64}
When setting LIBUNWIND_DIR, we first set
FEATURE_CHECK_LDFLAGS-libunwind-{aarch64,x86} = -L$(LIBUNWIND_DIR)/lib.
<committer note>
The overwriting happens a bit earlier, in:
libunwind_arch_set_flags = $(eval $(libunwind_arch_set_flags_code))
define libunwind_arch_set_flags_code
FEATURE_CHECK_CFLAGS-libunwind-$(1) = -I$(LIBUNWIND_DIR)/include
FEATURE_CHECK_LDFLAGS-libunwind-$(1) = -L$(LIBUNWIND_DIR)/lib
endef
ifdef LIBUNWIND_DIR
LIBUNWIND_CFLAGS = -I$(LIBUNWIND_DIR)/include
LIBUNWIND_LDFLAGS = -L$(LIBUNWIND_DIR)/lib
LIBUNWIND_ARCHS = x86 x86_64 arm aarch64 debug-frame-arm debug-frame-aarch64
$(foreach libunwind_arch,$(LIBUNWIND_ARCHS),$(call libunwind_arch_set_flags,$(libunwind_arch)))
endif
Look at that 'foreach' on all the LIBUNWIND_ARCHS.
</>
After commit 5c4d7c82c0dc ("perf unwind: Do not put
libunwind-{x86,aarch64} in FEATURE_TESTS_BASIC"),
FEATURE_CHECK_LDFLAGS-libunwind-{x86,aarch64} is overwritten. As a result,
the remote libunwind libraries cannot be found in the
$(LIBUNWIND_DIR)/lib directory during feature check tests. Fix it by
appending to the variable instead.
Before this patch:
perf$ make VF=1 LIBUNWIND_DIR=/opt/libunwind_aarch64
BUILD: Doing 'make -j16' parallel build
<SNIP>
...
... libopencsd: [ OFF ]
... libunwind-x86: [ OFF ]
... libunwind-x86_64: [ OFF ]
... libunwind-arm: [ OFF ]
... libunwind-aarch64: [ OFF ]
... libunwind-debug-frame: [ OFF ]
... libunwind-debug-frame-arm: [ OFF ]
... libunwind-debug-frame-aarch64: [ OFF ]
... cxx: [ OFF ]
<SNIP>
perf$ cat ../build/feature/test-libunwind-aarch64.make.output
/usr/bin/ld: cannot find -lunwind-aarch64
/usr/bin/ld: cannot find -lunwind-aarch64
collect2: error: ld returned 1 exit status
After this patch:
perf$ make VF=1 LIBUNWIND_DIR=/opt/libunwind_aarch64
BUILD: Doing 'make -j16' parallel build
<SNIP>
... libopencsd: [ OFF ]
... libunwind-x86: [ OFF ]
... libunwind-x86_64: [ OFF ]
... libunwind-arm: [ OFF ]
... libunwind-aarch64: [ on ]
... libunwind-debug-frame: [ OFF ]
... libunwind-debug-frame-arm: [ OFF ]
... libunwind-debug-frame-aarch64: [ OFF ]
... cxx: [ OFF ]
<SNIP>
perf$ cat ../build/feature/test-libunwind-aarch64.make.output
perf$ ldd ./perf
linux-vdso.so.1 (0x00007ffdf07da000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f30953dc000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f30951d4000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f3094e36000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f3094c32000)
libelf.so.1 => /usr/lib/x86_64-linux-gnu/libelf.so.1 (0x00007f3094a18000)
libdw.so.1 => /usr/lib/x86_64-linux-gnu/libdw.so.1 (0x00007f30947cc000)
libunwind-x86_64.so.8 => /usr/lib/x86_64-linux-gnu/libunwind-x86_64.so.8 (0x00007f30945ad000)
libunwind.so.8 => /usr/lib/x86_64-linux-gnu/libunwind.so.8 (0x00007f3094392000)
liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x00007f309416c000)
libunwind-aarch64.so.8 => not found
libslang.so.2 => /lib/x86_64-linux-gnu/libslang.so.2 (0x00007f3093c8a000)
libpython2.7.so.1.0 => /usr/local/lib/libpython2.7.so.1.0 (0x00007f309386b000)
libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f309364e000)
libnuma.so.1 => /usr/lib/x86_64-linux-gnu/libnuma.so.1 (0x00007f3093443000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f3093052000)
/lib64/ld-linux-x86-64.so.2 (0x00007f3096097000)
libbz2.so.1.0 => /lib/x86_64-linux-gnu/libbz2.so.1.0 (0x00007f3092e42000)
libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007f3092c3f000)
Fixes: 5c4d7c82c0dceccf ("perf unwind: Do not put libunwind-{x86,aarch64} in FEATURE_TESTS_BASIC")
Signed-off-by: Li Huafei <lihuafei1@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zhang Jinhao <zhangjinhao2@huawei.com>
Link: http://lore.kernel.org/lkml/20210823134340.60955-1-lihuafei1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Fri, 20 Aug 2021 14:13:36 +0000 (11:13 -0300)]
perf config: Fix caching and memory leak in perf_home_perfconfig()
AFAICT, perf_home_perfconfig() is supposed to cache the result of
home_perfconfig, which returns the default location of perfconfig for
the user, given the HOME environment variable.
However, the current implementation calls home_perfconfig every time
perf_home_perfconfig() is called (so no caching is actually performed),
replacing the previous pointer, thus also causing a memory leak.
This patch adds a check of whether either config or failed is set and,
in that case, directly returns config without calling home_perfconfig at
each invocation.
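The caching check described above can be sketched as follows. This is a minimal illustration under assumptions, not the perf source: build_home_config_path() is a hypothetical stand-in for home_perfconfig(), and cached_home_config_path() is a hypothetical stand-in for perf_home_perfconfig().

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for home_perfconfig(): builds "$HOME/.perfconfig". */
static char *build_home_config_path(void)
{
	const char *home = getenv("HOME");
	size_t len;
	char *path;

	if (!home || !*home)
		return NULL;
	len = strlen(home) + sizeof("/.perfconfig");
	path = malloc(len);
	if (path)
		snprintf(path, len, "%s/.perfconfig", home);
	return path;
}

/* Compute the path once; later calls return the cached pointer. */
static const char *cached_home_config_path(void)
{
	static char *config;
	static int failed;

	if (failed || config)
		return config;	/* cached result (NULL if the lookup failed) */

	config = build_home_config_path();
	if (!config)
		failed = 1;
	return config;
}
```

Because the early return covers both the success and the failure case, the helper runs at most once and the previously allocated string is never replaced, so nothing leaks.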
Fixes: f5f03e19ce14fc31 ("perf config: Add perf_home_perfconfig function")
Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Link: http://lore.kernel.org/lkml/20210820130817.740536-1-rickyman7@gmail.com
[ Removed needless double check for the 'failed' variable ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Alexey Dobriyan [Tue, 17 Aug 2021 11:58:33 +0000 (14:58 +0300)]
perf tools: Fixup get_current_dir_name() compilation
strdup() prototype doesn't live in stdlib.h.
Add limits.h for PATH_MAX definition as well.
This fixes the build on Android.
Signed-off-by: Alexey Dobriyan (SK hynix) <adobriyan@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/YRukaQbrgDWhiwGr@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Mon, 30 Aug 2021 13:05:46 +0000 (10:05 -0300)]
Merge remote-tracking branch 'torvalds/master' into perf/core
To pick up fixes.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Linus Torvalds [Sun, 29 Aug 2021 22:04:50 +0000 (15:04 -0700)]
Linux 5.14
Linus Torvalds [Sun, 29 Aug 2021 19:52:17 +0000 (12:52 -0700)]
Merge tag 'clk-fixes-for-linus' of git://git./linux/kernel/git/clk/linux
Pull clk fix from Stephen Boyd:
"One hotfix for a NULL pointer deref in the Renesas usb clk driver"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: renesas: rcar-usb2-clock-sel: Fix kernel NULL pointer dereference
Linus Torvalds [Sun, 29 Aug 2021 17:54:14 +0000 (10:54 -0700)]
Merge tag 'sched_urgent_for_v5.14' of git://git./linux/kernel/git/tip/tip
Pull scheduler fixes from Borislav Petkov:
- Have get_push_task() check whether current has migration disabled and
thus avoid useless invocations of the migration thread
- Rework initialization flow so that all rq->core's are initialized,
even of CPUs which have not been onlined yet, so that iterating over
them all works as expected
* tag 'sched_urgent_for_v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched: Fix get_push_task() vs migrate_disable()
sched: Fix Core-wide rq->lock for uninitialized CPUs
Linus Torvalds [Sun, 29 Aug 2021 17:47:02 +0000 (10:47 -0700)]
Merge tag 'irq_urgent_for_v5.14' of git://git./linux/kernel/git/tip/tip
Pull irq fix from Borislav Petkov:
- Have msix_mask_all() check a global control which says whether MSI-X
masking should be done and thus make it usable on Xen-PV too
* tag 'irq_urgent_for_v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
PCI/MSI: Skip masking MSI-X on Xen PV
Linus Torvalds [Sun, 29 Aug 2021 17:36:32 +0000 (10:36 -0700)]
Merge tag 'perf_urgent_for_v5.14' of git://git./linux/kernel/git/tip/tip
Pull perf fixes from Borislav Petkov:
- Prevent the amd/power module from being removed while in use
- Mark AMD IBS as not supporting content exclusion
- Add a workaround for AMD erratum #1197 where IBS registers might not
be restored properly after exiting CC6 state
- Fix a potential truncation of a 32-bit variable due to shifting
- Read the correct bits describing the number of configurable address
ranges on Intel PT
* tag 'perf_urgent_for_v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/amd/power: Assign pmu.module
perf/x86/amd/ibs: Extend PERF_PMU_CAP_NO_EXCLUDE to IBS Op
perf/x86/amd/ibs: Work around erratum #1197
perf/x86/intel/uncore: Fix integer overflow on 23 bit left shift of a u32
perf/x86/intel/pt: Fix mask of num_address_ranges
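The truncation caused by shifting a 32-bit variable, mentioned in the pull message above, can be shown with a small standalone sketch. The function names are hypothetical and the code is not taken from the uncore driver; it only demonstrates the general C behaviour.

```c
#include <stdint.h>

/* Shift performed in 32-bit arithmetic: high bits are lost before widening. */
static uint64_t shift_truncating(uint32_t v)
{
	return v << 23;
}

/* Widen first, then shift: all bits survive. */
static uint64_t shift_widened(uint32_t v)
{
	return (uint64_t)v << 23;
}
```

With v = 0x400 the first variant yields 0, while the second yields 0x200000000, because the cast makes the shift happen in 64-bit arithmetic.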
Linus Torvalds [Sun, 29 Aug 2021 17:26:00 +0000 (10:26 -0700)]
Merge tag 'x86_urgent_for_v5.14' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
- Fix build error on RHEL where -Werror=maybe-uninitialized is set.
- Restore the firmware's IDT when calling EFI boot services and before
ExitBootServices() has been called. This fixes a boot failure on what
appears to be a tablet with 32-bit UEFI running a 64-bit kernel.
* tag 'x86_urgent_for_v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/resctrl: Fix a maybe-uninitialized build warning treated as error
x86/efi: Restore Firmware IDT before calling ExitBootServices()
Helge Deller [Fri, 27 Aug 2021 18:42:57 +0000 (20:42 +0200)]
Revert "parisc: Add assembly implementations for memset, strlen, strcpy, strncpy and strcat"
This reverts commit 83af58f8068ea3f7b3c537c37a30887bfa585069.
It turns out that at least the assembly implementation for strncpy() was
buggy. Revert the whole commit and return back to the default coding.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org> # v5.4+
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Adam Ford [Thu, 26 Aug 2021 14:17:21 +0000 (09:17 -0500)]
clk: renesas: rcar-usb2-clock-sel: Fix kernel NULL pointer dereference
The probe was manually passing NULL instead of dev to devm_clk_hw_register.
This caused an "Unable to handle kernel NULL pointer dereference" error.
Fix this by passing 'dev'.
Signed-off-by: Adam Ford <aford173@gmail.com>
Fixes: a20a40a8bbc2 ("clk: renesas: rcar-usb2-clock-sel: Fix error handling in .probe()")
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Linus Torvalds [Sat, 28 Aug 2021 18:39:16 +0000 (11:39 -0700)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI fix from James Bottomley:
"A single fix for a race introduced by a fix that went into 5.14-rc5"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: core: Fix hang of freezing queue between blocking and running device
Linus Torvalds [Sat, 28 Aug 2021 18:32:16 +0000 (11:32 -0700)]
Merge tag 'usb-5.14' of git://git./linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are a few tiny USB fixes for reported issues with some USB
drivers.
These fixes include:
- gadget driver fixes for regressions
- tcpm driver fix
- dwc3 driver fixes
- xhci renesas firmware loading fix, again.
- usb serial option driver device id addition
- usb serial ch341 revert for regression
All of these have been in linux-next with no reported problems"
* tag 'usb-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
usb: gadget: u_audio: fix race condition on endpoint stop
usb: gadget: f_uac2: fixup feedback endpoint stop
usb: typec: tcpm: Raise vdm_sm_running flag only when VDM SM is running
usb: renesas-xhci: Prefer firmware loading on unknown ROM state
usb: dwc3: gadget: Stop EP0 transfers during pullup disable
usb: dwc3: gadget: Fix dwc3_calc_trbs_left()
Revert "USB: serial: ch341: fix character loss at high transfer rates"
USB: serial: option: add new VID/PID to support Fibocom FG150
Linus Torvalds [Sat, 28 Aug 2021 17:40:41 +0000 (10:40 -0700)]
Merge tag 'powerpc-5.14-7' of git://git./linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- Fix scv implicit soft-mask table for relocated (eg. kdump) kernels
- Re-enable ARCH_ENABLE_SPLIT_PMD_PTLOCK, which was disabled due to a
typo
Thanks to Lukas Bulwahn, Nicholas Piggin, and Daniel Axtens.
* tag 'powerpc-5.14-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/64s: Fix scv implicit soft-mask table for relocated kernels
powerpc: Re-enable ARCH_ENABLE_SPLIT_PMD_PTLOCK
Linus Torvalds [Fri, 27 Aug 2021 23:08:29 +0000 (16:08 -0700)]
Merge tag 'block-5.14-2021-08-27' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
- Revert the mq-deadline priority handling, it's causing serious
performance regressions. While experimental patches exist to fix
this up, it's too late to do so now. Revert it and re-do it properly
for 5.15 instead.
- Fix a NULL vs IS_ERR() regression in this release (Dan)
- Fix a mq-deadline accounting regression in this release (Bart)
- Mark cryptoloop as deprecated. It's broken and dm-crypt fully
supports it, and it's actively interfering with loop. Plan on removal
for 5.16 (Christoph)
* tag 'block-5.14-2021-08-27' of git://git.kernel.dk/linux-block:
cryptoloop: add a deprecation warning
pd: fix a NULL vs IS_ERR() check
Revert "block/mq-deadline: Prioritize high-priority requests"
mq-deadline: Fix request accounting
Linus Torvalds [Fri, 27 Aug 2021 22:59:00 +0000 (15:59 -0700)]
Merge tag 'soc-fixes-5.14-4' of git://git./linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
"Just two trivial fixes from the reset driver tree, nothing else came
up since the last soc fixes"
* tag 'soc-fixes-5.14-4' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
reset: reset-zynqmp: Fixed the argument data type
reset: RESET_MCHP_SPARX5 should depend on ARCH_SPARX5
Linus Torvalds [Fri, 27 Aug 2021 19:18:09 +0000 (12:18 -0700)]
Merge tag 'acpi-5.14-rc8' of git://git./linux/kernel/git/rafael/linux-pm
Pull ACPI fix from Rafael Wysocki:
"Fix a regression introduced during this cycle that has been partially
addressed by an earlier commit (Andy Shevchenko)"
* tag 'acpi-5.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
media: ipu3-cio2: Drop reference on error path in cio2_bridge_connect_sensor()
Linus Torvalds [Fri, 27 Aug 2021 19:06:51 +0000 (12:06 -0700)]
Merge tag 'pm-5.14-rc8' of git://git./linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These fix two issues introduced during this cycle, one of which is a
regression and the other one affects new code.
Specifics:
- Prevent the operating performance points (OPP) code from crashing
when some entries in the table of required OPPs are set to error
pointer values (Marijn Suijten)
- Prevent the generic power domains (genpd) framework from
incorrectly overriding the performance state of a device set by its
driver while it is runtime-suspended or when runtime PM of it is
disabled (Dmitry Osipenko)"
* tag 'pm-5.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM: domains: Improve runtime PM performance state handling
opp: core: Check for pending links before reading required_opp pointers
David Hildenbrand [Wed, 25 Aug 2021 10:24:15 +0000 (12:24 +0200)]
virtio-mem: fix sleeping in RCU read side section in virtio_mem_online_page_cb()
virtio_mem_set_fake_offline() might sleep now, and we call it under
rcu_read_lock(). To fix it, simply move the rcu_read_unlock() further
up, as we're done with the device.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: 6cc26d77613a ("virtio-mem: use page_offline_(start|end) when setting PageOffline()")
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rafael J. Wysocki [Fri, 27 Aug 2021 18:27:01 +0000 (20:27 +0200)]
Merge branch 'pm-opp'
* pm-opp:
opp: core: Check for pending links before reading required_opp pointers
Linus Torvalds [Fri, 27 Aug 2021 18:04:57 +0000 (11:04 -0700)]
Merge tag 'riscv-for-linus-5.14-rc8' of git://git./linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:
- device tree updates for the Microsemi Polarfire development kit that
fix some mismatches between the u-boot and Linux ethernet entries
- ensure that the F register state is correctly reflected in core dumps
* tag 'riscv-for-linus-5.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: dts: microchip: Add ethernet0 to the aliases node
riscv: dts: microchip: Use 'local-mac-address' for emac1
riscv: Ensure the value of FP registers in the core dump file is up to date
Linus Torvalds [Fri, 27 Aug 2021 16:52:48 +0000 (09:52 -0700)]
Merge tag 'mmc-v5.14-rc7' of git://git./linux/kernel/git/ulfh/mmc
Pull MMC host fix from Ulf Hansson:
- sdhci-iproc: Fix clock error for ACPI rpi's
* tag 'mmc-v5.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
Revert "mmc: sdhci-iproc: Set SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN on BCM2711"
Christoph Hellwig [Fri, 27 Aug 2021 16:32:50 +0000 (18:32 +0200)]
cryptoloop: add a deprecation warning
Support for cryptoloop has been officially marked broken and deprecated
in favor of dm-crypt (which supports the same broken algorithms if
needed) in Linux 2.6.4 (released in March 2004), and support for it has
been entirely removed from losetup in util-linux 2.23 (released in April
2013). Add a warning and a deprecation schedule.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210827163250.255325-1-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Linus Torvalds [Fri, 27 Aug 2021 16:00:43 +0000 (09:00 -0700)]
Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm
Pull ARM fix from Russell King:
"Resolve a Keystone 2 kernel mapping regression"
* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 9104/2: Fix Keystone 2 kernel mapping regression
Ulf Hansson [Fri, 27 Aug 2021 14:30:36 +0000 (16:30 +0200)]
Revert "mmc: sdhci-iproc: Set SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN on BCM2711"
This reverts commit 419dd626e357e89fc9c4e3863592c8b38cfe1571.
It turned out that the change from the reverted commit breaks the ACPI
based rpi's because it causes the 100 MHz max clock to be overridden by the
return value of sdhci_iproc_get_max_clock(), which is 0 because there isn't an
OF/DT based clock device.
Reported-by: Jeremy Linton <jeremy.linton@arm.com>
Fixes: 419dd626e357 ("mmc: sdhci-iproc: Set SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN on BCM2711")
Acked-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Jerome Brunet [Fri, 27 Aug 2021 09:29:27 +0000 (11:29 +0200)]
usb: gadget: u_audio: fix race condition on endpoint stop
If the endpoint completion callback is called right after the ep_enabled flag
is cleared and before usb_ep_dequeue() is called, we could do a double free
on the request and the associated buffer.
Fix this by clearing ep_enabled after all the endpoint requests have been
dequeued.
Fixes: 7de8681be2cd ("usb: gadget: u_audio: Free requests only after callback")
Cc: stable <stable@vger.kernel.org>
Reported-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Link: https://lore.kernel.org/r/20210827092927.366482-1-jbrunet@baylibre.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jerome Brunet [Fri, 27 Aug 2021 07:58:53 +0000 (09:58 +0200)]
usb: gadget: f_uac2: fixup feedback endpoint stop
When the uac2 function is stopped, there seems to be an issue reported on
some platforms (Intel Merrifield at least):
BUG: kernel NULL pointer dereference, address: 0000000000000008
...
RIP: 0010:dwc3_gadget_del_and_unmap_request+0x19/0xe0
...
Call Trace:
dwc3_remove_requests.constprop.0+0x12f/0x170
__dwc3_gadget_ep_disable+0x7a/0x160
dwc3_gadget_ep_disable+0x3d/0xd0
usb_ep_disable+0x1c/0x70
u_audio_stop_capture+0x79/0x120 [u_audio]
afunc_set_alt+0x73/0x80 [usb_f_uac2]
composite_setup+0x224/0x1b90 [libcomposite]
The issue happens only when the gadget is using the sync type "async", not
"adaptive". This indicates that the problem is coming from the feedback
endpoint, which is only used with async synchronization mode.
The problem is that the request is freed regardless of usb_ep_dequeue(), which
ends badly if the request is not actually dequeued yet.
Update the feedback endpoint free function to release the endpoint the same
way it is done for the data endpoint, which takes care of the problem.
Fixes: 24f779dac8f3 ("usb: gadget: f_uac2/u_audio: add feedback endpoint support")
Reported-by: Ferry Toth <ftoth@exalondelft.nl>
Tested-by: Ferry Toth <ftoth@exalondelft.nl>
Acked-by: Felipe Balbi <balbi@kernel.org>
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Link: https://lore.kernel.org/r/20210827075853.266912-1-jbrunet@baylibre.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dan Carpenter [Fri, 27 Aug 2021 10:00:23 +0000 (13:00 +0300)]
pd: fix a NULL vs IS_ERR() check
blk_mq_alloc_disk() returns error pointers, it doesn't return NULL
so correct the check.
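Why a NULL check misses an error pointer can be sketched in userspace C. The helpers err_ptr(), is_err() and ptr_err() below are hypothetical imitations of the kernel's ERR_PTR(), IS_ERR() and PTR_ERR(), and alloc_thing() is a made-up allocator that, like the function named above, reports failure as an error pointer rather than NULL.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Userspace imitations of the kernel's ERR_PTR()/IS_ERR()/PTR_ERR(). */
static void *err_ptr(long error) { return (void *)(intptr_t)error; }
static int is_err(const void *p) { return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO; }
static long ptr_err(const void *p) { return (long)(intptr_t)p; }

/* Hypothetical allocator that reports failure as an error pointer, never NULL. */
static void *alloc_thing(int fail)
{
	void *p;

	if (fail)
		return err_ptr(-ENOMEM);
	p = malloc(16);
	return p ? p : err_ptr(-ENOMEM);
}

/* Returns 0 on success or a negative errno. */
static int probe_thing(int fail)
{
	void *thing = alloc_thing(fail);

	/* A "!thing" test would never fire here: the error value is non-NULL. */
	if (is_err(thing))
		return (int)ptr_err(thing);
	free(thing);
	return 0;
}
```

The error value lives in the top 4095 addresses of the pointer range, so it compares unequal to NULL and has to be detected with the range test instead.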
Fixes: 262d431f9000 ("pd: use blk_mq_alloc_disk and blk_cleanup_disk")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/20210827100023.GB9449@kili
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Linus Torvalds [Fri, 27 Aug 2021 01:44:25 +0000 (18:44 -0700)]
Merge tag 'drm-fixes-2021-08-27' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Last set of fixes for 5.14, nothing major: a couple of i915, a couple of
imx and a few amdgpu. All pretty small.
i915:
- Fix syncmap memory leak
- Drop redundant display port debug print
amdgpu:
- Fix for pinning display buffers multiple times
- Fix delayed work handling for GFXOFF
- Fix build when CONFIG_SUSPEND is not set
imx:
- fix planar offset calculations
- fix accidental partial revert"
* tag 'drm-fixes-2021-08-27' of git://anongit.freedesktop.org/drm/drm:
drm/i915/dp: Drop redundant debug print
drm/i915: Fix syncmap memory leak
drm/amdgpu: Fix build with missing pm_suspend_target_state module export
drm/amdgpu: Cancel delayed work when GFXOFF is disabled
drm/amdgpu: use the preferred pin domain after the check
drm/imx: ipuv3-plane: fix accidental partial revert of 8 pixel alignment fix
gpu: ipu-v3: Fix i.MX IPU-v3 offset calculations for (semi)planar U/V formats
Dave Airlie [Fri, 27 Aug 2021 00:49:32 +0000 (10:49 +1000)]
Merge tag 'imx-drm-fixes-2021-08-18' of git://git.pengutronix.de/pza/linux into drm-fixes
drm/imx: imx-drm alignment and plane offset fixes
Fix an accidental partial revert of commit 94dfec48fca7 ("drm/imx: Add 8
pixel alignment fix") and plane offset calculations for capture of
non-aligned resolutions.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Philipp Zabel <p.zabel@pengutronix.de>
Link: https://patchwork.freedesktop.org/patch/msgid/85a41af99beb2c9e7d6020435a135bf9f205a5ff.camel@pengutronix.de
Dave Airlie [Fri, 27 Aug 2021 00:24:07 +0000 (10:24 +1000)]
Merge tag 'amd-drm-fixes-5.14-2021-08-25' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-5.14-2021-08-25:
amdgpu:
- Fix for pinning display buffers multiple times
- Fix delayed work handling for GFXOFF
- Fix build when CONFIG_SUSPEND is not set
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210826032658.4068-1-alexander.deucher@amd.com
Dave Airlie [Fri, 27 Aug 2021 00:13:37 +0000 (10:13 +1000)]
Merge tag 'drm-intel-fixes-2021-08-26' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
- Fix syncmap memory leak
- Drop redundant display port debug print
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YSfSeHbyS5wBZtNJ@intel.com
Marek Marczykowski-Górecki [Thu, 26 Aug 2021 17:03:42 +0000 (19:03 +0200)]
PCI/MSI: Skip masking MSI-X on Xen PV
When running as Xen PV guest, masking MSI-X is a responsibility of the
hypervisor. The guest has no write access to the relevant BAR at all - when
it tries to, it results in a crash like this:
BUG: unable to handle page fault for address: ffffc9004069100c
#PF: supervisor write access in kernel mode
#PF: error_code(0x0003) - permissions violation
RIP: e030:__pci_enable_msix_range.part.0+0x26b/0x5f0
e1000e_set_interrupt_capability+0xbf/0xd0 [e1000e]
e1000_probe+0x41f/0xdb0 [e1000e]
local_pci_probe+0x42/0x80
(...)
The recently introduced function msix_mask_all() does not check the global
variable pci_msi_ignore_mask which is set by XEN PV to bypass the masking
of MSI[-X] interrupts.
Add the check to make this function XEN PV compatible.
Fixes: 7d5ec3d36123 ("PCI/MSI: Mask all unused MSI-X entries")
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210826170342.135172-1-marmarek@invisiblethingslab.com
Linus Torvalds [Thu, 26 Aug 2021 20:26:40 +0000 (13:26 -0700)]
Merge tag 'nfsd-5.14-1' of git://linux-nfs.org/~bfields/linux
Pull nfsd fix from Bruce Fields:
"This is a one-liner fix for a serious bug that can cause the server to
become unresponsive to a client, so I think it's worth the last-minute
inclusion for 5.14"
* tag 'nfsd-5.14-1' of git://linux-nfs.org/~bfields/linux:
SUNRPC: Fix XPT_BUSY flag leakage in svc_handle_xprt()...
Linus Torvalds [Thu, 26 Aug 2021 20:20:22 +0000 (13:20 -0700)]
Merge tag 'net-5.14-rc8' of git://git./linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Networking fixes, including fixes from can and bpf.
Closing three hw-dependent regressions. Any fixes of note are in the
'old code' category. Nothing blocking release from our perspective.
Current release - regressions:
- stmmac: revert "stmmac: align RX buffers"
- usb: asix: ax88772: move embedded PHY detection as early as
possible
- usb: asix: do not call phy_disconnect() for ax88178
- Revert "net: really fix the build...", from Kalle to fix QCA6390
Current release - new code bugs:
- phy: mediatek: add the missing suspend/resume callbacks
Previous releases - regressions:
- qrtr: fix another OOB Read in qrtr_endpoint_post
- stmmac: dwmac-rk: fix unbalanced pm_runtime_enable warnings
Previous releases - always broken:
- inet: use siphash in exception handling
- ip_gre: add validation for csum_start
- bpf: fix ringbuf helper function compatibility
- rtnetlink: return correct error on changing device netns
- e1000e: do not try to recover the NVM checksum on Tiger Lake"
* tag 'net-5.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (43 commits)
Revert "net: really fix the build..."
net: hns3: fix get wrong pfc_en when query PFC configuration
net: hns3: fix GRO configuration error after reset
net: hns3: change the method of getting cmd index in debugfs
net: hns3: fix duplicate node in VLAN list
net: hns3: fix speed unknown issue in bond 4
net: hns3: add waiting time before cmdq memory is released
net: hns3: clear hardware resource when loading driver
net: fix NULL pointer reference in cipso_v4_doi_free
rtnetlink: Return correct error on changing device netns
net: dsa: hellcreek: Adjust schedule look ahead window
net: dsa: hellcreek: Fix incorrect setting of GCL
cxgb4: dont touch blocked freelist bitmap after free
ipv4: use siphash instead of Jenkins in fnhe_hashfun()
ipv6: use siphash in rt6_exception_hash()
can: usb: esd_usb2: esd_usb2_rx_event(): fix the interchange of the CAN RX and TX error counters
net: usb: asix: ax88772: fix boolconv.cocci warnings
net/sched: ets: fix crash when flipping from 'strict' to 'quantum'
qede: Fix memset corruption
net: stmmac: fix kernel panic due to NULL pointer dereference of buf->xdp
...
Jens Axboe [Thu, 26 Aug 2021 18:59:44 +0000 (12:59 -0600)]
Revert "block/mq-deadline: Prioritize high-priority requests"
This reverts commit fb926032b3209300f9dc454a36b8299582ae545c.
Zhen reports that this commit slows down mq-deadline on a 128 thread
box, going from 258K IOPS to 170-180K. My testing shows that Optane
gen2 IOPS goes from 2.3M IOPS to 1.2M IOPS on a 64 thread box.
Looking in detail at the code, the main culprit here is needing to sum
percpu counters in the dispatch hot path, leading to very high CPU
utilization there. To make matters worse, the code currently needs to
sum 2 percpu counters, and it does so in the most naive way of iterating
possible CPUs _twice_.
Since we're close to release, revert this commit and we can re-do it
with regular per-priority counters instead for the 5.15 kernel.
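The cost difference described above can be illustrated with a simplified userspace sketch. All names and the NR_CPUS value are assumptions made for illustration; this is not the mq-deadline code, only a model of summing two per-CPU counters versus reading plain counters.

```c
#include <stdint.h>

#define NR_CPUS 128	/* assumption: matches the 128-thread box in the report */

/* Hypothetical per-CPU statistics, one slot per possible CPU. */
struct percpu_stats {
	uint32_t inserted[NR_CPUS];
	uint32_t completed[NR_CPUS];
};

/* Hot-path cost: two full passes over all possible CPUs per query. */
static uint32_t queued_percpu(const struct percpu_stats *s)
{
	uint32_t ins = 0, done = 0;
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		ins += s->inserted[cpu];
	for (cpu = 0; cpu < NR_CPUS; cpu++)
		done += s->completed[cpu];
	return ins - done;
}

/* Alternative: plain counters, assumed to be updated under an existing lock. */
struct plain_stats {
	uint32_t inserted;
	uint32_t completed;
};

static uint32_t queued_plain(const struct plain_stats *s)
{
	return s->inserted - s->completed;	/* constant-time read */
}
```

Per-CPU counters make updates cheap but reads proportional to the number of possible CPUs, which is a poor trade when the read sits in the dispatch hot path.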
Link: https://lore.kernel.org/linux-block/20210826144039.2143-1-thunder.leizhen@huawei.com/
Reported-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Linus Torvalds [Thu, 26 Aug 2021 18:26:00 +0000 (11:26 -0700)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux
Pull arm64 fix from Will Deacon:
"We received a report this week that the generic version of
pfn_valid(), which we switched to this merge window in 16c9afc77660
("arm64/mm: drop HAVE_ARCH_PFN_VALID"), interacts badly with
dma_map_resource() due to the following check:
/* Don't allow RAM to be mapped */
if (WARN_ON_ONCE(pfn_valid(PHYS_PFN(phys_addr))))
return DMA_MAPPING_ERROR;
Since the ongoing saga to determine the semantics of pfn_valid() is
unlikely to be resolved this week (does it indicate valid memory, or
just the presence of a struct page, or whether that struct page has
been initialised?), just revert back to our old version of pfn_valid()
for 5.14.
Summary:
- Fix dma_map_resource() by reverting back to old pfn_valid() code"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
Partially revert "arm64/mm: drop HAVE_ARCH_PFN_VALID"
Linus Torvalds [Thu, 26 Aug 2021 18:18:30 +0000 (11:18 -0700)]
Merge tag 'ceph-for-5.14-rc8' of git://github.com/ceph/ceph-client
Pull ceph fixes from Ilya Dryomov:
"Two memory management fixes for the filesystem"
* tag 'ceph-for-5.14-rc8' of git://github.com/ceph/ceph-client:
ceph: fix possible null-pointer dereference in ceph_mdsmap_decode()
ceph: correctly handle releasing an embedded cap flush
Kalle Valo [Thu, 26 Aug 2021 17:28:16 +0000 (20:28 +0300)]
Revert "net: really fix the build..."
This reverts commit ce78ffa3ef1681065ba451cfd545da6126f5ca88.
Wren and Nicolas reported that ath11k was failing to initialise QCA6390
Wi-Fi 6 device with error:
qcom_mhi_qrtr: probe of mhi0_IPCR failed with error -22
Commit ce78ffa3ef16 ("net: really fix the build..."), introduced in
v5.14-rc5, caused this regression in qrtr. Most likely all ath11k
devices are broken, but I only tested QCA6390. Let's revert the broken
commit so that ath11k works again.
Reported-by: Wren Turkal <wt@penguintechs.org>
Reported-by: Nicolas Schichan <nschichan@freebox.fr>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210826172816.24478-1-kvalo@codeaurora.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Linus Torvalds [Thu, 26 Aug 2021 18:05:11 +0000 (11:05 -0700)]
Merge tag 'for-5.14-rc7-tag' of git://git./linux/kernel/git/kdave/linux
Pull btrfs fix from David Sterba:
"One more fix that I think qualifies for a late merge. It's a revert of
a one-liner fix that meanwhile got backported to stable kernels and we
got reports from users.
The broken fix prevents creating compressed inline extents, which
could be noticeable on space consumption.
Technically it's a regression as the patch was merged in 5.14-rc1 but
got propagated to several stable kernels and has higher exposure than
a 'typical' development cycle bug"
* tag 'for-5.14-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
Revert "btrfs: compression: don't try to compress if we don't have enough pages"
Sebastian Andrzej Siewior [Thu, 26 Aug 2021 13:37:38 +0000 (15:37 +0200)]
sched: Fix get_push_task() vs migrate_disable()
push_rt_task() attempts to move the currently running task away if the
next runnable task has migration disabled and therefore is pinned on the
current CPU.
The current task is retrieved via get_push_task() which only checks for
nr_cpus_allowed == 1, but does not check whether the task has migration
disabled and therefore cannot be moved either. The consequence is a
pointless invocation of the migration thread which correctly observes
that the task cannot be moved.
Return NULL if the task has migration disabled and cannot be moved to
another CPU.
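A minimal sketch of the check described above, using hypothetical
simplified types (struct task, struct runqueue and pick_push_task() are
illustrative stand-ins, not the kernel's definitions):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the scheduler's task and runqueue. */
struct task {
	int nr_cpus_allowed;     /* number of CPUs the task may run on */
	bool migration_disabled; /* true while migration is disabled */
};

struct runqueue {
	struct task *curr;       /* currently running task */
};

/* Return the running task if it can be pushed to another CPU, else NULL. */
static struct task *pick_push_task(const struct runqueue *rq)
{
	struct task *p;

	assert(rq != NULL);
	p = rq->curr;
	if (p == NULL)
		return NULL;
	/* Pinned by affinity: nothing to push. */
	if (p->nr_cpus_allowed == 1)
		return NULL;
	/* Pinned by disabled migration: pushing would be pointless. */
	if (p->migration_disabled)
		return NULL;
	return p;
}
```

With the second check in place the caller never wakes the migration
thread for a task that cannot move anyway.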
Fixes: a7c81556ec4d3 ("sched: Fix migrate_disable() vs rt/dl balancing")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210826133738.yiotqbtdaxzjsnfj@linutronix.de
Andy Shevchenko [Thu, 26 Aug 2021 10:53:24 +0000 (13:53 +0300)]
media: ipu3-cio2: Drop reference on error path in cio2_bridge_connect_sensor()
Commit 71f642833284 ("ACPI: utils: Fix reference counting in
for_each_acpi_dev_match()") moved the adev assignment out of the error
path and hence made acpi_dev_put(sensor->adev) a no-op. We still
need to drop the reference count on the error path; to achieve that,
replace sensor->adev with the locally assigned adev.
Fixes: 71f642833284 ("ACPI: utils: Fix reference counting in for_each_acpi_dev_match()")
Depends-on: fc68f42aa737 ("ACPI: fix NULL pointer dereference")
Reported-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Jakub Kicinski [Thu, 26 Aug 2021 15:43:20 +0000 (08:43 -0700)]
Merge https://git./linux/kernel/git/bpf/bpf
Alexei Starovoitov says:
====================
bpf 2021-08-26
We've added 1 non-merge commit during the last 1 day(s):
1) Fix ringbuf helper function compatibility, from Daniel.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
bpf: Fix ringbuf helper function compatibility
====================
Link: https://lore.kernel.org/r/20210826153720.19083-1-alexei.starovoitov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 26 Aug 2021 14:24:20 +0000 (07:24 -0700)]
Merge branch 'net-hns3-add-some-fixes-for-net'
Guangbin Huang says:
====================
net: hns3: add some fixes for -net
This series adds some fixes for the HNS3 ethernet driver.
====================
Link: https://lore.kernel.org/r/1629976921-43438-1-git-send-email-huangguangbin2@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Guangbin Huang [Thu, 26 Aug 2021 11:22:01 +0000 (19:22 +0800)]
net: hns3: fix get wrong pfc_en when query PFC configuration
Currently, when the PFC configuration is queried with dcbtool, the driver
returns the PFC enable status per TC. Since all priorities are mapped to
TC0 by default, enabling TC0 makes every priority mapped to TC0 show up
as enabled in the query result, even those that were never set.
For example:
$ dcb pfc show dev eth0
pfc-cap 4 macsec-bypass off delay 0
prio-pfc 0:off 1:off 2:off 3:off 4:off 5:off 6:off 7:off
$ dcb pfc set dev eth0 prio-pfc 0:on 1:on 2:on 3:on
$ dcb pfc show dev eth0
pfc-cap 4 macsec-bypass off delay 0
prio-pfc 0:on 1:on 2:on 3:on 4:on 5:on 6:on 7:on
To fix this problem, return the user's PFC configuration as saved in the
driver.
Fixes: cacde272dd00 ("net: hns3: Add hclge_dcb module for the support of DCB feature")
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Yufeng Mo [Thu, 26 Aug 2021 11:22:00 +0000 (19:22 +0800)]
net: hns3: fix GRO configuration error after reset
After a reset, GRO is enabled by default regardless of what the user
configured. This is incorrect: the user-configured value should be
restored. So add this restoration to the reset initialization.
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Yufeng Mo [Thu, 26 Aug 2021 11:21:59 +0000 (19:21 +0800)]
net: hns3: change the method of getting cmd index in debugfs
Currently, the cmd index is obtained in debugfs by comparing file names.
However, this method may cause errors when processing more complex file
names. So instead, save the cmd in private data and compare against it
when getting the cmd index in debugfs.
Fixes: 5e69ea7ee2a6 ("net: hns3: refactor the debugfs process")
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Guojia Liao [Thu, 26 Aug 2021 11:21:58 +0000 (19:21 +0800)]
net: hns3: fix duplicate node in VLAN list
A duplicate VLAN node must not be added to the VLAN list, otherwise
restoring VLANs from the list fails with "add failed". So this patch
checks the VLAN ID before adding a node to the VLAN list.
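A minimal sketch of such a check on a plain singly linked list. The
struct vlan_node layout and the vlan_list_*() helpers are hypothetical
and only illustrate the idea; the driver uses its own list types:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical list node; not the driver's real structure. */
struct vlan_node {
	uint16_t vlan_id;
	struct vlan_node *next;
};

static bool vlan_list_contains(const struct vlan_node *head, uint16_t vlan_id)
{
	for (; head != NULL; head = head->next)
		if (head->vlan_id == vlan_id)
			return true;
	return false;
}

/* Returns 0 if the ID is in the list afterwards, -1 on allocation failure. */
static int vlan_list_add(struct vlan_node **head, uint16_t vlan_id)
{
	struct vlan_node *node;

	assert(head != NULL);
	/* Skip the insert when the ID is already recorded. */
	if (vlan_list_contains(*head, vlan_id))
		return 0;
	node = malloc(sizeof(*node));
	if (node == NULL)
		return -1;
	node->vlan_id = vlan_id;
	node->next = *head;
	*head = node;
	return 0;
}

static size_t vlan_list_len(const struct vlan_node *head)
{
	size_t n = 0;

	for (; head != NULL; head = head->next)
		n++;
	return n;
}
```

Because every ID appears at most once, replaying the list later adds
each VLAN exactly once.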
Fixes: c6075b193462 ("net: hns3: Record VF vlan tables")
Signed-off-by: Guojia Liao <liaoguojia@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Yonglong Liu [Thu, 26 Aug 2021 11:21:57 +0000 (19:21 +0800)]
net: hns3: fix speed unknown issue in bond 4
In bond 4, when the link goes down and up repeatedly, the bond may get an
unknown speed, after which the port cannot work.
The driver calls netif_carrier_on() before updating the link state. When
the bond receives the carrier-on notification, it queries the speed of the
port; if that query happens before the link state is updated, it gets an
unknown speed. So call netif_carrier_on() only after updating the link state.
Fixes: 46a3df9f9718 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Fixes: e2cb1dec9779 ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support")
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Yufeng Mo [Thu, 26 Aug 2021 11:21:56 +0000 (19:21 +0800)]
net: hns3: add waiting time before cmdq memory is released
After the cmdq registers are cleared, the firmware may take time to
clear out possible left over commands in the cmdq. Driver must release
cmdq memory only after firmware has completed processing of left over
commands.
Fixes: 232d0d55fca6 ("net: hns3: uninitialize command queue while unloading PF driver")
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Yufeng Mo [Thu, 26 Aug 2021 11:21:55 +0000 (19:21 +0800)]
net: hns3: clear hardware resource when loading driver
If a PF is bonded to a virtual machine and the virtual machine exits
unexpectedly, some hardware resource cannot be cleared. In this case,
loading driver may cause exceptions. Therefore, the hardware resource
needs to be cleared when the driver is loaded.
Fixes: 46a3df9f9718 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support")
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Kyle Tso [Thu, 26 Aug 2021 12:42:01 +0000 (20:42 +0800)]
usb: typec: tcpm: Raise vdm_sm_running flag only when VDM SM is running
If the port is going to send Discover_Identity Message, vdm_sm_running
flag was intentionally set before entering Ready States in order to
avoid the conflict because the port and the port partner might start
AMS at almost the same time after entering Ready States.
However, the original design has a problem. When the port is doing
DR_SWAP from Device to Host, it raises the flag. Later in the
tcpm_send_discover_work, the flag blocks the procedure of sending the
Discover_Identity and it might never be cleared until disconnection.
Since another flag, send_discover, already indicates whether the port
is going to send Discover_Identity, it is enough to use that flag
to prevent the conflict. Also change the timing of the set/clear of
vdm_sm_running to indicate whether the VDM SM is actually running or
not.
Fixes: c34e85fa69b9 ("usb: typec: tcpm: Send DISCOVER_IDENTITY from dedicated work")
Cc: stable <stable@vger.kernel.org>
Cc: Badhri Jagan Sridharan <badhri@google.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Kyle Tso <kyletso@google.com>
Link: https://lore.kernel.org/r/20210826124201.1562502-1-kyletso@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Takashi Iwai [Thu, 26 Aug 2021 12:41:27 +0000 (14:41 +0200)]
usb: renesas-xhci: Prefer firmware loading on unknown ROM state
The recent attempt to handle an unknown ROM state in commit
d143825baf15 ("usb: renesas-xhci: Fix handling of unknown ROM state")
resulted in a regression and was later reverted by commit
44cf53602f5a ("Revert "usb: renesas-xhci: Fix handling of unknown ROM state"").
The problem with the former fix was that it treated a firmware loading
failure as a fatal error. Since the firmware files aren't
included in the standard linux-firmware tree, most users don't have
them, and so they ended up with a non-working system. The revert
fixed the regression, but it also meant that firmware loading is no
longer triggered even on the devices that do need it. So we still need
a fix for them.
This is another attempt to handle the unknown ROM state. Like the
previous fix, this also tries to load the firmware when ROM shows
unknown state. In this patch, however, the failure of a firmware
loading (such as a missing firmware file) isn't handled as a fatal
error any longer when ROM has been already detected, but it falls back
to the ROM mode like before. The error is returned only when no ROM
is detected and the firmware loading failed.
Along with it, for simplifying the code flow, the detection and the
check of ROM is factored out from renesas_fw_check_running() and done
in the caller side, renesas_xhci_check_request_fw(). It avoids the
redundant ROM checks.
The patch was tested on Lenovo Thinkpad T14 gen (BIOS 1.34). Also it
was confirmed that no regression is seen on another Thinkpad T14
machine that has worked without the patch, too.
Fixes: 44cf53602f5a ("Revert "usb: renesas-xhci: Fix handling of unknown ROM state"")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
BugLink: https://bugzilla.opensuse.org/show_bug.cgi?id=1189207
Link: https://lore.kernel.org/r/20210826124127.14789-1-tiwai@suse.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Wesley Cheng [Wed, 25 Aug 2021 04:28:55 +0000 (21:28 -0700)]
usb: dwc3: gadget: Stop EP0 transfers during pullup disable
During a USB cable disconnect, or soft disconnect scenario, a pending
SETUP transaction may not be completed, leading to the following
error:
dwc3 a600000.dwc3: timed out waiting for SETUP phase
If this occurs, then the entire pullup disable routine is skipped and
proper cleanup and halting of the controller does not complete.
Instead of returning an error (which is ignored from the UDC
perspective), allow the pullup disable routine to continue, which
will also handle disabling of EP0/1. This will end any active
transfers as well. Ensure to clear any delayed_status also, as the
timeout could happen within the STATUS stage.
Fixes: bb0147364850 ("usb: dwc3: gadget: don't clear RUN/STOP when it's invalid to do so")
Cc: <stable@vger.kernel.org>
Reviewed-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
Acked-by: Felipe Balbi <balbi@kernel.org>
Signed-off-by: Wesley Cheng <wcheng@codeaurora.org>
Link: https://lore.kernel.org/r/20210825042855.7977-1-wcheng@codeaurora.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Arnd Bergmann [Mon, 23 Aug 2021 21:41:27 +0000 (23:41 +0200)]
Merge tag 'reset-fixes-for-v5.14' of git://git.pengutronix.de/pza/linux into arm/fixes
Reset controller fixes for v5.14
Hide the Sparx5 reset driver unless the ARCH_SPARX5 or COMPILE_TEST
options are enabled, to avoid unnecessarily asking users about this
driver. Fix a return value argument type in the ZynqMP reset driver.
* tag 'reset-fixes-for-v5.14' of git://git.pengutronix.de/pza/linux:
reset: reset-zynqmp: Fixed the argument data type
reset: RESET_MCHP_SPARX5 should depend on ARCH_SPARX5
Link: https://lore.kernel.org/r/e543959c5b5ee7b25686f81049bf187d602daeda.camel@pengutronix.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Thinh Nguyen [Thu, 19 Aug 2021 01:17:03 +0000 (03:17 +0200)]
usb: dwc3: gadget: Fix dwc3_calc_trbs_left()
We can't depend on the TRB's HWO bit to determine if the TRB ring is
"full". A TRB is only available once the driver has processed it, not
when the controller has consumed it and relinquished its ownership to the
driver. Otherwise, the driver may overwrite unprocessed TRBs. This can
happen when many transfer events accumulate and the system is slow to
process them and/or when there are too many small requests.
If a request is in the started_list, one or more of its TRBs remain
unprocessed. Check this instead of the TRB's HWO bit to decide
whether the TRB ring is full.
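A minimal sketch of the idea on a generic ring buffer. RING_SIZE,
struct ring_state and ring_slots_left() are hypothetical, and this
simplified model ignores controller specifics such as reserved slots.
When the enqueue and dequeue indices are equal the ring is either empty
or full, and the presence of unprocessed work decides which:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 8u /* hypothetical size; must be a power of two */

/* Hypothetical ring bookkeeping kept by the driver side. */
struct ring_state {
	unsigned int enqueue;  /* next slot the driver will fill */
	unsigned int dequeue;  /* next slot the driver will process */
	bool has_unprocessed;  /* true while a started request is pending */
};

/* Number of slots the driver may safely reuse. */
static unsigned int ring_slots_left(const struct ring_state *r)
{
	assert(r != NULL);
	assert(r->enqueue < RING_SIZE && r->dequeue < RING_SIZE);

	/* Equal indices are ambiguous: decide by pending work, not by a
	 * per-slot ownership bit. */
	if (r->enqueue == r->dequeue)
		return r->has_unprocessed ? 0u : RING_SIZE;

	/* Distance from enqueue forward to dequeue, modulo the ring size. */
	return (r->dequeue - r->enqueue) & (RING_SIZE - 1u);
}
```

The point of the design is that availability follows what the driver
has processed, so a slot the hardware has released but the driver has
not yet handled is never handed out again.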
Fixes: c4233573f6ee ("usb: dwc3: gadget: prepare TRBs on update transfers too")
Cc: <stable@vger.kernel.org>
Acked-by: Felipe Balbi <balbi@kernel.org>
Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
Link: https://lore.kernel.org/r/e91e975affb0d0d02770686afc3a5b9eb84409f6.1629335416.git.Thinh.Nguyen@synopsys.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Swati Sharma [Thu, 12 Aug 2021 13:11:07 +0000 (18:41 +0530)]
drm/i915/dp: Drop redundant debug print
drm_dp_dpcd_read/write already print a debug error message.
Drop the redundant error messages, which report a false
status even if the correct value is read in drm_dp_dpcd_read().
v2: -Added fixes tag (Ankit)
v3: -Fixed build error (CI)
Fixes: 9488a030ac91 ("drm/i915: Add support for enabling link status and recovery")
Cc: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Manasi Navare <manasi.d.navare@intel.com>
Cc: Sean Paul <seanpaul@chromium.org>
Cc: Uma Shankar <uma.shankar@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.12+
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Swati Sharma <swati2.sharma@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210812131107.5531-1-swati2.sharma@intel.com
(cherry picked from commit b6dfa416172939edaa46a5a647457b94c6d94119)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Matthew Brost [Fri, 30 Jul 2021 19:53:42 +0000 (12:53 -0700)]
drm/i915: Fix syncmap memory leak
A small race exists between intel_gt_retire_requests_timeout and
intel_timeline_exit which could result in the syncmap not getting
free'd. Rather than work too hard to seal this race, simply clean up the
syncmap on fini.
unreferenced object 0xffff88813bc53b18 (size 96):
  comm "gem_close_race", pid 5410, jiffies 4294917818 (age 1105.600s)
  hex dump (first 32 bytes):
    01 00 00 00 00 00 00 00 00 00 00 00 0a 00 00 00 ................
    00 00 00 00 00 00 00 00 6b 6b 6b 6b 06 00 00 00 ........kkkk....
  backtrace:
    [<00000000120b863a>] __sync_alloc_leaf+0x1e/0x40 [i915]
    [<00000000042f6959>] __sync_set+0x1bb/0x240 [i915]
    [<0000000090f0e90f>] i915_request_await_dma_fence+0x1c7/0x400 [i915]
    [<0000000056a48219>] i915_request_await_object+0x222/0x360 [i915]
    [<00000000aaac4ee3>] i915_gem_do_execbuffer+0x1bd0/0x2250 [i915]
    [<000000003c9d830f>] i915_gem_execbuffer2_ioctl+0x405/0xce0 [i915]
    [<00000000fd7a8e68>] drm_ioctl_kernel+0xb0/0xf0 [drm]
    [<00000000e721ee87>] drm_ioctl+0x305/0x3c0 [drm]
    [<000000008b0d8986>] __x64_sys_ioctl+0x71/0xb0
    [<0000000076c362a4>] do_syscall_64+0x33/0x80
    [<00000000eb7a4831>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Fixes: 531958f6f357 ("drm/i915/gt: Track timeline activeness in enter/exit")
Cc: <stable@vger.kernel.org>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210730195342.110234-1-matthew.brost@intel.com
(cherry picked from commit faf890985e30d5e88cc3a7c50c1bcad32f89ab7c)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
王贇 [Thu, 26 Aug 2021 03:42:42 +0000 (11:42 +0800)]
net: fix NULL pointer reference in cipso_v4_doi_free
In netlbl_cipsov4_add_std(), when the allocation of 'doi_def->map.std'
fails, we sometimes observe a panic:
BUG: kernel NULL pointer dereference, address:
...
RIP: 0010:cipso_v4_doi_free+0x3a/0x80
...
Call Trace:
netlbl_cipsov4_add_std+0xf4/0x8c0
netlbl_cipsov4_add+0x13f/0x1b0
genl_family_rcv_msg_doit.isra.15+0x132/0x170
genl_rcv_msg+0x125/0x240
This is because cipso_v4_doi_free() does not check
'doi_def->map.std' when 'doi_def->type' equals 1, which
is possible since netlbl_cipsov4_add_std() has not initialized
it before allocating 'doi_def->map.std'.
This patch adds the check to prevent a panic in this and similar
cases.
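A minimal sketch of this kind of guard in a free routine. The types
and names below (struct doi_def, struct std_map, doi_free(), MAP_STD)
are hypothetical simplifications, not the CIPSO structures themselves:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical mapping types; MAP_STD stands for the type that owns a map. */
enum map_type { MAP_PASS = 0, MAP_STD = 1 };

struct std_map {
	int *levels;
	int *cats;
};

struct doi_def {
	enum map_type type;
	struct std_map *std; /* may still be NULL if allocation failed */
};

/* Free a definition, tolerating a map that was never allocated. */
static void doi_free(struct doi_def *def)
{
	if (def == NULL)
		return;
	assert(def->type == MAP_PASS || def->type == MAP_STD);
	/* Only touch the map's members when the map itself exists. */
	if (def->type == MAP_STD && def->std != NULL) {
		free(def->std->levels);
		free(def->std->cats);
		free(def->std);
	}
	free(def);
}
```

Without the NULL test, the error path that runs after a failed map
allocation would dereference a NULL pointer while cleaning up.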
Reported-by: Abaci <abaci@linux.alibaba.com>
Signed-off-by: Michael Wang <yun.wang@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrey Ignatov [Thu, 26 Aug 2021 00:25:40 +0000 (17:25 -0700)]
rtnetlink: Return correct error on changing device netns
Currently, when a device is moved between network namespaces using the
RTM_NEWLINK message type and one of the netns attributes (IFLA_NET_NS_PID,
IFLA_NET_NS_FD, IFLA_TARGET_NETNSID) but without specifying IFLA_IFNAME, and
the target namespace already has a device with the same name, userspace gets
EINVAL, which is confusing and makes debugging harder.
Fix it so that userspace gets the more appropriate EEXIST instead, which
makes debugging much easier.
Before:
# ./ifname.sh
+ ip netns add ns0
+ ip netns exec ns0 ip link add l0 type dummy
+ ip netns exec ns0 ip link show l0
8: l0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 66:90:b5:d5:78:69 brd ff:ff:ff:ff:ff:ff
+ ip link add l0 type dummy
+ ip link show l0
10: l0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 6e:c6:1f:15:20:8d brd ff:ff:ff:ff:ff:ff
+ ip link set l0 netns ns0
RTNETLINK answers: Invalid argument
After:
# ./ifname.sh
+ ip netns add ns0
+ ip netns exec ns0 ip link add l0 type dummy
+ ip netns exec ns0 ip link show l0
8: l0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 1e:4a:72:e3:e3:8f brd ff:ff:ff:ff:ff:ff
+ ip link add l0 type dummy
+ ip link show l0
10: l0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether f2:fc:fe:2b:7d:a6 brd ff:ff:ff:ff:ff:ff
+ ip link set l0 netns ns0
RTNETLINK answers: File exists
The problem is that do_setlink() passes its `char *ifname` argument,
that it gets from a caller, to __dev_change_net_namespace() as is (as
`const char *pat`), but semantics of ifname and pat can be different.
For example, __rtnl_newlink() does this:
net/core/rtnetlink.c
3270 char ifname[IFNAMSIZ];
...
3286 if (tb[IFLA_IFNAME])
3287 nla_strscpy(ifname, tb[IFLA_IFNAME], IFNAMSIZ);
3288 else
3289 ifname[0] = '\0';
...
3364 if (dev) {
...
3394 return do_setlink(skb, dev, ifm, extack, tb, ifname, status);
3395 }
, i.e. do_setlink() gets ifname pointer that is always valid no matter
if user specified IFLA_IFNAME or not and then do_setlink() passes this
ifname pointer as is to __dev_change_net_namespace() as pat argument.
But the pat (pattern) in __dev_change_net_namespace() is used as:
net/core/dev.c
11198 err = -EEXIST;
11199 if (__dev_get_by_name(net, dev->name)) {
11200 /* We get here if we can't use the current device name */
11201 if (!pat)
11202 goto out;
11203 err = dev_get_valid_name(net, dev, pat);
11204 if (err < 0)
11205 goto out;
11206 }
As a result, the `goto out` path on line 11202 is never taken and
instead of returning EEXIST defined on line 11198,
__dev_change_net_namespace() returns an error from dev_get_valid_name()
and this, in turn, will be EINVAL for ifname[0] = '\0' set earlier.
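The difference between a NULL pattern and an empty one can be modelled
in a few lines. This is only a sketch: validate_name_pattern(),
move_device() and pattern_from_ifname() are hypothetical stand-ins for
the kernel functions named above, and normalising the empty string to
NULL on the caller side is shown as one possible way to reach the
EEXIST path:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the name validation: an empty pattern is invalid. */
static int validate_name_pattern(const char *pat)
{
	assert(pat != NULL);
	return pat[0] == '\0' ? -EINVAL : 0;
}

/* Stand-in for the name-collision handling in the target namespace. */
static int move_device(bool name_taken, const char *pat)
{
	if (!name_taken)
		return 0;
	/* No pattern to fall back on: report the collision itself. */
	if (pat == NULL)
		return -EEXIST;
	return validate_name_pattern(pat);
}

/* Treat an empty ifname buffer as "no pattern given". */
static const char *pattern_from_ifname(const char *ifname)
{
	return (ifname != NULL && ifname[0] != '\0') ? ifname : NULL;
}
```

Passing the raw, empty buffer yields -EINVAL, while passing the
normalised pointer yields -EEXIST for the same collision.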
Fixes: d8a5ec672768 ("[NET]: netlink support for moving devices between network namespaces.")
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 26 Aug 2021 09:26:06 +0000 (10:26 +0100)]
Merge branch 'dsa-hellcreek-fixes'
Kurt Kanzenbach says:
====================
net: dsa: hellcreek: 802.1Qbv Fixes
while using TAPRIO offloading on the Hirschmann hellcreek switch, I've noticed
two issues in the current implementation:
1. The gate control list is incorrectly programmed
2. The admin base time is not set properly
Fix it.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Kurt Kanzenbach [Wed, 25 Aug 2021 13:58:13 +0000 (15:58 +0200)]
net: dsa: hellcreek: Adjust schedule look ahead window
Traffic schedules can only be started up to eight seconds in the
future. Therefore, the driver checks every two seconds whether the
admin base time provided by the user is inside that window. If so the schedule
is started. Otherwise the check is deferred.
However, according to the programming manual the look ahead window size should
be four - not eight - seconds. By using the proposed value of four seconds
starting a schedule at a specified admin base time actually works as expected.
Fixes: 24dfc6eb39b2 ("net: dsa: hellcreek: Add TAPRIO offloading support")
Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kurt Kanzenbach [Wed, 25 Aug 2021 13:58:12 +0000 (15:58 +0200)]
net: dsa: hellcreek: Fix incorrect setting of GCL
Currently the gate control list which is programmed into the hardware is
incorrect, resulting in wrong traffic schedules. The problem is that the loop
variables are incremented before they are referenced. Therefore, move the
increment to the end of the loop.
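A minimal sketch of the corrected loop shape, with a hypothetical
struct gate_entry and program_gate_list() standing in for the driver's
data and register writes. The cursor is dereferenced first and advanced
only at the end of each iteration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical schedule entry; not the driver's real layout. */
struct gate_entry {
	uint8_t gate_mask;
	uint32_t interval_ns;
};

/* Copy n entries into a table that models the hardware list. */
static size_t program_gate_list(const struct gate_entry *entries, size_t n,
				struct gate_entry *hw_table)
{
	const struct gate_entry *cur = entries;
	size_t i;

	assert(entries != NULL && hw_table != NULL);
	for (i = 0; i < n; i++) {
		hw_table[i] = *cur; /* reference the current entry first */
		cur++;              /* advance only at the end of the loop */
	}
	return i;
}
```

Incrementing the cursor at the top of the body instead would skip the
first entry and read one element past the last.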
Fixes: 24dfc6eb39b2 ("net: dsa: hellcreek: Add TAPRIO offloading support")
Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rahul Lakkireddy [Wed, 25 Aug 2021 21:29:42 +0000 (02:59 +0530)]
cxgb4: dont touch blocked freelist bitmap after free
When adapter init fails, the blocked freelist bitmap is already freed
up and should not be touched. So, move the bitmap zeroing closer to
where it was successfully allocated. Also handle adapter init failure
unwind path immediately and avoid setting up RDMA memory windows.
Fixes: 5b377d114f2b ("cxgb4: Add debugfs facility to inject FL starvation")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 26 Aug 2021 09:20:34 +0000 (10:20 +0100)]
Merge branch 'inet-siphash'
Eric Dumazet says:
====================
inet: use siphash in exception handling
A group of security researchers brought to our attention
the weakness of hash functions used in rt6_exception_hash()
and fnhe_hashfun()
I made two distinct patches to help backports, since IPv6
part was added in 4.15
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 25 Aug 2021 23:17:29 +0000 (16:17 -0700)]
ipv4: use siphash instead of Jenkins in fnhe_hashfun()
A group of security researchers brought to our attention
the weakness of hash function used in fnhe_hashfun().
Let's use siphash instead of Jenkins Hash, to considerably
reduce security risks.
Also remove the inline keyword, this really is distracting.
Fixes: d546c621542d ("ipv4: harden fnhe_hashfun()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Keyu Man <kman001@ucr.edu>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>