Sean Christopherson [Sat, 29 Jul 2023 00:36:42 +0000 (17:36 -0700)]
KVM: selftests: Print out guest RIP on unhandled exception
Use the newfanged printf-based guest assert framework to spit out the
guest RIP when an unhandled exception is detected, which makes debugging
such failures *much* easier.
Link: https://lore.kernel.org/r/20230729003643.1053367-34-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Sean Christopherson [Sat, 29 Jul 2023 00:36:41 +0000 (17:36 -0700)]
KVM: selftests: Rip out old, param-based guest assert macros
Drop the param-based guest assert macros and enable the printf versions
for all selftests. Note! This change can affect tests even if they
don't use directly use guest asserts! E.g. via library code, or due to
the compiler making different optimization decisions.
Link: https://lore.kernel.org/r/20230729003643.1053367-33-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Sean Christopherson [Sat, 29 Jul 2023 00:36:40 +0000 (17:36 -0700)]
KVM: selftests: Convert x86's XCR0 test to use printf-based guest asserts
Convert x86's XCR0 vs. CPUID test to use printf-based guest asserts.
Link: https://lore.kernel.org/r/20230729003643.1053367-32-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Sean Christopherson [Sat, 29 Jul 2023 00:36:39 +0000 (17:36 -0700)]
KVM: selftests: Convert VMX's PMU capabilities test to printf guest asserts
Convert x86's VMX PMU capabilities test to use printf-based guest asserts.
Opportunstically add a helper to do the WRMSR+assert so as to reduce the
amount of copy+paste needed to spit out debug information.
Link: https://lore.kernel.org/r/20230729003643.1053367-31-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Sean Christopherson [Sat, 29 Jul 2023 00:36:38 +0000 (17:36 -0700)]
KVM: selftests: Convert the x86 userspace I/O test to printf guest assert
Convert x86's userspace I/O test to use printf-based guest asserts.
Link: https://lore.kernel.org/r/20230729003643.1053367-30-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Sean Christopherson [Sat, 29 Jul 2023 00:36:37 +0000 (17:36 -0700)]
KVM: selftests: Convert x86's TSC MSRs test to use printf guest asserts
Convert x86's TSC MSRs test, and it's liberal use of GUEST_ASSERT_EQ(), to
use printf-based guest assert reporting.
Link: https://lore.kernel.org/r/20230729003643.1053367-29-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Sean Christopherson [Sat, 29 Jul 2023 00:36:36 +0000 (17:36 -0700)]
KVM: selftests: Convert the nSVM software interrupt test to printf guest asserts
Convert x86's nested SVM software interrupt injection test to use printf-
based guest asserts. Opportunistically use GUEST_ASSERT() and
GUEST_FAIL() in a few locations to spit out more debug information.
Link: https://lore.kernel.org/r/20230729003643.1053367-28-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Sean Christopherson [Sat, 29 Jul 2023 00:36:35 +0000 (17:36 -0700)]
KVM: selftests: Convert x86's set BSP ID test to printf style guest asserts
Convert the set_boot_cpu_id test to use printf-based guest asserts,
specifically the EQ and NE variants.
Link: https://lore.kernel.org/r/20230729003643.1053367-27-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Sean Christopherson [Sat, 29 Jul 2023 00:36:34 +0000 (17:36 -0700)]
KVM: selftests: Convert x86's nested exceptions test to printf guest asserts
Convert x86's nested exceptions test to printf-based guest asserts, and
use REPORT_GUEST_ASSERT() instead of TEST_FAIL() so that output is
formatted correctly.
Link: https://lore.kernel.org/r/20230729003643.1053367-26-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Sean Christopherson [Sat, 29 Jul 2023 00:36:33 +0000 (17:36 -0700)]
KVM: selftests: Convert the MONITOR/MWAIT test to use printf guest asserts
Convert x86's MONITOR/MWAIT test to use printf-based guest asserts. Add a
macro to handle reporting failures to reduce the amount of copy+paste
needed for MONITOR vs. MWAIT.
Link: https://lore.kernel.org/r/20230729003643.1053367-25-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Sean Christopherson [Sat, 29 Jul 2023 00:36:32 +0000 (17:36 -0700)]
KVM: selftests: Convert x86's KVM paravirt test to printf style GUEST_ASSERT
Convert x86's KVM paravirtualization test to use the printf-based
GUEST_ASSERT_EQ().
Link: https://lore.kernel.org/r/20230729003643.1053367-24-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Sean Christopherson [Sat, 29 Jul 2023 00:36:31 +0000 (17:36 -0700)]
KVM: selftests: Convert the Hyper-V feature test to printf style GUEST_ASSERT
Convert x86's Hyper-V feature test to use print-based guest asserts.
Opportunistically use the EQ and NE variants in a few places to capture
additional information.
Link: https://lore.kernel.org/r/20230729003643.1053367-23-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Sean Christopherson [Sat, 29 Jul 2023 00:36:30 +0000 (17:36 -0700)]
KVM: selftests: Convert the Hyper-V extended hypercalls test to printf asserts
Convert x86's Hyper-V extended hypercalls test to use printf-based
GUEST_ASSERT_EQ().
Link: https://lore.kernel.org/r/20230729003643.1053367-22-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Sean Christopherson [Sat, 29 Jul 2023 00:36:29 +0000 (17:36 -0700)]
KVM: selftests: Convert x86's CPUID test to printf style GUEST_ASSERT
Convert x86's CPUID test to use printf-based GUEST_ASSERT_EQ() so that
the test prints out debug information. Note, the test previously used
REPORT_GUEST_ASSERT_2(), but that was pointless because none of the
guest-side code passed any parameters to the assert.
Link: https://lore.kernel.org/r/20230729003643.1053367-21-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Sean Christopherson [Sat, 29 Jul 2023 00:36:28 +0000 (17:36 -0700)]
KVM: selftests: Convert steal_time test to printf style GUEST_ASSERT
Convert the steal_time test to use printf-based GUEST_ASERT.
Opportunistically use GUEST_ASSERT_EQ() and GUEST_ASSERT_NE() so that the
test spits out debug information on failure.
Link: https://lore.kernel.org/r/20230729003643.1053367-20-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Sean Christopherson [Sat, 29 Jul 2023 00:36:27 +0000 (17:36 -0700)]
KVM: selftests: Convert set_memory_region_test to printf-based GUEST_ASSERT
Convert set_memory_region_test to print-based GUEST_ASSERT, using a combo
of newfangled macros to report (hopefully) useful information.
Link: https://lore.kernel.org/r/20230729003643.1053367-19-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Sean Christopherson [Sat, 29 Jul 2023 00:36:26 +0000 (17:36 -0700)]
KVM: selftests: Convert s390's tprot test to printf style GUEST_ASSERT
Convert s390's tprot test to printf-based GUEST_ASSERT.
Link: https://lore.kernel.org/r/20230729003643.1053367-18-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Sean Christopherson [Sat, 29 Jul 2023 00:36:25 +0000 (17:36 -0700)]
KVM: selftests: Convert s390's memop test to printf style GUEST_ASSERT
Convert s390's memop test to printf-based GUEST_ASSERT, and
opportunistically use GUEST_FAIL() to report invalid sizes.
Link: https://lore.kernel.org/r/20230729003643.1053367-17-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Sean Christopherson [Sat, 29 Jul 2023 00:36:24 +0000 (17:36 -0700)]
KVM: selftests: Convert the memslot performance test to printf guest asserts
Use the printf-based GUEST_ASSERT_EQ() in the memslot perf test instead of
an half-baked open code version.
Link: https://lore.kernel.org/r/20230729003643.1053367-16-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Sean Christopherson [Sat, 29 Jul 2023 00:36:23 +0000 (17:36 -0700)]
KVM: selftests: Convert ARM's vGIC IRQ test to printf style GUEST_ASSERT
Use printf-based guest assert reporting in ARM's vGIC IRQ test. Note,
this is not as innocuous as it looks! The printf-based version of
GUEST_ASSERT_EQ() ensures the expressions are evaluated only once, whereas
the old version did not!
Link: https://lore.kernel.org/r/20230729003643.1053367-15-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Sean Christopherson [Sat, 29 Jul 2023 00:36:22 +0000 (17:36 -0700)]
KVM: selftests: Convert ARM's page fault test to printf style GUEST_ASSERT
Use GUEST_FAIL() in ARM's page fault test to report unexpected faults.
Link: https://lore.kernel.org/r/20230729003643.1053367-14-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Sean Christopherson [Sat, 29 Jul 2023 00:36:21 +0000 (17:36 -0700)]
KVM: selftests: Convert ARM's hypercalls test to printf style GUEST_ASSERT
Convert ARM's hypercalls test to use printf-based GUEST_ASSERT().
Opportunistically use GUEST_FAIL() to complain about an unexpected stage.
Link: https://lore.kernel.org/r/20230729003643.1053367-13-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Sean Christopherson [Sat, 29 Jul 2023 00:36:20 +0000 (17:36 -0700)]
KVM: selftests: Convert debug-exceptions to printf style GUEST_ASSERT
Convert ARM's debug exceptions test to use printf-based GUEST_ASSERT().
Opportunistically Use GUEST_ASSERT_EQ() in guest_code_ss() so that the
expected vs. actual values get printed out.
Link: https://lore.kernel.org/r/20230729003643.1053367-12-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Sean Christopherson [Sat, 29 Jul 2023 00:36:19 +0000 (17:36 -0700)]
KVM: selftests: Convert aarch_timer to printf style GUEST_ASSERT
Convert ARM's aarch_timer test to use printf-based GUEST_ASSERT().
To maintain existing functionality, manually print the host information,
e.g. stage and iteration, to stderr prior to reporting the guest assert.
Link: https://lore.kernel.org/r/20230729003643.1053367-11-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Aaron Lewis [Mon, 31 Jul 2023 20:30:26 +0000 (13:30 -0700)]
KVM: selftests: Add a selftest for guest prints and formatted asserts
Add a test to exercise the various features in KVM selftest's local
snprintf() and compare them to LIBC's snprintf() to ensure they behave
the same.
This is not an exhaustive test. KVM's local snprintf() does not
implement all the features LIBC does, e.g. KVM's local snprintf() does
not support floats or doubles, so testing for those features were
excluded.
Testing was added for the features that are expected to work to
support a minimal version of printf() in the guest.
Signed-off-by: Aaron Lewis <aaronlewis@google.com>
[sean: use UCALL_EXIT_REASON, enable for all architectures]
Link: https://lore.kernel.org/r/20230731203026.1192091-4-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Sean Christopherson [Mon, 31 Jul 2023 20:30:25 +0000 (13:30 -0700)]
KVM: selftests: Add #define of expected KVM exit reason for ucall
Define the expected architecture specific exit reason for a successful
ucall so that common tests can assert that a ucall occurred without the
test needing to implement arch specific code.
Suggested-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20230731203026.1192091-3-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Sean Christopherson [Mon, 31 Jul 2023 20:30:24 +0000 (13:30 -0700)]
KVM: selftests: Add arch ucall.h and inline simple arch hooks
Add an architecture specific ucall.h and inline the simple arch hooks,
e.g. the init hook for everything except ARM, and the actual "do ucall"
hook for everything except x86 (which should be simple, but temporarily
isn't due to carrying a workaround).
Having a per-arch ucall header will allow adding a #define for the
expected KVM exit reason for a ucall that is colocated (for everything
except x86) with the ucall itself.
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20230731203026.1192091-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Sean Christopherson [Sat, 29 Jul 2023 00:36:17 +0000 (17:36 -0700)]
KVM: selftests: Add formatted guest assert support in ucall framework
Add printf-based GUEST_ASSERT macros and accompanying host-side support to
provide an assert-specific versions of GUEST_PRINTF(). To make it easier
to parse assert messages, for humans and bots alike, preserve/use the same
layout as host asserts, e.g. in the example below, the reported expression,
file, line number, and message are from the guest assertion, not the host
reporting of the assertion.
The call stack still captures the host reporting, but capturing the guest
stack is a less pressing concern, i.e. can be done in the future, and an
optimal solution would capture *both* the host and guest stacks, i.e.
capturing the host stack isn't an outright bug.
Running soft int test
==== Test Assertion Failure ====
x86_64/svm_nested_soft_inject_test.c:39: regs->rip != (unsigned long)l2_guest_code_int
pid=214104 tid=214104 errno=4 - Interrupted system call
1 0x0000000000401b35: run_test at svm_nested_soft_inject_test.c:191
2 0x00000000004017d2: main at svm_nested_soft_inject_test.c:212
3 0x0000000000415b03: __libc_start_call_main at libc-start.o:?
4 0x000000000041714f: __libc_start_main_impl at ??:?
5 0x0000000000401660: _start at ??:?
Expected IRQ at RIP 0x401e50, received IRQ at 0x401e50
Don't bother sharing code between ucall_assert() and ucall_fmt(), as
forwarding the variable arguments would either require using macros or
building a va_list, i.e. would make the code less readable and/or require
just as much copy+paste code anyways.
Gate the new macros with a flag so that tests can more or less be switched
over one-by-one. The slow conversion won't be perfect, e.g. library code
won't pick up the flag, but the only asserts in library code are of the
vanilla GUEST_ASSERT() variety, i.e. don't print out variables.
Add a temporary alias to GUEST_ASSERT_1() to fudge around ARM's
arch_timer.h header using GUEST_ASSERT_1(), thus thwarting any attempt to
convert tests one-by-one.
Link: https://lore.kernel.org/r/20230729003643.1053367-9-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Aaron Lewis [Sat, 29 Jul 2023 00:36:16 +0000 (17:36 -0700)]
KVM: selftests: Add string formatting options to ucall
Add more flexibility to guest debugging and testing by adding
GUEST_PRINTF() and GUEST_ASSERT_FMT() to the ucall framework.
Add a sized buffer to the ucall structure to hold the formatted string,
i.e. to allow the guest to easily resolve the string, and thus avoid the
ugly pattern of the host side having to make assumptions about the desired
format, as well as having to pass around a large number of parameters.
The buffer size was chosen to accommodate most use cases, and based on
similar usage. E.g. printf() uses the same size buffer in
arch/x86/boot/printf.c. And 1KiB ought to be enough for anybody.
Signed-off-by: Aaron Lewis <aaronlewis@google.com>
[sean: massage changelog, wrap macro param in ()]
Link: https://lore.kernel.org/r/20230729003643.1053367-8-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Aaron Lewis [Sat, 29 Jul 2023 00:36:15 +0000 (17:36 -0700)]
KVM: selftests: Add additional pages to the guest to accommodate ucall
Add additional pages to the guest to account for the number of pages
the ucall headers need. The only reason things worked before is the
ucall headers are fairly small. If they were ever to increase in
size the guest could run out of memory.
This is done in preparation for adding string formatting options to
the guest through the ucall framework which increases the size of
the ucall headers.
Fixes:
426729b2cf2e ("KVM: selftests: Add ucall pool based implementation")
Signed-off-by: Aaron Lewis <aaronlewis@google.com>
Link: https://lore.kernel.org/r/20230729003643.1053367-7-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Aaron Lewis [Sat, 29 Jul 2023 00:36:14 +0000 (17:36 -0700)]
KVM: selftests: Add guest_snprintf() to KVM selftests
Add a local version of guest_snprintf() for use in the guest.
Having a local copy allows the guest access to string formatting
options without dependencies on LIBC. LIBC is problematic because
it heavily relies on both AVX-512 instructions and a TLS, neither of
which are guaranteed to be set up in the guest.
The file guest_sprintf.c was lifted from arch/x86/boot/printf.c and
adapted to work in the guest, including the addition of buffer length.
I.e. s/sprintf/snprintf/
The functions where prefixed with "guest_" to allow guests to
explicitly call them.
A string formatted by this function is expected to succeed or die. If
something goes wrong during the formatting process a GUEST_ASSERT()
will be thrown.
Signed-off-by: Aaron Lewis <aaronlewis@google.com>
Link: https://lore.kernel.org/all/mtdi6smhur5rqffvpu7qux7mptonw223y2653x2nwzvgm72nlo@zyc4w3kwl3rg
[sean: add a link to the discussion of other options]
Link: https://lore.kernel.org/r/20230729003643.1053367-6-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Aaron Lewis [Sat, 29 Jul 2023 00:36:13 +0000 (17:36 -0700)]
KVM: selftests: Add strnlen() to the string overrides
Add strnlen() to the string overrides to allow it to be called in the
guest.
The implementation for strnlen() was taken from the kernel's generic
version, lib/string.c.
This will be needed when printf() is introduced.
Signed-off-by: Aaron Lewis <aaronlewis@google.com>
Link: https://lore.kernel.org/r/20230729003643.1053367-5-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Sean Christopherson [Sat, 29 Jul 2023 00:36:12 +0000 (17:36 -0700)]
KVM: selftests: Add a shameful hack to preserve/clobber GPRs across ucall
Preserve or clobber all GPRs (except RIP and RSP, as they're saved and
restored via the VMCS) when performing a ucall on x86 to fudge around a
horrific long-standing bug in selftests' nested VMX support where L2's
GPRs are not preserved across a nested VM-Exit. I.e. if a test triggers a
nested VM-Exit to L1 in response to a ucall, e.g. GUEST_SYNC(), then L2's
GPR state can be corrupted.
The issues manifests as an unexpected #GP in clear_bit() when running the
hyperv_evmcs test due to RBX being used to track the ucall object, and RBX
being clobbered by the nested VM-Exit. The problematic hyperv_evmcs
testcase is where L0 (test's host userspace) injects an NMI in response to
GUEST_SYNC(8) from L2, but the bug could "randomly" manifest in any test
that induces a nested VM-Exit from L0. The bug hasn't caused failures in
the past due to sheer dumb luck.
The obvious fix is to rework the nVMX helpers to save/restore L2 GPRs
across VM-Exit and VM-Enter, but that is a much bigger task and carries
its own risks, e.g. nSVM does save/restore GPRs, but not in a thread-safe
manner, and there is a _lot_ of cleanup that can be done to unify code
for doing VM-Enter on nVMX, nSVM, and eVMCS.
Link: https://lore.kernel.org/r/20230729003643.1053367-4-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Sean Christopherson [Sat, 29 Jul 2023 00:36:11 +0000 (17:36 -0700)]
KVM: selftests: Make TEST_ASSERT_EQ() output look like normal TEST_ASSERT()
Clean up TEST_ASSERT_EQ() so that the (mostly) raw code is captured in the
main assert message, not the helper macro's code. E.g. make this:
x86_64/tsc_msrs_test.c:106: __a == __b
pid=40470 tid=40470 errno=0 - Success
1 0x000000000040170e: main at tsc_msrs_test.c:106
2 0x0000000000416f23: __libc_start_call_main at libc-start.o:?
3 0x000000000041856f: __libc_start_main_impl at ??:?
4 0x0000000000401ef0: _start at ??:?
TEST_ASSERT_EQ(rounded_host_rdmsr(MSR_IA32_TSC), val + 1) failed.
rounded_host_rdmsr(MSR_IA32_TSC) is 0
val + 1 is 0x1
look like this:
x86_64/tsc_msrs_test.c:106: rounded_host_rdmsr(MSR_IA32_TSC) == val + 1
pid=5737 tid=5737 errno=0 - Success
1 0x0000000000401714: main at tsc_msrs_test.c:106
2 0x0000000000415c23: __libc_start_call_main at libc-start.o:?
3 0x000000000041726f: __libc_start_main_impl at ??:?
4 0x0000000000401e60: _start at ??:?
0 != 0x1 (rounded_host_rdmsr(MSR_IA32_TSC) != val + 1)
Opportunstically clean up the formatting of the entire macro.
Link: https://lore.kernel.org/r/20230729003643.1053367-3-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Thomas Huth [Wed, 12 Jul 2023 07:59:07 +0000 (09:59 +0200)]
KVM: selftests: Rename the ASSERT_EQ macro
There is already an ASSERT_EQ macro in the file
tools/testing/selftests/kselftest_harness.h, so currently KVM selftests
can't include test_util.h from the KVM selftests together with that file.
Rename the macro in the KVM selftests to TEST_ASSERT_EQ to avoid the
problem - it is also more similar to the other macros in test_util.h that
way.
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Link: https://lore.kernel.org/r/20230712075910.22480-2-thuth@redhat.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Minjie Du [Tue, 4 Jul 2023 12:21:47 +0000 (20:21 +0800)]
KVM: selftests: Remove superfluous variable assignment
Don't nullify "nodep" to NULL one line before it's set to "tmp".
Signed-off-by: Minjie Du <duminjie@vivo.com>
Link: https://lore.kernel.org/r/20230704122148.11573-1-duminjie@vivo.com
[sean: massage shortlog+changelog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
Bibo Mao [Mon, 31 Jul 2023 02:24:05 +0000 (10:24 +0800)]
KVM: selftests: use unified time type for comparison
With test case kvm_page_table_test, start time is acquired with
time type CLOCK_MONOTONIC_RAW, however end time in timespec_elapsed()
is acquired with time type CLOCK_MONOTONIC. This can cause inaccurate
elapsed time calculation due to mixing timebases, e.g. LoongArch in
particular will see weirdness.
Modify kvm_page_table_test to use unified time type CLOCK_MONOTONIC for
start time.
Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20230731022405.854884-1-maobibo@loongson.cn
[sean: massage changelog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
Michal Luczaj [Fri, 28 Jul 2023 00:12:58 +0000 (02:12 +0200)]
KVM: selftests: Extend x86's sync_regs_test to check for exception races
Attempt to set the to-be-queued exception to be both pending and injected
_after_ KVM_CAP_SYNC_REGS's kvm_vcpu_ioctl_x86_set_vcpu_events() squashes
the pending exception (if there's also an injected exception). Buggy KVM
versions will eventually yell loudly about having impossible state when
processing queued excpetions, e.g.
WARNING: CPU: 0 PID: 1115 at arch/x86/kvm/x86.c:10095 kvm_check_and_inject_events+0x220/0x500 [kvm]
arch/x86/kvm/x86.c:kvm_check_and_inject_events():
WARN_ON_ONCE(vcpu->arch.exception.injected &&
vcpu->arch.exception.pending);
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Link: https://lore.kernel.org/r/20230728001606.2275586-3-mhal@rbox.co
[sean: split to separate patch, massage changelog and comment]
Signed-off-by: Sean Christopherson <seanjc@google.com>
Michal Luczaj [Fri, 28 Jul 2023 00:12:58 +0000 (02:12 +0200)]
KVM: selftests: Extend x86's sync_regs_test to check for event vector races
Attempt to modify the to-be-injected exception vector to an illegal value
_after_ the sanity checks performed by KVM_CAP_SYNC_REGS's
arch/x86/kvm/x86.c:kvm_vcpu_ioctl_x86_set_vcpu_events(). Buggy KVM
versions will eventually yells loudly about attempting to inject a bogus
vector, e.g.
WARNING: CPU: 0 PID: 1107 at arch/x86/kvm/x86.c:547 kvm_check_and_inject_events+0x4a0/0x500 [kvm]
arch/x86/kvm/x86.c:exception_type():
WARN_ON(vector > 31 || vector == NMI_VECTOR)
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Link: https://lore.kernel.org/r/20230728001606.2275586-3-mhal@rbox.co
[sean: split to separate patch]
Signed-off-by: Sean Christopherson <seanjc@google.com>
Michal Luczaj [Fri, 28 Jul 2023 00:12:58 +0000 (02:12 +0200)]
KVM: selftests: Extend x86's sync_regs_test to check for CR4 races
Attempt to modify vcpu->run->s.regs _after_ the sanity checks performed by
KVM_CAP_SYNC_REGS's arch/x86/kvm/x86.c:sync_regs(). This can lead to some
nonsensical vCPU states accompanied by kernel splats, e.g. disabling PAE
while long mode is enabled makes KVM all kinds of confused:
WARNING: CPU: 0 PID: 1142 at arch/x86/kvm/mmu/paging_tmpl.h:358 paging32_walk_addr_generic+0x431/0x8f0 [kvm]
arch/x86/kvm/mmu/paging_tmpl.h:
KVM_BUG_ON(is_long_mode(vcpu) && !is_pae(vcpu), vcpu->kvm)
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Link: https://lore.kernel.org/r/20230728001606.2275586-3-mhal@rbox.co
[sean: see link]
Signed-off-by: Sean Christopherson <seanjc@google.com>
Michal Luczaj [Fri, 28 Jul 2023 00:12:57 +0000 (02:12 +0200)]
KVM: x86: Fix KVM_CAP_SYNC_REGS's sync_regs() TOCTOU issues
In a spirit of using a sledgehammer to crack a nut, make sync_regs() feed
__set_sregs() and kvm_vcpu_ioctl_x86_set_vcpu_events() with kernel's own
copy of data.
Both __set_sregs() and kvm_vcpu_ioctl_x86_set_vcpu_events() assume they
have exclusive rights to structs they operate on. While this is true when
coming from an ioctl handler (caller makes a local copy of user's data),
sync_regs() breaks this contract; a pointer to a user-modifiable memory
(vcpu->run->s.regs) is provided. This can lead to a situation when incoming
data is checked and/or sanitized only to be re-set by a user thread running
in parallel.
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Fixes:
01643c51bfcf ("KVM: x86: KVM_CAP_SYNC_REGS")
Link: https://lore.kernel.org/r/20230728001606.2275586-2-mhal@rbox.co
Signed-off-by: Sean Christopherson <seanjc@google.com>
Linus Torvalds [Sun, 16 Jul 2023 22:10:37 +0000 (15:10 -0700)]
Linux 6.5-rc2
Linus Torvalds [Sun, 16 Jul 2023 21:12:49 +0000 (14:12 -0700)]
Merge tag 'xtensa-
20230716' of https://github.com/jcmvbkbc/linux-xtensa
Pull xtensa fixes from Max Filippov:
- fix interaction between unaligned exception handler and load/store
exception handler
- fix parsing ISS network interface specification string
- add comment about etherdev freeing to ISS network driver
* tag 'xtensa-
20230716' of https://github.com/jcmvbkbc/linux-xtensa:
xtensa: fix unaligned and load/store configuration interaction
xtensa: ISS: fix call to split_if_spec
xtensa: ISS: add comment about etherdev freeing
Linus Torvalds [Sun, 16 Jul 2023 20:46:08 +0000 (13:46 -0700)]
Merge tag 'perf_urgent_for_v6.5_rc2' of git://git./linux/kernel/git/tip/tip
Pull perf fix from Borislav Petkov:
- Fix a lockdep warning when the event given is the first one, no event
group exists yet but the code still goes and iterates over event
siblings
* tag 'perf_urgent_for_v6.5_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86: Fix lockdep warning in for_each_sibling_event() on SPR
Linus Torvalds [Sun, 16 Jul 2023 20:34:29 +0000 (13:34 -0700)]
Merge tag 'objtool_urgent_for_v6.5_rc2' of git://git./linux/kernel/git/tip/tip
Pull objtool fixes from Borislav Petkov:
- Mark copy_iovec_from_user() __noclone in order to prevent gcc from
doing an inter-procedural optimization and confuse objtool
- Initialize struct elf fully to avoid build failures
* tag 'objtool_urgent_for_v6.5_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
iov_iter: Mark copy_iovec_from_user() noclone
objtool: initialize all of struct elf
Linus Torvalds [Sun, 16 Jul 2023 20:22:08 +0000 (13:22 -0700)]
Merge tag 'sched_urgent_for_v6.5_rc2' of git://git./linux/kernel/git/tip/tip
Pull scheduler fixes from Borislav Petkov:
- Remove a cgroup from under a polling process properly
- Fix the idle sibling selection
* tag 'sched_urgent_for_v6.5_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/psi: use kernfs polling functions for PSI trigger polling
sched/fair: Use recent_used_cpu to test p->cpus_ptr
Linus Torvalds [Sun, 16 Jul 2023 19:55:31 +0000 (12:55 -0700)]
Merge tag 'pinctrl-v6.5-2' of git://git./linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
"I'm mostly on vacation but what would vacation be without a few
critical fixes so people can use their gaming laptops when hiding away
from the sun (or rain)?
- Fix a really annoying interrupt storm in the AMD driver affecting
Asus TUF gaming notebooks
- Fix device tree parsing in the Renesas driver"
* tag 'pinctrl-v6.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: amd: Unify debounce handling into amd_pinconf_set()
pinctrl: amd: Drop pull up select configuration
pinctrl: amd: Use amd_pinconf_set() for all config options
pinctrl: amd: Only use special debounce behavior for GPIO 0
pinctrl: renesas: rzg2l: Handle non-unique subnode names
pinctrl: renesas: rzv2m: Handle non-unique subnode names
Linus Torvalds [Sun, 16 Jul 2023 19:49:05 +0000 (12:49 -0700)]
Merge tag '6.5-rc1-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:
- Two reconnect fixes: important fix to address inFlight count to leak
(which can leak credits), and fix for better handling a deleted share
- DFS fix
- SMB1 cleanup fix
- deferred close fix
* tag '6.5-rc1-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: fix mid leak during reconnection after timeout threshold
cifs: is_network_name_deleted should return a bool
smb: client: fix missed ses refcounting
smb: client: Fix -Wstringop-overflow issues
cifs: if deferred close is disabled then close files immediately
Linus Torvalds [Sun, 16 Jul 2023 19:28:04 +0000 (12:28 -0700)]
Merge tag 'powerpc-6.5-3' of git://git./linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- Fix Speculation_Store_Bypass reporting in /proc/self/status on
Power10
- Fix HPT with 4K pages since recent changes by implementing pmd_same()
- Fix 64-bit native_hpte_remove() to be irq-safe
Thanks to Aneesh Kumar K.V, Nageswara R Sastry, and Russell Currey.
* tag 'powerpc-6.5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/mm/book3s64/hash/4k: Add pmd_same callback for 4K page size
powerpc/64e: Fix obtool warnings in exceptions-64e.S
powerpc/security: Fix Speculation_Store_Bypass reporting on Power10
powerpc/64s: Fix native_hpte_remove() to be irq-safe
Linus Torvalds [Sun, 16 Jul 2023 19:18:18 +0000 (12:18 -0700)]
Merge tag 'hardening-v6.5-rc2' of git://git./linux/kernel/git/kees/linux
Pull hardening fixes from Kees Cook:
- Remove LTO-only suffixes from promoted global function symbols
(Yonghong Song)
- Remove unused .text..refcount section from vmlinux.lds.h (Petr Pavlu)
- Add missing __always_inline to sparc __arch_xchg() (Arnd Bergmann)
- Claim maintainership of string routines
* tag 'hardening-v6.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
sparc: mark __arch_xchg() as __always_inline
MAINTAINERS: Foolishly claim maintainership of string routines
kallsyms: strip LTO-only suffixes from promoted global functions
vmlinux.lds.h: Remove a reference to no longer used sections .text..refcount
Linus Torvalds [Sun, 16 Jul 2023 19:13:51 +0000 (12:13 -0700)]
Merge tag 'probes-fixes-v6.5-rc1-2' of git://git./linux/kernel/git/trace/linux-trace
Pull probe fixes from Masami Hiramatsu:
- fprobe: Add a comment why fprobe will be skipped if another kprobe is
running in fprobe_kprobe_handler().
- probe-events: Fix some issues related to fetch-arguments:
- Fix double counting of the string length for user-string and
symstr. This will require longer buffer in the array case.
- Fix not to count error code (minus value) for the total used
length in array argument. This makes the total used length
shorter.
- Fix to update dynamic used data size counter only if fetcharg uses
the dynamic size data. This may mis-count the used dynamic data
size and corrupt data.
- Revert "tracing: Add "(fault)" name injection to kernel probes"
because that did not work correctly with a bug, and we agreed the
current '(fault)' output (instead of '"(fault)"' like a string)
explains what happened more clearly.
- Fix to record 0-length (means fault access) data_loc data in fetch
function itself, instead of store_trace_args(). If we record an
array of string, this will fix to save fault access data on each
entry of the array correctly.
* tag 'probes-fixes-v6.5-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing/probes: Fix to record 0-length data_loc in fetch_store_string*() if fails
Revert "tracing: Add "(fault)" name injection to kernel probes"
tracing/probes: Fix to update dynamic data counter if fetcharg uses it
tracing/probes: Fix not to count error code to total length
tracing/probes: Fix to avoid double count of the string length on the array
fprobes: Add a comment why fprobe_kprobe_handler exits if kprobe is running
Linus Torvalds [Sat, 15 Jul 2023 15:51:02 +0000 (08:51 -0700)]
Merge tag 'spi-fix-v6.5-rc1' of git://git./linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"A couple of fairly minor driver specific fixes here, plus a bunch of
maintainership and admin updates. Nothing too remarkable"
* tag 'spi-fix-v6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
mailmap: add entry for Jonas Gorski
MAINTAINERS: add myself for spi-bcm63xx
spi: s3c64xx: clear loopback bit after loopback test
spi: bcm63xx: fix max prepend length
MAINTAINERS: Add myself as a maintainer for Microchip SPI
Linus Torvalds [Sat, 15 Jul 2023 15:46:09 +0000 (08:46 -0700)]
Merge tag 'regmap-fix-v6.5-rc1' of git://git./linux/kernel/git/broonie/regmap
Pull regmap fix from Mark Brown:
"One fix for an out of bounds access in the interupt code here"
* tag 'regmap-fix-v6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap-irq: Fix out-of-bounds access when allocating config buffers
Linus Torvalds [Sat, 15 Jul 2023 15:40:00 +0000 (08:40 -0700)]
Merge tag 'iommu-fixes-v6.5-rc1' of git://git./linux/kernel/git/joro/iommu
Pull iommu fixes from Joerg Roedel:
- Fix a regression causing a crash on sysfs access of iommu-group
specific files
- Fix signedness bug in SVA code
* tag 'iommu-fixes-v6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/sva: Fix signedness bug in iommu_sva_alloc_pasid()
iommu: Fix crash during syfs iommu_groups/N/type
Linus Torvalds [Sat, 15 Jul 2023 03:19:25 +0000 (20:19 -0700)]
Merge tag 'x86_urgent_for_6.5_rc2' of git://git./linux/kernel/git/tip/tip
Pull x86 CFI fixes from Peter Zijlstra:
"Fix kCFI/FineIBT weaknesses
The primary bug Alyssa noticed was that with FineIBT enabled function
prologues have a spurious ENDBR instruction:
__cfi_foo:
endbr64
subl $hash, %r10d
jz 1f
ud2
nop
1:
foo:
endbr64 <--- *sadface*
This means that any indirect call that fails to target the __cfi
symbol and instead targets (the regular old) foo+0, will succeed due
to that second ENDBR.
Fixing this led to the discovery of a single indirect call that was
still doing this: ret_from_fork(). Since that's an assembly stub the
compiler would not generate the proper kCFI indirect call magic and it
would not get patched.
Brian came up with the most comprehensive fix -- convert the thing to
C with only a very thin asm wrapper. This ensures the kernel thread
boostrap is a proper kCFI call.
While discussing all this, Kees noted that kCFI hashes could/should be
poisoned to seal all functions whose address is never taken, further
limiting the valid kCFI targets -- much like we already do for IBT.
So what was a 'simple' observation and fix cascaded into a bunch of
inter-related CFI infrastructure fixes"
* tag 'x86_urgent_for_6.5_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/cfi: Only define poison_cfi() if CONFIG_X86_KERNEL_IBT=y
x86/fineibt: Poison ENDBR at +0
x86: Rewrite ret_from_fork() in C
x86/32: Remove schedule_tail_wrapper()
x86/cfi: Extend ENDBR sealing to kCFI
x86/alternative: Rename apply_ibt_endbr()
x86/cfi: Extend {JMP,CAKK}_NOSPEC comment
Linus Torvalds [Sat, 15 Jul 2023 02:57:29 +0000 (19:57 -0700)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"This is a bunch of small driver fixes and a larger rework of zone disk
handling (which reaches into blk and nvme).
The aacraid array-bounds fix is now critical since the security people
turned on -Werror for some build tests, which now fail without it"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: storvsc: Handle SRB status value 0x30
scsi: block: Improve checks in blk_revalidate_disk_zones()
scsi: block: virtio_blk: Set zone limits before revalidating zones
scsi: block: nullblk: Set zone limits before revalidating zones
scsi: nvme: zns: Set zone limits before revalidating zones
scsi: sd_zbc: Set zone limits before revalidating zones
scsi: ufs: core: Add support for qTimestamp attribute
scsi: aacraid: Avoid -Warray-bounds warning
scsi: ufs: ufs-mediatek: Add dependency for RESET_CONTROLLER
scsi: ufs: core: Update contact email for monitor sysfs nodes
scsi: scsi_debug: Remove dead code
scsi: qla2xxx: Use vmalloc_array() and vcalloc()
scsi: fnic: Use vmalloc_array() and vcalloc()
scsi: qla2xxx: Fix error code in qla2x00_start_sp()
scsi: qla2xxx: Silence a static checker warning
scsi: lpfc: Fix a possible data race in lpfc_unregister_fcf_rescan()
Linus Torvalds [Sat, 15 Jul 2023 02:52:18 +0000 (19:52 -0700)]
Merge tag 'block-6.5-2023-07-14' of git://git.kernel.dk/linux
Pull block fixes from Jens Axboe:
- NVMe pull request via Keith:
- Don't require quirk to use duplicate namespace identifiers
(Christoph, Sagi)
- One more BOGUS_NID quirk (Pankaj)
- IO timeout and error hanlding fixes for PCI (Keith)
- Enhanced metadata format mask fix (Ankit)
- Association race condition fix for fibre channel (Michael)
- Correct debugfs error checks (Minjie)
- Use PAGE_SECTORS_SHIFT where needed (Damien)
- Reduce kernel logs for legacy nguid attribute (Keith)
- Use correct dma direction when unmapping metadata (Ming)
- Fix for a flush handling regression in this release (Christoph)
- Fix for batched request time stamping (Chengming)
- Fix for a regression in the mq-deadline position calculation (Bart)
- Lockdep fix for blk-crypto (Eric)
- Fix for a regression in the Amiga partition handling changes
(Michael)
* tag 'block-6.5-2023-07-14' of git://git.kernel.dk/linux:
block: queue data commands from the flush state machine at the head
blk-mq: fix start_time_ns and alloc_time_ns for pre-allocated rq
nvme-pci: fix DMA direction of unmapping integrity data
nvme: don't reject probe due to duplicate IDs for single-ported PCIe devices
block/mq-deadline: Fix a bug in deadline_from_pos()
nvme: ensure disabling pairs with unquiesce
nvme-fc: fix race between error recovery and creating association
nvme-fc: return non-zero status code when fails to create association
nvme: fix parameter check in nvme_fault_inject_init()
nvme: warn only once for legacy uuid attribute
block: remove dead struc request->completion_data field
nvme: fix the NVME_ID_NS_NVM_STS_MASK definition
nvmet: use PAGE_SECTORS_SHIFT
nvme: add BOGUS_NID quirk for Samsung SM953
blk-crypto: use dynamic lock class for blk_crypto_profile::lock
block/partition: fix signedness issue for Amiga partitions
Linus Torvalds [Sat, 15 Jul 2023 02:46:54 +0000 (19:46 -0700)]
Merge tag 'io_uring-6.5-2023-07-14' of git://git.kernel.dk/linux
Pull io_uring fix from Jens Axboe:
"Just a single tweak for the wait logic in io_uring"
* tag 'io_uring-6.5-2023-07-14' of git://git.kernel.dk/linux:
io_uring: Use io_schedule* in cqring wait
Linus Torvalds [Fri, 14 Jul 2023 18:14:07 +0000 (11:14 -0700)]
Merge tag 'riscv-for-linus-6.5-rc2' of git://git./linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:
- fix a formatting error in the hwprobe documentation
- fix a spurious warning in the RISC-V PMU driver
- fix memory detection on rv32 (problem does not manifest on any known
system)
- avoid parsing legacy parsing of I in ACPI ISA strings
* tag 'riscv-for-linus-6.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
RISC-V: Don't include Zicsr or Zifencei in I from ACPI
riscv: mm: fix truncation warning on RV32
perf: RISC-V: Remove PERF_HES_STOPPED flag checking in riscv_pmu_start()
Documentation: RISC-V: hwprobe: Fix a formatting error
Linus Torvalds [Fri, 14 Jul 2023 18:07:04 +0000 (11:07 -0700)]
Merge tag 'pm-6.5-rc2' of git://git./linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These fix hibernation (after recent changes), frequency QoS and the
sparc cpufreq driver.
Specifics:
- Unbreak the /sys/power/resume interface after recent changes (Azat
Khuzhin).
- Allow PM_QOS_DEFAULT_VALUE to be used with frequency QoS (Chungkai
Yang).
- Remove __init from cpufreq callbacks in the sparc driver, because
they may be called after initialization too (Viresh Kumar)"
* tag 'pm-6.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: sparc: Don't mark cpufreq callbacks with __init
PM: QoS: Restore support for default value on frequency QoS
PM: hibernate: Fix writing maj:min to /sys/power/resume
Rafael J. Wysocki [Fri, 14 Jul 2023 17:13:21 +0000 (19:13 +0200)]
Merge branches 'pm-sleep' and 'pm-qos'
Merge a PM QoS fix and a hibernation fix for 6.5-rc2.
- Unbreak the /sys/power/resume interface after recent changes (Azat
Khuzhin).
- Allow PM_QOS_DEFAULT_VALUE to be used with frequency QoS (Chungkai
Yang).
* pm-sleep:
PM: hibernate: Fix writing maj:min to /sys/power/resume
* pm-qos:
PM: QoS: Restore support for default value on frequency QoS
Shyam Prasad N [Fri, 14 Jul 2023 08:56:33 +0000 (08:56 +0000)]
cifs: fix mid leak during reconnection after timeout threshold
When the number of responses with status of STATUS_IO_TIMEOUT
exceeds a specified threshold (NUM_STATUS_IO_TIMEOUT), we reconnect
the connection. But we do not return the mid, or the credits
returned for the mid, or reduce the number of in-flight requests.
This bug could result in the server->in_flight count to go bad,
and also cause a leak in the mids.
This change moves the check to a few lines below where the
response is decrypted, even of the response is read from the
transform header. This way, the code for returning the mids
can be reused.
Also, the cifs_reconnect was reconnecting just the transport
connection before. In case of multi-channel, this may not be
what we want to do after several timeouts. Changed that to
reconnect the session and the tree too.
Also renamed NUM_STATUS_IO_TIMEOUT to a more appropriate name
MAX_STATUS_IO_TIMEOUT.
Fixes:
8e670f77c4a5 ("Handle STATUS_IO_TIMEOUT gracefully")
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Shyam Prasad N [Fri, 14 Jul 2023 08:56:34 +0000 (08:56 +0000)]
cifs: is_network_name_deleted should return a bool
Currently, is_network_name_deleted and it's implementations
do not return anything if the network name did get deleted.
So the function doesn't fully achieve what it advertizes.
Changed the function to return a bool instead. It will now
return true if the error returned is STATUS_NETWORK_NAME_DELETED
and the share (tree id) was found to be connected. It returns
false otherwise.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Acked-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Linus Torvalds [Fri, 14 Jul 2023 16:10:28 +0000 (09:10 -0700)]
Merge tag 'drm-fixes-2023-07-14-1' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"There were a bunch of fixes lined up for 2 weeks, so we have quite a
few scattered fixes, mostly amdgpu and i915, but ttm has a bunch and
nouveau makes an appearance.
So a bit busier than usual for rc2, but nothing seems out of the
ordinary.
fbdev:
- dma: Fix documented default preferred_bpp value
ttm:
- fix warning that we shouldn't mix && and ||
- never consider pinned BOs for eviction&swap
- Don't leak a resource on eviction error
- Don't leak a resource on swapout move error
- fix bulk_move corruption when adding a entry
client:
- Send hotplug event after registering a client
dma-buf:
- keep the signaling time of merged fences v3
- fix an error pointer vs NULL bug
sched:
- wait for all deps in kill jobs
- call set fence parent from scheduled
i915:
- Don't preserve dpll_hw_state for slave crtc in Bigjoiner
- Consider OA buffer boundary when zeroing out reports
- Remove dead code from gen8_pte_encode
- Fix one wrong caching mode enum usage
amdgpu:
- SMU i2c locking fix
- Fix a possible deadlock in process restoration for ROCm apps
- Disable PCIe lane/speed switching on Intel platforms (the platforms
don't support it)
nouveau:
- disp: fix HDMI on gt215+
- disp/g94: enable HDMI
- acr: Abort loading ACR if no firmware was found
- bring back blit subchannel for pre nv50 GPUs
- Fix drm_dp_remove_payload() invocation
ivpu:
- Fix VPU register access in irq disable
- Clear specific interrupt status bits on C0
bridge:
- dw_hdmi: fix connector access for scdc
- ti-sn65dsi86: Fix auxiliary bus lifetime
panel:
- simple: Add connector_type for innolux_at043tn24
- simple: Add Powertip PH800480T013 drm_display_mode flags"
* tag 'drm-fixes-2023-07-14-1' of git://anongit.freedesktop.org/drm/drm: (32 commits)
drm/nouveau: bring back blit subchannel for pre nv50 GPUs
drm/nouveau/acr: Abort loading ACR if no firmware was found
drm/amd: Align SMU11 SMU_MSG_OverridePcieParameters implementation with SMU13
drm/amd: Move helper for dynamic speed switch check out of smu13
drm/amd/pm: conditionally disable pcie lane/speed switching for SMU13
drm/amd/pm: share the code around SMU13 pcie parameters update
drm/amdgpu: avoid restore process run into dead loop.
drm/amd/pm: fix smu i2c data read risk
drm/nouveau/disp/g94: enable HDMI
drm/nouveau/disp: fix HDMI on gt215+
drm/client: Send hotplug event after registering a client
drm/i915: Fix one wrong caching mode enum usage
drm/i915: Remove dead code from gen8_pte_encode
drm/i915/perf: Consider OA buffer boundary when zeroing out reports
drm/i915: Don't preserve dpll_hw_state for slave crtc in Bigjoiner
drm/ttm: never consider pinned BOs for eviction&swap
drm/fbdev-dma: Fix documented default preferred_bpp value
dma-buf: fix an error pointer vs NULL bug
accel/ivpu: Clear specific interrupt status bits on C0
accel/ivpu: Fix VPU register access in irq disable
...
Linus Torvalds [Fri, 14 Jul 2023 16:05:15 +0000 (09:05 -0700)]
Merge tag 'ceph-for-6.5-rc2' of https://github.com/ceph/ceph-client
Pull ceph fix from Ilya Dryomov:
"A fix to prevent a potential buffer overrun in the messenger, marked
for stable"
* tag 'ceph-for-6.5-rc2' of https://github.com/ceph/ceph-client:
libceph: harden msgr2.1 frame segment length checks
Christoph Hellwig [Fri, 14 Jul 2023 14:30:14 +0000 (16:30 +0200)]
block: queue data commands from the flush state machine at the head
We used to insert the data commands following a pre-flush to the head
of the queue until commit
1e82fadfc6b ("blk-mq: do not do head insertions
post-pre-flush commands"). Not doing this seems to cause hangs of
such commands on NFS workloads when exported from file systems with
SATA SSDs. I have no idea why this would starve these workloads,
but doing a semantic revert of this patch (which looks quite different
due to various other changes) fixes the hangs.
Fixes:
1e82fadfc6b ("blk-mq: do not do head insertions post-pre-flush commands")
Reported-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://lore.kernel.org/r/20230714143014.11879-1-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Dan Carpenter [Thu, 6 Apr 2023 08:55:31 +0000 (11:55 +0300)]
iommu/sva: Fix signedness bug in iommu_sva_alloc_pasid()
The ida_alloc_range() function returns negative error codes on error.
On success it returns values in the min to max range (inclusive). It
never returns more then INT_MAX even if "max" is higher. It never
returns values in the 0 to (min - 1) range.
The bug is that "min" is an unsigned int so negative error codes will
be promoted to high positive values errors treated as success.
Fixes:
1a14bf0fc7ed ("iommu/sva: Use GFP_KERNEL for pasid allocation")
Signed-off-by: Dan Carpenter <error27@gmail.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/6b32095d-7491-4ebb-a850-12e96209eaaf@kili.mountain
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Jason Gunthorpe [Mon, 26 Jun 2023 15:13:11 +0000 (12:13 -0300)]
iommu: Fix crash during syfs iommu_groups/N/type
The err_restore_domain flow was accidently inserted into the success path
in commit
1000dccd5d13 ("iommu: Allow IOMMU_RESV_DIRECT to work on
ARM"). It should only happen if iommu_create_device_direct_mappings()
fails. This caused the domains the be wrongly changed and freed whenever
the sysfs is used, resulting in an oops:
BUG: kernel NULL pointer dereference, address:
0000000000000000
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 1 PID: 3417 Comm: avocado Not tainted 6.4.0-rc4-next-
20230602 #3
Hardware name: Dell Inc. PowerEdge R6515/07PXPY, BIOS 2.3.6 07/06/2021
RIP: 0010:__iommu_attach_device+0xc/0xa0
Code: c0 c3 cc cc cc cc 48 89 f0 c3 cc cc cc cc 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 41 54 55 48 8b 47 08 <48> 8b 00 48 85 c0 74 74 48 89 f5 e8 64 12 49 00 41 89 c4 85 c0 74
RSP: 0018:
ffffabae0220bd48 EFLAGS:
00010246
RAX:
0000000000000000 RBX:
ffff9ac04f70e410 RCX:
0000000000000001
RDX:
ffff9ac044db20c0 RSI:
ffff9ac044fa50d0 RDI:
ffff9ac04f70e410
RBP:
ffff9ac044fa50d0 R08:
1000000100209001 R09:
00000000000002dc
R10:
0000000000000000 R11:
0000000000000000 R12:
ffff9ac043d54700
R13:
ffff9ac043d54700 R14:
0000000000000001 R15:
0000000000000001
FS:
00007f02e30ae000(0000) GS:
ffff9afeb2440000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
0000000000000000 CR3:
000000012afca006 CR4:
0000000000770ee0
PKRU:
55555554
Call Trace:
<TASK>
? __die+0x24/0x70
? page_fault_oops+0x82/0x150
? __iommu_queue_command_sync+0x80/0xc0
? exc_page_fault+0x69/0x150
? asm_exc_page_fault+0x26/0x30
? __iommu_attach_device+0xc/0xa0
? __iommu_attach_device+0x1c/0xa0
__iommu_device_set_domain+0x42/0x80
__iommu_group_set_domain_internal+0x5d/0x160
iommu_setup_default_domain+0x318/0x400
iommu_group_store_type+0xb1/0x200
kernfs_fop_write_iter+0x12f/0x1c0
vfs_write+0x2a2/0x3b0
ksys_write+0x63/0xe0
do_syscall_64+0x3f/0x90
entry_SYSCALL_64_after_hwframe+0x6e/0xd8
RIP: 0033:0x7f02e2f14a6f
Reorganize the error flow so that the success branch and error branches
are clearer.
Fixes:
1000dccd5d13 ("iommu: Allow IOMMU_RESV_DIRECT to work on ARM")
Reported-by: Dheeraj Kumar Srivastava <dheerajkumar.srivastava@amd.com>
Tested-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/0-v1-5bd8cc969d9e+1f1-iommu_set_def_fix_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Masami Hiramatsu (Google) [Tue, 11 Jul 2023 14:16:07 +0000 (23:16 +0900)]
tracing/probes: Fix to record 0-length data_loc in fetch_store_string*() if fails
Fix to record 0-length data to data_loc in fetch_store_string*() if it fails
to get the string data.
Currently those expect that the data_loc is updated by store_trace_args() if
it returns the error code. However, that does not work correctly if the
argument is an array of strings. In that case, store_trace_args() only clears
the first entry of the array (which may have no error) and leaves other
entries. So it should be cleared by fetch_store_string*() itself.
Also, 'dyndata' and 'maxlen' in store_trace_args() should be updated
only if it is used (ret > 0 and argument is a dynamic data.)
Link: https://lore.kernel.org/all/168908496683.123124.4761206188794205601.stgit@devnote2/
Fixes:
40b53b771806 ("tracing: probeevent: Add array type support")
Cc: stable@vger.kernel.org
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Dave Airlie [Fri, 14 Jul 2023 01:44:54 +0000 (11:44 +1000)]
Merge tag 'amd-drm-fixes-6.5-2023-07-12' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-6.5-2023-07-12:
amdgpu:
- SMU i2c locking fix
- Fix a possible deadlock in process restoration for ROCm apps
- Disable PCIe lane/speed switching on Intel platforms (the platforms don't support it)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230712184009.7740-1-alexander.deucher@amd.com
Dave Airlie [Fri, 14 Jul 2023 01:13:14 +0000 (11:13 +1000)]
Merge tag 'drm-intel-fixes-2023-07-13' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
- Don't preserve dpll_hw_state for slave crtc in Bigjoiner (Stanislav Lisovskiy)
- Consider OA buffer boundary when zeroing out reports [perf] (Umesh Nerlige Ramappa)
- Remove dead code from gen8_pte_encode (Tvrtko Ursulin)
- Fix one wrong caching mode enum usage (Tvrtko Ursulin)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZK+nHLCltaxoxVw/@tursulin-desk
Dave Airlie [Fri, 14 Jul 2023 00:37:18 +0000 (10:37 +1000)]
Merge tag 'drm-misc-fixes-2023-07-13' of ssh://git.freedesktop.org/git/drm/drm-misc into drm-fixes
A couple of nouveau patches addressing improving HDMI support and
firmware handling, a fix for TTM to skip pinned BO when evicting, and a
fix for the fbdev documentation.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/nq3ke75juephbex5acfyi5t6bxv22nhmfcpfhru55haj2nv3us@gehrlmjbqgjk
Linus Torvalds [Thu, 13 Jul 2023 21:35:02 +0000 (14:35 -0700)]
Merge tag 'erofs-for-6.5-rc2-fixes' of git://git./linux/kernel/git/xiang/erofs
Pull erofs fixes from Gao Xiang:
"Three patches address regressions related to post-EOF unexpected
behaviors and fsdax unavailability of chunk-based regular files.
The other two patches mainly get rid of kmap_atomic() and simplify
z_erofs_transform_plain().
- Fix two unexpected loop cases when reading beyond EOF
- Fix fsdax unavailability for chunk-based regular files
- Get rid of the remaining kmap_atomic()
- Minor cleanups"
* tag 'erofs-for-6.5-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
erofs: fix fsdax unavailability for chunk-based regular files
erofs: avoid infinite loop in z_erofs_do_read_page() when reading beyond EOF
erofs: avoid useless loops in z_erofs_pcluster_readmore() when reading beyond EOF
erofs: simplify z_erofs_transform_plain()
erofs: get rid of the remaining kmap_atomic()
Linus Torvalds [Thu, 13 Jul 2023 21:21:22 +0000 (14:21 -0700)]
Merge tag 'net-6.5-rc2' of git://git./linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from netfilter, wireless and ebpf.
Current release - regressions:
- netfilter: conntrack: gre: don't set assured flag for clash entries
- wifi: iwlwifi: remove 'use_tfh' config to fix crash
Previous releases - regressions:
- ipv6: fix a potential refcount underflow for idev
- icmp6: ifix null-ptr-deref of ip6_null_entry->rt6i_idev in
icmp6_dev()
- bpf: fix max stack depth check for async callbacks
- eth: mlx5e:
- check for NOT_READY flag state after locking
- fix page_pool page fragment tracking for XDP
- eth: igc:
- fix tx hang issue when QBV gate is closed
- fix corner cases for TSN offload
- eth: octeontx2-af: Move validation of ptp pointer before its usage
- eth: ena: fix shift-out-of-bounds in exponential backoff
Previous releases - always broken:
- core: prevent skb corruption on frag list segmentation
- sched:
- cls_fw: fix improper refcount update leads to use-after-free
- sch_qfq: account for stab overhead in qfq_enqueue
- netfilter:
- report use refcount overflow
- prevent OOB access in nft_byteorder_eval
- wifi: mt7921e: fix init command fail with enabled device
- eth: ocelot: fix oversize frame dropping for preemptible TCs
- eth: fec: recycle pages for transmitted XDP frames"
* tag 'net-6.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (79 commits)
selftests: tc-testing: add test for qfq with stab overhead
net/sched: sch_qfq: account for stab overhead in qfq_enqueue
selftests: tc-testing: add tests for qfq mtu sanity check
net/sched: sch_qfq: reintroduce lmax bound check for MTU
wifi: cfg80211: fix receiving mesh packets without RFC1042 header
wifi: rtw89: debug: fix error code in rtw89_debug_priv_send_h2c_set()
net: txgbe: fix eeprom calculation error
net/sched: make psched_mtu() RTNL-less safe
net: ena: fix shift-out-of-bounds in exponential backoff
netdevsim: fix uninitialized data in nsim_dev_trap_fa_cookie_write()
net/sched: flower: Ensure both minimum and maximum ports are specified
MAINTAINERS: Add another mailing list for QUALCOMM ETHQOS ETHERNET DRIVER
docs: netdev: update the URL of the status page
wifi: iwlwifi: remove 'use_tfh' config to fix crash
xdp: use trusted arguments in XDP hints kfuncs
bpf: cpumap: Fix memory leak in cpu_map_update_elem
wifi: airo: avoid uninitialized warning in airo_get_rate()
octeontx2-pf: Add additional check for MCAM rules
net: dsa: Removed unneeded of_node_put in felix_parse_ports_node
net: fec: use netdev_err_once() instead of netdev_err()
...
Linus Torvalds [Thu, 13 Jul 2023 20:44:28 +0000 (13:44 -0700)]
Merge tag 'trace-v6.5-rc1-3' of git://git./linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:
- Fix some missing-prototype warnings
- Fix user events struct args (did not include size of struct)
When creating a user event, the "struct" keyword is to denote that
the size of the field will be passed in. But the parsing failed to
handle this case.
- Add selftest to struct sizes for user events
- Fix sample code for direct trampolines.
The sample code for direct trampolines attached to handle_mm_fault().
But the prototype changed and the direct trampoline sample code was
not updated. Direct trampolines needs to have the arguments correct
otherwise it can fail or crash the system.
- Remove unused ftrace_regs_caller_ret() prototype.
- Quiet false positive of FORTIFY_SOURCE
Due to backward compatibility, the structure used to save stack
traces in the kernel had a fixed size of 8. This structure is
exported to user space via the tracing format file. A change was made
to allow more than 8 functions to be recorded, and user space now
uses the size field to know how many functions are actually in the
stack.
But the structure still has size of 8 (even though it points into the
ring buffer that has the required amount allocated to hold a full
stack.
This was fine until the fortifier noticed that the
memcpy(&entry->caller, stack, size) was greater than the 8 functions
and would complain at runtime about it.
Hide this by using a pointer to the stack location on the ring buffer
instead of using the address of the entry structure caller field.
- Fix a deadloop in reading trace_pipe that was caused by a mismatch
between ring_buffer_empty() returning false which then asked to read
the data, but the read code uses rb_num_of_entries() that returned
zero, and causing a infinite "retry".
- Fix a warning caused by not using all pages allocated to store ftrace
functions, where this can happen if the linker inserts a bunch of
"NULL" entries, causing the accounting of how many pages needed to be
off.
- Fix histogram synthetic event crashing when the start event is
removed and the end event is still using a variable from it
- Fix memory leak in freeing iter->temp in tracing_release_pipe()
* tag 'trace-v6.5-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing: Fix memory leak of iter->temp when reading trace_pipe
tracing/histograms: Add histograms to hist_vars if they have referenced variables
tracing: Stop FORTIFY_SOURCE complaining about stack trace caller
ftrace: Fix possible warning on checking all pages used in ftrace_process_locs()
ring-buffer: Fix deadloop issue on reading trace_pipe
tracing: arm64: Avoid missing-prototype warnings
selftests/user_events: Test struct size match cases
tracing/user_events: Fix struct arg size match check
x86/ftrace: Remove unsued extern declaration ftrace_regs_caller_ret()
arm64: ftrace: Add direct call trampoline samples support
samples: ftrace: Save required argument registers in sample trampolines
Linus Torvalds [Thu, 13 Jul 2023 20:39:36 +0000 (13:39 -0700)]
Merge tag 'for-linus-6.5-rc2-tag' of git://git./linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
- a cleanup of the Xen related ELF-notes
- a fix for virtio handling in Xen dom0 when running Xen in a VM
* tag 'for-linus-6.5-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen/virtio: Fix NULL deref when a bridge of PCI root bus has no parent
x86/Xen: tidy xen-head.S
Linus Torvalds [Thu, 13 Jul 2023 20:34:00 +0000 (13:34 -0700)]
Merge tag 'sh-for-v6.5-tag2' of git://git./linux/kernel/git/glaubitz/sh-linux
Pull sh fixes from John Paul Adrian Glaubitz:
"The sh updates introduced multiple regressions.
In particular, the change
a8ac2961148e ("sh: Avoid using IRQ0 on SH3
and SH4") causes several boards to hang during boot due to incorrect
IRQ numbers.
Geert Uytterhoeven has contributed patches that handle the virq offset
in the IRQ code for the dreamcast, highlander and r2d boards while
Artur Rojek has contributed a patch which handles the virq offset for
the hd64461 companion chip"
* tag 'sh-for-v6.5-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux:
sh: hd64461: Handle virq offset for offchip IRQ base and HD64461 IRQ
sh: mach-dreamcast: Handle virq offset in cascaded IRQ demux
sh: mach-highlander: Handle virq offset in cascaded IRL demux
sh: mach-r2d: Handle virq offset in cascaded IRL demux
Jens Axboe [Thu, 13 Jul 2023 20:26:38 +0000 (14:26 -0600)]
Merge tag 'nvme-6.5-2023-07-13' of git://git.infradead.org/nvme into block-6.5
Pull NVMe fixes from Keith:
"nvme fixes for Linux 6.5
- Don't require quirk to use duplicate namespace identifiers
(Christoph, Sagi)
- One more BOGUS_NID quirk (Pankaj)
- IO timeout and error hanlding fixes for PCI (Keith)
- Enhanced metadata format mask fix (Ankit)
- Association race condition fix for fibre channel (Michael)
- Correct debugfs error checks (Minjie)
- Use PAGE_SECTORS_SHIFT where needed (Damien)
- Reduce kernel logs for legacy nguid attribute (Keith)
- Use correct dma direction when unmapping metadata (Ming)"
* tag 'nvme-6.5-2023-07-13' of git://git.infradead.org/nvme:
nvme-pci: fix DMA direction of unmapping integrity data
nvme: don't reject probe due to duplicate IDs for single-ported PCIe devices
nvme: ensure disabling pairs with unquiesce
nvme-fc: fix race between error recovery and creating association
nvme-fc: return non-zero status code when fails to create association
nvme: fix parameter check in nvme_fault_inject_init()
nvme: warn only once for legacy uuid attribute
nvme: fix the NVME_ID_NS_NVM_STS_MASK definition
nvmet: use PAGE_SECTORS_SHIFT
nvme: add BOGUS_NID quirk for Samsung SM953
Chengming Zhou [Mon, 10 Jul 2023 10:55:16 +0000 (18:55 +0800)]
blk-mq: fix start_time_ns and alloc_time_ns for pre-allocated rq
The iocost rely on rq start_time_ns and alloc_time_ns to tell saturation
state of the block device. Most of the time request is allocated after
rq_qos_throttle() and its alloc_time_ns or start_time_ns won't be affected.
But for plug batched allocation introduced by the commit
47c122e35d7e
("block: pre-allocate requests if plug is started and is a batch"), we can
rq_qos_throttle() after the allocation of the request. This is what the
blk_mq_get_cached_request() does.
In this case, the cached request alloc_time_ns or start_time_ns is much
ahead if blocked in any qos ->throttle().
Fix it by setting alloc_time_ns and start_time_ns to now when the allocated
request is actually used.
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Acked-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20230710105516.2053478-1-chengming.zhou@linux.dev
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Arnd Bergmann [Wed, 28 Jun 2023 09:49:18 +0000 (11:49 +0200)]
sparc: mark __arch_xchg() as __always_inline
An otherwise correct change to the atomic operations uncovered an
existing bug in the sparc __arch_xchg() function, which is calls
__xchg_called_with_bad_pointer() when its arguments are unknown at
compile time:
ERROR: modpost: "__xchg_called_with_bad_pointer" [lib/atomic64_test.ko] undefined!
This now happens because gcc determines that it's better to not inline the
function. Avoid this by just marking the function as __always_inline
to force the compiler to do the right thing here.
Reported-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/all/c525adc9-6623-4660-8718-e0c9311563b8@roeck-us.net/
Fixes:
d12157efc8e08 ("locking/atomic: make atomic*_{cmp,}xchg optional")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://lore.kernel.org/r/20230628094938.2318171-1-arnd@kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Kees Cook [Wed, 12 Jul 2023 19:46:29 +0000 (12:46 -0700)]
MAINTAINERS: Foolishly claim maintainership of string routines
Since the string API is tightly coupled with FORTIFY_SOURCE, I am
offering myself up as maintainer for it. Thankfully Andy is already a
reviewer and can keep me on the straight and narrow.
Acked-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Link: https://lore.kernel.org/r/20230712194625.never.252-kees@kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Masami Hiramatsu (Google) [Tue, 11 Jul 2023 14:15:57 +0000 (23:15 +0900)]
Revert "tracing: Add "(fault)" name injection to kernel probes"
This reverts commit
2e9906f84fc7c99388bb7123ade167250d50f1c0.
It was turned out that commit
2e9906f84fc7 ("tracing: Add "(fault)"
name injection to kernel probes") did not work correctly and probe
events still show just '(fault)' (instead of '"(fault)"'). Also,
current '(fault)' is more explicit that it faulted.
This also moves FAULT_STRING macro to trace.h so that synthetic
event can keep using it, and uses it in trace_probe.c too.
Link: https://lore.kernel.org/all/168908495772.123124.1250788051922100079.stgit@devnote2/
Link: https://lore.kernel.org/all/20230706230642.3793a593@rorschach.local.home/
Cc: stable@vger.kernel.org
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Masami Hiramatsu (Google) [Tue, 11 Jul 2023 14:15:48 +0000 (23:15 +0900)]
tracing/probes: Fix to update dynamic data counter if fetcharg uses it
Fix to update dynamic data counter ('dyndata') and max length ('maxlen')
only if the fetcharg uses the dynamic data. Also get out arg->dynamic
from unlikely(). This makes dynamic data address wrong if
process_fetch_insn() returns error on !arg->dynamic case.
Link: https://lore.kernel.org/all/168908494781.123124.8160245359962103684.stgit@devnote2/
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Link: https://lore.kernel.org/all/20230710233400.5aaf024e@gandalf.local.home/
Fixes:
9178412ddf5a ("tracing: probeevent: Return consumed bytes of dynamic area")
Cc: stable@vger.kernel.org
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Masami Hiramatsu (Google) [Tue, 11 Jul 2023 14:15:38 +0000 (23:15 +0900)]
tracing/probes: Fix not to count error code to total length
Fix not to count the error code (which is minus value) to the total
used length of array, because it can mess up the return code of
process_fetch_insn_bottom(). Also clear the 'ret' value because it
will be used for calculating next data_loc entry.
Link: https://lore.kernel.org/all/168908493827.123124.2175257289106364229.stgit@devnote2/
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/all/
8819b154-2ba1-43c3-98a2-
cbde20892023@moroto.mountain/
Fixes:
9b960a38835f ("tracing: probeevent: Unify fetch_insn processing common part")
Cc: stable@vger.kernel.org
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Masami Hiramatsu (Google) [Tue, 11 Jul 2023 14:15:29 +0000 (23:15 +0900)]
tracing/probes: Fix to avoid double count of the string length on the array
If an array is specified with the ustring or symstr, the length of the
strings are accumlated on both of 'ret' and 'total', which means the
length is double counted.
Just set the length to the 'ret' value for avoiding double counting.
Link: https://lore.kernel.org/all/168908492917.123124.15076463491122036025.stgit@devnote2/
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/all/
8819b154-2ba1-43c3-98a2-
cbde20892023@moroto.mountain/
Fixes:
88903c464321 ("tracing/probe: Add ustring type for user-space string")
Cc: stable@vger.kernel.org
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Masami Hiramatsu (Google) [Fri, 7 Jul 2023 16:38:03 +0000 (01:38 +0900)]
fprobes: Add a comment why fprobe_kprobe_handler exits if kprobe is running
Add a comment the reason why fprobe_kprobe_handler() exits if any other
kprobe is running.
Link: https://lore.kernel.org/all/168874788299.159442.2485957441413653858.stgit@devnote2/
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Link: https://lore.kernel.org/all/20230706120916.3c6abf15@gandalf.local.home/
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Ming Lei [Thu, 13 Jul 2023 09:26:20 +0000 (17:26 +0800)]
nvme-pci: fix DMA direction of unmapping integrity data
DMA direction should be taken in dma_unmap_page() for unmapping integrity
data.
Fix this DMA direction, and reported in Guangwu's test.
Reported-by: Guangwu Zhang <guazhang@redhat.com>
Fixes:
4aedb705437f ("nvme-pci: split metadata handling from nvme_map_data / nvme_unmap_data")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Christoph Hellwig [Thu, 13 Jul 2023 13:30:42 +0000 (15:30 +0200)]
nvme: don't reject probe due to duplicate IDs for single-ported PCIe devices
While duplicate IDs are still very harmful, including the potential to easily
see changing devices in /dev/disk/by-id, it turn out they are extremely
common for cheap end user NVMe devices.
Relax our check for them for so that it doesn't reject the probe on
single-ported PCIe devices, but prints a big warning instead. In doubt
we'd still like to see quirk entries to disable the potential for
changing supposed stable device identifier links, but this will at least
allow users how have two (or more) of these devices to use them without
having to manually add a new PCI ID entry with the quirk through sysfs or
by patching the kernel.
Fixes:
2079f41ec6ff ("nvme: check that EUI/GUID/UUID are globally unique")
Cc: stable@vger.kernel.org # 6.0+
Co-developed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Zheng Yejian [Thu, 13 Jul 2023 14:14:35 +0000 (22:14 +0800)]
tracing: Fix memory leak of iter->temp when reading trace_pipe
kmemleak reports:
unreferenced object 0xffff88814d14e200 (size 256):
comm "cat", pid 336, jiffies
4294871818 (age 779.490s)
hex dump (first 32 bytes):
04 00 01 03 00 00 00 00 08 00 00 00 00 00 00 00 ................
0c d8 c8 9b ff ff ff ff 04 5a ca 9b ff ff ff ff .........Z......
backtrace:
[<
ffffffff9bdff18f>] __kmalloc+0x4f/0x140
[<
ffffffff9bc9238b>] trace_find_next_entry+0xbb/0x1d0
[<
ffffffff9bc9caef>] trace_print_lat_context+0xaf/0x4e0
[<
ffffffff9bc94490>] print_trace_line+0x3e0/0x950
[<
ffffffff9bc95499>] tracing_read_pipe+0x2d9/0x5a0
[<
ffffffff9bf03a43>] vfs_read+0x143/0x520
[<
ffffffff9bf04c2d>] ksys_read+0xbd/0x160
[<
ffffffff9d0f0edf>] do_syscall_64+0x3f/0x90
[<
ffffffff9d2000aa>] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
when reading file 'trace_pipe', 'iter->temp' is allocated or relocated
in trace_find_next_entry() but not freed before 'trace_pipe' is closed.
To fix it, free 'iter->temp' in tracing_release_pipe().
Link: https://lore.kernel.org/linux-trace-kernel/20230713141435.1133021-1-zhengyejian1@huawei.com
Cc: stable@vger.kernel.org
Fixes:
ff895103a84ab ("tracing: Save off entry when peeking at next entry")
Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Ilya Dryomov [Mon, 10 Jul 2023 18:39:29 +0000 (20:39 +0200)]
libceph: harden msgr2.1 frame segment length checks
ceph_frame_desc::fd_lens is an int array. decode_preamble() thus
effectively casts u32 -> int but the checks for segment lengths are
written as if on unsigned values. While reading in HELLO or one of the
AUTH frames (before authentication is completed), arithmetic in
head_onwire_len() can get duped by negative ctrl_len and produce
head_len which is less than CEPH_PREAMBLE_LEN but still positive.
This would lead to a buffer overrun in prepare_read_control() as the
preamble gets copied to the newly allocated buffer of size head_len.
Cc: stable@vger.kernel.org
Fixes:
cd1a677cad99 ("libceph, ceph: implement msgr2.1 protocol (crc and secure modes)")
Reported-by: Thelford Williams <thelford@google.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Paolo Abeni [Thu, 13 Jul 2023 09:12:01 +0000 (11:12 +0200)]
Merge branch 'net-sched-fixes-for-sch_qfq'
Pedro Tammela says:
====================
net/sched: fixes for sch_qfq
Patch 1 fixes a regression introduced in 6.4 where the MTU size could be
bigger than 'lmax'.
Patch 3 fixes an issue where the code doesn't account for qdisc_pkt_len()
returning a size bigger then 'lmax'.
Patches 2 and 4 are selftests for the issues above.
====================
Link: https://lore.kernel.org/r/20230711210103.597831-1-pctammela@mojatatu.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Pedro Tammela [Tue, 11 Jul 2023 21:01:03 +0000 (18:01 -0300)]
selftests: tc-testing: add test for qfq with stab overhead
A packet with stab overhead greater than QFQ_MAX_LMAX should be dropped
by the QFQ qdisc as it can't handle such lengths.
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Tested-by: Zhengchao Shao <shaozhengchao@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Pedro Tammela [Tue, 11 Jul 2023 21:01:02 +0000 (18:01 -0300)]
net/sched: sch_qfq: account for stab overhead in qfq_enqueue
Lion says:
-------
In the QFQ scheduler a similar issue to CVE-2023-31436
persists.
Consider the following code in net/sched/sch_qfq.c:
static int qfq_enqueue(struct sk_buff *skb, struct Qdisc *sch,
struct sk_buff **to_free)
{
unsigned int len = qdisc_pkt_len(skb), gso_segs;
// ...
if (unlikely(cl->agg->lmax < len)) {
pr_debug("qfq: increasing maxpkt from %u to %u for class %u",
cl->agg->lmax, len, cl->common.classid);
err = qfq_change_agg(sch, cl, cl->agg->class_weight, len);
if (err) {
cl->qstats.drops++;
return qdisc_drop(skb, sch, to_free);
}
// ...
}
Similarly to CVE-2023-31436, "lmax" is increased without any bounds
checks according to the packet length "len". Usually this would not
impose a problem because packet sizes are naturally limited.
This is however not the actual packet length, rather the
"qdisc_pkt_len(skb)" which might apply size transformations according to
"struct qdisc_size_table" as created by "qdisc_get_stab()" in
net/sched/sch_api.c if the TCA_STAB option was set when modifying the qdisc.
A user may choose virtually any size using such a table.
As a result the same issue as in CVE-2023-31436 can occur, allowing heap
out-of-bounds read / writes in the kmalloc-8192 cache.
-------
We can create the issue with the following commands:
tc qdisc add dev $DEV root handle 1: stab mtu 2048 tsize 512 mpu 0 \
overhead
999999999 linklayer ethernet qfq
tc class add dev $DEV parent 1: classid 1:1 htb rate 6mbit burst 15k
tc filter add dev $DEV parent 1: matchall classid 1:1
ping -I $DEV 1.1.1.2
This is caused by incorrectly assuming that qdisc_pkt_len() returns a
length within the QFQ_MIN_LMAX < len < QFQ_MAX_LMAX.
Fixes:
462dbc9101ac ("pkt_sched: QFQ Plus: fair-queueing service at DRR cost")
Reported-by: Lion <nnamrec@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Pedro Tammela [Tue, 11 Jul 2023 21:01:01 +0000 (18:01 -0300)]
selftests: tc-testing: add tests for qfq mtu sanity check
QFQ only supports a certain bound of MTU size so make sure
we check for this requirement in the tests.
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Tested-by: Zhengchao Shao <shaozhengchao@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Pedro Tammela [Tue, 11 Jul 2023 21:01:00 +0000 (18:01 -0300)]
net/sched: sch_qfq: reintroduce lmax bound check for MTU
25369891fcef deletes a check for the case where no 'lmax' is
specified which
3037933448f6 previously fixed as 'lmax'
could be set to the device's MTU without any bound checking
for QFQ_LMAX_MIN and QFQ_LMAX_MAX. Therefore, reintroduce the check.
Fixes:
25369891fcef ("net/sched: sch_qfq: refactor parsing of netlink parameters")
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Artur Rojek [Mon, 10 Jul 2023 23:31:32 +0000 (01:31 +0200)]
sh: hd64461: Handle virq offset for offchip IRQ base and HD64461 IRQ
A recent change to start counting SuperH IRQ #s from 16 breaks support
for the Hitachi HD64461 companion chip.
Move the offchip IRQ base and HD64461 IRQ # by 16 in order to
accommodate for the new virq numbering rules.
Fixes:
a8ac2961148e ("sh: Avoid using IRQ0 on SH3 and SH4")
Signed-off-by: Artur Rojek <contact@artur-rojek.eu>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Link: https://lore.kernel.org/r/20230710233132.69734-1-contact@artur-rojek.eu
Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Geert Uytterhoeven [Sun, 9 Jul 2023 13:10:43 +0000 (15:10 +0200)]
sh: mach-dreamcast: Handle virq offset in cascaded IRQ demux
Take into account the virq offset when translating cascaded interrupts.
Fixes:
a8ac2961148e8c72 ("sh: Avoid using IRQ0 on SH3 and SH4")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Link: https://lore.kernel.org/r/7d0cb246c9f1cd24bb1f637ec5cb67e799a4c3b8.1688908227.git.geert+renesas@glider.be
Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Geert Uytterhoeven [Sun, 9 Jul 2023 13:10:23 +0000 (15:10 +0200)]
sh: mach-highlander: Handle virq offset in cascaded IRL demux
Take into account the virq offset when translating cascaded IRL
interrupts.
Fixes:
a8ac2961148e8c72 ("sh: Avoid using IRQ0 on SH3 and SH4")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Link: https://lore.kernel.org/r/4fcb0d08a2b372431c41e04312742dc9e41e1be4.1688908186.git.geert+renesas@glider.be
Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Geert Uytterhoeven [Sun, 9 Jul 2023 11:15:49 +0000 (13:15 +0200)]
sh: mach-r2d: Handle virq offset in cascaded IRL demux
When booting rts7751r2dplus_defconfig on QEMU, the system hangs due to
an interrupt storm on IRQ 20. IRQ 20 aka event 0x280 is a cascaded IRL
interrupt, which maps to IRQ_VOYAGER, the interrupt used by the Silicon
Motion SM501 multimedia companion chip. As rts7751r2d_irq_demux() does
not take into account the new virq offset, the interrupt is no longer
translated, leading to an unhandled interrupt.
Fix this by taking into account the virq offset when translating
cascaded IRL interrupts.
Fixes:
a8ac2961148e8c72 ("sh: Avoid using IRQ0 on SH3 and SH4")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Closes: https://lore.kernel.org/r/
fbfea3ad-d327-4ad5-ac9c-
648c7ca3fe1f@roeck-us.net
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/2c99d5df41c40691f6c407b7b6a040d406bc81ac.1688901306.git.geert+renesas@glider.be
Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Paulo Alcantara [Tue, 11 Jul 2023 17:15:10 +0000 (14:15 -0300)]
smb: client: fix missed ses refcounting
Use new cifs_smb_ses_inc_refcount() helper to get an active reference
of @ses and @ses->dfs_root_ses (if set). This will prevent
@ses->dfs_root_ses of being put in the next call to cifs_put_smb_ses()
and thus potentially causing an use-after-free bug.
Fixes:
8e3554150d6c ("cifs: fix sharing of DFS connections")
Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>