platform/kernel/linux-rpi.git
13 months agoKVM: riscv: selftests: Add get-reg-list test
Haibo Xu [Tue, 25 Jul 2023 08:41:39 +0000 (16:41 +0800)]
KVM: riscv: selftests: Add get-reg-list test

get-reg-list test is used to check for KVM registers regressions
during VM migration which happens when destination host kernel
missing registers that the source host kernel has. The blessed
list registers was created by running on v6.5-rc3

Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
13 months agoKVM: riscv: Add KVM_GET_REG_LIST API support
Haibo Xu [Tue, 25 Jul 2023 08:41:38 +0000 (16:41 +0800)]
KVM: riscv: Add KVM_GET_REG_LIST API support

KVM_GET_REG_LIST API will return all registers that are available to
KVM_GET/SET_ONE_REG APIs. It's very useful to identify some platform
regression issue during VM migration.

Since this API was already supported on arm64, it is straightforward
to enable it on riscv with similar code structure.

Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
13 months agoKVM: selftests: Add skip_set facility to get_reg_list test
Haibo Xu [Tue, 25 Jul 2023 08:41:37 +0000 (16:41 +0800)]
KVM: selftests: Add skip_set facility to get_reg_list test

Add new skips_set members to vcpu_reg_sublist so as to skip
set operation on some registers.

Suggested-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
13 months agoKVM: selftests: Only do get/set tests on present blessed list
Haibo Xu [Tue, 25 Jul 2023 08:41:36 +0000 (16:41 +0800)]
KVM: selftests: Only do get/set tests on present blessed list

Only do the get/set tests on present and blessed registers
since we don't know the capabilities of any new ones.

Suggested-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
13 months agoKVM: arm64: selftests: Move finalize_vcpu back to run_test
Haibo Xu [Tue, 25 Jul 2023 08:41:35 +0000 (16:41 +0800)]
KVM: arm64: selftests: Move finalize_vcpu back to run_test

No functional changes. Just move the finalize_vcpu call back to
run_test and do weak function trick to prepare for the opration
in riscv.

Suggested-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
13 months agoKVM: arm64: selftests: Move reject_set check logic to a function
Haibo Xu [Tue, 25 Jul 2023 08:41:34 +0000 (16:41 +0800)]
KVM: arm64: selftests: Move reject_set check logic to a function

No functional changes. Just move the reject_set check logic to a
function so we can check for a specific errno. This is a preparation
for support reject_set in riscv.

Suggested-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
13 months agoKVM: arm64: selftests: Finish generalizing get-reg-list
Andrew Jones [Tue, 25 Jul 2023 08:41:33 +0000 (16:41 +0800)]
KVM: arm64: selftests: Finish generalizing get-reg-list

Add some unfortunate #ifdeffery to ensure the common get-reg-list.c
can be compiled and run with other architectures. The next
architecture to support get-reg-list should now only need to provide
$(ARCH_DIR)/get-reg-list.c where arch-specific print_reg() and
vcpu_configs[] get defined.

Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
13 months agoKVM: arm64: selftests: Split get-reg-list test code
Andrew Jones [Tue, 25 Jul 2023 08:41:32 +0000 (16:41 +0800)]
KVM: arm64: selftests: Split get-reg-list test code

Split the arch-neutral test code out of aarch64/get-reg-list.c into
get-reg-list.c. To do this we invent a new make variable
$(SPLIT_TESTS) which expects common parts to be in the KVM selftests
root and the counterparts to have the same name, but be in
$(ARCH_DIR).

There's still some work to be done to de-aarch64 the common
get-reg-list.c, but we leave that to the next patch to avoid
modifying too much code while moving it.

Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
13 months agoKVM: arm64: selftests: Delete core_reg_fixup
Andrew Jones [Tue, 25 Jul 2023 08:41:31 +0000 (16:41 +0800)]
KVM: arm64: selftests: Delete core_reg_fixup

core_reg_fixup() complicates sharing the get-reg-list test with
other architectures. Rather than work at keeping it, with plenty
of #ifdeffery, just delete it, as it's unlikely to test a kernel
based on anything older than v5.2 with the get-reg-list test,
which is a test meant to check for regressions in new kernels.
(And, an older version of the test can still be used for older
kernels if necessary.)

Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
13 months agoKVM: arm64: selftests: Rename vcpu_config and add to kvm_util.h
Andrew Jones [Tue, 25 Jul 2023 08:41:30 +0000 (16:41 +0800)]
KVM: arm64: selftests: Rename vcpu_config and add to kvm_util.h

Rename vcpu_config to vcpu_reg_list to be more specific and add
it to kvm_util.h. While it may not get used outside get-reg-list
tests, exporting it doesn't hurt, as long as it has a unique enough
name. This is a step in the direction of sharing most of the get-
reg-list test code between architectures.

Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
13 months agoKVM: arm64: selftests: Remove print_reg's dependency on vcpu_config
Andrew Jones [Tue, 25 Jul 2023 08:41:29 +0000 (16:41 +0800)]
KVM: arm64: selftests: Remove print_reg's dependency on vcpu_config

print_reg() and its helpers only use the vcpu_config pointer for
config_name(). So just pass the config name in instead, which is used
as a prefix in asserts. print_reg() can now be compiled independently
of config_name().

Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
13 months agoKVM: arm64: selftests: Drop SVE cap check in print_reg
Andrew Jones [Tue, 25 Jul 2023 08:41:28 +0000 (16:41 +0800)]
KVM: arm64: selftests: Drop SVE cap check in print_reg

The check doesn't prove much anyway, as the reg lists could be
messed up too. Just drop the check to simplify making print_reg
more independent.

Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
13 months agoKVM: arm64: selftests: Replace str_with_index with strdup_printf
Andrew Jones [Tue, 25 Jul 2023 08:41:27 +0000 (16:41 +0800)]
KVM: arm64: selftests: Replace str_with_index with strdup_printf

The original author of aarch64/get-reg-list.c (me) was wearing
tunnel vision goggles when implementing str_with_index(). There's
no reason to have such a special case string function. Instead,
take inspiration from glib and implement strdup_printf. The
implementation builds on vasprintf() which requires _GNU_SOURCE,
but we require _GNU_SOURCE in most files already.

Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
13 months agoRISC-V: KVM: Improve vector save/restore functions
Andrew Jones [Fri, 4 Aug 2023 13:56:18 +0000 (16:56 +0300)]
RISC-V: KVM: Improve vector save/restore functions

Make two nonfunctional changes to the vector get/set vector reg
functions and their supporting function for simplification and
readability. The first is to not pass KVM_REG_RISCV_VECTOR, but
rather integrate it directly into the masking. The second is to
rename reg_val to reg_addr where and address is used instead of
a value.

Also opportunistically touch up some of the code formatting for
a third nonfunctional change.

Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
13 months agodocs: kvm: riscv: document EBUSY in KVM_SET_ONE_REG
Daniel Henrique Barboza [Thu, 3 Aug 2023 16:33:02 +0000 (13:33 -0300)]
docs: kvm: riscv: document EBUSY in KVM_SET_ONE_REG

The EBUSY errno is being used for KVM_SET_ONE_REG as a way to tell
userspace that a given reg can't be changed after the vcpu started.

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
13 months agoRISC-V: KVM: Improve vector save/restore errors
Andrew Jones [Thu, 3 Aug 2023 16:33:01 +0000 (13:33 -0300)]
RISC-V: KVM: Improve vector save/restore errors

kvm_riscv_vcpu_(get/set)_reg_vector() now returns ENOENT if V is not
available, EINVAL if reg type is not of VECTOR type, and any error that
might be thrown by kvm_riscv_vcpu_vreg_addr().

Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
13 months agoRISC-V: KVM: avoid EBUSY when writing the same isa_ext val
Daniel Henrique Barboza [Thu, 3 Aug 2023 16:33:00 +0000 (13:33 -0300)]
RISC-V: KVM: avoid EBUSY when writing the same isa_ext val

riscv_vcpu_set_isa_ext_single() will prevent any write of isa_ext regs
if the vcpu already started spinning.

But if there's no extension state (enabled/disabled) made by the
userspace, there's no need to -EBUSY out - we can treat the operation as
a no-op.

zicbom/zicboz_block_size, ISA config reg and mvendorid/march/mimpid
already works in a more permissive manner w.r.t userspace writes being a
no-op, so let's do the same with isa_ext writes.

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
13 months agoRISC-V: KVM: avoid EBUSY when writing the same machine ID val
Daniel Henrique Barboza [Thu, 3 Aug 2023 16:32:59 +0000 (13:32 -0300)]
RISC-V: KVM: avoid EBUSY when writing the same machine ID val

Right now we do not allow any write in mvendorid/marchid/mimpid if the
vcpu already started, preventing these regs to be changed.

However, if userspace doesn't change them, an alternative is to consider
the reg write a no-op and avoid erroring out altogether. Userpace can
then be oblivious about KVM internals if no changes were intended in the
first place.

Allow the same form of 'lazy writing' that registers such as
zicbom/zicboz_block_size supports: avoid erroring out if userspace makes
no changes in mvendorid/marchid/mimpid during reg write.

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
13 months agoRISC-V: KVM: avoid EBUSY when writing same ISA val
Daniel Henrique Barboza [Thu, 3 Aug 2023 16:32:58 +0000 (13:32 -0300)]
RISC-V: KVM: avoid EBUSY when writing same ISA val

kvm_riscv_vcpu_set_reg_config() will return -EBUSY if the ISA config reg
is being written after the VCPU ran at least once.

The same restriction isn't placed in kvm_riscv_vcpu_get_reg_config(), so
there's a chance that we'll -EBUSY out on an ISA config reg write even
if the userspace intended no changes to it.

We'll allow the same form of 'lazy writing' that registers such as
zicbom/zicboz_block_size supports: avoid erroring out if userspace made
no changes to the ISA config reg.

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
13 months agoRISC-V: KVM: use EBUSY when !vcpu->arch.ran_atleast_once
Daniel Henrique Barboza [Thu, 3 Aug 2023 16:32:57 +0000 (13:32 -0300)]
RISC-V: KVM: use EBUSY when !vcpu->arch.ran_atleast_once

vcpu_set_reg_config() and vcpu_set_reg_isa_ext() is throwing an
EOPNOTSUPP error when !vcpu->arch.ran_atleast_once. In similar cases
we're throwing an EBUSY error, like in mvendorid/marchid/mimpid
set_reg().

EOPNOTSUPP has a conotation of finality. EBUSY is more adequate in this
case since its a condition/error related to the vcpu lifecycle.

Change these EOPNOTSUPP instances to EBUSY.

Suggested-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
13 months agoRISC-V: KVM: do not EOPNOTSUPP in set KVM_REG_RISCV_TIMER_REG
Daniel Henrique Barboza [Thu, 3 Aug 2023 16:32:56 +0000 (13:32 -0300)]
RISC-V: KVM: do not EOPNOTSUPP in set KVM_REG_RISCV_TIMER_REG

The KVM_REG_RISCV_TIMER_REG can be read via get_one_reg(). But trying to
write anything in this reg via set_one_reg() results in an EOPNOTSUPP.

Change the API to behave like cbom_block_size: instead of always
erroring out with EOPNOTSUPP, allow userspace to write the same value
(riscv_timebase) back, throwing an EINVAL if a different value is
attempted.

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
13 months agoRISC-V: KVM: do not EOPNOTSUPP in set_one_reg() zicbo(m|z)
Daniel Henrique Barboza [Thu, 3 Aug 2023 16:32:55 +0000 (13:32 -0300)]
RISC-V: KVM: do not EOPNOTSUPP in set_one_reg() zicbo(m|z)

zicbom_block_size and zicboz_block_size have a peculiar API: they can be
read via get_one_reg() but any write will return a EOPNOTSUPP.

It makes sense to return a 'not supported' error since both values can't
be changed, but as far as userspace goes they're regs that are throwing
the same EOPNOTSUPP error even if they were read beforehand via
get_one_reg(), even if the same  read value is being written back.
EOPNOTSUPP is also returned even if ZICBOM/ZICBOZ aren't enabled in the
host.

Change both to work more like their counterparts in get_one_reg() and
return -ENOENT if their respective extensions aren't available. After
that, check if the userspace is written a valid value (i.e. the host
value). Throw an -EINVAL if that's not case, let it slide otherwise.

This allows both regs to be read/written by userspace in a 'lazy'
manner, as long as the userspace doesn't change the reg vals.

Suggested-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
13 months agoRISC-V: KVM: use ENOENT in *_one_reg() when extension is unavailable
Daniel Henrique Barboza [Thu, 3 Aug 2023 16:32:54 +0000 (13:32 -0300)]
RISC-V: KVM: use ENOENT in *_one_reg() when extension is unavailable

Following a similar logic as the previous patch let's minimize the EINVAL
usage in *_one_reg() APIs by using ENOENT when an extension that
implements the reg is not available.

For consistency we're also replacing an EOPNOTSUPP instance that should
be an ENOENT since it's an "extension is not available" error.

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
13 months agoRISC-V: KVM: return ENOENT in *_one_reg() when reg is unknown
Daniel Henrique Barboza [Thu, 3 Aug 2023 16:32:53 +0000 (13:32 -0300)]
RISC-V: KVM: return ENOENT in *_one_reg() when reg is unknown

get_one_reg() and set_one_reg() are returning EINVAL errors for almost
everything: if a reg doesn't exist, if a reg ID is malformatted, if the
associated CPU extension that implements the reg isn't present in the
host, and for set_one_reg() if the value being written is invalid.

This isn't wrong according to the existing KVM API docs (EINVAL can be
used when there's no such register) but adding more ENOENT instances
will make easier for userspace to understand what went wrong.

Existing userspaces can be affected by this error code change. We
checked a few. As of current upstream code, crosvm doesn't check for any
particular errno code when using kvm_(get|set)_one_reg(). Neither does
QEMU. rust-vmm doesn't have kvm-riscv support yet. Thus we have a good
chance of changing these error codes now while the KVM RISC-V ecosystem
is still new, minimizing user impact.

Change all get_one_reg() and set_one_reg() implementations to return
-ENOENT at all "no such register" cases.

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
13 months agoRISC-V: KVM: provide UAPI for host SATP mode
Daniel Henrique Barboza [Fri, 28 Jul 2023 21:01:22 +0000 (18:01 -0300)]
RISC-V: KVM: provide UAPI for host SATP mode

KVM userspaces need to be aware of the host SATP to allow them to
advertise it back to the guest OS.

Since this information is used to build the guest FDT we can't wait for
the SATP reg to be readable. We just need to read the SATP mode, thus
we can use the existing 'satp_mode' global that represents the SATP reg
with MODE set and both ASID and PPN cleared. E.g. for a 32 bit host
running with sv32 satp_mode is 0x80000000, for a 64 bit host running
sv57 satp_mode is 0xa000000000000000, and so on.

Add a new userspace virtual config register 'satp_mode' to allow
userspace to read the current SATP mode the host is using with
GET_ONE_REG API before spinning the vcpu.

'satp_mode' can't be changed via KVM, so SET_ONE_REG is allowed as long
as userspace writes the existing 'satp_mode'.

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
13 months agoRISC-V: KVM: Sort ISA extensions alphabetically in ONE_REG interface
Anup Patel [Wed, 12 Jul 2023 10:03:12 +0000 (15:33 +0530)]
RISC-V: KVM: Sort ISA extensions alphabetically in ONE_REG interface

Let us sort isa extensions alphabetically in kvm_isa_ext_arr[] and
kvm_riscv_vcpu_isa_disable_allowed() so that future insertions are
more predictable.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
13 months agoRISC-V: KVM: Allow Zicntr, Zicsr, Zifencei, and Zihpm for Guest/VM
Anup Patel [Wed, 12 Jul 2023 07:23:50 +0000 (12:53 +0530)]
RISC-V: KVM: Allow Zicntr, Zicsr, Zifencei, and Zihpm for Guest/VM

We extend the KVM ISA extension ONE_REG interface to allow KVM
user space to detect and enable Zicntr, Zicsr, Zifencei, and Zihpm
extensions for Guest/VM.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
13 months agoRISC-V: KVM: Allow Zba and Zbs extensions for Guest/VM
Anup Patel [Wed, 12 Jul 2023 07:08:11 +0000 (12:38 +0530)]
RISC-V: KVM: Allow Zba and Zbs extensions for Guest/VM

We extend the KVM ISA extension ONE_REG interface to allow KVM
user space to detect and enable Zba and Zbs extensions for Guest/VM.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
13 months agoRISC-V: KVM: Extend ONE_REG to enable/disable multiple ISA extensions
Anup Patel [Tue, 11 Jul 2023 16:41:16 +0000 (22:11 +0530)]
RISC-V: KVM: Extend ONE_REG to enable/disable multiple ISA extensions

Currently, the ISA extension ONE_REG interface only allows enabling or
disabling one extension at a time. To improve this, we extend the ISA
extension ONE_REG interface (similar to SBI extension ONE_REG interface)
so that KVM user space can enable/disable multiple extensions in one
ioctl.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
13 months agoRISC-V: KVM: Factor-out ONE_REG related code to its own source file
Anup Patel [Tue, 11 Jul 2023 12:10:47 +0000 (17:40 +0530)]
RISC-V: KVM: Factor-out ONE_REG related code to its own source file

The VCPU ONE_REG interface has grown over time and it will continue
to grow with new ISA extensions and other features. Let us move all
ONE_REG related code to its own source file so that vcpu.c only
focuses only on high-level VCPU functions.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
13 months agoLinux 6.5-rc5
Linus Torvalds [Sun, 6 Aug 2023 22:07:51 +0000 (15:07 -0700)]
Linux 6.5-rc5

13 months agoMerge tag 'v6.5-rc5.vfs.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Linus Torvalds [Sun, 6 Aug 2023 17:43:52 +0000 (10:43 -0700)]
Merge tag 'v6.5-rc5.vfs.fixes' of git://git./linux/kernel/git/vfs/vfs

Pull vfs fixes from Christian Brauner:

 - Fix a wrong check for O_TMPFILE during RESOLVE_CACHED lookup

 - Clean up directory iterators and clarify file_needs_f_pos_lock()

* tag 'v6.5-rc5.vfs.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  fs: rely on ->iterate_shared to determine f_pos locking
  vfs: get rid of old '->iterate' directory operation
  proc: fix missing conversion to 'iterate_shared'
  open: make RESOLVE_CACHED correctly test for O_TMPFILE

13 months agofs: rely on ->iterate_shared to determine f_pos locking
Christian Brauner [Sun, 6 Aug 2023 12:49:35 +0000 (14:49 +0200)]
fs: rely on ->iterate_shared to determine f_pos locking

Now that we removed ->iterate we don't need to check for either
->iterate or ->iterate_shared in file_needs_f_pos_lock(). Simply check
for ->iterate_shared instead. This will tell us whether we need to
unconditionally take the lock. Not just does it allow us to avoid
checking f_inode's mode it also actually clearly shows that we're
locking because of readdir.

Signed-off-by: Christian Brauner <brauner@kernel.org>
13 months agovfs: get rid of old '->iterate' directory operation
Linus Torvalds [Sat, 5 Aug 2023 19:25:01 +0000 (12:25 -0700)]
vfs: get rid of old '->iterate' directory operation

All users now just use '->iterate_shared()', which only takes the
directory inode lock for reading.

Filesystems that never got convered to shared mode now instead use a
wrapper that drops the lock, re-takes it in write mode, calls the old
function, and then downgrades the lock back to read mode.

This way the VFS layer and other callers no longer need to care about
filesystems that never got converted to the modern era.

The filesystems that use the new wrapper are ceph, coda, exfat, jfs,
ntfs, ocfs2, overlayfs, and vboxsf.

Honestly, several of them look like they really could just iterate their
directories in shared mode and skip the wrapper entirely, but the point
of this change is to not change semantics or fix filesystems that
haven't been fixed in the last 7+ years, but to finally get rid of the
dual iterators.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
13 months agoproc: fix missing conversion to 'iterate_shared'
Linus Torvalds [Sat, 5 Aug 2023 17:49:31 +0000 (10:49 -0700)]
proc: fix missing conversion to 'iterate_shared'

I'm looking at the directory handling due to the discussion about f_pos
locking (see commit 797964253d35: "file: reinstate f_pos locking
optimization for regular files"), and wanting to clean that up.

And one source of ugliness is how we were supposed to move filesystems
over to the '->iterate_shared()' function that only takes the inode lock
for reading many many years ago, but several filesystems still use the
bad old '->iterate()' that takes the inode lock for exclusive access.

See commit 6192269444eb ("introduce a parallel variant of ->iterate()")
that also added some documentation stating

      Old method is only used if the new one is absent; eventually it will
      be removed.  Switch while you still can; the old one won't stay.

and that was back in April 2016.  Here we are, many years later, and the
old version is still clearly sadly alive and well.

Now, some of those old style iterators are probably just because the
filesystem may end up having per-inode mutable data that it uses for
iterating a directory, but at least one case is just a mistake.

Al switched over most filesystems to use '->iterate_shared()' back when
it was introduced.  In particular, the /proc filesystem was converted as
one of the first ones in commit f50752eaa0b0 ("switch all procfs
directories ->iterate_shared()").

But then later one new user of '->iterate()' was then re-introduced by
commit 6d9c939dbe4d ("procfs: add smack subdir to attrs").

And that's clearly not what we wanted, since that new case just uses the
same 'proc_pident_readdir()' and 'proc_pident_lookup()' helper functions
that other /proc pident directories use, and they are most definitely
safe to use with the inode lock held shared.

So just fix it.

This still leaves a fair number of oddball filesystems using the
old-style directory iterator (ceph, coda, exfat, jfs, ntfs, ocfs2,
overlayfs, and vboxsf), but at least we don't have any remaining in the
core filesystems.

I'm going to add a wrapper function that just drops the read-lock and
takes it as a write lock, so that we can clean up the core vfs layer and
make all the ugly 'this filesystem needs exclusive inode locking' be
just filesystem-internal warts.

I just didn't want to make that conversion when we still had a core user
left.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
13 months agoopen: make RESOLVE_CACHED correctly test for O_TMPFILE
Aleksa Sarai [Sat, 5 Aug 2023 16:11:58 +0000 (02:11 +1000)]
open: make RESOLVE_CACHED correctly test for O_TMPFILE

O_TMPFILE is actually __O_TMPFILE|O_DIRECTORY. This means that the old
fast-path check for RESOLVE_CACHED would reject all users passing
O_DIRECTORY with -EAGAIN, when in fact the intended test was to check
for __O_TMPFILE.

Cc: stable@vger.kernel.org # v5.12+
Fixes: 99668f618062 ("fs: expose LOOKUP_CACHED through openat2() RESOLVE_CACHED")
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Message-Id: <20230806-resolve_cached-o_tmpfile-v1-1-7ba16308465e@cyphar.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
13 months agoMerge tag 'rust-fixes-6.5-rc5' of https://github.com/Rust-for-Linux/linux
Linus Torvalds [Sun, 6 Aug 2023 02:28:02 +0000 (19:28 -0700)]
Merge tag 'rust-fixes-6.5-rc5' of https://github.com/Rust-for-Linux/linux

Pull rust fixes from Miguel Ojeda:

 - Allocator: prevent mis-aligned allocation

 - Types: delete 'ForeignOwnable::borrow_mut'. A sound replacement is
   planned for the merge window

 - Build: fix bindgen error with UBSAN_BOUNDS_STRICT

* tag 'rust-fixes-6.5-rc5' of https://github.com/Rust-for-Linux/linux:
  rust: fix bindgen build error with UBSAN_BOUNDS_STRICT
  rust: delete `ForeignOwnable::borrow_mut`
  rust: allocator: Prevent mis-aligned allocation

13 months agoMerge tag 'ata-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal...
Linus Torvalds [Sun, 6 Aug 2023 01:45:18 +0000 (18:45 -0700)]
Merge tag 'ata-6.5-rc5' of git://git./linux/kernel/git/dlemoal/libata

Pull ata fix from Damien Le Moal:

 - Prevent the scsi disk driver from issuing a START STOP UNIT command
   for ATA devices during system resume as this causes various issues
   reported by multiple users.

* tag 'ata-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
  ata,scsi: do not issue START STOP UNIT on resume

13 months agoMerge tag '6.5-rc4-smb3-client-fix' of git://git.samba.org/sfrench/cifs-2.6
Linus Torvalds [Sat, 5 Aug 2023 20:44:06 +0000 (13:44 -0700)]
Merge tag '6.5-rc4-smb3-client-fix' of git://git.samba.org/sfrench/cifs-2.6

Pull smb client fix from Steve French:

 - Fix DFS interlink problem (different namespace)

* tag '6.5-rc4-smb3-client-fix' of git://git.samba.org/sfrench/cifs-2.6:
  smb: client: fix dfs link mount against w2k8

13 months agoMerge tag 'powerpc-6.5-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc...
Linus Torvalds [Sat, 5 Aug 2023 20:16:17 +0000 (13:16 -0700)]
Merge tag 'powerpc-6.5-5' of git://git./linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:

 - Fix vmemmap altmap boundary check which could cause memory hotunplug
   failure

 - Create a dummy stackframe to fix ftrace stack unwind

 - Fix secondary thread bringup for Book3E ELFv2 kernels

 - Use early_ioremap/unmap() in via_calibrate_decr()

Thanks to Aneesh Kumar K.V, Benjamin Gray, Christophe Leroy, David
Hildenbrand, and Naveen N Rao.

* tag 'powerpc-6.5-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/powermac: Use early_* IO variants in via_calibrate_decr()
  powerpc/64e: Fix secondary thread bringup for ELFv2 kernels
  powerpc/ftrace: Create a dummy stackframe to fix stack unwind
  powerpc/mm/altmap: Fix altmap boundary check

13 months agoMerge tag 'parisc-for-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/delle...
Linus Torvalds [Sat, 5 Aug 2023 20:09:05 +0000 (13:09 -0700)]
Merge tag 'parisc-for-6.5-rc5' of git://git./linux/kernel/git/deller/parisc-linux

Pull parisc architecture fixes from Helge Deller:

 - early fixmap preallocation to fix boot failures on kernel >= 6.4

 - remove DMA leftover code in parport_gsc

 - drop old comments and code style fixes

* tag 'parisc-for-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: unaligned: Add required spaces after ','
  parport: gsc: remove DMA leftover code
  parisc: pci-dma: remove unused and dead EISA code and comment
  parisc/mm: preallocate fixmap page tables at init

13 months agoMerge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 5 Aug 2023 02:35:09 +0000 (19:35 -0700)]
Merge tag 'clk-fixes-for-linus' of git://git./linux/kernel/git/clk/linux

Pull clk fixes from Stephen Boyd:
 "A few clk driver fixes for some SoC clk drivers:

   - Change a usleep() to udelay() to avoid scheduling while atomic in
     the Amlogic PLL code

   - Revert a patch to the Mediatek MT8183 driver that caused an
     out-of-bounds write

   - Return the right error value when devm_of_iomap() fails in
     imx93_clocks_probe()

   - Constrain the Kconfig for the fixed mmio clk so that it depends on
     HAS_IOMEM and can't be compiled on architectures such as s390"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: fixed-mmio: make COMMON_CLK_FIXED_MMIO depend on HAS_IOMEM
  clk: imx93: Propagate correct error in imx93_clocks_probe()
  clk: mediatek: mt8183: Add back SSPM related clocks
  clk: meson: change usleep_range() to udelay() for atomic context

13 months agoMerge tag 'hyperv-fixes-signed-20230804' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 5 Aug 2023 00:16:14 +0000 (17:16 -0700)]
Merge tag 'hyperv-fixes-signed-20230804' of git://git./linux/kernel/git/hyperv/linux

Pull hyperv fixes from Wei Liu:

 - Fix a bug in a python script for Hyper-V (Ani Sinha)

 - Workaround a bug in Hyper-V when IBT is enabled (Michael Kelley)

 - Fix an issue parsing MP table when Linux runs in VTL2 (Saurabh
   Sengar)

 - Several cleanup patches (Nischala Yelchuri, Kameron Carr, YueHaibing,
   ZhiHu)

* tag 'hyperv-fixes-signed-20230804' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  Drivers: hv: vmbus: Remove unused extern declaration vmbus_ontimer()
  x86/hyperv: add noop functions to x86_init mpparse functions
  vmbus_testing: fix wrong python syntax for integer value comparison
  x86/hyperv: fix a warning in mshyperv.h
  x86/hyperv: Disable IBT when hypercall page lacks ENDBR instruction
  x86/hyperv: Improve code for referencing hyperv_pcpu_input_arg
  Drivers: hv: Change hv_free_hyperv_page() to take void * argument

13 months agoMerge tag 'riscv-for-linus-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 4 Aug 2023 23:04:37 +0000 (16:04 -0700)]
Merge tag 'riscv-for-linus-6.5-rc5' of git://git./linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:

 - A pair of fixes for build-related failures in the selftests

 - A fix for a sparse warning in acpi_os_ioremap()

 - A fix to restore the kernel PA offset in vmcoreinfo, to fix crash
   handling

* tag 'riscv-for-linus-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  Documentation: kdump: Add va_kernel_pa_offset for RISCV64
  riscv: Export va_kernel_pa_offset in vmcoreinfo
  RISC-V: ACPI: Fix acpi_os_ioremap to return iomem address
  selftests: riscv: Fix compilation error with vstate_exec_nolibc.c
  selftests/riscv: fix potential build failure during the "emit_tests" step

13 months agoMerge tag 'pm-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Linus Torvalds [Fri, 4 Aug 2023 22:54:03 +0000 (15:54 -0700)]
Merge tag 'pm-6.5-rc5' of git://git./linux/kernel/git/rafael/linux-pm

Pull power management fix from Rafael Wysocki:
 "Fix a sparse warning triggered by the TPMI interface recently added to
  the Intel RAPL power capping driver (Zhang Rui)"

* tag 'pm-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  powercap: intel_rapl: Fix a sparse warning in TPMI interface

13 months agoMerge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Linus Torvalds [Fri, 4 Aug 2023 19:11:40 +0000 (12:11 -0700)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:
 "More SVE/SME fixes for ptrace() and for the (potentially future) case
  where SME is implemented in hardware without SVE support"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64/fpsimd: Sync and zero pad FPSIMD state for streaming SVE
  arm64/fpsimd: Sync FPSIMD state with SVE for SME only systems
  arm64/ptrace: Don't enable SVE when setting streaming SVE
  arm64/ptrace: Flush FP state when setting ZT0
  arm64/fpsimd: Clear SME state in the target task when setting the VL

13 months agoMerge tag 'mtd/fixes-for-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 4 Aug 2023 19:01:26 +0000 (12:01 -0700)]
Merge tag 'mtd/fixes-for-6.5-rc5' of git://git./linux/kernel/git/mtd/linux

Pull mtd fixes from Miquel Raynal:
 "Raw NAND fixes:
   - fsl_upm: Fix an off-by one test in fun_exec_op()
   - Rockchip:
       - Align hwecc vs. raw page helper layouts
       - Fix oobfree offset and description
   - Meson: Fix OOB available bytes for ECC
   - Omap ELM: Fix incorrect type in assignment

  SPI-NOR fix:
   - Avoid holes in struct spi_mem_op

  Hyperbus fix:
   - Add Tudor as reviewer in MAINTAINERS

  SPI-NAND fixes:
   - Winbond and Toshiba: Fix ecc_get_status"

* tag 'mtd/fixes-for-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
  mtd: rawnand: fsl_upm: Fix an off-by one test in fun_exec_op()
  mtd: spi-nor: avoid holes in struct spi_mem_op
  MAINTAINERS: Add myself as reviewer for HYPERBUS
  mtd: rawnand: rockchip: Align hwecc vs. raw page helper layouts
  mtd: rawnand: rockchip: fix oobfree offset and description
  mtd: rawnand: meson: fix OOB available bytes for ECC
  mtd: rawnand: omap_elm: Fix incorrect type in assignment
  mtd: spinand: winbond: Fix ecc_get_status
  mtd: spinand: toshiba: Fix ecc_get_status

13 months agoMerge tag 'drm-fixes-2023-08-04' of git://anongit.freedesktop.org/drm/drm
Linus Torvalds [Fri, 4 Aug 2023 18:50:22 +0000 (11:50 -0700)]
Merge tag 'drm-fixes-2023-08-04' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Small set of fixes this week, i915 and a few misc ones. I didn't see
  an amd pull so maybe next week it'll have a few more on that driver.

  ttm:
   - NULL ptr deref fix

  panel:
   - add missing MODULE_DEVICE_TABLE

  imx/ipuv3:
   - timing fix

  i915:
   - Fix bug in getting msg length in AUX CH registers handler
   - Gen12 AUX invalidation fixes
   - Fix premature release of request's reusable memory"

* tag 'drm-fixes-2023-08-04' of git://anongit.freedesktop.org/drm/drm:
  drm/panel: samsung-s6d7aa0: Add MODULE_DEVICE_TABLE
  drm/i915: Fix premature release of request's reusable memory
  drm/i915/gt: Support aux invalidation on all engines
  drm/i915/gt: Poll aux invalidation register bit on invalidation
  drm/i915/gt: Enable the CCS_FLUSH bit in the pipe control and in the CS
  drm/i915/gt: Rename flags with bit_group_X according to the datasheet
  drm/i915/gt: Ensure memory quiesced before invalidation
  drm/i915: Add the gen12_needs_ccs_aux_inv helper
  drm/i915/gt: Cleanup aux invalidation registers
  drm/i915/gvt: Fix bug in getting msg length in AUX CH registers handler
  drm/imx/ipuv3: Fix front porch adjustment upon hactive aligning
  drm/ttm: check null pointer before accessing when swapping

13 months agoMerge tag 'ceph-for-6.5-rc5' of https://github.com/ceph/ceph-client
Linus Torvalds [Fri, 4 Aug 2023 18:29:38 +0000 (11:29 -0700)]
Merge tag 'ceph-for-6.5-rc5' of https://github.com/ceph/ceph-client

Pull ceph fixes from Ilya Dryomov:
 "Two patches to improve RBD exclusive lock interaction with
  osd_request_timeout option and another fix to reduce the potential for
  erroneous blocklisting -- this time in CephFS. All going to stable"

* tag 'ceph-for-6.5-rc5' of https://github.com/ceph/ceph-client:
  libceph: fix potential hang in ceph_osdc_notify()
  rbd: prevent busy loop when requesting exclusive lock
  ceph: defer stopping mdsc delayed_work

13 months agofile: reinstate f_pos locking optimization for regular files
Linus Torvalds [Thu, 3 Aug 2023 18:35:53 +0000 (11:35 -0700)]
file: reinstate f_pos locking optimization for regular files

In commit 20ea1e7d13c1 ("file: always lock position for
FMODE_ATOMIC_POS") we ended up always taking the file pos lock, because
pidfd_getfd() could get a reference to the file even when it didn't have
an elevated file count due to threading of other sharing cases.

But Mateusz Guzik reports that the extra locking is actually measurable,
so let's re-introduce the optimization, and only force the locking for
directory traversal.

Directories need the lock for correctness reasons, while regular files
only need it for "POSIX semantics".  Since pidfd_getfd() is about
debuggers etc special things that are _way_ outside of POSIX, we can
relax the rules for that case.

Reported-by: Mateusz Guzik <mjguzik@gmail.com>
Cc: Christian Brauner <brauner@kernel.org>
Link: https://lore.kernel.org/linux-fsdevel/20230803095311.ijpvhx3fyrbkasul@f/
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
14 months agoarm64/fpsimd: Sync and zero pad FPSIMD state for streaming SVE
Mark Brown [Thu, 3 Aug 2023 18:33:23 +0000 (19:33 +0100)]
arm64/fpsimd: Sync and zero pad FPSIMD state for streaming SVE

We have a function sve_sync_from_fpsimd_zeropad() which is used by the
ptrace code to update the SVE state when the user writes to the the
FPSIMD register set.  Currently this checks that the task has SVE
enabled but this will miss updates for tasks which have streaming SVE
enabled if SVE has not been enabled for the thread, also do the
conversion if the task has streaming SVE enabled.

Fixes: e12310a0d30f ("arm64/sme: Implement ptrace support for streaming mode SVE registers")
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230803-arm64-fix-ptrace-ssve-no-sve-v1-3-49df214bfb3e@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
14 months agoarm64/fpsimd: Sync FPSIMD state with SVE for SME only systems
Mark Brown [Thu, 3 Aug 2023 18:33:22 +0000 (19:33 +0100)]
arm64/fpsimd: Sync FPSIMD state with SVE for SME only systems

Currently we guard FPSIMD/SVE state conversions with a check for the system
supporting SVE but SME only systems may need to sync streaming mode SVE
state so add a check for SME support too.  These functions are only used
by the ptrace code.

Fixes: e12310a0d30f ("arm64/sme: Implement ptrace support for streaming mode SVE registers")
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230803-arm64-fix-ptrace-ssve-no-sve-v1-2-49df214bfb3e@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
14 months agoarm64/ptrace: Don't enable SVE when setting streaming SVE
Mark Brown [Thu, 3 Aug 2023 18:33:21 +0000 (19:33 +0100)]
arm64/ptrace: Don't enable SVE when setting streaming SVE

Systems which implement SME without also implementing SVE are
architecturally valid but were not initially supported by the kernel,
unfortunately we missed one issue in the ptrace code.

The SVE register setting code is shared between SVE and streaming mode
SVE. When we set full SVE register state we currently enable TIF_SVE
unconditionally, in the case where streaming SVE is being configured on a
system that supports vanilla SVE this is not an issue since we always
initialise enough state for both vector lengths but on a system which only
support SME it will result in us attempting to restore the SVE vector
length after having set streaming SVE registers.

Fix this by making the enabling of SVE conditional on setting SVE vector
state. If we set streaming SVE state and SVE was not already enabled this
will result in a SVE access trap on next use of normal SVE, this will cause
us to flush our register state but this is fine since the only way to
trigger a SVE access trap would be to exit streaming mode which will cause
the in register state to be flushed anyway.

Fixes: e12310a0d30f ("arm64/sme: Implement ptrace support for streaming mode SVE registers")
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230803-arm64-fix-ptrace-ssve-no-sve-v1-1-49df214bfb3e@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
14 months agorust: fix bindgen build error with UBSAN_BOUNDS_STRICT
Andrea Righi [Tue, 11 Jul 2023 07:19:14 +0000 (09:19 +0200)]
rust: fix bindgen build error with UBSAN_BOUNDS_STRICT

With commit 2d47c6956ab3 ("ubsan: Tighten UBSAN_BOUNDS on GCC") if
CONFIG_UBSAN is enabled and gcc supports -fsanitize=bounds-strict, we
can trigger the following build error due to bindgen lacking support for
this additional build option:

   BINDGEN rust/bindings/bindings_generated.rs
 error: unsupported argument 'bounds-strict' to option '-fsanitize='

Fix by adding -fsanitize=bounds-strict to the list of skipped gcc flags
for bindgen.

Fixes: 2d47c6956ab3 ("ubsan: Tighten UBSAN_BOUNDS on GCC")
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Acked-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Martin Rodriguez Reboredo <yakoyoku@gmail.com>
Link: https://lore.kernel.org/r/20230711071914.133946-1-andrea.righi@canonical.com
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
14 months agorust: delete `ForeignOwnable::borrow_mut`
Alice Ryhl [Thu, 6 Jul 2023 09:46:15 +0000 (09:46 +0000)]
rust: delete `ForeignOwnable::borrow_mut`

We discovered that the current design of `borrow_mut` is problematic.
This patch removes it until a better solution can be found.

Specifically, the current design gives you access to a `&mut T`, which
lets you change where the `ForeignOwnable` points (e.g., with
`core::mem::swap`). No upcoming user of this API intended to make that
possible, making all of them unsound.

Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Reviewed-by: Gary Guo <gary@garyguo.net>
Reviewed-by: Benno Lossin <benno.lossin@proton.me>
Reviewed-by: Martin Rodriguez Reboredo <yakoyoku@gmail.com>
Fixes: 0fc4424d24a2 ("rust: types: introduce `ForeignOwnable`")
Link: https://lore.kernel.org/r/20230706094615.3080784-1-aliceryhl@google.com
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
14 months agorust: allocator: Prevent mis-aligned allocation
Boqun Feng [Sun, 30 Jul 2023 01:29:02 +0000 (18:29 -0700)]
rust: allocator: Prevent mis-aligned allocation

Currently the rust allocator simply passes the size of the type Layout
to krealloc(), and in theory the alignment requirement from the type
Layout may be larger than the guarantee provided by SLAB, which means
the allocated object is mis-aligned.

Fix this by adjusting the allocation size to the nearest power of two,
which SLAB always guarantees a size-aligned allocation. And because Rust
guarantees that the original size must be a multiple of alignment and
the alignment must be a power of two, then the alignment requirement is
satisfied.

Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Co-developed-by: "Andreas Hindborg (Samsung)" <nmi@metaspace.dk>
Signed-off-by: "Andreas Hindborg (Samsung)" <nmi@metaspace.dk>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Cc: stable@vger.kernel.org # v6.1+
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Fixes: 247b365dc8dc ("rust: add `kernel` crate")
Link: https://github.com/Rust-for-Linux/linux/issues/974
Link: https://lore.kernel.org/r/20230730012905.643822-2-boqun.feng@gmail.com
[ Applied rewording of comment as discussed in the mailing list. ]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
14 months agoMerge tag 'drm-intel-fixes-2023-08-03' of git://anongit.freedesktop.org/drm/drm-intel...
Dave Airlie [Thu, 3 Aug 2023 23:38:36 +0000 (09:38 +1000)]
Merge tag 'drm-intel-fixes-2023-08-03' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

- Fix bug in getting msg length in AUX CH registers handler [gvt] (Yan Zhao)
- Gen12 AUX invalidation fixes [gt] (Andi Shyti, Jonathan Cavitt)
- Fix premature release of request's reusable memory (Janusz Krzysztofik)

- Merge tag 'gvt-fixes-2023-08-02' of https://github.com/intel/gvt-linux into drm-intel-fixes (Tvrtko Ursulin)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZMtkxWGuUKpaRMmo@tursulin-desk
14 months agoMerge tag 'drm-misc-fixes-2023-08-03' of ssh://git.freedesktop.org/git/drm/drm-misc...
Dave Airlie [Thu, 3 Aug 2023 23:26:56 +0000 (09:26 +1000)]
Merge tag 'drm-misc-fixes-2023-08-03' of ssh://git.freedesktop.org/git/drm/drm-misc into drm-fixes

A NULL pointer dereference fix for TTM, a timings fix for imx/ipuv3 and
the addition of a MODULE_DEVICE_TABLE for the samsung-s6d7aa0 panel.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ztfogof2dhtlvjwe73mvd2jp5kbldhkkav7k5culuseqblwpti@qfobohwx3c3j
14 months agoMerge tag 'perf-tools-fixes-for-v6.5-2-2023-08-03' of git://git.kernel.org/pub/scm...
Linus Torvalds [Thu, 3 Aug 2023 22:47:39 +0000 (15:47 -0700)]
Merge tag 'perf-tools-fixes-for-v6.5-2-2023-08-03' of git://git./linux/kernel/git/perf/perf-tools

Pull perf tools fixes from Arnaldo Carvalho de Melo:

 - Fix segfault in the powerpc specific arch_skip_callchain_idx
   function. The patch doing the reference count init/exit that went
   into 6.5 missed this function.

 - Fix regression reading the arm64 PMU cpu slots in sysfs, a patch
   removing some code duplication ended up duplicating the /sysfs prefix
   for these files.

 - Fix grouping of events related to topdown, addressing a regression on
   the CSV output produced by 'perf stat' noticed on the downstream tool
   toplev.

 - Fix the uprobe_from_different_cu 'perf test' entry, it is failing
   when gcc isn't available, so we need to check that and skip the test
   if it is not installed.

* tag 'perf-tools-fixes-for-v6.5-2-2023-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
  perf test parse-events: Test complex name has required event format
  perf pmus: Create placholder regardless of scanning core_only
  perf test uprobe_from_different_cu: Skip if there is no gcc
  perf parse-events: Only move force grouped evsels when sorting
  perf parse-events: When fixing group leaders always set the leader
  perf parse-events: Extra care around force grouped events
  perf callchain powerpc: Fix addr location init during arch_skip_callchain_idx function
  perf pmu arm64: Fix reading the PMU cpu slots in sysfs

14 months agoMerge tag 'cxl-fixes-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Linus Torvalds [Thu, 3 Aug 2023 22:41:48 +0000 (15:41 -0700)]
Merge tag 'cxl-fixes-6.5-rc5' of git://git./linux/kernel/git/cxl/cxl

Pull cxl fixes from Vishal Verma:

 - Fixup the Sanitixe device ABI that was merged for v6.5 to hide some
   sysfs files when the necessary support is missing. Update the ABI
   documentation around this as well.

* tag 'cxl-fixes-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
  cxl/memdev: Only show sanitize sysfs files when supported
  cxl/memdev: Document security state in kern-doc
  cxl/memdev: Improve sanitize ABI descriptions

14 months agoMerge tag 'net-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Thu, 3 Aug 2023 21:00:02 +0000 (14:00 -0700)]
Merge tag 'net-6.5-rc5' of git://git./linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from bpf and wireless.

  Nothing scary here. Feels like the first wave of regressions from v6.5
  is addressed - one outstanding fix still to come in TLS for the
  sendpage rework.

  Current release - regressions:

   - udp: fix __ip_append_data()'s handling of MSG_SPLICE_PAGES

   - dsa: fix older DSA drivers using phylink

  Previous releases - regressions:

   - gro: fix misuse of CB in udp socket lookup

   - mlx5: unregister devlink params in case interface is down

   - Revert "wifi: ath11k: Enable threaded NAPI"

  Previous releases - always broken:

   - sched: cls_u32: fix match key mis-addressing

   - sched: bind logic fixes for cls_fw, cls_u32 and cls_route

   - add bound checks to a number of places which hand-parse netlink

   - bpf: disable preemption in perf_event_output helpers code

   - qed: fix scheduling in a tasklet while getting stats

   - avoid using APIs which are not hardirq-safe in couple of drivers,
     when we may be in a hard IRQ (netconsole)

   - wifi: cfg80211: fix return value in scan logic, avoid page
     allocator warning

   - wifi: mt76: mt7615: do not advertise 5 GHz on first PHY of MT7615D
     (DBDC)

  Misc:

   - drop handful of inactive maintainers, put some new in place"

* tag 'net-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (98 commits)
  MAINTAINERS: update TUN/TAP maintainers
  test/vsock: remove vsock_perf executable on `make clean`
  tcp_metrics: fix data-race in tcpm_suck_dst() vs fastopen
  tcp_metrics: annotate data-races around tm->tcpm_net
  tcp_metrics: annotate data-races around tm->tcpm_vals[]
  tcp_metrics: annotate data-races around tm->tcpm_lock
  tcp_metrics: annotate data-races around tm->tcpm_stamp
  tcp_metrics: fix addr_same() helper
  prestera: fix fallback to previous version on same major version
  udp: Fix __ip_append_data()'s handling of MSG_SPLICE_PAGES
  net/mlx5e: Set proper IPsec source port in L4 selector
  net/mlx5: fs_core: Skip the FTs in the same FS_TYPE_PRIO_CHAINS fs_prio
  net/mlx5: fs_core: Make find_closest_ft more generic
  wifi: brcmfmac: Fix field-spanning write in brcmf_scan_params_v2_to_v1()
  vxlan: Fix nexthop hash size
  ip6mr: Fix skb_under_panic in ip6mr_cache_report()
  s390/qeth: Don't call dev_close/dev_open (DOWN/UP)
  net: tap_open(): set sk_uid from current_fsuid()
  net: tun_chr_open(): set sk_uid from current_fsuid()
  net: dcb: choose correct policy to parse DCB_ATTR_BCN
  ...

14 months agoMAINTAINERS: update TUN/TAP maintainers
Jakub Kicinski [Wed, 2 Aug 2023 18:28:43 +0000 (11:28 -0700)]
MAINTAINERS: update TUN/TAP maintainers

Willem and Jason have agreed to take over the maintainer
duties for TUN/TAP, thank you!

There's an existing entry for TUN/TAP which only covers
the user mode Linux implementation.
Since we haven't heard from Maxim on the list for almost
a decade, extend that entry and take it over, rather than
adding a new one.

Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://lore.kernel.org/r/20230802182843.4193099-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agoMerge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Jakub Kicinski [Thu, 3 Aug 2023 18:22:53 +0000 (11:22 -0700)]
Merge tag 'for-netdev' of https://git./linux/kernel/git/bpf/bpf

Martin KaFai Lau says:

====================
pull-request: bpf 2023-08-03

We've added 5 non-merge commits during the last 7 day(s) which contain
a total of 3 files changed, 37 insertions(+), 20 deletions(-).

The main changes are:

1) Disable preemption in perf_event_output helpers code,
   from Jiri Olsa

2) Add length check for SK_DIAG_BPF_STORAGE_REQ_MAP_FD parsing,
   from Lin Ma

3) Multiple warning splat fixes in cpumap from Hou Tao

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf, cpumap: Handle skb as well when clean up ptr_ring
  bpf, cpumap: Make sure kthread is running before map update returns
  bpf: Add length check for SK_DIAG_BPF_STORAGE_REQ_MAP_FD parsing
  bpf: Disable preemption in bpf_event_output
  bpf: Disable preemption in bpf_perf_event_output
====================

Link: https://lore.kernel.org/r/20230803181429.994607-1-martin.lau@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agoMerge tag 'wireless-2023-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git...
Jakub Kicinski [Thu, 3 Aug 2023 18:05:46 +0000 (11:05 -0700)]
Merge tag 'wireless-2023-08-03' of git://git./linux/kernel/git/wireless/wireless

Kalle Valo says:

====================
wireless fixes for v6.5

We did some house cleaning in MAINTAINERS file so several patches
about that. Few regressions fixed and also fix some recently enabled
memcpy() warnings. Only small commits and nothing special standing
out.

* tag 'wireless-2023-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
  wifi: brcmfmac: Fix field-spanning write in brcmf_scan_params_v2_to_v1()
  wifi: ray_cs: Replace 1-element array with flexible array
  MAINTAINERS: add Jeff as ath10k, ath11k and ath12k maintainer
  MAINTAINERS: wifi: mark mlw8k as orphan
  MAINTAINERS: wifi: mark b43 as orphan
  MAINTAINERS: wifi: mark zd1211rw as orphan
  MAINTAINERS: wifi: mark wl3501 as orphan
  MAINTAINERS: wifi: mark rndis_wlan as orphan
  MAINTAINERS: wifi: mark ar5523 as orphan
  MAINTAINERS: wifi: mark cw1200 as orphan
  MAINTAINERS: wifi: atmel: mark as orphan
  MAINTAINERS: wifi: rtw88: change Ping as the maintainer
  Revert "wifi: ath6k: silence false positive -Wno-dangling-pointer warning on GCC 12"
  wifi: cfg80211: Fix return value in scan logic
  Revert "wifi: ath11k: Enable threaded NAPI"
  MAINTAINERS: Update mwifiex maintainer list
  wifi: mt76: mt7615: do not advertise 5 GHz on first phy of MT7615D (DBDC)
====================

Link: https://lore.kernel.org/r/20230803140058.57476C433C9@smtp.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agotest/vsock: remove vsock_perf executable on `make clean`
Stefano Garzarella [Thu, 3 Aug 2023 08:54:54 +0000 (10:54 +0200)]
test/vsock: remove vsock_perf executable on `make clean`

We forgot to add vsock_perf to the rm command in the `clean`
target, so now we have a left over after `make clean` in
tools/testing/vsock.

Fixes: 8abbffd27ced ("test/vsock: vsock_perf utility")
Cc: AVKrasnov@sberdevices.ru
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Simon Horman <horms@kernel.org> # build-tested
Link: https://lore.kernel.org/r/20230803085454.30897-1-sgarzare@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agoMerge branch 'tcp_metrics-series-of-fixes'
Jakub Kicinski [Thu, 3 Aug 2023 17:58:27 +0000 (10:58 -0700)]
Merge branch 'tcp_metrics-series-of-fixes'

Eric Dumazet says:

====================
tcp_metrics: series of fixes

This series contains a fix for addr_same() and various
data-race annotations.

We still have to address races over tm->tcpm_saddr and
tm->tcpm_daddr later.
====================

Link: https://lore.kernel.org/r/20230802131500.1478140-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agotcp_metrics: fix data-race in tcpm_suck_dst() vs fastopen
Eric Dumazet [Wed, 2 Aug 2023 13:15:00 +0000 (13:15 +0000)]
tcp_metrics: fix data-race in tcpm_suck_dst() vs fastopen

Whenever tcpm_new() reclaims an old entry, tcpm_suck_dst()
would overwrite data that could be read from tcp_fastopen_cache_get()
or tcp_metrics_fill_info().

We need to acquire fastopen_seqlock to maintain consistency.

For newly allocated objects, tcpm_new() can switch to kzalloc()
to avoid an extra fastopen_seqlock acquisition.

Fixes: 1fe4c481ba63 ("net-tcp: Fast Open client - cookie cache")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20230802131500.1478140-7-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agotcp_metrics: annotate data-races around tm->tcpm_net
Eric Dumazet [Wed, 2 Aug 2023 13:14:59 +0000 (13:14 +0000)]
tcp_metrics: annotate data-races around tm->tcpm_net

tm->tcpm_net can be read or written locklessly.

Instead of changing write_pnet() and read_pnet() and potentially
hurt performance, add the needed READ_ONCE()/WRITE_ONCE()
in tm_net() and tcpm_new().

Fixes: 849e8a0ca8d5 ("tcp_metrics: Add a field tcpm_net and verify it matches on lookup")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20230802131500.1478140-6-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agotcp_metrics: annotate data-races around tm->tcpm_vals[]
Eric Dumazet [Wed, 2 Aug 2023 13:14:58 +0000 (13:14 +0000)]
tcp_metrics: annotate data-races around tm->tcpm_vals[]

tm->tcpm_vals[] values can be read or written locklessly.

Add needed READ_ONCE()/WRITE_ONCE() to document this,
and force use of tcp_metric_get() and tcp_metric_set()

Fixes: 51c5d0c4b169 ("tcp: Maintain dynamic metrics in local cache.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agotcp_metrics: annotate data-races around tm->tcpm_lock
Eric Dumazet [Wed, 2 Aug 2023 13:14:57 +0000 (13:14 +0000)]
tcp_metrics: annotate data-races around tm->tcpm_lock

tm->tcpm_lock can be read or written locklessly.

Add needed READ_ONCE()/WRITE_ONCE() to document this.

Fixes: 51c5d0c4b169 ("tcp: Maintain dynamic metrics in local cache.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20230802131500.1478140-4-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agotcp_metrics: annotate data-races around tm->tcpm_stamp
Eric Dumazet [Wed, 2 Aug 2023 13:14:56 +0000 (13:14 +0000)]
tcp_metrics: annotate data-races around tm->tcpm_stamp

tm->tcpm_stamp can be read or written locklessly.

Add needed READ_ONCE()/WRITE_ONCE() to document this.

Also constify tcpm_check_stamp() dst argument.

Fixes: 51c5d0c4b169 ("tcp: Maintain dynamic metrics in local cache.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20230802131500.1478140-3-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agotcp_metrics: fix addr_same() helper
Eric Dumazet [Wed, 2 Aug 2023 13:14:55 +0000 (13:14 +0000)]
tcp_metrics: fix addr_same() helper

Because v4 and v6 families use separate inetpeer trees (respectively
net->ipv4.peers and net->ipv6.peers), inetpeer_addr_cmp(a, b) assumes
a & b share the same family.

tcp_metrics use a common hash table, where entries can have different
families.

We must therefore make sure to not call inetpeer_addr_cmp()
if the families do not match.

Fixes: d39d14ffa24c ("net: Add helper function to compare inetpeer addresses")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20230802131500.1478140-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agoprestera: fix fallback to previous version on same major version
Jonas Gorski [Wed, 2 Aug 2023 09:23:56 +0000 (11:23 +0200)]
prestera: fix fallback to previous version on same major version

When both supported and previous version have the same major version,
and the firmwares are missing, the driver ends in a loop requesting the
same (previous) version over and over again:

    [   76.327413] Prestera DX 0000:01:00.0: missing latest mrvl/prestera/mvsw_prestera_fw-v4.1.img firmware, fall-back to previous 4.0 version
    [   76.339802] Prestera DX 0000:01:00.0: missing latest mrvl/prestera/mvsw_prestera_fw-v4.0.img firmware, fall-back to previous 4.0 version
    [   76.352162] Prestera DX 0000:01:00.0: missing latest mrvl/prestera/mvsw_prestera_fw-v4.0.img firmware, fall-back to previous 4.0 version
    [   76.364502] Prestera DX 0000:01:00.0: missing latest mrvl/prestera/mvsw_prestera_fw-v4.0.img firmware, fall-back to previous 4.0 version
    [   76.376848] Prestera DX 0000:01:00.0: missing latest mrvl/prestera/mvsw_prestera_fw-v4.0.img firmware, fall-back to previous 4.0 version
    [   76.389183] Prestera DX 0000:01:00.0: missing latest mrvl/prestera/mvsw_prestera_fw-v4.0.img firmware, fall-back to previous 4.0 version
    [   76.401522] Prestera DX 0000:01:00.0: missing latest mrvl/prestera/mvsw_prestera_fw-v4.0.img firmware, fall-back to previous 4.0 version
    [   76.413860] Prestera DX 0000:01:00.0: missing latest mrvl/prestera/mvsw_prestera_fw-v4.0.img firmware, fall-back to previous 4.0 version
    [   76.426199] Prestera DX 0000:01:00.0: missing latest mrvl/prestera/mvsw_prestera_fw-v4.0.img firmware, fall-back to previous 4.0 version
    ...

Fix this by inverting the check to that we aren't yet at the previous
version, and also check the minor version.

This also catches the case where both versions are the same, as it was
after commit bb5dbf2cc64d ("net: marvell: prestera: add firmware v4.0
support").

With this fix applied:

    [   88.499622] Prestera DX 0000:01:00.0: missing latest mrvl/prestera/mvsw_prestera_fw-v4.1.img firmware, fall-back to previous 4.0 version
    [   88.511995] Prestera DX 0000:01:00.0: failed to request previous firmware: mrvl/prestera/mvsw_prestera_fw-v4.0.img
    [   88.522403] Prestera DX: probe of 0000:01:00.0 failed with error -2

Fixes: 47f26018a414 ("net: marvell: prestera: try to load previous fw version")
Signed-off-by: Jonas Gorski <jonas.gorski@bisdn.de>
Acked-by: Elad Nachman <enachman@marvell.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Acked-by: Taras Chornyi <taras.chornyi@plvision.eu>
Link: https://lore.kernel.org/r/20230802092357.163944-1-jonas.gorski@bisdn.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agoMerge tag 'nfsd-6.5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Linus Torvalds [Thu, 3 Aug 2023 16:26:34 +0000 (09:26 -0700)]
Merge tag 'nfsd-6.5-3' of git://git./linux/kernel/git/cel/linux

Pull nfsd fix from Chuck Lever:

 - Fix tmpfs splice read support

* tag 'nfsd-6.5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  nfsd: Fix reading via splice

14 months agoMerge tag 'erofs-for-6.5-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Thu, 3 Aug 2023 16:20:50 +0000 (09:20 -0700)]
Merge tag 'erofs-for-6.5-rc5-fixes' of git://git./linux/kernel/git/xiang/erofs

Pull erofs fixes from Gao Xiang:

 - Fix data corruption caused by insufficient decompression on
   deduplicated compressed extents

 - Drop a useless s_magic checking in erofs_kill_sb()

* tag 'erofs-for-6.5-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
  erofs: drop unnecessary WARN_ON() in erofs_kill_sb()
  erofs: fix wrong primary bvec selection on deduplicated extents

14 months agoMerge tag 's390-6.5-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Linus Torvalds [Thu, 3 Aug 2023 16:06:38 +0000 (09:06 -0700)]
Merge tag 's390-6.5-4' of git://git./linux/kernel/git/s390/linux

Pull s390 fixes from Heiko Carstens:

 - Split kernel large page mappings into 4k mappings in case debug
   pagealloc is enabled again. This got accidentally removed by commit
   bb1520d581a3 ("s390/mm: start kernel with DAT enabled")

 - Fix error handling in KVM's sthyi handling

 - Add missing include to s390's uapi ptrace.h

 - Update defconfigs

* tag 's390-6.5-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/ptrace: add missing linux/const.h include
  KVM: s390: fix sthyi error handling
  s390: update defconfigs
  s390/vmem: split pages when debug pagealloc is enabled

14 months agoarm64/ptrace: Flush FP state when setting ZT0
Mark Brown [Thu, 3 Aug 2023 00:19:06 +0000 (01:19 +0100)]
arm64/ptrace: Flush FP state when setting ZT0

When setting ZT0 via ptrace we do not currently force a reload of the
floating point register state from memory, do that to ensure that the newly
set value gets loaded into the registers on next task execution.

The function was templated off the function for FPSIMD which due to our
providing the option of embedding a FPSIMD regset within the SVE regset
does not directly include the flush.

Fixes: f90b529bcbe5 ("arm64/sme: Implement ZT0 ptrace support")
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230803-arm64-fix-ptrace-zt0-flush-v1-1-72e854eaf96e@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
14 months agoarm64/fpsimd: Clear SME state in the target task when setting the VL
Mark Brown [Wed, 2 Aug 2023 23:46:39 +0000 (00:46 +0100)]
arm64/fpsimd: Clear SME state in the target task when setting the VL

When setting SME vector lengths we clear TIF_SME to reenable SME traps,
doing a reallocation of the backing storage on next use. We do this using
clear_thread_flag() which operates on the current thread, meaning that when
setting the vector length via ptrace we may both not force traps for the
target task and force a spurious flush of any SME state that the tracing
task may have.

Clear the flag in the target task.

Fixes: e12310a0d30f ("arm64/sme: Implement ptrace support for streaming mode SVE registers")
Reported-by: David Spickett <David.Spickett@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230803-arm64-fix-ptrace-tif-sme-v1-1-88312fd6fbfd@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
14 months agoparisc: unaligned: Add required spaces after ','
hanyu001@208suo.com [Thu, 20 Jul 2023 06:40:27 +0000 (14:40 +0800)]
parisc: unaligned: Add required spaces after ','

Fix checkpatch warnings:
unaligned.c:475: ERROR: space required after that ','

Signed-off-by: Yu Han <hanyu001@208suo.com>
Signed-off-by: Helge Deller <deller@gmx.de>
14 months agoparport: gsc: remove DMA leftover code
Arnd Bergmann [Wed, 26 Jul 2023 15:09:14 +0000 (17:09 +0200)]
parport: gsc: remove DMA leftover code

This driver does not actually work with DMA mode, but still tries
to call ISA DMA interface functions that are stubbed out on
parisc, resulting in a W=1 build warning:

drivers/parport/parport_gsc.c: In function 'parport_remove_chip':
drivers/parport/parport_gsc.c:389:20: warning: suggest braces around empty body in an 'if' statement [-Wempty-body]
  389 |    free_dma(p->dma);

Remove the corresponding code as a prerequisite for turning on -Wempty-body
by default in all kernels.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Helge Deller <deller@gmx.de>
14 months agoparisc: pci-dma: remove unused and dead EISA code and comment
Petr Tesarik [Wed, 2 Aug 2023 16:33:21 +0000 (18:33 +0200)]
parisc: pci-dma: remove unused and dead EISA code and comment

Clearly, this code isn't needed, but it gives a false positive when
grepping the complete source tree for coherent_dma_mask.

Signed-off-by: Petr Tesarik <petr.tesarik.ext@huawei.com>
Signed-off-by: Helge Deller <deller@gmx.de>
14 months agoparisc/mm: preallocate fixmap page tables at init
Mike Rapoport (IBM) [Thu, 3 Aug 2023 06:24:04 +0000 (09:24 +0300)]
parisc/mm: preallocate fixmap page tables at init

Christoph Biedl reported early OOM on recent kernels:

    swapper: page allocation failure: order:0, mode:0x100(__GFP_ZERO),
nodemask=(null)
    CPU: 0 PID: 0 Comm: swapper Not tainted 6.3.0-rc4+ #16
    Hardware name: 9000/785/C3600
    Backtrace:
     [<10408594>] show_stack+0x48/0x5c
     [<10e152d8>] dump_stack_lvl+0x48/0x64
     [<10e15318>] dump_stack+0x24/0x34
     [<105cf7f8>] warn_alloc+0x10c/0x1c8
     [<105d068c>] __alloc_pages+0xbbc/0xcf8
     [<105d0e4c>] __get_free_pages+0x28/0x78
     [<105ad10c>] __pte_alloc_kernel+0x30/0x98
     [<10406934>] set_fixmap+0xec/0xf4
     [<10411ad4>] patch_map.constprop.0+0xa8/0xdc
     [<10411bb0>] __patch_text_multiple+0xa8/0x208
     [<10411d78>] patch_text+0x30/0x48
     [<1041246c>] arch_jump_label_transform+0x90/0xcc
     [<1056f734>] jump_label_update+0xd4/0x184
     [<1056fc9c>] static_key_enable_cpuslocked+0xc0/0x110
     [<1056fd08>] static_key_enable+0x1c/0x2c
     [<1011362c>] init_mem_debugging_and_hardening+0xdc/0xf8
     [<1010141c>] start_kernel+0x5f0/0xa98
     [<10105da8>] start_parisc+0xb8/0xe4

    Mem-Info:
    active_anon:0 inactive_anon:0 isolated_anon:0
     active_file:0 inactive_file:0 isolated_file:0
     unevictable:0 dirty:0 writeback:0
     slab_reclaimable:0 slab_unreclaimable:0
     mapped:0 shmem:0 pagetables:0
     sec_pagetables:0 bounce:0
     kernel_misc_reclaimable:0
     free:0 free_pcp:0 free_cma:0
    Node 0 active_anon:0kB inactive_anon:0kB active_file:0kB
inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB
mapped:0kB dirty:0kB writeback:0kB shmem:0kB
+writeback_tmp:0kB kernel_stack:0kB pagetables:0kB sec_pagetables:0kB
all_unreclaimable? no
    Normal free:0kB boost:0kB min:0kB low:0kB high:0kB
reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB active_file:0kB
inactive_file:0kB unevictable:0kB writepending:0kB
+present:1048576kB managed:1039360kB mlocked:0kB bounce:0kB free_pcp:0kB
local_pcp:0kB free_cma:0kB
    lowmem_reserve[]: 0 0
    Normal: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB
0*1024kB 0*2048kB 0*4096kB = 0kB
    0 total pagecache pages
    0 pages in swap cache
    Free swap  = 0kB
    Total swap = 0kB
    262144 pages RAM
    0 pages HighMem/MovableOnly
    2304 pages reserved
    Backtrace:
     [<10411d78>] patch_text+0x30/0x48
     [<1041246c>] arch_jump_label_transform+0x90/0xcc
     [<1056f734>] jump_label_update+0xd4/0x184
     [<1056fc9c>] static_key_enable_cpuslocked+0xc0/0x110
     [<1056fd08>] static_key_enable+0x1c/0x2c
     [<1011362c>] init_mem_debugging_and_hardening+0xdc/0xf8
     [<1010141c>] start_kernel+0x5f0/0xa98
     [<10105da8>] start_parisc+0xb8/0xe4

    Kernel Fault: Code=15 (Data TLB miss fault) at addr 0f7fe3c0
    CPU: 0 PID: 0 Comm: swapper Not tainted 6.3.0-rc4+ #16
    Hardware name: 9000/785/C3600

This happens because patching static key code temporarily maps it via
fixmap and if it happens before page allocator is initialized set_fixmap()
cannot allocate memory using pte_alloc_kernel().

Make sure that fixmap page tables are preallocated early so that
pte_offset_kernel() in set_fixmap() never resorts to pte allocation.

Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Helge Deller <deller@gmx.de>
Tested-by: Christoph Biedl <linux-kernel.bfrz@manchmal.in-ulm.de>
Tested-by: John David Anglin <dave.anglin@bell.net>
Cc: <stable@vger.kernel.org> # v6.4+
14 months agoudp: Fix __ip_append_data()'s handling of MSG_SPLICE_PAGES
David Howells [Tue, 1 Aug 2023 15:48:53 +0000 (16:48 +0100)]
udp: Fix __ip_append_data()'s handling of MSG_SPLICE_PAGES

__ip_append_data() can get into an infinite loop when asked to splice into
a partially-built UDP message that has more than the frag-limit data and up
to the MTU limit.  Something like:

        pipe(pfd);
        sfd = socket(AF_INET, SOCK_DGRAM, 0);
        connect(sfd, ...);
        send(sfd, buffer, 8161, MSG_CONFIRM|MSG_MORE);
        write(pfd[1], buffer, 8);
        splice(pfd[0], 0, sfd, 0, 0x4ffe0ul, 0);

where the amount of data given to send() is dependent on the MTU size (in
this instance an interface with an MTU of 8192).

The problem is that the calculation of the amount to copy in
__ip_append_data() goes negative in two places, and, in the second place,
this gets subtracted from the length remaining, thereby increasing it.

This happens when pagedlen > 0 (which happens for MSG_ZEROCOPY and
MSG_SPLICE_PAGES), because the terms in:

        copy = datalen - transhdrlen - fraggap - pagedlen;

then mostly cancel when pagedlen is substituted for, leaving just -fraggap.
This causes:

        length -= copy + transhdrlen;

to increase the length to more than the amount of data in msg->msg_iter,
which causes skb_splice_from_iter() to be unable to fill the request and it
returns less than 'copied' - which means that length never gets to 0 and we
never exit the loop.

Fix this by:

 (1) Insert a note about the dodgy calculation of 'copy'.

 (2) If MSG_SPLICE_PAGES, clear copy if it is negative from the above
     equation, so that 'offset' isn't regressed and 'length' isn't
     increased, which will mean that length and thus copy should match the
     amount left in the iterator.

 (3) When handling MSG_SPLICE_PAGES, give a warning and return -EIO if
     we're asked to splice more than is in the iterator.  It might be
     better to not give the warning or even just give a 'short' write.

[!] Note that this ought to also affect MSG_ZEROCOPY, but MSG_ZEROCOPY
avoids the problem by simply assuming that everything asked for got copied,
not just the amount that was in the iterator.  This is a potential bug for
the future.

Fixes: 7ac7c987850c ("udp: Convert udp_sendpage() to use MSG_SPLICE_PAGES")
Reported-by: syzbot+f527b971b4bdc8e79f9e@syzkaller.appspotmail.com
Link: https://lore.kernel.org/r/000000000000881d0606004541d1@google.com/
Signed-off-by: David Howells <dhowells@redhat.com>
cc: David Ahern <dsahern@kernel.org>
cc: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/1420063.1690904933@warthog.procyon.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agoMerge branch 'mlx5-ipsec-fixes'
Jakub Kicinski [Thu, 3 Aug 2023 01:38:45 +0000 (18:38 -0700)]
Merge branch 'mlx5-ipsec-fixes'

Leon Romanovsky says:

====================
mlx5 IPsec fixes

The following patches are combination of Jianbo's work on IPsec eswitch mode
together with our internal review toward addition of TCP protocol selectors
support to IPSec packet offload.

Despite not-being fix, the first patch helps us to make second one more
clear, so I'm asking to apply it anyway as part of this series.
====================

Link: https://lore.kernel.org/r/cover.1690803944.git.leonro@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agonet/mlx5e: Set proper IPsec source port in L4 selector
Leon Romanovsky [Mon, 31 Jul 2023 11:58:42 +0000 (14:58 +0300)]
net/mlx5e: Set proper IPsec source port in L4 selector

Fix typo in setup_fte_upper_proto_match() where destination UDP port
was used instead of source port.

Fixes: a7385187a386 ("net/mlx5e: IPsec, support upper protocol selector field offload")
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/ffc024a4d192113103f392b0502688366ca88c1f.1690803944.git.leonro@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agonet/mlx5: fs_core: Skip the FTs in the same FS_TYPE_PRIO_CHAINS fs_prio
Jianbo Liu [Mon, 31 Jul 2023 11:58:41 +0000 (14:58 +0300)]
net/mlx5: fs_core: Skip the FTs in the same FS_TYPE_PRIO_CHAINS fs_prio

In the cited commit, new type of FS_TYPE_PRIO_CHAINS fs_prio was added
to support multiple parallel namespaces for multi-chains. And we skip
all the flow tables under the fs_node of this type unconditionally,
when searching for the next or previous flow table to connect for a
new table.

As this search function is also used for find new root table when the
old one is being deleted, it will skip the entire FS_TYPE_PRIO_CHAINS
fs_node next to the old root. However, new root table should be chosen
from it if there is any table in it. Fix it by skipping only the flow
tables in the same FS_TYPE_PRIO_CHAINS fs_node when finding the
closest FT for a fs_node.

Besides, complete the connecting from FTs of previous priority of prio
because there should be multiple prevs after this fs_prio type is
introduced. And also the next FT should be chosen from the first flow
table next to the prio in the same FS_TYPE_PRIO_CHAINS fs_prio, if
this prio is the first child.

Fixes: 328edb499f99 ("net/mlx5: Split FDB fast path prio to multiple namespaces")
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Paul Blakey <paulb@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/7a95754df479e722038996c97c97b062b372591f.1690803944.git.leonro@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agonet/mlx5: fs_core: Make find_closest_ft more generic
Jianbo Liu [Mon, 31 Jul 2023 11:58:40 +0000 (14:58 +0300)]
net/mlx5: fs_core: Make find_closest_ft more generic

As find_closest_ft_recursive is called to find the closest FT, the
first parameter of find_closest_ft can be changed from fs_prio to
fs_node. Thus this function is extended to find the closest FT for the
nodes of any type, not only prios, but also the sub namespaces.

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/d3962c2b443ec8dde7a740dc742a1f052d5e256c.1690803944.git.leonro@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 months agoMerge tag 'soc-fixes-6.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Linus Torvalds [Thu, 3 Aug 2023 01:21:12 +0000 (18:21 -0700)]
Merge tag 'soc-fixes-6.5-2' of git://git./linux/kernel/git/soc/soc

Pull ARM SoC fixes from Arnd Bergmann:
 "A couple of platforms get a lone dts fix each:

   - SoCFPGA: Fix incorrect I2C property for SCL signal

   - Renesas: Fix interrupt names for MTU3 channels on RZ/G2L and
     RZ/V2L.

   - Juno/Vexpress: remove a dangling symlink

   - at91: sam9x60 SoC detection compatible strings

   - nspire: Fix arm primecell compatible string

  On the NXP i.MX platform, there multiple issues that get addressed:

   - A couple of ARM DTS fixes for i.MX6SLL usbphy and supported CPU
     frequency of sk-imx53 board

   - Add missing pull-up for imx8mn-var-som onboard PHY reset pinmux

   - A couple of imx8mm-venice fixes from Tim Harvey to diable
     disp_blk_ctrl

   - A couple of phycore-imx8mm fixes from Yashwanth Varakala to correct
     VPU label and gpio-line-names

   - Fix imx8mp-blk-ctrl driver to register HSIO PLL clock as
     bus_power_dev child, so that runtime PM can translate into the
     necessary GPC power domain action

  On the driver side, there are two fixes for tegra memory controller
  drivers addressing regressions from the merge window, a couple of
  minor correctness fixes for SCMI and SMCCC firmware, as well as a
  build fix for an lcd backlight driver"

* tag 'soc-fixes-6.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (22 commits)
  backlight: corgi_lcd: fix missing prototype
  memory: tegra: make icc_set_bw return zero if BWMGR not supported
  arm64: dts: renesas: rzg2l: Update overfow/underflow IRQ names for MTU3 channels
  dt-bindings: serial: atmel,at91-usart: update compatible for sam9x60
  ARM: dts: at91: sam9x60: fix the SOC detection
  ARM: dts: nspire: Fix arm primecell compatible string
  firmware: arm_scmi: Fix chan_free cleanup on SMC
  firmware: arm_scmi: Drop OF node reference in the transport channel setup
  soc: imx: imx8mp-blk-ctrl: register HSIO PLL clock as bus_power_dev child
  ARM: dts: nxp/imx: limit sk-imx53 supported frequencies
  firmware: arm_scmi: Fix signed error return values handling
  firmware: smccc: Fix use of uninitialised results structure
  arm64: dts: freescale: Fix VPU G2 clock
  arm64: dts: imx8mn-var-som: add missing pull-up for onboard PHY reset pinmux
  arm64: dts: phycore-imx8mm: Correction in gpio-line-names
  arm64: dts: phycore-imx8mm: Label typo-fix of VPU
  ARM: dts: nxp/imx6sll: fix wrong property name in usbphy node
  arm64: dts: imx8mm-venice-gw7904: disable disp_blk_ctrl
  arm64: dts: imx8mm-venice-gw7903: disable disp_blk_ctrl
  arm64: dts: arm: Remove the dangling vexpress-v2m-rs1.dtsi symlink
  ...

14 months agoMerge tag 'bitmap-6.5-rc5' of https://github.com:/norov/linux
Linus Torvalds [Thu, 3 Aug 2023 01:10:26 +0000 (18:10 -0700)]
Merge tag 'bitmap-6.5-rc5' of https://github.com:/norov/linux

Pull bitmap fixes from Yury Norov:

 - Fix for bitmap documentation

 - Fix for kernel build under certain configurations

* tag 'bitmap-6.5-rc5' of https://github.com:/norov/linux:
  lib/bitmap: workaround const_eval test build failure
  cpumask: eliminate kernel-doc warnings

14 months agoDrivers: hv: vmbus: Remove unused extern declaration vmbus_ontimer()
YueHaibing [Tue, 25 Jul 2023 14:21:08 +0000 (22:21 +0800)]
Drivers: hv: vmbus: Remove unused extern declaration vmbus_ontimer()

Since commit 30fbee49b071 ("Staging: hv: vmbus: Get rid of the unused function vmbus_ontimer()")
this is not used anymore, so can remove it.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20230725142108.27280-1-yuehaibing@huawei.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
14 months agox86/hyperv: add noop functions to x86_init mpparse functions
Saurabh Sengar [Fri, 23 Jun 2023 16:28:08 +0000 (09:28 -0700)]
x86/hyperv: add noop functions to x86_init mpparse functions

Hyper-V can run VMs at different privilege "levels" known as Virtual
Trust Levels (VTL). Sometimes, it chooses to run two different VMs
at different levels but they share some of their address space. In
such setups VTL2 (higher level VM) has visibility of all of the
VTL0 (level 0) memory space.

When the CONFIG_X86_MPPARSE is enabled for VTL2, the VTL2 kernel
performs a search within the low memory to locate MP tables. However,
in systems where VTL0 manages the low memory and may contain valid
tables, this scanning can result in incorrect MP table information
being provided to the VTL2 kernel, mistakenly considering VTL0's MP
table as its own

Add noop functions to avoid MP parse scan by VTL2.

Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/r/1687537688-5397-1-git-send-email-ssengar@linux.microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
14 months agoDocumentation: kdump: Add va_kernel_pa_offset for RISCV64
Song Shuai [Mon, 24 Jul 2023 10:09:17 +0000 (18:09 +0800)]
Documentation: kdump: Add va_kernel_pa_offset for RISCV64

RISC-V Linux exports "va_kernel_pa_offset" in vmcoreinfo to help
Crash-utility translate the kernel virtual address correctly.

Here adds the definition of "va_kernel_pa_offset".

Fixes: 3335068f8721 ("riscv: Use PUD/P4D/PGD pages for the linear mapping")
Link: https://lore.kernel.org/linux-riscv/20230724040649.220279-1-suagrfillet@gmail.com/
Signed-off-by: Song Shuai <suagrfillet@gmail.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20230724100917.309061-2-suagrfillet@gmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
14 months agoriscv: Export va_kernel_pa_offset in vmcoreinfo
Song Shuai [Mon, 24 Jul 2023 10:09:16 +0000 (18:09 +0800)]
riscv: Export va_kernel_pa_offset in vmcoreinfo

Since RISC-V Linux v6.4, the commit 3335068f8721 ("riscv: Use
PUD/P4D/PGD pages for the linear mapping") changes phys_ram_base
from the physical start of the kernel to the actual start of the DRAM.

The Crash-utility's VTOP() still uses phys_ram_base and kernel_map.virt_addr
to translate kernel virtual address, that failed the Crash with Linux v6.4 [1].

Export kernel_map.va_kernel_pa_offset in vmcoreinfo to help Crash translate
the kernel virtual address correctly.

Fixes: 3335068f8721 ("riscv: Use PUD/P4D/PGD pages for the linear mapping")
Link: https://lore.kernel.org/linux-riscv/20230724040649.220279-1-suagrfillet@gmail.com/
Signed-off-by: Song Shuai <suagrfillet@gmail.com>
Reviewed-by: Xianting Tian  <xianting.tian@linux.alibaba.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20230724100917.309061-1-suagrfillet@gmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
14 months agoRISC-V: ACPI: Fix acpi_os_ioremap to return iomem address
Sunil V L [Mon, 24 Jul 2023 10:03:46 +0000 (15:33 +0530)]
RISC-V: ACPI: Fix acpi_os_ioremap to return iomem address

acpi_os_ioremap() currently is a wrapper to memremap() on
RISC-V. But the callers of acpi_os_ioremap() expect it to
return __iomem address and hence sparse tool reports a new
warning. Fix this issue by type casting to  __iomem type.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202307230357.egcTAefj-lkp@intel.com/
Fixes: a91a9ffbd3a5 ("RISC-V: Add support to build the ACPI core")
Signed-off-by: Sunil V L <sunilvl@ventanamicro.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20230724100346.1302937-1-sunilvl@ventanamicro.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
14 months agoselftests: riscv: Fix compilation error with vstate_exec_nolibc.c
Alexandre Ghiti [Thu, 13 Jul 2023 11:58:29 +0000 (13:58 +0200)]
selftests: riscv: Fix compilation error with vstate_exec_nolibc.c

The following error happens:

In file included from vstate_exec_nolibc.c:2:
/usr/include/riscv64-linux-gnu/sys/prctl.h:42:12: error: conflicting types for ‘prctl’; h
ave ‘int(int, ...)’
   42 | extern int prctl (int __option, ...) __THROW;
      |            ^~~~~
In file included from ./../../../../include/nolibc/nolibc.h:99,
                 from <command-line>:
./../../../../include/nolibc/sys.h:892:5: note: previous definition of ‘prctl’ with type
‘int(int,  long unsigned int,  long unsigned int,  long unsigned int,  long unsigned int)

  892 | int prctl(int option, unsigned long arg2, unsigned long arg3,
      |     ^~~~~

Fix this by not including <sys/prctl.h>, which is not needed here since
prctl syscall is directly called using its number.

Fixes: 7cf6198ce22d ("selftests: Test RISC-V Vector prctl interface")
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20230713115829.110421-1-alexghiti@rivosinc.com
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
14 months agoselftests/riscv: fix potential build failure during the "emit_tests" step
John Hubbard [Wed, 12 Jul 2023 19:35:14 +0000 (12:35 -0700)]
selftests/riscv: fix potential build failure during the "emit_tests" step

The riscv selftests (which were modeled after the arm64 selftests) are
improperly declaring the "emit_tests" target to depend upon the "all"
target. This approach, when combined with commit 9fc96c7c19df
("selftests: error out if kernel header files are not yet built"), has
caused build failures [1] on arm64, and is likely to cause similar
failures for riscv.

To fix this, simply remove the unnecessary "all" dependency from the
emit_tests target. The dependency is still effectively honored, because
again, invocation is via "install", which also depends upon "all".

An alternative approach would be to harden the emit_tests target so that
it can depend upon "all", but that's a lot more complicated and hard to
get right, and doesn't seem worth it, especially given that emit_tests
should probably not be overridden at all.

[1] https://lore.kernel.org/20230710-kselftest-fix-arm64-v1-1-48e872844f25@kernel.org

Fixes: 9fc96c7c19df ("selftests: error out if kernel header files are not yet built")
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20230712193514.740033-1-jhubbard@nvidia.com
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
14 months agoMerge tag 'exfat-for-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/linkin...
Linus Torvalds [Wed, 2 Aug 2023 18:43:06 +0000 (11:43 -0700)]
Merge tag 'exfat-for-6.5-rc5' of git://git./linux/kernel/git/linkinjeon/exfat

Pull exfat fixes from Namjae Jeon:

 - Fix page allocation failure from allocation bitmap by using
   kvmalloc_array/kvfree

 - Add the check to validate if filename entries exceeds max filename
   length

 - Fix potential deadlock condition from dir_emit*()

* tag 'exfat-for-6.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat:
  exfat: release s_lock before calling dir_emit()
  exfat: check if filename entries exceeds max filename length
  exfat: use kvmalloc_array/kvfree instead of kmalloc_array/kfree

14 months agosmb: client: fix dfs link mount against w2k8
Paulo Alcantara [Wed, 2 Aug 2023 16:43:03 +0000 (13:43 -0300)]
smb: client: fix dfs link mount against w2k8

Customer reported that they couldn't mount their DFS link that was
seen by the client as a DFS interlink -- special form of DFS link
where its single target may point to a different DFS namespace -- and
it turned out that it was just a regular DFS link where its referral
header flags missed the StorageServers bit thus making the client
think it couldn't tree connect to target directly without requiring
further referrals.

When the DFS link referral header flags misses the StoraServers bit
and its target doesn't respond to any referrals, then tree connect to
it.

Fixes: a1c0d00572fc ("cifs: share dfs connections and supers")
Cc: stable@vger.kernel.org
Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
14 months agoMerge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds [Wed, 2 Aug 2023 18:04:27 +0000 (11:04 -0700)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Three small fixes, all in drivers"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: pm80xx: Fix error return code in pm8001_pci_probe()
  scsi: zfcp: Defer fc_rport blocking until after ADISC response
  scsi: storvsc: Limit max_sectors for virtual Fibre Channel devices

14 months agoword-at-a-time: use the same return type for has_zero regardless of endianness
ndesaulniers@google.com [Tue, 1 Aug 2023 22:22:17 +0000 (15:22 -0700)]
word-at-a-time: use the same return type for has_zero regardless of endianness

Compiling big-endian targets with Clang produces the diagnostic:

  fs/namei.c:2173:13: warning: use of bitwise '|' with boolean operands [-Wbitwise-instead-of-logical]
} while (!(has_zero(a, &adata, &constants) | has_zero(b, &bdata, &constants)));
          ~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                               ||
  fs/namei.c:2173:13: note: cast one or both operands to int to silence this warning

It appears that when has_zero was introduced, two definitions were
produced with different signatures (in particular different return
types).

Looking at the usage in hash_name() in fs/namei.c, I suspect that
has_zero() is meant to be invoked twice per while loop iteration; using
logical-or would not update `bdata` when `a` did not have zeros.  So I
think it's preferred to always return an unsigned long rather than a
bool than update the while loop in hash_name() to use a logical-or
rather than bitwise-or.

[ Also changed powerpc version to do the same  - Linus ]

Link: https://github.com/ClangBuiltLinux/linux/issues/1832
Link: https://lore.kernel.org/lkml/20230801-bitwise-v1-1-799bec468dc4@google.com/
Fixes: 36126f8f2ed8 ("word-at-a-time: make the interfaces truly generic")
Debugged-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>