platform/kernel/linux-starfive.git
18 months agobpf: Resolve fext program type when checking map compatibility
Toke Høiland-Jørgensen [Wed, 14 Dec 2022 23:02:53 +0000 (00:02 +0100)]
bpf: Resolve fext program type when checking map compatibility

[ Upstream commit 1c123c567fb138ebd187480b7fc0610fcb0851f5 ]

The bpf_prog_map_compatible() check makes sure that BPF program types are
not mixed inside BPF map types that can contain programs (tail call maps,
cpumaps and devmaps). It does this by setting the fields of the map->owner
struct to the values of the first program being checked against, and
rejecting any subsequent programs if the values don't match.

One of the values being set in the map owner struct is the program type,
and since the code did not resolve the prog type for fext programs, the map
owner type would be set to PROG_TYPE_EXT and subsequent loading of programs
of the target type into the map would fail.

This bug is seen in particular for XDP programs that are loaded as
PROG_TYPE_EXT using libxdp; these cannot insert programs into devmaps and
cpumaps because the check fails as described above.

Fix the bug by resolving the fext program type to its target program type
as elsewhere in the verifier.

v3:
- Add Yonghong's ACK

Fixes: f45d5b6ce2e8 ("bpf: generalise tail call map compatibility check")
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20221214230254.790066-1-toke@redhat.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
18 months agomedia: s5p-mfc: Fix in register read and write for H264
Smitha T Murthy [Wed, 7 Sep 2022 10:32:25 +0000 (16:02 +0530)]
media: s5p-mfc: Fix in register read and write for H264

commit 06710cd5d2436135046898d7e4b9408c8bb99446 upstream.

Few of the H264 encoder registers written were not getting reflected
since the read values were not stored and getting overwritten.

Fixes: 6a9c6f681257 ("[media] s5p-mfc: Add variants to access mfc registers")

Cc: stable@vger.kernel.org
Cc: linux-fsd@tesla.com
Signed-off-by: Smitha T Murthy <smitha.t@samsung.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agomedia: s5p-mfc: Clear workbit to handle error condition
Smitha T Murthy [Wed, 7 Sep 2022 10:32:26 +0000 (16:02 +0530)]
media: s5p-mfc: Clear workbit to handle error condition

commit d3f3c2fe54e30b0636496d842ffbb5ad3a547f9b upstream.

During error on CLOSE_INSTANCE command, ctx_work_bits was not getting
cleared. During consequent mfc execution NULL pointer dereferencing of
this context led to kernel panic. This patch fixes this issue by making
sure to clear ctx_work_bits always.

Fixes: 818cd91ab8c6 ("[media] s5p-mfc: Extract open/close MFC instance commands")
Cc: stable@vger.kernel.org
Cc: linux-fsd@tesla.com
Signed-off-by: Smitha T Murthy <smitha.t@samsung.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agomedia: s5p-mfc: Fix to handle reference queue during finishing
Smitha T Murthy [Wed, 7 Sep 2022 10:32:27 +0000 (16:02 +0530)]
media: s5p-mfc: Fix to handle reference queue during finishing

commit d8a46bc4e1e0446459daa77c4ce14218d32dacf9 upstream.

On receiving last buffer driver puts MFC to MFCINST_FINISHING state which
in turn skips transferring of frame from SRC to REF queue. This causes
driver to stop MFC encoding and last frame is lost.

This patch guarantees safe handling of frames during MFCINST_FINISHING and
correct clearing of workbit to avoid early stopping of encoding.

Fixes: af9357467810 ("[media] MFC: Add MFC 5.1 V4L2 driver")

Cc: stable@vger.kernel.org
Cc: linux-fsd@tesla.com
Signed-off-by: Smitha T Murthy <smitha.t@samsung.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agoext2: unbugger ext2_empty_dir()
Al Viro [Sat, 26 Nov 2022 03:17:17 +0000 (03:17 +0000)]
ext2: unbugger ext2_empty_dir()

commit 27e714c007e4ad01837bf0fac5c11913a38d7695 upstream.

In 27cfa258951a "ext2: fix fs corruption when trying to remove
a non-empty directory with IO error" a funny thing has happened:

-               page = ext2_get_page(inode, i, dir_has_error, &page_addr);
+               page = ext2_get_page(inode, i, 0, &page_addr);

 -               if (IS_ERR(page)) {
 -                       dir_has_error = 1;
 -                       continue;
 -               }
 +               if (IS_ERR(page))
 +                       goto not_empty;

And at not_empty: we hit ext2_put_page(page, page_addr), which does
put_page(page).  Which, unless I'm very mistaken, should oops
immediately when given ERR_PTR(-E...) as page.

OK, shit happens, insufficiently tested patches included.  But when
commit in question describes the fault-injection test that exercised
that particular failure exit...

Ow.

CC: stable@vger.kernel.org
Fixes: 27cfa258951a ("ext2: fix fs corruption when trying to remove a non-empty directory with IO error")
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agocpufreq: Init completion before kobject_init_and_add()
Yongqiang Liu [Thu, 10 Nov 2022 14:23:07 +0000 (14:23 +0000)]
cpufreq: Init completion before kobject_init_and_add()

commit 5c51054896bcce1d33d39fead2af73fec24f40b6 upstream.

In cpufreq_policy_alloc(), it will call uninitialed completion in
cpufreq_sysfs_release() when kobject_init_and_add() fails. And
that will cause a crash such as the following page fault in complete:

BUG: unable to handle page fault for address: fffffffffffffff8
[..]
RIP: 0010:complete+0x98/0x1f0
[..]
Call Trace:
 kobject_put+0x1be/0x4c0
 cpufreq_online.cold+0xee/0x1fd
 cpufreq_add_dev+0x183/0x1e0
 subsys_interface_register+0x3f5/0x4e0
 cpufreq_register_driver+0x3b7/0x670
 acpi_cpufreq_init+0x56c/0x1000 [acpi_cpufreq]
 do_one_initcall+0x13d/0x780
 do_init_module+0x1c3/0x630
 load_module+0x6e67/0x73b0
 __do_sys_finit_module+0x181/0x240
 do_syscall_64+0x35/0x80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

Fixes: 4ebe36c94aed ("cpufreq: Fix kobject memleak")
Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: 5.2+ <stable@vger.kernel.org> # 5.2+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agoPM/devfreq: governor: Add a private governor_data for governor
Kant Fan [Tue, 25 Oct 2022 07:21:09 +0000 (15:21 +0800)]
PM/devfreq: governor: Add a private governor_data for governor

commit 5fdded8448924e3631d466eea499b11606c43640 upstream.

The member void *data in the structure devfreq can be overwrite
by governor_userspace. For example:
1. The device driver assigned the devfreq governor to simple_ondemand
by the function devfreq_add_device() and init the devfreq member
void *data to a pointer of a static structure devfreq_simple_ondemand_data
by the function devfreq_add_device().
2. The user changed the devfreq governor to userspace by the command
"echo userspace > /sys/class/devfreq/.../governor".
3. The governor userspace alloced a dynamic memory for the struct
userspace_data and assigend the member void *data of devfreq to
this memory by the function userspace_init().
4. The user changed the devfreq governor back to simple_ondemand
by the command "echo simple_ondemand > /sys/class/devfreq/.../governor".
5. The governor userspace exited and assigned the member void *data
in the structure devfreq to NULL by the function userspace_exit().
6. The governor simple_ondemand fetched the static information of
devfreq_simple_ondemand_data in the function
devfreq_simple_ondemand_func() but the member void *data of devfreq was
assigned to NULL by the function userspace_exit().
7. The information of upthreshold and downdifferential is lost
and the governor simple_ondemand can't work correctly.

The member void *data in the structure devfreq is designed for
a static pointer used in a governor and inited by the function
devfreq_add_device(). This patch add an element named governor_data
in the devfreq structure which can be used by a governor(E.g userspace)
who want to assign a private data to do some private things.

Fixes: ce26c5bb9569 ("PM / devfreq: Add basic governors")
Cc: stable@vger.kernel.org # 5.10+
Reviewed-by: Chanwoo Choi <cwchoi00@gmail.com>
Acked-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: Kant Fan <kant@allwinnertech.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agorandom: add helpers for random numbers with given floor or range
Jason A. Donenfeld [Thu, 20 Oct 2022 05:19:35 +0000 (23:19 -0600)]
random: add helpers for random numbers with given floor or range

commit 7f576b2593a978451416424e75f69ad1e3ae4efe upstream.

Now that we have get_random_u32_below(), it's nearly trivial to make
inline helpers to compute get_random_u32_above() and
get_random_u32_inclusive(), which will help clean up open coded loops
and manual computations throughout the tree.

One snag is that in order to make get_random_u32_inclusive() operate on
closed intervals, we have to do some (unlikely) special case handling if
get_random_u32_inclusive(0, U32_MAX) is called. The least expensive way
of doing this is actually to adjust the slowpath of
get_random_u32_below() to have its undefined 0 result just return the
output of get_random_u32(). We can make this basically free by calling
get_random_u32() before the branch, so that the branch latency gets
interleaved.

Cc: stable@vger.kernel.org # to ease future backports that use this api
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agox86/MCE/AMD: Clear DFR errors found in THR handler
Yazen Ghannam [Tue, 21 Jun 2022 15:59:43 +0000 (15:59 +0000)]
x86/MCE/AMD: Clear DFR errors found in THR handler

commit bc1b705b0eee4c645ad8b3bbff3c8a66e9688362 upstream.

AMD's MCA Thresholding feature counts errors of all severity levels, not
just correctable errors. If a deferred error causes the threshold limit
to be reached (it was the error that caused the overflow), then both a
deferred error interrupt and a thresholding interrupt will be triggered.

The order of the interrupts is not guaranteed. If the threshold
interrupt handler is executed first, then it will clear MCA_STATUS for
the error. It will not check or clear MCA_DESTAT which also holds a copy
of the deferred error. When the deferred error interrupt handler runs it
will not find an error in MCA_STATUS, but it will find the error in
MCA_DESTAT. This will cause two errors to be logged.

Check for deferred errors when handling a threshold interrupt. If a bank
contains a deferred error, then clear the bank's MCA_DESTAT register.

Define a new helper function to do the deferred error check and clearing
of MCA_DESTAT.

  [ bp: Simplify, convert comment to passive voice. ]

Fixes: 37d43acfd79f ("x86/mce/AMD: Redo error logging from APIC LVT interrupt handlers")
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220621155943.33623-1-yazen.ghannam@amd.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agoselftests: Use optional USERCFLAGS and USERLDFLAGS
Mickaël Salaün [Fri, 9 Sep 2022 10:39:01 +0000 (12:39 +0200)]
selftests: Use optional USERCFLAGS and USERLDFLAGS

commit de3ee3f63400a23954e7c1ad1cb8c20f29ab6fe3 upstream.

This change enables to extend CFLAGS and LDFLAGS from command line, e.g.
to extend compiler checks: make USERCFLAGS=-Werror USERLDFLAGS=-static

USERCFLAGS and USERLDFLAGS are documented in
Documentation/kbuild/makefiles.rst and Documentation/kbuild/kbuild.rst

This should be backported (down to 5.10) to improve previous kernel
versions testing as well.

Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: stable@vger.kernel.org
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Link: https://lore.kernel.org/r/20220909103901.1503436-1-mic@digikod.net
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agoEDAC/mc_sysfs: Increase legacy channel support to 12
Yazen Ghannam [Tue, 18 Oct 2022 15:36:30 +0000 (10:36 -0500)]
EDAC/mc_sysfs: Increase legacy channel support to 12

commit 25836ce1df827cb4830291cb2325067efb46753a upstream.

Newer AMD systems, such as Genoa, can support up to 12 channels per EDAC
"mc" device. These are detected by the device's EDAC module, and the
current EDAC interface is properly enumerated. However, the legacy EDAC
sysfs interface provides device attributes only for channels 0 to 7.
Therefore, channels 8 to 11 will not be visible in the legacy interface.
This was overlooked in the initial support for AMD Genoa.

Add additional device attributes so that up to 12 channels are visible
in the legacy EDAC sysfs interface.

Fixes: e2be5955a886 ("EDAC/amd64: Add support for AMD Family 19h Models 10h-1Fh and A0h-AFh")
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20221018153630.14664-1-yazen.ghannam@amd.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agocxl/region: Fix missing probe failure
Dan Williams [Thu, 1 Dec 2022 22:03:24 +0000 (14:03 -0800)]
cxl/region: Fix missing probe failure

commit bf3e5da8cb43a671b32fc125fa81b8f6a3677192 upstream.

cxl_region_probe() allows for regions not in the 'commit' state to be
enabled. Fail probe when the region is not committed otherwise the
kernel may indicate that an address range is active when none of the
decoders are active.

Fixes: 8d48817df6ac ("cxl/region: Add region driver boiler plate")
Cc: <stable@vger.kernel.org>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/166993220462.1995348.1698008475198427361.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agoarm64: dts: qcom: sdm850-lenovo-yoga-c630: correct I2C12 pins drive strength
Krzysztof Kozlowski [Fri, 30 Sep 2022 19:20:37 +0000 (21:20 +0200)]
arm64: dts: qcom: sdm850-lenovo-yoga-c630: correct I2C12 pins drive strength

commit fd49776d8f458bba5499384131eddc0b8bcaf50c upstream.

The pin configuration (done with generic pin controller helpers and
as expressed by bindings) requires children nodes with either:
1. "pins" property and the actual configuration,
2. another set of nodes with above point.

The qup_i2c12_default pin configuration used second method - with a
"pinmux" child.

Fixes: 44acee207844 ("arm64: dts: qcom: Add Lenovo Yoga C630")
Cc: <stable@vger.kernel.org>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Tested-by: Steev Klimaszewski <steev@kali.org>
Reviewed-by: Konrad Dybcio <konrad.dybcio@somainline.org>
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20220930192039.240486-1-krzysztof.kozlowski@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agox86/fpu/xstate: Fix XSTATE_WARN_ON() to emit relevant diagnostics
Andrew Cooper [Wed, 10 Aug 2022 22:19:09 +0000 (23:19 +0100)]
x86/fpu/xstate: Fix XSTATE_WARN_ON() to emit relevant diagnostics

commit 48280042f2c6e3ac2cfb1d8b752ab4a7e0baea24 upstream.

"XSAVE consistency problem" has been reported under Xen, but that's the extent
of my divination skills.

Modify XSTATE_WARN_ON() to force the caller to provide relevant diagnostic
information, and modify each caller suitably.

For check_xstate_against_struct(), this removes a double WARN() where one will
do perfectly fine.

CC stable as this has been wonky debugging for 7 years and it is good to
have there too.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220810221909.12768-1-andrew.cooper3@citrix.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agorandom: use rejection sampling for uniform bounded random integers
Jason A. Donenfeld [Sun, 9 Oct 2022 02:42:54 +0000 (20:42 -0600)]
random: use rejection sampling for uniform bounded random integers

commit e9a688bcb19348862afe30d7c85bc37c4c293471 upstream.

Until the very recent commits, many bounded random integers were
calculated using `get_random_u32() % max_plus_one`, which not only
incurs the price of a division -- indicating performance mostly was not
a real issue -- but also does not result in a uniformly distributed
output if max_plus_one is not a power of two. Recent commits moved to
using `prandom_u32_max(max_plus_one)`, which replaces the division with
a faster multiplication, but still does not solve the issue with
non-uniform output.

For some users, maybe this isn't a problem, and for others, maybe it is,
but for the majority of users, probably the question has never been
posed and analyzed, and nobody thought much about it, probably assuming
random is random is random. In other words, the unthinking expectation
of most users is likely that the resultant numbers are uniform.

So we implement here an efficient way of generating uniform bounded
random integers. Through use of compile-time evaluation, and avoiding
divisions as much as possible, this commit introduces no measurable
overhead. At least for hot-path uses tested, any potential difference
was lost in the noise. On both clang and gcc, code generation is pretty
small.

The new function, get_random_u32_below(), lives in random.h, rather than
prandom.h, and has a "get_random_xxx" function name, because it is
suitable for all uses, including cryptography.

In order to be efficient, we implement a kernel-specific variant of
Daniel Lemire's algorithm from "Fast Random Integer Generation in an
Interval", linked below. The kernel's variant takes advantage of
constant folding to avoid divisions entirely in the vast majority of
cases, works on both 32-bit and 64-bit architectures, and requests a
minimal amount of bytes from the RNG.

Link: https://arxiv.org/pdf/1805.10941.pdf
Cc: stable@vger.kernel.org # to ease future backports that use this api
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agoarm64: dts: qcom: sdm850-samsung-w737: correct I2C12 pins drive strength
Krzysztof Kozlowski [Fri, 30 Sep 2022 19:20:38 +0000 (21:20 +0200)]
arm64: dts: qcom: sdm850-samsung-w737: correct I2C12 pins drive strength

commit 3638ea010c37e1e6d93474c4b3368f403600413f upstream.

The pin configuration (done with generic pin controller helpers and
as expressed by bindings) requires children nodes with either:
1. "pins" property and the actual configuration,
2. another set of nodes with above point.

The qup_i2c12_default pin configuration used second method - with a
"pinmux" child.

Fixes: d4b341269efb ("arm64: dts: qcom: Add support for Samsung Galaxy Book2")
Cc: <stable@vger.kernel.org>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Konrad Dybcio <konrad.dybcio@somainline.org>
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20220930192039.240486-2-krzysztof.kozlowski@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agoARM: ux500: do not directly dereference __iomem
Jason A. Donenfeld [Tue, 8 Nov 2022 12:37:55 +0000 (13:37 +0100)]
ARM: ux500: do not directly dereference __iomem

commit 65b0e307a1a9193571db12910f382f84195a3d29 upstream.

Sparse reports that calling add_device_randomness() on `uid` is a
violation of address spaces. And indeed the next usage uses readl()
properly, but that was left out when passing it toadd_device_
randomness(). So instead copy the whole thing to the stack first.

Fixes: 4040d10a3d44 ("ARM: ux500: add DB serial number to entropy pool")
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/202210230819.loF90KDh-lkp@intel.com/
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20221108123755.207438-1-Jason@zx2c4.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agobtrfs: fix resolving backrefs for inline extent followed by prealloc
Boris Burkov [Wed, 14 Dec 2022 23:05:08 +0000 (15:05 -0800)]
btrfs: fix resolving backrefs for inline extent followed by prealloc

commit 560840afc3e63bbe5d9c5ef6b2ecf8f3589adff6 upstream.

If a file consists of an inline extent followed by a regular or prealloc
extent, then a legitimate attempt to resolve a logical address in the
non-inline region will result in add_all_parents reading the invalid
offset field of the inline extent. If the inline extent item is placed
in the leaf eb s.t. it is the first item, attempting to access the
offset field will not only be meaningless, it will go past the end of
the eb and cause this panic:

  [17.626048] BTRFS warning (device dm-2): bad eb member end: ptr 0x3fd4 start 30834688 member offset 16377 size 8
  [17.631693] general protection fault, probably for non-canonical address 0x5088000000000: 0000 [#1] SMP PTI
  [17.635041] CPU: 2 PID: 1267 Comm: btrfs Not tainted 5.12.0-07246-g75175d5adc74-dirty #199
  [17.637969] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
  [17.641995] RIP: 0010:btrfs_get_64+0xe7/0x110
  [17.649890] RSP: 0018:ffffc90001f73a08 EFLAGS: 00010202
  [17.651652] RAX: 0000000000000001 RBX: ffff88810c42d000 RCX: 0000000000000000
  [17.653921] RDX: 0005088000000000 RSI: ffffc90001f73a0f RDI: 0000000000000001
  [17.656174] RBP: 0000000000000ff9 R08: 0000000000000007 R09: c0000000fffeffff
  [17.658441] R10: ffffc90001f73790 R11: ffffc90001f73788 R12: ffff888106afe918
  [17.661070] R13: 0000000000003fd4 R14: 0000000000003f6f R15: cdcdcdcdcdcdcdcd
  [17.663617] FS:  00007f64e7627d80(0000) GS:ffff888237c80000(0000) knlGS:0000000000000000
  [17.666525] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [17.668664] CR2: 000055d4a39152e8 CR3: 000000010c596002 CR4: 0000000000770ee0
  [17.671253] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  [17.673634] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  [17.676034] PKRU: 55555554
  [17.677004] Call Trace:
  [17.677877]  add_all_parents+0x276/0x480
  [17.679325]  find_parent_nodes+0xfae/0x1590
  [17.680771]  btrfs_find_all_leafs+0x5e/0xa0
  [17.682217]  iterate_extent_inodes+0xce/0x260
  [17.683809]  ? btrfs_inode_flags_to_xflags+0x50/0x50
  [17.685597]  ? iterate_inodes_from_logical+0xa1/0xd0
  [17.687404]  iterate_inodes_from_logical+0xa1/0xd0
  [17.689121]  ? btrfs_inode_flags_to_xflags+0x50/0x50
  [17.691010]  btrfs_ioctl_logical_to_ino+0x131/0x190
  [17.692946]  btrfs_ioctl+0x104a/0x2f60
  [17.694384]  ? selinux_file_ioctl+0x182/0x220
  [17.695995]  ? __x64_sys_ioctl+0x84/0xc0
  [17.697394]  __x64_sys_ioctl+0x84/0xc0
  [17.698697]  do_syscall_64+0x33/0x40
  [17.700017]  entry_SYSCALL_64_after_hwframe+0x44/0xae
  [17.701753] RIP: 0033:0x7f64e72761b7
  [17.709355] RSP: 002b:00007ffefb067f58 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
  [17.712088] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f64e72761b7
  [17.714667] RDX: 00007ffefb067fb0 RSI: 00000000c0389424 RDI: 0000000000000003
  [17.717386] RBP: 00007ffefb06d188 R08: 000055d4a390d2b0 R09: 00007f64e7340a60
  [17.719938] R10: 0000000000000231 R11: 0000000000000246 R12: 0000000000000001
  [17.722383] R13: 0000000000000000 R14: 00000000c0389424 R15: 000055d4a38fd2a0
  [17.724839] Modules linked in:

Fix the bug by detecting the inline extent item in add_all_parents and
skipping to the next extent item.

CC: stable@vger.kernel.org # 4.9+
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Boris Burkov <boris@bur.io>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agobtrfs: fix extent map use-after-free when handling missing device in read_one_chunk
void0red [Wed, 23 Nov 2022 14:39:45 +0000 (22:39 +0800)]
btrfs: fix extent map use-after-free when handling missing device in read_one_chunk

commit 1742e1c90c3da344f3bb9b1f1309b3f47482756a upstream.

Store the error code before freeing the extent_map. Though it's
reference counted structure, in that function it's the first and last
allocation so this would lead to a potential use-after-free.

The error can happen eg. when chunk is stored on a missing device and
the degraded mount option is missing.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=216721
Reported-by: eriri <1527030098@qq.com>
Fixes: adfb69af7d8c ("btrfs: add_missing_dev() should return the actual error")
CC: stable@vger.kernel.org # 4.9+
Signed-off-by: void0red <void0red@gmail.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agobtrfs: fix uninitialized parent in insert_state
Josef Bacik [Fri, 18 Nov 2022 20:06:09 +0000 (15:06 -0500)]
btrfs: fix uninitialized parent in insert_state

commit d7c9e1be2876f63fb2178a24e0c1d5733ff98d47 upstream.

I don't know how this isn't caught when we build this in the kernel, but
while syncing extent-io-tree.c into btrfs-progs I got an error because
parent could potentially be uninitialized when we link in a new node,
specifically when the extent_io_tree is empty.  This means we could have
garbage in the parent color.  I don't know what the ramifications are of
that, but it's probably not great, so fix this by initializing parent to
NULL.  I spot checked all of our other usages in btrfs and we appear to
be doing the correct thing everywhere else.

Fixes: c7e118cf98c7 ("btrfs: open code rbtree search in insert_state")
CC: stable@vger.kernel.org # 6.0+
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agodrm/amd/pm: correct SMU13.0.0 pstate profiling clock settings
Evan Quan [Mon, 5 Dec 2022 06:53:34 +0000 (14:53 +0800)]
drm/amd/pm: correct SMU13.0.0 pstate profiling clock settings

commit 32a7819ff8e25375c7515aaae5cfcb8c44a461b7 upstream.

Correct the pstate standard/peak profiling mode clock settings
for SMU13.0.0.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.0.x
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agodrm/amd/pm: update SMU13.0.0 reported maximum shader clock
Evan Quan [Mon, 5 Dec 2022 07:33:31 +0000 (15:33 +0800)]
drm/amd/pm: update SMU13.0.0 reported maximum shader clock

commit 7a18e089eff02f17eaee49fc18641f5d16a8284b upstream.

Update the reported maximum shader clock to the value which can
be guarded to be achieved on all cards. This is to align with
Window setting.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.0.x
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agophy: qcom-qmp-combo: fix out-of-bounds clock access
Johan Hovold [Mon, 14 Nov 2022 08:13:41 +0000 (09:13 +0100)]
phy: qcom-qmp-combo: fix out-of-bounds clock access

commit d8a5b59c5fc75c99ba17e3eb1a8f580d8d172b28 upstream.

The SM8250 only uses three clocks but the DP configuration erroneously
described four clocks.

In case the DP part of the PHY is initialised before the USB part, this
would lead to uninitialised memory beyond the bulk-clocks array to be
treated as a clock pointer as the clocks are requested based on the USB
configuration.

Fixes: aff188feb5e1 ("phy: qcom-qmp: add support for sm8250-usb3-dp phy")
Cc: stable@vger.kernel.org # 5.13
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Link: https://lore.kernel.org/r/20221114081346.5116-2-johan+linaro@kernel.org
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agommc: sdhci-sprd: Disable CLK_AUTO when the clock is less than 400K
Wenchao Chen [Wed, 7 Dec 2022 05:19:09 +0000 (13:19 +0800)]
mmc: sdhci-sprd: Disable CLK_AUTO when the clock is less than 400K

commit ff874dbc4f868af128b412a9bd92637103cf11d7 upstream.

When the clock is less than 400K, some SD cards fail to initialize
because CLK_AUTO is enabled.

Fixes: fb8bd90f83c4 ("mmc: sdhci-sprd: Add Spreadtrum's initial host controller")
Signed-off-by: Wenchao Chen <wenchao.chen@unisoc.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20221207051909.32126-1-wenchao.chen@unisoc.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agoarm64: dts: qcom: sc8280xp: fix UFS reference clocks
Johan Hovold [Fri, 4 Nov 2022 09:20:44 +0000 (10:20 +0100)]
arm64: dts: qcom: sc8280xp: fix UFS reference clocks

commit f446022b932aff1d6a308ca5d537ec2b512debdc upstream.

There are three UFS reference clocks on SC8280XP which are used as
follows:

 - The GCC_UFS_REF_CLKREF_CLK clock is fed to any UFS device connected
   to either controller.

 - The GCC_UFS_1_CARD_CLKREF_CLK and GCC_UFS_CARD_CLKREF_CLK clocks
   provide reference clocks to the two PHYs.

Note that this depends on first updating the clock driver to reflect
that all three clocks are sourced from CXO. Specifically, the UFS
controller driver expects the device reference clock to have a valid
frequency:

ufshcd-qcom 1d84000.ufs: invalid ref_clk setting = 0

Fixes: 152d1faf1e2f ("arm64: dts: qcom: add SC8280XP platform")
Fixes: 8d6b458ce6e9 ("arm64: dts: qcom: sc8280xp: fix ufs_card_phy ref clock")
Fixes: f3aa975e230e ("arm64: dts: qcom: sc8280xp: correct ref clock for ufs_mem_phy")
Link: https://lore.kernel.org/lkml/Y2OEjNAPXg5BfOxH@hovoldconsulting.com/
Cc: stable@vger.kernel.org # 5.20
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Reviewed-by: Brian Masney <bmasney@redhat.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@somainline.org>
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20221104092045.17410-2-johan+linaro@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agoarm64: dts: qcom: sdm845-db845c: correct SPI2 pins drive strength
Krzysztof Kozlowski [Mon, 10 Oct 2022 11:44:13 +0000 (07:44 -0400)]
arm64: dts: qcom: sdm845-db845c: correct SPI2 pins drive strength

commit 9905370560d9c29adc15f4937c5a0c0dac05f0b4 upstream.

The pin configuration (done with generic pin controller helpers and
as expressed by bindings) requires children nodes with either:
1. "pins" property and the actual configuration,
2. another set of nodes with above point.

The qup_spi2_default pin configuration uses alreaady the second method
with a "pinmux" child, so configure drive-strength similarly in
"pinconf".  Otherwise the PIN drive strength would not be applied.

Fixes: 8d23a0040475 ("arm64: dts: qcom: db845c: add Low speed expansion i2c and spi nodes")
Cc: <stable@vger.kernel.org>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20221010114417.29859-2-krzysztof.kozlowski@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agoperf/x86/intel/uncore: Clear attr_update properly
Alexander Antonov [Thu, 17 Nov 2022 12:28:25 +0000 (12:28 +0000)]
perf/x86/intel/uncore: Clear attr_update properly

commit 6532783310e2b2f50dc13f46c49aa6546cb6e7a3 upstream.

Current clear_attr_update procedure in pmu_set_mapping() sets attr_update
field in NULL that is not correct because intel_uncore_type pmu types can
contain several groups in attr_update field. For example, SPR platform
already has uncore_alias_group to update and then UPI topology group will
be added in next patches.

Fix current behavior and clear attr_update group related to mapping only.

Fixes: bb42b3d39781 ("perf/x86/intel/uncore: Expose an Uncore unit to IIO PMON mapping")
Signed-off-by: Alexander Antonov <alexander.antonov@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20221117122833.3103580-4-alexander.antonov@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agoperf/x86/intel/uncore: Disable I/O stacks to PMU mapping on ICX-D
Alexander Antonov [Thu, 17 Nov 2022 12:28:26 +0000 (12:28 +0000)]
perf/x86/intel/uncore: Disable I/O stacks to PMU mapping on ICX-D

commit efe062705d149b20a15498cb999a9edbb8241e6f upstream.

Current implementation of I/O stacks to PMU mapping doesn't support ICX-D.
Detect ICX-D system to disable mapping.

Fixes: 10337e95e04c ("perf/x86/intel/uncore: Enable I/O stacks to IIO PMON mapping on ICX")
Signed-off-by: Alexander Antonov <alexander.antonov@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20221117122833.3103580-5-alexander.antonov@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agojbd2: use the correct print format
Bixuan Cui [Tue, 11 Oct 2022 11:33:44 +0000 (19:33 +0800)]
jbd2: use the correct print format

commit d87a7b4c77a997d5388566dd511ca8e6b8e8a0a8 upstream.

The print format error was found when using ftrace event:
    <...>-1406 [000] .... 23599442.895823: jbd2_end_commit: dev 252,8 transaction -1866216965 sync 0 head -1866217368
    <...>-1406 [000] .... 23599442.896299: jbd2_start_commit: dev 252,8 transaction -1866216964 sync 0

Use the correct print format for transaction, head and tid.

Fixes: 879c5e6b7cb4 ('jbd2: convert instrumentation from markers to tracepoints')
Signed-off-by: Bixuan Cui <cuibixuan@linux.alibaba.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Link: https://lore.kernel.org/r/1665488024-95172-1-git-send-email-cuibixuan@linux.alibaba.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agoktest.pl minconfig: Unset configs instead of just removing them
Steven Rostedt [Fri, 2 Dec 2022 16:59:36 +0000 (11:59 -0500)]
ktest.pl minconfig: Unset configs instead of just removing them

commit ef784eebb56425eed6e9b16e7d47e5c00dcf9c38 upstream.

After a full run of a make_min_config test, I noticed there were a lot of
CONFIGs still enabled that really should not be. Looking at them, I
noticed they were all defined as "default y". The issue is that the test
simple removes the config and re-runs make oldconfig, which enables it
again because it is set to default 'y'. Instead, explicitly disable the
config with writing "# CONFIG_FOO is not set" to the file to keep it from
being set again.

With this change, one of my box's minconfigs went from 768 configs set,
down to 521 configs set.

Link: https://lkml.kernel.org/r/20221202115936.016fce23@gandalf.local.home
Cc: stable@vger.kernel.org
Fixes: 0a05c769a9de5 ("ktest: Added config_bisect test type")
Reviewed-by: John 'Warthog9' Hawley (VMware) <warthog9@eaglescrag.net>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agokest.pl: Fix grub2 menu handling for rebooting
Steven Rostedt [Wed, 30 Nov 2022 22:54:34 +0000 (17:54 -0500)]
kest.pl: Fix grub2 menu handling for rebooting

commit 26df05a8c1420ad3de314fdd407e7fc2058cc7aa upstream.

grub2 has submenus where to use grub-reboot, it requires:

  grub-reboot X>Y

where X is the main index and Y is the submenu. Thus if you have:

menuentry 'Debian GNU/Linux' --class debian --class gnu-linux ...
[...]
}
submenu 'Advanced options for Debian GNU/Linux' $menuentry_id_option ...
        menuentry 'Debian GNU/Linux, with Linux 6.0.0-4-amd64' --class debian --class gnu-linux ...
                [...]
        }
        menuentry 'Debian GNU/Linux, with Linux 6.0.0-4-amd64 (recovery mode)' --class debian --class gnu-linux ...
[...]
        }
        menuentry 'Debian GNU/Linux, with Linux test' --class debian --class gnu-linux ...
                [...]
        }

And wanted to boot to the "Linux test" kernel, you need to run:

 # grub-reboot 1>2

As 1 is the second top menu (the submenu) and 2 is the third of the sub
menu entries.

Have the grub.cfg parsing for grub2 handle such cases.

Cc: stable@vger.kernel.org
Fixes: a15ba91361d46 ("ktest: Add support for grub2")
Reviewed-by: John 'Warthog9' Hawley (VMware) <warthog9@eaglescrag.net>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agosoc: qcom: Select REMAP_MMIO for ICC_BWMON driver
Manivannan Sadhasivam [Tue, 29 Nov 2022 07:20:22 +0000 (12:50 +0530)]
soc: qcom: Select REMAP_MMIO for ICC_BWMON driver

commit a84160fbf4f2c8c5ffa588e19ea8f92eabd7ad17 upstream.

ICC_BWMON driver uses REGMAP_MMIO for accessing the hardware registers.
So select the dependency in Kconfig. Without this, there will be errors
while building the driver with COMPILE_TEST only:

ERROR: modpost: "__devm_regmap_init_mmio_clk" [drivers/soc/qcom/icc-bwmon.ko] undefined!
make[1]: *** [scripts/Makefile.modpost:126: Module.symvers] Error 1
make: *** [Makefile:1944: modpost] Error 2

Cc: <stable@vger.kernel.org> # 6.0
Cc: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Fixes: b9c2ae6cac40 ("soc: qcom: icc-bwmon: Add bandwidth monitoring driver")
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20221129072022.41962-1-manivannan.sadhasivam@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agosoc: qcom: Select REMAP_MMIO for LLCC driver
Manivannan Sadhasivam [Tue, 29 Nov 2022 07:11:59 +0000 (12:41 +0530)]
soc: qcom: Select REMAP_MMIO for LLCC driver

commit 5d2fe2d7b616b8baa18348ead857b504fc2de336 upstream.

LLCC driver uses REGMAP_MMIO for accessing the hardware registers. So
select the dependency in Kconfig. Without this, there will be errors
while building the driver with COMPILE_TEST only:

ERROR: modpost: "__devm_regmap_init_mmio_clk" [drivers/soc/qcom/llcc-qcom.ko] undefined!
make[1]: *** [scripts/Makefile.modpost:126: Module.symvers] Error 1
make: *** [Makefile:1944: modpost] Error 2

Cc: <stable@vger.kernel.org> # 4.19
Fixes: a3134fb09e0b ("drivers: soc: Add LLCC driver")
Reported-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20221129071201.30024-2-manivannan.sadhasivam@linaro.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agoarm64: Prohibit instrumentation on arch_stack_walk()
Masami Hiramatsu (Google) [Fri, 2 Dec 2022 02:18:33 +0000 (11:18 +0900)]
arm64: Prohibit instrumentation on arch_stack_walk()

commit 0fbcd8abf3375052cc7627cc53aba6f2eb189fbb upstream.

Mark arch_stack_walk() as noinstr instead of notrace and inline functions
called from arch_stack_walk() as __always_inline so that user does not
put any instrumentations on it, because this function can be used from
return_address() which is used by lockdep.

Without this, if the kernel built with CONFIG_LOCKDEP=y, just probing
arch_stack_walk() via <tracefs>/kprobe_events will crash the kernel on
arm64.

 # echo p arch_stack_walk >> ${TRACEFS}/kprobe_events
 # echo 1 > ${TRACEFS}/events/kprobes/enable
  kprobes: Failed to recover from reentered kprobes.
  kprobes: Dump kprobe:
  .symbol_name = arch_stack_walk, .offset = 0, .addr = arch_stack_walk+0x0/0x1c0
  ------------[ cut here ]------------
  kernel BUG at arch/arm64/kernel/probes/kprobes.c:241!
  kprobes: Failed to recover from reentered kprobes.
  kprobes: Dump kprobe:
  .symbol_name = arch_stack_walk, .offset = 0, .addr = arch_stack_walk+0x0/0x1c0
  ------------[ cut here ]------------
  kernel BUG at arch/arm64/kernel/probes/kprobes.c:241!
  PREEMPT SMP
  Modules linked in:
  CPU: 0 PID: 17 Comm: migration/0 Tainted: G                 N 6.1.0-rc5+ #6
  Hardware name: linux,dummy-virt (DT)
  Stopper: 0x0 <- 0x0
  pstate: 600003c5 (nZCv DAIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
  pc : kprobe_breakpoint_handler+0x178/0x17c
  lr : kprobe_breakpoint_handler+0x178/0x17c
  sp : ffff8000080d3090
  x29: ffff8000080d3090 x28: ffff0df5845798c0 x27: ffffc4f59057a774
  x26: ffff0df5ffbba770 x25: ffff0df58f420f18 x24: ffff49006f641000
  x23: ffffc4f590579768 x22: ffff0df58f420f18 x21: ffff8000080d31c0
  x20: ffffc4f590579768 x19: ffffc4f590579770 x18: 0000000000000006
  x17: 5f6b636174735f68 x16: 637261203d207264 x15: 64612e202c30203d
  x14: 2074657366666f2e x13: 30633178302f3078 x12: 302b6b6c61775f6b
  x11: 636174735f686372 x10: ffffc4f590dc5bd8 x9 : ffffc4f58eb31958
  x8 : 00000000ffffefff x7 : ffffc4f590dc5bd8 x6 : 80000000fffff000
  x5 : 000000000000bff4 x4 : 0000000000000000 x3 : 0000000000000000
  x2 : 0000000000000000 x1 : ffff0df5845798c0 x0 : 0000000000000064
  Call trace:
  kprobes: Failed to recover from reentered kprobes.
  kprobes: Dump kprobe:
  .symbol_name = arch_stack_walk, .offset = 0, .addr = arch_stack_walk+0x0/0x1c0
  ------------[ cut here ]------------
  kernel BUG at arch/arm64/kernel/probes/kprobes.c:241!

Fixes: 39ef362d2d45 ("arm64: Make return_address() use arch_stack_walk()")
Cc: stable@vger.kernel.org
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/166994751368.439920.3236636557520824664.stgit@devnote3
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agoarm64: dts: qcom: sc8280xp: fix UFS DMA coherency
Johan Hovold [Mon, 5 Dec 2022 10:08:37 +0000 (11:08 +0100)]
arm64: dts: qcom: sc8280xp: fix UFS DMA coherency

commit 0953777640354dc459a22369eea488603d225dd9 upstream.

The SC8280XP UFS controllers are cache coherent and must be marked as
such in the devicetree to avoid potential data corruption.

Fixes: 152d1faf1e2f ("arm64: dts: qcom: add SC8280XP platform")
Cc: stable@vger.kernel.org # 6.0
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Signed-off-by: Bjorn Andersson <andersson@kernel.org>
Link: https://lore.kernel.org/r/20221205100837.29212-3-johan+linaro@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agocxl/region: Fix memdev reuse check
Fan Ni [Mon, 7 Nov 2022 21:22:31 +0000 (21:22 +0000)]
cxl/region: Fix memdev reuse check

commit f04facfb993de47e2133b2b842d72b97b1c50162 upstream.

Due to a typo, the check of whether or not a memdev has already been
used as a target for the region (above code piece) will always be
skipped. Given a memdev with more than one HDM decoder, an interleaved
region can be created that maps multiple HPAs to the same DPA. According
to CXL spec 3.0 8.1.3.8.4, "Aliasing (mapping more than one Host
Physical Address (HPA) to a single Device Physical Address) is
forbidden."

Fix this by using existing iterator for memdev reuse check.

Cc: <stable@vger.kernel.org>
Fixes: 384e624bb211 ("cxl/region: Attach endpoint decoders")
Signed-off-by: Fan Ni <fan.ni@samsung.com>
Link: https://lore.kernel.org/r/20221107212153.745993-1-fan.ni@samsung.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agomedia: stv0288: use explicitly signed char
Jason A. Donenfeld [Mon, 24 Oct 2022 15:23:43 +0000 (17:23 +0200)]
media: stv0288: use explicitly signed char

commit 7392134428c92a4cb541bd5c8f4f5c8d2e88364d upstream.

With char becoming unsigned by default, and with `char` alone being
ambiguous and based on architecture, signed chars need to be marked
explicitly as such. Use `s8` and `u8` types here, since that's what
surrounding code does. This fixes:

drivers/media/dvb-frontends/stv0288.c:471 stv0288_set_frontend() warn: assigning (-9) to unsigned variable 'tm'
drivers/media/dvb-frontends/stv0288.c:471 stv0288_set_frontend() warn: we never enter this loop

Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: linux-media@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agodrm/amdgpu: skip mes self test after s0i3 resume for MES IP v11.0
Tim Huang [Mon, 19 Dec 2022 10:32:32 +0000 (18:32 +0800)]
drm/amdgpu: skip mes self test after s0i3 resume for MES IP v11.0

commit 8660495a9c5b9afeec4cc006b3b75178f0fb2f10 upstream.

MES is part of gfxoff and MES suspend and resume are skipped for S0i3.
But the mes_self_test call path is still in the amdgpu_device_ip_late_init.
it's should also be skipped for s0ix as no hardware re-initialization
happened.

Besides, mes_self_test will free the BO that triggers a lot of warning
messages while in the suspend state.

[   81.656085] WARNING: CPU: 2 PID: 1550 at drivers/gpu/drm/amd/amdgpu/amdgpu_object.c:425 amdgpu_bo_free_kernel+0xfc/0x110 [amdgpu]
[   81.679435] Call Trace:
[   81.679726]  <TASK>
[   81.679981]  amdgpu_mes_remove_hw_queue+0x17a/0x230 [amdgpu]
[   81.680857]  amdgpu_mes_self_test+0x390/0x430 [amdgpu]
[   81.681665]  mes_v11_0_late_init+0x37/0x50 [amdgpu]
[   81.682423]  amdgpu_device_ip_late_init+0x53/0x280 [amdgpu]
[   81.683257]  amdgpu_device_resume+0xae/0x2a0 [amdgpu]
[   81.684043]  amdgpu_pmops_resume+0x37/0x70 [amdgpu]
[   81.684818]  pci_pm_resume+0x5c/0xa0
[   81.685247]  ? pci_pm_thaw+0x90/0x90
[   81.685658]  dpm_run_callback+0x4e/0x160
[   81.686110]  device_resume+0xad/0x210
[   81.686529]  async_resume+0x1e/0x40
[   81.686931]  async_run_entry_fn+0x33/0x120
[   81.687405]  process_one_work+0x21d/0x3f0
[   81.687869]  worker_thread+0x4a/0x3c0
[   81.688293]  ? process_one_work+0x3f0/0x3f0
[   81.688777]  kthread+0xff/0x130
[   81.689157]  ? kthread_complete_and_exit+0x20/0x20
[   81.689707]  ret_from_fork+0x22/0x30
[   81.690118]  </TASK>
[   81.690380] ---[ end trace 0000000000000000 ]---

v2: make the comment clean and use adev->in_s0ix instead of
adev->suspend

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.0, 6.1
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agodrm/amdgpu: skip MES for S0ix as well since it's part of GFX
Alex Deucher [Fri, 16 Dec 2022 16:42:20 +0000 (11:42 -0500)]
drm/amdgpu: skip MES for S0ix as well since it's part of GFX

commit afa6646b1c5d3affd541f76bd7476e4b835a9174 upstream.

It's also part of gfxoff.

Cc: stable@vger.kernel.org # 6.0, 6.1
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agoLinux 6.1.3
Greg Kroah-Hartman [Wed, 4 Jan 2023 10:29:02 +0000 (11:29 +0100)]
Linux 6.1.3

Link: https://lore.kernel.org/r/20230102110551.509937186@linuxfoundation.org
Tested-by: Ronald Warsow <rwarsow@gmx.de>
Tested-by: Salvatore Bonaccorso <carnil@debian.org>
Tested-by: Rudi Heitbaum <rudi@heitbaum.com>
Tested-by: Shuah Khan <skhan@linuxfoundation.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Fenil Jain <fkjainco@gmail.com>
Tested-by: Takeshi Ogasawara <takeshi.ogasawara@futuring-girl.com>
Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Tested-by: Ron Economos <re@w6rz.net>
Tested-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>
Tested-by: Allen Pais <apais@linux.microsoft.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Justin M. Forbes <jforbes@fedoraproject.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agokcsan: Instrument memcpy/memset/memmove with newer Clang
Marco Elver [Mon, 12 Sep 2022 09:45:40 +0000 (11:45 +0200)]
kcsan: Instrument memcpy/memset/memmove with newer Clang

commit 7c201739beef1a586d806463f1465429cdce34c5 upstream.

With Clang version 16+, -fsanitize=thread will turn
memcpy/memset/memmove calls in instrumented functions into
__tsan_memcpy/__tsan_memset/__tsan_memmove calls respectively.

Add these functions to the core KCSAN runtime, so that we (a) catch data
races with mem* functions, and (b) won't run into linker errors with
such newer compilers.

Cc: stable@vger.kernel.org # v5.10+
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agoSUNRPC: Don't leak netobj memory when gss_read_proxy_verf() fails
Chuck Lever [Sat, 26 Nov 2022 20:55:18 +0000 (15:55 -0500)]
SUNRPC: Don't leak netobj memory when gss_read_proxy_verf() fails

commit da522b5fe1a5f8b7c20a0023e87b52a150e53bf5 upstream.

Fixes: 030d794bf498 ("SUNRPC: Use gssproxy upcall for server RPCGSS authentication.")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: <stable@vger.kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agotpm: tpm_tis: Add the missed acpi_put_table() to fix memory leak
Hanjun Guo [Thu, 17 Nov 2022 11:23:42 +0000 (19:23 +0800)]
tpm: tpm_tis: Add the missed acpi_put_table() to fix memory leak

commit db9622f762104459ff87ecdf885cc42c18053fd9 upstream.

In check_acpi_tpm2(), we get the TPM2 table just to make
sure the table is there, not used after the init, so the
acpi_put_table() should be added to release the ACPI memory.

Fixes: 4cb586a188d4 ("tpm_tis: Consolidate the platform and acpi probe flow")
Cc: stable@vger.kernel.org
Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agotpm: tpm_crb: Add the missed acpi_put_table() to fix memory leak
Hanjun Guo [Thu, 17 Nov 2022 11:23:41 +0000 (19:23 +0800)]
tpm: tpm_crb: Add the missed acpi_put_table() to fix memory leak

commit 37e90c374dd11cf4919c51e847c6d6ced0abc555 upstream.

In crb_acpi_add(), we get the TPM2 table to retrieve information
like start method, and then assign them to the priv data, so the
TPM2 table is not used after the init, should be freed, call
acpi_put_table() to fix the memory leak.

Fixes: 30fc8d138e91 ("tpm: TPM 2.0 CRB Interface")
Cc: stable@vger.kernel.org
Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agotpm: acpi: Call acpi_put_table() to fix memory leak
Hanjun Guo [Thu, 17 Nov 2022 11:23:40 +0000 (19:23 +0800)]
tpm: acpi: Call acpi_put_table() to fix memory leak

commit 8740a12ca2e2959531ad253bac99ada338b33d80 upstream.

The start and length of the event log area are obtained from
TPM2 or TCPA table, so we call acpi_get_table() to get the
ACPI information, but the acpi_get_table() should be coupled with
acpi_put_table() to release the ACPI memory, add the acpi_put_table()
properly to fix the memory leak.

While we are at it, remove the redundant empty line at the
end of the tpm_read_log_acpi().

Fixes: 0bfb23746052 ("tpm: Move eventlog files to a subdirectory")
Fixes: 85467f63a05c ("tpm: Add support for event log pointer found in TPM2 ACPI table")
Cc: stable@vger.kernel.org
Signed-off-by: Hanjun Guo <guohanjun@huawei.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agommc: vub300: fix warning - do not call blocking ops when !TASK_RUNNING
Deren Wu [Sun, 4 Dec 2022 08:24:16 +0000 (16:24 +0800)]
mmc: vub300: fix warning - do not call blocking ops when !TASK_RUNNING

commit 4a44cd249604e29e7b90ae796d7692f5773dd348 upstream.

vub300_enable_sdio_irq() works with mutex and need TASK_RUNNING here.
Ensure that we mark current as TASK_RUNNING for sleepable context.

[   77.554641] do not call blocking ops when !TASK_RUNNING; state=1 set at [<ffffffff92a72c1d>] sdio_irq_thread+0x17d/0x5b0
[   77.554652] WARNING: CPU: 2 PID: 1983 at kernel/sched/core.c:9813 __might_sleep+0x116/0x160
[   77.554905] CPU: 2 PID: 1983 Comm: ksdioirqd/mmc1 Tainted: G           OE      6.1.0-rc5 #1
[   77.554910] Hardware name: Intel(R) Client Systems NUC8i7BEH/NUC8BEB, BIOS BECFL357.86A.0081.2020.0504.1834 05/04/2020
[   77.554912] RIP: 0010:__might_sleep+0x116/0x160
[   77.554920] RSP: 0018:ffff888107b7fdb8 EFLAGS: 00010282
[   77.554923] RAX: 0000000000000000 RBX: ffff888118c1b740 RCX: 0000000000000000
[   77.554926] RDX: 0000000000000001 RSI: 0000000000000004 RDI: ffffed1020f6ffa9
[   77.554928] RBP: ffff888107b7fde0 R08: 0000000000000001 R09: ffffed1043ea60ba
[   77.554930] R10: ffff88821f5305cb R11: ffffed1043ea60b9 R12: ffffffff93aa3a60
[   77.554932] R13: 000000000000011b R14: 7fffffffffffffff R15: ffffffffc0558660
[   77.554934] FS:  0000000000000000(0000) GS:ffff88821f500000(0000) knlGS:0000000000000000
[   77.554937] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   77.554939] CR2: 00007f8a44010d68 CR3: 000000024421a003 CR4: 00000000003706e0
[   77.554942] Call Trace:
[   77.554944]  <TASK>
[   77.554952]  mutex_lock+0x78/0xf0
[   77.554973]  vub300_enable_sdio_irq+0x103/0x3c0 [vub300]
[   77.554981]  sdio_irq_thread+0x25c/0x5b0
[   77.555006]  kthread+0x2b8/0x370
[   77.555017]  ret_from_fork+0x1f/0x30
[   77.555023]  </TASK>
[   77.555025] ---[ end trace 0000000000000000 ]---

Fixes: 88095e7b473a ("mmc: Add new VUB300 USB-to-SD/SDIO/MMC driver")
Signed-off-by: Deren Wu <deren.wu@mediatek.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/87dc45b122d26d63c80532976813c9365d7160b3.1670140888.git.deren.wu@mediatek.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agoblock: Do not reread partition table on exclusively open device
Jan Kara [Wed, 30 Nov 2022 17:56:53 +0000 (18:56 +0100)]
block: Do not reread partition table on exclusively open device

commit 36369f46e91785688a5f39d7a5590e3f07981316 upstream.

Since commit 10c70d95c0f2 ("block: remove the bd_openers checks in
blk_drop_partitions") we allow rereading of partition table although
there are users of the block device. This has an undesirable consequence
that e.g. if sda and sdb are assembled to a RAID1 device md0 with
partitions, BLKRRPART ioctl on sda will rescan partition table and
create sda1 device. This partition device under a raid device confuses
some programs (such as libstorage-ng used for initial partitioning for
distribution installation) leading to failures.

Fix the problem refusing to rescan partitions if there is another user
that has the block device exclusively open.

Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20221130135344.2ul4cyfstfs3znxg@quack3
Fixes: 10c70d95c0f2 ("block: remove the bd_openers checks in blk_drop_partitions")
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20221130175653.24299-1-jack@suse.cz
[axboe: fold in followup fix]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agof2fs: allow to read node block after shutdown
Jaegeuk Kim [Wed, 9 Nov 2022 01:59:34 +0000 (17:59 -0800)]
f2fs: allow to read node block after shutdown

commit e6ecb142429183cef4835f31d4134050ae660032 upstream.

If block address is still alive, we should give a valid node block even after
shutdown. Otherwise, we can see zero data when reading out a file.

Cc: stable@vger.kernel.org
Fixes: 83a3bfdb5a8a ("f2fs: indicate shutdown f2fs to allow unmount successfully")
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agof2fs: should put a page when checking the summary info
Pavel Machek [Mon, 24 Oct 2022 17:30:12 +0000 (19:30 +0200)]
f2fs: should put a page when checking the summary info

commit c3db3c2fd9992c08f49aa93752d3c103c3a4f6aa upstream.

The commit introduces another bug.

Cc: stable@vger.kernel.org
Fixes: c6ad7fd16657e ("f2fs: fix to do sanity check on summary info")
Signed-off-by: Pavel Machek <pavel@denx.de>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agomm, compaction: fix fast_isolate_around() to stay within boundaries
NARIBAYASHI Akira [Wed, 26 Oct 2022 11:24:38 +0000 (20:24 +0900)]
mm, compaction: fix fast_isolate_around() to stay within boundaries

commit be21b32afe470c5ae98e27e49201158a47032942 upstream.

Depending on the memory configuration, isolate_freepages_block() may scan
pages out of the target range and causes panic.

Panic can occur on systems with multiple zones in a single pageblock.

The reason it is rare is that it only happens in special
configurations.  Depending on how many similar systems there are, it
may be a good idea to fix this problem for older kernels as well.

The problem is that pfn as argument of fast_isolate_around() could be out
of the target range.  Therefore we should consider the case where pfn <
start_pfn, and also the case where end_pfn < pfn.

This problem should have been addressd by the commit 6e2b7044c199 ("mm,
compaction: make fast_isolate_freepages() stay within zone") but there was
an oversight.

 Case1: pfn < start_pfn

  <at memory compaction for node Y>
  |  node X's zone  | node Y's zone
  +-----------------+------------------------------...
   pageblock    ^   ^     ^
  +-----------+-----------+-----------+-----------+...
                ^   ^     ^
                ^   ^      end_pfn
                ^    start_pfn = cc->zone->zone_start_pfn
                 pfn
                <---------> scanned range by "Scan After"

 Case2: end_pfn < pfn

  <at memory compaction for node X>
  |  node X's zone  | node Y's zone
  +-----------------+------------------------------...
   pageblock  ^     ^   ^
  +-----------+-----------+-----------+-----------+...
              ^     ^   ^
              ^     ^    pfn
              ^      end_pfn
               start_pfn
              <---------> scanned range by "Scan Before"

It seems that there is no good reason to skip nr_isolated pages just after
given pfn.  So let perform simple scan from start to end instead of
dividing the scan into "Before" and "After".

Link: https://lkml.kernel.org/r/20221026112438.236336-1-a.naribayashi@fujitsu.com
Fixes: 6e2b7044c199 ("mm, compaction: make fast_isolate_freepages() stay within zone").
Signed-off-by: NARIBAYASHI Akira <a.naribayashi@fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agomd: fix a crash in mempool_free
Mikulas Patocka [Fri, 4 Nov 2022 13:53:38 +0000 (09:53 -0400)]
md: fix a crash in mempool_free

commit 341097ee53573e06ab9fc675d96a052385b851fa upstream.

There's a crash in mempool_free when running the lvm test
shell/lvchange-rebuild-raid.sh.

The reason for the crash is this:
* super_written calls atomic_dec_and_test(&mddev->pending_writes) and
  wake_up(&mddev->sb_wait). Then it calls rdev_dec_pending(rdev, mddev)
  and bio_put(bio).
* so, the process that waited on sb_wait and that is woken up is racing
  with bio_put(bio).
* if the process wins the race, it calls bioset_exit before bio_put(bio)
  is executed.
* bio_put(bio) attempts to free a bio into a destroyed bio set - causing
  a crash in mempool_free.

We fix this bug by moving bio_put before atomic_dec_and_test.

We also move rdev_dec_pending before atomic_dec_and_test as suggested by
Neil Brown.

The function md_end_flush has a similar bug - we must call bio_put before
we decrement the number of in-progress bios.

 BUG: kernel NULL pointer dereference, address: 0000000000000000
 #PF: supervisor write access in kernel mode
 #PF: error_code(0x0002) - not-present page
 PGD 11557f0067 P4D 11557f0067 PUD 0
 Oops: 0002 [#1] PREEMPT SMP
 CPU: 0 PID: 73 Comm: kworker/0:1 Not tainted 6.1.0-rc3 #5
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
 Workqueue: kdelayd flush_expired_bios [dm_delay]
 RIP: 0010:mempool_free+0x47/0x80
 Code: 48 89 ef 5b 5d ff e0 f3 c3 48 89 f7 e8 32 45 3f 00 48 63 53 08 48 89 c6 3b 53 04 7d 2d 48 8b 43 10 8d 4a 01 48 89 df 89 4b 08 <48> 89 2c d0 e8 b0 45 3f 00 48 8d 7b 30 5b 5d 31 c9 ba 01 00 00 00
 RSP: 0018:ffff88910036bda8 EFLAGS: 00010093
 RAX: 0000000000000000 RBX: ffff8891037b65d8 RCX: 0000000000000001
 RDX: 0000000000000000 RSI: 0000000000000202 RDI: ffff8891037b65d8
 RBP: ffff8891447ba240 R08: 0000000000012908 R09: 00000000003d0900
 R10: 0000000000000000 R11: 0000000000173544 R12: ffff889101a14000
 R13: ffff8891562ac300 R14: ffff889102b41440 R15: ffffe8ffffa00d05
 FS:  0000000000000000(0000) GS:ffff88942fa00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000000 CR3: 0000001102e99000 CR4: 00000000000006b0
 Call Trace:
  <TASK>
  clone_endio+0xf4/0x1c0 [dm_mod]
  clone_endio+0xf4/0x1c0 [dm_mod]
  __submit_bio+0x76/0x120
  submit_bio_noacct_nocheck+0xb6/0x2a0
  flush_expired_bios+0x28/0x2f [dm_delay]
  process_one_work+0x1b4/0x300
  worker_thread+0x45/0x3e0
  ? rescuer_thread+0x380/0x380
  kthread+0xc2/0x100
  ? kthread_complete_and_exit+0x20/0x20
  ret_from_fork+0x1f/0x30
  </TASK>
 Modules linked in: brd dm_delay dm_raid dm_mod af_packet uvesafb cfbfillrect cfbimgblt cn cfbcopyarea fb font fbdev tun autofs4 binfmt_misc configfs ipv6 virtio_rng virtio_balloon rng_core virtio_net pcspkr net_failover failover qemu_fw_cfg button mousedev raid10 raid456 libcrc32c async_raid6_recov async_memcpy async_pq raid6_pq async_xor xor async_tx raid1 raid0 md_mod sd_mod t10_pi crc64_rocksoft crc64 virtio_scsi scsi_mod evdev psmouse bsg scsi_common [last unloaded: brd]
 CR2: 0000000000000000
 ---[ end trace 0000000000000000 ]---

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Song Liu <song@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agomfd: mt6360: Add bounds checking in Regmap read/write call-backs
ChiYuan Huang [Thu, 29 Sep 2022 02:00:17 +0000 (10:00 +0800)]
mfd: mt6360: Add bounds checking in Regmap read/write call-backs

commit 5f4f94e9f26cca6514474b307b59348b8485e711 upstream.

Fix the potential risk of OOB read if bank index is over the maximum.

Refer to the discussion list for the experiment result on mt6370.
https://lore.kernel.org/all/20220914013345.GA5802@cyhuang-hp-elitebook-840-g3.rt/
If not to check the bound, there is the same issue on mt6360.

Cc: stable@vger.kernel.org
Fixes: 3b0850440a06c (mfd: mt6360: Merge different sub-devices I2C read/write)
Signed-off-by: ChiYuan Huang <cy_huang@richtek.com>
Signed-off-by: Lee Jones <lee@kernel.org>
Link: https://lore.kernel.org/r/1664416817-31590-1-git-send-email-u0084500@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agopnode: terminate at peers of source
Christian Brauner [Sat, 17 Dec 2022 21:28:40 +0000 (22:28 +0100)]
pnode: terminate at peers of source

commit 11933cf1d91d57da9e5c53822a540bbdc2656c16 upstream.

The propagate_mnt() function handles mount propagation when creating
mounts and propagates the source mount tree @source_mnt to all
applicable nodes of the destination propagation mount tree headed by
@dest_mnt.

Unfortunately it contains a bug where it fails to terminate at peers of
@source_mnt when looking up copies of the source mount that become
masters for copies of the source mount tree mounted on top of slaves in
the destination propagation tree causing a NULL dereference.

Once the mechanics of the bug are understood it's easy to trigger.
Because of unprivileged user namespaces it is available to unprivileged
users.

While fixing this bug we've gotten confused multiple times due to
unclear terminology or missing concepts. So let's start this with some
clarifications:

* The terms "master" or "peer" denote a shared mount. A shared mount
  belongs to a peer group.

* A peer group is a set of shared mounts that propagate to each other.
  They are identified by a peer group id. The peer group id is available
  in @shared_mnt->mnt_group_id.
  Shared mounts within the same peer group have the same peer group id.
  The peers in a peer group can be reached via @shared_mnt->mnt_share.

* The terms "slave mount" or "dependent mount" denote a mount that
  receives propagation from a peer in a peer group. IOW, shared mounts
  may have slave mounts and slave mounts have shared mounts as their
  master. Slave mounts of a given peer in a peer group are listed on
  that peers slave list available at @shared_mnt->mnt_slave_list.

* The term "master mount" denotes a mount in a peer group. IOW, it
  denotes a shared mount or a peer mount in a peer group. The term
  "master mount" - or "master" for short - is mostly used when talking
  in the context of slave mounts that receive propagation from a master
  mount. A master mount of a slave identifies the closest peer group a
  slave mount receives propagation from. The master mount of a slave can
  be identified via @slave_mount->mnt_master. Different slaves may point
  to different masters in the same peer group.

* Multiple peers in a peer group can have non-empty ->mnt_slave_lists.
  Non-empty ->mnt_slave_lists of peers don't intersect. Consequently, to
  ensure all slave mounts of a peer group are visited the
  ->mnt_slave_lists of all peers in a peer group have to be walked.

* Slave mounts point to a peer in the closest peer group they receive
  propagation from via @slave_mnt->mnt_master (see above). Together with
  these peers they form a propagation group (see below). The closest
  peer group can thus be identified through the peer group id
  @slave_mnt->mnt_master->mnt_group_id of the peer/master that a slave
  mount receives propagation from.

* A shared-slave mount is a slave mount to a peer group pg1 while also
  a peer in another peer group pg2. IOW, a peer group may receive
  propagation from another peer group.

  If a peer group pg1 is a slave to another peer group pg2 then all
  peers in peer group pg1 point to the same peer in peer group pg2 via
  ->mnt_master. IOW, all peers in peer group pg1 appear on the same
  ->mnt_slave_list. IOW, they cannot be slaves to different peer groups.

* A pure slave mount is a slave mount that is a slave to a peer group
  but is not a peer in another peer group.

* A propagation group denotes the set of mounts consisting of a single
  peer group pg1 and all slave mounts and shared-slave mounts that point
  to a peer in that peer group via ->mnt_master. IOW, all slave mounts
  such that @slave_mnt->mnt_master->mnt_group_id is equal to
  @shared_mnt->mnt_group_id.

  The concept of a propagation group makes it easier to talk about a
  single propagation level in a propagation tree.

  For example, in propagate_mnt() the immediate peers of @dest_mnt and
  all slaves of @dest_mnt's peer group form a propagation group propg1.
  So a shared-slave mount that is a slave in propg1 and that is a peer
  in another peer group pg2 forms another propagation group propg2
  together with all slaves that point to that shared-slave mount in
  their ->mnt_master.

* A propagation tree refers to all mounts that receive propagation
  starting from a specific shared mount.

  For example, for propagate_mnt() @dest_mnt is the start of a
  propagation tree. The propagation tree ecompasses all mounts that
  receive propagation from @dest_mnt's peer group down to the leafs.

With that out of the way let's get to the actual algorithm.

We know that @dest_mnt is guaranteed to be a pure shared mount or a
shared-slave mount. This is guaranteed by a check in
attach_recursive_mnt(). So propagate_mnt() will first propagate the
source mount tree to all peers in @dest_mnt's peer group:

for (n = next_peer(dest_mnt); n != dest_mnt; n = next_peer(n)) {
        ret = propagate_one(n);
        if (ret)
               goto out;
}

Notice, that the peer propagation loop of propagate_mnt() doesn't
propagate @dest_mnt itself. @dest_mnt is mounted directly in
attach_recursive_mnt() after we propagated to the destination
propagation tree.

The mount that will be mounted on top of @dest_mnt is @source_mnt. This
copy was created earlier even before we entered attach_recursive_mnt()
and doesn't concern us a lot here.

It's just important to notice that when propagate_mnt() is called
@source_mnt will not yet have been mounted on top of @dest_mnt. Thus,
@source_mnt->mnt_parent will either still point to @source_mnt or - in
the case @source_mnt is moved and thus already attached - still to its
former parent.

For each peer @m in @dest_mnt's peer group propagate_one() will create a
new copy of the source mount tree and mount that copy @child on @m such
that @child->mnt_parent points to @m after propagate_one() returns.

propagate_one() will stash the last destination propagation node @m in
@last_dest and the last copy it created for the source mount tree in
@last_source.

Hence, if we call into propagate_one() again for the next destination
propagation node @m, @last_dest will point to the previous destination
propagation node and @last_source will point to the previous copy of the
source mount tree and mounted on @last_dest.

Each new copy of the source mount tree is created from the previous copy
of the source mount tree. This will become important later.

The peer loop in propagate_mnt() is straightforward. We iterate through
the peers copying and updating @last_source and @last_dest as we go
through them and mount each copy of the source mount tree @child on a
peer @m in @dest_mnt's peer group.

After propagate_mnt() handled the peers in @dest_mnt's peer group
propagate_mnt() will propagate the source mount tree down the
propagation tree that @dest_mnt's peer group propagates to:

for (m = next_group(dest_mnt, dest_mnt); m;
                m = next_group(m, dest_mnt)) {
        /* everything in that slave group */
        n = m;
        do {
                ret = propagate_one(n);
                if (ret)
                        goto out;
                n = next_peer(n);
        } while (n != m);
}

The next_group() helper will recursively walk the destination
propagation tree, descending into each propagation group of the
propagation tree.

The important part is that it takes care to propagate the source mount
tree to all peers in the peer group of a propagation group before it
propagates to the slaves to those peers in the propagation group. IOW,
it creates and mounts copies of the source mount tree that become
masters before it creates and mounts copies of the source mount tree
that become slaves to these masters.

It is important to remember that propagating the source mount tree to
each mount @m in the destination propagation tree simply means that we
create and mount new copies @child of the source mount tree on @m such
that @child->mnt_parent points to @m.

Since we know that each node @m in the destination propagation tree
headed by @dest_mnt's peer group will be overmounted with a copy of the
source mount tree and since we know that the propagation properties of
each copy of the source mount tree we create and mount at @m will mostly
mirror the propagation properties of @m. We can use that information to
create and mount the copies of the source mount tree that become masters
before their slaves.

The easy case is always when @m and @last_dest are peers in a peer group
of a given propagation group. In that case we know that we can simply
copy @last_source without having to figure out what the master for the
new copy @child of the source mount tree needs to be as we've done that
in a previous call to propagate_one().

The hard case is when we're dealing with a slave mount or a shared-slave
mount @m in a destination propagation group that we need to create and
mount a copy of the source mount tree on.

For each propagation group in the destination propagation tree we
propagate the source mount tree to we want to make sure that the copies
@child of the source mount tree we create and mount on slaves @m pick an
ealier copy of the source mount tree that we mounted on a master @m of
the destination propagation group as their master. This is a mouthful
but as far as we can tell that's the core of it all.

But, if we keep track of the masters in the destination propagation tree
@m we can use the information to find the correct master for each copy
of the source mount tree we create and mount at the slaves in the
destination propagation tree @m.

Let's walk through the base case as that's still fairly easy to grasp.

If we're dealing with the first slave in the propagation group that
@dest_mnt is in then we don't yet have marked any masters in the
destination propagation tree.

We know the master for the first slave to @dest_mnt's peer group is
simple @dest_mnt. So we expect this algorithm to yield a copy of the
source mount tree that was mounted on a peer in @dest_mnt's peer group
as the master for the copy of the source mount tree we want to mount at
the first slave @m:

for (n = m; ; n = p) {
        p = n->mnt_master;
        if (p == dest_master || IS_MNT_MARKED(p))
                break;
}

For the first slave we walk the destination propagation tree all the way
up to a peer in @dest_mnt's peer group. IOW, the propagation hierarchy
can be walked by walking up the @mnt->mnt_master hierarchy of the
destination propagation tree @m. We will ultimately find a peer in
@dest_mnt's peer group and thus ultimately @dest_mnt->mnt_master.

Btw, here the assumption we listed at the beginning becomes important.
Namely, that peers in a peer group pg1 that are slaves in another peer
group pg2 appear on the same ->mnt_slave_list. IOW, all slaves who are
peers in peer group pg1 point to the same peer in peer group pg2 via
their ->mnt_master. Otherwise the termination condition in the code
above would be wrong and next_group() would be broken too.

So the first iteration sets:

n = m;
p = n->mnt_master;

such that @p now points to a peer or @dest_mnt itself. We walk up one
more level since we don't have any marked mounts. So we end up with:

n = dest_mnt;
p = dest_mnt->mnt_master;

If @dest_mnt's peer group is not slave to another peer group then @p is
now NULL. If @dest_mnt's peer group is a slave to another peer group
then @p now points to @dest_mnt->mnt_master points which is a master
outside the propagation tree we're dealing with.

Now we need to figure out the master for the copy of the source mount
tree we're about to create and mount on the first slave of @dest_mnt's
peer group:

do {
        struct mount *parent = last_source->mnt_parent;
        if (last_source == first_source)
                break;
        done = parent->mnt_master == p;
        if (done && peers(n, parent))
                break;
        last_source = last_source->mnt_master;
} while (!done);

We know that @last_source->mnt_parent points to @last_dest and
@last_dest is the last peer in @dest_mnt's peer group we propagated to
in the peer loop in propagate_mnt().

Consequently, @last_source is the last copy we created and mount on that
last peer in @dest_mnt's peer group. So @last_source is the master we
want to pick.

We know that @last_source->mnt_parent->mnt_master points to
@last_dest->mnt_master. We also know that @last_dest->mnt_master is
either NULL or points to a master outside of the destination propagation
tree and so does @p. Hence:

done = parent->mnt_master == p;

is trivially true in the base condition.

We also know that for the first slave mount of @dest_mnt's peer group
that @last_dest either points @dest_mnt itself because it was
initialized to:

last_dest = dest_mnt;

at the beginning of propagate_mnt() or it will point to a peer of
@dest_mnt in its peer group. In both cases it is guaranteed that on the
first iteration @n and @parent are peers (Please note the check for
peers here as that's important.):

if (done && peers(n, parent))
        break;

So, as we expected, we select @last_source, which referes to the last
copy of the source mount tree we mounted on the last peer in @dest_mnt's
peer group, as the master of the first slave in @dest_mnt's peer group.
The rest is taken care of by clone_mnt(last_source, ...). We'll skip
over that part otherwise this becomes a blogpost.

At the end of propagate_mnt() we now mark @m->mnt_master as the first
master in the destination propagation tree that is distinct from
@dest_mnt->mnt_master. IOW, we mark @dest_mnt itself as a master.

By marking @dest_mnt or one of it's peers we are able to easily find it
again when we later lookup masters for other copies of the source mount
tree we mount copies of the source mount tree on slaves @m to
@dest_mnt's peer group. This, in turn allows us to find the master we
selected for the copies of the source mount tree we mounted on master in
the destination propagation tree again.

The important part is to realize that the code makes use of the fact
that the last copy of the source mount tree stashed in @last_source was
mounted on top of the previous destination propagation node @last_dest.
What this means is that @last_source allows us to walk the destination
propagation hierarchy the same way each destination propagation node @m
does.

If we take @last_source, which is the copy of @source_mnt we have
mounted on @last_dest in the previous iteration of propagate_one(), then
we know @last_source->mnt_parent points to @last_dest but we also know
that as we walk through the destination propagation tree that
@last_source->mnt_master will point to an earlier copy of the source
mount tree we mounted one an earlier destination propagation node @m.

IOW, @last_source->mnt_parent will be our hook into the destination
propagation tree and each consecutive @last_source->mnt_master will lead
us to an earlier propagation node @m via
@last_source->mnt_master->mnt_parent.

Hence, by walking up @last_source->mnt_master, each of which is mounted
on a node that is a master @m in the destination propagation tree we can
also walk up the destination propagation hierarchy.

So, for each new destination propagation node @m we use the previous
copy of @last_source and the fact it's mounted on the previous
propagation node @last_dest via @last_source->mnt_master->mnt_parent to
determine what the master of the new copy of @last_source needs to be.

The goal is to find the _closest_ master that the new copy of the source
mount tree we are about to create and mount on a slave @m in the
destination propagation tree needs to pick. IOW, we want to find a
suitable master in the propagation group.

As the propagation structure of the source mount propagation tree we
create mirrors the propagation structure of the destination propagation
tree we can find @m's closest master - i.e., a marked master - which is
a peer in the closest peer group that @m receives propagation from. We
store that closest master of @m in @p as before and record the slave to
that master in @n

We then search for this master @p via @last_source by walking up the
master hierarchy starting from the last copy of the source mount tree
stored in @last_source that we created and mounted on the previous
destination propagation node @m.

We will try to find the master by walking @last_source->mnt_master and
by comparing @last_source->mnt_master->mnt_parent->mnt_master to @p. If
we find @p then we can figure out what earlier copy of the source mount
tree needs to be the master for the new copy of the source mount tree
we're about to create and mount at the current destination propagation
node @m.

If @last_source->mnt_master->mnt_parent and @n are peers then we know
that the closest master they receive propagation from is
@last_source->mnt_master->mnt_parent->mnt_master. If not then the
closest immediate peer group that they receive propagation from must be
one level higher up.

This builds on the earlier clarification at the beginning that all peers
in a peer group which are slaves of other peer groups all point to the
same ->mnt_master, i.e., appear on the same ->mnt_slave_list, of the
closest peer group that they receive propagation from.

However, terminating the walk has corner cases.

If the closest marked master for a given destination node @m cannot be
found by walking up the master hierarchy via @last_source->mnt_master
then we need to terminate the walk when we encounter @source_mnt again.

This isn't an arbitrary termination. It simply means that the new copy
of the source mount tree we're about to create has a copy of the source
mount tree we created and mounted on a peer in @dest_mnt's peer group as
its master. IOW, @source_mnt is the peer in the closest peer group that
the new copy of the source mount tree receives propagation from.

We absolutely have to stop @source_mnt because @last_source->mnt_master
either points outside the propagation hierarchy we're dealing with or it
is NULL because @source_mnt isn't a shared-slave.

So continuing the walk past @source_mnt would cause a NULL dereference
via @last_source->mnt_master->mnt_parent. And so we have to stop the
walk when we encounter @source_mnt again.

One scenario where this can happen is when we first handled a series of
slaves of @dest_mnt's peer group and then encounter peers in a new peer
group that is a slave to @dest_mnt's peer group. We handle them and then
we encounter another slave mount to @dest_mnt that is a pure slave to
@dest_mnt's peer group. That pure slave will have a peer in @dest_mnt's
peer group as its master. Consequently, the new copy of the source mount
tree will need to have @source_mnt as it's master. So we walk the
propagation hierarchy all the way up to @source_mnt based on
@last_source->mnt_master.

So terminate on @source_mnt, easy peasy. Except, that the check misses
something that the rest of the algorithm already handles.

If @dest_mnt has peers in it's peer group the peer loop in
propagate_mnt():

for (n = next_peer(dest_mnt); n != dest_mnt; n = next_peer(n)) {
        ret = propagate_one(n);
        if (ret)
                goto out;
}

will consecutively update @last_source with each previous copy of the
source mount tree we created and mounted at the previous peer in
@dest_mnt's peer group. So after that loop terminates @last_source will
point to whatever copy of the source mount tree was created and mounted
on the last peer in @dest_mnt's peer group.

Furthermore, if there is even a single additional peer in @dest_mnt's
peer group then @last_source will __not__ point to @source_mnt anymore.
Because, as we mentioned above, @dest_mnt isn't even handled in this
loop but directly in attach_recursive_mnt(). So it can't even accidently
come last in that peer loop.

So the first time we handle a slave mount @m of @dest_mnt's peer group
the copy of the source mount tree we create will make the __last copy of
the source mount tree we created and mounted on the last peer in
@dest_mnt's peer group the master of the new copy of the source mount
tree we create and mount on the first slave of @dest_mnt's peer group__.

But this means that the termination condition that checks for
@source_mnt is wrong. The @source_mnt cannot be found anymore by
propagate_one(). Instead it will find the last copy of the source mount
tree we created and mounted for the last peer of @dest_mnt's peer group
again. And that is a peer of @source_mnt not @source_mnt itself.

IOW, we fail to terminate the loop correctly and ultimately dereference
@last_source->mnt_master->mnt_parent. When @source_mnt's peer group
isn't slave to another peer group then @last_source->mnt_master is NULL
causing the splat below.

For example, assume @dest_mnt is a pure shared mount and has three peers
in its peer group:

===================================================================================
                                         mount-id   mount-parent-id   peer-group-id
===================================================================================
(@dest_mnt) mnt_master[216]              309        297               shared:216
    \
     (@source_mnt) mnt_master[218]:      609        609               shared:218

(1) mnt_master[216]:                     607        605               shared:216
    \
     (P1) mnt_master[218]:               624        607               shared:218

(2) mnt_master[216]:                     576        574               shared:216
    \
     (P2) mnt_master[218]:               625        576               shared:218

(3) mnt_master[216]:                     545        543               shared:216
    \
     (P3) mnt_master[218]:               626        545               shared:218

After this sequence has been processed @last_source will point to (P3),
the copy generated for the third peer in @dest_mnt's peer group we
handled. So the copy of the source mount tree (P4) we create and mount
on the first slave of @dest_mnt's peer group:

===================================================================================
                                         mount-id   mount-parent-id   peer-group-id
===================================================================================
    mnt_master[216]                      309        297               shared:216
   /
  /
(S0) mnt_slave                           483        481               master:216
  \
   \    (P3) mnt_master[218]             626        545               shared:218
    \  /
     \/
    (P4) mnt_slave                       627        483               master:218

will pick the last copy of the source mount tree (P3) as master, not (S0).

When walking the propagation hierarchy via @last_source's master
hierarchy we encounter (P3) but not (S0), i.e., @source_mnt.

We can fix this in multiple ways:

(1) By setting @last_source to @source_mnt after we processed the peers
    in @dest_mnt's peer group right after the peer loop in
    propagate_mnt().

(2) By changing the termination condition that relies on finding exactly
    @source_mnt to finding a peer of @source_mnt.

(3) By only moving @last_source when we actually venture into a new peer
    group or some clever variant thereof.

The first two options are minimally invasive and what we want as a fix.
The third option is more intrusive but something we'd like to explore in
the near future.

This passes all LTP tests and specifically the mount propagation
testsuite part of it. It also holds up against all known reproducers of
this issues.

Final words.
First, this is a clever but __worringly__ underdocumented algorithm.
There isn't a single detailed comment to be found in next_group(),
propagate_one() or anywhere else in that file for that matter. This has
been a giant pain to understand and work through and a bug like this is
insanely difficult to fix without a detailed understanding of what's
happening. Let's not talk about the amount of time that was sunk into
fixing this.

Second, all the cool kids with access to
unshare --mount --user --map-root --propagation=unchanged
are going to have a lot of fun. IOW, triggerable by unprivileged users
while namespace_lock() lock is held.

[  115.848393] BUG: kernel NULL pointer dereference, address: 0000000000000010
[  115.848967] #PF: supervisor read access in kernel mode
[  115.849386] #PF: error_code(0x0000) - not-present page
[  115.849803] PGD 0 P4D 0
[  115.850012] Oops: 0000 [#1] PREEMPT SMP PTI
[  115.850354] CPU: 0 PID: 15591 Comm: mount Not tainted 6.1.0-rc7 #3
[  115.850851] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS
VirtualBox 12/01/2006
[  115.851510] RIP: 0010:propagate_one.part.0+0x7f/0x1a0
[  115.851924] Code: 75 eb 4c 8b 05 c2 25 37 02 4c 89 ca 48 8b 4a 10
49 39 d0 74 1e 48 3b 81 e0 00 00 00 74 26 48 8b 92 e0 00 00 00 be 01
00 00 00 <48> 8b 4a 10 49 39 d0 75 e2 40 84 f6 74 38 4c 89 05 84 25 37
02 4d
[  115.853441] RSP: 0018:ffffb8d5443d7d50 EFLAGS: 00010282
[  115.853865] RAX: ffff8e4d87c41c80 RBX: ffff8e4d88ded780 RCX: ffff8e4da4333a00
[  115.854458] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8e4d88ded780
[  115.855044] RBP: ffff8e4d88ded780 R08: ffff8e4da4338000 R09: ffff8e4da43388c0
[  115.855693] R10: 0000000000000002 R11: ffffb8d540158000 R12: ffffb8d5443d7da8
[  115.856304] R13: ffff8e4d88ded780 R14: 0000000000000000 R15: 0000000000000000
[  115.856859] FS:  00007f92c90c9800(0000) GS:ffff8e4dfdc00000(0000)
knlGS:0000000000000000
[  115.857531] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  115.858006] CR2: 0000000000000010 CR3: 0000000022f4c002 CR4: 00000000000706f0
[  115.858598] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  115.859393] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  115.860099] Call Trace:
[  115.860358]  <TASK>
[  115.860535]  propagate_mnt+0x14d/0x190
[  115.860848]  attach_recursive_mnt+0x274/0x3e0
[  115.861212]  path_mount+0x8c8/0xa60
[  115.861503]  __x64_sys_mount+0xf6/0x140
[  115.861819]  do_syscall_64+0x5b/0x80
[  115.862117]  ? do_faccessat+0x123/0x250
[  115.862435]  ? syscall_exit_to_user_mode+0x17/0x40
[  115.862826]  ? do_syscall_64+0x67/0x80
[  115.863133]  ? syscall_exit_to_user_mode+0x17/0x40
[  115.863527]  ? do_syscall_64+0x67/0x80
[  115.863835]  ? do_syscall_64+0x67/0x80
[  115.864144]  ? do_syscall_64+0x67/0x80
[  115.864452]  ? exc_page_fault+0x70/0x170
[  115.864775]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[  115.865187] RIP: 0033:0x7f92c92b0ebe
[  115.865480] Code: 48 8b 0d 75 4f 0c 00 f7 d8 64 89 01 48 83 c8 ff
c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 a5 00 00
00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 42 4f 0c 00 f7 d8 64 89
01 48
[  115.866984] RSP: 002b:00007fff000aa728 EFLAGS: 00000246 ORIG_RAX:
00000000000000a5
[  115.867607] RAX: ffffffffffffffda RBX: 000055a77888d6b0 RCX: 00007f92c92b0ebe
[  115.868240] RDX: 000055a77888d8e0 RSI: 000055a77888e6e0 RDI: 000055a77888e620
[  115.868823] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000001
[  115.869403] R10: 0000000000001000 R11: 0000000000000246 R12: 000055a77888e620
[  115.869994] R13: 000055a77888d8e0 R14: 00000000ffffffff R15: 00007f92c93e4076
[  115.870581]  </TASK>
[  115.870763] Modules linked in: nft_fib_inet nft_fib_ipv4
nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6
nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6
nf_defrag_ipv4 ip_set rfkill nf_tables nfnetlink qrtr snd_intel8x0
sunrpc snd_ac97_codec ac97_bus snd_pcm snd_timer intel_rapl_msr
intel_rapl_common snd vboxguest intel_powerclamp video rapl joydev
soundcore i2c_piix4 wmi fuse zram xfs vmwgfx crct10dif_pclmul
crc32_pclmul crc32c_intel polyval_clmulni polyval_generic
drm_ttm_helper ttm e1000 ghash_clmulni_intel serio_raw ata_generic
pata_acpi scsi_dh_rdac scsi_dh_emc scsi_dh_alua dm_multipath
[  115.875288] CR2: 0000000000000010
[  115.875641] ---[ end trace 0000000000000000 ]---
[  115.876135] RIP: 0010:propagate_one.part.0+0x7f/0x1a0
[  115.876551] Code: 75 eb 4c 8b 05 c2 25 37 02 4c 89 ca 48 8b 4a 10
49 39 d0 74 1e 48 3b 81 e0 00 00 00 74 26 48 8b 92 e0 00 00 00 be 01
00 00 00 <48> 8b 4a 10 49 39 d0 75 e2 40 84 f6 74 38 4c 89 05 84 25 37
02 4d
[  115.878086] RSP: 0018:ffffb8d5443d7d50 EFLAGS: 00010282
[  115.878511] RAX: ffff8e4d87c41c80 RBX: ffff8e4d88ded780 RCX: ffff8e4da4333a00
[  115.879128] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8e4d88ded780
[  115.879715] RBP: ffff8e4d88ded780 R08: ffff8e4da4338000 R09: ffff8e4da43388c0
[  115.880359] R10: 0000000000000002 R11: ffffb8d540158000 R12: ffffb8d5443d7da8
[  115.880962] R13: ffff8e4d88ded780 R14: 0000000000000000 R15: 0000000000000000
[  115.881548] FS:  00007f92c90c9800(0000) GS:ffff8e4dfdc00000(0000)
knlGS:0000000000000000
[  115.882234] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  115.882713] CR2: 0000000000000010 CR3: 0000000022f4c002 CR4: 00000000000706f0
[  115.883314] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  115.883966] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

Fixes: f2ebb3a921c1 ("smarter propagate_mnt()")
Fixes: 5ec0811d3037 ("propogate_mnt: Handle the first propogated copy being a slave")
Cc: <stable@vger.kernel.org>
Reported-by: Ditang Chen <ditang.c@gmail.com>
Signed-off-by: Seth Forshee (Digital Ocean) <sforshee@kernel.org>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agoALSA: hda/hdmi: Static PCM mapping again with AMD HDMI codecs
Takashi Iwai [Wed, 28 Dec 2022 12:57:14 +0000 (13:57 +0100)]
ALSA: hda/hdmi: Static PCM mapping again with AMD HDMI codecs

commit 090ddad4c7a9fefd647c762093a555870a19c8b2 upstream.

The recent code refactoring for HD-audio HDMI codec driver caused a
regression on AMD/ATI HDMI codecs; namely, PulseAudioand pipewire
don't recognize HDMI outputs any longer while the direct output via
ALSA raw access still works.

The problem turned out that, after the code refactoring, the driver
assumes only the dynamic PCM assignment, and when a PCM stream that
still isn't assigned to any pin gets opened, the driver tries to
assign any free converter to the PCM stream.  This behavior is OK for
Intel and other codecs, as they have arbitrary connections between
pins and converters.  OTOH, on AMD chips that have a 1:1 mapping
between pins and converters, this may end up with blocking the open of
the next PCM stream for the pin that is tied with the formerly taken
converter.

Also, with the code refactoring, more PCM streams are exposed than
necessary as we assume all converters can be used, while this isn't
true for AMD case.  This may change the PCM stream assignment and
confuse users as well.

This patch fixes those problems by:

- Introducing a flag spec->static_pcm_mapping, and if it's set, the
  driver applies the static mapping between pins and converters at the
  probe time
- Limiting the number of PCM streams per pins, too; this avoids the
  superfluous PCM streams

Fixes: ef6f5494faf6 ("ALSA: hda/hdmi: Use only dynamic PCM device allocation")
Cc: <stable@vger.kernel.org>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216836
Co-developed-by: Jaroslav Kysela <perex@perex.cz>
Signed-off-by: Jaroslav Kysela <perex@perex.cz>
Link: https://lore.kernel.org/r/20221228125714.16329-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agoALSA: line6: fix stack overflow in line6_midi_transmit
Artem Egorkine [Sun, 25 Dec 2022 10:57:28 +0000 (12:57 +0200)]
ALSA: line6: fix stack overflow in line6_midi_transmit

commit b8800d324abb50160560c636bfafe2c81001b66c upstream.

Correctly calculate available space including the size of the chunk
buffer. This fixes a buffer overflow when multiple MIDI sysex
messages are sent to a PODxt device.

Signed-off-by: Artem Egorkine <arteme@gmail.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20221225105728.1153989-2-arteme@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agoALSA: line6: correct midi status byte when receiving data from podxt
Artem Egorkine [Sun, 25 Dec 2022 10:57:27 +0000 (12:57 +0200)]
ALSA: line6: correct midi status byte when receiving data from podxt

commit 8508fa2e7472f673edbeedf1b1d2b7a6bb898ecc upstream.

A PODxt device sends 0xb2, 0xc2 or 0xf2 as a status byte for MIDI
messages over USB that should otherwise have a 0xb0, 0xc0 or 0xf0
status byte. This is usually corrected by the driver on other OSes.

This fixes MIDI sysex messages sent by PODxt.

[ tiwai: fixed white spaces ]

Signed-off-by: Artem Egorkine <arteme@gmail.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20221225105728.1153989-1-arteme@gmail.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agoovl: update ->f_iocb_flags when ovl_change_flags() modifies ->f_flags
Al Viro [Thu, 24 Nov 2022 17:03:11 +0000 (17:03 +0000)]
ovl: update ->f_iocb_flags when ovl_change_flags() modifies ->f_flags

commit 456b59e757b0c558df550764a4fd5ae6877e93f8 upstream.

ovl_change_flags() is an open-coded variant of fs/fcntl.c:setfl() and it
got missed by commit 164f4064ca81 ("keep iocb_flags() result cached in
struct file"); the same change applies there.

Reported-by: Pierre Labastie <pierre.labastie@neuf.fr>
Fixes: 164f4064ca81 ("keep iocb_flags() result cached in struct file")
Cc: <stable@vger.kernel.org> # v6.0
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216738
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agoovl: Use ovl mounter's fsuid and fsgid in ovl_link()
Zhang Tianci [Thu, 1 Sep 2022 08:29:29 +0000 (16:29 +0800)]
ovl: Use ovl mounter's fsuid and fsgid in ovl_link()

commit 5b0db51215e895a361bc63132caa7cca36a53d6a upstream.

There is a wrong case of link() on overlay:
  $ mkdir /lower /fuse /merge
  $ mount -t fuse /fuse
  $ mkdir /fuse/upper /fuse/work
  $ mount -t overlay /merge -o lowerdir=/lower,upperdir=/fuse/upper,\
    workdir=work
  $ touch /merge/file
  $ chown bin.bin /merge/file // the file's caller becomes "bin"
  $ ln /merge/file /merge/lnkfile

Then we will get an error(EACCES) because fuse daemon checks the link()'s
caller is "bin", it denied this request.

In the changing history of ovl_link(), there are two key commits:

The first is commit bb0d2b8ad296 ("ovl: fix sgid on directory") which
overrides the cred's fsuid/fsgid using the new inode. The new inode's
owner is initialized by inode_init_owner(), and inode->fsuid is
assigned to the current user. So the override fsuid becomes the
current user. We know link() is actually modifying the directory, so
the caller must have the MAY_WRITE permission on the directory. The
current caller may should have this permission. This is acceptable
to use the caller's fsuid.

The second is commit 51f7e52dc943 ("ovl: share inode for hard link")
which removed the inode creation in ovl_link(). This commit move
inode_init_owner() into ovl_create_object(), so the ovl_link() just
give the old inode to ovl_create_or_link(). Then the override fsuid
becomes the old inode's fsuid, neither the caller nor the overlay's
mounter! So this is incorrect.

Fix this bug by using ovl mounter's fsuid/fsgid to do underlying
fs's link().

Link: https://lore.kernel.org/all/20220817102952.xnvesg3a7rbv576x@wittgenstein/T
Link: https://lore.kernel.org/lkml/20220825130552.29587-1-zhangtianci.1997@bytedance.com/t
Signed-off-by: Zhang Tianci <zhangtianci.1997@bytedance.com>
Signed-off-by: Jiachen Zhang <zhangjiachen.jaycee@bytedance.com>
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Fixes: 51f7e52dc943 ("ovl: share inode for hard link")
Cc: <stable@vger.kernel.org> # v4.8
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agobinfmt: Fix error return code in load_elf_fdpic_binary()
Wang Yufen [Fri, 2 Dec 2022 01:41:01 +0000 (09:41 +0800)]
binfmt: Fix error return code in load_elf_fdpic_binary()

commit e7f703ff2507f4e9f496da96cd4b78fd3026120c upstream.

Fix to return a negative error code from create_elf_fdpic_tables()
instead of 0.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Signed-off-by: Wang Yufen <wangyufen@huawei.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/1669945261-30271-1-git-send-email-wangyufen@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agoACPI: x86: s2idle: Stop using AMD specific codepath for Rembrandt+
Mario Limonciello [Thu, 15 Dec 2022 19:16:16 +0000 (13:16 -0600)]
ACPI: x86: s2idle: Stop using AMD specific codepath for Rembrandt+

commit e555c85792bd5f9828a2fd2ca9761f70efb1c77b upstream.

After we introduced a module parameter and quirk infrastructure for
picking the Microsoft GUID over the SOC vendor GUID we discovered
that lots and lots of systems are getting this wrong.

The table continues to grow, and is becoming unwieldy.

We don't really have any benefit to forcing vendors to populate the
AMD GUID. This is just extra work, and more and more vendors seem
to mess it up.  As the Microsoft GUID is used by Windows as well,
it's very likely that it won't be messed up like this.

So drop all the quirks forcing it and the Rembrandt behavior. This
means that Cezanne or later effectively only run the Microsoft GUID
codepath with the exception of HP Elitebook 8*5 G9.

Fixes: fd894f05cf30 ("ACPI: x86: s2idle: If a new AMD _HID is missing assume Rembrandt")
Cc: stable@vger.kernel.org # 6.1
Reported-by: Benjamin Cheng <ben@bcheng.me>
Reported-by: bilkow@tutanota.com
Reported-by: Paul <paul@zogpog.com>
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2292
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216768
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Philipp Zabel <philipp.zabel@gmail.com>
Tested-by: Philipp Zabel <philipp.zabel@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agoACPI: x86: s2idle: Force AMD GUID/_REV 2 on HP Elitebook 865
Mario Limonciello [Thu, 15 Dec 2022 19:16:15 +0000 (13:16 -0600)]
ACPI: x86: s2idle: Force AMD GUID/_REV 2 on HP Elitebook 865

commit 3ea45390e9c0d35805ef8357ace55594fd4233d0 upstream.

HP Elitebook 865 supports both the AMD GUID w/ _REV 2 and Microsoft
GUID with _REV 0. Both have very similar code but the AMD GUID
has a special workaround that is specific to a problem with
spurious wakeups on systems with Qualcomm WLAN.

This is believed to be a bug in the Qualcomm WLAN F/W (it doesn't
affect any other WLAN H/W). If this WLAN firmware is fixed this
quirk can be dropped.

Cc: stable@vger.kernel.org # 6.1
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agohfsplus: fix bug causing custom uid and gid being unable to be assigned with mount
Aditya Garg [Wed, 7 Dec 2022 03:05:40 +0000 (03:05 +0000)]
hfsplus: fix bug causing custom uid and gid being unable to be assigned with mount

commit 9f2b5debc07073e6dfdd774e3594d0224b991927 upstream.

Despite specifying UID and GID in mount command, the specified UID and GID
were not being assigned. This patch fixes this issue.

Link: https://lkml.kernel.org/r/C0264BF5-059C-45CF-B8DA-3A3BD2C803A2@live.com
Signed-off-by: Aditya Garg <gargaditya08@live.com>
Reviewed-by: Viacheslav Dubeyko <slava@dubeyko.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agopstore/zone: Use GFP_ATOMIC to allocate zone buffer
Qiujun Huang [Sun, 4 Sep 2022 15:17:13 +0000 (23:17 +0800)]
pstore/zone: Use GFP_ATOMIC to allocate zone buffer

commit 99b3b837855b987563bcfb397cf9ddd88262814b upstream.

There is a case found when triggering a panic_on_oom, pstore fails to dump
kmsg. Because psz_kmsg_write_record can't get the new buffer.

Handle this by using GFP_ATOMIC to allocate a buffer at lower watermark.

Signed-off-by: Qiujun Huang <hqjagain@gmail.com>
Fixes: 335426c6dcdd ("pstore/zone: Provide way to skip "broken" zone for MTD devices")
Cc: WeiXiong Liao <gmpy.liaowx@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/CAJRQjofRCF7wjrYmw3D7zd5QZnwHQq+F8U-mJDJ6NZ4bddYdLA@mail.gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agopstore: Properly assign mem_type property
Luca Stefani [Thu, 22 Dec 2022 13:10:49 +0000 (14:10 +0100)]
pstore: Properly assign mem_type property

commit beca3e311a49cd3c55a056096531737d7afa4361 upstream.

If mem-type is specified in the device tree
it would end up overriding the record_size
field instead of populating mem_type.

As record_size is currently parsed after the
improper assignment with default size 0 it
continued to work as expected regardless of the
value found in the device tree.

Simply changing the target field of the struct
is enough to get mem-type working as expected.

Fixes: 9d843e8fafc7 ("pstore: Add mem_type property DT parsing support")
Cc: stable@vger.kernel.org
Signed-off-by: Luca Stefani <luca@osomprivacy.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20221222131049.286288-1-luca@osomprivacy.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agokmsan: include linux/vmalloc.h
Arnd Bergmann [Thu, 15 Dec 2022 16:30:17 +0000 (17:30 +0100)]
kmsan: include linux/vmalloc.h

commit aaa746ad8b30f38ef89a301faf339ef1c19cf33a upstream.

This is needed for the vmap/vunmap declarations:

mm/kmsan/kmsan_test.c:316:9: error: implicit declaration of function 'vmap' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
        vbuf = vmap(pages, npages, VM_MAP, PAGE_KERNEL);
               ^
mm/kmsan/kmsan_test.c:316:29: error: use of undeclared identifier 'VM_MAP'
        vbuf = vmap(pages, npages, VM_MAP, PAGE_KERNEL);
                                   ^
mm/kmsan/kmsan_test.c:322:3: error: implicit declaration of function 'vunmap' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
                vunmap(vbuf);
                ^

Link: https://lkml.kernel.org/r/20221215163046.4079767-1-arnd@kernel.org
Fixes: 8ed691b02ade ("kmsan: add tests for KMSAN")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agokmsan: export kmsan_handle_urb
Arnd Bergmann [Thu, 15 Dec 2022 16:26:57 +0000 (17:26 +0100)]
kmsan: export kmsan_handle_urb

commit 7ba594d700998bafa96a75360d2e060aa39156d2 upstream.

USB support can be in a loadable module, and this causes a link failure
with KMSAN:

ERROR: modpost: "kmsan_handle_urb" [drivers/usb/core/usbcore.ko] undefined!

Export the symbol so it can be used by this module.

Link: https://lkml.kernel.org/r/20221215162710.3802378-1-arnd@kernel.org
Fixes: 553a80188a5d ("kmsan: handle memory sent to/from USB")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agomm/mempolicy: fix memory leak in set_mempolicy_home_node system call
Mathieu Desnoyers [Thu, 15 Dec 2022 19:46:21 +0000 (14:46 -0500)]
mm/mempolicy: fix memory leak in set_mempolicy_home_node system call

commit 38ce7c9bdfc228c14d7621ba36d3eebedd9d4f76 upstream.

When encountering any vma in the range with policy other than MPOL_BIND or
MPOL_PREFERRED_MANY, an error is returned without issuing a mpol_put on
the policy just allocated with mpol_dup().

This allows arbitrary users to leak kernel memory.

Link: https://lkml.kernel.org/r/20221215194621.202816-1-mathieu.desnoyers@efficios.com
Fixes: c6018b4b2549 ("mm/mempolicy: add set_mempolicy_home_node syscall")
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Feng Tang <feng.tang@intel.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: <stable@vger.kernel.org> [5.17+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agomm, mremap: fix mremap() expanding vma with addr inside vma
Vlastimil Babka [Fri, 16 Dec 2022 16:32:27 +0000 (17:32 +0100)]
mm, mremap: fix mremap() expanding vma with addr inside vma

commit 6f12be792fde994ed934168f93c2a0d2a0cf0bc5 upstream.

Since 6.1 we have noticed random rpm install failures that were tracked to
mremap() returning -ENOMEM and to commit ca3d76b0aa80 ("mm: add merging
after mremap resize").

The problem occurs when mremap() expands a VMA in place, but using an
starting address that's not vma->vm_start, but somewhere in the middle.
The extension_pgoff calculation introduced by the commit is wrong in that
case, so vma_merge() fails due to pgoffs not being compatible.  Fix the
calculation.

By the way it seems that the situations, where rpm now expands a vma from
the middle, were made possible also due to that commit, thanks to the
improved vma merging.  Yet it should work just fine, except for the buggy
calculation.

Link: https://lkml.kernel.org/r/20221216163227.24648-1-vbabka@suse.cz
Reported-by: Jiri Slaby <jirislaby@kernel.org>
Link: https://bugzilla.suse.com/show_bug.cgi?id=1206359
Fixes: ca3d76b0aa80 ("mm: add merging after mremap resize")
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Jakub Matěna <matenajakub@gmail.com>
Cc: "Kirill A . Shutemov" <kirill@shutemov.name>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agortmutex: Add acquire semantics for rtmutex lock acquisition slow path
Mel Gorman [Fri, 2 Dec 2022 10:02:23 +0000 (10:02 +0000)]
rtmutex: Add acquire semantics for rtmutex lock acquisition slow path

commit 1c0908d8e441631f5b8ba433523cf39339ee2ba0 upstream.

Jan Kara reported the following bug triggering on 6.0.5-rt14 running dbench
on XFS on arm64.

 kernel BUG at fs/inode.c:625!
 Internal error: Oops - BUG: 0 [#1] PREEMPT_RT SMP
 CPU: 11 PID: 6611 Comm: dbench Tainted: G            E   6.0.0-rt14-rt+ #1
 pc : clear_inode+0xa0/0xc0
 lr : clear_inode+0x38/0xc0
 Call trace:
  clear_inode+0xa0/0xc0
  evict+0x160/0x180
  iput+0x154/0x240
  do_unlinkat+0x184/0x300
  __arm64_sys_unlinkat+0x48/0xc0
  el0_svc_common.constprop.4+0xe4/0x2c0
  do_el0_svc+0xac/0x100
  el0_svc+0x78/0x200
  el0t_64_sync_handler+0x9c/0xc0
  el0t_64_sync+0x19c/0x1a0

It also affects 6.1-rc7-rt5 and affects a preempt-rt fork of 5.14 so this
is likely a bug that existed forever and only became visible when ARM
support was added to preempt-rt. The same problem does not occur on x86-64
and he also reported that converting sb->s_inode_wblist_lock to
raw_spinlock_t makes the problem disappear indicating that the RT spinlock
variant is the problem.

Which in turn means that RT mutexes on ARM64 and any other weakly ordered
architecture are affected by this independent of RT.

Will Deacon observed:

  "I'd be more inclined to be suspicious of the slowpath tbh, as we need to
   make sure that we have acquire semantics on all paths where the lock can
   be taken. Looking at the rtmutex code, this really isn't obvious to me
   -- for example, try_to_take_rt_mutex() appears to be able to return via
   the 'takeit' label without acquire semantics and it looks like we might
   be relying on the caller's subsequent _unlock_ of the wait_lock for
   ordering, but that will give us release semantics which aren't correct."

Sebastian Andrzej Siewior prototyped a fix that does work based on that
comment but it was a little bit overkill and added some fences that should
not be necessary.

The lock owner is updated with an IRQ-safe raw spinlock held, but the
spin_unlock does not provide acquire semantics which are needed when
acquiring a mutex.

Adds the necessary acquire semantics for lock owner updates in the slow path
acquisition and the waiter bit logic.

It successfully completed 10 iterations of the dbench workload while the
vanilla kernel fails on the first iteration.

[ bigeasy@linutronix.de: Initial prototype fix ]

Fixes: 700318d1d7b38 ("locking/rtmutex: Use acquire/release semantics")
Fixes: 23f78d4a03c5 ("[PATCH] pi-futex: rt mutex core")
Reported-by: Jan Kara <jack@suse.cz>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20221202100223.6mevpbl7i6x5udfd@techsingularity.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agofutex: Fix futex_waitv() hrtimer debug object leak on kcalloc error
Mathieu Desnoyers [Wed, 14 Dec 2022 22:20:08 +0000 (17:20 -0500)]
futex: Fix futex_waitv() hrtimer debug object leak on kcalloc error

commit 94cd8fa09f5f1ebdd4e90964b08b7f2cc4b36c43 upstream.

In a scenario where kcalloc() fails to allocate memory, the futex_waitv
system call immediately returns -ENOMEM without invoking
destroy_hrtimer_on_stack(). When CONFIG_DEBUG_OBJECTS_TIMERS=y, this
results in leaking a timer debug object.

Fixes: bf69bad38cf6 ("futex: Implement sys_futex_waitv()")
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Cc: stable@vger.kernel.org
Cc: stable@vger.kernel.org # v5.16+
Link: https://lore.kernel.org/r/20221214222008.200393-1-mathieu.desnoyers@efficios.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
18 months agoHID: plantronics: Additional PIDs for double volume key presses quirk
Terry Junge [Thu, 8 Dec 2022 23:05:06 +0000 (15:05 -0800)]
HID: plantronics: Additional PIDs for double volume key presses quirk

[ Upstream commit 3d57f36c89d8ba32b2c312f397a37fd1a2dc7cfc ]

I no longer work for Plantronics (aka Poly, aka HP) and do not have
access to the headsets in order to test. However, as noted by Maxim,
the other 32xx models that share the same base code set as the 3220
would need the same quirk. This patch adds the PIDs for the rest of
the Blackwire 32XX product family that require the quirk.

Plantronics Blackwire 3210 Series (047f:c055)
Plantronics Blackwire 3215 Series (047f:c057)
Plantronics Blackwire 3225 Series (047f:c058)

Quote from previous patch by Maxim Mikityanskiy
Plantronics Blackwire 3220 Series (047f:c056) sends HID reports twice
for each volume key press. This patch adds a quirk to hid-plantronics
for this product ID, which will ignore the second volume key press if
it happens within 5 ms from the last one that was handled.

The patch was tested on the mentioned model only, it shouldn't affect
other models, however, this quirk might be needed for them too.
Auto-repeat (when a key is held pressed) is not affected, because the
rate is about 3 times per second, which is far less frequent than once
in 5 ms.
End quote

Signed-off-by: Terry Junge <linuxhid@cosmicgizmosystems.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
18 months agoHID: multitouch: fix Asus ExpertBook P2 P2451FA trackpoint
José Expósito [Mon, 28 Nov 2022 16:57:05 +0000 (17:57 +0100)]
HID: multitouch: fix Asus ExpertBook P2 P2451FA trackpoint

[ Upstream commit 4eab1c2fe06c98a4dff258dd64800b6986c101e9 ]

The HID descriptor of this device contains two mouse collections, one
for mouse emulation and the other for the trackpoint.

Both collections get merged and, because the first one defines X and Y,
the movemenent events reported by the trackpoint collection are
ignored.

Set the MT_CLS_WIN_8_FORCE_MULTI_INPUT class for this device to be able
to receive its reports.

This fix is similar to/based on commit 40d5bb87377a ("HID: multitouch:
enable multi-input as a quirk for some devices").

Link: https://gitlab.freedesktop.org/libinput/libinput/-/issues/825
Reported-by: Akito <the@akito.ooo>
Tested-by: Akito <the@akito.ooo>
Signed-off-by: José Expósito <jose.exposito89@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
18 months agokprobes: kretprobe events missing on 2-core KVM guest
wuqiang [Thu, 10 Nov 2022 08:15:02 +0000 (16:15 +0800)]
kprobes: kretprobe events missing on 2-core KVM guest

[ Upstream commit 3b7ddab8a19aefc768f345fd3782af35b4a68d9b ]

Default value of maxactive is set as num_possible_cpus() for nonpreemptable
systems. For a 2-core system, only 2 kretprobe instances would be allocated
in default, then these 2 instances for execve kretprobe are very likely to
be used up with a pipelined command.

Here's the testcase: a shell script was added to crontab, and the content
of the script is:

  #!/bin/sh
  do_something_magic `tr -dc a-z < /dev/urandom | head -c 10`

cron will trigger a series of program executions (4 times every hour). Then
events loss would be noticed normally after 3-4 hours of testings.

The issue is caused by a burst of series of execve requests. The best number
of kretprobe instances could be different case by case, and should be user's
duty to determine, but num_possible_cpus() as the default value is inadequate
especially for systems with small number of cpus.

This patch enables the logic for preemption as default, thus increases the
minimum of maxactive to 10 for nonpreemptable systems.

Link: https://lore.kernel.org/all/20221110081502.492289-1-wuqiang.matt@bytedance.com/
Signed-off-by: wuqiang <wuqiang.matt@bytedance.com>
Reviewed-by: Solar Designer <solar@openwall.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
18 months agoNFSD: fix use-after-free in __nfs42_ssc_open()
Dai Ngo [Mon, 12 Dec 2022 22:50:11 +0000 (14:50 -0800)]
NFSD: fix use-after-free in __nfs42_ssc_open()

[ Upstream commit 75333d48f92256a0dec91dbf07835e804fc411c0 ]

Problem caused by source's vfsmount being unmounted but remains
on the delayed unmount list. This happens when nfs42_ssc_open()
return errors.

Fixed by removing nfsd4_interssc_connect(), leave the vfsmount
for the laundromat to unmount when idle time expires.

We don't need to call nfs_do_sb_deactive when nfs42_ssc_open
return errors since the file was not opened so nfs_server->active
was not incremented. Same as in nfsd4_copy, if we fail to
launch nfsd4_do_async_copy thread then there's no need to
call nfs_do_sb_deactive

Reported-by: Xingyuan Mo <hdthky0@gmail.com>
Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Tested-by: Xingyuan Mo <hdthky0@gmail.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
18 months agortc: msc313: Fix function prototype mismatch in msc313_rtc_probe()
Kees Cook [Fri, 2 Dec 2022 18:45:30 +0000 (10:45 -0800)]
rtc: msc313: Fix function prototype mismatch in msc313_rtc_probe()

[ Upstream commit 21b8a1dd56a163825e5749b303858fb902ebf198 ]

With clang's kernel control flow integrity (kCFI, CONFIG_CFI_CLANG),
indirect call targets are validated against the expected function
pointer prototype to make sure the call target is valid to help mitigate
ROP attacks. If they are not identical, there is a failure at run time,
which manifests as either a kernel panic or thread getting killed.

msc313_rtc_probe() was passing clk_disable_unprepare() directly, which
did not have matching prototypes for devm_add_action_or_reset()'s
callback argument. Refactor to use devm_clk_get_enabled() instead.

This was found as a result of Clang's new -Wcast-function-type-strict
flag, which is more sensitive than the simpler -Wcast-function-type,
which only checks for type width mismatches.

Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/lkml/202211041527.HD8TLSE1-lkp@intel.com
Suggested-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Daniel Palmer <daniel@thingy.jp>
Cc: Romain Perier <romain.perier@gmail.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-rtc@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Daniel Palmer <daniel@thingy.jp>
Tested-by: Daniel Palmer <daniel@thingy.jp>
Link: https://lore.kernel.org/r/20221202184525.gonna.423-kees@kernel.org
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
18 months agopowerpc/rtas: avoid scheduling in rtas_os_term()
Nathan Lynch [Fri, 18 Nov 2022 15:07:42 +0000 (09:07 -0600)]
powerpc/rtas: avoid scheduling in rtas_os_term()

[ Upstream commit 6c606e57eecc37d6b36d732b1ff7e55b7dc32dd4 ]

It's unsafe to use rtas_busy_delay() to handle a busy status from
the ibm,os-term RTAS function in rtas_os_term():

Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
BUG: sleeping function called from invalid context at arch/powerpc/kernel/rtas.c:618
in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 1, name: swapper/0
preempt_count: 2, expected: 0
CPU: 7 PID: 1 Comm: swapper/0 Tainted: G      D            6.0.0-rc5-02182-gf8553a572277-dirty #9
Call Trace:
[c000000007b8f000] [c000000001337110] dump_stack_lvl+0xb4/0x110 (unreliable)
[c000000007b8f040] [c0000000002440e4] __might_resched+0x394/0x3c0
[c000000007b8f0e0] [c00000000004f680] rtas_busy_delay+0x120/0x1b0
[c000000007b8f100] [c000000000052d04] rtas_os_term+0xb8/0xf4
[c000000007b8f180] [c0000000001150fc] pseries_panic+0x50/0x68
[c000000007b8f1f0] [c000000000036354] ppc_panic_platform_handler+0x34/0x50
[c000000007b8f210] [c0000000002303c4] notifier_call_chain+0xd4/0x1c0
[c000000007b8f2b0] [c0000000002306cc] atomic_notifier_call_chain+0xac/0x1c0
[c000000007b8f2f0] [c0000000001d62b8] panic+0x228/0x4d0
[c000000007b8f390] [c0000000001e573c] do_exit+0x140c/0x1420
[c000000007b8f480] [c0000000001e586c] make_task_dead+0xdc/0x200

Use rtas_busy_delay_time() instead, which signals without side effects
whether to attempt the ibm,os-term RTAS call again.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221118150751.469393-5-nathanl@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
18 months agopowerpc/rtas: avoid device tree lookups in rtas_os_term()
Nathan Lynch [Fri, 18 Nov 2022 15:07:41 +0000 (09:07 -0600)]
powerpc/rtas: avoid device tree lookups in rtas_os_term()

[ Upstream commit ed2213bfb192ab51f09f12e9b49b5d482c6493f3 ]

rtas_os_term() is called during panic. Its behavior depends on a couple
of conditions in the /rtas node of the device tree, the traversal of
which entails locking and local IRQ state changes. If the kernel panics
while devtree_lock is held, rtas_os_term() as currently written could
hang.

Instead of discovering the relevant characteristics at panic time,
cache them in file-static variables at boot. Note the lookup for
"ibm,extended-os-term" is converted to of_property_read_bool() since it
is a boolean property, not an RTAS function token.

Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>
[mpe: Incorporate suggested change from Nick]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221118150751.469393-4-nathanl@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
18 months agoiommu/mediatek: Fix crash on isr after kexec()
Ricardo Ribalda [Mon, 28 Nov 2022 22:16:48 +0000 (23:16 +0100)]
iommu/mediatek: Fix crash on isr after kexec()

[ Upstream commit 00ef8885a945c37551547d8ac8361cacd20c4e42 ]

If the system is rebooted via isr(), the IRQ handler might
be triggered before the domain is initialized. Resulting on
an invalid memory access error.

Fix:
[    0.500930] Unable to handle kernel read from unreadable memory at virtual address 0000000000000070
[    0.501166] Call trace:
[    0.501174]  report_iommu_fault+0x28/0xfc
[    0.501180]  mtk_iommu_isr+0x10c/0x1c0

Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/20221125-mtk-iommu-v2-0-e168dff7d43e@chromium.org
[ joro: Fixed spelling in commit message ]
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
18 months agoobjtool: Fix SEGFAULT
Christophe Leroy [Mon, 14 Nov 2022 17:57:46 +0000 (23:27 +0530)]
objtool: Fix SEGFAULT

[ Upstream commit efb11fdb3e1a9f694fa12b70b21e69e55ec59c36 ]

find_insn() will return NULL in case of failure. Check insn in order
to avoid a kernel Oops for NULL pointer dereference.

Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Josh Poimboeuf <jpoimboe@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20221114175754.1131267-9-sv@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
18 months agofs/ntfs3: Fix slab-out-of-bounds in r_page
Yin Xiujiang [Mon, 6 Dec 2021 02:40:45 +0000 (10:40 +0800)]
fs/ntfs3: Fix slab-out-of-bounds in r_page

[ Upstream commit ecfbd57cf9c5ca225184ae266ce44ae473792132 ]

When PAGE_SIZE is 64K, if read_log_page is called by log_read_rst for
the first time, the size of *buffer would be equal to
DefaultLogPageSize(4K).But for *buffer operations like memcpy,
if the memory area size(n) which being assigned to buffer is larger
than 4K (log->page_size(64K) or bytes(64K-page_off)), it will cause
an out of boundary error.
 Call trace:
  [...]
  kasan_report+0x44/0x130
  check_memory_region+0xf8/0x1a0
  memcpy+0xc8/0x100
  ntfs_read_run_nb+0x20c/0x460
  read_log_page+0xd0/0x1f4
  log_read_rst+0x110/0x75c
  log_replay+0x1e8/0x4aa0
  ntfs_loadlog_and_replay+0x290/0x2d0
  ntfs_fill_super+0x508/0xec0
  get_tree_bdev+0x1fc/0x34c
  [...]

Fix this by setting variable r_page to NULL in log_read_rst.

Signed-off-by: Yin Xiujiang <yinxiujiang@kylinos.cn>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
18 months agofs/ntfs3: Delete duplicate condition in ntfs_read_mft()
Dan Carpenter [Sat, 15 Oct 2022 08:28:55 +0000 (11:28 +0300)]
fs/ntfs3: Delete duplicate condition in ntfs_read_mft()

[ Upstream commit 658015167a8432b88f5d032e9d85d8fd50e5bf2c ]

There were two patches which addressed the same bug and added the same
condition:

commit 6db620863f85 ("fs/ntfs3: Validate data run offset")
commit 887bfc546097 ("fs/ntfs3: Fix slab-out-of-bounds read in run_unpack")

Delete one condition.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
18 months agofs/ntfs3: Use __GFP_NOWARN allocation at ntfs_fill_super()
Tetsuo Handa [Sun, 2 Oct 2022 14:54:11 +0000 (23:54 +0900)]
fs/ntfs3: Use __GFP_NOWARN allocation at ntfs_fill_super()

[ Upstream commit 59bfd7a483da36bd202532a3d9ea1f14f3bf3aaf ]

syzbot is reporting too large allocation at ntfs_fill_super() [1], for a
crafted filesystem can contain bogus inode->i_size. Add __GFP_NOWARN in
order to avoid too large allocation warning, than exhausting memory by
using kvmalloc().

Link: https://syzkaller.appspot.com/bug?extid=33f3faaa0c08744f7d40
Reported-by: syzot <syzbot+33f3faaa0c08744f7d40@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
18 months agofs/ntfs3: Use __GFP_NOWARN allocation at wnd_init()
Tetsuo Handa [Sun, 2 Oct 2022 14:39:15 +0000 (23:39 +0900)]
fs/ntfs3: Use __GFP_NOWARN allocation at wnd_init()

[ Upstream commit 0d0f659bf713662fabed973f9996b8f23c59ca51 ]

syzbot is reporting too large allocation at wnd_init() [1], for a crafted
filesystem can become wnd->nwnd close to UINT_MAX. Add __GFP_NOWARN in
order to avoid too large allocation warning, than exhausting memory by
using kvcalloc().

Link: https://syzkaller.appspot.com/bug?extid=fa4648a5446460b7b963
Reported-by: syzot <syzbot+fa4648a5446460b7b963@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
18 months agofs/ntfs3: Validate index root when initialize NTFS security
Edward Lo [Fri, 30 Sep 2022 01:58:40 +0000 (09:58 +0800)]
fs/ntfs3: Validate index root when initialize NTFS security

[ Upstream commit bfcdbae0523bd95eb75a739ffb6221a37109881e ]

This enhances the sanity check for $SDH and $SII while initializing NTFS
security, guarantees these index root are legit.

[  162.459513] BUG: KASAN: use-after-free in hdr_find_e.isra.0+0x10c/0x320
[  162.460176] Read of size 2 at addr ffff8880037bca99 by task mount/243
[  162.460851]
[  162.461252] CPU: 0 PID: 243 Comm: mount Not tainted 6.0.0-rc7 #42
[  162.461744] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[  162.462609] Call Trace:
[  162.462954]  <TASK>
[  162.463276]  dump_stack_lvl+0x49/0x63
[  162.463822]  print_report.cold+0xf5/0x689
[  162.464608]  ? unwind_get_return_address+0x3a/0x60
[  162.465766]  ? hdr_find_e.isra.0+0x10c/0x320
[  162.466975]  kasan_report+0xa7/0x130
[  162.467506]  ? _raw_spin_lock_irq+0xc0/0xf0
[  162.467998]  ? hdr_find_e.isra.0+0x10c/0x320
[  162.468536]  __asan_load2+0x68/0x90
[  162.468923]  hdr_find_e.isra.0+0x10c/0x320
[  162.469282]  ? cmp_uints+0xe0/0xe0
[  162.469557]  ? cmp_sdh+0x90/0x90
[  162.469864]  ? ni_find_attr+0x214/0x300
[  162.470217]  ? ni_load_mi+0x80/0x80
[  162.470479]  ? entry_SYSCALL_64_after_hwframe+0x63/0xcd
[  162.470931]  ? ntfs_bread_run+0x190/0x190
[  162.471307]  ? indx_get_root+0xe4/0x190
[  162.471556]  ? indx_get_root+0x140/0x190
[  162.471833]  ? indx_init+0x1e0/0x1e0
[  162.472069]  ? fnd_clear+0x115/0x140
[  162.472363]  ? _raw_spin_lock_irqsave+0x100/0x100
[  162.472731]  indx_find+0x184/0x470
[  162.473461]  ? sysvec_apic_timer_interrupt+0x57/0xc0
[  162.474429]  ? indx_find_buffer+0x2d0/0x2d0
[  162.474704]  ? do_syscall_64+0x3b/0x90
[  162.474962]  dir_search_u+0x196/0x2f0
[  162.475381]  ? ntfs_nls_to_utf16+0x450/0x450
[  162.475661]  ? ntfs_security_init+0x3d6/0x440
[  162.475906]  ? is_sd_valid+0x180/0x180
[  162.476191]  ntfs_extend_init+0x13f/0x2c0
[  162.476496]  ? ntfs_fix_post_read+0x130/0x130
[  162.476861]  ? iput.part.0+0x286/0x320
[  162.477325]  ntfs_fill_super+0x11e0/0x1b50
[  162.477709]  ? put_ntfs+0x1d0/0x1d0
[  162.477970]  ? vsprintf+0x20/0x20
[  162.478258]  ? set_blocksize+0x95/0x150
[  162.478538]  get_tree_bdev+0x232/0x370
[  162.478789]  ? put_ntfs+0x1d0/0x1d0
[  162.479038]  ntfs_fs_get_tree+0x15/0x20
[  162.479374]  vfs_get_tree+0x4c/0x130
[  162.479729]  path_mount+0x654/0xfe0
[  162.480124]  ? putname+0x80/0xa0
[  162.480484]  ? finish_automount+0x2e0/0x2e0
[  162.480894]  ? putname+0x80/0xa0
[  162.481467]  ? kmem_cache_free+0x1c4/0x440
[  162.482280]  ? putname+0x80/0xa0
[  162.482714]  do_mount+0xd6/0xf0
[  162.483264]  ? path_mount+0xfe0/0xfe0
[  162.484782]  ? __kasan_check_write+0x14/0x20
[  162.485593]  __x64_sys_mount+0xca/0x110
[  162.486024]  do_syscall_64+0x3b/0x90
[  162.486543]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[  162.487141] RIP: 0033:0x7f9d374e948a
[  162.488324] Code: 48 8b 0d 11 fa 2a 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 008
[  162.489728] RSP: 002b:00007ffe30e73d18 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5
[  162.490971] RAX: ffffffffffffffda RBX: 0000561cdb43a060 RCX: 00007f9d374e948a
[  162.491669] RDX: 0000561cdb43a260 RSI: 0000561cdb43a2e0 RDI: 0000561cdb442af0
[  162.492050] RBP: 0000000000000000 R08: 0000561cdb43a280 R09: 0000000000000020
[  162.492459] R10: 00000000c0ed0000 R11: 0000000000000206 R12: 0000561cdb442af0
[  162.493183] R13: 0000561cdb43a260 R14: 0000000000000000 R15: 00000000ffffffff
[  162.493644]  </TASK>
[  162.493908]
[  162.494214] The buggy address belongs to the physical page:
[  162.494761] page:000000003e38a3d5 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x37bc
[  162.496064] flags: 0xfffffc0000000(node=0|zone=1|lastcpupid=0x1fffff)
[  162.497278] raw: 000fffffc0000000 ffffea00000df1c8 ffffea00000df008 0000000000000000
[  162.498928] raw: 0000000000000000 0000000000240000 00000000ffffffff 0000000000000000
[  162.500542] page dumped because: kasan: bad access detected
[  162.501057]
[  162.501242] Memory state around the buggy address:
[  162.502230]  ffff8880037bc980: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[  162.502977]  ffff8880037bca00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[  162.503522] >ffff8880037bca80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[  162.503963]                             ^
[  162.504370]  ffff8880037bcb00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[  162.504766]  ffff8880037bcb80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff

Signed-off-by: Edward Lo <edward.lo@ambergroup.io>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
18 months agophy: sun4i-usb: Add support for the H616 USB PHY
Andre Przywara [Mon, 31 Oct 2022 11:13:55 +0000 (11:13 +0000)]
phy: sun4i-usb: Add support for the H616 USB PHY

[ Upstream commit 0f607406525d25019dd9c498bcc0b42734fc59d5 ]

The USB PHY used in the Allwinner H616 SoC inherits some traits from its
various predecessors: it has four full PHYs like the H3, needs some
extra bits to be set like the H6, and puts SIDDQ on a different bit like
the A100. Plus it needs this weird PHY2 quirk.

Name all those properties in a new config struct and assign a new
compatible name to it.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Samuel Holland <samuel@sholland.org>
Link: https://lore.kernel.org/r/20221031111358.3387297-5-andre.przywara@arm.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
18 months agophy: sun4i-usb: Introduce port2 SIDDQ quirk
Andre Przywara [Mon, 31 Oct 2022 11:13:54 +0000 (11:13 +0000)]
phy: sun4i-usb: Introduce port2 SIDDQ quirk

[ Upstream commit b45c6d80325bec2b78c716629a518b6442d8bdc6 ]

At least the Allwinner H616 SoC requires a weird quirk to make most
USB PHYs work: Only port2 works out of the box, but all other ports
need some help from this port2 to work correctly: The CLK_BUS_PHY2 and
RST_USB_PHY2 clock and reset need to be enabled, and the SIDDQ bit in
the PMU PHY control register needs to be cleared. For this register to
be accessible, CLK_BUS_ECHI2 needs to be ungated. Don't ask ....

Instead of disguising this as some generic feature, treat it more like
a quirk (what it really is):
If the quirk bit is set, and we initialise a PHY other than PHY2, ungate
this one special clock, and clear the SIDDQ bit. We also pick the clock
and reset from PHY2 and enable them as well.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Link: https://lore.kernel.org/r/20221031111358.3387297-4-andre.przywara@arm.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
18 months agosoundwire: dmi-quirks: add quirk variant for LAPBC710 NUC15
Pierre-Louis Bossart [Tue, 18 Oct 2022 01:25:00 +0000 (09:25 +0800)]
soundwire: dmi-quirks: add quirk variant for LAPBC710 NUC15

[ Upstream commit f74495761df10c25a98256d16ea7465191b6e2cd ]

Some NUC15 LAPBC710 devices don't expose the same DMI information as
the Intel reference, add additional entry in the match table.

BugLink: https://github.com/thesofproject/linux/issues/3885
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Link: https://lore.kernel.org/r/20221018012500.1592994-1-yung-chuan.liao@linux.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
18 months agofs/ntfs3: Fix slab-out-of-bounds read in run_unpack
Hawkins Jiawei [Fri, 23 Sep 2022 11:09:04 +0000 (19:09 +0800)]
fs/ntfs3: Fix slab-out-of-bounds read in run_unpack

[ Upstream commit 887bfc546097fbe8071dac13b2fef73b77920899 ]

Syzkaller reports slab-out-of-bounds bug as follows:
==================================================================
BUG: KASAN: slab-out-of-bounds in run_unpack+0x8b7/0x970 fs/ntfs3/run.c:944
Read of size 1 at addr ffff88801bbdff02 by task syz-executor131/3611

[...]
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
 print_address_description mm/kasan/report.c:317 [inline]
 print_report.cold+0x2ba/0x719 mm/kasan/report.c:433
 kasan_report+0xb1/0x1e0 mm/kasan/report.c:495
 run_unpack+0x8b7/0x970 fs/ntfs3/run.c:944
 run_unpack_ex+0xb0/0x7c0 fs/ntfs3/run.c:1057
 ntfs_read_mft fs/ntfs3/inode.c:368 [inline]
 ntfs_iget5+0xc20/0x3280 fs/ntfs3/inode.c:501
 ntfs_loadlog_and_replay+0x124/0x5d0 fs/ntfs3/fsntfs.c:272
 ntfs_fill_super+0x1eff/0x37f0 fs/ntfs3/super.c:1018
 get_tree_bdev+0x440/0x760 fs/super.c:1323
 vfs_get_tree+0x89/0x2f0 fs/super.c:1530
 do_new_mount fs/namespace.c:3040 [inline]
 path_mount+0x1326/0x1e20 fs/namespace.c:3370
 do_mount fs/namespace.c:3383 [inline]
 __do_sys_mount fs/namespace.c:3591 [inline]
 __se_sys_mount fs/namespace.c:3568 [inline]
 __x64_sys_mount+0x27f/0x300 fs/namespace.c:3568
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
 [...]
 </TASK>

The buggy address belongs to the physical page:
page:ffffea00006ef600 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1bbd8
head:ffffea00006ef600 order:3 compound_mapcount:0 compound_pincount:0
flags: 0xfff00000010200(slab|head|node=0|zone=1|lastcpupid=0x7ff)
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff88801bbdfe00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff88801bbdfe80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff88801bbdff00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
                   ^
 ffff88801bbdff80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff88801bbe0000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================

Kernel will tries to read record and parse MFT from disk in
ntfs_read_mft().

Yet the problem is that during enumerating attributes in record,
kernel doesn't check whether run_off field loading from the disk
is a valid value.

To be more specific, if attr->nres.run_off is larger than attr->size,
kernel will passes an invalid argument run_buf_size in
run_unpack_ex(), which having an integer overflow. Then this invalid
argument will triggers the slab-out-of-bounds Read bug as above.

This patch solves it by adding the sanity check between
the offset to packed runs and attribute size.

link: https://lore.kernel.org/all/0000000000009145fc05e94bd5c3@google.com/#t
Reported-and-tested-by: syzbot+8d6fbb27a6aded64b25b@syzkaller.appspotmail.com
Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
18 months agofs/ntfs3: Validate resident attribute name
Edward Lo [Thu, 22 Sep 2022 16:50:23 +0000 (00:50 +0800)]
fs/ntfs3: Validate resident attribute name

[ Upstream commit 54e45702b648b7c0000e90b3e9b890e367e16ea8 ]

Though we already have some sanity checks while enumerating attributes,
resident attribute names aren't included. This patch checks the resident
attribute names are in the valid ranges.

[  259.209031] BUG: KASAN: slab-out-of-bounds in ni_create_attr_list+0x1e1/0x850
[  259.210770] Write of size 426 at addr ffff88800632f2b2 by task exp/255
[  259.211551]
[  259.212035] CPU: 0 PID: 255 Comm: exp Not tainted 6.0.0-rc6 #37
[  259.212955] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[  259.214387] Call Trace:
[  259.214640]  <TASK>
[  259.214895]  dump_stack_lvl+0x49/0x63
[  259.215284]  print_report.cold+0xf5/0x689
[  259.215565]  ? kasan_poison+0x3c/0x50
[  259.215778]  ? kasan_unpoison+0x28/0x60
[  259.215991]  ? ni_create_attr_list+0x1e1/0x850
[  259.216270]  kasan_report+0xa7/0x130
[  259.216481]  ? ni_create_attr_list+0x1e1/0x850
[  259.216719]  kasan_check_range+0x15a/0x1d0
[  259.216939]  memcpy+0x3c/0x70
[  259.217136]  ni_create_attr_list+0x1e1/0x850
[  259.217945]  ? __rcu_read_unlock+0x5b/0x280
[  259.218384]  ? ni_remove_attr+0x2e0/0x2e0
[  259.218712]  ? kernel_text_address+0xcf/0xe0
[  259.219064]  ? __kernel_text_address+0x12/0x40
[  259.219434]  ? arch_stack_walk+0x9e/0xf0
[  259.219668]  ? __this_cpu_preempt_check+0x13/0x20
[  259.219904]  ? sysvec_apic_timer_interrupt+0x57/0xc0
[  259.220140]  ? asm_sysvec_apic_timer_interrupt+0x1b/0x20
[  259.220561]  ni_ins_attr_ext+0x52c/0x5c0
[  259.220984]  ? ni_create_attr_list+0x850/0x850
[  259.221532]  ? run_deallocate+0x120/0x120
[  259.221972]  ? vfs_setxattr+0x128/0x300
[  259.222688]  ? setxattr+0x126/0x140
[  259.222921]  ? path_setxattr+0x164/0x180
[  259.223431]  ? __x64_sys_setxattr+0x6d/0x80
[  259.223828]  ? entry_SYSCALL_64_after_hwframe+0x63/0xcd
[  259.224417]  ? mi_find_attr+0x3c/0xf0
[  259.224772]  ni_insert_attr+0x1ba/0x420
[  259.225216]  ? ni_ins_attr_ext+0x5c0/0x5c0
[  259.225504]  ? ntfs_read_ea+0x119/0x450
[  259.225775]  ni_insert_resident+0xc0/0x1c0
[  259.226316]  ? ni_insert_nonresident+0x400/0x400
[  259.227001]  ? __kasan_kmalloc+0x88/0xb0
[  259.227468]  ? __kmalloc+0x192/0x320
[  259.227773]  ntfs_set_ea+0x6bf/0xb30
[  259.228216]  ? ftrace_graph_ret_addr+0x2a/0xb0
[  259.228494]  ? entry_SYSCALL_64_after_hwframe+0x63/0xcd
[  259.228838]  ? ntfs_read_ea+0x450/0x450
[  259.229098]  ? is_bpf_text_address+0x24/0x40
[  259.229418]  ? kernel_text_address+0xcf/0xe0
[  259.229681]  ? __kernel_text_address+0x12/0x40
[  259.229948]  ? unwind_get_return_address+0x3a/0x60
[  259.230271]  ? write_profile+0x270/0x270
[  259.230537]  ? arch_stack_walk+0x9e/0xf0
[  259.230836]  ntfs_setxattr+0x114/0x5c0
[  259.231099]  ? ntfs_set_acl_ex+0x2e0/0x2e0
[  259.231529]  ? evm_protected_xattr_common+0x6d/0x100
[  259.231817]  ? posix_xattr_acl+0x13/0x80
[  259.232073]  ? evm_protect_xattr+0x1f7/0x440
[  259.232351]  __vfs_setxattr+0xda/0x120
[  259.232635]  ? xattr_resolve_name+0x180/0x180
[  259.232912]  __vfs_setxattr_noperm+0x93/0x300
[  259.233219]  __vfs_setxattr_locked+0x141/0x160
[  259.233492]  ? kasan_poison+0x3c/0x50
[  259.233744]  vfs_setxattr+0x128/0x300
[  259.234002]  ? __vfs_setxattr_locked+0x160/0x160
[  259.234837]  do_setxattr+0xb8/0x170
[  259.235567]  ? vmemdup_user+0x53/0x90
[  259.236212]  setxattr+0x126/0x140
[  259.236491]  ? do_setxattr+0x170/0x170
[  259.236791]  ? debug_smp_processor_id+0x17/0x20
[  259.237232]  ? kasan_quarantine_put+0x57/0x180
[  259.237605]  ? putname+0x80/0xa0
[  259.237870]  ? __kasan_slab_free+0x11c/0x1b0
[  259.238234]  ? putname+0x80/0xa0
[  259.238500]  ? preempt_count_sub+0x18/0xc0
[  259.238775]  ? __mnt_want_write+0xaa/0x100
[  259.238990]  ? mnt_want_write+0x8b/0x150
[  259.239290]  path_setxattr+0x164/0x180
[  259.239605]  ? setxattr+0x140/0x140
[  259.239849]  ? debug_smp_processor_id+0x17/0x20
[  259.240174]  ? fpregs_assert_state_consistent+0x67/0x80
[  259.240411]  __x64_sys_setxattr+0x6d/0x80
[  259.240715]  do_syscall_64+0x3b/0x90
[  259.240934]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[  259.241697] RIP: 0033:0x7fc6b26e4469
[  259.242647] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 088
[  259.244512] RSP: 002b:00007ffc3c7841f8 EFLAGS: 00000217 ORIG_RAX: 00000000000000bc
[  259.245086] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fc6b26e4469
[  259.246025] RDX: 00007ffc3c784380 RSI: 00007ffc3c7842e0 RDI: 00007ffc3c784238
[  259.246961] RBP: 00007ffc3c788410 R08: 0000000000000001 R09: 00007ffc3c7884f8
[  259.247775] R10: 000000000000007f R11: 0000000000000217 R12: 00000000004004e0
[  259.248534] R13: 00007ffc3c7884f0 R14: 0000000000000000 R15: 0000000000000000
[  259.249368]  </TASK>
[  259.249644]
[  259.249888] Allocated by task 255:
[  259.250283]  kasan_save_stack+0x26/0x50
[  259.250957]  __kasan_kmalloc+0x88/0xb0
[  259.251826]  __kmalloc+0x192/0x320
[  259.252745]  ni_create_attr_list+0x11e/0x850
[  259.253298]  ni_ins_attr_ext+0x52c/0x5c0
[  259.253685]  ni_insert_attr+0x1ba/0x420
[  259.253974]  ni_insert_resident+0xc0/0x1c0
[  259.254311]  ntfs_set_ea+0x6bf/0xb30
[  259.254629]  ntfs_setxattr+0x114/0x5c0
[  259.254859]  __vfs_setxattr+0xda/0x120
[  259.255155]  __vfs_setxattr_noperm+0x93/0x300
[  259.255445]  __vfs_setxattr_locked+0x141/0x160
[  259.255862]  vfs_setxattr+0x128/0x300
[  259.256251]  do_setxattr+0xb8/0x170
[  259.256522]  setxattr+0x126/0x140
[  259.256911]  path_setxattr+0x164/0x180
[  259.257308]  __x64_sys_setxattr+0x6d/0x80
[  259.257637]  do_syscall_64+0x3b/0x90
[  259.257970]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[  259.258550]
[  259.258772] The buggy address belongs to the object at ffff88800632f000
[  259.258772]  which belongs to the cache kmalloc-1k of size 1024
[  259.260190] The buggy address is located 690 bytes inside of
[  259.260190]  1024-byte region [ffff88800632f000ffff88800632f400)
[  259.261412]
[  259.261743] The buggy address belongs to the physical page:
[  259.262354] page:0000000081e8cac9 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x632c
[  259.263722] head:0000000081e8cac9 order:2 compound_mapcount:0 compound_pincount:0
[  259.264284] flags: 0xfffffc0010200(slab|head|node=0|zone=1|lastcpupid=0x1fffff)
[  259.265312] raw: 000fffffc0010200 ffffea0000060d00 dead000000000004 ffff888001041dc0
[  259.265772] raw: 0000000000000000 0000000080080008 00000001ffffffff 0000000000000000
[  259.266305] page dumped because: kasan: bad access detected
[  259.266588]
[  259.266728] Memory state around the buggy address:
[  259.267225]  ffff88800632f300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  259.267841]  ffff88800632f380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  259.269111] >ffff88800632f400: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  259.269626]                    ^
[  259.270162]  ffff88800632f480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  259.270810]  ffff88800632f500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc

Signed-off-by: Edward Lo <edward.lo@ambergroup.io>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
18 months agofs/ntfs3: Validate buffer length while parsing index
Edward Lo [Thu, 22 Sep 2022 07:30:44 +0000 (15:30 +0800)]
fs/ntfs3: Validate buffer length while parsing index

[ Upstream commit 4d42ecda239cc13738d6fd84d098a32e67b368b9 ]

indx_read is called when we have some NTFS directory operations that
need more information from the index buffers. This adds a sanity check
to make sure the returned index buffer length is legit, or we may have
some out-of-bound memory accesses.

[  560.897595] BUG: KASAN: slab-out-of-bounds in hdr_find_e.isra.0+0x10c/0x320
[  560.898321] Read of size 2 at addr ffff888009497238 by task exp/245
[  560.898760]
[  560.899129] CPU: 0 PID: 245 Comm: exp Not tainted 6.0.0-rc6 #37
[  560.899505] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[  560.900170] Call Trace:
[  560.900407]  <TASK>
[  560.900732]  dump_stack_lvl+0x49/0x63
[  560.901108]  print_report.cold+0xf5/0x689
[  560.901395]  ? hdr_find_e.isra.0+0x10c/0x320
[  560.901716]  kasan_report+0xa7/0x130
[  560.901950]  ? hdr_find_e.isra.0+0x10c/0x320
[  560.902208]  __asan_load2+0x68/0x90
[  560.902427]  hdr_find_e.isra.0+0x10c/0x320
[  560.902846]  ? cmp_uints+0xe0/0xe0
[  560.903363]  ? cmp_sdh+0x90/0x90
[  560.903883]  ? ntfs_bread_run+0x190/0x190
[  560.904196]  ? rwsem_down_read_slowpath+0x750/0x750
[  560.904969]  ? ntfs_fix_post_read+0xe0/0x130
[  560.905259]  ? __kasan_check_write+0x14/0x20
[  560.905599]  ? up_read+0x1a/0x90
[  560.905853]  ? indx_read+0x22c/0x380
[  560.906096]  indx_find+0x2ef/0x470
[  560.906352]  ? indx_find_buffer+0x2d0/0x2d0
[  560.906692]  ? __kasan_kmalloc+0x88/0xb0
[  560.906977]  dir_search_u+0x196/0x2f0
[  560.907220]  ? ntfs_nls_to_utf16+0x450/0x450
[  560.907464]  ? __kasan_check_write+0x14/0x20
[  560.907747]  ? mutex_lock+0x8f/0xe0
[  560.907970]  ? __mutex_lock_slowpath+0x20/0x20
[  560.908214]  ? kmem_cache_alloc+0x143/0x4b0
[  560.908459]  ntfs_lookup+0xe0/0x100
[  560.908788]  __lookup_slow+0x116/0x220
[  560.909050]  ? lookup_fast+0x1b0/0x1b0
[  560.909309]  ? lookup_fast+0x13f/0x1b0
[  560.909601]  walk_component+0x187/0x230
[  560.909944]  link_path_walk.part.0+0x3f0/0x660
[  560.910285]  ? handle_lookup_down+0x90/0x90
[  560.910618]  ? path_init+0x642/0x6e0
[  560.911084]  ? percpu_counter_add_batch+0x6e/0xf0
[  560.912559]  ? __alloc_file+0x114/0x170
[  560.913008]  path_openat+0x19c/0x1d10
[  560.913419]  ? getname_flags+0x73/0x2b0
[  560.913815]  ? kasan_save_stack+0x3a/0x50
[  560.914125]  ? kasan_save_stack+0x26/0x50
[  560.914542]  ? __kasan_slab_alloc+0x6d/0x90
[  560.914924]  ? kmem_cache_alloc+0x143/0x4b0
[  560.915339]  ? getname_flags+0x73/0x2b0
[  560.915647]  ? getname+0x12/0x20
[  560.916114]  ? __x64_sys_open+0x4c/0x60
[  560.916460]  ? path_lookupat.isra.0+0x230/0x230
[  560.916867]  ? __isolate_free_page+0x2e0/0x2e0
[  560.917194]  do_filp_open+0x15c/0x1f0
[  560.917448]  ? may_open_dev+0x60/0x60
[  560.917696]  ? expand_files+0xa4/0x3a0
[  560.917923]  ? __kasan_check_write+0x14/0x20
[  560.918185]  ? _raw_spin_lock+0x88/0xdb
[  560.918409]  ? _raw_spin_lock_irqsave+0x100/0x100
[  560.918783]  ? _find_next_bit+0x4a/0x130
[  560.919026]  ? _raw_spin_unlock+0x19/0x40
[  560.919276]  ? alloc_fd+0x14b/0x2d0
[  560.919635]  do_sys_openat2+0x32a/0x4b0
[  560.920035]  ? file_open_root+0x230/0x230
[  560.920336]  ? __rcu_read_unlock+0x5b/0x280
[  560.920813]  do_sys_open+0x99/0xf0
[  560.921208]  ? filp_open+0x60/0x60
[  560.921482]  ? exit_to_user_mode_prepare+0x49/0x180
[  560.921867]  __x64_sys_open+0x4c/0x60
[  560.922128]  do_syscall_64+0x3b/0x90
[  560.922369]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[  560.923030] RIP: 0033:0x7f7dff2e4469
[  560.923681] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 088
[  560.924451] RSP: 002b:00007ffd41a210b8 EFLAGS: 00000206 ORIG_RAX: 0000000000000002
[  560.925168] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7dff2e4469
[  560.925655] RDX: 0000000000000000 RSI: 0000000000000002 RDI: 00007ffd41a211f0
[  560.926085] RBP: 00007ffd41a252a0 R08: 00007f7dff60fba0 R09: 00007ffd41a25388
[  560.926405] R10: 0000000000400b80 R11: 0000000000000206 R12: 00000000004004e0
[  560.926867] R13: 00007ffd41a25380 R14: 0000000000000000 R15: 0000000000000000
[  560.927241]  </TASK>
[  560.927491]
[  560.927755] Allocated by task 245:
[  560.928409]  kasan_save_stack+0x26/0x50
[  560.929271]  __kasan_kmalloc+0x88/0xb0
[  560.929778]  __kmalloc+0x192/0x320
[  560.930023]  indx_read+0x249/0x380
[  560.930224]  indx_find+0x2a2/0x470
[  560.930695]  dir_search_u+0x196/0x2f0
[  560.930892]  ntfs_lookup+0xe0/0x100
[  560.931115]  __lookup_slow+0x116/0x220
[  560.931323]  walk_component+0x187/0x230
[  560.931570]  link_path_walk.part.0+0x3f0/0x660
[  560.931791]  path_openat+0x19c/0x1d10
[  560.932008]  do_filp_open+0x15c/0x1f0
[  560.932226]  do_sys_openat2+0x32a/0x4b0
[  560.932413]  do_sys_open+0x99/0xf0
[  560.932709]  __x64_sys_open+0x4c/0x60
[  560.933417]  do_syscall_64+0x3b/0x90
[  560.933776]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[  560.934235]
[  560.934486] The buggy address belongs to the object at ffff888009497000
[  560.934486]  which belongs to the cache kmalloc-512 of size 512
[  560.935239] The buggy address is located 56 bytes to the right of
[  560.935239]  512-byte region [ffff888009497000ffff888009497200)
[  560.936153]
[  560.937326] The buggy address belongs to the physical page:
[  560.938228] page:0000000062a3dfae refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x9496
[  560.939616] head:0000000062a3dfae order:1 compound_mapcount:0 compound_pincount:0
[  560.940219] flags: 0xfffffc0010200(slab|head|node=0|zone=1|lastcpupid=0x1fffff)
[  560.942702] raw: 000fffffc0010200 ffffea0000164f80 dead000000000005 ffff888001041c80
[  560.943932] raw: 0000000000000000 0000000080080008 00000001ffffffff 0000000000000000
[  560.944568] page dumped because: kasan: bad access detected
[  560.945735]
[  560.946112] Memory state around the buggy address:
[  560.946870]  ffff888009497100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  560.947242]  ffff888009497180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[  560.947611] >ffff888009497200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  560.947915]                                         ^
[  560.948249]  ffff888009497280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  560.948687]  ffff888009497300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc

Signed-off-by: Edward Lo <edward.lo@ambergroup.io>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
18 months agofs/ntfs3: Validate attribute name offset
Edward Lo [Fri, 9 Sep 2022 01:04:00 +0000 (09:04 +0800)]
fs/ntfs3: Validate attribute name offset

[ Upstream commit 4f1dc7d9756e66f3f876839ea174df2e656b7f79 ]

Although the attribute name length is checked before comparing it to
some common names (e.g., $I30), the offset isn't. This adds a sanity
check for the attribute name offset, guarantee the validity and prevent
possible out-of-bound memory accesses.

[  191.720056] BUG: unable to handle page fault for address: ffffebde00000008
[  191.721060] #PF: supervisor read access in kernel mode
[  191.721586] #PF: error_code(0x0000) - not-present page
[  191.722079] PGD 0 P4D 0
[  191.722571] Oops: 0000 [#1] PREEMPT SMP KASAN NOPTI
[  191.723179] CPU: 0 PID: 244 Comm: mount Not tainted 6.0.0-rc4 #28
[  191.723749] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[  191.724832] RIP: 0010:kfree+0x56/0x3b0
[  191.725870] Code: 80 48 01 d8 0f 82 65 03 00 00 48 c7 c2 00 00 00 80 48 2b 15 2c 06 dd 01 48 01 d0 48 c1 e8 0c 48 c1 e0 06 48 03 05 0a 069
[  191.727375] RSP: 0018:ffff8880076f7878 EFLAGS: 00000286
[  191.727897] RAX: ffffebde00000000 RBX: 0000000000000040 RCX: ffffffff8528d5b9
[  191.728531] RDX: 0000777f80000000 RSI: ffffffff8522d49c RDI: 0000000000000040
[  191.729183] RBP: ffff8880076f78a0 R08: 0000000000000000 R09: 0000000000000000
[  191.729628] R10: ffff888008949fd8 R11: ffffed10011293fd R12: 0000000000000040
[  191.730158] R13: ffff888008949f98 R14: ffff888008949ec0 R15: ffff888008949fb0
[  191.730645] FS:  00007f3520cd7e40(0000) GS:ffff88805ba00000(0000) knlGS:0000000000000000
[  191.731328] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  191.731667] CR2: ffffebde00000008 CR3: 0000000009704000 CR4: 00000000000006f0
[  191.732568] Call Trace:
[  191.733231]  <TASK>
[  191.733860]  kvfree+0x2c/0x40
[  191.734632]  ni_clear+0x180/0x290
[  191.735085]  ntfs_evict_inode+0x45/0x70
[  191.735495]  evict+0x199/0x280
[  191.735996]  iput.part.0+0x286/0x320
[  191.736438]  iput+0x32/0x50
[  191.736811]  iget_failed+0x23/0x30
[  191.737270]  ntfs_iget5+0x337/0x1890
[  191.737629]  ? ntfs_clear_mft_tail+0x20/0x260
[  191.738201]  ? ntfs_get_block_bmap+0x70/0x70
[  191.738482]  ? ntfs_objid_init+0xf6/0x140
[  191.738779]  ? ntfs_reparse_init+0x140/0x140
[  191.739266]  ntfs_fill_super+0x121b/0x1b50
[  191.739623]  ? put_ntfs+0x1d0/0x1d0
[  191.739984]  ? asm_sysvec_apic_timer_interrupt+0x1b/0x20
[  191.740466]  ? put_ntfs+0x1d0/0x1d0
[  191.740787]  ? sb_set_blocksize+0x6a/0x80
[  191.741272]  get_tree_bdev+0x232/0x370
[  191.741829]  ? put_ntfs+0x1d0/0x1d0
[  191.742669]  ntfs_fs_get_tree+0x15/0x20
[  191.743132]  vfs_get_tree+0x4c/0x130
[  191.743457]  path_mount+0x654/0xfe0
[  191.743938]  ? putname+0x80/0xa0
[  191.744271]  ? finish_automount+0x2e0/0x2e0
[  191.744582]  ? putname+0x80/0xa0
[  191.745053]  ? kmem_cache_free+0x1c4/0x440
[  191.745403]  ? putname+0x80/0xa0
[  191.745616]  do_mount+0xd6/0xf0
[  191.745887]  ? path_mount+0xfe0/0xfe0
[  191.746287]  ? __kasan_check_write+0x14/0x20
[  191.746582]  __x64_sys_mount+0xca/0x110
[  191.746850]  do_syscall_64+0x3b/0x90
[  191.747122]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[  191.747517] RIP: 0033:0x7f351fee948a
[  191.748332] Code: 48 8b 0d 11 fa 2a 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 008
[  191.749341] RSP: 002b:00007ffd51cf3af8 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5
[  191.749960] RAX: ffffffffffffffda RBX: 000055b903733060 RCX: 00007f351fee948a
[  191.750589] RDX: 000055b903733260 RSI: 000055b9037332e0 RDI: 000055b90373bce0
[  191.751115] RBP: 0000000000000000 R08: 000055b903733280 R09: 0000000000000020
[  191.751537] R10: 00000000c0ed0000 R11: 0000000000000202 R12: 000055b90373bce0
[  191.751946] R13: 000055b903733260 R14: 0000000000000000 R15: 00000000ffffffff
[  191.752519]  </TASK>
[  191.752782] Modules linked in:
[  191.753785] CR2: ffffebde00000008
[  191.754937] ---[ end trace 0000000000000000 ]---
[  191.755429] RIP: 0010:kfree+0x56/0x3b0
[  191.755725] Code: 80 48 01 d8 0f 82 65 03 00 00 48 c7 c2 00 00 00 80 48 2b 15 2c 06 dd 01 48 01 d0 48 c1 e8 0c 48 c1 e0 06 48 03 05 0a 069
[  191.756744] RSP: 0018:ffff8880076f7878 EFLAGS: 00000286
[  191.757218] RAX: ffffebde00000000 RBX: 0000000000000040 RCX: ffffffff8528d5b9
[  191.757580] RDX: 0000777f80000000 RSI: ffffffff8522d49c RDI: 0000000000000040
[  191.758016] RBP: ffff8880076f78a0 R08: 0000000000000000 R09: 0000000000000000
[  191.758570] R10: ffff888008949fd8 R11: ffffed10011293fd R12: 0000000000000040
[  191.758957] R13: ffff888008949f98 R14: ffff888008949ec0 R15: ffff888008949fb0
[  191.759317] FS:  00007f3520cd7e40(0000) GS:ffff88805ba00000(0000) knlGS:0000000000000000
[  191.759711] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  191.760118] CR2: ffffebde00000008 CR3: 0000000009704000 CR4: 00000000000006f0

Signed-off-by: Edward Lo <edward.lo@ambergroup.io>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
18 months agofs/ntfs3: Add null pointer check for inode operations
Edward Lo [Fri, 9 Sep 2022 01:03:10 +0000 (09:03 +0800)]
fs/ntfs3: Add null pointer check for inode operations

[ Upstream commit c1ca8ef0262b25493631ecbd9cb8c9893e1481a1 ]

This adds a sanity check for the i_op pointer of the inode which is
returned after reading Root directory MFT record. We should check the
i_op is valid before trying to create the root dentry, otherwise we may
encounter a NPD while mounting a image with a funny Root directory MFT
record.

[  114.484325] BUG: kernel NULL pointer dereference, address: 0000000000000008
[  114.484811] #PF: supervisor read access in kernel mode
[  114.485084] #PF: error_code(0x0000) - not-present page
[  114.485606] PGD 0 P4D 0
[  114.485975] Oops: 0000 [#1] PREEMPT SMP KASAN NOPTI
[  114.486570] CPU: 0 PID: 237 Comm: mount Tainted: G    B              6.0.0-rc4 #28
[  114.486977] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[  114.488169] RIP: 0010:d_flags_for_inode+0xe0/0x110
[  114.488816] Code: 24 f7 ff 49 83 3e 00 74 41 41 83 cd 02 66 44 89 6b 02 eb 92 48 8d 7b 20 e8 6d 24 f7 ff 4c 8b 73 20 49 8d 7e 08 e8 60 241
[  114.490326] RSP: 0018:ffff8880065e7aa8 EFLAGS: 00000296
[  114.490695] RAX: 0000000000000001 RBX: ffff888008ccd750 RCX: ffffffff84af2aea
[  114.490986] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffffffff87abd020
[  114.491364] RBP: ffff8880065e7ac8 R08: 0000000000000001 R09: fffffbfff0f57a05
[  114.491675] R10: ffffffff87abd027 R11: fffffbfff0f57a04 R12: 0000000000000000
[  114.491954] R13: 0000000000000008 R14: 0000000000000000 R15: ffff888008ccd750
[  114.492397] FS:  00007fdc8a627e40(0000) GS:ffff888058200000(0000) knlGS:0000000000000000
[  114.492797] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  114.493150] CR2: 0000000000000008 CR3: 00000000013ba000 CR4: 00000000000006f0
[  114.493671] Call Trace:
[  114.493890]  <TASK>
[  114.494075]  __d_instantiate+0x24/0x1c0
[  114.494505]  d_instantiate.part.0+0x35/0x50
[  114.494754]  d_make_root+0x53/0x80
[  114.494998]  ntfs_fill_super+0x1232/0x1b50
[  114.495260]  ? put_ntfs+0x1d0/0x1d0
[  114.495499]  ? vsprintf+0x20/0x20
[  114.495723]  ? set_blocksize+0x95/0x150
[  114.495964]  get_tree_bdev+0x232/0x370
[  114.496272]  ? put_ntfs+0x1d0/0x1d0
[  114.496502]  ntfs_fs_get_tree+0x15/0x20
[  114.496859]  vfs_get_tree+0x4c/0x130
[  114.497099]  path_mount+0x654/0xfe0
[  114.497507]  ? putname+0x80/0xa0
[  114.497933]  ? finish_automount+0x2e0/0x2e0
[  114.498362]  ? putname+0x80/0xa0
[  114.498571]  ? kmem_cache_free+0x1c4/0x440
[  114.498819]  ? putname+0x80/0xa0
[  114.499069]  do_mount+0xd6/0xf0
[  114.499343]  ? path_mount+0xfe0/0xfe0
[  114.499683]  ? __kasan_check_write+0x14/0x20
[  114.500133]  __x64_sys_mount+0xca/0x110
[  114.500592]  do_syscall_64+0x3b/0x90
[  114.500930]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[  114.501294] RIP: 0033:0x7fdc898e948a
[  114.501542] Code: 48 8b 0d 11 fa 2a 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 008
[  114.502716] RSP: 002b:00007ffd793e58f8 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5
[  114.503175] RAX: ffffffffffffffda RBX: 0000564b2228f060 RCX: 00007fdc898e948a
[  114.503588] RDX: 0000564b2228f260 RSI: 0000564b2228f2e0 RDI: 0000564b22297ce0
[  114.504925] RBP: 0000000000000000 R08: 0000564b2228f280 R09: 0000000000000020
[  114.505484] R10: 00000000c0ed0000 R11: 0000000000000202 R12: 0000564b22297ce0
[  114.505823] R13: 0000564b2228f260 R14: 0000000000000000 R15: 00000000ffffffff
[  114.506562]  </TASK>
[  114.506887] Modules linked in:
[  114.507648] CR2: 0000000000000008
[  114.508884] ---[ end trace 0000000000000000 ]---
[  114.509675] RIP: 0010:d_flags_for_inode+0xe0/0x110
[  114.510140] Code: 24 f7 ff 49 83 3e 00 74 41 41 83 cd 02 66 44 89 6b 02 eb 92 48 8d 7b 20 e8 6d 24 f7 ff 4c 8b 73 20 49 8d 7e 08 e8 60 241
[  114.511762] RSP: 0018:ffff8880065e7aa8 EFLAGS: 00000296
[  114.512401] RAX: 0000000000000001 RBX: ffff888008ccd750 RCX: ffffffff84af2aea
[  114.513103] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffffffff87abd020
[  114.513512] RBP: ffff8880065e7ac8 R08: 0000000000000001 R09: fffffbfff0f57a05
[  114.513831] R10: ffffffff87abd027 R11: fffffbfff0f57a04 R12: 0000000000000000
[  114.514757] R13: 0000000000000008 R14: 0000000000000000 R15: ffff888008ccd750
[  114.515411] FS:  00007fdc8a627e40(0000) GS:ffff888058200000(0000) knlGS:0000000000000000
[  114.515794] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  114.516208] CR2: 0000000000000008 CR3: 00000000013ba000 CR4: 00000000000006f0

Signed-off-by: Edward Lo <edward.lo@ambergroup.io>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
18 months agofs/ntfs3: Fix memory leak on ntfs_fill_super() error path
Shigeru Yoshida [Tue, 23 Aug 2022 10:32:05 +0000 (19:32 +0900)]
fs/ntfs3: Fix memory leak on ntfs_fill_super() error path

[ Upstream commit 51e76a232f8c037f1d9e9922edc25b003d5f3414 ]

syzbot reported kmemleak as below:

BUG: memory leak
unreferenced object 0xffff8880122f1540 (size 32):
  comm "a.out", pid 6664, jiffies 4294939771 (age 25.500s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 ed ff ed ff 00 00 00 00  ................
  backtrace:
    [<ffffffff81b16052>] ntfs_init_fs_context+0x22/0x1c0
    [<ffffffff8164aaa7>] alloc_fs_context+0x217/0x430
    [<ffffffff81626dd4>] path_mount+0x704/0x1080
    [<ffffffff81627e7c>] __x64_sys_mount+0x18c/0x1d0
    [<ffffffff84593e14>] do_syscall_64+0x34/0xb0
    [<ffffffff84600087>] entry_SYSCALL_64_after_hwframe+0x63/0xcd

This patch fixes this issue by freeing mount options on error path of
ntfs_fill_super().

Reported-by: syzbot+9d67170b20e8f94351c8@syzkaller.appspotmail.com
Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
18 months agofs/ntfs3: Add null pointer check to attr_load_runs_vcn
Edward Lo [Sat, 6 Aug 2022 17:05:18 +0000 (01:05 +0800)]
fs/ntfs3: Add null pointer check to attr_load_runs_vcn

[ Upstream commit 2681631c29739509eec59cc0b34e977bb04c6cf1 ]

Some metadata files are handled before MFT. This adds a null pointer
check for some corner cases that could lead to NPD while reading these
metadata files for a malformed NTFS image.

[  240.190827] BUG: kernel NULL pointer dereference, address: 0000000000000158
[  240.191583] #PF: supervisor read access in kernel mode
[  240.191956] #PF: error_code(0x0000) - not-present page
[  240.192391] PGD 0 P4D 0
[  240.192897] Oops: 0000 [#1] PREEMPT SMP KASAN NOPTI
[  240.193805] CPU: 0 PID: 242 Comm: mount Tainted: G    B             5.19.0+ #17
[  240.194477] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[  240.195152] RIP: 0010:ni_find_attr+0xae/0x300
[  240.195679] Code: c8 48 c7 45 88 c0 4e 5e 86 c7 00 f1 f1 f1 f1 c7 40 04 00 f3 f3 f3 65 48 8b 04 25 28 00 00 00 48 89 45 d0 31 c0 e8 e2 d9f
[  240.196642] RSP: 0018:ffff88800812f690 EFLAGS: 00000286
[  240.197019] RAX: 0000000000000001 RBX: 0000000000000000 RCX: ffffffff85ef037a
[  240.197523] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffffffff88e95f60
[  240.197877] RBP: ffff88800812f738 R08: 0000000000000001 R09: fffffbfff11d2bed
[  240.198292] R10: ffffffff88e95f67 R11: fffffbfff11d2bec R12: 0000000000000000
[  240.198647] R13: 0000000000000080 R14: 0000000000000000 R15: 0000000000000000
[  240.199410] FS:  00007f233c33be40(0000) GS:ffff888058200000(0000) knlGS:0000000000000000
[  240.199895] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  240.200314] CR2: 0000000000000158 CR3: 0000000004d32000 CR4: 00000000000006f0
[  240.200839] Call Trace:
[  240.201104]  <TASK>
[  240.201502]  ? ni_load_mi+0x80/0x80
[  240.202297]  ? ___slab_alloc+0x465/0x830
[  240.202614]  attr_load_runs_vcn+0x8c/0x1a0
[  240.202886]  ? __kasan_slab_alloc+0x32/0x90
[  240.203157]  ? attr_data_write_resident+0x250/0x250
[  240.203543]  mi_read+0x133/0x2c0
[  240.203785]  mi_get+0x70/0x140
[  240.204012]  ni_load_mi_ex+0xfa/0x190
[  240.204346]  ? ni_std5+0x90/0x90
[  240.204588]  ? __kasan_kmalloc+0x88/0xb0
[  240.204859]  ni_enum_attr_ex+0xf1/0x1c0
[  240.205107]  ? ni_fname_type.part.0+0xd0/0xd0
[  240.205600]  ? ntfs_load_attr_list+0xbe/0x300
[  240.205864]  ? ntfs_cmp_names_cpu+0x125/0x180
[  240.206157]  ntfs_iget5+0x56c/0x1870
[  240.206510]  ? ntfs_get_block_bmap+0x70/0x70
[  240.206776]  ? __kasan_kmalloc+0x88/0xb0
[  240.207030]  ? set_blocksize+0x95/0x150
[  240.207545]  ntfs_fill_super+0xb8f/0x1e20
[  240.207839]  ? put_ntfs+0x1d0/0x1d0
[  240.208069]  ? vsprintf+0x20/0x20
[  240.208467]  ? mutex_unlock+0x81/0xd0
[  240.208846]  ? set_blocksize+0x95/0x150
[  240.209221]  get_tree_bdev+0x232/0x370
[  240.209804]  ? put_ntfs+0x1d0/0x1d0
[  240.210519]  ntfs_fs_get_tree+0x15/0x20
[  240.210991]  vfs_get_tree+0x4c/0x130
[  240.211455]  path_mount+0x645/0xfd0
[  240.211806]  ? putname+0x80/0xa0
[  240.212112]  ? finish_automount+0x2e0/0x2e0
[  240.212559]  ? kmem_cache_free+0x110/0x390
[  240.212906]  ? putname+0x80/0xa0
[  240.213329]  do_mount+0xd6/0xf0
[  240.213829]  ? path_mount+0xfd0/0xfd0
[  240.214246]  ? __kasan_check_write+0x14/0x20
[  240.214774]  __x64_sys_mount+0xca/0x110
[  240.215080]  do_syscall_64+0x3b/0x90
[  240.215442]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[  240.215811] RIP: 0033:0x7f233b4e948a
[  240.216104] Code: 48 8b 0d 11 fa 2a 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 008
[  240.217615] RSP: 002b:00007fff02211ec8 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5
[  240.218718] RAX: ffffffffffffffda RBX: 0000561cdc35b060 RCX: 00007f233b4e948a
[  240.219556] RDX: 0000561cdc35b260 RSI: 0000561cdc35b2e0 RDI: 0000561cdc363af0
[  240.219975] RBP: 0000000000000000 R08: 0000561cdc35b280 R09: 0000000000000020
[  240.220403] R10: 00000000c0ed0000 R11: 0000000000000202 R12: 0000561cdc363af0
[  240.220803] R13: 0000561cdc35b260 R14: 0000000000000000 R15: 00000000ffffffff
[  240.221256]  </TASK>
[  240.221567] Modules linked in:
[  240.222028] CR2: 0000000000000158
[  240.223291] ---[ end trace 0000000000000000 ]---
[  240.223669] RIP: 0010:ni_find_attr+0xae/0x300
[  240.224058] Code: c8 48 c7 45 88 c0 4e 5e 86 c7 00 f1 f1 f1 f1 c7 40 04 00 f3 f3 f3 65 48 8b 04 25 28 00 00 00 48 89 45 d0 31 c0 e8 e2 d9f
[  240.225033] RSP: 0018:ffff88800812f690 EFLAGS: 00000286
[  240.225968] RAX: 0000000000000001 RBX: 0000000000000000 RCX: ffffffff85ef037a
[  240.226624] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffffffff88e95f60
[  240.227307] RBP: ffff88800812f738 R08: 0000000000000001 R09: fffffbfff11d2bed
[  240.227816] R10: ffffffff88e95f67 R11: fffffbfff11d2bec R12: 0000000000000000
[  240.228330] R13: 0000000000000080 R14: 0000000000000000 R15: 0000000000000000
[  240.228729] FS:  00007f233c33be40(0000) GS:ffff888058200000(0000) knlGS:0000000000000000
[  240.229281] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  240.230298] CR2: 0000000000000158 CR3: 0000000004d32000 CR4: 00000000000006f0

Signed-off-by: Edward Lo <edward.lo@ambergroup.io>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
18 months agofs/ntfs3: Validate data run offset
Edward Lo [Fri, 5 Aug 2022 16:47:27 +0000 (00:47 +0800)]
fs/ntfs3: Validate data run offset

[ Upstream commit 6db620863f8528ed9a9aa5ad323b26554a17881d ]

This adds sanity checks for data run offset. We should make sure data
run offset is legit before trying to unpack them, otherwise we may
encounter use-after-free or some unexpected memory access behaviors.

[   82.940342] BUG: KASAN: use-after-free in run_unpack+0x2e3/0x570
[   82.941180] Read of size 1 at addr ffff888008a8487f by task mount/240
[   82.941670]
[   82.942069] CPU: 0 PID: 240 Comm: mount Not tainted 5.19.0+ #15
[   82.942482] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[   82.943720] Call Trace:
[   82.944204]  <TASK>
[   82.944471]  dump_stack_lvl+0x49/0x63
[   82.944908]  print_report.cold+0xf5/0x67b
[   82.945141]  ? __wait_on_bit+0x106/0x120
[   82.945750]  ? run_unpack+0x2e3/0x570
[   82.946626]  kasan_report+0xa7/0x120
[   82.947046]  ? run_unpack+0x2e3/0x570
[   82.947280]  __asan_load1+0x51/0x60
[   82.947483]  run_unpack+0x2e3/0x570
[   82.947709]  ? memcpy+0x4e/0x70
[   82.947927]  ? run_pack+0x7a0/0x7a0
[   82.948158]  run_unpack_ex+0xad/0x3f0
[   82.948399]  ? mi_enum_attr+0x14a/0x200
[   82.948717]  ? run_unpack+0x570/0x570
[   82.949072]  ? ni_enum_attr_ex+0x1b2/0x1c0
[   82.949332]  ? ni_fname_type.part.0+0xd0/0xd0
[   82.949611]  ? mi_read+0x262/0x2c0
[   82.949970]  ? ntfs_cmp_names_cpu+0x125/0x180
[   82.950249]  ntfs_iget5+0x632/0x1870
[   82.950621]  ? ntfs_get_block_bmap+0x70/0x70
[   82.951192]  ? evict+0x223/0x280
[   82.951525]  ? iput.part.0+0x286/0x320
[   82.951969]  ntfs_fill_super+0x1321/0x1e20
[   82.952436]  ? put_ntfs+0x1d0/0x1d0
[   82.952822]  ? vsprintf+0x20/0x20
[   82.953188]  ? mutex_unlock+0x81/0xd0
[   82.953379]  ? set_blocksize+0x95/0x150
[   82.954001]  get_tree_bdev+0x232/0x370
[   82.954438]  ? put_ntfs+0x1d0/0x1d0
[   82.954700]  ntfs_fs_get_tree+0x15/0x20
[   82.955049]  vfs_get_tree+0x4c/0x130
[   82.955292]  path_mount+0x645/0xfd0
[   82.955615]  ? putname+0x80/0xa0
[   82.955955]  ? finish_automount+0x2e0/0x2e0
[   82.956310]  ? kmem_cache_free+0x110/0x390
[   82.956723]  ? putname+0x80/0xa0
[   82.957023]  do_mount+0xd6/0xf0
[   82.957411]  ? path_mount+0xfd0/0xfd0
[   82.957638]  ? __kasan_check_write+0x14/0x20
[   82.957948]  __x64_sys_mount+0xca/0x110
[   82.958310]  do_syscall_64+0x3b/0x90
[   82.958719]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[   82.959341] RIP: 0033:0x7fd0d1ce948a
[   82.960193] Code: 48 8b 0d 11 fa 2a 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 008
[   82.961532] RSP: 002b:00007ffe59ff69a8 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5
[   82.962527] RAX: ffffffffffffffda RBX: 0000564dcc107060 RCX: 00007fd0d1ce948a
[   82.963266] RDX: 0000564dcc107260 RSI: 0000564dcc1072e0 RDI: 0000564dcc10fce0
[   82.963686] RBP: 0000000000000000 R08: 0000564dcc107280 R09: 0000000000000020
[   82.964272] R10: 00000000c0ed0000 R11: 0000000000000202 R12: 0000564dcc10fce0
[   82.964785] R13: 0000564dcc107260 R14: 0000000000000000 R15: 00000000ffffffff

Signed-off-by: Edward Lo <edward.lo@ambergroup.io>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
18 months agofs/ntfs3: Add overflow check for attribute size
edward lo [Mon, 1 Aug 2022 10:20:51 +0000 (18:20 +0800)]
fs/ntfs3: Add overflow check for attribute size

[ Upstream commit e19c6277652efba203af4ecd8eed4bd30a0054c9 ]

The offset addition could overflow and pass the used size check given an
attribute with very large size (e.g., 0xffffff7f) while parsing MFT
attributes. This could lead to out-of-bound memory R/W if we try to
access the next attribute derived by Add2Ptr(attr, asize)

[   32.963847] BUG: unable to handle page fault for address: ffff956a83c76067
[   32.964301] #PF: supervisor read access in kernel mode
[   32.964526] #PF: error_code(0x0000) - not-present page
[   32.964893] PGD 4dc01067 P4D 4dc01067 PUD 0
[   32.965316] Oops: 0000 [#1] PREEMPT SMP NOPTI
[   32.965727] CPU: 0 PID: 243 Comm: mount Not tainted 5.19.0+ #6
[   32.966050] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[   32.966628] RIP: 0010:mi_enum_attr+0x44/0x110
[   32.967239] Code: 89 f0 48 29 c8 48 89 c1 39 c7 0f 86 94 00 00 00 8b 56 04 83 fa 17 0f 86 88 00 00 00 89 d0 01 ca 48 01 f0 8d 4a 08 39 f9a
[   32.968101] RSP: 0018:ffffba15c06a7c38 EFLAGS: 00000283
[   32.968364] RAX: ffff956a83c76067 RBX: ffff956983c76050 RCX: 000000000000006f
[   32.968651] RDX: 0000000000000067 RSI: ffff956983c760e8 RDI: 00000000000001c8
[   32.968963] RBP: ffffba15c06a7c38 R08: 0000000000000064 R09: 00000000ffffff7f
[   32.969249] R10: 0000000000000007 R11: ffff956983c760e8 R12: ffff95698225e000
[   32.969870] R13: 0000000000000000 R14: ffffba15c06a7cd8 R15: ffff95698225e170
[   32.970655] FS:  00007fdab8189e40(0000) GS:ffff9569fdc00000(0000) knlGS:0000000000000000
[   32.971098] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   32.971378] CR2: ffff956a83c76067 CR3: 0000000002c58000 CR4: 00000000000006f0
[   32.972098] Call Trace:
[   32.972842]  <TASK>
[   32.973341]  ni_enum_attr_ex+0xda/0xf0
[   32.974087]  ntfs_iget5+0x1db/0xde0
[   32.974386]  ? slab_post_alloc_hook+0x53/0x270
[   32.974778]  ? ntfs_fill_super+0x4c7/0x12a0
[   32.975115]  ntfs_fill_super+0x5d6/0x12a0
[   32.975336]  get_tree_bdev+0x175/0x270
[   32.975709]  ? put_ntfs+0x150/0x150
[   32.975956]  ntfs_fs_get_tree+0x15/0x20
[   32.976191]  vfs_get_tree+0x2a/0xc0
[   32.976374]  ? capable+0x19/0x20
[   32.976572]  path_mount+0x484/0xaa0
[   32.977025]  ? putname+0x57/0x70
[   32.977380]  do_mount+0x80/0xa0
[   32.977555]  __x64_sys_mount+0x8b/0xe0
[   32.978105]  do_syscall_64+0x3b/0x90
[   32.978830]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[   32.979311] RIP: 0033:0x7fdab72e948a
[   32.980015] Code: 48 8b 0d 11 fa 2a 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 008
[   32.981251] RSP: 002b:00007ffd15b87588 EFLAGS: 00000206 ORIG_RAX: 00000000000000a5
[   32.981832] RAX: ffffffffffffffda RBX: 0000557de0aaf060 RCX: 00007fdab72e948a
[   32.982234] RDX: 0000557de0aaf260 RSI: 0000557de0aaf2e0 RDI: 0000557de0ab7ce0
[   32.982714] RBP: 0000000000000000 R08: 0000557de0aaf280 R09: 0000000000000020
[   32.983046] R10: 00000000c0ed0000 R11: 0000000000000206 R12: 0000557de0ab7ce0
[   32.983494] R13: 0000557de0aaf260 R14: 0000000000000000 R15: 00000000ffffffff
[   32.984094]  </TASK>
[   32.984352] Modules linked in:
[   32.984753] CR2: ffff956a83c76067
[   32.985911] ---[ end trace 0000000000000000 ]---
[   32.986555] RIP: 0010:mi_enum_attr+0x44/0x110
[   32.987217] Code: 89 f0 48 29 c8 48 89 c1 39 c7 0f 86 94 00 00 00 8b 56 04 83 fa 17 0f 86 88 00 00 00 89 d0 01 ca 48 01 f0 8d 4a 08 39 f9a
[   32.988232] RSP: 0018:ffffba15c06a7c38 EFLAGS: 00000283
[   32.988532] RAX: ffff956a83c76067 RBX: ffff956983c76050 RCX: 000000000000006f
[   32.988916] RDX: 0000000000000067 RSI: ffff956983c760e8 RDI: 00000000000001c8
[   32.989356] RBP: ffffba15c06a7c38 R08: 0000000000000064 R09: 00000000ffffff7f
[   32.989994] R10: 0000000000000007 R11: ffff956983c760e8 R12: ffff95698225e000
[   32.990415] R13: 0000000000000000 R14: ffffba15c06a7cd8 R15: ffff95698225e170
[   32.991011] FS:  00007fdab8189e40(0000) GS:ffff9569fdc00000(0000) knlGS:0000000000000000
[   32.991524] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   32.991936] CR2: ffff956a83c76067 CR3: 0000000002c58000 CR4: 00000000000006f0

This patch adds an overflow check

Signed-off-by: edward lo <edward.lo@ambergroup.io>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
18 months agofs/ntfs3: Validate BOOT record_size
edward lo [Mon, 1 Aug 2022 07:37:31 +0000 (15:37 +0800)]
fs/ntfs3: Validate BOOT record_size

[ Upstream commit 0b66046266690454dc04e6307bcff4a5605b42a1 ]

When the NTFS BOOT record_size field < 0, it represents a
shift value. However, there is no sanity check on the shift result
and the sbi->record_bits calculation through blksize_bits() assumes
the size always > 256, which could lead to NPD while mounting a
malformed NTFS image.

[  318.675159] BUG: kernel NULL pointer dereference, address: 0000000000000158
[  318.675682] #PF: supervisor read access in kernel mode
[  318.675869] #PF: error_code(0x0000) - not-present page
[  318.676246] PGD 0 P4D 0
[  318.676502] Oops: 0000 [#1] PREEMPT SMP NOPTI
[  318.676934] CPU: 0 PID: 259 Comm: mount Not tainted 5.19.0 #5
[  318.677289] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[  318.678136] RIP: 0010:ni_find_attr+0x2d/0x1c0
[  318.678656] Code: 89 ca 4d 89 c7 41 56 41 55 41 54 41 89 cc 55 48 89 fd 53 48 89 d3 48 83 ec 20 65 48 8b 04 25 28 00 00 00 48 89 44 24 180
[  318.679848] RSP: 0018:ffffa6c8c0297bd8 EFLAGS: 00000246
[  318.680104] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000080
[  318.680790] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[  318.681679] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[  318.682577] R10: 0000000000000000 R11: 0000000000000005 R12: 0000000000000080
[  318.683015] R13: ffff8d5582e68400 R14: 0000000000000100 R15: 0000000000000000
[  318.683618] FS:  00007fd9e1c81e40(0000) GS:ffff8d55fdc00000(0000) knlGS:0000000000000000
[  318.684280] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  318.684651] CR2: 0000000000000158 CR3: 0000000002e1a000 CR4: 00000000000006f0
[  318.685623] Call Trace:
[  318.686607]  <TASK>
[  318.686872]  ? ntfs_alloc_inode+0x1a/0x60
[  318.687235]  attr_load_runs_vcn+0x2b/0xa0
[  318.687468]  mi_read+0xbb/0x250
[  318.687576]  ntfs_iget5+0x114/0xd90
[  318.687750]  ntfs_fill_super+0x588/0x11b0
[  318.687953]  ? put_ntfs+0x130/0x130
[  318.688065]  ? snprintf+0x49/0x70
[  318.688164]  ? put_ntfs+0x130/0x130
[  318.688256]  get_tree_bdev+0x16a/0x260
[  318.688407]  vfs_get_tree+0x20/0xb0
[  318.688519]  path_mount+0x2dc/0x9b0
[  318.688877]  do_mount+0x74/0x90
[  318.689142]  __x64_sys_mount+0x89/0xd0
[  318.689636]  do_syscall_64+0x3b/0x90
[  318.689998]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[  318.690318] RIP: 0033:0x7fd9e133c48a
[  318.690687] Code: 48 8b 0d 11 fa 2a 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 008
[  318.691357] RSP: 002b:00007ffd374406c8 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5
[  318.691632] RAX: ffffffffffffffda RBX: 0000564d0b051080 RCX: 00007fd9e133c48a
[  318.691920] RDX: 0000564d0b051280 RSI: 0000564d0b051300 RDI: 0000564d0b0596a0
[  318.692123] RBP: 0000000000000000 R08: 0000564d0b0512a0 R09: 0000000000000020
[  318.692349] R10: 00000000c0ed0000 R11: 0000000000000202 R12: 0000564d0b0596a0
[  318.692673] R13: 0000564d0b051280 R14: 0000000000000000 R15: 00000000ffffffff
[  318.693007]  </TASK>
[  318.693271] Modules linked in:
[  318.693614] CR2: 0000000000000158
[  318.694446] ---[ end trace 0000000000000000 ]---
[  318.694779] RIP: 0010:ni_find_attr+0x2d/0x1c0
[  318.694952] Code: 89 ca 4d 89 c7 41 56 41 55 41 54 41 89 cc 55 48 89 fd 53 48 89 d3 48 83 ec 20 65 48 8b 04 25 28 00 00 00 48 89 44 24 180
[  318.696042] RSP: 0018:ffffa6c8c0297bd8 EFLAGS: 00000246
[  318.696531] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000080
[  318.698114] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[  318.699286] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[  318.699795] R10: 0000000000000000 R11: 0000000000000005 R12: 0000000000000080
[  318.700236] R13: ffff8d5582e68400 R14: 0000000000000100 R15: 0000000000000000
[  318.700973] FS:  00007fd9e1c81e40(0000) GS:ffff8d55fdc00000(0000) knlGS:0000000000000000
[  318.701688] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  318.702190] CR2: 0000000000000158 CR3: 0000000002e1a000 CR4: 00000000000006f0
[  318.726510] mount (259) used greatest stack depth: 13320 bytes left

This patch adds a sanity check.

Signed-off-by: edward lo <edward.lo@ambergroup.io>
Signed-off-by: Konstantin Komarov <almaz.alexandrovich@paragon-software.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
18 months agonvmet: don't defer passthrough commands with trivial effects to the workqueue
Christoph Hellwig [Wed, 21 Dec 2022 08:51:19 +0000 (09:51 +0100)]
nvmet: don't defer passthrough commands with trivial effects to the workqueue

[ Upstream commit 2a459f6933e1c459bffb7cc73fd6c900edc714bd ]

Mask out the "Command Supported" and "Logical Block Content Change" bits
and only defer execution of commands that have non-trivial effects to
the workqueue for synchronous execution.  This allows to execute admin
commands asynchronously on controllers that provide a Command Supported
and Effects log page, and will keep allowing to execute Write commands
asynchronously once command effects on I/O commands are taken into
account.

Fixes: c1fef73f793b ("nvmet: add passthru code to process commands")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
18 months agonvme: fix the NVME_CMD_EFFECTS_CSE_MASK definition
Christoph Hellwig [Wed, 21 Dec 2022 09:30:45 +0000 (10:30 +0100)]
nvme: fix the NVME_CMD_EFFECTS_CSE_MASK definition

[ Upstream commit 685e6311637e46f3212439ce2789f8a300e5050f ]

3 << 16 does not generate the correct mask for bits 16, 17 and 18.
Use the GENMASK macro to generate the correct mask instead.

Fixes: 84fef62d135b ("nvme: check admin passthru command effects")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Kanchan Joshi <joshi.k@samsung.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
18 months agoata: ahci: Fix PCS quirk application for suspend
Adam Vodopjan [Fri, 9 Dec 2022 09:26:34 +0000 (09:26 +0000)]
ata: ahci: Fix PCS quirk application for suspend

[ Upstream commit 37e14e4f3715428b809e4df9a9958baa64c77d51 ]

Since kernel 5.3.4 my laptop (ICH8M controller) does not see Kingston
SV300S37A60G SSD disk connected into a SATA connector on wake from
suspend.  The problem was introduced in c312ef176399 ("libata/ahci: Drop
PCS quirk for Denverton and beyond"): the quirk is not applied on wake
from suspend as it originally was.

It is worth to mention the commit contained another bug: the quirk is
not applied at all to controllers which require it. The fix commit
09d6ac8dc51a ("libata/ahci: Fix PCS quirk application") landed in 5.3.8.
So testing my patch anywhere between commits c312ef176399 and
09d6ac8dc51a is pointless.

Not all disks trigger the problem. For example nothing bad happens with
Western Digital WD5000LPCX HDD.

Test hardware:
- Acer 5920G with ICH8M SATA controller
- sda: some SATA HDD connnected into the DVD drive IDE port with a
  SATA-IDE caddy. It is a boot disk
- sdb: Kingston SV300S37A60G SSD connected into the only SATA port

Sample "dmesg --notime | grep -E '^(sd |ata)'" output on wake:

sd 0:0:0:0: [sda] Starting disk
sd 2:0:0:0: [sdb] Starting disk
ata4: SATA link down (SStatus 4 SControl 300)
ata3: SATA link down (SStatus 4 SControl 300)
ata1.00: ACPI cmd ef/03:0c:00:00:00:a0 (SET FEATURES) filtered out
ata1.00: ACPI cmd ef/03:42:00:00:00:a0 (SET FEATURES) filtered out
ata1: FORCE: cable set to 80c
ata5: SATA link down (SStatus 0 SControl 300)
ata3: SATA link down (SStatus 4 SControl 300)
ata3: SATA link down (SStatus 4 SControl 300)
ata3.00: disabled
sd 2:0:0:0: rejecting I/O to offline device
ata3.00: detaching (SCSI 2:0:0:0)
sd 2:0:0:0: [sdb] Start/Stop Unit failed: Result: hostbyte=DID_NO_CONNECT
driverbyte=DRIVER_OK
sd 2:0:0:0: [sdb] Synchronizing SCSI cache
sd 2:0:0:0: [sdb] Synchronize Cache(10) failed: Result:
hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
sd 2:0:0:0: [sdb] Stopping disk
sd 2:0:0:0: [sdb] Start/Stop Unit failed: Result: hostbyte=DID_BAD_TARGET
driverbyte=DRIVER_OK

Commit c312ef176399 dropped ahci_pci_reset_controller() which internally
calls ahci_reset_controller() and applies the PCS quirk if needed after
that. It was called each time a reset was required instead of just
ahci_reset_controller(). This patch puts the function back in place.

Fixes: c312ef176399 ("libata/ahci: Drop PCS quirk for Denverton and beyond")
Signed-off-by: Adam Vodopjan <grozzly@protonmail.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>