Linus Torvalds [Sun, 12 Dec 2021 18:07:50 +0000 (10:07 -0800)]
Merge tag 'timers-urgent-2021-12-12' of git://git./linux/kernel/git/tip/tip
Pull timer fixes from Thomas Gleixner:
"Two fixes for clock chip drivers:
- A regression fix for the Designware APB timer. A recent change to
the error checking code transformed the error condition wrongly so
it turned into a fail if good condition.
- Fix a clang build fail of the ARM architected timer driver"
* tag 'timers-urgent-2021-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clocksource/drivers/arm_arch_timer: Force inlining of erratum_set_next_event_generic()
clocksource/drivers/dw_apb_timer_of: Fix probe failure
Linus Torvalds [Sun, 12 Dec 2021 17:53:12 +0000 (09:53 -0800)]
Merge tag 'irq-urgent-2021-12-12' of git://git./linux/kernel/git/tip/tip
Pull irq fixes from Thomas Gleixner:
"A set of interrupt chip driver fixes:
- Fix the multi vector MSI allocation on Armada 370XP
- Do interrupt acknowledgement correctly in the aspeed-scu driver
- Make the IPR register offset correct in the NVIC driver
- Make redistribution table flushing correct by issueing a SYNC
command to ensure that the invalidation command has been executed
- Plug a device tree node reference leak in the bcm7210-l2 driver
- Trivial fixes in the MIPS GIC and the Apple AIC drivers"
* tag 'irq-urgent-2021-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/irq-bcm7120-l2: Add put_device() after of_find_device_by_node()
irqchip/irq-gic-v3-its.c: Force synchronisation when issuing INVALL
irqchip/apple-aic: Mark aic_init_smp() as __init
irqchip: nvic: Fix offset for Interrupt Priority Offsets
irqchip/mips-gic: Use bitfield helpers
irqchip/aspeed-scu: Replace update_bits with write_bits.
irqchip/armada-370-xp: Fix support for Multi-MSI interrupts
irqchip/armada-370-xp: Fix return value of armada_370_xp_msi_alloc()
Linus Torvalds [Sun, 12 Dec 2021 17:38:04 +0000 (09:38 -0800)]
Merge tag 'sched-urgent-2021-12-12' of git://git./linux/kernel/git/tip/tip
Pull scheduler fix from Thomas Gleixner:
"A single fix for the x86 scheduler topology:
Using cluster topology on hybrid CPUs, e.g. Alder Lake, biases the
scheduler towards the ATOM cluster as that has more total capacity.
Use selection based on CPU priority instead"
* tag 'sched-urgent-2021-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched,x86: Don't use cluster topology for x86 hybrid CPUs
Linus Torvalds [Sun, 12 Dec 2021 17:32:49 +0000 (09:32 -0800)]
Merge tag 'csky-for-linus-5.16-rc5' of git://github.com/c-sky/csky-linux
Pull csky from Guo Ren:
"Only one fix for csky: fix fpu config macro"
* tag 'csky-for-linus-5.16-rc5' of git://github.com/c-sky/csky-linux:
csky: fix typo of fpu config macro
Linus Torvalds [Sun, 12 Dec 2021 00:28:27 +0000 (16:28 -0800)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Four fixes, all in drivers.
Three are small and obvious, the qedi one is a bit larger but also
pretty obvious"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: qla2xxx: Format log strings only if needed
scsi: scsi_debug: Fix buffer size of REPORT ZONES command
scsi: qedi: Fix cmd_cleanup_cmpl counter mismatch issue
scsi: pm80xx: Do not call scsi_remove_host() in pm8001_alloc()
Linus Torvalds [Sun, 12 Dec 2021 00:21:06 +0000 (16:21 -0800)]
Merge tag 'xfs-5.16-fixes-3' of git://git./fs/xfs/xfs-linux
Pull xfs fix from Darrick Wong:
"This fixes a race between a readonly remount process and other
processes that hold a file IOLOCK on files that previously experienced
copy on write, that could result in severe filesystem corruption if
the filesystem is then remounted rw.
I think this is fairly rare (since the only reliable reproducer I have
that fits the second criteria is the experimental xfs_scrub program),
but the race is clear, so we still need to fix this.
Summary:
- Fix a data corruption vector that can result from the ro remount
process failing to clear all speculative preallocations from files
and the rw remount process not noticing the incomplete cleanup"
* tag 'xfs-5.16-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: remove all COW fork extents when remounting readonly
Linus Torvalds [Sun, 12 Dec 2021 00:14:17 +0000 (16:14 -0800)]
Merge branch 'for-5.16-fixes' of git://git./linux/kernel/git/dennis/percpu
Pull percpu fixes from Dennis Zhou:
"This contains a fix for SMP && !MMU archs for percpu which has been
tested by arm and sh. It seems in the past they have gotten away with
it due to mapping of vm functions to km functions, but this fell apart
a few releases ago and was just reported recently.
The other is just a minor dependency clean up.
I think queued up right now by Andrew is a fix in percpu that papers
of what seems to be a bug in hotplug for a special situation with
memoryless nodes. Michal Hocko is digging into it further"
* 'for-5.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu:
percpu_ref: Replace kernel.h with the necessary inclusions
percpu: km: ensure it is used with NOMMU (either UP or SMP)
Linus Torvalds [Sat, 11 Dec 2021 21:28:02 +0000 (13:28 -0800)]
Merge tag 'perf-tools-fixes-for-v5.16-2021-12-11' of git://git./linux/kernel/git/acme/linux
Pull perf tools fixes from Arnaldo Carvalho de Melo:
- Prevent out-of-bounds access to per sample registers.
- Fix NULL vs IS_ERR_OR_NULL() checking on the python binding.
- Intel PT fixes, half of those are one-liners:
- Fix some PGE (packet generation enable/control flow packets) usage.
- Fix sync state when a PSB (synchronization) packet is found.
- Fix intel_pt_fup_event() assumptions about setting state type.
- Fix state setting when receiving overflow (OVF) packet.
- Fix next 'err' value, walking trace.
- Fix missing 'instruction' events with 'q' option.
- Fix error timestamp setting on the decoder error path.
* tag 'perf-tools-fixes-for-v5.16-2021-12-11' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
perf python: Fix NULL vs IS_ERR_OR_NULL() checking
perf intel-pt: Fix error timestamp setting on the decoder error path
perf intel-pt: Fix missing 'instruction' events with 'q' option
perf intel-pt: Fix next 'err' value, walking trace
perf intel-pt: Fix state setting when receiving overflow (OVF) packet
perf intel-pt: Fix intel_pt_fup_event() assumptions about setting state type
perf intel-pt: Fix sync state when a PSB (synchronization) packet is found
perf intel-pt: Fix some PGE (packet generation enable/control flow packets) usage
perf tools: Prevent out-of-bounds access to registers
Linus Torvalds [Sat, 11 Dec 2021 17:25:07 +0000 (09:25 -0800)]
Merge tag 'block-5.16-2021-12-10' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"A few block fixes that should go into this release:
- NVMe pull request:
- set ana_log_size to 0 after freeing ana_log_buf (Hou Tao)
- show subsys nqn for duplicate cntlids (Keith Busch)
- disable namespace access for unsupported metadata (Keith
Busch)
- report write pointer for a full zone as zone start + zone len
(Niklas Cassel)
- fix use after free when disconnecting a reconnecting ctrl
(Ruozhu Li)
- fix a list corruption in nvmet-tcp (Sagi Grimberg)
- Fix for a regression on DIO single bio async IO (Pavel)
- ioprio seteuid fix (Davidlohr)
- mtd fix that subsequently got reverted as it was broken, will get
re-done and submitted for the next round
- Two MD fixes via Song (Markus, zhangyue)"
* tag 'block-5.16-2021-12-10' of git://git.kernel.dk/linux-block:
Revert "mtd_blkdevs: don't scan partitions for plain mtdblock"
block: fix ioprio_get(IOPRIO_WHO_PGRP) vs setuid(2)
md: fix double free of mddev->private in autorun_array()
md: fix update super 1.0 on rdev size change
nvmet-tcp: fix possible list corruption for unexpected command failure
block: fix single bio async DIO error handling
nvme: fix use after free when disconnecting a reconnecting ctrl
nvme-multipath: set ana_log_size to 0 after free ana_log_buf
mtd_blkdevs: don't scan partitions for plain mtdblock
nvme: report write pointer for a full zone as zone start + zone len
nvme: disable namespace access for unsupported metadata
nvme: show subsys nqn for duplicate cntlids
Linus Torvalds [Sat, 11 Dec 2021 17:19:44 +0000 (09:19 -0800)]
Merge tag 'io_uring-5.16-2021-12-10' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
"A few fixes that are all bound for stable:
- Two syzbot reports for io-wq that turned out to be separate fixes,
but ultimately very closely related
- io_uring task_work running on cancelations"
* tag 'io_uring-5.16-2021-12-10' of git://git.kernel.dk/linux-block:
io-wq: check for wq exit after adding new worker task_work
io_uring: ensure task_work gets run as part of cancelations
io-wq: remove spurious bit clear on task_work addition
Linus Torvalds [Sat, 11 Dec 2021 17:09:59 +0000 (09:09 -0800)]
Merge branch 'i2c/for-current' of git://git./linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"Two more I2C driver bugfixes"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: mpc: Use atomic read and fix break condition
i2c: virtio: fix completion handling
Linus Torvalds [Sat, 11 Dec 2021 17:06:08 +0000 (09:06 -0800)]
Merge tag 'clk-fixes-for-linus' of git://git./linux/kernel/git/clk/linux
Pull clk driver fixes from Stephen Boyd:
- Fix qcom mux logic to look at the proper parent table member. Luckily
this clk type isn't very common.
- Don't kill clks on qcom systems that use Trion PLLs that are enabled
out of the bootloader. We will simply skip programming the PLL rate
if it's already done.
- Use the proper clk_ops for the qcom sm6125 ICE clks.
- Use module_platform_driver() in i.MX as it can be a module.
- Fix a UAF in the versatile clk driver on an error path.
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: versatile: clk-icst: use after free on error path
clk: qcom: sm6125-gcc: Swap ops of ice and apps on sdcc1
clk: imx: use module_platform_driver
clk: qcom: clk-alpha-pll: Don't reconfigure running Trion
clk: qcom: regmap-mux: fix parent clock lookup
Linus Torvalds [Sat, 11 Dec 2021 16:58:04 +0000 (08:58 -0800)]
Merge tag 'devicetree-fixes-for-5.16-2' of git://git./linux/kernel/git/robh/linux
Pull devicetree fixes from Rob Herring:
- Revert schema checks on %.dtb targets. This was problematic for some
external build tools.
- A few DT binding example fixes
- Add back dropped 'enet-phy-lane-no-swap' Ethernet PHY property
- Drop erroneous if/then schema in nxp,imx7-mipi-csi2
- Add a quirk to fix some interrupt controllers use of 'interrupt-map'
* tag 'devicetree-fixes-for-5.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
Revert "kbuild: Enable DT schema checks for %.dtb targets"
dt-bindings: bq25980: Fixup the example
dt-bindings: input: gpio-keys: Fix interrupts in example
dt-bindings: net: Reintroduce PHY no lane swap binding
dt-bindings: media: nxp,imx7-mipi-csi2: Drop bad if/then schema
of/irq: Add a quirk for controllers with their own definition of interrupt-map
dt-bindings: iio: adc: exynos-adc: Fix node name in example
Linus Torvalds [Sat, 11 Dec 2021 16:46:52 +0000 (08:46 -0800)]
Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
"21 patches.
Subsystems affected by this patch series: MAINTAINERS, mailmap, and mm
(mlock, pagecache, damon, slub, memcg, hugetlb, and pagecache)"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (21 commits)
mm: bdi: initialize bdi_min_ratio when bdi is unregistered
hugetlbfs: fix issue of preallocation of gigantic pages can't work
mm/memcg: relocate mod_objcg_mlstate(), get_obj_stock() and put_obj_stock()
mm/slub: fix endianness bug for alloc/free_traces attributes
selftests/damon: split test cases
selftests/damon: test debugfs file reads/writes with huge count
selftests/damon: test wrong DAMOS condition ranges input
selftests/damon: test DAMON enabling with empty target_ids case
selftests/damon: skip test if DAMON is running
mm/damon/vaddr-test: remove unnecessary variables
mm/damon/vaddr-test: split a test function having >1024 bytes frame size
mm/damon/vaddr: remove an unnecessary warning message
mm/damon/core: remove unnecessary error messages
mm/damon/dbgfs: remove an unnecessary error message
mm/damon/core: use better timer mechanisms selection threshold
mm/damon/core: fix fake load reports due to uninterruptible sleeps
timers: implement usleep_idle_range()
filemap: remove PageHWPoison check from next_uptodate_page()
mailmap: update email address for Guo Ren
MAINTAINERS: update kdump maintainers
...
Thomas Gleixner [Sat, 11 Dec 2021 12:56:30 +0000 (13:56 +0100)]
Merge tag 'timers-v5.16-rc4' of https://git.linaro.org/people/daniel.lezcano/linux into timers/urgent
Pull timer fixes from Daniel Lezcano:
- Fix build error with clang and some kernel configuration on the
arm64 architected timer by inlining the
erratum_set_next_event_generic() function (Marc Zyngier)
- Fix probe error on the dw_apb_timer_of driver by fixing the
incorrect condition previously introduced (Alexey Sheplyakov)
Link: https://lore.kernel.org/r/429b796d-9395-4ca8-81f3-30911f80a9a9@linaro.org
Miaoqian Lin [Sat, 11 Dec 2021 05:38:53 +0000 (05:38 +0000)]
perf python: Fix NULL vs IS_ERR_OR_NULL() checking
The function trace_event__tp_format_id may return ERR_PTR(-ENOMEM). Use
IS_ERR_OR_NULL to check tp_format.
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Link: http://lore.kernel.org/lkml/20211211053856.19827-1-linmq006@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Fri, 10 Dec 2021 16:23:03 +0000 (18:23 +0200)]
perf intel-pt: Fix error timestamp setting on the decoder error path
An error timestamp shows the last known timestamp for the queue, but this
is not updated on the error path. Fix by setting it.
Fixes:
f4aa081949e7b6 ("perf tools: Add Intel PT decoder")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org # v5.15+
Link: https://lore.kernel.org/r/20211210162303.2288710-8-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Fri, 10 Dec 2021 16:23:02 +0000 (18:23 +0200)]
perf intel-pt: Fix missing 'instruction' events with 'q' option
FUP packets contain IP information, which makes them also an 'instruction'
event in 'hop' mode i.e. the itrace 'q' option. That wasn't happening, so
restructure the logic so that FUP events are added along with appropriate
'instruction' and 'branch' events.
Fixes:
7c1b16ba0e26e6 ("perf intel-pt: Add support for decoding FUP/TIP only")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org # v5.15+
Link: https://lore.kernel.org/r/20211210162303.2288710-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Fri, 10 Dec 2021 16:23:01 +0000 (18:23 +0200)]
perf intel-pt: Fix next 'err' value, walking trace
Code after label 'next:' in intel_pt_walk_trace() assumes 'err' is zero,
but it may not be, if arrived at via a 'goto'. Ensure it is zero.
Fixes:
7c1b16ba0e26e6 ("perf intel-pt: Add support for decoding FUP/TIP only")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org # v5.15+
Link: https://lore.kernel.org/r/20211210162303.2288710-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Fri, 10 Dec 2021 16:23:00 +0000 (18:23 +0200)]
perf intel-pt: Fix state setting when receiving overflow (OVF) packet
An overflow (OVF packet) is treated as an error because it represents a
loss of trace data, but there is no loss of synchronization, so the packet
state should be INTEL_PT_STATE_IN_SYNC not INTEL_PT_STATE_ERR_RESYNC.
To support that, some additional variables must be reset, and the FUP
packet that may follow OVF is treated as an FUP event.
Fixes:
f4aa081949e7b6 ("perf tools: Add Intel PT decoder")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org # v5.15+
Link: https://lore.kernel.org/r/20211210162303.2288710-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Fri, 10 Dec 2021 16:22:59 +0000 (18:22 +0200)]
perf intel-pt: Fix intel_pt_fup_event() assumptions about setting state type
intel_pt_fup_event() assumes it can overwrite the state type if there has
been an FUP event, but this is an unnecessary and unexpected constraint on
callers.
Fix by touching only the state type flags that are affected by an FUP
event.
Fixes:
a472e65fc490a ("perf intel-pt: Add decoder support for ptwrite and power event packets")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org # v5.15+
Link: https://lore.kernel.org/r/20211210162303.2288710-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Fri, 10 Dec 2021 16:22:58 +0000 (18:22 +0200)]
perf intel-pt: Fix sync state when a PSB (synchronization) packet is found
When syncing, it may be that branch packet generation is not enabled at
that point, in which case there will not immediately be a control-flow
packet, so some packets before a control flow packet turns up, get
ignored. However, the decoder is in sync as soon as a PSB is found, so
the state should be set accordingly.
Fixes:
f4aa081949e7b6 ("perf tools: Add Intel PT decoder")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org # v5.15+
Link: https://lore.kernel.org/r/20211210162303.2288710-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Fri, 10 Dec 2021 16:22:57 +0000 (18:22 +0200)]
perf intel-pt: Fix some PGE (packet generation enable/control flow packets) usage
Packet generation enable (PGE) refers to whether control flow (COFI)
packets are being produced.
PGE may be false even when branch-tracing is enabled, due to being
out-of-context, or outside a filter address range. Fix some missing PGE
usage.
Fixes:
7c1b16ba0e26e6 ("perf intel-pt: Add support for decoding FUP/TIP only")
Fixes:
839598176b0554 ("perf intel-pt: Allow decoding with branch tracing disabled")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org # v5.15+
Link: https://lore.kernel.org/r/20211210162303.2288710-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
German Gomez [Wed, 1 Dec 2021 12:33:29 +0000 (12:33 +0000)]
perf tools: Prevent out-of-bounds access to registers
The size of the cache of register values is arch-dependant
(PERF_REGS_MAX). This has the potential of causing an out-of-bounds
access in the function "perf_reg_value" if the local architecture
contains less registers than the one the perf.data file was recorded on.
Since the maximum number of registers is bound by the bitmask "u64
cache_mask", and the size of the cache when running under x86 systems is
64 already, fix the size to 64 and add a range-check to the function
"perf_reg_value" to prevent out-of-bounds access.
Reported-by: Alexandre Truong <alexandre.truong@arm.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: German Gomez <german.gomez@arm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-csky@vger.kernel.org
Cc: linux-riscv@lists.infradead.org
Link: https://lore.kernel.org/r/20211201123334.679131-2-german.gomez@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Thomas Gleixner [Sat, 11 Dec 2021 09:51:23 +0000 (10:51 +0100)]
Merge tag 'irqchip-fixes-5.16-2' of git://git./linux/kernel/git/maz/arm-platforms into irq/urgent
Pull irqchip fixes from Marc Zyngier:
- Fix Armada-370-XP Multi-MSi allocation to be aligned on the allocation
size, as required by the PCI spec
- Fix aspeed-scu interrupt acknowledgement by directly writing to the
register instead of a read-modify-write sequence
- Use standard bitfirl helpers in the MIPS GIC driver instead of custom
constructs
- Fix the NVIC driver IPR register offset
- Correctly drop the reference of the device node in the irq-bcm7120-l2
driver
- Fix the GICv3 ITS INVALL command by issueing a following SYNC command
- Add a missing __init attribute to the init function of the Apple AIC
driver
Link: https://lore.kernel.org/r/20211210133516.664497-1-maz@kernel.org
Linus Torvalds [Sat, 11 Dec 2021 01:28:02 +0000 (17:28 -0800)]
Merge tag 'for-5.16-rc4-tag' of git://git./linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
"A few more regression fixes and stable patches, mostly one-liners.
Regression fixes:
- fix pointer/ERR_PTR mismatch returned from memdup_user
- reset dedicated zoned mode relocation block group to avoid using it
and filling it without any recourse
Fixes:
- handle a case to FITRIM range (also to make fstests/generic/260
work)
- fix warning when extent buffer state and pages get out of sync
after an IO error
- fix transaction abort when syncing due to missing mapping error set
on metadata inode after inlining a compressed file
- fix transaction abort due to tree-log and zoned mode interacting in
an unexpected way
- fix memory leak of additional extent data when qgroup reservation
fails
- do proper handling of slot search call when deleting root refs"
* tag 'for-5.16-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: replace the BUG_ON in btrfs_del_root_ref with proper error handling
btrfs: zoned: clear data relocation bg on zone finish
btrfs: free exchange changeset on failures
btrfs: fix re-dirty process of tree-log nodes
btrfs: call mapping_set_error() on btree inode with a write error
btrfs: clear extent buffer uptodate when we fail to write it
btrfs: fail if fstrim_range->start == U64_MAX
btrfs: fix error pointer dereference in btrfs_ioctl_rm_dev_v2()
Linus Torvalds [Sat, 11 Dec 2021 01:24:57 +0000 (17:24 -0800)]
Merge tag '5.16-rc4-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
"Two cifs/smb3 fixes - one for stable, the other fixes a recently
reported NTLMSSP auth problem"
* tag '5.16-rc4-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: fix ntlmssp auth when there is no key exchange
cifs: Fix crash on unload of cifs_arc4.ko
Linus Torvalds [Sat, 11 Dec 2021 01:17:53 +0000 (17:17 -0800)]
Merge tag 'nfsd-5.16-2' of git://linux-nfs.org/~bfields/linux
Pull nfsd fixes from Bruce Fields:
"Fix a race on startup and another in the delegation code.
The latter has been around for years, but I suspect recent changes may
have widened the race window a little, so I'd like to go ahead and get
it in"
* tag 'nfsd-5.16-2' of git://linux-nfs.org/~bfields/linux:
nfsd: fix use-after-free due to delegation race
nfsd: Fix nsfd startup race (again)
Manjong Lee [Fri, 10 Dec 2021 22:47:11 +0000 (14:47 -0800)]
mm: bdi: initialize bdi_min_ratio when bdi is unregistered
Initialize min_ratio if it is set during bdi unregistration. This can
prevent problems that may occur a when bdi is removed without resetting
min_ratio.
For example.
1) insert external sdcard
2) set external sdcard's min_ratio 70
3) remove external sdcard without setting min_ratio 0
4) insert external sdcard
5) set external sdcard's min_ratio 70 << error occur(can't set)
Because when an sdcard is removed, the present bdi_min_ratio value will
remain. Currently, the only way to reset bdi_min_ratio is to reboot.
[akpm@linux-foundation.org: tweak comment and coding style]
Link: https://lkml.kernel.org/r/20211021161942.5983-1-mj0123.lee@samsung.com
Signed-off-by: Manjong Lee <mj0123.lee@samsung.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Changheun Lee <nanich.lee@samsung.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: <seunghwan.hyun@samsung.com>
Cc: <sookwan7.kim@samsung.com>
Cc: <yt0928.kim@samsung.com>
Cc: <junho89.kim@samsung.com>
Cc: <jisoo2146.oh@samsung.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Zhenguo Yao [Fri, 10 Dec 2021 22:47:08 +0000 (14:47 -0800)]
hugetlbfs: fix issue of preallocation of gigantic pages can't work
Preallocation of gigantic pages can't work bacause of commit
b5389086ad7b ("hugetlbfs: extend the definition of hugepages parameter
to support node allocation"). When nid is NUMA_NO_NODE(-1),
alloc_bootmem_huge_page will always return without doing allocation.
Fix this by adding more check.
Link: https://lkml.kernel.org/r/20211129133803.15653-1-yaozhenguo1@gmail.com
Fixes:
b5389086ad7b ("hugetlbfs: extend the definition of hugepages parameter to support node allocation")
Signed-off-by: Zhenguo Yao <yaozhenguo1@gmail.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Tested-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Waiman Long [Fri, 10 Dec 2021 22:47:05 +0000 (14:47 -0800)]
mm/memcg: relocate mod_objcg_mlstate(), get_obj_stock() and put_obj_stock()
All the calls to mod_objcg_mlstate(), get_obj_stock() and
put_obj_stock() are done by functions defined within the same "#ifdef
CONFIG_MEMCG_KMEM" compilation block. When CONFIG_MEMCG_KMEM isn't
defined, the following compilation warnings will be issued [1] and [2].
mm/memcontrol.c:785:20: warning: unused function 'mod_objcg_mlstate'
mm/memcontrol.c:2113:33: warning: unused function 'get_obj_stock'
Fix these warning by moving those functions to under the same
CONFIG_MEMCG_KMEM compilation block. There is no functional change.
[1] https://lore.kernel.org/lkml/
202111272014.WOYNLUV6-lkp@intel.com/
[2] https://lore.kernel.org/lkml/
202111280551.LXsWYt1T-lkp@intel.com/
Link: https://lkml.kernel.org/r/20211129161140.306488-1-longman@redhat.com
Fixes:
559271146efc ("mm/memcg: optimize user context object stock access")
Fixes:
68ac5b3c8db2 ("mm/memcg: cache vmstat data in percpu memcg_stock_pcp")
Signed-off-by: Waiman Long <longman@redhat.com>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Gerald Schaefer [Fri, 10 Dec 2021 22:47:02 +0000 (14:47 -0800)]
mm/slub: fix endianness bug for alloc/free_traces attributes
On big-endian s390, the alloc/free_traces attributes produce endless
output, because of always 0 idx in slab_debugfs_show().
idx is de-referenced from *v, which points to a loff_t value, with
unsigned int idx = *(unsigned int *)v;
This will only give the upper 32 bits on big-endian, which remain 0.
Instead of only fixing this de-reference, during discussion it seemed
more appropriate to change the seq_ops so that they use an explicit
iterator in private loc_track struct.
This patch adds idx to loc_track, which will also fix the endianness
bug.
Link: https://lore.kernel.org/r/20211117193932.4049412-1-gerald.schaefer@linux.ibm.com
Link: https://lkml.kernel.org/r/20211126171848.17534-1-gerald.schaefer@linux.ibm.com
Fixes:
64dd68497be7 ("mm: slub: move sysfs slab alloc/free interfaces to debugfs")
Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Reported-by: Steffen Maier <maier@linux.ibm.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Faiyaz Mohammed <faiyazm@codeaurora.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
SeongJae Park [Fri, 10 Dec 2021 22:46:59 +0000 (14:46 -0800)]
selftests/damon: split test cases
Currently, the single test program, debugfs.sh, contains all test cases
for DAMON. When one of the cases fails, finding which case is failed
from the test log is not so easy, and all remaining tests will be
skipped. To improve the situation, this commit splits the single
program into small test programs having their own names.
Link: https://lkml.kernel.org/r/20211201150440.1088-12-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
SeongJae Park [Fri, 10 Dec 2021 22:46:55 +0000 (14:46 -0800)]
selftests/damon: test debugfs file reads/writes with huge count
DAMON debugfs interface users were able to trigger warning by writing
some files with arbitrarily large 'count' parameter. The issue is fixed
with commit
db7a347b26fe ("mm/damon/dbgfs: use '__GFP_NOWARN' for
user-specified size buffer allocation"). This commit adds a test case
for the issue in DAMON selftests to avoid future regressions.
Link: https://lkml.kernel.org/r/20211201150440.1088-11-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
SeongJae Park [Fri, 10 Dec 2021 22:46:52 +0000 (14:46 -0800)]
selftests/damon: test wrong DAMOS condition ranges input
A patch titled "mm/damon/schemes: add the validity judgment of
thresholds"[1] makes DAMON debugfs interface to validate DAMON scheme
inputs. This commit adds a test case for the validation logic in DAMON
selftests.
[1] https://lore.kernel.org/linux-mm/
d78360e52158d786fcbf20bc62c96785742e76d3.
1637239568.git.xhao@linux.alibaba.com/
Link: https://lkml.kernel.org/r/20211201150440.1088-10-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
SeongJae Park [Fri, 10 Dec 2021 22:46:49 +0000 (14:46 -0800)]
selftests/damon: test DAMON enabling with empty target_ids case
DAMON debugfs didn't check empty targets when starting monitoring, and
the issue is fixed with commit
b5ca3e83ddb0 ("mm/damon/dbgfs: add
adaptive_targets list check before enable monitor_on"). To avoid future
regression, this commit adds a test case for that in DAMON selftests.
Link: https://lkml.kernel.org/r/20211201150440.1088-9-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
SeongJae Park [Fri, 10 Dec 2021 22:46:46 +0000 (14:46 -0800)]
selftests/damon: skip test if DAMON is running
Testing the DAMON debugfs files while DAMON is running makes no sense,
as any write to the debugfs files will fail. This commit makes the test
be skipped in this case.
Link: https://lkml.kernel.org/r/20211201150440.1088-8-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
SeongJae Park [Fri, 10 Dec 2021 22:46:43 +0000 (14:46 -0800)]
mm/damon/vaddr-test: remove unnecessary variables
A couple of test functions in DAMON virtual address space monitoring
primitives implementation has unnecessary damon_ctx variables. This
commit removes those.
Link: https://lkml.kernel.org/r/20211201150440.1088-7-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
SeongJae Park [Fri, 10 Dec 2021 22:46:40 +0000 (14:46 -0800)]
mm/damon/vaddr-test: split a test function having >1024 bytes frame size
On some configuration[1], 'damon_test_split_evenly()' kunit test
function has >1024 bytes frame size, so below build warning is
triggered:
CC mm/damon/vaddr.o
In file included from mm/damon/vaddr.c:672:
mm/damon/vaddr-test.h: In function 'damon_test_split_evenly':
mm/damon/vaddr-test.h:309:1: warning: the frame size of 1064 bytes is larger than 1024 bytes [-Wframe-larger-than=]
309 | }
| ^
This commit fixes the warning by separating the common logic in the
function.
[1] https://lore.kernel.org/linux-mm/
202111182146.OV3C4uGr-lkp@intel.com/
Link: https://lkml.kernel.org/r/20211201150440.1088-6-sj@kernel.org
Fixes:
17ccae8bb5c9 ("mm/damon: add kunit tests")
Signed-off-by: SeongJae Park <sj@kernel.org>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
SeongJae Park [Fri, 10 Dec 2021 22:46:37 +0000 (14:46 -0800)]
mm/damon/vaddr: remove an unnecessary warning message
The DAMON virtual address space monitoring primitive prints a warning
message for wrong DAMOS action. However, it is not essential as the
code returns appropriate failure in the case. This commit removes the
message to make the log clean.
Link: https://lkml.kernel.org/r/20211201150440.1088-5-sj@kernel.org
Fixes:
6dea8add4d28 ("mm/damon/vaddr: support DAMON-based Operation Schemes")
Signed-off-by: SeongJae Park <sj@kernel.org>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
SeongJae Park [Fri, 10 Dec 2021 22:46:34 +0000 (14:46 -0800)]
mm/damon/core: remove unnecessary error messages
DAMON core prints error messages when damon_target object creation is
failed or wrong monitoring attributes are given. Because appropriate
error code is returned for each case, the messages are not essential.
Also, because the code path can be triggered with user-specified input,
this could result in kernel log mistakenly being messy. To avoid the
case, this commit removes the messages.
Link: https://lkml.kernel.org/r/20211201150440.1088-4-sj@kernel.org
Fixes:
4bc05954d007 ("mm/damon: implement a debugfs-based user space interface")
Fixes:
b9a6ac4e4ede ("mm/damon: adaptively adjust regions")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: kernel test robot <lkp@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
SeongJae Park [Fri, 10 Dec 2021 22:46:31 +0000 (14:46 -0800)]
mm/damon/dbgfs: remove an unnecessary error message
When wrong scheme action is requested via the debugfs interface, DAMON
prints an error message. Because the function returns error code, this
is not really needed. Because the code path is triggered by the user
specified input, this can result in kernel log mistakenly being messy.
To avoid the case, this commit removes the message.
Link: https://lkml.kernel.org/r/20211201150440.1088-3-sj@kernel.org
Fixes:
af122dd8f3c0 ("mm/damon/dbgfs: support DAMON-based Operation Schemes")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
SeongJae Park [Fri, 10 Dec 2021 22:46:28 +0000 (14:46 -0800)]
mm/damon/core: use better timer mechanisms selection threshold
Patch series "mm/damon: Trivial fixups and improvements".
This patchset contains trivial fixups and improvements for DAMON and its
kunit/kselftest tests.
This patch (of 11):
DAMON is using hrtimer if requested sleep time is <=100ms, while the
suggested threshold[1] is <=20ms. This commit applies the threshold.
[1] Documentation/timers/timers-howto.rst
Link: https://lkml.kernel.org/r/20211201150440.1088-2-sj@kernel.org
Fixes:
ee801b7dd7822 ("mm/damon/schemes: activate schemes based on a watermarks mechanism")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
SeongJae Park [Fri, 10 Dec 2021 22:46:25 +0000 (14:46 -0800)]
mm/damon/core: fix fake load reports due to uninterruptible sleeps
Because DAMON sleeps in uninterruptible mode, /proc/loadavg reports fake
load while DAMON is turned on, though it is doing nothing. This can
confuse users[1]. To avoid the case, this commit makes DAMON sleeps in
idle mode.
[1] https://lore.kernel.org/all/
11868371.O9o76ZdvQC@natalenko.name/
Link: https://lkml.kernel.org/r/20211126145015.15862-3-sj@kernel.org
Fixes:
2224d8485492 ("mm: introduce Data Access MONitor (DAMON)")
Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Signed-off-by: SeongJae Park <sj@kernel.org>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
SeongJae Park [Fri, 10 Dec 2021 22:46:22 +0000 (14:46 -0800)]
timers: implement usleep_idle_range()
Patch series "mm/damon: Fix fake /proc/loadavg reports", v3.
This patchset fixes DAMON's fake load report issue. The first patch
makes yet another variant of usleep_range() for this fix, and the second
patch fixes the issue of DAMON by making it using the newly introduced
function.
This patch (of 2):
Some kernel threads such as DAMON could need to repeatedly sleep in
micro seconds level. Because usleep_range() sleeps in uninterruptible
state, however, such threads would make /proc/loadavg reports fake load.
To help such cases, this commit implements a variant of usleep_range()
called usleep_idle_range(). It is same to usleep_range() but sets the
state of the current task as TASK_IDLE while sleeping.
Link: https://lkml.kernel.org/r/20211126145015.15862-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20211126145015.15862-2-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Cc: John Stultz <john.stultz@linaro.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Matthew Wilcox (Oracle) [Fri, 10 Dec 2021 22:46:18 +0000 (14:46 -0800)]
filemap: remove PageHWPoison check from next_uptodate_page()
Pages are individually marked as suffering from hardware poisoning.
Checking that the head page is not hardware poisoned doesn't make
sense; we might be after a subpage. We check each page individually
before we use it, so this was an optimisation gone wrong. It will
cause us to fall back to the slow path when there was no need to do
that
Link: https://lkml.kernel.org/r/20211120174429.2596303-1-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Guo Ren [Fri, 10 Dec 2021 22:46:15 +0000 (14:46 -0800)]
mailmap: update email address for Guo Ren
The ren_guo@c-sky.com would be deprecated and use guoren@kernel.org as the
main email address.
Link: https://lkml.kernel.org/r/20211123022741.545541-1-guoren@kernel.org
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dave Young [Fri, 10 Dec 2021 22:46:12 +0000 (14:46 -0800)]
MAINTAINERS: update kdump maintainers
Remove myself from kdump maintainers as I have no enough time to maintain
it now. But I can review patches on demand though.
Link: https://lkml.kernel.org/r/YZyKilzKFsWJYdgn@dhcp-128-65.nay.redhat.com
Signed-off-by: Dave Young <dyoung@redhat.com>
Acked-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Drew DeVault [Fri, 10 Dec 2021 22:46:09 +0000 (14:46 -0800)]
Increase default MLOCK_LIMIT to 8 MiB
This limit has not been updated since 2008, when it was increased to 64
KiB at the request of GnuPG. Until recently, the main use-cases for this
feature were (1) preventing sensitive memory from being swapped, as in
GnuPG's use-case; and (2) real-time use-cases. In the first case, little
memory is called for, and in the second case, the user is generally in a
position to increase it if they need more.
The introduction of IOURING_REGISTER_BUFFERS adds a third use-case:
preparing fixed buffers for high-performance I/O. This use-case will take
as much of this memory as it can get, but is still limited to 64 KiB by
default, which is very little. This increases the limit to 8 MB, which
was chosen fairly arbitrarily as a more generous, but still conservative,
default value.
It is also possible to raise this limit in userspace. This is easily
done, for example, in the use-case of a network daemon: systemd, for
instance, provides for this via LimitMEMLOCK in the service file; OpenRC
via the rc_ulimit variables. However, there is no established userspace
facility for configuring this outside of daemons: end-user applications do
not presently have access to a convenient means of raising their limits.
The buck, as it were, stops with the kernel. It's much easier to address
it here than it is to bring it to hundreds of distributions, and it can
only realistically be relied upon to be high-enough by end-user software
if it is more-or-less ubiquitous. Most distros don't change this
particular rlimit from the kernel-supplied default value, so a change here
will easily provide that ubiquity.
Link: https://lkml.kernel.org/r/20211028080813.15966-1-sir@cmpwn.com
Signed-off-by: Drew DeVault <sir@cmpwn.com>
Acked-by: Jens Axboe <axboe@kernel.dk>
Acked-by: Cyril Hrubis <chrubis@suse.cz>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Pavel Begunkov <asml.silence@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Andrew Dona-Couch <andrew@donacou.ch>
Cc: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sat, 11 Dec 2021 01:02:46 +0000 (17:02 -0800)]
Merge tag 'thermal-5.16-rc5' of git://git./linux/kernel/git/rafael/linux-pm
Pull thermal control fix from Rafael Wysocki:
"Fix the definition of one of the Tiger Lake MMIO registers in the
int340x thermal driver (Sumeet Pawnikar)"
* tag 'thermal-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
thermal: int340x: Fix VCoRefLow MMIO bit offset for TGL
Linus Torvalds [Fri, 10 Dec 2021 22:43:16 +0000 (14:43 -0800)]
Merge tag 'acpi-5.16-rc5' of git://git./linux/kernel/git/rafael/linux-pm
Pull ACPI fix from Rafael Wysocki:
"Create the output directory for the ACPI tools during build if it has
not been present before and prevent the compilation from failing in
that case (Chen Yu)"
* tag 'acpi-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: tools: Fix compilation when output directory is not present
Linus Torvalds [Fri, 10 Dec 2021 22:36:06 +0000 (14:36 -0800)]
Merge tag 'pm-5.16-rc5' of git://git./linux/kernel/git/rafael/linux-pm
Pull power management fix from Rafael Wysocki:
"Fix a kernedoc comment that doesn't match the behavior of the function
documented by it"
* tag 'pm-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM: runtime: Fix pm_runtime_active() kerneldoc comment
Linus Torvalds [Fri, 10 Dec 2021 22:31:45 +0000 (14:31 -0800)]
Merge tag 'hwmon-for-v5.16-rc5' of git://git./linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:
- In the pwm-fan driver, ensure that the internal pwm state matches the
state assumed by the pwm code.
- Avoid EREMOTEIO errors in sht4 driver
- In the nct6775 driver, make it explicit that the register value
passed to nct6775_asuswmi_read() is an 8-bit value
- Avoid WARNing in dell-smm driver removal after failing to create
/proc/i8k
- Stop using a plain integer as NULL pointer in corsair-psu driver
* tag 'hwmon-for-v5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (pwm-fan) Ensure the fan going on in .probe()
hwmon: (sht4x) Fix EREMOTEIO errors
hwmon: (nct6775) mask out bank number in nct6775_wmi_read_value()
hwmon: (dell-smm) Fix warning on /proc/i8k creation error
hwmon: (corsair-psu) fix plain integer used as NULL pointer
Linus Torvalds [Fri, 10 Dec 2021 22:24:05 +0000 (14:24 -0800)]
Merge tag 'trace-v5.16-rc4' of git://git./linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
"Tracing, ftrace and tracefs fixes:
- Have tracefs honor the gid mount option
- Have new files in tracefs inherit the parent ownership
- Have direct_ops unregister when it has no more functions
- Properly clean up the ops when unregistering multi direct ops
- Add a sample module to test the multiple direct ops
- Fix memory leak in error path of __create_synth_event()"
* tag 'trace-v5.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Fix possible memory leak in __create_synth_event() error path
ftrace/samples: Add module to test multi direct modify interface
ftrace: Add cleanup to unregister_ftrace_direct_multi
ftrace: Use direct_ops hash in unregister_ftrace_direct
tracefs: Set all files to the same group ownership as the mount option
tracefs: Have new files inherit the ownership of their parent
Linus Torvalds [Fri, 10 Dec 2021 22:15:39 +0000 (14:15 -0800)]
Merge tag 'aio-poll-for-linus' of git://git./linux/kernel/git/ebiggers/linux
Pull aio poll fixes from Eric Biggers:
"Fix three bugs in aio poll, and one issue with POLLFREE more broadly:
- aio poll didn't handle POLLFREE, causing a use-after-free.
- aio poll could block while the file is ready.
- aio poll called eventfd_signal() when it isn't allowed.
- POLLFREE didn't handle multiple exclusive waiters correctly.
This has been tested with the libaio test suite, as well as with test
programs I wrote that reproduce the first two bugs. I am sending this
pull request myself as no one seems to be maintaining this code"
* tag 'aio-poll-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
aio: Fix incorrect usage of eventfd_signal_allowed()
aio: fix use-after-free due to missing POLLFREE handling
aio: keep poll requests on waitqueue until completed
signalfd: use wake_up_pollfree()
binder: use wake_up_pollfree()
wait: add wake_up_pollfree()
Linus Torvalds [Fri, 10 Dec 2021 22:09:12 +0000 (14:09 -0800)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"More x86 fixes:
- Logic bugs in CR0 writes and Hyper-V hypercalls
- Don't use Enlightened MSR Bitmap for L3
- Remove user-triggerable WARN
Plus a few selftest fixes and a regression test for the
user-triggerable WARN"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
selftests: KVM: Add test to verify KVM doesn't explode on "bad" I/O
KVM: x86: Don't WARN if userspace mucks with RCX during string I/O exit
KVM: X86: Raise #GP when clearing CR0_PG in 64 bit mode
selftests: KVM: avoid failures due to reserved HyperTransport region
KVM: x86: Ignore sparse banks size for an "all CPUs", non-sparse IPI req
KVM: x86: Wait for IPIs to be delivered when handling Hyper-V TLB flush hypercall
KVM: x86: selftests: svm_int_ctl_test: fix intercept calculation
KVM: nVMX: Don't use Enlightened MSR Bitmap for L3
Chris Packham [Tue, 7 Dec 2021 04:21:44 +0000 (17:21 +1300)]
i2c: mpc: Use atomic read and fix break condition
Maxime points out that the polling code in mpc_i2c_isr should use the
_atomic API because it is called in an irq context and that the
behaviour of the MCF bit is that it is 1 when the byte transfer is
complete. All of this means the original code was effectively a
udelay(100).
Fix this by using readb_poll_timeout_atomic() and removing the negation
of the break condition.
Fixes:
4a8ac5e45cda ("i2c: mpc: Poll for MCF")
Reported-by: Maxime Bizon <mbizon@freebox.fr>
Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Tested-by: Maxime Bizon <mbizon@freebox.fr>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
Jens Axboe [Fri, 10 Dec 2021 15:29:30 +0000 (08:29 -0700)]
io-wq: check for wq exit after adding new worker task_work
We check IO_WQ_BIT_EXIT before attempting to create a new worker, and
wq exit cancels pending work if we have any. But it's possible to have
a race between the two, where creation checks exit finding it not set,
but we're in the process of exiting. The exit side will cancel pending
creation task_work, but there's a gap where we add task_work after we've
canceled existing creations at exit time.
Fix this by checking the EXIT bit post adding the creation task_work.
If it's set, run the same cancelation that exit does.
Reported-and-tested-by: syzbot+b60c982cb0efc5e05a47@syzkaller.appspotmail.com
Reviewed-by: Hao Xu <haoxu@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Thu, 9 Dec 2021 15:54:29 +0000 (08:54 -0700)]
io_uring: ensure task_work gets run as part of cancelations
If we successfully cancel a work item but that work item needs to be
processed through task_work, then we can be sleeping uninterruptibly
in io_uring_cancel_generic() and never process it. Hence we don't
make forward progress and we end up with an uninterruptible sleep
warning.
While in there, correct a comment that should be IFF, not IIF.
Reported-and-tested-by: syzbot+21e6887c0be14181206d@syzkaller.appspotmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Linus Torvalds [Fri, 10 Dec 2021 19:56:05 +0000 (11:56 -0800)]
Merge tag 'pci-v5.16-fixes-2' of git://git./linux/kernel/git/helgaas/pci
Pull PCI fixes from Bjorn Helgaas:
- Revert emulation of Marvell Armada A3720 expansion ROM because it
doesn't work as expected (Marek Behún)
- Assert PERST# in Apple M1 driver to fix initialization when booting
from bootloaders using PCIe, such as U-Boot (Marc Zyngier)
- Describe PERST# as active low in Apple T8103 DT and update driver to
match (Marc Zyngier)
* tag 'pci-v5.16-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI: apple: Fix PERST# polarity
arm64: dts: apple: t8103: Mark PCIe PERST# polarity active low in DT
PCI: apple: Follow the PCIe specifications when resetting the port
Revert "PCI: aardvark: Fix support for PCI_ROM_ADDRESS1 on emulated bridge"
Linus Torvalds [Fri, 10 Dec 2021 19:50:21 +0000 (11:50 -0800)]
Merge tag 'mmc-v5.16-rc3' of git://git./linux/kernel/git/ulfh/mmc
Pull MMC host fixes from Ulf Hansson:
- mtk-sd: Fix memory leak during tuning
- renesas_sdhi: Initialize variable properly when tuning
* tag 'mmc-v5.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: mediatek: free the ext_csd when mmc_get_ext_csd success
mmc: renesas_sdhi: initialize variable properly when tuning
Linus Torvalds [Fri, 10 Dec 2021 19:46:53 +0000 (11:46 -0800)]
Merge tag 'libata-5.16-rc5' of git://git./linux/kernel/git/dlemoal/libata
Pull libata fixes from Damien Le Moal:
- Fix a sparse warning in the ahci_ceva driver (me)
- Disable the ASMedia 1092 non-functional device (Hannes)
* tag 'libata-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
libata: add horkage for ASMedia 1092
ata: ahci_ceva: Fix id array access in ceva_ahci_read_id()
Linus Torvalds [Fri, 10 Dec 2021 19:43:00 +0000 (11:43 -0800)]
Merge tag 'sound-5.16-rc5' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Another collection of small fixes. It's still not quite calm yet, but
nothing looks scary.
ALSA core got a few fixes for covering the issues detected by fuzzer
and the 32bit compat problem of control API, while the rest are all
device-specific small fixes, including the continued fixes for Tegra"
* tag 'sound-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (23 commits)
ALSA: hda/realtek - Add headset Mic support for Lenovo ALC897 platform
ALSA: usb-audio: Reorder snd_djm_devices[] entries
ALSA: hda/realtek: Fix quirk for TongFang PHxTxX1
ALSA: ctl: Fix copy of updated id with element read/write
ALSA: pcm: oss: Handle missing errors in snd_pcm_oss_change_params*()
ALSA: pcm: oss: Limit the period size to 16MB
ALSA: pcm: oss: Fix negative period/buffer sizes
ASoC: codecs: wsa881x: fix return values from kcontrol put
ASoC: codecs: wcd934x: return correct value from mixer put
ASoC: codecs: wcd934x: handle channel mappping list correctly
ASoC: qdsp6: q6routing: Fix return value from msm_routing_put_audio_mixer
ASoC: SOF: Intel: Retry codec probing if it fails
ASoC: amd: fix uninitialized variable in snd_acp6x_probe()
ASoC: rockchip: i2s_tdm: Dup static DAI template
ASoC: rt5682s: Fix crash due to out of scope stack vars
ASoC: rt5682: Fix crash due to out of scope stack vars
ASoC: tegra: Use normal system sleep for ADX
ASoC: tegra: Use normal system sleep for AMX
ASoC: tegra: Use normal system sleep for Mixer
ASoC: tegra: Use normal system sleep for MVC
...
Linus Torvalds [Fri, 10 Dec 2021 19:29:53 +0000 (11:29 -0800)]
Merge tag 'drm-fixes-2021-12-10' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Regular fixes, pretty small overall, couple of core fixes, two i915
and two amdgpu, hopefully it stays this quiet.
ttm:
- fix ttm_bo_swapout
syncobj:
- fix fence find bug with signalled fences
i915:
- fix error pointer deref in gem execbuffer
- fix for GT init with GuC/HuC on ICL
amdgpu:
- DPIA fix
- eDP fix"
* tag 'drm-fixes-2021-12-10' of git://anongit.freedesktop.org/drm/drm:
drm/i915/gen11: Moving WAs to icl_gt_workarounds_init()
drm/amd/display: prevent reading unitialized links
drm/amd/display: Fix DPIA outbox timeout after S3/S4/reset
drm/i915: Fix error pointer dereference in i915_gem_do_execbuffer()
drm/syncobj: Deal with signalled fences in drm_syncobj_find_fence.
drm/ttm: fix ttm_bo_swapout
Jens Axboe [Fri, 10 Dec 2021 18:52:34 +0000 (11:52 -0700)]
Revert "mtd_blkdevs: don't scan partitions for plain mtdblock"
This reverts commit
776b54e97a7d993ba23696e032426d5dea5bbe70.
Looks like a last minute edit snuck into this patch, and as a result,
it doesn't even compile. Revert the change for now.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Davidlohr Bueso [Fri, 10 Dec 2021 18:20:58 +0000 (10:20 -0800)]
block: fix ioprio_get(IOPRIO_WHO_PGRP) vs setuid(2)
do_each_pid_thread(PIDTYPE_PGID) can race with a concurrent
change_pid(PIDTYPE_PGID) that can move the task from one hlist
to another while iterating. Serialize ioprio_get to take
the tasklist_lock in this case, just like it's set counterpart.
Fixes:
d69b78ba1de (ioprio: grab rcu_read_lock in sys_ioprio_{set,get}())
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Link: https://lore.kernel.org/r/20211210182058.43417-1-dave@stgolabs.net
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Fri, 10 Dec 2021 17:19:52 +0000 (10:19 -0700)]
Merge branch 'md-fixes' of https://git./linux/kernel/git/song/md into block-5.16
Pull MD fixes from Song.
* 'md-fixes' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md:
md: fix double free of mddev->private in autorun_array()
md: fix update super 1.0 on rdev size change
zhangyue [Tue, 16 Nov 2021 02:35:26 +0000 (10:35 +0800)]
md: fix double free of mddev->private in autorun_array()
In driver/md/md.c, if the function autorun_array() is called,
the problem of double free may occur.
In function autorun_array(), when the function do_md_run() returns an
error, the function do_md_stop() will be called.
The function do_md_run() called function md_run(), but in function
md_run(), the pointer mddev->private may be freed.
The function do_md_stop() called the function __md_stop(), but in
function __md_stop(), the pointer mddev->private also will be freed
without judging null.
At this time, the pointer mddev->private will be double free, so it
needs to be judged null or not.
Signed-off-by: zhangyue <zhangyue1@kylinos.cn>
Signed-off-by: Song Liu <songliubraving@fb.com>
Markus Hochholdinger [Tue, 16 Nov 2021 10:21:35 +0000 (10:21 +0000)]
md: fix update super 1.0 on rdev size change
The superblock of version 1.0 doesn't get moved to the new position on a
device size change. This leads to a rdev without a superblock on a known
position, the raid can't be re-assembled.
The line was removed by mistake and is re-added by this patch.
Fixes:
d9c0fa509eaf ("md: fix max sectors calculation for super 1.0")
Cc: stable@vger.kernel.org
Signed-off-by: Markus Hochholdinger <markus@hochholdinger.net>
Reviewed-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
J. Bruce Fields [Mon, 29 Nov 2021 20:08:00 +0000 (15:08 -0500)]
nfsd: fix use-after-free due to delegation race
A delegation break could arrive as soon as we've called vfs_setlease. A
delegation break runs a callback which immediately (in
nfsd4_cb_recall_prepare) adds the delegation to del_recall_lru. If we
then exit nfs4_set_delegation without hashing the delegation, it will be
freed as soon as the callback is done with it, without ever being
removed from del_recall_lru.
Symptoms show up later as use-after-free or list corruption warnings,
usually in the laundromat thread.
I suspect
aba2072f4523 "nfsd: grant read delegations to clients holding
writes" made this bug easier to hit, but I looked as far back as v3.0
and it looks to me it already had the same problem. So I'm not sure
where the bug was introduced; it may have been there from the beginning.
Cc: stable@vger.kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Alexander Sverdlin [Tue, 7 Dec 2021 14:00:39 +0000 (15:00 +0100)]
nfsd: Fix nsfd startup race (again)
Commit
bd5ae9288d64 ("nfsd: register pernet ops last, unregister first")
has re-opened rpc_pipefs_event() race against nfsd_net_id registration
(register_pernet_subsys()) which has been fixed by commit
bb7ffbf29e76
("nfsd: fix nsfd startup race triggering BUG_ON").
Restore the order of register_pernet_subsys() vs register_cld_notifier().
Add WARN_ON() to prevent a future regression.
Crash info:
Unable to handle kernel NULL pointer dereference at virtual address
0000000000000012
CPU: 8 PID: 345 Comm: mount Not tainted 5.4.144-... #1
pc : rpc_pipefs_event+0x54/0x120 [nfsd]
lr : rpc_pipefs_event+0x48/0x120 [nfsd]
Call trace:
rpc_pipefs_event+0x54/0x120 [nfsd]
blocking_notifier_call_chain
rpc_fill_super
get_tree_keyed
rpc_fs_get_tree
vfs_get_tree
do_mount
ksys_mount
__arm64_sys_mount
el0_svc_handler
el0_svc
Fixes:
bd5ae9288d64 ("nfsd: register pernet ops last, unregister first")
Cc: stable@vger.kernel.org
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Marc Zyngier [Wed, 17 Nov 2021 11:35:32 +0000 (11:35 +0000)]
clocksource/drivers/arm_arch_timer: Force inlining of erratum_set_next_event_generic()
With some specific kernel configuration and Clang, the kernel fails
to like with something like:
ld.lld: error: undefined symbol: __compiletime_assert_200
>>> referenced by arch_timer.h:156 (./arch/arm64/include/asm/arch_timer.h:156)
>>> clocksource/arm_arch_timer.o:(erratum_set_next_event_generic) in archive drivers/built-in.a
ld.lld: error: undefined symbol: __compiletime_assert_197
>>> referenced by arch_timer.h:133 (./arch/arm64/include/asm/arch_timer.h:133)
>>> clocksource/arm_arch_timer.o:(erratum_set_next_event_generic) in archive drivers/built-in.a
make: *** [Makefile:1161: vmlinux] Error 1
These are due to the BUILD_BUG() macros contained in the low-level
accessors (arch_timer_reg_{write,read}_cp15) being emitted, as the
access type wasn't known at compile time.
Fix this by making erratum_set_next_event_generic() __force_inline,
resulting in the 'access' parameter to be resolved at compile time,
similarly to what is already done for set_next_event().
Fixes:
4775bc63f880 ("Add build-time guards for unhandled register accesses")
Reported-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20211117113532.3895208-1-maz@kernel.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Alexey Sheplyakov [Tue, 9 Nov 2021 15:34:02 +0000 (19:34 +0400)]
clocksource/drivers/dw_apb_timer_of: Fix probe failure
The driver refuses to probe with -EINVAL since the commit
5d9814df0aec
("clocksource/drivers/dw_apb_timer_of: Add error handling if no clock
available").
Before the driver used to probe successfully if either "clock-freq" or
"clock-frequency" properties has been specified in the device tree.
That commit changed
if (A && B)
panic("No clock nor clock-frequency property");
into
if (!A && !B)
return 0;
That's a bug: the reverse of `A && B` is '!A || !B', not '!A && !B'
Signed-off-by: Vadim V. Vlasov <vadim.vlasov@elpitech.ru>
Signed-off-by: Alexey Sheplyakov <asheplyakov@basealt.ru>
Fixes:
5d9814df0aec56a6 ("clocksource/drivers/dw_apb_timer_of: Add error handling if no clock available").
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vadim V. Vlasov <vadim.vlasov@elpitech.ru>
Acked-by: Dinh Nguyen <dinguyen@kernel.org>
Link: https://lore.kernel.org/r/20211109153401.157491-1-asheplyakov@basealt.ru
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Sean Christopherson [Mon, 25 Oct 2021 20:13:11 +0000 (13:13 -0700)]
selftests: KVM: Add test to verify KVM doesn't explode on "bad" I/O
Add an x86 selftest to verify that KVM doesn't WARN or otherwise explode
if userspace modifies RCX during a userspace exit to handle string I/O.
This is a regression test for a user-triggerable WARN introduced by
commit
3b27de271839 ("KVM: x86: split the two parts of emulator_pio_in").
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <
20211025201311.1881846-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Mon, 25 Oct 2021 20:13:10 +0000 (13:13 -0700)]
KVM: x86: Don't WARN if userspace mucks with RCX during string I/O exit
Replace a WARN with a comment to call out that userspace can modify RCX
during an exit to userspace to handle string I/O. KVM doesn't actually
support changing the rep count during an exit, i.e. the scenario can be
ignored, but the WARN needs to go as it's trivial to trigger from
userspace.
Cc: stable@vger.kernel.org
Fixes:
3b27de271839 ("KVM: x86: split the two parts of emulator_pio_in")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <
20211025201311.1881846-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Lai Jiangshan [Tue, 7 Dec 2021 09:52:30 +0000 (17:52 +0800)]
KVM: X86: Raise #GP when clearing CR0_PG in 64 bit mode
In the SDM:
If the logical processor is in 64-bit mode or if CR4.PCIDE = 1, an
attempt to clear CR0.PG causes a general-protection exception (#GP).
Software should transition to compatibility mode and clear CR4.PCIDE
before attempting to disable paging.
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Message-Id: <
20211207095230.53437-1-jiangshanlai@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Jens Axboe [Fri, 10 Dec 2021 13:44:15 +0000 (06:44 -0700)]
Merge tag 'nvme-5.16-2021-12-10' of git://git.infradead.org/nvme into block-5.16
Pull NVMe fixes from Christoph:
"nvme fixes for Linux 5.16
- set ana_log_size to 0 after freeing ana_log_buf (Hou Tao)
- show subsys nqn for duplicate cntlids (Keith Busch)
- disable namespace access for unsupported metadata (Keith Busch)
- report write pointer for a full zone as zone start + zone len
(Niklas Cassel)
- fix use after free when disconnecting a reconnecting ctrl
(Ruozhu Li)
- fix a list corruption in nvmet-tcp (Sagi Grimberg)"
* tag 'nvme-5.16-2021-12-10' of git://git.infradead.org/nvme:
nvmet-tcp: fix possible list corruption for unexpected command failure
nvme: fix use after free when disconnecting a reconnecting ctrl
nvme-multipath: set ana_log_size to 0 after free ana_log_buf
nvme: report write pointer for a full zone as zone start + zone len
nvme: disable namespace access for unsupported metadata
nvme: show subsys nqn for duplicate cntlids
Ye Guojin [Tue, 9 Nov 2021 05:59:58 +0000 (05:59 +0000)]
irqchip/irq-bcm7120-l2: Add put_device() after of_find_device_by_node()
This was found by coccicheck:
./drivers/irqchip/irq-bcm7120-l2.c,328,1-7,ERROR missing put_device;
call of_find_device_by_node on line 234, but without a corresponding
object release within this function.
./drivers/irqchip/irq-bcm7120-l2.c,341,1-7,ERROR missing put_device;
call of_find_device_by_node on line 234, but without a corresponding
object release within this function.
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Ye Guojin <ye.guojin@zte.com.cn>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211109055958.130287-1-ye.guojin@zte.com.cn
Paolo Bonzini [Thu, 5 Aug 2021 10:54:23 +0000 (06:54 -0400)]
selftests: KVM: avoid failures due to reserved HyperTransport region
AMD proceessors define an address range that is reserved by HyperTransport
and causes a failure if used for guest physical addresses. Avoid
selftests failures by reserving those guest physical addresses; the
rules are:
- On parts with <40 bits, its fully hidden from software.
- Before Fam17h, it was always 12G just below 1T, even if there was more
RAM above this location. In this case we just not use any RAM above 1T.
- On Fam17h and later, it is variable based on SME, and is either just
below 2^48 (no encryption) or 2^43 (encryption).
Fixes:
ef4c9f4f6546 ("KVM: selftests: Fix 32-bit truncation of vm_get_max_gfn()")
Cc: stable@vger.kernel.org
Cc: David Matlack <dmatlack@google.com>
Reported-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <
20210805105423.412878-1-pbonzini@redhat.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Tested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sean Christopherson [Tue, 7 Dec 2021 22:09:19 +0000 (22:09 +0000)]
KVM: x86: Ignore sparse banks size for an "all CPUs", non-sparse IPI req
Do not bail early if there are no bits set in the sparse banks for a
non-sparse, a.k.a. "all CPUs", IPI request. Per the Hyper-V spec, it is
legal to have a variable length of '0', e.g. VP_SET's BankContents in
this case, if the request can be serviced without the extra info.
It is possible that for a given invocation of a hypercall that does
accept variable sized input headers that all the header input fits
entirely within the fixed size header. In such cases the variable sized
input header is zero-sized and the corresponding bits in the hypercall
input should be set to zero.
Bailing early results in KVM failing to send IPIs to all CPUs as expected
by the guest.
Fixes:
214ff83d4473 ("KVM: x86: hyperv: implement PV IPI send hypercalls")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <
20211207220926.718794-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Vitaly Kuznetsov [Thu, 9 Dec 2021 10:29:37 +0000 (11:29 +0100)]
KVM: x86: Wait for IPIs to be delivered when handling Hyper-V TLB flush hypercall
Prior to commit
0baedd792713 ("KVM: x86: make Hyper-V PV TLB flush use
tlb_flush_guest()"), kvm_hv_flush_tlb() was using 'KVM_REQ_TLB_FLUSH |
KVM_REQUEST_NO_WAKEUP' when making a request to flush TLBs on other vCPUs
and KVM_REQ_TLB_FLUSH is/was defined as:
(0 | KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
so KVM_REQUEST_WAIT was lost. Hyper-V TLFS, however, requires that
"This call guarantees that by the time control returns back to the
caller, the observable effects of all flushes on the specified virtual
processors have occurred." and without KVM_REQUEST_WAIT there's a small
chance that the vCPU making the TLB flush will resume running before
all IPIs get delivered to other vCPUs and a stale mapping can get read
there.
Fix the issue by adding KVM_REQUEST_WAIT flag to KVM_REQ_TLB_FLUSH_GUEST:
kvm_hv_flush_tlb() is the sole caller which uses it for
kvm_make_all_cpus_request()/kvm_make_vcpus_request_mask() where
KVM_REQUEST_WAIT makes a difference.
Cc: stable@kernel.org
Fixes:
0baedd792713 ("KVM: x86: make Hyper-V PV TLB flush use tlb_flush_guest()")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <
20211209102937.584397-1-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Dave Airlie [Fri, 10 Dec 2021 04:10:55 +0000 (14:10 +1000)]
Merge tag 'amd-drm-fixes-5.16-2021-12-08' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-5.16-2021-12-08:
amdgpu:
- DPIA fix
- eDP fix
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211209042824.6720-1-alexander.deucher@amd.com
Dave Airlie [Fri, 10 Dec 2021 04:10:29 +0000 (14:10 +1000)]
Merge tag 'drm-intel-fixes-2021-12-09' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
A fix to a error pointer dereference in gem_execbuffer and
a fix for GT initialization when GuC/HuC are used on ICL.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YbJVWYAd/jeERCYY@intel.com
Dave Airlie [Fri, 10 Dec 2021 03:55:55 +0000 (13:55 +1000)]
Merge tag 'drm-misc-fixes-2021-12-09' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
A fix in syncobj to handle fence already signalled better, and a fix for
a ttm_bo_swapout eviction check.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20211209124305.gxhid5zwf7m4oasn@houat
Linus Torvalds [Thu, 9 Dec 2021 21:20:59 +0000 (13:20 -0800)]
Merge tag 'for-linus' of git://git./linux/kernel/git/rdma/rdma
Pull rdma fixes from Jason Gunthorpe:
"Quite a few small bug fixes old and new, also Doug Ledford is retiring
now, we thank him for his work. Details:
- Use after free in rxe
- mlx5 DM regression
- hns bugs triggred by device reset
- Two fixes for CONFIG_DEBUG_PREEMPT
- Several longstanding corner case bugs in hfi1
- Two irdma data path bugs in rare cases and some memory issues"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/irdma: Don't arm the CQ more than two times if no CE for this CQ
RDMA/irdma: Report correct WC errors
RDMA/irdma: Fix a potential memory allocation issue in 'irdma_prm_add_pble_mem()'
RDMA/irdma: Fix a user-after-free in add_pble_prm
IB/hfi1: Fix leak of rcvhdrtail_dummy_kvaddr
IB/hfi1: Fix early init panic
IB/hfi1: Insure use of smp_processor_id() is preempt disabled
IB/hfi1: Correct guard on eager buffer deallocation
RDMA/rtrs: Call {get,put}_cpu_ptr to silence a debug kernel warning
RDMA/hns: Do not destroy QP resources in the hw resetting phase
RDMA/hns: Do not halt commands during reset until later
Remove Doug Ledford from MAINTAINERS
RDMA/mlx5: Fix releasing unallocated memory in dereg MR flow
RDMA: Fix use-after-free in rxe_queue_cleanup
Andy Shevchenko [Thu, 9 Dec 2021 12:30:33 +0000 (14:30 +0200)]
percpu_ref: Replace kernel.h with the necessary inclusions
When kernel.h is used in the headers it adds a lot into dependency hell,
especially when there are circular dependencies are involved.
Replace kernel.h inclusion with the list of what is really being used.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Dennis Zhou <dennis@kernel.org>
Linus Torvalds [Thu, 9 Dec 2021 19:26:44 +0000 (11:26 -0800)]
Merge tag 'net-5.16-rc5' of git://git./linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from bpf, can and netfilter.
Current release - regressions:
- bpf, sockmap: re-evaluate proto ops when psock is removed from
sockmap
Current release - new code bugs:
- bpf: fix bpf_check_mod_kfunc_call for built-in modules
- ice: fixes for TC classifier offloads
- vrf: don't run conntrack on vrf with !dflt qdisc
Previous releases - regressions:
- bpf: fix the off-by-two error in range markings
- seg6: fix the iif in the IPv6 socket control block
- devlink: fix netns refcount leak in devlink_nl_cmd_reload()
- dsa: mv88e6xxx: fix "don't use PHY_DETECT on internal PHY's"
- dsa: mv88e6xxx: allow use of PHYs on CPU and DSA ports
Previous releases - always broken:
- ethtool: do not perform operations on net devices being
unregistered
- udp: use datalen to cap max gso segments
- ice: fix races in stats collection
- fec: only clear interrupt of handling queue in fec_enet_rx_queue()
- m_can: pci: fix incorrect reference clock rate
- m_can: disable and ignore ELO interrupt
- mvpp2: fix XDP rx queues registering
Misc:
- treewide: add missing includes masked by cgroup -> bpf.h
dependency"
* tag 'net-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (82 commits)
net: dsa: mv88e6xxx: allow use of PHYs on CPU and DSA ports
net: wwan: iosm: fixes unable to send AT command during mbim tx
net: wwan: iosm: fixes net interface nonfunctional after fw flash
net: wwan: iosm: fixes unnecessary doorbell send
net: dsa: felix: Fix memory leak in felix_setup_mmio_filtering
MAINTAINERS: s390/net: remove myself as maintainer
net/sched: fq_pie: prevent dismantle issue
net: mana: Fix memory leak in mana_hwc_create_wq
seg6: fix the iif in the IPv6 socket control block
nfp: Fix memory leak in nfp_cpp_area_cache_add()
nfc: fix potential NULL pointer deref in nfc_genl_dump_ses_done
nfc: fix segfault in nfc_genl_dump_devices_done
udp: using datalen to cap max gso segments
net: dsa: mv88e6xxx: error handling for serdes_power functions
can: kvaser_usb: get CAN clock frequency from device
can: kvaser_pciefd: kvaser_pciefd_rx_error_frame(): increase correct stats->{rx,tx}_errors counter
net: mvpp2: fix XDP rx queues registering
vmxnet3: fix minimum vectors alloc issue
net, neigh: clear whole pneigh_entry at alloc time
net: dsa: mv88e6xxx: fix "don't use PHY_DETECT on internal PHY's"
...
Linus Torvalds [Thu, 9 Dec 2021 19:18:06 +0000 (11:18 -0800)]
Merge tag 'mtd/fixes-for-5.16-rc5' of git://git./linux/kernel/git/mtd/linux
Pull mtd fixes from Miquel Raynal:
"MTD fixes:
- dataflash: Add device-tree SPI IDs to avoid new warnings
Raw NAND fixes:
- Fix nand_choose_best_timings() on unsupported interface
- Fix nand_erase_op delay (wrong unit)
- fsmc:
- Fix timing computation
- Take instruction delay into account
- denali:
- Add the dependency on HAS_IOMEM to silence robots"
* tag 'mtd/fixes-for-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
mtd: dataflash: Add device-tree SPI IDs
mtd: rawnand: fsmc: Fix timing computation
mtd: rawnand: fsmc: Take instruction delay into account
mtd: rawnand: Fix nand_choose_best_timings() on unsupported interface
mtd: rawnand: Fix nand_erase_op delay
mtd: rawnand: denali: Add the dependency on HAS_IOMEM
Linus Torvalds [Thu, 9 Dec 2021 19:08:19 +0000 (11:08 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/hid/hid
Pull HID fixes from Jiri Kosina:
- fixes for various drivers which assume that a HID device is on USB
transport, but that might not necessarily be the case, as the device
can be faked by uhid. (Greg, Benjamin Tissoires)
- fix for spurious wakeups on certain Lenovo notebooks (Thomas
Weißschuh)
- a few other device-specific quirks
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
HID: Ignore battery for Elan touchscreen on Asus UX550VE
HID: intel-ish-hid: ipc: only enable IRQ wakeup when requested
HID: google: add eel USB id
HID: add USB_HID dependancy to hid-prodikeys
HID: add USB_HID dependancy to hid-chicony
HID: bigbenff: prevent null pointer dereference
HID: sony: fix error path in probe
HID: add USB_HID dependancy on some USB HID drivers
HID: check for valid USB device for many HID drivers
HID: wacom: fix problems when device is not a valid USB device
HID: add hid_is_usb() function to make it simpler for USB detection
HID: quirks: Add quirk for the Microsoft Surface 3 type-cover
Xie Yongji [Mon, 13 Sep 2021 11:19:28 +0000 (19:19 +0800)]
aio: Fix incorrect usage of eventfd_signal_allowed()
We should defer eventfd_signal() to the workqueue when
eventfd_signal_allowed() return false rather than return
true.
Fixes:
b542e383d8c0 ("eventfd: Make signal recursion protection a task bit")
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Link: https://lore.kernel.org/r/20210913111928.98-1-xieyongji@bytedance.com
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Eric Biggers [Thu, 9 Dec 2021 01:04:55 +0000 (17:04 -0800)]
aio: fix use-after-free due to missing POLLFREE handling
signalfd_poll() and binder_poll() are special in that they use a
waitqueue whose lifetime is the current task, rather than the struct
file as is normally the case. This is okay for blocking polls, since a
blocking poll occurs within one task; however, non-blocking polls
require another solution. This solution is for the queue to be cleared
before it is freed, by sending a POLLFREE notification to all waiters.
Unfortunately, only eventpoll handles POLLFREE. A second type of
non-blocking poll, aio poll, was added in kernel v4.18, and it doesn't
handle POLLFREE. This allows a use-after-free to occur if a signalfd or
binder fd is polled with aio poll, and the waitqueue gets freed.
Fix this by making aio poll handle POLLFREE.
A patch by Ramji Jiyani <ramjiyani@google.com>
(https://lore.kernel.org/r/
20211027011834.2497484-1-ramjiyani@google.com)
tried to do this by making aio_poll_wake() always complete the request
inline if POLLFREE is seen. However, that solution had two bugs.
First, it introduced a deadlock, as it unconditionally locked the aio
context while holding the waitqueue lock, which inverts the normal
locking order. Second, it didn't consider that POLLFREE notifications
are missed while the request has been temporarily de-queued.
The second problem was solved by my previous patch. This patch then
properly fixes the use-after-free by handling POLLFREE in a
deadlock-free way. It does this by taking advantage of the fact that
freeing of the waitqueue is RCU-delayed, similar to what eventpoll does.
Fixes:
2c14fa838cbe ("aio: implement IOCB_CMD_POLL")
Cc: <stable@vger.kernel.org> # v4.18+
Link: https://lore.kernel.org/r/20211209010455.42744-6-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Eric Biggers [Thu, 9 Dec 2021 01:04:54 +0000 (17:04 -0800)]
aio: keep poll requests on waitqueue until completed
Currently, aio_poll_wake() will always remove the poll request from the
waitqueue. Then, if aio_poll_complete_work() sees that none of the
polled events are ready and the request isn't cancelled, it re-adds the
request to the waitqueue. (This can easily happen when polling a file
that doesn't pass an event mask when waking up its waitqueue.)
This is fundamentally broken for two reasons:
1. If a wakeup occurs between vfs_poll() and the request being
re-added to the waitqueue, it will be missed because the request
wasn't on the waitqueue at the time. Therefore, IOCB_CMD_POLL
might never complete even if the polled file is ready.
2. When the request isn't on the waitqueue, there is no way to be
notified that the waitqueue is being freed (which happens when its
lifetime is shorter than the struct file's). This is supposed to
happen via the waitqueue entries being woken up with POLLFREE.
Therefore, leave the requests on the waitqueue until they are actually
completed (or cancelled). To keep track of when aio_poll_complete_work
needs to be scheduled, use new fields in struct poll_iocb. Remove the
'done' field which is now redundant.
Note that this is consistent with how sys_poll() and eventpoll work;
their wakeup functions do *not* remove the waitqueue entries.
Fixes:
2c14fa838cbe ("aio: implement IOCB_CMD_POLL")
Cc: <stable@vger.kernel.org> # v4.18+
Link: https://lore.kernel.org/r/20211209010455.42744-5-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Eric Biggers [Thu, 9 Dec 2021 01:04:53 +0000 (17:04 -0800)]
signalfd: use wake_up_pollfree()
wake_up_poll() uses nr_exclusive=1, so it's not guaranteed to wake up
all exclusive waiters. Yet, POLLFREE *must* wake up all waiters. epoll
and aio poll are fortunately not affected by this, but it's very
fragile. Thus, the new function wake_up_pollfree() has been introduced.
Convert signalfd to use wake_up_pollfree().
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Fixes:
d80e731ecab4 ("epoll: introduce POLLFREE to flush ->signalfd_wqh before kfree()")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20211209010455.42744-4-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Eric Biggers [Thu, 9 Dec 2021 01:04:52 +0000 (17:04 -0800)]
binder: use wake_up_pollfree()
wake_up_poll() uses nr_exclusive=1, so it's not guaranteed to wake up
all exclusive waiters. Yet, POLLFREE *must* wake up all waiters. epoll
and aio poll are fortunately not affected by this, but it's very
fragile. Thus, the new function wake_up_pollfree() has been introduced.
Convert binder to use wake_up_pollfree().
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Fixes:
f5cb779ba163 ("ANDROID: binder: remove waitqueue when thread exits.")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20211209010455.42744-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Eric Biggers [Thu, 9 Dec 2021 01:04:51 +0000 (17:04 -0800)]
wait: add wake_up_pollfree()
Several ->poll() implementations are special in that they use a
waitqueue whose lifetime is the current task, rather than the struct
file as is normally the case. This is okay for blocking polls, since a
blocking poll occurs within one task; however, non-blocking polls
require another solution. This solution is for the queue to be cleared
before it is freed, using 'wake_up_poll(wq, EPOLLHUP | POLLFREE);'.
However, that has a bug: wake_up_poll() calls __wake_up() with
nr_exclusive=1. Therefore, if there are multiple "exclusive" waiters,
and the wakeup function for the first one returns a positive value, only
that one will be called. That's *not* what's needed for POLLFREE;
POLLFREE is special in that it really needs to wake up everyone.
Considering the three non-blocking poll systems:
- io_uring poll doesn't handle POLLFREE at all, so it is broken anyway.
- aio poll is unaffected, since it doesn't support exclusive waits.
However, that's fragile, as someone could add this feature later.
- epoll doesn't appear to be broken by this, since its wakeup function
returns 0 when it sees POLLFREE. But this is fragile.
Although there is a workaround (see epoll), it's better to define a
function which always sends POLLFREE to all waiters. Add such a
function. Also make it verify that the queue really becomes empty after
all waiters have been woken up.
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20211209010455.42744-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Linus Torvalds [Thu, 9 Dec 2021 18:49:36 +0000 (10:49 -0800)]
Merge tag 'netfs-fixes-
20211207' of git://git./linux/kernel/git/dhowells/linux-fs
Pull netfslib fixes from David Howells:
- Fix a lockdep warning and potential deadlock. This is takes the
simple approach of offloading the write-to-cache done from within a
network filesystem read to a worker thread to avoid taking the
sb_writer lock from the cache backing filesystem whilst holding the
mmap lock on an inode from the network filesystem.
Jan Kara posits a scenario whereby this can cause deadlock[1], though
it's quite complex and I think requires someone in userspace to
actually do I/O on the cache files. Matthew Wilcox isn't so certain,
though[2].
An alternative way to fix this, suggested by Darrick Wong, might be
to allow cachefiles to prevent userspace from performing I/O upon the
file - something like an exclusive open - but that's beyond the scope
of a fix here if we do want to make such a facility in the future.
- In some of the error handling paths where netfs_ops->cleanup() is
called, the arguments are transposed[3]. gcc doesn't complain because
one of the parameters is void* and one of the values is void*.
Link: https://lore.kernel.org/r/20210922110420.GA21576@quack2.suse.cz/
Link: https://lore.kernel.org/r/Ya9eDiFCE2fO7K/S@casper.infradead.org/
Link: https://lore.kernel.org/r/20211207031449.100510-1-jefflexu@linux.alibaba.com/
* tag 'netfs-fixes-
20211207' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
netfs: fix parameter of cleanup()
netfs: Fix lockdep warning from taking sb_writers whilst holding mmap_lock
Miaoqian Lin [Thu, 9 Dec 2021 02:43:17 +0000 (02:43 +0000)]
tracing: Fix possible memory leak in __create_synth_event() error path
There's error paths in __create_synth_event() after the argv is allocated
that fail to free it. Add a jump to free it when necessary.
Link: https://lkml.kernel.org/r/20211209024317.11783-1-linmq006@gmail.com
Suggested-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
[ Fixed up the patch and change log ]
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Jiri Olsa [Mon, 6 Dec 2021 18:20:32 +0000 (19:20 +0100)]
ftrace/samples: Add module to test multi direct modify interface
Adding ftrace-direct-multi-modify.ko kernel module that uses
modify_ftrace_direct_multi API. The core functionality is taken
from ftrace-direct-modify.ko kernel module and changed to fit
multi direct interface.
The init function creates kthread that periodically calls
modify_ftrace_direct_multi to change the trampoline address
for the direct ftrace_ops. The ftrace trace_pipe then shows
trace from both trampolines.
Link: https://lkml.kernel.org/r/20211206182032.87248-4-jolsa@kernel.org
Cc: Ingo Molnar <mingo@redhat.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Tested-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Maciej S. Szmigiero [Thu, 2 Dec 2021 23:10:13 +0000 (00:10 +0100)]
KVM: x86: selftests: svm_int_ctl_test: fix intercept calculation
INTERCEPT_x are bit positions, but the code was using the raw value of
INTERCEPT_VINTR (4) instead of BIT(INTERCEPT_VINTR).
This resulted in masking of bit 2 - that is, SMI instead of VINTR.
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Message-Id: <
49b9571d25588870db5380b0be1a41df4bbaaf93.
1638486479.git.maciej.szmigiero@oracle.com>
Sasha Levin [Thu, 9 Dec 2021 16:51:13 +0000 (11:51 -0500)]
tools/lib/lockdep: drop leftover liblockdep headers
Clean up remaining headers that are specific to liblockdep but lived in
the shared header directory. These are all unused after the liblockdep
code was removed in commit
7246f4dcaccc ("tools/lib/lockdep: drop
liblockdep").
Note that there are still headers that were originally created for
liblockdep, that still have liblockdep references, but they are used by
other tools/ code at this point.
Signed-off-by: Sasha Levin <sashal@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>