Dom Cobley [Mon, 5 Sep 2022 11:43:55 +0000 (12:43 +0100)]
Merge tag 'v5.15.62' into rpi-5.15.y
This is the 5.15.62 stable release
Joerg Schambacher [Fri, 2 Sep 2022 09:50:16 +0000 (11:50 +0200)]
ASoC:ma120x0p: Extend the volume range to -144dB (mute)
Adjusts the usable volume range down to -144dB to allow 'muting'
of the amplifier through volume control.
Signed-off-by: Joerg Schambacher <joerg@hifiberry.com>
Naushir Patuck [Fri, 2 Sep 2022 07:35:35 +0000 (08:35 +0100)]
media: bcm2835-unicam: Fix for possible dummy buffer overrun
The Unicam hardware has been observed to cause a buffer overrun when using the
dummy buffer as a circular buffer. The conditions that cause the overrun are not
fully known, but it seems to occur when the memory bus is heavily loaded.
To avoid the overrun, program the hardware with a buffer size of 0 when using
the dummy buffer. This will cause overrun into the allocated dummy buffer, but
avoid out of bounds writes.
Signed-off-by: Naushir Patuck <naush@raspberrypi.com>
Dave Stevenson [Thu, 11 Aug 2022 14:52:28 +0000 (15:52 +0100)]
drm/vc4: Correct interrupt masking bit assignment for HVS5
HVS5 has moved the interrupt enable bits around within the
DISPCTRL register, therefore the configuration has to be updated
to account for this.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Dave Stevenson [Thu, 11 Aug 2022 14:49:16 +0000 (15:49 +0100)]
drm/vc4: SCALER_DISPBKGND_AUTOHS is only valid on HVS4
The bit used for SCALER_DISPBKGND_AUTOHS in SCALER_DISPBKGNDX
has been repurposed on HVS5 to configure whether a display can
win back-to-back arbitration wins for the COB.
This is not desirable, therefore only select this bit on HVS4,
and explicitly clear it on HVS5.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Dave Stevenson [Thu, 11 Aug 2022 12:59:34 +0000 (13:59 +0100)]
drm/vc4: Set AXI panic modes for the HVS
The HVS can change AXI request mode based on how full the COB
FIFOs are.
Until now the vc4 driver has been relying on the firmware to
have set these to sensible values.
With HVS channel 2 now being used for live video, change the
panic mode for all channels to be explicitly set by the driver,
and the same for all channels.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Dave Stevenson [Thu, 11 Aug 2022 12:49:16 +0000 (13:49 +0100)]
drm/vc4: Configure the HVS COB allocations
The HVS Composite Output Buffer (COB) is the memory used to
generate the output pixel data.
Until now the vc4 driver has been relying on the firmware to
have set these to sensible values.
In testing triple screen support it has been noted that only
1 line was being assigned to HVS channel 2. Whilst that is fine
for the transposer (TXP), and indeed needed as only some pixels
have an alpha channel, it is insufficient to run a live display.
Split the COB more evenly between the 3 HVS channels.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Revert vc4_regs change
Peter Rosin [Mon, 25 Apr 2022 20:35:50 +0000 (22:35 +0200)]
hwmon: (lm75) Add Atmel AT30TS74 support
commit
c851b715d38de0c262a63de16ad954ed39b47aca upstream.
Atmel (now Microchip) AT30TS74 is an LM75 compatible sensor. Add it.
Signed-off-by: Peter Rosin <peda@axentia.se>
Link: https://lore.kernel.org/r/9494dfbc-f506-3e94-501d-6760c487c93d@axentia.se
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
rlcamp [Wed, 31 Aug 2022 12:47:41 +0000 (05:47 -0700)]
overlays: Add a pull (up/down) parameter to pps-gpio
Square wave output from RTCs such as the DS3231 requires a pullup
resistor. This commit allows the pin's internal pull resistors to be
used by specifying "pull=up" or "pull=down" as necessary.
Phil Elwell [Tue, 30 Aug 2022 08:35:48 +0000 (09:35 +0100)]
configs: Add LAN7430 driver on BCM2711
See: https://github.com/raspberrypi/linux/issues/4117
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Lee Jackson [Fri, 26 Aug 2022 03:41:49 +0000 (11:41 +0800)]
media: i2c: arducam-pivariety: Add custom controls
Add support for strobe_shift, strobe_width and mode custom controls.
Signed-off-by: Lee Jackson <info@arducam.com>
Dave Stevenson [Thu, 25 Aug 2022 11:16:37 +0000 (12:16 +0100)]
defconfigs: Add CONFIG_MUX_GPIO.
This config was missed from
06ccedb39414 ("defconfigs: Add VIDEO_MUX to all defconfigs")
so we had the mux framework, but not the control of GPIOs.
Signed-off-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Phil Elwell [Fri, 26 Aug 2022 09:13:27 +0000 (10:13 +0100)]
Revert "usbnet: smsc95xx: Don't clear read-only PHY interrupt"
This reverts commit
04887243887630f2478a330730d13d87a079b6c8.
See: https://github.com/raspberrypi/linux/issues/5145
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Phil Elwell [Fri, 26 Aug 2022 09:12:58 +0000 (10:12 +0100)]
Revert "usbnet: smsc95xx: Avoid link settings race on interrupt reception"
This reverts commit
09201006dac9bf11a8e6755a11c74d5bccd54e7c.
See: https://github.com/raspberrypi/linux/issues/5145
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Phil Elwell [Fri, 26 Aug 2022 09:12:18 +0000 (10:12 +0100)]
Revert "usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling"
This reverts commit
eaf3a094d8924ecb0baacf6df62ae1c6a96083cf.
See: https://github.com/raspberrypi/linux/issues/5145
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Phil Elwell [Fri, 26 Aug 2022 09:10:56 +0000 (10:10 +0100)]
Revert "usbnet: smsc95xx: Fix deadlock on runtime resume"
This reverts commit
b574d1e3e9a2432b5acd9c4a9dc8d70b6a37aaf1.
See: https://github.com/raspberrypi/linux/issues/5145
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Phil Elwell [Wed, 24 Aug 2022 10:14:40 +0000 (11:14 +0100)]
drm/vc4: Add async update support for cursor planes
Now that cursors are implemented as regular planes, all cursor
movements result in atomic updates. As the firmware-kms driver
doesn't support asynchronous updates, these are synchronous, which
limits the update rate to the screen refresh rate. Xorg seems unaware
of this (or at least of the effect of this), because if the mouse is
configured with a higher update rate than the screen then continuous
mouse movement results in an increasing backlog of mouse events -
cue extreme lag.
Add minimal support for asynchronous updates - limited to cursor
planes - to eliminate the lag.
See: https://github.com/raspberrypi/linux/pull/4971
https://github.com/raspberrypi/linux/issues/4988
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Joerg Schambacher [Wed, 24 Aug 2022 09:41:36 +0000 (11:41 +0200)]
rpi-simple-soundcard: limits sample rate of Merus Amp board
Adding a dedicated hw_params function to limit the maximum sample
rate of the IFX/Merus board to 48000ksps.
Thanks to Ariel Muszkat for the support.
Signed-off-by: Joerg Schambacher <joerg@hifiberry.com>
Jonathan Bell [Mon, 16 May 2022 09:28:27 +0000 (10:28 +0100)]
mmc: block: Don't do single-sector reads during recovery
See https://github.com/raspberrypi/linux/issues/5019
If an SD card has degraded performance such that IO operations time out
then the MMC block layer will leak SG DMA mappings in the swiotlb during
recovery. It retries the same SG and this causes the leak, as it is
mapped twice - once in sdhci_pre_req() and again during single-block
reads in sdhci_prepare_data().
Resetting the card (including power-cycling if a regulator for vmmc is
present) ought to be enough to recover a stuck state, so for now don't
try single-block reads in the recovery path.
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Phil Elwell [Thu, 3 Feb 2022 15:51:41 +0000 (15:51 +0000)]
net: phy: lan87xx: Decrease phy polling rate
Polling at 100Hz for 1.5s consumes quite a bit of kworker time with no
obvious benefit. Reduce that polling rate to ~6Hz.
To further save CPU and power, defer the next poll if no energy is
detected.
See: https://github.com/raspberrypi/linux/issues/4780
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Dom Cobley [Tue, 23 Aug 2022 13:42:52 +0000 (14:42 +0100)]
Merge tag 'v5.15.61' into rpi-5.15.y
This is the 5.15.61 stable release
Dom Cobley [Tue, 23 Aug 2022 13:12:28 +0000 (14:12 +0100)]
Revert "net: phy: lan87xx: Decrease phy polling rate"
This reverts commit
0792bca64303bfd73f98e9fd00eb5bc1ed6d2b75.
Dom Cobley [Tue, 23 Aug 2022 13:12:06 +0000 (14:12 +0100)]
Revert "mmc: block: Don't do single-sector reads during recovery"
This reverts commit
dff79e31c3b05a50f725442c1fc19a6194491523.
Dom Cobley [Tue, 23 Aug 2022 13:07:41 +0000 (14:07 +0100)]
Merge tag 'v5.15.60' into rpi-5.15.y
This is the 5.15.60 stable release
Dom Cobley [Tue, 23 Aug 2022 13:07:38 +0000 (14:07 +0100)]
Merge tag 'v5.15.59' into rpi-5.15.y
This is the 5.15.59 stable release
Dom Cobley [Tue, 23 Aug 2022 13:07:35 +0000 (14:07 +0100)]
Merge tag 'v5.15.58' into rpi-5.15.y
This is the 5.15.58 stable release
Dom Cobley [Tue, 23 Aug 2022 13:07:27 +0000 (14:07 +0100)]
Merge tag 'v5.15.57' into rpi-5.15.y
This is the 5.15.57 stable release
AMuszkat [Tue, 23 Aug 2022 07:25:53 +0000 (09:25 +0200)]
overlays: merus-amp: Set GPIO 8 as input np
Set GPIO 8 as input np for proper msel configuration at boot up.
Ashish Vara [Mon, 22 Aug 2022 08:21:12 +0000 (13:51 +0530)]
configs: Enable DACBERRY400
Signed-off-by: Ashish Vara <ashishhvara@gmail.com>
Ashish Vara [Mon, 22 Aug 2022 07:41:37 +0000 (13:11 +0530)]
overlays: Add dacberry400
Added overlay file for dacberry400 audio card.
Signed-off-by: Ashish Vara <ashishhvara@gmail.com>
Ashish Vara [Sun, 14 Aug 2022 19:39:07 +0000 (01:09 +0530)]
sound: soc: bcm: Added Sound card driver for Dacberry400 Audio card for Raspberry Pi 400
Added Sound card driver for DACberry400 Audio card.
Signed-off-by: Ashish Vara <ashishhvara@gmail.com>
Phil Elwell [Sun, 21 Aug 2022 15:46:39 +0000 (16:46 +0100)]
overlays: gpio-fan: Document the hyst parameter
See: https://forums.raspberrypi.com/viewtopic.php?t=339143
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Phil Elwell [Sun, 21 Aug 2022 15:36:25 +0000 (16:36 +0100)]
overlays: gpio-fan: Add hyst (hysteresis) param
See: https://forums.raspberrypi.com/viewtopic.php?t=339143
Signed-off-by: Phil Elwell <phil@raspberrypi.com>
Greg Kroah-Hartman [Sun, 21 Aug 2022 13:17:49 +0000 (15:17 +0200)]
Linux 5.15.62
Link: https://lore.kernel.org/r/20220819153711.658766010@linuxfoundation.org
Tested-by: Shuah Khan <skhan@linuxfoundation.org>
Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>
Tested-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
Tested-by: Ron Economos <re@w6rz.net>
Link: https://lore.kernel.org/r/20220820182309.607584465@linuxfoundation.org
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Qu Wenruo [Fri, 19 Aug 2022 08:39:50 +0000 (16:39 +0800)]
btrfs: raid56: don't trust any cached sector in __raid56_parity_recover()
commit
f6065f8edeb25f4a9dfe0b446030ad995a84a088 upstream.
[BUG]
There is a small workload which will always fail with recent kernel:
(A simplified version from btrfs/125 test case)
mkfs.btrfs -f -m raid5 -d raid5 -b 1G $dev1 $dev2 $dev3
mount $dev1 $mnt
xfs_io -f -c "pwrite -S 0xee 0 1M" $mnt/file1
sync
umount $mnt
btrfs dev scan -u $dev3
mount -o degraded $dev1 $mnt
xfs_io -f -c "pwrite -S 0xff 0 128M" $mnt/file2
umount $mnt
btrfs dev scan
mount $dev1 $mnt
btrfs balance start --full-balance $mnt
umount $mnt
The failure is always failed to read some tree blocks:
BTRFS info (device dm-4): relocating block group
217710592 flags data|raid5
BTRFS error (device dm-4): parent transid verify failed on
38993920 wanted 9 found 7
BTRFS error (device dm-4): parent transid verify failed on
38993920 wanted 9 found 7
...
[CAUSE]
With the recently added debug output, we can see all RAID56 operations
related to full stripe
38928384:
56.1183: raid56_read_partial: full_stripe=
38928384 devid=2 type=DATA1 offset=0 opf=0x0 physical=
9502720 len=65536
56.1185: raid56_read_partial: full_stripe=
38928384 devid=3 type=DATA2 offset=16384 opf=0x0 physical=
9519104 len=16384
56.1185: raid56_read_partial: full_stripe=
38928384 devid=3 type=DATA2 offset=49152 opf=0x0 physical=
9551872 len=16384
56.1187: raid56_write_stripe: full_stripe=
38928384 devid=3 type=DATA2 offset=0 opf=0x1 physical=
9502720 len=16384
56.1188: raid56_write_stripe: full_stripe=
38928384 devid=3 type=DATA2 offset=32768 opf=0x1 physical=
9535488 len=16384
56.1188: raid56_write_stripe: full_stripe=
38928384 devid=1 type=PQ1 offset=0 opf=0x1 physical=
30474240 len=16384
56.1189: raid56_write_stripe: full_stripe=
38928384 devid=1 type=PQ1 offset=32768 opf=0x1 physical=
30507008 len=16384
56.1218: raid56_write_stripe: full_stripe=
38928384 devid=3 type=DATA2 offset=49152 opf=0x1 physical=
9551872 len=16384
56.1219: raid56_write_stripe: full_stripe=
38928384 devid=1 type=PQ1 offset=49152 opf=0x1 physical=
30523392 len=16384
56.2721: raid56_parity_recover: full stripe=
38928384 eb=
39010304 mirror=2
56.2723: raid56_parity_recover: full stripe=
38928384 eb=
39010304 mirror=2
56.2724: raid56_parity_recover: full stripe=
38928384 eb=
39010304 mirror=2
Before we enter raid56_parity_recover(), we have triggered some metadata
write for the full stripe
38928384, this leads to us to read all the
sectors from disk.
Furthermore, btrfs raid56 write will cache its calculated P/Q sectors to
avoid unnecessary read.
This means, for that full stripe, after any partial write, we will have
stale data, along with P/Q calculated using that stale data.
Thankfully due to patch "btrfs: only write the sectors in the vertical stripe
which has data stripes" we haven't submitted all the corrupted P/Q to disk.
When we really need to recover certain range, aka in
raid56_parity_recover(), we will use the cached rbio, along with its
cached sectors (the full stripe is all cached).
This explains why we have no event raid56_scrub_read_recover()
triggered.
Since we have the cached P/Q which is calculated using the stale data,
the recovered one will just be stale.
In our particular test case, it will always return the same incorrect
metadata, thus causing the same error message "parent transid verify
failed on
39010304 wanted 9 found 7" again and again.
[BTRFS DESTRUCTIVE RMW PROBLEM]
Test case btrfs/125 (and above workload) always has its trouble with
the destructive read-modify-write (RMW) cycle:
0 32K 64K
Data1: | Good | Good |
Data2: | Bad | Bad |
Parity: | Good | Good |
In above case, if we trigger any write into Data1, we will use the bad
data in Data2 to re-generate parity, killing the only chance to recovery
Data2, thus Data2 is lost forever.
This destructive RMW cycle is not specific to btrfs RAID56, but there
are some btrfs specific behaviors making the case even worse:
- Btrfs will cache sectors for unrelated vertical stripes.
In above example, if we're only writing into 0~32K range, btrfs will
still read data range (32K ~ 64K) of Data1, and (64K~128K) of Data2.
This behavior is to cache sectors for later update.
Incidentally commit
d4e28d9b5f04 ("btrfs: raid56: make steal_rbio()
subpage compatible") has a bug which makes RAID56 to never trust the
cached sectors, thus slightly improve the situation for recovery.
Unfortunately, follow up fix "btrfs: update stripe_sectors::uptodate in
steal_rbio" will revert the behavior back to the old one.
- Btrfs raid56 partial write will update all P/Q sectors and cache them
This means, even if data at (64K ~ 96K) of Data2 is free space, and
only (96K ~ 128K) of Data2 is really stale data.
And we write into that (96K ~ 128K), we will update all the parity
sectors for the full stripe.
This unnecessary behavior will completely kill the chance of recovery.
Thankfully, an unrelated optimization "btrfs: only write the sectors
in the vertical stripe which has data stripes" will prevent
submitting the write bio for untouched vertical sectors.
That optimization will keep the on-disk P/Q untouched for a chance for
later recovery.
[FIX]
Although we have no good way to completely fix the destructive RMW
(unless we go full scrub for each partial write), we can still limit the
damage.
With patch "btrfs: only write the sectors in the vertical stripe which
has data stripes" now we won't really submit the P/Q of unrelated
vertical stripes, so the on-disk P/Q should still be fine.
Now we really need to do is just drop all the cached sectors when doing
recovery.
By this, we have a chance to read the original P/Q from disk, and have a
chance to recover the stale data, while still keep the cache to speed up
regular write path.
In fact, just dropping all the cache for recovery path is good enough to
allow the test case btrfs/125 along with the small script to pass
reliably.
The lack of metadata write after the degraded mount, and forced metadata
COW is saving us this time.
So this patch will fix the behavior by not trust any cache in
__raid56_parity_recover(), to solve the problem while still keep the
cache useful.
But please note that this test pass DOES NOT mean we have solved the
destructive RMW problem, we just do better damage control a little
better.
Related patches:
- btrfs: only write the sectors in the vertical stripe
-
d4e28d9b5f04 ("btrfs: raid56: make steal_rbio() subpage compatible")
- btrfs: update stripe_sectors::uptodate in steal_rbio
Acked-by: David Sterba <dsterba@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Qu Wenruo [Fri, 19 Aug 2022 08:39:49 +0000 (16:39 +0800)]
btrfs: only write the sectors in the vertical stripe which has data stripes
commit
bd8f7e627703ca5707833d623efcd43f104c7b3f upstream.
If we have only 8K partial write at the beginning of a full RAID56
stripe, we will write the following contents:
0 8K 32K 64K
Disk 1 (data): |XX| | |
Disk 2 (data): | | |
Disk 3 (parity): |XXXXXXXXXXXXXXX|XXXXXXXXXXXXXXX|
|X| means the sector will be written back to disk.
Note that, although we won't write any sectors from disk 2, but we will
write the full 64KiB of parity to disk.
This behavior is fine for now, but not for the future (especially for
RAID56J, as we waste quite some space to journal the unused parity
stripes).
So here we will also utilize the btrfs_raid_bio::dbitmap, anytime we
queue a higher level bio into an rbio, we will update rbio::dbitmap to
indicate which vertical stripes we need to writeback.
And at finish_rmw(), we also check dbitmap to see if we need to write
any sector in the vertical stripe.
So after the patch, above example will only lead to the following
writeback pattern:
0 8K 32K 64K
Disk 1 (data): |XX| | |
Disk 2 (data): | | |
Disk 3 (parity): |XX| | |
Acked-by: David Sterba <dsterba@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Peter Zijlstra [Tue, 16 Aug 2022 08:26:58 +0000 (05:26 -0300)]
x86/ftrace: Use alternative RET encoding
commit
1f001e9da6bbf482311e45e48f53c2bd2179e59c upstream.
Use the return thunk in ftrace trampolines, if needed.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
[cascardo: use memcpy(text_gen_insn) as there is no __text_gen_insn]
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Peter Zijlstra [Tue, 16 Aug 2022 08:26:57 +0000 (05:26 -0300)]
x86/ibt,ftrace: Make function-graph play nice
commit
e52fc2cf3f662828cc0d51c4b73bed73ad275fce upstream.
Return trampoline must not use indirect branch to return; while this
preserves the RSB, it is fundamentally incompatible with IBT. Instead
use a retpoline like ROP gadget that defeats IBT while not unbalancing
the RSB.
And since ftrace_stub is no longer a plain RET, don't use it to copy
from. Since RET is a trivial instruction, poke it directly.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.347296408@infradead.org
[cascardo: remove ENDBR]
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Thadeu Lima de Souza Cascardo [Tue, 16 Aug 2022 08:26:56 +0000 (05:26 -0300)]
Revert "x86/ftrace: Use alternative RET encoding"
This reverts commit
e54fcb0812faebd147de72bd37ad87cc4951c68c.
This temporarily reverts the backport of upstream commit
1f001e9da6bbf482311e45e48f53c2bd2179e59c. It was not correct to copy the
ftrace stub as it would contain a relative jump to the return thunk which
would not apply to the context where it was being copied to, leading to
ftrace support to be broken.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Namjae Jeon [Mon, 1 Aug 2022 22:28:51 +0000 (07:28 +0900)]
ksmbd: fix heap-based overflow in set_ntacl_dacl()
commit
8f0541186e9ad1b62accc9519cc2b7a7240272a7 upstream.
The testcase use SMB2_SET_INFO_HE command to set a malformed file attribute
under the label `security.NTACL`. SMB2_QUERY_INFO_HE command in testcase
trigger the following overflow.
[ 4712.003781] ==================================================================
[ 4712.003790] BUG: KASAN: slab-out-of-bounds in build_sec_desc+0x842/0x1dd0 [ksmbd]
[ 4712.003807] Write of size 1060 at addr
ffff88801e34c068 by task kworker/0:0/4190
[ 4712.003813] CPU: 0 PID: 4190 Comm: kworker/0:0 Not tainted 5.19.0-rc5 #1
[ 4712.003850] Workqueue: ksmbd-io handle_ksmbd_work [ksmbd]
[ 4712.003867] Call Trace:
[ 4712.003870] <TASK>
[ 4712.003873] dump_stack_lvl+0x49/0x5f
[ 4712.003935] print_report.cold+0x5e/0x5cf
[ 4712.003972] ? ksmbd_vfs_get_sd_xattr+0x16d/0x500 [ksmbd]
[ 4712.003984] ? cmp_map_id+0x200/0x200
[ 4712.003988] ? build_sec_desc+0x842/0x1dd0 [ksmbd]
[ 4712.004000] kasan_report+0xaa/0x120
[ 4712.004045] ? build_sec_desc+0x842/0x1dd0 [ksmbd]
[ 4712.004056] kasan_check_range+0x100/0x1e0
[ 4712.004060] memcpy+0x3c/0x60
[ 4712.004064] build_sec_desc+0x842/0x1dd0 [ksmbd]
[ 4712.004076] ? parse_sec_desc+0x580/0x580 [ksmbd]
[ 4712.004088] ? ksmbd_acls_fattr+0x281/0x410 [ksmbd]
[ 4712.004099] smb2_query_info+0xa8f/0x6110 [ksmbd]
[ 4712.004111] ? psi_group_change+0x856/0xd70
[ 4712.004148] ? update_load_avg+0x1c3/0x1af0
[ 4712.004152] ? asym_cpu_capacity_scan+0x5d0/0x5d0
[ 4712.004157] ? xas_load+0x23/0x300
[ 4712.004162] ? smb2_query_dir+0x1530/0x1530 [ksmbd]
[ 4712.004173] ? _raw_spin_lock_bh+0xe0/0xe0
[ 4712.004179] handle_ksmbd_work+0x30e/0x1020 [ksmbd]
[ 4712.004192] process_one_work+0x778/0x11c0
[ 4712.004227] ? _raw_spin_lock_irq+0x8e/0xe0
[ 4712.004231] worker_thread+0x544/0x1180
[ 4712.004234] ? __cpuidle_text_end+0x4/0x4
[ 4712.004239] kthread+0x282/0x320
[ 4712.004243] ? process_one_work+0x11c0/0x11c0
[ 4712.004246] ? kthread_complete_and_exit+0x30/0x30
[ 4712.004282] ret_from_fork+0x1f/0x30
This patch add the buffer validation for security descriptor that is
stored by malformed SMB2_SET_INFO_HE command. and allocate large
response buffer about SMB2_O_INFO_SECURITY file info class.
Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3")
Cc: stable@vger.kernel.org
Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-17771
Reviewed-by: Hyunchul Lee <hyc.lee@gmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hyunchul Lee [Thu, 28 Jul 2022 14:41:51 +0000 (23:41 +0900)]
ksmbd: prevent out of bound read for SMB2_WRITE
commit
ac60778b87e45576d7bfdbd6f53df902654e6f09 upstream.
OOB read memory can be written to a file,
if DataOffset is 0 and Length is too large
in SMB2_WRITE request of compound request.
To prevent this, when checking the length of
the data area of SMB2_WRITE in smb2_get_data_area_len(),
let the minimum of DataOffset be the size of
SMB2 header + the size of SMB2_WRITE header.
This bug can lead an oops looking something like:
[ 798.008715] BUG: KASAN: slab-out-of-bounds in copy_page_from_iter_atomic+0xd3d/0x14b0
[ 798.008724] Read of size 252 at addr
ffff88800f863e90 by task kworker/0:2/2859
...
[ 798.008754] Call Trace:
[ 798.008756] <TASK>
[ 798.008759] dump_stack_lvl+0x49/0x5f
[ 798.008764] print_report.cold+0x5e/0x5cf
[ 798.008768] ? __filemap_get_folio+0x285/0x6d0
[ 798.008774] ? copy_page_from_iter_atomic+0xd3d/0x14b0
[ 798.008777] kasan_report+0xaa/0x120
[ 798.008781] ? copy_page_from_iter_atomic+0xd3d/0x14b0
[ 798.008784] kasan_check_range+0x100/0x1e0
[ 798.008788] memcpy+0x24/0x60
[ 798.008792] copy_page_from_iter_atomic+0xd3d/0x14b0
[ 798.008795] ? pagecache_get_page+0x53/0x160
[ 798.008799] ? iov_iter_get_pages_alloc+0x1590/0x1590
[ 798.008803] ? ext4_write_begin+0xfc0/0xfc0
[ 798.008807] ? current_time+0x72/0x210
[ 798.008811] generic_perform_write+0x2c8/0x530
[ 798.008816] ? filemap_fdatawrite_wbc+0x180/0x180
[ 798.008820] ? down_write+0xb4/0x120
[ 798.008824] ? down_write_killable+0x130/0x130
[ 798.008829] ext4_buffered_write_iter+0x137/0x2c0
[ 798.008833] ext4_file_write_iter+0x40b/0x1490
[ 798.008837] ? __fsnotify_parent+0x275/0xb20
[ 798.008842] ? __fsnotify_update_child_dentry_flags+0x2c0/0x2c0
[ 798.008846] ? ext4_buffered_write_iter+0x2c0/0x2c0
[ 798.008851] __kernel_write+0x3a1/0xa70
[ 798.008855] ? __x64_sys_preadv2+0x160/0x160
[ 798.008860] ? security_file_permission+0x4a/0xa0
[ 798.008865] kernel_write+0xbb/0x360
[ 798.008869] ksmbd_vfs_write+0x27e/0xb90 [ksmbd]
[ 798.008881] ? ksmbd_vfs_read+0x830/0x830 [ksmbd]
[ 798.008892] ? _raw_read_unlock+0x2a/0x50
[ 798.008896] smb2_write+0xb45/0x14e0 [ksmbd]
[ 798.008909] ? __kasan_check_write+0x14/0x20
[ 798.008912] ? _raw_spin_lock_bh+0xd0/0xe0
[ 798.008916] ? smb2_read+0x15e0/0x15e0 [ksmbd]
[ 798.008927] ? memcpy+0x4e/0x60
[ 798.008931] ? _raw_spin_unlock+0x19/0x30
[ 798.008934] ? ksmbd_smb2_check_message+0x16af/0x2350 [ksmbd]
[ 798.008946] ? _raw_spin_lock_bh+0xe0/0xe0
[ 798.008950] handle_ksmbd_work+0x30e/0x1020 [ksmbd]
[ 798.008962] process_one_work+0x778/0x11c0
[ 798.008966] ? _raw_spin_lock_irq+0x8e/0xe0
[ 798.008970] worker_thread+0x544/0x1180
[ 798.008973] ? __cpuidle_text_end+0x4/0x4
[ 798.008977] kthread+0x282/0x320
[ 798.008982] ? process_one_work+0x11c0/0x11c0
[ 798.008985] ? kthread_complete_and_exit+0x30/0x30
[ 798.008989] ret_from_fork+0x1f/0x30
[ 798.008995] </TASK>
Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3")
Cc: stable@vger.kernel.org
Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-17817
Signed-off-by: Hyunchul Lee <hyc.lee@gmail.com>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jamal Hadi Salim [Sun, 14 Aug 2022 11:27:58 +0000 (11:27 +0000)]
net_sched: cls_route: disallow handle of 0
commit
02799571714dc5dd6948824b9d080b44a295f695 upstream.
Follows up on:
https://lore.kernel.org/all/
20220809170518.164662-1-cascardo@canonical.com/
handle of 0 implies from/to of universe realm which is not very
sensible.
Lets see what this patch will do:
$sudo tc qdisc add dev $DEV root handle 1:0 prio
//lets manufacture a way to insert handle of 0
$sudo tc filter add dev $DEV parent 1:0 protocol ip prio 100 \
route to 0 from 0 classid 1:10 action ok
//gets rejected...
Error: handle of 0 is not valid.
We have an error talking to the kernel, -1
//lets create a legit entry..
sudo tc filter add dev $DEV parent 1:0 protocol ip prio 100 route from 10 \
classid 1:10 action ok
//what did the kernel insert?
$sudo tc filter ls dev $DEV parent 1:0
filter protocol ip pref 100 route chain 0
filter protocol ip pref 100 route chain 0 fh 0x000a8000 flowid 1:10 from 10
action order 1: gact action pass
random type none pass val 0
index 1 ref 1 bind 1
//Lets try to replace that legit entry with a handle of 0
$ sudo tc filter replace dev $DEV parent 1:0 protocol ip prio 100 \
handle 0x000a8000 route to 0 from 0 classid 1:10 action drop
Error: Replacing with handle of 0 is invalid.
We have an error talking to the kernel, -1
And last, lets run Cascardo's POC:
$ ./poc
0
0
-22
-22
-22
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jens Wiklander [Thu, 18 Aug 2022 11:08:59 +0000 (13:08 +0200)]
tee: add overflow check in register_shm_helper()
commit
573ae4f13f630d6660008f1974c0a8a29c30e18a upstream.
With special lengths supplied by user space, register_shm_helper() has
an integer overflow when calculating the number of pages covered by a
supplied user space memory region.
This causes internal_get_user_pages_fast() a helper function of
pin_user_pages_fast() to do a NULL pointer dereference:
Unable to handle kernel NULL pointer dereference at virtual address
0000000000000010
Modules linked in:
CPU: 1 PID: 173 Comm: optee_example_a Not tainted 5.19.0 #11
Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
pc : internal_get_user_pages_fast+0x474/0xa80
Call trace:
internal_get_user_pages_fast+0x474/0xa80
pin_user_pages_fast+0x24/0x4c
register_shm_helper+0x194/0x330
tee_shm_register_user_buf+0x78/0x120
tee_ioctl+0xd0/0x11a0
__arm64_sys_ioctl+0xa8/0xec
invoke_syscall+0x48/0x114
Fix this by adding an an explicit call to access_ok() in
tee_shm_register_user_buf() to catch an invalid user space address
early.
Fixes: 033ddf12bcf5 ("tee: add register user memory")
Cc: stable@vger.kernel.org
Reported-by: Nimish Mishra <neelam.nimish@gmail.com>
Reported-by: Anirban Chakraborty <ch.anirban00727@gmail.com>
Reported-by: Debdeep Mukhopadhyay <debdeep.mukhopadhyay@gmail.com>
Suggested-by: Jerome Forissier <jerome.forissier@linaro.org>
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jens Axboe [Thu, 23 Jun 2022 17:06:43 +0000 (11:06 -0600)]
io_uring: use original request task for inflight tracking
commit
386e4fb6962b9f248a80f8870aea0870ca603e89 upstream.
In prior kernels, we did file assignment always at prep time. This meant
that req->task == current. But after deferring that assignment and then
pushing the inflight tracking back in, we've got the inflight tracking
using current when it should in fact now be using req->task.
Fixup that error introduced by adding the inflight tracking back after
file assignments got modifed.
Fixes: 9cae36a094e7 ("io_uring: reinstate the inflight tracking")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Wed, 17 Aug 2022 12:24:32 +0000 (14:24 +0200)]
Linux 5.15.61
Link: https://lore.kernel.org/r/20220815180337.130757997@linuxfoundation.org
Tested-by: Shuah Khan <skhan@linuxfoundation.org>
Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>
Link: https://lore.kernel.org/r/20220816124544.577833376@linuxfoundation.org
Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Ron Economos <re@w6rz.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
James Smart [Fri, 3 Jun 2022 17:43:23 +0000 (10:43 -0700)]
scsi: lpfc: Resolve some cleanup issues following SLI path refactoring
commit
e27f05147bff21408c1b8410ad8e90cd286e7952 upstream.
Following refactoring and consolidation in SLI processing, fix up some
minor issues related to SLI path:
- Correct the setting of LPFC_EXCHANGE_BUSY flag in response IOCB.
- Fix some typographical errors.
- Fix duplicate log messages.
Link: https://lore.kernel.org/r/20220603174329.63777-4-jsmart2021@gmail.com
Fixes: 1b64aa9eae28 ("scsi: lpfc: SLI path split: Refactor fast and slow paths to native SLI4")
Cc: <stable@vger.kernel.org> # v5.18
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
James Smart [Fri, 6 May 2022 03:55:08 +0000 (20:55 -0700)]
scsi: lpfc: Fix element offset in __lpfc_sli_release_iocbq_s4()
commit
84c6f99e39074d45f75986e42ca28e27c140fd0d upstream.
The prior commit that moved from iocb elements to explicit wqe elements
missed a name change.
Correct __lpfc_sli_release_iocbq_s4() to reference wqe rather than iocb.
Link: https://lore.kernel.org/r/20220506035519.50908-2-jsmart2021@gmail.com
Fixes: a680a9298e7b ("scsi: lpfc: SLI path split: Refactor lpfc_iocbq")
Co-developed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
James Smart [Wed, 23 Mar 2022 20:55:45 +0000 (13:55 -0700)]
scsi: lpfc: Fix locking for lpfc_sli_iocbq_lookup()
commit
c26bd6602e1d348bfa754dc55e5608c922dd2801 upstream.
The rules changed for lpfc_sli_iocbq_lookup() vs locking. Prior, the
routine properly took out the lock. In newly refactored code, the locks
must be held when calling the routine.
Fix lpfc_sli_process_sol_iocb() to take the locks before calling the
routine.
Fix lpfc_sli_handle_fast_ring_event() to not release the locks to call the
routine.
Link: https://lore.kernel.org/r/20220323205545.81814-3-jsmart2021@gmail.com
Fixes: 1b64aa9eae28 ("scsi: lpfc: SLI path split: Refactor fast and slow paths to native SLI4")
Co-developed-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Maxime Ripard [Fri, 17 Sep 2021 18:09:25 +0000 (20:09 +0200)]
drm/bridge: Move devm_drm_of_get_bridge to bridge/panel.c
commit
d4ae66f10c8b9959dce1766d9a87070e567236eb upstream.
By depending on devm_drm_panel_bridge_add(), devm_drm_of_get_bridge()
introduces a circular dependency between the modules drm (where
devm_drm_of_get_bridge() ends up) and drm_kms_helper (where
devm_drm_panel_bridge_add() is).
Fix this by moving devm_drm_of_get_bridge() to bridge/panel.c and thus
drm_kms_helper.
Fixes: 87ea95808d53 ("drm/bridge: Add a function to abstract away panels")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20210917180925.2602266-1-maxime@cerno.tech
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Luiz Augusto von Dentz [Mon, 1 Aug 2022 20:52:07 +0000 (13:52 -0700)]
Bluetooth: L2CAP: Fix l2cap_global_chan_by_psm regression
commit
332f1795ca202489c665a75e62e18ff6284de077 upstream.
The patch
d0be8347c623: "Bluetooth: L2CAP: Fix use-after-free caused
by l2cap_chan_put" from Jul 21, 2022, leads to the following Smatch
static checker warning:
net/bluetooth/l2cap_core.c:1977 l2cap_global_chan_by_psm()
error: we previously assumed 'c' could be null (see line 1996)
Fixes: d0be8347c623 ("Bluetooth: L2CAP: Fix use-after-free caused by l2cap_chan_put")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jose Alonso [Mon, 8 Aug 2022 11:35:04 +0000 (08:35 -0300)]
Revert "net: usb: ax88179_178a needs FLAG_SEND_ZLP"
commit
6fd2c17fb6e02a8c0ab51df1cfec82ce96b8e83d upstream.
This reverts commit
36a15e1cb134c0395261ba1940762703f778438c.
The usage of FLAG_SEND_ZLP causes problems to other firmware/hardware
versions that have no issues.
The FLAG_SEND_ZLP is not safe to use in this context.
See:
https://patchwork.ozlabs.org/project/netdev/patch/
1270599787.8900.8.camel@Linuxdev4-laptop/#118378
The original problem needs another way to solve.
Fixes: 36a15e1cb134 ("net: usb: ax88179_178a needs FLAG_SEND_ZLP")
Cc: stable@vger.kernel.org
Reported-by: Ronald Wahl <ronald.wahl@raritan.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216327
Link: https://bugs.archlinux.org/task/75491
Signed-off-by: Jose Alonso <joalonsof@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pavel Begunkov [Thu, 4 Aug 2022 14:13:46 +0000 (15:13 +0100)]
io_uring: mem-account pbuf buckets
commit
cc18cc5e82033d406f54144ad6f8092206004684 upstream.
Potentially, someone may create as many pbuf bucket as there are indexes
in an xarray without any other restrictions bounding our memory usage,
put memory needed for the buckets under memory accounting.
Cc: <stable@vger.kernel.org>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/d34c452e45793e978d26e2606211ec9070d329ea.1659622312.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Miaoqian Lin [Wed, 11 May 2022 05:40:51 +0000 (09:40 +0400)]
drm/meson: Fix refcount leak in meson_encoder_hdmi_init
commit
7381076809586528e2a812a709e2758916318a99 upstream.
of_find_device_by_node() takes reference, we should use put_device()
to release it when not need anymore.
Add missing put_device() in error path to avoid refcount
leak.
Fixes: 0af5e0b41110 ("drm/meson: encoder_hdmi: switch to bridge DRM_BRIDGE_ATTACH_NO_CONNECTOR")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Reviewed-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Neil Armstrong <narmstrong@baylibre.com>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220511054052.51981-1-linmq006@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Rob Clark [Fri, 4 Mar 2022 20:21:45 +0000 (12:21 -0800)]
drm/msm: Fix dirtyfb refcounting
commit
9225b337072a10bf9b09df8bf281437488dd8a26 upstream.
refcount_t complains about 0->1 transitions, which isn't *quite* what we
wanted. So use dirtyfb==1 to mean that the fb is not connected to any
output that requires dirtyfb flushing, so that we can keep the underflow
and overflow checking.
Fixes: 9e4dde28e9cd ("drm/msm: Avoid dirtyfb stalls on video mode displays (v2)")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Link: https://lore.kernel.org/r/20220304202146.845566-1-robdclark@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Kees Cook [Tue, 25 Jan 2022 22:00:37 +0000 (14:00 -0800)]
tracing/perf: Avoid -Warray-bounds warning for __rel_loc macro
commit
c6d777acdf8f62d4ebaef0e5c6cd8fedbd6e8546 upstream.
As done for trace_events.h, also fix the __rel_loc macro in perf.h,
which silences the -Warray-bounds warning:
In file included from ./include/linux/string.h:253,
from ./include/linux/bitmap.h:11,
from ./include/linux/cpumask.h:12,
from ./include/linux/mm_types_task.h:14,
from ./include/linux/mm_types.h:5,
from ./include/linux/buildid.h:5,
from ./include/linux/module.h:14,
from samples/trace_events/trace-events-sample.c:2:
In function '__fortify_strcpy',
inlined from 'perf_trace_foo_rel_loc' at samples/trace_events/./trace-events-sample.h:519:1:
./include/linux/fortify-string.h:47:33: warning: '__builtin_strcpy' offset 12 is out of the bounds [
0, 4] [-Warray-bounds]
47 | #define __underlying_strcpy __builtin_strcpy
| ^
./include/linux/fortify-string.h:445:24: note: in expansion of macro '__underlying_strcpy'
445 | return __underlying_strcpy(p, q);
| ^~~~~~~~~~~~~~~~~~~
Also make __data struct member a proper flexible array to avoid future
problems.
Link: https://lkml.kernel.org/r/20220125220037.2738923-1-keescook@chromium.org
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Fixes: 55de2c0b5610c ("tracing: Add '__rel_loc' using trace event macros")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tom Rix [Wed, 29 Jun 2022 20:01:01 +0000 (16:01 -0400)]
drm/vc4: change vc4_dma_range_matches from a global to static
commit
63569d90863ff26c8b10c8971d1271c17a45224b upstream.
sparse reports
drivers/gpu/drm/vc4/vc4_drv.c:270:27: warning: symbol 'vc4_dma_range_matches' was not declared. Should it be static?
vc4_dma_range_matches is only used in vc4_drv.c, so it's storage class specifier
should be static.
Fixes: da8e393e23ef ("drm/vc4: drv: Adopt the dma configuration from the HVS or V3D component")
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20220629200101.498138-1-trix@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Lukas Wunner [Mon, 20 Jun 2022 11:04:50 +0000 (13:04 +0200)]
net: phy: smsc: Disable Energy Detect Power-Down in interrupt mode
commit
2642cc6c3bbe0900ba15bab078fd15ad8baccbc5 upstream.
Simon reports that if two LAN9514 USB adapters are directly connected
without an intermediate switch, the link fails to come up and link LEDs
remain dark. The issue was introduced by commit
1ce8b37241ed ("usbnet:
smsc95xx: Forward PHY interrupts to PHY driver to avoid polling").
The PHY suffers from a known erratum wherein link detection becomes
unreliable if Energy Detect Power-Down is used. In poll mode, the
driver works around the erratum by briefly disabling EDPD for 640 msec
to detect a neighbor, then re-enabling it to save power.
In interrupt mode, no interrupt is signaled if EDPD is used by both link
partners, so it must not be enabled at all.
We'll recoup the power savings by enabling SUSPEND1 mode on affected
LAN95xx chips in a forthcoming commit.
Fixes: 1ce8b37241ed ("usbnet: smsc95xx: Forward PHY interrupts to PHY driver to avoid polling")
Reported-by: Simon Han <z.han@kunbus.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Link: https://lore.kernel.org/r/439a3f3168c2f9d44b5fd9bb8d2b551711316be6.1655714438.git.lukas@wunner.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Marek Vasut [Thu, 28 Apr 2022 21:31:32 +0000 (23:31 +0200)]
drm/bridge: tc358767: Fix (e)DP bridge endpoint parsing in dedicated function
commit
9030a9e571b3ba250d3d450a98310e3c74ecaff4 upstream.
Per toshiba,tc358767.yaml DT binding document, port@2 the output (e)DP
port is optional. In case this port is not described in DT, the bridge
driver operates in DPI-to-DP mode. The drm_of_find_panel_or_bridge()
call in tc_probe_edp_bridge_endpoint() returns -ENODEV in case port@2
is not present in DT and this specific return value is incorrectly
propagated outside of tc_probe_edp_bridge_endpoint() function. All
other error values must be propagated and are propagated correctly.
Return 0 in case the port@2 is missing instead, that reinstates the
original behavior before the commit this patch fixes.
Fixes: 8478095a8c4b ("drm/bridge: tc358767: Move (e)DP bridge endpoint parsing into dedicated function")
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Jonas Karlman <jonas@kwiboo.se>
Cc: Laurent Pinchart <Laurent.pinchart@ideasonboard.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Marek Vasut <marex@denx.de>
Cc: Maxime Ripard <maxime@cerno.tech>
Cc: Neil Armstrong <narmstrong@baylibre.com>
Cc: Robert Foss <robert.foss@linaro.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20220428213132.447890-1-marex@denx.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Alexander Gordeev [Sat, 6 Aug 2022 07:29:46 +0000 (09:29 +0200)]
Revert "s390/smp: enforce lowcore protection on CPU restart"
commit
953503751a426413ea8aee2299ae3ee971b70d9b upstream.
This reverts commit
6f5c672d17f583b081e283927f5040f726c54598.
This breaks normal crash dump when CPU0 is offline.
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Mon, 27 Jun 2022 14:35:59 +0000 (16:35 +0200)]
Revert "mwifiex: fix sleep in atomic context bugs caused by dev_coredumpv"
commit
5f8954e099b8ae96e7de1bb95950e00c85bedd40 upstream.
This reverts commit
a52ed4866d2b90dd5e4ae9dabd453f3ed8fa3cbc as it
causes build problems in linux-next. It needs to be reintroduced in a
way that can allow the api to evolve and not require a "flag day" to
catch all users.
Link: https://lore.kernel.org/r/20220623160723.7a44b573@canb.auug.org.au
Cc: Duoming Zhou <duoming@zju.edu.cn>
Cc: Brian Norris <briannorris@chromium.org>
Cc: Johannes Berg <johannes@sipsolutions.net>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jason A. Donenfeld [Mon, 20 Jun 2022 07:52:43 +0000 (09:52 +0200)]
crypto: lib/blake2s - reduce stack frame usage in self test
commit
d6c14da474bf260d73953fbf7992c98d9112aec7 upstream.
Using 3 blocks here doesn't give us much more than using 2, and it
causes a stack frame size warning on certain compiler/config/arch
combinations:
lib/crypto/blake2s-selftest.c: In function 'blake2s_selftest':
>> lib/crypto/blake2s-selftest.c:632:1: warning: the frame size of 1088 bytes is larger than 1024 bytes [-Wframe-larger-than=]
632 | }
| ^
So this patch just reduces the block from 3 to 2, which makes the
warning go away.
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/linux-crypto/202206200851.gE3MHCgd-lkp@intel.com
Fixes: 2d16803c562e ("crypto: blake2s - remove shash module")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric Dumazet [Tue, 14 Jun 2022 17:17:33 +0000 (10:17 -0700)]
tcp: fix over estimation in sk_forced_mem_schedule()
commit
c4ee118561a0f74442439b7b5b486db1ac1ddfeb upstream.
sk_forced_mem_schedule() has a bug similar to ones fixed
in commit
7c80b038d23e ("net: fix sk_wmem_schedule() and
sk_rmem_schedule() errors")
While this bug has little chance to trigger in old kernels,
we need to fix it before the following patch.
Fixes: d83769a580f1 ("tcp: fix possible deadlock in tcp_send_fin()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Reviewed-by: Wei Wang <weiwan@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Ahmed Zaki [Sat, 2 Oct 2021 14:53:29 +0000 (08:53 -0600)]
mac80211: fix a memory leak where sta_info is not freed
commit
8f9dcc29566626f683843ccac6113a12208315ca upstream.
The following is from a system that went OOM due to a memory leak:
wlan0: Allocated STA 74:83:c2:64:0b:87
wlan0: Allocated STA 74:83:c2:64:0b:87
wlan0: IBSS finish 74:83:c2:64:0b:87 (---from ieee80211_ibss_add_sta)
wlan0: Adding new IBSS station 74:83:c2:64:0b:87
wlan0: moving STA 74:83:c2:64:0b:87 to state 2
wlan0: moving STA 74:83:c2:64:0b:87 to state 3
wlan0: Inserted STA 74:83:c2:64:0b:87
wlan0: IBSS finish 74:83:c2:64:0b:87 (---from ieee80211_ibss_work)
wlan0: Adding new IBSS station 74:83:c2:64:0b:87
wlan0: moving STA 74:83:c2:64:0b:87 to state 2
wlan0: moving STA 74:83:c2:64:0b:87 to state 3
.
.
wlan0: expiring inactive not authorized STA 74:83:c2:64:0b:87
wlan0: moving STA 74:83:c2:64:0b:87 to state 2
wlan0: moving STA 74:83:c2:64:0b:87 to state 1
wlan0: Removed STA 74:83:c2:64:0b:87
wlan0: Destroyed STA 74:83:c2:64:0b:87
The ieee80211_ibss_finish_sta() is called twice on the same STA from 2
different locations. On the second attempt, the allocated STA is not
destroyed creating a kernel memory leak.
This is happening because sta_info_insert_finish() does not call
sta_info_free() the second time when the STA already exists (returns
-EEXIST). Note that the caller sta_info_insert_rcu() assumes STA is
destroyed upon errors.
Same fix is applied to -ENOMEM.
Signed-off-by: Ahmed Zaki <anzaki@gmail.com>
Link: https://lore.kernel.org/r/20211002145329.3125293-1-anzaki@gmail.com
[change the error path label to use the existing code]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Viacheslav Sablin <sablin@ispras.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Thadeu Lima de Souza Cascardo [Tue, 9 Aug 2022 17:05:18 +0000 (14:05 -0300)]
net_sched: cls_route: remove from list when handle is 0
commit
9ad36309e2719a884f946678e0296be10f0bb4c1 upstream.
When a route filter is replaced and the old filter has a 0 handle, the old
one won't be removed from the hashtable, while it will still be freed.
The test was there since before commit
1109c00547fc ("net: sched: RCU
cls_route"), when a new filter was not allocated when there was an old one.
The old filter was reused and the reinserting would only be necessary if an
old filter was replaced. That was still wrong for the same case where the
old handle was 0.
Remove the old filter from the list independently from its handle value.
This fixes CVE-2022-2588, also reported as ZDI-CAN-17440.
Reported-by: Zhenpeng Lin <zplin@u.northwestern.edu>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Reviewed-by: Kamal Mostafa <kamal@canonical.com>
Cc: <stable@vger.kernel.org>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20220809170518.164662-1-cascardo@canonical.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Steven Rostedt (Google) [Sun, 31 Jul 2022 05:59:28 +0000 (01:59 -0400)]
tracing: Use a struct alignof to determine trace event field alignment
commit
4c3d2f9388d36eb28640a220a6f908328442d873 upstream.
alignof() gives an alignment of types as they would be as standalone
variables. But alignment in structures might be different, and when
building the fields of events, the alignment must be the actual
alignment otherwise the field offsets may not match what they actually
are.
This caused trace-cmd to crash, as libtraceevent did not check if the
field offset was bigger than the event. The write_msr and read_msr
events on 32 bit had their fields incorrect, because it had a u64 field
between two ints. alignof(u64) would give 8, but the u64 field was at a
4 byte alignment.
Define a macro as:
ALIGN_STRUCTFIELD(type) ((int)(offsetof(struct {char a; type b;}, b)))
which gives the actual alignment of types in a structure.
Link: https://lkml.kernel.org/r/20220731015928.7ab3a154@rorschach.local.home
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: stable@vger.kernel.org
Fixes: 04ae87a52074e ("ftrace: Rework event_create_dir()")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Christophe Leroy [Tue, 2 Aug 2022 09:02:36 +0000 (11:02 +0200)]
powerpc: Fix eh field when calling lwarx on PPC32
commit
18db466a9a306406dab3b134014d9f6ed642471c upstream.
Commit
9401f4e46cf6 ("powerpc: Use lwarx/ldarx directly instead of
PPC_LWARX/LDARX macros") properly handled the eh field of lwarx
in asm/bitops.h but failed to clear it for PPC32 in
asm/simple_spinlock.h
So, do as in arch_atomic_try_cmpxchg_lock(), set it to 1 if PPC64
but set it to 0 if PPC32. For that use IS_ENABLED(CONFIG_PPC64) which
returns 1 when CONFIG_PPC64 is set and 0 otherwise.
Fixes: 9401f4e46cf6 ("powerpc: Use lwarx/ldarx directly instead of PPC_LWARX/LDARX macros")
Cc: stable@vger.kernel.org # v5.15+
Reported-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Tested-by: Pali Rohár <pali@kernel.org>
Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org>
[mpe: Use symbolic names, use 'n' constraint per Segher]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/a1176e19e627dd6a1b8d24c6c457a8ab874b7d12.1659430931.git.christophe.leroy@csgroup.eu
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
SeongJae Park [Fri, 15 Jul 2022 22:51:08 +0000 (22:51 +0000)]
xen-blkfront: Apply 'feature_persistent' parameter when connect
commit
402c43ea6b34a1b371ffeed9adf907402569eaf5 upstream.
In some use cases[1], the backend is created while the frontend doesn't
support the persistent grants feature, but later the frontend can be
changed to support the feature and reconnect. In the past, 'blkback'
enabled the persistent grants feature since it unconditionally checked
if frontend supports the persistent grants feature for every connect
('connect_ring()') and decided whether it should use persistent grans or
not.
However, commit
aac8a70db24b ("xen-blkback: add a parameter for
disabling of persistent grants") has mistakenly changed the behavior.
It made the frontend feature support check to not be repeated once it
shown the 'feature_persistent' as 'false', or the frontend doesn't
support persistent grants.
Similar behavioral change has made on 'blkfront' by commit
74a852479c68
("xen-blkfront: add a parameter for disabling of persistent grants").
This commit changes the behavior of the parameter to make effect for
every connect, so that the previous behavior of 'blkfront' can be
restored.
[1] https://lore.kernel.org/xen-devel/CAJwUmVB6H3iTs-C+U=v-pwJB7-_ZRHPxHzKRJZ22xEPW7z8a=g@mail.gmail.com/
Fixes: 74a852479c68 ("xen-blkfront: add a parameter for disabling of persistent grants")
Cc: <stable@vger.kernel.org> # 5.10.x
Signed-off-by: SeongJae Park <sj@kernel.org>
Reviewed-by: Maximilian Heyne <mheyne@amazon.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20220715225108.193398-4-sj@kernel.org
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Maximilian Heyne [Fri, 15 Jul 2022 22:51:07 +0000 (22:51 +0000)]
xen-blkback: Apply 'feature_persistent' parameter when connect
commit
e94c6101e151b019b8babc518ac2a6ada644a5a1 upstream.
In some use cases[1], the backend is created while the frontend doesn't
support the persistent grants feature, but later the frontend can be
changed to support the feature and reconnect. In the past, 'blkback'
enabled the persistent grants feature since it unconditionally checked
if frontend supports the persistent grants feature for every connect
('connect_ring()') and decided whether it should use persistent grans or
not.
However, commit
aac8a70db24b ("xen-blkback: add a parameter for
disabling of persistent grants") has mistakenly changed the behavior.
It made the frontend feature support check to not be repeated once it
shown the 'feature_persistent' as 'false', or the frontend doesn't
support persistent grants.
This commit changes the behavior of the parameter to make effect for
every connect, so that the previous workflow can work again as expected.
[1] https://lore.kernel.org/xen-devel/CAJwUmVB6H3iTs-C+U=v-pwJB7-_ZRHPxHzKRJZ22xEPW7z8a=g@mail.gmail.com/
Reported-by: Andrii Chepurnyi <andrii.chepurnyi82@gmail.com>
Fixes: aac8a70db24b ("xen-blkback: add a parameter for disabling of persistent grants")
Cc: <stable@vger.kernel.org> # 5.10.x
Signed-off-by: Maximilian Heyne <mheyne@amazon.de>
Signed-off-by: SeongJae Park <sj@kernel.org>
Reviewed-by: Maximilian Heyne <mheyne@amazon.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20220715225108.193398-3-sj@kernel.org
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
SeongJae Park [Fri, 15 Jul 2022 22:51:06 +0000 (22:51 +0000)]
xen-blkback: fix persistent grants negotiation
commit
fc9be616bb8f3ed9cf560308f86904f5c06be205 upstream.
Persistent grants feature can be used only when both backend and the
frontend supports the feature. The feature was always supported by
'blkback', but commit
aac8a70db24b ("xen-blkback: add a parameter for
disabling of persistent grants") has introduced a parameter for
disabling it runtime.
To avoid the parameter be updated while being used by 'blkback', the
commit caches the parameter into 'vbd->feature_gnt_persistent' in
'xen_vbd_create()', and then check if the guest also supports the
feature and finally updates the field in 'connect_ring()'.
However, 'connect_ring()' could be called before 'xen_vbd_create()', so
later execution of 'xen_vbd_create()' can wrongly overwrite 'true' to
'vbd->feature_gnt_persistent'. As a result, 'blkback' could try to use
'persistent grants' feature even if the guest doesn't support the
feature.
This commit fixes the issue by moving the parameter value caching to
'xen_blkif_alloc()', which allocates the 'blkif'. Because the struct
embeds 'vbd' object, which will be used by 'connect_ring()' later, this
should be called before 'connect_ring()' and therefore this should be
the right and safe place to do the caching.
Fixes: aac8a70db24b ("xen-blkback: add a parameter for disabling of persistent grants")
Cc: <stable@vger.kernel.org> # 5.10.x
Signed-off-by: Maximilian Heyne <mheyne@amazon.de>
Signed-off-by: SeongJae Park <sj@kernel.org>
Reviewed-by: Maximilian Heyne <mheyne@amazon.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20220715225108.193398-2-sj@kernel.org
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Huacai Chen [Mon, 11 Jul 2022 01:17:38 +0000 (09:17 +0800)]
tpm: eventlog: Fix section mismatch for DEBUG_SECTION_MISMATCH
[ Upstream commit
bed4593645366ad7362a3aa7bc0d100d8d8236a8 ]
If DEBUG_SECTION_MISMATCH enabled, __calc_tpm2_event_size() will not be
inlined, this cause section mismatch like this:
WARNING: modpost: vmlinux.o(.text.unlikely+0xe30c): Section mismatch in reference from the variable L0 to the function .init.text:early_ioremap()
The function L0() references
the function __init early_memremap().
This is often because L0 lacks a __init
annotation or the annotation of early_ioremap is wrong.
Fix it by using __always_inline instead of inline for the called-once
function __calc_tpm2_event_size().
Fixes: 44038bc514a2 ("tpm: Abstract crypto agile event size calculations")
Cc: stable@vger.kernel.org # v5.3
Reported-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Tianjia Zhang [Tue, 28 Jun 2022 03:37:20 +0000 (11:37 +0800)]
KEYS: asymmetric: enforce SM2 signature use pkey algo
[ Upstream commit
0815291a8fd66cdcf7db1445d4d99b0d16065829 ]
The signature verification of SM2 needs to add the Za value and
recalculate sig->digest, which requires the detection of the pkey_algo
in public_key_verify_signature(). As Eric Biggers said, the pkey_algo
field in sig is attacker-controlled and should be use pkey->pkey_algo
instead of sig->pkey_algo, and secondly, if sig->pkey_algo is NULL, it
will also cause signature verification failure.
The software_key_determine_akcipher() already forces the algorithms
are matched, so the SM3 algorithm is enforced in the SM2 signature,
although this has been checked, we still avoid using any algorithm
information in the signature as input.
Fixes: 215525639631 ("X.509: support OSCCA SM2-with-SM3 certificate verification")
Reported-by: Eric Biggers <ebiggers@google.com>
Cc: stable@vger.kernel.org # v5.10+
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Jan Kara [Tue, 12 Jul 2022 10:54:24 +0000 (12:54 +0200)]
ext4: fix race when reusing xattr blocks
[ Upstream commit
65f8b80053a1b2fd602daa6814e62d6fa90e5e9b ]
When ext4_xattr_block_set() decides to remove xattr block the following
race can happen:
CPU1 CPU2
ext4_xattr_block_set() ext4_xattr_release_block()
new_bh = ext4_xattr_block_cache_find()
lock_buffer(bh);
ref = le32_to_cpu(BHDR(bh)->h_refcount);
if (ref == 1) {
...
mb_cache_entry_delete();
unlock_buffer(bh);
ext4_free_blocks();
...
ext4_forget(..., bh, ...);
jbd2_journal_revoke(..., bh);
ext4_journal_get_write_access(..., new_bh, ...)
do_get_write_access()
jbd2_journal_cancel_revoke(..., new_bh);
Later the code in ext4_xattr_block_set() finds out the block got freed
and cancels reusal of the block but the revoke stays canceled and so in
case of block reuse and journal replay the filesystem can get corrupted.
If the race works out slightly differently, we can also hit assertions
in the jbd2 code.
Fix the problem by making sure that once matching mbcache entry is
found, code dropping the last xattr block reference (or trying to modify
xattr block in place) waits until the mbcache entry reference is
dropped. This way code trying to reuse xattr block is protected from
someone trying to drop the last reference to xattr block.
Reported-and-tested-by: Ritesh Harjani <ritesh.list@gmail.com>
CC: stable@vger.kernel.org
Fixes: 82939d7999df ("ext4: convert to mbcache2")
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220712105436.32204-5-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Jan Kara [Tue, 12 Jul 2022 10:54:23 +0000 (12:54 +0200)]
ext4: unindent codeblock in ext4_xattr_block_set()
[ Upstream commit
fd48e9acdf26d0cbd80051de07d4a735d05d29b2 ]
Remove unnecessary else (and thus indentation level) from a code block
in ext4_xattr_block_set(). It will also make following code changes
easier. No functional changes.
CC: stable@vger.kernel.org
Fixes: 82939d7999df ("ext4: convert to mbcache2")
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220712105436.32204-4-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Shuqi Zhang [Wed, 25 May 2022 03:01:20 +0000 (11:01 +0800)]
ext4: use kmemdup() to replace kmalloc + memcpy
[ Upstream commit
4efd9f0d120c55b08852ee5605dbb02a77089a5d ]
Replace kmalloc + memcpy with kmemdup()
Signed-off-by: Shuqi Zhang <zhangshuqi3@huawei.com>
Reviewed-by: Ritesh Harjani <ritesh.list@gmail.com>
Link: https://lore.kernel.org/r/20220525030120.803330-1-zhangshuqi3@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Jan Kara [Tue, 12 Jul 2022 10:54:22 +0000 (12:54 +0200)]
ext4: remove EA inode entry from mbcache on inode eviction
[ Upstream commit
6bc0d63dad7f9f54d381925ee855b402f652fa39 ]
Currently we remove EA inode from mbcache as soon as its xattr refcount
drops to zero. However there can be pending attempts to reuse the inode
and thus refcount handling code has to handle the situation when
refcount increases from zero anyway. So save some work and just keep EA
inode in mbcache until it is getting evicted. At that moment we are sure
following iget() of EA inode will fail anyway (or wait for eviction to
finish and load things from the disk again) and so removing mbcache
entry at that moment is fine and simplifies the code a bit.
CC: stable@vger.kernel.org
Fixes: 82939d7999df ("ext4: convert to mbcache2")
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220712105436.32204-3-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Lukas Czerner [Mon, 4 Jul 2022 14:27:21 +0000 (16:27 +0200)]
ext4: make sure ext4_append() always allocates new block
[ Upstream commit
b8a04fe77ef1360fbf73c80fddbdfeaa9407ed1b ]
ext4_append() must always allocate a new block, otherwise we run the
risk of overwriting existing directory block corrupting the directory
tree in the process resulting in all manner of problems later on.
Add a sanity check to see if the logical block is already allocated and
error out if it is.
Cc: stable@kernel.org
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Link: https://lore.kernel.org/r/20220704142721.157985-2-lczerner@redhat.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Lukas Czerner [Mon, 4 Jul 2022 14:27:20 +0000 (16:27 +0200)]
ext4: check if directory block is within i_size
[ Upstream commit
65f8ea4cd57dbd46ea13b41dc8bac03176b04233 ]
Currently ext4 directory handling code implicitly assumes that the
directory blocks are always within the i_size. In fact ext4_append()
will attempt to allocate next directory block based solely on i_size and
the i_size is then appropriately increased after a successful
allocation.
However, for this to work it requires i_size to be correct. If, for any
reason, the directory inode i_size is corrupted in a way that the
directory tree refers to a valid directory block past i_size, we could
end up corrupting parts of the directory tree structure by overwriting
already used directory blocks when modifying the directory.
Fix it by catching the corruption early in __ext4_read_dirblock().
Addresses Red-Hat-Bugzilla: #
2070205
CVE: CVE-2022-1184
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Cc: stable@vger.kernel.org
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Link: https://lore.kernel.org/r/20220704142721.157985-1-lczerner@redhat.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Ye Bin [Fri, 17 Jun 2022 01:39:35 +0000 (09:39 +0800)]
ext4: fix warning in ext4_iomap_begin as race between bmap and write
[ Upstream commit
51ae846cff568c8c29921b1b28eb2dfbcd4ac12d ]
We got issue as follows:
------------[ cut here ]------------
WARNING: CPU: 3 PID: 9310 at fs/ext4/inode.c:3441 ext4_iomap_begin+0x182/0x5d0
RIP: 0010:ext4_iomap_begin+0x182/0x5d0
RSP: 0018:
ffff88812460fa08 EFLAGS:
00010293
RAX:
ffff88811f168000 RBX:
0000000000000000 RCX:
ffffffff97793c12
RDX:
0000000000000000 RSI:
0000000000000000 RDI:
0000000000000003
RBP:
ffff88812c669160 R08:
ffff88811f168000 R09:
ffffed10258cd20f
R10:
ffff88812c669077 R11:
ffffed10258cd20e R12:
0000000000000001
R13:
00000000000000a4 R14:
000000000000000c R15:
ffff88812c6691ee
FS:
00007fd0d6ff3740(0000) GS:
ffff8883af180000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
00007fd0d6dda290 CR3:
0000000104a62000 CR4:
00000000000006e0
DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
DR3:
0000000000000000 DR6:
00000000fffe0ff0 DR7:
0000000000000400
Call Trace:
iomap_apply+0x119/0x570
iomap_bmap+0x124/0x150
ext4_bmap+0x14f/0x250
bmap+0x55/0x80
do_vfs_ioctl+0x952/0xbd0
__x64_sys_ioctl+0xc6/0x170
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Above issue may happen as follows:
bmap write
bmap
ext4_bmap
iomap_bmap
ext4_iomap_begin
ext4_file_write_iter
ext4_buffered_write_iter
generic_perform_write
ext4_da_write_begin
ext4_da_write_inline_data_begin
ext4_prepare_inline_data
ext4_create_inline_data
ext4_set_inode_flag(inode,
EXT4_INODE_INLINE_DATA);
if (WARN_ON_ONCE(ext4_has_inline_data(inode))) ->trigger bug_on
To solved above issue hold inode lock in ext4_bamp.
Signed-off-by: Ye Bin <yebin10@huawei.com>
Link: https://lore.kernel.org/r/20220617013935.397596-1-yebin10@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Baokun Li [Thu, 16 Jun 2022 02:13:58 +0000 (10:13 +0800)]
ext4: correct the misjudgment in ext4_iget_extra_inode
[ Upstream commit
fd7e672ea98b95b9d4c9dae316639f03c16a749d ]
Use the EXT4_INODE_HAS_XATTR_SPACE macro to more accurately
determine whether the inode have xattr space.
Cc: stable@kernel.org
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220616021358.2504451-5-libaokun1@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Baokun Li [Thu, 16 Jun 2022 02:13:57 +0000 (10:13 +0800)]
ext4: correct max_inline_xattr_value_size computing
[ Upstream commit
c9fd167d57133c5b748d16913c4eabc55e531c73 ]
If the ext4 inode does not have xattr space, 0 is returned in the
get_max_inline_xattr_value_size function. Otherwise, the function returns
a negative value when the inode does not contain EXT4_STATE_XATTR.
Cc: stable@kernel.org
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220616021358.2504451-4-libaokun1@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Baokun Li [Thu, 16 Jun 2022 02:13:56 +0000 (10:13 +0800)]
ext4: fix use-after-free in ext4_xattr_set_entry
[ Upstream commit
67d7d8ad99beccd9fe92d585b87f1760dc9018e3 ]
Hulk Robot reported a issue:
==================================================================
BUG: KASAN: use-after-free in ext4_xattr_set_entry+0x18ab/0x3500
Write of size 4105 at addr
ffff8881675ef5f4 by task syz-executor.0/7092
CPU: 1 PID: 7092 Comm: syz-executor.0 Not tainted 4.19.90-dirty #17
Call Trace:
[...]
memcpy+0x34/0x50 mm/kasan/kasan.c:303
ext4_xattr_set_entry+0x18ab/0x3500 fs/ext4/xattr.c:1747
ext4_xattr_ibody_inline_set+0x86/0x2a0 fs/ext4/xattr.c:2205
ext4_xattr_set_handle+0x940/0x1300 fs/ext4/xattr.c:2386
ext4_xattr_set+0x1da/0x300 fs/ext4/xattr.c:2498
__vfs_setxattr+0x112/0x170 fs/xattr.c:149
__vfs_setxattr_noperm+0x11b/0x2a0 fs/xattr.c:180
__vfs_setxattr_locked+0x17b/0x250 fs/xattr.c:238
vfs_setxattr+0xed/0x270 fs/xattr.c:255
setxattr+0x235/0x330 fs/xattr.c:520
path_setxattr+0x176/0x190 fs/xattr.c:539
__do_sys_lsetxattr fs/xattr.c:561 [inline]
__se_sys_lsetxattr fs/xattr.c:557 [inline]
__x64_sys_lsetxattr+0xc2/0x160 fs/xattr.c:557
do_syscall_64+0xdf/0x530 arch/x86/entry/common.c:298
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x459fe9
RSP: 002b:
00007fa5e54b4c08 EFLAGS:
00000246 ORIG_RAX:
00000000000000bd
RAX:
ffffffffffffffda RBX:
000000000051bf60 RCX:
0000000000459fe9
RDX:
00000000200003c0 RSI:
0000000020000180 RDI:
0000000020000140
RBP:
000000000051bf60 R08:
0000000000000001 R09:
0000000000000000
R10:
0000000000001009 R11:
0000000000000246 R12:
0000000000000000
R13:
00007ffc73c93fc0 R14:
000000000051bf60 R15:
00007fa5e54b4d80
[...]
==================================================================
Above issue may happen as follows:
-------------------------------------
ext4_xattr_set
ext4_xattr_set_handle
ext4_xattr_ibody_find
>> s->end < s->base
>> no EXT4_STATE_XATTR
>> xattr_check_inode is not executed
ext4_xattr_ibody_set
ext4_xattr_set_entry
>> size_t min_offs = s->end - s->base
>> UAF in memcpy
we can easily reproduce this problem with the following commands:
mkfs.ext4 -F /dev/sda
mount -o debug_want_extra_isize=128 /dev/sda /mnt
touch /mnt/file
setfattr -n user.cat -v `seq -s z 4096|tr -d '[:digit:]'` /mnt/file
In ext4_xattr_ibody_find, we have the following assignment logic:
header = IHDR(inode, raw_inode)
= raw_inode + EXT4_GOOD_OLD_INODE_SIZE + i_extra_isize
is->s.base = IFIRST(header)
= header + sizeof(struct ext4_xattr_ibody_header)
is->s.end = raw_inode + s_inode_size
In ext4_xattr_set_entry
min_offs = s->end - s->base
= s_inode_size - EXT4_GOOD_OLD_INODE_SIZE - i_extra_isize -
sizeof(struct ext4_xattr_ibody_header)
last = s->first
free = min_offs - ((void *)last - s->base) - sizeof(__u32)
= s_inode_size - EXT4_GOOD_OLD_INODE_SIZE - i_extra_isize -
sizeof(struct ext4_xattr_ibody_header) - sizeof(__u32)
In the calculation formula, all values except s_inode_size and
i_extra_size are fixed values. When i_extra_size is the maximum value
s_inode_size - EXT4_GOOD_OLD_INODE_SIZE, min_offs is -4 and free is -8.
The value overflows. As a result, the preceding issue is triggered when
memcpy is executed.
Therefore, when finding xattr or setting xattr, check whether
there is space for storing xattr in the inode to resolve this issue.
Cc: stable@kernel.org
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220616021358.2504451-3-libaokun1@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Baokun Li [Thu, 16 Jun 2022 02:13:55 +0000 (10:13 +0800)]
ext4: add EXT4_INODE_HAS_XATTR_SPACE macro in xattr.h
[ Upstream commit
179b14152dcb6a24c3415200603aebca70ff13af ]
When adding an xattr to an inode, we must ensure that the inode_size is
not less than EXT4_GOOD_OLD_INODE_SIZE + extra_isize + pad. Otherwise,
the end position may be greater than the start position, resulting in UAF.
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Link: https://lore.kernel.org/r/20220616021358.2504451-2-libaokun1@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Eric Whitney [Wed, 15 Jun 2022 16:05:30 +0000 (12:05 -0400)]
ext4: fix extent status tree race in writeback error recovery path
[ Upstream commit
7f0d8e1d607c1a4fa9a27362a108921d82230874 ]
A race can occur in the unlikely event ext4 is unable to allocate a
physical cluster for a delayed allocation in a bigalloc file system
during writeback. Failure to allocate a cluster forces error recovery
that includes a call to mpage_release_unused_pages(). That function
removes any corresponding delayed allocated blocks from the extent
status tree. If a new delayed write is in progress on the same cluster
simultaneously, resulting in the addition of an new extent containing
one or more blocks in that cluster to the extent status tree, delayed
block accounting can be thrown off if that delayed write then encounters
a similar cluster allocation failure during future writeback.
Write lock the i_data_sem in mpage_release_unused_pages() to fix this
problem. Ext4's block/cluster accounting code for bigalloc relies on
i_data_sem for mutual exclusion, as is found in the delayed write path,
and the locking in mpage_release_unused_pages() is missing.
Cc: stable@kernel.org
Reported-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: Eric Whitney <enwlinux@gmail.com>
Link: https://lore.kernel.org/r/20220615160530.1928801-1-enwlinux@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Theodore Ts'o [Wed, 29 Jun 2022 04:00:25 +0000 (00:00 -0400)]
ext4: update s_overhead_clusters in the superblock during an on-line resize
[ Upstream commit
de394a86658ffe4e89e5328fd4993abfe41b7435 ]
When doing an online resize, the on-disk superblock on-disk wasn't
updated. This means that when the file system is unmounted and
remounted, and the on-disk overhead value is non-zero, this would
result in the results of statfs(2) to be incorrect.
This was partially fixed by Commits
10b01ee92df5 ("ext4: fix overhead
calculation to account for the reserved gdt blocks"),
85d825dbf489
("ext4: force overhead calculation if the s_overhead_cluster makes no
sense"), and
eb7054212eac ("ext4: update the cached overhead value in
the superblock").
However, since it was too expensive to forcibly recalculate the
overhead for bigalloc file systems at every mount, this didn't fix the
problem for bigalloc file systems. This commit should address the
problem when resizing file systems with the bigalloc feature enabled.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Link: https://lore.kernel.org/r/20220629040026.112371-1-tytso@mit.edu
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Masami Hiramatsu [Tue, 25 Jan 2022 14:19:30 +0000 (23:19 +0900)]
tracing: Avoid -Warray-bounds warning for __rel_loc macro
[ Upstream commit
58c5724ec2cdd72b22107ec5de00d90cc4797796 ]
Since -Warray-bounds checks the destination size from the type of given
pointer, __assign_rel_str() macro gets warned because it passes the
pointer to the 'u32' field instead of 'trace_event_raw_*' data structure.
Pass the data address calculated from the 'trace_event_raw_*' instead of
'u32' __rel_loc field.
Link: https://lkml.kernel.org/r/20220125233154.dac280ed36944c0c2fe6f3ac@kernel.org
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
[ This did not fix the warning, but is still a nice clean up ]
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Masami Hiramatsu [Mon, 22 Nov 2021 09:30:21 +0000 (18:30 +0900)]
tracing: Add '__rel_loc' using trace event macros
[ Upstream commit
55de2c0b5610cba5a5a93c0788031133c457e689 ]
Add '__rel_loc' using trace event macros. These macros are usually
not used in the kernel, except for testing purpose.
This also add "rel_" variant of macros for dynamic_array string,
and bitmask.
Link: https://lkml.kernel.org/r/163757342119.510314.816029622439099016.stgit@devnote2
Cc: Beau Belgrave <beaub@linux.microsoft.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Mikulas Patocka [Sun, 24 Jul 2022 18:33:52 +0000 (14:33 -0400)]
dm raid: fix address sanitizer warning in raid_resume
[ Upstream commit
7dad24db59d2d2803576f2e3645728866a056dab ]
There is a KASAN warning in raid_resume when running the lvm test
lvconvert-raid.sh. The reason for the warning is that mddev->raid_disks
is greater than rs->raid_disks, so the loop touches one entry beyond
the allocated length.
Cc: stable@vger.kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Mikulas Patocka [Sun, 24 Jul 2022 18:31:35 +0000 (14:31 -0400)]
dm raid: fix address sanitizer warning in raid_status
[ Upstream commit
1fbeea217d8f297fe0e0956a1516d14ba97d0396 ]
There is this warning when using a kernel with the address sanitizer
and running this testsuite:
https://gitlab.com/cki-project/kernel-tests/-/tree/main/storage/swraid/scsi_raid
==================================================================
BUG: KASAN: slab-out-of-bounds in raid_status+0x1747/0x2820 [dm_raid]
Read of size 4 at addr
ffff888079d2c7e8 by task lvcreate/13319
CPU: 0 PID: 13319 Comm: lvcreate Not tainted 5.18.0-0.rc3.<snip> #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
Call Trace:
<TASK>
dump_stack_lvl+0x6a/0x9c
print_address_description.constprop.0+0x1f/0x1e0
print_report.cold+0x55/0x244
kasan_report+0xc9/0x100
raid_status+0x1747/0x2820 [dm_raid]
dm_ima_measure_on_table_load+0x4b8/0xca0 [dm_mod]
table_load+0x35c/0x630 [dm_mod]
ctl_ioctl+0x411/0x630 [dm_mod]
dm_ctl_ioctl+0xa/0x10 [dm_mod]
__x64_sys_ioctl+0x12a/0x1a0
do_syscall_64+0x5b/0x80
The warning is caused by reading conf->max_nr_stripes in raid_status. The
code in raid_status reads mddev->private, casts it to struct r5conf and
reads the entry max_nr_stripes.
However, if we have different raid type than 4/5/6, mddev->private
doesn't point to struct r5conf; it may point to struct r0conf, struct
r1conf, struct r10conf or struct mpconf. If we cast a pointer to one
of these structs to struct r5conf, we will be reading invalid memory
and KASAN warns about it.
Fix this bug by reading struct r5conf only if raid type is 4, 5 or 6.
Cc: stable@vger.kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Sean Christopherson [Fri, 22 Jul 2022 22:44:08 +0000 (22:44 +0000)]
KVM: nVMX: Attempt to load PERF_GLOBAL_CTRL on nVMX xfer iff it exists
[ Upstream commit
4496a6f9b45e8cd83343ad86a3984d614e22cf54 ]
Attempt to load PERF_GLOBAL_CTRL during nested VM-Enter/VM-Exit if and
only if the MSR exists (according to the guest vCPU model). KVM has very
misguided handling of VM_{ENTRY,EXIT}_LOAD_IA32_PERF_GLOBAL_CTRL and
attempts to force the nVMX MSR settings to match the vPMU model, i.e. to
hide/expose the control based on whether or not the MSR exists from the
guest's perspective.
KVM's modifications fail to handle the scenario where the vPMU is hidden
from the guest _after_ being exposed to the guest, e.g. by userspace
doing multiple KVM_SET_CPUID2 calls, which is allowed if done before any
KVM_RUN. nested_vmx_pmu_refresh() is called if and only if there's a
recognized vPMU, i.e. KVM will leave the bits in the allow state and then
ultimately reject the MSR load and WARN.
KVM should not force the VMX MSRs in the first place. KVM taking control
of the MSRs was a misguided attempt at mimicking what commit
5f76f6f5ff96
("KVM: nVMX: Do not expose MPX VMX controls when guest MPX disabled",
2018-10-01) did for MPX. However, the MPX commit was a workaround for
another KVM bug and not something that should be imitated (and it should
never been done in the first place).
In other words, KVM's ABI _should_ be that userspace has full control
over the MSRs, at which point triggering the WARN that loading the MSR
must not fail is trivial.
The intent of the WARN is still valid; KVM has consistency checks to
ensure that vmcs12->{guest,host}_ia32_perf_global_ctrl is valid. The
problem is that '0' must be considered a valid value at all times, and so
the simple/obvious solution is to just not actually load the MSR when it
does not exist. It is userspace's responsibility to provide a sane vCPU
model, i.e. KVM is well within its ABI and Intel's VMX architecture to
skip the loads if the MSR does not exist.
Fixes: 03a8871add95 ("KVM: nVMX: Expose load IA32_PERF_GLOBAL_CTRL VM-{Entry,Exit} control")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <
20220722224409.
1336532-5-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Sean Christopherson [Fri, 22 Jul 2022 22:44:07 +0000 (22:44 +0000)]
KVM: VMX: Add helper to check if the guest PMU has PERF_GLOBAL_CTRL
[ Upstream commit
b663f0b5f3d665c261256d1f76e98f077c6e56af ]
Add a helper to check of the guest PMU has PERF_GLOBAL_CTRL, which is
unintuitive _and_ diverges from Intel's architecturally defined behavior.
Even worse, KVM currently implements the check using two different (but
equivalent) checks, _and_ there has been at least one attempt to add a
_third_ flavor.
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <
20220722224409.
1336532-4-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Like Xu [Mon, 9 May 2022 10:22:02 +0000 (18:22 +0800)]
KVM: x86/pmu: Ignore pmu->global_ctrl check if vPMU doesn't support global_ctrl
[ Upstream commit
98defd2e17803263f49548fea930cfc974d505aa ]
MSR_CORE_PERF_GLOBAL_CTRL is introduced as part of Architecture PMU V2,
as indicated by Intel SDM 19.2.2 and the intel_is_valid_msr() function.
So in the absence of global_ctrl support, all PMCs are enabled as AMD does.
Signed-off-by: Like Xu <likexu@tencent.com>
Message-Id: <
20220509102204.62389-1-likexu@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Sean Christopherson [Fri, 22 Jul 2022 22:44:06 +0000 (22:44 +0000)]
KVM: VMX: Mark all PERF_GLOBAL_(OVF)_CTRL bits reserved if there's no vPMU
[ Upstream commit
93255bf92939d948bc86d81c6bb70bb0fecc5db1 ]
Mark all MSR_CORE_PERF_GLOBAL_CTRL and MSR_CORE_PERF_GLOBAL_OVF_CTRL bits
as reserved if there is no guest vPMU. The nVMX VM-Entry consistency
checks do not check for a valid vPMU prior to consuming the masks via
kvm_valid_perf_global_ctrl(), i.e. may incorrectly allow a non-zero mask
to be loaded via VM-Enter or VM-Exit (well, attempted to be loaded, the
actual MSR load will be rejected by intel_is_valid_msr()).
Fixes: f5132b01386b ("KVM: Expose a version 2 architectural PMU to a guests")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <
20220722224409.
1336532-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Like Xu [Mon, 11 Apr 2022 10:19:34 +0000 (18:19 +0800)]
KVM: x86/pmu: Introduce the ctrl_mask value for fixed counter
[ Upstream commit
2c985527dd8d283e786ad7a67e532ef7f6f00fac ]
The mask value of fixed counter control register should be dynamic
adjusted with the number of fixed counters. This patch introduces a
variable that includes the reserved bits of fixed counter control
registers. This is a generic code refactoring.
Co-developed-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Message-Id: <
20220411101946.20262-6-likexu@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Jason A. Donenfeld [Wed, 27 Jul 2022 14:32:18 +0000 (00:32 +1000)]
powerpc/powernv/kvm: Use darn for H_RANDOM on Power9
[ Upstream commit
7ef3d06f1bc4a5e62273726f3dc2bd258ae1c71f ]
The existing logic in KVM to support guests calling H_RANDOM only works
on Power8, because it looks for an RNG in the device tree, but on Power9
we just use darn.
In addition the existing code needs to work in real mode, so we have the
special cased powernv_get_random_real_mode() to deal with that.
Instead just have KVM call ppc_md.get_random_seed(), and do the real
mode check inside of there, that way we use whatever RNG is available,
including darn on Power9.
Fixes: e928e9cb3601 ("KVM: PPC: Book3S HV: Add fast real-mode H_RANDOM implementation.")
Cc: stable@vger.kernel.org # v4.1+
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Tested-by: Sachin Sant <sachinp@linux.ibm.com>
[mpe: Rebase on previous commit, update change log appropriately]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220727143219.2684192-2-mpe@ellerman.id.au
Signed-off-by: Sasha Levin <sashal@kernel.org>
Rafael J. Wysocki [Thu, 21 Jul 2022 17:41:10 +0000 (19:41 +0200)]
ACPI: CPPC: Do not prevent CPPC from working in the future
[ Upstream commit
4f4179fcf420873002035cf1941d844c9e0e7cb3 ]
There is a problem with the current revision checks in
is_cppc_supported() that they essentially prevent the CPPC support
from working if a new _CPC package format revision being a proper
superset of the v3 and only causing _CPC to return a package with more
entries (while retaining the types and meaning of the entries defined by
the v3) is introduced in the future and used by the platform firmware.
In that case, as long as the number of entries in the _CPC return
package is at least CPPC_V3_NUM_ENT, it should be perfectly fine to
use the v3 support code and disregard the additional package entries
added by the new package format revision.
For this reason, drop is_cppc_supported() altogether, put the revision
checks directly into acpi_cppc_processor_probe() so they are easier to
follow and rework them to take the case mentioned above into account.
Fixes: 4773e77cdc9b ("ACPI / CPPC: Add support for CPPC v3")
Cc: 4.18+ <stable@vger.kernel.org> # 4.18+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Nikolay Borisov [Thu, 23 Jun 2022 07:55:47 +0000 (10:55 +0300)]
btrfs: properly flag filesystem with BTRFS_FEATURE_INCOMPAT_BIG_METADATA
[ Upstream commit
e26b04c4c91925dba57324db177a24e18e2d0013 ]
Commit
6f93e834fa7c seemingly inadvertently moved the code responsible
for flagging the filesystem as having BIG_METADATA to a place where
setting the flag was essentially lost. This means that
filesystems created with kernels containing this bug (starting with 5.15)
can potentially be mounted by older (pre-3.4) kernels. In reality
chances for this happening are low because there are other incompat
flags introduced in the mean time. Still the correct behavior is to set
INCOMPAT_BIG_METADATA flag and persist this in the superblock.
Fixes: 6f93e834fa7c ("btrfs: fix upper limit for max_inline for page size 64K")
CC: stable@vger.kernel.org # 5.4+
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Josef Bacik [Mon, 13 Jun 2022 22:31:17 +0000 (18:31 -0400)]
btrfs: reset block group chunk force if we have to wait
[ Upstream commit
1314ca78b2c35d3e7d0f097268a2ee6dc0d369ef ]
If you try to force a chunk allocation, but you race with another chunk
allocation, you will end up waiting on the chunk allocation that just
occurred and then allocate another chunk. If you have many threads all
doing this at once you can way over-allocate chunks.
Fix this by resetting force to NO_FORCE, that way if we think we need to
allocate we can, otherwise we don't force another chunk allocation if
one is already happening.
Reviewed-by: Filipe Manana <fdmanana@suse.com>
CC: stable@vger.kernel.org # 5.4+
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Naohiro Aota [Tue, 21 Jun 2022 06:40:59 +0000 (15:40 +0900)]
btrfs: ensure pages are unlocked on cow_file_range() failure
[ Upstream commit
9ce7466f372d83054c7494f6b3e4b9abaf3f0355 ]
There is a hung_task report on zoned btrfs like below.
https://github.com/naota/linux/issues/59
[726.328648] INFO: task rocksdb:high0:11085 blocked for more than 241 seconds.
[726.329839] Not tainted 5.16.0-rc1+ #1
[726.330484] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[726.331603] task:rocksdb:high0 state:D stack: 0 pid:11085 ppid: 11082 flags:0x00000000
[726.331608] Call Trace:
[726.331611] <TASK>
[726.331614] __schedule+0x2e5/0x9d0
[726.331622] schedule+0x58/0xd0
[726.331626] io_schedule+0x3f/0x70
[726.331629] __folio_lock+0x125/0x200
[726.331634] ? find_get_entries+0x1bc/0x240
[726.331638] ? filemap_invalidate_unlock_two+0x40/0x40
[726.331642] truncate_inode_pages_range+0x5b2/0x770
[726.331649] truncate_inode_pages_final+0x44/0x50
[726.331653] btrfs_evict_inode+0x67/0x480
[726.331658] evict+0xd0/0x180
[726.331661] iput+0x13f/0x200
[726.331664] do_unlinkat+0x1c0/0x2b0
[726.331668] __x64_sys_unlink+0x23/0x30
[726.331670] do_syscall_64+0x3b/0xc0
[726.331674] entry_SYSCALL_64_after_hwframe+0x44/0xae
[726.331677] RIP: 0033:0x7fb9490a171b
[726.331681] RSP: 002b:
00007fb943ffac68 EFLAGS:
00000246 ORIG_RAX:
0000000000000057
[726.331684] RAX:
ffffffffffffffda RBX:
0000000000000000 RCX:
00007fb9490a171b
[726.331686] RDX:
00007fb943ffb040 RSI:
000055a6bbe6ec20 RDI:
00007fb94400d300
[726.331687] RBP:
00007fb943ffad00 R08:
0000000000000000 R09:
0000000000000000
[726.331688] R10:
0000000000000031 R11:
0000000000000246 R12:
00007fb943ffb000
[726.331690] R13:
00007fb943ffb040 R14:
0000000000000000 R15:
00007fb943ffd260
[726.331693] </TASK>
While we debug the issue, we found running fstests generic/551 on 5GB
non-zoned null_blk device in the emulated zoned mode also had a
similar hung issue.
Also, we can reproduce the same symptom with an error injected
cow_file_range() setup.
The hang occurs when cow_file_range() fails in the middle of
allocation. cow_file_range() called from do_allocation_zoned() can
split the give region ([start, end]) for allocation depending on
current block group usages. When btrfs can allocate bytes for one part
of the split regions but fails for the other region (e.g. because of
-ENOSPC), we return the error leaving the pages in the succeeded regions
locked. Technically, this occurs only when @unlock == 0. Otherwise, we
unlock the pages in an allocated region after creating an ordered
extent.
Considering the callers of cow_file_range(unlock=0) won't write out
the pages, we can unlock the pages on error exit from
cow_file_range(). So, we can ensure all the pages except @locked_page
are unlocked on error case.
In summary, cow_file_range now behaves like this:
- page_started == 1 (return value)
- All the pages are unlocked. IO is started.
- unlock == 1
- All the pages except @locked_page are unlocked in any case
- unlock == 0
- On success, all the pages are locked for writing out them
- On failure, all the pages except @locked_page are unlocked
Fixes: 42c011000963 ("btrfs: zoned: introduce dedicated data write path for zoned filesystems")
CC: stable@vger.kernel.org # 5.12+
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Jinke Han [Wed, 20 Jul 2022 09:36:16 +0000 (17:36 +0800)]
block: don't allow the same type rq_qos add more than once
[ Upstream commit
14a6e2eb7df5c7897c15b109cba29ab0c4a791b6 ]
In our test of iocost, we encountered some list add/del corruptions of
inner_walk list in ioc_timer_fn.
The reason can be described as follows:
cpu 0 cpu 1
ioc_qos_write ioc_qos_write
ioc = q_to_ioc(queue);
if (!ioc) {
ioc = kzalloc();
ioc = q_to_ioc(queue);
if (!ioc) {
ioc = kzalloc();
...
rq_qos_add(q, rqos);
}
...
rq_qos_add(q, rqos);
...
}
When the io.cost.qos file is written by two cpus concurrently, rq_qos may
be added to one disk twice. In that case, there will be two iocs enabled
and running on one disk. They own different iocgs on their active list. In
the ioc_timer_fn function, because of the iocgs from two iocs have the
same root iocg, the root iocg's walk_list may be overwritten by each other
and this leads to list add/del corruptions in building or destroying the
inner_walk list.
And so far, the blk-rq-qos framework works in case that one instance for
one type rq_qos per queue by default. This patch make this explicit and
also fix the crash above.
Signed-off-by: Jinke Han <hanjinke.666@bytedance.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220720093616.70584-1-hanjinke.666@bytedance.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Christoph Hellwig [Mon, 20 Sep 2021 12:33:22 +0000 (14:33 +0200)]
block: remove the struct blk_queue_ctx forward declaration
[ Upstream commit
9778ac77c2027827ffdbb33d3e936b3a0ae9f0f9 ]
This type doesn't exist at all, so no need to forward declare it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/20210920123328.1399408-12-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>