Jungseung Lee [Mon, 20 Mar 2023 03:29:05 +0000 (12:29 +0900)]
workqueue: Introduce show_freezable_workqueues
Currently show_all_workqueues() is called if freezing the workqueues
fails, which shows the status of all workqueues and of all worker
pools. In this case we may only need to dump the state of workqueues
that are freezable and busy.
This patch defines show_freezable_workqueues, which uses
show_one_workqueue, a granular function that shows the state of individual
workqueues, so that only the state of freezable workqueues is dumped
at that time.
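The filtering idea can be sketched in userspace C. This is a minimal model, not the kernel implementation: struct wq_model, MODEL_WQ_FREEZABLE and both helper names are hypothetical stand-ins for the kernel's workqueue structures and flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical flag bit and structure, for illustration only. */
#define MODEL_WQ_FREEZABLE 0x4u

struct wq_model {
	const char *name;
	unsigned int flags;
	int nr_busy;	/* pending plus in-flight work items */
};

/* Print one workqueue; idle ones are skipped. Returns true if printed. */
static bool show_one_wq_model(const struct wq_model *wq)
{
	if (wq->nr_busy == 0)
		return false;
	printf("workqueue %s: flags=0x%x busy=%d\n",
	       wq->name, wq->flags, wq->nr_busy);
	return true;
}

/* Dump only freezable workqueues; returns how many were printed. */
static size_t show_freezable_wq_models(const struct wq_model *wqs, size_t n)
{
	size_t shown = 0;

	assert(wqs != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (!(wqs[i].flags & MODEL_WQ_FREEZABLE))
			continue;	/* not freezable: irrelevant to a freeze failure */
		if (show_one_wq_model(&wqs[i]))
			shown++;
	}
	return shown;
}
```

Called on a mixed list, the helper prints only entries that are both freezable and busy, which keeps the dump short when a freeze attempt fails.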
tj: Minor message adjustment.
Signed-off-by: Jungseung Lee <js07.lee@samsung.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Petr Mladek [Tue, 7 Mar 2023 12:53:35 +0000 (13:53 +0100)]
workqueue: Print backtraces from CPUs with hung CPU bound workqueues
The workqueue watchdog reports a lockup when there has not been any
progress in the worker pool for a long time. Progress means that a
pending work item starts being processed.
Worker pools for unbound workqueues always wake up an idle worker and
try to process the work immediately. The last idle worker has to create
a new worker first. A stall might happen only when a new worker could
not be created, in which case an error should get printed. Another problem
might be too high a load. In this case, workers are victims of a global
system problem.
Worker pools for CPU bound workqueues are designed for lightweight
work items that do not need much CPU time. They are processed one by
one on a single worker. A new worker is used only when a work item is
sleeping. This creates one additional scenario. The stall might happen
when the CPU-bound workqueue is used for CPU-intensive work.
More precisely, the stall is detected when a CPU-bound worker is in
the TASK_RUNNING state for too long. In this case, it might be useful
to see the backtrace from the problematic worker.
Information about how long a worker has been in the running state is not
available. But the CPU-bound worker pools do not have many workers in the
running state by definition. And only a few pools are typically blocked.
It should be acceptable to print backtraces from all workers in
TASK_RUNNING state in the stalled worker pools. The number of false
positives should be very low.
Signed-off-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Petr Mladek [Tue, 7 Mar 2023 12:53:34 +0000 (13:53 +0100)]
workqueue: Warn when a rescuer could not be created
Rescuers are created when a workqueue with WQ_MEM_RECLAIM is allocated.
It typically happens during the system boot.
systemd switches the root filesystem from initrd to the booted system
during boot. It kills processes that block the switch for too long.
One of the processes might be modprobe that tries to create a workqueue.
These problems are hard to reproduce. Also alloc_workqueue() does not
pass the error code. Make the debugging easier by printing an error,
similar to create_worker().
Signed-off-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Petr Mladek [Tue, 7 Mar 2023 12:53:33 +0000 (13:53 +0100)]
workqueue: Interrupted create_worker() is not a repeated event
kthread_create_on_node() might get interrupted. It is rare but realistic.
For example, it can happen when an unbound workqueue is allocated in a
module_init() callback, which runs in the context of the "modprobe"
process. And, for example, systemd might kill pending processes when
switching root from initrd to the booted system.
The interrupt is a one-off event and the race might be hard to reproduce.
It is always worth printing.
Signed-off-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Petr Mladek [Tue, 7 Mar 2023 12:53:32 +0000 (13:53 +0100)]
workqueue: Warn when a new worker could not be created
The workqueue watchdog reports a lockup when there has not been any
progress in the worker pool for a long time. Progress means that a
pending work item starts being processed.
The progress is guaranteed by using idle workers or creating new workers
for pending work items.
There are several reasons why a new worker could not be created:
+ there is not enough memory
+ there is no free pool ID (IDR API)
+ the system reached PID limit
+ the process creating the new worker was interrupted
+ the last idle worker (manager) has not been scheduled for a long
time. It was not able to even start creating the kthread.
None of these failures is reported at the moment. The only clue is that
show_one_worker_pool() prints that there is a manager. It is the last
idle worker that is responsible for creating a new one. But it is not
clear if create_worker() is failing and why.
Make the debugging easier by printing errors in create_worker().
The error code is important, especially from kthread_create_on_node().
It helps to distinguish the various reasons. For example, reaching
memory limit (-ENOMEM), other system limits (-EAGAIN), or process
interrupted (-EINTR).
Use pr_once() to avoid repeating the same error every CREATE_COOLDOWN
for each stuck worker pool.
Ratelimited printk() might be better. It would help to know if the problem
remains. It would be more clear if the create_worker() errors and workqueue
stalls are related. Also old messages might get lost when the internal log
buffer is full. The problem is that printk() might touch the watchdog.
For example, see touch_nmi_watchdog() in serial8250_console_write().
It would require synchronization of the start and length of the ratelimit
interval with the workqueue watchdog. Otherwise, the error messages
might break the watchdog. This does not look worth the complexity.
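The print-once behavior described above can be illustrated with a small userspace sketch. PRINT_ONCE, messages_emitted and report_create_failure() are hypothetical names for this example only; the kernel's own once-style printk helpers are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical userspace stand-in for a print-once helper: every
 * expansion site owns a static flag, so the message appears only the
 * first time that site is reached.
 */
#define PRINT_ONCE(counter, ...)			\
	do {						\
		static bool printed_;			\
		if (!printed_) {			\
			printed_ = true;		\
			fprintf(stderr, __VA_ARGS__);	\
			(counter)++;			\
		}					\
	} while (0)

static int messages_emitted;	/* lets callers observe how often we printed */

/* Report a failed worker creation; err is a negative errno-style code. */
static void report_create_failure(int err)
{
	assert(err < 0);
	PRINT_ONCE(messages_emitted,
		   "workqueue: failed to create worker: error %d\n", err);
}
```

Calling report_create_failure() repeatedly, as a retry loop would, emits the message a single time. The trade-off named in the commit message still applies: a once-only message cannot tell the reader whether the failure persists.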
Signed-off-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Petr Mladek [Tue, 7 Mar 2023 12:53:31 +0000 (13:53 +0100)]
workqueue: Fix hung time report of worker pools
The workqueue watchdog prints a warning when there is no progress in
a worker pool. Here, progress means that the pool started processing
a pending work item.
Note that it is perfectly fine to process work items much longer.
The progress should be guaranteed by waking up or creating idle
workers.
show_one_worker_pool() prints the state of a non-idle worker pool. It
shows the delay since the last pool->watchdog_ts.
The timestamp is updated when the first pending work item is queued in
__queue_work(). It is also updated when a work item is dequeued for
processing in worker_thread() and rescuer_thread().
The delay is misleading when there is no pending work item. In this
case it shows how long the last work item has been processed. Show
zero instead. There is no stall if there is no pending work.
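The reporting rule can be written as a short sketch. struct pool_model and pool_hung_secs() are hypothetical and use plain seconds instead of jiffies; the sketch only shows the decision described above, not the kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the fields that matter for the report. */
struct pool_model {
	unsigned long watchdog_ts;	/* time of last progress, in seconds */
	bool has_pending;		/* is any work item still queued? */
};

/* Seconds the pool has been stalled; zero when nothing is pending. */
static unsigned long pool_hung_secs(const struct pool_model *pool,
				    unsigned long now)
{
	assert(pool != NULL);
	if (!pool->has_pending)
		return 0;	/* a long-running item alone is not a stall */
	return now - pool->watchdog_ts;
}
```

With watchdog_ts at 100 and now at 130, a pool with pending work reports 30 seconds, while a pool that is only executing a long work item reports 0.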
Fixes: 82607adcf9cdf40fb7b ("workqueue: implement lockup detector")
Signed-off-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Ammar Faizi [Sun, 26 Feb 2023 16:53:20 +0000 (23:53 +0700)]
workqueue: Simplify a pr_warn() call in wq_select_unbound_cpu()
Use pr_warn_once() to achieve the same thing. It's simpler.
Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Reviewed-by: Lai Jiangshan <jiangshanlai@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Ammar Faizi [Sun, 26 Feb 2023 16:53:21 +0000 (23:53 +0700)]
MAINTAINERS: Add workqueue_internal.h to the WORKQUEUE entry
This file doesn't have a maintainer. It should belong to the WORKQUEUE
entry part. Add it to the WORKQUEUE entry.
Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Reviewed-by: Lai Jiangshan <jiangshanlai@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Linus Torvalds [Fri, 17 Mar 2023 18:20:27 +0000 (11:20 -0700)]
Merge tag 'block-6.3-2023-03-16' of git://git.kernel.dk/linux
Pull block fixes from Jens Axboe:
"A bit bigger than usual, as the NVMe pull request missed last week's
submission. In detail:
- NVMe pull request via Christoph:
- Avoid potential UAF in nvmet_req_complete (Damien Le Moal)
- More quirks (Elmer Miroslav Mosher Golovin, Philipp Geulen)
- Fix a memory leak in the nvme-pci probe teardown path
(Irvin Cote)
- Repair the MAINTAINERS entry (Lukas Bulwahn)
- Fix handling single range discard request (Ming Lei)
- Show more opcode names in trace events (Minwoo Im)
- Fix nvme-tcp timeout reporting (Sagi Grimberg)
- MD pull request via Song:
- Two fixes for old issues (Neil)
- Resource leak in device stopping (Xiao)
- Bio based device stats fix (Yu)
- Kill unused CONFIG_BLOCK_COMPAT (Lukas)
- sunvdc missing mdesc_grab() failure check (Liang)
- Fix for reversal of request ordering upon issue for certain cases
(Jan)
- null_blk timeout fixes (Damien)
- Loop use-after-free fix (Bart)
- blk-mq SRCU fix for BLK_MQ_F_BLOCKING devices (Chris)"
* tag 'block-6.3-2023-03-16' of git://git.kernel.dk/linux:
block: remove obsolete config BLOCK_COMPAT
md: select BLOCK_LEGACY_AUTOLOAD
block: count 'ios' and 'sectors' when io is done for bio-based device
block: sunvdc: add check for mdesc_grab() returning NULL
nvmet: avoid potential UAF in nvmet_req_complete()
nvme-trace: show more opcode names
nvme-tcp: add nvme-tcp pdu size build protection
nvme-tcp: fix opcode reporting in the timeout handler
nvme-pci: add NVME_QUIRK_BOGUS_NID for Lexar NM620
nvme-pci: add NVME_QUIRK_BOGUS_NID for Netac NV3000
nvme-pci: fixing memory leak in probe teardown path
nvme: fix handling single range discard request
MAINTAINERS: repair malformed T: entries in NVM EXPRESS DRIVERS
block: null_blk: cleanup null_queue_rq()
block: null_blk: Fix handling of fake timeout request
blk-mq: fix "bad unlock balance detected" on q->srcu in __blk_mq_run_dispatch_ops
loop: Fix use-after-free issues
block: do not reverse request order when flushing plug list
md: avoid signed overflow in slot_store()
md: Free resources in __md_stop
Linus Torvalds [Fri, 17 Mar 2023 18:12:07 +0000 (11:12 -0700)]
Merge tag 'io_uring-6.3-2023-03-16' of git://git.kernel.dk/linux
Pull io_uring fixes from Jens Axboe:
- When PF_NO_SETAFFINITY was removed for io-wq threads, we kind of
forgot about the SQPOLL thread. Remove it there as well, there's even
less of a reason to set it there (Michal)
- Fixup a confusing 'ret' setting (Li)
- When MSG_RING is used to send a direct descriptor to another ring,
it's possible to have it allocate it on the target ring rather than
provide a specific index for it. If this is done, return the chosen
value in the CQE, like we would've done locally (Pavel)
- Fix a regression in this series on huge page bvec collapsing (Pavel)
* tag 'io_uring-6.3-2023-03-16' of git://git.kernel.dk/linux:
io_uring/rsrc: fix folio accounting
io_uring/msg_ring: let target know allocated index
io_uring: rsrc: Optimize return value variable 'ret'
io_uring/sqpoll: Do not set PF_NO_SETAFFINITY on sqpoll threads
Linus Torvalds [Fri, 17 Mar 2023 18:02:26 +0000 (11:02 -0700)]
Merge tag 'pm-6.3-rc3' of git://git./linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These fix an error code path issue in a cpuidle driver and make the
sleepgraph utility more robust against unexpected input.
Specifics:
- Fix the psci_pd_init_topology() failure path in the PSCI cpuidle
driver (Shawn Guo)
- Modify the sleepgraph utility so it does not crash on binary data
in device names (Todd Brandt)"
* tag 'pm-6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
pm-graph: sleepgraph: Avoid crashing on binary data in device names
cpuidle: psci: Iterate backwards over list in psci_pd_remove()
Linus Torvalds [Fri, 17 Mar 2023 17:57:09 +0000 (10:57 -0700)]
Merge tag 'acpi-6.3-rc3' of git://git./linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
"These add some new quirks, fix PPTT handling, fix an ACPI utility and
correct a mistake in the ACPI documentation.
Specifics:
- Fix ACPI PPTT handling to avoid sleep in the atomic context when it
is not present (Sudeep Holla)
- Add 'backlight=native' DMI quirk for Dell Vostro 15 3535 to the
ACPI video driver (Chia-Lin Kao)
- Add ACPI quirks for I2C device enumeration on Lenovo Yoga Book X90
and Acer Iconia One 7 B1-750 (Hans de Goede)
- Fix handling of invalid command line option values in the ACPI
pfrut utility (Chen Yu)
- Fix references to I2C device data type in the ACPI documentation
for device enumeration (Andy Shevchenko)"
* tag 'acpi-6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: tools: pfrut: Check if the input of level and type is in the right numeric range
ACPI: PPTT: Fix to avoid sleep in the atomic context when PPTT is absent
ACPI: x86: Add skip i2c clients quirk for Lenovo Yoga Book X90
ACPI: x86: Add skip i2c clients quirk for Acer Iconia One 7 B1-750
ACPI: x86: Introduce an acpi_quirk_skip_gpio_event_handlers() helper
ACPI: video: Add backlight=native DMI quirk for Dell Vostro 15 3535
ACPI: docs: enumeration: Correct reference to the I²C device data type
Linus Torvalds [Fri, 17 Mar 2023 17:51:14 +0000 (10:51 -0700)]
Merge branch 'turbostat' of git://git./linux/kernel/git/lenb/linux
Pull turbostat tweaks and fixes from Len Brown:
"Leprechaun sized fixes and tweaks touching only turbostat.
'Keeping happy users happy since 2010'"
* 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
tools/power turbostat: version 2023.03.17
tools/power turbostat: fix decoding of HWP_STATUS
tools/power turbostat: Introduce support for EMR
tools/power turbostat: remove stray newlines from warn/warnx strings
tools/power turbostat: Fix /dev/cpu_dma_latency warnings
tools/power turbostat: Provide better debug messages for failed capabilities accesses
tools/power turbostat: update dump of SECONDARY_TURBO_RATIO_LIMIT
Linus Torvalds [Fri, 17 Mar 2023 17:45:49 +0000 (10:45 -0700)]
Merge tag 'for-linus-6.3-rc3-tag' of git://git./linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
- cleanup for xen time handling
- enable the VGA console in a Xen PVH dom0
- cleanup in the xenfs driver
* tag 'for-linus-6.3-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen: remove unnecessary (void*) conversions
x86/PVH: obtain VGA console info in Dom0
x86/xen/time: cleanup xen_tsc_safe_clocksource
xen: update arch/x86/include/asm/xen/cpuid.h
Linus Torvalds [Fri, 17 Mar 2023 17:33:33 +0000 (10:33 -0700)]
Merge tag 'riscv-for-linus-6.3-rc3' of git://git./linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:
- fixes to the ASID allocator to avoid leaking stale mappings between
tasks
- fix the vmalloc fault handler to tolerate huge pages
* tag 'riscv-for-linus-6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
RISC-V: mm: Support huge page in vmalloc_fault()
riscv: asid: Fixup stale TLB entry cause application crash
Revert "riscv: mm: notify remote harts about mmu cache updates"
Linus Torvalds [Fri, 17 Mar 2023 17:15:53 +0000 (10:15 -0700)]
Merge tag 's390-6.3-3' of git://git./linux/kernel/git/s390/linux
Pull s390 fixes from Vasily Gorbik:
- Update defconfigs
- Fix early boot code by adding missing intersection check to prevent
potential overwriting of the ipl report
- Fix a use-after-free issue in s390-specific code related to PCI
resources being retained after hot-unplugging individual functions,
by removing the resources from the PCI bus's resource list and using
the zpci_bar_struct's resource pointer directly
* tag 's390-6.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390: update defconfigs
PCI: s390: Fix use-after-free of PCI resources with per-function hotplug
s390/ipl: add missing intersection check to ipl_report handling
Linus Torvalds [Fri, 17 Mar 2023 17:01:07 +0000 (10:01 -0700)]
Merge tag 'powerpc-6.3-3' of git://git./linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- Fix false detection of read faults, introduced by execute-only
support
- Fix a build failure when GENERIC_ALLOCATOR is not selected
Thanks to Russell Currey, Randy Dunlap, Michal Suchánek, Nathan Lynch,
and Benjamin Gray.
* tag 'powerpc-6.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/mm: Fix false detection of read faults
powerpc/pseries: RTAS work area requires GENERIC_ALLOCATOR
Linus Torvalds [Fri, 17 Mar 2023 16:49:17 +0000 (09:49 -0700)]
Merge tag 'mmc-v6.3-rc1' of git://git./linux/kernel/git/ulfh/mmc
Pull MMC host fixes from Ulf Hansson:
- dw_mmc-starfive: Fix initialization of the prev_err variable
- sdhci_am654: Lower power-on failed message severity
* tag 'mmc-v6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: dw_mmc-starfive: Fix initialization of prev_err
mmc: sdhci_am654: lower power-on failed message severity
Linus Torvalds [Fri, 17 Mar 2023 16:43:10 +0000 (09:43 -0700)]
Merge tag 'sound-6.3-rc3' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Nothing surprising, a collection of small device-specific fixes.
The majority of changes are for ASoC Intel stuff, while a few other
ASoC and HD-audio fixes are found"
* tag 'sound-6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (31 commits)
ALSA: hda/ca0132: fixup buffer overrun at tuning_ctl_set()
ALSA: asihpi: check pao in control_message()
ASoC: hdmi-codec: only startup/shutdown on supported streams
ASoC: da7219: Initialize jack_det_mutex
ALSA: hda: Match only Intel devices with CONTROLLER_IN_GPU()
ALSA: hda/realtek: Fix the speaker output on Samsung Galaxy Book2 Pro
ALSA: hda/realtek: fix speaker, mute/micmute LEDs not work on a HP platform
ALSA: hda: intel-dsp-config: add MTL PCI id
ASoC: SOF: IPC4: update gain ipc msg definition to align with fw
ASoC: SOF: sof-audio: don't squelch errors in WIDGET_SETUP phase
ASoC: SOF: Intel: hda-ctrl: re-add sleep after entering and exiting reset
ASoC: SOF: Intel: hda-dsp: harden D0i3 programming sequence
ASoC: SOF: ipc4-topology: set dmic dai index from copier
ASoC: SOF: sof-audio: Fix broken early bclk feature for SSP
ASoC: SOF: Intel: pci-tng: revert invalid bar size setting
ASoC: SOF: topology: Fix error handling in sof_widget_ready()
ASoC: Intel: soc-acpi: fix copy-paste issue in topology names
ASoC: SOF: ipc4-topology: Fix incorrect sample rate print unit
ASoC: SOF: ipc3: Check for upper size limit for the received message
ASOC: SOF: Intel: pci-tgl: Fix device description
...
Linus Torvalds [Fri, 17 Mar 2023 16:35:40 +0000 (09:35 -0700)]
Merge tag 'drm-fixes-2023-03-17' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Seems like a pretty regular rc3, i915 and amdgpu with the usual
selection of fixes, then a scattering of fixes across misc drivers and
other areas:
accel:
- build fix for accel
edid:
- fix info leak in edid
ttm:
- fix NULL ptr deref
- reference counting fix
i915:
- Fix hwmon PL1 power limit enabling
- Fix audio ELD handling for DP MST
- Fix PSR io and wake line calculations
- Fix DG2 HDMI modes with 267.30 and 319.89 MHz pixel clocks
- Fix SSEU subslice out-of-bounds access
- Fix misuse of non-idle barriers as fence trackers
amdgpu:
- SMU 13 update
- RDNA2 suspend/resume fix when overclocking is enabled
- SRIOV VCN fixes
- HDCP suspend/resume fix
- Fix drm polling splat regression
- Fix dirty rectangle tracking for PSR
- Fix vangogh regression on certain BIOSes
- Misc display fixes
- Suspend/resume IOMMU regression fix
amdkfd:
- Fix BO offset for multi-VMA page migration
- Fix a possible double free
- Fix potential use after free
- Fix process cleanup on module exit
bridge:
- fix returned array size name documentation
fbdev:
- ref-counting fix for fbdev deferred I/O
virtio:
- dma sync fix
shmem-helper:
- error path fix
msm:
- shrinker blocking fix
panfrost:
- shrinker rpm fix
chipsfb:
- fix error code
meson:
- fix 1px pink line
- fix regulator interaction
sun4i:
- fix missing component unbind"
* tag 'drm-fixes-2023-03-17' of git://anongit.freedesktop.org/drm/drm: (38 commits)
drm/ttm: drop extra ttm_bo_put in ttm_bo_cleanup_refs
drm/amdgpu: Don't resume IOMMU after incomplete init
drm/amdkfd: Fixed kfd_process cleanup on module exit.
drm/amd/display: disconnect MPCC only on OTG change
drm/amd/display: Fix DP MST sinks removal issue
drm/amd/display: Do not set DRR on pipe Commit
drm/amd/display: Remove OTG DIV register write for Virtual signals.
drm/meson: dw-hdmi: Fix devm_regulator_*get_enable*() conversion again
drm/bridge: Fix returned array size name for atomic_get_input_bus_fmts kdoc
drm/amdgpu/vcn: Disable indirect SRAM on Vangogh broken BIOSes
drm/amdgpu/nv: fix codec array for SR_IOV
drm/amd/display: Write to correct dirty_rect
drm/amdgpu: move poll enabled/disable into non DC path
drm/amd/display: Fix HDCP failing to enable after suspend
drm/amdkfd: fix potential kgd_mem UAFs
drm/amdgpu/vcn: custom video info caps for sriov
drm/amd/pm: Fix sienna cichlid incorrect OD volage after resume
drm/amd/pm: bump SMU 13.0.4 driver_if header version
drm/amdkfd: fix a potential double free in pqm_create_queue
drm/amdkfd: Get prange->offset after svm_range_vram_node_new
...
Linus Torvalds [Fri, 17 Mar 2023 16:30:57 +0000 (09:30 -0700)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Ten patches, eight in drivers and two in the core, which correct a
regression from directory removal and add a no VPD size quirk also to
fix a regression. All pretty small"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: ufs: mcq: Use active_reqs to check busy in clock scaling
scsi: core: Fix a procfs host directory removal regression
scsi: core: Add BLIST_NO_VPD_SIZE for some VDASD
scsi: mpi3mr: Fix expander node leak in mpi3mr_remove()
scsi: mpi3mr: Fix memory leaks in mpi3mr_init_ioc()
scsi: mpi3mr: Fix sas_hba.phy memory leak in mpi3mr_remove()
scsi: mpi3mr: Fix mpi3mr_hba_port memory leak in mpi3mr_remove()
scsi: mpi3mr: Fix config page DMA memory leak
scsi: mpi3mr: Fix throttle_groups memory leak
scsi: mpt3sas: Fix NULL pointer access in mpt3sas_transport_port_add()
Rafael J. Wysocki [Fri, 17 Mar 2023 15:55:01 +0000 (16:55 +0100)]
Merge branch 'pm-cpuidle'
Merge a PSCI cpuidle driver fix for 6.3-rc1:
- Fix the psci_pd_init_topology() failure path in the PSCI cpuidle
driver (Shawn Guo).
* pm-cpuidle:
cpuidle: psci: Iterate backwards over list in psci_pd_remove()
Rafael J. Wysocki [Fri, 17 Mar 2023 15:44:41 +0000 (16:44 +0100)]
Merge branches 'acpi-video', 'acpi-x86', 'acpi-tools' and 'acpi-docs'
Merge a new ACPI backlight quirk, new ACPI quirks for I2C device
enumeration on some platforms, a pfrut utility fix and an ACPI
documentation fix for 6.3-rc3:
- Add backlight=native DMI quirk for Dell Vostro 15 3535 to the ACPI
video driver (Chia-Lin Kao).
- Add ACPI quirks for I2C devices enumeration on Lenovo Yoga Book X90
and Acer Iconia One 7 B1-750 (Hans de Goede).
- Fix handling of invalid command line option values in the ACPI pfrut
utility (Chen Yu).
- Fix references to I2C device data type in the ACPI documentation for
device enumeration (Andy Shevchenko).
* acpi-video:
ACPI: video: Add backlight=native DMI quirk for Dell Vostro 15 3535
* acpi-x86:
ACPI: x86: Add skip i2c clients quirk for Lenovo Yoga Book X90
ACPI: x86: Add skip i2c clients quirk for Acer Iconia One 7 B1-750
ACPI: x86: Introduce an acpi_quirk_skip_gpio_event_handlers() helper
* acpi-tools:
ACPI: tools: pfrut: Check if the input of level and type is in the right numeric range
* acpi-docs:
ACPI: docs: enumeration: Correct reference to the I²C device data type
Len Brown [Fri, 17 Mar 2023 15:34:10 +0000 (11:34 -0400)]
tools/power turbostat: version 2023.03.17
Happy St. Patrick's Day!
Signed-off-by: Len Brown <len.brown@intel.com>
Antti Laakso [Wed, 25 Jan 2023 13:17:50 +0000 (15:17 +0200)]
tools/power turbostat: fix decoding of HWP_STATUS
The "excursion to minimum" information is in bit 2
in HWP_STATUS MSR. Fix the bitmask used for
decoding the register.
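As a minimal sketch of the decoding described above: the macro and function names are hypothetical, and only the bit position (bit 2) comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 2 of the HWP_STATUS value, as stated in the commit message. */
#define HWP_STATUS_EXCURSION_TO_MIN	(UINT64_C(1) << 2)

/* True when the "excursion to minimum" bit is set in the raw MSR value. */
static bool hwp_excursion_to_min(uint64_t msr)
{
	return (msr & HWP_STATUS_EXCURSION_TO_MIN) != 0;
}
```

Note that bit 2 corresponds to the mask 0x4. A mask of 0x2 would test bit 1 instead and miss the flag.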
Signed-off-by: Antti Laakso <antti.laakso@intel.com>
Reviewed-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Zhang Rui [Wed, 4 Jan 2023 14:23:53 +0000 (22:23 +0800)]
tools/power turbostat: Introduce support for EMR
Introduce support for EMR.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Tested-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Len Brown [Fri, 17 Mar 2023 15:25:56 +0000 (11:25 -0400)]
tools/power turbostat: remove stray newlines from warn/warnx strings
warn(3) terminates strings with newlines
Signed-off-by: Len Brown <len.brown@intel.com>
Prarit Bhargava [Thu, 15 Dec 2022 15:18:16 +0000 (10:18 -0500)]
tools/power turbostat: Fix /dev/cpu_dma_latency warnings
When running as non-root the following error is seen in turbostat:
turbostat: fopen /dev/cpu_dma_latency
: Permission denied
turbostat and the man page have information on how to avoid other
permission errors, so these can be fixed the same way.
Provide better /dev/cpu_dma_latency warnings that provide instructions on
how to avoid the error, and update the man page.
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Cc: linux-pm@vger.kernel.org
Signed-off-by: Len Brown <len.brown@intel.com>
Prarit Bhargava [Tue, 18 Oct 2022 19:23:37 +0000 (15:23 -0400)]
tools/power turbostat: Provide better debug messages for failed capabilities accesses
turbostat reports some capabilities access errors and not others. Provide
the same debug message for all errors.
[lenb: remove extra quotes]
Cc: David Arcari <darcari@redhat.com>
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Len Brown [Thu, 13 Oct 2022 10:42:29 +0000 (12:42 +0200)]
tools/power turbostat: update dump of SECONDARY_TURBO_RATIO_LIMIT
cosmetic only (but useful if you copy/paste)
Signed-off-by: Len Brown <len.brown@intel.com>
Christian König [Thu, 16 Mar 2023 07:26:47 +0000 (08:26 +0100)]
drm/ttm: drop extra ttm_bo_put in ttm_bo_cleanup_refs
That was accidentally left over when we switched to the delayed delete
worker.
Suggested-by: Matthew Auld <matthew.william.auld@gmail.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Fixes: 9bff18d13473 ("drm/ttm: use per BO cleanup workers")
Reported-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Tested-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230316072647.406707-1-christian.koenig@amd.com
Dave Airlie [Fri, 17 Mar 2023 05:42:21 +0000 (15:42 +1000)]
Merge tag 'amd-drm-fixes-6.3-2023-03-15' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-6.3-2023-03-15:
amdgpu:
- SMU 13 update
- RDNA2 suspend/resume fix when overclocking is enabled
- SRIOV VCN fixes
- HDCP suspend/resume fix
- Fix drm polling splat regression
- Fix dirty rectangle tracking for PSR
- Fix vangogh regression on certain BIOSes
- Misc display fixes
- Suspend/resume IOMMU regression fix
amdkfd:
- Fix BO offset for multi-VMA page migration
- Fix a possible double free
- Fix potential use after free
- Fix process cleanup on module exit
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230315224400.7558-1-alexander.deucher@amd.com
Dave Airlie [Fri, 17 Mar 2023 02:03:57 +0000 (12:03 +1000)]
Merge tag 'drm-intel-fixes-2023-03-15' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
drm/i915 fixes for v6.3-rc3:
- Fix hwmon PL1 power limit enabling
- Fix audio ELD handling for DP MST
- Fix PSR io and wake line calculations
- Fix DG2 HDMI modes with 267.30 and 319.89 MHz pixel clocks
- Fix SSEU subslice out-of-bounds access
- Fix misuse of non-idle barriers as fence trackers
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87r0tq5nyn.fsf@intel.com
Dave Airlie [Fri, 17 Mar 2023 01:01:20 +0000 (11:01 +1000)]
Merge tag 'drm-misc-fixes-2023-03-16' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
Short summary of fixes pull:
* fix info leak in edid
* build fix for accel/
* ref-counting fix for fbdev deferred I/O
* driver fixes
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20230316143347.GA9246@linux-uq9g
Linus Torvalds [Thu, 16 Mar 2023 22:06:16 +0000 (15:06 -0700)]
Merge tag '6.3-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs client fixes from Steve French:
"Seven cifs/smb3 client fixes, all also for stable:
- four DFS fixes
- multichannel reconnect fix
- fix smb1 stats for cancel command
- fix for set file size error path"
* tag '6.3-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: use DFS root session instead of tcon ses
cifs: return DFS root session id in DebugData
cifs: fix use-after-free bug in refresh_cache_worker()
cifs: set DFS root session in cifs_get_smb_ses()
cifs: generate signkey for the channel that's reconnecting
cifs: Fix smb2_set_path_size()
cifs: Move the in_send statistic to __smb_send_rqst()
Linus Torvalds [Thu, 16 Mar 2023 18:32:12 +0000 (11:32 -0700)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"ARM64:
- Address a rather annoying bug w.r.t. guest timer offsetting. The
synchronization of timer offsets between vCPUs was broken, leading
to inconsistent timer reads within the VM.
x86:
- New tests for the slow path of the EVTCHNOP_send Xen hypercall
- Add missing nVMX consistency checks for CR0 and CR4
- Fix bug that broke AMD GATag on 512 vCPU machines
Selftests:
- Skip hugetlb tests if huge pages are not available
- Sync KVM exit reasons"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: selftests: Sync KVM exit reasons in selftests
KVM: selftests: Add macro to generate KVM exit reason strings
KVM: selftests: Print expected and actual exit reason in KVM exit reason assert
KVM: selftests: Make vCPU exit reason test assertion common
KVM: selftests: Add EVTCHNOP_send slow path test to xen_shinfo_test
KVM: selftests: Use enum for test numbers in xen_shinfo_test
KVM: selftests: Add helpers to make Xen-style VMCALL/VMMCALL hypercalls
KVM: selftests: Move the guts of kvm_hypercall() to a separate macro
KVM: SVM: WARN if GATag generation drops VM or vCPU ID information
KVM: SVM: Modify AVIC GATag to support max number of 512 vCPUs
KVM: SVM: Fix a benign off-by-one bug in AVIC physical table mask
selftests: KVM: skip hugetlb tests if huge pages are not available
KVM: VMX: Use tabs instead of spaces for indentation
KVM: VMX: Fix indentation coding style issue
KVM: nVMX: remove unnecessary #ifdef
KVM: nVMX: add missing consistency checks for CR0 and CR4
KVM: arm64: timers: Convert per-vcpu virtual offset to a global value
Lukas Bulwahn [Thu, 16 Mar 2023 11:16:30 +0000 (12:16 +0100)]
block: remove obsolete config BLOCK_COMPAT
Before commit bdc1ddad3e5f ("compat_ioctl: block: move
blkdev_compat_ioctl() into ioctl.c"), the config BLOCK_COMPAT was used to
include compat_ioctl.c into the kernel build. With this commit, the code
is moved into ioctl.c and included with the config COMPAT. So, since then,
the config BLOCK_COMPAT has no effect and no further purpose.
Remove this obsolete config BLOCK_COMPAT.
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20230316111630.4897-1-lukas.bulwahn@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Thu, 16 Mar 2023 15:26:05 +0000 (15:26 +0000)]
io_uring/rsrc: fix folio accounting
| BUG: Bad page state in process kworker/u8:0 pfn:5c001
| page:00000000bfda61c8 refcount:0 mapcount:0 mapping:0000000000000000 index:0x20001 pfn:0x5c001
| head:0000000011409842 order:9 entire_mapcount:0 nr_pages_mapped:0 pincount:1
| anon flags: 0x3fffc00000b0004(uptodate|head|mappedtodisk|swapbacked|node=0|zone=0|lastcpupid=0xffff)
| raw: 03fffc0000000000 fffffc0000700001 ffffffff00700903 0000000100000000
| raw: 0000000000000200 0000000000000000 00000000ffffffff 0000000000000000
| head: 03fffc00000b0004 dead000000000100 dead000000000122 ffff00000a809dc1
| head: 0000000000020000 0000000000000000 00000000ffffffff 0000000000000000
| page dumped because: nonzero pincount
| CPU: 3 PID: 9 Comm: kworker/u8:0 Not tainted 6.3.0-rc2-00001-gc6811bf0cd87 #1
| Hardware name: linux,dummy-virt (DT)
| Workqueue: events_unbound io_ring_exit_work
| Call trace:
| dump_backtrace+0x13c/0x208
| show_stack+0x34/0x58
| dump_stack_lvl+0x150/0x1a8
| dump_stack+0x20/0x30
| bad_page+0xec/0x238
| free_tail_pages_check+0x280/0x350
| free_pcp_prepare+0x60c/0x830
| free_unref_page+0x50/0x498
| free_compound_page+0xcc/0x100
| free_transhuge_page+0x1f0/0x2b8
| destroy_large_folio+0x80/0xc8
| __folio_put+0xc4/0xf8
| gup_put_folio+0xd0/0x250
| unpin_user_page+0xcc/0x128
| io_buffer_unmap+0xec/0x2c0
| __io_sqe_buffers_unregister+0xa4/0x1e0
| io_ring_exit_work+0x68c/0x1188
| process_one_work+0x91c/0x1a58
| worker_thread+0x48c/0xe30
| kthread+0x278/0x2f0
| ret_from_fork+0x10/0x20
Mark reports an issue with the recent patches coalescing compound pages
while registering them in io_uring. The reason is that we try to drop
excessive references with folio_put_refs(), but pages were acquired
with pin_user_pages(), which has extra accounting and so should be put
down with matching unpin_user_pages() or at least gup_put_folio().
As a fix unpin_user_pages() all but first page instead, and let's figure
out a better API after.
Fixes: 57bebf807e2abcf8 ("io_uring/rsrc: optimise registered huge pages")
Reported-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Tested-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/10efd5507d6d1f05ea0f3c601830e08767e189bd.1678980230.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Thu, 16 Mar 2023 12:11:42 +0000 (12:11 +0000)]
io_uring/msg_ring: let target know allocated index
msg_ring requests transferring files support auto index selection via
IORING_FILE_INDEX_ALLOC, however they don't return the selected index
to the target ring, and there is no other good way for userspace to
know where the received file is.
Return the index for allocated slots and 0 otherwise, which is
consistent with other fixed file installing requests.
Cc: stable@vger.kernel.org # v6.0+
Fixes: e6130eba8a848 ("io_uring: add support for passing fixed file descriptors")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://github.com/axboe/liburing/issues/809
Signed-off-by: Jens Axboe <axboe@kernel.dk>
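The return-value convention described in the msg_ring change above ("the index for allocated slots and 0 otherwise") can be sketched as follows. This is a toy model, not the io_uring implementation: install_fixed_file() and the list-based table are hypothetical, and slot numbering is simplified to a plain list index. Only the IORING_FILE_INDEX_ALLOC marker name comes from the commit message.

```python
IORING_FILE_INDEX_ALLOC = 0xFFFFFFFF  # "pick a free slot for me" marker


def install_fixed_file(table, file, slot):
    """Install file into table; return the value reported to the target ring."""
    if slot == IORING_FILE_INDEX_ALLOC:
        index = table.index(None)   # first free slot; ValueError if table is full
        table[index] = file
        return index                # the target learns where the file landed
    table[slot] = file
    return 0                        # explicit slot: the sender chose it already
```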
Jens Axboe [Thu, 16 Mar 2023 13:01:48 +0000 (07:01 -0600)]
Merge tag 'nvme-6.3-2022-03-16' of git://git.infradead.org/nvme into block-6.3
Pull NVMe fixes from Christoph:
"nvme fixes for Linux 6.3
- avoid potential UAF in nvmet_req_complete (Damien Le Moal)
- more quirks (Elmer Miroslav Mosher Golovin, Philipp Geulen)
- fix a memory leak in the nvme-pci probe teardown path (Irvin Cote)
- repair the MAINTAINERS entry (Lukas Bulwahn)
- fix handling single range discard request (Ming Lei)
- show more opcode names in trace events (Minwoo Im)
- fix nvme-tcp timeout reporting (Sagi Grimberg)"
* tag 'nvme-6.3-2022-03-16' of git://git.infradead.org/nvme:
nvmet: avoid potential UAF in nvmet_req_complete()
nvme-trace: show more opcode names
nvme-tcp: add nvme-tcp pdu size build protection
nvme-tcp: fix opcode reporting in the timeout handler
nvme-pci: add NVME_QUIRK_BOGUS_NID for Lexar NM620
nvme-pci: add NVME_QUIRK_BOGUS_NID for Netac NV3000
nvme-pci: fixing memory leak in probe teardown path
nvme: fix handling single range discard request
MAINTAINERS: repair malformed T: entries in NVM EXPRESS DRIVERS
Yu Zhe [Thu, 16 Mar 2023 08:39:54 +0000 (16:39 +0800)]
xen: remove unnecessary (void*) conversions
Pointer variables of void * type do not require type cast.
Signed-off-by: Yu Zhe <yuzhe@nfschina.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20230316083954.4223-1-yuzhe@nfschina.com
Signed-off-by: Juergen Gross <jgross@suse.com>
Li zeming [Fri, 17 Mar 2023 18:25:38 +0000 (02:25 +0800)]
io_uring: rsrc: Optimize return value variable 'ret'
Initialize the variable ret to 0 so that it is only changed on the
'goto fail;' paths, and use ret as the function return value.
Signed-off-by: Li zeming <zeming@nfschina.com>
Link: https://lore.kernel.org/r/20230317182538.3027-1-zeming@nfschina.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Felix Kuehling [Tue, 14 Mar 2023 00:03:08 +0000 (20:03 -0400)]
drm/amdgpu: Don't resume IOMMU after incomplete init
Check kfd->init_complete in kgd2kfd_iommu_resume, consistent with other
kgd2kfd calls. This should fix IOMMU errors on resume from suspend when
KFD IOMMU initialization failed.
Reported-by: Matt Fagnani <matt.fagnani@bell.net>
Link: https://lore.kernel.org/r/4a3b225c-2ffd-e758-4de1-447375e34cad@bell.net/
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217170
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2454
Cc: Vasant Hegde <vasant.hegde@amd.com>
Cc: Linux regression tracking (Thorsten Leemhuis) <regressions@leemhuis.info>
Cc: stable@vger.kernel.org
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Matt Fagnani <matt.fagnani@bell.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
David Belanger [Tue, 28 Feb 2023 19:11:24 +0000 (14:11 -0500)]
drm/amdkfd: Fixed kfd_process cleanup on module exit.
Handle case when module is unloaded (kfd_exit) before a process space
(mm_struct) is released.
v2: Fixed potential race conditions by removing all kfd_process from
the process table first, then working on releasing the resources.
v3: Fixed loop element access / synchronization. Fixed extra empty lines.
Signed-off-by: David Belanger <david.belanger@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Ayush Gupta [Thu, 2 Mar 2023 14:58:05 +0000 (09:58 -0500)]
drm/amd/display: disconnect MPCC only on OTG change
[Why]
Frame drops are observed while playing VP9 and AV1 10-bit
video at 8k resolution using VSR while the playback controls
disappear or appear.
[How]
Now ODM 2 to 1 is disabled for 5k or greater resolutions on VSR.
Cc: stable@vger.kernel.org
Cc: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Alvin Lee <Alvin.Lee2@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Ayush Gupta <ayugupta@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cruise Hung [Thu, 2 Mar 2023 02:33:51 +0000 (10:33 +0800)]
drm/amd/display: Fix DP MST sinks removal issue
[Why]
In USB4 DP tunneling, it is possible that the path becomes unavailable
and the CM tears down the path slightly late.
In this case, the HPD is high but reading any DPCD register fails.
That causes the link connection type to be set to SST,
so not all sinks behind the MST branch are removed.
[How]
Restore the link connection type if it fails to read DPCD register.
Cc: stable@vger.kernel.org
Cc: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Wenjing Liu <Wenjing.Liu@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Cruise Hung <Cruise.Hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Wesley Chalmers [Fri, 4 Nov 2022 02:29:31 +0000 (22:29 -0400)]
drm/amd/display: Do not set DRR on pipe Commit
[WHY]
Writing to DRR registers such as OTG_V_TOTAL_MIN on the same frame as a
pipe commit can cause underflow.
Cc: stable@vger.kernel.org
Cc: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Wesley Chalmers <Wesley.Chalmers@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Saaem Rizvi [Mon, 27 Feb 2023 23:55:07 +0000 (18:55 -0500)]
drm/amd/display: Remove OTG DIV register write for Virtual signals.
[WHY]
Hot plugging and then hot unplugging causes the k1 and k2 values to
change, as the signal is detected as a virtual signal on hot unplug. Writing
these values to the OTG_PIXEL_RATE_DIV register might cause the primary
display to blank (known hw bug).
[HOW]
No longer write the k1 and k2 values to the register if the signal is
virtual. We have safeguards in place in case k1 and k2 are unassigned, so
that an unknown value is not written to the register either.
Cc: stable@vger.kernel.org
Cc: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Samson Tam <Samson.Tam@amd.com>
Reviewed-by: Alvin Lee <Alvin.Lee2@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Saaem Rizvi <SyedSaaem.Rizvi@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Linus Torvalds [Wed, 15 Mar 2023 19:20:37 +0000 (12:20 -0700)]
Merge tag 'linux-kselftest-fixes-6.3-rc3' of git://git./linux/kernel/git/shuah/linux-kselftest
Pull kselftest fixes from Shuah Khan:
"A fix to amd-pstate test Makefile and a fix to LLVM build for x86 in
kselftest common lib.mk"
* tag 'linux-kselftest-fixes-6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests: fix LLVM build for i386 and x86_64
selftests: amd-pstate: fix TEST_FILES
Jens Axboe [Wed, 15 Mar 2023 18:18:07 +0000 (12:18 -0600)]
Merge branch 'md-fixes' of https://git./linux/kernel/git/song/md into block-6.3
Pull MD fixes from Song:
"This set contains two fixes for old issues (by Neil) and one fix
for 6.3 (by Xiao)."
* 'md-fixes' of https://git.kernel.org/pub/scm/linux/kernel/git/song/md:
md: select BLOCK_LEGACY_AUTOLOAD
md: avoid signed overflow in slot_store()
md: Free resources in __md_stop
NeilBrown [Mon, 13 Mar 2023 20:29:17 +0000 (13:29 -0700)]
md: select BLOCK_LEGACY_AUTOLOAD
When BLOCK_LEGACY_AUTOLOAD is not enabled, mdadm is not able to
activate new arrays unless "CREATE names=yes" appears in
mdadm.conf.
As this is a regression we need to always enable BLOCK_LEGACY_AUTOLOAD
when MD is selected - at least until mdadm is updated and the
updates are widely available.
Cc: stable@vger.kernel.org # v5.18+
Fixes: fbdee71bb5d8 ("block: deprecate autoloading based on dev_t")
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Song Liu <song@kernel.org>
Yu Kuai [Thu, 23 Feb 2023 09:12:26 +0000 (17:12 +0800)]
block: count 'ios' and 'sectors' when io is done for bio-based device
While using iostat for raid, I occasionally observed very strange 'await'
values, and it turns out this is because 'ios' and 'sectors' are
counted in bdev_start_io_acct(), while 'nsecs' is counted in
bdev_end_io_acct(). I'm not sure why they are counted like that
but I think this behaviour is obviously wrong because the user will get
wrong disk stats.
Fix the problem by counting 'ios' and 'sectors' when io is done, like
rq-based devices do.
Fixes: 394ffa503bc4 ("blk: introduce generic io stat accounting help function")
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20230223091226.1135678-1-yukuai1@huaweicloud.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
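The effect on 'await' described in the accounting fix above can be reproduced with a small simulation. This is a sketch under stated assumptions: it treats await as the change in accumulated I/O time divided by the change in the I/O count between two samples, and the function names, the (start, end) event tuples and the time units are made up for illustration.

```python
def disk_stats(ios_events, sample_time, count_at_start):
    """Return (ios, nsecs) as the counters would read at sample_time.

    ios_events is a list of (start, end) times for individual I/Os.
    """
    ios = nsecs = 0
    for start, end in ios_events:
        counted_at = start if count_at_start else end
        if counted_at <= sample_time:
            ios += 1                 # 'ios' bumped at start or at completion
        if end <= sample_time:
            nsecs += end - start     # latency is only known at completion
    return ios, nsecs


def await_per_interval(ios_events, samples, count_at_start):
    """Average latency per counted I/O for each sampling interval."""
    result = []
    prev = (0, 0)
    for t in samples:
        cur = disk_stats(ios_events, t, count_at_start)
        d_ios, d_nsecs = cur[0] - prev[0], cur[1] - prev[1]
        result.append(d_nsecs / d_ios if d_ios else None)
        prev = cur
    return result
```

With one long I/O that spans a sample boundary, counting 'ios' at start attributes the I/O to one interval and its latency to the next, so neither interval reports the true average. Counting both at completion keeps the two counters in step.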
Liang He [Wed, 15 Mar 2023 06:20:32 +0000 (14:20 +0800)]
block: sunvdc: add check for mdesc_grab() returning NULL
In vdc_port_probe(), we should check the return value of mdesc_grab() as
it may return NULL, which can cause a potential NULL pointer dereference
(NPD) bug.
Fixes: 43fdf27470b2 ("[SPARC64]: Abstract out mdesc accesses for better MD update handling.")
Signed-off-by: Liang He <windhl@126.com>
Link: https://lore.kernel.org/r/20230315062032.1741692-1-windhl@126.com
[axboe: style cleanup]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Damien Le Moal [Mon, 6 Mar 2023 01:13:13 +0000 (10:13 +0900)]
nvmet: avoid potential UAF in nvmet_req_complete()
An nvme target ->queue_response() operation implementation may free the
request passed as argument. Such an implementation could potentially result
in a use after free of the request pointer when percpu_ref_put() is
called in nvmet_req_complete().
Avoid this problem by using a local variable to save the sq pointer
before calling __nvmet_req_complete(), thus avoiding dereferencing the
req pointer after that function call.
Fixes: a07b4970f464 ("nvmet: add a generic NVMe target")
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
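The pattern used by the nvmet fix above (save what is needed from the request before handing it to code that may free it) can be illustrated with a toy model. Python has no real use-after-free, so the Request class below raises when a "freed" request is touched. All names here are hypothetical stand-ins, not the nvmet API.

```python
class UseAfterFree(Exception):
    """Stands in for touching memory that has already been released."""


class Request:
    def __init__(self, sq):
        self._sq = sq
        self.freed = False

    @property
    def sq(self):
        if self.freed:
            raise UseAfterFree("req dereferenced after queue_response()")
        return self._sq


def queue_response(req):
    # A transport is allowed to release the request here.
    req.freed = True


def complete_unsafe(req, put_ref):
    queue_response(req)
    put_ref(req.sq)          # reads req after it may have been freed


def complete_safe(req, put_ref):
    sq = req.sq              # save the pointer while req is still valid
    queue_response(req)
    put_ref(sq)              # no further access to req
```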
Minwoo Im [Thu, 9 Mar 2023 14:31:18 +0000 (23:31 +0900)]
nvme-trace: show more opcode names
We have more commands to show in the trace. Sync up.
Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Sagi Grimberg [Mon, 13 Mar 2023 08:56:23 +0000 (10:56 +0200)]
nvme-tcp: add nvme-tcp pdu size build protection
Make sure that we don't somehow mess up the wire structures in the spec.
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <kkch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Sagi Grimberg [Mon, 13 Mar 2023 08:56:22 +0000 (10:56 +0200)]
nvme-tcp: fix opcode reporting in the timeout handler
For non in-capsule writes we reuse the request pdu space for a h2cdata
pdu in order to avoid over allocating space (either preallocate or
dynamically upon receiving an r2t pdu). However if the request times out
the core expects to find the opcode in the start of the request, which
we overwrite.
In order to prevent that, without sacrificing an additional 24 bytes per
request, we just use the tail of the command pdu space instead (last
24 bytes from the 72 bytes command pdu). That should make the command
opcode always available, and we get away from allocating more space.
If in the future we would need the last 24 bytes of the nvme command
available we would need to allocate a dedicated space for it in the
request, but until then we can avoid doing so.
Reported-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Chaitanya Kulkarni <kkch@nvidia.com>
Tested-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
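The two nvme-tcp changes above (build-time size checks on the wire structures, and placing the h2cdata pdu in the tail of the command pdu) can be sketched in user space. The 72-byte and 24-byte sizes come from the commit message. The field layouts below are my reading of the NVMe/TCP PDU formats and should be treated as assumptions, as should the helper names.

```python
import struct

# Common header: type, flags, hlen, pdo, plen (little endian, packed).
HDR = struct.Struct("<BBBBI")
# Command PDU: common header followed by a 64-byte NVMe command.
CMD_PDU_SIZE = HDR.size + 64
# H2CData PDU: header, command_id, ttag, data_offset, data_length, reserved.
H2C_DATA = struct.Struct("<BBBBIHHIII")

# Size checks in the spirit of a build-time assertion.
assert HDR.size == 8
assert CMD_PDU_SIZE == 72
assert H2C_DATA.size == 24

OPCODE_OFFSET = HDR.size  # the opcode is the first byte of the NVMe command


def new_cmd_pdu(opcode):
    """Return a zeroed command PDU buffer with only the opcode filled in."""
    pdu = bytearray(CMD_PDU_SIZE)
    pdu[OPCODE_OFFSET] = opcode
    return pdu


def place_h2c_data(pdu, at_tail):
    """Write an H2CData PDU into the command PDU buffer; return its offset."""
    offset = CMD_PDU_SIZE - H2C_DATA.size if at_tail else 0
    H2C_DATA.pack_into(pdu, offset, 0x06, 0, H2C_DATA.size, 0,
                       H2C_DATA.size, 0xFFFF, 0xFFFF, 0, 0, 0)
    return offset
```

Writing the 24-byte PDU at offset 0 covers byte 8, where the opcode lives. Writing it at offset 48 leaves the first 48 bytes, including the opcode, untouched.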
Philipp Geulen [Mon, 13 Mar 2023 10:11:50 +0000 (11:11 +0100)]
nvme-pci: add NVME_QUIRK_BOGUS_NID for Lexar NM620
Added a quirk to fix Lexar NM620 1TB SSD reporting duplicate NGUIDs.
Signed-off-by: Philipp Geulen <p.geulen@js-elektronik.de>
Reviewed-by: Chaitanya Kulkarni <kkch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Elmer Miroslav Mosher Golovin [Wed, 8 Mar 2023 16:19:29 +0000 (19:19 +0300)]
nvme-pci: add NVME_QUIRK_BOGUS_NID for Netac NV3000
Added a quirk to fix the Netac NV3000 SSD reporting duplicate NGUIDs.
Cc: <stable@vger.kernel.org>
Signed-off-by: Elmer Miroslav Mosher Golovin <miroslav@mishamosher.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Irvin Cote [Wed, 8 Mar 2023 21:05:08 +0000 (18:05 -0300)]
nvme-pci: fixing memory leak in probe teardown path
In case the nvme_probe teardown path is triggered the ctrl ref count does
not reach 0 thus creating a memory leak upon failure of nvme_probe.
Signed-off-by: Irvin Cote <irvincoteg@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Ming Lei [Fri, 3 Mar 2023 23:13:45 +0000 (07:13 +0800)]
nvme: fix handling single range discard request
When investigating a customer report of a warning in nvme_setup_discard,
we observed that the controller (nvme/tcp) actually exposes
queue_max_discard_segments(req->q) == 1.
The current code can't handle this situation, since in that case discard
bios are merged by contiguity like normal RW requests.
Fix the issue by building the range from the request sector/nr_sectors
directly.
Fixes: b35ba01ea697 ("nvme: support ranged discard requests")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
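The single-range case described in the discard fix above can be sketched as follows. This is an illustrative model only: discard_ranges() and the (sector, nr_sectors) tuples are assumptions, and the real driver works in LBA units rather than sectors.

```python
def discard_ranges(bios, max_discard_segments):
    """Return a list of (start_sector, nr_sectors) ranges for a discard request.

    bios is a list of (sector, nr_sectors) pairs making up the request.
    """
    if max_discard_segments == 1:
        # Bios were merged by contiguity, so the request is one extent:
        # build it from the request's start sector and total length.
        start = bios[0][0]
        total = sum(n for _, n in bios)
        return [(start, total)]
    # Otherwise each bio becomes its own range.
    return [(sector, n) for sector, n in bios]
```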
Lukas Bulwahn [Wed, 8 Mar 2023 14:41:32 +0000 (15:41 +0100)]
MAINTAINERS: repair malformed T: entries in NVM EXPRESS DRIVERS
The T: entries shall be composed of a SCM tree type (git, hg, quilt, stgit
or topgit) and location.
Add the SCM tree type to the T: entry, and reorder the file entries in
alphabetical order.
Fixes: b508fc354f6d ("nvme: update maintainers information")
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
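The T: entry format described in the MAINTAINERS fix above (an SCM tree type followed by a location) is easy to check mechanically. The helper below is a hypothetical sketch, not an existing kernel script, and the sample lines in the test are illustrative.

```python
SCM_TYPES = ("git", "hg", "quilt", "stgit", "topgit")


def is_valid_tree_entry(line):
    """Check that a MAINTAINERS line has the form 'T: <scm type> <location>'."""
    tag, _, rest = line.partition(":")
    if tag != "T":
        return False
    parts = rest.split()
    # A well-formed entry names the SCM type first, then the location.
    return len(parts) >= 2 and parts[0] in SCM_TYPES
```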
Michal Koutný [Tue, 14 Mar 2023 18:33:32 +0000 (19:33 +0100)]
io_uring/sqpoll: Do not set PF_NO_SETAFFINITY on sqpoll threads
Users may specify a CPU where the sqpoll thread would run. This may
conflict with cpuset operations because of strict PF_NO_SETAFFINITY
requirement. That flag is unnecessary for polling "kernel" threads, see
the reasoning in commit 01e68ce08a30 ("io_uring/io-wq: stop setting
PF_NO_SETAFFINITY on io-wq workers"). Drop the flag on poll threads too.
Fixes: 01e68ce08a30 ("io_uring/io-wq: stop setting PF_NO_SETAFFINITY on io-wq workers")
Link: https://lore.kernel.org/all/20230314162559.pnyxdllzgw7jozgx@blackpad/
Signed-off-by: Michal Koutný <mkoutny@suse.com>
Link: https://lore.kernel.org/r/20230314183332.25834-1-mkoutny@suse.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Damien Le Moal [Tue, 14 Mar 2023 04:11:06 +0000 (13:11 +0900)]
block: null_blk: cleanup null_queue_rq()
Use a local struct request pointer variable to avoid having to
dereference struct blk_mq_queue_data multiple times. While at it, also
fix the function argument indentation and remove a useless "else" after
a return.
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Pankaj Raghav <p.raghav@samsung.com>
Link: https://lore.kernel.org/r/20230314041106.19173-2-damien.lemoal@opensource.wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Damien Le Moal [Tue, 14 Mar 2023 04:11:05 +0000 (13:11 +0900)]
block: null_blk: Fix handling of fake timeout request
When injecting a fake timeout into the null_blk driver using
fail_io_timeout, the request timeout handler does not execute
blk_mq_complete_request(), so the complete callback is never executed
for a timed-out request.
The null_blk driver also has a driver-specific fake timeout mechanism
which does not have this problem. Fix the problem with fail_io_timeout
by using the same mechanism as the null_blk internal timeout feature, using
the fake_timeout field of null_blk commands.
Reported-by: Akinobu Mita <akinobu.mita@gmail.com>
Fixes: de3510e52b0a ("null_blk: fix command timeout completion handling")
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/20230314041106.19173-2-damien.lemoal@opensource.wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Russell Currey [Fri, 10 Mar 2023 05:08:34 +0000 (16:08 +1100)]
powerpc/mm: Fix false detection of read faults
To support detection of read faults with Radix execute-only memory, the
vma_is_accessible() check in access_error() (which checks for PROT_NONE)
was replaced with a check to see if VM_READ was missing, and if so,
returns true to assert the fault was caused by a bad read.
This is incorrect, as it ignores that both VM_WRITE and VM_EXEC imply
read on powerpc, as defined in protection_map[]. This causes mappings
containing VM_WRITE or VM_EXEC without VM_READ to misreport the cause of
page faults, since the MMU is still allowing reads.
Correct this by restoring the original vma_is_accessible() check for
PROT_NONE mappings, and adding a separate check for Radix PROT_EXEC-only
mappings.
Fixes: 395cac7752b9 ("powerpc/mm: Support execute-only memory on the Radix MMU")
Reported-by: Michal Suchánek <msuchanek@suse.de>
Link: https://lore.kernel.org/r/20230308152702.GR19419@kitsune.suse.cz
Tested-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230310050834.63105-1-ruscur@russell.cc
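The corrected read-fault check described in the powerpc/mm fix above can be modelled in a few lines. This is a sketch of the logic as the commit message states it, not the kernel function: the function names and the radix parameter are illustrative, and the flag values are assumed to match the usual VM_READ/VM_WRITE/VM_EXEC bit assignments.

```python
VM_READ, VM_WRITE, VM_EXEC = 0x1, 0x2, 0x4
VM_ACCESS_FLAGS = VM_READ | VM_WRITE | VM_EXEC


def is_bad_read_buggy(vm_flags):
    # Version described as incorrect: any mapping without VM_READ
    # is reported as a bad read.
    return not (vm_flags & VM_READ)


def is_bad_read_fixed(vm_flags, radix):
    access = vm_flags & VM_ACCESS_FLAGS
    if access == 0:                    # PROT_NONE: nothing is accessible
        return True
    if radix and access == VM_EXEC:    # execute-only mapping on Radix
        return True
    return False                       # VM_WRITE or VM_EXEC otherwise imply read
```

A write-only mapping is the case the buggy check gets wrong: it lacks VM_READ, yet the MMU still allows reads, so a read fault on it must have another cause.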
Marek Szyprowski [Thu, 9 Mar 2023 15:24:46 +0000 (16:24 +0100)]
drm/meson: dw-hdmi: Fix devm_regulator_*get_enable*() conversion again
devm_regulator_get_enable_optional() returns -ENODEV if the requested
optional regulator is not present. Adjust the code for that, because in
commit 67d0a30128c9 I incorrectly assumed that it also returns 0 when
the regulator is not present.
Reported-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>
Fixes: 67d0a30128c9 ("drm/meson: dw-hdmi: Fix devm_regulator_*get_enable*() conversion")
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Acked-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20230309152446.104913-1-m.szyprowski@samsung.com
Liu Ying [Tue, 14 Mar 2023 05:50:35 +0000 (13:50 +0800)]
drm/bridge: Fix returned array size name for atomic_get_input_bus_fmts kdoc
The returned array size for input formats is set through
atomic_get_input_bus_fmts()'s 'num_input_fmts' argument, so use
'num_input_fmts' to represent the array size in the function's kdoc,
not 'num_output_fmts'.
Fixes: 91ea83306bfa ("drm/bridge: Fix the bridge kernel doc")
Fixes: f32df58acc68 ("drm/bridge: Add the necessary bits to support bus format negotiation")
Signed-off-by: Liu Ying <victor.liu@nxp.com>
Reviewed-by: Robert Foss <rfoss@kernel.org>
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20230314055035.3731179-1-victor.liu@nxp.com
Paulo Alcantara [Tue, 14 Mar 2023 23:32:56 +0000 (20:32 -0300)]
cifs: use DFS root session instead of tcon ses
Use DFS root session whenever possible to get new DFS referrals
otherwise we might end up with an IPC tcon (tcon->ses->tcon_ipc) that
doesn't respond to them. It should be safe accessing
@ses->dfs_root_ses directly in cifs_inval_name_dfs_link_error() as it
has same lifetime as of @tcon.
Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Cc: stable@vger.kernel.org # 6.2
Signed-off-by: Steve French <stfrench@microsoft.com>
Paulo Alcantara [Tue, 14 Mar 2023 23:32:55 +0000 (20:32 -0300)]
cifs: return DFS root session id in DebugData
Return the DFS root session id in /proc/fs/cifs/DebugData to make it
easier to track which IPC tcon was used to get new DFS referrals for a
specific connection, which aids in debugging.
A sample of the output would be:
Sessions:
1) Address: 192.168.1.13 Uses: 1 Capability: 0x300067 Session Status: 1
Security type: RawNTLMSSP SessionId: 0xd80000000009
User: 0 Cred User: 0
DFS root session id: 0x128006c000035
Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Cc: stable@vger.kernel.org # 6.2
Signed-off-by: Steve French <stfrench@microsoft.com>
Linus Torvalds [Wed, 15 Mar 2023 02:32:38 +0000 (19:32 -0700)]
sched_getaffinity: don't assume 'cpumask_size()' is fully initialized
The getaffinity() system call uses 'cpumask_size()' to decide how big
the CPU mask is - so far so good. It is indeed the allocation size of a
cpumask.
But the code also assumes that the whole allocation is initialized
without actually doing so itself. That's wrong, because we might have
fixed-size allocations (making copying and clearing more efficient), but
not all of it is then necessarily used if 'nr_cpu_ids' is smaller.
Having checked other users of 'cpumask_size()', they all seem to be ok,
either using it purely for the allocation size, or explicitly zeroing
the cpumask before using the size in bytes to copy it.
See for example the ublk_ctrl_get_queue_affinity() function that uses
the proper 'zalloc_cpumask_var()' to make sure that the whole mask is
cleared, whether the storage is on the stack or if it was an external
allocation.
Fix this by just zeroing the allocation before using it. Do the same
for the compat version of sched_getaffinity(), which had the same logic.
Also, for consistency, make sched_getaffinity() use 'cpumask_bits()' to
access the bits. For a cpumask_var_t, it ends up being a pointer to the
same data either way, but it's just a good idea to treat it like you
would a 'cpumask_t'. The compat case already did that.
Reported-by: Ryan Roberts <ryan.roberts@arm.com>
Link: https://lore.kernel.org/lkml/7d026744-6bd6-6827-0471-b5e8eae0be3f@arm.com/
Cc: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
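The problem described in the sched_getaffinity fix above can be shown with a toy buffer model. It is not the kernel implementation: the function, its parameters and the 0xaa "stale data" pattern are assumptions chosen to make the uninitialized tail visible.

```python
def getaffinity(allowed_cpus, nr_cpu_ids, cpumask_size, user_len, zero_first):
    """Return the bytes copied to user space for a toy sched_getaffinity()."""
    # A fresh allocation holds whatever was there before (0xaa as stale data).
    mask = bytearray(b"\xaa" * cpumask_size)
    if zero_first:
        mask[:] = bytes(cpumask_size)      # clear the whole allocation
    # Only the first nr_cpu_ids bits are ever written by the mask operations.
    for cpu in range(nr_cpu_ids):
        byte, bit = divmod(cpu, 8)
        if cpu in allowed_cpus:
            mask[byte] |= 1 << bit
        else:
            mask[byte] &= ~(1 << bit) & 0xFF
    # The copy length is bounded by the allocation size, not by nr_cpu_ids.
    return bytes(mask[:min(user_len, cpumask_size)])
```

With 8 possible CPUs and a 16-byte allocation, only the first byte is meaningful. Without zeroing, the remaining 15 bytes handed back are stale data.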
Dylan Jhong [Fri, 10 Mar 2023 07:50:21 +0000 (15:50 +0800)]
RISC-V: mm: Support huge page in vmalloc_fault()
RISC-V supports ioremap() with huge page (pud/pmd) mappings.
However, vmalloc_fault() assumes that the vmalloc range is limited
to pte mappings. Complete the vmalloc_fault() function by adding
huge page support.
Fixes: 310f541a027b ("riscv: Enable HAVE_ARCH_HUGE_VMAP for 64BIT")
Cc: stable@vger.kernel.org
Signed-off-by: Dylan Jhong <dylan@andestech.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20230310075021.3919290-1-dylan@andestech.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
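The idea behind the vmalloc_fault() change above is that a page-table walk has to stop at a pud or pmd entry when that entry is itself a (huge) leaf mapping, rather than assuming a pte level always exists. A minimal sketch, with a hypothetical tuple representation of the entries:

```python
def vmalloc_fault_walk(entries):
    """Walk kernel page-table entries from the top level down.

    entries is a list of (level, present, leaf) tuples ordered from the
    upper levels to the pte. Returns the level at which a valid mapping
    was found, or None if the fault cannot be resolved.
    """
    for level, present, leaf in entries:
        if not present:
            return None          # hole in the kernel mapping: a real fault
        if leaf or level == "pte":
            return level         # huge mapping or final pte: stop here
    return None
```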
Paulo Alcantara [Tue, 14 Mar 2023 23:32:54 +0000 (20:32 -0300)]
cifs: fix use-after-free bug in refresh_cache_worker()
The UAF bug occurred because we were putting DFS root sessions in
cifs_umount() while DFS cache refresher was being executed.
Make DFS root sessions have same lifetime as DFS tcons so we can avoid
the use-after-free bug in DFS cache refresher and other places that
require IPCs to get new DFS referrals on. Also, get rid of mount
group handling in DFS cache as we no longer need it.
This fixes the below use-after-free bug caught by KASAN
[ 379.946955] BUG: KASAN: use-after-free in __refresh_tcon.isra.0+0x10b/0xc10 [cifs]
[ 379.947642] Read of size 8 at addr ffff888018f57030 by task kworker/u4:3/56
[ 379.948096]
[ 379.948208] CPU: 0 PID: 56 Comm: kworker/u4:3 Not tainted 6.2.0-rc7-lku #23
[ 379.948661] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552-rebuilt.opensuse.org 04/01/2014
[ 379.949368] Workqueue: cifs-dfscache refresh_cache_worker [cifs]
[ 379.949942] Call Trace:
[ 379.950113] <TASK>
[ 379.950260] dump_stack_lvl+0x50/0x67
[ 379.950510] print_report+0x16a/0x48e
[ 379.950759] ? __virt_addr_valid+0xd8/0x160
[ 379.951040] ? __phys_addr+0x41/0x80
[ 379.951285] kasan_report+0xdb/0x110
[ 379.951533] ? __refresh_tcon.isra.0+0x10b/0xc10 [cifs]
[ 379.952056] ? __refresh_tcon.isra.0+0x10b/0xc10 [cifs]
[ 379.952585] __refresh_tcon.isra.0+0x10b/0xc10 [cifs]
[ 379.953096] ? __pfx___refresh_tcon.isra.0+0x10/0x10 [cifs]
[ 379.953637] ? __pfx___mutex_lock+0x10/0x10
[ 379.953915] ? lock_release+0xb6/0x720
[ 379.954167] ? __pfx_lock_acquire+0x10/0x10
[ 379.954443] ? refresh_cache_worker+0x34e/0x6d0 [cifs]
[ 379.954960] ? __pfx_wb_workfn+0x10/0x10
[ 379.955239] refresh_cache_worker+0x4ad/0x6d0 [cifs]
[ 379.955755] ? __pfx_refresh_cache_worker+0x10/0x10 [cifs]
[ 379.956323] ? __pfx_lock_acquired+0x10/0x10
[ 379.956615] ? read_word_at_a_time+0xe/0x20
[ 379.956898] ? lockdep_hardirqs_on_prepare+0x12/0x220
[ 379.957235] process_one_work+0x535/0x990
[ 379.957509] ? __pfx_process_one_work+0x10/0x10
[ 379.957812] ? lock_acquired+0xb7/0x5f0
[ 379.958069] ? __list_add_valid+0x37/0xd0
[ 379.958341] ? __list_add_valid+0x37/0xd0
[ 379.958611] worker_thread+0x8e/0x630
[ 379.958861] ? __pfx_worker_thread+0x10/0x10
[ 379.959148] kthread+0x17d/0x1b0
[ 379.959369] ? __pfx_kthread+0x10/0x10
[ 379.959630] ret_from_fork+0x2c/0x50
[ 379.959879] </TASK>
Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Cc: stable@vger.kernel.org # 6.2
Signed-off-by: Steve French <stfrench@microsoft.com>
Paulo Alcantara [Tue, 14 Mar 2023 23:32:53 +0000 (20:32 -0300)]
cifs: set DFS root session in cifs_get_smb_ses()
Set the DFS root session pointer earlier when creating a new SMB
session to prevent racing with smb2_reconnect(), cifs_reconnect_tcon()
and DFS cache refresher.
Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Cc: stable@vger.kernel.org # 6.2
Signed-off-by: Steve French <stfrench@microsoft.com>
Chris Leech [Fri, 10 Mar 2023 01:09:13 +0000 (09:09 +0800)]
blk-mq: fix "bad unlock balance detected" on q->srcu in __blk_mq_run_dispatch_ops
The 'q' parameter of the macro __blk_mq_run_dispatch_ops may not be a
local variable; for example, it may be rq->q. The request queue pointed to
by this expression could then change to another queue in case of
BLK_MQ_F_TAG_QUEUE_SHARED after 'dispatch_ops' returns, and
'bad unlock balance' is triggered.
Fix the issue by adding a local variable for doing the srcu lock/unlock.
Fixes: 2a904d00855f ("blk-mq: remove hctx_lock and hctx_unlock")
Cc: Marco Patalano <mpatalan@redhat.com>
Signed-off-by: Chris Leech <cleech@redhat.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20230310010913.1014789-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
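The double-evaluation hazard described in the blk-mq fix above can be modelled by passing the 'q' argument as a callable, which plays the role of a macro argument that is expanded at each use. The classes and function names are hypothetical and only mimic the lock/unlock pairing.

```python
class Srcu:
    def __init__(self, name):
        self.name = name
        self.readers = 0

    def lock(self):
        self.readers += 1

    def unlock(self):
        if self.readers == 0:
            raise RuntimeError("bad unlock balance detected on " + self.name)
        self.readers -= 1


class Queue:
    def __init__(self, name):
        self.srcu = Srcu(name)


class Rq:
    def __init__(self, q):
        self.q = q


def run_dispatch_ops_buggy(get_q, dispatch_ops):
    # Like a macro that expands its 'q' argument twice.
    get_q().srcu.lock()
    dispatch_ops()
    get_q().srcu.unlock()      # may now name a different queue


def run_dispatch_ops_fixed(get_q, dispatch_ops):
    q = get_q()                # evaluate once into a local variable
    q.srcu.lock()
    dispatch_ops()
    q.srcu.unlock()
```

If dispatch_ops() reassigns rq.q, the buggy variant unlocks a queue it never locked, while the fixed variant always unlocks the queue it locked.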
Bart Van Assche [Tue, 14 Mar 2023 18:21:54 +0000 (11:21 -0700)]
loop: Fix use-after-free issues
do_req_filebacked() calls blk_mq_complete_request() synchronously or
asynchronously when using asynchronous I/O unless memory allocation fails.
Hence, modify loop_handle_cmd() such that it does not dereference 'cmd' nor
'rq' after do_req_filebacked() finished unless we are sure that the request
has not yet been completed. This patch fixes the following kernel crash:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000054
Call trace:
css_put.42938+0x1c/0x1ac
loop_process_work+0xc8c/0xfd4
loop_rootcg_workfn+0x24/0x34
process_one_work+0x244/0x558
worker_thread+0x400/0x8fc
kthread+0x16c/0x1e0
ret_from_fork+0x10/0x20
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Dan Schatzberg <schatzberg.dan@gmail.com>
Fixes: c74d40e8b5e2 ("loop: charge i/o to mem and blk cg")
Fixes: bc07c10a3603 ("block: loop: support DIO & AIO")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20230314182155.80625-1-bvanassche@acm.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Linus Torvalds [Wed, 15 Mar 2023 00:13:58 +0000 (17:13 -0700)]
Merge tag 'mm-hotfixes-stable-2023-03-14-16-51' of git://git./linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"Eleven hotfixes.
Four of these are cc:stable and the remainder address post-6.2 issues
or aren't considered suitable for backporting.
Seven of these fixes are for MM"
* tag 'mm-hotfixes-stable-2023-03-14-16-51' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
mm/damon/paddr: fix folio_nr_pages() after folio_put() in damon_pa_mark_accessed_or_deactivate()
mm/damon/paddr: fix folio_size() call after folio_put() in damon_pa_young()
ocfs2: fix data corruption after failed write
migrate_pages: try migrate in batch asynchronously firstly
migrate_pages: move split folios processing out of migrate_pages_batch()
migrate_pages: fix deadlock in batched migration
.mailmap: add Alexandre Ghiti personal email address
mailmap: correct Dikshita Agarwal's Qualcomm email address
mailmap: updates for Jarkko Sakkinen
mm/userfaultfd: propagate uffd-wp bit when PTE-mapping the huge zeropage
mm: teach mincore_hugetlb about pte markers
Linus Torvalds [Wed, 15 Mar 2023 00:07:54 +0000 (17:07 -0700)]
Merge tag 'trace-v6.3-rc1' of git://git./linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:
- Do not allow histogram values to have modifiers. They can cause a NULL
pointer dereference if they do.
- Warn if hist_field_name() is passed a NULL. Prevent the NULL pointer
dereference mentioned above.
- Fix invalid address look up race in lookup_rec()
- Define ftrace_stub_graph conditionally to prevent linker errors
- Always check if RCU is watching at all tracepoint locations
* tag 'trace-v6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing: Make tracepoint lockdep check actually test something
ftrace,kcfi: Define ftrace_stub_graph conditionally
ftrace: Fix invalid address access in lookup_rec() when index is 0
tracing: Check field value in hist_field_name()
tracing: Do not let histogram values have some modifiers
Linus Torvalds [Wed, 15 Mar 2023 00:03:25 +0000 (17:03 -0700)]
Merge tag 'zstd-linus-v6.3-rc3' of https://github.com/terrelln/linux
Pull zstd fixes from Nick Terrell:
"A small number of fixes for zstd-v1.5.2.
I'm not pulling in zstd-v1.5.4 from upstream this release because it
didn't have any time to bake in linux-next, but I'm aiming for the
next update in v6.4"
* tag 'zstd-linus-v6.3-rc3' of https://github.com/terrelln/linux:
zstd: Fix definition of assert()
lib: zstd: Backport fix for in-place decompression
lib: zstd: Fix -Wstringop-overflow warning
Linus Torvalds [Tue, 14 Mar 2023 23:58:33 +0000 (16:58 -0700)]
Merge tag 'clk-fixes-for-linus' of git://git./linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
"A collection of clk driver fixes, and a couple OF clk patches to fix
regressions seen in the last few weeks. The fwnode patch broke the
build for one driver that isn't always compiled, so I waited over the
weekend to be certain no more build issues came up.
- Mark the firmware node (fwnode) that matches the compatible in
CLK_OF_DECLARE() as initialized to fix a regression on u8500 SoCs
after fw_devlink stopped checking parent nodes in
of_link_to_phandle()
- Remove a couple MODULE_LICENSE macros in non-modules
- Update the maintainers file for Microchip clk drivers
- Use 'select' instead of 'depend on' for the REGMAP config to fix
Kconfig issues
- Use div_u64() for portable 64-bit division in K210 clk driver"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: Avoid invalid function names in CLK_OF_DECLARE()
clk: k210: remove an implicit 64-bit division
MAINTAINERS: add missing clock driver coverage for Microchip FPGAs
clk: HI655X: select REGMAP instead of depending on it
kbuild, clk: remove MODULE_LICENSE in non-modules
kbuild, clk: bcm2835: remove MODULE_LICENSE in non-modules
clk: Mark a fwnode as initialized when using CLK_OF_DECLARE() macro
Shyam Prasad N [Fri, 10 Mar 2023 15:32:01 +0000 (15:32 +0000)]
cifs: generate signkey for the channel that's reconnecting
Before my changes to how multichannel reconnects work, the
primary channel was always used to do a non-binding session
setup. With my changes, that is not the case anymore.
Missed this place where channel at index 0 was forcibly
updated with the signing key.
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
Volker Lendecke [Mon, 13 Mar 2023 15:09:54 +0000 (16:09 +0100)]
cifs: Fix smb2_set_path_size()
If cifs_get_writable_path() finds a writable file, smb2_compound_op()
must use that file's FID and not the COMPOUND_FID.
Cc: stable@vger.kernel.org
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Steven Rostedt (Google) [Fri, 10 Mar 2023 22:28:56 +0000 (17:28 -0500)]
tracing: Make tracepoint lockdep check actually test something
A while ago, the trace events had the following:
rcu_read_lock_sched_notrace();
rcu_dereference_sched(...);
rcu_read_unlock_sched_notrace();
If the tracepoint is enabled, it could trigger RCU issues if called in
the wrong place. And this warning was only triggered if lockdep was
enabled. If the tracepoint was never enabled with lockdep, the bug would
not be caught. To handle this, the above sequence was done when lockdep
was enabled regardless if the tracepoint was enabled or not (although the
always enabled code really didn't do anything, it would still trigger a
warning).
But a lot has changed since that lockdep code was added. One is, that
sequence no longer triggers any warning. Another is, the tracepoint when
enabled doesn't even do that sequence anymore.
The main check we care about today is whether RCU is "watching" or not.
So if lockdep is enabled, always check if rcu_is_watching() which will
trigger a warning if it is not (tracepoints require RCU to be watching).
Note, that old sequence did add a bit of overhead when lockdep was enabled,
and with the latest kernel updates, would cause the system to slow down
enough to trigger kernel "stalled" warnings.
Link: http://lore.kernel.org/lkml/20140806181801.GA4605@redhat.com
Link: http://lore.kernel.org/lkml/20140807175204.C257CAC5@viggo.jf.intel.com
Link: https://lore.kernel.org/lkml/20230307184645.521db5c9@gandalf.local.home/
Link: https://lore.kernel.org/linux-trace-kernel/20230310172856.77406446@gandalf.local.home
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Paul E. McKenney <paulmck@kernel.org>
Fixes: e6753f23d961 ("tracepoint: Make rcuidle tracepoint callers use SRCU")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Chen Yu [Wed, 8 Mar 2023 13:23:09 +0000 (21:23 +0800)]
ACPI: tools: pfrut: Check if the input of level and type is in the right numeric range
The user can provide an arbitrary non-numeric value for level and type,
which could cause unexpected behavior. In this case the expected
behavior would be to throw an error.
pfrut -h
usage: pfrut [OPTIONS]
code injection:
-l, --load
-s, --stage
-a, --activate
-u, --update [stage and activate]
-q, --query
-d, --revid
update telemetry:
-G, --getloginfo
-T, --type(0:execution, 1:history)
-L, --level(0, 1, 2, 4)
-R, --read
-D, --revid log
pfrut -T A
pfrut -G
log_level:0
log_type:0
log_revid:2
max_data_size:65536
chunk1_size:0
chunk2_size:1530
rollover_cnt:0
reset_cnt:17
Fix this by restricting the input to be in the expected range.
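The kind of check described above can be sketched roughly as follows. This is a minimal illustration, not the pfrut code: parse_in_range() is a hypothetical helper that uses strtol() and rejects non-numeric or out-of-range input.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not taken from pfrut): parse str as a base-10
 * integer and accept it only if it lies in [min, max].
 * Returns 0 and stores the value in *out on success, -1 otherwise.
 */
static int parse_in_range(const char *str, long min, long max, long *out)
{
	char *end;
	long val;

	assert(str && out);

	errno = 0;
	val = strtol(str, &end, 10);

	/* No digits consumed, trailing garbage, or overflow. */
	if (end == str || *end != '\0' || errno == ERANGE)
		return -1;

	if (val < min || val > max)
		return -1;

	*out = val;
	return 0;
}
```

With such a helper, an input like "A" for --type would be rejected instead of being silently accepted. For --level, whose listed values are 0, 1, 2 and 4, a membership check would be needed on top of the range check.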
Reported-by: Hariganesh Govindarajulu <hariganesh.govindarajulu@intel.com>
Suggested-by: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Sudeep Holla [Wed, 8 Mar 2023 11:26:32 +0000 (11:26 +0000)]
ACPI: PPTT: Fix to avoid sleep in the atomic context when PPTT is absent
Commit 0c80f9e165f8 ("ACPI: PPTT: Leave the table mapped for the runtime
usage") mapped the PPTT once on the first invocation of acpi_get_pptt()
and never unmapped it, allowing it to be used at runtime without the
hassle of mapping and unmapping the table. This was needed to fetch LLC
information from the PPTT in the cpuhotplug path, which is executed in
atomic context, as acpi_get_table() might sleep waiting for a mutex.
However, it failed to handle the case where there is no PPTT on the
system. In that case acpi_get_pptt() is called from all the secondary
CPUs, which attempt to fetch the LLC information in atomic context
without knowing that the PPTT is absent, resulting in a splat like the
one below:
| BUG: sleeping function called from invalid context at kernel/locking/semaphore.c:164
| in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 0, name: swapper/1
| preempt_count: 1, expected: 0
| RCU nest depth: 0, expected: 0
| no locks held by swapper/1/0.
| irq event stamp: 0
| hardirqs last enabled at (0): 0x0
| hardirqs last disabled at (0): copy_process+0x61c/0x1b40
| softirqs last enabled at (0): copy_process+0x61c/0x1b40
| softirqs last disabled at (0): 0x0
| CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.3.0-rc1 #1
| Call trace:
| dump_backtrace+0xac/0x138
| show_stack+0x30/0x48
| dump_stack_lvl+0x60/0xb0
| dump_stack+0x18/0x28
| __might_resched+0x160/0x270
| __might_sleep+0x58/0xb0
| down_timeout+0x34/0x98
| acpi_os_wait_semaphore+0x7c/0xc0
| acpi_ut_acquire_mutex+0x58/0x108
| acpi_get_table+0x40/0xe8
| acpi_get_pptt+0x48/0xa0
| acpi_get_cache_info+0x38/0x140
| init_cache_level+0xf4/0x118
| detect_cache_attributes+0x2e4/0x640
| update_siblings_masks+0x3c/0x330
| store_cpu_topology+0x88/0xf0
| secondary_start_kernel+0xd0/0x168
| __secondary_switched+0xb8/0xc0
Update acpi_get_pptt() to remember that the PPTT has already been checked
and is not available on the system, and to return NULL in that case. This
avoids any further attempts to fetch the PPTT and thereby any possible
sleep waiting for a mutex in atomic context.
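The idea of remembering a failed lookup can be illustrated with a small user-space sketch. All names here (struct table, slow_lookup(), get_table_cached()) are hypothetical stand-ins, and the sketch does not model locking or atomic context; it only shows that the potentially sleeping lookup is attempted once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct table {
	int id;
};

/* Counts how often the (potentially sleeping) lookup is attempted. */
static int slow_lookup_calls;

/* Stand-in for a lookup that may sleep; models a system without the table. */
static struct table *slow_lookup(void)
{
	slow_lookup_calls++;
	return NULL;
}

/*
 * Return the cached table, looking it up at most once. A failed lookup is
 * remembered so that later callers get NULL without retrying.
 */
static struct table *get_table_cached(void)
{
	static struct table *cached;
	static bool not_present;

	/* A table cannot be both cached and known to be absent. */
	assert(!(cached && not_present));

	if (not_present)
		return NULL;

	if (!cached) {
		cached = slow_lookup();
		if (!cached)
			not_present = true;
	}

	return cached;
}
```

Without the not_present flag, every caller on a system lacking the table would retry the lookup, which is the situation the splat above describes.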
Fixes: 0c80f9e165f8 ("ACPI: PPTT: Leave the table mapped for the runtime usage")
Reported-by: Aishwarya TCV <aishwarya.tcv@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Pierre Gondois <pierre.gondois@arm.com>
Cc: 6.0+ <stable@vger.kernel.org> # 6.0+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Linus Torvalds [Tue, 14 Mar 2023 18:08:28 +0000 (11:08 -0700)]
Merge tag 'docs-6.3-fixes' of git://git.lwn.net/linux
Pull documentation fixes from Jonathan Corbet:
"A handful of fixes and minor documentation updates"
* tag 'docs-6.3-fixes' of git://git.lwn.net/linux:
docs: vfio: fix header path
docs: process: typo fix
docs/mm: hugetlbfs_reserv: fix a reference to a file that doesn't exist
docs/mm: Physical Memory: fix a reference to a file that doesn't exist
docs: rebasing-and-merging: Drop wrong statement about git
docs: programming-language: add Rust programming language section
docs: programming-language: remove mention of the Intel compiler
docs: Correct missing "d_" prefix for dentry_operations member d_weak_revalidate
sched/doc: supplement CPU capacity with RISC-V
Todd Brandt [Mon, 13 Mar 2023 22:26:52 +0000 (15:26 -0700)]
pm-graph: sleepgraph: Avoid crashing on binary data in device names
A regression has occurred in the hid-sensor code where a device
name string has not been initialized to 0, ends up without a
terminating NUL character, and is printed with %s. This includes random binary
data in the device name, which makes its way into the ftrace output
and ends up crashing sleepgraph because it expects the ftrace output
to be ASCII only.
For example: "HID-SENSOR-INT-020b?.39.auto" ends up in ftrace instead
of "HID-SENSOR-INT-020b.39.auto". It causes this crash in sleepgraph:
File "/usr/bin/sleepgraph", line 5579, in executeSuspend
for line in fp:
File "/usr/lib/python3.10/codecs.py", line 322, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position
1568: invalid start byte
The issue is present in 6.3-rc1 and is described in full here:
https://bugzilla.kernel.org/show_bug.cgi?id=217169
A separate fix has been submitted to have this issue repaired, but
it has also exposed a larger bug in sleepgraph, since nothing should
make sleepgraph crash. Sleepgraph needs to be able to handle binary
data showing up in ftrace gracefully.
Modify the ftrace processing code to treat it as potentially binary
and to filter out binary data and leave just the ASCII.
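sleepgraph itself is a Python script, so the following is only an illustration of the filtering idea in C, not the actual change. keep_ascii() is a hypothetical helper that drops every byte that is not printable ASCII, a tab, or a newline.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: copy only printable ASCII, tab and newline from
 * src to dst. dst must have room for at least len bytes.
 * Returns the number of bytes written to dst.
 */
static size_t keep_ascii(const unsigned char *src, size_t len, char *dst)
{
	size_t i, n = 0;

	assert(src && dst);

	for (i = 0; i < len; i++) {
		unsigned char c = src[i];

		if ((c >= 0x20 && c <= 0x7e) || c == '\t' || c == '\n')
			dst[n++] = (char)c;
	}

	return n;
}
```

Applied to a line containing a stray 0xff byte, such a filter leaves only the ASCII part of the device name, so a later text decode cannot fail on it.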
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217169
Fixes: 98c062e82451 ("HID: hid-sensor-custom: Allow more custom iio sensors")
Signed-off-by: Todd Brandt <todd.e.brandt@linux.intel.com>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Jiri Pirko [Fri, 10 Mar 2023 09:58:57 +0000 (10:58 +0100)]
docs: vfio: fix header path
The text points to a different header file, fix by changing
the path to "uapi".
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Link: https://lore.kernel.org/r/20230310095857.985814-1-jiri@resnulli.us
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Xujun Leng [Sun, 12 Mar 2023 07:14:23 +0000 (15:14 +0800)]
docs: process: typo fix
In the second paragraph of section "Respond to review comments", there is
a spelling mistake: "aganst" should be "against".
Signed-off-by: Xujun Leng <lengxujun2007@126.com>
Link: https://lore.kernel.org/r/20230312071423.3042-1-lengxujun2007@126.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Kuninori Morimoto [Mon, 13 Mar 2023 00:50:28 +0000 (00:50 +0000)]
ALSA: hda/ca0132: fixup buffer overrun at tuning_ctl_set()
tuning_ctl_set() might overrun the buffer at (X) if the loop never
breaks on a match at (A).
static int tuning_ctl_set(...)
{
for (i = 0; i < TUNING_CTLS_COUNT; i++)
(A) if (nid == ca0132_tuning_ctls[i].nid)
break;
snd_hda_power_up(...);
(X) dspio_set_param(..., ca0132_tuning_ctls[i].mid, ...);
snd_hda_power_down(...); ^
return 1;
}
cppcheck reports the following error:
sound/pci/hda/patch_ca0132.c:4229:2: note: After for loop, i has value 12
for (i = 0; i < TUNING_CTLS_COUNT; i++)
^
sound/pci/hda/patch_ca0132.c:4234:43: note: Array index out of bounds
dspio_set_param(codec, ca0132_tuning_ctls[i].mid, 0x20,
^
This patch handles the non-matching case.
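One way to make the non-matching case explicit is sketched below. This is not the driver code: struct ctl and find_ctl() are hypothetical, and the point is only that the caller gets a distinct "not found" result instead of an index equal to the table size.

```c
#include <assert.h>
#include <stddef.h>

struct ctl {
	unsigned int nid;
	unsigned int mid;
};

/*
 * Hypothetical lookup: return the index of the entry whose nid matches,
 * or -1 when the loop finishes without a match, so that the caller never
 * indexes table[count].
 */
static int find_ctl(const struct ctl *table, size_t count, unsigned int nid)
{
	size_t i;

	assert(table);

	for (i = 0; i < count; i++)
		if (table[i].nid == nid)
			return (int)i;

	return -1;
}
```

A caller would check for a negative result and bail out before touching the table entry.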
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Link: https://lore.kernel.org/r/87sfe9eap7.wl-kuninori.morimoto.gx@renesas.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Kuninori Morimoto [Mon, 13 Mar 2023 00:49:24 +0000 (00:49 +0000)]
ALSA: asihpi: check pao in control_message()
control_message() might be called with pao = NULL.
control_message() is used as an example here.
(B) static void control_message(struct hpi_adapter_obj *pao, ...)
{ ^^^
struct hpi_hw_obj *phw = pao->priv;
... ^^^
}
(A) void _HPI_6205(struct hpi_adapter_obj *pao, ...)
{ ^^^
...
case HPI_OBJ_CONTROL:
(B) control_message(pao, phm, phr);
break; ^^^
...
}
void HPI_6205(...)
{
...
(A) _HPI_6205(NULL, phm, phr);
... ^^^^
}
As a result, cppcheck reports many warnings like the ones below:
sound/pci/asihpi/hpi6205.c:238:27: warning: Possible null pointer dereference: pao [nullPointer]
struct hpi_hw_obj *phw = pao->priv;
^
sound/pci/asihpi/hpi6205.c:433:13: note: Calling function '_HPI_6205', 1st argument 'NULL' value is 0
_HPI_6205(NULL, phm, phr);
^
sound/pci/asihpi/hpi6205.c:401:20: note: Calling function 'control_message', 1st argument 'pao' value is 0
control_message(pao, phm, phr);
^
Set phr->error, as many other functions do, and don't call _HPI_6205()
with NULL.
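The pattern can be sketched in isolation as follows. The types, the ERR_NO_ADAPTER value and both function names are hypothetical and much simpler than the HPI code; the sketch only shows an entry point that reports an error instead of forwarding a NULL adapter to a handler that dereferences it.

```c
#include <assert.h>
#include <stddef.h>

#define ERR_NO_ADAPTER 1	/* hypothetical error code */

struct adapter {
	int hw_state;
};

struct response {
	int error;
};

/* Handler that dereferences the adapter, so it must never see NULL. */
static void handle_control(struct adapter *pao, struct response *phr)
{
	assert(pao && phr);

	phr->error = 0;
	pao->hw_state++;
}

/* Entry point: report an error instead of forwarding a NULL adapter. */
static void dispatch(struct adapter *pao, struct response *phr)
{
	assert(phr);

	if (!pao) {
		phr->error = ERR_NO_ADAPTER;
		return;
	}

	handle_control(pao, phr);
}
```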
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Link: https://lore.kernel.org/r/87ttypeaqz.wl-kuninori.morimoto.gx@renesas.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Jan Kara [Mon, 13 Mar 2023 09:30:02 +0000 (10:30 +0100)]
block: do not reverse request order when flushing plug list
Commit 26fed4ac4eab ("block: flush plug based on hardware and software
queue order") changed flushing of plug list to submit requests one
device at a time. However while doing that it also started using
list_add_tail() instead of list_add() used previously thus effectively
submitting requests in reverse order. Also when forming a rq_list with
remaining requests (in case two or more devices are used), we
effectively reverse the ordering of the plug list for each device we
process. Submitting requests in reverse order has a negative impact on
performance for rotational disks (when BFQ is not in use). We observe a
10-25% regression in random 4k write throughput, as well as a ~20%
regression in the MariaDB OLTP benchmark on rotational storage on a btrfs
filesystem.
Fix the problem by preserving ordering of the plug list when inserting
requests into the queuelist as well as by appending to requeue_list
instead of prepending to it.
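Why prepending versus appending matters for ordering can be shown with a generic singly linked list. This is a self-contained illustration with hypothetical types, not the kernel's list or rq_list API.

```c
#include <assert.h>
#include <stddef.h>

struct req {
	int id;
	struct req *next;
};

struct req_list {
	struct req *head;
	struct req *tail;
};

/* Prepending: the last request added ends up first. */
static void req_list_add_head(struct req_list *l, struct req *r)
{
	assert(l && r);

	r->next = l->head;
	l->head = r;
	if (!l->tail)
		l->tail = r;
}

/* Appending: requests stay in the order in which they were added. */
static void req_list_add_tail(struct req_list *l, struct req *r)
{
	assert(l && r);

	r->next = NULL;
	if (l->tail)
		l->tail->next = r;
	else
		l->head = r;
	l->tail = r;
}
```

Adding requests 1, 2, 3 with req_list_add_head() yields 3, 2, 1 when the list is walked from the head, while req_list_add_tail() yields 1, 2, 3.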
Fixes: 26fed4ac4eab ("block: flush plug based on hardware and software queue order")
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20230313093002.11756-1-jack@suse.cz
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Guilherme G. Piccoli [Sun, 12 Mar 2023 16:51:00 +0000 (13:51 -0300)]
drm/amdgpu/vcn: Disable indirect SRAM on Vangogh broken BIOSes
The VCN firmware loading path enables the indirect SRAM mode if it's
advertised as supported. However, some FW issues can prevent this mode
from working properly, ending up in a failed probe. An example, observed
on the Steam Deck, is shown below:
[...]
[drm] failed to load ucode VCN0_RAM(0x3A)
[drm] psp gfx command LOAD_IP_FW(0x6) failed and response status is (0xFFFF0000)
amdgpu 0000:04:00.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring vcn_dec_0 test failed (-110)
[drm:amdgpu_device_init.cold [amdgpu]] *ERROR* hw_init of IP block <vcn_v3_0> failed -110
amdgpu 0000:04:00.0: amdgpu: amdgpu_device_ip_init failed
amdgpu 0000:04:00.0: amdgpu: Fatal error during GPU init
[...]
Disabling the VCN block circumvents this, but it's a very invasive
workaround that turns off the entire feature. So, let's add a quirk
on VCN loading that checks for known problematic BIOSes on Vangogh,
so we can proactively disable the indirect SRAM mode and allow the
HW to probe properly and the VCN IP block to work fine.
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2385
Fixes: 82132ecc5432 ("drm/amdgpu: enable Vangogh VCN indirect sram mode")
Cc: stable@vger.kernel.org
Cc: James Zhu <James.Zhu@amd.com>
Cc: Leo Liu <leo.liu@amd.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Thu, 9 Mar 2023 03:45:59 +0000 (22:45 -0500)]
drm/amdgpu/nv: fix codec array for SR_IOV
Copy paste error.
Fixes: 384334120b66 ("drm/amdgpu/nv: don't expose AV1 if VCN0 is harvested")
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=4454
Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Acked-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Benjamin Cheng [Mon, 13 Mar 2023 00:47:39 +0000 (20:47 -0400)]
drm/amd/display: Write to correct dirty_rect
When FB_DAMAGE_CLIPS are provided in a non-MPO scenario, the loop does
not use the counter i. This causes fill_dc_dirty_rect() to always
fill dirty_rects[0], causing graphical artifacts when a damage clip
aware DRM client sends more than 1 damage clip.
Instead, use the flip_addrs->dirty_rect_count which is incremented by
fill_dc_dirty_rect() on a successful fill.
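The indexing scheme can be sketched like this. The struct layout, MAX_RECTS and fill_next_rect() are hypothetical simplifications, not the amdgpu display code; the sketch only shows a fill helper that writes to the slot selected by the running count and increments that count on success.

```c
#include <assert.h>

#define MAX_RECTS 4	/* hypothetical capacity */

struct rect {
	int x, y, w, h;
};

struct flip {
	struct rect dirty_rects[MAX_RECTS];
	int dirty_rect_count;
};

/*
 * Fill the next free slot and bump the count.
 * Returns 0 on success, -1 when the array is already full.
 */
static int fill_next_rect(struct flip *f, const struct rect *clip)
{
	assert(f && clip);

	if (f->dirty_rect_count >= MAX_RECTS)
		return -1;

	f->dirty_rects[f->dirty_rect_count++] = *clip;
	return 0;
}
```

Because the count advances with every successful fill, a second damage clip lands in dirty_rects[1] rather than overwriting dirty_rects[0].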
Fixes: 30ebe41582d1 ("drm/amd/display: add FB_DAMAGE_CLIPS support")
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2453
Signed-off-by: Benjamin Cheng <ben@bcheng.me>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.1.x
Guchun Chen [Thu, 9 Mar 2023 02:02:45 +0000 (10:02 +0800)]
drm/amdgpu: move poll enabled/disable into non DC path
Some AMD ASICs with reliable hotplug support don't call
drm_kms_helper_poll_init in the driver init sequence. However,
the suspend/resume path is unified for all ASICs, and because
output_poll_work->func is not set for these ASICs, a warning
is triggered when suspending.
[ 90.656049] <TASK>
[ 90.656050] ? console_unlock+0x4d/0x100
[ 90.656053] ? __irq_work_queue_local+0x27/0x60
[ 90.656056] ? irq_work_queue+0x2b/0x50
[ 90.656057] ? __wake_up_klogd+0x40/0x60
[ 90.656059] __cancel_work_timer+0xed/0x180
[ 90.656061] drm_kms_helper_poll_disable.cold+0x1f/0x2c [drm_kms_helper]
[ 90.656072] amdgpu_device_suspend+0x81/0x170 [amdgpu]
[ 90.656180] amdgpu_pmops_runtime_suspend+0xb5/0x1b0 [amdgpu]
[ 90.656269] pci_pm_runtime_suspend+0x61/0x1b0
drm_kms_helper_poll_enable/disable is only valid when poll_init has been
called in amdgpu code, which only happens in the non-DC path. So move these
calls into the non-DC path code to get rid of such warnings.
v1: introduce use_kms_poll flag in amdgpu as the poll stuff check
v2: use dc_enabled as the flag to simplify code
v3: move code into non DC path instead of relying on any flag
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2411
Fixes: a4e771729a51 ("drm/probe_helper: sort out poll_running vs poll_enabled")
Reported-by: Bert Karwatzki <spasswolf@web.de>
Suggested-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Bhawanpreet Lakha [Fri, 17 Feb 2023 21:08:21 +0000 (16:08 -0500)]
drm/amd/display: Fix HDCP failing to enable after suspend
[Why]
On resume some displays are not ready for HDCP, so they will fail if we
start the HDCP authentication too soon.
Add a delay so that the displays can be ready before we start.
NOTE: Previously this delay was set to 3 seconds, but it was causing
issues with compliance. 2 seconds should be enough for compliance and the
S3 resume case.
[How]
Change the delay to 2 seconds.
Reviewed-by: Aurabindo Pillai <Aurabindo.Pillai@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Chia-I Wu [Wed, 8 Mar 2023 21:37:24 +0000 (13:37 -0800)]
drm/amdkfd: fix potential kgd_mem UAFs
kgd_mem pointers returned by kfd_process_device_translate_handle are
only guaranteed to be valid while p->mutex is held. As soon as the mutex
is unlocked, another thread can free the BO.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Jane Jian [Tue, 28 Feb 2023 10:48:41 +0000 (18:48 +0800)]
drm/amdgpu/vcn: custom video info caps for sriov
For SR-IOV, add a new flag to indicate AV1 support;
this overrides the original caps info.
Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Błażej Szczygieł [Sat, 4 Mar 2023 23:44:31 +0000 (00:44 +0100)]
drm/amd/pm: Fix sienna cichlid incorrect OD voltage after resume
Always setup overdrive tables after resume. Preserve only some
user-defined settings in user_overdrive_table if they're set.
Copy restored user_overdrive_table into od_table to get correct
values.
On cold boot, BTC is triggered and GfxVfCurve is calibrated, giving
VfCurve settings (a). On resume, BTC is triggered again and
GfxVfCurve is recalibrated. The resulting VfCurve settings (b)
may differ from those of the cold boot. So if we reuse the
cold-boot VfCurve settings (a) after a suspend, we can
run into discrepancies.
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1897
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2276
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Błażej Szczygieł <mumei6102@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org