Linus Torvalds [Tue, 5 Sep 2023 19:37:28 +0000 (12:37 -0700)]
Merge tag 'ata-6.6-rc1' of git://git./linux/kernel/git/dlemoal/libata
Pull ata updates from Damien Le Moal:
- Fix OF include file for ata platform drivers (Rob)
- Simplify various ahci, sata and pata platform drivers using the
function devm_platform_ioremap_resource() (Yangtao)
- Cleanup libata time related argument types (e.g. timeouts values)
(Sergey)
- Cleanup libata code around error handling as all ata drivers now
define a error_handler operation (Hannes and Niklas)
- Remove functions intended for libsas that are in fact unused (Niklas)
- Change the remove device callback of platform drivers to a null
function (Uwe)
- Simplify the pata_imx driver using devm_clk_get_enabled() (Li)
- Remove old and uinused remnants of the ide code in arm, parisc,
powerpc, sparc and m68k architectures and associated drivers
(pata_buddha, pata_falcon and pata_gayle) (Geert)
- Add missing MODULE_DESCRIPTION() in the sata_gemini and pata_ftide010
drivers (me)
- Several fixes for the pata_ep93xx and pata_falcon drivers (Nikita,
Michael)
- Add Elkhart Lake AHCI controller support to the ahci driver (Werner)
- Disable NCQ trim on Micron 1100 drives (Pawel)
* tag 'ata-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata: (60 commits)
ata: libata-core: Disable NCQ_TRIM on Micron 1100 drives
ata: ahci: Add Elkhart Lake AHCI controller
ata: pata_falcon: add data_swab option to byte-swap disk data
ata: pata_falcon: fix IO base selection for Q40
ata: pata_ep93xx: use soc_device_match for UDMA modes
ata: pata_ep93xx: fix error return code in probe
ata: sata_gemini: Add missing MODULE_DESCRIPTION
ata: pata_ftide010: Add missing MODULE_DESCRIPTION
m68k: Remove <asm/ide.h>
ata: pata_gayle: Remove #include <asm/ide.h>
ata: pata_falcon: Remove #include <asm/ide.h>
ata: pata_buddha: Remove #include <asm/ide.h>
asm-generic: Remove ide_iops.h
sparc: Remove <asm/ide.h>
powerpc: Remove <asm/ide.h>
parisc: Remove <asm/ide.h>
ARM: Remove <asm/ide.h>
ata: pata_imx: Use helper function devm_clk_get_enabled()
ata: sata_rcar: Convert to platform remove callback returning void
ata: sata_mv: Convert to platform remove callback returning void
...
Linus Torvalds [Tue, 5 Sep 2023 19:31:07 +0000 (12:31 -0700)]
Merge tag 'mailbox-v6.6' of git://git.linaro.org/landing-teams/working/fujitsu/integration
Pull mailbox updates from Jassi Brar:
- qcom: fix incorrect num_chans counting
- mhu: Remove redundant dev_err
- bcm: fix comments
- common changes:
- convert to use devm_platform_get_and_ioremap_resource
- correct DT includes
* tag 'mailbox-v6.6' of git://git.linaro.org/landing-teams/working/fujitsu/integration:
mailbox: qcom-ipcc: fix incorrect num_chans counting
mailbox: Explicitly include correct DT includes
mailbox: ti-msgmgr: Use devm_platform_ioremap_resource_byname()
mailbox: platform-mhu: Remove redundant dev_err()
mailbox: bcm-pdc: Fix some kernel-doc comments
mailbox: mailbox-test: Fix an error check in mbox_test_probe()
mailbox: tegra-hsp: Convert to devm_platform_ioremap_resource()
mailbox: rockchip: Use devm_platform_get_and_ioremap_resource()
mailbox: mailbox-test: Use devm_platform_get_and_ioremap_resource()
mailbox: bcm-pdc: Use devm_platform_get_and_ioremap_resource()
mailbox: bcm-ferxrm-mailbox: Use devm_platform_get_and_ioremap_resource()
Linus Torvalds [Tue, 5 Sep 2023 19:22:39 +0000 (12:22 -0700)]
Merge tag 'mm-hotfixes-stable-2023-09-05-11-51' of git://git./linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"Seven hotfixes. Four are cc:stable and the remainder pertain to issues
which were introduced in the current merge window"
* tag 'mm-hotfixes-stable-2023-09-05-11-51' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
sparc64: add missing initialization of folio in tlb_batch_add()
mm: memory-failure: use rcu lock instead of tasklist_lock when collect_procs()
revert "memfd: improve userspace warnings for missing exec-related flags".
rcu: dump vmalloc memory info safely
mm/vmalloc: add a safer version of find_vm_area() for debug
tools/mm: fix undefined reference to pthread_once
memcontrol: ensure memcg acquired by id is properly set up
Linus Torvalds [Tue, 5 Sep 2023 18:15:59 +0000 (11:15 -0700)]
Merge tag 'tpmdd-v6.6-rc1' of git://git./linux/kernel/git/jarkko/linux-tpmdd
Pull more tpm updates from Jarkko Sakkinen:
"Two more bug fixes for tpm_crb, categorically disabling rng for AMD
CPU's in the tpm_crb driver, discarding the earlier probing approach"
* tag 'tpmdd-v6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
tpm: Enable hwrng only for Pluton on AMD CPUs
tpm_crb: Fix an error handling path in crb_acpi_add()
Mike Rapoport (IBM) [Mon, 4 Sep 2023 17:37:59 +0000 (20:37 +0300)]
sparc64: add missing initialization of folio in tlb_batch_add()
Commit
1a10a44dfc1d ("sparc64: implement the new page table range API")
missed initialization of folio variable in tlb_batch_add() which causes
boot tests to crash.
Add missing initialization.
Link: https://lkml.kernel.org/r/20230904174350.GF3223@kernel.org
Fixes:
1a10a44dfc1d ("sparc64: implement the new page table range API")
Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tong Tiangen [Mon, 28 Aug 2023 02:25:27 +0000 (10:25 +0800)]
mm: memory-failure: use rcu lock instead of tasklist_lock when collect_procs()
We found a softlock issue in our test, analyzed the logs, and found that
the relevant CPU call trace as follows:
CPU0:
_do_fork
-> copy_process()
-> write_lock_irq(&tasklist_lock) //Disable irq,waiting for
//tasklist_lock
CPU1:
wp_page_copy()
->pte_offset_map_lock()
-> spin_lock(&page->ptl); //Hold page->ptl
-> ptep_clear_flush()
-> flush_tlb_others() ...
-> smp_call_function_many()
-> arch_send_call_function_ipi_mask()
-> csd_lock_wait() //Waiting for other CPUs respond
//IPI
CPU2:
collect_procs_anon()
-> read_lock(&tasklist_lock) //Hold tasklist_lock
->for_each_process(tsk)
-> page_mapped_in_vma()
-> page_vma_mapped_walk()
-> map_pte()
->spin_lock(&page->ptl) //Waiting for page->ptl
We can see that CPU1 waiting for CPU0 respond IPI,CPU0 waiting for CPU2
unlock tasklist_lock, CPU2 waiting for CPU1 unlock page->ptl. As a result,
softlockup is triggered.
For collect_procs_anon(), what we're doing is task list iteration, during
the iteration, with the help of call_rcu(), the task_struct object is freed
only after one or more grace periods elapse. the logic as follows:
release_task()
-> __exit_signal()
-> __unhash_process()
-> list_del_rcu()
-> put_task_struct_rcu_user()
-> call_rcu(&task->rcu, delayed_put_task_struct)
delayed_put_task_struct()
-> put_task_struct()
-> if (refcount_sub_and_test())
__put_task_struct()
-> free_task()
Therefore, under the protection of the rcu lock, we can safely use
get_task_struct() to ensure a safe reference to task_struct during the
iteration.
By removing the use of tasklist_lock in task list iteration, we can break
the softlock chain above.
The same logic can also be applied to:
- collect_procs_file()
- collect_procs_fsdax()
- collect_procs_ksm()
Link: https://lkml.kernel.org/r/20230828022527.241693-1-tongtiangen@huawei.com
Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Andrew Morton [Sat, 2 Sep 2023 22:59:31 +0000 (15:59 -0700)]
revert "memfd: improve userspace warnings for missing exec-related flags".
This warning is telling userspace developers to pass MFD_EXEC and
MFD_NOEXEC_SEAL to memfd_create(). Commit
434ed3350f57 ("memfd: improve
userspace warnings for missing exec-related flags") made the warning more
frequent and visible in the hope that this would accelerate the fixing of
errant userspace.
But the overall effect is to generate far too much dmesg noise.
Fixes:
434ed3350f57 ("memfd: improve userspace warnings for missing exec-related flags")
Reported-by: Damian Tometzki <dtometzki@fedoraproject.org>
Closes: https://lkml.kernel.org/r/ZPFzCSIgZ4QuHsSC@fedora.fritz.box
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Daniel Verkamp <dverkamp@chromium.org>
Cc: Jeff Xu <jeffxu@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Linus Torvalds [Tue, 5 Sep 2023 18:01:47 +0000 (11:01 -0700)]
Merge tag 'kbuild-v6.6' of git://git./linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
- Enable -Wenum-conversion warning option
- Refactor the rpm-pkg target
- Fix scripts/setlocalversion to consider annotated tags for rt-kernel
- Add a jump key feature for the search menu of 'make nconfig'
- Support Qt6 for 'make xconfig'
- Enable -Wformat-overflow, -Wformat-truncation, -Wstringop-overflow,
and -Wrestrict warnings for W=1 builds
- Replace <asm/export.h> with <linux/export.h> for alpha, ia64, and
sparc
- Support DEB_BUILD_OPTIONS=parallel=N for the debian source package
- Refactor scripts/Makefile.modinst and fix some modules_sign issues
- Add a new Kconfig env variable to warn symbols that are not defined
anywhere
- Show help messages of config fragments in 'make help'
* tag 'kbuild-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (62 commits)
kconfig: fix possible buffer overflow
kbuild: Show marked Kconfig fragments in "help"
kconfig: add warn-unknown-symbols sanity check
kbuild: dummy-tools: make MPROFILE_KERNEL checks work on BE
Documentation/llvm: refresh docs
modpost: Skip .llvm.call-graph-profile section check
kbuild: support modules_sign for external modules as well
kbuild: support 'make modules_sign' with CONFIG_MODULE_SIG_ALL=n
kbuild: move more module installation code to scripts/Makefile.modinst
kbuild: reduce the number of mkdir calls during modules_install
kbuild: remove $(MODLIB)/source symlink
kbuild: move depmod rule to scripts/Makefile.modinst
kbuild: add modules_sign to no-{compiler,sync-config}-targets
kbuild: do not run depmod for 'make modules_sign'
kbuild: deb-pkg: support DEB_BUILD_OPTIONS=parallel=N in debian/rules
alpha: remove <asm/export.h>
alpha: replace #include <asm/export.h> with #include <linux/export.h>
ia64: remove <asm/export.h>
ia64: replace #include <asm/export.h> with #include <linux/export.h>
sparc: remove <asm/export.h>
...
Linus Torvalds [Tue, 5 Sep 2023 17:56:27 +0000 (10:56 -0700)]
Merge tag 'mm-stable-2023-09-04-14-00' of git://git./linux/kernel/git/akpm/mm
Pull more MM updates from Andrew Morton:
- Stefan Roesch has added ksm statistics to /proc/pid/smaps
- Also a number of singleton patches, mainly cleanups and leftovers
* tag 'mm-stable-2023-09-04-14-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
mm/kmemleak: move up cond_resched() call in page scanning loop
mm: page_alloc: remove stale CMA guard code
MAINTAINERS: add rmap.h to mm entry
rmap: remove anon_vma_link() nommu stub
proc/ksm: add ksm stats to /proc/pid/smaps
mm/hwpoison: rename hwp_walk* to hwpoison_walk*
mm: memory-failure: add PageOffline() check
Linus Torvalds [Tue, 5 Sep 2023 17:15:22 +0000 (10:15 -0700)]
Merge tag 'microblaze-v6.6' of git://git.monstr.eu/linux-2.6-microblaze
Pull microblaze updates from Michal Simek:
- Cleanup DT headers
- Remove unused zalloc_maybe_bootmem()
- Make virt_to_pfn() a static inline
* tag 'microblaze-v6.6' of git://git.monstr.eu/linux-2.6-microblaze:
microblaze: Make virt_to_pfn() a static inline
microblaze: Remove zalloc_maybe_bootmem()
microblaze: Explicitly include correct DT includes
Zqiang [Mon, 4 Sep 2023 18:08:05 +0000 (18:08 +0000)]
rcu: dump vmalloc memory info safely
Currently, for double invoke call_rcu(), will dump rcu_head objects memory
info, if the objects is not allocated from the slab allocator, the
vmalloc_dump_obj() will be invoke and the vmap_area_lock spinlock need to
be held, since the call_rcu() can be invoked in interrupt context,
therefore, there is a possibility of spinlock deadlock scenarios.
And in Preempt-RT kernel, the rcutorture test also trigger the following
lockdep warning:
BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 1, name: swapper/0
preempt_count: 1, expected: 0
RCU nest depth: 1, expected: 1
3 locks held by swapper/0/1:
#0:
ffffffffb534ee80 (fullstop_mutex){+.+.}-{4:4}, at: torture_init_begin+0x24/0xa0
#1:
ffffffffb5307940 (rcu_read_lock){....}-{1:3}, at: rcu_torture_init+0x1ec7/0x2370
#2:
ffffffffb536af40 (vmap_area_lock){+.+.}-{3:3}, at: find_vmap_area+0x1f/0x70
irq event stamp: 565512
hardirqs last enabled at (565511): [<
ffffffffb379b138>] __call_rcu_common+0x218/0x940
hardirqs last disabled at (565512): [<
ffffffffb5804262>] rcu_torture_init+0x20b2/0x2370
softirqs last enabled at (399112): [<
ffffffffb36b2586>] __local_bh_enable_ip+0x126/0x170
softirqs last disabled at (399106): [<
ffffffffb43fef59>] inet_register_protosw+0x9/0x1d0
Preemption disabled at:
[<
ffffffffb58040c3>] rcu_torture_init+0x1f13/0x2370
CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 6.5.0-rc4-rt2-yocto-preempt-rt+ #15
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x68/0xb0
dump_stack+0x14/0x20
__might_resched+0x1aa/0x280
? __pfx_rcu_torture_err_cb+0x10/0x10
rt_spin_lock+0x53/0x130
? find_vmap_area+0x1f/0x70
find_vmap_area+0x1f/0x70
vmalloc_dump_obj+0x20/0x60
mem_dump_obj+0x22/0x90
__call_rcu_common+0x5bf/0x940
? debug_smp_processor_id+0x1b/0x30
call_rcu_hurry+0x14/0x20
rcu_torture_init+0x1f82/0x2370
? __pfx_rcu_torture_leak_cb+0x10/0x10
? __pfx_rcu_torture_leak_cb+0x10/0x10
? __pfx_rcu_torture_init+0x10/0x10
do_one_initcall+0x6c/0x300
? debug_smp_processor_id+0x1b/0x30
kernel_init_freeable+0x2b9/0x540
? __pfx_kernel_init+0x10/0x10
kernel_init+0x1f/0x150
ret_from_fork+0x40/0x50
? __pfx_kernel_init+0x10/0x10
ret_from_fork_asm+0x1b/0x30
</TASK>
The previous patch fixes this by using the deadlock-safe best-effort
version of find_vm_area. However, in case of failure print the fact that
the pointer was a vmalloc pointer so that we print at least something.
Link: https://lkml.kernel.org/r/20230904180806.1002832-2-joel@joelfernandes.org
Fixes:
98f180837a89 ("mm: Make mem_dump_obj() handle vmalloc() memory")
Signed-off-by: Zqiang <qiang.zhang1211@gmail.com>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Reported-by: Zhen Lei <thunder.leizhen@huaweicloud.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Joel Fernandes (Google) [Mon, 4 Sep 2023 18:08:04 +0000 (18:08 +0000)]
mm/vmalloc: add a safer version of find_vm_area() for debug
It is unsafe to dump vmalloc area information when trying to do so from
some contexts. Add a safer trylock version of the same function to do a
best-effort VMA finding and use it from vmalloc_dump_obj().
[applied test robot feedback on unused function fix.]
[applied Uladzislau feedback on locking.]
Link: https://lkml.kernel.org/r/20230904180806.1002832-1-joel@joelfernandes.org
Fixes:
98f180837a89 ("mm: Make mem_dump_obj() handle vmalloc() memory")
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Reported-by: Zhen Lei <thunder.leizhen@huaweicloud.com>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Zqiang <qiang.zhang1211@gmail.com>
Cc: <stable@vger.kernel.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Xie XiuQi [Thu, 31 Aug 2023 03:42:05 +0000 (11:42 +0800)]
tools/mm: fix undefined reference to pthread_once
Commit
97d5f2e9ee12 ("tools api fs: More thread safety for global
filesystem variables") introduces pthread_once, so the libpthread
should be added at link time, or we'll meet the following compile
error when 'make -C tools/mm':
gcc -Wall -Wextra -I../lib/ -o page-types page-types.c ../lib/api/libapi.a
~/linux/tools/lib/api/fs/fs.c:146: undefined reference to `pthread_once'
~/linux/tools/lib/api/fs/fs.c:147: undefined reference to `pthread_once'
~/linux/tools/lib/api/fs/fs.c:148: undefined reference to `pthread_once'
~/linux/tools/lib/api/fs/fs.c:149: undefined reference to `pthread_once'
~/linux/tools/lib/api/fs/fs.c:150: undefined reference to `pthread_once'
/usr/bin/ld: ../lib/api/libapi.a(libapi-in.o):~/linux/tools/lib/api/fs/fs.c:151:
more undefined references to `pthread_once' follow
collect2: error: ld returned 1 exit status
make: *** [Makefile:22: page-types] Error 1
Link: https://lkml.kernel.org/r/20230831034205.2376653-1-xiexiuqi@huaweicloud.com
Fixes:
97d5f2e9ee12 ("tools api fs: More thread safety for global filesystem variables")
Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Johannes Weiner [Wed, 23 Aug 2023 22:54:30 +0000 (15:54 -0700)]
memcontrol: ensure memcg acquired by id is properly set up
In the eviction recency check, we attempt to retrieve the memcg to which
the folio belonged when it was evicted, by the memcg id stored in the
shadow entry. However, there is a chance that the retrieved memcg is not
the original memcg that has been killed, but a new one which happens to
have the same id.
This is a somewhat unfortunate, but acceptable and rare inaccuracy in the
heuristics. However, if we retrieve this new memcg between its allocation
and when it is properly attached to the memcg hierarchy, we could run into
the following NULL pointer exception during the memcg hierarchy traversal
done in mem_cgroup_get_nr_swap_pages():
[ 155757.793456] BUG: kernel NULL pointer dereference, address:
00000000000000c0
[ 155757.807568] #PF: supervisor read access in kernel mode
[ 155757.818024] #PF: error_code(0x0000) - not-present page
[ 155757.828482] PGD
401f77067 P4D
401f77067 PUD
401f76067 PMD 0
[ 155757.839985] Oops: 0000 [#1] SMP
[ 155757.887870] RIP: 0010:mem_cgroup_get_nr_swap_pages+0x3d/0xb0
[ 155757.899377] Code: 29 19 4a 02 48 39 f9 74 63 48 8b 97 c0 00 00 00 48 8b b7 58 02 00 00 48 2b b7 c0 01 00 00 48 39 f0 48 0f 4d c6 48 39 d1 74 42 <48> 8b b2 c0 00 00 00 48 8b ba 58 02 00 00 48 2b ba c0 01 00 00 48
[ 155757.937125] RSP: 0018:
ffffc9002ecdfbc8 EFLAGS:
00010286
[ 155757.947755] RAX:
00000000003a3b1c RBX:
000007ffffffffff RCX:
ffff888280183000
[ 155757.962202] RDX:
0000000000000000 RSI:
0007ffffffffffff RDI:
ffff888bbc2d1000
[ 155757.976648] RBP:
0000000000000001 R08:
000000000000000b R09:
ffff888ad9cedba0
[ 155757.991094] R10:
ffffea0039c07900 R11:
0000000000000010 R12:
ffff888b23a7b000
[ 155758.005540] R13:
0000000000000000 R14:
ffff888bbc2d1000 R15:
000007ffffc71354
[ 155758.019991] FS:
00007f6234c68640(0000) GS:
ffff88903f9c0000(0000) knlGS:
0000000000000000
[ 155758.036356] CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
[ 155758.048023] CR2:
00000000000000c0 CR3:
0000000a83eb8004 CR4:
00000000007706e0
[ 155758.062473] DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
[ 155758.076924] DR3:
0000000000000000 DR6:
00000000fffe0ff0 DR7:
0000000000000400
[ 155758.091376] PKRU:
55555554
[ 155758.096957] Call Trace:
[ 155758.102016] <TASK>
[ 155758.106502] ? __die+0x78/0xc0
[ 155758.112793] ? page_fault_oops+0x286/0x380
[ 155758.121175] ? exc_page_fault+0x5d/0x110
[ 155758.129209] ? asm_exc_page_fault+0x22/0x30
[ 155758.137763] ? mem_cgroup_get_nr_swap_pages+0x3d/0xb0
[ 155758.148060] workingset_test_recent+0xda/0x1b0
[ 155758.157133] workingset_refault+0xca/0x1e0
[ 155758.165508] filemap_add_folio+0x4d/0x70
[ 155758.173538] page_cache_ra_unbounded+0xed/0x190
[ 155758.182919] page_cache_sync_ra+0xd6/0x1e0
[ 155758.191738] filemap_read+0x68d/0xdf0
[ 155758.199495] ? mlx5e_napi_poll+0x123/0x940
[ 155758.207981] ? __napi_schedule+0x55/0x90
[ 155758.216095] __x64_sys_pread64+0x1d6/0x2c0
[ 155758.224601] do_syscall_64+0x3d/0x80
[ 155758.232058] entry_SYSCALL_64_after_hwframe+0x46/0xb0
[ 155758.242473] RIP: 0033:0x7f62c29153b5
[ 155758.249938] Code: e8 48 89 75 f0 89 7d f8 48 89 4d e0 e8 b4 e6 f7 ff 41 89 c0 4c 8b 55 e0 48 8b 55 e8 48 8b 75 f0 8b 7d f8 b8 11 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 33 44 89 c7 48 89 45 f8 e8 e7 e6 f7 ff 48 8b
[ 155758.288005] RSP: 002b:
00007f6234c5ffd0 EFLAGS:
00000293 ORIG_RAX:
0000000000000011
[ 155758.303474] RAX:
ffffffffffffffda RBX:
00007f628c4e70c0 RCX:
00007f62c29153b5
[ 155758.318075] RDX:
000000000003c041 RSI:
00007f61d2986000 RDI:
0000000000000076
[ 155758.332678] RBP:
00007f6234c5fff0 R08:
0000000000000000 R09:
0000000064d5230c
[ 155758.347452] R10:
000000000027d450 R11:
0000000000000293 R12:
000000000003c041
[ 155758.362044] R13:
00007f61d2986000 R14:
00007f629e11b060 R15:
000000000027d450
[ 155758.376661] </TASK>
This patch fixes the issue by moving the memcg's id publication from the
alloc stage to online stage, ensuring that any memcg acquired via id must
be connected to the memcg tree.
Link: https://lkml.kernel.org/r/20230823225430.166925-1-nphamcs@gmail.com
Fixes:
f78dfc7b77d5 ("workingset: fix confusion around eviction vs refault container")
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Co-developed-by: Nhat Pham <nphamcs@gmail.com>
Signed-off-by: Nhat Pham <nphamcs@gmail.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
Cc: Yosry Ahmed <yosryahmed@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Linus Torvalds [Tue, 5 Sep 2023 17:09:31 +0000 (10:09 -0700)]
Merge tag 'for-linus' of https://github.com/openrisc/linux
Pull OpenRISC updates from Stafford Horne:
- Fixes from me to cleanup all compiler warnings reported under
arch/openrisc
- One cleanup from Linus Walleij to convert pfn macros to static
inlines
* tag 'for-linus' of https://github.com/openrisc/linux:
openrisc: Remove kernel-doc marker from ioremap comment
openrisc: Remove unused tlb_init function
openriac: Remove unused nommu_dump_state function
openrisc: Include cpu.h and switch_to.h for prototypes
openrisc: Add prototype for die to bug.h
openrisc: Add prototype for show_registers to processor.h
openrisc: Declare do_signal function as static
openrisc: Add missing prototypes for assembly called fnctions
openrisc: Make pfn accessors statics inlines
Konstantin Meskhidze [Tue, 5 Sep 2023 09:59:14 +0000 (17:59 +0800)]
kconfig: fix possible buffer overflow
Buffer 'new_argv' is accessed without bound check after accessing with
bound check via 'new_argc' index.
Fixes:
e298f3b49def ("kconfig: add built-in function support")
Co-developed-by: Ivanov Mikhail <ivanov.mikhail1@huawei-partners.com>
Signed-off-by: Konstantin Meskhidze <konstantin.meskhidze@huawei.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Jonathan Marek [Wed, 2 Aug 2023 13:52:22 +0000 (09:52 -0400)]
mailbox: qcom-ipcc: fix incorrect num_chans counting
Breaking out early when a match is found leads to an incorrect num_chans
value when more than one ipcc mailbox channel is used by the same device.
Fixes:
e9d50e4b4d04 ("mailbox: qcom-ipcc: Dynamic alloc for channel arrangement")
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
Rob Herring [Fri, 14 Jul 2023 17:47:01 +0000 (11:47 -0600)]
mailbox: Explicitly include correct DT includes
The DT of_device.h and of_platform.h date back to the separate
of_platform_bus_type before it as merged into the regular platform bus.
As part of that merge prepping Arm DT support 13 years ago, they
"temporarily" include each other. They also include platform_device.h
and of.h. As a result, there's a pretty much random mix of those include
files used throughout the tree. In order to detangle these headers and
replace the implicit includes with struct declarations, users need to
explicitly include the correct includes.
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
Li Zetao [Tue, 1 Aug 2023 08:51:07 +0000 (16:51 +0800)]
mailbox: ti-msgmgr: Use devm_platform_ioremap_resource_byname()
Convert platform_get_resource_byname() + devm_ioremap_resource() to a
single call to devm_platform_ioremap_resource_byname(), as this is
exactly what this function does.
Signed-off-by: Li Zetao <lizetao1@huawei.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
Ruan Jinjie [Thu, 27 Jul 2023 10:41:37 +0000 (10:41 +0000)]
mailbox: platform-mhu: Remove redundant dev_err()
There is no need to call the dev_err() function directly to print a custom
message when handling an error from platform_get_irq() function as
it is going to display an appropriate error message in case of a failure.
Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
Yang Li [Fri, 11 Aug 2023 01:34:48 +0000 (09:34 +0800)]
mailbox: bcm-pdc: Fix some kernel-doc comments
Fix some kernel-doc comments to silence the warnings:
drivers/mailbox/bcm-pdc-mailbox.c:707: warning: Function parameter or member 'pdcs' not described in 'pdc_tx_list_sg_add'
drivers/mailbox/bcm-pdc-mailbox.c:707: warning: Excess function parameter 'spu_idx' description in 'pdc_tx_list_sg_add'
drivers/mailbox/bcm-pdc-mailbox.c:875: warning: Function parameter or member 'pdcs' not described in 'pdc_rx_list_sg_add'
drivers/mailbox/bcm-pdc-mailbox.c:875: warning: Excess function parameter 'spu_idx' description in 'pdc_rx_list_sg_add'
drivers/mailbox/bcm-pdc-mailbox.c:966: warning: Function parameter or member 't' not described in 'pdc_tasklet_cb'
drivers/mailbox/bcm-pdc-mailbox.c:966: warning: Excess function parameter 'data' description in 'pdc_tasklet_cb'
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
Minjie Du [Thu, 13 Jul 2023 10:18:08 +0000 (18:18 +0800)]
mailbox: mailbox-test: Fix an error check in mbox_test_probe()
mbox_test_request_channel() function returns NULL or
error value embedded in the pointer (PTR_ERR).
Evaluate the return value using IS_ERR_OR_NULL.
Signed-off-by: Minjie Du <duminjie@vivo.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
Yangtao Li [Tue, 4 Jul 2023 13:37:26 +0000 (21:37 +0800)]
mailbox: tegra-hsp: Convert to devm_platform_ioremap_resource()
Use devm_platform_ioremap_resource() to simplify code.
Signed-off-by: Yangtao Li <frank.li@vivo.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
Yangtao Li [Tue, 4 Jul 2023 13:37:25 +0000 (21:37 +0800)]
mailbox: rockchip: Use devm_platform_get_and_ioremap_resource()
Convert platform_get_resource(), devm_ioremap_resource() to a single
call to devm_platform_get_and_ioremap_resource(), as this is exactly
what this function does.
Signed-off-by: Yangtao Li <frank.li@vivo.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
Yangtao Li [Tue, 4 Jul 2023 13:37:24 +0000 (21:37 +0800)]
mailbox: mailbox-test: Use devm_platform_get_and_ioremap_resource()
Convert platform_get_resource(), devm_ioremap_resource() to a single
call to devm_platform_get_and_ioremap_resource(), as this is exactly
what this function does.
Signed-off-by: Yangtao Li <frank.li@vivo.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
Yangtao Li [Tue, 4 Jul 2023 13:37:23 +0000 (21:37 +0800)]
mailbox: bcm-pdc: Use devm_platform_get_and_ioremap_resource()
Convert platform_get_resource(), devm_ioremap_resource() to a single
call to devm_platform_get_and_ioremap_resource(), as this is exactly
what this function does.
Signed-off-by: Yangtao Li <frank.li@vivo.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
Yangtao Li [Tue, 4 Jul 2023 13:37:22 +0000 (21:37 +0800)]
mailbox: bcm-ferxrm-mailbox: Use devm_platform_get_and_ioremap_resource()
Convert platform_get_resource(), devm_ioremap_resource() to a single
call to devm_platform_get_and_ioremap_resource(), as this is exactly
what this function does.
Signed-off-by: Yangtao Li <frank.li@vivo.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
Linus Torvalds [Mon, 4 Sep 2023 22:38:24 +0000 (15:38 -0700)]
Merge tag 'arc-6.6-rc1' of git://git./linux/kernel/git/vgupta/arc
Pull ARC updates from Vineet Gupta:
- fixes for -Wmissing-prototype warnings
- missing compiler barrier in relaxed atomics
- some uaccess simplification, declutter
- removal of massive glocal struct cpuinfo_arc from bootlog code
- __switch_to consolidation (removal of inline asm variant)
- use GP to cache task pointer (vs. r25)
- misc rework of entry code
* tag 'arc-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: (24 commits)
ARC: boot log: fix warning
arc: Explicitly include correct DT includes
ARC: pt_regs: create seperate type for ecr
ARCv2: entry: rearrange pt_regs slightly
ARC: entry: replace 8 byte ADD.ne with 4 byte ADD2.ne
ARC: entry: replace 8 byte OR with 4 byte BSET
ARC: entry: Add more common chores to EXCEPTION_PROLOGUE
ARC: entry: EV_MachineCheck dont re-read ECR
ARC: entry: ARcompact EV_ProtV to use r10 directly
ARC: entry: rework (non-functional)
ARC: __switch_to: move ksp to thread_info from thread_struct
ARC: __switch_to: asm with dwarf ops (vs. inline asm)
ARC: kernel stack: INIT_THREAD need not setup @init_stack in @ksp
ARC: entry: use gp to cache task pointer (vs. r25)
ARC: boot log: eliminate struct cpuinfo_arc #4: boot log per ISA
ARC: boot log: eliminate struct cpuinfo_arc #3: don't export
ARC: boot log: eliminate struct cpuinfo_arc #2: cache
ARC: boot log: eliminate struct cpuinfo_arc #1: mm
ARCv2: memset: don't prefetch for len == 0 which happens a alot
ARC: uaccess: elide unaliged handling if hardware supports
...
Linus Torvalds [Mon, 4 Sep 2023 22:21:55 +0000 (15:21 -0700)]
Merge tag 'pm-6.6-rc1-2' of git://git./linux/kernel/git/rafael/linux-pm
Pull more power management updates from Rafael Wysocki:
"These fix cpufreq core and the pcc cpufreq driver, add per-policy
boost support to cpufreq and add Georgian translation Makefile
LANGUAGES in cpupower.
Specifics:
- Add Georgian translation to Makefile LANGUAGES in cpupower (Shuah
Khan).
- Add support for per-policy performance boost to cpufreq (Jie Zhan).
- Fix assorted issues in the cpufreq core, common governor code and
in the pcc cpufreq driver (Liao Chang)"
* tag 'pm-6.6-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: Support per-policy performance boost
cpufreq: pcc: Fix the potentinal scheduling delays in target_index()
cpufreq: governor: Free dbs_data directly when gov->init() fails
cpufreq: Fix the race condition while updating the transition_task of policy
cpufreq: Avoid printing kernel addresses in cpufreq_resume()
cpupower: Add Georgian translation to Makefile LANGUAGES
Linus Torvalds [Mon, 4 Sep 2023 22:17:28 +0000 (15:17 -0700)]
Merge tag 'thermal-6.6-rc1-2' of git://git./linux/kernel/git/rafael/linux-pm
Pull more thermal control updates from Rafael Wysocki:
"These are mostly updates of thermal control drivers for ARM platforms,
new thermal control support for Loongson-2 and a couple of core
cleanups made possible by recent changes merged previously.
Specifics:
- Check if the Tegra BPMP supports the trip points in order to set
the .set_trips callback (Mikko Perttunen)
- Add new Loongson-2 thermal sensor along with the DT bindings (Yinbo
Zhu)
- Use IS_ERR_OR_NULL() helper to replace a double test on the TI
bandgap sensor (Li Zetao)
- Remove redundant platform_set_drvdata() calls, as there are no
corresponding calls to platform_get_drvdata(), from a bunch of
drivers (Andrei Coardos)
- Switch the Mediatek LVTS mode to filtered in order to enable
interrupts (Nícolas F. R. A. Prado)
- Fix Wvoid-pointer-to-enum-cast warning on the Exynos TMU (Krzysztof
Kozlowski)
- Remove redundant dev_err_probe(), because the underlying function
already called it, from the Mediatek sensor (Chen Jiahao)
- Free calibration nvmem after reading it on sun8i (Mark Brown)
- Remove useless comment from the sun8i driver (Yangtao Li)
- Make tsens_xxxx_nvmem static to fix a sparse warning on QCom tsens
(Min-Hua Chen)
- Remove error message at probe deferral on imx8mm (Ahmad Fatoum)
- Fix parameter check in lvts_debugfs_init() with IS_ERR() on
Mediatek LVTS (Minjie Du)
- Fix interrupt routine and configuratoin for Mediatek LVTS (Nícolas
F. R. A. Prado)
- Drop unused .get_trip_type(), .get_trip_temp() and .get_trip_hyst()
thermal zone callbacks from the core and rework the .get_trend()
one to take a trip point pointer as an argument (Rafael Wysocki)"
* tag 'thermal-6.6-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (29 commits)
thermal: core: Rework .get_trend() thermal zone callback
thermal: core: Drop unused .get_trip_*() callbacks
thermal/drivers/tegra-bpmp: Check if BPMP supports trip points
thermal: dt-bindings: add loongson-2 thermal
thermal/drivers/loongson-2: Add thermal management support
thermal/drivers/ti-soc-thermal: Use helper function IS_ERR_OR_NULL()
thermal/drivers/generic-adc: Removed unneeded call to platform_set_drvdata()
thermal/drivers/max77620_thermal: Removed unneeded call to platform_set_drvdata()
thermal/drivers/mediatek/auxadc_thermal: Removed call to platform_set_drvdata()
thermal/drivers/sun8i_thermal: Remove unneeded call to platform_set_drvdata()
thermal/drivers/broadcom/brcstb_thermal: Removed unneeded platform_set_drvdata()
thermal/drivers/mediatek/lvts_thermal: Make readings valid in filtered mode
thermal/drivers/k3_bandgap: Remove unneeded call to platform_set_drvdata()
thermal/drivers/k3_j72xx_bandgap: Removed unneeded call to platform_set_drvdata()
thermal/drivers/broadcom/sr-thermal: Removed call to platform_set_drvdata()
thermal/drivers/samsung: Fix Wvoid-pointer-to-enum-cast warning
thermal/drivers/db8500: Remove redundant of_match_ptr()
thermal/drivers/mediatek: Clean up redundant dev_err_probe()
thermal/drivers/sun8i: Free calibration nvmem after reading it
thermal/drivers/sun8i: Remove unneeded comments
...
Linus Torvalds [Mon, 4 Sep 2023 22:12:26 +0000 (15:12 -0700)]
Merge tag 'rproc-v6.6' of git://git./linux/kernel/git/remoteproc/linux
Pull remoteproc updates from Bjorn Andersson:
"Support for booting the iMX remoteprocs using MMIO, instead of SMCCC
is added. The iMX driver is also extended to support delivering
interrupts from an arbitrary number of vdev.
Support is added to the TI PRU driver, to allow GPMUX to be controlled
from DeviceTree.
The Qualcomm coredump collector is extended to fall back to generating
a full coredump, in the case that the loaded firmware doesn't support
generating minidump. The overly terse MD abbreviation of "MINIDUMP" is
expanded, to make the code easier on the eye.
The list of Qualcomm Sensor Low Power Island (SLPI) instances
supported is cleaned up, and SDM845 is added. SDM630/636/660 support
for the modem subsystem (mss) is added.
All the Qualcomm drivers are transitioned to of_reserved_mem_lookup()
instead of open coding the resolution of reserved-memory regions, to
gain handling of error cases. A couple of drivers are transitioned to
use devm_platform_ioremap_resource_byname().
The stm32 remoteproc driver's PM operations are updated to modern
macros, to avoid the "unused variable"-warning in some configurations.
Drivers are transitioned away from directly including of_device.h"
* tag 'rproc-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux: (23 commits)
remoteproc: pru: add support for configuring GPMUX based on client setup
remoteproc: stm32: fix incorrect optional pointers
remoteproc: imx_rproc: Switch iMX8MN/MP from SMCCC to MMIO
dt-bindings: remoteproc: imx_rproc: Support i.MX8MN/P MMIO
dt-bindings: remoteproc: qcom,msm8996-mss-pil: Fix 8996 clocks
remoteproc: qcom: pas: add SDM845 SLPI compatible
remoteproc: qcom: q6v5-mss: Add support for SDM630/636/660
dt-bindings: remoteproc: qcom,msm8996-mss-pil: Add SDM660 compatible
remoteproc: qcom: Expand MD_* as MINIDUMP_*
remoteproc: qcom: pas: refactor SLPI remoteproc init
dt-bindings: remoteproc: qcom: adsp: add qcom,sdm845-slpi-pas compatible
remoteproc: qcom: wcnss: use devm_platform_ioremap_resource_byname()
remoteproc: qcom: q6v5: use devm_platform_ioremap_resource_byname()
dt-bindings: remoteproc: qcom: sm6115-pas: Add QCM2290
remoteproc: qcom: Add full coredump fallback mechanism
remoteproc: core: Export the rproc coredump APIs
remoteproc: qcom: Use of_reserved_mem_lookup()
remoteproc: imx_rproc: iterate all notifiyids in rx callback
dt-bindings: remoteproc: qcom,adsp: bring back firmware-name
dt-bindings: remoteproc: qcom,sm8550-pas: require memory-region
...
Linus Torvalds [Mon, 4 Sep 2023 22:08:52 +0000 (15:08 -0700)]
Merge tag 'rpmsg-v6.6' of git://git./linux/kernel/git/remoteproc/linux
Pull rpmsg updates from Bjorn Andersson:
"Add support for the GLINK flow control signals, and expose this to the
user through the rpmsg_char interface. Add missing kstrdup() failure
handling during allocation of GLINK channel objects"
* tag 'rpmsg-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux:
rpmsg: glink: Avoid dereferencing NULL channel
rpmsg: glink: Add check for kstrdup
rpmsg: char: Add RPMSG GET/SET FLOWCONTROL IOCTL support
rpmsg: glink: Add support to handle signals command
rpmsg: core: Add signal API support
Linus Torvalds [Mon, 4 Sep 2023 22:04:31 +0000 (15:04 -0700)]
Merge tag 'hwlock-v6.6' of git://git./linux/kernel/git/remoteproc/linux
Pull hwspinlock updates from Bjorn Andersson:
"Convert u8500 and omap drivers to void-returning remove.
Complete the support for representing the Qualcomm TCSR mutex as a
mmio device, and check the return value of devm_regmap_field_alloc()
in the same"
* tag 'hwlock-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux:
hwspinlock: qcom: add missing regmap config for SFPB MMIO implementation
hwspinlock: u8500: Convert to platform remove callback returning void
hwspinlock: omap: Convert to platform remove callback returning void
hwspinlock: omap: Emit only one error message for errors in .remove()
hwspinlock: add a check of devm_regmap_field_alloc in qcom_hwspinlock_probe
Linus Torvalds [Mon, 4 Sep 2023 20:52:58 +0000 (13:52 -0700)]
Merge tag 'leds-next-6.6' of git://git./linux/kernel/git/lee/leds
Pull LED updates from Lee Jones:
"Core Frameworks:
- Add new framework to support Group Multi-Color (GMC) LEDs
- Offer an 'optional' API for non-essential LEDs
- Support obtaining 'max brightness' values from Device Tree
- Provide new led_classdev member 'color' (settable via DT and SYFS)
- Stop TTY Trigger from using the old LED_ON constraints
- Statically allocate leds_class
New Drivers:
- Add support for NXP PCA995x I2C Constant Current LED Driver
New Device Support:
- Add support for Siemens Simatic IPC BX-21 to Simatic IPC
Fix-ups:
- Some dependency / Kconfig tweaking
- Move final probe() functions back over from .probe_new()
- Simplify obtaining resources (memory, device data) using unified
API helpers
- Bunch of Device Tree additions, conversions and adaptions
- Fix trivial styling issues; comments
- Ensure correct includes are present and remove some that are not
required
- Omit the use of redundant casts and if relevant replace with better
ones
- Use purpose-built APIs for various actions; sysfs_emit(),
module_led_trigger()
- Remove a bunch of superfluous locking
Bug Fixes:
- Ensure error codes are correctly propagated back up the call chain
- Fix incorrect error values from being returned (missing '-')
- Ensure get'ed resources are put'ed to prevent leaks
- Use correct class when exporting module resources
- Fixing rounding (or lack there of) issues
- Fix 'always false' LED_COLOR_ID_MULTI BUG() check"
* tag 'leds-next-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds: (40 commits)
leds: aw2013: Enable pull-up supply for interrupt and I2C
dt-bindings: leds: Document pull-up supply for interrupt and I2C
dt-bindings: leds: aw2013: Document interrupt
leds: uleds: Use module_misc_device macro to simplify the code
leds: trigger: netdev: Use module_led_trigger macro to simplify the code
dt-bindings: leds: Fix reference to definition of default-state
leds: turris-omnia: Drop unnecessary mutex locking
leds: turris-omnia: Use sysfs_emit() instead of sprintf()
leds: Make leds_class a static const structure
leds: Remove redundant of_match_ptr()
dt-bindings: leds: Add gpio-line-names to PCA9532 GPIO
leds: trigger: tty: Do not use LED_ON/OFF constants, use led_blink_set_oneshot instead
dt-bindings: leds: rohm,bd71828: Drop select:false
leds: Fix BUG_ON check for LED_COLOR_ID_MULTI that is always false
leds: multicolor: Use rounded division when calculating color components
leds: rgb: Add a multicolor LED driver to group monochromatic LEDs
dt-bindings: leds: Add binding for a multicolor group of LEDs
leds: class: Store the color index in struct led_classdev
leds: Provide devm_of_led_get_optional()
leds: pca995x: Fix MODULE_DEVICE_TABLE for OF
...
Linus Torvalds [Mon, 4 Sep 2023 20:47:59 +0000 (13:47 -0700)]
Merge tag 'mfd-next-6.6' of git://git./linux/kernel/git/lee/mfd
Pull NFD updates from Lee Jones:
"New Drivers:
- Add support for the Cirrus Logic CS42L43 Audio CODEC
Fix-ups:
- Make use of specific printk() format tags for various optimisations
- Kconfig / module modifications / tweaking
- Simplify obtaining resources (memory, device data) using unified
API helpers
- Bunch of Device Tree additions, conversions and adaptions
- Convert a bunch of Regmap configurations to use the Maple Tree
cache
- Ensure correct includes are present and remove some that are not
required
- Remove superfluous code
- Reduce amount of cycles spent in critical sections
- Omit the use of redundant casts and if relevant replace with better
ones
- Swap out raw_spin_{un}lock_irq{save,restore}() for
spin_{un}lock_irq{save,restore}()
Bug Fixes:
- Repair theoretical deadlock situation
- Fix some link-time dependencies
- Use more appropriate datatype when casting"
* tag 'mfd-next-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (70 commits)
mfd: mc13xxx: Simplify device data fetching in probe()
mfd: rz-mtu3: Replace raw_spin_lock->spin_lock()
mfd: rz-mtu3: Reduce critical sections
mfd: mxs-lradc: Fix Wvoid-pointer-to-enum-cast warning
mfd: wm31x: Fix Wvoid-pointer-to-enum-cast warning
mfd: wm8994: Fix Wvoid-pointer-to-enum-cast warning
mfd: tc3589: Fix Wvoid-pointer-to-enum-cast warning
mfd: lp87565: Fix Wvoid-pointer-to-enum-cast warning
mfd: hi6421-pmic: Fix Wvoid-pointer-to-enum-cast warning
mfd: max77541: Fix Wvoid-pointer-to-enum-cast warning
mfd: max14577: Fix Wvoid-pointer-to-enum-cast warning
mfd: stmpe: Fix Wvoid-pointer-to-enum-cast warning
mfd: rn5t618: Remove redundant of_match_ptr()
mfd: lochnagar-i2c: Remove redundant of_match_ptr()
mfd: stpmic1: Remove redundant of_match_ptr()
mfd: act8945a: Remove redundant of_match_ptr()
mfd: rsmu_spi: Remove redundant of_match_ptr()
mfd: altera-a10sr: Remove redundant of_match_ptr()
mfd: rsmu_i2c: Remove redundant of_match_ptr()
mfd: tc3589x: Remove redundant of_match_ptr()
...
Linus Torvalds [Mon, 4 Sep 2023 20:44:11 +0000 (13:44 -0700)]
Merge tag 'i2c-for-6.6-rc1' of git://git./linux/kernel/git/wsa/linux
Pull i2c updates from Wolfram Sang:
"I2C has mainly cleanups this time and a few driver improvements.
Because a lot of developers were on holidays (including myself) it was
a good timing to apply lots of cleanups which would normally cause
merge conflicts with other floating patches. Extra thanks go to Andi
Shyti who backed me up when I was on a four week hiatus. This is also
the reason that some patches were commited later than ideal"
* tag 'i2c-for-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (67 commits)
i2c: at91: Use dev_err_probe() instead of dev_err()
I2C: ali15x3: Do PCI error checks on own line
i2c: Make return value check more accurate and explicit for devm_pinctrl_get()
i2c: designware: Add support for recovery when GPIO need pinctrl
i2c: mlxcpld: Add support for extended transaction length
i2c: mlxcpld: Allow driver to run on ARM64 architecture
i2c: nforce2: Do PCI error check on own line
i2c: sis5595: Do PCI error checks on own line
i2c: qcom-cci: Fix error checking in cci_probe()
i2c: muxes: pca954x: Add regulator support
i2c: muxes: pca954x: Add MAX735x/MAX736x support
dt-bindings: i2c: Add Maxim MAX735x/MAX736x variants
dt-bindings: i2c: pca954x: Correct interrupt support
i2c: pnx: Use devm_platform_get_and_ioremap_resource()
i2c: pxa: Use devm_platform_get_and_ioremap_resource()
i2c: s3c2410: Use devm_platform_get_and_ioremap_resource()
i2c: sh_mobile: Use devm_platform_get_and_ioremap_resource()
i2c: st: Use devm_platform_get_and_ioremap_resource()
i2c: qcom-geni: Convert to devm_platform_ioremap_resource()
i2c: stm32f4: Use devm_platform_get_and_ioremap_resource()
...
Linus Torvalds [Mon, 4 Sep 2023 20:20:19 +0000 (13:20 -0700)]
Merge tag 'printk-for-6.6' of git://git./linux/kernel/git/printk/linux
Pull printk updates from Petr Mladek:
- Do not try to get the console lock when it is not need or useful in
panic()
- Replace the global console_suspended state by a per-console flag
- Export symbols needed for dumping the raw printk buffer in panic()
- Fix documentation of printf formats for integer types
- Moved Sergey Senozhatsky to the reviewer role
- Misc cleanups
* tag 'printk-for-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
printk: export symbols for debug modules
lib: test_scanf: Add explicit type cast to result initialization in test_number_prefix()
printk: ringbuffer: Fix truncating buffer size min_t cast
printk: Rename abandon_console_lock_in_panic() to other_cpu_in_panic()
printk: Add per-console suspended state
printk: Consolidate console deferred printing
printk: Do not take console lock for console_flush_on_panic()
printk: Keep non-panic-CPUs out of console lock
printk: Reduce console_unblank() usage in unsafe scenarios
kdb: Do not assume write() callback available
docs: printk-formats: Treat char as always unsigned
docs: printk-formats: Fix hex printing of signed values
MAINTAINERS: adjust printk/vsprintf entries
Linus Torvalds [Mon, 4 Sep 2023 20:15:57 +0000 (13:15 -0700)]
Merge tag 'timers-core-2023-09-04-v2' of git://git./linux/kernel/git/tip/tip
Pull clocksource/clockevent driver updates from Thomas Gleixner:
- Remove the OXNAS driver instead of adding a new one!
- A set of boring fixes, cleanups and improvements
* tag 'timers-core-2023-09-04-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clocksource: Explicitly include correct DT includes
clocksource/drivers/sun5i: Convert to platform device driver
clocksource/drivers/sun5i: Remove pointless struct
clocksource/drivers/sun5i: Remove duplication of code and data
clocksource/drivers/loongson1: Set variable ls1x_timer_lock storage-class-specifier to static
clocksource/drivers/arm_arch_timer: Disable timer before programming CVAL
dt-bindings: timer: oxsemi,rps-timer: remove obsolete bindings
clocksource/drivers/timer-oxnas-rps: Remove obsolete timer driver
Jarkko Sakkinen [Mon, 4 Sep 2023 18:12:10 +0000 (21:12 +0300)]
tpm: Enable hwrng only for Pluton on AMD CPUs
The vendor check introduced by commit
554b841d4703 ("tpm: Disable RNG for
all AMD fTPMs") doesn't work properly on a number of Intel fTPMs. On the
reported systems the TPM doesn't reply at bootup and returns back the
command code. This makes the TPM fail probe on Lenovo Legion Y540 laptop.
Since only Microsoft Pluton is the only known combination of AMD CPU and
fTPM from other vendor, disable hwrng otherwise. In order to make sysadmin
aware of this, print also info message to the klog.
Cc: stable@vger.kernel.org
Fixes:
554b841d4703 ("tpm: Disable RNG for all AMD fTPMs")
Reported-by: Todd Brandt <todd.e.brandt@intel.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217804
Reported-by: Patrick Steinhardt <ps@pks.im>
Reported-by: Raymond Jay Golo <rjgolo@gmail.com>
Reported-by: Ronan Pigott <ronan@rjp.ie>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Christophe JAILLET [Sat, 25 Feb 2023 10:58:48 +0000 (11:58 +0100)]
tpm_crb: Fix an error handling path in crb_acpi_add()
Some error paths don't call acpi_put_table() before returning.
Branch to the correct place instead of doing some direct return.
Fixes:
4d2732882703 ("tpm_crb: Add support for CRB devices based on Pluton")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Matthew Garrett <mgarrett@aurora.tech>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
Linus Torvalds [Mon, 4 Sep 2023 18:34:33 +0000 (11:34 -0700)]
Merge tag 'm68knommu-for-v6.6' of git://git./linux/kernel/git/gerg/m68knommu
Pull m68knommu updates from Greg Ungerer:
"Two changes, one a trivial white space clean up, the other removes the
unnecessary local pcibios_setup() code"
* tag 'm68knommu-for-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
m68k: coldfire: dma_timer: ERROR: "foo __init bar" should be "foo __init bar"
m68k/pci: Drop useless pcibios_setup()
Linus Torvalds [Mon, 4 Sep 2023 18:32:21 +0000 (11:32 -0700)]
Merge tag 'uml-for-linus-6.6-rc1' of git://git./linux/kernel/git/uml/linux
Pull UML updates from Richard Weinberger:
- Drop 32-bit checksum implementation and re-use it from arch/x86
- String function cleanup
- Fixes for -Wmissing-variable-declarations and -Wmissing-prototypes
builds
* tag 'uml-for-linus-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux:
um: virt-pci: fix missing declaration warning
um: Refactor deprecated strncpy to memcpy
um: fix 3 instances of -Wmissing-prototypes
um: port_kern: fix -Wmissing-variable-declarations
uml: audio: fix -Wmissing-variable-declarations
um: vector: refactor deprecated strncpy
um: use obj-y to descend into arch/um/*/
um: Hard-code the result of 'uname -s'
um: Use the x86 checksum implementation on 32-bit
asm-generic: current: Don't include thread-info.h if building asm
um: Remove unsued extern declaration ldt_host_info()
um: Fix hostaudio build errors
um: Remove strlcpy usage
Linus Torvalds [Mon, 4 Sep 2023 18:26:29 +0000 (11:26 -0700)]
Merge tag 'hyperv-next-signed-
20230902' of git://git./linux/kernel/git/hyperv/linux
Pull hyperv updates from Wei Liu:
- Support for SEV-SNP guests on Hyper-V (Tianyu Lan)
- Support for TDX guests on Hyper-V (Dexuan Cui)
- Use SBRM API in Hyper-V balloon driver (Mitchell Levy)
- Avoid dereferencing ACPI root object handle in VMBus driver (Maciej
Szmigiero)
- A few misecllaneous fixes (Jiapeng Chong, Nathan Chancellor, Saurabh
Sengar)
* tag 'hyperv-next-signed-
20230902' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: (24 commits)
x86/hyperv: Remove duplicate include
x86/hyperv: Move the code in ivm.c around to avoid unnecessary ifdef's
x86/hyperv: Remove hv_isolation_type_en_snp
x86/hyperv: Use TDX GHCI to access some MSRs in a TDX VM with the paravisor
Drivers: hv: vmbus: Bring the post_msg_page back for TDX VMs with the paravisor
x86/hyperv: Introduce a global variable hyperv_paravisor_present
Drivers: hv: vmbus: Support >64 VPs for a fully enlightened TDX/SNP VM
x86/hyperv: Fix serial console interrupts for fully enlightened TDX guests
Drivers: hv: vmbus: Support fully enlightened TDX guests
x86/hyperv: Support hypercalls for fully enlightened TDX guests
x86/hyperv: Add hv_isolation_type_tdx() to detect TDX guests
x86/hyperv: Fix undefined reference to isolation_type_en_snp without CONFIG_HYPERV
x86/hyperv: Add missing 'inline' to hv_snp_boot_ap() stub
hv: hyperv.h: Replace one-element array with flexible-array member
Drivers: hv: vmbus: Don't dereference ACPI root object handle
x86/hyperv: Add hyperv-specific handling for VMMCALL under SEV-ES
x86/hyperv: Add smp support for SEV-SNP guest
clocksource: hyper-v: Mark hyperv tsc page unencrypted in sev-snp enlightened guest
x86/hyperv: Use vmmcall to implement Hyper-V hypercall in sev-snp enlightened guest
drivers: hv: Mark percpu hvcall input arg page unencrypted in SEV-SNP enlightened guest
...
Linus Torvalds [Mon, 4 Sep 2023 17:43:44 +0000 (10:43 -0700)]
Merge tag 'for_linus' of git://git./linux/kernel/git/mst/vhost
Pull virtio updates from Michael Tsirkin:
"A small pull request this time around, mostly because the vduse
network got postponed to next relase so we can be sure we got the
security store right"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
virtio_ring: fix avail_wrap_counter in virtqueue_add_packed
virtio_vdpa: build affinity masks conditionally
virtio_net: merge dma operations when filling mergeable buffers
virtio_ring: introduce dma sync api for virtqueue
virtio_ring: introduce dma map api for virtqueue
virtio_ring: introduce virtqueue_reset()
virtio_ring: separate the logic of reset/enable from virtqueue_resize
virtio_ring: correct the expression of the description of virtqueue_resize()
virtio_ring: skip unmap for premapped
virtio_ring: introduce virtqueue_dma_dev()
virtio_ring: support add premapped buf
virtio_ring: introduce virtqueue_set_dma_premapped()
virtio_ring: put mapping error check in vring_map_one_sg
virtio_ring: check use_dma_api before unmap desc for indirect
vdpa_sim: offer VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK
vdpa: add get_backend_features vdpa operation
vdpa: accept VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK backend feature
vdpa: add VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK flag
vdpa/mlx5: Remove unused function declarations
Linus Torvalds [Mon, 4 Sep 2023 17:38:35 +0000 (10:38 -0700)]
Merge tag 'tomoyo-pr-
20230903' of git://git.osdn.net/gitroot/tomoyo/tomoyo-test1
Pull tomoyo updates from Tetsuo Handa:
"Three cleanup patches, no behavior changes"
* tag 'tomoyo-pr-
20230903' of git://git.osdn.net/gitroot/tomoyo/tomoyo-test1:
tomoyo: remove unused function declaration
tomoyo: refactor deprecated strncpy
tomoyo: add format attributes to functions
Rafael J. Wysocki [Mon, 4 Sep 2023 16:55:03 +0000 (18:55 +0200)]
Merge branch 'pm-cpufreq'
Merge additional cpufreq updates for 6.6-rc1:
- Add support for per-policy performance boost (Jie Zhan).
- Fix assorted issues in the cpufreq core, common governor code and in
the pcc cpufreq driver (Liao Chang).
* pm-cpufreq:
cpufreq: Support per-policy performance boost
cpufreq: pcc: Fix the potentinal scheduling delays in target_index()
cpufreq: governor: Free dbs_data directly when gov->init() fails
cpufreq: Fix the race condition while updating the transition_task of policy
cpufreq: Avoid printing kernel addresses in cpufreq_resume()
Petr Mladek [Mon, 4 Sep 2023 09:37:37 +0000 (11:37 +0200)]
Merge branch 'rework/misc-cleanups' into for-linus
Petr Mladek [Mon, 4 Sep 2023 09:37:11 +0000 (11:37 +0200)]
Merge branch 'for-6.6-vsprintf-doc' into for-linus
Yuan Yao [Tue, 8 Aug 2023 05:10:59 +0000 (05:10 +0000)]
virtio_ring: fix avail_wrap_counter in virtqueue_add_packed
In current packed virtqueue implementation, the avail_wrap_counter won't
flip, in the case when the driver supplies a descriptor chain with a
length equals to the queue size; total_sg == vq->packed.vring.num.
Let’s assume the following situation:
vq->packed.vring.num=4
vq->packed.next_avail_idx: 1
vq->packed.avail_wrap_counter: 0
Then the driver adds a descriptor chain containing 4 descriptors.
We expect the following result with avail_wrap_counter flipped:
vq->packed.next_avail_idx: 1
vq->packed.avail_wrap_counter: 1
But, the current implementation gives the following result:
vq->packed.next_avail_idx: 1
vq->packed.avail_wrap_counter: 0
To reproduce the bug, you can set a packed queue size as small as
possible, so that the driver is more likely to provide a descriptor
chain with a length equal to the packed queue size. For example, in
qemu run following commands:
sudo qemu-system-x86_64 \
-enable-kvm \
-nographic \
-kernel "path/to/kernel_image" \
-m 1G \
-drive file="path/to/rootfs",if=none,id=disk \
-device virtio-blk,drive=disk \
-drive file="path/to/disk_image",if=none,id=rwdisk \
-device virtio-blk,drive=rwdisk,packed=on,queue-size=4,\
indirect_desc=off \
-append "console=ttyS0 root=/dev/vda rw init=/bin/bash"
Inside the VM, create a directory and mount the rwdisk device on it. The
rwdisk will hang and mount operation will not complete.
This commit fixes the wrap counter error by flipping the
packed.avail_wrap_counter, when start of descriptor chain equals to the
end of descriptor chain (head == i).
Fixes:
1ce9e6055fa0 ("virtio_ring: introduce packed ring support")
Signed-off-by: Yuan Yao <yuanyaogoog@chromium.org>
Message-Id: <
20230808051110.3492693-1-yuanyaogoog@chromium.org>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Jason Wang [Fri, 11 Aug 2023 09:15:39 +0000 (05:15 -0400)]
virtio_vdpa: build affinity masks conditionally
We try to build affinity mask via create_affinity_masks()
unconditionally which may lead several issues:
- the affinity mask is not used for parent without affinity support
(only VDUSE support the affinity now)
- the logic of create_affinity_masks() might not work for devices
other than block. For example it's not rare in the networking device
where the number of queues could exceed the number of CPUs. Such
case breaks the current affinity logic which is based on
group_cpus_evenly() who assumes the number of CPUs are not less than
the number of groups. This can trigger a warning[1]:
if (ret >= 0)
WARN_ON(nr_present + nr_others < numgrps);
Fixing this by only build the affinity masks only when
- Driver passes affinity descriptor, driver like virtio-blk can make
sure to limit the number of queues when it exceeds the number of CPUs
- Parent support affinity setting config ops
This help to avoid the warning. More optimizations could be done on
top.
[1]
[ 682.146655] WARNING: CPU: 6 PID: 1550 at lib/group_cpus.c:400 group_cpus_evenly+0x1aa/0x1c0
[ 682.146668] CPU: 6 PID: 1550 Comm: vdpa Not tainted 6.5.0-rc5jason+ #79
[ 682.146671] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org 04/01/2014
[ 682.146673] RIP: 0010:group_cpus_evenly+0x1aa/0x1c0
[ 682.146676] Code: 4c 89 e0 5b 5d 41 5c 41 5d 41 5e c3 cc cc cc cc e8 1b c4 74 ff 48 89 ef e8 13 ac 98 ff 4c 89 e7 45 31 e4 e8 08 ac 98 ff eb c2 <0f> 0b eb b6 e8 fd 05 c3 00 45 31 e4 eb e5 cc cc cc cc cc cc cc cc
[ 682.146679] RSP: 0018:
ffffc9000215f498 EFLAGS:
00010293
[ 682.146682] RAX:
000000000001f1e0 RBX:
0000000000000041 RCX:
0000000000000000
[ 682.146684] RDX:
ffff888109922058 RSI:
0000000000000041 RDI:
0000000000000030
[ 682.146686] RBP:
ffff888109922058 R08:
ffffc9000215f498 R09:
ffffc9000215f4a0
[ 682.146687] R10:
00000000000198d0 R11:
0000000000000030 R12:
ffff888107e02800
[ 682.146689] R13:
0000000000000030 R14:
0000000000000030 R15:
0000000000000041
[ 682.146692] FS:
00007fef52315740(0000) GS:
ffff888237380000(0000) knlGS:
0000000000000000
[ 682.146695] CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
[ 682.146696] CR2:
00007fef52509000 CR3:
0000000110dbc004 CR4:
0000000000370ee0
[ 682.146698] DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
[ 682.146700] DR3:
0000000000000000 DR6:
00000000fffe0ff0 DR7:
0000000000000400
[ 682.146701] Call Trace:
[ 682.146703] <TASK>
[ 682.146705] ? __warn+0x7b/0x130
[ 682.146709] ? group_cpus_evenly+0x1aa/0x1c0
[ 682.146712] ? report_bug+0x1c8/0x1e0
[ 682.146717] ? handle_bug+0x3c/0x70
[ 682.146721] ? exc_invalid_op+0x14/0x70
[ 682.146723] ? asm_exc_invalid_op+0x16/0x20
[ 682.146727] ? group_cpus_evenly+0x1aa/0x1c0
[ 682.146729] ? group_cpus_evenly+0x15c/0x1c0
[ 682.146731] create_affinity_masks+0xaf/0x1a0
[ 682.146735] virtio_vdpa_find_vqs+0x83/0x1d0
[ 682.146738] ? __pfx_default_calc_sets+0x10/0x10
[ 682.146742] virtnet_find_vqs+0x1f0/0x370
[ 682.146747] virtnet_probe+0x501/0xcd0
[ 682.146749] ? vp_modern_get_status+0x12/0x20
[ 682.146751] ? get_cap_addr.isra.0+0x10/0xc0
[ 682.146754] virtio_dev_probe+0x1af/0x260
[ 682.146759] really_probe+0x1a5/0x410
Fixes:
3dad56823b53 ("virtio-vdpa: Support interrupt affinity spreading mechanism")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <
20230811091539.1359865-1-jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Xuan Zhuo [Thu, 10 Aug 2023 12:30:57 +0000 (20:30 +0800)]
virtio_net: merge dma operations when filling mergeable buffers
Currently, the virtio core will perform a dma operation for each
buffer. Although, the same page may be operated multiple times.
This patch, the driver does the dma operation and manages the dma
address based the feature premapped of virtio core.
This way, we can perform only one dma operation for the pages of the
alloc frag. This is beneficial for the iommu device.
kernel command line: intel_iommu=on iommu.passthrough=0
| strict=0 | strict=1
Before | 775496pps | 428614pps
After | 1109316pps | 742853pps
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Message-Id: <
20230810123057.43407-13-xuanzhuo@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Xuan Zhuo [Thu, 10 Aug 2023 12:30:56 +0000 (20:30 +0800)]
virtio_ring: introduce dma sync api for virtqueue
These API has been introduced:
* virtqueue_dma_need_sync
* virtqueue_dma_sync_single_range_for_cpu
* virtqueue_dma_sync_single_range_for_device
These APIs can be used together with the premapped mechanism to sync the
DMA address.
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Message-Id: <
20230810123057.43407-12-xuanzhuo@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Xuan Zhuo [Thu, 10 Aug 2023 12:30:55 +0000 (20:30 +0800)]
virtio_ring: introduce dma map api for virtqueue
Added virtqueue_dma_map_api* to map DMA addresses for virtual memory in
advance. The purpose is to keep memory mapped across multiple add/get
buf operations.
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Message-Id: <
20230810123057.43407-11-xuanzhuo@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Xuan Zhuo [Thu, 10 Aug 2023 12:30:54 +0000 (20:30 +0800)]
virtio_ring: introduce virtqueue_reset()
Introduce virtqueue_reset() to release all buffer inside vq.
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <
20230810123057.43407-10-xuanzhuo@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Xuan Zhuo [Thu, 10 Aug 2023 12:30:53 +0000 (20:30 +0800)]
virtio_ring: separate the logic of reset/enable from virtqueue_resize
The subsequent reset function will reuse these logic.
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <
20230810123057.43407-9-xuanzhuo@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Xuan Zhuo [Thu, 10 Aug 2023 12:30:52 +0000 (20:30 +0800)]
virtio_ring: correct the expression of the description of virtqueue_resize()
Modify the "useless" to a more accurate "unused".
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <
20230810123057.43407-8-xuanzhuo@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Xuan Zhuo [Thu, 10 Aug 2023 12:30:51 +0000 (20:30 +0800)]
virtio_ring: skip unmap for premapped
Now we add a case where we skip dma unmap, the vq->premapped is true.
We can't just rely on use_dma_api to determine whether to skip the dma
operation. For convenience, I introduced the "do_unmap". By default, it
is the same as use_dma_api. If the driver is configured with premapped,
then do_unmap is false.
So as long as do_unmap is false, for addr of desc, we should skip dma
unmap operation.
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Message-Id: <
20230810123057.43407-7-xuanzhuo@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Xuan Zhuo [Thu, 10 Aug 2023 12:30:50 +0000 (20:30 +0800)]
virtio_ring: introduce virtqueue_dma_dev()
Added virtqueue_dma_dev() to get DMA device for virtio. Then the
caller can do dma operation in advance. The purpose is to keep memory
mapped across multiple add/get buf operations.
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <
20230810123057.43407-6-xuanzhuo@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Xuan Zhuo [Thu, 10 Aug 2023 12:30:49 +0000 (20:30 +0800)]
virtio_ring: support add premapped buf
If the vq is the premapped mode, use the sg_dma_address() directly.
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Message-Id: <
20230810123057.43407-5-xuanzhuo@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Xuan Zhuo [Thu, 10 Aug 2023 12:30:48 +0000 (20:30 +0800)]
virtio_ring: introduce virtqueue_set_dma_premapped()
This helper allows the driver change the dma mode to premapped mode.
Under the premapped mode, the virtio core do not do dma mapping
internally.
This just work when the use_dma_api is true. If the use_dma_api is false,
the dma options is not through the DMA APIs, that is not the standard
way of the linux kernel.
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Message-Id: <
20230810123057.43407-4-xuanzhuo@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Xuan Zhuo [Thu, 10 Aug 2023 12:30:47 +0000 (20:30 +0800)]
virtio_ring: put mapping error check in vring_map_one_sg
This patch put the dma addr error check in vring_map_one_sg().
The benefits of doing this:
1. reduce one judgment of vq->use_dma_api.
2. make vring_map_one_sg more simple, without calling
vring_mapping_error to check the return value. simplifies subsequent
code
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <
20230810123057.43407-3-xuanzhuo@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Xuan Zhuo [Thu, 10 Aug 2023 12:30:46 +0000 (20:30 +0800)]
virtio_ring: check use_dma_api before unmap desc for indirect
Inside detach_buf_split(), if use_dma_api is false,
vring_unmap_one_split_indirect will be called many times, but actually
nothing is done. So this patch check use_dma_api firstly.
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <
20230810123057.43407-2-xuanzhuo@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Eugenio Pérez [Fri, 9 Jun 2023 09:21:27 +0000 (11:21 +0200)]
vdpa_sim: offer VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK
Start offering the feature in the simulator. Other parent drivers can
follow this code to offer it too.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Shannon Nelson <shannon.nelson@amd.com>
Message-Id: <
20230609092127.170673-5-eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Eugenio Pérez [Fri, 9 Jun 2023 09:21:26 +0000 (11:21 +0200)]
vdpa: add get_backend_features vdpa operation
This operation allow vdpa parent to expose its own backend feature bits.
Next patches introduce a feature not compatible with all parent drivers:
the ability to enable vq after driver_ok. Each parent must declare if
it allows it or not.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Shannon Nelson <shannon.nelson@amd.com>
Message-Id: <
20230609092127.170673-4-eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Eugenio Pérez [Fri, 9 Jun 2023 09:21:25 +0000 (11:21 +0200)]
vdpa: accept VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK backend feature
Accepting VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK backend feature if
userland sets it.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Shannon Nelson <shannon.nelson@amd.com>
Message-Id: <
20230609092127.170673-3-eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Eugenio Pérez [Fri, 9 Jun 2023 09:21:24 +0000 (11:21 +0200)]
vdpa: add VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK flag
This feature flag allows the driver enabling virtqueues both before and
after DRIVER_OK.
This is needed for software assisted live migration, so userland can
restore the device status in devices with control virtqueue before the
dataplane is enabled.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Shannon Nelson <shannon.nelson@amd.com>
Message-Id: <
20230609092127.170673-2-eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Yue Haibing [Thu, 3 Aug 2023 14:30:41 +0000 (22:30 +0800)]
vdpa/mlx5: Remove unused function declarations
Commit
29064bfdabd5 ("vdpa/mlx5: Add support library for mlx5 VDPA implementation")
declared but never implemented these.
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Message-Id: <
20230803143041.23388-1-yuehaibing@huawei.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Linus Torvalds [Sun, 3 Sep 2023 17:49:42 +0000 (10:49 -0700)]
Merge tag 'dmaengine-6.6-rc1' of git://git./linux/kernel/git/vkoul/dmaengine
Pull dmaengine updates from Vinod Koul:
"New controller support and updates to drivers.
New support:
- Qualcomm SM6115 and QCM2290 dmaengine support
- at_xdma support for microchip,sam9x7 controller
Updates:
- idxd updates for wq simplification and ats knob updates
- fsl edma updates for v3 support
- Xilinx AXI4-Stream control support
- Yaml conversion for bcm dma binding"
* tag 'dmaengine-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (53 commits)
dmaengine: fsl-edma: integrate v3 support
dt-bindings: fsl-dma: fsl-edma: add edma3 compatible string
dmaengine: fsl-edma: move tcd into struct fsl_dma_chan
dmaengine: fsl-edma: refactor chan_name setup and safety
dmaengine: fsl-edma: move clearing of register interrupt into setup_irq function
dmaengine: fsl-edma: refactor using devm_clk_get_enabled
dmaengine: fsl-edma: simply ATTR_DSIZE and ATTR_SSIZE by using ffs()
dmaengine: fsl-edma: move common IRQ handler to common.c
dmaengine: fsl-edma: Remove enum edma_version
dmaengine: fsl-edma: transition from bool fields to bitmask flags in drvdata
dmaengine: fsl-edma: clean up EXPORT_SYMBOL_GPL in fsl-edma-common.c
dmaengine: fsl-edma: fix build error when arch is s390
dmaengine: idxd: Fix issues with PRS disable sysfs knob
dmaengine: idxd: Allow ATS disable update only for configurable devices
dmaengine: xilinx_dma: Program interrupt delay timeout
dmaengine: xilinx_dma: Use tasklet_hi_schedule for timing critical usecase
dmaengine: xilinx_dma: Freeup active list based on descriptor completion bit
dmaengine: xilinx_dma: Increase AXI DMA transaction segment count
dmaengine: xilinx_dma: Pass AXI4-Stream control words to dma client
dt-bindings: dmaengine: xilinx_dma: Add xlnx,irq-delay property
...
Linus Torvalds [Sun, 3 Sep 2023 17:38:02 +0000 (10:38 -0700)]
Merge tag 'phy-for-6.6' of git://git./linux/kernel/git/phy/linux-phy
Pull phy updates from Vinod Koul:
"As usual a couple of new drivers, a bunch of new device support and
few updates to existing drivers
New Support:
- Starfive dphy rx, JH7110 usb and pcie support
- Rockchip rv1126 inno-dsi phy, rk3588 usb and pcie support
- Qualcomm sa8775p PCIe support, M31 USB PHY driver
- Samsung Exynos850 usb support
Updates:
- Mediatek dsi driver clock updates
- Qualcomm sm8150 combo phy with reworking of qmp pcie driver
- Xilinx zynqmp runtime PM support"
* tag 'phy-for-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy: (83 commits)
phy: exynos5-usbdrd: Add Exynos850 support
phy: exynos5-usbdrd: Add 26MHz ref clk support
phy: exynos5-usbdrd: Make it possible to pass custom phy ops
dt-bindings: phy: samsung,usb3-drd-phy: Add Exynos850 support
phy: qcom-qmp-combo: fix clock probing
phy: qcom-qmp-pcie: support SM8150 PCIe QMP PHYs
phy: qcom-qmp-pcie: populate offsets configuration
phy: qcom-qmp-pcie: simplify clock handling
phy: qcom-qmp-pcie: keep offset tables sorted
phy: qcom-qmp-pcie: drop ln_shrd from v5_20 config
dt-bindings: phy: qcom,qmp-pcie: describe SM8150 PCIe PHYs
dt-bindings: phy: migrate QMP PCIe PHY bindings to qcom,sc8280xp-qmp-pcie-phy.yaml
phy: fsl-imx8mq-usb: add dev_err_probe if getting vbus failed
phy: qcom: Introduce M31 USB PHY driver
dt-bindings: phy: qcom,m31: Document qcom,m31 USB phy
phy: rockchip: inno-dsidphy: Add rv1126 support
dt-bindings: phy: rockchip-inno-dsidphy: Document rv1126
dt-bindings: phy: mediatek,tphy: allow simple nodename pattern
phy: amlogic: meson-g12a-usb2: fix Wvoid-pointer-to-enum-cast warning
phy: marvell pxa-usb: fix Wvoid-pointer-to-enum-cast warning
...
Linus Torvalds [Sun, 3 Sep 2023 17:20:57 +0000 (10:20 -0700)]
Merge tag 'soundwire-6.6-rc1' of git://git./linux/kernel/git/vkoul/soundwire
Pull soundwire updates from Vinod Koul:
"Device numbering and intel driver changes are main features:
- Core support for soundwire device number allocation
- intel driver updates for adding hw_params for DAI ops, hybrid
number allocation and power managemnt callback updates
- DT header include changes for subsystem"
* tag 'soundwire-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire:
soundwire: intel_ace2x: add DAI hw_params/prepare/hw_free callbacks
soundwire: intel_auxdevice: add hybrid IDA-based device_number allocation
soundwire: bus: add callbacks for device_number allocation
soundwire: extend parameters of new_peripheral_assigned() callback
soundWire: intel_auxdevice: resume 'sdw-master' on startup and system resume
soundwire: intel_auxdevice: enable pm_runtime earlier on startup
soundwire: Explicitly include correct DT includes
Kees Cook [Thu, 31 Aug 2023 19:13:39 +0000 (12:13 -0700)]
kbuild: Show marked Kconfig fragments in "help"
Currently the Kconfig fragments in kernel/configs and arch/*/configs
that aren't used internally aren't discoverable through "make help",
which consists of hard-coded lists of config fragments. Instead, list
all the fragment targets that have a "# Help: " comment prefix so the
targets can be generated dynamically.
Add logic to the Makefile to search for and display the fragment and
comment. Add comments to fragments that are intended to be direct targets.
Signed-off-by: Kees Cook <keescook@chromium.org>
Co-developed-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Linus Torvalds [Sun, 3 Sep 2023 16:59:53 +0000 (09:59 -0700)]
Merge tag 'mtd/for-6.6' of git://git./linux/kernel/git/mtd/linux
Pull MTD updates from Miquel Raynal:
"Core MTD changes:
- Use refcount to prevent corruption
- Call external _get and _put in right order
- Fix use-after-free in mtd release
- Explicitly include correct DT includes
- Clean refcounting with MTD_PARTITIONED_MASTER
- mtdblock: make warning messages ratelimited
- dt-bindings: Add SEAMA partition bindings
Device driver changes:
- Use devm helper functions
- Fix questionable cast, remove pointless ones.
- error handling fixes
- add support for new chip versions
- update DT bindings
- misc cleanups - fix typos, whitespace, indentation"
* tag 'mtd/for-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux: (105 commits)
dt-bindings: mtd: amlogic,meson-nand: drop unneeded quotes
mtd: spear_smi: Use helper function devm_clk_get_enabled()
mtd: rawnand: orion: Use helper function devm_clk_get_optional_enabled()
mtd: rawnand: vf610_nfc: Use helper function devm_clk_get_enabled()
mtd: rawnand: sunxi: Use helper function devm_clk_get_enabled()
mtd: rawnand: stm32_fmc2: Use helper function devm_clk_get_enabled()
mtd: rawnand: mtk: Use helper function devm_clk_get_enabled()
mtd: rawnand: mpc5121: Use helper function devm_clk_get_enabled()
mtd: rawnand: lpc32xx_slc: Use helper function devm_clk_get_enabled()
mtd: rawnand: intel: Use helper function devm_clk_get_enabled()
mtd: rawnand: fsmc: Use helper function devm_clk_get_enabled()
mtd: rawnand: arasan: Use helper function devm_clk_get_enabled()
mtd: rawnand: qcom: Add read/read_start ops in exec_op path
mtd: rawnand: qcom: Clear buf_count and buf_start in raw read
mtd: maps: fix -Wvoid-pointer-to-enum-cast warning
mtd: rawnand: fix -Wvoid-pointer-to-enum-cast warning
mtd: rawnand: fsmc: handle clk prepare error in fsmc_nand_resume()
mtd: rawnand: Propagate error and simplify ternary operators for brcmstb_nand_wait_for_completion()
mtd: rawnand: qcom: Sort includes alphabetically
mtd: rawnand: qcom: Do not override the error no of submit_descs()
...
Linus Torvalds [Sat, 2 Sep 2023 22:37:59 +0000 (15:37 -0700)]
Merge tag 'f2fs-for-6-6-rc1' of git://git./linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
"In this cycle, we don't have a highlighted feature enhancement, but
mostly have fixed issues mainly in two parts: 1) zoned block device,
and 2) compression support.
For zoned block device, we've tried to improve the power-off recovery
flow as much as possible. For compression, we found some corner cases
caused by wrong compression policy and logics. Other than them, there
were some reverts and stat corrections.
Bug fixes:
- use finish zone command when closing a zone
- check zone type before sending async reset zone command
- fix to assign compress_level for lz4 correctly
- fix error path of f2fs_submit_page_read()
- don't {,de}compress non-full cluster
- send small discard commands during checkpoint back
- flush inode if atomic file is aborted
- correct to account gc/cp stats
And, there are minor bug fixes, avoiding false lockdep warning, and
clean-ups"
* tag 'f2fs-for-6-6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (25 commits)
f2fs: use finish zone command when closing a zone
f2fs: compress: fix to assign compress_level for lz4 correctly
f2fs: fix error path of f2fs_submit_page_read()
f2fs: clean up error handling in sanity_check_{compress_,}inode()
f2fs: avoid false alarm of circular locking
Revert "f2fs: do not issue small discard commands during checkpoint"
f2fs: doc: fix description of max_small_discards
f2fs: should update REQ_TIME for direct write
f2fs: fix to account cp stats correctly
f2fs: fix to account gc stats correctly
f2fs: remove unneeded check condition in __f2fs_setxattr()
f2fs: fix to update i_ctime in __f2fs_setxattr()
Revert "f2fs: fix to do sanity check on extent cache correctly"
f2fs: increase usage of folio_next_index() helper
f2fs: Only lfs mode is allowed with zoned block device feature
f2fs: check zone type before sending async reset zone command
f2fs: compress: don't {,de}compress non-full cluster
f2fs: allow f2fs_ioc_{,de}compress_file to be interrupted
f2fs: don't reopen the main block device in f2fs_scan_devices
f2fs: fix to avoid mmap vs set_compress_option case
...
Waiman Long [Fri, 25 Aug 2023 16:49:47 +0000 (12:49 -0400)]
mm/kmemleak: move up cond_resched() call in page scanning loop
Commit
bde5f6bc68db ("kmemleak: add scheduling point to kmemleak_scan()")
added a cond_resched() call to the struct page scanning loop to prevent
soft lockup from happening. However, soft lockup can still happen in that
loop in some corner cases when the pages that satisfy the "!(pfn & 63)"
check are skipped for some reasons.
Fix this corner case by moving up the cond_resched() check so that it will
be called every 64 pages unconditionally.
Link: https://lkml.kernel.org/r/20230825164947.1317981-1-longman@redhat.com
Fixes:
bde5f6bc68db ("kmemleak: add scheduling point to kmemleak_scan()")
Signed-off-by: Waiman Long <longman@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Yisheng Xie <xieyisheng1@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Johannes Weiner [Thu, 24 Aug 2023 15:38:21 +0000 (11:38 -0400)]
mm: page_alloc: remove stale CMA guard code
In the past, movable allocations could be disallowed from CMA through
PF_MEMALLOC_PIN. As CMA pages are funneled through the MOVABLE pcplist,
this required filtering that cornercase during allocations, such that
pinnable allocations wouldn't accidentally get a CMA page.
However, since
8e3560d963d2 ("mm: honor PF_MEMALLOC_PIN for all movable
pages"), PF_MEMALLOC_PIN automatically excludes __GFP_MOVABLE. Once
again, MOVABLE implies CMA is allowed.
Remove the stale filtering code. Also remove a stale comment that was
introduced as part of the filtering code, because the filtering let
order-0 pages fall through to the buddy allocator. See
1d91df85f399
("mm/page_alloc: handle a missing case for memalloc_nocma_{save/restore}
APIs") for context. The comment's been obsolete since the introduction of
the explicit ALLOC_HIGHATOMIC flag in
eb2e2b425c69 ("mm/page_alloc:
explicitly record high-order atomic allocations in alloc_flags").
Link: https://lkml.kernel.org/r/20230824153821.243148-1-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: David Hildenbrand <david@redhat.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Baruch Siach [Thu, 24 Aug 2023 11:38:09 +0000 (14:38 +0300)]
MAINTAINERS: add rmap.h to mm entry
Make it easier to figure out where to send patches for this file.
Link: https://lkml.kernel.org/r/efbc7689d35a48ff402644d696aa9a8d8bb6333a.1692877089.git.baruch@tkos.co.il
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Baruch Siach [Thu, 24 Aug 2023 11:38:08 +0000 (14:38 +0300)]
rmap: remove anon_vma_link() nommu stub
anon_vma_link() is unused since commit
5beb49305251 ("mm: change anon_vma
linking to fix multi-process server scalability issue").
Link: https://lkml.kernel.org/r/cdce9b00c9ab15f6d02eddf40dcad537d3e9676f.1692877089.git.baruch@tkos.co.il
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Stefan Roesch [Tue, 22 Aug 2023 18:05:39 +0000 (11:05 -0700)]
proc/ksm: add ksm stats to /proc/pid/smaps
With madvise and prctl KSM can be enabled for different VMA's. Once it is
enabled we can query how effective KSM is overall. However we cannot
easily query if an individual VMA benefits from KSM.
This commit adds a KSM section to the /prod/<pid>/smaps file. It reports
how many of the pages are KSM pages. Note that KSM-placed zeropages are
not included, only actual KSM pages.
Here is a typical output:
7f420a000000-
7f421a000000 rw-p
00000000 00:00 0
Size: 262144 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Rss: 51212 kB
Pss: 8276 kB
Shared_Clean: 172 kB
Shared_Dirty: 42996 kB
Private_Clean: 196 kB
Private_Dirty: 7848 kB
Referenced: 15388 kB
Anonymous: 51212 kB
KSM: 41376 kB
LazyFree: 0 kB
AnonHugePages: 0 kB
ShmemPmdMapped: 0 kB
FilePmdMapped: 0 kB
Shared_Hugetlb: 0 kB
Private_Hugetlb: 0 kB
Swap: 202016 kB
SwapPss: 3882 kB
Locked: 0 kB
THPeligible: 0
ProtectionKey: 0
ksm_state: 0
ksm_skip_base: 0
ksm_skip_count: 0
VmFlags: rd wr mr mw me nr mg anon
This information also helps with the following workflow:
- First enable KSM for all the VMA's of a process with prctl.
- Then analyze with the above smaps report which VMA's benefit the most
- Change the application (if possible) to add the corresponding madvise
calls for the VMA's that benefit the most
[shr@devkernel.io: v5]
Link: https://lkml.kernel.org/r/20230823170107.1457915-1-shr@devkernel.io
Link: https://lkml.kernel.org/r/20230822180539.1424843-1-shr@devkernel.io
Signed-off-by: Stefan Roesch <shr@devkernel.io>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Jiaqi Yan [Thu, 13 Jul 2023 23:55:53 +0000 (23:55 +0000)]
mm/hwpoison: rename hwp_walk* to hwpoison_walk*
In the discussion of "Improve hugetlbfs read on HWPOISON hugepages" [1],
Matthew Wilcox suggests hwp is a bad abbreviation of hwpoison, as hwp is
already used as "an acronym by acpi, intel_pstate, some clock drivers, an
ethernet driver, and a scsi driver"[1].
So rename hwp_walk and hwp_walk_ops to hwpoison_walk and
hwpoison_walk_ops respectively.
raw_hwp_(page|list), *_raw_hwp, and raw_hwp_unreliable flag are other
major appearances of "hwp". However, given the "raw" hint in the name, it
is easy to differentiate them from other "hwp" acronyms. Since renaming
them is not as straightforward as renaming hwp_walk*, they are not covered
by this commit.
[1] https://lore.kernel.org/lkml/
20230707201904.953262-5-jiaqiyan@google.com/T/#me6fecb8ce1ad4d5769199c9e162a44bc88f7bdec
Link: https://lkml.kernel.org/r/20230713235553.4121855-1-jiaqiyan@google.com
Signed-off-by: Jiaqi Yan <jiaqiyan@google.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Miaohe Lin [Thu, 27 Jul 2023 11:56:43 +0000 (19:56 +0800)]
mm: memory-failure: add PageOffline() check
Memory failure is not interested in logically offlined pages. Skip this
type of page.
Link: https://lkml.kernel.org/r/20230727115643.639741-5-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Linus Torvalds [Sat, 2 Sep 2023 19:02:41 +0000 (12:02 -0700)]
Merge tag 'scsi-misc' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"Updates to the usual drivers (ufs, lpfc, qla2xxx, mpi3mr, libsas) and
the usual minor updates and bug fixes but no significant core changes"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (116 commits)
scsi: storvsc: Handle additional SRB status values
scsi: libsas: Delete sas_ata_task.retry_count
scsi: libsas: Delete sas_ata_task.stp_affil_pol
scsi: libsas: Delete sas_ata_task.set_affil_pol
scsi: libsas: Delete sas_ssp_task.task_prio
scsi: libsas: Delete sas_ssp_task.enable_first_burst
scsi: libsas: Delete sas_ssp_task.retry_count
scsi: libsas: Delete struct scsi_core
scsi: libsas: Delete enum sas_phy_type
scsi: libsas: Delete enum sas_class
scsi: libsas: Delete sas_ha_struct.lldd_module
scsi: target: Fix write perf due to unneeded throttling
scsi: lpfc: Do not abuse UUID APIs and LPFC_COMPRESS_VMID_SIZE
scsi: pm8001: Remove unused declarations
scsi: fcoe: Fix potential deadlock on &fip->ctlr_lock
scsi: elx: sli4: Remove code duplication
scsi: bfa: Replace one-element array with flexible-array member in struct fc_rscn_pl_s
scsi: qla2xxx: Remove unused declarations
scsi: pmcraid: Use pci_dev_id() to simplify the code
scsi: pm80xx: Set RETFIS when requested by libsas
...
Linus Torvalds [Sat, 2 Sep 2023 18:10:50 +0000 (11:10 -0700)]
Merge tag 'probes-v6.6' of git://git./linux/kernel/git/trace/linux-trace
Pull probes updates from Masami Hiramatsu:
- kprobes: use struct_size() for variable size kretprobe_instance data
structure.
- eprobe: Simplify trace_eprobe list iteration.
- probe events: Data structure field access support on BTF argument.
- Update BTF argument support on the functions in the kernel
loadable modules (only loaded modules are supported).
- Move generic BTF access function (search function prototype and
get function parameters) to a separated file.
- Add a function to search a member of data structure in BTF.
- Support accessing BTF data structure member from probe args by
C-like arrow('->') and dot('.') operators. e.g.
't sched_switch next=next->pid vruntime=next->se.vruntime'
- Support accessing BTF data structure member from $retval. e.g.
'f getname_flags%return +0($retval->name):string'
- Add string type checking if BTF type info is available. This will
reject if user specify ":string" type for non "char pointer"
type.
- Automatically assume the fprobe event as a function return event
if $retval is used.
- selftests/ftrace: Add BTF data field access test cases.
- Documentation: Update fprobe event example with BTF data field.
* tag 'probes-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
Documentation: tracing: Update fprobe event example with BTF field
selftests/ftrace: Add BTF fields access testcases
tracing/fprobe-event: Assume fprobe is a return event by $retval
tracing/probes: Add string type check with BTF
tracing/probes: Support BTF field access from $retval
tracing/probes: Support BTF based data structure field access
tracing/probes: Add a function to search a member of a struct/union
tracing/probes: Move finding func-proto API and getting func-param API to trace_btf
tracing/probes: Support BTF argument on module functions
tracing/eprobe: Iterate trace_eprobe directly
kernel: kprobes: Use struct_size()
Linus Torvalds [Sat, 2 Sep 2023 17:50:54 +0000 (10:50 -0700)]
Merge tag 'trace-v6.6-2' of git://git./linux/kernel/git/trace/linux-trace
Pull more tracing updates from Steven Rostedt:
"Tracing fixes and clean ups:
- Replace strlcpy() with strscpy()
- Initialize the pipe cpumask to zero on allocation
- Use within_module() instead of open coding it
- Remove extra space in hwlat_detectory/mode output
- Use LIST_HEAD() instead of open coding it
- A bunch of clean ups and fixes for the cpumask filter
- Set local da_mon_##name to static
- Fix race in snapshot buffer between cpu write and swap"
* tag 'trace-v6.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing/filters: Fix coding style issues
tracing/filters: Change parse_pred() cpulist ternary into an if block
tracing/filters: Fix double-free of struct filter_pred.mask
tracing/filters: Fix error-handling of cpulist parsing buffer
tracing: Zero the pipe cpumask on alloc to avoid spurious -EBUSY
ftrace: Use LIST_HEAD to initialize clear_hash
ftrace: Use within_module to check rec->ip within specified module.
tracing: Replace strlcpy with strscpy in trace/events/task.h
tracing: Fix race issue between cpu buffer write and swap
tracing: Remove extra space at the end of hwlat_detector/mode
rv: Set variable 'da_mon_##name' to static
Linus Torvalds [Sat, 2 Sep 2023 17:45:17 +0000 (10:45 -0700)]
Merge tag 'pstore-v6.6-rc1-fix' of git://git./linux/kernel/git/kees/linux
Pull pstore fix from Kees Cook:
- Adjust sizes of buffers just avoid uncompress failures (Ard
Biesheuvel)
* tag 'pstore-v6.6-rc1-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
pstore: Base compression input buffer size on estimated compressed size
Linus Torvalds [Sat, 2 Sep 2023 16:07:27 +0000 (09:07 -0700)]
Merge tag 'x86-urgent-2023-09-02' of git://git./linux/kernel/git/tip/tip
Pull x86 selftest fix from Ingo Molnar:
"Fix the __NR_map_shadow_stack syscall-renumbering fallout in the x86
self-test code.
[ Arguably the existing code was unnecessarily fragile, and tooling
should have picked up the new syscall number, and a wider fix is
being worked on - but meanwhile, let's not have the old syscall
number in the kernel tree. ]"
* tag 'x86-urgent-2023-09-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
selftests/x86: Update map_shadow_stack syscall nr
Linus Torvalds [Sat, 2 Sep 2023 16:01:48 +0000 (09:01 -0700)]
Merge tag 'timers-urgent-2023-09-02' of git://git./linux/kernel/git/tip/tip
Pull timer fix from Ingo Molnar:
"Fix false positive 'softirq work is pending' messages on -rt kernels,
caused by a buggy factoring-out of existing code"
* tag 'timers-urgent-2023-09-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tick/rcu: Fix false positive "softirq work is pending" messages
Linus Torvalds [Sat, 2 Sep 2023 15:58:49 +0000 (08:58 -0700)]
Merge tag 'smp-urgent-2023-09-02' of git://git./linux/kernel/git/tip/tip
Pull CPU hotplug fix from Ingo Molnar:
"Fix a CPU hotplug related deadlock between the task which initiates
and controls a CPU hot-unplug operation vs. the CFS bandwidth timer"
* tag 'smp-urgent-2023-09-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
cpu/hotplug: Prevent self deadlock on CPU hot-unplug
Linus Torvalds [Sat, 2 Sep 2023 15:49:08 +0000 (08:49 -0700)]
Merge tag 'sched-urgent-2023-09-02' of git://git./linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
"Miscellaneous scheduler fixes: a reporting fix, a static symbol fix,
and a kernel-doc fix"
* tag 'sched-urgent-2023-09-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/core: Report correct state for TASK_IDLE | TASK_FREEZABLE
sched/fair: Make update_entity_lag() static
sched/core: Add kernel-doc for set_cpus_allowed_ptr()
Hugh Dickins [Sat, 2 Sep 2023 15:29:30 +0000 (08:29 -0700)]
mm/pagewalk: fix bootstopping regression from extra pte_unmap()
Mikhail reports early-6.6-based Fedora Rawhide not booting: "rcu_preempt
detected expedited stalls", minutes wait, and then hung_task splat while
kworker trying to synchronize_rcu_expedited(). Nothing logged to disk.
He bisected to my 6.6
a349d72fd9ef ("mm/pgtable: add rcu_read_lock() and
rcu_read_unlock()s"): but the one to blame is my 6.5 commit to fix the
espfix "bad pmd" warnings when booting x86_64 with CONFIG_EFI_PGT_DUMP=y.
Gaah, that added an "addr >= TASK_SIZE" check to avoid pte_offset_map(),
but failed to add the equivalent check when choosing to pte_unmap().
It's not a problem on 6.5 (for different reasons, it's harmless on both
64-bit and 32-bit), but becomes a bootstopper on 6.6 with the unbalanced
rcu_read_unlock() - RCU has a WARN_ON_ONCE for that, but it would have
scrolled off Mikhail's console too quickly.
Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Closes: https://lore.kernel.org/linux-mm/CABXGCsNi8Tiv5zUPNXr6UJw6qV1VdaBEfGqEAMkkXE3QPvZuAQ@mail.gmail.com/
Fixes:
8b1cb4a2e819 ("mm/pagewalk: fix EFI_PGT_DUMP of espfix area")
Fixes:
a349d72fd9ef ("mm/pgtable: add rcu_read_lock() and rcu_read_unlock()s")
Signed-off-by: Hugh Dickins <hughd@google.com>
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sat, 2 Sep 2023 15:27:17 +0000 (08:27 -0700)]
cgroup: fix build when CGROUP_SCHED is not enabled
Sudip Mukherjee reports that the mips sb1250_swarm_defconfig build fails
with the current kernel. It isn't actually MIPS-specific, it's just
that that defconfig does not have CGROUP_SCHED enabled like most configs
do, and as such shows this error:
kernel/cgroup/cgroup.c: In function 'cgroup_local_stat_show':
kernel/cgroup/cgroup.c:3699:15: error: implicit declaration of function 'cgroup_tryget_css'; did you mean 'cgroup_tryget'? [-Werror=implicit-function-declaration]
3699 | css = cgroup_tryget_css(cgrp, ss);
| ^~~~~~~~~~~~~~~~~
| cgroup_tryget
kernel/cgroup/cgroup.c:3699:13: warning: assignment to 'struct cgroup_subsys_state *' from 'int' makes pointer from integer without a cast [-Wint-conversion]
3699 | css = cgroup_tryget_css(cgrp, ss);
| ^
because cgroup_tryget_css() only exists when CGROUP_SCHED is enabled,
and the cgroup_local_stat_show() function should similarly be guarded by
that config option.
Move things around a bit to fix this all.
Fixes:
d1d4ff5d11a5 ("cgroup: put cgroup_tryget_css() inside CONFIG_CGROUP_SCHED")
Reported-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Sudip Mukherjee [Sat, 2 Sep 2023 09:51:02 +0000 (10:51 +0100)]
fbdev/g364fb: fix build failure with mips
Fix the typo which resulted in the driver using FB_DEFAULT_IOMEM_HELPERS
instead of FB_DEFAULT_IOMEM_OPS as the fbdev I/O helpers.
Fixes:
501126083855 ("fbdev/g364fb: Use fbdev I/O helpers")
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
James Bottomley [Sat, 2 Sep 2023 07:25:19 +0000 (08:25 +0100)]
Merge branch 'fixes' into misc
Pawel Zmarzly [Sat, 2 Sep 2023 03:04:51 +0000 (12:04 +0900)]
ata: libata-core: Disable NCQ_TRIM on Micron 1100 drives
Micron 1100 drives lock up when encountering queued TRIM command. It is
a quite old hardware series, for past years we have been running our
machines with these drives using libata.force=noncqtrim.
[Damien] Move the "Crucial_CT*M500*" entry to keep Micron and Crucial
entries together.
Signed-off-by: Pawel Zmarzly <pzmarzly@meta.com>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Werner Fischer [Tue, 29 Aug 2023 11:33:58 +0000 (13:33 +0200)]
ata: ahci: Add Elkhart Lake AHCI controller
Elkhart Lake is the successor of Apollo Lake and Gemini Lake. These
CPUs and their PCHs are used in mobile and embedded environments.
With this patch I suggest that Elkhart Lake SATA controllers [1] should
use the default LPM policy for mobile chipsets.
The disadvantage of missing hot-plug support with this setting should
not be an issue, as those CPUs are used in embedded environments and
not in servers with hot-plug backplanes.
We discovered that the Elkhart Lake SATA controllers have been missing
in ahci.c after a customer reported the throttling of his SATA SSD
after a short period of higher I/O. We determined the high temperature
of the SSD controller in idle mode as the root cause for that.
Depending on the used SSD, we have seen up to 1.8 Watt lower system
idle power usage and up to 30°C lower SSD controller temperatures in
our tests, when we set med_power_with_dipm manually. I have provided a
table showing seven different SATA SSDs from ATP, Intel/Solidigm and
Samsung [2].
Intel lists a total of 3 SATA controller IDs (4B60, 4B62, 4B63) in [1]
for those mobile PCHs.
This commit just adds 0x4b63 as I do not have test systems with 0x4b60
and 0x4b62 SATA controllers.
I have tested this patch with a system which uses 0x4b63 as SATA
controller.
[1] https://sata-io.org/product/8803
[2] https://www.thomas-krenn.com/en/wiki/SATA_Link_Power_Management#Example_LES_v4
Signed-off-by: Werner Fischer <devlists@wefi.net>
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Valentin Schneider [Fri, 1 Sep 2023 15:10:39 +0000 (17:10 +0200)]
tracing/filters: Fix coding style issues
Recent commits have introduced some coding style issues, fix those up.
Link: https://lkml.kernel.org/r/20230901151039.125186-5-vschneid@redhat.com
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Valentin Schneider <vschneid@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Valentin Schneider [Fri, 1 Sep 2023 15:10:38 +0000 (17:10 +0200)]
tracing/filters: Change parse_pred() cpulist ternary into an if block
Review comments noted that an if block would be clearer than a ternary, so
swap it out.
No change in behaviour intended
Link: https://lkml.kernel.org/r/20230901151039.125186-4-vschneid@redhat.com
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Valentin Schneider <vschneid@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Valentin Schneider [Fri, 1 Sep 2023 15:10:37 +0000 (17:10 +0200)]
tracing/filters: Fix double-free of struct filter_pred.mask
When a cpulist filter is found to contain a single CPU, that CPU is saved
as a scalar and the backing cpumask storage is freed.
Also NULL the mask to avoid a double-free once we get down to
free_predicate().
Link: https://lkml.kernel.org/r/20230901151039.125186-3-vschneid@redhat.com
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Valentin Schneider <vschneid@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Valentin Schneider [Fri, 1 Sep 2023 15:10:36 +0000 (17:10 +0200)]
tracing/filters: Fix error-handling of cpulist parsing buffer
parse_pred() allocates a string buffer to parse the user-provided cpulist,
but doesn't check the allocation result nor does it free the buffer once it
is no longer needed.
Add an allocation check, and free the buffer as soon as it is no longer
needed.
Link: https://lkml.kernel.org/r/20230901151039.125186-2-vschneid@redhat.com
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Reported-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Valentin Schneider <vschneid@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Brian Foster [Thu, 31 Aug 2023 12:55:00 +0000 (08:55 -0400)]
tracing: Zero the pipe cpumask on alloc to avoid spurious -EBUSY
The pipe cpumask used to serialize opens between the main and percpu
trace pipes is not zeroed or initialized. This can result in
spurious -EBUSY returns if underlying memory is not fully zeroed.
This has been observed by immediate failure to read the main
trace_pipe file on an otherwise newly booted and idle system:
# cat /sys/kernel/debug/tracing/trace_pipe
cat: /sys/kernel/debug/tracing/trace_pipe: Device or resource busy
Zero the allocation of pipe_cpumask to avoid the problem.
Link: https://lore.kernel.org/linux-trace-kernel/20230831125500.986862-1-bfoster@redhat.com
Cc: stable@vger.kernel.org
Fixes:
c2489bb7e6be ("tracing: Introduce pipe_cpumask to avoid race on trace_pipes")
Reviewed-by: Zheng Yejian <zhengyejian1@huawei.com>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Ruan Jinjie [Wed, 9 Aug 2023 07:15:51 +0000 (15:15 +0800)]
ftrace: Use LIST_HEAD to initialize clear_hash
Use LIST_HEAD() to initialize clear_hash instead of open-coding it.
Link: https://lore.kernel.org/linux-trace-kernel/20230809071551.913041-1-ruanjinjie@huawei.com
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>