platform/kernel/linux-rpi.git
10 months ago  Merge tag 'mailbox-v6.6' of git://git.linaro.org/landing-teams/working/fujitsu/integr...
Linus Torvalds [Tue, 5 Sep 2023 19:31:07 +0000 (12:31 -0700)]
Merge tag 'mailbox-v6.6' of git://git.linaro.org/landing-teams/working/fujitsu/integration

Pull mailbox updates from Jassi Brar:

 - qcom: fix incorrect num_chans counting

 - mhu: Remove redundant dev_err

 - bcm: fix comments

 - common changes:
    - convert to use devm_platform_get_and_ioremap_resource
    - correct DT includes

* tag 'mailbox-v6.6' of git://git.linaro.org/landing-teams/working/fujitsu/integration:
  mailbox: qcom-ipcc: fix incorrect num_chans counting
  mailbox: Explicitly include correct DT includes
  mailbox: ti-msgmgr: Use devm_platform_ioremap_resource_byname()
  mailbox: platform-mhu: Remove redundant dev_err()
  mailbox: bcm-pdc: Fix some kernel-doc comments
  mailbox: mailbox-test: Fix an error check in mbox_test_probe()
  mailbox: tegra-hsp: Convert to devm_platform_ioremap_resource()
  mailbox: rockchip: Use devm_platform_get_and_ioremap_resource()
  mailbox: mailbox-test: Use devm_platform_get_and_ioremap_resource()
  mailbox: bcm-pdc: Use devm_platform_get_and_ioremap_resource()
  mailbox: bcm-ferxrm-mailbox: Use devm_platform_get_and_ioremap_resource()

10 months ago  Merge tag 'mm-hotfixes-stable-2023-09-05-11-51' of git://git.kernel.org/pub/scm/linux...
Linus Torvalds [Tue, 5 Sep 2023 19:22:39 +0000 (12:22 -0700)]
Merge tag 'mm-hotfixes-stable-2023-09-05-11-51' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
 "Seven hotfixes. Four are cc:stable and the remainder pertain to issues
  which were introduced in the current merge window"

* tag 'mm-hotfixes-stable-2023-09-05-11-51' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  sparc64: add missing initialization of folio in tlb_batch_add()
  mm: memory-failure: use rcu lock instead of tasklist_lock when collect_procs()
  revert "memfd: improve userspace warnings for missing exec-related flags".
  rcu: dump vmalloc memory info safely
  mm/vmalloc: add a safer version of find_vm_area() for debug
  tools/mm: fix undefined reference to pthread_once
  memcontrol: ensure memcg acquired by id is properly set up

10 months ago  Merge tag 'tpmdd-v6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko...
Linus Torvalds [Tue, 5 Sep 2023 18:15:59 +0000 (11:15 -0700)]
Merge tag 'tpmdd-v6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd

Pull more tpm updates from Jarkko Sakkinen:
 "Two more bug fixes for tpm_crb, categorically disabling rng for AMD
  CPU's in the tpm_crb driver, discarding the earlier probing approach"

* tag 'tpmdd-v6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
  tpm: Enable hwrng only for Pluton on AMD CPUs
  tpm_crb: Fix an error handling path in crb_acpi_add()

10 months ago  sparc64: add missing initialization of folio in tlb_batch_add()
Mike Rapoport (IBM) [Mon, 4 Sep 2023 17:37:59 +0000 (20:37 +0300)]
sparc64: add missing initialization of folio in tlb_batch_add()

Commit 1a10a44dfc1d ("sparc64: implement the new page table range API")
missed initialization of folio variable in tlb_batch_add() which causes
boot tests to crash.

Add missing initialization.

Link: https://lkml.kernel.org/r/20230904174350.GF3223@kernel.org
Fixes: 1a10a44dfc1d ("sparc64: implement the new page table range API")
Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 months ago  mm: memory-failure: use rcu lock instead of tasklist_lock when collect_procs()
Tong Tiangen [Mon, 28 Aug 2023 02:25:27 +0000 (10:25 +0800)]
mm: memory-failure: use rcu lock instead of tasklist_lock when collect_procs()

We found a softlockup issue in our test. After analyzing the logs, we found
the relevant CPU call traces to be as follows:

CPU0:
  _do_fork
    -> copy_process()
      -> write_lock_irq(&tasklist_lock)  //Disable irq,waiting for
        //tasklist_lock

CPU1:
  wp_page_copy()
    ->pte_offset_map_lock()
      -> spin_lock(&page->ptl);        //Hold page->ptl
    -> ptep_clear_flush()
      -> flush_tlb_others() ...
        -> smp_call_function_many()
          -> arch_send_call_function_ipi_mask()
            -> csd_lock_wait()         //Waiting for other CPUs to
                               //respond to the IPI

CPU2:
  collect_procs_anon()
    -> read_lock(&tasklist_lock)       //Hold tasklist_lock
      ->for_each_process(tsk)
        -> page_mapped_in_vma()
          -> page_vma_mapped_walk()
            -> map_pte()
              ->spin_lock(&page->ptl)  //Waiting for page->ptl

We can see that CPU1 is waiting for CPU0 to respond to the IPI, CPU0 is
waiting for CPU2 to release tasklist_lock, and CPU2 is waiting for CPU1 to
release page->ptl. As a result, a softlockup is triggered.
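
The circular wait described above can be modelled as a tiny wait-for graph.
This is only an illustrative userspace sketch, not kernel code: the array
layout, the NO_WAIT marker and the in_wait_cycle() helper are assumptions
made up for this example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: waits_for[i] is the index of the CPU that CPU i is
 * blocked on, or NO_WAIT if CPU i can make progress. Every entry is assumed
 * to be either NO_WAIT or a valid index below ncpus. */
#define NO_WAIT (-1)

/* Return true if following the wait-for edges from 'start' leads back to
 * 'start', i.e. the CPU is part of a circular wait. */
bool in_wait_cycle(const int *waits_for, size_t ncpus, size_t start)
{
    int cur = waits_for[start];

    /* A chain over ncpus nodes needs at most ncpus hops to close or end. */
    for (size_t hops = 0; hops < ncpus && cur != NO_WAIT; hops++) {
        if ((size_t)cur == start)
            return true;
        cur = waits_for[cur];
    }
    return false;
}
```

With waits_for = { 2, 0, 1 } (CPU0 on CPU2, CPU1 on CPU0, CPU2 on CPU1) every
CPU is in a cycle. Once CPU2 no longer holds tasklist_lock, CPU0 has nothing
to wait for, the array becomes { NO_WAIT, 0, 1 }, and the cycle disappears.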

For collect_procs_anon(), what we're doing is task list iteration. During
the iteration, with the help of call_rcu(), the task_struct object is freed
only after one or more grace periods elapse. The logic is as follows:

release_task()
  -> __exit_signal()
    -> __unhash_process()
      -> list_del_rcu()

  -> put_task_struct_rcu_user()
    -> call_rcu(&task->rcu, delayed_put_task_struct)

delayed_put_task_struct()
  -> put_task_struct()
  -> if (refcount_sub_and_test())
      __put_task_struct()
          -> free_task()

Therefore, under the protection of the rcu lock, we can safely use
get_task_struct() to ensure a safe reference to task_struct during the
iteration.

By removing the use of tasklist_lock in task list iteration, we can break
the softlockup chain above.

The same logic can also be applied to:
 - collect_procs_file()
 - collect_procs_fsdax()
 - collect_procs_ksm()

Link: https://lkml.kernel.org/r/20230828022527.241693-1-tongtiangen@huawei.com
Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 months ago  revert "memfd: improve userspace warnings for missing exec-related flags".
Andrew Morton [Sat, 2 Sep 2023 22:59:31 +0000 (15:59 -0700)]
revert "memfd: improve userspace warnings for missing exec-related flags".

This warning is telling userspace developers to pass MFD_EXEC and
MFD_NOEXEC_SEAL to memfd_create().  Commit 434ed3350f57 ("memfd: improve
userspace warnings for missing exec-related flags") made the warning more
frequent and visible in the hope that this would accelerate the fixing of
errant userspace.

But the overall effect is to generate far too much dmesg noise.

Fixes: 434ed3350f57 ("memfd: improve userspace warnings for missing exec-related flags")
Reported-by: Damian Tometzki <dtometzki@fedoraproject.org>
Closes: https://lkml.kernel.org/r/ZPFzCSIgZ4QuHsSC@fedora.fritz.box
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Daniel Verkamp <dverkamp@chromium.org>
Cc: Jeff Xu <jeffxu@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 months ago  Merge tag 'kbuild-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy...
Linus Torvalds [Tue, 5 Sep 2023 18:01:47 +0000 (11:01 -0700)]
Merge tag 'kbuild-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild updates from Masahiro Yamada:

 - Enable -Wenum-conversion warning option

 - Refactor the rpm-pkg target

 - Fix scripts/setlocalversion to consider annotated tags for rt-kernel

 - Add a jump key feature for the search menu of 'make nconfig'

 - Support Qt6 for 'make xconfig'

 - Enable -Wformat-overflow, -Wformat-truncation, -Wstringop-overflow,
   and -Wrestrict warnings for W=1 builds

 - Replace <asm/export.h> with <linux/export.h> for alpha, ia64, and
   sparc

 - Support DEB_BUILD_OPTIONS=parallel=N for the debian source package

 - Refactor scripts/Makefile.modinst and fix some modules_sign issues

 - Add a new Kconfig env variable to warn symbols that are not defined
   anywhere

 - Show help messages of config fragments in 'make help'

* tag 'kbuild-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (62 commits)
  kconfig: fix possible buffer overflow
  kbuild: Show marked Kconfig fragments in "help"
  kconfig: add warn-unknown-symbols sanity check
  kbuild: dummy-tools: make MPROFILE_KERNEL checks work on BE
  Documentation/llvm: refresh docs
  modpost: Skip .llvm.call-graph-profile section check
  kbuild: support modules_sign for external modules as well
  kbuild: support 'make modules_sign' with CONFIG_MODULE_SIG_ALL=n
  kbuild: move more module installation code to scripts/Makefile.modinst
  kbuild: reduce the number of mkdir calls during modules_install
  kbuild: remove $(MODLIB)/source symlink
  kbuild: move depmod rule to scripts/Makefile.modinst
  kbuild: add modules_sign to no-{compiler,sync-config}-targets
  kbuild: do not run depmod for 'make modules_sign'
  kbuild: deb-pkg: support DEB_BUILD_OPTIONS=parallel=N in debian/rules
  alpha: remove <asm/export.h>
  alpha: replace #include <asm/export.h> with #include <linux/export.h>
  ia64: remove <asm/export.h>
  ia64: replace #include <asm/export.h> with #include <linux/export.h>
  sparc: remove <asm/export.h>
  ...

10 months ago  Merge tag 'mm-stable-2023-09-04-14-00' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Tue, 5 Sep 2023 17:56:27 +0000 (10:56 -0700)]
Merge tag 'mm-stable-2023-09-04-14-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull more MM updates from Andrew Morton:

 - Stefan Roesch has added ksm statistics to /proc/pid/smaps

 - Also a number of singleton patches, mainly cleanups and leftovers

* tag 'mm-stable-2023-09-04-14-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mm/kmemleak: move up cond_resched() call in page scanning loop
  mm: page_alloc: remove stale CMA guard code
  MAINTAINERS: add rmap.h to mm entry
  rmap: remove anon_vma_link() nommu stub
  proc/ksm: add ksm stats to /proc/pid/smaps
  mm/hwpoison: rename hwp_walk* to hwpoison_walk*
  mm: memory-failure: add PageOffline() check

10 months ago  Merge tag 'microblaze-v6.6' of git://git.monstr.eu/linux-2.6-microblaze
Linus Torvalds [Tue, 5 Sep 2023 17:15:22 +0000 (10:15 -0700)]
Merge tag 'microblaze-v6.6' of git://git.monstr.eu/linux-2.6-microblaze

Pull microblaze updates from Michal Simek:

 - Cleanup DT headers

 - Remove unused zalloc_maybe_bootmem()

 - Make virt_to_pfn() a static inline

* tag 'microblaze-v6.6' of git://git.monstr.eu/linux-2.6-microblaze:
  microblaze: Make virt_to_pfn() a static inline
  microblaze: Remove zalloc_maybe_bootmem()
  microblaze: Explicitly include correct DT includes

10 months ago  rcu: dump vmalloc memory info safely
Zqiang [Mon, 4 Sep 2023 18:08:05 +0000 (18:08 +0000)]
rcu: dump vmalloc memory info safely

Currently, a double invocation of call_rcu() dumps the memory info of the
rcu_head object. If the object was not allocated from the slab allocator,
vmalloc_dump_obj() is invoked, which needs to take the vmap_area_lock
spinlock. Since call_rcu() can be invoked in interrupt context, this can
lead to a spinlock deadlock.

On a Preempt-RT kernel, the rcutorture test also triggers the following
lockdep warning:

BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48
in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 1, name: swapper/0
preempt_count: 1, expected: 0
RCU nest depth: 1, expected: 1
3 locks held by swapper/0/1:
 #0: ffffffffb534ee80 (fullstop_mutex){+.+.}-{4:4}, at: torture_init_begin+0x24/0xa0
 #1: ffffffffb5307940 (rcu_read_lock){....}-{1:3}, at: rcu_torture_init+0x1ec7/0x2370
 #2: ffffffffb536af40 (vmap_area_lock){+.+.}-{3:3}, at: find_vmap_area+0x1f/0x70
irq event stamp: 565512
hardirqs last  enabled at (565511): [<ffffffffb379b138>] __call_rcu_common+0x218/0x940
hardirqs last disabled at (565512): [<ffffffffb5804262>] rcu_torture_init+0x20b2/0x2370
softirqs last  enabled at (399112): [<ffffffffb36b2586>] __local_bh_enable_ip+0x126/0x170
softirqs last disabled at (399106): [<ffffffffb43fef59>] inet_register_protosw+0x9/0x1d0
Preemption disabled at:
[<ffffffffb58040c3>] rcu_torture_init+0x1f13/0x2370
CPU: 0 PID: 1 Comm: swapper/0 Tainted: G        W          6.5.0-rc4-rt2-yocto-preempt-rt+ #15
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0x68/0xb0
 dump_stack+0x14/0x20
 __might_resched+0x1aa/0x280
 ? __pfx_rcu_torture_err_cb+0x10/0x10
 rt_spin_lock+0x53/0x130
 ? find_vmap_area+0x1f/0x70
 find_vmap_area+0x1f/0x70
 vmalloc_dump_obj+0x20/0x60
 mem_dump_obj+0x22/0x90
 __call_rcu_common+0x5bf/0x940
 ? debug_smp_processor_id+0x1b/0x30
 call_rcu_hurry+0x14/0x20
 rcu_torture_init+0x1f82/0x2370
 ? __pfx_rcu_torture_leak_cb+0x10/0x10
 ? __pfx_rcu_torture_leak_cb+0x10/0x10
 ? __pfx_rcu_torture_init+0x10/0x10
 do_one_initcall+0x6c/0x300
 ? debug_smp_processor_id+0x1b/0x30
 kernel_init_freeable+0x2b9/0x540
 ? __pfx_kernel_init+0x10/0x10
 kernel_init+0x1f/0x150
 ret_from_fork+0x40/0x50
 ? __pfx_kernel_init+0x10/0x10
 ret_from_fork_asm+0x1b/0x30
 </TASK>

The previous patch fixes this by using the deadlock-safe best-effort
version of find_vm_area().  However, if that lookup fails, still print the
fact that the pointer was a vmalloc pointer so that we print at least
something.

Link: https://lkml.kernel.org/r/20230904180806.1002832-2-joel@joelfernandes.org
Fixes: 98f180837a89 ("mm: Make mem_dump_obj() handle vmalloc() memory")
Signed-off-by: Zqiang <qiang.zhang1211@gmail.com>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Reported-by: Zhen Lei <thunder.leizhen@huaweicloud.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 months ago  mm/vmalloc: add a safer version of find_vm_area() for debug
Joel Fernandes (Google) [Mon, 4 Sep 2023 18:08:04 +0000 (18:08 +0000)]
mm/vmalloc: add a safer version of find_vm_area() for debug

It is unsafe to dump vmalloc area information from some contexts.  Add a
safer trylock version of the same function to do a best-effort VMA lookup
and use it from vmalloc_dump_obj().
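
The trylock idea can be sketched in userspace C. This is a simplified
model under stated assumptions, not the kernel implementation: the lock is
a C11 atomic_flag standing in for the real spinlock, and struct area,
table_trylock(), table_unlock() and find_area_best_effort() are
hypothetical names used only here.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real lock and the real area lookup. */
struct area {
    unsigned long start;
    unsigned long size;
};

static atomic_flag table_lock = ATOMIC_FLAG_INIT;
static struct area table[] = { { 0x1000, 0x2000 }, { 0x8000, 0x1000 } };

/* Try to take the lock once; never spin or sleep. */
bool table_trylock(void)
{
    return !atomic_flag_test_and_set_explicit(&table_lock,
                                              memory_order_acquire);
}

void table_unlock(void)
{
    atomic_flag_clear_explicit(&table_lock, memory_order_release);
}

/* Best-effort lookup: copies the matching area into *out and returns true.
 * Returns false if the lock is contended or no area contains addr, so the
 * caller must be able to cope with "no information available". */
bool find_area_best_effort(unsigned long addr, struct area *out)
{
    bool found = false;

    if (!table_trylock())
        return false;

    for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
        if (addr >= table[i].start && addr - table[i].start < table[i].size) {
            *out = table[i];
            found = true;
            break;
        }
    }

    table_unlock();
    return found;
}
```

The trade-off is that a debug dump may occasionally report nothing, which
is preferable to deadlocking when the caller already holds the lock or
runs in a context where waiting is not allowed.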

[applied test robot feedback on unused function fix.]
[applied Uladzislau feedback on locking.]
Link: https://lkml.kernel.org/r/20230904180806.1002832-1-joel@joelfernandes.org
Fixes: 98f180837a89 ("mm: Make mem_dump_obj() handle vmalloc() memory")
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Reported-by: Zhen Lei <thunder.leizhen@huaweicloud.com>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Zqiang <qiang.zhang1211@gmail.com>
Cc: <stable@vger.kernel.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 months ago  tools/mm: fix undefined reference to pthread_once
Xie XiuQi [Thu, 31 Aug 2023 03:42:05 +0000 (11:42 +0800)]
tools/mm: fix undefined reference to pthread_once

Commit 97d5f2e9ee12 ("tools api fs: More thread safety for global
filesystem variables") introduced pthread_once(), so libpthread must be
added at link time. Otherwise 'make -C tools/mm' fails with the following
link error:

  gcc -Wall -Wextra -I../lib/ -o page-types page-types.c ../lib/api/libapi.a
  ~/linux/tools/lib/api/fs/fs.c:146: undefined reference to `pthread_once'
  ~/linux/tools/lib/api/fs/fs.c:147: undefined reference to `pthread_once'
  ~/linux/tools/lib/api/fs/fs.c:148: undefined reference to `pthread_once'
  ~/linux/tools/lib/api/fs/fs.c:149: undefined reference to `pthread_once'
  ~/linux/tools/lib/api/fs/fs.c:150: undefined reference to `pthread_once'
  /usr/bin/ld: ../lib/api/libapi.a(libapi-in.o):~/linux/tools/lib/api/fs/fs.c:151:
  more undefined references to `pthread_once' follow
  collect2: error: ld returned 1 exit status
  make: *** [Makefile:22: page-types] Error 1

Link: https://lkml.kernel.org/r/20230831034205.2376653-1-xiexiuqi@huaweicloud.com
Fixes: 97d5f2e9ee12 ("tools api fs: More thread safety for global filesystem variables")
Signed-off-by: Xie XiuQi <xiexiuqi@huawei.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 months ago  memcontrol: ensure memcg acquired by id is properly set up
Johannes Weiner [Wed, 23 Aug 2023 22:54:30 +0000 (15:54 -0700)]
memcontrol: ensure memcg acquired by id is properly set up

In the eviction recency check, we attempt to retrieve the memcg to which
the folio belonged when it was evicted, by the memcg id stored in the
shadow entry.  However, there is a chance that the retrieved memcg is not
the original memcg that has been killed, but a new one which happens to
have the same id.

This is a somewhat unfortunate, but acceptable and rare inaccuracy in the
heuristics.  However, if we retrieve this new memcg between its allocation
and when it is properly attached to the memcg hierarchy, we could run into
the following NULL pointer exception during the memcg hierarchy traversal
done in mem_cgroup_get_nr_swap_pages():

[ 155757.793456] BUG: kernel NULL pointer dereference, address: 00000000000000c0
[ 155757.807568] #PF: supervisor read access in kernel mode
[ 155757.818024] #PF: error_code(0x0000) - not-present page
[ 155757.828482] PGD 401f77067 P4D 401f77067 PUD 401f76067 PMD 0
[ 155757.839985] Oops: 0000 [#1] SMP
[ 155757.887870] RIP: 0010:mem_cgroup_get_nr_swap_pages+0x3d/0xb0
[ 155757.899377] Code: 29 19 4a 02 48 39 f9 74 63 48 8b 97 c0 00 00 00 48 8b b7 58 02 00 00 48 2b b7 c0 01 00 00 48 39 f0 48 0f 4d c6 48 39 d1 74 42 <48> 8b b2 c0 00 00 00 48 8b ba 58 02 00 00 48 2b ba c0 01 00 00 48
[ 155757.937125] RSP: 0018:ffffc9002ecdfbc8 EFLAGS: 00010286
[ 155757.947755] RAX: 00000000003a3b1c RBX: 000007ffffffffff RCX: ffff888280183000
[ 155757.962202] RDX: 0000000000000000 RSI: 0007ffffffffffff RDI: ffff888bbc2d1000
[ 155757.976648] RBP: 0000000000000001 R08: 000000000000000b R09: ffff888ad9cedba0
[ 155757.991094] R10: ffffea0039c07900 R11: 0000000000000010 R12: ffff888b23a7b000
[ 155758.005540] R13: 0000000000000000 R14: ffff888bbc2d1000 R15: 000007ffffc71354
[ 155758.019991] FS:  00007f6234c68640(0000) GS:ffff88903f9c0000(0000) knlGS:0000000000000000
[ 155758.036356] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 155758.048023] CR2: 00000000000000c0 CR3: 0000000a83eb8004 CR4: 00000000007706e0
[ 155758.062473] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 155758.076924] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 155758.091376] PKRU: 55555554
[ 155758.096957] Call Trace:
[ 155758.102016]  <TASK>
[ 155758.106502]  ? __die+0x78/0xc0
[ 155758.112793]  ? page_fault_oops+0x286/0x380
[ 155758.121175]  ? exc_page_fault+0x5d/0x110
[ 155758.129209]  ? asm_exc_page_fault+0x22/0x30
[ 155758.137763]  ? mem_cgroup_get_nr_swap_pages+0x3d/0xb0
[ 155758.148060]  workingset_test_recent+0xda/0x1b0
[ 155758.157133]  workingset_refault+0xca/0x1e0
[ 155758.165508]  filemap_add_folio+0x4d/0x70
[ 155758.173538]  page_cache_ra_unbounded+0xed/0x190
[ 155758.182919]  page_cache_sync_ra+0xd6/0x1e0
[ 155758.191738]  filemap_read+0x68d/0xdf0
[ 155758.199495]  ? mlx5e_napi_poll+0x123/0x940
[ 155758.207981]  ? __napi_schedule+0x55/0x90
[ 155758.216095]  __x64_sys_pread64+0x1d6/0x2c0
[ 155758.224601]  do_syscall_64+0x3d/0x80
[ 155758.232058]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
[ 155758.242473] RIP: 0033:0x7f62c29153b5
[ 155758.249938] Code: e8 48 89 75 f0 89 7d f8 48 89 4d e0 e8 b4 e6 f7 ff 41 89 c0 4c 8b 55 e0 48 8b 55 e8 48 8b 75 f0 8b 7d f8 b8 11 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 33 44 89 c7 48 89 45 f8 e8 e7 e6 f7 ff 48 8b
[ 155758.288005] RSP: 002b:00007f6234c5ffd0 EFLAGS: 00000293 ORIG_RAX: 0000000000000011
[ 155758.303474] RAX: ffffffffffffffda RBX: 00007f628c4e70c0 RCX: 00007f62c29153b5
[ 155758.318075] RDX: 000000000003c041 RSI: 00007f61d2986000 RDI: 0000000000000076
[ 155758.332678] RBP: 00007f6234c5fff0 R08: 0000000000000000 R09: 0000000064d5230c
[ 155758.347452] R10: 000000000027d450 R11: 0000000000000293 R12: 000000000003c041
[ 155758.362044] R13: 00007f61d2986000 R14: 00007f629e11b060 R15: 000000000027d450
[ 155758.376661]  </TASK>

This patch fixes the issue by moving the memcg's id publication from the
alloc stage to online stage, ensuring that any memcg acquired via id must
be connected to the memcg tree.
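
The ordering change can be illustrated with a small single-threaded
userspace model. It is a sketch only: struct group, id_table and the
group_alloc(), group_online() and group_from_id() helpers are hypothetical
names, and the memory-ordering and reference-counting concerns of the real
code are deliberately left out.

```c
#include <stddef.h>

#define MAX_IDS 8   /* hypothetical table size for this sketch */

struct group {
    int id;
    struct group *parent;   /* set once the group is attached to the tree */
};

/* id -> published group; NULL means "not findable by id". */
static struct group *id_table[MAX_IDS];

/* Allocation stage: the id is chosen, but the pointer is NOT published. */
void group_alloc(struct group *g, int id)
{
    g->id = id;
    g->parent = NULL;
}

/* Online stage: attach to the hierarchy first, then publish the id. */
void group_online(struct group *g, struct group *parent)
{
    g->parent = parent;
    id_table[g->id] = g;
}

/* Lookup by id only ever returns groups that have gone online. */
struct group *group_from_id(int id)
{
    if (id < 0 || id >= MAX_IDS)
        return NULL;
    return id_table[id];
}
```

Between group_alloc() and group_online() a lookup by id returns NULL, so a
caller can never walk up from a group whose parent pointer is still unset.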

Link: https://lkml.kernel.org/r/20230823225430.166925-1-nphamcs@gmail.com
Fixes: f78dfc7b77d5 ("workingset: fix confusion around eviction vs refault container")
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Co-developed-by: Nhat Pham <nphamcs@gmail.com>
Signed-off-by: Nhat Pham <nphamcs@gmail.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
Cc: Yosry Ahmed <yosryahmed@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 months ago  Merge tag 'for-linus' of https://github.com/openrisc/linux
Linus Torvalds [Tue, 5 Sep 2023 17:09:31 +0000 (10:09 -0700)]
Merge tag 'for-linus' of https://github.com/openrisc/linux

Pull OpenRISC updates from Stafford Horne:

 - Fixes from me to clean up all compiler warnings reported under
   arch/openrisc

 - One cleanup from Linus Walleij to convert pfn macros to static
   inlines

* tag 'for-linus' of https://github.com/openrisc/linux:
  openrisc: Remove kernel-doc marker from ioremap comment
  openrisc: Remove unused tlb_init function
  openriac: Remove unused nommu_dump_state function
  openrisc: Include cpu.h and switch_to.h for prototypes
  openrisc: Add prototype for die to bug.h
  openrisc: Add prototype for show_registers to processor.h
  openrisc: Declare do_signal function as static
  openrisc: Add missing prototypes for assembly called fnctions
  openrisc: Make pfn accessors statics inlines

10 months ago  kconfig: fix possible buffer overflow
Konstantin Meskhidze [Tue, 5 Sep 2023 09:59:14 +0000 (17:59 +0800)]
kconfig: fix possible buffer overflow

Buffer 'new_argv' is indexed by 'new_argc' with a bounds check in one
place, but is later accessed through the same index without one.
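
The general pattern can be sketched as follows. This is not the kconfig
code; MAX_ARGS and split_args() are hypothetical names chosen for the
illustration. The point is that the store after the loop needs the same
bounds check as the store inside it.

```c
#include <stddef.h>

#define MAX_ARGS 4   /* hypothetical limit for this sketch */

/* Split 'in' on commas, storing the start of each field in argv[].
 * Returns the number of fields, or -1 if there are more than MAX_ARGS.
 * The input buffer is modified: each comma becomes a NUL terminator. */
int split_args(char *in, char *argv[MAX_ARGS])
{
    int argc = 0;
    char *start = in;

    for (char *p = in; *p; p++) {
        if (*p != ',')
            continue;
        if (argc >= MAX_ARGS)       /* checked store inside the loop */
            return -1;
        *p = '\0';
        argv[argc++] = start;
        start = p + 1;
    }

    if (argc >= MAX_ARGS)           /* the trailing store needs it too */
        return -1;
    argv[argc++] = start;

    return argc;
}
```

With the input "a,b,c,d,e" the loop fills all four slots, and only the
second check prevents the final field from being written to argv[4].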

Fixes: e298f3b49def ("kconfig: add built-in function support")
Co-developed-by: Ivanov Mikhail <ivanov.mikhail1@huawei-partners.com>
Signed-off-by: Konstantin Meskhidze <konstantin.meskhidze@huawei.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
10 months ago  mailbox: qcom-ipcc: fix incorrect num_chans counting
Jonathan Marek [Wed, 2 Aug 2023 13:52:22 +0000 (09:52 -0400)]
mailbox: qcom-ipcc: fix incorrect num_chans counting

Breaking out early when a match is found leads to an incorrect num_chans
value when more than one ipcc mailbox channel is used by the same device.
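
The effect of the early break can be shown with a simplified model. The
real driver walks device tree nodes; here the references are reduced to a
plain array of controller ids, and both function names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical model: ctrl_ids[i] names the controller that the i-th
 * mailbox reference of one device points at. */
int count_chans_with_break(const int *ctrl_ids, size_t n, int self)
{
    int num_chans = 0;

    for (size_t i = 0; i < n; i++) {
        if (ctrl_ids[i] == self) {
            num_chans++;
            break;              /* stops after the first match */
        }
    }
    return num_chans;
}

int count_chans_all(const int *ctrl_ids, size_t n, int self)
{
    int num_chans = 0;

    for (size_t i = 0; i < n; i++) {
        if (ctrl_ids[i] == self)
            num_chans++;        /* keep scanning: count every match */
    }
    return num_chans;
}
```

For a device with references { 7, 3, 7, 7 } and self == 7, the first
variant reports one channel while the second reports three.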

Fixes: e9d50e4b4d04 ("mailbox: qcom-ipcc: Dynamic alloc for channel arrangement")
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
10 months ago  mailbox: Explicitly include correct DT includes
Rob Herring [Fri, 14 Jul 2023 17:47:01 +0000 (11:47 -0600)]
mailbox: Explicitly include correct DT includes

The DT of_device.h and of_platform.h date back to the separate
of_platform_bus_type before it was merged into the regular platform bus.
As part of that merge prepping Arm DT support 13 years ago, they
"temporarily" include each other. They also include platform_device.h
and of.h. As a result, there's a pretty much random mix of those include
files used throughout the tree. In order to detangle these headers and
replace the implicit includes with struct declarations, users need to
explicitly include the correct includes.

Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
10 months ago  mailbox: ti-msgmgr: Use devm_platform_ioremap_resource_byname()
Li Zetao [Tue, 1 Aug 2023 08:51:07 +0000 (16:51 +0800)]
mailbox: ti-msgmgr: Use devm_platform_ioremap_resource_byname()

Convert platform_get_resource_byname() + devm_ioremap_resource() to a
single call to devm_platform_ioremap_resource_byname(), as this is
exactly what this function does.

Signed-off-by: Li Zetao <lizetao1@huawei.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
10 months ago  mailbox: platform-mhu: Remove redundant dev_err()
Ruan Jinjie [Thu, 27 Jul 2023 10:41:37 +0000 (10:41 +0000)]
mailbox: platform-mhu: Remove redundant dev_err()

There is no need to call the dev_err() function directly to print a custom
message when handling an error from platform_get_irq() function as
it is going to display an appropriate error message in case of a failure.

Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
10 months ago  mailbox: bcm-pdc: Fix some kernel-doc comments
Yang Li [Fri, 11 Aug 2023 01:34:48 +0000 (09:34 +0800)]
mailbox: bcm-pdc: Fix some kernel-doc comments

Fix some kernel-doc comments to silence the warnings:

drivers/mailbox/bcm-pdc-mailbox.c:707: warning: Function parameter or member 'pdcs' not described in 'pdc_tx_list_sg_add'
drivers/mailbox/bcm-pdc-mailbox.c:707: warning: Excess function parameter 'spu_idx' description in 'pdc_tx_list_sg_add'
drivers/mailbox/bcm-pdc-mailbox.c:875: warning: Function parameter or member 'pdcs' not described in 'pdc_rx_list_sg_add'
drivers/mailbox/bcm-pdc-mailbox.c:875: warning: Excess function parameter 'spu_idx' description in 'pdc_rx_list_sg_add'
drivers/mailbox/bcm-pdc-mailbox.c:966: warning: Function parameter or member 't' not described in 'pdc_tasklet_cb'
drivers/mailbox/bcm-pdc-mailbox.c:966: warning: Excess function parameter 'data' description in 'pdc_tasklet_cb'

Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
10 months ago  mailbox: mailbox-test: Fix an error check in mbox_test_probe()
Minjie Du [Thu, 13 Jul 2023 10:18:08 +0000 (18:18 +0800)]
mailbox: mailbox-test: Fix an error check in mbox_test_probe()

The mbox_test_request_channel() function returns either NULL or an
error value embedded in the pointer (see PTR_ERR()).
Evaluate the return value using IS_ERR_OR_NULL().
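
Why a plain NULL test is not enough can be seen from a userspace model of
the error-pointer convention. The helpers below are simplified
re-implementations written for this sketch (lower-case names to avoid
confusion with the kernel macros), and request_channel() is a hypothetical
function, not the driver's.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
void *err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

/* Error pointers occupy the top MAX_ERRNO values of the address space. */
bool is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

bool is_err_or_null(const void *p)
{
    return p == NULL || is_err(p);
}

static int dummy_chan;

/* Hypothetical request function with three possible outcomes. */
void *request_channel(int mode)
{
    switch (mode) {
    case 0:
        return NULL;                /* nothing to request */
    case 1:
        return err_ptr(-ENOMEM);    /* failure encoded in the pointer */
    default:
        return &dummy_chan;         /* success */
    }
}
```

An error pointer is non-NULL, so a check of the form "if (!chan)" treats
it as a valid channel; is_err_or_null() rejects both failure cases.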

Signed-off-by: Minjie Du <duminjie@vivo.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
10 months ago  mailbox: tegra-hsp: Convert to devm_platform_ioremap_resource()
Yangtao Li [Tue, 4 Jul 2023 13:37:26 +0000 (21:37 +0800)]
mailbox: tegra-hsp: Convert to devm_platform_ioremap_resource()

Use devm_platform_ioremap_resource() to simplify code.

Signed-off-by: Yangtao Li <frank.li@vivo.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
10 months ago  mailbox: rockchip: Use devm_platform_get_and_ioremap_resource()
Yangtao Li [Tue, 4 Jul 2023 13:37:25 +0000 (21:37 +0800)]
mailbox: rockchip: Use devm_platform_get_and_ioremap_resource()

Convert platform_get_resource(), devm_ioremap_resource() to a single
call to devm_platform_get_and_ioremap_resource(), as this is exactly
what this function does.

Signed-off-by: Yangtao Li <frank.li@vivo.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
10 months ago  mailbox: mailbox-test: Use devm_platform_get_and_ioremap_resource()
Yangtao Li [Tue, 4 Jul 2023 13:37:24 +0000 (21:37 +0800)]
mailbox: mailbox-test: Use devm_platform_get_and_ioremap_resource()

Convert platform_get_resource(), devm_ioremap_resource() to a single
call to devm_platform_get_and_ioremap_resource(), as this is exactly
what this function does.

Signed-off-by: Yangtao Li <frank.li@vivo.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
10 months ago  mailbox: bcm-pdc: Use devm_platform_get_and_ioremap_resource()
Yangtao Li [Tue, 4 Jul 2023 13:37:23 +0000 (21:37 +0800)]
mailbox: bcm-pdc: Use devm_platform_get_and_ioremap_resource()

Convert platform_get_resource(), devm_ioremap_resource() to a single
call to devm_platform_get_and_ioremap_resource(), as this is exactly
what this function does.

Signed-off-by: Yangtao Li <frank.li@vivo.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
10 months ago  mailbox: bcm-ferxrm-mailbox: Use devm_platform_get_and_ioremap_resource()
Yangtao Li [Tue, 4 Jul 2023 13:37:22 +0000 (21:37 +0800)]
mailbox: bcm-ferxrm-mailbox: Use devm_platform_get_and_ioremap_resource()

Convert platform_get_resource(), devm_ioremap_resource() to a single
call to devm_platform_get_and_ioremap_resource(), as this is exactly
what this function does.

Signed-off-by: Yangtao Li <frank.li@vivo.com>
Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>
10 months ago  Merge tag 'arc-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Linus Torvalds [Mon, 4 Sep 2023 22:38:24 +0000 (15:38 -0700)]
Merge tag 'arc-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc

Pull ARC updates from Vineet Gupta:

 - fixes for -Wmissing-prototype warnings

 - missing compiler barrier in relaxed atomics

 - some uaccess simplification, declutter

 - removal of massive global struct cpuinfo_arc from bootlog code

 - __switch_to consolidation (removal of inline asm variant)

 - use GP to cache task pointer (vs. r25)

 - misc rework of entry code

* tag 'arc-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: (24 commits)
  ARC: boot log: fix warning
  arc: Explicitly include correct DT includes
  ARC: pt_regs: create seperate type for ecr
  ARCv2: entry: rearrange pt_regs slightly
  ARC: entry: replace 8 byte ADD.ne with 4 byte ADD2.ne
  ARC: entry: replace 8 byte OR with 4 byte BSET
  ARC: entry: Add more common chores to EXCEPTION_PROLOGUE
  ARC: entry: EV_MachineCheck dont re-read ECR
  ARC: entry: ARcompact EV_ProtV to use r10 directly
  ARC: entry: rework (non-functional)
  ARC: __switch_to: move ksp to thread_info from thread_struct
  ARC: __switch_to: asm with dwarf ops (vs. inline asm)
  ARC: kernel stack: INIT_THREAD need not setup @init_stack in @ksp
  ARC: entry: use gp to cache task pointer (vs. r25)
  ARC: boot log: eliminate struct cpuinfo_arc #4: boot log per ISA
  ARC: boot log: eliminate struct cpuinfo_arc #3: don't export
  ARC: boot log: eliminate struct cpuinfo_arc #2: cache
  ARC: boot log: eliminate struct cpuinfo_arc #1: mm
  ARCv2: memset: don't prefetch for len == 0 which happens a alot
  ARC: uaccess: elide unaliged handling if hardware supports
  ...

10 months ago  Merge tag 'pm-6.6-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael...
Linus Torvalds [Mon, 4 Sep 2023 22:21:55 +0000 (15:21 -0700)]
Merge tag 'pm-6.6-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull more power management updates from Rafael Wysocki:
 "These fix cpufreq core and the pcc cpufreq driver, add per-policy
  boost support to cpufreq and add the Georgian translation to Makefile
  LANGUAGES in cpupower.

  Specifics:

   - Add Georgian translation to Makefile LANGUAGES in cpupower (Shuah
     Khan).

   - Add support for per-policy performance boost to cpufreq (Jie Zhan).

   - Fix assorted issues in the cpufreq core, common governor code and
     in the pcc cpufreq driver (Liao Chang)"

* tag 'pm-6.6-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpufreq: Support per-policy performance boost
  cpufreq: pcc: Fix the potentinal scheduling delays in target_index()
  cpufreq: governor: Free dbs_data directly when gov->init() fails
  cpufreq: Fix the race condition while updating the transition_task of policy
  cpufreq: Avoid printing kernel addresses in cpufreq_resume()
  cpupower: Add Georgian translation to Makefile LANGUAGES

10 months ago  Merge tag 'thermal-6.6-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael...
Linus Torvalds [Mon, 4 Sep 2023 22:17:28 +0000 (15:17 -0700)]
Merge tag 'thermal-6.6-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull more thermal control updates from Rafael Wysocki:
 "These are mostly updates of thermal control drivers for ARM platforms,
  new thermal control support for Loongson-2 and a couple of core
  cleanups made possible by recent changes merged previously.

  Specifics:

   - Check if the Tegra BPMP supports the trip points in order to set
     the .set_trips callback (Mikko Perttunen)

   - Add new Loongson-2 thermal sensor along with the DT bindings (Yinbo
     Zhu)

   - Use IS_ERR_OR_NULL() helper to replace a double test on the TI
     bandgap sensor (Li Zetao)

   - Remove redundant platform_set_drvdata() calls, as there are no
     corresponding calls to platform_get_drvdata(), from a bunch of
     drivers (Andrei Coardos)

   - Switch the Mediatek LVTS mode to filtered in order to enable
     interrupts (Nícolas F. R. A. Prado)

   - Fix Wvoid-pointer-to-enum-cast warning on the Exynos TMU (Krzysztof
     Kozlowski)

   - Remove redundant dev_err_probe(), because the underlying function
     already called it, from the Mediatek sensor (Chen Jiahao)

   - Free calibration nvmem after reading it on sun8i (Mark Brown)

   - Remove useless comment from the sun8i driver (Yangtao Li)

   - Make tsens_xxxx_nvmem static to fix a sparse warning on QCom tsens
     (Min-Hua Chen)

   - Remove error message at probe deferral on imx8mm (Ahmad Fatoum)

   - Fix parameter check in lvts_debugfs_init() with IS_ERR() on
     Mediatek LVTS (Minjie Du)

   - Fix interrupt routine and configuration for Mediatek LVTS (Nícolas
     F. R. A. Prado)

   - Drop unused .get_trip_type(), .get_trip_temp() and .get_trip_hyst()
     thermal zone callbacks from the core and rework the .get_trend()
     one to take a trip point pointer as an argument (Rafael Wysocki)"

* tag 'thermal-6.6-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (29 commits)
  thermal: core: Rework .get_trend() thermal zone callback
  thermal: core: Drop unused .get_trip_*() callbacks
  thermal/drivers/tegra-bpmp: Check if BPMP supports trip points
  thermal: dt-bindings: add loongson-2 thermal
  thermal/drivers/loongson-2: Add thermal management support
  thermal/drivers/ti-soc-thermal: Use helper function IS_ERR_OR_NULL()
  thermal/drivers/generic-adc: Removed unneeded call to platform_set_drvdata()
  thermal/drivers/max77620_thermal: Removed unneeded call to platform_set_drvdata()
  thermal/drivers/mediatek/auxadc_thermal: Removed call to platform_set_drvdata()
  thermal/drivers/sun8i_thermal: Remove unneeded call to platform_set_drvdata()
  thermal/drivers/broadcom/brcstb_thermal: Removed unneeded platform_set_drvdata()
  thermal/drivers/mediatek/lvts_thermal: Make readings valid in filtered mode
  thermal/drivers/k3_bandgap: Remove unneeded call to platform_set_drvdata()
  thermal/drivers/k3_j72xx_bandgap: Removed unneeded call to platform_set_drvdata()
  thermal/drivers/broadcom/sr-thermal: Removed call to platform_set_drvdata()
  thermal/drivers/samsung: Fix Wvoid-pointer-to-enum-cast warning
  thermal/drivers/db8500: Remove redundant of_match_ptr()
  thermal/drivers/mediatek: Clean up redundant dev_err_probe()
  thermal/drivers/sun8i: Free calibration nvmem after reading it
  thermal/drivers/sun8i: Remove unneeded comments
  ...

10 months agoMerge tag 'rproc-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc...
Linus Torvalds [Mon, 4 Sep 2023 22:12:26 +0000 (15:12 -0700)]
Merge tag 'rproc-v6.6' of git://git./linux/kernel/git/remoteproc/linux

Pull remoteproc updates from Bjorn Andersson:
 "Support for booting the iMX remoteprocs using MMIO, instead of SMCCC
  is added. The iMX driver is also extended to support delivering
  interrupts from an arbitrary number of vdev.

  Support is added to the TI PRU driver, to allow GPMUX to be controlled
  from DeviceTree.

  The Qualcomm coredump collector is extended to fall back to generating
  a full coredump, in the case that the loaded firmware doesn't support
  generating minidump. The overly terse MD abbreviation of "MINIDUMP" is
  expanded, to make the code easier on the eye.

  The list of Qualcomm Sensor Low Power Island (SLPI) instances
  supported is cleaned up, and SDM845 is added. SDM630/636/660 support
  for the modem subsystem (mss) is added.

  All the Qualcomm drivers are transitioned to of_reserved_mem_lookup()
  instead of open coding the resolution of reserved-memory regions, to
  gain handling of error cases. A couple of drivers are transitioned to
  use devm_platform_ioremap_resource_byname().

  The stm32 remoteproc driver's PM operations are updated to modern
  macros, to avoid the "unused variable"-warning in some configurations.

  Drivers are transitioned away from directly including of_device.h"

* tag 'rproc-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux: (23 commits)
  remoteproc: pru: add support for configuring GPMUX based on client setup
  remoteproc: stm32: fix incorrect optional pointers
  remoteproc: imx_rproc: Switch iMX8MN/MP from SMCCC to MMIO
  dt-bindings: remoteproc: imx_rproc: Support i.MX8MN/P MMIO
  dt-bindings: remoteproc: qcom,msm8996-mss-pil: Fix 8996 clocks
  remoteproc: qcom: pas: add SDM845 SLPI compatible
  remoteproc: qcom: q6v5-mss: Add support for SDM630/636/660
  dt-bindings: remoteproc: qcom,msm8996-mss-pil: Add SDM660 compatible
  remoteproc: qcom: Expand MD_* as MINIDUMP_*
  remoteproc: qcom: pas: refactor SLPI remoteproc init
  dt-bindings: remoteproc: qcom: adsp: add qcom,sdm845-slpi-pas compatible
  remoteproc: qcom: wcnss: use devm_platform_ioremap_resource_byname()
  remoteproc: qcom: q6v5: use devm_platform_ioremap_resource_byname()
  dt-bindings: remoteproc: qcom: sm6115-pas: Add QCM2290
  remoteproc: qcom: Add full coredump fallback mechanism
  remoteproc: core: Export the rproc coredump APIs
  remoteproc: qcom: Use of_reserved_mem_lookup()
  remoteproc: imx_rproc: iterate all notifiyids in rx callback
  dt-bindings: remoteproc: qcom,adsp: bring back firmware-name
  dt-bindings: remoteproc: qcom,sm8550-pas: require memory-region
  ...

10 months agoMerge tag 'rpmsg-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc...
Linus Torvalds [Mon, 4 Sep 2023 22:08:52 +0000 (15:08 -0700)]
Merge tag 'rpmsg-v6.6' of git://git./linux/kernel/git/remoteproc/linux

Pull rpmsg updates from Bjorn Andersson:
 "Add support for the GLINK flow control signals, and expose this to the
  user through the rpmsg_char interface. Add missing kstrdup() failure
  handling during allocation of GLINK channel objects"

* tag 'rpmsg-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux:
  rpmsg: glink: Avoid dereferencing NULL channel
  rpmsg: glink: Add check for kstrdup
  rpmsg: char: Add RPMSG GET/SET FLOWCONTROL IOCTL support
  rpmsg: glink: Add support to handle signals command
  rpmsg: core: Add signal API support

10 months agoMerge tag 'hwlock-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc...
Linus Torvalds [Mon, 4 Sep 2023 22:04:31 +0000 (15:04 -0700)]
Merge tag 'hwlock-v6.6' of git://git./linux/kernel/git/remoteproc/linux

Pull hwspinlock updates from Bjorn Andersson:
 "Convert u8500 and omap drivers to void-returning remove.

  Complete the support for representing the Qualcomm TCSR mutex as a
  mmio device, and check the return value of devm_regmap_field_alloc()
  in the same"

* tag 'hwlock-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux:
  hwspinlock: qcom: add missing regmap config for SFPB MMIO implementation
  hwspinlock: u8500: Convert to platform remove callback returning void
  hwspinlock: omap: Convert to platform remove callback returning void
  hwspinlock: omap: Emit only one error message for errors in .remove()
  hwspinlock: add a check of devm_regmap_field_alloc in qcom_hwspinlock_probe

10 months agoMerge tag 'leds-next-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds
Linus Torvalds [Mon, 4 Sep 2023 20:52:58 +0000 (13:52 -0700)]
Merge tag 'leds-next-6.6' of git://git./linux/kernel/git/lee/leds

Pull LED updates from Lee Jones:
 "Core Frameworks:
   - Add new framework to support Group Multi-Color (GMC) LEDs
   - Offer an 'optional' API for non-essential LEDs
   - Support obtaining 'max brightness' values from Device Tree
   - Provide new led_classdev member 'color' (settable via DT and sysfs)
   - Stop TTY Trigger from using the old LED_ON constraints
   - Statically allocate leds_class

  New Drivers:
   - Add support for NXP PCA995x I2C Constant Current LED Driver

  New Device Support:
   - Add support for Siemens Simatic IPC BX-21 to Simatic IPC

  Fix-ups:
   - Some dependency / Kconfig tweaking
   - Move final probe() functions back over from .probe_new()
   - Simplify obtaining resources (memory, device data) using unified
     API helpers
   - Bunch of Device Tree additions, conversions and adaptions
   - Fix trivial styling issues; comments
   - Ensure correct includes are present and remove some that are not
     required
   - Omit the use of redundant casts and if relevant replace with better
     ones
   - Use purpose-built APIs for various actions; sysfs_emit(),
     module_led_trigger()
   - Remove a bunch of superfluous locking

  Bug Fixes:
   - Ensure error codes are correctly propagated back up the call chain
   - Fix incorrect error values being returned (missing '-')
   - Ensure get'ed resources are put'ed to prevent leaks
   - Use correct class when exporting module resources
   - Fix rounding (or lack thereof) issues
   - Fix 'always false' LED_COLOR_ID_MULTI BUG() check"

* tag 'leds-next-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/leds: (40 commits)
  leds: aw2013: Enable pull-up supply for interrupt and I2C
  dt-bindings: leds: Document pull-up supply for interrupt and I2C
  dt-bindings: leds: aw2013: Document interrupt
  leds: uleds: Use module_misc_device macro to simplify the code
  leds: trigger: netdev: Use module_led_trigger macro to simplify the code
  dt-bindings: leds: Fix reference to definition of default-state
  leds: turris-omnia: Drop unnecessary mutex locking
  leds: turris-omnia: Use sysfs_emit() instead of sprintf()
  leds: Make leds_class a static const structure
  leds: Remove redundant of_match_ptr()
  dt-bindings: leds: Add gpio-line-names to PCA9532 GPIO
  leds: trigger: tty: Do not use LED_ON/OFF constants, use led_blink_set_oneshot instead
  dt-bindings: leds: rohm,bd71828: Drop select:false
  leds: Fix BUG_ON check for LED_COLOR_ID_MULTI that is always false
  leds: multicolor: Use rounded division when calculating color components
  leds: rgb: Add a multicolor LED driver to group monochromatic LEDs
  dt-bindings: leds: Add binding for a multicolor group of LEDs
  leds: class: Store the color index in struct led_classdev
  leds: Provide devm_of_led_get_optional()
  leds: pca995x: Fix MODULE_DEVICE_TABLE for OF
  ...

10 months agoMerge tag 'mfd-next-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd
Linus Torvalds [Mon, 4 Sep 2023 20:47:59 +0000 (13:47 -0700)]
Merge tag 'mfd-next-6.6' of git://git./linux/kernel/git/lee/mfd

Pull MFD updates from Lee Jones:
 "New Drivers:
   - Add support for the Cirrus Logic CS42L43 Audio CODEC

  Fix-ups:
   - Make use of specific printk() format tags for various optimisations
   - Kconfig / module modifications / tweaking
   - Simplify obtaining resources (memory, device data) using unified
     API helpers
   - Bunch of Device Tree additions, conversions and adaptions
   - Convert a bunch of Regmap configurations to use the Maple Tree
     cache
   - Ensure correct includes are present and remove some that are not
     required
   - Remove superfluous code
   - Reduce amount of cycles spent in critical sections
   - Omit the use of redundant casts and if relevant replace with better
     ones
   - Swap out raw_spin_{un}lock_irq{save,restore}() for
     spin_{un}lock_irq{save,restore}()

  Bug Fixes:
   - Repair theoretical deadlock situation
   - Fix some link-time dependencies
   - Use more appropriate datatype when casting"

* tag 'mfd-next-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (70 commits)
  mfd: mc13xxx: Simplify device data fetching in probe()
  mfd: rz-mtu3: Replace raw_spin_lock->spin_lock()
  mfd: rz-mtu3: Reduce critical sections
  mfd: mxs-lradc: Fix Wvoid-pointer-to-enum-cast warning
  mfd: wm31x: Fix Wvoid-pointer-to-enum-cast warning
  mfd: wm8994: Fix Wvoid-pointer-to-enum-cast warning
  mfd: tc3589: Fix Wvoid-pointer-to-enum-cast warning
  mfd: lp87565: Fix Wvoid-pointer-to-enum-cast warning
  mfd: hi6421-pmic: Fix Wvoid-pointer-to-enum-cast warning
  mfd: max77541: Fix Wvoid-pointer-to-enum-cast warning
  mfd: max14577: Fix Wvoid-pointer-to-enum-cast warning
  mfd: stmpe: Fix Wvoid-pointer-to-enum-cast warning
  mfd: rn5t618: Remove redundant of_match_ptr()
  mfd: lochnagar-i2c: Remove redundant of_match_ptr()
  mfd: stpmic1: Remove redundant of_match_ptr()
  mfd: act8945a: Remove redundant of_match_ptr()
  mfd: rsmu_spi: Remove redundant of_match_ptr()
  mfd: altera-a10sr: Remove redundant of_match_ptr()
  mfd: rsmu_i2c: Remove redundant of_match_ptr()
  mfd: tc3589x: Remove redundant of_match_ptr()
  ...

10 months agoMerge tag 'i2c-for-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa...
Linus Torvalds [Mon, 4 Sep 2023 20:44:11 +0000 (13:44 -0700)]
Merge tag 'i2c-for-6.6-rc1' of git://git./linux/kernel/git/wsa/linux

Pull i2c updates from Wolfram Sang:
 "I2C has mainly cleanups this time and a few driver improvements.

  Because a lot of developers were on holidays (including myself) it was
  good timing to apply lots of cleanups which would normally cause
  merge conflicts with other floating patches. Extra thanks go to Andi
  Shyti who backed me up when I was on a four week hiatus. This is also
  the reason that some patches were committed later than ideal"

* tag 'i2c-for-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (67 commits)
  i2c: at91: Use dev_err_probe() instead of dev_err()
  I2C: ali15x3: Do PCI error checks on own line
  i2c: Make return value check more accurate and explicit for devm_pinctrl_get()
  i2c: designware: Add support for recovery when GPIO need pinctrl
  i2c: mlxcpld: Add support for extended transaction length
  i2c: mlxcpld: Allow driver to run on ARM64 architecture
  i2c: nforce2: Do PCI error check on own line
  i2c: sis5595: Do PCI error checks on own line
  i2c: qcom-cci: Fix error checking in cci_probe()
  i2c: muxes: pca954x: Add regulator support
  i2c: muxes: pca954x: Add MAX735x/MAX736x support
  dt-bindings: i2c: Add Maxim MAX735x/MAX736x variants
  dt-bindings: i2c: pca954x: Correct interrupt support
  i2c: pnx: Use devm_platform_get_and_ioremap_resource()
  i2c: pxa: Use devm_platform_get_and_ioremap_resource()
  i2c: s3c2410: Use devm_platform_get_and_ioremap_resource()
  i2c: sh_mobile: Use devm_platform_get_and_ioremap_resource()
  i2c: st: Use devm_platform_get_and_ioremap_resource()
  i2c: qcom-geni: Convert to devm_platform_ioremap_resource()
  i2c: stm32f4: Use devm_platform_get_and_ioremap_resource()
  ...

10 months agoMerge tag 'printk-for-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/printk...
Linus Torvalds [Mon, 4 Sep 2023 20:20:19 +0000 (13:20 -0700)]
Merge tag 'printk-for-6.6' of git://git./linux/kernel/git/printk/linux

Pull printk updates from Petr Mladek:

 - Do not try to get the console lock when it is not needed or useful in
   panic()

 - Replace the global console_suspended state by a per-console flag

 - Export symbols needed for dumping the raw printk buffer in panic()

 - Fix documentation of printf formats for integer types

 - Moved Sergey Senozhatsky to the reviewer role

 - Misc cleanups

* tag 'printk-for-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
  printk: export symbols for debug modules
  lib: test_scanf: Add explicit type cast to result initialization in test_number_prefix()
  printk: ringbuffer: Fix truncating buffer size min_t cast
  printk: Rename abandon_console_lock_in_panic() to other_cpu_in_panic()
  printk: Add per-console suspended state
  printk: Consolidate console deferred printing
  printk: Do not take console lock for console_flush_on_panic()
  printk: Keep non-panic-CPUs out of console lock
  printk: Reduce console_unblank() usage in unsafe scenarios
  kdb: Do not assume write() callback available
  docs: printk-formats: Treat char as always unsigned
  docs: printk-formats: Fix hex printing of signed values
  MAINTAINERS: adjust printk/vsprintf entries

10 months agoMerge tag 'timers-core-2023-09-04-v2' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Mon, 4 Sep 2023 20:15:57 +0000 (13:15 -0700)]
Merge tag 'timers-core-2023-09-04-v2' of git://git./linux/kernel/git/tip/tip

Pull clocksource/clockevent driver updates from Thomas Gleixner:

 - Remove the OXNAS driver instead of adding a new one!

 - A set of boring fixes, cleanups and improvements

* tag 'timers-core-2023-09-04-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  clocksource: Explicitly include correct DT includes
  clocksource/drivers/sun5i: Convert to platform device driver
  clocksource/drivers/sun5i: Remove pointless struct
  clocksource/drivers/sun5i: Remove duplication of code and data
  clocksource/drivers/loongson1: Set variable ls1x_timer_lock storage-class-specifier to static
  clocksource/drivers/arm_arch_timer: Disable timer before programming CVAL
  dt-bindings: timer: oxsemi,rps-timer: remove obsolete bindings
  clocksource/drivers/timer-oxnas-rps: Remove obsolete timer driver

10 months agotpm: Enable hwrng only for Pluton on AMD CPUs
Jarkko Sakkinen [Mon, 4 Sep 2023 18:12:10 +0000 (21:12 +0300)]
tpm: Enable hwrng only for Pluton on AMD CPUs

The vendor check introduced by commit 554b841d4703 ("tpm: Disable RNG for
all AMD fTPMs") doesn't work properly on a number of Intel fTPMs.  On the
reported systems the TPM doesn't reply at bootup and returns back the
command code. This makes the TPM fail probe on Lenovo Legion Y540 laptop.

Since Microsoft Pluton is the only known combination of an AMD CPU and an
fTPM from another vendor, disable hwrng otherwise. To make the sysadmin
aware of this, also print an info message to the klog.

Cc: stable@vger.kernel.org
Fixes: 554b841d4703 ("tpm: Disable RNG for all AMD fTPMs")
Reported-by: Todd Brandt <todd.e.brandt@intel.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217804
Reported-by: Patrick Steinhardt <ps@pks.im>
Reported-by: Raymond Jay Golo <rjgolo@gmail.com>
Reported-by: Ronan Pigott <ronan@rjp.ie>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>

10 months agotpm_crb: Fix an error handling path in crb_acpi_add()
Christophe JAILLET [Sat, 25 Feb 2023 10:58:48 +0000 (11:58 +0100)]
tpm_crb: Fix an error handling path in crb_acpi_add()

Some error paths don't call acpi_put_table() before returning.
Branch to the correct place instead of doing some direct return.

Fixes: 4d2732882703 ("tpm_crb: Add support for CRB devices based on Pluton")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Matthew Garrett <mgarrett@aurora.tech>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>

10 months agoMerge tag 'm68knommu-for-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg...
Linus Torvalds [Mon, 4 Sep 2023 18:34:33 +0000 (11:34 -0700)]
Merge tag 'm68knommu-for-v6.6' of git://git./linux/kernel/git/gerg/m68knommu

Pull m68knommu updates from Greg Ungerer:
 "Two changes, one a trivial white space clean up, the other removes the
  unnecessary local pcibios_setup() code"

* tag 'm68knommu-for-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
  m68k: coldfire: dma_timer: ERROR: "foo __init bar" should be "foo __init bar"
  m68k/pci: Drop useless pcibios_setup()

10 months agoMerge tag 'uml-for-linus-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Mon, 4 Sep 2023 18:32:21 +0000 (11:32 -0700)]
Merge tag 'uml-for-linus-6.6-rc1' of git://git./linux/kernel/git/uml/linux

Pull UML updates from Richard Weinberger:

 - Drop 32-bit checksum implementation and re-use it from arch/x86

 - String function cleanup

 - Fixes for -Wmissing-variable-declarations and -Wmissing-prototypes
   builds

* tag 'uml-for-linus-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux:
  um: virt-pci: fix missing declaration warning
  um: Refactor deprecated strncpy to memcpy
  um: fix 3 instances of -Wmissing-prototypes
  um: port_kern: fix -Wmissing-variable-declarations
  uml: audio: fix -Wmissing-variable-declarations
  um: vector: refactor deprecated strncpy
  um: use obj-y to descend into arch/um/*/
  um: Hard-code the result of 'uname -s'
  um: Use the x86 checksum implementation on 32-bit
  asm-generic: current: Don't include thread-info.h if building asm
  um: Remove unsued extern declaration ldt_host_info()
  um: Fix hostaudio build errors
  um: Remove strlcpy usage

10 months agoMerge tag 'hyperv-next-signed-20230902' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Mon, 4 Sep 2023 18:26:29 +0000 (11:26 -0700)]
Merge tag 'hyperv-next-signed-20230902' of git://git./linux/kernel/git/hyperv/linux

Pull hyperv updates from Wei Liu:

 - Support for SEV-SNP guests on Hyper-V (Tianyu Lan)

 - Support for TDX guests on Hyper-V (Dexuan Cui)

 - Use SBRM API in Hyper-V balloon driver (Mitchell Levy)

 - Avoid dereferencing ACPI root object handle in VMBus driver (Maciej
   Szmigiero)

 - A few miscellaneous fixes (Jiapeng Chong, Nathan Chancellor, Saurabh
   Sengar)

* tag 'hyperv-next-signed-20230902' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: (24 commits)
  x86/hyperv: Remove duplicate include
  x86/hyperv: Move the code in ivm.c around to avoid unnecessary ifdef's
  x86/hyperv: Remove hv_isolation_type_en_snp
  x86/hyperv: Use TDX GHCI to access some MSRs in a TDX VM with the paravisor
  Drivers: hv: vmbus: Bring the post_msg_page back for TDX VMs with the paravisor
  x86/hyperv: Introduce a global variable hyperv_paravisor_present
  Drivers: hv: vmbus: Support >64 VPs for a fully enlightened TDX/SNP VM
  x86/hyperv: Fix serial console interrupts for fully enlightened TDX guests
  Drivers: hv: vmbus: Support fully enlightened TDX guests
  x86/hyperv: Support hypercalls for fully enlightened TDX guests
  x86/hyperv: Add hv_isolation_type_tdx() to detect TDX guests
  x86/hyperv: Fix undefined reference to isolation_type_en_snp without CONFIG_HYPERV
  x86/hyperv: Add missing 'inline' to hv_snp_boot_ap() stub
  hv: hyperv.h: Replace one-element array with flexible-array member
  Drivers: hv: vmbus: Don't dereference ACPI root object handle
  x86/hyperv: Add hyperv-specific handling for VMMCALL under SEV-ES
  x86/hyperv: Add smp support for SEV-SNP guest
  clocksource: hyper-v: Mark hyperv tsc page unencrypted in sev-snp enlightened guest
  x86/hyperv: Use vmmcall to implement Hyper-V hypercall in sev-snp enlightened guest
  drivers: hv: Mark percpu hvcall input arg page unencrypted in SEV-SNP enlightened guest
  ...

10 months agoMerge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Linus Torvalds [Mon, 4 Sep 2023 17:43:44 +0000 (10:43 -0700)]
Merge tag 'for_linus' of git://git./linux/kernel/git/mst/vhost

Pull virtio updates from Michael Tsirkin:
 "A small pull request this time around, mostly because the vduse
  network got postponed to next release so we can be sure we got the
  security store right"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  virtio_ring: fix avail_wrap_counter in virtqueue_add_packed
  virtio_vdpa: build affinity masks conditionally
  virtio_net: merge dma operations when filling mergeable buffers
  virtio_ring: introduce dma sync api for virtqueue
  virtio_ring: introduce dma map api for virtqueue
  virtio_ring: introduce virtqueue_reset()
  virtio_ring: separate the logic of reset/enable from virtqueue_resize
  virtio_ring: correct the expression of the description of virtqueue_resize()
  virtio_ring: skip unmap for premapped
  virtio_ring: introduce virtqueue_dma_dev()
  virtio_ring: support add premapped buf
  virtio_ring: introduce virtqueue_set_dma_premapped()
  virtio_ring: put mapping error check in vring_map_one_sg
  virtio_ring: check use_dma_api before unmap desc for indirect
  vdpa_sim: offer VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK
  vdpa: add get_backend_features vdpa operation
  vdpa: accept VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK backend feature
  vdpa: add VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK flag
  vdpa/mlx5: Remove unused function declarations

10 months agoMerge tag 'tomoyo-pr-20230903' of git://git.osdn.net/gitroot/tomoyo/tomoyo-test1
Linus Torvalds [Mon, 4 Sep 2023 17:38:35 +0000 (10:38 -0700)]
Merge tag 'tomoyo-pr-20230903' of git://git.osdn.net/gitroot/tomoyo/tomoyo-test1

Pull tomoyo updates from Tetsuo Handa:
 "Three cleanup patches, no behavior changes"

* tag 'tomoyo-pr-20230903' of git://git.osdn.net/gitroot/tomoyo/tomoyo-test1:
  tomoyo: remove unused function declaration
  tomoyo: refactor deprecated strncpy
  tomoyo: add format attributes to functions

10 months agoMerge branch 'pm-cpufreq'
Rafael J. Wysocki [Mon, 4 Sep 2023 16:55:03 +0000 (18:55 +0200)]
Merge branch 'pm-cpufreq'

Merge additional cpufreq updates for 6.6-rc1:

 - Add support for per-policy performance boost (Jie Zhan).

 - Fix assorted issues in the cpufreq core, common governor code and in
   the pcc cpufreq driver (Liao Chang).

* pm-cpufreq:
  cpufreq: Support per-policy performance boost
  cpufreq: pcc: Fix the potentinal scheduling delays in target_index()
  cpufreq: governor: Free dbs_data directly when gov->init() fails
  cpufreq: Fix the race condition while updating the transition_task of policy
  cpufreq: Avoid printing kernel addresses in cpufreq_resume()

10 months agoMerge branch 'rework/misc-cleanups' into for-linus
Petr Mladek [Mon, 4 Sep 2023 09:37:37 +0000 (11:37 +0200)]
Merge branch 'rework/misc-cleanups' into for-linus

10 months agoMerge branch 'for-6.6-vsprintf-doc' into for-linus
Petr Mladek [Mon, 4 Sep 2023 09:37:11 +0000 (11:37 +0200)]
Merge branch 'for-6.6-vsprintf-doc' into for-linus

10 months agovirtio_ring: fix avail_wrap_counter in virtqueue_add_packed
Yuan Yao [Tue, 8 Aug 2023 05:10:59 +0000 (05:10 +0000)]
virtio_ring: fix avail_wrap_counter in virtqueue_add_packed

In the current packed virtqueue implementation, the avail_wrap_counter
won't flip when the driver supplies a descriptor chain with a length
equal to the queue size, i.e. total_sg == vq->packed.vring.num.

Let’s assume the following situation:
vq->packed.vring.num=4
vq->packed.next_avail_idx: 1
vq->packed.avail_wrap_counter: 0

Then the driver adds a descriptor chain containing 4 descriptors.

We expect the following result with avail_wrap_counter flipped:
vq->packed.next_avail_idx: 1
vq->packed.avail_wrap_counter: 1

But, the current implementation gives the following result:
vq->packed.next_avail_idx: 1
vq->packed.avail_wrap_counter: 0

To reproduce the bug, you can set a packed queue size as small as
possible, so that the driver is more likely to provide a descriptor
chain with a length equal to the packed queue size. For example, run the
following QEMU command:
sudo qemu-system-x86_64 \
-enable-kvm \
-nographic \
-kernel "path/to/kernel_image" \
-m 1G \
-drive file="path/to/rootfs",if=none,id=disk \
-device virtio-blk,drive=disk \
-drive file="path/to/disk_image",if=none,id=rwdisk \
-device virtio-blk,drive=rwdisk,packed=on,queue-size=4,\
indirect_desc=off \
-append "console=ttyS0 root=/dev/vda rw init=/bin/bash"

Inside the VM, create a directory and mount the rwdisk device on it. The
rwdisk will hang and the mount operation will not complete.

This commit fixes the wrap counter error by flipping
packed.avail_wrap_counter when the start of the descriptor chain equals
the end of the descriptor chain (head == i).
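
The scenario above can be replayed with a small user-space model. This is
only a sketch, not the kernel code: the struct, the add_chain() helper and
the "fixed" switch are hypothetical names, and the strict comparison used
for the pre-fix behaviour is an assumption based on the description above.

```c
#include <stdbool.h>

/* Hypothetical, simplified model of the avail side of a packed ring. */
struct packed_avail_state {
	unsigned int num;            /* ring size, as in vq->packed.vring.num */
	unsigned int next_avail_idx; /* slot where the next chain starts */
	bool avail_wrap_counter;
};

/*
 * Advance the state past a chain of total_sg descriptors
 * (1 <= total_sg <= num).
 *
 * fixed == false: assumed pre-fix check, flip only when the end index is
 *                 strictly below the head.
 * fixed == true:  also flip when the end index equals the head (head == i),
 *                 which happens when the chain fills the whole ring.
 */
void add_chain(struct packed_avail_state *s, unsigned int total_sg, bool fixed)
{
	unsigned int head = s->next_avail_idx;
	unsigned int i = head;
	unsigned int n;

	/* Walk the ring one descriptor at a time, wrapping at num. */
	for (n = 0; n < total_sg; n++) {
		if (++i >= s->num)
			i = 0;
	}

	if (fixed ? (i <= head) : (i < head))
		s->avail_wrap_counter = !s->avail_wrap_counter;

	s->next_avail_idx = i;
}
```

Starting from num = 4, next_avail_idx = 1 and a wrap counter of 0, calling
add_chain() with total_sg = 4 leaves the counter at 0 in the pre-fix variant
and flips it to 1 in the fixed variant. next_avail_idx stays at 1 in both.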

Fixes: 1ce9e6055fa0 ("virtio_ring: introduce packed ring support")
Signed-off-by: Yuan Yao <yuanyaogoog@chromium.org>
Message-Id: <20230808051110.3492693-1-yuanyaogoog@chromium.org>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

10 months agovirtio_vdpa: build affinity masks conditionally
Jason Wang [Fri, 11 Aug 2023 09:15:39 +0000 (05:15 -0400)]
virtio_vdpa: build affinity masks conditionally

We try to build the affinity masks via create_affinity_masks()
unconditionally, which may lead to several issues:

- the affinity mask is not used for a parent without affinity support
  (only VDUSE supports affinity now)
- the logic of create_affinity_masks() might not work for devices
  other than block. For example, it is not rare for a networking device
  to have more queues than CPUs. Such a case breaks the current affinity
  logic, which is based on group_cpus_evenly(), which assumes the number
  of CPUs is not less than the number of groups. This can trigger a
  warning[1]:

	if (ret >= 0)
		WARN_ON(nr_present + nr_others < numgrps);

Fix this by building the affinity masks only when

- Driver passes an affinity descriptor; a driver like virtio-blk can make
  sure to limit the number of queues when it exceeds the number of CPUs
- Parent supports the affinity setting config ops

This helps to avoid the warning. More optimizations could be done on
top.
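
The two conditions listed above can be sketched as a predicate. This is an
illustration only: the struct layouts, noop_set_vq_affinity() and both
function names are hypothetical stand-ins, not the kernel's definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's affinity descriptor and the
 * parent's config ops; the real kernel types carry more fields. */
struct affinity_desc {
	unsigned int pre_vectors;
};

struct parent_ops {
	int (*set_vq_affinity)(unsigned int vq, const void *mask);
};

/* Example callback for a parent that supports affinity (does nothing). */
int noop_set_vq_affinity(unsigned int vq, const void *mask)
{
	(void)vq;
	(void)mask;
	return 0;
}

/* Build masks only if the driver passed a descriptor AND the parent
 * provides an affinity-setting op. */
bool should_build_affinity_masks(const struct affinity_desc *desc,
				 const struct parent_ops *ops)
{
	return desc != NULL && ops != NULL && ops->set_vq_affinity != NULL;
}

/* The condition checked by the WARN_ON() quoted above: fewer CPUs were
 * assigned than there are groups to spread over. */
bool grouping_would_warn(unsigned int nr_present, unsigned int nr_others,
			 unsigned int numgrps)
{
	return nr_present + nr_others < numgrps;
}
```

With this shape, a device that passes no descriptor, or a parent without the
affinity op, never reaches the grouping code, so the warning path is not
entered for it.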

[1]
[  682.146655] WARNING: CPU: 6 PID: 1550 at lib/group_cpus.c:400 group_cpus_evenly+0x1aa/0x1c0
[  682.146668] CPU: 6 PID: 1550 Comm: vdpa Not tainted 6.5.0-rc5jason+ #79
[  682.146671] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org 04/01/2014
[  682.146673] RIP: 0010:group_cpus_evenly+0x1aa/0x1c0
[  682.146676] Code: 4c 89 e0 5b 5d 41 5c 41 5d 41 5e c3 cc cc cc cc e8 1b c4 74 ff 48 89 ef e8 13 ac 98 ff 4c 89 e7 45 31 e4 e8 08 ac 98 ff eb c2 <0f> 0b eb b6 e8 fd 05 c3 00 45 31 e4 eb e5 cc cc cc cc cc cc cc cc
[  682.146679] RSP: 0018:ffffc9000215f498 EFLAGS: 00010293
[  682.146682] RAX: 000000000001f1e0 RBX: 0000000000000041 RCX: 0000000000000000
[  682.146684] RDX: ffff888109922058 RSI: 0000000000000041 RDI: 0000000000000030
[  682.146686] RBP: ffff888109922058 R08: ffffc9000215f498 R09: ffffc9000215f4a0
[  682.146687] R10: 00000000000198d0 R11: 0000000000000030 R12: ffff888107e02800
[  682.146689] R13: 0000000000000030 R14: 0000000000000030 R15: 0000000000000041
[  682.146692] FS:  00007fef52315740(0000) GS:ffff888237380000(0000) knlGS:0000000000000000
[  682.146695] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  682.146696] CR2: 00007fef52509000 CR3: 0000000110dbc004 CR4: 0000000000370ee0
[  682.146698] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  682.146700] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  682.146701] Call Trace:
[  682.146703]  <TASK>
[  682.146705]  ? __warn+0x7b/0x130
[  682.146709]  ? group_cpus_evenly+0x1aa/0x1c0
[  682.146712]  ? report_bug+0x1c8/0x1e0
[  682.146717]  ? handle_bug+0x3c/0x70
[  682.146721]  ? exc_invalid_op+0x14/0x70
[  682.146723]  ? asm_exc_invalid_op+0x16/0x20
[  682.146727]  ? group_cpus_evenly+0x1aa/0x1c0
[  682.146729]  ? group_cpus_evenly+0x15c/0x1c0
[  682.146731]  create_affinity_masks+0xaf/0x1a0
[  682.146735]  virtio_vdpa_find_vqs+0x83/0x1d0
[  682.146738]  ? __pfx_default_calc_sets+0x10/0x10
[  682.146742]  virtnet_find_vqs+0x1f0/0x370
[  682.146747]  virtnet_probe+0x501/0xcd0
[  682.146749]  ? vp_modern_get_status+0x12/0x20
[  682.146751]  ? get_cap_addr.isra.0+0x10/0xc0
[  682.146754]  virtio_dev_probe+0x1af/0x260
[  682.146759]  really_probe+0x1a5/0x410

Fixes: 3dad56823b53 ("virtio-vdpa: Support interrupt affinity spreading mechanism")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20230811091539.1359865-1-jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
10 months ago virtio_net: merge dma operations when filling mergeable buffers
Xuan Zhuo [Thu, 10 Aug 2023 12:30:57 +0000 (20:30 +0800)]
virtio_net: merge dma operations when filling mergeable buffers

Currently, the virtio core performs a DMA operation for each buffer,
even though the same page may be operated on multiple times.

With this patch, the driver does the DMA operation and manages the DMA
address itself, based on the premapped feature of the virtio core.

This way, only one DMA operation is needed for the pages of the
alloc frag. This is beneficial for devices behind an IOMMU.

kernel command line: intel_iommu=on iommu.passthrough=0

       |  strict=0  | strict=1
Before |  775496pps | 428614pps
After  | 1109316pps | 742853pps
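
The relative gain implied by the table can be worked out with a small
helper. This is only an illustrative sketch: the function name is made up
for this note, and the result is rounded down to a whole percent.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): relative throughput gain
 * in whole percent, rounded down. Returns 0 when there is no gain or
 * when the baseline is zero.
 */
static uint64_t gain_percent(uint64_t before_pps, uint64_t after_pps)
{
	if (before_pps == 0 || after_pps < before_pps)
		return 0;
	return (after_pps - before_pps) * 100 / before_pps;
}
```

With the figures above, gain_percent(775496, 1109316) gives 43 for
strict=0 and gain_percent(428614, 742853) gives 73 for strict=1.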

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Message-Id: <20230810123057.43407-13-xuanzhuo@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
10 months ago virtio_ring: introduce dma sync api for virtqueue
Xuan Zhuo [Thu, 10 Aug 2023 12:30:56 +0000 (20:30 +0800)]
virtio_ring: introduce dma sync api for virtqueue

These APIs have been introduced:

* virtqueue_dma_need_sync
* virtqueue_dma_sync_single_range_for_cpu
* virtqueue_dma_sync_single_range_for_device

These APIs can be used together with the premapped mechanism to sync the
DMA address.

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Message-Id: <20230810123057.43407-12-xuanzhuo@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
10 months ago virtio_ring: introduce dma map api for virtqueue
Xuan Zhuo [Thu, 10 Aug 2023 12:30:55 +0000 (20:30 +0800)]
virtio_ring: introduce dma map api for virtqueue

Added virtqueue_dma_map_api* to map DMA addresses for virtual memory in
advance. The purpose is to keep memory mapped across multiple add/get
buf operations.

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Message-Id: <20230810123057.43407-11-xuanzhuo@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
10 months ago virtio_ring: introduce virtqueue_reset()
Xuan Zhuo [Thu, 10 Aug 2023 12:30:54 +0000 (20:30 +0800)]
virtio_ring: introduce virtqueue_reset()

Introduce virtqueue_reset() to release all buffers inside the vq.

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20230810123057.43407-10-xuanzhuo@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
10 months ago virtio_ring: separate the logic of reset/enable from virtqueue_resize
Xuan Zhuo [Thu, 10 Aug 2023 12:30:53 +0000 (20:30 +0800)]
virtio_ring: separate the logic of reset/enable from virtqueue_resize

The subsequent reset function will reuse this logic.

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20230810123057.43407-9-xuanzhuo@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
10 months ago virtio_ring: correct the expression of the description of virtqueue_resize()
Xuan Zhuo [Thu, 10 Aug 2023 12:30:52 +0000 (20:30 +0800)]
virtio_ring: correct the expression of the description of virtqueue_resize()

Change "useless" to the more accurate "unused".

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20230810123057.43407-8-xuanzhuo@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
10 months ago virtio_ring: skip unmap for premapped
Xuan Zhuo [Thu, 10 Aug 2023 12:30:51 +0000 (20:30 +0800)]
virtio_ring: skip unmap for premapped

Now we add a case where we skip DMA unmap: when vq->premapped is true.

We can't just rely on use_dma_api to determine whether to skip the DMA
operation. For convenience, I introduced "do_unmap". By default, it
is the same as use_dma_api. If the driver is configured with premapped,
then do_unmap is false.

So as long as do_unmap is false, we should skip the DMA unmap operation
for the addr of a desc.
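
The rule described above can be modelled in a few lines of userspace C.
This is a sketch under assumptions, not the kernel code: the struct and
function names are invented here, and only the three flag names come from
the commit message.

```c
#include <stdbool.h>

/* Hypothetical model of the three flags named in the commit message. */
struct vq_model {
	bool use_dma_api;
	bool premapped;
	bool do_unmap;
};

/*
 * Build a model vq. do_unmap defaults to use_dma_api and is forced to
 * false once the driver has selected premapped mode.
 */
static struct vq_model vq_model_make(bool use_dma_api, bool premapped)
{
	struct vq_model vq;

	vq.use_dma_api = use_dma_api;
	vq.premapped = premapped;
	vq.do_unmap = use_dma_api;	/* default: same as use_dma_api */
	if (premapped)
		vq.do_unmap = false;	/* the driver owns the mapping */
	return vq;
}
```

Only the combination use_dma_api=true, premapped=false leaves do_unmap
set, which is the one case where the core still has a mapping to undo.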

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Message-Id: <20230810123057.43407-7-xuanzhuo@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
10 months ago virtio_ring: introduce virtqueue_dma_dev()
Xuan Zhuo [Thu, 10 Aug 2023 12:30:50 +0000 (20:30 +0800)]
virtio_ring: introduce virtqueue_dma_dev()

Added virtqueue_dma_dev() to get DMA device for virtio. Then the
caller can do dma operation in advance. The purpose is to keep memory
mapped across multiple add/get buf operations.

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20230810123057.43407-6-xuanzhuo@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
10 months ago virtio_ring: support add premapped buf
Xuan Zhuo [Thu, 10 Aug 2023 12:30:49 +0000 (20:30 +0800)]
virtio_ring: support add premapped buf

If the vq is in premapped mode, use sg_dma_address() directly.

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Message-Id: <20230810123057.43407-5-xuanzhuo@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
10 months ago virtio_ring: introduce virtqueue_set_dma_premapped()
Xuan Zhuo [Thu, 10 Aug 2023 12:30:48 +0000 (20:30 +0800)]
virtio_ring: introduce virtqueue_set_dma_premapped()

This helper allows the driver to change the DMA mode to premapped mode.
In premapped mode, the virtio core does not do DMA mapping internally.

This only works when use_dma_api is true. If use_dma_api is false, the
DMA operations do not go through the DMA APIs, which is not the
standard way of the Linux kernel.

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Message-Id: <20230810123057.43407-4-xuanzhuo@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
10 months ago virtio_ring: put mapping error check in vring_map_one_sg
Xuan Zhuo [Thu, 10 Aug 2023 12:30:47 +0000 (20:30 +0800)]
virtio_ring: put mapping error check in vring_map_one_sg

This patch puts the DMA address error check in vring_map_one_sg().

The benefits of doing this:

1. It removes one check of vq->use_dma_api.
2. It makes vring_map_one_sg() simpler to use: callers no longer call
   vring_mapping_error() to check the return value, which simplifies
   subsequent code.

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20230810123057.43407-3-xuanzhuo@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
10 months ago virtio_ring: check use_dma_api before unmap desc for indirect
Xuan Zhuo [Thu, 10 Aug 2023 12:30:46 +0000 (20:30 +0800)]
virtio_ring: check use_dma_api before unmap desc for indirect

Inside detach_buf_split(), if use_dma_api is false,
vring_unmap_one_split_indirect() will be called many times, but actually
nothing is done. So this patch checks use_dma_api first.
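
The effect of checking the flag once, before the loop, can be seen in a
small userspace model. It is a sketch only: both functions are
hypothetical and merely count how often a per-descriptor helper would be
entered; they do not reproduce the kernel functions named above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Before: the helper is entered for every descriptor, even as a no-op. */
static size_t helper_calls_unhoisted(bool use_dma_api, size_t ndesc)
{
	size_t calls = 0;

	for (size_t i = 0; i < ndesc; i++) {
		calls++;		/* enter the helper */
		if (!use_dma_api)
			continue;	/* helper returns without doing work */
		/* the real helper would unmap descriptor i here */
	}
	return calls;
}

/* After: the flag is tested once, so the no-op calls disappear. */
static size_t helper_calls_hoisted(bool use_dma_api, size_t ndesc)
{
	size_t calls = 0;

	if (!use_dma_api)
		return 0;
	for (size_t i = 0; i < ndesc; i++)
		calls++;		/* helper does real work for each one */
	return calls;
}
```

For eight indirect descriptors without the DMA API, the first variant
enters the helper eight times and the second never does; with the DMA API
enabled both behave the same.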

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20230810123057.43407-2-xuanzhuo@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
10 months ago vdpa_sim: offer VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK
Eugenio Pérez [Fri, 9 Jun 2023 09:21:27 +0000 (11:21 +0200)]
vdpa_sim: offer VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK

Start offering the feature in the simulator.  Other parent drivers can
follow this code to offer it too.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Shannon Nelson <shannon.nelson@amd.com>
Message-Id: <20230609092127.170673-5-eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
10 months ago vdpa: add get_backend_features vdpa operation
Eugenio Pérez [Fri, 9 Jun 2023 09:21:26 +0000 (11:21 +0200)]
vdpa: add get_backend_features vdpa operation

This operation allows a vdpa parent to expose its own backend feature bits.

Next patches introduce a feature not compatible with all parent drivers:
the ability to enable vq after driver_ok.  Each parent must declare if
it allows it or not.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Shannon Nelson <shannon.nelson@amd.com>
Message-Id: <20230609092127.170673-4-eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
10 months ago vdpa: accept VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK backend feature
Eugenio Pérez [Fri, 9 Jun 2023 09:21:25 +0000 (11:21 +0200)]
vdpa: accept VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK backend feature

Accept the VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK backend feature if
userland sets it.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Shannon Nelson <shannon.nelson@amd.com>
Message-Id: <20230609092127.170673-3-eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
10 months ago vdpa: add VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK flag
Eugenio Pérez [Fri, 9 Jun 2023 09:21:24 +0000 (11:21 +0200)]
vdpa: add VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK flag

This feature flag allows the driver to enable virtqueues both before and
after DRIVER_OK.

This is needed for software-assisted live migration, so userland can
restore the device status in devices with a control virtqueue before the
dataplane is enabled.
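
How userland might consult such a flag can be sketched as a plain bitmask
test. Everything here is an assumption made for illustration: the macro
and function names are local to this note, and the bit position 6 should
be checked against the kernel's uapi headers rather than taken from this
sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position, defined locally for this sketch only. */
#define MODEL_F_ENABLE_AFTER_DRIVER_OK_BIT 6

/* True when the backend feature mask advertises the flag. */
static bool backend_has_enable_after_driver_ok(uint64_t backend_features)
{
	return (backend_features &
		(1ULL << MODEL_F_ENABLE_AFTER_DRIVER_OK_BIT)) != 0;
}

/*
 * Enabling a virtqueue before DRIVER_OK is always fine in this model;
 * enabling it afterwards requires the backend to offer the flag.
 */
static bool vq_enable_allowed(uint64_t backend_features, bool driver_ok)
{
	return !driver_ok ||
	       backend_has_enable_after_driver_ok(backend_features);
}
```

In the live-migration case described above, the control virtqueue would be
enabled first, the device state restored through it, and the dataplane
queues enabled only afterwards, which is the path that needs the flag.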

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Shannon Nelson <shannon.nelson@amd.com>
Message-Id: <20230609092127.170673-2-eperezma@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
10 months ago vdpa/mlx5: Remove unused function declarations
Yue Haibing [Thu, 3 Aug 2023 14:30:41 +0000 (22:30 +0800)]
vdpa/mlx5: Remove unused function declarations

Commit 29064bfdabd5 ("vdpa/mlx5: Add support library for mlx5 VDPA implementation")
declared but never implemented these.

Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Message-Id: <20230803143041.23388-1-yuehaibing@huawei.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
10 months ago Merge tag 'dmaengine-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul...
Linus Torvalds [Sun, 3 Sep 2023 17:49:42 +0000 (10:49 -0700)]
Merge tag 'dmaengine-6.6-rc1' of git://git./linux/kernel/git/vkoul/dmaengine

Pull dmaengine updates from Vinod Koul:
 "New controller support and updates to drivers.

  New support:
   - Qualcomm SM6115 and QCM2290 dmaengine support
   - at_xdma support for microchip,sam9x7 controller

  Updates:
   - idxd updates for wq simplification and ats knob updates
   - fsl edma updates for v3 support
   - Xilinx AXI4-Stream control support
   - Yaml conversion for bcm dma binding"

* tag 'dmaengine-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (53 commits)
  dmaengine: fsl-edma: integrate v3 support
  dt-bindings: fsl-dma: fsl-edma: add edma3 compatible string
  dmaengine: fsl-edma: move tcd into struct fsl_dma_chan
  dmaengine: fsl-edma: refactor chan_name setup and safety
  dmaengine: fsl-edma: move clearing of register interrupt into setup_irq function
  dmaengine: fsl-edma: refactor using devm_clk_get_enabled
  dmaengine: fsl-edma: simply ATTR_DSIZE and ATTR_SSIZE by using ffs()
  dmaengine: fsl-edma: move common IRQ handler to common.c
  dmaengine: fsl-edma: Remove enum edma_version
  dmaengine: fsl-edma: transition from bool fields to bitmask flags in drvdata
  dmaengine: fsl-edma: clean up EXPORT_SYMBOL_GPL in fsl-edma-common.c
  dmaengine: fsl-edma: fix build error when arch is s390
  dmaengine: idxd: Fix issues with PRS disable sysfs knob
  dmaengine: idxd: Allow ATS disable update only for configurable devices
  dmaengine: xilinx_dma: Program interrupt delay timeout
  dmaengine: xilinx_dma: Use tasklet_hi_schedule for timing critical usecase
  dmaengine: xilinx_dma: Freeup active list based on descriptor completion bit
  dmaengine: xilinx_dma: Increase AXI DMA transaction segment count
  dmaengine: xilinx_dma: Pass AXI4-Stream control words to dma client
  dt-bindings: dmaengine: xilinx_dma: Add xlnx,irq-delay property
  ...

10 months ago Merge tag 'phy-for-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy
Linus Torvalds [Sun, 3 Sep 2023 17:38:02 +0000 (10:38 -0700)]
Merge tag 'phy-for-6.6' of git://git./linux/kernel/git/phy/linux-phy

Pull phy updates from Vinod Koul:
 "As usual a couple of new drivers, a bunch of new device support and
  few updates to existing drivers

  New Support:
   - Starfive dphy rx, JH7110 usb and pcie support
   - Rockchip rv1126 inno-dsi phy, rk3588 usb and pcie support
   - Qualcomm sa8775p PCIe support, M31 USB PHY driver
   - Samsung Exynos850 usb support

  Updates:
   - Mediatek dsi driver clock updates
   - Qualcomm sm8150 combo phy with reworking of qmp pcie driver
   - Xilinx zynqmp runtime PM support"

* tag 'phy-for-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy: (83 commits)
  phy: exynos5-usbdrd: Add Exynos850 support
  phy: exynos5-usbdrd: Add 26MHz ref clk support
  phy: exynos5-usbdrd: Make it possible to pass custom phy ops
  dt-bindings: phy: samsung,usb3-drd-phy: Add Exynos850 support
  phy: qcom-qmp-combo: fix clock probing
  phy: qcom-qmp-pcie: support SM8150 PCIe QMP PHYs
  phy: qcom-qmp-pcie: populate offsets configuration
  phy: qcom-qmp-pcie: simplify clock handling
  phy: qcom-qmp-pcie: keep offset tables sorted
  phy: qcom-qmp-pcie: drop ln_shrd from v5_20 config
  dt-bindings: phy: qcom,qmp-pcie: describe SM8150 PCIe PHYs
  dt-bindings: phy: migrate QMP PCIe PHY bindings to qcom,sc8280xp-qmp-pcie-phy.yaml
  phy: fsl-imx8mq-usb: add dev_err_probe if getting vbus failed
  phy: qcom: Introduce M31 USB PHY driver
  dt-bindings: phy: qcom,m31: Document qcom,m31 USB phy
  phy: rockchip: inno-dsidphy: Add rv1126 support
  dt-bindings: phy: rockchip-inno-dsidphy: Document rv1126
  dt-bindings: phy: mediatek,tphy: allow simple nodename pattern
  phy: amlogic: meson-g12a-usb2: fix Wvoid-pointer-to-enum-cast warning
  phy: marvell pxa-usb: fix Wvoid-pointer-to-enum-cast warning
  ...

10 months ago Merge tag 'soundwire-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul...
Linus Torvalds [Sun, 3 Sep 2023 17:20:57 +0000 (10:20 -0700)]
Merge tag 'soundwire-6.6-rc1' of git://git./linux/kernel/git/vkoul/soundwire

Pull soundwire updates from Vinod Koul:
 "Device numbering and intel driver changes are main features:

   - Core support for soundwire device number allocation

   - intel driver updates for adding hw_params for DAI ops, hybrid
     number allocation and power management callback updates

   - DT header include changes for subsystem"

* tag 'soundwire-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire:
  soundwire: intel_ace2x: add DAI hw_params/prepare/hw_free callbacks
  soundwire: intel_auxdevice: add hybrid IDA-based device_number allocation
  soundwire: bus: add callbacks for device_number allocation
  soundwire: extend parameters of new_peripheral_assigned() callback
  soundWire: intel_auxdevice: resume 'sdw-master' on startup and system resume
  soundwire: intel_auxdevice: enable pm_runtime earlier on startup
  soundwire: Explicitly include correct DT includes

10 months ago kbuild: Show marked Kconfig fragments in "help"
Kees Cook [Thu, 31 Aug 2023 19:13:39 +0000 (12:13 -0700)]
kbuild: Show marked Kconfig fragments in "help"

Currently the Kconfig fragments in kernel/configs and arch/*/configs
that aren't used internally aren't discoverable through "make help",
which consists of hard-coded lists of config fragments. Instead, list
all the fragment targets that have a "# Help: " comment prefix so the
targets can be generated dynamically.

Add logic to the Makefile to search for and display the fragment and
comment. Add comments to fragments that are intended to be direct targets.
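
The matching rule for the "# Help: " prefix can be illustrated in C. This
is not the Makefile logic the commit adds, only a sketch of the rule
itself; the function name and the sample strings used with it are
hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the help text of a fragment line that starts with "# Help: ",
 * or NULL when the line does not carry the prefix.
 */
static const char *fragment_help_text(const char *line)
{
	static const char prefix[] = "# Help: ";
	size_t n = sizeof(prefix) - 1;	/* length without the NUL */

	if (strncmp(line, prefix, n) != 0)
		return NULL;
	return line + n;
}
```

A fragment whose first line reads "# Help: Example fragment" would be
listed with the text "Example fragment", while a fragment with no such
line stays out of the generated list.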

Signed-off-by: Kees Cook <keescook@chromium.org>
Co-developed-by: Masahiro Yamada <masahiroy@kernel.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
10 months ago Merge tag 'mtd/for-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux
Linus Torvalds [Sun, 3 Sep 2023 16:59:53 +0000 (09:59 -0700)]
Merge tag 'mtd/for-6.6' of git://git./linux/kernel/git/mtd/linux

Pull MTD updates from Miquel Raynal:
 "Core MTD changes:
   - Use refcount to prevent corruption
   - Call external _get and _put in right order
   - Fix use-after-free in mtd release
   - Explicitly include correct DT includes
   - Clean refcounting with MTD_PARTITIONED_MASTER
   - mtdblock: make warning messages ratelimited
   - dt-bindings: Add SEAMA partition bindings

  Device driver changes:
   - Use devm helper functions
   - Fix questionable cast, remove pointless ones.
   - error handling fixes
   - add support for new chip versions
   - update DT bindings
   - misc cleanups - fix typos, whitespace, indentation"

* tag 'mtd/for-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux: (105 commits)
  dt-bindings: mtd: amlogic,meson-nand: drop unneeded quotes
  mtd: spear_smi: Use helper function devm_clk_get_enabled()
  mtd: rawnand: orion: Use helper function devm_clk_get_optional_enabled()
  mtd: rawnand: vf610_nfc: Use helper function devm_clk_get_enabled()
  mtd: rawnand: sunxi: Use helper function devm_clk_get_enabled()
  mtd: rawnand: stm32_fmc2: Use helper function devm_clk_get_enabled()
  mtd: rawnand: mtk: Use helper function devm_clk_get_enabled()
  mtd: rawnand: mpc5121: Use helper function devm_clk_get_enabled()
  mtd: rawnand: lpc32xx_slc: Use helper function devm_clk_get_enabled()
  mtd: rawnand: intel: Use helper function devm_clk_get_enabled()
  mtd: rawnand: fsmc: Use helper function devm_clk_get_enabled()
  mtd: rawnand: arasan: Use helper function devm_clk_get_enabled()
  mtd: rawnand: qcom: Add read/read_start ops in exec_op path
  mtd: rawnand: qcom: Clear buf_count and buf_start in raw read
  mtd: maps: fix -Wvoid-pointer-to-enum-cast warning
  mtd: rawnand: fix -Wvoid-pointer-to-enum-cast warning
  mtd: rawnand: fsmc: handle clk prepare error in fsmc_nand_resume()
  mtd: rawnand: Propagate error and simplify ternary operators for brcmstb_nand_wait_for_completion()
  mtd: rawnand: qcom: Sort includes alphabetically
  mtd: rawnand: qcom: Do not override the error no of submit_descs()
  ...

10 months ago Merge tag 'f2fs-for-6-6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk...
Linus Torvalds [Sat, 2 Sep 2023 22:37:59 +0000 (15:37 -0700)]
Merge tag 'f2fs-for-6-6-rc1' of git://git./linux/kernel/git/jaegeuk/f2fs

Pull f2fs updates from Jaegeuk Kim:
 "In this cycle, we don't have a highlighted feature enhancement, but
  mostly have fixed issues mainly in two parts: 1) zoned block device,
  and 2) compression support.

  For zoned block device, we've tried to improve the power-off recovery
  flow as much as possible. For compression, we found some corner cases
  caused by wrong compression policy and logic. Other than them, there
  were some reverts and stat corrections.

  Bug fixes:
   - use finish zone command when closing a zone
   - check zone type before sending async reset zone command
   - fix to assign compress_level for lz4 correctly
   - fix error path of f2fs_submit_page_read()
   - don't {,de}compress non-full cluster
   - send small discard commands during checkpoint back
   - flush inode if atomic file is aborted
   - correct to account gc/cp stats

  And, there are minor bug fixes, avoiding false lockdep warning, and
  clean-ups"

* tag 'f2fs-for-6-6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (25 commits)
  f2fs: use finish zone command when closing a zone
  f2fs: compress: fix to assign compress_level for lz4 correctly
  f2fs: fix error path of f2fs_submit_page_read()
  f2fs: clean up error handling in sanity_check_{compress_,}inode()
  f2fs: avoid false alarm of circular locking
  Revert "f2fs: do not issue small discard commands during checkpoint"
  f2fs: doc: fix description of max_small_discards
  f2fs: should update REQ_TIME for direct write
  f2fs: fix to account cp stats correctly
  f2fs: fix to account gc stats correctly
  f2fs: remove unneeded check condition in __f2fs_setxattr()
  f2fs: fix to update i_ctime in __f2fs_setxattr()
  Revert "f2fs: fix to do sanity check on extent cache correctly"
  f2fs: increase usage of folio_next_index() helper
  f2fs: Only lfs mode is allowed with zoned block device feature
  f2fs: check zone type before sending async reset zone command
  f2fs: compress: don't {,de}compress non-full cluster
  f2fs: allow f2fs_ioc_{,de}compress_file to be interrupted
  f2fs: don't reopen the main block device in f2fs_scan_devices
  f2fs: fix to avoid mmap vs set_compress_option case
  ...

10 months ago mm/kmemleak: move up cond_resched() call in page scanning loop
Waiman Long [Fri, 25 Aug 2023 16:49:47 +0000 (12:49 -0400)]
mm/kmemleak: move up cond_resched() call in page scanning loop

Commit bde5f6bc68db ("kmemleak: add scheduling point to kmemleak_scan()")
added a cond_resched() call to the struct page scanning loop to prevent
soft lockup from happening.  However, soft lockup can still happen in that
loop in some corner cases when the pages that satisfy the "!(pfn & 63)"
check are skipped for some reason.

Fix this corner case by moving up the cond_resched() check so that it will
be called every 64 pages unconditionally.
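
The corner case can be reproduced with a userspace model of the loop. It
is a sketch under assumptions: the function names are invented, a counter
stands in for cond_resched(), and a callback stands in for the kernel's
reasons to skip a page.

```c
#include <stdbool.h>
#include <stddef.h>

typedef bool (*skip_fn)(unsigned long pfn);

/* Old placement: the yield check is only reached by pages that are scanned. */
static size_t yields_after_skip(unsigned long start, unsigned long end,
				skip_fn skip)
{
	size_t yields = 0;

	for (unsigned long pfn = start; pfn < end; pfn++) {
		if (skip(pfn))
			continue;
		/* page would be scanned here */
		if (!(pfn & 63))
			yields++;	/* stands in for cond_resched() */
	}
	return yields;
}

/* New placement: the yield check runs before any skip decision. */
static size_t yields_before_skip(unsigned long start, unsigned long end,
				 skip_fn skip)
{
	size_t yields = 0;

	for (unsigned long pfn = start; pfn < end; pfn++) {
		if (!(pfn & 63))
			yields++;	/* stands in for cond_resched() */
		if (skip(pfn))
			continue;
		/* page would be scanned here */
	}
	return yields;
}

/* Worst case: exactly the pages that pass the "& 63" test are skipped. */
static bool skip_every_64th(unsigned long pfn)
{
	return (pfn & 63) == 0;
}
```

Over pfns 0 to 255 with skip_every_64th, the old placement never yields,
while the new placement yields four times, once per 64 pages.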

Link: https://lkml.kernel.org/r/20230825164947.1317981-1-longman@redhat.com
Fixes: bde5f6bc68db ("kmemleak: add scheduling point to kmemleak_scan()")
Signed-off-by: Waiman Long <longman@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Yisheng Xie <xieyisheng1@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 months ago mm: page_alloc: remove stale CMA guard code
Johannes Weiner [Thu, 24 Aug 2023 15:38:21 +0000 (11:38 -0400)]
mm: page_alloc: remove stale CMA guard code

In the past, movable allocations could be disallowed from CMA through
PF_MEMALLOC_PIN.  As CMA pages are funneled through the MOVABLE pcplist,
this required filtering that cornercase during allocations, such that
pinnable allocations wouldn't accidentally get a CMA page.

However, since 8e3560d963d2 ("mm: honor PF_MEMALLOC_PIN for all movable
pages"), PF_MEMALLOC_PIN automatically excludes __GFP_MOVABLE.  Once
again, MOVABLE implies CMA is allowed.

Remove the stale filtering code.  Also remove a stale comment that was
introduced as part of the filtering code, because the filtering let
order-0 pages fall through to the buddy allocator.  See 1d91df85f399
("mm/page_alloc: handle a missing case for memalloc_nocma_{save/restore}
APIs") for context.  The comment's been obsolete since the introduction of
the explicit ALLOC_HIGHATOMIC flag in eb2e2b425c69 ("mm/page_alloc:
explicitly record high-order atomic allocations in alloc_flags").

Link: https://lkml.kernel.org/r/20230824153821.243148-1-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: David Hildenbrand <david@redhat.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 months ago MAINTAINERS: add rmap.h to mm entry
Baruch Siach [Thu, 24 Aug 2023 11:38:09 +0000 (14:38 +0300)]
MAINTAINERS: add rmap.h to mm entry

Make it easier to figure out where to send patches for this file.

Link: https://lkml.kernel.org/r/efbc7689d35a48ff402644d696aa9a8d8bb6333a.1692877089.git.baruch@tkos.co.il
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 months ago rmap: remove anon_vma_link() nommu stub
Baruch Siach [Thu, 24 Aug 2023 11:38:08 +0000 (14:38 +0300)]
rmap: remove anon_vma_link() nommu stub

anon_vma_link() is unused since commit 5beb49305251 ("mm: change anon_vma
linking to fix multi-process server scalability issue").

Link: https://lkml.kernel.org/r/cdce9b00c9ab15f6d02eddf40dcad537d3e9676f.1692877089.git.baruch@tkos.co.il
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 months ago proc/ksm: add ksm stats to /proc/pid/smaps
Stefan Roesch [Tue, 22 Aug 2023 18:05:39 +0000 (11:05 -0700)]
proc/ksm: add ksm stats to /proc/pid/smaps

With madvise and prctl KSM can be enabled for different VMA's.  Once it is
enabled we can query how effective KSM is overall.  However we cannot
easily query if an individual VMA benefits from KSM.

This commit adds a KSM section to the /proc/<pid>/smaps file.  It reports
how many of the pages are KSM pages.  Note that KSM-placed zeropages are
not included, only actual KSM pages.

Here is a typical output:

7f420a000000-7f421a000000 rw-p 00000000 00:00 0
Size:             262144 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
Rss:               51212 kB
Pss:                8276 kB
Shared_Clean:        172 kB
Shared_Dirty:      42996 kB
Private_Clean:       196 kB
Private_Dirty:      7848 kB
Referenced:        15388 kB
Anonymous:         51212 kB
KSM:               41376 kB
LazyFree:              0 kB
AnonHugePages:         0 kB
ShmemPmdMapped:        0 kB
FilePmdMapped:         0 kB
Shared_Hugetlb:        0 kB
Private_Hugetlb:       0 kB
Swap:             202016 kB
SwapPss:            3882 kB
Locked:                0 kB
THPeligible:    0
ProtectionKey:         0
ksm_state:          0
ksm_skip_base:      0
ksm_skip_count:     0
VmFlags: rd wr mr mw me nr mg anon

This information also helps with the following workflow:
- First enable KSM for all the VMA's of a process with prctl.
- Then analyze with the above smaps report which VMA's benefit the most
- Change the application (if possible) to add the corresponding madvise
calls for the VMA's that benefit the most
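
The analysis step of this workflow amounts to reading the "KSM:" line of
each VMA and comparing it with "Rss:". A minimal sketch, assuming the line
format shown in the sample output above; both function names are
hypothetical.

```c
#include <stdio.h>

/*
 * Return the kB value of a "KSM:" smaps line, or -1 when the line is
 * some other field (the match is case-sensitive, so "ksm_state:" does
 * not qualify).
 */
static long smaps_ksm_kb(const char *line)
{
	long kb;

	if (sscanf(line, "KSM: %ld kB", &kb) == 1)
		return kb;
	return -1;
}

/* Share of resident memory that is KSM pages, in whole percent. */
static long ksm_share_percent(long ksm_kb, long rss_kb)
{
	if (rss_kb <= 0 || ksm_kb < 0)
		return 0;
	return ksm_kb * 100 / rss_kb;
}
```

For the sample VMA above, 41376 kB of KSM pages against 51212 kB of Rss
comes to 80 percent, which would mark it as a strong madvise candidate.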

[shr@devkernel.io: v5]
Link: https://lkml.kernel.org/r/20230823170107.1457915-1-shr@devkernel.io
Link: https://lkml.kernel.org/r/20230822180539.1424843-1-shr@devkernel.io
Signed-off-by: Stefan Roesch <shr@devkernel.io>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 months ago mm/hwpoison: rename hwp_walk* to hwpoison_walk*
Jiaqi Yan [Thu, 13 Jul 2023 23:55:53 +0000 (23:55 +0000)]
mm/hwpoison: rename hwp_walk* to hwpoison_walk*

In the discussion of "Improve hugetlbfs read on HWPOISON hugepages" [1],
Matthew Wilcox suggests hwp is a bad abbreviation of hwpoison, as hwp is
already used as "an acronym by acpi, intel_pstate, some clock drivers, an
ethernet driver, and a scsi driver"[1].

So rename hwp_walk and hwp_walk_ops to hwpoison_walk and
hwpoison_walk_ops respectively.

raw_hwp_(page|list), *_raw_hwp, and raw_hwp_unreliable flag are other
major appearances of "hwp".  However, given the "raw" hint in the name, it
is easy to differentiate them from other "hwp" acronyms.  Since renaming
them is not as straightforward as renaming hwp_walk*, they are not covered
by this commit.

[1] https://lore.kernel.org/lkml/20230707201904.953262-5-jiaqiyan@google.com/T/#me6fecb8ce1ad4d5769199c9e162a44bc88f7bdec

Link: https://lkml.kernel.org/r/20230713235553.4121855-1-jiaqiyan@google.com
Signed-off-by: Jiaqi Yan <jiaqiyan@google.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 months ago mm: memory-failure: add PageOffline() check
Miaohe Lin [Thu, 27 Jul 2023 11:56:43 +0000 (19:56 +0800)]
mm: memory-failure: add PageOffline() check

Memory failure is not interested in logically offlined pages.  Skip this
type of page.

Link: https://lkml.kernel.org/r/20230727115643.639741-5-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
10 months ago Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds [Sat, 2 Sep 2023 19:02:41 +0000 (12:02 -0700)]
Merge tag 'scsi-misc' of git://git./linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "Updates to the usual drivers (ufs, lpfc, qla2xxx, mpi3mr, libsas) and
  the usual minor updates and bug fixes but no significant core changes"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (116 commits)
  scsi: storvsc: Handle additional SRB status values
  scsi: libsas: Delete sas_ata_task.retry_count
  scsi: libsas: Delete sas_ata_task.stp_affil_pol
  scsi: libsas: Delete sas_ata_task.set_affil_pol
  scsi: libsas: Delete sas_ssp_task.task_prio
  scsi: libsas: Delete sas_ssp_task.enable_first_burst
  scsi: libsas: Delete sas_ssp_task.retry_count
  scsi: libsas: Delete struct scsi_core
  scsi: libsas: Delete enum sas_phy_type
  scsi: libsas: Delete enum sas_class
  scsi: libsas: Delete sas_ha_struct.lldd_module
  scsi: target: Fix write perf due to unneeded throttling
  scsi: lpfc: Do not abuse UUID APIs and LPFC_COMPRESS_VMID_SIZE
  scsi: pm8001: Remove unused declarations
  scsi: fcoe: Fix potential deadlock on &fip->ctlr_lock
  scsi: elx: sli4: Remove code duplication
  scsi: bfa: Replace one-element array with flexible-array member in struct fc_rscn_pl_s
  scsi: qla2xxx: Remove unused declarations
  scsi: pmcraid: Use pci_dev_id() to simplify the code
  scsi: pm80xx: Set RETFIS when requested by libsas
  ...

10 months ago Merge tag 'probes-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux...
Linus Torvalds [Sat, 2 Sep 2023 18:10:50 +0000 (11:10 -0700)]
Merge tag 'probes-v6.6' of git://git./linux/kernel/git/trace/linux-trace

Pull probes updates from Masami Hiramatsu:

 - kprobes: use struct_size() for variable size kretprobe_instance data
   structure.

 - eprobe: Simplify trace_eprobe list iteration.

 - probe events: Data structure field access support on BTF argument.

     - Update BTF argument support on the functions in the kernel
       loadable modules (only loaded modules are supported).

     - Move generic BTF access function (search function prototype and
       get function parameters) to a separated file.

     - Add a function to search a member of data structure in BTF.

     - Support accessing BTF data structure member from probe args by
       C-like arrow('->') and dot('.') operators. e.g.
          't sched_switch next=next->pid vruntime=next->se.vruntime'

     - Support accessing BTF data structure member from $retval. e.g.
          'f getname_flags%return +0($retval->name):string'

     - Add string type checking if BTF type info is available. This
       rejects the probe if the user specifies the ":string" type for a
       non-"char pointer" type.

     - Automatically assume the fprobe event as a function return event
       if $retval is used.

 - selftests/ftrace: Add BTF data field access test cases.

 - Documentation: Update fprobe event example with BTF data field.
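
The struct_size() item in the kprobes entry above concerns sizing a
structure that ends in a variable-length data area. The idea can be
sketched in userspace as follows; the struct and function here are
stand-ins invented for this note, not the kernel's kretprobe_instance or
its macro.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structure with a trailing flexible array member. */
struct instance_model {
	void *owner;
	char data[];
};

/*
 * Header size plus data_len bytes, saturating to SIZE_MAX instead of
 * wrapping around when the sum would overflow.
 */
static size_t instance_model_size(size_t data_len)
{
	size_t base = sizeof(struct instance_model);

	if (data_len > SIZE_MAX - base)
		return SIZE_MAX;
	return base + data_len;
}
```

Saturating rather than wrapping means an absurd length yields a request
that an allocator will refuse, instead of a small buffer that is later
overrun.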

* tag 'probes-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  Documentation: tracing: Update fprobe event example with BTF field
  selftests/ftrace: Add BTF fields access testcases
  tracing/fprobe-event: Assume fprobe is a return event by $retval
  tracing/probes: Add string type check with BTF
  tracing/probes: Support BTF field access from $retval
  tracing/probes: Support BTF based data structure field access
  tracing/probes: Add a function to search a member of a struct/union
  tracing/probes: Move finding func-proto API and getting func-param API to trace_btf
  tracing/probes: Support BTF argument on module functions
  tracing/eprobe: Iterate trace_eprobe directly
  kernel: kprobes: Use struct_size()

10 months ago Merge tag 'trace-v6.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux...
Linus Torvalds [Sat, 2 Sep 2023 17:50:54 +0000 (10:50 -0700)]
Merge tag 'trace-v6.6-2' of git://git./linux/kernel/git/trace/linux-trace

Pull more tracing updates from Steven Rostedt:
 "Tracing fixes and clean ups:

   - Replace strlcpy() with strscpy()

   - Initialize the pipe cpumask to zero on allocation

   - Use within_module() instead of open coding it

   - Remove extra space in hwlat_detector/mode output

   - Use LIST_HEAD() instead of open coding it

   - A bunch of clean ups and fixes for the cpumask filter

   - Set local da_mon_##name to static

   - Fix race in snapshot buffer between cpu write and swap"

* tag 'trace-v6.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing/filters: Fix coding style issues
  tracing/filters: Change parse_pred() cpulist ternary into an if block
  tracing/filters: Fix double-free of struct filter_pred.mask
  tracing/filters: Fix error-handling of cpulist parsing buffer
  tracing: Zero the pipe cpumask on alloc to avoid spurious -EBUSY
  ftrace: Use LIST_HEAD to initialize clear_hash
  ftrace: Use within_module to check rec->ip within specified module.
  tracing: Replace strlcpy with strscpy in trace/events/task.h
  tracing: Fix race issue between cpu buffer write and swap
  tracing: Remove extra space at the end of hwlat_detector/mode
  rv: Set variable 'da_mon_##name' to static

10 months agoMerge tag 'pstore-v6.6-rc1-fix' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 2 Sep 2023 17:45:17 +0000 (10:45 -0700)]
Merge tag 'pstore-v6.6-rc1-fix' of git://git./linux/kernel/git/kees/linux

Pull pstore fix from Kees Cook:

 - Adjust sizes of buffers to avoid uncompress failures (Ard
   Biesheuvel)

* tag 'pstore-v6.6-rc1-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  pstore: Base compression input buffer size on estimated compressed size

10 months agoMerge tag 'x86-urgent-2023-09-02' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 2 Sep 2023 16:07:27 +0000 (09:07 -0700)]
Merge tag 'x86-urgent-2023-09-02' of git://git./linux/kernel/git/tip/tip

Pull x86 selftest fix from Ingo Molnar:
 "Fix the __NR_map_shadow_stack syscall-renumbering fallout in the x86
  self-test code.

  [ Arguably the existing code was unnecessarily fragile, and tooling
    should have picked up the new syscall number, and a wider fix is
    being worked on - but meanwhile, let's not have the old syscall
    number in the kernel tree. ]"

* tag 'x86-urgent-2023-09-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  selftests/x86: Update map_shadow_stack syscall nr

10 months agoMerge tag 'timers-urgent-2023-09-02' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 2 Sep 2023 16:01:48 +0000 (09:01 -0700)]
Merge tag 'timers-urgent-2023-09-02' of git://git./linux/kernel/git/tip/tip

Pull timer fix from Ingo Molnar:
 "Fix false positive 'softirq work is pending' messages on -rt kernels,
  caused by a buggy factoring-out of existing code"

* tag 'timers-urgent-2023-09-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  tick/rcu: Fix false positive "softirq work is pending" messages

10 months agoMerge tag 'smp-urgent-2023-09-02' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 2 Sep 2023 15:58:49 +0000 (08:58 -0700)]
Merge tag 'smp-urgent-2023-09-02' of git://git./linux/kernel/git/tip/tip

Pull CPU hotplug fix from Ingo Molnar:
 "Fix a CPU hotplug related deadlock between the task which initiates
  and controls a CPU hot-unplug operation vs. the CFS bandwidth timer"

* tag 'smp-urgent-2023-09-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  cpu/hotplug: Prevent self deadlock on CPU hot-unplug

10 months agoMerge tag 'sched-urgent-2023-09-02' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 2 Sep 2023 15:49:08 +0000 (08:49 -0700)]
Merge tag 'sched-urgent-2023-09-02' of git://git./linux/kernel/git/tip/tip

Pull scheduler fixes from Ingo Molnar:
 "Miscellaneous scheduler fixes: a reporting fix, a static symbol fix,
  and a kernel-doc fix"

* tag 'sched-urgent-2023-09-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/core: Report correct state for TASK_IDLE | TASK_FREEZABLE
  sched/fair: Make update_entity_lag() static
  sched/core: Add kernel-doc for set_cpus_allowed_ptr()

10 months agomm/pagewalk: fix bootstopping regression from extra pte_unmap()
Hugh Dickins [Sat, 2 Sep 2023 15:29:30 +0000 (08:29 -0700)]
mm/pagewalk: fix bootstopping regression from extra pte_unmap()

Mikhail reports early-6.6-based Fedora Rawhide not booting: "rcu_preempt
detected expedited stalls", a wait of several minutes, and then a hung_task
splat while a kworker is trying to synchronize_rcu_expedited().  Nothing
logged to disk.

He bisected to my 6.6 a349d72fd9ef ("mm/pgtable: add rcu_read_lock() and
rcu_read_unlock()s"): but the one to blame is my 6.5 commit to fix the
espfix "bad pmd" warnings when booting x86_64 with CONFIG_EFI_PGT_DUMP=y.

Gaah, that added an "addr >= TASK_SIZE" check to avoid pte_offset_map(),
but failed to add the equivalent check when choosing to pte_unmap().

It's not a problem on 6.5 (for different reasons, it's harmless on both
64-bit and 32-bit), but becomes a bootstopper on 6.6 with the unbalanced
rcu_read_unlock() - RCU has a WARN_ON_ONCE for that, but it would have
scrolled off Mikhail's console too quickly.
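
The imbalance can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `SKETCH_TASK_SIZE`, `sketch_pte_map()`, `sketch_pte_unmap()` and `sketch_walk()` are hypothetical stand-ins, and a plain counter plays the role of the RCU read-side nesting depth. It is not the pagewalk code itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these names are kernel APIs. */
#define SKETCH_TASK_SIZE 0x1000UL

static int sketch_rcu_depth; /* models the RCU read-side nesting depth */

static void sketch_pte_map(void)   { sketch_rcu_depth++; } /* takes the read lock */
static void sketch_pte_unmap(void) { sketch_rcu_depth--; } /* drops the read lock */

/*
 * Walk one address and return the nesting depth afterwards. When
 * check_on_unmap is false the unmap is unconditional, as in the buggy
 * version; when true, the unmap uses the same test as the map.
 */
static int sketch_walk(unsigned long addr, bool check_on_unmap)
{
	bool mapped = addr < SKETCH_TASK_SIZE;

	sketch_rcu_depth = 0;
	if (mapped)
		sketch_pte_map();
	/* ... walk the entries ... */
	if (!check_on_unmap || mapped)
		sketch_pte_unmap();
	return sketch_rcu_depth;
}
```

In this model, a walk at or above the threshold with the unconditional unmap ends with a depth of -1, which corresponds to the unbalanced unlock; with the matching check the depth returns to 0.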

Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Closes: https://lore.kernel.org/linux-mm/CABXGCsNi8Tiv5zUPNXr6UJw6qV1VdaBEfGqEAMkkXE3QPvZuAQ@mail.gmail.com/
Fixes: 8b1cb4a2e819 ("mm/pagewalk: fix EFI_PGT_DUMP of espfix area")
Fixes: a349d72fd9ef ("mm/pgtable: add rcu_read_lock() and rcu_read_unlock()s")
Signed-off-by: Hugh Dickins <hughd@google.com>
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
10 months agocgroup: fix build when CGROUP_SCHED is not enabled
Linus Torvalds [Sat, 2 Sep 2023 15:27:17 +0000 (08:27 -0700)]
cgroup: fix build when CGROUP_SCHED is not enabled

Sudip Mukherjee reports that the mips sb1250_swarm_defconfig build fails
with the current kernel.  It isn't actually MIPS-specific, it's just
that that defconfig does not have CGROUP_SCHED enabled like most configs
do, and as such shows this error:

  kernel/cgroup/cgroup.c: In function 'cgroup_local_stat_show':
  kernel/cgroup/cgroup.c:3699:15: error: implicit declaration of function 'cgroup_tryget_css'; did you mean 'cgroup_tryget'? [-Werror=implicit-function-declaration]
   3699 |         css = cgroup_tryget_css(cgrp, ss);
        |               ^~~~~~~~~~~~~~~~~
        |               cgroup_tryget
  kernel/cgroup/cgroup.c:3699:13: warning: assignment to 'struct cgroup_subsys_state *' from 'int' makes pointer from integer without a cast [-Wint-conversion]
   3699 |         css = cgroup_tryget_css(cgrp, ss);
        |             ^

because cgroup_tryget_css() only exists when CGROUP_SCHED is enabled,
and the cgroup_local_stat_show() function should similarly be guarded by
that config option.

Move things around a bit to fix this all.
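
The shape of the fix can be sketched as follows. This is an illustration only: `SKETCH_CGROUP_SCHED`, `sketch_tryget_css()` and `sketch_local_stat_show()` are hypothetical names standing in for the config option and the two functions, and the bodies are placeholders.

```c
#include <assert.h>

/* Hypothetical switch standing in for the config option. */
#define SKETCH_CGROUP_SCHED 1

#if SKETCH_CGROUP_SCHED
/* Helper that only exists when the option is enabled. */
static int sketch_tryget_css(int id)
{
	return id + 1; /* placeholder result */
}

/*
 * The only caller lives under the same guard, so it can never
 * reference a helper that was compiled out.
 */
static int sketch_local_stat_show(int id)
{
	return sketch_tryget_css(id);
}
#endif
```

With the switch set to 0 both functions disappear together, so no implicit declaration can occur.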

Fixes: d1d4ff5d11a5 ("cgroup: put cgroup_tryget_css() inside CONFIG_CGROUP_SCHED")
Reported-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
10 months agofbdev/g364fb: fix build failure with mips
Sudip Mukherjee [Sat, 2 Sep 2023 09:51:02 +0000 (10:51 +0100)]
fbdev/g364fb: fix build failure with mips

Fix the typo which resulted in the driver using FB_DEFAULT_IOMEM_HELPERS
instead of FB_DEFAULT_IOMEM_OPS as the fbdev I/O helpers.

Fixes: 501126083855 ("fbdev/g364fb: Use fbdev I/O helpers")
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
10 months agoMerge branch 'fixes' into misc
James Bottomley [Sat, 2 Sep 2023 07:25:19 +0000 (08:25 +0100)]
Merge branch 'fixes' into misc

10 months agotracing/filters: Fix coding style issues
Valentin Schneider [Fri, 1 Sep 2023 15:10:39 +0000 (17:10 +0200)]
tracing/filters: Fix coding style issues

Recent commits have introduced some coding style issues; fix those up.

Link: https://lkml.kernel.org/r/20230901151039.125186-5-vschneid@redhat.com
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Valentin Schneider <vschneid@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
10 months agotracing/filters: Change parse_pred() cpulist ternary into an if block
Valentin Schneider [Fri, 1 Sep 2023 15:10:38 +0000 (17:10 +0200)]
tracing/filters: Change parse_pred() cpulist ternary into an if block

Review comments noted that an if block would be clearer than a ternary, so
swap it out.

No change in behaviour intended.

Link: https://lkml.kernel.org/r/20230901151039.125186-4-vschneid@redhat.com
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Valentin Schneider <vschneid@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
10 months agotracing/filters: Fix double-free of struct filter_pred.mask
Valentin Schneider [Fri, 1 Sep 2023 15:10:37 +0000 (17:10 +0200)]
tracing/filters: Fix double-free of struct filter_pred.mask

When a cpulist filter is found to contain a single CPU, that CPU is saved
as a scalar and the backing cpumask storage is freed.

Also NULL the mask to avoid a double-free once we get down to
free_predicate().
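
A minimal userspace sketch of the free-then-NULL pattern described above. `struct sketch_pred`, `sketch_collapse()` and `sketch_free_pred()` are hypothetical, simplified stand-ins and not the kernel's `struct filter_pred` or `free_predicate()`.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified predicate; not the kernel's struct filter_pred. */
struct sketch_pred {
	unsigned long *mask; /* heap-allocated CPU bitmask, or NULL */
	int cpu;             /* scalar CPU, used when the mask held one bit */
};

/* Collapse a single-CPU mask into a scalar and release the storage. */
static void sketch_collapse(struct sketch_pred *pred, int cpu)
{
	assert(pred != NULL);
	pred->cpu = cpu;
	free(pred->mask);
	pred->mask = NULL; /* without this, sketch_free_pred() frees it again */
}

/* Final teardown; free(NULL) is a no-op, so a cleared mask is safe. */
static void sketch_free_pred(struct sketch_pred *pred)
{
	free(pred->mask);
	free(pred);
}
```

Clearing the pointer keeps the teardown path unconditional: it does not need to know whether the mask was already released.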

Link: https://lkml.kernel.org/r/20230901151039.125186-3-vschneid@redhat.com
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Valentin Schneider <vschneid@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
10 months agotracing/filters: Fix error-handling of cpulist parsing buffer
Valentin Schneider [Fri, 1 Sep 2023 15:10:36 +0000 (17:10 +0200)]
tracing/filters: Fix error-handling of cpulist parsing buffer

parse_pred() allocates a string buffer to parse the user-provided cpulist,
but doesn't check the allocation result nor does it free the buffer once it
is no longer needed.

Add an allocation check, and free the buffer as soon as it is no longer
needed.
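
The two changes (check the allocation, free the buffer once parsing is done) can be sketched in userspace like this. `sketch_count_entries()` is a hypothetical helper that just counts comma-separated entries; it is not parse_pred() and does not parse real cpulist ranges.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical parser: copies `len` bytes of user input into a temporary
 * NUL-terminated buffer, counts comma-separated entries, and returns the
 * count or -ENOMEM.
 */
static int sketch_count_entries(const char *input, size_t len)
{
	char *tmp;
	int count = 0;

	assert(input != NULL);

	tmp = malloc(len + 1);
	if (!tmp)               /* the allocation check */
		return -ENOMEM;
	memcpy(tmp, input, len);
	tmp[len] = '\0';

	for (char *tok = strtok(tmp, ","); tok; tok = strtok(NULL, ","))
		count++;

	free(tmp);              /* released as soon as parsing is done */
	return count;
}
```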

Link: https://lkml.kernel.org/r/20230901151039.125186-2-vschneid@redhat.com
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Reported-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Valentin Schneider <vschneid@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
10 months agotracing: Zero the pipe cpumask on alloc to avoid spurious -EBUSY
Brian Foster [Thu, 31 Aug 2023 12:55:00 +0000 (08:55 -0400)]
tracing: Zero the pipe cpumask on alloc to avoid spurious -EBUSY

The pipe cpumask used to serialize opens between the main and percpu
trace pipes is not zeroed or initialized. This can result in
spurious -EBUSY returns if underlying memory is not fully zeroed.
This has been observed as an immediate failure to read the main
trace_pipe file on an otherwise newly booted and idle system:

 # cat /sys/kernel/debug/tracing/trace_pipe
 cat: /sys/kernel/debug/tracing/trace_pipe: Device or resource busy

Zero the allocation of pipe_cpumask to avoid the problem.
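
Why a non-zeroed mask produces a spurious -EBUSY can be shown with a small model. This is a sketch under assumptions: one `unsigned long` stands in for the cpumask, and `sketch_alloc_mask()`, `sketch_open_main()` and `sketch_open_cpu()` are hypothetical names, not the tracing code.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define SKETCH_NR_CPUS  4
#define SKETCH_ALL_CPUS ((1UL << SKETCH_NR_CPUS) - 1)

/*
 * Allocate the open-tracking mask. calloc() guarantees no bit starts
 * set; malloc() would leave the contents indeterminate.
 */
static unsigned long *sketch_alloc_mask(void)
{
	return calloc(1, sizeof(unsigned long));
}

/* Open the main pipe: fails if any per-CPU pipe is already open. */
static int sketch_open_main(unsigned long *mask)
{
	assert(mask != NULL);
	if (*mask)
		return -EBUSY;
	*mask = SKETCH_ALL_CPUS;
	return 0;
}

/* Open one per-CPU pipe: fails if that CPU's bit is already taken. */
static int sketch_open_cpu(unsigned long *mask, int cpu)
{
	assert(mask != NULL && cpu >= 0 && cpu < SKETCH_NR_CPUS);
	if (*mask & (1UL << cpu))
		return -EBUSY;
	*mask |= 1UL << cpu;
	return 0;
}
```

If the mask came from an allocation that is not zeroed, a stray set bit would make the first `sketch_open_main()` fail even though nothing is open.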

Link: https://lore.kernel.org/linux-trace-kernel/20230831125500.986862-1-bfoster@redhat.com
Cc: stable@vger.kernel.org
Fixes: c2489bb7e6be ("tracing: Introduce pipe_cpumask to avoid race on trace_pipes")
Reviewed-by: Zheng Yejian <zhengyejian1@huawei.com>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
10 months agoftrace: Use LIST_HEAD to initialize clear_hash
Ruan Jinjie [Wed, 9 Aug 2023 07:15:51 +0000 (15:15 +0800)]
ftrace: Use LIST_HEAD to initialize clear_hash

Use LIST_HEAD() to initialize clear_hash instead of open-coding it.
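
For readers unfamiliar with the macro, here is a userspace re-creation of the list primitives showing the two forms side by side. The `sketch_*` functions are hypothetical and only exist to contrast the styles; they are not the ftrace code.

```c
#include <assert.h>

/* Userspace re-creation of the list primitives, for illustration only. */
struct list_head {
	struct list_head *next, *prev;
};

#define LIST_HEAD_INIT(name) { &(name), &(name) }
#define LIST_HEAD(name) struct list_head name = LIST_HEAD_INIT(name)

static void INIT_LIST_HEAD(struct list_head *list)
{
	list->next = list;
	list->prev = list;
}

static int list_empty(const struct list_head *head)
{
	return head->next == head;
}

/* Open-coded: declare, then initialize in a separate statement. */
static int sketch_open_coded(void)
{
	struct list_head clear_hash;

	INIT_LIST_HEAD(&clear_hash);
	return list_empty(&clear_hash);
}

/* With LIST_HEAD(): declaration and initialization in one line. */
static int sketch_with_macro(void)
{
	LIST_HEAD(clear_hash);

	return list_empty(&clear_hash);
}
```

Both functions end up with an empty list whose head points at itself; the macro form just cannot be left uninitialized by accident.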

Link: https://lore.kernel.org/linux-trace-kernel/20230809071551.913041-1-ruanjinjie@huawei.com
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
10 months agoftrace: Use within_module to check rec->ip within specified module.
Levi Yun [Thu, 3 Aug 2023 20:52:36 +0000 (21:52 +0100)]
ftrace: Use within_module to check rec->ip within specified module.

Checking within_module_core() together with within_module_init() is
equivalent to calling within_module(), which is more readable.

Use within_module() instead of the former condition to check whether
rec->ip lies within the specified module's area.
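
A userspace sketch of the idea, assuming an address belongs to a module when it falls in either its core region or its init region. `struct sketch_module` and the `sketch_*` helpers are hypothetical and much simpler than the real module layout.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified module layout; the real struct module differs. */
struct sketch_module {
	unsigned long core_base, core_size;
	unsigned long init_base, init_size;
};

static bool sketch_within(unsigned long addr, unsigned long base,
			  unsigned long size)
{
	return addr >= base && addr - base < size;
}

static bool sketch_within_module_core(unsigned long addr,
				      const struct sketch_module *mod)
{
	return sketch_within(addr, mod->core_base, mod->core_size);
}

static bool sketch_within_module_init(unsigned long addr,
				      const struct sketch_module *mod)
{
	return sketch_within(addr, mod->init_base, mod->init_size);
}

/* One helper covering both regions, so callers need a single test. */
static bool sketch_within_module(unsigned long addr,
				 const struct sketch_module *mod)
{
	return sketch_within_module_init(addr, mod) ||
	       sketch_within_module_core(addr, mod);
}
```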

Link: https://lore.kernel.org/linux-trace-kernel/20230803205236.32201-1-ppbuk5246@gmail.com
Signed-off-by: Levi Yun <ppbuk5246@gmail.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
10 months agotracing: Replace strlcpy with strscpy in trace/events/task.h
Azeem Shaikh [Thu, 31 Aug 2023 19:42:12 +0000 (19:42 +0000)]
tracing: Replace strlcpy with strscpy in trace/events/task.h

strlcpy() reads the entire source buffer first.
This read may exceed the destination size limit.
This is both inefficient and can lead to linear read
overflows if a source string is not NUL-terminated [1].
In an effort to remove strlcpy() completely [2], replace
strlcpy() here with strscpy().

No return values were used, so direct replacement is safe.
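
The difference in read behaviour can be sketched in userspace. `sketch_strscpy()` below is a hypothetical approximation written for illustration, not the kernel implementation: it reads at most `size` bytes of the source, always NUL-terminates a non-empty destination, and returns the copied length or -E2BIG on truncation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical userspace approximation of strscpy()-style semantics.
 * Unlike strlcpy(), it never scans the source beyond `size` bytes.
 */
static long sketch_strscpy(char *dst, const char *src, size_t size)
{
	size_t i;

	assert(dst != NULL && src != NULL);
	if (size == 0)
		return -E2BIG;

	for (i = 0; i < size; i++) {
		dst[i] = src[i];
		if (src[i] == '\0')
			return (long)i; /* whole string fit */
	}
	dst[size - 1] = '\0';           /* truncate, keep termination */
	return -E2BIG;
}
```

Callers that ignored the strlcpy() return value, as in this commit, are unaffected by the different return convention.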

[1] https://www.kernel.org/doc/html/latest/process/deprecated.html#strlcpy
[2] https://github.com/KSPP/linux/issues/89

Link: https://lore.kernel.org/linux-trace-kernel/20230831194212.1529941-1-azeemshaikh38@gmail.com
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Azeem Shaikh <azeemshaikh38@gmail.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
10 months agotracing: Fix race issue between cpu buffer write and swap
Zheng Yejian [Thu, 31 Aug 2023 13:27:39 +0000 (21:27 +0800)]
tracing: Fix race issue between cpu buffer write and swap

A warning was triggered in rb_end_commit() at this check:
if (RB_WARN_ON(cpu_buffer, !local_read(&cpu_buffer->committing)))

  WARNING: CPU: 0 PID: 139 at kernel/trace/ring_buffer.c:3142 rb_commit+0x402/0x4a0
  Call Trace:
   ring_buffer_unlock_commit+0x42/0x250
   trace_buffer_unlock_commit_regs+0x3b/0x250
   trace_event_buffer_commit+0xe5/0x440
   trace_event_buffer_reserve+0x11c/0x150
   trace_event_raw_event_sched_switch+0x23c/0x2c0
   __traceiter_sched_switch+0x59/0x80
   __schedule+0x72b/0x1580
   schedule+0x92/0x120
   worker_thread+0xa0/0x6f0

This is caused by a race between writing an event into the cpu buffer and
swapping the cpu buffer through the file per_cpu/cpu0/snapshot:

  Write on CPU 0             Swap buffer by per_cpu/cpu0/snapshot on CPU 1
  --------                   --------
                             tracing_snapshot_write()
                               [...]

  ring_buffer_lock_reserve()
    cpu_buffer = buffer->buffers[cpu]; // 1. Suppose find 'cpu_buffer_a';
    [...]
    rb_reserve_next_event()
      [...]

                               ring_buffer_swap_cpu()
                                 if (local_read(&cpu_buffer_a->committing))
                                     goto out_dec;
                                 if (local_read(&cpu_buffer_b->committing))
                                     goto out_dec;
                                 buffer_a->buffers[cpu] = cpu_buffer_b;
                                 buffer_b->buffers[cpu] = cpu_buffer_a;
                                 // 2. cpu_buffer has swapped here.

      rb_start_commit(cpu_buffer);
      if (unlikely(READ_ONCE(cpu_buffer->buffer)
          != buffer)) { // 3. This check passed due to 'cpu_buffer->buffer'
        [...]           //    has not changed here.
        return NULL;
      }
                                 cpu_buffer_b->buffer = buffer_a;
                                 cpu_buffer_a->buffer = buffer_b;
                                 [...]

      // 4. Reserve event from 'cpu_buffer_a'.

  ring_buffer_unlock_commit()
    [...]
    cpu_buffer = buffer->buffers[cpu]; // 5. Now find 'cpu_buffer_b' !!!
    rb_commit(cpu_buffer)
      rb_end_commit()  // 6. WARN for the wrong 'committing' state !!!

Based on the above analysis, the issue can be easily reproduced with the following testcase:
  ``` bash
  #!/bin/bash

  dmesg -n 7
  sysctl -w kernel.panic_on_warn=1
  TR=/sys/kernel/tracing
  echo 7 > ${TR}/buffer_size_kb
  echo "sched:sched_switch" > ${TR}/set_event
  while [ true ]; do
          echo 1 > ${TR}/per_cpu/cpu0/snapshot
  done &
  while [ true ]; do
          echo 1 > ${TR}/per_cpu/cpu0/snapshot
  done &
  while [ true ]; do
          echo 1 > ${TR}/per_cpu/cpu0/snapshot
  done &
  ```

To fix it, IIUC, we can use smp_call_function_single() to do the swap on
the target cpu where the buffer is located, so that the above race is
avoided.
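
The interleaving can be replayed deterministically in a single thread. The model below is a sketch with hypothetical, stripped-down structures (`sketch_buffer`, `sketch_cpu_buffer`); it also assumes that a swap running on the writer's own CPU completes entirely before the writer resumes, which is how the model represents the proposed fix.

```c
#include <assert.h>
#include <stdbool.h>

struct sketch_buffer;

/* Hypothetical stand-ins for the ring buffer structures. */
struct sketch_cpu_buffer {
	int committing;               /* models cpu_buffer->committing */
	struct sketch_buffer *buffer; /* models cpu_buffer->buffer */
};

struct sketch_buffer {
	struct sketch_cpu_buffer *cpu0; /* models buffer->buffers[0] */
};

enum sketch_result { SKETCH_OK, SKETCH_WRITE_DROPPED, SKETCH_WARN };

/* First half of the swap: exchange the per-CPU buffer pointers. */
static void sketch_swap_pointers(struct sketch_buffer *a,
				 struct sketch_buffer *b)
{
	struct sketch_cpu_buffer *tmp = a->cpu0;

	a->cpu0 = b->cpu0;
	b->cpu0 = tmp;
}

/* Second half of the swap: fix up the back-pointers. */
static void sketch_swap_backlinks(struct sketch_buffer *a,
				  struct sketch_buffer *b)
{
	a->cpu0->buffer = a;
	b->cpu0->buffer = b;
}

/* swap_on_writer_cpu == false replays steps 1-6 of the diagram. */
static enum sketch_result sketch_replay(bool swap_on_writer_cpu)
{
	struct sketch_buffer main_buf, snap_buf;
	struct sketch_cpu_buffer cb_a = { 0, &main_buf };
	struct sketch_cpu_buffer cb_b = { 0, &snap_buf };
	struct sketch_cpu_buffer *reserved, *committed;

	main_buf.cpu0 = &cb_a;
	snap_buf.cpu0 = &cb_b;

	reserved = main_buf.cpu0;                   /* 1. finds cpu_buffer_a */
	sketch_swap_pointers(&main_buf, &snap_buf); /* 2. committing is 0 on both */
	if (swap_on_writer_cpu)                     /* whole swap done at once */
		sketch_swap_backlinks(&main_buf, &snap_buf);

	reserved->committing++;                     /* rb_start_commit() */
	if (reserved->buffer != &main_buf) {        /* 3. swap detected */
		reserved->committing--;
		return SKETCH_WRITE_DROPPED;
	}
	if (!swap_on_writer_cpu)                    /* remote swap finishes late */
		sketch_swap_backlinks(&main_buf, &snap_buf);

	committed = main_buf.cpu0;                  /* 5. now cpu_buffer_b */
	if (committed->committing == 0)
		return SKETCH_WARN;                 /* 6. wrong 'committing' state */
	committed->committing--;
	return SKETCH_OK;
}
```

In this model the cross-CPU ordering ends in `SKETCH_WARN`, while the same-CPU ordering is caught by the existing check and the write is dropped instead.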

Link: https://lore.kernel.org/linux-trace-kernel/20230831132739.4070878-1-zhengyejian1@huawei.com
Cc: <mhiramat@kernel.org>
Fixes: f1affcaaa861 ("tracing: Add snapshot in the per_cpu trace directories")
Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>