Waiman Long [Tue, 14 Jun 2022 22:03:59 +0000 (18:03 -0400)]
mm/kmemleak: prevent soft lockup in first object iteration loop of kmemleak_scan()
The first RCU-based object iteration loop has to modify the object count.
So we cannot skip taking the object lock.
One way to avoid soft lockup is to insert an occasional cond_resched() call
into the loop. This cannot be done while holding the RCU read lock, which
protects objects from being freed. However, taking a reference to
the object will prevent it from being freed. We can then safely do a
cond_resched() call after every 64k objects.
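The idea above can be sketched in userspace C. This is a simplified model under stated assumptions, not the kernel code: `struct object`, `get_object()`, `put_object()` and `yield_cpu()` are hypothetical stand-ins for a kmemleak object, its reference helpers and cond_resched(), and the RCU read lock is modelled by a plain flag.

```c
#include <assert.h>
#include <stddef.h>

#define RESCHED_INTERVAL (64 * 1024)  /* "every 64k objects" from the text */

struct object {
    int use_count;  /* hypothetical refcount; 0 means the object is going away */
    int count;      /* per-scan counter that the first loop has to reset */
};

static int rcu_read_locked;   /* models holding the RCU read lock */
static unsigned long yields;  /* number of times the CPU was given up */

/* Stand-in for cond_resched(): must not run inside the RCU read section. */
static void yield_cpu(void)
{
    assert(!rcu_read_locked);
    yields++;
}

/* Take a reference; fails if the object is already being freed. */
static int get_object(struct object *obj)
{
    if (obj->use_count == 0)
        return 0;
    obj->use_count++;
    return 1;
}

static void put_object(struct object *obj)
{
    obj->use_count--;
}

/* First-loop model: reset every count, yielding every 64k objects. */
static void reset_counts(struct object *objs, size_t n)
{
    size_t i;

    rcu_read_locked = 1;
    for (i = 0; i < n; i++) {
        objs[i].count = 0;
        /* Pin the current object so it survives while the lock is dropped. */
        if ((i + 1) % RESCHED_INTERVAL == 0 && get_object(&objs[i])) {
            rcu_read_locked = 0;
            yield_cpu();
            rcu_read_locked = 1;
            put_object(&objs[i]);
        }
    }
    rcu_read_locked = 0;
}
```

The pinned object serves as the resume point: because it cannot be freed while the reference is held, the walk can continue from it after the read lock is taken again.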
Link: https://lkml.kernel.org/r/20220614220359.59282-4-longman@redhat.com
Signed-off-by: Waiman Long <longman@redhat.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Waiman Long [Tue, 14 Jun 2022 22:03:58 +0000 (18:03 -0400)]
mm/kmemleak: skip unlikely objects in kmemleak_scan() without taking lock
There are 3 RCU-based object iteration loops in kmemleak_scan(). Because
of the need to take RCU read lock, we can't insert cond_resched() into the
loop like other parts of the function. As there can be millions of
objects to be scanned, it takes a while to iterate all of them. The
kmemleak functionality is usually enabled in a debug kernel which is much
slower than a non-debug kernel. With a sufficient number of kmemleak
objects, the time to iterate them all may exceed 22s, causing a soft lockup.
watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [kmemleak:625]
In this particular bug report, the soft lockup happens in the 2nd iteration
loop.
In the 2nd and 3rd loops, most of the objects are checked and then skipped
under the object lock. Only a select few are modified. Those objects
certainly need lock protection. However, the lock/unlock operation is
slow, especially with interrupt disabling and enabling included.
We can actually do some basic checks like color_white() without taking the
lock and skip the object accordingly. Of course, this kind of check is
racy and may miss objects that are being modified concurrently. The cost
of missed objects, however, is just that they will be discovered in the
next scan instead. The advantage of doing so is that iteration can be
done much faster, especially with LOCKDEP enabled in a debug kernel.
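A minimal userspace sketch of this "check first without the lock, re-check under the lock" pattern follows. The struct layout, the simplified `color_white()` and the `lock_acquisitions` counter (which stands in for real lock/unlock calls) are assumptions made for illustration and are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

struct object {
    int count;      /* references found during the scan */
    int min_count;  /* references needed to be considered in use */
    int reported;   /* set once the object has been reported */
};

static unsigned long lock_acquisitions;  /* simulated lock/unlock pairs */

/* Simplified stand-in for color_white(): too few references were found. */
static int color_white(const struct object *obj)
{
    return obj->count < obj->min_count;
}

/* Returns how many objects were newly reported. */
static size_t report_unreferenced(struct object *objs, size_t n)
{
    size_t i, reported = 0;

    assert(objs != NULL || n == 0);
    for (i = 0; i < n; i++) {
        /* Racy check without the lock: skip the unlikely objects cheaply. */
        if (!color_white(&objs[i]))
            continue;

        lock_acquisitions++;  /* the object lock would be taken here */
        /* Re-check under the lock before modifying anything. */
        if (color_white(&objs[i]) && !objs[i].reported) {
            objs[i].reported = 1;
            reported++;
        }
        /* the object lock would be released here */
    }
    return reported;
}
```

Only the objects that pass the unlocked check pay for the lock, which is where the saving comes from when most objects are skipped.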
With a debug kernel running on a 2-socket 96-thread x86-64 system
(HZ=1000), the speedup of the 2nd and 3rd iteration loops with this patch
on the first kmemleak_scan() call after bootup is shown in the table below.
               Before patch                 After patch
Loop #   # of objects  Elapsed time   # of objects  Elapsed time
------   ------------  ------------   ------------  ------------
  2        2,599,850      2.392s        2,596,364      0.266s
  3        2,600,176      2.171s        2,597,061      0.260s
This patch reduces loop iteration times by about 88%. This will greatly
reduce the chance of a soft lockup happening in the 2nd or 3rd iteration
loops.
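The "about 88%" figure can be reproduced from the table with a small helper. `percent_reduction()` is a hypothetical name used only for this check.

```c
#include <assert.h>

/* Percentage by which `after` is smaller than `before`. */
static double percent_reduction(double before, double after)
{
    assert(before > 0.0);
    return (before - after) / before * 100.0;
}
```

With the measured times, percent_reduction(2.392, 0.266) is roughly 88.9 for loop 2 and percent_reduction(2.171, 0.260) is roughly 88.0 for loop 3.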
Even though the first loop runs a little bit faster, it can still be
problematic if many kmemleak objects are there. As the object count has
to be modified in every object, we cannot avoid taking the object lock.
So another way to prevent soft lockup will be needed.
Link: https://lkml.kernel.org/r/20220614220359.59282-3-longman@redhat.com
Signed-off-by: Waiman Long <longman@redhat.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Waiman Long [Tue, 14 Jun 2022 22:03:57 +0000 (18:03 -0400)]
mm/kmemleak: use _irq lock/unlock variants in kmemleak_scan/_clear()
Patch series "mm/kmemleak: Avoid soft lockup in kmemleak_scan()", v2.
There are 3 RCU-based object iteration loops in kmemleak_scan(). Because
of the need to take RCU read lock, we can't insert cond_resched() into the
loop like other parts of the function. As there can be millions of
objects to be scanned, it takes a while to iterate all of them. The
kmemleak functionality is usually enabled in a debug kernel which is much
slower than a non-debug kernel. With a sufficient number of kmemleak
objects, the time to iterate them all may exceed 22s, causing a soft lockup.
watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [kmemleak:625]
This patch series makes changes to the 3 object iteration loops in
kmemleak_scan() to prevent them from causing soft lockup.
This patch (of 3):
kmemleak_scan() is called only from the kmemleak scan thread or from write
to the kmemleak debugfs file. Both are in task context and so we can
directly use the simpler _irq() lock/unlock calls instead of the more
complex _irqsave/_irqrestore variants.
Similarly, kmemleak_clear() is called only from write to the kmemleak
debugfs file. The same change can be applied.
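The difference between the two lock flavours can be modelled in userspace. The sketch below is a toy model, not kernel code: interrupt state is a single boolean and all function names are hypothetical. It shows why the plain _irq pair is only appropriate when interrupts are known to be enabled on entry, which is the assumption the text relies on for these call sites.

```c
#include <assert.h>
#include <stdbool.h>

static bool irqs_enabled = true;  /* simulated CPU interrupt-enable state */

/* Model of the _irqsave/_irqrestore pair: remembers the previous state. */
static bool lock_irqsave(void)
{
    bool flags = irqs_enabled;

    irqs_enabled = false;
    return flags;
}

static void unlock_irqrestore(bool flags)
{
    irqs_enabled = flags;
}

/* Model of the _irq pair: unconditionally re-enables on unlock. */
static void lock_irq(void)
{
    irqs_enabled = false;
}

static void unlock_irq(void)
{
    assert(!irqs_enabled);  /* only valid while the lock is held */
    irqs_enabled = true;
}

/* Interrupt state left behind by one _irq lock/unlock pair. */
static bool state_after_irq_pair(bool enabled_on_entry)
{
    irqs_enabled = enabled_on_entry;
    lock_irq();
    unlock_irq();
    return irqs_enabled;
}

/* Interrupt state left behind by one _irqsave/_irqrestore pair. */
static bool state_after_irqsave_pair(bool enabled_on_entry)
{
    bool flags;

    irqs_enabled = enabled_on_entry;
    flags = lock_irqsave();
    unlock_irqrestore(flags);
    return irqs_enabled;
}
```

When interrupts are enabled on entry, both pairs leave them enabled and the simpler one suffices. Only the save/restore pair preserves a disabled state, which is why it is needed in contexts where the entry state is unknown.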
Link: https://lkml.kernel.org/r/20220614220359.59282-1-longman@redhat.com
Link: https://lkml.kernel.org/r/20220614220359.59282-2-longman@redhat.com
Signed-off-by: Waiman Long <longman@redhat.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Gautam Menghani [Sun, 12 Jun 2022 18:23:20 +0000 (11:23 -0700)]
mm/sparse-vmemmap.c: remove unwanted initialization in vmemmap_populate_compound_pages()
Remove unnecessary initialization for the variable 'next'. This fixes
the clang scan warning: Value stored to 'next' during its
initialization is never read [deadcode.DeadStores]
Link: https://lkml.kernel.org/r/20220612182320.160651-1-gautammenghani201@gmail.com
Signed-off-by: Gautam Menghani <gautammenghani201@gmail.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Joel Savitz [Thu, 9 Jun 2022 20:32:17 +0000 (16:32 -0400)]
selftests: make use of GUP_TEST_FILE macro
Commit 17de1e559cf1 ("selftests: clarify common error when running
gup_test") had most of its hunks dropped due to a conflict with another
patch accepted into Linux around the same time that implemented the same
behavior as a subset of other changes.
However, the remaining hunk defines the GUP_TEST_FILE macro without making
use of it. This patch makes use of the macro in the two relevant places.
Furthermore, the above mentioned commit's log message erroneously
describes the changes that were dropped from the patch.
This patch corrects the record.
Link: https://lkml.kernel.org/r/20220609203217.3206247-1-jsavitz@redhat.com
Fixes: 17de1e559cf1 ("selftests: clarify common error when running gup_test")
Signed-off-by: Joel Savitz <jsavitz@redhat.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Nico Pache <npache@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Xiang wangx [Fri, 10 Jun 2022 07:12:44 +0000 (15:12 +0800)]
userfaultfd/selftests: fix typo in comment
Delete the redundant word 'in'.
Link: https://lkml.kernel.org/r/20220610071244.59679-1-wangxiang@cdjrlc.com
Signed-off-by: Xiang wangx <wangxiang@cdjrlc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Vasily Averin [Fri, 3 Jun 2022 04:19:43 +0000 (07:19 +0300)]
net: set proper memcg for net_init hooks allocations
__register_pernet_operations() executes init hook of registered
pernet_operation structure in all existing net namespaces.
Typically, these hooks are called by a process associated with the
specified net namespace, and all __GFP_ACCOUNT marked allocation are
accounted for corresponding container/memcg.
However __register_pernet_operations() calls the hooks in the same
context, and as a result all marked allocations are accounted to one memcg
for all processed net namespaces.
This patch adjusts active memcg for each net namespace and helps to
account memory allocated inside ops_init() into the proper memcg.
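The effect of switching the active memcg per namespace can be modelled as below. Everything here is a hypothetical userspace stand-in: `struct memcg`, `struct net_ns`, `set_active()` and `accounted_alloc()` are illustration names and do not reproduce the kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>

struct memcg {
    long charged;         /* bytes accounted to this group */
};

struct net_ns {
    struct memcg *owner;  /* group that should pay for this namespace */
};

static struct memcg *active_memcg;

/* Switch the group that accounted allocations are charged to; returns the old one. */
static struct memcg *set_active(struct memcg *memcg)
{
    struct memcg *old = active_memcg;

    active_memcg = memcg;
    return old;
}

/* Models an accounted allocation made by an init hook. */
static void accounted_alloc(long bytes)
{
    assert(active_memcg != NULL);
    active_memcg->charged += bytes;
}

/* Run the init hook for every namespace, charging each one's own group. */
static void run_init_hooks(struct net_ns *nets, size_t n, long bytes_per_ns)
{
    size_t i;

    for (i = 0; i < n; i++) {
        struct memcg *old = set_active(nets[i].owner);

        accounted_alloc(bytes_per_ns);
        set_active(old);  /* restore the caller's group */
    }
}
```

Without the per-namespace switch, every allocation in the loop would be charged to whichever group was active in the caller, which is the behaviour the patch description objects to.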
Link: https://lkml.kernel.org/r/f9394752-e272-9bf9-645f-a18c56d1c4ec@openvz.org
Signed-off-by: Vasily Averin <vvs@openvz.org>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Acked-by: Shakeel Butt <shakeelb@google.com>
Cc: Michal Koutný <mkoutny@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Florian Westphal <fw@strlen.de>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Linux Kernel Functional Testing <lkft@linaro.org>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Naresh Kamboju <naresh.kamboju@linaro.org>
Cc: Qian Cai <quic_qiancai@quicinc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Roman Gushchin [Fri, 10 Jun 2022 18:03:10 +0000 (11:03 -0700)]
mm: kmem: make mem_cgroup_from_obj() vmalloc()-safe
Currently mem_cgroup_from_obj() is not working properly with objects
allocated using vmalloc(). It creates problems in some cases, when it's
called for static objects belonging to modules or generally allocated
using vmalloc().
This patch makes mem_cgroup_from_obj() safe to be called on objects
allocated using vmalloc().
It also introduces mem_cgroup_from_slab_obj(), which is a faster version
to use in places when we know the object is either a slab object or a
generic slab page (e.g. when adding an object to a lru list).
Link: https://lkml.kernel.org/r/20220610180310.1725111-1-roman.gushchin@linux.dev
Suggested-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Acked-by: Shakeel Butt <shakeelb@google.com>
Tested-by: Vasily Averin <vvs@openvz.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Muchun Song <songmuchun@bytedance.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Naresh Kamboju <naresh.kamboju@linaro.org>
Cc: Qian Cai <quic_qiancai@quicinc.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Florian Westphal <fw@strlen.de>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Michal Koutný <mkoutny@suse.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Miaohe Lin [Thu, 9 Jun 2022 12:13:05 +0000 (20:13 +0800)]
mm/memremap: fix memunmap_pages() race with get_dev_pagemap()
Consider the following scenario:
CPU1                                    CPU2
memunmap_pages
 percpu_ref_exit
  __percpu_ref_exit
   free_percpu(percpu_count);
     /* percpu_count is freed here! */
                                        get_dev_pagemap
                                         xa_load(&pgmap_array, PHYS_PFN(phys))
                                          /* pgmap still in the pgmap_array */
                                         percpu_ref_tryget_live(&pgmap->ref)
                                          if __ref_is_percpu
                                           /* __PERCPU_REF_ATOMIC_DEAD not set yet */
                                           this_cpu_inc(*percpu_count)
                                            /* access freed percpu_count here! */
   ref->percpu_count_ptr = __PERCPU_REF_ATOMIC_DEAD;
     /* too late... */
 pageunmap_range
To fix the issue, do percpu_ref_exit() after pgmap_array is emptied, so
we won't do percpu_ref_tryget_live() against a percpu_ref that is being freed.
Link: https://lkml.kernel.org/r/20220609121305.2508-1-linmiaohe@huawei.com
Fixes: b7b3c01b1915 ("mm/memremap_pages: support multiple ranges per invocation")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patrick Wang [Sat, 11 Jun 2022 03:55:51 +0000 (11:55 +0800)]
mm: kmemleak: check physical address when scan
Check the physical address of objects against the boundary at scan time
instead of in kmemleak_*_phys().
Link: https://lkml.kernel.org/r/20220611035551.1823303-5-patrick.wang.shcn@gmail.com
Fixes: 23c2d497de21 ("mm: kmemleak: take a full lowmem check in kmemleak_*_phys()")
Signed-off-by: Patrick Wang <patrick.wang.shcn@gmail.com>
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Yee Lee <yee.lee@mediatek.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patrick Wang [Sat, 11 Jun 2022 03:55:50 +0000 (11:55 +0800)]
mm: kmemleak: add rbtree and store physical address for objects allocated with PA
Add object_phys_tree_root to store the objects allocated with physical
address. Distinguish it from object_tree_root by OBJECT_PHYS flag or
function argument. The physical address is stored directly in those
objects.
Link: https://lkml.kernel.org/r/20220611035551.1823303-4-patrick.wang.shcn@gmail.com
Signed-off-by: Patrick Wang <patrick.wang.shcn@gmail.com>
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Yee Lee <yee.lee@mediatek.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patrick Wang [Sat, 11 Jun 2022 03:55:49 +0000 (11:55 +0800)]
mm: kmemleak: add OBJECT_PHYS flag for objects allocated with physical address
Add OBJECT_PHYS flag for object. This flag is used to identify the
objects allocated with physical address. The create_object_phys()
function is added as well to set that flag and is used by
kmemleak_alloc_phys().
Link: https://lkml.kernel.org/r/20220611035551.1823303-3-patrick.wang.shcn@gmail.com
Signed-off-by: Patrick Wang <patrick.wang.shcn@gmail.com>
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Yee Lee <yee.lee@mediatek.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Patrick Wang [Sat, 11 Jun 2022 03:55:48 +0000 (11:55 +0800)]
mm: kmemleak: remove kmemleak_not_leak_phys() and the min_count argument to kmemleak_alloc_phys()
Patch series "mm: kmemleak: store objects allocated with physical address
separately and check when scan", v4.
The kmemleak_*_phys() interface uses "min_low_pfn" and "max_low_pfn" to
check addresses. But on some architectures, kmemleak_*_phys() is called
before those two variables are initialized. The following steps will be
taken:
1) Add OBJECT_PHYS flag and rbtree for the objects allocated
with physical address
2) Store physical address in objects if allocated with OBJECT_PHYS
3) Check the boundary at scan time instead of in kmemleak_*_phys()
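Step 3 can be pictured with the following userspace sketch. The exact comparison used by the series is not shown in this text, so the bounds test below is an assumption, as are the 4 KiB page size and the helper names `phys_pfn()` and `phys_object_scannable()`.

```c
#include <assert.h>
#include <stdbool.h>

#define PAGE_SHIFT 12  /* assumption: 4 KiB pages */

/* Page frame number of a physical address. */
static unsigned long phys_pfn(unsigned long long phys)
{
    return (unsigned long)(phys >> PAGE_SHIFT);
}

/*
 * Decide at scan time whether an object recorded by physical address
 * lies inside the low memory range [min_low_pfn, max_low_pfn).
 */
static bool phys_object_scannable(unsigned long long start,
                                  unsigned long long size,
                                  unsigned long min_low_pfn,
                                  unsigned long max_low_pfn)
{
    assert(min_low_pfn <= max_low_pfn);
    return phys_pfn(start) >= min_low_pfn &&
           phys_pfn(start + size) < max_low_pfn;
}
```

Deferring the test to scan time means the object can be recorded early, before the two PFN limits are initialized, and is only filtered once the limits are meaningful.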
This patch set will solve:
https://lore.kernel.org/r/20220527032504.30341-1-yee.lee@mediatek.com
https://lore.kernel.org/r/9dd08bb5-f39e-53d8-f88d-bec598a08c93@gmail.com
v3: https://lore.kernel.org/r/20220609124950.1694394-1-patrick.wang.shcn@gmail.com
v2: https://lore.kernel.org/r/20220603035415.1243913-1-patrick.wang.shcn@gmail.com
v1: https://lore.kernel.org/r/20220531150823.1004101-1-patrick.wang.shcn@gmail.com
This patch (of 4):
Remove the unused kmemleak_not_leak_phys() function. Also remove the
min_count argument of kmemleak_alloc_phys() and assume it is 0.
Link: https://lkml.kernel.org/r/20220611035551.1823303-1-patrick.wang.shcn@gmail.com
Link: https://lkml.kernel.org/r/20220611035551.1823303-2-patrick.wang.shcn@gmail.com
Signed-off-by: Patrick Wang <patrick.wang.shcn@gmail.com>
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Yee Lee <yee.lee@mediatek.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Miaohe Lin [Thu, 9 Jun 2022 13:08:35 +0000 (21:08 +0800)]
lib/test_hmm: avoid accessing uninitialized pages
If make_device_exclusive_range() fails or marks fewer pages for
exclusive access than required, the remaining fields of pages will be left
uninitialized, and dmirror_atomic_map() will then access those
uninitialized fields. To fix it, call dmirror_atomic_map() iff all
pages are marked for exclusive access (we will break if mapped is less
than required anyway) so we won't access those uninitialized fields of
pages.
Link: https://lkml.kernel.org/r/20220609130835.35110-1-linmiaohe@huawei.com
Fixes: b659baea7546 ("mm: selftests for exclusive device memory")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Miaohe Lin [Tue, 7 Jun 2022 14:36:21 +0000 (22:36 +0800)]
mm/memremap: fix wrong function name above memremap_pages()
Fix the wrong function name dev_memremap_pages above memremap_pages() to
avoid confusion. Minor readability improvement.
Link: https://lkml.kernel.org/r/20220607143621.58989-1-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Daniel Vetter [Sun, 5 Jun 2022 15:25:39 +0000 (17:25 +0200)]
mm/mempool: use might_alloc()
Mempools are generally used for GFP_NOIO, so this won't benefit all that
much because might_alloc() currently only checks GFP_NOFS. But it does
validate against mmu notifier pte zapping, so it might catch some drivers
doing really silly things, plus it's a bit more meaningful in what we're
checking for here.
Link: https://lkml.kernel.org/r/20220605152539.3196045-3-daniel.vetter@ffwll.ch
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Daniel Vetter [Sun, 5 Jun 2022 15:25:38 +0000 (17:25 +0200)]
mm/slab: delete cache_alloc_debugcheck_before()
It only does a might_sleep_if(GFP_RECLAIM) check, which is already covered
by the might_alloc() in slab_pre_alloc_hook(). And all callers of
cache_alloc_debugcheck_before() call that beforehand already.
Link: https://lkml.kernel.org/r/20220605152539.3196045-2-daniel.vetter@ffwll.ch
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Daniel Vetter [Sun, 5 Jun 2022 15:25:37 +0000 (17:25 +0200)]
mm/page_alloc: use might_alloc()
... instead of open coding it. Completely equivalent code, just a notch
more meaningful when reading.
Link: https://lkml.kernel.org/r/20220605152539.3196045-1-daniel.vetter@ffwll.ch
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Fabio M. De Francesco [Mon, 6 Jun 2022 14:15:33 +0000 (16:15 +0200)]
mm/highmem: delete memmove_page()
Matthew Wilcox reported that, while he was looking at memmove_page(), he
realized that it can't actually work.
The reasons are hidden in its implementation, which makes use of memmove()
on logical addresses provided by kmap_local_page(). memmove() does the
wrong thing when it tests "if (dest <= src)".
Therefore, delete memmove_page().
No need to change any other code because we have no call sites of
memmove_page() across the whole kernel.
Link: https://lkml.kernel.org/r/20220606141533.555-1-fmdefrancesco@gmail.com
Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Reported-by: Matthew Wilcox <willy@infradead.org>
Reviewed-by: Baoquan He <bhe@redhat.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Qi Zheng [Sat, 4 Jun 2022 08:22:09 +0000 (16:22 +0800)]
mm: memcontrol: add {pgscan,pgsteal}_{kswapd,direct} items in memory.stat of cgroup v2
There are already statistics of {pgscan,pgsteal}_kswapd and
{pgscan,pgsteal}_direct of memcg event here, but now only the sum of the
two is displayed in memory.stat of cgroup v2.
In order to obtain more accurate information during monitoring and
debugging, and to align with the display in /proc/vmstat, it is better to
display {pgscan,pgsteal}_kswapd and {pgscan,pgsteal}_direct separately.
Also, for backward compatibility, we still display the pgscan and pgsteal
items so that existing applications won't break.
[zhengqi.arch@bytedance.com: add comment for memcg_vm_event_stat (suggested by Michal)]
Link: https://lkml.kernel.org/r/20220606154028.55030-1-zhengqi.arch@bytedance.com
[zhengqi.arch@bytedance.com: fix the doc, thanks to Johannes]
Link: https://lkml.kernel.org/r/20220607064803.79363-1-zhengqi.arch@bytedance.com
Link: https://lkml.kernel.org/r/20220604082209.55174-1-zhengqi.arch@bytedance.com
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Acked-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Baoquan He [Tue, 7 Jun 2022 10:59:58 +0000 (18:59 +0800)]
mm/vmalloc: add code comment for find_vmap_area_exceed_addr()
Its behaviour is like find_vma(), which finds an area above the specified
address. Add a comment to make it easier to understand.
Also fix two grammar mistakes/typos.
Link: https://lkml.kernel.org/r/20220607105958.382076-5-bhe@redhat.com
Signed-off-by: Baoquan He <bhe@redhat.com>
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Baoquan He [Tue, 7 Jun 2022 10:59:57 +0000 (18:59 +0800)]
mm/vmalloc: fix typo in local variable name
In __purge_vmap_area_lazy(), rename local_pure_list to local_purge_list.
Link: https://lkml.kernel.org/r/20220607105958.382076-4-bhe@redhat.com
Signed-off-by: Baoquan He <bhe@redhat.com>
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Baoquan He [Tue, 7 Jun 2022 10:59:56 +0000 (18:59 +0800)]
mm/vmalloc: remove the redundant boundary check
In find_va_links(), when traversing the vmap_area tree, the comparison that
checks whether the passed-in 'va' is above or below 'tmp_va' is redundant,
assuming both 'va' and 'tmp_va' have ->va_start <= ->va_end.
Simplify the check accordingly.
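The redundancy claim can be checked by brute force in userspace. The two comparison forms below are an assumed reconstruction for illustration (the patch text does not quote the code), and the sketch assumes non-empty areas (start < end), which is slightly stricter than the assumption stated above. `struct area` and the function names are hypothetical.

```c
#include <assert.h>

struct area {
    unsigned long start, end;  /* half-open range [start, end) */
};

/* Verbose form: both bounds compared on each side. -1 left, 1 right, 0 overlap. */
static int side_verbose(struct area va, struct area tmp)
{
    if (va.start < tmp.end && va.end <= tmp.start)
        return -1;
    if (va.end > tmp.start && va.start >= tmp.end)
        return 1;
    return 0;
}

/* Simplified form: one comparison per side. */
static int side_simple(struct area va, struct area tmp)
{
    if (va.end <= tmp.start)
        return -1;
    if (va.start >= tmp.end)
        return 1;
    return 0;
}

/* Count disagreements over all non-empty ranges with bounds in [0, limit]. */
static unsigned long count_mismatches(unsigned long limit)
{
    unsigned long a, b, c, d, bad = 0;

    assert(limit > 0);
    for (a = 0; a < limit; a++)
        for (b = a + 1; b <= limit; b++)
            for (c = 0; c < limit; c++)
                for (d = c + 1; d <= limit; d++) {
                    struct area va = { a, b }, tmp = { c, d };

                    if (side_verbose(va, tmp) != side_simple(va, tmp))
                        bad++;
                }
    return bad;
}
```

The reasoning behind the result: if va.end <= tmp.start and va is non-empty, then va.start < va.end <= tmp.start <= tmp.end, so the extra test va.start < tmp.end is implied. The right-hand case is symmetric.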
Link: https://lkml.kernel.org/r/20220607105958.382076-3-bhe@redhat.com
Signed-off-by: Baoquan He <bhe@redhat.com>
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Baoquan He [Tue, 7 Jun 2022 10:59:55 +0000 (18:59 +0800)]
mm/vmalloc: invoke classify_va_fit_type() in adjust_va_to_fit_type()
Patch series "Cleanup patches of vmalloc", v2.
Some cleanup patches found when reading vmalloc code.
This patch (of 4):
adjust_va_to_fit_type() checks all values of passed in fit type, including
NOTHING_FIT in the else branch. However, the check of NOTHING_FIT has
been done inside adjust_va_to_fit_type() and before it's called in all
call sites.
In fact, both of these functions are coupled tightly, since
classify_va_fit_type() is doing the preparation work for
adjust_va_to_fit_type(). So putting invocation of classify_va_fit_type()
inside adjust_va_to_fit_type() can simplify code logic and the redundant
check of NOTHING_FIT issue will go away.
Link: https://lkml.kernel.org/r/20220607105958.382076-1-bhe@redhat.com
Link: https://lkml.kernel.org/r/20220607105958.382076-2-bhe@redhat.com
Signed-off-by: Baoquan He <bhe@redhat.com>
Suggested-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Chengming Zhou [Tue, 31 May 2022 02:04:21 +0000 (10:04 +0800)]
mm/damon: remove obsolete comments of kdamond_stop
Since commit 0f91d13366a4 ("mm/damon: simplify stop mechanism") deleted
kdamond_stop and switched to the kthread stop mechanism, these obsolete
comments should be removed accordingly.
Link: https://lkml.kernel.org/r/20220531020421.46849-1-zhouchengming@bytedance.com
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Anshuman Khandual [Tue, 31 May 2022 09:04:41 +0000 (14:34 +0530)]
mm/memory_hotplug: drop 'reason' argument from check_pfn_span()
In check_pfn_span(), a 'reason' string is being used to recreate the
caller function name, while printing the warning message. It is really
unnecessary as the warning message could just be printed inside the caller
depending on the return code. Currently there are just two callers of
check_pfn_span(), i.e. __add_pages() and __remove_pages(). Let's clean this
up.
Link: https://lkml.kernel.org/r/20220531090441.170650-1-anshuman.khandual@arm.com
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Miaohe Lin [Mon, 30 May 2022 11:58:41 +0000 (19:58 +0800)]
mm/shmem.c: clean up comment of shmem_swapin_folio
shmem_swapin_folio has changed to use folio but comment still mentions
page. Update the relevant comment accordingly as suggested by Naoya.
Link: https://lkml.kernel.org/r/20220530115841.4348-1-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Suggested-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Peter Xu [Mon, 30 May 2022 18:34:50 +0000 (14:34 -0400)]
mm: avoid unnecessary page fault retries on shared memory types
I observed that for each of the shared file-backed page faults, we're very
likely to retry one more time for the 1st write fault upon no page. It's
because we'll need to release the mmap lock for dirty rate limit purpose
with balance_dirty_pages_ratelimited() (in fault_dirty_shared_page()).
Then after that throttling we return VM_FAULT_RETRY.
We did that probably because VM_FAULT_RETRY is the only way we can return
to the fault handler at that time telling it we've released the mmap lock.
However that's not ideal because it's very likely the fault does not need
to be retried at all since the pgtable was well installed before the
throttling, so the next continuous fault (including taking mmap read lock,
walk the pgtable, etc.) could be in most cases unnecessary.
This not only slows down page faults for shared file-backed memory, but also
adds more mmap lock contention, which is in most cases not needed at all.
To observe this, one could try to write to some shmem page and look at
"pgfault" value in /proc/vmstat, then we should expect 2 counts for each
shmem write simply because we retried, and vm event "pgfault" will capture
that.
To make it more efficient, add a new VM_FAULT_COMPLETED return code just to
show that we've completed the whole fault and released the lock. It's also
a hint that we should very possibly not need another fault immediately on
this page because we've just completed it.
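The accounting effect described above can be illustrated with a toy model. The enum values, the `pgfault_events` counter and both functions are hypothetical stand-ins. The sketch only shows why a retry costs two fault-handler passes per first write while the new return code costs one.

```c
#include <assert.h>
#include <stdbool.h>

enum fault_result {
    FAULT_DONE,       /* handled; caller still holds the lock and releases it */
    FAULT_RETRY,      /* lock was dropped; caller must retake it and retry */
    FAULT_COMPLETED   /* handled and lock already dropped; nothing left to do */
};

static unsigned long pgfault_events;  /* models the "pgfault" vm event count */

/* One pass through the fault handler for a first write to a shared page. */
static enum fault_result handle_fault(bool first_attempt, bool use_completed)
{
    pgfault_events++;
    if (!first_attempt)
        return FAULT_DONE;  /* page table entry was installed last time */
    /* The lock was dropped for dirty throttling after installing the entry. */
    return use_completed ? FAULT_COMPLETED : FAULT_RETRY;
}

/* Number of handler passes needed for one first write. */
static unsigned long faults_for_one_write(bool use_completed)
{
    bool first = true;

    pgfault_events = 0;
    for (;;) {
        enum fault_result r = handle_fault(first, use_completed);

        assert(pgfault_events <= 2);
        if (r == FAULT_RETRY) {
            first = false;  /* retake the lock and walk again */
            continue;
        }
        break;
    }
    return pgfault_events;
}
```

In this model faults_for_one_write(false) yields 2, matching the two "pgfault" counts per write described above, and faults_for_one_write(true) yields 1.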
This patch provides a ~12% perf boost on my aarch64 test VM with a simple
program sequentially dirtying a 400MB mmap()ed shmem file. These are
the times it needs:
Before: 650.980 ms (+-1.94%)
After: 569.396 ms (+-1.38%)
I believe it could help more than that.
We need some special care on GUP and the s390 pgfault handler (for gmap
code before returning from pgfault); the remaining changes in the page fault
handlers should be relatively straightforward.
Another thing to mention is that mm_account_fault() does take this new
fault as a generic fault to be accounted, unlike VM_FAULT_RETRY.
I explicitly didn't touch hmm_vma_fault() and break_ksm() because they do
not handle VM_FAULT_RETRY even with existing code, so I'm literally keeping
them as-is.
Link: https://lkml.kernel.org/r/20220530183450.42886-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Vineet Gupta <vgupta@kernel.org>
Acked-by: Guo Ren <guoren@kernel.org>
Acked-by: Max Filippov <jcmvbkbc@gmail.com>
Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> [arm part]
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Stafford Horne <shorne@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Brian Cain <bcain@quicinc.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Richard Weinberger <richard@nod.at>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Janosch Frank <frankja@linux.ibm.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Will Deacon <will@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Chris Zankel <chris@zankel.net>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Rich Felker <dalias@libc.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Helge Deller <deller@gmx.de>
Cc: Yoshinori Sato <ysato@users.osdn.me>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Yuanzheng Song [Sat, 28 May 2022 06:31:17 +0000 (06:31 +0000)]
tools/vm/slabinfo: use alphabetic order when two values are equal
When the number of partial slabs in each cache is the same (e.g., the
values are all 0), the results of `slabinfo -X -N5` and `slabinfo -P -N5`
are different.
/ # slabinfo -X -N5
...
Slabs sorted by number of partial slabs
---------------------------------------
Name                   Objects Objsize     Space Slabs/Part/Cpu O/S O %Fr %Ef Flg
inode_cache              15180     392   6217728        758/0/1  20 1   0  95 a
kernfs_node_cache        22494      88   2002944        488/0/1  46 0   0  98
shmem_inode_cache          663     464    319488         38/0/1  17 1   0  96
biovec-max                  50    3072    163840          4/0/1  10 3   0  93 A
dentry                   19050     136   2600960        633/0/2  30 0   0  99 a
/ # slabinfo -P -N5
Name                   Objects Objsize     Space Slabs/Part/Cpu O/S O %Fr %Ef Flg
bdev_cache                  32     984     32.7K          1/0/1  16 2   0  96 Aa
ext4_inode_cache            42     752     32.7K          1/0/1  21 2   0  96 a
dentry                   19050     136      2.6M        633/0/2  30 0   0  99 a
TCPv6                       17    1840     32.7K          0/0/1  17 3   0  95 A
RAWv6                       18     856     16.3K          0/0/1  18 2   0  94 A
This problem is caused by sort_slabs(). So let's use alphabetic order
when two values are equal in sort_slabs().
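A tie-breaking comparator of this kind might look like the sketch below. It is not the slabinfo implementation: `struct slab`, `cmp_slabs()` and `sort_by_partial()` are hypothetical names, the descending sort direction is an assumption, and qsort() stands in for whatever sorting slabinfo actually performs.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct slab {
    const char *name;
    unsigned long partial;  /* number of partial slabs */
};

/* Larger partial counts first; equal counts fall back to name order. */
static int cmp_slabs(const void *a, const void *b)
{
    const struct slab *x = a, *y = b;

    if (x->partial != y->partial)
        return x->partial < y->partial ? 1 : -1;
    return strcmp(x->name, y->name);  /* deterministic tie-break */
}

static void sort_by_partial(struct slab *slabs, size_t n)
{
    assert(slabs != NULL || n == 0);
    qsort(slabs, n, sizeof(*slabs), cmp_slabs);
}
```

Because the comparator never reports two distinct names as equal, the resulting order no longer depends on the input order or on the sort algorithm, so two listings sorted on the same key should agree. Note that strcmp() is case-sensitive, so upper-case names sort before lower-case ones.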
By the way, the content of the `slabinfo -h` is not aligned because the
`-P|--partial Sort by number of partial slabs`
uses tabs instead of spaces. So let's use spaces instead of tabs to fix
it.
Link: https://lkml.kernel.org/r/20220528063117.935158-1-songyuanzheng@huawei.com
Fixes: 1106b205a3fe ("tools/vm/slabinfo: add partial slab listing to -X")
Signed-off-by: Yuanzheng Song <songyuanzheng@huawei.com>
Cc: "Tobin C. Harding" <tobin@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Fanjun Kong [Thu, 26 May 2022 14:02:57 +0000 (22:02 +0800)]
mm: use PAGE_ALIGNED instead of IS_ALIGNED
<linux/mm.h> already provides the PAGE_ALIGNED macro. Let's use this
macro instead of IS_ALIGNED and passing PAGE_SIZE directly.
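A minimal userspace sketch of why the two spellings are interchangeable. The macro bodies below are simplified stand-ins written for this example (the kernel's own definitions live in its headers), the 4 KiB PAGE_SIZE is an assumption, and range_is_page_aligned() is a hypothetical helper.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins; 4 KiB pages are an assumption for this sketch. */
#define PAGE_SIZE		4096UL
#define IS_ALIGNED(x, a)	(((x) & ((a) - 1)) == 0)	/* 'a' must be a power of two */
#define PAGE_ALIGNED(addr)	IS_ALIGNED((unsigned long)(addr), PAGE_SIZE)

/* Evaluate the check both ways; the two spellings must always agree. */
bool range_is_page_aligned(unsigned long start, unsigned long len)
{
	bool old_way = IS_ALIGNED(start, PAGE_SIZE) && IS_ALIGNED(len, PAGE_SIZE);
	bool new_way = PAGE_ALIGNED(start) && PAGE_ALIGNED(len);

	assert(old_way == new_way);
	return new_way;
}
```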
Link: https://lkml.kernel.org/r/20220526140257.1568744-1-bh1scw@gmail.com
Signed-off-by: Fanjun Kong <bh1scw@gmail.com>
Acked-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Peter Xu [Wed, 25 May 2022 19:52:20 +0000 (15:52 -0400)]
mm/x86: remove dead code for hugetlbpage.c
This code seems to have existed since the old days and has never been used. Remove it.
Link: https://lkml.kernel.org/r/20220525195220.10241-1-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Acked-by: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Linus Torvalds [Sun, 12 Jun 2022 23:11:37 +0000 (16:11 -0700)]
Linux 5.19-rc2
Linus Torvalds [Sun, 12 Jun 2022 18:33:42 +0000 (11:33 -0700)]
Merge tag 'platform-drivers-x86-v5.19-2' of git://git./linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver fixes from Hans de Goede:
"Highlights:
- Fix hp-wmi regression on HP Omen laptops introduced in 5.18
- Several hardware-id additions
- A couple of other tiny fixes"
* tag 'platform-drivers-x86-v5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
platform/x86/intel: hid: Add Surface Go to VGBS allow list
platform/x86: hp-wmi: Use zero insize parameter only when supported
platform/x86: hp-wmi: Resolve WMI query failures on some devices
platform/x86: gigabyte-wmi: Add support for B450M DS3H-CF
platform/x86: gigabyte-wmi: Add Z690M AORUS ELITE AX DDR4 support
platform/x86: barco-p50-gpio: Add check for platform_driver_register
platform/x86/intel: pmc: Support Intel Raptorlake P
platform/x86/intel: Fix pmt_crashlog array reference
platform/mellanox: Add static in struct declaration.
platform/mellanox: Spelling s/platfom/platform/
Linus Torvalds [Sun, 12 Jun 2022 18:16:00 +0000 (11:16 -0700)]
Merge tag 'wq-for-5.19-rc1-fixes' of git://git./linux/kernel/git/tj/wq
Pull workqueue fixes from Tejun Heo:
"Tetsuo's patch to trigger build warnings if system-wide wq's are
flushed along with a TP type update and trivial comment update"
* tag 'wq-for-5.19-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: Switch to new kerneldoc syntax for named variable macro argument
workqueue: Fix type of cpu in trace event
workqueue: Wrap flush_workqueue() using a macro
Linus Torvalds [Sun, 12 Jun 2022 18:10:07 +0000 (11:10 -0700)]
Merge tag 'kbuild-fixes-v5.19' of git://git./linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
- Make the *.mod build rule portable for POSIX awk
- Fix regression of 'make nsdeps'
- Make scripts/check-local-export work for older bash versions
- Fix scripts/gdb to extract the .config data from vmlinux
* tag 'kbuild-fixes-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
scripts/gdb: change kernel config dumping method
scripts/check-local-export: avoid 'wait $!' for process substitution
scripts/nsdeps: adjust to the format change of *.mod files
kbuild: avoid regex RS for POSIX awk
Linus Torvalds [Sun, 12 Jun 2022 18:05:44 +0000 (11:05 -0700)]
Merge tag '5.19-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs client fixes from Steve French:
"Three reconnect fixes, all for stable as well.
One of these three reconnect fixes does address a problem with
multichannel reconnect, but this does not include the additional
fix (still being tested) for dynamically detecting multichannel
adapter changes which will improve those reconnect scenarios even
more"
* tag '5.19-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: populate empty hostnames for extra channels
cifs: return errors during session setup during reconnects
cifs: fix reconnect on smb3 mount types
Linus Torvalds [Sun, 12 Jun 2022 17:33:38 +0000 (10:33 -0700)]
Merge tag 'random-5.19-rc2-for-linus' of git://git./linux/kernel/git/crng/random
Pull random number generator fixes from Jason Donenfeld:
- A fix for a 5.19 regression for a case in which early device tree
initializes the RNG, which flips a static branch.
On most platforms, jump labels aren't initialized until much later, so
this caused splats. On a few mailing list threads, we cooked up easy
fixes for arm64, arm32, and risc-v. But then things looked slightly
more involved for xtensa, powerpc, arc, and mips. And at that point,
when we're patching 7 architectures in a place before the console is
even available, it seems like the cost/risk just wasn't worth it.
So random.c works around it now by checking the already exported
`static_key_initialized` boolean, as though somebody already ran into
this issue in the past. I'm not super jazzed about that; it'd be
prettier to not have to complicate downstream code. But I suppose
it's practical.
- A few small code nits and adding a missing __init annotation.
- A change to the default config values to use the cpu and bootloader's
seeds for initializing the RNG earlier.
This brings them into line with what all the distros do (Fedora/RHEL,
Debian, Ubuntu, Gentoo, Arch, NixOS, Alpine, SUSE, and Void... at
least), and moreover will now give us test coverage in various test
beds that might have caught the above device tree bug earlier.
- A change to WireGuard CI's configuration to increase test coverage
around the RNG.
- A documentation comment fix to unrelated maintainerless CRC code that
I was asked to take, I guess because it has to do with polynomials
(which the RNG thankfully no longer uses).
* tag 'random-5.19-rc2-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
wireguard: selftests: use maximum cpu features and allow rng seeding
random: remove rng_has_arch_random()
random: credit cpu and bootloader seeds by default
random: do not use jump labels before they are initialized
random: account for arch randomness in bits
random: mark bootloader randomness code as __init
random: avoid checking crng_ready() twice in random_init()
crc-itu-t: fix typo in CRC ITU-T polynomial comment
Duke Lee [Tue, 7 Jun 2022 21:36:54 +0000 (14:36 -0700)]
platform/x86/intel: hid: Add Surface Go to VGBS allow list
The Surface Go reports Chassis Type 9 (Laptop), so the device needs to be
added to dmi_vgbs_allow_list to enable tablet mode when an attached Type
Cover is folded back.
BugLink: https://github.com/linux-surface/linux-surface/issues/837
Signed-off-by: Duke Lee <krnhotwings@gmail.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Link: https://lore.kernel.org/r/20220607213654.5567-1-krnhotwings@gmail.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Bedant Patnaik [Wed, 8 Jun 2022 19:28:43 +0000 (00:58 +0530)]
platform/x86: hp-wmi: Use zero insize parameter only when supported
commit be9d73e64957 ("platform/x86: hp-wmi: Fix 0x05 error code reported by
several WMI calls") and commit 12b19f14a21a ("platform/x86: hp-wmi: Fix
hp_wmi_read_int() reporting error (0x05)") cause ACPI BIOS Error (bug):
Attempt to CreateField of length zero (20211217/dsopcode-133) because of
the ACPI method HWMC, which unconditionally creates a Field of
size (insize*8) bits:
CreateField (Arg1, 0x80, (Local5 * 0x08), DAIN)
In cases where args->insize = 0, the Field size is 0, resulting in
an error.
Fix this by using a zero insize only if the 0x5 error code is returned.
Tested on Omen 15 AMD (2020) board ID: 8786.
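One way to read "use zero insize only if 0x5 is returned" is as a one-time probe, sketched below in userspace C. This is an assumption-laden illustration, not the driver's code: wmi_query_fn, probe_zero_insize(), pick_insize() and both mock firmware functions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RET_INVALID_PARAMETERS 0x05	/* the 0x05 error code named in the text */

/* Hypothetical firmware query: returns 0 on success or an error code. */
typedef int (*wmi_query_fn)(size_t insize);

/*
 * Probe once with a non-zero input size. Only firmware that rejects it
 * with 0x05 is treated as wanting insize == 0.
 */
bool probe_zero_insize(wmi_query_fn query, size_t normal_insize)
{
	assert(query != NULL && normal_insize > 0);
	return query(normal_insize) == RET_INVALID_PARAMETERS;
}

/* Input size to use for later read-style queries. */
size_t pick_insize(bool zero_insize_ok, size_t normal_insize)
{
	return zero_insize_ok ? 0 : normal_insize;
}

/* Mock firmware A: rejects any non-zero insize with 0x05. */
int fw_wants_zero(size_t insize)
{
	return insize == 0 ? 0 : RET_INVALID_PARAMETERS;
}

/* Mock firmware B: builds a field of insize*8 bits, so insize == 0 fails. */
int fw_wants_nonzero(size_t insize)
{
	return insize == 0 ? -1 : 0;
}
```

The point of the sketch is that firmware B never sees a zero insize, so its CreateField call never gets a zero-length field.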
Fixes: be9d73e64957 ("platform/x86: hp-wmi: Fix 0x05 error code reported by several WMI calls")
Signed-off-by: Bedant Patnaik <bedant.patnaik@gmail.com>
Tested-by: Jorge Lopez <jorge.lopez2@hp.com>
Link: https://lore.kernel.org/r/41be46743d21c78741232a47bbb5f1cdbcc3d21e.camel@gmail.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Jorge Lopez [Wed, 8 Jun 2022 21:29:23 +0000 (16:29 -0500)]
platform/x86: hp-wmi: Resolve WMI query failures on some devices
WMI queries fail on some devices where the ACPI method HWMC
unconditionally attempts to create Fields beyond the end of the
buffer when the buffer is too small. This breaks essential features
such as power profiles:
CreateByteField (Arg1, 0x10, D008)
CreateByteField (Arg1, 0x11, D009)
CreateByteField (Arg1, 0x12, D010)
CreateDWordField (Arg1, 0x10, D032)
CreateField (Arg1, 0x80, 0x0400, D128)
In cases where args->data had zero length, the following errors were
obtained:
ACPI BIOS Error (bug): AE_AML_BUFFER_LIMIT, Field [D008] at bit
offset/length 128/8 exceeds size of target Buffer (128 bits)
(20211217/dsopcode-198)
ACPI BIOS Error (bug): AE_AML_BUFFER_LIMIT, Field [D009] at bit
offset/length 136/8 exceeds size of target Buffer (136bits)
(20211217/dsopcode-198)
The original code created a buffer of 128 bytes regardless of whether
the WMI call required a smaller buffer or not. This particular
behavior occurs with older BIOS versions and was reproduced on OMEN
laptops. Newer BIOS versions handle buffer sizes properly and meet the
latest specification requirements. This is the reason why testing with
a dynamically allocated buffer did not uncover any failures with the
test systems at hand.
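The buffer size that the ASL above implicitly requires can be worked out from the CreateField line. The sketch below does that arithmetic in userspace C. The 0x80-bit offset and 0x0400-bit length come from the ASL, while args_buffer_bytes() is a hypothetical helper showing one way to keep the buffer from ever being smaller than that; it is not the driver's allocation code.

```c
#include <assert.h>
#include <stddef.h>

#define HDR_BITS	0x80u	/* bit offset of the data field, from the ASL above */
#define DATA_BITS	0x0400u	/* bit length of the D128 field, from the ASL above */

/* Smallest buffer, in bytes, into which every Create*Field call above fits. */
size_t hwmc_min_buffer_bytes(void)
{
	assert((HDR_BITS + DATA_BITS) % 8 == 0);
	return (HDR_BITS + DATA_BITS) / 8;
}

/* Buffer size for a call carrying 'insize' payload bytes (hypothetical helper). */
size_t args_buffer_bytes(size_t insize)
{
	size_t wanted = HDR_BITS / 8 + insize;
	size_t floor = hwmc_min_buffer_bytes();

	return wanted > floor ? wanted : floor;
}
```

A zero-length payload gives a 16-byte (128-bit) buffer, which matches the "target Buffer (128 bits)" in the first error message; the fields at byte offsets 0x10 to 0x12 then start exactly at the end of it.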
This patch was tested on several OMEN, Elite, and ZBook notebooks. It
was confirmed that the patch resolves HPWMI_FAN GET/SET calls on an OMEN
Laptop 15-ek0xxx. No problems were reported when testing on several
Elite and ZBook notebooks.
Fixes: 4b4967cbd268 ("platform/x86: hp-wmi: Changing bios_args.data to be dynamically allocated")
Signed-off-by: Jorge Lopez <jorge.lopez2@hp.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Link: https://lore.kernel.org/r/20220608212923.8585-2-jorge.lopez2@hp.com
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Jonathan Neuschäfer [Thu, 9 Jun 2022 23:41:10 +0000 (01:41 +0200)]
workqueue: Switch to new kerneldoc syntax for named variable macro argument
The syntax without dots has been available since commit 43756e347f21
("scripts/kernel-doc: Add support for named variable macro arguments").
The same HTML output is produced with and without this patch.
Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
Linus Torvalds [Sat, 11 Jun 2022 23:56:41 +0000 (16:56 -0700)]
Merge tag 'gpio-fixes-for-v5.19-rc2' of git://git./linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
"A set of fixes. Most address the new warning we emit at build time
when irq chips are not immutable with some additional tweaks to
gpio-crystalcove from Andy and a small tweak to gpio-dwapb.
- make irq_chip structs immutable in several Diolan and intel drivers
to get rid of the new warning we emit when fiddling with irq chips
- don't print error messages on probe deferral in gpio-dwapb"
* tag 'gpio-fixes-for-v5.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpio: dwapb: Don't print error on -EPROBE_DEFER
gpio: dln2: make irq_chip immutable
gpio: sch: make irq_chip immutable
gpio: merrifield: make irq_chip immutable
gpio: wcove: make irq_chip immutable
gpio: crystalcove: Join function declarations and long lines
gpio: crystalcove: Use specific type and API for IRQ number
gpio: crystalcove: make irq_chip immutable
Linus Torvalds [Sat, 11 Jun 2022 23:50:39 +0000 (16:50 -0700)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Driver fixes and one core patch.
Nine of the driver patches are minor fixes and reworks to lpfc and the
rest are trivial and minor fixes elsewhere"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: pmcraid: Fix missing resource cleanup in error case
scsi: ipr: Fix missing/incorrect resource cleanup in error case
scsi: mpt3sas: Fix out-of-bounds compiler warning
scsi: lpfc: Update lpfc version to 14.2.0.4
scsi: lpfc: Allow reduced polling rate for nvme_admin_async_event cmd completion
scsi: lpfc: Add more logging of cmd and cqe information for aborted NVMe cmds
scsi: lpfc: Fix port stuck in bypassed state after LIP in PT2PT topology
scsi: lpfc: Resolve NULL ptr dereference after an ELS LOGO is aborted
scsi: lpfc: Address NULL pointer dereference after starget_to_rport()
scsi: lpfc: Resolve some cleanup issues following SLI path refactoring
scsi: lpfc: Resolve some cleanup issues following abort path refactoring
scsi: lpfc: Correct BDE type for XMIT_SEQ64_WQE in lpfc_ct_reject_event()
scsi: vmw_pvscsi: Expand vcpuHint to 16 bits
scsi: sd: Fix interpretation of VPD B9h length
Linus Torvalds [Sat, 11 Jun 2022 23:32:47 +0000 (16:32 -0700)]
Merge tag 'for_linus' of git://git./linux/kernel/git/mst/vhost
Pull virtio fixes from Michael Tsirkin:
"Fixes all over the place, most notably fixes for latent bugs in
drivers that got exposed by suppressing interrupts before DRIVER_OK,
which in turn has been done by 8b4ec69d7e09 ("virtio: harden vring
IRQ")"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
um: virt-pci: set device ready in probe()
vdpa: make get_vq_group and set_group_asid optional
virtio: Fix all occurences of the "the the" typo
vduse: Fix NULL pointer dereference on sysfs access
vringh: Fix loop descriptors check in the indirect cases
vdpa/mlx5: clean up indenting in handle_ctrl_vlan()
vdpa/mlx5: fix error code for deleting vlan
virtio-mmio: fix missing put_device() when vm_cmdline_parent registration failed
vdpa/mlx5: Fix syntax errors in comments
virtio-rng: make device ready before making request
Linus Torvalds [Sat, 11 Jun 2022 19:37:39 +0000 (12:37 -0700)]
Merge tag 'loongarch-fixes-5.19-1' of git://git./linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch fixes from Huacai Chen.
"Fix build errors and a stale comment"
* tag 'loongarch-fixes-5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
LoongArch: Remove MIPS comment about cycle counter
LoongArch: Fix copy_thread() build errors
LoongArch: Fix the !CONFIG_SMP build
Linus Torvalds [Sat, 11 Jun 2022 17:30:20 +0000 (10:30 -0700)]
iov_iter: fix build issue due to possible type mis-match
Commit 6c77676645ad ("iov_iter: Fix iter_xarray_get_pages{,_alloc}()")
introduced a problem on some 32-bit architectures (at least arm, xtensa,
csky, sparc and mips), that have a 'size_t' that is 'unsigned int'.
The reason is that we now do
min(nr * PAGE_SIZE - offset, maxsize);
where 'nr' and 'offset' are both 'unsigned int', and PAGE_SIZE is
'unsigned long'. As a result, the normal C type rules mean that the
first argument to 'min()' ends up being 'unsigned long'.
In contrast, 'maxsize' is of type 'size_t'.
Now, 'size_t' and 'unsigned long' are always the same physical type in
the kernel, so you'd think this doesn't matter, and from an actual
arithmetic standpoint it doesn't.
But on 32-bit architectures 'size_t' is commonly 'unsigned int', even if
it could also be 'unsigned long'. In that situation, both are unsigned
32-bit types, but they are not the *same* type.
And as a result 'min()' will complain about the distinct types (ignore
the "pointer types" part of the error message: that's an artifact of the
way we have made 'min()' check types for being the same):
lib/iov_iter.c: In function 'iter_xarray_get_pages':
include/linux/minmax.h:20:35: error: comparison of distinct pointer types lacks a cast [-Werror]
20 | (!!(sizeof((typeof(x) *)1 == (typeof(y) *)1)))
| ^~
lib/iov_iter.c:1464:16: note: in expansion of macro 'min'
1464 | return min(nr * PAGE_SIZE - offset, maxsize);
| ^~~
This was not visible on 64-bit architectures (where we always define
'size_t' to be 'unsigned long').
Force these cases to use 'min_t(size_t, x, y)' to make the type explicit
and avoid the issue.
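The type reasoning above can be checked in userspace with C11 _Generic. This is a sketch under stated assumptions: TYPE_NAME() and first_arg_type() are hypothetical helpers, the 4 KiB PAGE_SIZE is assumed, and min_t() here is a simplified stand-in that casts both operands (it evaluates them more than once, unlike the kernel macro).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE 4096UL	/* 'unsigned long', as described above; 4 KiB is assumed */

/* Report the static type of an expression (C11 _Generic). */
#define TYPE_NAME(x) _Generic((x),			\
	unsigned int:       "unsigned int",		\
	unsigned long:      "unsigned long",		\
	unsigned long long: "unsigned long long",	\
	default:            "other")

/* Simplified stand-in for min_t(): cast both sides, then compare. */
#define min_t(type, x, y) ((type)(x) < (type)(y) ? (type)(x) : (type)(y))

/* Type of the first min() argument from the text above. */
const char *first_arg_type(void)
{
	unsigned int nr = 1, offset = 0;

	/* unsigned int * unsigned long promotes to unsigned long */
	assert(sizeof(nr * PAGE_SIZE - offset) == sizeof(unsigned long));
	return TYPE_NAME(nr * PAGE_SIZE - offset);
}

/* With min_t() the operand types no longer need to match. */
size_t smaller(unsigned long a, size_t b)
{
	return min_t(size_t, a, b);
}
```

On a target where 'size_t' is 'unsigned int', TYPE_NAME() would report different names for the two min() arguments even though both are 32 bits wide, which is exactly the mismatch the strict min() rejects.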
[ Nit-picky note: technically 'size_t' doesn't have to match 'unsigned
long' arithmetically. We've certainly historically seen environments
with 16-bit address spaces and 32-bit 'unsigned long'.
Similarly, even in 64-bit modern environments, 'size_t' could be its
own type distinct from 'unsigned long', even if it were arithmetically
identical.
So the above type commentary is only really descriptive of the kernel
environment, not some kind of universal truth for the kinds of wild
and crazy situations that are allowed by the C standard ]
Reported-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Link: https://lore.kernel.org/all/YqRyL2sIqQNDfky2@debian/
Cc: Jeff Layton <jlayton@kernel.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jason A. Donenfeld [Fri, 10 Jun 2022 14:32:02 +0000 (16:32 +0200)]
wireguard: selftests: use maximum cpu features and allow rng seeding
By forcing the maximum CPU that QEMU has available, we expose additional
capabilities, such as the RNDR instruction, which increases test
coverage. This then allows the CI to skip the fake seeding step in some
cases. Also enable STRICT_KERNEL_RWX to catch issues related to early
jump labels when the RNG is initialized at boot.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Kuan-Ying Lee [Fri, 10 Jun 2022 07:14:57 +0000 (15:14 +0800)]
scripts/gdb: change kernel config dumping method
MAGIC_START("IKCFG_ST") and MAGIC_END("IKCFG_ED") are moved out
from the kernel_config_data variable.
Thus, we parse kernel_config_data directly instead of considering
offset of MAGIC_START and MAGIC_END.
Fixes: 13610aa908dc ("kernel/configs: use .incbin directive to embed config_data.gz")
Signed-off-by: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Vincent Whitchurch [Fri, 10 Jun 2022 15:12:03 +0000 (17:12 +0200)]
um: virt-pci: set device ready in probe()
Call virtio_device_ready() to make this driver work after commit
8b4ec69d7e09 ("virtio: harden vring IRQ"), since the driver uses the
virtqueues in the probe function. (The virtio core sets the device
ready when probe returns.)
Fixes: 8b4ec69d7e09 ("virtio: harden vring IRQ")
Fixes: 68f5d3f3b654 ("um: add PCI over virtio emulation driver")
Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Message-Id: <20220610151203.3492541-1-vincent.whitchurch@axis.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Johannes Berg <johannes@sipsolutions.net>
Linus Torvalds [Sat, 11 Jun 2022 00:28:43 +0000 (17:28 -0700)]
Merge tag 'nfsd-5.19-1' of git://git./linux/kernel/git/cel/linux
Pull nfsd fixes from Chuck Lever:
"Notable changes:
- There is now a backup maintainer for NFSD
Notable fixes:
- Prevent array overruns in svc_rdma_build_writes()
- Prevent buffer overruns when encoding NFSv3 READDIR results
- Fix a potential UAF in nfsd_file_put()"
* tag 'nfsd-5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
SUNRPC: Remove pointer type casts from xdr_get_next_encode_buffer()
SUNRPC: Clean up xdr_get_next_encode_buffer()
SUNRPC: Clean up xdr_commit_encode()
SUNRPC: Optimize xdr_reserve_space()
SUNRPC: Fix the calculation of xdr->end in xdr_get_next_encode_buffer()
SUNRPC: Trap RDMA segment overflows
NFSD: Fix potential use-after-free in nfsd_file_put()
MAINTAINERS: reciprocal co-maintainership for file locking and nfsd
Shyam Prasad N [Mon, 6 Jun 2022 09:52:46 +0000 (09:52 +0000)]
cifs: populate empty hostnames for extra channels
Currently, the secondary channels of a multichannel session
also get hostname populated based on the info in primary channel.
However, this will end up with a wrong resolution of hostname to
IP address during reconnect.
This change fixes this by not populating hostname info for all
secondary channels.
Fixes: 5112d80c162f ("cifs: populate server_hostname for extra channels")
Cc: stable@vger.kernel.org
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Linus Torvalds [Fri, 10 Jun 2022 23:32:49 +0000 (16:32 -0700)]
Merge tag 'for-5.19/dm-fixes-2' of git://git./linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
- Fix DM core's bioset initialization so that blk integrity pool is
properly setup. Remove now unused bioset_init_from_src.
- Fix DM zoned hang from locking imbalance due to needless check in
clone_endio().
* tag 'for-5.19/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm: fix zoned locking imbalance due to needless check in clone_endio
block: remove bioset_init_from_src
dm: fix bio_set allocation
Linus Torvalds [Fri, 10 Jun 2022 23:15:19 +0000 (16:15 -0700)]
Merge branch 'fscache-fixes' of git://git./linux/kernel/git/dhowells/linux-fs
Pull fscache cleanups from David Howells:
- fix checker complaint in afs
- two netfs cleanups:
- netfs_inode calling convention cleanup plus the requisite
documentation changes
- replace the ->cleanup op with a ->free_request op.
This is possible as the I/O request is now always available at
the cleanup point as the stuff to be cleaned up is no longer
passed into the API functions, but rather obtained by ->init_request.
* 'fscache-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
netfs: Rename the netfs_io_request cleanup op and give it an op pointer
netfs: Further cleanups after struct netfs_inode wrapper introduced
afs: Fix some checker issues
Linus Torvalds [Fri, 10 Jun 2022 22:53:09 +0000 (15:53 -0700)]
Merge tag 'pull-fixes' of git://git./linux/kernel/git/viro/vfs
Pull iov_iter fix from Al Viro:
"ITER_XARRAY get_pages fix; now the return value is a lot saner (and
more similar to logics for other flavours)"
* tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
iov_iter: Fix iter_xarray_get_pages{,_alloc}()
August Wikerfors [Wed, 8 Jun 2022 21:20:28 +0000 (23:20 +0200)]
platform/x86: gigabyte-wmi: Add support for B450M DS3H-CF
Tested and works on my system.
Signed-off-by: August Wikerfors <git@augustwikerfors.se>
Link: https://lore.kernel.org/r/20220608212028.28307-1-git@augustwikerfors.se
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Piotr Chmura [Mon, 6 Jun 2022 17:15:13 +0000 (19:15 +0200)]
platform/x86: gigabyte-wmi: Add Z690M AORUS ELITE AX DDR4 support
Add dmi_system_id of Gigabyte Z690M AORUS ELITE AX DDR4 board.
Tested on my PC.
Signed-off-by: Piotr Chmura <chmooreck@gmail.com>
Link: https://lore.kernel.org/r/bd83567e-ebf5-0b31-074b-5f6dc7f7c147@gmail.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Jiasheng Jiang [Thu, 26 May 2022 09:03:45 +0000 (17:03 +0800)]
platform/x86: barco-p50-gpio: Add check for platform_driver_register
As platform_driver_register() can fail, it is better to check its
return value in order to keep the code consistent.
Fixes: 86af1d02d458 ("platform/x86: Support for EC-connected GPIOs for identify LED/button on Barco P50 board")
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Acked-by: Peter Korsgaard <peter.korsgaard@barco.com>
Link: https://lore.kernel.org/r/20220526090345.1444172-1-jiasheng@iscas.ac.cn
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
George D Sworo [Thu, 2 Jun 2022 01:26:17 +0000 (18:26 -0700)]
platform/x86/intel: pmc: Support Intel Raptorlake P
Add Raptorlake P to the list of the platforms that intel_pmc_core driver
supports for pmc_core device. Raptorlake P PCH is based on Alderlake P
PCH.
Signed-off-by: George D Sworo <george.d.sworo@intel.com>
Reviewed-by: David E. Box <david.e.box@linux.intel.com>
Link: https://lore.kernel.org/r/20220602012617.20100-1-george.d.sworo@intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
David Arcari [Thu, 26 May 2022 20:31:40 +0000 (16:31 -0400)]
platform/x86/intel: Fix pmt_crashlog array reference
The probe function pmt_crashlog_probe() may incorrectly reference
the 'priv->entry' array, as it uses 'i' to index the array instead
of 'priv->num_entries' as it should. This is similar to the problem
that was addressed in pmt_telemetry_probe via commit 2cdfa0c20d58
("platform/x86/intel: Fix 'rmmod pmt_telemetry' panic").
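The indexing problem can be illustrated with a small userspace sketch. It assumes, for the sake of the example, that some inputs are skipped during the loop; struct priv, MAX_ENTRIES and collect_entries() are hypothetical stand-ins, not the driver's types.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ENTRIES 8

/* Hypothetical stand-in for the driver's private data. */
struct priv {
	int entry[MAX_ENTRIES];
	size_t num_entries;	/* number of slots actually filled */
};

/*
 * Copy the usable inputs (here: non-negative values) into priv->entry.
 * 'i' walks the input; the output slot is priv->num_entries. Using 'i'
 * as the output index would leave holes whenever an input is skipped.
 */
void collect_entries(struct priv *priv, const int *res, size_t n)
{
	assert(priv != NULL && n <= MAX_ENTRIES);
	priv->num_entries = 0;

	for (size_t i = 0; i < n; i++) {
		if (res[i] < 0)
			continue;	/* not usable: skip it */
		priv->entry[priv->num_entries++] = res[i];
	}
}
```

Later code that walks entry[0..num_entries) then only ever sees initialized slots.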
Cc: "David E. Box" <david.e.box@linux.intel.com>
Cc: Hans de Goede <hdegoede@redhat.com>
Cc: Mark Gross <markgross@kernel.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: David Arcari <darcari@redhat.com>
Reviewed-by: David E. Box <david.e.box@linux.intel.com>
Link: https://lore.kernel.org/r/20220526203140.339120-1-darcari@redhat.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Michael Shych [Thu, 2 Jun 2022 14:51:03 +0000 (17:51 +0300)]
platform/mellanox: Add static in struct declaration.
Fix problem of missing static in struct declaration.
Fixes: 662f24826f954 ("platform/mellanox: Add support for new SN2201 system")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Michael Shych <michaelsh@nvidia.com>
Link: https://lore.kernel.org/r/20220602145103.11859-1-michaelsh@nvidia.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
David Howells [Thu, 9 Jun 2022 08:07:01 +0000 (09:07 +0100)]
iov_iter: Fix iter_xarray_get_pages{,_alloc}()
The maths at the end of iter_xarray_get_pages() to calculate the actual
size doesn't work under some circumstances, such as when it's been asked to
extract a partial single page. Various terms of the equation cancel out
and you end up with actual == offset. The same issue exists in
iter_xarray_get_pages_alloc().
Fix these to just use min() to select the lesser amount from between the
amount of page content transcribed into the buffer, minus the offset, and
the size limit specified.
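A worked version of that calculation, as a userspace sketch. The 4 KiB PAGE_SIZE is an assumption, and pages_spanned() and actual_size() are hypothetical helper names rather than functions from lib/iov_iter.c.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096UL	/* assumed page size for the sketch */

/* Pages touched by 'maxsize' bytes starting 'offset' bytes into the first page. */
size_t pages_spanned(size_t offset, size_t maxsize)
{
	assert(offset < PAGE_SIZE);
	return (offset + maxsize + PAGE_SIZE - 1) / PAGE_SIZE;
}

/*
 * Bytes actually available to the caller: the content of the 'nr' pages
 * obtained, minus the leading offset, capped at the requested size.
 */
size_t actual_size(size_t nr, size_t offset, size_t maxsize)
{
	size_t have = nr * PAGE_SIZE - offset;

	return have < maxsize ? have : maxsize;
}
```

For the partial single page case, a request for 50 bytes at offset 100 spans one page and yields 50 bytes, not a value that depends on the offset.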
This doesn't appear to have caused a problem yet upstream because network
filesystems aren't getting the pages from an xarray iterator, but rather
passing it directly to the socket, which just iterates over it. Cachefiles
*does* do DIO from one to/from ext4/xfs/btrfs/etc. but it always asks for
whole pages to be written or read.
Fixes: 7ff5062079ef ("iov_iter: Add ITER_XARRAY")
Reported-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Alexander Viro <viro@zeniv.linux.org.uk>
cc: Dominique Martinet <asmadeus@codewreck.org>
cc: Mike Marshall <hubcap@omnibond.com>
cc: Gao Xiang <xiang@kernel.org>
cc: linux-afs@lists.infradead.org
cc: v9fs-developer@lists.sourceforge.net
cc: devel@lists.orangefs.org
cc: linux-erofs@lists.ozlabs.org
cc: linux-cachefs@redhat.com
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
David Howells [Fri, 25 Feb 2022 11:19:14 +0000 (11:19 +0000)]
netfs: Rename the netfs_io_request cleanup op and give it an op pointer
The netfs_io_request cleanup op is now always in a position to be given a
pointer to a netfs_io_request struct, so this can be passed in instead of
the mapping and private data arguments (both of which are included in the
struct).
So rename the ->cleanup op to ->free_request (to match ->init_request) and
pass in the I/O pointer.
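The shape of the new op can be sketched with simplified stand-in structs. Only the mapping and netfs_priv fields and the ->free_request name come from the description above; the reduced struct layouts, myfs_free_request(), myfs_req_ops and put_request() are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

struct netfs_io_request;

/* Simplified stand-in: only the op discussed above. */
struct netfs_request_ops {
	void (*free_request)(struct netfs_io_request *rreq);
};

/* Simplified stand-in: holds what the old ->cleanup op took as arguments. */
struct netfs_io_request {
	void *mapping;
	void *netfs_priv;
	const struct netfs_request_ops *ops;
};

/* Hypothetical filesystem callback: everything it needs hangs off 'rreq'. */
static void myfs_free_request(struct netfs_io_request *rreq)
{
	free(rreq->netfs_priv);
	rreq->netfs_priv = NULL;
}

const struct netfs_request_ops myfs_req_ops = {
	.free_request = myfs_free_request,
};

/* Core-side helper (hypothetical): call the op if the filesystem set one. */
void put_request(struct netfs_io_request *rreq)
{
	assert(rreq != NULL);
	if (rreq->ops && rreq->ops->free_request)
		rreq->ops->free_request(rreq);
}
```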
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
cc: linux-cachefs@redhat.com
Linus Torvalds [Thu, 9 Jun 2022 22:04:01 +0000 (15:04 -0700)]
netfs: Further cleanups after struct netfs_inode wrapper introduced
Change the signature of netfs helper functions to take a struct netfs_inode
pointer rather than a struct inode pointer where appropriate, thereby
relieving the need for the network filesystem to convert its internal inode
format down to the VFS inode only for netfslib to bounce it back up. For
type safety, it's better not to do that (and it's less typing too).
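The round trip being avoided can be sketched in userspace C. The structs are reduced stand-ins: only the idea of a VFS inode embedded in a netfs_inode wrapper is taken from the text, while remote_size, to_netfs_inode(), helper_old() and helper_new() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins; fields other than the embedded inode are assumptions. */
struct inode {
	unsigned long i_ino;
};

struct netfs_inode {
	struct inode inode;		/* the VFS inode is embedded in the wrapper */
	unsigned long remote_size;	/* hypothetical netfs-private field */
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* The "bounce it back up" conversion from VFS inode to wrapper. */
struct netfs_inode *to_netfs_inode(struct inode *inode)
{
	return container_of(inode, struct netfs_inode, inode);
}

/* Old style (sketch): takes the VFS inode and converts it back up. */
unsigned long helper_old(struct inode *inode)
{
	return to_netfs_inode(inode)->remote_size;
}

/* New style (sketch): takes the wrapper directly, so no round trip. */
unsigned long helper_new(struct netfs_inode *ctx)
{
	assert(ctx != NULL);
	return ctx->remote_size;
}
```

Passing the wrapper also lets the compiler reject a plain struct inode pointer that was never embedded in a netfs_inode, which the container_of() version cannot detect.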
Give netfs_write_begin() an extra argument to pass in a pointer to the
netfs_inode struct rather than deriving it internally from the file
pointer. Note that the ->write_begin() and ->write_end() ops are intended
to be replaced in the future by netfslib code that manages this without the
need to call in twice for each page.
netfs_readpage() and similar are intended to be pointed at directly by the
address_space_operations table, so must stick to the signature dictated by
the function pointers there.
Changes
=======
- Updated the kerneldoc comments and documentation [DH].
Signed-off-by: David Howells <dhowells@redhat.com>
cc: linux-cachefs@redhat.com
Link: https://lore.kernel.org/r/CAHk-=wgkwKyNmNdKpQkqZ6DnmUL-x9hp0YBnUGjaPFEAdxDTbw@mail.gmail.com/
David Howells [Thu, 19 May 2022 07:40:12 +0000 (08:40 +0100)]
afs: Fix some checker issues
Remove an unused global variable and make another static as reported by
make C=1.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: linux-afs@lists.infradead.org
Linus Torvalds [Fri, 10 Jun 2022 19:41:48 +0000 (12:41 -0700)]
Merge tag 'folio-5.19a' of git://git.infradead.org/users/willy/pagecache
Pull folio fixes from Matthew Wilcox:
"Four folio-related fixes:
- Don't release a folio while it's still locked
- Fix a use-after-free after dropping the mmap_lock
- Fix a memory leak when splitting a page
- Fix a kernel-doc warning for struct folio"
* tag 'folio-5.19a' of git://git.infradead.org/users/willy/pagecache:
mm: Add kernel-doc for folio->mlock_count
mm/huge_memory: Fix xarray node memory leak
filemap: Cache the value of vm_flags
filemap: Don't release a locked folio
Mike Snitzer [Fri, 10 Jun 2022 19:07:48 +0000 (15:07 -0400)]
dm: fix zoned locking imbalance due to needless check in clone_endio
After commit ca522482e3ea ("dm: pass NULL bdev to bio_alloc_clone"),
clone_endio() only calls dm_zone_endio() when DM targets remap the
clone bio's bdev to something other than the md->disk->part0 default.
However, if a DM target (e.g. dm-crypt) stacked on top of a dm-zoned device
does not remap the clone bio using bio_set_dev() then dm_zone_endio()
is not called at completion of the bios and zone locks are not
properly unlocked. This triggers a hang, in dm_zone_map_bio(), when
blktests block/004 is run for dm-crypt on zoned block devices. To
avoid the hang, simply remove the clone_endio() check that verifies
the target remapped the clone bio to a device other than the default.
Fixes: ca522482e3ea ("dm: pass NULL bdev to bio_alloc_clone")
Reported-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Geert Uytterhoeven [Tue, 24 May 2022 07:15:44 +0000 (09:15 +0200)]
platform/mellanox: Spelling s/platfom/platform/
Fix a misspelling of the word "platform".
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Michael Shych <michaelsh@nvidia.com>
Link: https://lore.kernel.org/r/9c8edde31e271311b7832d7677fe84aba917da8d.1653376503.git.geert@linux-m68k.org
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Linus Torvalds [Fri, 10 Jun 2022 18:57:36 +0000 (11:57 -0700)]
Merge tag 'devicetree-fixes-for-5.19-2' of git://git./linux/kernel/git/robh/linux
Pull more devicetree fixes from Rob Herring:
- More DT meta-schema check fixes from new bindings in merge window
- Fix stale DT binding references from Mauro
- Update various binding maintainers
- Fix in arm,malidp properties to match reality
- Add deprecated 'atheros' vendor prefix
* tag 'devicetree-fixes-for-5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
dt-bindings: display: arm,malidp: remove bogus RQOS property
dt-bindings: pinctrl: ralink: Fix 'enum' lists with duplicate entries
dt-bindings: Drop more redundant 'maxItems/minItems' in if/then schemas
dt-bindings: nvme: apple,nvme-ans: Drop 'maxItems' from 'apple,sart'
MAINTAINERS: rectify entries for ARM DRM DRIVERS after dt conversion
MAINTAINERS: update snps,axs10x-reset.yaml reference
MAINTAINERS: update dongwoon,dw9807-vcm.yaml reference
MAINTAINERS: update cortina,gemini-ethernet.yaml reference
dt-bindings: mfd: rk808: update rockchip,rk808.yaml reference
dt-bindings: reset: update st,stih407-powerdown.yaml references
dt-bindings: arm: update vexpress-config.yaml references
dt-bindings: interrupt-controller: update brcm,l2-intc.yaml reference
dt-bindings: mfd: bd9571mwv: update rohm,bd9571mwv.yaml reference
dt-bindings: update Luca Ceresoli's e-mail address
dt-bindings: msm: update maintainers list with proper id
dt-bindings: vendor-prefixes: document deprecated Atheros
dt-bindings: Update QCOM USB subsystem maintainer information
Linus Torvalds [Fri, 10 Jun 2022 18:49:27 +0000 (11:49 -0700)]
Merge tag 'pm-5.19-rc2' of git://git./linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These fix an intel_idle issue introduced during the 5.16 development
cycle and two recent regressions in the system reboot/poweroff code.
Specifics:
- Fix CPUIDLE_FLAG_IRQ_ENABLE handling in intel_idle (Peter Zijlstra)
- Allow all platforms to use the global poweroff handler and make
non-syscall poweroff code paths work again (Dmitry Osipenko)"
* tag 'pm-5.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpuidle,intel_idle: Fix CPUIDLE_FLAG_IRQ_ENABLE
kernel/reboot: Fix powering off using a non-syscall code paths
kernel/reboot: Use static handler for register_platform_power_off()
David Howells [Fri, 10 Jun 2022 18:35:55 +0000 (19:35 +0100)]
certs: Convert spaces in certs/Makefile to a tab
There's a rule in certs/Makefile for which the command begins with eight
spaces. This results in:
../certs/Makefile:21: FORCE prerequisite is missing
../certs/Makefile:21: *** missing separator. Stop.
Fix this by turning the spaces into a tab.
Fixes: addf466389d9 ("certs: Check that builtin blacklist hashes are valid")
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Reviewed-by: Mickaël Salaün <mic@linux.microsoft.com>
cc: keyrings@vger.kernel.org
Link: https://lore.kernel.org/r/486b1b80-9932-aab6-138d-434c541c934a@digikod.net/
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andre Przywara [Thu, 9 Jun 2022 16:27:29 +0000 (17:27 +0100)]
dt-bindings: display: arm,malidp: remove bogus RQOS property
As Liviu pointed out, the arm,malidp-arqos-high-level property
mentioned in the original .txt binding was a mistake, and
arm,malidp-arqos-value needs to take its place.
The binding commit ce6eb0253cba ("dt/bindings: display: Add optional
property node define for Mali DP500") mentions the right name in the
commit message, but has the wrong name in the diff.
Commit d298e6a27a81 ("drm/arm/mali-dp: Add display QoS interface
configuration for Mali DP500") uses the property in the driver, but uses
the shorter name.
Remove the wrong property from the binding, and use the proper name in
the example. The actual property was already documented properly.
Fixes: 2c8b082a3ab1 ("dt-bindings: display: convert Arm Mali-DP to DT schema")
Link: https://lore.kernel.org/linux-arm-kernel/YnumGEilUblhBx8E@e110455-lin.cambridge.arm.com/
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reported-by: Liviu Dudau <liviu.dudau@arm.com>
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20220609162729.1441760-1-andre.przywara@arm.com
Rafael J. Wysocki [Fri, 10 Jun 2022 18:24:10 +0000 (20:24 +0200)]
Merge branch 'pm-sysoff'
Merge fixes for regressions introduced by the recent rework of the
system reboot/poweroff code.
* pm-sysoff:
kernel/reboot: Fix powering off using a non-syscall code paths
kernel/reboot: Use static handler for register_platform_power_off()
Rob Herring [Mon, 6 Jun 2022 21:22:39 +0000 (16:22 -0500)]
dt-bindings: pinctrl: ralink: Fix 'enum' lists with duplicate entries
There's no reason to list the same value twice in an 'enum'. This was fixed
treewide in commit c3b006819426 ("dt-bindings: Fix 'enum' lists with
duplicate entries"), but this one got added in the merge window.
A meta-schema change will catch future cases.
Signed-off-by: Rob Herring <robh@kernel.org>
Acked-by: Sergio Paracuellos <sergio.paracuellos@gmail.com>
Link: https://lore.kernel.org/r/20220606212239.1360877-1-robh@kernel.org
Linus Torvalds [Fri, 10 Jun 2022 18:14:47 +0000 (11:14 -0700)]
Merge tag 'docs-5.19-3' of git://git.lwn.net/linux
Pull documentation fixes from Jonathan Corbet:
"A few documentation fixes for 5.19, including moving the new HTE docs
to a more suitable location, adding loongarch to the features lists,
and a couple of typo fixes"
* tag 'docs-5.19-3' of git://git.lwn.net/linux:
docs: arm: tcm: Fix typo in description of TCM and MMU usage
docs: Move the HTE documentation to driver-api/
docs: usb: fix literal block marker in usbmon verification example
Documentation/features: Update the arch support status files
Linus Torvalds [Fri, 10 Jun 2022 18:03:51 +0000 (11:03 -0700)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- SME save/restore for EFI fix - incorrect logic for detecting the need
for saving/restoring the FFR state.
- SME fix for a CPU ID field value.
- Sysreg generation awk script fix (comparison operator).
- Some typos in documentation or comments and silence a sparse warning
(missing prototype).
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: Add kasan_hw_tags_enable() prototype to silence sparse
arm64/sme: Fix EFI save/restore
arm64/fpsimd: Fix typo in comment
arm64/sysreg: Fix typo in Enum element regex
arm64/sme: Fix SVE/SME typo in ABI documentation
arm64/sme: Fix tests for 0b1111 value ID registers
Linus Torvalds [Fri, 10 Jun 2022 17:56:28 +0000 (10:56 -0700)]
Merge tag 'zonefs-5.19-rc2' of git://git./linux/kernel/git/dlemoal/zonefs
Pull zonefs fixes from Damien Le Moal:
- Fix handling of the explicit-open mount option, and in particular the
conditions under which this option can be ignored.
- Fix a problem with zonefs iomap_begin method, causing a hang in
iomap_readahead() when a readahead request reaches the end of a file.
* tag 'zonefs-5.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs:
zonefs: fix zonefs_iomap_begin() for reads
zonefs: Do not ignore explicit_open with active zone limit
zonefs: fix handling of explicit_open option on mount
Linus Torvalds [Fri, 10 Jun 2022 17:30:43 +0000 (10:30 -0700)]
Merge tag 'ata-5.19-rc2' of git://git./linux/kernel/git/dlemoal/libata
Pull ATA fixes from Damien Le Moal:
"Several small fixes for rc2:
- Remove unused field in struct ata_port (Hannes)
- Fix a potential (very unlikely) NULL pointer dereference in
ata_host_alloc_pinfo() (Sergey)
- Fix a device reference leak in the pata_octeon_cf driver (Miaoqian)
- Fixes for handling access to the concurrent positioning ranges log
page used with multi-actuator HDDs (Tyler)
- Fix the values shown by the pio_mode and dma_mode sysfs device
attributes (Sergey)
- Update the MAINTAINERS file to add libata sysfs ABI documentation
file (Sergey)"
* tag 'ata-5.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
MAINTAINERS: add ATA sysfs file documentation to libata entry
ata: libata-transport: fix {dma|pio|xfer}_mode sysfs files
libata: fix translation of concurrent positioning ranges
libata: fix reading concurrent positioning ranges log
ata: pata_octeon_cf: Fix refcount leak in octeon_cf_probe
ata: libata-core: fix NULL pointer deref in ata_host_alloc_pinfo()
ata: libata: drop 'sas_last_tag'
Linus Torvalds [Fri, 10 Jun 2022 17:20:57 +0000 (10:20 -0700)]
Merge tag 'sound-5.19-rc2' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A collection of fixes; almost all changes are device-specific small
fixes over ASoC, HD-audio and USB-audio. No sign of serious breakage,
so far"
* tag 'sound-5.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (23 commits)
ALSA: hda/realtek: Add quirk for HP Dev One
ALSA: hda/realtek - Add HW8326 support
ALSA: hda/conexant - Fix loopback issue with CX20632
ALSA: hda: MTL: add HD Audio PCI ID and HDMI codec vendor ID
ALSA: usb-audio: Set up (implicit) sync for Saffire 6
ALSA: usb-audio: Skip generic sync EP parse for secondary EP
ASoC: wm_adsp: Fix event generation for wm_adsp_fw_put()
ASoC: es8328: Fix event generation for deemphasis control
ASoC: wm8962: Fix suspend while playing music
ASoC: SOF: ipc-msg-injector: Fix reversed if statement
ASoC: SOF: ipc-msg-injector: Propagate write errors correctly
ASoC: fsl_sai: Add support for i.MX8MN
ASoC: SOF: Fix potential NULL pointer dereference
ALSA: hda/realtek: Fix for quirk to enable speaker output on the Lenovo Yoga DuetITL 2021
ASoC: cs42l51: Correct minimum value for SX volume control
ASoC: cs42l56: Correct typo in minimum level for SX volume controls
ASoC: cs42l52: Correct TLV for Bypass Volume
ASoC: cs53l30: Correct number of volume levels on SX controls
ASoC: cs35l36: Update digital volume TLV
ASoC: cs42l52: Fix TLV scales for mixer controls
...
Linus Torvalds [Fri, 10 Jun 2022 17:13:24 +0000 (10:13 -0700)]
Merge tag 'drm-fixes-2022-06-10' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Not a huge amount here, mainly a bunch of scattered amdgpu fixes, and
then some misc panfrost, bridge/panel ones, and one ast fix for
multi-monitors. Probably pick up a bit more next week like rc3 often
does.
amdgpu:
- DCN 3.1 golden settings fix
- eDP fixes
- DMCUB fixes
- GFX11 fixes and cleanups
- VCN fix for yellow carp
- GMC11 fixes
- RAS fixes
- GPUVM TLB flush fixes
- SMU13 fixes
- VCN3 AV1 regression fix
- VCN2 JPEG fix
- Other misc fixes
amdkfd:
- MMU notifier fix
- Support for more GC 10.3.x families
- Pinned BO handling fix
- Partial migration bug fix
panfrost:
- fix a use after free
ti-sn65dsi83:
- fix invalid DT configuration
panel:
- two self refresh fixes
ast:
- multiple output fix"
* tag 'drm-fixes-2022-06-10' of git://anongit.freedesktop.org/drm/drm: (37 commits)
drm/ast: Support multiple outputs
drm/amdgpu/mes: only invalid/prime icache when finish loading both pipe MES FWs.
drm/amdgpu/jpeg2: Add jpeg vmid update under IB submit
drm/amdgpu: always flush the TLB on gfx8
drm/amdgpu: fix limiting AV1 to the first instance on VCN3
drm/amdkfd:Fix fw version for 10.3.6
drm/amdgpu: Add MODE register to wave debug info in gfx11
Revert "drm/amd/display: Pass the new context into disable OTG WA"
Revert "drm/amdgpu: Ensure the DMA engine is deactivated during set ups"
drm/atomic: Force bridge self-refresh-exit on CRTC switch
drm/bridge: analogix_dp: Support PSR-exit to disable transition
drm/amdgpu: suppress the compile warning about 64 bit type
drm/amd/pm: suppress compile warnings about possible unaligned accesses
drm/amdkfd: Fix partial migration bugs
drm/amdkfd: add pinned BOs to kfd_bo_list
drm/amdgpu: Update PDEs flush TLB if PTB/PDB moved
drm/amdgpu: enable tmz by default for GC 10.3.7
drm/amdkfd: Add GC 10.3.6 and 10.3.7 KFD definitions
drm/amdkfd: Use mmget_not_zero in MMU notifier
drm/amdgpu: Resolve RAS GFX error count issue after cold boot on Arcturus
...
Linus Torvalds [Fri, 10 Jun 2022 17:07:06 +0000 (10:07 -0700)]
Merge tag 'net-5.19-rc2-2' of git://git./linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Quick follow up, to cleanly fast-forward net again.
Current release - new code bugs:
- Revert "net/mlx5e: Allow relaxed ordering over VFs"
Previous releases - regressions:
- seg6: fix seg6_lookup_any_nexthop() to handle VRFs using
flowi_l3mdev
Misc:
- rename TLS_INFO_ZC_SENDFILE to better express the meaning"
* tag 'net-5.19-rc2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net:
net: seg6: fix seg6_lookup_any_nexthop() to handle VRFs using flowi_l3mdev
nfp: flower: restructure flow-key for gre+vlan combination
nfp: avoid unnecessary check warnings in nfp_app_get_vf_config
tls: Rename TLS_INFO_ZC_SENDFILE to TLS_INFO_ZC_TX
net/mlx5: fs, fail conflicting actions
net/mlx5: Rearm the FW tracer after each tracer event
net/mlx5: E-Switch, pair only capable devices
net/mlx5e: CT: Fix cleanup of CT before cleanup of TC ct rules
Revert "net/mlx5e: Allow relaxed ordering over VFs"
MAINTAINERS: adjust MELLANOX ETHERNET INNOVA DRIVERS to TLS support removal
Catalin Marinas [Fri, 10 Jun 2022 17:01:31 +0000 (18:01 +0100)]
arm64: Add kasan_hw_tags_enable() prototype to silence sparse
This function is only called from assembly, no need for a prototype
declaration in a header file. In addition, add #ifdef around the
function since it is only used when CONFIG_KASAN_HW_TAGS is enabled.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: kernel test robot <lkp@intel.com>
Linus Torvalds [Fri, 10 Jun 2022 16:57:11 +0000 (09:57 -0700)]
Merge tag 'for-linus-5.19a-rc2-tag' of git://git./linux/kernel/git/xen/tip
Pull xen updates from Juergen Gross:
- a small cleanup removing "export" of an __init function
- a small series adding a new infrastructure for platform flags
- a series adding generic virtio support for Xen guests (frontend side)
* tag 'for-linus-5.19a-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen: unexport __init-annotated xen_xlate_map_ballooned_pages()
arm/xen: Assign xen-grant DMA ops for xen-grant DMA devices
xen/grant-dma-ops: Retrieve the ID of backend's domain for DT devices
xen/grant-dma-iommu: Introduce stub IOMMU driver
dt-bindings: Add xen,grant-dma IOMMU description for xen-grant DMA ops
xen/virtio: Enable restricted memory access using Xen grant mappings
xen/grant-dma-ops: Add option to restrict memory access under Xen
xen/grants: support allocating consecutive grants
arm/xen: Introduce xen_setup_dma_ops()
virtio: replace arch_has_restricted_virtio_memory_access()
kernel: add platform_has() infrastructure
Linus Torvalds [Fri, 10 Jun 2022 16:52:11 +0000 (09:52 -0700)]
Merge tag 'mips-fixes_5.19_1' of git://git./linux/kernel/git/mips/linux
Pull MIPS fix from Thomas Bogendoerfer:
"Build fix for Loongson-3"
* tag 'mips-fixes_5.19_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
MIPS: Loongson-3: fix compile mips cpu_hwmon as module build error.
Mark Brown [Thu, 2 Jun 2022 12:41:32 +0000 (14:41 +0200)]
arm64/sme: Fix EFI save/restore
The EFI save/restore code is confused. When saving, the check for saving
FFR is inverted due to confusion with the streaming mode check, and when
restoring we check if we need to restore FFR by checking the percpu
efi_sm_state without the required wrapper rather than based on the
combination of FA64 support and streaming mode.
Fixes: e0838f6373e5 ("arm64/sme: Save and restore streaming mode over EFI runtime calls")
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220602124132.3528951-1-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Xiang wangx [Fri, 10 Jun 2022 07:05:43 +0000 (15:05 +0800)]
arm64/fpsimd: Fix typo in comment
Delete the redundant word 'in'.
Signed-off-by: Xiang wangx <wangxiang@cdjrlc.com>
Link: https://lore.kernel.org/r/20220610070543.59338-1-wangxiang@cdjrlc.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Alejandro Tafalla [Thu, 9 Jun 2022 20:42:18 +0000 (22:42 +0200)]
arm64/sysreg: Fix typo in Enum element regex
In the awk script, there was a typo with the comparison operator when
checking if the matched pattern is inside an Enum block.
This prevented the generation of the whole sysreg-defs.h header.
Fixes: 66847e0618d7 ("arm64: Add sysreg header generation scripting")
Signed-off-by: Alejandro Tafalla <atafalla@dnyon.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220609204220.12112-1-atafalla@dnyon.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Serge Semin [Fri, 10 Jun 2022 10:45:00 +0000 (13:45 +0300)]
gpio: dwapb: Don't print error on -EPROBE_DEFER
Currently if the APB or Debounce clocks aren't yet ready to be requested
the DW GPIO driver will correctly handle that by deferring the probe
procedure, but the error is still printed to the system log. It needlessly
pollutes the log since there was no real error but a request to postpone
the clock request procedure since the clocks subsystem hasn't been fully
initialized yet. Let's fix that by using the dev_err_probe method to print
the APB/clock request error status. It will correctly handle the deferred
probe situation and print the error if it actually happens.
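The logging behaviour described above can be sketched with a small userspace model. This is only an illustration under stated assumptions: report_probe_error() and probe_get_clocks() are hypothetical stand-ins (the first one loosely models dev_err_probe()), and EPROBE_DEFER is defined locally because userspace headers do not provide it.

```c
#include <stdio.h>

/* Kernel-internal errno value; defined here because userspace <errno.h> lacks it. */
#define EPROBE_DEFER 517

/* Counts messages that would reach the system log at error level. */
static int logged_errors;

/*
 * Hypothetical stand-in for dev_err_probe(): print real errors, stay quiet
 * for -EPROBE_DEFER (the real helper only emits a debug-level message in
 * that case), and hand the error code back so the caller can return it.
 */
static int report_probe_error(int err, const char *msg)
{
	if (err != -EPROBE_DEFER) {
		fprintf(stderr, "error %d: %s\n", err, msg);
		logged_errors++;
	}
	return err;
}

/* Hypothetical probe step: clk_err is the result of a clock request. */
static int probe_get_clocks(int clk_err)
{
	if (clk_err)
		return report_probe_error(clk_err, "Cannot get APB/Debounce clocks");
	return 0;
}
```

In this model, a deferred probe still propagates -EPROBE_DEFER to the caller, but nothing is added to the error log; only genuine failures are printed.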
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
Jason A. Donenfeld [Wed, 8 Jun 2022 08:31:25 +0000 (10:31 +0200)]
random: remove rng_has_arch_random()
With arch randomness being used by every distro and enabled in
defconfigs, the distinction between rng_has_arch_random() and
rng_is_initialized() is now rather small. In fact, the places where they
differ are now places where paranoid users and system builders really
don't want arch randomness to be used, in which case we should respect
that choice, or places where arch randomness is known to be broken, in
which case that choice is all the more important. So this commit just
removes the function and its one user.
Reviewed-by: Petr Mladek <pmladek@suse.com> # for vsprintf.c
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Jason A. Donenfeld [Sun, 5 Jun 2022 16:30:46 +0000 (18:30 +0200)]
random: credit cpu and bootloader seeds by default
This commit changes the default Kconfig values of RANDOM_TRUST_CPU and
RANDOM_TRUST_BOOTLOADER to be Y by default. It does not change any
existing configs or change any kernel behavior. The reason for this is
several fold.
As background, I recently had an email thread with the kernel
maintainers of Fedora/RHEL, Debian, Ubuntu, Gentoo, Arch, NixOS, Alpine,
SUSE, and Void as recipients. I noted that some distros trust RDRAND,
some trust EFI, and some trust both, and I asked why or why not. There
wasn't really much of a "debate" but rather an interesting discussion of
what the historical reasons have been for this, and it came up that some
distros just missed the introduction of the bootloader Kconfig knob,
while another didn't want to enable it until there was a boot time
switch to turn it off for more concerned users (which has since been
added). The result of the rather uneventful discussion is that every
major Linux distro enables these two options by default.
While I didn't have really too strong of an opinion going into this
thread -- and I mostly wanted to learn what the distros' thinking was
one way or another -- ultimately I think their choice was a decent
enough one for a default option (which can be disabled at boot time).
I'll try to summarize the pros and cons:
Pros:
- The RNG machinery gets initialized super quickly, and there's no
messing around with subsequent blocking behavior.
- The bootloader mechanism is used by kexec in order for the prior
kernel to initialize the RNG of the next kernel, which increases
the entropy available to early boot daemons of the next kernel.
- Previous objections related to backdoors centered around
Dual_EC_DRBG-like kleptographic systems, in which observing some
amount of the output stream enables an adversary holding the right key
to determine the entire output stream.
This used to be a partially justified concern, because RDRAND output
was mixed into the output stream in varying ways, some of which may
have lacked pre-image resistance (e.g. XOR or an LFSR).
But this is no longer the case. Now, all usage of RDRAND and
bootloader seeds go through a cryptographic hash function. This means
that the CPU would have to compute a hash pre-image, which is not
considered to be feasible (otherwise the hash function would be
terribly broken).
- More generally, if the CPU is backdoored, the RNG is probably not the
realistic vector of choice for an attacker.
- These CPU or bootloader seeds are far from being the only source of
entropy. Rather, there is generally a pretty huge amount of entropy,
not all of which is credited, especially on CPUs that support
instructions like RDRAND. In other words, assuming RDRAND outputs all
zeros, an attacker would *still* have to accurately model every single
other entropy source also in use.
- The RNG now reseeds itself quite rapidly during boot, starting at 2
seconds, then 4, then 8, then 16, and so forth, so that other sources
of entropy get used without much delay.
- Paranoid users can set random.trust_{cpu,bootloader}=no in the kernel
command line, and paranoid system builders can set the Kconfig options
to N, so there's no reduction or restriction of optionality.
- It's a practical default.
- All the distros have it set this way. Microsoft and Apple trust it
too. Bandwagon.
Cons:
- RDRAND *could* still be backdoored with something like a fixed key or
limited space serial number seed or another indexable scheme like
that. (However, it's hard to imagine threat models where the CPU is
backdoored like this, yet people are still okay making *any*
computations with it or connecting it to networks, etc.)
- RDRAND *could* be defective, rather than backdoored, and produce
garbage that is in one way or another insufficient for crypto.
- Suggesting a *reduction* in paranoia, as this commit effectively does,
may cause some to question my personal integrity as a "security
person".
- Bootloader seeds and RDRAND are generally very difficult if not
altogether impossible to audit.
Keep in mind that this doesn't actually change any behavior. This
is just a change in the default Kconfig value. The distros already are
shipping kernels that set things this way.
Ard made an additional argument in [1]:
We're at the mercy of firmware and micro-architecture anyway, given
that we are also relying on it to ensure that every instruction in
the kernel's executable image has been faithfully copied to memory,
and that the CPU implements those instructions as documented. So I
don't think firmware or ISA bugs related to RNGs deserve special
treatment - if they are broken, we should quirk around them like we
usually do. So enabling these by default is a step in the right
direction IMHO.
In [2], Phil pointed out that having this disabled masked a bug that CI
otherwise would have caught:
A clean 5.15.45 boots cleanly, whereas a downstream kernel shows the
static key warning (but it does go on to boot). The significant
difference is that our defconfigs set CONFIG_RANDOM_TRUST_BOOTLOADER=y
defining that on top of multi_v7_defconfig demonstrates the issue on
a clean 5.15.45. Conversely, not setting that option in a
downstream kernel build avoids the warning
[1] https://lore.kernel.org/lkml/CAMj1kXGi+ieviFjXv9zQBSaGyyzeGW_VpMpTLJK8PJb2QHEQ-w@mail.gmail.com/
[2] https://lore.kernel.org/lkml/c47c42e3-1d56-5859-a6ad-976a1a3381c6@raspberrypi.com/
Cc: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Jason A. Donenfeld [Tue, 7 Jun 2022 15:28:06 +0000 (17:28 +0200)]
random: do not use jump labels before they are initialized
Stephen reported that a static key warning splat appears during early
boot on systems that credit randomness from device trees that contain an
"rng-seed" property, because setup_machine_fdt() is called
before jump_label_init() during setup_arch():
static_key_enable_cpuslocked(): static key '0xffffffe51c6fcfc0' used before call to jump_label_init()
WARNING: CPU: 0 PID: 0 at kernel/jump_label.c:166 static_key_enable_cpuslocked+0xb0/0xb8
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 5.18.0+ #224 44b43e377bfc84bc99bb5ab885ff694984ee09ff
pstate: 600001c9 (nZCv dAIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : static_key_enable_cpuslocked+0xb0/0xb8
lr : static_key_enable_cpuslocked+0xb0/0xb8
sp : ffffffe51c393cf0
x29: ffffffe51c393cf0 x28: 000000008185054c x27: 00000000f1042f10
x26: 0000000000000000 x25: 00000000f10302b2 x24: 0000002513200000
x23: 0000002513200000 x22: ffffffe51c1c9000 x21: fffffffdfdc00000
x20: ffffffe51c2f0831 x19: ffffffe51c6fcfc0 x18: 00000000ffff1020
x17: 00000000e1e2ac90 x16: 00000000000000e0 x15: ffffffe51b710708
x14: 0000000000000066 x13: 0000000000000018 x12: 0000000000000000
x11: 0000000000000000 x10: 00000000ffffffff x9 : 0000000000000000
x8 : 0000000000000000 x7 : 61632065726f6665 x6 : 6220646573752027
x5 : ffffffe51c641d25 x4 : ffffffe51c13142c x3 : ffff0a00ffffff05
x2 : 40000000ffffe003 x1 : 00000000000001c0 x0 : 0000000000000065
Call trace:
static_key_enable_cpuslocked+0xb0/0xb8
static_key_enable+0x2c/0x40
crng_set_ready+0x24/0x30
execute_in_process_context+0x80/0x90
_credit_init_bits+0x100/0x154
add_bootloader_randomness+0x64/0x78
early_init_dt_scan_chosen+0x140/0x184
early_init_dt_scan_nodes+0x28/0x4c
early_init_dt_scan+0x40/0x44
setup_machine_fdt+0x7c/0x120
setup_arch+0x74/0x1d8
start_kernel+0x84/0x44c
__primary_switched+0xc0/0xc8
---[ end trace 0000000000000000 ]---
random: crng init done
Machine model: Google Lazor (rev1 - 2) with LTE
A trivial fix went in to address this on arm64, 73e2d827a501 ("arm64:
Initialize jump labels before setup_machine_fdt()"). I wrote patches as
well for arm32 and risc-v. But still patches are needed on xtensa,
powerpc, arc, and mips. So that's 7 platforms where things aren't quite
right. This sort of points to larger issues that might need a larger
solution.
Instead, this commit just defers setting the static branch until later
in the boot process. random_init() is called after jump_label_init() has
been called, and so is always a safe place from which to adjust the
static branch.
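The ordering problem and the deferral can be sketched as a userspace model. Every name below is a hypothetical stand-in for the kernel state involved, not the actual random.c code: a boolean models the static branch, and a counter models the "used before call to jump_label_init()" warning.

```c
#include <stdbool.h>

/* All names below are hypothetical stand-ins for the kernel state involved. */
static bool jump_labels_initialized;  /* set once jump_label_init() has run */
static bool crng_is_seeded;           /* the RNG has been credited enough entropy */
static bool crng_ready_fast_path;     /* models the crng_ready() static branch */
static int early_enable_warnings;     /* counts "used before init" warnings */

/* Models static_key_enable(): refuses and warns if jump labels are not set up. */
static void model_static_key_enable(void)
{
	if (!jump_labels_initialized) {
		early_enable_warnings++;
		return;
	}
	crng_ready_fast_path = true;
}

/* Models jump_label_init(). */
static void model_jump_label_init(void)
{
	jump_labels_initialized = true;
}

/*
 * Early boot: a bootloader seed marks the RNG as seeded, but the static
 * branch is deliberately left untouched at this point.
 */
static void model_add_bootloader_seed(void)
{
	crng_is_seeded = true;
}

/* Later in boot, after jump labels exist: flipping the branch is now safe. */
static void model_random_init(void)
{
	if (crng_is_seeded)
		model_static_key_enable();
}
```

Calling model_add_bootloader_seed(), then model_jump_label_init(), then model_random_init() ends with the fast path enabled and no warning recorded, independent of how early the seed arrived.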
Fixes: f5bda35fba61 ("random: use static branch for crng_ready()")
Reported-by: Stephen Boyd <swboyd@chromium.org>
Reported-by: Phil Elwell <phil@raspberrypi.com>
Tested-by: Phil Elwell <phil@raspberrypi.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Jason A. Donenfeld [Tue, 7 Jun 2022 15:04:38 +0000 (17:04 +0200)]
random: account for arch randomness in bits
Rather than accounting in bytes and multiplying (shifting), we can just
account in bits and avoid the shift. The main motivation for this is
there are other patches in flux that expand this code a bit, and
avoiding the duplication of "* 8" everywhere makes things a bit clearer.
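As a rough illustration of the two accounting styles, the sketch below computes the same credit both ways. The block and word sizes and the function names are assumptions made for the example, not values taken from the patch.

```c
#include <stdbool.h>
#include <stddef.h>

#define BLOCK_BYTES 32u                         /* hypothetical seed block size */
#define WORD_BYTES  8u                          /* hypothetical arch RNG word size */
#define NUM_WORDS   (BLOCK_BYTES / WORD_BYTES)

/* Before: track bytes and convert to bits at the point of crediting. */
static unsigned int credit_from_bytes(const bool ok[NUM_WORDS])
{
	unsigned int arch_bytes = BLOCK_BYTES;

	for (size_t i = 0; i < NUM_WORDS; i++)
		if (!ok[i])                     /* arch RNG failed for this word */
			arch_bytes -= WORD_BYTES;
	return arch_bytes * 8;
}

/* After: track bits from the start, so no conversion is needed later. */
static unsigned int credit_from_bits(const bool ok[NUM_WORDS])
{
	unsigned int arch_bits = BLOCK_BYTES * 8;

	for (size_t i = 0; i < NUM_WORDS; i++)
		if (!ok[i])
			arch_bits -= WORD_BYTES * 8;
	return arch_bits;
}
```

Both functions return the same number; the second simply keeps the unit consistent, which matters more once several call sites consume the value.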
Cc: stable@vger.kernel.org
Fixes: 12e45a2a6308 ("random: credit architectural init the exact amount")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Jason A. Donenfeld [Tue, 7 Jun 2022 15:00:16 +0000 (17:00 +0200)]
random: mark bootloader randomness code as __init
add_bootloader_randomness() and the variables it touches are only used
during __init and not after, so mark these as __init. At the same time,
unexport this, since it's only called by other __init code that's
built-in.
Cc: stable@vger.kernel.org
Fixes: 428826f5358c ("fdt: add support for rng-seed")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Jason A. Donenfeld [Tue, 7 Jun 2022 07:44:07 +0000 (09:44 +0200)]
random: avoid checking crng_ready() twice in random_init()
The current flow expands to:
if (crng_ready())
...
else if (...)
if (!crng_ready())
...
The second crng_ready() call is redundant, but can't so easily be
optimized out by the compiler.
This commit simplifies that to:
if (crng_ready())
...
else if (...)
...
Fixes: 560181c27b58 ("random: move initialization functions out of hot pages")
Cc: stable@vger.kernel.org
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Jakub Kicinski [Fri, 10 Jun 2022 05:05:36 +0000 (22:05 -0700)]
Merge tag 'mlx5-fixes-2022-06-08' of git://git./linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5 fixes 2022-06-08
This series provides bug fixes to mlx5 driver.
* tag 'mlx5-fixes-2022-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
net/mlx5: fs, fail conflicting actions
net/mlx5: Rearm the FW tracer after each tracer event
net/mlx5: E-Switch, pair only capable devices
net/mlx5e: CT: Fix cleanup of CT before cleanup of TC ct rules
Revert "net/mlx5e: Allow relaxed ordering over VFs"
MAINTAINERS: adjust MELLANOX ETHERNET INNOVA DRIVERS to TLS support removal
====================
Link: https://lore.kernel.org/r/20220608185855.19818-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Andrea Mayer [Wed, 8 Jun 2022 09:19:17 +0000 (11:19 +0200)]
net: seg6: fix seg6_lookup_any_nexthop() to handle VRFs using flowi_l3mdev
Commit 40867d74c374 ("net: Add l3mdev index to flow struct and avoid oif
reset for port devices") adds a new entry (flowi_l3mdev) in the common
flow struct used for indicating the l3mdev index for later rule and
table matching.
The l3mdev_update_flow() has been adapted to properly set the
flowi_l3mdev based on the flowi_oif/flowi_iif. In fact, when a valid
flowi_iif is supplied to the l3mdev_update_flow(), this function can
update the flowi_l3mdev entry only if it has not yet been set (i.e., the
flowi_l3mdev entry is equal to 0).
The SRv6 End.DT6 behavior in VRF mode leverages a VRF device in order to
force the routing lookup into the associated routing table. This routing
operation is performed by seg6_lookup_any_nexthop() preparing a flowi6
data structure used by ip6_route_input_lookup() which, in turn,
(indirectly) invokes l3mdev_update_flow().
However, seg6_lookup_any_nexthop() does not initialize the new
flowi_l3mdev entry which is filled with random garbage data. This
prevents l3mdev_update_flow() from properly updating the flowi_l3mdev
with the VRF index, and thus SRv6 End.DT6 (VRF mode)/DT46 behaviors are
broken.
This patch correctly initializes the flowi6 instance allocated and used
by seg6_lookup_any_nexthop(). Specifically, the entire flowi6 instance
is wiped out: in case new entries are added to flowi/flowi6 (as happened
with the flowi_l3mdev entry), we should no longer have incorrectly
initialized values. As a result of this operation, the value of
flowi_l3mdev is also set to 0.
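The effect of wiping the structure can be sketched in userspace. The struct and both functions below are hypothetical, heavily reduced stand-ins for flowi6, l3mdev_update_flow() and the lookup preparation; they only model the "update l3mdev when it is still 0" rule described above.

```c
#include <string.h>

/* Hypothetical, heavily reduced stand-in for the kernel's struct flowi6. */
struct flow_key {
	int iif;                 /* input interface index */
	int l3mdev;              /* L3 master device index; 0 means "not set yet" */
	unsigned int flowlabel;
};

/* Models the rule in the text: fill in l3mdev only if it is still 0. */
static void update_l3mdev(struct flow_key *fl, int vrf_index)
{
	if (fl->l3mdev == 0)
		fl->l3mdev = vrf_index;
}

/* Wipe the whole key first, then set the fields the lookup cares about. */
static void prepare_lookup_key(struct flow_key *fl, int iif, unsigned int label)
{
	memset(fl, 0, sizeof(*fl));
	fl->iif = iif;
	fl->flowlabel = label;
}
```

With a leftover non-zero value in l3mdev, update_l3mdev() leaves the field alone; after prepare_lookup_key() the field is 0, so the VRF index is applied. Zeroing the whole structure also covers any field added later.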
The proposed fix can be tested easily. Starting from the commit
referenced in the Fixes, selftests [1],[2] indicate that the SRv6
End.DT6 (VRF mode)/DT46 behaviors no longer work correctly. By applying
this patch, those behaviors work properly again.
[1] - tools/testing/selftests/net/srv6_end_dt46_l3vpn_test.sh
[2] - tools/testing/selftests/net/srv6_end_dt6_l3vpn_test.sh
Fixes: 40867d74c374 ("net: Add l3mdev index to flow struct and avoid oif reset for port devices")
Reported-by: Anton Makarov <am@3a-alliance.com>
Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20220608091917.20345-1-andrea.mayer@uniroma2.it
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Fri, 10 Jun 2022 05:02:42 +0000 (22:02 -0700)]
Merge branch 'nfp-fixes-for-v5-19'
Simon Horman says:
====================
nfp: fixes for v5.19
this short series includes two fixes for the NFP driver.
1. Restructure GRE+VLAN flower offload to address a mismatch
between the NIC firmware and driver implementation which
prevented these features from working in combination.
2. Prevent unnecessary warnings regarding rate limiting support.
This feature is not expected to _always_ be present,
but this was not taken into account when the code to check
for this feature was added.
====================
Link: https://lore.kernel.org/r/20220608092901.124780-1-simon.horman@corigine.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Etienne van der Linde [Wed, 8 Jun 2022 09:29:01 +0000 (11:29 +0200)]
nfp: flower: restructure flow-key for gre+vlan combination
Swap around the GRE and VLAN parts in the flow-key offloaded by
the driver to fit in with other tunnel types and the firmware.
Without this change, use cases with GRE+VLAN on the outer header
do not get offloaded, as the flow-key does not match what the
firmware expects.
Fixes: 0d630f58989a ("nfp: flower: add support to offload QinQ match")
Fixes: 5a2b93041646 ("nfp: flower-ct: compile match sections of flow_payload")
Signed-off-by: Etienne van der Linde <etienne.vanderlinde@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Fei Qin [Wed, 8 Jun 2022 09:29:00 +0000 (11:29 +0200)]
nfp: avoid unnecessary check warnings in nfp_app_get_vf_config
nfp_net_sriov_check was added to nfp_app_get_vf_config to ensure that
ivi->vlan_proto and ivi->max_tx_rate/min_tx_rate are read from the
VF config table only when the firmware supports the corresponding
capability.
However, "nfp_app_get_vf_config" can be called by commands like
"ip a", "ip link set $DEV up" and "ip link set $DEV vf $NUM vlan
$param" (with VF). With these commands, many warnings of the form
"ndo_set_vf_<cap_x> not supported" appear if the firmware doesn't
support VF rate limit and 802.1ad VLAN assignment. The more VFs are
created, the more warnings are printed.
Thus, this patch adds an extra bool parameter to nfp_net_sriov_check
to enable/disable the capability check warning. Unnecessary warnings
in nfp_app_get_vf_config are avoided, while valid warnings in the
various VF setting functions are preserved.
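The idea of a warning switch on a capability check can be sketched in plain user-space C as follows. This is only an illustration under stated assumptions, not the driver code: every name prefixed with sketch_ or SKETCH_ is hypothetical, the capability bits are invented, and the return value -1 stands in for whatever error code the real nfp_net_sriov_check returns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical capability bits; the real driver reads these from firmware. */
#define SKETCH_CAP_VLAN_PROTO (1u << 0)
#define SKETCH_CAP_RATE       (1u << 1)

/* Counts emitted warnings so the behavior can be observed directly. */
static int sketch_warn_count;

/*
 * Return 0 if every bit in `cap` is set in `fw_caps`, -1 otherwise.
 * `warn` decides whether a missing capability is reported.
 */
static int sketch_sriov_check(unsigned int fw_caps, unsigned int cap,
                              const char *msg, bool warn)
{
    assert(msg != NULL);
    if ((fw_caps & cap) == cap)
        return 0;
    if (warn) {
        fprintf(stderr, "ndo_set_vf_%s not supported\n", msg);
        sketch_warn_count++;
    }
    return -1;
}

/* Query path: probe quietly and skip fields the firmware cannot supply. */
static bool sketch_get_vf_has_rate(unsigned int fw_caps)
{
    return sketch_sriov_check(fw_caps, SKETCH_CAP_RATE, "rate", false) == 0;
}

/* Set path: the user asked for the feature, so a warning is appropriate. */
static int sketch_set_vf_rate(unsigned int fw_caps)
{
    return sketch_sriov_check(fw_caps, SKETCH_CAP_RATE, "rate", true);
}
```

In this sketch, a query on firmware without the rate capability returns false and prints nothing, while an explicit set request on the same firmware fails and prints one warning.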
Fixes: e0d0e1fdf1ed ("nfp: VF rate limit support")
Fixes: 59359597b010 ("nfp: support 802.1ad VLAN assingment to VF")
Signed-off-by: Fei Qin <fei.qin@corigine.com>
Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Maxim Mikityanskiy [Wed, 8 Jun 2022 15:34:25 +0000 (18:34 +0300)]
tls: Rename TLS_INFO_ZC_SENDFILE to TLS_INFO_ZC_TX
To embrace possible future optimizations of TLS, rename zerocopy
sendfile definitions to more generic ones:
* setsockopt: TLS_TX_ZEROCOPY_SENDFILE -> TLS_TX_ZEROCOPY_RO
* sock_diag: TLS_INFO_ZC_SENDFILE -> TLS_INFO_ZC_RO_TX
RO stands for readonly and emphasizes that the application shouldn't
modify the data being transmitted with zerocopy to avoid potential
disconnection.
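From the application side, the renamed setsockopt might be used roughly as below. This is a minimal sketch, not taken from the patch: the helper name sketch_set_tls_zc_ro is hypothetical, and the fallback values for SOL_TLS and TLS_TX_ZEROCOPY_RO are assumptions for builds against older userspace headers that should be checked against the installed uapi headers.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>

/*
 * Fallbacks for older userspace headers. Both values are assumptions
 * about the Linux uapi headers, not something stated in the commit text.
 */
#ifndef SOL_TLS
#define SOL_TLS 282
#endif
#ifndef TLS_TX_ZEROCOPY_RO
#define TLS_TX_ZEROCOPY_RO 3
#endif

/*
 * Turn read-only zerocopy TX on (1) or off (0) for a socket.
 * Returns 0 on success or a negative errno value on failure.
 */
static int sketch_set_tls_zc_ro(int fd, unsigned int enable)
{
    /* The option is treated as a boolean switch in this sketch. */
    assert(enable <= 1);
    if (setsockopt(fd, SOL_TLS, TLS_TX_ZEROCOPY_RO,
                   &enable, sizeof(enable)) < 0)
        return -errno;
    return 0;
}
```

A real caller would first set up kernel TLS on a connected TCP socket, which is omitted here. After enabling the option, the application must not modify the data being transmitted until the transfer has completed, which is what the RO in the new name signals.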
Fixes: c1318b39c7d3 ("tls: Add opt-in zerocopy mode of sendfile()")
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Link: https://lore.kernel.org/r/20220608153425.3151146-1-maximmi@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Dave Airlie [Fri, 10 Jun 2022 03:29:15 +0000 (13:29 +1000)]
Merge tag 'drm-misc-fixes-2022-06-09' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
Two fixes for panel self-refresh handling, and one to fix
multiple output support on AST.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20220609100754.kvrkjy67gqabjuee@houat