platform/kernel/linux-starfive.git
2 years agomm/damon/sysfs: change few functions execute order
Xin Hao [Thu, 8 Sep 2022 08:19:32 +0000 (16:19 +0800)]
mm/damon/sysfs: change few functions execute order

There's no need to run container_of() as early as we do.

The compiler figures this out, but the resulting code is more readable.

Link: https://lkml.kernel.org/r/20220908081932.77370-1-xhao@linux.alibaba.com
Signed-off-by: Xin Hao <xhao@linux.alibaba.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agomm/huge_memory: prevent THP_ZERO_PAGE_ALLOC increased twice
Liu Shixin [Fri, 9 Sep 2022 02:16:53 +0000 (10:16 +0800)]
mm/huge_memory: prevent THP_ZERO_PAGE_ALLOC increased twice

A user who reads THP_ZERO_PAGE_ALLOC may be more concerned about the huge
zero pages that are really allocated for thp.  It is misleading to
increase THP_ZERO_PAGE_ALLOC twice if two threads call get_huge_zero_page
concurrently.  Don't increase the value if the huge page is not really
used.

Update Documentation/admin-guide/mm/transhuge.rst to suit.

Link: https://lkml.kernel.org/r/20220909021653.3371879-1-liushixin2@huawei.com
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agowriteback: remove unused macro DIRTY_FULL_SCOPE
Miaohe Lin [Fri, 9 Sep 2022 02:57:11 +0000 (10:57 +0800)]
writeback: remove unused macro DIRTY_FULL_SCOPE

It's introduced but never used. Remove it.

Link: https://lkml.kernel.org/r/20220909025711.32012-1-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Acked-by: Jens Axboe <axboe@kernel.dk>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: NeilBrown <neilb@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: zhanglianjie <zhanglianjie@uniontech.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agomm: use nth_page instead of mem_map_offset mem_map_next
Cheng Li [Fri, 9 Sep 2022 07:31:09 +0000 (07:31 +0000)]
mm: use nth_page instead of mem_map_offset mem_map_next

To handle the discontiguous case, mem_map_next() has a parameter named
`offset`.  As a function caller, one would be confused why "get next
entry" needs a parameter named "offset".  The other drawback of
mem_map_next() is that the callers must take care of the map between
parameter "iter" and "offset", otherwise we may get an hole or duplication
during iteration.  So we use nth_page instead of mem_map_next.

And replace mem_map_offset with nth_page() per Matthew's comments.

Link: https://lkml.kernel.org/r/1662708669-9395-1-git-send-email-lic121@chinatelecom.cn
Signed-off-by: Cheng Li <lic121@chinatelecom.cn>
Fixes: 69d177c2fc70 ("hugetlbfs: handle pages higher order than MAX_ORDER")
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agomm/damon: remove duplicate get_monitoring_region() definitions
Xin Hao [Fri, 9 Sep 2022 21:36:06 +0000 (21:36 +0000)]
mm/damon: remove duplicate get_monitoring_region() definitions

In lru_sort.c and reclaim.c, they are all defining get_monitoring_region()
function, there is no need to define it separately.

As 'get_monitoring_region()' is not a 'static' function anymore, we try to
use a prefix to distinguish with other functions, so there rename it to
'damon_find_biggest_system_ram'.

Link: https://lkml.kernel.org/r/20220909213606.136221-1-sj@kernel.org
Signed-off-by: Xin Hao <xhao@linux.alibaba.com>
Signed-off-by: SeongJae Park <sj@kernel.org>
Suggested-by: SeongJae Park <sj@kernel.org>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agomm: kfence: convert to DEFINE_SEQ_ATTRIBUTE
Liu Shixin [Fri, 9 Sep 2022 08:31:40 +0000 (16:31 +0800)]
mm: kfence: convert to DEFINE_SEQ_ATTRIBUTE

Use DEFINE_SEQ_ATTRIBUTE helper macro to simplify the code.

Link: https://lkml.kernel.org/r/20220909083140.3592919-1-liushixin2@huawei.com
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Reviewed-by: Marco Elver <elver@google.com>
Tested-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agozsmalloc: use correct types in _first_obj_offset functions
Alexey Romanov [Fri, 9 Sep 2022 08:37:22 +0000 (11:37 +0300)]
zsmalloc: use correct types in _first_obj_offset functions

Since commit ffedd09fa9b0 ("zsmalloc: Stop using slab fields in struct
page") we are using page->page_type (unsigned int) field instead of
page->units (int) as first object offset in a subpage of zspage.  So
get_first_obj_offset() and set_first_obj_offset() functions should work
with unsigned int type.

Link: https://lkml.kernel.org/r/20220909083722.85024-1-avromanov@sberdevices.ru
Fixes: ffedd09fa9b0 ("zsmalloc: Stop using slab fields in struct page")
Signed-off-by: Alexey Romanov <avromanov@sberdevices.ru>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Alexey Romanov <avromanov@sberdevices.ru>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agomm/shuffle: convert module_param_call to module_param_cb
Liu Shixin [Fri, 9 Sep 2022 08:39:47 +0000 (16:39 +0800)]
mm/shuffle: convert module_param_call to module_param_cb

module_param_call is now completely consistent with module_param_cb, so
there is no need to keep two macros.  Convert module_param_call to
module_param_cb since former is obsolete and latter is more kernel-ish.

Link: https://lkml.kernel.org/r/20220909083947.3595610-1-liushixin2@huawei.com
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Liu Shixin <liushixin2@huawei.com>
Cc: Paul Russel <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agoDocs/admin-guide/mm/damon/usage: note DAMON debugfs interface deprecation plan
SeongJae Park [Fri, 9 Sep 2022 20:29:01 +0000 (20:29 +0000)]
Docs/admin-guide/mm/damon/usage: note DAMON debugfs interface deprecation plan

Commit b18402726bd1 ("Docs/admin-guide/mm/damon/usage: document DAMON
sysfs interface") announced the DAMON debugfs interface deprecation plan,
but it is not so aggressively announced.  As the deprecation time is
coming, this commit makes the announce more easy to be found by adding the
note at the beginning of the DAMON debugfs interface usage document.

Link: https://lkml.kernel.org/r/20220909202901.57977-8-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Yun Levi <ppbuk5246@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agoDocs/admin-guide/mm/damon/start: mention the dependency as sysfs instead of debugfs
SeongJae Park [Fri, 9 Sep 2022 20:29:00 +0000 (20:29 +0000)]
Docs/admin-guide/mm/damon/start: mention the dependency as sysfs instead of debugfs

'Getting Started' document of DAMON says DAMON user-space tool, damo[1],
is using DAMON debugfs interface, and therefore it needs to ensure debugfs
is mounted.  However, the latest version of the tool is using DAMON sysfs
interface.  Moreover, DAMON debugfs interface is going to be deprecated as
announced by commit b18402726bd1 ("Docs/admin-guide/mm/damon/usage:
document DAMON sysfs interface").

This commit therefore update the document to tell readers about DAMON
sysfs interface dependency instead and never mention about debugfs
interface, which will be deprecated.

[1] https://github.com/awslabs/damo

Link: https://lkml.kernel.org/r/20220909202901.57977-7-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Yun Levi <ppbuk5246@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agomm/damon/Kconfig: notify debugfs deprecation plan
SeongJae Park [Fri, 9 Sep 2022 20:28:59 +0000 (20:28 +0000)]
mm/damon/Kconfig: notify debugfs deprecation plan

Commit b18402726bd1 ("Docs/admin-guide/mm/damon/usage: document DAMON
sysfs interface") announced the DAMON debugfs interface deprecation plan,
but it is not so aggressively announced.  As the deprecation time is
coming, this commit makes the announce more easy to be found by adding the
note to the config menu of DAMON debugfs interface.

Link: https://lkml.kernel.org/r/20220909202901.57977-6-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Yun Levi <ppbuk5246@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agoDocs/admin-guide/mm/damon: rename the title of the document
SeongJae Park [Fri, 9 Sep 2022 20:28:58 +0000 (20:28 +0000)]
Docs/admin-guide/mm/damon: rename the title of the document

The title of the DAMON document for admin-guide, 'Monitoring Data
Accesses', could confuse readers in some ways.  First of all, DAMON is not
the only single way for data access monitoring.  And the document is for
not only the data access monitoring but also data access pattern based
memory management optimizations (DAMOS).  This commit updates the title to
'DAMON: Data Access MONitor', which more explicitly explains what the
document describes.

Link: https://lkml.kernel.org/r/20220909202901.57977-5-sj@kernel.org
Fixes: c4ba6014aec3 ("Documentation: add documents for DAMON")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Yun Levi <ppbuk5246@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agomm/damon/core-test: test damon_set_regions
SeongJae Park [Fri, 9 Sep 2022 20:28:57 +0000 (20:28 +0000)]
mm/damon/core-test: test damon_set_regions

Preceding commit fixes a bug in 'damon_set_regions()', which allows holes
in the new monitoring target ranges.  This commit adds a kunit test case
for the problem to avoid any regression.

Link: https://lkml.kernel.org/r/20220909202901.57977-4-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Yun Levi <ppbuk5246@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agomm/damon/core: avoid holes in newly set monitoring target ranges
SeongJae Park [Fri, 9 Sep 2022 20:28:56 +0000 (20:28 +0000)]
mm/damon/core: avoid holes in newly set monitoring target ranges

When there are two or more non-contiguous regions intersecting with given
new ranges, 'damon_set_regions()' does not fill the holes.  This commit
makes the function to fill the holes with newly created regions.

[sj@kernel.org: handle error from 'damon_fill_regions_holes()']
Link: https://lkml.kernel.org/r/20220913215420.57761-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20220909202901.57977-3-sj@kernel.org
Fixes: 3f49584b262c ("mm/damon: implement primitives for the virtual memory address spaces")
Signed-off-by: SeongJae Park <sj@kernel.org>
Reported-by: Yun Levi <ppbuk5246@gmail.com>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agoselftest/damon: add a test for duplicate context dirs creation
SeongJae Park [Fri, 9 Sep 2022 20:28:55 +0000 (20:28 +0000)]
selftest/damon: add a test for duplicate context dirs creation

Patch series "mm/damon: minor fixes and cleanups".

This patchset contains minor fixes and cleanups for DAMON including

- selftest for a bug we found before (Patch 1),
- fix of region holes in vaddr corner case and a kunit test for it
  (Patches 2 and 3), and
- documents/Kconfig updates for title wordsmithing (Patch 4) and more
  aggressive DAMON debugfs interface deprecation announcement
  (Patches 5-7).

This patch (of 7):

Commit d26f60703606 ("mm/damon/dbgfs: avoid duplicate context directory
creation") fixes a bug which could result in memory leak and DAMON
disablement.  This commit adds a selftest for verifying the fix and avoid
regression.

Link: https://lkml.kernel.org/r/20220909202901.57977-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20220909202901.57977-2-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Yun Levi <ppbuk5246@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agotmpfs: add support for an i_version counter
Jeff Layton [Fri, 9 Sep 2022 13:00:31 +0000 (09:00 -0400)]
tmpfs: add support for an i_version counter

NFSv4 mandates a change attribute to avoid problems with timestamp
granularity, which Linux implements using the i_version counter. This is
particularly important when the underlying filesystem is fast.

Give tmpfs an i_version counter. Since it doesn't have to be persistent,
we can just turn on SB_I_VERSION and sprinkle some inode_inc_iversion
calls in the right places.

Also, while there is no formal spec for xattrs, most implementations
update the ctime on setxattr. Fix shmem_xattr_handler_set to update the
ctime and bump the i_version appropriately.

Link: https://lkml.kernel.org/r/20220909130031.15477-1-jlayton@kernel.org
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agomm/damon/vaddr: add a comment for 'default' case in damon_va_apply_scheme()
Kaixu Xia [Thu, 8 Sep 2022 03:13:17 +0000 (11:13 +0800)]
mm/damon/vaddr: add a comment for 'default' case in damon_va_apply_scheme()

The switch case 'DAMOS_STAT' and switch case 'default' have same return
value in damon_va_apply_scheme(), and the 'default' case is for DAMOS
actions that not supported by 'vaddr'.  It might make sense to add a
comment here.

[akpm@linux-foundation.org: fx comment grammar]
Link: https://lkml.kernel.org/r/1662606797-23534-1-git-send-email-kaixuxia@tencent.com
Signed-off-by: Kaixu Xia <kaixuxia@tencent.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agomm/damon: introduce struct damos_access_pattern
Yajun Deng [Thu, 8 Sep 2022 19:14:43 +0000 (19:14 +0000)]
mm/damon: introduce struct damos_access_pattern

damon_new_scheme() has too many parameters, so introduce struct
damos_access_pattern to simplify it.

In additon, we can't use a bpf trace kprobe that has more than 5
parameters.

Link: https://lkml.kernel.org/r/20220908191443.129534-1-sj@kernel.org
Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
Signed-off-by: SeongJae Park <sj@kernel.org>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agomm/rodata_test: use PAGE_ALIGNED() helper
Xiu Jianfeng [Tue, 6 Sep 2022 07:53:12 +0000 (15:53 +0800)]
mm/rodata_test: use PAGE_ALIGNED() helper

Use PAGE_ALIGNED() helper instead of open-coding operation, no functional
changes here.

Link: https://lkml.kernel.org/r/20220906075312.166595-1-xiujianfeng@huawei.com
Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agomm/hwpoison: add __init/__exit annotations to module init/exit funcs
Xiu Jianfeng [Tue, 6 Sep 2022 09:35:30 +0000 (17:35 +0800)]
mm/hwpoison: add __init/__exit annotations to module init/exit funcs

Add missing __init/__exit annotations to module init/exit funcs.

Link: https://lkml.kernel.org/r/20220906093530.243262-1-xiujianfeng@huawei.com
Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agomemcg: reduce size of memcg vmstats structures
Shakeel Butt [Wed, 7 Sep 2022 04:35:37 +0000 (04:35 +0000)]
memcg: reduce size of memcg vmstats structures

The struct memcg_vmstats and struct memcg_vmstats_percpu contains two
arrays each for events of size NR_VM_EVENT_ITEMS which can be as large as
110.  However the memcg v1 only uses 4 of those while memcg v2 uses 15.
The union of both is 17.  On a 64 bit system, we are wasting approximately
((110 - 17) * 8 * 2) * (nr_cpus + 1) bytes which is significant on large
machines.

This patch reduces the size of the given structures by adding one
indirection and only stores array of events which are actually used by the
memcg code.  With this patch, the size of memcg_vmstats has reduced from
2544 bytes to 1056 bytes while the size of memcg_vmstats_percpu has
reduced from 2568 bytes to 1080 bytes.

[akpm@linux-foundation.org: fix memcg_events_local() array index, per Shakeel]
Link: https://lkml.kernel.org/r/CALvZod70Mvxr+Nzb6k0yiU2RFYjTD=0NFhKK-Eyp+5ejd1PSFw@mail.gmail.com
Link: https://lkml.kernel.org/r/20220907043537.3457014-4-shakeelb@google.com
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agomemcg: rearrange code
Shakeel Butt [Wed, 7 Sep 2022 04:35:36 +0000 (04:35 +0000)]
memcg: rearrange code

This is a preparatory patch for easing the review of the follow up patch
which will reduce the memory overhead of memory cgroups.

Link: https://lkml.kernel.org/r/20220907043537.3457014-3-shakeelb@google.com
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agomemcg: extract memcg_vmstats from struct mem_cgroup
Shakeel Butt [Wed, 7 Sep 2022 04:35:35 +0000 (04:35 +0000)]
memcg: extract memcg_vmstats from struct mem_cgroup

Patch series "memcg: reduce memory overhead of memory cgroups".

Currently a lot of memory is wasted to maintain the vmevents for memory
cgroups as we have multiple arrays of size NR_VM_EVENT_ITEMS which can be
as large as 110.  However memcg code uses small portion of those entries.
This patch series eliminate this overhead by removing the unneeded vmevent
entries from memory cgroup data structures.

This patch (of 3):

This is a preparatory patch to reduce the memory overhead of memory
cgroup. The struct memcg_vmstats is the largest object embedded into the
struct mem_cgroup. This patch extracts struct memcg_vmstats from struct
mem_cgroup to ease the following patches in reducing the size of struct
memcg_vmstats.

Link: https://lkml.kernel.org/r/20220907043537.3457014-1-shakeelb@google.com
Link: https://lkml.kernel.org/r/20220907043537.3457014-2-shakeelb@google.com
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agomemblock tests: add new pageblock related macro
Kefeng Wang [Wed, 7 Sep 2022 08:26:43 +0000 (16:26 +0800)]
memblock tests: add new pageblock related macro

Add new pageblock_start_pfn() and pageblock_align() macro which are needed
by memblock tests.

Link: https://lkml.kernel.org/r/20220907082643.186979-1-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agomm: add pageblock_aligned() macro
Kefeng Wang [Wed, 7 Sep 2022 06:08:44 +0000 (14:08 +0800)]
mm: add pageblock_aligned() macro

Add pageblock_aligned() and use it to simplify code.

Link: https://lkml.kernel.org/r/20220907060844.126891-3-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agomm: add pageblock_align() macro
Kefeng Wang [Wed, 7 Sep 2022 06:08:43 +0000 (14:08 +0800)]
mm: add pageblock_align() macro

Add pageblock_align() macro and use it to simplify code.

Link: https://lkml.kernel.org/r/20220907060844.126891-2-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agomm: reuse pageblock_start/end_pfn() macro
Kefeng Wang [Wed, 7 Sep 2022 06:08:42 +0000 (14:08 +0800)]
mm: reuse pageblock_start/end_pfn() macro

Move pageblock_start_pfn/pageblock_end_pfn() into pageblock-flags.h, then
they could be used somewhere else, not only in compaction, also use
ALIGN_DOWN() instead of round_down() to be pair with ALIGN(), which should
be same for pageblock usage.

Link: https://lkml.kernel.org/r/20220907060844.126891-1-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agomm/page_owner.c: remove redundant drain_all_pages
Zhenhua Huang [Wed, 7 Sep 2022 08:01:13 +0000 (16:01 +0800)]
mm/page_owner.c: remove redundant drain_all_pages

Remove an expensive and unnecessary operation as PCP pages are safely
skipped when reading page owner.PCP pages can be skipped because
PAGE_EXT_OWNER_ALLOCATED is cleared.

With draining PCP pages, these pages are moved to buddy list so they can
be identified as buddy pages and skipped quickly.  Although it improved
efficiency of PFN walker, the drain is guaranteed expensive that is
unlikely to be offset by a slight increase in efficiency when skipping
free pages.

PAGE_EXT_OWNER_ALLOCATED is cleared in the page owner reset path below:
free_unref_page
-> free_unref_page_prepare
-> free_pcp_prepare
-> free_pages_prepare which do page owner
reset
-> free_unref_page_commit which add pages into pcp list

Link: https://lkml.kernel.org/r/1662704326-15899-1-git-send-email-quic_zhenhuah@quicinc.com
Link: https://lkml.kernel.org/r/1662633204-10044-1-git-send-email-quic_zhenhuah@quicinc.com
Link: https://lkml.kernel.org/r/1662537673-9392-1-git-send-email-quic_zhenhuah@quicinc.com
Signed-off-by: Zhenhua Huang <quic_zhenhuah@quicinc.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agomm/damon: simplify damon_ctx check in damon_sysfs_before_terminate
Xin Hao [Wed, 7 Sep 2022 08:41:16 +0000 (16:41 +0800)]
mm/damon: simplify damon_ctx check in damon_sysfs_before_terminate

In damon_sysfs_before_terminate(), it needs to check whether ctx->ops.id
supports 'DAMON_OPS_VADDR' or 'DAMON_OPS_FVADDR', there we can use
damon_target_has_pid() instead.

Link: https://lkml.kernel.org/r/20220907084116.62053-1-xhao@linux.alibaba.com
Signed-off-by: Xin Hao <xhao@linux.alibaba.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agomm/damon/core: iterate the regions list from current point in damon_set_regions()
Kaixu Xia [Tue, 6 Sep 2022 15:18:47 +0000 (23:18 +0800)]
mm/damon/core: iterate the regions list from current point in damon_set_regions()

We iterate the whole regions list every time to get the first/last regions
intersecting with the specific range in damon_set_regions(), in order to
add new region or resize existing regions to fit in the specific range.
Actually, it is unnecessary to iterate the new added regions and the front
regions that have been checked.  Just iterate the regions list from the
current point using list_for_each_entry_from() every time to improve
performance.

The kunit tests passed:
 [PASSED] damon_test_apply_three_regions1
 [PASSED] damon_test_apply_three_regions2
 [PASSED] damon_test_apply_three_regions3
 [PASSED] damon_test_apply_three_regions4

Link: https://lkml.kernel.org/r/1662477527-13003-1-git-send-email-kaixuxia@tencent.com
Signed-off-by: Kaixu Xia <kaixuxia@tencent.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agomm/hmm/test: use char dev with struct device to get device node
Mika Penttilä [Fri, 26 Aug 2022 05:06:31 +0000 (08:06 +0300)]
mm/hmm/test: use char dev with struct device to get device node

HMM selftests use an in-kernel pseudo device to emulate device memory.
The pseudo device registers a major device range for two or four pseudo
device instances.  User space has a script that reads /proc/devices in
order to find the assigned major number, and sends that to mknod(1), once
for each node.

Change this to properly use cdev and struct device APIs.

Delete the /proc/devices parsing from the user-space test script, now that
it is unnecessary.

Also, delete an unused field in struct dmirror_device: devmem.

Link: https://lkml.kernel.org/r/20220826050631.25771-1-mpenttil@redhat.com
Signed-off-by: Mika Penttilä <mpenttil@redhat.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agokasan: better invalid/double-free report header
Andrey Konovalov [Sat, 10 Sep 2022 23:25:30 +0000 (01:25 +0200)]
kasan: better invalid/double-free report header

Update the report header for invalid- and double-free bugs to contain the
address being freed:

BUG: KASAN: invalid-free in kfree+0x280/0x2a8
Free of addr ffff00000beac001 by task kunit_try_catch/99

Link: https://lkml.kernel.org/r/fce40f8dbd160972fe01a1ff39d0c426c310e4b7.1662852281.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agokasan: move tests to mm/kasan/
Andrey Konovalov [Mon, 5 Sep 2022 22:18:36 +0000 (00:18 +0200)]
kasan: move tests to mm/kasan/

Move KASAN tests to mm/kasan/ to keep the test code alongside the
implementation.

Link: https://lkml.kernel.org/r/676398f0aeecd47d2f8e3369ea0e95563f641a36.1662416260.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agokasan: add another use-after-free test
Andrey Konovalov [Mon, 5 Sep 2022 21:05:49 +0000 (23:05 +0200)]
kasan: add another use-after-free test

Add a new use-after-free test that checks that KASAN detects
use-after-free when another object was allocated in the same slot.

This test is mainly relevant for the tag-based modes, which do not use
quarantine.

Once [1] is resolved, this test can be extended to check that the stack
traces in the report point to the proper kmalloc/kfree calls.

[1] https://bugzilla.kernel.org/show_bug.cgi?id=212203

Link: https://lkml.kernel.org/r/0659cfa15809dd38faa02bc0a59d0b5dbbd81211.1662411800.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Peter Collingbourne <pcc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agokasan: better identify bug types for tag-based modes
Andrey Konovalov [Mon, 5 Sep 2022 21:05:48 +0000 (23:05 +0200)]
kasan: better identify bug types for tag-based modes

Identify the bug type for the tag-based modes based on the stack trace
entries found in the stack ring.

If a free entry is found first (meaning that it was added last), mark the
bug as use-after-free.  If an alloc entry is found first, mark the bug as
slab-out-of-bounds.  Otherwise, assign the common bug type.

This change returns the functionalify of the previously dropped
CONFIG_KASAN_TAGS_IDENTIFY.

Link: https://lkml.kernel.org/r/13ce7fa07d9d995caedd1439dfae4d51401842f2.1662411800.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Peter Collingbourne <pcc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agokasan: dynamically allocate stack ring entries
Andrey Konovalov [Mon, 5 Sep 2022 21:05:47 +0000 (23:05 +0200)]
kasan: dynamically allocate stack ring entries

Instead of using a large static array, allocate the stack ring dynamically
via memblock_alloc().

The size of the stack ring is controlled by a new kasan.stack_ring_size
command-line parameter.  When kasan.stack_ring_size is not provided, the
default value of 32 << 10 is used.

When the stack trace collection is disabled via kasan.stacktrace=off, the
stack ring is not allocated.

Link: https://lkml.kernel.org/r/03b82ab60db53427e9818e0b0c1971baa10c3cbc.1662411800.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Peter Collingbourne <pcc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agokasan: support kasan.stacktrace for SW_TAGS
Andrey Konovalov [Mon, 5 Sep 2022 21:05:46 +0000 (23:05 +0200)]
kasan: support kasan.stacktrace for SW_TAGS

Add support for the kasan.stacktrace command-line argument for Software
Tag-Based KASAN.

The following patch adds a command-line argument for selecting the stack
ring size, and, as the stack ring is supported by both the Software and
the Hardware Tag-Based KASAN modes, it is natural that both of them have
support for kasan.stacktrace too.

Link: https://lkml.kernel.org/r/3b43059103faa7f8796017847b7d674b658f11b5.1662411799.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Peter Collingbourne <pcc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agokasan: implement stack ring for tag-based modes
Andrey Konovalov [Mon, 5 Sep 2022 21:05:45 +0000 (23:05 +0200)]
kasan: implement stack ring for tag-based modes

Implement storing stack depot handles for alloc/free stack traces for slab
objects for the tag-based KASAN modes in a ring buffer.

This ring buffer is referred to as the stack ring.

On each alloc/free of a slab object, the tagged address of the object and
the current stack trace are recorded in the stack ring.

On each bug report, if the accessed address belongs to a slab object, the
stack ring is scanned for matching entries.  The newest entries are used
to print the alloc/free stack traces in the report: one entry for alloc
and one for free.

The number of entries in the stack ring is fixed in this patch, but one of
the following patches adds a command-line argument to control it.

[andreyknvl@google.com: initialize read-write lock in stack ring]
Link: https://lkml.kernel.org/r/576182d194e27531e8090bad809e4136953895f4.1663700262.git.andreyknvl@google.com
Link: https://lkml.kernel.org/r/692de14b6b6a1bc817fd55e4ad92fc1f83c1ab59.1662411799.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Acked-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Peter Collingbourne <pcc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agokasan: introduce kasan_complete_mode_report_info
Andrey Konovalov [Mon, 5 Sep 2022 21:05:44 +0000 (23:05 +0200)]
kasan: introduce kasan_complete_mode_report_info

Add bug_type and alloc/free_track fields to kasan_report_info and add a
kasan_complete_mode_report_info() function that fills in these fields.
This function is implemented differently for different KASAN mode.

Change the reporting code to use the filled in fields instead of invoking
kasan_get_bug_type() and kasan_get_alloc/free_track().

For the Generic mode, kasan_complete_mode_report_info() invokes these
functions instead.  For the tag-based modes, only the bug_type field is
filled in; alloc/free_track are handled in the next patch.

Using a single function that fills in these fields is required for the
tag-based modes, as the values for all three fields are determined in a
single procedure implemented in the following patch.

Link: https://lkml.kernel.org/r/8432b861054fa8d0cee79a8877dedeaf3b677ca8.1662411799.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Peter Collingbourne <pcc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agokasan: rework function arguments in report.c
Andrey Konovalov [Mon, 5 Sep 2022 21:05:43 +0000 (23:05 +0200)]
kasan: rework function arguments in report.c

Pass a pointer to kasan_report_info to describe_object() and
describe_object_stacks(), instead of passing the structure's fields.

The untagged pointer and the tag are still passed as separate arguments to
some of the functions to avoid duplicating the untagging logic.

This is preparatory change for the next patch.

Link: https://lkml.kernel.org/r/2e0cdb91524ab528a3c2b12b6d8bcb69512fc4af.1662411799.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Peter Collingbourne <pcc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agokasan: fill in cache and object in complete_report_info
Andrey Konovalov [Mon, 5 Sep 2022 21:05:42 +0000 (23:05 +0200)]
kasan: fill in cache and object in complete_report_info

Add cache and object fields to kasan_report_info and fill them in in
complete_report_info() instead of fetching them in the middle of the
report printing code.

This allows the reporting code to get access to the object information
before starting printing the report.  One of the following patches uses
this information to determine the bug type with the tag-based modes.

Link: https://lkml.kernel.org/r/23264572cb2cbb8f0efbb51509b6757eb3cc1fc9.1662411799.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Peter Collingbourne <pcc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agokasan: introduce complete_report_info
Andrey Konovalov [Mon, 5 Sep 2022 21:05:41 +0000 (23:05 +0200)]
kasan: introduce complete_report_info

Introduce a complete_report_info() function that fills in the
first_bad_addr field of kasan_report_info instead of doing it in
kasan_report_*().

This function will be extended in the next patch.

Link: https://lkml.kernel.org/r/8eb1a9bd01f5d31eab4524da54a101b8720b469e.1662411799.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Peter Collingbourne <pcc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agokasan: simplify print_report
Andrey Konovalov [Mon, 5 Sep 2022 21:05:40 +0000 (23:05 +0200)]
kasan: simplify print_report

To simplify reading the implementation of print_report(), remove the
tagged_addr variable and rename untagged_addr to addr.

Link: https://lkml.kernel.org/r/f64f5f1093b3c06896bf0f850c5d9e661313fcb2.1662411799.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Peter Collingbourne <pcc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agokasan: make kasan_addr_to_page static
Andrey Konovalov [Mon, 5 Sep 2022 21:05:39 +0000 (23:05 +0200)]
kasan: make kasan_addr_to_page static

As kasan_addr_to_page() is only used in report.c, rename it to
addr_to_page() and make it static.

Link: https://lkml.kernel.org/r/66c1267200fe0c16e2ac8847a9315fda041918cb.1662411799.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Peter Collingbourne <pcc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agokasan: use kasan_addr_to_slab in print_address_description
Andrey Konovalov [Mon, 5 Sep 2022 21:05:38 +0000 (23:05 +0200)]
kasan: use kasan_addr_to_slab in print_address_description

Use the kasan_addr_to_slab() helper in print_address_description() instead
of separately invoking PageSlab() and page_slab().

Link: https://lkml.kernel.org/r/8b744fbf8c3c7fc5d34329ec70b60ee5c8dba66c.1662411799.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Peter Collingbourne <pcc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agokasan: use virt_addr_valid in kasan_addr_to_page/slab
Andrey Konovalov [Mon, 5 Sep 2022 21:05:37 +0000 (23:05 +0200)]
kasan: use virt_addr_valid in kasan_addr_to_page/slab

Instead of open-coding the validity checks for addr in
kasan_addr_to_page/slab(), use the virt_addr_valid() helper.

Link: https://lkml.kernel.org/r/c22a4850d74d7430f8a6c08216fd55c2860a2b9e.1662411799.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Peter Collingbourne <pcc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agokasan: cosmetic changes in report.c
Andrey Konovalov [Mon, 5 Sep 2022 21:05:36 +0000 (23:05 +0200)]
kasan: cosmetic changes in report.c

Do a few non-functional style fixes for the code in report.c.

Link: https://lkml.kernel.org/r/b728eae71f3ea505a885449724de21cf3f476a7b.1662411799.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Peter Collingbourne <pcc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agokasan: move kasan_get_alloc/free_track definitions
Andrey Konovalov [Mon, 5 Sep 2022 21:05:35 +0000 (23:05 +0200)]
kasan: move kasan_get_alloc/free_track definitions

Move the definitions of kasan_get_alloc/free_track() to report_*.c, as
they belong with other the reporting code.

Link: https://lkml.kernel.org/r/0cb15423956889b3905a0174b58782633bbbd72e.1662411799.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Peter Collingbourne <pcc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agokasan: pass tagged pointers to kasan_save_alloc/free_info
Andrey Konovalov [Mon, 5 Sep 2022 21:05:34 +0000 (23:05 +0200)]
kasan: pass tagged pointers to kasan_save_alloc/free_info

Pass tagged pointers to kasan_save_alloc/free_info().

This is a preparatory patch to simplify other changes in the series.

Link: https://lkml.kernel.org/r/d5bc48cfcf0dca8269dc3ed863047e4d4d2030f1.1662411799.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Peter Collingbourne <pcc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agokasan: only define kasan_cache_create for Generic mode
Andrey Konovalov [Mon, 5 Sep 2022 21:05:33 +0000 (23:05 +0200)]
kasan: only define kasan_cache_create for Generic mode

Right now, kasan_cache_create() assigns SLAB_KASAN for all KASAN modes and
then sets up metadata-related cache parameters for the Generic mode.

SLAB_KASAN is used in two places:

1. In slab_ksize() to account for per-object metadata when
   calculating the size of the accessible memory within the object.
2. In slab_common.c via kasan_never_merge() to prevent merging of
   caches with per-object metadata.

Both cases are only relevant when per-object metadata is present, which is
only the case with the Generic mode.

Thus, assign SLAB_KASAN and define kasan_cache_create() only for the
Generic mode.

Also update the SLAB_KASAN-related comment.

Link: https://lkml.kernel.org/r/61faa2aa1906e2d02c97d00ddf99ce8911dda095.1662411799.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Peter Collingbourne <pcc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agokasan: only define metadata structs for Generic mode
Andrey Konovalov [Mon, 5 Sep 2022 21:05:32 +0000 (23:05 +0200)]
kasan: only define metadata structs for Generic mode

Hide the definitions of kasan_alloc_meta and kasan_free_meta under an
ifdef CONFIG_KASAN_GENERIC check, as these structures are now only used
when the Generic mode is enabled.

Link: https://lkml.kernel.org/r/8d2aabff8c227c444a3f62edf87d5630beb77640.1662411799.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Peter Collingbourne <pcc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agokasan: only define metadata offsets for Generic mode
Andrey Konovalov [Mon, 5 Sep 2022 21:05:31 +0000 (23:05 +0200)]
kasan: only define metadata offsets for Generic mode

Hide the definitions of alloc_meta_offset and free_meta_offset under an
ifdef CONFIG_KASAN_GENERIC check, as these fields are now only used when
the Generic mode is enabled.

Link: https://lkml.kernel.org/r/d4bafa0534facafd1a23c465a94261e64f366493.1662411799.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Peter Collingbourne <pcc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agokasan: only define kasan_never_merge for Generic mode
Andrey Konovalov [Mon, 5 Sep 2022 21:05:30 +0000 (23:05 +0200)]
kasan: only define kasan_never_merge for Generic mode

KASAN prevents merging of slab caches whose objects have per-object
metadata stored in redzones.

As now only the Generic mode uses per-object metadata, define
kasan_never_merge() only for this mode.

Link: https://lkml.kernel.org/r/81ed01f29ff3443580b7e2fe362a8b47b1e8006d.1662411799.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Peter Collingbourne <pcc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agokasan: only define kasan_metadata_size for Generic mode
Andrey Konovalov [Mon, 5 Sep 2022 21:05:29 +0000 (23:05 +0200)]
kasan: only define kasan_metadata_size for Generic mode

KASAN provides a helper for calculating the size of per-object metadata
stored in the redzone.

As now only the Generic mode uses per-object metadata, only define
kasan_metadata_size() for this mode.

Link: https://lkml.kernel.org/r/8f81d4938b80446bc72538a08217009f328a3e23.1662411799.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Peter Collingbourne <pcc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agokasan: drop CONFIG_KASAN_GENERIC check from kasan_init_cache_meta
Andrey Konovalov [Mon, 5 Sep 2022 21:05:28 +0000 (23:05 +0200)]
kasan: drop CONFIG_KASAN_GENERIC check from kasan_init_cache_meta

As kasan_init_cache_meta() is only defined for the Generic mode, it does
not require the CONFIG_KASAN_GENERIC check.

Link: https://lkml.kernel.org/r/211f8f2b213aa91e9148ca63342990b491c4917a.1662411799.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Peter Collingbourne <pcc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agokasan: introduce kasan_init_cache_meta
Andrey Konovalov [Mon, 5 Sep 2022 21:05:27 +0000 (23:05 +0200)]
kasan: introduce kasan_init_cache_meta

Add a kasan_init_cache_meta() helper that initializes metadata-related
cache parameters and use this helper in the common KASAN code.

Put the implementation of this new helper into generic.c, as only the
Generic mode uses per-object metadata.

Link: https://lkml.kernel.org/r/a6d7ea01876eb36472c9879f7b23f1b24766276e.1662411799.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Peter Collingbourne <pcc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agokasan: introduce kasan_requires_meta
Andrey Konovalov [Mon, 5 Sep 2022 21:05:26 +0000 (23:05 +0200)]
kasan: introduce kasan_requires_meta

Add a kasan_requires_meta() helper that indicates whether the enabled
KASAN mode requires per-object metadata and use this helper in the common
code.

Also hide kasan_init_object_meta() under CONFIG_KASAN_GENERIC ifdef check,
as Generic is the only mode that uses per-object metadata.

To allow for a potential future change that makes Generic KASAN support
the kasan.stacktrace command-line parameter, let kasan_requires_meta()
return kasan_stack_collection_enabled() instead of simply returning true.

Link: https://lkml.kernel.org/r/cf837e9996246aaaeebf704ccf8ec26a34fcf64f.1662411799.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Peter Collingbourne <pcc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agokasan: move kasan_get_*_meta to generic.c
Andrey Konovalov [Mon, 5 Sep 2022 21:05:25 +0000 (23:05 +0200)]
kasan: move kasan_get_*_meta to generic.c

Move the implementations of kasan_get_alloc/free_meta() to generic.c, as
the common KASAN code does not use these functions anymore.

Also drop kasan_reset_tag() from the implementation, as the Generic mode
does not tag pointers.

Link: https://lkml.kernel.org/r/ffcfc0ad654d78a2ef4ca054c943ddb4e5ca477b.1662411799.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Peter Collingbourne <pcc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agokasan: clear metadata functions for tag-based modes
Andrey Konovalov [Mon, 5 Sep 2022 21:05:24 +0000 (23:05 +0200)]
kasan: clear metadata functions for tag-based modes

Remove implementations of the metadata-related functions for the tag-based
modes.

The following patches in the series will provide alternative
implementations.

As of this patch, the tag-based modes no longer collect alloc and free
stack traces.  This functionality will be restored later in the series.

Link: https://lkml.kernel.org/r/470fbe5d15e8015092e76e395de354be18ccceab.1662411799.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Peter Collingbourne <pcc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agokasan: introduce kasan_init_object_meta
Andrey Konovalov [Mon, 5 Sep 2022 21:05:23 +0000 (23:05 +0200)]
kasan: introduce kasan_init_object_meta

Add a kasan_init_object_meta() helper that initializes metadata for a slab
object and use it in the common code.

For now, the implementations of this helper are the same for the Generic
and tag-based modes, but they will diverge later in the series.

This change hides references to alloc_meta from the common code.  This is
desired as only the Generic mode will be using per-object metadata after
this series.

Link: https://lkml.kernel.org/r/47c12938fc7f8105e7aaa592527c0e9d3c81fc37.1662411799.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Peter Collingbourne <pcc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agokasan: introduce kasan_get_alloc_track
Andrey Konovalov [Mon, 5 Sep 2022 21:05:22 +0000 (23:05 +0200)]
kasan: introduce kasan_get_alloc_track

Add a kasan_get_alloc_track() helper that fetches alloc_track for a slab
object and use this helper in the common reporting code.

For now, the implementations of this helper are the same for the Generic
and tag-based modes, but they will diverge later in the series.

This change hides references to alloc_meta from the common reporting code.
This is desired as only the Generic mode will be using per-object
metadata after this series.

Link: https://lkml.kernel.org/r/0c365a35f4a833fff46f9d42c3212b32f7166556.1662411799.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Peter Collingbourne <pcc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agokasan: introduce kasan_print_aux_stacks
Andrey Konovalov [Mon, 5 Sep 2022 21:05:21 +0000 (23:05 +0200)]
kasan: introduce kasan_print_aux_stacks

Add a kasan_print_aux_stacks() helper that prints the auxiliary stack
traces for the Generic mode.

This change hides references to alloc_meta from the common reporting code.
This is desired as only the Generic mode will be using per-object
metadata after this series.

Link: https://lkml.kernel.org/r/67c7a9ea6615533762b1f8ccc267cd7f9bafb749.1662411799.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Peter Collingbourne <pcc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agokasan: drop CONFIG_KASAN_TAGS_IDENTIFY
Andrey Konovalov [Mon, 5 Sep 2022 21:05:20 +0000 (23:05 +0200)]
kasan: drop CONFIG_KASAN_TAGS_IDENTIFY

Drop CONFIG_KASAN_TAGS_IDENTIFY and related code to simplify making
changes to the reporting code.

The dropped functionality will be restored in the following patches in
this series.

Link: https://lkml.kernel.org/r/4c66ba98eb237e9ed9312c19d423bbcf4ecf88f8.1662411799.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Peter Collingbourne <pcc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agokasan: split save_alloc_info implementations
Andrey Konovalov [Mon, 5 Sep 2022 21:05:19 +0000 (23:05 +0200)]
kasan: split save_alloc_info implementations

Provide standalone implementations of save_alloc_info() for the Generic
and tag-based modes.

For now, the implementations are the same, but they will diverge later in
the series.

Link: https://lkml.kernel.org/r/77f1a078489c1e859aedb5403f772e5e1f7410a0.1662411799.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Peter Collingbourne <pcc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agokasan: move is_kmalloc check out of save_alloc_info
Andrey Konovalov [Mon, 5 Sep 2022 21:05:18 +0000 (23:05 +0200)]
kasan: move is_kmalloc check out of save_alloc_info

Move kasan_info.is_kmalloc check out of save_alloc_info().

This is a preparatory change that simplifies the following patches in this
series.

Link: https://lkml.kernel.org/r/df89f1915b788f9a10319905af6d0202a3b30c30.1662411799.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Peter Collingbourne <pcc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agokasan: rename kasan_set_*_info to kasan_save_*_info
Andrey Konovalov [Mon, 5 Sep 2022 21:05:17 +0000 (23:05 +0200)]
kasan: rename kasan_set_*_info to kasan_save_*_info

Rename set_alloc_info() and kasan_set_free_info() to save_alloc_info() and
kasan_save_free_info().  The new names make more sense.

Link: https://lkml.kernel.org/r/9f04777a15cb9d96bf00331da98e021d732fe1c9.1662411799.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Peter Collingbourne <pcc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agokasan: check KASAN_NO_FREE_META in __kasan_metadata_size
Andrey Konovalov [Mon, 5 Sep 2022 21:05:16 +0000 (23:05 +0200)]
kasan: check KASAN_NO_FREE_META in __kasan_metadata_size

Patch series "kasan: switch tag-based modes to stack ring from per-object
metadata", v3.

This series makes the tag-based KASAN modes use a ring buffer for storing
stack depot handles for alloc/free stack traces for slab objects instead
of per-object metadata.  This ring buffer is referred to as the stack
ring.

On each alloc/free of a slab object, the tagged address of the object and
the current stack trace are recorded in the stack ring.

On each bug report, if the accessed address belongs to a slab object, the
stack ring is scanned for matching entries.  The newest entries are used
to print the alloc/free stack traces in the report: one entry for alloc
and one for free.

The advantages of this approach over storing stack trace handles in
per-object metadata with the tag-based KASAN modes:

- Allows to find relevant stack traces for use-after-free bugs without
  using quarantine for freed memory. (Currently, if the object was
  reallocated multiple times, the report contains the latest alloc/free
  stack traces, not necessarily the ones relevant to the buggy allocation.)
- Allows to better identify and mark use-after-free bugs, effectively
  making the CONFIG_KASAN_TAGS_IDENTIFY functionality always-on.
- Has fixed memory overhead.

The disadvantage:

- If the affected object was allocated/freed long before the bug happened
  and the stack trace events were purged from the stack ring, the report
  will have no stack traces.

Discussion
==========

The proposed implementation of the stack ring uses a single ring buffer
for the whole kernel.  This might lead to contention due to atomic
accesses to the ring buffer index on multicore systems.

At this point, it is unknown whether the performance impact from this
contention would be significant compared to the slowdown introduced by
collecting stack traces due to the planned changes to the latter part, see
the section below.

For now, the proposed implementation is deemed to be good enough, but this
might need to be revisited once the stack collection becomes faster.

A considered alternative is to keep a separate ring buffer for each CPU
and then iterate over all of them when printing a bug report.  This
approach requires somehow figuring out which of the stack rings has the
freshest stack traces for an object if multiple stack rings have them.

Further plans
=============

This series is a part of an effort to make KASAN stack trace collection
suitable for production.  This requires stack trace collection to be fast
and memory-bounded.

The planned steps are:

1. Speed up stack trace collection (potentially, by using SCS;
   patches on-hold until steps #2 and #3 are completed).
2. Keep stack trace handles in the stack ring (this series).
3. Add a memory-bounded mode to stack depot or provide an alternative
   memory-bounded stack storage.
4. Potentially, implement stack trace collection sampling to minimize
   the performance impact.

This patch (of 34):

__kasan_metadata_size() calculates the size of the redzone for objects in
a slab cache.

When accounting for presence of kasan_free_meta in the redzone, this
function only compares free_meta_offset with 0.  But free_meta_offset
could also be equal to KASAN_NO_FREE_META, which indicates that
kasan_free_meta is not present at all.

Add a comparison with KASAN_NO_FREE_META into __kasan_metadata_size().

Link: https://lkml.kernel.org/r/cover.1662411799.git.andreyknvl@google.com
Link: https://lkml.kernel.org/r/c7b316d30d90e5947eb8280f4dc78856a49298cf.1662411799.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Peter Collingbourne <pcc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agofilemap: convert filemap_range_has_writeback() to use folios
Vishal Moola (Oracle) [Mon, 5 Sep 2022 21:45:57 +0000 (14:45 -0700)]
filemap: convert filemap_range_has_writeback() to use folios

Removes 3 calls to compound_head().

Link: https://lkml.kernel.org/r/20220905214557.868606-1-vishal.moola@gmail.com
Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agohugetlb_encode.h: fix undefined behaviour (34 << 26)
Matthias Goergens [Mon, 5 Sep 2022 03:19:04 +0000 (11:19 +0800)]
hugetlb_encode.h: fix undefined behaviour (34 << 26)

Left-shifting past the size of your datatype is undefined behaviour in C.
The literal 34 gets the type `int`, and that one is not big enough to be
left shifted by 26 bits.

An `unsigned` is long enough (on any machine that has at least 32 bits for
their ints.)

For uniformity, we mark all the literals as unsigned.  But it's only
really needed for HUGETLB_FLAG_ENCODE_16GB.

Thanks to Randy Dunlap for an initial review and suggestion.

Link: https://lkml.kernel.org/r/20220905031904.150925-1-matthias.goergens@gmail.com
Signed-off-by: Matthias Goergens <matthias.goergens@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agomm/damon/sysfs: simplify the judgement whether kdamonds are busy
Kaixu Xia [Sun, 4 Sep 2022 14:36:06 +0000 (22:36 +0800)]
mm/damon/sysfs: simplify the judgement whether kdamonds are busy

It is unnecessary to get the number of the running kdamond to judge
whether kdamonds are busy.  Here we can use the
damon_sysfs_kdamond_running() helper and return -EBUSY directly when
finding a running kdamond.  Meanwhile, merging with the judgement that a
kdamond has current sysfs command callback request to make the code more
clear.

Link: https://lkml.kernel.org/r/1662302166-13216-1-git-send-email-kaixuxia@tencent.com
Signed-off-by: Kaixu Xia <kaixuxia@tencent.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agomm/hugetlb.c: remove unnecessary initialization of local `err'
Li zeming [Mon, 5 Sep 2022 02:09:18 +0000 (10:09 +0800)]
mm/hugetlb.c: remove unnecessary initialization of local `err'

Link: https://lkml.kernel.org/r/20220905020918.3552-1-zeming@nfschina.com
Signed-off-by: Li zeming <zeming@nfschina.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agomm: convert lock_page_or_retry() to folio_lock_or_retry()
Matthew Wilcox (Oracle) [Fri, 2 Sep 2022 19:46:53 +0000 (20:46 +0100)]
mm: convert lock_page_or_retry() to folio_lock_or_retry()

Remove a call to compound_head() in each of the two callers.

Link: https://lkml.kernel.org/r/20220902194653.1739778-58-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agouprobes: use new_folio in __replace_page()
Matthew Wilcox (Oracle) [Fri, 2 Sep 2022 19:46:52 +0000 (20:46 +0100)]
uprobes: use new_folio in __replace_page()

Saves several calls to compound_head().

Link: https://lkml.kernel.org/r/20220902194653.1739778-57-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agormap: remove page_unlock_anon_vma_read()
Matthew Wilcox (Oracle) [Fri, 2 Sep 2022 19:46:51 +0000 (20:46 +0100)]
rmap: remove page_unlock_anon_vma_read()

This was simply an alias for anon_vma_unlock_read() since 2011.

Link: https://lkml.kernel.org/r/20220902194653.1739778-56-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agomm: convert page_get_anon_vma() to folio_get_anon_vma()
Matthew Wilcox (Oracle) [Fri, 2 Sep 2022 19:46:50 +0000 (20:46 +0100)]
mm: convert page_get_anon_vma() to folio_get_anon_vma()

With all callers now passing in a folio, rename the function and convert
all callers.  Removes a couple of calls to compound_head() and a reference
to page->mapping.

Link: https://lkml.kernel.org/r/20220902194653.1739778-55-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agohuge_memory: convert unmap_page() to unmap_folio()
Matthew Wilcox (Oracle) [Fri, 2 Sep 2022 19:46:49 +0000 (20:46 +0100)]
huge_memory: convert unmap_page() to unmap_folio()

Remove a folio->page->folio conversion.

Link: https://lkml.kernel.org/r/20220902194653.1739778-54-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agohuge_memory: convert split_huge_page_to_list() to use a folio
Matthew Wilcox (Oracle) [Fri, 2 Sep 2022 19:46:48 +0000 (20:46 +0100)]
huge_memory: convert split_huge_page_to_list() to use a folio

Saves many calls to compound_head().

Link: https://lkml.kernel.org/r/20220902194653.1739778-53-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agomigrate: convert unmap_and_move_huge_page() to use folios
Matthew Wilcox (Oracle) [Fri, 2 Sep 2022 19:46:47 +0000 (20:46 +0100)]
migrate: convert unmap_and_move_huge_page() to use folios

Saves several calls to compound_head() and removes a couple of uses of
page->lru.

Link: https://lkml.kernel.org/r/20220902194653.1739778-52-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agomigrate: convert __unmap_and_move() to use folios
Matthew Wilcox (Oracle) [Fri, 2 Sep 2022 19:46:46 +0000 (20:46 +0100)]
migrate: convert __unmap_and_move() to use folios

Removes a lot of calls to compound_head().  Also remove a VM_BUG_ON that
can never trigger as the PageAnon bit is the bottom bit of page->mapping.

Link: https://lkml.kernel.org/r/20220902194653.1739778-51-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agormap: convert page_move_anon_rmap() to use a folio
Matthew Wilcox (Oracle) [Fri, 2 Sep 2022 19:46:45 +0000 (20:46 +0100)]
rmap: convert page_move_anon_rmap() to use a folio

Removes one call to compound_head() and a reference to page->mapping.

Link: https://lkml.kernel.org/r/20220902194653.1739778-50-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agomm: remove try_to_free_swap()
Matthew Wilcox (Oracle) [Fri, 2 Sep 2022 19:46:44 +0000 (20:46 +0100)]
mm: remove try_to_free_swap()

All callers have now been converted to folio_free_swap() and we can remove
this wrapper.

Link: https://lkml.kernel.org/r/20220902194653.1739778-49-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agomemcg: convert mem_cgroup_swap_full() to take a folio
Matthew Wilcox (Oracle) [Fri, 2 Sep 2022 19:46:43 +0000 (20:46 +0100)]
memcg: convert mem_cgroup_swap_full() to take a folio

All callers now have a folio, so convert the function to take a folio.
Saves a couple of calls to compound_head().

Link: https://lkml.kernel.org/r/20220902194653.1739778-48-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agomm: convert do_swap_page() to use folio_free_swap()
Matthew Wilcox (Oracle) [Fri, 2 Sep 2022 19:46:42 +0000 (20:46 +0100)]
mm: convert do_swap_page() to use folio_free_swap()

Also convert should_try_to_free_swap() to use a folio.  This removes a few
calls to compound_head().

Link: https://lkml.kernel.org/r/20220902194653.1739778-47-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agoksm: use a folio in replace_page()
Matthew Wilcox (Oracle) [Fri, 2 Sep 2022 19:46:41 +0000 (20:46 +0100)]
ksm: use a folio in replace_page()

Replace three calls to compound_head() with one.

Link: https://lkml.kernel.org/r/20220902194653.1739778-46-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agouprobes: use folios more widely in __replace_page()
Matthew Wilcox (Oracle) [Fri, 2 Sep 2022 19:46:40 +0000 (20:46 +0100)]
uprobes: use folios more widely in __replace_page()

Remove a few hidden calls to compound_head().

Link: https://lkml.kernel.org/r/20220902194653.1739778-45-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agomadvise: convert madvise_free_pte_range() to use a folio
Matthew Wilcox (Oracle) [Fri, 2 Sep 2022 19:46:39 +0000 (20:46 +0100)]
madvise: convert madvise_free_pte_range() to use a folio

Saves a lot of calls to compound_head().

Link: https://lkml.kernel.org/r/20220902194653.1739778-44-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agohuge_memory: convert do_huge_pmd_wp_page() to use a folio
Matthew Wilcox (Oracle) [Fri, 2 Sep 2022 19:46:38 +0000 (20:46 +0100)]
huge_memory: convert do_huge_pmd_wp_page() to use a folio

Removes many calls to compound_head().  Does not remove the assumption
that a folio may not be larger than a PMD.

Link: https://lkml.kernel.org/r/20220902194653.1739778-43-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agomm: convert do_wp_page() to use a folio
Matthew Wilcox (Oracle) [Fri, 2 Sep 2022 19:46:37 +0000 (20:46 +0100)]
mm: convert do_wp_page() to use a folio

Saves many calls to compound_head().

Link: https://lkml.kernel.org/r/20220902194653.1739778-42-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agoswap: convert swap_writepage() to use a folio
Matthew Wilcox (Oracle) [Fri, 2 Sep 2022 19:46:36 +0000 (20:46 +0100)]
swap: convert swap_writepage() to use a folio

Removes many calls to compound_head().

Link: https://lkml.kernel.org/r/20220902194653.1739778-41-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agoswap_state: convert free_swap_cache() to use a folio
Matthew Wilcox (Oracle) [Fri, 2 Sep 2022 19:46:35 +0000 (20:46 +0100)]
swap_state: convert free_swap_cache() to use a folio

Saves several calls to compound_head().

Link: https://lkml.kernel.org/r/20220902194653.1739778-40-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agomm: remove lookup_swap_cache()
Matthew Wilcox (Oracle) [Fri, 2 Sep 2022 19:46:34 +0000 (20:46 +0100)]
mm: remove lookup_swap_cache()

All callers have now been converted to swap_cache_get_folio(), so we can
remove this wrapper.

Link: https://lkml.kernel.org/r/20220902194653.1739778-39-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agomm: convert do_swap_page() to use swap_cache_get_folio()
Matthew Wilcox (Oracle) [Fri, 2 Sep 2022 19:46:33 +0000 (20:46 +0100)]
mm: convert do_swap_page() to use swap_cache_get_folio()

Saves a folio->page->folio conversion.

Link: https://lkml.kernel.org/r/20220902194653.1739778-38-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agoswapfile: convert unuse_pte_range() to use a folio
Matthew Wilcox (Oracle) [Fri, 2 Sep 2022 19:46:32 +0000 (20:46 +0100)]
swapfile: convert unuse_pte_range() to use a folio

Delay fetching the precise page from the folio until we're in unuse_pte().
Saves many calls to compound_head().

Link: https://lkml.kernel.org/r/20220902194653.1739778-37-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agoswapfile: convert __try_to_reclaim_swap() to use a folio
Matthew Wilcox (Oracle) [Fri, 2 Sep 2022 19:46:31 +0000 (20:46 +0100)]
swapfile: convert __try_to_reclaim_swap() to use a folio

Saves five calls to compound_head().

Link: https://lkml.kernel.org/r/20220902194653.1739778-36-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agoswapfile: convert try_to_unuse() to use a folio
Matthew Wilcox (Oracle) [Fri, 2 Sep 2022 19:46:30 +0000 (20:46 +0100)]
swapfile: convert try_to_unuse() to use a folio

Saves five calls to compound_head().

Link: https://lkml.kernel.org/r/20220902194653.1739778-35-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agoshmem: remove shmem_getpage()
Matthew Wilcox (Oracle) [Fri, 2 Sep 2022 19:46:29 +0000 (20:46 +0100)]
shmem: remove shmem_getpage()

With all callers removed, remove this wrapper function.  The flags are now
mysteriously called SGP, but I think we can live with that.

Link: https://lkml.kernel.org/r/20220902194653.1739778-34-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agouserfaultfd: convert mcontinue_atomic_pte() to use a folio
Matthew Wilcox (Oracle) [Fri, 2 Sep 2022 19:46:28 +0000 (20:46 +0100)]
userfaultfd: convert mcontinue_atomic_pte() to use a folio

shmem_getpage() is being replaced by shmem_get_folio() so use a folio
throughout this function.  Saves several calls to compound_head().

Link: https://lkml.kernel.org/r/20220902194653.1739778-33-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agokhugepaged: call shmem_get_folio()
Matthew Wilcox (Oracle) [Fri, 2 Sep 2022 19:46:27 +0000 (20:46 +0100)]
khugepaged: call shmem_get_folio()

shmem_getpage() is being removed, so call its replacement and find the
precise page ourselves.

Link: https://lkml.kernel.org/r/20220902194653.1739778-32-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agoshmem: convert shmem_get_link() to use a folio
Matthew Wilcox (Oracle) [Fri, 2 Sep 2022 19:46:26 +0000 (20:46 +0100)]
shmem: convert shmem_get_link() to use a folio

Symlinks will never use a large folio, but using the folio API removes a
lot of unnecessary folio->page->folio conversions.

Link: https://lkml.kernel.org/r/20220902194653.1739778-31-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2 years agoshmem: convert shmem_symlink() to use a folio
Matthew Wilcox (Oracle) [Fri, 2 Sep 2022 19:46:25 +0000 (20:46 +0100)]
shmem: convert shmem_symlink() to use a folio

While symlinks will always be < PAGE_SIZE, using the folio APIs gets rid
of unnecessary calls to compound_head().

Link: https://lkml.kernel.org/r/20220902194653.1739778-30-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>