mm/pagewalk: walk_pte_range() allow for pte_offset_map()
author	Hugh Dickins <hughd@google.com>
	Fri, 9 Jun 2023 01:18:49 +0000 (18:18 -0700)
committer	Andrew Morton <akpm@linux-foundation.org>
	Mon, 19 Jun 2023 23:19:14 +0000 (16:19 -0700)
walk_pte_range() has a no_vma option to serve walk_page_range_novma().  I
don't know of any problem, but it looks safer to check for init_mm, and
use pte_offset_kernel() rather than pte_offset_map() in that case:
pte_offset_map()'s pmdval validation is intended for userspace.
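
For illustration only, not part of the patch: a minimal sketch of the mapping
choice described above, as it might sit inside mm/pagewalk.c (the helper name
choose_pte_map() is made up).  init_mm's page tables are never freed under the
walker, so pte_offset_kernel() is safe and cannot fail; user page tables go
through pte_offset_map(), whose pmdval revalidation may now return NULL.

static pte_t *choose_pte_map(struct mm_walk *walk, pmd_t *pmd,
			     unsigned long addr)
{
	/* kernel page tables are stable: no pmdval revalidation needed */
	if (walk->mm == &init_mm)
		return pte_offset_kernel(pmd, addr);	/* cannot fail */

	/* user page tables may be freed or changed under us */
	return pte_offset_map(pmd, addr);		/* may return NULL */
}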

Allow for its pte_offset_map() or pte_offset_map_lock() to fail, and retry
with ACTION_AGAIN if so.  Add a second check for ACTION_AGAIN in
walk_pmd_range(), to catch it after return from walk_pte_range().
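
As an aside (not from the patch), this is the general shape any
pte_offset_map_lock() caller follows now that it can fail; touch_one_pte() is
an invented name.  A NULL return means the PTE table was freed or replaced
underneath us, so back out and retry from the pmd level rather than
dereferencing - which is exactly what ACTION_AGAIN expresses inside the walker.

static int touch_one_pte(struct mm_struct *mm, pmd_t *pmd, unsigned long addr)
{
	spinlock_t *ptl;
	pte_t *pte;

	pte = pte_offset_map_lock(mm, pmd, addr, &ptl);
	if (!pte)
		return -EAGAIN;		/* retry from the pmd level */

	/* ... inspect or modify *pte while holding ptl ... */

	pte_unmap_unlock(pte, ptl);
	return 0;
}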

Remove the pmd_trans_unstable() check after split_huge_pmd() in
walk_pmd_range(): walk_pte_range() now handles those cases safely (and
they must fail powerpc's is_hugepd() check).
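
Conceptually (a sketch only, not the real implementation; the name
pte_offset_map_sketch() is invented), pte_offset_map() now rechecks the pmd
entry itself before handing back a PTE pointer, which is why a separate
pmd_trans_unstable() test after split_huge_pmd() adds nothing:

static pte_t *pte_offset_map_sketch(pmd_t *pmd, unsigned long addr)
{
	pmd_t pmdval = pmdp_get_lockless(pmd);

	/* no longer a page table? (none, huge, migration entry, corrupt) */
	if (pmd_none(pmdval) || !pmd_present(pmdval) ||
	    pmd_leaf(pmdval) || pmd_bad(pmdval))
		return NULL;		/* walk_pte_range() sets ACTION_AGAIN */

	return pte_offset_kernel(pmd, addr);	/* simplified: no RCU/kmap here */
}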

Link: https://lkml.kernel.org/r/3eba6f0-2b-fb66-6bb6-2ee8533e221@google.com
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Song Liu <song@kernel.org>
Cc: Steven Price <steven.price@arm.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Yu Zhao <yuzhao@google.com>
Cc: Zack Rusin <zackr@vmware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
diff --git a/mm/pagewalk.c b/mm/pagewalk.c
index cb23f8a..6443710 100644
--- a/mm/pagewalk.c
+++ b/mm/pagewalk.c
@@ -46,15 +46,27 @@ static int walk_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
        spinlock_t *ptl;
 
        if (walk->no_vma) {
-               pte = pte_offset_map(pmd, addr);
-               err = walk_pte_range_inner(pte, addr, end, walk);
-               pte_unmap(pte);
+               /*
+                * pte_offset_map() might apply user-specific validation.
+                */
+               if (walk->mm == &init_mm)
+                       pte = pte_offset_kernel(pmd, addr);
+               else
+                       pte = pte_offset_map(pmd, addr);
+               if (pte) {
+                       err = walk_pte_range_inner(pte, addr, end, walk);
+                       if (walk->mm != &init_mm)
+                               pte_unmap(pte);
+               }
        } else {
                pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
-               err = walk_pte_range_inner(pte, addr, end, walk);
-               pte_unmap_unlock(pte, ptl);
+               if (pte) {
+                       err = walk_pte_range_inner(pte, addr, end, walk);
+                       pte_unmap_unlock(pte, ptl);
+               }
        }
-
+       if (!pte)
+               walk->action = ACTION_AGAIN;
        return err;
 }
 
@@ -141,11 +153,8 @@ again:
                    !(ops->pte_entry))
                        continue;
 
-               if (walk->vma) {
+               if (walk->vma)
                        split_huge_pmd(walk->vma, pmd, addr);
-                       if (pmd_trans_unstable(pmd))
-                               goto again;
-               }
 
                if (is_hugepd(__hugepd(pmd_val(*pmd))))
                        err = walk_hugepd_range((hugepd_t *)pmd, addr, next, walk, PMD_SHIFT);
@@ -153,6 +162,10 @@ again:
                        err = walk_pte_range(pmd, addr, next, walk);
                if (err)
                        break;
+
+               if (walk->action == ACTION_AGAIN)
+                       goto again;
+
        } while (pmd++, addr = next, addr != end);
 
        return err;
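
For context only (not part of this patch): a hedged sketch of the caller side,
showing that walker users need no change.  The function and callback names
below are made up; walk_page_range_novma() expects the mmap lock held for
write, and a racing page-table removal now simply makes the walker remap and
retry internally via ACTION_AGAIN instead of walking a stale table.

#include <linux/pagewalk.h>

static int count_present_ptes(pte_t *pte, unsigned long addr,
			      unsigned long next, struct mm_walk *walk)
{
	unsigned long *count = walk->private;

	if (pte_present(ptep_get(pte)))
		(*count)++;
	return 0;
}

static const struct mm_walk_ops count_ops = {
	.pte_entry = count_present_ptes,
};

static unsigned long count_range(struct mm_struct *mm,
				 unsigned long start, unsigned long end)
{
	unsigned long count = 0;

	mmap_write_lock(mm);
	walk_page_range_novma(mm, start, end, &count_ops, NULL, &count);
	mmap_write_unlock(mm);
	return count;
}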