mm/migrate_device: allow pte_offset_map_lock() to fail
author    Hugh Dickins <hughd@google.com>
Fri, 9 Jun 2023 01:38:17 +0000 (18:38 -0700)
committer Andrew Morton <akpm@linux-foundation.org>
Mon, 19 Jun 2023 23:19:17 +0000 (16:19 -0700)
migrate_vma_collect_pmd(): remove the pmd_trans_unstable() handling after
splitting the huge zero pmd, and the pmd_none() handling after successfully
splitting a huge page: those cases are now handled inside
pte_offset_map_lock(), and by "goto again" when it fails.

But the skip after an unsuccessful split_huge_page() must stay: it avoids
an endless loop.  The skip when pmd_bad()?  Remove that: once cleared by
pte_offset_map_lock() it will be treated as a hole rather than a skip,
which (timing differences aside) would have been the outcome anyway; and
it's arguably best to leave the pmd_bad() handling centralized there.
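
To illustrate, a simplified sketch of the resulting migrate_vma_collect_pmd()
flow, condensed from the diff below (locking details and unchanged code
elided with "..."):

	again:
		...
		if (is_huge_zero_page(page)) {
			spin_unlock(ptl);
			split_huge_pmd(vma, pmdp, addr);
		} else if (split_huge_page(page)) {
			/* unsuccessful split: skip, avoiding an endless loop */
			return migrate_vma_collect_skip(start, end, walk);
		}

		ptep = pte_offset_map_lock(mm, pmdp, addr, &ptl);
		if (!ptep)
			/* pmd changed underneath (none, bad, huge): rescan */
			goto again;
		arch_enter_lazy_mmu_mode();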

migrate_vma_insert_page(): remove the comment about the old pte_offset_map()
and its locking limitations; remove the pmd_trans_unstable() check and just
proceed to pte_offset_map_lock(), aborting when it fails (the page has
already been charged to memcg, but as in other cases it is uncharged when
freed).
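
A corresponding sketch of the tail of migrate_vma_insert_page() after this
change, condensed from the diff below:

	if (pte_alloc(mm, pmdp))
		goto abort;
	if (unlikely(anon_vma_prepare(vma)))
		goto abort;
	if (mem_cgroup_charge(page_folio(page), vma->vm_mm, GFP_KERNEL))
		goto abort;
	...
	ptep = pte_offset_map_lock(mm, pmdp, addr, &ptl);
	if (!ptep)
		goto abort;	/* memcg charge is dropped when the page is freed */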

Link: https://lkml.kernel.org/r/1131be62-2e84-da2f-8f45-807b2cbeeec5@google.com
Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: SeongJae Park <sj@kernel.org>
Cc: Song Liu <song@kernel.org>
Cc: Steven Price <steven.price@arm.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Yu Zhao <yuzhao@google.com>
Cc: Zack Rusin <zackr@vmware.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
diff --git a/mm/migrate_device.c b/mm/migrate_device.c
index d30c9de60b0d7d37dff51f68baa98428c0b05e9f..a14af6b12b04bda48bb82f2a7283e836f2ae11d9 100644
--- a/mm/migrate_device.c
+++ b/mm/migrate_device.c
@@ -83,9 +83,6 @@ again:
                if (is_huge_zero_page(page)) {
                        spin_unlock(ptl);
                        split_huge_pmd(vma, pmdp, addr);
-                       if (pmd_trans_unstable(pmdp))
-                               return migrate_vma_collect_skip(start, end,
-                                                               walk);
                } else {
                        int ret;
 
@@ -100,16 +97,12 @@ again:
                        if (ret)
                                return migrate_vma_collect_skip(start, end,
                                                                walk);
-                       if (pmd_none(*pmdp))
-                               return migrate_vma_collect_hole(start, end, -1,
-                                                               walk);
                }
        }
 
-       if (unlikely(pmd_bad(*pmdp)))
-               return migrate_vma_collect_skip(start, end, walk);
-
        ptep = pte_offset_map_lock(mm, pmdp, addr, &ptl);
+       if (!ptep)
+               goto again;
        arch_enter_lazy_mmu_mode();
 
        for (; addr < end; addr += PAGE_SIZE, ptep++) {
@@ -595,27 +588,10 @@ static void migrate_vma_insert_page(struct migrate_vma *migrate,
        pmdp = pmd_alloc(mm, pudp, addr);
        if (!pmdp)
                goto abort;
-
        if (pmd_trans_huge(*pmdp) || pmd_devmap(*pmdp))
                goto abort;
-
-       /*
-        * Use pte_alloc() instead of pte_alloc_map().  We can't run
-        * pte_offset_map() on pmds where a huge pmd might be created
-        * from a different thread.
-        *
-        * pte_alloc_map() is safe to use under mmap_write_lock(mm) or when
-        * parallel threads are excluded by other means.
-        *
-        * Here we only have mmap_read_lock(mm).
-        */
        if (pte_alloc(mm, pmdp))
                goto abort;
-
-       /* See the comment in pte_alloc_one_map() */
-       if (unlikely(pmd_trans_unstable(pmdp)))
-               goto abort;
-
        if (unlikely(anon_vma_prepare(vma)))
                goto abort;
        if (mem_cgroup_charge(page_folio(page), vma->vm_mm, GFP_KERNEL))
@@ -650,7 +626,8 @@ static void migrate_vma_insert_page(struct migrate_vma *migrate,
        }
 
        ptep = pte_offset_map_lock(mm, pmdp, addr, &ptl);
-
+       if (!ptep)
+               goto abort;
        if (check_stable_address_space(mm))
                goto unlock_abort;