mm/pagewalk: fix EFI_PGT_DUMP of espfix area
author Hugh Dickins <hughd@google.com>
Sun, 23 Jul 2023 21:17:55 +0000 (14:17 -0700)
committer Andrew Morton <akpm@linux-foundation.org>
Thu, 27 Jul 2023 20:07:04 +0000 (13:07 -0700)
Booting x86_64 with CONFIG_EFI_PGT_DUMP=y shows messages of the form
"mm/pgtable-generic.c:53: bad pmd (____ptrval____)(8000000100077061)".

EFI_PGT_DUMP dumps all of efi_mm, including the espfix area, which is set
up with pmd entries which fit the pmd_bad() check: so 0d940a9b270b warns
and clears those entries, which would ruin running Win16 binaries.
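
To see why the value in the warning trips the check, here is a simplified
userspace model of the flag comparison. It is a sketch, not the kernel's
code: the names model_pmd_flags() and model_pmd_bad() are hypothetical, and
it assumes that the check compares the entry's flag bits (ignoring USER and
ACCESSED) against those of an ordinary kernel page-table pointer, with the
frame number held in bits 12..51.

```c
#include <stdbool.h>
#include <stdint.h>

/* x86 page-table flag bits at their architectural positions. */
#define PAGE_PRESENT  (UINT64_C(1) << 0)
#define PAGE_RW       (UINT64_C(1) << 1)
#define PAGE_USER     (UINT64_C(1) << 2)
#define PAGE_ACCESSED (UINT64_C(1) << 5)
#define PAGE_DIRTY    (UINT64_C(1) << 6)
#define PAGE_NX       (UINT64_C(1) << 63)

/* Assumption: the physical frame number occupies bits 12..51. */
#define PFN_MASK      UINT64_C(0x000ffffffffff000)

/* Assumption: flags expected on an ordinary kernel page-table pointer. */
#define KERNPG_TABLE  (PAGE_PRESENT | PAGE_RW | PAGE_ACCESSED | PAGE_DIRTY)

/* Keep only the flag bits of an entry, dropping the frame number. */
static uint64_t model_pmd_flags(uint64_t pmd)
{
	return pmd & ~PFN_MASK;
}

/* Hypothetical model of the "bad pmd" test: any flag difference is bad. */
static bool model_pmd_bad(uint64_t pmd)
{
	uint64_t ignore = PAGE_USER | PAGE_ACCESSED;

	return (model_pmd_flags(pmd) & ~ignore) != (KERNPG_TABLE & ~ignore);
}
```

Under these assumptions, model_pmd_bad(0x8000000100077061) is true: the
entry has PAGE_NX set and PAGE_RW clear, so its flags differ from the
expected ones, whereas an entry such as 0x0000000100077063 would pass.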

The failing pte_offset_map() stopped such a kernel from even booting,
until a few commits later be872f83bf57 changed the pagewalk to tolerate
that: but it needs to be even more careful, to not spoil those entries.

I might have preferred to change init_espfix_ap() not to use "bad" pmd
entries; or to leave them out of the efi_mm dump.  But there is great
value in staying away from there, and a pagewalk check of address against
TASK_SIZE may protect from other such aberrations too.
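
The effect of the added address check can be sketched as a small decision
function. This is only an illustration of the no_vma branch: the names
choose_lookup_old() and choose_lookup_new() are hypothetical, and the
constants assume x86_64 with 4-level paging (user limit of 2^47 minus one
page, espfix area starting at 0xffffff0000000000).

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE   UINT64_C(4096)
/* Assumption: user address limit with 4-level paging on x86_64. */
#define MODEL_TASK_SIZE   ((UINT64_C(1) << 47) - MODEL_PAGE_SIZE)
/* Assumption: start of the espfix area in the x86_64 memory map. */
#define MODEL_ESPFIX_BASE UINT64_C(0xffffff0000000000)

enum pte_lookup {
	LOOKUP_KERNEL,	/* stands for pte_offset_kernel(): no pmd validation */
	LOOKUP_MAP,	/* stands for pte_offset_map(): may reject "bad" pmds */
};

/* Before the fix: only init_mm took the unchecked kernel lookup. */
static enum pte_lookup choose_lookup_old(bool mm_is_init_mm, uint64_t addr)
{
	(void)addr;	/* the address played no part in the old decision */
	return mm_is_init_mm ? LOOKUP_KERNEL : LOOKUP_MAP;
}

/* After the fix: any address at or above TASK_SIZE does so as well. */
static enum pte_lookup choose_lookup_new(bool mm_is_init_mm, uint64_t addr)
{
	if (mm_is_init_mm || addr >= MODEL_TASK_SIZE)
		return LOOKUP_KERNEL;
	return LOOKUP_MAP;
}
```

With these assumed values, a walk of efi_mm (not init_mm) reaching
MODEL_ESPFIX_BASE takes the validating lookup under the old rule and the
kernel lookup under the new one, while user addresses are treated as before.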

Link: https://lkml.kernel.org/r/22bca736-4cab-9ee5-6a52-73a3b2bbe865@google.com
Closes: https://lore.kernel.org/linux-mm/CABXGCsN3JqXckWO=V7p=FhPU1tK03RE1w9UE6xL5Y86SMk209w@mail.gmail.com/
Fixes: 0d940a9b270b ("mm/pgtable: allow pte_offset_map[_lock]() to fail")
Fixes: be872f83bf57 ("mm/pagewalk: walk_pte_range() allow for pte_offset_map()")
Signed-off-by: Hugh Dickins <hughd@google.com>
Reported-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Cc: Bagas Sanjaya <bagasdotme@gmail.com>
Cc: Laura Abbott <labbott@fedoraproject.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
mm/pagewalk.c

index 6443710..2022333 100644
@@ -48,8 +48,11 @@ static int walk_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
        if (walk->no_vma) {
                /*
                 * pte_offset_map() might apply user-specific validation.
+                * Indeed, on x86_64 the pmd entries set up by init_espfix_ap()
+                * fit its pmd_bad() check (_PAGE_NX set and _PAGE_RW clear),
+                * and CONFIG_EFI_PGT_DUMP efi_mm goes so far as to walk them.
                 */
-               if (walk->mm == &init_mm)
+               if (walk->mm == &init_mm || addr >= TASK_SIZE)
                        pte = pte_offset_kernel(pmd, addr);
                else
                        pte = pte_offset_map(pmd, addr);