mm: create a new system state and fix core_kernel_text()
author Christophe Leroy <christophe.leroy@csgroup.eu>
Fri, 5 Nov 2021 20:40:40 +0000 (13:40 -0700)
committer Linus Torvalds <torvalds@linux-foundation.org>
Sat, 6 Nov 2021 20:30:38 +0000 (13:30 -0700)
core_kernel_text() considers that until system_state is at least
SYSTEM_RUNNING, init memory is valid.

But init memory is freed a few lines before SYSTEM_RUNNING is set, so
there is a short window in which core_kernel_text() wrongly reports
already-freed init text as valid.

Create an intermediate system state called SYSTEM_FREEING_INITMEM that
is set just before init memory starts being freed, and use it in
core_kernel_text() to report init memory as invalid earlier.

Link: https://lkml.kernel.org/r/9ecfdee7dd4d741d172cb93ff1d87f1c58127c9a.1633001016.git.christophe.leroy@csgroup.eu
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
include/linux/kernel.h
init/main.c
kernel/extable.c

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 2776423..471bc05 100644
@@ -248,6 +248,7 @@ extern bool early_boot_irqs_disabled;
 extern enum system_states {
        SYSTEM_BOOTING,
        SYSTEM_SCHEDULING,
+       SYSTEM_FREEING_INITMEM,
        SYSTEM_RUNNING,
        SYSTEM_HALT,
        SYSTEM_POWER_OFF,
diff --git a/init/main.c b/init/main.c
index 3c4054a..767ee26 100644
@@ -1506,6 +1506,8 @@ static int __ref kernel_init(void *unused)
        kernel_init_freeable();
        /* need to finish all async __init code before freeing the memory */
        async_synchronize_full();
+
+       system_state = SYSTEM_FREEING_INITMEM;
        kprobe_free_init_mem();
        ftrace_free_init_mem();
        kgdb_free_init_mem();
diff --git a/kernel/extable.c b/kernel/extable.c
index b0ea5eb..290661f 100644
@@ -76,7 +76,7 @@ int notrace core_kernel_text(unsigned long addr)
            addr < (unsigned long)_etext)
                return 1;
 
-       if (system_state < SYSTEM_RUNNING &&
+       if (system_state < SYSTEM_FREEING_INITMEM &&
            init_kernel_text(addr))
                return 1;
        return 0;
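
The effect of the new state on the check can be sketched with a small userspace model. This is not kernel code: the `model_` helpers and the `MODEL_*` address bounds are hypothetical stand-ins for the real linker symbols, and the enum is trimmed to the states relevant here. It only illustrates how the comparison against the state ordering changes the result while init memory is being freed.

```c
#include <stdbool.h>

/* Trimmed copy of the state ordering from the patch (later states omitted). */
enum system_states {
	SYSTEM_BOOTING,
	SYSTEM_SCHEDULING,
	SYSTEM_FREEING_INITMEM,	/* state added by this patch */
	SYSTEM_RUNNING,
};

static enum system_states system_state = SYSTEM_BOOTING;

/* Hypothetical section bounds; the kernel uses linker-provided symbols. */
#define MODEL_STEXT	0x1000UL
#define MODEL_ETEXT	0x2000UL
#define MODEL_SINITTEXT	0x3000UL
#define MODEL_EINITTEXT	0x4000UL

/* Model of init_kernel_text(): is addr inside the init text range? */
static bool model_init_kernel_text(unsigned long addr)
{
	return addr >= MODEL_SINITTEXT && addr < MODEL_EINITTEXT;
}

/* Check as it was before the patch: init text valid until SYSTEM_RUNNING. */
static int model_core_kernel_text_old(unsigned long addr)
{
	if (addr >= MODEL_STEXT && addr < MODEL_ETEXT)
		return 1;

	if (system_state < SYSTEM_RUNNING && model_init_kernel_text(addr))
		return 1;
	return 0;
}

/* Check after the patch: init text invalid once freeing has begun. */
static int model_core_kernel_text_new(unsigned long addr)
{
	if (addr >= MODEL_STEXT && addr < MODEL_ETEXT)
		return 1;

	if (system_state < SYSTEM_FREEING_INITMEM &&
	    model_init_kernel_text(addr))
		return 1;
	return 0;
}
```

In this model, with `system_state` set to `SYSTEM_FREEING_INITMEM`, the old check still accepts an address inside the init range while the new check rejects it. Both agree again once the state reaches `SYSTEM_RUNNING`, and core text addresses are accepted in every state.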