mm, memory_failure: don't send BUS_MCEERR_AO for action required error
author    Wetp Zhang <wetp.zy@linux.alibaba.com>
          Tue, 2 Jun 2020 04:50:11 +0000 (21:50 -0700)
committer Linus Torvalds <torvalds@linux-foundation.org>
          Tue, 2 Jun 2020 17:59:10 +0000 (10:59 -0700)

Some processes don't want to be killed early, but in the "Action
Required" case they may still be killed by BUS_MCEERR_AO when they share
memory with another process that is accessing the failed memory.  Sending
SIGBUS with BUS_MCEERR_AO for an action required error is also strange,
so ignore the non-current processes here.
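
The resulting decision can be summarized with a small userspace model.
This is only a sketch, not kernel code: the enum and function names are
hypothetical, the two booleans stand in for (flags & MF_ACTION_REQUIRED)
and (t->mm == current->mm), and the else branch (cut off in the hunk
below) is assumed to send BUS_MCEERR_AO, as the description above implies.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for what kill_proc() ends up doing. */
enum kp_signal {
	KP_SIG_NONE,	/* no signal is sent */
	KP_SIG_AR,	/* forced SIGBUS with BUS_MCEERR_AR */
	KP_SIG_AO,	/* SIGBUS with BUS_MCEERR_AO */
};

/* Before this patch: anything that is not "AR on current" falls to AO. */
static enum kp_signal choose_signal_old(bool action_required, bool same_mm)
{
	if (action_required && same_mm)
		return KP_SIG_AR;
	return KP_SIG_AO;
}

/* After this patch: action required errors never signal non-current tasks. */
static enum kp_signal choose_signal_new(bool action_required, bool same_mm)
{
	if (action_required)
		return same_mm ? KP_SIG_AR : KP_SIG_NONE;
	return KP_SIG_AO;
}

/* The pr_err() line is now printed only when a signal will be sent. */
static bool should_log(bool action_required, bool same_mm)
{
	return same_mm || !action_required;
}
```

The only input for which the two models differ is an action required
error seen by a task that does not share current's mm: the old logic
yields BUS_MCEERR_AO, the new logic sends nothing and logs nothing.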

Suggested-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Signed-off-by: Wetp Zhang <wetp.zy@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Acked-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Link: http://lkml.kernel.org/r/1590817116-21281-1-git-send-email-wetp.zy@linux.alibaba.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm/memory-failure.c

index a96364be8ab40066f7ee121a7895094872e28315..dd3862fcf2e9db5f231d92c7f1e32f8ea4060d53 100644
@@ -210,14 +210,17 @@ static int kill_proc(struct to_kill *tk, unsigned long pfn, int flags)
 {
        struct task_struct *t = tk->tsk;
        short addr_lsb = tk->size_shift;
-       int ret;
+       int ret = 0;
 
-       pr_err("Memory failure: %#lx: Sending SIGBUS to %s:%d due to hardware memory corruption\n",
-               pfn, t->comm, t->pid);
+       if ((t->mm == current->mm) || !(flags & MF_ACTION_REQUIRED))
+               pr_err("Memory failure: %#lx: Sending SIGBUS to %s:%d due to hardware memory corruption\n",
+                       pfn, t->comm, t->pid);
 
-       if ((flags & MF_ACTION_REQUIRED) && t->mm == current->mm) {
-               ret = force_sig_mceerr(BUS_MCEERR_AR, (void __user *)tk->addr,
-                                      addr_lsb);
+       if (flags & MF_ACTION_REQUIRED) {
+               if (t->mm == current->mm)
+                       ret = force_sig_mceerr(BUS_MCEERR_AR,
+                                        (void __user *)tk->addr, addr_lsb);
+               /* send no signal to non-current processes */
        } else {
                /*
                 * Don't use force here, it's convenient if the signal