watchdog/hpwdt: Disable NMI in Crash Kernel
author Jerry Hoemann <jerry.hoemann@hpe.com>
Mon, 23 Nov 2020 02:08:39 +0000 (19:08 -0700)
committer Wim Van Sebroeck <wim@linux-watchdog.org>
Sun, 13 Dec 2020 15:17:42 +0000 (16:17 +0100)

NMIs received during the crash path are problematic as hpwdt_pretimeout
handling of the NMI would cause a reentry into kdump.

The situation is complicated in that I/O errors can be signaled as NMIs,
circumventing hpwdt_pretimeout's attempt to not claim NMIs not associated
with either the WDT or the iLO NMI switch.  These NMIs can additionally
cause a secondary NMI, which causes the system to hang.

By disabling pretimeout and kdumptimeout in the crash path we both reduce
the risk of receiving an NMI and simultaneously leave the WDT running
(if it was already in use) to allow the WDT to break the system out of
hangs by the WDT reset.

Signed-off-by: Jerry Hoemann <jerry.hoemann@hpe.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/1606097320-56762-2-git-send-email-jerry.hoemann@hpe.com
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@linux-watchdog.org>
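
The policy described in the message can be sketched in isolation as a small
userspace model. This is an illustration only, not the driver code: the
struct hpwdt_params type, the hpwdt_apply_kdump_policy() helper, and the
in_kdump_kernel flag are hypothetical names standing in for the driver's
module parameters and for the kernel's is_kdump_kernel() check.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two module parameters touched by the patch. */
struct hpwdt_params {
	bool pretimeout;	/* true: pretimeout NMI handling is enabled */
	int kdumptimeout;	/* crash-path timeout setting; 0 after the patch's reset */
};

/*
 * Model of the added hunk: when running as the crash (kdump) kernel,
 * clear both settings so the driver does not arm NMI-based handling.
 * in_kdump_kernel is an assumption standing in for is_kdump_kernel().
 * The watchdog timeout itself is deliberately left untouched.
 */
static void hpwdt_apply_kdump_policy(struct hpwdt_params *p, bool in_kdump_kernel)
{
	if (in_kdump_kernel) {
		p->pretimeout = false;
		p->kdumptimeout = 0;
	}
}

/* Small demonstration: the policy only changes anything in the crash kernel. */
static void hpwdt_policy_demo(void)
{
	struct hpwdt_params normal = { .pretimeout = true, .kdumptimeout = 15 };
	struct hpwdt_params crash = { .pretimeout = true, .kdumptimeout = 15 };

	hpwdt_apply_kdump_policy(&normal, false);
	hpwdt_apply_kdump_policy(&crash, true);

	assert(normal.pretimeout && normal.kdumptimeout == 15);
	assert(!crash.pretimeout && crash.kdumptimeout == 0);
}
```

In the actual patch the same two assignments are made directly on the
driver's globals inside hpwdt_init_one(), before the existing
timeout/pretimeout sanity check runs.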
drivers/watchdog/hpwdt.c

index 7d34bcf..eeb4df2 100644
@@ -21,6 +21,7 @@
 #include <linux/types.h>
 #include <linux/watchdog.h>
 #include <asm/nmi.h>
+#include <linux/crash_dump.h>
 
 #define HPWDT_VERSION                  "2.0.3"
 #define SECS_TO_TICKS(secs)            ((secs) * 1000 / 128)
@@ -334,6 +335,11 @@ static int hpwdt_init_one(struct pci_dev *dev,
        watchdog_set_nowayout(&hpwdt_dev, nowayout);
        watchdog_init_timeout(&hpwdt_dev, soft_margin, NULL);
 
+       if (is_kdump_kernel()) {
+               pretimeout = 0;
+               kdumptimeout = 0;
+       }
+
        if (pretimeout && hpwdt_dev.timeout <= PRETIMEOUT_SEC) {
                dev_warn(&dev->dev, "timeout <= pretimeout. Setting pretimeout to zero\n");
                pretimeout = 0;