x86/kvm: Fix broken irq restoration in kvm_wait
authorWanpeng Li <wanpengli@tencent.com>
Mon, 15 Mar 2021 06:55:28 +0000 (14:55 +0800)
committerPaolo Bonzini <pbonzini@redhat.com>
Thu, 18 Mar 2021 17:58:14 +0000 (13:58 -0400)
After commit 997acaf6b4b59c (lockdep: report broken irq restoration), the guest
splatting below during boot:

 raw_local_irq_restore() called with IRQs enabled
 WARNING: CPU: 1 PID: 169 at kernel/locking/irqflag-debug.c:10 warn_bogus_irq_restore+0x26/0x30
 Modules linked in: hid_generic usbhid hid
 CPU: 1 PID: 169 Comm: systemd-udevd Not tainted 5.11.0+ #25
 RIP: 0010:warn_bogus_irq_restore+0x26/0x30
 Call Trace:
  kvm_wait+0x76/0x90
  __pv_queued_spin_lock_slowpath+0x285/0x2e0
  do_raw_spin_lock+0xc9/0xd0
  _raw_spin_lock+0x59/0x70
  lockref_get_not_dead+0xf/0x50
  __legitimize_path+0x31/0x60
  legitimize_root+0x37/0x50
  try_to_unlazy_next+0x7f/0x1d0
  lookup_fast+0xb0/0x170
  path_openat+0x165/0x9b0
  do_filp_open+0x99/0x110
  do_sys_openat2+0x1f1/0x2e0
  do_sys_open+0x5c/0x80
  __x64_sys_open+0x21/0x30
  do_syscall_64+0x32/0x50
  entry_SYSCALL_64_after_hwframe+0x44/0xae

The new consistency checking,  expects local_irq_save() and
local_irq_restore() to be paired and sanely nested, and therefore expects
local_irq_restore() to be called with irqs disabled.
The irqflags handling in kvm_wait() which ends up doing:

local_irq_save(flags);
safe_halt();
local_irq_restore(flags);

instead triggers it.  This patch fixes it by using
local_irq_disable()/enable() directly.

Cc: Thomas Gleixner <tglx@linutronix.de>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1615791328-2735-1-git-send-email-wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
arch/x86/kernel/kvm.c

index 5e78e01..78bb0fa 100644 (file)
@@ -836,28 +836,25 @@ static void kvm_kick_cpu(int cpu)
 
 static void kvm_wait(u8 *ptr, u8 val)
 {
-       unsigned long flags;
-
        if (in_nmi())
                return;
 
-       local_irq_save(flags);
-
-       if (READ_ONCE(*ptr) != val)
-               goto out;
-
        /*
         * halt until it's our turn and kicked. Note that we do safe halt
         * for irq enabled case to avoid hang when lock info is overwritten
         * in irq spinlock slowpath and no spurious interrupt occur to save us.
         */
-       if (arch_irqs_disabled_flags(flags))
-               halt();
-       else
-               safe_halt();
+       if (irqs_disabled()) {
+               if (READ_ONCE(*ptr) == val)
+                       halt();
+       } else {
+               local_irq_disable();
 
-out:
-       local_irq_restore(flags);
+               if (READ_ONCE(*ptr) == val)
+                       safe_halt();
+
+               local_irq_enable();
+       }
 }
 
 #ifdef CONFIG_X86_32