KVM: nVMX: Unconditionally purge queued/injected events on nested "exit"
author Sean Christopherson <seanjc@google.com>
Tue, 30 Aug 2022 23:15:48 +0000 (23:15 +0000)
committer Paolo Bonzini <pbonzini@redhat.com>
Mon, 26 Sep 2022 16:03:03 +0000 (12:03 -0400)
Drop pending exceptions and events queued for re-injection when leaving
nested guest mode, even if the "exit" is due to VM-Fail, SMI, or forced
by host userspace.  Failure to purge events could result in an event
belonging to L2 being injected into L1.

This _should_ never happen for VM-Fail as all events should be blocked by
nested_run_pending, but it's possible if KVM, not the L1 hypervisor, is
the source of VM-Fail when running vmcs02.

SMI is a nop (barring unknown bugs) as recognition of SMI and thus entry
to SMM is blocked by pending exceptions and re-injected events.

Forced exit is definitely buggy, but has likely gone unnoticed because
userspace probably follows the forced exit with KVM_SET_VCPU_EVENTS (or
some other ioctl() that purges the queue).
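
As a rough illustration of the failure mode described above (not KVM code),
the following toy model contrasts purging only on the "normal" exit path with
purging unconditionally.  All names (toy_vcpu, toy_exit_old(), etc.) are
hypothetical stand-ins, and the assumption that the old purge was skipped for
VM-Fail and forced exits is a simplification made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-in for per-vCPU event state. */
struct toy_vcpu {
	bool nmi_injected;
	bool exception_queued;
	bool interrupt_queued;
	bool saved_in_vmcs12;	/* event was recorded for L1 to inspect */
};

enum toy_exit_kind { TOY_EXIT_NORMAL, TOY_EXIT_VMFAIL, TOY_EXIT_FORCED };

static bool toy_has_queued_event(const struct toy_vcpu *v)
{
	return v->nmi_injected || v->exception_queued || v->interrupt_queued;
}

static void toy_purge_events(struct toy_vcpu *v)
{
	v->nmi_injected = false;
	v->exception_queued = false;
	v->interrupt_queued = false;
}

/* Models capturing L2's queued event before it is dropped. */
static void toy_save_to_vmcs12(struct toy_vcpu *v)
{
	v->saved_in_vmcs12 = toy_has_queued_event(v);
}

/* Old shape (assumed): the purge rides along with the state save. */
static void toy_exit_old(struct toy_vcpu *v, enum toy_exit_kind kind)
{
	if (kind == TOY_EXIT_NORMAL) {
		toy_save_to_vmcs12(v);
		toy_purge_events(v);
	}
}

/* New shape: save first when applicable, then purge on every exit. */
static void toy_exit_new(struct toy_vcpu *v, enum toy_exit_kind kind)
{
	if (kind == TOY_EXIT_NORMAL)
		toy_save_to_vmcs12(v);
	toy_purge_events(v);
	assert(!toy_has_queued_event(v));
}

/* Returns true if an event queued for L2 is still queued after the exit. */
static bool toy_event_leaks(bool use_new, enum toy_exit_kind kind)
{
	struct toy_vcpu v = { .exception_queued = true };

	if (use_new)
		toy_exit_new(&v, kind);
	else
		toy_exit_old(&v, kind);
	return toy_has_queued_event(&v);
}
```

In this model, toy_event_leaks(false, TOY_EXIT_FORCED) reports a leftover
event that the next run of L1 could consume, while the unconditional variant
leaves nothing queued for any exit kind.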

Fixes: 4f350c6dbcb9 ("kvm: nVMX: Handle deferred early VMLAUNCH/VMRESUME failure properly")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Link: https://lore.kernel.org/r/20220830231614.3580124-2-seanjc@google.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
arch/x86/kvm/vmx/nested.c

index 6c8b61a..e6218b7 100644
@@ -4301,14 +4301,6 @@ static void prepare_vmcs12(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
                        nested_vmx_abort(vcpu,
                                         VMX_ABORT_SAVE_GUEST_MSR_FAIL);
        }
-
-       /*
-        * Drop what we picked up for L2 via vmx_complete_interrupts. It is
-        * preserved above and would only end up incorrectly in L1.
-        */
-       vcpu->arch.nmi_injected = false;
-       kvm_clear_exception_queue(vcpu);
-       kvm_clear_interrupt_queue(vcpu);
 }
 
 /*
@@ -4648,6 +4640,17 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 vm_exit_reason,
                WARN_ON_ONCE(nested_early_check);
        }
 
+       /*
+        * Drop events/exceptions that were queued for re-injection to L2
+        * (picked up via vmx_complete_interrupts()), as well as exceptions
+        * that were pending for L2.  Note, this must NOT be hoisted above
+        * prepare_vmcs12(), events/exceptions queued for re-injection need to
+        * be captured in vmcs12 (see vmcs12_save_pending_event()).
+        */
+       vcpu->arch.nmi_injected = false;
+       kvm_clear_exception_queue(vcpu);
+       kvm_clear_interrupt_queue(vcpu);
+
        vmx_switch_vmcs(vcpu, &vmx->vmcs01);
 
        /* Update any VMCS fields that might have changed while L2 ran */