KVM: nVMX: Document priority of all known events on Intel CPUs
author Sean Christopherson <seanjc@google.com>
Tue, 30 Aug 2022 23:16:07 +0000 (23:16 +0000)
committer Paolo Bonzini <pbonzini@redhat.com>
Mon, 26 Sep 2022 16:03:10 +0000 (12:03 -0400)
Add a gigantic comment above vmx_check_nested_events() to document the
priorities of all known events on Intel CPUs.  Intel's SDM doesn't
include VMX-specific events in its "Priority Among Concurrent Events",
which makes it painfully difficult to suss out the correct priority
between things like Monitor Trap Flag VM-Exits and pending #DBs.

Kudos to Jim Mattson for doing the hard work of collecting and
interpreting the priorities from various locations throughout the SDM
(because putting them all in one place in the SDM would be too easy).

Cc: Jim Mattson <jmattson@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Link: https://lore.kernel.org/r/20220830231614.3580124-21-seanjc@google.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
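
The ordering documented in the comment below can be illustrated with a small
standalone sketch. It is not taken from nested.c; the enum, its constant names,
and highest_priority_event() are hypothetical and exist only for this example.
It assumes pending events are tracked as a bitmask and covers only the event
classes down to maskable interrupts (priorities 1 through 6, including the
VMX-specific slots spliced in between).

```c
/*
 * Hypothetical event classes, declared from highest to lowest priority so
 * that a smaller enum value always means "is handled first".  The numbers
 * in the comments are the priorities used in the documented table.
 */
enum event_class {
	EV_RESET_MCE,		/* 1   RESET, Machine Check */
	EV_TASK_SWITCH_TRAP,	/* 2   T flag in TSS */
	EV_HW_INTERVENTION,	/* 3   FLUSH, STOPCLK, SMI, INIT */
	EV_MTF,			/* 3.5 Monitor Trap Flag VM-exit */
	EV_TRAP_PREV_INSN,	/* 4   trap-class #DB, breakpoints */
	EV_PREEMPT_TIMER,	/* 4.3 VMX-preemption timer VM-exit */
	EV_NMI_WINDOW,		/* 4.6 NMI-window exiting VM-exit */
	EV_NMI,			/* 5   NMI */
	EV_INTR_WINDOW,		/* 5.5 interrupt-window exit, virtual-interrupt delivery */
	EV_INTR,		/* 6   maskable hardware interrupt */
	EV_NR			/* number of classes, also "nothing pending" */
};

/*
 * Return the highest-priority class whose bit is set in @pending, or EV_NR
 * if no bit is set.  Walking the enum in declaration order is what encodes
 * the priority.
 */
enum event_class highest_priority_event(unsigned int pending)
{
	int ev;

	for (ev = 0; ev < EV_NR; ev++) {
		if (pending & (1u << ev))
			return (enum event_class)ev;
	}
	return EV_NR;
}
```

With both a trap-class #DB and an MTF VM-exit pending, i.e.
(1u << EV_TRAP_PREV_INSN) | (1u << EV_MTF), the helper returns EV_MTF, which
matches the statement that MTF VM exits take priority over debug-trap
exceptions.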
arch/x86/kvm/vmx/nested.c

index e773f3d8e1880614d4d7688ad7982012fedfd2aa..e0a93d974829ff10d2cf641050d0c2811a6257b8 100644
@@ -3954,6 +3954,89 @@ static bool nested_vmx_preemption_timer_pending(struct kvm_vcpu *vcpu)
               to_vmx(vcpu)->nested.preemption_timer_expired;
 }
 
+/*
+ * Per the Intel SDM's table "Priority Among Concurrent Events", with minor
+ * edits to fill in missing examples, e.g. #DB due to split-lock accesses,
+ * and less minor edits to splice in the priority of VMX Non-Root specific
+ * events, e.g. MTF and NMI/INTR-window exiting.
+ *
+ * 1 Hardware Reset and Machine Checks
+ *     - RESET
+ *     - Machine Check
+ *
+ * 2 Trap on Task Switch
+ *     - T flag in TSS is set (on task switch)
+ *
+ * 3 External Hardware Interventions
+ *     - FLUSH
+ *     - STOPCLK
+ *     - SMI
+ *     - INIT
+ *
+ * 3.5 Monitor Trap Flag (MTF) VM-exit[1]
+ *
+ * 4 Traps on Previous Instruction
+ *     - Breakpoints
+ *     - Trap-class Debug Exceptions (#DB due to TF flag set, data/I-O
+ *       breakpoint, or #DB due to a split-lock access)
+ *
+ * 4.3 VMX-preemption timer expired VM-exit[2]
+ *
+ * 4.6 NMI-window exiting VM-exit[3]
+ *
+ * 5 Nonmaskable Interrupts (NMI)
+ *
+ * 5.5 Interrupt-window exiting VM-exit and Virtual-interrupt delivery[4]
+ *
+ * 6 Maskable Hardware Interrupts
+ *
+ * 7 Code Breakpoint Fault
+ *
+ * 8 Faults from Fetching Next Instruction
+ *     - Code-Segment Limit Violation
+ *     - Code Page Fault
+ *     - Control protection exception (missing ENDBRANCH at target of indirect
+ *                                     call or jump)
+ *
+ * 9 Faults from Decoding Next Instruction
+ *     - Instruction length > 15 bytes
+ *     - Invalid Opcode
+ *     - Coprocessor Not Available
+ *
+ *10 Faults on Executing Instruction
+ *     - Overflow
+ *     - Bound error
+ *     - Invalid TSS
+ *     - Segment Not Present
+ *     - Stack fault
+ *     - General Protection
+ *     - Data Page Fault
+ *     - Alignment Check
+ *     - x86 FPU Floating-point exception
+ *     - SIMD floating-point exception
+ *     - Virtualization exception
+ *     - Control protection exception
+ *
+ * [1] Per the "Monitor Trap Flag" section: System-management interrupts (SMIs),
+ *     INIT signals, and higher priority events take priority over MTF VM exits.
+ *     MTF VM exits take priority over debug-trap exceptions and lower priority
+ *     events.
+ *
+ * [2] Debug-trap exceptions and higher priority events take priority over VM exits
+ *     caused by the VMX-preemption timer.  VM exits caused by the VMX-preemption
+ *     timer take priority over VM exits caused by the "NMI-window exiting"
+ *     VM-execution control and lower priority events.
+ *
+ * [3] Debug-trap exceptions and higher priority events take priority over VM exits
+ *     caused by "NMI-window exiting".  VM exits caused by this control take
+ *     priority over non-maskable interrupts (NMIs) and lower priority events.
+ *
+ * [4] Virtual-interrupt delivery has the same priority as that of VM exits due to
+ *     the 1-setting of the "interrupt-window exiting" VM-execution control.  Thus,
+ *     non-maskable interrupts (NMIs) and higher priority events take priority over
+ *     delivery of a virtual interrupt; delivery of a virtual interrupt takes
+ *     priority over external interrupts and lower priority events.
+ */
 static int vmx_check_nested_events(struct kvm_vcpu *vcpu)
 {
        struct kvm_lapic *apic = vcpu->arch.apic;