KVM: nVMX: avoid incorrect preemption timer vmexit in nested guest
The preemption timer for nested VMX is emulated by hrtimer which is started on L2
entry, stopped on L2 exit and evaluated via the check_nested_events hook. However,
nested_vmx_exit_handled is always returning true for preemption timer vmexit. Then,
the L1 preemption timer vmexit is captured and be treated as a L2 preemption
timer vmexit, causing NULL pointer dereferences or worse in the L1 guest's
vmexit handler:
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [< (null)>] (null)
PGD 0
Oops: 0010 [#1] SMP
Call Trace:
? kvm_lapic_expired_hv_timer+0x47/0x90 [kvm]
handle_preemption_timer+0xe/0x20 [kvm_intel]
vmx_handle_exit+0x169/0x15a0 [kvm_intel]
? kvm_arch_vcpu_ioctl_run+0xd5d/0x19d0 [kvm]
kvm_arch_vcpu_ioctl_run+0xdee/0x19d0 [kvm]
? kvm_arch_vcpu_ioctl_run+0xd5d/0x19d0 [kvm]
? vcpu_load+0x1c/0x60 [kvm]
? kvm_arch_vcpu_load+0x57/0x260 [kvm]
kvm_vcpu_ioctl+0x2d3/0x7c0 [kvm]
do_vfs_ioctl+0x96/0x6a0
? __fget_light+0x2a/0x90
SyS_ioctl+0x79/0x90
do_syscall_64+0x68/0x180
entry_SYSCALL64_slow_path+0x25/0x25
Code: Bad RIP value.
RIP [< (null)>] (null)
RSP <
ffff8800b5263c48>
CR2:
0000000000000000
---[ end trace
9c70c48b1a2bc66e ]---
This can be reproduced readily by preemption timer enabled on L0 and disabled
on L1.
Return false since preemption timer vmexits must never be reflected to L2.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Yunhong Jiang <yunhong.jiang@intel.com>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Haozhong Zhang <haozhong.zhang@intel.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>