KVM: SVM: Periodically schedule when unregistering regions on destroy
author David Rientjes <rientjes@google.com>
Tue, 25 Aug 2020 19:56:28 +0000 (12:56 -0700)
committer Paolo Bonzini <pbonzini@redhat.com>
Fri, 11 Sep 2020 17:24:15 +0000 (13:24 -0400)
There may be many encrypted regions that need to be unregistered when a
SEV VM is destroyed.  This can lead to soft lockups.  For example, on a
host running 4.15:

watchdog: BUG: soft lockup - CPU#206 stuck for 11s! [t_virtual_machi:194348]
CPU: 206 PID: 194348 Comm: t_virtual_machi
RIP: 0010:free_unref_page_list+0x105/0x170
...
Call Trace:
 [<0>] release_pages+0x159/0x3d0
 [<0>] sev_unpin_memory+0x2c/0x50 [kvm_amd]
 [<0>] __unregister_enc_region_locked+0x2f/0x70 [kvm_amd]
 [<0>] svm_vm_destroy+0xa9/0x200 [kvm_amd]
 [<0>] kvm_arch_destroy_vm+0x47/0x200
 [<0>] kvm_put_kvm+0x1a8/0x2f0
 [<0>] kvm_vm_release+0x25/0x30
 [<0>] do_exit+0x335/0xc10
 [<0>] do_group_exit+0x3f/0xa0
 [<0>] get_signal+0x1bc/0x670
 [<0>] do_signal+0x31/0x130

Although the CLFLUSH is no longer issued on every encrypted region to be
unregistered, there are no other changes that can prevent soft lockups for
very large SEV VMs in the latest kernel.

Periodically schedule if necessary.  This still holds kvm->lock across the
resched, but since this only happens when the VM is destroyed this is
assumed to be acceptable.

Signed-off-by: David Rientjes <rientjes@google.com>
Message-Id: <alpine.DEB.2.23.453.2008251255240.2987727@chino.kir.corp.google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
arch/x86/kvm/svm/sev.c

index 402dc4234e397861daef0be131a61118c8f2681a..7bf7bf734979488ac0f1cf10808f32ce3be882a9 100644
@@ -1106,6 +1106,7 @@ void sev_vm_destroy(struct kvm *kvm)
                list_for_each_safe(pos, q, head) {
                        __unregister_enc_region_locked(kvm,
                                list_entry(pos, struct enc_region, list));
+                       cond_resched();
                }
        }
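
For readers outside the kernel, the pattern in this hunk can be approximated in user space. The following is only a sketch under stated assumptions, not the kernel code: `struct region`, `build_regions()`, and `release_regions()` are hypothetical names invented for illustration, and POSIX `sched_yield()` stands in for `cond_resched()`. The analogy is loose, since `sched_yield()` always yields while `cond_resched()` reschedules only when a reschedule is pending.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a pinned region; not the kernel's struct enc_region. */
struct region {
    struct region *next;
    void *pages;
};

/* Free a whole list without yielding; used only for error cleanup. */
static void free_all(struct region *head)
{
    while (head) {
        struct region *next = head->next;
        free(head->pages);
        free(head);
        head = next;
    }
}

/* Build a list of n regions, each owning a small buffer. Returns NULL on failure or n == 0. */
static struct region *build_regions(size_t n)
{
    struct region *head = NULL;

    for (size_t i = 0; i < n; i++) {
        struct region *r = malloc(sizeof(*r));
        if (!r) {
            free_all(head);
            return NULL;
        }
        r->pages = malloc(64);
        if (!r->pages) {
            free(r);
            free_all(head);
            return NULL;
        }
        r->next = head;
        head = r;
    }
    return head;
}

/*
 * Release every region, giving the scheduler a chance to run other work
 * after each one. Returns the number of regions released and adds the
 * number of yield points reached to *yields.
 */
static size_t release_regions(struct region *head, size_t *yields)
{
    size_t released = 0;

    assert(yields != NULL);
    while (head) {
        /* Save the successor before freeing, as list_for_each_safe does. */
        struct region *next = head->next;

        free(head->pages);
        free(head);
        released++;

        /* User-space analogue of cond_resched(); assumption, not kernel API. */
        sched_yield();
        (*yields)++;

        head = next;
    }
    return released;
}
```

A caller would build a list with `build_regions(n)` and pass it to `release_regions()` together with a counter. As in the patch, the yield sits inside the loop body, so the longest uninterrupted stretch of work is bounded by the cost of releasing one region rather than the whole list.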