powerpc/kvm: Fix lockups when running KVM guests on Power8
authorMichael Ellerman <mpe@ellerman.id.au>
Thu, 19 Apr 2018 06:22:20 +0000 (16:22 +1000)
committerMichael Ellerman <mpe@ellerman.id.au>
Thu, 19 Apr 2018 06:22:20 +0000 (16:22 +1000)
When running KVM guests on Power8 we can see a lockup where one CPU
stops responding. This often leads to a message such as:

  watchdog: CPU 136 detected hard LOCKUP on other CPUs 72
  Task dump for CPU 72:
  qemu-system-ppc R  running task    10560 20917  20908 0x00040004

And then backtraces on other CPUs, such as:

  Task dump for CPU 48:
  ksmd            R  running task    10032  1519      2 0x00000804
  Call Trace:
    ...
    --- interrupt: 901 at smp_call_function_many+0x3c8/0x460
        LR = smp_call_function_many+0x37c/0x460
    pmdp_invalidate+0x100/0x1b0
    __split_huge_pmd+0x52c/0xdb0
    try_to_unmap_one+0x764/0x8b0
    rmap_walk_anon+0x15c/0x370
    try_to_unmap+0xb4/0x170
    split_huge_page_to_list+0x148/0xa30
    try_to_merge_one_page+0xc8/0x990
    try_to_merge_with_ksm_page+0x74/0xf0
    ksm_scan_thread+0x10ec/0x1ac0
    kthread+0x160/0x1a0
    ret_from_kernel_thread+0x5c/0x78

This is caused by commit 8c1c7fb0b5ec ("powerpc/64s/idle: avoid sync
for KVM state when waking from idle"), which added a check in
pnv_powersave_wakeup() to see if the kvm_hstate.hwthread_state is
already set to KVM_HWTHREAD_IN_KERNEL, and if so to skip the store and
test of kvm_hstate.hwthread_req.

The problem is that the primary does not set KVM_HWTHREAD_IN_KVM when
entering the guest, so it can then come out to cede with
KVM_HWTHREAD_IN_KERNEL set. It can then go idle in kvm_do_nap after
setting hwthread_req to 1, but because hwthread_state is still
KVM_HWTHREAD_IN_KERNEL we will skip the test of hwthread_req when we
wake up from idle and won't go to kvm_start_guest. From there the
thread will return somewhere garbage and crash.

Fix it by skipping the store of hwthread_state, but not the test of
hwthread_req, when coming out of idle. It's OK to skip the sync in
that case because hwthread_req will have been set on the same thread,
so there is no synchronisation required.

Fixes: 8c1c7fb0b5ec ("powerpc/64s/idle: avoid sync for KVM state when waking from idle")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
arch/powerpc/kernel/idle_book3s.S

index 79d0054..e734f6e 100644 (file)
@@ -553,12 +553,12 @@ ALT_FTR_SECTION_END_IFSET(CPU_FTR_ARCH_300)
 #ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
        lbz     r0,HSTATE_HWTHREAD_STATE(r13)
        cmpwi   r0,KVM_HWTHREAD_IN_KERNEL
-       beq     1f
+       beq     0f
        li      r0,KVM_HWTHREAD_IN_KERNEL
        stb     r0,HSTATE_HWTHREAD_STATE(r13)
        /* Order setting hwthread_state vs. testing hwthread_req */
        sync
-       lbz     r0,HSTATE_HWTHREAD_REQ(r13)
+0:     lbz     r0,HSTATE_HWTHREAD_REQ(r13)
        cmpwi   r0,0
        beq     1f
        b       kvm_start_guest