KVM: MMU: Fix mmu_shrink() so that it can free mmu pages as intended
authorTakuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Mon, 20 Aug 2012 09:35:39 +0000 (18:35 +0900)
committerAvi Kivity <avi@redhat.com>
Wed, 22 Aug 2012 12:27:13 +0000 (15:27 +0300)
Although the possible race described in

  commit 85b7059169e128c57a3a8a3e588fb89cb2031da1
  KVM: MMU: fix shrinking page from the empty mmu

was correct, the real cause of that issue was a more trivial bug of
mmu_shrink() introduced by

  commit 1952639665e92481c34c34c3e2a71bf3e66ba362
  KVM: MMU: do not iterate over all VMs in mmu_shrink()

Here is the bug:

if (kvm->arch.n_used_mmu_pages > 0) {
if (!nr_to_scan--)
break;
continue;
}

We skip VMs whose n_used_mmu_pages is not zero and try to shrink others:
in other words we try to shrink empty ones by mistake.

This patch reverses the logic so that mmu_shrink() can free pages from
the first VM whose n_used_mmu_pages is not zero.  Note that we also add
comments explaining the role of nr_to_scan which is not practically
important now, hoping this will be improved in the future.

Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Cc: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
arch/x86/kvm/mmu.c

index 01ca004..7fbd0d2 100644 (file)
@@ -4113,16 +4113,21 @@ static int mmu_shrink(struct shrinker *shrink, struct shrink_control *sc)
                LIST_HEAD(invalid_list);
 
                /*
+                * Never scan more than sc->nr_to_scan VM instances.
+                * Will not hit this condition practically since we do not try
+                * to shrink more than one VM and it is very unlikely to see
+                * !n_used_mmu_pages so many times.
+                */
+               if (!nr_to_scan--)
+                       break;
+               /*
                 * n_used_mmu_pages is accessed without holding kvm->mmu_lock
                 * here. We may skip a VM instance errorneosly, but we do not
                 * want to shrink a VM that only started to populate its MMU
                 * anyway.
                 */
-               if (kvm->arch.n_used_mmu_pages > 0) {
-                       if (!nr_to_scan--)
-                               break;
+               if (!kvm->arch.n_used_mmu_pages)
                        continue;
-               }
 
                idx = srcu_read_lock(&kvm->srcu);
                spin_lock(&kvm->mmu_lock);