KVM: Move VM's worker kthreads back to the original cgroup before exiting.
author Vipin Sharma <vipinsh@google.com>
Tue, 22 Feb 2022 05:48:48 +0000 (05:48 +0000)
committer Paolo Bonzini <pbonzini@redhat.com>
Fri, 25 Feb 2022 13:20:14 +0000 (08:20 -0500)
VM worker kthreads can linger in the VM process's cgroup for some time
after KVM terminates the VM process.

KVM terminates the worker kthreads by calling kthread_stop(), which waits
on the 'exited' completion, triggered by exit_mm(), via mm_release(), in
do_exit() during the kthread's exit.  However, these kthreads are
removed from the cgroup by cgroup_exit(), which runs after exit_mm().
Therefore, a VM process can terminate between the exit_mm() and
cgroup_exit() calls, leaving only worker kthreads in the cgroup.

Moving worker kthreads back to the original cgroup (kthreadd_task's
cgroup) before they exit ensures that the VM's cgroup is empty as soon
as the main VM process terminates.

Signed-off-by: Vipin Sharma <vipinsh@google.com>
Suggested-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220222054848.563321-1-vipinsh@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
virt/kvm/kvm_main.c

index 83c57bc..cdf1fa3 100644
@@ -5810,6 +5810,7 @@ static int kvm_vm_worker_thread(void *context)
         * we have to locally copy anything that is needed beyond initialization
         */
        struct kvm_vm_worker_thread_context *init_context = context;
+       struct task_struct *parent;
        struct kvm *kvm = init_context->kvm;
        kvm_vm_thread_fn_t thread_fn = init_context->thread_fn;
        uintptr_t data = init_context->data;
@@ -5836,7 +5837,7 @@ init_complete:
        init_context = NULL;
 
        if (err)
-               return err;
+               goto out;
 
        /* Wait to be woken up by the spawner before proceeding. */
        kthread_parkme();
@@ -5844,6 +5845,25 @@ init_complete:
        if (!kthread_should_stop())
                err = thread_fn(kvm, data);
 
+out:
+       /*
+        * Move kthread back to its original cgroup to prevent it lingering in
+        * the cgroup of the VM process, after the latter finishes its
+        * execution.
+        *
+        * kthread_stop() waits on the 'exited' completion condition which is
+        * set in exit_mm(), via mm_release(), in do_exit(). However, the
+        * kthread is removed from the cgroup in the cgroup_exit() which is
+        * called after the exit_mm(). This causes the kthread_stop() to return
+        * before the kthread actually quits the cgroup.
+        */
+       rcu_read_lock();
+       parent = rcu_dereference(current->real_parent);
+       get_task_struct(parent);
+       rcu_read_unlock();
+       cgroup_attach_task_all(parent, current);
+       put_task_struct(parent);
+
        return err;
 }
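
The ordering problem described in the commit message can be replayed in a small userspace model. This is only an illustrative sketch, not kernel code: every `toy_*` name and `vm_cgroup_tasks_after_vm_exit()` is hypothetical, a cgroup is reduced to a task counter, and the worst-case interleaving (the VM process exiting right after the 'exited' completion fires) is hard-coded rather than raced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a cgroup: only tracks how many tasks it contains. */
struct toy_cgroup {
	int nr_tasks;
};

/* Toy stand-in for a worker kthread. */
struct toy_kthread {
	struct toy_cgroup *cgrp; /* cgroup the thread currently belongs to */
	bool exited;             /* models the 'exited' completion */
};

/* Models cgroup_attach_task_all(): move the thread into dst. */
static void toy_attach(struct toy_kthread *t, struct toy_cgroup *dst)
{
	assert(t->cgrp != NULL);
	t->cgrp->nr_tasks--;
	dst->nr_tasks++;
	t->cgrp = dst;
}

/* Models exit_mm(): from here on, kthread_stop() may return. */
static void toy_exit_mm(struct toy_kthread *t)
{
	t->exited = true;
}

/* Models cgroup_exit(): the thread finally leaves its cgroup. */
static void toy_cgroup_exit(struct toy_kthread *t)
{
	assert(t->cgrp != NULL);
	t->cgrp->nr_tasks--;
	t->cgrp = NULL;
}

/*
 * Replays the worst-case interleaving and returns how many tasks remain in
 * the VM's cgroup right after the VM process has exited.
 */
int vm_cgroup_tasks_after_vm_exit(bool move_back_first)
{
	struct toy_cgroup kthreadd_cgrp = { .nr_tasks = 1 }; /* kthreadd */
	struct toy_cgroup vm_cgrp = { .nr_tasks = 2 };       /* VM process + worker */
	struct toy_kthread worker = { .cgrp = &vm_cgrp, .exited = false };
	int observed;

	if (move_back_first)
		toy_attach(&worker, &kthreadd_cgrp); /* the step this patch adds */

	toy_exit_mm(&worker);     /* kthread_stop() in the VM process returns */
	assert(worker.exited);
	vm_cgrp.nr_tasks--;       /* VM process exits before the worker continues */
	observed = vm_cgrp.nr_tasks;

	toy_cgroup_exit(&worker); /* worker leaves its cgroup only now */
	return observed;
}
```

In this model, `vm_cgroup_tasks_after_vm_exit(false)` returns 1 (the worker is still counted in the VM's cgroup after the VM process is gone), while `vm_cgroup_tasks_after_vm_exit(true)` returns 0 because the worker has already moved to the kthreadd cgroup before the completion is signalled.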