sched/vtime: Fix guest/system mis-accounting on task switch
authorFrederic Weisbecker <frederic@kernel.org>
Wed, 25 Sep 2019 21:42:42 +0000 (23:42 +0200)
committerGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Wed, 6 Nov 2019 12:06:01 +0000 (13:06 +0100)
commit2cd003a820fea172506618160e1584a196601ebc
tree2bc8732673e738ca4487ae61912fef7fe78c747d
parent58d33d4a4a1dad8579fe0d8fc1cfaad48f168579
sched/vtime: Fix guest/system mis-accounting on task switch

[ Upstream commit 68e7a4d66b0ce04bf18ff2ffded5596ab3618585 ]

vtime_account_system() assumes that the target task to account cputime
to is always the current task. This is most often true indeed except on
task switch where we call:

vtime_common_task_switch(prev)
vtime_account_system(prev)

Here prev is the scheduling-out task where we account the cputime to. It
doesn't match current that is already the scheduling-in task at this
stage of the context switch.

So we end up checking the wrong task flags to determine if we are
accounting guest or system time to the previous task.

As a result the wrong task is used to check if the target is running in
guest mode. We may then spuriously account or leak either system or
guest time on task switch.

Fix this assumption and also turn vtime_guest_enter/exit() to use the
task passed in parameter as well to avoid future similar issues.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wanpeng Li <wanpengli@tencent.com>
Fixes: 2a42eb9594a1 ("sched/cputime: Accumulate vtime on top of nsec clocksource")
Link: https://lkml.kernel.org/r/20190925214242.21873-1-frederic@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
kernel/sched/cputime.c