timers/nohz: Only ever update sleeptime from idle exit
author Frederic Weisbecker <frederic@kernel.org>
Wed, 22 Feb 2023 14:46:43 +0000 (15:46 +0100)
committer Thomas Gleixner <tglx@linutronix.de>
Tue, 18 Apr 2023 14:35:12 +0000 (16:35 +0200)
commit 07b65a800b6d5b6afbd6a91487b47038eac97c21
tree 149edef721496e1990ba56b708255b9b325f418f
parent 605da849d5982dee0527edb2488b79795f31a150

The idle and IO sleeptime statistics appearing in /proc/stat can
currently be updated from two sites: locally on idle exit and remotely
by cpufreq. However, there is no synchronization mechanism protecting
concurrent updates. It is therefore possible to account the sleeptime
twice, among other broken scenarios.
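
The double accounting described above can be illustrated with a small
userspace sketch. It is not the kernel code: struct idle_acct,
sample_delta(), commit_delta() and replay_double_account() are
hypothetical names, and the update is assumed to be a two-step
"sample the delta, then write it back" sequence. The replay runs one bad
interleaving of two unsynchronized writers sequentially, so the outcome
is deterministic.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the idle accounting fields. */
struct idle_acct {
	uint64_t entrytime;	/* start of the not-yet-accounted period, ns */
	uint64_t sleeptime;	/* accumulated sleeptime, ns */
};

/* Step 1 of an update: sample the pending delta. */
static uint64_t sample_delta(const struct idle_acct *a, uint64_t now)
{
	assert(now >= a->entrytime);
	return now - a->entrytime;
}

/* Step 2 of an update: write the delta back and restart the period. */
static void commit_delta(struct idle_acct *a, uint64_t delta, uint64_t now)
{
	a->sleeptime += delta;
	a->entrytime = now;
}

/*
 * Replay one bad interleaving: both writers sample the same pending
 * delta before either of them has written it back.
 */
static uint64_t replay_double_account(void)
{
	struct idle_acct a = { .entrytime = 100, .sleeptime = 0 };
	uint64_t local = sample_delta(&a, 200);		/* idle exit path */
	uint64_t remote = sample_delta(&a, 200);	/* remote updater */

	commit_delta(&a, local, 200);
	commit_delta(&a, remote, 200);

	return a.sleeptime;
}
```

In this replay only 100 ns elapse between entry and the update, yet the
accumulated sleeptime ends up at 200 ns because the same delta is added
by both writers.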

To avoid corrupting the sleeptime accounting, restrict the
sleeptime updates to the local idle exit site. If there is a delta to
add since the last update, IO/idle sleeptime readers will now only
compute the delta without writing it back to the internal idle
statistic fields.
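
A minimal userspace sketch of this scheme follows. It is only an
illustration under assumptions, not the patch itself: struct sleep_stats
and the idle_enter(), idle_exit() and read_idle_time() helpers are
hypothetical names, time is a plain nanosecond counter, and the sketch
is single-threaded, so it says nothing about the remaining reader vs.
writer races.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the per-CPU idle statistics. */
struct sleep_stats {
	bool idle_active;		/* CPU is currently idle */
	uint64_t idle_entrytime;	/* timestamp of idle entry, ns */
	uint64_t idle_sleeptime;	/* accumulated idle time, ns */
};

static void idle_enter(struct sleep_stats *st, uint64_t now)
{
	st->idle_entrytime = now;
	st->idle_active = true;
}

/* The only site that writes the accumulated sleeptime. */
static void idle_exit(struct sleep_stats *st, uint64_t now)
{
	assert(st->idle_active);
	st->idle_sleeptime += now - st->idle_entrytime;
	st->idle_active = false;
}

/*
 * Reader: adds the pending delta to its return value only. The const
 * pointer makes it explicit that nothing is written back.
 */
static uint64_t read_idle_time(const struct sleep_stats *st, uint64_t now)
{
	uint64_t total = st->idle_sleeptime;

	if (st->idle_active && now > st->idle_entrytime)
		total += now - st->idle_entrytime;

	return total;
}
```

Because the reader leaves the stored fields untouched, calling it
repeatedly while the CPU is idle cannot add the same delta twice; the
stored total only changes when idle_exit() runs.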

This fixes a writer vs. writer race. Note that there are still two known
reader vs. writer races to handle. A subsequent patch will fix one.

Reported-by: Yu Liao <liaoyu15@huawei.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230222144649.624380-3-frederic@kernel.org
kernel/time/tick-sched.c