sched: Fix balance_push() vs __sched_setscheduler()
author Peter Zijlstra <peterz@infradead.org>
Tue, 7 Jun 2022 20:41:55 +0000 (22:41 +0200)
committer Peter Zijlstra <peterz@infradead.org>
Mon, 13 Jun 2022 08:15:07 +0000 (10:15 +0200)
commit 04193d590b390ec7a0592630f46d559ec6564ba1
tree d95902fa794cb7a157387360af43a8a79a23759c
parent b13baccc3850ca8b8cccbf8ed9912dbaa0fdf7f3

The purpose of balance_push() is to act as a filter on task selection
in the case of CPU hotplug, specifically when taking the CPU out.

It does this by (ab)using the balance callback infrastructure, with
the express purpose of keeping all the unlikely/odd cases in a single
place.

In order to serve its purpose, the balance_push_callback needs to be
(exclusively) on the callback list at all times (noting that the
callback always places itself back on the list the moment it runs,
also noting that when the CPU goes down, regular balancing concerns
are moot, so ignoring them is fine).
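As a rough user-space illustration of the "always places itself back on
the list" property, the sketch below models a callback that re-arms
itself as its first action. All toy_* names are hypothetical stand-ins
and the structures are heavily simplified; this is not the kernel code.

```c
#include <assert.h>
#include <stddef.h>

struct toy_rq;

/* Hypothetical, simplified model of a balance callback list entry. */
struct toy_cb {
	struct toy_cb *next;
	void (*func)(struct toy_rq *rq);
};

/* Hypothetical, simplified model of a runqueue. */
struct toy_rq {
	struct toy_cb *balance_callback;	/* head of the callback list */
	int filtered;				/* number of times the filter ran */
};

static void toy_balance_push(struct toy_rq *rq);

static struct toy_cb toy_push_cb = { .next = NULL, .func = toy_balance_push };

/* The filter puts itself back on the list before doing anything else. */
static void toy_balance_push(struct toy_rq *rq)
{
	assert(rq != NULL);
	rq->balance_callback = &toy_push_cb;
	rq->filtered++;
}

/* Detach the whole list, then run every callback that was on it. */
static void toy_run_callbacks(struct toy_rq *rq)
{
	struct toy_cb *head = rq->balance_callback;

	rq->balance_callback = NULL;
	while (head) {
		struct toy_cb *next = head->next;

		head->next = NULL;
		head->func(rq);
		head = next;
	}
}
```

In this model, starting with toy_push_cb on the list and calling
toy_run_callbacks() any number of times leaves toy_push_cb on the list
after every call, which is the invariant described above.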

And herein lies the problem: __sched_setscheduler()'s use of
splice_balance_callbacks() takes the callbacks off the list across a
lock-break, making it possible for an interleaving __schedule() to
see an empty list and not get filtered.
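The window can be modelled in user space with a deterministic,
single-threaded sketch. The toy_* names are hypothetical, locking is
omitted, and toy_splice_keep_filter() is only one conceivable way to
close the window; it is an assumption for illustration and is not taken
from the patch itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified models of a callback entry and a runqueue. */
struct toy_cb {
	struct toy_cb *next;
};

struct toy_rq {
	struct toy_cb *balance_callback;	/* head of the callback list */
};

/* Stands in for the hotplug filter callback. */
static struct toy_cb toy_push_cb;

/* Models the problematic splice: the list is emptied unconditionally. */
static struct toy_cb *toy_splice_unconditional(struct toy_rq *rq)
{
	struct toy_cb *head = rq->balance_callback;

	assert(rq != NULL);
	rq->balance_callback = NULL;
	return head;
}

/* Assumed remedy: never take the filter callback off the list. */
static struct toy_cb *toy_splice_keep_filter(struct toy_rq *rq)
{
	struct toy_cb *head = rq->balance_callback;

	assert(rq != NULL);
	if (head == &toy_push_cb)
		return NULL;
	rq->balance_callback = NULL;
	return head;
}

/* What a scheduler pass interleaving with the lock-break would observe. */
static bool toy_schedule_is_filtered(const struct toy_rq *rq)
{
	return rq->balance_callback == &toy_push_cb;
}
```

With toy_push_cb installed, toy_splice_unconditional() leaves the list
empty, so toy_schedule_is_filtered() reports false until the callbacks
are run again. toy_splice_keep_filter() leaves the filter in place, so
an observer at the same point still sees it.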

Fixes: ae7927023243 ("sched: Optimize finish_lock_switch()")
Reported-by: Jing-Ting Wu <jing-ting.wu@mediatek.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Jing-Ting Wu <jing-ting.wu@mediatek.com>
Link: https://lkml.kernel.org/r/20220519134706.GH2578@worktop.programming.kicks-ass.net
kernel/sched/core.c
kernel/sched/sched.h