workqueue: Unbind kworkers before sending them to exit()
author Valentin Schneider <vschneid@redhat.com>
Thu, 12 Jan 2023 16:14:31 +0000 (16:14 +0000)
committer Tejun Heo <tj@kernel.org>
Thu, 12 Jan 2023 16:21:49 +0000 (06:21 -1000)
commit e02b93124855cd34b78e61ae44846c8cb5fddfc3
tree bbee233a318cf815b387fa5a95b05ffca0a29de8
parent 9ab03be42b8f9136dcc01a90ecc9ac71bc6149ef

It has been reported that isolated CPUs can suffer from interference due to
per-CPU kworkers waking up just to die.

A surge of workqueue activity during initial setup of a latency-sensitive
application (refresh_vm_stats() being one of the culprits) can cause extra
per-CPU kworkers to be spawned. Then, said latency-sensitive task can be
running merrily on an isolated CPU only to be interrupted sometime later by
a kworker marked for death (cf. IDLE_WORKER_TIMEOUT, 5 minutes after last
kworker activity).

Prevent this by affining kworkers to the wq_unbound_cpumask (which doesn't
contain isolated CPUs, cf. HK_TYPE_WQ) before waking them up after marking
them with WORKER_DIE.
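
The ordering above (mark, re-affine, then wake) can be illustrated with a
userspace toy model. This is only a sketch and not the kernel code: CPU sets
are modelled as plain 64-bit masks, and every toy_* name is hypothetical,
introduced here for illustration.

```c
/* Userspace toy model, NOT kernel code: CPU sets are plain 64-bit masks. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's WORKER_DIE flag. */
#define TOY_WORKER_DIE (1u << 0)

struct toy_worker {
	unsigned int flags;
	uint64_t allowed_cpus;	/* CPUs this worker may run on */
	bool woken;
};

/* Housekeeping mask: every possible CPU except the isolated ones. */
static uint64_t toy_unbound_mask(uint64_t possible, uint64_t isolated)
{
	return possible & ~isolated;
}

/* Mark the worker for death, move it off isolated CPUs, then wake it. */
static void toy_reap_worker(struct toy_worker *w, uint64_t unbound_mask)
{
	assert(unbound_mask != 0);	/* need at least one housekeeping CPU */
	w->flags |= TOY_WORKER_DIE;
	w->allowed_cpus = unbound_mask;	/* re-affine BEFORE the wakeup */
	w->woken = true;
}

/* Would this worker's wakeup be allowed to land on an isolated CPU? */
static bool toy_disturbs_isolated(const struct toy_worker *w, uint64_t isolated)
{
	return w->woken && (w->allowed_cpus & isolated) != 0;
}
```

In this model, a worker originally bound to an isolated CPU no longer
overlaps the isolated set once toy_reap_worker() has run, so its final
wakeup cannot interrupt the latency-sensitive task.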

Changing the affinity requires a sleepable context; leverage the newly
introduced pool->idle_cull_work to get that.

Remove dying workers from pool->workers and keep track of them in a
separate list. This intentionally prevents for_each_pool_worker() from
iterating over workers that are marked for death.
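
The effect of the separate list can be sketched in userspace with a minimal
intrusive doubly-linked list. This is an illustration under assumptions, not
the kernel's list.h or the actual patch; all toy_* names are hypothetical.

```c
/* Userspace toy model, NOT kernel code: a tiny circular intrusive list. */
#include <stdbool.h>
#include <stddef.h>

struct toy_node {
	struct toy_node *prev, *next;
};

struct toy_kworker {
	struct toy_node node;	/* links the worker into one of the pool lists */
	bool dying;
};

struct toy_pool {
	struct toy_node workers;	/* live workers, visible to iteration */
	struct toy_node dying;		/* workers marked for death */
};

static void toy_list_init(struct toy_node *head)
{
	head->prev = head->next = head;
}

static void toy_list_add_tail(struct toy_node *n, struct toy_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

static void toy_list_del(struct toy_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = n->next = n;
}

/* Count entries by walking the list, as a pool-wide iterator would. */
static size_t toy_list_count(const struct toy_node *head)
{
	size_t count = 0;

	for (const struct toy_node *p = head->next; p != head; p = p->next)
		count++;
	return count;
}

/* Mark a worker as dying and move it from pool->workers to pool->dying. */
static void toy_mark_dying(struct toy_kworker *w, struct toy_pool *pool)
{
	w->dying = true;
	toy_list_del(&w->node);
	toy_list_add_tail(&w->node, &pool->dying);
}
```

Once a worker has been moved, any walk over the pool's workers list simply
never encounters it, which is the property the commit message relies on.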

Rename destroy_worker() to set_worker_dying() to better reflect its
effects and relationship with wake_dying_workers().

Signed-off-by: Valentin Schneider <vschneid@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
kernel/workqueue.c