percpu: optimize locking in pcpu_balance_workfn()
author Roman Gushchin <guro@fb.com>
Thu, 17 Jun 2021 19:03:22 +0000 (12:03 -0700)
committer Dennis Zhou <dennis@kernel.org>
Thu, 17 Jun 2021 23:05:24 +0000 (23:05 +0000)
commit e4d777003a43feab2e000749163e531f6c48c385
tree ba3ec20752ce5ecc7e4e5185e466dea3e0f83e8d
parent 4829c791b22f98f95339248a428caf08b5f1e3e5

pcpu_balance_workfn() unconditionally calls pcpu_balance_free(),
pcpu_reclaim_populated(), pcpu_balance_populated() and
pcpu_balance_free() again.

Each call to pcpu_balance_free() and pcpu_reclaim_populated() acquires
pcpu_lock at least once. So even if the balancing was scheduled only
because of a failed atomic allocation, pcpu_lock is acquired at least
4 times, which increases the contention on pcpu_lock.

To optimize the scheme, let's take pcpu_lock at the upper level
(in pcpu_balance_workfn()) and hold it for the whole duration of the
scheduled work, releasing it only when needed to perform slow
operations such as chunk (de)population and creation of new chunks.
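
The locking pattern described above can be sketched in userspace as
follows. This is only an illustration, not the code in mm/percpu.c:
it uses a pthread mutex instead of the kernel spinlock, and every
demo_* name is hypothetical. The top-level function takes the lock
once, each step runs with the lock held, and a step drops the lock
only around slow work.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins: none of these names exist in mm/percpu.c. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static bool demo_lock_held;	/* single-threaded bookkeeping for the demo */

static void demo_lock_acquire(int *acquisitions)
{
	pthread_mutex_lock(&demo_lock);
	demo_lock_held = true;
	(*acquisitions)++;
}

static void demo_lock_release(void)
{
	demo_lock_held = false;
	pthread_mutex_unlock(&demo_lock);
}

/* Stand-in for slow work (e.g. populating a chunk); must run unlocked. */
static void demo_slow_operation(void)
{
	assert(!demo_lock_held);
}

/*
 * One balancing step. Called with demo_lock held and returns with it
 * held. The lock is dropped only if slow work is actually required.
 */
static void demo_step(bool needs_slow_work, int *acquisitions)
{
	assert(demo_lock_held);
	if (!needs_slow_work)
		return;

	demo_lock_release();
	demo_slow_operation();
	demo_lock_acquire(acquisitions);
}

/*
 * Top-level work function: takes the lock once, runs all steps under
 * it, and returns how many times the lock was acquired in this pass.
 */
static int demo_balance_work(bool needs_slow_work)
{
	int acquisitions = 0;

	demo_lock_acquire(&acquisitions);
	demo_step(false, &acquisitions);		/* fast step */
	demo_step(needs_slow_work, &acquisitions);	/* may drop the lock */
	demo_step(false, &acquisitions);		/* fast step */
	demo_lock_release();

	return acquisitions;
}
```

In this sketch, a pass with no slow work acquires the lock once, and
each step that has to perform slow work adds one re-acquisition.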

Signed-off-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Dennis Zhou <dennis@kernel.org>
mm/percpu.c