squashfs: Make use of local lock in multi_cpu decompressor
author Julia Cartwright <julia@ni.com>
Wed, 27 May 2020 20:11:16 +0000 (22:11 +0200)
committer Ingo Molnar <mingo@kernel.org>
Thu, 28 May 2020 08:31:10 +0000 (10:31 +0200)
commit fd56200a16c72c7c3ec3e54e06160dfaa5b8dee8
tree b3454028143415c9e2f9da1ca77d11f4cfe23dbb
parent b01b2141999936ac3e4746b7f76c0f204ae4b445

The squashfs multi-CPU decompressor uses get_cpu_ptr() to acquire a
pointer to per-CPU data. get_cpu_ptr() implicitly disables preemption,
which serializes access to the per-CPU data.

But decompression can take quite some time depending on the size. The
observed preempt-disabled times in real-world scenarios went up to 8ms,
causing massive wakeup latencies. This happens on all CPUs because the
decompression is fully parallelized.

Replace the implicit preemption control with an explicit local lock.
This allows RT kernels to substitute it with a real per-CPU lock, which
serializes the access but keeps the code section preemptible. On non-RT
kernels this maps to preempt_disable() as before, i.e. no functional
change.
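
To illustrate the RT-side behaviour described above, the following is a
minimal user-space sketch, not kernel code and not the patch itself. It
assumes a fixed array of slots standing in for per-CPU data and swaps in
a pthread mutex for the per-CPU lock; all names (NR_SLOTS, stream_slot,
slots_init, slot_copy) are hypothetical. The point it shows is that the
lock lives next to the data it protects, so access to one slot is
serialized while the holder remains free to be scheduled out.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define NR_SLOTS 4 /* hypothetical stand-in for the number of CPUs */

/* Hypothetical stand-in for one per-CPU decompressor stream. */
struct stream_slot {
	pthread_mutex_t lock;      /* plays the role of the per-CPU lock */
	unsigned char scratch[64]; /* per-slot working buffer */
	unsigned long uses;        /* how often this slot was used */
};

static struct stream_slot slots[NR_SLOTS];

/* Initialize every slot, including its lock, before first use. */
static void slots_init(void)
{
	for (size_t i = 0; i < NR_SLOTS; i++) {
		pthread_mutex_init(&slots[i].lock, NULL);
		memset(slots[i].scratch, 0, sizeof(slots[i].scratch));
		slots[i].uses = 0;
	}
}

/*
 * Stand-in for a decompression step: copy the input through the slot's
 * scratch buffer while holding that slot's lock. The lock serializes
 * access to this one slot only; other slots stay usable in parallel.
 * Returns the number of bytes written to out (capped at the scratch size).
 */
static size_t slot_copy(size_t slot, const unsigned char *in, size_t len,
			unsigned char *out)
{
	struct stream_slot *s;

	assert(slot < NR_SLOTS);
	s = &slots[slot];
	if (len > sizeof(s->scratch))
		len = sizeof(s->scratch);

	pthread_mutex_lock(&s->lock);   /* explicit lock, holder may be scheduled out */
	memcpy(s->scratch, in, len);
	memcpy(out, s->scratch, len);
	s->uses++;
	pthread_mutex_unlock(&s->lock);

	return len;
}
```

A caller would run slots_init() once and then call slot_copy() with the
index of the slot it is working on. In the kernel the slot is selected
implicitly by the CPU the task runs on, which this sketch does not model.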

[ bigeasy: Use local_lock(), patch description]

Reported-by: Alexander Stein <alexander.stein@systec-electronic.com>
Signed-off-by: Julia Cartwright <julia@ni.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Alexander Stein <alexander.stein@systec-electronic.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20200527201119.1692513-5-bigeasy@linutronix.de
fs/squashfs/decompressor_multi_percpu.c