staging: lustre: obd: RCU stalls in lu_cache_shrink_count()
authorAnn Koehler <amk@cray.com>
Sun, 29 Jan 2017 00:04:39 +0000 (19:04 -0500)
committerGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Fri, 3 Feb 2017 12:01:37 +0000 (13:01 +0100)
commite202d55b05f29f7d17548167b84764bb951b5b90
treee88e36257590d47a15f8daa58ec3d59c2bb7c4be
parent65aaf99d5dc12a7807da3008e0cd9babc1cad265
staging: lustre: obd: RCU stalls in lu_cache_shrink_count()

The algorithm for counting freeable objects in the lu_cache shrinker
does not scale with the number of cpus. The LU_SS_LRU_LEN counter
for each cpu is read and summed at shrink time while holding the
lu_sites_guard mutex. With a large number of cpus and low memory
conditions, processes bottleneck on the mutex.

This mod reduces the time spent counting by using the kernel's percpu
counter functions to maintain the length of a site's lru. The summing
occurs when a percpu value is incremented or decremented and a
threshold is exceeded. lu_cache_shrink_count() simply returns the
last such computed sum.

This mod also replaces the lu_sites_guard mutex with a rw semaphore.
The lock protects the lu_site list, which is modified when a file
system is mounted/umounted or when the lu_site is purged.
lu_cache_shrink_count simply reads data so it does not need to wait
for other readers. lu_cache_shrink_scan, which actually frees the
unused objects, is still serialized.

Signed-off-by: Ann Koehler <amk@cray.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7997
Reviewed-on: http://review.whamcloud.com/19390
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
drivers/staging/lustre/lustre/include/lu_object.h
drivers/staging/lustre/lustre/obdclass/lu_object.c