BACKPORT: mm: Multi-gen LRU: remove wait_event_killable()
author Kalesh Singh <kaleshsingh@google.com>
Thu, 13 Apr 2023 21:43:26 +0000 (14:43 -0700)
committer Marek Szyprowski <m.szyprowski@samsung.com>
Wed, 17 Jan 2024 17:15:54 +0000 (18:15 +0100)
commit 3f3e1a95461f51b61d6c394670ca5fac4c955add
tree 892b838ee5aab61653cbcd876ddbf4e4bd735448
parent da7988723345ed5081ee978a740965dc6f87bada

Android 14 and later default to MGLRU [1], and field telemetry showed
occasional long-tail latency (>100ms) in the reclaim path.

Tracing revealed priority inversion in the reclaim path.  In
try_to_inc_max_seq(), high priority tasks were blocked on
wait_event_killable() while the low priority task responsible for calling
wake_up_all() was preempted, so those high priority tasks waited longer
than necessary.  In general, this problem is no different from others of
its kind, e.g., one caused by mutex_lock().  However, it is specific to
MGLRU because MGLRU introduced the new wait queue lruvec->mm_state.wait.

The purpose of this new wait queue is to avoid the thundering herd
problem.  If many direct reclaimers rush into try_to_inc_max_seq(), only
one can succeed, i.e., the one to wake up the rest, and the rest who
failed might cause premature OOM kills if they do not wait.  So far,
judging by how often the wait has been hit, there is no evidence
supporting this scenario, which raises the question of how useful the
wait queue is in practice.
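
For illustration only, the wait-and-wake pattern described above can be
sketched in userspace C with a pthread condition variable.  This is not
the kernel code: struct seq_waitq, wait_for_advance() and
advance_and_wake_all() are hypothetical names, and the condition variable
merely stands in for lruvec->mm_state.wait.

```c
#include <pthread.h>

/* Hypothetical userspace stand-in for the wait queue and max_seq. */
struct seq_waitq {
	pthread_mutex_t lock;
	pthread_cond_t wait;	/* plays the role of the wait queue */
	unsigned long max_seq;
};

/*
 * Callers that lost the race sleep here until max_seq moves past the
 * value they observed.  If the winner is preempted before it wakes
 * them, every sleeper stalls, whatever its priority.
 */
static unsigned long wait_for_advance(struct seq_waitq *q, unsigned long seen)
{
	unsigned long now;

	pthread_mutex_lock(&q->lock);
	while (q->max_seq <= seen)
		pthread_cond_wait(&q->wait, &q->lock);
	now = q->max_seq;
	pthread_mutex_unlock(&q->lock);
	return now;
}

/* The single winner bumps max_seq and wakes all sleepers. */
static void advance_and_wake_all(struct seq_waitq *q)
{
	pthread_mutex_lock(&q->lock);
	q->max_seq++;
	pthread_cond_broadcast(&q->wait);
	pthread_mutex_unlock(&q->lock);
}
```

In this sketch the latency of every waiter is bounded only by when the
winner gets to run advance_and_wake_all(), which is the dependency the
tracing exposed.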

Based on Minchan's recommendation, which is in line with his commit
6d4675e60135 ("mm: don't be stuck to rmap lock on reclaim path") and the
rest of the MGLRU code which also uses trylock when possible, remove the
wait queue.
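
As a rough userspace sketch of the "try, do not wait" approach, assuming
C11 atomics in place of the kernel primitives: struct seq_state, its
fields and try_advance_seq() are hypothetical and only illustrate that a
caller who loses the race returns immediately instead of sleeping.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-lruvec state; not the kernel struct. */
struct seq_state {
	atomic_ulong max_seq;	/* youngest generation number */
	atomic_bool busy;	/* true while one caller is advancing */
};

/*
 * Try to advance max_seq past the value the caller observed.
 * Returns true only if this caller performed the increment.
 * Never sleeps: a caller that loses the race simply returns false.
 */
static bool try_advance_seq(struct seq_state *s, unsigned long seen)
{
	bool advanced = false;

	/* Somebody else already moved the sequence forward. */
	if (atomic_load(&s->max_seq) != seen)
		return false;

	/* Trylock: bail out if another caller is already at work. */
	if (atomic_exchange(&s->busy, true))
		return false;

	/* Recheck under the flag before incrementing. */
	if (atomic_load(&s->max_seq) == seen) {
		atomic_fetch_add(&s->max_seq, 1);
		advanced = true;
	}

	atomic_store(&s->busy, false);
	return advanced;
}
```

A caller that gets false back can retry later or move on, so its latency
no longer depends on when the winning task is scheduled.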

[1] https://android-review.googlesource.com/q/I7ed7fbfd6ef9ce10053347528125dd98c39e50bf

Link: https://lkml.kernel.org/r/20230413214326.2147568-1-kaleshsingh@google.com
Fixes: bd74fdaea146 ("mm: multi-gen LRU: support page table walks")
Change-Id: I911f3968fd1adb25171279cc5b6f48ccb7efc8de
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Suggested-by: Minchan Kim <minchan@kernel.org>
Reported-by: Wei Wang <wvw@google.com>
Acked-by: Yu Zhao <yuzhao@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Jan Alexander Steffens (heftig) <heftig@archlinux.org>
Cc: Oleksandr Natalenko <oleksandr@natalenko.name>
Cc: Suleiman Souhlal <suleiman@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
(cherry picked from commit 7f63cf2d9b9bbe7b90f808927558a66ff737d399)
Bug: 277906484
[ Kalesh Singh - Fix conflicts in mm/vmscan.c; rename force_scan ->
full_scan]
[ Kalesh Singh - Fix up ABI breakages in include/linux/mmzone.h ]
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
[backport of the commit 10d315f8354581a6b07b0e8b4eabf68524063983 from android13-5.15 branch]
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
include/linux/mmzone.h
mm/vmscan.c