mm/page_alloc.c: ratelimit allocation failure warnings more aggressively
authorJohannes Weiner <hannes@cmpxchg.org>
Wed, 6 Nov 2019 05:16:51 +0000 (21:16 -0800)
committerLinus Torvalds <torvalds@linux-foundation.org>
Wed, 6 Nov 2019 16:47:50 +0000 (08:47 -0800)
While investigating a bug related to higher atomic allocation failures,
we noticed the failure warnings positively drowning the console, and in
our case trigger lockup warnings because of a serial console too slow to
handle all that output.

But even if we had a faster console, it's unclear what additional
information the current level of repetition provides.

Allocation failures happen for three reasons: The machine is OOM, the VM
is failing to handle reasonable requests, or somebody is making
unreasonable requests (and didn't acknowledge their opportunism with
__GFP_NOWARN).  Having the memory dump, a callstack, and the ratelimit
stats on skipped failure warnings should provide enough information to
let users/admins/developers know whether something is wrong and point
them in the right direction for debugging, bpftracing etc.

Limit allocation failure warnings to one spew every ten seconds.

Link: http://lkml.kernel.org/r/20191028194906.26899-1-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm/page_alloc.c

index 6c717ad..f391c0c 100644 (file)
@@ -3728,10 +3728,6 @@ try_this_zone:
 static void warn_alloc_show_mem(gfp_t gfp_mask, nodemask_t *nodemask)
 {
        unsigned int filter = SHOW_MEM_FILTER_NODES;
-       static DEFINE_RATELIMIT_STATE(show_mem_rs, HZ, 1);
-
-       if (!__ratelimit(&show_mem_rs))
-               return;
 
        /*
         * This documents exceptions given to allocations in certain
@@ -3752,8 +3748,7 @@ void warn_alloc(gfp_t gfp_mask, nodemask_t *nodemask, const char *fmt, ...)
 {
        struct va_format vaf;
        va_list args;
-       static DEFINE_RATELIMIT_STATE(nopage_rs, DEFAULT_RATELIMIT_INTERVAL,
-                                     DEFAULT_RATELIMIT_BURST);
+       static DEFINE_RATELIMIT_STATE(nopage_rs, 10*HZ, 1);
 
        if ((gfp_mask & __GFP_NOWARN) || !__ratelimit(&nopage_rs))
                return;