sched/fair: Fix the decision for load balance
authorKeisuke Nishimura <keisuke.nishimura@inria.fr>
Tue, 31 Oct 2023 13:38:22 +0000 (14:38 +0100)
committerGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Sun, 3 Dec 2023 06:33:02 +0000 (07:33 +0100)
[ Upstream commit 6d7e4782bcf549221b4ccfffec2cf4d1a473f1a3 ]

should_we_balance is called for the decision to do load-balancing.
When sched ticks invoke this function, only one CPU should return
true. However, in the current code, two CPUs can return true. The
following situation, where b means busy and i means idle, is an
example, because CPU 0 and CPU 2 return true.

        [0, 1] [2, 3]
         b  b   i  b

This fix checks if there exists an idle CPU with busy sibling(s)
after looking for a CPU on an idle core. If some idle CPUs with busy
siblings are found, just the first one should do load-balancing.

Fixes: b1bfeab9b002 ("sched/fair: Consider the idle state of the whole core for load balance")
Signed-off-by: Keisuke Nishimura <keisuke.nishimura@inria.fr>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Chen Yu <yu.c.chen@intel.com>
Reviewed-by: Shrikanth Hegde <sshegde@linux.vnet.ibm.com>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lkml.kernel.org/r/20231031133821.1570861-1-keisuke.nishimura@inria.fr
Signed-off-by: Sasha Levin <sashal@kernel.org>
kernel/sched/fair.c

index 0351320148177ef2736010cafb74bfa70c06476e..fa9fff0f9620dce1434e62487345230a239a01ee 100644 (file)
@@ -11121,12 +11121,16 @@ static int should_we_balance(struct lb_env *env)
                        continue;
                }
 
-               /* Are we the first idle CPU? */
+               /*
+                * Are we the first idle core in a non-SMT domain or higher,
+                * or the first idle CPU in a SMT domain?
+                */
                return cpu == env->dst_cpu;
        }
 
-       if (idle_smt == env->dst_cpu)
-               return true;
+       /* Are we the first idle CPU with busy siblings? */
+       if (idle_smt != -1)
+               return idle_smt == env->dst_cpu;
 
        /* Are we the first CPU of this group ? */
        return group_balance_cpu(sg) == env->dst_cpu;