sched/pelt: Relax the sync of util_sum with util_avg
author Vincent Guittot <vincent.guittot@linaro.org>
Tue, 11 Jan 2022 13:46:56 +0000 (14:46 +0100)
committer Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tue, 1 Feb 2022 16:27:10 +0000 (17:27 +0100)
commit 58e81159b4f7ce89d69d350c1dcebca763f587ac
tree 94c41d51e8ee8effc1ea496d7025e5c2c604b8f9
parent 767060539ac4fe683300a9004e7956b1f6dbd657
sched/pelt: Relax the sync of util_sum with util_avg

[ Upstream commit 98b0d890220d45418cfbc5157b3382e6da5a12ab ]

Rick reported in bugzilla performance regressions caused by the CPU
frequency being lower than before:
    https://bugzilla.kernel.org/show_bug.cgi?id=215045

He bisected the problem to:
commit 1c35b07e6d39 ("sched/fair: Ensure _sum and _avg values stay consistent")

This commit forces util_sum to be synced with the new util_avg after
removing the contribution of a task and before the next periodic sync. By
doing so, util_sum is rounded down to its lower bound and might lose up to
LOAD_AVG_MAX-1 of accumulated contribution that has not yet been
reflected in util_avg.
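
To make the loss concrete, here is a minimal user-space sketch of the
arithmetic (not kernel code; LOAD_AVG_MAX matches the kernel's PELT
constant, the sample values are made up):

    #include <stdio.h>

    #define LOAD_AVG_MAX 47742	/* kernel PELT constant */

    int main(void)
    {
    	/* divider = LOAD_AVG_MAX - 1024 + period_contrib */
    	unsigned int divider = LOAD_AVG_MAX - 1024 + 512;
    	unsigned long util_sum = 10UL * divider + 46000; /* made-up sample */
    	unsigned long util_avg = util_sum / divider;     /* truncating division */

    	/* what 1c35b07e6d39 did: re-derive util_sum from util_avg */
    	unsigned long synced = util_avg * divider;

    	/* prints 46000; anything up to divider-1 can be discarded */
    	printf("lost contribution: %lu (max %u)\n",
    	       util_sum - synced, divider - 1);
    	return 0;
    }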

Instead of always setting util_sum to the lower bound of util_avg, which
can significantly lower the utilization of the root cfs_rq after
propagating the change down the hierarchy, we revert the change to
util_sum and propagate the difference.
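
A simplified model of the "propagate the difference" idea (user-space
sketch with stand-in names; the real code lives in update_tg_cfs_util()
in kernel/sched/fair.c and clamps with add_positive()/sub_positive()):

    #include <stdio.h>

    struct pelt_avg {
    	unsigned long util_avg;
    	unsigned long util_sum;
    };

    static void propagate_util(struct pelt_avg *cfs, struct pelt_avg *se,
    			       unsigned long new_avg, unsigned long new_sum)
    {
    	/* how much the child (sched_entity) actually changed ... */
    	long delta_avg = (long)new_avg - (long)se->util_avg;
    	long delta_sum = (long)new_sum - (long)se->util_sum;

    	se->util_avg = new_avg;
    	se->util_sum = new_sum;

    	/*
    	 * ... and apply the same delta to the parent, instead of
    	 * re-deriving cfs->util_sum from cfs->util_avg, which would
    	 * truncate contribution not yet reflected in util_avg.
    	 */
    	cfs->util_avg += delta_avg;
    	cfs->util_sum += delta_sum;
    }

    int main(void)
    {
    	unsigned int divider = 47742 - 1024;	/* minimum divider */
    	struct pelt_avg cfs = { 100, 100UL * divider + 30000 };
    	struct pelt_avg se  = {  40,  40UL * divider };

    	propagate_util(&cfs, &se, 25, 25UL * divider);
    	/* the 30000 of pending contribution survives in cfs.util_sum */
    	printf("cfs: avg=%lu sum=%lu\n", cfs.util_avg, cfs.util_sum);
    	return 0;
    }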

In addition, we check that the cfs_rq's util_sum always stays above the
lower bound for a given util_avg, as it has been observed that a
sched_entity's util_sum is sometimes above the cfs_rq's.
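
A sketch of that lower-bound check (PELT_MIN_DIVIDER here names the
smallest divider PELT can use, i.e. period_contrib == 0, matching the
constant added to kernel/sched/pelt.h; max_t() is the kernel's typed
max helper):

    #define PELT_MIN_DIVIDER	(LOAD_AVG_MAX - 1024)

    sa->util_sum = max_t(u32, sa->util_sum,
    		     sa->util_avg * PELT_MIN_DIVIDER);

Since period_contrib may have advanced since the last sync,
util_avg * PELT_MIN_DIVIDER is the only bound guaranteed to still hold.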

Fixes: 1c35b07e6d39 ("sched/fair: Ensure _sum and _avg values stay consistent")
Reported-by: Rick Yiu <rickyiu@google.com>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Tested-by: Sachin Sant <sachinp@linux.ibm.com>
Link: https://lkml.kernel.org/r/20220111134659.24961-2-vincent.guittot@linaro.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
kernel/sched/fair.c
kernel/sched/pelt.h