x86: Fix CPUIDLE_FLAG_IRQ_ENABLE leaking timer reprogram
author Peter Zijlstra <peterz@infradead.org>
Wed, 15 Nov 2023 15:13:23 +0000 (10:13 -0500)
committer Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Thu, 25 Jan 2024 23:35:12 +0000 (15:35 -0800)
commit 08beb0d4362b3d3737c65de55f45ca6b7e52c71a
tree 7aff03afbb4cd298f26b8a46ad650dd45e74492d
parent f7aac5fede0b0e4f5d98027a4e91f62458cebc33

[ Upstream commit edc8fc01f608108b0b7580cb2c29dfb5135e5f0e ]

intel_idle_irq() re-enables IRQs very early. As a result, an interrupt
may fire before mwait() is eventually called. If such an interrupt queues
a timer, it may go unnoticed until mwait returns and the idle loop
handles the tick re-evaluation. And monitoring TIF_NEED_RESCHED doesn't
help because a local timer enqueue doesn't set that flag.

The issue is mitigated by the fact that this idle handler is only invoked
for shallow C-states when, presumably, the next tick is supposed to be
close enough. There may still be rare cases though when the next tick
is far away and the selected C-state is shallow, resulting in a timer
getting ignored for a while.

Fix this by using sti_mwait(), whose IRQ re-enablement only takes effect
upon calling mwait(). This closes the race while keeping the interrupt
latency within acceptable bounds.
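
The race and the fix can be illustrated with a small userspace toy model.
This is only a sketch, not the kernel code: the enum, the function names and
the boolean state are all hypothetical, and the model assumes that an
interrupt still pending when mwait starts wakes the CPU immediately, after
which the idle loop re-evaluates the tick.

```c
#include <assert.h>
#include <stdbool.h>

/* Two idle-entry strategies (illustrative names, not kernel identifiers). */
enum idle_entry {
	ENABLE_IRQS_THEN_MWAIT,	/* old behaviour: IRQs on well before mwait */
	STI_MWAIT,		/* fixed behaviour: IRQs on only as mwait starts */
};

/*
 * Toy model: returns true when the CPU ends up sleeping in mwait while a
 * timer queued by an interrupt handler has not been seen by the idle loop.
 */
bool sleeps_past_new_timer(enum idle_entry how, bool irq_before_mwait)
{
	bool irq_pending = irq_before_mwait;
	bool unseen_timer = false;

	if (how == ENABLE_IRQS_THEN_MWAIT && irq_pending) {
		/* IRQs are already on: the handler runs now and queues a timer. */
		irq_pending = false;
		unseen_timer = true;
	}

	/* mwait: a still-pending interrupt wakes the CPU right away, and the
	 * idle loop then gets to re-evaluate the tick. */
	if (irq_pending)
		return false;

	/* Nothing pending: the CPU sleeps, possibly past the new timer. */
	return unseen_timer;
}

/* Usage example: the early-enable path can miss the timer, sti+mwait cannot. */
void check_model(void)
{
	assert(sleeps_past_new_timer(ENABLE_IRQS_THEN_MWAIT, true));
	assert(!sleeps_past_new_timer(STI_MWAIT, true));
}
```

In this model the only difference between the two paths is whether the
interrupt is consumed before mwait or is still pending when mwait starts,
which is the window the commit message describes.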

Fixes: c227233ad64c (intel_idle: enable interrupts before C1 on Xeons)
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lkml.kernel.org/r/20231115151325.6262-3-frederic@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
arch/x86/include/asm/mwait.h
drivers/idle/intel_idle.c