This crash:
[ 1774.088275] divide error: 0000 [#1] SMP
[ 1774.100355] CPU 13
[ 1774.102498] Modules linked in:
[ 1774.105631] Pid: 30881, comm: hackbench Not tainted 2.6.31-rc8-tip-01308-g484d664-dirty #1629 X8DTN
[ 1774.114807] RIP: 0010:[<ffffffff81041c38>] [<ffffffff81041c38>] sched_balance_self+0x19b/0x2d4
Triggers because update_group_power() modifies the sd tree in place and
does its temporary calculations there, not considering that other CPUs
could observe intermediate values, such as the zero initial value, and
divide by them.
Calculate the sum in a temporary variable instead. (We need no memory
barrier as these are all statistical values anyway.)
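The pattern described above can be sketched in plain userspace C. This is only an illustration, not the kernel code: struct group and update_parent_power() are hypothetical, simplified stand-ins for struct sched_group and update_group_power(), and the sketch assumes the final word-sized store is not torn.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct sched_group. */
struct group {
	unsigned long cpu_power;
	struct group *next;	/* circular list of sibling groups */
};

/*
 * Sum the power of every group in the circular list starting at 'first'
 * into a local variable, then publish the total with one store.
 * The shared field is never reset to zero, so a concurrent reader does
 * not see the transient zero or a partial sum written by this function.
 */
static void update_parent_power(struct group *parent, struct group *first)
{
	unsigned long power = 0;
	struct group *g = first;

	assert(parent != NULL && first != NULL);

	do {
		power += g->cpu_power;
		g = g->next;
	} while (g != first);

	parent->cpu_power = power;	/* single store of the final value */
}
```

With two child groups of power 1024 and 512 linked in a ring, a call to update_parent_power() leaves the parent at 1536, and the parent's previous value stays visible until that final store.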
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20090904092742.GA11014@elte.hu>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
{
struct sched_domain *child = sd->child;
struct sched_group *group, *sdg = sd->groups;
+ unsigned long power;
if (!child) {
update_cpu_power(sd, cpu);
return;
}
- sdg->cpu_power = 0;
+ power = 0;
group = child->groups;
do {
- sdg->cpu_power += group->cpu_power;
+ power += group->cpu_power;
group = group->next;
} while (group != child->groups);
+
+ sdg->cpu_power = power;
}
/**