Fix panic seen on some IBM and HP systems on 2.6.32-rc6:
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<
ffffffff8120bf3f>] find_next_bit+0x77/0x9c
[...]
[<
ffffffff8120bbde>] cpumask_next_and+0x2e/0x3b
[<
ffffffff81225c62>] pci_device_probe+0x8e/0xf5
[<
ffffffff812b9be6>] ? driver_sysfs_add+0x47/0x6c
[<
ffffffff812b9da5>] driver_probe_device+0xd9/0x1f9
[<
ffffffff812b9f1d>] __driver_attach+0x58/0x7c
[<
ffffffff812b9ec5>] ? __driver_attach+0x0/0x7c
[<
ffffffff812b9298>] bus_for_each_dev+0x54/0x89
[<
ffffffff812b9b4f>] driver_attach+0x19/0x1b
[<
ffffffff812b97ae>] bus_add_driver+0xd3/0x23d
[<
ffffffff812ba1e7>] driver_register+0x98/0x109
[<
ffffffff81225ed0>] __pci_register_driver+0x63/0xd3
[<
ffffffff81072776>] ? up_read+0x26/0x2a
[<
ffffffffa0081000>] ? k8temp_init+0x0/0x20 [k8temp]
[<
ffffffffa008101e>] k8temp_init+0x1e/0x20 [k8temp]
[<
ffffffff8100a073>] do_one_initcall+0x6d/0x185
[<
ffffffff8108d765>] sys_init_module+0xd3/0x236
[<
ffffffff81011ac2>] system_call_fastpath+0x16/0x1b
I put in a printk and commented out the set_dev_node()
call when and got this output:
quirk_amd_nb_node: current numa_node = 0x0, would set to val & 7 = 0x0
quirk_amd_nb_node: current numa_node = 0x0, would set to val & 7 = 0x1
quirk_amd_nb_node: current numa_node = 0x0, would set to val & 7 = 0x2
quirk_amd_nb_node: current numa_node = 0x0, would set to val & 7 = 0x3
I.e. the issue appears to be that the HW has set val to a valid
value, however, the system is only configured for a single
node -- 0, the others are offline.
Check to see if the node is actually online before setting
the numa node for an AMD northbridge in quirk_amd_nb_node().
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Cc: bhavna.sarathy@amd.com
Cc: jbarnes@virtuousgeek.org
Cc: andreas.herrmann3@amd.com
LKML-Reference: <
20091112180933.12532.98685.sendpatchset@prarit.bos.redhat.com>
[ v2: clean up the code and add comments ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>