In scsi at least two cases of the parent device being deleted before the
child is added have been observed.
1/ scsi is performing async scans and the device is removed prior to the
async can thread running (can happen with an in-opportune / unlikely
unplug during initial scan).
2/ libsas discovery event running after the parent port has been torn
down (this is a bug in libsas).
Result in crash signatures like:
BUG: unable to handle kernel NULL pointer dereference at
0000000000000098
IP: [<
ffffffff8115e100>] sysfs_create_dir+0x32/0xb6
...
Process scsi_scan_8 (pid: 5417, threadinfo
ffff88080bd16000, task
ffff880801b8a0b0)
Stack:
00000000fffffffe ffff880813470628 ffff88080bd17cd0 ffff88080614b7e8
ffff88080b45c108 00000000fffffffe ffff88080bd17d20 ffffffff8125e4a8
ffff88080bd17cf0 ffffffff81075149 ffff88080bd17d30 ffff88080614b7e8
Call Trace:
[<
ffffffff8125e4a8>] kobject_add_internal+0x120/0x1e3
[<
ffffffff81075149>] ? trace_hardirqs_on+0xd/0xf
[<
ffffffff8125e641>] kobject_add_varg+0x41/0x50
[<
ffffffff8125e70b>] kobject_add+0x64/0x66
[<
ffffffff8131122b>] device_add+0x12d/0x63a
In this scenario the parent is still valid (because we have a
reference), but it has been device_del()'d which means its kobj->sd
pointer is NULL'd via:
device_del()->kobject_del()->sysfs_remove_dir()
...and then sysfs_create_dir() (without this fix) goes ahead and
de-references parent_sd via sysfs_ns_type():
return (sd->s_flags & SYSFS_NS_TYPE_MASK) >> SYSFS_NS_TYPE_SHIFT;
This scenario is being fixed in scsi/libsas, but if other subsystems
present the same ordering the system need not immediately crash.
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: James Bottomley <JBottomley@parallels.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>