scsi: qla2xxx: Fix NVME cmd and LS cmd timeout race condition
authorQuinn Tran <qutran@marvell.com>
Fri, 21 Jun 2019 16:50:24 +0000 (09:50 -0700)
committerMartin K. Petersen <martin.petersen@oracle.com>
Thu, 27 Jun 2019 04:09:18 +0000 (00:09 -0400)
commit4c2a2d0178d5d8006a6bc50c8dc0ed122e4e946e
tree6fa1068a4cc56bc8dec1c2f2c368df652ef50538
parent2eb9238affa72a5260b14388cf56598f7413109b
scsi: qla2xxx: Fix NVME cmd and LS cmd timeout race condition

This patch uses kref to protect access between fcp_abort path and nvme
command and LS command completion path.  Stack trace below shows the abort
path is accessing stale memory (nvme_private->sp).

When command kref reaches 0, nvme_private & srb resource will be
disconnected from each other.  Any subsequence nvme abort request will not
be able to reference the original srb.

[ 5631.003998] BUG: unable to handle kernel paging request at 00000010000005d8
[ 5631.004016] IP: [<ffffffffc087df92>] qla_nvme_abort_work+0x22/0x100 [qla2xxx]
[ 5631.004086] Workqueue: events qla_nvme_abort_work [qla2xxx]
[ 5631.004097] RIP: 0010:[<ffffffffc087df92>]  [<ffffffffc087df92>] qla_nvme_abort_work+0x22/0x100 [qla2xxx]
[ 5631.004109] Call Trace:
[ 5631.004115]  [<ffffffffaa4b8174>] ? pwq_dec_nr_in_flight+0x64/0xb0
[ 5631.004117]  [<ffffffffaa4b9d4f>] process_one_work+0x17f/0x440
[ 5631.004120]  [<ffffffffaa4bade6>] worker_thread+0x126/0x3c0

Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
drivers/scsi/qla2xxx/qla_def.h
drivers/scsi/qla2xxx/qla_nvme.c
drivers/scsi/qla2xxx/qla_nvme.h