scsi: lpfc: Fix EEH encountering oops with NVMe traffic
author James Smart <jsmart2021@gmail.com>
Wed, 27 Jan 2021 22:16:01 +0000 (14:16 -0800)
committer Martin K. Petersen <martin.petersen@oracle.com>
Fri, 29 Jan 2021 18:41:39 +0000 (13:41 -0500)
In testing, in a configuration with Redfish and native NVMe multipath, a
kernel oops is encountered when an EEH is injected:

(unreliable)
lpfc_nvme_ls_req+0x328/0x720 [lpfc]
__nvme_fc_send_ls_req.constprop.13+0x1d8/0x3d0 [nvme_fc]
nvme_fc_create_association+0x224/0xd10 [nvme_fc]
nvme_fc_reset_ctrl_work+0x110/0x154 [nvme_fc]
process_one_work+0x304/0x5d

The NVMe transport is issuing a Disconnect LS request, which the driver
receives and tries to post, but the work queue used by the driver is already
being torn down by the EEH.

Fix by checking that the work queue is valid before proceeding with the
LS transmit.
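
For illustration only, the guard can be modeled outside the kernel. The
sketch below is not driver code: the sketch_* types and functions are
hypothetical stand-ins, and only the nvmels_wq field name and the -ENODEV
and -ENOMEM return values are taken from the patch. It assumes teardown
leaves the queue pointer NULL, which is the condition the added check tests.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver structures; only the field
 * relevant to the check is modeled. */
struct sketch_queue { int id; };

struct sketch_hba {
	struct sketch_queue *nvmels_wq;	/* assumed NULL once torn down */
};

struct sketch_vport {
	struct sketch_hba *phba;
};

/* Returns 0 if the LS request could be posted, or a negative errno. */
static int sketch_ls_req(struct sketch_vport *vport)
{
	if (!vport || !vport->phba)
		return -ENODEV;

	/* Same shape as the guard added by the patch: bail out before
	 * touching a work queue that is no longer there. */
	if (!vport->phba->nvmels_wq)
		return -ENOMEM;

	/* ... a real driver would build and post the request here ... */
	return 0;
}

/* Small demonstration of both paths. */
static void sketch_demo(void)
{
	struct sketch_queue wq = { .id = 1 };
	struct sketch_hba hba = { .nvmels_wq = &wq };
	struct sketch_vport vport = { .phba = &hba };

	assert(sketch_ls_req(&vport) == 0);

	hba.nvmels_wq = NULL;	/* simulate teardown during EEH */
	assert(sketch_ls_req(&vport) == -ENOMEM);
}
```

Calling sketch_demo() exercises the normal path and the torn-down path. The
model is single-threaded, so it shows only the early-return behavior and
says nothing about how the driver orders teardown against in-flight
requests.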

Link: https://lore.kernel.org/r/20210127221601.84878-1-jsmart2021@gmail.com
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
drivers/scsi/lpfc/lpfc_nvme.c

index 1cb82fa6a60e4c40d6e1c9d35287c274dc286fd0..39d147e251bf4f2e103a9b003b2fcf4a025b880a 100644
@@ -559,6 +559,9 @@ __lpfc_nvme_ls_req(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp,
                return -ENODEV;
        }
 
+       if (!vport->phba->sli4_hba.nvmels_wq)
+               return -ENOMEM;
+
        /*
         * there are two dma buf in the request, actually there is one and
         * the second one is just the start address + cmd size.