scsi: qla2xxx: Fix erroneous mailbox timeout after PCI error injection
authorQuinn Tran <qutran@marvell.com>
Thu, 16 Jun 2022 05:35:07 +0000 (22:35 -0700)
committerGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Wed, 17 Aug 2022 12:24:17 +0000 (14:24 +0200)
commit f260694e6463b63ae550aad25ddefe94cb1904da upstream.

Clear wait for mailbox interrupt flag to prevent stale mailbox:

Feb 22 05:22:56 ltcden4-lp7 kernel: qla2xxx [0135:90:00.1]-500a:4: LOOP UP detected (16 Gbps).
Feb 22 05:22:59 ltcden4-lp7 kernel: qla2xxx [0135:90:00.1]-d04c:4: MBX Command timeout for cmd 69, ...

To fix the issue, driver needs to clear the MBX_INTR_WAIT flag on purging
the mailbox. When the stale mailbox completion does arrive, it will be
dropped.

Link: https://lore.kernel.org/r/20220616053508.27186-11-njavali@marvell.com
Fixes: b6faaaf796d7 ("scsi: qla2xxx: Serialize mailbox request")
Cc: Naresh Bannoth <nbannoth@in.ibm.com>
Cc: Kyle Mahlkuch <Kyle.Mahlkuch@ibm.com>
Cc: stable@vger.kernel.org
Reported-by: Naresh Bannoth <nbannoth@in.ibm.com>
Tested-by: Naresh Bannoth <nbannoth@in.ibm.com>
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
drivers/scsi/qla2xxx/qla_mbx.c

index 96303ae..5bcb8da 100644 (file)
@@ -276,6 +276,12 @@ qla2x00_mailbox_command(scsi_qla_host_t *vha, mbx_cmd_t *mcp)
                atomic_inc(&ha->num_pend_mbx_stage3);
                if (!wait_for_completion_timeout(&ha->mbx_intr_comp,
                    mcp->tov * HZ)) {
+                       ql_dbg(ql_dbg_mbx, vha, 0x117a,
+                           "cmd=%x Timeout.\n", command);
+                       spin_lock_irqsave(&ha->hardware_lock, flags);
+                       clear_bit(MBX_INTR_WAIT, &ha->mbx_cmd_flags);
+                       spin_unlock_irqrestore(&ha->hardware_lock, flags);
+
                        if (chip_reset != ha->chip_reset) {
                                eeh_delay = ha->flags.eeh_busy ? 1 : 0;
 
@@ -288,12 +294,6 @@ qla2x00_mailbox_command(scsi_qla_host_t *vha, mbx_cmd_t *mcp)
                                rval = QLA_ABORTED;
                                goto premature_exit;
                        }
-                       ql_dbg(ql_dbg_mbx, vha, 0x117a,
-                           "cmd=%x Timeout.\n", command);
-                       spin_lock_irqsave(&ha->hardware_lock, flags);
-                       clear_bit(MBX_INTR_WAIT, &ha->mbx_cmd_flags);
-                       spin_unlock_irqrestore(&ha->hardware_lock, flags);
-
                } else if (ha->flags.purge_mbox ||
                    chip_reset != ha->chip_reset) {
                        eeh_delay = ha->flags.eeh_busy ? 1 : 0;