platform/kernel/linux-rpi.git
3 years agoscsi: elx: efct: Hardware queue creation and deletion
James Smart [Tue, 1 Jun 2021 23:55:00 +0000 (16:55 -0700)]
scsi: elx: efct: Hardware queue creation and deletion

Add routines for queue creation, deletion, and configuration.

Link: https://lore.kernel.org/r/20210601235512.20104-20-jsmart2021@gmail.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Co-developed-by: Ram Vegesna <ram.vegesna@broadcom.com>
Signed-off-by: Ram Vegesna <ram.vegesna@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: elx: efct: Driver initialization routines
James Smart [Tue, 1 Jun 2021 23:54:59 +0000 (16:54 -0700)]
scsi: elx: efct: Driver initialization routines

Add driver definitions for:

 - Emulex FC Target driver init, attach and hardware setup routines.

Link: https://lore.kernel.org/r/20210601235512.20104-19-jsmart2021@gmail.com
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Co-developed-by: Ram Vegesna <ram.vegesna@broadcom.com>
Signed-off-by: Ram Vegesna <ram.vegesna@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: elx: efct: Data structures and defines for hw operations
James Smart [Tue, 1 Jun 2021 23:54:58 +0000 (16:54 -0700)]
scsi: elx: efct: Data structures and defines for hw operations

Start the population of the efct target mode driver.  The driver is
contained in the drivers/scsi/elx/efct subdirectory.

Create the efct directory and start population of the driver by adding
SLI-4 configuration parameters, data structures for configuring SLI-4
queues, converting from OS to SLI-4 IO requests, and handling async events.

Link: https://lore.kernel.org/r/20210601235512.20104-18-jsmart2021@gmail.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Co-developed-by: Ram Vegesna <ram.vegesna@broadcom.com>
Signed-off-by: Ram Vegesna <ram.vegesna@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: elx: libefc: Register discovery objects with hardware
James Smart [Tue, 1 Jun 2021 23:54:57 +0000 (16:54 -0700)]
scsi: elx: libefc: Register discovery objects with hardware

Add library interface definitions for:

 - Registrations for VFI, VPI and RPI.

Link: https://lore.kernel.org/r/20210601235512.20104-17-jsmart2021@gmail.com
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Co-developed-by: Ram Vegesna <ram.vegesna@broadcom.com>
Signed-off-by: Ram Vegesna <ram.vegesna@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: elx: libefc: Extended link Service I/O handling
James Smart [Tue, 1 Jun 2021 23:54:56 +0000 (16:54 -0700)]
scsi: elx: libefc: Extended link Service I/O handling

Add library interface definitions for:

 - Functions to build and send ELS/CT/BLS commands and responses.

Link: https://lore.kernel.org/r/20210601235512.20104-16-jsmart2021@gmail.com
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Co-developed-by: Ram Vegesna <ram.vegesna@broadcom.com>
Signed-off-by: Ram Vegesna <ram.vegesna@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: elx: libefc: FC node ELS and state handling
James Smart [Tue, 1 Jun 2021 23:54:55 +0000 (16:54 -0700)]
scsi: elx: libefc: FC node ELS and state handling

Add library interface definitions for:

 - FC node PRLI handling and state management.

Link: https://lore.kernel.org/r/20210601235512.20104-15-jsmart2021@gmail.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Co-developed-by: Ram Vegesna <ram.vegesna@broadcom.com>
Signed-off-by: Ram Vegesna <ram.vegesna@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: elx: libefc: Fabric node state machine interfaces
James Smart [Tue, 1 Jun 2021 23:54:54 +0000 (16:54 -0700)]
scsi: elx: libefc: Fabric node state machine interfaces

Add library interface definitions for:

 - Fabric node initialization and logins.

 - Name/Directory Services node.

 - Fabric Controller node to process rscn events.

These are all interactions with remote ports that correspond to well-known
fabric entities

Link: https://lore.kernel.org/r/20210601235512.20104-14-jsmart2021@gmail.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Co-developed-by: Ram Vegesna <ram.vegesna@broadcom.com>
Signed-off-by: Ram Vegesna <ram.vegesna@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: elx: libefc: Remote node state machine interfaces
James Smart [Tue, 1 Jun 2021 23:54:53 +0000 (16:54 -0700)]
scsi: elx: libefc: Remote node state machine interfaces

Add library interface definitions for:

 - Remote node (aka remote port) allocation, initializaion and destroy
   routines.

Link: https://lore.kernel.org/r/20210601235512.20104-13-jsmart2021@gmail.com
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Co-developed-by: Ram Vegesna <ram.vegesna@broadcom.com>
Signed-off-by: Ram Vegesna <ram.vegesna@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: elx: libefc: SLI and FC PORT state machine interfaces
James Smart [Tue, 1 Jun 2021 23:54:52 +0000 (16:54 -0700)]
scsi: elx: libefc: SLI and FC PORT state machine interfaces

Add library interface definitions for:

 - SLI and FC port (aka n_port_id) registration, allocation and
   deallocation.

Link: https://lore.kernel.org/r/20210601235512.20104-12-jsmart2021@gmail.com
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Co-developed-by: Ram Vegesna <ram.vegesna@broadcom.com>
Signed-off-by: Ram Vegesna <ram.vegesna@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: elx: libefc: FC Domain state machine interfaces
James Smart [Tue, 1 Jun 2021 23:54:51 +0000 (16:54 -0700)]
scsi: elx: libefc: FC Domain state machine interfaces

Add library interface definitions for:

 - FC Domain registration, allocation and deallocation sequence

Link: https://lore.kernel.org/r/20210601235512.20104-11-jsmart2021@gmail.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Co-developed-by: Ram Vegesna <ram.vegesna@broadcom.com>
Signed-off-by: Ram Vegesna <ram.vegesna@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: elx: libefc: Emulex FC discovery library APIs and definitions
James Smart [Tue, 1 Jun 2021 23:54:50 +0000 (16:54 -0700)]
scsi: elx: libefc: Emulex FC discovery library APIs and definitions

Add library interface definitions for:

 - SLI/Local FC port objects

 - efc_domain_s: FC domain (aka fabric) objects

 - efc_node_s: FC node (aka remote ports) objects

Link: https://lore.kernel.org/r/20210601235512.20104-10-jsmart2021@gmail.com
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Co-developed-by: Ram Vegesna <ram.vegesna@broadcom.com>
Signed-off-by: Ram Vegesna <ram.vegesna@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: elx: libefc: Generic state machine framework
James Smart [Tue, 1 Jun 2021 23:54:49 +0000 (16:54 -0700)]
scsi: elx: libefc: Generic state machine framework

Start the population of the libefc library.

The library will contain common tasks usable by a target or initiator
driver. The library will also contain a FC discovery state machine
interface.

Creates the library directory and add definitions for the discovery state
machine interface.

Link: https://lore.kernel.org/r/20210601235512.20104-9-jsmart2021@gmail.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Co-developed-by: Ram Vegesna <ram.vegesna@broadcom.com>
Signed-off-by: Ram Vegesna <ram.vegesna@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: elx: libefc_sli: APIs to setup SLI library
James Smart [Tue, 1 Jun 2021 23:54:48 +0000 (16:54 -0700)]
scsi: elx: libefc_sli: APIs to setup SLI library

Add APIs to initialize the library, initialize the SLI Port, reset
firmware, terminate the SLI Port, and terminate the library.

Link: https://lore.kernel.org/r/20210601235512.20104-8-jsmart2021@gmail.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Co-developed-by: Ram Vegesna <ram.vegesna@broadcom.com>
Signed-off-by: Ram Vegesna <ram.vegesna@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: elx: libefc_sli: BMBX routines and SLI config commands
James Smart [Tue, 1 Jun 2021 23:54:47 +0000 (16:54 -0700)]
scsi: elx: libefc_sli: BMBX routines and SLI config commands

Add routines to create mailbox commands used during adapter initialization
and add APIs to issue mailbox commands to the adapter through the bootstrap
mailbox register.

Link: https://lore.kernel.org/r/20210601235512.20104-7-jsmart2021@gmail.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Co-developed-by: Ram Vegesna <ram.vegesna@broadcom.com>
Signed-off-by: Ram Vegesna <ram.vegesna@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: elx: libefc_sli: Populate and post different WQEs
James Smart [Tue, 1 Jun 2021 23:54:46 +0000 (16:54 -0700)]
scsi: elx: libefc_sli: Populate and post different WQEs

Add service routines to create different WQEs and add APIs to issue iread,
iwrite, treceive, tsend and other work queue entries.

Link: https://lore.kernel.org/r/20210601235512.20104-6-jsmart2021@gmail.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Co-developed-by: Ram Vegesna <ram.vegesna@broadcom.com>
Signed-off-by: Ram Vegesna <ram.vegesna@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: elx: libefc_sli: Queue create/destroy/parse routines
James Smart [Tue, 1 Jun 2021 23:54:45 +0000 (16:54 -0700)]
scsi: elx: libefc_sli: Queue create/destroy/parse routines

Add service routines to create mailbox commands and add APIs to
create/destroy/parse SLI-4 EQ, CQ, RQ and MQ queues.

Link: https://lore.kernel.org/r/20210601235512.20104-5-jsmart2021@gmail.com
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Co-developed-by: Ram Vegesna <ram.vegesna@broadcom.com>
Signed-off-by: Ram Vegesna <ram.vegesna@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: elx: libefc_sli: Data structures and defines for mbox commands
James Smart [Tue, 1 Jun 2021 23:54:44 +0000 (16:54 -0700)]
scsi: elx: libefc_sli: Data structures and defines for mbox commands

Add definitions for SLI-4 mailbox commands and responses.

Link: https://lore.kernel.org/r/20210601235512.20104-4-jsmart2021@gmail.com
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Co-developed-by: Ram Vegesna <ram.vegesna@broadcom.com>
Signed-off-by: Ram Vegesna <ram.vegesna@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: elx: libefc_sli: SLI Descriptors and Queue entries
James Smart [Tue, 1 Jun 2021 23:54:43 +0000 (16:54 -0700)]
scsi: elx: libefc_sli: SLI Descriptors and Queue entries

Continue the libefc_sli SLI-4 library population.

Add SLI-4 Data structures and defines for:

 - Buffer Descriptors (BDEs)

 - Scatter/Gather List elements (SGEs)

 - Queues and their Entry Descriptions for: Event Queues (EQs), Completion
   Queues (CQs), Receive Queues (RQs), and the Mailbox Queue (MQ).

Link: https://lore.kernel.org/r/20210601235512.20104-3-jsmart2021@gmail.com
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Co-developed-by: Ram Vegesna <ram.vegesna@broadcom.com>
Signed-off-by: Ram Vegesna <ram.vegesna@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: elx: libefc_sli: SLI-4 register offsets and field definitions
James Smart [Tue, 1 Jun 2021 23:54:42 +0000 (16:54 -0700)]
scsi: elx: libefc_sli: SLI-4 register offsets and field definitions

This is the initial patch for the new Emulex target mode SCSI driver.

 - Create the new Emulex source level directory drivers/scsi/elx and add
   the directory to the MAINTAINERS file.

 - Create the first library subdirectory drivers/scsi/elx/libefc_sli.  This
   library is a SLI-4 interface library.

 - Start the population of the libefc_sli library with definitions of SLI-4
   hardware register offsets and definitions.

Link: https://lore.kernel.org/r/20210601235512.20104-2-jsmart2021@gmail.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Co-developed-by: Ram Vegesna <ram.vegesna@broadcom.com>
Signed-off-by: Ram Vegesna <ram.vegesna@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: pm8001: Remove unnecessary OOM message
Zhen Lei [Thu, 10 Jun 2021 09:46:05 +0000 (17:46 +0800)]
scsi: pm8001: Remove unnecessary OOM message

Fixes scripts/checkpatch.pl warning:
WARNING: Possible unnecessary 'out of memory' message

Remove it can help us save a bit of memory.

Also change the return error code from "-1" to "-ENOMEM".

Link: https://lore.kernel.org/r/20210610094605.16672-2-thunder.leizhen@huawei.com
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: qla2xxx: Use list_move_tail() instead of list_del()/list_add_tail()
Baokun Li [Wed, 9 Jun 2021 07:23:21 +0000 (15:23 +0800)]
scsi: qla2xxx: Use list_move_tail() instead of list_del()/list_add_tail()

Using list_move_tail() instead of list_del() + list_add_tail().

Link: https://lore.kernel.org/r/20210609072321.1356896-1-libaokun1@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Baokun Li <libaokun1@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: ufs-mediatek: Create reset control device_link
Peter Wang [Wed, 2 Jun 2021 02:42:00 +0000 (10:42 +0800)]
scsi: ufs-mediatek: Create reset control device_link

Mediatek UFS reset function relies on Reset Control provided by
reset-ti-syscon. To make Mediatek Reset Control work properly, select
reset-ti-syscon to ensure it is being built.

In addition, establish device_link to wait until reset-ti-syscon
initialization is complete during UFS probing.

Link: https://lore.kernel.org/r/1622601720-22466-1-git-send-email-peter.wang@mediatek.com
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: qla2xxx: Remove duplicate declarations
Shaokun Zhang [Mon, 24 May 2021 08:03:22 +0000 (16:03 +0800)]
scsi: qla2xxx: Remove duplicate declarations

qla2x00_post_uevent_work(), qla2x00_free_fcport() and ql2xexlogins are
declared multiple times. Remove the duplicates.

Link: https://lore.kernel.org/r/1621843402-34828-1-git-send-email-zhangshaokun@hisilicon.com
Cc: Nilesh Javali <njavali@marvell.com>
Cc: GR-QLogic-Storage-Upstream@marvell.com
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: lpfc: Use list_move_tail() instead of list_del()/list_add_tail()
Zou Wei [Tue, 8 Jun 2021 00:51:33 +0000 (08:51 +0800)]
scsi: lpfc: Use list_move_tail() instead of list_del()/list_add_tail()

Using list_move_tail() instead of list_del() + list_add_tail().

Link: https://lore.kernel.org/r/1623113493-49384-1-git-send-email-zou_wei@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Reviewed-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Zou Wei <zou_wei@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: storvsc: Correctly handle multiple flags in srb_status
Michael Kelley [Fri, 4 Jun 2021 17:21:03 +0000 (10:21 -0700)]
scsi: storvsc: Correctly handle multiple flags in srb_status

Hyper-V is observed to sometimes set multiple flags in the srb_status, such
as ABORTED and ERROR. Current code in storvsc_handle_error() handles only a
single flag being set, and does nothing when multiple flags are set.  Fix
this by changing the case statement into a series of "if" statements
testing individual flags. The functionality for handling each flag is
unchanged.

Link: https://lore.kernel.org/r/1622827263-12516-3-git-send-email-mikelley@microsoft.com
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: storvsc: Update error logging
Michael Kelley [Fri, 4 Jun 2021 17:21:02 +0000 (10:21 -0700)]
scsi: storvsc: Update error logging

When an I/O error is reported by the underlying Hyper-V host, current code
provides details only when the logging level is set to WARN, making it more
difficult to diagnose problems in live customer situations. Fix this by
reporting details at ERROR level, which is the default.  Also add more
information, including the Hyper-V error code, and the tag # so that the
message can be matched with messages at the SCSI and blk-mq levels.

Also, sense information logging is inconsistent and duplicative. The
existence of sense info is first logged at WARN level, and then full sense
info is logged at ERROR level. Fix this by removing the logging of the
existence of sense info, and change the logging of full sense info to WARN
level in favor of letting the generic SCSI layer handle such logging. With
the change to WARN level, it's no longer necessary to filter out as noise
any NOT READY sense info generated by the virtual DVD device.

Link: https://lore.kernel.org/r/1622827263-12516-2-git-send-email-mikelley@microsoft.com
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: storvsc: Miscellaneous code cleanups
Michael Kelley [Fri, 4 Jun 2021 17:21:01 +0000 (10:21 -0700)]
scsi: storvsc: Miscellaneous code cleanups

As general cleanup and in preparation for subsequent patches:

 - Use min() instead of open coding.

 - Use set_host_byte() instead of open coding access to scsi_status
   field.

 - Collapse nested "if" statements to reduce indentation.

 - Fix other indentation.

 - Remove extra blank lines.

No functional changes.

[mkp: dropped status_byte() which no longer exists]

Link: https://lore.kernel.org/r/1622827263-12516-1-git-send-email-mikelley@microsoft.com
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: lpfc: vmid: Introduce VMID in I/O path
Gaurav Srivastava [Tue, 8 Jun 2021 04:35:56 +0000 (10:05 +0530)]
scsi: lpfc: vmid: Introduce VMID in I/O path

Introduce the VMID in the I/O path. Check if the VMID is enabled and if
the I/O belongs to a VM or not.

Link: https://lore.kernel.org/r/20210608043556.274139-14-muneendra.kumar@broadcom.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Gaurav Srivastava <gaurav.srivastava@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Muneendra Kumar <muneendra.kumar@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: lpfc: vmid: Add QFPA and VMID timeout check in worker thread
Gaurav Srivastava [Tue, 8 Jun 2021 04:35:55 +0000 (10:05 +0530)]
scsi: lpfc: vmid: Add QFPA and VMID timeout check in worker thread

Add a periodic check for issuing of QFPA command and VMID timeout in the
worker thread. The inactivity timeout check is added via the timer
function.

Link: https://lore.kernel.org/r/20210608043556.274139-13-muneendra.kumar@broadcom.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Gaurav Srivastava <gaurav.srivastava@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Muneendra Kumar <muneendra.kumar@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: lpfc: vmid: Timeout implementation for VMID
Gaurav Srivastava [Tue, 8 Jun 2021 04:35:54 +0000 (10:05 +0530)]
scsi: lpfc: vmid: Timeout implementation for VMID

Implement timeout functionality for the VMID. After the set time period of
inactivity, the VMID is deregistered from the switch.

Link: https://lore.kernel.org/r/20210608043556.274139-12-muneendra.kumar@broadcom.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Gaurav Srivastava <gaurav.srivastava@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Muneendra Kumar <muneendra.kumar@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: lpfc: vmid: Append the VMID to the wqe before sending
Gaurav Srivastava [Tue, 8 Jun 2021 04:35:53 +0000 (10:05 +0530)]
scsi: lpfc: vmid: Append the VMID to the wqe before sending

Add the VMID in wqe before sending out the request. The type of VMID
depends on the configured type and is checked before being appended.

Link: https://lore.kernel.org/r/20210608043556.274139-11-muneendra.kumar@broadcom.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Gaurav Srivastava <gaurav.srivastava@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Muneendra Kumar <muneendra.kumar@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: lpfc: vmid: Implement CT commands for appid
Gaurav Srivastava [Tue, 8 Jun 2021 04:35:52 +0000 (10:05 +0530)]
scsi: lpfc: vmid: Implement CT commands for appid

Implement CT commands for registering and deregistering the appid for the
application. Also, a small change in decrementing the ndlp ref counter has
been added.

Link: https://lore.kernel.org/r/20210608043556.274139-10-muneendra.kumar@broadcom.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Gaurav Srivastava <gaurav.srivastava@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Muneendra Kumar <muneendra.kumar@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: lpfc: vmid: Functions to manage VMIDs
Gaurav Srivastava [Tue, 8 Jun 2021 04:35:51 +0000 (10:05 +0530)]
scsi: lpfc: vmid: Functions to manage VMIDs

Implement routines to save, retrieve, and remove the VMIDs from the data
structure. A hash table is used to save the VMIDs and the corresponding
UUIDs associated with the application/VMs.

Link: https://lore.kernel.org/r/20210608043556.274139-9-muneendra.kumar@broadcom.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Gaurav Srivastava <gaurav.srivastava@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Muneendra Kumar <muneendra.kumar@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: lpfc: vmid: Implement ELS commands for appid
Gaurav Srivastava [Tue, 8 Jun 2021 04:35:50 +0000 (10:05 +0530)]
scsi: lpfc: vmid: Implement ELS commands for appid

Implement ELS commands QFPA and UVEM for the priority tagging appid
support.

Link: https://lore.kernel.org/r/20210608043556.274139-8-muneendra.kumar@broadcom.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Gaurav Srivastava <gaurav.srivastava@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Muneendra Kumar <muneendra.kumar@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: lpfc: vmid: Add support for VMID in mailbox command
Gaurav Srivastava [Tue, 8 Jun 2021 04:35:49 +0000 (10:05 +0530)]
scsi: lpfc: vmid: Add support for VMID in mailbox command

Add supporting datastructures for mailbox command which helps in
determining if the firmware supports appid. Allocate resources for VMID at
initialization time and clean them up on removal.

Link: https://lore.kernel.org/r/20210608043556.274139-7-muneendra.kumar@broadcom.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Gaurav Srivastava <gaurav.srivastava@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Muneendra Kumar <muneendra.kumar@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: lpfc: vmid: VMID parameter initialization
Gaurav Srivastava [Tue, 8 Jun 2021 04:35:48 +0000 (10:05 +0530)]
scsi: lpfc: vmid: VMID parameter initialization

Initialize parameters such as type of VMID, max number of VMIDs supported
and timeout value for the VMID registration based on the user input.

Link: https://lore.kernel.org/r/20210608043556.274139-6-muneendra.kumar@broadcom.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Gaurav Srivastava <gaurav.srivastava@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Muneendra Kumar <muneendra.kumar@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: lpfc: vmid: Add datastructure for supporting VMID in lpfc
Gaurav Srivastava [Tue, 8 Jun 2021 04:35:47 +0000 (10:05 +0530)]
scsi: lpfc: vmid: Add datastructure for supporting VMID in lpfc

Add the primary datastructures needed to implement VMID in the lpfc
driver. Maintain the capability, current state, and hash table for the
vmid/appid along with other information. This implementation supports the
two versions of vmid implementation (app header and priority tagging).

Link: https://lore.kernel.org/r/20210608043556.274139-5-muneendra.kumar@broadcom.com
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Gaurav Srivastava <gaurav.srivastava@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Muneendra Kumar <muneendra.kumar@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: nvme: Added a new sysfs attribute appid_store
Muneendra Kumar [Tue, 8 Jun 2021 04:35:46 +0000 (10:05 +0530)]
scsi: nvme: Added a new sysfs attribute appid_store

Add a new sysfs attribute, appid_store, which can be used to set the
application identifier in the blkcg associated with a cgroup id.

Below is the interface provided to set the app_id:

  echo "<cgroupid>:<appid>" >> /sys/class/fc/fc_udev_device/appid_store

  echo "457E:100000109b521d27" >> /sys/class/fc/fc_udev_device/appid_store

Link: https://lore.kernel.org/r/20210608043556.274139-4-muneendra.kumar@broadcom.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Muneendra Kumar <muneendra.kumar@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: blkcg: Add app identifier support for blkcg
Muneendra Kumar [Tue, 8 Jun 2021 04:35:45 +0000 (10:05 +0530)]
scsi: blkcg: Add app identifier support for blkcg

Add a unique application identifier (i.e fc_app_id member) in blkcg. This
allows identification of traffic belonging to an specific both on the host
and in the fabric infrastructure. As an example, this allows the storage
stack to uniquely identify traffic belong to particular virtual machine.

Link: https://lore.kernel.org/r/20210608043556.274139-3-muneendra.kumar@broadcom.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Muneendra Kumar <muneendra.kumar@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: cgroup: Add cgroup_get_from_id()
Muneendra Kumar [Tue, 8 Jun 2021 04:35:44 +0000 (10:05 +0530)]
scsi: cgroup: Add cgroup_get_from_id()

Add a new function, cgroup_get_from_id(), to retrieve the cgroup associated
with a cgroup id. Also export the function cgroup_get_e_css() as this is
needed in blk-cgroup.h.

Link: https://lore.kernel.org/r/20210608043556.274139-2-muneendra.kumar@broadcom.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Muneendra Kumar <muneendra.kumar@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: fc: FDMI enhancement
Javed Hasan [Thu, 3 Jun 2021 12:16:23 +0000 (05:16 -0700)]
scsi: fc: FDMI enhancement

Added RHBA and RPA attributes type and length.

As per FC_GC_7 document section "Table 400 â€“ Attribute Entry Types and
associated Values" ASCII type attributes length can be vary from "4 to 256
byte".  If we keep all RHBA ASCII attributes length 256 then total length
is going upto 2750, which is far more than 2048 (max frame size).

In libfc we do have logic to split FCP commands but not for CT commands.
Practically all version/names get covered with in 64 bytes except OS name,
for that we need 128 bytes.  Hence length of all RBHA ASCII attributes
is reduced to 64 bytes and 128 bytes in case of OS name.

RPA attributes total length is within frame size.

Link: https://lore.kernel.org/r/20210603121623.10084-6-jhasan@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Javed Hasan <jhasan@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: libfc: FDMI enhancements
Javed Hasan [Thu, 3 Jun 2021 12:16:22 +0000 (05:16 -0700)]
scsi: libfc: FDMI enhancements

Add all the attributes for FDMI.

Fall back mechanism is added in between FDMI V2 and FDMI V1 attributes. In
case FDMI get fails for V2 attributes we fall back to V1 attributes.

Link: https://lore.kernel.org/r/20210603121623.10084-5-jhasan@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Javed Hasan <jhasan@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: libfc: Add FDMI-2 attributes
Javed Hasan [Thu, 3 Jun 2021 12:16:21 +0000 (05:16 -0700)]
scsi: libfc: Add FDMI-2 attributes

Add all attributes for RHBA and RPA registration.

Fallback mechanism is added between RBHA V2 and RHBA V1 attributes. In case
RHBA get fails for V2 attributes we fall back to V1 attribute registration.

Link: https://lore.kernel.org/r/20210603121623.10084-4-jhasan@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Javed Hasan <jhasan@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: qedf: Add vendor identifier attribute
Javed Hasan [Thu, 3 Jun 2021 12:16:20 +0000 (05:16 -0700)]
scsi: qedf: Add vendor identifier attribute

Link: https://lore.kernel.org/r/20210603121623.10084-3-jhasan@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Javed Hasan <jhasan@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: libfc: Initialisation of RHBA and RPA attributes
Javed Hasan [Thu, 3 Jun 2021 12:16:19 +0000 (05:16 -0700)]
scsi: libfc: Initialisation of RHBA and RPA attributes

Initialize RHBA and RPA attributes.

Link: https://lore.kernel.org/r/20210603121623.10084-2-jhasan@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Javed Hasan <jhasan@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: libfc: Correct the condition check and invalid argument passed
Javed Hasan [Thu, 3 Jun 2021 10:14:04 +0000 (03:14 -0700)]
scsi: libfc: Correct the condition check and invalid argument passed

Incorrect condition check was leading to data corruption.

Link: https://lore.kernel.org/r/20210603101404.7841-3-jhasan@marvell.com
Fixes: 8fd9efca86d0 ("scsi: libfc: Work around -Warray-bounds warning")
CC: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Javed Hasan <jhasan@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: fc: Correct RHBA attributes length
Javed Hasan [Thu, 3 Jun 2021 10:14:03 +0000 (03:14 -0700)]
scsi: fc: Correct RHBA attributes length

As per the FC-GS-5 specification, attribute lengths of node_name and
manufacturer should in range of "4 to 64 Bytes" only.

Link: https://lore.kernel.org/r/20210603101404.7841-2-jhasan@marvell.com
Fixes: e721eb0616f6 ("scsi: scsi_transport_fc: Match HBA Attribute Length with HBAAPI V2.0 definitions")
CC: stable@vger.kernel.org
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Javed Hasan <jhasan@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: hisi_sas: Speed up error handling when internal abort timeout occurs
Luo Jiaxing [Mon, 7 Jun 2021 09:29:39 +0000 (17:29 +0800)]
scsi: hisi_sas: Speed up error handling when internal abort timeout occurs

If an internal task abort timeout occurs, the controller has developed a
fault, and needs to be reset to be recovered.

When this occurs during error handling, the current policy is to allow
error handling to continue, and the inevitable nexus ha reset will handle
the required reset.

However various steps of error handling need to taken before this happens.
These also involve some level of HW interaction, which will also fail with
various timeouts.

Speed up this process by recording a HW fault bit for an internal abort
timeout - when this is set, just automatically error any HW interaction,
and essentially go straight to clear nexus ha (to reset the controller).

Link: https://lore.kernel.org/r/1623058179-80434-6-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: hisi_sas: Reset controller for internal abort timeout
Luo Jiaxing [Mon, 7 Jun 2021 09:29:38 +0000 (17:29 +0800)]
scsi: hisi_sas: Reset controller for internal abort timeout

If an internal task abort timeout occurs, the controller has developed a
fault, and needs to be reset to be recovered. However if a timeout occurs
during SCSI error handling, issuing a controller reset immediately may
conflict with the error handling.

To handle internal abort in these two scenarios, only queue the reset when
not in an error handling function. In the case of a timeout during error
handling, do nothing and rely on the inevitable ha nexus reset to reset the
controller.

Link: https://lore.kernel.org/r/1623058179-80434-5-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: hisi_sas: Include HZ in timer macros
Luo Jiaxing [Mon, 7 Jun 2021 09:29:37 +0000 (17:29 +0800)]
scsi: hisi_sas: Include HZ in timer macros

Include HZ in timer macros to make the code more concise.

Link: https://lore.kernel.org/r/1623058179-80434-4-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: hisi_sas: Run I_T nexus resets in parallel for clear nexus reset
Luo Jiaxing [Mon, 7 Jun 2021 09:29:36 +0000 (17:29 +0800)]
scsi: hisi_sas: Run I_T nexus resets in parallel for clear nexus reset

For a clear nexus reset operation, the I_T nexus resets are executed
serially for each device. For devices attached through an expander, this
may take 2s per device; so, in total, could take a long time.

Reduce the total time by running the I_T nexus resets in parallel through
async operations.

Link: https://lore.kernel.org/r/1623058179-80434-3-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: hisi_sas: Put a limit of link reset retries
Luo Jiaxing [Mon, 7 Jun 2021 09:29:35 +0000 (17:29 +0800)]
scsi: hisi_sas: Put a limit of link reset retries

If an OOB event is received but the phy still fails to come up, a link
reset will be issued repeatedly at an interval of 20s until the phy comes
up.

Set a limit for link reset issue retries to avoid printing the timeout
message endlessly.

Link: https://lore.kernel.org/r/1623058179-80434-2-git-send-email-john.garry@huawei.com
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: qedi: Fix host removal with running sessions
Mike Christie [Wed, 9 Jun 2021 19:27:09 +0000 (14:27 -0500)]
scsi: qedi: Fix host removal with running sessions

qedi_clear_session_ctx() could race with the in-kernel or userspace driven
recovery/removal and we could access a NULL conn or do a double free.

We should be using iscsi_host_remove() to start the removal process from
the driver. It will start the in-kernel recovery and notify userspace that
the driver's scsi_hosts are being removed. iscsid will then drive the
session removal like is done when the logout command is run. When the
sessions are removed, iscsi_host_remove() will return so qedi can finish
knowing there are no running sessions and no new sessions will be allowed.

This also fixes an issue where we check for a NULL conn after already
accessing it introduced in commit 27e986289e73 ("scsi: iscsi: Drop suspend
calls from ep_disconnect") by just removing the function completely.

Link: https://lore.kernel.org/r/20210609192709.5094-1-michael.christie@oracle.com
Fixes: 27e986289e73 ("scsi: iscsi: Drop suspend calls from ep_disconnect")
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: mpi3mr: Fix error handling in mpi3mr_setup_isr()
Dan Carpenter [Wed, 9 Jun 2021 09:27:14 +0000 (12:27 +0300)]
scsi: mpi3mr: Fix error handling in mpi3mr_setup_isr()

The pci_alloc_irq_vectors_affinity() function returns negative error codes
or it returns a number between the minimum vectors (1 in this case) and
max_vectors.  It won't return zero.  Because "i" is a u16 then the error
handling won't work.  And also if it did work the error code was not set.

Really "max_vectors" can be an int as well because we're doing a min_t() on
int type.  The other change is that it's better to remove unnecessary
initialization so that static checkers can warn us if there are ever
uninitialized variable bugs introduced in the future.

I changed the error code from -1 (-EPERM) if the kmalloc() failed to
-ENOMEM.  And on success path I changed it from "return retval;" to "return
0;" which shouldn't affect the compiled code but makes it more readable.

Link: https://lore.kernel.org/r/YMCJcnmSI4kOIyv/@mwanda
Fixes: 824a156633df ("scsi: mpi3mr: Base driver code")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: mpi3mr: Delete unnecessary NULL check
Dan Carpenter [Wed, 9 Jun 2021 09:26:02 +0000 (12:26 +0300)]
scsi: mpi3mr: Delete unnecessary NULL check

The "mrioc->intr_info" pointer can't be NULL, but if it could then the
second iteration through the loop would Oops.  Let's delete the confusing
and impossible NULL check.

Link: https://lore.kernel.org/r/YMCJKgykDYtyvY44@mwanda
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: mpi3mr: Fix a double free
Tomas Henzl [Tue, 8 Jun 2021 14:57:12 +0000 (16:57 +0200)]
scsi: mpi3mr: Fix a double free

Fix a double free, scsi_tgt_priv_data will be freed in
mpi3mr_target_destroy() so remove the kfree() from mpi3mr_target_alloc().
I've also removed few unneeded initialisations.

Link: https://lore.kernel.org/r/20210608145712.16386-1-thenzl@redhat.com
Acked-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: ufs: core: Fix a possible use before initialization case
Can Guo [Wed, 9 Jun 2021 08:24:00 +0000 (01:24 -0700)]
scsi: ufs: core: Fix a possible use before initialization case

In ufshcd_exec_dev_cmd(), if error happens before lrpb is initialized, then
we should bail out instead of letting trace record the error.

Link: https://lore.kernel.org/r/1623227044-22635-1-git-send-email-cang@codeaurora.org
Fixes: a45f937110fa ("scsi: ufs: Optimize host lock on transfer requests send/compl paths")
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: ufs: core: Use UPIU query trace in devman_upiu_cmd()
Bean Huo [Mon, 31 May 2021 10:43:08 +0000 (12:43 +0200)]
scsi: ufs: core: Use UPIU query trace in devman_upiu_cmd()

Since devman_upiu_cmd() is not COMMAND UPIU, and doesn't have CDB, it is
better to use UPIU query trace, which provides more helpful information for
issue troubleshooting.

Link: https://lore.kernel.org/r/20210531104308.391842-5-huobean@gmail.com
Reviewed-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: ufs: core: Capture command trace only for the cmd != NULL case
Bean Huo [Mon, 31 May 2021 10:43:07 +0000 (12:43 +0200)]
scsi: ufs: core: Capture command trace only for the cmd != NULL case

For the query request, we already have query_trace, but in
ufshcd_send_command(), there will add two more redundant traces. Since
lrbp->cmd is NULL in the query request, the two trace events below provide
nothing except the tag and DB. Instead of letting them take up the limited
trace ring buffer, it’s better not to print these traces in case of cmd ==
NULL.

ufshcd_command: send_req: ff3b0000.ufs: tag: 28, DB: 0x0, size: -1, IS: 0, LBA: 18446744073709551615, opcode: 0x0 (0x0), group_id: 0x0
ufshcd_command: dev_complete: ff3b0000.ufs: tag: 28, DB: 0x0, size: -1, IS: 0, LBA: 18446744073709551615, opcode: 0x0 (0x0), group_id: 0x0

Link: https://lore.kernel.org/r/20210531104308.391842-4-huobean@gmail.com
Reviewed-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: ufs: core: Let UPIU completion trace print RSP UPIU header
Bean Huo [Mon, 31 May 2021 10:43:06 +0000 (12:43 +0200)]
scsi: ufs: core: Let UPIU completion trace print RSP UPIU header

The current UPIU completion event trace still prints the COMMAND UPIU
header, rather than the RSP UPIU header. This makes UPIU command trace
useless in problem shooting in case we receive a trace log from the
customer/field.

There are two important fields in RSP UPIU:

 1. The response field, which indicates the UFS defined overall success or
    failure of the series of Command, Data and RESPONSE UPIU’s that make up
    the execution of a task.

2. The Status field, which contains the command set specific status for a
    specific command issued by the initiator device.

Before this commit, the UPIU paired trace events:

ufshcd_upiu: send_req: fe3b0000.ufs: HDR:01 20 00 1c 00 00 00 00 00 00 00 00, CDB:3b e1 00 00 00 00 00 00 30 00 00 00 00 00 00 00
ufshcd_upiu: complete_rsp: fe3b0000.ufs: HDR:01 20 00 1c 00 00 00 00 00 00 00 00, CDB:3b e1 00 00 00 00 00 00 30 00 00 00 00 00 00 00

After this commit:

ufshcd_upiu: send_req: fe3b0000.ufs: HDR:01 20 00 1c 00 00 00 00 00 00 00 00, CDB:3b e1 00 00 00 00 00 00 30 00 00 00 00 00 00 00
ufshcd_upiu: complete_rsp: fe3b0000.ufs: HDR:21 00 00 1c 00 00 00 00 00 00 00 00, CDB:3b e1 00 00 00 00 00 00 30 00 00 00 00 00 00 00

Link: https://lore.kernel.org/r/20210531104308.391842-3-huobean@gmail.com
Reviewed-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: ufs: core: Clean up ufshcd_add_command_trace()
Bean Huo [Mon, 31 May 2021 10:43:05 +0000 (12:43 +0200)]
scsi: ufs: core: Clean up ufshcd_add_command_trace()

To consistent with trace event print, convert the value of the variable
'lba' from a block layer sector address to a logical block adress.

Link: https://lore.kernel.org/r/20210531104308.391842-2-huobean@gmail.com
Suggested-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: ufs: core: Remove repeated word in comment
Keoseong Park [Fri, 4 Jun 2021 02:40:38 +0000 (11:40 +0900)]
scsi: ufs: core: Remove repeated word in comment

Remove repeated word "for" in comment.

Link: https://lore.kernel.org/r/1891546521.01622777101796.JavaMail.epsvc@epcpadp3
Signed-off-by: Keoseong Park <keosung.park@samsung.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: mpi3mr: Fix fall-through warning for Clang
Gustavo A. R. Silva [Fri, 4 Jun 2021 02:35:30 +0000 (21:35 -0500)]
scsi: mpi3mr: Fix fall-through warning for Clang

In preparation to enable -Wimplicit-fallthrough for Clang, fix a
fall-through warning by explicitly adding a break statement instead of just
letting the code fall through to the next case.

Link: https://github.com/KSPP/linux/issues/115
Link: https://lore.kernel.org/r/20210604023530.GA180997@embeddedor
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: NCR5380: Fix fall-through warning for Clang
Gustavo A. R. Silva [Fri, 4 Jun 2021 02:27:52 +0000 (21:27 -0500)]
scsi: NCR5380: Fix fall-through warning for Clang

In preparation to enable -Wimplicit-fallthrough for Clang, fix a
fall-through warning by replacing a /* fallthrough */ comment with the new
pseudo-keyword macro fallthrough;

Link: https://github.com/KSPP/linux/issues/115
Link: https://lore.kernel.org/r/20210604022752.GA168289@embeddedor
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: ufs: Utilize Transfer Request List Completion Notification Register
Can Guo [Mon, 24 May 2021 08:36:58 +0000 (01:36 -0700)]
scsi: ufs: Utilize Transfer Request List Completion Notification Register

By reading the UTP Transfer Request List Completion Notification Register,
which is added in UFSHCI Ver 3.0, SW can easily get the compeleted transfer
requests. Thus, SW can get rid of host lock, which is used to synchronize
the tr_doorbell and outstanding_reqs, on transfer requests dispatch and
completion paths. This can further benefit random read/write performance.

Link: https://lore.kernel.org/r/1621845419-14194-4-git-send-email-cang@codeaurora.org
Cc: Stanley Chu <stanley.chu@mediatek.com>
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Co-developed-by: Asutosh Das <asutoshd@codeaurora.org>
Signed-off-by: Asutosh Das <asutoshd@codeaurora.org>
Signed-off-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: ufs: Optimize host lock on transfer requests send/compl paths
Can Guo [Mon, 24 May 2021 08:36:57 +0000 (01:36 -0700)]
scsi: ufs: Optimize host lock on transfer requests send/compl paths

Current UFS IRQ handler is completely wrapped by host lock, and because
ufshcd_send_command() is also protected by host lock, when IRQ handler
fires, not only the CPU running the IRQ handler cannot send new requests,
the rest CPUs can neither. Move the host lock wrapping the IRQ handler into
specific branches, i.e., ufshcd_uic_cmd_compl(), ufshcd_check_errors(),
ufshcd_tmc_handler() and ufshcd_transfer_req_compl(). Meanwhile, to further
reduce occpuation of host lock in ufshcd_transfer_req_compl(), host lock is
no longer required to call __ufshcd_transfer_req_compl(). As per test, the
optimization can bring considerable gain to random read/write performance.

Link: https://lore.kernel.org/r/1621845419-14194-3-git-send-email-cang@codeaurora.org
Cc: Stanley Chu <stanley.chu@mediatek.com>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Co-developed-by: Asutosh Das <asutoshd@codeaurora.org>
Signed-off-by: Asutosh Das <asutoshd@codeaurora.org>
Signed-off-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: ufs: Remove a redundant command completion logic in error handler
Can Guo [Mon, 24 May 2021 08:36:56 +0000 (01:36 -0700)]
scsi: ufs: Remove a redundant command completion logic in error handler

ufshcd_host_reset_and_restore() anyways completes all pending requests
before starts re-probing, so there is no need to complete the command on
the highest bit in tr_doorbell in advance.

Link: https://lore.kernel.org/r/1621845419-14194-2-git-send-email-cang@codeaurora.org
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Stanley Chu <stanley.chu@mediatek.com>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Can Guo <cang@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: scsi_dh_alua: Fix signedness bug in alua_rtpg()
Dan Carpenter [Thu, 3 Jun 2021 12:33:20 +0000 (15:33 +0300)]
scsi: scsi_dh_alua: Fix signedness bug in alua_rtpg()

The "retval" variable needs to be signed for the error handling to work.

Link: https://lore.kernel.org/r/YLjMEAFNxOas1mIp@mwanda
Fixes: 7e26e3ea0287 ("scsi: scsi_dh_alua: Check for negative result value")
Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: ufs: core: Remove irrelevant reference to non-existing doc
Avri Altman [Thu, 3 Jun 2021 12:22:09 +0000 (15:22 +0300)]
scsi: ufs: core: Remove irrelevant reference to non-existing doc

Remove all references to the description of __ufshcd_wl_{suspend,resume} as
no such description exist.

Fixes: b294ff3e3449 (scsi: ufs: core: Enable power management for wlun)
Link: https://lore.kernel.org/r/20210603122209.635799-1-avri.altman@wdc.com
Signed-off-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: fcoe: Statically initialize flogi_maddr
Kees Cook [Wed, 2 Jun 2021 18:00:00 +0000 (11:00 -0700)]
scsi: fcoe: Statically initialize flogi_maddr

In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memcpy() avoid using an inline const buffer
argument and instead just statically initialize the destination array
directly.

Link: https://lore.kernel.org/r/20210602180000.3326448-1-keescook@chromium.org
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: qedf: Update the max_id value in host structure
Saurav Kashyap [Wed, 2 Jun 2021 10:46:53 +0000 (03:46 -0700)]
scsi: qedf: Update the max_id value in host structure

host->max_id defines the maximum target id that the SCSI midlayer will
attempt to manually scan. The default is 8. Update the value to the max
sessions the driver supports.

[mkp: applied by hand]

Link: https://lore.kernel.org/r/20210602104653.17278-1-jhasan@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Saurav Kashyap <skashyap@marvell.com>
Signed-off-by: Javed Hasan <jhasan@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: core: Change the type of the second argument of scsi_host_complete_all_commands()
Bart Van Assche [Mon, 24 May 2021 02:54:57 +0000 (19:54 -0700)]
scsi: core: Change the type of the second argument of scsi_host_complete_all_commands()

Allow the compiler to verify the type of the second argument passed to
scsi_host_complete_all_commands().

Link: https://lore.kernel.org/r/20210524025457.11299-4-bvanassche@acm.org
Cc: Hannes Reinecke <hare@suse.com>
Cc: John Garry <john.garry@huawei.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: core: Introduce enums for the SAM and host status codes
Bart Van Assche [Mon, 24 May 2021 02:54:56 +0000 (19:54 -0700)]
scsi: core: Introduce enums for the SAM and host status codes

Make it possible for the compiler to verify whether SAM and host
status codes are used correctly.

[mkp: resolve conflicts with Hannes' SCSI result series]

Link: https://lore.kernel.org/r/20210524025457.11299-3-bvanassche@acm.org
Cc: Hannes Reinecke <hare@suse.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: libsas: Introduce more SAM status code aliases in enum exec_status
Bart Van Assche [Mon, 24 May 2021 02:54:55 +0000 (19:54 -0700)]
scsi: libsas: Introduce more SAM status code aliases in enum exec_status

This patch prepares for converting SAM status codes into an enum. Without
this patch converting SAM status codes into an enumeration type would
trigger complaints about enum type mismatches for the SAS code.

Link: https://lore.kernel.org/r/20210524025457.11299-2-bvanassche@acm.org
Cc: Hannes Reinecke <hare@suse.com>
Cc: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Cc: Jason Yan <yanaijie@huawei.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoMerge branch '5.14/scsi-result' into 5.14/scsi-staging
Martin K. Petersen [Wed, 2 Jun 2021 05:33:12 +0000 (01:33 -0400)]
Merge branch '5.14/scsi-result' into 5.14/scsi-staging

Include Hannes' SCSI command result rework in the staging branch.

[mkp: remove DRIVER_SENSE from mpi3mr]

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: qedi: Wake up if cmd_cleanup_req is set
Mike Christie [Tue, 25 May 2021 18:18:21 +0000 (13:18 -0500)]
scsi: qedi: Wake up if cmd_cleanup_req is set

If we got a response then we should always wake up the conn. For both the
cmd_cleanup_req == 0 or cmd_cleanup_req > 0, we shouldn't dig into
iscsi_itt_to_task because we don't know what the upper layers are doing.

We can also remove the qedi_clear_task_idx call here because once we signal
success libiscsi will loop over the affected commands and end up calling
the cleanup_task callout which will release it.

Link: https://lore.kernel.org/r/20210525181821.7617-29-michael.christie@oracle.com
Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: qedi: Complete TMF works before disconnect
Mike Christie [Tue, 25 May 2021 18:18:20 +0000 (13:18 -0500)]
scsi: qedi: Complete TMF works before disconnect

We need to make sure that abort and reset completion work has completed
before ep_disconnect returns. After ep_disconnect we can't manipulate
cmds because libiscsi will call conn_stop and take onwership.

We are trying to make sure abort work and reset completion work has
completed before we do the cmd clean up in ep_disconnect. The problem is
that:

 1. the work function sets the QEDI_CONN_FW_CLEANUP bit, so if the work was
    still pending we would not see the bit set. We need to do this before
    the work is queued.

 2. If we had multiple works queued then we could break from the loop in
    qedi_ep_disconnect early because when abort work 1 completes it could
    clear QEDI_CONN_FW_CLEANUP. qedi_ep_disconnect could then see that
    before work 2 has run.

 3. A TMF reset completion work could run after ep_disconnect starts
    cleaning up cmds via qedi_clearsq. ep_disconnect's call to qedi_clearsq
    -> qedi_cleanup_all_io would might think it's done cleaning up cmds,
    but the reset completion work could still be running. We then return
    from ep_disconnect while still doing cleanup.

This replaces the bit with a counter to track the number of queued TMF
works, and adds a bool to prevent new works from starting from the
completion path once a ep_disconnect starts.

Link: https://lore.kernel.org/r/20210525181821.7617-28-michael.christie@oracle.com
Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: qedi: Pass send_iscsi_tmf task to abort
Mike Christie [Tue, 25 May 2021 18:18:19 +0000 (13:18 -0500)]
scsi: qedi: Pass send_iscsi_tmf task to abort

qedi_abort_work knows what task to abort so just pass it to send_iscsi_tmf.

Link: https://lore.kernel.org/r/20210525181821.7617-27-michael.christie@oracle.com
Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: qedi: Fix cleanup session block/unblock use
Mike Christie [Tue, 25 May 2021 18:18:18 +0000 (13:18 -0500)]
scsi: qedi: Fix cleanup session block/unblock use

Drivers shouldn't be calling block/unblock session for cmd cleanup because
the functions can change the session state from under libiscsi.  This adds
a new a driver level bit so it can block all I/O the host while it drains
the card.

Link: https://lore.kernel.org/r/20210525181821.7617-26-michael.christie@oracle.com
Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: qedi: Fix TMF session block/unblock use
Mike Christie [Tue, 25 May 2021 18:18:17 +0000 (13:18 -0500)]
scsi: qedi: Fix TMF session block/unblock use

Drivers shouldn't be calling block/unblock session for tmf handling because
the functions can change the session state from under libiscsi.
iscsi_queuecommand's call to iscsi_prep_scsi_cmd_pdu->
iscsi_check_tmf_restrictions will prevent new cmds from being sent to qedi
after we've started handling a TMF. So we don't need to try and block it in
the driver, and we can remove these block calls.

Link: https://lore.kernel.org/r/20210525181821.7617-25-michael.christie@oracle.com
Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: qedi: Use GFP_NOIO for TMF allocation
Mike Christie [Tue, 25 May 2021 18:18:16 +0000 (13:18 -0500)]
scsi: qedi: Use GFP_NOIO for TMF allocation

We run from a workqueue with no locks held so use GFP_NOIO.

Link: https://lore.kernel.org/r/20210525181821.7617-24-michael.christie@oracle.com
Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: qedi: Fix TMF tid allocation
Mike Christie [Tue, 25 May 2021 18:18:15 +0000 (13:18 -0500)]
scsi: qedi: Fix TMF tid allocation

qedi_iscsi_abort_work and qedi_tmf_work both allocate a tid then call
qedi_send_iscsi_tmf which also allocates a tid. This removes the tid
allocation from the callers.

Link: https://lore.kernel.org/r/20210525181821.7617-23-michael.christie@oracle.com
Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: qedi: Fix use after free during abort cleanup
Mike Christie [Tue, 25 May 2021 18:18:14 +0000 (13:18 -0500)]
scsi: qedi: Fix use after free during abort cleanup

If qedi_tmf_work's qedi_wait_for_cleanup_request call times out we will
also force the clean up of the qedi_work_map but
qedi_process_cmd_cleanup_resp could still be accessing the qedi_cmd.

To fix this issue we extend where we hold the tmf_work_lock and back_lock
so the qedi_process_cmd_cleanup_resp access is serialized with the cleanup
done in qedi_tmf_work and any completion handling for the iscsi_task.

Link: https://lore.kernel.org/r/20210525181821.7617-22-michael.christie@oracle.com
Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: qedi: Fix race during abort timeouts
Mike Christie [Tue, 25 May 2021 18:18:13 +0000 (13:18 -0500)]
scsi: qedi: Fix race during abort timeouts

If the SCSI cmd completes after qedi_tmf_work calls iscsi_itt_to_task then
the qedi qedi_cmd->task_id could be freed and used for another cmd. If we
then call qedi_iscsi_cleanup_task with that task_id we will be cleaning up
the wrong cmd.

Wait to release the task_id until the last put has been done on the
iscsi_task. Because libiscsi grabs a ref to the task when sending the
abort, we know that for the non-abort timeout case that the task_id we are
referencing is for the cmd that was supposed to be aborted.

A latter commit will fix the case where the abort times out while we are
running qedi_tmf_work.

Link: https://lore.kernel.org/r/20210525181821.7617-21-michael.christie@oracle.com
Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: qedi: Fix null ref during abort handling
Mike Christie [Tue, 25 May 2021 18:18:12 +0000 (13:18 -0500)]
scsi: qedi: Fix null ref during abort handling

If qedi_process_cmd_cleanup_resp finds the cmd it frees the work and sets
list_tmf_work to NULL, so qedi_tmf_work should check if list_tmf_work is
non-NULL when it wants to force cleanup.

Link: https://lore.kernel.org/r/20210525181821.7617-20-michael.christie@oracle.com
Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: iscsi: Move pool freeing
Mike Christie [Tue, 25 May 2021 18:18:11 +0000 (13:18 -0500)]
scsi: iscsi: Move pool freeing

This doesn't fix any bugs, but it makes more sense to free the pool after
we have removed the session. At that time we know nothing is touching any
of the session fields, because all devices have been removed and scans are
stopped.

Link: https://lore.kernel.org/r/20210525181821.7617-19-michael.christie@oracle.com
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: iscsi: Hold task ref during TMF timeout handling
Mike Christie [Tue, 25 May 2021 18:18:10 +0000 (13:18 -0500)]
scsi: iscsi: Hold task ref during TMF timeout handling

For aborts, qedi needs to cleanup the FW then send the TMF from a worker
thread. While it's doing these the cmd could complete normally and the TMF
could time out. libiscsi would then complete the iscsi_task which will call
into the driver to cleanup the driver level resources while it still might
be accessing them for the cleanup/abort.

This has iscsi_eh_abort keep the iscsi_task ref if the TMF times out, so
qedi does not have to worry about if the task is being freed while in use
and does not need to get its own ref.

Link: https://lore.kernel.org/r/20210525181821.7617-18-michael.christie@oracle.com
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: iscsi: Flush block work before unblock
Mike Christie [Tue, 25 May 2021 18:18:09 +0000 (13:18 -0500)]
scsi: iscsi: Flush block work before unblock

We set the max_active iSCSI EH works to 1, so all work is going to execute
in order by default. However, userspace can now override this in sysfs. If
max_active > 1, we can end up with the block_work on CPU1 and
iscsi_unblock_session running the unblock_work on CPU2 and the session and
target/device state will end up out of sync with each other.

This adds a flush of the block_work in iscsi_unblock_session.

Link: https://lore.kernel.org/r/20210525181821.7617-17-michael.christie@oracle.com
Fixes: 1d726aa6ef57 ("scsi: iscsi: Optimize work queue flush use")
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: iscsi: Fix completion check during abort races
Mike Christie [Tue, 25 May 2021 18:18:08 +0000 (13:18 -0500)]
scsi: iscsi: Fix completion check during abort races

We have a ref to the task being aborted, so SCp.ptr will never be NULL. We
need to use iscsi_task_is_completed to check for the completed state.

Link: https://lore.kernel.org/r/20210525181821.7617-16-michael.christie@oracle.com
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: iscsi: Fix shost->max_id use
Mike Christie [Tue, 25 May 2021 18:18:07 +0000 (13:18 -0500)]
scsi: iscsi: Fix shost->max_id use

The iscsi offload drivers are setting the shost->max_id to the max number
of sessions they support. The problem is that max_id is not the max number
of targets but the highest identifier the targets can have. To use it to
limit the number of targets we need to set it to max sessions - 1, or we
can end up with a session we might not have preallocated resources for.

Link: https://lore.kernel.org/r/20210525181821.7617-15-michael.christie@oracle.com
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: iscsi: Fix conn use after free during resets
Mike Christie [Tue, 25 May 2021 18:18:06 +0000 (13:18 -0500)]
scsi: iscsi: Fix conn use after free during resets

If we haven't done a unbind target call we can race where
iscsi_conn_teardown wakes up the EH thread and then frees the conn while
those threads are still accessing the conn ehwait.

We can only do one TMF per session so this just moves the TMF fields from
the conn to the session. We can then rely on the
iscsi_session_teardown->iscsi_remove_session->__iscsi_unbind_session call
to remove the target and it's devices, and know after that point there is
no device or scsi-ml callout trying to access the session.

Link: https://lore.kernel.org/r/20210525181821.7617-14-michael.christie@oracle.com
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: iscsi: Get ref to conn during reset handling
Mike Christie [Tue, 25 May 2021 18:18:05 +0000 (13:18 -0500)]
scsi: iscsi: Get ref to conn during reset handling

The comment in iscsi_eh_session_reset is wrong and we don't wait for the
EH to complete before tearing down the conn. This has us get a ref to the
conn when we are not holding the eh_mutex/frwd_lock so it does not get
freed from under us.

Link: https://lore.kernel.org/r/20210525181821.7617-13-michael.christie@oracle.com
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: iscsi: Have abort handler get ref to conn
Mike Christie [Tue, 25 May 2021 18:18:04 +0000 (13:18 -0500)]
scsi: iscsi: Have abort handler get ref to conn

If SCSI midlayer is aborting a task when we are tearing down the conn we
could free the conn while the abort thread is accessing the conn. This has
the abort handler get a ref to the conn so it won't be freed from under it.

Note: this is not needed for device/target reset because we are holding the
eh_mutex when accessing the conn.

Link: https://lore.kernel.org/r/20210525181821.7617-12-michael.christie@oracle.com
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: iscsi: Add iscsi_cls_conn refcount helpers
Mike Christie [Tue, 25 May 2021 18:18:03 +0000 (13:18 -0500)]
scsi: iscsi: Add iscsi_cls_conn refcount helpers

There are a couple places where we could free the iscsi_cls_conn while it's
still in use. This adds some helpers to get/put a refcount on the struct
and converts an exiting user. Subsequent commits will then use the helpers
to fix 2 bugs in the eh code.

Link: https://lore.kernel.org/r/20210525181821.7617-11-michael.christie@oracle.com
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: iscsi: iscsi_tcp: Start socket shutdown during conn stop
Mike Christie [Tue, 25 May 2021 18:18:02 +0000 (13:18 -0500)]
scsi: iscsi: iscsi_tcp: Start socket shutdown during conn stop

Make sure the conn socket shutdown starts before we start the timer to fail
commands to upper layers.

Link: https://lore.kernel.org/r/20210525181821.7617-10-michael.christie@oracle.com
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: iscsi: iscsi_tcp: Set no linger
Mike Christie [Tue, 25 May 2021 18:18:01 +0000 (13:18 -0500)]
scsi: iscsi: iscsi_tcp: Set no linger

Userspace (open-iscsi based tools at least) sets no linger on the socket to
prevent stale data from being sent. However, with the in-kernel cleanup if
userspace is not up the sockfd_put will release the socket without having
set that sockopt.

iscsid sets that opt at socket close time, but it seems ok to set this at
setup time in the kernel for all tools.

Link: https://lore.kernel.org/r/20210525181821.7617-9-michael.christie@oracle.com
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: iscsi: Fix in-kernel conn failure handling
Mike Christie [Tue, 25 May 2021 18:18:00 +0000 (13:18 -0500)]
scsi: iscsi: Fix in-kernel conn failure handling

Commit 0ab710458da1 ("scsi: iscsi: Perform connection failure entirely in
kernel space") has the following regressions/bugs that this patch fixes:

1. It can return cmds to upper layers like dm-multipath where that can
retry them. After they are successful the fs/app can send new I/O to the
same sectors, but we've left the cmds running in FW or in the net layer.
We need to be calling ep_disconnect if userspace is not up.

This patch only fixes the issue for offload drivers. iscsi_tcp will be
fixed in separate commit because it doesn't have a ep_disconnect call.

2. The drivers that implement ep_disconnect expect that it's called before
conn_stop. Besides crashes, if the cleanup_task callout is called before
ep_disconnect it might free up driver/card resources for session1 then they
could be allocated for session2. But because the driver's ep_disconnect is
not called it has not cleaned up the firmware so the card is still using
the resources for the original cmd.

3. The stop_conn_work_fn can run after userspace has done its recovery and
we are happily using the session. We will then end up with various bugs
depending on what is going on at the time.

We may also run stop_conn_work_fn late after userspace has called stop_conn
and ep_disconnect and is now going to call start/bind conn. If
stop_conn_work_fn runs after bind but before start, we would leave the conn
in a unbound but sort of started state where IO might be allowed even
though the drivers have been set in a state where they no longer expect
I/O.

4. Returning -EAGAIN in iscsi_if_destroy_conn if we haven't yet run the in
kernel stop_conn function is breaking userspace. We should have been doing
this for the caller.

Link: https://lore.kernel.org/r/20210525181821.7617-8-michael.christie@oracle.com
Fixes: 0ab710458da1 ("scsi: iscsi: Perform connection failure entirely in kernel space")
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: iscsi: Rel ref after iscsi_lookup_endpoint()
Mike Christie [Tue, 25 May 2021 18:17:59 +0000 (13:17 -0500)]
scsi: iscsi: Rel ref after iscsi_lookup_endpoint()

Subsequent commits allow the kernel to do ep_disconnect. In that case we
will have to get a proper refcount on the ep so one thread does not delete
it from under another.

Link: https://lore.kernel.org/r/20210525181821.7617-7-michael.christie@oracle.com
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: iscsi: Use system_unbound_wq for destroy_work
Mike Christie [Tue, 25 May 2021 18:17:58 +0000 (13:17 -0500)]
scsi: iscsi: Use system_unbound_wq for destroy_work

Use the system_unbound_wq for async session destruction. We don't need a
dedicated workqueue for async session destruction because:

 1. perf does not seem to be an issue since we only allow 1 active work.

 2. it does not have deps with other system works and we can run them in
    parallel with each other.

Link: https://lore.kernel.org/r/20210525181821.7617-6-michael.christie@oracle.com
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
3 years agoscsi: iscsi: Force immediate failure during shutdown
Mike Christie [Tue, 25 May 2021 18:17:57 +0000 (13:17 -0500)]
scsi: iscsi: Force immediate failure during shutdown

If the system is not up, we can just fail immediately since iscsid is not
going to ever answer our netlink events. We are already setting the
recovery_tmo to 0, but by passing stop_conn STOP_CONN_TERM we never will
block the session and start the recovery timer, because for that flag
userspace will do the unbind and destroy events which would remove the
devices and wake up and kill the eh.

Since the conn is dead and the system is going dowm this just has us use
STOP_CONN_RECOVER with recovery_tmo=0 so we fail immediately. However, if
the user has set the recovery_tmo=-1 we let the system hang like they
requested since they might have used that setting for specific reasons
(one known reason is for buggy cluster software).

Link: https://lore.kernel.org/r/20210525181821.7617-5-michael.christie@oracle.com
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>