Steffen Maier [Thu, 17 May 2018 17:15:04 +0000 (19:15 +0200)]
scsi: zfcp: cleanup indentation for posting FC events
I just happened to see the function header indentation of
zfcp_fc_enqueue_event() and I picked some more from checkpatch:
$ checkpatch.pl --strict -f drivers/s390/scsi/zfcp_fc.c
...
CHECK: Alignment should match open parenthesis
#113: FILE: drivers/s390/scsi/zfcp_fc.c:113:
+ fc_host_post_event(adapter->scsi_host, fc_get_event_number(),
+ event->code, event->data);
CHECK: Blank lines aren't necessary before a close brace '}'
#118: FILE: drivers/s390/scsi/zfcp_fc.c:118:
+
+}
...
The change complements v2.6.36 commit
2d1e547f7523 ("[SCSI] zfcp: Post
events through FC transport class").
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Steffen Maier [Thu, 17 May 2018 17:15:03 +0000 (19:15 +0200)]
scsi: zfcp: support SCSI_ADAPTER_RESET via scsi_host sysfs attribute host_reset
Make use of feature introduced with v3.2 commit
294436914454 ("[SCSI] scsi:
Added support for adapter and firmware reset"). The common code interface
was introduced for commit
95d31262b3c1 ("[SCSI] qla4xxx: Added support for
adapter and firmware reset").
$ echo adapter > /sys/class/scsi_host/host<N>/host_reset
Example trace record formatted with zfcpdbf from s390-tools:
Timestamp : ...
Area : REC
Subarea : 00
Level : 1
Exception : -
CPU ID : ..
Caller : 0x...
Record ID : 1 ZFCP_DBF_REC_TRIG
Tag : scshr_y SCSI sysfs host_reset yes
LUN : 0xffffffffffffffff none (invalid)
WWPN : 0x0000000000000000 none (invalid)
D_ID : 0x00000000 none (invalid)
Adapter status : 0x4500050b
Port status : 0x00000000 none (invalid)
LUN status : 0x00000000 none (invalid)
Ready count : 0x00000001
Running count : 0x00000000
ERP want : 0x04 ZFCP_ERP_ACTION_REOPEN_ADAPTER
ERP need : 0x04 ZFCP_ERP_ACTION_REOPEN_ADAPTER
This is the common code equivalent to the zfcp-specific
&dev_attr_adapter_failed.attr in zfcp_sysfs_adapter_attrs.attrs[]:
$ echo 0 > /sys/bus/ccw/drivers/zfcp/<devbusid>/failed
The unsupported case returns EOPNOTSUPP:
$ echo firmware > /sys/class/scsi_host/host<N>/host_reset
-bash: echo: write error: Operation not supported
Example trace record formatted with zfcpdbf from s390-tools:
Timestamp : ...
Area : SCSI
Subarea : 00
Level : 1
Exception : -
CPU ID : ..
Caller : 0x...
Record ID : 1
Tag : scshr_n SCSI sysfs host_reset no
Request ID : 0x0000000000000000 none (invalid)
SCSI ID : 0xffffffff none (invalid)
SCSI LUN : 0xffffffff none (invalid)
SCSI LUN high : 0xffffffff none (invalid)
SCSI result : 0xffffffa1 -EOPNOTSUPP==-95
SCSI retries : 0xff none (invalid)
SCSI allowed : 0xff none (invalid)
SCSI scribble : 0xffffffffffffffff none (invalid)
SCSI opcode :
ffffffff ffffffff ffffffff ffffffff none (invalid)
FCP rsp inf cod: 0xff none (invalid)
FCP rsp IU :
00000000 00000000 00000000 00000000 none (invalid)
00000000 00000000
For any other invalid value, common code returns EINVAL without invoking
our callback:
$ echo foo > /sys/class/scsi_host/host<N>/host_reset
-bash: echo: write error: Invalid argument
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Steffen Maier [Thu, 17 May 2018 17:15:02 +0000 (19:15 +0200)]
scsi: zfcp: explicitly support initiator in scsi_host_template
While the default did already correctly print "Initiator" let's make it
explicit and convert zfcp to the feature.
$ cat /sys/class/scsi_host/host0/supported_mode
Initiator
$ cat /sys/class/scsi_host/host0/active_mode
Initiator
The default worked, because not setting the field has it initialized to zero
== MODE_UNKNOWN. scsi_host_alloc() sets shost->active_mode = MODE_INITIATOR
in this case. The sysfs accessor function show_shost_supported_mode()
assumes MODE_INITIATOR in this case. This default behavior was introduced
with v2.6.24 commit
7a39ac3f25be ("[SCSI] make supported_mode default to
initiator."). The feature flag was introduced with v2.6.24 commit
5dc2b89e1242 ("[SCSI] add supported_mode and active_mode attributes to the
host"). So there was no release where zfcp would have shown "unknown".
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Steffen Maier [Thu, 17 May 2018 17:15:01 +0000 (19:15 +0200)]
scsi: zfcp: remove unused return values of ERP trigger functions
Since v2.6.27 commit
553448f6c483 ("[SCSI] zfcp: Message cleanup"), none of
the callers has been interested any more. Values were not returned
consistently in all ERP trigger functions.
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Steffen Maier [Thu, 17 May 2018 17:15:00 +0000 (19:15 +0200)]
scsi: zfcp: zfcp_erp_action_exists() does only check for running
Simplify its signature to return boolean and rename it to
zfcp_erp_action_is_running() to indicate its actual unmodified semantics.
It has always been used like this since v2.6.0 history commit
ea127f975424
("[PATCH] s390 (7/7): zfcp host adapter.").
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Steffen Maier [Thu, 17 May 2018 17:14:59 +0000 (19:14 +0200)]
scsi: zfcp: remove unused ERP enum values
All constant defines were introduced with v2.6.0 history commit
ea127f975424
("[PATCH] s390 (7/7): zfcp host adapter.") and refactored into enums with
commit
287ac01acf22 ("[SCSI] zfcp: Cleanup code in zfcp_erp.c").
ZFCP_STATUS_ERP_DISMISSING and ZFCP_ERP_STEP_FSF_XCONFIG were never used.
v2.6.27 commit
287ac01acf22 ("[SCSI] zfcp: Cleanup code in zfcp_erp.c")
removed the use of ZFCP_ERP_ACTION_READY on refactoring
zfcp_erp_action_exists() to now only check adapter->erp_running_head but no
longer adapter->erp_ready_head. The same commit could have changed the
function return type from int to "enum zfcp_erp_act_state".
ZFCP_ERP_ACTION_READY was never used outside of zfcp_erp_action_exists().
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Steffen Maier [Thu, 17 May 2018 17:14:58 +0000 (19:14 +0200)]
scsi: zfcp: consistently use function name space prefix
I've been mixing up
zfcp_task_mgmt_function() [SCSI] and
zfcp_fsf_fcp_task_mgmt() [FSF]
so often lately that I wanted to fix this.
SCSI changes complement v2.6.27 commit
f76af7d7e363 ("[SCSI] zfcp: Cleanup
of code in zfcp_scsi.c").
While at it, also fixup the other inconsistencies elsewhere.
ERP changes complement v2.6.27 commit
287ac01acf22 ("[SCSI] zfcp: Cleanup
code in zfcp_erp.c") which introduced status_change_set().
FC changes complement v2.6.32 commit
6f53a2d2ecae ("[SCSI] zfcp: Apply
common naming conventions to zfcp_fc"). by renaming a leftover introduced
with v2.6.27 commit
cc8c282963bd ("[SCSI] zfcp: Automatically attach remote
ports").
FSF changes fixup v2.6.32 commit
a4623c467ff7 ("[SCSI] zfcp: Improve request
allocation through mempools"). which replaced zfcp_fsf_alloc_qtcb()
introduced with v2.6.27 commit
c41f8cbddd4e ("[SCSI] zfcp: zfcp_fsf
cleanup.").
SCSI fc_host statistics were introduced with v2.6.16 commit
f6cd94b126aa
("[SCSI] zfcp: transport class adaptations").
SCSI fc_host port_state was introduced with v2.6.27 commit
85a82392fe6f
("[SCSI] zfcp: Add port_state attribute to sysfs").
SCSI rport setter for dev_loss_tmo was introduced with v2.6.18 commit
338151e06608 ("[SCSI] zfcp: make use of fc_remote_port_delete when target
port is unavailable").
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Steffen Maier [Thu, 17 May 2018 17:14:57 +0000 (19:14 +0200)]
scsi: zfcp: workqueue: set description for port work items with their WWPN as context
As a prerequisite, complement commit
3d1cb2059d93 ("workqueue: include
workqueue info when printing debug dump of a worker task") to be usable with
kernel modules by exporting the symbol set_worker_desc(). Current built-in
user was introduced with commit
ef3b101925f2 ("writeback: set worker desc to
identify writeback workers in task dumps").
Can help distinguishing work items which do not have adapter scope.
Description is printed out with task dump for debugging on WARN, BUG, panic,
or magic-sysrq [show-task-states(t)].
Example:
$ echo 0 >| /sys/bus/ccw/drivers/zfcp/0.0.1880/0x50050763031bd327/failed &
$ echo 't' >| /proc/sysrq-trigger
$ dmesg
sysrq: SysRq : Show State
task PC stack pid father
...
zfcp_q_0.0.1880 S14640 2165 2 0x02000000
Call Trace:
([<
00000000009df464>] __schedule+0xbf4/0xc78)
[<
00000000009df57c>] schedule+0x94/0xc0
[<
0000000000168654>] rescuer_thread+0x33c/0x3a0
[<
000000000016f8be>] kthread+0x166/0x178
[<
00000000009e71f2>] kernel_thread_starter+0x6/0xc
[<
00000000009e71ec>] kernel_thread_starter+0x0/0xc
no locks held by zfcp_q_0.0.1880/2165.
...
kworker/u512:2 D11280 2193 2 0x02000000
Workqueue: zfcp_q_0.0.1880 zfcp_scsi_rport_work [zfcp] (zrpd-
50050763031bd327)
^^^^^^^^^^^^^^^^^^^^^
Call Trace:
([<
00000000009df464>] __schedule+0xbf4/0xc78)
[<
00000000009df57c>] schedule+0x94/0xc0
[<
00000000009e50c0>] schedule_timeout+0x488/0x4d0
[<
00000000001e425c>] msleep+0x5c/0x78 >>test code only<<
[<
000003ff8008a21e>] zfcp_scsi_rport_work+0xbe/0x100 [zfcp]
[<
0000000000167154>] process_one_work+0x3b4/0x718
[<
000000000016771c>] worker_thread+0x264/0x408
[<
000000000016f8be>] kthread+0x166/0x178
[<
00000000009e71f2>] kernel_thread_starter+0x6/0xc
[<
00000000009e71ec>] kernel_thread_starter+0x0/0xc
2 locks held by kworker/u512:2/2193:
#0: (name){++++.+}, at: [<
0000000000166f4e>] process_one_work+0x1ae/0x718
#1: ((&(&port->rport_work)->work)){+.+.+.}, at: [<
0000000000166f4e>] process_one_work+0x1ae/0x718
...
=============================================
Showing busy workqueues and worker pools:
workqueue zfcp_q_0.0.1880: flags=0x2000a
pwq 512: cpus=0-255 flags=0x4 nice=0 active=1/1
in-flight: 2193:zfcp_scsi_rport_work [zfcp]
pool 512: cpus=0-255 flags=0x4 nice=0 hung=0s workers=4 idle: 5 2354 2311
Work items with adapter scope are already identified by the workqueue name
"zfcp_q_<devbusid>" and the work item function name.
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Steffen Maier [Thu, 17 May 2018 17:14:56 +0000 (19:14 +0200)]
scsi: zfcp: decouple our scsi_eh callbacks from scsi_cmnd
Note: zfcp_scsi_eh_host_reset_handler() will be converted in a later patch.
zfcp_scsi_eh_device_reset_handler() now only depends on scsi_device.
zfcp_scsi_eh_target_reset_handler() now only depends on scsi_target.
All derive other objects from these intended callback arguments.
zfcp_scsi_eh_target_reset_handler() is special: The FCP channel requires a
valid LUN handle so we try to find ourselves a stand-in scsi_device as
suggested by Hannes Reinecke. If it cannot find a stand-in scsi device,
trace a record like the following (formatted with zfcpdbf from s390-tools):
Timestamp : ...
Area : SCSI
Subarea : 00
Level : 1
Exception : -
CPU ID : ..
Caller : 0x...
Record ID : 1
Tag : tr_nosd target reset, no SCSI device
Request ID : 0x0000000000000000 none (invalid)
SCSI ID : 0x00000000 SCSI ID/target denoting scope
SCSI LUN : 0xffffffff none (invalid)
SCSI LUN high : 0xffffffff none (invalid)
SCSI result : 0x00002003 field re-used for midlayer value: FAILED
SCSI retries : 0xff none (invalid)
SCSI allowed : 0xff none (invalid)
SCSI scribble : 0xffffffffffffffff none (invalid)
SCSI opcode :
ffffffff ffffffff ffffffff ffffffff none (invalid)
FCP rsp inf cod: 0xff none (invalid)
FCP rsp IU :
00000000 00000000 00000000 00000000 none (invalid)
00000000 00000000
Actually change the signature of zfcp_task_mgmt_function() used by
zfcp_scsi_eh_device_reset_handler() & zfcp_scsi_eh_target_reset_handler().
Since it was prepared in a previous patch, we only need to delete a local
auto variable which is now the intended argument.
Suggested-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Steffen Maier [Thu, 17 May 2018 17:14:55 +0000 (19:14 +0200)]
scsi: zfcp: decouple TMFs from scsi_cmnd by using fc_block_rport
Intentionally retrieve the rport by walking SCSI common code objects
rather than zfcp_sdev->port->rport.
The latter is used for pairing the calls to fc_remote_port_add() and
fc_remote_port_delete(). [see v2.6.31 commit
379d6bf6573e ("[SCSI] zfcp:
Add port only once to FC transport class")]
zfcp_scsi_rport_register() sets zfcp_port.rport to what
fc_remote_port_add() returned.
zfcp_scsi_rport_block() sets zfcp_port.rport = NULL after having called
fc_remote_port_delete().
Hence, while an rport is blocked (or in any subsequent state due to
scsi_transport_fc timeouts such as fast_io_fail_tmo or dev_loss_tmo),
zfcp_port.rport is NULL and cannot serve as argument to fc_block_rport().
During zfcp recovery, a just recovered zfcp_port can have the UNBLOCKED
status flag, but an async rport unblocking has only started via
zfcp_scsi_schedule_rport_register() in zfcp_erp_try_rport_unblock()
[see v4.10 commit
6f2ce1c6af37 ("scsi: zfcp: fix rport unblock race with
LUN recovery")] in zfcp_erp_action_cleanup(). Now zfcp_erp_wait() can
return. This would be sufficient to successfully send a TMF.
But the rport can still be blocked and zfcp_port.rport can still be NULL
until zfcp_port.rport_work was scheduled and has actually called
fc_remote_port_add() and assigned its return value to zfcp_port.rport.
We need an unblocked rport for a successful scsi_eh TUR.
Similarly, for a zfcp_port which has just lost its UNBLOCKED status flag,
the return of zfcp_erp_wait() can race with zfcp_port.rport_work queued
by zfcp_scsi_schedule_rport_block(). Therefore we cannot reliably access
zfcp_port.rport. However, we'd like to get fc_rport_block()'s opinion on
when fast_io_fail_tmo triggered. While we might use
flush_work(&port->rport_work) to sync with the work item, we can simply use
the other way to get an rport pointer.
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Steffen Maier [Thu, 17 May 2018 17:14:54 +0000 (19:14 +0200)]
scsi: zfcp: decouple SCSI setup of TMF from scsi_cmnd
Actually change the signature of zfcp_fsf_fcp_task_mgmt().
Since it was prepared in the previous patch, we only need to delete
a local auto variable which is now the intended argument.
Prepare zfcp_fsf_fcp_task_mgmt's caller zfcp_task_mgmt_function()
to have its function body only depend on a scsi_device and derived objects.
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Steffen Maier [Thu, 17 May 2018 17:14:53 +0000 (19:14 +0200)]
scsi: zfcp: decouple FSF request setup of TMF from scsi_cmnd
In zfcp_fsf_fcp_task_mgmt() resolve the still old argument scsi_cmnd into
scsi_device very early and only depend on scsi_device and derived objects in
the function body.
This prepares to later change the function signature replacing the scsi_cmnd
argument with scsi_device.
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Steffen Maier [Thu, 17 May 2018 17:14:52 +0000 (19:14 +0200)]
scsi: zfcp: split FCP_CMND IU setup between SCSI I/O and TMF again
This reverts commit
2443c8b23aea ("[SCSI] zfcp: Merge FCP task management
setup with regular FCP command setup"), because this introduced a
dependency on the unsuitable SCSI command for scsi_eh / TMF.
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Steffen Maier [Thu, 17 May 2018 17:14:51 +0000 (19:14 +0200)]
scsi: zfcp: decouple TMF response handler from scsi_cmnd
Originally, I planned for TMF handling to have different context data in
fsf_req->data depending on the TMF scope in fcp_cmnd->fc_tm_flags:
* scsi_device if FCP_TMF_LUN_RESET,
* zfcp_port if FCP_TMF_TGT_RESET.
However, the FCP channel requires a valid LUN handle so we now use
scsi_device as context data with any TMF for the time being.
Regular SCSI I/O FCP requests continue using scsi_cmnd as req->data.
Hence, the callers of zfcp_fsf_fcp_handler_common() must resolve req->data
and pass scsi_device as common context. While at it, remove the detour
zfcp_sdev->port->adapter and use the more direct req->adapter as elsewhere
in this function already.
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Steffen Maier [Thu, 17 May 2018 17:14:50 +0000 (19:14 +0200)]
scsi: zfcp: decouple SCSI traces for scsi_eh / TMF from scsi_cmnd
The SCSI command pointer passed to scsi_eh callbacks is just one arbitrary
command of potentially many that are in the eh queue to be processed. The
command is only used to indirectly pass the TMF scope in terms of SCSI
ID/target and SCSI LUN for LUN reset.
Hence, zfcp had filled in SCSI trace record fields which do not really
belong to the TMF. This was confusing.
Therefore, refactor the TMF tracing to work without SCSI command. Since the
FCP channel always requires a valid LUN handle, we use SCSI device as common
context for any TMF (even target reset). To make it even clearer, we set
all bits to 1 for the fields, which do not belong to the TMF, to indicate
that these fields are invalid.
The old zfcp_dbf_scsi() became zfcp_dbf_scsi_common() to now handle both
SCSI commands and TMFs. The old argument scsi_cmnd is now optional and can
be NULL with TMFs. The new argument scsi_device is mandatory to carry
context, as well as SCSI ID/target and SCSI LUN in case of TMFs.
New example trace record formatted with zfcpdbf from s390-tools:
Timestamp : ...
Area : SCSI
Subarea : 00
Level : 1
Exception : -
CPU ID : ..
Caller : 0x...
Record ID : 1
Tag : [lt]r_....
Request ID : 0x<reqid> ID of FSF FCP request with TM flag
For cases without FSF request: 0x0 for none (invalid)
SCSI ID : 0x<scsi_id> SCSI ID/target denoting scope
SCSI LUN : 0x<scsi_lun> SCSI LUN denoting scope
SCSI LUN high : 0x<scsi_lun_high> SCSI LUN denoting scope
SCSI result : 0xffffffff none (invalid)
SCSI retries : 0xff none (invalid)
SCSI allowed : 0xff none (invalid)
SCSI scribble : 0xffffffffffffffff none (invalid)
SCSI opcode :
ffffffff ffffffff ffffffff ffffffff none (invalid)
FCP rsp inf cod: 0x00 FCP_RSP info code of TMF
FCP rsp IU :
00000000 00000000 00000100 00000000 ext FCP_RSP IU
00000000 00000008 ext FCP_RSP IU
FCP rsp IU len : 32 FCP_RSP IU length
Payload time : ...
FCP rsp IU all :
00000000 00000000 00000100 00000000 full FCP_RSP IU
00000000 00000008 00000000 00000000 full FCP_RSP IU
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Steffen Maier [Thu, 17 May 2018 17:14:49 +0000 (19:14 +0200)]
scsi: zfcp: fix missing REC trigger trace on enqueue without ERP thread
Example trace record formatted with zfcpdbf from s390-tools:
Timestamp : ...
Area : REC
Subarea : 00
Level : 1
Exception : -
CPU ID : ..
Caller : 0x...
Record ID : 1 ZFCP_DBF_REC_TRIG
Tag : .......
LUN : 0x...
WWPN : 0x...
D_ID : 0x...
Adapter status : 0x...
Port status : 0x...
LUN status : 0x...
Ready count : 0x...
Running count : 0x...
ERP want : 0x0. ZFCP_ERP_ACTION_REOPEN_...
ERP need : 0xc0 ZFCP_ERP_ACTION_NONE
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Cc: <stable@vger.kernel.org> #2.6.38+
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Steffen Maier [Thu, 17 May 2018 17:14:48 +0000 (19:14 +0200)]
scsi: zfcp: fix missing REC trigger trace for all objects in ERP_FAILED
That other commit introduced an inconsistency because it would trace on
ERP_FAILED for all callers of port forced reopen triggers (not just
terminate_rport_io), but it would not trace on ERP_FAILED for all callers of
other ERP triggers such as adapter, port regular, LUN.
Therefore, generalize that other commit. zfcp_erp_action_enqueue() already
had two early outs which re-used the one zfcp_dbf_rec_trig() call. All ERP
trigger functions finally run through zfcp_erp_action_enqueue(). So move
the special handling for ZFCP_STATUS_COMMON_ERP_FAILED into
zfcp_erp_action_enqueue() and add another early out with new trace marker
for pseudo ERP need in this case. This removes all early returns from all
ERP trigger functions so we always end up at zfcp_dbf_rec_trig().
Example trace record formatted with zfcpdbf from s390-tools:
Timestamp : ...
Area : REC
Subarea : 00
Level : 1
Exception : -
CPU ID : ..
Caller : 0x...
Record ID : 1 ZFCP_DBF_REC_TRIG
Tag : .......
LUN : 0x...
WWPN : 0x...
D_ID : 0x...
Adapter status : 0x...
Port status : 0x...
LUN status : 0x...
Ready count : 0x...
Running count : 0x...
ERP want : 0x0. ZFCP_ERP_ACTION_REOPEN_...
ERP need : 0xe0 ZFCP_ERP_ACTION_FAILED
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Cc: <stable@vger.kernel.org> #2.6.38+
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Steffen Maier [Thu, 17 May 2018 17:14:47 +0000 (19:14 +0200)]
scsi: zfcp: fix missing REC trigger trace on terminate_rport_io for ERP_FAILED
For problem determination we always want to see when we were invoked on the
terminate_rport_io callback whether we perform something or not.
Temporal event sequence of interest with a long fast_io_fail_tmo of 27 sec:
loose remote port
t workqueue
[s] zfcp_q_<dev> IRQ zfcperp<dev>
=== ================== =================== ============================
0 recv RSCN
q p.test_link_work
block rport
start fast_io_fail_tmo
send ADISC ELS
4 recv ADISC fail
block zfcp_port
port forced reopen
send open port
12 recv open port fail
q p.gid_pn_work
zfcp_erp_wakeup
(zfcp_erp_wait would return)
GID_PN fail
Before this point, we got a SCSI trace with tag "sctrpi1" on fast_io_fail,
e.g. with the typical 5 sec setting.
port.status |= ERP_FAILED
If fast_io_fail_tmo triggers after this point, we missed a SCSI trace.
workqueue
fc_dl_<host>
==================
27 fc_timeout_fail_rport_io
fc_terminate_rport_io
zfcp_scsi_terminate_rport_io
zfcp_erp_port_forced_reopen
_zfcp_erp_port_forced_reopen
if (port.status & ERP_FAILED)
return;
Therefore, write a trace before above early return.
Example trace record formatted with zfcpdbf from s390-tools:
Timestamp : ...
Area : REC
Subarea : 00
Level : 1
Exception : -
CPU ID : ..
Caller : 0x...
Record ID : 1 ZFCP_DBF_REC_TRIG
Tag : sctrpi1 SCSI terminate rport I/O
LUN : 0xffffffffffffffff none (invalid)
WWPN : 0x<wwpn>
D_ID : 0x<n_port_id>
Adapter status : 0x...
Port status : 0x...
LUN status : 0x00000000 none (invalid)
Ready count : 0x...
Running count : 0x...
ERP want : 0x03 ZFCP_ERP_ACTION_REOPEN_PORT_FORCED
ERP need : 0xe0 ZFCP_ERP_ACTION_FAILED
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Cc: <stable@vger.kernel.org> #2.6.38+
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Steffen Maier [Thu, 17 May 2018 17:14:46 +0000 (19:14 +0200)]
scsi: zfcp: fix missing REC trigger trace on terminate_rport_io early return
get_device() and its internally used kobject_get() only return NULL if they
get passed NULL as argument. zfcp_get_port_by_wwpn() loops over
adapter->port_list so the iteration variable port is always non-NULL.
Struct device is embedded in struct zfcp_port so &port->dev is always
non-NULL. This is the argument to get_device(). However, if we get an
fc_rport in terminate_rport_io() for which we cannot find a match within
zfcp_get_port_by_wwpn(), the latter can return NULL. v2.6.30 commit
70932935b61e ("[SCSI] zfcp: Fix oops when port disappears") introduced an
early return without adding a trace record for this case. Even if we don't
need recovery in this case, for debugging we should still see that our
callback was invoked originally by scsi_transport_fc.
Example trace record formatted with zfcpdbf from s390-tools:
Timestamp : ...
Area : REC
Subarea : 00
Level : 1
Exception : -
CPU ID : ..
Caller : 0x...
Record ID : 1
Tag : sctrpin SCSI terminate rport I/O, no zfcp port
LUN : 0xffffffffffffffff none (invalid)
WWPN : 0x<wwpn> WWPN
D_ID : 0x<n_port_id> N_Port-ID
Adapter status : 0x...
Port status : 0xffffffff unknown (-1)
LUN status : 0x00000000 none (invalid)
Ready count : 0x...
Running count : 0x...
ERP want : 0x03 ZFCP_ERP_ACTION_REOPEN_PORT_FORCED
ERP need : 0xc0 ZFCP_ERP_ACTION_NONE
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Fixes:
70932935b61e ("[SCSI] zfcp: Fix oops when port disappears")
Cc: <stable@vger.kernel.org> #2.6.38+
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Steffen Maier [Thu, 17 May 2018 17:14:45 +0000 (19:14 +0200)]
scsi: zfcp: fix misleading REC trigger trace where erp_action setup failed
If a SCSI device is deleted during scsi_eh host reset, we cannot get a
reference to the SCSI device anymore since scsi_device_get returns !=0 by
design. Assuming the recovery of adapter and port(s) was successful,
zfcp_erp_strategy_followup_success() attempts to trigger a LUN reset for the
half-gone SCSI device. Unfortunately, it causes the following confusing
trace record which states that zfcp will do a LUN recovery as "ERP need" is
ZFCP_ERP_ACTION_REOPEN_LUN == 1 and equals "ERP want".
Old example trace record formatted with zfcpdbf from s390-tools:
Tag: : ersfs_3 ERP, trigger, unit reopen, port reopen succeeded
LUN : 0x<FCP_LUN>
WWPN : 0x<WWPN>
D_ID : 0x<N_Port-ID>
Adapter status : 0x5400050b
Port status : 0x54000001
LUN status : 0x40000000 ZFCP_STATUS_COMMON_RUNNING
but not ZFCP_STATUS_COMMON_UNBLOCKED as it
was closed on close part of adapter reopen
ERP want : 0x01
ERP need : 0x01 misleading
However, zfcp_erp_setup_act() returns NULL as it cannot get the reference.
Hence, zfcp_erp_action_enqueue() takes an early goto out and _NO_ recovery
actually happens.
We always do want the recovery trigger trace record even if no erp_action
could be enqueued as in this case. For other cases where we did not enqueue
an erp_action, 'need' has always been zero to indicate this. In order to
indicate above goto out, introduce an eyecatcher "flag" to mark the "ERP
need" as 'not needed' but still keep the information which erp_action type,
that zfcp_erp_required_act() had decided upon, is needed. 0xc_ is chosen to
be visibly different from 0x0_ in "ERP want".
New example trace record formatted with zfcpdbf from s390-tools:
Tag: : ersfs_3 ERP, trigger, unit reopen, port reopen succeeded
LUN : 0x<FCP_LUN>
WWPN : 0x<WWPN>
D_ID : 0x<N_Port-ID>
Adapter status : 0x5400050b
Port status : 0x54000001
LUN status : 0x40000000
ERP want : 0x01
ERP need : 0xc1 would need LUN ERP, but no action set up
^
Before v2.6.38 commit
ae0904f60fab ("[SCSI] zfcp: Redesign of the debug
tracing for recovery actions.") we could detect this case because the
"erp_action" field in the trace was NULL. The rework removed erp_action as
argument and field from the trace.
This patch here is for tracing. A fix to allow LUN recovery in the case at
hand is a topic for a separate patch.
See also commit
fdbd1c5e27da ("[SCSI] zfcp: Allow running unit/LUN shutdown
without acquiring reference") for a similar case and background info.
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Fixes:
ae0904f60fab ("[SCSI] zfcp: Redesign of the debug tracing for recovery actions.")
Cc: <stable@vger.kernel.org> #2.6.38+
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Steffen Maier [Thu, 17 May 2018 17:14:44 +0000 (19:14 +0200)]
scsi: zfcp: fix missing SCSI trace for retry of abort / scsi_eh TMF
We already have a SCSI trace for the end of abort and scsi_eh TMF. Due to
zfcp_erp_wait() and fc_block_scsi_eh() time can pass between the start of
our eh callback and an actual send/recv of an abort / TMF request. In order
to see the temporal sequence including any abort / TMF send retries, add a
trace before the above two blocking functions. This supports problem
determination with scsi_eh and parallel zfcp ERP.
No need to explicitly trace the beginning of our eh callback, since we
typically can send an abort / TMF and see its HBA response (in the worst
case, it's a pseudo response on dismiss all of adapter recovery, e.g. due to
an FSF request timeout [fsrth_1] of the abort / TMF). If we cannot send, we
now get a trace record for the first "abrt_wt" or "[lt]r_wait" which denotes
almost the beginning of the callback.
No need to explicitly trace the wakeup after the above two blocking
functions because the next retry loop causes another trace in any case and
that is sufficient.
Example trace records formatted with zfcpdbf from s390-tools:
Timestamp : ...
Area : SCSI
Subarea : 00
Level : 1
Exception : -
CPU ID : ..
Caller : 0x...
Record ID : 1
Tag : abrt_wt abort, before zfcp_erp_wait()
Request ID : 0x0000000000000000 none (invalid)
SCSI ID : 0x<scsi_id>
SCSI LUN : 0x<scsi_lun>
SCSI LUN high : 0x<scsi_lun_high>
SCSI result : 0x<scsi_result_of_cmd_to_be_aborted>
SCSI retries : 0x<retries_of_cmd_to_be_aborted>
SCSI allowed : 0x<allowed_retries_of_cmd_to_be_aborted>
SCSI scribble : 0x<req_id_of_cmd_to_be_aborted>
SCSI opcode : <CDB_of_cmd_to_be_aborted>
FCP rsp inf cod: 0x.. none (invalid)
FCP rsp IU : ... none (invalid)
Timestamp : ...
Area : SCSI
Subarea : 00
Level : 1
Exception : -
CPU ID : ..
Caller : 0x...
Record ID : 1
Tag : lr_wait LUN reset, before zfcp_erp_wait()
Request ID : 0x0000000000000000 none (invalid)
SCSI ID : 0x<scsi_id>
SCSI LUN : 0x<scsi_lun>
SCSI LUN high : 0x<scsi_lun_high>
SCSI result : 0x... unrelated
SCSI retries : 0x.. unrelated
SCSI allowed : 0x.. unrelated
SCSI scribble : 0x... unrelated
SCSI opcode : ... unrelated
FCP rsp inf cod: 0x.. none (invalid)
FCP rsp IU : ... none (invalid)
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Fixes:
63caf367e1c9 ("[SCSI] zfcp: Improve reliability of SCSI eh handlers in zfcp")
Fixes:
af4de36d911a ("[SCSI] zfcp: Block scsi_eh thread for rport state BLOCKED")
Cc: <stable@vger.kernel.org> #2.6.38+
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Steffen Maier [Thu, 17 May 2018 17:14:43 +0000 (19:14 +0200)]
scsi: zfcp: fix missing SCSI trace for result of eh_host_reset_handler
For problem determination we need to see whether and why we were successful
or not. This allows deduction of scsi_eh escalation.
Example trace record formatted with zfcpdbf from s390-tools:
Timestamp : ...
Area : SCSI
Subarea : 00
Level : 1
Exception : -
CPU ID : ..
Caller : 0x...
Record ID : 1
Tag : schrh_r SCSI host reset handler result
Request ID : 0x0000000000000000 none (invalid)
SCSI ID : 0xffffffff none (invalid)
SCSI LUN : 0xffffffff none (invalid)
SCSI LUN high : 0xffffffff none (invalid)
SCSI result : 0x00002002 field re-used for midlayer value: SUCCESS
or in other cases: 0x2009 == FAST_IO_FAIL
SCSI retries : 0xff none (invalid)
SCSI allowed : 0xff none (invalid)
SCSI scribble : 0xffffffffffffffff none (invalid)
SCSI opcode :
ffffffff ffffffff ffffffff ffffffff none (invalid)
FCP rsp inf cod: 0xff none (invalid)
FCP rsp IU :
00000000 00000000 00000000 00000000 none (invalid)
00000000 00000000
v2.6.35 commit
a1dbfddd02d2 ("[SCSI] zfcp: Pass return code from
fc_block_scsi_eh to scsi eh") introduced the first return with something
other than the previously hardcoded single SUCCESS return path.
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Fixes:
a1dbfddd02d2 ("[SCSI] zfcp: Pass return code from fc_block_scsi_eh to scsi eh")
Cc: <stable@vger.kernel.org> #2.6.38+
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Uma Krishnan [Fri, 11 May 2018 19:06:19 +0000 (14:06 -0500)]
scsi: cxlflash: Isolate external module dependencies
Depending on the underlying transport, cxlflash has a dependency on either
the CXL or OCXL drivers, which are enabled via their Kconfig option.
Instead of having a module wide dependency on these config options, it is
better to isolate the object modules that are dependent on the CXL and OCXL
drivers and adjust the module dependencies accordingly.
This commit isolates the object files that are dependent on CXL and/or
OCXL. The cxl/ocxl fops used in the core driver are tucked under an ifdef to
avoid compilation errors.
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Uma Krishnan [Fri, 11 May 2018 19:06:05 +0000 (14:06 -0500)]
scsi: cxlflash: Abstract hardware dependent assignments
As a staging cleanup to support transport specific builds of the cxlflash
module, relocate device dependent assignments to header files. This will
avoid littering the core driver with conditional compilation logic.
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Uma Krishnan [Fri, 11 May 2018 19:05:51 +0000 (14:05 -0500)]
scsi: cxlflash: Add include guards to backend.h
The new header file, backend.h, that was recently added is missing the
include guards. This commit adds the guards.
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Matthew R. Ochs [Fri, 11 May 2018 19:05:37 +0000 (14:05 -0500)]
scsi: cxlflash: Use local mutex for AFU serialization
AFUs can only process a single AFU command at a time. This is enforced with
a global mutex situated within the AFU send routine. As this mutex has a
global scope, it has the potential to unnecessarily block commands destined
for other AFUs.
Instead of using a global mutex, transition the mutex to be per-AFU. This
will allow commands to only be blocked by siblings of the same AFU.
Signed-off-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Uma Krishnan [Fri, 11 May 2018 19:05:22 +0000 (14:05 -0500)]
scsi: cxlflash: Acquire semaphore before invoking ioctl services
When a superpipe process that makes use of virtual LUNs is terminated or
killed abruptly, there is a possibility that the cxlflash driver could hang
and deprive other operations on the adapter.
The release fop registered to be invoked on a context close, detaches every
LUN associated with the context. The underlying service to detach the LUN
assumes it has been called with the read semaphore held, and releases the
semaphore before any operation that could be time consuming.
When invoked without holding the read semaphore, an opportunity is created
for the semaphore's count to become negative when it is temporarily released
during one of these potential lengthy operations. This negative count
results in subsequent acquisition attempts taking forever, leading to the
hang.
To support the current design point of holding the semaphore on the ioctl()
paths, the release fop should acquire it before invoking any ioctl services.
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Uma Krishnan [Fri, 11 May 2018 19:05:08 +0000 (14:05 -0500)]
scsi: cxlflash: Limit the debug logs in the IO path
The kernel log can get filled with debug messages from send_cmd_ioarrin()
when dynamic debug is enabled for the cxlflash module and there is a lot of
legacy I/O traffic.
While these messages are necessary to debug issues that involve command
tracking, the abundance of data can overwrite other useful data in the
log. The best option available is to limit the messages that should serve
most of the common use cases.
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Uma Krishnan [Fri, 11 May 2018 19:04:46 +0000 (14:04 -0500)]
scsi: cxlflash: Yield to active send threads
The following Oops may be encountered if the device is reset, i.e. EEH
recovery, while there is heavy I/O traffic:
59:mon> t
[
c000200db64bb680]
c008000009264c40 cxlflash_queuecommand+0x3b8/0x500
[cxlflash]
[
c000200db64bb770]
c00000000090d3b0 scsi_dispatch_cmd+0x130/0x2f0
[
c000200db64bb7f0]
c00000000090fdd8 scsi_request_fn+0x3c8/0x8d0
[
c000200db64bb900]
c00000000067f528 __blk_run_queue+0x68/0xb0
[
c000200db64bb930]
c00000000067ab80 __elv_add_request+0x140/0x3c0
[
c000200db64bb9b0]
c00000000068daac blk_execute_rq_nowait+0xec/0x1a0
[
c000200db64bba00]
c00000000068dbb0 blk_execute_rq+0x50/0xe0
[
c000200db64bba50]
c0000000006b2040 sg_io+0x1f0/0x520
[
c000200db64bbaf0]
c0000000006b2e94 scsi_cmd_ioctl+0x534/0x610
[
c000200db64bbc20]
c000000000926208 sd_ioctl+0x118/0x280
[
c000200db64bbcc0]
c00000000069f7ac blkdev_ioctl+0x7fc/0xe30
[
c000200db64bbd20]
c000000000439204 block_ioctl+0x84/0xa0
[
c000200db64bbd40]
c0000000003f8514 do_vfs_ioctl+0xd4/0xa00
[
c000200db64bbde0]
c0000000003f8f04 SyS_ioctl+0xc4/0x130
[
c000200db64bbe30]
c00000000000b184 system_call+0x58/0x6c
When there is no room to send the I/O request, the cached room is refreshed
by reading the memory mapped command room value from the AFU. The AFU
register mapping is refreshed during a reset, creating a race condition that
can lead to the Oops above.
During a device reset, the AFU should not be unmapped until all the active
send threads quiesce. An atomic counter, cmds_active, is currently used to
track internal AFU commands and quiesce during reset. This same counter can
also be used for the active send threads.
Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Xiaofei Tan [Wed, 9 May 2018 15:10:50 +0000 (23:10 +0800)]
scsi: hisi_sas: add check of device in hisi_sas_task_exec()
Currently we don't check that device is not gone before dereferencing
its elements in the function hisi_sas_task_exec() (specifically, the DQ
pointer).
This patch fixes this issue by filling in the DQ pointer in
hisi_sas_task_prep() after we check that the device pointer is still
safe to reference.
[mkp: typo]
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Xiang Chen [Wed, 9 May 2018 15:10:49 +0000 (23:10 +0800)]
scsi: hisi_sas: Use device lock to protect slot alloc/free
The IPTT of a slot is unique, and we currently use hisi_hba lock to
protect it.
Now slot is managed on hisi_sas_device.list, so use DQ lock to protect
for allocating and freeing the slot.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Xiang Chen [Wed, 9 May 2018 15:10:48 +0000 (23:10 +0800)]
scsi: hisi_sas: Don't lock DQ for complete task sending
Currently we lock the DQ to protect whole delivery process. So this
stops us building slots for the same queue in parallel, and can affect
performance.
To optimise it, only lock the DQ during special periods, specifically
when allocating a slot from the DQ and when delivering a slot to the HW.
This approach is now safe, thanks to the previous patches to ensure that
we always deliver a slot to the HW once allocated.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Xiang Chen [Wed, 9 May 2018 15:10:47 +0000 (23:10 +0800)]
scsi: hisi_sas: allocate slot buffer earlier
Currently we allocate the slot's memory buffer after allocating the DQ
slot.
To aid DQ lockout reduction, and allow slots to be built in parallel,
move this step (which can fail) prior to allocating the slot.
Also a stray spin_unlock_irqrestore() is removed from internal task exec
function.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Xiang Chen [Wed, 9 May 2018 15:10:46 +0000 (23:10 +0800)]
scsi: hisi_sas: make return type of prep functions void
Since the task prep functions now should not fail, adjust the return
types to void.
In addition, some checks in the task prep functions are relocated to the
main module; this is specifically the check for the number of elements
in an sg list exceeded the HW SGE limit.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Xiang Chen [Wed, 9 May 2018 15:10:45 +0000 (23:10 +0800)]
scsi: hisi_sas: relocate smp sg map
Currently we use DQ lock to protect delivery of DQ entry one by one.
To optimise to allow more than one slot to be built for a single DQ in
parallel, we need to remove the DQ lock when preparing slots, prior to
delivery.
To achieve this, we rearrange the slot build order to ensure that once
we allocate a slot for a task, we do cannot fail to deliver the task.
In this patch, we rearrange the slot building for SMP tasks to ensure
that sg mapping part (which can fail) happens before we allocate the
slot in the DQ.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Alim Akhtar [Sun, 6 May 2018 10:14:18 +0000 (15:44 +0530)]
scsi: ufs: make ufshcd_config_pwr_mode of non-static func
This makes ufshcd_config_pwr_mode non-static so that other vendors like
exynos can use it.
Signed-off-by: Seungwon Jeon <essuuj@gmail.com>
Signed-off-by: Alim Akhtar <alim.akhtar@samsung.com>
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Reviewed-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Alim Akhtar [Sun, 6 May 2018 10:14:17 +0000 (15:44 +0530)]
scsi: ufs: add quirk to enable host controller without hce
Some host controllers don't support host controller enable via HCE.
Signed-off-by: Seungwon Jeon <essuuj@gmail.com>
Signed-off-by: Alim Akhtar <alim.akhtar@samsung.com>
Reviewed-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Alim Akhtar [Sun, 6 May 2018 10:14:16 +0000 (15:44 +0530)]
scsi: ufs: add quirk to disallow reset of interrupt aggregation
Some host controllers support interrupt aggregation but don't allow
resetting counter and timer in software.
Signed-off-by: Seungwon Jeon <essuuj@gmail.com>
Signed-off-by: Alim Akhtar <alim.akhtar@samsung.com>
Reviewed-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Alim Akhtar [Sun, 6 May 2018 10:14:15 +0000 (15:44 +0530)]
scsi: ufs: add quirk to fix mishandling utrlclr/utmrlclr
In the right behavior, setting the bit to '0' indicates clear and '1'
indicates no change. If host controller handles this the other way,
UFSHCI_QUIRK_BROKEN_REQ_LIST_CLR can be used.
[mkp: typo]
Signed-off-by: Seungwon Jeon <essuuj@gmail.com>
Signed-off-by: Alim Akhtar <alim.akhtar@samsung.com>
Reviewed-by: Subhash Jadavani <subhashj@codeaurora.org>
Reviewed-by: "Asutosh Das (asd)" <asutoshd@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Kees Cook [Wed, 2 May 2018 23:58:09 +0000 (16:58 -0700)]
scsi: ufs: ufshcd: Remove VLA usage
On the quest to remove all VLAs from the kernel[1] this moves buffers
off the stack. In the second instance, this collapses two separately
allocated buffers into a single buffer, since they are used
consecutively, which saves 256 bytes (QUERY_DESC_MAX_SIZE + 1) of stack
space.
[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Luc Van Oostenryck [Tue, 24 Apr 2018 13:15:58 +0000 (15:15 +0200)]
scsi: mptlan: Fix mpt_lan_sdu_send()'s return type
The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
which is a typedef for an enum type, but the implementation in this
driver returns an 'int'.
Fix this by returning 'netdev_tx_t' in this driver too.
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Wen Xiong [Wed, 9 May 2018 18:47:54 +0000 (13:47 -0500)]
scsi: ipr: new IOASC update
This patch adds new adapter error log for P9 system with the new AZ SAS
cable.
Signed-off-by: Wen Xiong <wenxiong@linux.vnet.ibm.com>
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Colin Ian King [Tue, 8 May 2018 21:54:35 +0000 (22:54 +0100)]
scsi: esas2r: fix spelling mistake: "requestss" -> "requests"
Trivial fix to spelling mistake in esas2r_debug message.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Andrei Vagin [Thu, 22 Mar 2018 06:55:02 +0000 (23:55 -0700)]
scsi: target: target/file: Add support of direct and async I/O
There are two advantages:
* Direct I/O allows to avoid the write-back cache, so it reduces affects
to other processes in the system.
* Async I/O allows to handle a few commands concurrently.
DIO + AIO shows a better perfomance for random write operations:
Mode: O_DSYNC Async: 1
$ ./fio --bs=4K --direct=1 --rw=randwrite --ioengine=libaio --iodepth=64 --name=/dev/sda --runtime=20 --numjobs=2
WRITE: bw=45.9MiB/s (48.1MB/s), 21.9MiB/s-23.0MiB/s (22.0MB/s-25.2MB/s), io=919MiB (963MB), run=20002-20020msec
Mode: O_DSYNC Async: 0
$ ./fio --bs=4K --direct=1 --rw=randwrite --ioengine=libaio --iodepth=64 --name=/dev/sdb --runtime=20 --numjobs=2
WRITE: bw=1607KiB/s (1645kB/s), 802KiB/s-805KiB/s (821kB/s-824kB/s), io=31.8MiB (33.4MB), run=20280-20295msec
Known issue:
DIF (PI) emulation doesn't work when a target uses async I/O, because
DIF metadata is saved in a separate file, and it is another non-trivial
task how to synchronize writing in two files, so that a following read
operation always returns a consisten metadata for a specified block.
Cc: "Nicholas A. Bellinger" <nab@linux-iscsi.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Tested-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@openvz.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Kees Cook [Wed, 2 May 2018 22:55:45 +0000 (15:55 -0700)]
scsi: libosd: Remove VLA usage
On the quest to remove all VLAs from the kernel[1] this rearranges the
code to avoid a VLA warning under -Wvla (gcc doesn't recognize "const"
variables as not triggering VLA creation). Additionally cleans up
variable naming to avoid 80 character column limit.
[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Boaz Harrosh <ooo@electrozaur.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Zhu Lingshan [Wed, 2 May 2018 03:13:44 +0000 (11:13 +0800)]
scsi: tcmu: refactor nl wr_cache attr with new helpers
use new netlink events helpers tcmu_netlink_init() and
tcmu_netlink_send() to refactor netlink event attribute
TCMU_ATTR_WRITECACHE(belongs to TCMU_CMD_RECONFIG_DEVICE) which is also
emulate_write_cache in configFS.
Removed tcmu_netlink_event() since we have new netlink
events helpers now.
Signed-off-by: Zhu Lingshan <lszhu@suse.com>
Acked-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Zhu Lingshan [Wed, 2 May 2018 03:13:43 +0000 (11:13 +0800)]
scsi: tcmu: refactor nl dev_size attr with new helpers
use new netlink events helpers tcmu_netlink_init() and
tcmu_netlink_send() to refactor netlink event attribute
TCMU_ATTR_DEV_SIZE(belongs to TCMU_CMD_RECONFIG_DEVICE) which is also
dev_size in configFS.
Signed-off-by: Zhu Lingshan <lszhu@suse.com>
Acked-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Zhu Lingshan [Wed, 2 May 2018 03:13:42 +0000 (11:13 +0800)]
scsi: tcmu: refactor nl dev_cfg attr with new nl helpers
use new netlink events helpers tcmu_netlink_init() and
tcmu_netlink_send() to refactor netlink event attribute
TCMU_ATTR_DEV_CFG(belongs to TCMU_CMD_RECONFIG_DEVICE) which is also
dev_config in configFS.
Signed-off-by: Zhu Lingshan <lszhu@suse.com>
Acked-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Zhu Lingshan [Wed, 2 May 2018 03:13:41 +0000 (11:13 +0800)]
scsi: tcmu: refactor rm_device cmd with new nl helpers
use new netlink events helpers tcmu_netlink_init() and
tcmu_netlink_send() to refactor netlink event TCMU_CMD_REMOVED_DEVICE
Signed-off-by: Zhu Lingshan <lszhu@suse.com>
Acked-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Zhu Lingshan [Wed, 2 May 2018 03:13:40 +0000 (11:13 +0800)]
scsi: tcmu: refactor add_device cmd with new nl helpers
use new netlink events helpers tcmu_netlink_init() and
tcmu_netlink_send() to refactor netlink event TCMU_CMD_ADDED_DEVICE
Signed-off-by: Zhu Lingshan <lszhu@suse.com>
Acked-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Zhu Lingshan [Wed, 2 May 2018 03:13:39 +0000 (11:13 +0800)]
scsi: tcmu: add new netlink events helpers
Add new netlink events helpers tcmu_netlink_event_init() and
tcmu_netlink_event_send(). These new functions intend to replace
existing netlink events helper function tcmu_netlink_event().
The existing function tcmu_netlink_event() works well for events like
TCMU_ADDED_DEVICE and TCMU_REMOVED_DEVICE which only has one netlink
attribute. But if there is a command requires more than one attributes
to send out, we have to use a struct to adapt the paremeter
reconfig_data, it is hard to use one struct or a union in one struct to
adapt every command with different attributes, it may get long and ugly.
With the new two functions, we can call tcmu_netlink_event_init() to
initialize a netlink event, then add all attributes we need by using
nla_put_xxx(), at last use tcmu_netlink_event_send() to send it out. So
that we don't need to use a long struct or union if we want to send
mulitple attributes for different commands.
[mkp: typos]
Signed-off-by: Zhu Lingshan <lszhu@suse.com>
Acked-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Wenwen Wang [Tue, 8 May 2018 00:54:01 +0000 (19:54 -0500)]
scsi: 3w-xxxx: fix a missing-check bug
In tw_chrdev_ioctl(), the length of the data buffer is firstly copied
from the userspace pointer 'argp' and saved to the kernel object
'data_buffer_length'. Then a security check is performed on it to make
sure that the length is not more than 'TW_MAX_IOCTL_SECTORS *
512'. Otherwise, an error code -EINVAL is returned. If the security
check is passed, the entire ioctl command is copied again from the
'argp' pointer and saved to the kernel object 'tw_ioctl'. Then, various
operations are performed on 'tw_ioctl' according to the 'cmd'. Given
that the 'argp' pointer resides in userspace, a malicious userspace
process can race to change the buffer length between the two
copies. This way, the user can bypass the security check and inject
invalid data buffer length. This can cause potential security issues in
the following execution.
This patch checks for capable(CAP_SYS_ADMIN) in tw_chrdev_open() to
avoid the above issues.
Signed-off-by: Wenwen Wang <wang6495@umn.edu>
Acked-by: Adam Radford <aradford@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Wenwen Wang [Tue, 8 May 2018 00:46:43 +0000 (19:46 -0500)]
scsi: 3w-9xxx: fix a missing-check bug
In twa_chrdev_ioctl(), the ioctl driver command is firstly copied from
the userspace pointer 'argp' and saved to the kernel object
'driver_command'. Then a security check is performed on the data buffer
size indicated by 'driver_command', which is
'driver_command.buffer_length'. If the security check is passed, the
entire ioctl command is copied again from the 'argp' pointer and saved
to the kernel object 'tw_ioctl'. Then, various operations are performed
on 'tw_ioctl' according to the 'cmd'. Given that the 'argp' pointer
resides in userspace, a malicious userspace process can race to change
the buffer size between the two copies. This way, the user can bypass
the security check and inject invalid data buffer size. This can cause
potential security issues in the following execution.
This patch checks for capable(CAP_SYS_ADMIN) in twa_chrdev_open()t o
avoid the above issues.
Signed-off-by: Wenwen Wang <wang6495@umn.edu>
Acked-by: Adam Radford <aradford@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Tomohiro Kusumi [Fri, 4 May 2018 23:45:29 +0000 (16:45 -0700)]
scsi: mpt3sas: fix header path in ioctl documentation
MPT2_MAGIC_NUMBER as well as drivers/scsi/mpt2sas/mpt2sas_ctl.h were
removed to reuse mpt3sas code since commit
09ec55ed74 ("mpt2sas: Remove
.c and .h files from mpt2sas driver").
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@osnexus.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Tomohiro Kusumi [Fri, 4 May 2018 23:45:28 +0000 (16:45 -0700)]
scsi: mpt3sas: remove obsolete path "drivers/scsi/mpt2sas/" from MAINTAINERS
drivers/scsi/mpt2sas/ no longer exists after commit
c84b06a48c ("mpt3sas: Single driver module which supports both SAS 2.0 &
SAS 3.0 HBAs") merged/removed it.
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@osnexus.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Dan Carpenter [Thu, 3 May 2018 10:54:32 +0000 (13:54 +0300)]
scsi: megaraid: silence a static checker bug
If we had more than 32 megaraid cards then it would cause memory
corruption. That's not likely, of course, but it's handy to enforce it
and make the static checker happy.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Colin Ian King [Thu, 3 May 2018 10:18:07 +0000 (11:18 +0100)]
scsi: mptsas: fix spelling mistake: "matchs" -> "matches"
Trivial fix to spelling mistake in warning message
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Colin Ian King [Thu, 3 May 2018 09:26:12 +0000 (10:26 +0100)]
scsi: lpfc: fix spelling mistakes: "mabilbox" and "maibox"
Trivial fix to spelling mistakes in lpfc_printf_log log message
"mabilbox" -> "mailbox"
"maibox" -> "mailbox"
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Andrei Vagin [Wed, 2 May 2018 20:31:13 +0000 (13:31 -0700)]
scsi: qla2xxx: remove the unused tcm_qla2xxx_cmd_wq
Signed-off-by: Andrei Vagin <avagin@openvz.org>
Reviewed-by: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Colin Ian King [Wed, 2 May 2018 09:12:43 +0000 (10:12 +0100)]
scsi: mptfusion: fix spelling mistake: "initators" -> "initiators"
Trivial fix to spelling mistake in text string.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Xiaofei Tan [Wed, 2 May 2018 15:56:34 +0000 (23:56 +0800)]
scsi: hisi_sas: workaround a v3 hw hilink bug
There is an SoC bug of v3 hw development version. When hot- unplugging a
directly attached disk, the PHY down interrupt may not happen. It is
very easy to appear on some boards.
When this issue occurs, the controller will receive many invalid dword
frames, and the "alos" fields of register HILINK_ERR_DFX can indicate
that disk was unplugged.
As an workaround solution, this patch detects this issue in the channel
interrupt, and workaround it by following steps:
- Disable the PHY
- Clear error code and interrupt
- Enable the PHY
Then the HW will reissue PHY down interrupt.
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
John Garry [Wed, 2 May 2018 15:56:33 +0000 (23:56 +0800)]
scsi: hisi_sas: add readl poll timeout helper wrappers
It is common to use readl poll timeout helpers in the driver, so create
custom wrappers.
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Xiaofei Tan [Wed, 2 May 2018 15:56:32 +0000 (23:56 +0800)]
scsi: hisi_sas: remove redundant handling to event95 for v3
Event95 is used for DFX purpose. The relevant bit for this interrupt in
the ENT_INT_SRC_MSK3 register has been disabled, so remove the
processing.
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Xiang Chen [Wed, 2 May 2018 15:56:31 +0000 (23:56 +0800)]
scsi: hisi_sas: config ATA de-reset as an constrained command for v3 hw
As a unconstrained command, a command can be sent to SATA disk even if
SATA disk status is BUSY, ERR or DRQ.
If an ATA reset assert is successful but ATA reset de-assert fails, then
it will retry the reset de-assert. If reset de- assert retry is
successful, we think it is okay to probe the device but actually it
still has Err status.
Apparently we need to retry the ATA reset assertion and de- assertion
instead for this mentioned scenario.
As such, we config ATA reset assert as a constrained command, if ATA
reset de-assert fails, then ATA reset de-assert retry will also
fail. Then we will retry the proper process of ATA reset assert and
de-assert again.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Xiang Chen [Wed, 2 May 2018 15:56:30 +0000 (23:56 +0800)]
scsi: hisi_sas: update PHY linkrate after a controller reset
After the controller is reset, we currently may not honour the PHY max
linkrate set via sysfs, in that after a reset we always revert to max
linkrate of 12Gbps, ignoring the value set via sysfs.
This patch modifies to policy to set the programmed PHY linkrate,
honouring the max linkrate programmed via sysfs.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
John Garry [Wed, 2 May 2018 15:56:29 +0000 (23:56 +0800)]
scsi: hisi_sas: stop controller timer for reset
We should only have the timer enabled after PHY up after controller
reset, so disable prior to reset.
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Xiang Chen [Wed, 2 May 2018 15:56:28 +0000 (23:56 +0800)]
scsi: hisi_sas: check sas_dev gone earlier in hisi_sas_abort_task()
It is possible to dereference a NULL-pointer in hisi_sas_abort_task() in
special scenario when the device has been removed.
If an SMP task times-out, it will call hisi_sas_abort_task() to
recover. And currently there is a check in hisi_sas_abort_task() to
avoid the situation of processing the abort for the removed device.
However we have an ordering problem, in that we may reference a task for
the removed device before checking if the device has been removed.
Fix this by only referencing the sas_dev after we know it is still
present.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Xiang Chen [Wed, 2 May 2018 15:56:27 +0000 (23:56 +0800)]
scsi: hisi_sas: fix PI memory size
There are 28 bytes of protection information record of SSP for v3 hw, 16
bytes for v2 hw, and probably 24 for v1 hw (forgotten now).
So use a value big enough in hisi_sas_command_table_ssp.prot to cover
all cases.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Xiang Chen [Wed, 2 May 2018 15:56:26 +0000 (23:56 +0800)]
scsi: hisi_sas: check host frozen before calling "done" function
When the host is frozen in SCSI EH state, at any point after the LLDD
sets SAS_TASK_STATE_DONE for the sas_task task state, libsas may free
the task; see sas_scsi_find_task().
This puts the LLDD in a difficult position, in that once it sets
SAS_TASK_STATE_DONE for the task state it should not reference the
sas_task again. But the LLDD needs will check the sas_task indirectly in
calling task->task_done()->sas_scsi_task_done() or sas_ata_task_done()
(to check if the host is frozen state actually).
And the LLDD cannot set SAS_TASK_STATE_DONE for the task state after
task->task_done() is called (as the sas_task is free'd at this point).
This situation would seem to be a problem made by libsas.
To work around, check in the LLDD whether the host is in frozen state to
ensure it is ok to call task->task_done() function. If in the frozen
state, we rely on SCSI EH and libsas to free the sas_task directly.
We do not do this for the following IO types:
- SMP - they are managed in libsas directly, outside SCSI EH
- Any internally originated IO, for similar reason
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Xiang Chen [Wed, 2 May 2018 15:56:25 +0000 (23:56 +0800)]
scsi: hisi_sas: Add some checks to avoid free'ing a sas_task twice
If the SCSI host enters EH, any pending IO will be processed by SCSI
EH. However it is possible that SCSI EH will try to abort the IO and
also at the same time the IO completes in the driver. In this situation
there is a small chance of freeing the sas_task twice.
Then if another IO re-uses freed sas_task before the second time of
free'ing sas_task, it is possible to free incorrect sas_task.
To avoid this situation, add some checks to increase reliability. The
sas_task task state flag SAS_TASK_STATE_ABORTED is used to mutually
protect the LLDD and libsas freeing the task.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Xiang Chen [Wed, 2 May 2018 15:56:24 +0000 (23:56 +0800)]
scsi: hisi_sas: optimise the usage of DQ locking
In the DQ tasklet processing it is not necessary to take the DQ lock, as
there is no contention between adding slots to the CQ and removing slots
from the matching DQ.
In addition, since we run each DQ in a separate tasklet context, there
would be no possible contention between DQ processing running for the
same queue in parallel.
It is still necessary to take hisi_hba lock when free'ing slots.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Sat, 5 May 2018 03:37:59 +0000 (20:37 -0700)]
scsi: lpfc: Comment cleanup regarding Broadcom copyright header
Fix small formatting and wording nits in Broadcom copyright header
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Sat, 5 May 2018 03:37:58 +0000 (20:37 -0700)]
scsi: lpfc: update driver version to 12.0.0.3
Update the driver version to 12.0.0.3
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Sat, 5 May 2018 03:37:57 +0000 (20:37 -0700)]
scsi: lpfc: Enhance log messages when reporting CQE errors
Enhance log messages for CQEs as they were not reporting certain fields.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Sat, 5 May 2018 03:37:56 +0000 (20:37 -0700)]
scsi: lpfc: Fix up log messages and stats counters in IO submit code path
Fix up log messages and add an fcp error stat counter in the IO submit
code path to make diagnosing problems easier
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Sat, 5 May 2018 03:37:54 +0000 (20:37 -0700)]
scsi: lpfc: Driver NVME load fails when CPU cnt > WQ resource cnt
If the cpu count is larger than the number of WQ resources available,
adapter attachment eventually failes due to a WQ_CREATE failure.
Calculate the number of WQs desired (which initializes to cpu count)
after accounting for the number of queues the adapter supports and the
number allocated to SCSI and the control/ELS path, and scale down if
necessary.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Sat, 5 May 2018 03:37:53 +0000 (20:37 -0700)]
scsi: lpfc: Handle new link fault code returned by adapter firmware.
The driver encounters a link event ACQE with a fault code it doesn't
recognize, it logs an "Invalid" fault type and futher treats the unknown
value as a mailbox command failure. First off, there is no "invalid"
value, only values that are unknown. Secondly, the fault code doesn't
indicate status - the rest of the ACQE contains that status so there is
no reason to "fail the commands".
Change the "Invalid" to "Unknown". There is no "invalid" code value.
Separate fault code parsing and message genaration from any mbx handling
status.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Sat, 5 May 2018 03:37:52 +0000 (20:37 -0700)]
scsi: lpfc: Correct fw download error message
In situations when the firmware image in inappropriate for the chip
type, initial validation checks were light, allowing the checks to pass,
thus allowing the firmware to be downloaded. Eventually, after the
download, the chip rejects the firmware but it is logged as a generic
firmware download error.
Revise the initial checks to validate the image vs asic type so that the
correct message is displayed and the download process is avoided.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Sat, 5 May 2018 03:37:51 +0000 (20:37 -0700)]
scsi: lpfc: enhance LE data structure copies to hardware
The driver builds the control structures in host memory using
definitions that are based on 32-bit words. After building the structure
it is then written to the adapter.
This patch slightly optimizes LE hosts by copying the structures via
64-bit copies. This is doable as the adapter interface is LE thus there
is no byteswapping as the copy is performed.
The same optimization would be nice on BE systems, but when byteswapping
occurs, it swaps 32-bit words as well, thus trashing the control
structure. Given amount of code that is dependent upon the 32-bit word
definition, it was decided to not change things for the minor
optimization. Thus PPC 64-bit systems sticks with doing 32-bit copies.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
James Smart [Sat, 5 May 2018 03:37:50 +0000 (20:37 -0700)]
scsi: lpfc: Change IO submit return to EBUSY if remote port is recovering
I/O submission paths in the lpfc nvme path are rejecting the io with an
error code that reflects back to the callee as a hard io failure. Many
of these conditions are transient and would likely resolve if retried.
Correct by returning -EBUSY, which the FC transport triggers off of to
return busy status codes to the blk-mq layer.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Chad Dupuis [Wed, 25 Apr 2018 13:09:05 +0000 (06:09 -0700)]
scsi: qedf: Update version number to 8.33.16.20
Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Chad Dupuis [Wed, 25 Apr 2018 13:09:04 +0000 (06:09 -0700)]
scsi: qedf: Update copyright for 2018
Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Chad Dupuis [Wed, 25 Apr 2018 13:09:03 +0000 (06:09 -0700)]
scsi: qedf: Add more defensive checks for concurrent error conditions
During an uplink toggle test all error handling is done via timeout and
firmware error conditions which can occur concurrently:
- SCSI layer timeouts
- Error detect CQEs
- Firmware detected underruns
- ABTS timeouts
All these concurrent events require more defensive checks in the driver
including:
- Check both internally and externally generated aborts to make sure the
xid is not already been aborted in another context or in cleanup.
- Check back pointers in qedf_cmd_timeout to verify the context of the
io_req, fcport and qedf_ctx
- Check rport state in host reset handler to not reset the whole host
if the rport is already uploaded or in the process of relogin
- Check to state for an fcport before initiating a middle path ELS
request
Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Chad Dupuis [Wed, 25 Apr 2018 13:09:02 +0000 (06:09 -0700)]
scsi: qedf: Set the UNLOADING flag when removing a vport
Similar to what we do when we remove a PCI function, set the
QEDF_UNLOADING flag to prevent any requests from being queued while a
vport is being deleted. This prevents any requests from getting stuck
in limbo when the vport is unloaded or deleted.
Fixes the crash:
PID: 106676 TASK:
ffff9a436aa90000 CPU: 12 COMMAND: "multipathd"
#0 [
ffff9a43567d3550] machine_kexec+522 at
ffffffffaca60b2a
#1 [
ffff9a43567d35b0] __crash_kexec+114 at
ffffffffacb13512
#2 [
ffff9a43567d3680] crash_kexec+48 at
ffffffffacb13600
#3 [
ffff9a43567d3698] oops_end+168 at
ffffffffad117768
#4 [
ffff9a43567d36c0] no_context+645 at
ffffffffad106f52
#5 [
ffff9a43567d3710] __bad_area_nosemaphore+116 at
ffffffffad106fe9
#6 [
ffff9a43567d3760] bad_area+70 at
ffffffffad107379
#7 [
ffff9a43567d3788] __do_page_fault+1247 at
ffffffffad11a8cf
#8 [
ffff9a43567d37f0] do_page_fault+53 at
ffffffffad11a915
#9 [
ffff9a43567d3820] page_fault+40 at
ffffffffad116768
[exception RIP: qedf_init_task+61]
RIP:
ffffffffc0e13c2d RSP:
ffff9a43567d38d0 RFLAGS:
00010046
RAX:
0000000000000000 RBX:
ffffbe920472c738 RCX:
ffff9a434fa0e3e8
RDX:
ffff9a434f695280 RSI:
ffffbe920472c738 RDI:
ffff9a43aa359c80
RBP:
ffff9a43567d3950 R8:
0000000000000c15 R9:
ffff9a3fb09b9880
R10:
ffff9a434fa0e3e8 R11:
ffff9a43567d35ce R12:
0000000000000000
R13:
ffff9a434f695280 R14:
ffff9a43aa359c80 R15:
ffff9a3fb9e005c0
ORIG_RAX:
ffffffffffffffff CS: 0010 SS: 0018
Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Chad Dupuis [Wed, 25 Apr 2018 13:09:01 +0000 (06:09 -0700)]
scsi: qedf: Add additional checks when restarting an rport due to ABTS timeout
There are a couple of kernel cases when we restart a remote port due to
ABTS timeout that we need to handle:
1. Flush any outstanding ABTS requests when flushing I/Os so that we do
not hold up the eh_abort handler indefinitely causing process hangs.
2. Check if we are currently uploading a connection before issuing an
ABTS.
Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Chad Dupuis [Wed, 25 Apr 2018 13:09:00 +0000 (06:09 -0700)]
scsi: qedf: If qed fails to enable MSI-X fail PCI probe
Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Chad Dupuis [Wed, 25 Apr 2018 13:08:59 +0000 (06:08 -0700)]
scsi: qedf: Honor default_prio module parameter even if DCBX does not converge
Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Chad Dupuis [Wed, 25 Apr 2018 13:08:58 +0000 (06:08 -0700)]
scsi: qedf: Improve firmware debug dump handling
Get all firmware debug data instead of just a grc dump.
Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Saurav Kashyap [Wed, 25 Apr 2018 13:08:57 +0000 (06:08 -0700)]
scsi: qedf: Remove setting DCBX pending during soft context reset
PROBLEM DESCRIPTION:
According to the logs, STAG was changing and it was triggering soft
reset. In soft reset we used to virtual link down and up and also we
were disabling DCBx flag. Since this was virtual link flap, DCBx never
used to converge again.
SOLUTION:
Code change is to remove disabling DCBx flag from soft reset.
Signed-off-by: Saurav Kashyap <saurav.kashyap@cavium.com>
Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Chad Dupuis [Wed, 25 Apr 2018 13:08:56 +0000 (06:08 -0700)]
scsi: qedf: Add task id to kref_get_unless_zero() debug messages when flushing requests
Helps to corroborate which requests we can't get reference on and if
it's real bug or not.
Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Chad Dupuis [Wed, 25 Apr 2018 13:08:55 +0000 (06:08 -0700)]
scsi: qedf: Check if link is already up when receiving a link up event from qed
[mkp: typo]
Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Chad Dupuis [Wed, 25 Apr 2018 13:08:54 +0000 (06:08 -0700)]
scsi: qedf: Return request as DID_NO_CONNECT if MSI-X is not enabled
Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Chad Dupuis [Wed, 25 Apr 2018 13:08:53 +0000 (06:08 -0700)]
scsi: qedf: Release RRQ reference correctly when RRQ command times out
When an RRQ request times out the reference is not getting decremented
correctly as there are still ELS commands leftover when we flush any
pending I/Os during offload:
[ 281.788553] [0000:21:00.3]:[qedf_cmd_timeout:58]:4: ELS timeout, xid=0x96a.
...
[ 281.788553] [0000:21:00.3]:[qedf_cmd_timeout:58]:4: ELS timeout, xid=0x96a.
[ 281.788772] [0000:21:00.3]:[qedf_rrq_compl:182]:4: Entered.
[ 281.788774] [0000:21:00.3]:[qedf_rrq_compl:200]:4: rrq_compl: orig io =
ffffc90004c556f8, orig xid = 0x81b, rrq_xid = 0x96a, refcount=1
...
[ 331.448032] [0000:21:00.3]:[qedf_flush_els_req:1512]:4: Flushing ELS request xid=0x96a refcount=2.
The fix is to call kref_put on the rrq_req in case of timeout as the
timeout handler will call rrq_compl directly vs. a normal completion
where it is call from els_compl.
Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Chad Dupuis [Wed, 25 Apr 2018 13:08:52 +0000 (06:08 -0700)]
scsi: qedf: Honor priority from DCBX FCoE App tag
We currently hard code the priority in the 8021q tag to 3 for FCoE
traffic. The vast majority of the time this is fine but if the priority
is something else besides 3, any VLAN ID comparison either in the
non-offload path or offload path will fail and cause dropped frames
where none are expected.
Change the behavior so that the driver default is 3 if we do not get any
DCBX convergence.
If DCBX does converge, then set the FIP/FCoE priority in the following
manner:
1. If the qedf_default_prio modparam is set use that
2. If the DCBX FCoE priority is not in range (0..7) use 3
3. Use the DCBX FCoE priority we get in the driver's DCBX handler
Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Chad Dupuis [Wed, 25 Apr 2018 13:08:51 +0000 (06:08 -0700)]
scsi: qedf: Add dcbx_not_wait module parameter so we won't wait for DCBX convergence to start discovery
This module parameter is to work around cases where we do not receive
the DCBX handler notification from qed but discovery is still possible
if we send out a FIP VLAN request irregardless of the DCBX state.
[mkp: zeroday warning]
Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Chad Dupuis [Wed, 25 Apr 2018 13:08:50 +0000 (06:08 -0700)]
scsi: qedf: Sanity check FCoE/FIP priority value to make sure it's between 0 and 7
Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Chad Dupuis [Wed, 25 Apr 2018 13:08:49 +0000 (06:08 -0700)]
scsi: qedf: Add check for offload before flushing I/Os for target
We need to check that a fcport is offloaded before we try to flush any
requests. No doing so could lead to undefined results and most likely a
crash.
Fixes the oops:
[ 343.971886] [0000:42:00.3]:[qedf_execute_tmf:2070]:8: wait for tm_cmpl timeout!
[ 343.971933] BUG: unable to handle kernel paging request at
00000000000024a8
[ 343.971949] IP: [<
ffffffffa06b8cc6>] qedf_flush_active_ios+0x46/0x260 [qedf]
[ 343.971952] PGD
42c569067 PUD
4160fe067 PMD 0
[ 343.971954] Oops: 0000 [#1] SMP
[ 343.972008] Modules linked in: qedf(OEX) qed(OEX) bnx2i cnic fuse af_packet iscsi_ibft msr xfs intel_rapl sb_edac edac_core x86_pkg_temp_thermal bnx2x geneve intel_powerclamp vxlan coretemp ipmi_ssif ipmi_devintf kvm_intel kvm libiscsi joydev irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel tg3 ip6_udp_tunnel udp_tunnel mdio libcrc32c iTCO_wdt scsi_transport_iscsi uio drbg iTCO_vendor_support iscsi_boot_sysfs dcdbas(X) ipmi_si ansi_cprng aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper ptp pps_core pcspkr libphy lpc_ich mfd_core cryptd fjes wmi ipmi_msghandler button crc8 libfcoe libfc scsi_transport_fc mei_me mei shpchp processor acpi_pad btrfs xor hid_generic usbhid raid6_pq sd_mod sr_mod cdrom mgag200 crc32c_intel i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt
[ 343.972020] fb_sys_fops ttm ahci ehci_pci libahci ehci_hcd drm libata usbcore megaraid_sas usb_common sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua scsi_mod autofs4 [last unloaded: qedf]
[ 343.972022] Supported: Yes, External
[ 343.972026] CPU: 30 PID: 12777 Comm: sg_reset Tainted: G W OE X 4.4.73-5-default #1
[ 343.972027] Hardware name: Dell Inc. PowerEdge R720/0X3D66, BIOS 2.1.3 11/20/2013
[ 343.972029] task:
ffff88018dfc0e80 ti:
ffff88042bd7c000 task.ti:
ffff88042bd7c000
[ 343.972036] RIP: 0010:[<
ffffffffa06b8cc6>] [<
ffffffffa06b8cc6>] qedf_flush_active_ios+0x46/0x260 [qedf]
[ 343.972038] RSP: 0018:
ffff88042bd7fbe0 EFLAGS:
00010286
[ 343.972039] RAX:
0000000000000000 RBX:
ffff88042ce37800 RCX:
0000000000000400
[ 343.972040] RDX:
000000000000060e RSI:
ffffffffa06be830 RDI:
ffff8807e5072cc0
[ 343.972041] RBP:
0000000000001000 R08:
ffffffffa06bff4d R09:
ffff88018dd84580
[ 343.972042] R10:
000000000000018b R11:
0000000000000002 R12:
0000000000002003
[ 343.972043] R13:
0000000000000000 R14:
0000000000000000 R15:
ffff8807e5072cc0
[ 343.972046] FS:
00007fc1c8809700(0000) GS:
ffff88042fbc0000(0000) knlGS:
0000000000000000
[ 343.972048] CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
[ 343.972049] CR2:
00000000000024a8 CR3:
00000004236ec000 CR4:
00000000001406e0
[ 343.972050] Stack:
[ 343.972053]
504c78750607e154 ffffffff810a7d10 ffff88042ce37800 0000000000000010
[ 343.972055]
0000000000002003 ffff8807ff480c48 ffff8807e5072cc0 ffffc90004ec4ff8
[ 343.972057]
ffffffffa06b9b86 ffff880800000010 0000000000000282 ffff88042ce37800
[ 343.972058] Call Trace:
[ 343.972094] [<
ffffffffa06b9b86>] qedf_initiate_tmf+0x346/0x3e0 [qedf]
[ 343.972120] [<
ffffffffa000fa06>] scsi_try_bus_device_reset+0x26/0x40 [scsi_mod]
[ 343.972133] [<
ffffffffa001038e>] scsi_ioctl_reset+0x13e/0x260 [scsi_mod]
[ 343.972145] [<
ffffffffa000f416>] scsi_ioctl+0x136/0x3d0 [scsi_mod]
[ 343.972154] [<
ffffffff812ff6eb>] blkdev_ioctl+0x6bb/0x950
[ 343.972164] [<
ffffffff8123cfed>] block_ioctl+0x3d/0x40
[ 343.972170] [<
ffffffff81217e2d>] do_vfs_ioctl+0x2cd/0x4a0
[ 343.972186] [<
ffffffff81218074>] SyS_ioctl+0x74/0x80
[ 343.972193] [<
ffffffff8160916e>] entry_SYSCALL_64_fastpath+0x12/0x6d
[ 343.975285] DWARF2 unwinder stuck at entry_SYSCALL_64_fastpath+0x12/0x6d
Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Chad Dupuis [Wed, 25 Apr 2018 13:08:48 +0000 (06:08 -0700)]
scsi: qedf: Fix VLAN display when printing sent FIP frames
Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Chad Dupuis [Wed, 25 Apr 2018 13:08:47 +0000 (06:08 -0700)]
scsi: qedf: Add missing skb frees in error path
Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Chad Dupuis [Wed, 25 Apr 2018 13:08:46 +0000 (06:08 -0700)]
scsi: qedf: Increase the number of default FIP VLAN request retries to 60
Some configurations need more than 30 seconds to respond to a FIP VLAN
request so increase the default to 60 seconds.
Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>