review.tizen.org Git - platform/kernel/linux-starfive.git/log

scsi: qedf: Check if link is already up when receiving a link up event from qed

[mkp: typo]

Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qedf: Return request as DID_NO_CONNECT if MSI-X is not enabled

Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qedf: Release RRQ reference correctly when RRQ command times out

When an RRQ request times out the reference is not getting decremented
correctly as there are still ELS commands leftover when we flush any
pending I/Os during offload:

[  281.788553] [0000:21:00.3]:[qedf_cmd_timeout:58]:4: ELS timeout, xid=0x96a.
...
[  281.788553] [0000:21:00.3]:[qedf_cmd_timeout:58]:4: ELS timeout, xid=0x96a.
[  281.788772] [0000:21:00.3]:[qedf_rrq_compl:182]:4: Entered.
[  281.788774] [0000:21:00.3]:[qedf_rrq_compl:200]:4: rrq_compl: orig io = ffffc90004c556f8, orig xid = 0x81b, rrq_xid = 0x96a, refcount=1
...
[  331.448032] [0000:21:00.3]:[qedf_flush_els_req:1512]:4: Flushing ELS request xid=0x96a refcount=2.

The fix is to call kref_put on the rrq_req in case of timeout as the
timeout handler will call rrq_compl directly vs. a normal completion
where it is call from els_compl.

Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qedf: Honor priority from DCBX FCoE App tag

We currently hard code the priority in the 8021q tag to 3 for FCoE
traffic. The vast majority of the time this is fine but if the priority
is something else besides 3, any VLAN ID comparison either in the
non-offload path or offload path will fail and cause dropped frames
where none are expected.

Change the behavior so that the driver default is 3 if we do not get any
DCBX convergence.

If DCBX does converge, then set the FIP/FCoE priority in the following
manner:

1. If the qedf_default_prio modparam is set use that
2. If the DCBX FCoE priority is not in range (0..7) use 3
3. Use the DCBX FCoE priority we get in the driver's DCBX handler

Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qedf: Add dcbx_not_wait module parameter so we won't wait for DCBX convergence to start discovery

This module parameter is to work around cases where we do not receive
the DCBX handler notification from qed but discovery is still possible
if we send out a FIP VLAN request irregardless of the DCBX state.

[mkp: zeroday warning]

Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qedf: Sanity check FCoE/FIP priority value to make sure it's between 0 and 7

Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qedf: Add check for offload before flushing I/Os for target

We need to check that a fcport is offloaded before we try to flush any
requests.  No doing so could lead to undefined results and most likely a
crash.

Fixes the oops:

[  343.971886] [0000:42:00.3]:[qedf_execute_tmf:2070]:8: wait for tm_cmpl timeout!
[  343.971933] BUG: unable to handle kernel paging request at 00000000000024a8
[  343.971949] IP: [<ffffffffa06b8cc6>] qedf_flush_active_ios+0x46/0x260 [qedf]
[  343.971952] PGD 42c569067 PUD 4160fe067 PMD 0
[  343.971954] Oops: 0000 [#1] SMP
[  343.972008] Modules linked in: qedf(OEX) qed(OEX) bnx2i cnic fuse af_packet iscsi_ibft msr xfs intel_rapl sb_edac edac_core x86_pkg_temp_thermal bnx2x geneve intel_powerclamp vxlan coretemp ipmi_ssif ipmi_devintf kvm_intel kvm libiscsi joydev irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel tg3 ip6_udp_tunnel udp_tunnel mdio libcrc32c iTCO_wdt scsi_transport_iscsi uio drbg iTCO_vendor_support iscsi_boot_sysfs dcdbas(X) ipmi_si ansi_cprng aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper ptp pps_core pcspkr libphy lpc_ich mfd_core cryptd fjes wmi ipmi_msghandler button crc8 libfcoe libfc scsi_transport_fc mei_me mei shpchp processor acpi_pad btrfs xor hid_generic usbhid raid6_pq sd_mod sr_mod cdrom mgag200 crc32c_intel i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt
[  343.972020]  fb_sys_fops ttm ahci ehci_pci libahci ehci_hcd drm libata usbcore megaraid_sas usb_common sg dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua scsi_mod autofs4 [last unloaded: qedf]
[  343.972022] Supported: Yes, External
[  343.972026] CPU: 30 PID: 12777 Comm: sg_reset Tainted: G        W  OE    X 4.4.73-5-default #1
[  343.972027] Hardware name: Dell Inc. PowerEdge R720/0X3D66, BIOS 2.1.3 11/20/2013
[  343.972029] task: ffff88018dfc0e80 ti: ffff88042bd7c000 task.ti: ffff88042bd7c000
[  343.972036] RIP: 0010:[<ffffffffa06b8cc6>]  [<ffffffffa06b8cc6>] qedf_flush_active_ios+0x46/0x260 [qedf]
[  343.972038] RSP: 0018:ffff88042bd7fbe0  EFLAGS: 00010286
[  343.972039] RAX: 0000000000000000 RBX: ffff88042ce37800 RCX: 0000000000000400
[  343.972040] RDX: 000000000000060e RSI: ffffffffa06be830 RDI: ffff8807e5072cc0
[  343.972041] RBP: 0000000000001000 R08: ffffffffa06bff4d R09: ffff88018dd84580
[  343.972042] R10: 000000000000018b R11: 0000000000000002 R12: 0000000000002003
[  343.972043] R13: 0000000000000000 R14: 0000000000000000 R15: ffff8807e5072cc0
[  343.972046] FS:  00007fc1c8809700(0000) GS:ffff88042fbc0000(0000) knlGS:0000000000000000
[  343.972048] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  343.972049] CR2: 00000000000024a8 CR3: 00000004236ec000 CR4: 00000000001406e0
[  343.972050] Stack:
[  343.972053]  504c78750607e154 ffffffff810a7d10 ffff88042ce37800 0000000000000010
[  343.972055]  0000000000002003 ffff8807ff480c48 ffff8807e5072cc0 ffffc90004ec4ff8
[  343.972057]  ffffffffa06b9b86 ffff880800000010 0000000000000282 ffff88042ce37800
[  343.972058] Call Trace:
[  343.972094]  [<ffffffffa06b9b86>] qedf_initiate_tmf+0x346/0x3e0 [qedf]
[  343.972120]  [<ffffffffa000fa06>] scsi_try_bus_device_reset+0x26/0x40 [scsi_mod]
[  343.972133]  [<ffffffffa001038e>] scsi_ioctl_reset+0x13e/0x260 [scsi_mod]
[  343.972145]  [<ffffffffa000f416>] scsi_ioctl+0x136/0x3d0 [scsi_mod]
[  343.972154]  [<ffffffff812ff6eb>] blkdev_ioctl+0x6bb/0x950
[  343.972164]  [<ffffffff8123cfed>] block_ioctl+0x3d/0x40
[  343.972170]  [<ffffffff81217e2d>] do_vfs_ioctl+0x2cd/0x4a0
[  343.972186]  [<ffffffff81218074>] SyS_ioctl+0x74/0x80
[  343.972193]  [<ffffffff8160916e>] entry_SYSCALL_64_fastpath+0x12/0x6d
[  343.975285] DWARF2 unwinder stuck at entry_SYSCALL_64_fastpath+0x12/0x6d

Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qedf: Fix VLAN display when printing sent FIP frames

Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qedf: Add missing skb frees in error path

Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qedf: Increase the number of default FIP VLAN request retries to 60

Some configurations need more than 30 seconds to respond to a FIP VLAN
request so increase the default to 60 seconds.

Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qedf: Synchronize rport restarts when multiple ELS commands time out

If multiple ELS commands time out, such as aborts, they could all try to
restart the same rport and the same time.  This could mean multiple
multiple processes trying to clean up any outstanding commands or trying
to upload the same port.

Add a new flag (QEDF_RPORT_IN_RESET) and check other fcport state flags
before trying to reset the port.

Fixes the crash:

[17501.824701] ------------[ cut here ]------------
[17501.824733] kernel BUG at include/asm-generic/dma-mapping-common.h:65!
[17501.824760] invalid opcode: 0000 [#1] SMP
[17501.824781] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ses enclosure dm_service_time vfat fat sb_edac edac_core intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass joydev btrfs hpilo raid6_pq iTCO_wdt iTCO_vendor_support xor hpwdt ipmi_ssif sg crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul ioatdma lpc_ich glue_helper ablk_helper i2c_i801 shpchp cryptd ipmi_si pcspkr acpi_power_meter ipmi_devintf pcc_cpufreq dca wmi ipmi_msghandler dm_multipath nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sr_mod cdrom sd_mod
[17501.825119]  crc_t10dif crct10dif_generic mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm qedf(OE) drm libfcoe ahci qedi(OE) crct10dif_pclmul libfc libahci uio crct10dif_common crc32c_intel libiscsi libata scsi_transport_iscsi scsi_transport_fc tg3 qede(OE) scsi_tgt hpsa qed(OE) i2c_core ptp scsi_transport_sas pps_core iscsi_boot_sysfs dm_mirror dm_region_hash dm_log dm_mod
[17501.825292] CPU: 8 PID: 10531 Comm: kworker/u96:1 Tainted: G           OE  ------------   3.10.0-693.el7.x86_64 #1
[17501.825330] Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 06/02/2016
[17501.825372] Workqueue: fc_rport_eq fc_rport_work [libfc]
[17501.825395] task: ffff88101bca8000 ti: ffff881025278000 task.ti: ffff881025278000
[17501.825424] RIP: 0010:[<ffffffffc042def9>]  [<ffffffffc042def9>] qedf_unmap_sg_list.isra.15+0x89/0x90 [qedf]
[17501.825471] RSP: 0018:ffff88102527bb98  EFLAGS: 00010212
[17501.825493] RAX: ffff8800224eac00 RBX: ffffc9000cd05210 RCX: 0000000000001000
[17501.825520] RDX: 000000007e655e40 RSI: 0000000000001000 RDI: ffff88107fe3b098
[17501.826683] RBP: ffff88102527bba0 R08: ffffffff81a13200 R09: 0000000000000286
[17501.827747] R10: 0000000000000004 R11: 0000000000000005 R12: ffffc9000cd051b8
[17501.828804] R13: ffff881037640c28 R14: 0000000000000007 R15: ffffc9000cd05200
[17501.829850] FS:  0000000000000000(0000) GS:ffff88103fa00000(0000) knlGS:0000000000000000
[17501.830910] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[17501.831966] CR2: 00007f9b94005f38 CR3: 00000000019f2000 CR4: 00000000003407e0
[17501.833027] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[17501.834087] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[17501.835142] Stack:
[17501.836201]  ffff881033ddbb80 ffff88102527bc30 ffffffffc042f834 0000000000002710
[17501.837264]  ffff88102527bbd0 ffffffff8133d9dd ffffc9000cd052a0 ffff88102527bc30
[17501.838325]  ffffffff816a9c65 0000000000000001 ffff88101bca8000 ffffffff810c4810
[17501.839388] Call Trace:
[17501.840446]  [<ffffffffc042f834>] qedf_scsi_done+0x54/0x1d0 [qedf]
[17501.841504]  [<ffffffff8133d9dd>] ? list_del+0xd/0x30
[17501.842537]  [<ffffffff816a9c65>] ? wait_for_completion_timeout+0x125/0x140
[17501.843560]  [<ffffffff810c4810>] ? wake_up_state+0x20/0x20
[17501.844577]  [<ffffffffc0430311>] qedf_initiate_cleanup+0x2e1/0x310 [qedf]
[17501.845587]  [<ffffffffc04305fe>] qedf_flush_active_ios+0x10e/0x260 [qedf]
[17501.846612]  [<ffffffffc042892f>] qedf_cleanup_fcport+0x5f/0x370 [qedf]
[17501.847613]  [<ffffffffc04292d8>] qedf_rport_event_handler+0x398/0x950 [qedf]
[17501.848602]  [<ffffffff810cdc7c>] ? dequeue_entity+0x11c/0x5d0
[17501.849581]  [<ffffffff81098a2b>] ? __internal_add_timer+0xab/0x130
[17501.850555]  [<ffffffff810ce54e>] ? dequeue_task_fair+0x41e/0x660
[17501.851528]  [<ffffffffc03241a4>] fc_rport_work+0xf4/0x6c0 [libfc]
[17501.852490]  [<ffffffff810a881a>] process_one_work+0x17a/0x440
[17501.853446]  [<ffffffff810a94e6>] worker_thread+0x126/0x3c0

Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qla2xxx: Update driver version to 10.00.00.07-k

Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qla2xxx: Fix TMF and Multi-Queue config

For target mode, task management command is queued to specific cpu base
on where the SCSI command is residing. This prevent race condition of
task management command getting ahead of regular scsi command.

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qla2xxx: Prevent relogin loop by removing stale code

Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qla2xxx: Remove stale debug value for login_retry flag

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qla2xxx: Use predefined get_datalen_for_atio() inline function

- Uses predefine inline function to access add_cdb_len field in ATIO.

- Return SS_RESIDUAL_UNDER status when sending BUSY

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qla2xxx: Fix Inquiry command being dropped in Target mode

When a connection is established, the target core session may not be
created immediately. Current code will drop/terminate the command based
on the session state. This patch will return BUSY status for any
commands arriving on wire before the session is created.

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qla2xxx: Move GPSC and GFPNID out of session management

Move GPSC & GFPNID commands out of session management to reduce time lag
in reporting the session state to remote port. These commands are not
essential when it comes to maintaining the rport state. Delay sending
these commands after rport state is set to Online.

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qla2xxx: Reduce redundant ADISC command for RSCNs

For each RSCN that triggers a rescan of the fabric, ADISC is used to
revalidate an existing session. If the RSCN is not affecting all
existing sessions, then driver should not send redundant ADISC for all
existing sessions.

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qla2xxx: Delete session for nport id change

This patch fixes regression introduced by commit a4239945b8ad ("scsi:
qla2xxx: Add switch command to simplify fabric discovery") by scheduling
session deletion when Nport ID changes.

[mkp: clarified commit]

Fixes: a4239945b8ad ("scsi: qla2xxx: Add switch command to simplify fabric discovery")
Cc: <stable@vger.kernel.org>
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qla2xxx: Fix Rport and session state getting out of sync

This patch fixes rport state and session state getting out of sync.

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: qla2xxx: Fix sending ADISC command for login

This patch fixes login_retry login for ADISC command.

when login_retry count reaches 0, further attempt to send ADISC command
is ignored by the code. Remove this redundant login_retry count check
from qla24xx_fcport_handle_login()

[mkp: fix typo]

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: mpt3sas: Update driver version "25.100.00.00"

Update driver version to match OOB/internal driver version.

Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com>
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: mpt3sas: fix possible memory leak.

In ioctl exit path driver refers ioc_list to free memory associated with
diag buffers and event_log pointer used to save events by driver.
If ctl_exit() func is called after unregistering driver, then ioc_list will
be empty and hence driver will not be able to free the allocated memory
which in turn causes memory leak.
So call ctl_exit() function before unregistering mpt3sas driver.

Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com>
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: mpt3sas: For NVME device, issue a protocol level reset

1) Manufacturing Page 11 contains parameters to control internal
   firmware behavior. Based on AddlFlags2 field FW/Driver behaviour can
   be changed, (flag tm_custom_handling is used for this)

a) For PCIe device, protocol level reset should be used if flag
   tm_custom_handling is 0.  Since Abort Task Set, LUN reset and Target
   reset will result in a protocol level reset. Drivers should issue
   only one type of this reset, if that fails then it should escalate to
   a controller reset (diag reset/OCR).

b) If the driver has control over the TM reset timeout value, then
   driver should use the value exposed in PCIe Device Page 2 for pcie
   device (field ControllerResetTO).

Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com>
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: mpt3sas: Update MPI Headers

Update MPI Files to support protocol level reset for NVMe device.

Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com>
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: mpt3sas: Report Firmware Package Version from HBA Driver.

Added function _base_display_fwpkg_version, which sends FWUpload request
to pull FW package version from FW Image Header. Now driver prints FW
package version in addition to FW version if the PackageVersion is
valid.

Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com>
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: mpt3sas: Cache enclosure pages during enclosure add.

In function _scsih_add_device, for each device connected to an
enclosure, driver reads the enclosure page(To get details like enclosure
handle, enclosure logical ID, enclosure level etc.)

With this patch, instead of reading enclosure page everytime, driver
maintains a list for enclosure device(During enclosure add event,
enclosure device is added to the list and removed from the list on
delete events) and uses the enclosure page from the list.

Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com>
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: mpt3sas: Allow processing of events during driver unload.

Events were not processed during driver unload, hence unloading of
driver doesn't complete when drives are disconnected while unloading of
driver. So don't block events in ISR path, i,e., remove the flag
ioc->remove_host so that events are getting processed during driver
unload. Thus allowing driver unload to complete by processing drive
removal events during driver unload.

Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com>
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: mpt3sas: Increase event log buffer to support 24 port HBA's.

For 24 port HBA's events generated by IOC are more in certain cases and
the current circular buffer may be overwritten.Hence increased the event
log buffer to accommodate more events.

Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com>
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: mpt3sas: Added support for SAS Device Discovery Error Event.

The SAS Device Discovery Error Event is sent to the host when discovery
for a particular device is failed during discovery, even after maximum
retries by the IOC.

Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com>
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: mpt3sas: Enhanced handling of Sense Buffer.

Enhanced DMA allocation for Sense Buffer, if the allocation does not fit
within same 4GB.Introduced is_MSB_are_same function to check if allocted
buffer within 4GB range or not.

Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com>
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: mpt3sas: Optimize I/O memory consumption in driver.

For every IO, memory of PAGE size is allocated for handling NVMe native
PRPS. And in addition to that for every IO (chains need per IO * chain
buffer size, e.g. 38 * 128byte) amount of memory is allocated for chain
buffers.

However, at any point of time; the IO request can be for NVMe target
device (where PRP's page is used for framing PRP's) or can be for SCSI
target device (where chain buffers are used for framing chain
SGE's). This patch modifies the driver to reuse same pre-allocated PRP
page buffers as a chain buffer for IO's targeted for SCSI target
devices. No need to allocate separate buffers for chain SGE's buffers.

Suppose if the number of chain buffers need for IO doesn't fit in the
PRP Page size then driver maintain's separate buffers for those extra
chain buffers that exceeds the PRP page size. For example consider PRP
page size as 4K and chain buffer size as 128 bytes, then number of chain
buffers that can fit in PRP page is 4096/128 => 32. if the number of
chain buffer need per IO exceeds 32; for example consider number of
chains need per IO is 36 then for remaining 4 chain buffer's driver
allocates them individual.

Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com>
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: mpt3sas: Lockless access for chain buffers.

Introduces Chain lookup table/tracker and implements accessing chain
buffer using smid. Removed link list based access of chain buffer which
requires lock and allocated as many chains needed.

Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com>
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: mpt3sas: Pre-allocate RDPQ Array at driver boot time.

Instead of allocating RDPQ array (This stores the address's of each RDPQ
pools) at run time, now it will be allocated once during driver load
time and same will be reused during host reset operation also (instead
of allocating & freeing this buffer on the fly during every host reset
operation) and then freed during driver unload.

Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com>
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: mpt3sas: Bug fix for big endian systems.

This patch fixes sparse warnings and bugs on big endian systems.

Signed-off-by: Chaitra P B <chaitra.basappa@broadcom.com>
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: mpt3sas: fix spelling mistake: "disbale" -> "disable"

Trivial fix to spelling mistake in module parameter description text

[mkp: applied by hand]

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: megaraid_sas: fix spelling mistake: "disbale" -> "disable"

Trivial fix to spelling mistake in module parameter description text

[mkp: applied by hand]

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: esas2r: fix spelling mistake: "asynchromous" -> "asynchronous"

Trivial fix to spelling mistake in module description text

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: isci: remove redundant check on in_connection_align_insertion_frequency

The sanity check on u->in_connection_align_insertion_frequency is being
performed twice and hence the first check can be removed since it is
redundant. Cleans up cppcheck warning:

drivers/scsi/ibmvscsi/ibmvscsi.c:1711: (warning) Identical inner 'if'
condition is always true.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: a100u2w: Use module_pci_driver

Remove boilerplate code by using macro module_pci_driver.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: wd719x: Use module_pci_driver

Remove boilerplate code by using macro module_pci_driver.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: am53c974: Use module_pci_driver

Remove boilerplate code by using macro module_pci_driver.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: scsi_transport_sas: don't bounce highmem pages for the smp handler

All three instance of ->smp_handler deal with highmem backed requests
just fine.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ips: fix firmware timestamps for 32-bit

do_gettimeofday() is deprecated since it will stop working in 2038 on
32-bit platforms, leading to incorrect times passed to the firmware.
On 64-bit platforms the current code appears to be fine, as the
calculation passes an 8-bit century number into the firmware that can
represent times long in the future (possibly until 25599).

Using ktime_get_real_seconds() to get a 64-bit seconds value and
time64_to_tm() to convert it into the firmware format greatly simplifies
the ips timekeeping code, makes 32-bit and 64-bit behave the same way
here, and gets us closer to removing the deprecated interfaces.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: esas2r: use ktime_get_real_seconds()

do_gettimeofday() is deprecated because of the y2038 overflow. Here, we
use the result to pass into a 32-bit field in the firmware, which still
risks an overflow, but if the firmware is written to expect unsigned
values, it can at least last until y2106, and there is not much we can
do about it.

This changes do_gettimeofday() to ktime_get_real_seconds(), which at
least simplifies the code a bit, and avoids the deprecated
interface. I'm adding a comment about the overflow to document what
happens.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: mvumi: Using module_pci_driver

Remove boilerplate code by using macro module_pci_driver.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: target: add driver-api document

Add a driver-api document for target/iSCSI interfaces.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
To: "Nicholas A. Bellinger" <nab@linux-iscsi.org>
Cc: linux-scsi@vger.kernel.org
Cc: target-devel@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: target: target_core_user.[ch]: convert comments into DOC:

Make documentation on target-supported userspace-I/O design be
usable by kernel-doc by using "DOC:". This is used in the driver-api
Documentation chapter.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
To: "Nicholas A. Bellinger" <nab@linux-iscsi.org>
Cc: linux-scsi@vger.kernel.org
Cc: target-devel@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: target: target_core_transport.c: enable+fix kernel-doc

For exported functions that already have near-kernel-doc notation,
fix them to begin with "/**" and make a few corrections so that they
don't have any kernel-doc warnings.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
To: "Nicholas A. Bellinger" <nab@linux-iscsi.org>
Cc: linux-scsi@vger.kernel.org
Cc: target-devel@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: target: target_core_transport.c: fix kernel-doc warnings

Correct a function parameter's name to eliminate kernel-doc warnings
in drivers/target/target_core_transport.c.

Fixes these kernel-doc warnings: (tested by adding these files to a new
target.rst documentation file)

../drivers/target/target_core_transport.c:1671: warning: No description found for parameter 'fabric_tmr_ptr'
../drivers/target/target_core_transport.c:1671: warning: Excess function parameter 'fabric_context' description in 'target_submit_tmr'

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
To: "Nicholas A. Bellinger" <nab@linux-iscsi.org>
Cc: linux-scsi@vger.kernel.org
Cc: target-devel@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: st: Replace GFP_ATOMIC with GFP_KERNEL in new_tape_buffer

new_tape_buffer() is never called in atomic context. new_tape_buffer()
is only called by st_probe(), which is only set as ".probe" in struct
scsi_driver.

Despite never getting called from atomic context, new_tape_buffer()
calls kzalloc() with GFP_ATOMIC, which does not sleep for allocation.
GFP_ATOMIC is not necessary and can be replaced with GFP_KERNEL, which
can sleep and improve the possibility of sucessful allocation.

This is found by a static analysis tool named DCNS written by myself.
And I also manually check it.

Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: st: Replace GFP_ATOMIC with GFP_KERNEL in st_probe

st_probe() is never called in atomic context. st_probe() is only set as
".probe" in struct scsi_driver.

Despite never getting called from atomic context, st_probe() calls
kzalloc() with GFP_ATOMIC, which does not sleep for allocation.
GFP_ATOMIC is not necessary and can be replaced with GFP_KERNEL, which
can sleep and improve the possibility of sucessful allocation.

This is found by a static analysis tool named DCNS written by myself.
And I also manually check it.

Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: devinfo: BLIST_RETRY_ASC_C1 for Fujitsu ETERNUS

On Fujitsu ETERNUS systems, sense code ABORTED COMMAND with ASC/Q C1/01
is used to indicate temporary condition where the storage-internal path
to a target is switched from one controller to another. SCSI commands
that return with this error code must be retried unconditionally
(i.e. without the "maybe_retry" logic in scsi_decide_disposition);
otherwise dm-multipath might initiate a failover from a healthy path
e.g. for REQ_FAILFAST_DEV commands.

Introduce a new blist flag for this case.

[mkp: applied by hand]

Signed-off-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: devinfo: add BLIST_RETRY_ITF for EMC Symmetrix

EMC Symmetrix returns 'internal target error' for a variety of
conditions, most of which will be transient. So we should always retry
it, even with failfast set. Otherwise we'd get spurious path flaps with
multipath.

Signed-off-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: devinfo: warn on undefined blist flags

Warn if a device (or the user) sets blist flags which are unknown
or have been removed. This should enable us to reuse freed blist
bits in later releases.

Signed-off-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: devinfo: change blist_flag_t to 64bit

Space for SCSI blist flags is gradually running out. Change the type to
__u64 and fix a checkpatch complaint about symbolic mode flags in
scsi_devinfo.c.

Make checkpatch happy by replacing simple_strtoul() with kstrtoull().

Signed-off-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: devinfo: use const_ilog2 for array indices

Use the just introduced const_ilog2() macro to avoid sparse errors.

Signed-off-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: ilog2: create truly constant version for sparse

Sparse emits errors about ilog2() in array indices because of the use of
__ilog2_32() and __ilog2_64(), rightly so
(https://www.spinics.net/lists/linux-sparse/msg03471.html).

Create a const_ilog2() variant that works with sparse for this scenario.

(Note: checkpatch.pl complains about missing parentheses, but that
appears to be a false positive. I can get rid of the warning simply by
inserting whitespace, making checkpatch "see" the whole macro).

Signed-off-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: storvsc: Select channel based on available percentage of ring buffer to write

This is a best effort for estimating on how busy the ring buffer is for
that channel, based on available buffer to write in percentage. It is
still possible that at the time of actual ring buffer write, the space
may not be available due to other processes may be writing at the time.

Selecting a channel based on how full it is can reduce the possibility
that a ring buffer write will fail, and avoid the situation a channel is
over busy.

Now it's possible that storvsc can use a smaller ring buffer size
(e.g. 40k bytes) to take advantage of cache locality.

Signed-off-by: Long Li <longli@microsoft.com>
Reviewed-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: target: Change return type to vm_fault_t

Use new return type vm_fault_t for fault handler in struct
vm_operations_struct.

Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: target: prefer dbroot of /etc/target over /var/target

The target database root directory, dbroot, has defaulted to /var/target
for a while, but its main client, targetcli-fb, has been moving it to
/etc/target for quite some time. With the plethora of target drivers now
appearing, it has become more difficult to initialize this attribute
before use by any child drivers.

If the directory /etc/target exists, use that as the DB root. Otherwise,
fall back to using /var/target.

The ability to override this dbroot attribute still exists via sysfs.

Signed-off-by: Lee Duncan <lduncan@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: mptfc: fix spelling mistake in macro names

Rename macros MPI_FCPORTPAGE0_SUPPORT_SPEED_UKNOWN and
MPI_FCPORTPAGE0_CURRENT_SPEED_UKNOWN to add in missing N in UNKNOWN

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: sd_zbc: Let the SCSI core handle ILLEGAL REQUEST / ASC 0x21

scsi_io_completion() translates the sense key ILLEGAL REQUEST / ASC 0x21 into
ACTION_FAIL. That means that setting cmd->allowed to zero in sd_zbc_complete()
for this sense code / ASC combination is not necessary. Hence remove the code
that resets cmd->allowed from sd_zbc_complete().

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Damien Le Moal <damien.lemoal@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: sd_zbc: Change the type of the ZBC fields into u32

This patch does not change any functionality but makes it clear that it is on
purpose that these fields are 32 bits wide.

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Damien Le Moal <damien.lemoal@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: storsvc: don't set a bounce limit

The default already is to never bounce, so the call is a no-op.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: iscsi_tcp: don't set a bounce limit

The default already is to never bounce, so the call is a no-op.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: sg: Change return type to vm_fault_t

Use new return type vm_fault_t for fault handler in struct
vm_operations_struct.

Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: zorro_esp: New driver for Amiga Zorro NCR53C9x boards

New combined SCSI driver for all ESP based Zorro SCSI boards for m68k Amiga.

Code largely based on board specific parts of the old drivers (blz1230.c,
blz2060.c, cyberstorm.c, cyberstormII.c, fastlane.c which were removed after
the 2.6 kernel series for lack of maintenance) with contributions by Tuomas
Vainikka (TCQ bug tests and workaround) and Finn Thain (TCQ bugfix by use of
PIO in extended message in transfer).

New Kconfig option and Makefile entries for new Amiga Zorro ESP SCSI driver
included in this patch.

Use DMA transfers wherever possible, with board-specific DMA set-up functions
copied from the old driver code. Three byte reselection messages do appear to
cause DMA timeouts. So wire up a PIO transfer routine for these
instead. esp_reselect_with_tag explicitly sets
esp->cmd_block_dma as target address for the message bytes but PIO
requires a virtual address. Substiute kernel virtual address
esp->cmd_block in PIO transfer call if DMA address is esp->cmd_block_dma
and phase is message in.

PIO code taken from mac_esp.c where the reselection timeout issue was debugged
and fixed first, with minor macro and function rename.

Signed-off-by: Michael Schmitz <schmitzmic@gmail.com>
Reviewed-by: Finn Thain <fthain@telegraphics.com.au>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Christian T. Steigies <cts@debian.org>
Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: scsi_dh: replace too broad "TP9" string with the exact models

SGI/TP9100 is not an RDAC array:
^^^
https://git.opensvc.com/gitweb.cgi?p=multipath-tools/.git;a=blob;f=libmultipath/hwtable.c;h=88b4700beb1d8940008020fbe4c3cd97d62f4a56;hb=HEAD#l235

This partially reverts commit 35204772ea03 ("[SCSI] scsi_dh_rdac :
Consolidate rdac strings together")

[mkp: fixed up the new entries to align with rest of struct]

Cc: NetApp RDAC team <ng-eseries-upstream-maintainers@netapp.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: SCSI ML <linux-scsi@vger.kernel.org>
Cc: DM ML <dm-devel@redhat.com>
Signed-off-by: Xose Vazquez Perez <xose.vazquez@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: devinfo: delete duplicate "Generic"/"USB Storage-SMC" device

The revision field is currently unused by the devinfo pattern matching
code. Combine two blacklist entries into one.

$ egrep "Generic.*Storage-SMC" /proc/scsi/device_info
'Generic' 'USB Storage-SMC' 0x402
'Generic' 'USB Storage-SMC' 0x402

[mkp: tweaked commit desc]

Cc: Hannes Reinecke <hare@suse.de>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
Cc: SCSI ML <linux-scsi@vger.kernel.org>
Signed-off-by: Xose Vazquez Perez <xose.vazquez@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: update driver version to 12.0.0.2

Update the driver version to 12.0.0.2

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Correct missing remoteport registration during link bounces

Remote port disappearance/reappearances would cause a series of RSCN
events to be delivered to the driver. During the resulting GID_FT
handling, the driver clears the fc4 settings on the remote port, which
makes it skip registration. As such, the nvme associations eventually
fail and return io errors to the applications.

Correct by not clearng the nlp_fc4_types for all nodes in
lpfc_issue_gidft. Instead, when the GID_FT response is handled, clear
the nlp_fc4_types of FCP and NVME prior to evaluating the fc4_type
returned by the GID_FT response. This approach leaves "skipped" nodes
with their nlp_fc4_types intacted.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix NULL pointer reference when resetting adapter

Points referencing local port structures didn't accommodate cases where
the localport may not be registered yet.

Add NULL pointer checks to logic.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix nvme remoteport registration race conditions

On tests adding and removing a remote port, calls to nvme_info would
eventually show fewer target ports discovered than were present in the
san. Additionally, the following error messages were seen:

6031 RemotePort Registration failed err: -116, DID x471301

There is a race condition that exists between the driver and the nvme
transport on remote port unregister vs the confirmed deletion. It's
possible that the driver may rediscover the remote port and reregister
the remote port before a prior unregister delete callback was made (as
it rebinded to the prior remoteport structure). However, the driver was
coded to expect the callback before seeing the remote port again thus a
new registration. The logic results in the driver having an invalid
remoteport pointer set.

Correct by tracking when waiting for the delete callback. In cases where
the ndlp remoteport pointer is updated, it is only cleared when the wait
has not been superceded by a prior registration.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix driver not recovering NVME rports during target link faults

During target-side port faults, the driver would not recover all target
port logins. This resulted in a loss of nvme device discovery.

The driver is coded to wait for all GID_FT requests to complete before
restarting discovery. A fault is seen where the outstanding GIT_FT
counts are not properly decremented, thus discovery would never
start. Another fault was found in the clearing of the gidft_inp counter
that would be skipped in this condition. And a third fault found with
lpfc_nvme_register_port that would remove a reverence on the ndlp which
then allows a node swap on a port address change to prematurely remove
the reference and release the ndlp.

The following changes are made:

- Correct the decrementing of the outstanding GID_FT counters.

- In RSCN handling, no longer zero the counter before calling to issue
another GID_FT.

- No longer remove the reference on the dlp when the ndlp->nrport value
is not yet null.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix WQ/CQ creation for older asic's.

The patch to enlarge WQ/CQ creation keys off of an adapter response that
indicates support for the larger values. Older adapters return an
incorrect response and are limited in size. Thus the adapters fail the
WQ creation steps.

Augment the WQ sizing checks with a check on the older adapter types and
limit them to the restricted sizes.

Fixes: c176ffa0841c ("scsi: lpfc: Increase CQ and WQ sizes for SCSI")
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix NULL pointer access in lpfc_nvme_info_show

After making remoteport unregister requests, the ndlp nrport pointer was
stale.

Track when waiting for waiting for unregister completion callback and
adjust nldp pointer assignment. Add a few safety checks for NULL
pointer values.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix lingering lpfc_wq resource after driver unload

After driver unloads, lpfc_wq remains active. The destroy_workqueue
calls were not being made in driver unload. Additionally, SLI3 is
allocating lpfc_wq resources, but never uses it.

Make the destroy_workqueue calls on driver unload. Modify the SLI3 code
path no longer allocate lpfc_wq resources.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix Abort request WQ selection

When running loads that generated aborts, io errors where seen. Turns
out the abort requests where not placed on the proper WQ resulting in
the errors. Closer inspection inspection of this error also showed
improper spinlock api use.

Correct the WQ selection policy for the abort requests. Correct
spin_lock/spin_lock_irq/spin_lock_irqsave usage.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Enlarge nvmet asynchronous receive buffer counts

Under large io load, the current sizing of asynchronous buffer counts
could be exceeded, indicated by a 2885 log message:

  2885 Port Status Event: port status reg 0x81800000, port smphr
      reg 0xc000, error 1=0x52004a01, error 2=0x0

Enlarge the async receive queue size.  Allow for a configurable number
of buffers to be posted to each RQ, using the new attribute
lpfc_nvmet_mrq_post.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Add per io channel NVME IO statistics

When debugging various issues, per IO channel IO statistics were useful
to understand what was happening. However, many of the stats were on a
port basis rather than an io channel basis.

Move statistics to an io channel basis.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Correct target queue depth application changes

The max_scsicmpl_time parameter can be used to perform scsi cmd queue
depth mgmt based on io completion time: the queue depth is reduced to
make completion time shorter. However, as soon as an io completes and
the completion time is within limits, the code immediately bumps the
queue depth limit back up to the target queue depth. Thus the procedure
restarts, effectively limiting the usefulness of adjusting queue depth
to help completion time.

This patch makes the following changes:

- Removes the code at io completion that resets the queue depth as soon
   as within limits.

- As the code removed was where the target queue depth was first
   applied, change target queue depth application so that it occurs when
   the parameter is changed.

- Makes target queue depth a standard parameter: both a module
   parameter and a sysfs parameter.

- Optimizes the command pending count by using atomics rather than
   locks.

- Updates the debugfs nodelist stats to allow better debugging of
   pending command counts.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: lpfc: Fix multiple PRLI completion error path

Nodelist entry for SCSI array ends up in UNMAPPED state. This is due to
illegal discovery State machine transition because of two PRLIs and the
first one failing with LS_RJT. Also, the error path was designed
assuming the PRLIs complete in the order they were sent, FCP first, then
NVME. In a failing case, the array thinks about the first PRLI (FCP),
but issues LS_RJT for the 2nd PRLI immediately.

Fix PRLI completion error path for the ordering expectation. Ensure the
discovery state machine update is not set until all outstanding PRLIs
are complete.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: megaraid_sas: driver version upgrade

Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: megaraid_sas: Increase timeout by 1 sec for non-RAID fastpath IOs

Hardware could time out Fastpath IOs one second earlier than the timeout
provided by the host.

For non-RAID devices, driver provides timeout value based on OS provided
timeout value. Under certain scenarios, if the OS provides a timeout
value of 1 second, due to above behavior hardware will timeout
immediately.

Increase timeout value for non-RAID fastpath IOs by 1 second.

Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: megaraid_sas: Use zeroing memory allocator than allocator/memset

Use pci_zalloc_consistent for allocating zeroed memory and remove
unnecessary memset function.

Done using Coccinelle.
Generated by: scripts/coccinelle/api/alloc/kzalloc-simple.cocci

Suggested-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Himanshu Jha <himanshujha199640@gmail.com>
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: netvsc: Use the vmbus function to calculate ring buffer percentage

In Vmbus, we have defined a function to calculate available ring buffer
percentage to write.

Use that function and remove netvsc's private version.

[mkp: typo]

Signed-off-by: Long Li <longli@microsoft.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: vmbus: Add function to report available ring buffer to write in total ring size percentage

Netvsc has a function to calculate how much ring buffer in percentage is
available to write. This function is also useful for storvsc and other
vmbus devices.

Define a similar function in vmbus to be used by other vmbus devices.

Signed-off-by: Long Li <longli@microsoft.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: libsas: add transport class for ATA devices

Now ata devices attached with sas controller do not have transport
class, so that we can not see any information of these ata devices in
/sys/class/ata_port(or ata_link or ata_device).

Add transport class for the ata devices attached with sas controller.
The /sys/class directory will show the infomation of the ata devices
as follows:

localhost:/sys/class # ls ata*
ata_device:
dev1.0  dev2.0

ata_link:
link1  link2

ata_port:
ata1  ata2

No functional change of the device scanning and io path. The ata
transport class was deleted when destroying the sas devices.

Signed-off-by: Jason Yan <yanaijie@huawei.com>
CC: Dan Williams <dan.j.williams@intel.com>
CC: Tejun Heo <tj@kernel.org>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hisi_sas: remove some unneeded structure members

This patch removes unneeded structure elements:

- hisi_sas_phy.dev_sas_addr: only ever written
- Also remove associated function which writes it,
hisi_sas_init_add().

- hisi_sas_device.attached_phy: only ever written
- Also remove code to set it in hisi_sas_dev_found()

Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hisi_sas: print device id for errors

When we find an erroneous slot completion, to help aid debugging add the
device index to the current debug log.

Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hisi_sas: check IPTT is valid before using it for v3 hw

There is a bug of v3 hw development version. When AXI error happen, hw
may return an abnormal CQ that IPTT value is 0xffff. This will cause
IPTT out-of-bounds reference.

This patch adds a check of IPTT in cq_tasklet_v3_hw() and discards
invalid slot. This workaround scheme is just to enhance fault-tolerance
of the driver. So, we will apply this scheme for all version of v3 hw,
although release version has fixed this SoC bug.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hisi_sas: consolidate command check in hisi_sas_get_ata_protocol()

Currently we check the fis->command value in 2 locations in
hisi_sas_get_ata_protocol() switch statement. Fix this by consolidating
the check for fis->command value to 1 location only.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hisi_sas: use dma_zalloc_coherent()

This is a warning coming from Coccinelle, and need to use new interface
dma_zalloc_coherent() instead of dma_alloc_coherent()/memset().

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hisi_sas: delete timer when removing hisi_sas driver

Delete timer for v1 and v3 hw when removing hisi_sas driver.

Signed-off-by: Xiang chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hisi_sas: update RAS feature for later revision of v3 HW

There is an modification for later revision of v3 hw. More HW errors are
reported through RAS interrupt. These errors were originally reported
only through MSI.

When report to RAS, some combinations are done to port AXI errors and
FIFO OMIT errors. For example, each port has 4 AXI errors, and they are
combined to one when report to RAS.

This patch does two things:

1. Enable RAS interrupt of these errors and handle them in PCI
error handlers.

2. Disable MSI interrupts of these errors for this later revision hw.

Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: hisi_sas: make SAS address of SATA disks unique

When directly connected with SATA disks in different SAS cores, fill SAS
address with scsi_host's id to make it's fake SAS address unique.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: cxlflash: Handle spurious interrupts

The following Oops can occur when there is heavy I/O traffic and the host is
reset by a tool such as sg_reset.

[c000200fff3fbc90] c00800001690117c process_cmd_doneq+0x104/0x500
[cxlflash] (unreliable)
[c000200fff3fbd80] c008000016901648 cxlflash_rrq_irq+0xd0/0x150 [cxlflash]
[c000200fff3fbde0] c000000000193130 __handle_irq_event_percpu+0xa0/0x310
[c000200fff3fbea0] c0000000001933d8 handle_irq_event_percpu+0x38/0x90
[c000200fff3fbee0] c000000000193494 handle_irq_event+0x64/0xb0
[c000200fff3fbf10] c000000000198ea0 handle_fasteoi_irq+0xc0/0x230
[c000200fff3fbf40] c00000000019182c generic_handle_irq+0x4c/0x70
[c000200fff3fbf60] c00000000001794c __do_irq+0x7c/0x1c0
[c000200fff3fbf90] c00000000002a390 call_do_irq+0x14/0x24
[c000200e5828fab0] c000000000017b2c do_IRQ+0x9c/0x130
[c000200e5828fb00] c000000000009b04 h_virt_irq_common+0x114/0x120

When a context is reset, the pending commands are flushed and the AFU is
notified. Before the AFU handles this request there could be command
completion interrupts queued to PHB which are yet to be delivered to the
context. In this scenario, a context could receive an interrupt for a command
that has been flushed, leading to a possible crash when the memory for the
flushed command is accessed.

To resolve this problem, a boolean will indicate if the hardware queue is
ready to process interrupts or not. This can be evaluated in the interrupt
handler before proessing an interrupt.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

scsi: cxlflash: Remove commmands from pending list on timeout

The following Oops can occur if an internal command sent to the AFU does not
complete within the timeout:

[c000000ff101b810] c008000016020d94 term_mc+0xfc/0x1b0 [cxlflash]
[c000000ff101b8a0] c008000016020fb0 term_afu+0x168/0x280 [cxlflash]
[c000000ff101b930] c0080000160232ec cxlflash_pci_error_detected+0x184/0x230
[cxlflash]
[c000000ff101b9e0] c00800000d95d468 cxl_vphb_error_detected+0x90/0x150[cxl]
[c000000ff101ba20] c00800000d95f27c cxl_pci_error_detected+0xa4/0x240 [cxl]
[c000000ff101bac0] c00000000003eaf8 eeh_report_error+0xd8/0x1b0
[c000000ff101bb20] c00000000003d0b8 eeh_pe_dev_traverse+0x98/0x170
[c000000ff101bbb0] c00000000003f438 eeh_handle_normal_event+0x198/0x580
[c000000ff101bc60] c00000000003fba4 eeh_handle_event+0x2a4/0x338
[c000000ff101bd10] c0000000000400b8 eeh_event_handler+0x1f8/0x200
[c000000ff101bdc0] c00000000013da48 kthread+0x1a8/0x1b0
[c000000ff101be30] c00000000000b528 ret_from_kernel_thread+0x5c/0xb4

When an internal command times out, the command buffer is freed while it is
still in the pending commands list of the context. This corrupts the list and
when the context is cleaned up, a crash is encountered.

To resolve this issue, when an AFU command or TMF command times out, the
command should be deleted from the hardware queue pending command list before
freeing the buffer.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>