We used to use the wrong type of integer in 'zfcp_fsf_req_send()' to cache
the FSF request ID when sending a new FSF request. The cached ID is needed
when sending fails and we have to remove the request from our internal hash
table again (so we don't keep an invalid reference and use it when we free
the request again).

In 'zfcp_fsf_req_send()' we used to cache the ID as 'int' (signed and 32
bits wide), but the rest of the zfcp code (and the firmware specification)
handles the ID as 'unsigned long'/'u64' (unsigned and 64 bits wide [s390x
ELF ABI]). For one, this has the obvious problem that when the ID grows
past 32 bits (this can happen reasonably fast) it is truncated to 32 bits
when it is stored in the cache variable, and so no longer matches the
original ID. The second, less obvious problem is that even when the
original ID has not yet grown past 32 bits, as soon as the 32nd bit is set
in the original ID (0x80000000 = 2'147'483'648) we will have a mismatch
when we cast it back to 'unsigned long'. As the cached variable is of a
signed type, the compiler will choose a sign-extending instruction to load
the 32 bit variable into a 64 bit register (e.g.: 'lgf %r11,188(%r15)'). So
once we pass the cached variable into 'zfcp_reqlist_find_rm()' to remove
the request again, all the leading zeros will be flipped to ones to extend
the sign, and the value won't match the original ID anymore (this has been
observed in practice).
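
Both mismatches can be reproduced in plain userspace C. The following is
only a minimal sketch, not zfcp code: the helper name
'roundtrip_through_int()' is hypothetical, and it assumes a compiler such
as GCC or Clang that keeps the low 32 bits when narrowing to a signed
type.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of zfcp): store a 64 bit request ID in a
 * signed 32 bit variable, as the old cache variable did, then widen it
 * again the way it is widened for the hash table lookup.
 */
static uint64_t roundtrip_through_int(uint64_t req_id)
{
	/*
	 * Narrowing an out-of-range value to a signed type is
	 * implementation-defined in C; GCC and Clang keep the low 32 bits.
	 */
	int32_t cached = (int32_t)req_id;

	/*
	 * Converting a negative signed value to a wider unsigned type
	 * sign-extends it: the upper 32 bits all become ones.
	 */
	return (uint64_t)cached;
}
```

Under these assumptions, 0x7fffffff survives the round trip unchanged,
0x80000000 comes back as 0xffffffff80000000 (sign extension), and
0x100000001 comes back as 0x1 (truncation).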

If we can't successfully remove the request from the hash table again after
'zfcp_qdio_send()' fails (this happens regularly when zfcp cannot notify
the adapter about new work because the adapter is already gone, e.g. during
a ChpID toggle), we will end up with a double free. We unconditionally free
the request in the calling function when 'zfcp_fsf_req_send()' fails, but
because the request is still in the hash table we are left with a stale
memory reference, and once the zfcp adapter is either reset during recovery
or shut down we end up freeing the same memory twice.
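
How the stale reference comes about can be modelled with a toy request
list. This is a simplified sketch under stated assumptions: all 'toy_*'
names are hypothetical, and the real zfcp request list is a hash table
with locking, not a small array. The sketch only counts leftover entries
and never frees anything.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_SLOTS 4

struct toy_req {
	uint64_t req_id;
};

struct toy_reqlist {
	struct toy_req *slot[TOY_SLOTS];
};

/* Put the request into the first free slot. */
static void toy_add(struct toy_reqlist *rl, struct toy_req *req)
{
	for (size_t i = 0; i < TOY_SLOTS; i++) {
		if (!rl->slot[i]) {
			rl->slot[i] = req;
			return;
		}
	}
}

/* Find a request by ID and remove it; NULL if no entry matches. */
static struct toy_req *toy_find_rm(struct toy_reqlist *rl, uint64_t req_id)
{
	for (size_t i = 0; i < TOY_SLOTS; i++) {
		struct toy_req *req = rl->slot[i];

		if (req && req->req_id == req_id) {
			rl->slot[i] = NULL;
			return req;
		}
	}
	return NULL;
}

/*
 * Model a failed send: the request is added, the send "fails", and the
 * cleanup tries to remove the request again via the cached ID. Returns
 * the number of entries left in the list; any leftover entry would be
 * a stale reference once the caller frees the request.
 */
static size_t toy_failed_send(struct toy_reqlist *rl, struct toy_req *req,
			      int cache_as_int)
{
	uint64_t key;
	size_t left = 0;

	toy_add(rl, req);

	if (cache_as_int)
		key = (uint64_t)(int32_t)req->req_id;	/* old: 'int' cache */
	else
		key = req->req_id;			/* fixed: full width */

	toy_find_rm(rl, key);

	for (size_t i = 0; i < TOY_SLOTS; i++)
		if (rl->slot[i])
			left++;
	return left;
}
```

With a request ID of 0x80000000 and 'cache_as_int' set, the lookup key
no longer matches and one entry stays behind (assuming the GCC/Clang
narrowing behaviour); with the full-width key the list ends up empty.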

The resulting stack traces vary depending on the kernel and have no direct
correlation to the place where the bug occurs. Here are three examples that
have been seen in practice:

  list_del corruption. next->prev should be 00000001b9d13800, but was 00000000dead4ead. (next=00000001bd131a00)
  ------------[ cut here ]------------
  kernel BUG at lib/list_debug.c:62!
  monitor event: 0040 ilc:2 [#1] PREEMPT SMP
  Modules linked in: ...
  CPU: 9 PID: 1617 Comm: zfcperp0.0.1740 Kdump: loaded
  Hardware name: ...
  Krnl PSW : 0704d00180000000 00000003cbeea1f8 (__list_del_entry_valid+0x98/0x140)
             R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
  Krnl GPRS: 00000000916d12f1 0000000080000000 000000000000006d 00000003cb665cd6
             0000000000000001 0000000000000000 0000000000000000 00000000d28d21e8
             00000000d3844000 00000380099efd28 00000001bd131a00 00000001b9d13800
             00000000d3290100 0000000000000000 00000003cbeea1f4 00000380099efc70
  Krnl Code: 00000003cbeea1e8: c020004f68a7  larl  %r2,00000003cc8d7336
             00000003cbeea1ee: c0e50027fd65  brasl %r14,00000003cc3e9cb8
            #00000003cbeea1f4: af000000      mc    0,0
            >00000003cbeea1f8: c02000920440  larl  %r2,00000003cd12aa78
             00000003cbeea1fe: c0e500289c25  brasl %r14,00000003cc3fda48
             00000003cbeea204: b9040043      lgr   %r4,%r3
             00000003cbeea208: b9040051      lgr   %r5,%r1
             00000003cbeea20c: b9040032      lgr   %r3,%r2
  Call Trace:
   [<00000003cbeea1f8>] __list_del_entry_valid+0x98/0x140
  ([<00000003cbeea1f4>] __list_del_entry_valid+0x94/0x140)
   [<000003ff7ff502fe>] zfcp_fsf_req_dismiss_all+0xde/0x150 [zfcp]
   [<000003ff7ff49cd0>] zfcp_erp_strategy_do_action+0x160/0x280 [zfcp]
   [<000003ff7ff4a22e>] zfcp_erp_strategy+0x21e/0xca0 [zfcp]
   [<000003ff7ff4ad34>] zfcp_erp_thread+0x84/0x1a0 [zfcp]
   [<00000003cb5eece8>] kthread+0x138/0x150
   [<00000003cb557f3c>] __ret_from_fork+0x3c/0x60
   [<00000003cc4172ea>] ret_from_fork+0xa/0x40
  INFO: lockdep is turned off.
  Last Breaking-Event-Address:
   [<00000003cc3e9d04>] _printk+0x4c/0x58
  Kernel panic - not syncing: Fatal exception: panic_on_oops

or:

  Unable to handle kernel pointer dereference in virtual kernel address space
  Failing address: 6b6b6b6b6b6b6000 TEID: 6b6b6b6b6b6b6803
  Fault in home space mode while using kernel ASCE.
  AS:0000000063b10007 R3:0000000000000024
  Oops: 0038 ilc:3 [#1] SMP
  Modules linked in: ...
  CPU: 10 PID: 0 Comm: swapper/10 Kdump: loaded
  Hardware name: ...
  Krnl PSW : 0404d00180000000 000003ff7febaf8e (zfcp_fsf_reqid_check+0x86/0x158 [zfcp])
             R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
  Krnl GPRS: 5a6f1cfa89c49ac3 00000000aff2c4c8 6b6b6b6b6b6b6b6b 00000000000002a8
             0000000000000000 0000000000000055 0000000000000000 00000000a8515800
             0700000000000000 00000000a6e14500 00000000aff2c000 000000008003c44c
             000000008093c700 0000000000000010 00000380009ebba8 00000380009ebb48
  Krnl Code: 000003ff7febaf7e: a7f4003d      brc   15,000003ff7febaff8
             000003ff7febaf82: e32020000004  lg    %r2,0(%r2)
            #000003ff7febaf88: ec2100388064  cgrj  %r2,%r1,8,000003ff7febaff8
            >000003ff7febaf8e: e3b020100020  cg    %r11,16(%r2)
             000003ff7febaf94: a774fff7      brc   7,000003ff7febaf82
             000003ff7febaf98: ec280030007c  cgij  %r2,0,8,000003ff7febaff8
             000003ff7febaf9e: e31020080004  lg    %r1,8(%r2)
             000003ff7febafa4: e33020000004  lg    %r3,0(%r2)
  Call Trace:
   [<000003ff7febaf8e>] zfcp_fsf_reqid_check+0x86/0x158 [zfcp]
   [<000003ff7febbdbc>] zfcp_qdio_int_resp+0x6c/0x170 [zfcp]
   [<000003ff7febbf90>] zfcp_qdio_irq_tasklet+0xd0/0x108 [zfcp]
   [<0000000061d90a04>] tasklet_action_common.constprop.0+0xdc/0x128
   [<000000006292f300>] __do_softirq+0x130/0x3c0
   [<0000000061d906c6>] irq_exit_rcu+0xfe/0x118
   [<000000006291e818>] do_io_irq+0xc8/0x168
   [<000000006292d516>] io_int_handler+0xd6/0x110
   [<000000006292d596>] psw_idle_exit+0x0/0xa
  ([<0000000061d3be50>] arch_cpu_idle+0x40/0xd0)
   [<000000006292ceea>] default_idle_call+0x52/0xf8
   [<0000000061de4fa4>] do_idle+0xd4/0x168
   [<0000000061de51fe>] cpu_startup_entry+0x36/0x40
   [<0000000061d4faac>] smp_start_secondary+0x12c/0x138
   [<000000006292d88e>] restart_int_handler+0x6e/0x90
  Last Breaking-Event-Address:
   [<000003ff7febaf94>] zfcp_fsf_reqid_check+0x8c/0x158 [zfcp]
  Kernel panic - not syncing: Fatal exception in interrupt

or:

  Unable to handle kernel pointer dereference in virtual kernel address space
  Failing address: 523b05d3ae76a000 TEID: 523b05d3ae76a803
  Fault in home space mode while using kernel ASCE.
  AS:0000000077c40007 R3:0000000000000024
  Oops: 0038 ilc:3 [#1] SMP
  Modules linked in: ...
  CPU: 3 PID: 453 Comm: kworker/3:1H Kdump: loaded
  Hardware name: ...
  Workqueue: kblockd blk_mq_run_work_fn
  Krnl PSW : 0404d00180000000 0000000076fc0312 (__kmalloc+0xd2/0x398)
             R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
  Krnl GPRS: ffffffffffffffff 523b05d3ae76abf6 0000000000000000 0000000000092a20
             0000000000000002 00000007e49b5cc0 00000007eda8f000 0000000000092a20
             00000007eda8f000 00000003b02856b9 00000000000000a8 523b05d3ae76abf6
             00000007dd662000 00000007eda8f000 0000000076fc02b2 000003e0037637a0
  Krnl Code: 0000000076fc0302: c004000000d4  brcl  0,76fc04aa
             0000000076fc0308: b904001b      lgr   %r1,%r11
            #0000000076fc030c: e3106020001a  algf  %r1,32(%r6)
            >0000000076fc0312: e31010000082  xg    %r1,0(%r1)
             0000000076fc0318: b9040001      lgr   %r0,%r1
             0000000076fc031c: e30061700082  xg    %r0,368(%r6)
             0000000076fc0322: ec59000100d9  aghik %r5,%r9,1
             0000000076fc0328: e34003b80004  lg    %r4,952
  Call Trace:
   [<0000000076fc0312>] __kmalloc+0xd2/0x398
   [<0000000076f318f2>] mempool_alloc+0x72/0x1f8
   [<000003ff8027c5f8>] zfcp_fsf_req_create.isra.7+0x40/0x268 [zfcp]
   [<000003ff8027f1bc>] zfcp_fsf_fcp_cmnd+0xac/0x3f0 [zfcp]
   [<000003ff80280f1a>] zfcp_scsi_queuecommand+0x122/0x1d0 [zfcp]
   [<000003ff800b4218>] scsi_queue_rq+0x778/0xa10 [scsi_mod]
   [<00000000771782a0>] __blk_mq_try_issue_directly+0x130/0x208
   [<000000007717a124>] blk_mq_request_issue_directly+0x4c/0xa8
   [<000003ff801302e2>] dm_mq_queue_rq+0x2ea/0x468 [dm_mod]
   [<0000000077178c12>] blk_mq_dispatch_rq_list+0x33a/0x818
   [<000000007717f064>] __blk_mq_do_dispatch_sched+0x284/0x2f0
   [<000000007717f44c>] __blk_mq_sched_dispatch_requests+0x1c4/0x218
   [<000000007717fa7a>] blk_mq_sched_dispatch_requests+0x52/0x90
   [<0000000077176d74>] __blk_mq_run_hw_queue+0x9c/0xc0
   [<0000000076da6d74>] process_one_work+0x274/0x4d0
   [<0000000076da7018>] worker_thread+0x48/0x560
   [<0000000076daef18>] kthread+0x140/0x160
   [<000000007751d144>] ret_from_fork+0x28/0x30
  Last Breaking-Event-Address:
   [<0000000076fc0474>] __kmalloc+0x234/0x398
  Kernel panic - not syncing: Fatal exception: panic_on_oops

To fix this, simply change the type of the cache variable to 'unsigned
long', like the rest of zfcp and also the argument for
'zfcp_reqlist_find_rm()'. This prevents truncation and wrong sign
extension, so the request can be removed from the hash table successfully.

Fixes: e60a6d69f1f8 ("[SCSI] zfcp: Remove function zfcp_reqlist_find_safe")
Cc: <stable@vger.kernel.org> #v2.6.34+
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Link: https://lore.kernel.org/r/979f6e6019d15f91ba56182f1aaf68d61bf37fc6.1668595505.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>