nvme-pci: fix surprise removal
authorIgor Konopko <igor.j.konopko@intel.com>
Fri, 23 Nov 2018 15:58:10 +0000 (16:58 +0100)
committerChristoph Hellwig <hch@lst.de>
Tue, 27 Nov 2018 17:12:26 +0000 (18:12 +0100)
When a PCIe NVMe device is not present, nvme_dev_remove_admin() calls
blk_cleanup_queue() on the admin queue, which frees the hctx for that
queue.  Moments later, on the same path nvme_kill_queues() calls
blk_mq_unquiesce_queue() on admin queue and tries to access hctx of it,
which leads to following OOPS:

Oops: 0000 [#1] SMP PTI
RIP: 0010:sbitmap_any_bit_set+0xb/0x40
Call Trace:
 blk_mq_run_hw_queue+0xd5/0x150
 blk_mq_run_hw_queues+0x3a/0x50
 nvme_kill_queues+0x26/0x50
 nvme_remove_namespaces+0xb2/0xc0
 nvme_remove+0x60/0x140
 pci_device_remove+0x3b/0xb0

Fixes: cb4bfda62afa2 ("nvme-pci: fix hot removal during error handling")
Signed-off-by: Igor Konopko <igor.j.konopko@intel.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
drivers/nvme/host/core.c

index 5afda6f..bb39b91 100644 (file)
@@ -3607,7 +3607,7 @@ void nvme_kill_queues(struct nvme_ctrl *ctrl)
        down_read(&ctrl->namespaces_rwsem);
 
        /* Forcibly unquiesce queues to avoid blocking dispatch */
-       if (ctrl->admin_q)
+       if (ctrl->admin_q && !blk_queue_dying(ctrl->admin_q))
                blk_mq_unquiesce_queue(ctrl->admin_q);
 
        list_for_each_entry(ns, &ctrl->namespaces, list)