nvmet-rdma: Fix list_del corruption on queue establishment failure
author Israel Rukshin <israelr@nvidia.com>
Tue, 5 Jan 2021 08:46:54 +0000 (10:46 +0200)
committer Christoph Hellwig <hch@lst.de>
Wed, 6 Jan 2021 09:30:37 +0000 (10:30 +0100)
When a queue is in the NVMET_RDMA_Q_CONNECTING state, it may have some
requests on rsp_wait_list. If a disconnect occurs in this state,
nothing empties this list and returns the requests to the free_rsps
list. Normally nvmet_rdma_queue_established() frees those requests
after moving the queue to the NVMET_RDMA_Q_LIVE state, but in this
case __nvmet_rdma_queue_disconnect() is called first. The crash
happens in nvmet_rdma_free_rsps() when it calls
list_del(&rsp->free_list), because the request is only on the wait
list. To fix the issue, clear rsp_wait_list when destroying the
queue.
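
The failure mode and the fix can be illustrated with a small user-space model. This is only a sketch under stated assumptions: the `toy_*` types and functions are hypothetical stand-ins and not the driver's code, the list helpers are simplified re-implementations of the kernel's `struct list_head` API, and locking is omitted. It shows that a response taken off `free_rsps` and parked on the wait list is no longer linked through `free_list`, so it must be drained back before anything calls `list_del()` on that member.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified user-space stand-in for the kernel's struct list_head. */
struct list_head { struct list_head *next, *prev; };

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void init_list_head(struct list_head *h) { h->next = h; h->prev = h; }
static bool list_empty(const struct list_head *h) { return h->next == h; }

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

/* Unlink an entry; NULL here plays the role of the kernel's list poison. */
static void list_del(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	e->next = e->prev = NULL;
}

/* Hypothetical toy versions of the response and queue structures. */
struct toy_rsp { struct list_head wait_list, free_list; };
struct toy_queue { struct list_head rsp_wait_list, free_rsps; };

static void toy_queue_init(struct toy_queue *q, struct toy_rsp *rsps, int n)
{
	init_list_head(&q->rsp_wait_list);
	init_list_head(&q->free_rsps);
	for (int i = 0; i < n; i++) {
		rsps[i].wait_list.next = rsps[i].wait_list.prev = NULL;
		list_add_tail(&rsps[i].free_list, &q->free_rsps);
	}
}

/* Take a response off the free list; afterwards free_list is unlinked. */
static struct toy_rsp *toy_get_rsp(struct toy_queue *q)
{
	struct toy_rsp *rsp;

	if (list_empty(&q->free_rsps))
		return NULL;
	rsp = container_of(q->free_rsps.next, struct toy_rsp, free_list);
	list_del(&rsp->free_list);
	return rsp;
}

/* Park a response on the wait list while the queue is still connecting. */
static void toy_defer_rsp(struct toy_queue *q, struct toy_rsp *rsp)
{
	list_add_tail(&rsp->wait_list, &q->rsp_wait_list);
}

/* True if list_del(&rsp->free_list) would be safe in this toy model. */
static bool toy_rsp_on_free_list(const struct toy_rsp *rsp)
{
	return rsp->free_list.next != NULL;
}

/* Same shape as the loop in the patch: move every waiter back to free_rsps. */
static int toy_drain_wait_list(struct toy_queue *q)
{
	int drained = 0;

	while (!list_empty(&q->rsp_wait_list)) {
		struct toy_rsp *rsp = container_of(q->rsp_wait_list.next,
						   struct toy_rsp, wait_list);

		list_del(&rsp->wait_list);
		list_add_tail(&rsp->free_list, &q->free_rsps);
		assert(toy_rsp_on_free_list(rsp));
		drained++;
	}
	return drained;
}
```

In this model, calling `list_del(&rsp->free_list)` on a response that is still parked on the wait list would dereference the poisoned NULL pointers, which is the user-space analogue of the list_del corruption described above. Draining first restores the invariant that every response is linked on `free_rsps` before teardown.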

Signed-off-by: Israel Rukshin <israelr@nvidia.com>
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
drivers/nvme/target/rdma.c

index 5c1e7cb..bdfc22e 100644
@@ -1641,6 +1641,16 @@ static void __nvmet_rdma_queue_disconnect(struct nvmet_rdma_queue *queue)
        spin_lock_irqsave(&queue->state_lock, flags);
        switch (queue->state) {
        case NVMET_RDMA_Q_CONNECTING:
+               while (!list_empty(&queue->rsp_wait_list)) {
+                       struct nvmet_rdma_rsp *rsp;
+
+                       rsp = list_first_entry(&queue->rsp_wait_list,
+                                              struct nvmet_rdma_rsp,
+                                              wait_list);
+                       list_del(&rsp->wait_list);
+                       nvmet_rdma_put_rsp(rsp);
+               }
+               fallthrough;
        case NVMET_RDMA_Q_LIVE:
                queue->state = NVMET_RDMA_Q_DISCONNECTING;
                disconnect = true;