nbd: Fix hung on disconnect request if socket is closed before
author Xie Yongji <xieyongji@bytedance.com>
Tue, 22 Mar 2022 08:06:39 +0000 (16:06 +0800)
committer Jens Axboe <axboe@kernel.dk>
Mon, 16 May 2022 12:19:35 +0000 (06:19 -0600)
When userspace closes the socket before sending a disconnect
request, subsequent I/O requests block in wait_for_reconnect()
until the dead connection timeout expires. The disconnect request
that follows then hangs in blk_mq_quiesce_queue(). As a result, an
nbd device cannot be disconnected while any I/O request is waiting
for a reconnect; the disconnect has to wait out the dead connection
timeout. This is not the expected behavior. Fix it by waking up the
threads waiting for a reconnect as soon as a disconnect request is
sent.
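
The wake-on-disconnect behavior described above can be illustrated
with a rough userspace model. This is only a sketch, not the driver
code: it assumes POSIX threads, and struct conn_state,
wait_for_reconnect_model() and request_disconnect_model() are
hypothetical names invented for the illustration. The mutex and
condition variable stand in for the kernel wait queue, and a plain
bool stands in for the NBD_RT_DISCONNECTED flag.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for the nbd_config fields involved in the fix. */
struct conn_state {
	pthread_mutex_t lock;
	pthread_cond_t conn_wait;	/* models config->conn_wait */
	bool disconnected;		/* models NBD_RT_DISCONNECTED */
	int live_connections;		/* models config->live_connections */
	long dead_conn_timeout_ms;	/* 0 means "do not wait at all" */
};

/*
 * Returns 1 if a live connection is available and no disconnect was
 * requested; returns 0 on timeout or if a disconnect was requested.
 */
static int wait_for_reconnect_model(struct conn_state *s)
{
	struct timespec deadline;
	int ret;

	assert(s != NULL);
	if (!s->dead_conn_timeout_ms)
		return 0;

	/* Compute the absolute deadline for pthread_cond_timedwait(). */
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += s->dead_conn_timeout_ms / 1000;
	deadline.tv_nsec += (s->dead_conn_timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&s->lock);
	/* Sleep until a disconnect, a reconnect, or the timeout. */
	while (!s->disconnected && s->live_connections <= 0) {
		if (pthread_cond_timedwait(&s->conn_wait, &s->lock,
					   &deadline) == ETIMEDOUT)
			break;
	}
	/* A disconnect always wins, even if a connection is live. */
	ret = !s->disconnected && s->live_connections > 0;
	pthread_mutex_unlock(&s->lock);
	return ret;
}

/* Models the disconnect path: set the flag, then wake every waiter. */
static void request_disconnect_model(struct conn_state *s)
{
	assert(s != NULL);
	pthread_mutex_lock(&s->lock);
	s->disconnected = true;
	pthread_cond_broadcast(&s->conn_wait);
	pthread_mutex_unlock(&s->lock);
}
```

In this model the waiter checks the disconnect flag as part of its
wait condition, so setting the flag and signalling the condition
variable is enough to release it before the timeout expires. Without
the flag in the wait condition, a waiter would only return on a
reconnect or on timeout, which is the hang the patch addresses.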

Reported-by: Xu Jianhai <zero.xu@bytedance.com>
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Link: https://lore.kernel.org/r/20220322080639.142-1-xieyongji@bytedance.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
drivers/block/nbd.c

index fd1501f..ac8b045 100644
@@ -946,11 +946,15 @@ static int wait_for_reconnect(struct nbd_device *nbd)
        struct nbd_config *config = nbd->config;
        if (!config->dead_conn_timeout)
                return 0;
-       if (test_bit(NBD_RT_DISCONNECTED, &config->runtime_flags))
+
+       if (!wait_event_timeout(config->conn_wait,
+                               test_bit(NBD_RT_DISCONNECTED,
+                                        &config->runtime_flags) ||
+                               atomic_read(&config->live_connections) > 0,
+                               config->dead_conn_timeout))
                return 0;
-       return wait_event_timeout(config->conn_wait,
-                                 atomic_read(&config->live_connections) > 0,
-                                 config->dead_conn_timeout) > 0;
+
+       return !test_bit(NBD_RT_DISCONNECTED, &config->runtime_flags);
 }
 
 static int nbd_handle_cmd(struct nbd_cmd *cmd, int index)
@@ -2076,6 +2080,7 @@ static void nbd_disconnect_and_put(struct nbd_device *nbd)
        mutex_lock(&nbd->config_lock);
        nbd_disconnect(nbd);
        sock_shutdown(nbd);
+       wake_up(&nbd->config->conn_wait);
        /*
         * Make sure recv thread has finished, we can safely call nbd_clear_que()
         * to cancel the inflight I/Os.