io_uring: fix bug in slow unregistering of nodes
author Dylan Yudaken <dylany@fb.com>
Fri, 21 Jan 2022 12:38:56 +0000 (04:38 -0800)
committer Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tue, 1 Feb 2022 16:27:11 +0000 (17:27 +0100)
[ Upstream commit b36a2050040b2d839bdc044007cdd57101d7f881 ]

In some cases io_rsrc_ref_quiesce will call io_rsrc_node_switch_start,
and then immediately flush the delayed work queue &ctx->rsrc_put_work.

However, percpu_ref_put does not destroy the node immediately; the
release callback is invoked asynchronously via RCU. As a result,
io_rsrc_node_ref_zero is only called after rsrc_put_work has been
flushed, and so the process ends up sleeping for 1 second unnecessarily.

This patch queues the put work with no delay if we are in the middle
of quiescing, so the put code runs immediately.
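
As a rough illustration of the delay selection described above, here is a
minimal user-space sketch. It is not the kernel code: MODEL_HZ,
struct model_rsrc_data and model_put_delay are hypothetical names that
stand in for HZ, the rsrc data's quiesce flag, and the logic added to
io_rsrc_node_ref_zero, and the value 1000 for MODEL_HZ is an assumption.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's HZ (jiffies per second). */
#define MODEL_HZ 1000UL

/* Hypothetical, reduced model of the one field the patch reads. */
struct model_rsrc_data {
	bool quiesce;	/* true while a quiesce is in progress */
};

/*
 * Delay, in jiffies, that would be passed to the delayed work once
 * the node's reference count has dropped to zero.
 */
static unsigned long model_put_delay(const struct model_rsrc_data *data)
{
	/* if we are mid-quiesce then do not delay */
	return data->quiesce ? 0 : MODEL_HZ;
}
```

Outside of a quiesce the sketch keeps the one-second batching delay, so
only the quiesce path changes behaviour.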

Fixes: 4a38aed2a0a7 ("io_uring: batch reap of dead file registrations")
Signed-off-by: Dylan Yudaken <dylany@fb.com>
Link: https://lore.kernel.org/r/20220121123856.3557884-1-dylany@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
diff --git a/fs/io_uring.c b/fs/io_uring.c
index f713b91537f41a35fbd2bc1e3e0cdd8d0651e6ec..993913c585fbf9fbccc256d4c41f278792370dd1 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -7718,10 +7718,15 @@ static void io_rsrc_node_ref_zero(struct percpu_ref *ref)
        struct io_ring_ctx *ctx = node->rsrc_data->ctx;
        unsigned long flags;
        bool first_add = false;
+       unsigned long delay = HZ;
 
        spin_lock_irqsave(&ctx->rsrc_ref_lock, flags);
        node->done = true;
 
+       /* if we are mid-quiesce then do not delay */
+       if (node->rsrc_data->quiesce)
+               delay = 0;
+
        while (!list_empty(&ctx->rsrc_ref_list)) {
                node = list_first_entry(&ctx->rsrc_ref_list,
                                            struct io_rsrc_node, node);
@@ -7734,7 +7739,7 @@ static void io_rsrc_node_ref_zero(struct percpu_ref *ref)
        spin_unlock_irqrestore(&ctx->rsrc_ref_lock, flags);
 
        if (first_add)
-               mod_delayed_work(system_wq, &ctx->rsrc_put_work, HZ);
+               mod_delayed_work(system_wq, &ctx->rsrc_put_work, delay);
 }
 
 static struct io_rsrc_node *io_rsrc_node_alloc(struct io_ring_ctx *ctx)