btrfs: retry block group reclaim without infinite loop
author Boris Burkov <boris@bur.io>
Fri, 7 Jun 2024 19:50:14 +0000 (12:50 -0700)
committer Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Thu, 27 Jun 2024 11:49:11 +0000 (13:49 +0200)
commit 4eb4e85c4f818491efc67e9373aa16b123c3f522 upstream.

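If inc_block_group_ro systematically fails (e.g. due to ETXTBSY from
swap) or btrfs_relocate_chunk systematically fails (from lack of
space), then this worker becomes an infinite loop: the failed block
group is put straight back on the reclaim_bgs list that the worker is
still draining.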

At the very least, this strands the cleaner thread, but it can also
result in hung tasks/RCU stalls on PREEMPT_NONE kernels when the
reclaim_bgs_lock mutex is not contended.

I believe the best long-term fix is to manage reclaim via a work queue,
where we queue up a relocation on the triggering condition and re-queue
on failure. In the meantime, this is an easy fix to apply to avoid the
immediate pain.

Fixes: 7e2718099438 ("btrfs: reinsert BGs failed to reclaim")
CC: stable@vger.kernel.org # 6.6+
Signed-off-by: Boris Burkov <boris@bur.io>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index 77a9984647ac7019b879da2508b913b4b8e8a4ca..b3accb082af01fc44cc0ac3c59e5aa4b837bea6d 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -1788,6 +1788,7 @@ void btrfs_reclaim_bgs_work(struct work_struct *work)
                container_of(work, struct btrfs_fs_info, reclaim_bgs_work);
        struct btrfs_block_group *bg;
        struct btrfs_space_info *space_info;
+       LIST_HEAD(retry_list);
 
        if (!test_bit(BTRFS_FS_OPEN, &fs_info->flags))
                return;
@@ -1924,8 +1925,11 @@ void btrfs_reclaim_bgs_work(struct work_struct *work)
                }
 
 next:
-               if (ret)
-                       btrfs_mark_bg_to_reclaim(bg);
+               if (ret) {
+                       /* Refcount held by the reclaim_bgs list after splice. */
+                       btrfs_get_block_group(bg);
+                       list_add_tail(&bg->bg_list, &retry_list);
+               }
                btrfs_put_block_group(bg);
 
                mutex_unlock(&fs_info->reclaim_bgs_lock);
@@ -1945,6 +1949,9 @@ next:
        spin_unlock(&fs_info->unused_bgs_lock);
        mutex_unlock(&fs_info->reclaim_bgs_lock);
 end:
+       spin_lock(&fs_info->unused_bgs_lock);
+       list_splice_tail(&retry_list, &fs_info->reclaim_bgs);
+       spin_unlock(&fs_info->unused_bgs_lock);
        btrfs_exclop_finish(fs_info);
        sb_end_write(fs_info->sb);
 }
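
The retry-list pattern used by this patch can be illustrated outside the
kernel. What follows is only a user-space sketch under stated assumptions,
not the btrfs code: struct list_node, struct item, reclaim_pass and the
list helpers are hypothetical names written for this example, and locking
and reference counting are left out. Failed items are parked on a local
list while the pending list is drained and are spliced back only after the
loop, so a pass visits each item at most once even if an item always
fails.

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list, loosely modeled on list_head. */
struct list_node {
	struct list_node *prev, *next;
};

static void list_init(struct list_node *head)
{
	head->prev = head->next = head;
}

static bool list_is_empty(const struct list_node *head)
{
	return head->next == head;
}

static void list_push_tail(struct list_node *node, struct list_node *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* Unlink and return the first node; the caller checks for emptiness. */
static struct list_node *list_pop_head(struct list_node *head)
{
	struct list_node *node = head->next;

	node->prev->next = node->next;
	node->next->prev = node->prev;
	node->prev = node->next = node;
	return node;
}

/* Move all nodes of src to the tail of dst and leave src empty. */
static void list_splice_to_tail(struct list_node *src, struct list_node *dst)
{
	if (list_is_empty(src))
		return;
	src->next->prev = dst->prev;
	dst->prev->next = src->next;
	src->prev->next = dst;
	dst->prev = src->prev;
	list_init(src);
}

/* Hypothetical stand-in for a block group queued for reclaim. */
struct item {
	struct list_node link;	/* must stay the first member (see cast) */
	bool always_fails;	/* models a systematic failure */
	int attempts;
};

/* One worker pass; returns how many items were attempted. */
static int reclaim_pass(struct list_node *pending)
{
	struct list_node retry;
	int attempted = 0;

	list_init(&retry);
	while (!list_is_empty(pending)) {
		/* Valid cast because link is the first member of item. */
		struct item *it = (struct item *)list_pop_head(pending);

		it->attempts++;
		attempted++;
		if (it->always_fails)
			/* Park the failure off the list being drained. */
			list_push_tail(&it->link, &retry);
	}
	/* Requeue failures for a later pass, after the loop has ended. */
	list_splice_to_tail(&retry, pending);
	return attempted;
}
```

If the failure branch pushed the item back onto pending instead of retry,
the while loop would never see an empty list for an item that always
fails, which is the shape of the bug described in the commit message.
Unlike the kernel's list_splice_tail, the splice helper here also
reinitializes the source list; that is a choice made for this sketch only.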