btrfs: fix hang during unmount when stopping a space reclaim worker

author Filipe Manana <fdmanana@suse.com>

Thu, 8 Sep 2022 11:31:51 +0000 (12:31 +0100)

committer Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Wed, 28 Sep 2022 09:11:42 +0000 (11:11 +0200)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c

index cf13c7b7aa2620165fbf1a61fc14a8eb79fef91b..f4015556cafadfd2ce792903f09bac35a33acbc5 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -4348,6 +4348,31 @@ void __cold close_ctree(struct btrfs_fs_info *fs_info)
         /* clear out the rbtree of defraggable inodes */
         btrfs_cleanup_defrag_inodes(fs_info);
  
+       /*
+        * After we parked the cleaner kthread, ordered extents may have
+        * completed and created new delayed iputs. If one of the async reclaim
+        * tasks is running and in the RUN_DELAYED_IPUTS flush state, then we
+        * can hang forever trying to stop it, because if a delayed iput is
+        * added after it ran btrfs_run_delayed_iputs() and before it called
+        * btrfs_wait_on_delayed_iputs(), it will hang forever since there is
+        * no one else to run iputs.
+        *
+        * So wait for all ongoing ordered extents to complete and then run
+        * delayed iputs. This works because once we reach this point no one
+        * can either create new ordered extents nor create delayed iputs
+        * through some other means.
+        *
+        * Also note that btrfs_wait_ordered_roots() is not safe here, because
+        * it waits for BTRFS_ORDERED_COMPLETE to be set on an ordered extent,
+        * but the delayed iput for the respective inode is made only when doing
+        * the final btrfs_put_ordered_extent() (which must happen at
+        * btrfs_finish_ordered_io() when we are unmounting).
+        */
+       btrfs_flush_workqueue(fs_info->endio_write_workers);
+       /* Ordered extents for free space inodes. */
+       btrfs_flush_workqueue(fs_info->endio_freespace_worker);
+       btrfs_run_delayed_iputs(fs_info);
+
         cancel_work_sync(&fs_info->async_reclaim_work);
         cancel_work_sync(&fs_info->async_data_reclaim_work);
         cancel_work_sync(&fs_info->preempt_reclaim_work);
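
The ordering problem described in the comment above can be sketched in user space. This is only an illustrative model, not kernel code: `struct fs_model` and every function below are hypothetical names, and the model assumes, as a simplification, that each completing ordered extent queues exactly one delayed iput. It compares the old ordering, where an ordered extent completes between the reclaim worker's btrfs_run_delayed_iputs() and btrfs_wait_on_delayed_iputs() calls, with the ordering introduced by this patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct fs_model {
    int inflight_ordered; /* ordered extents whose completion has not run yet */
    int pending_iputs;    /* delayed iputs queued but not yet run */
};

/* Assumption: completing an ordered extent queues one delayed iput. */
void finish_ordered_extents(struct fs_model *fs)
{
    assert(fs->inflight_ordered >= 0);
    fs->pending_iputs += fs->inflight_ordered;
    fs->inflight_ordered = 0;
}

/* Models running all queued delayed iputs. */
void run_delayed_iputs(struct fs_model *fs)
{
    fs->pending_iputs = 0;
}

/*
 * Models the wait step: with the cleaner kthread parked, nobody else runs
 * iputs, so any iput still pending means the waiter never wakes up.
 */
bool wait_would_hang(const struct fs_model *fs)
{
    return fs->pending_iputs > 0;
}

/* Old ordering: an ordered extent completes between "run" and "wait". */
bool unmount_without_fix(struct fs_model fs)
{
    run_delayed_iputs(&fs);      /* reclaim worker runs iputs */
    finish_ordered_extents(&fs); /* completion queues a new delayed iput */
    return wait_would_hang(&fs); /* reclaim worker now waits */
}

/* Patched ordering: drain ordered extents and iputs before stopping workers. */
bool unmount_with_fix(struct fs_model fs)
{
    finish_ordered_extents(&fs); /* models flushing the endio workqueues */
    run_delayed_iputs(&fs);      /* models the added iput run in close_ctree() */
    run_delayed_iputs(&fs);      /* reclaim worker runs iputs, nothing new arrives */
    return wait_would_hang(&fs);
}
```

With one ordered extent in flight, `unmount_without_fix()` reports a hang while `unmount_with_fix()` does not, because in the patched ordering nothing can queue a delayed iput after the drain.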