btrfs: qgroup: don't try to wait flushing if we're already holding a transaction
author    Qu Wenruo <wqu@suse.com>
          Fri, 4 Dec 2020 01:24:47 +0000 (09:24 +0800)
committer Greg Kroah-Hartman <gregkh@linuxfoundation.org>
          Tue, 12 Jan 2021 19:18:24 +0000 (20:18 +0100)
commit ae5e070eaca9dbebde3459dd8f4c2756f8c097d0 upstream.

Two threads can race in qgroup flushing, which may lead to a
deadlock:

        Thread A                |        Thread B
   (not holding trans handle)   |  (holding a trans handle)
--------------------------------+--------------------------------
__btrfs_qgroup_reserve_meta()   | __btrfs_qgroup_reserve_meta()
|- try_flush_qgroup()           | |- try_flush_qgroup()
   |- QGROUP_FLUSHING bit set   |    |
   |                            |    |- test_and_set_bit()
   |                            |    |- wait_event()
   |- btrfs_join_transaction()  |
   |- btrfs_commit_transaction()|

!!! DEAD LOCK !!!

Thread A wants to commit the transaction, but thread B is holding a
transaction handle, which blocks the commit.
At the same time, thread B is waiting for thread A to finish its commit.

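This is just a hot fix: a thread that already holds a transaction handle
no longer waits for a running flush and returns immediately instead.
This would lead to more EDQUOT errors when we're near the qgroup limit.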
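
The decision order after this change can be sketched as a small userspace
model. This is only an illustration, not the kernel code: model_root,
model_try_flush(), model_finish_flush() and the flush_action values are
hypothetical names, a C11 atomic_flag stands in for the
BTRFS_ROOT_QGROUP_FLUSHING bit, and the "holds a transaction" check is
reduced to a single boolean.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel API. */
enum flush_action {
	FLUSH_RUN,	/* we set the flushing bit and run the flush ourselves */
	FLUSH_WAIT,	/* a flush is running and it is safe to wait for it */
	FLUSH_SKIP,	/* a flush is running but we hold a transaction: return */
};

struct model_root {
	atomic_flag flushing;	/* stands in for BTRFS_ROOT_QGROUP_FLUSHING */
};

static enum flush_action model_try_flush(struct model_root *root,
					 bool holds_trans)
{
	/* Assumption: simplified form of the journal_info check. */
	bool can_commit = !holds_trans;

	/* The bit is tested only after can_commit is known. */
	if (atomic_flag_test_and_set(&root->flushing)) {
		/* Waiting here would block the flusher's commit. */
		if (!can_commit)
			return FLUSH_SKIP;
		return FLUSH_WAIT;
	}
	return FLUSH_RUN;
}

/* The flusher clears the bit once its flush is done. */
static void model_finish_flush(struct model_root *root)
{
	atomic_flag_clear(&root->flushing);
}
```

In this model, thread B from the diagram above gets FLUSH_SKIP instead of
sleeping, so thread A's commit is no longer blocked by a waiter that holds
a transaction handle.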

The proper fix would be to make all metadata/data reservations happen
without holding a transaction handle.

CC: stable@vger.kernel.org # 5.9+
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 fs/btrfs/qgroup.c

diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index 87bd37b..faed0e9 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -3565,16 +3565,6 @@ static int try_flush_qgroup(struct btrfs_root *root)
        bool can_commit = true;
 
        /*
-        * We don't want to run flush again and again, so if there is a running
-        * one, we won't try to start a new flush, but exit directly.
-        */
-       if (test_and_set_bit(BTRFS_ROOT_QGROUP_FLUSHING, &root->state)) {
-               wait_event(root->qgroup_flush_wait,
-                       !test_bit(BTRFS_ROOT_QGROUP_FLUSHING, &root->state));
-               return 0;
-       }
-
-       /*
         * If current process holds a transaction, we shouldn't flush, as we
         * assume all space reservation happens before a transaction handle is
         * held.
@@ -3588,6 +3578,26 @@ static int try_flush_qgroup(struct btrfs_root *root)
            current->journal_info != BTRFS_SEND_TRANS_STUB)
                can_commit = false;
 
+       /*
+        * We don't want to run flush again and again, so if there is a running
+        * one, we won't try to start a new flush, but exit directly.
+        */
+       if (test_and_set_bit(BTRFS_ROOT_QGROUP_FLUSHING, &root->state)) {
+               /*
+                * We are already holding a transaction, thus we can block other
+                * threads from flushing.  So exit right now. This increases
+                * the chance of EDQUOT for heavy load and near limit cases.
+                * But we can argue that if we're already near limit, EDQUOT is
+                * unavoidable anyway.
+                */
+               if (!can_commit)
+                       return 0;
+
+               wait_event(root->qgroup_flush_wait,
+                       !test_bit(BTRFS_ROOT_QGROUP_FLUSHING, &root->state));
+               return 0;
+       }
+
        ret = btrfs_start_delalloc_snapshot(root);
        if (ret < 0)
                goto out;