Revert "Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d""
authorSong Liu <song@kernel.org>
Thu, 25 Jan 2024 08:21:31 +0000 (00:21 -0800)
committerGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Thu, 25 Jan 2024 23:36:01 +0000 (15:36 -0800)
This reverts commit bed9e27baf52a09b7ba2a3714f1e24e17ced386d.

The original set [1][2] was expected to undo a suboptimal fix in [2], and
replace it with a better fix [1]. However, as reported by Dan Moulding [2]
causes an issue with raid5 with journal device.

Revert [2] for now to close the issue. We will follow up on another issue
reported by Juxiao Bi, as [2] is expected to fix it. We believe this is a
good trade-off, because the latter issue happens less freqently.

In the meanwhile, we will NOT revert [1], as it contains the right logic.

[1] commit d6e035aad6c0 ("md: bypass block throttle for superblock update")
[2] commit bed9e27baf52 ("Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d"")

Reported-by: Dan Moulding <dan@danm.net>
Closes: https://lore.kernel.org/linux-raid/20240123005700.9302-1-dan@danm.net/
Fixes: bed9e27baf52 ("Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d"")
Cc: stable@vger.kernel.org # v5.19+
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Song Liu <song@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
drivers/md/raid5.c

index 4bbd2a247dc71ad2669f0ff7bdf864aa5fdebe1e..68d86dbecb4ac8d6a837d7e72a99c7c6165c8e6a 100644 (file)
@@ -36,6 +36,7 @@
  */
 
 #include <linux/blkdev.h>
+#include <linux/delay.h>
 #include <linux/kthread.h>
 #include <linux/raid/pq.h>
 #include <linux/async_tx.h>
@@ -6842,7 +6843,18 @@ static void raid5d(struct md_thread *thread)
                        spin_unlock_irq(&conf->device_lock);
                        md_check_recovery(mddev);
                        spin_lock_irq(&conf->device_lock);
+
+                       /*
+                        * Waiting on MD_SB_CHANGE_PENDING below may deadlock
+                        * seeing md_check_recovery() is needed to clear
+                        * the flag when using mdmon.
+                        */
+                       continue;
                }
+
+               wait_event_lock_irq(mddev->sb_wait,
+                       !test_bit(MD_SB_CHANGE_PENDING, &mddev->sb_flags),
+                       conf->device_lock);
        }
        pr_debug("%d stripes handled\n", handled);