md/raid5-cache: fix a deadlock in r5l_exit_log()
author Yu Kuai <yukuai3@huawei.com>
Sat, 8 Jul 2023 09:17:27 +0000 (17:17 +0800)
committer Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Wed, 13 Sep 2023 07:42:43 +0000 (09:42 +0200)
[ Upstream commit a705b11b358dee677aad80630e7608b2d5f56691 ]

Commit b13015af94cf ("md/raid5-cache: Clear conf->log after finishing
work") introduced a new problem, a deadlock in r5l_exit_log():

// caller holds reconfig_mutex
r5l_exit_log
 flush_work(&log->disable_writeback_work)
                        r5c_disable_writeback_async
                         wait_event
                          /*
                           * conf->log is not NULL, and mddev_trylock()
                           * will fail, wait_event() can never pass.
                           */
 conf->log = NULL

Fix this problem by setting 'conf->log' to NULL before wake_up(), as it
used to be, so that wait_event() in r5c_disable_writeback_async() can
exit. In the meantime, move md_unregister_thread() earlier so that the
null-ptr-deref fixed by that commit stays fixed.

Fixes: b13015af94cf ("md/raid5-cache: Clear conf->log after finishing work")
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Link: https://lore.kernel.org/r/20230708091727.1417894-1-yukuai1@huaweicloud.com
Signed-off-by: Song Liu <song@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
drivers/md/raid5-cache.c

index 832d8566e165650d43bd271bb436bc19ecbb8171..477e3ae17545a587b068bf4cbabce085a78e0b2f 100644
@@ -3166,12 +3166,15 @@ void r5l_exit_log(struct r5conf *conf)
 {
        struct r5l_log *log = conf->log;
 
-       /* Ensure disable_writeback_work wakes up and exits */
-       wake_up(&conf->mddev->sb_wait);
-       flush_work(&log->disable_writeback_work);
        md_unregister_thread(&log->reclaim_thread);
 
+       /*
+        * 'reconfig_mutex' is held by caller, set 'conf->log' to NULL to
+        * ensure disable_writeback_work wakes up and exits.
+        */
        conf->log = NULL;
+       wake_up(&conf->mddev->sb_wait);
+       flush_work(&log->disable_writeback_work);
 
        mempool_exit(&log->meta_pool);
        bioset_exit(&log->bs);
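
The ordering described in the commit message can be illustrated outside the kernel. The following is a minimal user-space sketch using POSIX threads, not the kernel code: `struct demo_conf`, `disable_writeback_worker()` and `run_exit_sketch()` are hypothetical names, a condition variable stands in for `sb_wait`, and `pthread_join()` stands in for `flush_work()`. It assumes the worker waits until either the log pointer is NULL or the reconfig lock can be taken, as the trace above suggests.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; none of these are kernel APIs. */
struct demo_conf {
	pthread_mutex_t reconfig_mutex;	/* plays the role of reconfig_mutex */
	pthread_mutex_t wait_lock;	/* protects 'log', pairs with sb_wait */
	pthread_cond_t sb_wait;		/* plays the role of mddev->sb_wait */
	void *log;			/* plays the role of conf->log */
};

/* Worker: sleeps until 'log' is NULL or the reconfig lock can be taken. */
static void *disable_writeback_worker(void *arg)
{
	struct demo_conf *conf = arg;
	bool locked = false;

	pthread_mutex_lock(&conf->wait_lock);
	while (conf->log != NULL) {
		if (pthread_mutex_trylock(&conf->reconfig_mutex) == 0) {
			locked = true;
			break;
		}
		/* The lock is held by the teardown path, so keep waiting. */
		pthread_cond_wait(&conf->sb_wait, &conf->wait_lock);
	}
	pthread_mutex_unlock(&conf->wait_lock);

	if (locked)
		pthread_mutex_unlock(&conf->reconfig_mutex);
	return NULL;
}

/* Teardown in the fixed order; returns 1 once the worker has exited. */
int run_exit_sketch(void)
{
	struct demo_conf conf;
	pthread_t worker;
	int dummy_log = 0;
	int rc;

	pthread_mutex_init(&conf.reconfig_mutex, NULL);
	pthread_mutex_init(&conf.wait_lock, NULL);
	pthread_cond_init(&conf.sb_wait, NULL);
	conf.log = &dummy_log;

	/* The caller holds the reconfig lock for the whole teardown. */
	pthread_mutex_lock(&conf.reconfig_mutex);
	if (pthread_create(&worker, NULL, disable_writeback_worker, &conf) != 0) {
		pthread_mutex_unlock(&conf.reconfig_mutex);
		return 0;
	}

	/* Clear 'log' first, then wake the worker, then wait for it. */
	pthread_mutex_lock(&conf.wait_lock);
	conf.log = NULL;
	pthread_cond_broadcast(&conf.sb_wait);
	pthread_mutex_unlock(&conf.wait_lock);
	rc = pthread_join(worker, NULL);

	assert(conf.log == NULL);
	pthread_mutex_unlock(&conf.reconfig_mutex);
	pthread_cond_destroy(&conf.sb_wait);
	pthread_mutex_destroy(&conf.wait_lock);
	pthread_mutex_destroy(&conf.reconfig_mutex);
	return rc == 0 ? 1 : 0;
}
```

Calling `run_exit_sketch()` from a test program returns 1. If the join were moved ahead of the `conf.log = NULL` assignment, the worker's trylock could never succeed while the caller holds the lock, and the join would block forever, which mirrors the deadlock this patch removes.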