ubi: Fix deadlock caused by recursively holding work_sem
authorZhaoLong Wang <wangzhaolong1@huawei.com>
Sat, 4 Mar 2023 01:41:41 +0000 (09:41 +0800)
committerGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Thu, 20 Apr 2023 10:35:14 +0000 (12:35 +0200)
[ Upstream commit f773f0a331d6c41733b17bebbc1b6cae12e016f5 ]

During the processing of the bgt, if the sync_erase() return -EBUSY
or some other error code in __erase_worker(),schedule_erase() called
again lead to the down_read(ubi->work_sem) hold twice and may get
block by down_write(ubi->work_sem) in ubi_update_fastmap(),
which cause deadlock.

          ubi bgt                        other task
 do_work
  down_read(&ubi->work_sem)          ubi_update_fastmap
  erase_worker                         # Blocked by down_read
   __erase_worker                      down_write(&ubi->work_sem)
    schedule_erase
     schedule_ubi_work
      down_read(&ubi->work_sem)

Fix this by changing input parameter @nested of the schedule_erase() to
'true' to avoid recursively acquiring the down_read(&ubi->work_sem).

Also, fix the incorrect comment about @nested parameter of the
schedule_erase() because when down_write(ubi->work_sem) is held, the
@nested is also need be true.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=217093
Fixes: 2e8f08deabbc ("ubi: Fix races around ubi_refill_pools()")
Signed-off-by: ZhaoLong Wang <wangzhaolong1@huawei.com>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Sasha Levin <sashal@kernel.org>
drivers/mtd/ubi/wl.c

index 9e14319..6049ab9 100644 (file)
@@ -575,7 +575,7 @@ static int erase_worker(struct ubi_device *ubi, struct ubi_work *wl_wrk,
  * @vol_id: the volume ID that last used this PEB
  * @lnum: the last used logical eraseblock number for the PEB
  * @torture: if the physical eraseblock has to be tortured
- * @nested: denotes whether the work_sem is already held in read mode
+ * @nested: denotes whether the work_sem is already held
  *
  * This function returns zero in case of success and a %-ENOMEM in case of
  * failure.
@@ -1131,7 +1131,7 @@ static int __erase_worker(struct ubi_device *ubi, struct ubi_work *wl_wrk)
                int err1;
 
                /* Re-schedule the LEB for erasure */
-               err1 = schedule_erase(ubi, e, vol_id, lnum, 0, false);
+               err1 = schedule_erase(ubi, e, vol_id, lnum, 0, true);
                if (err1) {
                        spin_lock(&ubi->wl_lock);
                        wl_entry_destroy(ubi, e);