block: mq-deadline: Fix dd_finish_request() for zoned devices
authorDamien Le Moal <damien.lemoal@opensource.wdc.com>
Thu, 24 Nov 2022 02:12:07 +0000 (11:12 +0900)
committerGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Sat, 7 Jan 2023 10:11:51 +0000 (11:11 +0100)
commit 2820e5d0820ac4daedff1272616a53d9c7682fd2 upstream.

dd_finish_request() tests if the per prio fifo_list is not empty to
determine if request dispatching must be restarted for handling blocked
write requests to zoned devices with a call to
blk_mq_sched_mark_restart_hctx(). While simple, this implementation has
2 problems:

1) Only the priority level of the completed request is considered.
   However, writes to a zone may be blocked due to other writes to the
   same zone using a different priority level. While this is unlikely to
   happen in practice, as writing a zone with different IO priorirites
   does not make sense, nothing in the code prevents this from
   happening.
2) The use of list_empty() is dangerous as dd_finish_request() does not
   take dd->lock and may run concurrently with the insert and dispatch
   code.

Fix these 2 problems by testing the write fifo list of all priority
levels using the new helper dd_has_write_work(), and by testing each
fifo list using list_empty_careful().

Fixes: c807ab520fc3 ("block/mq-deadline: Add I/O priority support")
Cc: <stable@vger.kernel.org>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/20221124021208.242541-2-damien.lemoal@opensource.wdc.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
block/mq-deadline.c

index 5639921..3637448 100644 (file)
@@ -789,6 +789,18 @@ static void dd_prepare_request(struct request *rq)
        rq->elv.priv[0] = NULL;
 }
 
+static bool dd_has_write_work(struct blk_mq_hw_ctx *hctx)
+{
+       struct deadline_data *dd = hctx->queue->elevator->elevator_data;
+       enum dd_prio p;
+
+       for (p = 0; p <= DD_PRIO_MAX; p++)
+               if (!list_empty_careful(&dd->per_prio[p].fifo_list[DD_WRITE]))
+                       return true;
+
+       return false;
+}
+
 /*
  * Callback from inside blk_mq_free_request().
  *
@@ -828,9 +840,10 @@ static void dd_finish_request(struct request *rq)
 
                spin_lock_irqsave(&dd->zone_lock, flags);
                blk_req_zone_write_unlock(rq);
-               if (!list_empty(&per_prio->fifo_list[DD_WRITE]))
-                       blk_mq_sched_mark_restart_hctx(rq->mq_hctx);
                spin_unlock_irqrestore(&dd->zone_lock, flags);
+
+               if (dd_has_write_work(rq->mq_hctx))
+                       blk_mq_sched_mark_restart_hctx(rq->mq_hctx);
        }
 }