raid10: avoid spin_lock from fastpath from raid10_unplug()
author Yu Kuai <yukuai3@huawei.com>
Wed, 21 Jun 2023 10:57:28 +0000 (18:57 +0800)
committer Song Liu <song@kernel.org>
Fri, 23 Jun 2023 16:41:50 +0000 (09:41 -0700)
Commit 0c0be98bbe67 ("md/raid10: prevent unnecessary calls to wake_up()
in fast path") missed one place. For example, with:

fio -direct=1 -rw=write/randwrite -iodepth=1 ...

Plug and unplug are called for each io, so the wake_up() in raid10_unplug()
causes lock contention as well.

Avoid this contention by calling wake_up_barrier() instead of wake_up();
wake_up_barrier() does not take the spin_lock if the waitqueue is empty.
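
The idea of skipping the lock when nobody is waiting can be illustrated
outside the kernel. The following is a minimal user-space sketch using
pthreads and C11 atomics. It is not the kernel implementation of
wake_up_barrier(); struct waitq, waitq_wait(), waitq_wake_always() and
waitq_wake_if_sleepers() are hypothetical names invented for this
illustration, and the lock_acquisitions counter exists only to make the
difference observable.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a kernel wait queue. */
struct waitq {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	atomic_int sleepers;             /* threads currently inside waitq_wait() */
	unsigned long lock_acquisitions; /* instrumentation: locks taken by wakers */
};

static void waitq_init(struct waitq *q)
{
	int rc = pthread_mutex_init(&q->lock, NULL);
	assert(rc == 0);
	rc = pthread_cond_init(&q->cond, NULL);
	assert(rc == 0);
	atomic_init(&q->sleepers, 0);
	q->lock_acquisitions = 0;
}

/* Sleep until *cond becomes true. The sleeper announces itself first. */
static void waitq_wait(struct waitq *q, atomic_bool *cond)
{
	pthread_mutex_lock(&q->lock);
	atomic_fetch_add(&q->sleepers, 1);
	while (!atomic_load(cond))
		pthread_cond_wait(&q->cond, &q->lock);
	atomic_fetch_sub(&q->sleepers, 1);
	pthread_mutex_unlock(&q->lock);
}

/* Unconditional wake: always takes the lock, like a plain wake-up call. */
static void waitq_wake_always(struct waitq *q)
{
	pthread_mutex_lock(&q->lock);
	q->lock_acquisitions++;
	pthread_cond_broadcast(&q->cond);
	pthread_mutex_unlock(&q->lock);
}

/*
 * Fast-path wake: check for sleepers without the lock and return early
 * if there are none. The caller must have stored the wake-up condition
 * (sequentially consistent) before calling this.
 */
static void waitq_wake_if_sleepers(struct waitq *q)
{
	if (atomic_load(&q->sleepers) == 0)
		return;
	waitq_wake_always(q);
}
```

With no sleepers, waitq_wake_if_sleepers() never touches the mutex, so
many submitters calling it concurrently do not serialize on the lock.
The check is only safe because the sleeper increments the counter before
testing the condition and the waker stores the condition before reading
the counter, both with sequentially consistent atomics.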

Fio test script:

[global]
name=random reads and writes
ioengine=libaio
direct=1
readwrite=randrw
rwmixread=70
iodepth=64
buffered=0
filename=/dev/md0
size=1G
runtime=30
time_based
randrepeat=0
norandommap
refill_buffers
ramp_time=10
bs=4k
numjobs=400
group_reporting=1
[job1]

Test result with ramdisk raid10 (by Ali):

        Before this patch   With this patch
READ    IOPS=2033k          IOPS=3642k
WRITE   IOPS=871k           IOPS=1561k

By the way, in this scenario a blk_plug_cb is allocated and freed
for each io; this seems to need optimizing as well.

Reported-and-tested-by: Ali Gholami Rudi <aligrudi@gmail.com>
Closes: https://lore.kernel.org/all/20231606122233@laper.mirepesht/
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20230621105728.1268542-1-yukuai1@huaweicloud.com

diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 7906776..5051149 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -1118,7 +1118,7 @@ static void raid10_unplug(struct blk_plug_cb *cb, bool from_schedule)
                spin_lock_irq(&conf->device_lock);
                bio_list_merge(&conf->pending_bio_list, &plug->pending);
                spin_unlock_irq(&conf->device_lock);
-               wake_up(&conf->wait_barrier);
+               wake_up_barrier(conf);
                md_wakeup_thread(mddev->thread);
                kfree(plug);
                return;
@@ -1127,7 +1127,7 @@ static void raid10_unplug(struct blk_plug_cb *cb, bool from_schedule)
        /* we aren't scheduling, so we can do the write-out directly. */
        bio = bio_list_get(&plug->pending);
        raid1_prepare_flush_writes(mddev->bitmap);
-       wake_up(&conf->wait_barrier);
+       wake_up_barrier(conf);
 
        while (bio) { /* submit pending writes */
                struct bio *next = bio->bi_next;