btrfs: zoned: wait for data BG to be finished on direct IO allocation
authorNaohiro Aota <naohiro.aota@wdc.com>
Tue, 17 Oct 2023 08:00:31 +0000 (17:00 +0900)
committerGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Tue, 28 Nov 2023 17:07:15 +0000 (17:07 +0000)
commit 776a838f1fa95670c1c1cf7109a898090b473fa3 upstream.

Running the fio command below on a ZNS device results in "Resource
temporarily unavailable" error.

  $ sudo fio --name=w --directory=/mnt --filesize=1GB --bs=16MB --numjobs=16 \
        --rw=write --ioengine=libaio --iodepth=128 --direct=1

  fio: io_u error on file /mnt/w.2.0: Resource temporarily unavailable: write offset=117440512, buflen=16777216
  fio: io_u error on file /mnt/w.2.0: Resource temporarily unavailable: write offset=134217728, buflen=16777216
  ...

This happens because -EAGAIN error returned from btrfs_reserve_extent()
called from btrfs_new_extent_direct() is spilling over to the userland.

btrfs_reserve_extent() returns -EAGAIN when there is no active zone
available. Then, the caller should wait for some other on-going IO to
finish a zone and retry the allocation.

This logic is already implemented for buffered write in cow_file_range(),
but it is missing for the direct IO counterpart. Implement the same logic
for it.

Reported-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Fixes: 2ce543f47843 ("btrfs: zoned: wait until zone is finished when allocation didn't progress")
CC: stable@vger.kernel.org # 6.1+
Tested-by: Shinichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
fs/btrfs/inode.c

index 4063447..81eac12 100644 (file)
@@ -7166,8 +7166,15 @@ static struct extent_map *btrfs_new_extent_direct(struct btrfs_inode *inode,
        int ret;
 
        alloc_hint = get_extent_allocation_hint(inode, start, len);
+again:
        ret = btrfs_reserve_extent(root, len, len, fs_info->sectorsize,
                                   0, alloc_hint, &ins, 1, 1);
+       if (ret == -EAGAIN) {
+               ASSERT(btrfs_is_zoned(fs_info));
+               wait_on_bit_io(&inode->root->fs_info->flags, BTRFS_FS_NEED_ZONE_FINISH,
+                              TASK_UNINTERRUPTIBLE);
+               goto again;
+       }
        if (ret)
                return ERR_PTR(ret);