writeback: Avoid skipping inode writeback
authorJing Xia <jing.xia@unisoc.com>
Tue, 10 May 2022 02:35:14 +0000 (10:35 +0800)
committerJan Kara <jack@suse.cz>
Tue, 10 May 2022 10:04:21 +0000 (12:04 +0200)
We have run into an issue that a task gets stuck in
balance_dirty_pages_ratelimited() when perform I/O stress testing.
The reason we observed is that an I_DIRTY_PAGES inode with lots
of dirty pages is in b_dirty_time list and standard background
writeback cannot writeback the inode.
After studing the relevant code, the following scenario may lead
to the issue:

task1                                   task2
-----                                   -----
fuse_flush
 write_inode_now //in b_dirty_time
  writeback_single_inode
   __writeback_single_inode
                                 fuse_write_end
                                  filemap_dirty_folio
                                   __xa_set_mark:PAGECACHE_TAG_DIRTY
    lock inode->i_lock
    if mapping tagged PAGECACHE_TAG_DIRTY
    inode->i_state |= I_DIRTY_PAGES
    unlock inode->i_lock
                                   __mark_inode_dirty:I_DIRTY_PAGES
                                      lock inode->i_lock
                                      -was dirty,inode stays in
                                      -b_dirty_time
                                      unlock inode->i_lock

   if(!(inode->i_state & I_DIRTY_All))
      -not true,so nothing done

This patch moves the dirty inode to b_dirty list when the inode
currently is not queued in b_io or b_more_io list at the end of
writeback_single_inode.

Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
CC: stable@vger.kernel.org
Fixes: 0ae45f63d4ef ("vfs: add support for a lazytime mount option")
Signed-off-by: Jing Xia <jing.xia@unisoc.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20220510023514.27399-1-jing.xia@unisoc.com
fs/fs-writeback.c

index 591fe9c..1fae019 100644 (file)
@@ -1712,6 +1712,10 @@ static int writeback_single_inode(struct inode *inode,
         */
        if (!(inode->i_state & I_DIRTY_ALL))
                inode_cgwb_move_to_attached(inode, wb);
+       else if (!(inode->i_state & I_SYNC_QUEUED) &&
+                (inode->i_state & I_DIRTY))
+               redirty_tail_locked(inode, wb);
+
        spin_unlock(&wb->list_lock);
        inode_sync_complete(inode);
 out: