ext4: Fix occasional generic/418 failure
authorJan Kara <jack@suse.cz>
Thu, 15 Apr 2021 15:54:17 +0000 (17:54 +0200)
committerTheodore Ts'o <tytso@mit.edu>
Thu, 22 Apr 2021 20:51:03 +0000 (16:51 -0400)
commit5899593f51e63dde2f07c67358bd65a641585abb
tree46c28acfe70ecc4bcc317c6d35df659e988023bc
parent5afa7e8b70d65819245fece61a65fd753b4aae33
ext4: Fix occasional generic/418 failure

Eric has noticed that after pagecache read rework, generic/418 is
occasionally failing for ext4 when blocksize < pagesize. In fact, the
pagecache rework just made hard to hit race in ext4 more likely. The
problem is that since ext4 conversion of direct IO writes to iomap
framework (commit 378f32bab371), we update inode size after direct IO
write only after invalidating page cache. Thus if buffered read sneaks
at unfortunate moment like:

CPU1 - write at offset 1k                       CPU2 - read from offset 0
iomap_dio_rw(..., IOMAP_DIO_FORCE_WAIT);
                                                ext4_readpage();
ext4_handle_inode_extension()

the read will zero out tail of the page as it still sees smaller inode
size and thus page cache becomes inconsistent with on-disk contents with
all the consequences.

Fix the problem by moving inode size update into end_io handler which
gets called before the page cache is invalidated.

Reported-and-tested-by: Eric Whitney <enwlinux@gmail.com>
Fixes: 378f32bab371 ("ext4: introduce direct I/O write using iomap infrastructure")
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Acked-by: Dave Chinner <dchinner@redhat.com>
Link: https://lore.kernel.org/r/20210415155417.4734-1-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
fs/ext4/file.c