tmpfs: fix regressions from wider use of ZERO_PAGE
authorHugh Dickins <hughd@google.com>
Fri, 15 Apr 2022 02:13:27 +0000 (19:13 -0700)
committerLinus Torvalds <torvalds@linux-foundation.org>
Fri, 15 Apr 2022 21:49:54 +0000 (14:49 -0700)
Chuck Lever reported fsx-based xfstests generic 075 091 112 127 failing
when 5.18-rc1 NFS server exports tmpfs: bisected to recent tmpfs change.

Whilst nfsd_splice_action() does contain some questionable handling of
repeated pages, and Chuck was able to work around there, history from
Mark Hemment makes clear that there might be similar dangers elsewhere:
it was not a good idea for me to pass ZERO_PAGE down to unknown actors.

Revert shmem_file_read_iter() to using ZERO_PAGE for holes only when
iter_is_iovec(); in other cases, use the more natural iov_iter_zero()
instead of copy_page_to_iter().

We would use iov_iter_zero() throughout, but the x86 clear_user() is not
nearly so well optimized as copy to user (dd of 1T sparse tmpfs file
takes 57 seconds rather than 44 seconds).

And now pagecache_init() does not need to SetPageUptodate(ZERO_PAGE(0)):
which had caused boot failure on arm noMMU STM32F7 and STM32H7 boards

Link: https://lkml.kernel.org/r/9a978571-8648-e830-5735-1f4748ce2e30@google.com
Fixes: 56a8c8eb1eaf ("tmpfs: do not allocate pages on read")
Signed-off-by: Hugh Dickins <hughd@google.com>
Reported-by: Patrice CHOTARD <patrice.chotard@foss.st.com>
Reported-by: Chuck Lever III <chuck.lever@oracle.com>
Tested-by: Chuck Lever III <chuck.lever@oracle.com>
Cc: Mark Hemment <markhemm@googlemail.com>
Cc: Patrice CHOTARD <patrice.chotard@foss.st.com>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Cc: Lukas Czerner <lczerner@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm/filemap.c
mm/shmem.c

index 3a5ffb5587cd055323a49ae63713528f47bb2caf..9a1eef6c5d350e6fadb77be23a48c34818d6310e 100644 (file)
@@ -1063,12 +1063,6 @@ void __init pagecache_init(void)
                init_waitqueue_head(&folio_wait_table[i]);
 
        page_writeback_init();
-
-       /*
-        * tmpfs uses the ZERO_PAGE for reading holes: it is up-to-date,
-        * and splice's page_cache_pipe_buf_confirm() needs to see that.
-        */
-       SetPageUptodate(ZERO_PAGE(0));
 }
 
 /*
index 529c9ad3e9264340ee94f2792bb24b5c19647aba..4b2fea33158e8a33a87354ccefc574ef74c1e793 100644 (file)
@@ -2513,7 +2513,6 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
                pgoff_t end_index;
                unsigned long nr, ret;
                loff_t i_size = i_size_read(inode);
-               bool got_page;
 
                end_index = i_size >> PAGE_SHIFT;
                if (index > end_index)
@@ -2570,24 +2569,34 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
                         */
                        if (!offset)
                                mark_page_accessed(page);
-                       got_page = true;
+                       /*
+                        * Ok, we have the page, and it's up-to-date, so
+                        * now we can copy it to user space...
+                        */
+                       ret = copy_page_to_iter(page, offset, nr, to);
+                       put_page(page);
+
+               } else if (iter_is_iovec(to)) {
+                       /*
+                        * Copy to user tends to be so well optimized, but
+                        * clear_user() not so much, that it is noticeably
+                        * faster to copy the zero page instead of clearing.
+                        */
+                       ret = copy_page_to_iter(ZERO_PAGE(0), offset, nr, to);
                } else {
-                       page = ZERO_PAGE(0);
-                       got_page = false;
+                       /*
+                        * But submitting the same page twice in a row to
+                        * splice() - or others? - can result in confusion:
+                        * so don't attempt that optimization on pipes etc.
+                        */
+                       ret = iov_iter_zero(nr, to);
                }
 
-               /*
-                * Ok, we have the page, and it's up-to-date, so
-                * now we can copy it to user space...
-                */
-               ret = copy_page_to_iter(page, offset, nr, to);
                retval += ret;
                offset += ret;
                index += offset >> PAGE_SHIFT;
                offset &= ~PAGE_MASK;
 
-               if (got_page)
-                       put_page(page);
                if (!iov_iter_count(to))
                        break;
                if (ret < nr) {