platform/kernel/linux-rpi.git
3 years agonfsd: report client confirmation status in "info" file
NeilBrown [Fri, 19 Mar 2021 22:38:04 +0000 (09:38 +1100)]
nfsd: report client confirmation status in "info" file

mountd can now monitor clients appearing and disappearing in
/proc/fs/nfsd/clients, and will log these events, in liu of the logging
of mount/unmount events for NFSv3.

Currently it cannot distinguish between unconfirmed clients (which might
be transient and totally uninteresting) and confirmed clients.

So add a "status: " line which reports either "confirmed" or
"unconfirmed", and use fsnotify to report that the info file
has been modified.

This requires a bit of infrastructure to keep the dentry for the "info"
file.  There is no need to take a counted reference as the dentry must
remain around until the client is removed.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agonfsd: don't ignore high bits of copy count
J. Bruce Fields [Fri, 19 Mar 2021 00:03:22 +0000 (20:03 -0400)]
nfsd: don't ignore high bits of copy count

Note size_t is 32-bit on a 32-bit architecture, but cp_count is defined
by the protocol to be 64 bit, so we could be turning a large copy into a
0-length copy here.

Reported-by: <radchenkoy@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agonfsd: COPY with length 0 should copy to end of file
J. Bruce Fields [Fri, 19 Mar 2021 00:03:23 +0000 (20:03 -0400)]
nfsd: COPY with length 0 should copy to end of file

>From https://tools.ietf.org/html/rfc7862#page-65

A count of 0 (zero) requests that all bytes from ca_src_offset
through EOF be copied to the destination.

Reported-by: <radchenkoy@gmail.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agonfsd: Fix typo "accesible"
Ricardo Ribalda [Thu, 18 Mar 2021 20:22:21 +0000 (21:22 +0100)]
nfsd: Fix typo "accesible"

Trivial fix.

Cc: linux-nfs@vger.kernel.org
Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agonfsd: Ensure knfsd shuts down when the "nfsd" pseudofs is unmounted
Trond Myklebust [Sat, 13 Mar 2021 21:08:47 +0000 (16:08 -0500)]
nfsd: Ensure knfsd shuts down when the "nfsd" pseudofs is unmounted

In order to ensure that knfsd threads don't linger once the nfsd
pseudofs is unmounted (e.g. when the container is killed) we let
nfsd_umount() shut down those threads and wait for them to exit.

This also should ensure that we don't need to do a kernel mount of
the pseudofs, since the thread lifetime is now limited by the
lifetime of the filesystem.

Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agonfsd: Log client tracking type log message as info instead of warning
Paul Menzel [Fri, 12 Mar 2021 21:03:00 +0000 (22:03 +0100)]
nfsd: Log client tracking type log message as info instead of warning

`printk()`, by default, uses the log level warning, which leaves the
user reading

    NFSD: Using UMH upcall client tracking operations.

wondering what to do about it (`dmesg --level=warn`).

Several client tracking methods are tried, and expected to fail. That’s
why a message is printed only on success. It might be interesting for
users to know the chosen method, so use info-level instead of
debug-level.

Cc: linux-nfs@vger.kernel.org
Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agonfsd: helper for laundromat expiry calculations
J. Bruce Fields [Tue, 2 Mar 2021 15:46:23 +0000 (10:46 -0500)]
nfsd: helper for laundromat expiry calculations

We do this same logic repeatedly, and it's easy to get the sense of the
comparison wrong.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoNFSD: Clean up NFSDDBG_FACILITY macro
Chuck Lever [Fri, 5 Mar 2021 19:22:32 +0000 (14:22 -0500)]
NFSD: Clean up NFSDDBG_FACILITY macro

These are no longer needed because there are no dprintk() call sites
in these files.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoNFSD: Add a tracepoint to record directory entry encoding
Chuck Lever [Fri, 5 Mar 2021 18:57:40 +0000 (13:57 -0500)]
NFSD: Add a tracepoint to record directory entry encoding

Enable watching the progress of directory encoding to capture the
timing of any issues with reading or encoding a directory. The
new tracepoint captures dirent encoding for all NFS versions.

For example, here's what a few NFSv4 directory entries might look
like:

nfsd-989   [002]   468.596265: nfsd_dirent:          fh_hash=0x5d162594 ino=2 name=.
nfsd-989   [002]   468.596267: nfsd_dirent:          fh_hash=0x5d162594 ino=1 name=..
nfsd-989   [002]   468.596299: nfsd_dirent:          fh_hash=0x5d162594 ino=3827 name=zlib.c
nfsd-989   [002]   468.596325: nfsd_dirent:          fh_hash=0x5d162594 ino=3811 name=xdiff
nfsd-989   [002]   468.596351: nfsd_dirent:          fh_hash=0x5d162594 ino=3810 name=xdiff-interface.h
nfsd-989   [002]   468.596377: nfsd_dirent:          fh_hash=0x5d162594 ino=3809 name=xdiff-interface.c

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoNFSD: Clean up after updating NFSv3 ACL encoders
Chuck Lever [Sun, 15 Nov 2020 20:09:16 +0000 (15:09 -0500)]
NFSD: Clean up after updating NFSv3 ACL encoders

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoNFSD: Update the NFSv3 SETACL result encoder to use struct xdr_stream
Chuck Lever [Wed, 18 Nov 2020 21:21:24 +0000 (16:21 -0500)]
NFSD: Update the NFSv3 SETACL result encoder to use struct xdr_stream

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoNFSD: Update the NFSv3 GETACL result encoder to use struct xdr_stream
Chuck Lever [Wed, 18 Nov 2020 21:11:42 +0000 (16:11 -0500)]
NFSD: Update the NFSv3 GETACL result encoder to use struct xdr_stream

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoNFSD: Clean up after updating NFSv2 ACL encoders
Chuck Lever [Sun, 15 Nov 2020 19:31:42 +0000 (14:31 -0500)]
NFSD: Clean up after updating NFSv2 ACL encoders

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoNFSD: Update the NFSv2 ACL ACCESS result encoder to use struct xdr_stream
Chuck Lever [Wed, 18 Nov 2020 19:52:09 +0000 (14:52 -0500)]
NFSD: Update the NFSv2 ACL ACCESS result encoder to use struct xdr_stream

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoNFSD: Update the NFSv2 ACL GETATTR result encoder to use struct xdr_stream
Chuck Lever [Wed, 18 Nov 2020 19:49:57 +0000 (14:49 -0500)]
NFSD: Update the NFSv2 ACL GETATTR result encoder to use struct xdr_stream

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoNFSD: Update the NFSv2 SETACL result encoder to use struct xdr_stream
Chuck Lever [Wed, 18 Nov 2020 19:47:56 +0000 (14:47 -0500)]
NFSD: Update the NFSv2 SETACL result encoder to use struct xdr_stream

The SETACL result encoder is exactly the same as the NFSv2
attrstatres decoder.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoNFSD: Update the NFSv2 GETACL result encoder to use struct xdr_stream
Chuck Lever [Wed, 18 Nov 2020 19:38:47 +0000 (14:38 -0500)]
NFSD: Update the NFSv2 GETACL result encoder to use struct xdr_stream

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoNFSD: Add an xdr_stream-based encoder for NFSv2/3 ACLs
Chuck Lever [Wed, 18 Nov 2020 19:55:05 +0000 (14:55 -0500)]
NFSD: Add an xdr_stream-based encoder for NFSv2/3 ACLs

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoNFSD: Remove unused NFSv2 directory entry encoders
Chuck Lever [Sun, 15 Nov 2020 19:30:13 +0000 (14:30 -0500)]
NFSD: Remove unused NFSv2 directory entry encoders

Clean up.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoNFSD: Update the NFSv2 READDIR entry encoder to use struct xdr_stream
Chuck Lever [Sat, 14 Nov 2020 18:45:35 +0000 (13:45 -0500)]
NFSD: Update the NFSv2 READDIR entry encoder to use struct xdr_stream

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoNFSD: Update the NFSv2 READDIR result encoder to use struct xdr_stream
Chuck Lever [Fri, 23 Oct 2020 20:49:01 +0000 (16:49 -0400)]
NFSD: Update the NFSv2 READDIR result encoder to use struct xdr_stream

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoNFSD: Count bytes instead of pages in the NFSv2 READDIR encoder
Chuck Lever [Fri, 13 Nov 2020 21:57:44 +0000 (16:57 -0500)]
NFSD: Count bytes instead of pages in the NFSv2 READDIR encoder

Clean up: Counting the bytes used by each returned directory entry
seems less brittle to me than trying to measure consumed pages after
the fact.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoNFSD: Add a helper that encodes NFSv3 directory offset cookies
Chuck Lever [Fri, 13 Nov 2020 21:53:17 +0000 (16:53 -0500)]
NFSD: Add a helper that encodes NFSv3 directory offset cookies

Refactor: Add helper function similar to nfs3svc_encode_cookie3().

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoNFSD: Update the NFSv2 STATFS result encoder to use struct xdr_stream
Chuck Lever [Fri, 23 Oct 2020 23:01:38 +0000 (19:01 -0400)]
NFSD: Update the NFSv2 STATFS result encoder to use struct xdr_stream

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoNFSD: Update the NFSv2 READ result encoder to use struct xdr_stream
Chuck Lever [Fri, 23 Oct 2020 20:40:11 +0000 (16:40 -0400)]
NFSD: Update the NFSv2 READ result encoder to use struct xdr_stream

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoNFSD: Update the NFSv2 READLINK result encoder to use struct xdr_stream
Chuck Lever [Fri, 23 Oct 2020 19:41:09 +0000 (15:41 -0400)]
NFSD: Update the NFSv2 READLINK result encoder to use struct xdr_stream

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoNFSD: Update the NFSv2 diropres encoder to use struct xdr_stream
Chuck Lever [Fri, 23 Oct 2020 20:44:16 +0000 (16:44 -0400)]
NFSD: Update the NFSv2 diropres encoder to use struct xdr_stream

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoNFSD: Update the NFSv2 attrstat encoder to use struct xdr_stream
Chuck Lever [Fri, 23 Oct 2020 19:28:59 +0000 (15:28 -0400)]
NFSD: Update the NFSv2 attrstat encoder to use struct xdr_stream

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoNFSD: Update the NFSv2 stat encoder to use struct xdr_stream
Chuck Lever [Fri, 23 Oct 2020 15:08:02 +0000 (11:08 -0400)]
NFSD: Update the NFSv2 stat encoder to use struct xdr_stream

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoNFSD: Reduce svc_rqst::rq_pages churn during READDIR operations
Chuck Lever [Fri, 15 Jan 2021 14:28:44 +0000 (09:28 -0500)]
NFSD: Reduce svc_rqst::rq_pages churn during READDIR operations

During NFSv2 and NFSv3 READDIR/PLUS operations, NFSD advances
rq_next_page to the full size of the client-requested buffer, then
releases all those pages at the end of the request. The next request
to use that nfsd thread has to refill the pages.

NFSD does this even when the dirlist in the reply is small. With
NFSv3 clients that send READDIR operations with large buffer sizes,
that can be 256 put_page/alloc_page pairs per READDIR request, even
though those pages often remain unused.

We can save some work by not releasing dirlist buffer pages that
were not used to form the READDIR Reply. I've left the NFSv2 code
alone since there are never more than three pages involved in an
NFSv2 READDIR Reply.

Eventually we should nail down why these pages need to be released
at all in order to avoid allocating and releasing pages
unnecessarily.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoNFSD: Remove unused NFSv3 directory entry encoders
Chuck Lever [Fri, 13 Nov 2020 16:27:13 +0000 (11:27 -0500)]
NFSD: Remove unused NFSv3 directory entry encoders

Clean up.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoNFSD: Update NFSv3 READDIR entry encoders to use struct xdr_stream
Chuck Lever [Thu, 22 Oct 2020 23:46:58 +0000 (19:46 -0400)]
NFSD: Update NFSv3 READDIR entry encoders to use struct xdr_stream

The benefit of the xdr_stream helpers is that they transparently
handle encoding an XDR data item that crosses page boundaries.
Most of the open-coded logic to do that here can be eliminated.

A sub-buffer and sub-stream are set up as a sink buffer for the
directory entry encoder. As an entry is encoded, it is added to
the end of the content in this buffer/stream. The total length of
the directory list is tracked in the buffer's @len field.

When it comes time to encode the Reply, the sub-buffer is merged
into rq_res's page array at the correct place using
xdr_write_pages().

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoNFSD: Update the NFSv3 READDIR3res encoder to use struct xdr_stream
Chuck Lever [Thu, 22 Oct 2020 23:31:48 +0000 (19:31 -0400)]
NFSD: Update the NFSv3 READDIR3res encoder to use struct xdr_stream

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoNFSD: Count bytes instead of pages in the NFSv3 READDIR encoder
Chuck Lever [Mon, 9 Nov 2020 18:13:21 +0000 (13:13 -0500)]
NFSD: Count bytes instead of pages in the NFSv3 READDIR encoder

Clean up: Counting the bytes used by each returned directory entry
seems less brittle to me than trying to measure consumed pages after
the fact.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoNFSD: Add a helper that encodes NFSv3 directory offset cookies
Chuck Lever [Tue, 10 Nov 2020 14:57:14 +0000 (09:57 -0500)]
NFSD: Add a helper that encodes NFSv3 directory offset cookies

Refactor: De-duplicate identical code that handles encoding of
directory offset cookies across page boundaries.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoNFSD: Update the NFSv3 COMMIT3res encoder to use struct xdr_stream
Chuck Lever [Thu, 22 Oct 2020 19:35:46 +0000 (15:35 -0400)]
NFSD: Update the NFSv3 COMMIT3res encoder to use struct xdr_stream

As an additional clean up, encode_wcc_data() is removed because it
is now no longer used.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoNFSD: Update the NFSv3 PATHCONF3res encoder to use struct xdr_stream
Chuck Lever [Fri, 6 Nov 2020 18:15:09 +0000 (13:15 -0500)]
NFSD: Update the NFSv3 PATHCONF3res encoder to use struct xdr_stream

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoNFSD: Update the NFSv3 FSINFO3res encoder to use struct xdr_stream
Chuck Lever [Thu, 22 Oct 2020 17:42:13 +0000 (13:42 -0400)]
NFSD: Update the NFSv3 FSINFO3res encoder to use struct xdr_stream

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoNFSD: Update the NFSv3 FSSTAT3res encoder to use struct xdr_stream
Chuck Lever [Fri, 6 Nov 2020 18:08:45 +0000 (13:08 -0500)]
NFSD: Update the NFSv3 FSSTAT3res encoder to use struct xdr_stream

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoNFSD: Update the NFSv3 LINK3res encoder to use struct xdr_stream
Chuck Lever [Thu, 22 Oct 2020 19:08:29 +0000 (15:08 -0400)]
NFSD: Update the NFSv3 LINK3res encoder to use struct xdr_stream

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoNFSD: Update the NFSv3 RENAMEv3res encoder to use struct xdr_stream
Chuck Lever [Thu, 22 Oct 2020 19:33:05 +0000 (15:33 -0400)]
NFSD: Update the NFSv3 RENAMEv3res encoder to use struct xdr_stream

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoNFSD: Update the NFSv3 CREATE family of encoders to use struct xdr_stream
Chuck Lever [Thu, 22 Oct 2020 19:27:23 +0000 (15:27 -0400)]
NFSD: Update the NFSv3 CREATE family of encoders to use struct xdr_stream

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoNFSD: Update the NFSv3 WRITE3res encoder to use struct xdr_stream
Chuck Lever [Thu, 22 Oct 2020 19:26:31 +0000 (15:26 -0400)]
NFSD: Update the NFSv3 WRITE3res encoder to use struct xdr_stream

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoNFSD: Update the NFSv3 READ3res encode to use struct xdr_stream
Chuck Lever [Thu, 22 Oct 2020 19:23:50 +0000 (15:23 -0400)]
NFSD: Update the NFSv3 READ3res encode to use struct xdr_stream

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoNFSD: Update the NFSv3 READLINK3res encoder to use struct xdr_stream
Chuck Lever [Thu, 22 Oct 2020 19:18:40 +0000 (15:18 -0400)]
NFSD: Update the NFSv3 READLINK3res encoder to use struct xdr_stream

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoNFSD: Update the NFSv3 wccstat result encoder to use struct xdr_stream
Chuck Lever [Thu, 22 Oct 2020 19:12:38 +0000 (15:12 -0400)]
NFSD: Update the NFSv3 wccstat result encoder to use struct xdr_stream

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoNFSD: Update the NFSv3 LOOKUP3res encoder to use struct xdr_stream
Chuck Lever [Thu, 22 Oct 2020 18:46:58 +0000 (14:46 -0400)]
NFSD: Update the NFSv3 LOOKUP3res encoder to use struct xdr_stream

Also, clean up: Rename the encoder function to match the name of
the result structure in RFC 1813, consistent with other encoder
function names in nfs3xdr.c. "diropres" is an NFSv2 thingie.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoNFSD: Update the NFSv3 ACCESS3res encoder to use struct xdr_stream
Chuck Lever [Thu, 22 Oct 2020 17:56:58 +0000 (13:56 -0400)]
NFSD: Update the NFSv3 ACCESS3res encoder to use struct xdr_stream

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoNFSD: Update the GETATTR3res encoder to use struct xdr_stream
Chuck Lever [Wed, 21 Oct 2020 15:58:41 +0000 (11:58 -0400)]
NFSD: Update the GETATTR3res encoder to use struct xdr_stream

As an additional clean up, some renaming is done to more closely
reflect the data type and variable names used in the NFSv3 XDR
definition provided in RFC 1813. "attrstat" is an NFSv2 thingie.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoNFSD: Extract the svcxdr_init_encode() helper
Chuck Lever [Tue, 27 Oct 2020 19:53:42 +0000 (15:53 -0400)]
NFSD: Extract the svcxdr_init_encode() helper

NFSD initializes an encode xdr_stream only after the RPC layer has
already inserted the RPC Reply header. Thus it behaves differently
than xdr_init_encode does, which assumes the passed-in xdr_buf is
entirely devoid of content.

nfs4proc.c has this server-side stream initialization helper, but
it is visible only to the NFSv4 code. Move this helper to a place
that can be accessed by NFSv2 and NFSv3 server XDR functions.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
3 years agoLinux 5.12-rc4
Linus Torvalds [Sun, 21 Mar 2021 21:56:43 +0000 (14:56 -0700)]
Linux 5.12-rc4

3 years agoMerge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 21 Mar 2021 21:06:10 +0000 (14:06 -0700)]
Merge tag 'ext4_for_linus_stable' of git://git./linux/kernel/git/tytso/ext4

Pull ext4 fixes from Ted Ts'o:
 "Miscellaneous ext4 bug fixes for v5.12"

* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: initialize ret to suppress smatch warning
  ext4: stop inode update before return
  ext4: fix rename whiteout with fast commit
  ext4: fix timer use-after-free on failed mount
  ext4: fix potential error in ext4_do_update_inode
  ext4: do not try to set xattr into ea_inode if value is empty
  ext4: do not iput inode under running transaction in ext4_rename()
  ext4: find old entry again if failed to rename whiteout
  ext4: fix error handling in ext4_end_enable_verity()
  ext4: fix bh ref count on error paths
  fs/ext4: fix integer overflow in s_log_groups_per_flex
  ext4: add reclaim checks to xattr code
  ext4: shrink race window in ext4_should_retry_alloc()

3 years agoMerge tag 'io_uring-5.12-2021-03-21' of git://git.kernel.dk/linux-block
Linus Torvalds [Sun, 21 Mar 2021 19:25:54 +0000 (12:25 -0700)]
Merge tag 'io_uring-5.12-2021-03-21' of git://git.kernel.dk/linux-block

Pull io_uring followup fixes from Jens Axboe:

 - The SIGSTOP change from Eric, so we properly ignore that for
   PF_IO_WORKER threads.

 - Disallow sending signals to PF_IO_WORKER threads in general, we're
   not interested in having them funnel back to the io_uring owning
   task.

 - Stable fix from Stefan, ensuring we properly break links for short
   send/sendmsg recv/recvmsg if MSG_WAITALL is set.

 - Catch and loop when needing to run task_work before a PF_IO_WORKER
   threads goes to sleep.

* tag 'io_uring-5.12-2021-03-21' of git://git.kernel.dk/linux-block:
  io_uring: call req_set_fail_links() on short send[msg]()/recv[msg]() with MSG_WAITALL
  io-wq: ensure task is running before processing task_work
  signal: don't allow STOP on PF_IO_WORKER threads
  signal: don't allow sending any signals to PF_IO_WORKER threads

3 years agoMerge tag 'staging-5.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh...
Linus Torvalds [Sun, 21 Mar 2021 18:54:04 +0000 (11:54 -0700)]
Merge tag 'staging-5.12-rc4' of git://git./linux/kernel/git/gregkh/staging

Pull staging and IIO driver fixes from Greg KH:
 "Some small staging and IIO driver fixes:

   - MAINTAINERS changes for the move of the staging mailing list

   - comedi driver fixes to get request_irq() to work correctly

   - counter driver fixes for reported issues with iio devices

   - tiny iio driver fixes for reported issues.

  All of these have been in linux-next with no reported problems"

* tag 'staging-5.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  staging: vt665x: fix alignment constraints
  staging: comedi: cb_pcidas64: fix request_irq() warn
  staging: comedi: cb_pcidas: fix request_irq() warn
  MAINTAINERS: move the staging subsystem to lists.linux.dev
  MAINTAINERS: move some real subsystems off of the staging mailing list
  iio: gyro: mpu3050: Fix error handling in mpu3050_trigger_handler
  iio: hid-sensor-temperature: Fix issues of timestamp channel
  iio: hid-sensor-humidity: Fix alignment issue of timestamp channel
  counter: stm32-timer-cnt: fix ceiling miss-alignment with reload register
  counter: stm32-timer-cnt: fix ceiling write max value
  counter: stm32-timer-cnt: Report count function when SLAVE_MODE_DISABLED
  iio: adc: ab8500-gpadc: Fix off by 10 to 3
  iio:adc:stm32-adc: Add HAS_IOMEM dependency
  iio: adis16400: Fix an error code in adis16400_initial_setup()
  iio: adc: adi-axi-adc: add proper Kconfig dependencies
  iio: adc: ad7949: fix wrong ADC result due to incorrect bit mask
  iio: hid-sensor-prox: Fix scale not correct issue
  iio:adc:qcom-spmi-vadc: add default scale to LR_MUX2_BAT_ID channel

3 years agoMerge tag 'usb-5.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Linus Torvalds [Sun, 21 Mar 2021 18:49:16 +0000 (11:49 -0700)]
Merge tag 'usb-5.12-rc4' of git://git./linux/kernel/git/gregkh/usb

Pull USB and Thunderbolt driver fixes from Greg KH:
 "Here are some small Thunderbolt and USB driver fixes for some reported
  issues:

   - thunderbolt fixes for minor problems

   - typec fixes for power issues

   - usb-storage quirk addition

   - usbip bugfix

   - dwc3 bugfix when stopping transfers

   - cdnsp bugfix for isoc transfers

   - gadget use-after-free fix

  All have been in linux-next this week with no reported issues"

* tag 'usb-5.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  usb: typec: tcpm: Skip sink_cap query only when VDM sm is busy
  usb: dwc3: gadget: Prevent EP queuing while stopping transfers
  usb: typec: tcpm: Invoke power_supply_changed for tcpm-source-psy-
  usb: typec: Remove vdo[3] part of tps6598x_rx_identity_reg struct
  usb-storage: Add quirk to defeat Kindle's automatic unload
  usb: gadget: configfs: Fix KASAN use-after-free
  usbip: Fix incorrect double assignment to udc->ud.tcp_rx
  usb: cdnsp: Fixes incorrect value in ISOC TRB
  thunderbolt: Increase runtime PM reference count on DP tunnel discovery
  thunderbolt: Initialize HopID IDAs in tb_switch_alloc()

3 years agoMerge tag 'irq-urgent-2021-03-21' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 21 Mar 2021 18:34:24 +0000 (11:34 -0700)]
Merge tag 'irq-urgent-2021-03-21' of git://git./linux/kernel/git/tip/tip

Pull irq fix from Ingo Molnar:
 "A change to robustify force-threaded IRQ handlers to always disable
  interrupts, plus a DocBook fix.

  The force-threaded IRQ handler change has been accelerated from the
  normal schedule of such a change to keep the bad pattern/workaround of
  spin_lock_irqsave() in handlers or IRQF_NOTHREAD as a kludge from
  spreading"

* tag 'irq-urgent-2021-03-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq: Disable interrupts for force threaded handlers
  genirq/irq_sim: Fix typos in kernel doc (fnode -> fwnode)

3 years agoMerge tag 'perf-urgent-2021-03-21' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 21 Mar 2021 18:26:21 +0000 (11:26 -0700)]
Merge tag 'perf-urgent-2021-03-21' of git://git./linux/kernel/git/tip/tip

Pull perf fixes from Ingo Molnar:
 "Boundary condition fixes for bugs unearthed by the perf fuzzer"

* tag 'perf-urgent-2021-03-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel: Fix unchecked MSR access error caused by VLBR_EVENT
  perf/x86/intel: Fix a crash caused by zero PEBS status

3 years agoMerge tag 'locking-urgent-2021-03-21' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 21 Mar 2021 18:19:29 +0000 (11:19 -0700)]
Merge tag 'locking-urgent-2021-03-21' of git://git./linux/kernel/git/tip/tip

Pull locking fixes from Ingo Molnar:

 - Get static calls & modules right. Hopefully.

 - WW mutex fixes

* tag 'locking-urgent-2021-03-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  static_call: Fix static_call_update() sanity check
  static_call: Align static_call_is_init() patching condition
  static_call: Fix static_call_set_init()
  locking/ww_mutex: Fix acquire/release imbalance in ww_acquire_init()/ww_acquire_fini()
  locking/ww_mutex: Simplify use_ww_ctx & ww_ctx handling

3 years agoMerge tag 'efi-urgent-2021-03-21' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 21 Mar 2021 18:11:22 +0000 (11:11 -0700)]
Merge tag 'efi-urgent-2021-03-21' of git://git./linux/kernel/git/tip/tip

Pull EFI fixes from Ingo Molnar:

 - another missing RT_PROP table related fix, to ensure that the
   efivarfs pseudo filesystem fails gracefully if variable services
   are unsupported

 - use the correct alignment for literal EFI GUIDs

 - fix a use after unmap issue in the memreserve code

* tag 'efi-urgent-2021-03-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  efi: use 32-bit alignment for efi_guid_t literals
  firmware/efi: Fix a use after bug in efi_mem_reserve_persistent
  efivars: respect EFI_UNSUPPORTED return from firmware

3 years agoMerge tag 'x86_urgent_for_v5.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 21 Mar 2021 18:04:20 +0000 (11:04 -0700)]
Merge tag 'x86_urgent_for_v5.12-rc4' of git://git./linux/kernel/git/tip/tip

Pull x86 fixes from Borislav Petkov:
 "The freshest pile of shiny x86 fixes for 5.12:

   - Add the arch-specific mapping between physical and logical CPUs to
     fix devicetree-node lookups

   - Restore the IRQ2 ignore logic

   - Fix get_nr_restart_syscall() to return the correct restart syscall
     number. Split in a 4-patches set to avoid kABI breakage when
     backporting to dead kernels"

* tag 'x86_urgent_for_v5.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/apic/of: Fix CPU devicetree-node lookups
  x86/ioapic: Ignore IRQ2 again
  x86: Introduce restart_block->arch_data to remove TS_COMPAT_RESTART
  x86: Introduce TS_COMPAT_RESTART to fix get_nr_restart_syscall()
  x86: Move TS_COMPAT back to asm/thread_info.h
  kernel, fs: Introduce and use set_restart_fn() and arch_set_restart_data()

3 years agoMerge tag 'powerpc-5.12-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc...
Linus Torvalds [Sun, 21 Mar 2021 17:57:35 +0000 (10:57 -0700)]
Merge tag 'powerpc-5.12-4' of git://git./linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:

 - Fix a possible stack corruption and subsequent DLPAR failure in the
   rpadlpar_io PCI hotplug driver

 - Two build fixes for uncommon configurations

Thanks to Christophe Leroy and Tyrel Datwyler.

* tag 'powerpc-5.12-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  PCI: rpadlpar: Fix potential drc_name corruption in store functions
  powerpc: Force inlining of cpu_has_feature() to avoid build failure
  powerpc/vdso32: Add missing _restgpr_31_x to fix build failure

3 years agoio_uring: call req_set_fail_links() on short send[msg]()/recv[msg]() with MSG_WAITALL
Stefan Metzmacher [Sat, 20 Mar 2021 19:33:36 +0000 (20:33 +0100)]
io_uring: call req_set_fail_links() on short send[msg]()/recv[msg]() with MSG_WAITALL

Without that it's not safe to use them in a linked combination with
others.

Now combinations like IORING_OP_SENDMSG followed by IORING_OP_SPLICE
should be possible.

We already handle short reads and writes for the following opcodes:

- IORING_OP_READV
- IORING_OP_READ_FIXED
- IORING_OP_READ
- IORING_OP_WRITEV
- IORING_OP_WRITE_FIXED
- IORING_OP_WRITE
- IORING_OP_SPLICE
- IORING_OP_TEE

Now we have it for these as well:

- IORING_OP_SENDMSG
- IORING_OP_SEND
- IORING_OP_RECVMSG
- IORING_OP_RECV

For IORING_OP_RECVMSG we also check for the MSG_TRUNC and MSG_CTRUNC
flags in order to call req_set_fail_links().

There might be applications arround depending on the behavior
that even short send[msg]()/recv[msg]() retuns continue an
IOSQE_IO_LINK chain.

It's very unlikely that such applications pass in MSG_WAITALL,
which is only defined in 'man 2 recvmsg', but not in 'man 2 sendmsg'.

It's expected that the low level sock_sendmsg() call just ignores
MSG_WAITALL, as MSG_ZEROCOPY is also ignored without explicitly set
SO_ZEROCOPY.

We also expect the caller to know about the implicit truncation to
MAX_RW_COUNT, which we don't detect.

cc: netdev@vger.kernel.org
Link: https://lore.kernel.org/r/c4e1a4cc0d905314f4d5dc567e65a7b09621aab3.1615908477.git.metze@samba.org
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoio-wq: ensure task is running before processing task_work
Jens Axboe [Sun, 21 Mar 2021 13:06:56 +0000 (07:06 -0600)]
io-wq: ensure task is running before processing task_work

Mark the current task as running if we need to run task_work from the
io-wq threads as part of work handling. If that is the case, then return
as such so that the caller can appropriately loop back and reset if it
was part of a going-to-sleep flush.

Fixes: 3bfe6106693b ("io-wq: fork worker threads from original task")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agosignal: don't allow STOP on PF_IO_WORKER threads
Eric W. Biederman [Sun, 21 Mar 2021 15:37:48 +0000 (09:37 -0600)]
signal: don't allow STOP on PF_IO_WORKER threads

Just like we don't allow normal signals to IO threads, don't deliver a
STOP to a task that has PF_IO_WORKER set. The IO threads don't take
signals in general, and have no means of flushing out a stop either.

Longer term, we may want to look into allowing stop of these threads,
as it relates to eg process freezing. For now, this prevents a spin
issue if a SIGSTOP is delivered to the parent task.

Reported-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
3 years agosignal: don't allow sending any signals to PF_IO_WORKER threads
Jens Axboe [Sat, 20 Mar 2021 01:25:13 +0000 (19:25 -0600)]
signal: don't allow sending any signals to PF_IO_WORKER threads

They don't take signals individually, and even if they share signals with
the parent task, don't allow them to be delivered through the worker
thread. Linux does allow this kind of behavior for regular threads, but
it's really a compatability thing that we need not care about for the IO
threads.

Reported-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agoext4: initialize ret to suppress smatch warning
Theodore Ts'o [Sun, 21 Mar 2021 04:45:37 +0000 (00:45 -0400)]
ext4: initialize ret to suppress smatch warning

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
3 years agoext4: stop inode update before return
Pan Bian [Sun, 17 Jan 2021 08:57:32 +0000 (00:57 -0800)]
ext4: stop inode update before return

The inode update should be stopped before returing the error code.

Signed-off-by: Pan Bian <bianpan2016@163.com>
Link: https://lore.kernel.org/r/20210117085732.93788-1-bianpan2016@163.com
Fixes: 8016e29f4362 ("ext4: fast commit recovery path")
Cc: stable@kernel.org
Reviewed-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
3 years agoext4: fix rename whiteout with fast commit
Harshad Shirwadkar [Tue, 16 Mar 2021 22:19:21 +0000 (15:19 -0700)]
ext4: fix rename whiteout with fast commit

This patch adds rename whiteout support in fast commits. Note that the
whiteout object that gets created is actually char device. Which
imples, the function ext4_inode_journal_mode(struct inode *inode)
would return "JOURNAL_DATA" for this inode. This has a consequence in
fast commit code that it will make creation of the whiteout object a
fast-commit ineligible behavior and thus will fall back to full
commits. With this patch, this can be observed by running fast commits
with rename whiteout and seeing the stats generated by ext4_fc_stats
tracepoint as follows:

ext4_fc_stats: dev 254:32 fc ineligible reasons:
XATTR:0, CROSS_RENAME:0, JOURNAL_FLAG_CHANGE:0, NO_MEM:0, SWAP_BOOT:0,
RESIZE:0, RENAME_DIR:0, FALLOC_RANGE:0, INODE_JOURNAL_DATA:16;
num_commits:6, ineligible: 6, numblks: 3

So in short, this patch guarantees that in case of rename whiteout, we
fall back to full commits.

Amir mentioned that instead of creating a new whiteout object for
every rename, we can create a static whiteout object with irrelevant
nlink. That will make fast commits to not fall back to full
commit. But until this happens, this patch will ensure correctness by
falling back to full commits.

Fixes: 8016e29f4362 ("ext4: fast commit recovery path")
Cc: stable@kernel.org
Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Link: https://lore.kernel.org/r/20210316221921.1124955-1-harshadshirwadkar@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
3 years agoext4: fix timer use-after-free on failed mount
Jan Kara [Mon, 15 Mar 2021 16:59:06 +0000 (17:59 +0100)]
ext4: fix timer use-after-free on failed mount

When filesystem mount fails because of corrupted filesystem we first
cancel the s_err_report timer reminding fs errors every day and only
then we flush s_error_work. However s_error_work may report another fs
error and re-arm timer thus resulting in timer use-after-free. Fix the
problem by first flushing the work and only after that canceling the
s_err_report timer.

Reported-by: syzbot+628472a2aac693ab0fcd@syzkaller.appspotmail.com
Fixes: 2d01ddc86606 ("ext4: save error info to sb through journal if available")
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20210315165906.2175-1-jack@suse.cz
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
3 years agoext4: fix potential error in ext4_do_update_inode
Shijie Luo [Fri, 12 Mar 2021 06:50:51 +0000 (01:50 -0500)]
ext4: fix potential error in ext4_do_update_inode

If set_large_file = 1 and errors occur in ext4_handle_dirty_metadata(),
the error code will be overridden, go to out_brelse to avoid this
situation.

Signed-off-by: Shijie Luo <luoshijie1@huawei.com>
Link: https://lore.kernel.org/r/20210312065051.36314-1-luoshijie1@huawei.com
Cc: stable@kernel.org
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
3 years agoext4: do not try to set xattr into ea_inode if value is empty
zhangyi (F) [Fri, 5 Mar 2021 12:05:08 +0000 (20:05 +0800)]
ext4: do not try to set xattr into ea_inode if value is empty

Syzbot report a warning that ext4 may create an empty ea_inode if set
an empty extent attribute to a file on the file system which is no free
blocks left.

  WARNING: CPU: 6 PID: 10667 at fs/ext4/xattr.c:1640 ext4_xattr_set_entry+0x10f8/0x1114 fs/ext4/xattr.c:1640
  ...
  Call trace:
   ext4_xattr_set_entry+0x10f8/0x1114 fs/ext4/xattr.c:1640
   ext4_xattr_block_set+0x1d0/0x1b1c fs/ext4/xattr.c:1942
   ext4_xattr_set_handle+0x8a0/0xf1c fs/ext4/xattr.c:2390
   ext4_xattr_set+0x120/0x1f0 fs/ext4/xattr.c:2491
   ext4_xattr_trusted_set+0x48/0x5c fs/ext4/xattr_trusted.c:37
   __vfs_setxattr+0x208/0x23c fs/xattr.c:177
  ...

Now, ext4 try to store extent attribute into an external inode if
ext4_xattr_block_set() return -ENOSPC, but for the case of store an
empty extent attribute, store the extent entry into the extent
attribute block is enough. A simple reproduce below.

  fallocate test.img -l 1M
  mkfs.ext4 -F -b 2048 -O ea_inode test.img
  mount test.img /mnt
  dd if=/dev/zero of=/mnt/foo bs=2048 count=500
  setfattr -n "user.test" /mnt/foo

Reported-by: syzbot+98b881fdd8ebf45ab4ae@syzkaller.appspotmail.com
Fixes: 9c6e7853c531 ("ext4: reserve space for xattr entries/names")
Cc: stable@kernel.org
Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
Link: https://lore.kernel.org/r/20210305120508.298465-1-yi.zhang@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
3 years agoext4: do not iput inode under running transaction in ext4_rename()
zhangyi (F) [Wed, 3 Mar 2021 13:17:03 +0000 (21:17 +0800)]
ext4: do not iput inode under running transaction in ext4_rename()

In ext4_rename(), when RENAME_WHITEOUT failed to add new entry into
directory, it ends up dropping new created whiteout inode under the
running transaction. After commit <9b88f9fb0d2> ("ext4: Do not iput inode
under running transaction"), we follow the assumptions that evict() does
not get called from a transaction context but in ext4_rename() it breaks
this suggestion. Although it's not a real problem, better to obey it, so
this patch add inode to orphan list and stop transaction before final
iput().

Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
Link: https://lore.kernel.org/r/20210303131703.330415-2-yi.zhang@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
3 years agoext4: find old entry again if failed to rename whiteout
zhangyi (F) [Wed, 3 Mar 2021 13:17:02 +0000 (21:17 +0800)]
ext4: find old entry again if failed to rename whiteout

If we failed to add new entry on rename whiteout, we cannot reset the
old->de entry directly, because the old->de could have moved from under
us during make indexed dir. So find the old entry again before reset is
needed, otherwise it may corrupt the filesystem as below.

  /dev/sda: Entry '00000001' in ??? (12) has deleted/unused inode 15. CLEARED.
  /dev/sda: Unattached inode 75
  /dev/sda: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.

Fixes: 6b4b8e6b4ad ("ext4: fix bug for rename with RENAME_WHITEOUT")
Cc: stable@vger.kernel.org
Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
Link: https://lore.kernel.org/r/20210303131703.330415-1-yi.zhang@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
3 years agogenirq: Disable interrupts for force threaded handlers
Thomas Gleixner [Wed, 17 Mar 2021 14:38:52 +0000 (15:38 +0100)]
genirq: Disable interrupts for force threaded handlers

With interrupt force threading all device interrupt handlers are invoked
from kernel threads. Contrary to hard interrupt context the invocation only
disables bottom halfs, but not interrupts. This was an oversight back then
because any code like this will have an issue:

thread(irq_A)
  irq_handler(A)
    spin_lock(&foo->lock);

interrupt(irq_B)
  irq_handler(B)
    spin_lock(&foo->lock);

This has been triggered with networking (NAPI vs. hrtimers) and console
drivers where printk() happens from an interrupt which interrupted the
force threaded handler.

Now people noticed and started to change the spin_lock() in the handler to
spin_lock_irqsave() which affects performance or add IRQF_NOTHREAD to the
interrupt request which in turn breaks RT.

Fix the root cause and not the symptom and disable interrupts before
invoking the force threaded handler which preserves the regular semantics
and the usefulness of the interrupt force threading as a general debugging
tool.

For not RT this is not changing much, except that during the execution of
the threaded handler interrupts are delayed until the handler
returns. Vs. scheduling and softirq processing there is no difference.

For RT kernels there is no issue.

Fixes: 8d32a307e4fa ("genirq: Provide forced interrupt threading")
Reported-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Johan Hovold <johan@kernel.org>
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/20210317143859.513307808@linutronix.de
3 years agoMerge tag 'riscv-for-linus-5.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 20 Mar 2021 18:01:54 +0000 (11:01 -0700)]
Merge tag 'riscv-for-linus-5.12-rc4' of git://git./linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:
 "A handful of fixes for 5.12:

   - fix the SBI remote fence numbers for hypervisor fences, which had
     been transcribed in the wrong order in Linux. These fences are only
     used with the KVM patches applied.

   - fix a whole host of build warnings, these should have no functional
     change.

   - fix init_resources() to prevent an off-by-one error from causing an
     out-of-bounds array reference. This was manifesting during boot on
     vexriscv.

   - ensure the KASAN mappings are visible before proceeding to use
     them"

* tag 'riscv-for-linus-5.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: Correct SPARSEMEM configuration
  RISC-V: kasan: Declare kasan_shallow_populate() static
  riscv: Ensure page table writes are flushed when initializing KASAN vmalloc
  RISC-V: Fix out-of-bounds accesses in init_resources()
  riscv: Fix compilation error with Canaan SoC
  ftrace: Fix spelling mistake "disabed" -> "disabled"
  riscv: fix bugon.cocci warnings
  riscv: process: Fix no prototype for arch_dup_task_struct
  riscv: ftrace: Use ftrace_get_regs helper
  riscv: process: Fix no prototype for show_regs
  riscv: syscall_table: Reduce W=1 compilation warnings noise
  riscv: time: Fix no prototype for time_init
  riscv: ptrace: Fix no prototype warnings
  riscv: sbi: Fix comment of __sbi_set_timer_v01
  riscv: irq: Fix no prototype warning
  riscv: traps: Fix no prototype warnings
  RISC-V: correct enum sbi_ext_rfence_fid

3 years agoMerge tag '5.12-rc3-smb3' of git://git.samba.org/sfrench/cifs-2.6
Linus Torvalds [Sat, 20 Mar 2021 18:00:25 +0000 (11:00 -0700)]
Merge tag '5.12-rc3-smb3' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs fixes from Steve French:
 "Five cifs/smb3 fixes - three for stable, including an important ACL
  fix and security signature fix"

* tag '5.12-rc3-smb3' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: fix allocation size on newly created files
  cifs: warn and fail if trying to use rootfs without the config option
  fs/cifs/: fix misspellings using codespell tool
  cifs: Fix preauth hash corruption
  cifs: update new ACE pointer after populate_new_aces.

3 years agoMerge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds [Sat, 20 Mar 2021 17:57:10 +0000 (10:57 -0700)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Eight fixes, all in drivers, all fairly minor either being fixes in
  error legs, memory leaks on teardown, context errors or semantic
  problems"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: mpt3sas: Do not use GFP_KERNEL in atomic context
  scsi: ufs: ufs-mediatek: Correct operator & -> &&
  scsi: sd_zbc: Update write pointer offset cache
  scsi: lpfc: Fix some error codes in debugfs
  scsi: qla2xxx: Fix broken #endif placement
  scsi: st: Fix a use after free in st_open()
  scsi: myrs: Fix a double free in myrs_cleanup()
  scsi: ibmvfc: Free channel_setup_buf during device tear down

3 years agoMerge tag 'zonefs-5.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal...
Linus Torvalds [Sat, 20 Mar 2021 00:32:30 +0000 (17:32 -0700)]
Merge tag 'zonefs-5.12-rc4' of git://git./linux/kernel/git/dlemoal/zonefs

Pull zonefs fixes from Damien Le Moal:

 - fix inode write open reference count (Chao)

 - Fix wrong write offset for asynchronous O_APPEND writes (me)

 - Prevent use of sequential zone file as swap files (me)

* tag 'zonefs-5.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs:
  zonefs: fix to update .i_wr_refcnt correctly in zonefs_open_zone()
  zonefs: Fix O_APPEND async write handling
  zonefs: prevent use of seq files as swap file

3 years agoMerge tag 'block-5.12-2021-03-19' of git://git.kernel.dk/linux-block
Linus Torvalds [Sat, 20 Mar 2021 00:07:10 +0000 (17:07 -0700)]
Merge tag 'block-5.12-2021-03-19' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
 "Just an NVMe pull request this week:

   - fix tag allocation for keep alive

   - fix a unit mismatch for the Write Zeroes limits

   - various TCP transport fixes (Sagi Grimberg, Elad Grupi)

   - fix iosqes and iocqes validation for discovery controllers (Sagi Grimberg)"

* tag 'block-5.12-2021-03-19' of git://git.kernel.dk/linux-block:
  nvmet-tcp: fix kmap leak when data digest in use
  nvmet: don't check iosqes,iocqes for discovery controllers
  nvme-rdma: fix possible hang when failing to set io queues
  nvme-tcp: fix possible hang when failing to set io queues
  nvme-tcp: fix misuse of __smp_processor_id with preemption enabled
  nvme-tcp: fix a NULL deref when receiving a 0-length r2t PDU
  nvme: fix Write Zeroes limitations
  nvme: allocate the keep alive request using BLK_MQ_REQ_NOWAIT
  nvme: merge nvme_keep_alive into nvme_keep_alive_work
  nvme-fabrics: only reserve a single tag

3 years agoMerge tag 'io_uring-5.12-2021-03-19' of git://git.kernel.dk/linux-block
Linus Torvalds [Sat, 20 Mar 2021 00:01:09 +0000 (17:01 -0700)]
Merge tag 'io_uring-5.12-2021-03-19' of git://git.kernel.dk/linux-block

Pull io_uring fixes from Jens Axboe:
 "Quieter week this time, which was both expected and desired. About
  half of the below is fixes for this release, the other half are just
  fixes in general. In detail:

   - Fix the freezing of IO threads, by making the freezer not send them
     fake signals. Make them freezable by default.

   - Like we did for personalities, move the buffer IDR to xarray. Kills
     some code and avoids a use-after-free on teardown.

   - SQPOLL cleanups and fixes (Pavel)

   - Fix linked timeout race (Pavel)

   - Fix potential completion post use-after-free (Pavel)

   - Cleanup and move internal structures outside of general kernel view
     (Stefan)

   - Use MSG_SIGNAL for send/recv from io_uring (Stefan)"

* tag 'io_uring-5.12-2021-03-19' of git://git.kernel.dk/linux-block:
  io_uring: don't leak creds on SQO attach error
  io_uring: use typesafe pointers in io_uring_task
  io_uring: remove structures from include/linux/io_uring.h
  io_uring: imply MSG_NOSIGNAL for send[msg]()/recv[msg]() calls
  io_uring: fix sqpoll cancellation via task_work
  io_uring: add generic callback_head helpers
  io_uring: fix concurrent parking
  io_uring: halt SQO submission on ctx exit
  io_uring: replace sqd rw_semaphore with mutex
  io_uring: fix complete_post use ctx after free
  io_uring: fix ->flags races by linked timeouts
  io_uring: convert io_buffer_idr to XArray
  io_uring: allow IO worker threads to be frozen
  kernel: freezer should treat PF_IO_WORKER like PF_KTHREAD for freezing

3 years agox86/apic/of: Fix CPU devicetree-node lookups
Johan Hovold [Fri, 12 Mar 2021 09:20:33 +0000 (10:20 +0100)]
x86/apic/of: Fix CPU devicetree-node lookups

Architectures that describe the CPU topology in devicetree and do not have
an identity mapping between physical and logical CPU ids must override the
default implementation of arch_match_cpu_phys_id().

Failing to do so breaks CPU devicetree-node lookups using of_get_cpu_node()
and of_cpu_device_node_get() which several drivers rely on. It also causes
the CPU struct devices exported through sysfs to point to the wrong
devicetree nodes.

On x86, CPUs are described in devicetree using their APIC ids and those
do not generally coincide with the logical ids, even if CPU0 typically
uses APIC id 0.

Add the missing implementation of arch_match_cpu_phys_id() so that CPU-node
lookups work also with SMP.

Apart from fixing the broken sysfs devicetree-node links this likely does
not affect current users of mainline kernels on x86.

Fixes: 4e07db9c8db8 ("x86/devicetree: Use CPU description from Device Tree")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210312092033.26317-1-johan@kernel.org
3 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Fri, 19 Mar 2021 21:10:07 +0000 (14:10 -0700)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "Fixes for kvm on x86:

   - new selftests

   - fixes for migration with HyperV re-enlightenment enabled

   - fix RCU/SRCU usage

   - fixes for local_irq_restore misuse false positive"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  documentation/kvm: additional explanations on KVM_SET_BOOT_CPU_ID
  x86/kvm: Fix broken irq restoration in kvm_wait
  KVM: X86: Fix missing local pCPU when executing wbinvd on all dirty pCPUs
  KVM: x86: Protect userspace MSR filter with SRCU, and set atomically-ish
  selftests: kvm: add set_boot_cpu_id test
  selftests: kvm: add _vm_ioctl
  selftests: kvm: add get_msr_index_features
  selftests: kvm: Add basic Hyper-V clocksources tests
  KVM: x86: hyper-v: Don't touch TSC page values when guest opted for re-enlightenment
  KVM: x86: hyper-v: Track Hyper-V TSC page status
  KVM: x86: hyper-v: Prevent using not-yet-updated TSC page by secondary CPUs
  KVM: x86: hyper-v: Limit guest to writing zero to HV_X64_MSR_TSC_EMULATION_STATUS
  KVM: x86/mmu: Store the address space ID in the TDP iterator
  KVM: x86/mmu: Factor out tdp_iter_return_to_root
  KVM: x86/mmu: Fix RCU usage when atomically zapping SPTEs
  KVM: x86/mmu: Fix RCU usage in handle_removed_tdp_mmu_page

3 years agoMerge tag 'gpio-fixes-for-v5.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 19 Mar 2021 21:07:19 +0000 (14:07 -0700)]
Merge tag 'gpio-fixes-for-v5.12-rc4' of git://git./linux/kernel/git/brgl/linux

Pull gpio fixes from Bartosz Golaszewski:
 "Two fixes for the GPIO subsystem. Both address issues in the core GPIO
  code:

   - fix the return value in error path in gpiolib_dev_init()

   - fix the 'gpio-line-names' property handling correctly this time"

* tag 'gpio-fixes-for-v5.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpiolib: Assign fwnode to parent's if no primary one provided
  gpiolib: Fix error return code in gpiolib_dev_init()

3 years agoMerge tag 's390-5.12-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Linus Torvalds [Fri, 19 Mar 2021 18:39:28 +0000 (11:39 -0700)]
Merge tag 's390-5.12-4' of git://git./linux/kernel/git/s390/linux

Pull s390 updates from Heiko Carstens:

 - disable preemption when accessing local per-cpu variables in the new
   counter set driver

 - fix by a factor of four increased steal time due to missing
   cputime_to_nsecs() conversion

 - fix PCI device structure leak

* tag 's390-5.12-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/pci: fix leak of PCI device structure
  s390/vtime: fix increased steal time accounting
  s390/cpumf: disable preemption when accessing per-cpu variable

3 years agoMerge tag 'trace-v5.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt...
Linus Torvalds [Fri, 19 Mar 2021 17:06:30 +0000 (10:06 -0700)]
Merge tag 'trace-v5.12-rc3' of git://git./linux/kernel/git/rostedt/linux-trace

Pull workqueue tracing fix from Steven Rostedt:
 "Fix workqueue trace event unsafe string reference

  After adding a verifier to test all strings printed in trace events to
  make sure they either point to a string on the ring buffer, or to read
  only core kernel memory, it triggered on a workqueue trace event. The
  trace event workqueue_queue_work references the allocated name of the
  workqueue in the output. If the workqueue is freed before the trace is
  read, then the trace will dereference freed memory.

  Update the trace event to use the __string(), __assign_str(), and
  __get_str() helpers to handle such cases"

* tag 'trace-v5.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  workqueue/tracing: Copy workqueue name to buffer in trace event

3 years agoMerge tag 'pm-5.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Linus Torvalds [Fri, 19 Mar 2021 17:00:10 +0000 (10:00 -0700)]
Merge tag 'pm-5.12-rc4' of git://git./linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
 "Revert two problematic commits.

  Specifics:

   - Revert ACPI PM commit that attempted to improve reboot handling on
     some systems, but it caused other systems to panic() during reboot
     (Josef Bacik)

   - Revert PM-runtime commit that attempted to improve the handling of
     suppliers during PM-runtime suspend of a consumer device, but it
     introduced a race condition potentially leading to unexpected
     behavior (Rafael Wysocki)"

* tag 'pm-5.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  Revert "PM: runtime: Update device status before letting suppliers suspend"
  Revert "PM: ACPI: reboot: Use S5 for reboot"

3 years agoMerge tag 'iommu-fixes-v5.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 19 Mar 2021 16:56:04 +0000 (09:56 -0700)]
Merge tag 'iommu-fixes-v5.12-rc3' of git://git./linux/kernel/git/joro/iommu

Pull iommu fixes from Joerg Roedel:

 - Three AMD IOMMU patches to fix a boot crash on AMD Stoney systems and
   every other AMD IOMMU system booted with 'amd_iommu=off'.

   This is a v5.11 regression.

 - A Fix for the Tegra IOMMU driver to make sure it detects all IOMMUs

* tag 'iommu-fixes-v5.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/tegra-smmu: Make tegra_smmu_probe_device() to handle all IOMMU phandles
  iommu/amd: Keep track of amd_iommu_irq_remap state
  iommu/amd: Don't call early_amd_iommu_init() when AMD IOMMU is disabled
  iommu/amd: Move Stoney Ridge check to detect_ivrs()

3 years agoMerge tag 'sound-5.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai...
Linus Torvalds [Fri, 19 Mar 2021 16:53:32 +0000 (09:53 -0700)]
Merge tag 'sound-5.12-rc4' of git://git./linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "The majority of changes are various ASoC device/platform-specific
  small fixes (including a removal of stale file) while the only common
  change is a clk management fix in ASoC simple-card driver.

  The rest are the usual HD-audio quirks"

* tag 'sound-5.12-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (44 commits)
  ALSA: usb-audio: Fix unintentional sign extension issue
  ALSA: hda/realtek: fix mute/micmute LEDs for HP 850 G8
  ASoC: dt-bindings: fsl_spdif: Add compatible string for new platforms
  ASoC: rt711: add snd_soc_component remove callback
  ASoC: rt5659: Update MCLK rate in set_sysclk()
  ASoC: simple-card-utils: Do not handle device clock
  ALSA: hda/realtek: fix mute/micmute LEDs for HP 440 G8
  ALSA: hda/realtek: fix mute/micmute LEDs for HP 840 G8
  ALSA: hda/realtek: apply pin quirk for XiaomiNotebook Pro
  ALSA: hda/realtek: Apply headset-mic quirks for Xiaomi Redmibook Air
  ASoC: mediatek: mt8192: fix tdm out data is valid on rising edge
  ALSA: dice: fix null pointer dereference when node is disconnected
  ALSA: hda: generic: Fix the micmute led init state
  ASoC: qcom: lpass-cpu: Fix lpass dai ids parse
  spi: cadence: set cqspi to the driver_data field of struct device
  ASoC: SOF: intel: fix wrong poll bits in dsp power down
  ASoC: codecs: wcd934x: add a sanity check in set channel map
  ASoC: qcom: sdm845: Fix array out of range on rx slim channels
  ASoC: qcom: sdm845: Fix array out of bounds access
  ASoC: remove remnants of sirf prima/atlas audio codec
  ...

3 years agocifs: fix allocation size on newly created files
Steve French [Fri, 19 Mar 2021 05:05:48 +0000 (00:05 -0500)]
cifs: fix allocation size on newly created files

Applications that create and extend and write to a file do not
expect to see 0 allocation size.  When file is extended,
set its allocation size to a plausible value until we have a
chance to query the server for it.  When the file is cached
this will prevent showing an impossible number of allocated
blocks (like 0).  This fixes e.g. xfstests 614 which does

    1) create a file and set its size to 64K
    2) mmap write 64K to the file
    3) stat -c %b for the file (to query the number of allocated blocks)

It was failing because we returned 0 blocks.  Even though we would
return the correct cached file size, we returned an impossible
allocation size.

Signed-off-by: Steve French <stfrench@microsoft.com>
CC: <stable@vger.kernel.org>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
3 years agoMerge branch 'pm-core'
Rafael J. Wysocki [Fri, 19 Mar 2021 15:38:45 +0000 (16:38 +0100)]
Merge branch 'pm-core'

* pm-core:
  Revert "PM: runtime: Update device status before letting suppliers suspend"

3 years agoRevert "PM: runtime: Update device status before letting suppliers suspend"
Rafael J. Wysocki [Fri, 19 Mar 2021 14:47:25 +0000 (15:47 +0100)]
Revert "PM: runtime: Update device status before letting suppliers suspend"

Revert commit 44cc89f76464 ("PM: runtime: Update device status
before letting suppliers suspend") that introduced a race condition
into __rpm_callback() which allowed a concurrent rpm_resume() to
run and resume the device prematurely after its status had been
changed to RPM_SUSPENDED by __rpm_callback().

Fixes: 44cc89f76464 ("PM: runtime: Update device status before letting suppliers suspend")
Link: https://lore.kernel.org/linux-pm/24dfb6fc-5d54-6ee2-9195-26428b7ecf8a@intel.com/
Reported-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: 4.10+ <stable@vger.kernel.org> # 4.10+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
3 years agoMerge tag 'efi-urgent-for-v5.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel...
Ingo Molnar [Fri, 19 Mar 2021 13:23:46 +0000 (14:23 +0100)]
Merge tag 'efi-urgent-for-v5.12-rc3' of git://git./linux/kernel/git/efi/efi into efi/urgent

Pull EFI fixes from Ard Biesheuvel:

 "- another missing RT_PROP table related fix, to ensure that the efivarfs
    pseudo filesystem fails gracefully if variable services are unsupported
  - use the correct alignment for literal EFI GUIDs
  - fix a use after unmap issue in the memreserve code"

Signed-off-by: Ingo Molnar <mingo@kernel.org>
3 years agoMerge tag 'nvme-5.12-20210319' of git://git.infradead.org/nvme into block-5.12
Jens Axboe [Fri, 19 Mar 2021 12:40:47 +0000 (06:40 -0600)]
Merge tag 'nvme-5.12-20210319' of git://git.infradead.org/nvme into block-5.12

Pull NVMe updates from Christoph:

"nvme fixes for 5.12

 - fix tag allocation for keep alive
 - fix a unit mismatch for the Write Zeroes limits
 - various TCP transport fixes (Sagi Grimberg, Elad Grupi)
 - fix iosqes and iocqes validation for discovery controllers (Sagi Grimberg)"

* tag 'nvme-5.12-20210319' of git://git.infradead.org/nvme:
  nvmet-tcp: fix kmap leak when data digest in use
  nvmet: don't check iosqes,iocqes for discovery controllers
  nvme-rdma: fix possible hang when failing to set io queues
  nvme-tcp: fix possible hang when failing to set io queues
  nvme-tcp: fix misuse of __smp_processor_id with preemption enabled
  nvme-tcp: fix a NULL deref when receiving a 0-length r2t PDU
  nvme: fix Write Zeroes limitations
  nvme: allocate the keep alive request using BLK_MQ_REQ_NOWAIT
  nvme: merge nvme_keep_alive into nvme_keep_alive_work
  nvme-fabrics: only reserve a single tag

3 years agostatic_call: Fix static_call_update() sanity check
Peter Zijlstra [Thu, 18 Mar 2021 10:31:51 +0000 (11:31 +0100)]
static_call: Fix static_call_update() sanity check

Sites that match init_section_contains() get marked as INIT. For
built-in code init_sections contains both __init and __exit text. OTOH
kernel_text_address() only explicitly includes __init text (and there
are no __exit text markers).

Match what jump_label already does and ignore the warning for INIT
sites. Also see the excellent changelog for commit: 8f35eaa5f2de
("jump_label: Don't warn on __exit jump entries")

Fixes: 9183c3f9ed710 ("static_call: Add inline static call infrastructure")
Reported-by: Sumit Garg <sumit.garg@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Tested-by: Sumit Garg <sumit.garg@linaro.org>
Link: https://lkml.kernel.org/r/20210318113610.739542434@infradead.org
3 years agostatic_call: Align static_call_is_init() patching condition
Peter Zijlstra [Thu, 18 Mar 2021 10:29:56 +0000 (11:29 +0100)]
static_call: Align static_call_is_init() patching condition

The intent is to avoid writing init code after init (because the text
might have been freed). The code is needlessly different between
jump_label and static_call and not obviously correct.

The existing code relies on the fact that the module loader clears the
init layout, such that within_module_init() always fails, while
jump_label relies on the module state which is more obvious and
matches the kernel logic.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Tested-by: Sumit Garg <sumit.garg@linaro.org>
Link: https://lkml.kernel.org/r/20210318113610.636651340@infradead.org
3 years agostatic_call: Fix static_call_set_init()
Peter Zijlstra [Thu, 18 Mar 2021 10:27:19 +0000 (11:27 +0100)]
static_call: Fix static_call_set_init()

It turns out that static_call_set_init() does not preserve the other
flags; IOW. it clears TAIL if it was set.

Fixes: 9183c3f9ed710 ("static_call: Add inline static call infrastructure")
Reported-by: Sumit Garg <sumit.garg@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Jarkko Sakkinen <jarkko@kernel.org>
Tested-by: Sumit Garg <sumit.garg@linaro.org>
Link: https://lkml.kernel.org/r/20210318113610.519406371@infradead.org
3 years agox86/ioapic: Ignore IRQ2 again
Thomas Gleixner [Thu, 18 Mar 2021 19:26:47 +0000 (20:26 +0100)]
x86/ioapic: Ignore IRQ2 again

Vitaly ran into an issue with hotplugging CPU0 on an Amazon instance where
the matrix allocator claimed to be out of vectors. He analyzed it down to
the point that IRQ2, the PIC cascade interrupt, which is supposed to be not
ever routed to the IO/APIC ended up having an interrupt vector assigned
which got moved during unplug of CPU0.

The underlying issue is that IRQ2 for various reasons (see commit
af174783b925 ("x86: I/O APIC: Never configure IRQ2" for details) is treated
as a reserved system vector by the vector core code and is not accounted as
a regular vector. The Amazon BIOS has an routing entry of pin2 to IRQ2
which causes the IO/APIC setup to claim that interrupt which is granted by
the vector domain because there is no sanity check. As a consequence the
allocation counter of CPU0 underflows which causes a subsequent unplug to
fail with:

  [ ... ] CPU 0 has 4294967295 vectors, 589 available. Cannot disable CPU

There is another sanity check missing in the matrix allocator, but the
underlying root cause is that the IO/APIC code lost the IRQ2 ignore logic
during the conversion to irqdomains.

For almost 6 years nobody complained about this wreckage, which might
indicate that this requirement could be lifted, but for any system which
actually has a PIC IRQ2 is unusable by design so any routing entry has no
effect and the interrupt cannot be connected to a device anyway.

Due to that and due to history biased paranoia reasons restore the IRQ2
ignore logic and treat it as non existent despite a routing entry claiming
otherwise.

Fixes: d32932d02e18 ("x86/irq: Convert IOAPIC to use hierarchical irqdomain interfaces")
Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210318192819.636943062@linutronix.de
3 years agodocumentation/kvm: additional explanations on KVM_SET_BOOT_CPU_ID
Emanuele Giuseppe Esposito [Fri, 19 Mar 2021 09:16:50 +0000 (10:16 +0100)]
documentation/kvm: additional explanations on KVM_SET_BOOT_CPU_ID

The ioctl KVM_SET_BOOT_CPU_ID fails when called after vcpu creation.
Add this explanation in the documentation.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20210319091650.11967-1-eesposit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
3 years agoefi: use 32-bit alignment for efi_guid_t literals
Ard Biesheuvel [Wed, 10 Mar 2021 07:33:19 +0000 (08:33 +0100)]
efi: use 32-bit alignment for efi_guid_t literals

Commit 494c704f9af0 ("efi: Use 32-bit alignment for efi_guid_t") updated
the type definition of efi_guid_t to ensure that it always appears
sufficiently aligned (the UEFI spec is ambiguous about this, but given
the fact that its EFI_GUID type is defined in terms of a struct carrying
a uint32_t, the natural alignment is definitely >= 32 bits).

However, we missed the EFI_GUID() macro which is used to instantiate
efi_guid_t literals: that macro is still based on the guid_t type,
which does not have a minimum alignment at all. This results in warnings
such as

  In file included from drivers/firmware/efi/mokvar-table.c:35:
  include/linux/efi.h:1093:34: warning: passing 1-byte aligned argument to
      4-byte aligned parameter 2 of 'get_var' may result in an unaligned pointer
      access [-Walign-mismatch]
          status = get_var(L"SecureBoot", &EFI_GLOBAL_VARIABLE_GUID, NULL, &size,
                                          ^
  include/linux/efi.h:1101:24: warning: passing 1-byte aligned argument to
      4-byte aligned parameter 2 of 'get_var' may result in an unaligned pointer
      access [-Walign-mismatch]
          get_var(L"SetupMode", &EFI_GLOBAL_VARIABLE_GUID, NULL, &size, &setupmode);

The distinction only matters on CPUs that do not support misaligned loads
fully, but 32-bit ARM's load-multiple instructions fall into that category,
and these are likely to be emitted by the compiler that built the firmware
for loading word-aligned 128-bit GUIDs from memory

So re-implement the initializer in terms of our own efi_guid_t type, so that
the alignment becomes a property of the literal's type.

Fixes: 494c704f9af0 ("efi: Use 32-bit alignment for efi_guid_t")
Reported-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Link: https://github.com/ClangBuiltLinux/linux/issues/1327
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
3 years agofirmware/efi: Fix a use after bug in efi_mem_reserve_persistent
Lv Yunlong [Wed, 10 Mar 2021 08:31:27 +0000 (00:31 -0800)]
firmware/efi: Fix a use after bug in efi_mem_reserve_persistent

In the for loop in efi_mem_reserve_persistent(), prsv = rsv->next
use the unmapped rsv. Use the unmapped pages will cause segment
fault.

Fixes: 18df7577adae6 ("efi/memreserve: deal with memreserve entries in unmapped memory")
Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>