platform/kernel/linux-rpi.git
23 months agoNFSD: verify the opened dentry after setting a delegation
Jeff Layton [Tue, 26 Jul 2022 06:45:30 +0000 (16:45 +1000)]
NFSD: verify the opened dentry after setting a delegation

Between opening a file and setting a delegation on it, someone could
rename or unlink the dentry. If this happens, we do not want to grant a
delegation on the open.

On a CLAIM_NULL open, we're opening by filename, and we may (in the
non-create case) or may not (in the create case) be holding i_rwsem
when attempting to set a delegation.  The latter case allows a
race.

After getting a lease, redo the lookup of the file being opened and
validate that the resulting dentry matches the one in the open file
description.

To properly redo the lookup we need an rqst pointer to pass to
nfsd_lookup_dentry(), so make sure that is available.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: drop fh argument from alloc_init_deleg
Jeff Layton [Tue, 26 Jul 2022 06:45:30 +0000 (16:45 +1000)]
NFSD: drop fh argument from alloc_init_deleg

Currently, we pass the fh of the opened file down through several
functions so that alloc_init_deleg can pass it to delegation_blocked.
The filehandle of the open file is available in the nfs4_file however,
so there's no need to pass it in a separate argument.

Drop the argument from alloc_init_deleg, nfs4_open_delegation and
nfs4_set_delegation.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Move copy offload callback arguments into a separate structure
Chuck Lever [Wed, 27 Jul 2022 18:41:18 +0000 (14:41 -0400)]
NFSD: Move copy offload callback arguments into a separate structure

Refactor so that CB_OFFLOAD arguments can be passed without
allocating a whole struct nfsd4_copy object. On my system (x86_64)
this removes another 96 bytes from struct nfsd4_copy.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Add nfsd4_send_cb_offload()
Chuck Lever [Wed, 27 Jul 2022 18:41:12 +0000 (14:41 -0400)]
NFSD: Add nfsd4_send_cb_offload()

Refactor for legibility.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Remove kmalloc from nfsd4_do_async_copy()
Chuck Lever [Wed, 27 Jul 2022 18:41:06 +0000 (14:41 -0400)]
NFSD: Remove kmalloc from nfsd4_do_async_copy()

Instead of manufacturing a phony struct nfsd_file, pass the
struct file returned by nfs42_ssc_open() directly to
nfsd4_do_copy().

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Refactor nfsd4_do_copy()
Chuck Lever [Wed, 27 Jul 2022 18:40:59 +0000 (14:40 -0400)]
NFSD: Refactor nfsd4_do_copy()

Refactor: Now that nfsd4_do_copy() no longer calls the cleanup
helpers, plumb the use of struct file pointers all the way down to
_nfsd_copy_file_range().

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Refactor nfsd4_cleanup_inter_ssc() (2/2)
Chuck Lever [Wed, 27 Jul 2022 18:40:53 +0000 (14:40 -0400)]
NFSD: Refactor nfsd4_cleanup_inter_ssc() (2/2)

Move the nfsd4_cleanup_*() call sites out of nfsd4_do_copy(). A
subsequent patch will modify one of the new call sites to avoid
the need to manufacture the phony struct nfsd_file.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Refactor nfsd4_cleanup_inter_ssc() (1/2)
Chuck Lever [Wed, 27 Jul 2022 18:40:47 +0000 (14:40 -0400)]
NFSD: Refactor nfsd4_cleanup_inter_ssc() (1/2)

The @src parameter is sometimes a pointer to a struct nfsd_file and
sometimes a pointer to struct file hiding in a phony struct
nfsd_file. Refactor nfsd4_cleanup_inter_ssc() so the @src parameter
is always an explicit struct file.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Replace boolean fields in struct nfsd4_copy
Chuck Lever [Wed, 27 Jul 2022 18:40:41 +0000 (14:40 -0400)]
NFSD: Replace boolean fields in struct nfsd4_copy

Clean up: saves 8 bytes, and we can replace check_and_set_stop_copy()
with an atomic bitop.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Make nfs4_put_copy() static
Chuck Lever [Wed, 27 Jul 2022 18:40:35 +0000 (14:40 -0400)]
NFSD: Make nfs4_put_copy() static

Clean up: All call sites are in fs/nfsd/nfs4proc.c.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Reorder the fields in struct nfsd4_op
Chuck Lever [Wed, 27 Jul 2022 18:40:28 +0000 (14:40 -0400)]
NFSD: Reorder the fields in struct nfsd4_op

Pack the fields to reduce the size of struct nfsd4_op, which is used
an array in struct nfsd4_compoundargs.

sizeof(struct nfsd4_op):
Before: /* size: 672, cachelines: 11, members: 5 */
After:  /* size: 640, cachelines: 10, members: 5 */

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Shrink size of struct nfsd4_copy
Chuck Lever [Wed, 27 Jul 2022 18:40:22 +0000 (14:40 -0400)]
NFSD: Shrink size of struct nfsd4_copy

struct nfsd4_copy is part of struct nfsd4_op, which resides in an
8-element array.

sizeof(struct nfsd4_op):
Before: /* size: 1696, cachelines: 27, members: 5 */
After:  /* size: 672, cachelines: 11, members: 5 */

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Shrink size of struct nfsd4_copy_notify
Chuck Lever [Wed, 27 Jul 2022 18:40:16 +0000 (14:40 -0400)]
NFSD: Shrink size of struct nfsd4_copy_notify

struct nfsd4_copy_notify is part of struct nfsd4_op, which resides
in an 8-element array.

sizeof(struct nfsd4_op):
Before: /* size: 2208, cachelines: 35, members: 5 */
After:  /* size: 1696, cachelines: 27, members: 5 */

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: nfserrno(-ENOMEM) is nfserr_jukebox
Chuck Lever [Wed, 27 Jul 2022 18:40:09 +0000 (14:40 -0400)]
NFSD: nfserrno(-ENOMEM) is nfserr_jukebox

Suggested-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Fix strncpy() fortify warning
Chuck Lever [Wed, 27 Jul 2022 18:40:03 +0000 (14:40 -0400)]
NFSD: Fix strncpy() fortify warning

In function ‘strncpy’,
    inlined from ‘nfsd4_ssc_setup_dul’ at /home/cel/src/linux/manet/fs/nfsd/nfs4proc.c:1392:3,
    inlined from ‘nfsd4_interssc_connect’ at /home/cel/src/linux/manet/fs/nfsd/nfs4proc.c:1489:11:
/home/cel/src/linux/manet/include/linux/fortify-string.h:52:33: warning: ‘__builtin_strncpy’ specified bound 63 equals destination size [-Wstringop-truncation]
   52 | #define __underlying_strncpy    __builtin_strncpy
      |                                 ^
/home/cel/src/linux/manet/include/linux/fortify-string.h:89:16: note: in expansion of macro ‘__underlying_strncpy’
   89 |         return __underlying_strncpy(p, q, size);
      |                ^~~~~~~~~~~~~~~~~~~~

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Clean up nfsd4_encode_readlink()
Chuck Lever [Fri, 22 Jul 2022 20:09:23 +0000 (16:09 -0400)]
NFSD: Clean up nfsd4_encode_readlink()

Similar changes to nfsd4_encode_readv(), all bundled into a single
patch.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Use xdr_pad_size()
Chuck Lever [Fri, 22 Jul 2022 20:09:16 +0000 (16:09 -0400)]
NFSD: Use xdr_pad_size()

Clean up: Use a helper instead of open-coding the calculation of
the XDR pad size.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Simplify starting_len
Chuck Lever [Fri, 22 Jul 2022 20:09:10 +0000 (16:09 -0400)]
NFSD: Simplify starting_len

Clean-up: Now that nfsd4_encode_readv() does not have to encode the
EOF or rd_length values, it no longer needs to subtract 8 from
@starting_len.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Optimize nfsd4_encode_readv()
Chuck Lever [Fri, 22 Jul 2022 20:09:04 +0000 (16:09 -0400)]
NFSD: Optimize nfsd4_encode_readv()

write_bytes_to_xdr_buf() is pretty expensive to use for inserting
an XDR data item that is always 1 XDR_UNIT at an address that is
always XDR word-aligned.

Since both the readv and splice read paths encode EOF and maxcount
values, move both to a common code path.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Add an nfsd4_read::rd_eof field
Chuck Lever [Fri, 22 Jul 2022 20:08:57 +0000 (16:08 -0400)]
NFSD: Add an nfsd4_read::rd_eof field

Refactor: Make the EOF result available in the entire NFSv4 READ
path.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Clean up SPLICE_OK in nfsd4_encode_read()
Chuck Lever [Fri, 22 Jul 2022 20:08:51 +0000 (16:08 -0400)]
NFSD: Clean up SPLICE_OK in nfsd4_encode_read()

Do the test_bit() once -- this reduces the number of locked-bus
operations and makes the function a little easier to read.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Optimize nfsd4_encode_fattr()
Chuck Lever [Fri, 22 Jul 2022 20:08:45 +0000 (16:08 -0400)]
NFSD: Optimize nfsd4_encode_fattr()

write_bytes_to_xdr_buf() is a generic way to place a variable-length
data item in an already-reserved spot in the encoding buffer.

However, it is costly. In nfsd4_encode_fattr(), it is unnecessary
because the data item is fixed in size and the buffer destination
address is always word-aligned.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Optimize nfsd4_encode_operation()
Chuck Lever [Fri, 22 Jul 2022 20:08:38 +0000 (16:08 -0400)]
NFSD: Optimize nfsd4_encode_operation()

write_bytes_to_xdr_buf() is a generic way to place a variable-length
data item in an already-reserved spot in the encoding buffer.
However, it is costly, and here, it is unnecessary because the
data item is fixed in size, the buffer destination address is
always word-aligned, and the destination location is already in
@p.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agonfsd: silence extraneous printk on nfsd.ko insertion
Jeff Layton [Wed, 20 Jul 2022 12:39:23 +0000 (08:39 -0400)]
nfsd: silence extraneous printk on nfsd.ko insertion

This printk pops every time nfsd.ko gets plugged in. Most kmods don't do
that and this one is not very informative. Olaf's email address seems to
be defunct at this point anyway. Just drop it.

Cc: Olaf Kirch <okir@suse.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: limit the number of v4 clients to 1024 per 1GB of system memory
Dai Ngo [Fri, 15 Jul 2022 23:54:53 +0000 (16:54 -0700)]
NFSD: limit the number of v4 clients to 1024 per 1GB of system memory

Currently there is no limit on how many v4 clients are supported
by the system. This can be a problem in systems with small memory
configuration to function properly when a very large number of
clients exist that creates memory shortage conditions.

This patch enforces a limit of 1024 NFSv4 clients, including courtesy
clients, per 1GB of system memory.  When the number of the clients
reaches the limit, requests that create new clients are returned
with NFS4ERR_DELAY and the laundromat is kicked start to trim old
clients. Due to the overhead of the upcall to remove the client
record, the maximun number of clients the laundromat removes on
each run is limited to 128. This is done to ensure the laundromat
can still process the other tasks in a timely manner.

Since there is now a limit of the number of clients, the 24-hr
idle time limit of courtesy client is no longer needed and was
removed.

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: keep track of the number of v4 clients in the system
Dai Ngo [Fri, 15 Jul 2022 23:54:52 +0000 (16:54 -0700)]
NFSD: keep track of the number of v4 clients in the system

Add counter nfs4_client_count to keep track of the total number
of v4 clients, including courtesy clients, in the system.

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: refactoring v4 specific code to a helper in nfs4state.c
Dai Ngo [Fri, 15 Jul 2022 23:54:51 +0000 (16:54 -0700)]
NFSD: refactoring v4 specific code to a helper in nfs4state.c

This patch moves the v4 specific code from nfsd_init_net() to
nfsd4_init_leases_net() helper in nfs4state.c

Signed-off-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Ensure nf_inode is never dereferenced
Chuck Lever [Fri, 8 Jul 2022 18:27:09 +0000 (14:27 -0400)]
NFSD: Ensure nf_inode is never dereferenced

The documenting comment for struct nf_file states:

/*
 * A representation of a file that has been opened by knfsd. These are hashed
 * in the hashtable by inode pointer value. Note that this object doesn't
 * hold a reference to the inode by itself, so the nf_inode pointer should
 * never be dereferenced, only used for comparison.
 */

Replace the two existing dereferences to make the comment always
true.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: NFSv4 CLOSE should release an nfsd_file immediately
Chuck Lever [Fri, 8 Jul 2022 18:27:02 +0000 (14:27 -0400)]
NFSD: NFSv4 CLOSE should release an nfsd_file immediately

The last close of a file should enable other accessors to open and
use that file immediately. Leaving the file open in the filecache
prevents other users from accessing that file until the filecache
garbage-collects the file -- sometimes that takes several seconds.

Reported-by: Wang Yugui <wangyugui@e16-tech.com>
Link: https://bugzilla.linux-nfs.org/show_bug.cgi?387
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Move nfsd_file_trace_alloc() tracepoint
Chuck Lever [Fri, 8 Jul 2022 18:26:49 +0000 (14:26 -0400)]
NFSD: Move nfsd_file_trace_alloc() tracepoint

Avoid recording the allocation of an nfsd_file item that is
immediately released because a matching item was already
inserted in the hash.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Separate tracepoints for acquire and create
Chuck Lever [Fri, 8 Jul 2022 18:26:43 +0000 (14:26 -0400)]
NFSD: Separate tracepoints for acquire and create

These tracepoints collect different information: the create case does
not open a file, so there's no nf_file available.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Clean up unused code after rhashtable conversion
Chuck Lever [Fri, 8 Jul 2022 18:26:36 +0000 (14:26 -0400)]
NFSD: Clean up unused code after rhashtable conversion

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Convert the filecache to use rhashtable
Chuck Lever [Fri, 8 Jul 2022 18:26:30 +0000 (14:26 -0400)]
NFSD: Convert the filecache to use rhashtable

Enable the filecache hash table to start small, then grow with the
workload. Smaller server deployments benefit because there should
be lower memory utilization. Larger server deployments should see
improved scaling with the number of open files.

Suggested-by: Jeff Layton <jlayton@kernel.org>
Suggested-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Set up an rhashtable for the filecache
Chuck Lever [Fri, 8 Jul 2022 18:26:23 +0000 (14:26 -0400)]
NFSD: Set up an rhashtable for the filecache

Add code to initialize and tear down an rhashtable. The rhashtable
is not used yet.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Replace the "init once" mechanism
Chuck Lever [Fri, 8 Jul 2022 18:26:16 +0000 (14:26 -0400)]
NFSD: Replace the "init once" mechanism

In a moment, the nfsd_file_hashtbl global will be replaced with an
rhashtable. Replace the one or two spots that need to check if the
hash table is available. We can easily reuse the SHUTDOWN flag for
this purpose.

Document that this mechanism relies on callers to hold the
nfsd_mutex to prevent init, shutdown, and purging to run
concurrently.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Remove nfsd_file::nf_hashval
Chuck Lever [Fri, 8 Jul 2022 18:26:10 +0000 (14:26 -0400)]
NFSD: Remove nfsd_file::nf_hashval

The value in this field can always be computed from nf_inode, thus
it is no longer used.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: nfsd_file_hash_remove can compute hashval
Chuck Lever [Fri, 8 Jul 2022 18:26:03 +0000 (14:26 -0400)]
NFSD: nfsd_file_hash_remove can compute hashval

Remove an unnecessary use of nf_hashval.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Refactor __nfsd_file_close_inode()
Chuck Lever [Fri, 8 Jul 2022 18:25:57 +0000 (14:25 -0400)]
NFSD: Refactor __nfsd_file_close_inode()

The code that computes the hashval is the same in both callers.

To prevent them from going stale, reframe the documenting comments
to remove descriptions of the underlying hash table structure, which
is about to be replaced.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: nfsd_file_unhash can compute hashval from nf->nf_inode
Chuck Lever [Fri, 8 Jul 2022 18:25:50 +0000 (14:25 -0400)]
NFSD: nfsd_file_unhash can compute hashval from nf->nf_inode

Remove an unnecessary usage of nf_hashval.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Remove lockdep assertion from unhash_and_release_locked()
Chuck Lever [Fri, 8 Jul 2022 18:25:44 +0000 (14:25 -0400)]
NFSD: Remove lockdep assertion from unhash_and_release_locked()

IIUC, holding the hash bucket lock is needed only in
nfsd_file_unhash, and there is already a lockdep assertion there.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: No longer record nf_hashval in the trace log
Chuck Lever [Fri, 8 Jul 2022 18:25:37 +0000 (14:25 -0400)]
NFSD: No longer record nf_hashval in the trace log

I'm about to replace nfsd_file_hashtbl with an rhashtable. The
individual hash values will no longer be visible or relevant, so
remove them from the tracepoints.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Never call nfsd_file_gc() in foreground paths
Chuck Lever [Fri, 8 Jul 2022 18:25:30 +0000 (14:25 -0400)]
NFSD: Never call nfsd_file_gc() in foreground paths

The checks in nfsd_file_acquire() and nfsd_file_put() that directly
invoke filecache garbage collection are intended to keep cache
occupancy between a low- and high-watermark. The reason to limit the
capacity of the filecache is to keep filecache lookups reasonably
fast.

However, invoking garbage collection at those points has some
undesirable negative impacts. Files that are held open by NFSv4
clients often push the occupancy of the filecache over these
watermarks. At that point:

- Every call to nfsd_file_acquire() and nfsd_file_put() results in
  an LRU walk. This has the same effect on lookup latency as long
  chains in the hash table.
- Garbage collection will then run on every nfsd thread, causing a
  lot of unnecessary lock contention.
- Limiting cache capacity pushes out files used only by NFSv3
  clients, which are the type of files the filecache is supposed to
  help.

To address those negative impacts, remove the direct calls to the
garbage collector. Subsequent patches will address maintaining
lookup efficiency as cache capacity increases.

Suggested-by: Wang Yugui <wangyugui@e16-tech.com>
Suggested-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Fix the filecache LRU shrinker
Chuck Lever [Fri, 8 Jul 2022 18:25:24 +0000 (14:25 -0400)]
NFSD: Fix the filecache LRU shrinker

Without LRU item rotation, the shrinker visits only a few items on
the end of the LRU list, and those would always be long-term OPEN
files for NFSv4 workloads. That makes the filecache shrinker
completely ineffective.

Adopt the same strategy as the inode LRU by using LRU_ROTATE.

Suggested-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Leave open files out of the filecache LRU
Chuck Lever [Fri, 8 Jul 2022 18:25:17 +0000 (14:25 -0400)]
NFSD: Leave open files out of the filecache LRU

There have been reports of problems when running fstests generic/531
against Linux NFS servers with NFSv4. The NFS server that hosts the
test's SCRATCH_DEV suffers from CPU soft lock-ups during the test.
Analysis shows that:

fs/nfsd/filecache.c
 482                 ret = list_lru_walk(&nfsd_file_lru,
 483                                 nfsd_file_lru_cb,
 484                                 &head, LONG_MAX);

causes nfsd_file_gc() to walk the entire length of the filecache LRU
list every time it is called (which is quite frequently). The walk
holds a spinlock the entire time that prevents other nfsd threads
from accessing the filecache.

What's more, for NFSv4 workloads, none of the items that are visited
during this walk may be evicted, since they are all files that are
held OPEN by NFS clients.

Address this by ensuring that open files are not kept on the LRU
list.

Reported-by: Frank van der Linden <fllinden@amazon.com>
Reported-by: Wang Yugui <wangyugui@e16-tech.com>
Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=386
Suggested-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Trace filecache LRU activity
Chuck Lever [Fri, 8 Jul 2022 18:25:11 +0000 (14:25 -0400)]
NFSD: Trace filecache LRU activity

Observe the operation of garbage collection and the lifetime of
filecache items.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: WARN when freeing an item still linked via nf_lru
Chuck Lever [Fri, 8 Jul 2022 18:25:04 +0000 (14:25 -0400)]
NFSD: WARN when freeing an item still linked via nf_lru

Add a guardrail to prevent freeing memory that is still on a list.
This includes either a dispose list or the LRU list.

This is the sign of a bug, but this class of bugs can be detected
so that they don't endanger system stability, especially while
debugging.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Hook up the filecache stat file
Chuck Lever [Fri, 8 Jul 2022 18:24:58 +0000 (14:24 -0400)]
NFSD: Hook up the filecache stat file

There has always been the capability of exporting filecache metrics
via /proc, but it was never hooked up. Let's surface these metrics
to enable better observability of the filecache.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Zero counters when the filecache is re-initialized
Chuck Lever [Fri, 8 Jul 2022 18:24:51 +0000 (14:24 -0400)]
NFSD: Zero counters when the filecache is re-initialized

If nfsd_file_cache_init() is called after a shutdown, be sure the
stat counters are reset.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Record number of flush calls
Chuck Lever [Fri, 8 Jul 2022 18:24:45 +0000 (14:24 -0400)]
NFSD: Record number of flush calls

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Report the number of items evicted by the LRU walk
Chuck Lever [Fri, 8 Jul 2022 18:24:38 +0000 (14:24 -0400)]
NFSD: Report the number of items evicted by the LRU walk

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Refactor nfsd_file_lru_scan()
Chuck Lever [Fri, 8 Jul 2022 18:24:31 +0000 (14:24 -0400)]
NFSD: Refactor nfsd_file_lru_scan()

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Refactor nfsd_file_gc()
Chuck Lever [Fri, 8 Jul 2022 18:24:25 +0000 (14:24 -0400)]
NFSD: Refactor nfsd_file_gc()

Refactor nfsd_file_gc() to use the new list_lru helper.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Add nfsd_file_lru_dispose_list() helper
Chuck Lever [Fri, 8 Jul 2022 18:24:18 +0000 (14:24 -0400)]
NFSD: Add nfsd_file_lru_dispose_list() helper

Refactor the invariant part of nfsd_file_lru_walk_list() into a
separate helper function.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Report average age of filecache items
Chuck Lever [Fri, 8 Jul 2022 18:24:12 +0000 (14:24 -0400)]
NFSD: Report average age of filecache items

This is a measure of how long items stay in the filecache, to help
assess how efficient the cache is.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Report count of freed filecache items
Chuck Lever [Fri, 8 Jul 2022 18:24:05 +0000 (14:24 -0400)]
NFSD: Report count of freed filecache items

Surface the count of freed nfsd_file items.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Report count of calls to nfsd_file_acquire()
Chuck Lever [Fri, 8 Jul 2022 18:23:59 +0000 (14:23 -0400)]
NFSD: Report count of calls to nfsd_file_acquire()

Count the number of successful acquisitions that did not create a
file (ie, acquisitions that do not result in a compulsory cache
miss). This count can be compared directly with the reported hit
count to compute a hit ratio.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Report filecache LRU size
Chuck Lever [Fri, 8 Jul 2022 18:23:52 +0000 (14:23 -0400)]
NFSD: Report filecache LRU size

Surface the NFSD filecache's LRU list length to help field
troubleshooters monitor filecache issues.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Demote a WARN to a pr_warn()
Chuck Lever [Fri, 8 Jul 2022 18:23:45 +0000 (14:23 -0400)]
NFSD: Demote a WARN to a pr_warn()

The call trace doesn't add much value, but it sure is noisy.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoSUNRPC: Fix server-side fault injection documentation
Chuck Lever [Fri, 1 Jul 2022 14:37:15 +0000 (10:37 -0400)]
SUNRPC: Fix server-side fault injection documentation

Fixes: 37324e6bb120 ("SUNRPC: Cache deferral injection")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agonfsd: remove redundant assignment to variable len
Colin Ian King [Tue, 28 Jun 2022 21:25:25 +0000 (22:25 +0100)]
nfsd: remove redundant assignment to variable len

Variable len is being assigned a value zero and this is never
read, it is being re-assigned later. The assignment is redundant
and can be removed.

Cleans up clang scan-build warning:
fs/nfsd/nfsctl.c:636:2: warning: Value stored to 'len' is never read

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Fix space and spelling mistake
Zhang Jiaming [Thu, 23 Jun 2022 08:20:05 +0000 (16:20 +0800)]
NFSD: Fix space and spelling mistake

Add a blank space after ','.
Change 'succesful' to 'successful'.

Signed-off-by: Zhang Jiaming <jiaming@nfschina.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNFSD: Instrument fh_verify()
Chuck Lever [Tue, 21 Jun 2022 14:06:23 +0000 (10:06 -0400)]
NFSD: Instrument fh_verify()

Capture file handles and how they map to local inodes. In particular,
NFSv4 PUTFH uses fh_verify() so we can now observe which file handles
are the target of OPEN, LOOKUP, RENAME, and so on.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoSUNRPC: Expand the svc_alloc_arg_err tracepoint
Chuck Lever [Tue, 21 Jun 2022 14:06:16 +0000 (10:06 -0400)]
SUNRPC: Expand the svc_alloc_arg_err tracepoint

Record not only the number of pages requested, but the number of
pages that were actually allocated, to get a measure of progress
(or lack thereof).

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoNLM: Defend against file_lock changes after vfs_test_lock()
Benjamin Coddington [Mon, 13 Jun 2022 13:40:06 +0000 (09:40 -0400)]
NLM: Defend against file_lock changes after vfs_test_lock()

Instead of trusting that struct file_lock returns completely unchanged
after vfs_test_lock() when there's no conflicting lock, stash away our
nlm_lockowner reference so we can properly release it for all cases.

This defends against another file_lock implementation overwriting fl_owner
when the return type is F_UNLCK.

Reported-by: Roberto Bergantinos Corpas <rbergant@redhat.com>
Tested-by: Roberto Bergantinos Corpas <rbergant@redhat.com>
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
23 months agoSUNRPC: Fix xdr_encode_bool()
Chuck Lever [Tue, 19 Jul 2022 13:18:35 +0000 (09:18 -0400)]
SUNRPC: Fix xdr_encode_bool()

I discovered that xdr_encode_bool() was returning the same address
that was passed in the @p parameter. The documenting comment states
that the intent is to return the address of the next buffer
location, just like the other "xdr_encode_*" helpers.

The result was the encoded results of NFSv3 PATHCONF operations were
not formed correctly.

Fixes: ded04a587f6c ("NFSD: Update the NFSv3 PATHCONF3res encoder to use struct xdr_stream")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
23 months agonfsd: eliminate the NFSD_FILE_BREAK_* flags
Jeff Layton [Fri, 29 Jul 2022 21:01:07 +0000 (17:01 -0400)]
nfsd: eliminate the NFSD_FILE_BREAK_* flags

We had a report from the spring Bake-a-thon of data corruption in some
nfstest_interop tests. Looking at the traces showed the NFS server
allowing a v3 WRITE to proceed while a read delegation was still
outstanding.

Currently, we only set NFSD_FILE_BREAK_* flags if
NFSD_MAY_NOT_BREAK_LEASE was set when we call nfsd_file_alloc.
NFSD_MAY_NOT_BREAK_LEASE was intended to be set when finding files for
COMMIT ops, where we need a writeable filehandle but don't need to
break read leases.

It doesn't make any sense to consult that flag when allocating a file
since the file may be used on subsequent calls where we do want to break
the lease (and the usage of it here seems to be reverse from what it
should be anyway).

Also, after calling nfsd_open_break_lease, we don't want to clear the
BREAK_* bits. A lease could end up being set on it later (more than
once) and we need to be able to break those leases as well.

This means that the NFSD_FILE_BREAK_* flags now just mirror
NFSD_MAY_{READ,WRITE} flags, so there's no need for them at all. Just
drop those flags and unconditionally call nfsd_open_break_lease every
time.

Reported-by: Olga Kornieskaia <kolga@netapp.com>
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2107360
Fixes: 65294c1f2c5e (nfsd: add a new struct file caching facility to nfsd)
Cc: <stable@vger.kernel.org> # 5.4.x : bb283ca18d1e NFSD: Clean up the show_nf_flags() macro
Cc: <stable@vger.kernel.org> # 5.4.x
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2 years agoLinux 5.19-rc7
Linus Torvalds [Sun, 17 Jul 2022 20:30:22 +0000 (13:30 -0700)]
Linux 5.19-rc7

2 years agoMerge tag 'drm-intel-fixes-2022-07-17' of git://anongit.freedesktop.org/drm/drm-intel
Linus Torvalds [Sun, 17 Jul 2022 20:08:03 +0000 (13:08 -0700)]
Merge tag 'drm-intel-fixes-2022-07-17' of git://anongit.freedesktop.org/drm/drm-intel

Pull intel drm build fix from Rodrigo Vivi:
 "Our 'dim' flow has a problem with fixes of fixes getting missed. We
  need to take a look on that later.

  Meanwhile, please allow me to quickly propagate this fix for the
  32-bit build issue here upstream"

* tag 'drm-intel-fixes-2022-07-17' of git://anongit.freedesktop.org/drm/drm-intel:
  drm/i915/ttm: fix 32b build

2 years agoMerge tag 'perf-tools-fixes-for-v5.19-2022-07-17' of git://git.kernel.org/pub/scm...
Linus Torvalds [Sun, 17 Jul 2022 19:42:57 +0000 (12:42 -0700)]
Merge tag 'perf-tools-fixes-for-v5.19-2022-07-17' of git://git./linux/kernel/git/acme/linux

Pull perf tools fixes from Arnaldo Carvalho de Melo:

 - Fix SIGSEGV when processing syscall args in perf.data files in 'perf
   trace'

 - Sync kvm, msr-index and cpufeatures headers with the kernel sources

 - Fix 'convert perf time to TSC' 'perf test':
     - No need to open events twice
     - Fix finding correct event on hybrid systems

* tag 'perf-tools-fixes-for-v5.19-2022-07-17' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  perf trace: Fix SIGSEGV when processing syscall args
  perf tests: Fix Convert perf time to TSC test for hybrid
  perf tests: Stop Convert perf time to TSC test opening events twice
  tools arch x86: Sync the msr-index.h copy with the kernel sources
  tools headers cpufeatures: Sync with the kernel sources
  tools headers UAPI: Sync linux/kvm.h with the kernel sources

2 years agodrm/i915/ttm: fix 32b build
Matthew Auld [Tue, 12 Jul 2022 17:40:50 +0000 (18:40 +0100)]
drm/i915/ttm: fix 32b build

Since segment_pages is no longer a compile time constant, it looks the
DIV_ROUND_UP(node->size, segment_pages) breaks the 32b build. Simplest
is just to use the ULL variant, but really we should need not need more
than u32 for the page alignment (also we are limited by that due to the
sg->length type), so also make it all u32.

Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Fixes: aff1e0b09b54 ("drm/i915/ttm: fix sg_table construction")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Nirmoy Das <nirmoy.das@linux.intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220712174050.592550-1-matthew.auld@intel.com
(cherry picked from commit 9306b2b2dfce6931241ef804783692cee526599c)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2 years agoMerge tag 'perf_urgent_for_v5.19_rc7' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 17 Jul 2022 15:34:02 +0000 (08:34 -0700)]
Merge tag 'perf_urgent_for_v5.19_rc7' of git://git./linux/kernel/git/tip/tip

Pull perf fix from Borislav Petkov:

 - A single data race fix on the perf event cleanup path to avoid
   endless loops due to insufficient locking

* tag 'perf_urgent_for_v5.19_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/core: Fix data race between perf_event_set_output() and perf_mmap_close()

2 years agoMerge tag 'x86_urgent_for_v5.19_rc7' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 17 Jul 2022 15:27:30 +0000 (08:27 -0700)]
Merge tag 'x86_urgent_for_v5.19_rc7' of git://git./linux/kernel/git/tip/tip

Pull x86 fixes from Borislav Petkov:

 - Improve the check whether the kernel supports WP mappings so that it
   can accomodate a XenPV guest due to how the latter is setting up the
   PAT machinery

  - Now that the retbleed nightmare is public, here's the first round of
    fallout fixes:

      * Fix a build failure on 32-bit due to missing include

      * Remove an untraining point in espfix64 return path

      * other small cleanups

* tag 'x86_urgent_for_v5.19_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/bugs: Remove apostrophe typo
  um: Add missing apply_returns()
  x86/entry: Remove UNTRAIN_RET from native_irq_return_ldt
  x86/bugs: Mark retbleed_strings static
  x86/pat: Fix x86_has_pat_wp()
  x86/asm/32: Fix ANNOTATE_UNRET_SAFE use on 32-bit

2 years agoMerge tag 'gpio-fixes-for-v5.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 17 Jul 2022 14:58:19 +0000 (07:58 -0700)]
Merge tag 'gpio-fixes-for-v5.19-rc7' of git://git./linux/kernel/git/brgl/linux

Pull gpio fix from Bartosz Golaszewski:

 - fix a configfs attribute of the gpio-sim module

* tag 'gpio-fixes-for-v5.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpio: sim: fix the chip_name configfs item

2 years agoMerge tag 'input-for-v5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 17 Jul 2022 14:52:46 +0000 (07:52 -0700)]
Merge tag 'input-for-v5.19-rc6' of git://git./linux/kernel/git/dtor/input

Pull input fixes from Dmitry Torokhov:

 - fix Goodix driver to properly behave on the Aya Neo Next

 - some more sanity checks in usbtouchscreen driver

 - a tweak in wm97xx driver in preparation for remove() to return void

 - a clarification in input core regarding units of measurement for
   resolution on touch events.

* tag 'input-for-v5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: document the units for resolution of size axes
  Input: goodix - call acpi_device_fix_up_power() in some cases
  Input: wm97xx - make .remove() obviously always return 0
  Input: usbtouchscreen - add driver_info sanity check

2 years agoMerge tag 'for-v5.19-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux...
Linus Torvalds [Sun, 17 Jul 2022 14:45:51 +0000 (07:45 -0700)]
Merge tag 'for-v5.19-rc' of git://git./linux/kernel/git/sre/linux-power-supply

Pull power supply fixes from Sebastian Reichel:

 - power-supply core temperature interpolation regression fix for
   incorrect boundaries

 - ab8500 needs to destroy its work queues in error paths

 - Fix old DT refcount leak in arm-versatile

* tag 'for-v5.19-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply:
  power: supply: core: Fix boundary conditions in interpolation
  power/reset: arm-versatile: Fix refcount leak in versatile_reboot_probe
  power: supply: ab8500_fg: add missing destroy_workqueue in ab8500_fg_probe

2 years agoperf trace: Fix SIGSEGV when processing syscall args
Naveen N. Rao [Thu, 7 Jul 2022 09:09:00 +0000 (14:39 +0530)]
perf trace: Fix SIGSEGV when processing syscall args

On powerpc, 'perf trace' is crashing with a SIGSEGV when trying to
process a perf.data file created with 'perf trace record -p':

  #0  0x00000001225b8988 in syscall_arg__scnprintf_augmented_string <snip> at builtin-trace.c:1492
  #1  syscall_arg__scnprintf_filename <snip> at builtin-trace.c:1492
  #2  syscall_arg__scnprintf_filename <snip> at builtin-trace.c:1486
  #3  0x00000001225bdd9c in syscall_arg_fmt__scnprintf_val <snip> at builtin-trace.c:1973
  #4  syscall__scnprintf_args <snip> at builtin-trace.c:2041
  #5  0x00000001225bff04 in trace__sys_enter <snip> at builtin-trace.c:2319

That points to the below code in tools/perf/builtin-trace.c:
/*
 * If this is raw_syscalls.sys_enter, then it always comes with the 6 possible
 * arguments, even if the syscall being handled, say "openat", uses only 4 arguments
 * this breaks syscall__augmented_args() check for augmented args, as we calculate
 * syscall->args_size using each syscalls:sys_enter_NAME tracefs format file,
 * so when handling, say the openat syscall, we end up getting 6 args for the
 * raw_syscalls:sys_enter event, when we expected just 4, we end up mistakenly
 * thinking that the extra 2 u64 args are the augmented filename, so just check
 * here and avoid using augmented syscalls when the evsel is the raw_syscalls one.
 */
if (evsel != trace->syscalls.events.sys_enter)
augmented_args = syscall__augmented_args(sc, sample, &augmented_args_size, trace->raw_augmented_syscalls_args_size);

As the comment points out, we should not be trying to augment the args
for raw_syscalls. However, when processing a perf.data file, we are not
initializing those properly. Fix the same.

Reported-by: Claudio Carvalho <cclaudio@linux.ibm.com>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/20220707090900.572584-1-naveen.n.rao@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 years agoperf tests: Fix Convert perf time to TSC test for hybrid
Adrian Hunter [Wed, 13 Jul 2022 12:34:59 +0000 (15:34 +0300)]
perf tests: Fix Convert perf time to TSC test for hybrid

The test does not always correctly determine the number of events for
hybrids, nor allow for more than 1 evsel when parsing.

Fix by iterating the events actually created and getting the correct
evsel for the events processed.

Fixes: d9da6f70eb235110 ("perf tests: Support 'Convert perf time to TSC' test for hybrid")
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20220713123459.24145-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 years agoperf tests: Stop Convert perf time to TSC test opening events twice
Adrian Hunter [Wed, 13 Jul 2022 12:34:58 +0000 (15:34 +0300)]
perf tests: Stop Convert perf time to TSC test opening events twice

Do not call evlist__open() twice.

Fixes: 5bb017d4b97a0f13 ("perf test: Fix error message for test case 71 on s390, where it is not supported")
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20220713123459.24145-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 years agotools arch x86: Sync the msr-index.h copy with the kernel sources
Arnaldo Carvalho de Melo [Thu, 1 Jul 2021 16:32:18 +0000 (13:32 -0300)]
tools arch x86: Sync the msr-index.h copy with the kernel sources

To pick up the changes from these csets:

  4ad3278df6fe2b08 ("x86/speculation: Disable RRSBA behavior")
  d7caac991feeef1b ("x86/cpu/amd: Add Spectral Chicken")

That cause no changes to tooling:

  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before
  $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h
  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after
  $ diff -u before after
  $

Just silences this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h'
  diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/lkml/YtQTm9wsB3hxQWvy@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 years agotools headers cpufeatures: Sync with the kernel sources
Arnaldo Carvalho de Melo [Thu, 1 Jul 2021 16:39:15 +0000 (13:39 -0300)]
tools headers cpufeatures: Sync with the kernel sources

To pick the changes from:

  f43b9876e857c739 ("x86/retbleed: Add fine grained Kconfig knobs")
  a149180fbcf336e9 ("x86: Add magic AMD return-thunk")
  15e67227c49a5783 ("x86: Undo return-thunk damage")
  369ae6ffc41a3c11 ("x86/retpoline: Cleanup some #ifdefery")
  4ad3278df6fe2b08 x86/speculation: Disable RRSBA behavior
  26aae8ccbc197223 x86/cpu/amd: Enumerate BTC_NO
  9756bba28470722d x86/speculation: Fill RSB on vmexit for IBRS
  3ebc170068885b6f x86/bugs: Add retbleed=ibpb
  2dbb887e875b1de3 x86/entry: Add kernel IBRS implementation
  6b80b59b35557065 x86/bugs: Report AMD retbleed vulnerability
  a149180fbcf336e9 x86: Add magic AMD return-thunk
  15e67227c49a5783 x86: Undo return-thunk damage
  a883d624aed463c8 x86/cpufeatures: Move RETPOLINE flags to word 11
  51802186158c74a0 x86/speculation/mmio: Enumerate Processor MMIO Stale Data bug

This only causes these perf files to be rebuilt:

  CC       /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
  CC       /tmp/build/perf/bench/mem-memset-x86-64-asm.o

And addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h'
  diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h
  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/disabled-features.h' differs from latest version at 'arch/x86/include/asm/disabled-features.h'
  diff -u tools/arch/x86/include/asm/disabled-features.h arch/x86/include/asm/disabled-features.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org
Link: https://lore.kernel.org/lkml/YtQM40VmiLTkPND2@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 years agotools headers UAPI: Sync linux/kvm.h with the kernel sources
Arnaldo Carvalho de Melo [Sun, 9 May 2021 12:39:02 +0000 (09:39 -0300)]
tools headers UAPI: Sync linux/kvm.h with the kernel sources

To pick the changes in:

  1b870fa5573e260b ("kvm: stats: tell userspace which values are boolean")

That just rebuilds perf, as these patches don't add any new KVM ioctl to
be harvested for the the 'perf trace' ioctl syscall argument
beautifiers.

This is also by now used by tools/testing/selftests/kvm/, a simple test
build succeeded.

This silences this perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h'
  diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: http://lore.kernel.org/lkml/YtQLDvQrBhJNl3n5@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 years agoMerge tag 'for-5.19-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave...
Linus Torvalds [Sat, 16 Jul 2022 20:48:55 +0000 (13:48 -0700)]
Merge tag 'for-5.19-rc7-tag' of git://git./linux/kernel/git/kdave/linux

Pull btrfs reverts from David Sterba:
 "Due to a recent report [1] we need to revert the radix tree to xarray
  conversion patches.

  There's a problem with sleeping under spinlock, when xa_insert could
  allocate memory under pressure. We use GFP_NOFS so this is a real
  problem that we unfortunately did not discover during review.

  I'm sorry to do such change at rc6 time but the revert is IMO the
  safer option, there are patches to use mutex instead of the spin locks
  but that would need more testing. The revert branch has been tested on
  a few setups, all seem ok.

  The conversion to xarray will be revisited in the future"

Link: https://lore.kernel.org/linux-btrfs/cover.1657097693.git.fdmanana@suse.com/
* tag 'for-5.19-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  Revert "btrfs: turn delayed_nodes_tree into an XArray"
  Revert "btrfs: turn name_cache radix tree into XArray in send_ctx"
  Revert "btrfs: turn fs_info member buffer_radix into XArray"
  Revert "btrfs: turn fs_roots_radix in btrfs_fs_info into an XArray"

2 years agoMerge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds [Sat, 16 Jul 2022 18:45:40 +0000 (11:45 -0700)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Six small and reasonably obvious fixes, all in drivers"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: pm80xx: Set stopped phy's linkrate to Disabled
  scsi: pm80xx: Fix 'Unknown' max/min linkrate
  scsi: ufs: core: Fix missing clk change notification on host reset
  scsi: ufs: core: Drop loglevel of WriteBoost message
  scsi: megaraid: Clear READ queue map's nr_queues
  scsi: target: Fix WRITE_SAME No Data Buffer crash

2 years agoMerge tag 'block-5.19-2022-07-15' of git://git.kernel.dk/linux-block
Linus Torvalds [Sat, 16 Jul 2022 18:40:10 +0000 (11:40 -0700)]
Merge tag 'block-5.19-2022-07-15' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
 "Two NVMe fixes, and a regression fix for the core block layer from
  this merge window"

* tag 'block-5.19-2022-07-15' of git://git.kernel.dk/linux-block:
  block: fix missing blkcg_bio_issue_init
  nvme: fix block device naming collision
  nvme-pci: fix freeze accounting for error handling

2 years agoMerge tag 'usb-5.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Linus Torvalds [Sat, 16 Jul 2022 18:21:15 +0000 (11:21 -0700)]
Merge tag 'usb-5.19-rc7' of git://git./linux/kernel/git/gregkh/usb

Pull USB driver fixes from Greg KH:
 "Here are some small USB driver fixes and new device ids for 5.19-rc7.
  They include:

   - new usb-serial driver ids

   - typec uevent fix

   - uvc gadget driver fix

   - dwc3 driver fixes

   - ehci-fsl driver fix

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'usb-5.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  USB: serial: ftdi_sio: add Belimo device ids
  drivers/usb/host/ehci-fsl: Fix interrupt setup in host mode.
  usb: gadget: uvc: fix changing interface name via configfs
  usb: typec: add missing uevent when partner support PD
  usb: dwc3-am62: remove unnecesary clk_put()
  usb: dwc3: gadget: Fix event pending check

2 years agoMerge tag 'tty-5.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Linus Torvalds [Sat, 16 Jul 2022 18:11:56 +0000 (11:11 -0700)]
Merge tag 'tty-5.19-rc7' of git://git./linux/kernel/git/gregkh/tty

Pull tty and serial driver fixes from Greg KH:
 "Here are some TTY and Serial driver fixes for 5.19-rc7. They resolve a
  number of reported problems including:

   - longtime bug in pty_write() that has been reported in the past.

   - 8250 driver fixes

   - new serial device ids

   - vt overlapping data copy bugfix

   - other tiny serial driver bugfixes

  All of these have been in linux-next for a while with no reported
  problems"

* tag 'tty-5.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  tty: use new tty_insert_flip_string_and_push_buffer() in pty_write()
  tty: extract tty_flip_buffer_commit() from tty_flip_buffer_push()
  serial: 8250: dw: Fix the macro RZN1_UART_xDMACR_8_WORD_BURST
  vt: fix memory overlapping when deleting chars in the buffer
  serial: mvebu-uart: correctly report configured baudrate value
  serial: 8250: Fix PM usage_count for console handover
  serial: 8250: fix return error code in serial8250_request_std_resource()
  serial: stm32: Clear prev values before setting RTS delays
  tty: Add N_CAN327 line discipline ID for ELM327 based CAN driver
  serial: 8250: Fix __stop_tx() & DMA Tx restart races
  serial: pl011: UPSTAT_AUTORTS requires .throttle/unthrottle
  tty: serial: samsung_tty: set dma burst_size to 1
  serial: 8250: dw: enable using pdata with ACPI

2 years agoMerge tag 's390-5.19-6' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Linus Torvalds [Sat, 16 Jul 2022 18:00:40 +0000 (11:00 -0700)]
Merge tag 's390-5.19-6' of git://git./linux/kernel/git/s390/linux

Pull s390 fixes from Alexander Gordeev:

 - Fix building of out-of-tree kernel modules without a pre-built kernel
   in case CONFIG_EXPOLINE_EXTERN=y.

 - Fix a reference counting error that could prevent unloading of zcrypt
   modules.

* tag 's390-5.19-6' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/ap: fix error handling in __verify_queue_reservations()
  s390/nospec: remove unneeded header includes
  s390/nospec: build expoline.o for modules_prepare target

2 years agoMerge tag 'pm-5.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Linus Torvalds [Sat, 16 Jul 2022 17:56:28 +0000 (10:56 -0700)]
Merge tag 'pm-5.19-rc7' of git://git./linux/kernel/git/rafael/linux-pm

Pull power management fix from Rafael Wysocki
 "Fix recent regression in the cpufreq mediatek driver related to
  incorrect handling of regulator_get_optional() return value
  (AngeloGioacchino Del Regno)"

* tag 'pm-5.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpufreq: mediatek: Handle sram regulator probe deferral

2 years agoMerge tag 'acpi-5.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael...
Linus Torvalds [Sat, 16 Jul 2022 17:52:41 +0000 (10:52 -0700)]
Merge tag 'acpi-5.19-rc7' of git://git./linux/kernel/git/rafael/linux-pm

Pull ACPI fix from Rafael Wysocki:
 "Fix more fallout from recent changes of the ACPI CPPC handling on AMD
  platforms (Mario Limonciello)"

* tag 'acpi-5.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: CPPC: Fix enabling CPPC on AMD systems with shared memory

2 years agoMerge tag 'printk-for-5.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 16 Jul 2022 17:46:03 +0000 (10:46 -0700)]
Merge tag 'printk-for-5.19-rc7' of git://git./linux/kernel/git/printk/linux

Pull printk fix from Petr Mladek:

 - Make pr_flush() fast when consoles are suspended.

* tag 'printk-for-5.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
  printk: do not wait for consoles when suspended

2 years agorandom: cap jitter samples per bit to factor of HZ
Jason A. Donenfeld [Wed, 13 Jul 2022 15:11:15 +0000 (17:11 +0200)]
random: cap jitter samples per bit to factor of HZ

Currently the jitter mechanism will require two timer ticks per
iteration, and it requires N iterations per bit. This N is determined
with a small measurement, and if it's too big, it won't waste time with
jitter entropy because it'd take too long or not have sufficient entropy
anyway.

With the current max N of 32, there are large timeouts on systems with a
small CONFIG_HZ. Rather than set that maximum to 32, instead choose a
factor of CONFIG_HZ. In this case, 1/30 seems to yield sane values for
different configurations of CONFIG_HZ.

Reported-by: Vladimir Murzin <vladimir.murzin@arm.com>
Fixes: 78c768e619fb ("random: vary jitter iterations based on cycle counter speed")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Tested-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agoefi/x86: use naked RET on mixed mode call wrapper
Thadeu Lima de Souza Cascardo [Fri, 15 Jul 2022 19:45:50 +0000 (16:45 -0300)]
efi/x86: use naked RET on mixed mode call wrapper

When running with return thunks enabled under 32-bit EFI, the system
crashes with:

  kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
  BUG: unable to handle page fault for address: 000000005bc02900
  #PF: supervisor instruction fetch in kernel mode
  #PF: error_code(0x0011) - permissions violation
  PGD 18f7063 P4D 18f7063 PUD 18ff063 PMD 190e063 PTE 800000005bc02063
  Oops: 0011 [#1] PREEMPT SMP PTI
  CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.19.0-rc6+ #166
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:0x5bc02900
  Code: Unable to access opcode bytes at RIP 0x5bc028d6.
  RSP: 0018:ffffffffb3203e10 EFLAGS: 00010046
  RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000048
  RDX: 000000000190dfac RSI: 0000000000001710 RDI: 000000007eae823b
  RBP: ffffffffb3203e70 R08: 0000000001970000 R09: ffffffffb3203e28
  R10: 747563657865206c R11: 6c6977203a696665 R12: 0000000000001710
  R13: 0000000000000030 R14: 0000000001970000 R15: 0000000000000001
  FS:  0000000000000000(0000) GS:ffff8e013ca00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0018 ES: 0018 CR0: 0000000080050033
  CR2: 000000005bc02900 CR3: 0000000001930000 CR4: 00000000000006f0
  Call Trace:
   ? efi_set_virtual_address_map+0x9c/0x175
   efi_enter_virtual_mode+0x4a6/0x53e
   start_kernel+0x67c/0x71e
   x86_64_start_reservations+0x24/0x2a
   x86_64_start_kernel+0xe9/0xf4
   secondary_startup_64_no_verify+0xe5/0xeb

That's because it cannot jump to the return thunk from the 32-bit code.

Using a naked RET and marking it as safe allows the system to proceed
booting.

Fixes: aa3d480315ba ("x86: Use return-thunk in asm code")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: <stable@vger.kernel.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agox86/bugs: Remove apostrophe typo
Kim Phillips [Fri, 8 Jul 2022 21:21:28 +0000 (16:21 -0500)]
x86/bugs: Remove apostrophe typo

Remove a superfluous ' in the mitigation string.

Fixes: e8ec1b6e08a2 ("x86/bugs: Enable STIBP for JMP2RET")
Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
2 years agoMerge tag 'riscv-for-linus-5.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 15 Jul 2022 17:40:50 +0000 (10:40 -0700)]
Merge tag 'riscv-for-linus-5.19-rc7' of git://git./linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:

 - A fix to avoid printing a warning when modules do not exercise any
   errata-dependent behavior and the SiFive errata are enabled.

 - A fix to the Microchip PFSOC to attach the L2 cache to the CPU nodes.

* tag 'riscv-for-linus-5.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: don't warn for sifive erratas in modules
  riscv: dts: microchip: hook up the mpfs' l2cache

2 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Fri, 15 Jul 2022 17:31:46 +0000 (10:31 -0700)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
 "RISC-V:
   - Fix missing PAGE_PFN_MASK

   - Fix SRCU deadlock caused by kvm_riscv_check_vcpu_requests()

  x86:
   - Fix for nested virtualization when TSC scaling is active

   - Estimate the size of fastcc subroutines conservatively, avoiding
     disastrous underestimation when return thunks are enabled

   - Avoid possible use of uninitialized fields of 'struct
     kvm_lapic_irq'

  Generic:
   - Mark as such the boolean values available from the statistics file
     descriptors

   - Clarify statistics documentation"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: emulate: do not adjust size of fastop and setcc subroutines
  KVM: x86: Fully initialize 'struct kvm_lapic_irq' in kvm_pv_kick_cpu_op()
  Documentation: kvm: clarify histogram units
  kvm: stats: tell userspace which values are boolean
  x86/kvm: fix FASTOP_SIZE when return thunks are enabled
  KVM: nVMX: Always enable TSC scaling for L2 when it was enabled for L1
  RISC-V: KVM: Fix SRCU deadlock caused by kvm_riscv_check_vcpu_requests()
  riscv: Fix missing PAGE_PFN_MASK

2 years agoMerge tag 'ceph-for-5.19-rc7' of https://github.com/ceph/ceph-client
Linus Torvalds [Fri, 15 Jul 2022 17:27:28 +0000 (10:27 -0700)]
Merge tag 'ceph-for-5.19-rc7' of https://github.com/ceph/ceph-client

Pull ceph fix from Ilya Dryomov:
 "A folio locking fixup that Xiubo and David cooperated on, marked for
  stable. Most of it is in netfs but I picked it up into ceph tree on
  agreement with David"

* tag 'ceph-for-5.19-rc7' of https://github.com/ceph/ceph-client:
  netfs: do not unlock and put the folio twice

2 years agoMerge tag 'spi-fix-v5.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brooni...
Linus Torvalds [Fri, 15 Jul 2022 17:23:43 +0000 (10:23 -0700)]
Merge tag 'spi-fix-v5.19-rc4' of git://git./linux/kernel/git/broonie/spi

Pull spi fixes from Mark Brown:
 "A few driver specific fixes, none especially remarkable, plus a
  MAINTAINERS file update due to the previous maintainer for the NXP
  FSPI driver having left the company"

* tag 'spi-fix-v5.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: cadence-quadspi: Remove spi_master_put() in probe failure path
  MAINTAINERS: change the NXP FSPI driver maintainer.
  spi: amd: Limit max transfer and message size
  spi: aspeed: Fix division by zero
  spi: aspeed: Add dev_dbg() to dump the spi-mem direct mapping descriptor

2 years agoMerge tag 'soc-fixes-5.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Linus Torvalds [Fri, 15 Jul 2022 17:16:44 +0000 (10:16 -0700)]
Merge tag 'soc-fixes-5.19-3' of git://git./linux/kernel/git/soc/soc

Pull ARM SoC fixes from Arnd Bergmann:
 "Most of the contents are bugfixes for the devicetree files:

   - A Qualcomm MSM8974 pin controller regression, caused by a cleanup
     patch that gets partially reverted here.

   - Missing properties for Broadcom BCM49xx to fix timer detection and
     SMP boot.

   - Fix touchscreen pinctrl for imx6ull-colibri board

   - Multiple fixes for Rockchip rk3399 based machines including the vdu
     clock-rate fix, otg port fix on Quartz64-A and ethernet on
     Quartz64-B

   - Fixes for misspelled DT contents causing minor problems on
     imx6qdl-ts7970m, orangepi-zero, sama5d2, kontron-kswitch-d10, and
     ls1028a

  And a couple of changes elsewhere:

   - Fix binding for Allwinner D1 display pipeline

   - Trivial code fixes to the TEE and reset controller driver
     subsystems and the rockchip platform code.

   - Multiple updates to the MAINTAINERS files, marking the Palm Treo
     support as orphaned, and fixing some entries for added or changed
     file names"

* tag 'soc-fixes-5.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (21 commits)
  arm64: dts: broadcom: bcm4908: Fix cpu node for smp boot
  arm64: dts: broadcom: bcm4908: Fix timer node for BCM4906 SoC
  ARM: dts: sunxi: Fix SPI NOR campatible on Orange Pi Zero
  ARM: dts: at91: sama5d2: Fix typo in i2s1 node
  tee: tee_get_drvdata(): fix description of return value
  optee: Remove duplicate 'of' in two places.
  ARM: dts: kswitch-d10: use open drain mode for coma-mode pins
  ARM: dts: colibri-imx6ull: fix snvs pinmux group
  optee: smc_abi.c: fix wrong pointer passed to IS_ERR/PTR_ERR()
  MAINTAINERS: add polarfire rng, pci and clock drivers
  MAINTAINERS: mark ARM/PALM TREO SUPPORT orphan
  ARM: dts: imx6qdl-ts7970: Fix ngpio typo and count
  arm64: dts: ls1028a: Update SFP node to include clock
  dt-bindings: display: sun4i: Fix D1 pipeline count
  ARM: dts: qcom: msm8974: re-add missing pinctrl
  reset: Fix devm bulk optional exclusive control getter
  MAINTAINERS: rectify entry for SYNOPSYS AXS10x RESET CONTROLLER DRIVER
  ARM: rockchip: Add missing of_node_put() in rockchip_suspend_init()
  arm64: dts: rockchip: Assign RK3399 VDU clock rate
  arm64: dts: rockchip: Fix Quartz64-A dwc3 otg port behavior
  ...

2 years agoRevert "btrfs: turn delayed_nodes_tree into an XArray"
David Sterba [Fri, 15 Jul 2022 11:59:45 +0000 (13:59 +0200)]
Revert "btrfs: turn delayed_nodes_tree into an XArray"

This reverts commit 253bf57555e451dec5a7f09dc95d380ce8b10e5b.

Revert the xarray conversion, there's a problem with potential
sleep-inside-spinlock [1] when calling xa_insert that triggers GFP_NOFS
allocation. The radix tree used the preloading mechanism to avoid
sleeping but this is not available in xarray.

Conversion from spin lock to mutex is possible but at time of rc6 is
riskier than a clean revert.

[1] https://lore.kernel.org/linux-btrfs/cover.1657097693.git.fdmanana@suse.com/

Reported-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2 years agoRevert "btrfs: turn name_cache radix tree into XArray in send_ctx"
David Sterba [Fri, 15 Jul 2022 11:59:38 +0000 (13:59 +0200)]
Revert "btrfs: turn name_cache radix tree into XArray in send_ctx"

This reverts commit 4076942021fe14efecae33bf98566df6dd5ae6f7.

Revert the xarray conversion, there's a problem with potential
sleep-inside-spinlock [1] when calling xa_insert that triggers GFP_NOFS
allocation. The radix tree used the preloading mechanism to avoid
sleeping but this is not available in xarray.

Conversion from spin lock to mutex is possible but at time of rc6 is
riskier than a clean revert.

[1] https://lore.kernel.org/linux-btrfs/cover.1657097693.git.fdmanana@suse.com/

Reported-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>