review.tizen.org Git - platform/kernel/linux-starfive.git/log

f2fs: wire up new fscrypt ioctls

Wire up the new ioctls for adding and removing fscrypt keys to/from the
filesystem, and the new ioctl for retrieving v2 encryption policies.

The key removal ioctls also required making f2fs_drop_inode() call
fscrypt_drop_inode().

For more details see Documentation/filesystems/fscrypt.rst and the
fscrypt patches that added the implementation of these ioctls.

Acked-by: Jaegeuk Kim <jaegeuk@kernel.org>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>

ext4: wire up new fscrypt ioctls

Wire up the new ioctls for adding and removing fscrypt keys to/from the
filesystem, and the new ioctl for retrieving v2 encryption policies.

The key removal ioctls also required making ext4_drop_inode() call
fscrypt_drop_inode().

For more details see Documentation/filesystems/fscrypt.rst and the
fscrypt patches that added the implementation of these ioctls.

Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>

fscrypt: require that key be added when setting a v2 encryption policy

By looking up the master keys in a filesystem-level keyring rather than
in the calling processes' key hierarchy, it becomes possible for a user
to set an encryption policy which refers to some key they don't actually
know, then encrypt their files using that key. Cryptographically this
isn't much of a problem, but the semantics of this would be a bit weird.
Thus, enforce that a v2 encryption policy can only be set if the user
has previously added the key, or has capable(CAP_FOWNER).

We tolerate that this problem will continue to exist for v1 encryption
policies, however; there is no way around that.

Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>

fscrypt: add FS_IOC_REMOVE_ENCRYPTION_KEY_ALL_USERS ioctl

Add a root-only variant of the FS_IOC_REMOVE_ENCRYPTION_KEY ioctl which
removes all users' claims of the key, not just the current user's claim.
I.e., it always removes the key itself, no matter how many users have
added it.

This is useful for forcing a directory to be locked, without having to
figure out which user ID(s) the key was added under. This is planned to
be used by a command like 'sudo fscrypt lock DIR --all-users' in the
fscrypt userspace tool (http://github.com/google/fscrypt).

Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>

fscrypt: allow unprivileged users to add/remove keys for v2 policies

Allow the FS_IOC_ADD_ENCRYPTION_KEY and FS_IOC_REMOVE_ENCRYPTION_KEY
ioctls to be used by non-root users to add and remove encryption keys
from the filesystem-level crypto keyrings, subject to limitations.

Motivation: while privileged fscrypt key management is sufficient for
some users (e.g. Android and Chromium OS, where a privileged process
manages all keys), the old API by design also allows non-root users to
set up and use encrypted directories, and we don't want to regress on
that.  Especially, we don't want to force users to continue using the
old API, running into the visibility mismatch between files and keyrings
and being unable to "lock" encrypted directories.

Intuitively, the ioctls have to be privileged since they manipulate
filesystem-level state.  However, it's actually safe to make them
unprivileged if we very carefully enforce some specific limitations.

First, each key must be identified by a cryptographic hash so that a
user can't add the wrong key for another user's files.  For v2
encryption policies, we use the key_identifier for this.  v1 policies
don't have this, so managing keys for them remains privileged.

Second, each key a user adds is charged to their quota for the keyrings
service.  Thus, a user can't exhaust memory by adding a huge number of
keys.  By default each non-root user is allowed up to 200 keys; this can
be changed using the existing sysctl 'kernel.keys.maxkeys'.

Third, if multiple users add the same key, we keep track of those users
of the key (of which there remains a single copy), and won't really
remove the key, i.e. "lock" the encrypted files, until all those users
have removed it.  This prevents denial of service attacks that would be
possible under simpler schemes, such allowing the first user who added a
key to remove it -- since that could be a malicious user who has
compromised the key.  Of course, encryption keys should be kept secret,
but the idea is that using encryption should never be *less* secure than
not using encryption, even if your key was compromised.

We tolerate that a user will be unable to really remove a key, i.e.
unable to "lock" their encrypted files, if another user has added the
same key.  But in a sense, this is actually a good thing because it will
avoid providing a false notion of security where a key appears to have
been removed when actually it's still in memory, available to any
attacker who compromises the operating system kernel.

Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>

fscrypt: v2 encryption policy support

Add a new fscrypt policy version, "v2".  It has the following changes
from the original policy version, which we call "v1" (*):

- Master keys (the user-provided encryption keys) are only ever used as
  input to HKDF-SHA512.  This is more flexible and less error-prone, and
  it avoids the quirks and limitations of the AES-128-ECB based KDF.
  Three classes of cryptographically isolated subkeys are defined:

    - Per-file keys, like used in v1 policies except for the new KDF.

    - Per-mode keys.  These implement the semantics of the DIRECT_KEY
      flag, which for v1 policies made the master key be used directly.
      These are also planned to be used for inline encryption when
      support for it is added.

    - Key identifiers (see below).

- Each master key is identified by a 16-byte master_key_identifier,
  which is derived from the key itself using HKDF-SHA512.  This prevents
  users from associating the wrong key with an encrypted file or
  directory.  This was easily possible with v1 policies, which
  identified the key by an arbitrary 8-byte master_key_descriptor.

- The key must be provided in the filesystem-level keyring, not in a
  process-subscribed keyring.

The following UAPI additions are made:

- The existing ioctl FS_IOC_SET_ENCRYPTION_POLICY can now be passed a
  fscrypt_policy_v2 to set a v2 encryption policy.  It's disambiguated
  from fscrypt_policy/fscrypt_policy_v1 by the version code prefix.

- A new ioctl FS_IOC_GET_ENCRYPTION_POLICY_EX is added.  It allows
  getting the v1 or v2 encryption policy of an encrypted file or
  directory.  The existing FS_IOC_GET_ENCRYPTION_POLICY ioctl could not
  be used because it did not have a way for userspace to indicate which
  policy structure is expected.  The new ioctl includes a size field, so
  it is extensible to future fscrypt policy versions.

- The ioctls FS_IOC_ADD_ENCRYPTION_KEY, FS_IOC_REMOVE_ENCRYPTION_KEY,
  and FS_IOC_GET_ENCRYPTION_KEY_STATUS now support managing keys for v2
  encryption policies.  Such keys are kept logically separate from keys
  for v1 encryption policies, and are identified by 'identifier' rather
  than by 'descriptor'.  The 'identifier' need not be provided when
  adding a key, since the kernel will calculate it anyway.

This patch temporarily keeps adding/removing v2 policy keys behind the
same permission check done for adding/removing v1 policy keys:
capable(CAP_SYS_ADMIN).  However, the next patch will carefully take
advantage of the cryptographically secure master_key_identifier to allow
non-root users to add/remove v2 policy keys, thus providing a full
replacement for v1 policies.

(*) Actually, in the API fscrypt_policy::version is 0 while on-disk
    fscrypt_context::format is 1.  But I believe it makes the most sense
    to advance both to '2' to have them be in sync, and to consider the
    numbering to start at 1 except for the API quirk.

Reviewed-by: Paul Crowley <paulcrowley@google.com>
Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>

fscrypt: add an HKDF-SHA512 implementation

Add an implementation of HKDF (RFC 5869) to fscrypt, for the purpose of
deriving additional key material from the fscrypt master keys for v2
encryption policies.  HKDF is a key derivation function built on top of
HMAC.  We choose SHA-512 for the underlying unkeyed hash, and use an
"hmac(sha512)" transform allocated from the crypto API.

We'll be using this to replace the AES-ECB based KDF currently used to
derive the per-file encryption keys.  While the AES-ECB based KDF is
believed to meet the original security requirements, it is nonstandard
and has problems that don't exist in modern KDFs such as HKDF:

1. It's reversible.  Given a derived key and nonce, an attacker can
   easily compute the master key.  This is okay if the master key and
   derived keys are equally hard to compromise, but now we'd like to be
   more robust against threats such as a derived key being compromised
   through a timing attack, or a derived key for an in-use file being
   compromised after the master key has already been removed.

2. It doesn't evenly distribute the entropy from the master key; each 16
   input bytes only affects the corresponding 16 output bytes.

3. It isn't easily extensible to deriving other values or keys, such as
   a public hash for securely identifying the key, or per-mode keys.
   Per-mode keys will be immediately useful for Adiantum encryption, for
   which fscrypt currently uses the master key directly, introducing
   unnecessary usage constraints.  Per-mode keys will also be useful for
   hardware inline encryption, which is currently being worked on.

HKDF solves all the above problems.

Reviewed-by: Paul Crowley <paulcrowley@google.com>
Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>

fscrypt: add FS_IOC_GET_ENCRYPTION_KEY_STATUS ioctl

Add a new fscrypt ioctl, FS_IOC_GET_ENCRYPTION_KEY_STATUS.  Given a key
specified by 'struct fscrypt_key_specifier' (the same way a key is
specified for the other fscrypt key management ioctls), it returns
status information in a 'struct fscrypt_get_key_status_arg'.

The main motivation for this is that applications need to be able to
check whether an encrypted directory is "unlocked" or not, so that they
can add the key if it is not, and avoid adding the key (which may
involve prompting the user for a passphrase) if it already is.

It's possible to use some workarounds such as checking whether opening a
regular file fails with ENOKEY, or checking whether the filenames "look
like gibberish" or not.  However, no workaround is usable in all cases.

Like the other key management ioctls, the keyrings syscalls may seem at
first to be a good fit for this.  Unfortunately, they are not.  Even if
we exposed the keyring ID of the ->s_master_keys keyring and gave
everyone Search permission on it (note: currently the keyrings
permission system would also allow everyone to "invalidate" the keyring
too), the fscrypt keys have an additional state that doesn't map cleanly
to the keyrings API: the secret can be removed, but we can be still
tracking the files that were using the key, and the removal can be
re-attempted or the secret added again.

After later patches, some applications will also need a way to determine
whether a key was added by the current user vs. by some other user.
Reserved fields are included in fscrypt_get_key_status_arg for this and
other future extensions.

Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>

fscrypt: add FS_IOC_REMOVE_ENCRYPTION_KEY ioctl

Add a new fscrypt ioctl, FS_IOC_REMOVE_ENCRYPTION_KEY.  This ioctl
removes an encryption key that was added by FS_IOC_ADD_ENCRYPTION_KEY.
It wipes the secret key itself, then "locks" the encrypted files and
directories that had been unlocked using that key -- implemented by
evicting the relevant dentries and inodes from the VFS caches.

The problem this solves is that many fscrypt users want the ability to
remove encryption keys, causing the corresponding encrypted directories
to appear "locked" (presented in ciphertext form) again.  Moreover,
users want removing an encryption key to *really* remove it, in the
sense that the removed keys cannot be recovered even if kernel memory is
compromised, e.g. by the exploit of a kernel security vulnerability or
by a physical attack.  This is desirable after a user logs out of the
system, for example.  In many cases users even already assume this to be
the case and are surprised to hear when it's not.

It is not sufficient to simply unlink the master key from the keyring
(or to revoke or invalidate it), since the actual encryption transform
objects are still pinned in memory by their inodes.  Therefore, to
really remove a key we must also evict the relevant inodes.

Currently one workaround is to run 'sync && echo 2 >
/proc/sys/vm/drop_caches'.  But, that evicts all unused inodes in the
system rather than just the inodes associated with the key being
removed, causing severe performance problems.  Moreover, it requires
root privileges, so regular users can't "lock" their encrypted files.

Another workaround, used in Chromium OS kernels, is to add a new
VFS-level ioctl FS_IOC_DROP_CACHE which is a more restricted version of
drop_caches that operates on a single super_block.  It does:

        shrink_dcache_sb(sb);
        invalidate_inodes(sb, false);

But it's still a hack.  Yet, the major users of filesystem encryption
want this feature badly enough that they are actually using these hacks.

To properly solve the problem, start maintaining a list of the inodes
which have been "unlocked" using each master key.  Originally this
wasn't possible because the kernel didn't keep track of in-use master
keys at all.  But, with the ->s_master_keys keyring it is now possible.

Then, add an ioctl FS_IOC_REMOVE_ENCRYPTION_KEY.  It finds the specified
master key in ->s_master_keys, then wipes the secret key itself, which
prevents any additional inodes from being unlocked with the key.  Then,
it syncs the filesystem and evicts the inodes in the key's list.  The
normal inode eviction code will free and wipe the per-file keys (in
->i_crypt_info).  Note that freeing ->i_crypt_info without evicting the
inodes was also considered, but would have been racy.

Some inodes may still be in use when a master key is removed, and we
can't simply revoke random file descriptors, mmap's, etc.  Thus, the
ioctl simply skips in-use inodes, and returns -EBUSY to indicate that
some inodes weren't evicted.  The master key *secret* is still removed,
but the fscrypt_master_key struct remains to keep track of the remaining
inodes.  Userspace can then retry the ioctl to evict the remaining
inodes.  Alternatively, if userspace adds the key again, the refreshed
secret will be associated with the existing list of inodes so they
remain correctly tracked for future key removals.

The ioctl doesn't wipe pagecache pages.  Thus, we tolerate that after a
kernel compromise some portions of plaintext file contents may still be
recoverable from memory.  This can be solved by enabling page poisoning
system-wide, which security conscious users may choose to do.  But it's
very difficult to solve otherwise, e.g. note that plaintext file
contents may have been read in other places than pagecache pages.

Like FS_IOC_ADD_ENCRYPTION_KEY, FS_IOC_REMOVE_ENCRYPTION_KEY is
initially restricted to privileged users only.  This is sufficient for
some use cases, but not all.  A later patch will relax this restriction,
but it will require introducing key hashes, among other changes.

Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>

fscrypt: add FS_IOC_ADD_ENCRYPTION_KEY ioctl

Add a new fscrypt ioctl, FS_IOC_ADD_ENCRYPTION_KEY.  This ioctl adds an
encryption key to the filesystem's fscrypt keyring ->s_master_keys,
making any files encrypted with that key appear "unlocked".

Why we need this
~~~~~~~~~~~~~~~~

The main problem is that the "locked/unlocked" (ciphertext/plaintext)
status of encrypted files is global, but the fscrypt keys are not.
fscrypt only looks for keys in the keyring(s) the process accessing the
filesystem is subscribed to: the thread keyring, process keyring, and
session keyring, where the session keyring may contain the user keyring.

Therefore, userspace has to put fscrypt keys in the keyrings for
individual users or sessions.  But this means that when a process with a
different keyring tries to access encrypted files, whether they appear
"unlocked" or not is nondeterministic.  This is because it depends on
whether the files are currently present in the inode cache.

Fixing this by consistently providing each process its own view of the
filesystem depending on whether it has the key or not isn't feasible due
to how the VFS caches work.  Furthermore, while sometimes users expect
this behavior, it is misguided for two reasons.  First, it would be an
OS-level access control mechanism largely redundant with existing access
control mechanisms such as UNIX file permissions, ACLs, LSMs, etc.
Encryption is actually for protecting the data at rest.

Second, almost all users of fscrypt actually do need the keys to be
global.  The largest users of fscrypt, Android and Chromium OS, achieve
this by having PID 1 create a "session keyring" that is inherited by
every process.  This works, but it isn't scalable because it prevents
session keyrings from being used for any other purpose.

On general-purpose Linux distros, the 'fscrypt' userspace tool [1] can't
similarly abuse the session keyring, so to make 'sudo' work on all
systems it has to link all the user keyrings into root's user keyring
[2].  This is ugly and raises security concerns.  Moreover it can't make
the keys available to system services, such as sshd trying to access the
user's '~/.ssh' directory (see [3], [4]) or NetworkManager trying to
read certificates from the user's home directory (see [5]); or to Docker
containers (see [6], [7]).

By having an API to add a key to the *filesystem* we'll be able to fix
the above bugs, remove userspace workarounds, and clearly express the
intended semantics: the locked/unlocked status of an encrypted directory
is global, and encryption is orthogonal to OS-level access control.

Why not use the add_key() syscall
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We use an ioctl for this API rather than the existing add_key() system
call because the ioctl gives us the flexibility needed to implement
fscrypt-specific semantics that will be introduced in later patches:

- Supporting key removal with the semantics such that the secret is
  removed immediately and any unused inodes using the key are evicted;
  also, the eviction of any in-use inodes can be retried.

- Calculating a key-dependent cryptographic identifier and returning it
  to userspace.

- Allowing keys to be added and removed by non-root users, but only keys
  for v2 encryption policies; and to prevent denial-of-service attacks,
  users can only remove keys they themselves have added, and a key is
  only really removed after all users who added it have removed it.

Trying to shoehorn these semantics into the keyrings syscalls would be
very difficult, whereas the ioctls make things much easier.

However, to reuse code the implementation still uses the keyrings
service internally.  Thus we get lockless RCU-mode key lookups without
having to re-implement it, and the keys automatically show up in
/proc/keys for debugging purposes.

References:

    [1] https://github.com/google/fscrypt
    [2] https://goo.gl/55cCrI#heading=h.vf09isp98isb
    [3] https://github.com/google/fscrypt/issues/111#issuecomment-444347939
    [4] https://github.com/google/fscrypt/issues/116
    [5] https://bugs.launchpad.net/ubuntu/+source/fscrypt/+bug/1770715
    [6] https://github.com/google/fscrypt/issues/128
    [7] https://askubuntu.com/questions/1130306/cannot-run-docker-on-an-encrypted-filesystem

Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>

fscrypt: rename keyinfo.c to keysetup.c

Rename keyinfo.c to keysetup.c since this better describes what the file
does (sets up the key), and it matches the new file keysetup_v1.c.

Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>

fscrypt: move v1 policy key setup to keysetup_v1.c

In preparation for introducing v2 encryption policies which will find
and derive encryption keys differently from the current v1 encryption
policies, move the v1 policy-specific key setup code from keyinfo.c into
keysetup_v1.c.

Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>

fscrypt: refactor key setup code in preparation for v2 policies

Do some more refactoring of the key setup code, in preparation for
introducing a filesystem-level keyring and v2 encryption policies:

- Now that ci_inode exists, don't pass around the inode unnecessarily.

- Define a function setup_file_encryption_key() which handles the crypto
  key setup given an under-construction fscrypt_info.  Don't pass the
  fscrypt_context, since everything is in the fscrypt_info.
  [This will be extended for v2 policies and the fs-level keyring.]

- Define a function fscrypt_set_derived_key() which sets the per-file
  key, without depending on anything specific to v1 policies.
  [This will also be used for v2 policies.]

- Define a function fscrypt_setup_v1_file_key() which takes the raw
  master key, thus separating finding the key from using it.
  [This will also be used if the key is found in the fs-level keyring.]

Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>

fscrypt: rename fscrypt_master_key to fscrypt_direct_key

In preparation for introducing a filesystem-level keyring which will
contain fscrypt master keys, rename the existing 'struct
fscrypt_master_key' to 'struct fscrypt_direct_key'. This is the
structure in the existing table of master keys that's maintained to
deduplicate the crypto transforms for v1 DIRECT_KEY policies.

I've chosen to keep this table as-is rather than make it automagically
add/remove the keys to/from the filesystem-level keyring, since that
would add a lot of extra complexity to the filesystem-level keyring.

Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>

fscrypt: add ->ci_inode to fscrypt_info

Add an inode back-pointer to 'struct fscrypt_info', such that
inode->i_crypt_info->ci_inode == inode.

This will be useful for:

1. Evicting the inodes when a fscrypt key is removed, since we'll track
   the inodes using a given key by linking their fscrypt_infos together,
   rather than the inodes directly.  This avoids bloating 'struct inode'
   with a new list_head.

2. Simplifying the per-file key setup, since the inode pointer won't
   have to be passed around everywhere just in case something goes wrong
   and it's needed for fscrypt_warn().

Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>

fscrypt: use FSCRYPT_* definitions, not FS_*

Update fs/crypto/ to use the new names for the UAPI constants rather
than the old names, then make the old definitions conditional on
!__KERNEL__.

Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>

fscrypt: use FSCRYPT_ prefix for uapi constants

Prefix all filesystem encryption UAPI constants except the ioctl numbers
with "FSCRYPT_" rather than with "FS_". This namespaces the constants
more appropriately and makes it clear that they are related specifically
to the filesystem encryption feature, and to the 'fscrypt_*' structures.
With some of the old names like "FS_POLICY_FLAGS_VALID", it was not
immediately clear that the constant had anything to do with encryption.

This is also useful because we'll be adding more encryption-related
constants, e.g. for the policy version, and we'd otherwise have to
choose whether to use unclear names like FS_POLICY_V1 or inconsistent
names like FS_ENCRYPTION_POLICY_V1.

For source compatibility with existing userspace programs, keep the old
names defined as aliases to the new names.

Finally, as long as new names are being defined anyway, I skipped
defining new names for the fscrypt mode numbers that aren't actually
used: INVALID (0), AES_256_GCM (2), AES_256_CBC (3), SPECK128_256_XTS
(7), and SPECK128_256_CTS (8).

Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>

fs, fscrypt: move uapi definitions to new header <linux/fscrypt.h>

More fscrypt definitions are being added, and we shouldn't use a
disproportionate amount of space in <linux/fs.h> for fscrypt stuff.
So move the fscrypt definitions to a new header <linux/fscrypt.h>.

For source compatibility with existing userspace programs, <linux/fs.h>
still includes the new header.

Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Eric Biggers <ebiggers@google.com>

fscrypt: use ENOPKG when crypto API support missing

Return ENOPKG rather than ENOENT when trying to open a file that's
encrypted using algorithms not available in the kernel's crypto API.

This avoids an ambiguity, since ENOENT is also returned when the file
doesn't exist.

Note: this is the same approach I'm taking for fs-verity.

Signed-off-by: Eric Biggers <ebiggers@google.com>

fscrypt: improve warnings for missing crypto API support

Users of fscrypt with non-default algorithms will encounter an error
like the following if they fail to include the needed algorithms into
the crypto API when configuring the kernel (as per the documentation):

Error allocating 'adiantum(xchacha12,aes)' transform: -2

This requires that the user figure out what the "-2" error means.
Make it more friendly by printing a warning like the following instead:

Missing crypto API support for Adiantum (API name: "adiantum(xchacha12,aes)")

Also upgrade the log level for *other* errors to KERN_ERR.

Signed-off-by: Eric Biggers <ebiggers@google.com>

fscrypt: improve warning messages for unsupported encryption contexts

When fs/crypto/ encounters an inode with an invalid encryption context,
currently it prints a warning if the pair of encryption modes are
unrecognized, but it's silent if there are other problems such as
unsupported context size, format, or flags. To help people debug such
situations, add more warning messages.

Signed-off-by: Eric Biggers <ebiggers@google.com>

fscrypt: make fscrypt_msg() take inode instead of super_block

Most of the warning and error messages in fs/crypto/ are for situations
related to a specific inode, not merely to a super_block. So to make
things easier, make fscrypt_msg() take an inode rather than a
super_block, and make it print the inode number.

Note: This is the same approach I'm taking for fsverity_msg().

Signed-off-by: Eric Biggers <ebiggers@google.com>

fscrypt: clean up base64 encoding/decoding

Some minor cleanups for the code that base64 encodes and decodes
encrypted filenames and long name digests:

- Rename "digest_{encode,decode}()" => "base64_{encode,decode}()" since
they are used for filenames too, not just for long name digests.
- Replace 'while' loops with more conventional 'for' loops.
- Use 'u8' for binary data. Keep 'char' for string data.
- Fully constify the lookup table (pointer was not const).
- Improve comment.

No actual change in behavior.

Signed-off-by: Eric Biggers <ebiggers@google.com>

fscrypt: remove loadable module related code

Since commit 643fa9612bf1 ("fscrypt: remove filesystem specific build
config option"), fs/crypto/ can no longer be built as a loadable module.
Thus it no longer needs a module_exit function, nor a MODULE_LICENSE.
So remove them, and change module_init to late_initcall.

Reviewed-by: Chandan Rajendra <chandan@linux.ibm.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>

Linux 5.3-rc3

Merge tag 'tpmdd-next-20190805' of git://git.infradead.org/users/jjs/linux-tpmdd

Pull tpm fixes from Jarkko Sakkinen:
"Two bug fixes that did not make into my first pull request"

* tag 'tpmdd-next-20190805' of git://git.infradead.org/users/jjs/linux-tpmdd:
tpm: tpm_ibm_vtpm: Fix unallocated banks
tpm: Fix null pointer dereference on chip register error path

Merge tag 'mtd/fixes-for-5.3-rc3' of git://git./linux/kernel/git/mtd/linux

Pull MTD fixes from Miquel Raynal:
"NAND:

   - Fix Micron driver as some chips enable internal ECC correction
     during their discovery while they advertize they do not have any.

  Hyperbus:

   - Restrict the build to only ARM64 SoCs (and compile testing) which
     is what should have been done since the beginning.

   - Fix Kconfig issue by selection something instead of implying it"

* tag 'mtd/fixes-for-5.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
  mtd: hyperbus: Add hardware dependency to AM654 driver
  mtd: hyperbus: Kconfig: Fix HBMC_AM654 dependencies
  mtd: rawnand: micron: handle on-die "ECC-off" devices correctly

tpm: tpm_ibm_vtpm: Fix unallocated banks

The nr_allocated_banks and allocated banks are initialized as part of
tpm_chip_register. Currently, this is done as part of auto startup
function. However, some drivers, like the ibm vtpm driver, do not run
auto startup during initialization. This results in uninitialized memory
issue and causes a kernel panic during boot.

This patch moves the pcr allocation outside the auto startup function
into tpm_chip_register. This ensures that allocated banks are initialized
in any case.

Fixes: 879b589210a9 ("tpm: retrieve digest size of unknown algorithms with PCR read")
Reported-by: Michal Suchanek <msuchanek@suse.de>
Signed-off-by: Nayna Jain <nayna@linux.ibm.com>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Tested-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
Tested-by: Michal Suchánek <msuchanek@suse.de>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>

tpm: Fix null pointer dereference on chip register error path

If clk_enable is not defined and chip initialization
is canceled code hits null dereference.

Easily reproducible with vTPM init fail:
swtpm chardev --tpmstate dir=nonexistent_dir --tpm2 --vtpm-proxy

BUG: kernel NULL pointer dereference, address: 00000000
...
Call Trace:
tpm_chip_start+0x9d/0xa0 [tpm]
tpm_chip_register+0x10/0x1a0 [tpm]
vtpm_proxy_work+0x11/0x30 [tpm_vtpm_proxy]
process_one_work+0x214/0x5a0
worker_thread+0x134/0x3e0
? process_one_work+0x5a0/0x5a0
kthread+0xd4/0x100
? process_one_work+0x5a0/0x5a0
? kthread_park+0x90/0x90
ret_from_fork+0x19/0x24

Fixes: 719b7d81f204 ("tpm: introduce tpm_chip_start() and tpm_chip_stop()")
Cc: stable@vger.kernel.org # v5.1+
Signed-off-by: Milan Broz <gmazyland@gmail.com>
Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>

Merge tag 'powerpc-5.3-3' of git://git./linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
"Some more powerpc fixes for 5.3:

   - Wire up the new clone3 syscall.

   - A fix for the PAPR SCM nvdimm driver, to fix a crash when firmware
     gives us a device that's attached to a non-online NUMA node.

   - A fix for a boot failure on 32-bit with KASAN enabled.

   - Three fixes for implicit fall through warnings, some of which are
     errors for us due to -Werror.

  Thanks to: Aneesh Kumar K.V, Christophe Leroy, Kees Cook, Santosh
  Sivaraj, Stephen Rothwell"

* tag 'powerpc-5.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/kasan: fix early boot failure on PPC32
  drivers/macintosh/smu.c: Mark expected switch fall-through
  powerpc/spe: Mark expected switch fall-throughs
  powerpc/nvdimm: Pick nearby online node if the device node is not online
  powerpc/kvm: Fall through switch case explicitly
  powerpc: Wire up clone3 syscall

MAINTAINERS: Add Geert as Renesas SoC Co-Maintainer

At the end of the v5.3 upstream kernel development cycle, Simon will be
stepping down from his role as Renesas SoC maintainer. Starting with
the v5.4 development cycle, Geert is taking over this role.

Add Geert as a co-maintainer, and add his git repository and branch.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'kbuild-fixes-v5.3-2' of git://git./linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild fixes from Masahiro Yamada:

- detect missing missing "WITH Linux-syscall-note" for uapi headers

- fix needless rebuild when using Clang

- fix false-positive cc-option in Kconfig when using Clang

- avoid including corrupted .*.cmd files in the modpost stage

- fix warning of 'make vmlinux'

- fix {m,n,x,g}config to not generate the broken .config on the second
   save operation.

- some trivial Makefile fixes

* tag 'kbuild-fixes-v5.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kconfig: Clear "written" flag to avoid data loss
  kbuild: Check for unknown options with cc-option usage in Kconfig and clang
  lib/raid6: fix unnecessary rebuild of vpermxor*.c
  kbuild: modpost: do not parse unnecessary rules for vmlinux modpost
  kbuild: modpost: remove unnecessary dependency for __modpost
  kbuild: modpost: handle KBUILD_EXTRA_SYMBOLS only for external modules
  kbuild: modpost: include .*.cmd files only when targets exist
  kbuild: initialize CLANG_FLAGS correctly in the top Makefile
  kbuild: detect missing "WITH Linux-syscall-note" for uapi headers

Merge tag 'safesetid-maintainers-correction-5.3-rc2' of git://github.com/micah-morton/linux

Pull SafeSetID maintainer update from Micah Morton:
"Add entry in MAINTAINERS file for SafeSetID LSM"

* tag 'safesetid-maintainers-correction-5.3-rc2' of git://github.com/micah-morton/linux:
Add entry in MAINTAINERS file for SafeSetID LSM

kconfig: Clear "written" flag to avoid data loss

Prior to this commit, starting nconfig, xconfig or gconfig, and saving
the .config file more than once caused data loss, where a .config file
that contained only comments would be written to disk starting from the
second save operation.

This bug manifests itself because the SYMBOL_WRITTEN flag is never
cleared after the first call to conf_write, and subsequent calls to
conf_write then skip all of the configuration symbols due to the
SYMBOL_WRITTEN flag being set.

This commit resolves this issue by clearing the SYMBOL_WRITTEN flag
from all symbols before conf_write returns.

Fixes: 8e2442a5f86e ("kconfig: fix missing choice values in auto.conf")
Cc: linux-stable <stable@vger.kernel.org> # 4.19+
Signed-off-by: M. Vefa Bicakci <m.v.b@runbox.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>

Merge tag 'xtensa-20190803' of git://github.com/jcmvbkbc/linux-xtensa

Pull Xtensa fix from Max Filippov:
"Fix build for xtensa cores with coprocessors that was broken by
entry/return abstraction patch"

* tag 'xtensa-20190803' of git://github.com/jcmvbkbc/linux-xtensa:
xtensa: fix build for cores with coprocessors

Merge branch 'i2c/for-current-fixed' of git://git./linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:
"A set of driver fixes for the I2C subsystem"

* 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: s3c2410: Mark expected switch fall-through
  i2c: at91: fix clk_offset for sama5d2
  i2c: at91: disable TXRDY interrupt after sending data
  i2c: iproc: Fix i2c master read more than 63 bytes
  eeprom: at24: make spd world-readable again

Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull perf tooling fixes from Thomas Gleixner:
"A set of updates for perf tools and documentation:

  perf header:
    - Prevent a division by zero
    - Deal with an uninitialized warning proper

  libbpf:
    - Fix the missiong __WORDSIZE definition for musl & al

  UAPI headers:
    - Synchronize kernel headers

  Documentation:
    - Fix the memory units for perf.data size"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  libbpf: fix missing __WORDSIZE definition
  perf tools: Fix perf.data documentation units for memory size
  perf header: Fix use of unitialized value warning
  perf header: Fix divide by zero error if f_header.attr_size==0
  tools headers UAPI: Sync if_link.h with the kernel
  tools headers UAPI: Sync sched.h with the kernel
  tools headers UAPI: Sync usbdevice_fs.h with the kernels to get new ioctl
  tools perf beauty: Fix usbdevfs_ioctl table generator to handle _IOC()
  tools headers UAPI: Update tools's copy of drm.h headers
  tools headers UAPI: Update tools's copy of mman.h headers
  tools headers UAPI: Update tools's copy of kvm.h headers
  tools include UAPI: Sync x86's syscalls_64.tbl and generic unistd.h to pick up clone3 and pidfd_open

Merge branch 'timers-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull vdso timer fixes from Thomas Gleixner:
"A series of commits to deal with the regression caused by the generic
  VDSO implementation.

  The usage of clock_gettime64() for 32bit compat fallback syscalls
  caused seccomp filters to kill innocent processes because they only
  allow clock_gettime().

  Handle the compat syscalls with clock_gettime() as before, which is
  not a functional problem for the VDSO as the legacy compat application
  interface is not y2038 safe anyway. It's just extra fallback code
  which needs to be implemented on every architecture.

  It's opt in for now so that it does not break the compile of already
  converted architectures in linux-next. Once these are fixed, the
  #ifdeffery goes away.

  So much for trying to be smart and reuse code..."

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  arm64: compat: vdso: Use legacy syscalls as fallback
  x86/vdso/32: Use 32bit syscall fallback
  lib/vdso/32: Provide legacy syscall fallbacks
  lib/vdso: Move fallback invocation to the callers
  lib/vdso/32: Remove inconsistent NULL pointer checks

Merge branch 'irq-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull irq fixes from Thomas Gleixner:
"A small bunch of fixes from the irqchip department:

   - Fix a couple of UAF on error paths (RZA1, GICv3 ITS)

   - Fix iMX GPCv2 trigger setting

   - Add missing of_node_put() on error path in MBIGEN

   - Add another bunch of /* fall-through */ to silence warnings"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/renesas-rza1: Fix an use-after-free in rza1_irqc_probe()
  irqchip/irq-imx-gpcv2: Forward irq type to parent
  irqchip/irq-mbigen: Add of_node_put() before return
  irqchip/gic-v3-its: Free unused vpt_page when alloc vpe table fail
  irqchip/gic-v3: Mark expected switch fall-through

Merge tag 'xfs-5.3-fixes-1' of git://git./fs/xfs/xfs-linux

Pull xfs fixes from Darrick Wong:

- Avoid leaking kernel stack contents to userspace

- Fix a potential null pointer dereference in the dabtree scrub code

* tag 'xfs-5.3-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: Fix possible null-pointer dereferences in xchk_da_btree_block_check_sibling()
xfs: fix stack contents leakage in the v1 inumber ioctls

Merge branch 'akpm' (patches from Andrew)

Merge misc fixes from Andrew Morton:
"17 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  drivers/acpi/scan.c: document why we don't need the device_hotplug_lock
  memremap: move from kernel/ to mm/
  lib/test_meminit.c: use GFP_ATOMIC in RCU critical section
  asm-generic: fix -Wtype-limits compiler warnings
  cgroup: kselftest: relax fs_spec checks
  mm/memory_hotplug.c: remove unneeded return for void function
  mm/migrate.c: initialize pud_entry in migrate_vma()
  coredump: split pipe command whitespace before expanding template
  page flags: prioritize kasan bits over last-cpuid
  ubsan: build ubsan.c more conservatively
  kasan: remove clang version check for KASAN_STACK
  mm: compaction: avoid 100% CPU usage during compaction when a task is killed
  mm: migrate: fix reference check race between __find_get_block() and migration
  mm: vmscan: check if mem cgroup is disabled or not before calling memcg slab shrinker
  ocfs2: remove set but not used variable 'last_hash'
  Revert "kmemleak: allow to coexist with fault injection"
  kernel/signal.c: fix a kernel-doc markup

Merge tag 'riscv/for-v5.3-rc3' of git://git./linux/kernel/git/riscv/linux

Pull RISC-V fixes from Paul Walmsley:
"Three minor RISC-V-related changes for v5.3-rc3:

   - Add build ID to VDSO builds to avoid a double-free in perf when
     libelf isn't used

   - Align the RV64 defconfig to the output of "make savedefconfig" so
     subsequent defconfig patches don't get out of hand

   - Drop a superfluous DT property from the FU540 SoC DT data (since it
     must be already set in board data that includes it)"

* tag 'riscv/for-v5.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: defconfig: align RV64 defconfig to the output of "make savedefconfig"
  riscv: dts: fu540-c000: drop "timebase-frequency"
  riscv: Fix perf record without libelf support

drivers/acpi/scan.c: document why we don't need the device_hotplug_lock

Let's document why the lock is not needed in acpi_scan_init(), right now
this is not really obvious.

[akpm@linux-foundation.org: fix tpyo]
Link: http://lkml.kernel.org/r/20190731135306.31524-1-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

memremap: move from kernel/ to mm/

memremap.c implements MM functionality for ZONE_DEVICE, so it really
should be in the mm/ directory, not the kernel/ one.

Link: http://lkml.kernel.org/r/20190722094143.18387-1-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

lib/test_meminit.c: use GFP_ATOMIC in RCU critical section

kmalloc() shouldn't sleep while in RCU critical section, therefore use
GFP_ATOMIC instead of GFP_KERNEL.

The bug was spotted by the 0day kernel testing robot.

Link: http://lkml.kernel.org/r/20190725121703.210874-1-glider@google.com
Fixes: 7e659650cbda ("lib: introduce test_meminit module")
Signed-off-by: Alexander Potapenko <glider@google.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

asm-generic: fix -Wtype-limits compiler warnings

Commit d66acc39c7ce ("bitops: Optimise get_order()") introduced a
compilation warning because "rx_frag_size" is an "ushort" while
PAGE_SHIFT here is 16.

The commit changed the get_order() to be a multi-line macro where
compilers insist to check all statements in the macro even when
__builtin_constant_p(rx_frag_size) will return false as "rx_frag_size"
is a module parameter.

In file included from ./arch/powerpc/include/asm/page_64.h:107,
                 from ./arch/powerpc/include/asm/page.h:242,
                 from ./arch/powerpc/include/asm/mmu.h:132,
                 from ./arch/powerpc/include/asm/lppaca.h:47,
                 from ./arch/powerpc/include/asm/paca.h:17,
                 from ./arch/powerpc/include/asm/current.h:13,
                 from ./include/linux/thread_info.h:21,
                 from ./arch/powerpc/include/asm/processor.h:39,
                 from ./include/linux/prefetch.h:15,
                 from drivers/net/ethernet/emulex/benet/be_main.c:14:
drivers/net/ethernet/emulex/benet/be_main.c: In function 'be_rx_cqs_create':
./include/asm-generic/getorder.h:54:9: warning: comparison is always
true due to limited range of data type [-Wtype-limits]
   (((n) < (1UL << PAGE_SHIFT)) ? 0 :  \
         ^
drivers/net/ethernet/emulex/benet/be_main.c:3138:33: note: in expansion
of macro 'get_order'
  adapter->big_page_size = (1 << get_order(rx_frag_size)) * PAGE_SIZE;
                                 ^~~~~~~~~

Fix it by moving all of this multi-line macro into a proper function,
and killing __get_order() off.

[akpm@linux-foundation.org: remove __get_order() altogether]
[cai@lca.pw: v2]
Link: http://lkml.kernel.org/r/1564000166-31428-1-git-send-email-cai@lca.pw
Link: http://lkml.kernel.org/r/1563914986-26502-1-git-send-email-cai@lca.pw
Fixes: d66acc39c7ce ("bitops: Optimise get_order()")
Signed-off-by: Qian Cai <cai@lca.pw>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David Howells <dhowells@redhat.com>
Cc: Jakub Jelinek <jakub@redhat.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Bill Wendling <morbo@google.com>
Cc: James Y Knight <jyknight@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

cgroup: kselftest: relax fs_spec checks

On my laptop most memcg kselftests were being skipped because it claimed
cgroup v2 hierarchy wasn't mounted, but this isn't correct.  Instead, it
seems current systemd HEAD mounts it with the name "cgroup2" instead of
"cgroup":

    % grep cgroup /proc/mounts
    cgroup2 /sys/fs/cgroup cgroup2 rw,nosuid,nodev,noexec,relatime,nsdelegate 0 0

I can't think of a reason to need to check fs_spec explicitly
since it's arbitrary, so we can just rely on fs_vfstype.

After these changes, `make TARGETS=cgroup kselftest` actually runs the
cgroup v2 tests in more cases.

Link: http://lkml.kernel.org/r/20190723210737.GA487@chrisdown.name
Signed-off-by: Chris Down <chris@chrisdown.name>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/memory_hotplug.c: remove unneeded return for void function

return is unneeded in void function

Link: http://lkml.kernel.org/r/20190723130814.21826-1-houweitaoo@gmail.com
Signed-off-by: Weitao Hou <houweitaoo@gmail.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm/migrate.c: initialize pud_entry in migrate_vma()

When CONFIG_MIGRATE_VMA_HELPER is enabled, migrate_vma() calls
migrate_vma_collect() which initializes a struct mm_walk but didn't
initialize mm_walk.pud_entry. (Found by code inspection) Use a C
structure initialization to make sure it is set to NULL.

Link: http://lkml.kernel.org/r/20190719233225.12243-1-rcampbell@nvidia.com
Fixes: 8763cb45ab967 ("mm/migrate: new memory migration helper for use with device memory")
Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

coredump: split pipe command whitespace before expanding template

Save the offsets of the start of each argument to avoid having to update
pointers to each argument after every corename krealloc and to avoid
having to duplicate the memory for the dump command.

Executable names containing spaces were previously being expanded from
%e or %E and then split in the middle of the filename.  This is
incorrect behaviour since an argument list can represent arguments with
spaces.

The splitting could lead to extra arguments being passed to the core
dump handler that it might have interpreted as options or ignored
completely.

Core dump handlers that are not aware of this Linux kernel issue will be
using %e or %E without considering that it may be split and so they will
be vulnerable to processes with spaces in their names breaking their
argument list.  If their internals are otherwise well written, such as
if they are written in shell but quote arguments, they will work better
after this change than before.  If they are not well written, then there
is a slight chance of breakage depending on the details of the code but
they will already be fairly broken by the split filenames.

Core dump handlers that are aware of this Linux kernel issue will be
placing %e or %E as the last item in their core_pattern and then
aggregating all of the remaining arguments into one, separated by
spaces.  Alternatively they will be obtaining the filename via other
methods.  Both of these will be compatible with the new arrangement.

A side effect from this change is that unknown template types (for
example %z) result in an empty argument to the dump handler instead of
the argument being dropped.  This is a desired change as:

It is easier for dump handlers to process empty arguments than dropped
ones, especially if they are written in shell or don't pass each
template item with a preceding command-line option in order to
differentiate between individual template types.  Most core_patterns in
the wild do not use options so they can confuse different template types
(especially numeric ones) if an earlier one gets dropped in old kernels.
If the kernel introduces a new template type and a core_pattern uses it,
the core dump handler might not expect that the argument can be dropped
in old kernels.

For example, this can result in security issues when %d is dropped in
old kernels.  This happened with the corekeeper package in Debian and
resulted in the interface between corekeeper and Linux having to be
rewritten to use command-line options to differentiate between template
types.

The core_pattern for most core dump handlers is written by the handler
author who would generally not insert unknown template types so this
change should be compatible with all the core dump handlers that exist.

Link: http://lkml.kernel.org/r/20190528051142.24939-1-pabs3@bonedaddy.net
Fixes: 74aadce98605 ("core_pattern: allow passing of arguments to user mode helper when core_pattern is a pipe")
Signed-off-by: Paul Wise <pabs3@bonedaddy.net>
Reported-by: Jakub Wilk <jwilk@jwilk.net> [https://bugs.debian.org/924398]
Reported-by: Paul Wise <pabs3@bonedaddy.net> [https://lore.kernel.org/linux-fsdevel/c8b7ecb8508895bf4adb62a748e2ea2c71854597.camel@bonedaddy.net/]
Suggested-by: Jakub Wilk <jwilk@jwilk.net>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

page flags: prioritize kasan bits over last-cpuid

ARM64 randdconfig builds regularly run into a build error, especially
when NUMA_BALANCING and SPARSEMEM are enabled but not SPARSEMEM_VMEMMAP:

#error "KASAN: not enough bits in page flags for tag"

The last-cpuid bits are already contitional on the available space, so
the result of the calculation is a bit random on whether they were
already left out or not.

Adding the kasan tag bits before last-cpuid makes it much more likely to
end up with a successful build here, and should be reliable for
randconfig at least, as long as that does not randomize NR_CPUS or
NODES_SHIFT but uses the defaults.

In order for the modified check to not trigger in the x86 vdso32 code
where all constants are wrong (building with -m32), enclose all the
definitions with an #ifdef.

[arnd@arndb.de: build fix]
Link: http://lkml.kernel.org/r/CAK8P3a3Mno1SWTcuAOT0Wa9VS15pdU6EfnkxLbDpyS55yO04+g@mail.gmail.com
Link: http://lkml.kernel.org/r/20190722115520.3743282-1-arnd@arndb.de
Link: https://lore.kernel.org/lkml/20190618095347.3850490-1-arnd@arndb.de/
Fixes: 2813b9c02962 ("kasan, mm, arm64: tag non slab memory allocated via pagealloc")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ubsan: build ubsan.c more conservatively

objtool points out several conditions that it does not like, depending
on the combination with other configuration options and compiler
variants:

stack protector:
  lib/ubsan.o: warning: objtool: __ubsan_handle_type_mismatch()+0xbf: call to __stack_chk_fail() with UACCESS enabled
  lib/ubsan.o: warning: objtool: __ubsan_handle_type_mismatch_v1()+0xbe: call to __stack_chk_fail() with UACCESS enabled

stackleak plugin:
  lib/ubsan.o: warning: objtool: __ubsan_handle_type_mismatch()+0x4a: call to stackleak_track_stack() with UACCESS enabled
  lib/ubsan.o: warning: objtool: __ubsan_handle_type_mismatch_v1()+0x4a: call to stackleak_track_stack() with UACCESS enabled

kasan:
  lib/ubsan.o: warning: objtool: __ubsan_handle_type_mismatch()+0x25: call to memcpy() with UACCESS enabled
  lib/ubsan.o: warning: objtool: __ubsan_handle_type_mismatch_v1()+0x25: call to memcpy() with UACCESS enabled

The stackleak and kasan options just need to be disabled for this file
as we do for other files already.  For the stack protector, we already
attempt to disable it, but this fails on clang because the check is
mixed with the gcc specific -fno-conserve-stack option.  According to
Andrey Ryabinin, that option is not even needed, dropping it here fixes
the stackprotector issue.

Link: http://lkml.kernel.org/r/20190722125139.1335385-1-arnd@arndb.de
Link: https://lore.kernel.org/lkml/20190617123109.667090-1-arnd@arndb.de/t/
Link: https://lore.kernel.org/lkml/20190722091050.2188664-1-arnd@arndb.de/t/
Fixes: d08965a27e84 ("x86/uaccess, ubsan: Fix UBSAN vs. SMAP")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

kasan: remove clang version check for KASAN_STACK

asan-stack mode still uses dangerously large kernel stacks of tens of
kilobytes in some drivers, and it does not seem that anyone is working
on the clang bug.

Turn it off for all clang versions to prevent users from accidentally
enabling it once they update to clang-9, and to help automated build
testing with clang-9.

Link: https://bugs.llvm.org/show_bug.cgi?id=38809
Link: http://lkml.kernel.org/r/20190719200347.2596375-1-arnd@arndb.de
Fixes: 6baec880d7a5 ("kasan: turn off asan-stack for clang-8 and earlier")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: compaction: avoid 100% CPU usage during compaction when a task is killed

"howaboutsynergy" reported via kernel buzilla number 204165 that
compact_zone_order was consuming 100% CPU during a stress test for
prolonged periods of time.  Specifically the following command, which
should exit in 10 seconds, was taking an excessive time to finish while
the CPU was pegged at 100%.

  stress -m 220 --vm-bytes 1000000000 --timeout 10

Tracing indicated a pattern as follows

          stress-3923  [007]   519.106208: mm_compaction_isolate_migratepages: range=(0x70bb80 ~ 0x70bb80) nr_scanned=0 nr_taken=0
          stress-3923  [007]   519.106212: mm_compaction_isolate_migratepages: range=(0x70bb80 ~ 0x70bb80) nr_scanned=0 nr_taken=0
          stress-3923  [007]   519.106216: mm_compaction_isolate_migratepages: range=(0x70bb80 ~ 0x70bb80) nr_scanned=0 nr_taken=0
          stress-3923  [007]   519.106219: mm_compaction_isolate_migratepages: range=(0x70bb80 ~ 0x70bb80) nr_scanned=0 nr_taken=0
          stress-3923  [007]   519.106223: mm_compaction_isolate_migratepages: range=(0x70bb80 ~ 0x70bb80) nr_scanned=0 nr_taken=0
          stress-3923  [007]   519.106227: mm_compaction_isolate_migratepages: range=(0x70bb80 ~ 0x70bb80) nr_scanned=0 nr_taken=0
          stress-3923  [007]   519.106231: mm_compaction_isolate_migratepages: range=(0x70bb80 ~ 0x70bb80) nr_scanned=0 nr_taken=0
          stress-3923  [007]   519.106235: mm_compaction_isolate_migratepages: range=(0x70bb80 ~ 0x70bb80) nr_scanned=0 nr_taken=0
          stress-3923  [007]   519.106238: mm_compaction_isolate_migratepages: range=(0x70bb80 ~ 0x70bb80) nr_scanned=0 nr_taken=0
          stress-3923  [007]   519.106242: mm_compaction_isolate_migratepages: range=(0x70bb80 ~ 0x70bb80) nr_scanned=0 nr_taken=0

Note that compaction is entered in rapid succession while scanning and
isolating nothing.  The problem is that when a task that is compacting
receives a fatal signal, it retries indefinitely instead of exiting
while making no progress as a fatal signal is pending.

It's not easy to trigger this condition although enabling zswap helps on
the basis that the timing is altered.  A very small window has to be hit
for the problem to occur (signal delivered while compacting and
isolating a PFN for migration that is not aligned to SWAP_CLUSTER_MAX).

This was reproduced locally -- 16G single socket system, 8G swap, 30%
zswap configured, vm-bytes 22000000000 using Colin Kings stress-ng
implementation from github running in a loop until the problem hits).
Tracing recorded the problem occurring almost 200K times in a short
window.  With this patch, the problem hit 4 times but the task existed
normally instead of consuming CPU.

This problem has existed for some time but it was made worse by commit
cf66f0700c8f ("mm, compaction: do not consider a need to reschedule as
contention").  Before that commit, if the same condition was hit then
locks would be quickly contended and compaction would exit that way.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204165
Link: http://lkml.kernel.org/r/20190718085708.GE24383@techsingularity.net
Fixes: cf66f0700c8f ("mm, compaction: do not consider a need to reschedule as contention")
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org> [5.1+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: migrate: fix reference check race between __find_get_block() and migration

buffer_migrate_page_norefs() can race with bh users in the following
way:

CPU1                                    CPU2
buffer_migrate_page_norefs()
  buffer_migrate_lock_buffers()
  checks bh refs
  spin_unlock(&mapping->private_lock)
                                        __find_get_block()
                                          spin_lock(&mapping->private_lock)
                                          grab bh ref
                                          spin_unlock(&mapping->private_lock)
  move page                               do bh work

This can result in various issues like lost updates to buffers (i.e.
metadata corruption) or use after free issues for the old page.

This patch closes the race by holding mapping->private_lock while the
mapping is being moved to a new page.  Ordinarily, a reference can be
taken outside of the private_lock using the per-cpu BH LRU but the
references are checked and the LRU invalidated if necessary.  The
private_lock is held once the references are known so the buffer lookup
slow path will spin on the private_lock.  Between the page lock and
private_lock, it should be impossible for other references to be
acquired and updates to happen during the migration.

A user had reported data corruption issues on a distribution kernel with
a similar page migration implementation as mainline.  The data
corruption could not be reproduced with this patch applied.  A small
number of migration-intensive tests were run and no performance problems
were noted.

[mgorman@techsingularity.net: Changelog, removed tracing]
Link: http://lkml.kernel.org/r/20190718090238.GF24383@techsingularity.net
Fixes: 89cb0888ca14 "mm: migrate: provide buffer_migrate_page_norefs()"
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Cc: <stable@vger.kernel.org> [5.0+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: vmscan: check if mem cgroup is disabled or not before calling memcg slab shrinker

Shakeel Butt reported premature oom on kernel with
"cgroup_disable=memory" since mem_cgroup_is_root() returns false even
though memcg is actually NULL. The drop_caches is also broken.

It is because commit aeed1d325d42 ("mm/vmscan.c: generalize
shrink_slab() calls in shrink_node()") removed the !memcg check before
!mem_cgroup_is_root(). And, surprisingly root memcg is allocated even
though memory cgroup is disabled by kernel boot parameter.

Add mem_cgroup_disabled() check to make reclaimer work as expected.

Link: http://lkml.kernel.org/r/1563385526-20805-1-git-send-email-yang.shi@linux.alibaba.com
Fixes: aeed1d325d42 ("mm/vmscan.c: generalize shrink_slab() calls in shrink_node()")
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
Reported-by: Shakeel Butt <shakeelb@google.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Jan Hadrava <had@kam.mff.cuni.cz>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: <stable@vger.kernel.org> [4.19+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ocfs2: remove set but not used variable 'last_hash'

Fixes gcc '-Wunused-but-set-variable' warning:

fs/ocfs2/xattr.c: In function ocfs2_xattr_bucket_find:
fs/ocfs2/xattr.c:3828:6: warning: variable last_hash set but not used [-Wunused-but-set-variable]

It's never used and can be removed.

Link: http://lkml.kernel.org/r/20190716132110.34836-1-yuehaibing@huawei.com
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Revert "kmemleak: allow to coexist with fault injection"

When running ltp's oom test with kmemleak enabled, the below warning was
triggerred since kernel detects __GFP_NOFAIL & ~__GFP_DIRECT_RECLAIM is
passed in:

  WARNING: CPU: 105 PID: 2138 at mm/page_alloc.c:4608 __alloc_pages_nodemask+0x1c31/0x1d50
  Modules linked in: loop dax_pmem dax_pmem_core ip_tables x_tables xfs virtio_net net_failover virtio_blk failover ata_generic virtio_pci virtio_ring virtio libata
  CPU: 105 PID: 2138 Comm: oom01 Not tainted 5.2.0-next-20190710+ #7
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
  RIP: 0010:__alloc_pages_nodemask+0x1c31/0x1d50
  ...
   kmemleak_alloc+0x4e/0xb0
   kmem_cache_alloc+0x2a7/0x3e0
   mempool_alloc_slab+0x2d/0x40
   mempool_alloc+0x118/0x2b0
   bio_alloc_bioset+0x19d/0x350
   get_swap_bio+0x80/0x230
   __swap_writepage+0x5ff/0xb20

The mempool_alloc_slab() clears __GFP_DIRECT_RECLAIM, however kmemleak
has __GFP_NOFAIL set all the time due to d9570ee3bd1d4f2 ("kmemleak:
allow to coexist with fault injection").  But, it doesn't make any sense
to have __GFP_NOFAIL and ~__GFP_DIRECT_RECLAIM specified at the same
time.

According to the discussion on the mailing list, the commit should be
reverted for short term solution.  Catalin Marinas would follow up with
a better solution for longer term.

The failure rate of kmemleak metadata allocation may increase in some
circumstances, but this should be expected side effect.

Link: http://lkml.kernel.org/r/1563299431-111710-1-git-send-email-yang.shi@linux.alibaba.com
Fixes: d9570ee3bd1d4f2 ("kmemleak: allow to coexist with fault injection")
Signed-off-by: Yang Shi <yang.shi@linux.alibaba.com>
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Qian Cai <cai@lca.pw>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

kernel/signal.c: fix a kernel-doc markup

The kernel-doc parser doesn't handle expressions with %foo*. Instead,
when an asterisk should be part of a constant, it uses an alternative
notation: `foo*`.

Link: http://lkml.kernel.org/r/7f18c2e0b5e39e6b7eb55ddeb043b8b260b49f2d.1563361575.git.mchehab+samsung@kernel.org
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Cc: Deepa Dinamani <deepa.kernel@gmail.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'drm-fixes-2019-08-02-1' of git://anongit.freedesktop.org/drm/drm

Pull more drm fixes from Daniel Vetter:
"Dave sends his pull, everyone realizes they've been asleep at the
  wheel and hits send on their own pulls :-/

  Normally I'd just ignore these all because w/e for me and Dave. But
  this time around the latecomers also included drm-intel-fixes, which
  failed to send out a -fixes pull thus far for this release (screwed up
  vacation coverage, despite that 2/3 maintainers were around ... they
  all look appropriately guilty), and that really is overdue to get
  landed.

  And since I had to do a pull request anyway I pulled the other two
  late ones too.

  intel fixes (didn't have any ever since the main merge window pull):
   - gvt fixes (2 cc: stable)
   - fix gpu reset vs mm-shrinker vs wakeup fun (needed a few patches)
   - two gem locking fixes (one cc: stable)
   - pile of misc fixes all over with minor impact, 6 cc: stable, others
     from this window

  exynos:
   - misc minor fixes

  misc:
   - some build/Kconfig fixes
   - regression fix for vm scalability perf test which seems to mostly
     exercise dmesg/console logging ...
   - the vgem cache flush fix for arm64 broke the world on x86, so
     that's reverted again

* tag 'drm-fixes-2019-08-02-1' of git://anongit.freedesktop.org/drm/drm: (42 commits)
  Revert "drm/vgem: fix cache synchronization on arm/arm64"
  drm/exynos: fix missing decrement of retry counter
  drm/exynos: add CONFIG_MMU dependency
  drm/exynos: remove redundant assignment to pointer 'node'
  drm/exynos: using dev_get_drvdata directly
  drm/bochs: Use shadow buffer for bochs framebuffer console
  drm/fb-helper: Instanciate shadow FB if configured in device's mode_config
  drm/fb-helper: Map DRM client buffer only when required
  drm/client: Support unmapping of DRM client buffers
  drm/i915: Only recover active engines
  drm/i915: Add a wakeref getter for iff the wakeref is already active
  drm/i915: Lift intel_engines_resume() to callers
  drm/vgem: fix cache synchronization on arm/arm64
  drm/i810: Use CONFIG_PREEMPTION
  drm/bridge: tc358764: Fix build error
  drm/bridge: lvds-encoder: Fix build error while CONFIG_DRM_KMS_HELPER=m
  drm/i915/gvt: Adding ppgtt to GVT GEM context after shadow pdps settled.
  drm/i915/gvt: grab runtime pm first for forcewake use
  drm/i915/gvt: fix incorrect cache entry for guest page mapping
  drm/i915/gvt: Checking workload's gma earlier
  ...

Merge tag 'selinux-pr-20190801' of git://git./linux/kernel/git/pcmoore/selinux

Pull selinux fix from Paul Moore:
"One more small fix for a potential memory leak in an error path"

* tag 'selinux-pr-20190801' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: fix memory leak in policydb_init()

mtd: hyperbus: Add hardware dependency to AM654 driver

The hbmc-am654 driver is for the TI AM654, which is an ARM64 SoC, so
don't propose this driver on other architectures unless
build-testing.

Fixes: b07079f1642c ("mtd: hyperbus: Add driver for TI's HyperBus memory controller")
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Cc: Vignesh Raghavendra <vigneshr@ti.com>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>

mtd: hyperbus: Kconfig: Fix HBMC_AM654 dependencies

On x86_64, when CONFIG_OF is not disabled:

WARNING: unmet direct dependencies detected for MUX_MMIO
  Depends on [n]: MULTIPLEXER [=y] && (OF [=n] || COMPILE_TEST [=n])
  Selected by [y]:
  - HBMC_AM654 [=y] && MTD [=y] && MTD_HYPERBUS [=y]

due to
config HBMC_AM654
tristate "HyperBus controller driver for AM65x SoC"
select MULTIPLEXER
select MUX_MMIO

Fix this by making HBMC_AM654 imply MUX_MMIO instead of select so
that dependencies are taken care of. MUX_MMIO is optional for
functioning of driver.

Fixes: b07079f1642c ("mtd: hyperbus: Add driver for TI's HyperBus memory controller")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>

mtd: rawnand: micron: handle on-die "ECC-off" devices correctly

Some devices are not supposed to support on-die ECC but experience
shows that internal ECC machinery can actually be enabled through the
"SET FEATURE (EFh)" command, even if a read of the "READ ID Parameter
Tables" returns that it is not.

Currently, the driver checks the "READ ID Parameter" field directly
after having enabled the feature. If the check fails it returns
immediately but leaves the ECC on. When using buggy chips like
MT29F2G08ABAGA and MT29F2G08ABBGA, all future read/program cycles will
go through the on-die ECC, confusing the host controller which is
supposed to be the one handling correction.

To address this in a common way we need to turn off the on-die ECC
directly after reading the "READ ID Parameter" and before checking the
"ECC status".

Cc: stable@vger.kernel.org
Fixes: dbc44edbf833 ("mtd: rawnand: micron: Fix on-die ECC detection logic")
Signed-off-by: Marco Felsch <m.felsch@pengutronix.de>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>

Merge tag 'for-linus-5.3a-rc3-tag' of git://git./linux/kernel/git/xen/tip

Pull xen fixes from Juergen Gross:

- a small cleanup

- a fix for a build error on ARM with some configs

- a fix of a patch for the Xen gntdev driver

- three patches for fixing a potential problem in the swiotlb-xen
   driver which Konrad was fine with me carrying them through the Xen
   tree

* tag 'for-linus-5.3a-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/swiotlb: remember having called xen_create_contiguous_region()
  xen/swiotlb: simplify range_straddles_page_boundary()
  xen/swiotlb: fix condition for calling xen_destroy_contiguous_region()
  xen: avoid link error on ARM
  xen/gntdev.c: Replace vm_map_pages() with vm_map_pages_zero()
  xen/pciback: remove set but not used variable 'old_state'

Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:

- Update the compat layer to allow single-byte watchpoints on all
   addresses (similar to the native support)

- arm_pmu: fix the restoration of the counters on the
   CPU_PM_ENTER_FAILED path

- Fix build regression with vDSO and Makefile not stripping
   CROSS_COMPILE_COMPAT

- Fix the CTR_EL0 (cache type register) sanitisation on heterogeneous
   machines (e.g. big.LITTLE)

- Fix the interrupt controller priority mask value when pseudo-NMIs are
   enabled

- arm64 kprobes fixes: recovering of the PSTATE.D flag in the
   single-step exception handler, NOKPROBE annotations for
   unwind_frame() and walk_stackframe(), remove unneeded
   rcu_read_lock/unlock from debug handlers

- Several gcc fall-through warnings

- Unused variable warnings

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: Make debug exception handlers visible from RCU
  arm64: kprobes: Recover pstate.D in single-step exception handler
  arm64/mm: fix variable 'tag' set but not used
  arm64/mm: fix variable 'pud' set but not used
  arm64: Remove unneeded rcu_read_lock from debug handlers
  arm64: unwind: Prohibit probing on return_address()
  arm64: Lower priority mask for GIC_PRIO_IRQON
  arm64/efi: fix variable 'si' set but not used
  arm64: cpufeature: Fix feature comparison for CTR_EL0.{CWG,ERG}
  arm64: vdso: Fix Makefile regression
  arm64: module: Mark expected switch fall-through
  arm64: smp: Mark expected switch fall-through
  arm64: hw_breakpoint: Fix warnings about implicit fallthrough
  drivers/perf: arm_pmu: Fix failure path in PM notifier
  arm64: compat: Allow single-byte watchpoints on all addresses

Merge branch 'parisc-5.3-4' of git://git./linux/kernel/git/deller/parisc-linux

Pull parisc fixes from Helge Deller:
"A few small fixes for the parisc architecture:

   - Fix fall-through warnings in parisc math emu code

   - Fix vmlinuz linking failure with debug-enabled kernels

   - Fix a race condition in kernel live-patching code

   - Add missing archclean Makefile target & defconfig adjustments"

* 'parisc-5.3-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Add archclean Makefile target
  parisc: Strip debug info from kernel before creating compressed vmlinuz
  parisc: Fix build of compressed kernel even with debug enabled
  parisc: fix race condition in patching code
  parisc: rename default_defconfig to defconfig
  parisc: Fix fall-through warnings in fpudispatch.c
  parisc: Mark expected switch fall-throughs in fault.c

Merge tag 's390-5.3-4' of git://git./linux/kernel/git/s390/linux

Pull s390 updates from Vasily Gorbik:

- Default configs updates

- Minor qdio cleanup

- Sparse warnings fixes

- Implicit-fallthrough warnings fixes

* tag 's390-5.3-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/zcrypt: adjust switch fall through comments for -Wimplicit-fallthrough
  vfio-ccw: make vfio_ccw_async_region_ops static
  s390/3215: add switch fall through comment for -Wimplicit-fallthrough
  s390/tape: add fallthrough annotations
  s390/mm: add fallthrough annotations
  s390/mm: make gmap_test_and_clear_dirty_pmd static
  s390/kexec: add missing include to machine_kexec_reloc.c
  s390/perf: make cf_diag_csd static
  s390/lib: add missing include
  s390/boot: add missing declarations and includes
  s390: update configs
  s390: clean up qdio.h

Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
"Seven fixes to four drivers with no core changes.

  The mpt3sas one is theoretical until we get a CPU that goes up to 64
  bits physical, the qla2xxx one fixes an oops in a driver
  initialization error leg and the others are mostly cosmetic"

[ The fcoe patches may be worth highlighting - they may be "just"
  cleanups, but they simplify and fix the odd fc_rport_priv structure
  handling rules so that the new gcc-9 warnings about memset crossing
  structure boundaries are gone.

  The old code was hard for humans to understand too, and really
  confused the compiler sanity checks  - Linus ]

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: qla2xxx: Fix possible fcport null-pointer dereferences
  scsi: mpt3sas: Use 63-bit DMA addressing on SAS35 HBA
  scsi: hpsa: remove printing internal cdb on tag collision
  scsi: hpsa: correct scsi command status issue after reset
  scsi: fcoe: pass in fcoe_rport structure instead of fc_rport_priv
  scsi: fcoe: Embed fc_rport_priv in fcoe_rport structure
  scsi: libfc: Whitespace cleanup in libfc.h

Merge tag 'for-linus-20190802' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
"Here's a small collection of fixes that should go into this series.
  This contains:

   - io_uring potential use-after-free fix (Jackie)

   - loop regression fix (Jan)

   - O_DIRECT fragmented bio regression fix (Damien)

   - Mark Denis as the new floppy maintainer (Denis)

   - ataflop switch fall-through annotation (Gustavo)

   - libata zpodd overflow fix (Kees)

   - libata ahci deferred probe fix (Miquel)

   - nbd invalidation BUG_ON() fix (Munehisa)

   - dasd endless loop fix (Stefan)"

* tag 'for-linus-20190802' of git://git.kernel.dk/linux-block:
  s390/dasd: fix endless loop after read unit address configuration
  block: Fix __blkdev_direct_IO() for bio fragments
  MAINTAINERS: floppy: take over maintainership
  nbd: replace kill_bdev() with __invalidate_device() again
  ata: libahci: do not complain in case of deferred probe
  io_uring: fix KASAN use after free in io_sq_wq_submit_work
  loop: Fix mount(2) failure due to race with LOOP_SET_FD
  libata: zpodd: Fix small read overflow in zpodd_get_mech_type()
  ataflop: Mark expected switch fall-through

Merge tag 'for-5.3/dm-fixes-1' of git://git./linux/kernel/git/device-mapper/linux-dm

Pull device mapper fixes from Mike Snitzer:
"Fix NULL pointer and various whitespace issues with DM's recent DAX
  code changes from commit in 5.3 merge"

* tag 'for-5.3/dm-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm table: fix various whitespace issues with recent DAX code
  dm table: fix dax_dev NULL dereference in device_synchronous()

Merge tag 'for-linus' of git://git./linux/kernel/git/rdma/rdma

Pull rdma fixes from Doug Ledford:
"Here's our second -rc pull request. Nothing particularly special in
  this one. The client removal deadlock fix is kindy tricky, but we had
  multiple eyes on it and no one could find a fault in it. A couple
  Spectre V1 fixes too. Otherwise, all just normal -rc fodder:

   - A couple Spectre V1 fixes (umad, hfi1)

   - Fix a tricky deadlock in the rdma core code with refcounting
     instead of locks (client removal patches)

   - Build errors (hns)

   - Fix a scheduling while atomic issue (mlx5)

   - Use after free fix (mad)

   - Fix error path return code (hns)

   - Null deref fix (siw_crypto_hash)

   - A few other misc. minor fixes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  RDMA/hns: Fix error return code in hns_roce_v1_rsv_lp_qp()
  RDMA/mlx5: Release locks during notifier unregister
  IB/hfi1: Fix Spectre v1 vulnerability
  IB/mad: Fix use-after-free in ib mad completion handling
  RDMA/restrack: Track driver QP types in resource tracker
  IB/mlx5: Fix MR registration flow to use UMR properly
  RDMA/devices: Remove the lock around remove_client_context
  RDMA/devices: Do not deadlock during client removal
  IB/core: Add mitigation for Spectre V1
  Do not dereference 'siw_crypto_shash' before checking
  RDMA/qedr: Fix the hca_type and hca_rev returned in device attributes
  RDMA/hns: Fix build error

Merge tag 'for-5.3-rc2-tag' of git://git./linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:

- tiny race window during 2 transactions aborting at the same time can
   accidentally lead to a commit

- regression fix, possible deadlock during fiemap

- fix for an old bug when incremental send can fail on a file that has
   been deduplicated in a special way

* tag 'for-5.3-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  Btrfs: fix deadlock between fiemap and transaction commits
  Btrfs: fix race leading to fs corruption after transaction abort
  Btrfs: fix incremental send failure after deduplication

Merge tag 'gfs2-v5.3-rc2.fixes' of git://git./linux/kernel/git/gfs2/linux-gfs2

Pull gfs2 fix from Andreas Gruenbacher:
"Fix gfs2 cluster coherency bug"

* tag 'gfs2-v5.3-rc2.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
gfs2: Inode dirtying fix

Merge tag 'pm-5.3-rc3' of git://git./linux/kernel/git/rafael/linux-pm

Pull power management fix from Rafael Wysocki:
"Fix recent regression affecting ACPI device power management"

* tag 'pm-5.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: PM: Fix regression in acpi_device_set_power()

Merge tag 'sound-5.3-rc3' of git://git./linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:

- A further fix for syzcaller issues with USB-audio, addressing NULL
   dereference that was introduced by the recent fix

- Avoid a long delay at boot with HD-audio when i915 module was built
   but not installed, found on some Debian systems

- A fix of small race window at PCM draining

* tag 'sound-5.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: usb-audio: Fix gpf in snd_usb_pipe_sanity_check
  ALSA: pcm: fix lost wakeup event scenarios in snd_pcm_drain
  ALSA: hda: Fix 1-minute detection delay when i915 module is not available

Merge tag 'drm-fixes-2019-08-02' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
"Thanks to Daniel for handling the email the last couple of weeks, flus
  and break-ins combined to derail me. Surprised nothing materialised
  today to take me out again.

  Just more amdgpu navi fixes, msm fixes and a single nouveau regression
  fix:

  amdgpu:
   - navi10 temperature and pstate fixes
   - vcn dynamic power management fix
   - CS ioctl error handling fix
   - debugfs info leak fix
   - amdkfd VegaM fix

  msm:
   - dma sync call fix
   - mdp5 dsi command mode fix
   - fall-through fixes
   - disabled GPU fix

  nouveau:
   - regression fix for displayport MST support"

* tag 'drm-fixes-2019-08-02' of git://anongit.freedesktop.org/drm/drm:
  drm/nouveau: Only release VCPI slots on mode changes
  drm: msm: Fix add_gpu_components
  drm/msm: Annotate intentional switch statement fall throughs
  drm/msm: add support for per-CRTC max_vblank_count on mdp5
  drm/msm: Use the correct dma_sync calls in msm_gem
  drm/amd/powerplay: correct UVD/VCE/VCN power status retrieval
  drm/amd/powerplay: correct Navi10 VCN powergate control (v2)
  drm/amd/powerplay: support VCN powergate status retrieval for SW SMU
  drm/amd/powerplay: support VCN powergate status retrieval on Raven
  drm/amd/powerplay: add new sensor type for VCN powergate status
  drm/amdgpu: fix a potential information leaking bug
  drm/amdgpu: fix error handling in amdgpu_cs_process_fence_dep
  drm/amd/powerplay: enable SW SMU reset functionality
  drm/amd/powerplay: fix null pointer dereference around dpm state relates
  drm/amdgpu/powerplay: use proper revision id for navi
  drm/amd/powerplay: fix temperature granularity error in smu11
  drm/amd/powerplay: add callback function of get_thermal_temperature_range
  drm/amdkfd: Fix byte align on VegaM

Merge tag 'clk-fixes-for-linus' of git://git./linux/kernel/git/clk/linux

Pull clk fixes from Stephen Boyd:
"A few fixes for code that came in during the merge window or that
  started getting exercised differently this time around:

   - Select regmap MMIO kconfig in spreadtrum driver to avoid compile
     errors

   - Complete kerneldoc on devm_clk_bulk_get_optional()

   - Register an essential clk earlier on mediatek mt8183 SoCs so the
     clocksource driver can use it

   - Fix divisor math in the at91 driver

   - Plug a race in Renesas reset control logic"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: renesas: cpg-mssr: Fix reset control race condition
  clk: sprd: Select REGMAP_MMIO to avoid compile errors
  clk: mediatek: mt8183: Register 13MHz clock earlier for clocksource
  clk: Add missing documentation of devm_clk_bulk_get_optional() argument
  clk: at91: generated: Truncate divisor to GENERATED_MAX_DIV + 1

Merge tag 'arm-swiotlb-5.3' of git://git.infradead.org/users/hch/dma-mapping

Pull arm swiotlb support from Christoph Hellwig:
"This fixes a cascade of regressions that originally started with the
  addition of the ia64 port, but only got fatal once we removed most
  uses of block layer bounce buffering in Linux 4.18.

  The reason is that while the original i386/PAE code that was the first
  architecture that supported > 4GB of memory without an iommu decided
  to leave bounce buffering to the subsystems, which in those days just
  mean block and networking as no one else consumed arbitrary userspace
  memory.

  Later with ia64, x86_64 and other ports we assumed that either an
  iommu or something that fakes it up ("software IOTLB" in beautiful
  Intel speak) is present and that subsystems can rely on that for
  dealing with addressing limitations in devices. Except that the ARM
  LPAE scheme that added larger physical address to 32-bit ARM did not
  follow that scheme and thus only worked by chance and only for block
  and networking I/O directly to highmem.

  Long story, short fix - add swiotlb support to arm when build for LPAE
  platforms, which actuallys turns out to be pretty trivial with the
  modern dma-direct / swiotlb code to fix the Linux 4.18-ish regression"

* tag 'arm-swiotlb-5.3' of git://git.infradead.org/users/hch/dma-mapping:
  arm: use swiotlb for bounce buffering on LPAE configs
  dma-mapping: check pfn validity in dma_common_{mmap,get_sgtable}

Merge tag 'dma-mapping-5.3-3' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-mapping regression fixes from Christoph Hellwig:
"Two related regression fixes for changes from this merge window to fix
  alignment issues introduced in the CMA allocation rework (Nicolin
  Chen)"

* tag 'dma-mapping-5.3-3' of git://git.infradead.org/users/hch/dma-mapping:
  dma-contiguous: page-align the size in dma_free_contiguous()
  dma-contiguous: do not overwrite align in dma_alloc_contiguous()

Merge tag 'exynos-drm-fixes-for-v5.3-rc3' of git://git./linux/kernel/git/daeinki/drm-exynos into drm-fixes

- Two cleanup patches
  . use dev_get_drvdata for readability instead of platform_get_drvdata
  . remove redundant assignment to node.
- Two fixup patches
  . fix undefined reference to 'vmf_insert_mixed' with NOMMU configuration.
  . fix potential infinite spin issue by decrementing 'retry' variable in
    scaler_reset function of exynos_drm_scaler.c

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Inki Dae <inki.dae@samsung.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1564734791-745-1-git-send-email-inki.dae@samsung.com

Revert "drm/vgem: fix cache synchronization on arm/arm64"

commit 7e9e5ead55be ("drm/vgem: fix cache synchronization on arm/arm64")
broke all of the !llc i915-vgem coherency tests in CI, and left the HW
very, very unhappy (which is even more scary).

Fixes: 7e9e5ead55be ("drm/vgem: fix cache synchronization on arm/arm64")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Rob Clark <robdclark@chromium.org>
Cc: Sean Paul <seanpaul@chromium.org>
Acked-by: Sean Paul <sean@poorly.run>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20190801124458.24949-1-chris@chris-wilson.co.uk

Merge tag 'drm-misc-fixes-2019-08-02' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

drm-misc-fixes for v5.3-rc3:
- Fix some build errors in drm/bridge.
- Do not build i810 on CONFIG_PREEMPTION.
- Fix cache sync on arm in vgem.
- Allow mapping fb in drm_client only when required, and use it to fix bochs fbdev.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/af0dc371-16e0-cee8-0d71-4824d44aa973@linux.intel.com

s390/zcrypt: adjust switch fall through comments for -Wimplicit-fallthrough

Silence the following warnings when built with -Wimplicit-fallthrough=3
enabled by default since 5.3-rc2:
In file included from ./include/linux/preempt.h:11,
                 from ./include/linux/spinlock.h:51,
                 from ./include/linux/mmzone.h:8,
                 from ./include/linux/gfp.h:6,
                 from ./include/linux/slab.h:15,
                 from drivers/s390/crypto/ap_queue.c:13:
drivers/s390/crypto/ap_queue.c: In function 'ap_sm_recv':
./include/linux/list.h:577:2: warning: this statement may fall through [-Wimplicit-fallthrough=]
  577 |  for (pos = list_first_entry(head, typeof(*pos), member); \
      |  ^~~
drivers/s390/crypto/ap_queue.c:147:3: note: in expansion of macro 'list_for_each_entry'
  147 |   list_for_each_entry(ap_msg, &aq->pendingq, list) {
      |   ^~~~~~~~~~~~~~~~~~~
drivers/s390/crypto/ap_queue.c:155:2: note: here
  155 |  case AP_RESPONSE_NO_PENDING_REPLY:
      |  ^~~~
drivers/s390/crypto/zcrypt_msgtype6.c: In function 'convert_response_ep11_xcrb':
drivers/s390/crypto/zcrypt_msgtype6.c:871:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
  871 |   if (msg->cprbx.cprb_ver_id == 0x04)
      |      ^
drivers/s390/crypto/zcrypt_msgtype6.c:874:2: note: here
  874 |  default: /* Unknown response type, this should NEVER EVER happen */
      |  ^~~~~~~
drivers/s390/crypto/zcrypt_msgtype6.c: In function 'convert_response_rng':
drivers/s390/crypto/zcrypt_msgtype6.c:901:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
  901 |   if (msg->cprbx.cprb_ver_id == 0x02)
      |      ^
drivers/s390/crypto/zcrypt_msgtype6.c:907:2: note: here
  907 |  default: /* Unknown response type, this should NEVER EVER happen */
      |  ^~~~~~~
drivers/s390/crypto/zcrypt_msgtype6.c: In function 'convert_response_xcrb':
drivers/s390/crypto/zcrypt_msgtype6.c:838:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
  838 |   if (msg->cprbx.cprb_ver_id == 0x02)
      |      ^
drivers/s390/crypto/zcrypt_msgtype6.c:844:2: note: here
  844 |  default: /* Unknown response type, this should NEVER EVER happen */
      |  ^~~~~~~
drivers/s390/crypto/zcrypt_msgtype6.c: In function 'convert_response_ica':
drivers/s390/crypto/zcrypt_msgtype6.c:801:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
  801 |   if (msg->cprbx.cprb_ver_id == 0x02)
      |      ^
drivers/s390/crypto/zcrypt_msgtype6.c:808:2: note: here
  808 |  default: /* Unknown response type, this should NEVER EVER happen */
      |  ^~~~~~~

Acked-by: Patrick Steuer <patrick.steuer@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>

arm64: Make debug exception handlers visible from RCU

Make debug exceptions visible from RCU so that synchronize_rcu()
correctly track the debug exception handler.

This also introduces sanity checks for user-mode exceptions as same
as x86's ist_enter()/ist_exit().

The debug exception can interrupt in idle task. For example, it warns
if we put a kprobe on a function called from idle task as below.
The warning message showed that the rcu_read_lock() caused this
problem. But actually, this means the RCU is lost the context which
is already in NMI/IRQ.

  /sys/kernel/debug/tracing # echo p default_idle_call >> kprobe_events
  /sys/kernel/debug/tracing # echo 1 > events/kprobes/enable
  /sys/kernel/debug/tracing # [  135.122237]
  [  135.125035] =============================
  [  135.125310] WARNING: suspicious RCU usage
  [  135.125581] 5.2.0-08445-g9187c508bdc7 #20 Not tainted
  [  135.125904] -----------------------------
  [  135.126205] include/linux/rcupdate.h:594 rcu_read_lock() used illegally while idle!
  [  135.126839]
  [  135.126839] other info that might help us debug this:
  [  135.126839]
  [  135.127410]
  [  135.127410] RCU used illegally from idle CPU!
  [  135.127410] rcu_scheduler_active = 2, debug_locks = 1
  [  135.128114] RCU used illegally from extended quiescent state!
  [  135.128555] 1 lock held by swapper/0/0:
  [  135.128944]  #0: (____ptrval____) (rcu_read_lock){....}, at: call_break_hook+0x0/0x178
  [  135.130499]
  [  135.130499] stack backtrace:
  [  135.131192] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.2.0-08445-g9187c508bdc7 #20
  [  135.131841] Hardware name: linux,dummy-virt (DT)
  [  135.132224] Call trace:
  [  135.132491]  dump_backtrace+0x0/0x140
  [  135.132806]  show_stack+0x24/0x30
  [  135.133133]  dump_stack+0xc4/0x10c
  [  135.133726]  lockdep_rcu_suspicious+0xf8/0x108
  [  135.134171]  call_break_hook+0x170/0x178
  [  135.134486]  brk_handler+0x28/0x68
  [  135.134792]  do_debug_exception+0x90/0x150
  [  135.135051]  el1_dbg+0x18/0x8c
  [  135.135260]  default_idle_call+0x0/0x44
  [  135.135516]  cpu_startup_entry+0x2c/0x30
  [  135.135815]  rest_init+0x1b0/0x280
  [  135.136044]  arch_call_rest_init+0x14/0x1c
  [  135.136305]  start_kernel+0x4d4/0x500
  [  135.136597]

So make debug exception visible to RCU can fix this warning.

Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Acked-by: Paul E. McKenney <paulmck@linux.ibm.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>

arm64: kprobes: Recover pstate.D in single-step exception handler

kprobes manipulates the interrupted PSTATE for single step, and
doesn't restore it. Thus, if we put a kprobe where the pstate.D
(debug) masked, the mask will be cleared after the kprobe hits.

Moreover, in the most complicated case, this can lead a kernel
crash with below message when a nested kprobe hits.

[ 152.118921] Unexpected kernel single-step exception at EL1

When the 1st kprobe hits, do_debug_exception() will be called.
At this point, debug exception (= pstate.D) must be masked (=1).
But if another kprobes hits before single-step of the first kprobe
(e.g. inside user pre_handler), it unmask the debug exception
(pstate.D = 0) and return.
Then, when the 1st kprobe setting up single-step, it saves current
DAIF, mask DAIF, enable single-step, and restore DAIF.
However, since "D" flag in DAIF is cleared by the 2nd kprobe, the
single-step exception happens soon after restoring DAIF.

This has been introduced by commit 7419333fa15e ("arm64: kprobe:
Always clear pstate.D in breakpoint exception handler")

To solve this issue, this stores all DAIF bits and restore it
after single stepping.

Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Fixes: 7419333fa15e ("arm64: kprobe: Always clear pstate.D in breakpoint exception handler")
Reviewed-by: James Morse <james.morse@arm.com>
Tested-by: James Morse <james.morse@arm.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>

Merge tag 'drm-intel-fixes-2019-08-02' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

drm/i915 fixes for v5.3-rc3:
- GVT fixes
- Fix TBT aux powerwell
- Fix PSR2 training pattern duration
- Fix memory leak in runtime wakeref tracking
- Fix ICL memory bandwidth issue preventing planes from being enabled
- Fix OA mux configuration delays for accurate performance data
- Fix VLV/CHV DP audio cdclk frequency requirements
- Fix register whitelisting to fix a number of GL & Vulkan CTS tests
- Fix ICL perf register offsets
- Fix Gen11 Sampler Prefetch workaround, impacting dEQP tests
- Fix various gen2 tracepoints
- A number of GEM locking fixes addressing lockdep issues
- Fix idle engine reset, recover only active engines
- Fix incorrect MCR programming

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87d0hnncgo.fsf@intel.com

drm/exynos: fix missing decrement of retry counter

Currently the retry counter is not being decremented, leading to a
potential infinite spin if the scalar_reads don't change state.

Addresses-Coverity: ("Infinite loop")
Fixes: 280e54c9f614 ("drm/exynos: scaler: Reset hardware before starting the operation")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>

drm/exynos: add CONFIG_MMU dependency

Compile-testing this driver on a NOMMU configuration shows a link failure:

drivers/gpu/drm/exynos/exynos_drm_gem.o: In function `exynos_drm_gem_fault':
exynos_drm_gem.c:(.text+0x484): undefined reference to `vmf_insert_mixed'

Add a CONFIG_MMU dependency to ensure we only enable this in configurations
that build correctly.

Many other drm drivers have the same dependency. It would be nice to
make this work in MMU-less configurations, but evidently nobody has
ever needed this so far.

Fixes: 156bdac99061 ("drm/exynos: trigger build of all modules")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Inki Dae <daeinki@gmail.com>

drm/exynos: remove redundant assignment to pointer 'node'

The pointer 'node' is being assigned with a value that is never
read and is re-assigned later. The assignment is redundant and
can be removed.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Inki Dae <daeinki@gmail.com>

drm/exynos: using dev_get_drvdata directly

Several drivers cast a struct device pointer to a struct
platform_device pointer only to then call platform_get_drvdata().
To improve readability, these constructs can be simplified
by using dev_get_drvdata() directly.

Signed-off-by: Fuqian Huang <huangfq.daxian@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>

s390/dasd: fix endless loop after read unit address configuration

After getting a storage server event that causes the DASD device driver
to update its unit address configuration during a device shutdown there is
the possibility of an endless loop in the device driver.

In the system log there will be ongoing DASD error messages with RC: -19.

The reason is that the loop starting the ruac request only terminates when
the retry counter is decreased to 0. But in the sleep_on function there are
early exit paths that do not decrease the retry counter.

Prevent an endless loop by handling those cases separately.

Remove the unnecessary do..while loop since the sleep_on function takes
care of retries by itself.

Fixes: 8e09f21574ea ("[S390] dasd: add hyper PAV support to DASD device driver, part 1")
Cc: stable@vger.kernel.org # 2.6.25+
Signed-off-by: Stefan Haberland <sth@linux.ibm.com>
Reviewed-by: Jan Hoeppner <hoeppner@linux.ibm.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

Merge tag 'msm-fixes-2019_08_01' of https://gitlab.freedesktop.org/drm/msm into drm-fixes

- Fix the dma_sync calls applied last week (Rob)
- Fix mdp5 dsi command mode (Brian)
- Squash fall through warnings (Jordan)
- Don't add disabled gpu nodes to the of device list (Jeffrey)

Cc: Jeffrey Hugo <jeffrey.l.hugo@gmail.com>
Cc: Jordan Crouse <jcrouse@codeaurora.org>
Cc: Brian Masney <masneyb@onstation.org>
Cc: Rob Clark <robdclark@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
# gpg: Signature made Fri 02 Aug 2019 05:54:27 AM AEST
# gpg: using RSA key 96F70DFDA84A070A
# gpg: Can't check signature: public key not found
From: Sean Paul <sean@poorly.run>
Link: https://patchwork.freedesktop.org/patch/msgid/20190801200439.GV104440@art_vandelay

drm/nouveau: Only release VCPI slots on mode changes

Looks like a regression got introduced into nv50_mstc_atomic_check()
that somehow didn't get found until now. If userspace changes
crtc_state->active to false but leaves the CRTC enabled, we end up
calling drm_dp_atomic_find_vcpi_slots() using the PBN calculated in
asyh->dp.pbn. However, if the display is inactive we end up calculating
a PBN of 0, which inadvertently causes us to have an allocation of 0.
>From there, if userspace then disables the CRTC afterwards we end up
accidentally attempting to free the VCPI twice:

WARNING: CPU: 0 PID: 1484 at drivers/gpu/drm/drm_dp_mst_topology.c:3336
drm_dp_atomic_release_vcpi_slots+0x87/0xb0 [drm_kms_helper]
RIP: 0010:drm_dp_atomic_release_vcpi_slots+0x87/0xb0 [drm_kms_helper]
Call Trace:
drm_atomic_helper_check_modeset+0x3f3/0xa60 [drm_kms_helper]
? drm_atomic_check_only+0x43/0x780 [drm]
drm_atomic_helper_check+0x15/0x90 [drm_kms_helper]
nv50_disp_atomic_check+0x83/0x1d0 [nouveau]
drm_atomic_check_only+0x54d/0x780 [drm]
? drm_atomic_set_crtc_for_connector+0xec/0x100 [drm]
drm_atomic_commit+0x13/0x50 [drm]
drm_atomic_helper_set_config+0x81/0x90 [drm_kms_helper]
drm_mode_setcrtc+0x194/0x6a0 [drm]
? vprintk_emit+0x16a/0x230
? drm_ioctl+0x163/0x390 [drm]
? drm_mode_getcrtc+0x180/0x180 [drm]
drm_ioctl_kernel+0xaa/0xf0 [drm]
drm_ioctl+0x208/0x390 [drm]
? drm_mode_getcrtc+0x180/0x180 [drm]
nouveau_drm_ioctl+0x63/0xb0 [nouveau]
do_vfs_ioctl+0x405/0x660
? recalc_sigpending+0x17/0x50
? _copy_from_user+0x37/0x60
ksys_ioctl+0x5e/0x90
? exit_to_usermode_loop+0x92/0xe0
__x64_sys_ioctl+0x16/0x20
do_syscall_64+0x59/0x190
entry_SYSCALL_64_after_hwframe+0x44/0xa9
WARNING: CPU: 0 PID: 1484 at drivers/gpu/drm/drm_dp_mst_topology.c:3336
drm_dp_atomic_release_vcpi_slots+0x87/0xb0 [drm_kms_helper]
---[ end trace 4c395c0c51b1f88d ]---
[drm:drm_dp_atomic_release_vcpi_slots [drm_kms_helper]] *ERROR* no VCPI for
[MST PORT:00000000e288eb7d] found in mst state 000000008e642070

So, fix this by doing what we probably should have done from the start: only
call drm_dp_atomic_find_vcpi_slots() when crtc_state->mode_changed is set, so
that VCPI allocations remain for as long as the CRTC is enabled.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Fixes: 232c9eec417a ("drm/nouveau: Use atomic VCPI helpers for MST")
Cc: Lyude Paul <lyude@redhat.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: David Airlie <airlied@redhat.com>
Cc: Jerry Zuo <Jerry.Zuo@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Juston Li <juston.li@intel.com>
Cc: Karol Herbst <karolherbst@gmail.com>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <stable@vger.kernel.org> # v5.1+
Acked-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190801220216.15323-1-lyude@redhat.com

Merge tag 'drm-fixes-5.3-2019-07-31' of git://people.freedesktop.org/~agd5f/linux into drm-fixes

drm-fixes-5.3-2019-07-31:

amdgpu:
- Fix temperature granularity for navi
- Fix stable pstate setting for navi
- Fix VCN DPM enablement on navi
- Fix error handling on CS ioctl when processing dependencies
- Fix possible information leak in debugfs

amdkfd:
- fix memory alignment for VegaM

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190731191648.25729-1-alexander.deucher@amd.com

ACPI: PM: Fix regression in acpi_device_set_power()

Commit f850a48a0799 ("ACPI: PM: Allow transitions to D0 to occur in
special cases") overlooked the fact that acpi_power_transition() may
change the power.state value for the target device and if that
happens, it may confuse acpi_device_set_power() and cause it to
omit the _PS0 evaluation which on some systems is necessary to
change power states of devices from low-power to D0.

Fix that by saving the current value of power.state for the
target device before passing it to acpi_power_transition() and
using the saved value in a subsequent check.

Fixes: f850a48a0799 ("ACPI: PM: Allow transitions to D0 to occur in special cases")
Reported-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Reported-by: Mario Limonciello <mario.limonciello@dell.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Mario Limonciello <mario.limonciello@dell.com>

i2c: s3c2410: Mark expected switch fall-through

Mark switch cases where we are expecting to fall through.

This patch fixes the following warning:

drivers/i2c/busses/i2c-s3c2410.c: In function 'i2c_s3c_irq_nextbyte':
drivers/i2c/busses/i2c-s3c2410.c:431:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
   if (i2c->state == STATE_READ)
      ^
drivers/i2c/busses/i2c-s3c2410.c:439:2: note: here
  case STATE_WRITE:
  ^~~~

Notice that, in this particular case, the code comment is
modified in accordance with what GCC is expecting to find.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>

i2c: at91: fix clk_offset for sama5d2

In SAMA5D2 datasheet, TWIHS_CWGR register rescription mentions clock
offset of 3 cycles (compared to 4 in eg. SAMA5D3).

Cc: stable@vger.kernel.org # 5.2.x
[needs applying to i2c-at91.c instead for earlier kernels]
Fixes: 0ef6f3213dac ("i2c: at91: add support for new alternative command mode")
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Acked-by: Ludovic Desroches <ludovic.desroches@microchip.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>

i2c: at91: disable TXRDY interrupt after sending data

Driver was not disabling TXRDY interrupt after last TX byte.
This caused interrupt storm until transfer timeouts for slow
or broken device on the bus. The patch fixes the interrupt storm
on my SAMA5D2-based board.

Cc: stable@vger.kernel.org # 5.2.x
[v5.2 introduced file split; the patch should apply to i2c-at91.c before the split]
Fixes: fac368a04048 ("i2c: at91: add new driver")
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Acked-by: Ludovic Desroches <ludovic.desroches@microchip.com>
Tested-by: Raag Jadav <raagjadav@gmail.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>

block: Fix __blkdev_direct_IO() for bio fragments

The recent fix to properly handle IOCB_NOWAIT for async O_DIRECT IO
(patch 6a43074e2f46) introduced two problems with BIO fragment handling
for direct IOs:
1) The dio size processed is calculated by incrementing the ret variable
by the size of the bio fragment issued for the dio. However, this size
is obtained directly from bio->bi_iter.bi_size AFTER the bio submission
which may result in referencing the bi_size value after the bio
completed, resulting in an incorrect value use.
2) The ret variable is not incremented by the size of the last bio
fragment issued for the bio, leading to an invalid IO size being
returned to the user.

Fix both problem by using dio->size (which is incremented before the bio
submission) to update the value of ret after bio submissions, including
for the last bio fragment issued.

Fixes: 6a43074e2f46 ("block: properly handle IOCB_NOWAIT for async O_DIRECT IO")
Reported-by: Masato Suzuki <masato.suzuki@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>