platform/kernel/linux-starfive.git
5 years agoselftests: avoid KBUILD_OUTPUT dir cluttering with selftest objects
Shuah Khan [Tue, 14 May 2019 20:43:44 +0000 (14:43 -0600)]
selftests: avoid KBUILD_OUTPUT dir cluttering with selftest objects

Running "make kselftest" or building selftests when KBUILD_OUTPUT
is set, will create selftest objects in the KBUILD_OUTPUT directory.
This could be undesirable especially when user didn't intend to
relocate selftest objects.

Use KBUILD_OUTPUT/kselftest to create selftest objects instead of
cluttering the main directory.

Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
5 years agoselftests: drivers: Create .gitignore to include /dma-buf/udmabuf
Kelsey Skunberg [Sun, 12 May 2019 05:04:52 +0000 (23:04 -0600)]
selftests: drivers: Create .gitignore to include /dma-buf/udmabuf

Create ../selftests/drivers/.gitignore which holds the following file name
created after compiling:

- /dma-buf/udmabuf

Signed-off-by: Kelsey Skunberg <skunberg.kelsey@gmail.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
5 years agoselftests: pidfd: Create .gitignore to include pidfd_test
Kelsey Skunberg [Sun, 12 May 2019 04:57:11 +0000 (22:57 -0600)]
selftests: pidfd: Create .gitignore to include pidfd_test

Create ../selftests/pidfd/.gitignore which holds the following file name
created after compiling:

- pidfd_test

Signed-off-by: Kelsey Skunberg <skunberg.kelsey@gmail.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
5 years agoselftests: fix bpf build/test workflow regression when KBUILD_OUTPUT is set
Shuah Khan [Sat, 11 May 2019 01:38:39 +0000 (19:38 -0600)]
selftests: fix bpf build/test workflow regression when KBUILD_OUTPUT is set

commit 8ce72dc32578 ("selftests: fix headers_install circular dependency")
broke bpf build/test workflow. When KBUILD_OUTPUT is set, bpf objects end
up in KBUILD_OUTPUT build directory instead of in ../selftests/bpf.

The following bpf workflow breaks when it can't find the test_verifier:

cd tools/testing/selftests/bpf; make; ./test_verifier;

Fix it to set OUTPUT only when it is undefined in lib.mk. It didn't need
to be set in the first place.

Fixes: 8ce72dc32578 ("selftests: fix headers_install circular dependency")
Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
5 years agoselftests: fix install target to use default install path
Shuah Khan [Tue, 7 May 2019 23:44:21 +0000 (17:44 -0600)]
selftests: fix install target to use default install path

Install target fails when INSTALL_PATH is undefined. Fix install target
to use "output_dir/install as the default install location. "output_dir"
is either the root of selftests directory under kernel source tree or
output directory specified by O= or KBUILD_OUTPUT.

e.g:
make -C tools/testing/selftests install
<installs under tools/testing/selftests/install>

make O=/tmp/kselftest -C tools/testing/selftests install
<installs under /tmp/kselftest/install>

export KBUILD_OUTPUT=/tmp/kselftest
make -C tools/testing/selftests install
<installs under /tmp/kselftest/install>

In addition, add "all" target as dependency to "install" to build and
install using a single command.

Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
5 years agorseq/selftests: add -no-integrated-as for clang
Mathieu Desnoyers [Mon, 29 Apr 2019 15:28:03 +0000 (11:28 -0400)]
rseq/selftests: add -no-integrated-as for clang

Ongoing work for asm goto support from clang requires the
-no-integrated-as compiler flag.

This compiler flag is present in the toplevel kernel Makefile,
but is not replicated for selftests. Add it specifically for
the rseq selftest which requires asm goto.

Link: https://reviews.llvm.org/D56571
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
CC: Nick Desaulniers <ndesaulniers@google.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Joel Fernandes <joelaf@google.com>
CC: Peter Zijlstra <peterz@infradead.org>
CC: Catalin Marinas <catalin.marinas@arm.com>
CC: Dave Watson <davejwatson@fb.com>
CC: Will Deacon <will.deacon@arm.com>
CC: Shuah Khan <shuah@kernel.org>
CC: Andi Kleen <andi@firstfloor.org>
CC: linux-kselftest@vger.kernel.org
CC: "H . Peter Anvin" <hpa@zytor.com>
CC: Chris Lameter <cl@linux.com>
CC: Russell King <linux@arm.linux.org.uk>
CC: Michael Kerrisk <mtk.manpages@gmail.com>
CC: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
CC: Paul Turner <pjt@google.com>
CC: Boqun Feng <boqun.feng@gmail.com>
CC: Josh Triplett <josh@joshtriplett.org>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: Ben Maurer <bmaurer@fb.com>
CC: linux-api@vger.kernel.org
CC: Andy Lutomirski <luto@amacapital.net>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
5 years agorseq/selftests: mips: use break instruction for RSEQ_SIG
Mathieu Desnoyers [Mon, 29 Apr 2019 15:28:02 +0000 (11:28 -0400)]
rseq/selftests: mips: use break instruction for RSEQ_SIG

Use break as guard instruction for the restartable sequence abort
handler.

Previously, the chosen signature was simply data, based on the
assumption that it could always sit in a literal pool. However,
some compilation environments favor disabling literal pool. Therefore,
ensure the signature is a valid uncommon trap instruction.

Suggested-by: Paul Burton <paul.burton@mips.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
CC: Paul Burton <paul.burton@mips.com>
CC: James Hogan <jhogan@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
5 years agorseq/selftests: powerpc code signature: generate valid instructions
Mathieu Desnoyers [Mon, 29 Apr 2019 15:28:01 +0000 (11:28 -0400)]
rseq/selftests: powerpc code signature: generate valid instructions

Use "twui" as the guard instruction for the restartable sequence abort
handler.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Paul Mackerras <paulus@samba.org>
CC: Michael Ellerman <mpe@ellerman.id.au>
CC: Boqun Feng <boqun.feng@gmail.com>
CC: Peter Zijlstra <peterz@infradead.org>
CC: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
CC: Alan Modra <amodra@gmail.com>
CC: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
5 years agorseq/selftests: aarch64 code signature: handle big-endian environment
Mathieu Desnoyers [Mon, 29 Apr 2019 15:28:00 +0000 (11:28 -0400)]
rseq/selftests: aarch64 code signature: handle big-endian environment

Handle compiling with -mbig-endian on aarch64, which generates binaries
with mixed code vs data endianness (little endian code, big endian
data).

Else mismatch between code endianness for the generated signatures and
data endianness for the RSEQ_SIG parameter passed to the rseq
registration will trigger application segmentation faults when the
kernel try to abort rseq critical sections.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: Will Deacon <will.deacon@arm.com>
CC: Peter Zijlstra <peterz@infradead.org>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Joel Fernandes <joelaf@google.com>
CC: Catalin Marinas <catalin.marinas@arm.com>
CC: Dave Watson <davejwatson@fb.com>
CC: Will Deacon <will.deacon@arm.com>
CC: Shuah Khan <shuah@kernel.org>
CC: Andi Kleen <andi@firstfloor.org>
CC: linux-kselftest@vger.kernel.org
CC: "H . Peter Anvin" <hpa@zytor.com>
CC: Chris Lameter <cl@linux.com>
CC: Russell King <linux@arm.linux.org.uk>
CC: Michael Kerrisk <mtk.manpages@gmail.com>
CC: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
CC: Paul Turner <pjt@google.com>
CC: Boqun Feng <boqun.feng@gmail.com>
CC: Josh Triplett <josh@joshtriplett.org>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: Ben Maurer <bmaurer@fb.com>
CC: linux-api@vger.kernel.org
CC: Andy Lutomirski <luto@amacapital.net>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
5 years agorseq/selftests: arm: use udf instruction for RSEQ_SIG
Mathieu Desnoyers [Fri, 3 May 2019 19:38:58 +0000 (15:38 -0400)]
rseq/selftests: arm: use udf instruction for RSEQ_SIG

Use udf as the guard instruction for the restartable sequence abort
handler.

Previously, the chosen signature was not a valid instruction, based
on the assumption that it could always sit in a literal pool. However,
there are compilation environments in which literal pools are not
available, for instance execute-only code. Therefore, we need to
choose a signature value that is also a valid instruction.

Handle compiling with -mbig-endian on ARMv6+, which generates binaries
with mixed code vs data endianness (little endian code, big endian
data).

Else mismatch between code endianness for the generated signatures and
data endianness for the RSEQ_SIG parameter passed to the rseq
registration will trigger application segmentation faults when the
kernel try to abort rseq critical sections.

Prior to ARMv6, -mbig-endian generates big-endian code and data, so
endianness should not be reversed in that case.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
CC: Peter Zijlstra <peterz@infradead.org>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Joel Fernandes <joelaf@google.com>
CC: Catalin Marinas <catalin.marinas@arm.com>
CC: Dave Watson <davejwatson@fb.com>
CC: Will Deacon <will.deacon@arm.com>
CC: Shuah Khan <shuah@kernel.org>
CC: Andi Kleen <andi@firstfloor.org>
CC: linux-kselftest@vger.kernel.org
CC: "H . Peter Anvin" <hpa@zytor.com>
CC: Chris Lameter <cl@linux.com>
CC: Russell King <linux@arm.linux.org.uk>
CC: Michael Kerrisk <mtk.manpages@gmail.com>
CC: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
CC: Paul Turner <pjt@google.com>
CC: Boqun Feng <boqun.feng@gmail.com>
CC: Josh Triplett <josh@joshtriplett.org>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: Ben Maurer <bmaurer@fb.com>
CC: linux-api@vger.kernel.org
CC: Andy Lutomirski <luto@amacapital.net>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
5 years agorseq/selftests: s390: use trap4 for RSEQ_SIG
Martin Schwidefsky [Mon, 29 Apr 2019 15:27:58 +0000 (11:27 -0400)]
rseq/selftests: s390: use trap4 for RSEQ_SIG

Use trap4 as the guard instruction for the restartable sequence abort
handler.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
5 years agorseq/selftests: x86: use ud1 instruction as RSEQ_SIG opcode
Mathieu Desnoyers [Mon, 29 Apr 2019 15:27:57 +0000 (11:27 -0400)]
rseq/selftests: x86: use ud1 instruction as RSEQ_SIG opcode

Use ud1 as the guard instruction for the restartable sequence abort
handler. Its benefit compared to nopl is to trap execution if the
program ends up trying to execute it by mistake, which makes debugging
easier.

The 4-byte signature per se is unchanged (it is the instruction
operand). Only the opcode is changed from nopl to ud1.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
CC: Peter Zijlstra <peterz@infradead.org>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Joel Fernandes <joelaf@google.com>
CC: Catalin Marinas <catalin.marinas@arm.com>
CC: Dave Watson <davejwatson@fb.com>
CC: Will Deacon <will.deacon@arm.com>
CC: Shuah Khan <shuah@kernel.org>
CC: Andi Kleen <andi@firstfloor.org>
CC: linux-kselftest@vger.kernel.org
CC: "H . Peter Anvin" <hpa@zytor.com>
CC: Chris Lameter <cl@linux.com>
CC: Russell King <linux@arm.linux.org.uk>
CC: Michael Kerrisk <mtk.manpages@gmail.com>
CC: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
CC: Paul Turner <pjt@google.com>
CC: Boqun Feng <boqun.feng@gmail.com>
CC: Josh Triplett <josh@joshtriplett.org>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: Ben Maurer <bmaurer@fb.com>
CC: linux-api@vger.kernel.org
CC: Andy Lutomirski <luto@amacapital.net>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
5 years agorseq/selftests: s390: use jg instruction for jumps outside of the asm
Mathieu Desnoyers [Mon, 29 Apr 2019 15:27:56 +0000 (11:27 -0400)]
rseq/selftests: s390: use jg instruction for jumps outside of the asm

The branch target range of the "j" instruction is 64K, which is not
enough for the general case.

Suggested-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Joel Fernandes <joelaf@google.com>
CC: Peter Zijlstra <peterz@infradead.org>
CC: Catalin Marinas <catalin.marinas@arm.com>
CC: Dave Watson <davejwatson@fb.com>
CC: Will Deacon <will.deacon@arm.com>
CC: Shuah Khan <shuah@kernel.org>
CC: Andi Kleen <andi@firstfloor.org>
CC: linux-kselftest@vger.kernel.org
CC: "H . Peter Anvin" <hpa@zytor.com>
CC: Chris Lameter <cl@linux.com>
CC: Russell King <linux@arm.linux.org.uk>
CC: Michael Kerrisk <mtk.manpages@gmail.com>
CC: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
CC: Paul Turner <pjt@google.com>
CC: Boqun Feng <boqun.feng@gmail.com>
CC: Josh Triplett <josh@joshtriplett.org>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: Ben Maurer <bmaurer@fb.com>
CC: linux-api@vger.kernel.org
CC: Andy Lutomirski <luto@amacapital.net>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
5 years agorseq/selftests: Use __rseq_handled symbol to coexist with glibc
Mathieu Desnoyers [Mon, 29 Apr 2019 15:27:55 +0000 (11:27 -0400)]
rseq/selftests: Use __rseq_handled symbol to coexist with glibc

In order to integrate rseq into user-space applications, expose a
__rseq_handled symbol so many rseq users can be linked into the same
application (e.g. librseq and glibc).

The __rseq_refcount TLS variable is static to the librseq library. It
ensures that rseq syscall registration/unregistration happens only for
the most early/late caller to rseq_{,un}register_current_thread for each
thread, thus ensuring that rseq is registered across the lifetime of all
rseq users for a given thread.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
CC: Shuah Khan <shuah@kernel.org>
CC: Carlos O'Donell <carlos@redhat.com>
CC: Florian Weimer <fweimer@redhat.com>
CC: Joseph Myers <joseph@codesourcery.com>
CC: Szabolcs Nagy <szabolcs.nagy@arm.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Ben Maurer <bmaurer@fb.com>
CC: Peter Zijlstra <peterz@infradead.org>
CC: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
CC: Boqun Feng <boqun.feng@gmail.com>
CC: Will Deacon <will.deacon@arm.com>
CC: Dave Watson <davejwatson@fb.com>
CC: Paul Turner <pjt@google.com>
CC: linux-api@vger.kernel.org
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
5 years agorseq/selftests: Introduce __rseq_cs_ptr_array, rename __rseq_table to __rseq_cs
Mathieu Desnoyers [Mon, 29 Apr 2019 15:27:54 +0000 (11:27 -0400)]
rseq/selftests: Introduce __rseq_cs_ptr_array, rename __rseq_table to __rseq_cs

The entries within __rseq_table are aligned on 32 bytes due to
linux/rseq.h struct rseq_cs uapi requirements, but the start of the
__rseq_table section is not guaranteed to be 32-byte aligned. It can
cause padding to be added at the start of the section, which makes it
hard to use as an array of items by debuggers.

Considering that __rseq_table does not really consist of a table due to
the presence of padding, rename this section to __rseq_cs.

Create a new __rseq_cs_ptr_array section which contains 64-bit packed
pointers to entries within the __rseq_cs section.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Joel Fernandes <joelaf@google.com>
CC: Peter Zijlstra <peterz@infradead.org>
CC: Catalin Marinas <catalin.marinas@arm.com>
CC: Dave Watson <davejwatson@fb.com>
CC: Will Deacon <will.deacon@arm.com>
CC: Shuah Khan <shuah@kernel.org>
CC: Andi Kleen <andi@firstfloor.org>
CC: linux-kselftest@vger.kernel.org
CC: "H . Peter Anvin" <hpa@zytor.com>
CC: Chris Lameter <cl@linux.com>
CC: Russell King <linux@arm.linux.org.uk>
CC: Michael Kerrisk <mtk.manpages@gmail.com>
CC: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
CC: Paul Turner <pjt@google.com>
CC: Boqun Feng <boqun.feng@gmail.com>
CC: Josh Triplett <josh@joshtriplett.org>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: Ben Maurer <bmaurer@fb.com>
CC: linux-api@vger.kernel.org
CC: Andy Lutomirski <luto@amacapital.net>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
5 years agorseq/selftests: Add __rseq_exit_point_array section for debuggers
Mathieu Desnoyers [Mon, 29 Apr 2019 15:27:53 +0000 (11:27 -0400)]
rseq/selftests: Add __rseq_exit_point_array section for debuggers

Knowing all exit points is useful to assist debuggers stepping over the
rseq critical sections without requiring them to disassemble the content
of the critical section to figure out the exit points.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Joel Fernandes <joelaf@google.com>
CC: Peter Zijlstra <peterz@infradead.org>
CC: Catalin Marinas <catalin.marinas@arm.com>
CC: Dave Watson <davejwatson@fb.com>
CC: Will Deacon <will.deacon@arm.com>
CC: Shuah Khan <shuah@kernel.org>
CC: Andi Kleen <andi@firstfloor.org>
CC: linux-kselftest@vger.kernel.org
CC: "H . Peter Anvin" <hpa@zytor.com>
CC: Chris Lameter <cl@linux.com>
CC: Russell King <linux@arm.linux.org.uk>
CC: Michael Kerrisk <mtk.manpages@gmail.com>
CC: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
CC: Paul Turner <pjt@google.com>
CC: Boqun Feng <boqun.feng@gmail.com>
CC: Josh Triplett <josh@joshtriplett.org>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: Ben Maurer <bmaurer@fb.com>
CC: linux-api@vger.kernel.org
CC: Andy Lutomirski <luto@amacapital.net>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
5 years agorseq/selftests: x86: Work-around bogus gcc-8 optimisation
Mathieu Desnoyers [Mon, 29 Apr 2019 15:27:52 +0000 (11:27 -0400)]
rseq/selftests: x86: Work-around bogus gcc-8 optimisation

gcc-8 version 8.1.0, 8.2.0, and 8.3.0 generate broken assembler with asm
goto that have a thread-local storage "m" input operand on both x86-32
and x86-64. For instance:

__thread int var;

static int fct(void)
{
        asm goto (      "jmp %l[testlabel]\n\t"
                        : : [var] "m" (var) : : testlabel);
        return 0;
testlabel:
        return 1;
}

int main()
{
        return fct();
}

% gcc-8 -O2 -o test-asm-goto test-asm-goto.c
/tmp/ccAdHJbe.o: In function `main':
test-asm-goto.c:(.text.startup+0x1): undefined reference to `.L2'
collect2: error: ld returned 1 exit status

% gcc-8 -m32 -O2 -o test-asm-goto test-asm-goto.c
/tmp/ccREsVXA.o: In function `main':
test-asm-goto.c:(.text.startup+0x1): undefined reference to `.L2'
collect2: error: ld returned 1 exit status

Work-around this compiler bug in the rseq-x86.h header by passing the
address of the __rseq_abi TLS as a register operand rather than using
the "m" input operand.

Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90193
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
CC: Ingo Molnar <mingo@redhat.com>
CC: Peter Zijlstra <peterz@infradead.org>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Joel Fernandes <joelaf@google.com>
CC: Catalin Marinas <catalin.marinas@arm.com>
CC: Dave Watson <davejwatson@fb.com>
CC: Will Deacon <will.deacon@arm.com>
CC: Shuah Khan <shuah@kernel.org>
CC: Andi Kleen <andi@firstfloor.org>
CC: linux-kselftest@vger.kernel.org
CC: "H . Peter Anvin" <hpa@zytor.com>
CC: Chris Lameter <cl@linux.com>
CC: Russell King <linux@arm.linux.org.uk>
CC: Michael Kerrisk <mtk.manpages@gmail.com>
CC: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
CC: Paul Turner <pjt@google.com>
CC: Boqun Feng <boqun.feng@gmail.com>
CC: Josh Triplett <josh@joshtriplett.org>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: Ben Maurer <bmaurer@fb.com>
CC: linux-api@vger.kernel.org
CC: Andy Lutomirski <luto@amacapital.net>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
5 years agoselftests: Add test plan API to kselftest.h and adjust callers
Kees Cook [Wed, 24 Apr 2019 23:12:37 +0000 (16:12 -0700)]
selftests: Add test plan API to kselftest.h and adjust callers

The test plan for TAP needs to be declared immediately after the header.
This adds the test plan API to kselftest.h and updates all callers to
declare their expected test counts.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
5 years agoselftests: Remove KSFT_TAP_LEVEL
Kees Cook [Wed, 24 Apr 2019 23:12:36 +0000 (16:12 -0700)]
selftests: Remove KSFT_TAP_LEVEL

Since sub-testing can now be detected by indentation level, this removes
KSFT_TAP_LEVEL so that subtests report their TAP header for later parsing.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
5 years agoselftests: Move test output to diagnostic lines
Kees Cook [Wed, 24 Apr 2019 23:12:35 +0000 (16:12 -0700)]
selftests: Move test output to diagnostic lines

This changes the selftest output so that each test's output is prefixed
with "# " as a TAP "diagnostic line".

This creates a bit of a kernel-specific TAP dialect where the diagnostics
precede the results. The TAP spec isn't entirely clear about this, though,
so I think it's the correct solution so as to keep interactive runs making
sense. If the output _followed_ the result line in the spec-suggested
YAML form, each test would dump all of its output at once instead of as
it went, making debugging harder.

This does, however, solve the recursive TAP output problem, as sub-tests
will simply be prefixed by "# ". Parsing sub-tests becomes a simple
problem of just removing the first two characters of a given top-level
test's diagnostic output, and parsing the results.

Note that the shell construct needed to both get an exit code from
the first command in a pipe and still filter the pipe (to add the "# "
prefix) uses a POSIX solution rather than the bash "pipefail" option
which is not supported by dash.

Since some test environments may have a very minimal set of utilities
available, the new prefixing code will fall back to doing line-at-a-time
prefixing if perl and/or stdbuf are not available.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
5 years agoselftests: Distinguish between missing and non-executable
Kees Cook [Wed, 24 Apr 2019 23:12:34 +0000 (16:12 -0700)]
selftests: Distinguish between missing and non-executable

If a test was missing (e.g. wrong architecture, etc), the test runner
would incorrectly claim the test was non-executable. This adds an
existence check to report correctly.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
5 years agoselftests: Add plan line and fix result line syntax
Kees Cook [Wed, 24 Apr 2019 23:12:33 +0000 (16:12 -0700)]
selftests: Add plan line and fix result line syntax

The TAP version 13 spec requires a "plan" line, which has been missing.
Since we always know how many tests we're going to run, emit the count on
the plan line. This also fixes the result lines to remove the "1.." prefix
which is against spec, and to mark skips with the correct "# SKIP" suffix.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
5 years agoselftests: Extract logic for multiple test runs
Kees Cook [Wed, 24 Apr 2019 23:12:32 +0000 (16:12 -0700)]
selftests: Extract logic for multiple test runs

This moves the logic for running multiple tests into a single "run_many"
function of runner.sh. Both "run_tests" and "emit_tests" are modified to
use it. Summary handling is now controlled by the "per_test_logging"
shell flag.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
5 years agoselftests: Use runner.sh for emit targets
Kees Cook [Wed, 24 Apr 2019 23:12:31 +0000 (16:12 -0700)]
selftests: Use runner.sh for emit targets

This reuses the new runner.sh for the emit targets instead of manually
running each test via run_kselftest.sh.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
5 years agoselftests: Extract single-test shell logic from lib.mk
Kees Cook [Wed, 24 Apr 2019 23:12:30 +0000 (16:12 -0700)]
selftests: Extract single-test shell logic from lib.mk

In order to improve the reusability of the kselftest test running logic,
this extracts the single-test logic from lib.mk into kselftest/runner.sh
which lib.mk can call directly. No changes in output.

As part of the change, this moves the "summary" Makefile logic around
to set a new "logfile" output. This will be used again in the future
"emit_tests" target as well.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
5 years agoselftests: build and run gpio when output directory is the src dir
Shuah Khan [Fri, 19 Apr 2019 22:31:28 +0000 (16:31 -0600)]
selftests: build and run gpio when output directory is the src dir

Build and run gpio when output directory is the src dir.  gpio has
dependency on tools/gpio and builds tools/gpio objects in the src
directory in all cases making the src repo dirty even when object
relocation is specified.

This fixes the following commands from generating gpio objects in
the source repository:

make O=dir kselftest
export KBUILD_OUTPUT=dir; make kselftest
make O=dir -C tools/testing/selftests
expoert KBUILD_OUTPUT=dir; make -C tools/testing/selftests

The following commands still build gpio objects in the source repo
(gpio Makefile needs to fixed):
make O=dir kselftest TARGETS="gpio"
export KBUILD_OUTPUT=dir; make kselftest TARGETS="gpio"
make O=dir -C tools/testing/selftests TARGETS="gpio"
expoert KBUILD_OUTPUT=dir; make -C tools/testing/selftests TARGETS="gpio"

Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
5 years agoselftests/ipc: Fix msgque compiler warnings
Kees Cook [Mon, 8 Apr 2019 17:13:44 +0000 (10:13 -0700)]
selftests/ipc: Fix msgque compiler warnings

This fixes the various compiler warnings when building the msgque
selftest. The primary change is using sys/msg.h instead of linux/msg.h
directly to gain the API declarations.

Fixes: 3a665531a3b7 ("selftests: IPC message queue copy feature test")
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
5 years agoselftests/efivarfs: clean up test files from test_create*()
Po-Hsu Lin [Fri, 19 Apr 2019 14:04:49 +0000 (22:04 +0800)]
selftests/efivarfs: clean up test files from test_create*()

Test files created by test_create() and test_create_empty() tests will
stay in the $efivarfs_mount directory until the system was rebooted.

When the tester tries to run this efivarfs test again on the same
system, the immutable characteristics in that directory will cause some
"Operation not permitted" noises, and a false-positve test result as the
file was created in previous run.
    --------------------
    running test_create
    --------------------
    ./efivarfs.sh: line 59: /sys/firmware/efi/efivars/test_create-210be57c-9849-4fc7-a635-e6382d1aec27: Operation not permitted
      [PASS]
    --------------------
    running test_create_empty
    --------------------
    ./efivarfs.sh: line 78: /sys/firmware/efi/efivars/test_create_empty-210be57c-9849-4fc7-a635-e6382d1aec27: Operation not permitted
     [PASS]
    --------------------

Create a file_cleanup() to remove those test files in the end of each
test to solve this issue.

For the test_create_read, we can move the clean up task to the end of
the test to ensure the system is clean.

Also, use this function to replace the existing file removal code.

Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
5 years agoselftests: fix headers_install circular dependency
Shuah Khan [Mon, 15 Apr 2019 21:51:42 +0000 (15:51 -0600)]
selftests: fix headers_install circular dependency

"make kselftest" fails with "Circular Makefile.o <- prepare dependency
dropped." error, when lib.mk invokes "make headers_install".

Make level 0: Main make calls selftests run_tests target
...
Make level n: selftests lib.mk invokes main make's headers_install

The secondary level make inherits builtin-rules which will use the rule
to generate Makefile.o  and runs into "Circular Makefile.o <- prepare
dependency dropped." error, and kselftest compile fails.

Invoke headers_install target with --no-builtin-rules to avoid circular
error.

In addition, lib.mk installs headers in the default HDR_PATH, even when
build relocation is requested with O= or export KBUILD_OUTPUT. Fix the
problem by passing in INSTALL_HDR_PATH. The headers are installed under
the specified output "dir/usr".

Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
5 years agoselftests/kexec: update get_secureboot_mode
Mimi Zohar [Mon, 1 Apr 2019 17:39:44 +0000 (13:39 -0400)]
selftests/kexec: update get_secureboot_mode

The get_secureboot_mode() function unnecessarily requires both
CONFIG_EFIVAR_FS and CONFIG_EFI_VARS to be enabled to determine if the
system is booted in secure boot mode.  On some systems the old EFI
variable support is not enabled or, possibly, even implemented.

This patch first checks the efivars filesystem for the SecureBoot and
SetupMode flags, but falls back to using the old EFI variable support.

The "secure_boot_file" and "setup_mode_file" couldn't be quoted due to
globbing.  This patch also removes the globbing.

Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Reviewed-by: Petr Vorel <pvorel@suse.cz>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
5 years agoselftests/kexec: make kexec_load test independent of IMA being enabled
Mimi Zohar [Mon, 25 Mar 2019 18:13:27 +0000 (14:13 -0400)]
selftests/kexec: make kexec_load test independent of IMA being enabled

Verify IMA is enabled before failing tests or emitting irrelevant
messages.

Suggested-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Reviewed-by: Dave Young <dyoung@redhat.com>
Reviewed-by: Petr Vorel <pvorel@suse.cz>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
5 years agoselftests/kexec: check kexec_load and kexec_file_load are enabled
Mimi Zohar [Tue, 12 Mar 2019 18:53:54 +0000 (14:53 -0400)]
selftests/kexec: check kexec_load and kexec_file_load are enabled

Skip the kexec_load and kexec_file_load tests, if they aren't configured
in the kernel.  This change adds a new requirement that ikconfig is
configured in the kexec_load test.

Suggested-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Reviewed-by: Petr Vorel <pvorel@suse.cz>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
5 years agoselftests/kexec: Add missing '=y' to config options
Petr Vorel [Thu, 28 Feb 2019 23:30:30 +0000 (00:30 +0100)]
selftests/kexec: Add missing '=y' to config options

so the file can be used as kernel config snippet.

Signed-off-by: Petr Vorel <pvorel@suse.cz>
[zohar@linux.ibm.com: remove CONFIG_KEXEC_VERIFY_SIG from config]
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
5 years agoselftests/kexec: kexec_file_load syscall test
Mimi Zohar [Thu, 24 Jan 2019 23:22:06 +0000 (18:22 -0500)]
selftests/kexec: kexec_file_load syscall test

The kernel can be configured to verify PE signed kernel images, IMA
kernel image signatures, both types of signatures, or none.  This test
verifies only properly signed kernel images are loaded into memory,
based on the kernel configuration and runtime policies.

Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Reviewed-by: Petr Vorel <pvorel@suse.cz>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
5 years agoselftests/kexec: define "require_root_privileges"
Mimi Zohar [Wed, 6 Mar 2019 16:19:45 +0000 (11:19 -0500)]
selftests/kexec: define "require_root_privileges"

Many tests require root privileges.  Define a common function.

Suggested-by: Petr Vorel <pvorel@suse.cz>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Reviewed-by: Petr Vorel <pvorel@suse.cz>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
5 years agoselftests/kexec: define common logging functions
Mimi Zohar [Wed, 13 Feb 2019 16:46:55 +0000 (11:46 -0500)]
selftests/kexec: define common logging functions

Define log_info, log_pass, log_fail, and log_skip functions.

Suggested-by: Petr Vorel <pvorel@suse.cz>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Reviewed-by: Petr Vorel <pvorel@suse.cz>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
5 years agoselftests/kexec: define a set of common functions
Mimi Zohar [Thu, 24 Jan 2019 13:57:20 +0000 (08:57 -0500)]
selftests/kexec: define a set of common functions

Define, update and move get_secureboot_mode() to a common file for use
by other tests.

Updated to check both the efivar SecureBoot-$(UUID) and
SetupMode-$(UUID), based on Dave Young's review.

Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Reviewed-by: Petr Vorel <pvorel@suse.cz>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
5 years agoselftests/kexec: cleanup the kexec selftest
Mimi Zohar [Thu, 24 Jan 2019 13:27:25 +0000 (08:27 -0500)]
selftests/kexec: cleanup the kexec selftest

Remove the few bashisms and use the complete option name for clarity.

Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Reviewed-by: Petr Vorel <pvorel@suse.cz>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
5 years agoselftests/kexec: move the IMA kexec_load selftest to selftests/kexec
Mimi Zohar [Wed, 13 Mar 2019 14:09:10 +0000 (10:09 -0400)]
selftests/kexec: move the IMA kexec_load selftest to selftests/kexec

As requested move the existing kexec_load selftest and subsequent kexec
tests to the selftests/kexec directory.

Suggested-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
Reviewed-by: Petr Vorel <pvorel@suse.cz>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
5 years agoselftests/harness: Add 30 second timeout per test
Kees Cook [Fri, 12 Apr 2019 00:11:08 +0000 (17:11 -0700)]
selftests/harness: Add 30 second timeout per test

In order to keep tests from hanging forever, this adds an alarm signal
to each test run. This assumes an individual test doesn't take longer
than 30 seconds.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
5 years agoselftests/seccomp: Handle namespace failures gracefully
Kees Cook [Thu, 11 Apr 2019 23:56:31 +0000 (16:56 -0700)]
selftests/seccomp: Handle namespace failures gracefully

When running without USERNS or PIDNS the seccomp test would hang since
it was waiting forever for the child to trigger the user notification
since it seems the glibc() abort handler makes a call to getpid(),
which would trap again. This changes the getpid filter to getppid, and
makes sure ASSERTs execute to stop from spawning the listener.

Reported-by: Shuah Khan <shuah@kernel.org>
Fixes: 6a21cc50f0c7 ("seccomp: add a return code to trap to userspace")
Cc: stable@vger.kernel.org # > 5.0
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Tycho Andersen <tycho@tycho.ws>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
5 years agoselftests: cgroup: fix cleanup path in test_memcg_subtree_control()
Roman Gushchin [Mon, 8 Apr 2019 22:12:30 +0000 (15:12 -0700)]
selftests: cgroup: fix cleanup path in test_memcg_subtree_control()

Dan reported, that cleanup path in test_memcg_subtree_control()
triggers a static checker warning:
  ./tools/testing/selftests/cgroup/test_memcontrol.c:76 \
  test_memcg_subtree_control()
  error: uninitialized symbol 'child2'.

Fix this by initializing child2 and parent2 variables and
split the cleanup path into few stages.

Signed-off-by: Roman Gushchin <guro@fb.com>
Fixes: 84092dbcf901 ("selftests: cgroup: add memory controller self-tests")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Shuah Khan (Samsung OSG) <shuah@kernel.org>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Shuah Khan <shuah@kernel.org>
5 years agoselftests: efivarfs: remove the test_create_read file if it was exist
ZhangXiaoxu [Tue, 19 Mar 2019 03:24:31 +0000 (11:24 +0800)]
selftests: efivarfs: remove the test_create_read file if it was exist

After the first run, the test case 'test_create_read' will always
fail because the file is exist and file's attr is 'S_IMMUTABLE',
open with 'O_RDWR' will always return -EPERM.

Signed-off-by: ZhangXiaoxu <zhangxiaoxu5@huawei.com>
Signed-off-by: Shuah Khan <shuah@kernel.org>
5 years agorseq/selftests: Adapt number of threads to the number of detected cpus
Mathieu Desnoyers [Tue, 5 Mar 2019 19:47:55 +0000 (14:47 -0500)]
rseq/selftests: Adapt number of threads to the number of detected cpus

On smaller systems, running a test with 200 threads can take a long
time on machines with smaller number of CPUs.

Detect the number of online cpus at test runtime, and multiply that
by 6 to have 6 rseq threads per cpu preempting each other.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Watson <davejwatson@fb.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: linux-kselftest@vger.kernel.org
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Chris Lameter <cl@linux.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Paul Turner <pjt@google.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ben Maurer <bmaurer@fb.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Shuah Khan <shuah@kernel.org>
5 years agolib: Add test module for strscpy_pad
Tobin C. Harding [Fri, 5 Apr 2019 01:58:59 +0000 (12:58 +1100)]
lib: Add test module for strscpy_pad

Add a test module for the new strscpy_pad() function.  Tie it into the
kselftest infrastructure for lib/ tests.

Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Tobin C. Harding <tobin@kernel.org>
Signed-off-by: Shuah Khan <shuah@kernel.org>
5 years agolib/string: Add strscpy_pad() function
Tobin C. Harding [Fri, 5 Apr 2019 01:58:58 +0000 (12:58 +1100)]
lib/string: Add strscpy_pad() function

We have a function to copy strings safely and we have a function to copy
strings and zero the tail of the destination (if source string is
shorter than destination buffer) but we do not have a function to do
both at once.  This means developers must write this themselves if they
desire this functionality.  This is a chore, and also leaves us open to
off by one errors unnecessarily.

Add a function that calls strscpy() then memset()s the tail to zero if
the source string is shorter than the destination buffer.

Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Tobin C. Harding <tobin@kernel.org>
Signed-off-by: Shuah Khan <shuah@kernel.org>
5 years agolib: Use new kselftest header
Tobin C. Harding [Fri, 5 Apr 2019 01:58:57 +0000 (12:58 +1100)]
lib: Use new kselftest header

We just added a new C header file for use with test modules that are
intended to be run with kselftest.  We can reduce code duplication by
using this header.

Use new kselftest header to reduce code duplication in test_printf and
test_bitmap test modules.

Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Tobin C. Harding <tobin@kernel.org>
Signed-off-by: Shuah Khan <shuah@kernel.org>
5 years agokselftest: Add test module framework header
Tobin C. Harding [Fri, 5 Apr 2019 01:58:56 +0000 (12:58 +1100)]
kselftest: Add test module framework header

kselftest runs as a userspace process.  Sometimes we need to test things
from kernel space.  One way of doing this is by creating a test module.
Currently doing so requires developers to write a bunch of boiler plate
in the module if kselftest is to be used to run the tests.  This means
we currently have a load of duplicate code to achieve these ends.  If we
have a uniform method for implementing test modules then we can reduce
code duplication, ensure uniformity in the test framework, ease code
maintenance, and reduce the work required to create tests.  This all
helps to encourage developers to write and run tests.

Add a C header file that can be included in test modules.  This provides
a single point for common test functions/macros.  Implement a few macros
that make up the start of the test framework.

Add documentation for new kselftest header to kselftest documentation.

Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Tobin C. Harding <tobin@kernel.org>
Signed-off-by: Shuah Khan <shuah@kernel.org>
5 years agokselftest: Add test runner creation script
Tobin C. Harding [Fri, 5 Apr 2019 01:58:55 +0000 (12:58 +1100)]
kselftest: Add test runner creation script

Currently if we wish to use kselftest to run tests within a kernel
module we write a small script to load/unload and do error reporting.
There are a bunch of these under tools/testing/selftests/lib/ that are
all identical except for the test name.  We can reduce code duplication
and improve maintainability if we have one version of this.  However
kselftest requires an executable for each test.  We can move all the
script logic to a central script then have each individual test script
call the main script.

Oneliner to call kselftest_module.sh courtesy of Kees, thanks!

Add test runner creation script.  Convert
tools/testing/selftests/lib/*.sh to use new test creation script.

Testing
-------

Configure kselftests for lib/ then build and boot kernel.  Then run
kselftests as follows:

  $ cd /path/to/kernel/tree
  $ sudo make O=$output_path -C tools/testing/selftests TARGETS="lib" run_tests

and also

  $ cd /path/to/kernel/tree
  $ cd tools/testing/selftests
  $ sudo make O=$output_path TARGETS="lib" run_tests

and also

  $ cd /path/to/kernel/tree
  $ cd tools/testing/selftests
  $ sudo make TARGETS="lib" run_tests

Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Tobin C. Harding <tobin@kernel.org>
Signed-off-by: Shuah Khan <shuah@kernel.org>
5 years agolib/test_printf: Add empty module_exit function
Tobin C. Harding [Fri, 5 Apr 2019 01:58:54 +0000 (12:58 +1100)]
lib/test_printf: Add empty module_exit function

Currently the test_printf module does not have an exit function, this
prevents the module from being unloaded.  If we cannot unload the
module we cannot run the tests a second time.

Add an empty exit function.

Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Tobin C. Harding <tobin@kernel.org>
Signed-off-by: Shuah Khan <shuah@kernel.org>
5 years agoselftest/gpio: Remove duplicate header
Sabyasachi Gupta [Wed, 6 Mar 2019 16:20:19 +0000 (21:50 +0530)]
selftest/gpio: Remove duplicate header

Remove duplicate header which are included twice.

Signed-off-by: Sabyasachi Gupta <sabyasachi.linux@gmail.com>
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Signed-off-by: Shuah Khan <shuah@kernel.org>
5 years agoselftest/rseq: Remove duplicate header
Sabyasachi Gupta [Wed, 6 Mar 2019 16:16:51 +0000 (21:46 +0530)]
selftest/rseq: Remove duplicate header

Remove duplicate header which is included twice

Signed-off-by: Sabyasachi Gupta <sabyasachi.linux@gmail.com>
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Shuah Khan <shuah@kernel.org>
5 years agoselftest/timers: Remove duplicate header
Sabyasachi Gupta [Wed, 6 Mar 2019 16:22:52 +0000 (21:52 +0530)]
selftest/timers: Remove duplicate header

Remove duplicate header which is included twice.

Signed-off-by: Sabyasachi Gupta <sabyasachi.linux@gmail.com>
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Signed-off-by: Shuah Khan <shuah@kernel.org>
5 years agoselftest/x86/mpx-dig.c: Remove duplicate header
Sabyasachi Gupta [Wed, 6 Mar 2019 16:29:04 +0000 (21:59 +0530)]
selftest/x86/mpx-dig.c: Remove duplicate header

Remove duplicate header which is included twice.

Signed-off-by: Sabyasachi Gupta <sabyasachi.linux@gmail.com>
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Signed-off-by: Shuah Khan <shuah@kernel.org>
5 years agoLinux 5.1-rc4
Linus Torvalds [Mon, 8 Apr 2019 00:09:59 +0000 (14:09 -1000)]
Linux 5.1-rc4

5 years agoMerge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Linus Torvalds [Sun, 7 Apr 2019 23:46:17 +0000 (13:46 -1000)]
Merge tag 'armsoc-fixes' of git://git./linux/kernel/git/soc/soc

Pull ARM SoC fixes from Olof Johansson:
 "A collection of fixes from the last few weeks. Most of them are
  smaller tweaks and fixes to DT and hardware descriptions for boards.
  Some of the more significant ones are:

   - eMMC and RGMII stability tweaks for rk3288

   - DDC fixes for Rock PI 4

   - Audio fixes for two TI am335x eval boards

   - D_CAN clock fix for am335x

   - Compilation fixes for clang

   - !HOTPLUG_CPU compilation fix for one of the new platforms this
     release (milbeaut)

   - A revert of a gpio fix for nomadik that instead was fixed in the
     gpio subsystem

   - Whitespace fix for the DT JSON schema (no tabs allowed)"

* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (25 commits)
  ARM: milbeaut: fix build with !CONFIG_HOTPLUG_CPU
  ARM: iop: don't use using 64-bit DMA masks
  ARM: orion: don't use using 64-bit DMA masks
  Revert "ARM: dts: nomadik: Fix polarity of SPI CS"
  dt-bindings: cpu: Fix JSON schema
  arm/mach-at91/pm : fix possible object reference leak
  ARM: dts: at91: Fix typo in ISC_D0 on PC9
  ARM: dts: Fix dcan clkctrl clock for am3
  reset: meson-audio-arb: Fix missing .owner setting of reset_controller_dev
  dt-bindings: reset: meson-g12a: Add missing USB2 PHY resets
  ARM: dts: rockchip: Remove #address/#size-cells from rk3288-veyron gpio-keys
  ARM: dts: rockchip: Remove #address/#size-cells from rk3288 mipi_dsi
  ARM: dts: rockchip: Fix gpu opp node names for rk3288
  ARM: dts: am335x-evmsk: Correct the regulators for the audio codec
  ARM: dts: am335x-evm: Correct the regulators for the audio codec
  ARM: OMAP2+: add missing of_node_put after of_device_is_available
  ARM: OMAP1: ams-delta: Fix broken GPIO ID allocation
  arm64: dts: stratix10: add the sysmgr-syscon property from the gmac's
  arm64: dts: rockchip: fix rk3328 sdmmc0 write errors
  arm64: dts: rockchip: fix rk3328 rgmii high tx error rate
  ...

5 years agoMerge tag 'for-linus-20190407' of git://git.kernel.dk/linux-block
Linus Torvalds [Sun, 7 Apr 2019 23:28:36 +0000 (13:28 -1000)]
Merge tag 'for-linus-20190407' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:

 - Fixups for the pf/pcd queue handling (YueHaibing)

 - Revert of the three direct issue changes as they have been proven to
   cause an issue with dm-mpath (Bart)

 - Plug rq_count reset fix (Dongli)

 - io_uring double free in fileset registration error handling (me)

 - Make null_blk handle bad numa node passed in (John)

 - BFQ ifdef fix (Konstantin)

 - Flush queue leak fix (Shenghui)

 - Plug trace fix (Yufen)

* tag 'for-linus-20190407' of git://git.kernel.dk/linux-block:
  xsysace: Fix error handling in ace_setup
  null_blk: prevent crash from bad home_node value
  block: Revert v5.0 blk_mq_request_issue_directly() changes
  paride/pcd: Fix potential NULL pointer dereference and mem leak
  blk-mq: do not reset plug->rq_count before the list is sorted
  paride/pf: Fix potential NULL pointer dereference
  io_uring: fix double free in case of fileset regitration failure
  blk-mq: add trace block plug and unplug for multiple queues
  block: use blk_free_flush_queue() to free hctx->fq in blk_mq_init_hctx
  block/bfq: fix ifdef for CONFIG_BFQ_GROUP_IOSCHED=y

5 years agoARM: milbeaut: fix build with !CONFIG_HOTPLUG_CPU
Arnd Bergmann [Wed, 13 Mar 2019 21:19:16 +0000 (22:19 +0100)]
ARM: milbeaut: fix build with !CONFIG_HOTPLUG_CPU

When HOTPLUG_CPU is disabled, some fields in the smp operations
are not available or needed:

arch/arm/mach-milbeaut/platsmp.c:90:3: error: field designator 'cpu_die' does not refer to any field in type
      'struct smp_operations'
        .cpu_die                = m10v_cpu_die,
         ^
arch/arm/mach-milbeaut/platsmp.c:91:3: error: field designator 'cpu_kill' does not refer to any field in type
      'struct smp_operations'
        .cpu_kill               = m10v_cpu_kill,
         ^

Hide them in an #ifdef like the other platforms do.

Fixes: 9fb29c734f9e ("ARM: milbeaut: Add basic support for Milbeaut m10v SoC")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Olof Johansson <olof@lixom.net>
5 years agoARM: iop: don't use using 64-bit DMA masks
Arnd Bergmann [Mon, 25 Mar 2019 15:50:43 +0000 (16:50 +0100)]
ARM: iop: don't use using 64-bit DMA masks

clang warns about statically defined DMA masks from the DMA_BIT_MASK
macro with length 64:

 arch/arm/mach-iop13xx/setup.c:303:35: error: shift count >= width of type [-Werror,-Wshift-count-overflow]
 static u64 iop13xx_adma_dmamask = DMA_BIT_MASK(64);
                                  ^~~~~~~~~~~~~~~~
 include/linux/dma-mapping.h:141:54: note: expanded from macro 'DMA_BIT_MASK'
 #define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL<<(n))-1))
                                                      ^ ~~~

The ones in iop shouldn't really be 64 bit masks, so changing them
to what the driver can support avoids the warning.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Olof Johansson <olof@lixom.net>
5 years agoARM: orion: don't use using 64-bit DMA masks
Arnd Bergmann [Mon, 25 Mar 2019 15:50:42 +0000 (16:50 +0100)]
ARM: orion: don't use using 64-bit DMA masks

clang warns about statically defined DMA masks from the DMA_BIT_MASK
macro with length 64:

arch/arm/plat-orion/common.c:625:29: error: shift count >= width of type [-Werror,-Wshift-count-overflow]
                .coherent_dma_mask      = DMA_BIT_MASK(64),
                                          ^~~~~~~~~~~~~~~~
include/linux/dma-mapping.h:141:54: note: expanded from macro 'DMA_BIT_MASK'
 #define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL<<(n))-1))

The ones in orion shouldn't really be 64 bit masks, so changing them
to what the driver can support avoids the warning.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Olof Johansson <olof@lixom.net>
5 years agoRevert "ARM: dts: nomadik: Fix polarity of SPI CS"
Olof Johansson [Sun, 7 Apr 2019 22:18:41 +0000 (15:18 -0700)]
Revert "ARM: dts: nomadik: Fix polarity of SPI CS"

This reverts commit fa9463564e77067df81b0b8dec91adbbbc47bfb4.

Per Linus Walleij:

Dear ARM SoC maintainers,

can you please revert this patch. It was the wrong solution to the
wrong problem, and I must have acted in stress. Andrey fixed the
real bug in a proper way in these commits:

commit e5545c94e43b8f6599ffc01df8d1aedf18ee912a
"gpio: of: Check propname before applying "cs-gpios" quirks"
commit 7ce40277bf848391705011ba37eac2e377cbd9e6
"gpio: of: Check for "spi-cs-high" in child instead of parent node"

Signed-off-by: Olof Johansson <olof@lixom.net>
5 years agoMerge tag 'omap-for-v5.1/fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel...
Olof Johansson [Sun, 7 Apr 2019 22:16:38 +0000 (15:16 -0700)]
Merge tag 'omap-for-v5.1/fixes-signed' of git://git./linux/kernel/git/tmlind/linux-omap into arm/fixes

Fixes for omaps for v5.1-rc cycle

Few small fixes for omap variants:

- Fix ams-delta gpio IDs
- Add missing of_node_put for omapdss platform init code
- Fix unconfigured audio regulators for two am335x boards
- Fix use of wrong offset for am335x d_can clocks

* tag 'omap-for-v5.1/fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
  ARM: dts: Fix dcan clkctrl clock for am3
  ARM: dts: am335x-evmsk: Correct the regulators for the audio codec
  ARM: dts: am335x-evm: Correct the regulators for the audio codec
  ARM: OMAP2+: add missing of_node_put after of_device_is_available
  ARM: OMAP1: ams-delta: Fix broken GPIO ID allocation

Signed-off-by: Olof Johansson <olof@lixom.net>
5 years agoMerge tag 'at91-5.1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/at91...
Olof Johansson [Sun, 7 Apr 2019 22:16:09 +0000 (15:16 -0700)]
Merge tag 'at91-5.1-fixes' of git://git./linux/kernel/git/at91/linux into arm/fixes

AT91 fixes for 5.1

- fix a typo in sama5d2 pinmuxing which concerns the ISC data 0 signal
- fix a kobject reference leak

* tag 'at91-5.1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/at91/linux:
  arm/mach-at91/pm : fix possible object reference leak
  ARM: dts: at91: Fix typo in ISC_D0 on PC9

Signed-off-by: Olof Johansson <olof@lixom.net>
5 years agoMerge tag 'v5.1-rockchip-dtfixes-1' of git://git.kernel.org/pub/scm/linux/kernel...
Olof Johansson [Sun, 7 Apr 2019 22:15:31 +0000 (15:15 -0700)]
Merge tag 'v5.1-rockchip-dtfixes-1' of git://git./linux/kernel/git/mmind/linux-rockchip into arm/fixes

Fixes for dtc warnings, fixes for ethernet transfers on rk3328,
sd-card related fixes on both rk3328 ans rk3288-tinker and a
regulator fix on rock64 and making ddc actually work on the
Rock PI 4 due to missing the ddc bus.

* tag 'v5.1-rockchip-dtfixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip:
  ARM: dts: rockchip: Remove #address/#size-cells from rk3288-veyron gpio-keys
  ARM: dts: rockchip: Remove #address/#size-cells from rk3288 mipi_dsi
  ARM: dts: rockchip: Fix gpu opp node names for rk3288
  arm64: dts: rockchip: fix rk3328 sdmmc0 write errors
  arm64: dts: rockchip: fix rk3328 rgmii high tx error rate
  ARM: dts: rockchip: Fix SD card detection on rk3288-tinker
  arm64: dts: rockchip: Fix vcc_host1_5v GPIO polarity on rk3328-rock64
  ARM: dts: rockchip: fix rk3288 cpu opp node reference
  arm64: dts: rockchip: add DDC bus on Rock Pi 4
  arm64: dts: rockchip: fix rk3328-roc-cc gmac2io tx/rx_delay

Signed-off-by: Olof Johansson <olof@lixom.net>
5 years agoMerge tag 'stratix10_fix_for_v5.1' of git://git.kernel.org/pub/scm/linux/kernel/git...
Olof Johansson [Sun, 7 Apr 2019 22:14:30 +0000 (15:14 -0700)]
Merge tag 'stratix10_fix_for_v5.1' of git://git./linux/kernel/git/dinguyen/linux into arm/fixes

arm64: dts: stratix10: fix emac loading warning
- Add missing "altr,sysmgr-syscon" property to all gmac nodes

* tag 'stratix10_fix_for_v5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux:
  arm64: dts: stratix10: add the sysmgr-syscon property from the gmac's

Signed-off-by: Olof Johansson <olof@lixom.net>
5 years agoMerge tag 'reset-fixes-for-v5.1' of git://git.pengutronix.de/pza/linux into arm/fixes
Olof Johansson [Sun, 7 Apr 2019 22:14:00 +0000 (15:14 -0700)]
Merge tag 'reset-fixes-for-v5.1' of git://git.pengutronix.de/pza/linux into arm/fixes

Reset controller fixes for v5.1

This tag adds missing USB PHY reset lines to the Meson G12A reset
controller header and fixes the Meson Audio ARB driver to prevent
module unloading while it is in use.

* tag 'reset-fixes-for-v5.1' of git://git.pengutronix.de/pza/linux:
  reset: meson-audio-arb: Fix missing .owner setting of reset_controller_dev
  dt-bindings: reset: meson-g12a: Add missing USB2 PHY resets

Signed-off-by: Olof Johansson <olof@lixom.net>
5 years agodt-bindings: cpu: Fix JSON schema
Maxime Ripard [Mon, 18 Mar 2019 10:05:21 +0000 (11:05 +0100)]
dt-bindings: cpu: Fix JSON schema

Commit fd73403a4862 ("dt-bindings: arm: Add SMP enable-method for
Milbeaut") added support for a new cpu enable-method, but did so using
tabulations to ident. This is however invalid in the syntax, and resulted
in a failure when trying to use that schemas for validation.

Use spaces instead of tabs to indent to fix this.

Fixes: fd73403a4862 ("dt-bindings: arm: Add SMP enable-method for Milbeaut")
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Acked-by: Sugaya Taichi <sugaya.taichi@socionext.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
5 years agoMerge tag 'for-linus-5.1b-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 7 Apr 2019 16:12:10 +0000 (06:12 -1000)]
Merge tag 'for-linus-5.1b-rc4-tag' of git://git./linux/kernel/git/xen/tip

Pull xen fixes from Juergen Gross:
 "One minor fix and a small cleanup for the xen privcmd driver"

* tag 'for-linus-5.1b-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen: Prevent buffer overflow in privcmd ioctl
  xen: use struct_size() helper in kzalloc()

5 years agoMerge tag 'mtd/fixes-for-5.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 7 Apr 2019 16:07:20 +0000 (06:07 -1000)]
Merge tag 'mtd/fixes-for-5.1-rc4' of git://git./linux/kernel/git/mtd/linux

Pull MTD fix from Richard Weinberger:
 "A single fix for a possible infinite loop in the cfi_cmdset_0002
  driver"

* tag 'mtd/fixes-for-5.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
  mtd: cfi: fix deadloop in cfi_cmdset_0002.c do_write_buffer

5 years agoMerge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds [Sun, 7 Apr 2019 16:00:35 +0000 (06:00 -1000)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Five small fixes. Four in three drivers: qedi, lpfc and storvsc. The
  final one is labelled core, but merely adds a dh rdac entry for Lenovo
  systems"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: lpfc: Fix missing wakeups on abort threads
  scsi: storvsc: Reduce default ring buffer size to 128 Kbytes
  scsi: storvsc: Fix calculation of sub-channel count
  scsi: core: add new RDAC LENOVO/DE_Series device
  scsi: qedi: remove declaration of nvm_image from stack

5 years agoMerge branch 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 6 Apr 2019 21:52:59 +0000 (11:52 -1000)]
Merge branch 'i2c/for-current-fixed' of git://git./linux/kernel/git/wsa/linux

Pull i2c fix from Wolfram Sang:
 "A simple but wanted driver bugfix"

* 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: imx: don't leak the i2c adapter on error

5 years agoMerge branch 'parisc-5.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller...
Linus Torvalds [Sat, 6 Apr 2019 20:59:30 +0000 (10:59 -1000)]
Merge branch 'parisc-5.1-2' of git://git./linux/kernel/git/deller/parisc-linux

Pull parisc fixes from Helge Deller:
 "A 32-bit boot regression fix introduced in the merge window, a QEMU
  detection fix and two fixes by Sven regarding ptrace & kprobes"

* 'parisc-5.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Detect QEMU earlier in boot process
  parisc: also set iaoq_b in instruction_pointer_set()
  parisc: regs_return_value() should return gpr28
  Revert: parisc: Use F_EXTEND() macro in iosapic code

5 years agoparisc: Detect QEMU earlier in boot process
Helge Deller [Tue, 2 Apr 2019 10:13:27 +0000 (12:13 +0200)]
parisc: Detect QEMU earlier in boot process

While adding LASI support to QEMU, I noticed that the QEMU detection in
the kernel happens much too late. For example, when a LASI chip is found
by the kernel, it registers the LASI LED driver as well.  But when we
run on QEMU it makes sense to avoid spending unnecessary CPU cycles, so
we need to access the running_on_QEMU flag earlier than before.

This patch now makes the QEMU detection the fist task of the Linux
kernel by moving it to where the kernel enters the C-coding.

Fixes: 310d82784fb4 ("parisc: qemu idle sleep support")
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # v4.14+
5 years agoparisc: also set iaoq_b in instruction_pointer_set()
Sven Schnelle [Thu, 4 Apr 2019 16:16:04 +0000 (18:16 +0200)]
parisc: also set iaoq_b in instruction_pointer_set()

When setting the instruction pointer on PA-RISC we also need
to set the back of the instruction queue to the new offset, otherwise
we will execute on instruction from the new location, and jumping
back to the old location stored in iaoq_b.

Signed-off-by: Sven Schnelle <svens@stackframe.org>
Signed-off-by: Helge Deller <deller@gmx.de>
Fixes: 75ebedf1d263 ("parisc: Add HAVE_REGS_AND_STACK_ACCESS_API feature")
Cc: stable@vger.kernel.org # 4.19+
5 years agoparisc: regs_return_value() should return gpr28
Sven Schnelle [Thu, 4 Apr 2019 16:16:03 +0000 (18:16 +0200)]
parisc: regs_return_value() should return gpr28

While working on kretprobes for PA-RISC I was wondering while the
kprobes sanity test always fails on kretprobes. This is caused by
returning gpr20 instead of gpr28.

Signed-off-by: Sven Schnelle <svens@stackframe.org>
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # 4.14+
5 years agoRevert: parisc: Use F_EXTEND() macro in iosapic code
Helge Deller [Mon, 18 Mar 2019 21:56:15 +0000 (22:56 +0100)]
Revert: parisc: Use F_EXTEND() macro in iosapic code

Revert parts of commit 97d7e2e3fd8a ("parisc: Use F_EXTEND() macro in
iosapic code"). It breaks booting the 32-bit kernel on some machines.

Reported-by: Sven Schnelle <svens@stackframe.org>
Tested-by: Sven Schnelle <svens@stackframe.org>
Fixes: 97d7e2e3fd8a ("parisc: Use F_EXTEND() macro in iosapic code")
Signed-off-by: Helge Deller <deller@gmx.de>
5 years agofs: stream_open - opener for stream-like files so that read and write can run simulta...
Kirill Smelkov [Tue, 26 Mar 2019 22:20:43 +0000 (22:20 +0000)]
fs: stream_open - opener for stream-like files so that read and write can run simultaneously without deadlock

Commit 9c225f2655e3 ("vfs: atomic f_pos accesses as per POSIX") added
locking for file.f_pos access and in particular made concurrent read and
write not possible - now both those functions take f_pos lock for the
whole run, and so if e.g. a read is blocked waiting for data, write will
deadlock waiting for that read to complete.

This caused regression for stream-like files where previously read and
write could run simultaneously, but after that patch could not do so
anymore. See e.g. commit 581d21a2d02a ("xenbus: fix deadlock on writes
to /proc/xen/xenbus") which fixes such regression for particular case of
/proc/xen/xenbus.

The patch that added f_pos lock in 2014 did so to guarantee POSIX thread
safety for read/write/lseek and added the locking to file descriptors of
all regular files. In 2014 that thread-safety problem was not new as it
was already discussed earlier in 2006.

However even though 2006'th version of Linus's patch was adding f_pos
locking "only for files that are marked seekable with FMODE_LSEEK (thus
avoiding the stream-like objects like pipes and sockets)", the 2014
version - the one that actually made it into the tree as 9c225f2655e3 -
is doing so irregardless of whether a file is seekable or not.

See

    https://lore.kernel.org/lkml/53022DB1.4070805@gmail.com/
    https://lwn.net/Articles/180387
    https://lwn.net/Articles/180396

for historic context.

The reason that it did so is, probably, that there are many files that
are marked non-seekable, but e.g. their read implementation actually
depends on knowing current position to correctly handle the read. Some
examples:

kernel/power/user.c snapshot_read
fs/debugfs/file.c u32_array_read
fs/fuse/control.c fuse_conn_waiting_read + ...
drivers/hwmon/asus_atk0110.c atk_debugfs_ggrp_read
arch/s390/hypfs/inode.c hypfs_read_iter
...

Despite that, many nonseekable_open users implement read and write with
pure stream semantics - they don't depend on passed ppos at all. And for
those cases where read could wait for something inside, it creates a
situation similar to xenbus - the write could be never made to go until
read is done, and read is waiting for some, potentially external, event,
for potentially unbounded time -> deadlock.

Besides xenbus, there are 14 such places in the kernel that I've found
with semantic patch (see below):

drivers/xen/evtchn.c:667:8-24: ERROR: evtchn_fops: .read() can deadlock .write()
drivers/isdn/capi/capi.c:963:8-24: ERROR: capi_fops: .read() can deadlock .write()
drivers/input/evdev.c:527:1-17: ERROR: evdev_fops: .read() can deadlock .write()
drivers/char/pcmcia/cm4000_cs.c:1685:7-23: ERROR: cm4000_fops: .read() can deadlock .write()
net/rfkill/core.c:1146:8-24: ERROR: rfkill_fops: .read() can deadlock .write()
drivers/s390/char/fs3270.c:488:1-17: ERROR: fs3270_fops: .read() can deadlock .write()
drivers/usb/misc/ldusb.c:310:1-17: ERROR: ld_usb_fops: .read() can deadlock .write()
drivers/hid/uhid.c:635:1-17: ERROR: uhid_fops: .read() can deadlock .write()
net/batman-adv/icmp_socket.c:80:1-17: ERROR: batadv_fops: .read() can deadlock .write()
drivers/media/rc/lirc_dev.c:198:1-17: ERROR: lirc_fops: .read() can deadlock .write()
drivers/leds/uleds.c:77:1-17: ERROR: uleds_fops: .read() can deadlock .write()
drivers/input/misc/uinput.c:400:1-17: ERROR: uinput_fops: .read() can deadlock .write()
drivers/infiniband/core/user_mad.c:985:7-23: ERROR: umad_fops: .read() can deadlock .write()
drivers/gnss/core.c:45:1-17: ERROR: gnss_fops: .read() can deadlock .write()

In addition to the cases above another regression caused by f_pos
locking is that now FUSE filesystems that implement open with
FOPEN_NONSEEKABLE flag, can no longer implement bidirectional
stream-like files - for the same reason as above e.g. read can deadlock
write locking on file.f_pos in the kernel.

FUSE's FOPEN_NONSEEKABLE was added in 2008 in a7c1b990f715 ("fuse:
implement nonseekable open") to support OSSPD. OSSPD implements /dev/dsp
in userspace with FOPEN_NONSEEKABLE flag, with corresponding read and
write routines not depending on current position at all, and with both
read and write being potentially blocking operations:

See

    https://github.com/libfuse/osspd
    https://lwn.net/Articles/308445

    https://github.com/libfuse/osspd/blob/14a9cff0/osspd.c#L1406
    https://github.com/libfuse/osspd/blob/14a9cff0/osspd.c#L1438-L1477
    https://github.com/libfuse/osspd/blob/14a9cff0/osspd.c#L1479-L1510

Corresponding libfuse example/test also describes FOPEN_NONSEEKABLE as
"somewhat pipe-like files ..." with read handler not using offset.
However that test implements only read without write and cannot exercise
the deadlock scenario:

    https://github.com/libfuse/libfuse/blob/fuse-3.4.2-3-ga1bff7d/example/poll.c#L124-L131
    https://github.com/libfuse/libfuse/blob/fuse-3.4.2-3-ga1bff7d/example/poll.c#L146-L163
    https://github.com/libfuse/libfuse/blob/fuse-3.4.2-3-ga1bff7d/example/poll.c#L209-L216

I've actually hit the read vs write deadlock for real while implementing
my FUSE filesystem where there is /head/watch file, for which open
creates separate bidirectional socket-like stream in between filesystem
and its user with both read and write being later performed
simultaneously. And there it is semantically not easy to split the
stream into two separate read-only and write-only channels:

    https://lab.nexedi.com/kirr/wendelin.core/blob/f13aa600/wcfs/wcfs.go#L88-169

Let's fix this regression. The plan is:

1. We can't change nonseekable_open to include &~FMODE_ATOMIC_POS -
   doing so would break many in-kernel nonseekable_open users which
   actually use ppos in read/write handlers.

2. Add stream_open() to kernel to open stream-like non-seekable file
   descriptors. Read and write on such file descriptors would never use
   nor change ppos. And with that property on stream-like files read and
   write will be running without taking f_pos lock - i.e. read and write
   could be running simultaneously.

3. With semantic patch search and convert to stream_open all in-kernel
   nonseekable_open users for which read and write actually do not
   depend on ppos and where there is no other methods in file_operations
   which assume @offset access.

4. Add FOPEN_STREAM to fs/fuse/ and open in-kernel file-descriptors via
   steam_open if that bit is present in filesystem open reply.

   It was tempting to change fs/fuse/ open handler to use stream_open
   instead of nonseekable_open on just FOPEN_NONSEEKABLE flags, but
   grepping through Debian codesearch shows users of FOPEN_NONSEEKABLE,
   and in particular GVFS which actually uses offset in its read and
   write handlers

https://codesearch.debian.net/search?q=-%3Enonseekable+%3D
https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfusedaemon.c#L1080
https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfusedaemon.c#L1247-1346
https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfusedaemon.c#L1399-1481

   so if we would do such a change it will break a real user.

5. Add stream_open and FOPEN_STREAM handling to stable kernels starting
   from v3.14+ (the kernel where 9c225f2655 first appeared).

   This will allow to patch OSSPD and other FUSE filesystems that
   provide stream-like files to return FOPEN_STREAM | FOPEN_NONSEEKABLE
   in their open handler and this way avoid the deadlock on all kernel
   versions. This should work because fs/fuse/ ignores unknown open
   flags returned from a filesystem and so passing FOPEN_STREAM to a
   kernel that is not aware of this flag cannot hurt. In turn the kernel
   that is not aware of FOPEN_STREAM will be < v3.14 where just
   FOPEN_NONSEEKABLE is sufficient to implement streams without read vs
   write deadlock.

This patch adds stream_open, converts /proc/xen/xenbus to it and adds
semantic patch to automatically locate in-kernel places that are either
required to be converted due to read vs write deadlock, or that are just
safe to be converted because read and write do not use ppos and there
are no other funky methods in file_operations.

Regarding semantic patch I've verified each generated change manually -
that it is correct to convert - and each other nonseekable_open instance
left - that it is either not correct to convert there, or that it is not
converted due to current stream_open.cocci limitations.

The script also does not convert files that should be valid to convert,
but that currently have .llseek = noop_llseek or generic_file_llseek for
unknown reason despite file being opened with nonseekable_open (e.g.
drivers/input/mousedev.c)

Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Yongzhi Pan <panyongzhi@gmail.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Tejun Heo <tj@kernel.org>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Julia Lawall <Julia.Lawall@lip6.fr>
Cc: Nikolaus Rath <Nikolaus@rath.org>
Cc: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Kirill Smelkov <kirr@nexedi.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoxsysace: Fix error handling in ace_setup
Guenter Roeck [Tue, 19 Feb 2019 16:49:56 +0000 (08:49 -0800)]
xsysace: Fix error handling in ace_setup

If xace hardware reports a bad version number, the error handling code
in ace_setup() calls put_disk(), followed by queue cleanup. However, since
the disk data structure has the queue pointer set, put_disk() also
cleans and releases the queue. This results in blk_cleanup_queue()
accessing an already released data structure, which in turn may result
in a crash such as the following.

[   10.681671] BUG: Kernel NULL pointer dereference at 0x00000040
[   10.681826] Faulting instruction address: 0xc0431480
[   10.682072] Oops: Kernel access of bad area, sig: 11 [#1]
[   10.682251] BE PAGE_SIZE=4K PREEMPT Xilinx Virtex440
[   10.682387] Modules linked in:
[   10.682528] CPU: 0 PID: 1 Comm: swapper Tainted: G        W         5.0.0-rc6-next-20190218+ #2
[   10.682733] NIP:  c0431480 LR: c043147c CTR: c0422ad8
[   10.682863] REGS: cf82fbe0 TRAP: 0300   Tainted: G        W          (5.0.0-rc6-next-20190218+)
[   10.683065] MSR:  00029000 <CE,EE,ME>  CR: 22000222  XER: 00000000
[   10.683236] DEAR: 00000040 ESR: 00000000
[   10.683236] GPR00: c043147c cf82fc90 cf82ccc0 00000000 00000000 00000000 00000002 00000000
[   10.683236] GPR08: 00000000 00000000 c04310bc 00000000 22000222 00000000 c0002c54 00000000
[   10.683236] GPR16: 00000000 00000001 c09aa39c c09021b0 c09021dc 00000007 c0a68c08 00000000
[   10.683236] GPR24: 00000001 ced6d400 ced6dcf0 c0815d9c 00000000 00000000 00000000 cedf0800
[   10.684331] NIP [c0431480] blk_mq_run_hw_queue+0x28/0x114
[   10.684473] LR [c043147c] blk_mq_run_hw_queue+0x24/0x114
[   10.684602] Call Trace:
[   10.684671] [cf82fc90] [c043147c] blk_mq_run_hw_queue+0x24/0x114 (unreliable)
[   10.684854] [cf82fcc0] [c04315bc] blk_mq_run_hw_queues+0x50/0x7c
[   10.685002] [cf82fce0] [c0422b24] blk_set_queue_dying+0x30/0x68
[   10.685154] [cf82fcf0] [c0423ec0] blk_cleanup_queue+0x34/0x14c
[   10.685306] [cf82fd10] [c054d73c] ace_probe+0x3dc/0x508
[   10.685445] [cf82fd50] [c052d740] platform_drv_probe+0x4c/0xb8
[   10.685592] [cf82fd70] [c052abb0] really_probe+0x20c/0x32c
[   10.685728] [cf82fda0] [c052ae58] driver_probe_device+0x68/0x464
[   10.685877] [cf82fdc0] [c052b500] device_driver_attach+0xb4/0xe4
[   10.686024] [cf82fde0] [c052b5dc] __driver_attach+0xac/0xfc
[   10.686161] [cf82fe00] [c0528428] bus_for_each_dev+0x80/0xc0
[   10.686314] [cf82fe30] [c0529b3c] bus_add_driver+0x144/0x234
[   10.686457] [cf82fe50] [c052c46c] driver_register+0x88/0x15c
[   10.686610] [cf82fe60] [c09de288] ace_init+0x4c/0xac
[   10.686742] [cf82fe80] [c0002730] do_one_initcall+0xac/0x330
[   10.686888] [cf82fee0] [c09aafd0] kernel_init_freeable+0x34c/0x478
[   10.687043] [cf82ff30] [c0002c6c] kernel_init+0x18/0x114
[   10.687188] [cf82ff40] [c000f2f0] ret_from_kernel_thread+0x14/0x1c
[   10.687349] Instruction dump:
[   10.687435] 3863ffd4 4bfffd70 9421ffd0 7c0802a6 93c10028 7c9e2378 93e1002c 38810008
[   10.687637] 7c7f1b78 90010034 4bfffc25 813f008c <8129004075290100 4182002c 80810008
[   10.688056] ---[ end trace 13c9ff51d41b9d40 ]---

Fix the problem by setting the disk queue pointer to NULL before calling
put_disk(). A more comprehensive fix might be to rearrange the code
to check the hardware version before initializing data structures,
but I don't know if this would have undesirable side effects, and
it would increase the complexity of backporting the fix to older kernels.

Fixes: 74489a91dd43a ("Add support for Xilinx SystemACE CompactFlash interface")
Acked-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years agonull_blk: prevent crash from bad home_node value
John Pittman [Fri, 5 Apr 2019 21:42:45 +0000 (17:42 -0400)]
null_blk: prevent crash from bad home_node value

At module load, if the selected home_node value is greater than
the available numa nodes, the system will crash in
__alloc_pages_nodemask() due to a bad paging request.  Prevent this
user error crash by detecting the bad value, logging an error, and
setting g_home_node back to the default of NUMA_NO_NODE.

Signed-off-by: John Pittman <jpittman@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
5 years agoMerge tag 'rtc-5.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux
Linus Torvalds [Sat, 6 Apr 2019 16:26:36 +0000 (06:26 -1000)]
Merge tag 'rtc-5.1-2' of git://git./linux/kernel/git/abelloni/linux

Pull RTC fixes from Alexandre Belloni:

 - Various alarm fixes for da9063, cros-ec and sh

 - sd3078 manufacturer name fix as this was introduced this cycle

* tag 'rtc-5.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux:
  rtc: da9063: set uie_unsupported when relevant
  rtc: sd3078: fix manufacturer name
  rtc: sh: Fix invalid alarm warning for non-enabled alarm
  rtc: cros-ec: Fail suspend/resume if wake IRQ can't be configured

5 years agoi2c: imx: don't leak the i2c adapter on error
Laurentiu Tudor [Mon, 1 Apr 2019 10:14:37 +0000 (13:14 +0300)]
i2c: imx: don't leak the i2c adapter on error

Make sure to free the i2c adapter on the error exit path.

Signed-off-by: Laurentiu Tudor <laurentiu.tudor@nxp.com>
Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Fixes: e1ab9a468e3b ("i2c: imx: improve the error handling in i2c_imx_dma_request()")
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
5 years agoMerge branch 'akpm' (patches from Andrew)
Linus Torvalds [Sat, 6 Apr 2019 03:08:55 +0000 (17:08 -1000)]
Merge branch 'akpm' (patches from Andrew)

Merge misc fixes from Andrew Morton:
 "14 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  kernel/sysctl.c: fix out-of-bounds access when setting file-max
  mm/util.c: fix strndup_user() comment
  sh: fix multiple function definition build errors
  MAINTAINERS: add maintainer and replacing reviewer ARM/NUVOTON NPCM
  MAINTAINERS: fix bad pattern in ARM/NUVOTON NPCM
  mm: writeback: use exact memcg dirty counts
  psi: clarify the units used in pressure files
  mm/huge_memory.c: fix modifying of page protection by insert_pfn_pmd()
  hugetlbfs: fix memory leak for resv_map
  mm: fix vm_fault_t cast in VM_FAULT_GET_HINDEX()
  lib/lzo: fix bugs for very short or empty input
  include/linux/bitrev.h: fix constant bitrev
  kmemleak: powerpc: skip scanning holes in the .bss section
  lib/string.c: implement a basic bcmp

5 years agokernel/sysctl.c: fix out-of-bounds access when setting file-max
Will Deacon [Sat, 6 Apr 2019 01:39:38 +0000 (18:39 -0700)]
kernel/sysctl.c: fix out-of-bounds access when setting file-max

Commit 32a5ad9c2285 ("sysctl: handle overflow for file-max") hooked up
min/max values for the file-max sysctl parameter via the .extra1 and
.extra2 fields in the corresponding struct ctl_table entry.

Unfortunately, the minimum value points at the global 'zero' variable,
which is an int.  This results in a KASAN splat when accessed as a long
by proc_doulongvec_minmax on 64-bit architectures:

  | BUG: KASAN: global-out-of-bounds in __do_proc_doulongvec_minmax+0x5d8/0x6a0
  | Read of size 8 at addr ffff2000133d1c20 by task systemd/1
  |
  | CPU: 0 PID: 1 Comm: systemd Not tainted 5.1.0-rc3-00012-g40b114779944 #2
  | Hardware name: linux,dummy-virt (DT)
  | Call trace:
  |  dump_backtrace+0x0/0x228
  |  show_stack+0x14/0x20
  |  dump_stack+0xe8/0x124
  |  print_address_description+0x60/0x258
  |  kasan_report+0x140/0x1a0
  |  __asan_report_load8_noabort+0x18/0x20
  |  __do_proc_doulongvec_minmax+0x5d8/0x6a0
  |  proc_doulongvec_minmax+0x4c/0x78
  |  proc_sys_call_handler.isra.19+0x144/0x1d8
  |  proc_sys_write+0x34/0x58
  |  __vfs_write+0x54/0xe8
  |  vfs_write+0x124/0x3c0
  |  ksys_write+0xbc/0x168
  |  __arm64_sys_write+0x68/0x98
  |  el0_svc_common+0x100/0x258
  |  el0_svc_handler+0x48/0xc0
  |  el0_svc+0x8/0xc
  |
  | The buggy address belongs to the variable:
  |  zero+0x0/0x40
  |
  | Memory state around the buggy address:
  |  ffff2000133d1b00: 00 00 00 00 00 00 00 00 fa fa fa fa 04 fa fa fa
  |  ffff2000133d1b80: fa fa fa fa 04 fa fa fa fa fa fa fa 04 fa fa fa
  | >ffff2000133d1c00: fa fa fa fa 04 fa fa fa fa fa fa fa 00 00 00 00
  |                                ^
  |  ffff2000133d1c80: fa fa fa fa 00 fa fa fa fa fa fa fa 00 00 00 00
  |  ffff2000133d1d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

Fix the splat by introducing a unsigned long 'zero_ul' and using that
instead.

Link: http://lkml.kernel.org/r/20190403153409.17307-1-will.deacon@arm.com
Fixes: 32a5ad9c2285 ("sysctl: handle overflow for file-max")
Signed-off-by: Will Deacon <will.deacon@arm.com>
Acked-by: Christian Brauner <christian@brauner.io>
Cc: Kees Cook <keescook@chromium.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Matteo Croce <mcroce@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm/util.c: fix strndup_user() comment
Andrew Morton [Sat, 6 Apr 2019 01:39:34 +0000 (18:39 -0700)]
mm/util.c: fix strndup_user() comment

The kerneldoc misdescribes strndup_user()'s return value.

Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Timur Tabi <timur@freescale.com>
Cc: Mihai Caraman <mihai.caraman@freescale.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agosh: fix multiple function definition build errors
Randy Dunlap [Sat, 6 Apr 2019 01:39:30 +0000 (18:39 -0700)]
sh: fix multiple function definition build errors

Many of the sh CPU-types have their own plat_irq_setup() and
arch_init_clk_ops() functions, so these same (empty) functions in
arch/sh/boards/of-generic.c are not needed and cause build errors.

If there is some case where these empty functions are needed, they can
be retained by marking them as "__weak" while at the same time making
builds that do not need them succeed.

Fixes these build errors:

arch/sh/boards/of-generic.o: In function `plat_irq_setup':
(.init.text+0x134): multiple definition of `plat_irq_setup'
arch/sh/kernel/cpu/sh2/setup-sh7619.o:(.init.text+0x30): first defined here
arch/sh/boards/of-generic.o: In function `arch_init_clk_ops':
(.init.text+0x118): multiple definition of `arch_init_clk_ops'
arch/sh/kernel/cpu/sh2/clock-sh7619.o:(.init.text+0x0): first defined here

Link: http://lkml.kernel.org/r/9ee4e0c5-f100-86a2-bd4d-1d3287ceab31@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kbuild test robot <lkp@intel.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoMAINTAINERS: add maintainer and replacing reviewer ARM/NUVOTON NPCM
Tomer Maimon [Sat, 6 Apr 2019 01:39:26 +0000 (18:39 -0700)]
MAINTAINERS: add maintainer and replacing reviewer ARM/NUVOTON NPCM

Add Tali Perry as Nuvoton NPCM maintainer, replace Brendan Higgins
Nuvoton NPCM reviewer with Benjamin Fair.

Link: http://lkml.kernel.org/r/20190328235752.334462-2-tmaimon77@gmail.com
Signed-off-by: Tomer Maimon <tmaimon77@gmail.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Reviewed-by: Benjamin Fair <benjaminfair@google.com>
Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
Cc: Joe Perches <joe@perches.com>
Cc: Avi Fishman <avifishman70@gmail.com>
Cc: Patrick Venture <venture@google.com>
Cc: Nancy Yuen <yuenn@google.com>
Cc: Tali Perry <tali.perry1@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoMAINTAINERS: fix bad pattern in ARM/NUVOTON NPCM
Tomer Maimon [Sat, 6 Apr 2019 01:39:22 +0000 (18:39 -0700)]
MAINTAINERS: fix bad pattern in ARM/NUVOTON NPCM

In the process of upstreaming architecture support for ARM/NUVOTON NPCM
include/dt-bindings/clock/nuvoton,npcm7xx-clks.h was renamed
include/dt-bindings/clock/nuvoton,npcm7xx-clock.h without updating
MAINTAINERS.  This updates the MAINTAINERS pattern to match the new name
of this file.

Link: http://lkml.kernel.org/r/20190328235752.334462-1-tmaimon77@gmail.com
Fixes: 6a498e06ba22 ("MAINTAINERS: Add entry for the Nuvoton NPCM architecture")
Signed-off-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Tomer Maimon <tmaimon77@gmail.com>
Reported-by: Joe Perches <joe@perches.com>
Reviewed-by: Benjamin Fair <benjaminfair@google.com>
Cc: Avi Fishman <avifishman70@gmail.com>
Cc: Mukesh Ojha <mojha@codeaurora.org>
Cc: Nancy Yuen <yuenn@google.com>
Cc: Patrick Venture <venture@google.com>
Cc: Tali Perry <tali.perry1@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm: writeback: use exact memcg dirty counts
Greg Thelen [Sat, 6 Apr 2019 01:39:18 +0000 (18:39 -0700)]
mm: writeback: use exact memcg dirty counts

Since commit a983b5ebee57 ("mm: memcontrol: fix excessive complexity in
memory.stat reporting") memcg dirty and writeback counters are managed
as:

 1) per-memcg per-cpu values in range of [-32..32]

 2) per-memcg atomic counter

When a per-cpu counter cannot fit in [-32..32] it's flushed to the
atomic.  Stat readers only check the atomic.  Thus readers such as
balance_dirty_pages() may see a nontrivial error margin: 32 pages per
cpu.

Assuming 100 cpus:
   4k x86 page_size:  13 MiB error per memcg
  64k ppc page_size: 200 MiB error per memcg

Considering that dirty+writeback are used together for some decisions the
errors double.

This inaccuracy can lead to undeserved oom kills.  One nasty case is
when all per-cpu counters hold positive values offsetting an atomic
negative value (i.e.  per_cpu[*]=32, atomic=n_cpu*-32).
balance_dirty_pages() only consults the atomic and does not consider
throttling the next n_cpu*32 dirty pages.  If the file_lru is in the
13..200 MiB range then there's absolutely no dirty throttling, which
burdens vmscan with only dirty+writeback pages thus resorting to oom
kill.

It could be argued that tiny containers are not supported, but it's more
subtle.  It's the amount the space available for file lru that matters.
If a container has memory.max-200MiB of non reclaimable memory, then it
will also suffer such oom kills on a 100 cpu machine.

The following test reliably ooms without this patch.  This patch avoids
oom kills.

  $ cat test
  mount -t cgroup2 none /dev/cgroup
  cd /dev/cgroup
  echo +io +memory > cgroup.subtree_control
  mkdir test
  cd test
  echo 10M > memory.max
  (echo $BASHPID > cgroup.procs && exec /memcg-writeback-stress /foo)
  (echo $BASHPID > cgroup.procs && exec dd if=/dev/zero of=/foo bs=2M count=100)

  $ cat memcg-writeback-stress.c
  /*
   * Dirty pages from all but one cpu.
   * Clean pages from the non dirtying cpu.
   * This is to stress per cpu counter imbalance.
   * On a 100 cpu machine:
   * - per memcg per cpu dirty count is 32 pages for each of 99 cpus
   * - per memcg atomic is -99*32 pages
   * - thus the complete dirty limit: sum of all counters 0
   * - balance_dirty_pages() only sees atomic count -99*32 pages, which
   *   it max()s to 0.
   * - So a workload can dirty -99*32 pages before balance_dirty_pages()
   *   cares.
   */
  #define _GNU_SOURCE
  #include <err.h>
  #include <fcntl.h>
  #include <sched.h>
  #include <stdlib.h>
  #include <stdio.h>
  #include <sys/stat.h>
  #include <sys/sysinfo.h>
  #include <sys/types.h>
  #include <unistd.h>

  static char *buf;
  static int bufSize;

  static void set_affinity(int cpu)
  {
   cpu_set_t affinity;

   CPU_ZERO(&affinity);
   CPU_SET(cpu, &affinity);
   if (sched_setaffinity(0, sizeof(affinity), &affinity))
   err(1, "sched_setaffinity");
  }

  static void dirty_on(int output_fd, int cpu)
  {
   int i, wrote;

   set_affinity(cpu);
   for (i = 0; i < 32; i++) {
   for (wrote = 0; wrote < bufSize; ) {
   int ret = write(output_fd, buf+wrote, bufSize-wrote);
   if (ret == -1)
   err(1, "write");
   wrote += ret;
   }
   }
  }

  int main(int argc, char **argv)
  {
   int cpu, flush_cpu = 1, output_fd;
   const char *output;

   if (argc != 2)
   errx(1, "usage: output_file");

   output = argv[1];
   bufSize = getpagesize();
   buf = malloc(getpagesize());
   if (buf == NULL)
   errx(1, "malloc failed");

   output_fd = open(output, O_CREAT|O_RDWR);
   if (output_fd == -1)
   err(1, "open(%s)", output);

   for (cpu = 0; cpu < get_nprocs(); cpu++) {
   if (cpu != flush_cpu)
   dirty_on(output_fd, cpu);
   }

   set_affinity(flush_cpu);
   if (fsync(output_fd))
   err(1, "fsync(%s)", output);
   if (close(output_fd))
   err(1, "close(%s)", output);
   free(buf);
  }

Make balance_dirty_pages() and wb_over_bg_thresh() work harder to
collect exact per memcg counters.  This avoids the aforementioned oom
kills.

This does not affect the overhead of memory.stat, which still reads the
single atomic counter.

Why not use percpu_counter? memcg already handles cpus going offline, so
no need for that overhead from percpu_counter.  And the percpu_counter
spinlocks are more heavyweight than is required.

It probably also makes sense to use exact dirty and writeback counters
in memcg oom reports.  But that is saved for later.

Link: http://lkml.kernel.org/r/20190329174609.164344-1-gthelen@google.com
Signed-off-by: Greg Thelen <gthelen@google.com>
Reviewed-by: Roman Gushchin <guro@fb.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: <stable@vger.kernel.org> [4.16+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agopsi: clarify the units used in pressure files
Waiman Long [Sat, 6 Apr 2019 01:39:14 +0000 (18:39 -0700)]
psi: clarify the units used in pressure files

The output of the PSI files show a bunch of numbers with no unit.  The
psi.txt documentation file also does not indicate what units are used.
One can only find out by looking at the source code.  The units are
percentage for the averages and useconds for the total.  Make the
information easier to find by documenting the units in psi.txt.

Link: http://lkml.kernel.org/r/20190402193810.3450-1-longman@redhat.com
Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm/huge_memory.c: fix modifying of page protection by insert_pfn_pmd()
Aneesh Kumar K.V [Sat, 6 Apr 2019 01:39:10 +0000 (18:39 -0700)]
mm/huge_memory.c: fix modifying of page protection by insert_pfn_pmd()

With some architectures like ppc64, set_pmd_at() cannot cope with a
situation where there is already some (different) valid entry present.

Use pmdp_set_access_flags() instead to modify the pfn which is built to
deal with modifying existing PMD entries.

This is similar to commit cae85cb8add3 ("mm/memory.c: fix modifying of
page protection by insert_pfn()")

We also do similar update w.r.t insert_pfn_pud eventhough ppc64 don't
support pud pfn entries now.

Without this patch we also see the below message in kernel log "BUG:
non-zero pgtables_bytes on freeing mm:"

Link: http://lkml.kernel.org/r/20190402115125.18803-1-aneesh.kumar@linux.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reported-by: Chandan Rajendra <chandan@linux.ibm.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agohugetlbfs: fix memory leak for resv_map
Mike Kravetz [Sat, 6 Apr 2019 01:39:06 +0000 (18:39 -0700)]
hugetlbfs: fix memory leak for resv_map

When mknod is used to create a block special file in hugetlbfs, it will
allocate an inode and kmalloc a 'struct resv_map' via resv_map_alloc().
inode->i_mapping->private_data will point the newly allocated resv_map.
However, when the device special file is opened bd_acquire() will set
inode->i_mapping to bd_inode->i_mapping.  Thus the pointer to the
allocated resv_map is lost and the structure is leaked.

Programs to reproduce:
        mount -t hugetlbfs nodev hugetlbfs
        mknod hugetlbfs/dev b 0 0
        exec 30<> hugetlbfs/dev
        umount hugetlbfs/

resv_map structures are only needed for inodes which can have associated
page allocations.  To fix the leak, only allocate resv_map for those
inodes which could possibly be associated with page allocations.

Link: http://lkml.kernel.org/r/20190401213101.16476-1-mike.kravetz@oracle.com
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Reported-by: Yufen Yu <yuyufen@huawei.com>
Suggested-by: Yufen Yu <yuyufen@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agomm: fix vm_fault_t cast in VM_FAULT_GET_HINDEX()
Jann Horn [Sat, 6 Apr 2019 01:39:01 +0000 (18:39 -0700)]
mm: fix vm_fault_t cast in VM_FAULT_GET_HINDEX()

Symmetrically to VM_FAULT_SET_HINDEX(), we need a force-cast in
VM_FAULT_GET_HINDEX() to tell sparse that this is intentional.

Sparse complains about the current code when building a kernel with
CONFIG_MEMORY_FAILURE:

  arch/x86/mm/fault.c:1058:53: warning: restricted vm_fault_t degrades to integer

Link: http://lkml.kernel.org/r/20190327204117.35215-1-jannh@google.com
Fixes: 3d3539018d2c ("mm: create the new vm_fault_t type")
Signed-off-by: Jann Horn <jannh@google.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Souptick Joarder <jrdr.linux@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agolib/lzo: fix bugs for very short or empty input
Dave Rodgman [Sat, 6 Apr 2019 01:38:58 +0000 (18:38 -0700)]
lib/lzo: fix bugs for very short or empty input

For very short input data (0 - 1 bytes), lzo-rle was not behaving
correctly.  Fix this behaviour and update documentation accordingly.

For zero-length input, lzo v0 outputs an end-of-stream marker only,
which was misinterpreted by lzo-rle as a bitstream version number.
Ensure bitstream versions > 0 require a minimum stream length of 5.

Also fixes a bug in handling the tail for very short inputs when a
bitstream version is present.

Link: http://lkml.kernel.org/r/20190326165857.34613-1-dave.rodgman@arm.com
Signed-off-by: Dave Rodgman <dave.rodgman@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoinclude/linux/bitrev.h: fix constant bitrev
Arnd Bergmann [Sat, 6 Apr 2019 01:38:53 +0000 (18:38 -0700)]
include/linux/bitrev.h: fix constant bitrev

clang points out with hundreds of warnings that the bitrev macros have a
problem with constant input:

  drivers/hwmon/sht15.c:187:11: error: variable '__x' is uninitialized when used within its own initialization
        [-Werror,-Wuninitialized]
          u8 crc = bitrev8(data->val_status & 0x0F);
                   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  include/linux/bitrev.h:102:21: note: expanded from macro 'bitrev8'
          __constant_bitrev8(__x) :                       \
          ~~~~~~~~~~~~~~~~~~~^~~~
  include/linux/bitrev.h:67:11: note: expanded from macro '__constant_bitrev8'
          u8 __x = x;                     \
             ~~~   ^

Both the bitrev and the __constant_bitrev macros use an internal
variable named __x, which goes horribly wrong when passing one to the
other.

The obvious fix is to rename one of the variables, so this adds an extra
'_'.

It seems we got away with this because

 - there are only a few drivers using bitrev macros

 - usually there are no constant arguments to those

 - when they are constant, they tend to be either 0 or (unsigned)-1
   (drivers/isdn/i4l/isdnhdlc.o, drivers/iio/amplifiers/ad8366.c) and
   give the correct result by pure chance.

In fact, the only driver that I could find that gets different results
with this is drivers/net/wan/slic_ds26522.c, which in turn is a driver
for fairly rare hardware (adding the maintainer to Cc for testing).

Link: http://lkml.kernel.org/r/20190322140503.123580-1-arnd@arndb.de
Fixes: 556d2f055bf6 ("ARM: 8187/1: add CONFIG_HAVE_ARCH_BITREVERSE to support rbit instruction")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Cc: Zhao Qiang <qiang.zhao@nxp.com>
Cc: Yalin Wang <yalin.wang@sonymobile.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agokmemleak: powerpc: skip scanning holes in the .bss section
Catalin Marinas [Sat, 6 Apr 2019 01:38:49 +0000 (18:38 -0700)]
kmemleak: powerpc: skip scanning holes in the .bss section

Commit 2d4f567103ff ("KVM: PPC: Introduce kvm_tmp framework") adds
kvm_tmp[] into the .bss section and then free the rest of unused spaces
back to the page allocator.

kernel_init
  kvm_guest_init
    kvm_free_tmp
      free_reserved_area
        free_unref_page
          free_unref_page_prepare

With DEBUG_PAGEALLOC=y, it will unmap those pages from kernel.  As the
result, kmemleak scan will trigger a panic when it scans the .bss
section with unmapped pages.

This patch creates dedicated kmemleak objects for the .data, .bss and
potentially .data..ro_after_init sections to allow partial freeing via
the kmemleak_free_part() in the powerpc kvm_free_tmp() function.

Link: http://lkml.kernel.org/r/20190321171917.62049-1-catalin.marinas@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Qian Cai <cai@lca.pw>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Tested-by: Qian Cai <cai@lca.pw>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Avi Kivity <avi@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krcmar <rkrcmar@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agolib/string.c: implement a basic bcmp
Nick Desaulniers [Sat, 6 Apr 2019 01:38:45 +0000 (18:38 -0700)]
lib/string.c: implement a basic bcmp

A recent optimization in Clang (r355672) lowers comparisons of the
return value of memcmp against zero to comparisons of the return value
of bcmp against zero.  This helps some platforms that implement bcmp
more efficiently than memcmp.  glibc simply aliases bcmp to memcmp, but
an optimized implementation is in the works.

This results in linkage failures for all targets with Clang due to the
undefined symbol.  For now, just implement bcmp as a tailcail to memcmp
to unbreak the build.  This routine can be further optimized in the
future.

Other ideas discussed:

 * A weak alias was discussed, but breaks for architectures that define
   their own implementations of memcmp since aliases to declarations are
   not permitted (only definitions). Arch-specific memcmp
   implementations typically declare memcmp in C headers, but implement
   them in assembly.

 * -ffreestanding also is used sporadically throughout the kernel.

 * -fno-builtin-bcmp doesn't work when doing LTO.

Link: https://bugs.llvm.org/show_bug.cgi?id=41035
Link: https://code.woboq.org/userspace/glibc/string/memcmp.c.html#bcmp
Link: https://github.com/llvm/llvm-project/commit/8e16d73346f8091461319a7dfc4ddd18eedcff13
Link: https://github.com/ClangBuiltLinux/linux/issues/416
Link: http://lkml.kernel.org/r/20190313211335.165605-1-ndesaulniers@google.com
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reported-by: Nathan Chancellor <natechancellor@gmail.com>
Reported-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Suggested-by: James Y Knight <jyknight@google.com>
Suggested-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Suggested-by: Nathan Chancellor <natechancellor@gmail.com>
Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: David Laight <David.Laight@ACULAB.COM>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agoMerge tag 'for-5.1/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device...
Linus Torvalds [Sat, 6 Apr 2019 01:34:33 +0000 (15:34 -1000)]
Merge tag 'for-5.1/dm-fixes' of git://git./linux/kernel/git/device-mapper/linux-dm

Pull device mapper fixes from Mike Snitzer:

 - Two queue_limits stacking fixes: disable discards if underlying
   driver does. And propagate BDI_CAP_STABLE_WRITES to fix sporadic
   checksum errors.

 - Fix that reverts a DM core limit that wasn't needed given that
   dm-crypt was already updated to impose an equivalent limit.

 - Fix dm-init to properly establish 'const' for __initconst array.

 - Fix deadlock in DM integrity target that occurs when overlapping IO
   is being issued to it. And two smaller fixes to the DM integrity
   target.

* tag 'for-5.1/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm integrity: fix deadlock with overlapping I/O
  dm: disable DISCARD if the underlying storage no longer supports it
  dm table: propagate BDI_CAP_STABLE_WRITES to fix sporadic checksum errors
  dm: revert 8f50e358153d ("dm: limit the max bio size as BIO_MAX_PAGES * PAGE_SIZE")
  dm init: fix const confusion for dm_allowed_targets array
  dm integrity: make dm_integrity_init and dm_integrity_exit static
  dm integrity: change memcmp to strncmp in dm_integrity_ctr

5 years agoMerge tag 'vfio-v5.1-rc4' of git://github.com/awilliam/linux-vfio
Linus Torvalds [Sat, 6 Apr 2019 01:07:28 +0000 (15:07 -1000)]
Merge tag 'vfio-v5.1-rc4' of git://github.com/awilliam/linux-vfio

Pull VFIO fixes from Alex Williamson:

 - Fix clang printk format errors (Louis Taylor)

 - Declare structure static to fix sparse warning (Wang Hai)

 - Limit user DMA mappings per container (CVE-2019-3882) (Alex
   Williamson)

* tag 'vfio-v5.1-rc4' of git://github.com/awilliam/linux-vfio:
  vfio/type1: Limit DMA mappings per container
  vfio/spapr_tce: Make symbol 'tce_iommu_driver_ops' static
  vfio/pci: use correct format characters

5 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Fri, 5 Apr 2019 23:43:07 +0000 (13:43 -1000)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "x86 fixes for overflows and other nastiness"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: nVMX: fix x2APIC VTPR read intercept
  KVM: x86: nVMX: close leak of L0's x2APIC MSRs (CVE-2019-3887)
  KVM: SVM: prevent DBG_DECRYPT and DBG_ENCRYPT overflow
  kvm: svm: fix potential get_num_contig_pages overflow

5 years agoMerge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Linus Torvalds [Fri, 5 Apr 2019 23:36:45 +0000 (13:36 -1000)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux

Pull arm64 fix from Catalin Marinas:
 "Fix unwind_frame() in the context of pseudo NMI"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: fix wrong check of on_sdei_stack in nmi context