platform/kernel/linux-rpi.git
4 years agopowerpc/cell: Use fallthrough;
Joe Perches [Wed, 11 Mar 2020 04:51:31 +0000 (21:51 -0700)]
powerpc/cell: Use fallthrough;

Convert the various uses of fallthrough comments to fallthrough;

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/03073a9a269010ca439e9e658629c44602b0cc9f.1583896348.git.joe@perches.com
4 years agopowerpc/sstep: Fix DS operand in ld encoding to appropriate value
Balamuruhan S [Wed, 11 Mar 2020 10:24:05 +0000 (15:54 +0530)]
powerpc/sstep: Fix DS operand in ld encoding to appropriate value

ld instruction should have 14 bit immediate field (DS) concatenated
with 0b00 on the right, encode it accordingly. Introduce macro
`IMM_DS()` to encode DS form instructions with 14 bit immediate field.

Fixes: 4ceae137bdab ("powerpc: emulate_step() tests for load/store instructions")
Reviewed-by: Sandipan Das <sandipan@linux.ibm.com>
Signed-off-by: Balamuruhan S <bala24@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200311102405.392263-1-bala24@linux.ibm.com
4 years agopowerpc/pseries: Fix of_read_drc_info_cell() to point at next record
Tyrel Datwyler [Sat, 7 Mar 2020 02:45:47 +0000 (20:45 -0600)]
powerpc/pseries: Fix of_read_drc_info_cell() to point at next record

The expectation is that when calling of_read_drc_info_cell()
repeatedly to parse multiple drc-info records that the in/out curval
parameter points at the start of the next record on return. However,
the current behavior has curval still pointing at the final value of
the record just parsed. The result of which is that if the
ibm,drc-info property contains multiple properties the parsed value
of the drc_type for any record after the first has the power_domain
value of the previous record appended to the type string.

eg: observed the following 0xffffffff prepended to PHB

  drc-info: type: \xff\xff\xff\xffPHB, prefix: PHB , index_start: 0x20000001
  drc-info: suffix_start: 1, sequential_elems: 3072, sequential_inc: 1
  drc-info: power-domain: 0xffffffff, last_index: 0x20000c00

In practice PHBs are the only type of connector in the ibm,drc-info
property that has multiple records. So, it breaks PHB hotplug, but by
chance not PCI, CPU, slot, or memory because they happen to only ever
be a single record.

Fix by incrementing curval past the power_domain value to point at
drc_type string of next record.

Fixes: e83636ac3334 ("pseries/drc-info: Search DRC properties for CPU indexes")
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Acked-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200307024547.5748-1-tyreld@linux.ibm.com
4 years agoselftests/powerpc: Don't rely on segfault to rerun the test
Gustavo Luiz Duarte [Tue, 11 Feb 2020 03:38:31 +0000 (00:38 -0300)]
selftests/powerpc: Don't rely on segfault to rerun the test

The test case tm-signal-context-force-tm expects a segfault to happen
on returning from signal handler, and then does a setcontext() to run
the test again. However, the test doesn't always segfault, causing the
test to run a single time.

This patch fixes the test by putting it within a loop and jumping, via
setcontext, just prior to the loop in case it segfaults. This way we
get the desired behavior (run the test COUNT_MAX times) regardless if
it segfaults or not. This also reduces the use of setcontext for
control flow logic, keeping it only in the segfault handler.

Also, since 'count' is changed within the signal handler, it is
declared as volatile to prevent any compiler optimization getting
confused with asynchronous changes.

Signed-off-by: Gustavo Luiz Duarte <gustavold@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200211033831.11165-3-gustavold@linux.ibm.com
4 years agoselftests/powerpc: Add tm-signal-pagefault test
Gustavo Luiz Duarte [Tue, 11 Feb 2020 03:38:30 +0000 (00:38 -0300)]
selftests/powerpc: Add tm-signal-pagefault test

This test triggers a TM Bad Thing by raising a signal in transactional state
and forcing a pagefault to happen in kernelspace when the kernel signal
handling code first touches the user signal stack.

This is inspired by the test tm-signal-context-force-tm but uses userfaultfd to
make the test deterministic. While this test always triggers the bug in one
run, I had to execute tm-signal-context-force-tm several times (the test runs
5000 times each execution) to trigger the same bug.

tm-signal-context-force-tm is kept instead of replaced because, while this test
is more reliable and triggers the same bug, tm-signal-context-force-tm has a
better coverage, in the sense that by running the test several times it might
trigger the pagefault and/or be preempted at different places.

v3: skip test if userfaultfd is unavailable.

Signed-off-by: Gustavo Luiz Duarte <gustavold@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200211033831.11165-2-gustavold@linux.ibm.com
4 years agopowerpc/kuap: PPC_KUAP_DEBUG should depend on PPC_KUAP
Michael Ellerman [Sun, 1 Mar 2020 11:17:38 +0000 (22:17 +1100)]
powerpc/kuap: PPC_KUAP_DEBUG should depend on PPC_KUAP

Currently you can enable PPC_KUAP_DEBUG when PPC_KUAP is disabled,
even though the former has not effect without the latter.

Fix it so that PPC_KUAP_DEBUG can only be enabled when PPC_KUAP is
enabled, not when the platform could support KUAP (PPC_HAVE_KUAP).

Fixes: 890274c2dc4c ("powerpc/64s: Implement KUAP for Radix MMU")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200301111738.22497-1-mpe@ellerman.id.au
4 years agoselftests/powerpc: Add a test of sigreturn vs VDSO
Michael Ellerman [Wed, 4 Mar 2020 11:04:02 +0000 (22:04 +1100)]
selftests/powerpc: Add a test of sigreturn vs VDSO

There's two different paths through the sigreturn code, depending on
whether the VDSO is mapped or not. We recently discovered a bug in the
unmapped case, because it's not commonly used these days.

So add a test that sends itself a signal, then moves the VDSO, takes
another signal and finally unmaps the VDSO before sending itself
another signal. That tests the standard signal path, the code that
handles the VDSO being moved, and also the signal path in the case
where the VDSO is unmapped.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200304110402.6038-1-mpe@ellerman.id.au
4 years agopowerpc/lib: Fix emulate_step() std test
Nicholas Piggin [Wed, 26 Feb 2020 05:53:02 +0000 (15:53 +1000)]
powerpc/lib: Fix emulate_step() std test

We should be checking that the instruction was stepped *and* that the
target register has the right value.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
[mpe: Write change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200226055302.1577954-1-npiggin@gmail.com
4 years agopowerpc/64s/radix: Fix CONFIG_SMP=n build
Nicholas Piggin [Mon, 2 Mar 2020 01:04:10 +0000 (11:04 +1000)]
powerpc/64s/radix: Fix CONFIG_SMP=n build

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200302010410.2957362-1-npiggin@gmail.com
4 years agoselftests/powerpc: Add tlbie_test in .gitignore
Christophe Leroy [Fri, 28 Feb 2020 00:00:09 +0000 (00:00 +0000)]
selftests/powerpc: Add tlbie_test in .gitignore

The commit identified below added tlbie_test but forgot to add it in
.gitignore.

Fixes: 93cad5f78995 ("selftests/powerpc: Add test case for tlbie vs mtpidr ordering issue")
Cc: stable@vger.kernel.org # v5.4+
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/259f9c06ed4563c4fa4fa8ffa652347278d769e7.1582847784.git.christophe.leroy@c-s.fr
4 years agopowerpc/pmac/smp: Drop unnecessary volatile qualifier
YueHaibing [Tue, 3 Mar 2020 08:56:04 +0000 (16:56 +0800)]
powerpc/pmac/smp: Drop unnecessary volatile qualifier

core99_l2_cache/core99_l3_cache do not need to be marked as volatile,
remove it.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200303085604.24952-1-yuehaibing@huawei.com
4 years agopowerpc/pmac/smp: Avoid unused-variable warnings
Ilie Halip [Fri, 20 Sep 2019 15:39:51 +0000 (18:39 +0300)]
powerpc/pmac/smp: Avoid unused-variable warnings

When building with ppc64_defconfig, the compiler reports
that these 2 variables are not used:
    warning: unused variable 'core99_l2_cache' [-Wunused-variable]
    warning: unused variable 'core99_l3_cache' [-Wunused-variable]

They are only used when CONFIG_PPC64 is not defined. Move
them into a section which does the same macro check.

Reported-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Ilie Halip <ilie.halip@gmail.com>
[mpe: Move them into core99_init_caches() which is their only user]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190920153951.25762-1-ilie.halip@gmail.com
4 years agopowerpc/fsl_booke: Avoid creating duplicate tlb1 entry
Laurentiu Tudor [Thu, 23 Jan 2020 11:19:25 +0000 (11:19 +0000)]
powerpc/fsl_booke: Avoid creating duplicate tlb1 entry

In the current implementation, the call to loadcam_multi() is wrapped
between switch_to_as1() and restore_to_as0() calls so, when it tries
to create its own temporary AS=1 TLB1 entry, it ends up duplicating
the existing one created by switch_to_as1(). Add a check to skip
creating the temporary entry if already running in AS=1.

Fixes: d9e1831a4202 ("powerpc/85xx: Load all early TLB entries at once")
Cc: stable@vger.kernel.org # v4.4+
Signed-off-by: Laurentiu Tudor <laurentiu.tudor@nxp.com>
Acked-by: Scott Wood <oss@buserror.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200123111914.2565-1-laurentiu.tudor@nxp.com
4 years agotty: evh_bytechan: Fix out of bounds accesses
Stephen Rothwell [Thu, 9 Jan 2020 07:39:12 +0000 (18:39 +1100)]
tty: evh_bytechan: Fix out of bounds accesses

ev_byte_channel_send() assumes that its third argument is a 16 byte
array. Some places where it is called it may not be (or we can't
easily tell if it is). Newer compilers have started producing warnings
about this, so make sure we actually pass a 16 byte array.

There may be more elegant solutions to this, but the driver is quite
old and hasn't been updated in many years.

The warnings (from a powerpc allyesconfig build) are:

  In file included from include/linux/byteorder/big_endian.h:5,
                   from arch/powerpc/include/uapi/asm/byteorder.h:14,
                   from include/asm-generic/bitops/le.h:6,
                   from arch/powerpc/include/asm/bitops.h:250,
                   from include/linux/bitops.h:29,
                   from include/linux/kernel.h:12,
                   from include/asm-generic/bug.h:19,
                   from arch/powerpc/include/asm/bug.h:109,
                   from include/linux/bug.h:5,
                   from include/linux/mmdebug.h:5,
                   from include/linux/gfp.h:5,
                   from include/linux/slab.h:15,
                   from drivers/tty/ehv_bytechan.c:24:
  drivers/tty/ehv_bytechan.c: In function ‘ehv_bc_udbg_putc’:
  arch/powerpc/include/asm/epapr_hcalls.h:298:20: warning: array subscript 1 is outside array bounds of ‘const char[1]’ [-Warray-bounds]
    298 |  r6 = be32_to_cpu(p[1]);
  include/uapi/linux/byteorder/big_endian.h:40:51: note: in definition of macro ‘__be32_to_cpu’
     40 | #define __be32_to_cpu(x) ((__force __u32)(__be32)(x))
        |                                                   ^
  arch/powerpc/include/asm/epapr_hcalls.h:298:7: note: in expansion of macro ‘be32_to_cpu’
    298 |  r6 = be32_to_cpu(p[1]);
        |       ^~~~~~~~~~~
  drivers/tty/ehv_bytechan.c:166:13: note: while referencing ‘data’
    166 | static void ehv_bc_udbg_putc(char c)
        |             ^~~~~~~~~~~~~~~~

Fixes: dcd83aaff1c8 ("tty/powerpc: introduce the ePAPR embedded hypervisor byte channel driver")
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Tested-by: Laurentiu Tudor <laurentiu.tudor@nxp.com>
[mpe: Trim warnings from change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200109183912.5fcb52aa@canb.auug.org.au
4 years agocpufreq: powernv: Fix unsafe notifiers
Oliver O'Halloran [Thu, 6 Feb 2020 06:26:22 +0000 (17:26 +1100)]
cpufreq: powernv: Fix unsafe notifiers

The PowerNV cpufreq driver registers two notifiers: one to catch
throttle messages from the OCC and one to bump the CPU frequency back
to normal before a reboot. Both require the cpufreq driver to be
registered in order to function since the notifier callbacks use
various cpufreq_*() functions.

Right now we register both notifiers before we've initialised the
driver. This seems to work, but we should head off any protential
problems by registering the notifiers after the driver is initialised.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200206062622.28235-2-oohall@gmail.com
4 years agocpufreq: powernv: Fix use-after-free
Oliver O'Halloran [Thu, 6 Feb 2020 06:26:21 +0000 (17:26 +1100)]
cpufreq: powernv: Fix use-after-free

The cpufreq driver has a use-after-free that we can hit if:

a) There's an OCC message pending when the notifier is registered, and
b) The cpufreq driver fails to register with the core.

When a) occurs the notifier schedules a workqueue item to handle the
message. The backing work_struct is located on chips[].throttle and
when b) happens we clean up by freeing the array. Once we get to
the (now free) queued item and the kernel crashes.

Fixes: c5e29ea7ac14 ("cpufreq: powernv: Fix bugs in powernv_cpufreq_{init/exit}")
Cc: stable@vger.kernel.org # v4.6+
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200206062622.28235-1-oohall@gmail.com
4 years agopowerpc/vdso: remove deprecated VDS64_HAS_DESCRIPTORS references
Joe Lawrence [Mon, 24 Feb 2020 21:18:48 +0000 (16:18 -0500)]
powerpc/vdso: remove deprecated VDS64_HAS_DESCRIPTORS references

The original 2005 patch that introduced the powerpc vdso, pre-git
("ppc64: Implement a vDSO and use it for signal trampoline") notes that:

  ... symbols exposed by the vDSO aren't "normal" function symbols, apps
  can't be expected to link against them directly, the vDSO's are both
  seen as if they were linked at 0 and the symbols just contain offsets
  to the various functions.  This is done on purpose to avoid a
  relocation step (ppc64 functions normally have descriptors with abs
  addresses in them).  When glibc uses those functions, it's expected to
  use it's own trampolines that know how to reach them.

Despite that explanation, there remains dead #ifdef
VDS64_HAS_DESCRIPTORS code-blocks that provide alternate function
definitions that setup function descriptors.

Since VDS64_HAS_DESCRIPTORS has been unused for all these years, we
might as well finally remove it from the codebase.

Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200224211848.26087-1-joe.lawrence@redhat.com
4 years agopowerpc/32: Fix missing NULL pmd check in virt_to_kpte()
Christophe Leroy [Sat, 7 Mar 2020 10:09:15 +0000 (10:09 +0000)]
powerpc/32: Fix missing NULL pmd check in virt_to_kpte()

Commit 2efc7c085f05 ("powerpc/32: drop get_pteptr()"),
replaced get_pteptr() by virt_to_kpte(). But virt_to_kpte() lacks a
NULL pmd check and returns an invalid non NULL pointer when there
is no page table.

Reported-by: Nick Desaulniers <ndesaulniers@google.com>
Fixes: 2efc7c085f05 ("powerpc/32: drop get_pteptr()")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/b1177cdfc6af74a3e277bba5d9e708c4b3315ebe.1583575707.git.christophe.leroy@c-s.fr
4 years agoMerge branch 'fixes' into next
Michael Ellerman [Tue, 10 Mar 2020 04:16:42 +0000 (15:16 +1100)]
Merge branch 'fixes' into next

Merge in our fixes branch. In particular we want to merge the TM and KUAP fixes,
so we can add selftests for them in next.

4 years agopowerpc/mm: Fix missing KUAP disable in flush_coherent_icache()
Michael Ellerman [Tue, 3 Mar 2020 12:28:47 +0000 (23:28 +1100)]
powerpc/mm: Fix missing KUAP disable in flush_coherent_icache()

Stefan reported a strange kernel fault which turned out to be due to a
missing KUAP disable in flush_coherent_icache() called from
flush_icache_range().

The fault looks like:

  Kernel attempted to access user page (7fffc30d9c00) - exploit attempt? (uid: 1009)
  BUG: Unable to handle kernel data access on read at 0x7fffc30d9c00
  Faulting instruction address: 0xc00000000007232c
  Oops: Kernel access of bad area, sig: 11 [#1]
  LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV
  CPU: 35 PID: 5886 Comm: sigtramp Not tainted 5.6.0-rc2-gcc-8.2.0-00003-gfc37a1632d40 #79
  NIP:  c00000000007232c LR: c00000000003b7fc CTR: 0000000000000000
  REGS: c000001e11093940 TRAP: 0300   Not tainted  (5.6.0-rc2-gcc-8.2.0-00003-gfc37a1632d40)
  MSR:  900000000280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 28000884  XER: 00000000
  CFAR: c0000000000722fc DAR: 00007fffc30d9c00 DSISR: 08000000 IRQMASK: 0
  GPR00: c00000000003b7fc c000001e11093bd0 c0000000023ac200 00007fffc30d9c00
  GPR04: 00007fffc30d9c18 0000000000000000 c000001e11093bd4 0000000000000000
  GPR08: 0000000000000000 0000000000000001 0000000000000000 c000001e1104ed80
  GPR12: 0000000000000000 c000001fff6ab380 c0000000016be2d0 4000000000000000
  GPR16: c000000000000000 bfffffffffffffff 0000000000000000 0000000000000000
  GPR20: 00007fffc30d9c00 00007fffc30d8f58 00007fffc30d9c18 00007fffc30d9c20
  GPR24: 00007fffc30d9c18 0000000000000000 c000001e11093d90 c000001e1104ed80
  GPR28: c000001e11093e90 0000000000000000 c0000000023d9d18 00007fffc30d9c00
  NIP flush_icache_range+0x5c/0x80
  LR  handle_rt_signal64+0x95c/0xc2c
  Call Trace:
    0xc000001e11093d90 (unreliable)
    handle_rt_signal64+0x93c/0xc2c
    do_notify_resume+0x310/0x430
    ret_from_except_lite+0x70/0x74
  Instruction dump:
  409e002c 7c0802a6 3c62ff31 3863f6a0 f8010080 48195fed 60000000 48fe4c8d
  60000000 e8010080 7c0803a6 7c0004ac <7c00ffac7c0004ac 4c00012c 38210070

This path through handle_rt_signal64() to setup_trampoline() and
flush_icache_range() is only triggered by 64-bit processes that have
unmapped their VDSO, which is rare.

flush_icache_range() takes a range of addresses to flush. In
flush_coherent_icache() we implement an optimisation for CPUs where we
know we don't actually have to flush the whole range, we just need to
do a single icbi.

However we still execute the icbi on the user address of the start of
the range we're flushing. On CPUs that also implement KUAP (Power9)
that leads to the spurious fault above.

We should be able to pass any address, including a kernel address, to
the icbi on these CPUs, which would avoid any interaction with KUAP.
But I don't want to make that change in a bug fix, just in case it
surfaces some strange behaviour on some CPU.

So for now just disable KUAP around the icbi. Note the icbi is treated
as a load, so we allow read access, not write as you'd expect.

Fixes: 890274c2dc4c ("powerpc/64s: Implement KUAP for Radix MMU")
Cc: stable@vger.kernel.org # v5.2+
Reported-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200303235708.26004-1-mpe@ellerman.id.au
4 years agopowerpc/numa: Remove late request for home node associativity
Srikar Dronamraju [Wed, 29 Jan 2020 13:53:01 +0000 (19:23 +0530)]
powerpc/numa: Remove late request for home node associativity

With commit ("powerpc/numa: Early request for home node associativity"),
commit 2ea626306810 ("powerpc/topology: Get topology for shared
processors at boot") which was requesting home node associativity
becomes redundant.

Hence remove the late request for home node associativity.

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Reviewed-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200129135301.24739-6-srikar@linux.vnet.ibm.com
4 years agopowerpc/numa: Early request for home node associativity
Srikar Dronamraju [Wed, 29 Jan 2020 13:53:00 +0000 (19:23 +0530)]
powerpc/numa: Early request for home node associativity

Currently the kernel detects if its running on a shared lpar platform
and requests home node associativity before the scheduler sched_domains
are setup. However between the time NUMA setup is initialized and the
request for home node associativity, workqueue initializes its per node
cpumask. The per node workqueue possible cpumask may turn invalid
after home node associativity resulting in weird situations like
workqueue possible cpumask being a subset of workqueue online cpumask.

This can be fixed by requesting home node associativity earlier just
before NUMA setup. However at the NUMA setup time, kernel may not be in
a position to detect if its running on a shared lpar platform. So
request for home node associativity and if the request fails, fallback
on the device tree property.

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200129135301.24739-5-srikar@linux.vnet.ibm.com
4 years agopowerpc/numa: Use cpu node map of first sibling thread
Srikar Dronamraju [Wed, 29 Jan 2020 13:52:59 +0000 (19:22 +0530)]
powerpc/numa: Use cpu node map of first sibling thread

All the sibling threads of a core have to be part of the same node.
To ensure that all the sibling threads map to the same node, always
lookup/update the cpu-to-node map of the first thread in the core.

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Reviewed-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200129135301.24739-4-srikar@linux.vnet.ibm.com
4 years agopowerpc/numa: Handle extra hcall_vphn error cases
Srikar Dronamraju [Wed, 29 Jan 2020 13:52:58 +0000 (19:22 +0530)]
powerpc/numa: Handle extra hcall_vphn error cases

Currently code handles H_FUNCTION, H_SUCCESS, H_HARDWARE return codes.
However hcall_vphn can return other return codes. Now it also handles
H_PARAMETER return code.  Also the rest return codes are handled under the
default case.

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Reviewed-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200129135301.24739-3-srikar@linux.vnet.ibm.com
4 years agopowerpc/vphn: Check for error from hcall_vphn
Srikar Dronamraju [Wed, 29 Jan 2020 13:52:57 +0000 (19:22 +0530)]
powerpc/vphn: Check for error from hcall_vphn

There is no value in unpacking associativity, if
H_HOME_NODE_ASSOCIATIVITY hcall has returned an error.

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Reviewed-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200129135301.24739-2-srikar@linux.vnet.ibm.com
4 years agopowerpc/smp: Use nid as fallback for package_id
Srikar Dronamraju [Wed, 29 Jan 2020 13:51:21 +0000 (19:21 +0530)]
powerpc/smp: Use nid as fallback for package_id

package_id is to match cores that are part of the same chip. On
PowerNV machines, package_id defaults to chip_id. However ibm,chip_id
property is not present in device-tree of PowerVM LPARs. Hence lscpu
output shows one core per socket and multiple cores.

To overcome this, use nid as the package_id on PowerVM LPARs.

Before the patch:

  Architecture:        ppc64le
  Byte Order:          Little Endian
  CPU(s):              128
  On-line CPU(s) list: 0-127
  Thread(s) per core:  8
  Core(s) per socket:  1                     <----------------------
  Socket(s):           16                    <----------------------
  NUMA node(s):        2
  Model:               2.2 (pvr 004e 0202)
  Model name:          POWER9 (architected), altivec supported
  Hypervisor vendor:   pHyp
  Virtualization type: para
  L1d cache:           32K
  L1i cache:           32K
  L2 cache:            512K
  L3 cache:            10240K
  NUMA node0 CPU(s):   0-63
  NUMA node1 CPU(s):   64-127
  #
  # cat /sys/devices/system/cpu/cpu0/topology/physical_package_id
  -1

After the patch:

  Architecture:        ppc64le
  Byte Order:          Little Endian
  CPU(s):              128
  On-line CPU(s) list: 0-127
  Thread(s) per core:  8                     <---------------------
  Core(s) per socket:  8                     <---------------------
  Socket(s):           2
  NUMA node(s):        2
  Model:               2.2 (pvr 004e 0202)
  Model name:          POWER9 (architected), altivec supported
  Hypervisor vendor:   pHyp
  Virtualization type: para
  L1d cache:           32K
  L1i cache:           32K
  L2 cache:            512K
  L3 cache:            10240K
  NUMA node0 CPU(s):   0-63
  NUMA node1 CPU(s):   64-127
  #
  # cat /sys/devices/system/cpu/cpu0/topology/physical_package_id
  0

Now lscpu output is more in line with the system configuration.

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
[mpe: Use pkg_id instead of ppid, tweak change log and comment]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200129135121.24617-1-srikar@linux.vnet.ibm.com
4 years agopowerpc/irq: Use current_stack_pointer in do_IRQ()
Christophe Leroy [Thu, 20 Feb 2020 11:51:41 +0000 (22:51 +1100)]
powerpc/irq: Use current_stack_pointer in do_IRQ()

Until commit 7306e83ccf5c ("powerpc: Don't use CURRENT_THREAD_INFO to
find the stack"), the current stack base address was obtained by
calling current_thread_info(). That inline function was simply masking
out the value of r1.

In that commit, it was changed to using current_stack_pointer() (since
renamed current_stack_frame()), which is a heavier function as it is
an outline assembly function which cannot be inlined and which reads
the content of the stack at 0(r1).

Convert to using current_stack_pointer for geting r1 and masking out
its value to obtain the base address of the stack pointer as before.

Fixes: 7306e83ccf5c ("powerpc: Don't use CURRENT_THREAD_INFO to find the stack")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200220115141.2707-5-mpe@ellerman.id.au
4 years agopowerpc/irq: use IS_ENABLED() in check_stack_overflow()
Christophe Leroy [Thu, 20 Feb 2020 11:51:40 +0000 (22:51 +1100)]
powerpc/irq: use IS_ENABLED() in check_stack_overflow()

Instead of #ifdef, use IS_ENABLED(CONFIG_DEBUG_STACKOVERFLOW).
This enable GCC to check for code validity even when the option
is not selected.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200220115141.2707-4-mpe@ellerman.id.au
4 years agopowerpc/irq: Use current_stack_pointer in check_stack_overflow()
Christophe Leroy [Thu, 20 Feb 2020 11:51:39 +0000 (22:51 +1100)]
powerpc/irq: Use current_stack_pointer in check_stack_overflow()

The purpose of check_stack_overflow() is to verify that the stack has
not overflowed.

To really know whether the stack pointer is still within boundaries,
the check must be done directly on the value of r1.

So use current_stack_pointer, which returns the current value of r1,
rather than current_stack_frame() which causes a frame to be created
and then returns that value.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200220115141.2707-3-mpe@ellerman.id.au
4 years agopowerpc: Add current_stack_pointer as a register global
Christophe Leroy [Thu, 20 Feb 2020 11:51:38 +0000 (22:51 +1100)]
powerpc: Add current_stack_pointer as a register global

current_stack_frame() doesn't return the stack pointer, but the
caller's stack frame. See commit bfe9a2cfe91a ("powerpc: Reimplement
__get_SP() as a function not a define") and commit
acf620ecf56c ("powerpc: Rename __get_SP() to current_stack_pointer()")
for details.

In some cases this is overkill or incorrect, as it doesn't return the
current value of r1.

So add a current_stack_pointer register global to get the value of r1
directly.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
[mpe: Split out of other patch, tweak change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200220115141.2707-2-mpe@ellerman.id.au
4 years agopowerpc: Rename current_stack_pointer() to current_stack_frame()
Michael Ellerman [Thu, 20 Feb 2020 11:51:37 +0000 (22:51 +1100)]
powerpc: Rename current_stack_pointer() to current_stack_frame()

current_stack_pointer(), which was called __get_SP(), used to just
return the value in r1.

But that caused problems in some cases, so it was turned into a
function in commit bfe9a2cfe91a ("powerpc: Reimplement __get_SP() as a
function not a define").

Because it's a function in a separate compilation unit to all its
callers, it has the effect of causing a stack frame to be created, and
then returns the address of that frame. This is good in some cases
like those described in the above commit, but in other cases it's
overkill, we just need to know what stack page we're on.

On some other arches current_stack_pointer is just a register global
giving the stack pointer, and we'd like to do that too. So rename our
current_stack_pointer() to current_stack_frame() to make that
possible.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr>
Link: https://lore.kernel.org/r/20200220115141.2707-1-mpe@ellerman.id.au
4 years agopowerpc/kernel/sysfs: Add new config option PMU_SYSFS to enable PMU SPRs sysfs file...
Kajol Jain [Fri, 14 Feb 2020 08:06:06 +0000 (13:36 +0530)]
powerpc/kernel/sysfs: Add new config option PMU_SYSFS to enable PMU SPRs sysfs file creation

Many of the performance monitoring unit (PMU) SPRs are
exposed in the sysfs. This may not be a desirable since
"perf" API is the primary interface to program PMU and
collect counter data in the system. But that said, we
cant remove these sysfs files since we dont whether
anyone/anything is using them.

So the patch adds a new CONFIG option 'CONFIG_PMU_SYSFS'
(user selectable) to be used in sysfs file creation for
PMU SPRs. New option by default is disabled, but can be
enabled if user needs it.

Tested this patch behaviour in powernv and pseries machines.
Patch is also tested for pmac32_defconfig.

Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Tested-by: Nageswara R Sastry <nasastry@in.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200214080606.26872-2-kjain@linux.ibm.com
4 years agopowerpc/kernel/sysfs: Refactor current sysfs.c
Madhavan Srinivasan [Fri, 14 Feb 2020 08:06:05 +0000 (13:36 +0530)]
powerpc/kernel/sysfs: Refactor current sysfs.c

An attempt to refactor the current sysfs.c file.
To start with a big chuck of macro #defines and dscr
functions are moved to start of the file. Secondly,
HAS_ #define macros are cleanup based on CONFIG_ options

Finally new HAS_ macro added:
1. HAS_PPC_PA6T (for PA6T) to separate out non-PMU SPRs.
2. HAS_PPC_PMC56 to separate out PMC SPR's from HAS_PPC_PMC_CLASSIC
   which come under CONFIG_PPC64.

Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200214080606.26872-1-kjain@linux.ibm.com
4 years agopowerpc/powernv: Add explicit fast-reboot support
Oliver O'Halloran [Mon, 17 Feb 2020 02:48:33 +0000 (13:48 +1100)]
powerpc/powernv: Add explicit fast-reboot support

Add a way to manually invoke a fast-reboot rather than setting the NVRAM
flag. The idea is to allow userspace to invoke a fast-reboot using the
optional string argument to the reboot() system call, or using the xmon
zr command so we don't need to leave around a persistent changes on
a system to use the feature.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200217024833.30580-2-oohall@gmail.com
4 years agopowerpc/powernv: Treat an empty reboot string as default
Oliver O'Halloran [Mon, 17 Feb 2020 02:48:32 +0000 (13:48 +1100)]
powerpc/powernv: Treat an empty reboot string as default

Treat an empty reboot cmd string the same as a NULL string. This squashes a
spurious unsupported reboot message that sometimes gets out when using
xmon.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200217024833.30580-1-oohall@gmail.com
4 years agopowerpc/Makefile: Mark phony targets as PHONY
Michael Ellerman [Wed, 19 Feb 2020 00:04:34 +0000 (11:04 +1100)]
powerpc/Makefile: Mark phony targets as PHONY

Some of our phony targets are not marked as such. This can lead to
confusing errors, eg:

  $ make clean
  $ touch install
  $ make install
  make: 'install' is up to date.
  $

Fix it by adding them to the PHONY variable which is marked phony in
the top-level Makefile, or in scripts/Makefile.build for the boot
Makefile.

Suggested-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200219000434.15872-1-mpe@ellerman.id.au
4 years agopowerpc/mm: Don't kmap_atomic() in pte_offset_map() on PPC32
Christophe Leroy [Mon, 17 Feb 2020 09:41:35 +0000 (09:41 +0000)]
powerpc/mm: Don't kmap_atomic() in pte_offset_map() on PPC32

On PPC32, pte_offset_map() does a kmap_atomic() in order to support
page tables allocated in high memory, just like ARM and x86/32.

But since at least 2008 and commit 8054a3428fbe ("powerpc: Remove dead
CONFIG_HIGHPTE"), page tables are never allocated in high memory.

When the page is in low mem, kmap_atomic() just returns the page
address but still disable preemption and pagefault. And it is
not an inlined function, so we suffer function call for no reason.

Make pte_offset_map() the same as pte_offset_kernel() and make
pte_unmap() void, in the same way as PPC64 which doesn't have HIGHMEM.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/03c97f0f6b3790d164822563be80f2fd4713a955.1581932480.git.christophe.leroy@c-s.fr
4 years agopowerpc/book3s64: Fix error handling in mm_iommu_do_alloc()
Alexey Kardashevskiy [Mon, 23 Dec 2019 06:03:51 +0000 (17:03 +1100)]
powerpc/book3s64: Fix error handling in mm_iommu_do_alloc()

The last jump to free_exit in mm_iommu_do_alloc() happens after page
pointers in struct mm_iommu_table_group_mem_t were already converted to
physical addresses. Thus calling put_page() on these physical addresses
will likely crash.

This moves the loop which calculates the pageshift and converts page
struct pointers to physical addresses later after the point when
we cannot fail; thus eliminating the need to convert pointers back.

Fixes: eb9d7a62c386 ("powerpc/mm_iommu: Fix potential deadlock")
Reported-by: Jan Kara <jack@suse.cz>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191223060351.26359-1-aik@ozlabs.ru
4 years agopowerpc/powernv: no need to check return value of debugfs_create functions
Greg Kroah-Hartman [Sun, 9 Feb 2020 10:59:01 +0000 (11:59 +0100)]
powerpc/powernv: no need to check return value of debugfs_create functions

When calling debugfs functions, there is no need to ever check the
return value.  The function can work or not, but the code logic should
never do something different based on this.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200209105901.1620958-6-gregkh@linuxfoundation.org
4 years agopowerpc/cell/axon_msi: no need to check return value of debugfs_create functions
Greg Kroah-Hartman [Sun, 9 Feb 2020 10:59:00 +0000 (11:59 +0100)]
powerpc/cell/axon_msi: no need to check return value of debugfs_create functions

When calling debugfs functions, there is no need to ever check the
return value.  The function can work or not, but the code logic should
never do something different based on this.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200209105901.1620958-5-gregkh@linuxfoundation.org
4 years agopowerpc/mm: ptdump: no need to check return value of debugfs_create functions
Greg Kroah-Hartman [Sun, 9 Feb 2020 10:58:59 +0000 (11:58 +0100)]
powerpc/mm: ptdump: no need to check return value of debugfs_create functions

When calling debugfs functions, there is no need to ever check the
return value.  The function can work or not, but the code logic should
never do something different based on this.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200209105901.1620958-4-gregkh@linuxfoundation.org
4 years agopowerpc/mm: book3s64: hash_utils: no need to check return value of debugfs_create...
Greg Kroah-Hartman [Sun, 9 Feb 2020 10:58:58 +0000 (11:58 +0100)]
powerpc/mm: book3s64: hash_utils: no need to check return value of debugfs_create functions

When calling debugfs functions, there is no need to ever check the
return value.  The function can work or not, but the code logic should
never do something different based on this.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200209105901.1620958-3-gregkh@linuxfoundation.org
4 years agopowerpc/kvm: no need to check return value of debugfs_create functions
Greg Kroah-Hartman [Sun, 9 Feb 2020 10:58:57 +0000 (11:58 +0100)]
powerpc/kvm: no need to check return value of debugfs_create functions

When calling debugfs functions, there is no need to ever check the
return value.  The function can work or not, but the code logic should
never do something different based on this.

Because of this cleanup, we get to remove a few fields in struct
kvm_arch that are now unused.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[mpe: Fix build error in kvm/timing.c, adapt kvmppc_remove_cpu_debugfs()]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200209105901.1620958-2-gregkh@linuxfoundation.org
4 years agopowerpc/kernel: no need to check return value of debugfs_create functions
Greg Kroah-Hartman [Sun, 9 Feb 2020 10:58:56 +0000 (11:58 +0100)]
powerpc/kernel: no need to check return value of debugfs_create functions

When calling debugfs functions, there is no need to ever check the
return value.  The function can work or not, but the code logic should
never do something different based on this.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200209105901.1620958-1-gregkh@linuxfoundation.org
4 years agopowerpc/83xx: Add some error handling in 'quirk_mpc8360e_qe_enet10()'
Christophe JAILLET [Sat, 8 Feb 2020 14:09:20 +0000 (15:09 +0100)]
powerpc/83xx: Add some error handling in 'quirk_mpc8360e_qe_enet10()'

In some error handling path, we should call "of_node_put(np_par)" or
some resource may be leaking in case of error.

Fixes: 8159df72d43e ("83xx: add support for the kmeter1 board.")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Scott Wood <oss@buserror.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200208140920.7652-1-christophe.jaillet@wanadoo.fr
4 years agopowerpc/83xx: Fix some typo in some warning message
Christophe JAILLET [Sat, 8 Feb 2020 14:09:04 +0000 (15:09 +0100)]
powerpc/83xx: Fix some typo in some warning message

"couldn;t" should be "couldn't".

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Scott Wood <oss@buserror.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200208140904.7521-1-christophe.jaillet@wanadoo.fr
4 years agopowerpc: fix hardware PMU exception bug on PowerVM compatibility mode systems
Desnes A. Nunes do Rosario [Thu, 27 Feb 2020 13:47:15 +0000 (10:47 -0300)]
powerpc: fix hardware PMU exception bug on PowerVM compatibility mode systems

PowerVM systems running compatibility mode on a few Power8 revisions are
still vulnerable to the hardware defect that loses PMU exceptions arriving
prior to a context switch.

The software fix for this issue is enabled through the CPU_FTR_PMAO_BUG
cpu_feature bit, nevertheless this bit also needs to be set for PowerVM
compatibility mode systems.

Fixes: 68f2f0d431d9ea4 ("powerpc: Add a cpu feature CPU_FTR_PMAO_BUG")
Signed-off-by: Desnes A. Nunes do Rosario <desnesn@linux.ibm.com>
Reviewed-by: Leonardo Bras <leonardo@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200227134715.9715-1-desnesn@linux.ibm.com
4 years agopowerpc/32: drop get_pteptr()
Christophe Leroy [Thu, 9 Jan 2020 08:25:26 +0000 (08:25 +0000)]
powerpc/32: drop get_pteptr()

Commit 8d30c14cab30 ("powerpc/mm: Rework I$/D$ coherency (v3)") and
commit 90ac19a8b21b ("[POWERPC] Abolish iopa(), mm_ptov(),
io_block_mapping() from arch/powerpc") removed the use of get_pteptr()
outside of mm/pgtable_32.c

In mm/pgtable_32.c, the only user of get_pteptr() is change_page_attr()
which operates on kernel context and on lowmem pages only.

Make virt_to_kpte() available outside of mm/mem.c and use it instead
of get_pteptr(), and drop get_pteptr()

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/788378c6c3ba5c5298caab7c7f95e6c3c88244b8.1578558199.git.christophe.leroy@c-s.fr
4 years agopowerpc/32: refactor pmd_offset(pud_offset(pgd_offset...
Christophe Leroy [Thu, 9 Jan 2020 08:25:25 +0000 (08:25 +0000)]
powerpc/32: refactor pmd_offset(pud_offset(pgd_offset...

At several places pmd pointer is retrieved through the same action:

pmd = pmd_offset(pud_offset(pgd_offset(mm, addr), addr), addr);

or

pmd = pmd_offset(pud_offset(pgd_offset_k(addr), addr), addr);

Refactor this by implementing two helpers pmd_ptr() and pmd_ptr_k()

This will help when adding the p4d level.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/7b065c5be35726af4066cab238ee35cabceda1fa.1578558199.git.christophe.leroy@c-s.fr
4 years agopowerpc/32: don't restore r0, r6-r8 on exception entry path after trace_hardirqs_off()
Christophe Leroy [Tue, 7 Jan 2020 09:16:40 +0000 (09:16 +0000)]
powerpc/32: don't restore r0, r6-r8 on exception entry path after trace_hardirqs_off()

Since commit b86fb88855ea ("powerpc/32: implement fast entry for
syscalls on non BOOKE") and commit 1a4b739bbb4f ("powerpc/32:
implement fast entry for syscalls on BOOKE"), syscalls don't
use the exception entry path anymore. It is therefore pointless
to restore r0 and r6-r8 after calling trace_hardirqs_off().

In the meantime, drop the '2:' label which is unused and misleading.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/d2c6dc65d27e83964eb05f16a126161ab6455eea.1578388585.git.christophe.leroy@c-s.fr
4 years agopowerpc: Include .BTF section
Naveen N. Rao [Thu, 20 Feb 2020 11:31:32 +0000 (17:01 +0530)]
powerpc: Include .BTF section

Selecting CONFIG_DEBUG_INFO_BTF results in the below warning from ld:
  ld: warning: orphan section `.BTF' from `.btf.vmlinux.bin.o' being placed in section `.BTF'

Include .BTF section in vmlinux explicitly to fix the same.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200220113132.857132-1-naveen.n.rao@linux.vnet.ibm.com
4 years agopowerpc/watchpoint: Don't call dar_within_range() for Book3S
Ravi Bangoria [Sat, 22 Feb 2020 08:20:49 +0000 (13:50 +0530)]
powerpc/watchpoint: Don't call dar_within_range() for Book3S

DAR is set to the first byte of overlap between actual access and
watched range at DSI on Book3S processor. But actual access range
might or might not be within user asked range. So for Book3S, it
must not call dar_within_range().

This revert portion of commit 39413ae00967 ("powerpc/hw_breakpoints:
Rewrite 8xx breakpoints to allow any address range size.").

Before patch:
  # ./tools/testing/selftests/powerpc/ptrace/perf-hwbreak
  ...
  TESTED: No overlap
  FAILED: Partial overlap: 0 != 2
  TESTED: Partial overlap
  TESTED: No overlap
  FAILED: Full overlap: 0 != 2
  failure: perf_hwbreak

After patch:
  TESTED: No overlap
  TESTED: Partial overlap
  TESTED: Partial overlap
  TESTED: No overlap
  TESTED: Full overlap
  success: perf_hwbreak

Fixes: 39413ae00967 ("powerpc/hw_breakpoints: Rewrite 8xx breakpoints to allow any address range size.")
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200222082049.330435-1-ravi.bangoria@linux.ibm.com
4 years agopowerpc/32s: Slenderize _tlbia() for powerpc 603/603e
Christophe Leroy [Mon, 3 Feb 2020 16:47:37 +0000 (16:47 +0000)]
powerpc/32s: Slenderize _tlbia() for powerpc 603/603e

_tlbia() is a function used only on 603/603e core, ie on CPUs which
don't have a hash table.

_tlbia() uses the tlbia macro which implements a loop of 1024 tlbie.

On the 603/603e core, flushing the entire TLB requires no more than
32 tlbie.

Replace tlbia by a loop of 32 tlbie.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/12f4f4f0ff89aeab3b937fc96c84fb35e1b2517e.1580748445.git.christophe.leroy@c-s.fr
4 years agopowerpc/pseries: Avoid NULL pointer dereference when drmem is unavailable
Libor Pechacek [Fri, 31 Jan 2020 13:28:29 +0000 (14:28 +0100)]
powerpc/pseries: Avoid NULL pointer dereference when drmem is unavailable

In guests without hotplugagble memory drmem structure is only zero
initialized. Trying to manipulate DLPAR parameters results in a crash.

  $ echo "memory add count 1" > /sys/kernel/dlpar
  Oops: Kernel access of bad area, sig: 11 [#1]
  LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
  ...
  NIP:  c0000000000ff294 LR: c0000000000ff248 CTR: 0000000000000000
  REGS: c0000000fb9d3880 TRAP: 0300   Tainted: G            E      (5.5.0-rc6-2-default)
  MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 28242428  XER: 20000000
  CFAR: c0000000009a6c10 DAR: 0000000000000010 DSISR: 40000000 IRQMASK: 0
  ...
  NIP dlpar_memory+0x6e4/0xd00
  LR  dlpar_memory+0x698/0xd00
  Call Trace:
    dlpar_memory+0x698/0xd00 (unreliable)
    handle_dlpar_errorlog+0xc0/0x190
    dlpar_store+0x198/0x4a0
    kobj_attr_store+0x30/0x50
    sysfs_kf_write+0x64/0x90
    kernfs_fop_write+0x1b0/0x290
    __vfs_write+0x3c/0x70
    vfs_write+0xd0/0x260
    ksys_write+0xdc/0x130
    system_call+0x5c/0x68

Taking closer look at the code, I can see that for_each_drmem_lmb is a
macro expanding into `for (lmb = &drmem_info->lmbs[0]; lmb <=
&drmem_info->lmbs[drmem_info->n_lmbs - 1]; lmb++)`. When drmem_info->lmbs
is NULL, the loop would iterate through the whole address range if it
weren't stopped by the NULL pointer dereference on the next line.

This patch aligns for_each_drmem_lmb and for_each_drmem_lmb_in_range
macro behavior with the common C semantics, where the end marker does
not belong to the scanned range, and alters get_lmb_range() semantics.
As a side effect, the wraparound observed in the crash is prevented.

Fixes: 6c6ea53725b3 ("powerpc/mm: Separate ibm, dynamic-memory data from DT format")
Cc: stable@vger.kernel.org # v4.16+
Signed-off-by: Libor Pechacek <lpechacek@suse.cz>
Signed-off-by: Michal Suchanek <msuchanek@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200131132829.10281-1-msuchanek@suse.de
4 years agopowerpc: Don't use thread struct for saving SRR0/1 on syscall.
Christophe Leroy [Fri, 31 Jan 2020 11:34:55 +0000 (11:34 +0000)]
powerpc: Don't use thread struct for saving SRR0/1 on syscall.

CR0 can be saved later, and CTR can also be used for saving.

Keep SRR1 in r9 and stash SRR0 in CTR, this avoids using thread_struct
in memory for that.

Saves 3 cycles (ie 1%) in null_syscall selftest on 8xx.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/b94c3bc03bac9431fec2dadb686384c481889422.1580470483.git.christophe.leroy@c-s.fr
4 years agopowerpc/32: Warn and return ENOSYS on syscalls from kernel
Christophe Leroy [Fri, 31 Jan 2020 11:34:54 +0000 (11:34 +0000)]
powerpc/32: Warn and return ENOSYS on syscalls from kernel

Since commit b86fb88855ea ("powerpc/32: implement fast entry for
syscalls on non BOOKE") and commit 1a4b739bbb4f ("powerpc/32:
implement fast entry for syscalls on BOOKE"), syscalls from
kernel are unexpected and can have catastrophic consequences
as it will destroy the kernel stack.

Test MSR_PR on syscall entry. In case syscall is from kernel,
emit a warning and return ENOSYS error.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/8ee3bdbbdfdfc64ca7001e90c43b2aee6f333578.1580470482.git.christophe.leroy@c-s.fr
4 years agopowerpc/32s: Don't flush all TLBs when flushing one page
Christophe Leroy [Sat, 1 Feb 2020 08:04:31 +0000 (08:04 +0000)]
powerpc/32s: Don't flush all TLBs when flushing one page

When flushing any memory range, the flushing function
flushes all TLBs.

When (start) and (end - 1) are in the same memory page,
flush that page instead.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/b30b2eae6960502eaf0d9e36c60820b839693c33.1580542939.git.christophe.leroy@c-s.fr
4 years agopowerpc/fadump: sysfs for fadump memory reservation
Sourabh Jain [Wed, 11 Dec 2019 16:09:10 +0000 (21:39 +0530)]
powerpc/fadump: sysfs for fadump memory reservation

Add a sys interface to allow querying the memory reserved by FADump for
saving the crash dump.

Also added Documentation/ABI for the new sysfs file.

Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191211160910.21656-7-sourabhjain@linux.ibm.com
4 years agoDocumentation/ABI: Mark /sys/kernel/fadump_* sysfs files deprecated
Sourabh Jain [Wed, 11 Dec 2019 16:09:09 +0000 (21:39 +0530)]
Documentation/ABI: Mark /sys/kernel/fadump_* sysfs files deprecated

Add a deprecation note in FADump sysfs ABI documentation files and
move them from ABI/testing to ABI/obsolete directory.

Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
[mpe: Use a proper table to fix errors from the documentation build]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191211160910.21656-6-sourabhjain@linux.ibm.com
4 years agopowerpc/powernv: Move core and fadump_release_opalcore under new kobject
Sourabh Jain [Wed, 11 Dec 2019 16:09:08 +0000 (21:39 +0530)]
powerpc/powernv: Move core and fadump_release_opalcore under new kobject

The /sys/firmware/opal/core and /sys/kernel/fadump_release_opalcore
sysfs files are used to export and release the OPAL memory on PowerNV
platform. let's organize them into a new kobject under
/sys/firmware/opal/mpipl/ directory.

A symlink is added to maintain the backward compatibility for
/sys/firmware/opal/core sysfs file.

Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191211160910.21656-5-sourabhjain@linux.ibm.com
4 years agopowerpc/fadump: Reorganize /sys/kernel/fadump_* sysfs files
Sourabh Jain [Wed, 11 Dec 2019 16:09:07 +0000 (21:39 +0530)]
powerpc/fadump: Reorganize /sys/kernel/fadump_* sysfs files

As the number of FADump sysfs files increases it is hard to manage all
of them inside /sys/kernel directory. It's better to have all the
FADump related sysfs files in a dedicated directory
/sys/kernel/fadump. But in order to maintain backward compatibility a
symlink has been added for every sysfs that has moved to new location.

As the FADump sysfs files are now part of a dedicated directory there
is no need to prefix their name with fadump_, hence sysfs file names
are also updated. For example fadump_enabled sysfs file is now
referred as enabled.

Also consolidate ABI documentation for all the FADump sysfs files in a
single file Documentation/ABI/testing/sysfs-kernel-fadump.

Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Tested-by: Michal Suchanek <msuchanek@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191211160910.21656-4-sourabhjain@linux.ibm.com
4 years agosysfs: Wrap __compat_only_sysfs_link_entry_to_kobj function to change the symlink...
Sourabh Jain [Wed, 11 Dec 2019 16:09:06 +0000 (21:39 +0530)]
sysfs: Wrap __compat_only_sysfs_link_entry_to_kobj function to change the symlink name

The __compat_only_sysfs_link_entry_to_kobj function creates a symlink
to a kobject but doesn't provide an option to change the symlink file
name.

This patch adds a wrapper function compat_only_sysfs_link_entry_to_kobj
that extends the __compat_only_sysfs_link_entry_to_kobj functionality
which allows function caller to customize the symlink name.

Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
[mpe: Fix compile error when CONFIG_SYSFS=n]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191211160910.21656-3-sourabhjain@linux.ibm.com
4 years agoDocumentation/ABI: Add ABI documentation for /sys/kernel/fadump_*
Sourabh Jain [Wed, 11 Dec 2019 16:09:05 +0000 (21:39 +0530)]
Documentation/ABI: Add ABI documentation for /sys/kernel/fadump_*

Add missing ABI documentation for existing FADump sysfs files.

Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Reviewed-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20191211160910.21656-2-sourabhjain@linux.ibm.com
4 years agopowerpc/process: Remove unneccessary #ifdef CONFIG_PPC64 in copy_thread_tls()
Christophe Leroy [Wed, 29 Jan 2020 19:50:07 +0000 (19:50 +0000)]
powerpc/process: Remove unneccessary #ifdef CONFIG_PPC64 in copy_thread_tls()

is_32bit_task() exists on both PPC64 and PPC32, no need of an ifdefery.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Michal Suchanek <msuchanek@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/6ecbda05b4119c40222dc8ec284604e1597c9bff.1580327381.git.christophe.leroy@c-s.fr
4 years agopowerpc/papr_scm: Mark papr_scm_ndctl() as static
Vaibhav Jain [Thu, 30 Jan 2020 04:02:06 +0000 (09:32 +0530)]
powerpc/papr_scm: Mark papr_scm_ndctl() as static

Function papr_scm_ndctl() is neither exported from the module nor
called directly from outside 'papr.c' hence should be marked 'static'.

Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200130040206.79998-1-vaibhav@linux.ibm.com
4 years agopowerpc/pseries/Makefile: Remove CONFIG_PPC_PSERIES check
Oliver O'Halloran [Thu, 30 Jan 2020 06:31:53 +0000 (17:31 +1100)]
powerpc/pseries/Makefile: Remove CONFIG_PPC_PSERIES check

The pseries Makefile (arch/powerpc/platforms/pseries/Makefile) is only
included by the platform Makefile (arch/powerpc/platform/Makefile)
when CONFIG_PPC_PSERIES is selected, so checking for
CONFIG_PPC_PSERIES in the pseries Makefile is pointless.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Reviewed-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200130063153.19915-2-oohall@gmail.com
4 years agopowerpc/pseries/vio: Remove stray #ifdef CONFIG_PPC_PSERIES
Oliver O'Halloran [Thu, 30 Jan 2020 06:31:52 +0000 (17:31 +1100)]
powerpc/pseries/vio: Remove stray #ifdef CONFIG_PPC_PSERIES

vio.c is in platforms/pseries, which is only built if PPC_PSERIES=y.
In other words, this ifdef is pointless.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Reviewed-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200130063153.19915-1-oohall@gmail.com
4 years agopowerpc/entry: Fix an #if which should be an #ifdef in entry_32.S
Christophe Leroy [Tue, 18 Feb 2020 14:09:29 +0000 (14:09 +0000)]
powerpc/entry: Fix an #if which should be an #ifdef in entry_32.S

Fixes: 12c3f1fd87bf ("powerpc/32s: get rid of CPU_FTR_601 feature")
Cc: stable@vger.kernel.org # v5.4+
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/a99fc0ad65b87a1ba51cfa3e0e9034ee294c3e07.1582034961.git.christophe.leroy@c-s.fr
4 years agopowerpc/xmon: Fix whitespace handling in getstring()
Oliver O'Halloran [Mon, 17 Feb 2020 04:13:43 +0000 (15:13 +1100)]
powerpc/xmon: Fix whitespace handling in getstring()

The ls (lookup symbol) and zr (reboot) commands use xmon's getstring()
helper to read a string argument from the xmon prompt. This function
skips over leading whitespace, but doesn't check if the first
"non-whitespace" character is a newline which causes some odd
behaviour (<enter> indicates a the enter key was pressed):

  0:mon> ls printk<enter>
  printk: c0000000001680c4

  0:mon> ls<enter>
  printk<enter>
  Symbol '
  printk' not found.
  0:mon>

With commit 2d9b332d99b ("powerpc/xmon: Allow passing an argument to
ppc_md.restart()") we have a similar problem with the zr command.
Previously zr took no arguments so "zr<enter> would trigger a reboot.
With that patch applied a second newline needs to be sent in order for
the reboot to occur. Fix this by checking if the leading whitespace
ended on a newline:

  0:mon> ls<enter>
  Symbol '' not found.

Fixes: 2d9b332d99b2 ("powerpc/xmon: Allow passing an argument to ppc_md.restart()")
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200217041343.2454-1-oohall@gmail.com
4 years agopowerpc/6xx: Fix power_save_ppc32_restore() with CONFIG_VMAP_STACK
Christophe Leroy [Fri, 14 Feb 2020 06:53:00 +0000 (06:53 +0000)]
powerpc/6xx: Fix power_save_ppc32_restore() with CONFIG_VMAP_STACK

power_save_ppc32_restore() is called during exception entry, before
re-enabling the MMU. It substracts KERNELBASE from the address
of nap_save_msscr0 to access it.

With CONFIG_VMAP_STACK enabled, data MMU translation has already been
re-enabled, so power_save_ppc32_restore() has to access
nap_save_msscr0 by its virtual address.

Reported-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Fixes: cd08f109e262 ("powerpc/32s: Enable CONFIG_VMAP_STACK")
Tested-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/7bce32ccbab3ba3e3e0f27da6961bf6313df97ed.1581663140.git.christophe.leroy@c-s.fr
4 years agopowerpc/chrp: Fix enter_rtas() with CONFIG_VMAP_STACK
Christophe Leroy [Fri, 14 Feb 2020 08:39:50 +0000 (08:39 +0000)]
powerpc/chrp: Fix enter_rtas() with CONFIG_VMAP_STACK

With CONFIG_VMAP_STACK, data MMU has to be enabled
to read data on the stack.

Fixes: cd08f109e262 ("powerpc/32s: Enable CONFIG_VMAP_STACK")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/d2330584f8c42d3039896e2b56f5d39676dc919c.1581669558.git.christophe.leroy@c-s.fr
4 years agopowerpc/32s: Fix DSI and ISI exceptions for CONFIG_VMAP_STACK
Christophe Leroy [Sat, 15 Feb 2020 10:14:25 +0000 (10:14 +0000)]
powerpc/32s: Fix DSI and ISI exceptions for CONFIG_VMAP_STACK

hash_page() needs to read page tables from kernel memory. When entire
kernel memory is mapped by BATs, which is normally the case when
CONFIG_STRICT_KERNEL_RWX is not set, it works even if the page hosting
the page table is not referenced in the MMU hash table.

However, if the page where the page table resides is not covered by
a BAT, a DSI fault can be encountered from hash_page(), and it loops
forever. This can happen when CONFIG_STRICT_KERNEL_RWX is selected
and the alignment of the different regions is too small to allow
covering the entire memory with BATs. This also happens when
CONFIG_DEBUG_PAGEALLOC is selected or when booting with 'nobats'
flag.

Also, if the page containing the kernel stack is not present in the
MMU hash table, registers cannot be saved and a recursive DSI fault
is encountered.

To allow hash_page() to properly do its job at all time and load the
MMU hash table whenever needed, it must run with data MMU disabled.
This means it must be called before re-enabling data MMU. To allow
this, registers clobbered by hash_page() and create_hpte() have to
be saved in the thread struct together with SRR0, SSR1, DAR and DSISR.
It is also necessary to ensure that DSI prolog doesn't overwrite
regs saved by prolog of the current running exception. That means:
- DSI can only use SPRN_SPRG_SCRATCH0
- Exceptions must free SPRN_SPRG_SCRATCH0 before writing to the stack.

This also fixes the Oops reported by Erhard when create_hpte() is
called by add_hash_page().

Due to prolog size increase, a few more exceptions had to get split
in two parts.

Fixes: cd08f109e262 ("powerpc/32s: Enable CONFIG_VMAP_STACK")
Reported-by: Erhard F. <erhard_f@mailbox.org>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Tested-by: Erhard F. <erhard_f@mailbox.org>
Tested-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206501
Link: https://lore.kernel.org/r/64a4aa44686e9fd4b01333401367029771d9b231.1581761633.git.christophe.leroy@c-s.fr
4 years agopowerpc/tm: Fix clearing MSR[TS] in current when reclaiming on signal delivery
Gustavo Luiz Duarte [Tue, 11 Feb 2020 03:38:29 +0000 (00:38 -0300)]
powerpc/tm: Fix clearing MSR[TS] in current when reclaiming on signal delivery

After a treclaim, we expect to be in non-transactional state. If we
don't clear the current thread's MSR[TS] before we get preempted, then
tm_recheckpoint_new_task() will recheckpoint and we get rescheduled in
suspended transaction state.

When handling a signal caught in transactional state,
handle_rt_signal64() calls get_tm_stackpointer() that treclaims the
transaction using tm_reclaim_current() but without clearing the
thread's MSR[TS]. This can cause the TM Bad Thing exception below if
later we pagefault and get preempted trying to access the user's
sigframe, using __put_user(). Afterwards, when we are rescheduled back
into do_page_fault() (but now in suspended state since the thread's
MSR[TS] was not cleared), upon executing 'rfid' after completion of
the page fault handling, the exception is raised because a transition
from suspended to non-transactional state is invalid.

  Unexpected TM Bad Thing exception at c00000000000de44 (msr 0x8000000302a03031) tm_scratch=800000010280b033
  Oops: Unrecoverable exception, sig: 6 [#1]
  LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
  CPU: 25 PID: 15547 Comm: a.out Not tainted 5.4.0-rc2 #32
  NIP:  c00000000000de44 LR: c000000000034728 CTR: 0000000000000000
  REGS: c00000003fe7bd70 TRAP: 0700   Not tainted  (5.4.0-rc2)
  MSR:  8000000302a03031 <SF,VEC,VSX,FP,ME,IR,DR,LE,TM[SE]>  CR: 44000884  XER: 00000000
  CFAR: c00000000000dda4 IRQMASK: 0
  PACATMSCRATCH: 800000010280b033
  GPR00: c000000000034728 c000000f65a17c80 c000000001662800 00007fffacf3fd78
  GPR04: 0000000000001000 0000000000001000 0000000000000000 c000000f611f8af0
  GPR08: 0000000000000000 0000000078006001 0000000000000000 000c000000000000
  GPR12: c000000f611f84b0 c00000003ffcb200 0000000000000000 0000000000000000
  GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
  GPR20: 0000000000000000 0000000000000000 0000000000000000 c000000f611f8140
  GPR24: 0000000000000000 00007fffacf3fd68 c000000f65a17d90 c000000f611f7800
  GPR28: c000000f65a17e90 c000000f65a17e90 c000000001685e18 00007fffacf3f000
  NIP [c00000000000de44] fast_exception_return+0xf4/0x1b0
  LR [c000000000034728] handle_rt_signal64+0x78/0xc50
  Call Trace:
  [c000000f65a17c80] [c000000000034710] handle_rt_signal64+0x60/0xc50 (unreliable)
  [c000000f65a17d30] [c000000000023640] do_notify_resume+0x330/0x460
  [c000000f65a17e20] [c00000000000dcc4] ret_from_except_lite+0x70/0x74
  Instruction dump:
  7c4ff120 e8410170 7c5a03a6 38400000 f8410060 e8010070 e8410080 e8610088
  60000000 60000000 e8810090 e8210078 <4c00002448000000 e8610178 88ed0989
  ---[ end trace 93094aa44b442f87 ]---

The simplified sequence of events that triggers the above exception is:

  ... # userspace in NON-TRANSACTIONAL state
  tbegin # userspace in TRANSACTIONAL state
  signal delivery # kernelspace in SUSPENDED state
  handle_rt_signal64()
    get_tm_stackpointer()
      treclaim # kernelspace in NON-TRANSACTIONAL state
    __put_user()
      page fault happens. We will never get back here because of the TM Bad Thing exception.

  page fault handling kicks in and we voluntarily preempt ourselves
  do_page_fault()
    __schedule()
      __switch_to(other_task)

  our task is rescheduled and we recheckpoint because the thread's MSR[TS] was not cleared
  __switch_to(our_task)
    switch_to_tm()
      tm_recheckpoint_new_task()
        trechkpt # kernelspace in SUSPENDED state

  The page fault handling resumes, but now we are in suspended transaction state
  do_page_fault()    completes
  rfid     <----- trying to get back where the page fault happened (we were non-transactional back then)
  TM Bad Thing # illegal transition from suspended to non-transactional

This patch fixes that issue by clearing the current thread's MSR[TS]
just after treclaim in get_tm_stackpointer() so that we stay in
non-transactional state in case we are preempted. In order to make
treclaim and clearing the thread's MSR[TS] atomic from a preemption
perspective when CONFIG_PREEMPT is set, preempt_disable/enable() is
used. It's also necessary to save the previous value of the thread's
MSR before get_tm_stackpointer() is called so that it can be exposed
to the signal handler later in setup_tm_sigcontexts() to inform the
userspace MSR at the moment of the signal delivery.

Found with tm-signal-context-force-tm kernel selftest.

Fixes: 2b0a576d15e0 ("powerpc: Add new transactional memory state to the signal context")
Cc: stable@vger.kernel.org # v3.9
Signed-off-by: Gustavo Luiz Duarte <gustavold@linux.ibm.com>
Acked-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200211033831.11165-1-gustavold@linux.ibm.com
4 years agopowerpc/8xx: Fix clearing of bits 20-23 in ITLB miss
Christophe Leroy [Sun, 9 Feb 2020 18:14:42 +0000 (18:14 +0000)]
powerpc/8xx: Fix clearing of bits 20-23 in ITLB miss

In ITLB miss handled the line supposed to clear bits 20-23 on the L2
ITLB entry is buggy and does indeed nothing, leading to undefined
value which could allow execution when it shouldn't.

Properly do the clearing with the relevant instruction.

Fixes: 74fabcadfd43 ("powerpc/8xx: don't use r12/SPRN_SPRG_SCRATCH2 in TLB Miss handlers")
Cc: stable@vger.kernel.org # v5.0+
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Leonardo Bras <leonardo@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/4f70c2778163affce8508a210f65d140e84524b4.1581272050.git.christophe.leroy@c-s.fr
4 years agopowerpc/hugetlb: Fix 8M hugepages on 8xx
Christophe Leroy [Sun, 9 Feb 2020 16:02:41 +0000 (16:02 +0000)]
powerpc/hugetlb: Fix 8M hugepages on 8xx

With HW assistance all page tables must be 4k aligned, the 8xx drops
the last 12 bits during the walk.

Redefine HUGEPD_SHIFT_MASK to mask last 12 bits out. HUGEPD_SHIFT_MASK
is used to for alignment of page table cache.

Fixes: 22569b881d37 ("powerpc/8xx: Enable 8M hugepage support with HW assistance")
Cc: stable@vger.kernel.org # v5.0+
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/778b1a248c4c7ca79640eeff7740044da6a220a0.1581264115.git.christophe.leroy@c-s.fr
4 years agopowerpc/hugetlb: Fix 512k hugepages on 8xx with 16k page size
Christophe Leroy [Thu, 6 Feb 2020 13:50:28 +0000 (13:50 +0000)]
powerpc/hugetlb: Fix 512k hugepages on 8xx with 16k page size

Commit 55c8fc3f4930 ("powerpc/8xx: reintroduce 16K pages with HW
assistance") redefined pte_t as a struct of 4 pte_basic_t, because
in 16K pages mode there are four identical entries in the
page table. But the size of hugepage tables is calculated based
of the size of (void *). Therefore, we end up with page tables
of size 1k instead of 4k for 512k pages.

As 512k hugepage tables are the same size as standard page tables,
ie 4k, use the standard page tables instead of PGT_CACHE tables.

Fixes: 3fb69c6a1a13 ("powerpc/8xx: Enable 512k hugepage support with HW assistance")
Cc: stable@vger.kernel.org # v5.0+
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/90ec56a2315be602494619ed0223bba3b0b8d619.1580997007.git.christophe.leroy@c-s.fr
4 years agopowerpc/eeh: Fix deadlock handling dead PHB
Sam Bobroff [Fri, 7 Feb 2020 04:57:31 +0000 (15:57 +1100)]
powerpc/eeh: Fix deadlock handling dead PHB

Recovering a dead PHB can currently cause a deadlock as the PCI
rescan/remove lock is taken twice.

This is caused as part of an existing bug in
eeh_handle_special_event(). The pe is processed while traversing the
PHBs even though the pe is unrelated to the loop. This causes the pe
to be, incorrectly, processed more than once.

Untangling this section can move the pe processing out of the loop and
also outside the locked section, correcting both problems.

Fixes: 2e25505147b8 ("powerpc/eeh: Fix crash when edev->pdev changes")
Cc: stable@vger.kernel.org # 5.4+
Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
Tested-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/0547e82dbf90ee0729a2979a8cac5c91665c621f.1581051445.git.sbobroff@linux.ibm.com
4 years agoLinux 5.6-rc2
Linus Torvalds [Sun, 16 Feb 2020 21:16:59 +0000 (13:16 -0800)]
Linux 5.6-rc2

4 years agoMerge tag 'for-linus-5.6-1' of https://github.com/cminyard/linux-ipmi
Linus Torvalds [Sun, 16 Feb 2020 21:05:46 +0000 (13:05 -0800)]
Merge tag 'for-linus-5.6-1' of https://github.com/cminyard/linux-ipmi

Pull IPMI update from Corey Minyard:
 "Minor bug fixes for IPMI

  I know this is late; I've been travelling and, well, I've been
  distracted.

  This is just a few bug fixes and adding i2c support to the IPMB
  driver, which is something I wanted from the beginning for it"

* tag 'for-linus-5.6-1' of https://github.com/cminyard/linux-ipmi:
  drivers: ipmi: fix off-by-one bounds check that leads to a out-of-bounds write
  ipmi:ssif: Handle a possible NULL pointer reference
  drivers: ipmi: Modify max length of IPMB packet
  drivers: ipmi: Support raw i2c packet in IPMB

4 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Sun, 16 Feb 2020 21:01:42 +0000 (13:01 -0800)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
 "Bugfixes and improvements to selftests.

  On top of this, Mauro converted the KVM documentation to rst format,
  which was very welcome"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (44 commits)
  docs: virt: guest-halt-polling.txt convert to ReST
  docs: kvm: review-checklist.txt: rename to ReST
  docs: kvm: Convert timekeeping.txt to ReST format
  docs: kvm: Convert s390-diag.txt to ReST format
  docs: kvm: Convert ppc-pv.txt to ReST format
  docs: kvm: Convert nested-vmx.txt to ReST format
  docs: kvm: Convert mmu.txt to ReST format
  docs: kvm: Convert locking.txt to ReST format
  docs: kvm: Convert hypercalls.txt to ReST format
  docs: kvm: arm/psci.txt: convert to ReST
  docs: kvm: convert arm/hyp-abi.txt to ReST
  docs: kvm: Convert api.txt to ReST format
  docs: kvm: convert devices/xive.txt to ReST
  docs: kvm: convert devices/xics.txt to ReST
  docs: kvm: convert devices/vm.txt to ReST
  docs: kvm: convert devices/vfio.txt to ReST
  docs: kvm: convert devices/vcpu.txt to ReST
  docs: kvm: convert devices/s390_flic.txt to ReST
  docs: kvm: convert devices/mpic.txt to ReST
  docs: kvm: convert devices/arm-vgit.txt to ReST
  ...

4 years agoMerge tag 'edac_urgent_for_5.6' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 16 Feb 2020 20:49:36 +0000 (12:49 -0800)]
Merge tag 'edac_urgent_for_5.6' of git://git./linux/kernel/git/ras/ras

Pull EDAC fixes from Borislav Petkov:
 "Two fixes for use-after-free and memory leaking in the EDAC core, by
  Robert Richter.

  Debug options like DEBUG_TEST_DRIVER_REMOVE, KASAN and DEBUG_KMEMLEAK
  unearthed issues with the lifespan of memory allocated by the EDAC
  memory controller descriptor due to misdesigned memory freeing, done
  partially by the EDAC core *and* the driver core, which is problematic
  to say the least.

  These two are minimal fixes to take care of stable - a proper rework
  is following which cleans up that mess properly"

* tag 'edac_urgent_for_5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
  EDAC/sysfs: Remove csrow objects on errors
  EDAC/mc: Fix use-after-free and memleaks during device removal

4 years agoMerge tag 'block-5.6-2020-02-16' of git://git.kernel.dk/linux-block
Linus Torvalds [Sun, 16 Feb 2020 20:35:52 +0000 (12:35 -0800)]
Merge tag 'block-5.6-2020-02-16' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
 "Not a lot here, which is great, basically just three small bcache
  fixes from Coly, and four NVMe fixes via Keith"

* tag 'block-5.6-2020-02-16' of git://git.kernel.dk/linux-block:
  nvme: fix the parameter order for nvme_get_log in nvme_get_fw_slot_info
  nvme/pci: move cqe check after device shutdown
  nvme: prevent warning triggered by nvme_stop_keep_alive
  nvme/tcp: fix bug on double requeue when send fails
  bcache: remove macro nr_to_fifo_front()
  bcache: Revert "bcache: shrink btree node cache after bch_btree_check()"
  bcache: ignore pending signals when creating gc and allocator thread

4 years agoMerge tag 'for-5.6-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave...
Linus Torvalds [Sun, 16 Feb 2020 19:43:45 +0000 (11:43 -0800)]
Merge tag 'for-5.6-rc1-tag' of git://git./linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:
 "Two races fixed, memory leak fix, sysfs directory fixup and two new
  log messages:

   - two fixed race conditions: extent map merging and truncate vs
     fiemap

   - create the right sysfs directory with device information and move
     the individual device dirs under it

   - print messages when the tree-log is replayed at mount time or
     cannot be replayed on remount"

* tag 'for-5.6-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: sysfs, move device id directories to UUID/devinfo
  btrfs: sysfs, add UUID/devinfo kobject
  Btrfs: fix race between shrinking truncate and fiemap
  btrfs: log message when rw remount is attempted with unclean tree-log
  btrfs: print message when tree-log replay starts
  Btrfs: fix race between using extent maps and merging them
  btrfs: ref-verify: fix memory leaks

4 years agoMerge tag '5.6-rc1-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6
Linus Torvalds [Sun, 16 Feb 2020 19:41:29 +0000 (11:41 -0800)]
Merge tag '5.6-rc1-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs fixes from Steve French:
 "Four small CIFS/SMB3 fixes. One (the EA overflow fix) for stable"

* tag '5.6-rc1-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: make sure we do not overflow the max EA buffer size
  cifs: enable change notification for SMB2.1 dialect
  cifs: Fix mode output in debugging statements
  cifs: fix mount option display for sec=krb5i

4 years agoMerge tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 16 Feb 2020 19:12:06 +0000 (11:12 -0800)]
Merge tag 'ext4_for_linus_stable' of git://git./linux/kernel/git/tytso/ext4

Pull ext4 fixes from Ted Ts'o:
 "Miscellaneous ext4 bug fixes (all stable fodder)"

* tag 'ext4_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: improve explanation of a mount failure caused by a misconfigured kernel
  jbd2: do not clear the BH_Mapped flag when forgetting a metadata buffer
  jbd2: move the clearing of b_modified flag to the journal_unmap_buffer()
  ext4: add cond_resched() to ext4_protect_reserved_inode
  ext4: fix checksum errors with indexed dirs
  ext4: fix support for inode sizes > 1024 bytes
  ext4: simplify checking quota limits in ext4_statfs()
  ext4: don't assume that mmp_nodename/bdevname have NUL

4 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Linus Torvalds [Sun, 16 Feb 2020 00:49:25 +0000 (16:49 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input

Pull input updates from Dmitry Torokhov:

 - a few drivers have been updated to use flexible-array syntax instead
   of GCC extension

 - ili210x touchscreen driver now supports the 2120 protocol flavor

 - a couple more of Synaptics devices have been switched over to RMI4

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: cyapa - replace zero-length array with flexible-array member
  Input: tca6416-keypad - replace zero-length array with flexible-array member
  Input: gpio_keys_polled - replace zero-length array with flexible-array member
  Input: synaptics - remove the LEN0049 dmi id from topbuttonpad list
  Input: synaptics - enable SMBus on ThinkPad L470
  Input: synaptics - switch T470s to RMI4 by default
  Input: gpio_keys - replace zero-length array with flexible-array member
  Input: goldfish_events - replace zero-length array with flexible-array member
  Input: psmouse - switch to using i2c_new_scanned_device()
  Input: ili210x - add ili2120 support
  Input: ili210x - fix return value of is_visible function

4 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Linus Torvalds [Sun, 16 Feb 2020 00:38:49 +0000 (16:38 -0800)]
Merge tag 'for-linus' of git://git./linux/kernel/git/rdma/rdma

Pull rdma fixes from Jason Gunthorpe:
 "Not too much going on here, though there are about four fixes related
  to stuff merged during the last merge window.

  We also see the return of a syzkaller instance with access to RDMA
  devices, and a few bugs detected by that squished.

   - Fix three crashers and a memory memory leak for HFI1

   - Several bugs found by syzkaller

   - A bug fix for the recent QP counters feature on older mlx5 HW

   - Locking inversion in cxgb4

   - Unnecessary WARN_ON in siw

   - A umad crasher regression during unload, from a bug fix for
     something else

   - Bugs introduced in the merge window:
       - Missed list_del in uverbs file rework, core and mlx5 devx
       - Unexpected integer math truncation in the mlx5 VAR patches
       - Compilation bug fix for the VAR patches on 32 bit"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  IB/mlx5: Use div64_u64 for num_var_hw_entries calculation
  RDMA/core: Fix protection fault in get_pkey_idx_qp_list
  RDMA/rxe: Fix soft lockup problem due to using tasklets in softirq
  RDMA/mlx5: Prevent overflow in mmap offset calculations
  IB/umad: Fix kernel crash while unloading ib_umad
  RDMA/mlx5: Fix async events cleanup flows
  RDMA/core: Add missing list deletion on freeing event queue
  RDMA/siw: Remove unwanted WARN_ON in siw_cm_llp_data_ready()
  RDMA/iw_cxgb4: initiate CLOSE when entering TERM
  IB/mlx5: Return failure when rts2rts_qp_counters_set_id is not supported
  RDMA/core: Fix invalid memory access in spec_filter_size
  IB/rdmavt: Reset all QPs when the device is shut down
  IB/hfi1: Close window for pq and request coliding
  IB/hfi1: Acquire lock to release TID entries when user file is closed
  RDMA/hfi1: Fix memory leak in _dev_comp_vect_mappings_create

4 years agoMerge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Linus Torvalds [Sat, 15 Feb 2020 21:16:47 +0000 (13:16 -0800)]
Merge tag 'armsoc-fixes' of git://git./linux/kernel/git/soc/soc

Pull ARM SoC fixes from Olof Johansson:
 "A handful of fixes that have come in since the merge window:

   - Fix of PCI interrupt map on arm64 fast model (SW emulator)

   - Fixlet for sound on ST platforms and a small cleanup of deprecated
     DT properties

   - A stack buffer overflow fix for moxtet

   - Fuse driver build fix for Tegra194

   - A few config updates to turn on new drivers merged this cycle"

* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  bus: moxtet: fix potential stack buffer overflow
  soc/tegra: fuse: Fix build with Tegra194 configuration
  ARM: dts: sti: fixup sound frame-inversion for stihxxx-b2120.dtsi
  ARM: dts: sti: Remove deprecated snps PHY properties for stih410-b2260
  arm64: defconfig: Enable DRM_SUN6I_DSI
  arm64: defconfig: Enable CONFIG_SUN8I_THERMAL
  ARM: sunxi: Enable CONFIG_SUN8I_THERMAL
  arm64: defconfig: Set bcm2835-dma as built-in
  ARM: configs: Cleanup old Kconfig options
  ARM: npcm: Bring back GPIOLIB support
  arm64: dts: fast models: Fix FVP PCI interrupt-map property

4 years agoMerge tag 's390-5.6-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Linus Torvalds [Sat, 15 Feb 2020 21:10:38 +0000 (13:10 -0800)]
Merge tag 's390-5.6-3' of git://git./linux/kernel/git/s390/linux

Pull s390 updates from Vasily Gorbik:

 - Enable paes-s390 cipher selftests in testmgr (acked-by Herbert Xu).

 - Fix protected key length update in PKEY_SEC2PROTK ioctl and increase
   card/queue requests counter to 64-bit in crypto code.

 - Fix clang warning in get_tod_clock.

 - Fix ultravisor info length extensions handling.

 - Fix style of SPDX License Identifier in vfio-ccw.

 - Avoid unnecessary GFP_ATOMIC and simplify ACK tracking in qdio.

* tag 's390-5.6-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  crypto/testmgr: enable selftests for paes-s390 ciphers
  s390/time: Fix clk type in get_tod_clock
  s390/uv: Fix handling of length extensions
  s390/qdio: don't allocate *aob array with GFP_ATOMIC
  s390/qdio: simplify ACK tracking
  s390/zcrypt: fix card and queue total counter wrap
  s390/pkey: fix missing length of protected key on return
  vfio-ccw: Use the correct style for SPDX License Identifier

4 years agoMerge tag 'hwmon-for-v5.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/groec...
Linus Torvalds [Sat, 15 Feb 2020 21:03:50 +0000 (13:03 -0800)]
Merge tag 'hwmon-for-v5.6-rc2' of git://git./linux/kernel/git/groeck/linux-staging

Pull hwmon fixes from Guenter Roeck:
 "Fix compatible string typos in the xdpe12284 driver, and a wrong bit
  value in the ltc2978 driver"

* tag 'hwmon-for-v5.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (pmbus/xdpe12284) fix typo in compatible strings
  hwmon: (pmbus/ltc2978) Fix PMBus polling of MFR_COMMON definitions.

4 years agoMerge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 15 Feb 2020 20:51:22 +0000 (12:51 -0800)]
Merge branch 'sched-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull scheduler fixes from Ingo Molnar:
 "Misc fixes all over the place:

   - Fix NUMA over-balancing between lightly loaded nodes. This is
     fallout of the big load-balancer rewrite.

   - Fix the NOHZ remote loadavg update logic, which fixes anomalies
     like reported 150 loadavg on mostly idle CPUs.

   - Fix XFS performance/scalability

   - Fix throttled groups unbound task-execution bug

   - Fix PSI procfs boundary condition

   - Fix the cpu.uclamp.{min,max} cgroup configuration write checks

   - Fix DocBook annotations

   - Fix RCU annotations

   - Fix overly CPU-intensive housekeeper CPU logic loop on large CPU
     counts"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/fair: Fix kernel-doc warning in attach_entity_load_avg()
  sched/core: Annotate curr pointer in rq with __rcu
  sched/psi: Fix OOB write when writing 0 bytes to PSI files
  sched/fair: Allow a per-CPU kthread waking a task to stack on the same CPU, to fix XFS performance regression
  sched/fair: Prevent unlimited runtime on throttled group
  sched/nohz: Optimize get_nohz_timer_target()
  sched/uclamp: Reject negative values in cpu_uclamp_write()
  sched/fair: Allow a small load imbalance between low utilisation SD_NUMA domains
  timers/nohz: Update NOHZ load in remote tick
  sched/core: Don't skip remote tick for idle CPUs

4 years agoMerge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 15 Feb 2020 20:30:42 +0000 (12:30 -0800)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull perf fixes from Ingo Molnar:
 "Fixes and HW enablement patches:

   - Tooling fixes, most of which are tooling header synchronization
     with v5.6 changes

   - Fix kprobes fallout on ARM

   - Add Intel Elkhart Lake support and extend Tremont support, these
     are relatively simple and should only affect those models

   - Fix the AMD family 17h generic event table"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (30 commits)
  perf llvm: Fix script used to obtain kernel make directives to work with new kbuild
  tools headers kvm: Sync linux/kvm.h with the kernel sources
  tools headers kvm: Sync kvm headers with the kernel sources
  tools arch x86: Sync asm/cpufeatures.h with the kernel sources
  tools headers x86: Sync disabled-features.h
  tools include UAPI: Sync sound/asound.h copy
  tools headers UAPI: Sync asm-generic/mman-common.h with the kernel
  perf tools: Add arm64 version of get_cpuid()
  tools headers UAPI: Sync drm/i915_drm.h with the kernel sources
  tools headers uapi: Sync linux/fscrypt.h with the kernel sources
  tools headers UAPI: Sync sched.h with the kernel
  perf trace: Resolve prctl's 'option' arg strings to numbers
  perf beauty prctl: Export the 'options' strarray
  tools headers UAPI: Sync prctl.h with the kernel sources
  tools headers UAPI: Sync copy of arm64's asm/unistd.h with the kernel sources
  perf maps: Move kmap::kmaps setup to maps__insert()
  perf maps: Fix map__clone() for struct kmap
  perf maps: Mark ksymbol DSOs with kernel type
  perf maps: Mark module DSOs with kernel type
  tools include UAPI: Sync x86's syscalls_64.tbl, generic unistd.h and fcntl.h to pick up openat2 and pidfd_getfd
  ...

4 years agobus: moxtet: fix potential stack buffer overflow
Marek Behún [Sat, 15 Feb 2020 14:21:30 +0000 (15:21 +0100)]
bus: moxtet: fix potential stack buffer overflow

The input_read function declares the size of the hex array relative to
sizeof(buf), but buf is a pointer argument of the function. The hex
array is meant to contain hexadecimal representation of the bin array.

Link: https://lore.kernel.org/r/20200215142130.22743-1-marek.behun@nic.cz
Fixes: 5bc7f990cd98 ("bus: Add support for Moxtet bus")
Signed-off-by: Marek Behún <marek.behun@nic.cz>
Reported-by: sohu0106 <sohu0106@126.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
4 years agoext4: improve explanation of a mount failure caused by a misconfigured kernel
Theodore Ts'o [Fri, 14 Feb 2020 23:11:19 +0000 (18:11 -0500)]
ext4: improve explanation of a mount failure caused by a misconfigured kernel

If CONFIG_QFMT_V2 is not enabled, but CONFIG_QUOTA is enabled, when a
user tries to mount a file system with the quota or project quota
enabled, the kernel will emit a very confusing messsage:

    EXT4-fs warning (device vdc): ext4_enable_quotas:5914: Failed to enable quota tracking (type=0, err=-3). Please run e2fsck to fix.
    EXT4-fs (vdc): mount failed

We will now report an explanatory message indicating which kernel
configuration options have to be enabled, to avoid customer/sysadmin
confusion.

Link: https://lore.kernel.org/r/20200215012738.565735-1-tytso@mit.edu
Google-Bug-Id: 149093531
Fixes: 7c319d328505b778 ("ext4: make quota as first class supported feature")
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
4 years agoMerge tag 'perf-urgent-for-mingo-5.6-20200214' of git://git.kernel.org/pub/scm/linux...
Ingo Molnar [Sat, 15 Feb 2020 08:35:11 +0000 (09:35 +0100)]
Merge tag 'perf-urgent-for-mingo-5.6-20200214' of git://git./linux/kernel/git/acme/linux into perf/urgent

Pull perf/urgent fixes from Arnaldo Carvalho de Melo:

BPF:

  Arnaldo Carvalho de Melo:

  - Fix script used to obtain kernel make directives to work with new kbuild
    used for building BPF programs.

maps:

  Jiri Olsa:

  - Fixup kmap->kmaps backpointer in kernel maps.

arm64:

  John Garry:

  - Add arm64 version of get_cpuid() to get proper, arm64 specific output from
    'perf list' and other tools.

perf top:

  Kim Phillips:

  - Update kernel idle symbols so that output in AMD systems is in line with
    other systems.

perf stat:

  Kim Phillips:

  - Don't report a null stalled cycles per insn metric.

tools headers:

  Arnaldo Carvalho de Melo:

  - Sync tools/ headers with the kernel sources to get things like syscall
    numbers and new arguments so that 'perf trace' can decode and use them in
    tracepoint filters, e.g. prctl's new PR_{G,S}ET_IO_FLUSHER options.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
4 years agoInput: cyapa - replace zero-length array with flexible-array member
Gustavo A. R. Silva [Sat, 15 Feb 2020 01:03:12 +0000 (17:03 -0800)]
Input: cyapa - replace zero-length array with flexible-array member

The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:

struct foo {
        int stuff;
        struct boo array[];
};

By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last in the structure, which
will help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.

Also, notice that, dynamic memory allocations won't be affected by
this change:

"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]

This issue was found with the help of Coccinelle.

[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour")

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Link: https://lore.kernel.org/r/20200214172132.GA28389@embeddedor
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
4 years agoInput: tca6416-keypad - replace zero-length array with flexible-array member
Gustavo A. R. Silva [Sat, 15 Feb 2020 01:02:11 +0000 (17:02 -0800)]
Input: tca6416-keypad - replace zero-length array with flexible-array member

The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:

struct foo {
        int stuff;
        struct boo array[];
};

By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last in the structure, which
will help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.

Also, notice that, dynamic memory allocations won't be affected by
this change:

"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]

This issue was found with the help of Coccinelle.

[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour")

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Link: https://lore.kernel.org/r/20200214172022.GA27490@embeddedor
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
4 years agoInput: gpio_keys_polled - replace zero-length array with flexible-array member
Gustavo A. R. Silva [Sat, 15 Feb 2020 01:01:46 +0000 (17:01 -0800)]
Input: gpio_keys_polled - replace zero-length array with flexible-array member

The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:

struct foo {
        int stuff;
        struct boo array[];
};

By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last in the structure, which
will help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.

Also, notice that, dynamic memory allocations won't be affected by
this change:

"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]

This issue was found with the help of Coccinelle.

[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour")

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Link: https://lore.kernel.org/r/20200214171907.GA26588@embeddedor
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
4 years agoMerge tag 'nfs-for-5.6-2' of git://git.linux-nfs.org/projects/anna/linux-nfs
Linus Torvalds [Fri, 14 Feb 2020 22:46:11 +0000 (14:46 -0800)]
Merge tag 'nfs-for-5.6-2' of git://git.linux-nfs.org/projects/anna/linux-nfs

Pull NFS client bugfixes from Anna Schumaker:
 "The only stable fix this time is the DMA scatter-gather list bug fixed
  by Chuck.

  The rest fix up races and refcounting issues that have been found
  during testing.

  Stable fix:
   - fix DMA scatter-gather list mapping imbalance

  The rest:
   - fix directory verifier races
   - fix races between open and dentry revalidation
   - fix revalidation of dentries with delegations
   - fix "cachethis" setting for writes
   - fix delegation and delegation cred pinning"

* tag 'nfs-for-5.6-2' of git://git.linux-nfs.org/projects/anna/linux-nfs:
  NFSv4: Ensure the delegation cred is pinned when we call delegreturn
  NFSv4: Ensure the delegation is pinned in nfs_do_return_delegation()
  NFSv4.1 make cachethis=no for writes
  xprtrdma: Fix DMA scatter-gather list mapping imbalance
  NFSv4: Fix revalidation of dentries with delegations
  NFSv4: Fix races between open and dentry revalidation
  NFS: Fix up directory verifier races

4 years agoMerge tag 'ceph-for-5.6-rc2' of https://github.com/ceph/ceph-client
Linus Torvalds [Fri, 14 Feb 2020 22:42:31 +0000 (14:42 -0800)]
Merge tag 'ceph-for-5.6-rc2' of https://github.com/ceph/ceph-client

Pull ceph fixes from Ilya Dryomov:

 - make O_DIRECT | O_APPEND combination work better

 - redo the server path canonicalization patch that went into -rc1

 - fix the 'noacl' mount option that got broken by the conversion to the
   new mount API in 5.5

* tag 'ceph-for-5.6-rc2' of https://github.com/ceph/ceph-client:
  ceph: noacl mount option is effectively ignored
  ceph: canonicalize server path in place
  ceph: do not execute direct write in parallel if O_APPEND is specified