platform/kernel/linux-rpi.git
6 years agopowerpc/xmon: Also setup debugger hooks when single-stepping
Michal Suchanek [Wed, 23 May 2018 18:00:54 +0000 (20:00 +0200)]
powerpc/xmon: Also setup debugger hooks when single-stepping

When single-stepping kernel code from xmon without a debug hook
enabled the kernel crashes. This can happen when kernel starts with
xmon on crash disabled but xmon is entered using sysrq.

Call force_enable_xmon when single-stepping in xmon to install the
xmon debug hooks.

Fixes: e1368d0c9edb ("powerpc/xmon: Setup debugger hooks when first break-point is set")
Signed-off-by: Michal Suchanek <msuchanek@suse.de>
Reviewed-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/8xx: fix invalid register expression in head_8xx.S
Christophe Leroy [Thu, 24 May 2018 11:02:06 +0000 (11:02 +0000)]
powerpc/8xx: fix invalid register expression in head_8xx.S

New binutils generate the following warning

  AS      arch/powerpc/kernel/head_8xx.o
arch/powerpc/kernel/head_8xx.S: Assembler messages:
arch/powerpc/kernel/head_8xx.S:916: Warning: invalid register expression

This patch fixes it.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agoselftests/powerpc: Add ptrace hw breakpoint test
Michael Neuling [Tue, 22 May 2018 06:14:27 +0000 (16:14 +1000)]
selftests/powerpc: Add ptrace hw breakpoint test

This test the ptrace hw breakpoints via PTRACE_SET_DEBUGREG and
PPC_PTRACE_SETHWDEBUG.  This test was use to find the bugs fixed by
these recent commits:

  4f7c06e26e powerpc/ptrace: Fix setting 512B aligned breakpoints with PTRACE_SET_DEBUGREG
  cd6ef7eebf powerpc/ptrace: Fix enforcement of DAWR constraints

Signed-off-by: Michael Neuling <mikey@neuling.org>
[mpe: Add SPDX tag, clang format it]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agoselftests/powerpc: Add missing .gitignores
Michael Neuling [Tue, 22 May 2018 06:13:59 +0000 (16:13 +1000)]
selftests/powerpc: Add missing .gitignores

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/mm: Only read faulting instruction when necessary in do_page_fault()
Christophe Leroy [Wed, 23 May 2018 08:53:22 +0000 (10:53 +0200)]
powerpc/mm: Only read faulting instruction when necessary in do_page_fault()

Commit a7a9dcd882a67 ("powerpc: Avoid taking a data miss on every
userspace instruction miss") has shown that limiting the read of
faulting instruction to likely cases improves performance.

This patch goes further into this direction by limiting the read
of the faulting instruction to the only cases where it is likely
needed.

On an MPC885, with the same benchmark app as in the commit referred
above, we see a reduction of about 3900 dTLB misses (approx 3%):

Before the patch:
 Performance counter stats for './fault 500' (10 runs):

         683033312      cpu-cycles                                                    ( +-  0.03% )
            134538      dTLB-load-misses                                              ( +-  0.03% )
             46099      iTLB-load-misses                                              ( +-  0.02% )
             19681      faults                                                        ( +-  0.02% )

       5.389747878 seconds time elapsed                                          ( +-  0.06% )

With the patch:

 Performance counter stats for './fault 500' (10 runs):

         682112862      cpu-cycles                                                    ( +-  0.03% )
            130619      dTLB-load-misses                                              ( +-  0.03% )
             46073      iTLB-load-misses                                              ( +-  0.05% )
             19681      faults                                                        ( +-  0.01% )

       5.381342641 seconds time elapsed                                          ( +-  0.07% )

The proper work of the huge stack expansion was tested with the
following app:

int main(int argc, char **argv)
{
char buf[1024 * 1025];

sprintf(buf, "Hello world !\n");
printf(buf);

exit(0);
}

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Add include of pagemap.h to fix build errors]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/mm: Use instruction symbolic names in store_updates_sp()
Christophe Leroy [Wed, 23 May 2018 07:04:04 +0000 (09:04 +0200)]
powerpc/mm: Use instruction symbolic names in store_updates_sp()

Use symbolic names defined in asm/ppc-opcode.h
instead of hardcoded values.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agohwmon: (ibmpowernv) Add energy sensors
Shilpasri G Bhat [Mon, 7 May 2018 10:25:38 +0000 (15:55 +0530)]
hwmon: (ibmpowernv) Add energy sensors

This patch exports the accumulated power numbers of each power
sensor maintained by OCC.

Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agohwmon: (ibmpowernv): Add support to read 64 bit sensors
Shilpasri G Bhat [Mon, 7 May 2018 10:25:37 +0000 (15:55 +0530)]
hwmon: (ibmpowernv): Add support to read 64 bit sensors

The firmware has supported for reading sensor values of size u32.
This patch adds support to use newer firmware functions which allows
to read the sensors of size u64.

Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowernv: opal-sensor: Add support to read 64bit sensor values
Shilpasri G Bhat [Mon, 7 May 2018 10:25:36 +0000 (15:55 +0530)]
powernv: opal-sensor: Add support to read 64bit sensor values

This patch adds support to read 64-bit sensor values. This method is
used to read energy sensors and counters which are of type u64.

Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agoselftests/powerpc: Remove redundant cp_abort test
Michael Neuling [Thu, 5 Oct 2017 23:48:57 +0000 (10:48 +1100)]
selftests/powerpc: Remove redundant cp_abort test

Paste on POWER9 only works on accelerators and no longer on real
memory. Hence this test is broken so remove it.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/fsl/dts: fix the i2c-mux compatible for t104xqds
Peter Rosin [Thu, 3 Aug 2017 12:59:34 +0000 (14:59 +0200)]
powerpc/fsl/dts: fix the i2c-mux compatible for t104xqds

The sanctioned compatible is "nxp,pca9547".

Signed-off-by: Peter Rosin <peda@axentia.se>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/ptrace: Fix setting 512B aligned breakpoints with PTRACE_SET_DEBUGREG
Michael Neuling [Thu, 17 May 2018 05:37:15 +0000 (15:37 +1000)]
powerpc/ptrace: Fix setting 512B aligned breakpoints with PTRACE_SET_DEBUGREG

In commit e2a800beaca1 ("powerpc/hw_brk: Fix off by one error when
validating DAWR region end") we fixed setting the DAWR end point to
its max value via PPC_PTRACE_SETHWDEBUG. Unfortunately we broke
PTRACE_SET_DEBUGREG when setting a 512 byte aligned breakpoint.

PTRACE_SET_DEBUGREG currently sets the length of the breakpoint to
zero (memset() in hw_breakpoint_init()). This worked with
arch_validate_hwbkpt_settings() before the above patch was applied but
is now broken if the breakpoint is 512byte aligned.

This sets the length of the breakpoint to 8 bytes when using
PTRACE_SET_DEBUGREG.

Fixes: e2a800beaca1 ("powerpc/hw_brk: Fix off by one error when validating DAWR region end")
Cc: stable@vger.kernel.org # v3.11+
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/ptrace: Fix enforcement of DAWR constraints
Michael Neuling [Thu, 17 May 2018 05:37:14 +0000 (15:37 +1000)]
powerpc/ptrace: Fix enforcement of DAWR constraints

Back when we first introduced the DAWR, in commit 4ae7ebe9522a
("powerpc: Change hardware breakpoint to allow longer ranges"), we
screwed up the constraint making it a 1024 byte boundary rather than a
512. This makes the check overly permissive. Fortunately GDB is the
only real user and it always did they right thing, so we never
noticed.

This fixes the constraint to 512 bytes.

Fixes: 4ae7ebe9522a ("powerpc: Change hardware breakpoint to allow longer ranges")
Cc: stable@vger.kernel.org # v3.9+
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/powernv: Use __raw_[rm_]writeq_be() in npu-dma.c
Michael Ellerman [Mon, 14 May 2018 12:50:33 +0000 (22:50 +1000)]
powerpc/powernv: Use __raw_[rm_]writeq_be() in npu-dma.c

This allows us to squash some sparse warnings and also avoids having
to do explicity endian conversions in the code.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Samuel Mendoza-Jonas <sam@mendozajonas.com>
6 years agopowerpc/powernv: Use __raw_[rm_]writeq_be() in pci-ioda.c
Michael Ellerman [Mon, 14 May 2018 12:50:32 +0000 (22:50 +1000)]
powerpc/powernv: Use __raw_[rm_]writeq_be() in pci-ioda.c

This allows us to squash some sparse warnings and also avoids having
to do explicity endian conversions in the code.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Samuel Mendoza-Jonas <sam@mendozajonas.com>
6 years agopowerpc/io: Add __raw_writeq_be() __raw_rm_writeq_be()
Michael Ellerman [Mon, 14 May 2018 12:50:31 +0000 (22:50 +1000)]
powerpc/io: Add __raw_writeq_be() __raw_rm_writeq_be()

Add byte-swapping versions of __raw_writeq() and __raw_rm_writeq().

This allows us to avoid sparse warnings caused by passing __be64 to
__raw_writeq(), which takes unsigned long:

  arch/powerpc/platforms/powernv/pci-ioda.c:1981:38:
  warning: incorrect type in argument 1 (different base types)
      expected unsigned long [unsigned] v
      got restricted __be64 [usertype] <noident>

It's also generally preferable to use a byte-swapping accessor rather
than doing it by hand in the code, which is more bug prone.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Samuel Mendoza-Jonas <sam@mendozajonas.com>
6 years agopowerpc/perf: Fix memory allocation for core-imc based on num_possible_cpus()
Anju T Sudhakar [Wed, 16 May 2018 06:35:18 +0000 (12:05 +0530)]
powerpc/perf: Fix memory allocation for core-imc based on num_possible_cpus()

Currently memory is allocated for core-imc based on cpu_present_mask,
which has bit 'cpu' set iff cpu is populated. We use (cpu number / threads
per core) as the array index to access the memory.

Under some circumstances firmware marks a CPU as GUARDed CPU and boot the
system, until cleared of errors, these CPU's are unavailable for all
subsequent boots. GUARDed CPUs are possible but not present from linux
view, so it blows a hole when we assume the max length of our allocation
is driven by our max present cpus, where as one of the cpus might be online
and be beyond the max present cpus, due to the hole.
So (cpu number / threads per core) value bounds the array index and leads
to memory overflow.

Call trace observed during a guard test:

Faulting instruction address: 0xc000000000149f1c
cpu 0x69: Vector: 380 (Data Access Out of Range) at [c000003fea303420]
    pc:c000000000149f1c: prefetch_freepointer+0x14/0x30
    lr:c00000000014e0f8: __kmalloc+0x1a8/0x1ac
    sp:c000003fea3036a0
   msr:9000000000009033
   dar:c9c54b2c91dbf6b7
  current = 0xc000003fea2c0000
  paca    = 0xc00000000fddd880  softe: 3  irq_happened: 0x01
    pid   = 1, comm = swapper/104
Linux version 4.16.7-openpower1 (smc@smc-desktop) (gcc version 6.4.0
(Buildroot 2018.02.1-00006-ga8d1126)) #2 SMP Fri May 4 16:44:54 PDT 2018
enter ? for help
call trace:
 __kmalloc+0x1a8/0x1ac
 (unreliable)
 init_imc_pmu+0x7f4/0xbf0
 opal_imc_counters_probe+0x3fc/0x43c
 platform_drv_probe+0x48/0x80
 driver_probe_device+0x22c/0x308
 __driver_attach+0xa0/0xd8
 bus_for_each_dev+0x88/0xb4
 driver_attach+0x2c/0x40
 bus_add_driver+0x1e8/0x228
 driver_register+0xd0/0x114
 __platform_driver_register+0x50/0x64
 opal_imc_driver_init+0x24/0x38
 do_one_initcall+0x150/0x15c
 kernel_init_freeable+0x250/0x254
 kernel_init+0x1c/0x150
 ret_from_kernel_thread+0x5c/0xc8

Allocating memory for core-imc based on cpu_possible_mask, which has
bit 'cpu' set iff cpu is populatable, will fix this issue.

Reported-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com>
Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Tested-by: Pridhiviraj Paidipeddi <ppaidipe@linux.vnet.ibm.com>
Fixes: 39a846db1d57 ("powerpc/perf: Add core IMC PMU support")
Cc: stable@vger.kernel.org # v4.14+
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/rtas: Fix spelling mistake "Discharching" -> "Discharging"
Colin Ian King [Fri, 18 May 2018 09:31:17 +0000 (10:31 +0100)]
powerpc/rtas: Fix spelling mistake "Discharching" -> "Discharging"

Trivial fix to spelling mistake in battery_charging array.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/lib: Fix "integer constant is too large" build failure
Finn Thain [Fri, 18 May 2018 01:18:33 +0000 (11:18 +1000)]
powerpc/lib: Fix "integer constant is too large" build failure

My powerpc-linux-gnu-gcc v4.4.5 compiler can't build a 32-bit kernel
any more:

arch/powerpc/lib/sstep.c: In function 'do_popcnt':
arch/powerpc/lib/sstep.c:1068: error: integer constant is too large for 'long' type
arch/powerpc/lib/sstep.c:1069: error: integer constant is too large for 'long' type
arch/powerpc/lib/sstep.c:1069: error: integer constant is too large for 'long' type
arch/powerpc/lib/sstep.c:1070: error: integer constant is too large for 'long' type
arch/powerpc/lib/sstep.c:1079: error: integer constant is too large for 'long' type
arch/powerpc/lib/sstep.c: In function 'do_prty':
arch/powerpc/lib/sstep.c:1117: error: integer constant is too large for 'long' type

This file gets compiled with -std=gnu89 which means a constant can be
given the type 'long' even if it won't fit. Fix the errors with a 'ULL'
suffix on the relevant constants.

Fixes: 2c979c489fee ("powerpc/lib/sstep: Add prty instruction emulation")
Fixes: dcbd19b48d31 ("powerpc/lib/sstep: Add popcnt instruction emulation")
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/powernv: Fix opal_event_shutdown() called with interrupts disabled
Nicholas Piggin [Mon, 14 May 2018 15:59:46 +0000 (01:59 +1000)]
powerpc/powernv: Fix opal_event_shutdown() called with interrupts disabled

A kernel crash in process context that calls emergency_restart from
panic will end up calling opal_event_shutdown with interrupts disabled
but not in interrupt. This causes a sleeping function to be called
which gives the following warning with sysrq+c:

    Rebooting in 10 seconds..
    BUG: sleeping function called from invalid context at kernel/locking/mutex.c:238
    in_atomic(): 0, irqs_disabled(): 1, pid: 7669, name: bash
    CPU: 20 PID: 7669 Comm: bash Tainted: G      D W         4.17.0-rc5+ #3
    Call Trace:
    dump_stack+0xb0/0xf4 (unreliable)
    ___might_sleep+0x174/0x1a0
    mutex_lock+0x38/0xb0
    __free_irq+0x68/0x460
    free_irq+0x70/0xc0
    opal_event_shutdown+0xb4/0xf0
    opal_shutdown+0x24/0xa0
    pnv_shutdown+0x28/0x40
    machine_shutdown+0x44/0x60
    machine_restart+0x28/0x80
    emergency_restart+0x30/0x50
    panic+0x2a0/0x328
    oops_end+0x1ec/0x1f0
    bad_page_fault+0xe8/0x154
    handle_page_fault+0x34/0x38
    --- interrupt: 300 at sysrq_handle_crash+0x44/0x60
    LR = __handle_sysrq+0xfc/0x260
    flag_spec.62335+0x12b844/0x1e8db4 (unreliable)
    __handle_sysrq+0xfc/0x260
    write_sysrq_trigger+0xa8/0xb0
    proc_reg_write+0xac/0x110
    __vfs_write+0x6c/0x240
    vfs_write+0xd0/0x240
    ksys_write+0x6c/0x110

Fixes: 9f0fd0499d30 ("powerpc/powernv: Add a virtual irqchip for opal events")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/32: Use stmw/lmw for registers save/restore in asm
Christophe Leroy [Tue, 17 Apr 2018 17:08:18 +0000 (19:08 +0200)]
powerpc/32: Use stmw/lmw for registers save/restore in asm

arch/powerpc/Makefile activates -mmultiple on BE PPC32 configs
in order to use multiple word instructions in functions entry/exit.

The patch does the same for the asm parts, for consistency.

On processors like the 8xx on which insn fetching is pretty slow,
this speeds up registers save/restore.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
[mpe: PPC32 is BE only, so drop the endian checks]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc: Avoid an unnecessary test and branch in longjmp()
Christophe Leroy [Tue, 17 Apr 2018 17:08:16 +0000 (19:08 +0200)]
powerpc: Avoid an unnecessary test and branch in longjmp()

Doing the test at exit of the function avoids an unnecessary
test and branch inside longjmp().

Semantics are unchanged.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agoRevert "powerpc/64: Fix checksum folding in csum_add()"
Christophe Leroy [Tue, 10 Apr 2018 06:34:37 +0000 (08:34 +0200)]
Revert "powerpc/64: Fix checksum folding in csum_add()"

This reverts commit 6ad966d7303b70165228dba1ee8da1a05c10eefe.

That commit was pointless, because csum_add() sums two 32 bits
values, so the sum is 0x1fffffffe at the maximum.
And then when adding upper part (1) and lower part (0xfffffffe),
the result is 0xffffffff which doesn't carry.
Any lower value will not carry either.

And behind the fact that this commit is useless, it also kills the
whole purpose of having an arch specific inline csum_add()
because the resulting code gets even worse than what is obtained
with the generic implementation of csum_add()

0000000000000240 <.csum_add>:
 240: 38 00 ff ff  li      r0,-1
 244: 7c 84 1a 14  add     r4,r4,r3
 248: 78 00 00 20  clrldi  r0,r0,32
 24c: 78 89 00 22  rldicl  r9,r4,32,32
 250: 7c 80 00 38  and     r0,r4,r0
 254: 7c 09 02 14  add     r0,r9,r0
 258: 78 09 00 22  rldicl  r9,r0,32,32
 25c: 7c 00 4a 14  add     r0,r0,r9
 260: 78 03 00 20  clrldi  r3,r0,32
 264: 4e 80 00 20  blr

In comparison, the generic implementation of csum_add() gives:

0000000000000290 <.csum_add>:
 290: 7c 63 22 14  add     r3,r3,r4
 294: 7f 83 20 40  cmplw   cr7,r3,r4
 298: 7c 10 10 26  mfocrf  r0,1
 29c: 54 00 ef fe  rlwinm  r0,r0,29,31,31
 2a0: 7c 60 1a 14  add     r3,r0,r3
 2a4: 78 63 00 20  clrldi  r3,r3,32
 2a8: 4e 80 00 20  blr

And the reverted implementation for PPC64 gives:

0000000000000240 <.csum_add>:
 240: 7c 84 1a 14  add     r4,r4,r3
 244: 78 80 00 22  rldicl  r0,r4,32,32
 248: 7c 80 22 14  add     r4,r0,r4
 24c: 78 83 00 20  clrldi  r3,r4,32
 250: 4e 80 00 20  blr

Fixes: 6ad966d7303b7 ("powerpc/64: Fix checksum folding in csum_add()")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Acked-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc: get rid of PMD_PAGE_SIZE() and _PMD_SIZE
Christophe Leroy [Wed, 16 May 2018 06:58:57 +0000 (08:58 +0200)]
powerpc: get rid of PMD_PAGE_SIZE() and _PMD_SIZE

PMD_PAGE_SIZE() is nowhere used and _PMD_SIZE is only
used by PMD_PAGE_SIZE().

This patch removes them.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/embedded6xx/hlwd-pic: Prevent interrupts from being handled by Starlet
Jonathan Neuschäfer [Thu, 10 May 2018 21:59:19 +0000 (23:59 +0200)]
powerpc/embedded6xx/hlwd-pic: Prevent interrupts from being handled by Starlet

The interrupt controller inside the Wii's Hollywood chip is connected to
two masters, the "Broadway" PowerPC and the "Starlet" ARM926, each with
their own interrupt status and mask registers.

When booting the Wii with mini[1], interrupts from the SD card
controller (IRQ 7) are handled by the ARM, because mini provides SD
access over IPC. Linux however can't currently use or disable this IPC
service, so both sides try to handle IRQ 7 without coordination.

Let's instead make sure that all interrupts that are unmasked on the PPC
side are masked on the ARM side; this will also make sure that Linux can
properly talk to the SD card controller (and potentially other devices).

If access to a device through IPC is desired in the future, interrupts
from that device should not be handled by Linux directly.

[1]: https://github.com/lewurm/mini

Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/embedded6xx/flipper-pic: Don't match all IRQ domains
Jonathan Neuschäfer [Thu, 10 May 2018 21:59:18 +0000 (23:59 +0200)]
powerpc/embedded6xx/flipper-pic: Don't match all IRQ domains

On the Wii, there is a secondary IRQ controller (hlwd-pic), so
flipper-pic's match operation should not be hardcoded to return 1.
In fact, the default matching logic is sufficient, and we can completely
omit flipper_pic_match.

Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/book3s64: Enable split pmd ptlock.
Aneesh Kumar K.V [Mon, 16 Apr 2018 11:27:24 +0000 (16:57 +0530)]
powerpc/book3s64: Enable split pmd ptlock.

Testing with a threaded version of mmap_bench which allocate 1G chunks and
with large number of threads we find:

without patch

    32.72%  mmap_bench  [kernel.vmlinux]            [k] do_raw_spin_lock
            |
            ---do_raw_spin_lock
               |
                --32.68%--0
                          |
                          |--15.82%--pte_fragment_alloc
                          |          |
                          |           --15.79%--do_huge_pmd_anonymous_page
                          |                     __handle_mm_fault
                          |                     handle_mm_fault
                          |                     __do_page_fault
                          |                     handle_page_fault
                          |                     test_mmap
                          |                     test_mmap
                          |                     start_thread
                          |                     __clone
                          |
                          |--14.95%--do_huge_pmd_anonymous_page
                          |          __handle_mm_fault
                          |          handle_mm_fault
                          |          __do_page_fault
                          |          handle_page_fault
                          |          test_mmap
                          |          test_mmap
                          |          start_thread
                          |          __clone
                          |

with patch

    12.89%  mmap_bench  [kernel.vmlinux]            [k] do_raw_spin_lock
            |
            ---do_raw_spin_lock
               |
                --12.83%--0
                          |
                          |--3.21%--pagevec_lru_move_fn
                          |          __lru_cache_add
                          |          |
                          |           --2.74%--do_huge_pmd_anonymous_page
                          |                     __handle_mm_fault
                          |                     handle_mm_fault
                          |                     __do_page_fault
                          |                     handle_page_fault
                          |                     test_mmap
                          |                     test_mmap
                          |                     start_thread
                          |                     __clone
                          |
                          |--3.11%--do_huge_pmd_anonymous_page
                          |          __handle_mm_fault
                          |          handle_mm_fault
                          |          __do_page_fault
                          |          handle_page_fault
                          |          test_mmap
                          |          test_mmap
                          |          start_thread
                          |          __clone

.....
                          |
                           --0.55%--pte_fragment_alloc
                                     |
                                      --0.55%--do_huge_pmd_anonymous_page
                                                __handle_mm_fault
                                                handle_mm_fault
                                                __do_page_fault
                                                handle_page_fault
                                                test_mmap
                                                test_mmap
                                                start_thread
                                                __clone

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/mm: Use page fragments for allocation page table at PMD level
Aneesh Kumar K.V [Mon, 16 Apr 2018 11:27:23 +0000 (16:57 +0530)]
powerpc/mm: Use page fragments for allocation page table at PMD level

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/mm: Implement helpers for pagetable fragment support at PMD level
Aneesh Kumar K.V [Mon, 16 Apr 2018 11:27:22 +0000 (16:57 +0530)]
powerpc/mm: Implement helpers for pagetable fragment support at PMD level

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/book3s64/mm: Simplify the rcu callback for page table free
Aneesh Kumar K.V [Mon, 16 Apr 2018 11:27:21 +0000 (16:57 +0530)]
powerpc/book3s64/mm: Simplify the rcu callback for page table free

Instead of encoding shift in the table address, use an enumerated index value.
This allow us to do different things in the callback for pte and pmd.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/mm/book3s64/4k: Switch 4k pagesize config to use pagetable fragment
Aneesh Kumar K.V [Mon, 16 Apr 2018 11:27:20 +0000 (16:57 +0530)]
powerpc/mm/book3s64/4k: Switch 4k pagesize config to use pagetable fragment

4K config use one full page at level 4 of the pagetable. Add support for single
fragment allocation in pagetable fragment code and and use that for 4K config.
This makes both 4k and 64k use the same code path. Later we will switch pmd to
use the page table fragment code. This is done only for 64bit platforms which
is using page table fragment support.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/mm/nohash: Remove pte fragment dependency from nohash
Aneesh Kumar K.V [Mon, 16 Apr 2018 11:27:19 +0000 (16:57 +0530)]
powerpc/mm/nohash: Remove pte fragment dependency from nohash

Now that we have removed 64K page size support, the RCU page table free can
be much simpler for nohash. Make a copy of the the rcu callback to pgalloc.h
header similar to nohash 32. We could possibly merge 32 and 64 bit there. But
that is for a later patch

We also move the book3s specific handler to pgtable_book3s64.c. This will be
updated in a later patch to handle split pmd ptlock.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/mm/book3e/64: Remove unsupported 64Kpage size from 64bit booke
Aneesh Kumar K.V [Mon, 16 Apr 2018 11:27:18 +0000 (16:57 +0530)]
powerpc/mm/book3e/64: Remove unsupported 64Kpage size from 64bit booke

We have in Kconfig

config PPC_64K_PAGES
bool "64k page size"
depends on !PPC_FSL_BOOK3E && (44x || PPC_BOOK3S_64 || PPC_BOOK3E_64)
select HAVE_ARCH_SOFT_DIRTY if PPC_BOOK3S_64

Only supported BOOK3E 64 bit platforms is FSL_BOOK3E. Remove the dead 64k page
support code from 64bit nohash.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/mm: Rename pte fragment functions
Aneesh Kumar K.V [Mon, 16 Apr 2018 11:27:17 +0000 (16:57 +0530)]
powerpc/mm: Rename pte fragment functions

We rename the alloc and get_from_cache to indicate they operate on pte
fragments. In later patch we will add pmd fragment support.

No functional change in this patch.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/mm: Use pmd_lockptr instead of opencoding it
Aneesh Kumar K.V [Mon, 16 Apr 2018 11:27:16 +0000 (16:57 +0530)]
powerpc/mm: Use pmd_lockptr instead of opencoding it

In later patch we switch pmd_lock from mm->page_table_lock to split pmd ptlock.
It avoid compilations issues, use pmd_lockptr helper.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/mm/book3s64: Move book3s64 code to pgtable-book3s64
Aneesh Kumar K.V [Mon, 16 Apr 2018 11:27:14 +0000 (16:57 +0530)]
powerpc/mm/book3s64: Move book3s64 code to pgtable-book3s64

Only code movement and avoid #ifdef.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agoMerge branch 'topic/ppc-kvm' into next
Michael Ellerman [Tue, 15 May 2018 12:28:19 +0000 (22:28 +1000)]
Merge branch 'topic/ppc-kvm' into next

This brings in one commit that we may want to share with the kvm-ppc
tree, to avoid merge conflicts and get wider testing.

6 years agopowerpc/kvm: Switch kvm pmd allocator to custom allocator
Aneesh Kumar K.V [Mon, 16 Apr 2018 11:27:15 +0000 (16:57 +0530)]
powerpc/kvm: Switch kvm pmd allocator to custom allocator

In the next set of patches, we will switch pmd allocator to use page fragments
and the locking will be updated to split pmd ptlock. We want to avoid using
fragments for partition-scoped table. Use slab cache similar to level 4 table

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/cell/spufs: Change return type to vm_fault_t
Souptick Joarder [Fri, 20 Apr 2018 17:32:39 +0000 (23:02 +0530)]
powerpc/cell/spufs: Change return type to vm_fault_t

Use new return type vm_fault_t for fault handler. For now, this is
just documenting that the function returns a VM_FAULT value rather
than an errno. Once all instances are converted, vm_fault_t will
become a distinct type. See commit 1c8f422059ae ("mm: change return
type to vm_fault_t").

We are fixing a minor bug, that the error from vm_insert_pfn() was
being ignored and the effect of this is likely to be only felt in OOM
situations.

Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agomacintosh/windfarm: fix spelling mistake: "ttarged" -> "ttarget"
Colin Ian King [Thu, 10 May 2018 15:54:39 +0000 (16:54 +0100)]
macintosh/windfarm: fix spelling mistake: "ttarged" -> "ttarget"

Trivial fix to spelling mistake in debug messages of a structure
field name

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agoselftests/powerpc: fix exec benchmark
Nicholas Piggin [Sat, 12 May 2018 03:35:24 +0000 (13:35 +1000)]
selftests/powerpc: fix exec benchmark

The exec_target binary could segfault calling _exit(2) because r13
is not set up properly (and libc looks at that when performing a
syscall). Call SYS_exit using syscall(2) which doesn't seem to
have this problem.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/ioda: Use ibm, supported-tce-sizes for IOMMU page size mask
Alexey Kardashevskiy [Mon, 14 May 2018 09:39:22 +0000 (19:39 +1000)]
powerpc/ioda: Use ibm, supported-tce-sizes for IOMMU page size mask

At the moment we assume that IODA2 and newer PHBs can always do 4K/64K/16M
IOMMU pages, however this is not the case for POWER9 and now skiboot
advertises the supported sizes via the device so we use that instead
of hard coding the mask.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/powernv: Fix memtrace build when NUMA=n
Michael Ellerman [Thu, 10 May 2018 13:09:13 +0000 (23:09 +1000)]
powerpc/powernv: Fix memtrace build when NUMA=n

Currently memtrace doesn't build if NUMA=n:

  In function â€˜memtrace_alloc_node’:
  arch/powerpc/platforms/powernv/memtrace.c:134:6:
  error: the address of â€˜contig_page_data’ will always evaluate as â€˜true’
    if (!NODE_DATA(nid) || !node_spanned_pages(nid))
        ^

This is because for NUMA=n NODE_DATA(nid) points to an always
allocated structure, contig_page_data.

But even in the NUMA=y case memtrace_alloc_node() is only called for
online nodes, and we should always have a NODE_DATA() allocated for an
online node. So remove the (hopefully) overly paranoid check, which
also means we can build when NUMA=n.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/prom: Drop support for old FDT versions
Michael Ellerman [Wed, 9 May 2018 13:42:27 +0000 (23:42 +1000)]
powerpc/prom: Drop support for old FDT versions

In commit e6a6928c3ea1 ("of/fdt: Convert FDT functions to use
libfdt") (Apr 2014), the generic flat device tree code dropped support
for flat device tree's older than version 0x10 (16).

We still have code in our CPU scanning to cope with flat device tree
versions earlier than 2, which can now never trigger, so drop it.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/lib: Add alt patching test of branching past the last instruction
Michael Ellerman [Mon, 16 Apr 2018 14:39:05 +0000 (00:39 +1000)]
powerpc/lib: Add alt patching test of branching past the last instruction

Add a test of the relative branch patching logic in the alternate
section feature fixup code. This tests that if we branch past the last
instruction of the alternate section, the branch is not patched.
That's because the assembler will have created a branch that already
points to the first instruction after the patched section, which is
correct and needs no further patching.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/lib: Rename ftr_fixup_test7 to ftr_fixup_test_too_big
Michael Ellerman [Mon, 16 Apr 2018 14:39:04 +0000 (00:39 +1000)]
powerpc/lib: Rename ftr_fixup_test7 to ftr_fixup_test_too_big

We want this to remain the last test (because it's disabled by
default), so give it a non-numbered name so we don't have to renumber
it when adding new tests before it.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/lib: Fix the feature fixup tests to actually work
Michael Ellerman [Mon, 16 Apr 2018 14:39:03 +0000 (00:39 +1000)]
powerpc/lib: Fix the feature fixup tests to actually work

The code patching code has always been a bit confused about whether
it's best to use void *, unsigned int *, char *, etc. to point to
instructions. In fact in the feature fixups tests we use both unsigned
int[] and u8[] in different places.

Unfortunately the tests that use unsigned int[] calculate the size of
the code blocks using subtraction of those unsigned int pointers, and
then pass the result to memcmp(). This means we're only comparing 1/4
of the bytes we need to, because we need to multiply by
sizeof(unsigned int) to get the number of *bytes*.

The result is that the tests do all the patching and then only compare
some of the resulting code, so patching bugs that only effect that
last 3/4 of the code could slip through undetected. It turns out that
hasn't been happening, although one test had a bad expected case (see
previous commit).

Fix it for now by multiplying the size by 4 in the affected functions.

Fixes: 362e7701fd18 ("powerpc: Add self-tests of the feature fixup code")
Epic-brown-paper-bag-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/lib: Fix feature fixup test of external branch
Michael Ellerman [Mon, 16 Apr 2018 14:39:02 +0000 (00:39 +1000)]
powerpc/lib: Fix feature fixup test of external branch

The expected case for this test was wrong, the source of the alternate
code sequence is:

  FTR_SECTION_ELSE
  2: or 2,2,2
   PPC_LCMPI r3,1
   beq 3f
   blt 2b
   b 3f
   b 1b
  ALT_FTR_SECTION_END(0, 1)
  3: or 1,1,1
   or 2,2,2
  4: or 3,3,3

So when it's patched the '3' label should still be on the 'or 1,1,1',
and the 4 label is irrelevant and can be removed.

Fixes: 362e7701fd18 ("powerpc: Add self-tests of the feature fixup code")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc: Make it clearer that systbl check errors are errors
Michael Ellerman [Mon, 30 Apr 2018 03:27:36 +0000 (13:27 +1000)]
powerpc: Make it clearer that systbl check errors are errors

If the systbl_chk.sh checks fail we print a message, but with no
indication that it's an error. That makes it hard to find in build
logs with eg. grep.

So prefix any output with "Error:".

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/syscalls: timer_create can be handle by perfectly normal COMPAT_SYS_SPU
Al Viro [Wed, 2 May 2018 13:20:51 +0000 (23:20 +1000)]
powerpc/syscalls: timer_create can be handle by perfectly normal COMPAT_SYS_SPU

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/syscalls: kill ppc32_select()
Al Viro [Wed, 2 May 2018 13:20:50 +0000 (23:20 +1000)]
powerpc/syscalls: kill ppc32_select()

it had always been pointless - compat_sys_select() sign-extends
the first argument just fine on its own.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
[mpe: Use COMPAT_SPU_NEW() to keep systbl_chk.sh happy]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/syscalls: Add COMPAT_SPU_NEW() macro
Michael Ellerman [Wed, 2 May 2018 13:20:49 +0000 (23:20 +1000)]
powerpc/syscalls: Add COMPAT_SPU_NEW() macro

Currently the select system call is wired up with the SYSX_SPU()
macro. The SYSX_SPU() is not handled by systbl_chk.c, which means the
syscall number for select is not checked.

That hides the fact that the syscall number for select is actually
__NR__newselect not __NR_select.

In a following patch we'd like to drop ppc32_select() which means
select will become a regular COMPAT_SYS_SPU() syscall. But
COMPAT_SYS_SPU() can't deal with the fact that the syscall number is
actually __NR__newselect. We also can't just redefine __NR_select
because that's still used for the old select call.

So add a new COMPAT_NEW_SPU() that does the same thing as
COMPAT_SYS_SPU() except it encodes that we're using the new number.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/syscalls: switch rtas(2) to SYSCALL_DEFINE
Al Viro [Wed, 2 May 2018 13:20:48 +0000 (23:20 +1000)]
powerpc/syscalls: switch rtas(2) to SYSCALL_DEFINE

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
[mpe: Update sys_ni.c for s/ppc_rtas/sys_rtas/]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/syscalls: signal_{32, 64} - switch to SYSCALL_DEFINE
Al Viro [Wed, 2 May 2018 13:20:47 +0000 (23:20 +1000)]
powerpc/syscalls: signal_{32, 64} - switch to SYSCALL_DEFINE

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
[mpe: Fix sys_debug_setcontext() prototype to return long]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/syscalls: Switch trivial cases to SYSCALL_DEFINE
Al Viro [Wed, 2 May 2018 13:20:46 +0000 (23:20 +1000)]
powerpc/syscalls: Switch trivial cases to SYSCALL_DEFINE

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/livepatch: Implement reliable stack tracing for the consistency model
Torsten Duwe [Fri, 4 May 2018 12:38:34 +0000 (14:38 +0200)]
powerpc/livepatch: Implement reliable stack tracing for the consistency model

The "Power Architecture 64-Bit ELF V2 ABI" says in section 2.3.2.3:

[...] There are several rules that must be adhered to in order to ensure
reliable and consistent call chain backtracing:

* Before a function calls any other function, it shall establish its
  own stack frame, whose size shall be a multiple of 16 bytes.

 â€“ In instances where a function’s prologue creates a stack frame, the
   back-chain word of the stack frame shall be updated atomically with
   the value of the stack pointer (r1) when a back chain is implemented.
   (This must be supported as default by all ELF V2 ABI-compliant
   environments.)
[...]
 â€“ The function shall save the link register that contains its return
   address in the LR save doubleword of its caller’s stack frame before
   calling another function.

To me this sounds like the equivalent of HAVE_RELIABLE_STACKTRACE.
This patch may be unneccessarily limited to ppc64le, but OTOH the only
user of this flag so far is livepatching, which is only implemented on
PPCs with 64-LE, a.k.a. ELF ABI v2.

Feel free to add other ppc variants, but so far only ppc64le got tested.

This change also implements save_stack_trace_tsk_reliable() for ppc64le
that checks for the above conditions, where possible.

Signed-off-by: Torsten Duwe <duwe@suse.de>
Signed-off-by: Nicolai Stange <nstange@suse.de>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/watchdog: provide more data in watchdog messages
Nicholas Piggin [Sat, 5 May 2018 07:26:00 +0000 (17:26 +1000)]
powerpc/watchdog: provide more data in watchdog messages

Provide timebase and timebase of last heartbeat in watchdog lockup
messages. Also provide a stack trace of when a CPU becomes un-stuck,
which can be useful -- it could be where irqs are re-enabled, so it
may be the end of the critical section which is responsible for the
latency which is useful information.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/watchdog: don't update the watchdog timestamp if a lockup is detected
Nicholas Piggin [Sat, 5 May 2018 07:25:59 +0000 (17:25 +1000)]
powerpc/watchdog: don't update the watchdog timestamp if a lockup is detected

The watchdog heartbeat timestamp is updated when the local heartbeat
timer fires (or touch_nmi_watchdog() is called).

This is an interesting data point, so don't overwrite it when the
soft-NMI interrupt detects a hard lockup. That code came from a pre-
merge version to prevent hard lockup messages flood, but that's taken
care of with the stuck CPU logic now, so there is no reason to
update the heartbeat timestamp here.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/xive: prepare all hcalls to support long busy delays
Cédric Le Goater [Tue, 8 May 2018 07:05:17 +0000 (09:05 +0200)]
powerpc/xive: prepare all hcalls to support long busy delays

This is not the case for the moment, but future releases of pHyp might
need to introduce some synchronisation routines under the hood which
would make the XIVE hcalls longer to complete.

As this was done for H_INT_RESET, let's wrap the other hcalls in a
loop catching the H_LONG_BUSY_* codes.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/xive: shutdown XIVE when kexec or kdump is performed
Cédric Le Goater [Tue, 8 May 2018 07:05:16 +0000 (09:05 +0200)]
powerpc/xive: shutdown XIVE when kexec or kdump is performed

The hcall H_INT_RESET should be called to make sure XIVE is fully
reseted.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/xive: fix hcall H_INT_RESET to support long busy delays
Cédric Le Goater [Tue, 8 May 2018 07:05:15 +0000 (09:05 +0200)]
powerpc/xive: fix hcall H_INT_RESET to support long busy delays

The hcall H_INT_RESET can take some time to complete and in such cases
it returns H_LONG_BUSY_* codes requiring the machine to sleep for a
while before retrying.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/64/kexec: fix race in kexec when XIVE is shutdown
Cédric Le Goater [Tue, 8 May 2018 07:05:14 +0000 (09:05 +0200)]
powerpc/64/kexec: fix race in kexec when XIVE is shutdown

The kexec_state KEXEC_STATE_IRQS_OFF barrier is reached by all
secondary CPUs before the kexec_cpu_down() operation is called on
secondaries. This can raise conflicts and provoque errors in the XIVE
hcalls when XIVE is shutdown with H_INT_RESET on the primary CPU.

To synchronize the kexec_cpu_down() operations and make sure the
secondaries have completed their task before the primary starts doing
the same, let's move the primary kexec_cpu_down() after the
KEXEC_STATE_REAL_MODE barrier.

This change of the ending sequence of kexec is mostly useful on the
pseries platform but it impacts also the powernv, ps3 and 85xx
platforms. powernv can be easily tested and fixed but some caution is
required for the other two.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/config: powernv_defconfig updates
Nicholas Piggin [Mon, 30 Apr 2018 10:31:50 +0000 (20:31 +1000)]
powerpc/config: powernv_defconfig updates

For consideration:

* Add NVDIMM support - Enables greater testing, mambo device.
* Add IPv6 support built in + additional modules - Because it's 2018 maan.
* Add DEFERRED_STRUCT_PAGE_INIT - Let's see what breaks.
* Add PPC_MEMTRACE - Small powernv debugfs driver for getting hardware traces.
* Add MEMORY_FAILURE - Machine check exceptions can now drive memory failure.
* Turn on FANOTIFY - This is the current filesystem notification feature.
* Turn on SCOM_DEBUGFS - Handy for hardware/firmware debugging, security risk?
* Turn on async SCSI scanning - Let's see what breaks.
* Add MLX5 driver as a module - Popular demand.
* Add CRYPTO_CRCT10DIF_VPMSUM - POWER8 T10DIF acceleration.

* Make a bunch of USB hid drivers modules.
* Make SCSI SG, SR, and FC modules - FC is huge.
* Make video drivers except AST GPU modules - Also huge.
* Make PCI serial driver a module - Uncommon.
* Make more things modules, NFS FS, RAM disk, netconsole, MS-DOS fs.

* Get rid of /dev/port - Not used.
* Remove PPS and PTP subsystms - Unusual.
* Remove legacy BSD ttys - Long dead.
* Remove IDE - Deprecated and replaced with ATA.
* Remove WIRELESS - Until we get POWER9 laptops.
* Remove RAW - Long deprecated in favour of direct IO.
* Remove floppy, parport, and PS2 input devices - not supported.
* Remove virtio drivers, ballooning - We're host only.
* Remove PPP - Sorry Paulus.

This results in a significantly smaller vmlinux:

    text      data       bss        dec   filename
13143383   5277944   1317856   19739183   vanilla
12263281   4852074   1341720   18457075   patched

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc: wii_defconfig: Disable BCMA support
Jonathan Neuschäfer [Mon, 7 May 2018 14:20:19 +0000 (16:20 +0200)]
powerpc: wii_defconfig: Disable BCMA support

The B43 driver only needs CONFIG_SSB to support the WLAN card found in
the Wii. Configure it accordingly, and disable BCMA bus support to save
a bit of space.

Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc: wii_defconfig: Enable Wii SDHCI driver
Jonathan Neuschäfer [Mon, 7 May 2018 14:20:18 +0000 (16:20 +0200)]
powerpc: wii_defconfig: Enable Wii SDHCI driver

This allows access to the SD card and the BCM4318 Wifi module.

Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc: wii_defconfig: Enable GPIO-related options
Jonathan Neuschäfer [Mon, 7 May 2018 14:20:17 +0000 (16:20 +0200)]
powerpc: wii_defconfig: Enable GPIO-related options

Now that there's a GPIO driver for the Wii, let's enable the following
drivers:

- the GPIO driver itself
- gpio-keys
- gpio-poweroff
- gpio-leds and a few LED triggers

Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc: wii_defconfig: Disable Ethernet driver support code
Jonathan Neuschäfer [Mon, 7 May 2018 14:20:16 +0000 (16:20 +0200)]
powerpc: wii_defconfig: Disable Ethernet driver support code

The Wii doesn't have built-in Ethernet and USB Ethernet adapters are in
a different menu. Disable CONFIG_ETHERNET to save some space in support
code for Ethernet drivers.

Note that this patch doesn't disable any Ethernet drivers, because they
are not enabled by default.

Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/watchdog: fix typo 'can by' to 'can be'
Wolfram Sang [Sun, 6 May 2018 11:23:46 +0000 (13:23 +0200)]
powerpc/watchdog: fix typo 'can by' to 'can be'

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/pseries: hcall_exit tracepoint retval should be signed
Michael Ellerman [Mon, 7 May 2018 13:03:55 +0000 (23:03 +1000)]
powerpc/pseries: hcall_exit tracepoint retval should be signed

The hcall_exit() tracepoint has retval defined as unsigned long. That
leads to humours results like:

  bash-3686  [009] d..2   854.134094: hcall_entry: opcode=24
  bash-3686  [009] d..2   854.134095: hcall_exit: opcode=24 retval=18446744073709551609

It's normal for some hcalls to return negative values, displaying them
as unsigned isn't very helpful. So change it to signed.

  bash-3711  [001] d..2   471.691008: hcall_entry: opcode=24
  bash-3711  [001] d..2   471.691008: hcall_exit: opcode=24 retval=-7

Which can be more easily compared to H_NOT_FOUND in hvcall.h

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
6 years agoRevert "powerpc/powernv: Increase memory block size to 1GB on radix"
Balbir Singh [Tue, 1 May 2018 02:57:25 +0000 (12:57 +1000)]
Revert "powerpc/powernv: Increase memory block size to 1GB on radix"

This commit was a stop-gap to prevent crashes on hotunplug, caused by
the mismatch between the 1G mappings used for the linear mapping and the
memory block size. Those issues are now resolved because we split the
linear mapping at hotunplug time if necessary, as implemented in commit
4dd5f8a99e79 ("powerpc/mm/radix: Split linear mapping on hot-unplug").

Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Tested-by: Rashmica Gupta <rashmica.g@gmail.com>
Tested-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/nohash: Use IS_ENABLED() to simplify __set_pte_at()
Christophe Leroy [Tue, 24 Apr 2018 16:31:32 +0000 (18:31 +0200)]
powerpc/nohash: Use IS_ENABLED() to simplify __set_pte_at()

By using IS_ENABLED() we can simplify __set_pte_at() by removing
redundant *ptep = pte.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/nohash: Remove _PAGE_BUSY
Christophe Leroy [Tue, 24 Apr 2018 16:31:30 +0000 (18:31 +0200)]
powerpc/nohash: Remove _PAGE_BUSY

_PAGE_BUSY is always 0, remove it.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/nohash: Remove hash related code from nohash headers.
Christophe Leroy [Tue, 24 Apr 2018 16:31:28 +0000 (18:31 +0200)]
powerpc/nohash: Remove hash related code from nohash headers.

When nohash and book3s header were split, some hash related stuff
remained in the nohash header. This patch removes them.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
[mpe: Duplicate pte_young() to avoid circular header dependency]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/fadump: Unregister fadump on kexec down path.
Mahesh Salgaonkar [Fri, 27 Apr 2018 06:23:18 +0000 (11:53 +0530)]
powerpc/fadump: Unregister fadump on kexec down path.

Unregister fadump on kexec down path otherwise the fadump registration
in new kexec-ed kernel complains that fadump is already registered.
This makes new kernel to continue using fadump registered by previous
kernel which may lead to invalid vmcore generation. Hence this patch
fixes this issue by un-registering fadump in fadump_cleanup() which is
called during kexec path so that new kernel can register fadump with
new valid values.

Fixes: b500afff11f6 ("fadump: Invalidate registration and release reserved memory for general use.")
Cc: stable@vger.kernel.org # v3.4+
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/fadump: Do not use hugepages when fadump is active
Hari Bathini [Tue, 10 Apr 2018 13:41:31 +0000 (19:11 +0530)]
powerpc/fadump: Do not use hugepages when fadump is active

FADump capture kernel boots in restricted memory environment preserving
the context of previous kernel to save vmcore. Supporting hugepages in
such environment makes things unnecessarily complicated, as hugepages
need memory set aside for them. This means most of the capture kernel's
memory is used in supporting hugepages. In most cases, this results in
out-of-memory issues while booting FADump capture kernel. But hugepages
are not of much use in capture kernel whose only job is to save vmcore.
So, disabling hugepages support, when fadump is active, is a reliable
solution for the out of memory issues. Introducing a flag variable to
disable HugeTLB support when fadump is active.

Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/fadump: exclude memory holes while reserving memory in second kernel
Mahesh Salgaonkar [Tue, 10 Apr 2018 13:41:16 +0000 (19:11 +0530)]
powerpc/fadump: exclude memory holes while reserving memory in second kernel

The second kernel, during early boot after the crash, reserves rest of
the memory above boot memory size to make sure it does not touch any of the
dump memory area. It uses memblock_reserve() that reserves the specified
memory region irrespective of memory holes present within that region.
There are chances where previous kernel would have hot removed some of
its memory leaving memory holes behind. In such cases fadump kernel reports
incorrect number of reserved pages through arch_reserved_kernel_pages()
hook causing kernel to hang or panic.

Fix this by excluding memory holes while reserving rest of the memory
above boot memory size during second kernel boot after crash.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agotracing: Remove PPC32 wart from config TRACING_SUPPORT
Michael Ellerman [Wed, 2 May 2018 11:29:48 +0000 (21:29 +1000)]
tracing: Remove PPC32 wart from config TRACING_SUPPORT

config TRACING_SUPPORT has an exception for PPC32, because PPC32
didn't have irqflags tracing support.

But that hasn't been true since commit 5d38902c4838 ("powerpc: Add
irqtrace support for 32-bit powerpc") (Jun 2009).

So remove the exception for PPC32 and the comment.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc: remove retired sbc834x support
Paul Gortmaker [Mon, 11 Dec 2017 03:29:13 +0000 (22:29 -0500)]
powerpc: remove retired sbc834x support

I no longer have a functional version of this board for even the most
basic sanity boot testing, and they have not been available for purchase
for quite some years now.

There is no point in adding a burden to testing coverage that does
walk all the possible defconfigs, so with all the above in mind, it
makes sense to remove it.  Of course it will remain in the git history
for anyone who happens to stumble on one and wants to tinker with it.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc: Only support DYNAMIC_FTRACE not static
Michael Ellerman [Tue, 27 Mar 2018 04:29:06 +0000 (15:29 +1100)]
powerpc: Only support DYNAMIC_FTRACE not static

We've had dynamic ftrace support for over 9 years since Steve first
wrote it, all the distros use dynamic, and static is basically
untested these days, so drop support for static ftrace.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc64/ftrace: Implement support for ftrace_regs_caller()
Naveen N. Rao [Thu, 19 Apr 2018 07:04:09 +0000 (12:34 +0530)]
powerpc64/ftrace: Implement support for ftrace_regs_caller()

With -mprofile-kernel, we always save the full register state in
ftrace_caller(). While this works, this is inefficient if we're not
interested in the register state, such as when we're using the function
tracer.

Rename the existing ftrace_caller() as ftrace_regs_caller() and provide
a simpler implementation for ftrace_caller() that is used when registers
are not required to be saved.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc64/ftrace: Use the generic version of ftrace_replace_code()
Naveen N. Rao [Thu, 19 Apr 2018 07:04:08 +0000 (12:34 +0530)]
powerpc64/ftrace: Use the generic version of ftrace_replace_code()

Our implementation matches that of the generic version, which also
handles FTRACE_UPDATE_MODIFY_CALL. So, remove our implementation in
favor of the generic version.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc64/module: Tighten detection of mcount call sites with -mprofile-kernel
Naveen N. Rao [Thu, 19 Apr 2018 07:04:07 +0000 (12:34 +0530)]
powerpc64/module: Tighten detection of mcount call sites with -mprofile-kernel

For R_PPC64_REL24 relocations, we suppress emitting instructions for TOC
load/restore in the relocation stub if the relocation is for _mcount()
call when using -mprofile-kernel ABI.

To detect this, we check if the preceding instructions are per the
standard set of instructions emitted by gcc: either the two instruction
sequence of 'mflr r0; std r0,16(r1)', or the more optimized variant of a
single 'mflr r0'. This is not sufficient since nothing prevents users
from hand coding sequences involving a 'mflr r0' followed by a 'bl'.

For removing the toc save instruction from the stub, we additionally
check if the symbol is "_mcount". Add the same check here as well.

Also rename is_early_mcount_callsite() to is_mprofile_mcount_callsite()
since that is what is being checked. The use of "early" is misleading
since there is nothing involving this function that qualifies as early.

Fixes: 153086644fd1f ("powerpc/ftrace: Add support for -mprofile-kernel ftrace ABI")
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc64/kexec: Hard disable ftrace before switching to the new kernel
Naveen N. Rao [Thu, 19 Apr 2018 07:04:06 +0000 (12:34 +0530)]
powerpc64/kexec: Hard disable ftrace before switching to the new kernel

If function_graph tracer is enabled during kexec, we see the below
exception in the simulator:
root@(none):/# kexec -e
kvm: exiting hardware virtualization
kexec_core: Starting new kernel
[   19.262020070,5] OPAL: Switch to big-endian OS
kexec: Starting switchover sequence.
Interrupt to 0xC000000000004380 from 0xC000000000004380
** Execution stopped: Continuous Interrupt, Instruction caused exception,  **

Now that we have a more effective way to completely disable ftrace on
ppc64, let's also use that before switching to a new kernel during
kexec.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc64/ftrace: Disable ftrace during kvm entry/exit
Naveen N. Rao [Thu, 19 Apr 2018 07:04:05 +0000 (12:34 +0530)]
powerpc64/ftrace: Disable ftrace during kvm entry/exit

During guest entry/exit, we switch over to/from the guest MMU context
and we cannot take exceptions in the hypervisor code.

Since ftrace may be enabled and since it can result in us taking a trap,
disable ftrace by setting paca->ftrace_enabled to zero. There are two
paths through which we enter/exit a guest:
1. If we are the vcore runner, then we enter the guest via
__kvmppc_vcore_entry() and we disable ftrace around this. This is always
the case for Power9, and for the primary thread on Power8.
2. If we are a secondary thread in Power8, then we would be in nap due
to SMT being disabled. We are woken up by an IPI to enter the guest. In
this scenario, we enter the guest through kvm_start_guest(). We disable
ftrace at this point. In this scenario, ftrace would only get re-enabled
on the secondary thread when SMT is re-enabled (via start_secondary()).

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc64/ftrace: Disable ftrace during hotplug
Naveen N. Rao [Thu, 19 Apr 2018 07:04:04 +0000 (12:34 +0530)]
powerpc64/ftrace: Disable ftrace during hotplug

Disable ftrace when a cpu is about to go offline. When the cpu is woken
up, ftrace will get enabled in start_secondary().

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc64/ftrace: Delay enabling ftrace on secondary cpus
Naveen N. Rao [Thu, 19 Apr 2018 07:04:03 +0000 (12:34 +0530)]
powerpc64/ftrace: Delay enabling ftrace on secondary cpus

On the boot cpu, though we enable paca->ftrace_enabled in early_setup()
(via cpu_ready_for_interrupts()), we don't start tracing until much
later since ftrace is not initialized yet and since we only support
DYNAMIC_FTRACE on powerpc. However, it is possible that ftrace has been
initialized by the time some of the secondary cpus start up. In this
case, we will try to trace some of the early boot code which can cause
problems.

To address this, move setting paca->ftrace_enabled from
cpu_ready_for_interrupts() to early_setup() for the boot cpu, and towards
the end of start_secondary() for secondary cpus.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc64/ftrace: Add helpers to hard disable ftrace
Naveen N. Rao [Thu, 19 Apr 2018 07:04:02 +0000 (12:34 +0530)]
powerpc64/ftrace: Add helpers to hard disable ftrace

Add some helpers to enable/disable ftrace through paca->ftrace_enabled.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc64/ftrace: Rearrange #ifdef sections in ftrace.h
Naveen N. Rao [Thu, 19 Apr 2018 07:04:01 +0000 (12:34 +0530)]
powerpc64/ftrace: Rearrange #ifdef sections in ftrace.h

Re-arrange the last #ifdef section in preparation for a subsequent
change.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc64/ftrace: Add a field in paca to disable ftrace in unsafe code paths
Naveen N. Rao [Thu, 19 Apr 2018 07:04:00 +0000 (12:34 +0530)]
powerpc64/ftrace: Add a field in paca to disable ftrace in unsafe code paths

We have some C code that we call into from real mode where we cannot
take any exceptions. Though the C functions themselves are mostly safe,
if these functions are traced, there is a possibility that we may take
an exception. For instance, in certain conditions, the ftrace code uses
WARN(), which uses a 'trap' to do its job.

For such scenarios, introduce a new field in paca 'ftrace_enabled',
which is checked on ftrace entry before continuing. This field can then
be set to zero to disable/pause ftrace, and set to a non-zero value to
resume ftrace.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agoLinux v4.17-rc3
Linus Torvalds [Sun, 29 Apr 2018 21:17:42 +0000 (14:17 -0700)]
Linux v4.17-rc3

6 years agoMerge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 29 Apr 2018 17:06:05 +0000 (10:06 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:
 "Another set of x86 related updates:

   - Fix the long broken x32 version of the IPC user space headers which
     was noticed by Arnd Bergman in course of his ongoing y2038 work.
     GLIBC seems to have non broken private copies of these headers so
     this went unnoticed.

   - Two microcode fixlets which address some more fallout from the
     recent modifications in that area:

      - Unconditionally save the microcode patch, which was only saved
        when CPU_HOTPLUG was enabled causing failures in the late
        loading mechanism

      - Make the later loader synchronization finally work under all
        circumstances. It was exiting early and causing timeout failures
        due to a missing synchronization point.

   - Do not use mwait_play_dead() on AMD systems to prevent excessive
     power consumption as the CPU cannot go into deep power states from
     there.

   - Address an annoying sparse warning due to lost type qualifiers of
     the vmemmap and vmalloc base address constants.

   - Prevent reserving crash kernel region on Xen PV as this leads to
     the wrong perception that crash kernels actually work there which
     is not the case. Xen PV has its own crash mechanism handled by the
     hypervisor.

   - Add missing TLB cpuid values to the table to make the printout on
     certain machines correct.

   - Enumerate the new CLDEMOTE instruction

   - Fix an incorrect SPDX identifier

   - Remove stale macros"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/ipc: Fix x32 version of shmid64_ds and msqid64_ds
  x86/setup: Do not reserve a crash kernel region if booted on Xen PV
  x86/cpu/intel: Add missing TLB cpuid values
  x86/smpboot: Don't use mwait_play_dead() on AMD systems
  x86/mm: Make vmemmap and vmalloc base address constants unsigned long
  x86/vector: Remove the unused macro FPU_IRQ
  x86/vector: Remove the macro VECTOR_OFFSET_START
  x86/cpufeatures: Enumerate cldemote instruction
  x86/microcode: Do not exit early from __reload_late()
  x86/microcode/intel: Save microcode patch unconditionally
  x86/jailhouse: Fix incorrect SPDX identifier

6 years agoMerge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 29 Apr 2018 16:36:22 +0000 (09:36 -0700)]
Merge branch 'x86-pti-for-linus' of git://git./linux/kernel/git/tip/tip

Pull x86 pti fixes from Thomas Gleixner:
 "A set of updates for the x86/pti related code:

   - Preserve r8-r11 in int $0x80. r8-r11 need to be preserved, but the
     int$80 entry code removed that quite some time ago. Make it correct
     again.

   - A set of fixes for the Global Bit work which went into 4.17 and
     caused a bunch of interesting regressions:

      - Triggering a BUG in the page attribute code due to a missing
        check for early boot stage

      - Warnings in the page attribute code about holes in the kernel
        text mapping which are caused by the freeing of the init code.
        Handle such holes gracefully.

      - Reduce the amount of kernel memory which is set global to the
        actual text and do not incidentally overlap with data.

      - Disable the global bit when RANDSTRUCT is enabled as it
        partially defeats the hardening.

      - Make the page protection setup correct for vma->page_prot
        population again. The adjustment of the protections fell through
        the crack during the Global bit rework and triggers warnings on
        machines which do not support certain features, e.g. NX"

* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/entry/64/compat: Preserve r8-r11 in int $0x80
  x86/pti: Filter at vma->vm_page_prot population
  x86/pti: Disallow global kernel text with RANDSTRUCT
  x86/pti: Reduce amount of kernel text allowed to be Global
  x86/pti: Fix boot warning from Global-bit setting
  x86/pti: Fix boot problems from Global-bit setting

6 years agoMerge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 29 Apr 2018 16:03:25 +0000 (09:03 -0700)]
Merge branch 'timers-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull timer fixes from Thomas Gleixner:
 "Two fixes from the timer departement:

   - Fix a long standing issue in the NOHZ tick code which causes RB
     tree corruption, delayed timers and other malfunctions. The cause
     for this is code which modifies the expiry time of an enqueued
     hrtimer.

   - Revert the CLOCK_MONOTONIC/CLOCK_BOOTTIME unification due to
     regression reports. Seems userspace _is_ relying on the documented
     behaviour despite our hope that it wont"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  Revert: Unify CLOCK_MONOTONIC and CLOCK_BOOTTIME
  tick/sched: Do not mess with an enqueued hrtimer

6 years agoMerge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 29 Apr 2018 15:58:50 +0000 (08:58 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull perf fixes from Thomas Gleixner:
 "The perf update contains the following bits:

  x86:
   - Prevent setting freeze_on_smi on PerfMon V1 CPUs to avoid #GP

  perf stat:
   - Keep the '/' event modifier separator in fallback, for example when
     fallbacking from 'cpu/cpu-cycles/' to user level only, where it
     should become 'cpu/cpu-cycles/u' and not 'cpu/cpu-cycles/:u' (Jiri
     Olsa)

   - Fix PMU events parsing rule, improving error reporting for invalid
     events (Jiri Olsa)

   - Disable write_backward and other event attributes for !group events
     in a group, fixing, for instance this group: '{cycles,msr/aperf/}:S'
     that has leader sampling (:S) and where just the 'cycles', the
     leader event, should have the write_backward attribute set, in this
     case it all fails because the PMU where 'msr/aperf/' lives doesn't
     accepts write_backward style sampling (Jiri Olsa)

   - Only fall back group read for leader (Kan Liang)

   - Fix core PMU alias list for x86 platform (Kan Liang)

   - Print out hint for mixed PMU group error (Kan Liang)

   - Fix duplicate PMU name for interval print (Kan Liang)

  Core:
   - Set main kernel end address properly when reading kernel and module
     maps (Namhyung Kim)

  perf mem:
   - Fix incorrect entries and add missing man options (Sangwon Hong)

  s/390:
   - Remove s390 specific strcmp_cpuid_cmp function (Thomas Richter)

   - Adapt 'perf test' case record+probe_libc_inet_pton.sh for s390

   - Fix s390 undefined record__auxtrace_init() return value in 'perf
     record' (Thomas Richter)"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel: Don't enable freeze-on-smi for PerfMon V1
  perf stat: Fix duplicate PMU name for interval print
  perf evsel: Only fall back group read for leader
  perf stat: Print out hint for mixed PMU group error
  perf pmu: Fix core PMU alias list for X86 platform
  perf record: Fix s390 undefined record__auxtrace_init() return value
  perf mem: Document incorrect and missing options
  perf evsel: Disable write_backward for leader sampling group events
  perf pmu: Fix pmu events parsing rule
  perf stat: Keep the / modifier separator in fallback
  perf test: Adapt test case record+probe_libc_inet_pton.sh for s390
  perf list: Remove s390 specific strcmp_cpuid_cmp function
  perf machine: Set main kernel end address properly

6 years agoMerge tag 'for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso...
Linus Torvalds [Sun, 29 Apr 2018 03:07:21 +0000 (20:07 -0700)]
Merge tag 'for_linus_stable' of git://git./linux/kernel/git/tytso/ext4

Pull ext4 fixes from Ted Ts'o:
 "Fix misc bugs and a regression for ext4"

* tag 'for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: add MODULE_SOFTDEP to ensure crc32c is included in the initramfs
  ext4: fix bitmap position validation
  ext4: set h_journal if there is a failure starting a reserved handle
  ext4: prevent right-shifting extents beyond EXT_MAX_BLOCKS

6 years ago<linux/stringhash.h>: fix end_name_hash() for 64bit long
Amir Goldstein [Mon, 5 Feb 2018 17:32:18 +0000 (19:32 +0200)]
<linux/stringhash.h>: fix end_name_hash() for 64bit long

The comment claims that this helper will try not to loose bits, but for
64bit long it looses the high bits before hashing 64bit long into 32bit
int.  Use the helper hash_long() to do the right thing for 64bit long.
For 32bit long, there is no change.

All the callers of end_name_hash() either assign the result to
qstr->hash, which is u32 or return the result as an int value (e.g.
full_name_hash()).  Change the helper return type to int to conform to
its users.

[ It took me a while to apply this, because my initial reaction to it
  was - incorrectly - that it could make for slower code.

  After having looked more at it, I take back all my complaints about
  the patch, Amir was right and I was mis-reading things or just being
  stupid.

  I also don't worry too much about the possible performance impact of
  this on 64-bit, since most architectures that actually care about
  performance end up not using this very much (the dcache code is the
  most performance-critical, but the word-at-a-time case uses its own
  hashing anyway).

  So this ends up being mostly used for filesystems that do their own
  degraded hashing (usually because they want a case-insensitive
  comparison function).

  A _tiny_ worry remains, in that not everybody uses DCACHE_WORD_ACCESS,
  and then this potentially makes things more expensive on 64-bit
  architectures with slow or lacking multipliers even for the normal
  case.

  That said, realistically the only such architecture I can think of is
  PA-RISC. Nobody really cares about performance on that, it's more of a
  "look ma, I've got warts^W an odd machine" platform.

  So the patch is fine, and all my initial worries were just misplaced
  from not looking at this properly.   - Linus ]

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoMAINTAINERS: add myself as maintainer of AFFS
David Sterba [Sat, 28 Apr 2018 17:05:04 +0000 (19:05 +0200)]
MAINTAINERS: add myself as maintainer of AFFS

The AFFS filesystem is still in use by m68k community (Link #2), but as
there was no code activity and no maintainer, the filesystem appeared on
the list of candidates for staging/removal (Link #1).

I volunteer to act as a maintainer of AFFS to collect any fixes that
might show up and to guard fs/affs/ against another spring cleaning.

Link: https://lkml.kernel.org/r/20180425154602.GA8546@bombadil.infradead.org
Link: https://lkml.kernel.org/r/1613268.lKBQxPXt8J@merkaba
CC: Martin Steigerwald <martin@lichtvoll.de>
CC: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoMerge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa...
Linus Torvalds [Sat, 28 Apr 2018 17:06:16 +0000 (10:06 -0700)]
Merge branch 'i2c/for-current' of git://git./linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:

 - two driver fixes

 - better parameter check for the core

 - Documentation updates

 - part of a tree-wide HAS_DMA cleanup

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: sprd: Fix the i2c count issue
  i2c: sprd: Prevent i2c accesses after suspend is called
  i2c: dev: prevent ZERO_SIZE_PTR deref in i2cdev_ioctl_rdwr()
  Documentation/i2c: adopt kernel commenting style in examples
  Documentation/i2c: sync docs with current state of i2c-tools
  Documentation/i2c: whitespace cleanup
  i2c: Remove depends on HAS_DMA in case of platform dependency

6 years agoMerge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Linus Torvalds [Sat, 28 Apr 2018 17:02:44 +0000 (10:02 -0700)]
Merge branch 'linus' of git://git./linux/kernel/git/herbert/crypto-2.6

Pull crypto fixes from Herbert Xu:

 - crypto API regression that may cause sporadic alloc failures

 - double-free bug in drbg

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: drbg - set freed buffers to NULL
  crypto: api - fix finding algorithm currently being tested

6 years agoMerge tag '4.17-rc2-smb3' of git://git.samba.org/sfrench/cifs-2.6
Linus Torvalds [Sat, 28 Apr 2018 16:51:56 +0000 (09:51 -0700)]
Merge tag '4.17-rc2-smb3' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs fixes from Steve French:
 "A few security related fixes for SMB3, most importantly for SMB3.11
  encryption"

* tag '4.17-rc2-smb3' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: smbd: Avoid allocating iov on the stack
  cifs: smbd: Don't use RDMA read/write when signing is used
  SMB311: Fix reconnect
  SMB3: Fix 3.11 encryption to Windows and handle encrypted smb3 tcon
  CIFS: set *resp_buf_type to NO_BUFFER on error