platform/kernel/linux-starfive.git
5 years ago powerpc/8xx: Add microcode patch to move SMC parameter RAM.
Christophe Leroy [Fri, 14 Jun 2019 06:41:47 +0000 (06:41 +0000)]
powerpc/8xx: Add microcode patch to move SMC parameter RAM.

Some SCC functions like the QMC require an extended parameter RAM.
On modern 8xx (i.e. 866 and 885), the SPI area can already be relocated,
allowing the use of those functions on SCC2. But the SCC3 and SCC4
parameter RAMs collide with the SMC1 and SMC2 parameter RAMs.

This patch adds microcode to allow the relocation of both SMC1 and
SMC2, and relocates them to offsets 0x1ec0 and 0x1fc0.
By default those offsets are used by the CPM1 DSP1 and DSP2, but there
is no kernel driver using them at the moment so this area can be
reused.

This microcode is provided by Freescale/NXP in Engineering Bulletin
EB662 ("MPC8xx I2C/SPI and SMC Relocation Microcode Packages")
dated 2006. The binary code is public. The source is not available.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc/8xx: Use IO accessors in microcode programming.
Christophe Leroy [Fri, 14 Jun 2019 06:41:46 +0000 (06:41 +0000)]
powerpc/8xx: Use IO accessors in microcode programming.

Change microcode functions to use IO accessors and get rid
of volatile attributes.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc/8xx: replace #ifdefs by IS_ENABLED() in microcode.c
Christophe Leroy [Fri, 14 Jun 2019 06:41:45 +0000 (06:41 +0000)]
powerpc/8xx: replace #ifdefs by IS_ENABLED() in microcode.c

Reduce #ifdef mess by using IS_ENABLED() instead.
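
As a rough illustration of why IS_ENABLED() reads better than #ifdef, the
sketch below re-creates the preprocessor trick in plain C so that option
checks become ordinary if() conditions. DEMO_IS_ENABLED and the two
CONFIG_DEMO_* options are hypothetical names; they are not the macro or the
options used in microcode.c.

```c
/* Hypothetical options: A is "enabled" (defined to 1), B is left undefined. */
#define CONFIG_DEMO_PATCH_A 1

/*
 * Simplified re-creation of the trick behind IS_ENABLED(): an option
 * defined to 1 turns the placeholder into "0," which shifts a literal 1
 * into the second argument position; an undefined option leaves 0 there.
 */
#define DEMO_PLACEHOLDER_1 0,
#define DEMO_SECOND_ARG(ignored, val, ...) val
#define DEMO_IS_DEFINED3(arg1_or_junk) DEMO_SECOND_ARG(arg1_or_junk 1, 0, 0)
#define DEMO_IS_DEFINED2(val) DEMO_IS_DEFINED3(DEMO_PLACEHOLDER_##val)
#define DEMO_IS_ENABLED(option) DEMO_IS_DEFINED2(option)

/* Both branches are always parsed and type-checked by the compiler. */
static const char *demo_patch_name(void)
{
	if (DEMO_IS_ENABLED(CONFIG_DEMO_PATCH_A))
		return "patch A";
	if (DEMO_IS_ENABLED(CONFIG_DEMO_PATCH_B))
		return "patch B";
	return "none";
}
```

Unlike an #ifdef block, the disabled branch is still compiled before being
discarded as dead code, so it cannot silently rot.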

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc/8xx: refactor programming of microcode CPM params.
Christophe Leroy [Fri, 14 Jun 2019 06:41:44 +0000 (06:41 +0000)]
powerpc/8xx: refactor programming of microcode CPM params.

The CPM registers RCCR and CPMCR1..4 have to be set in
accordance with the microcode patch being programmed. Let's
define them as part of the patch set and refactor their
programming from that definition.
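
A minimal sketch of the idea, assuming a descriptor that carries the register
values alongside the patch. All names, the 16-bit register widths and the
values are made up for illustration, and plain memory stands in for the
memory-mapped registers that real code would reach through IO accessors.

```c
#include <stdint.h>

/* Hypothetical patch descriptor: register values travel with the patch. */
struct demo_ucode_patch {
	const char *name;
	uint16_t rccr;
	uint16_t cpmcr[4];	/* CPMCR1..4 */
};

/* Stand-in for the CPM register block (plain memory in this sketch). */
struct demo_cpm_regs {
	uint16_t rccr;
	uint16_t cpmcr[4];
};

/* One routine programs the registers from whichever patch is selected. */
static void demo_program_params(struct demo_cpm_regs *regs,
				const struct demo_ucode_patch *patch)
{
	for (int i = 0; i < 4; i++)
		regs->cpmcr[i] = patch->cpmcr[i];
	regs->rccr = patch->rccr;
}

/* Made-up values, for illustration only. */
static const struct demo_ucode_patch demo_patch = {
	.name = "demo",
	.rccr = 0x0001,
	.cpmcr = { 0x1111, 0x2222, 0x3333, 0x4444 },
};
```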

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc/8xx: refactor printing of microcode patch name.
Christophe Leroy [Fri, 14 Jun 2019 06:41:43 +0000 (06:41 +0000)]
powerpc/8xx: refactor printing of microcode patch name.

Define the patch name together with the patch code, and refactor
the associated printk(), replacing it with pr_info().

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc/8xx: Refactor microcode write
Christophe Leroy [Fri, 14 Jun 2019 06:41:42 +0000 (06:41 +0000)]
powerpc/8xx: Refactor microcode write

Add empty microcode tables so that all tables are defined
all the time. Regroup the writing of the 3 tables regardless
of the selected microcode.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc/8xx: refactor writing of CPM microcode arrays
Christophe Leroy [Fri, 14 Jun 2019 06:41:41 +0000 (06:41 +0000)]
powerpc/8xx: refactor writing of CPM microcode arrays

Create a function to refactor the writing of CPM microcode arrays.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc/8xx: compact microcode arrays
Christophe Leroy [Fri, 14 Jun 2019 06:41:40 +0000 (06:41 +0000)]
powerpc/8xx: compact microcode arrays

Compact the obscure microcode arrays by putting 4 values per line.
This reduces the number of lines in the file and improves
readability.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc/8xx: drop verify_patch()
Christophe Leroy [Fri, 14 Jun 2019 06:41:39 +0000 (06:41 +0000)]
powerpc/8xx: drop verify_patch()

verify_patch() has been opted out for many years, and
the comment suggests it doesn't work. So drop it.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc/8xx: move CPM1 related files from sysdev/ to platforms/8xx
Christophe Leroy [Fri, 14 Jun 2019 06:41:38 +0000 (06:41 +0000)]
powerpc/8xx: move CPM1 related files from sysdev/ to platforms/8xx

Only 8xx selects CPM1, and the related CONFIG options are already
in platforms/8xx/Kconfig.

Move the related C files to platforms/8xx/.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
[mpe: Minor formatting fixes]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc/64: reuse PPC32 static inline flush_dcache_range()
Christophe Leroy [Tue, 14 May 2019 09:05:16 +0000 (09:05 +0000)]
powerpc/64: reuse PPC32 static inline flush_dcache_range()

This patch drops the assembly PPC64 version of flush_dcache_range()
and re-uses the PPC32 static inline version.

With GCC 8.1, the following code is generated:

void flush_test(unsigned long start, unsigned long stop)
{
	flush_dcache_range(start, stop);
}

0000000000000130 <.flush_test>:
 130: 3d 22 00 00  addis   r9,r2,0
        132: R_PPC64_TOC16_HA .data+0x8
 134: 81 09 00 00  lwz     r8,0(r9)
        136: R_PPC64_TOC16_LO .data+0x8
 138: 3d 22 00 00  addis   r9,r2,0
        13a: R_PPC64_TOC16_HA .data+0xc
 13c: 80 e9 00 00  lwz     r7,0(r9)
        13e: R_PPC64_TOC16_LO .data+0xc
 140: 7d 48 00 d0  neg     r10,r8
 144: 7d 43 18 38  and     r3,r10,r3
 148: 7c 00 04 ac  hwsync
 14c: 4c 00 01 2c  isync
 150: 39 28 ff ff  addi    r9,r8,-1
 154: 7c 89 22 14  add     r4,r9,r4
 158: 7c 83 20 50  subf    r4,r3,r4
 15c: 7c 89 3c 37  srd.    r9,r4,r7
 160: 41 82 00 1c  beq     17c <.flush_test+0x4c>
 164: 7d 29 03 a6  mtctr   r9
 168: 60 00 00 00  nop
 16c: 60 00 00 00  nop
 170: 7c 00 18 ac  dcbf    0,r3
 174: 7c 63 42 14  add     r3,r3,r8
 178: 42 00 ff f8  bdnz    170 <.flush_test+0x40>
 17c: 7c 00 04 ac  hwsync
 180: 4c 00 01 2c  isync
 184: 4e 80 00 20  blr
 188: 60 00 00 00  nop
 18c: 60 00 00 00  nop
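
The address arithmetic visible in the listing (align the start address down
to a cache block, then derive the loop count for the dcbf loop) can be
sketched in portable C as follows. The function and parameter names are
hypothetical; only the arithmetic mirrors the generated code.

```c
#include <stdint.h>

/*
 * Number of cache blocks that the dcbf loop walks for [start, stop).
 * block_bytes must be a power of two and block_shift its log2; in the
 * listing these are the two values loaded from .data+0x8 and .data+0xc.
 */
static uint64_t demo_blocks_to_flush(uint64_t start, uint64_t stop,
				     uint64_t block_bytes,
				     unsigned int block_shift)
{
	/* neg + and: round start down to a block boundary */
	uint64_t addr = start & ~(block_bytes - 1);
	/* addi + add + subf: bytes to cover, rounded up to a whole block */
	uint64_t size = stop - addr + (block_bytes - 1);

	/* srd.: convert bytes to a block count (0 skips the loop) */
	return size >> block_shift;
}
```

With 128-byte blocks, for example, a range that starts one byte into a block
and is 128 bytes long straddles two blocks, so the count is 2.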

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc/32: define helpers to get L1 cache sizes.
Christophe Leroy [Tue, 14 May 2019 09:05:15 +0000 (09:05 +0000)]
powerpc/32: define helpers to get L1 cache sizes.

This patch defines C helpers to retrieve the size of
cache blocks and uses them in the cacheflush functions.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc/64: flush_inval_dcache_range() becomes flush_dcache_range()
Christophe Leroy [Tue, 14 May 2019 09:05:13 +0000 (09:05 +0000)]
powerpc/64: flush_inval_dcache_range() becomes flush_dcache_range()

On most arches that have a flush_dcache_range() function, including PPC32,
this function does a writeback and invalidation of the cache block.

On PPC64, flush_dcache_range() only does a writeback while
flush_inval_dcache_range() does the invalidation in addition.

In addition, it looks like within arch/powerpc/ there are no PPC64
platforms using flush_dcache_range().

This patch drops the existing 64-bit version of flush_dcache_range()
and renames flush_inval_dcache_range() to flush_dcache_range().

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc: slightly improve cache helpers
Christophe Leroy [Fri, 10 May 2019 09:24:48 +0000 (09:24 +0000)]
powerpc: slightly improve cache helpers

Cache instructions (dcbz, dcbi, dcbf and dcbst) take two registers
that are summed to obtain the target address. Using 'Z' constraint
and '%y0' argument gives GCC the opportunity to use both registers
instead of only one with the second being forced to 0.

Suggested-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc/mm/hugetlb: Don't enable HugeTLB if we don't have a page table cache
Aneesh Kumar K.V [Tue, 28 May 2019 05:36:26 +0000 (11:06 +0530)]
powerpc/mm/hugetlb: Don't enable HugeTLB if we don't have a page table cache

This makes sure we don't enable HugeTLB if the cache is not configured.
I am still not sure about this. IMHO hugetlb support should be a hardware
support derivative and any cache allocation failure should be handled as I did
in the earlier patch. But then if we were not able to create hugetlb page table
cache, we can as well declare hugetlb support disabled thereby avoiding calling
into allocation routines.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc/mm/hugetlb: Fix kernel crash if we fail to allocate page table caches
Aneesh Kumar K.V [Tue, 28 May 2019 05:36:25 +0000 (11:06 +0530)]
powerpc/mm/hugetlb: Fix kernel crash if we fail to allocate page table caches

We only check for hugetlb allocations, because with hugetlb we do conditional
registration. For PGD/PUD/PMD levels we register them always in
pgtable_cache_init.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc/mm: Handle page table allocation failures
Aneesh Kumar K.V [Tue, 28 May 2019 05:36:24 +0000 (11:06 +0530)]
powerpc/mm: Handle page table allocation failures

This fixes a kernel crash caused by not handling page table allocation
failures while allocating a hugetlb page table.

Fixes: e2b3d202d1db ("powerpc: Switch 16GB and 16MB explicit hugepages to a different page table format")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc/mm: Remove radix dependency on HugeTLB page
Aneesh Kumar K.V [Tue, 14 May 2019 06:03:02 +0000 (11:33 +0530)]
powerpc/mm: Remove radix dependency on HugeTLB page

Now that we have switched the page table walk to use pmd_is_leaf, we can
revert commit 8adddf349fda ("powerpc/mm/radix: Make Radix require HUGETLB_PAGE").

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc/mm: pmd_devmap implies pmd_large().
Aneesh Kumar K.V [Tue, 14 May 2019 06:03:01 +0000 (11:33 +0530)]
powerpc/mm: pmd_devmap implies pmd_large().

Large devmap usage is dependent on THP. Hence one check is sufficient.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc/book3s: Use config independent helpers for page table walk
Aneesh Kumar K.V [Tue, 14 May 2019 06:03:00 +0000 (11:33 +0530)]
powerpc/book3s: Use config independent helpers for page table walk

Even when we have HugeTLB and THP disabled, the kernel linear map can still be
mapped with hugepages. This is only an issue with radix translation because the
hash MMU doesn't map the kernel linear range in the Linux page table and other
kernel map areas are not mapped using hugepages.

Add config independent helpers and put WARN_ON() when we don't expect things
to be mapped via hugepages.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc/pseries/scm: Use a specific endian format for storing uuid from the device...
Aneesh Kumar K.V [Fri, 7 Jun 2019 06:47:05 +0000 (12:17 +0530)]
powerpc/pseries/scm: Use a specific endian format for storing uuid from the device tree

We used uuid_parse to convert uuid string from device tree to two u64
components. We want to make sure we look at the uuid read from device
tree in an endian-neutral fashion. For now, I am picking little-endian
to be the format so that we don't end up doing an additional conversion.

The reason to store in a specific endian format is to enable reading
the namespace created with a little-endian kernel config on a
big-endian kernel. We do store the device tree uuid string as a 64-bit
little-endian cookie in the label area. When booting the kernel we
also compare this cookie against what is read from the device tree.
For this to work, we have to store and compare these values in a way
that is independent of the CPU endianness.
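
A minimal sketch of endian-neutral storage, assuming a 64-bit cookie that is
always laid out in little-endian byte order. The helper names are
hypothetical; kernel code would normally reach for cpu_to_le64() and
le64_to_cpu() rather than open-coding the shifts.

```c
#include <stdint.h>

/* Store a 64-bit value least significant byte first, whatever the CPU is. */
static void demo_put_le64(uint8_t out[8], uint64_t v)
{
	for (int i = 0; i < 8; i++)
		out[i] = (uint8_t)(v >> (8 * i));
}

/* Read the value back; the result is the same on LE and BE hosts. */
static uint64_t demo_get_le64(const uint8_t in[8])
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v |= (uint64_t)in[i] << (8 * i);
	return v;
}
```

Because the byte layout is fixed, a cookie written by a little-endian kernel
compares equal when read back by a big-endian one.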

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc/nvdimm: Add support for multibyte read/write for metadata
Aneesh Kumar K.V [Fri, 7 Jun 2019 06:45:11 +0000 (12:15 +0530)]
powerpc/nvdimm: Add support for multibyte read/write for metadata

The SCM_READ/WRITE_METADATA hcall supports multibyte read/write. This patch
updates the metadata read/write to use 1, 2, 4 or 8 byte read/write as
mentioned in the PAPR document.

The READ/WRITE_METADATA hcall supports 1, 2, 4, or 8 byte read/write.
For other values the hcall returns H_P3.

The hypervisor stores the metadata contents in big-endian format and, in order
to enable read/write in different granularity, we need to switch the contents
to big-endian before calling the hcall.
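
One way to picture the multibyte access is the sketch below, which picks the
largest permitted access size for the remaining length and packs the bytes in
big-endian order. The helper names are hypothetical and no hcall is made;
this only illustrates the chunking and byte-order arithmetic, not the
driver's actual code.

```c
#include <stddef.h>
#include <stdint.h>

/* Largest permitted access size (8, 4, 2 or 1) that fits in the remainder. */
static size_t demo_chunk_size(size_t remaining)
{
	if (remaining >= 8)
		return 8;
	if (remaining >= 4)
		return 4;
	if (remaining >= 2)
		return 2;
	return remaining ? 1 : 0;
}

/* Pack `size` bytes into an integer, most significant byte first. */
static uint64_t demo_pack_be(const uint8_t *buf, size_t size)
{
	uint64_t v = 0;

	for (size_t i = 0; i < size; i++)
		v = (v << 8) | buf[i];
	return v;
}

/* Number of accesses needed to cover `len` bytes with this strategy. */
static size_t demo_count_accesses(size_t len)
{
	size_t n = 0;

	while (len) {
		len -= demo_chunk_size(len);
		n++;
	}
	return n;
}
```

A 15-byte transfer, for instance, splits into accesses of 8, 4, 2 and 1
bytes instead of 15 single-byte calls.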

Based on a patch from Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc/pseries/scm: Mark the region volatile if cache flush not required
Aneesh Kumar K.V [Fri, 7 Jun 2019 06:44:07 +0000 (12:14 +0530)]
powerpc/pseries/scm: Mark the region volatile if cache flush not required

The device tree node is documented as below:

  “ibm,cache-flush-required”:
  property name indicates Cache Flush Required for this Persistent Memory Segment to persist memory
  prop-encoded-array: None, this is a name only property.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc/mm/nvdimm: Add an informative message if we fail to allocate altmap block
Aneesh Kumar K.V [Mon, 1 Jul 2019 14:33:38 +0000 (20:03 +0530)]
powerpc/mm/nvdimm: Add an informative message if we fail to allocate altmap block

Allocation from the altmap area can fail based on the vmemmap page size used.
Add a kernel info message to indicate the failure. That allows the user
to identify whether they are really using persistent memory reserved
space for per-page metadata.

The message looks like:
  [  136.587212] altmap block allocation failed, falling back to system memory

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc/mm: Consolidate numa_enable check and min_common_depth check
Aneesh Kumar K.V [Mon, 1 Jul 2019 14:36:26 +0000 (20:06 +0530)]
powerpc/mm: Consolidate numa_enable check and min_common_depth check

If we fail to parse min_common_depth from the device tree we boot with
NUMA disabled. Reflect the same by updating the numa_enabled variable
to false. Also, switch all min_common_depth failure checks to
if (!numa_enabled) checks.

This helps us to avoid checking for both in different code paths.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc/mm: Fix node look up with numa=off boot
Aneesh Kumar K.V [Mon, 1 Jul 2019 14:36:25 +0000 (20:06 +0530)]
powerpc/mm: Fix node look up with numa=off boot

If we boot with numa=off, we need to make sure we return NUMA_NO_NODE when
looking up associativity details of resources. Without this, we hit a crash
like the one below:

BUG: Unable to handle kernel data access at 0x40000000008
Faulting instruction address: 0xc000000008f31704
cpu 0x1b: Vector: 380 (Data SLB Access) at [c00000000b9bb320]
    pc: c000000008f31704: _raw_spin_lock+0x14/0x100
    lr: c0000000083f41fc: ____cache_alloc_node+0x5c/0x290
    sp: c00000000b9bb5b0
   msr: 800000010280b033
   dar: 40000000008
  current = 0xc00000000b9a2700
  paca    = 0xc00000000a740c00   irqmask: 0x03   irq_happened: 0x01
    pid   = 1, comm = swapper/27
Linux version 5.2.0-rc4-00925-g74e188c620b1 (root@linux-d8ip) (gcc version 7.4.1 20190424 [gcc-7-branch revision 270538] (SUSE Linux)) #34 SMP Sat Jun 29 00:41:02 EDT 2019
enter ? for help
[link register   ] c0000000083f41fc ____cache_alloc_node+0x5c/0x290
[c00000000b9bb5b0] 0000000000000dc0 (unreliable)
[c00000000b9bb5f0] c0000000083f48c8 kmem_cache_alloc_node_trace+0x138/0x360
[c00000000b9bb670] c000000008aa789c devres_alloc_node+0x4c/0xa0
[c00000000b9bb6a0] c000000008337218 devm_memremap+0x58/0x130
[c00000000b9bb6f0] c000000008aed00c devm_nsio_enable+0xdc/0x170
[c00000000b9bb780] c000000008af3b6c nd_pmem_probe+0x4c/0x180
[c00000000b9bb7b0] c000000008ad84cc nvdimm_bus_probe+0xac/0x260
[c00000000b9bb840] c000000008aa0628 really_probe+0x148/0x500
[c00000000b9bb8d0] c000000008aa0d7c driver_probe_device+0x19c/0x1d0
[c00000000b9bb950] c000000008aa11bc device_driver_attach+0xcc/0x100
[c00000000b9bb990] c000000008aa12ec __driver_attach+0xfc/0x1e0
[c00000000b9bba10] c000000008a9d0a4 bus_for_each_dev+0xb4/0x130
[c00000000b9bba70] c000000008a9fc04 driver_attach+0x34/0x50
[c00000000b9bba90] c000000008a9f118 bus_add_driver+0x1d8/0x300
[c00000000b9bbb20] c000000008aa2358 driver_register+0x98/0x1a0
[c00000000b9bbb90] c000000008ad7e6c __nd_driver_register+0x5c/0x100
[c00000000b9bbbf0] c0000000093efbac nd_pmem_driver_init+0x34/0x48
[c00000000b9bbc10] c0000000080106c0 do_one_initcall+0x60/0x2d0
[c00000000b9bbce0] c00000000938463c kernel_init_freeable+0x384/0x48c
[c00000000b9bbdb0] c000000008010a5c kernel_init+0x2c/0x160
[c00000000b9bbe20] c00000000800ba54 ret_from_kernel_thread+0x5c/0x68

Reported-and-debugged-by: Vaibhav Jain <vaibhav@linux.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc/mm/drconf: Use NUMA_NO_NODE on failures instead of node 0
Aneesh Kumar K.V [Mon, 1 Jul 2019 14:36:24 +0000 (20:06 +0530)]
powerpc/mm/drconf: Use NUMA_NO_NODE on failures instead of node 0

If we fail to parse the associativity array we should default to
NUMA_NO_NODE instead of node 0. The rest of the code falls back to the
right default if it finds the NUMA node value NUMA_NO_NODE.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc/mm/radix: Use the right page size for vmemmap mapping
Aneesh Kumar K.V [Mon, 1 Jul 2019 14:34:42 +0000 (20:04 +0530)]
powerpc/mm/radix: Use the right page size for vmemmap mapping

We use mmu_vmemmap_psize to find the page size for mapping the vmemmap area.
With radix translation, we are suboptimally setting this value to PAGE_SIZE.

We do check for 2M page size support and update mmu_vmemmap_psize to use
the hugepage size, but we suboptimally reset the value to PAGE_SIZE in
radix__early_init_mmu(). This resulted in always mapping the vmemmap area with
a 64K page size.

Fixes: 2bfd65e45e87 ("powerpc/mm/radix: Add radix callbacks for early init routines")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc/mm/hash/4k: Don't use 64K page size for vmemmap with 4K pagesize
Aneesh Kumar K.V [Mon, 1 Jul 2019 14:34:41 +0000 (20:04 +0530)]
powerpc/mm/hash/4k: Don't use 64K page size for vmemmap with 4K pagesize

With hash translation and 4K PAGE_SIZE config, we need to make sure we don't
use 64K page size for vmemmap.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc/mm: Remove unused variable declaration
Aneesh Kumar K.V [Mon, 1 Jul 2019 14:37:00 +0000 (20:07 +0530)]
powerpc/mm: Remove unused variable declaration

Since commit 0034d395f89d ("powerpc/mm/hash64: Map all the kernel
regions in the same 0xc range") __kernel_virt_size is not used
anymore.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc/pseries: Add documentation for vcpudispatch_stats
Naveen N. Rao [Wed, 3 Jul 2019 17:04:02 +0000 (22:34 +0530)]
powerpc/pseries: Add documentation for vcpudispatch_stats

Add a document describing the fields provided by
/proc/powerpc/vcpudispatch_stats.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc/pseries: Protect against hogging the cpu while setting up the stats
Naveen N. Rao [Wed, 3 Jul 2019 17:04:01 +0000 (22:34 +0530)]
powerpc/pseries: Protect against hogging the cpu while setting up the stats

When enabling or disabling the vcpu dispatch statistics, we do a lot of
work including allocating/deallocating memory across all possible cpus
for the DTL buffer. In order to guard against hogging the cpu for too
long, track the time we're taking and yield the processor if necessary.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc/pseries: Provide vcpu dispatch statistics
Naveen N. Rao [Wed, 3 Jul 2019 17:04:00 +0000 (22:34 +0530)]
powerpc/pseries: Provide vcpu dispatch statistics

For Shared Processor LPARs, the POWER Hypervisor maintains a
relatively static mapping of the LPAR processors (vcpus) to physical
processor chips (representing the "home" node) and tries to always
dispatch vcpus on their associated physical processor chip. However,
under certain scenarios, vcpus may be dispatched on a different
processor chip (away from its home node). The actual physical
processor number on which a certain vcpu is dispatched is available to
the guest in the 'processor_id' field of each DTL entry.

The guest can discover the home node of each vcpu through the
H_HOME_NODE_ASSOCIATIVITY(flags=1) hcall. The guest can also discover
the associativity of physical processors, as represented in the DTL
entry, through the H_HOME_NODE_ASSOCIATIVITY(flags=2) hcall.

These can then be compared to determine if the vcpu was dispatched on
its home node or not. If the vcpu was not dispatched on the home node,
it is possible to determine if the vcpu was dispatched in a different
chip, socket or drawer.

Introduce a procfs file /proc/powerpc/vcpudispatch_stats that can be
used to obtain these statistics. Writing '1' to this file enables
collecting the statistics, while writing '0' disables the statistics.
The statistics themselves are available by reading the procfs file. By
default, the DTLB log for each vcpu is processed 50 times a second so
as not to miss any entries. This processing frequency can be changed
through /proc/powerpc/vcpudispatch_stats_freq.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc/pseries: Move mm/book3s64/vphn.c under platforms/pseries/
Naveen N. Rao [Wed, 3 Jul 2019 17:03:59 +0000 (22:33 +0530)]
powerpc/pseries: Move mm/book3s64/vphn.c under platforms/pseries/

hcall_vphn() is specific to pseries and will be used in a subsequent
patch. So, move it to a more appropriate place under
arch/powerpc/platforms/pseries. Also merge vphn.h into lppaca.h
and update vphn selftest to use the new files.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc/pseries: Generalize hcall_vphn()
Naveen N. Rao [Wed, 3 Jul 2019 17:03:58 +0000 (22:33 +0530)]
powerpc/pseries: Generalize hcall_vphn()

H_HOME_NODE_ASSOCIATIVITY hcall can take two different flags and return
different associativity information in each case. Generalize the
existing hcall_vphn() function to take flags as an argument and to
return the result. Update the only existing user to pass the proper
arguments.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc/pseries: Introduce rwlock to gatekeep DTLB usage
Naveen N. Rao [Wed, 3 Jul 2019 17:03:57 +0000 (22:33 +0530)]
powerpc/pseries: Introduce rwlock to gatekeep DTLB usage

Since we would be introducing a new user of the DTL buffer in a
subsequent patch, we need a way to gatekeep use of the DTL buffer.

The current debugfs interface for DTL allows registering and opening
cpu-specific DTL buffers. Cpu specific files are exposed under
debugfs 'powerpc/dtl/' node, and changing 'dtl_event_mask' in the same
directory enables controlling the event mask used when registering DTL
buffer for a particular cpu.

Subsequently, we will be introducing a user of the DTL buffers that
registers access to the DTL buffers across all cpus with the same event
mask. To ensure these two users do not step on each other, we introduce
a rwlock to gatekeep DTL buffer access. This fits the requirement of the
current debugfs interface wanting to allow multiple independent
cpu-specific users (read lock), and the subsequent user wanting
exclusive access (write lock).

Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc/pseries: Factor out DTL buffer allocation and registration routines
Naveen N. Rao [Wed, 3 Jul 2019 17:03:56 +0000 (22:33 +0530)]
powerpc/pseries: Factor out DTL buffer allocation and registration routines

Introduce new helpers for DTL buffer allocation and registration and
have the existing code use those.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
[mpe: Don't split error messages across lines, for grepability]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc/pseries: Do not save the previous DTL mask value
Naveen N. Rao [Wed, 3 Jul 2019 17:03:55 +0000 (22:33 +0530)]
powerpc/pseries: Do not save the previous DTL mask value

When CONFIG_VIRT_CPU_ACCOUNTING_NATIVE is enabled, we always initialize
DTL enable mask to DTL_LOG_PREEMPT (0x2). There are no other places
where the mask is changed. As such, when reading the DTL log buffer
through debugfs, there is no need to save and restore the previous mask
value.

We don't need to save and restore the earlier mask value if
CONFIG_VIRT_CPU_ACCOUNTING_NATIVE is not enabled. So, remove the field
from the structure as well.

Acked-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc/pseries: Use macros for referring to the DTL enable mask
Naveen N. Rao [Wed, 3 Jul 2019 17:03:54 +0000 (22:33 +0530)]
powerpc/pseries: Use macros for referring to the DTL enable mask

Introduce macros to encode the DTL enable mask fields and use those
instead of hardcoding numbers.
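
The effect of naming the mask bits can be sketched as below. Only the
preempt bit value (0x2) is quoted elsewhere in this log; the DEMO_ prefix,
the other bit values and the helper are assumptions made for illustration.

```c
/* 0x2 for the preempt bit comes from the log; the rest is assumed. */
#define DEMO_DTL_LOG_CEDE	0x1
#define DEMO_DTL_LOG_PREEMPT	0x2
#define DEMO_DTL_LOG_FAULT	0x4
#define DEMO_DTL_LOG_ALL	(DEMO_DTL_LOG_CEDE | DEMO_DTL_LOG_PREEMPT | \
				 DEMO_DTL_LOG_FAULT)

/* A named test reads better than "mask & 2" at the call site. */
static int demo_logs_preempt(unsigned int mask)
{
	return (mask & DEMO_DTL_LOG_PREEMPT) != 0;
}
```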

Acked-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc: Enable CONFIG_IPV6 in ppc64_defconfig
Satheesh Rajendran [Tue, 2 Jul 2019 15:47:45 +0000 (21:17 +0530)]
powerpc: Enable CONFIG_IPV6 in ppc64_defconfig

Enable CONFIG_IPV6 in ppc64_defconfig to enable
certain network functionalities required for tests.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc/cell: set no_llseek in spufs_cntl_fops
Geliang Tang [Sat, 6 May 2017 15:37:20 +0000 (23:37 +0800)]
powerpc/cell: set no_llseek in spufs_cntl_fops

In spufs_cntl_fops, since we use nonseekable_open() to open, we
should use no_llseek() to seek, not generic_file_llseek().

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc/perf/24x7: use rb_entry
Geliang Tang [Tue, 20 Dec 2016 14:02:17 +0000 (22:02 +0800)]
powerpc/perf/24x7: use rb_entry

To make the code clearer, use rb_entry() instead of container_of() to
deal with rbtree.
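
In the kernel, rb_entry() is a thin wrapper around container_of(), so the
change is about stating intent rather than behaviour. The user-space sketch
below shows the relationship; every demo_ name is hypothetical and the node
type is a stand-in, not the kernel's struct rb_node.

```c
#include <stddef.h>

/* Stand-in for a tree node embedded in a larger structure. */
struct demo_rb_node {
	struct demo_rb_node *left, *right;
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Same operation, but the name says "this is a tree node". */
#define demo_rb_entry(ptr, type, member) \
	demo_container_of(ptr, type, member)

struct demo_event {
	int id;
	struct demo_rb_node node;
};

static struct demo_event *demo_event_from_node(struct demo_rb_node *n)
{
	return demo_rb_entry(n, struct demo_event, node);
}
```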

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc/configs: Disable latencytop
Anton Blanchard [Tue, 4 Jun 2019 05:42:57 +0000 (15:42 +1000)]
powerpc/configs: Disable latencytop

latencytop adds almost 4kB to each and every task struct and as such
it doesn't deserve to be in our defconfigs.

Signed-off-by: Anton Blanchard <anton@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc/Kconfig: Clean up formatting
Enrico Weigelt, metux IT consult [Wed, 3 Jul 2019 16:04:13 +0000 (18:04 +0200)]
powerpc/Kconfig: Clean up formatting

Formatting of Kconfig files doesn't look so pretty, so let the
Great White Handkerchief come around and clean it up.

Also convert "---help---" as requested.

Signed-off-by: Enrico Weigelt, metux IT consult <info@metux.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc/mm: mark more tlb functions as __always_inline
Masahiro Yamada [Tue, 21 May 2019 13:13:24 +0000 (22:13 +0900)]
powerpc/mm: mark more tlb functions as __always_inline

With CONFIG_OPTIMIZE_INLINING enabled, Laura Abbott reported an error
with gcc 9.1.1:

  arch/powerpc/mm/book3s64/radix_tlb.c: In function '_tlbiel_pid':
  arch/powerpc/mm/book3s64/radix_tlb.c:104:2: warning: asm operand 3 probably doesn't match constraints
    104 |  asm volatile(PPC_TLBIEL(%0, %4, %3, %2, %1)
        |  ^~~
  arch/powerpc/mm/book3s64/radix_tlb.c:104:2: error: impossible constraint in 'asm'

Fixing _tlbiel_pid() is enough to address the warning above, but I
inlined more functions to fix all potential issues.

To meet the "i" (immediate) constraint for the asm operands, functions
propagating "ric" must be always inlined.

Fixes: 9012d011660e ("compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING")
Reported-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc: Use the correct style for SPDX License Identifier
Nishad Kamdar [Tue, 16 Apr 2019 15:28:57 +0000 (20:58 +0530)]
powerpc: Use the correct style for SPDX License Identifier

This patch corrects the SPDX License Identifier style
in the powerpc Hardware Architecture related files.

Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Nishad Kamdar <nishadkamdar@gmail.com>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc/powernv-eeh: Consisely desribe what this file does
Stewart Smith [Tue, 28 May 2019 03:29:25 +0000 (13:29 +1000)]
powerpc/powernv-eeh: Consisely desribe what this file does

If the previous comment made sense, continue debugging or call your
doctor immediately.

Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc/configs: Remove useless UEVENT_HELPER_PATH
Krzysztof Kozlowski [Tue, 4 Jun 2019 08:00:33 +0000 (10:00 +0200)]
powerpc/configs: Remove useless UEVENT_HELPER_PATH

Remove the CONFIG_UEVENT_HELPER_PATH because:
1. It is disabled since commit 1be01d4a5714 ("driver: base: Disable
   CONFIG_UEVENT_HELPER by default") as its dependency (UEVENT_HELPER) was
   made default to 'n',
2. It is not recommended (help message: "This should not be used today
   [...] creates a high system load") and was kept only for ancient
   userland,
3. Certain userland specifically requests it to be disabled (systemd
   README: "Legacy hotplug slows down the system and confuses udev").

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years ago powerpc/4xx/uic: clear pending interrupt after irq type/pol change
Christian Lamparter [Sat, 15 Jun 2019 15:23:13 +0000 (17:23 +0200)]
powerpc/4xx/uic: clear pending interrupt after irq type/pol change

When testing out gpio-keys with a button, a spurious
interrupt (and therefore a key press or release event)
gets triggered as soon as the driver enables the irq
line for the first time.

This patch clears any potential bogus generated interrupt
that was caused by the switching of the associated irq's
type and polarity.

Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agoselftests/powerpc: Add missing newline at end of file
Geert Uytterhoeven [Mon, 17 Jun 2019 14:52:04 +0000 (16:52 +0200)]
selftests/powerpc: Add missing newline at end of file

"git diff" says:

    \ No newline at end of file

after modifying the file.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
[mpe: Rebase since addition of another test]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc: Add barrier_nospec to raw_copy_in_user()
Suraj Jitindar Singh [Wed, 6 Mar 2019 01:10:38 +0000 (12:10 +1100)]
powerpc: Add barrier_nospec to raw_copy_in_user()

Commit ddf35cf3764b ("powerpc: Use barrier_nospec in copy_from_user()")
added barrier_nospec before loading from user-controlled pointers. The
intention was to order the load from the potentially user-controlled
pointer vs a previous branch based on an access_ok() check or similar.

In order to achieve the same result, add a barrier_nospec to the
raw_copy_in_user() function before loading from such a user-controlled
pointer.

Fixes: ddf35cf3764b ("powerpc: Use barrier_nospec in copy_from_user()")
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agoKVM: PPC: Book3S HV: Fix CR0 setting in TM emulation
Michael Neuling [Thu, 20 Jun 2019 06:00:40 +0000 (16:00 +1000)]
KVM: PPC: Book3S HV: Fix CR0 setting in TM emulation

When emulating tsr, treclaim and trechkpt, we incorrectly set CR0. The
code currently sets:
    CR0 <- 00 || MSR[TS]
but according to the ISA it should be:
    CR0 <-  0 || MSR[TS] || 0

This fixes the bit shift to put the bits in the correct location.

This is a data integrity issue as CR0 is corrupted.
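
The shift change can be sketched in plain C as follows. This is a minimal illustration, not the kernel code: the function name cr0_from_msr_ts() is hypothetical, and the constants (MSR[TS] occupying bits 33-34 counted from the least significant bit, CR0 being the top four bits of the 32-bit CR) are assumptions stated here for the example.

```c
#include <stdint.h>

/* Assumed layout: MSR[TS] is a 2-bit field starting at bit 33 (LSB = 0). */
#define MSR_TS_S_LG	33
#define MSR_TS_MASK	(3ULL << MSR_TS_S_LG)

/* Assumed layout: CR0 is the most significant nibble of the 32-bit CR. */
#define CR0_SHIFT	28
#define CR0_MASK	0xfu

/*
 * Hypothetical helper: return cr with CR0 replaced by 0 || MSR[TS] || 0.
 * Shifting the two TS bits by CR0_SHIFT + 1 places them in the middle of
 * the nibble; shifting by CR0_SHIFT alone would give 00 || MSR[TS].
 */
uint32_t cr0_from_msr_ts(uint32_t cr, uint64_t msr)
{
	uint32_t ts = (uint32_t)((msr & MSR_TS_MASK) >> MSR_TS_S_LG);

	cr &= ~(CR0_MASK << CR0_SHIFT);
	return cr | (ts << (CR0_SHIFT + 1));
}
```

With MSR[TS] = 0b01 the CR0 nibble becomes 0b0010, where the old shift would have produced 0b0001.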

Fixes: 4bb3c7a0208f ("KVM: PPC: Book3S HV: Work around transactional memory bugs in POWER9")
Cc: stable@vger.kernel.org # v4.17+
Tested-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/powernv: Fix stale iommu table base after VFIO
Alexey Kardashevskiy [Fri, 28 Jun 2019 06:53:00 +0000 (16:53 +1000)]
powerpc/powernv: Fix stale iommu table base after VFIO

The powernv platform uses @dma_iommu_ops for non-bypass DMA. These ops
need an iommu_table pointer which is stored in
dev->archdata.iommu_table_base. It is initialized during
pcibios_setup_device() which handles boot time devices. However when a
device is taken from the system in order to pass it through, the
default IOMMU table is destroyed but the pointer in a device is not
updated; also when a device is returned back to the system, a new
table pointer is not stored in dev->archdata.iommu_table_base either.
So when a just returned device tries using IOMMU, it crashes on
accessing stale iommu_table or its members.

This calls set_iommu_table_base() when the default window is created.
Note it used to be there before but was wrongly removed (see "fixes").
It did not appear before as these days most devices simply use bypass.

This adds set_iommu_table_base(NULL) when a device is taken from the
system to make it clear that IOMMU DMA cannot be used past that point.
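
The pointer lifecycle described above can be sketched as below. All types and function names here (dev_stub, take_device(), return_device() and so on) are hypothetical stand-ins invented for the illustration; only the idea of clearing the pointer on take and storing the new table on return comes from this message.

```c
#include <stddef.h>

/* Hypothetical stand-ins for struct iommu_table and the device archdata. */
struct iommu_table_stub {
	unsigned long it_size;
};

struct dev_stub {
	struct iommu_table_stub *iommu_table_base;
};

void set_iommu_table_base_sketch(struct dev_stub *dev,
				 struct iommu_table_stub *tbl)
{
	dev->iommu_table_base = tbl;
}

/* Device taken for passthrough: the default table goes away, so clear it. */
void take_device(struct dev_stub *dev)
{
	set_iommu_table_base_sketch(dev, NULL);
}

/* Device returned: a new default window exists, so record its table. */
void return_device(struct dev_stub *dev, struct iommu_table_stub *new_tbl)
{
	set_iommu_table_base_sketch(dev, new_tbl);
}

/* Walk one take/return cycle; returns 1 if no stale pointer is ever left. */
int passthrough_roundtrip_ok(void)
{
	struct iommu_table_stub boot_tbl = { 64 }, new_tbl = { 128 };
	struct dev_stub dev = { &boot_tbl };

	take_device(&dev);
	if (dev.iommu_table_base != NULL)
		return 0;

	return_device(&dev, &new_tbl);
	return dev.iommu_table_base == &new_tbl;
}
```

A NULL pointer after take_device() makes any later IOMMU DMA attempt fail visibly instead of dereferencing a freed table.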

Fixes: c4e9d3c1e65a ("powerpc/powernv/pseries: Rework device adding to IOMMU groups")
Cc: stable@vger.kernel.org # v5.0+
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/pci/of: Parse unassigned resources
Alexey Kardashevskiy [Wed, 26 Jun 2019 02:37:46 +0000 (12:37 +1000)]
powerpc/pci/of: Parse unassigned resources

The pseries platform uses the PCI_PROBE_DEVTREE method of PCI probing
which reads "assigned-addresses" of every PCI device and initializes
the device resources. However if the property is missing or zero sized,
then there is no fallback of any kind and the PCI resources remain
undiscovered, i.e. pdev->resource[] array remains empty.

This adds a fallback which parses the "reg" property in pretty much the
same way except it marks resources as "unset" which later makes Linux
assign those resources proper addresses.

This has an effect when:
1. a hypervisor failed to assign any resource for a device;
2. /chosen/linux,pci-probe-only=0 is in the DT so the system may try
assigning a resource.
Neither is likely to happen under PowerVM.
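
The fallback order can be sketched as a small decision function. This is an illustration under assumptions: the enum, struct and function names are hypothetical, and the properties are reduced to a pointer and a length rather than real device tree accessors.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result type: which property to parse, and how to mark it. */
enum res_source {
	RES_NONE,
	RES_ASSIGNED_ADDRESSES,
	RES_REG,
};

struct res_choice {
	enum res_source source;
	bool unset;	/* true: addresses must still be assigned by Linux */
};

/*
 * Prefer "assigned-addresses"; if it is missing or zero sized, fall back
 * to "reg" and flag the resulting resources as unset.
 */
struct res_choice choose_resource_source(const void *assigned,
					 size_t assigned_len,
					 const void *reg, size_t reg_len)
{
	struct res_choice c = { RES_NONE, false };

	if (assigned && assigned_len) {
		c.source = RES_ASSIGNED_ADDRESSES;
		return c;
	}

	if (reg && reg_len) {
		c.source = RES_REG;
		c.unset = true;
	}

	return c;
}
```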

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/pseries/dma: Enable SWIOTLB
Alexey Kardashevskiy [Tue, 7 May 2019 06:25:59 +0000 (16:25 +1000)]
powerpc/pseries/dma: Enable SWIOTLB

So far the pseries platform has always been using IOMMU making
SWIOTLB unnecessary. Now we want secure guests which means devices can
only access certain areas of guest physical memory; we are going to
use SWIOTLB for this purpose.

This allows SWIOTLB for pseries. By default there is no change in
behavior.

This enables SWIOTLB when the "swiotlb" kernel parameter is set to
"force".

With the SWIOTLB enabled, the kernel creates a directly mapped DMA
window (using the usual DDW mechanism) and implements SWIOTLB on top
of that.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/pseries/dma: Allow SWIOTLB
Alexey Kardashevskiy [Tue, 7 May 2019 06:25:58 +0000 (16:25 +1000)]
powerpc/pseries/dma: Allow SWIOTLB

The commit 8617a5c5bc00 ("powerpc/dma: handle iommu bypass in
dma_iommu_ops") merged direct DMA ops into the IOMMU DMA ops allowing
SWIOTLB as well but only for mapping; the unmapping and bouncing parts
were left unmodified.

This adds missing direct unmapping calls to .unmap_page() and
.unmap_sg().

This adds missing sync callbacks and directs them to the direct DMA
hooks.

Fixes: 8617a5c5bc00 ("powerpc/dma: handle iommu bypass in dma_iommu_ops")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc: remove device_to_mask()
Christoph Hellwig [Sat, 29 Jun 2019 08:03:59 +0000 (10:03 +0200)]
powerpc: remove device_to_mask()

Use the dma_get_mask() helper from dma-mapping.h instead, as they are
functionally identical.
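
For readers unfamiliar with the helper, its behaviour can be sketched roughly as below. This is a stand-alone approximation, not the header's code: struct device_stub and dma_get_mask_sketch() are hypothetical names, and the 32-bit default reflects my understanding of the helper rather than anything stated in this message.

```c
#include <stddef.h>
#include <stdint.h>

#define DMA_BIT_MASK(n)	(((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Hypothetical stand-in for the one struct device field that matters. */
struct device_stub {
	uint64_t *dma_mask;
};

/* Return the device's DMA mask, or a 32-bit mask if none has been set. */
uint64_t dma_get_mask_sketch(const struct device_stub *dev)
{
	if (dev->dma_mask && *dev->dma_mask)
		return *dev->dma_mask;

	return DMA_BIT_MASK(32);
}
```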

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc: Fix compile issue with force DAWR
Michael Neuling [Tue, 4 Jun 2019 03:00:37 +0000 (13:00 +1000)]
powerpc: Fix compile issue with force DAWR

If you compile with KVM but without CONFIG_HAVE_HW_BREAKPOINT you fail
at linking with:
  arch/powerpc/kvm/book3s_hv_rmhandlers.o:(.text+0x708): undefined reference to `dawr_force_enable'

This was caused by commit c1fe190c0672 ("powerpc: Add force enable of
DAWR on P9 option").

This moves a bunch of code around to fix this. It moves a lot of the
DAWR code in a new file and creates a new CONFIG_PPC_DAWR to enable
compiling it.

Fixes: c1fe190c0672 ("powerpc: Add force enable of DAWR on P9 option")
Signed-off-by: Michael Neuling <mikey@neuling.org>
[mpe: Minor formatting in set_dawr()]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc: silence a -Wcast-function-type warning in dawr_write_file_bool
Mathieu Malaterre [Tue, 4 Jun 2019 03:00:36 +0000 (13:00 +1000)]
powerpc: silence a -Wcast-function-type warning in dawr_write_file_bool

In commit c1fe190c0672 ("powerpc: Add force enable of DAWR on P9
option") the following piece of code was added:

   smp_call_function((smp_call_func_t)set_dawr, &null_brk, 0);

Since GCC 8 this triggers the following warning about incompatible
function types:

  arch/powerpc/kernel/hw_breakpoint.c:408:21: error: cast between incompatible function types from 'int (*)(struct arch_hw_breakpoint *)' to 'void (*)(void *)' [-Werror=cast-function-type]

Since the warning is there for a reason, and should not be hidden behind
a cast, provide an intermediate callback function to avoid the warning.
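
The intermediate callback pattern looks roughly like this outside the kernel. It is a sketch under assumptions: call_function_stub() merely stands in for smp_call_function(), the struct is reduced to a single field, and set_dawr_cb() and clear_dawr_everywhere() are illustrative names.

```c
#include <stddef.h>

/* Reduced stand-in for the kernel structure. */
struct arch_hw_breakpoint {
	unsigned long address;
};

typedef void (*smp_call_func_t)(void *info);

/* Pretend DAWR register so the effect of the call can be observed. */
static unsigned long dawr_value = ~0UL;

int set_dawr(struct arch_hw_breakpoint *brk)
{
	dawr_value = brk->address;
	return 0;
}

/* Wrapper with exactly the smp_call_func_t signature, so no cast is needed. */
void set_dawr_cb(void *info)
{
	set_dawr(info);
}

/* Stand-in for smp_call_function(): runs func once on the calling thread. */
void call_function_stub(smp_call_func_t func, void *info)
{
	func(info);
}

/* Clear the pretend DAWR through the wrapper and return its new value. */
unsigned long clear_dawr_everywhere(void)
{
	struct arch_hw_breakpoint null_brk = { 0 };

	call_function_stub(set_dawr_cb, &null_brk);
	return dawr_value;
}
```

The wrapper discards the int return value explicitly, which is what the cast previously did silently.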

Fixes: c1fe190c0672 ("powerpc: Add force enable of DAWR on P9 option")
Suggested-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/64s/radix: keep kernel ERAT over local process/guest invalidates
Nicholas Piggin [Sun, 23 Jun 2019 10:41:52 +0000 (20:41 +1000)]
powerpc/64s/radix: keep kernel ERAT over local process/guest invalidates

ISA v3.0 radix modes provide SLBIA variants which can invalidate ERAT
for effPID!=0 or for effLPID!=0, which allows user and guest
invalidations to retain kernel/host ERAT entries.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/64s: Rename PPC_INVALIDATE_ERAT to PPC_ISA_3_0_INVALIDATE_ERAT
Nicholas Piggin [Sun, 23 Jun 2019 10:41:51 +0000 (20:41 +1000)]
powerpc/64s: Rename PPC_INVALIDATE_ERAT to PPC_ISA_3_0_INVALIDATE_ERAT

This makes it clear to the caller that it can only be used on POWER9
and later CPUs.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Use "ISA_3_0" rather than "ARCH_300"]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/64s/exception: simplify hmi control flow
Nicholas Piggin [Fri, 28 Jun 2019 06:33:22 +0000 (16:33 +1000)]
powerpc/64s/exception: simplify hmi control flow

Branch to the relocated 0xc000 address early (still in real mode), to
simplify subsequent branches. Have the virt mode handler avoid just
'windup' and redo the exception from scratch, rather than branching
back to the trampoline.

Rearrange the stack setup instruction location to match the system
reset handler (e.g., right before EXCEPTION_PROLOG_COMMON).

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/64s/exception: hmi remove special case macro
Nicholas Piggin [Fri, 28 Jun 2019 06:33:21 +0000 (16:33 +1000)]
powerpc/64s/exception: hmi remove special case macro

No code change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/64s/exception: sreset move trampoline ahead of common code
Nicholas Piggin [Fri, 28 Jun 2019 06:33:20 +0000 (16:33 +1000)]
powerpc/64s/exception: sreset move trampoline ahead of common code

Follow convention and move tramp ahead of common.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/64s/exception: optimise system_reset for idle, clean up non-idle case
Nicholas Piggin [Fri, 28 Jun 2019 06:33:19 +0000 (16:33 +1000)]
powerpc/64s/exception: optimise system_reset for idle, clean up non-idle case

The idle wake up code in the system reset interrupt is not very
optimal. There are two requirements: perform idle wake up quickly;
and save everything including CFAR for non-idle interrupts, with
no performance requirement.

The problem with placing the idle test in the middle of the handler
and using the normal handler code to save CFAR, is that it's quite
costly (e.g., mfcfar is serialising, speculative workarounds get
applied, SRR1 has to be reloaded, etc). It also prevents the standard
interrupt handler boilerplate being used.

This pain can be avoided by using a dedicated idle interrupt handler
at the start of the interrupt handler, which restores all registers
back to the way they were in case it was not an idle wake up. CFAR
is preserved without saving it before the non-idle case by making that
the fall-through, and idle is a taken branch.

Performance seems to be in the noise, but possibly around 0.5% faster,
the executed instructions certainly look better. The bigger benefit is
being able to drop in standard interrupt handlers after the idle code,
which helps with subsequent cleanup and consolidation.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Fixup BE by using DOTSYM for idle_return_gpr_loss call]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/64s/exception: remove bad stack branch
Nicholas Piggin [Fri, 28 Jun 2019 06:33:18 +0000 (16:33 +1000)]
powerpc/64s/exception: remove bad stack branch

The bad stack test in interrupt handlers has a few problems. For
performance it is taken in the common case, which is a fetch bubble
and a waste of i-cache.

For code development and maintenance, it requires yet another stack
frame setup routine, and that constrains all exception handlers to
follow the same register save pattern which inhibits future
optimisation.

Remove the test/branch and replace it with a trap. Teach the program
check handler to use the emergency stack for this case.

This does not result in quite so nice a message, however the SRR0 and
SRR1 of the crashed interrupt can be seen in r11 and r12, as is the
original r1 (adjusted by INT_FRAME_SIZE). These are the most important
parts to debugging the issue.

The original r9-12 and cr0 are lost, which is the main downside.

  kernel BUG at linux/arch/powerpc/kernel/exceptions-64s.S:847!
  Oops: Exception in kernel mode, sig: 5 [#1]
  BE SMP NR_CPUS=2048 NUMA PowerNV
  Modules linked in:
  CPU: 0 PID: 1 Comm: swapper/0 Not tainted
  NIP:  c000000000009108 LR: c000000000cadbcc CTR: c0000000000090f0
  REGS: c0000000fffcbd70 TRAP: 0700   Not tainted
  MSR:  9000000000021032 <SF,HV,ME,IR,DR,RI>  CR: 28222448  XER: 20040000
  CFAR: c000000000009100 IRQMASK: 0
  GPR00: 000000000000003d fffffffffffffd00 c0000000018cfb00 c0000000f02b3166
  GPR04: fffffffffffffffd 0000000000000007 fffffffffffffffb 0000000000000030
  GPR08: 0000000000000037 0000000028222448 0000000000000000 c000000000ca8de0
  GPR12: 9000000002009032 c000000001ae0000 c000000000010a00 0000000000000000
  GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
  GPR20: c0000000f00322c0 c000000000f85200 0000000000000004 ffffffffffffffff
  GPR24: fffffffffffffffe 0000000000000000 0000000000000000 000000000000000a
  GPR28: 0000000000000000 0000000000000000 c0000000f02b391c c0000000f02b3167
  NIP [c000000000009108] decrementer_common+0x18/0x160
  LR [c000000000cadbcc] .vsnprintf+0x3ec/0x4f0
  Call Trace:
  Instruction dump:
  996d098a 994d098b 38610070 480246ed 48005518 60000000 38200000 718a4000
  7c2a0b78 3821fd00 41c20008 e82d0970 <0981fd00> f92101a0 f9610170 f9810178

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/tm: update comment about interrupt re-entrancy
Nicholas Piggin [Fri, 28 Jun 2019 05:33:32 +0000 (15:33 +1000)]
powerpc/tm: update comment about interrupt re-entrancy

Since the system reset interrupt began to use its own stack, and
machine check interrupts have done so for some time, r1 can be
changed without clearing MSR[RI], provided no other interrupts
(including SLB misses) are taken.

MSR[RI] does have to be cleared when using SCRATCH0, however.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/64s/exception: move SET_SCRATCH0 into EXCEPTION_PROLOG_0
Nicholas Piggin [Fri, 28 Jun 2019 05:33:31 +0000 (15:33 +1000)]
powerpc/64s/exception: move SET_SCRATCH0 into EXCEPTION_PROLOG_0

No generated code change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/64s/exception: denorm handler use standard scratch save macro
Nicholas Piggin [Fri, 28 Jun 2019 05:33:30 +0000 (15:33 +1000)]
powerpc/64s/exception: denorm handler use standard scratch save macro

Although the 0x1500 interrupt only applies to bare metal, it is better
to just use the standard macro for scratch save.

Runtime code path remains unchanged (due to instruction patching).

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/64s/exception: machine check use standard macros to save dar/dsisr
Nicholas Piggin [Fri, 28 Jun 2019 05:33:29 +0000 (15:33 +1000)]
powerpc/64s/exception: machine check use standard macros to save dar/dsisr

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/64s/exception: add dar and dsisr options to exception macro
Nicholas Piggin [Fri, 28 Jun 2019 05:33:28 +0000 (15:33 +1000)]
powerpc/64s/exception: add dar and dsisr options to exception macro

Some exception entries require DAR and/or DSISR to be saved into the
paca exception save area. Add options to the standard exception
macros for these.

Generated code changes slightly due to code structure.

-     554:      a6 02 72 7d     mfdsisr r11
-     558:      a8 00 4d f9     std     r10,168(r13)
-     55c:      b0 00 6d 91     stw     r11,176(r13)
+     554:      a8 00 4d f9     std     r10,168(r13)
+     558:      a6 02 52 7d     mfdsisr r10
+     55c:      b0 00 4d 91     stw     r10,176(r13)

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/64s/exception: use common macro for windup
Nicholas Piggin [Fri, 28 Jun 2019 05:33:27 +0000 (15:33 +1000)]
powerpc/64s/exception: use common macro for windup

No generated code change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/64s/exception: shuffle windup code around
Nicholas Piggin [Fri, 28 Jun 2019 05:33:26 +0000 (15:33 +1000)]
powerpc/64s/exception: shuffle windup code around

Restore all SPRs and CR up-front, these are longer latency
instructions. Move register restore around to maximise pairs of
adjacent loads (e.g., restore r0 next to r1).

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/64s/exception: simplify hmi windup code
Nicholas Piggin [Fri, 28 Jun 2019 05:33:25 +0000 (15:33 +1000)]
powerpc/64s/exception: simplify hmi windup code

Duplicate the hmi windup code for both cases, rather than to put a
special case branch in the middle of it. Remove unused label. This
helps with later code consolidation.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/64s/exception: move machine check windup in_mce handling
Nicholas Piggin [Fri, 28 Jun 2019 05:33:24 +0000 (15:33 +1000)]
powerpc/64s/exception: move machine check windup in_mce handling

Move in_mce decrement earlier before registers are restored (but
still after RI=0). This helps with later consolidation.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/64s/exception: windup use r9 consistently to restore SPRs
Nicholas Piggin [Fri, 28 Jun 2019 05:33:23 +0000 (15:33 +1000)]
powerpc/64s/exception: windup use r9 consistently to restore SPRs

Trivial code change, r3->r9.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/64s/exception: mtmsrd L=1 cleanup
Nicholas Piggin [Fri, 28 Jun 2019 05:33:22 +0000 (15:33 +1000)]
powerpc/64s/exception: mtmsrd L=1 cleanup

All supported 64s CPUs support mtmsrd L=1 instruction, so a cleanup
can be made in sreset and mce handlers.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/64s/exception: avoid SPR RAW scoreboard stall in real mode entry
Nicholas Piggin [Fri, 28 Jun 2019 05:33:21 +0000 (15:33 +1000)]
powerpc/64s/exception: avoid SPR RAW scoreboard stall in real mode entry

Move SPR reads ahead of writes. Real mode entry that is not a KVM
guest is rare these days, but bad practice propagates.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/64s/exception: clean up system call entry
Nicholas Piggin [Fri, 28 Jun 2019 05:33:20 +0000 (15:33 +1000)]
powerpc/64s/exception: clean up system call entry

syscall / hcall entry unnecessarily differs between KVM and non-KVM
builds. Move the SMT priority instruction to the same location
(after INTERRUPT_TO_KERNEL).

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/64s/exception: move paca save area offsets into exception-64s.S
Nicholas Piggin [Sat, 22 Jun 2019 13:15:35 +0000 (23:15 +1000)]
powerpc/64s/exception: move paca save area offsets into exception-64s.S

No generated code change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/64s/exception: remove pointless EXCEPTION_PROLOG macro indirection
Nicholas Piggin [Sat, 22 Jun 2019 13:15:34 +0000 (23:15 +1000)]
powerpc/64s/exception: remove pointless EXCEPTION_PROLOG macro indirection

No generated code change. Final vmlinux is changed only due to change
in bug table line numbers.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/64s/exception: generate regs clear instructions using .rept
Nicholas Piggin [Sat, 22 Jun 2019 13:15:33 +0000 (23:15 +1000)]
powerpc/64s/exception: generate regs clear instructions using .rept

No generated code change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/64s/exception: fix indenting irregularities
Nicholas Piggin [Sat, 22 Jun 2019 13:15:32 +0000 (23:15 +1000)]
powerpc/64s/exception: fix indenting irregularities

Generally, macros that result in instructions being expanded are
indented by a tab, and those that don't have no indent. Fix the
obvious cases that go contrary to style.

No generated code change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/64s/exception: use a gas macro for system call handler code
Nicholas Piggin [Sat, 22 Jun 2019 13:15:31 +0000 (23:15 +1000)]
powerpc/64s/exception: use a gas macro for system call handler code

No generated code change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/64s/exception: remove unused BRANCH_TO_COMMON
Nicholas Piggin [Sat, 22 Jun 2019 13:15:30 +0000 (23:15 +1000)]
powerpc/64s/exception: remove unused BRANCH_TO_COMMON

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/64s/exception: remove __BRANCH_TO_KVM
Nicholas Piggin [Sat, 22 Jun 2019 13:15:29 +0000 (23:15 +1000)]
powerpc/64s/exception: remove __BRANCH_TO_KVM

No generated code change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/64s/exception: move head-64.h code to exception-64s.S where it is used
Nicholas Piggin [Sat, 22 Jun 2019 13:15:28 +0000 (23:15 +1000)]
powerpc/64s/exception: move head-64.h code to exception-64s.S where it is used

No generated code change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/64s/exception: move exception-64s.h code to exception-64s.S where it is used
Nicholas Piggin [Sat, 22 Jun 2019 13:15:27 +0000 (23:15 +1000)]
powerpc/64s/exception: move exception-64s.h code to exception-64s.S where it is used

No generated code change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/64s/exception: move KVM related code together
Nicholas Piggin [Sat, 22 Jun 2019 13:15:26 +0000 (23:15 +1000)]
powerpc/64s/exception: move KVM related code together

No generated code change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/64s/exception: remove STD_EXCEPTION_COMMON variants
Nicholas Piggin [Sat, 22 Jun 2019 13:15:25 +0000 (23:15 +1000)]
powerpc/64s/exception: remove STD_EXCEPTION_COMMON variants

These are only called in one place each.

No generated code change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/64s/exception: move EXCEPTION_PROLOG_2* to a more logical place
Nicholas Piggin [Sat, 22 Jun 2019 13:15:24 +0000 (23:15 +1000)]
powerpc/64s/exception: move EXCEPTION_PROLOG_2* to a more logical place

No generated code change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/64s/exception: improve 0x500 handler code
Nicholas Piggin [Sat, 22 Jun 2019 13:15:23 +0000 (23:15 +1000)]
powerpc/64s/exception: improve 0x500 handler code

After the previous cleanup, it becomes possible to consolidate some
common code outside the runtime alternate patching. Also remove
unused labels.

This results in some code change, but unchanged runtime instruction
sequence.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/64s/exception: unwind exception-64s.h macros
Nicholas Piggin [Sat, 22 Jun 2019 13:15:22 +0000 (23:15 +1000)]
powerpc/64s/exception: unwind exception-64s.h macros

Many of these macros just specify 1-4 lines which are only called a
few times each at most, and often just once. Remove this indirection.

No generated code change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/64s/exception: Move EXCEPTION_COMMON additions into callers
Nicholas Piggin [Sat, 22 Jun 2019 13:15:21 +0000 (23:15 +1000)]
powerpc/64s/exception: Move EXCEPTION_COMMON additions into callers

More cases of code insertion via macros that does not add a great
deal. All the additions have to be specified in the macro arguments,
so they can just as well go after the macro.

No generated code change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/64s/exception: Move EXCEPTION_COMMON handler and return branches into callers
Nicholas Piggin [Sat, 22 Jun 2019 13:15:20 +0000 (23:15 +1000)]
powerpc/64s/exception: Move EXCEPTION_COMMON handler and return branches into callers

The aim is to reduce the amount of indirection it takes to get through
the exception handler macros, particularly where it provides little
code sharing.

No generated code change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/64s/exception: Make EXCEPTION_PROLOG_0 a gas macro for consistency with others
Nicholas Piggin [Sat, 22 Jun 2019 13:15:19 +0000 (23:15 +1000)]
powerpc/64s/exception: Make EXCEPTION_PROLOG_0 a gas macro for consistency with others

No generated code change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/64s/exception: KVM handler can set the HSRR trap bit
Nicholas Piggin [Sat, 22 Jun 2019 13:15:18 +0000 (23:15 +1000)]
powerpc/64s/exception: KVM handler can set the HSRR trap bit

Move the KVM trap HSRR bit into the KVM handler, which can be
conditionally applied when hsrr parameter is set.

No generated code change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/64s/exception: merge KVM handler and skip variants
Nicholas Piggin [Sat, 22 Jun 2019 13:15:17 +0000 (23:15 +1000)]
powerpc/64s/exception: merge KVM handler and skip variants

Conditionally expand the skip case if it is specified.

No generated code change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/64s/exception: consolidate maskable and non-maskable prologs
Nicholas Piggin [Sat, 22 Jun 2019 13:15:16 +0000 (23:15 +1000)]
powerpc/64s/exception: consolidate maskable and non-maskable prologs

Conditionally expand the soft-masking test if a mask is passed in.

No generated code change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/64s/exception: remove the "extra" macro parameter
Nicholas Piggin [Sat, 22 Jun 2019 13:15:15 +0000 (23:15 +1000)]
powerpc/64s/exception: remove the "extra" macro parameter

Rather than pass in the soft-masking and KVM tests via macro that is
passed to another macro to expand it, switch to using gas macros and
conditionally expand the soft-masking and KVM tests.

The system reset with its idle test is open coded as it is a one-off.

No generated code change.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>