platform/kernel/linux-rpi.git
8 years agoMerge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux...
Michael Ellerman [Fri, 16 Dec 2016 04:05:38 +0000 (15:05 +1100)]
Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux into next

Freescale updates from Scott:

"Highlights include 8xx hugepage support, qbman fixes/cleanup, device
tree updates, and some misc cleanup."

8 years agopowerpc/fsl/dts: add FMan node for t1042d4rdb
Madalin Bucur [Wed, 7 Dec 2016 15:14:56 +0000 (17:14 +0200)]
powerpc/fsl/dts: add FMan node for t1042d4rdb

Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
8 years agopowerpc/fsl/dts: add sg_2500_aqr105_phy4 alias on t1024rdb
Madalin Bucur [Wed, 7 Dec 2016 15:14:55 +0000 (17:14 +0200)]
powerpc/fsl/dts: add sg_2500_aqr105_phy4 alias on t1024rdb

The alias is used by the boot loader to perform a device tree
fixup.

Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
8 years agopowerpc/fsl/dts: add QMan and BMan nodes on t1024
Madalin Bucur [Wed, 7 Dec 2016 15:14:54 +0000 (17:14 +0200)]
powerpc/fsl/dts: add QMan and BMan nodes on t1024

Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
8 years agopowerpc/fsl/dts: add QMan and BMan nodes on t1023
Madalin Bucur [Wed, 7 Dec 2016 15:14:53 +0000 (17:14 +0200)]
powerpc/fsl/dts: add QMan and BMan nodes on t1023

Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
8 years agosoc/fsl/qman: test: use DEFINE_SPINLOCK()
Fabian Frederick [Sun, 4 Dec 2016 12:44:59 +0000 (13:44 +0100)]
soc/fsl/qman: test: use DEFINE_SPINLOCK()

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Scott Wood <oss@buserror.net>
8 years agopowerpc/fsl-lbc: use DEFINE_SPINLOCK()
Fabian Frederick [Sun, 4 Dec 2016 12:47:28 +0000 (13:47 +0100)]
powerpc/fsl-lbc: use DEFINE_SPINLOCK()

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Scott Wood <oss@buserror.net>
8 years agopowerpc/8xx: Implement support of hugepages
Christophe Leroy [Wed, 7 Dec 2016 07:47:28 +0000 (08:47 +0100)]
powerpc/8xx: Implement support of hugepages

The 8xx uses a two-level page table and supports two Linux page sizes
(4k and 16k). It also supports two hugepage sizes, 512k and 8M. To
support them on Linux we define two different page table layouts.

The size of pages is in the PGD entry, using PS field (bits 28-29):
00 : Small pages (4k or 16k)
01 : 512k pages
10 : reserved
11 : 8M pages

For the 512K hugepage size, a pgd entry has the format
[<hugepte address>0101]. The allocated hugepte table contains 8
entries pointing to 512K huge ptes in 4k pages mode and 128 entries in
16k pages mode.

For 8M in 16k mode, a pgd entry has the format
[<hugepte address>1101]. The allocated hugepte table contains 8
entries pointing to 8M huge ptes.

For 8M in 4k mode, multiple pgd entries point to the same hugepte
address and each pgd entry has the format
[<hugepte address>1101]. The allocated hugepte table has only one
entry.
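
The PS field encoding above can be sketched in C as follows. This is
only an illustration, not the code from the patch: the names are
hypothetical, and it assumes IBM bit numbering (bit 0 is the most
significant bit of the 32-bit entry) and that the four digits shown in
the bracket notation are the four least significant bits.

```c
#include <stdint.h>

/* PS field values (PGD entry bits 28-29, IBM numbering: bit 0 = MSB). */
enum pgd_ps {
	PGD_PS_SMALL = 0x0,	/* 00: small pages (4k or 16k) */
	PGD_PS_512K  = 0x1,	/* 01: 512k pages */
	PGD_PS_RSVD  = 0x2,	/* 10: reserved */
	PGD_PS_8M    = 0x3,	/* 11: 8M pages */
};

/* Bits 28-29 counted from the MSB are bits 3-2 counted from the LSB. */
enum pgd_ps pgd_page_size(uint32_t pgd)
{
	return (enum pgd_ps)((pgd >> 2) & 0x3);
}

/* Hypothetical helper: drop the four low control bits of the entry. */
uint32_t pgd_hugepte_base(uint32_t pgd)
{
	return pgd & ~(uint32_t)0xf;
}
```

With this reading, an entry ending in 0101 decodes as a 512k mapping and
one ending in 1101 as an 8M mapping.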

For the time being, we do not support CPU15 ERRATA when HUGETLB is
selected.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> (v3, for the generic bits)
Signed-off-by: Scott Wood <oss@buserror.net>
8 years agopowerpc: get hugetlbpage handling more generic
Christophe Leroy [Wed, 7 Dec 2016 07:47:26 +0000 (08:47 +0100)]
powerpc: get hugetlbpage handling more generic

Today there are two implementations of hugetlbpages, selected by
mutually exclusive #ifdefs:
* FSL_BOOKE: several directory entries point to the same single hugepage
* BOOK3S: one upper level directory entry points to a table of hugepages

In preparation for implementing hugepage support on the 8xx, we
need a mix of the two solutions above, because the 8xx needs both cases
depending on the page size:
* In 4k page size mode, each PGD entry covers a 4M byte area. It means
that 2 PGD entries are needed to cover an 8M hugepage, while a
single PGD entry covers 8x 512k hugepages.
* In 16k page size mode, each PGD entry covers a 64M byte area. It means
that one PGD entry covers 8x 8M hugepages or 128x 512k
hugepages.
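
The arithmetic behind these two cases can be sketched as below. The
function names are hypothetical and the code only models the sizes
quoted above (4M or 64M per PGD entry, 512k or 8M per hugepage); it is
not taken from the patch.

```c
#define SZ_512K	(512UL * 1024)
#define SZ_4M	(4UL * 1024 * 1024)
#define SZ_8M	(8UL * 1024 * 1024)
#define SZ_64M	(64UL * 1024 * 1024)

/*
 * BOOK3S-like case: one PGD entry points to a table of hugepages.
 * Returns how many hugepages fit in the area covered by one PGD entry,
 * or 0 when a single hugepage is larger than that area.
 */
unsigned long hugepages_per_pgd_entry(unsigned long pgd_span,
				      unsigned long hpage)
{
	return hpage <= pgd_span ? pgd_span / hpage : 0;
}

/*
 * FSL_BOOKE-like case: several PGD entries point to the same hugepage.
 * Returns how many PGD entries one hugepage spans (1 when it fits).
 */
unsigned long pgd_entries_per_hugepage(unsigned long pgd_span,
				       unsigned long hpage)
{
	return hpage > pgd_span ? hpage / pgd_span : 1;
}
```

Choosing between the two layouts by comparing the hugepage size with
the PGD span is what lets an if/else replace the #ifdefs.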

This patch:
* removes #ifdefs in favor of if/else based on the range sizes
* merges the two huge_pte_alloc() functions as they are pretty similar
* merges the two hugetlbpage_init() functions as they are pretty similar

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> (v3)
Signed-off-by: Scott Wood <oss@buserror.net>
8 years agopowerpc: port 64 bits pgtable_cache to 32 bits
Christophe Leroy [Wed, 7 Dec 2016 07:47:24 +0000 (08:47 +0100)]
powerpc: port 64 bits pgtable_cache to 32 bits

Today powerpc64 uses a set of pgtable_caches while powerpc32 uses
standard pages when using 4k pages and a single pgtable_cache
if using other size pages.

In preparation of implementing huge pages on the 8xx, this patch
replaces the specific powerpc32 handling by the 64 bits approach.

This is done by:
* moving 64 bits pgtable_cache_add() and pgtable_cache_init()
in a new file called init-common.c
* modifying pgtable_cache_init() to also handle the case
without PMD
* removing the 32 bits version of pgtable_cache_add() and
pgtable_cache_init()
* copying related header contents from 64 bits into both the
book3s/32 and nohash/32 header files

On the 8xx, the following cache sizes will be used:
* 4k pages mode:
- PGT_CACHE(10) for PGD
- PGT_CACHE(3) for 512k hugepage tables
* 16k pages mode:
- PGT_CACHE(6) for PGD
- PGT_CACHE(7) for 512k hugepage tables
- PGT_CACHE(3) for 8M hugepage tables
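
As a rough guide to these numbers, assuming that PGT_CACHE(shift)
names a cache of tables holding 1 << shift pointer-sized entries (4
bytes each on the 32-bit 8xx), the table sizes can be computed as in
this sketch. The helper names are hypothetical.

```c
/* Assumption: entries are pointer-sized, i.e. 4 bytes on the 8xx. */
#define ENTRY_SIZE_8XX	4UL

/* Number of entries in a table served by PGT_CACHE(shift). */
unsigned long pgt_cache_entries(unsigned int shift)
{
	return 1UL << shift;
}

/* Size in bytes of one such table. */
unsigned long pgt_cache_bytes(unsigned int shift)
{
	return ENTRY_SIZE_8XX << shift;
}
```

Under that assumption PGT_CACHE(10) is a 1024-entry, 4k PGD and
PGT_CACHE(3) is an 8-entry hugepage table.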

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Scott Wood <oss@buserror.net>
8 years agopowerpc/boot: Request no dynamic linker for boot wrapper
Nicholas Piggin [Mon, 28 Nov 2016 01:42:26 +0000 (12:42 +1100)]
powerpc/boot: Request no dynamic linker for boot wrapper

The boot wrapper performs its own relocations and does not require a
PT_INTERP segment. However, we currently don't tell the linker that.

Prior to binutils 2.28 that works OK. But since binutils commit
1a9ccd70f9a7 ("Fix the linker so that it will not silently generate ELF
binaries with invalid program headers. Fix readelf to report such
invalid binaries.") binutils tries to create a program header segment
due to PT_INTERP, and the link fails because there is no space for it:

  ld: arch/powerpc/boot/zImage.pseries: Not enough room for program headers, try linking with -N
  ld: final link failed: Bad value

So tell the linker not to do that, by passing --no-dynamic-linker.

Cc: stable@vger.kernel.org
Reported-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Drop dependency on ld-version.sh and massage change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agosoc/fsl/bman: Use resource_size instead of computation
Wei Yongjun [Mon, 17 Oct 2016 15:13:59 +0000 (15:13 +0000)]
soc/fsl/bman: Use resource_size instead of computation

Use resource_size function on resource object
instead of explicit computation.
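
For reference, a userspace model of the computation that
resource_size() wraps. The struct here is a stand-in with only the two
fields involved, not the kernel's struct resource.

```c
#include <stdint.h>

/* Stand-in for the kernel's struct resource (illustration only). */
struct resource_model {
	uint64_t start;	/* first address, inclusive */
	uint64_t end;	/* last address, inclusive */
};

/* Both bounds are inclusive, hence the "+ 1". */
uint64_t resource_size_model(const struct resource_model *res)
{
	return res->end - res->start + 1;
}
```

Using the helper avoids repeating the easy-to-miss "+ 1" at each call
site.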

Generated by: scripts/coccinelle/api/resource_size.cocci

Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Scott Wood <oss@buserror.net>
8 years agosoc/fsl/qe: use builtin_platform_driver
Geliang Tang [Wed, 23 Nov 2016 15:04:21 +0000 (23:04 +0800)]
soc/fsl/qe: use builtin_platform_driver

Use builtin_platform_driver() helper to simplify the code.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Scott Wood <oss@buserror.net>
8 years agopowerpc/fsl_pmc: use builtin_platform_driver
Geliang Tang [Wed, 23 Nov 2016 15:02:35 +0000 (23:02 +0800)]
powerpc/fsl_pmc: use builtin_platform_driver

Use builtin_platform_driver() helper to simplify the code.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Scott Wood <oss@buserror.net>
8 years agopowerpc/83xx/suspend: use builtin_platform_driver
Geliang Tang [Wed, 23 Nov 2016 15:00:45 +0000 (23:00 +0800)]
powerpc/83xx/suspend: use builtin_platform_driver

Use builtin_platform_driver() helper to simplify the code.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Scott Wood <oss@buserror.net>
8 years agopowerpc/ftrace: Fix the comments for ftrace_modify_code
Libin [Sun, 6 Dec 2015 02:02:56 +0000 (10:02 +0800)]
powerpc/ftrace: Fix the comments for ftrace_modify_code

There is no need to worry about module or __init text disappearing.
ftrace has a module notifier that is called when a module is being
unloaded, before the text goes away. That code grabs the ftrace_lock
mutex and removes the module's functions from the ftrace list, so
ftrace no longer modifies that module's text. The update that makes
functions traced or not is done under the ftrace_lock mutex as well.
__init section code is not modified by ftrace either, because it is
blacklisted in recordmcount.c and ignored by ftrace.

Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Li Bin <huawei.libin@huawei.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc/perf: macros for power9 format encoding
Madhavan Srinivasan [Fri, 2 Dec 2016 00:35:02 +0000 (06:05 +0530)]
powerpc/perf: macros for power9 format encoding

Patch to add macros and constants to support the power9 raw
event encoding format. A couple of functions are added since some of
the bit fields, like PMCxCOMB and THRESH_CMP, have a different width
and location within MMCR* in power9.

Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc/perf: power9 raw event format encoding
Madhavan Srinivasan [Fri, 2 Dec 2016 00:35:01 +0000 (06:05 +0530)]
powerpc/perf: power9 raw event format encoding

Patch to update the power9 raw event encoding format
information and add support for the same in power9-pmu.c.

Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc/perf: update attribute_group data structure
Madhavan Srinivasan [Fri, 2 Dec 2016 00:35:00 +0000 (06:05 +0530)]
powerpc/perf: update attribute_group data structure

Rename the power_pmu and attribute_group variables that
support PowerISA v2.07. Add a cpu feature flag check to pick
the PowerISA v2.07 format structures to support.

Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc/perf: factor out the event format field
Madhavan Srinivasan [Fri, 2 Dec 2016 00:34:59 +0000 (06:04 +0530)]
powerpc/perf: factor out the event format field

Factor out the format field structure for PowerISA v2.07.

Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc/mm/iommu, vfio/spapr: Put pages on VFIO container shutdown
Alexey Kardashevskiy [Wed, 30 Nov 2016 06:52:05 +0000 (17:52 +1100)]
powerpc/mm/iommu, vfio/spapr: Put pages on VFIO container shutdown

At the moment the userspace tool is expected to request pinning of
the entire guest RAM when VFIO IOMMU SPAPR v2 driver is present.
When the userspace process finishes, all the pinned pages need to
be put; this is done as a part of the userspace memory context (MM)
destruction which happens on the very last mmdrop().

The problem with this approach is that the MM of a userspace process
may live longer than the process itself, because a kernel thread
uses the MM of the userspace process that was running on the CPU where
the kernel thread was scheduled. If this happens, the MM remains
referenced until that exact kernel thread wakes up again
and releases the very last reference to the MM; on an idle system this
can take hours.

This moves preregistered regions tracking from MM to VFIO; instead of
using mm_iommu_table_group_mem_t::used, tce_container::prereg_list is
added so each container releases regions which it has pre-registered.

This changes the userspace interface to return EBUSY if a memory
region is already registered in a container. However, it should not
have any practical effect, as the only userspace tool available now
registers each memory region once per container anyway.

As tce_iommu_register_pages/tce_iommu_unregister_pages are called
under container->lock, this does not need additional locking.
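
A userspace model of the per-container tracking described above. All
names are hypothetical, a fixed-size array stands in for
tce_container::prereg_list, and the page pinning itself is not
modelled.

```c
#include <errno.h>
#include <stddef.h>

#define MAX_PREREG	8	/* arbitrary bound for this sketch */

struct prereg_region {
	unsigned long vaddr;
	unsigned long size;
};

/* Model of a container that remembers what it pre-registered. */
struct prereg_container_model {
	struct prereg_region prereg[MAX_PREREG];
	size_t nr_prereg;
};

/* Registering the same region twice in one container gives -EBUSY. */
int container_register(struct prereg_container_model *c,
		       unsigned long vaddr, unsigned long size)
{
	size_t i;

	for (i = 0; i < c->nr_prereg; i++)
		if (c->prereg[i].vaddr == vaddr && c->prereg[i].size == size)
			return -EBUSY;

	if (c->nr_prereg == MAX_PREREG)
		return -ENOMEM;

	c->prereg[c->nr_prereg].vaddr = vaddr;
	c->prereg[c->nr_prereg].size = size;
	c->nr_prereg++;
	return 0;
}

/*
 * On container shutdown, release everything this container registered
 * instead of waiting for the last mmdrop(). Returns the number of
 * regions released.
 */
size_t container_release(struct prereg_container_model *c)
{
	size_t released = c->nr_prereg;

	c->nr_prereg = 0;
	return released;
}
```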

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agovfio/spapr: Reference mm in tce_container
Alexey Kardashevskiy [Wed, 30 Nov 2016 06:52:04 +0000 (17:52 +1100)]
vfio/spapr: Reference mm in tce_container

In some situations the userspace memory context may live longer than
the userspace process itself so if we need to do proper memory context
cleanup, we better have tce_container take a reference to mm_struct and
use it later when the process is gone (@current or @current->mm is NULL).

This references mm and stores the pointer in the container; this is done
in a new helper - tce_iommu_mm_set() - when one of the following happens:
- a container is enabled (IOMMU v1);
- a first attempt to pre-register memory is made (IOMMU v2);
- a DMA window is created (IOMMU v2).
The @mm stays referenced till the container is destroyed.

This replaces current->mm with container->mm everywhere except debug
prints.

This adds a check that current->mm is the same as the one stored in
the container to prevent userspace from making changes to a memory
context of other processes.

DMA map/unmap ioctls() do not check for @mm as they already check
for @enabled which is set after tce_iommu_mm_set() is called.

This does not reference a task as multiple threads within the same mm
are allowed to ioctl() to vfio and supposedly they will have same limits
and capabilities and if they do not, we'll just fail with no harm made.
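
The reference-and-check logic can be modelled in userspace roughly as
follows. The types and names are hypothetical stand-ins, and the
-EPERM return for a mismatching mm is an assumption, since the error
code is not stated above.

```c
#include <errno.h>
#include <stddef.h>

/* Stand-in for mm_struct: only a reference counter. */
struct mm_model {
	int refcount;
};

/* Stand-in for tce_container: only the cached mm pointer. */
struct tce_container_model {
	struct mm_model *mm;
};

/*
 * First call: take a reference on the caller's mm and store it.
 * Later calls: succeed only if the caller's mm is the stored one.
 */
int container_mm_set(struct tce_container_model *c,
		     struct mm_model *current_mm)
{
	if (c->mm)
		return c->mm == current_mm ? 0 : -EPERM; /* assumed errno */

	if (!current_mm)
		return -EPERM;	/* no mm to reference; assumed errno */

	current_mm->refcount++;
	c->mm = current_mm;
	return 0;
}

/* The mm stays referenced until the container is destroyed. */
void container_mm_put(struct tce_container_model *c)
{
	if (c->mm) {
		c->mm->refcount--;
		c->mm = NULL;
	}
}
```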

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agovfio/spapr: Postpone default window creation
Alexey Kardashevskiy [Wed, 30 Nov 2016 06:52:03 +0000 (17:52 +1100)]
vfio/spapr: Postpone default window creation

We are going to allow the userspace to configure a container in
one memory context and pass the container fd to another, so
we are postponing memory allocations accounted against
the locked memory limit. One of the previous patches took care of
it_userspace.

At the moment we create the default DMA window when the first group is
attached to a container; this is done for the userspace which is not
DDW-aware but familiar with the SPAPR TCE IOMMU v2 in the part of memory
pre-registration - such client expects the default DMA window to exist.

This postpones the default DMA window allocation till one of
the following happens:
1. first map/unmap request arrives;
2. new window is requested.
This adds a no-op for the case when the userspace requested removal
of the default window which has not been created yet.
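
The deferred creation can be modelled with a pending flag, as in this
userspace sketch. The names are hypothetical and window creation is
reduced to a counter.

```c
#include <stdbool.h>

struct window_container_model {
	bool def_window_pending;	/* requested but not created yet */
	int nr_windows;			/* windows actually created */
};

/* Attaching the first group only records that a default is wanted. */
void attach_first_group(struct window_container_model *c)
{
	c->def_window_pending = true;
}

static void create_default_window_if_pending(struct window_container_model *c)
{
	if (c->def_window_pending) {
		c->nr_windows++;
		c->def_window_pending = false;
	}
}

/* Case 1: the first map/unmap request creates the default window. */
void map_or_unmap(struct window_container_model *c)
{
	create_default_window_if_pending(c);
}

/* Case 2: a new window request creates the default window first. */
void create_window(struct window_container_model *c)
{
	create_default_window_if_pending(c);
	c->nr_windows++;
}

/*
 * Removing a default window that was never created is a no-op.
 * Returns true only if an existing window was actually removed.
 */
bool remove_default_window(struct window_container_model *c)
{
	if (c->def_window_pending) {
		c->def_window_pending = false;
		return false;
	}
	if (c->nr_windows > 0) {
		c->nr_windows--;
		return true;
	}
	return false;
}
```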

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agovfio/spapr: Add a helper to create default DMA window
Alexey Kardashevskiy [Wed, 30 Nov 2016 06:52:02 +0000 (17:52 +1100)]
vfio/spapr: Add a helper to create default DMA window

There is already a helper to create a DMA window which allocates
a table and programs it to the IOMMU group. However,
tce_iommu_take_ownership_ddw() did not use it and made these 2 calls
itself to simplify the error path.

Since we are going to delay the default window creation till
the default window is accessed/removed or new window is added,
we need a helper to create a default window from all these cases.

This adds tce_iommu_create_default_window(). Since it relies on
a VFIO container to have at least one IOMMU group (for future use),
this changes tce_iommu_attach_group() to add a group to the container
first and then call the new helper.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agovfio/spapr: Postpone allocation of userspace version of TCE table
Alexey Kardashevskiy [Wed, 30 Nov 2016 06:52:01 +0000 (17:52 +1100)]
vfio/spapr: Postpone allocation of userspace version of TCE table

The iommu_table struct manages a hardware TCE table and a vmalloc'd
table with corresponding userspace addresses. Both are allocated when
the default DMA window is created and this happens when the very first
group is attached to a container.

As we are going to allow the userspace to configure a container in one
memory context and pass the container fd to another, we have to postpone
such allocations till the container fd is passed to the destination
user process, so that we account the locked memory limit against the
actual container user's constraints.

This postpones the it_userspace array allocation till it is used first
time for mapping. The unmapping patch already checks if the array is
allocated.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc/iommu: Stop using @current in mm_iommu_xxx
Alexey Kardashevskiy [Wed, 30 Nov 2016 06:52:00 +0000 (17:52 +1100)]
powerpc/iommu: Stop using @current in mm_iommu_xxx

This changes mm_iommu_xxx helpers to take mm_struct as a parameter
instead of getting it from @current which in some situations may
not have a valid reference to mm.

This changes helpers to receive @mm and moves all references to @current
to the caller, including checks for !current and !current->mm;
checks in mm_iommu_preregistered() are removed as there is no caller
yet.

This moves the mm_iommu_adjust_locked_vm() call to the caller as
it receives mm_iommu_table_group_mem_t but it needs mm.

This should cause no behavioral change.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc/iommu: Pass mm_struct to init/cleanup helpers
Alexey Kardashevskiy [Wed, 30 Nov 2016 06:51:59 +0000 (17:51 +1100)]
powerpc/iommu: Pass mm_struct to init/cleanup helpers

We are going to get rid of @current references in mmu_context_book3s64.c
and cache mm_struct in the VFIO container. Since mm_context_t does not
have reference counting, we will be using mm_struct which does have
the reference counter.

This changes mm_iommu_init/mm_iommu_cleanup to receive mm_struct rather
than mm_context_t (which is embedded into mm).

This should not cause any behavioral change.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc/64: Define ILLEGAL_POINTER_VALUE for 64-bit
Michael Ellerman [Tue, 15 Nov 2016 10:59:38 +0000 (21:59 +1100)]
powerpc/64: Define ILLEGAL_POINTER_VALUE for 64-bit

This is used in poison.h to offset poison values so that they don't
point directly into user space.

The value we choose sits roughly between user and kernel space, which
means on their own the poison values don't point anywhere useful. If an
attacker can cause an access at some offset from the poison value then
we may still be in trouble, but by putting the poison values between
user and kernel space we maximise the required size of that offset.
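
A small sketch of the offsetting, in the style of poison.h. Both
constants below are assumptions made for illustration; the value
chosen by the patch and the kernel base address are not stated in this
message.

```c
#include <stdint.h>

/* Assumed delta, for illustration only. */
#define ILLEGAL_POINTER_VALUE_MODEL	0x5deadbeef0000000ULL

/* Assumed start of the 64-bit powerpc kernel address range. */
#define KERNEL_BASE_MODEL		0xc000000000000000ULL

/* Small raw poison constants are shifted by the delta. */
uint64_t poison_value(uint64_t raw)
{
	return raw + ILLEGAL_POINTER_VALUE_MODEL;
}
```

With these assumed values a raw poison of 0x100 lands far above low
user addresses and still well below the assumed kernel base.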

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc Don't print misleading facility name in facility unavailable exception
Balbir Singh [Wed, 30 Nov 2016 06:45:09 +0000 (17:45 +1100)]
powerpc Don't print misleading facility name in facility unavailable exception

The current facility_strings[] are correct when the trap address is
0xf80 (hypervisor facility unavailable). When the trap address is
0xf60 (facility unavailable) IC (Interruption Cause) a.k.a status in the
code is undefined for values 0 and 1.

Add a check to prevent printing the (misleading) facility name for IC 0
and 1 when we came in via 0xf60. In all cases, print the actual IC
value, to avoid any confusion.

This hasn't been seen on real hardware, only on qemu, which was
misreporting an exception.
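
The added check can be modelled as below. The table contents and
names are illustrative only and do not reproduce the real
facility_strings[]; hv is true when the trap came in via 0xf80.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative table; not the kernel's facility_strings[]. */
static const char *const facility_names[] = {
	"FPU", "VMX/VSX", "DSCR", "PMU SPRs",
};

/*
 * Returns the facility name to print, or NULL when none should be
 * printed. For trap 0xf60 (hv == false) IC values 0 and 1 are
 * undefined, so no name is reported for them.
 */
const char *facility_name(bool hv, unsigned int status)
{
	size_t n = sizeof(facility_names) / sizeof(facility_names[0]);

	if (!hv && status < 2)
		return NULL;
	if (status < n)
		return facility_names[status];
	return NULL;
}
```

The caller would print the numeric IC value in every case, with or
without a name.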

Signed-off-by: Balbir Singh <bsingharora@gmail.com>
[mpe: Fix indentation, combine printks(), massage change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc: Make selects of IBM_EMAC_* depend on IBM_EMAC
Michael Ellerman [Thu, 1 Dec 2016 09:50:46 +0000 (20:50 +1100)]
powerpc: Make selects of IBM_EMAC_* depend on IBM_EMAC

We have a bunch of Kconfig symbols which select various IBM_EMAC_*
symbols. These all cause warnings when IBM_EMAC is not selected.

eg.

  warning: (PPC_CELL_NATIVE && BLUESTONE && CANYONLANDS && GLACIER &&
  EIGER && 440EPX && 440GRX && 440GX && 460SX && 405EX) selects
  IBM_EMAC_RGMII which has unmet direct dependencies (NETDEVICES &&
  ETHERNET && NET_VENDOR_IBM)

So make them all depend on IBM_EMAC being enabled first.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc/cell: Drop select of MEMORY_HOTPLUG
Michael Ellerman [Thu, 1 Dec 2016 09:50:45 +0000 (20:50 +1100)]
powerpc/cell: Drop select of MEMORY_HOTPLUG

SPU_FS selects MEMORY_HOTPLUG, which is problematic because
MEMORY_HOTPLUG is user selectable, meaning we can end up with a broken
.config where MEMORY_HOTPLUG is enabled but its dependencies are not,
leading to build breakages.

The select of MEMORY_HOTPLUG for SPU_FS was added back in 2006, in
commit 4da30d15b6d5 ("[POWERPC] spufs: fix memory hotplug dependency").

However we reworked the spufs code and removed the dependency on memory
hotplug in 2007 in commit 78bde53e351b ("[POWERPC] spufs: remove need
for struct page for SPEs").

So drop the select as it's no longer needed and causes problems.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc/pseries: Use lmb_is_removable() to check removability
Nathan Fontenot [Mon, 28 Nov 2016 16:50:45 +0000 (11:50 -0500)]
powerpc/pseries: Use lmb_is_removable() to check removability

We should be using lmb_is_removable() to validate that enough LMBs
are available to remove when doing a remove by count. This checks
that the LMB is owned by the system and is considered removable.
This patch also adds a pr_info() notification to report when the
requested LMB count could not be satisfied.

What we do now is just check that enough LMBs are owned by the
system. This can lead to situations where enough LMBs are owned by
the system but not enough of them are considered removable. This
results in having to bail out of the remove operation instead of just
failing the request that we should have known wouldn't succeed.
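
A userspace model of the up-front validation. The struct and function
names are hypothetical; the point is only that the count is taken over
LMBs that are both owned and removable.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for an LMB: owned by the system, and removable or not. */
struct lmb_model {
	bool assigned;
	bool removable;
};

/* Models the check: owned by the system and considered removable. */
bool lmb_is_removable_model(const struct lmb_model *lmb)
{
	return lmb->assigned && lmb->removable;
}

/* True when the remove-by-count request can be satisfied. */
bool can_remove_by_count(const struct lmb_model *lmbs, size_t nr_lmbs,
			 size_t lmbs_to_remove)
{
	size_t available = 0;
	size_t i;

	for (i = 0; i < nr_lmbs; i++)
		if (lmb_is_removable_model(&lmbs[i]))
			available++;

	return available >= lmbs_to_remove;
}
```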

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc/mm: Fix page table dump build on non-Book3S
Michael Ellerman [Wed, 30 Nov 2016 08:41:02 +0000 (19:41 +1100)]
powerpc/mm: Fix page table dump build on non-Book3S

In the recent commit 1515ab932156 ("powerpc/mm: Dump hash table") we
added code to dump the hash page table. Currently this can be selected
to build on any platform. However it breaks the build if we're building
for a non-Book3S platform, because none of the hash page table related
defines and so on exist. So restrict it to building only on Book3S.

Similarly in commit 8eb07b187000 ("powerpc/mm: Dump linux pagetables")
we added code to dump the Linux page tables, which uses some constants
which are only defined on Book3S - so guard those with an #ifdef.

Fixes: 1515ab932156 ("powerpc/mm: Dump hash table")
Fixes: 8eb07b187000 ("powerpc/mm: Dump linux pagetables")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc/ps3: Fix system hang with GCC 5 builds
Geoff Levand [Tue, 29 Nov 2016 18:47:32 +0000 (10:47 -0800)]
powerpc/ps3: Fix system hang with GCC 5 builds

GCC 5 generates different code for this bootwrapper null check that
causes the PS3 to hang very early in its bootup. This check is of
limited value, so just get rid of it.

Cc: stable@vger.kernel.org
Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc/prom: Switch to using structs for ibm_architecture_vec
Michael Ellerman [Fri, 18 Nov 2016 12:15:42 +0000 (23:15 +1100)]
powerpc/prom: Switch to using structs for ibm_architecture_vec

Now that we've defined structures to describe each of the client
architecture vectors, we can use those to construct the value we pass to
firmware.

This avoids the tricks we previously played with the W() macro, allows
us to properly endian annotate fields, and should help to avoid bugs
introduced by failing to have the correct number of zero pad bytes
between fields.

It also means we can avoid hard coding IBM_ARCH_VEC_NRCORES_OFFSET in
order to update the max_cpus value and instead just set it.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc/prom: Define structs for client architecture vectors
Michael Ellerman [Fri, 18 Nov 2016 12:15:41 +0000 (23:15 +1100)]
powerpc/prom: Define structs for client architecture vectors

The "client architecture vectors" are a series of structures we pass to
firmware to define various things, such as what processors we support
and many other options.

Each structure is entirely different so we have to define a different
struct for each one, but that's OK.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc/pseries: add definitions for new H_SIGNAL_SYS_RESET hcall
Nicholas Piggin [Tue, 8 Nov 2016 06:08:06 +0000 (17:08 +1100)]
powerpc/pseries: add definitions for new H_SIGNAL_SYS_RESET hcall

This has not made its way to a PAPR release yet, but we have an hcall
number assigned.

  H_SIGNAL_SYS_RESET = 0x380

  Syntax:
    hcall(uint64 H_SIGNAL_SYS_RESET, int64 target);

  Generate a system reset NMI on the threads indicated by target.

  Values for target:
    -1 = target all online threads including the caller
    -2 = target all online threads except for the caller
    All other negative values: reserved
    Positive values: The thread to be targeted, obtained from the value
    of the "ibm,ppc-interrupt-server#s" property of the CPU in the OF
    device tree.

  Semantics:
  - Invalid target: return H_Parameter.
  - Otherwise: Generate a system reset NMI on target thread(s),
    return H_Success.

This will be used by crash/debug code to get stuck CPUs into a known
state.
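
The target encoding above can be sketched as a classifier. The names
are hypothetical, and treating 0 as a thread number alongside the
positive values is an assumption.

```c
#include <stdint.h>

enum reset_target_kind {
	RESET_TARGET_INVALID,		/* hcall would return H_Parameter */
	RESET_TARGET_ALL,		/* -1: all online threads */
	RESET_TARGET_ALL_OTHERS,	/* -2: all except the caller */
	RESET_TARGET_THREAD,		/* a specific thread */
};

enum reset_target_kind classify_reset_target(int64_t target)
{
	if (target == -1)
		return RESET_TARGET_ALL;
	if (target == -2)
		return RESET_TARGET_ALL_OTHERS;
	if (target < 0)
		return RESET_TARGET_INVALID;	/* reserved values */
	return RESET_TARGET_THREAD;	/* assumption: 0 is a thread too */
}
```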

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc: Enable CONFIG_KEXEC_FILE in powerpc server defconfigs.
Thiago Jung Bauermann [Tue, 29 Nov 2016 12:45:54 +0000 (23:45 +1100)]
powerpc: Enable CONFIG_KEXEC_FILE in powerpc server defconfigs.

Enable CONFIG_KEXEC_FILE in powernv_defconfig, ppc64_defconfig and
pseries_defconfig.

It depends on CONFIG_CRYPTO_SHA256=y, so add that as well.

Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc/kexec: Enable kexec_file_load() syscall
Thiago Jung Bauermann [Tue, 29 Nov 2016 12:45:53 +0000 (23:45 +1100)]
powerpc/kexec: Enable kexec_file_load() syscall

Define the Kconfig symbol so that the kexec_file_load() code can be
built, and wire up the syscall so that it can be called.

Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc: Add purgatory for kexec_file_load() implementation.
Thiago Jung Bauermann [Tue, 29 Nov 2016 12:45:52 +0000 (23:45 +1100)]
powerpc: Add purgatory for kexec_file_load() implementation.

This purgatory implementation is based on the versions from kexec-tools
and kexec-lite, with additional changes.

Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc: Add support code for kexec_file_load()
Thiago Jung Bauermann [Tue, 29 Nov 2016 12:45:51 +0000 (23:45 +1100)]
powerpc: Add support code for kexec_file_load()

This patch adds the support code needed for implementing
kexec_file_load() on powerpc.

This consists of functions to load the ELF kernel, either big or little
endian, and set up the purgatory environment which switches from the
first kernel to the second kernel.

None of this code is built yet, as it depends on CONFIG_KEXEC_FILE which
we have not yet defined. Although we could define CONFIG_KEXEC_FILE in
this patch, we'd then have a window in history where the kconfig symbol
is present but the syscall is not, which would be awkward.

Signed-off-by: Josh Sklar <sklar@linux.vnet.ibm.com>
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc: Change places using CONFIG_KEXEC to use CONFIG_KEXEC_CORE instead.
Thiago Jung Bauermann [Tue, 29 Nov 2016 12:45:50 +0000 (23:45 +1100)]
powerpc: Change places using CONFIG_KEXEC to use CONFIG_KEXEC_CORE instead.

Commit 2965faa5e03d ("kexec: split kexec_load syscall from kexec core
code") introduced CONFIG_KEXEC_CORE so that CONFIG_KEXEC means whether
the kexec_load system call should be compiled-in and CONFIG_KEXEC_FILE
means whether the kexec_file_load system call should be compiled-in.
These options can be set independently from each other.

Since until now powerpc only supported kexec_load, CONFIG_KEXEC and
CONFIG_KEXEC_CORE were synonyms. That is not the case anymore, so we
need to make a distinction. Almost all places where CONFIG_KEXEC was
being used should be using CONFIG_KEXEC_CORE instead, since
kexec_file_load also needs that code compiled in.

Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agokexec_file: Factor out kexec_locate_mem_hole from kexec_add_buffer.
Thiago Jung Bauermann [Tue, 29 Nov 2016 12:45:49 +0000 (23:45 +1100)]
kexec_file: Factor out kexec_locate_mem_hole from kexec_add_buffer.

kexec_locate_mem_hole will be used by the PowerPC kexec_file_load
implementation to find free memory for the purgatory stack.

Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Acked-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agokexec_file: Change kexec_add_buffer to take kexec_buf as argument.
Thiago Jung Bauermann [Tue, 29 Nov 2016 12:45:48 +0000 (23:45 +1100)]
kexec_file: Change kexec_add_buffer to take kexec_buf as argument.

This is done to simplify the kexec_add_buffer argument list.
Adapt all callers to set up a kexec_buf to pass to kexec_add_buffer.

In addition, change the type of kexec_buf.buffer from char * to void *.
There is no particular reason for it to be a char *, and the change
allows us to get rid of 3 existing casts to char * in the code.

Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Acked-by: Dave Young <dyoung@redhat.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agokexec_file: Allow arch-specific memory walking for kexec_add_buffer
Thiago Jung Bauermann [Tue, 29 Nov 2016 12:45:47 +0000 (23:45 +1100)]
kexec_file: Allow arch-specific memory walking for kexec_add_buffer

Allow architectures to specify a different memory walking function for
kexec_add_buffer. x86 uses iomem to track reserved memory ranges, but
PowerPC uses the memblock subsystem.

Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Acked-by: Dave Young <dyoung@redhat.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc/mm: Fix no execute fault handling on pre-POWER5
Balbir Singh [Wed, 30 Nov 2016 00:35:36 +0000 (11:35 +1100)]
powerpc/mm: Fix no execute fault handling on pre-POWER5

Aneesh/Ben reported that the change to do_page_fault() we made in commit
1d18ad026844 ("powerpc/mm: Detect instruction fetch denied and report")
needs to handle the case where CPU_FTR_COHERENT_ICACHE is missing but we
have CPU_FTR_NOEXECUTE. In those cases the check added for
SRR1_ISI_N_OR_G might trigger a false positive.

This patch adds a check for CPU_FTR_COHERENT_ICACHE in addition to the
MSR value.

Fixes: 1d18ad026844 ("powerpc/mm: Detect instruction fetch denied and report")
Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc/boot: Fix rebuild when changing kernel endian
Michael Ellerman [Mon, 21 Nov 2016 10:14:35 +0000 (21:14 +1100)]
powerpc/boot: Fix rebuild when changing kernel endian

Now that we don't set ARCH incorrectly when calling the boot Makefile,
we can use the generic cpp_lds_S rule for converting our zImage.lds.S
into zImage.lds.

The main advantage of using the generic rule is that it correctly uses
if_changed, which means we correctly regenerate the linker script when
switching endian. Fixing that means we are finally able to build one
endian and then rebuild the other endian without needing to clean
between builds.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc/boot: All uses of if_changed should depend on FORCE
Michael Ellerman [Mon, 21 Nov 2016 10:14:34 +0000 (21:14 +1100)]
powerpc/boot: All uses of if_changed should depend on FORCE

If we're using if_changed then we must depend on FORCE, so that
if_changed gets a chance to check if something changed.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc: Stop passing ARCH=ppc64 to boot Makefile
Michael Ellerman [Mon, 21 Nov 2016 10:14:33 +0000 (21:14 +1100)]
powerpc: Stop passing ARCH=ppc64 to boot Makefile

Back in 2005 when the ppc/ppc64 merge started, we used to build the
kernel code in arch/powerpc but use the boot code from arch/ppc or
arch/ppc64 depending on whether we were building for 32 or 64-bit.

Originally we called the boot Makefile passing ARCH=$(OLDARCH), where
OLDARCH was ppc or ppc64.

In commit 20f629549b30 ("powerpc: Make building the boot image work for
both 32-bit and 64-bit") (2005-10-11) we split the call for 32/64-bit
using an ifeq check, because the two Makefiles took different targets,
and explicitly passed ARCH=ppc64 for the 64-bit case and ARCH=ppc for
the 32-bit case.

Then in commit 94b212c29f68 ("powerpc: Move ppc64 boot wrapper code over
to arch/powerpc") (2005-11-16) we moved the boot code into arch/powerpc
and dropped the ppc case, but kept passing ARCH=ppc64 to
arch/powerpc/boot/Makefile.

Since then there have been several more boot targets added, all of which
have copied the ARCH=ppc64 setting, such that now we have four targets
using it.

Currently it seems that nothing actually uses the ARCH value, but that's
basically just luck, and in particular it prevents us from using the
generic cpp_lds_S rule. It's also clearly wrong, ARCH=ppc64 is dead,
buried and cremated.

Fix it by dropping the setting of ARCH completely, the correct value is
exported by the top level Makefile.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc/mm: Batch tlb flush when invalidating pte entries
Aneesh Kumar K.V [Mon, 28 Nov 2016 06:17:04 +0000 (11:47 +0530)]
powerpc/mm: Batch tlb flush when invalidating pte entries

This will improve the task exit case, by batching tlb invalidates.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc/mm: update radix__pte_update to not do full mm tlb flush
Aneesh Kumar K.V [Mon, 28 Nov 2016 06:17:03 +0000 (11:47 +0530)]
powerpc/mm: update radix__pte_update to not do full mm tlb flush

When we are updating a pte, we just need to flush the tlb entry mapping
that pte. Right now we do a full mm flush because we don't track page
size. Now that we have page size details in the pte, use that to do the
optimized flush.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc/mm: update radix__ptep_set_access_flag to not do full mm tlb flush
Aneesh Kumar K.V [Mon, 28 Nov 2016 06:17:02 +0000 (11:47 +0530)]
powerpc/mm: update radix__ptep_set_access_flag to not do full mm tlb flush

When we are updating a pte, we just need to flush the tlb entry mapping
that pte. Right now we do a full mm flush because we don't track the page
size. Now that we have page size details in the pte, use that to do the
optimized flush.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc/mm: Add radix__tlb_flush_pte_p9_dd1()
Aneesh Kumar K.V [Mon, 28 Nov 2016 06:17:01 +0000 (11:47 +0530)]
powerpc/mm: Add radix__tlb_flush_pte_p9_dd1()

Now that we have page size details encoded in pte using software pte
bits, use that to find the page size needed for tlb flush.

This function should only be used on P9 DD1, so give it a horrible name
to make that clear.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc/mm: Introduce _PAGE_LARGE software pte bits
Aneesh Kumar K.V [Mon, 28 Nov 2016 06:17:00 +0000 (11:47 +0530)]
powerpc/mm: Introduce _PAGE_LARGE software pte bits

This patch adds a new software defined pte bit. We use the reserved
fields of ISA 3.0 pte definition since we will only be using this on DD1
code paths. We can possibly look at removing this code later.

The software bit will be used to differentiate between 64K/4K and 2M
ptes. This helps in finding the page size mapping by a pte so that we
can do efficient tlb flush.
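
A minimal sketch of that lookup, assuming a made-up bit position: the real
_PAGE_LARGE value is defined in the powerpc pgtable headers, and
DEMO_PAGE_LARGE, the DEMO_SHIFT_* constants and demo_pte_page_shift() are
hypothetical names used only here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit position chosen for this sketch only. */
#define DEMO_PAGE_LARGE	(UINT64_C(1) << 61)

#define DEMO_SHIFT_4K	12
#define DEMO_SHIFT_64K	16
#define DEMO_SHIFT_2M	21

/* Return the page-size shift to use when flushing the translation for pte.
 * base_shift is the configured base page size (4K or 64K). */
static unsigned int demo_pte_page_shift(uint64_t pte, unsigned int base_shift)
{
	assert(base_shift == DEMO_SHIFT_4K || base_shift == DEMO_SHIFT_64K);
	return (pte & DEMO_PAGE_LARGE) ? DEMO_SHIFT_2M : base_shift;
}
```

With the size recoverable from the pte itself, a flush can target one
translation instead of the whole mm.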

We don't support 1G hugetlb pages yet. So we add a DEBUG WARN_ON to
catch wrong usage.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc/mm/hugetlb: Handle hugepage size supported by hash config
Aneesh Kumar K.V [Mon, 28 Nov 2016 06:16:59 +0000 (11:46 +0530)]
powerpc/mm/hugetlb: Handle hugepage size supported by hash config

W.r.t hash page table config, we support 16MB and 16GB as the hugepage
size. Update the hstate_get_psize to handle 16M and 16G.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc/mm: Rename hugetlb-radix.h to hugetlb.h
Aneesh Kumar K.V [Mon, 28 Nov 2016 06:16:58 +0000 (11:46 +0530)]
powerpc/mm: Rename hugetlb-radix.h to hugetlb.h

We will start moving some book3s specific hugetlb functions there.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc/64e: Don't branch to dot symbols
Nicholas Piggin [Wed, 23 Nov 2016 13:02:09 +0000 (00:02 +1100)]
powerpc/64e: Don't branch to dot symbols

This converts one that was missed by b1576fec7f4d ("powerpc: No need
to use dot symbols when branching to a function").

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc/64e: Convert cmpi to cmpwi in head_64.S
Nicholas Piggin [Wed, 23 Nov 2016 13:02:07 +0000 (00:02 +1100)]
powerpc/64e: Convert cmpi to cmpwi in head_64.S

From 80f23935cadb ("powerpc: Convert cmp to cmpd in idle enter sequence"):

  PowerPC's "cmp" instruction has four operands. Normally people write
  "cmpw" or "cmpd" for the second cmp operand 0 or 1. But, frequently
  people forget, and write "cmp" with just three operands.

  With older binutils this is silently accepted as if this was "cmpw",
  while often "cmpd" is wanted. With newer binutils GAS will complain
  about this for 64-bit code. For 32-bit code it still silently assumes
  "cmpw" is what is meant.

In this case, cmpwi is called for, so this is just a build fix for
new toolchains.
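
To illustrate why the word/doubleword choice matters, here is a small C
model of the two compare widths. It is only an analogy written for this
note: compare_word() and compare_doubleword() are hypothetical helpers, not
kernel functions, and the narrowing casts rely on the usual two's-complement
behaviour of common compilers.

```c
#include <assert.h>
#include <stdint.h>

/* Three-way compare of the low 32 bits only, treated as signed words.
 * Models what a word compare (cmpw/cmpwi) looks at. */
static int compare_word(uint64_t a, uint64_t b)
{
	int32_t wa = (int32_t)(uint32_t)a;
	int32_t wb = (int32_t)(uint32_t)b;

	return (wa > wb) - (wa < wb);
}

/* Three-way signed compare of all 64 bits, as a doubleword compare
 * (cmpd/cmpdi) would do. */
static int compare_doubleword(uint64_t a, uint64_t b)
{
	int64_t da = (int64_t)a;
	int64_t db = (int64_t)b;

	return (da > db) - (da < db);
}
```

For a register holding 0x100000000 compared against 0, the word model
reports "equal" while the doubleword model reports "greater", which is the
kind of silent difference the quoted text warns about.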

Cc: stable@vger.kernel.org # v3.0+
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc/mm/radix: Prevent kernel execution of user space
Balbir Singh [Tue, 15 Nov 2016 06:56:16 +0000 (17:56 +1100)]
powerpc/mm/radix: Prevent kernel execution of user space

ISA 3 defines new encoded access authority that allows instruction
access prevention in privileged mode and allows normal access
to problem state. This patch just enables IAMR (Instruction Authority
Mask Register), enabling AMR would require more work.

I've tested this with a buggy driver and a simple payload. The payload
is specific to the build I've tested.

mpe: Also tested with LKDTM:

  # echo EXEC_USERSPACE > /sys/kernel/debug/provoke-crash/DIRECT
  lkdtm: Performing direct entry EXEC_USERSPACE
  lkdtm: attempting ok execution at c0000000005bf560
  lkdtm: attempting bad execution at 00003fff8d940000
  Unable to handle kernel paging request for instruction fetch
  Faulting instruction address: 0x3fff8d940000
  Oops: Kernel access of bad area, sig: 11 [#1]
  NIP: 00003fff8d940000 LR: c0000000005bfa58 CTR: 00003fff8d940000
  REGS: c0000000f1fcf900 TRAP: 0400   Not tainted  (4.9.0-rc5-compiler_gcc-6.2.0-00109-g956dbc06232a)
  MSR: 9000000010009033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 48002222  XER: 00000000
  ...
  Call Trace:
    lkdtm_EXEC_USERSPACE+0x104/0x120 (unreliable)
    lkdtm_do_action+0x3c/0x80
    direct_entry+0x100/0x1b0
    full_proxy_write+0x94/0x100
    __vfs_write+0x3c/0x1b0
    vfs_write+0xcc/0x230
    SyS_write+0x60/0x110
    system_call+0x38/0xfc

Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc/mm: Detect instruction fetch denied and report
Balbir Singh [Tue, 15 Nov 2016 06:56:15 +0000 (17:56 +1100)]
powerpc/mm: Detect instruction fetch denied and report

ISA 3 allows for prevention of instruction fetch and execution
of user mode pages. If such an error occurs, SRR1 bit 35 reports the
error. We catch and report the error in do_page_fault().

Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc/mm/radix: Setup AMOR in HV mode to allow key 0
Balbir Singh [Tue, 15 Nov 2016 06:56:14 +0000 (17:56 +1100)]
powerpc/mm/radix: Setup AMOR in HV mode to allow key 0

Setup AMOR (Authority Mask Override Register) in HV mode so that the
host and guest kernel can in turn setup IAMR.

This allows us to enable key 0 in a following patch.

Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowernv: Clear SPRN_PSSCR when a POWER9 CPU comes online
Gautham R. Shenoy [Tue, 22 Nov 2016 18:06:40 +0000 (23:36 +0530)]
powernv: Clear SPRN_PSSCR when a POWER9 CPU comes online

Ensure that PSSCR is set to a safe value corresponding to no
state-loss each time a POWER9 CPU comes online.

Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Acked-By: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc/xmon: Add 'dt' command to dump trace buffers
Michael Ellerman [Fri, 6 Nov 2015 02:21:17 +0000 (13:21 +1100)]
powerpc/xmon: Add 'dt' command to dump trace buffers

There is a nice interface for asking ftrace to dump all its tracing
buffers. The only down side for use in xmon is that it uses printk.
Depending on circumstances printk may not work when in xmon, but it also
may, so add a 'dt' command which dumps the ftrace buffers, and add a
note to the help to mention that it uses printk.

Calling this routine also disables tracing, which is problematic if you
return from xmon and expect the system to keep operating normally. So
after we do the dump turn tracing back on.

Both functions already have nop versions defined for when ftrace is not
enabled, so we don't need any extra #ifdefs.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc/of_platform: Use builtin_platform_driver
Geliang Tang [Wed, 23 Nov 2016 14:58:56 +0000 (22:58 +0800)]
powerpc/of_platform: Use builtin_platform_driver

Use builtin_platform_driver() helper to simplify the code.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Acked-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agocxl: drop duplicate header sched.h
Geliang Tang [Wed, 23 Nov 2016 15:27:38 +0000 (23:27 +0800)]
cxl: drop duplicate header sched.h

Drop duplicate header sched.h from native.c.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc: Fix __cmpxchg() to take a volatile ptr again
Michael Ellerman [Thu, 24 Nov 2016 06:08:11 +0000 (17:08 +1100)]
powerpc: Fix __cmpxchg() to take a volatile ptr again

In commit d0563a1297e2 ("powerpc: Implement {cmp}xchg for u8 and u16")
we removed the volatile from __cmpxchg().

This is leading to warnings such as:

  drivers/gpu/drm/drm_lock.c: In function ‘drm_lock_take’:
  arch/powerpc/include/asm/cmpxchg.h:484:37: warning: passing argument 1
  of ‘__cmpxchg’ discards ‘volatile’ qualifier from pointer target
     (__typeof__(*(ptr))) __cmpxchg((ptr), (unsigned long)_o_,   \

There doesn't seem to be consensus across architectures whether the
argument is volatile or not, so at least for now put the volatile back.
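
The effect of the qualifier can be seen in a small userspace sketch.
demo_cmpxchg() is a hypothetical helper built on the GCC/Clang
__atomic_compare_exchange_n builtin; it is not the powerpc implementation.
Because its first parameter is volatile-qualified, it accepts both volatile
and plain objects without the warning quoted above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: returns the value found at *ptr. The store of
 * new_val happens only if that value equalled old_val. */
static unsigned long demo_cmpxchg(volatile unsigned long *ptr,
				  unsigned long old_val, unsigned long new_val)
{
	unsigned long expected = old_val;

	__atomic_compare_exchange_n(ptr, &expected, new_val, false,
				    __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
	/* On failure the builtin writes the observed value into expected. */
	return expected;
}
```

Dropping volatile from the parameter would make a call with a volatile
object discard the qualifier, which is what the compiler complains about.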

Fixes: d0563a1297e2 ("powerpc: Implement {cmp}xchg for u8 and u16")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agoMerge branch 'topic/ppc-kvm' into next
Michael Ellerman [Thu, 24 Nov 2016 11:14:52 +0000 (22:14 +1100)]
Merge branch 'topic/ppc-kvm' into next

Merge the topic branch we're sharing with the kvm-ppc tree.

8 years agosoc/qman: Handle endianness of h/w descriptors
Claudiu Manoil [Wed, 16 Nov 2016 14:40:30 +0000 (16:40 +0200)]
soc/qman: Handle endianness of h/w descriptors

The hardware descriptors have big endian (BE) format.
Provide proper endianness handling for the remaining
descriptor fields, to ensure they are correctly
accessed by non-BE CPUs too.

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
8 years agocxl: Fix coccinelle warnings
Andrew Donnellan [Tue, 22 Nov 2016 10:13:27 +0000 (21:13 +1100)]
cxl: Fix coccinelle warnings

Fix the following coccinelle warnings:

  drivers/misc/cxl/debugfs.c:46:0-23: WARNING: fops_io_x64 should be
      defined with DEFINE_DEBUGFS_ATTRIBUTE
  drivers/misc/cxl/guest.c:890:5-26: WARNING: Comparison to bool
  drivers/misc/cxl/irq.c:107:3-23: WARNING: Assignment of bool to 0/1
  drivers/misc/cxl/native.c:57:2-3: Unneeded semicolon
  drivers/misc/cxl/native.c:170:2-3: Unneeded semicolon

Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc/32: Change the stack protector canary value per task
Christophe Leroy [Tue, 22 Nov 2016 10:49:32 +0000 (11:49 +0100)]
powerpc/32: Change the stack protector canary value per task

Partially copied from commit df0698be14c66 ("ARM: stack protector:
change the canary value per task")

A new random value for the canary is stored in the task struct whenever
a new task is forked.  This is meant to allow for different canary values
per task.  On powerpc, GCC expects the canary value to be found in a global
variable called __stack_chk_guard.  So this variable has to be updated
with the value stored in the task struct whenever a task switch occurs.

Because the variable GCC expects is global, this cannot work on SMP
unfortunately.  So, on SMP, the same initial canary value is kept
throughout, making this feature a bit less effective although it is still
useful.
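
A simplified model of the task-switch update, for illustration only:
struct demo_task, demo_stack_chk_guard and demo_switch_to() are hypothetical
names, and the global is deliberately not called __stack_chk_guard so the
sketch cannot clash with the real symbol.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical task descriptor for this sketch. */
struct demo_task {
	unsigned long stack_canary;	/* chosen when the task is created */
};

/* Stand-in for the single global that instrumented code reads. */
static unsigned long demo_stack_chk_guard;

/* On a task switch, publish the incoming task's canary in the global. */
static void demo_switch_to(const struct demo_task *next)
{
	assert(next != NULL);
	demo_stack_chk_guard = next->stack_canary;
}
```

Since there is only one such global, two CPUs running different tasks at
the same time could not each see their own value, which is why the SMP case
keeps a single canary.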

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc: Initial stack protector (-fstack-protector) support
Christophe Leroy [Tue, 22 Nov 2016 10:49:30 +0000 (11:49 +0100)]
powerpc: Initial stack protector (-fstack-protector) support

Partially copied from commit c743f38013aef ("ARM: initial stack protector
(-fstack-protector) support")

This is the very basic stuff without the changing canary upon
task switch yet.  Just the Kconfig option and a constant canary
value initialized at boot time.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc: Implement {cmp}xchg for u8 and u16
Pan Xinhui [Wed, 27 Apr 2016 09:16:45 +0000 (17:16 +0800)]
powerpc: Implement {cmp}xchg for u8 and u16

Implement xchg{u8,u16}{local,relaxed}, and
cmpxchg{u8,u16}{,local,acquire,relaxed}.

It works on all ppc.

Remove the volatile qualifier from the first parameter of __cmpxchg_local
and __cmpxchg.

Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
Acked-by: Boqun Feng <boqun.feng@gmail.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc/pseries/ibmebus: Remove legacy suspend/resume support
Lars-Peter Clausen [Sat, 19 Nov 2016 13:42:14 +0000 (14:42 +0100)]
powerpc/pseries/ibmebus: Remove legacy suspend/resume support

There are no ibmebus drivers that make use of legacy suspend/resume. This
patch removes the support for it from the ibmebus framework. New ibmebus
drivers (as unlikely as they are) wanting to use suspend/resume should
use dev_pm_ops.

Since there aren't any special bus specific things to do during
suspend/resume and since the PM core will automatically fallback
directly to using the device's PM ops if no bus PM ops are specified
there is no need to have any special ibmebus PM ops at all.

Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc/kprobes: Invoke handlers directly
Naveen N. Rao [Mon, 21 Nov 2016 17:06:41 +0000 (22:36 +0530)]
powerpc/kprobes: Invoke handlers directly

Invoke the kprobe handlers directly rather than through notify_die(), to
reduce the path taken for handling kprobes. Similar to commit 6f6343f53d13
("kprobes/x86: Call exception handlers directly from do_int3/do_debug").

While at it, rename post_kprobe_handler() to kprobe_post_handler() for
more uniform naming.

Reported-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc: Remove extraneous header from asm-prototypes.h
Naveen N. Rao [Mon, 21 Nov 2016 17:06:40 +0000 (22:36 +0530)]
powerpc: Remove extraneous header from asm-prototypes.h

Commit 03465f899bda ("powerpc: Use kprobe blacklist for exception
handlers") removed __kprobes annotation from some of the prototypes,
but left the kprobes header include directive unchanged. Remove it as it
is no longer needed.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agosoc/qman: Clean up CGR CSCN target update operations
Claudiu Manoil [Wed, 16 Nov 2016 14:40:29 +0000 (16:40 +0200)]
soc/qman: Clean up CGR CSCN target update operations

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
8 years agosoc/qman: Change remaining contextB into context_b
Claudiu Manoil [Wed, 16 Nov 2016 14:40:28 +0000 (16:40 +0200)]
soc/qman: Change remaining contextB into context_b

There are multiple occurrences of both contextB and context_b
in different h/w descriptors, referring to the same descriptor
field known as "Context B". Stick with the "context_b" naming,
for obvious reasons including consistency (see also context_a).

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
8 years agosoc/qbman: Handle endianness of qm/bm_in/out()
Claudiu Manoil [Wed, 16 Nov 2016 14:40:27 +0000 (16:40 +0200)]
soc/qbman: Handle endianness of qm/bm_in/out()

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
8 years agosoc/qman: Drop unused field from eqcr/dqrr descriptors
Claudiu Manoil [Wed, 16 Nov 2016 14:40:26 +0000 (16:40 +0200)]
soc/qman: Drop unused field from eqcr/dqrr descriptors

ORP ("Order Restoration Point") mechanism not supported.

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
8 years agosoc/qman: Fix accesses to fqid, cleanup
Claudiu Manoil [Wed, 16 Nov 2016 14:40:25 +0000 (16:40 +0200)]
soc/qman: Fix accesses to fqid, cleanup

Preventively mask every access to the 'fqid' h/w field,
since it is defined as a 24-bit field, for every h/w
descriptor.  Add generic accessors for this field to
ensure correct access.
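
A sketch of what masked accessors for a 24-bit field can look like. The
descriptor layout, the names and the choice to preserve the upper 8 bits on
a set are all assumptions of this example; the driver's own accessors may
differ, and endianness conversion is left out here for brevity.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_FQID_MASK	0x00ffffffu	/* fqid occupies the low 24 bits */

/* Hypothetical descriptor: the 32-bit word holding fqid also carries
 * eight unrelated upper bits. Kept in CPU byte order for simplicity. */
struct demo_desc {
	uint32_t fqid_word;
};

static uint32_t demo_get_fqid(const struct demo_desc *d)
{
	return d->fqid_word & DEMO_FQID_MASK;
}

static void demo_set_fqid(struct demo_desc *d, uint32_t fqid)
{
	/* Preserve the upper 8 bits, replace only the 24-bit field. */
	d->fqid_word = (d->fqid_word & ~DEMO_FQID_MASK) |
		       (fqid & DEMO_FQID_MASK);
}
```

Routing every access through such helpers means an out-of-range value can
never spill into the neighbouring bits.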

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
8 years agosoc/qman: Remove unused struct qm_mcc* layouts
Claudiu Manoil [Wed, 16 Nov 2016 14:40:24 +0000 (16:40 +0200)]
soc/qman: Remove unused struct qm_mcc* layouts

1. qm_mcc_querywq layout not used for now, so drop it;
2. queryfq, queryfq_np and alterfq are used only for accesses to
   the 'fqid' field, so replace these with a generic 'fq' layout.
   As a consequence, 'querycgr' turns into 'cgr' following the
   same reasoning above and for consistent naming.

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
8 years agosoc/qman: Remove redundant checks from qman_create_cgr()
Claudiu Manoil [Wed, 16 Nov 2016 14:40:23 +0000 (16:40 +0200)]
soc/qman: Remove redundant checks from qman_create_cgr()

opts is checked redundantly.
Move local_opts declaration inside its usage scope.

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
8 years agosoc/qman: test: Don't use dummy platform device for dma mapping
Claudiu Manoil [Wed, 16 Nov 2016 14:40:22 +0000 (16:40 +0200)]
soc/qman: test: Don't use dummy platform device for dma mapping

Replace dummy platform device hack with a reference to a portal's
platform device, in order to dma map the test frame for this
small unit test.  The 2 qman symbols need to be exported because
this self test is a kernel module.

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
8 years agosoc/qman: Don't add a new platform device for dma mapping
Claudiu Manoil [Wed, 16 Nov 2016 14:40:21 +0000 (16:40 +0200)]
soc/qman: Don't add a new platform device for dma mapping

The qman portals are platform devices themselves, so they should
handle dma mappings.  Creating a dummy platform device in order to
support dma mapping operations is not justified (and not portable).
Instead, do the mapping against the first portal that has been
initialised.

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
8 years agosoc/qman: test: Fix implementation of fd_cmp()
Claudiu Manoil [Wed, 16 Nov 2016 14:40:20 +0000 (16:40 +0200)]
soc/qman: test: Fix implementation of fd_cmp()

This function must only return the truth value of whether
two frame descriptors are different or not.
It does NOT have to compute some obscure difference between
fd fields and return it as an int, making sparse complain
about type conversions in the process.
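
For illustration, a comparison helper in that spirit could look like the
sketch below. struct demo_fd is a simplified, hypothetical descriptor and
not the qman layout; demo_fd_neq() is likewise an invented name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified frame descriptor. */
struct demo_fd {
	uint64_t addr;
	uint32_t length;
	uint32_t status;
};

/* Return true if the two descriptors differ in any field, false otherwise. */
static bool demo_fd_neq(const struct demo_fd *a, const struct demo_fd *b)
{
	return a->addr != b->addr ||
	       a->length != b->length ||
	       a->status != b->status;
}
```

Returning bool avoids subtracting fields of different widths and squeezing
the result into an int.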

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
8 years agosoc/qman: Fix direct access to fd's addr_lo, use proper accesor
Claudiu Manoil [Wed, 16 Nov 2016 14:40:19 +0000 (16:40 +0200)]
soc/qman: Fix direct access to fd's addr_lo, use proper accesor

Use the proper accessor to get the FD address.
Accessing the internal field "addr_lo" directly is not portable
and is error prone.

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
8 years agosoc/qman: Fix struct qm_fqd set accessor for context_a
Claudiu Manoil [Wed, 16 Nov 2016 14:40:18 +0000 (16:40 +0200)]
soc/qman: Fix struct qm_fqd set accessor for context_a

context_a.hi is 32bit

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
8 years agosoc/qbman: Fix resource leak on portal probing error path
Claudiu Manoil [Wed, 16 Nov 2016 14:40:17 +0000 (16:40 +0200)]
soc/qbman: Fix resource leak on portal probing error path

If init_pcfg() returns an error, the CI region must be unmapped
too.

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
8 years agosoc/qman: Fix h/w resource cleanup error path handling
Claudiu Manoil [Wed, 16 Nov 2016 14:40:16 +0000 (16:40 +0200)]
soc/qman: Fix h/w resource cleanup error path handling

qman_query_fq*() may return other error codes apart from
-ERANGE, in which cases the error handling done by the
resource cleanup callers would be wrong.  The patch
fixes the handling of those cases, and cleans up related
code inside the resource cleanup & release handlers (i.e.
replace hardcoded fqid value with corresponding define).

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
8 years agosoc/qman: Replace of_get_property() with portable equivalent
Madalin Bucur [Wed, 16 Nov 2016 14:40:15 +0000 (16:40 +0200)]
soc/qman: Replace of_get_property() with portable equivalent

Use arch portable of_property_read_u32() instead, which takes
care of endianness conversions.
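
Device tree property cells are stored big-endian, so a raw pointer to the
property data cannot simply be dereferenced on a little-endian CPU. The
userspace sketch below shows the kind of decoding a typed read helper has to
do. demo_read_u32() is a hypothetical function written for this note and
its error codes are an assumption; it is not the kernel's
of_property_read_u32().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one big-endian 32-bit cell from raw property bytes, independent
 * of host endianness. Returns 0 on success or -EINVAL on bad input. */
static int demo_read_u32(const unsigned char *prop, size_t len, uint32_t *out)
{
	assert(out != NULL);
	if (prop == NULL || len < sizeof(uint32_t))
		return -EINVAL;
	*out = ((uint32_t)prop[0] << 24) | ((uint32_t)prop[1] << 16) |
	       ((uint32_t)prop[2] << 8) | (uint32_t)prop[3];
	return 0;
}
```

Because the value is assembled byte by byte, the same code gives the same
answer on big-endian and little-endian hosts, and the length check replaces
an unchecked pointer dereference.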

Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
8 years agosoc/qman: Check ioremap return value
Madalin Bucur [Wed, 16 Nov 2016 14:40:14 +0000 (16:40 +0200)]
soc/qman: Check ioremap return value

Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Scott Wood <oss@buserror.net>
8 years agopowerpc/85xx: Enable gpio power/reset driver
Andy Fleming [Sun, 23 Oct 2016 23:48:38 +0000 (18:48 -0500)]
powerpc/85xx: Enable gpio power/reset driver

These config changes build:
drivers/power/reset/gpio-poweroff.c
drivers/power/reset/gpio-restart.c

Signed-off-by: Andy Fleming <afleming@gmail.com>
Signed-off-by: Scott Wood <oss@buserror.net>
8 years agopowerpc/fsl_soc: improve and simplify get_baudrate
Heiner Kallweit [Sat, 29 Oct 2016 14:29:04 +0000 (16:29 +0200)]
powerpc/fsl_soc: improve and simplify get_baudrate

Use of_property_read_u32 instead of the generic of_get_property to
simplify the code. In addition move the declaration of fs_baudrate
into get_baudrate because it's private to this function.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Scott Wood <oss@buserror.net>
8 years agopowerpc/fsl_soc: improve and simplify get_brgfreq
Heiner Kallweit [Sat, 29 Oct 2016 14:28:25 +0000 (16:28 +0200)]
powerpc/fsl_soc: improve and simplify get_brgfreq

Use of_property_read_u32 instead of the generic of_get_property to
simplify the code. In addition move the declaration of brgfreq
into get_brgfreq because it's private to this function.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
[scottwood: minor whitespace fixes]
Signed-off-by: Scott Wood <oss@buserror.net>
8 years agopowerpc/fsl_soc: improve and simplify fsl_get_sys_freq
Heiner Kallweit [Sat, 29 Oct 2016 14:27:14 +0000 (16:27 +0200)]
powerpc/fsl_soc: improve and simplify fsl_get_sys_freq

Use of_property_read_u32 instead of the generic of_get_property to
simplify the code. In addition move the declaration of sysfreq
into fsl_get_sys_freq because it's private to this function.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Scott Wood <oss@buserror.net>
8 years agopowerpc/85xx/qemu: Enable CONFIG_E500 and CONFIG_PPC_E500MC
David Engraf [Tue, 15 Nov 2016 08:24:27 +0000 (09:24 +0100)]
powerpc/85xx/qemu: Enable CONFIG_E500 and CONFIG_PPC_E500MC

The QEMU e500 board needs to enable CONFIG_E500 to correctly boot. QEMU
for ppc64 uses e5500/e6500 emulation, thus CONFIG_PPC_E500MC is required
as well.

Signed-off-by: David Engraf <david.engraf@sysgo.com>
Signed-off-by: Scott Wood <oss@buserror.net>
8 years agopowerpc/powernv: Define and set POWER9 HFSCR doorbell bit
Michael Neuling [Tue, 22 Nov 2016 23:44:09 +0000 (10:44 +1100)]
powerpc/powernv: Define and set POWER9 HFSCR doorbell bit

Define and set the POWER9 HFSCR doorbell bit so that guests can use
msgsndp.

ISA 3.0 calls this MSGP, so name it accordingly in the code.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc/reg: Add definition for LPCR_PECE_HVEE
Michael Ellerman [Tue, 22 Nov 2016 03:50:39 +0000 (14:50 +1100)]
powerpc/reg: Add definition for LPCR_PECE_HVEE

ISA 3.0 defines a new PECE (Power-saving mode Exit Cause Enable) field
in the LPCR (Logical Partitioning Control Register), called
LPCR_PECE_HVEE (Hypervisor Virtualization Exit Enable).

KVM code will need to know about this bit, so add a definition for it.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc/64: Define new ISA v3.00 logical PVR value and PCR register value
Suraj Jitindar Singh [Mon, 21 Nov 2016 05:02:35 +0000 (16:02 +1100)]
powerpc/64: Define new ISA v3.00 logical PVR value and PCR register value

ISA 3.00 adds the logical PVR value 0x0f000005, so add a definition for
this.

Define PCR_ARCH_207 to reflect ISA 2.07 compatibility mode in the processor
compatibility register (PCR).

[paulus@ozlabs.org - moved dummy PCR_ARCH_300 value into next patch]

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
8 years agopowerpc/powernv: Define real-mode versions of OPAL XICS accessors
Paul Mackerras [Mon, 21 Nov 2016 05:01:36 +0000 (16:01 +1100)]
powerpc/powernv: Define real-mode versions of OPAL XICS accessors

This defines real-mode versions of opal_int_get_xirr(), opal_int_eoi()
and opal_int_set_mfrr(), for use by KVM real-mode code.

It also exports opal_int_set_mfrr() so that the modular part of KVM
can use it to send IPIs.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>