platform/kernel/linux-starfive.git
6 years agopowerpc/mm: Fixup tlbie vs store ordering issue on POWER9
Aneesh Kumar K.V [Fri, 23 Mar 2018 04:56:27 +0000 (10:26 +0530)]
powerpc/mm: Fixup tlbie vs store ordering issue on POWER9

On POWER9, under some circumstances, a broadcast TLB invalidation
might complete before all previous stores have drained, potentially
allowing stale stores from becoming visible after the invalidation.
This works around it by doubling up those TLB invalidations which was
verified by HW to be sufficient to close the risk window.

This will be documented in a yet-to-be-published errata.

Fixes: 1a472c9dba6b ("powerpc/mm/radix: Add tlbflush routines")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
[mpe: Enable the feature in the DT CPU features code for all Power9,
      rename the feature to CPU_FTR_P9_TLBIE_BUG per benh.]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/mm/radix: Move the functions that does the actual tlbie closer
Aneesh Kumar K.V [Fri, 23 Mar 2018 04:56:26 +0000 (10:26 +0530)]
powerpc/mm/radix: Move the functions that does the actual tlbie closer

No functionality change. Just code movement to ease code changes later

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/mm/radix: Remove unused code
Aneesh Kumar K.V [Fri, 23 Mar 2018 04:56:25 +0000 (10:26 +0530)]
powerpc/mm/radix: Remove unused code

These function are not used in the code. Remove them.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/mm: Workaround Nest MMU bug with TLB invalidations
Benjamin Herrenschmidt [Thu, 22 Mar 2018 22:29:06 +0000 (09:29 +1100)]
powerpc/mm: Workaround Nest MMU bug with TLB invalidations

On POWER9 the Nest MMU may fail to invalidate some translations when
doing a tlbie "by PID" or "by LPID" that is targeted at the TLB only
and not the page walk cache.

This works around it by forcing such invalidations to escalate to
RIC=2 (full invalidation of TLB *and* PWC) when a coprocessor is in
use for the context.

Fixes: 03b8abedf4f4 ("cxl: Enable global TLBIs for cxl contexts")
Cc: stable@vger.kernel.org # v4.15+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Balbir Singh <bsingharora@gmail.com>
[balbirs: fixed spelling and coding style to quiesce checkpatch.pl]
Tested-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/mm: Add tracking of the number of coprocessors using a context
Benjamin Herrenschmidt [Thu, 22 Mar 2018 22:29:05 +0000 (09:29 +1100)]
powerpc/mm: Add tracking of the number of coprocessors using a context

Currently, when using coprocessors (which use the Nest MMU), we
simply increment the active_cpu count to force all TLB invalidations
to be come broadcast.

Unfortunately, due to an errata in POWER9, we will need to know
more specifically that coprocessors are in use.

This maintains a separate copros counter in the MMU context for
that purpose.

NB. The commit mentioned in the fixes tag below is not at fault for
the bug we're fixing in this commit and the next, but this fix applies
on top the infrastructure it introduced.

Fixes: 03b8abedf4f4 ("cxl: Enable global TLBIs for cxl contexts")
Cc: stable@vger.kernel.org # v4.15+
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tested-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/64s: Fix lost pending interrupt due to race causing lost update to irq_happened
Nicholas Piggin [Wed, 21 Mar 2018 02:22:28 +0000 (12:22 +1000)]
powerpc/64s: Fix lost pending interrupt due to race causing lost update to irq_happened

force_external_irq_replay() can be called in the do_IRQ path with
interrupts hard enabled and soft disabled if may_hard_irq_enable() set
MSR[EE]=1. It updates local_paca->irq_happened with a load, modify,
store sequence. If a maskable interrupt hits during this sequence, it
will go to the masked handler to be marked pending in irq_happened.
This update will be lost when the interrupt returns and the store
instruction executes. This can result in unpredictable latencies,
timeouts, lockups, etc.

Fix this by ensuring hard interrupts are disabled before modifying
irq_happened.

This could cause any maskable asynchronous interrupt to get lost, but
it was noticed on P9 SMP system doing RDMA NVMe target over 100GbE,
so very high external interrupt rate and high IPI rate. The hang was
bisected down to enabling doorbell interrupts for IPIs. These provided
an interrupt type that could run at high rates in the do_IRQ path,
stressing the race.

Fixes: 1d607bb3bd60 ("powerpc/irq: Add mechanism to force a replay of interrupts")
Cc: stable@vger.kernel.org # v4.8+
Reported-by: Carol L. Soto <clsoto@us.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/64s: Fix NULL AT_BASE_PLATFORM when using DT CPU features
Michael Ellerman [Tue, 13 Mar 2018 04:58:11 +0000 (15:58 +1100)]
powerpc/64s: Fix NULL AT_BASE_PLATFORM when using DT CPU features

When running virtualised the powerpc kernel is able to run the system
in "compat mode" - which means the kernel and hardware are pretending
to userspace that the CPU is an older version than it actually is.

AT_BASE_PLATFORM is an AUXV entry that we export to userspace for use
when we're running in that mode, which tells userspace the "platform"
string for the real CPU version, as opposed to the faked version.

Although we don't support compat mode when using DT CPU features, and
arguably don't need to set AT_BASE_PLATFORM, the existing cputable
based code always sets it even when we're running bare metal. That
means the lack of AT_BASE_PLATFORM is a user-visible artifact of the
fact that the kernel is using DT CPU features, which we don't want.

So set it in the DT CPU features code also.

This results in eg:
  $ LD_SHOW_AUXV=1 /bin/true | grep "AT_.*PLATFORM"
  AT_PLATFORM:     power9
  AT_BASE_PLATFORM:power9

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
6 years agopowerpc/pseries: Fix vector5 in ibm architecture vector table
Bharata B Rao [Tue, 6 Mar 2018 08:14:32 +0000 (13:44 +0530)]
powerpc/pseries: Fix vector5 in ibm architecture vector table

With ibm,dynamic-memory-v2 and ibm,drc-info coming around the same
time, byte22 in vector5 of ibm architecture vector table got set twice
separately. The end result is that guest kernel isn't advertising
support for ibm,dynamic-memory-v2.

Fix this by removing the duplicate assignment of byte22.

Fixes: 02ef6dd8109b ("powerpc: Enable support for ibm,drc-info devtree property")
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agoocxl: Document the OCXL_IOCTL_GET_METADATA IOCTL
Alastair D'Silva [Thu, 22 Feb 2018 04:17:39 +0000 (15:17 +1100)]
ocxl: Document the OCXL_IOCTL_GET_METADATA IOCTL

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agoocxl: Add get_metadata IOCTL to share OCXL information to userspace
Alastair D'Silva [Thu, 22 Feb 2018 04:17:38 +0000 (15:17 +1100)]
ocxl: Add get_metadata IOCTL to share OCXL information to userspace

Some required information is not exposed to userspace currently (eg. the
PASID), pass this information back, along with other information which
is currently communicated via sysfs, which saves some parsing effort in
userspace.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agoselftests/powerpc: Skip the subpage_prot tests if the syscall is unavailable
Michael Ellerman [Mon, 26 Feb 2018 04:22:22 +0000 (15:22 +1100)]
selftests/powerpc: Skip the subpage_prot tests if the syscall is unavailable

The subpage_prot syscall is only functional when the system is using
the Hash MMU. Since commit 5b2b80714796 ("powerpc/mm: Invalidate
subpage_prot() system call on radix platforms") it returns ENOENT when
the Radix MMU is active. Currently this just makes the test fail.

Additionally the syscall is not available if the kernel is built with
4K pages, or if CONFIG_PPC_SUBPAGE_PROT=n, in which case it returns
ENOSYS because the syscall is missing entirely.

So check explicitly for ENOENT and ENOSYS and skip if we see either of
those.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agoselftests/powerpc: Fix missing clean of pmu/lib.o
Michael Ellerman [Wed, 28 Feb 2018 04:15:56 +0000 (15:15 +1100)]
selftests/powerpc: Fix missing clean of pmu/lib.o

The tm-resched-dscr test links against pmu/lib.o, but we don't have a
rule to clean pmu/lib.o. This can lead to a build break if you build
for big endian and then little, or vice versa.

Fix it by making tm-resched-dscr depend on pmu/lib.c, causing the code
to be built directly in, meaning no .o is generated.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/boot: Fix random libfdt related build errors
Guenter Roeck [Fri, 23 Feb 2018 20:55:59 +0000 (12:55 -0800)]
powerpc/boot: Fix random libfdt related build errors

Once in a while I see build errors similar to the following
when building images from a clean tree.

  Building powerpc:virtex-ml507:44x/virtex5_defconfig ... failed
  ------------
  Error log:
  arch/powerpc/boot/treeboot-akebono.c:37:20: fatal error:
   libfdt.h: No such file or directory

  Building powerpc:bamboo:smpdev:44x/bamboo_defconfig ... failed
  ------------
  Error log:
  arch/powerpc/boot/treeboot-akebono.c:37:20: fatal error:
   libfdt.h: No such file or directory

  arch/powerpc/boot/treeboot-currituck.c:35:20: fatal error:
       libfdt.h: No such file or directory

Rebuilds will succeed.

Turns out that several source files in arch/powerpc/boot/ include
libfdt.h, but Makefile dependencies are incomplete. Let's fix that.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agoselftests/powerpc: Skip tm-trap if transactional memory is not enabled
Michael Ellerman [Mon, 26 Feb 2018 02:17:07 +0000 (13:17 +1100)]
selftests/powerpc: Skip tm-trap if transactional memory is not enabled

Some processor revisions do not support transactional memory, and
additionally kernel support can be disabled. In either case the
tm-trap test should be skipped, otherwise it will fail with a SIGILL.

Fixes: a08082f8e4e1 ("powerpc/selftests: Check endianness on trap in TM")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/powernv: Support firmware disable of RFI flush
Michael Ellerman [Thu, 22 Feb 2018 13:00:11 +0000 (00:00 +1100)]
powerpc/powernv: Support firmware disable of RFI flush

Some versions of firmware will have a setting that can be configured
to disable the RFI flush, add support for it.

Fixes: 6e032b350cd1 ("powerpc/powernv: Check device-tree for RFI flush settings")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/pseries: Support firmware disable of RFI flush
Michael Ellerman [Thu, 22 Feb 2018 12:58:49 +0000 (23:58 +1100)]
powerpc/pseries: Support firmware disable of RFI flush

Some versions of firmware will have a setting that can be configured
to disable the RFI flush, add support for it.

Fixes: 8989d56878a7 ("powerpc/pseries: Query hypervisor for RFI flush settings")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/mm/drmem: Fix unexpected flag value in ibm,dynamic-memory-v2
Bharata B Rao [Wed, 21 Feb 2018 10:36:26 +0000 (16:06 +0530)]
powerpc/mm/drmem: Fix unexpected flag value in ibm,dynamic-memory-v2

Memory addtion and removal by count and indexed-count methods
temporarily mark the LMBs that are being added/removed by a special
flag value DRMEM_LMB_RESERVED. Accessing flags value directly at a few
places without proper accessor method is causing two unexpected
side-effects:

- DRMEM_LMB_RESERVED bit is becoming part of the flags word of
  drconf_cell_v2 entries in ibm,dynamic-memory-v2 DT property.
- This results in extra drconf_cell entries in ibm,dynamic-memory-v2.
  For example if 1G memory is added, it leads to one entry for 3 LMBs
  and 1 separate entry for the last LMB. All the 4 LMBs should be
  defined by one entry here.

Fix this by always accessing the flags by its accessor method
drmem_lmb_flags().

Fixes: 2b31e3aec1db ("powerpc/drmem: Add support for ibm, dynamic-memory-v2 property")
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/bpf/jit: Fix 32-bit JIT for seccomp_data access
Mark Lord [Tue, 20 Feb 2018 19:49:20 +0000 (14:49 -0500)]
powerpc/bpf/jit: Fix 32-bit JIT for seccomp_data access

I am using SECCOMP to filter syscalls on a ppc32 platform, and noticed
that the JIT compiler was failing on the BPF even though the
interpreter was working fine.

The issue was that the compiler was missing one of the instructions
used by SECCOMP, so here is a patch to enable JIT for that
instruction.

Fixes: eb84bab0fb38 ("ppc: Kconfig: Enable BPF JIT on ppc32")
Signed-off-by: Mark Lord <mlord@pobox.com>
Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/pseries: Revert support for ibm,drc-info devtree property
Michael Bringmann [Tue, 13 Feb 2018 20:02:53 +0000 (14:02 -0600)]
powerpc/pseries: Revert support for ibm,drc-info devtree property

This reverts commit 02ef6dd8109b581343ebeb1c4c973513682535d6.

The earlier patch tried to enable support for a new property
"ibm,drc-info" on powerpc systems.

Unfortunately, some errors in the associated patch set break things
in some of the DLPAR operations.  In particular when attempting to
hot-add a new CPU or set of CPUs, the original patch failed to
properly calculate the available resources, and aborted the operation.
In addition, the original set missed several opportunities to compress
and reuse common code.

As the associated patch set was meant to provide an optimization of
storage and performance of a set of device-tree properties for future
systems with large amounts of resources, reverting just restores
the previous behavior for existing systems.  It seems unnecessary
to enable this feature and introduce the consequent problems in the
field that it will cause at this time, so please revert it for now
until testing of the corrections are finished properly.

Fixes: 02ef6dd8109b ("powerpc: Enable support for ibm,drc-info devtree property")
Signed-off-by: Michael W. Bringmann <mwb@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/pseries: Fix duplicate firmware feature for DRC_INFO
Michael Ellerman [Wed, 21 Feb 2018 11:54:37 +0000 (22:54 +1100)]
powerpc/pseries: Fix duplicate firmware feature for DRC_INFO

We had a mid-air collision between two new firmware features, DRMEM_V2
and DRC_INFO, and they ended up with the same value.

No one's actually reported any problems, presumably because the new
firmware that supports both properties is not widely available, and
the two properties tend to be enabled together.

Still if we ever had one enabled but not the other, the bugs that
could result are many and varied. So fix it.

Fixes: 3f38000eda48 ("powerpc/firmware: Add definitions for new drc-info firmware feature")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
6 years agoocxl: Fix potential bad errno on irq allocation
Frederic Barrat [Fri, 16 Feb 2018 13:01:18 +0000 (14:01 +0100)]
ocxl: Fix potential bad errno on irq allocation

Fix some issues found by a static checker:

When allocating an AFU interrupt, if the driver cannot copy the output
parameters to userland, the errno value was not set to EFAULT

Remove a (now) useless cast.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agopowerpc/eeh: Fix crashes in eeh_report_resume()
Juan J. Alvarez [Thu, 15 Feb 2018 18:49:51 +0000 (12:49 -0600)]
powerpc/eeh: Fix crashes in eeh_report_resume()

The notify_resume() callback in eeh_ops is NULL on powernv, leading to
crashes:

  NIP (null)
  LR  eeh_report_resume+0x218/0x220
  Call Trace:
   eeh_report_resume+0x1f0/0x220 (unreliable)
   eeh_pe_dev_traverse+0x98/0x170
   eeh_handle_normal_event+0x3f4/0x650
   eeh_handle_event+0x54/0x380
   eeh_event_handler+0x14c/0x210
   kthread+0x168/0x1b0
   ret_from_kernel_thread+0x5c/0xb4

Fix it by adding a check before calling it.

Fixes: 856e1eb9bdd4 ("PCI/AER: Add uevents in AER and EEH error/resume")
Signed-off-by: Juan J. Alvarez <jjalvare@linux.vnet.ibm.com>
Reviewed-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Tested-by: Carol L. Soto <clsoto@us.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Tested-by: Mauro S. M. Rodrigues <maurosr@linux.vnet.ibm.com>
Acked-by: Michael Neuling <mikey@neuling.org>
[mpe: Rewrite change log]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agoLinux 4.16-rc2
Linus Torvalds [Mon, 19 Feb 2018 01:29:42 +0000 (17:29 -0800)]
Linux 4.16-rc2

6 years agoMerge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 18 Feb 2018 20:56:41 +0000 (12:56 -0800)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull x86 Kconfig fixes from Thomas Gleixner:
 "Three patchlets to correct HIGHMEM64G and CMPXCHG64 dependencies in
  Kconfig when CPU selections are explicitely set to M586 or M686"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/Kconfig: Explicitly enumerate i686-class CPUs in Kconfig
  x86/Kconfig: Exclude i586-class CPUs lacking PAE support from the HIGHMEM64G Kconfig group
  x86/Kconfig: Add missing i586-class CPUs to the X86_CMPXCHG64 Kconfig group

6 years agoMerge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 18 Feb 2018 20:38:40 +0000 (12:38 -0800)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull perf updates from Thomas Gleixner:
 "Perf tool updates and kprobe fixes:

   - perf_mmap overwrite mode fixes/overhaul, prep work to get 'perf
     top' using it, making it bearable to use it in large core count
     systems such as Knights Landing/Mill Intel systems (Kan Liang)

   - s/390 now uses syscall.tbl, just like x86-64 to generate the
     syscall table id -> string tables used by 'perf trace' (Hendrik
     Brueckner)

   - Use strtoull() instead of home grown function (Andy Shevchenko)

   - Synchronize kernel ABI headers, v4.16-rc1 (Ingo Molnar)

   - Document missing 'perf data --force' option (Sangwon Hong)

   - Add perf vendor JSON metrics for ARM Cortex-A53 Processor (William
     Cohen)

   - Improve error handling and error propagation of ftrace based
     kprobes so failures when installing kprobes are not silently
     ignored and create disfunctional tracepoints"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
  kprobes: Propagate error from disarm_kprobe_ftrace()
  kprobes: Propagate error from arm_kprobe_ftrace()
  Revert "tools include s390: Grab a copy of arch/s390/include/uapi/asm/unistd.h"
  perf s390: Rework system call table creation by using syscall.tbl
  perf s390: Grab a copy of arch/s390/kernel/syscall/syscall.tbl
  tools/headers: Synchronize kernel ABI headers, v4.16-rc1
  perf test: Fix test trace+probe_libc_inet_pton.sh for s390x
  perf data: Document missing --force option
  perf tools: Substitute yet another strtoull()
  perf top: Check the latency of perf_top__mmap_read()
  perf top: Switch default mode to overwrite mode
  perf top: Remove lost events checking
  perf hists browser: Add parameter to disable lost event warning
  perf top: Add overwrite fall back
  perf evsel: Expose the perf_missing_features struct
  perf top: Check per-event overwrite term
  perf mmap: Discard legacy interface for mmap read
  perf test: Update mmap read functions for backward-ring-buffer test
  perf mmap: Introduce perf_mmap__read_event()
  perf mmap: Introduce perf_mmap__read_done()
  ...

6 years agoMerge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 18 Feb 2018 20:22:04 +0000 (12:22 -0800)]
Merge branch 'irq-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull irq updates from Thomas Gleixner:
 "A small set of updates mostly for irq chip drivers:

   - MIPS GIC fix for spurious, masked interrupts

   - fix for a subtle IPI bug in GICv3

   - do not probe GICv3 ITSs that are marked as disabled

   - multi-MSI support for GICv2m

   - various small cleanups"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqdomain: Re-use DEFINE_SHOW_ATTRIBUTE() macro
  irqchip/bcm: Remove hashed address printing
  irqchip/gic-v2m: Add PCI Multi-MSI support
  irqchip/gic-v3: Ignore disabled ITS nodes
  irqchip/gic-v3: Use wmb() instead of smb_wmb() in gic_raise_softirq()
  irqchip/gic-v3: Change pr_debug message to pr_devel
  irqchip/mips-gic: Avoid spuriously handling masked interrupts

6 years agoMerge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 18 Feb 2018 19:54:22 +0000 (11:54 -0800)]
Merge branch 'core-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull core fix from Thomas Gleixner:
 "A small fix which adds the missing for_each_cpu_wrap() stub for the UP
  case to avoid build failures"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  cpumask: Make for_each_cpu_wrap() available on UP as well

6 years agoMerge tag 'for-linus-20180217' of git://git.kernel.dk/linux-block
Linus Torvalds [Sat, 17 Feb 2018 18:20:47 +0000 (10:20 -0800)]
Merge tag 'for-linus-20180217' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:

 - NVMe pull request from Keith, with fixes all over the map for nvme.
   From various folks.

 - Classic polling fix, that avoids a latency issue where we still end
   up waiting for an interrupt in some cases. From Nitesh Shetty.

 - Comment typo fix from Minwoo Im.

* tag 'for-linus-20180217' of git://git.kernel.dk/linux-block:
  block: fix a typo in comment of BLK_MQ_POLL_STATS_BKTS
  nvme-rdma: fix sysfs invoked reset_ctrl error flow
  nvmet: Change return code of discard command if not supported
  nvme-pci: Fix timeouts in connecting state
  nvme-pci: Remap CMB SQ entries on every controller reset
  nvme: fix the deadlock in nvme_update_formats
  blk: optimization for classic polling
  nvme: Don't use a stack buffer for keep-alive command
  nvme_fc: cleanup io completion
  nvme_fc: correct abort race condition on resets
  nvme: Fix discard buffer overrun
  nvme: delete NVME_CTRL_LIVE --> NVME_CTRL_CONNECTING transition
  nvme-rdma: use NVME_CTRL_CONNECTING state to mark init process
  nvme: rename NVME_CTRL_RECONNECTING state to NVME_CTRL_CONNECTING

6 years agoMerge tag 'mmc-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Linus Torvalds [Sat, 17 Feb 2018 18:08:28 +0000 (10:08 -0800)]
Merge tag 'mmc-v4.16-rc1' of git://git./linux/kernel/git/ulfh/mmc

Pull MMC fixes from Ulf Hansson:

 - meson-gx: Revert to earlier tuning process

 - bcm2835: Don't overwrite max frequency unconditionally

* tag 'mmc-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: bcm2835: Don't overwrite max frequency unconditionally
  Revert "mmc: meson-gx: include tx phase in the tuning process"

6 years agoMerge tag 'mtd/fixes-for-4.16-rc2' of git://git.infradead.org/linux-mtd
Linus Torvalds [Sat, 17 Feb 2018 18:06:13 +0000 (10:06 -0800)]
Merge tag 'mtd/fixes-for-4.16-rc2' of git://git.infradead.org/linux-mtd

Pull mtd fixes from Boris Brezillon:

 - add missing dependency to NAND_MARVELL Kconfig entry

 - use the appropriate OOB layout in the VF610 driver

* tag 'mtd/fixes-for-4.16-rc2' of git://git.infradead.org/linux-mtd:
  mtd: nand: MTD_NAND_MARVELL should depend on HAS_DMA
  mtd: nand: vf610: set correct ooblayout

6 years agoMerge tag 'powerpc-4.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc...
Linus Torvalds [Sat, 17 Feb 2018 17:48:26 +0000 (09:48 -0800)]
Merge tag 'powerpc-4.16-3' of git://git./linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 "The main attraction is a fix for a bug in the new drmem code, which
  was causing an oops on boot on some versions of Qemu.

  There's also a fix for XIVE (Power9 interrupt controller) on KVM, as
  well as a few other minor fixes.

  Thanks to: Corentin Labbe, Cyril Bur, Cédric Le Goater, Daniel Black,
  Nathan Fontenot, Nicholas Piggin"

* tag 'powerpc-4.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/pseries: Check for zero filled ibm,dynamic-memory property
  powerpc/pseries: Add empty update_numa_cpu_lookup_table() for NUMA=n
  powerpc/powernv: IMC fix out of bounds memory access at shutdown
  powerpc/xive: Use hw CPU ids when configuring the CPU queues
  powerpc: Expose TSCR via sysfs only on powernv

6 years agoMerge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Linus Torvalds [Sat, 17 Feb 2018 17:46:18 +0000 (09:46 -0800)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:
 "The bulk of this is the pte accessors annotation to READ/WRITE_ONCE
  (we tried to avoid pushing this during the merge window to avoid
  conflicts)

   - Updated the page table accessors to use READ/WRITE_ONCE and prevent
     compiler transformation that could lead to an apparent loss of
     coherency

   - Enabled branch predictor hardening for the Falkor CPU

   - Fix interaction between kpti enabling and KASan causing the
     recursive page table walking to take a significant time

   - Fix some sparse warnings"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: cputype: Silence Sparse warnings
  arm64: mm: Use READ_ONCE/WRITE_ONCE when accessing page tables
  arm64: proc: Set PTE_NG for table entries to avoid traversing them twice
  arm64: Add missing Falkor part number for branch predictor hardening

6 years agoMerge tag 'for-linus-4.16a-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 17 Feb 2018 17:16:09 +0000 (09:16 -0800)]
Merge tag 'for-linus-4.16a-rc2-tag' of git://git./linux/kernel/git/xen/tip

Pull xen fixes from Juergen Gross:

 - fixes for the Xen pvcalls frontend driver

 - fix for booting Xen pv domains

 - fix for the xenbus driver user interface

* tag 'for-linus-4.16a-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  pvcalls-front: wait for other operations to return when release passive sockets
  pvcalls-front: introduce a per sock_mapping refcount
  x86/xen: Calculate __max_logical_packages on PV domains
  xenbus: track caller request id

6 years agopvcalls-front: wait for other operations to return when release passive sockets
Stefano Stabellini [Wed, 14 Feb 2018 18:28:24 +0000 (10:28 -0800)]
pvcalls-front: wait for other operations to return when release passive sockets

Passive sockets can have ongoing operations on them, specifically, we
have two wait_event_interruptable calls in pvcalls_front_accept.

Add two wake_up calls in pvcalls_front_release, then wait for the
potential waiters to return and release the sock_mapping refcount.

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Acked-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
6 years agopvcalls-front: introduce a per sock_mapping refcount
Stefano Stabellini [Wed, 14 Feb 2018 18:28:23 +0000 (10:28 -0800)]
pvcalls-front: introduce a per sock_mapping refcount

Introduce a per sock_mapping refcount, in addition to the existing
global refcount. Thanks to the sock_mapping refcount, we can safely wait
for it to be 1 in pvcalls_front_release before freeing an active socket,
instead of waiting for the global refcount to be 1.

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Acked-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
6 years agox86/xen: Calculate __max_logical_packages on PV domains
Prarit Bhargava [Wed, 7 Feb 2018 23:49:23 +0000 (18:49 -0500)]
x86/xen: Calculate __max_logical_packages on PV domains

The kernel panics on PV domains because native_smp_cpus_done() is
only called for HVM domains.

Calculate __max_logical_packages for PV domains.

Fixes: b4c0a7326f5d ("x86/smpboot: Fix __max_logical_packages estimate")
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Tested-and-reported-by: Simon Gaiser <simon@invisiblethingslab.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Kate Stewart <kstewart@linuxfoundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: xen-devel@lists.xenproject.org
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
6 years agoxenbus: track caller request id
Joao Martins [Fri, 2 Feb 2018 17:42:33 +0000 (17:42 +0000)]
xenbus: track caller request id

Commit fd8aa9095a95 ("xen: optimize xenbus driver for multiple concurrent
xenstore accesses") optimized xenbus concurrent accesses but in doing so
broke UABI of /dev/xen/xenbus. Through /dev/xen/xenbus applications are in
charge of xenbus message exchange with the correct header and body. Now,
after the mentioned commit the replies received by application will no
longer have the header req_id echoed back as it was on request (see
specification below for reference), because that particular field is being
overwritten by kernel.

struct xsd_sockmsg
{
  uint32_t type;  /* XS_??? */
  uint32_t req_id;/* Request identifier, echoed in daemon's response.  */
  uint32_t tx_id; /* Transaction id (0 if not related to a transaction). */
  uint32_t len;   /* Length of data following this. */

  /* Generally followed by nul-terminated string(s). */
};

Before there was only one request at a time so req_id could simply be
forwarded back and forth. To allow simultaneous requests we need a
different req_id for each message thus kernel keeps a monotonic increasing
counter for this field and is written on every request irrespective of
userspace value.

Forwarding again the req_id on userspace requests is not a solution because
we would open the possibility of userspace-generated req_id colliding with
kernel ones. So this patch instead takes another route which is to
artificially keep user req_id while keeping the xenbus logic as is. We do
that by saving the original req_id before xs_send(), use the private kernel
counter as req_id and then once reply comes and was validated, we restore
back the original req_id.

Cc: <stable@vger.kernel.org> # 4.11
Fixes: fd8aa9095a ("xen: optimize xenbus driver for multiple concurrent xenstore accesses")
Reported-by: Bhavesh Davda <bhavesh.davda@oracle.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
6 years agoarm64: cputype: Silence Sparse warnings
Robin Murphy [Fri, 16 Feb 2018 17:04:23 +0000 (17:04 +0000)]
arm64: cputype: Silence Sparse warnings

Sparse makes a fair bit of noise about our MPIDR mask being implicitly
long - let's explicitly describe it as such rather than just relying on
the value forcing automatic promotion.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
6 years agoMerge tag 'dma-mapping-4.16-2' of git://git.infradead.org/users/hch/dma-mapping
Linus Torvalds [Fri, 16 Feb 2018 20:22:33 +0000 (12:22 -0800)]
Merge tag 'dma-mapping-4.16-2' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-mapping fixes from Christoph Hellwig:
 "A few dma-mapping fixes for the fallout from the changes in rc1"

* tag 'dma-mapping-4.16-2' of git://git.infradead.org/users/hch/dma-mapping:
  powerpc/macio: set a proper dma_coherent_mask
  dma-mapping: fix a comment typo
  dma-direct: comment the dma_direct_free calling convention
  dma-direct: mark as is_phys
  ia64: fix build failure with CONFIG_SWIOTLB

6 years agoarm64: mm: Use READ_ONCE/WRITE_ONCE when accessing page tables
Will Deacon [Thu, 15 Feb 2018 11:14:56 +0000 (11:14 +0000)]
arm64: mm: Use READ_ONCE/WRITE_ONCE when accessing page tables

In many cases, page tables can be accessed concurrently by either another
CPU (due to things like fast gup) or by the hardware page table walker
itself, which may set access/dirty bits. In such cases, it is important
to use READ_ONCE/WRITE_ONCE when accessing page table entries so that
entries cannot be torn, merged or subject to apparent loss of coherence
due to compiler transformations.

Whilst there are some scenarios where this cannot happen (e.g. pinned
kernel mappings for the linear region), the overhead of using READ_ONCE
/WRITE_ONCE everywhere is minimal and makes the code an awful lot easier
to reason about. This patch consistently uses these macros in the arch
code, as well as explicitly namespacing pointers to page table entries
from the entries themselves by using adopting a 'p' suffix for the former
(as is sometimes used elsewhere in the kernel source).

Tested-by: Yury Norov <ynorov@caviumnetworks.com>
Tested-by: Richard Ruigrok <rruigrok@codeaurora.org>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
6 years agomm: hide a #warning for COMPILE_TEST
Arnd Bergmann [Fri, 16 Feb 2018 15:25:53 +0000 (16:25 +0100)]
mm: hide a #warning for COMPILE_TEST

We get a warning about some slow configurations in randconfig kernels:

  mm/memory.c:83:2: error: #warning Unfortunate NUMA and NUMA Balancing config, growing page-frame for last_cpupid. [-Werror=cpp]

The warning is reasonable by itself, but gets in the way of randconfig
build testing, so I'm hiding it whenever CONFIG_COMPILE_TEST is set.

The warning was added in 2013 in commit 75980e97dacc ("mm: fold
page->_last_nid into page->flags where possible").

Cc: stable@vger.kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoMerge tag 'mips_fixes_4.16_2' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan...
Linus Torvalds [Fri, 16 Feb 2018 17:31:37 +0000 (09:31 -0800)]
Merge tag 'mips_fixes_4.16_2' of git://git./linux/kernel/git/jhogan/mips

Pull MIPS fixes from James Hogan:
 "A few fixes for outstanding MIPS issues:

   - an __init section mismatch warning when brcmstb_pm is enabled

   - a regression handling multiple mem=X@Y arguments (4.11)

   - a USB Kconfig select warning, and related sparc cleanup (4.16)"

* tag 'mips_fixes_4.16_2' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/mips:
  sparc,leon: Select USB_UHCI_BIG_ENDIAN_{MMIO,DESC}
  usb: Move USB_UHCI_BIG_ENDIAN_* out of USB_SUPPORT
  MIPS: Fix incorrect mem=X@Y handling
  MIPS: BMIPS: Fix section mismatch warning

6 years agoMerge tag 'for-4.16-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave...
Linus Torvalds [Fri, 16 Feb 2018 17:26:18 +0000 (09:26 -0800)]
Merge tag 'for-4.16-rc1-tag' of git://git./linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:
 "We have a few assorted fixes, some of them show up during fstests so I
  gave them more testing"

* tag 'for-4.16-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: Fix use-after-free when cleaning up fs_devs with a single stale device
  Btrfs: fix null pointer dereference when replacing missing device
  btrfs: remove spurious WARN_ON(ref->count < 0) in find_parent_nodes
  btrfs: Ignore errors from btrfs_qgroup_trace_extent_post
  Btrfs: fix unexpected -EEXIST when creating new inode
  Btrfs: fix use-after-free on root->orphan_block_rsv
  Btrfs: fix btrfs_evict_inode to handle abnormal inodes correctly
  Btrfs: fix extent state leak from tree log
  Btrfs: fix crash due to not cleaning up tree log block's dirty bits
  Btrfs: fix deadlock in run_delalloc_nocow

6 years agoMerge tag 'for-4.16/dm-chained-bios-fix' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 16 Feb 2018 17:23:36 +0000 (09:23 -0800)]
Merge tag 'for-4.16/dm-chained-bios-fix' of git://git./linux/kernel/git/device-mapper/linux-dm

Pull device mapper fix from Mike Snitzer:
 "Fix for DM core to properly propagate errors (avoids overriding
  non-zero error with 0). This is particularly important given DM core's
  increased use of chained bios"

* tag 'for-4.16/dm-chained-bios-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm: correctly handle chained bios in dec_pending()

6 years agoMerge tag 'platform-drivers-x86-v4.16-4' of git://git.infradead.org/linux-platform...
Linus Torvalds [Fri, 16 Feb 2018 17:20:00 +0000 (09:20 -0800)]
Merge tag 'platform-drivers-x86-v4.16-4' of git://git.infradead.org/linux-platform-drivers-x86

Pull x86 platform driver fixes from Andy Shevchenko:

 - regression fix in keyboard support for Dell laptops

 - prevent out-of-boundary write in WMI bus driver

 - increase timeout to read functional key status on Lenovo laptops

* tag 'platform-drivers-x86-v4.16-4' of git://git.infradead.org/linux-platform-drivers-x86:
  platform/x86: dell-laptop: Removed duplicates in DMI whitelist
  platform/x86: dell-laptop: fix kbd_get_state's request value
  platform/x86: ideapad-laptop: Increase timeout to wait for EC answer
  platform/x86: wmi: fix off-by-one write in wmi_dev_probe()

6 years agoMerge tag 'sound-4.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai...
Linus Torvalds [Fri, 16 Feb 2018 17:11:30 +0000 (09:11 -0800)]
Merge tag 'sound-4.16-rc2' of git://git./linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "A collection of usual suspects:

   - a handful USB-audio and HD-audio device-specific quirks

   - some trivial fixes for the new AC97 bus stuff

   - another race fix in ALSA sequencer core"

* tag 'sound-4.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda/realtek: PCI quirk for Fujitsu U7x7
  ALSA: seq: Fix racy pool initializations
  ALSA: usb: add more device quirks for USB DSD devices
  ALSA: usb-audio: Fix UAC2 get_ctl request with a RANGE attribute
  ALSA: ac97: Fix copy and paste typo in documentation
  ALSA: usb-audio: add implicit fb quirk for Behringer UFX1204
  ALSA: ac97: kconfig: Remove select of undefined symbol AC97
  ALSA: hda/realtek - Enable Thinkpad Dock device for ALC298 platform
  ALSA: hda/realtek - Add headset mode support for Dell laptop
  ALSA: hda - Fix headset mic detection problem for two Dell machines

6 years agoMerge tag 'drm-fixes-for-v4.16-rc2' of git://people.freedesktop.org/~airlied/linux
Linus Torvalds [Fri, 16 Feb 2018 17:08:59 +0000 (09:08 -0800)]
Merge tag 'drm-fixes-for-v4.16-rc2' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
 "One nouveau regression fix, one AMD quirk and a full set of i915
  fixes.

  The i915 fixes are mostly for things caught by their CI system, main
  ones being DSI panel fixes and GEM fixes"

* tag 'drm-fixes-for-v4.16-rc2' of git://people.freedesktop.org/~airlied/linux:
  drm/nouveau: Make clock gate support conditional
  drm/i915: Fix DSI panels with v1 MIPI sequences without a DEASSERT sequence v3
  drm/i915: Free memdup-ed DSI VBT data structures on driver_unload
  drm/i915: Add intel_bios_cleanup() function
  drm/i915/vlv: Add cdclk workaround for DSI
  drm/i915/gvt: fix one typo of render_mmio trace
  drm/i915/gvt: Support BAR0 8-byte reads/writes
  drm/i915/gvt: add 0xe4f0 into gen9 render list
  drm/i915/pmu: Fix building without CONFIG_PM
  drm/i915/pmu: Fix sleep under atomic in RC6 readout
  drm/i915/pmu: Fix PMU enable vs execlists tasklet race
  drm/i915: Lock out execlist tasklet while peeking inside for busy-stats
  drm/i915/breadcrumbs: Ignore unsubmitted signalers
  drm/i915: Don't wake the device up to check if the engine is asleep
  drm/i915: Avoid truncation before clamping userspace's priority value
  drm/i915/perf: Fix compiler warning for string truncation
  drm/i915/perf: Fix compiler warning for string truncation
  drm/amdgpu: add new device to use atpx quirk

6 years agodm: correctly handle chained bios in dec_pending()
NeilBrown [Thu, 15 Feb 2018 09:00:15 +0000 (20:00 +1100)]
dm: correctly handle chained bios in dec_pending()

dec_pending() is given an error status (possibly 0) to be recorded
against a bio.  It can be called several times on the one 'struct
dm_io', and it is careful to only assign a non-zero error to
io->status.  However when it then assigned io->status to bio->bi_status,
it is not careful and could overwrite a genuine error status with 0.

This can happen when chained bios are in use.  If a bio is chained
beneath the bio that this dm_io is handling, the child bio might
complete and set bio->bi_status before the dm_io completes.

This has been possible since chained bios were introduced in 3.14, and
has become a lot easier to trigger with commit 18a25da84354 ("dm: ensure
bio submission follows a depth-first tree walk") as that commit caused
dm to start using chained bios itself.

A particular failure mode is that if a bio spans an 'error' target and a
working target, the 'error' fragment will complete instantly and set the
->bi_status, and the other fragment will normally complete a little
later, and will clear ->bi_status.

The fix is simply to only assign io_error to bio->bi_status when
io_error is not zero.

Reported-and-tested-by: Milan Broz <gmazyland@gmail.com>
Cc: stable@vger.kernel.org (v3.14+)
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
6 years agoMerge tag 'irqchip-4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm...
Thomas Gleixner [Fri, 16 Feb 2018 14:47:26 +0000 (15:47 +0100)]
Merge tag 'irqchip-4.16-2' of git://git./linux/kernel/git/maz/arm-platforms into irq/urgent

Pull irqchip updates for 4.16-rc2 from Marc Zyngier

 - A MIPS GIC fix for spurious, masked interrupts
 - A fix for a subtle IPI bug in GICv3
 - Do not probe GICv3 ITSs that are marked as disabled
 - Multi-MSI support for GICv2m
 - Various cleanups

6 years agoirqdomain: Re-use DEFINE_SHOW_ATTRIBUTE() macro
Andy Shevchenko [Wed, 14 Feb 2018 15:47:35 +0000 (17:47 +0200)]
irqdomain: Re-use DEFINE_SHOW_ATTRIBUTE() macro

...instead of open coding file operations followed by custom ->open()
callbacks per each attribute.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
6 years agoirqchip/bcm: Remove hashed address printing
Jaedon Shin [Mon, 12 Feb 2018 02:18:12 +0000 (11:18 +0900)]
irqchip/bcm: Remove hashed address printing

Since commit ad67b74d2469 ("printk: hash addresses printed with %p")
pointers are being hashed when printed. Displaying the virtual memory at
bootup time is not helpful. so delete the prints.

Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jaedon Shin <jaedon.shin@gmail.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
6 years agoirqchip/gic-v2m: Add PCI Multi-MSI support
Marc Zyngier [Tue, 6 Feb 2018 18:55:33 +0000 (18:55 +0000)]
irqchip/gic-v2m: Add PCI Multi-MSI support

We'd never implemented Multi-MSI support with GICv2m, because
it is weird and clunky, and you'd think people would rather use
MSI-X.

Turns out there is still plenty of devices out there that rely
on Multi-MSI. Oh well, let's teach that trick to the v2m widget,
it is not a big deal anyway.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
6 years agoirqchip/gic-v3: Ignore disabled ITS nodes
Stephen Boyd [Thu, 1 Feb 2018 17:03:29 +0000 (09:03 -0800)]
irqchip/gic-v3: Ignore disabled ITS nodes

On some platforms there's an ITS available but it's not enabled
because reading or writing the registers is denied by the
firmware. In fact, reading or writing them will cause the system
to reset. We could remove the node from DT in such a case, but
it's better to skip nodes that are marked as "disabled" in DT so
that we can describe the hardware that exists and use the status
property to indicate how the firmware has configured things.

Cc: Stuart Yoder <stuyoder@gmail.com>
Cc: Laurentiu Tudor <laurentiu.tudor@nxp.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Rajendra Nayak <rnayak@codeaurora.org>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
6 years agoirqchip/gic-v3: Use wmb() instead of smb_wmb() in gic_raise_softirq()
Shanker Donthineni [Thu, 1 Feb 2018 00:03:42 +0000 (18:03 -0600)]
irqchip/gic-v3: Use wmb() instead of smb_wmb() in gic_raise_softirq()

A DMB instruction can be used to ensure the relative order of only
memory accesses before and after the barrier. Since writes to system
registers are not memory operations, barrier DMB is not sufficient
for observability of memory accesses that occur before ICC_SGI1R_EL1
writes.

A DSB instruction ensures that no instructions that appear in program
order after the DSB instruction, can execute until the DSB instruction
has completed.

Cc: stable@vger.kernel.org
Acked-by: Will Deacon <will.deacon@arm.com>,
Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
6 years agoirqchip/gic-v3: Change pr_debug message to pr_devel
Mark Salter [Fri, 2 Feb 2018 14:20:29 +0000 (09:20 -0500)]
irqchip/gic-v3: Change pr_debug message to pr_devel

The pr_debug() in gic-v3 gic_send_sgi() can trigger a circular locking
warning:

 GICv3: CPU10: ICC_SGI1R_EL1 5000400
 ======================================================
 WARNING: possible circular locking dependency detected
 4.15.0+ #1 Tainted: G        W
 ------------------------------------------------------
 dynamic_debug01/1873 is trying to acquire lock:
  ((console_sem).lock){-...}, at: [<0000000099c891ec>] down_trylock+0x20/0x4c

 but task is already holding lock:
  (&rq->lock){-.-.}, at: [<00000000842e1587>] __task_rq_lock+0x54/0xdc

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> #2 (&rq->lock){-.-.}:
        __lock_acquire+0x3b4/0x6e0
        lock_acquire+0xf4/0x2a8
        _raw_spin_lock+0x4c/0x60
        task_fork_fair+0x3c/0x148
        sched_fork+0x10c/0x214
        copy_process.isra.32.part.33+0x4e8/0x14f0
        _do_fork+0xe8/0x78c
        kernel_thread+0x48/0x54
        rest_init+0x34/0x2a4
        start_kernel+0x45c/0x488

 -> #1 (&p->pi_lock){-.-.}:
        __lock_acquire+0x3b4/0x6e0
        lock_acquire+0xf4/0x2a8
        _raw_spin_lock_irqsave+0x58/0x70
        try_to_wake_up+0x48/0x600
        wake_up_process+0x28/0x34
        __up.isra.0+0x60/0x6c
        up+0x60/0x68
        __up_console_sem+0x4c/0x7c
        console_unlock+0x328/0x634
        vprintk_emit+0x25c/0x390
        dev_vprintk_emit+0xc4/0x1fc
        dev_printk_emit+0x88/0xa8
        __dev_printk+0x58/0x9c
        _dev_info+0x84/0xa8
        usb_new_device+0x100/0x474
        hub_port_connect+0x280/0x92c
        hub_event+0x740/0xa84
        process_one_work+0x240/0x70c
        worker_thread+0x60/0x400
        kthread+0x110/0x13c
        ret_from_fork+0x10/0x18

 -> #0 ((console_sem).lock){-...}:
        validate_chain.isra.34+0x6e4/0xa20
        __lock_acquire+0x3b4/0x6e0
        lock_acquire+0xf4/0x2a8
        _raw_spin_lock_irqsave+0x58/0x70
        down_trylock+0x20/0x4c
        __down_trylock_console_sem+0x3c/0x9c
        console_trylock+0x20/0xb0
        vprintk_emit+0x254/0x390
        vprintk_default+0x58/0x90
        vprintk_func+0xbc/0x164
        printk+0x80/0xa0
        __dynamic_pr_debug+0x84/0xac
        gic_raise_softirq+0x184/0x18c
        smp_cross_call+0xac/0x218
        smp_send_reschedule+0x3c/0x48
        resched_curr+0x60/0x9c
        check_preempt_curr+0x70/0xdc
        wake_up_new_task+0x310/0x470
        _do_fork+0x188/0x78c
        SyS_clone+0x44/0x50
        __sys_trace_return+0x0/0x4

 other info that might help us debug this:

 Chain exists of:
   (console_sem).lock --> &p->pi_lock --> &rq->lock

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&rq->lock);
                                lock(&p->pi_lock);
                                lock(&rq->lock);
   lock((console_sem).lock);

  *** DEADLOCK ***

 2 locks held by dynamic_debug01/1873:
  #0:  (&p->pi_lock){-.-.}, at: [<000000001366df53>] wake_up_new_task+0x40/0x470
  #1:  (&rq->lock){-.-.}, at: [<00000000842e1587>] __task_rq_lock+0x54/0xdc

 stack backtrace:
 CPU: 10 PID: 1873 Comm: dynamic_debug01 Tainted: G        W        4.15.0+ #1
 Hardware name: GIGABYTE R120-T34-00/MT30-GS2-00, BIOS T48 10/02/2017
 Call trace:
  dump_backtrace+0x0/0x188
  show_stack+0x24/0x2c
  dump_stack+0xa4/0xe0
  print_circular_bug.isra.31+0x29c/0x2b8
  check_prev_add.constprop.39+0x6c8/0x6dc
  validate_chain.isra.34+0x6e4/0xa20
  __lock_acquire+0x3b4/0x6e0
  lock_acquire+0xf4/0x2a8
  _raw_spin_lock_irqsave+0x58/0x70
  down_trylock+0x20/0x4c
  __down_trylock_console_sem+0x3c/0x9c
  console_trylock+0x20/0xb0
  vprintk_emit+0x254/0x390
  vprintk_default+0x58/0x90
  vprintk_func+0xbc/0x164
  printk+0x80/0xa0
  __dynamic_pr_debug+0x84/0xac
  gic_raise_softirq+0x184/0x18c
  smp_cross_call+0xac/0x218
  smp_send_reschedule+0x3c/0x48
  resched_curr+0x60/0x9c
  check_preempt_curr+0x70/0xdc
  wake_up_new_task+0x310/0x470
  _do_fork+0x188/0x78c
  SyS_clone+0x44/0x50
  __sys_trace_return+0x0/0x4
 GICv3: CPU0: ICC_SGI1R_EL1 12000

This could be fixed with printk_deferred() but that might lessen its
usefulness for debugging. So change it to pr_devel to keep it out of
production kernels. Developers working on gic-v3 can enable it as
needed in their kernels.

Signed-off-by: Mark Salter <msalter@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
6 years agoirqchip/mips-gic: Avoid spuriously handling masked interrupts
Matt Redfearn [Mon, 5 Feb 2018 16:45:36 +0000 (16:45 +0000)]
irqchip/mips-gic: Avoid spuriously handling masked interrupts

Commit 7778c4b27cbe ("irqchip: mips-gic: Use pcpu_masks to avoid reading
GIC_SH_MASK*") removed the read of the hardware mask register when
handling shared interrupts, instead using the driver's shadow pcpu_masks
entry as the effective mask. Unfortunately this did not take account of
the write to pcpu_masks during gic_shared_irq_domain_map, which
effectively unmasks the interrupt early. If an interrupt is asserted,
gic_handle_shared_int decodes and processes the interrupt even though it
has not yet been unmasked via gic_unmask_irq, which also sets the
appropriate bit in pcpu_masks.

On the MIPS Boston board, when a console command line of
"console=ttyS0,115200n8r" is passed, the modem status IRQ is enabled in
the UART, which is immediately raised to the GIC. The interrupt has been
mapped, but no handler has yet been registered, nor is it expected to be
unmasked. However, the write to pcpu_masks in gic_shared_irq_domain_map
has effectively unmasked it, resulting in endless reports of:

[    5.058454] irq 13, desc: ffffffff80a7ad80, depth: 1, count: 0, unhandled: 0
[    5.062057] ->handle_irq():  ffffffff801b1838,
[    5.062175] handle_bad_irq+0x0/0x2c0

Where IRQ 13 is the UART interrupt.

To fix this, just remove the write to pcpu_masks in
gic_shared_irq_domain_map. The existing write in gic_unmask_irq is the
correct place for what is now the effective unmasking.

Cc: stable@vger.kernel.org
Fixes: 7778c4b27cbe ("irqchip: mips-gic: Use pcpu_masks to avoid reading GIC_SH_MASK*")
Signed-off-by: Matt Redfearn <matt.redfearn@mips.com>
Reviewed-by: Paul Burton <paul.burton@mips.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
6 years agopowerpc/pseries: Check for zero filled ibm,dynamic-memory property
Nathan Fontenot [Fri, 16 Feb 2018 03:27:41 +0000 (21:27 -0600)]
powerpc/pseries: Check for zero filled ibm,dynamic-memory property

Some versions of QEMU will produce an ibm,dynamic-reconfiguration-memory
node with a ibm,dynamic-memory property that is zero-filled. This
causes the drmem code to oops trying to parse this property.

The fix for this is to validate that the property does contain LMB
entries before trying to parse it and bail if the count is zero.

  Oops: Kernel access of bad area, sig: 11 [#1]
  DAR: 0000000000000010
  NIP read_drconf_v1_cell+0x54/0x9c
  LR  read_drconf_v1_cell+0x48/0x9c
  Call Trace:
    __param_initcall_debug+0x0/0x28 (unreliable)
    drmem_init+0x144/0x2f8
    do_one_initcall+0x64/0x1d0
    kernel_init_freeable+0x298/0x38c
    kernel_init+0x24/0x160
    ret_from_kernel_thread+0x5c/0xb4

The ibm,dynamic-reconfiguration-memory device tree property generated
that causes this:

  ibm,dynamic-reconfiguration-memory {
          ibm,lmb-size = <0x0 0x10000000>;
          ibm,memory-flags-mask = <0xff>;
          ibm,dynamic-memory = <0x0 0x0 0x0 0x0 0x0 0x0>;
          linux,phandle = <0x7e57eed8>;
          ibm,associativity-lookup-arrays = <0x1 0x4 0x0 0x0 0x0 0x0>;
          ibm,memory-preservation-time = <0x0>;
  };

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Reviewed-by: Cyril Bur <cyrilbur@gmail.com>
Tested-by: Daniel Black <daniel@linux.vnet.ibm.com>
[mpe: Trim oops report]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
6 years agocpumask: Make for_each_cpu_wrap() available on UP as well
Michael Kelley [Wed, 14 Feb 2018 02:54:03 +0000 (02:54 +0000)]
cpumask: Make for_each_cpu_wrap() available on UP as well

for_each_cpu_wrap() was originally added in the #else half of a
large "#if NR_CPUS == 1" statement, but was omitted in the #if
half.  This patch adds the missing #if half to prevent compile
errors when NR_CPUS is 1.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Michael Kelley <mhkelley@outlook.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kys@microsoft.com
Cc: martin.petersen@oracle.com
Cc: mikelley@microsoft.com
Fixes: c743f0a5c50f ("sched/fair, cpumask: Export for_each_cpu_wrap()")
Link: http://lkml.kernel.org/r/SN6PR1901MB2045F087F59450507D4FCC17CBF50@SN6PR1901MB2045.namprd19.prod.outlook.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
6 years agox86/Kconfig: Explicitly enumerate i686-class CPUs in Kconfig
Matthew Whitehead [Thu, 15 Feb 2018 16:54:56 +0000 (11:54 -0500)]
x86/Kconfig: Explicitly enumerate i686-class CPUs in Kconfig

The X86_P6_NOP config class leaves out many i686-class CPUs. Instead,
explicitly enumerate all these CPUs.

Using a configuration with M686 currently sets X86_MINIMUM_CPU_FAMILY=5
instead of the correct value of 6.

Booting on an i586 it will fail to generate the "This kernel
requires an i686 CPU, but only detected an i586 CPU" message and
intentional halt as expected. It will instead just silently hang
when it hits i686-specific instructions.

Signed-off-by: Matthew Whitehead <tedheadster@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1518713696-11360-3-git-send-email-tedheadster@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
6 years agox86/Kconfig: Exclude i586-class CPUs lacking PAE support from the HIGHMEM64G Kconfig...
Matthew Whitehead [Thu, 15 Feb 2018 16:54:55 +0000 (11:54 -0500)]
x86/Kconfig: Exclude i586-class CPUs lacking PAE support from the HIGHMEM64G Kconfig group

i586-class machines also lack support for Physical Address Extension (PAE),
so add them to the exclusion list.

Signed-off-by: Matthew Whitehead <tedheadster@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1518713696-11360-2-git-send-email-tedheadster@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
6 years agox86/Kconfig: Add missing i586-class CPUs to the X86_CMPXCHG64 Kconfig group
Matthew Whitehead [Thu, 15 Feb 2018 16:54:54 +0000 (11:54 -0500)]
x86/Kconfig: Add missing i586-class CPUs to the X86_CMPXCHG64 Kconfig group

Several i586-class CPUs supporting this instruction are missing from
the X86_CMPXCHG64 config group.

Using a configuration with either M586TSC or M586MMX currently sets
X86_MINIMUM_CPU_FAMILY=4 instead of the correct value of 5.

Booting on an i486 it will fail to generate the "This kernel
requires an i586 CPU, but only detected an i486 CPU" message and
intentional halt as expected. It will instead just silently hang
when it hits i586-specific instructions.

The M586 CPU is not in this list because at least the Cyrix 5x86
lacks this instruction, and perhaps others.

Signed-off-by: Matthew Whitehead <tedheadster@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1518713696-11360-1-git-send-email-tedheadster@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
6 years agokprobes: Propagate error from disarm_kprobe_ftrace()
Jessica Yu [Tue, 9 Jan 2018 23:51:24 +0000 (00:51 +0100)]
kprobes: Propagate error from disarm_kprobe_ftrace()

Improve error handling when disarming ftrace-based kprobes. Like with
arm_kprobe_ftrace(), propagate any errors from disarm_kprobe_ftrace() so
that we do not disable/unregister kprobes that are still armed. In other
words, unregister_kprobe() and disable_kprobe() should not report success
if the kprobe could not be disarmed.

disarm_all_kprobes() keeps its current behavior and attempts to
disarm all kprobes. It returns the last encountered error and gives a
warning if not all probes could be disarmed.

This patch is based on Petr Mladek's original patchset (patches 2 and 3)
back in 2015, which improved kprobes error handling, found here:

   https://lkml.org/lkml/2015/2/26/452

However, further work on this had been paused since then and the patches
were not upstreamed.

Based-on-patches-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Jessica Yu <jeyu@kernel.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: David S . Miller <davem@davemloft.net>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Joe Lawrence <joe.lawrence@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/20180109235124.30886-3-jeyu@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
6 years agokprobes: Propagate error from arm_kprobe_ftrace()
Jessica Yu [Tue, 9 Jan 2018 23:51:23 +0000 (00:51 +0100)]
kprobes: Propagate error from arm_kprobe_ftrace()

Improve error handling when arming ftrace-based kprobes. Specifically, if
we fail to arm a ftrace-based kprobe, register_kprobe()/enable_kprobe()
should report an error instead of success. Previously, this has lead to
confusing situations where register_kprobe() would return 0 indicating
success, but the kprobe would not be functional if ftrace registration
during the kprobe arming process had failed. We should therefore take any
errors returned by ftrace into account and propagate this error so that we
do not register/enable kprobes that cannot be armed. This can happen if,
for example, register_ftrace_function() finds an IPMODIFY conflict (since
kprobe_ftrace_ops has this flag set) and returns an error. Such a conflict
is possible since livepatches also set the IPMODIFY flag for their ftrace_ops.

arm_all_kprobes() keeps its current behavior and attempts to arm all
kprobes. It returns the last encountered error and gives a warning if
not all probes could be armed.

This patch is based on Petr Mladek's original patchset (patches 2 and 3)
back in 2015, which improved kprobes error handling, found here:

   https://lkml.org/lkml/2015/2/26/452

However, further work on this had been paused since then and the patches
were not upstreamed.

Based-on-patches-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Jessica Yu <jeyu@kernel.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: David S . Miller <davem@davemloft.net>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Joe Lawrence <joe.lawrence@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/20180109235124.30886-2-jeyu@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
6 years agoMerge tag 'perf-core-for-mingo-4.17-20180215' of git://git.kernel.org/pub/scm/linux...
Ingo Molnar [Fri, 16 Feb 2018 08:10:09 +0000 (09:10 +0100)]
Merge tag 'perf-core-for-mingo-4.17-20180215' of git://git./linux/kernel/git/acme/linux into perf/urgent

Pull perf/core fixes from Arnaldo Carvalho de Melo:

- perf_mmap overwrite mode fixes/overhaul, prep work to get 'perf top'
  using it, making it bearable to use it in large core count systems
  such as Knights Landing/Mill Intel systems (Kan Liang)

- s/390 now uses syscall.tbl, just like x86-64 to generate the syscall
  table id -> string tables used by 'perf trace' (Hendrik Brueckner)

- Use strtoull() instead of home grown function (Andy Shevchenko)

- Synchronize kernel ABI headers, v4.16-rc1 (Ingo Molnar)

- Document missing 'perf data --force' option (Sangwon Hong)

- Add perf vendor JSON metrics for ARM Cortex-A53 Processor (William Cohen)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
6 years agoMerge branch 'linux-4.16' of git://github.com/skeggsb/linux into drm-fixes
Dave Airlie [Fri, 16 Feb 2018 04:26:01 +0000 (14:26 +1000)]
Merge branch 'linux-4.16' of git://github.com/skeggsb/linux into drm-fixes

single fix for older gpus.

* 'linux-4.16' of git://github.com/skeggsb/linux:
  drm/nouveau: Make clock gate support conditional

6 years agodrm/nouveau: Make clock gate support conditional
Thierry Reding [Wed, 7 Feb 2018 17:40:27 +0000 (18:40 +0100)]
drm/nouveau: Make clock gate support conditional

The recently introduced clock gate support breaks on Tegra chips because
no thermal support is enabled for those devices. Conditionalize the code
on the existence of thermal support to fix this.

Fixes: b138eca661cc ("drm/nouveau: Add support for basic clockgating on Kepler1")
Cc: Martin Peres <martin.peres@free.fr>
Cc: Lyude Paul <lyude@redhat.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
6 years agoMerge tag 'drm-intel-fixes-2018-02-14-1' of git://anongit.freedesktop.org/drm/drm...
Dave Airlie [Fri, 16 Feb 2018 02:33:03 +0000 (12:33 +1000)]
Merge tag 'drm-intel-fixes-2018-02-14-1' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

There are important fixes for VLV with MIPI/DSI panels,
2 clean-up patches needed for this MIPI/DSI fix,
and many fixes for GEM including fixes for Perf OA and PMU,
and fixes on scheduler and preemption.

This also includes GVT fixes: "This has one to fix GTT mmio 8b
access from guest and two simple ones for mmio switch and typo fix"

* tag 'drm-intel-fixes-2018-02-14-1' of git://anongit.freedesktop.org/drm/drm-intel:
  drm/i915: Fix DSI panels with v1 MIPI sequences without a DEASSERT sequence v3
  drm/i915: Free memdup-ed DSI VBT data structures on driver_unload
  drm/i915: Add intel_bios_cleanup() function
  drm/i915/vlv: Add cdclk workaround for DSI
  drm/i915/gvt: fix one typo of render_mmio trace
  drm/i915/gvt: Support BAR0 8-byte reads/writes
  drm/i915/gvt: add 0xe4f0 into gen9 render list
  drm/i915/pmu: Fix building without CONFIG_PM
  drm/i915/pmu: Fix sleep under atomic in RC6 readout
  drm/i915/pmu: Fix PMU enable vs execlists tasklet race
  drm/i915: Lock out execlist tasklet while peeking inside for busy-stats
  drm/i915/breadcrumbs: Ignore unsubmitted signalers
  drm/i915: Don't wake the device up to check if the engine is asleep
  drm/i915: Avoid truncation before clamping userspace's priority value
  drm/i915/perf: Fix compiler warning for string truncation
  drm/i915/perf: Fix compiler warning for string truncation

6 years agoMerge branch 'drm-next-4.16' of git://people.freedesktop.org/~agd5f/linux into drm...
Dave Airlie [Fri, 16 Feb 2018 02:30:41 +0000 (12:30 +1000)]
Merge branch 'drm-next-4.16' of git://people.freedesktop.org/~agd5f/linux into drm-fixes

single atpx fix

* 'drm-next-4.16' of git://people.freedesktop.org/~agd5f/linux:
  drm/amdgpu: add new device to use atpx quirk

6 years agoMerge tag 'acpi-4.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael...
Linus Torvalds [Thu, 15 Feb 2018 22:50:32 +0000 (14:50 -0800)]
Merge tag 'acpi-4.16-rc2' of git://git./linux/kernel/git/rafael/linux-pm

Pull ACPI fixes from Rafael Wysocki:
 "These fix a system resume regression from the 4.13 cycle, clean up
  device table handling in the ACPI core, update sysfs ABI documentation
  of a couple of drivers and add an expected switch fall-through marker
  to the SPCR table parsing code.

  Specifics:

   - Revert a problematic EC driver change from the 4.13 cycle that
     introduced a system resume regression on Thinkpad X240 (Rafael
     Wysocki).

   - Clean up device tables handling in the ACPI core and the related
     part of the device properties framework (Andy Shevchenko).

   - Update the sysfs ABI documentatio of the dock and the INT3407
     special device drivers (Aishwarya Pant).

   - Add an expected switch fall-through marker to the SPCR table
     parsing code (Gustavo Silva)"

* tag 'acpi-4.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: dock: document sysfs interface
  ACPI / DPTF: Document dptf_power sysfs atttributes
  device property: Constify device_get_match_data()
  ACPI / bus: Rename acpi_get_match_data() to acpi_device_get_match_data()
  ACPI / bus: Remove checks in acpi_get_match_data()
  ACPI / bus: Do not traverse through non-existed device table
  ACPI: SPCR: Mark expected switch fall-through in acpi_parse_spcr
  ACPI / EC: Restore polling during noirq suspend/resume phases

6 years agoMerge tag 'pm-4.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Linus Torvalds [Thu, 15 Feb 2018 22:40:01 +0000 (14:40 -0800)]
Merge tag 'pm-4.16-rc2' of git://git./linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
 "These fix a recently introduced build issue related to cpuidle and two
  bugs in the PM core, update cpuidle documentation and clean up memory
  allocations in the operating performance points (OPP) framework.

  Specifics:

   - Fix a recently introduced build issue related to cpuidle by
     covering all of the relevant combinations of Kconfig options
     in its header (Rafael Wysocki).

   - Add missing invocation of pm_runtime_drop_link() to the
     !CONFIG_SRCU variant of __device_link_del() (Lukas Wunner).

   - Fix unbalanced IRQ enable in the wakeup interrupts framework
     (Tony Lindgren).

   - Update cpuidle sysfs ABI documentation (Aishwarya Pant).

   - Use GFP_KERNEL instead of GFP_ATOMIC for allocating memory
     in dev_pm_opp_init_cpufreq_table() (Jia-Ju Bai)"

* tag 'pm-4.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM: cpuidle: Fix cpuidle_poll_state_init() prototype
  PM / runtime: Update links_count also if !CONFIG_SRCU
  PM / wakeirq: Fix unbalanced IRQ enable for wakeirq
  Documentation/ABI: update cpuidle sysfs documentation
  opp: cpu: Replace GFP_ATOMIC with GFP_KERNEL in dev_pm_opp_init_cpufreq_table

6 years agoMerge tag 'hwmon-for-linus-v4.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Thu, 15 Feb 2018 22:31:28 +0000 (14:31 -0800)]
Merge tag 'hwmon-for-linus-v4.16-rc2' of git://git./linux/kernel/git/groeck/linux-staging

Pull hwmon fix from Guenter Roeck:
 "Fix bad temperature display on Ryzen/Threadripper"

* tag 'hwmon-for-linus-v4.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (k10temp) Only apply temperature offset if result is positive

6 years agoMerge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Linus Torvalds [Thu, 15 Feb 2018 22:29:27 +0000 (14:29 -0800)]
Merge tag 'for_linus' of git://git./linux/kernel/git/mst/vhost

Pull virtio fixes from Michael Tsirkin:
 "This includes a bugfix for virtio 9p fs. It also fixes hybernation for
  s390 guests with virtio devices"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  virtio/s390: implement PM operations for virtio_ccw
  9p/trans_virtio: discard zero-length reply

6 years agosparc,leon: Select USB_UHCI_BIG_ENDIAN_{MMIO,DESC}
James Hogan [Wed, 31 Jan 2018 22:24:46 +0000 (22:24 +0000)]
sparc,leon: Select USB_UHCI_BIG_ENDIAN_{MMIO,DESC}

Now that USB_UHCI_BIG_ENDIAN_MMIO and USB_UHCI_BIG_ENDIAN_DESC are moved
outside of the USB_SUPPORT conditional, simply select them from
SPARC_LEON rather than by the symbol's defaults in drivers/usb/Kconfig,
similar to how it is done for USB_EHCI_BIG_ENDIAN_MMIO and
USB_EHCI_BIG_ENDIAN_DESC.

Signed-off-by: James Hogan <jhogan@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Corentin Labbe <clabbe.montjoie@gmail.com>
Cc: sparclinux@vger.kernel.org
Cc: linux-usb@vger.kernel.org
Acked-by: David S. Miller <davem@davemloft.net>
Patchwork: https://patchwork.linux-mips.org/patch/18560/

6 years agousb: Move USB_UHCI_BIG_ENDIAN_* out of USB_SUPPORT
James Hogan [Wed, 31 Jan 2018 22:24:45 +0000 (22:24 +0000)]
usb: Move USB_UHCI_BIG_ENDIAN_* out of USB_SUPPORT

Move the Kconfig symbols USB_UHCI_BIG_ENDIAN_MMIO and
USB_UHCI_BIG_ENDIAN_DESC out of drivers/usb/host/Kconfig, which is
conditional upon USB && USB_SUPPORT, so that it can be freely selected
by platform Kconfig symbols in architecture code.

For example once the MIPS_GENERIC platform selects are fixed in commit
2e6522c56552 ("MIPS: Fix typo BIG_ENDIAN to CPU_BIG_ENDIAN"), the MIPS
32r6_defconfig warns like so:

warning: (MIPS_GENERIC) selects USB_UHCI_BIG_ENDIAN_MMIO which has unmet direct dependencies (USB_SUPPORT && USB)
warning: (MIPS_GENERIC) selects USB_UHCI_BIG_ENDIAN_DESC which has unmet direct dependencies (USB_SUPPORT && USB)

Fixes: 2e6522c56552 ("MIPS: Fix typo BIG_ENDIAN to CPU_BIG_ENDIAN")
Signed-off-by: James Hogan <jhogan@kernel.org>
Cc: Corentin Labbe <clabbe.montjoie@gmail.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@mips.com>
Cc: linux-usb@vger.kernel.org
Cc: linux-mips@linux-mips.org
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Patchwork: https://patchwork.linux-mips.org/patch/18559/

6 years agoMerge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Thu, 15 Feb 2018 17:28:47 +0000 (09:28 -0800)]
Merge branch 'sched-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull scheduler fixes from Ingo Molnar:
 "Misc fixes:

   - fix rq->lock lockdep annotation bug

   - fix/improve update_curr_rt() and update_curr_dl() accounting

   - update documentation

   - remove unused macro"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/cpufreq: Remove unused SUGOV_KTHREAD_PRIORITY macro
  sched/core: Fix DEBUG_SPINLOCK annotation for rq->lock
  sched/rt: Make update_curr_rt() more accurate
  sched/deadline: Make update_curr_dl() more accurate
  membarrier-sync-core: Document architecture support

6 years agoMerge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Thu, 15 Feb 2018 17:05:26 +0000 (09:05 -0800)]
Merge branch 'locking-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull locking fixes from Ingo Molnar:
 "This contains two qspinlock fixes and three documentation and comment
  fixes"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/semaphore: Update the file path in documentation
  locking/atomic/bitops: Document and clarify ordering semantics for failed test_and_{}_bit()
  locking/qspinlock: Ensure node->count is updated before initialising node
  locking/qspinlock: Ensure node is initialised before updating prev->next
  Documentation/locking/mutex-design: Update to reflect latest changes

6 years agoblock: fix a typo in comment of BLK_MQ_POLL_STATS_BKTS
Minwoo Im [Thu, 15 Feb 2018 14:53:17 +0000 (23:53 +0900)]
block: fix a typo in comment of BLK_MQ_POLL_STATS_BKTS

Update comment typo _consisitent_ to _consistent_ from following commit.
commit 0206319fdfee ("blk-mq: Fix poll_stat for new size-based bucketing.")

Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Minwoo Im <minwoo.im.dev@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
6 years agoRevert "tools include s390: Grab a copy of arch/s390/include/uapi/asm/unistd.h"
Hendrik Brueckner [Thu, 8 Feb 2018 11:47:48 +0000 (12:47 +0100)]
Revert "tools include s390: Grab a copy of arch/s390/include/uapi/asm/unistd.h"

This reverts commit f120c7b187e6c418238710b48723ce141f467543 which is no
longer required with the introduction of a syscall.tbl on s390.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: linux-s390@vger.kernel.org
LPU-Reference: 1518090470-2899-2-git-send-email-brueckner@linux.vnet.ibm.com
Link: https://lkml.kernel.org/n/tip-q1lg0nvhha1tk39ri9aqalcb@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
6 years agoperf s390: Rework system call table creation by using syscall.tbl
Hendrik Brueckner [Thu, 8 Feb 2018 11:47:50 +0000 (12:47 +0100)]
perf s390: Rework system call table creation by using syscall.tbl

Recently, s390 uses a syscall.tbl input file to generate its system call
table and unistd uapi header files.  Hence, update mksyscalltbl to use
it as input to create the system table for perf.

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: linux-s390@vger.kernel.org
LPU-Reference: 1518090470-2899-4-git-send-email-brueckner@linux.vnet.ibm.com
Link: https://lkml.kernel.org/n/tip-bdyhllhsq1zgxv2qx4m377y6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
6 years agoperf s390: Grab a copy of arch/s390/kernel/syscall/syscall.tbl
Hendrik Brueckner [Thu, 8 Feb 2018 11:47:49 +0000 (12:47 +0100)]
perf s390: Grab a copy of arch/s390/kernel/syscall/syscall.tbl

Grab a copy of the s390 system call table file introduced with commit
857f46bfb07f53dc112d69bdfb137cc5ec3da7c5 "s390/syscalls: add system call
table".

Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: linux-s390@vger.kernel.org
LPU-Reference: 1518090470-2899-3-git-send-email-brueckner@linux.vnet.ibm.com
Link: https://lkml.kernel.org/n/tip-hpw7vdjp7g92ivgpddrp5ydq@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
6 years agotools/headers: Synchronize kernel ABI headers, v4.16-rc1
Ingo Molnar [Tue, 13 Feb 2018 11:54:58 +0000 (12:54 +0100)]
tools/headers: Synchronize kernel ABI headers, v4.16-rc1

Sync the following tooling headers with the latest kernel version:

  tools/arch/powerpc/include/uapi/asm/kvm.h
  tools/arch/x86/include/asm/cpufeatures.h
  tools/include/uapi/drm/i915_drm.h
  tools/include/uapi/linux/if_link.h
  tools/include/uapi/linux/kvm.h

All the changes are new ABI additions which don't impact their use
in existing tooling.

Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
6 years agoperf test: Fix test trace+probe_libc_inet_pton.sh for s390x
Thomas Richter [Wed, 17 Jan 2018 08:38:31 +0000 (09:38 +0100)]
perf test: Fix test trace+probe_libc_inet_pton.sh for s390x

On Intel test case trace+probe_libc_inet_pton.sh succeeds and the
output is:

[root@f27 perf]# ./perf trace --no-syscalls
                  -e probe_libc:inet_pton/max-stack=3/ ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.037 ms

 --- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.037/0.037/0.037/0.000 ms
     0.000 probe_libc:inet_pton:(7fa40ac618a0))
              __GI___inet_pton (/usr/lib64/libc-2.26.so)
              getaddrinfo (/usr/lib64/libc-2.26.so)
              main (/usr/bin/ping)

The kernel stack unwinder is used, it is specified implicitly
as call-graph=fp (frame pointer).

On s390x only dwarf is available for stack unwinding. It is also
done in user space. This requires different parameter setup
and result checking for s390x and Intel.

This patch adds separate perf trace setup and result checking
for Intel and s390x. On s390x specify this command line to
get a call-graph and handle the different call graph result
checking:

[root@s35lp76 perf]# ./perf trace --no-syscalls
-e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.041 ms

 --- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.041/0.041/0.041/0.000 ms
     0.000 probe_libc:inet_pton:(3ffb9942060))
            __GI___inet_pton (/usr/lib64/libc-2.26.so)
            gaih_inet (inlined)
            __GI_getaddrinfo (inlined)
            main (/usr/bin/ping)
            __libc_start_main (/usr/lib64/libc-2.26.so)
            _start (/usr/bin/ping)
[root@s35lp76 perf]#

Before:
[root@s8360047 perf]# ./perf test -vv 58
58: probe libc's inet_pton & backtrace it with ping       :
 --- start ---
test child forked, pid 26349
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.079 ms
 --- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.079/0.079/0.079/0.000 ms
0.000 probe_libc:inet_pton:(3ff925c2060))
test child finished with -1
 ---- end ----
probe libc's inet_pton & backtrace it with ping: FAILED!
[root@s8360047 perf]#

After:
[root@s35lp76 perf]# ./perf test -vv 57
57: probe libc's inet_pton & backtrace it with ping       :
 --- start ---
test child forked, pid 38708
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.038 ms
 --- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.038/0.038/0.038/0.000 ms
0.000 probe_libc:inet_pton:(3ff87342060))
__GI___inet_pton (/usr/lib64/libc-2.26.so)
gaih_inet (inlined)
__GI_getaddrinfo (inlined)
main (/usr/bin/ping)
__libc_start_main (/usr/lib64/libc-2.26.so)
_start (/usr/bin/ping)
test child finished with 0
 ---- end ----
probe libc's inet_pton & backtrace it with ping: Ok
[root@s35lp76 perf]#

On Intel the test case runs unchanged and succeeds.

Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20180117083831.101001-1-tmricht@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
6 years agoperf data: Document missing --force option
Sangwon Hong [Mon, 5 Feb 2018 11:48:35 +0000 (20:48 +0900)]
perf data: Document missing --force option

Add the --force option to the man page.

Signed-off-by: Sangwon Hong <qpakzk@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Taeung Song <treeze.taeung@gmail.com>
Link: http://lkml.kernel.org/r/1517831315-31490-1-git-send-email-qpakzk@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
6 years agoperf tools: Substitute yet another strtoull()
Andy Shevchenko [Mon, 29 Jan 2018 13:03:59 +0000 (15:03 +0200)]
perf tools: Substitute yet another strtoull()

Instead of home grown function let's use what library provides us.

Signed-off-by: Andriy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20180129130359.1490-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
6 years agoperf top: Check the latency of perf_top__mmap_read()
Kan Liang [Thu, 18 Jan 2018 21:26:32 +0000 (13:26 -0800)]
perf top: Check the latency of perf_top__mmap_read()

The latency of perf_top__mmap_read() should be lower than refresh time.
If not, give some hints to reduce the latency.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1516310792-208685-18-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
6 years agoperf top: Switch default mode to overwrite mode
Kan Liang [Thu, 18 Jan 2018 21:26:31 +0000 (13:26 -0800)]
perf top: Switch default mode to overwrite mode

perf_top__mmap_read() has a severe performance issue in the Knights
Landing/Mill platform, when monitoring heavy load systems. It costs
several minutes to finish, which is unacceptable.

Currently, 'perf top' uses the non overwrite mode. For non overwrite
mode, it tries to read everything in the ringbuffer and doesn't pause
it. Once there are lots of samples delivered persistently, the
processing time could be very long. Also, the latest samples could be
lost when the ringbuffer is full.

For overwrite mode, it takes a snapshot for the system by pausing the
ringbuffer, which could significantly reduce the processing time.  Also,
the overwrite mode always keep the latest samples.  Considering the real
time requirement for 'perf top', the overwrite mode is more suitable for
it.

Actually, 'perf top' was overwrite mode. It is changed to non overwrite
mode since commit 93fc64f14472 ("perf top: Switch to non overwrite
mode"). It's better to change it back to overwrite mode by default.

For the kernel which doesn't support overwrite mode, it will fall back
to non overwrite mode.

There would be some records lost in overwrite mode because of pausing
the ringbuffer. It has little impact for the accuracy of the snapshot
and can be tolerated.

For overwrite mode, unconditionally wait 100 ms before each snapshot. It
also reduces the overhead caused by pausing ringbuffer, especially on
light load system.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1516310792-208685-17-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
6 years agoperf top: Remove lost events checking
Kan Liang [Thu, 18 Jan 2018 21:26:30 +0000 (13:26 -0800)]
perf top: Remove lost events checking

There would be some records lost in overwrite mode because of pausing
the ringbuffer. It has little impact for the accuracy of the snapshot
and could be tolerated by 'perf top'.

Remove the lost events checking.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1516310792-208685-16-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
6 years agoperf hists browser: Add parameter to disable lost event warning
Kan Liang [Thu, 18 Jan 2018 21:26:29 +0000 (13:26 -0800)]
perf hists browser: Add parameter to disable lost event warning

For overwrite mode, the ringbuffer will be paused. The event lost is
expected. It needs a way to notify the browser not print the warning.

It will be used later for perf top to disable lost event warning in
overwrite mode. There is no behavior change for now.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1516310792-208685-15-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
6 years agoperf top: Add overwrite fall back
Kan Liang [Thu, 18 Jan 2018 21:26:28 +0000 (13:26 -0800)]
perf top: Add overwrite fall back

Switch to non-overwrite mode if kernel doesnot support overwrite
ringbuffer.

It's only effect when overwrite mode is supported.  No change to current
behavior.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1516310792-208685-14-git-send-email-kan.liang@intel.com
[ Use perf_missing_features.write_backward instead of the non merged is_write_backward_fail() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
6 years agoperf evsel: Expose the perf_missing_features struct
Arnaldo Carvalho de Melo [Fri, 2 Feb 2018 14:27:25 +0000 (11:27 -0300)]
perf evsel: Expose the perf_missing_features struct

As tools may need to adjust to missing features, as 'perf top' will, in
the next csets, to cope with a missing 'write_backward' feature.

Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-jelngl9q1ooaizvkcput9tic@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
6 years agoperf top: Check per-event overwrite term
Kan Liang [Thu, 18 Jan 2018 21:26:26 +0000 (13:26 -0800)]
perf top: Check per-event overwrite term

Per-event overwrite term is not forbidden in 'perf top', which can bring
problems. Because 'perf top' only support non-overwrite mode now.

Add new rules and check regarding to overwrite term for 'perf top'.
- All events either have same per-event term or don't have per-event
  mode setting. Otherwise, it will error out.
- Per-event overwrite term should be consistent as opts->overwrite.
  If not, updating the opts->overwrite according to per-event term.

Make it possible to support either non-overwrite or overwrite mode.
The overwrite mode is forbidden now, which will be removed when the
overwrite mode is supported later.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1516310792-208685-12-git-send-email-kan.liang@intel.com
[ Renamed perf_top_overwrite_check to perf_top__overwrite_check, to follow existing convention ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
6 years agoperf mmap: Discard legacy interface for mmap read
Kan Liang [Thu, 18 Jan 2018 21:26:25 +0000 (13:26 -0800)]
perf mmap: Discard legacy interface for mmap read

Discards perf_mmap__read_backward() and perf_mmap__read_catchup(). No
tools use them.

There are tools still use perf_mmap__read_forward(). Keep it, but add
comments to point to the new interface for future use.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1516310792-208685-11-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
6 years agoperf test: Update mmap read functions for backward-ring-buffer test
Kan Liang [Thu, 18 Jan 2018 21:26:24 +0000 (13:26 -0800)]
perf test: Update mmap read functions for backward-ring-buffer test

Use the new perf_mmap__read_* interfaces for overwrite ringbuffer test.

Commiter notes:

Testing:

  [root@seventh ~]# perf test -v backward
  48: Read backward ring buffer                             :
  --- start ---
  test child forked, pid 8309
  Using CPUID GenuineIntel-6-9E
  mmap size 1052672B
  mmap size 8192B
  Finished reading overwrite ring buffer: rewind
  test child finished with 0
  ---- end ----
  Read backward ring buffer: Ok
  [root@seventh ~]#

Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1516310792-208685-10-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
6 years agoperf mmap: Introduce perf_mmap__read_event()
Kan Liang [Thu, 18 Jan 2018 21:26:23 +0000 (13:26 -0800)]
perf mmap: Introduce perf_mmap__read_event()

Except for 'perf record', the other perf tools read events one by one
from the ring buffer using perf_mmap__read_forward(). But it only
supports non-overwrite mode.

Introduce perf_mmap__read_event() to support both non-overwrite and
overwrite mode.

Usage:
perf_mmap__read_init()
while(event = perf_mmap__read_event()) {
        //process the event
        perf_mmap__consume()
}
perf_mmap__read_done()

It cannot use perf_mmap__read_backward(). Because it always reads the
stale buffer which is already processed. Furthermore, the forward and
backward concepts have been removed. The perf_mmap__read_backward() will
be replaced and discarded later.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1516310792-208685-9-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
6 years agoperf mmap: Introduce perf_mmap__read_done()
Kan Liang [Thu, 18 Jan 2018 21:26:22 +0000 (13:26 -0800)]
perf mmap: Introduce perf_mmap__read_done()

The direction of overwrite mode is backward. The last perf_mmap__read()
will set tail to map->prev. Need to correct the map->prev to head which
is the end of next read.

It will be used later.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1516310792-208685-8-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
6 years agoperf mmap: Discard 'prev' in perf_mmap__read()
Kan Liang [Thu, 18 Jan 2018 21:26:21 +0000 (13:26 -0800)]
perf mmap: Discard 'prev' in perf_mmap__read()

The 'start' and 'prev' variables are duplicates in perf_mmap__read().

Use 'map->prev' to replace 'start' in perf_mmap__read_*().

Suggested-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1516310792-208685-7-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
6 years agoperf mmap: Add new return value logic for perf_mmap__read_init()
Kan Liang [Thu, 18 Jan 2018 21:26:20 +0000 (13:26 -0800)]
perf mmap: Add new return value logic for perf_mmap__read_init()

Improve the readability by using meaningful enum (-EAGAIN, -EINVAL and
0) to replace the three returning states (0, -1 and 1).

Suggested-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1516310792-208685-6-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
6 years agoperf mmap: Introduce perf_mmap__read_init()
Kan Liang [Thu, 18 Jan 2018 21:26:19 +0000 (13:26 -0800)]
perf mmap: Introduce perf_mmap__read_init()

The new function perf_mmap__read_init() is factored out from
perf_mmap__push().

It is to calculate the 'start' and 'end' of the available data in
ringbuffer.

No functional change.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1516310792-208685-5-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
6 years agoperf mmap: Cleanup perf_mmap__push()
Kan Liang [Thu, 18 Jan 2018 21:26:18 +0000 (13:26 -0800)]
perf mmap: Cleanup perf_mmap__push()

The first assignment for 'start' and 'end' is redundant.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1516310792-208685-4-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
6 years agoperf mmap: Recalculate size for overwrite mode
Kan Liang [Thu, 18 Jan 2018 21:26:17 +0000 (13:26 -0800)]
perf mmap: Recalculate size for overwrite mode

In perf_mmap__push(), the 'size' need to be recalculated, otherwise the
invalid data might be pushed to the record in overwrite mode.

The issue is introduced by commit 7fb4b407a124 ("perf mmap: Don't
discard prev in backward mode").

When the ring buffer is full in overwrite mode, backward_rb_find_range()
will be called to recalculate the 'start' and 'end'. The 'size' needs to
be recalculated accordingly.

Unconditionally recalculate the 'size', not just for full ring buffer in
overwrite mode. Because:

- There is no harmful to recalculate the 'size' for other cases.
- The code of calculating 'start' and 'end' will be factored out later.
  The new function does not need to return 'size'.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 7fb4b407a124 ("perf mmap: Don't discard prev in backward mode")
Link: http://lkml.kernel.org/r/1516310792-208685-3-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>