Thomas Gleixner [Thu, 6 Sep 2018 12:56:44 +0000 (14:56 +0200)]
Merge tag 'perf-core-for-mingo-4.20-
20180905' of git://git./linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo:
perf trace:
- Augment the payload of syscall entry/exit tracepoints with the contents
of pointer arguments, such as the "filename" argument to the "open"
syscall or the 'struct sockaddr *' argument to the 'connect' syscall.
This is done using a BPF program that gets compiled and attached to
various syscalls:sys_enter_NAME tracepoints, copying via a BPF map and
"bpf-output" perf event the raw_syscalls:sys_enter tracepoint payload +
the contents of pointer arguments using the "probe_read", "probe_read_str"
and "perf_event_output" BPF functions.
The 'perf trace' codebase now just processes these augmented tracepoints
using the existing beautifiers that now check if there is more in the
perf_sample->raw_data than what is expected for a normal syscall enter
tracepoint (the common preamble, syscall id, up to six parameters),
using that with hand crafted struct beautifiers.
This is just to show how to augment the existing tracepoints, work will
be done to use DWARF or BTF info to do the pretty-printing and to create
the collectors.
For now this is done using an example restricted C BPF program, but the
end goal is to have this all autogenerated and done transparently.
Its still useful to have this example as one can use it as an skeleton and
write more involved filters, see the etcsnoop.c BPF example, for instance.
E.g.:
# cd tools/perf/examples/bpf/
# perf trace -e augmented_syscalls.c ping -c 1 ::1
0.000 ( 0.008 ms): openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) = 3
0.020 ( 0.004 ms): openat(dfd: CWD, filename: /lib64/libcap.so.2, flags: CLOEXEC) = 3
0.051 ( 0.004 ms): openat(dfd: CWD, filename: /lib64/libidn.so.11, flags: CLOEXEC) = 3
0.076 ( 0.003 ms): openat(dfd: CWD, filename: /lib64/libcrypto.so.1.1, flags: CLOEXEC) = 3
0.106 ( 0.003 ms): openat(dfd: CWD, filename: /lib64/libresolv.so.2, flags: CLOEXEC) = 3
0.136 ( 0.004 ms): openat(dfd: CWD, filename: /lib64/libm.so.6, flags: CLOEXEC) = 3
0.194 ( 0.004 ms): openat(dfd: CWD, filename: /lib64/libc.so.6, flags: CLOEXEC) = 3
0.224 ( 0.003 ms): openat(dfd: CWD, filename: /lib64/libz.so.1, flags: CLOEXEC) = 3
0.252 ( 0.004 ms): openat(dfd: CWD, filename: /lib64/libdl.so.2, flags: CLOEXEC) = 3
0.275 ( 0.003 ms): openat(dfd: CWD, filename: /lib64/libpthread.so.0, flags: CLOEXEC) = 3
0.730 ( 0.007 ms): open(filename: /usr/lib/locale/locale-archive, flags: CLOEXEC) = 3
PING ::1(::1) 56 data bytes
0.834 ( 0.008 ms): connect(fd: 5, uservaddr: { .family: INET6, port: 1025, addr: ::1 }, addrlen: 28) = 0
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.032 ms
--- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.032/0.032/0.032/0.000 ms
0.914 ( 0.036 ms): sendto(fd: 4<socket:[843044]>, buff: 0x55b5e52e9720, len: 64, addr: { .family: INET6, port: 58, addr: ::1 }, addr_len: 28) = 64
#
Use 'perf trace -e augmented_syscalls.c,close ping -c 1 ::1' to see the
'close' calls as well, as it is not one of the syscalls augmented in that .c
file.
(Arnaldo Carvalho de Melo)
- Alias 'umount' to 'umount2' (Benjamin Peterson)
perf stat: (Jiri Olsa)
- Make many builtin-stat.c functions generic, moving display functions
to a separate file, prep work for adding the ability to store/display
stat data in perf record/top.
perf annotate: (Kim Phillips)
- Handle arm64 move instructions
perf report: (Thomas Richter):
- Create auxiliary trace data files for s390
libtraceevent: (Tzvetomir Stoyanov (VMware)):
- Split trace-seq related APIs in a separate header file.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Wed, 5 Sep 2018 13:47:56 +0000 (10:47 -0300)]
perf tests: Fix record+probe_libc_inet_pton.sh without ping's debuginfo
When we don't have the iputils-debuginfo package installed, i.e. when we
don't have the DWARF information needed to resolve ping's samples, we
end up failing this 'perf test' entry:
# perf test ping
62: probe libc's inet_pton & backtrace it with ping : Ok
# rpm -e iputils-debuginfo
# perf test ping
62: probe libc's inet_pton & backtrace it with ping : FAILED!
#
Fix it to accept "[unknown]" where the symbol + offset, when resolved,
is expected.
I think this will fail in the other arches as well, but since I can't
test now, I'm leaving s390x and ppc cases as-is.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Sandipan Das <sandipan@linux.vnet.ibm.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes:
7903a7086723 ("perf script: Show symbol offsets by default")
Link: https://lkml.kernel.org/n/tip-hnizqwqrs03vcq1b74yao0f6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Tue, 4 Sep 2018 13:43:07 +0000 (10:43 -0300)]
perf map: Turn some pr_warning() to pr_debug()
Annoying when using it with --stdio/--stdio2, so just turn them debug,
we can get those using -v.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-t3684lkugnf1w4lwcmpj9ivm@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Mon, 3 Sep 2018 19:29:39 +0000 (16:29 -0300)]
perf trace: Use the raw_syscalls:sys_enter for the augmented syscalls
Now we combine what comes from the "bpf-output" event, i.e. what is
added in the augmented_syscalls.c BPF program via the
__augmented_syscalls__ BPF map, i.e. the payload we get with
raw_syscalls:sys_enter tracepoints plus the pointer contents, right
after that payload, with the raw_syscall:sys_exit also added, without
augmentation, in the augmented_syscalls.c program.
The end result is that for the hooked syscalls, we get strace like
output with pointer expansion, something that wasn't possible before
with just raw_syscalls:sys_enter + raw_syscalls:sys_exit.
E.g.:
# perf trace -e tools/perf/examples/bpf/augmented_syscalls.c ping -c 2 ::1
0.000 ( 0.008 ms): ping/19573 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) = 3
0.036 ( 0.006 ms): ping/19573 openat(dfd: CWD, filename: /lib64/libcap.so.2, flags: CLOEXEC) = 3
0.070 ( 0.004 ms): ping/19573 openat(dfd: CWD, filename: /lib64/libidn.so.11, flags: CLOEXEC) = 3
0.095 ( 0.004 ms): ping/19573 openat(dfd: CWD, filename: /lib64/libcrypto.so.1.1, flags: CLOEXEC) = 3
0.127 ( 0.004 ms): ping/19573 openat(dfd: CWD, filename: /lib64/libresolv.so.2, flags: CLOEXEC) = 3
0.156 ( 0.004 ms): ping/19573 openat(dfd: CWD, filename: /lib64/libm.so.6, flags: CLOEXEC) = 3
0.181 ( 0.004 ms): ping/19573 openat(dfd: CWD, filename: /lib64/libc.so.6, flags: CLOEXEC) = 3
0.212 ( 0.004 ms): ping/19573 openat(dfd: CWD, filename: /lib64/libz.so.1, flags: CLOEXEC) = 3
0.242 ( 0.004 ms): ping/19573 openat(dfd: CWD, filename: /lib64/libdl.so.2, flags: CLOEXEC) = 3
0.266 ( 0.003 ms): ping/19573 openat(dfd: CWD, filename: /lib64/libpthread.so.0, flags: CLOEXEC) = 3
0.709 ( 0.006 ms): ping/19573 open(filename: /usr/lib/locale/locale-archive, flags: CLOEXEC) = 3
PING ::1(::1) 56 data bytes
1.133 ( 0.011 ms): ping/19573 connect(fd: 5, uservaddr: { .family: INET6, port: 1025, addr: ::1 }, addrlen: 28) = 0
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.033 ms
1.234 ( 0.036 ms): ping/19573 sendto(fd: 4<socket:[1498931]>, buff: 0x555e5b975720, len: 64, addr: { .family: INET6, port: 58, addr: ::1 }, addr_len: 28) = 64
64 bytes from ::1: icmp_seq=2 ttl=64 time=0.120 ms
--- ::1 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 0.033/0.076/0.120/0.044 ms
1002.060 ( 0.129 ms): ping/19573 sendto(fd: 4<socket:[1498931]>, buff: 0x555e5b975720, len: 64, flags: CONFIRM, addr: { .family: INET6, port: 58, addr: ::1 }, addr_len: 28) = 64
#
# perf trace -e tools/perf/examples/bpf/augmented_syscalls.c cat tools/perf/examples/bpf/hello.c
#include <stdio.h>
int syscall_enter(openat)(void *args)
{
puts("Hello, world\n");
return 0;
}
license(GPL);
0.000 ( 0.008 ms): cat/20054 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC) = 3
0.020 ( 0.005 ms): cat/20054 openat(dfd: CWD, filename: /lib64/libc.so.6, flags: CLOEXEC) = 3
0.176 ( 0.011 ms): cat/20054 open(filename: /usr/lib/locale/locale-archive, flags: CLOEXEC) = 3
0.243 ( 0.006 ms): cat/20054 openat(dfd: CWD, filename: tools/perf/examples/bpf/hello.c) = 3
#
Now to think how to hook on all syscalls, fallbacking to the non-augmented
raw_syscalls:sys_enter payload.
Probably the best way is to use a BPF_MAP_TYPE_PROG_ARRAY just like
samples/bpf/tracex5_kern.c does.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-nlt60y69o26xi59z5vtpdrj5@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Mon, 3 Sep 2018 19:24:09 +0000 (16:24 -0300)]
perf trace: Setup augmented_args in the raw_syscalls:sys_enter handler
Without using something to augment the raw_syscalls:sys_enter tracepoint
payload with the pointer contents, this will work just like before, i.e.
the augmented_args arg will be NULL and the augmented_args_size will be
0.
This just paves the way for the next cset where we will associate the
trace__sys_enter tracepoint handler with the augmented "bpf-output"
event named "__augmented_args__".
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-p8uvt2a6ug3uwlhja3cno4la@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Mon, 3 Sep 2018 19:07:53 +0000 (16:07 -0300)]
perf trace: Introduce syscall__augmented_args() method
That will be used by trace__sys_enter when we start combining the
augmented syscalls:sys_enter_FOO + syscalls:sys_exit_FOO.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-iiseo3s0qbf9i3rzn8k597bv@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Mon, 3 Sep 2018 18:18:37 +0000 (15:18 -0300)]
perf augmented_syscalls: Avoid optimization to pass older BPF validators
See https://www.spinics.net/lists/netdev/msg480099.html for the whole
discussio, but to make the augmented_syscalls.c BPF program to get built
and loaded successfully in a greater range of kernels, add an extra
check.
Related patch:
a60dd35d2e39 ("bpf: change bpf_perf_event_output arg5 type to ARG_CONST_SIZE_OR_ZERO")
That is in the kernel since v4.15, I couldn't figure why this is hitting
me with 4.17.17, but adding the workaround discussed there makes this
work with this fedora kernel and with 4.18.recent.
Before:
# uname -a
Linux seventh 4.17.17-100.fc27.x86_64 #1 SMP Mon Aug 20 15:53:11 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
# perf trace -e tools/perf/examples/bpf/augmented_syscalls.c cat /etc/passwd > /dev/null
libbpf: load bpf program failed: Permission denied
libbpf: -- BEGIN DUMP LOG ---
libbpf:
0: (bf) r6 = r1
1: (b7) r1 = 0
2: (7b) *(u64 *)(r10 -8) = r1
3: (7b) *(u64 *)(r10 -16) = r1
4: (7b) *(u64 *)(r10 -24) = r1
5: (7b) *(u64 *)(r10 -32) = r1
6: (7b) *(u64 *)(r10 -40) = r1
7: (7b) *(u64 *)(r10 -48) = r1
8: (7b) *(u64 *)(r10 -56) = r1
9: (7b) *(u64 *)(r10 -64) = r1
10: (7b) *(u64 *)(r10 -72) = r1
11: (7b) *(u64 *)(r10 -80) = r1
12: (7b) *(u64 *)(r10 -88) = r1
13: (7b) *(u64 *)(r10 -96) = r1
14: (7b) *(u64 *)(r10 -104) = r1
15: (7b) *(u64 *)(r10 -112) = r1
16: (7b) *(u64 *)(r10 -120) = r1
17: (7b) *(u64 *)(r10 -128) = r1
18: (7b) *(u64 *)(r10 -136) = r1
19: (7b) *(u64 *)(r10 -144) = r1
20: (7b) *(u64 *)(r10 -152) = r1
21: (7b) *(u64 *)(r10 -160) = r1
22: (7b) *(u64 *)(r10 -168) = r1
23: (7b) *(u64 *)(r10 -176) = r1
24: (7b) *(u64 *)(r10 -184) = r1
25: (7b) *(u64 *)(r10 -192) = r1
26: (7b) *(u64 *)(r10 -200) = r1
27: (7b) *(u64 *)(r10 -208) = r1
28: (7b) *(u64 *)(r10 -216) = r1
29: (7b) *(u64 *)(r10 -224) = r1
30: (7b) *(u64 *)(r10 -232) = r1
31: (7b) *(u64 *)(r10 -240) = r1
32: (7b) *(u64 *)(r10 -248) = r1
33: (7b) *(u64 *)(r10 -256) = r1
34: (7b) *(u64 *)(r10 -264) = r1
35: (7b) *(u64 *)(r10 -272) = r1
36: (7b) *(u64 *)(r10 -280) = r1
37: (7b) *(u64 *)(r10 -288) = r1
38: (7b) *(u64 *)(r10 -296) = r1
39: (7b) *(u64 *)(r10 -304) = r1
40: (7b) *(u64 *)(r10 -312) = r1
41: (bf) r7 = r10
42: (07) r7 += -312
43: (bf) r1 = r7
44: (b7) r2 = 48
45: (bf) r3 = r6
46: (85) call bpf_probe_read#4
47: (79) r3 = *(u64 *)(r6 +24)
48: (bf) r1 = r10
49: (07) r1 += -256
50: (b7) r8 = 256
51: (b7) r2 = 256
52: (85) call bpf_probe_read_str#45
53: (bf) r1 = r0
54: (67) r1 <<= 32
55: (77) r1 >>= 32
56: (bf) r5 = r0
57: (07) r5 += 56
58: (2d) if r8 > r1 goto pc+1
R0=inv(id=0) R1=inv(id=0,umin_value=256,umax_value=
4294967295,var_off=(0x0; 0xffffffff)) R5=inv(id=0) R6=ctx(id=0,off=0,imm=0) R7=fp-312,call_-1 R8=inv256 R10=fp0,call_-1 fp-264=0
59: (b7) r5 = 312
60: (63) *(u32 *)(r10 -264) = r0
61: (67) r5 <<= 32
62: (77) r5 >>= 32
63: (bf) r1 = r6
64: (18) r2 = 0xffff8b9120cc8500
66: (18) r3 = 0xffffffff
68: (bf) r4 = r7
69: (85) call bpf_perf_event_output#25
70: (b7) r0 = 0
71: (95) exit
from 58 to 60: R0=inv(id=0) R1=inv(id=0,umax_value=255,var_off=(0x0; 0xff)) R5=inv(id=0) R6=ctx(id=0,off=0,imm=0) R7=fp-312,call_-1 R8=inv256 R10=fp0,call_-1 fp-264=0
60: (63) *(u32 *)(r10 -264) = r0
61: (67) r5 <<= 32
62: (77) r5 >>= 32
63: (bf) r1 = r6
64: (18) r2 = 0xffff8b9120cc8500
66: (18) r3 = 0xffffffff
68: (bf) r4 = r7
69: (85) call bpf_perf_event_output#25
R5 unbounded memory access, use 'var &= const' or 'if (var < const)'
libbpf: -- END LOG --
libbpf: failed to load program 'syscalls:sys_enter_openat'
libbpf: failed to load object 'tools/perf/examples/bpf/augmented_syscalls.c'
bpf: load objects failed: err=-4007: (Kernel verifier blocks program loading)
event syntax error: 'tools/perf/examples/bpf/augmented_syscalls.c'
\___ Kernel verifier blocks program loading
After:
# perf trace -e tools/perf/examples/bpf/augmented_syscalls.c cat /etc/passwd > /dev/null
0.000 cat/29249 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
0.008 cat/29249 syscalls:sys_exit_openat:0x3
0.021 cat/29249 openat(dfd: CWD, filename: /lib64/libc.so.6, flags: CLOEXEC)
0.025 cat/29249 syscalls:sys_exit_openat:0x3
0.180 cat/29249 open(filename: /usr/lib/locale/locale-archive, flags: CLOEXEC)
0.185 cat/29249 syscalls:sys_exit_open:0x3
0.242 cat/29249 openat(dfd: CWD, filename: /etc/passwd)
0.245 cat/29249 syscalls:sys_exit_openat:0x3
#
It also works with a more recent kernel:
# uname -a
Linux jouet 4.18.0-00014-g4e67b2a5df5d #6 SMP Thu Aug 30 17:34:17 -03 2018 x86_64 x86_64 x86_64 GNU/Linux
# perf trace -e tools/perf/examples/bpf/augmented_syscalls.c cat /etc/passwd > /dev/null
0.000 cat/26451 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
0.020 cat/26451 syscalls:sys_exit_openat:0x3
0.039 cat/26451 openat(dfd: CWD, filename: /lib64/libc.so.6, flags: CLOEXEC)
0.044 cat/26451 syscalls:sys_exit_openat:0x3
0.231 cat/26451 open(filename: /usr/lib/locale/locale-archive, flags: CLOEXEC)
0.238 cat/26451 syscalls:sys_exit_open:0x3
0.278 cat/26451 openat(dfd: CWD, filename: /etc/passwd)
0.282 cat/26451 syscalls:sys_exit_openat:0x3
#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: Gianluca Borello <g.borello@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Yonghong Song <yhs@fb.com>
Link: https://lkml.kernel.org/n/tip-wkpsivs1a9afwldbul46btbv@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Mon, 3 Sep 2018 18:02:22 +0000 (15:02 -0300)]
perf augmented_syscalls: Check probe_read_str() return separately
Using a value returned from probe_read_str() to tell how many bytes to
copy using perf_event_output() has issues in some older kernels, like
4.17.17-100.fc27.x86_64, so separate the bounds checking done on how
many bytes to copy to a separate variable, so that the next patch has
only what is being done to make the test pass on older BPF validators.
For reference, see the discussion in this thread:
https://www.spinics.net/lists/netdev/msg480099.html
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Yonghong Song <yhs@fb.com>
Link: https://lkml.kernel.org/n/tip-jtsapwibyxrnv1xjfsgzp0fj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Linus Torvalds [Mon, 3 Sep 2018 03:09:36 +0000 (20:09 -0700)]
Merge tag 'dma-mapping-4.19-2' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping fixes from Christoph Hellwig:
"A few fixes for the fallout of being a little more pedantic about dma
masks"
* tag 'dma-mapping-4.19-2' of git://git.infradead.org/users/hch/dma-mapping:
of/platform: initialise AMBA default DMA masks
sparc: set a default 32-bit dma mask for OF devices
kernel/dma/direct: take DMA offset into account in dma_direct_supported
Linus Torvalds [Sun, 2 Sep 2018 21:37:30 +0000 (14:37 -0700)]
Linux 4.19-rc2
Linus Torvalds [Sun, 2 Sep 2018 17:56:01 +0000 (10:56 -0700)]
Merge tag 'devicetree-fixes-for-4.19' of git://git./linux/kernel/git/robh/linux
Pull devicetree updates from Rob Herring:
"A couple of new helper functions in preparation for some tree wide
clean-ups.
I'm sending these new helpers now for rc2 in order to simplify the
dependencies on subsequent cleanups across the tree in 4.20"
* tag 'devicetree-fixes-for-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
of: Add device_type access helper functions
of: add node name compare helper functions
of: add helper to lookup compatible child node
Linus Torvalds [Sun, 2 Sep 2018 17:44:28 +0000 (10:44 -0700)]
Merge tag 'armsoc-fixes' of git://git./linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"First batch of fixes post-merge window:
- A handful of devicetree changes for i.MX2{3,8} to change over to
new panel bindings. The platforms were moved from legacy
framebuffers to DRM and some development board panels hadn't yet
been converted.
- OMAP fixes related to ti-sysc driver conversion fallout, fixing
some register offsets, no_console_suspend fixes, etc.
- Droid4 changes to fix flaky eMMC probing and vibrator DTS mismerge.
- Fixed 0755->0644 permissions on a newly added file.
- Defconfig changes to make ARM Versatile more useful with QEMU
(helps testing).
- Enable defconfig options for new TI SoC platform that was merged
this window (AM6)"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
arm64: defconfig: Enable TI's AM6 SoC platform
ARM: defconfig: Update the ARM Versatile defconfig
ARM: dts: omap4-droid4: Fix emmc errors seen on some devices
ARM: dts: Fix file permission for am335x-osd3358-sm-red.dts
ARM: imx_v6_v7_defconfig: Select CONFIG_DRM_PANEL_SEIKO_43WVF1G
ARM: mxs_defconfig: Select CONFIG_DRM_PANEL_SEIKO_43WVF1G
ARM: dts: imx23-evk: Convert to the new display bindings
ARM: dts: imx23-evk: Move regulators outside simple-bus
ARM: dts: imx28-evk: Convert to the new display bindings
ARM: dts: imx28-evk: Move regulators outside simple-bus
Revert "ARM: dts: imx7d: Invert legacy PCI irq mapping"
arm: dts: am4372: setup rtc as system-power-controller
ARM: dts: omap4-droid4: fix vibrations on Droid 4
bus: ti-sysc: Fix no_console_suspend handling
bus: ti-sysc: Fix module register ioremap for larger offsets
ARM: OMAP2+: Fix module address for modules using mpu_rt_idx
ARM: OMAP2+: Fix null hwmod for ti-sysc debug
Linus Torvalds [Sun, 2 Sep 2018 17:11:30 +0000 (10:11 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
"Speculation:
- Make the microcode check more robust
- Make the L1TF memory limit depend on the internal cache physical
address space and not on the CPUID advertised physical address
space, which might be significantly smaller. This avoids disabling
L1TF on machines which utilize the full physical address space.
- Fix the GDT mapping for EFI calls on 32bit PTI
- Fix the MCE nospec implementation to prevent #GP
Fixes and robustness:
- Use the proper operand order for LSL in the VDSO
- Prevent NMI uaccess race against CR3 switching
- Add a lockdep check to verify that text_mutex is held in
text_poke() functions
- Repair the fallout of giving native_restore_fl() a prototype
- Prevent kernel memory dumps based on usermode RIP
- Wipe KASAN shadow stack before rewinding the stack to prevent false
positives
- Move the AMS GOTO enforcement to the actual build stage to allow
user API header extraction without a compiler
- Fix a section mismatch introduced by the on demand VDSO mapping
change
Miscellaneous:
- Trivial typo, GCC quirk removal and CC_SET/OUT() cleanups"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/pti: Fix section mismatch warning/error
x86/vdso: Fix lsl operand order
x86/mce: Fix set_mce_nospec() to avoid #GP fault
x86/efi: Load fixmap GDT in efi_call_phys_epilog()
x86/nmi: Fix NMI uaccess race against CR3 switching
x86: Allow generating user-space headers without a compiler
x86/dumpstack: Don't dump kernel memory based on usermode RIP
x86/asm: Use CC_SET()/CC_OUT() in __gen_sigismember()
x86/alternatives: Lockdep-enforce text_mutex in text_poke*()
x86/entry/64: Wipe KASAN stack shadow before rewind_stack_do_exit()
x86/irqflags: Mark native_restore_fl extern inline
x86/build: Remove jump label quirk for GCC older than 4.5.2
x86/Kconfig: Fix trivial typo
x86/speculation/l1tf: Increase l1tf memory limit for Nehalem+
x86/spectre: Add missing family 6 check to microcode check
Linus Torvalds [Sun, 2 Sep 2018 17:09:35 +0000 (10:09 -0700)]
Merge branch 'smp-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull CPU hotplug fix from Thomas Gleixner:
"Remove the stale skip_onerr member from the hotplug states"
* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
cpu/hotplug: Remove skip_onerr field from cpuhp_step structure
Linus Torvalds [Sun, 2 Sep 2018 16:41:45 +0000 (09:41 -0700)]
Merge branch 'core-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull core fixes from Thomas Gleixner:
"A small set of updates for core code:
- Prevent tracing in functions which are called from trace patching
via stop_machine() to prevent executing half patched function trace
entries.
- Remove old GCC workarounds
- Remove pointless includes of notifier.h"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool: Remove workaround for unreachable warnings from old GCC
notifier: Remove notifier header file wherever not used
watchdog: Mark watchdog touch functions as notrace
Randy Dunlap [Sun, 2 Sep 2018 04:01:28 +0000 (21:01 -0700)]
x86/pti: Fix section mismatch warning/error
Fix the section mismatch warning in arch/x86/mm/pti.c:
WARNING: vmlinux.o(.text+0x6972a): Section mismatch in reference from the function pti_clone_pgtable() to the function .init.text:pti_user_pagetable_walk_pte()
The function pti_clone_pgtable() references
the function __init pti_user_pagetable_walk_pte().
This is often because pti_clone_pgtable lacks a __init
annotation or the annotation of pti_user_pagetable_walk_pte is wrong.
FATAL: modpost: Section mismatches detected.
Fixes:
85900ea51577 ("x86/pti: Map the vsyscall page if needed")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/43a6d6a3-d69d-5eda-da09-0b1c88215a2a@infradead.org
Linus Walleij [Fri, 31 Aug 2018 14:13:07 +0000 (16:13 +0200)]
of/platform: initialise AMBA default DMA masks
This addresses a v4.19-rc1 regression in the PL111 DRM driver in
drivers/gpu/pl111/*
The driver uses the CMA KMS helpers and will thus at some point call
down to dma_alloc_attrs() to allocate a chunk of contigous DMA memory
for the framebuffer.
It appears that in v4.18, it was OK that this (and other DMA mastering
AMBA devices) left dev->coherent_dma_mask blank (zero).
In v4.19-rc1 the WARN_ON_ONCE(dev && !dev->coherent_dma_mask) in
dma_alloc_attrs() in include/linux/dma-mapping.h is triggered. The
allocation later fails when get_coherent_dma_mask() is called from
__dma_alloc() and __dma_alloc() returns NULL:
drm-clcd-pl111 dev:20: coherent DMA mask is unset
drm-clcd-pl111 dev:20: [drm:drm_fb_helper_fbdev_setup] *ERROR*
Failed to set fbdev configuration
It turns out that in commit
4d8bde883bfb ("OF: Don't set default
coherent DMA mask") the OF core stops setting the default DMA mask on
new devices, especially those lines of the patch:
- if (!dev->coherent_dma_mask)
- dev->coherent_dma_mask = DMA_BIT_MASK(32);
Robin Murphy solved a similar problem in
a5516219b102 ("of/platform:
Initialise default DMA masks") by simply assigning dev.coherent_dma_mask
and the dev.dma_mask to point to the same when creating devices from the
device tree, and introducing the same code into the code path creating
AMBA/PrimeCell devices solved my problem, graphics now come up.
The code simply assumes that the device can access all of the system
memory by setting the coherent DMA mask to 0xffffffff when creating a
device from the device tree, which is crude, but seems to be what kernel
v4.18 assumed.
The AMBA PrimeCells do not differ between coherent and streaming DMA so
we can just assign the same to any DMA mask.
Possibly drivers should augment their coherent DMA mask in accordance
with "dma-ranges" from the device tree if more finegranular masking is
needed.
Reported-by: Russell King <linux@armlinux.org.uk>
Fixes:
4d8bde883bfb ("OF: Don't set default coherent DMA mask")
Cc: Russell King <linux@armlinux.org.uk>
Cc: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Christoph Hellwig [Tue, 28 Aug 2018 09:25:51 +0000 (11:25 +0200)]
sparc: set a default 32-bit dma mask for OF devices
This keeps the historic default behavior for devices without a DMA mask,
but removes the warning about a lacking DMA mask for doing DMA without
a mask.
Reported-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Olof Johansson [Sun, 2 Sep 2018 01:22:19 +0000 (18:22 -0700)]
Merge tag 'omap-for-v4.19/fixes-v2-signed' of git://git./linux/kernel/git/tmlind/linux-omap into fixes
Fixes for omap variants against v4.19-rc1
These are mostly fixes related to using ti-sysc interconnect target module
driver for accessing right register offsets for sgx and cpsw and for
no_console_suspend regression.
There is also a droid4 emmc fix where emmc may not get detected for some
models, and vibrator dts mismerge fix.
And we have a file permission fix for am335x-osd3358-sm-red.dts that
just got added. And we must tag RTC as system-power-controller for
am437x for PMIC to shut down during poweroff.
* tag 'omap-for-v4.19/fixes-v2-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: dts: omap4-droid4: Fix emmc errors seen on some devices
ARM: dts: Fix file permission for am335x-osd3358-sm-red.dts
arm: dts: am4372: setup rtc as system-power-controller
ARM: dts: omap4-droid4: fix vibrations on Droid 4
bus: ti-sysc: Fix no_console_suspend handling
bus: ti-sysc: Fix module register ioremap for larger offsets
ARM: OMAP2+: Fix module address for modules using mpu_rt_idx
ARM: OMAP2+: Fix null hwmod for ti-sysc debug
Signed-off-by: Olof Johansson <olof@lixom.net>
Samuel Neves [Sat, 1 Sep 2018 20:14:52 +0000 (21:14 +0100)]
x86/vdso: Fix lsl operand order
In the __getcpu function, lsl is using the wrong target and destination
registers. Luckily, the compiler tends to choose %eax for both variables,
so it has been working so far.
Fixes:
a582c540ac1b ("x86/vdso: Use RDPID in preference to LSL when available")
Signed-off-by: Samuel Neves <sneves@dei.uc.pt>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180901201452.27828-1-sneves@dei.uc.pt
Linus Torvalds [Sat, 1 Sep 2018 20:17:15 +0000 (13:17 -0700)]
Merge tag 'linux-watchdog-4.19-rc2' of git://linux-watchdog.org/linux-watchdog
Pull watchdog fixlet from Wim Van Sebroeck:
"Document support for r8a774a1"
* tag 'linux-watchdog-4.19-rc2' of git://www.linux-watchdog.org/linux-watchdog:
dt-bindings: watchdog: renesas-wdt: Document r8a774a1 support
Linus Torvalds [Sat, 1 Sep 2018 20:03:32 +0000 (13:03 -0700)]
Merge tag 'clk-fixes-for-linus' of git://git./linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
"Two small fixes, one for the x86 Stoney SoC to get a more accurate clk
frequency and the other to fix a bad allocation in the Nuvoton NPCM7XX
driver"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: x86: Set default parent to 48Mhz
clk: npcm7xx: fix memory allocation
Christoph Hellwig [Fri, 24 Aug 2018 06:07:52 +0000 (08:07 +0200)]
kernel/dma/direct: take DMA offset into account in dma_direct_supported
When a device has a DMA offset the dma capable result will change due
to the difference between the physical and DMA address. Take that into
account.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
LuckTony [Fri, 31 Aug 2018 16:55:06 +0000 (09:55 -0700)]
x86/mce: Fix set_mce_nospec() to avoid #GP fault
The trick with flipping bit 63 to avoid loading the address of the 1:1
mapping of the poisoned page while the 1:1 map is updated used to work when
unmapping the page. But it falls down horribly when attempting to directly
set the page as uncacheable.
The problem is that when the cache mode is changed to uncachable, the pages
needs to be flushed from the cache first. But the decoy address is
non-canonical due to bit 63 flipped, and the CLFLUSH instruction throws a
#GP fault.
Add code to change_page_attr_set_clr() to fix the address before calling
flush.
Fixes:
284ce4011ba6 ("x86/memory_failure: Introduce {set, clear}_mce_nospec()")
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Anvin <hpa@zytor.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Link: https://lkml.kernel.org/r/20180831165506.GA9605@agluck-desk
Linus Torvalds [Fri, 31 Aug 2018 16:20:30 +0000 (09:20 -0700)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"A few arm64 fixes came in this week, specifically fixing some nasty
truncation of return values from firmware calls and resolving a
VM_BUG_ON due to accessing uninitialised struct pages corresponding to
NOMAP pages.
Summary:
- Fix typos in SVE documentation
- Fix type-checking and implicit truncation for SMCCC calls
- Force CONFIG_HOLES_IN_ZONE=y so that SLAB doesn't fall over NOMAP
regions"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: mm: always enable CONFIG_HOLES_IN_ZONE
arm/arm64: smccc-1.1: Handle function result as parameters
arm/arm64: smccc-1.1: Make return values unsigned long
Documentation/arm64/sve: Couple of improvements and typos
Joerg Roedel [Fri, 31 Aug 2018 08:05:38 +0000 (10:05 +0200)]
x86/efi: Load fixmap GDT in efi_call_phys_epilog()
When PTI is enabled on x86-32 the kernel uses the GDT mapped in the fixmap
for the simple reason that this address is also mapped for user-space.
The efi_call_phys_prolog()/efi_call_phys_epilog() wrappers change the GDT
to call EFI runtime services and switch back to the kernel GDT when they
return. But the switch-back uses the writable GDT, not the fixmap GDT.
When that happened and and the CPU returns to user-space it switches to the
user %cr3 and tries to restore user segment registers. This fails because
the writable GDT is not mapped in the user page-table, and without a GDT
the fault handlers also can't be launched. The result is a triple fault and
reboot of the machine.
Fix that by restoring the GDT back to the fixmap GDT which is also mapped
in the user page-table.
Fixes:
7757d607c6b3 x86/pti: ('Allow CONFIG_PAGE_TABLE_ISOLATION for x86_32')
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: hpa@zytor.com
Cc: linux-efi@vger.kernel.org
Link: https://lkml.kernel.org/r/1535702738-10971-1-git-send-email-joro@8bytes.org
Linus Torvalds [Fri, 31 Aug 2018 15:45:16 +0000 (08:45 -0700)]
Merge tag 'for-linus-4.19b-rc2-tag' of git://git./linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
- minor cleanup avoiding a warning when building with new gcc
- a patch to add a new sysfs node for Xen frontend/backend drivers to
make it easier to obtain the state of a pv device
- two fixes for 32-bit pv-guests to avoid intermediate L1TF vulnerable
PTEs
* tag 'for-linus-4.19b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
x86/xen: remove redundant variable save_pud
xen: export device state to sysfs
x86/pae: use 64 bit atomic xchg function in native_ptep_get_and_clear
x86/xen: don't write ptes directly in 32-bit PV guests
Linus Torvalds [Fri, 31 Aug 2018 15:42:46 +0000 (08:42 -0700)]
Merge tag 'm68k-for-v4.19-tag2' of git://git./linux/kernel/git/geert/linux-m68k
Pull m68k fix from Geert Uytterhoeven:
"Just a single fix for a bug introduced during the merge window: fix
wrong date and time on PMU-based Macs"
* tag 'm68k-for-v4.19-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
m68k/mac: Use correct PMU response format
Linus Torvalds [Fri, 31 Aug 2018 15:38:53 +0000 (08:38 -0700)]
Merge branch 'i2c/for-current' of git://git./linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
- regression fixes for i801 and designware
- better API and leak fix for releasing DMA safe buffers
- better greppable strings for the bitbang algorithm
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: sh_mobile: fix leak when using DMA bounce buffer
i2c: sh_mobile: define start_ch() void as it only returns 0 anyhow
i2c: refactor function to release a DMA safe buffer
i2c: algos: bit: make the error messages grepable
i2c: designware: Re-init controllers with pm_disabled set on resume
i2c: i801: Allow ACPI AML access I/O ports not reserved for SMBus
Andy Lutomirski [Wed, 29 Aug 2018 15:47:18 +0000 (08:47 -0700)]
x86/nmi: Fix NMI uaccess race against CR3 switching
A NMI can hit in the middle of context switching or in the middle of
switch_mm_irqs_off(). In either case, CR3 might not match current->mm,
which could cause copy_from_user_nmi() and friends to read the wrong
memory.
Fix it by adding a new nmi_uaccess_okay() helper and checking it in
copy_from_user_nmi() and in __copy_from_user_nmi()'s callers.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Rik van Riel <riel@surriel.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jann Horn <jannh@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/dd956eba16646fd0b15c3c0741269dfd84452dac.1535557289.git.luto@kernel.org
Ben Hutchings [Wed, 29 Aug 2018 19:43:17 +0000 (20:43 +0100)]
x86: Allow generating user-space headers without a compiler
When bootstrapping an architecture, it's usual to generate the kernel's
user-space headers (make headers_install) before building a compiler. Move
the compiler check (for asm goto support) to the archprepare target so that
it is only done when building code for the target.
Fixes:
e501ce957a78 ("x86: Force asm-goto")
Reported-by: Helmut Grohne <helmutg@debian.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180829194317.GA4765@decadent.org.uk
Jann Horn [Tue, 28 Aug 2018 15:49:01 +0000 (17:49 +0200)]
x86/dumpstack: Don't dump kernel memory based on usermode RIP
show_opcodes() is used both for dumping kernel instructions and for dumping
user instructions. If userspace causes #PF by jumping to a kernel address,
show_opcodes() can be reached with regs->ip controlled by the user,
pointing to kernel code. Make sure that userspace can't trick us into
dumping kernel memory into dmesg.
Fixes:
7cccf0725cf7 ("x86/dumpstack: Add a show_ip() function")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: security@kernel.org
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180828154901.112726-1-jannh@google.com
Rob Herring [Tue, 28 Aug 2018 20:10:48 +0000 (15:10 -0500)]
of: Add device_type access helper functions
In preparation to remove direct access to device_node.type, add
of_node_is_type() and of_node_get_device_type() helpers to check and
retrieve the device type.
Cc: Frank Rowand <frowand.list@gmail.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Mukesh Ojha [Tue, 28 Aug 2018 06:54:54 +0000 (12:24 +0530)]
cpu/hotplug: Remove skip_onerr field from cpuhp_step structure
When notifiers were there, `skip_onerr` was used to avoid calling
particular step startup/teardown callbacks in the CPU up/down rollback
path, which made the hotplug asymmetric.
As notifiers are gone now after the full state machine conversion, the
`skip_onerr` field is no longer required.
Remove it from the structure and its usage.
Signed-off-by: Mukesh Ojha <mojha@codeaurora.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/1535439294-31426-1-git-send-email-mojha@codeaurora.org
James Morse [Thu, 30 Aug 2018 15:05:32 +0000 (16:05 +0100)]
arm64: mm: always enable CONFIG_HOLES_IN_ZONE
Commit
6d526ee26ccd ("arm64: mm: enable CONFIG_HOLES_IN_ZONE for NUMA")
only enabled HOLES_IN_ZONE for NUMA systems because the NUMA code was
choking on the missing zone for nomap pages. This problem doesn't just
apply to NUMA systems.
If the architecture doesn't set HAVE_ARCH_PFN_VALID, pfn_valid() will
return true if the pfn is part of a valid sparsemem section.
When working with multiple pages, the mm code uses pfn_valid_within()
to test each page it uses within the sparsemem section is valid. On
most systems memory comes in MAX_ORDER_NR_PAGES chunks which all
have valid/initialised struct pages. In this case pfn_valid_within()
is optimised out.
Systems where this isn't true (e.g. due to nomap) should set
HOLES_IN_ZONE and provide HAVE_ARCH_PFN_VALID so that mm tests each
page as it works with it.
Currently non-NUMA arm64 systems can't enable HOLES_IN_ZONE, leading to
a VM_BUG_ON():
| page:
fffffdff802e1780 is uninitialized and poisoned
| raw:
ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
| raw:
ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
| page dumped because: VM_BUG_ON_PAGE(PagePoisoned(p))
| ------------[ cut here ]------------
| kernel BUG at include/linux/mm.h:978!
| Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[...]
| CPU: 1 PID: 25236 Comm: dd Not tainted 4.18.0 #7
| Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
| pstate:
40000085 (nZcv daIf -PAN -UAO)
| pc : move_freepages_block+0x144/0x248
| lr : move_freepages_block+0x144/0x248
| sp :
fffffe0071177680
[...]
| Process dd (pid: 25236, stack limit = 0x0000000094cc07fb)
| Call trace:
| move_freepages_block+0x144/0x248
| steal_suitable_fallback+0x100/0x16c
| get_page_from_freelist+0x440/0xb20
| __alloc_pages_nodemask+0xe8/0x838
| new_slab+0xd4/0x418
| ___slab_alloc.constprop.27+0x380/0x4a8
| __slab_alloc.isra.21.constprop.26+0x24/0x34
| kmem_cache_alloc+0xa8/0x180
| alloc_buffer_head+0x1c/0x90
| alloc_page_buffers+0x68/0xb0
| create_empty_buffers+0x20/0x1ec
| create_page_buffers+0xb0/0xf0
| __block_write_begin_int+0xc4/0x564
| __block_write_begin+0x10/0x18
| block_write_begin+0x48/0xd0
| blkdev_write_begin+0x28/0x30
| generic_perform_write+0x98/0x16c
| __generic_file_write_iter+0x138/0x168
| blkdev_write_iter+0x80/0xf0
| __vfs_write+0xe4/0x10c
| vfs_write+0xb4/0x168
| ksys_write+0x44/0x88
| sys_write+0xc/0x14
| el0_svc_naked+0x30/0x34
| Code:
aa1303e0 90001a01 91296421 94008902 (
d4210000)
| ---[ end trace
1601ba47f6e883fe ]---
Remove the NUMA dependency.
Link: https://www.spinics.net/lists/arm-kernel/msg671851.html
Cc: <stable@vger.kernel.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reported-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>
Tested-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Finn Thain [Fri, 24 Aug 2018 02:02:06 +0000 (12:02 +1000)]
m68k/mac: Use correct PMU response format
Now that the 68k Mac port has adopted the via-pmu driver, it must decode
the PMU response accordingly otherwise the date and time will be wrong.
Fixes:
ebd722275f9cfc67 ("macintosh/via-pmu: Replace via-pmu68k driver with via-pmu driver")
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Linus Torvalds [Fri, 31 Aug 2018 04:18:05 +0000 (21:18 -0700)]
Merge tag 'drm-fixes-2018-08-31' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Regular fixes pull:
- Mediatek has a bunch of fixes to their RDMA and Overlay engines.
- i915 has some Cannonlake/Geminilake watermark workarounds, LSPCON
fix, HDCP free fix, audio fix and a ppgtt reference counting fix.
- amdgpu has some SRIOV, Kasan, memory leaks and other misc fixes"
* tag 'drm-fixes-2018-08-31' of git://anongit.freedesktop.org/drm/drm: (35 commits)
drm/i915/audio: Hook up component bindings even if displays are disabled
drm/i915: Increase LSPCON timeout
drm/i915: Stop holding a ref to the ppgtt from each vma
drm/i915: Free write_buf that we allocated with kzalloc.
drm/i915: Fix glk/cnl display w/a #1175
drm/amdgpu: Need to set moved to true when evict bo
drm/amdgpu: Remove duplicated power source update
drm/amd/display: Fix memory leak caused by missed dc_sink_release
drm/amdgpu: fix holding mn_lock while allocating memory
drm/amdgpu: Power on uvd block when hw_fini
drm/amdgpu: Update power state at the end of smu hw_init.
drm/amdgpu: Fix vce initialize failed on Kaveri/Mullins
drm/amdgpu: Enable/disable gfx PG feature in rlc safe mode
drm/amdgpu: Adjust the VM size based on system memory size v2
drm/mediatek: fix connection from RDMA2 to DSI1
drm/mediatek: update some variable name from ovl to comp
drm/mediatek: use layer_nr function to get layer number to init plane
drm/mediatek: add function to return RDMA layer number
drm/mediatek: add function to return OVL layer number
drm/mediatek: add function to get layer number for component
...
Stephen Rothwell [Thu, 30 Aug 2018 21:47:28 +0000 (07:47 +1000)]
disable stringop truncation warnings for now
They are too noisy
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 31 Aug 2018 01:02:02 +0000 (18:02 -0700)]
Merge tag 'pm-4.19-rc2' of git://git./linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These address a corner case in the menu cpuidle governor and fix error
handling in the PM core's generic clock management code.
Specifics:
- Make the menu cpuidle governor avoid stopping the scheduler tick if
the predicted idle duration exceeds the tick period length, but the
selected idle state is shallow and deeper idle states with high
target residencies are available (Rafael Wysocki).
- Make the PM core's generic clock management code use a proper data
type for one variable to make error handling work (Dan Carpenter)"
* tag 'pm-4.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpuidle: menu: Retain tick when shallow state is selected
PM / clk: signedness bug in of_pm_clk_add_clks()
Rafael J. Wysocki [Thu, 30 Aug 2018 23:23:31 +0000 (01:23 +0200)]
Merge branch 'pm-core'
Merge a generic clock management fix for 4.19-rc2.
* pm-core:
PM / clk: signedness bug in of_pm_clk_add_clks()
Akshu Agrawal [Tue, 21 Aug 2018 06:51:57 +0000 (12:21 +0530)]
clk: x86: Set default parent to 48Mhz
System clk provided in ST soc can be set to:
48Mhz, non-spread
25Mhz, spread
To get accurate rate, we need it to set it at non-spread
option which is 48Mhz.
Signed-off-by: Akshu Agrawal <akshu.agrawal@amd.com>
Reviewed-by: Daniel Kurtz <djkurtz@chromium.org>
Fixes:
421bf6a1f061 ("clk: x86: Add ST oscout platform clock")
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Wolfram Sang [Fri, 24 Aug 2018 14:52:46 +0000 (16:52 +0200)]
i2c: sh_mobile: fix leak when using DMA bounce buffer
We only freed the bounce buffer after successful DMA, missing the cases
where DMA setup may have gone wrong. Use a better location which always
gets called after each message and use 'stop_after_dma' as a flag for a
successful transfer.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Wolfram Sang [Fri, 24 Aug 2018 14:52:45 +0000 (16:52 +0200)]
i2c: sh_mobile: define start_ch() void as it only returns 0 anyhow
After various refactoring over the years, start_ch() doesn't return
errno anymore, so make the function return void. This saves the error
handling when calling it which in turn eases cleanup of resources of a
future patch.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Wolfram Sang [Fri, 24 Aug 2018 14:52:44 +0000 (16:52 +0200)]
i2c: refactor function to release a DMA safe buffer
a) rename to 'put' instead of 'release' to match 'get' when obtaining
the buffer
b) change the argument order to have the buffer as first argument
c) add a new argument telling the function if the message was
transferred. This allows the function to be used also in cases
where setting up DMA failed, so the buffer needs to be freed without
syncing to the message buffer.
Also convert the only user.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Jan Kundrát [Tue, 28 Aug 2018 08:07:40 +0000 (10:07 +0200)]
i2c: algos: bit: make the error messages grepable
Yep, I went looking for one of these, and I wasn't able to find it
easily. That's worse than a line which is 82-chars long, IMHO.
Signed-off-by: Jan Kundrát <jan.kundrat@cesnet.cz>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Hans de Goede [Wed, 29 Aug 2018 13:06:31 +0000 (15:06 +0200)]
i2c: designware: Re-init controllers with pm_disabled set on resume
On Bay Trail and Cherry Trail devices we set the pm_disabled flag for I2C
busses which the OS shares with the PUNIT as these need special handling.
Until now we called dev_pm_syscore_device(dev, true) for I2C controllers
with this flag set to keep these I2C controllers always on.
After commit
12864ff8545f ("ACPI / LPSS: Avoid PM quirks on suspend and
resume from hibernation"), this no longer works. This commit modifies
lpss_iosf_exit_d3_state() to only run if lpss_iosf_enter_d3_state() has ran
before it, so that it does not run on a resume from hibernate (or from S3).
On these systems the conditions for lpss_iosf_enter_d3_state() to run
never become true, so lpss_iosf_exit_d3_state() never gets called and
the 2 LPSS DMA controllers never get forced into D0 mode, instead they
are left in their default automatic power-on when needed mode.
The not forcing of D0 mode for the DMA controllers enables these systems
to properly enter S0ix modes, which is a good thing.
But after entering S0ix modes the I2C controller connected to the PMIC
no longer works, leading to e.g. broken battery monitoring.
The _PS3 method for this I2C controller looks like this:
Method (_PS3, 0, NotSerialized) // _PS3: Power State 3
{
If ((((PMID == 0x04) || (PMID == 0x05)) || (PMID == 0x06)))
{
Return (Zero)
}
PSAT |= 0x03
Local0 = PSAT /* \_SB_.I2C5.PSAT */
}
Where PMID = 0x05, so we enter the Return (Zero) path on these systems.
So even if we were to not call dev_pm_syscore_device(dev, true) the
I2C controller will be left in D0 rather then be switched to D3.
Yet on other Bay and Cherry Trail devices S0ix is not entered unless *all*
I2C controllers are in D3 mode. This combined with the I2C controller no
longer working now that we reach S0ix states on these systems leads to me
believing that the PUNIT itself puts the I2C controller in D3 when all
other conditions for entering S0ix states are true.
Since now the I2C controller is put in D3 over a suspend/resume we must
re-initialize it afterwards and that does indeed fix it no longer working.
This commit implements this fix by:
1) Making the suspend_late callback a no-op if pm_disabled is set and
making the resume_early callback skip the clock re-enable (since it now was
not disabled) while still doing the necessary I2C controller re-init.
2) Removing the dev_pm_syscore_device(dev, true) call, so that the suspend
and resume callbacks are actually called. Normally this would cause the
ACPI pm code to call _PS3 putting the I2C controller in D3, wreaking havoc
since it is shared with the PUNIT, but in this special case the _PS3 method
is a no-op so we can safely allow a "fake" suspend / resume.
Fixes:
12864ff8545f ("ACPI / LPSS: Avoid PM quirks on suspend and resume ...")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=200861
Cc: 4.15+ <stable@vger.kernel.org> # 4.15+
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Mika Westerberg [Thu, 30 Aug 2018 08:50:13 +0000 (11:50 +0300)]
i2c: i801: Allow ACPI AML access I/O ports not reserved for SMBus
Commit
7ae81952cda ("i2c: i801: Allow ACPI SystemIO OpRegion to conflict
with PCI BAR") made it possible for AML code to access SMBus I/O ports
by installing custom SystemIO OpRegion handler and blocking i80i driver
access upon first AML read/write to this OpRegion.
However, while ThinkPad T560 does have SystemIO OpRegion declared under
the SMBus device, it does not access any of the SMBus registers:
Device (SMBU)
{
...
OperationRegion (SMBP, PCI_Config, 0x50, 0x04)
Field (SMBP, DWordAcc, NoLock, Preserve)
{
, 5,
TCOB, 11,
Offset (0x04)
}
Name (TCBV, 0x00)
Method (TCBS, 0, NotSerialized)
{
If ((TCBV == 0x00))
{
TCBV = (\_SB.PCI0.SMBU.TCOB << 0x05)
}
Return (TCBV) /* \_SB_.PCI0.SMBU.TCBV */
}
OperationRegion (TCBA, SystemIO, TCBS (), 0x10)
Field (TCBA, ByteAcc, NoLock, Preserve)
{
Offset (0x04),
, 9,
CPSC, 1
}
}
Problem with the current approach is that it blocks all I/O port access
and because this system has touchpad connected to the SMBus controller
after first AML access (happens during suspend/resume cycle) the
touchpad fails to work anymore.
Fix this so that we allow ACPI AML I/O port access if it does not touch
the region reserved for the SMBus.
Fixes:
7ae81952cda ("i2c: i801: Allow ACPI SystemIO OpRegion to conflict with PCI BAR")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=200737
Reported-by: Yussuf Khalil <dev@pp3345.net>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Linus Torvalds [Thu, 30 Aug 2018 20:39:04 +0000 (13:39 -0700)]
Merge tag 'for-linus-
20180830' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"Small collection of fixes that should go into this series. This pull
contains:
- NVMe pull request with three small fixes (via Christoph)
- Kill useless NULL check before kmem_cache_destroy (Chengguang Xu)
- Xen block driver pull request with persistent grant flushing fixes
(Juergen Gross)
- Final wbt fixes, wrapping up the changes for this series. These
have been heavily tested (me)
- cdrom info leak fix (Scott Bauer)
- ATA dma quirk for SQ201 (Linus Walleij)
- Straight forward bsg refcount_t conversion (John Pittman)"
* tag 'for-linus-
20180830' of git://git.kernel.dk/linux-block:
cdrom: Fix info leak/OOB read in cdrom_ioctl_drive_status
nvmet: free workqueue object if module init fails
nvme-fcloop: Fix dropped LS's to removed target port
nvme-pci: add a memory barrier to nvme_dbbuf_update_and_check_event
block: bsg: move atomic_t ref_count variable to refcount API
block: remove unnecessary condition check
ata: ftide010: Add a quirk for SQ201
blk-wbt: remove dead code
blk-wbt: improve waking of tasks
blk-wbt: abstract out end IO completion handler
xen/blkback: remove unused pers_gnts_lock from struct xen_blkif_ring
xen/blkback: move persistent grants flags to bool
xen/blkfront: reorder tests in xlblk_init()
xen/blkfront: cleanup stale persistent grants
xen/blkback: don't keep persistent grants too long
Rob Herring [Mon, 27 Aug 2018 12:50:47 +0000 (07:50 -0500)]
of: add node name compare helper functions
In preparation to remove device_node.name pointer, add helper functions
for node name comparisons which are a common pattern throughout the kernel.
Cc: Frank Rowand <frowand.list@gmail.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Kim Phillips [Mon, 27 Aug 2018 20:08:07 +0000 (15:08 -0500)]
perf annotate: Handle arm64 move instructions
Add default handler for non-jump instructions. This really only has an
effect on instructions that compute a PC-relative address, such as
'adrp,' as seen in these couple of examples:
BEFORE: adrp x0,
ffff20000aa11000 <kallsyms_token_index+0xce000>
AFTER: adrp x0, kallsyms_token_index+0xce000
BEFORE: adrp x23,
ffff20000ae94000 <__per_cpu_load>
AFTER: adrp x23, __per_cpu_load
The implementation is identical to that of s390, but with a slight
adjustment for objdump whitespace propagation (arm64 objdump puts spaces
after commas, whereas s390's presumably doesn't).
The mov__scnprintf() declaration is moved from s390's to arm64's
instructions.c because arm64's gets included before s390's.
Committer testing:
Ran 'perf annotate --stdio2 > /tmp/{before,after}' no diff.
Signed-off-by: Kim Phillips <kim.phillips@arm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20180827150807.304110d2e9919a17c832ca48@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Benjamin Peterson [Tue, 28 Aug 2018 03:53:44 +0000 (20:53 -0700)]
perf trace beauty: Alias 'umount' to 'umount2'
Before:
# perf trace -e *mount* umount /dev/mapper/fedora-home /s
11.576 ( 0.004 ms) umount/3138 umount2(arg0:
94501956754656, arg1: 0, arg2: 1, arg3:
140051050083104, arg4: 4, arg5:
94501956755136) = -1 EINVAL Invalid argument
#
After:
# perf trace -e *mount* umount /s
0.000 ( 9.241 ms): umount/5251 umount2(name: 0x55f74a986480) = 0
Signed-off-by: Benjamin Peterson <benjamin@python.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180828035344.31500-1-benjamin@python.org
[ split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 30 Aug 2018 06:32:52 +0000 (08:32 +0200)]
perf stat: Move the display functions to stat-display.c
Move perf_evlist__print_counters() with all its dependency functions to
the stat-display.c object.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180830063252.23729-44-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 30 Aug 2018 06:32:51 +0000 (08:32 +0200)]
perf stat: Move 'metric_events' to 'struct perf_stat_config'
Move the static variable 'metric_events' to 'struct perf_stat_config',
so that it can be passed around and used outside 'perf stat' command.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180830063252.23729-43-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 30 Aug 2018 06:32:50 +0000 (08:32 +0200)]
perf stat: Move 'walltime_*' data to 'struct perf_stat_config'
Move the static variables 'walltime_*' to 'struct perf_stat_config', so
that it can be passed around and used outside 'perf stat' command.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180830063252.23729-42-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 30 Aug 2018 06:32:49 +0000 (08:32 +0200)]
perf stat: Propagate 'struct target' arg to sort_aggr_thread()
Propagate the 'struct target' arg to sort_aggr_thread() so that the
function does not depend on the 'perf stat' command object local
variable 'target' and can be moved out.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180830063252.23729-41-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 30 Aug 2018 06:32:48 +0000 (08:32 +0200)]
perf stat: Move 'no_merge' data to 'struct perf_stat_config'
Move the static variable 'no_merge' to 'struct perf_stat_config', so
that it can be passed around and used outside 'perf stat' command.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180830063252.23729-40-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 30 Aug 2018 06:32:47 +0000 (08:32 +0200)]
perf stat: Move 'big_num' data to 'struct perf_stat_config'
Move the static variable 'big_num' to 'struct perf_stat_config', so that
it can be passed around and used outside 'perf stat' command.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180830063252.23729-39-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 30 Aug 2018 06:32:46 +0000 (08:32 +0200)]
perf stat: Do not use the global 'evsel_list' in print functions
Get rid of the the 'evsel_list' global variable dependency, here we can
use the 'evlist' pointer from the evsel.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180830063252.23729-38-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 30 Aug 2018 06:32:45 +0000 (08:32 +0200)]
perf stat: Move *_aggr_* data to 'struct perf_stat_config'
Move the *_aggr_* global variables to 'struct perf_stat_config', so that
it can be passed around and used outside 'perf stat' command.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180830063252.23729-37-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 30 Aug 2018 06:32:44 +0000 (08:32 +0200)]
perf stat: Move ru_* data to 'struct perf_stat_config'
Move the 'ru_*' global variables to 'struct perf_stat_config', so that
it can be passed around and used outside the 'perf stat' command.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180830063252.23729-36-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 30 Aug 2018 06:32:43 +0000 (08:32 +0200)]
perf stat: Move 'print_mixed_hw_group_error' to 'struct perf_stat_config'
Move the 'print_mixed_hw_group_error' global variable to 'struct perf_stat_config',
so that it can be passed around and used outside the 'perf stat' command.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180830063252.23729-35-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 30 Aug 2018 06:32:42 +0000 (08:32 +0200)]
perf stat: Move 'print_free_counters_hint' to 'struct perf_stat_config'
Move the 'print_free_counters_hint' variable to 'struct perf_stat_config',
so that it can be passed around and used outside the 'perf stat' command.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180830063252.23729-34-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 30 Aug 2018 06:32:41 +0000 (08:32 +0200)]
perf stat: Move 'null_run' to 'struct perf_stat_config'
Move the static 'null_run' variable to 'struct perf_stat_config', so
that it can be passed around and used outside the 'perf stat' command.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180830063252.23729-33-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 30 Aug 2018 06:32:40 +0000 (08:32 +0200)]
perf stat: Add 'walltime_nsecs_stats' pointer to 'struct perf_stat_config'
Add 'walltime_nsecs_stats' pointer to 'struct perf_stat_config', so that
it can be passed around and used outside the 'perf stat' command.
It's initialized to point to stat's walltime_nsecs_stats value.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180830063252.23729-32-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 30 Aug 2018 06:32:39 +0000 (08:32 +0200)]
perf stat: Pass 'evlist' to aggr_update_shadow()
Pass a 'evlist' argument to aggr_update_shadow(), to get rid of the
global 'evsel_list' variable dependency.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180830063252.23729-31-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 30 Aug 2018 06:32:38 +0000 (08:32 +0200)]
perf stat: Pass 'struct perf_stat_config' to first_shadow_cpu()
Pass a 'struct perf_stat_config' arg to first_shadow_cpu(), so that the
function does not depend on the 'perf stat' command object local
'stat_config' variable and can then be moved out.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180830063252.23729-30-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 30 Aug 2018 06:32:37 +0000 (08:32 +0200)]
perf stat: Move 'metric_only_len' to 'struct perf_stat_config'
Move the static 'metric_only_len' variable to 'struct perf_stat_config',
so that it can be passed around and used outside the 'perf stat' command.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180830063252.23729-29-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 30 Aug 2018 06:32:36 +0000 (08:32 +0200)]
perf stat: Move 'run_count' to 'struct perf_stat_config'
Move the static 'run_count' variable to 'struct perf_stat_config', so
that it can be passed around and used outside the 'perf stat' command.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180830063252.23729-28-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 30 Aug 2018 06:32:35 +0000 (08:32 +0200)]
perf stat: Use 'evsel->evlist' instead of 'evsel_list' in collect_all_aliases()
Use 'evsel->evlist' instead of 'evsel_list' in collect_all_aliases(), to
get rid of the global 'evsel_list' variable dependency.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180830063252.23729-27-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 30 Aug 2018 06:32:34 +0000 (08:32 +0200)]
perf stat: Pass 'evlist' argument to print functions
Add 'evlist' argument to print functions to get rid of the global
'evsel_list' variable dependency.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180830063252.23729-26-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 30 Aug 2018 06:32:33 +0000 (08:32 +0200)]
perf stat: Add 'target' argument to perf_evlist__print_counters()
Add 'struct target' argument to perf_evlist__print_counters(), so the
function does not depend on the 'perf stat' command object local target
and can be moved out.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180830063252.23729-25-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 30 Aug 2018 06:32:32 +0000 (08:32 +0200)]
perf stat: Move 'unit_width' to 'struct perf_stat_config'
Move the static 'unit_width' variable to 'struct perf_stat_config',
so it can be passed around and used outside the 'perf stat' command.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180830063252.23729-24-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 30 Aug 2018 06:32:31 +0000 (08:32 +0200)]
perf stat: Move 'metric_only' to 'struct perf_stat_config'
Move the static 'metric_only' variable to 'struct perf_stat_config', so
it can be passed around and used outside the 'perf stat' command.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180830063252.23729-23-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 30 Aug 2018 06:32:30 +0000 (08:32 +0200)]
perf stat: Move 'interval_clear' to 'struct perf_stat_config'
Move the static 'interval_clear' variable to 'struct perf_stat_config',
so it can be passed around and used outside the 'perf stat' command.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180830063252.23729-22-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 30 Aug 2018 06:32:29 +0000 (08:32 +0200)]
perf stat: Move csv_* to 'struct perf_stat_config'
Move the static csv_* variables to 'struct perf_stat_config', so that it
can be passed around and used outside the 'perf stat' command.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180830063252.23729-21-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 30 Aug 2018 06:32:28 +0000 (08:32 +0200)]
perf stat: Pass a 'struct perf_stat_config' argument to global print functions
Add 'struct perf_stat_config' argument to the global print functions, so
that these functions can be used out of the 'perf stat' command code.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180830063252.23729-20-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 30 Aug 2018 06:32:27 +0000 (08:32 +0200)]
perf stat: Pass 'struct perf_stat_config' argument to local print functions
Add 'struct perf_stat_config' argument to print functions, so that those
functions can be moved out of the 'perf stat' command to a generic class
in the following patches.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180830063252.23729-19-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 30 Aug 2018 06:32:26 +0000 (08:32 +0200)]
perf stat: Add 'struct perf_stat_config' argument to perf_evlist__print_counters()
Add a 'struct perf_stat_config' argument to perf_evlist__print_counters(),
so that it can be moved out of the 'perf stat' command to generic object
in the following patches.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180830063252.23729-18-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 30 Aug 2018 06:32:25 +0000 (08:32 +0200)]
perf stat: Move STAT_RECORD out of perf_evlist__print_counters()
It's stat related and should stay in the 'perf stat' command. The
perf_evlist__print_counters function will be moved out in the following
patches.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180830063252.23729-17-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 30 Aug 2018 06:32:24 +0000 (08:32 +0200)]
perf stat: Introduce perf_evlist__print_counters()
To be in charge of printing out the stat output. It will be moved out of
the 'perf stat' command in the following patches.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180830063252.23729-16-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 30 Aug 2018 06:32:23 +0000 (08:32 +0200)]
perf stat: Move perf_stat_synthesize_config() to stat.c
So that it can be used globally.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180830063252.23729-15-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 30 Aug 2018 06:32:22 +0000 (08:32 +0200)]
perf stat: Add 'perf_event__handler_t' argument to perf_stat_synthesize_config()
So that it's completely independent and can be used outside the 'perf stat' command.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180830063252.23729-14-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 30 Aug 2018 06:32:21 +0000 (08:32 +0200)]
perf stat: Add 'struct perf_evlist' argument to perf_stat_synthesize_config()
Get rid of the 'evsel_list' global variable dependency, here in
perf_stat_synthesize_config() we are adding the 'evlist' arg.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180830063252.23729-13-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 30 Aug 2018 06:32:20 +0000 (08:32 +0200)]
perf stat: Add 'struct perf_tool' argument to perf_stat_synthesize_config()
So that we can use the function outside the 'perf stat' command with standard
synthesize functions, that take 'struct perf_tool *' argument.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180830063252.23729-12-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 30 Aug 2018 06:32:19 +0000 (08:32 +0200)]
perf stat: Add 'struct perf_stat_config' argument to perf_stat_synthesize_config()
Add a 'struct perf_stat_config' argument to perf_stat_synthesize_config(),
so we could synthesize arbitrary config.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180830063252.23729-11-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 30 Aug 2018 06:32:18 +0000 (08:32 +0200)]
perf stat: Rename 'is_pipe' argument to 'attrs' in perf_stat_synthesize_config()
The attrs name makes more sense.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180830063252.23729-10-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 30 Aug 2018 06:32:17 +0000 (08:32 +0200)]
perf stat: Move create_perf_stat_counter() to stat.c
Move create_perf_stat_counter() to the 'stat' class, so that we can use
it globally.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180830063252.23729-9-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 30 Aug 2018 06:32:16 +0000 (08:32 +0200)]
perf evsel: Introduce perf_evsel__store_ids()
Add perf_evsel__store_ids() from stat's store_counter_ids() code to the
evsel class, so that it can be used globally.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180830063252.23729-8-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 30 Aug 2018 06:32:15 +0000 (08:32 +0200)]
perf tools: Switch 'session' argument to 'evlist' in perf_event__synthesize_attrs()
To be able to pass in other than session's evlist.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180830063252.23729-7-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 30 Aug 2018 06:32:14 +0000 (08:32 +0200)]
perf stat: Add 'identifier' flag to 'struct perf_stat_config'
Add 'identifier' flag to 'struct perf_stat_config' to carry the info
whether to use PERF_SAMPLE_IDENTIFIER for events.
This makes create_perf_stat_counter() independent.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180830063252.23729-6-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 30 Aug 2018 06:32:13 +0000 (08:32 +0200)]
perf stat: Use local config arg for scale in create_perf_stat_counter()
Use the local 'scale' member in the 'struct perf_stat_config' argument
instead of the global 'stat_config' variable, to make the function
independent.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180830063252.23729-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 30 Aug 2018 06:32:12 +0000 (08:32 +0200)]
perf stat: Move 'no_inherit' to 'struct perf_stat_config'
Move the static 'no_inherit' variable to 'struct perf_stat_config', so
it can be passed around and used outside the 'perf stat' command.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180830063252.23729-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 30 Aug 2018 06:32:11 +0000 (08:32 +0200)]
perf stat: Move 'initial_delay' to 'struct perf_stat_config'
Move the static 'initial_delay' variable to 'struct perf_stat_config',
so it can be passed around and used outside the 'perf stat' command.
Add 'struct perf_stat_config' argument to create_perf_stat_counter() and
use its 'initial_delay' member instead of the static one.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180830063252.23729-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Jiri Olsa [Thu, 30 Aug 2018 06:32:10 +0000 (08:32 +0200)]
perf stat: Use evsel->threads in create_perf_stat_counter()
Get rid of the evsel_list dependency, here we can use the evsel->threads
copy of the struct thread_map.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20180830063252.23729-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Thu, 30 Aug 2018 16:37:28 +0000 (13:37 -0300)]
perf trace: Show comm and tid for tracepoint events
So that all events have that info, improving reading by having
information better aligned, etc.
Before:
# echo 1 > /proc/sys/vm/drop_caches
# perf trace -e block:*,ext4:*,tools/perf/examples/bpf/augmented_syscalls.c,close cat tools/perf/examples/bpf/hello.c
0.000 ( ): #include <stdio.h>
int syscall_enter(openat)(void *args)
{
puts("Hello, world\n");
return 0;
}
license(GPL);
cat/2731 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)
0.025 ( ): syscalls:sys_exit_openat:0x3
0.063 ( 0.022 ms): cat/2731 close(fd: 3) = 0
0.110 ( ): cat/2731 openat(dfd: CWD, filename: /lib64/libc.so.6, flags: CLOEXEC)
0.123 ( ): syscalls:sys_exit_openat:0x3
0.243 ( 0.008 ms): cat/2731 close(fd: 3) = 0
0.485 ( ): cat/2731 open(filename: /usr/lib/locale/locale-archive, flags: CLOEXEC)
0.500 ( ): syscalls:sys_exit_open:0x3
0.531 ( 0.017 ms): cat/2731 close(fd: 3) = 0
0.587 ( ): cat/2731 openat(dfd: CWD, filename: tools/perf/examples/bpf/hello.c)
0.601 ( ): syscalls:sys_exit_openat:0x3
0.631 ( ): ext4:ext4_es_lookup_extent_enter:dev 253,2 ino 1311399 lblk 0
0.639 ( ): ext4:ext4_es_lookup_extent_exit:dev 253,2 ino 1311399 found 1 [0/1) 5276651 W0x10
0.654 ( ): block:block_bio_queue:253,2 R
42213208 + 8 [cat]
0.663 ( ): block:block_bio_remap:8,0 R
58206040 + 8 <- (253,2)
42213208
0.671 ( ): block:block_bio_remap:8,0 R
175570776 + 8 <- (8,6)
58206040
0.678 ( ): block:block_bio_queue:8,0 R
175570776 + 8 [cat]
0.692 ( ): block:block_getrq:8,0 R
175570776 + 8 [cat]
0.700 ( ): block:block_plug:[cat]
0.708 ( ): block:block_rq_insert:8,0 R 4096 ()
175570776 + 8 [cat]
0.713 ( ): block:block_unplug:[cat] 1
0.716 ( ): block:block_rq_issue:8,0 R 4096 ()
175570776 + 8 [cat]
0.949 ( 0.007 ms): cat/2731 close(fd: 3) = 0
0.969 ( 0.006 ms): cat/2731 close(fd: 1) = 0
0.982 ( 0.006 ms): cat/2731 close(fd: 2) = 0
#
After:
# echo 1 > /proc/sys/vm/drop_caches
# perf trace -e block:*,ext4:*,tools/perf/examples/bpf/augmented_syscalls.c,close cat tools/perf/examples/bpf/hello.c
0.000 ( ): cat/1380 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: CLOEXEC)#include <stdio.h>
int syscall_enter(openat)(void *args)
{
puts("Hello, world\n");
return 0;
}
license(GPL);
0.024 ( ): cat/1380 syscalls:sys_exit_openat:0x3
0.063 ( 0.024 ms): cat/1380 close(fd: 3) = 0
0.114 ( ): cat/1380 openat(dfd: CWD, filename: /lib64/libc.so.6, flags: CLOEXEC)
0.127 ( ): cat/1380 syscalls:sys_exit_openat:0x3
0.247 ( 0.009 ms): cat/1380 close(fd: 3) = 0
0.484 ( ): cat/1380 open(filename: /usr/lib/locale/locale-archive, flags: CLOEXEC)
0.499 ( ): cat/1380 syscalls:sys_exit_open:0x3
0.613 ( 0.010 ms): cat/1380 close(fd: 3) = 0
0.662 ( ): cat/1380 openat(dfd: CWD, filename: tools/perf/examples/bpf/hello.c)
0.678 ( ): cat/1380 syscalls:sys_exit_openat:0x3
0.712 ( ): cat/1380 ext4:ext4_es_lookup_extent_enter:dev 253,2 ino 1311399 lblk 0
0.721 ( ): cat/1380 ext4:ext4_es_lookup_extent_exit:dev 253,2 ino 1311399 found 1 [0/1) 5276651 W0x10
0.734 ( ): cat/1380 block:block_bio_queue:253,2 R
42213208 + 8 [cat]
0.745 ( ): cat/1380 block:block_bio_remap:8,0 R
58206040 + 8 <- (253,2)
42213208
0.754 ( ): cat/1380 block:block_bio_remap:8,0 R
175570776 + 8 <- (8,6)
58206040
0.761 ( ): cat/1380 block:block_bio_queue:8,0 R
175570776 + 8 [cat]
0.780 ( ): cat/1380 block:block_getrq:8,0 R
175570776 + 8 [cat]
0.791 ( ): cat/1380 block:block_plug:[cat]
0.802 ( ): cat/1380 block:block_rq_insert:8,0 R 4096 ()
175570776 + 8 [cat]
0.806 ( ): cat/1380 block:block_unplug:[cat] 1
0.810 ( ): cat/1380 block:block_rq_issue:8,0 R 4096 ()
175570776 + 8 [cat]
1.005 ( 0.011 ms): cat/1380 close(fd: 3) = 0
1.031 ( 0.008 ms): cat/1380 close(fd: 1) = 0
1.048 ( 0.008 ms): cat/1380 close(fd: 2) = 0
#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-us1mwsupxffs4jlm3uqm5dvj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Thu, 30 Aug 2018 15:32:35 +0000 (12:32 -0300)]
perf trace augmented_syscalls: Hook into syscalls:sys_exit_SYSCALL too
Hook the pair enter/exit when using augmented_{filename,sockaddr,etc}_syscall(),
this way we'll be able to see what entries are in the ELF sections generated
from augmented_syscalls.c and filter them out from the main raw_syscalls:*
tracepoints used by 'perf trace'.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-cyav42qj5yylolw4attcw99z@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Thu, 30 Aug 2018 14:50:21 +0000 (11:50 -0300)]
perf trace augmented_syscalls: Rename augmented_*_syscall__enter to just *_syscall
As we'll also hook into the syscalls:sys_exit_SYSCALL for which there
are enter hooks.
This way we'll be able to iterate the ELF file for the eBPF program,
find the syscalls that have hooks and filter them out from the general
raw_syscalls:sys_{enter,exit} tracepoint for not-yet-augmented (the ones
with pointer arguments not yet being attached to the usual syscalls
tracepoint payload) and non augmentable syscalls (syscalls without
pointer arguments).
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-cl1xyghwb1usp500354mv37h@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Thu, 30 Aug 2018 13:02:23 +0000 (10:02 -0300)]
perf augmented_syscalls: Update the header comments
Reflecting the fact that it now augments more than syscalls:sys_enter_SYSCALL
tracepoints that have filename strings as args. Also mention how the
extra data is handled by the by now modified 'perf trace' beautifiers,
that will use special "augmented" beautifiers when extra data is found
after the expected syscall enter/exit tracepoints.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-ybskanehmdilj5fs7080nz1g@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Thu, 30 Aug 2018 11:48:44 +0000 (08:48 -0300)]
perf bpf: Add syscall_exit() helper
So that we can hook to the syscalls:sys_exit_SYSCALL tracepoints in
addition to the syscalls:sys_enter_SYSCALL we hook using the
syscall_enter() helper.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-6qh8aph1jklyvdu7w89c0izc@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tzvetomir Stoyanov (VMware) [Tue, 28 Aug 2018 22:50:38 +0000 (18:50 -0400)]
tools lib traceevent, perf tools: Split trace-seq related APIs in a separate header file
In order to make libtraceevent into a proper library, all its APIs
should be defined in corresponding header files. This patch splits
trace-seq related APIs in a separate header file: trace-seq.h
Signed-off-by: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20180828185038.2dcb2743@gandalf.local.home
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>