Namhyung Kim [Fri, 10 Feb 2017 07:36:11 +0000 (16:36 +0900)]
perf diff: Add 'delta-abs' compute method
The 'delta-abs' compute method is same as 'delta' but shows entries with
bigger absolute delta first instead of sorting numerically. This is
only useful together with -o option.
Below is default output (-c delta):
$ perf diff -o 1 -c delta | grep -v ^# | head
42.22% +4.97% [kernel.kallsyms] [k] cfb_imageblit
0.62% +1.23% [kernel.kallsyms] [k] mutex_lock
+1.15% [kernel.kallsyms] [k] copy_user_generic_string
2.40% +0.95% [kernel.kallsyms] [k] bit_putcs
0.31% +0.79% [kernel.kallsyms] [k] link_path_walk
+0.64% [kernel.kallsyms] [k] kmem_cache_alloc
0.00% +0.57% [kernel.kallsyms] [k] __rcu_read_unlock
+0.45% [kernel.kallsyms] [k] alloc_set_pte
0.16% +0.45% [kernel.kallsyms] [k] menu_select
+0.41% ld-2.24.so [.] do_lookup_x
Now with 'delta-abs' it shows entries have bigger delta value either
positive or negative.
$ perf diff -o 1 -c delta-abs | grep -v ^# | head
42.22% +4.97% [kernel.kallsyms] [k] cfb_imageblit
12.72% -3.01% [kernel.kallsyms] [k] intel_idle
9.72% -1.31% [unknown] [.] 0x0000000000411343
0.62% +1.23% [kernel.kallsyms] [k] mutex_lock
2.40% +0.95% [kernel.kallsyms] [k] bit_putcs
0.31% +0.79% [kernel.kallsyms] [k] link_path_walk
1.35% -0.71% [kernel.kallsyms] [k] smp_call_function_single
0.00% +0.57% [kernel.kallsyms] [k] __rcu_read_unlock
0.16% +0.45% [kernel.kallsyms] [k] menu_select
0.72% -0.44% [kernel.kallsyms] [k] lookup_fast
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20170210073614.24584-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Fri, 10 Feb 2017 14:41:11 +0000 (11:41 -0300)]
tools include: Introduce linux/compiler-gcc.h
To match the kernel headers structure, setting up things that are
specific to gcc or to some specific version of gcc.
It gets included by linux/compiler.h when gcc is the compiler being
used.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Perches <joe@perches.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-fabcqfq4asodq9t158hcs8t3@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Srinivas Pandruvada [Fri, 10 Feb 2017 19:38:37 +0000 (11:38 -0800)]
perf/x86/intel: Add Kaby Lake support
Add Kaby Lake mobile and desktop models for RAPL, CSTATE and UNCORE
matching Skylake.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: peterz@infradead.org
Cc: kan.liang@intel.com
Cc: bigeasy@linutronix.de
Cc: dave.hansen@linux.intel.com
Cc: piotr.luc@intel.com
Cc: davidcc@google.com
Cc: bp@suse.de
Link: http://lkml.kernel.org/r/1486755517-17812-1-git-send-email-srinivas.pandruvada@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Alexander Shishkin [Thu, 26 Jan 2017 09:40:57 +0000 (11:40 +0200)]
perf/core: Allow kernel filters on CPU events
While supporting file-based address filters for CPU events requires some
extra context switch handling, kernel address filters are easy, since the
kernel mapping is preserved across address spaces. It is also useful as
it permits tracing scheduling paths of the kernel.
This patch allows setting up kernel filters for CPU events.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Will Deacon <will.deacon@arm.com>
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/20170126094057.13805-4-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Alexander Shishkin [Thu, 26 Jan 2017 09:40:56 +0000 (11:40 +0200)]
perf/core: Do error out on a kernel filter on an exclude_filter event
It is currently possible to configure a kernel address filter for a
event that excludes kernel from its traces (attr.exclude_kernel==1).
While in reality this doesn't make sense, the SET_FILTER ioctl() should
return a error in such case, currently it does not. Furthermore, it
will still silently discard the filter and any potentially valid filters
that came with it.
This patch makes the SET_FILTER ioctl() error out in such cases.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Will Deacon <will.deacon@arm.com>
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/20170126094057.13805-3-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Fri, 10 Feb 2017 07:36:12 +0000 (08:36 +0100)]
Merge tag 'perf-core-for-mingo-4.11-
20170209' of git://git./linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
User visible changes:
- Add support for parsing Intel uncore vendor event files and add uncore
vendor events for the Intel server processors (Haswell, Broadwell,
IvyBridge), Xeon Phi (Knights Landing) and Broadwell DE (Andi Kleen)
- Support --symfs in 'perf probe' (Uwe Kleine-König)
- Add support for generating bpf prologue on the aarch64 architecture (He Kuang)
- Show proper hint when SDT event not yet in place via 'perf probe' (Ravi Bangoria)
- Take into account symfs setting when reading file build ID (Victor Kamensky)
Infrastructure changes:
- Map gcc7's '__attribute__ ((fallthrough))', that warns when code
associated to case blocks in switches continue into the next case entry,
to '__falltrough' and use it where warned by gcc, tested on Fedora Rawhide
(Arnaldo Carvalho de Melo)
- Fix buffer sizes used with snprintf that could lead to truncation,
another warning introduced in gcc7 (Arnaldo Carvalho de Melo)
- Robustify do_generate_dynamic_list_file in libtraceevent (David Carrillo-Cisneros)
- Use zfree() in more places (Taeung Song)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Arnaldo Carvalho de Melo [Thu, 9 Feb 2017 18:22:22 +0000 (15:22 -0300)]
perf intel-pt: Use __fallthrough
To address new warnings emmited by gcc 7, e.g.::
CC /tmp/build/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.o
CC /tmp/build/perf/tests/parse-events.o
util/intel-pt-decoder/intel-pt-pkt-decoder.c: In function 'intel_pt_pkt_desc':
util/intel-pt-decoder/intel-pt-pkt-decoder.c:499:6: error: this statement may fall through [-Werror=implicit-fallthrough=]
if (!(packet->count))
^
util/intel-pt-decoder/intel-pt-pkt-decoder.c:501:2: note: here
case INTEL_PT_CYC:
^~~~
CC /tmp/build/perf/util/intel-pt-decoder/intel-pt-decoder.o
cc1: all warnings being treated as errors
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-mf0hw789pu9x855us5l32c83@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Thu, 9 Feb 2017 17:48:46 +0000 (14:48 -0300)]
perf tests: Avoid possible truncation with dirent->d_name + snprintf
Addressing a few cases spotted by a new warning in gcc 7:
tests/parse-events.c: In function 'test_pmu_events':
tests/parse-events.c:1790:39: error: '%s' directive output may be truncated writing up to 255 bytes into a region of size 90 [-Werror=format-truncation=]
snprintf(name, MAX_NAME, "cpu/event=%s/u", ent->d_name);
^~
In file included from /usr/include/stdio.h:939:0,
from /git/linux/tools/perf/util/map.h:9,
from /git/linux/tools/perf/util/symbol.h:7,
from /git/linux/tools/perf/util/evsel.h:10,
from tests/parse-events.c:3:
/usr/include/bits/stdio2.h:64:10: note: '__builtin___snprintf_chk' output between 13 and 268 bytes into a destination of size 100
return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
__bos (__s), __fmt, __va_arg_pack ());
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
tests/parse-events.c:1798:29: error: '%s' directive output may be truncated writing up to 255 bytes into a region of size 100 [-Werror=format-truncation=]
snprintf(name, MAX_NAME, "%s:u,cpu/event=%s/u", ent->d_name, ent->d_name);
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes:
945aea220bb8 ("perf tests: Move test objects into 'tests' directory")
Link: http://lkml.kernel.org/n/tip-ty4q2p8zp1dp3mskvubxskm5@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Thu, 9 Feb 2017 17:39:42 +0000 (14:39 -0300)]
perf bench numa: Avoid possible truncation when using snprintf()
Addressing this warning from gcc 7:
CC /tmp/build/perf/bench/numa.o
bench/numa.c: In function '__bench_numa':
bench/numa.c:1582:42: error: '%d' directive output may be truncated writing between 1 and 10 bytes into a region of size between 8 and 17 [-Werror=format-truncation=]
snprintf(tname, 32, "process%d:thread%d", p, t);
^~
bench/numa.c:1582:25: note: directive argument in the range [0,
2147483647]
snprintf(tname, 32, "process%d:thread%d", p, t);
^~~~~~~~~~~~~~~~~~~~
In file included from /usr/include/stdio.h:939:0,
from bench/../util/util.h:47,
from bench/../builtin.h:4,
from bench/numa.c:11:
/usr/include/bits/stdio2.h:64:10: note: '__builtin___snprintf_chk' output between 17 and 35 bytes into a destination of size 32
return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
__bos (__s), __fmt, __va_arg_pack ());
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Petr Holasek <pholasek@redhat.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-twa37vsfqcie5gwpqwnjuuz9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Thu, 9 Feb 2017 00:57:22 +0000 (21:57 -0300)]
perf header: Fix handling of PERF_EVENT_UPDATE__SCALE
In commit
daeecbc0c431 ("perf tools: Add event_update event scale type"), the
handling of PERF_EVENT_UPDATE__SCALE cast struct event_update_event->data to a
pointer to event_update_event_scale, uses some field from this casted struct
and then ends up falling through to the handling of another event type,
PERF_EVENT_UPDATE__CPUS were it casts that ev->data to yet another type, oops,
fix it by inserting the missing break.
Noticed when building perf using gcc 7 on Fedora Rawhide:
util/header.c: In function 'perf_event__process_event_update':
util/header.c:3207:16: error: this statement may fall through [-Werror=implicit-fallthrough=]
evsel->scale = ev_scale->scale;
~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~
util/header.c:3208:2: note: here
case PERF_EVENT_UPDATE__CPUS:
^~~~
This wasn't noticed because probably PERF_EVENT_UPDATE__CPUS comes after
PERF_EVENT_UPDATE__SCALE, so we would just create a bogus evsel->own_cpus when
processing a PERF_EVENT_UPDATE__SCALE to then leak it and create a new cpu map
with the correct data.
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Fixes:
daeecbc0c431 ("perf tools: Add event_update event scale type")
Link: http://lkml.kernel.org/n/tip-lukcf9hdj092ax2914ss95at@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Wed, 8 Feb 2017 20:01:46 +0000 (17:01 -0300)]
perf thread_map: Correctly size buffer used with dirent->dt_name
The size of dirent->dt_name is NAME_MAX + 1, but the size for the 'path'
buffer is hard coded at 256, which may truncate it because we also
prepend "/proc/", so that all that into account and thank gcc 7 for this
warning:
/git/linux/tools/perf/util/thread_map.c: In function 'thread_map__new_by_uid':
/git/linux/tools/perf/util/thread_map.c:119:39: error: '%s' directive output may be truncated writing up to 255 bytes into a region of size 250 [-Werror=format-truncation=]
snprintf(path, sizeof(path), "/proc/%s", dirent->d_name);
^~
In file included from /usr/include/stdio.h:939:0,
from /git/linux/tools/perf/util/thread_map.c:5:
/usr/include/bits/stdio2.h:64:10: note: '__builtin___snprintf_chk' output between 7 and 262 bytes into a destination of size 256
return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
__bos (__s), __fmt, __va_arg_pack ());
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-csy0r8zrvz5efccgd4k12c82@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Wed, 8 Feb 2017 20:01:46 +0000 (17:01 -0300)]
perf top: Use __fallthrough
The implicit fall through case label here is intended, so let us inform
that to gcc >= 7:
CC /tmp/build/perf/builtin-top.o
builtin-top.c: In function 'display_thread':
builtin-top.c:644:7: error: this statement may fall through [-Werror=implicit-fallthrough=]
if (errno == EINTR)
^
builtin-top.c:647:3: note: here
default:
^~~~~~~
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-lmcfnnyx9ic0m6j0aud98p4e@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Wed, 8 Feb 2017 20:01:46 +0000 (17:01 -0300)]
tools strfilter: Use __fallthrough
The implicit fall through case label here is intended, so let us inform
that to gcc >= 7:
util/strfilter.c: In function 'strfilter_node__sprint':
util/strfilter.c:270:6: error: this statement may fall through [-Werror=implicit-fallthrough=]
if (len < 0)
^
util/strfilter.c:272:2: note: here
case '!':
^~~~
cc1: all warnings being treated as errors
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-z2dpywg7u8fim000hjfbpyfm@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Wed, 8 Feb 2017 20:01:46 +0000 (17:01 -0300)]
tools string: Use __fallthrough in perf_atoll()
The implicit fall through case label here is intended, so let us inform
that to gcc >= 7:
CC /tmp/build/perf/util/string.o
util/string.c: In function 'perf_atoll':
util/string.c:22:7: error: this statement may fall through [-Werror=implicit-fallthrough=]
if (*p)
^
util/string.c:24:3: note: here
case '\0':
^~~~
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-0ophb30v9apkk6o95el0rqlq@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Wed, 8 Feb 2017 20:01:46 +0000 (17:01 -0300)]
tools include: Add a __fallthrough statement
For cases where implicit fall through case labels are intended,
to let us inform that to gcc >= 7:
CC /tmp/build/perf/util/string.o
util/string.c: In function 'perf_atoll':
util/string.c:22:7: error: this statement may fall through [-Werror=implicit-fallthrough=]
if (*p)
^
util/string.c:24:3: note: here
case '\0':
^~~~
So we introduce:
#define __fallthrough __attribute__ ((fallthrough))
And use it in such cases.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: William Cohen <wcohen@redhat.com>
Link: http://lkml.kernel.org/n/tip-qnpig0xfop4hwv6k4mv1wts5@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Mickaël Salaün [Tue, 7 Feb 2017 20:56:05 +0000 (21:56 +0100)]
tools lib bpf: Add missing header to the library
Include stddef.h to define size_t.
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Acked-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: Joe Stringer <joe@ovn.org>
Link: http://lkml.kernel.org/r/20170207205609.8035-2-mic@digikod.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Andi Kleen [Mon, 10 Oct 2016 16:33:59 +0000 (09:33 -0700)]
perf vendor events intel: Add uncore events for Broadwell DE
This is not a full uncore event list, but a short list of useful
and understandable metrics.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/n/tip-c0cix4eprbldfrx5zf60suvh@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Andi Kleen [Mon, 19 Sep 2016 16:36:31 +0000 (09:36 -0700)]
perf vendor events intel: Add uncore events for Xeon Phi (Knights Landing)
Add metrics for memory and MCDRAM. Minimal metrics only for now.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/n/tip-c0cix4eprbldfrx5zf60suvh@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Andi Kleen [Sun, 18 Sep 2016 01:10:03 +0000 (18:10 -0700)]
perf vendor events intel: Add uncore events for Sandy Bridge Server
This is not a full uncore event list, but a short list of useful
and understandable metrics.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/n/tip-c0cix4eprbldfrx5zf60suvh@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Andi Kleen [Sun, 18 Sep 2016 01:09:40 +0000 (18:09 -0700)]
perf vendor events intel: Add uncore events for IvyBridge Server
This is not a full uncore event list, but a short list of useful
and understandable metrics.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/n/tip-c0cix4eprbldfrx5zf60suvh@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Andi Kleen [Sun, 18 Sep 2016 01:09:21 +0000 (18:09 -0700)]
perf vendor events intel: Add uncore events for Broadwell Server
This is not a full uncore event list, but a short list of useful
and understandable metrics.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/n/tip-c0cix4eprbldfrx5zf60suvh@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Andi Kleen [Sun, 18 Sep 2016 01:08:45 +0000 (18:08 -0700)]
perf vendor events intel: Add uncore events for Haswell Server processor
This is not a full uncore event list, but a short list of useful and
understandable metrics.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/n/tip-c0cix4eprbldfrx5zf60suvh@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Tue, 7 Feb 2017 19:15:21 +0000 (16:15 -0300)]
perf tools: Fix include of linux/mman.h
It was using uapi/linux/mmap.h which caused for at least one reporter,
that hasn't specified in what environment the problem manifests itself:
----
The original error is:
In file included from util/event.c:2:0:
...tools/include/uapi/linux/mman.h:4:27: fatal error: uapi/asm/mman.h:
No such file or directory
#include <uapi/asm/mman.h>
^
compilation terminated.
----
Test built it on these containers:
# dm
1 alpine:3.4: Ok
2 android-ndk:r12b-arm: Ok
3 archlinux:latest: Ok
4 centos:5: Ok
5 centos:6: Ok
6 centos:7: Ok
7 debian:7: Ok
8 debian:8: Ok
9 debian:experimental: Ok
10 debian:experimental-x-arm64: Ok
11 debian:experimental-x-mips: Ok
12 debian:experimental-x-mips64: Ok
13 debian:experimental-x-mipsel: Ok
14 fedora:20: Ok
15 fedora:21: Ok
16 fedora:22: Ok
17 fedora:23: Ok
18 fedora:24: Ok
19 fedora:24-x-ARC-uClibc: Ok
20 fedora:25: Ok
21 fedora:rawhide: Ok
22 mageia:5: Ok
23 opensuse:13.2: Ok
24 opensuse:42.1: Ok
25 opensuse:tumbleweed: Ok
26 ubuntu:12.04.5: Ok
27 ubuntu:14.04.4-x-linaro-arm64: Ok
28 ubuntu:15.10: Ok
29 ubuntu:16.04: Ok
30 ubuntu:16.04-x-arm: Ok
31 ubuntu:16.04-x-arm64: Ok
32 ubuntu:16.04-x-powerpc: Ok
33 ubuntu:16.04-x-powerpc64: Ok
34 ubuntu:16.04-x-powerpc64el: Ok
35 ubuntu:16.04-x-s390: Ok
36 ubuntu:16.10: Ok
Reported-by: David Carrillo-Cisneros <davidcc@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Michal Marek <mmarek@suse.com>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes:
fbef103fad50 ("perf tools: Do hugetlb handling in more systems")
Link: http://lkml.kernel.org/n/tip-4wm5xmjz5wgbq7ucyz4dyd72@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
David Carrillo-Cisneros [Wed, 8 Feb 2017 05:28:40 +0000 (21:28 -0800)]
tools lib traceevent: Robustify do_generate_dynamic_list_file
The dynamic-list-file used to export dynamic symbols introduced in
commit
e3d09ec8126f ("tools lib traceevent: Export dynamic symbols
used by traceevent plugins")
is generated without any sort of error checking.
I experienced problems due to an old version of nm (v 0.158) that outputs
in a format distinct from the assumed by the script.
Robustify the built of dynamic symbol list by enforcing that the second
column of $(NM) -u <files> is either "U" (Undefined), "W" or "w" (undefined
weak), which are the possible outputs from non-ancient $(NM) versions.
Print an error if format is unexpected.
v2: Accept "W" and "w" symbol options.
Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Michal Marek <mmarek@suse.com>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20170208052840.112182-1-davidcc@google.com
[ Use STRING1 = STRING1 instead of == to make this work on Ubuntu systems ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Taeung Song [Wed, 1 Feb 2017 12:34:07 +0000 (21:34 +0900)]
perf tools: Use zfree() to avoid keeping dangling pointers
The cases changed in this patch are for when we free but keep the
pointer to the freed area, which is not always a good idea.
Be more defensive and zero the pointer to avoid possible use after
free bugs to take more time to be detected.
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1485952447-7013-5-git-send-email-treeze.taeung@gmail.com
[ rewrote commit log ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Taeung Song [Wed, 1 Feb 2017 12:34:06 +0000 (21:34 +0900)]
perf tools: Use zfree() instead of ad hoc equivalent
We have zfree(&ptr) for this very common pattern:
free(ptr);
ptr = NULL;
So use it in a few more places.
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1485952447-7013-4-git-send-email-treeze.taeung@gmail.com
[ rewrote commit log ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Taeung Song [Wed, 1 Feb 2017 12:34:05 +0000 (21:34 +0900)]
perf tools: Add missing check for failure in a zalloc() call
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1485952447-7013-3-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Taeung Song [Wed, 1 Feb 2017 12:34:04 +0000 (21:34 +0900)]
perf tools: Only increase index if perf_evsel__new_idx() succeeds
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1485952447-7013-2-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Uwe Kleine-König [Thu, 21 Jul 2016 09:48:32 +0000 (11:48 +0200)]
perf probe: Add option --symfs
perf probe makes use of debug symbols, so add --symfs as the other
commands have.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel@pengutronix.de
Link: http://lkml.kernel.org/r/1469094512-13440-2-git-send-email-u.kleine-koenig@pengutronix.de
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Victor Kamensky [Mon, 6 Feb 2017 23:48:28 +0000 (15:48 -0800)]
perf symbols: Take into account symfs setting when reading file build ID
After commit
5baecbcd9c9a ("perf symbols: we can now read separate
debug-info files based on a build ID") and when --symfs option is used
perf failed to pick up symbols for file with the same name between host
and sysroot specified by --symfs option. One can see message like this:
bin/bash with build id
26f0062cb6950d4d1ab0fd9c43eae8b10ca42062 not found, continuing without symbols
It happens because code added by
5baecbcd9c9a opens files directly by
dso->long_name without symbol_conf.symfs consideration, which as result
picks one from the host. It reads its build ID and later even code finds
another proper file in directory pointed by --symfs perf ignores it
because build id mismatches.
Fix is to use __symbol__join_symfs to adjust file name according to
--symfs setting. If no --symfs passed the operation would noop and picks
the same host file as before.
Also note in latter tree after
5baecbcd9c9a commit additional check for
'!dso->has_build_id' was added, so to observe error condition 'perf
record' should run with --no-buildid, so perf.data itself would not have
build id for target binary in buildid perf section and 'perf report'
will pass '!dso->has_build_id' condition. Or target binary should not
have build id, but the same binary on host has build id, again
'!dso->has_build_id' will pass in this case and incorrect build id could
be read if --symfs is used.
Signed-off-by: Victor Kamensky <kamensky@cisco.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chris Phlipot <cphlipot0@gmail.com>
Cc: Dima Kogan <dima@secretsauce.net>
Cc: He Kuang <hekuang@huawei.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: xe-linux-external@cisco.com
Fixes:
5baecbcd9c9a ("perf symbols: we can now read separate debug-info files based on a build ID")
Link: http://lkml.kernel.org/r/1486424908-17094-1-git-send-email-kamensky@cisco.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ravi Bangoria [Fri, 3 Feb 2017 10:26:42 +0000 (15:56 +0530)]
perf sdt: Show proper hint when event not yet in place via 'perf probe'
All events from 'perf list', except SDT events, can be directly recorded
with 'perf record'. But, the flow is little different for SDT events.
Probe points for SDT event needs to be created using 'perf probe' before
recording it using 'perf record'.
Perf shows misleading hint when a user tries to record SDT event without
first creating a probe point. Show proper hint there.
Before patch:
$ perf record -a -e sdt_glib:idle__add
event syntax error: 'sdt_glib:idle__add'
\___ unknown tracepoint
Error: File /sys/kernel/debug/tracing/events/sdt_glib/idle__add not found.
Hint: Perhaps this kernel misses some CONFIG_ setting to enable this feature?.
...
After patch:
$ perf record -a -e sdt_glib:idle__add
event syntax error: 'sdt_glib:idle__add'
\___ unknown tracepoint
Error: File /sys/kernel/debug/tracing/events/sdt_glib/idle__add not found.
Hint: SDT event cannot be directly recorded on.
Please first use 'perf probe sdt_glib:idle__add' before recording it.
...
$ perf probe sdt_glib:idle__add
Added new event:
sdt_glib:idle__add (on %idle__add in /usr/lib64/libglib-2.0.so.0.5000.2)
You can now use it in all perf tools, such as:
perf record -e sdt_glib:idle__add -aR sleep 1
$ perf record -a -e sdt_glib:idle__add
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.175 MB perf.data ]
Suggested-and-Acked-by: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexis Berlemont <alexis.berlemont@gmail.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170203102642.17258-1-ravi.bangoria@linux.vnet.ibm.com
[ s/Please use/Please first use/ and break the Hint line in two ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Andi Kleen [Sat, 28 Jan 2017 02:03:40 +0000 (18:03 -0800)]
perf list: Add debug support for outputing alias string
For debugging and testing it is useful to see the converted alias
string. Add support to perf stat/record and perf list to print the alias
conversion. The text string is saved in the alias structure. For perf
stat/record it is folded into the normal -v. For perf list -v was taken,
so we use --debug.
Before:
% perf list
...
cache:
l1d.replacement
[L1D data line replacements]
l1d_pend_miss.fb_full
[Cycles a demand request was blocked due to Fill Buffers inavailability]
After
% perf list --debug
...
cache:
l1d.replacement
[L1D data line replacements]
cpu/umask=0x1,period=2000003,event=0x51/
l1d_pend_miss.fb_full
[Cycles a demand request was blocked due to Fill Buffers inavailability]
cpu/umask=0x2,period=2000003,cmask=1,event=0x48/
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170128020345.19007-6-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Andi Kleen [Sat, 28 Jan 2017 02:03:39 +0000 (18:03 -0800)]
perf pmu: Support event aliases for non cpu// pmus
The code for handling pmu aliases without specifying the PMU hardcoded
only supported the cpu PMU.
This patch extends it to work for all PMUs. We always duplicate the
event for all PMUs that have an matching alias. This allows to
automatically expand an alias for all instances of a PMU (so for example
you can monitor all cache boxes with a single event)
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170128020345.19007-5-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Andi Kleen [Sat, 28 Jan 2017 02:03:38 +0000 (18:03 -0800)]
perf pmu: Support per pmu json aliases
Add support for registering json aliases per PMU. Any alias with an unit
matching the prefix is registered to the PMU. Uncore has multiple
instances of most units, so all these aliases get registered for each
individual PMU (this is important later to run the event on every
instance of the PMU).
To avoid printing the events multiple times in perf list filter out
duplicated events during printing.
v2: Rely on uncore_ prefix already in unit
v3: Document why calls were reordered
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170128020345.19007-4-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Andi Kleen [Sat, 28 Jan 2017 02:03:37 +0000 (18:03 -0800)]
perf jevents: Add support for parsing uncore json files
Handle the "Unit" field, which is needed to find the right PMU for an
event. We call it "pmu" and convert it to the perf pmu name with an
uncore prefix.
Handle the "ExtSel" field, which just extends the event mask with an
additional bit.
Handle the "Filter" field which adds parameters to the main event
to configure filtering.
Handle the "Unit" field which declares the unit the values should be
scaled too (similar to what the kernel exports)
Set up the "perpkg" field for uncore events so that perf knows they are
per package (similar to what the kernel exports)
Then output the fields into the pmu-events data structures which are
compiled into perf.
Filter out zero fields, except for the event itself.
v2: Fix compilation. Add uncore_ prefix at pre-processing time.
Move eventcode change to separate patch.
v3: Remove extra __maybe_unused
v4: dont duplicate aliases for cpu pmu events
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170128020345.19007-3-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Andi Kleen [Sat, 28 Jan 2017 02:03:36 +0000 (18:03 -0800)]
perf jevents: Parse eventcode as number
The next patch needs to modify event code. Previously eventcode was just
passed through as a string. Now parse it as a number.
v2: Don't special case 0
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170128020345.19007-2-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
He Kuang [Tue, 7 Feb 2017 07:34:12 +0000 (07:34 +0000)]
perf bpf: Add missing newline in debug messages
These two debug messages are missing the trailing newline.
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bintian Wang <bintian.wang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20170207073412.26983-2-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
He Kuang [Tue, 7 Feb 2017 07:34:11 +0000 (07:34 +0000)]
perf tools arm64: Add support for generating bpf prologue
Since HAVE_KPROBES can be enabled in arm64, this patch introduces
regs_query_register_offset() to convert register name to offset for
arm64, so the BPF prologue feature is ready to use.
Signed-off-by: He Kuang <hekuang@huawei.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bintian Wang <bintian.wang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20170207073412.26983-1-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Mon, 6 Feb 2017 14:09:26 +0000 (11:09 -0300)]
Merge remote-tracking branch 'tip/perf/urgent' into perf/core
To pick fixes that are affecting tests of new 'perf diff' features in
perf/core.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Masami Hiramatsu [Mon, 6 Feb 2017 09:55:43 +0000 (18:55 +0900)]
kprobes/x86: Use hlist_for_each_entry() instead of hlist_for_each_entry_safe()
Use hlist_for_each_entry() in the first loop in the kretprobe
trampoline_handler() function, because it doesn't change the hlist.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/148637493309.19245.12546866092052500584.stgit@devbox
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Masami Hiramatsu [Mon, 6 Feb 2017 09:54:33 +0000 (18:54 +0900)]
kprobes/arm64: Remove a redundant dependency from the Kconfig
Remove the 'HAVE_KPROBES' dependency from the HAVE_KRETPROBES line,
since HAVE_KPROBES is already selected unconditionally in the Kconfig
line above this one.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: David A. Long <dave.long@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandeepa Prabhu <sandeepa.s.prabhu@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/148637486369.19245.316601692744886725.stgit@devbox
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Mon, 6 Feb 2017 10:04:39 +0000 (11:04 +0100)]
Merge branch 'perf/urgent' into perf/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Fri, 3 Feb 2017 19:42:30 +0000 (20:42 +0100)]
Merge tag 'perf-urgent-for-mingo-4.10-
20170203' of git://git./linux/kernel/git/acme/linux into perf/urgent
Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
- Reference count maps in callchains, fixing a SEGFAULT when referencing a
map after it is freed (Krister Johansen)
- Fix segfault on 'perf diff -o N' option (Namhyung Kim)
- Fix 'perf diff -o/--order' option behavior (Namhyung Kim)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Linus Torvalds [Thu, 2 Feb 2017 22:08:58 +0000 (14:08 -0800)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"Misc fixes:
- two microcode loader fixes
- two FPU xstate handling fixes
- an MCE timer handling related crash fix"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mce: Make timer handling more robust
x86/microcode: Do not access the initrd after it has been freed
x86/fpu/xstate: Fix xcomp_bv in XSAVES header
x86/fpu: Set the xcomp_bv when we fake up a XSAVES area
x86/microcode/intel: Drop stashed AP patch pointer optimization
Linus Torvalds [Thu, 2 Feb 2017 21:30:19 +0000 (13:30 -0800)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"Five kernel fixes:
- an mmap tracing ABI fix for certain mappings
- a use-after-free fix, found via KASAN
- three CPU hotplug related x86 PMU driver fixes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel/uncore: Make package handling more robust
perf/x86/intel/uncore: Clean up hotplug conversion fallout
perf/x86/intel/rapl: Make package handling more robust
perf/core: Fix PERF_RECORD_MMAP2 prot/flags for anonymous memory
perf/core: Fix use-after-free bug
Linus Torvalds [Thu, 2 Feb 2017 21:20:23 +0000 (13:20 -0800)]
Merge branch 'efi-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull EFI fixes from Ingo Molnar:
"Two EFI boot fixes, one for arm64 and one for x86 systems with certain
firmware versions"
* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efi/fdt: Avoid FDT manipulation after ExitBootServices()
x86/efi: Always map the first physical page into the EFI pagetables
Linus Torvalds [Thu, 2 Feb 2017 20:54:45 +0000 (12:54 -0800)]
Merge branch 'core-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull objtool fix from Ingo Molnar:
"A fix for a bad opcode in objtool's instruction decoder"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool: Fix IRET's opcode
Linus Torvalds [Thu, 2 Feb 2017 20:49:58 +0000 (12:49 -0800)]
Merge tag 'nfsd-4.10-2' of git://linux-nfs.org/~bfields/linux
Pull nfsd fixes from Bruce Fields:
"Three more miscellaneous nfsd bugfixes"
* tag 'nfsd-4.10-2' of git://linux-nfs.org/~bfields/linux:
svcrpc: fix oops in absence of krb5 module
nfsd: special case truncates some more
NFSD: Fix a null reference case in find_or_create_lock_stateid()
Linus Torvalds [Thu, 2 Feb 2017 20:39:10 +0000 (12:39 -0800)]
Merge tag 'xtensa-
20170202' of git://github.com/jcmvbkbc/linux-xtensa
Pull Xtensa fix from Max Filippov:
"A for an Xtensa build error introduced in reset code refactoring
series in v4.9:
- fix noMMU build on cores with MMU"
* tag 'xtensa-
20170202' of git://github.com/jcmvbkbc/linux-xtensa:
xtensa: fix noMMU build on cores with MMU
Linus Torvalds [Thu, 2 Feb 2017 20:34:27 +0000 (12:34 -0800)]
Merge tag 'pci-v4.10-fixes-2' of git://git./linux/kernel/git/helgaas/pci
Pull PCI fix from Bjorn Helgaas:
"Configure ASPM on the link from a PCI-to-PCIe bridge (avoids a NULL
pointer dereference on topologies including these bridges)"
* tag 'pci-v4.10-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI/ASPM: Handle PCI-to-PCIe bridges as roots of PCIe hierarchies
Krister Johansen [Fri, 6 Jan 2017 06:23:31 +0000 (22:23 -0800)]
perf callchain: Reference count maps
If dso__load_kcore frees all of the existing maps, but one has already
been attached to a callchain cursor node, then we can get a SIGSEGV in
any function that happens to try to use this invalid cursor. Use the
existing map refcount mechanism to forestall cleanup of a map until the
cursor iterates past the node.
Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: stable@kernel.org
Fixes:
84c2cafa2889 ("perf tools: Reference count struct map")
Link: http://lkml.kernel.org/r/20170106062331.GB2707@templeofstupid.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Wed, 18 Jan 2017 05:14:57 +0000 (14:14 +0900)]
perf diff: Fix -o/--order option behavior (again)
Commit
21e6d8428664 ("perf diff: Use perf_hpp__register_sort_field
interface") changed list_add() to perf_hpp__register_sort_field().
This resulted in a behavior change since the field was added to the tail
instead of the head. So the -o option is mostly ignored due to its
order in the list.
This patch fixes it by adding perf_hpp__prepend_sort_field().
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Fixes:
21e6d8428664 ("perf diff: Use perf_hpp__register_sort_field interface")
Link: http://lkml.kernel.org/r/20170118051457.30946-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Wed, 18 Jan 2017 05:14:56 +0000 (14:14 +0900)]
perf diff: Fix segfault on 'perf diff -o N' option
The -o/--order option is to select column number to sort a diff result.
It does the job by adding a hpp field at the beginning of the sort list.
But it should not be added to the output field list as it has no
callbacks required by a output field.
During the setup_sorting(), the perf_hpp__setup_output_field() appends
the given sort keys to the output field if it's not there already.
Originally it was checked by fmt->list being non-empty. But commit
3f931f2c4274 ("perf hists: Make hpp setup function generic") changed it
to check the ->equal callback.
Anyways, we don't need to add the pseudo hpp field to the output field
list since it won't be used for output. So just skip fields if they
have no ->color or ->entry callbacks.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Fixes:
3f931f2c4274 ("perf hists: Make hpp setup function generic")
Link: http://lkml.kernel.org/r/20170118051457.30946-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ard Biesheuvel [Wed, 1 Feb 2017 17:45:02 +0000 (17:45 +0000)]
efi/fdt: Avoid FDT manipulation after ExitBootServices()
Some AArch64 UEFI implementations disable the MMU in ExitBootServices(),
after which unaligned accesses to RAM are no longer supported.
Commit:
abfb7b686a3e ("efi/libstub/arm*: Pass latest memory map to the kernel")
fixed an issue in the memory map handling of the stub FDT code, but
inadvertently created an issue with such firmware, by moving some
of the FDT manipulation to after the invocation of ExitBootServices().
Given that the stub's libfdt implementation uses the ordinary, accelerated
string functions, which rely on hardware handling of unaligned accesses,
manipulating the FDT with the MMU off may result in alignment faults.
So fix the situation by moving the update_fdt_memmap() call into the
callback function invoked by efi_exit_boot_services() right before it
calls the ExitBootServices() UEFI service (which is arguably a better
place for it anyway)
Note that disabling the MMU in ExitBootServices() is not compliant with
the UEFI spec, and carries great risk due to the fact that switching from
cached to uncached memory accesses halfway through compiler generated code
(i.e., involving a stack) can never be done in a way that is architecturally
safe.
Fixes:
abfb7b686a3e ("efi/libstub/arm*: Pass latest memory map to the kernel")
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Tested-by: Riku Voipio <riku.voipio@linaro.org>
Cc: <stable@vger.kernel.org>
Cc: mark.rutland@arm.com
Cc: linux-efi@vger.kernel.org
Cc: matt@codeblueprint.co.uk
Cc: leif.lindholm@linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1485971102-23330-2-git-send-email-ard.biesheuvel@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Linus Torvalds [Wed, 1 Feb 2017 19:52:27 +0000 (11:52 -0800)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) Fix handling of interrupt status in stmmac driver. Just because we
have masked the event from generating interrupts, doesn't mean the
bit won't still be set in the interrupt status register. From Alexey
Brodkin.
2) Fix DMA API debugging splats in gianfar driver, from Arseny Solokha.
3) Fix off-by-one error in __ip6_append_data(), from Vlad Yasevich.
4) cls_flow does not match on icmpv6 codes properly, from Simon Horman.
5) Initial MAC address can be set incorrectly in some scenerios, from
Ivan Vecera.
6) Packet header pointer arithmetic fix in ip6_tnl_parse_tlv_end_lim(),
from Dan Carpenter.
7) Fix divide by zero in __tcp_select_window(), from Eric Dumazet.
8) Fix crash in iwlwifi when unregistering thermal zone, from Jens
Axboe.
9) Check for DMA mapping errors in starfire driver, from Alexey
Khoroshilov.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (31 commits)
tcp: fix 0 divide in __tcp_select_window()
ipv6: pointer math error in ip6_tnl_parse_tlv_enc_lim()
net: fix ndo_features_check/ndo_fix_features comment ordering
net/sched: matchall: Fix configuration race
be2net: fix initial MAC setting
ipv6: fix flow labels when the traffic class is non-0
net: thunderx: avoid dereferencing xcv when NULL
net/sched: cls_flower: Correct matching on ICMPv6 code
ipv6: Paritially checksum full MTU frames
net/mlx4_core: Avoid command timeouts during VF driver device shutdown
gianfar: synchronize DMA API usage by free_skb_rx_queue w/ gfar_new_page
net: ethtool: add support for 2500BaseT and 5000BaseT link modes
can: bcm: fix hrtimer/tasklet termination in bcm op removal
net: adaptec: starfire: add checks for dma mapping errors
net: phy: micrel: KSZ8795 do not set SUPPORTED_[Asym_]Pause
can: Fix kernel panic at security_sock_rcv_skb
net: macb: Fix 64 bit addressing support for GEM
stmmac: Discard masked flags in interrupt status register
net/mlx5e: Check ets capability before ets query FW command
net/mlx5e: Fix update of hash function/key via ethtool
...
Linus Torvalds [Wed, 1 Feb 2017 18:30:56 +0000 (10:30 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/viro/vfs
Pull fscache fixes from Al Viro.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fscache: Fix dead object requeue
fscache: Clear outstanding writes when disabling a cookie
FS-Cache: Initialise stores_lock in netfs cookie
Eric Dumazet [Wed, 1 Feb 2017 16:33:53 +0000 (08:33 -0800)]
tcp: fix 0 divide in __tcp_select_window()
syszkaller fuzzer was able to trigger a divide by zero, when
TCP window scaling is not enabled.
SO_RCVBUF can be used not only to increase sk_rcvbuf, also
to decrease it below current receive buffers utilization.
If mss is negative or 0, just return a zero TCP window.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Wed, 1 Feb 2017 08:46:32 +0000 (11:46 +0300)]
ipv6: pointer math error in ip6_tnl_parse_tlv_enc_lim()
Casting is a high precedence operation but "off" and "i" are in terms of
bytes so we need to have some parenthesis here.
Fixes:
fbfa743a9d2a ("ipv6: fix ip6_tnl_parse_tlv_enc_lim()")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 1 Feb 2017 17:24:00 +0000 (09:24 -0800)]
Merge branch 'linus' of git://git./linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
"This fixes a bug in CBC/CTR on ARM64 that breaks chaining as well as a
bug in the core API that causes registration failures when a driver
unloads and then reloads an algorithm"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: arm64/aes-blk - honour iv_out requirement in CBC and CTR modes
crypto: api - Clear CRYPTO_ALG_DEAD bit before registering an alg
Linus Torvalds [Wed, 1 Feb 2017 17:22:08 +0000 (09:22 -0800)]
Merge tag 'dmaengine-fix-4.10-rc7' of git://git.infradead.org/users/vkoul/slave-dma
Pull dmaengine fixes from Vinod Koul:
"A couple of fixes showed up late in the cycle so sending them up and
sending early in the week and not on Friday :).
They fix a double lock in pl330 driver and runtime pm fixes for cppi
driver"
* tag 'dmaengine-fix-4.10-rc7' of git://git.infradead.org/users/vkoul/slave-dma:
dmaengine: pl330: fix double lock
dmaengine: cppi41: Clean up pointless warnings
dmaengine: cppi41: Fix oops in cppi41_runtime_resume
dmaengine: cppi41: Fix runtime PM timeouts with USB mass storage
Dimitris Michailidis [Wed, 1 Feb 2017 00:03:13 +0000 (16:03 -0800)]
net: fix ndo_features_check/ndo_fix_features comment ordering
Commit
cdba756f5803a2 ("net: move ndo_features_check() close to
ndo_start_xmit()") inadvertently moved the doc comment for
.ndo_fix_features instead of .ndo_features_check. Fix the comment
ordering.
Fixes:
cdba756f5803a2 ("net: move ndo_features_check() close to ndo_start_xmit()")
Signed-off-by: Dimitris Michailidis <dmichail@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yotam Gigi [Tue, 31 Jan 2017 13:14:29 +0000 (15:14 +0200)]
net/sched: matchall: Fix configuration race
In the current version, the matchall internal state is split into two
structs: cls_matchall_head and cls_matchall_filter. This makes little
sense, as matchall instance supports only one filter, and there is no
situation where one exists and the other does not. In addition, that led
to some races when filter was deleted while packet was processed.
Unify that two structs into one, thus simplifying the process of matchall
creation and deletion. As a result, the new, delete and get callbacks have
a dummy implementation where all the work is done in destroy and change
callbacks, as was done in cls_cgroup.
Fixes:
bf3994d2ed31 ("net/sched: introduce Match-all classifier")
Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 1 Feb 2017 16:34:13 +0000 (08:34 -0800)]
Merge tag 'pinctrl-v4.10-4' of git://git./linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
"Another week, another set of pin control fixes. The subsystem has seen
high patch-spot activity recently.
The majority of the patches are for Intel, I vaguely think it mostly
concern phones, tablets and maybe chromebooks and even laptops with
this Intel Atom family chips.
Driver fixes only:
- one fix to the Berlin driver making the SD card work fully again.
- one fix to the Allwinner/sunxi bias function: one premature change
needs to be partially reverted.
- the remaining four patches are to Intel embedded SoCs: baytrail
(three patches) and merrifield (one patch): register access
debounce fixes and a missing spinlock"
* tag 'pinctrl-v4.10-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: baytrail: Add missing spinlock usage in byt_gpio_irq_handler
pinctrl: baytrail: Debounce register is one per community
pinctrl: baytrail: Rectify debounce support (part 2)
pinctrl: intel: merrifield: Add missed check in mrfld_config_set()
pinctrl: sunxi: Don't enforce bias disable (for now)
pinctrl: berlin-bg4ct: fix the value for "sd1a" of pin SCRD0_CRD_PRES
Ivan Vecera [Tue, 31 Jan 2017 19:01:31 +0000 (20:01 +0100)]
be2net: fix initial MAC setting
Recent commit
34393529163a ("be2net: fix MAC addr setting on privileged
BE3 VFs") allows privileged BE3 VFs to set its MAC address during
initialization. Although the initial MAC for such VFs is already
programmed by parent PF the subsequent setting performed by VF is OK,
but in certain cases (after fresh boot) this command in VF can fail.
The MAC should be initialized only when:
1) no MAC is programmed (always except BE3 VFs during first init)
2) programmed MAC is different from requested (e.g. MAC is set when
interface is down). In this case the initial MAC programmed by PF
needs to be deleted.
The adapter->dev_mac contains MAC address currently programmed in HW so
it should be zeroed when the MAC is deleted from HW and should not be
filled when MAC is set when interface is down in be_mac_addr_set() as
no programming is performed in this case.
Example of failure without the fix (immediately after fresh boot):
# ip link set eth0 up <- eth0 is BE3 PF
be2net 0000:01:00.0 eth0: Link is Up
# echo 1 > /sys/class/net/eth0/device/sriov_numvfs <- Create 1 VF
...
be2net 0000:01:04.0: Emulex OneConnect(be3): VF port 0
# ip link set eth8 up <- eth8 is created privileged VF
be2net 0000:01:04.0: opcode 59-1 failed:status 1-76
RTNETLINK answers: Input/output error
# echo 0 > /sys/class/net/eth0/device/sriov_numvfs <- Delete VF
iommu: Removing device 0000:01:04.0 from group 33
...
# echo 1 > /sys/class/net/eth0/device/sriov_numvfs <- Create it again
iommu: Removing device 0000:01:04.0 from group 33
...
# ip link set eth8 up
be2net 0000:01:04.0 eth8: Link is Up
Initialization is now OK.
v2 - Corrected the comment and condition check suggested by Suresh & Harsha
Fixes:
34393529163a ("be2net: fix MAC addr setting on privileged BE3 VFs")
Cc: Sathya Perla <sathya.perla@broadcom.com>
Cc: Ajit Khaparde <ajit.khaparde@broadcom.com>
Cc: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Cc: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Ivan Vecera <cera@cera.cz>
Acked-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ingo Molnar [Wed, 1 Feb 2017 14:32:36 +0000 (15:32 +0100)]
Merge tag 'perf-core-for-mingo-4.11-
20170201' of git://git./linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
New features:
- Allow configuring a 'perf ftrace' default --tracer (Taeung Song)
Infrastructure changes:
- Sync tools/arch/{powerpc,arm}/include/uapi/asm/kvm.h and
tools/arch/x86/include/asm/cpufeatures.h (Ingo Molnar)
- Add BPF program file system pinning APIs and respective
'perf test' entry (Joe Stringer)
- Make tools tree support 'make -s' (Josh Poimboeuf)
- Reference count maps in callchains, fixing SEGFAULT when
referencing maps after it is freed (Krister Johansen)
- Create for_each_event trace points iterator (Taeung Song)
- Do not consider an error not to have any perfconfig file
(Arnaldo Carvalho de Melo
- Propagate perf_config() errors (Arnaldo Carvalho de Melo)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Alexander Shishkin [Fri, 27 Jan 2017 15:16:43 +0000 (17:16 +0200)]
perf/x86/intel/pt: Add format strings for PTWRITE and power event tracing
Commit:
8ee83b2ab3 ("perf/x86/intel/pt: Add support for PTWRITE and power event tracing")
forgot to add format strings to the PT driver. So one could enable these features
by setting corresponding bits in the event config, but not by their mnemonic names.
This patch adds the format strings.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: vince@deater.net
Fixes:
8ee83b2ab3 ("perf/x86/intel/pt: Add support for PTWRITE...")
Link: http://lkml.kernel.org/r/20170127151644.8585-2-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Thomas Gleixner [Tue, 31 Jan 2017 22:58:40 +0000 (23:58 +0100)]
perf/x86/intel/uncore: Make package handling more robust
The package management code in uncore relies on package mapping being
available before a CPU is started. This changed with:
9d85eb9119f4 ("x86/smpboot: Make logical package management more robust")
because the ACPI/BIOS information turned out to be unreliable, but that
left uncore in broken state. This was not noticed because on a regular boot
all CPUs are online before uncore is initialized.
Move the allocation to the CPU online callback and simplify the hotplug
handling. At this point the package mapping is established and correct.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Yasuaki Ishimatsu <yasu.isimatu@gmail.com>
Fixes:
9d85eb9119f4 ("x86/smpboot: Make logical package management more robust")
Link: http://lkml.kernel.org/r/20170131230141.377156255@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Thomas Gleixner [Tue, 31 Jan 2017 22:58:39 +0000 (23:58 +0100)]
perf/x86/intel/uncore: Clean up hotplug conversion fallout
The recent conversion to the hotplug state machine kept two mechanisms from
the original code:
1) The first_init logic which adds the number of online CPUs in a package
to the refcount. That's wrong because the callbacks are executed for
all online CPUs.
Remove it so the refcounting is correct.
2) The on_each_cpu() call to undo box->init() in the error handling
path. That's bogus because when the prepare callback fails no box has
been initialized yet.
Remove it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Yasuaki Ishimatsu <yasu.isimatu@gmail.com>
Fixes:
1a246b9f58c6 ("perf/x86/intel/uncore: Convert to hotplug state machine")
Link: http://lkml.kernel.org/r/20170131230141.298032324@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Thomas Gleixner [Tue, 31 Jan 2017 22:58:38 +0000 (23:58 +0100)]
perf/x86/intel/rapl: Make package handling more robust
The package management code in RAPL relies on package mapping being
available before a CPU is started. This changed with:
9d85eb9119f4 ("x86/smpboot: Make logical package management more robust")
because the ACPI/BIOS information turned out to be unreliable, but that
left RAPL in broken state. This was not noticed because on a regular boot
all CPUs are online before RAPL is initialized.
A possible fix would be to reintroduce the mess which allocates a package
data structure in CPU prepare and when it turns out to already exist in
starting throw it away later in the CPU online callback. But that's a
horrible hack and not required at all because RAPL becomes functional for
perf only in the CPU online callback. That's correct because user space is
not yet informed about the CPU being onlined, so nothing caan rely on RAPL
being available on that particular CPU.
Move the allocation to the CPU online callback and simplify the hotplug
handling. At this point the package mapping is established and correct.
This also adds a missing check for available package data in the
event_init() function.
Reported-by: Yasuaki Ishimatsu <yasu.isimatu@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Fixes:
9d85eb9119f4 ("x86/smpboot: Make logical package management more robust")
Link: http://lkml.kernel.org/r/20170131230141.212593966@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Max Filippov [Wed, 1 Feb 2017 02:35:37 +0000 (18:35 -0800)]
xtensa: fix noMMU build on cores with MMU
Commit
bf15f86b343ed8 ("xtensa: initialize MMU before jumping to reset
vector") calls MMU management functions even when CONFIG_MMU is not
selected. That breaks noMMU build on cores with MMU.
Don't manage MMU when CONFIG_MMU is not selected.
Cc: stable@vger.kernel.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Linus Torvalds [Wed, 1 Feb 2017 00:32:40 +0000 (16:32 -0800)]
Merge tag 'trace-4.10-rc2' of git://git./linux/kernel/git/rostedt/linux-trace
Pull tracing fix from Steven Rostedt:
"It was reported to me that the thread created by the hwlat tracer does
not migrate after the first instance. I found that there was as small
bug in the logic, and fixed it. It's minor, but should be fixed
regardless. There's not much impact outside the hwlat tracer"
* tag 'trace-4.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Fix hwlat kthread migration
Linus Torvalds [Tue, 31 Jan 2017 21:59:10 +0000 (13:59 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input
Pull input subsystem fixes from Dmitry Torokhov:
"A fix for a crash in the wm97xx driver and synaptics-rmi4 will stop
throwing erroneous warnings."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: synaptics-rmi4 - fix reversed conditions in enable/disable_irq_wake
Input: wm97xx - make missing platform data non-fatal
Linus Torvalds [Tue, 31 Jan 2017 21:54:41 +0000 (13:54 -0800)]
Merge branch 'for-4.10-fixes' of git://git./linux/kernel/git/tj/cgroup
Pull cgroup fix from Tejun Heo:
"The cgroup creation path was getting the order of operations wrong and
exposing cgroups which don't have their names set yet to controllers
which can lead to NULL derefs.
This contains the fix for the bug"
* 'for-4.10-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: don't online subsystems before cgroup_name/path() are operational
Linus Torvalds [Tue, 31 Jan 2017 21:10:59 +0000 (13:10 -0800)]
Merge branch 'for-4.10-fixes' of git://git./linux/kernel/git/tj/percpu
Pull percpu fix from Tejun Heo:
"Douglas found and fixed a ref leak bug in percpu_ref_tryget[_live]().
The bug is caused by storing the return value of atomic_long_inc_not_zero()
into an int temp variable before returning it as a bool. The interim
cast to int loses the upper bits and can lead to false negatives. As
percpu_ref uses a high bit to mark a draining counter, this can happen
relatively easily.
Fixed by using bool for the temp variable"
* 'for-4.10-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
percpu-refcount: fix reference leak during percpu-atomic transition
Linus Torvalds [Tue, 31 Jan 2017 21:07:04 +0000 (13:07 -0800)]
Merge branch 'for-4.10-fixes' of git://git./linux/kernel/git/tj/libata
Pull libata fixes from Tejun Heo:
"Three libata fixes: an error handling fix, blacklist addition for
another fallout from upping the default max sectors, and fix for a
sense data reporting bug which affects new harddrives which can report
sense data"
* 'for-4.10-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
ata: sata_mv:- Handle return value of devm_ioremap.
libata: Fix ATA request sense
libata: apply MAX_SEC_1024 to all CX1-JB*-HP devices
Linus Torvalds [Tue, 31 Jan 2017 21:05:15 +0000 (13:05 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/jikos/hid
Pull HID fixes from Jiri Kosina:
- regression fix (sleeping while atomic) for cp2112, from Johan Hovold
- regression fix for proximity handling under certain circumstances in
Wacom driver, from Jason Gerecke
- functional fix for Logitech Rumblepad 2, from Ardinartsev Nikita
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: cp2112: fix gpio-callback error handling
HID: cp2112: fix sleep-while-atomic
HID: hid-lg: Fix immediate disconnection of Logitech Rumblepad 2
HID: usbhid: Quirk a AMI virtual mouse and keyboard with ALWAYS_POLL
HID: wacom: Fix poor prox handling in 'wacom_pl_irq'
Thomas Gleixner [Tue, 31 Jan 2017 08:37:34 +0000 (09:37 +0100)]
x86/mce: Make timer handling more robust
Erik reported that on a preproduction hardware a CMCI storm triggers the
BUG_ON in add_timer_on(). The reason is that the per CPU MCE timer is
started by the CMCI logic before the MCE CPU hotplug callback starts the
timer with add_timer_on(). So the timer is already queued which triggers
the BUG.
Using add_timer_on() is pretty pointless in this code because the timer is
strictlty per CPU, initialized as pinned and all operations which arm the
timer happen on the CPU to which the timer belongs.
Simplify the whole machinery by using mod_timer() instead of add_timer_on()
which avoids the problem because mod_timer() can handle already queued
timers. Use __start_timer() everywhere so the earliest armed expiry time is
preserved.
Reported-by: Erik Veijola <erik.veijola@intel.com>
Tested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1701310936080.3457@nanos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Linus Torvalds [Tue, 31 Jan 2017 20:36:39 +0000 (12:36 -0800)]
Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fix from Steve French:
"A small cifs fix for stable"
* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
cifs: initialize file_info_lock
Taeung Song [Tue, 31 Jan 2017 11:38:29 +0000 (20:38 +0900)]
perf ftrace: Add ftrace.tracer config option
Currently 'perf ftrace' command allows selecting 'function_graph' or
'function', defaulting to 'function_graph'.
Add ftrace.tracer config option to select the default tracer:
# cat ~/.perfconfig
[ftrace]
tracer = function
# perf ftrace usleep 123456 | head -10
<...>-14450 [002] d... 10089.284231: finish_task_switch <-__schedule
<...>-14450 [002] .... 10089.284232: finish_wait <-pipe_wait
<...>-14450 [002] .... 10089.284232: mutex_lock <-pipe_wait
<...>-14450 [002] .... 10089.284232: _cond_resched <-mutex_lock
Committer notes:
Retesting it with invalid variables, invalid values for ftrace.tracer,
and a valid one:
# cat ~/.perfconfig
[ftrace]
trace = function
# perf ftrace usleep 1
Error: wrong config key-value pair ftrace.trace=function
# cat ~/.perfconfig
[ftrace]
tracer = functin
# perf ftrace usleep 1
Please select "function_graph" (default) or "function"
Error: wrong config key-value pair ftrace.tracer=functin
# cat ~/.perfconfig
[ftrace]
tracer = function
# perf ftrace usleep 1 | head -5
<idle>-0 [000] d... 3855.820847: switch_mm_irqs_off <-__schedule
<...>-18550 [000] d... 3855.820849: finish_task_switch <-__schedule
<...>-18550 [000] d... 3855.820851: smp_irq_work_interrupt <-irq_work_interrupt
<...>-18550 [000] d... 3855.820851: irq_enter <-smp_irq_work_interrupt
<...>-18550 [000] d... 3855.820851: rcu_irq_enter <-irq_enter
#
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1485862711-20216-3-git-send-email-treeze.taeung@gmail.com
[ Added missign space in error message, changed the logic to make it more compact and less error prone ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Taeung Song [Tue, 31 Jan 2017 11:38:28 +0000 (20:38 +0900)]
perf tools: Create for_each_event macro for tracepoints iteration
Similar to for_each_subsystem and for_each_event in util/parse-events.c,
add new macro 'for_each_event' for easy iteration over the tracepoints
in order to be more compact and readable.
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1485862711-20216-2-git-send-email-treeze.taeung@gmail.com
[ Slight change to keep existing style for checking strcmp() return ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Joe Stringer [Thu, 26 Jan 2017 21:20:01 +0000 (13:20 -0800)]
perf test: Add libbpf pinning test
Add a test for the newly added BPF object pinning functionality.
For example:
# tools/perf/perf test 37
37: BPF filter :
37.1: Basic BPF filtering : Ok
37.2: BPF pinning : Ok
37.3: BPF prologue generation : Ok
37.4: BPF relocation checker : Ok
# tools/perf/perf test 37 -v 2>&1 | grep pinned
libbpf: pinned map '/sys/fs/bpf/perf_test/flip_table'
libbpf: pinned program '/sys/fs/bpf/perf_test/func=SyS_epoll_wait/0'
Signed-off-by: Joe Stringer <joe@ovn.org>
Requested-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/20170126212001.14103-7-joe@ovn.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Joe Stringer [Thu, 26 Jan 2017 21:20:00 +0000 (13:20 -0800)]
tools lib api fs: Add bpf_fs filesystem detector
Allow mounting of the BPF filesystem at /sys/fs/bpf.
Signed-off-by: Joe Stringer <joe@ovn.org>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/20170126212001.14103-6-joe@ovn.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Joe Stringer [Thu, 26 Jan 2017 21:19:59 +0000 (13:19 -0800)]
tools perf util: Make rm_rf(path) argument const
rm_rf() doesn't modify its path argument, and a future caller will pass
a string constant into it to delete.
Signed-off-by: Joe Stringer <joe@ovn.org>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/20170126212001.14103-5-joe@ovn.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Joe Stringer [Thu, 26 Jan 2017 21:19:58 +0000 (13:19 -0800)]
tools lib bpf: Add bpf_object__pin()
Add a new API to pin a BPF object to the filesystem. The user can
specify the path within a BPF filesystem to pin the object.
Programs will be pinned under a subdirectory named the same as the
program, with each instance appearing as a numbered file under that
directory, and maps will be pinned under the path using the name of
the map as the file basename.
For example, with the directory '/sys/fs/bpf/foo' and a BPF object which
contains two instances of a program named 'bar', and a map named 'baz':
/sys/fs/bpf/foo/bar/0
/sys/fs/bpf/foo/bar/1
/sys/fs/bpf/foo/baz
Signed-off-by: Joe Stringer <joe@ovn.org>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/20170126212001.14103-4-joe@ovn.org
[ Check snprintf >= for truncation, as snprintf(bf, size, ...) == size also means truncation ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Joe Stringer [Thu, 26 Jan 2017 21:19:57 +0000 (13:19 -0800)]
tools lib bpf: Add bpf_map__pin()
Add a new API to pin a BPF map to the filesystem. The user can specify
the path full path within a BPF filesystem to pin the map.
Signed-off-by: Joe Stringer <joe@ovn.org>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/20170126212001.14103-3-joe@ovn.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Joe Stringer [Thu, 26 Jan 2017 21:19:56 +0000 (13:19 -0800)]
tools lib bpf: Add BPF program pinning APIs
Add new APIs to pin a BPF program (or specific instances) to the
filesystem. The user can specify the path full path within a BPF
filesystem to pin the program.
bpf_program__pin_instance(prog, path, n) will pin the nth instance of
'prog' to the specified path.
bpf_program__pin(prog, path) will create the directory 'path' (if it
does not exist) and pin each instance within that directory. For
instance, path/0, path/1, path/2.
Committer notes:
- Add missing headers for mkdir()
- Check strdup() for failure
- Check snprintf >= size, not >, as == also means truncated, see 'man
snprintf', return value.
- Conditionally define BPF_FS_MAGIC, as it isn't in magic.h in older
systems and we're not yet having a tools/include/uapi/linux/magic.h
copy.
- Do not include linux/magic.h, not present in older distros.
Signed-off-by: Joe Stringer <joe@ovn.org>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: netdev@vger.kernel.org
Link: http://lkml.kernel.org/r/20170126212001.14103-2-joe@ovn.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Krister Johansen [Fri, 6 Jan 2017 06:23:31 +0000 (22:23 -0800)]
perf callchain: Reference count maps
If dso__load_kcore frees all of the existing maps, but one has already
been attached to a callchain cursor node, then we can get a SIGSEGV in
any function that happens to try to use this invalid cursor. Use the
existing map refcount mechanism to forestall cleanup of a map until the
cursor iterates past the node.
Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: stable@kernel.org
Fixes:
84c2cafa2889 ("perf tools: Reference count struct map")
Link: http://lkml.kernel.org/r/20170106062331.GB2707@templeofstupid.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
David Howells [Tue, 31 Jan 2017 09:45:28 +0000 (09:45 +0000)]
fscache: Fix dead object requeue
Under some circumstances, an fscache object can become queued such that it
fscache_object_work_func() can be called once the object is in the
OBJECT_DEAD state. This results in the kernel oopsing when it tries to
invoke the handler for the state (which is hard coded to 0x2).
The way this comes about is something like the following:
(1) The object dispatcher is processing a work state for an object. This
is done in workqueue context.
(2) An out-of-band event comes in that isn't masked, causing the object to
be queued, say EV_KILL.
(3) The object dispatcher finishes processing the current work state on
that object and then sees there's another event to process, so,
without returning to the workqueue core, it processes that event too.
It then follows the chain of events that initiates until we reach
OBJECT_DEAD without going through a wait state (such as
WAIT_FOR_CLEARANCE).
At this point, object->events may be 0, object->event_mask will be 0
and oob_event_mask will be 0.
(4) The object dispatcher returns to the workqueue processor, and in due
course, this sees that the object's work item is still queued and
invokes it again.
(5) The current state is a work state (OBJECT_DEAD), so the dispatcher
jumps to it - resulting in an OOPS.
When I'm seeing this, the work state in (1) appears to have been either
LOOK_UP_OBJECT or CREATE_OBJECT (object->oob_table is
fscache_osm_lookup_oob).
The window for (2) is very small:
(A) object->event_mask is cleared whilst the event dispatch process is
underway - though there's no memory barrier to force this to the top
of the function.
The window, therefore is from the time the object was selected by the
workqueue processor and made requeueable to the time the mask was
cleared.
(B) fscache_raise_event() will only queue the object if it manages to set
the event bit and the corresponding event_mask bit was set.
The enqueuement is then deferred slightly whilst we get a ref on the
object and get the per-CPU variable for workqueue congestion. This
slight deferral slightly increases the probability by allowing extra
time for the workqueue to make the item requeueable.
Handle this by giving the dead state a processor function and checking the
for the dead state address rather than seeing if the processor function is
address 0x2. The dead state processor function can then set a flag to
indicate that it's occurred and give a warning if it occurs more than once
per object.
If this race occurs, an oops similar to the following is seen (note the RIP
value):
BUG: unable to handle kernel NULL pointer dereference at
0000000000000002
IP: [<
0000000000000002>] 0x1
PGD 0
Oops: 0010 [#1] SMP
Modules linked in: ...
CPU: 17 PID: 16077 Comm: kworker/u48:9 Not tainted 3.10.0-327.18.2.el7.x86_64 #1
Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 12/27/2015
Workqueue: fscache_object fscache_object_work_func [fscache]
task:
ffff880302b63980 ti:
ffff880717544000 task.ti:
ffff880717544000
RIP: 0010:[<
0000000000000002>] [<
0000000000000002>] 0x1
RSP: 0018:
ffff880717547df8 EFLAGS:
00010202
RAX:
ffffffffa0368640 RBX:
ffff880edf7a4480 RCX:
dead000000200200
RDX:
0000000000000002 RSI:
00000000ffffffff RDI:
ffff880edf7a4480
RBP:
ffff880717547e18 R08:
0000000000000000 R09:
dfc40a25cb3a4510
R10:
dfc40a25cb3a4510 R11:
0000000000000400 R12:
0000000000000000
R13:
ffff880edf7a4510 R14:
ffff8817f6153400 R15:
0000000000000600
FS:
0000000000000000(0000) GS:
ffff88181f420000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
0000000000000002 CR3:
000000000194a000 CR4:
00000000001407e0
DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
DR3:
0000000000000000 DR6:
00000000ffff0ff0 DR7:
0000000000000400
Stack:
ffffffffa0363695 ffff880edf7a4510 ffff88093f16f900 ffff8817faa4ec00
ffff880717547e60 ffffffff8109d5db 00000000faa4ec18 0000000000000000
ffff8817faa4ec18 ffff88093f16f930 ffff880302b63980 ffff88093f16f900
Call Trace:
[<
ffffffffa0363695>] ? fscache_object_work_func+0xa5/0x200 [fscache]
[<
ffffffff8109d5db>] process_one_work+0x17b/0x470
[<
ffffffff8109e4ac>] worker_thread+0x21c/0x400
[<
ffffffff8109e290>] ? rescuer_thread+0x400/0x400
[<
ffffffff810a5acf>] kthread+0xcf/0xe0
[<
ffffffff810a5a00>] ? kthread_create_on_node+0x140/0x140
[<
ffffffff816460d8>] ret_from_fork+0x58/0x90
[<
ffffffff810a5a00>] ? kthread_create_on_node+0x140/0x140
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Jeremy McNicoll <jeremymc@redhat.com>
Tested-by: Frank Sorenson <sorenson@redhat.com>
Tested-by: Benjamin Coddington <bcodding@redhat.com>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
David Howells [Wed, 18 Jan 2017 14:29:25 +0000 (14:29 +0000)]
fscache: Clear outstanding writes when disabling a cookie
fscache_disable_cookie() needs to clear the outstanding writes on the
cookie it's disabling because they cannot be completed after.
Without this, fscache_nfs_open_file() gets stuck because it disables the
cookie when the file is opened for writing but can't uncache the pages till
afterwards - otherwise there's a race between the open routine and anyone
who already has it open R/O and is still reading from it.
Looking in /proc/pid/stack of the offending process shows:
[<
ffffffffa0142883>] __fscache_wait_on_page_write+0x82/0x9b [fscache]
[<
ffffffffa014336e>] __fscache_uncache_all_inode_pages+0x91/0xe1 [fscache]
[<
ffffffffa01740fa>] nfs_fscache_open_file+0x59/0x9e [nfs]
[<
ffffffffa01ccf41>] nfs4_file_open+0x17f/0x1b8 [nfsv4]
[<
ffffffff8117350e>] do_dentry_open+0x16d/0x2b7
[<
ffffffff811743ac>] vfs_open+0x5c/0x65
[<
ffffffff81184185>] path_openat+0x785/0x8fb
[<
ffffffff81184343>] do_filp_open+0x48/0x9e
[<
ffffffff81174710>] do_sys_open+0x13b/0x1cb
[<
ffffffff811747b9>] SyS_open+0x19/0x1b
[<
ffffffff81001c44>] do_syscall_64+0x80/0x17a
[<
ffffffff8165c2da>] return_from_SYSCALL_64+0x0/0x7a
[<
ffffffffffffffff>] 0xffffffffffffffff
Reported-by: Jianhong Yin <jiyin@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
David Howells [Wed, 18 Jan 2017 14:29:17 +0000 (14:29 +0000)]
FS-Cache: Initialise stores_lock in netfs cookie
Initialise the stores_lock in fscache netfs cookies. Technically, it
shouldn't be necessary, since the netfs cookie is an index and stores no
data, but initialising it anyway adds insignificant overhead.
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Steve Dickson <steved@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Dimitris Michailidis [Mon, 30 Jan 2017 22:09:42 +0000 (14:09 -0800)]
ipv6: fix flow labels when the traffic class is non-0
ip6_make_flowlabel() determines the flow label for IPv6 packets. It's
supposed to be passed a flow label, which it returns as is if non-0 and
in some other cases, otherwise it calculates a new value.
The problem is callers often pass a flowi6.flowlabel, which may also
contain traffic class bits. If the traffic class is non-0
ip6_make_flowlabel() mistakes the non-0 it gets as a flow label and
returns the whole thing. Thus it can return a 'flow label' longer than
20b and the low 20b of that is typically 0 resulting in packets with 0
label. Moreover, different packets of a flow may be labeled differently.
For a TCP flow with ECN non-payload and payload packets get different
labels as exemplified by this pair of consecutive packets:
(pure ACK)
Internet Protocol Version 6, Src: 2002:af5:11a3::, Dst: 2002:af5:11a2::
0110 .... = Version: 6
.... 0000 0000 .... .... .... .... .... = Traffic Class: 0x00 (DSCP: CS0, ECN: Not-ECT)
.... 0000 00.. .... .... .... .... .... = Differentiated Services Codepoint: Default (0)
.... .... ..00 .... .... .... .... .... = Explicit Congestion Notification: Not ECN-Capable Transport (0)
.... .... .... 0001 1100 1110 0100 1001 = Flow Label: 0x1ce49
Payload Length: 32
Next Header: TCP (6)
(payload)
Internet Protocol Version 6, Src: 2002:af5:11a3::, Dst: 2002:af5:11a2::
0110 .... = Version: 6
.... 0000 0010 .... .... .... .... .... = Traffic Class: 0x02 (DSCP: CS0, ECN: ECT(0))
.... 0000 00.. .... .... .... .... .... = Differentiated Services Codepoint: Default (0)
.... .... ..10 .... .... .... .... .... = Explicit Congestion Notification: ECN-Capable Transport codepoint '10' (2)
.... .... .... 0000 0000 0000 0000 0000 = Flow Label: 0x00000
Payload Length: 688
Next Header: TCP (6)
This patch allows ip6_make_flowlabel() to be passed more than just a
flow label and has it extract the part it really wants. This was simpler
than modifying the callers. With this patch packets like the above become
Internet Protocol Version 6, Src: 2002:af5:11a3::, Dst: 2002:af5:11a2::
0110 .... = Version: 6
.... 0000 0000 .... .... .... .... .... = Traffic Class: 0x00 (DSCP: CS0, ECN: Not-ECT)
.... 0000 00.. .... .... .... .... .... = Differentiated Services Codepoint: Default (0)
.... .... ..00 .... .... .... .... .... = Explicit Congestion Notification: Not ECN-Capable Transport (0)
.... .... .... 1010 1111 1010 0101 1110 = Flow Label: 0xafa5e
Payload Length: 32
Next Header: TCP (6)
Internet Protocol Version 6, Src: 2002:af5:11a3::, Dst: 2002:af5:11a2::
0110 .... = Version: 6
.... 0000 0010 .... .... .... .... .... = Traffic Class: 0x02 (DSCP: CS0, ECN: ECT(0))
.... 0000 00.. .... .... .... .... .... = Differentiated Services Codepoint: Default (0)
.... .... ..10 .... .... .... .... .... = Explicit Congestion Notification: ECN-Capable Transport codepoint '10' (2)
.... .... .... 1010 1111 1010 0101 1110 = Flow Label: 0xafa5e
Payload Length: 688
Next Header: TCP (6)
Signed-off-by: Dimitris Michailidis <dmichail@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vincent [Mon, 30 Jan 2017 14:06:43 +0000 (15:06 +0100)]
net: thunderx: avoid dereferencing xcv when NULL
This fixes the following smatch and coccinelle warnings:
drivers/net/ethernet/cavium/thunder/thunder_xcv.c:119 xcv_setup_link() error: we previously assumed 'xcv' could be null (see line 118) [smatch]
drivers/net/ethernet/cavium/thunder/thunder_xcv.c:119:16-20: ERROR: xcv is NULL but dereferenced. [coccinelle]
Fixes:
6465859aba1e66a5 ("net: thunderx: Add RGMII interface type support")
Signed-off-by: Vincent Stehlé <vincent.stehle@laposte.net>
Cc: Sunil Goutham <sgoutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
J. Bruce Fields [Tue, 31 Jan 2017 16:37:50 +0000 (11:37 -0500)]
svcrpc: fix oops in absence of krb5 module
Olga Kornievskaia says: "I ran into this oops in the nfsd (below)
(4.10-rc3 kernel). To trigger this I had a client (unsuccessfully) try
to mount the server with krb5 where the server doesn't have the
rpcsec_gss_krb5 module built."
The problem is that rsci.cred is copied from a svc_cred structure that
gss_proxy didn't properly initialize. Fix that.
[120408.542387] general protection fault: 0000 [#1] SMP
...
[120408.565724] CPU: 0 PID: 3601 Comm: nfsd Not tainted 4.10.0-rc3+ #16
[120408.567037] Hardware name: VMware, Inc. VMware Virtual =
Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
[120408.569225] task:
ffff8800776f95c0 task.stack:
ffffc90003d58000
[120408.570483] RIP: 0010:gss_mech_put+0xb/0x20 [auth_rpcgss]
...
[120408.584946] ? rsc_free+0x55/0x90 [auth_rpcgss]
[120408.585901] gss_proxy_save_rsc+0xb2/0x2a0 [auth_rpcgss]
[120408.587017] svcauth_gss_proxy_init+0x3cc/0x520 [auth_rpcgss]
[120408.588257] ? __enqueue_entity+0x6c/0x70
[120408.589101] svcauth_gss_accept+0x391/0xb90 [auth_rpcgss]
[120408.590212] ? try_to_wake_up+0x4a/0x360
[120408.591036] ? wake_up_process+0x15/0x20
[120408.592093] ? svc_xprt_do_enqueue+0x12e/0x2d0 [sunrpc]
[120408.593177] svc_authenticate+0xe1/0x100 [sunrpc]
[120408.594168] svc_process_common+0x203/0x710 [sunrpc]
[120408.595220] svc_process+0x105/0x1c0 [sunrpc]
[120408.596278] nfsd+0xe9/0x160 [nfsd]
[120408.597060] kthread+0x101/0x140
[120408.597734] ? nfsd_destroy+0x60/0x60 [nfsd]
[120408.598626] ? kthread_park+0x90/0x90
[120408.599448] ret_from_fork+0x22/0x30
Fixes:
1d658336b05f "SUNRPC: Add RPC based upcall mechanism for RPCGSS auth"
Cc: stable@vger.kernel.org
Cc: Simo Sorce <simo@redhat.com>
Reported-by: Olga Kornievskaia <kolga@netapp.com>
Tested-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Christoph Hellwig [Tue, 24 Jan 2017 08:22:41 +0000 (09:22 +0100)]
nfsd: special case truncates some more
Both the NFS protocols and the Linux VFS use a setattr operation with a
bitmap of attributs to set to set various file attributes including the
file size and the uid/gid.
The Linux syscalls never mixes size updates with unrelated updates like
the uid/gid, and some file systems like XFS and GFS2 rely on the fact
that truncates might not update random other attributes, and many other
file systems handle the case but do not update the different attributes
in the same transaction. NFSD on the other hand passes the attributes
it gets on the wire more or less directly through to the VFS, leading to
updates the file systems don't expect. XFS at least has an assert on
the allowed attributes, which caught an unusual NFS client setting the
size and group at the same time.
To handle this issue properly this switches nfsd to call vfs_truncate
for size changes, and then handle all other attributes through
notify_change. As a side effect this also means less boilerplace code
around the size change as we can now reuse the VFS code.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Kinglong Mee [Wed, 18 Jan 2017 11:04:42 +0000 (19:04 +0800)]
NFSD: Fix a null reference case in find_or_create_lock_stateid()
nfsd assigns the nfs4_free_lock_stateid to .sc_free in init_lock_stateid().
If nfsd doesn't go through init_lock_stateid() and put stateid at end,
there is a NULL reference to .sc_free when calling nfs4_put_stid(ns).
This patch let the nfs4_stid.sc_free assignment to nfs4_alloc_stid().
Cc: stable@vger.kernel.org
Fixes:
356a95ece7aa "nfsd: clean up races in lock stateid searching..."
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Steven Rostedt (VMware) [Tue, 31 Jan 2017 00:27:10 +0000 (19:27 -0500)]
tracing: Fix hwlat kthread migration
The hwlat tracer creates a kernel thread at start of the tracer. It is
pinned to a single CPU and will move to the next CPU after each period of
running. If the user modifies the migration thread's affinity, it will not
change after that happens.
The original code created the thread at the first instance it was called,
but later was changed to destroy the thread after the tracer was finished,
and would not be created until the next instance of the tracer was
established. The code that initialized the affinity was only called on the
initial instantiation of the tracer. After that, it was not initialized, and
the previous affinity did not match the current newly created one, making
it appear that the user modified the thread's affinity when it did not, and
the thread failed to migrate again.
Cc: stable@vger.kernel.org
Fixes:
0330f7aa8ee6 ("tracing: Have hwlat trace migrate across tracing_cpumask CPUs")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Johan Hovold [Mon, 30 Jan 2017 10:26:39 +0000 (11:26 +0100)]
HID: cp2112: fix gpio-callback error handling
In case of a zero-length report, the gpio direction_input callback would
currently return success instead of an errno.
Fixes:
1ffb3c40ffb5 ("HID: cp2112: make transfer buffers DMA capable")
Cc: stable <stable@vger.kernel.org> # 4.9
Signed-off-by: Johan Hovold <johan@kernel.org>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Johan Hovold [Mon, 30 Jan 2017 10:26:38 +0000 (11:26 +0100)]
HID: cp2112: fix sleep-while-atomic
A recent commit fixing DMA-buffers on stack added a shared transfer
buffer protected by a spinlock. This is broken as the USB HID request
callbacks can sleep. Fix this up by replacing the spinlock with a mutex.
Fixes:
1ffb3c40ffb5 ("HID: cp2112: make transfer buffers DMA capable")
Cc: stable <stable@vger.kernel.org> # 4.9
Signed-off-by: Johan Hovold <johan@kernel.org>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Christophe JAILLET [Tue, 31 Jan 2017 08:47:30 +0000 (00:47 -0800)]
Input: synaptics-rmi4 - fix reversed conditions in enable/disable_irq_wake
These tests are reversed. A warning should be displayed if an error is
returned, not on success.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Linus Torvalds [Mon, 30 Jan 2017 23:47:19 +0000 (15:47 -0800)]
Merge git://git./linux/kernel/git/davem/sparc
Pull sparc fixes from David Miller:
"Several small bug fixes and tidies, along with a fix for non-resumable
memory errors triggered by userspace"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc64: Handle PIO & MEM non-resumable errors.
sparc64: Zero pages on allocation for mondo and error queues.
sparc: Fixed typo in sstate.c. Replaced panicing with panicking
sparc: use symbolic names for tsb indexing