Thomas Gleixner [Fri, 5 Apr 2019 11:28:15 +0000 (13:28 +0200)]
Merge tag 'perf-core-for-mingo-5.2-
20190402' of git://git./linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo:
perf record:
Alexey Budankov:
- Implement --mmap-flush=<number> option, to control a threshold for draining
the mmap ring buffers and consequently the size of the write calls to the
output, be it perf.data, pipe mode or soon a compressor that with bigger
buffers will do a better job before dumping compressed data into a new
perf.data content mode, which is in the final steps of reviewing and testing.
perf trace:
Arnaldo Carvalho de Melo:
- Add 'string' event alias to select syscalls with string args, i.e. for testing
the BPF program used to copy those strings, allow for:
# perf trace -e string
To select all the syscalls that have things like pathnames.
- Use a PERCPU_ARRAY BPF map to copy more string bytes than what is possible using
the BPF stack, just like pioneered by the sysdig tool.
Feature detection:
Alexey Budankov:
- Implement libzstd feature check, which is a library that provides a uniform
API to various compression formats, will be used in 'perf record', see note
about --mmap-flush feature.
perf stat:
Andi Kleen:
- Implement a tool specific 'duration_time' event to allow showing the "time
elapsed" line in the default 'perf stat' output as one of the events that
can be asked for when using --field-separator and other script consumable
outputs.
Intel vendor events (JSON files):
Andi Kleen:
- Update metrics from TMAM 3.5.
- Update events:
Bonnell to V4
Broadwell-DE to v7
Broadwell to v23
BroadwellX to v14
GoldmontPlus to v1.01
Goldmont to v13
Haswell to v28
HaswellX to v20
IvyBridge to v21
IvyTown to v20
JakeTown to v20
KnightsLanding to v9
SandyBridge to v16
Silvermont to v14
Skylake to v42
SkylakeX to v1.12
IBM S/390 vendor events (JSON):
Thomas Richter:
- Fix s390 counter long description for L1D_RO_EXCL_WRITES.
tools lib traceevent:
Steven Rostedt (Red Hat):
- Add more debugging to see various internal ring buffer entries.
Steven Rostedt (VMWare):
- Handle trace_printk() "%px".
- Add mono clocks to be parsed in seconds.
- Removed unneeded !! and return parenthesis.
Tzvetomir Stoyanov :
- Implement a new API, tep_list_events_copy().
- Implement new traceevent APIs for accessing struct tep_handler fields.
- Remove tep filter trivial APIs, not used anymore.
- Remove call to exit() from tep_filter_add_filter_str(), library routines shouldn't
kill tools using it.
- Make traceevent APIs more consistent.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Shaokun Zhang [Wed, 3 Apr 2019 06:54:24 +0000 (14:54 +0800)]
perf/headers: Fix stale comment for struct perf_addr_filter
The @inode field has been removed after:
9511bce9fe8e ("perf/core: Fix bad use of igrab()")
Update the description.
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: https://lkml.kernel.org/r/1554274464-5739-1-git-send-email-zhangshaokun@hisilicon.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Valdis Kletnieks [Tue, 12 Mar 2019 08:06:37 +0000 (04:06 -0400)]
perf/core: Make perf_swevent_init_cpu() static
'make W=1' causes GCC to complain:
kernel/events/core.c:11877:6: warning: no previous prototype for 'perf_swevent_init_cpu' [-Wmissing-prototypes]
It's not referenced anywhere else, make it static.
Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/28974.1552377997@turing-police
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Thu, 14 Mar 2019 12:25:02 +0000 (13:25 +0100)]
perf/x86: Add sanity checks to x86_schedule_events()
By computing the 'committed' index earlier, we can use it to validate
the cached constraint state.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Thu, 14 Mar 2019 12:17:51 +0000 (13:17 +0100)]
perf/x86: Optimize x86_schedule_events()
Now that cpuc->event_constraint[] is retained, we can avoid calling
get_event_constraints() over and over again.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Thu, 14 Mar 2019 12:03:00 +0000 (13:03 +0100)]
perf/x86: Clear ->event_constraint[] on put
The current code unconditionally clears cpuc->event_constraint[i]
before calling get_event_constraints(.idx=i). The only site that cares
is intel_get_event_constraints() where the c1 load will always be
NULL.
However, always calling get_event_constraints() on all events is
wastefull, most times it will return the exact same result. Therefore
retain the logic in intel_get_event_constraints() and change the
generic code to only clear the constraint on put.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Stephane Eranian <eranian@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Thu, 14 Mar 2019 12:01:14 +0000 (13:01 +0100)]
perf/x86/intel: Optimize intel_get_excl_constraints()
Avoid the POPCNT by noting we can decrement the weight for each
cleared bit.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Stephane Eranian <eranian@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Thu, 14 Mar 2019 11:58:52 +0000 (12:58 +0100)]
perf/x86: Remove PERF_X86_EVENT_COMMITTED
The flag PERF_X86_EVENT_COMMITTED is used to find uncommitted events
for which to call put_event_constraint() when scheduling fails.
These are the newly added events to the list, and must form, per
definition, the tail of cpuc->event_list[]. By computing the list
index of the last successfull schedule, then iteration can start there
and the flag is redundant.
There are only 3 callers of x86_schedule_events(), notably:
- x86_pmu_add()
- x86_pmu_commit_txn()
- validate_group()
For x86_pmu_add(), cpuc->n_events isn't updated until after
schedule_events() succeeds, therefore cpuc->n_events points to the
desired index.
For x86_pmu_commit_txn(), cpuc->n_events is updated, but we can
trivially compute the desired value with cpuc->n_txn -- the number of
events added in this transaction.
For validate_group(), we can make the rule for x86_pmu_add() work by
simply setting cpuc->n_events to 0 before calling schedule_events().
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Stephane Eranian <eranian@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Thu, 14 Mar 2019 08:57:57 +0000 (09:57 +0100)]
perf/x86: Simplify x86_pmu.get_constraints() interface
There is a special case for validate_events() where we'll call
x86_pmu.get_constraints(.idx=-1). It's purpose, up until recent, seems
to be to avoid taking a previous constraint from
cpuc->event_constraint[] in intel_get_event_constraints().
(I could not find any other get_event_constraints() implementation
using @idx)
However, since that cpuc is freshly allocated, that array will in fact
be initialized with NULL pointers, achieving the very same effect.
Therefore remove this exception.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Stephane Eranian <eranian@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Thu, 14 Mar 2019 08:47:18 +0000 (09:47 +0100)]
perf/x86/intel: Simplify intel_tfa_commit_scheduling()
validate_group() calls x86_schedule_events(.assign=NULL) and therefore
will not call intel_tfa_commit_scheduling(). So there is no point in
checking cpuc->is_fake, we'll never get there.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Andi Kleen [Thu, 14 Mar 2019 21:56:26 +0000 (14:56 -0700)]
perf vendor events intel: Update Silvermont to v14
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lkml.kernel.org/r/20190315165219.GA21223@tassilo.jf.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Andi Kleen [Thu, 14 Mar 2019 21:55:53 +0000 (14:55 -0700)]
perf vendor events intel: Update GoldmontPlus to v1.01
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lkml.kernel.org/r/20190315165219.GA21223@tassilo.jf.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Andi Kleen [Thu, 14 Mar 2019 21:55:34 +0000 (14:55 -0700)]
perf vendor events intel: Update Goldmont to v13
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lkml.kernel.org/r/20190315165219.GA21223@tassilo.jf.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Andi Kleen [Thu, 14 Mar 2019 21:55:07 +0000 (14:55 -0700)]
perf vendor events intel: Update Bonnell to V4
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lkml.kernel.org/r/20190315165219.GA21223@tassilo.jf.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Andi Kleen [Thu, 14 Mar 2019 15:43:04 +0000 (08:43 -0700)]
perf vendor events intel: Update KnightsLanding events to v9
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lkml.kernel.org/r/20190315165219.GA21223@tassilo.jf.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Andi Kleen [Thu, 14 Mar 2019 15:42:14 +0000 (08:42 -0700)]
perf vendor events intel: Update Haswell events to v28
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lkml.kernel.org/r/20190315165219.GA21223@tassilo.jf.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Andi Kleen [Thu, 14 Mar 2019 15:41:35 +0000 (08:41 -0700)]
perf vendor events intel: Update IvyBridge events to v21
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lkml.kernel.org/r/20190315165219.GA21223@tassilo.jf.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Andi Kleen [Thu, 14 Mar 2019 15:41:12 +0000 (08:41 -0700)]
perf vendor events intel: Update SandyBridge events to v16
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lkml.kernel.org/r/20190315165219.GA21223@tassilo.jf.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Andi Kleen [Thu, 14 Mar 2019 15:40:51 +0000 (08:40 -0700)]
perf vendor events intel: Update JakeTown events to v20
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lkml.kernel.org/r/20190315165219.GA21223@tassilo.jf.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Andi Kleen [Thu, 14 Mar 2019 15:40:32 +0000 (08:40 -0700)]
perf vendor events intel: Update IvyTown events to v20
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lkml.kernel.org/r/20190315165219.GA21223@tassilo.jf.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Andi Kleen [Thu, 14 Mar 2019 15:39:49 +0000 (08:39 -0700)]
perf vendor events intel: Update HaswellX events to v20
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lkml.kernel.org/r/20190315165219.GA21223@tassilo.jf.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Andi Kleen [Thu, 14 Mar 2019 15:39:28 +0000 (08:39 -0700)]
perf vendor events intel: Update BroadwellX events to v14
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lkml.kernel.org/r/20190315165219.GA21223@tassilo.jf.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Andi Kleen [Thu, 14 Mar 2019 15:39:05 +0000 (08:39 -0700)]
perf vendor events intel: Update SkylakeX events to v1.12
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lkml.kernel.org/r/20190315165219.GA21223@tassilo.jf.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Andi Kleen [Thu, 14 Mar 2019 15:38:43 +0000 (08:38 -0700)]
perf vendor events intel: Update Skylake events to v42
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lkml.kernel.org/r/20190315165219.GA21223@tassilo.jf.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Andi Kleen [Thu, 14 Mar 2019 15:37:33 +0000 (08:37 -0700)]
perf vendor events intel: Update Broadwell-DE events to v7
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lkml.kernel.org/r/20190315165219.GA21223@tassilo.jf.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Andi Kleen [Thu, 14 Mar 2019 15:36:31 +0000 (08:36 -0700)]
perf vendor events intel: Update Broadwell events to v23
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lkml.kernel.org/r/20190315165219.GA21223@tassilo.jf.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Andi Kleen [Thu, 14 Mar 2019 15:03:48 +0000 (08:03 -0700)]
perf vendor events intel: Update metrics from TMAM 3.5
Update all the Intel JSON metrics from Ahmad Yasin's TMAM 3.5
for Intel big core from Sandy Bridge to Cascade Lake.
This has many improvements and new metircs
- New TopDownL1_SMT group that provides a per SMT thread version
of --topdown that does not require -a anymore. The drawback is
increased multiplexing though since L1 TopDown does not fit into
4 generic counters anymore.
- Added SMT aware versions of other metrics
- Split SMT aware metrics into separate metrics to avoid
unnecessary event collections
- New metrics for better branch analysis:
Estimated Branch_Mispredict_Costs, Instructions per taken Branch,
Branch Instructions per Taken Branch, etc.
- Instruction mix metrics:
Instructions per load, Instructions per store, Instructions per Branch,
Instructions per Call
- New Cache metrics:
Bandwidth to L1/L2/L3 caches. L1/L2/L3 misses per kilo instructions.
memory level parallelism
- New memory controller metrics:
Normalized memory bandwidth in interval mode, Average memory latency,
Average number of parallel read requests,
- 3DXP persistent memory metrics for Cascade Lake:
3dxp read latency, 3dxp read/write bandwidth
- Some other useful metrics like Instruction Level Parallelism,
- Various other improvements.
Not all metrics are available on all CPUs. Skylake has best coverage.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lkml.kernel.org/r/20190315165219.GA21223@tassilo.jf.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Alexey Budankov [Mon, 18 Mar 2019 17:40:26 +0000 (20:40 +0300)]
perf record: Implement --mmap-flush=<number> option
Implement a --mmap-flush option that specifies minimal number of bytes
that is extracted from mmaped kernel buffer to store into a trace. The
default option value is 1 byte what means every time trace writing
thread finds some new data in the mmaped buffer the data is extracted,
possibly compressed and written to a trace.
$ tools/perf/perf record --mmap-flush 1024 -e cycles -- matrix.gcc
$ tools/perf/perf record --aio --mmap-flush 1K -e cycles -- matrix.gcc
The option is independent from -z setting, doesn't vary with compression
level and can serve two purposes.
The first purpose is to increase the compression ratio of a trace data.
Larger data chunks are compressed more effectively so the implemented
option allows specifying data chunk size to compress. Also at some cases
executing more write syscalls with smaller data size can take longer
than executing less write syscalls with bigger data size due to syscall
overhead so extracting bigger data chunks specified by the option value
could additionally decrease runtime overhead.
The second purpose is to avoid self monitoring live-lock issue in system
wide (-a) profiling mode. Profiling in system wide mode with compression
(-a -z) can additionally induce data into the kernel buffers along with
the data from monitored processes. If performance data rate and volume
from the monitored processes is high then trace streaming and
compression activity in the tool is also high. High tool process
activity can lead to subtle live-lock effect when compression of single
new byte from some of mmaped kernel buffer leads to generation of the
next single byte at some mmaped buffer. So perf tool process ends up in
endless self monitoring.
Implemented synch parameter is the mean to force data move independently
from the specified flush threshold value. Despite the provided flush
value the tool needs capability to unconditionally drain memory buffers,
at least in the end of the collection.
Committer testing:
Running with the default value, i.e. as soon as there is something to
read go on consuming, we first write the synthesized events, small
chunks of about 128 bytes:
# perf trace -m 2048 --call-graph dwarf -e write -- perf record
<SNIP>
101.142 ( 0.004 ms): perf/25821 write(fd: 3</root/perf.data>, buf: 0x210db60, count: 120) = 120
__libc_write (/usr/lib64/libpthread-2.28.so)
ion (/home/acme/bin/perf)
record__write (inlined)
process_synthesized_event (/home/acme/bin/perf)
perf_tool__process_synth_event (inlined)
perf_event__synthesize_mmap_events (/home/acme/bin/perf)
Then we move to reading the mmap buffers consuming the events put there
by the kernel perf infrastructure:
107.561 ( 0.005 ms): perf/25821 write(fd: 3</root/perf.data>, buf: 0x7f1befc02000, count: 336) = 336
__libc_write (/usr/lib64/libpthread-2.28.so)
ion (/home/acme/bin/perf)
record__write (inlined)
record__pushfn (/home/acme/bin/perf)
perf_mmap__push (/home/acme/bin/perf)
record__mmap_read_evlist (inlined)
record__mmap_read_all (inlined)
__cmd_record (inlined)
cmd_record (/home/acme/bin/perf)
12919.953 ( 0.136 ms): perf/25821 write(fd: 3</root/perf.data>, buf: 0x7f1befc83150, count: 184984) = 184984
<SNIP same backtrace as in the 107.561 timestamp>
12920.094 ( 0.155 ms): perf/25821 write(fd: 3</root/perf.data>, buf: 0x7f1befc02150, count: 261816) = 261816
<SNIP same backtrace as in the 107.561 timestamp>
12920.253 ( 0.093 ms): perf/25821 write(fd: 3</root/perf.data>, buf: 0x7f1befb81120, count: 170832) = 170832
<SNIP same backtrace as in the 107.561 timestamp>
If we limit it to write only when more than 16MB are available for
reading, it throttles that to a quarter of the --mmap-pages set for
'perf record', which by default get to 528384 bytes, found out using
'record -v':
mmap flush: 132096
mmap size 528384B
With that in place all the writes coming from
record__mmap_read_evlist(), i.e. from the mmap buffers setup by the
kernel perf infrastructure were at least 132096 bytes long.
Trying with a bigger mmap size:
perf trace -e write perf record -v -m 2048 --mmap-flush 16M
74982.928 ( 2.471 ms): perf/26500 write(fd: 3</root/perf.data>, buf: 0x7ff94a6cc000, count: 3580888) = 3580888
74985.406 ( 2.353 ms): perf/26500 write(fd: 3</root/perf.data>, buf: 0x7ff949ecb000, count: 3453256) = 3453256
74987.764 ( 2.629 ms): perf/26500 write(fd: 3</root/perf.data>, buf: 0x7ff9496ca000, count: 3859232) = 3859232
74990.399 ( 2.341 ms): perf/26500 write(fd: 3</root/perf.data>, buf: 0x7ff948ec9000, count: 3769032) = 3769032
74992.744 ( 2.064 ms): perf/26500 write(fd: 3</root/perf.data>, buf: 0x7ff9486c8000, count: 3310520) = 3310520
74994.814 ( 2.619 ms): perf/26500 write(fd: 3</root/perf.data>, buf: 0x7ff947ec7000, count: 4194688) = 4194688
74997.439 ( 2.787 ms): perf/26500 write(fd: 3</root/perf.data>, buf: 0x7ff9476c6000, count: 4029760) = 4029760
Was again limited to a quarter of the mmap size:
mmap flush: 2098176
mmap size
8392704B
A warning about that would be good to have but can be added later,
something like:
"max flush is a quarter of the mmap size, if wanting to bump the mmap
flush further, bump the mmap size as well using -m/--mmap-pages"
Also rename the 'sync' parameters to 'synch' to keep tools/perf building
with older glibcs:
cc1: warnings being treated as errors
builtin-record.c: In function 'record__mmap_read_evlist':
builtin-record.c:775: warning: declaration of 'sync' shadows a global declaration
/usr/include/unistd.h:933: warning: shadowed declaration is here
builtin-record.c: In function 'record__mmap_read_all':
builtin-record.c:856: warning: declaration of 'sync' shadows a global declaration
/usr/include/unistd.h:933: warning: shadowed declaration is here
Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/f6600d72-ecfa-2eb7-7e51-f6954547d500@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Alexey Budankov [Mon, 18 Mar 2019 17:39:49 +0000 (20:39 +0300)]
tools build: Implement libzstd feature check, LIBZSTD_DIR and NO_LIBZSTD defines
Implement libzstd feature check, NO_LIBZSTD and LIBZSTD_DIR defines to
override Zstd library sources or disable the feature from the command
line:
$ make -C tools/perf LIBZSTD_DIR=/path/to/zstd/sources/ clean all
$ make -C tools/perf NO_LIBZSTD=1 clean all
Auto detection feature status is reported just before compilation
starts. If your system has some version of the zstd library
preinstalled then the build system finds and uses it during the build.
If you still prefer to compile with some other version of zstd library
you have capability to refer the compilation to that version using
LIBZSTD_DIR define.
Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/9b4cd8b0-10a3-1f1e-8d6b-5922a7ca216b@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tzvetomir Stoyanov [Mon, 1 Apr 2019 16:43:20 +0000 (12:43 -0400)]
tools lib traceevent: Rename input arguments and local variables of libtraceevent from pevent to tep
"pevent" to "tep" renaming of:
- all "pevent" input arguments of libtraceevent internal functions.
- all local "pevent" variables of libtraceevent.
This makes the implementation consistent with the chosen naming
convention, tep (trace event parser), and will avoid any confusion with
the old pevent name
Signed-off-by: Tzvetomir Stoyanov <tstoyanov@vmware.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/linux-trace-devel/20190401132111.13727-5-tstoyanov@vmware.com
Link: http://lkml.kernel.org/r/20190401164344.944953447@goodmis.org
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tzvetomir Stoyanov [Mon, 1 Apr 2019 16:43:19 +0000 (12:43 -0400)]
perf tools, tools lib traceevent: Rename "pevent" member of struct tep_event_filter to "tep"
The member "pevent" of the struct tep_event_filter is renamed to "tep".
This makes the struct consistent with the chosen naming convention:
tep (trace event parser), instead of the old pevent.
Signed-off-by: Tzvetomir Stoyanov <tstoyanov@vmware.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/linux-trace-devel/20190401132111.13727-4-tstoyanov@vmware.com
Link: http://lkml.kernel.org/r/20190401164344.785896189@goodmis.org
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tzvetomir Stoyanov [Mon, 1 Apr 2019 16:43:18 +0000 (12:43 -0400)]
perf tools, tools lib traceevent: Rename "pevent" member of struct tep_event to "tep"
The member "pevent" of the struct tep_event is renamed to "tep". This
makes the struct consistent with the chosen naming convention:
tep (trace event parser), instead of the old pevent.
Signed-off-by: Tzvetomir Stoyanov <tstoyanov@vmware.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/linux-trace-devel/20190401132111.13727-3-tstoyanov@vmware.com
Link: http://lkml.kernel.org/r/20190401164344.627724996@goodmis.org
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tzvetomir Stoyanov [Mon, 1 Apr 2019 16:43:17 +0000 (12:43 -0400)]
tools lib traceevent: Rename input arguments of libtraceevent APIs from pevent to tep
Input arguments of libtraceevent APIs are renamed from "struct
tep_handle *pevent" to "struct tep_handle *tep". This makes the API
consistent with the chosen naming convention: tep (trace event parser),
instead of the old pevent.
Signed-off-by: Tzvetomir Stoyanov <tstoyanov@vmware.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/linux-trace-devel/20190401132111.13727-2-tstoyanov@vmware.com
Link: http://lkml.kernel.org/r/20190401164344.465573837@goodmis.org
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tzvetomir Stoyanov [Mon, 1 Apr 2019 16:43:16 +0000 (12:43 -0400)]
tools tools, tools lib traceevent: Make traceevent APIs more consistent
Rename some traceevent APIs for consistency:
tep_pid_is_registered() to tep_is_pid_registered()
tep_file_bigendian() to tep_is_file_bigendian()
to make the names and return values consistent with other tep_is_... APIs
tep_data_lat_fmt() to tep_data_latency_format()
to make the name more descriptive
tep_host_bigendian() to tep_is_bigendian()
tep_set_host_bigendian() to tep_set_local_bigendian()
tep_is_host_bigendian() to tep_is_local_bigendian()
"host" can be confused with VMs, and "local" is about the local
machine. All tep_is_..._bigendian(struct tep_handle *tep) APIs return
the saved data in the tep handle, while tep_is_bigendian() returns
the running machine's endianness.
All tep_is_... functions are modified to return bool value, instead of int.
Signed-off-by: Tzvetomir Stoyanov <tstoyanov@vmware.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20190327141946.4353-2-tstoyanov@vmware.com
Link: http://lkml.kernel.org/r/20190401164344.288624897@goodmis.org
[ Removed some extra parenthesis around return statements ]
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tzvetomir Stoyanov [Mon, 1 Apr 2019 16:43:15 +0000 (12:43 -0400)]
tools lib traceevent: Remove call to exit() from tep_filter_add_filter_str()
This patch removes call to exit() from tep_filter_add_filter_str(). A
library function should not force the application to exit. In the
current implementation tep_filter_add_filter_str() calls exit() when a
special "test_filters" mode is set, used only for debugging purposes.
When this mode is set and a filter is added - its string is printed to
the console and exit() is called. This patch changes the logic - when in
"test_filters" mode, the filter string is still printed, but the exit()
is not called. It is up to the application to track when "test_filters"
mode is set and to call exit, if needed.
Signed-off-by: Tzvetomir Stoyanov <tstoyanov@vmware.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20190326154328.28718-9-tstoyanov@vmware.com
Link: http://lkml.kernel.org/r/20190401164344.121717482@goodmis.org
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tzvetomir Stoyanov [Mon, 1 Apr 2019 16:43:14 +0000 (12:43 -0400)]
tools lib traceevent: Remove tep filter trivial APIs
This patch removes trivial filter tep APIs:
enum tep_filter_trivial_type
tep_filter_event_has_trivial()
tep_update_trivial()
tep_filter_clear_trivial()
Trivial filters is an optimization, used only in the first version of
KernelShark. The API is deprecated, the next KernelShark release does
not use it.
Signed-off-by: Tzvetomir Stoyanov <tstoyanov@vmware.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20190326154328.28718-4-tstoyanov@vmware.com
Link: http://lkml.kernel.org/r/20190401164343.968458918@goodmis.org
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Steven Rostedt (VMware) [Mon, 1 Apr 2019 16:43:13 +0000 (12:43 -0400)]
tools lib traceevent: Removed unneeded !! and return parenthesis
As return is not a function we do not need parenthesis around the return
value. Also, a function returning bool does not need to add !!.
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Tzvetomir Stoyanov <tstoyanov@vmware.com>
Link: http://lkml.kernel.org/r/20190401164343.817886725@goodmis.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tzvetomir Stoyanov [Mon, 1 Apr 2019 16:43:12 +0000 (12:43 -0400)]
tools lib traceevent: Implement new traceevent APIs for accessing struct tep_handler fields
As struct tep_handler definition is not exposed as part of libtraceevent
API, its fields cannot be accessed directly by the library users. This
patch implements new APIs, which can be used to access the struct
tep_handler fields:
tep_get_event() - retrieves an event pointer at a specific index
tep_get_first_event() - is modified to use tep_get_event()
tep_clear_flag() - clears a tep handle flag
tep_test_flag() - test if a given flag is set
tep_get_header_timestamp_size() - returns the size of the timestamp stored
in the header.
tep_get_cpus() - returns the number of CPUs
tep_is_old_format() - returns true if data was created by an
older kernel with the old data format
tep_set_print_raw() - have the output print in the raw format
tep_set_test_filters() - debugging utility for testing tep filters
Signed-off-by: Tzvetomir Stoyanov <tstoyanov@vmware.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/linux-trace-devel/20190325145017.30246-4-tstoyanov@vmware.com
Link: http://lkml.kernel.org/r/20190401164343.679629539@goodmis.org
[ Renamed some newly added "pevent" to "tep" ]
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tzvetomir Stoyanov [Mon, 1 Apr 2019 16:43:11 +0000 (12:43 -0400)]
tools lib traceevent: Coding style fixes
Fixed few coding style problems in event-parse-api.c
Signed-off-by: Tzvetomir Stoyanov <tstoyanov@vmware.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/linux-trace-devel/20190325145017.30246-3-tstoyanov@vmware.com
Link: http://lkml.kernel.org/r/20190401164343.537086316@goodmis.org
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tzvetomir Stoyanov [Mon, 1 Apr 2019 16:43:10 +0000 (12:43 -0400)]
tools lib traceevent: Change description of few APIs
APIs descriptions should describe the purpose of the function, its
parameters and return value. While working on man pages implementation,
I noticed mismatches in the descriptions of few APIs. This patch
changes the description of these APIs, making them consistent with the
man pages:
- tep_print_num_field()
- tep_print_func_field()
- tep_get_header_page_size()
- tep_get_long_size()
- tep_set_long_size()
- tep_get_page_size()
- tep_set_page_size()
Signed-off-by: Tzvetomir Stoyanov <tstoyanov@vmware.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/linux-trace-devel/20190325145017.30246-2-tstoyanov@vmware.com
Link: http://lkml.kernel.org/r/20190401164343.396759247@goodmis.org
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Steven Rostedt (Red Hat) [Mon, 1 Apr 2019 16:43:09 +0000 (12:43 -0400)]
tools lib traceevent: Add more debugging to see various internal ring buffer entries
When trace-cmd report --debug is set, show the internal ring buffer
entries like time-extends and padding. This requires adding new kbuffer
API to retrieve these items.
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Tzvetomir Stoyanov <tstoyanov@vmware.com>
Link: http://lkml.kernel.org/r/20190401164343.257591565@goodmis.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tzvetomir Stoyanov [Mon, 1 Apr 2019 16:43:08 +0000 (12:43 -0400)]
tools lib traceevent: Implement a new API, tep_list_events_copy()
Existing API tep_list_events() is not thread safe, it uses the internal
array sort_events to keep cache of the sorted events and reuses it. This
patch implements a new API, tep_list_events_copy(), which allocates new
sorted array each time it is called. It could be used when a sorted
events functionality is needed in thread safe use cases. It is up to the
caller to free the array.
Signed-off-by: Tzvetomir Stoyanov <tstoyanov@vmware.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/linux-trace-devel/20181218133013.31094-1-tstoyanov@vmware.com
Link: http://lkml.kernel.org/r/20190401164343.117437443@goodmis.org
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Steven Rostedt (VMware) [Mon, 1 Apr 2019 16:43:07 +0000 (12:43 -0400)]
tools lib traceevent: Add mono clocks to be parsed in seconds
The mono clocks can display in seconds instead of whole numbers:
trace-cmd-521 [001]
99176715281005: sched_waking: comm=kworker/u16:2 pid=32118 prio=120 target_cpu=002
trace-cmd-521 [001]
99176715286349: sched_wake_idle_without_ipi: cpu=2
trace-cmd-521 [001]
99176715288047: sched_wakeup: kworker/u16:2:32118 [120] success=1 CPU:002
trace-cmd-521 [001]
99176715290022: sched_waking: comm=trace-cmd pid=523 prio=120 target_cpu=000
trace-cmd-521 [001]
99176715292332: sched_wake_idle_without_ipi: cpu=0
trace-cmd-521 [001]
99176715292855: sched_wakeup: trace-cmd:523 [120] success=1 CPU:000
trace-cmd-521 [001]
99176715300697: sched_stat_runtime: comm=trace-cmd pid=521 runtime=80233 [ns] vruntime=
66705540554 [ns
Break it up in seconds:
trace-cmd-521 [001] 99176.715281: sched_waking: comm=kworker/u16:2 pid=32118 prio=120 target_cpu=002
trace-cmd-521 [001] 99176.715286: sched_wake_idle_without_ipi: cpu=2
trace-cmd-521 [001] 99176.715288: sched_wakeup: kworker/u16:2:32118 [120] success=1 CPU:002
trace-cmd-521 [001] 99176.715290: sched_waking: comm=trace-cmd pid=523 prio=120 target_cpu=000
trace-cmd-521 [001] 99176.715292: sched_wake_idle_without_ipi: cpu=0
trace-cmd-521 [001] 99176.715293: sched_wakeup: trace-cmd:523 [120] success=1 CPU:000
trace-cmd-521 [001] 99176.715301: sched_stat_runtime: comm=trace-cmd pid=521 runtime=80233 [ns] vruntime=
66705540554 [ns]
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Tzvetomir Stoyanov <tstoyanov@vmware.com>
Link: http://lkml.kernel.org/r/20190401164342.976554023@goodmis.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Steven Rostedt (VMware) [Mon, 1 Apr 2019 16:43:06 +0000 (12:43 -0400)]
tools lib traceevent: Handle trace_printk() "%px"
With security updates, %p in the kernel is hashed to protect true kernel
locations. But trace_printk() is not allowed in production systems, and
when a pointer is used, there are many times that the actual address is
needed. "%px" produces the real address. But libtraceevent does not know how
to handle that extension. Add it.
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Tzvetomir Stoyanov <tstoyanov@vmware.com>
Link: http://lkml.kernel.org/r/20190401164342.837312153@goodmis.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Andi Kleen [Tue, 26 Mar 2019 22:18:23 +0000 (15:18 -0700)]
perf list: Output tool events
Add support in 'perf list' to output tool internal events, currently
only 'duration_time'.
Committer testing:
$ perf list dur*
List of pre-defined events (to be used in -e):
duration_time [Tool event]
Metric Groups:
$ perf list sw
List of pre-defined events (to be used in -e):
alignment-faults [Software event]
bpf-output [Software event]
context-switches OR cs [Software event]
cpu-clock [Software event]
cpu-migrations OR migrations [Software event]
dummy [Software event]
emulation-faults [Software event]
major-faults [Software event]
minor-faults [Software event]
page-faults OR faults [Software event]
task-clock [Software event]
duration_time [Tool event]
$ perf list | grep duration
duration_time [Tool event]
[L1D miss outstandings duration in cycles]
page walk duration are excluded in Skylake]
load. EPT page walk duration are excluded in Skylake]
page walk duration are excluded in Skylake]
store. EPT page walk duration are excluded in Skylake]
(instruction fetch) request. EPT page walk duration are excluded in
instruction fetch request. EPT page walk duration are excluded in
$
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20190326221823.11518-5-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Andi Kleen [Tue, 26 Mar 2019 22:18:22 +0000 (15:18 -0700)]
perf evsel: Support printing evsel name for 'duration_time'
Implement printing the correct name for duration_time
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20190326221823.11518-4-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Andi Kleen [Tue, 26 Mar 2019 22:18:21 +0000 (15:18 -0700)]
perf stat: Implement duration_time as a proper event
The perf metric expression use 'duration_time' internally to normalize
events. Normal 'perf stat' without -x also prints the duration time.
But when using -x, the interval is not output anywhere, which is
inconvenient for any post processing which often wants to normalize
values to time.
So implement 'duration_time' as a proper perf event that can be
specified explicitely with -e.
The previous implementation of 'duration_time' only worked for metric
processing. This adds the concept of a tool event that is handled by the
tool. On the kernel level it is still mapped to the dummy software
event, but the values are not read anymore, but instead computed by the
tool.
Add proper plumbing to handle this in the event parser, and display it
in 'perf stat'. We don't want 'duration_time' to be added up, so it's
only printed for the first CPU.
% perf stat -e duration_time,cycles true
Performance counter stats for 'true':
555,476 ns duration_time
771,958 cycles
0.
000555476 seconds time elapsed
0.
000644000 seconds user
0.
000000000 seconds sys
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20190326221823.11518-3-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Andi Kleen [Tue, 26 Mar 2019 22:18:20 +0000 (15:18 -0700)]
perf stat: Revert checks for duration_time
This reverts
e864c5ca145e ("perf stat: Hide internal duration_time
counter") but doing it manually since the code has now moved to a
different file.
The next patch will properly implement duration_time as a full event, so
no need to hide it anymore.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20190326221823.11518-2-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Thomas Richter [Fri, 29 Mar 2019 13:33:37 +0000 (14:33 +0100)]
perf list: Fix s390 counter long description for L1D_RO_EXCL_WRITES
Command
# perf list --long-desc pmu
lists the long description of the available counters. For counter
named L1D_RO_EXCL_WRITES on machine types 3906 and 3907 the long
description contains the counter number 'Counter:128 Name:'
prefix. This is wrong.
The fix changes the description text and removes this prefix.
Output before:
[root@m35lp76 perf]# ./perf list --long-desc pmu
...
L1D_ONDRAWER_L4_SOURCED_WRITES
[A directory write to the Level-1 Data cache directory where the
returned cache line was sourced from On-Drawer Level-4 cache]
L1D_RO_EXCL_WRITES
[Counter:128 Name:L1D_RO_EXCL_WRITES A directory write to the Level-1
Data cache where the line was originally in a Read-Only state in the
cache but has been updated to be in the Exclusive state that allows
stores to the cache line]
...
Output after:
[root@m35lp76 perf]# ./perf list --long-desc pmu
...
L1D_ONDRAWER_L4_SOURCED_WRITES
[A directory write to the Level-1 Data cache directory where the
returned cache line was sourced from On-Drawer Level-4 cache]
L1D_RO_EXCL_WRITES
[L1D_RO_EXCL_WRITES A directory write to the Level-1
Data cache where the line was originally in a Read-Only state in the
cache but has been updated to be in the Exclusive state that allows
stores to the cache line]
...
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fixes:
109d59b900e7 ("perf vendor events s390: Add JSON files for IBM z14")
Link: http://lkml.kernel.org/r/20190329133337.60255-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Wed, 27 Mar 2019 18:22:52 +0000 (15:22 -0300)]
perf tools: Add header defining used namespace struct to event.h
When adding the 'struct namespaces_event' to event.h, referencing the
'struct perf_ns_link_info' type, we forgot to add the header where it is
defined, getting that definition only by sheer luck.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Hari Bathini <hbathini@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes:
f3b3614a284d ("perf tools: Add PERF_RECORD_NAMESPACES to include namespaces related info")
Link: https://lkml.kernel.org/n/tip-qkrld0v7boc9uabjbd8csxux@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Wed, 27 Mar 2019 13:16:43 +0000 (10:16 -0300)]
perf trace beauty renameat: No need to include linux/fs.h
There is no use for what is in that file, as everything is
built by the tools/perf/trace/beauty/rename_flags.sh script from
the copied kernel headers, the end result being:
$ cat /tmp/build/perf/trace/beauty/generated/rename_flags_array.c
static const char *rename_flags[] = {
[0 + 1] = "NOREPLACE",
[1 + 1] = "EXCHANGE",
[2 + 1] = "WHITEOUT",
};
$
I.e. no use of any defines from uapi/linux/fs.h
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-lgugmfa8z4bpw5zsbuoitllb@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Fri, 22 Mar 2019 17:48:26 +0000 (14:48 -0300)]
perf augmented_raw_syscalls: Use a PERCPU_ARRAY map to copy more string bytes
The previous method, copying to the BPF stack limited us in how many
bytes we could copy from strings, use a PERCPU_ARRAY map like devised by
the sysdig guys[1] to copy more bytes:
Before:
# trace --no-inherit -e openat touch `python -c "print "$s" 'a' * 2000"`
touch: cannot touch 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa': File name too long
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa", O_CREAT|O_NOCTTY|O_NONBLOCK|O_WRONLY, S_IRUGO|S_IWUGO) = -1 ENAMETOOLONG (File name too long)
openat(AT_FDCWD, "/usr/share/locale/locale.alias", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/usr/share/locale/en_US.UTF-8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/share/locale/en_US.utf8/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory)
<SNIP some openat calls>
#
After:
[root@quaco acme]# trace --no-inherit -e openat touch `python -c "print "$s" 'a' * 2000"`
<STRIP what is the same as in the 'before' part>
openat(AT_FDCWD, "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa", O_CREAT|O_NOCTTY|O_NONBLOC) = -1 ENAMETOOLONG (File name too long)
<STRIP what is the same as in the 'before' part>
If we leave something like 'perf trace -e string' to trace all syscalls
with a string, and then do some 'perf top', to get some annotation for
the augmented_raw_syscalls.o BPF program we get:
│ → callq *
ffffffffc45576d1 â–’
│ augmented_args->filename.size = probe_read_str(&augmented_args->filename.value, ▒
0.05 │ mov %eax,0x40(%r13)
Looking with pahole, expanding types, asking for hex offsets and sizes,
and use of BTF type information to see what is at that 0x40 offset from
%r13:
# pahole -F btf -C augmented_args_filename --expand_types --hex /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
struct augmented_args_filename {
struct syscall_enter_args {
long long unsigned int common_tp_fields; /* 0 0x8 */
long int syscall_nr; /* 0x8 0x8 */
long unsigned int args[6]; /* 0x10 0x30 */
} args; /* 0 0x40 */
/* --- cacheline 1 boundary (64 bytes) --- */
struct augmented_filename {
unsigned int size; /* 0x40 0x4 */
int reserved; /* 0x44 0x4 */
char value[4096]; /* 0x48 0x1000 */
} filename; /* 0x40 0x1008 */
/* size: 4168, cachelines: 66, members: 2 */
/* last cacheline: 8 bytes */
};
#
Then looking if PATH_MAX leaves some signature in the tests:
│ if (augmented_args->filename.size < sizeof(augmented_args->filename.value)) { ▒
│ cmp $0xfff,%rdi
0xfff == 4095
sizeof(augmented_args->filename.value) == PATH_MAX == 4096
[1] https://sysdig.com/blog/the-art-of-writing-ebpf-programs-a-primer/
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Daniel Borkmann <borkmann@iogearbox.net>
Cc: Gianluca Borello <g.borello@gmail.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
cc: Martin Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Yonghong Song <yhs@fb.com>
Link: https://lkml.kernel.org/n/tip-76gce2d2ghzq537ubwhjkone@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Fri, 22 Mar 2019 17:25:14 +0000 (14:25 -0300)]
perf augmented_raw_syscalls: Copy strings from all syscalls with 1st or 2nd string arg
Gets the augmented_raw_syscalls a bit more useful as-is, add a comment
stating that the intent is to have all this in a map populated by
userspace via the 'syscalls' BPF map, that right now has only a flag
stating if the syscall is filtered or not.
With it:
# grep -B1 augmented_raw ~/.perfconfig
[trace]
add_events = /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
#
# perf trace -e string
weechat/6001 stat("/etc/localtime", 0x7ffe22c23d10) = 0
gnome-shell/1943 openat(AT_FDCWD, "/proc/self/stat", O_RDONLY) = 81
weechat/6001 stat("/etc/localtime", 0x7ffe22c23d10) = 0
gmain/2475 inotify_add_watch(20<anon_inode:inotify>, "/home/acme/.config/firewall",
16789454) = -1 ENOENT (No such file or directory)
gmain/2391 inotify_add_watch(3<anon_inode:inotify>, "",
16789454) = -1 ENOENT (No such file or directory)
gmain/2391 inotify_add_watch(3<anon_inode:inotify>, "/var/cache/app-info/yaml",
16789454) = -1 ENOENT (No such file or directory)
gmain/2391 inotify_add_watch(3<anon_inode:inotify>, "/var/lib/app-info/xmls",
16789454) = -1 ENOENT (No such file or directory)
gmain/2391 inotify_add_watch(3<anon_inode:inotify>, "/var/lib/app-info/yaml",
16789454) = -1 ENOENT (No such file or directory)
gmain/2391 inotify_add_watch(3<anon_inode:inotify>, "/usr/share/app-info/yaml",
16789454) = -1 ENOENT (No such file or directory)
gmain/2391 inotify_add_watch(3<anon_inode:inotify>, "/usr/local/share/app-info/xmls",
16789454) = -1 ENOENT (No such file or directory)
gmain/2391 inotify_add_watch(3<anon_inode:inotify>, "/usr/local/share/app-info/yaml",
16789454) = -1 ENOENT (No such file or directory)
gmain/2391 inotify_add_watch(3<anon_inode:inotify>, "/home/acme/.local/share/app-info/yaml",
16789454) = -1 ENOENT (No such file or directory)
gmain/1121 inotify_add_watch(12<anon_inode:inotify>, "/etc/NetworkManager/VPN",
16789454) = -1 ENOENT (No such file or directory)
weechat/6001 stat("/etc/localtime", 0x7ffe22c23d10) = 0
gmain/2050 inotify_add_watch(8<anon_inode:inotify>, "/home/acme/~",
16789454) = -1 ENOENT (No such file or directory)
gmain/2521 inotify_add_watch(6<anon_inode:inotify>, "/var/lib/fwupd/remotes.d/lvfs-testing",
16789454) = -1 ENOENT (No such file or directory)
weechat/6001 stat("/etc/localtime", 0x7ffe22c23d10) = 0
DOM Worker/22714 ... [continued]: openat()) = 257
FS Broker 3982/3990 openat(AT_FDCWD, "/dev/urandom", O_RDONLY|O_CLOEXEC|O_NOCTTY) = 187
DOMCacheThread/16652 mkdir("/home/acme/.mozilla/firefox/ina67tev.default/storage/default/https+++web.whatsapp.com/cache/morgue/192", S_IRUGO|S_IXUGO|S_IWUSR) = -1 EEXIST (File exists)
^C#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-a1hxffoy8t43e0wq6bzhp23u@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo Carvalho de Melo [Fri, 22 Mar 2019 17:18:21 +0000 (14:18 -0300)]
perf trace: Add 'string' event alias to select syscalls with string args
Will be used in conjunction with the change to augmented_raw_syscalls.c
in the next cset that adds all syscalls with a first or second arg
string.
With just what we have in the syscall tracepoints we get:
# perf trace -e string ls > /dev/null
? ( ): ls/22382 ... [continued]: execve()) = 0
0.043 ( 0.004 ms): ls/22382 access(filename: 0x51ad420, mode: R) = -1 ENOENT (No such file or directory)
0.051 ( 0.004 ms): ls/22382 openat(dfd: CWD, filename: 0x51aa8b3, flags: RDONLY|CLOEXEC) = 3
0.071 ( 0.004 ms): ls/22382 openat(dfd: CWD, filename: 0x51b4d00, flags: RDONLY|CLOEXEC) = 3
0.138 ( 0.009 ms): ls/22382 openat(dfd: CWD, filename: 0x51684d0, flags: RDONLY|CLOEXEC) = 3
0.192 ( 0.004 ms): ls/22382 openat(dfd: CWD, filename: 0x51689c0, flags: RDONLY|CLOEXEC) = 3
0.255 ( 0.004 ms): ls/22382 openat(dfd: CWD, filename: 0x5168eb0, flags: RDONLY|CLOEXEC) = 3
0.342 ( 0.003 ms): ls/22382 openat(dfd: CWD, filename: 0x51693a0, flags: RDONLY|CLOEXEC) = 3
0.380 ( 0.003 ms): ls/22382 openat(dfd: CWD, filename: 0x5169950, flags: RDONLY|CLOEXEC) = 3
0.670 ( 0.011 ms): ls/22382 statfs(pathname: 0x515c783, buf: 0x7fff54d75b70) = 0
0.683 ( 0.005 ms): ls/22382 statfs(pathname: 0x515c783, buf: 0x7fff54d75a60) = 0
0.725 ( 0.004 ms): ls/22382 access(filename: 0x515c7ab) = 0
0.744 ( 0.005 ms): ls/22382 openat(dfd: CWD, filename: 0x50fba20, flags: RDONLY|CLOEXEC) = 3
0.793 ( 0.004 ms): ls/22382 openat(dfd: CWD, filename: 0x9e3e8390, flags: RDONLY|CLOEXEC|DIRECTORY|NONBLOCK) = 3
0.921 ( 0.006 ms): ls/22382 openat(dfd: CWD, filename: 0x50f7d90) = 3
#
If we put the vfs_getname probe point in place:
# perf probe 'vfs_getname=getname_flags:73 pathname=result->name:string'
Added new events:
probe:vfs_getname (on getname_flags:73 with pathname=result->name:string)
probe:vfs_getname_1 (on getname_flags:73 with pathname=result->name:string)
You can now use it in all perf tools, such as:
perf record -e probe:vfs_getname_1 -aR sleep 1
# perf trace -e string ls > /dev/null
? ( ): ls/22440 ... [continued]: execve()) = 0
0.048 ( 0.008 ms): ls/22440 access(filename: /etc/ld.so.preload, mode: R) = -1 ENOENT (No such file or directory)
0.061 ( 0.007 ms): ls/22440 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: RDONLY|CLOEXEC) = 3
0.092 ( 0.008 ms): ls/22440 openat(dfd: CWD, filename: /lib64/libselinux.so.1, flags: RDONLY|CLOEXEC) = 3
0.165 ( 0.007 ms): ls/22440 openat(dfd: CWD, filename: /lib64/libcap.so.2, flags: RDONLY|CLOEXEC) = 3
0.216 ( 0.007 ms): ls/22440 openat(dfd: CWD, filename: /lib64/libc.so.6, flags: RDONLY|CLOEXEC) = 3
0.282 ( 0.007 ms): ls/22440 openat(dfd: CWD, filename: /lib64/libpcre2-8.so.0, flags: RDONLY|CLOEXEC) = 3
0.340 ( 0.007 ms): ls/22440 openat(dfd: CWD, filename: /lib64/libdl.so.2, flags: RDONLY|CLOEXEC) = 3
0.383 ( 0.007 ms): ls/22440 openat(dfd: CWD, filename: /lib64/libpthread.so.0, flags: RDONLY|CLOEXEC) = 3
0.697 ( 0.021 ms): ls/22440 statfs(pathname: /sys/fs/selinux, buf: 0x7ffee7dc9010) = 0
0.720 ( 0.007 ms): ls/22440 statfs(pathname: /sys/fs/selinux, buf: 0x7ffee7dc8f00) = 0
0.757 ( 0.007 ms): ls/22440 access(filename: /etc/selinux/config) = 0
0.779 ( 0.009 ms): ls/22440 openat(dfd: CWD, filename: /usr/lib/locale/locale-archive, flags: RDONLY|CLOEXEC) = 3
0.830 ( 0.006 ms): ls/22440 openat(dfd: CWD, filename: ., flags: RDONLY|CLOEXEC|DIRECTORY|NONBLOCK) = 3
0.958 ( 0.010 ms): ls/22440 openat(dfd: CWD, filename: /usr/lib64/gconv/gconv-modules.cache) = 3
#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-6fh1myvn7ulf4xwq9iz3o776@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Linus Torvalds [Mon, 1 Apr 2019 15:28:36 +0000 (08:28 -0700)]
Merge branch 'work.aio' of git://git./linux/kernel/git/viro/vfs
Pull aio race fixes and cleanups from Al Viro.
The aio code had more issues with error handling and races with the aio
completing at just the right (wrong) time along with freeing the file
descriptor when another thread closes the file.
Just a couple of these commits are the actual fixes: the others are
cleanups to either make the fixes simpler, or to make the code legible
and understandable enough that we hope there's no more fundamental races
hiding.
* 'work.aio' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
aio: move sanity checks and request allocation to io_submit_one()
deal with get_reqs_available() in aio_get_req() itself
aio: move dropping ->ki_eventfd into iocb_destroy()
make aio_read()/aio_write() return int
Fix aio_poll() races
aio: store event at final iocb_put()
aio: keep io_event in aio_kiocb
aio: fold lookup_kiocb() into its sole caller
pin iocb through aio.
Linus Torvalds [Mon, 1 Apr 2019 14:51:48 +0000 (07:51 -0700)]
Merge branch 'fixes' of git://git./linux/kernel/git/viro/vfs
Pull symlink fixes from Al Viro:
"The ceph fix is already in mainline, Daniel's bpf fix is in bpf tree
(
1da6c4d9140c "bpf: fix use after free in bpf_evict_inode"), the rest
is in here"
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
debugfs: fix use-after-free on symlink traversal
ubifs: fix use-after-free on symlink traversal
jffs2: fix use-after-free on symlink traversal
Al Viro [Tue, 26 Mar 2019 01:43:37 +0000 (01:43 +0000)]
debugfs: fix use-after-free on symlink traversal
symlink body shouldn't be freed without an RCU delay. Switch debugfs to
->destroy_inode() and use of call_rcu(); free both the inode and symlink
body in the callback. Similar to solution for bpf, only here it's even
more obvious that ->evict_inode() can be dropped.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Tue, 26 Mar 2019 01:40:38 +0000 (01:40 +0000)]
ubifs: fix use-after-free on symlink traversal
free the symlink body after the same RCU delay we have for freeing the
struct inode itself, so that traversal during RCU pathwalk wouldn't step
into freed memory.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Al Viro [Tue, 26 Mar 2019 01:39:50 +0000 (01:39 +0000)]
jffs2: fix use-after-free on symlink traversal
free the symlink body after the same RCU delay we have for freeing the
struct inode itself, so that traversal during RCU pathwalk wouldn't step
into freed memory.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Linus Torvalds [Sun, 31 Mar 2019 21:39:29 +0000 (14:39 -0700)]
Linux 5.1-rc3
Linus Torvalds [Sun, 31 Mar 2019 15:55:59 +0000 (08:55 -0700)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
"A collection of x86 and ARM bugfixes, and some improvements to
documentation.
On top of this, a cleanup of kvm_para.h headers, which were exported
by some architectures even though they not support KVM at all. This is
responsible for all the Kbuild changes in the diffstat"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (28 commits)
Documentation: kvm: clarify KVM_SET_USER_MEMORY_REGION
KVM: doc: Document the life cycle of a VM and its resources
KVM: selftests: complete IO before migrating guest state
KVM: selftests: disable stack protector for all KVM tests
KVM: selftests: explicitly disable PIE for tests
KVM: selftests: assert on exit reason in CR4/cpuid sync test
KVM: x86: update %rip after emulating IO
x86/kvm/hyper-v: avoid spurious pending stimer on vCPU init
kvm/x86: Move MSR_IA32_ARCH_CAPABILITIES to array emulated_msrs
KVM: x86: Emulate MSR_IA32_ARCH_CAPABILITIES on AMD hosts
kvm: don't redefine flags as something else
kvm: mmu: Used range based flushing in slot_handle_level_range
KVM: export <linux/kvm_para.h> and <asm/kvm_para.h> iif KVM is supported
KVM: x86: remove check on nr_mmu_pages in kvm_arch_commit_memory_region()
kvm: nVMX: Add a vmentry check for HOST_SYSENTER_ESP and HOST_SYSENTER_EIP fields
KVM: SVM: Workaround errata#1096 (insn_len maybe zero on SMAP violation)
KVM: Reject device ioctls from processes other than the VM's creator
KVM: doc: Fix incorrect word ordering regarding supported use of APIs
KVM: x86: fix handling of role.cr4_pae and rename it to 'gpte_size'
KVM: nVMX: Do not inherit quadrant and invalid for the root shadow EPT
...
Linus Torvalds [Sun, 31 Mar 2019 15:40:15 +0000 (08:40 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
"A pile of x86 updates:
- Prevent exceeding he valid physical address space in the /dev/mem
limit checks.
- Move all header content inside the header guard to prevent compile
failures.
- Fix the bogus __percpu annotation in this_cpu_has() which makes
sparse very noisy.
- Disable switch jump tables completely when retpolines are enabled.
- Prevent leaking the trampoline address"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/realmode: Make set_real_mode_mem() static inline
x86/cpufeature: Fix __percpu annotation in this_cpu_has()
x86/mm: Don't exceed the valid physical address space
x86/retpolines: Disable switch jump tables when retpolines are enabled
x86/realmode: Don't leak the trampoline kernel address
x86/boot: Fix incorrect ifdeffery scope
x86/resctrl: Remove unused variable
Linus Torvalds [Sun, 31 Mar 2019 15:37:04 +0000 (08:37 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull perf tooling fixes from Thomas Gleixner:
"Core libraries:
- Fix max perf_event_attr.precise_ip detection.
- Fix parser error for uncore event alias
- Fixup ordering of kernel maps after obtaining the main kernel map
address.
Intel PT:
- Fix TSC slip where A TSC packet can slip past MTC packets so that
the timestamp appears to go backwards.
- Fixes for exported-sql-viewer GUI conversion to python3.
ARM coresight:
- Fix the build by adding a missing case value for enumeration value
introduced in newer library, that now is the required one.
tool headers:
- Syncronize kernel headers with the kernel, getting new io_uring and
pidfd_send_signal syscalls so that 'perf trace' can handle them"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf pmu: Fix parser error for uncore event alias
perf scripts python: exported-sql-viewer.py: Fix python3 support
perf scripts python: exported-sql-viewer.py: Fix never-ending loop
perf machine: Update kernel map address and re-order properly
tools headers uapi: Sync powerpc's asm/kvm.h copy with the kernel sources
tools headers: Update x86's syscall_64.tbl and uapi/asm-generic/unistd
tools headers uapi: Update drm/i915_drm.h
tools arch x86: Sync asm/cpufeatures.h with the kernel sources
tools headers uapi: Sync linux/fcntl.h to get the F_SEAL_FUTURE_WRITE addition
tools headers uapi: Sync asm-generic/mman-common.h and linux/mman.h
perf evsel: Fix max perf_event_attr.precise_ip detection
perf intel-pt: Fix TSC slip
perf cs-etm: Add missing case value
Linus Torvalds [Sun, 31 Mar 2019 15:22:12 +0000 (08:22 -0700)]
Merge branch 'smp-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull CPU hotplug fixes from Thomas Gleixner:
"Two SMT/hotplug related fixes:
- Prevent crash when HOTPLUG_CPU is disabled and the CPU bringup
aborts. This is triggered with the 'nosmt' command line option, but
can happen by any abort condition. As the real unplug code is not
compiled in, prevent the fail by keeping the CPU in zombie state.
- Enforce HOTPLUG_CPU for SMP on x86 to avoid the above situation
completely. With 'nosmt' being a popular option it's required to
unplug the half brought up sibling CPUs (due to the MCE wreckage)
completely"
* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/smp: Enforce CONFIG_HOTPLUG_CPU when SMP=y
cpu/hotplug: Prevent crash when CPU bringup fails on CONFIG_HOTPLUG_CPU=n
Linus Torvalds [Sun, 31 Mar 2019 14:48:58 +0000 (07:48 -0700)]
Merge branch 'locking-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull locking fixlet from Thomas Gleixner:
"Trivial update to the maintainers file"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
MAINTAINERS: Remove deleted file from futex file pattern
Linus Torvalds [Sun, 31 Mar 2019 14:47:21 +0000 (07:47 -0700)]
Merge branch 'core-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull core fixes from Thomas Gleixner:
"A small set of core updates:
- Make the watchdog respect the selected CPU mask again. That was
broken by the rework of the watchdog thread management and caused
inconsistent state and NMI watchdog being unstoppable.
- Ensure that the objtool build can find the libelf location.
- Remove dead kcore stub code"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
watchdog: Respect watchdog cpumask on CPU hotplug
objtool: Query pkg-config for libelf location
proc/kcore: Remove unused kclist_add_remap()
Linus Torvalds [Sun, 31 Mar 2019 14:44:13 +0000 (07:44 -0700)]
Merge tag 'powerpc-5.1-4' of git://git./linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"Three non-regression fixes.
- Our optimised memcmp could read past the end of one of the buffers
and potentially trigger a page fault leading to an oops.
- Some of our code to read energy management data on PowerVM had an
endian bug leading to bogus results.
- When reporting a machine check exception we incorrectly reported
TLB multihits as D-Cache multhits due to a missing entry in the
array of causes.
Thanks to: Chandan Rajendra, Gautham R. Shenoy, Mahesh Salgaonkar,
Segher Boessenkool, Vaidyanathan Srinivasan"
* tag 'powerpc-5.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/pseries/mce: Fix misleading print for TLB mutlihit
powerpc/pseries/energy: Use OF accessor functions to read ibm,drc-indexes
powerpc/64: Fix memcmp reading past the end of src/dest
Linus Torvalds [Sun, 31 Mar 2019 14:42:39 +0000 (07:42 -0700)]
Merge tag 'dmaengine-fix-5.1-rc3' of git://git.infradead.org/users/vkoul/slave-dma
Pull dmaengine fixes from Vinod Koul:
- Revert "dmaengine: stm32-mdma: Add a check on read_u32_array" as that
caused regression
- Fix MAINTAINER file uniphier-mdmac.c file path
* tag 'dmaengine-fix-5.1-rc3' of git://git.infradead.org/users/vkoul/slave-dma:
MAINTAINERS: Fix uniphier-mdmac.c file path
dmaengine: stm32-mdma: Revert "dmaengine: stm32-mdma: Add a check on read_u32_array"
Linus Torvalds [Sat, 30 Mar 2019 19:12:56 +0000 (12:12 -0700)]
Merge tag 'led-fixes-for-5.1-rc3' of git://git./linux/kernel/git/j.anaszewski/linux-leds
Pull LED fixes from Jacek Anaszewski:
- fix refcnt leak on interface rename
- use memcpy in device_name_store() to avoid including garbage from a
previous, longer value in the device_name
- fix a potential NULL pointer dereference in case of_match_device()
cannot find a match
* tag 'led-fixes-for-5.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds:
leds: trigger: netdev: use memcpy in device_name_store
leds: pca9532: fix a potential NULL pointer dereference
leds: trigger: netdev: fix refcnt leak on interface rename
Linus Torvalds [Sat, 30 Mar 2019 18:33:34 +0000 (11:33 -0700)]
Merge tag 'gpio-v5.1-2' of git://git./linux/kernel/git/linusw/linux-gpio
Pull GPIO fixes from Linus Walleij:
"As you can see [in the git history] I was away on leave and Bartosz
kindly stepped in and collected a slew of fixes, I pulled them into my
tree in two sets and merged some two more fixes (fixing my own caused
bugs) on top.
Summary:
- Revert the extended use of gpio_set_config() and think about how we
can do this properly.
- Fix up the SPI CS GPIO handling so it now works properly on the SPI
bus children, as intended.
- Error paths and driver fixes"
* tag 'gpio-v5.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: mockup: use simple_read_from_buffer() in debugfs read callback
gpio: of: Fix of_gpiochip_add() error path
gpio: of: Check for "spi-cs-high" in child instead of parent node
gpio: of: Check propname before applying "cs-gpios" quirks
gpio: mockup: fix debugfs read
Revert "gpio: use new gpio_set_config() helper in more places"
gpio: aspeed: fix a potential NULL pointer dereference
gpio: amd-fch: Fix bogus SPDX identifier
gpio: adnp: Fix testing wrong value in adnp_gpio_direction_input
gpio: exar: add a check for the return value of ida_simple_get fails
Rasmus Villemoes [Thu, 14 Mar 2019 14:06:14 +0000 (15:06 +0100)]
leds: trigger: netdev: use memcpy in device_name_store
If userspace doesn't end the input with a newline (which can easily
happen if the write happens from a C program that does write(fd,
iface, strlen(iface))), we may end up including garbage from a
previous, longer value in the device_name. For example
# cat device_name
# printf 'eth12' > device_name
# cat device_name
eth12
# printf 'eth3' > device_name
# cat device_name
eth32
I highly doubt anybody is relying on this behaviour, so switch to
simply copying the bytes (we've already checked that size is <
IFNAMSIZ) and unconditionally zero-terminate it; of course, we also
still have to strip a trailing newline.
This is also preparation for future patches.
Fixes:
06f502f57d0d ("leds: trigger: Introduce a NETDEV trigger")
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Jacek Anaszewski <jacek.anaszewski@gmail.com>
Kangjie Lu [Sat, 9 Mar 2019 06:04:11 +0000 (00:04 -0600)]
leds: pca9532: fix a potential NULL pointer dereference
In case of_match_device cannot find a match, return -EINVAL to avoid
NULL pointer dereference.
Fixes:
fa4191a609f2 ("leds: pca9532: Add device tree support")
Signed-off-by: Kangjie Lu <kjlu@umn.edu>
Signed-off-by: Jacek Anaszewski <jacek.anaszewski@gmail.com>
Linus Torvalds [Sat, 30 Mar 2019 17:35:20 +0000 (10:35 -0700)]
Merge tag 'staging-5.1-rc3' of git://git./linux/kernel/git/gregkh/staging
Pull staging driver fixes from Greg KH:
"Here are some small staging driver fixes for 5.1-rc3, and one driver
removal.
The biggest thing here is the removal of the mt7621-eth driver as a
"real" network driver was merged in 5.1-rc1 for this hardware, so this
old driver can now be removed.
Other than that, there are just a number of small fixes, all resolving
reported issues and some potential corner cases for error handling
paths.
All of these have been in linux-next with no reported issues"
* tag 'staging-5.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: vt6655: Remove vif check from vnt_interrupt
staging: erofs: keep corrupted fs from crashing kernel in erofs_readdir()
staging: octeon-ethernet: fix incorrect PHY mode
staging: vc04_services: Fix an error code in vchiq_probe()
staging: erofs: fix error handling when failed to read compresssed data
staging: vt6655: Fix interrupt race condition on device start up.
staging: rtlwifi: Fix potential NULL pointer dereference of kzalloc
staging: rtl8712: uninitialized memory in read_bbreg_hdl()
staging: rtlwifi: rtl8822b: fix to avoid potential NULL pointer dereference
staging: rtl8188eu: Fix potential NULL pointer dereference of kcalloc
staging, mt7621-pci: fix build without pci support
staging: speakup_soft: Fix alternate speech with other synths
staging: axis-fifo: add CONFIG_OF dependency
staging: olpc_dcon_xo_1: add missing 'const' qualifier
staging: comedi: ni_mio_common: Fix divide-by-zero for DIO cmdtest
staging: erofs: fix to handle error path of erofs_vmap()
staging: mt7621-dts: update ethernet settings.
staging: remove mt7621-eth
Linus Torvalds [Sat, 30 Mar 2019 17:30:38 +0000 (10:30 -0700)]
Merge tag 'tty-5.1-rc3' of git://git./linux/kernel/git/gregkh/tty
Pull tty/serial fixes from Greg KH:
"Here are some small tty and serial driver fixes for 5.1-rc3.
Nothing major here, just a number of potential problems fixes for
error handling paths, as well as some other minor bugfixes for
reported issues with 5.1-rc1.
All of these have been in linux-next with no reported issues"
* tag 'tty-5.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
tty: fix NULL pointer issue when tty_port ops is not set
Disable kgdboc failed by echo space to /sys/module/kgdboc/parameters/kgdboc
dt-bindings: serial: Add compatible for Mediatek MT8183
tty/serial: atmel: RS485 HD w/DMA: enable RX after TX is stopped
tty/serial: atmel: Add is_half_duplex helper
serial: sh-sci: Fix setting SCSCR_TIE while transferring data
serial: ar933x_uart: Fix build failure with disabled console
tty: serial: qcom_geni_serial: Initialize baud in qcom_geni_console_setup
sc16is7xx: missing unregister/delete driver on error in sc16is7xx_init()
tty: mxs-auart: fix a potential NULL pointer dereference
tty: atmel_serial: fix a potential NULL pointer dereference
serial: max310x: Fix to avoid potential NULL pointer dereference
serial: mvebu-uart: Fix to avoid a potential NULL pointer dereference
Linus Torvalds [Sat, 30 Mar 2019 17:26:36 +0000 (10:26 -0700)]
Merge tag 'usb-5.1-rc3' of git://git./linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are some small USB fixes for 5.1-rc3.
Nothing major at all here, just a small collection of fixes for
reported issues, and potential problems with error handling paths.
Also a few new device ids, as normal.
All of these have been in linux-next with no reported issues"
* tag 'usb-5.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (25 commits)
USB: serial: option: add Olicard 600
USB: serial: cp210x: add new device id
usb: u132-hcd: fix resource leak
usb: cdc-acm: fix race during wakeup blocking TX traffic
usb: mtu3: fix EXTCON dependency
usb: usb251xb: fix to avoid potential NULL pointer dereference
usb: core: Try generic PHY_MODE_USB_HOST if usb_phy_roothub_set_mode fails
phy: sun4i-usb: Support set_mode to USB_HOST for non-OTG PHYs
xhci: Don't let USB3 ports stuck in polling state prevent suspend
usb: xhci: dbc: Don't free all memory with spinlock held
xhci: Fix port resume done detection for SS ports with LPM enabled
USB: serial: mos7720: fix mos_parport refcount imbalance on error path
USB: gadget: f_hid: fix deadlock in f_hidg_write()
usb: gadget: net2272: Fix net2272_dequeue()
usb: gadget: net2280: Fix net2280_dequeue()
usb: gadget: net2280: Fix overrun of OUT messages
usb: dwc3: pci: add support for Comet Lake PCH ID
usb: usb251xb: Remove unnecessary comparison of unsigned integer with >= 0
usb: common: Consider only available nodes for dr_mode
usb: typec: tcpm: Try PD-2.0 if sink does not respond to 3.0 source-caps
...
Linus Torvalds [Sat, 30 Mar 2019 17:09:11 +0000 (10:09 -0700)]
Merge tag 'acpi-5.1-rc3' of git://git./linux/kernel/git/rafael/linux-pm
Pull ACPI fix from Rafael Wysocki:
"This corrects a previous attempt to make Linux use its own set of ACPI
debug flags different from the upstream ACPICA's default (Erik
Schmauss)"
* tag 'acpi-5.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: use different default debug value than ACPICA
Linus Torvalds [Sat, 30 Mar 2019 17:06:09 +0000 (10:06 -0700)]
Merge tag 'pm-5.1-rc3' of git://git./linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These fix CPU base frequency reporting in the intel_pstate driver and
a use-after-free in the scpi-cpufreq driver.
Specifics:
- Fix the ACPI CPPC library to actually follow the specification when
decoding the guaranteed performance register information and make
the intel_pstate driver to fall back to the nominal frequency when
reporting the base frequency if the guaranteed performance register
information is not there (Srinivas Pandruvada).
- Fix use-after-free in the exit callback of the scpi-cpufreq left
after an update during the 5.0 development cycle (Vincent Stehlé)"
* tag 'pm-5.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: scpi: Fix use after free
cpufreq: intel_pstate: Also use CPPC nominal_perf for base_frequency
ACPI / CPPC: Fix guaranteed performance handling
Linus Torvalds [Sat, 30 Mar 2019 16:19:09 +0000 (09:19 -0700)]
Merge branch 'fixes-v5.1-a' of git://git./linux/kernel/git/jmorris/linux-security
Pull security layer fixes from James Morris:
"Yama and LSM config fixes"
* 'fixes-v5.1-a' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
LSM: Revive CONFIG_DEFAULT_SECURITY_* for "make oldconfig"
Yama: mark local symbols as static
Linus Torvalds [Fri, 29 Mar 2019 23:02:28 +0000 (16:02 -0700)]
Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
"22 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (22 commits)
fs/proc/proc_sysctl.c: fix NULL pointer dereference in put_links
fs: fs_parser: fix printk format warning
checkpatch: add %pt as a valid vsprintf extension
mm/migrate.c: add missing flush_dcache_page for non-mapped page migrate
drivers/block/zram/zram_drv.c: fix idle/writeback string compare
mm/page_isolation.c: fix a wrong flag in set_migratetype_isolate()
mm/memory_hotplug.c: fix notification in offline error path
ptrace: take into account saved_sigmask in PTRACE{GET,SET}SIGMASK
fs/proc/kcore.c: make kcore_modules static
include/linux/list.h: fix list_is_first() kernel-doc
mm/debug.c: fix __dump_page when mapping->host is not set
mm: mempolicy: make mbind() return -EIO when MPOL_MF_STRICT is specified
include/linux/hugetlb.h: convert to use vm_fault_t
iommu/io-pgtable-arm-v7s: request DMA32 memory, and improve debugging
mm: add support for kmem caches in DMA32 zone
ocfs2: fix inode bh swapping mixup in ocfs2_reflink_inodes_lock
mm/hotplug: fix offline undo_isolate_page_range()
fs/open.c: allow opening only regular files during execve()
mailmap: add Changbin Du
mm/debug.c: add a cast to u64 for atomic64_read()
...
Linus Torvalds [Fri, 29 Mar 2019 22:44:11 +0000 (15:44 -0700)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux
Pull arm64 fix from Catalin Marinas:
"Use memblock_alloc() instead of memblock_alloc_low() in
request_standard_resources(), the latter being limited to the low 4G
memory range on arm64"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: replace memblock_alloc_low with memblock_alloc
Linus Torvalds [Fri, 29 Mar 2019 22:37:10 +0000 (15:37 -0700)]
Merge tag 'iommu-fixes-v5.1-rc3' of git://git./linux/kernel/git/joro/iommu
Pull IOMMU fixes from Joerg Roedel:
- Fix a bug in the AMD IOMMU driver not handling exclusion ranges
correctly. In fact the driver did not reserve these ranges for IOVA
allocations, so that dma-handles could be allocated in an exclusion
range, leading to data corruption. Exclusion ranges have not been
used by any firmware up to now, so this issue remained undiscovered
for quite some time.
- Fix wrong warning messages that the IOMMU core code prints when it
tries to allocate the default domain for an iommu group and the
driver does not support any of the default domain types (like Intel
VT-d).
* tag 'iommu-fixes-v5.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/amd: Reserve exclusion range in iova-domain
iommu: Don't print warning when IOMMU driver only supports unmanaged domains
Linus Torvalds [Fri, 29 Mar 2019 22:07:29 +0000 (15:07 -0700)]
Merge tag 'driver-core-5.1-rc3' of git://git./linux/kernel/git/gregkh/driver-core
Pull driver core fix from Greg KH:
"Here is a single driver core patch for 5.1-rc3.
After 5.1-rc1, all of the users of BUS_ATTR() are finally removed, so
we can now drop this macro from include/linux/device.h so that no more
new users will be created.
This patch has been in linux-next for a while, with no reported
issues"
* tag 'driver-core-5.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
driver core: remove BUS_ATTR()
Linus Torvalds [Fri, 29 Mar 2019 22:03:30 +0000 (15:03 -0700)]
Merge tag 'char-misc-5.1-rc3' of git://git./linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
"Here are some binder, habanalabs, and vboxguest driver fixes for
5.1-rc3.
The Binder fixes resolve some reported issues found by testing, first
by the selinux developers, and then earlier today by syzbot.
The habanalabs fixes are all minor, resolving a number of tiny things.
The vboxguest patches are a bit larger. They resolve the fact that
virtual box decided to change their api in their latest release in a
way that broke the existing kernel code, despite saying that they were
never going to do that. So this is a bit of a "new feature", but is
good to get merged so that 5.1 will work with the latest release. The
changes are not large and of course virtual box "swears" they will not
break this again, but no one is holding their breath here.
All of these have been in linux-next for a while with no reported
issues"
* tag 'char-misc-5.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
virt: vbox: Implement passing requestor info to the host for VirtualBox 6.0.x
binder: fix race between munmap() and direct reclaim
binder: fix BUG_ON found by selinux-testsuite
habanalabs: cast to expected type
habanalabs: prevent host crash during suspend/resume
habanalabs: perform accounting for active CS
habanalabs: fix mapping with page size bigger than 4KB
habanalabs: complete user context cleanup before hard reset
habanalabs: fix bug when mapping very large memory area
habanalabs: fix MMU number of pages calculation
Linus Torvalds [Fri, 29 Mar 2019 21:58:49 +0000 (14:58 -0700)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Thirteen fixes, seven of which are for IBM fibre channel and three
additional for fairly serious bugs in drivers (qla2xxx, mpt3sas,
aacraid).
Of the three core fixes, the most significant is probably the missed
run queue causing an indefinite hang. The others are fixing a
potential use after free on device close and silencing an incorrect
warning"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: ibmvfc: Clean up transport events
scsi: ibmvfc: Byte swap status and error codes when logging
scsi: ibmvfc: Add failed PRLI to cmd_status lookup array
scsi: ibmvfc: Remove "failed" from logged errors
scsi: zfcp: reduce flood of fcrscn1 trace records on multi-element RSCN
scsi: zfcp: fix scsi_eh host reset with port_forced ERP for non-NPIV FCP devices
scsi: zfcp: fix rport unblock if deleted SCSI devices on Scsi_Host
scsi: sd: Quiesce warning if device does not report optimal I/O size
scsi: sd: Fix a race between closing an sd device and sd I/O
scsi: core: Run queue when state is set to running after being blocked
scsi: qla4xxx: fix a potential NULL pointer dereference
scsi: aacraid: Insure we don't access PCIe space during AER/EEH
scsi: mpt3sas: Fix kernel panic during expander reset
Linus Torvalds [Fri, 29 Mar 2019 21:56:53 +0000 (14:56 -0700)]
Merge branch 'i2c/for-current' of git://git./linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"A new ID for the i801 driver and some Documentation fixes to make it
easier for people to find the bindings (which is also a basis for
further improvements in that area)"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: wmt: make bindings file name match the driver
i2c: sun6i-p2wi: make bindings file name match the driver
i2c: stu300: make bindings file name match the driver
i2c: mt65xx: make bindings file name match the driver
i2c: iop3xx: make bindings file name match the driver
i2c: i801: Add support for Intel Comet Lake
Linus Torvalds [Fri, 29 Mar 2019 21:53:33 +0000 (14:53 -0700)]
Merge tag 'sound-5.1-rc3' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"The important fixes at this time are a couple fixes in ALSA core: a
fix for PCM is about the OOB access in PCM OSS plugins that has been
for long time, but hasn't hit so often until now just because we
allocated a large buffer via vmalloc(), and surfaced more often after
switching to kvmalloc(). Another fix is for a long-standing PCM
problem wrt racy PM resume.
Others are trivial nospec coverage and usual HD-audio quirks"
* tag 'sound-5.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda/realtek - Fix speakers on Acer Predator Helios 500 Ryzen laptops
ALSA: pcm: Don't suspend stream in unrecoverable PCM state
ALSA: hda/ca0132 - Simplify alt firmware loading code
ALSA: pcm: Fix possible OOB access in PCM oss plugins
ALSA: hda/realtek: Enable headset MIC of ASUS X430UN and X512DK with ALC256
ALSA: hda/realtek: Enable headset mic of ASUS P5440FF with ALC256
ALSA: hda/realtek: Enable ASUS X441MB and X705FD headset MIC with ALC256
ALSA: hda/realtek - Add support for Acer Aspire E5-523G/ES1-432 headset mic
ALSA: hda/realtek: Enable headset MIC of Acer Aspire Z24-890 with ALC286
ALSA: seq: oss: Fix Spectre v1 vulnerability
ALSA: rawmidi: Fix potential Spectre v1 vulnerability
Linus Torvalds [Fri, 29 Mar 2019 21:46:00 +0000 (14:46 -0700)]
Merge tag 'kbuild-fixes-v5.1' of git://git./linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
- Remove harmful -Oz option of Clang
- Get back the original behavior (no recursion for in-tree build) for
GNU Make 4.x
- Some minor fixes for coccinelle patches
- Do not overwrite .gitignore in the output directory in case it is
version-controlled
- Fix missed record-mcount bug for dynamic ftrace
- Fix endianness bug in modversions for relative CRC
- Cater to '^H' key code in Kconfig ncurses programs
* tag 'kbuild-fixes-v5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kconfig/[mn]conf: handle backspace (^H) key
kbuild: modversions: Fix relative CRC byte order interpretation
scripts: coccinelle: Fix description of badty.cocci
kbuild: strip whitespace in cmd_record_mcount findstring
kbuild: do not overwrite .gitignore in output directory
kbuild: skip parsing pre sub-make code for recursion
coccinelle: put_device: reduce false positives
kbuild: skip sub-make for in-tree build with GNU Make 4.x
Revert "kbuild: use -Oz instead of -Os when using clang"
Linus Torvalds [Fri, 29 Mar 2019 21:43:07 +0000 (14:43 -0700)]
Merge tag 'for-linus-
20190329' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"Small set of fixes that should go into this series. This contains:
- compat signal mask fix for io_uring (Arnd)
- EAGAIN corner case for direct vs buffered writes for io_uring
(Roman)
- NVMe pull request from Christoph with various little fixes
- sbitmap ws_active fix, which caused a perf regression for shared
tags (me)
- sbitmap bit ordering fix (Ming)
- libata on-stack DMA fix (Raymond)"
* tag 'for-linus-
20190329' of git://git.kernel.dk/linux-block:
nvmet: fix error flow during ns enable
nvmet: fix building bvec from sg list
nvme-multipath: relax ANA state check
nvme-tcp: fix an endianess miss-annotation
libata: fix using DMA buffers on stack
io_uring: offload write to async worker in case of -EAGAIN
sbitmap: order READ/WRITE freed instance and setting clear bit
blk-mq: fix sbitmap ws_active for shared tags
io_uring: fix big-endian compat signal mask handling
blk-mq: update comment for blk_mq_hctx_has_pending()
blk-mq: use blk_mq_put_driver_tag() to put tag
Linus Torvalds [Fri, 29 Mar 2019 21:41:09 +0000 (14:41 -0700)]
Merge tag 'ceph-for-5.1-rc3' of git://github.com/ceph/ceph-client
Pull ceph fixes from Ilya Dryomov:
"A patch to avoid choking on multipage bvecs in the messenger and a
small use-after-free fix"
* tag 'ceph-for-5.1-rc3' of git://github.com/ceph/ceph-client:
ceph: fix use-after-free on symlink traversal
libceph: fix breakage caused by multipage bvecs
Linus Torvalds [Fri, 29 Mar 2019 21:36:57 +0000 (14:36 -0700)]
Merge tag 'xfs-5.1-fixes-1' of git://git./fs/xfs/xfs-linux
Pull xfs fixes from Darrick Wong:
"Here are a few fixes for some corruption bugs and uninitialized
variable problems. The few patches here have gone through a few days
worth of fstest runs with no new problems observed.
Changes since last update:
- Fix a bunch of static checker complaints about uninitialized
variables and insufficient range checks.
- Avoid a crash when incore extent map data are corrupt.
- Disallow FITRIM when we haven't recovered the log and know the
metadata are stale.
- Fix a data corruption when doing unaligned overlapping dio writes"
* tag 'xfs-5.1-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: serialize unaligned dio writes against all other dio writes
xfs: prohibit fstrim in norecovery mode
xfs: always init bma in xfs_bmapi_write
xfs: fix btree scrub checking with regards to root-in-inode
xfs: dabtree scrub needs to range-check level
xfs: don't trip over uninitialized buffer on extent read of corrupted inode
Kees Cook [Fri, 29 Mar 2019 19:36:04 +0000 (12:36 -0700)]
LSM: Revive CONFIG_DEFAULT_SECURITY_* for "make oldconfig"
Commit
70b62c25665f636c ("LoadPin: Initialize as ordered LSM") removed
CONFIG_DEFAULT_SECURITY_{SELINUX,SMACK,TOMOYO,APPARMOR,DAC} from
security/Kconfig and changed CONFIG_LSM to provide a fixed ordering as a
default value. That commit expected that existing users (upgrading from
Linux 5.0 and earlier) will edit CONFIG_LSM value in accordance with
their CONFIG_DEFAULT_SECURITY_* choice in their old kernel configs. But
since users might forget to edit CONFIG_LSM value, this patch revives
the choice (only for providing the default value for CONFIG_LSM) in order
to make sure that CONFIG_LSM reflects CONFIG_DEFAULT_SECURITY_* from their
old kernel configs.
Note that since TOMOYO can be fully stacked against the other legacy
major LSMs, when it is selected, it explicitly disables the other LSMs
to avoid them also initializing since TOMOYO does not expect this
currently.
Reported-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Fixes:
70b62c25665f636c ("LoadPin: Initialize as ordered LSM")
Co-developed-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Signed-off-by: James Morris <james.morris@microsoft.com>
Thomas Gleixner [Fri, 29 Mar 2019 20:28:58 +0000 (21:28 +0100)]
Merge tag 'perf-urgent-for-mingo-5.1-
20190329' of git://git./linux/kernel/git/acme/linux into perf/urgent
Pull perf/urgent fixes from Arnaldo:
Core libraries:
Jiri Olsa:
- Fix max perf_event_attr.precise_ip detection.
Kan Liang:
- Fix parser error for uncore event alias
Wei Lin:
- Fixup ordering of kernel maps after obtaining the main kernel map address.
Intel PT:
Adrian Hunter:
- Fix TSC slip where A TSC packet can slip past MTC packets so that the
timestamp appears to go backwards.
- Fixes for exported-sql-viewer GUI conversion to python3.
ARM coresight:
Solomon Tan:
- Fix the build by adding a missing case value for enumeration value introduced
in newer library, that now is the required one.
tool headers:
Arnaldo Carvalho de Melo:
- Syncronize kernel headers with the kernel, getting new io_uring and
pidfd_send_signal syscalls so that 'perf trace' can handle them.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Linus Torvalds [Fri, 29 Mar 2019 18:12:45 +0000 (11:12 -0700)]
Merge tag 'drm-fixes-2019-03-29' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Weekly fixes roundup, nothing two serious, some usb device regressions
are fixed, and i915 GVT has a bigger fix but otherwise not really much
happening here.
core:
- fb bpp check regression fix
- release/unplug fix
- use after free fixes
i915:
- fix mmap range checks
- fix gvt ppgtt mm LRU list access races
- fix selftest error pointer check
- fix a macro definition (pre-emptive for potential further backports)
- fix one AML SKU ULX status
amdgpu:
- one variable refresh rate fix
udl:
- fix EDID reading
tegra:
- build/warning fixes
meson:
- cleanup path fixes
- TMDS clock filter fix
rockchip:
- NV12 buffers and scalar fix"
* tag 'drm-fixes-2019-03-29' of git://anongit.freedesktop.org/drm/drm: (22 commits)
drm/i915/icl: Fix VEBOX mismatch BUG_ON()
drm/i915/selftests: Fix an IS_ERR() vs NULL check
drm/i915: Mark AML 0x87CA as ULX
drm/meson: fix TMDS clock filtering for DMT monitors
drm/meson: Uninstall IRQ handler
drm/meson: Fix invalid pointer in meson_drv_unbind()
drm/udl: Refactor edid retrieving in UDL driver (v2)
drm: Fix drm_release() and device unplug
drm/fb: avoid setting 0 depth.
drm/tegra: vic: Fix implicit function declaration warning
drm/tegra: hub: Fix dereference before check
drm/i915/icl: Fix the TRANS_DDI_FUNC_CTL2 bitfield macro
drm/amd/display: Only allow VRR when vrefresh is within supported range
drm/rockchip: vop: reset scale mode when win is disabled
drm/vkms: fix use-after-free when drm_gem_handle_create() fails
drm/vgem: fix use-after-free when drm_gem_handle_create() fails
drm/i915/gvt: Add mutual lock for ppgtt mm LRU list
drm/i915/gvt: Only assign ppgtt root at dispatch time
drm/i915/gvt: Don't submit request for error workload dispatch
drm/i915/gvt: stop scheduling workload when vgpu is inactive
...
YueHaibing [Fri, 29 Mar 2019 03:44:40 +0000 (20:44 -0700)]
fs/proc/proc_sysctl.c: fix NULL pointer dereference in put_links
Syzkaller reports:
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] SMP KASAN PTI
CPU: 1 PID: 5373 Comm: syz-executor.0 Not tainted 5.0.0-rc8+ #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
RIP: 0010:put_links+0x101/0x440 fs/proc/proc_sysctl.c:1599
Code: 00 0f 85 3a 03 00 00 48 8b 43 38 48 89 44 24 20 48 83 c0 38 48 89 c2 48 89 44 24 28 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <80> 3c 02 00 0f 85 fe 02 00 00 48 8b 74 24 20 48 c7 c7 60 2a 9d 91
RSP: 0018:
ffff8881d828f238 EFLAGS:
00010202
RAX:
dffffc0000000000 RBX:
ffff8881e01b1140 RCX:
ffffffff8ee98267
RDX:
0000000000000007 RSI:
ffffc90001479000 RDI:
ffff8881e01b1178
RBP:
dffffc0000000000 R08:
ffffed103ee27259 R09:
ffffed103ee27259
R10:
0000000000000001 R11:
ffffed103ee27258 R12:
fffffffffffffff4
R13:
0000000000000006 R14:
ffff8881f59838c0 R15:
dffffc0000000000
FS:
00007f072254f700(0000) GS:
ffff8881f7100000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
00007fff8b286668 CR3:
00000001f0542002 CR4:
00000000007606e0
DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
DR3:
0000000000000000 DR6:
00000000fffe0ff0 DR7:
0000000000000400
PKRU:
55555554
Call Trace:
drop_sysctl_table+0x152/0x9f0 fs/proc/proc_sysctl.c:1629
get_subdir fs/proc/proc_sysctl.c:1022 [inline]
__register_sysctl_table+0xd65/0x1090 fs/proc/proc_sysctl.c:1335
br_netfilter_init+0xbc/0x1000 [br_netfilter]
do_one_initcall+0xfa/0x5ca init/main.c:887
do_init_module+0x204/0x5f6 kernel/module.c:3460
load_module+0x66b2/0x8570 kernel/module.c:3808
__do_sys_finit_module+0x238/0x2a0 kernel/module.c:3902
do_syscall_64+0x147/0x600 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x462e99
Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
RSP: 002b:
00007f072254ec58 EFLAGS:
00000246 ORIG_RAX:
0000000000000139
RAX:
ffffffffffffffda RBX:
000000000073bf00 RCX:
0000000000462e99
RDX:
0000000000000000 RSI:
0000000020000280 RDI:
0000000000000003
RBP:
00007f072254ec70 R08:
0000000000000000 R09:
0000000000000000
R10:
0000000000000000 R11:
0000000000000246 R12:
00007f072254f6bc
R13:
00000000004bcefa R14:
00000000006f6fb0 R15:
0000000000000004
Modules linked in: br_netfilter(+) dvb_usb_dibusb_mc_common dib3000mc dibx000_common dvb_usb_dibusb_common dvb_usb_dw2102 dvb_usb classmate_laptop palmas_regulator cn videobuf2_v4l2 v4l2_common snd_soc_bd28623 mptbase snd_usb_usx2y snd_usbmidi_lib snd_rawmidi wmi libnvdimm lockd sunrpc grace rc_kworld_pc150u rc_core rtc_da9063 sha1_ssse3 i2c_cros_ec_tunnel adxl34x_spi adxl34x nfnetlink lib80211 i5500_temp dvb_as102 dvb_core videobuf2_common videodev media videobuf2_vmalloc videobuf2_memops udc_core lnbp22 leds_lp3952 hid_roccat_ryos s1d13xxxfb mtd vport_geneve openvswitch nf_conncount nf_nat_ipv6 nsh geneve udp_tunnel ip6_udp_tunnel snd_soc_mt6351 sis_agp phylink snd_soc_adau1761_spi snd_soc_adau1761 snd_soc_adau17x1 snd_soc_core snd_pcm_dmaengine ac97_bus snd_compress snd_soc_adau_utils snd_soc_sigmadsp_regmap snd_soc_sigmadsp raid_class hid_roccat_konepure hid_roccat_common hid_roccat c2port_duramar2150 core mdio_bcm_unimac iptable_security iptable_raw iptable_mangle
iptable_nat nf_nat_ipv4 nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter bpfilter ip6_vti ip_vti ip_gre ipip sit tunnel4 ip_tunnel hsr veth netdevsim devlink vxcan batman_adv cfg80211 rfkill chnl_net caif nlmon dummy team bonding vcan bridge stp llc ip6_gre gre ip6_tunnel tunnel6 tun crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel joydev mousedev ide_pci_generic piix aesni_intel aes_x86_64 ide_core crypto_simd atkbd cryptd glue_helper serio_raw ata_generic pata_acpi i2c_piix4 floppy sch_fq_codel ip_tables x_tables ipv6 [last unloaded: lm73]
Dumping ftrace buffer:
(ftrace buffer empty)
---[ end trace
770020de38961fd0 ]---
A new dir entry can be created in get_subdir and its 'header->parent' is
set to NULL. Only after insert_header success, it will be set to 'dir',
otherwise 'header->parent' is set to NULL and drop_sysctl_table is called.
However in err handling path of get_subdir, drop_sysctl_table also be
called on 'new->header' regardless its value of parent pointer. Then
put_links is called, which triggers NULL-ptr deref when access member of
header->parent.
In fact we have multiple error paths which call drop_sysctl_table() there,
upon failure on insert_links() we also call drop_sysctl_table().And even
in the successful case on __register_sysctl_table() we still always call
drop_sysctl_table().This patch fix it.
Link: http://lkml.kernel.org/r/20190314085527.13244-1-yuehaibing@huawei.com
Fixes:
0e47c99d7fe25 ("sysctl: Replace root_list with links between sysctl_table_sets")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reported-by: Hulk Robot <hulkci@huawei.com>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: <stable@vger.kernel.org> [3.4+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Randy Dunlap [Fri, 29 Mar 2019 03:44:36 +0000 (20:44 -0700)]
fs: fs_parser: fix printk format warning
Fix printk format warning (seen on i386 builds) by using ptrdiff format
specifier (%t):
fs/fs_parser.c:413:6: warning: format `%lu' expects argument of type `long unsigned int', but argument 3 has type `int' [-Wformat=]
Link: http://lkml.kernel.org/r/19432668-ffd3-fbb2-af4f-1c8e48f6cc81@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alexandre Belloni [Fri, 29 Mar 2019 03:44:32 +0000 (20:44 -0700)]
checkpatch: add %pt as a valid vsprintf extension
Commit
4d42c44727a0 ("lib/vsprintf: Print time and date in human
readable format via %pt") introduced a new extension, %pt.
Add it in the list of valid extensions.
Link: http://lkml.kernel.org/r/20190314203719.29130-1-alexandre.belloni@bootlin.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Lars Persson [Fri, 29 Mar 2019 03:44:28 +0000 (20:44 -0700)]
mm/migrate.c: add missing flush_dcache_page for non-mapped page migrate
Our MIPS 1004Kc SoCs were seeing random userspace crashes with SIGILL
and SIGSEGV that could not be traced back to a userspace code bug. They
had all the magic signs of an I/D cache coherency issue.
Now recently we noticed that the /proc/sys/vm/compact_memory interface
was quite efficient at provoking this class of userspace crashes.
Studying the code in mm/migrate.c there is a distinction made between
migrating a page that is mapped at the instant of migration and one that
is not mapped. Our problem turned out to be the non-mapped pages.
For the non-mapped page the code performs a copy of the page content and
all relevant meta-data of the page without doing the required D-cache
maintenance. This leaves dirty data in the D-cache of the CPU and on
the 1004K cores this data is not visible to the I-cache. A subsequent
page-fault that triggers a mapping of the page will happily serve the
process with potentially stale code.
What about ARM then, this bug should have seen greater exposure? Well
ARM became immune to this flaw back in 2010, see commit
c01778001a4f
("ARM: 6379/1: Assume new page cache pages have dirty D-cache").
My proposed fix moves the D-cache maintenance inside move_to_new_page to
make it common for both cases.
Link: http://lkml.kernel.org/r/20190315083502.11849-1-larper@axis.com
Fixes:
97ee0524614 ("flush cache before installing new page at migraton")
Signed-off-by: Lars Persson <larper@axis.com>
Reviewed-by: Paul Burton <paul.burton@mips.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Minchan Kim [Fri, 29 Mar 2019 03:44:24 +0000 (20:44 -0700)]
drivers/block/zram/zram_drv.c: fix idle/writeback string compare
Makoto report a below KASAN error: zram does out-of-bounds read. Because
strscpy copies from source up to count bytes unconditionally. It could
cause out-of-bounds read on next object in slab.
To prevent it, use strlcpy which checks source's length automatically.
BUG: KASAN: slab-out-of-bounds in strscpy+0x68/0x154
Read of size 8 at addr
ffffffc0c3495a00 by task system_server/1314
..
Call trace:
strscpy+0x68/0x154
idle_store+0xc4/0x34c
dev_attr_store+0x50/0x6c
sysfs_kf_write+0x98/0xb4
kernfs_fop_write+0x198/0x260
__vfs_write+0x10c/0x338
vfs_write+0x114/0x238
SyS_write+0xc8/0x168
__sys_trace_return+0x0/0x4
Allocated by task 1314:
__kmalloc+0x280/0x318
kernfs_fop_write+0xac/0x260
__vfs_write+0x10c/0x338
vfs_write+0x114/0x238
SyS_write+0xc8/0x168
__sys_trace_return+0x0/0x4
Freed by task 2855:
kfree+0x138/0x630
kernfs_put_open_node+0x10c/0x124
kernfs_fop_release+0xd8/0x114
__fput+0x130/0x2a4
____fput+0x1c/0x28
task_work_run+0x16c/0x1c8
do_notify_resume+0x2bc/0x107c
work_pending+0x8/0x10
The buggy address belongs to the object at
ffffffc0c3495a00
which belongs to the cache kmalloc-128 of size 128
The buggy address is located 0 bytes inside of
128-byte region [
ffffffc0c3495a00,
ffffffc0c3495a80)
The buggy address belongs to the page:
page:
ffffffbf030d2500 count:1 mapcount:0 mapping: (null) index:0x0 compound_mapcount: 0
flags: 0x4000000000010200(slab|head)
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffffffc0c3495900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffffffc0c3495980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>
ffffffc0c3495a00: 04 fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
^
ffffffc0c3495a80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffffffc0c3495b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Link: http://lkml.kernel.org/r/20190319231911.145968-1-minchan@kernel.org
Cc: <stable@vger.kernel.org> [5.0]
Signed-off-by: Minchan Kim <minchan@kernel.org>
Reported-by: Makoto Wu <makotowu@google.com>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Qian Cai [Fri, 29 Mar 2019 03:44:21 +0000 (20:44 -0700)]
mm/page_isolation.c: fix a wrong flag in set_migratetype_isolate()
Due to has_unmovable_pages() taking an incorrect irqsave flag instead of
the isolation flag in set_migratetype_isolate(), there are issues with
HWPOSION and error reporting where dump_page() is not called when there
is an unmovable page.
Link: http://lkml.kernel.org/r/20190320204941.53731-1-cai@lca.pw
Fixes:
d381c54760dc ("mm: only report isolation failures when offlining memory")
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Qian Cai <cai@lca.pw>
Cc: <stable@vger.kernel.org> [5.0.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Qian Cai [Fri, 29 Mar 2019 03:44:16 +0000 (20:44 -0700)]
mm/memory_hotplug.c: fix notification in offline error path
When start_isolate_page_range() returned -EBUSY in __offline_pages(), it
calls memory_notify(MEM_CANCEL_OFFLINE, &arg) with an uninitialized
"arg". As the result, it triggers warnings below. Also, it is only
necessary to notify MEM_CANCEL_OFFLINE after MEM_GOING_OFFLINE.
page:
ffffea0001200000 count:1 mapcount:0 mapping:
0000000000000000
index:0x0
flags: 0x3fffe000001000(reserved)
raw:
003fffe000001000 ffffea0001200008 ffffea0001200008 0000000000000000
raw:
0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
page dumped because: unmovable page
WARNING: CPU: 25 PID: 1665 at mm/kasan/common.c:665
kasan_mem_notifier+0x34/0x23b
CPU: 25 PID: 1665 Comm: bash Tainted: G W 5.0.0+ #94
Hardware name: HP ProLiant DL180 Gen9/ProLiant DL180 Gen9, BIOS U20
10/25/2017
RIP: 0010:kasan_mem_notifier+0x34/0x23b
RSP: 0018:
ffff8883ec737890 EFLAGS:
00010206
RAX:
0000000000000246 RBX:
ff10f0f4435f1000 RCX:
f887a7a21af88000
RDX:
dffffc0000000000 RSI:
0000000000000020 RDI:
ffff8881f221af88
RBP:
ffff8883ec737898 R08:
ffff888000000000 R09:
ffffffffb0bddcd0
R10:
ffffed103e857088 R11:
ffff8881f42b8443 R12:
dffffc0000000000
R13:
00000000fffffff9 R14:
dffffc0000000000 R15:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
0000560fbd31d730 CR3:
00000004049c6003 CR4:
00000000001606a0
Call Trace:
notifier_call_chain+0xbf/0x130
__blocking_notifier_call_chain+0x76/0xc0
blocking_notifier_call_chain+0x16/0x20
memory_notify+0x1b/0x20
__offline_pages+0x3e2/0x1210
offline_pages+0x11/0x20
memory_block_action+0x144/0x300
memory_subsys_offline+0xe5/0x170
device_offline+0x13f/0x1e0
state_store+0xeb/0x110
dev_attr_store+0x3f/0x70
sysfs_kf_write+0x104/0x150
kernfs_fop_write+0x25c/0x410
__vfs_write+0x66/0x120
vfs_write+0x15a/0x4f0
ksys_write+0xd2/0x1b0
__x64_sys_write+0x73/0xb0
do_syscall_64+0xeb/0xb78
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f14f75cc3b8
RSP: 002b:
00007ffe84d01d68 EFLAGS:
00000246 ORIG_RAX:
0000000000000001
RAX:
ffffffffffffffda RBX:
0000000000000008 RCX:
00007f14f75cc3b8
RDX:
0000000000000008 RSI:
0000563f8e433d70 RDI:
0000000000000001
RBP:
0000563f8e433d70 R08:
000000000000000a R09:
00007ffe84d018f0
R10:
000000000000000a R11:
0000000000000246 R12:
00007f14f789e780
R13:
0000000000000008 R14:
00007f14f7899740 R15:
0000000000000008
Link: http://lkml.kernel.org/r/20190320204255.53571-1-cai@lca.pw
Fixes:
7960509329c2 ("mm, memory_hotplug: print reason for the offlining failure")
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Qian Cai <cai@lca.pw>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: <stable@vger.kernel.org> [5.0.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>