platform/kernel/linux-starfive.git
3 years agoperf tools: Enable libtraceevent dynamic linking
Michael Petlan [Wed, 28 Apr 2021 09:20:23 +0000 (11:20 +0200)]
perf tools: Enable libtraceevent dynamic linking

Currently we support only static linking with kernel's libtraceevent
(tools/lib/traceevent). This patch adds libtraceevent package detection
and support to link perf with it dynamically.

  The libtraceevent package status is displayed with:
  $ make VF=1 LIBTRACEEVENT_DYNAMIC=1
  ...
  ...                 libtraceevent: [ on  ]

Default behavior remains the same (static linking).

Committer testing:

  $ make LIBTRACEEVENT_DYNAMIC=1 VF=1 O=/tmp/build/perf -C tools/perf install-bin |& grep traceevent
  Makefile.config:1090: *** Error: No libtraceevent devel library found, please install libtraceevent-devel.  Stop.
  $

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
LPU-Reference: 20210428092023.4009-1-mpetlan@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf Documentation: Document intel-hybrid support
Jin Yao [Tue, 27 Apr 2021 07:01:39 +0000 (15:01 +0800)]
perf Documentation: Document intel-hybrid support

Add some words and examples to help understanding of
Intel hybrid perf support.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-27-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf tests: Skip 'perf stat metrics (shadow stat) test' for hybrid
Jin Yao [Tue, 27 Apr 2021 07:01:38 +0000 (15:01 +0800)]
perf tests: Skip 'perf stat metrics (shadow stat) test' for hybrid

Currently we don't support shadow stat for hybrid.

  root@ssp-pwrt-002:~# ./perf stat -e cycles,instructions -a -- sleep 1

   Performance counter stats for 'system wide':

      12,883,109,591      cpu_core/cycles/
       6,405,163,221      cpu_atom/cycles/
         555,553,778      cpu_core/instructions/
         841,158,734      cpu_atom/instructions/

         1.002644773 seconds time elapsed

Now there is no shadow stat 'insn per cycle' reported. We will support
it later and now just skip the 'perf stat metrics (shadow stat) test'.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-26-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf tests: Support 'Convert perf time to TSC' test for hybrid
Jin Yao [Tue, 27 Apr 2021 07:01:37 +0000 (15:01 +0800)]
perf tests: Support 'Convert perf time to TSC' test for hybrid

Since for "cycles:u' on hybrid platform, it creates two "cycles".  So
the second evsel in evlist also needs initialization.

With this patch,

  # ./perf test 71
  71: Convert perf time to TSC                                        : Ok

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-25-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf tests: Support 'Session topology' test for hybrid
Jin Yao [Tue, 27 Apr 2021 07:01:36 +0000 (15:01 +0800)]
perf tests: Support 'Session topology' test for hybrid

Force to create one event "cpu_core/cycles/" by default, otherwise in
evlist__valid_sample_type, the checking of 'if (evlist->core.nr_entries
== 1)' would be failed.

  # ./perf test 41
  41: Session topology                                                : Ok

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-24-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf tests: Support 'Parse and process metrics' test for hybrid
Jin Yao [Tue, 27 Apr 2021 07:01:35 +0000 (15:01 +0800)]
perf tests: Support 'Parse and process metrics' test for hybrid

Some events are not supported. Only pick up some cases for hybrid.

  # ./perf test 68
  68: Parse and process metrics                                       : Ok

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-23-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf tests: Support 'Track with sched_switch' test for hybrid
Jin Yao [Tue, 27 Apr 2021 07:01:34 +0000 (15:01 +0800)]
perf tests: Support 'Track with sched_switch' test for hybrid

Since for "cycles:u' on hybrid platform, it creates two "cycles".
So the number of events in evlist is not expected in next test
steps. Now we just use one event "cpu_core/cycles:u/" for hybrid.

  # ./perf test 35
  35: Track with sched_switch                                         : Ok

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-22-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf tests: Skip 'Setup struct perf_event_attr' test for hybrid
Jin Yao [Tue, 27 Apr 2021 07:01:33 +0000 (15:01 +0800)]
perf tests: Skip 'Setup struct perf_event_attr' test for hybrid

For hybrid, the attr.type consists of pmu type id + original type.
There will be much changes for this test. Now we temporarily
skip this test case and TODO in future.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-21-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf tests: Add hybrid cases for 'Roundtrip evsel->name' test
Jin Yao [Tue, 27 Apr 2021 07:01:32 +0000 (15:01 +0800)]
perf tests: Add hybrid cases for 'Roundtrip evsel->name' test

Since for one hw event, two hybrid events are created.

For example,

evsel->idx      evsel__name(evsel)
0               cycles
1               cycles
2               instructions
3               instructions
...

So for comparing the evsel name on hybrid, the evsel->idx
needs to be divided by 2.

  # ./perf test 14
  14: Roundtrip evsel->name                                           : Ok

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-20-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf tests: Add hybrid cases for 'Parse event definition strings' test
Jin Yao [Tue, 27 Apr 2021 07:01:31 +0000 (15:01 +0800)]
perf tests: Add hybrid cases for 'Parse event definition strings' test

Add basic hybrid test cases for 'Parse event definition strings' test.

  # perf test 6
   6: Parse event definition strings                                  : Ok

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-19-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf record: Uniquify hybrid event name
Jin Yao [Tue, 27 Apr 2021 07:01:30 +0000 (15:01 +0800)]
perf record: Uniquify hybrid event name

For perf-record, it would be useful to tell user the pmu which the
event belongs to.

For example,

  # perf record -a -- sleep 1
  # perf report

  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 106  of event 'cpu_core/cycles/'
  # Event count (approx.): 22043448
  #
  # Overhead  Command       Shared Object            Symbol
  # ........  ............  .......................  ............................
  #
  ...

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-18-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf stat: Warn group events from different hybrid PMU
Jin Yao [Tue, 27 Apr 2021 07:01:29 +0000 (15:01 +0800)]
perf stat: Warn group events from different hybrid PMU

If a group has events which are from different hybrid PMUs,
shows a warning:

"WARNING: events in group from different hybrid PMUs!"

This is to remind the user not to put the core event and atom
event into one group.

Next, just disable grouping.

  # perf stat -e "{cpu_core/cycles/,cpu_atom/cycles/}" -a -- sleep 1
  WARNING: events in group from different hybrid PMUs!
  WARNING: grouped events cpus do not match, disabling group:
    anon group { cpu_core/cycles/, cpu_atom/cycles/ }

   Performance counter stats for 'system wide':

           5,438,125      cpu_core/cycles/
           3,914,586      cpu_atom/cycles/

         1.004250966 seconds time elapsed

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-17-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf stat: Filter out unmatched aggregation for hybrid event
Jin Yao [Tue, 27 Apr 2021 07:01:28 +0000 (15:01 +0800)]
perf stat: Filter out unmatched aggregation for hybrid event

perf-stat has supported some aggregation modes, such as --per-core,
--per-socket and etc. While for hybrid event, it may only available
on part of cpus. So for --per-core, we need to filter out the
unavailable cores, for --per-socket, filter out the unavailable
sockets, and so on.

Before:

  # perf stat --per-core -e cpu_core/cycles/ -a -- sleep 1

   Performance counter stats for 'system wide':

  S0-D0-C0           2            479,530      cpu_core/cycles/
  S0-D0-C4           2            175,007      cpu_core/cycles/
  S0-D0-C8           2            166,240      cpu_core/cycles/
  S0-D0-C12          2            704,673      cpu_core/cycles/
  S0-D0-C16          2            865,835      cpu_core/cycles/
  S0-D0-C20          2          2,958,461      cpu_core/cycles/
  S0-D0-C24          2            163,988      cpu_core/cycles/
  S0-D0-C28          2            164,729      cpu_core/cycles/
  S0-D0-C32          0      <not counted>      cpu_core/cycles/
  S0-D0-C33          0      <not counted>      cpu_core/cycles/
  S0-D0-C34          0      <not counted>      cpu_core/cycles/
  S0-D0-C35          0      <not counted>      cpu_core/cycles/
  S0-D0-C36          0      <not counted>      cpu_core/cycles/
  S0-D0-C37          0      <not counted>      cpu_core/cycles/
  S0-D0-C38          0      <not counted>      cpu_core/cycles/
  S0-D0-C39          0      <not counted>      cpu_core/cycles/

         1.003597211 seconds time elapsed

After:

  # perf stat --per-core -e cpu_core/cycles/ -a -- sleep 1

   Performance counter stats for 'system wide':

  S0-D0-C0           2            210,428      cpu_core/cycles/
  S0-D0-C4           2            444,830      cpu_core/cycles/
  S0-D0-C8           2            435,241      cpu_core/cycles/
  S0-D0-C12          2            423,976      cpu_core/cycles/
  S0-D0-C16          2            859,350      cpu_core/cycles/
  S0-D0-C20          2          1,559,589      cpu_core/cycles/
  S0-D0-C24          2            163,924      cpu_core/cycles/
  S0-D0-C28          2            376,610      cpu_core/cycles/

         1.003621290 seconds time elapsed

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Co-developed-by: Jiri Olsa <jolsa@redhat.com>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-16-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf stat: Add default hybrid events
Jin Yao [Tue, 27 Apr 2021 07:01:27 +0000 (15:01 +0800)]
perf stat: Add default hybrid events

Previously if '-e' is not specified in perf stat, some software events
and hardware events are added to evlist by default.

Before:

  # perf stat -a -- sleep 1

   Performance counter stats for 'system wide':

           24,044.40 msec cpu-clock                 #   23.946 CPUs utilized
                  99      context-switches          #    4.117 /sec
                  24      cpu-migrations            #    0.998 /sec
                   3      page-faults               #    0.125 /sec
           7,000,244      cycles                    #    0.000 GHz
           2,955,024      instructions              #    0.42  insn per cycle
             608,941      branches                  #   25.326 K/sec
              31,991      branch-misses             #    5.25% of all branches

         1.004106859 seconds time elapsed

Among the events, cycles, instructions, branches and branch-misses
are hardware events.

One hybrid platform, two hardware events are created for one
hardware event.

cpu_core/cycles/,
cpu_atom/cycles/,
cpu_core/instructions/,
cpu_atom/instructions/,
cpu_core/branches/,
cpu_atom/branches/,
cpu_core/branch-misses/,
cpu_atom/branch-misses/

These events would be added to evlist on hybrid platform.

Since parse_events() has been supported to create two hardware events
for one event on hybrid platform, so we just use parse_events(evlist,
"cycles,instructions,branches,branch-misses") to create the default
events and add them to evlist.

After:

  # perf stat -a -- sleep 1

   Performance counter stats for 'system wide':

           24,043.99 msec cpu-clock                 #   23.991 CPUs utilized
                 139      context-switches          #    5.781 /sec
                  25      cpu-migrations            #    1.040 /sec
                   6      page-faults               #    0.250 /sec
          10,381,751      cpu_core/cycles/          #  431.782 K/sec
           1,264,216      cpu_atom/cycles/          #   52.579 K/sec
           3,406,958      cpu_core/instructions/    #  141.697 K/sec
             414,588      cpu_atom/instructions/    #   17.243 K/sec
             705,149      cpu_core/branches/        #   29.327 K/sec
              82,358      cpu_atom/branches/        #    3.425 K/sec
              40,821      cpu_core/branch-misses/   #    1.698 K/sec
               9,086      cpu_atom/branch-misses/   #  377.891 /sec

         1.002228863 seconds time elapsed

We can see two events are created for one hardware event.

One TODO is, the shadow stats looks a bit different, now it's just
'M/sec'.

The perf_stat__update_shadow_stats and perf_stat__print_shadow_stats
need to be improved in future if we want to get the original shadow
stats.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-15-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf record: Create two hybrid 'cycles' events by default
Jin Yao [Tue, 27 Apr 2021 07:01:26 +0000 (15:01 +0800)]
perf record: Create two hybrid 'cycles' events by default

When evlist is empty, for example no '-e' specified in perf record,
one default 'cycles' event is added to evlist.

While on hybrid platform, it needs to create two default 'cycles'
events. One is for cpu_core, the other is for cpu_atom.

This patch actually calls evsel__new_cycles() two times to create
two 'cycles' events.

  # ./perf record -vv -a -- sleep 1
  ...
  ------------------------------------------------------------
  perf_event_attr:
    size                             120
    config                           0x400000000
    { sample_period, sample_freq }   4000
    sample_type                      IP|TID|TIME|ID|CPU|PERIOD
    read_format                      ID
    disabled                         1
    inherit                          1
    freq                             1
    precise_ip                       3
    sample_id_all                    1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 5
  sys_perf_event_open: pid -1  cpu 1  group_fd -1  flags 0x8 = 6
  sys_perf_event_open: pid -1  cpu 2  group_fd -1  flags 0x8 = 7
  sys_perf_event_open: pid -1  cpu 3  group_fd -1  flags 0x8 = 9
  sys_perf_event_open: pid -1  cpu 4  group_fd -1  flags 0x8 = 10
  sys_perf_event_open: pid -1  cpu 5  group_fd -1  flags 0x8 = 11
  sys_perf_event_open: pid -1  cpu 6  group_fd -1  flags 0x8 = 12
  sys_perf_event_open: pid -1  cpu 7  group_fd -1  flags 0x8 = 13
  sys_perf_event_open: pid -1  cpu 8  group_fd -1  flags 0x8 = 14
  sys_perf_event_open: pid -1  cpu 9  group_fd -1  flags 0x8 = 15
  sys_perf_event_open: pid -1  cpu 10  group_fd -1  flags 0x8 = 16
  sys_perf_event_open: pid -1  cpu 11  group_fd -1  flags 0x8 = 17
  sys_perf_event_open: pid -1  cpu 12  group_fd -1  flags 0x8 = 18
  sys_perf_event_open: pid -1  cpu 13  group_fd -1  flags 0x8 = 19
  sys_perf_event_open: pid -1  cpu 14  group_fd -1  flags 0x8 = 20
  sys_perf_event_open: pid -1  cpu 15  group_fd -1  flags 0x8 = 21
  ------------------------------------------------------------
  perf_event_attr:
    size                             120
    config                           0x800000000
    { sample_period, sample_freq }   4000
    sample_type                      IP|TID|TIME|ID|CPU|PERIOD
    read_format                      ID
    disabled                         1
    inherit                          1
    freq                             1
    precise_ip                       3
    sample_id_all                    1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 16  group_fd -1  flags 0x8 = 22
  sys_perf_event_open: pid -1  cpu 17  group_fd -1  flags 0x8 = 23
  sys_perf_event_open: pid -1  cpu 18  group_fd -1  flags 0x8 = 24
  sys_perf_event_open: pid -1  cpu 19  group_fd -1  flags 0x8 = 25
  sys_perf_event_open: pid -1  cpu 20  group_fd -1  flags 0x8 = 26
  sys_perf_event_open: pid -1  cpu 21  group_fd -1  flags 0x8 = 27
  sys_perf_event_open: pid -1  cpu 22  group_fd -1  flags 0x8 = 28
  sys_perf_event_open: pid -1  cpu 23  group_fd -1  flags 0x8 = 29
  ------------------------------------------------------------

We have to create evlist-hybrid.c otherwise due to the symbol
dependency the perf test python would be failed.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-14-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf parse-events: Support event inside hybrid pmu
Jin Yao [Tue, 27 Apr 2021 07:01:25 +0000 (15:01 +0800)]
perf parse-events: Support event inside hybrid pmu

On hybrid platform, user may want to enable events on one pmu.

Following syntax are supported:

cpu_core/<event>/
cpu_atom/<event>/

But the syntax doesn't work for cache event.

Before:

  # perf stat -e cpu_core/LLC-loads/ -a -- sleep 1
  event syntax error: 'cpu_core/LLC-loads/'
                                \___ unknown term 'LLC-loads' for pmu 'cpu_core'

Cache events are a bit complex. We can't create aliases for them.
We use another solution. For example, if we use "cpu_core/LLC-loads/",
in parse_events_add_pmu(), term->config is "LLC-loads".

Then we create a new parser to scan "LLC-loads". The
parse_events_add_cache() would be called during parsing.
The parse_state->hybrid_pmu_name is used to identify the pmu
where the event should be enabled on.

After:

  # perf stat -e cpu_core/LLC-loads/ -a -- sleep 1

   Performance counter stats for 'system wide':

              24,593      cpu_core/LLC-loads/

         1.003911601 seconds time elapsed

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-13-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf parse-events: Compare with hybrid pmu name
Jin Yao [Tue, 27 Apr 2021 07:01:24 +0000 (15:01 +0800)]
perf parse-events: Compare with hybrid pmu name

On hybrid platform, user may want to enable event only on one pmu.
Following syntax will be supported:

cpu_core/<event>/
cpu_atom/<event>/

For hardware event, hardware cache event and raw event, two events
are created by default. We pass the specified pmu name in parse_state
and it would be checked before event creation. So next only the
event with the specified pmu would be created.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-12-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf parse-events: Create two hybrid raw events
Jin Yao [Tue, 27 Apr 2021 07:01:23 +0000 (15:01 +0800)]
perf parse-events: Create two hybrid raw events

On hybrid platform, same raw event is possible to be available
on both cpu_core pmu and cpu_atom pmu. It's supported to create
two raw events for one event encoding. For raw events, the
attr.type is PMU type.

  # perf stat -e r3c -a -vv -- sleep 1
  Control descriptor is not initialized
  ------------------------------------------------------------
  perf_event_attr:
    type                             4
    size                             120
    config                           0x3c
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 3
  ------------------------------------------------------------
  ...
  ------------------------------------------------------------
  perf_event_attr:
    type                             4
    size                             120
    config                           0x3c
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 15  group_fd -1  flags 0x8 = 19
  ------------------------------------------------------------
  perf_event_attr:
    type                             8
    size                             120
    config                           0x3c
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 16  group_fd -1  flags 0x8 = 20
  ------------------------------------------------------------
  ...
  ------------------------------------------------------------
  perf_event_attr:
    type                             8
    size                             120
    config                           0x3c
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 23  group_fd -1  flags 0x8 = 27
  r3c: 0: 434449 1001412521 1001412521
  r3c: 1: 173162 1001482031 1001482031
  r3c: 2: 231710 1001524974 1001524974
  r3c: 3: 110012 1001563523 1001563523
  r3c: 4: 191517 1001593221 1001593221
  r3c: 5: 956458 1001628147 1001628147
  r3c: 6: 416969 1001715626 1001715626
  r3c: 7: 1047527 1001596650 1001596650
  r3c: 8: 103877 1001633520 1001633520
  r3c: 9: 70571 1001637898 1001637898
  r3c: 10: 550284 1001714398 1001714398
  r3c: 11: 1257274 1001738349 1001738349
  r3c: 12: 107797 1001801432 1001801432
  r3c: 13: 67471 1001836281 1001836281
  r3c: 14: 286782 1001923161 1001923161
  r3c: 15: 815509 1001952550 1001952550
  r3c: 0: 95994 1002071117 1002071117
  r3c: 1: 105570 1002142438 1002142438
  r3c: 2: 115921 1002189147 1002189147
  r3c: 3: 72747 1002238133 1002238133
  r3c: 4: 103519 1002276753 1002276753
  r3c: 5: 121382 1002315131 1002315131
  r3c: 6: 80298 1002248050 1002248050
  r3c: 7: 466790 1002278221 1002278221
  r3c: 6821369 16026754282 16026754282
  r3c: 1162221 8017758990 8017758990

   Performance counter stats for 'system wide':

           6,821,369      cpu_core/r3c/
           1,162,221      cpu_atom/r3c/

         1.002289965 seconds time elapsed

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-11-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf parse-events: Create two hybrid cache events
Jin Yao [Tue, 27 Apr 2021 07:01:22 +0000 (15:01 +0800)]
perf parse-events: Create two hybrid cache events

For cache events, they have pre-defined configs. The kernel needs
to know where the cache event comes from (e.g. from cpu_core pmu
or from cpu_atom pmu). But the perf type PERF_TYPE_HW_CACHE
can't carry pmu information.

Now the type PERF_TYPE_HW_CACHE is extended to be PMU aware type.
The PMU type ID is stored at attr.config[63:32].

When enabling a hybrid cache event without specified pmu, such as,
'perf stat -e LLC-loads -a', two events are created
automatically. One is for atom, the other is for core.

  # perf stat -e LLC-loads -a -vv -- sleep 1
  Control descriptor is not initialized
  ------------------------------------------------------------
  perf_event_attr:
    type                             3
    size                             120
    config                           0x400000002
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 3
  ------------------------------------------------------------
  ...
  ------------------------------------------------------------
  perf_event_attr:
    type                             3
    size                             120
    config                           0x400000002
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 15  group_fd -1  flags 0x8 = 19
  ------------------------------------------------------------
  perf_event_attr:
    type                             3
    size                             120
    config                           0x800000002
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 16  group_fd -1  flags 0x8 = 20
  ------------------------------------------------------------
  ...
  ------------------------------------------------------------
  perf_event_attr:
    type                             3
    size                             120
    config                           0x800000002
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 23  group_fd -1  flags 0x8 = 27
  LLC-loads: 0: 1507 1001800280 1001800280
  LLC-loads: 1: 666 1001812250 1001812250
  LLC-loads: 2: 3353 1001813453 1001813453
  LLC-loads: 3: 514 1001848795 1001848795
  LLC-loads: 4: 627 1001952832 1001952832
  LLC-loads: 5: 4399 1001451154 1001451154
  LLC-loads: 6: 1240 1001481052 1001481052
  LLC-loads: 7: 478 1001520348 1001520348
  LLC-loads: 8: 691 1001551236 1001551236
  LLC-loads: 9: 310 1001578945 1001578945
  LLC-loads: 10: 1018 1001594354 1001594354
  LLC-loads: 11: 3656 1001622355 1001622355
  LLC-loads: 12: 882 1001661416 1001661416
  LLC-loads: 13: 506 1001693963 1001693963
  LLC-loads: 14: 3547 1001721013 1001721013
  LLC-loads: 15: 1399 1001734818 1001734818
  LLC-loads: 0: 1314 1001793826 1001793826
  LLC-loads: 1: 2857 1001752764 1001752764
  LLC-loads: 2: 646 1001830694 1001830694
  LLC-loads: 3: 1612 1001864861 1001864861
  LLC-loads: 4: 2244 1001912381 1001912381
  LLC-loads: 5: 1255 1001943889 1001943889
  LLC-loads: 6: 4624 1002021109 1002021109
  LLC-loads: 7: 2703 1001959302 1001959302
  LLC-loads: 24793 16026838264 16026838264
  LLC-loads: 17255 8015078826 8015078826

   Performance counter stats for 'system wide':

              24,793      cpu_core/LLC-loads/
              17,255      cpu_atom/LLC-loads/

         1.001970988 seconds time elapsed

0x4 in 0x400000002 indicates the cpu_core pmu.
0x8 in 0x800000002 indicates the cpu_atom pmu.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-10-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf parse-events: Create two hybrid hardware events
Jin Yao [Tue, 27 Apr 2021 07:01:21 +0000 (15:01 +0800)]
perf parse-events: Create two hybrid hardware events

Current hardware events has special perf types PERF_TYPE_HARDWARE.
But it doesn't pass the PMU type in the user interface. For a hybrid
system, the perf kernel doesn't know which PMU the events belong to.

So now this type is extended to be PMU aware type. The PMU type ID
is stored at attr.config[63:32].

PMU type ID is retrieved from sysfs.

  root@lkp-adl-d01:/sys/devices/cpu_atom# cat type
  8

  root@lkp-adl-d01:/sys/devices/cpu_core# cat type
  4

When enabling a hybrid hardware event without specified pmu, such as,
'perf stat -e cycles -a', two events are created automatically. One
is for atom, the other is for core.

  # perf stat -e cycles -a -vv -- sleep 1
  Control descriptor is not initialized
  ------------------------------------------------------------
  perf_event_attr:
    size                             120
    config                           0x400000000
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 3
  ------------------------------------------------------------
  ...
  ------------------------------------------------------------
  perf_event_attr:
    size                             120
    config                           0x400000000
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 15  group_fd -1  flags 0x8 = 19
  ------------------------------------------------------------
  perf_event_attr:
    size                             120
    config                           0x800000000
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 16  group_fd -1  flags 0x8 = 20
  ------------------------------------------------------------
  ...
  ------------------------------------------------------------
  perf_event_attr:
    size                             120
    config                           0x800000000
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
    disabled                         1
    inherit                          1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid -1  cpu 23  group_fd -1  flags 0x8 = 27
  cycles: 0: 836272 1001525722 1001525722
  cycles: 1: 628564 1001580453 1001580453
  cycles: 2: 872693 1001605997 1001605997
  cycles: 3: 70417 1001641369 1001641369
  cycles: 4: 88593 1001726722 1001726722
  cycles: 5: 470495 1001752993 1001752993
  cycles: 6: 484733 1001840440 1001840440
  cycles: 7: 1272477 1001593105 1001593105
  cycles: 8: 209185 1001608616 1001608616
  cycles: 9: 204391 1001633962 1001633962
  cycles: 10: 264121 1001661745 1001661745
  cycles: 11: 826104 1001689904 1001689904
  cycles: 12: 89935 1001728861 1001728861
  cycles: 13: 70639 1001756757 1001756757
  cycles: 14: 185266 1001784810 1001784810
  cycles: 15: 171094 1001825466 1001825466
  cycles: 0: 129624 1001854843 1001854843
  cycles: 1: 122533 1001840421 1001840421
  cycles: 2: 90055 1001882506 1001882506
  cycles: 3: 139607 1001896463 1001896463
  cycles: 4: 141791 1001907838 1001907838
  cycles: 5: 530927 1001883880 1001883880
  cycles: 6: 143246 1001852529 1001852529
  cycles: 7: 667769 1001872626 1001872626
  cycles: 6744979 16026956922 16026956922
  cycles: 1965552 8014991106 8014991106

   Performance counter stats for 'system wide':

           6,744,979      cpu_core/cycles/
           1,965,552      cpu_atom/cycles/

         1.001882711 seconds time elapsed

0x4 in 0x400000000 indicates the cpu_core pmu.
0x8 in 0x800000000 indicates the cpu_atom pmu.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-9-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf stat: Uniquify hybrid event name
Jin Yao [Tue, 27 Apr 2021 07:01:20 +0000 (15:01 +0800)]
perf stat: Uniquify hybrid event name

It would be useful to let user know the pmu which the event belongs to.
perf-stat has supported '--no-merge' option and it can print the pmu
name after the event name, such as:

"cycles [cpu_core]"

Now this option is enabled by default for hybrid platform but change
the format to:

"cpu_core/cycles/"

If user configs the name, we still use the user specified name.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
ink: https://lore.kernel.org/r/20210427070139.25256-8-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf pmu: Add hybrid helper functions
Jin Yao [Tue, 27 Apr 2021 07:01:19 +0000 (15:01 +0800)]
perf pmu: Add hybrid helper functions

The functions perf_pmu__is_hybrid and perf_pmu__find_hybrid_pmu
can be used to identify the hybrid platform and return the found
hybrid cpu pmu. All the detected hybrid pmus have been saved in
'perf_pmu__hybrid_pmus' list. So we just need to search this list.

perf_pmu__hybrid_type_to_pmu converts the user specified string
to hybrid pmu name. This is used to support the '--cputype' option
in next patches.

perf_pmu__has_hybrid checks the existing of hybrid pmu. Note that,
we have to define it in pmu.c (make pmu-hybrid.c no more symbol
dependency), otherwise perf test python would be failed.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-7-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf pmu: Save detected hybrid pmus to a global pmu list
Jin Yao [Tue, 27 Apr 2021 07:01:18 +0000 (15:01 +0800)]
perf pmu: Save detected hybrid pmus to a global pmu list

We identify the cpu_core pmu and cpu_atom pmu by explicitly
checking following files:

For cpu_core, checks:
"/sys/bus/event_source/devices/cpu_core/cpus"

For cpu_atom, checks:
"/sys/bus/event_source/devices/cpu_atom/cpus"

If the 'cpus' file exists and it has data, the pmu exists.

But in order not to hardcode the "cpu_core" and "cpu_atom",
and make the code in a generic way.

So if the path "/sys/bus/event_source/devices/cpu_xxx/cpus" exists, the
hybrid pmu exists. All the detected hybrid pmus are linked to a global
list 'perf_pmu__hybrid_pmus' and then next we just need to iterate the
list to get all hybrid pmu by using perf_pmu__for_each_hybrid_pmu.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-6-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf pmu: Save pmu name
Jin Yao [Tue, 27 Apr 2021 07:01:17 +0000 (15:01 +0800)]
perf pmu: Save pmu name

On hybrid platform, one event is available on one pmu
(such as, available on cpu_core or on cpu_atom).

This patch saves the pmu name to the pmu field of struct perf_pmu_alias.
Then next we can know the pmu which the event can be enabled on.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-5-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf pmu: Simplify arguments of __perf_pmu__new_alias
Jin Yao [Tue, 27 Apr 2021 07:01:16 +0000 (15:01 +0800)]
perf pmu: Simplify arguments of __perf_pmu__new_alias

Simplify the arguments of __perf_pmu__new_alias() by passing the whole
'struct pme_event' pointer.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-4-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf jevents: Support unit value "cpu_core" and "cpu_atom"
Jin Yao [Tue, 27 Apr 2021 07:01:15 +0000 (15:01 +0800)]
perf jevents: Support unit value "cpu_core" and "cpu_atom"

For some Intel platforms, such as Alderlake, which is a hybrid platform
and it consists of atom cpu and core cpu. Each cpu has dedicated event
list. Part of events are available on core cpu, part of events are
available on atom cpu.

The kernel exports new cpu pmus: cpu_core and cpu_atom. The event in
json is added with a new field "Unit" to indicate which pmu the event
is available on.

For example, one event in cache.json,

    {
        "BriefDescription": "Counts the number of load ops retired that",
        "CollectPEBSRecord": "2",
        "Counter": "0,1,2,3",
        "EventCode": "0xd2",
        "EventName": "MEM_LOAD_UOPS_RETIRED_MISC.MMIO",
        "PEBScounters": "0,1,2,3",
        "SampleAfterValue": "1000003",
        "UMask": "0x80",
        "Unit": "cpu_atom"
    },

The unit "cpu_atom" indicates this event is only available on "cpu_atom".

In generated pmu-events.c, we can see:

{
        .name = "mem_load_uops_retired_misc.mmio",
        .event = "period=1000003,umask=0x80,event=0xd2",
        .desc = "Counts the number of load ops retired that. Unit: cpu_atom ",
        .topic = "cache",
        .pmu = "cpu_atom",
},

But if without this patch, the "uncore_" prefix is added before "cpu_atom",
such as:
        .pmu = "uncore_cpu_atom"

That would be a wrong pmu.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-3-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agotools headers uapi: Update tools's copy of linux/perf_event.h
Jin Yao [Tue, 27 Apr 2021 07:01:14 +0000 (15:01 +0800)]
tools headers uapi: Update tools's copy of linux/perf_event.h

To get the changes in:

Liang Kan's patch

  55bcf6ef314ae8ba ("perf: Extend PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE")

Kan's patch is in the tip/perf/core branch.

So the next perf tool patches need this interface for hybrid support.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427070139.25256-2-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf report: Print percentage of each event statistics
Namhyung Kim [Tue, 27 Apr 2021 01:37:17 +0000 (18:37 -0700)]
perf report: Print percentage of each event statistics

It's sometimes useful to see how many samples vs other events in the
data file with percent values.

  $ perf report --stat

  Aggregated stats:
             TOTAL events:      20064
              MMAP events:        239  ( 1.2%)
              COMM events:       1518  ( 7.6%)
              EXIT events:          1  ( 0.0%)
              FORK events:       1517  ( 7.6%)
            SAMPLE events:       4015  (20.0%)
             MMAP2 events:      12769  (63.6%)
    FINISHED_ROUND events:          2  ( 0.0%)
        THREAD_MAP events:          1  ( 0.0%)
           CPU_MAP events:          1  ( 0.0%)
         TIME_CONV events:          1  ( 0.0%)
  cycles stats:
            SAMPLE events:       2475
  instructions stats:
            SAMPLE events:       1540

Suggested-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427013717.1651674-7-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf report: Make --skip-empty as default
Namhyung Kim [Tue, 27 Apr 2021 01:37:16 +0000 (18:37 -0700)]
perf report: Make --skip-empty as default

so that the compact output is shown by default.  Also add 'report.skip-empty'
config option to override the default.  Users can also use --no-skip-empty
command line option to change the behavior anytime.

Committer testing:

  $ perf report --stat

  Aggregated stats:
             TOTAL events:         19
              COMM events:          2
              EXIT events:          1
            SAMPLE events:          8
             MMAP2 events:          4
    FINISHED_ROUND events:          1
        THREAD_MAP events:          1
           CPU_MAP events:          1
         TIME_CONV events:          1
  cycles:u stats:
            SAMPLE events:          8
  $ perf config report.skip-empty=false
  $ perf report --stat

  Aggregated stats:
             TOTAL events:         19
              MMAP events:          0
              LOST events:          0
              COMM events:          2
              EXIT events:          1
          THROTTLE events:          0
        UNTHROTTLE events:          0
              FORK events:          0
              READ events:          0
            SAMPLE events:          8
             MMAP2 events:          4
               AUX events:          0
      ITRACE_START events:          0
      LOST_SAMPLES events:          0
            SWITCH events:          0
   SWITCH_CPU_WIDE events:          0
        NAMESPACES events:          0
           KSYMBOL events:          0
         BPF_EVENT events:          0
            CGROUP events:          0
         TEXT_POKE events:          0
              ATTR events:          0
        EVENT_TYPE events:          0
      TRACING_DATA events:          0
          BUILD_ID events:          0
    FINISHED_ROUND events:          1
          ID_INDEX events:          0
     AUXTRACE_INFO events:          0
          AUXTRACE events:          0
    AUXTRACE_ERROR events:          0
        THREAD_MAP events:          1
           CPU_MAP events:          1
       STAT_CONFIG events:          0
              STAT events:          0
        STAT_ROUND events:          0
      EVENT_UPDATE events:          0
         TIME_CONV events:          1
           FEATURE events:          0
        COMPRESSED events:          0
  cycles:u stats:
            SAMPLE events:          8
  $ perf config report.skip-empty
  report.skip-empty=false
  $

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427013717.1651674-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf report: Add --skip-empty option to suppress 0 event stat
Namhyung Kim [Tue, 27 Apr 2021 01:37:15 +0000 (18:37 -0700)]
perf report: Add --skip-empty option to suppress 0 event stat

To make the output more readable, I think it's better to remove 0's in
the output.  Also the dummy event has no event stats so it just wasts
the space.  Let's use the --skip-empty option to suppress it.

  $ perf report --stat --skip-empty

  Aggregated stats:
             TOTAL events:      16530
              MMAP events:        226
              COMM events:       1596
              EXIT events:          2
          THROTTLE events:        121
        UNTHROTTLE events:        117
              FORK events:       1595
            SAMPLE events:        719
             MMAP2 events:      12147
            CGROUP events:          2
    FINISHED_ROUND events:          2
        THREAD_MAP events:          1
           CPU_MAP events:          1
         TIME_CONV events:          1
  cycles stats:
            SAMPLE events:        719

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427013717.1651674-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf report: Show event sample counts in --stat output
Namhyung Kim [Tue, 27 Apr 2021 01:37:14 +0000 (18:37 -0700)]
perf report: Show event sample counts in --stat output

To make the output identical with perf report -D, it needs to show
per-event sample counts along with the aggregated stat  at the end.

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427013717.1651674-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf hists: Split hists_stats from events_stats
Namhyung Kim [Tue, 27 Apr 2021 01:37:13 +0000 (18:37 -0700)]
perf hists: Split hists_stats from events_stats

Each struct hists have events_stats but most of the fields were not
used.  It's to count number of samples and periods whether filtered or
not.  And other fields are used only by evlist.

So it'd be better to split hists_stats and events_stats to reduce
wasted memory in the struct hists.  This makes the output of event
statistics in the perf report compact by skipping 0 events in each
evsel/hists.

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427013717.1651674-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf top: Use evlist->events_stat to count events
Namhyung Kim [Tue, 27 Apr 2021 01:37:12 +0000 (18:37 -0700)]
perf top: Use evlist->events_stat to count events

It's mainly to count lost events for the warning so it should be ok
to use the evlist->stats instead.  This is needed for changes in the
next commit.

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210427013717.1651674-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf data: Add JSON export
Nicholas Fraser [Mon, 26 Apr 2021 14:47:16 +0000 (10:47 -0400)]
perf data: Add JSON export

This adds a feature to export perf data to JSON.

The resolved symbols are exported into the JSON so that external tools
don't need to load the dsos themselves (or even have access to them at
all.) This makes it easy to load and analyze perf data with standalone
tools where direct perf or libbabeltrace integration is impractical.

The exporter uses a minimal inline JSON encoding without any external
dependencies. Currently it only outputs some headers and sample metadata
but it's easily extensible.

Use it like this:

  $ perf data convert --to-json out.json

Committer notes:

Fixup a __printf() bug that broke the build:

  util/data-convert-json.c:103:11: error: expected ‘)’ before numeric constant
    103 | __(printf, 5, 6)
        |           ^~
        |           )
  util/data-convert-json.c: In function ‘output_sample_callchain_entry’:
  util/data-convert-json.c:124:2: error: implicit declaration of function ‘output_json_key_format’; did you mean ‘output_json_format’? [-Werror=implicit-function-declaration]
    124 |  output_json_key_format(out, false, 5, "ip", "\"0x%" PRIx64 "\"", ip);
        |  ^~~~~~~~~~~~~~~~~~~~~~
        |  output_json_format

Also had to add this patch to fix errors reported by various versions of
clang:

  -       if (al && al->sym && al->sym->name && strlen(al->sym->name) > 0) {
  +       if (al && al->sym && al->sym->namelen) {

al->sym->name is a zero sized array, to avoid one extra alloc in the
symbol__new() constructor, sym->namelen carries its strlen.

Committer testing:

  $ ls -la out.json
  ls: cannot access 'out.json': No such file or directory
  $ perf record sleep 0.1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.001 MB perf.data (8 samples) ]
  $ perf report --stats | grep -w SAMPLE
            SAMPLE events:          8
  $ perf data convert --to-json out.json
  [ perf data convert: Converted 'perf.data' into JSON data 'out.json' ]
  [ perf data convert: Converted and wrote 0.002 MB (8 samples) ]
  $ ls -la out.json
  -rw-rw-r--. 1 acme acme 2017 Apr 26 17:29 out.json
  $ cat out.json
  {
   "linux-perf-json-version": 1,
   "headers": {
   "header-version": 1,
   "captured-on": "2021-04-26T20:28:57Z",
   "data-offset": 432,
   "data-size": 1016,
   "feat-offset": 1448,
   "hostname": "five",
   "os-release": "5.11.14-200.fc33.x86_64",
   "arch": "x86_64",
   "cpu-desc": "AMD Ryzen 9 3900X 12-Core Processor",
   "cpuid": "AuthenticAMD,23,113,0",
   "nrcpus-online": 24,
   "nrcpus-avail": 24,
   "perf-version": "5.12.gee134f3189bd",
   "cmdline": [
   "/home/acme/bin/perf",
   "record",
   "sleep",
   "0.1"
   ]
   },
   "samples": [
   {
   "timestamp": 170517539043684,
   "pid": 375844,
   "tid": 375844,
   "comm": "sleep",
   "callchain": [
   {
   "ip": "0xffffffffa6268827"
   }
   ]
   },
   {
   "timestamp": 170517539048443,
   "pid": 375844,
   "tid": 375844,
   "comm": "sleep",
   "callchain": [
   {
   "ip": "0xffffffffa661359d"
   }
   ]
   },
   {
   "timestamp": 170517539051018,
   "pid": 375844,
   "tid": 375844,
   "comm": "sleep",
   "callchain": [
   {
   "ip": "0xffffffffa6311e18"
   }
   ]
   },
   {
   "timestamp": 170517539053652,
   "pid": 375844,
   "tid": 375844,
   "comm": "sleep",
   "callchain": [
   {
   "ip": "0x7fdb77b4812b",
   "symbol": "_dl_start",
   "dso": "ld-2.32.so"
   }
   ]
   },
   {
   "timestamp": 170517539055306,
   "pid": 375844,
   "tid": 375844,
   "comm": "sleep",
   "callchain": [
   {
   "ip": "0xffffffffa6269286"
   }
   ]
   },
   {
   "timestamp": 170517539057590,
   "pid": 375844,
   "tid": 375844,
   "comm": "sleep",
   "callchain": [
   {
   "ip": "0xffffffffa62abd8b"
   }
   ]
   },
   {
   "timestamp": 170517539067559,
   "pid": 375844,
   "tid": 375844,
   "comm": "sleep",
   "callchain": [
   {
   "ip": "0x7fdb77b5e9e9",
   "symbol": "__GI___tunables_init",
   "dso": "ld-2.32.so"
   }
   ]
   },
   {
   "timestamp": 170517539282452,
   "pid": 375844,
   "tid": 375844,
   "comm": "sleep",
   "callchain": [
   {
   "ip": "0x7fdb779978d2",
   "symbol": "getenv",
   "dso": "libc-2.32.so"
   }
   ]
   }
   ]
  }
  $

Signed-off-by: Nicholas Fraser <nfraser@codeweavers.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Changbin Du <changbin.du@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tan Xiaojun <tanxiaojun@huawei.com>
Cc: Ulrich Czekalla <uczekalla@codeweavers.com>
Link: http://lore.kernel.org/lkml/3884969f-804d-2f53-c648-e2b0bd85edff@codeweavers.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf stat: Introduce bpf_counter_ops->disable()
Song Liu [Sun, 25 Apr 2021 21:43:33 +0000 (14:43 -0700)]
perf stat: Introduce bpf_counter_ops->disable()

Introduce bpf_counter_ops->disable(), which is used stop counting the
event.

Committer notes:

Added a dummy bpf_counter__disable() to the python binding to avoid
having 'perf test python' failing.

bpf_counter isn't supported in the python binding.

Signed-off-by: Song Liu <song@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: kernel-team@fb.com
Link: https://lore.kernel.org/r/20210425214333.1090950-6-song@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf stat: Introduce ':b' modifier
Song Liu [Sun, 25 Apr 2021 21:43:32 +0000 (14:43 -0700)]
perf stat: Introduce ':b' modifier

Introduce 'b' modifier to event parser, which means use BPF program to
manage this event. This is the same as --bpf-counters option, but only
applies to this event. For example,

  perf stat -e cycles:b,cs               # use bpf for cycles, but not cs
  perf stat -e cycles,cs --bpf-counters  # use bpf for both cycles and cs

Suggested-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Song Liu <song@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/r/20210425214333.1090950-5-song@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf stat: Introduce config stat.bpf-counter-events
Song Liu [Sun, 25 Apr 2021 21:43:31 +0000 (14:43 -0700)]
perf stat: Introduce config stat.bpf-counter-events

Currently, to use BPF to aggregate perf event counters, the user uses
--bpf-counters option. Enable "use bpf by default" events with a config
option, stat.bpf-counter-events. Events with name in the option will use
BPF.

This also enables mixed BPF event and regular event in the same sesssion.
For example:

   perf config stat.bpf-counter-events=instructions
   perf stat -e instructions,cs

The second command will use BPF for "instructions" but not "cs".

Signed-off-by: Song Liu <song@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/r/20210425214333.1090950-4-song@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf bpf: check perf_attr_map is compatible with the perf binary
Song Liu [Sun, 25 Apr 2021 21:43:30 +0000 (14:43 -0700)]
perf bpf: check perf_attr_map is compatible with the perf binary

perf_attr_map could be shared among different version of perf binary. Add
bperf_attr_map_compatible() to check whether the existing attr_map is
compatible with current perf binary.

Signed-off-by: Song Liu <song@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: kernel-team@fb.com
Link: https://lore.kernel.org/r/20210425214333.1090950-3-song@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf util: Move bpf_perf definitions to a libperf header
Song Liu [Sun, 25 Apr 2021 21:43:29 +0000 (14:43 -0700)]
perf util: Move bpf_perf definitions to a libperf header

By following the same protocol, other tools can share hardware PMCs with
perf. Move perf_event_attr_map_entry and BPF_PERF_DEFAULT_ATTR_MAP_PATH to
bpf_perf.h for other tools to use.

Signed-off-by: Song Liu <song@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: kernel-team@fb.com
Link: https://lore.kernel.org/r/20210425214333.1090950-2-song@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoMerge remote-tracking branch 'torvalds/master' into perf/core
Arnaldo Carvalho de Melo [Mon, 26 Apr 2021 12:35:41 +0000 (09:35 -0300)]
Merge remote-tracking branch 'torvalds/master' into perf/core

To pick up fixes.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoLinux 5.12
Linus Torvalds [Sun, 25 Apr 2021 20:49:08 +0000 (13:49 -0700)]
Linux 5.12

3 years agoMerge tag 'perf-tools-fixes-for-v5.12-2021-04-25' of git://git.kernel.org/pub/scm...
Linus Torvalds [Sun, 25 Apr 2021 16:48:46 +0000 (09:48 -0700)]
Merge tag 'perf-tools-fixes-for-v5.12-2021-04-25' of git://git./linux/kernel/git/acme/linux

Pull perf tools fixes from Arnaldo Carvalho de Melo:

 - Fix potential NULL pointer dereference in the auxtrace option parser

 - Fix access to PID in an array when setting a PID filter in 'perf ftrace'

 - Fix error return code in the 'perf data' tool and in maps__clone(),
   found using a static analysis tool from Huawei

* tag 'perf-tools-fixes-for-v5.12-2021-04-25' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  perf map: Fix error return code in maps__clone()
  perf ftrace: Fix access to pid in array when setting a pid filter
  perf auxtrace: Fix potential NULL pointer dereference
  perf data: Fix error return code in perf_data__create_dir()

3 years agoMerge tag 'perf_urgent_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 25 Apr 2021 16:42:06 +0000 (09:42 -0700)]
Merge tag 'perf_urgent_for_v5.12' of git://git./linux/kernel/git/tip/tip

Pull x86 perf fixes from Borislav Petkov:

 - Fix Broadwell Xeon's stepping in the PEBS isolation table of CPUs

 - Fix a panic when initializing perf uncore machinery on Haswell and
   Broadwell servers

* tag 'perf_urgent_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/kvm: Fix Broadwell Xeon stepping in isolation_ucodes[]
  perf/x86/intel/uncore: Remove uncore extra PCI dev HSWEP_PCI_PCU_3

3 years agoMerge tag 'locking_urgent_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 25 Apr 2021 16:10:10 +0000 (09:10 -0700)]
Merge tag 'locking_urgent_for_v5.12' of git://git./linux/kernel/git/tip/tip

Pull locking fix from Borislav Petkov:
 "Fix ordering in the queued writer lock's slowpath"

* tag 'locking_urgent_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/qrwlock: Fix ordering in queued_write_lock_slowpath()

3 years agoMerge tag 'sched_urgent_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 25 Apr 2021 16:08:19 +0000 (09:08 -0700)]
Merge tag 'sched_urgent_for_v5.12' of git://git./linux/kernel/git/tip/tip

Pull scheduler fix from Borislav Petkov:
 "Fix a typo in a macro ifdeffery"

* tag 'sched_urgent_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  preempt/dynamic: Fix typo in macro conditional statement

3 years agoMerge tag 'x86_urgent_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 25 Apr 2021 16:02:13 +0000 (09:02 -0700)]
Merge tag 'x86_urgent_for_v5.12' of git://git./linux/kernel/git/tip/tip

Pull x86 fix from Borislav Petkov:
 "Fix an out-of-bounds memory access when setting up a crash kernel with
  kexec"

* tag 'x86_urgent_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/crash: Fix crash_setup_memmap_entries() out-of-bounds access

3 years agoperf vendor events intel: Add missing skylake & icelake model numbers
Jin Yao [Mon, 29 Mar 2021 07:09:03 +0000 (15:09 +0800)]
perf vendor events intel: Add missing skylake & icelake model numbers

Kernel has supported COMETLAKE/COMETLAKE_L to use the SKYLAKE
events and supported TIGERLAKE_L/TIGERLAKE/ROCKETLAKE to use
the ICELAKE events. But pmu-events mapfile.csv is missing
these model numbers.

Now add the missing model numbers to mapfile.csv.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210329070903.8894-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Sat, 24 Apr 2021 16:40:18 +0000 (09:40 -0700)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm

Pull kvm fix from Paolo Bonzini:
 "Fix SRCU bug introduced in the merge window"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86/xen: Take srcu lock when accessing kvm_memslots()

3 years agoRevert "net/rds: Avoid potential use after free in rds_send_remove_from_sock"
Linus Torvalds [Sat, 24 Apr 2021 16:32:35 +0000 (09:32 -0700)]
Revert "net/rds: Avoid potential use after free in rds_send_remove_from_sock"

This reverts commit 0c85a7e87465f2d4cbc768e245f4f45b2f299b05.

The games with 'rm' are on (two separate instances) of a local variable,
and make no difference.

Quoting Aditya Pakki:
 "I was the author of the patch and it was the cause of the giant UMN
  revert.

  The patch is garbage and I was unaware of the steps involved in
  retracting it. I *believed* the maintainers would pull it, given it
  was already under Greg's list. The patch does not introduce any bugs
  but is pointless and is stupid. I accept my incompetence and for not
  requesting a revert earlier."

Link: https://lwn.net/Articles/854319/
Requested-by: Aditya Pakki <pakki001@umn.edu>
Cc: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agoMerge tag 'pinctrl-v5.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw...
Linus Torvalds [Sat, 24 Apr 2021 00:11:10 +0000 (17:11 -0700)]
Merge tag 'pinctrl-v5.12-3' of git://git./linux/kernel/git/linusw/linux-pinctrl

Pull pin control fixes from Linus Walleij:
 "Late pin control fixes, would have been in the main pull request
  normally but hey I got lucky and we got another week to polish up
  v5.12 so here we go.

  One driver fix and one making the core debugfs work:

   - Fix the number of pins in the community of the Intel Lewisburg SoC

   - Show pin numbers for controllers with base = 0 in the new debugfs
     feature"

* tag 'pinctrl-v5.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: core: Show pin numbers for the controllers with base = 0
  pinctrl: lewisburg: Update number of pins in community

3 years agoMerge branch 'akpm' (patches from Andrew)
Linus Torvalds [Fri, 23 Apr 2021 21:56:23 +0000 (14:56 -0700)]
Merge branch 'akpm' (patches from Andrew)

Merge misc fixes from Andrew Morton:
 "5 patches.

  Subsystems affected by this patch series: coda, overlayfs, and
  mm (pagecache and memcg)"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  tools/cgroup/slabinfo.py: updated to work on current kernel
  mm/filemap: fix mapping_seek_hole_data on THP & 32-bit
  mm/filemap: fix find_lock_entries hang on 32-bit THP
  ovl: fix reference counting in ovl_mmap error path
  coda: fix reference counting in coda_file_mmap error path

3 years agoMerge tag 'block-5.12-2021-04-23' of git://git.kernel.dk/linux-block
Linus Torvalds [Fri, 23 Apr 2021 21:46:20 +0000 (14:46 -0700)]
Merge tag 'block-5.12-2021-04-23' of git://git.kernel.dk/linux-block

Pull block fix from Jens Axboe:
 "A single fix for a behavioral regression in this series, when
  re-reading the partition table with partitions open"

* tag 'block-5.12-2021-04-23' of git://git.kernel.dk/linux-block:
  block: return -EBUSY when there are open partitions in blkdev_reread_part

3 years agotools/cgroup/slabinfo.py: updated to work on current kernel
Vasily Averin [Fri, 23 Apr 2021 21:29:03 +0000 (14:29 -0700)]
tools/cgroup/slabinfo.py: updated to work on current kernel

slabinfo.py script does not work with actual kernel version.

First, it was unable to recognise SLUB susbsytem, and when I specified
it manually it failed again with

  AttributeError: 'struct page' has no member 'obj_cgroups'

.. and then again with

  File "tools/cgroup/memcg_slabinfo.py", line 221, in main
    memcg.kmem_caches.address_of_(),
  AttributeError: 'struct mem_cgroup' has no member 'kmem_caches'

Link: https://lkml.kernel.org/r/cec1a75e-43b4-3d64-2084-d9f98fda037f@virtuozzo.com
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Tested-by: Roman Gushchin <guro@fb.com>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agomm/filemap: fix mapping_seek_hole_data on THP & 32-bit
Hugh Dickins [Fri, 23 Apr 2021 21:29:00 +0000 (14:29 -0700)]
mm/filemap: fix mapping_seek_hole_data on THP & 32-bit

No problem on 64-bit, or without huge pages, but xfstests generic/285
and other SEEK_HOLE/SEEK_DATA tests have regressed on huge tmpfs, and on
32-bit architectures, with the new mapping_seek_hole_data().  Several
different bugs turned out to need fixing.

u64 cast to stop losing bits when converting unsigned long to loff_t
(and let's use shifts throughout, rather than mixed with * and /).

Use round_up() when advancing pos, to stop assuming that pos was already
THP-aligned when advancing it by THP-size.  (This use of round_up()
assumes that any THP has THP-aligned index: true at present and true
going forward, but could be recoded to avoid the assumption.)

Use xas_set() when iterating away from a THP, so that xa_index stays in
synch with start, instead of drifting away to return bogus offset.

Check start against end to avoid wrapping 32-bit xa_index to 0 (and to
handle these additional cases, seek_data or not, it's easier to break
the loop than goto: so rearrange exit from the function).

[hughd@google.com: remove unneeded u64 casts, per Matthew]
Link: https://lkml.kernel.org/r/alpine.LSU.2.11.2104221347240.1170@eggly.anvils
Link: https://lkml.kernel.org/r/alpine.LSU.2.11.2104211737410.3299@eggly.anvils
Fixes: 41139aa4c3a3 ("mm/filemap: add mapping_seek_hole_data")
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <dchinner@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: William Kucharski <william.kucharski@oracle.com>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agomm/filemap: fix find_lock_entries hang on 32-bit THP
Hugh Dickins [Fri, 23 Apr 2021 21:28:57 +0000 (14:28 -0700)]
mm/filemap: fix find_lock_entries hang on 32-bit THP

No problem on 64-bit, or without huge pages, but xfstests generic/308
hung uninterruptibly on 32-bit huge tmpfs.

Since commit 0cc3b0ec23ce ("Clarify (and fix) in 4.13 MAX_LFS_FILESIZE
macros"), MAX_LFS_FILESIZE is only a PAGE_SIZE away from wrapping 32-bit
xa_index to 0, so the new find_lock_entries() has to be extra careful
when handling a THP.

Link: https://lkml.kernel.org/r/alpine.LSU.2.11.2104211735430.3299@eggly.anvils
Fixes: 5c211ba29deb ("mm: add and use find_lock_entries")
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: William Kucharski <william.kucharski@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@suse.cz>
Cc: Dave Chinner <dchinner@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Yang Shi <yang.shi@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agoovl: fix reference counting in ovl_mmap error path
Christian König [Fri, 23 Apr 2021 21:28:54 +0000 (14:28 -0700)]
ovl: fix reference counting in ovl_mmap error path

mmap_region() now calls fput() on the vma->vm_file.

Fix this by using vma_set_file() so it doesn't need to be handled
manually here any more.

Link: https://lkml.kernel.org/r/20210421132012.82354-2-christian.koenig@amd.com
Fixes: 1527f926fd04 ("mm: mmap: fix fput in error path v2")
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jan Harkes <jaharkes@cs.cmu.edu>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: <stable@vger.kernel.org> [5.11+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agocoda: fix reference counting in coda_file_mmap error path
Christian König [Fri, 23 Apr 2021 21:28:51 +0000 (14:28 -0700)]
coda: fix reference counting in coda_file_mmap error path

mmap_region() now calls fput() on the vma->vm_file.

So we need to drop the extra reference on the coda file instead of the
host file.

Link: https://lkml.kernel.org/r/20210421132012.82354-1-christian.koenig@amd.com
Fixes: 1527f926fd04 ("mm: mmap: fix fput in error path v2")
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Jan Harkes <jaharkes@cs.cmu.edu>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: <stable@vger.kernel.org> [5.11+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agoKVM: x86/xen: Take srcu lock when accessing kvm_memslots()
Wanpeng Li [Fri, 23 Apr 2021 08:23:20 +0000 (16:23 +0800)]
KVM: x86/xen: Take srcu lock when accessing kvm_memslots()

kvm_memslots() will be called by kvm_write_guest_offset_cached() so we should
take the srcu lock. Let's pull the srcu lock operation from kvm_steal_time_set_preempted()
again to fix xen part.

Fixes: 30b5c851af7 ("KVM: x86/xen: Add support for vCPU runstate information")
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1619166200-9215-1-git-send-email-wanpengli@tencent.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
3 years agoMerge tag 'arm-fixes-5.12-4' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Linus Torvalds [Fri, 23 Apr 2021 20:01:05 +0000 (13:01 -0700)]
Merge tag 'arm-fixes-5.12-4' of git://git./linux/kernel/git/soc/soc

Pull ARM SoC fixes from Arnd Bergmann:
 "These should be the final fixes for v5.12.

  There is one fix for SD card detection on one Allwinner board, and a
  few fixes for the Tegra platform that I had already queued up for
  v5.13 due to a communication problem. This addresses MMC device
  ordering on multiple machines, audio support on Jetson AGX Xavier and
  suspend/resume on Jetson TX2"

* tag 'arm-fixes-5.12-4' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  arm64: dts: allwinner: Revert SD card CD GPIO for Pine64-LTS
  arm64: tegra: Move clocks from RT5658 endpoint to device node
  arm64: tegra: Fix mmc0 alias for Jetson Xavier NX
  arm64: tegra: Set fw_devlink=on for Jetson TX2
  arm64: tegra: Add unit-address for ACONNECT on Tegra186

3 years agoperf map: Fix error return code in maps__clone()
Zhen Lei [Thu, 15 Apr 2021 09:27:44 +0000 (17:27 +0800)]
perf map: Fix error return code in maps__clone()

Although 'err' has been initialized to -ENOMEM, but it will be reassigned
by the "err = unwind__prepare_access(...)" statement in the for loop. So
that, the value of 'err' is unknown when map__clone() failed.

Fixes: 6c502584438bda63 ("perf unwind: Call unwind__prepare_access for forked thread")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: zhen lei <thunder.leizhen@huawei.com>
Link: http://lore.kernel.org/lkml/20210415092744.3793-1-thunder.leizhen@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf ftrace: Fix access to pid in array when setting a pid filter
Thomas Richter [Wed, 21 Apr 2021 12:04:00 +0000 (14:04 +0200)]
perf ftrace: Fix access to pid in array when setting a pid filter

Command 'perf ftrace -v -- ls' fails in s390 (at least 5.12.0rc6).

The root cause is a missing pointer dereference which causes an
array element address to be used as PID.

Fix this by extracting the PID.

Output before:
  # ./perf ftrace -v -- ls
  function_graph tracer is used
  write '-263732416' to tracing/set_ftrace_pid failed: Invalid argument
  failed to set ftrace pid
  #

Output after:
   ./perf ftrace -v -- ls
   function_graph tracer is used
   # tracer: function_graph
   #
   # CPU  DURATION                  FUNCTION CALLS
   # |     |   |                     |   |   |   |
   4)               |  rcu_read_lock_sched_held() {
   4)   0.552 us    |    rcu_lockdep_current_cpu_online();
   4)   6.124 us    |  }

Reported-by: Alexander Schmidt <alexschm@de.ibm.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20210421120400.2126433-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf auxtrace: Fix potential NULL pointer dereference
Leo Yan [Tue, 20 Apr 2021 15:15:53 +0000 (23:15 +0800)]
perf auxtrace: Fix potential NULL pointer dereference

In the function auxtrace_parse_snapshot_options(), the callback pointer
"itr->parse_snapshot_options" can be NULL if it has not been set during
the AUX record initialization.  This can cause tool crashing if the
callback pointer "itr->parse_snapshot_options" is dereferenced without
performing NULL check.

Add a NULL check for the pointer "itr->parse_snapshot_options" before
invoke the callback.

Fixes: d20031bb63dd6dde ("perf tools: Add AUX area tracing Snapshot Mode")
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Link: http://lore.kernel.org/lkml/20210420151554.2031768-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoMerge tag 'drm-fixes-2021-04-23' of git://anongit.freedesktop.org/drm/drm
Linus Torvalds [Fri, 23 Apr 2021 17:23:57 +0000 (10:23 -0700)]
Merge tag 'drm-fixes-2021-04-23' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Just some small i915 and amdgpu fixes this week, should be all until
  you open the merge window.

  amdgpu:
   - Fix gpuvm page table update issue
   - Modifier fixes
   - Register fix for dimgrey cavefish

  i915:
   - GVT's BDW regression fix for cmd parser
   - Fix modesetting in case of unexpected AUX timeouts"

* tag 'drm-fixes-2021-04-23' of git://anongit.freedesktop.org/drm/drm:
  drm/amdgpu: fix GCR_GENERAL_CNTL offset for dimgrey_cavefish
  amd/display: allow non-linear multi-planar formats
  drm/amd/display: Update modifier list for gfx10_3
  drm/amdgpu: reserve fence slot to update page table
  drm/i915: Fix modesetting in case of unexpected AUX timeouts
  drm/i915/gvt: Fix BDW command parser regression

3 years agoMerge tag 'gpio-fixes-for-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 23 Apr 2021 17:19:19 +0000 (10:19 -0700)]
Merge tag 'gpio-fixes-for-v5.12' of git://git./linux/kernel/git/brgl/linux

Pull gpio fix from Bartosz Golaszewski:
 "Save and restore the sysconfig register in gpio-omap to fix a
  power-management issue"

* tag 'gpio-fixes-for-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpio: omap: Save and restore sysconfig

3 years agoMerge branch 'tegra/dt64' into arm/fixes
Arnd Bergmann [Fri, 23 Apr 2021 13:37:53 +0000 (15:37 +0200)]
Merge branch 'tegra/dt64' into arm/fixes

arm64: tegra: Device tree fixes for v5.12-rc6

This contains a couple of device tree fixes for the v5.12 release cycle.
These are needed for proper audio support on Jetson AGX Xavier, to boot
the Jetson Xavier NX from an SD card and to be able to suspend/resume
the Jetson TX2.

* tegra/dt64:
  arm64: tegra: Move clocks from RT5658 endpoint to device node
  arm64: tegra: Fix mmc0 alias for Jetson Xavier NX
  arm64: tegra: Set fw_devlink=on for Jetson TX2
  arm64: tegra: Add unit-address for ACONNECT on Tegra186

Link: https://lore.kernel.org/linux-arm-kernel/YILD4yyPXuiYbHW1@orome.fritz.box/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
3 years agoMerge tag 'drm-intel-fixes-2021-04-22' of git://anongit.freedesktop.org/drm/drm-intel...
Dave Airlie [Fri, 23 Apr 2021 02:16:12 +0000 (12:16 +1000)]
Merge tag 'drm-intel-fixes-2021-04-22' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

- GVT's BDW regression fix for cmd parser (Zhenyu)
- Fix modesetting in case of unexpected AUX timeouts (Imre)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YIGZ3pQPgPQtZtyI@intel.com
3 years agoMerge tag 'amd-drm-fixes-5.12-2021-04-21' of https://gitlab.freedesktop.org/agd5f...
Dave Airlie [Fri, 23 Apr 2021 01:34:02 +0000 (11:34 +1000)]
Merge tag 'amd-drm-fixes-5.12-2021-04-21' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes

amd-drm-fixes-5.12-2021-04-21:

amdgpu:
- Fix gpuvm page table update issue
- Modifier fixes
- Register fix for dimgrey cavefish

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210421220456.3839-1-alexander.deucher@amd.com
3 years agoMerge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Linus Torvalds [Thu, 22 Apr 2021 23:28:18 +0000 (16:28 -0700)]
Merge tag 'for_linus' of git://git./linux/kernel/git/mst/vhost

Pull virtio fixes from Michael Tsirkin:
 "Very late in the cycle but both risky if left unfixed and more or less
  obvious.."

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  vdpa/mlx5: Set err = -ENOMEM in case dma_map_sg_attrs fails
  vhost-vdpa: protect concurrent access to vhost device iotlb

3 years agovdpa/mlx5: Set err = -ENOMEM in case dma_map_sg_attrs fails
Eli Cohen [Sun, 11 Apr 2021 08:36:46 +0000 (11:36 +0300)]
vdpa/mlx5: Set err = -ENOMEM in case dma_map_sg_attrs fails

Set err = -ENOMEM if dma_map_sg_attrs() fails so the function reutrns
error.

Fixes: 94abbccdf291 ("vdpa/mlx5: Add shared memory registration code")
Signed-off-by: Eli Cohen <elic@nvidia.com>
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/20210411083646.910546-1-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
3 years agovhost-vdpa: protect concurrent access to vhost device iotlb
Xie Yongji [Mon, 12 Apr 2021 09:55:12 +0000 (17:55 +0800)]
vhost-vdpa: protect concurrent access to vhost device iotlb

Protect vhost device iotlb by vhost_dev->mutex. Otherwise,
it might cause corruption of the list and interval tree in
struct vhost_iotlb if userspace sends the VHOST_IOTLB_MSG_V2
message concurrently.

Fixes: 4c8cf318("vhost: introduce vDPA-based backend")
Cc: stable@vger.kernel.org
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20210412095512.178-1-xieyongji@bytedance.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
3 years agoMerge tag 'sunxi-fixes-for-5.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Arnd Bergmann [Thu, 22 Apr 2021 21:18:30 +0000 (23:18 +0200)]
Merge tag 'sunxi-fixes-for-5.12-2' of git://git./linux/kernel/git/sunxi/linux into arm/fixes

One fix for the MMC card detect on the Pine H64 board

* tag 'sunxi-fixes-for-5.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux:
  arm64: dts: allwinner: Revert SD card CD GPIO for Pine64-LTS

Link: https://lore.kernel.org/r/45fc5e4d-ef48-4729-a869-79a8f288bb83.lettre@localhost
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
3 years agoMerge tag 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/tpmdd
Linus Torvalds [Thu, 22 Apr 2021 20:51:46 +0000 (13:51 -0700)]
Merge tag 'fixes' of git://git./linux/kernel/git/jejb/tpmdd

Pull tpm fix from James Bottomley:
 "This is an urgent regression fix for a tpm patch set that went in this
  merge window. It looks like a rebase before the original pull request
  lost a tpm_try_get_ops() so we have a lock imbalance in our code which
  is causing oopses. The original patch was correct on the mailing list.

  I'm sending this in agreement with Mimi (as joint maintainers of
  trusted keys) because Jarkko is off communing with the Reindeer or
  whatever it is Finns do when on holiday"

* tag 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/tpmdd:
  KEYS: trusted: Fix TPM reservation for seal/unseal

3 years agoperf tools: Update topdown documentation to permit rdpmc calls
Ray Kinsella [Wed, 21 Apr 2021 09:10:09 +0000 (10:10 +0100)]
perf tools: Update topdown documentation to permit rdpmc calls

Update Topdown documentation to permit calls to rdpmc, and describe
interaction with system calls.

Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Link: http://lore.kernel.org/lkml/20210421091009.1711565-1-mdr@ashroe.eu
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf/x86/kvm: Fix Broadwell Xeon stepping in isolation_ucodes[]
Jim Mattson [Thu, 22 Apr 2021 00:18:34 +0000 (17:18 -0700)]
perf/x86/kvm: Fix Broadwell Xeon stepping in isolation_ucodes[]

The only stepping of Broadwell Xeon parts is stepping 1. Fix the
relevant isolation_ucodes[] entry, which previously enumerated
stepping 2.

Although the original commit was characterized as an optimization, it
is also a workaround for a correctness issue.

If a PMI arrives between kvm's call to perf_guest_get_msrs() and the
subsequent VM-entry, a stale value for the IA32_PEBS_ENABLE MSR may be
restored at the next VM-exit. This is because, unbeknownst to kvm, PMI
throttling may clear bits in the IA32_PEBS_ENABLE MSR. CPUs with "PEBS
isolation" don't suffer from this issue, because perf_guest_get_msrs()
doesn't report the IA32_PEBS_ENABLE value.

Fixes: 9b545c04abd4f ("perf/x86/kvm: Avoid unnecessary work in guest filtering")
Signed-off-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Peter Shier <pshier@google.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Link: https://lkml.kernel.org/r/20210422001834.1748319-1-jmattson@google.com
3 years agoarm64: dts: allwinner: Revert SD card CD GPIO for Pine64-LTS
Andre Przywara [Wed, 14 Apr 2021 10:47:40 +0000 (11:47 +0100)]
arm64: dts: allwinner: Revert SD card CD GPIO for Pine64-LTS

Commit 941432d00768 ("arm64: dts: allwinner: Drop non-removable from
SoPine/LTS SD card") enabled the card detect GPIO for the SOPine module,
along the way with the Pine64-LTS, which share the same base .dtsi.

This was based on the observation that the Pine64-LTS has as "push-push"
SD card socket, and that the schematic mentions the card detect GPIO.

After having received two reports about failing SD card access with that
patch, some more research and polls on that subject revealed that there
are at least two different versions of the Pine64-LTS out there:
- On some boards (including mine) the card detect pin is "stuck" at
  high, regardless of an microSD card being inserted or not.
- On other boards the card-detect is working, but is active-high, by
  virtue of an explicit inverter circuit, as shown in the schematic.

To cover all versions of the board out there, and don't take any chances,
let's revert the introduction of the active-low CD GPIO, but let's use
the broken-cd property for the Pine64-LTS this time. That should avoid
regressions and should work for everyone, even allowing SD card changes
now.
The SOPine card detect has proven to be working, so let's keep that
GPIO in place.

Fixes: 941432d00768 ("arm64: dts: allwinner: Drop non-removable from SoPine/LTS SD card")
Reported-by: Michael Weiser <michael.weiser@gmx.de>
Reported-by: Daniel Kulesz <kuleszdl@posteo.org>
Suggested-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Tested-by: Michael Weiser <michael.weiser@gmx.de>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://lore.kernel.org/r/20210414104740.31497-1-andre.przywara@arm.com
3 years agopinctrl: core: Show pin numbers for the controllers with base = 0
Andy Shevchenko [Thu, 15 Apr 2021 13:03:56 +0000 (16:03 +0300)]
pinctrl: core: Show pin numbers for the controllers with base = 0

The commit f1b206cf7c57 ("pinctrl: core: print gpio in pins debugfs file")
enabled GPIO pin number and label in debugfs for pin controller. However,
it limited that feature to the chips where base is positive number. This,
in particular, excluded chips where base is 0 for the historical or backward
compatibility reasons. Refactor the code to include the latter as well.

Fixes: f1b206cf7c57 ("pinctrl: core: print gpio in pins debugfs file")
Cc: Drew Fustini <drew@beagleboard.org>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Tested-by: Drew Fustini <drew@beagleboard.org>
Reviewed-by: Drew Fustini <drew@beagleboard.org>
Link: https://lore.kernel.org/r/20210415130356.15885-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
3 years agoKEYS: trusted: Fix TPM reservation for seal/unseal
James Bottomley [Wed, 21 Apr 2021 22:42:47 +0000 (15:42 -0700)]
KEYS: trusted: Fix TPM reservation for seal/unseal

The original patch 8c657a0590de ("KEYS: trusted: Reserve TPM for seal
and unseal operations") was correct on the mailing list:

https://lore.kernel.org/linux-integrity/20210128235621.127925-4-jarkko@kernel.org/

But somehow got rebased so that the tpm_try_get_ops() in
tpm2_seal_trusted() got lost.  This causes an imbalanced put of the
TPM ops and causes oopses on TIS based hardware.

This fix puts back the lost tpm_try_get_ops()

Fixes: 8c657a0590de ("KEYS: trusted: Reserve TPM for seal and unseal operations")
Reported-by: Mimi Zohar <zohar@linux.ibm.com>
Acked-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
3 years agoMerge tag 'mmc-v5.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Linus Torvalds [Wed, 21 Apr 2021 17:22:54 +0000 (10:22 -0700)]
Merge tag 'mmc-v5.12-rc5' of git://git./linux/kernel/git/ulfh/mmc

Pull MMC fix from Ulf Hansson:
 "Replace WARN_ONCE with dev_warn_once for non-optimal sg-alignment in
  the meson-gx host driver"

* tag 'mmc-v5.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: meson-gx: replace WARN_ONCE with dev_warn_once about scatterlist size alignment in block mode

3 years agoblock: return -EBUSY when there are open partitions in blkdev_reread_part
Christoph Hellwig [Wed, 21 Apr 2021 16:05:02 +0000 (18:05 +0200)]
block: return -EBUSY when there are open partitions in blkdev_reread_part

The switch to go through blkdev_get_by_dev means we now ignore the
return value from bdev_disk_changed in __blkdev_get.  Add a manual
check to restore the old semantics.

Fixes: 4601b4b130de ("block: reopen the device in blkdev_reread_part")
Reported-by: Karel Zak <kzak@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210421160502.447418-1-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
3 years agodrm/amdgpu: fix GCR_GENERAL_CNTL offset for dimgrey_cavefish
Jiansong Chen [Mon, 19 Apr 2021 08:33:22 +0000 (16:33 +0800)]
drm/amdgpu: fix GCR_GENERAL_CNTL offset for dimgrey_cavefish

dimgrey_cavefish has similar gc_10_3 ip with sienna_cichlid,
so follow its registers offset setting.

Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
3 years agoamd/display: allow non-linear multi-planar formats
Simon Ser [Fri, 26 Mar 2021 16:59:44 +0000 (17:59 +0100)]
amd/display: allow non-linear multi-planar formats

Accept non-linear buffers which use a multi-planar format, as long
as they don't use DCC.

Tested on GFX9 with NV12.

Signed-off-by: Simon Ser <contact@emersion.fr>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Harry Wentland <hwentlan@amd.com>
Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Cc: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
3 years agodrm/amd/display: Update modifier list for gfx10_3
Qingqing Zhuo [Wed, 14 Apr 2021 23:00:01 +0000 (19:00 -0400)]
drm/amd/display: Update modifier list for gfx10_3

[Why]
Current list supports modifiers that have DCC_MAX_COMPRESSED_BLOCK
set to AMD_FMT_MOD_DCC_BLOCK_128B, while AMD_FMT_MOD_DCC_BLOCK_64B
is used instead by userspace.

[How]
Replace AMD_FMT_MOD_DCC_BLOCK_128B with AMD_FMT_MOD_DCC_BLOCK_64B
for modifiers with DCC supported.

Fixes: faa37f54ce0462 ("drm/amd/display: Expose modifiers")
Signed-off-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
3 years agodrm/amdgpu: reserve fence slot to update page table
Philip Yang [Thu, 1 Apr 2021 04:22:23 +0000 (00:22 -0400)]
drm/amdgpu: reserve fence slot to update page table

Forgot to reserve a fence slot to use sdma to update page table, cause
below kernel BUG backtrace to handle vm retry fault while application is
exiting.

[  133.048143] kernel BUG at /home/yangp/git/compute_staging/kernel/drivers/dma-buf/dma-resv.c:281!
[  133.048487] Workqueue: events amdgpu_irq_handle_ih1 [amdgpu]
[  133.048506] RIP: 0010:dma_resv_add_shared_fence+0x204/0x280
[  133.048672]  amdgpu_vm_sdma_commit+0x134/0x220 [amdgpu]
[  133.048788]  amdgpu_vm_bo_update_range+0x220/0x250 [amdgpu]
[  133.048905]  amdgpu_vm_handle_fault+0x202/0x370 [amdgpu]
[  133.049031]  gmc_v9_0_process_interrupt+0x1ab/0x310 [amdgpu]
[  133.049165]  ? kgd2kfd_interrupt+0x9a/0x180 [amdgpu]
[  133.049289]  ? amdgpu_irq_dispatch+0xb6/0x240 [amdgpu]
[  133.049408]  amdgpu_irq_dispatch+0xb6/0x240 [amdgpu]
[  133.049534]  amdgpu_ih_process+0x9b/0x1c0 [amdgpu]
[  133.049657]  amdgpu_irq_handle_ih1+0x21/0x60 [amdgpu]
[  133.049669]  process_one_work+0x29f/0x640
[  133.049678]  worker_thread+0x39/0x3f0
[  133.049685]  ? process_one_work+0x640/0x640

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 5.11.x
3 years agogpio: omap: Save and restore sysconfig
Tony Lindgren [Sat, 17 Apr 2021 08:38:39 +0000 (11:38 +0300)]
gpio: omap: Save and restore sysconfig

As we are using cpu_pm to save and restore context, we must also save and
restore the GPIO sysconfig register. This is needed because we are not
calling PM runtime functions at all with cpu_pm.

We need to save the sysconfig on idle as it's value can get reconfigured by
PM runtime and can be different from the init time value. Device specific
flags like "ti,no-idle-on-init" can affect the init value.

Fixes: b764a5863fd8 ("gpio: omap: Remove custom PM calls and use cpu_pm instead")
Cc: Aaro Koskinen <aaro.koskinen@iki.fi>
Cc: Adam Ford <aford173@gmail.com>
Cc: Andreas Kemnade <andreas@kemnade.info>
Cc: Grygorii Strashko <grygorii.strashko@ti.com>
Cc: Peter Ujfalusi <peter.ujfalusi@gmail.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Acked-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
3 years agoperf/x86/intel/uncore: Remove uncore extra PCI dev HSWEP_PCI_PCU_3
Kan Liang [Thu, 15 Apr 2021 21:22:43 +0000 (14:22 -0700)]
perf/x86/intel/uncore: Remove uncore extra PCI dev HSWEP_PCI_PCU_3

There may be a kernel panic on the Haswell server and the Broadwell
server, if the snbep_pci2phy_map_init() return error.

The uncore_extra_pci_dev[HSWEP_PCI_PCU_3] is used in the cpu_init() to
detect the existence of the SBOX, which is a MSR type of PMON unit.
The uncore_extra_pci_dev is allocated in the uncore_pci_init(). If the
snbep_pci2phy_map_init() returns error, perf doesn't initialize the
PCI type of the PMON units, so the uncore_extra_pci_dev will not be
allocated. But perf may continue initializing the MSR type of PMON
units. A null dereference kernel panic will be triggered.

The sockets in a Haswell server or a Broadwell server are identical.
Only need to detect the existence of the SBOX once.
Current perf probes all available PCU devices and stores them into the
uncore_extra_pci_dev. It's unnecessary.
Use the pci_get_device() to replace the uncore_extra_pci_dev. Only
detect the existence of the SBOX on the first available PCU device once.

Factor out hswep_has_limit_sbox(), since the Haswell server and the
Broadwell server uses the same way to detect the existence of the SBOX.

Add some macros to replace the magic number.

Fixes: 5306c31c5733 ("perf/x86/uncore/hsw-ep: Handle systems with only two SBOXes")
Reported-by: Steve Wahl <steve.wahl@hpe.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Steve Wahl <steve.wahl@hpe.com>
Link: https://lkml.kernel.org/r/1618521764-100923-1-git-send-email-kan.liang@linux.intel.com
3 years agoMerge tag 'trace-v5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt...
Linus Torvalds [Tue, 20 Apr 2021 21:38:35 +0000 (14:38 -0700)]
Merge tag 'trace-v5.12-rc8' of git://git./linux/kernel/git/rostedt/linux-trace

Pull tracing fix from Steven Rostedt:
 "Fix tp_printk command line and trace events

  Masami added a wrapper to be able to unhash trace event pointers as
  they are only read by root anyway, and they can also be extracted by
  the raw trace data buffers. But this wrapper utilized the iterator to
  have a temporary buffer to manipulate the text with.

  tp_printk is a kernel command line option that will send the trace
  output of a trace event to the console on boot up (useful when the
  system crashes before finishing the boot). But the code used the same
  wrapper that Masami added, and its iterator did not have a buffer, and
  this caused the system to crash.

  Have the wrapper just print the trace event normally if the iterator
  has no temporary buffer"

* tag 'trace-v5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Fix checking event hash pointer logic when tp_printk is enabled

3 years agocapabilities: require CAP_SETFCAP to map uid 0
Serge E. Hallyn [Tue, 20 Apr 2021 13:43:34 +0000 (08:43 -0500)]
capabilities: require CAP_SETFCAP to map uid 0

cap_setfcap is required to create file capabilities.

Since commit 8db6c34f1dbc ("Introduce v3 namespaced file capabilities"),
a process running as uid 0 but without cap_setfcap is able to work
around this as follows: unshare a new user namespace which maps parent
uid 0 into the child namespace.

While this task will not have new capabilities against the parent
namespace, there is a loophole due to the way namespaced file
capabilities are represented as xattrs.  File capabilities valid in
userns 1 are distinguished from file capabilities valid in userns 2 by
the kuid which underlies uid 0.  Therefore the restricted root process
can unshare a new self-mapping namespace, add a namespaced file
capability onto a file, then use that file capability in the parent
namespace.

To prevent that, do not allow mapping parent uid 0 if the process which
opened the uid_map file does not have CAP_SETFCAP, which is the
capability for setting file capabilities.

As a further wrinkle: a task can unshare its user namespace, then open
its uid_map file itself, and map (only) its own uid.  In this case we do
not have the credential from before unshare, which was potentially more
restricted.  So, when creating a user namespace, we record whether the
creator had CAP_SETFCAP.  Then we can use that during map_write().

With this patch:

1. Unprivileged user can still unshare -Ur

   ubuntu@caps:~$ unshare -Ur
   root@caps:~# logout

2. Root user can still unshare -Ur

   ubuntu@caps:~$ sudo bash
   root@caps:/home/ubuntu# unshare -Ur
   root@caps:/home/ubuntu# logout

3. Root user without CAP_SETFCAP cannot unshare -Ur:

   root@caps:/home/ubuntu# /sbin/capsh --drop=cap_setfcap --
   root@caps:/home/ubuntu# /sbin/setcap cap_setfcap=p /sbin/setcap
   unable to set CAP_SETFCAP effective capability: Operation not permitted
   root@caps:/home/ubuntu# unshare -Ur
   unshare: write failed /proc/self/uid_map: Operation not permitted

Note: an alternative solution would be to allow uid 0 mappings by
processes without CAP_SETFCAP, but to prevent such a namespace from
writing any file capabilities.  This approach can be seen at [1].

Background history: commit 95ebabde382 ("capabilities: Don't allow
writing ambiguous v3 file capabilities") tried to fix the issue by
preventing v3 fscaps to be written to disk when the root uid would map
to the same uid in nested user namespaces.  This led to regressions for
various workloads.  For example, see [2].  Ultimately this is a valid
use-case we have to support meaning we had to revert this change in
3b0c2d3eaa83 ("Revert 95ebabde382c ("capabilities: Don't allow writing
ambiguous v3 file capabilities")").

Link: https://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux.git/log/?h=2021-04-15/setfcap-nsfscaps-v4
Link: https://github.com/containers/buildah/issues/3071
Signed-off-by: Serge Hallyn <serge@hallyn.com>
Reviewed-by: Andrew G. Morgan <morgan@kernel.org>
Tested-by: Christian Brauner <christian.brauner@ubuntu.com>
Reviewed-by: Christian Brauner <christian.brauner@ubuntu.com>
Tested-by: Giuseppe Scrivano <gscrivan@redhat.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agoperf data: Fix error return code in perf_data__create_dir()
Zhen Lei [Thu, 15 Apr 2021 08:34:16 +0000 (16:34 +0800)]
perf data: Fix error return code in perf_data__create_dir()

Although 'ret' has been initialized to -1, but it will be reassigned by
the "ret = open(...)" statement in the for loop. So that, the value of
'ret' is unknown when asprintf() failed.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210415083417.3740-1-thunder.leizhen@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agox86/crash: Fix crash_setup_memmap_entries() out-of-bounds access
Mike Galbraith [Fri, 16 Apr 2021 12:02:07 +0000 (14:02 +0200)]
x86/crash: Fix crash_setup_memmap_entries() out-of-bounds access

Commit in Fixes: added support for kexec-ing a kernel on panic using a
new system call. As part of it, it does prepare a memory map for the new
kernel.

However, while doing so, it wrongly accesses memory it has not
allocated: it accesses the first element of the cmem->ranges[] array in
memmap_exclude_ranges() but it has not allocated the memory for it in
crash_setup_memmap_entries(). As KASAN reports:

  BUG: KASAN: vmalloc-out-of-bounds in crash_setup_memmap_entries+0x17e/0x3a0
  Write of size 8 at addr ffffc90000426008 by task kexec/1187

  (gdb) list *crash_setup_memmap_entries+0x17e
  0xffffffff8107cafe is in crash_setup_memmap_entries (arch/x86/kernel/crash.c:322).
  317                                      unsigned long long mend)
  318     {
  319             unsigned long start, end;
  320
  321             cmem->ranges[0].start = mstart;
  322             cmem->ranges[0].end = mend;
  323             cmem->nr_ranges = 1;
  324
  325             /* Exclude elf header region */
  326             start = image->arch.elf_load_addr;
  (gdb)

Make sure the ranges array becomes a single element allocated.

 [ bp: Write a proper commit message. ]

Fixes: dd5f726076cc ("kexec: support for kexec on panic using new system call")
Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Dave Young <dyoung@redhat.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/725fa3dc1da2737f0f6188a1a9701bead257ea9d.camel@gmx.de
3 years agotracing: Fix checking event hash pointer logic when tp_printk is enabled
Steven Rostedt (VMware) [Mon, 19 Apr 2021 18:23:12 +0000 (14:23 -0400)]
tracing: Fix checking event hash pointer logic when tp_printk is enabled

Pointers in events that are printed are unhashed if the flags allow it,
and the logic to do so is called before processing the event output from
the raw ring buffer. In most cases, this is done when a user reads one of
the trace files.

But if tp_printk is added on the kernel command line, this logic is done
for trace events when they are triggered, and their output goes out via
printk. The unhash logic (and even the validation of the output) did not
support the tp_printk output, and would crash.

Link: https://lore.kernel.org/linux-tegra/9835d9f1-8d3a-3440-c53f-516c2606ad07@nvidia.com/
Fixes: efbbdaa22bb7 ("tracing: Show real address for trace event arguments")
Reported-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
3 years agoMerge tag 'gvt-fixes-2021-04-20' of https://github.com/intel/gvt-linux into drm-intel...
Rodrigo Vivi [Tue, 20 Apr 2021 13:41:32 +0000 (09:41 -0400)]
Merge tag 'gvt-fixes-2021-04-20' of https://github.com/intel/gvt-linux into drm-intel-fixes

gvt-fixes-2021-04-20

- Fix cmd parser regression on BDW (Zhenyu)

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
From: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210420023312.GL1551@zhen-hp.sh.intel.com
3 years agoperf tools: Add a build-test variant to use in builds from a tarball
Arnaldo Carvalho de Melo [Fri, 16 Apr 2021 00:15:34 +0000 (21:15 -0300)]
perf tools: Add a build-test variant to use in builds from a tarball

To use in automated tests inside containers from a tarball generated
by 'make perf-tar-src-pkg*', where testing building from a tarball
is obviously not needed, so add a 'build-test-tarball' for that case.

And don't build with gtk2 as this complicates things for cross builds
where we don't always have all the libraries a full perf build requires
available for the target arch, ditto for static builds.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf data: Fix error return code in perf_data__create_dir()
Zhen Lei [Thu, 15 Apr 2021 08:34:16 +0000 (16:34 +0800)]
perf data: Fix error return code in perf_data__create_dir()

Although 'ret' has been initialized to -1, but it will be reassigned by
the "ret = open(...)" statement in the for loop. So that, the value of
'ret' is unknown when asprintf() failed.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210415083417.3740-1-thunder.leizhen@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf arm64: Fix off-by-one directory paths.
Ian Rogers [Fri, 16 Apr 2021 21:41:13 +0000 (14:41 -0700)]
perf arm64: Fix off-by-one directory paths.

Relative path include works in the regular build due to -I paths but may
break in other situations.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20210416214113.552252-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf annotate: Add line number like in TUI and source location at EOL
Martin Liška [Sun, 21 Feb 2021 12:46:36 +0000 (13:46 +0100)]
perf annotate: Add line number like in TUI and source location at EOL

The patch changes the output format in 2 ways:
- line number is displayed for all source lines (matching TUI mode)
- source locations for the hottest lines are printed
   at the line end in order to preserve layout

Before:

     0.00 :   405ef1: inc    %r15
          :            tmpsd * (TD + tmpsd * TDD)));
     0.01 :   405ef4: vfmadd213sd 0x2b9b3(%rip),%xmm0,%xmm3        # 4318b0 <_IO_stdin_used+0x8b0>
          :            tmpsd * (TC +
  eff.c:1811    0.67 :   405efd: vfmadd213sd 0x2b9b2(%rip),%xmm0,%xmm3        # 4318b8 <_IO_stdin_used+0x8b8>
          :            TA + tmpsd * (TB +
     0.35 :   405f06: vfmadd213sd 0x2b9b1(%rip),%xmm0,%xmm3        # 4318c0 <_IO_stdin_used+0x8c0>
          :            dumbo =
  eff.c:1809    1.41 :   405f0f: vfmadd213sd 0x2b9b0(%rip),%xmm0,%xmm3        # 4318c8 <_IO_stdin_used+0x8c8>
          :            sumi -= sj * tmpsd * dij2i * dumbo;
  eff.c:1813    2.58 :   405f18: vmulsd %xmm3,%xmm0,%xmm0
     2.81 :   405f1c: vfnmadd213sd 0x30(%rsp),%xmm1,%xmm0
     3.78 :   405f23: vmovsd %xmm0,0x30(%rsp)
          :            for (k = 0; k < lpears[i] + upears[i]; k++) {
  eff.c:1761    0.90 :   405f29: cmp    %r15d,%r12d

After:

     0.00 :   405ef1: inc    %r15
          : 1812   tmpsd * (TD + tmpsd * TDD)));
     0.01 :   405ef4: vfmadd213sd 0x2b9b3(%rip),%xmm0,%xmm3        # 4318b0 <_IO_stdin_used+0x8b0>
          : 1811   tmpsd * (TC +
     0.67 :   405efd: vfmadd213sd 0x2b9b2(%rip),%xmm0,%xmm3        # 4318b8 <_IO_stdin_used+0x8b8> // eff.c:1811
          : 1810   TA + tmpsd * (TB +
     0.35 :   405f06: vfmadd213sd 0x2b9b1(%rip),%xmm0,%xmm3        # 4318c0 <_IO_stdin_used+0x8c0>
          : 1809   dumbo =
     1.41 :   405f0f: vfmadd213sd 0x2b9b0(%rip),%xmm0,%xmm3        # 4318c8 <_IO_stdin_used+0x8c8> // eff.c:1809
          : 1813   sumi -= sj * tmpsd * dij2i * dumbo;
     2.58 :   405f18: vmulsd %xmm3,%xmm0,%xmm0 // eff.c:1813
     2.81 :   405f1c: vfnmadd213sd 0x30(%rsp),%xmm1,%xmm0
     3.78 :   405f23: vmovsd %xmm0,0x30(%rsp)
          : 1761   for (k = 0; k < lpears[i] + upears[i]; k++) {

Where e.g. '// eff.c:1811' shares the same color as the percentantage
at the line beginning.

Signed-off-by: Martin Liška <mliska@suse.cz>
Link: http://lore.kernel.org/lkml/a0d53f31-f633-5013-c386-a4452391b081@suse.cz
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf: Update .gitignore file
Alexander Antonov [Mon, 19 Apr 2021 09:41:47 +0000 (12:41 +0300)]
perf: Update .gitignore file

After a "make -C tools/perf", git reports the following untracked file:
perf-iostat

Add this generated file to perf's .gitignore file.

Signed-off-by: Alexander Antonov <alexander.antonov@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey V Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210419094147.15909-5-alexander.antonov@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf stat: Enable iostat mode for x86 platforms
Alexander Antonov [Mon, 19 Apr 2021 09:41:46 +0000 (12:41 +0300)]
perf stat: Enable iostat mode for x86 platforms

This functionality is based on recently introduced sysfs attributes for
Intel® Xeon® Scalable processor family (code name Skylake-SP):

Commit bb42b3d39781d7fc ("perf/x86/intel/uncore: Expose an Uncore unit to IIO PMON mapping")

Mode is intended to provide four I/O performance metrics in MB per each
PCIe root port:

 - Inbound Read: I/O devices below root port read from the host memory
 - Inbound Write: I/O devices below root port write to the host memory
 - Outbound Read: CPU reads from I/O devices below root port
 - Outbound Write: CPU writes to I/O devices below root port

Each metric requiries only one uncore event which increments at every 4B
transfer in corresponding direction. The formulas to compute metrics
are generic:
    #EventCount * 4B / (1024 * 1024)

Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey V Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210419094147.15909-4-alexander.antonov@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf stat: Helper functions for PCIe root ports list in iostat mode
Alexander Antonov [Mon, 19 Apr 2021 09:41:45 +0000 (12:41 +0300)]
perf stat: Helper functions for PCIe root ports list in iostat mode

Introduce helper functions to control PCIe root ports list.
These helpers will be used in the follow-up patch.

Signed-off-by: Alexander Antonov <alexander.antonov@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey V Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210419094147.15909-3-alexander.antonov@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf stat: Basic support for iostat in perf
Alexander Antonov [Mon, 19 Apr 2021 09:41:44 +0000 (12:41 +0300)]
perf stat: Basic support for iostat in perf

Add basic flow for a new iostat mode in perf. Mode is intended to
provide four I/O performance metrics per each PCIe root port: Inbound Read,
Inbound Write, Outbound Read, Outbound Write.

The actual code to compute the metrics and attribute it to
root port is in follow-on patches.

Signed-off-by: Alexander Antonov <alexander.antonov@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey V Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210419094147.15909-2-alexander.antonov@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 years agoperf vendor events: Initial JSON/events list for power10 platform
Kajol Jain [Mon, 19 Apr 2021 11:20:01 +0000 (16:50 +0530)]
perf vendor events: Initial JSON/events list for power10 platform

Patch adds initial JSON/events for POWER10.

Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Reviewed-by: Paul Clarke <pc@us.ibm.com>
Tested-by: Paul Clarke <pc@us.ibm.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lore.kernel.org/lkml/20210419112001.71466-1-kjain@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>