platform/kernel/linux-rpi.git
19 months agoperf test: Fix wrong size expectation for 'Setup struct perf_event_attr'
Thomas Richter [Wed, 22 Mar 2023 09:47:31 +0000 (10:47 +0100)]
perf test: Fix wrong size expectation for 'Setup struct perf_event_attr'

The test case "perf test 'Setup struct perf_event_attr'" is failing.

On s390 this output is observed:

 # ./perf test -Fvvvv 17
 17: Setup struct perf_event_attr                                    :
 --- start ---
 running './tests/attr/test-stat-C0'
 Using CPUID IBM,8561,703,T01,3.6,002f
 .....
 Event event:base-stat
      fd = 1
      group_fd = -1
      flags = 0|8
      cpu = *
      type = 0
      size = 128     <<<--- wrong, specified in file base-stat
      config = 0
      sample_period = 0
      sample_type = 65536
      ...
 'PERF_TEST_ATTR=/tmp/tmpgw574wvg ./perf stat -o \
/tmp/tmpgw574wvg/perf.data -e cycles -C 0 kill >/dev/null \
2>&1 ret '1', expected '1'
  loading result events
    Event event-0-0-4
      fd = 4
      group_fd = -1
      cpu = 0
      pid = -1
      flags = 8
      type = 0
      size = 136     <<<--- actual size used in system call
      .....
  compare
    matching [event-0-0-4]
      to [event:base-stat]
      [cpu] 0 *
      [flags] 8 0|8
      [type] 0 0
      [size] 136 128
    ->FAIL
    match: [event-0-0-4] matches []
  expected size=136, got 128
  FAILED './tests/attr/test-stat-C0' - match failure

This mismatch is caused by
commit 09519ec3b19e ("perf: Add perf_event_attr::config3")
which enlarges the structure perf_event_attr by 8 bytes.

Fix this by adjusting the expected value of size.

Output after:
 # ./perf test -Fvvvv 17
 17: Setup struct perf_event_attr                                    :
 --- start ---
 running './tests/attr/test-stat-C0'
 Using CPUID IBM,8561,703,T01,3.6,002f
 ...
  matched
  compare
    matching [event-0-0-4]
      to [event:base-stat]
      [cpu] 0 *
      [flags] 8 0|8
      [type] 0 0
      [size] 136 136
      ....
   ->OK
   match: [event-0-0-4] matches ['event:base-stat']
 matched

Fixes: 09519ec3b19e4144 ("perf: Add perf_event_attr::config3")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230322094731.1768281-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf build: Add warning for when vmlinux.h generation fails
Ian Rogers [Wed, 22 Mar 2023 18:31:08 +0000 (11:31 -0700)]
perf build: Add warning for when vmlinux.h generation fails

The warning advises on the NO_BPF_SKEL=1 option.

Suggested-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230322183108.1380882-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf report: Append inlines to non-DWARF callchains
Artem Savkov [Thu, 16 Mar 2023 13:35:57 +0000 (14:35 +0100)]
perf report: Append inlines to non-DWARF callchains

Append information about inlined functions to FP and LBR callchains from
DWARF debuginfo when available. Do so by calling append_inlines() from
add_callchain_ip().

Testing it:

Frame-pointer mode recorded with 'perf record --call-graph=fp --freq=max -- ./a.out'

  #include <stdio.h>
  #include <stdint.h>

  static __attribute__((noinline)) uint32_t func5(uint32_t i)
  {
          return i + 10;
  }

  static uint32_t func4(uint32_t i)
  {
          return func5(i + 5);
  }

  static inline uint32_t func3(uint32_t i)
  {
          return func4(i + 4);
  }

  static __attribute__((noinline)) uint32_t func2(uint32_t i)
  {
          return func3(i + 3);
  }

  static uint32_t func1(uint32_t i)
  {
          return func2(i + 2);
  }

  __attribute__((noinline)) uint64_t entry(void)
  {
          uint64_t ret = 0;
          uint32_t i = 0;
          for (i = 0; i < 1000000; i++) {
                  ret += func1(i);
                  ret -= func2(i);
                  ret += func3(i);
                  ret += func4(i);
                  ret -= func5(i);
          }
          return ret;
  }

  int main(int argc, char **argv)
  {
          printf("%s\n", __func__);
          return entry();
  }
  ======

Here is the output I get with '--call-graph callee --no-children'

  ======
  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 250  of event 'cycles:u'
  # Event count (approx.): 26819859
  #
  # Overhead  Command  Shared Object         Symbol
  # ........  .......  ....................  .....................................
  #
      43.58%  a.out    a.out                 [.] func5
              |
              |--28.93%--entry
              |          main
              |          __libc_start_call_main
              |
               --14.65%--func4 (inlined)
                         |
                         |--10.45%--entry
                         |          main
                         |          __libc_start_call_main
                         |
                          --4.20%--func3 (inlined)
                                    entry
                                    main
                                    __libc_start_call_main

      38.80%  a.out    a.out                 [.] entry
              |
              |--23.27%--func4 (inlined)
              |          |
              |          |--20.28%--func3 (inlined)
              |          |          func2
              |          |          main
              |          |          __libc_start_call_main
              |          |
              |           --2.99%--entry
              |                     main
              |                     __libc_start_call_main
              |
              |--8.17%--func5
              |          main
              |          __libc_start_call_main
              |
              |--3.89%--func1 (inlined)
              |          entry
              |          main
              |          __libc_start_call_main
              |
               --3.48%--entry
                         main
                         __libc_start_call_main

      13.07%  a.out    a.out                 [.] func2
              |
              ---func5
                 main
                 __libc_start_call_main

       1.54%  a.out    [unknown]             [k] 0xffffffff81e011b7
       1.16%  a.out    [unknown]             [k] 0xffffffff81e00193
              |
               --0.57%--__mmap64 (inlined)
                         __mmap64 (inlined)

       0.34%  a.out    ld-linux-x86-64.so.2  [.] __tunable_get_val
       0.34%  a.out    ld-linux-x86-64.so.2  [.] strcmp
       0.32%  a.out    libc.so.6             [.] strchr
       0.31%  a.out    ld-linux-x86-64.so.2  [.] _dl_relocate_object
       0.22%  a.out    ld-linux-x86-64.so.2  [.] _dl_init_paths
       0.18%  a.out    ld-linux-x86-64.so.2  [.] get_common_cache_info.constprop.0
       0.14%  a.out    ld-linux-x86-64.so.2  [.] __GI___tunables_init

  #
  # (Tip: Show individual samples with: perf script)
  #
  ======

  It does not seem to be out of order, or at least it is consistent with
  what I get with dwarf unwinders.

Committer notes:

Adrian Hunter pointed out that this breaks --branch-history, so don't do
it for branches, see the second Link below.

Suggested-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Signed-off-by: <asavkov@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230316133557.868731-2-asavkov@redhat.com
Link: https://lore.kernel.org/r/54129783-2960-84e1-05e9-97ac70ffb432@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf tools: Add support for perf_event_attr::config3
Rob Herring [Fri, 17 Feb 2023 22:32:11 +0000 (16:32 -0600)]
perf tools: Add support for perf_event_attr::config3

perf_event_attr has gained a new field, config3, so add support for it
extending the existing configN support.

Signed-off-by: Rob Herring <robh@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20220914-arm-perf-tool-spe1-2-v2-v5-2-2cf5210b2f77@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf vendor events arm64: Add N1 metrics
James Clark [Mon, 20 Mar 2023 11:45:59 +0000 (11:45 +0000)]
perf vendor events arm64: Add N1 metrics

Generated from the telemetry solution repo[1] with this command:

  ./generate.py <linux-repo>/tools/perf/ --telemetry-files \
    ../../data/pmu/cpu/neoverse/neoverse-n1.json

Since this data source now includes the SPE events for N1, it has
diverged from A76 which means the folder has to be split.

The new data also uses more fine grained grouping, but this will be
consistent for all future products. Long PublicDescriptions are now
included even for common events because this can include product
specific details. For non verbose mode the common BriefDescriptions
remain the same.

[1]: https://gitlab.arm.com/telemetry-solution/telemetry-solution

Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: renyu.zj@linux.alibaba.com
Link: https://lore.kernel.org/r/20230320114601.524958-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf jevents: Sort list of input files
Bernhard M. Wiedemann [Tue, 21 Mar 2023 06:30:32 +0000 (07:30 +0100)]
perf jevents: Sort list of input files

Without this, pmu-events.c would be generated with variations in
ordering depending on non-deterministic filesystem readdir order.

I tested that pmu-events.c still has the same number of lines and that
perf list output works.

This patch was done while working on reproducible builds for openSUSE,
but also solves issues in Debian [1] and other distributions.

[1] https://tests.reproducible-builds.org/debian/rb-pkg/unstable/i386/linux.html

Signed-off-by: Bernhard M. Wiedemann <bwiedemann@suse.de>
Cc: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20230321063032.19804-1-bwiedemann@suse.de
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf kvm: Delete histograms entries before exiting
Leo Yan [Mon, 20 Mar 2023 06:16:19 +0000 (14:16 +0800)]
perf kvm: Delete histograms entries before exiting

It's good not to release resources for a program when kernel cleans up
memory space, this patch explicitly releases histograms entries with
hists__delete_entries().

Committer notice:

This helps with memory leak checkers, but may delay exiting a tool by
doing needless linked list traversals freeing lots of objects.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230320061619.29520-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf kvm: Reference count 'struct kvm_info'
Leo Yan [Mon, 20 Mar 2023 06:16:18 +0000 (14:16 +0800)]
perf kvm: Reference count 'struct kvm_info'

hists__add_entry_ops() doesn't allocate a new histogram entry if it has
an existing entry for a KVM event, in this case, find_create_kvm_event()
allocates a 'struct kvm_info' but it's not used by any histograms and
never freed.

To fix the memory leak, this patch first introduces a refcnt and a set
of functions for refcnt operations on 'struct kvm_info'.  When the data
structure is not anymore used (the refcnt hits zero) kvm_info__zput()
will free the memory used.

Committer:

Provide a nop version of kvm_info__zput() to be used when
HAVE_KVM_STAT_SUPPORT isn't defined as it is used unconditionally in
hists__findnew_entry() and hist_entry__delete().

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230320061619.29520-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf report: Add 'simd' sort field
German Gomez [Mon, 20 Mar 2023 15:15:08 +0000 (15:15 +0000)]
perf report: Add 'simd' sort field

Add 'simd' sort field to visualize SIMD ops in 'perf report'.

Rows are labeled with the SIMD ISA, and the type of predicate (if any):

  - [p] partial predicate
  - [e] empty predicate (no elements in the vector being used)

Example with Arm SPE and SVE (Scalable Vector Extension):

  #include <arm_sve.h>

  double src[1025], dst[1025];

  int main(void) {
    svfloat64_t vc = svdup_f64(1);
    for(;;)
      for(int i = 0; i < 1025; i += svcntd())
      {
        svbool_t pg = svwhilelt_b64(i, 1025);
        svfloat64_t vsrc = svld1(pg, &src[i]);
        svfloat64_t vdst = svadd_x(pg, vsrc, vc);
        svst1(pg, &dst[i], vdst);
      }
    return 0;
  }

  ... compiled using "gcc-11 -march=armv8-a+sve -O3"

Profiling on a platform that implements FEAT_SVE and FEAT_SPEv1p1:

  $ perf record -e arm_spe_0// -- ./a.out
  $ perf report --itrace=i1i -s overhead,pid,simd,sym

  Overhead      Pid:Command   Simd     Symbol
  ........  ................  .......  ......................

    53.76%    10758:program            [.] main
    46.14%    10758:program   [.] SVE  [.] main
     0.09%    10758:program   [p] SVE  [.] main

The report shows 0.09% of the sampled SVE operations use partial
predicates due to src and dst arrays not being multiples of the vector
register lengths.

Signed-off-by: German Gomez <german.gomez@arm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman.Khandual@arm.com
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230320151509.1137462-2-james.clark@arm.com
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf arm-spe: Add SVE flags to the SPE samples
German Gomez [Mon, 20 Mar 2023 15:15:07 +0000 (15:15 +0000)]
perf arm-spe: Add SVE flags to the SPE samples

Add flags from the Scalable Vector Extension (SVE) to the SPE samples
which are available from Armv8.3 (FEAT_SPEv1p1).

These will be displayed in a new SIMD sort field in a later commit.

Signed-off-by: German Gomez <german.gomez@arm.com>
Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20230320151509.1137462-2-james.clark@arm.com
Cc: Anshuman.Khandual@arm.com
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: John Garry <john.g.garry@oracle.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-perf-users@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf arm-spe: Refactor arm-spe to support operation packet type
German Gomez [Mon, 20 Mar 2023 15:15:06 +0000 (15:15 +0000)]
perf arm-spe: Refactor arm-spe to support operation packet type

Extend the decoder of Arm SPE records to support more fields from the
operation packet type.

Not all fields are being decoded by this commit. Only those needed to
support the use-case SVE load/store/other operations.

Suggested-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: German Gomez <german.gomez@arm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman.Khandual@arm.com
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230320151509.1137462-2-james.clark@arm.com
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf event: Add 'simd_flags' field to 'struct perf_sample'
German Gomez [Mon, 20 Mar 2023 15:15:05 +0000 (15:15 +0000)]
perf event: Add 'simd_flags' field to 'struct perf_sample'

Add new field to 'struct perf_sample' to store flags related to SIMD
ops.

It will be used to store SIMD information from SVE and NEON when
profiling using ARM SPE.

Signed-off-by: German Gomez <german.gomez@arm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman.Khandual@arm.com
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230320151509.1137462-2-james.clark@arm.com
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf intel-pt: Add support for new branch instructions ERETS and ERETU
Adrian Hunter [Mon, 20 Mar 2023 18:35:17 +0000 (20:35 +0200)]
perf intel-pt: Add support for new branch instructions ERETS and ERETU

Intel Flexible Return and Event Delivery (FRED) adds instructions ERETS
(return to supervisor) and ERETU (return to user). Intel PT instruction
decoder needs to know about these instructions because they are
branch instructions. Similar to IRET instructions, when the decoder
encounters one of these instructions it will match it to a TIP (target
instruction pointer) packet that informs what the branch destination is.

The existing "x86 instruction decoder - new instructions" test can be
used to test the result e.g.

  $ perf test -v ins |& grep eret
  Decoded ok: f2 0f 01 ca         erets
  Decoded ok: f3 0f 01 ca         eretu

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20230320183517.15099-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf intel-pt: Add event type names UINTR and UIRET
Adrian Hunter [Mon, 20 Mar 2023 18:35:16 +0000 (20:35 +0200)]
perf intel-pt: Add event type names UINTR and UIRET

UINTR and UIRET are listed in table 32-50 "CFE Packet Type and Vector
Fields Details" in the Intel Processor Trace chapter of The Intel SDM
Volume 3 version 078.

The codes are for "User interrupt delivered" and "Exiting from user
interrupt routine" respectively.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20230320183517.15099-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf symbol: Sort names under write lock
Ian Rogers [Mon, 20 Mar 2023 03:37:53 +0000 (20:37 -0700)]
perf symbol: Sort names under write lock

If finding a name doesn't find the sorted names then they are
allocated and sorted. This shouldn't be done under a read lock as
another reader may access it. Release the read lock and acquire the
write lock, then release the write lock and reacquire the read lock.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: André Almeida <andrealmeid@collabora.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miaoqian Lin <linmq006@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: https://lore.kernel.org/r/20230320033810.980165-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf test: Fix memory leak in symbols
Ian Rogers [Mon, 20 Mar 2023 03:37:52 +0000 (20:37 -0700)]
perf test: Fix memory leak in symbols

machine__delete() doesn't delete threads. Add call to delete threads
ahead of deleting the machine.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: André Almeida <andrealmeid@collabora.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miaoqian Lin <linmq006@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: https://lore.kernel.org/r/20230320033810.980165-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf tests: Add common error route for code-reading
Ian Rogers [Mon, 20 Mar 2023 03:37:51 +0000 (20:37 -0700)]
perf tests: Add common error route for code-reading

A later change will enforce that the map is put on this path
regardless of success or error.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: André Almeida <andrealmeid@collabora.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miaoqian Lin <linmq006@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: https://lore.kernel.org/r/20230320033810.980165-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf bpf_counter: Use public cpumap accessors
Ian Rogers [Mon, 20 Mar 2023 03:37:50 +0000 (20:37 -0700)]
perf bpf_counter: Use public cpumap accessors

Avoid the use of internal apis via the cpumap accessor functions.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: André Almeida <andrealmeid@collabora.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miaoqian Lin <linmq006@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: https://lore.kernel.org/r/20230320033810.980165-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf symbol: Avoid memory leak from abi::__cxa_demangle
Ian Rogers [Mon, 20 Mar 2023 03:37:49 +0000 (20:37 -0700)]
perf symbol: Avoid memory leak from abi::__cxa_demangle

Rather than allocate memory, allow abi::__cxa_demangle to do
that. This avoids a problem where on error NULL was returned
triggering a memory leak.

Fixes: 3b4e4efe88f615f1 ("perf symbol: Add abi::__cxa_demangle C++ demangling support")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: André Almeida <andrealmeid@collabora.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miaoqian Lin <linmq006@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: https://lore.kernel.org/r/20230320033810.980165-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf kvm: Update documentation to reflect new changes
Leo Yan [Wed, 15 Mar 2023 14:51:12 +0000 (22:51 +0800)]
perf kvm: Update documentation to reflect new changes

Update documentation for new sorting and option '--stdio'.

Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf kvm: Add TUI mode for stat report
Leo Yan [Wed, 15 Mar 2023 14:51:11 +0000 (22:51 +0800)]
perf kvm: Add TUI mode for stat report

Since we have supported histograms list and prepared the dimensions in
the tool, this patch adds TUI mode for stat report.  It also adds UI
progress for sorting for better user experience.

Committer notes:

kvm_display() is only used by functions enclosed in:

  #if defined(HAVE_KVM_STAT_SUPPORT) && defined(HAVE_LIBTRACEEVENT)

So do it with this new function as well.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf kvm: Add dimensions for percentages
Leo Yan [Wed, 15 Mar 2023 14:51:10 +0000 (22:51 +0800)]
perf kvm: Add dimensions for percentages

Add dimensions for count and time percentages, it would be useful for
user to review percentage statistics.

Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf kvm: Support printing attributions for dimensions
Leo Yan [Wed, 15 Mar 2023 14:51:09 +0000 (22:51 +0800)]
perf kvm: Support printing attributions for dimensions

This patch adds header, entry callback and width for every dimension,
thus in TUI mode the tool can print items with the defined attributions.

Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf kvm: Polish sorting key
Leo Yan [Wed, 15 Mar 2023 14:51:08 +0000 (22:51 +0800)]
perf kvm: Polish sorting key

Since histograms supports sorting, the tool doesn't need to maintain the
mapping between the sorting keys and the corresponding comparison
callbacks, therefore, this patch removes structure kvm_event_key.

But we still need to validate the sorting key, this patch uses an array
for sorting keys and renames function select_key() to is_valid_key()
to validate the sorting key passed by user.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf kvm: Use histograms list to replace cached list
Leo Yan [Wed, 15 Mar 2023 14:51:07 +0000 (22:51 +0800)]
perf kvm: Use histograms list to replace cached list

perf kvm tool defines its own cached list which is managed with RB tree,
histograms also provide RB tree to manage data entries.  Since now we
have introduced histograms in the tool, it's not necessary to use the
self defined list and we can directly use histograms list to manage
KVM events.

This patch changes to use histograms list to track KVM events, and it
invokes the common function hists__output_resort_cb() to sort result,
this also give us flexibility to extend more sorting key words easily.

After histograms list supported, the cached list is redundant so remove
the relevant code for it.

Committer notes:

kvm_hists__reinit() is only used by functions enclosed in:

  #if defined(HAVE_KVM_STAT_SUPPORT) && defined(HAVE_LIBTRACEEVENT)

So do it with this new function as well.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf kvm: Add dimensions for KVM event statistics
Leo Yan [Wed, 15 Mar 2023 14:51:06 +0000 (22:51 +0800)]
perf kvm: Add dimensions for KVM event statistics

To support KVM event statistics, this patch firstly registers histograms
columns and sorting fields; every column or field has its own format
structure, the format structure is dereferenced to access the dimension,
finally the dimension provides the comparison callback for sorting
result.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf hist: Add 'kvm_info' field in histograms entry
Leo Yan [Wed, 15 Mar 2023 14:51:05 +0000 (22:51 +0800)]
perf hist: Add 'kvm_info' field in histograms entry

__hists__add_entry() creates a temporary entry and compare it with
existed histograms entries, if any existed entry equals to the
temporary entry it skips to allocation to avoid duplication.

The problem for support KVM event in histograms is it doesn't contain
any info to identify KVM event and can be used for comparison entries.

This patch adds 'kvm_info' field in the histograms entry which contains
the KVM event's key, this identifier will be used for comparison
histograms entries in later change.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf kvm: Parse address location for samples
Leo Yan [Wed, 15 Mar 2023 14:51:04 +0000 (22:51 +0800)]
perf kvm: Parse address location for samples

Parse address location for samples and save it into the structure
'perf_kvm_stat', it is to be used by histograms entry.

Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf kvm: Pass argument 'sample' to kvm_alloc_init_event()
Leo Yan [Wed, 15 Mar 2023 14:51:03 +0000 (22:51 +0800)]
perf kvm: Pass argument 'sample' to kvm_alloc_init_event()

This patch adds an argument 'sample' for kvm_alloc_init_event(), and its
caller functions are updated as well for passing down the 'sample'
pointer.

This is a preparation change to allow later patch to create histograms
entries for kvm event, no any functionality changes.

Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf kvm: Introduce histograms data structures
Leo Yan [Wed, 15 Mar 2023 14:51:02 +0000 (22:51 +0800)]
perf kvm: Introduce histograms data structures

This is a preparation to support histograms in perf kvm tool.  As first
step, this patch defines histograms data structures and initialize them.

Committer notes:

Those are only used by functions enclosed in:

  #if efined(HAVE_KVM_STAT_SUPPORT) && defined(HAVE_LIBTRACEEVENT)

So do this for these new functions and struct as well.

Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf kvm: Use macro to replace variable 'decode_str_len'
Leo Yan [Wed, 15 Mar 2023 14:51:01 +0000 (22:51 +0800)]
perf kvm: Use macro to replace variable 'decode_str_len'

The variable 'decode_str_len' defines the string length for KVM event
name and every arch defines its own values.

This introduces complexity that the variable definition are spreading in
multiple source files under arch folder.  This patch refactors code to
use a macro KVM_EVENT_NAME_LEN to define event name length and thus
remove the definitions in arch files.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf kvm: Use subtraction for comparison metrics
Leo Yan [Wed, 15 Mar 2023 14:51:00 +0000 (22:51 +0800)]
perf kvm: Use subtraction for comparison metrics

Currently the metrics comparison uses greater operator (>), it returns
the boolean value (0 or 1).

This patch changes to use subtraction as comparison result, which can
be used by histograms sorting.  Since the subtraction result is u64
type, we change key_cmp_fun's return type to int64_t to avoid overflow.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf kvm: Move up metrics helpers
Leo Yan [Wed, 15 Mar 2023 14:50:59 +0000 (22:50 +0800)]
perf kvm: Move up metrics helpers

This patch moves up the helper functions of event's metrics for later
adding code to call them.

No any functionality changes, but has a function renaming from
compare_kvm_event_{metric}() to cmp_event_{metric}().

Committer notes:

Those helper functions are only used if this is true:

  if defined(HAVE_KVM_STAT_SUPPORT) && defined(HAVE_LIBTRACEEVENT)

So keep them enclosed with that.

Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf kvm: Add pointer to 'perf_kvm_stat' in kvm event
Leo Yan [Wed, 15 Mar 2023 14:50:58 +0000 (22:50 +0800)]
perf kvm: Add pointer to 'perf_kvm_stat' in kvm event

Sometimes, handling kvm events needs to base on global variables, e.g.
when read event counts we need to know the target vcpu ID; the global
variables are stored in structure perf_kvm_stat.

This patch adds add a 'perf_kvm_stat' pointer in kvm event structure,
it is to be used by later refactoring.

Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf kvm: Refactor overall statistics
Leo Yan [Wed, 15 Mar 2023 14:50:57 +0000 (22:50 +0800)]
perf kvm: Refactor overall statistics

Currently the tool computes overall statistics when sort the results.
This patch refactors overall statistics during events processing,
therefore, the function update_total_coun() is not needed anymore, an
extra benefit is we can de-couple code between the statistics and the
sorting.

This patch is not expected any functionality changes.

Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230315145112.186603-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf record: Update documentation for BPF filters
Namhyung Kim [Tue, 14 Mar 2023 23:42:37 +0000 (16:42 -0700)]
perf record: Update documentation for BPF filters

Add more description and examples.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230314234237.3008956-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf bpf filter: Show warning for missing sample flags
Namhyung Kim [Tue, 14 Mar 2023 23:42:36 +0000 (16:42 -0700)]
perf bpf filter: Show warning for missing sample flags

For a BPF filter to work properly, users need to provide appropriate
options to enable the sample types.  Otherwise the BPF program would
see an invalid value (i.e. always 0) and filter won't work well.

Show a warning message if sample types are missing like below.

  $ sudo ./perf record -e cycles --filter 'addr < 100' true
  Error: cycles event does not have PERF_SAMPLE_ADDR
   Hint: please add -d option to perf record.
  failed to set filter "BPF" on event cycles with 22 (Invalid argument)

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230314234237.3008956-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf bpf filter: Add logical OR operator
Namhyung Kim [Tue, 14 Mar 2023 23:42:35 +0000 (16:42 -0700)]
perf bpf filter: Add logical OR operator

It supports two or more expressions connected as a group and the group
result is considered true when one of them returns true.  The new group
operators (GROUP_BEGIN and GROUP_END) are added to setup and check the
condition.  As it doesn't allow nested groups, the condition is saved
in local variables.

For example, the following is to get samples only if the data source
memory level is L2 cache or the weight value is greater than 30.

  $ sudo ./perf record -adW -e cpu/mem-loads/pp \
  > --filter 'mem_lvl == l2 || weight > 30' -- sleep 1

  $ sudo ./perf script -F data_src,weight
     10668100842 |OP LOAD|LVL L3 or L3 hit|SNP None|TLB L1 or L2 hit|LCK No|BLK  N/A     47
     11868100242 |OP LOAD|LVL LFB/MAB or LFB/MAB hit|SNP None|TLB L1 or L2 hit|LCK No|BLK  N/A      57
     10668100842 |OP LOAD|LVL L3 or L3 hit|SNP None|TLB L1 or L2 hit|LCK No|BLK  N/A                56
     10650100842 |OP LOAD|LVL L3 or L3 hit|SNP None|TLB L2 miss|LCK No|BLK  N/A                    144
     10468100442 |OP LOAD|LVL L2 or L2 hit|SNP None|TLB L1 or L2 hit|LCK No|BLK  N/A                16
     10468100442 |OP LOAD|LVL L2 or L2 hit|SNP None|TLB L1 or L2 hit|LCK No|BLK  N/A                20
     11868100242 |OP LOAD|LVL LFB/MAB or LFB/MAB hit|SNP None|TLB L1 or L2 hit|LCK No|BLK  N/A     189
     1026a100142 |OP LOAD|LVL L1 or L1 hit|SNP None|TLB L1 or L2 hit|LCK Yes|BLK  N/A              193
     10468100442 |OP LOAD|LVL L2 or L2 hit|SNP None|TLB L1 or L2 hit|LCK No|BLK  N/A                18
     ...

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230314234237.3008956-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf bpf filter: Add data_src sample data support
Namhyung Kim [Tue, 14 Mar 2023 23:42:34 +0000 (16:42 -0700)]
perf bpf filter: Add data_src sample data support

The data_src has many entries to express memory behaviors.  Add each
term separately so that users can combine them for their purpose.

I didn't add prefix for the constants for simplicity as they are mostly
distinguishable but I had to use l1_miss and l2_hit for mem_dtlb since
mem_lvl has different values for the same names.  Note that I decided
mem_lvl to be used as an alias of mem_lvlnum as it's deprecated now.
According to the comment in the UAPI header, users should use the mix of
mem_lvlnum, mem_remote and mem_snoop.  Also the SNOOPX bits are
concatenated to mem_snoop for simplicity.

The following terms are used for data_src and the corresponding perf
sample data fields:

 * mem_op : { load, store, pfetch, exec }
 * mem_lvl: { l1, l2, l3, l4, cxl, io, any_cache, lfb, ram, pmem }
 * mem_snoop: { none, hit, miss, hitm, fwd, peer }
 * mem_remote: { remote }
 * mem_lock: { locked }
 * mem_dtlb { l1_hit, l1_miss, l2_hit, l2_miss, any_hit, any_miss, walk, fault }
 * mem_blk { by_data, by_addr }
 * mem_hops { hops0, hops1, hops2, hops3 }

We can now use a filter expression like below:

  'mem_op == load, mem_lvl <= l2, mem_dtlb == l1_hit'
  'mem_dtlb == l2_miss, mem_hops > hops1'
  'mem_lvl == ram, mem_remote == 1'

Note that 'na' is shared among the terms as it has the same value except
for mem_lvl.  I don't have a good idea to handle that for now.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230314234237.3008956-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf bpf filter: Add more weight sample data support
Namhyung Kim [Tue, 14 Mar 2023 23:42:33 +0000 (16:42 -0700)]
perf bpf filter: Add more weight sample data support

The weight data consists of a couple of fields with the
PERF_SAMPLE_WEIGHT_STRUCT.  Add weight{1,2,3} term to select them
separately.  Also add their aliases like 'ins_lat', 'p_stage_cyc' and
'retire_lat'.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230314234237.3008956-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf bpf filter: Add 'pid' sample data support
Namhyung Kim [Tue, 14 Mar 2023 23:42:32 +0000 (16:42 -0700)]
perf bpf filter: Add 'pid' sample data support

The pid is special because it's saved in the PERF_SAMPLE_TID together.
So it needs to differenciate tid and pid using the 'part' field in the
perf bpf filter entry struct.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230314234237.3008956-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf record: Record dropped sample count
Namhyung Kim [Tue, 14 Mar 2023 23:42:31 +0000 (16:42 -0700)]
perf record: Record dropped sample count

When it uses bpf filters, event might drop some samples.  It'd be nice
if it can report how many samples it lost.  As LOST_SAMPLES event can
carry the similar information, let's use it for bpf filters.

To indicate it's from BPF filters, add a new misc flag for that and
do not display cpu load warnings.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230314234237.3008956-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf record: Add BPF event filter support
Namhyung Kim [Tue, 14 Mar 2023 23:42:30 +0000 (16:42 -0700)]
perf record: Add BPF event filter support

Use --filter option to set BPF filter for generic events other than the
tracepoints or Intel PT.  The BPF program will check the sample data and
filter according to the expression.

For example, the below is the typical perf record for frequency mode.
The sample period started from 1 and increased gradually.

  $ sudo ./perf record -e cycles true
  $ sudo ./perf script
       perf-exec 2272336 546683.916875:          1 cycles:  ffffffff828499b8 perf_event_exec+0x298 ([kernel.kallsyms])
       perf-exec 2272336 546683.916892:          1 cycles:  ffffffff828499b8 perf_event_exec+0x298 ([kernel.kallsyms])
       perf-exec 2272336 546683.916899:          3 cycles:  ffffffff828499b8 perf_event_exec+0x298 ([kernel.kallsyms])
       perf-exec 2272336 546683.916905:         17 cycles:  ffffffff828499b8 perf_event_exec+0x298 ([kernel.kallsyms])
       perf-exec 2272336 546683.916911:        100 cycles:  ffffffff828499b8 perf_event_exec+0x298 ([kernel.kallsyms])
       perf-exec 2272336 546683.916917:        589 cycles:  ffffffff828499b8 perf_event_exec+0x298 ([kernel.kallsyms])
       perf-exec 2272336 546683.916924:       3470 cycles:  ffffffff828499b8 perf_event_exec+0x298 ([kernel.kallsyms])
       perf-exec 2272336 546683.916930:      20465 cycles:  ffffffff828499b8 perf_event_exec+0x298 ([kernel.kallsyms])
            true 2272336 546683.916940:     119873 cycles:  ffffffff8283afdd perf_iterate_ctx+0x2d ([kernel.kallsyms])
            true 2272336 546683.917003:     461349 cycles:  ffffffff82892517 vma_interval_tree_insert+0x37 ([kernel.kallsyms])
            true 2272336 546683.917237:     635778 cycles:  ffffffff82a11400 security_mmap_file+0x20 ([kernel.kallsyms])

When you add a BPF filter to get samples having periods greater than 1000,
the output would look like below:

  $ sudo ./perf record -e cycles --filter 'period > 1000' true
  $ sudo ./perf script
       perf-exec 2273949 546850.708501:       5029 cycles:  ffffffff826f9e25 finish_wait+0x5 ([kernel.kallsyms])
       perf-exec 2273949 546850.708508:      32409 cycles:  ffffffff826f9e25 finish_wait+0x5 ([kernel.kallsyms])
       perf-exec 2273949 546850.708526:     143369 cycles:  ffffffff82b4cdbf xas_start+0x5f ([kernel.kallsyms])
       perf-exec 2273949 546850.708600:     372650 cycles:  ffffffff8286b8f7 __pagevec_lru_add+0x117 ([kernel.kallsyms])
       perf-exec 2273949 546850.708791:     482953 cycles:  ffffffff829190de __mod_memcg_lruvec_state+0x4e ([kernel.kallsyms])
            true 2273949 546850.709036:     501985 cycles:  ffffffff828add7c tlb_gather_mmu+0x4c ([kernel.kallsyms])
            true 2273949 546850.709292:     503065 cycles:      7f2446d97c03 _dl_map_object_deps+0x973 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)

Committer notes:

Add stubs for perf_bpf_filter__prepare() and perf_bpf_filter__destroy()
to tools/perf/util/python.c to keep it building.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230314234237.3008956-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf bpf filter: Implement event sample filtering
Namhyung Kim [Tue, 14 Mar 2023 23:42:29 +0000 (16:42 -0700)]
perf bpf filter: Implement event sample filtering

The BPF program will be attached to a perf_event and be triggered when
it overflows.  It'd iterate the filters map and compare the sample
value according to the expression.  If any of them fails, the sample
would be dropped.

Also it needs to have the corresponding sample data for the expression
so it compares data->sample_flags with the given value.  To access the
sample data, it uses the bpf_cast_to_kern_ctx() kfunc which was added
in v6.2 kernel.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230314234237.3008956-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf bpf filter: Introduce basic BPF filter expression
Namhyung Kim [Tue, 14 Mar 2023 23:42:28 +0000 (16:42 -0700)]
perf bpf filter: Introduce basic BPF filter expression

This implements a tiny parser for the filter expressions used for BPF.
Each expression will be converted to struct perf_bpf_filter_expr and
be passed to a BPF map.

For now, I'd like to start with the very basic comparisons like EQ or
GT.  The LHS should be a term for sample data and the RHS is a number.
The expressions are connected by a comma.  For example,

    period > 10000
    ip < 0x1000000000000, cpu == 3

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230314234237.3008956-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf top: Fix rare segfault in thread__comm_len()
liuwenyu [Wed, 15 Mar 2023 06:42:17 +0000 (14:42 +0800)]
perf top: Fix rare segfault in thread__comm_len()

In thread__comm_len(),strlen() is called outside of the
thread->comm_lock critical section,which may cause a UAF
problems if comm__free() is called by the process_thread
concurrently.

backtrace of the core file is as follows:

    (gdb) bt
    #0  __strlen_evex () at ../sysdeps/x86_64/multiarch/strlen-evex.S:77
    #1  0x000055ad15d31de5 in thread__comm_len (thread=0x7f627d20e300) at util/thread.c:320
    #2  0x000055ad15d4fade in hists__calc_col_len (h=0x7f627d295940, hists=0x55ad1772bfe0)
        at util/hist.c:103
    #3  hists__calc_col_len (hists=0x55ad1772bfe0, h=0x7f627d295940) at util/hist.c:79
    #4  0x000055ad15d52c8c in output_resort (hists=hists@entry=0x55ad1772bfe0, prog=0x0,
        use_callchain=false, cb=cb@entry=0x0, cb_arg=0x0) at util/hist.c:1926
    #5  0x000055ad15d530a4 in evsel__output_resort_cb (evsel=evsel@entry=0x55ad1772bde0,
        prog=prog@entry=0x0, cb=cb@entry=0x0, cb_arg=cb_arg@entry=0x0) at util/hist.c:1945
    #6  0x000055ad15d53110 in evsel__output_resort (evsel=evsel@entry=0x55ad1772bde0,
        prog=prog@entry=0x0) at util/hist.c:1950
    #7  0x000055ad15c6ae9a in perf_top__resort_hists (t=t@entry=0x7ffcd9cbf4f0) at builtin-top.c:311
    #8  0x000055ad15c6cc6d in perf_top__print_sym_table (top=0x7ffcd9cbf4f0) at builtin-top.c:346
    #9  display_thread (arg=0x7ffcd9cbf4f0) at builtin-top.c:700
    #10 0x00007f6282fab4fa in start_thread (arg=<optimized out>) at pthread_create.c:443
    #11 0x00007f628302e200 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

The reason is that strlen() get a pointer to a memory that has been freed.

The string pointer is stored in the structure comm_str, which corresponds
to a rb_tree node,when the node is erased, the memory of the string is also freed.

In thread__comm_len(),it gets the pointer within the thread->comm_lock critical section,
but passed to strlen() outside of the thread->comm_lock critical section, and the perf
process_thread may called comm__free() concurrently, cause this segfault problem.

The process is as follows:

display_thread                                  process_thread
--------------                                  --------------

thread__comm_len
  -> thread__comm_str
       # held the comm read lock
    -> __thread__comm_str(thread)
       # release the comm read lock
                                                thread__delete
                                                     # held the comm write lock
                                                  -> comm__free
                                                    -> comm_str__put(comm->comm_str)
                                                      -> zfree(&cs->str)
                                                     # release the comm write lock
      # The memory of the string pointed
        to by comm has been free.
    -> thread->comm_len = strlen(comm);

This patch expand the critical section range of thread->comm_lock in thread__comm_len(),
to make strlen() called safe.

Signed-off-by: Wenyu Liu <liuwenyu7@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Feilong Lin <linfeilong@huawei.com>
Cc: Hewenliang <hewenliang4@huawei.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Yunfeng Ye <yeyunfeng@huawei.com>
Link: https://lore.kernel.org/r/322bfb49-840b-f3b6-9ef1-f9ec3435b07e@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf script: Fix Python support when no libtraceevent
Adrian Hunter [Wed, 15 Mar 2023 08:43:21 +0000 (10:43 +0200)]
perf script: Fix Python support when no libtraceevent

Python scripting can be used without libtraceevent. In particular,
scripting for Intel PT does not use tracepoints, and so does not need
libtraceevent support.

Alter the build and employ conditional compilation to allow Python
scripting without libtraceevent.

Example:

 Before:

    $ ldd `which perf` | grep -i python
    $ ldd `which perf` | grep -i libtraceevent
    $ perf record -e intel_pt//u uname
    Linux
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.031 MB perf.data ]
    $ perf script intel-pt-events.py |& head -3
      Error: Couldn't find script `intel-pt-events.py'

     See perf script -l for available scripts.

 After:

    $ ldd `which perf` | grep -i python
            libpython3.10.so.1.0 => /lib/x86_64-linux-gnu/libpython3.10.so.1.0 (0x00007f4bac400000)
    $ ldd `which perf` | grep -i libtraceevent
    $ perf script intel-pt-events.py | head
    Intel PT Branch Trace, Power Events, Event Trace and PTWRITE
         Switch In    8021/8021  [000]     11234.097713404     0/0
           perf-exec  8021/8021  [000]     11234.098041726       psb                        offset: 0x0                0 [unknown] ([unknown])
           perf-exec  8021/8021  [000]     11234.098041726       cbr                         45  freq: 4505 MHz  (161%)                0 [unknown] ([unknown])
               uname  8021/8021  [000]     11234.098082170  branches:uH  tr strt                              0 [unknown] ([unknown]) => 7f3a8b9422b0 _start+0x0 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)
               uname  8021/8021  [000]     11234.098082379  branches:uH  tr end                    7f3a8b9422b0 _start+0x0 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) => 0 [unknown] ([unknown])
               uname  8021/8021  [000]     11234.098083629  branches:uH  tr strt                              0 [unknown] ([unknown]) => 7f3a8b9422b0 _start+0x0 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)
               uname  8021/8021  [000]     11234.098083629  branches:uH  call                      7f3a8b9422b3 _start+0x3 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) => 7f3a8b943050 _dl_start+0x0 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)
               uname  8021/8021  [000]     11234.098083837  branches:uH  tr end                    7f3a8b943060 _dl_start+0x10 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) => 0 [unknown] ([unknown])  IPC: 0.01 (9/938)
               uname  8021/8021  [000]     11234.098084670  branches:uH  tr strt                              0 [unknown] ([unknown]) => 7f3a8b943060 _dl_start+0x10 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)

Fixes: 378ef0f5d9d7f465 ("perf build: Use libtraceevent from the system")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20230315084321.14563-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf vendor events s390: Add metric for TLB and cache
Thomas Richter [Mon, 13 Mar 2023 08:02:01 +0000 (09:02 +0100)]
perf vendor events s390: Add metric for TLB and cache

Add metrics for tlb and cache statistics:

- finite_cpi: Cycles per Instructions from Finite cache/memory
- est_cpi: Estimated Instruction Complexity CPI infinite Level 1
- scpl1m: Estimated Sourcing Cycles per Level 1 Miss
- tlb_percent: Estimated TLB CPU percentage of Total CPU
- tlb_miss: Estimated Cycles per TLB Miss

For details about the formulas see this documentation:

  https://www.ibm.com/support/pages/system/files/inline-files/CPU%20MF%20Formulas%20including%20z16%20-%20May%202022_1.pdf

Output after:

  # ./perf stat -M tlb_miss -- dd if=/dev/zero of=/dev/null bs=1M count=10K
  ... dd output removed

  Performance counter stats for 'dd if=/dev/zero of=/dev/null bs=1M count=10K':

           667,726      DTLB2_MISSES             #   440.96 tlb_miss
               198      ITLB2_WRITES
       795,170,260      L1C_TLB2_MISSES
             9,478      ITLB2_MISSES
               820      DTLB2_WRITES
     1,197,126,869      L1D_PENALTY_CYCLES
         2,457,447      L1I_PENALTY_CYCLES

       1.249342187 seconds time elapsed

       0.001030000 seconds user
       1.248105000 seconds sys

  #

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-By: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20230313080201.2440201-3-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf vendor events s390: Add cache metrics for z13
Thomas Richter [Mon, 13 Mar 2023 08:02:00 +0000 (09:02 +0100)]
perf vendor events s390: Add cache metrics for z13

Add metrics for s390 z13

- Percentage sourced from Level 2 cache
- Percentage sourced from Level 3 on same chip cache
- Percentage sourced from Level 4 Local cache on same book
- Percentage sourced from Level 4 Remote cache on different book
- Percentage sourced from memory

For details about the formulas see this documentation:

  https://www.ibm.com/support/pages/system/files/inline-files/CPU%20MF%20Formulas%20including%20z16%20-%20May%202022_1.pdf

Output after:

  # ./perf stat -M l4rp -- find /
   ...find output deleted

  Performance counter stats for 'find /':

            2      L1I_OFFDRAWER_SCOL_L4_SOURCED_WRITES #     0.02 l4rp
          252      L1D_ONDRAWER_L4_SOURCED_WRITES
        3,465      L1D_ONDRAWER_L3_SOURCED_WRITES_IV
           80      L1D_OFFDRAWER_SCOL_L4_SOURCED_WRITES
          761      L1D_ONDRAWER_L3_SOURCED_WRITES
            0      L1I_OFFDRAWER_SCOL_L3_SOURCED_WRITES
  131,817,067      L1I_DIR_WRITES
            1      L1I_OFFDRAWER_FCOL_L4_SOURCED_WRITES
          447      L1D_OFFDRAWER_SCOL_L3_SOURCED_WRITES
           22      L1D_OFFDRAWER_FCOL_L4_SOURCED_WRITES
            7      L1I_ONDRAWER_L4_SOURCED_WRITES
            0      L1I_OFFDRAWER_FCOL_L3_SOURCED_WRITES
        1,071      L1D_OFFDRAWER_FCOL_L3_SOURCED_WRITES
            3      L1I_ONDRAWER_L3_SOURCED_WRITES
       13,352      L1D_OFFDRAWER_FCOL_L3_SOURCED_WRITES_IV
       15,252      L1D_OFFDRAWER_SCOL_L3_SOURCED_WRITES_IV
            0      L1I_ONDRAWER_L3_SOURCED_WRITES_IV
            0      L1I_OFFDRAWER_FCOL_L3_SOURCED_WRITES_IV
   57,431,083      L1D_DIR_WRITES
            0      L1I_OFFDRAWER_SCOL_L3_SOURCED_WRITES_IV

      15.386502874 seconds time elapsed

       0.647348000 seconds user
       3.537041000 seconds sys

  #

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-By: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20230313080201.2440201-3-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf vendor events s390: Add cache metrics for z14
Thomas Richter [Mon, 13 Mar 2023 08:01:59 +0000 (09:01 +0100)]
perf vendor events s390: Add cache metrics for z14

Add metrics for s390 z14

- Percentage sourced from Level 2 cache
- Percentage sourced from Level 3 on same chip cache
- Percentage sourced from Level 4 Local cache on same book
- Percentage sourced from Level 4 Remote cache on different book
- Percentage sourced from memory

For details about the formulas see this documentation:

  https://www.ibm.com/support/pages/system/files/inline-files/CPU%20MF%20Formulas%20including%20z16%20-%20May%202022_1.pdf

Output after:

  # ./perf stat -M l4rp -- find /
  .... find output deleted

  Performance counter stats for 'find /':

             0      L1I_OFFDRAWER_L4_SOURCED_WRITES  #     0.01 l4rp
            84      L1D_OFFDRAWER_L4_SOURCED_WRITES
             0      L1I_OFFDRAWER_L3_SOURCED_WRITES
    71,535,353      L1I_DIR_WRITES
           219      L1D_OFFDRAWER_L3_SOURCED_WRITES
        16,436      L1D_OFFDRAWER_L3_SOURCED_WRITES_IV
             0      L1I_OFFDRAWER_L3_SOURCED_WRITES_IV
    46,343,940      L1D_DIR_WRITES

  10.530805537 seconds time elapsed

   0.774396000 seconds user
   1.602714000 seconds sys

  #

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-By: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20230313080201.2440201-3-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf vendor events s390: Add cache metrics for z15
Thomas Richter [Mon, 13 Mar 2023 08:01:58 +0000 (09:01 +0100)]
perf vendor events s390: Add cache metrics for z15

Add metrics for s390 z15

- Percentage sourced from Level 2 cache
- Percentage sourced from Level 3 on same chip cache
- Percentage sourced from Level 4 Local cache on same book
- Percentage sourced from Level 4 Remote cache on different book
- Percentage sourced from memory

For details about the formulas see this documentation:

  https://www.ibm.com/support/pages/system/files/inline-files/CPU%20MF%20Formulas%20including%20z16%20-%20May%202022_1.pdf

Outpuf after:
  # ./perf stat -M l4rp -- find /
  .... find output deleted

  Performance counter stats for 'find /':

              5      L1I_OFFDRAWER_L4_SOURCED_WRITES  #     0.01 l4rp
            187      L1D_OFFDRAWER_L4_SOURCED_WRITES
              0      L1I_OFFDRAWER_L3_SOURCED_WRITES
    231,333,165      L1I_DIR_WRITES
          3,303      L1D_OFFDRAWER_L3_SOURCED_WRITES
         47,461      L1D_OFFDRAWER_L3_SOURCED_WRITES_IV
              0      L1I_OFFDRAWER_L3_SOURCED_WRITES_IV
    126,706,244      L1D_DIR_WRITES

   27.870355461 seconds time elapsed

    0.521562000 seconds user
   12.494503000 seconds sys
  #

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-By: Sumanth Korikkar <sumanthk@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20230313080201.2440201-3-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf vendor events intel: Update skylake events
Ian Rogers [Tue, 14 Mar 2023 05:33:12 +0000 (22:33 -0700)]
perf vendor events intel: Update skylake events

Update from v54 to v55. Addition of OFFCORE_RESPONSE,
FP_ARITH_INST_RETIRED.SCALAR, FP_ARITH_INST_RETIRED.VECTOR and
INT_MISC.CLEARS_COUNT.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20230314053312.3237390-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf vendor events intel: Update meteorlake events
Ian Rogers [Tue, 14 Mar 2023 05:33:11 +0000 (22:33 -0700)]
perf vendor events intel: Update meteorlake events

Update from 1.00 to 1.01. Event description updates. Addition of
IDQ_BUBBLES.CORE, TOPDOWN.BACKEND_BOUND_SLOTS, UOPS_RETIRED.SLOTS.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20230314053312.3237390-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf vendor events intel: Update graniterapids events
Ian Rogers [Tue, 14 Mar 2023 05:33:10 +0000 (22:33 -0700)]
perf vendor events intel: Update graniterapids events

Update from 1.00 to 1.01, some event description updates.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20230314053312.3237390-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf scripts intel-pt-events.py: Fix IPC output for Python 2
Roman Lozko [Fri, 10 Mar 2023 15:04:45 +0000 (15:04 +0000)]
perf scripts intel-pt-events.py: Fix IPC output for Python 2

Integers are not converted to floats during division in Python 2 which
results in incorrect IPC values. Fix by switching to new division
behavior.

Fixes: a483e64c0b62e93a ("perf scripting python: intel-pt-events.py: Add --insn-trace and --src-trace")
Signed-off-by: Roman Lozko <lozko.roma@gmail.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/20230310150445.2925841-1-lozko.roma@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf tools bpf: Add vmlinux.h to .gitignore
Arnaldo Carvalho de Melo [Tue, 14 Mar 2023 11:41:36 +0000 (08:41 -0300)]
perf tools bpf: Add vmlinux.h to .gitignore

Now that BPF skel based tools will be built by default if the toolchain
pieces that are needed are available, building directly on the source
tree will produce a vmlinux.h from the BTF info that needs to get
ignored.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf test: Fix "PMU event table sanity" for NO_JEVENTS=1
Ian Rogers [Wed, 8 Mar 2023 00:27:14 +0000 (16:27 -0800)]
perf test: Fix "PMU event table sanity" for NO_JEVENTS=1

A table was renamed and needed to be renamed in the empty case.

Fixes: 62774db2a05dc878 ("perf jevents: Generate metrics and events as separate tables")
Reviewed-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230308002714.1755698-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf lock contention: Show lock type with address
Namhyung Kim [Mon, 13 Mar 2023 20:48:25 +0000 (13:48 -0700)]
perf lock contention: Show lock type with address

Show lock type names after the symbol of locks if any.  This can be
useful especially when it doesn't show the lock symbols.

The indentation before the lock type parenthesis is to recognize lock
symbols more easily.

  $ sudo ./perf lock con -abl -- sleep 1
   contended   total wait     max wait     avg wait            address   symbol

          44      6.13 ms    284.49 us    139.28 us   ffffffff92e06080   tasklist_lock (rwlock)
         159    983.38 us     12.38 us      6.18 us   ffff8cc717c90000   siglock (spinlock)
          10    679.90 us    153.35 us     67.99 us   ffff8cdc2872aaf8   mmap_lock (rwsem)
           9    558.11 us    180.67 us     62.01 us   ffff8cd647914038   mmap_lock (rwsem)
          78    228.56 us      7.82 us      2.93 us   ffff8cc700061c00    (spinlock)
           5     41.60 us     16.93 us      8.32 us   ffffd853acb41468    (spinlock)
          10     37.24 us      5.87 us      3.72 us   ffff8cd560b5c200   siglock (spinlock)
           4     11.17 us      3.97 us      2.79 us   ffff8d053ddf0c80   rq_lock (spinlock)
           1      7.86 us      7.86 us      7.86 us   ffff8cd64791404c    (spinlock)
           1      4.13 us      4.13 us      4.13 us   ffff8d053d930c80   rq_lock (spinlock)
           7      3.98 us      1.67 us       568 ns   ffff8ccb92479440    (mutex)
           2      2.62 us      2.33 us      1.31 us   ffff8cc702e6ede0    (rwlock)

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230313204825.2665483-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf lock contention: Show per-cpu rq_lock with address
Namhyung Kim [Mon, 13 Mar 2023 20:48:24 +0000 (13:48 -0700)]
perf lock contention: Show per-cpu rq_lock with address

Using the BPF_PROG_RUN mechanism, we can run a raw_tp BPF program to
collect some semi-global locks like per-cpu locks.  Let's add runqueue
locks using bpf_per_cpu_ptr() helper.

  $ sudo ./perf lock con -abl -- sleep 1
   contended   total wait     max wait     avg wait            address   symbol

         248      3.25 ms     32.23 us     13.10 us   ffff8cc75cfd2940   siglock
          60    217.91 us      9.69 us      3.63 us   ffff8cc700061c00
           8     70.23 us     13.86 us      8.78 us   ffff8cc703629484
           4     56.32 us     35.81 us     14.08 us   ffff8cc78b66f778   mmap_lock
           4     16.70 us      5.18 us      4.18 us   ffff8cc7036a0684
           3      4.99 us      2.65 us      1.66 us   ffff8d053da30c80   rq_lock
           2      3.44 us      2.28 us      1.72 us   ffff8d053dcf0c80   rq_lock
           9      2.51 us       371 ns       278 ns   ffff8ccb92479440
           2      2.11 us      1.24 us      1.06 us   ffff8d053db30c80   rq_lock
           2      2.06 us      1.69 us      1.03 us   ffff8d053d970c80   rq_lock

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230313204825.2665483-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf lock contention: Track and show siglock with address
Namhyung Kim [Mon, 13 Mar 2023 20:48:23 +0000 (13:48 -0700)]
perf lock contention: Track and show siglock with address

Likewise, we can display siglock by following the pointer like
current->sighand->siglock.

  $ sudo ./perf lock con -abl -- sleep 1
   contended   total wait     max wait     avg wait            address   symbol

          16      2.18 ms    305.35 us    136.34 us   ffffffff92e06080   tasklist_lock
          28    521.78 us     31.16 us     18.63 us   ffff8cc703783ec4
           7    119.03 us     23.55 us     17.00 us   ffff8ccb92479440
          15     88.29 us     10.06 us      5.89 us   ffff8cd560b5f380   siglock
           7     37.67 us      9.16 us      5.38 us   ffff8d053daf0c80
           5      8.81 us      4.92 us      1.76 us   ffff8d053d6b0c80

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230313204825.2665483-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf lock contention: Track and show mmap_lock with address
Namhyung Kim [Mon, 13 Mar 2023 20:48:22 +0000 (13:48 -0700)]
perf lock contention: Track and show mmap_lock with address

Sometimes there are severe contentions on the mmap_lock and we want
see it in the -l/--lock-addr output.  However it cannot symbolize
the mmap_lock because it's allocated dynamically without symbols.

Stephane and Hao gave me an idea separately to display mmap_lock by
following the current->mm pointer.  I added a flag to mark mmap_lock
after comparing the lock address so that it can show them differently.
With this change it can show mmap_lock like below:

  $ sudo ./perf lock con -abl -- sleep 10
   contended   total wait     max wait     avg wait            address   symbol
   ...
       16344    312.30 ms      2.22 ms     19.11 us   ffff8cc702595640
       17686    310.08 ms      1.49 ms     17.53 us   ffff8cc7025952c0
           3     84.14 ms     45.79 ms     28.05 ms   ffff8cc78114c478   mmap_lock
        3557     76.80 ms     68.75 us     21.59 us   ffff8cc77ca3af58
           1     68.27 ms     68.27 ms     68.27 ms   ffff8cda745dfd70
           9     54.53 ms      7.96 ms      6.06 ms   ffff8cc7642a48b8   mmap_lock
       14629     44.01 ms     60.00 us      3.01 us   ffff8cc7625f9ca0
        3481     42.63 ms    140.71 us     12.24 us   ffffffff937906ac   vmap_area_lock
       16194     38.73 ms     42.15 us      2.39 us   ffff8cd397cbc560
          11     38.44 ms     10.39 ms      3.49 ms   ffff8ccd6d12fbb8   mmap_lock
           1      5.43 ms      5.43 ms      5.43 ms   ffff8cd70018f0d8
        1674      5.38 ms    422.93 us      3.21 us   ffffffff92e06080   tasklist_lock
         581      4.51 ms    130.68 us      7.75 us   ffff8cc9b1259058
           5      3.52 ms      1.27 ms    703.23 us   ffff8cc754510070
         112      3.47 ms     56.47 us     31.02 us   ffff8ccee38b3120
         381      3.31 ms     73.44 us      8.69 us   ffffffff93790690   purge_vmap_area_lock
         255      3.19 ms     36.35 us     12.49 us   ffff8d053ce30c80

Note that mmap_lock was renamed some time ago and it needs to support
old kernels with a different name 'mmap_sem'.

Suggested-by: Hao Luo <haoluo@google.com>
Suggested-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230313204825.2665483-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf build: Error if no libelf and NO_LIBELF isn't set
Ian Rogers [Sat, 11 Mar 2023 06:57:44 +0000 (22:57 -0800)]
perf build: Error if no libelf and NO_LIBELF isn't set

Building without libelf support is going disable a lot of
functionality. Require that the NO_LIBELF=1 build option is passed if
this is intentional.

Committer notes:

Add NO_LIBELF=1 to the 'make_static' target in tools/perf/tests/make so
that 'make -C tools/perf build-test' works.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andres Freund <andres@anarazel.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Pavithra Gurushankar <gpavithrasha@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Roberto Sassu <roberto.sassu@huawei.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Tom Rix <trix@redhat.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/r/20230311065753.3012826-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf build: Remove redundant NO_NEWT build option
Ian Rogers [Sat, 11 Mar 2023 06:57:53 +0000 (22:57 -0800)]
perf build: Remove redundant NO_NEWT build option

The option controlled nothing and no code depends, conditional or
otherwise, on libnewt.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andres Freund <andres@anarazel.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Pavithra Gurushankar <gpavithrasha@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Roberto Sassu <roberto.sassu@huawei.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Tom Rix <trix@redhat.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/r/20230311065753.3012826-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf build: If libtraceevent isn't present error the build
Ian Rogers [Sat, 11 Mar 2023 06:57:51 +0000 (22:57 -0800)]
perf build: If libtraceevent isn't present error the build

If libtraceevent isn't present, the build will warn and continue. This
disables a number of features and so isn't desirable. This change
makes the build error for this case. The build can still be made to
happen by adding NO_LIBTRACEEVENT=1.

Committer notes:

Add NO_LIBTRACEEVENT=1 to the 'make_static' target in
tools/perf/tests/make so that 'make -C tools/perf build-test' works.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andres Freund <andres@anarazel.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Pavithra Gurushankar <gpavithrasha@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Roberto Sassu <roberto.sassu@huawei.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Tom Rix <trix@redhat.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/r/20230311065753.3012826-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf build: Switch libpfm4 to opt-out rather than opt-in
Ian Rogers [Sat, 11 Mar 2023 06:57:50 +0000 (22:57 -0800)]
perf build: Switch libpfm4 to opt-out rather than opt-in

If libpfm4 passes the feature test, it would be nice to have it
enabled rather than also requiring the LIBPFM4=1 build flag.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andres Freund <andres@anarazel.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Pavithra Gurushankar <gpavithrasha@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Roberto Sassu <roberto.sassu@huawei.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Tom Rix <trix@redhat.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/r/20230311065753.3012826-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf symbol: Add abi::__cxa_demangle C++ demangling support
Ian Rogers [Sat, 11 Mar 2023 06:57:49 +0000 (22:57 -0800)]
perf symbol: Add abi::__cxa_demangle C++ demangling support

Refactor C++ demangling out of symbol-elf into its own files similar
to other languages. Add abi::__cxa_demangle support. As the other
demanglers are not shippable with distributions, this brings back C++
demangling in a common case. It isn't perfect as the support for
optionally demangling arguments and modifiers isn't present.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andres Freund <andres@anarazel.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Pavithra Gurushankar <gpavithrasha@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Roberto Sassu <roberto.sassu@huawei.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Tom Rix <trix@redhat.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/r/20230311065753.3012826-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agotools build: Add feature test for abi::__cxa_demangle
Ian Rogers [Sat, 11 Mar 2023 06:57:48 +0000 (22:57 -0800)]
tools build: Add feature test for abi::__cxa_demangle

cxxabi.h is part of libsdtc++ and LLVM's libcxx, providing
abi::__cxa_demangle a portable C++ demangler. Add a feature test to
detect that the function is available.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andres Freund <andres@anarazel.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Pavithra Gurushankar <gpavithrasha@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Roberto Sassu <roberto.sassu@huawei.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Tom Rix <trix@redhat.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/r/20230311065753.3012826-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf build: Make binutil libraries opt in
Ian Rogers [Sat, 11 Mar 2023 06:57:47 +0000 (22:57 -0800)]
perf build: Make binutil libraries opt in

binutils is GPLv3 so distributions cannot ship perf linked against
libbfd and libiberty as the licenses are incompatible. Rather than
defaulting the build to opting in to libbfd and libiberty support and
opting out via NO_LIBBFD=1 and NO_DEMANGLE=1, make building against
the libraries optional and enabled with BUILD_NONDISTRO=1.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andres Freund <andres@anarazel.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Pavithra Gurushankar <gpavithrasha@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Roberto Sassu <roberto.sassu@huawei.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Tom Rix <trix@redhat.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/r/20230311065753.3012826-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf build: Error if jevents won't work and NO_JEVENTS=1 isn't set
Ian Rogers [Sat, 11 Mar 2023 06:57:46 +0000 (22:57 -0800)]
perf build: Error if jevents won't work and NO_JEVENTS=1 isn't set

Rather than disabling jevents if a sufficient python isn't present
error in the build. This avoids the build progressing but the binary
being degraded. The build can still succeed by specifying NO_JEVENTS=1
to the build and this is conveyed in the error message.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andres Freund <andres@anarazel.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Pavithra Gurushankar <gpavithrasha@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Roberto Sassu <roberto.sassu@huawei.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Tom Rix <trix@redhat.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/r/20230311065753.3012826-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf util: Remove weak sched_getcpu
Ian Rogers [Sat, 11 Mar 2023 06:57:45 +0000 (22:57 -0800)]
perf util: Remove weak sched_getcpu

sched_getcpu may not be present and so a feature test and definition
exist to workaround this in the build. The feature test is used to
define HAVE_SCHED_GETCPU_SUPPORT and so this is sufficient to know
whether the local sched_getcpu is needed and a weak symbol can be
avoided.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andres Freund <andres@anarazel.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Pavithra Gurushankar <gpavithrasha@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Roberto Sassu <roberto.sassu@huawei.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Tom Rix <trix@redhat.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/r/20230311065753.3012826-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf build: Remove unused HAVE_GLIBC_SUPPORT
Ian Rogers [Sat, 11 Mar 2023 06:57:43 +0000 (22:57 -0800)]
perf build: Remove unused HAVE_GLIBC_SUPPORT

HAVE_GLIBC_SUPPORT is only used in `perf version --build-options` but
doesn't control any behavior. Remove from the build to simplify it.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andres Freund <andres@anarazel.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Pavithra Gurushankar <gpavithrasha@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Roberto Sassu <roberto.sassu@huawei.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Tom Rix <trix@redhat.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/r/20230311065753.3012826-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf build: Make BUILD_BPF_SKEL default, rename to NO_BPF_SKEL
Ian Rogers [Sat, 11 Mar 2023 06:57:42 +0000 (22:57 -0800)]
perf build: Make BUILD_BPF_SKEL default, rename to NO_BPF_SKEL

BPF skeleton support is now key to a number of perf features. Rather
than making it so that BPF support must be enabled for the build, make
this the default and error if the build lacks a clang and libbpf that
are sufficient. To avoid the error and build without BPF skeletons the
NO_BPF_SKEL=1 flag can be used. Add a build-options flag to 'perf
version' to enable detection of the BPF skeleton support and use this
in the offcpu shell test.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andres Freund <andres@anarazel.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Pavithra Gurushankar <gpavithrasha@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Roberto Sassu <roberto.sassu@huawei.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Tom Rix <trix@redhat.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/r/20230311065753.3012826-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf build: Support python/perf.so testing
Ian Rogers [Sat, 11 Mar 2023 06:57:41 +0000 (22:57 -0800)]
perf build: Support python/perf.so testing

Add a build target to echo the python/perf.so's name from
Makefile.perf. Use it in tests/make so the correct target is built and
tested for.

Fixes: caec54705adb73b0 ("perf build: Fix python/perf.so library's name")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andres Freund <andres@anarazel.de>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Pavithra Gurushankar <gpavithrasha@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Roberto Sassu <roberto.sassu@huawei.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Tom Rix <trix@redhat.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/r/20230311065753.3012826-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf bpf: Remove pre libbpf 1.0 conditional logic
Ian Rogers [Mon, 16 Jan 2023 01:01:15 +0000 (17:01 -0800)]
perf bpf: Remove pre libbpf 1.0 conditional logic

Tests are no longer applicable as libbpf 1.0 can be assumed.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked/Tested-by: Jiri Olsa <jolsa@kernel.org>
Acked/Tested-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andres Freund <andres@anarazel.de>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Christy Lee <christylee@fb.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Roberto Sassu <roberto.sassu@huawei.com>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230116010115.490713-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf build: Remove libbpf pre-1.0 feature tests
Ian Rogers [Mon, 16 Jan 2023 01:01:14 +0000 (17:01 -0800)]
perf build: Remove libbpf pre-1.0 feature tests

The feature tests were necessary for libbpf pre-1.0, but as the libbpf
implies at least 1.0 we can remove these now.

Committer notes:

Modified tools/perf/Makefile.config to better reflect the reason for
failure when the libbpf present is < 1.0 and LIBBPF_DYNAMIC=1 was asked
for.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andres Freund <andres@anarazel.de>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Christy Lee <christylee@fb.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Roberto Sassu <roberto.sassu@huawei.com>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230116010115.490713-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agotools build: Pass libbpf feature only if libbpf 1.0+
Ian Rogers [Mon, 16 Jan 2023 01:01:13 +0000 (17:01 -0800)]
tools build: Pass libbpf feature only if libbpf 1.0+

libbpf 1.0 represented a cleanup and stabilization of APIs. Simplify
development by only passing the feature test if libbpf 1.0 is installed.

Committer notes:

Change 'make -C tools/perf build-test' so that the LIBBPF_DYNAMIC=1 test
runs only if libbpf is >= 1.0.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andres Freund <andres@anarazel.de>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Christy Lee <christylee@fb.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Roberto Sassu <roberto.sassu@huawei.com>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230116010115.490713-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf lock contention: Fix compiler builtin detection
Ian Rogers [Wed, 8 Mar 2023 00:30:20 +0000 (16:30 -0800)]
perf lock contention: Fix compiler builtin detection

__has_builtin was passed the macro rather than the actual builtin
feature. The builtin test isn't sufficient and a clang version test
also needs to be performed.

Fixes: 1bece1351c653c3d ("perf lock contention: Support old rw_semaphore type")
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <martin.lau@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20230308003020.3653271-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf cs-etm: Avoid printing warning in cs_etm_is_ete() check
James Clark [Wed, 8 Mar 2023 09:48:43 +0000 (09:48 +0000)]
perf cs-etm: Avoid printing warning in cs_etm_is_ete() check

When checking for the presence of ETE, a register is read that may not
be present on older kernels or if ETE isn't available. cs_etm_get_ro()
will print a warning if it doesn't exist, so check for the existence
first before accessing it.

Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Al Grant <al.grant@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230308094843.287093-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf cs-etm: Reduce verbosity of ts_source warning
James Clark [Wed, 8 Mar 2023 09:48:42 +0000 (09:48 +0000)]
perf cs-etm: Reduce verbosity of ts_source warning

This is printed as a warning but it is normal behavior that users
shouldn't be expected to do anything about. Reduce the warning level to
debug3 so it's only seen in verbose mode to avoid confusion.

Reviewed-by: Leo Yan <leo.yan@linaro.org
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Al Grant <al.grant@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230308094843.287093-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf list: Add PMU pai_ext event description for IBM z16
Thomas Richter [Wed, 8 Mar 2023 12:53:26 +0000 (13:53 +0100)]
perf list: Add PMU pai_ext event description for IBM z16

Add the event description for the IBM z16 pai_ext PMU released with
commit c432fefe8e6262bf ("s390/pai: Add support for PAI Extension 1 NNPA
counters")

The document SA22-7832-13 "z/Architecture Principles of Operation",
published May, 2022, contains the description of the
Processor Activity Instrumentation Facility and the NNPA counter
set., See Pages 5-113 to 5-116 and chapter 26 for details.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20230308125326.2195613-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf vendor events s390: Add cache metrics for z16
Thomas Richter [Mon, 13 Mar 2023 08:01:57 +0000 (09:01 +0100)]
perf vendor events s390: Add cache metrics for z16

Add metrics for s390 z16

- Percentage sourced from Level 2 cache
- Percentage sourced from Level 3 on same chip cache
- Percentage sourced from Level 4 Local cache on same book
- Percentage sourced from Level 4 Remote cache on different book
- Percentage sourced from memory

For details about the formulas see this documentation:
https://www.ibm.com/support/pages/system/files/inline-files/CPU%20MF%20Formulas%20including%20z16%20-%20May%202022_1.pdf

Output after:

  # ./perf stat -M l4rp -- dd if=/dev/zero of=/dev/null bs=10M count=10K
  .... dd output deleted

  Performance counter stats for 'dd if=/dev/zero of=/dev/null bs=10M count=10K':

                  0      IDCW_OFF_DRAWER_CHIP_HIT         #     0.00 l4rp
            431,866      L1I_DIR_WRITES
              2,395      IDCW_OFF_DRAWER_IV
                  0      ICW_OFF_DRAWER
                  0      IDCW_OFF_DRAWER_DRAWER_HIT
              1,437      DCW_OFF_DRAWER
        425,960,793      L1D_DIR_WRITES

       12.165030699 seconds time elapsed

        0.001037000 seconds user
       12.162140000 seconds sys
  #

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-By: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20230313080201.2440201-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf vendor events s390: Add common metrics
Thomas Richter [Mon, 13 Mar 2023 08:01:56 +0000 (09:01 +0100)]
perf vendor events s390: Add common metrics

Add 3 metrics for s390 machines:

- Cycles per instruction: Amount of CPU cycles used per instructions,
  named cpi.
- Problem state ratio: Ratio of instructions executed in problem state
  compared to total number of instructions, named prbstate.
- Level one instruction and data cache misses per 100 instructions,
  named l1mp.

For details about the formulas see this documentation:
https://www.ibm.com/support/pages/system/files/inline-files/CPU%20MF%20Formulas%20including%20z16%20-%20May%202022_1.pdf

Output after:

  # ./perf stat -M cpi -- dd if=/dev/zero of=/dev/null bs=1M count=10K
  10240+0 records in
  10240+0 records out
  10737418240 bytes (11 GB, 10 GiB) copied, 1.30151 s, 8.2 GB/s

  Performance counter stats for 'dd if=/dev/zero of=/dev/null .....':

    6,779,778,802      CPU_CYCLES              #     1.96 cpi
    3,461,975,090      INSTRUCTIONS

    1.306873021 seconds time elapsed

    0.001034000 seconds user
    1.305677000 seconds sys
  #

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-By: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20230313080201.2440201-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf parse-events: Warn when events are regrouped
Ian Rogers [Sun, 12 Mar 2023 02:15:43 +0000 (18:15 -0800)]
perf parse-events: Warn when events are regrouped

Use if an event is reordered or the number of groups increases to
signal that regrouping has happened and warn about it. Disable the
warning in the case wild card PMU names are used and for metrics.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20230312021543.3060328-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf evlist: Remove nr_groups
Ian Rogers [Sun, 12 Mar 2023 02:15:42 +0000 (18:15 -0800)]
perf evlist: Remove nr_groups

Maintaining the number of groups during event parsing is problematic
and since changing to sort/regroup events can only be computed by a
linear pass over the evlist. As the value is generally only used in
tests, rather than hold it in a variable compute it by passing over
the evlist when necessary.

This change highlights that libpfm's counting of groups with a single
entry disagreed with regular event parsing. The libpfm tests are
updated accordingly.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20230312021543.3060328-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf evsel: Remove use_uncore_alias
Ian Rogers [Sun, 12 Mar 2023 02:15:41 +0000 (18:15 -0800)]
perf evsel: Remove use_uncore_alias

This flag used to be used when regrouping uncore events in particular
due to wildcard matches. This is now handled by sorting evlist and so
the flag is redundant.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20230312021543.3060328-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf parse-events: Sort and group parsed events
Ian Rogers [Sun, 12 Mar 2023 02:15:40 +0000 (18:15 -0800)]
perf parse-events: Sort and group parsed events

This change is intended to be a no-op for most current cases, the
default sort order is the order the events were parsed. Where it
varies is in how groups are handled. Previously an uncore and core
event that are grouped would most often cause the group to be removed:

```
$ perf stat -e '{instructions,uncore_imc_free_running_0/data_total/}' -a sleep 1
WARNING: grouped events cpus do not match, disabling group:
  anon group { instructions, uncore_imc_free_running_0/data_total/ }
...
```

However, when wildcards are used the events should be re-sorted and
re-grouped in parse_events__set_leader, but this currently fails for
simple examples:

```
$ perf stat -e '{uncore_imc_free_running/data_read/,uncore_imc_free_running/data_write/}' -a sleep 1

 Performance counter stats for 'system wide':

     <not counted> MiB  uncore_imc_free_running/data_read/
     <not counted> MiB  uncore_imc_free_running/data_write/

       1.000996992 seconds time elapsed
```

A futher failure mode, fixed in this patch, is to force topdown events
into a group.

This change moves sorting the evsels in the evlist after parsing. It
requires parsing to set up groups. First the evsels are sorted
respecting the existing groupings and parse order, but also reordering
to ensure evsels of the same PMU and group appear together. So that
software and aux events respect groups, their pmu_name is taken from
the group leader. The sorting is done with list_sort removing a memory
allocation.

After sorting a pass is done to correct the group leaders and for
topdown events ensuring they have a group leader.

This fixes the problems seen before:

```
$ perf stat -e '{uncore_imc_free_running/data_read/,uncore_imc_free_running/data_write/}' -a sleep 1

 Performance counter stats for 'system wide':

            727.42 MiB  uncore_imc_free_running/data_read/
             81.84 MiB  uncore_imc_free_running/data_write/

       1.000948615 seconds time elapsed
```

As well as making groups not fail for cases like:

```
$ perf stat -e '{imc_free_running_0/data_total/,imc_free_running_1/data_total/}' -a sleep 1

 Performance counter stats for 'system wide':

            256.47 MiB  imc_free_running_0/data_total/
            256.48 MiB  imc_free_running_1/data_total/

       1.001165442 seconds time elapsed
```

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20230312021543.3060328-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf parse-events: Pass ownership of the group name
Ian Rogers [Sun, 12 Mar 2023 02:15:39 +0000 (18:15 -0800)]
perf parse-events: Pass ownership of the group name

Pass ownership of the group name rather than copying and freeing the
original. This saves a memory allocation and copy.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20230312021543.3060328-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf evsel: Add function to compute group PMU name
Ian Rogers [Sun, 12 Mar 2023 02:15:38 +0000 (18:15 -0800)]
perf evsel: Add function to compute group PMU name

The computed name respects software events and aux event groups, such
that the pmu_name is changed to be that of the aux event leader or
group leader for software events. This is done as a later change will
split events that are in different PMUs into different groups.

Committer notes:

Added a stub for this new function so that 'perf test python' passes.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20230312021543.3060328-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf evsel: Allow const evsel for certain accesses
Ian Rogers [Sun, 12 Mar 2023 02:15:37 +0000 (18:15 -0800)]
perf evsel: Allow const evsel for certain accesses

List sorting, added later to evlist, passes const elements requiring
helper functions to also be const. Make the argument to
evsel__find_pmu, evsel__is_aux_event and evsel__leader const.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20230312021543.3060328-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf stat: Modify the group test
Ian Rogers [Sun, 12 Mar 2023 02:15:36 +0000 (18:15 -0800)]
perf stat: Modify the group test

Currently nr_members is 0 for an event with no group, however, they
are always a leader of their own group. A later change will make that
count 1 because the event is its own leader. Make the find_stat logic
consistent with this, an improvement suggested by Namhyung Kim.

Suggested-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20230312021543.3060328-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf pmu: Earlier PMU auxtrace initialization
Ian Rogers [Sun, 12 Mar 2023 02:15:35 +0000 (18:15 -0800)]
perf pmu: Earlier PMU auxtrace initialization

This allows event parsing to use the evsel__is_aux_event function,
which is important when determining event grouping.

Suggested-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20230312021543.3060328-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf stat: Don't remove all grouped events when CPU maps disagree
Ian Rogers [Sun, 12 Mar 2023 02:15:34 +0000 (18:15 -0800)]
perf stat: Don't remove all grouped events when CPU maps disagree

If the events in an evlist's CPU map differ then the entire group is
removed. For example:

```
$ perf stat -e '{imc_free_running/data_read/,imc_free_running/data_write/,cs}' -a sleep 1
WARNING: grouped events cpus do not match, disabling group:
  anon group { imc_free_running/data_read/, imc_free_running/data_write/, cs }
```

Change the behavior so that just the events not matching the leader
are removed. So in the example above, just 'cs' will be removed.

Modify the warning so that it is produced once for each group, rather
than once for the entire evlist. Shrink the scope and size of the
warning text buffer.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20230312021543.3060328-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agolibperf evlist: Avoid a use of evsel idx
Ian Rogers [Sun, 12 Mar 2023 02:15:33 +0000 (18:15 -0800)]
libperf evlist: Avoid a use of evsel idx

Setting the leader iterates the list, so rather than use idx (which
may be changed through list reordering) just count the elements and
set afterwards.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20230312021543.3060328-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf ftrace: Reuse target::initial_delay
Changbin Du [Thu, 2 Mar 2023 03:11:46 +0000 (11:11 +0800)]
perf ftrace: Reuse target::initial_delay

Replace perf_ftrace::initial_delay with target::initial_delay.
Specifying a negative initial_delay is meaningless for ftrace in
practice but allowed here.

Signed-off-by: Changbin Du <changbin.du@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Hui Wang <hw.huiwang@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230302031146.2801588-4-changbin.du@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf record: Reuse target::initial_delay
Changbin Du [Thu, 2 Mar 2023 03:11:45 +0000 (11:11 +0800)]
perf record: Reuse target::initial_delay

This just simply replace record_opts::initial_delay with
target::initial_delay. Nothing else is changed.

Signed-off-by: Changbin Du <changbin.du@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Hui Wang <hw.huiwang@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230302031146.2801588-3-changbin.du@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
19 months agoperf record: Fix "read LOST count failed" msg with sample read
Kan Liang [Wed, 1 Mar 2023 15:04:13 +0000 (07:04 -0800)]
perf record: Fix "read LOST count failed" msg with sample read

Hundreds of "read LOST count failed" error messages may be displayed,
when the below command is launched.

perf record -e '{cpu/mem-loads-aux/,cpu/event=0xcd,umask=0x1/}:S' -a

According to the commit 89e3106fa25fb1b6 ("libperf: Handle read format
in perf_evsel__read()"), the PERF_FORMAT_GROUP is only available for
the leader. However, the record__read_lost_samples() goes through every
entry of an evlist, which includes both leader and member. The member
event errors out and triggers the error message. Since there may be
hundreds of CPUs on a server, the message will be printed hundreds of
times, which is very annoying.

The message itself is correct, but the pr_err is a overkill. Other error
messages in the record__read_lost_samples() are all pr_debug. To make
the output format consistent, change the pr_err("read LOST count
failed\n"); to pr_debug("read LOST count failed\n");.
User can still get the message via -v option.

Fixes: e3a23261ad06d598 ("perf record: Read and inject LOST_SAMPLES events")
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230301150413.27011-1-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
20 months agoMerge remote-tracking branch 'acme/perf-tools' into perf-tools-next
Arnaldo Carvalho de Melo [Fri, 10 Mar 2023 21:43:17 +0000 (18:43 -0300)]
Merge remote-tracking branch 'acme/perf-tools' into perf-tools-next

To pick up perf-tools fixes just merged upstream.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
20 months agoMerge tag 'riscv-for-linus-6.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 10 Mar 2023 17:19:30 +0000 (09:19 -0800)]
Merge tag 'riscv-for-linus-6.3-rc2' of git://git./linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:

 - RISC-V architecture-specific ELF attributes have been disabled in the
   kernel builds

 - A fix for a locking failure while during errata patching that
   manifests on SiFive-based systems

 - A fix for a KASAN failure during stack unwinding

 - A fix for some lockdep failures during text patching

* tag 'riscv-for-linus-6.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  RISC-V: Don't check text_mutex during stop_machine
  riscv: Use READ_ONCE_NOCHECK in imprecise unwinding stack mode
  RISC-V: fix taking the text_mutex twice during sifive errata patching
  RISC-V: Stop emitting attributes

20 months agoMerge tag 'drm-fixes-2023-03-10' of git://anongit.freedesktop.org/drm/drm
Linus Torvalds [Fri, 10 Mar 2023 16:57:46 +0000 (08:57 -0800)]
Merge tag 'drm-fixes-2023-03-10' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Weekly fixes.

  msm and amdgpu are the vast majority of these, otherwise some
  straggler misc from last week for nouveau and cirrus and a mailmap
  update for a drm developer.

  mailmap:
   - add an entry

  nouveau:
   - fix system shutdown regression
   - build warning fix

  cirrus:
   - NULL ptr deref fix

  msm:
   - fix invalid ptr free in syncobj cleanup
   - sync GMU removal in teardown
   - a5xx preemption fixes
   - fix runpm imbalance
   - DPU hw fixes
   - stack corruption fix
   - clear DSPP reservation

  amdgpu:
   - Misc display fixes
   - UMC 8.10 fixes
   - Driver unload fixes
   - NBIO 7.3.0 fix
   - Error checking fixes for soc15, nv, soc21 read register interface
   - Fix video cap query for VCN 4.0.4

  amdkfd:
   - Fix return check in doorbell handling"

* tag 'drm-fixes-2023-03-10' of git://anongit.freedesktop.org/drm/drm: (42 commits)
  drm/amdgpu/soc21: Add video cap query support for VCN_4_0_4
  drm/amdgpu: fix error checking in amdgpu_read_mm_registers for nv
  drm/amdgpu: fix error checking in amdgpu_read_mm_registers for soc21
  drm/amdgpu: fix error checking in amdgpu_read_mm_registers for soc15
  drm/amdgpu: Fix the warning info when removing amdgpu device
  drm/amdgpu: fix return value check in kfd
  drm/amd: Fix initialization mistake for NBIO 7.3.0
  drm/amdgpu: Fix call trace warning and hang when removing amdgpu device
  mailmap: add mailmap entries for Faith.
  drm/msm: DEVFREQ_GOV_SIMPLE_ONDEMAND is no longer needed
  drm/amd/display: Update clock table to include highest clock setting
  drm/amd/pm: Enable ecc_info table support for smu v13_0_10
  drm/amdgpu: Support umc node harvest config on umc v8_10
  drm/connector: print max_requested_bpc in state debugfs
  drm/display: Don't block HDR_OUTPUT_METADATA on unknown EOTF
  drm/msm/dpu: clear DSPP reservations in rm release
  drm/msm/disp/dpu: fix sc7280_pp base offset
  drm/msm/dpu: fix stack smashing in dpu_hw_ctl_setup_blendstage
  drm/msm/dpu: don't use DPU_CLK_CTRL_CURSORn for DMA SSPP clocks
  drm/msm/dpu: fix clocks settings for msm8998 SSPP blocks
  ...

20 months agoMerge tag 'erofs-for-6.3-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 10 Mar 2023 16:51:57 +0000 (08:51 -0800)]
Merge tag 'erofs-for-6.3-rc2-fixes' of git://git./linux/kernel/git/xiang/erofs

Pull erofs fixes from Gao Xiang:
 "The most important one reverts an improper fix which can cause an
  unexpected warning more often on specific images, and another one
  fixes LZMA decompression on 32-bit platforms. The others are minor
  fixes and cleanups.

   - Fix LZMA decompression failure on HIGHMEM platforms

   - Revert an inproper fix since it is actually an implementation issue
     of vmalloc()

   - Avoid a wrong DBG_BUGON since it could be triggered with -EINTR

   - Minor cleanups"

* tag 'erofs-for-6.3-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
  erofs: use wrapper i_blocksize() in erofs_file_read_iter()
  erofs: get rid of a useless DBG_BUGON
  erofs: Revert "erofs: fix kvcalloc() misuse with __GFP_NOFAIL"
  erofs: fix wrong kunmap when using LZMA on HIGHMEM platforms
  erofs: mark z_erofs_lzma_init/erofs_pcpubuf_init w/ __init