perf report: Add "addr_from" and "addr_to" sort dimensions
author Stephane Eranian <eranian@google.com>
Tue, 8 Feb 2022 21:16:37 +0000 (13:16 -0800)
committer Arnaldo Carvalho de Melo <acme@redhat.com>
Wed, 16 Feb 2022 14:21:22 +0000 (11:21 -0300)
commit 052747700e914896e8c78ff019411487dc7c12a0
tree 08b65a5fafe68aee1a7bb82ce413cfd3ffc3a284
parent b47f18d85c795d8deac8210f50032030b1254882

With the existing symbol_from/symbol_to sort dimensions, branches captured
in the same function are collapsed into a single histogram entry when the
latencies (cycles) associated with each branch are all the same.  That is
the case on Intel Broadwell, for instance. Since Intel Skylake, the latency
is captured by hardware and is therefore used to disambiguate branches.

Add addr_from/addr_to sort dimensions to sort branches based on their
addresses and not the function they are in. The output is still the
function name, but the offset within the function is provided to uniquely
identify each branch.  These new sort dimensions also help with annotate
because they create different entries in the histogram which, in turn,
generate proper branch annotations.

Here is an example using AMD's branch sampling:

  $ perf record -a -b -c 1000037 -e cpu/branch-brs/ test_prg

  $ perf report
  Samples: 6M of event 'cpu/branch-brs/', Event count (approx.): 6901276
  Overhead  Command          Source Shared Object  Source Symbol                                   Target Symbol                                   Basic Block Cycle
    99.65%  test_prg         test_prg              [.] test_thread                                 [.] test_thread                                 -
     0.02%  test_prg         [kernel.vmlinux]      [k] asm_sysvec_apic_timer_interrupt             [k] error_entry                                 -

  $ perf report -F overhead,comm,dso,addr_from,addr_to
  Samples: 6M of event 'cpu/branch-brs/', Event count (approx.): 6901276
  Overhead  Command          Shared Object     Source Address          Target Address
     4.22%  test_prg         test_prg          [.] test_thread+0x3c    [.] test_thread+0x4
     4.13%  test_prg         test_prg          [.] test_thread+0x4     [.] test_thread+0x3a
     4.09%  test_prg         test_prg          [.] test_thread+0x3a    [.] test_thread+0x6
     4.08%  test_prg         test_prg          [.] test_thread+0x2     [.] test_thread+0x3c
     4.06%  test_prg         test_prg          [.] test_thread+0x3e    [.] test_thread+0x2
     3.87%  test_prg         test_prg          [.] test_thread+0x6     [.] test_thread+0x38
     3.84%  test_prg         test_prg          [.] test_thread         [.] test_thread+0x3e
     3.76%  test_prg         test_prg          [.] test_thread+0x1e    [.] test_thread
     3.76%  test_prg         test_prg          [.] test_thread+0x38    [.] test_thread+0x8
     3.56%  test_prg         test_prg          [.] test_thread+0x22    [.] test_thread+0x1e
     3.54%  test_prg         test_prg          [.] test_thread+0x8     [.] test_thread+0x36
     3.47%  test_prg         test_prg          [.] test_thread+0x1c    [.] test_thread+0x22
     3.45%  test_prg         test_prg          [.] test_thread+0x36    [.] test_thread+0xa
     3.28%  test_prg         test_prg          [.] test_thread+0x24    [.] test_thread+0x1c
     3.25%  test_prg         test_prg          [.] test_thread+0xa     [.] test_thread+0x34
     3.24%  test_prg         test_prg          [.] test_thread+0x1a    [.] test_thread+0x24
     3.20%  test_prg         test_prg          [.] test_thread+0x34    [.] test_thread+0xc
     3.04%  test_prg         test_prg          [.] test_thread+0x26    [.] test_thread+0x1a
     3.01%  test_prg         test_prg          [.] test_thread+0xc     [.] test_thread+0x32
     2.98%  test_prg         test_prg          [.] test_thread+0x18    [.] test_thread+0x26
     2.94%  test_prg         test_prg          [.] test_thread+0x32    [.] test_thread+0xe
     2.76%  test_prg         test_prg          [.] test_thread+0x28    [.] test_thread+0x18
     2.73%  test_prg         test_prg          [.] test_thread+0xe     [.] test_thread+0x30
     2.67%  test_prg         test_prg          [.] test_thread+0x30    [.] test_thread+0x10
     2.67%  test_prg         test_prg          [.] test_thread+0x16    [.] test_thread+0x28
     2.46%  test_prg         test_prg          [.] test_thread+0x10    [.] test_thread+0x2e
     2.44%  test_prg         test_prg          [.] test_thread+0x2a    [.] test_thread+0x16
     2.38%  test_prg         test_prg          [.] test_thread+0x14    [.] test_thread+0x2a
     2.32%  test_prg         test_prg          [.] test_thread+0x2e    [.] test_thread+0x12
     2.28%  test_prg         test_prg          [.] test_thread+0x12    [.] test_thread+0x2c
     2.16%  test_prg         test_prg          [.] test_thread+0x2c    [.] test_thread+0x14
     0.02%  test_prg         [kernel.vmlinux]  [k] asm_sysvec_apic_ti+0x5  [k] error_entry

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Link: http://lore.kernel.org/lkml/20220208211637.2221872-13-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
tools/perf/util/hist.c
tools/perf/util/hist.h
tools/perf/util/sort.c
tools/perf/util/sort.h