From: Ian Rogers
Date: Wed, 5 Jan 2022 06:13:06 +0000 (-0800)
Subject: perf stat: Correct aggregation CPU map
X-Git-Tag: v6.1-rc5~2191^2~62
X-Git-Url: http://review.tizen.org/git/?a=commitdiff_plain;h=01843ca0197783d0951a1948ebeaaed9a47ce55d;p=platform%2Fkernel%2Flinux-starfive.git

perf stat: Correct aggregation CPU map

Switch the perf_cpu_map in aggr_update_shadow from the evlist to the
counter's cpu map, so the index is appropriate. This addresses a
problem where uncore counts, with a cpumap like:

  $ cat /sys/devices/uncore_imc_0/cpumask
  0,18

were aggregated based on the index of those values in the cpumap
(0 and 1) rather than on the actual CPU (0 and 18). This corrects
metric calculations in per-socket mode for counters without a full
cpumask.

On a SkylakeX, with the DRAM_BW_Use metric tweaked to remove
unnecessary scaling, this gives:

Before:

  $ /perf stat --per-socket -M DRAM_BW_Use -I 1000
     1.001102293 S0  1          27.01 MiB  uncore_imc/cas_count_write/ #   103.00 DRAM_BW_Use
     1.001102293 S0  1          30.22 MiB  uncore_imc/cas_count_read/
     1.001102293 S0  1  1,001,102,293 ns   duration_time
     1.001102293 S1  1          20.10 MiB  uncore_imc/cas_count_write/ #     0.00 DRAM_BW_Use
     1.001102293 S1  1          32.74 MiB  uncore_imc/cas_count_read/
     1.001102293 S1  0                ns   duration_time
     2.003517973 S0  1          83.04 MiB  uncore_imc/cas_count_write/ #   920.00 DRAM_BW_Use
     2.003517973 S0  1         145.95 MiB  uncore_imc/cas_count_read/
     2.003517973 S0  1  1,002,415,680 ns   duration_time
     2.003517973 S1  1         302.45 MiB  uncore_imc/cas_count_write/ #     0.00 DRAM_BW_Use
     2.003517973 S1  1         290.99 MiB  uncore_imc/cas_count_read/
     2.003517973 S1  0                ns   duration_time

After:

  $ perf stat --per-socket -M DRAM_BW_Use -I 1000
     1.001080840 S0  1          24.96 MiB  uncore_imc/cas_count_write/ #    54.00 DRAM_BW_Use
     1.001080840 S0  1          33.64 MiB  uncore_imc/cas_count_read/
     1.001080840 S0  1  1,001,080,840 ns   duration_time
     1.001080840 S1  1          42.43 MiB  uncore_imc/cas_count_write/ #    84.00 DRAM_BW_Use
     1.001080840 S1  1          47.05 MiB  uncore_imc/cas_count_read/
     1.001080840 S1  0                ns   duration_time

Signed-off-by: Ian Rogers
Tested-by: John Garry
Cc: Alexander Shishkin
Cc: Andi Kleen
Cc: Ingo Molnar
Cc: James Clark
Cc: Jiri Olsa
Cc: Kajol Jain
Cc: Kan Liang
Cc: Leo Yan
Cc: Mark Rutland
Cc: Mathieu Poirier
Cc: Mike Leach
Cc: Namhyung Kim
Cc: Paul Clarke
Cc: Peter Zijlstra
Cc: Riccardo Mancini
Cc: Stephane Eranian
Cc: Suzuki Poulouse
Cc: Vineet Singh
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: zhengjun.xing@intel.com
Link: https://lore.kernel.org/r/20220105061351.120843-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo
---

diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index 5886010..b0fa81f 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -526,7 +526,7 @@ static void aggr_update_shadow(struct perf_stat_config *config,
 	evlist__for_each_entry(evlist, counter) {
 		val = 0;
 		for (cpu = 0; cpu < evsel__nr_cpus(counter); cpu++) {
-			s2 = config->aggr_get_id(config, evlist->core.cpus, cpu);
+			s2 = config->aggr_get_id(config, evsel__cpus(counter), cpu);
 			if (!cpu_map__compare_aggr_cpu_id(s2, id))
 				continue;
 			val += perf_counts(counter->counts, cpu, 0)->val;
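
Note: the effect of resolving a counter index against the wrong CPU map can be illustrated with a small standalone sketch. This is not perf code. Every `toy_` name is hypothetical, the topology (36 CPUs, 18 per socket) is an assumption chosen to match a cpumask of 0,18, and the count values are arbitrary.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a CPU map: an ordered list of CPU numbers. */
struct toy_cpu_map {
	const int *cpus;
	size_t nr;
};

/* Assumed topology: CPUs 0-17 on socket 0, CPUs 18-35 on socket 1. */
#define TOY_CPUS_PER_SOCKET 18

/* Map containing every CPU, playing the role of the evlist-wide map. */
static const int toy_all_cpus[36] = {
	 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11,
	12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23,
	24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35,
};
static const struct toy_cpu_map toy_all = { toy_all_cpus, 36 };

/* Map for an uncore counter whose cpumask is "0,18". */
static const int toy_imc_cpus[2] = { 0, 18 };
static const struct toy_cpu_map toy_imc = { toy_imc_cpus, 2 };

/* Resolve an index within a map to the socket of the CPU at that index. */
static int toy_socket_of_idx(const struct toy_cpu_map *map, size_t idx)
{
	assert(idx < map->nr);
	return map->cpus[idx] / TOY_CPUS_PER_SOCKET;
}

/*
 * Sum the per-index counts that belong to `socket`. The counts array is
 * indexed by position in the counter's own map, so the result is only
 * meaningful when `map` is that same map.
 */
static long toy_sum_for_socket(const struct toy_cpu_map *map,
			       const long *counts, size_t nr_counts,
			       int socket)
{
	long val = 0;

	for (size_t idx = 0; idx < nr_counts; idx++) {
		if (toy_socket_of_idx(map, idx) == socket)
			val += counts[idx];
	}
	return val;
}
```

With two arbitrary counts, one per entry in `toy_imc`, resolving the indexes through `toy_all` treats index 1 as CPU 1 and attributes both counts to socket 0. Resolving them through `toy_imc` treats index 1 as CPU 18 and attributes one count to each socket, which is the behaviour the one-line change above aims for.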