perf: pmu fine-tune for aarch32/64 of A53/A55/A73 [1/1]
authorHanjie Lin <hanjie.lin@amlogic.com>
Tue, 27 Aug 2019 08:21:46 +0000 (16:21 +0800)
committerJianxin Pan <jianxin.pan@amlogic.com>
Wed, 18 Sep 2019 05:35:56 +0000 (22:35 -0700)
commit270cbf3b28a98f2071f54862e7cf967e52e98356
tree5047820d793ec85ecc8a4cff68582378e4ba2011
parented434f226ee7db58064e22d9c7d613c39867a182
perf: pmu fine-tune for aarch32/64 of A53/A55/A73 [1/1]

PD#SWPL-13243

Problem:
pmu event is not accurate or not complete in A53/A55/A73.

Solution:
1, modify event config for A53/A55/A73.
2, perf executable file must compiled from latest kernel(5.1+)
3, A55 events are most complete, A73 are least complete(eg: less ld_retired/st_retired/stall/prefetch events)
4, A55/A53 same event meanings simlar, but A73 is more different(eg: L1/L2 dcache/icache loads meanings)

sample commands:
a55 arm64:
perf stat -e task-clock,context-switches,cpu-migrations,page-faults,instructions,armv8_pmuv3/ld_retired/,armv8_pmuv3/st_retired/,cycles,branch-loads,branch-load-misses,armv8_pmuv3/a55_l1d_cache_rd/,armv8_pmuv3/a55_l1d_cache_refill_rd/,armv8_pmuv3/a55_l1d_cache_wr/,armv8_pmuv3/a55_l1d_cache_refill_wr/,L1-icache-loads,L1-icache-load-misses,armv8_pmuv3/a55_l2d_cache_rd/,armv8_pmuv3/a55_l2d_cache_refill_rd/,armv8_pmuv3/a55_l1d_cache_refill_inner/,armv8_pmuv3/a55_l1d_cache_refill_outer/,armv8_pmuv3/a55_l1d_cache_refill_prefetch/,armv8_pmuv3/a55_l2d_cache_refill_prefetch/,armv8_pmuv3/a5x_stall_frontend_cache/,armv8_pmuv3/a5x_stall_frontend_tlb/,armv8_pmuv3/a5x_stall_backend_ld/,armv8_pmuv3/a55_stall_backend_ld_cache/,armv8_pmuv3/a55_stall_backend_ld_tlb/,armv8_pmuv3/a5x_stall_backend_st/,armv8_pmuv3/a5x_stall_backend_ilock_agu/,armv8_pmuv3/a5x_stall_backend_ilock_fpu/ ls

a53 arm64:
perf stat -e task-clock,context-switches,cpu-migrations,page-faults,instructions,armv8_pmuv3/ld_retired/,armv8_pmuv3/st_retired/,cycles,branch-loads,branch-load-misses,armv8_pmuv3/l1d_cache/,armv8_pmuv3/l1d_cache_refill/,L1-icache-loads,L1-icache-load-misses,armv8_pmuv3/a5x_l2d_cache/,armv8_pmuv3/a5x_l2d_cache_refill/,armv8_pmuv3/a53_cache_refill_prefetch/,armv8_pmuv3/a53_scu_snooped/,armv8_pmuv3/a5x_stall_frontend_cache/,armv8_pmuv3/a5x_stall_frontend_tlb/,armv8_pmuv3/a5x_stall_backend_ld/,,armv8_pmuv3/a5x_stall_backend_st/,armv8_pmuv3/a5x_stall_backend_ilock_agu/,armv8_pmuv3/a5x_stall_backend_ilock_fpu/ ls

a73 arm64: (w400 bind to a73 cpu2)
perf stat -e task-clock,context-switches,cpu-migrations,page-faults,instructions,cycles,branch-loads,branch-load-misses,armv8_pmuv3/l1d_cache/,armv8_pmuv3/l1d_cache_refill/,armv8_pmuv3/a55_l1d_cache_rd/,armv8_pmuv3/a55_l1d_cache_wr/,armv8_pmuv3/a5x_l2d_cache/,armv8_pmuv3/a5x_l2d_cache_refill/,armv8_pmuv3/a55_l2d_cache_rd/,armv8_pmuv3/a55_l2d_cache_wr/ busybox taskset 4 ls

a55 arm:
perf stat -e task-clock,context-switches,cpu-migrations,page-faults,instructions,armv7_cortex_a15/ld_retired/,armv7_cortex_a15/st_retired/,cycles,branch-loads,branch-load-misses,armv7_cortex_a15/a55_l1d_cache_rd/,armv7_cortex_a15/a55_l1d_cache_refill_rd/,armv7_cortex_a15/a55_l1d_cache_wr/,armv7_cortex_a15/a55_l1d_cache_refill_wr/,L1-icache-loads,L1-icache-load-misses,armv7_cortex_a15/a55_l2d_cache_rd/,armv7_cortex_a15/a55_l2d_cache_refill_rd/,armv7_cortex_a15/a55_l1d_cache_refill_inner/,armv7_cortex_a15/a55_l1d_cache_refill_outer/,armv7_cortex_a15/a55_l1d_cache_refill_prefetch/,armv7_cortex_a15/a55_l2d_cache_refill_prefetch/,armv7_cortex_a15/a5x_stall_frontend_cache/,armv7_cortex_a15/a5x_stall_frontend_tlb/,armv7_cortex_a15/a5x_stall_backend_ld/,armv7_cortex_a15/a55_stall_backend_ld_cache/,armv7_cortex_a15/a55_stall_backend_ld_tlb/,armv7_cortex_a15/a5x_stall_backend_st/,armv7_cortex_a15/a5x_stall_backend_ilock_agu/,armv7_cortex_a15/a5x_stall_backend_ilock_fpu/ ls

a53 arm:
perf stat -e task-clock,context-switches,cpu-migrations,page-faults,instructions,armv7_cortex_a15/ld_retired/,armv7_cortex_a15/st_retired/,cycles,branch-loads,branch-load-misses,armv7_cortex_a15/l1d_cache/,armv7_cortex_a15/l1d_cache_refill/,L1-icache-loads,L1-icache-load-misses,armv7_cortex_a15/a5x_l2d_cache/,armv7_cortex_a15/a5x_l2d_cache_refill/,armv7_cortex_a15/a53_cache_refill_prefetch/,armv7_cortex_a15/a53_scu_snooped/,armv7_cortex_a15/a5x_stall_frontend_cache/,armv7_cortex_a15/a5x_stall_frontend_tlb/,armv7_cortex_a15/a5x_stall_backend_ld/,armv7_cortex_a15/a5x_stall_backend_st/,armv7_cortex_a15/a5x_stall_backend_ilock_agu/,armv7_cortex_a15/a5x_stall_backend_ilock_fpu/ ls

a73 arm: (w400 bind to a73 cpu2)
perf stat -e task-clock,context-switches,cpu-migrations,page-faults,instructions,cycles,branch-loads,branch-load-misses,armv7_cortex_a15/l1d_cache/,armv7_cortex_a15/l1d_cache_refill/,armv7_cortex_a15/a55_l1d_cache_rd/,armv7_cortex_a15/a55_l1d_cache_wr/,armv7_cortex_a15/a5x_l2d_cache/,armv7_cortex_a15/a5x_l2d_cache_refill/,armv7_cortex_a15/a55_l2d_cache_rd/,armv7_cortex_a15/a55_l2d_cache_wr/ busybox taskset 4 ls

Verify:
ac200/u200/w400

Change-Id: I7f11e1480c3c27d016b011d2a84c33e824f69b08
Signed-off-by: Hanjie Lin <hanjie.lin@amlogic.com>
arch/arm/kernel/perf_event_v7.c
arch/arm64/kernel/perf_event.c