arm64: perf: Implement correct cap_user_time
author Peter Zijlstra <peterz@infradead.org>
Thu, 16 Jul 2020 05:11:26 +0000 (13:11 +0800)
committer Will Deacon <will@kernel.org>
Mon, 20 Jul 2020 10:50:47 +0000 (11:50 +0100)
commit 950b74ddefc4a42add8b1ae0170aa309338ffe73
tree 3b91c41219f8da64f6eff9b1b74979d0b37cdcbf
parent aadd6e5caaacd6feca9691ba30536e7de5a7d152

As reported by Leo, the existing implementation is broken when the
clock and counter don't intersect at 0.

Use the sched_clock's struct clock_read_data information to correctly
implement cap_user_time and cap_user_time_zero.
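To illustrate what user space does with these values, here is a minimal
sketch of the cycle-to-nanosecond conversion documented for the perf mmap
page (time_zero, time_mult, time_shift). The struct user_time_params and
the function cyc_to_perf_ns() are hypothetical names introduced for this
example; they are not part of this patch or of the perf ABI.

```c
#include <stdint.h>

/*
 * Hypothetical subset of the values perf publishes to user space.
 * Field names and widths follow the perf mmap page; the struct itself
 * is an assumption made for this sketch.
 */
struct user_time_params {
	uint64_t time_zero;   /* clock value corresponding to counter value 0 */
	uint32_t time_mult;   /* multiplier of the cycles-to-ns scale */
	uint16_t time_shift;  /* right shift of the cycles-to-ns scale */
};

/*
 * Convert a raw counter value to a perf timestamp. Splitting cyc into
 * quotient and remainder keeps the intermediate products small.
 */
static uint64_t cyc_to_perf_ns(const struct user_time_params *p, uint64_t cyc)
{
	uint64_t quot = cyc >> p->time_shift;
	uint64_t rem = cyc & (((uint64_t)1 << p->time_shift) - 1);

	return p->time_zero + quot * p->time_mult +
	       ((rem * p->time_mult) >> p->time_shift);
}
```

When the clock and the counter do not intersect at 0, time_zero carries the
offset between them. It may be a "negative" value stored in an unsigned
64-bit field, which works because the additions are modular.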

Note that the ARM64 counter is architecturally only guaranteed to be
56 bits wide (implementations are allowed to be wider) and the existing
perf ABI cannot deal with wrap-around.

This implementation should also be faster than the old one, since we
don't need to recompute mult and shift all the time.

[leoyan: Use mul_u64_u32_shr() to convert cyc to ns to avoid overflow]
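The overflow mentioned in the note above can be shown in user space. The
sketch below contrasts a naive (cyc * mult) >> shift, whose 64-bit product
can wrap, with a split multiplication that keeps the full result. The
function mul_u64_u32_shr_sketch() is a hypothetical stand-in written for
this example, assuming shift <= 32; it is not the kernel's own helper.

```c
#include <stdint.h>

/* Naive conversion: the 64-bit product silently wraps for large cyc. */
static uint64_t naive_cyc_to_ns(uint64_t cyc, uint32_t mult, unsigned int shift)
{
	return (cyc * mult) >> shift;
}

/*
 * Hypothetical stand-in for mul_u64_u32_shr(): computes (a * mul) >> shift
 * by multiplying the two 32-bit halves of a separately, so that no
 * intermediate product exceeds 64 bits. Assumes shift <= 32.
 */
static uint64_t mul_u64_u32_shr_sketch(uint64_t a, uint32_t mul,
				       unsigned int shift)
{
	uint64_t lo = (uint64_t)(uint32_t)a * mul;  /* low half times mul */
	uint64_t hi = (a >> 32) * mul;              /* high half times mul */

	return (lo >> shift) + (hi << (32 - shift));
}
```

For example, with cyc = 2^40, mult = 2^30 and shift = 10, the exact product
is 2^70, so the naive version wraps while the split version returns 2^60.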

Reported-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Link: https://lore.kernel.org/r/20200716051130.4359-4-leo.yan@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
arch/arm64/kernel/perf_event.c