tsan: make obtaining current PC faster
author Dmitry Vyukov <dvyukov@google.com>
Thu, 15 Jul 2021 08:51:32 +0000 (10:51 +0200)
committer Dmitry Vyukov <dvyukov@google.com>
Mon, 19 Jul 2021 11:05:30 +0000 (13:05 +0200)
commit baa7f58973d47e99c663860e4c2c3d55505f2bc7
tree fb183a51a8749ce4de8274dc63066bb15abab412
parent 94e0975450daa57060248c2231ea8bf902b3e86a

We obtain the current PC in all interceptors, and collectively the
common interceptor code contributes to the overall slowdown
(in particular for cheaper str/mem* functions).

The current way to obtain the current PC involves:

  4493e1:       e8 3a f3 fe ff          callq  438720 <_ZN11__sanitizer10StackTrace12GetCurrentPcEv>
  4493e9:       48 89 c6                mov    %rax,%rsi

and the called function is:

uptr StackTrace::GetCurrentPc() {
  438720:       48 8b 04 24             mov    (%rsp),%rax
  438724:       c3                      retq

The new way uses the address of a local label and involves just:

  44a888:       48 8d 35 fa ff ff ff    lea    -0x6(%rip),%rsi
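
As an illustration only, the two approaches can be sketched as follows.
The names SKETCH_CURRENT_PC, CallBasedPc and LabelBasedPc are hypothetical
and this is not necessarily the definition of GET_CURRENT_PC in
sanitizer_stacktrace.h. The sketch assumes GCC or Clang, since it relies on
the statement-expression, __label__ and &&label extensions and on
__builtin_return_address:

```cpp
#include <cassert>
#include <cstdint>

// Old way (sketch): an out-of-line function that returns its own return
// address. Every use costs a CALL plus a load from the stack.
__attribute__((noinline)) std::uintptr_t CallBasedPc() {
  return reinterpret_cast<std::uintptr_t>(__builtin_return_address(0));
}

// New way (sketch): take the address of a block-local label. No call is
// emitted; on x86-64 the compiler can materialize the address with a
// single RIP-relative LEA. The label is declared with __label__ so the
// macro can be expanded more than once in the same function.
#define SKETCH_CURRENT_PC()                          \
  (__extension__({                                   \
    __label__ here;                                  \
    std::uintptr_t pc;                               \
  here:                                              \
    pc = reinterpret_cast<std::uintptr_t>(&&here);   \
    pc;                                              \
  }))

// Hypothetical wrapper so the macro can be exercised from a test.
std::uintptr_t LabelBasedPc() { return SKETCH_CURRENT_PC(); }

// Minimal sanity check: both approaches yield a non-null code address.
void CheckPcSketch() {
  assert(CallBasedPc() != 0);
  assert(LabelBasedPc() != 0);
}
```

The label emits no instruction of its own, so the resulting address is
wherever the compiler happens to place the label within the function.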

I am not switching all uses of StackTrace::GetCurrentPc to GET_CURRENT_PC
because it may lead to some differences in produced reports and break tests.
The difference comes from the fact that currently we have the PC pointing
to the CALL instruction, but the new way does not yield any code on its own,
so the PC points to an arbitrary instruction in the function, and symbolizing
that instruction can produce additional inlined frames (if the
instruction happens to relate to some inlined function).

Reviewed By: vitalybuka, melver

Differential Revision: https://reviews.llvm.org/D106046
compiler-rt/lib/sanitizer_common/sanitizer_stacktrace.h
compiler-rt/lib/sanitizer_common/tests/sanitizer_stacktrace_test.cpp
compiler-rt/lib/tsan/rtl/tsan_interceptors.h