[CaptureTracking] Avoid long compilation time on large basic blocks
authorBruno Cardoso Lopes <bruno.cardoso@gmail.com>
Wed, 24 Jun 2015 17:53:17 +0000 (17:53 +0000)
committerBruno Cardoso Lopes <bruno.cardoso@gmail.com>
Wed, 24 Jun 2015 17:53:17 +0000 (17:53 +0000)
commit7900ef891fc8354efeb3f010b4e7a581e81ba5c9
treeca9789e203e5bebaf0c1fe0367eb21bf5f201e57
parent12b554e6a7857795a8b0a701d17fed1a2863ebd9
[CaptureTracking] Avoid long compilation time on large basic blocks

CaptureTracking becomes very expensive in large basic blocks while
calling PointerMayBeCaptured. PointerMayBeCaptured scans the BB the
number of times equal to the number of uses of 'BeforeHere', which is
currently capped at 20 and bails out with Tracker->tooManyUses().

The bottleneck here is the number of calls to PointerMayBeCaptured * the
basic block scan. In a testcase with a 82k instruction BB,
PointerMayBeCaptured is called 130k times, leading to 'shouldExplore'
taking 527k runs, this currently takes ~12min.

To fix this we locally (within PointerMayBeCaptured) number the
instructions in the basic block using a DenseMap to cache instruction
positions/numbers. We build the cache incrementally every time we need
to scan an unexplored part of the BB, improving compile time to only
take ~2min.

This triggers in the flow: DeadStoreElimination -> MepDepAnalysis ->
CaptureTracking.

Side note: after multiple runs in the test-suite I've seen no
performance nor compile time regressions, but could note a couple of
compile time improvements:

Performance Improvements - Compile Time Delta Previous  Current StdDev
SingleSource/Benchmarks/Misc-C++/bigfib -4.48%  0.8547  0.8164  0.0022
MultiSource/Benchmarks/TSVC/LoopRerolling-dbl/LoopRerolling-dbl -1.47% 1.3912  1.3707  0.0056

Differential Revision: http://reviews.llvm.org/D7010

llvm-svn: 240560
llvm/include/llvm/Analysis/CFG.h
llvm/lib/Analysis/CFG.cpp
llvm/lib/Analysis/CaptureTracking.cpp