[Clang][CMake] Use perf-training for Clang-BOLT
author Amir Ayupov <aaupov@fb.com>
Sat, 13 May 2023 17:34:50 +0000 (10:34 -0700)
committer Amir Aupov <amir.aupov@gmail.com>
Sat, 13 May 2023 17:36:29 +0000 (10:36 -0700)
commit 76b2915fdbbba18693c9aabda419768f41106f31
tree 305a8d5eaa42b5acc5ddfda12a4e7a3578aec361
parent c19c248466cc730d46bbf3c3112791872661be5e

Leverage the perf-training flow for BOLT profile collection, enabling
reproducible BOLT optimization. Remove the use of a bootstrapped build for
profile collection.
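
The steps that show up in the build logs (instrument, generate a profile, merge, optimize) can also be sketched outside CMake. The Python helper below only builds and prints command lines; it is an illustration under assumptions, not the logic this commit adds. The flags are taken from the BOLT README, and all paths, the `hello.cpp` training input, and the function names are hypothetical.

```python
import shlex
from pathlib import Path
from typing import List


def instrument_cmd(clang: Path, instrumented: Path, fdata_prefix: Path) -> List[str]:
    """Step 1: emit an instrumented copy of clang that writes .fdata profiles."""
    return [
        "llvm-bolt", str(clang),
        "-o", str(instrumented),
        "-instrument",
        f"--instrumentation-file={fdata_prefix}",
        "--instrumentation-file-append-pid",  # one profile file per process
    ]


def merge_cmd(profiles: List[Path], merged: Path) -> List[str]:
    """Step 3: combine the per-process profiles into a single .fdata file."""
    return ["merge-fdata", "-o", str(merged), *(str(p) for p in sorted(profiles))]


def optimize_cmd(clang: Path, optimized: Path, merged: Path) -> List[str]:
    """Step 4: rewrite the original binary using the merged profile."""
    return [
        "llvm-bolt", str(clang),
        "-o", str(optimized),
        f"-data={merged}",
        "-reorder-blocks=ext-tsp",
        "-reorder-functions=hfsort",
        "-split-functions",
        "-split-all-cold",
        "-dyno-stats",  # print the before/after statistics
    ]


if __name__ == "__main__":
    clang = Path("clang")                   # hypothetical input binary
    inst = Path("clang-bolt.inst")          # hypothetical instrumented binary
    merged = Path("prof.merged.fdata")      # hypothetical merged profile
    steps = [
        instrument_cmd(clang, inst, Path("prof.fdata")),
        # Step 2: run the instrumented compiler on a training input
        # (hello.cpp stands in for the perf-training workload).
        [str(inst), "-c", "hello.cpp", "-o", "hello.o"],
        merge_cmd([Path("prof.fdata.101"), Path("prof.fdata.102")], merged),
        optimize_cmd(clang, Path("clang-bolt"), merged),
    ]
    for step in steps:
        print(shlex.join(step))
```

Nothing is executed here; the printed lines can be reviewed or pasted into a shell once the paths match a real build tree.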

Test Plan:
- Regular (single-stage) build
```
$ cmake ... -C .../clang/cmake/caches/BOLT.cmake
$ ninja clang-bolt
...
[21/24] Instrumenting clang binary with BOLT
[21/24] Generating BOLT profile for Clang
[23/24] Merging BOLT fdata
Profile from 2 files merged.
[24/24] Optimizing Clang with BOLT
...
          1291202496 : executed instructions (-1.1%)
            27005133 : taken branches (-71.5%)
...
```
- Two-stage build (ThinLTO+InstPGO)
```
$ cmake ... -C .../clang/cmake/caches/BOLT.cmake -C .../clang/cmake/caches/BOLT-PGO.cmake
$ ninja clang-bolt
$ ninja stage2-clang-bolt
...
[2756/2759] Instrumenting clang binary with BOLT
[2756/2759] Generating BOLT profile for Clang
[2758/2759] Merging BOLT fdata
[2759/2759] Optimizing Clang with BOLT
...
BOLT-INFO: 7092 out of 184104 functions in the binary (3.9%) have non-empty execution profile
           756531927 : executed instructions (-0.5%)
            15399400 : taken branches (-40.3%)
...
```
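
To put the percentages in the logs above into absolute terms, the dyno-stats lines can be parsed and the pre-optimization counts estimated. The sketch below assumes the value in parentheses is the relative change against the input binary, i.e. `after = before * (1 + delta / 100)`; the regular expression and the names `DynoStat` and `parse_dyno_line` are illustrative and not part of BOLT.

```python
import re
from typing import NamedTuple, Optional

# Matches lines such as "  27005133 : taken branches (-71.5%)".
DYNO_LINE = re.compile(r"^\s*(\d+)\s*:\s*(.+?)\s*\(([+-]\d+(?:\.\d+)?)%\)\s*$")


class DynoStat(NamedTuple):
    name: str
    after: int        # count reported for the optimized binary
    delta_pct: float  # relative change printed in parentheses

    @property
    def before(self) -> float:
        """Estimated count before optimization (assumes a relative delta)."""
        return self.after / (1 + self.delta_pct / 100)


def parse_dyno_line(line: str) -> Optional[DynoStat]:
    """Return the parsed statistic, or None if the line is not a dyno stat."""
    m = DYNO_LINE.match(line)
    if m is None:
        return None
    return DynoStat(name=m.group(2), after=int(m.group(1)), delta_pct=float(m.group(3)))


if __name__ == "__main__":
    log = [
        "          1291202496 : executed instructions (-1.1%)",
        "            27005133 : taken branches (-71.5%)",
    ]
    for line in log:
        stat = parse_dyno_line(line)
        if stat is not None:
            print(f"{stat.name}: {stat.before:,.0f} -> {stat.after:,}")
```

Under that assumption, the single-stage run went from roughly 94.8 million taken branches to 27.0 million. Because the logged percentage is rounded to one decimal place, the reconstructed baseline is only approximate.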

Reviewed By: beanz

Differential Revision: https://reviews.llvm.org/D143553
clang/CMakeLists.txt
clang/cmake/caches/BOLT.cmake
clang/utils/perf-training/CMakeLists.txt
clang/utils/perf-training/bolt.lit.cfg [new file with mode: 0644]
clang/utils/perf-training/bolt.lit.site.cfg.in [new file with mode: 0644]