[clang][deps] Prune unused header search paths
author Jan Svoboda <jan_svoboda@apple.com>
Tue, 12 Oct 2021 10:23:38 +0000 (12:23 +0200)
committer Jan Svoboda <jan_svoboda@apple.com>
Tue, 12 Oct 2021 10:39:23 +0000 (12:39 +0200)
commit 6a1f50b84ae8f8a8087fcdbe5f27dae8c76878f1
tree 64fda556755e7e3c690cf3395f71face67464267
parent e19bbd0fa2a577dca21cab940719115a30dd1809

To reduce the number of explicit builds of a single module, we can try to squash multiple occurrences of the module with different command-lines (and context hashes) by removing benign command-line options. The greatest contributors to benign differences between command-lines are the header search paths.

In this patch, the lookup cache in `HeaderSearch` is used to identify paths that were actually used when implicitly building the module during scanning. This information is serialized into the unhashed control block of the implicitly-built PCM. The dependency scanner then loads this and may use it to prune the header search paths before computing the context hash of the module and generating the command-line.
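
The pruning step described above can be illustrated with a small standalone sketch. This is not the code from the patch: `SearchEntry`, `pruneUnused`, and `contextHash` are hypothetical names, the usage vector stands in for the information read back from the PCM, and the hash is a toy substitute for the real context hash. It only shows why two command lines that differ in an unused search path collapse to the same hash once pruned.

```cpp
// Illustrative sketch only; not the actual Clang implementation.
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for one header search entry (e.g. one -I argument).
struct SearchEntry {
  std::string Path;
};

// Keep only the entries whose usage bit is set. Usage[I] plays the role of
// "entry I was actually used during the implicit build", as loaded from the
// PCM. Entries without a corresponding bit are treated as unused.
std::vector<SearchEntry> pruneUnused(const std::vector<SearchEntry> &Entries,
                                     const std::vector<bool> &Usage) {
  std::vector<SearchEntry> Kept;
  for (std::size_t I = 0; I < Entries.size(); ++I)
    if (I < Usage.size() && Usage[I])
      Kept.push_back(Entries[I]);
  return Kept;
}

// Toy order-sensitive hash over the search paths (assumption: the real
// context hash covers many more options than just these paths).
std::size_t contextHash(const std::vector<SearchEntry> &Entries) {
  std::size_t H = 0;
  for (const SearchEntry &E : Entries)
    H = H * 31 + std::hash<std::string>{}(E.Path);
  return H;
}
```

With search paths `{begin, a, end}` and `{begin, b, end}` and a usage vector marking only the first and last entries as used, both lists prune to `{begin, end}` and hash identically, so a single explicit build of the module would suffice in this toy model.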

We could also prune the header search paths when serializing `HeaderSearchOptions` into the PCM. That way, we would do it only once instead of on every load of the PCM file by the dependency scanner. However, that would result in a PCM file whose contents don't produce the same context hash as the original build, which would probably be highly surprising.

There is an alternative to storing extra information in the PCM: wire up preprocessor callbacks to capture the used header search paths on-the-fly during preprocessing of modularized headers (similar to what we currently do for the main source file and textual headers). Right now, that's not compatible with the fact that we do an actual implicit build producing PCM files during dependency scanning. A second run of the dependency scanner loads the PCM from the first run, skipping the preprocessing altogether, so the callbacks would never fire and the two runs would produce different results. We can revisit this approach when we stop building implicitly during dependency scanning.

Depends on D102923.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D102488
19 files changed:
clang/include/clang/Serialization/ASTBitCodes.h
clang/include/clang/Serialization/ModuleFile.h
clang/include/clang/Tooling/DependencyScanning/DependencyScanningService.h
clang/include/clang/Tooling/DependencyScanning/DependencyScanningWorker.h
clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
clang/lib/Serialization/ASTReader.cpp
clang/lib/Serialization/ASTWriter.cpp
clang/lib/Tooling/DependencyScanning/DependencyScanningService.cpp
clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
clang/test/ClangScanDeps/Inputs/header-search-pruning/a/a.h [new file with mode: 0644]
clang/test/ClangScanDeps/Inputs/header-search-pruning/b/b.h [new file with mode: 0644]
clang/test/ClangScanDeps/Inputs/header-search-pruning/begin/begin.h [new file with mode: 0644]
clang/test/ClangScanDeps/Inputs/header-search-pruning/cdb.json [new file with mode: 0644]
clang/test/ClangScanDeps/Inputs/header-search-pruning/end/end.h [new file with mode: 0644]
clang/test/ClangScanDeps/Inputs/header-search-pruning/mod.h [new file with mode: 0644]
clang/test/ClangScanDeps/Inputs/header-search-pruning/module.modulemap [new file with mode: 0644]
clang/test/ClangScanDeps/header-search-pruning.cpp [new file with mode: 0644]
clang/tools/clang-scan-deps/ClangScanDeps.cpp