[LLVM][LTO][LLD] Enable Profile Guided Layout (--call-graph-profile-sort) for FullLTO
authorBen Dunbobbin <Ben.Dunbobbin@sony.com>
Thu, 30 Jun 2022 22:01:30 +0000 (23:01 +0100)
committerBen Dunbobbin <Ben.Dunbobbin@sony.com>
Fri, 1 Jul 2022 12:57:36 +0000 (13:57 +0100)
The CGProfilePass needs to be run during FullLTO compilation at link
time to emit the .llvm.call-graph-profile section to the compiled LTO
object file. Currently, it is being run only during the initial
LTO-prelink compilation stage (to produce the bitcode files to be
consumed by the linker) and so the section is not produced.

ThinLTO is not affected because:
- For ThinLTO-prelink compilation the CGProfilePass pass is not run
  because ThinLTO-prelink passes are added via
  buildThinLTOPreLinkDefaultPipeline. Normal and FullLTO-prelink
  passes are both added via buildPerModuleDefaultPipeline which uses
  the LTOPreLink parameter to customize its behavior for the
  FullLTO-prelink pass differences.
- ThinLTO backend compilation phase adds the CGProfilePass (see:
  buildModuleOptimizationPipeline).

Adjust when the pass is run so that the .llvm.call-graph-profile
section is produced correctly for FullLTO.

Fixes #56185 (https://github.com/llvm/llvm-project/issues/56185)

clang/test/CodeGen/thinlto-distributed-newpm.ll
llvm/lib/Passes/PassBuilderPipelines.cpp
llvm/test/Other/new-pm-defaults.ll
llvm/test/Other/new-pm-lto-defaults.ll
llvm/test/Other/new-pm-thinlto-defaults.ll
llvm/test/Other/new-pm-thinlto-postlink-pgo-defaults.ll
llvm/test/Other/new-pm-thinlto-postlink-samplepgo-defaults.ll

index 6d42e9e..463fc52 100644 (file)
 ; CHECK-O: Running pass: InstSimplifyPass on main
 ; CHECK-O: Running pass: DivRemPairsPass on main
 ; CHECK-O: Running pass: SimplifyCFGPass on main
-; CHECK-O: Running pass: CGProfilePass
 ; CHECK-O: Running pass: GlobalDCEPass
 ; CHECK-O: Running pass: ConstantMergePass
+; CHECK-O: Running pass: CGProfilePass
 
 target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
 target triple = "x86_64-grtev4-linux-gnu"
index 8e0f0cb..a534517 100644 (file)
@@ -1278,9 +1278,6 @@ PassBuilder::buildModuleOptimizationPipeline(OptimizationLevel Level,
   if (PTO.MergeFunctions)
     MPM.addPass(MergeFunctionsPass());
 
-  if (PTO.CallGraphProfile)
-    MPM.addPass(CGProfilePass());
-
   // Now we need to do some global optimization transforms.
   // FIXME: It would seem like these should come first in the optimization
   // pipeline and maybe be the bottom of the canonicalization pipeline? Weird
@@ -1288,6 +1285,9 @@ PassBuilder::buildModuleOptimizationPipeline(OptimizationLevel Level,
   MPM.addPass(GlobalDCEPass());
   MPM.addPass(ConstantMergePass());
 
+  if (PTO.CallGraphProfile && !LTOPreLink)
+    MPM.addPass(CGProfilePass());
+
   // TODO: Relative look table converter pass caused an issue when full lto is
   // enabled. See https://reviews.llvm.org/D94355 for more details.
   // Until the issue fixed, disable this pass during pre-linking phase.
@@ -1748,6 +1748,9 @@ PassBuilder::buildLTODefaultPipeline(OptimizationLevel Level,
   if (PTO.MergeFunctions)
     MPM.addPass(MergeFunctionsPass());
 
+  if (PTO.CallGraphProfile)
+    MPM.addPass(CGProfilePass());
+
   for (auto &C : FullLinkTimeOptimizationLastEPCallbacks)
     C(MPM, Level);
 
index 04b57fe..8dd82c4 100644 (file)
 ; CHECK-IR-OUTLINER-NEXT: Running pass: IROutlinerPass
 ; CHECK-IR-OUTLINER-NEXT: Running analysis: IRSimilarityAnalysis
 ; CHECK-MERGE-FUNCS-NEXT: Running pass: MergeFunctionsPass
-; CHECK-O-NEXT: Running pass: CGProfilePass
 ; CHECK-O-NEXT: Running pass: GlobalDCEPass
 ; CHECK-O-NEXT: Running pass: ConstantMergePass
+; CHECK-DEFAULT-NEXT: Running pass: CGProfilePass
 ; CHECK-DEFAULT-NEXT: Running pass: RelLookupTableConverterPass
 ; CHECK-LTO-NOT: Running pass: RelLookupTableConverterPass
 ; CHECK-O-NEXT: Running pass: AnnotationRemarksPass on foo
index 2e82698..da66e71 100644 (file)
 ; CHECK-O23SZ-NEXT: Running pass: SimplifyCFGPass
 ; CHECK-O23SZ-NEXT: Running pass: EliminateAvailableExternallyPass
 ; CHECK-O23SZ-NEXT: Running pass: GlobalDCEPass
+; CHECK-O23SZ-NEXT: Running pass: CGProfilePass
 ; CHECK-EP-NEXT: Running pass: NoOpModulePass
 ; CHECK-O-NEXT: Running pass: AnnotationRemarksPass on foo
 ; CHECK-O-NEXT: Running pass: PrintModulePass
index a11c9f7..ea7cf2d 100644 (file)
 ; CHECK-POSTLINK-O-NEXT: Running pass: InstSimplifyPass
 ; CHECK-POSTLINK-O-NEXT: Running pass: DivRemPairsPass
 ; CHECK-POSTLINK-O-NEXT: Running pass: SimplifyCFGPass
-; CHECK-POSTLINK-O-NEXT: Running pass: CGProfilePass
 ; CHECK-POSTLINK-O-NEXT: Running pass: GlobalDCEPass
 ; CHECK-POSTLINK-O-NEXT: Running pass: ConstantMergePass
+; CHECK-POSTLINK-O-NEXT: Running pass: CGProfilePass
 ; CHECK-POSTLINK-O-NEXT: Running pass: RelLookupTableConverterPass
 ; CHECK-O-NEXT:          Running pass: AnnotationRemarksPass on foo
 ; CHECK-PRELINK-O-NEXT: Running pass: CanonicalizeAliasesPass
index dfb19c4..7857238 100644 (file)
 ; CHECK-O-NEXT: Running pass: InstSimplifyPass
 ; CHECK-O-NEXT: Running pass: DivRemPairsPass
 ; CHECK-O-NEXT: Running pass: SimplifyCFGPass
-; CHECK-O-NEXT: Running pass: CGProfilePass
 ; CHECK-O-NEXT: Running pass: GlobalDCEPass
 ; CHECK-O-NEXT: Running pass: ConstantMergePass
+; CHECK-O-NEXT: Running pass: CGProfilePass
 ; CHECK-O-NEXT: Running pass: RelLookupTableConverterPass
 ; CHECK-O-NEXT: Running pass: AnnotationRemarksPass on foo
 ; CHECK-O-NEXT: Running pass: PrintModulePass
index 072ea54..a90fad5 100644 (file)
 ; CHECK-O-NEXT: Running pass: InstSimplifyPass
 ; CHECK-O-NEXT: Running pass: DivRemPairsPass
 ; CHECK-O-NEXT: Running pass: SimplifyCFGPass
-; CHECK-O-NEXT: Running pass: CGProfilePass
 ; CHECK-O-NEXT: Running pass: GlobalDCEPass
 ; CHECK-O-NEXT: Running pass: ConstantMergePass
+; CHECK-O-NEXT: Running pass: CGProfilePass
 ; CHECK-O-NEXT: Running pass: RelLookupTableConverterPass
 ; CHECK-O-NEXT: Running pass: AnnotationRemarksPass on foo
 ; CHECK-O-NEXT: Running pass: PrintModulePass