LLVM  9.0.0svn
CGSCCPassManager.h
Go to the documentation of this file.
1 //===- CGSCCPassManager.h - Call graph pass management ----------*- C++ -*-===//
2 //
3 // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
4 // See https://llvm.org/LICENSE.txt for license information.
5 // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
6 //
7 //===----------------------------------------------------------------------===//
8 /// \file
9 ///
10 /// This header provides classes for managing passes over SCCs of the call
11 /// graph. These passes form an important component of LLVM's interprocedural
12 /// optimizations. Because they operate on the SCCs of the call graph, and they
13 /// traverse the graph in post-order, they can effectively do pair-wise
14 /// interprocedural optimizations for all call edges in the program while
15 /// incrementally refining it and improving the context of these pair-wise
16 /// optimizations. At each call site edge, the callee has already been
17 /// optimized as much as is possible. This in turn allows very accurate
18 /// analysis of it for IPO.
19 ///
20 /// A secondary more general goal is to be able to isolate optimization on
21 /// unrelated parts of the IR module. This is useful to ensure our
22 /// optimizations are principled and don't miss oportunities where refinement
23 /// of one part of the module influence transformations in another part of the
24 /// module. But this is also useful if we want to parallelize the optimizations
25 /// across common large module graph shapes which tend to be very wide and have
26 /// large regions of unrelated cliques.
27 ///
28 /// To satisfy these goals, we use the LazyCallGraph which provides two graphs
29 /// nested inside each other (and built lazily from the bottom-up): the call
30 /// graph proper, and a reference graph. The reference graph is super set of
31 /// the call graph and is a conservative approximation of what could through
32 /// scalar or CGSCC transforms *become* the call graph. Using this allows us to
33 /// ensure we optimize functions prior to them being introduced into the call
34 /// graph by devirtualization or other technique, and thus ensures that
35 /// subsequent pair-wise interprocedural optimizations observe the optimized
36 /// form of these functions. The (potentially transitive) reference
37 /// reachability used by the reference graph is a conservative approximation
38 /// that still allows us to have independent regions of the graph.
39 ///
40 /// FIXME: There is one major drawback of the reference graph: in its naive
41 /// form it is quadratic because it contains a distinct edge for each
42 /// (potentially indirect) reference, even if are all through some common
43 /// global table of function pointers. This can be fixed in a number of ways
44 /// that essentially preserve enough of the normalization. While it isn't
45 /// expected to completely preclude the usability of this, it will need to be
46 /// addressed.
47 ///
48 ///
49 /// All of these issues are made substantially more complex in the face of
50 /// mutations to the call graph while optimization passes are being run. When
51 /// mutations to the call graph occur we want to achieve two different things:
52 ///
53 /// - We need to update the call graph in-flight and invalidate analyses
54 /// cached on entities in the graph. Because of the cache-based analysis
55 /// design of the pass manager, it is essential to have stable identities for
56 /// the elements of the IR that passes traverse, and to invalidate any
57 /// analyses cached on these elements as the mutations take place.
58 ///
59 /// - We want to preserve the incremental and post-order traversal of the
60 /// graph even as it is refined and mutated. This means we want optimization
61 /// to observe the most refined form of the call graph and to do so in
62 /// post-order.
63 ///
64 /// To address this, the CGSCC manager uses both worklists that can be expanded
65 /// by passes which transform the IR, and provides invalidation tests to skip
66 /// entries that become dead. This extra data is provided to every SCC pass so
67 /// that it can carefully update the manager's traversal as the call graph
68 /// mutates.
69 ///
70 /// We also provide support for running function passes within the CGSCC walk,
71 /// and there we provide automatic update of the call graph including of the
72 /// pass manager to reflect call graph changes that fall out naturally as part
73 /// of scalar transformations.
74 ///
75 /// The patterns used to ensure the goals of post-order visitation of the fully
76 /// refined graph:
77 ///
78 /// 1) Sink toward the "bottom" as the graph is refined. This means that any
79 /// iteration continues in some valid post-order sequence after the mutation
80 /// has altered the structure.
81 ///
82 /// 2) Enqueue in post-order, including the current entity. If the current
83 /// entity's shape changes, it and everything after it in post-order needs
84 /// to be visited to observe that shape.
85 ///
86 //===----------------------------------------------------------------------===//
87 
88 #ifndef LLVM_ANALYSIS_CGSCCPASSMANAGER_H
89 #define LLVM_ANALYSIS_CGSCCPASSMANAGER_H
90 
91 #include "llvm/ADT/DenseSet.h"
93 #include "llvm/ADT/STLExtras.h"
94 #include "llvm/ADT/SmallPtrSet.h"
95 #include "llvm/ADT/SmallVector.h"
97 #include "llvm/IR/CallSite.h"
98 #include "llvm/IR/Function.h"
99 #include "llvm/IR/InstIterator.h"
100 #include "llvm/IR/PassManager.h"
101 #include "llvm/IR/ValueHandle.h"
102 #include "llvm/Support/Debug.h"
104 #include <algorithm>
105 #include <cassert>
106 #include <utility>
107 
108 namespace llvm {
109 
110 struct CGSCCUpdateResult;
111 class Module;
112 
113 // Allow debug logging in this inline function.
114 #define DEBUG_TYPE "cgscc"
115 
116 /// Extern template declaration for the analysis set for this IR unit.
117 extern template class AllAnalysesOn<LazyCallGraph::SCC>;
118 
119 extern template class AnalysisManager<LazyCallGraph::SCC, LazyCallGraph &>;
120 
121 /// The CGSCC analysis manager.
122 ///
123 /// See the documentation for the AnalysisManager template for detail
124 /// documentation. This type serves as a convenient way to refer to this
125 /// construct in the adaptors and proxies used to integrate this into the larger
126 /// pass manager infrastructure.
127 using CGSCCAnalysisManager =
129 
130 // Explicit specialization and instantiation declarations for the pass manager.
131 // See the comments on the definition of the specialization for details on how
132 // it differs from the primary template.
133 template <>
136  CGSCCUpdateResult &>::run(LazyCallGraph::SCC &InitialC,
137  CGSCCAnalysisManager &AM,
138  LazyCallGraph &G, CGSCCUpdateResult &UR);
139 extern template class PassManager<LazyCallGraph::SCC, CGSCCAnalysisManager,
140  LazyCallGraph &, CGSCCUpdateResult &>;
141 
142 /// The CGSCC pass manager.
143 ///
144 /// See the documentation for the PassManager template for details. It runs
145 /// a sequence of SCC passes over each SCC that the manager is run over. This
146 /// type serves as a convenient way to refer to this construct.
147 using CGSCCPassManager =
148  PassManager<LazyCallGraph::SCC, CGSCCAnalysisManager, LazyCallGraph &,
149  CGSCCUpdateResult &>;
150 
151 /// An explicit specialization of the require analysis template pass.
152 template <typename AnalysisT>
153 struct RequireAnalysisPass<AnalysisT, LazyCallGraph::SCC, CGSCCAnalysisManager,
154  LazyCallGraph &, CGSCCUpdateResult &>
155  : PassInfoMixin<RequireAnalysisPass<AnalysisT, LazyCallGraph::SCC,
156  CGSCCAnalysisManager, LazyCallGraph &,
157  CGSCCUpdateResult &>> {
158  PreservedAnalyses run(LazyCallGraph::SCC &C, CGSCCAnalysisManager &AM,
159  LazyCallGraph &CG, CGSCCUpdateResult &) {
160  (void)AM.template getResult<AnalysisT>(C, CG);
161  return PreservedAnalyses::all();
162  }
163 };
164 
165 /// A proxy from a \c CGSCCAnalysisManager to a \c Module.
168 
169 /// We need a specialized result for the \c CGSCCAnalysisManagerModuleProxy so
170 /// it can have access to the call graph in order to walk all the SCCs when
171 /// invalidating things.
173 public:
174  explicit Result(CGSCCAnalysisManager &InnerAM, LazyCallGraph &G)
175  : InnerAM(&InnerAM), G(&G) {}
176 
177  /// Accessor for the analysis manager.
178  CGSCCAnalysisManager &getManager() { return *InnerAM; }
179 
180  /// Handler for invalidation of the Module.
181  ///
182  /// If the proxy analysis itself is preserved, then we assume that the set of
183  /// SCCs in the Module hasn't changed. Thus any pointers to SCCs in the
184  /// CGSCCAnalysisManager are still valid, and we don't need to call \c clear
185  /// on the CGSCCAnalysisManager.
186  ///
187  /// Regardless of whether this analysis is marked as preserved, all of the
188  /// analyses in the \c CGSCCAnalysisManager are potentially invalidated based
189  /// on the set of preserved analyses.
190  bool invalidate(Module &M, const PreservedAnalyses &PA,
192 
193 private:
194  CGSCCAnalysisManager *InnerAM;
195  LazyCallGraph *G;
196 };
197 
198 /// Provide a specialized run method for the \c CGSCCAnalysisManagerModuleProxy
199 /// so it can pass the lazy call graph to the result.
200 template <>
203 
204 // Ensure the \c CGSCCAnalysisManagerModuleProxy is provided as an extern
205 // template.
207 
208 extern template class OuterAnalysisManagerProxy<
209  ModuleAnalysisManager, LazyCallGraph::SCC, LazyCallGraph &>;
210 
211 /// A proxy from a \c ModuleAnalysisManager to an \c SCC.
214  LazyCallGraph &>;
215 
216 /// Support structure for SCC passes to communicate updates the call graph back
217 /// to the CGSCC pass manager infrsatructure.
218 ///
219 /// The CGSCC pass manager runs SCC passes which are allowed to update the call
220 /// graph and SCC structures. This means the structure the pass manager works
221 /// on is mutating underneath it. In order to support that, there needs to be
222 /// careful communication about the precise nature and ramifications of these
223 /// updates to the pass management infrastructure.
224 ///
225 /// All SCC passes will have to accept a reference to the management layer's
226 /// update result struct and use it to reflect the results of any CG updates
227 /// performed.
228 ///
229 /// Passes which do not change the call graph structure in any way can just
230 /// ignore this argument to their run method.
231 struct CGSCCUpdateResult {
232  /// Worklist of the RefSCCs queued for processing.
233  ///
234  /// When a pass refines the graph and creates new RefSCCs or causes them to
235  /// have a different shape or set of component SCCs it should add the RefSCCs
236  /// to this worklist so that we visit them in the refined form.
237  ///
238  /// This worklist is in reverse post-order, as we pop off the back in order
239  /// to observe RefSCCs in post-order. When adding RefSCCs, clients should add
240  /// them in reverse post-order.
242 
243  /// Worklist of the SCCs queued for processing.
244  ///
245  /// When a pass refines the graph and creates new SCCs or causes them to have
246  /// a different shape or set of component functions it should add the SCCs to
247  /// this worklist so that we visit them in the refined form.
248  ///
249  /// Note that if the SCCs are part of a RefSCC that is added to the \c
250  /// RCWorklist, they don't need to be added here as visiting the RefSCC will
251  /// be sufficient to re-visit the SCCs within it.
252  ///
253  /// This worklist is in reverse post-order, as we pop off the back in order
254  /// to observe SCCs in post-order. When adding SCCs, clients should add them
255  /// in reverse post-order.
257 
258  /// The set of invalidated RefSCCs which should be skipped if they are found
259  /// in \c RCWorklist.
260  ///
261  /// This is used to quickly prune out RefSCCs when they get deleted and
262  /// happen to already be on the worklist. We use this primarily to avoid
263  /// scanning the list and removing entries from it.
265 
266  /// The set of invalidated SCCs which should be skipped if they are found
267  /// in \c CWorklist.
268  ///
269  /// This is used to quickly prune out SCCs when they get deleted and happen
270  /// to already be on the worklist. We use this primarily to avoid scanning
271  /// the list and removing entries from it.
273 
274  /// If non-null, the updated current \c RefSCC being processed.
275  ///
276  /// This is set when a graph refinement takes place an the "current" point in
277  /// the graph moves "down" or earlier in the post-order walk. This will often
278  /// cause the "current" RefSCC to be a newly created RefSCC object and the
279  /// old one to be added to the above worklist. When that happens, this
280  /// pointer is non-null and can be used to continue processing the "top" of
281  /// the post-order walk.
283 
284  /// If non-null, the updated current \c SCC being processed.
285  ///
286  /// This is set when a graph refinement takes place an the "current" point in
287  /// the graph moves "down" or earlier in the post-order walk. This will often
288  /// cause the "current" SCC to be a newly created SCC object and the old one
289  /// to be added to the above worklist. When that happens, this pointer is
290  /// non-null and can be used to continue processing the "top" of the
291  /// post-order walk.
292  LazyCallGraph::SCC *UpdatedC;
293 
294  /// Preserved analyses across SCCs.
295  ///
296  /// We specifically want to allow CGSCC passes to mutate ancestor IR
297  /// (changing both the CG structure and the function IR itself). However,
298  /// this means we need to take special care to correctly mark what analyses
299  /// are preserved *across* SCCs. We have to track this out-of-band here
300  /// because within the main `PassManeger` infrastructure we need to mark
301  /// everything within an SCC as preserved in order to avoid repeatedly
302  /// invalidating the same analyses as we unnest pass managers and adaptors.
303  /// So we track the cross-SCC version of the preserved analyses here from any
304  /// code that does direct invalidation of SCC analyses, and then use it
305  /// whenever we move forward in the post-order walk of SCCs before running
306  /// passes over the new SCC.
308 
309  /// A hacky area where the inliner can retain history about inlining
310  /// decisions that mutated the call graph's SCC structure in order to avoid
311  /// infinite inlining. See the comments in the inliner's CG update logic.
312  ///
313  /// FIXME: Keeping this here seems like a big layering issue, we should look
314  /// for a better technique.
317 };
318 
319 /// The core module pass which does a post-order walk of the SCCs and
320 /// runs a CGSCC pass over each one.
321 ///
322 /// Designed to allow composition of a CGSCCPass(Manager) and
323 /// a ModulePassManager. Note that this pass must be run with a module analysis
324 /// manager as it uses the LazyCallGraph analysis. It will also run the
325 /// \c CGSCCAnalysisManagerModuleProxy analysis prior to running the CGSCC
326 /// pass over the module to enable a \c FunctionAnalysisManager to be used
327 /// within this run safely.
328 template <typename CGSCCPassT>
330  : public PassInfoMixin<ModuleToPostOrderCGSCCPassAdaptor<CGSCCPassT>> {
331 public:
333  : Pass(std::move(Pass)) {}
334 
335  // We have to explicitly define all the special member functions because MSVC
336  // refuses to generate them.
339  : Pass(Arg.Pass) {}
340 
342  : Pass(std::move(Arg.Pass)) {}
343 
346  std::swap(LHS.Pass, RHS.Pass);
347  }
348 
351  swap(*this, RHS);
352  return *this;
353  }
354 
355  /// Runs the CGSCC pass across every SCC in the module.
356  PreservedAnalyses run(Module &M, ModuleAnalysisManager &AM);
357 
358 private:
359  CGSCCPassT Pass;
360 };
361 
362 /// A function to deduce a function pass type and wrap it in the
363 /// templated adaptor.
364 template <typename CGSCCPassT>
367  return ModuleToPostOrderCGSCCPassAdaptor<CGSCCPassT>(std::move(Pass));
368 }
369 
370 /// A proxy from a \c FunctionAnalysisManager to an \c SCC.
371 ///
372 /// When a module pass runs and triggers invalidation, both the CGSCC and
373 /// Function analysis manager proxies on the module get an invalidation event.
374 /// We don't want to fully duplicate responsibility for most of the
375 /// invalidation logic. Instead, this layer is only responsible for SCC-local
376 /// invalidation events. We work with the module's FunctionAnalysisManager to
377 /// invalidate function analyses.
379  : public AnalysisInfoMixin<FunctionAnalysisManagerCGSCCProxy> {
380 public:
381  class Result {
382  public:
383  explicit Result(FunctionAnalysisManager &FAM) : FAM(&FAM) {}
384 
385  /// Accessor for the analysis manager.
387 
388  bool invalidate(LazyCallGraph::SCC &C, const PreservedAnalyses &PA,
390 
391  private:
393  };
394 
395  /// Computes the \c FunctionAnalysisManager and stores it in the result proxy.
396  Result run(LazyCallGraph::SCC &C, CGSCCAnalysisManager &AM, LazyCallGraph &);
397 
398 private:
400 
401  static AnalysisKey Key;
402 };
403 
405 
406 /// A proxy from a \c CGSCCAnalysisManager to a \c Function.
409 
410 /// Helper to update the call graph after running a function pass.
411 ///
412 /// Function passes can only mutate the call graph in specific ways. This
413 /// routine provides a helper that updates the call graph in those ways
414 /// including returning whether any changes were made and populating a CG
415 /// update result struct for the overall CGSCC walk.
417  LazyCallGraph &G, LazyCallGraph::SCC &C, LazyCallGraph::Node &N,
418  CGSCCAnalysisManager &AM, CGSCCUpdateResult &UR);
419 
420 /// Adaptor that maps from a SCC to its functions.
421 ///
422 /// Designed to allow composition of a FunctionPass(Manager) and
423 /// a CGSCCPassManager. Note that if this pass is constructed with a pointer
424 /// to a \c CGSCCAnalysisManager it will run the
425 /// \c FunctionAnalysisManagerCGSCCProxy analysis prior to running the function
426 /// pass over the SCC to enable a \c FunctionAnalysisManager to be used
427 /// within this run safely.
428 template <typename FunctionPassT>
431 public:
432  explicit CGSCCToFunctionPassAdaptor(FunctionPassT Pass)
433  : Pass(std::move(Pass)) {}
434 
435  // We have to explicitly define all the special member functions because MSVC
436  // refuses to generate them.
438  : Pass(Arg.Pass) {}
439 
441  : Pass(std::move(Arg.Pass)) {}
442 
445  std::swap(LHS.Pass, RHS.Pass);
446  }
447 
449  swap(*this, RHS);
450  return *this;
451  }
452 
453  /// Runs the function pass across every function in the module.
454  PreservedAnalyses run(LazyCallGraph::SCC &C, CGSCCAnalysisManager &AM,
455  LazyCallGraph &CG, CGSCCUpdateResult &UR) {
456  // Setup the function analysis manager from its proxy.
459 
461  for (LazyCallGraph::Node &N : C)
462  Nodes.push_back(&N);
463 
464  // The SCC may get split while we are optimizing functions due to deleting
465  // edges. If this happens, the current SCC can shift, so keep track of
466  // a pointer we can overwrite.
467  LazyCallGraph::SCC *CurrentC = &C;
468 
469  LLVM_DEBUG(dbgs() << "Running function passes across an SCC: " << C
470  << "\n");
471 
473  for (LazyCallGraph::Node *N : Nodes) {
474  // Skip nodes from other SCCs. These may have been split out during
475  // processing. We'll eventually visit those SCCs and pick up the nodes
476  // there.
477  if (CG.lookupSCC(*N) != CurrentC)
478  continue;
479 
480  Function &F = N->getFunction();
481 
483  if (!PI.runBeforePass<Function>(Pass, F))
484  continue;
485 
486  PreservedAnalyses PassPA = Pass.run(F, FAM);
487 
488  PI.runAfterPass<Function>(Pass, F);
489 
490  // We know that the function pass couldn't have invalidated any other
491  // function's analyses (that's the contract of a function pass), so
492  // directly handle the function analysis manager's invalidation here.
493  FAM.invalidate(F, PassPA);
494 
495  // Then intersect the preserved set so that invalidation of module
496  // analyses will eventually occur when the module pass completes.
497  PA.intersect(std::move(PassPA));
498 
499  // If the call graph hasn't been preserved, update it based on this
500  // function pass. This may also update the current SCC to point to
501  // a smaller, more refined SCC.
502  auto PAC = PA.getChecker<LazyCallGraphAnalysis>();
503  if (!PAC.preserved() && !PAC.preservedSet<AllAnalysesOn<Module>>()) {
504  CurrentC = &updateCGAndAnalysisManagerForFunctionPass(CG, *CurrentC, *N,
505  AM, UR);
506  assert(
507  CG.lookupSCC(*N) == CurrentC &&
508  "Current SCC not updated to the SCC containing the current node!");
509  }
510  }
511 
512  // By definition we preserve the proxy. And we preserve all analyses on
513  // Functions. This precludes *any* invalidation of function analyses by the
514  // proxy, but that's OK because we've taken care to invalidate analyses in
515  // the function analysis manager incrementally above.
518 
519  // We've also ensured that we updated the call graph along the way.
521 
522  return PA;
523  }
524 
525 private:
526  FunctionPassT Pass;
527 };
528 
529 /// A function to deduce a function pass type and wrap it in the
530 /// templated adaptor.
531 template <typename FunctionPassT>
534  return CGSCCToFunctionPassAdaptor<FunctionPassT>(std::move(Pass));
535 }
536 
537 /// A helper that repeats an SCC pass each time an indirect call is refined to
538 /// a direct call by that pass.
539 ///
540 /// While the CGSCC pass manager works to re-visit SCCs and RefSCCs as they
541 /// change shape, we may also want to repeat an SCC pass if it simply refines
542 /// an indirect call to a direct call, even if doing so does not alter the
543 /// shape of the graph. Note that this only pertains to direct calls to
544 /// functions where IPO across the SCC may be able to compute more precise
545 /// results. For intrinsics, we assume scalar optimizations already can fully
546 /// reason about them.
547 ///
548 /// This repetition has the potential to be very large however, as each one
549 /// might refine a single call site. As a consequence, in practice we use an
550 /// upper bound on the number of repetitions to limit things.
551 template <typename PassT>
553  : public PassInfoMixin<DevirtSCCRepeatedPass<PassT>> {
554 public:
556  : Pass(std::move(Pass)), MaxIterations(MaxIterations) {}
557 
558  /// Runs the wrapped pass up to \c MaxIterations on the SCC, iterating
559  /// whenever an indirect call is refined.
560  PreservedAnalyses run(LazyCallGraph::SCC &InitialC, CGSCCAnalysisManager &AM,
561  LazyCallGraph &CG, CGSCCUpdateResult &UR) {
564  AM.getResult<PassInstrumentationAnalysis>(InitialC, CG);
565 
566  // The SCC may be refined while we are running passes over it, so set up
567  // a pointer that we can update.
568  LazyCallGraph::SCC *C = &InitialC;
569 
570  // Collect value handles for all of the indirect call sites.
571  SmallVector<WeakTrackingVH, 8> CallHandles;
572 
573  // Struct to track the counts of direct and indirect calls in each function
574  // of the SCC.
575  struct CallCount {
576  int Direct;
577  int Indirect;
578  };
579 
580  // Put value handles on all of the indirect calls and return the number of
581  // direct calls for each function in the SCC.
582  auto ScanSCC = [](LazyCallGraph::SCC &C,
583  SmallVectorImpl<WeakTrackingVH> &CallHandles) {
584  assert(CallHandles.empty() && "Must start with a clear set of handles.");
585 
586  SmallVector<CallCount, 4> CallCounts;
587  for (LazyCallGraph::Node &N : C) {
588  CallCounts.push_back({0, 0});
589  CallCount &Count = CallCounts.back();
590  for (Instruction &I : instructions(N.getFunction()))
591  if (auto CS = CallSite(&I)) {
592  if (CS.getCalledFunction()) {
593  ++Count.Direct;
594  } else {
595  ++Count.Indirect;
596  CallHandles.push_back(WeakTrackingVH(&I));
597  }
598  }
599  }
600 
601  return CallCounts;
602  };
603 
604  // Populate the initial call handles and get the initial call counts.
605  auto CallCounts = ScanSCC(*C, CallHandles);
606 
607  for (int Iteration = 0;; ++Iteration) {
608 
609  if (!PI.runBeforePass<LazyCallGraph::SCC>(Pass, *C))
610  continue;
611 
612  PreservedAnalyses PassPA = Pass.run(*C, AM, CG, UR);
613 
614  if (UR.InvalidatedSCCs.count(C))
615  PI.runAfterPassInvalidated<LazyCallGraph::SCC>(Pass);
616  else
617  PI.runAfterPass<LazyCallGraph::SCC>(Pass, *C);
618 
619  // If the SCC structure has changed, bail immediately and let the outer
620  // CGSCC layer handle any iteration to reflect the refined structure.
621  if (UR.UpdatedC && UR.UpdatedC != C) {
622  PA.intersect(std::move(PassPA));
623  break;
624  }
625 
626  // Check that we didn't miss any update scenario.
627  assert(!UR.InvalidatedSCCs.count(C) && "Processing an invalid SCC!");
628  assert(C->begin() != C->end() && "Cannot have an empty SCC!");
629  assert((int)CallCounts.size() == C->size() &&
630  "Cannot have changed the size of the SCC!");
631 
632  // Check whether any of the handles were devirtualized.
633  auto IsDevirtualizedHandle = [&](WeakTrackingVH &CallH) {
634  if (!CallH)
635  return false;
636  auto CS = CallSite(CallH);
637  if (!CS)
638  return false;
639 
640  // If the call is still indirect, leave it alone.
641  Function *F = CS.getCalledFunction();
642  if (!F)
643  return false;
644 
645  LLVM_DEBUG(dbgs() << "Found devirutalized call from "
646  << CS.getParent()->getParent()->getName() << " to "
647  << F->getName() << "\n");
648 
649  // We now have a direct call where previously we had an indirect call,
650  // so iterate to process this devirtualization site.
651  return true;
652  };
653  bool Devirt = llvm::any_of(CallHandles, IsDevirtualizedHandle);
654 
655  // Rescan to build up a new set of handles and count how many direct
656  // calls remain. If we decide to iterate, this also sets up the input to
657  // the next iteration.
658  CallHandles.clear();
659  auto NewCallCounts = ScanSCC(*C, CallHandles);
660 
661  // If we haven't found an explicit devirtualization already see if we
662  // have decreased the number of indirect calls and increased the number
663  // of direct calls for any function in the SCC. This can be fooled by all
664  // manner of transformations such as DCE and other things, but seems to
665  // work well in practice.
666  if (!Devirt)
667  for (int i = 0, Size = C->size(); i < Size; ++i)
668  if (CallCounts[i].Indirect > NewCallCounts[i].Indirect &&
669  CallCounts[i].Direct < NewCallCounts[i].Direct) {
670  Devirt = true;
671  break;
672  }
673 
674  if (!Devirt) {
675  PA.intersect(std::move(PassPA));
676  break;
677  }
678 
679  // Otherwise, if we've already hit our max, we're done.
680  if (Iteration >= MaxIterations) {
681  LLVM_DEBUG(
682  dbgs() << "Found another devirtualization after hitting the max "
683  "number of repetitions ("
684  << MaxIterations << ") on SCC: " << *C << "\n");
685  PA.intersect(std::move(PassPA));
686  break;
687  }
688 
689  LLVM_DEBUG(
690  dbgs()
691  << "Repeating an SCC pass after finding a devirtualization in: " << *C
692  << "\n");
693 
694  // Move over the new call counts in preparation for iterating.
695  CallCounts = std::move(NewCallCounts);
696 
697  // Update the analysis manager with each run and intersect the total set
698  // of preserved analyses so we're ready to iterate.
699  AM.invalidate(*C, PassPA);
700  PA.intersect(std::move(PassPA));
701  }
702 
703  // Note that we don't add any preserved entries here unlike a more normal
704  // "pass manager" because we only handle invalidation *between* iterations,
705  // not after the last iteration.
706  return PA;
707  }
708 
709 private:
710  PassT Pass;
711  int MaxIterations;
712 };
713 
714 /// A function to deduce a function pass type and wrap it in the
715 /// templated adaptor.
716 template <typename PassT>
718  int MaxIterations) {
719  return DevirtSCCRepeatedPass<PassT>(std::move(Pass), MaxIterations);
720 }
721 
722 // Out-of-line implementation details for templates below this point.
723 
724 template <typename CGSCCPassT>
727  ModuleAnalysisManager &AM) {
728  // Setup the CGSCC analysis manager from its proxy.
729  CGSCCAnalysisManager &CGAM =
731 
732  // Get the call graph for this module.
733  LazyCallGraph &CG = AM.getResult<LazyCallGraphAnalysis>(M);
734 
735  // We keep worklists to allow us to push more work onto the pass manager as
736  // the passes are run.
739 
740  // Keep sets for invalidated SCCs and RefSCCs that should be skipped when
741  // iterating off the worklists.
744 
746  InlinedInternalEdges;
747 
748  CGSCCUpdateResult UR = {
749  RCWorklist, CWorklist, InvalidRefSCCSet, InvalidSCCSet,
750  nullptr, nullptr, PreservedAnalyses::all(), InlinedInternalEdges};
751 
752  // Request PassInstrumentation from analysis manager, will use it to run
753  // instrumenting callbacks for the passes later.
755 
757  CG.buildRefSCCs();
758  for (auto RCI = CG.postorder_ref_scc_begin(),
759  RCE = CG.postorder_ref_scc_end();
760  RCI != RCE;) {
761  assert(RCWorklist.empty() &&
762  "Should always start with an empty RefSCC worklist");
763  // The postorder_ref_sccs range we are walking is lazily constructed, so
764  // we only push the first one onto the worklist. The worklist allows us
765  // to capture *new* RefSCCs created during transformations.
766  //
767  // We really want to form RefSCCs lazily because that makes them cheaper
768  // to update as the program is simplified and allows us to have greater
769  // cache locality as forming a RefSCC touches all the parts of all the
770  // functions within that RefSCC.
771  //
772  // We also eagerly increment the iterator to the next position because
773  // the CGSCC passes below may delete the current RefSCC.
774  RCWorklist.insert(&*RCI++);
775 
776  do {
777  LazyCallGraph::RefSCC *RC = RCWorklist.pop_back_val();
778  if (InvalidRefSCCSet.count(RC)) {
779  LLVM_DEBUG(dbgs() << "Skipping an invalid RefSCC...\n");
780  continue;
781  }
782 
783  assert(CWorklist.empty() &&
784  "Should always start with an empty SCC worklist");
785 
786  LLVM_DEBUG(dbgs() << "Running an SCC pass across the RefSCC: " << *RC
787  << "\n");
788 
789  // Push the initial SCCs in reverse post-order as we'll pop off the
790  // back and so see this in post-order.
791  for (LazyCallGraph::SCC &C : llvm::reverse(*RC))
792  CWorklist.insert(&C);
793 
794  do {
795  LazyCallGraph::SCC *C = CWorklist.pop_back_val();
796  // Due to call graph mutations, we may have invalid SCCs or SCCs from
797  // other RefSCCs in the worklist. The invalid ones are dead and the
798  // other RefSCCs should be queued above, so we just need to skip both
799  // scenarios here.
800  if (InvalidSCCSet.count(C)) {
801  LLVM_DEBUG(dbgs() << "Skipping an invalid SCC...\n");
802  continue;
803  }
804  if (&C->getOuterRefSCC() != RC) {
805  LLVM_DEBUG(dbgs() << "Skipping an SCC that is now part of some other "
806  "RefSCC...\n");
807  continue;
808  }
809 
810  // Ensure we can proxy analysis updates from from the CGSCC analysis
811  // manager into the Function analysis manager by getting a proxy here.
812  // FIXME: This seems like a bit of a hack. We should find a cleaner
813  // or more costructive way to ensure this happens.
814  (void)CGAM.getResult<FunctionAnalysisManagerCGSCCProxy>(*C, CG);
815 
816  // Each time we visit a new SCC pulled off the worklist,
817  // a transformation of a child SCC may have also modified this parent
818  // and invalidated analyses. So we invalidate using the update record's
819  // cross-SCC preserved set. This preserved set is intersected by any
820  // CGSCC pass that handles invalidation (primarily pass managers) prior
821  // to marking its SCC as preserved. That lets us track everything that
822  // might need invalidation across SCCs without excessive invalidations
823  // on a single SCC.
824  //
825  // This essentially allows SCC passes to freely invalidate analyses
826  // of any ancestor SCC. If this becomes detrimental to successfully
827  // caching analyses, we could force each SCC pass to manually
828  // invalidate the analyses for any SCCs other than themselves which
829  // are mutated. However, that seems to lose the robustness of the
830  // pass-manager driven invalidation scheme.
831  //
832  // FIXME: This is redundant in one case -- the top of the worklist may
833  // *also* be the same SCC we just ran over (and invalidated for). In
834  // that case, we'll end up doing a redundant invalidation here as
835  // a consequence.
836  CGAM.invalidate(*C, UR.CrossSCCPA);
837 
838  do {
839  // Check that we didn't miss any update scenario.
840  assert(!InvalidSCCSet.count(C) && "Processing an invalid SCC!");
841  assert(C->begin() != C->end() && "Cannot have an empty SCC!");
842  assert(&C->getOuterRefSCC() == RC &&
843  "Processing an SCC in a different RefSCC!");
844 
845  UR.UpdatedRC = nullptr;
846  UR.UpdatedC = nullptr;
847 
848  // Check the PassInstrumentation's BeforePass callbacks before
849  // running the pass, skip its execution completely if asked to
850  // (callback returns false).
851  if (!PI.runBeforePass<LazyCallGraph::SCC>(Pass, *C))
852  continue;
853 
854  PreservedAnalyses PassPA = Pass.run(*C, CGAM, CG, UR);
855 
856  if (UR.InvalidatedSCCs.count(C))
857  PI.runAfterPassInvalidated<LazyCallGraph::SCC>(Pass);
858  else
859  PI.runAfterPass<LazyCallGraph::SCC>(Pass, *C);
860 
861  // Update the SCC and RefSCC if necessary.
862  C = UR.UpdatedC ? UR.UpdatedC : C;
863  RC = UR.UpdatedRC ? UR.UpdatedRC : RC;
864 
865  // If the CGSCC pass wasn't able to provide a valid updated SCC,
866  // the current SCC may simply need to be skipped if invalid.
867  if (UR.InvalidatedSCCs.count(C)) {
868  LLVM_DEBUG(dbgs() << "Skipping invalidated root or island SCC!\n");
869  break;
870  }
871  // Check that we didn't miss any update scenario.
872  assert(C->begin() != C->end() && "Cannot have an empty SCC!");
873 
874  // We handle invalidating the CGSCC analysis manager's information
875  // for the (potentially updated) SCC here. Note that any other SCCs
876  // whose structure has changed should have been invalidated by
877  // whatever was updating the call graph. This SCC gets invalidated
878  // late as it contains the nodes that were actively being
879  // processed.
880  CGAM.invalidate(*C, PassPA);
881 
882  // Then intersect the preserved set so that invalidation of module
883  // analyses will eventually occur when the module pass completes.
884  // Also intersect with the cross-SCC preserved set to capture any
885  // cross-SCC invalidation.
886  UR.CrossSCCPA.intersect(PassPA);
887  PA.intersect(std::move(PassPA));
888 
889  // The pass may have restructured the call graph and refined the
890  // current SCC and/or RefSCC. We need to update our current SCC and
891  // RefSCC pointers to follow these. Also, when the current SCC is
892  // refined, re-run the SCC pass over the newly refined SCC in order
893  // to observe the most precise SCC model available. This inherently
894  // cannot cycle excessively as it only happens when we split SCCs
895  // apart, at most converging on a DAG of single nodes.
896  // FIXME: If we ever start having RefSCC passes, we'll want to
897  // iterate there too.
898  if (UR.UpdatedC)
899  LLVM_DEBUG(dbgs()
900  << "Re-running SCC passes after a refinement of the "
901  "current SCC: "
902  << *UR.UpdatedC << "\n");
903 
904  // Note that both `C` and `RC` may at this point refer to deleted,
905  // invalid SCC and RefSCCs respectively. But we will short circuit
906  // the processing when we check them in the loop above.
907  } while (UR.UpdatedC);
908  } while (!CWorklist.empty());
909 
910  // We only need to keep internal inlined edge information within
911  // a RefSCC, clear it to save on space and let the next time we visit
912  // any of these functions have a fresh start.
913  InlinedInternalEdges.clear();
914  } while (!RCWorklist.empty());
915  }
916 
917  // By definition we preserve the call garph, all SCC analyses, and the
918  // analysis proxies by handling them above and in any nested pass managers.
919  PA.preserveSet<AllAnalysesOn<LazyCallGraph::SCC>>();
920  PA.preserve<LazyCallGraphAnalysis>();
921  PA.preserve<CGSCCAnalysisManagerModuleProxy>();
922  PA.preserve<FunctionAnalysisManagerModuleProxy>();
923  return PA;
924 }
925 
926 // Clear out the debug logging macro.
927 #undef DEBUG_TYPE
928 
929 } // end namespace llvm
930 
931 #endif // LLVM_ANALYSIS_CGSCCPASSMANAGER_H
Pass interface - Implemented by all &#39;passes&#39;.
Definition: Pass.h:80
uint64_t CallInst * C
This file provides a priority worklist.
ModuleToPostOrderCGSCCPassAdaptor(ModuleToPostOrderCGSCCPassAdaptor &&Arg)
PreservedAnalyses run(LazyCallGraph::SCC &InitialC, CGSCCAnalysisManager &AM, LazyCallGraph &CG, CGSCCUpdateResult &UR)
Runs the wrapped pass up to MaxIterations on the SCC, iterating whenever an indirect call is refined...
bool insert(const T &X)
Insert a new element into the PriorityWorklist.
PassT::Result & getResult(IRUnitT &IR, ExtraArgTs... ExtraArgs)
Get the result of an analysis pass for a given IR unit.
Definition: PassManager.h:776
This class represents lattice values for constants.
Definition: AllocatorList.h:23
SmallPriorityWorklist< LazyCallGraph::RefSCC *, 1 > & RCWorklist
Worklist of the RefSCCs queued for processing.
void intersect(const PreservedAnalyses &Arg)
Intersect this set with another in place.
Definition: PassManager.h:225
A Module instance is used to store all the information related to an LLVM module. ...
Definition: Module.h:65
LazyCallGraph::SCC & updateCGAndAnalysisManagerForFunctionPass(LazyCallGraph &G, LazyCallGraph::SCC &C, LazyCallGraph::Node &N, CGSCCAnalysisManager &AM, CGSCCUpdateResult &UR)
Helper to update the call graph after running a function pass.
CGSCCToFunctionPassAdaptor & operator=(CGSCCToFunctionPassAdaptor RHS)
SCC * lookupSCC(Node &N) const
Lookup a function&#39;s SCC in the graph.
bool invalidate(IRUnitT &IR, const PreservedAnalyses &PA, typename AnalysisManager< IRUnitT, ExtraArgTs... >::Invalidator &Inv)
Handler for invalidation of the outer IR unit, IRUnitT.
Implements a lazy call graph analysis and related passes for the new pass manager.
ModuleToPostOrderCGSCCPassAdaptor< CGSCCPassT > createModuleToPostOrderCGSCCPassAdaptor(CGSCCPassT Pass)
A function to deduce a function pass type and wrap it in the templated adaptor.
bool empty() const
Determine if the PriorityWorklist is empty or not.
CGSCCToFunctionPassAdaptor< FunctionPassT > createCGSCCToFunctionPassAdaptor(FunctionPassT Pass)
A function to deduce a function pass type and wrap it in the templated adaptor.
F(f)
ModuleToPostOrderCGSCCPassAdaptor & operator=(ModuleToPostOrderCGSCCPassAdaptor RHS)
A utility pass template to force an analysis result to be available.
Definition: PassManager.h:1355
Result(CGSCCAnalysisManager &InnerAM, LazyCallGraph &G)
A proxy from a FunctionAnalysisManager to an SCC.
Support structure for SCC passes to communicate updates the call graph back to the CGSCC pass manager...
A templated base class for SmallPtrSet which provides the typesafe interface that is common across al...
Definition: SmallPtrSet.h:343
PreservedAnalysisChecker getChecker() const
Build a checker for this PreservedAnalyses and the specified analysis type.
Definition: PassManager.h:311
RefSCC & getOuterRefSCC() const
PreservedAnalyses run(LazyCallGraph::SCC &C, CGSCCAnalysisManager &AM, LazyCallGraph &CG, CGSCCUpdateResult &UR)
Runs the function pass across every function in the module.
Definition: BitVector.h:937
LazyCallGraph::SCC * UpdatedC
If non-null, the updated current SCC being processed.
iterator begin() const
This class consists of common code factored out of the SmallVector class to reduce code duplication b...
Definition: APFloat.h:41
postorder_ref_scc_iterator postorder_ref_scc_begin()
ModuleToPostOrderCGSCCPassAdaptor(const ModuleToPostOrderCGSCCPassAdaptor &Arg)
AnalysisManagerT & getManager()
Accessor for the analysis manager.
Definition: PassManager.h:1079
auto reverse(ContainerTy &&C, typename std::enable_if< has_rbegin< ContainerTy >::value >::type *=nullptr) -> decltype(make_range(C.rbegin(), C.rend()))
Definition: STLExtras.h:266
Key
PAL metadata keys.
CGSCCToFunctionPassAdaptor(const CGSCCToFunctionPassAdaptor &Arg)
A RefSCC of the call graph.
Value handle that is nullable, but tries to track the Value.
Definition: ValueHandle.h:181
AnalysisManager< LazyCallGraph::SCC, LazyCallGraph & > CGSCCAnalysisManager
The CGSCC analysis manager.
A CRTP mix-in to automatically provide informational APIs needed for passes.
Definition: PassManager.h:372
A lazily constructed view of the call graph of a module.
bool runBeforePass(const PassT &Pass, const IRUnitT &IR) const
BeforePass instrumentation point - takes Pass instance to be executed and constant reference to IR it...
static cl::opt< unsigned > MaxIterations("max-cg-scc-iterations", cl::ReallyHidden, cl::init(4))
A set of analyses that are preserved following a run of a transformation pass.
Definition: PassManager.h:153
The core module pass which does a post-order walk of the SCCs and runs a CGSCC pass over each one...
SmallPtrSetImpl< LazyCallGraph::RefSCC * > & InvalidatedRefSCCs
The set of invalidated RefSCCs which should be skipped if they are found in RCWorklist.
SmallPtrSetImpl< LazyCallGraph::SCC * > & InvalidatedSCCs
The set of invalidated SCCs which should be skipped if they are found in CWorklist.
A CRTP mix-in that provides informational APIs needed for analysis passes.
Definition: PassManager.h:389
friend void swap(ModuleToPostOrderCGSCCPassAdaptor &LHS, ModuleToPostOrderCGSCCPassAdaptor &RHS)
LazyCallGraph::RefSCC * UpdatedRC
If non-null, the updated current RefSCC being processed.
bool any_of(R &&range, UnaryPredicate P)
Provide wrappers to std::any_of which take ranges instead of having to pass begin/end explicitly...
Definition: STLExtras.h:1192
friend void swap(CGSCCToFunctionPassAdaptor &LHS, CGSCCToFunctionPassAdaptor &RHS)
FunctionAnalysisManager & getManager()
Accessor for the analysis manager.
iterator end() const
amdgpu Simplify well known AMD library false FunctionCallee Value * Arg
A node in the call graph.
size_type count(ConstPtrType Ptr) const
count - Return 1 if the specified pointer is in the set, 0 otherwise.
Definition: SmallPtrSet.h:381
CGSCCToFunctionPassAdaptor(FunctionPassT Pass)
A version of PriorityWorklist that selects small size optimized data structures for the vector and ma...
DevirtSCCRepeatedPass< PassT > createDevirtSCCRepeatedPass(PassT Pass, int MaxIterations)
A function to deduce a function pass type and wrap it in the templated adaptor.
static PreservedAnalyses all()
Construct a special preserved set that preserves all passes.
Definition: PassManager.h:159
SmallPriorityWorklist< LazyCallGraph::SCC *, 1 > & CWorklist
Worklist of the SCCs queued for processing.
Pseudo-analysis pass that exposes the PassInstrumentation to pass managers.
Definition: PassManager.h:581
PreservedAnalyses run(Module &M, ModuleAnalysisManager &AM)
Runs the CGSCC pass across every SCC in the module.
AnalysisManager< Module > ModuleAnalysisManager
Convenience typedef for the Module analysis manager.
Definition: PassManager.h:1020
print lazy value Lazy Value Info Printer Pass
CGSCCAnalysisManager & getManager()
Accessor for the analysis manager.
SmallPtrSet - This class implements a set which is optimized for holding SmallSize or less elements...
Definition: SmallPtrSet.h:417
void runAfterPass(const PassT &Pass, const IRUnitT &IR) const
AfterPass instrumentation point - takes Pass instance that has just been executed and constant refere...
An analysis over an "inner" IR unit that provides access to an analysis manager over a "outer" IR uni...
Definition: PassManager.h:1160
This is a &#39;vector&#39; (really, a variable-sized array), optimized for the case when the array is small...
Definition: SmallVector.h:841
void runAfterPassInvalidated(const PassT &Pass) const
AfterPassInvalidated instrumentation point - takes Pass instance that has just been executed...
Result run(IRUnitT &IR, AnalysisManager< IRUnitT, ExtraArgTs... > &AM, ExtraArgTs...)
Run the analysis pass and create our proxy result object.
Definition: PassManager.h:1107
const DataFlowGraph & G
Definition: RDFGraph.cpp:202
void invalidate(IRUnitT &IR)
Invalidate a specific analysis pass for an IR module.
Definition: PassManager.h:848
raw_ostream & dbgs()
dbgs() - This returns a reference to a raw_ostream for debugging messages.
Definition: Debug.cpp:132
Implements a dense probed hash-table based set with some number of buckets stored inline...
Definition: DenseSet.h:267
void swap(llvm::BitVector &LHS, llvm::BitVector &RHS)
Implement std::swap in terms of BitVector swap.
Definition: BitVector.h:940
A helper that repeats an SCC pass each time an indirect call is refined to a direct call by that pass...
CGSCCToFunctionPassAdaptor(CGSCCToFunctionPassAdaptor &&Arg)
Adaptor that maps from a SCC to its functions.
Manages a sequence of passes over a particular unit of IR.
Definition: PassManager.h:464
LLVM_NODISCARD bool empty() const
Definition: SmallVector.h:55
void preserveSet()
Mark an analysis set as preserved.
Definition: PassManager.h:189
StringRef getName() const
Return a constant reference to the value&#39;s name.
Definition: Value.cpp:214
#define I(x, y, z)
Definition: MD5.cpp:58
#define N
PreservedAnalyses CrossSCCPA
Preserved analyses across SCCs.
uint32_t Size
Definition: Profile.cpp:46
void preserve()
Mark an analysis as preserved.
Definition: PassManager.h:174
PreservedAnalyses run(IRUnitT &Arg, AnalysisManagerT &AM, ExtraArgTs &&... Args)
Run this pass over some unit of IR.
Definition: PassManager.h:1364
An analysis pass which computes the call graph for a module.
PreservedAnalyses run(LazyCallGraph::SCC &C, CGSCCAnalysisManager &AM, LazyCallGraph &CG, CGSCCUpdateResult &)
API to communicate dependencies between analyses during invalidation.
Definition: PassManager.h:648
assert(ImpDefSCC.getReg()==AMDGPU::SCC &&ImpDefSCC.isDef())
This templated class represents "all analyses that operate over <a particular IR unit>" (e...
Definition: PassManager.h:91
An SCC of the call graph.
DevirtSCCRepeatedPass(PassT Pass, int MaxIterations)
inst_range instructions(Function *F)
Definition: InstIterator.h:133
A container for analyses that lazily runs them and caches their results.
This class provides instrumentation entry points for the Pass Manager, doing calls to callbacks regis...
This header defines various interfaces for pass management in LLVM.
#define LLVM_DEBUG(X)
Definition: Debug.h:122
postorder_ref_scc_iterator postorder_ref_scc_end()
SmallDenseSet< std::pair< LazyCallGraph::Node *, LazyCallGraph::SCC * >, 4 > & InlinedInternalEdges
A hacky area where the inliner can retain history about inlining decisions that mutated the call grap...
A special type used by analysis passes to provide an address that identifies that particular analysis...
Definition: PassManager.h:70
An analysis over an "outer" IR unit that provides access to an analysis manager over an "inner" IR un...
Definition: PassManager.h:1044