LLVM 17.0.0git
SampleProfReader.h
1//===- SampleProfReader.h - Read LLVM sample profile data -------*- C++ -*-===//
2//
3// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
4// See https://llvm.org/LICENSE.txt for license information.
5// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
6//
7//===----------------------------------------------------------------------===//
8//
9// This file contains definitions needed for reading sample profiles.
10//
11// NOTE: If you are making changes to this file format, please remember
12// to document them in the Clang documentation at
13// tools/clang/docs/UsersManual.rst.
14//
15// Text format
16// -----------
17//
18// Sample profiles are written as ASCII text. The file is divided into
19// sections, which correspond to each of the functions executed at runtime.
20// Each section has the following format
21//
22// function1:total_samples:total_head_samples
23// offset1[.discriminator]: number_of_samples [fn1:num fn2:num ... ]
24// offset2[.discriminator]: number_of_samples [fn3:num fn4:num ... ]
25// ...
26// offsetN[.discriminator]: number_of_samples [fn5:num fn6:num ... ]
27// offsetA[.discriminator]: fnA:num_of_total_samples
28// offsetA1[.discriminator]: number_of_samples [fn7:num fn8:num ... ]
29// ...
30// !CFGChecksum: num
31// !Attribute: flags
32//
33// This is a nested tree in which the indentation represents the nesting level
34// of the inline stack. There are no blank lines in the file. And the spacing
35// within a single line is fixed. Additional spaces will result in an error
36// while reading the file.
37//
38// Any line starting with the '#' character is completely ignored.
39//
40// Inlined calls are represented with indentation. The Inline stack is a
41// stack of source locations in which the top of the stack represents the
42// leaf function, and the bottom of the stack represents the actual
43// symbol to which the instruction belongs.
44//
45// Function names must be mangled in order for the profile loader to
46// match them in the current translation unit. The two numbers in the
47// function header specify how many total samples were accumulated in the
48// function (first number), and the total number of samples accumulated
49// in the prologue of the function (second number). This head sample
50// count provides an indicator of how frequently the function is invoked.
51//
52// There are three types of lines in the function body.
53//
54// * Sampled line represents the profile information of a source location.
55// * Callsite line represents the profile information of a callsite.
56// * Metadata line represents extra metadata of the function.
57//
58// Each sampled line may contain several items. Some are optional (marked
59// below):
60//
61// a. Source line offset. This number represents the line number
62// in the function where the sample was collected. The line number is
63// always relative to the line where the symbol of the function is
64// defined. So, if the function has its header at line 280, the offset
65// 13 is at line 293 in the file.
66//
67// Note that this offset should never be a negative number. This could
68// happen in cases like macros. The debug machinery will register the
69// line number at the point of macro expansion. So, if the macro was
70// expanded in a line before the start of the function, the profile
71// converter should emit a 0 as the offset (this means that the optimizers
72// will not be able to associate a meaningful weight to the instructions
73// in the macro).
74//
75// b. [OPTIONAL] Discriminator. This is used if the sampled program
76// was compiled with DWARF discriminator support
77// (http://wiki.dwarfstd.org/index.php?title=Path_Discriminators).
78// DWARF discriminators are unsigned integer values that allow the
79// compiler to distinguish between multiple execution paths on the
80// same source line location.
81//
82// For example, consider the line of code ``if (cond) foo(); else bar();``.
83// If the predicate ``cond`` is true 80% of the time, then the edge
84// into function ``foo`` should be considered to be taken most of the
85// time. But both calls to ``foo`` and ``bar`` are at the same source
86// line, so a sample count at that line is not sufficient. The
87// compiler needs to know which part of that line is taken more
88// frequently.
89//
90// This is what discriminators provide. In this case, the calls to
91// ``foo`` and ``bar`` will be at the same line, but will have
92// different discriminator values. This allows the compiler to correctly
93// set edge weights into ``foo`` and ``bar``.
94//
95// c. Number of samples. This is an integer quantity representing the
96// number of samples collected by the profiler at this source
97// location.
98//
99// d. [OPTIONAL] Potential call targets and samples. If present, this
100// line contains a call instruction. This models both direct and indirect
101// calls; each target is listed with its number of samples. For example,
102//
103// 130: 7 foo:3 bar:2 baz:7
104//
105// The above means that at relative line offset 130 there is a call
106// instruction that calls one of ``foo()``, ``bar()`` and ``baz()``,
107// with ``baz()`` being the relatively more frequently called target.
108//
109// Each callsite line may contain several items. Some are optional.
110//
111// a. Source line offset. This number represents the line number of the
112// callsite that is inlined in the profiled binary.
113//
114// b. [OPTIONAL] Discriminator. Same as the discriminator for sampled line.
115//
116// c. Number of samples. This is an integer quantity representing the
117// total number of samples collected for the inlined instance at this
118// callsite.
119//
120// Metadata line can occur in lines with one indent only, containing extra
121// information for the top-level function. Furthermore, metadata can only
122// occur after all the body samples and callsite samples.
123// Each metadata line may contain a particular type of metadata, marked by
124// the starting characters annotated with !. We process each metadata line
125// independently, hence each metadata line has to form an independent piece
126// of information that does not require cross-line reference.
127// We support the following types of metadata:
128//
129// a. CFG Checksum (a.k.a. function hash):
130// !CFGChecksum: 12345
131// b. Attribute (see ContextAttributeMask):
132// !Attribute: 1
133//
134//
135// Binary format
136// -------------
137//
138// This is a more compact encoding. Numbers are encoded as ULEB128 values
139// and all strings are encoded in a name table. The file is organized in
140// the following sections:
141//
142// MAGIC (uint64_t)
143// File identifier computed by function SPMagic() (0x5350524f463432ff)
144//
145// VERSION (uint32_t)
146// File format version number computed by SPVersion()
147//
148// SUMMARY
149// TOTAL_COUNT (uint64_t)
150// Total number of samples in the profile.
151// MAX_COUNT (uint64_t)
152// Maximum value of samples on a line.
153// MAX_FUNCTION_COUNT (uint64_t)
154// Maximum number of samples at function entry (head samples).
155// NUM_COUNTS (uint64_t)
156// Number of lines with samples.
157// NUM_FUNCTIONS (uint64_t)
158// Number of functions with samples.
159// NUM_DETAILED_SUMMARY_ENTRIES (size_t)
160// Number of entries in detailed summary
161// DETAILED_SUMMARY
162// A list of detailed summary entries. Each entry consists of
163// CUTOFF (uint32_t)
164// Required percentile of total sample count expressed as a fraction
165// multiplied by 1000000.
166// MIN_COUNT (uint64_t)
167// The minimum number of samples required to reach the target
168// CUTOFF.
169// NUM_COUNTS (uint64_t)
170// Number of samples to get to the desired percentile.
171//
172// NAME TABLE
173// SIZE (uint32_t)
174// Number of entries in the name table.
175// NAMES
176// A NUL-separated list of SIZE strings.
177//
178// FUNCTION BODY (one for each uninlined function body present in the profile)
179// HEAD_SAMPLES (uint64_t) [only for top-level functions]
180// Total number of samples collected at the head (prologue) of the
181// function.
182// NOTE: This field should only be present for top-level functions
183// (i.e., not inlined into any caller). Inlined function calls
184// have no prologue, so they don't need this.
185// NAME_IDX (uint32_t)
186// Index into the name table indicating the function name.
187// SAMPLES (uint64_t)
188// Total number of samples collected in this function.
189// NRECS (uint32_t)
190// Total number of sampling records in this function's profile.
191// BODY RECORDS
192// A list of NRECS entries. Each entry contains:
193// OFFSET (uint32_t)
194// Line offset from the start of the function.
195// DISCRIMINATOR (uint32_t)
196// Discriminator value (see description of discriminators
197// in the text format documentation above).
198// SAMPLES (uint64_t)
199// Number of samples collected at this location.
200// NUM_CALLS (uint32_t)
201// Number of non-inlined function calls made at this location. In the
202// case of direct calls, this number will always be 1. For indirect
203// calls (virtual functions and function pointers) this will
204// represent all the actual functions called at runtime.
205// CALL_TARGETS
206// A list of NUM_CALLS entries for each called function:
207// NAME_IDX (uint32_t)
208// Index into the name table with the callee name.
209// SAMPLES (uint64_t)
210// Number of samples collected at the call site.
211// NUM_INLINED_FUNCTIONS (uint32_t)
212// Number of callees inlined into this function.
213// INLINED FUNCTION RECORDS
214// A list of NUM_INLINED_FUNCTIONS entries describing each of the inlined
215// callees.
216// OFFSET (uint32_t)
217// Line offset from the start of the function.
218// DISCRIMINATOR (uint32_t)
219// Discriminator value (see description of discriminators
220// in the text format documentation above).
221// FUNCTION BODY
222// A FUNCTION BODY entry describing the inlined function.
223//===----------------------------------------------------------------------===//
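The sampled-line grammar documented in the header comment above, `offset[.discriminator]: number_of_samples [fn:num ...]`, can be illustrated with a small standalone parser. This is only a sketch under stated assumptions, not the code LLVM uses: `SampledLine`, `parseUInt`, and `parseSampledLine` are hypothetical names, indentation is assumed to be stripped already, callsite and metadata lines are rejected rather than handled, integer overflow is not checked, and the tokenizer is more tolerant of extra spaces than the format description allows.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record for one sampled line; not an LLVM type.
struct SampledLine {
  uint32_t Offset = 0;
  uint32_t Discriminator = 0; // 0 when the optional ".discriminator" is absent
  uint64_t NumSamples = 0;
  std::vector<std::pair<std::string, uint64_t>> CallTargets;
};

// Parse a non-empty run of decimal digits. Overflow is not checked (assumption).
static bool parseUInt(const std::string &S, uint64_t &Out) {
  if (S.empty())
    return false;
  uint64_t V = 0;
  for (char C : S) {
    if (C < '0' || C > '9')
      return false;
    V = V * 10 + static_cast<uint64_t>(C - '0');
  }
  Out = V;
  return true;
}

// Parse "offset[.discriminator]: samples [fn:num ...]".
// Returns std::nullopt for anything that is not a sampled line.
std::optional<SampledLine> parseSampledLine(const std::string &Line) {
  // The location and the counts are separated by ": ".
  std::size_t Colon = Line.find(':');
  if (Colon == std::string::npos || Colon + 1 >= Line.size() ||
      Line[Colon + 1] != ' ')
    return std::nullopt;

  SampledLine Result;
  uint64_t Value = 0;
  std::string Loc = Line.substr(0, Colon);
  std::size_t Dot = Loc.find('.');
  if (!parseUInt(Loc.substr(0, Dot), Value))
    return std::nullopt;
  Result.Offset = static_cast<uint32_t>(Value);
  if (Dot != std::string::npos) {
    if (!parseUInt(Loc.substr(Dot + 1), Value))
      return std::nullopt;
    Result.Discriminator = static_cast<uint32_t>(Value);
  }

  // First token after ": " is the sample count; a callsite line has
  // "fn:num" here instead and is rejected by parseUInt.
  std::istringstream Rest(Line.substr(Colon + 2));
  std::string Token;
  if (!(Rest >> Token) || !parseUInt(Token, Result.NumSamples))
    return std::nullopt;

  // Remaining tokens are call targets of the form "name:count".
  while (Rest >> Token) {
    std::size_t Sep = Token.rfind(':');
    if (Sep == std::string::npos || Sep == 0 ||
        !parseUInt(Token.substr(Sep + 1), Value))
      return std::nullopt;
    Result.CallTargets.emplace_back(Token.substr(0, Sep), Value);
  }
  return Result;
}
```

Applied to the example from the comment, `130: 7 foo:3 bar:2 baz:7`, the sketch yields offset 130, discriminator 0, 7 samples, and three call targets. A callsite line such as `5: foo:20` is rejected because the first item after the colon is not a plain sample count.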
224
225#ifndef LLVM_PROFILEDATA_SAMPLEPROFREADER_H
226#define LLVM_PROFILEDATA_SAMPLEPROFREADER_H
227
228#include "llvm/ADT/SmallVector.h"
229#include "llvm/ADT/StringRef.h"
231#include "llvm/IR/LLVMContext.h"
236#include "llvm/Support/Debug.h"
238#include "llvm/Support/ErrorOr.h"
240#include <cstdint>
241#include <list>
242#include <memory>
243#include <optional>
244#include <string>
245#include <system_error>
246#include <unordered_set>
247#include <vector>
248
249namespace llvm {
250
251class raw_ostream;
252class Twine;
253
254namespace vfs {
255class FileSystem;
256} // namespace vfs
257
258namespace sampleprof {
259
260class SampleProfileReader;
261
262/// SampleProfileReaderItaniumRemapper remaps the profile data from a
263/// sample profile data reader, by applying a provided set of equivalences
264/// between components of the symbol names in the profile.
265class SampleProfileReaderItaniumRemapper {
266public:
267 SampleProfileReaderItaniumRemapper(std::unique_ptr<MemoryBuffer> B,
268 std::unique_ptr<SymbolRemappingReader> SRR,
270 : Buffer(std::move(B)), Remappings(std::move(SRR)), Reader(R) {
271 assert(Remappings && "Remappings cannot be nullptr");
272 }
273
274 /// Create a remapper from the given remapping file. The remapper will
275 /// be used for the profile read in by Reader.
277 create(const std::string Filename, vfs::FileSystem &FS,
279
280 /// Create a remapper from the given Buffer. The remapper will
281 /// be used for the profile read in by Reader.
283 create(std::unique_ptr<MemoryBuffer> &B, SampleProfileReader &Reader,
284 LLVMContext &C);
285
286 /// Apply remappings to the profile read by Reader.
287 void applyRemapping(LLVMContext &Ctx);
288
289 bool hasApplied() { return RemappingApplied; }
290
291 /// Insert function name into remapper.
292 void insert(StringRef FunctionName) { Remappings->insert(FunctionName); }
293
294 /// Query whether the remapper contains an equivalent of a name that has
295 /// been inserted.
296 bool exist(StringRef FunctionName) {
297 return Remappings->lookup(FunctionName);
298 }
299
300 /// Return the equivalent name in the profile for \p FunctionName if
301 /// it exists.
302 std::optional<StringRef> lookUpNameInProfile(StringRef FunctionName);
303
304private:
305 // The buffer holding the content read from remapping file.
306 std::unique_ptr<MemoryBuffer> Buffer;
307 std::unique_ptr<SymbolRemappingReader> Remappings;
308 // Map remapping key to the name in the profile. By looking up the
309 // key in the remapper, a given new name can be mapped to the
310 // canonical name using the NameMap.
312 // The Reader the remapper is servicing.
313 SampleProfileReader &Reader;
314 // Indicate whether remapping has been applied to the profile read
315 // by Reader -- by calling applyRemapping.
316 bool RemappingApplied = false;
317};
318
319/// Sample-based profile reader.
320///
321/// Each profile contains sample counts for all the functions
322/// executed. Inside each function, statements are annotated with the
323/// collected samples on all the instructions associated with that
324/// statement.
325///
326/// For this to produce meaningful data, the program needs to be
327/// compiled with some debug information (at minimum, line numbers:
328/// -gline-tables-only). Otherwise, it will be impossible to match IR
329/// instructions to the line numbers collected by the profiler.
330///
331/// From the profile file, we are interested in collecting the
332/// following information:
333///
334/// * A list of functions included in the profile (mangled names).
335///
336/// * For each function F:
337/// 1. The total number of samples collected in F.
338///
339/// 2. The samples collected at each line in F. To provide some
340/// protection against source code shuffling, line numbers should
341/// be relative to the start of the function.
342///
343/// The reader supports two file formats: text and binary. The text format
344/// is useful for debugging and testing, while the binary format is more
345/// compact and I/O efficient. They can both be used interchangeably.
346class SampleProfileReader {
347public:
348 SampleProfileReader(std::unique_ptr<MemoryBuffer> B, LLVMContext &C,
350 : Profiles(0), Ctx(C), Buffer(std::move(B)), Format(Format) {}
351
352 virtual ~SampleProfileReader() = default;
353
354 /// Read and validate the file header.
355 virtual std::error_code readHeader() = 0;
356
357 /// Set the bits for FS discriminators. Parameter Pass specifies the sequence
358 /// number, Pass == i is for the i-th round of adding FS discriminators.
359 /// Pass == 0 is for using base discriminators.
362 }
363
364 /// Get the bitmask for the discriminators: For FS profiles, return the bit
365 /// mask for this pass. For non FS profiles, return (unsigned) -1.
367 if (!ProfileIsFS)
368 return 0xFFFFFFFF;
369 assert((MaskedBitFrom != 0) && "MaskedBitFrom is not set properly");
370 return getN1Bits(MaskedBitFrom);
371 }
372
373 /// The interface to read sample profiles from the associated file.
374 std::error_code read() {
375 if (std::error_code EC = readImpl())
376 return EC;
377 if (Remapper)
378 Remapper->applyRemapping(Ctx);
381 }
382
383 /// The implementation to read sample profiles from the associated file.
384 virtual std::error_code readImpl() = 0;
385
386 /// Print the profile for \p FContext on stream \p OS.
388
389 /// Collect functions with definitions in Module M. For readers that
390 /// support loading function profiles on demand, return true when the
391 /// reader has been given a module. Always return false for readers
392 /// that don't support loading function profiles on demand.
393 virtual bool collectFuncsFromModule() { return false; }
394
395 /// Print all the profiles on stream \p OS.
396 void dump(raw_ostream &OS = dbgs());
397
398 /// Print all the profiles on stream \p OS in the JSON format.
399 void dumpJson(raw_ostream &OS = dbgs());
400
401 /// Return the samples collected for function \p F.
403 // The function name may have been updated by adding suffix. Call
404 // a helper to (optionally) strip off suffixes so that we can
405 // match against the original function name in the profile.
407 return getSamplesFor(CanonName);
408 }
409
410 /// Return the samples collected for function \p F, create empty
411 /// FunctionSamples if it doesn't exist.
413 std::string FGUID;
415 CanonName = getRepInFormat(CanonName, useMD5(), FGUID);
416 auto It = Profiles.find(CanonName);
417 if (It != Profiles.end())
418 return &It->second;
419 if (!FGUID.empty()) {
420 assert(useMD5() && "New name should only be generated for md5 profile");
421 CanonName = *MD5NameBuffer.insert(FGUID).first;
422 }
423 return &Profiles[CanonName];
424 }
425
426 /// Return the samples collected for function \p F.
428 std::string FGUID;
429 Fname = getRepInFormat(Fname, useMD5(), FGUID);
430 auto It = Profiles.find(Fname);
431 if (It != Profiles.end())
432 return &It->second;
433
434 if (Remapper) {
435 if (auto NameInProfile = Remapper->lookUpNameInProfile(Fname)) {
436 auto It = Profiles.find(*NameInProfile);
437 if (It != Profiles.end())
438 return &It->second;
439 }
440 }
441 return nullptr;
442 }
443
444 /// Return all the profiles.
446
447 /// Report a parse error message.
448 void reportError(int64_t LineNumber, const Twine &Msg) const {
449 Ctx.diagnose(DiagnosticInfoSampleProfile(Buffer->getBufferIdentifier(),
450 LineNumber, Msg));
451 }
452
453 /// Create a sample profile reader appropriate to the file format.
454 /// Create an underlying remapper if RemapFilename is not empty.
455 /// Parameter P specifies the FSDiscriminatorPass.
457 create(const std::string Filename, LLVMContext &C, vfs::FileSystem &FS,
459 const std::string RemapFilename = "");
460
461 /// Create a sample profile reader from the supplied memory buffer.
462 /// Create an underlying remapper if RemapFilename is not empty.
463 /// Parameter P specifies the FSDiscriminatorPass.
465 create(std::unique_ptr<MemoryBuffer> &B, LLVMContext &C, vfs::FileSystem &FS,
467 const std::string RemapFilename = "");
468
469 /// Return the profile summary.
470 ProfileSummary &getSummary() const { return *(Summary.get()); }
471
472 MemoryBuffer *getBuffer() const { return Buffer.get(); }
473
474 /// \brief Return the profile format.
476
477 /// Whether input profile is based on pseudo probes.
479
480 /// Whether input profile is fully context-sensitive.
481 bool profileIsCS() const { return ProfileIsCS; }
482
483 /// Whether input profile contains ShouldBeInlined contexts.
485
486 virtual std::unique_ptr<ProfileSymbolList> getProfileSymbolList() {
487 return nullptr;
488 };
489
490 /// It includes all the names that have samples either in outline instance
491 /// or inline instance.
492 virtual std::vector<StringRef> *getNameTable() { return nullptr; }
493 virtual bool dumpSectionInfo(raw_ostream &OS = dbgs()) { return false; };
494
495 /// Return whether names in the profile are all MD5 numbers.
496 virtual bool useMD5() { return false; }
497
498 /// Don't read profile without context if the flag is set. This is only meaningful
499 /// for ExtBinary format.
500 virtual void setSkipFlatProf(bool Skip) {}
501 /// Return whether any name in the profile contains ".__uniq." suffix.
502 virtual bool hasUniqSuffix() { return false; }
503
505
506 void setModule(const Module *Mod) { M = Mod; }
507
508protected:
509 /// Map every function to its associated profile.
510 ///
511 /// The profile of every function executed at runtime is collected
512 /// in the structure FunctionSamples. This maps function objects
513 /// to their corresponding profiles.
515
516 /// LLVM context used to emit diagnostics.
518
519 /// Memory buffer holding the profile file.
520 std::unique_ptr<MemoryBuffer> Buffer;
521
522 /// Extra name buffer holding names created on demand.
523 /// This should only be needed for md5 profiles.
524 std::unordered_set<std::string> MD5NameBuffer;
525
526 /// Profile summary information.
527 std::unique_ptr<ProfileSummary> Summary;
528
529 /// Take ownership of the summary of this reader.
530 static std::unique_ptr<ProfileSummary>
532 return std::move(Reader.Summary);
533 }
534
535 /// Compute summary for this profile.
536 void computeSummary();
537
538 std::unique_ptr<SampleProfileReaderItaniumRemapper> Remapper;
539
540 /// \brief Whether samples are collected based on pseudo probes.
542
543 /// Whether function profiles are context-sensitive flat profiles.
544 bool ProfileIsCS = false;
545
546 /// Whether function profile contains ShouldBeInlined contexts.
548
549 /// Number of context-sensitive profiles.
551
552 /// Whether the function profiles use FS discriminators.
553 bool ProfileIsFS = false;
554
555 /// \brief The format of sample.
557
558 /// \brief The current module being compiled if SampleProfileReader
559 /// is used by compiler. If SampleProfileReader is used by other
560 /// tools which are not compiler, M is usually nullptr.
561 const Module *M = nullptr;
562
563 /// Zero out the discriminator bits higher than bit MaskedBitFrom (0 based).
564 /// The default is to keep all the bits.
566};
567
568class SampleProfileReaderText : public SampleProfileReader {
569public:
570 SampleProfileReaderText(std::unique_ptr<MemoryBuffer> B, LLVMContext &C)
572
573 /// Read and validate the file header.
574 std::error_code readHeader() override { return sampleprof_error::success; }
575
576 /// Read sample profiles from the associated file.
577 std::error_code readImpl() override;
578
579 /// Return true if \p Buffer is in the format supported by this class.
580 static bool hasFormat(const MemoryBuffer &Buffer);
581
582private:
583 /// CSNameTable is used to save full context vectors. This serves as an
584 /// underlying immutable buffer for all clients.
585 std::list<SampleContextFrameVector> CSNameTable;
586};
587
588class SampleProfileReaderBinary : public SampleProfileReader {
589public:
590 SampleProfileReaderBinary(std::unique_ptr<MemoryBuffer> B, LLVMContext &C,
593
594 /// Read and validate the file header.
595 std::error_code readHeader() override;
596
597 /// Read sample profiles from the associated file.
598 std::error_code readImpl() override;
599
600 /// It includes all the names that have samples either in outline instance
601 /// or inline instance.
602 std::vector<StringRef> *getNameTable() override { return &NameTable; }
603
604protected:
605 /// Read a numeric value of type T from the profile.
606 ///
607 /// If an error occurs during decoding, a diagnostic message is emitted and
608 /// EC is set.
609 ///
610 /// \returns the read value.
611 template <typename T> ErrorOr<T> readNumber();
612
613 /// Read a numeric value of type T from the profile. The value is saved
614 /// without encoding.
615 template <typename T> ErrorOr<T> readUnencodedNumber();
616
617 /// Read a string from the profile.
618 ///
619 /// If an error occurs during decoding, a diagnostic message is emitted and
620 /// EC is set.
621 ///
622 /// \returns the read value.
624
625 /// Read the string index and check whether it overflows the table.
626 template <typename T> inline ErrorOr<uint32_t> readStringIndex(T &Table);
627
628 /// Return true if we've reached the end of file.
629 bool at_eof() const { return Data >= End; }
630
631 /// Read the next function profile instance.
632 std::error_code readFuncProfile(const uint8_t *Start);
633
634 /// Read the contents of the given profile instance.
635 std::error_code readProfile(FunctionSamples &FProfile);
636
637 /// Read the contents of Magic number and Version number.
638 std::error_code readMagicIdent();
639
640 /// Read profile summary.
641 std::error_code readSummary();
642
643 /// Read the whole name table.
644 virtual std::error_code readNameTable();
645
646 /// Points to the current location in the buffer.
647 const uint8_t *Data = nullptr;
648
649 /// Points to the end of the buffer.
650 const uint8_t *End = nullptr;
651
652 /// Function name table.
653 std::vector<StringRef> NameTable;
654
655 /// Read a string indirectly via the name table.
658
659private:
660 std::error_code readSummaryEntry(std::vector<ProfileSummaryEntry> &Entries);
661 virtual std::error_code verifySPMagic(uint64_t Magic) = 0;
662};
663
664class SampleProfileReaderRawBinary : public SampleProfileReaderBinary {
665private:
666 std::error_code verifySPMagic(uint64_t Magic) override;
667
668public:
669 SampleProfileReaderRawBinary(std::unique_ptr<MemoryBuffer> B, LLVMContext &C,
672
673 /// \brief Return true if \p Buffer is in the format supported by this class.
674 static bool hasFormat(const MemoryBuffer &Buffer);
675};
676
677/// SampleProfileReaderExtBinaryBase/SampleProfileWriterExtBinaryBase defines
678/// the basic structure of the extensible binary format.
679/// The format is organized in sections except the magic and version number
680/// at the beginning. There is a section table before all the sections, and
681/// each entry in the table describes the entry type, start, size and
682/// attributes. The format in each section is defined by the section itself.
683///
684/// It is easy to add a new section while maintaining the backward
685/// compatibility of the profile. Nothing extra needs to be done. If we want
686/// to extend an existing section, like add cache misses information in
687/// addition to the sample count in the profile body, we can add a new section
688/// with the extension and retire the existing section, and we could choose
689/// to keep the parser of the old section if we want the reader to be able
690/// to read both new and old format profile.
691///
692/// SampleProfileReaderExtBinary/SampleProfileWriterExtBinary define the
693/// commonly used sections of a profile in extensible binary format. It is
694/// possible to define other types of profile inherited from
695/// SampleProfileReaderExtBinaryBase/SampleProfileWriterExtBinaryBase.
696class SampleProfileReaderExtBinaryBase : public SampleProfileReaderBinary {
697private:
698 std::error_code decompressSection(const uint8_t *SecStart,
699 const uint64_t SecSize,
700 const uint8_t *&DecompressBuf,
701 uint64_t &DecompressBufSize);
702
703 BumpPtrAllocator Allocator;
704
705protected:
706 std::vector<SecHdrTableEntry> SecHdrTable;
707 std::error_code readSecHdrTableEntry(uint32_t Idx);
708 std::error_code readSecHdrTable();
709
710 std::error_code readFuncMetadata(bool ProfileHasAttribute);
711 std::error_code readFuncMetadata(bool ProfileHasAttribute,
712 FunctionSamples *FProfile);
713 std::error_code readFuncOffsetTable();
714 std::error_code readFuncProfiles();
715 std::error_code readMD5NameTable();
716 std::error_code readNameTableSec(bool IsMD5);
717 std::error_code readCSNameTableSec();
718 std::error_code readProfileSymbolList();
719
720 std::error_code readHeader() override;
721 std::error_code verifySPMagic(uint64_t Magic) override = 0;
722 virtual std::error_code readOneSection(const uint8_t *Start, uint64_t Size,
723 const SecHdrTableEntry &Entry);
724 // placeholder for subclasses to dispatch their own section readers.
725 virtual std::error_code readCustomSection(const SecHdrTableEntry &Entry) = 0;
729
730 std::unique_ptr<ProfileSymbolList> ProfSymList;
731
732 /// The table mapping from function context to the offset of its
733 /// FunctionSample towards file start.
735
736 /// Function offset mapping ordered by contexts.
737 std::unique_ptr<std::vector<std::pair<SampleContext, uint64_t>>>
739
740 /// The set containing the functions to use when compiling a module.
742
743 /// Use fixed length MD5 instead of ULEB128 encoding so NameTable doesn't
744 /// need to be read in up front and can be directly accessed using index.
745 bool FixedLengthMD5 = false;
746 /// The starting address of NameTable containing fixed length MD5.
747 const uint8_t *MD5NameMemStart = nullptr;
748
749 /// If MD5 is used in NameTable section, the section saves uint64_t data.
750 /// The uint64_t data has to be converted to a string and then the string
751 /// will be used to initialize StringRef in NameTable.
752 /// Note NameTable contains StringRef so it needs another buffer to own
753 /// the string data. MD5StringBuf serves as the string buffer that is
754 /// referenced by NameTable (vector of StringRef). We make sure
755 /// the lifetime of MD5StringBuf is not shorter than that of NameTable.
756 std::unique_ptr<std::vector<std::string>> MD5StringBuf;
757
758 /// CSNameTable is used to save full context vectors. This serves as an
759 /// underlying immutable buffer for all clients.
760 std::unique_ptr<const std::vector<SampleContextFrameVector>> CSNameTable;
761
762 /// If SkipFlatProf is true, skip the sections with
763 /// SecFlagFlat flag.
764 bool SkipFlatProf = false;
765
766 bool FuncOffsetsOrdered = false;
767
768public:
769 SampleProfileReaderExtBinaryBase(std::unique_ptr<MemoryBuffer> B,
772
773 /// Read sample profiles in extensible format from the associated file.
774 std::error_code readImpl() override;
775
776 /// Get the total size of all \p Type sections.
778 /// Get the total size of header and all sections.
780 bool dumpSectionInfo(raw_ostream &OS = dbgs()) override;
781
782 /// Collect functions with definitions in Module M. Return true if
783 /// the reader has been given a module.
784 bool collectFuncsFromModule() override;
785
786 /// Return whether names in the profile are all MD5 numbers.
787 bool useMD5() override { return MD5StringBuf.get(); }
788
789 std::unique_ptr<ProfileSymbolList> getProfileSymbolList() override {
790 return std::move(ProfSymList);
791 };
792
793 void setSkipFlatProf(bool Skip) override { SkipFlatProf = Skip; }
794};
795
796class SampleProfileReaderExtBinary : public SampleProfileReaderExtBinaryBase {
797private:
798 std::error_code verifySPMagic(uint64_t Magic) override;
799 std::error_code readCustomSection(const SecHdrTableEntry &Entry) override {
800 // Update the data reader pointer to the end of the section.
801 Data = End;
803 };
804
805public:
806 SampleProfileReaderExtBinary(std::unique_ptr<MemoryBuffer> B, LLVMContext &C,
809
810 /// \brief Return true if \p Buffer is in the format supported by this class.
811 static bool hasFormat(const MemoryBuffer &Buffer);
812};
813
814class SampleProfileReaderCompactBinary : public SampleProfileReaderBinary {
815private:
816 /// Function name table.
817 std::vector<std::string> NameTable;
818 /// The table mapping from function name to the offset of its FunctionSample
819 /// towards file start.
820 DenseMap<StringRef, uint64_t> FuncOffsetTable;
821 /// The set containing the functions to use when compiling a module.
822 DenseSet<StringRef> FuncsToUse;
823 std::error_code verifySPMagic(uint64_t Magic) override;
824 std::error_code readNameTable() override;
825 /// Read a string indirectly via the name table.
826 ErrorOr<StringRef> readStringFromTable() override;
827 std::error_code readHeader() override;
828 std::error_code readFuncOffsetTable();
829
830public:
831 SampleProfileReaderCompactBinary(std::unique_ptr<MemoryBuffer> B,
832 LLVMContext &C)
834
835 /// \brief Return true if \p Buffer is in the format supported by this class.
836 static bool hasFormat(const MemoryBuffer &Buffer);
837
838 /// Read samples only for functions to use.
839 std::error_code readImpl() override;
840
841 /// Collect functions with definitions in Module M. Return true if
842 /// the reader has been given a module.
843 bool collectFuncsFromModule() override;
844
845 /// Return whether names in the profile are all MD5 numbers.
846 bool useMD5() override { return true; }
847};
848
850
851// Supported histogram types in GCC. Currently, we only need support for
852// call target histograms.
863
864class SampleProfileReaderGCC : public SampleProfileReader {
class SampleProfileReaderGCC : public SampleProfileReader {
public:
  SampleProfileReaderGCC(std::unique_ptr<MemoryBuffer> B, LLVMContext &C)
      : SampleProfileReader(std::move(B), C, SPF_GCC),
        GcovBuffer(Buffer.get()) {}

  /// Read and validate the file header.
  std::error_code readHeader() override;

  /// Read sample profiles from the associated file.
  std::error_code readImpl() override;

  /// Return true if \p Buffer is in the format supported by this class.
  static bool hasFormat(const MemoryBuffer &Buffer);

protected:
  std::error_code readNameTable();
  std::error_code readOneFunctionProfile(const InlineCallStack &InlineStack,
                                         bool Update, uint32_t Offset);
  std::error_code readFunctionProfiles();
  std::error_code skipNextWord();
  template <typename T> ErrorOr<T> readNumber();
  ErrorOr<StringRef> readString();

  /// Read the section tag and check that it's the same as \p Expected.
  std::error_code readSectionTag(uint32_t Expected);

  /// GCOV buffer containing the profile.
  GCOVBuffer GcovBuffer;

  /// Function names in this profile.
  std::vector<std::string> Names;

  /// GCOV tags used to separate sections in the profile file.
  static const uint32_t GCOVTagAFDOFileNames = 0xaa000000;
  static const uint32_t GCOVTagAFDOFunction = 0xac000000;
};

} // end namespace sampleprof

} // end namespace llvm

#endif // LLVM_PROFILEDATA_SAMPLEPROFREADER_H