LLVM 23.0.0git
Class for storing and accessing the IR2Vec vocabulary.
#include "llvm/Analysis/IR2Vec.h"
Public Types | |
| enum class | CanonicalTypeID : unsigned { FloatTy , VoidTy , LabelTy , MetadataTy , VectorTy , TokenTy , IntegerTy , FunctionTy , PointerTy , StructTy , ArrayTy , UnknownTy , MaxCanonicalType } |
| Canonical type IDs supported by IR2Vec Vocabulary. | |
| enum class | OperandKind : unsigned { FunctionID , PointerID , ConstantID , VariableID , MaxOperandKind } |
| Operand kinds supported by IR2Vec Vocabulary. | |
| using | const_iterator = VocabStorage::const_iterator |
| Const Iterator type aliases. | |
Public Member Functions | |
| Vocabulary ()=default | |
| LLVM_ABI | Vocabulary (VocabStorage &&Storage) |
| Vocabulary (const Vocabulary &)=delete | |
| Vocabulary & | operator= (const Vocabulary &)=delete |
| Vocabulary (Vocabulary &&)=default | |
| Vocabulary & | operator= (Vocabulary &&Other)=delete |
| LLVM_ABI bool | isValid () const |
| LLVM_ABI unsigned | getDimension () const |
| LLVM_ABI const ir2vec::Embedding & | operator[] (unsigned Opcode) const |
| Accessors to get the embedding for a given entity. | |
| LLVM_ABI const ir2vec::Embedding & | operator[] (Type::TypeID TypeID) const |
| LLVM_ABI const ir2vec::Embedding & | operator[] (const Value &Arg) const |
| LLVM_ABI const ir2vec::Embedding & | operator[] (CmpInst::Predicate P) const |
| const_iterator | begin () const |
| const_iterator | cbegin () const |
| const_iterator | end () const |
| const_iterator | cend () const |
| LLVM_ABI bool | invalidate (Module &M, const PreservedAnalyses &PA, ModuleAnalysisManager::Invalidator &Inv) const |
Static Public Member Functions | |
| static LLVM_ABI Expected< Vocabulary > | fromFile (StringRef VocabFilePath, float OpcWeight=1.0, float TypeWeight=0.5, float ArgWeight=0.2) |
| Create a Vocabulary by loading embeddings from a JSON file. | |
| static constexpr size_t | getCanonicalSize () |
| Total number of entries (opcodes + canonicalized types + operand kinds + predicates) | |
| static LLVM_ABI StringRef | getVocabKeyForOpcode (unsigned Opcode) |
| Function to get vocabulary key for a given Opcode. | |
| static LLVM_ABI StringRef | getVocabKeyForTypeID (Type::TypeID TypeID) |
| Function to get vocabulary key for a given TypeID. | |
| static LLVM_ABI StringRef | getVocabKeyForOperandKind (OperandKind Kind) |
| Function to get vocabulary key for a given OperandKind. | |
| static LLVM_ABI OperandKind | getOperandKind (const Value *Op) |
| Function to classify an operand into OperandKind. | |
| static LLVM_ABI StringRef | getVocabKeyForPredicate (CmpInst::Predicate P) |
| Function to get vocabulary key for a given predicate. | |
| static LLVM_ABI unsigned | getIndex (unsigned Opcode) |
| Functions to return flat index. | |
| static LLVM_ABI unsigned | getIndex (Type::TypeID TypeID) |
| static LLVM_ABI unsigned | getIndex (const Value &Op) |
| static LLVM_ABI unsigned | getIndex (CmpInst::Predicate P) |
| static LLVM_ABI StringRef | getStringKey (unsigned Pos) |
| Returns the string key for a given index position in the vocabulary. | |
| static LLVM_ABI VocabStorage | createDummyVocabForTest (unsigned Dim=1) |
| Create a dummy vocabulary for testing purposes. | |
Static Public Attributes | |
| static constexpr unsigned | MaxTypeIDs = Type::TypeID::TargetExtTyID + 1 |
| static constexpr unsigned | MaxCanonicalTypeIDs |
| static constexpr unsigned | MaxOperandKinds |
| static constexpr unsigned | MaxPredicateKinds |
Friends | |
| class | llvm::IR2VecVocabAnalysis |
Class for storing and accessing the IR2Vec vocabulary.
The Vocabulary class manages seed embeddings for LLVM IR entities. The seed embeddings are the initial learned representations of the entities of LLVM IR. The IR2Vec representation for a given IR is derived from these seed embeddings.
The vocabulary contains the seed embeddings for three types of entities: instruction opcodes, types, and operands. Types are grouped (canonicalized) for better learning; for example, all float variants map to FloatTy. The vocabulary abstracts this canonicalization away: the exposed APIs handle all known LLVM IR opcodes, types, and operands.
This class helps populate the seed embeddings in an internal vector-based ADT. It maps every IR entity to a specific slot index in this vector, which enables O(1) embedding lookup and avoids string-based lookups while generating the embeddings.
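The slot-index idea can be illustrated with a small standalone sketch. It is not LLVM's implementation: `ToyVocab`, the section sizes, and the helper names are hypothetical, and the layout (opcodes first, then types, then operand kinds, each in a contiguous section of one flat vector) is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical section sizes; the real values come from LLVM's opcode,
// type, and operand-kind tables.
constexpr std::size_t NumOpcodes = 4;
constexpr std::size_t NumTypes = 3;
constexpr std::size_t NumOperandKinds = 2;
constexpr std::size_t TotalSlots = NumOpcodes + NumTypes + NumOperandKinds;

using Embedding = std::vector<double>;

// Each entity kind owns a contiguous section of one flat vector, so a slot
// is the section's starting offset plus the entity's local ID.
constexpr std::size_t opcodeSlot(std::size_t Opcode) { return Opcode; }
constexpr std::size_t typeSlot(std::size_t TypeID) {
  return NumOpcodes + TypeID;
}
constexpr std::size_t operandSlot(std::size_t Kind) {
  return NumOpcodes + NumTypes + Kind;
}

// Toy vocabulary: every lookup is a bounds check plus one vector index.
struct ToyVocab {
  std::vector<Embedding> Slots;

  explicit ToyVocab(std::size_t Dim)
      : Slots(TotalSlots, Embedding(Dim, 0.0)) {}

  const Embedding &forOpcode(std::size_t Opcode) const {
    assert(Opcode < NumOpcodes && "opcode out of range");
    return Slots[opcodeSlot(Opcode)];
  }
  const Embedding &forType(std::size_t TypeID) const {
    assert(TypeID < NumTypes && "type ID out of range");
    return Slots[typeSlot(TypeID)];
  }
  const Embedding &forOperandKind(std::size_t Kind) const {
    assert(Kind < NumOperandKinds && "operand kind out of range");
    return Slots[operandSlot(Kind)];
  }
};
```

With this layout, `ToyVocab V(3); V.forType(2);` reads slot `NumOpcodes + 2` directly, with no string key involved at lookup time.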
|
strong |
Canonical type IDs supported by IR2Vec Vocabulary.
| Enumerator | |
|---|---|
| FloatTy | |
| VoidTy | |
| LabelTy | |
| MetadataTy | |
| VectorTy | |
| TokenTy | |
| IntegerTy | |
| FunctionTy | |
| PointerTy | |
| StructTy | |
| ArrayTy | |
| UnknownTy | |
| MaxCanonicalType | |
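As a rough illustration of canonicalization, the sketch below maps a few raw type IDs onto the enumerators listed above. Only the "all float variants map to FloatTy" rule comes from the class description; `RawTypeID`, `canonicalize`, and the vector grouping are hypothetical stand-ins and are not `llvm::Type::TypeID` or LLVM's actual mapping.

```cpp
// Mirrors the enumerators listed in the table above.
enum class CanonicalTypeID : unsigned {
  FloatTy, VoidTy, LabelTy, MetadataTy, VectorTy, TokenTy, IntegerTy,
  FunctionTy, PointerTy, StructTy, ArrayTy, UnknownTy, MaxCanonicalType
};

// Hypothetical stand-in for a handful of raw IR type IDs.
enum class RawTypeID {
  Half, BFloat, Float, Double, Integer, Pointer,
  FixedVector, ScalableVector, Other
};

// Collapse several raw type IDs into one canonical ID so that related
// types share a single seed embedding.
CanonicalTypeID canonicalize(RawTypeID T) {
  switch (T) {
  case RawTypeID::Half:
  case RawTypeID::BFloat:
  case RawTypeID::Float:
  case RawTypeID::Double:
    return CanonicalTypeID::FloatTy; // all float variants share FloatTy
  case RawTypeID::Integer:
    return CanonicalTypeID::IntegerTy;
  case RawTypeID::Pointer:
    return CanonicalTypeID::PointerTy;
  case RawTypeID::FixedVector:
  case RawTypeID::ScalableVector:
    return CanonicalTypeID::VectorTy; // assumed grouping, for illustration
  case RawTypeID::Other:
    return CanonicalTypeID::UnknownTy;
  }
  return CanonicalTypeID::UnknownTy; // defensive fallback
}
```

Here `canonicalize(RawTypeID::Half)` and `canonicalize(RawTypeID::Double)` both yield `CanonicalTypeID::FloatTy`, so the two types would resolve to the same vocabulary entry.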
|
strong |
Operand kinds supported by IR2Vec Vocabulary.
| Enumerator | |
|---|---|
| FunctionID | |
| PointerID | |
| ConstantID | |
| VariableID | |
| MaxOperandKind | |
|
default |
Referenced by fromFile(), operator=(), operator=(), Vocabulary(), and Vocabulary().
|
inline |
Definition at line 328 of file IR2Vec.h.
References LLVM_ABI, and llvm::move().
|
delete |
References Vocabulary().
|
default |
References Vocabulary().
|
inline |
|
inline |
|
inline |
|
static |
Create a dummy vocabulary for testing purposes.
Definition at line 432 of file IR2Vec.cpp.
References I, MaxCanonicalTypeIDs, MaxOperandKinds, MaxPredicateKinds, and llvm::ir2vec::VocabStorage::VocabStorage().
|
inline |
|
static |
Create a Vocabulary by loading embeddings from a JSON file.
This is the primary entry point for programmatic vocabulary creation, suitable for use in Python bindings or other contexts where command-line options are not available. Weights are applied to scale the embeddings for opcodes, types, and arguments respectively.
Definition at line 609 of file IR2Vec.cpp.
References llvm::ir2vec::ArgWeight, llvm::ir2vec::OpcWeight, llvm::ir2vec::TypeWeight, and Vocabulary().
Referenced by llvm::IR2VecVocabAnalysis::run().
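The effect of the three weights can be sketched without LLVM. The default values below (1.0, 0.5, 0.2) are taken from the `fromFile()` signature; everything else (`ToySections`, `scaleSection`, `applyWeights`, and the choice to scale each section in place) is a hypothetical simplification, not the library's code.

```cpp
#include <vector>

using Embedding = std::vector<float>;

// Hypothetical container holding the three groups of seed embeddings.
struct ToySections {
  std::vector<Embedding> Opcodes;
  std::vector<Embedding> Types;
  std::vector<Embedding> Args;
};

// Multiply every component of every embedding in one section by Weight.
void scaleSection(std::vector<Embedding> &Section, float Weight) {
  for (Embedding &E : Section)
    for (float &X : E)
      X *= Weight;
}

// Apply one weight per entity group; defaults mirror fromFile()'s.
void applyWeights(ToySections &S, float OpcWeight = 1.0f,
                  float TypeWeight = 0.5f, float ArgWeight = 0.2f) {
  scaleSection(S.Opcodes, OpcWeight);
  scaleSection(S.Types, TypeWeight);
  scaleSection(S.Args, ArgWeight);
}
```

With the default weights, opcode embeddings are left unchanged, type embeddings are halved, and argument embeddings are scaled to one fifth of their stored values.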
|
inlinestaticconstexpr |
|
inlinestatic |
Definition at line 391 of file IR2Vec.h.
References assert(), getOperandKind(), LLVM_ABI, and MaxOperandKinds.
|
inlinestatic |
Definition at line 386 of file IR2Vec.h.
References assert(), LLVM_ABI, and MaxTypeIDs.
|
static |
Function to classify an operand into OperandKind.
Definition at line 369 of file IR2Vec.cpp.
References ConstantID, FunctionID, llvm::isa(), PointerID, and VariableID.
Referenced by getIndex(), and operator[]().
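A minimal sketch of operand classification follows. The four result kinds come from the `OperandKind` enum above; `ToyOperand`, its boolean fields, and the order of the checks are assumptions made for illustration and are not taken from IR2Vec.cpp.

```cpp
// Mirrors the OperandKind enumerators listed above.
enum class OperandKind : unsigned {
  FunctionID, PointerID, ConstantID, VariableID, MaxOperandKind
};

// Hypothetical stand-in for llvm::Value: plain flags replace the isa<>
// and type queries the real code would perform.
struct ToyOperand {
  bool IsFunction;
  bool HasPointerType;
  bool IsConstant;
};

// Assumed precedence: function, then pointer, then constant; anything
// else is treated as a variable.
OperandKind classify(const ToyOperand &Op) {
  if (Op.IsFunction)
    return OperandKind::FunctionID;
  if (Op.HasPointerType)
    return OperandKind::PointerID;
  if (Op.IsConstant)
    return OperandKind::ConstantID;
  return OperandKind::VariableID;
}
```

Because the checks are ordered, an operand that satisfies several conditions (for example, a constant with pointer type) falls into exactly one bucket, which keeps the mapping to a vocabulary slot unambiguous.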
Returns the string key for a given index position in the vocabulary.
This is useful for debugging or printing the vocabulary. Do not use it for embedding generation, because string-based lookups are inefficient.
Definition at line 408 of file IR2Vec.cpp.
References assert(), getVocabKeyForOpcode(), getVocabKeyForOperandKind(), and getVocabKeyForPredicate().
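The reverse mapping from a flat position back to a string key can be sketched as below. The key strings, section sizes, and the `keyAt` helper are hypothetical; the sketch only shows the idea of walking the sections in layout order, which is assumed rather than confirmed by this page.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical keys and section sizes, chosen only for illustration.
static const char *const OpcodeKeys[] = {"Add", "Load", "Store"};
static const char *const TypeKeys[] = {"FloatTy", "IntegerTy"};
static const char *const OperandKeys[] = {"Function", "Variable"};

constexpr std::size_t NumOpcodes = 3;
constexpr std::size_t NumTypes = 2;
constexpr std::size_t NumOperands = 2;

// Walk the sections in layout order and translate the flat position back
// into the key of the section it falls in.
std::string keyAt(std::size_t Pos) {
  assert(Pos < NumOpcodes + NumTypes + NumOperands && "position out of range");
  if (Pos < NumOpcodes)
    return OpcodeKeys[Pos];
  Pos -= NumOpcodes;
  if (Pos < NumTypes)
    return TypeKeys[Pos];
  Pos -= NumTypes;
  return OperandKeys[Pos];
}
```

A printing loop would call `keyAt(I)` for each position `I`. Embedding generation should keep using the integer index, which is the point of the warning above.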
Function to get vocabulary key for a given Opcode.
Definition at line 357 of file IR2Vec.cpp.
References assert().
Referenced by getStringKey().
|
inlinestatic |
Function to get vocabulary key for a given OperandKind.
Definition at line 368 of file IR2Vec.h.
References assert(), LLVM_ABI, and MaxOperandKinds.
Referenced by getStringKey().
|
static |
Function to get vocabulary key for a given predicate.
Definition at line 398 of file IR2Vec.cpp.
References llvm::CmpInst::FIRST_ICMP_PREDICATE, and llvm::CmpInst::getPredicateName().
Referenced by getStringKey().
|
inlinestatic |
bool Vocabulary::invalidate (Module &M, const PreservedAnalyses &PA, ModuleAnalysisManager::Invalidator &Inv) const
Definition at line 426 of file IR2Vec.cpp.
References llvm::PreservedAnalyses::getChecker(), and llvm::IR2VecVocabAnalysis.
Definition at line 346 of file IR2Vec.h.
References LLVM_ABI.
Referenced by begin(), end(), getDimension(), llvm::FunctionPropertiesInfo::getFunctionPropertiesInfo(), and llvm::IR2VecPrinterPass::run().
|
delete |
References Vocabulary().
|
delete |
References llvm::ir2vec::ArgWeight, LLVM_ABI, llvm::ir2vec::OpcWeight, llvm::Other, llvm::ir2vec::TypeWeight, and Vocabulary().
|
inline |
|
inline |
Definition at line 413 of file IR2Vec.h.
References assert(), getOperandKind(), LLVM_ABI, and MaxOperandKinds.
|
inline |
Definition at line 407 of file IR2Vec.h.
References assert(), LLVM_ABI, and MaxTypeIDs.
|
inline |
|
friend |
Definition at line 249 of file IR2Vec.h.
Referenced by invalidate().
|
staticconstexpr |
Definition at line 318 of file IR2Vec.h.
Referenced by createDummyVocabForTest().
|
staticconstexpr |
Definition at line 320 of file IR2Vec.h.
Referenced by createDummyVocabForTest(), getIndex(), getVocabKeyForOperandKind(), and operator[]().
|
staticconstexpr |
Definition at line 324 of file IR2Vec.h.
Referenced by createDummyVocabForTest().
|
staticconstexpr |
Definition at line 317 of file IR2Vec.h.
Referenced by getIndex(), and operator[]().