LLVM 22.0.0git
llvm::ir2vec::Vocabulary Class Reference

Class for storing and accessing the IR2Vec vocabulary.

#include "llvm/Analysis/IR2Vec.h"

Public Types

enum class  CanonicalTypeID : unsigned {
  FloatTy , VoidTy , LabelTy , MetadataTy ,
  VectorTy , TokenTy , IntegerTy , FunctionTy ,
  PointerTy , StructTy , ArrayTy , UnknownTy ,
  MaxCanonicalType
}
 Canonical type IDs supported by IR2Vec Vocabulary.
 
enum class  OperandKind : unsigned {
  FunctionID , PointerID , ConstantID , VariableID ,
  MaxOperandKind
}
 Operand kinds supported by IR2Vec Vocabulary.
 
using const_iterator = VocabVector::const_iterator
 Const Iterator type aliases.
 

Public Member Functions

 Vocabulary ()=default
 
LLVM_ABI Vocabulary (VocabVector &&Vocab)
 
LLVM_ABI bool isValid () const
 
LLVM_ABI unsigned getDimension () const
 
LLVM_ABI const ir2vec::Embedding & operator[] (unsigned Opcode) const
 Accessors to get the embedding for a given entity.
 
LLVM_ABI const ir2vec::Embedding & operator[] (Type::TypeID TypeId) const
 
LLVM_ABI const ir2vec::Embedding & operator[] (const Value &Arg) const
 
const_iterator begin () const
 
const_iterator cbegin () const
 
const_iterator end () const
 
const_iterator cend () const
 
LLVM_ABI bool invalidate (Module &M, const PreservedAnalyses &PA, ModuleAnalysisManager::Invalidator &Inv) const
 

Static Public Member Functions

static constexpr size_t getCanonicalSize ()
 Total number of entries (opcodes + canonicalized types + operand kinds)
 
static LLVM_ABI StringRef getVocabKeyForOpcode (unsigned Opcode)
 Function to get vocabulary key for a given Opcode.
 
static LLVM_ABI StringRef getVocabKeyForTypeID (Type::TypeID TypeID)
 Function to get vocabulary key for a given TypeID.
 
static LLVM_ABI StringRef getVocabKeyForOperandKind (OperandKind Kind)
 Function to get vocabulary key for a given OperandKind.
 
static LLVM_ABI OperandKind getOperandKind (const Value *Op)
 Function to classify an operand into OperandKind.
 
static LLVM_ABI unsigned getSlotIndex (unsigned Opcode)
 Functions to return the slot index or position of a given Opcode, TypeID, or OperandKind in the vocabulary.
 
static LLVM_ABI unsigned getSlotIndex (Type::TypeID TypeID)
 
static LLVM_ABI unsigned getSlotIndex (const Value &Op)
 
static LLVM_ABI StringRef getStringKey (unsigned Pos)
 Returns the string key for a given index position in the vocabulary.
 
static LLVM_ABI VocabVector createDummyVocabForTest (unsigned Dim=1)
 Create a dummy vocabulary for testing purposes.
 

Static Public Attributes

static constexpr unsigned MaxTypeIDs = Type::TypeID::TargetExtTyID + 1
 
static constexpr unsigned MaxCanonicalTypeIDs
 
static constexpr unsigned MaxOperandKinds
 

Friends

class llvm::IR2VecVocabAnalysis
 

Detailed Description

Class for storing and accessing the IR2Vec vocabulary.

The Vocabulary class manages seed embeddings for LLVM IR entities. The seed embeddings are the initial learned representations of the entities of LLVM IR. The IR2Vec representation for a given IR is derived from these seed embeddings.

The vocabulary contains the seed embeddings for three types of entities: instruction opcodes, types, and operands. Types are grouped (canonicalized) for better learning; for example, all float variants map to FloatTy. The vocabulary abstracts away this canonicalization: the exposed APIs accept all known LLVM IR opcodes, types, and operands.
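The canonicalization described above can be illustrated with a minimal standalone sketch. It does not use the LLVM headers: RawType is a hypothetical stand-in for a handful of Type::TypeID values, CanonicalSubset is a reduced version of CanonicalTypeID, and every mapping other than "float variants map to FloatTy" is an assumption made for illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for a few llvm::Type::TypeID values (not the real enum).
enum class RawType { Half, BFloat, Float, Double, Integer, Pointer, Other };

// Reduced, illustrative subset of the canonical type IDs listed on this page.
enum class CanonicalSubset : unsigned { FloatTy, IntegerTy, PointerTy, UnknownTy };

// Collapse several raw type IDs into a single canonical ID.
constexpr CanonicalSubset canonicalize(RawType T) {
  switch (T) {
  case RawType::Half:
  case RawType::BFloat:
  case RawType::Float:
  case RawType::Double:
    return CanonicalSubset::FloatTy; // all float variants share one entry
  case RawType::Integer:
    return CanonicalSubset::IntegerTy; // assumed mapping
  case RawType::Pointer:
    return CanonicalSubset::PointerTy; // assumed mapping
  case RawType::Other:
    return CanonicalSubset::UnknownTy; // assumed fallback
  }
  return CanonicalSubset::UnknownTy;
}
```

Because several raw types share one canonical ID, the vocabulary needs only one seed embedding per canonical type rather than one per raw type.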

This class stores the seed embeddings in an internal vector-based ADT. It maps every IR entity to a specific slot index (position) in this vector, which gives O(1) embedding lookup and avoids string-based lookups while generating the embeddings.
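A minimal standalone sketch of the slot-index idea follows. The section sizes are hypothetical, the section order (opcodes, then canonical types, then operand kinds) is an assumption, and opcodes are treated as zero-based for simplicity; none of the names below are part of the LLVM API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical section sizes; the real counts come from LLVM's opcode list
// and the CanonicalTypeID / OperandKind enums.
constexpr unsigned NumOpcodes = 4;
constexpr unsigned NumCanonicalTypes = 3;
constexpr unsigned NumOperandKinds = 2;

// Total number of slots: opcodes + canonical types + operand kinds.
constexpr std::size_t canonicalSize() {
  return NumOpcodes + NumCanonicalTypes + NumOperandKinds;
}

// Assumed layout of the flat vector: [opcodes | canonical types | operand kinds].
constexpr unsigned opcodeSlot(unsigned Opcode) { return Opcode; }
constexpr unsigned typeSlot(unsigned CanonicalType) {
  return NumOpcodes + CanonicalType;
}
constexpr unsigned operandSlot(unsigned Kind) {
  return NumOpcodes + NumCanonicalTypes + Kind;
}

using Embedding = std::vector<double>;

// O(1) lookup: one index into the flat vector, with no string comparison.
inline const Embedding &lookup(const std::vector<Embedding> &Vocab,
                               unsigned Slot) {
  assert(Slot < Vocab.size() && "slot out of range");
  return Vocab[Slot];
}
```

With this layout, each entity kind only needs its own offset; the embedding storage itself stays a single contiguous vector.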

Definition at line 157 of file IR2Vec.h.

Member Typedef Documentation

◆ const_iterator

using llvm::ir2vec::Vocabulary::const_iterator = VocabVector::const_iterator

Const Iterator type aliases.

Definition at line 238 of file IR2Vec.h.

Member Enumeration Documentation

◆ CanonicalTypeID

Canonical type IDs supported by IR2Vec Vocabulary.

Enumerator
FloatTy 
VoidTy 
LabelTy 
MetadataTy 
VectorTy 
TokenTy 
IntegerTy 
FunctionTy 
PointerTy 
StructTy 
ArrayTy 
UnknownTy 
MaxCanonicalType 

Definition at line 170 of file IR2Vec.h.

◆ OperandKind

Operand kinds supported by IR2Vec Vocabulary.

Enumerator
FunctionID 
PointerID 
ConstantID 
VariableID 
MaxOperandKind 

Definition at line 187 of file IR2Vec.h.

Constructor & Destructor Documentation

◆ Vocabulary() [1/2]

llvm::ir2vec::Vocabulary::Vocabulary ( )
default

◆ Vocabulary() [2/2]

Vocabulary::Vocabulary ( VocabVector &&  Vocab)

Definition at line 263 of file IR2Vec.cpp.

Member Function Documentation

◆ begin()

const_iterator llvm::ir2vec::Vocabulary::begin ( ) const
inline

Definition at line 239 of file IR2Vec.h.

References assert().

◆ cbegin()

const_iterator llvm::ir2vec::Vocabulary::cbegin ( ) const
inline

Definition at line 244 of file IR2Vec.h.

References assert().

◆ cend()

const_iterator llvm::ir2vec::Vocabulary::cend ( ) const
inline

Definition at line 254 of file IR2Vec.h.

References assert().

◆ createDummyVocabForTest()

Vocabulary::VocabVector Vocabulary::createDummyVocabForTest ( unsigned  Dim = 1)
static

Create a dummy vocabulary for testing purposes.

Definition at line 369 of file IR2Vec.cpp.

References _, MaxCanonicalTypeIDs, MaxOperandKinds, and llvm::seq().
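As a rough standalone illustration of what a dummy vocabulary for tests might look like, the sketch below builds one placeholder embedding per slot, each with Dim components. The helper name makeDummyVocab, the explicit NumSlots parameter, and the fill values are all hypothetical; the values produced by the LLVM function may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Embedding = std::vector<double>;

// Build NumSlots placeholder embeddings, each with Dim components.
// The fill values are arbitrary; in a test only the shape usually matters.
inline std::vector<Embedding> makeDummyVocab(std::size_t NumSlots,
                                             unsigned Dim = 1) {
  std::vector<Embedding> Vocab;
  Vocab.reserve(NumSlots);
  for (std::size_t I = 0; I < NumSlots; ++I)
    Vocab.emplace_back(Dim, static_cast<double>(I)); // slot I holds value I
  return Vocab;
}
```

Giving each slot a distinct value makes it easy for a test to check that a lookup returned the intended slot.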

◆ end()

const_iterator llvm::ir2vec::Vocabulary::end ( ) const
inline

Definition at line 249 of file IR2Vec.h.

References assert().

◆ getCanonicalSize()

static constexpr size_t llvm::ir2vec::Vocabulary::getCanonicalSize ( )
inline static constexpr

Total number of entries (opcodes + canonicalized types + operand kinds)

Definition at line 212 of file IR2Vec.h.

◆ getDimension()

unsigned Vocabulary::getDimension ( ) const

Definition at line 270 of file IR2Vec.cpp.

References assert().

Referenced by llvm::FunctionPropertiesInfo::getFunctionPropertiesInfo().

◆ getOperandKind()

Vocabulary::OperandKind Vocabulary::getOperandKind ( const Value * Op)
static

Function to classify an operand into OperandKind.

Definition at line 338 of file IR2Vec.cpp.

References ConstantID, FunctionID, PointerID, and VariableID.

Referenced by getSlotIndex().
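The classification into the four operand kinds can be sketched without the LLVM headers as follows. OperandTraits is a hypothetical summary of an operand's properties, and the precedence of the checks (function, then pointer-typed, then constant, otherwise variable) is an assumption, not a statement of the actual implementation.

```cpp
#include <cassert>

enum class OperandKind : unsigned {
  FunctionID, PointerID, ConstantID, VariableID, MaxOperandKind
};

// Hypothetical summary of an operand; in LLVM these properties would be
// obtained by inspecting the llvm::Value itself.
struct OperandTraits {
  bool IsFunction;
  bool HasPointerType;
  bool IsConstant;
};

// Assumed precedence: function, then pointer-typed, then constant;
// anything else is treated as a variable.
constexpr OperandKind classify(OperandTraits T) {
  if (T.IsFunction)
    return OperandKind::FunctionID;
  if (T.HasPointerType)
    return OperandKind::PointerID;
  if (T.IsConstant)
    return OperandKind::ConstantID;
  return OperandKind::VariableID;
}
```

The order of the checks matters whenever an operand satisfies more than one predicate, which is why the precedence is called out as an assumption here.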

◆ getSlotIndex() [1/3]

unsigned Vocabulary::getSlotIndex ( const Value & Op)
static

Definition at line 285 of file IR2Vec.cpp.

References assert(), getOperandKind(), MaxCanonicalTypeIDs, and MaxOperandKinds.

◆ getSlotIndex() [2/3]

unsigned Vocabulary::getSlotIndex ( Type::TypeID  TypeID)
static

Definition at line 280 of file IR2Vec.cpp.

References assert(), and MaxTypeIDs.

◆ getSlotIndex() [3/3]

unsigned Vocabulary::getSlotIndex ( unsigned  Opcode)
static

Functions to return the slot index or position of a given Opcode, TypeID, or OperandKind in the vocabulary.

Definition at line 275 of file IR2Vec.cpp.

References assert().

Referenced by operator[]().

◆ getStringKey()

StringRef Vocabulary::getStringKey ( unsigned  Pos)
static

Returns the string key for a given index position in the vocabulary.

This is useful for debugging or printing the vocabulary. Do not use it for embedding generation, because string-based lookups are inefficient.

Definition at line 348 of file IR2Vec.cpp.

References assert(), getVocabKeyForOpcode(), getVocabKeyForOperandKind(), and MaxCanonicalTypeIDs.
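Mapping a flat position back to a printable key amounts to finding which section the position falls in. The sketch below is standalone and illustrative only: the section sizes, the section order (opcodes, canonical types, operand kinds), and the key format are all hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical section sizes; assumed layout is
// [opcodes | canonical types | operand kinds].
constexpr unsigned OpcodeSlots = 4;
constexpr unsigned TypeSlots = 3;
constexpr unsigned OperandSlots = 2;

// Map a flat position back to a printable key by locating its section.
inline std::string describeSlot(unsigned Pos) {
  assert(Pos < OpcodeSlots + TypeSlots + OperandSlots &&
         "position out of range");
  if (Pos < OpcodeSlots)
    return "opcode#" + std::to_string(Pos);
  Pos -= OpcodeSlots; // now relative to the start of the type section
  if (Pos < TypeSlots)
    return "type#" + std::to_string(Pos);
  Pos -= TypeSlots; // now relative to the start of the operand section
  return "operand#" + std::to_string(Pos);
}
```

A reverse mapping like this is convenient when dumping a vocabulary for inspection, but the forward direction (entity to slot index) is the one intended for embedding generation.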

◆ getVocabKeyForOpcode()

StringRef Vocabulary::getVocabKeyForOpcode ( unsigned  Opcode)
static

Function to get vocabulary key for a given Opcode.

Definition at line 303 of file IR2Vec.cpp.

References assert().

Referenced by getStringKey().

◆ getVocabKeyForOperandKind()

StringRef Vocabulary::getVocabKeyForOperandKind ( Vocabulary::OperandKind  Kind)
static

Function to get vocabulary key for a given OperandKind.

Definition at line 331 of file IR2Vec.cpp.

References assert(), and MaxOperandKinds.

Referenced by getStringKey().

◆ getVocabKeyForTypeID()

StringRef Vocabulary::getVocabKeyForTypeID ( Type::TypeID  TypeID)
static

Function to get vocabulary key for a given TypeID.

Definition at line 327 of file IR2Vec.cpp.

◆ invalidate()

bool Vocabulary::invalidate ( Module & M,
const PreservedAnalyses & PA,
ModuleAnalysisManager::Invalidator & Inv 
) const

Definition at line 363 of file IR2Vec.cpp.

References llvm::PreservedAnalyses::getChecker().

◆ isValid()

bool Vocabulary::isValid ( ) const

◆ operator[]() [1/3]

const ir2vec::Embedding & Vocabulary::operator[] ( const Value & Arg) const

Definition at line 299 of file IR2Vec.cpp.

References getSlotIndex().

◆ operator[]() [2/3]

const Embedding & Vocabulary::operator[] ( Type::TypeID  TypeId) const

Definition at line 295 of file IR2Vec.cpp.

References getSlotIndex().

◆ operator[]() [3/3]

const Embedding & Vocabulary::operator[] ( unsigned  Opcode) const

Accessors to get the embedding for a given entity.

Definition at line 291 of file IR2Vec.cpp.

References getSlotIndex().

Friends And Related Function Documentation

◆ llvm::IR2VecVocabAnalysis

friend class llvm::IR2VecVocabAnalysis
friend

Definition at line 158 of file IR2Vec.h.

Member Data Documentation

◆ MaxCanonicalTypeIDs

constexpr unsigned llvm::ir2vec::Vocabulary::MaxCanonicalTypeIDs
static constexpr
Initial value:

Definition at line 201 of file IR2Vec.h.

Referenced by createDummyVocabForTest(), getSlotIndex(), and getStringKey().

◆ MaxOperandKinds

constexpr unsigned llvm::ir2vec::Vocabulary::MaxOperandKinds
static constexpr
Initial value:

Definition at line 203 of file IR2Vec.h.

Referenced by createDummyVocabForTest(), getSlotIndex(), and getVocabKeyForOperandKind().

◆ MaxTypeIDs

constexpr unsigned llvm::ir2vec::Vocabulary::MaxTypeIDs = Type::TypeID::TargetExtTyID + 1
static constexpr

Definition at line 200 of file IR2Vec.h.

Referenced by getSlotIndex().


The documentation for this class was generated from the following files:
IR2Vec.h
IR2Vec.cpp