LLVM 14.0.0git
lib/Target/README.txt File Reference
#include <assert.h>
#include <stdio.h>
#include <cstdio>
Include dependency graph for README.txt:

Macros

#define PMD_MASK   (~((1UL << 23) - 1))
 

Functions

The legalization code for mul with overflow needs to be made more robust before this can be implemented, though. Get the C front end to expand hypot(x, y) -> llvm.sqrt(x*x + y*y) when errno and precision don't matter (-ffast-math). Misc/mandel will like this. :) This isn't safe in general, even on darwin. See the libm implementation of hypot for examples (which special-case when x/y are exactly zero to get signed zeros etc. right). On targets with expensive 64-bit multiply, we could LSR this: for (i=...
 
 for (i=...;++i, tmp+=tmp) x
 
This would be a win on ppc32, but not x86 or ppc64. setlt (loadi8 Phi)
 
int foo (int z, int n)
 
This is blocked on not handling X*X*X -> powi(X, 3) (see note above). The issue is that we end up getting t
 
void f ()
 
int h (int *j, int *l)
 
float sincosf (float x, float *sin, float *cos)
 
long double sincosl (long double x, long double *sin, long double *cos)
 
 if (target< 32)
 
but this requires TBAA. This isn't recognized as bswap by instcombine (yes, it really is bswap)
 
We don't delete this output-free loop because trip count analysis doesn't realize that it is finite (if it were infinite, it would be undefined). Not having this blocks Loop Idiom from matching strlen and friends. void foo(char *C)
 
These idioms should be recognized as popcount (see PR1488)
 
unsigned int popcount (unsigned int input)
 
 for (i = 0; i < 32; i++) if (a & (1 << (31 - i))) return i
 
This sort of thing should be added to the loop idiom pass. These should turn into single 16-bit (unaligned?) loads on little/big endian processors. unsigned short read_16_le(const unsigned char *adr)
 
unsigned short read_16_be (const unsigned char *adr)
 
int test (U32 *inst, U64 *regs)
 
return pow2m1 (n - 1) + 1
 
< i32 > ret i32 tmp foo define i32 bar (i32 *%x)
 
THotKey GetHotKey ()
 
 into (-m64 -O3 -fno-exceptions -static -fomit-frame-pointer)
 
 if (x==3) y=0
 
The loop unroller should partially unroll loops (instead of peeling them) when code growth isn't too bad and when an unroll count allows simplification of some code within the loop. One trivial example is
 
unsigned long long f6 (unsigned long long x, unsigned long long y, int z)
 
 This (and similar related idioms)
 
return j (j<< 16)
 
aka conv or ret i32 or6, or even i ... depending on the speed of the multiplier. The best way to handle this is to canonicalize it to a multiply in IR and have codegen handle lowering multiplies to shifts on CPUs where shifts are faster. We do a number of simplifications in simplify-libcalls to strength-reduce standard library functions, but we don't currently merge them together. For example, it is useful to merge memcpy(a, b, strlen(b)) -> strcpy. This can only be done safely if "b" isn't modified between the strlen and memcpy, of course. We compile this program (from GCC PR11680) http: into code that runs the same speed in fast/slow modes, but both modes run 2x slower than when compiled with GCC (either 4.0 or 4.2): $ llvm-g++ perf.cpp -O3 -fno-exceptions $ time ./a.out fast 1.821u 0.003s 0:01.82 100.0% 0+0k 0+0io 0pf+0w $ g++ perf.cpp -O3 -fno-exceptions $ time ./a.out fast 0.821u 0.001s 0:00.82 100.0% 0+0k 0+0io 0pf+0w It looks like we are making the same inlining decisions, so this may be raw codegen badness or something else (haven't investigated). Divisibility by constant can be simplified (according to GCC PR12849) from being a mulhi to being a mullo (cheaper). Testcase: void bar(unsigned n)
 
This is equivalent to the following, where 2863311531 is the multiplicative inverse of 3 and 1431655766 is ((2^32) - 1)/3 + 1
 
This should optimize to ... or at least something sane. Currently not optimized with "clang -emit-llvm-bc | opt -O3". int a (int a, int b, int c)

Should fold to a && b ... c. Currently not optimized with "clang -emit-llvm-bc | opt -O3". int a (int x)

Should combine to x & ... Currently not optimized with "clang -emit-llvm-bc | opt -O3". unsigned a (unsigned a)

Should combine to a * ... Currently not optimized with "clang -emit-llvm-bc | opt -O3". unsigned a (char *x)

There's an unnecessary zext in the generated code with "clang -emit-llvm-bc | opt -O3". unsigned a (unsigned long long x)

Should combine to ... unsigned x & ... Currently not optimized with "clang -emit-llvm-bc | opt -O3". int g (int x)
 
Should combine to "x <= 9" (the sub has nsw). Currently not optimized with "clang -emit-llvm-bc | opt -O3". (The remainder of these notes appears in the detailed documentation for letters() below.)
 
This should turn into a switch on the character. See PR3253 for some notes on codegen. hmmer apparently uses strcspn and strspn a lot; omnetpp uses strspn. simplifylibcalls should turn these snprintf idioms into memcpy (GCC PR47917): char buf1[6]
 

Variables

instcombine should handle this transform: ... C2, when X, C1, and C2 are unsigned. Similarly for udiv and signed operands. Currently InstCombine avoids this transform but will do it when the signs of the operands and the sign of the divide match. See the FIXME in InstructionCombining.cpp in the visitSetCondInst method after the switch case for Instruction::UDiv (around line 4447) for more details. The SingleSource/Benchmarks/Shootout-C++/hash and hash2 tests have examples of this construct. [LOOP OPTIMIZATION] SingleSource/Benchmarks/Misc/dt.c shows several interesting optimization opportunities in its double_array_divs_variable function. typedef unsigned long long U64
 
Target Independent Opportunities
 
Target Independent unsigned int b
 
 i
 
into __pad0__
 
This would be a win on ppc32
 
This would be a win on but not x86 or ppc64 Shrink
 
This would be a win on but not x86 or ppc64 Reassociate should turn things like
 
into llvm powi calls
 
into llvm powi allowing the code generator to produce balanced multiplication trees First
 
into llvm powi allowing the code generator to produce balanced multiplication trees the intrinsic needs to be extended to support integers
 
into llvm powi allowing the code generator to produce balanced multiplication trees the intrinsic needs to be extended to support and second the code generator needs to be enhanced to lower these to multiplication trees Interesting testcase for add shift mul reassoc
 
into llvm powi allowing the code generator to produce balanced multiplication trees the intrinsic needs to be extended to support and second the code generator needs to be enhanced to lower these to multiplication trees Interesting testcase for add shift mul int y
 
This is blocked on not handling X *X *X which is the same number of multiplies and is canonical
 
This is blocked on not handling X *X *X which is the same number of multiplies and is because the *X has multiple uses Here s a simple example
 
This is blocked on not handling X *X *X which is the same number of multiplies and is because the *X has multiple uses Here s a simple X1 * C = mul i32 %B
 
This is blocked on not handling X *X *X which is the same number of multiplies and is because the *X has multiple uses Here s a simple X1 B ret i32 C Reassociate should handle the example in GCC PR16157
 
This is blocked on not handling X *X *X which is the same number of multiplies and is because the *X has multiple uses Here s a simple X1 B ret i32 C Reassociate should handle the example in GCC a1
 
This is blocked on not handling X *X *X which is the same number of multiplies and is because the *X has multiple uses Here s a simple X1 B ret i32 C Reassociate should handle the example in GCC a2
 
This is blocked on not handling X *X *X which is the same number of multiplies and is because the *X has multiple uses Here s a simple X1 B ret i32 C Reassociate should handle the example in GCC a3
 
This is blocked on not handling X *X *X which is the same number of multiplies and is because the *X has multiple uses Here s a simple X1 B ret i32 C Reassociate should handle the example in GCC a4
 
int b0
 
int b1
 
int b2
 
int b3
 
int b4
 
This requires reassociating to forms of expressions that are already available
 
This requires reassociating to forms of expressions that are already something that reassoc doesn t think about yet These two functions should generate the same code on big endian systems
 
This requires reassociating to forms of expressions that are already something that reassoc doesn t think about yet These two functions should generate the same code on big endian intl { return memcmp(j,l,4)
 
this could be done in SelectionDAGISel cpp
 
this could be done in SelectionDAGISel along with other special cases
 
this could be done in SelectionDAGISel along with other special for
 
this could be done in SelectionDAGISel along with other special bytes It would be nice to revert this patch
 
Add support for conditional increments
 
Add support for conditional and other related patterns Instead of
 
Add support for conditional and other related patterns Instead eax cmpl
 
Add support for conditional and other related patterns Instead eax eax je LBB16_2 LBB16_1
 
Add support for conditional and other related patterns Instead eax eax je LBB16_2 eax edi sbbl
 
Add support for conditional and other related patterns Instead eax eax je LBB16_2 eax edi eax movl eax
 
Add support for conditional and other related patterns Instead eax eax je LBB16_2 eax edi eax movl _foo Combine
 
Doing so could allow SROA of the destination pointers See also
 
i< reg-> size
 
 else
 
We don't delete this output-free loop
 
This should be recognized as CLZ
 
 return
 
instcombine should handle this transform
 
instcombine should handle this C2 when X
 
instcombine should handle this C2 when C1
 
Note that only the low bits of effective_addr2 are used On bit we don t eliminate the computation of the top half of effective_addr2 because we don t have whole function selection dags On x86
 
Note that only the low bits of effective_addr2 are used On bit we don t eliminate the computation of the top half of effective_addr2 because we don t have whole function selection dags On this means we use one extra register for the function when effective_addr2 is declared as U64 than when it is declared U32 PHI Slicing could be extended to do this Tail call elim should be more aggressive
 
Note that only the low bits of effective_addr2 are used On bit we don t eliminate the computation of the top half of effective_addr2 because we don t have whole function selection dags On this means we use one extra register for the function when effective_addr2 is declared as U64 than when it is declared U32 PHI Slicing could be extended to do this Tail call elim should be more checking to see if the call is followed by an uncond branch to an exit block
 
instruction into the terminating blocks because there was other code
 
optimized out of the function after the taildup happened
 
RUN __pad1__
 
RUN< i32tmp = icmp ne i32 %tmp.1
 
RUN< i32 >< i1 > br i1 label then
 
 preds
 
then result = phi i32 [ 0, %else.0 ]
 
then ret i32 result Tail recursion elimination should handle
 
 Also
 
multiplies can be turned into SHL s
 
multiplies can be turned into SHL so they should be handled as if they were associative return like this
 
RUN __pad2__
 
< i32 > tmp foo = call i32 @foo( i32* %x )
 
We should investigate an instruction sinking pass Consider this silly example in pic mode
 
we compile this to
 
we compile this esp call L1 $pb L1 $pb
 
we compile this esp call L1 $pb L1 esp je LBB1_2 LBB1_1
 
we compile this esp call L1 $pb L1 esp je LBB1_2 esp ret LBB1_2
 
we compile this esp call L1 $pb L1 esp je LBB1_2 esp ret but is currently always computed in the entry block It would be better to sink the picbase computation down into the block for the assertion
 
we compile this esp call L1 $pb L1 esp je LBB1_2 esp ret but is currently always computed in the entry block It would be better to sink the picbase computation down into the block for the as it is the only one that uses it This happens for a lot of code with early outs Another example is loads of arguments
 
we compile this esp call L1 $pb L1 esp je LBB1_2 esp ret but is currently always computed in the entry block It would be better to sink the picbase computation down into the block for the as it is the only one that uses it This happens for a lot of code with early outs Another example is loads of which are usually emitted into the entry block on targets like x86 If not used in all paths through a function
 
we compile this esp call L1 $pb L1 esp je LBB1_2 esp ret but is currently always computed in the entry block It would be better to sink the picbase computation down into the block for the as it is the only one that uses it This happens for a lot of code with early outs Another example is loads of which are usually emitted into the entry block on targets like x86 If not used in all paths through a they should be sunk into the ones that do In this case
 
we compile this esp call L1 $pb L1 esp je LBB1_2 esp ret but is currently always computed in the entry block It would be better to sink the picbase computation down into the block for the as it is the only one that uses it This happens for a lot of code with early outs Another example is loads of which are usually emitted into the entry block on targets like x86 If not used in all paths through a they should be sunk into the ones that do In this whole function isel would also handle this Investigate lowering of sparse switch statements into perfect hash tables
 
bool Control
 
bool Shift
 
bool Alt
 
THotKey m_HotKey
 
Clang compiles this into
 
Clang compiles this i8
 
Clang compiles this i64
 
Clang compiles this i32
 
Clang compiles this i1 false = getelementptr [8 x i64]* %input
 
Clang compiles this i1 i64 store i64 align = getelementptr [8 x i64]* %input
 
Clang compiles this i1 i64 store i64 i64 store i64 i64 store i64 i64 store i64 align Which gets codegen d xmm0 movaps xmm0
 
Clang compiles this i1 i64 store i64 i64 store i64 i64 store i64 i64 store i64 align Which gets codegen d xmm0 movaps rbp movaps rbp movaps rbp movaps rbp movq
 
Clang compiles this i1 i64 store i64 i64 store i64 i64 store i64 i64 store i64 align Which gets codegen d xmm0 movaps rbp movaps rbp movaps rbp movaps rbp rbp rbp rbp rbp It would be better to have movq s of instead of the movaps s http
 
Clang compiles this i1 i64 store i64 i64 store i64 i64 store i64 i64 store i64 align Which gets codegen d xmm0 movaps rbp movaps rbp movaps rbp movaps rbp rbp rbp rbp rbp It would be better to have movq s of instead of the movaps s LLVM produces ret int
 
Unrolling by would eliminate the &in both copies
 
Unrolling by would eliminate the &in both leading to a net reduction in code size The resultant code would then also be suitable for exit value computation We miss a bunch of rotate opportunities on various targets
 
Unrolling by would eliminate the &in both leading to a net reduction in code size The resultant code would then also be suitable for exit value computation We miss a bunch of rotate opportunities on various including ppc
 
Unrolling by would eliminate the &in both leading to a net reduction in code size The resultant code would then also be suitable for exit value computation We miss a bunch of rotate opportunities on various including etc On X86
 
Unrolling by would eliminate the &in both leading to a net reduction in code size The resultant code would then also be suitable for exit value computation We miss a bunch of rotate opportunities on various including etc On we miss a bunch of rotate by variable cases because the rotate matching code in dag combine doesn t look through truncates aggressively enough Here are some testcases reduces from GCC PR17886
 
compiles shl5 = shl i32 %conv
 
compiles shl9 = shl i32 %conv
 
compiles or = or i32 %shl9
 
compiles conv or6 = or i32 %or
 
compiles conv shl5 or10 = or i32 %or6
 
compiles conv shl5 shl ret i32 or10 it would be better as
 
aka __pad3__
 
aka conv or ret i32 or6 or even i depending on the speed of the multiplier The best way to handle this is to canonicalize it to a multiply in IR and have codegen handle lowering multiplies to shifts on cpus where shifts are faster We do a number of simplifications in simplify libcalls to strength reduce standard library functions
 
This is equivalent to the following
 
The same transformation can work with an even modulo with the addition of a rotate
 
The same transformation can work with an even modulo with the addition of a and shrink the compare RHS by the same amount Unless the target supports rotates
 
The same transformation can work with an even modulo with the addition of a and shrink the compare RHS by the same amount Unless the target supports though
 
The same transformation can work with an even modulo with the addition of a and shrink the compare RHS by the same amount Unless the target supports that transformation probably isn t worthwhile The transformation can also easily be made to work with non zero equality comparisons
 
The same transformation can work with an even modulo with the addition of a and shrink the compare RHS by the same amount Unless the target supports that transformation probably isn t worthwhile The transformation can also easily be made to work with non zero equality for n
 
The same transformation can work with an even modulo with the addition of a and shrink the compare RHS by the same amount Unless the target supports that transformation probably isn t worthwhile The transformation can also easily be made to work with non zero equality for the first function produces better code on X86 From GCC Bug
 
This should optimize to x
 
This should turn into a switch on the character See PR3253 for some notes on codegen hmmer apparently uses strcspn and strspn a lot omnetpp uses strspn simplifylibcalls should turn these snprintf idioms into buf2 [6]
 
This should turn into a switch on the character See PR3253 for some notes on codegen hmmer apparently uses strcspn and strspn a lot omnetpp uses strspn simplifylibcalls should turn these snprintf idioms into buf3 [4]
 
This should turn into a switch on the character See PR3253 for some notes on codegen hmmer apparently uses strcspn and strspn a lot omnetpp uses strspn simplifylibcalls should turn these snprintf idioms into buf4 [4]
 

Macro Definition Documentation

◆ PMD_MASK

#define PMD_MASK   (~((1UL << 23) - 1))

Function Documentation

◆ a() [1/5]

Should combine to a * ... Currently not optimized with "clang -emit-llvm-bc | opt -O3". unsigned a ( char *  x)

Definition at line 883 of file README.txt.

◆ a() [2/5]

This should optimize to ... or at least something sane. Currently not optimized with "clang -emit-llvm-bc | opt -O3". int a ( int  a,
int  b,
int  c 
)

Definition at line 853 of file README.txt.

◆ a() [3/5]

Should fold to a && b ... c. Currently not optimized with "clang -emit-llvm-bc | opt -O3". int a ( int  x)

Definition at line 859 of file README.txt.

◆ a() [4/5]

Should combine to x & ... Currently not optimized with "clang -emit-llvm-bc | opt -O3". unsigned a ( unsigned  a)

Definition at line 877 of file README.txt.

◆ a() [5/5]

There's an unnecessary zext in the generated code with "clang -emit-llvm-bc | opt -O3". unsigned a ( unsigned long long  x)

Definition at line 889 of file README.txt.

References x.

◆ bar()

<i32> ret i32 tmp foo define i32 bar ( i32 *%  x)

Definition at line 387 of file README.txt.

References call(), entry, foo, i32, ret(), uses, and x.

◆ bit()

This sort of thing should be added to the loop idiom pass These should turn into single bit ( unaligned?  ) const

Definition at line 249 of file README.txt.

◆ f()

void f ( )

Definition at line 85 of file README.txt.

References a1, a2, a3, a4, b1, b2, b3, and b4.

◆ f6()

unsigned long long f6 ( unsigned long long  x,
unsigned long long  y,
int  z 
)

Definition at line 575 of file README.txt.

◆ finite()

We don t delete this output free because trip count analysis doesn t realize that it is finite ( if it were  infinite,
it would be  undefined 
)

Definition at line 206 of file README.txt.
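
A guess at the shape of the loop in void foo(char *C) (its body is not reproduced on this page): it produces no output, and its trip count is finite in any well-defined execution (an infinite run would be undefined behavior), so it should be deletable; the same pattern is also what Loop Idiom would like to match as strlen.

void foo(char *C) {
  /* no output; finite for any well-defined execution */
  while (*C)
    ++C;
}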

◆ foo()

int foo ( int  z,
int  n 
)

Definition at line 64 of file README.txt.

References bar, n, and z.

◆ for() [1/2]

for ( i  = ...;++i,
tmp = tmp 
)

◆ for() [2/2]

for ( )

◆ g()

Should combine to ... unsigned x & ... Currently not optimized with "clang -emit-llvm-bc | opt -O3". int g ( int  x)

Definition at line 895 of file README.txt.

References x.

◆ GetHotKey()

THotKey GetHotKey ( )

Definition at line 470 of file README.txt.

References m_HotKey.
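
A guess at the testcase's shape, built from the Control/Shift/Alt booleans and m_HotKey listed under Variables below; the Key field and its type are assumptions, not taken from the README:

#include <stdbool.h>

/* Hypothetical layout: a small POD hotkey descriptor returned by value. */
struct THotKey { short Key; bool Control; bool Shift; bool Alt; };

static struct THotKey m_HotKey;

/* The surrounding note is about the quality of the code generated for
   this kind of small-struct return at -m64 -O3 (see into() below). */
struct THotKey GetHotKey(void) {
  return m_HotKey;
}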

◆ h()

int h ( int *  j,
int *  l 
)

Definition at line 101 of file README.txt.

◆ hypot()

The legalization code for mul with overflow needs to be made more robust before this can be implemented though Get the C front end to expand hypot ( x  ,
y   
) -> llvm.sqrt(x*x + y*y) when errno and precision don't matter (-ffast-math). Misc/mandel will like this. :) This isn't safe in general, even on darwin. See the libm implementation of hypot for examples (which special-case when x/y are exactly zero to get signed zeros etc. right). On targets with expensive 64-bit multiply, we could LSR this: for (i=...
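
A minimal sketch of the expansion described above, assuming -ffast-math semantics; hypot_fast is a hypothetical name, not front-end output:

#include <math.h>

/* With errno and precision out of the picture, hypot(x, y) can be
   expanded to sqrt(x*x + y*y).  A real hypot must special-case exact
   zeros, infinities and overflow, which is why this is unsafe in
   general. */
static double hypot_fast(double x, double y) {
  return sqrt(x * x + y * y);
}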

◆ if() [1/2]

if ( )

Definition at line 176 of file README.txt.

◆ if() [2/2]

if (x == 3)

◆ instcombine()

but this requires TBAA. This isn't recognized as bswap by instcombine ( yes  ,
it really is  bswap 
)

Definition at line 191 of file README.txt.

◆ into()

into ( -m64 -O3 -fno-exceptions -static -fomit-frame-pointer )

Definition at line 472 of file README.txt.

References foo, and input.

◆ is()

This is equivalent to the following, where 2863311531 is the multiplicative inverse of 3 and 1431655766 is ( (2^32) -  1)/3 + 1

Definition at line 672 of file README.txt.

References n.

◆ j()

return j ( j<<  16)


◆ letters()

Should combine to "x <= 9" (the sub has nsw). Currently not optimized with "clang -emit-llvm-bc | opt -O3".

  int g(int x) { return (x + 10) < 0; }

Should combine to "x < -10" (the add has nsw). Currently not optimized with "clang -emit-llvm-bc | opt -O3".

  int f(int i, int j) { return i < j + 1; }
  int g(int i, int j) { return j > i - 1; }

Should combine to "i <= j" (the add/sub has nsw). Currently not optimized with "clang -emit-llvm-bc | opt -O3".

  unsigned f(unsigned x) { return ((x & 7) + 1) & 15; }

The & 15 part should be optimized away, it doesn't change the result. Currently not optimized with "clang -emit-llvm-bc | opt -O3".

This was noticed in the entry block for grokdeclarator in 403.gcc:

  %tmp = icmp eq i32 %decl_context, 4
  %decl_context_addr.0 = select i1 %tmp, i32 3, i32 %decl_context
  %tmp1 = icmp eq i32 %decl_context_addr.0, 1
  %decl_context_addr.1 = select i1 %tmp1, i32 0, i32 %decl_context_addr.0

tmp1 should be simplified to something like: (!tmp || decl_context == 1)

This allows recursive simplifications; tmp1 is used all over the place in the function, e.g. by:

  %tmp23 = icmp eq i32 %decl_context_addr.1, 0   ; <i1> [#uses=1]
  %tmp24 = xor i1 %tmp1, true                    ; <i1> [#uses=1]
  %or.cond8 = and i1 %tmp23, %tmp24              ; <i1> [#uses=1]

later.

[STORE SINKING]

Store sinking: This code:

  void f (int n, int *cond, int *res) {
    int i;
    *res = 0;
    for (i = 0; i < n; i++)
      if (*cond)
        *res ^= 234;
  }

On this function GVN hoists the fully redundant value of *res, but nothing moves the store out. This gives us this code:

  bb:                ; preds = %bb2, %entry
    %.rle = phi i32 [ 0, %entry ], [ %.rle6, %bb2 ]
    %i.05 = phi i32 [ 0, %entry ], [ %indvar.next, %bb2 ]
    %1 = load i32* %cond, align 4
    %2 = icmp eq i32 %1, 0
    br i1 %2, label %bb2, label %bb1
  bb1:               ; preds = %bb
    %3 = xor i32 %.rle, 234
    store i32 %3, i32* %res, align 4
    br label %bb2
  bb2:               ; preds = %bb, %bb1
    %.rle6 = phi i32 [ %3, %bb1 ], [ %.rle, %bb ]
    %indvar.next = add i32 %i.05, 1
    %exitcond = icmp eq i32 %indvar.next, %n
    br i1 %exitcond, label %return, label %bb

DSE should sink partially dead stores to get the store out of the loop.

Here's another partial dead case: http:

Scalar PRE hoists the mul in the common block up to the else:

  int test (int a, int b, int c, int g) {
    int d, e;
    if (a)
      d = b * c;
    else
      d = b - c;
    e = b * c + g;
    return d + e;
  }

It would be better to do the mul once to reduce codesize above the if. This is GCC PR38204.

This simple function from 179.art:

  int winner, numf2s;
  struct { double y; int reset; } *Y;
  void find_match() {
    int i;
    winner = 0;
    for (i=0; i<numf2s; i++)
      if (Y[i].y > Y[winner].y)
        winner = i;
  }

Compiles into (with clang TBAA):

  for.body:          ; preds = %for.inc, %bb.nph
    %indvar = phi i64 [ 0, %bb.nph ], [ %indvar.next, %for.inc ]
    %i.01718 = phi i32 [ 0, %bb.nph ], [ %i.01719, %for.inc ]
    %tmp4 = getelementptr inbounds %struct.anon* %tmp3, i64 %indvar, i32 0
    %tmp5 = load double* %tmp4, align 8, !tbaa !4
    %idxprom7 = sext i32 %i.01718 to i64
    %tmp10 = getelementptr inbounds %struct.anon* %tmp3, i64 %idxprom7, i32 0
    %tmp11 = load double* %tmp10, align 8, !tbaa !4
    %cmp12 = fcmp ogt double %tmp5, %tmp11
    br i1 %cmp12, label %if.then, label %for.inc
  if.then:           ; preds = %for.body
    %i.017 = trunc i64 %indvar to i32
    br label %for.inc
  for.inc:           ; preds = %for.body, %if.then
    %i.01719 = phi i32 [ %i.01718, %for.body ], [ %i.017, %if.then ]
    %indvar.next = add i64 %indvar, 1
    %exitcond = icmp eq i64 %indvar.next, %tmp22
    br i1 %exitcond, label %for.cond.for.end_crit_edge, label %for.body

It is good that we hoisted the reloads of numf2s and Y out of the loop and sunk the store to winner out. However, this is awful on several levels: the conditional truncate in the loop (-indvars at fault? why can't we completely promote the IV to i64?).

Beyond that, we have a partially redundant load in the loop: if "winner" (aka %i.01718) isn't updated, we reload Y[winner].y the next time through the loop. Similarly, the addressing that feeds it (including the sext) is redundant. In the end we get this generated assembly:

  LBB0_2:            ## %for.body
                     ## =>This Inner Loop Header: Depth=1
    movsd   (%rdi), %xmm0
    movslq  %edx, %r8
    shlq    $4, %r8
    ucomisd (%rcx,%r8), %xmm0
    jbe     LBB0_4
    movl    %esi, %edx
  LBB0_4:            ## %for.inc
    addq    $16, %rdi
    incq    %rsi
    cmpq    %rsi, %rax
    jne     LBB0_2

All things considered this isn't too bad, but we shouldn't need the movslq or the shlq instruction, or the load folded into ucomisd every time through the loop.

On an x86-specific topic, if the loop can't be restructured, the movl should be a cmov.

[STORE SINKING]

GCC PR37810 is an interesting case where we should sink load/store reload into the if block and outside the loop, so we don't reload/store it on the non-call path.

  for () { *P += 1; if () call(); else ... }
    ->
  tmp = *P
  for () { tmp += 1; if () { *P = tmp; call(); tmp = *P; } else ... }
  *P = tmp;

We now hoist the reload after the call (Transforms/GVN/lpre-call-wrap.ll), but we don't sink the store. We need partially dead store sinking.

[LOAD PRE CRIT EDGE SPLITTING]

GCC PR37166: Sinking of loads prevents SROA'ing the "g" struct on the stack, leading to excess stack traffic. This could be handled by GVN with some crazy symbolic phi translation. The code we get looks like (g is on the stack):

  bb2:               ; preds = %bb1
    ..
    %9 = getelementptr %struct.f* %g, i32 0, i32 0
    store i32 %8, i32* %9, align 4
    br label %bb3
  bb3:               ; preds = %bb1, %bb2, %bb
    %c_addr.0 = phi %struct.f* [ %g, %bb2 ], [ %c, %bb ], [ %c, %bb1 ]
    %b_addr.0 = phi %struct.f* [ %b, %bb2 ], [ %g, %bb ], [ %b, %bb1 ]
    %10 = getelementptr %struct.f* %c_addr.0, i32 0, i32 0
    %11 = load i32* %10, align 4

%11 is partially redundant; in BB2 it should have the value %8.

GCC PR33344 and PR35287 are similar cases.

[LOAD PRE]

There are many load PRE testcases in testsuite/gcc.dg/tree-ssa/loadpre* in the GCC testsuite; ones we don't get yet are (checked through loadpre25):

[CRIT EDGE BREAKING]
predcom-4.c

[PRE OF READONLY CALL]
loadpre5.c

[TURN SELECT INTO BRANCH]
loadpre14.c loadpre15.c, actually a conditional increment: loadpre18.c loadpre19.c

[LOAD PRE / STORE SINKING / SPEC HACK]

This is a chunk of code from 456.hmmer:

  int f(int M, int *mc, int *mpp, int *tpmm, int *ip, int *tpim,
        int *dpp, int *tpdm, int xmb, int *bp, int *ms) {
    int k, sc;
    for (k = 1; k <= M; k++) {
      mc[k] = mpp[k-1] + tpmm[k-1];
      if ((sc = ip[k-1] + tpim[k-1]) > mc[k]) mc[k] = sc;
      if ((sc = dpp[k-1] + tpdm[k-1]) > mc[k]) mc[k] = sc;
      if ((sc = xmb + bp[k]) > mc[k]) mc[k] = sc;
      mc[k] += ms[k];
    }
  }

It is very profitable for this benchmark to turn the conditional stores to mc[k] into a conditional move (select instr in IR) and allow the final store to do the store. See GCC PR27313 for more details. Note that this is valid to xform even with the new C++ memory model, since mc[k] is previously loaded and later stored.

[SCALAR PRE]

There are many PRE testcases in testsuite/gcc.dg/tree-ssa/ssa-pre-*.c in the GCC testsuite.

There are some interesting cases in testsuite/gcc.dg/tree-ssa/pred-comm* in the GCC testsuite. For example, we get the first example in predcom-1.c, but miss the second one:

  unsigned fib[1000];
  unsigned avg[1000];

  __attribute__ ((noinline))
  void count_averages(int n) {
    int i;
    for (i = 1; i < n; i++)
      avg[i] = (((unsigned long) fib[i - 1] + fib[i] + fib[i + 1]) / 3) & 0xffff;
  }

which compiles into two loads instead of one in the loop.

predcom-2.c is the same as predcom-1.c

predcom-3.c is very similar but needs loads feeding each other instead of store->load.

[ALIAS ANALYSIS]

Type based alias analysis: http:

We should do better analysis of posix_memalign. At the least it should no-capture its pointer argument; at best, we should know that the out-value result doesn't point to anything (like malloc). One example of this is in SingleSource/Benchmarks/Misc/dt.c

Interesting missed case because of control flow flattening (should be 2 loads): http:

With: llvm-gcc t2.c -S -o - -O0 -emit-llvm | llvm-as | opt -mem2reg -gvn -instcombine | llvm-dis

we miss it because we need 1) CRIT EDGE 2) MULTIPLE DIFFERENT VALS PRODUCED BY ONE BLOCK OVER DIFFERENT PATHS

http:

We could eliminate the branch condition here, loading from null is undefined:

  struct S { int w, x, y, z; };
  struct T { int r; struct S s; };
  void bar (struct S, int);
  void foo (int a, struct T b)
  {
    struct S *c = 0;
    if (a)
      c = &b.s;
    bar (*c, a);
  }

simplifylibcalls should do several optimizations for strspn/strcspn:

strcspn(x, "a") -> inlined loop for up to ... letters (similarly for strspn)

Definition at line 1233 of file README.txt.

◆ loops()

The loop unroller should partially unroll loops ( instead of peeling  them)

Definition at line 544 of file README.txt.

◆ memcpy() [1/2]

aka conv or ret i32 or6 or even i depending on the speed of the multiplier The best way to handle this is to canonicalize it to a multiply in IR and have codegen handle lowering multiplies to shifts on cpus where shifts are faster We do a number of simplifications in simplify libcalls to strength reduce standard library but we don t currently merge them together For it is useful to merge memcpy ( a  ,
b  ,
strlen(b  
) -> strcpy. This can only be done safely if "b" isn't modified between the strlen and memcpy of course. We compile this program: (from GCC PR11680) http: Into code that runs the same speed in fast/slow modes, but both modes run 2x slower than when compile with GCC (either 4.0 or 4.2): $ llvm-g++ perf.cpp -O3 -fno-exceptions $ time ./a.out fast 1.821u 0.003s 0:01.82 100.0% 0+0k 0+0io 0pf+0w $ g++ perf.cpp -O3 -fno-exceptions $ time ./a.out fast 0.821u 0.001s 0:00.82 100.0% 0+0k 0+0io 0pf+0w It looks like we are making the same inlining decisions, so this may be raw codegen badness or something else (haven't investigated). Divisibility by constant can be simplified (according to GCC PR12849) from being a mulhi to being a mul lo (cheaper). Testcase: void bar(unsigned n)

Definition at line 639 of file README.txt.

References n.
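
A hedged sketch of the two strength-reduction notes in this entry; copy_string and divisible_by_3 are illustrative names, and the constants follow from the formula quoted in the text (2863311531 is the multiplicative inverse of 3 mod 2^32, 1431655766 is ((2^32)-1)/3 + 1):

#include <string.h>

/* simplify-libcalls merge: when "b" is not modified between the two
   calls, the strlen feeding memcpy can be folded away.  Copying
   strlen(b)+1 bytes includes the terminating NUL, so this is exactly
   strcpy(a, b). */
void copy_string(char *a, const char *b) {
  memcpy(a, b, strlen(b) + 1);          /* => strcpy(a, b) */
}

/* GCC PR12849-style transform: (n % 3) == 0 rewritten as a multiply by
   the inverse of 3 plus an unsigned compare, avoiding the mulhi-based
   division.  2863311531 * 3 == 1 (mod 2^32). */
int divisible_by_3(unsigned n) {
  return n * 2863311531u < 1431655766u; /* same as (n % 3) == 0 */
}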

◆ memcpy() [2/2]

This should turn into a switch on the character. See PR3253 for some notes on codegen. hmmer apparently uses strcspn and strspn a lot; omnetpp uses strspn. simplifylibcalls should turn these snprintf idioms into memcpy (GCC PR47917)
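
The PR47917 testcase itself is not shown on this page; an assumed shape of the snprintf idiom, reusing the buf1 declared in the note:

#include <stdio.h>

char buf1[6];

/* Formatting a known constant string with "%s" into a fixed-size
   buffer: simplify-libcalls should be able to lower this to a copy of
   the (possibly truncated) prefix plus the known return value, instead
   of a real snprintf call. */
int fill_buf1(void) {
  return snprintf(buf1, sizeof buf1, "%s", "hello world");
}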

◆ popcount() [1/2]

These idioms should be recognized as popcount ( see  PR1488)

Definition at line 219 of file README.txt.

◆ popcount() [2/2]

unsigned int popcount ( unsigned int  input)

Definition at line 228 of file README.txt.
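
The testcase body is not reproduced on this page; a guess at the kind of bit-at-a-time loop meant, which should be recognized and lowered to llvm.ctpop / a hardware popcount:

unsigned int popcount(unsigned int input) {
  unsigned int count = 0;
  for (unsigned int i = 0; i < 32; i++)
    count += (input >> i) & 1;
  return count;
}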

◆ pow2m1()

return* pow2m1 ( n 1)
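
The pow2m1 fragments here and under Functions above look like the accumulator-recursion testcase that the tail-recursion-elimination note refers to; a guess at its shape (pow2m1(n) computes 2^n - 1):

int pow2m1(int n) {
  if (n == 0)
    return 0;
  /* the multiply/add accumulation is what tail recursion elimination
     should turn into a loop */
  return 2 * pow2m1(n - 1) + 1;
}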

◆ powi()

This is blocked on not handling X* X* X powi ( X  ,
 
)

◆ read_16_be()

unsigned short read_16_be ( const unsigned char *  adr)

Definition at line 255 of file README.txt.
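
Assumed bodies for read_16_le/read_16_be (the README's exact code is not shown here): each should compile to a single, possibly unaligned, 16-bit load on a little-endian or big-endian target respectively.

unsigned short read_16_le(const unsigned char *adr) {
  return adr[0] | (adr[1] << 8);
}

unsigned short read_16_be(const unsigned char *adr) {
  return (adr[0] << 8) | adr[1];
}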

◆ setlt()

This would be a win on ppc32, but not x86 or ppc64. setlt ( loadi8  Phi)

◆ sincosf()

float sincosf ( float  x,
float *  sin,
float *  cos 
)

◆ sincosl()

long double sincosl ( long double  x,
long double *  sin,
long double *  cos 
)
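
A purely illustrative caller, not taken from the README: separate sinf/cosf calls on the same argument, which could be combined into a single call to the sincosf declared above.

#include <math.h>

void polar_to_xy(float r, float theta, float *x, float *y) {
  *x = r * cosf(theta);   /* same argument as below */
  *y = r * sinf(theta);   /* => candidates for one sincosf(theta, ...) */
}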

◆ test()

int test ( U32 *  inst,
U64 *  regs 
)

Definition at line 301 of file README.txt.

◆ This()

This ( and similar related  idioms)

Definition at line 592 of file README.txt.
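
Assuming the idiom is the j | (j << 16) pattern suggested by the j() entry above, a sketch of how it relates to the canonicalize-to-multiply note: the shift/or form equals j * 0x10001 for a 16-bit j, so canonicalizing to a multiply in IR lets codegen choose shifts or a multiply, whichever is faster on the target.

unsigned int replicate16(unsigned short j) {
  /* equivalent to (unsigned int)j * 0x10001u */
  return j | ((unsigned int)j << 16);
}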

Variable Documentation

◆ $pb

we compile this esp call L1 $pb L1 $pb

Definition at line 410 of file README.txt.

◆ __pad0__

into __pad0__

Definition at line 33 of file README.txt.

◆ __pad1__

RUN __pad1__

Definition at line 336 of file README.txt.

◆ __pad2__

RUN __pad2__

Definition at line 382 of file README.txt.

◆ __pad3__

aka __pad3__

Definition at line 624 of file README.txt.

◆ a1

This is blocked on not handling X* X* X which is the same number of multiplies and is because the* X has multiple uses Here s a simple X1 B ret i32 C Reassociate should handle the example in GCC a1

Definition at line 84 of file README.txt.

Referenced by f().

◆ a2

This is blocked on not handling X* X* X which is the same number of multiplies and is because the* X has multiple uses Here s a simple X1 B ret i32 C Reassociate should handle the example in GCC a2

Definition at line 84 of file README.txt.

Referenced by f().

◆ a3

This is blocked on not handling X* X* X which is the same number of multiplies and is because the* X has multiple uses Here s a simple X1 B ret i32 C Reassociate should handle the example in GCC a3

Definition at line 84 of file README.txt.

Referenced by f().

◆ a4

This is blocked on not handling X* X* X which is the same number of multiplies and is because the* X has multiple uses Here s a simple X1 B ret i32 C Reassociate should handle the example in GCC a4

Definition at line 84 of file README.txt.

Referenced by f().

◆ aggressive

Note that only the low bits of effective_addr2 are used On bit we don t eliminate the computation of the top half of effective_addr2 because we don t have whole function selection dags On this means we use one extra register for the function when effective_addr2 is declared as U64 than when it is declared U32 PHI Slicing could be extended to do this Tail call elim should be more aggressive

Definition at line 326 of file README.txt.

◆ align

Clang compiles this i1 i64 store i64 i64 store i64 i64 store i64 align = getelementptr [8 x i64]* %input

Definition at line 507 of file README.txt.

◆ also

Doing so could allow SROA of the destination pointers See also

Definition at line 166 of file README.txt.

◆ Also

Also

Definition at line 370 of file README.txt.

◆ Alt

bool Alt

Definition at line 468 of file README.txt.

◆ arguments

we compile this esp call L1 $pb L1 esp je LBB1_2 esp ret but is currently always computed in the entry block It would be better to sink the picbase computation down into the block for the as it is the only one that uses it This happens for a lot of code with early outs Another example is loads of arguments

Definition at line 425 of file README.txt.

◆ as

therefore end up llgh r3 lr r0 br r14 but truncating the load would lh r3 br r14 Functions ret i64 and ought to be implemented as

Definition at line 615 of file README.txt.

Referenced by llvm::ELFAttributeParser::printAttribute().

◆ assertion

we compile this esp call L1 $pb L1 esp je LBB1_2 esp ret but is currently always computed in the entry block It would be better to sink the picbase computation down into the block for the assertion

Definition at line 422 of file README.txt.

◆ available

This requires reassociating to forms of expressions that are already available

Definition at line 92 of file README.txt.

◆ b

Add support for conditional and other related patterns Instead eax eax je LBB16_2 eax edi eax movl _foo b
Initial value:
{
  if ((unsigned long long)a * b > 0xffffffff)
    exit(0);

Definition at line 8 of file README.txt.

◆ b0

int b0

◆ b1

int b1

◆ b2

int b2

Definition at line 84 of file README.txt.

Referenced by EmitUnwindCode(), f(), and llvm::findMaximalSubpartOfIllFormedUTF8Sequence().

◆ b3

int b3

Definition at line 84 of file README.txt.

Referenced by f(), and llvm::findMaximalSubpartOfIllFormedUTF8Sequence().

◆ b4

int b4

Definition at line 84 of file README.txt.

Referenced by f().

◆ block

Note that only the low bits of effective_addr2 are used On bit we don t eliminate the computation of the top half of effective_addr2 because we don t have whole function selection dags On this means we use one extra register for the function when effective_addr2 is declared as U64 than when it is declared U32 PHI Slicing could be extended to do this Tail call elim should be more checking to see if the call is followed by an uncond branch to an exit block

Definition at line 329 of file README.txt.

◆ buf2

This should turn into a switch on the character See PR3253 for some notes on codegen hmmer apparently uses strcspn and strspn a lot omnetpp uses strspn simplifylibcalls should turn these snprintf idioms into buf2[6]

Definition at line 1253 of file README.txt.

◆ buf3

This should turn into a switch on the character See PR3253 for some notes on codegen hmmer apparently uses strcspn and strspn a lot omnetpp uses strspn simplifylibcalls should turn these snprintf idioms into buf3[4]

Definition at line 1253 of file README.txt.

◆ buf4

This should turn into a switch on the character See PR3253 for some notes on codegen hmmer apparently uses strcspn and strspn a lot omnetpp uses strspn simplifylibcalls should turn these snprintf idioms into buf4[4]

Definition at line 1253 of file README.txt.

◆ Bug

The same transformation can work with an even modulo with the addition of a and shrink the compare RHS by the same amount Unless the target supports that transformation probably isn t worthwhile The transformation can also easily be made to work with non zero equality for the first function produces better code on X86 From GCC Bug

Definition at line 763 of file README.txt.

◆ C

This is blocked on not handling X* X* X which is the same number of multiplies and is because the* X has multiple uses Here s a simple X1* C = mul i32 %B

Definition at line 75 of file README.txt.

Referenced by zero().

◆ C1

instcombine should handle this C2 when C1

Definition at line 263 of file README.txt.


◆ calls

into llvm powi calls

Definition at line 51 of file README.txt.

◆ canonical

This is blocked on not handling X* X* X which is the same number of multiplies and is canonical

Definition at line 70 of file README.txt.

◆ case

we compile this esp call L1 $pb L1 esp je LBB1_2 esp ret but is currently always computed in the entry block It would be better to sink the picbase computation down into the block for the as it is the only one that uses it This happens for a lot of code with early outs Another example is loads of which are usually emitted into the entry block on targets like x86 If not used in all paths through a they should be sunk into the ones that do In this case

Definition at line 429 of file README.txt.

◆ cases

this could be done in SelectionDAGISel along with other special cases

Definition at line 103 of file README.txt.

◆ CLZ

This should be recognized as CLZ

Definition at line 238 of file README.txt.
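
A sketch matching the for-loop shown under Functions above (assumed to be the testcase this note refers to): it scans from the most significant bit down and returns the index of the first set bit, which is exactly a count-leading-zeros (llvm.ctlz) for nonzero inputs.

int clz_loop(unsigned a) {
  int i;
  for (i = 0; i < 32; i++)
    if (a & (1u << (31 - i)))
      return i;
  return 32; /* a == 0 */
}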

◆ cmpl

currently compiles eax eax je LBB0_3 testl eax jne LBB0_4 the testl could be eax cmpl

Definition at line 135 of file README.txt.

◆ code

instruction into the terminating blocks because there was other code

Definition at line 331 of file README.txt.

◆ Combine

Add support for conditional and other related patterns Instead eax eax je LBB16_2 eax edi eax movl _foo Combine

Definition at line 149 of file README.txt.

◆ comparisons

The same transformation can work with an even modulo with the addition of a and shrink the compare RHS by the same amount. Unless the target supports that, the transformation probably isn't worthwhile. The transformation can also easily be made to work with non-zero equality comparisons

Definition at line 685 of file README.txt.
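
A hedged sketch of the odd-divisor case this note builds on (the divisor 5 is an arbitrary choice, not taken from the README): a remainder-is-zero test against a constant becomes one multiply by the modular inverse plus an unsigned compare; the even-divisor extension is what the note above describes.

unsigned divisible_by_5(unsigned n) {
  /* 0xCCCCCCCD is the multiplicative inverse of 5 mod 2^32;
     0x33333333 is (2^32 - 1) / 5. Equivalent to (n % 5) == 0. */
  return n * 0xCCCCCCCDu <= 0x33333333u;
}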

◆ Control

bool Control

Definition at line 468 of file README.txt.

◆ copies

Unrolling by would eliminate the & in both copies

Definition at line 561 of file README.txt.
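
A hedged guess at the shape of loop meant here (the alternating i & 1 test is an assumption made only for illustration): after unrolling by 2, i & 1 is known to be 0 in one copy of the body and 1 in the other, so the & and the branch on it disappear from both copies.

void split_sums(const int *a, int n, int *even_sum, int *odd_sum) {
  for (int i = 0; i < n; i++) {
    if (i & 1)
      *odd_sum += a[i];
    else
      *even_sum += a[i];
  }
}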

◆ cpp

this could be done in SelectionDAGISel.cpp

Definition at line 103 of file README.txt.

◆ eax

currently compiles eax eax je LBB0_3 testl eax

Definition at line 145 of file README.txt.

◆ else

< i32 > br label return else
Initial value:
{
for(i=0; i<reg->size; i++)
reg->node[i].state ^= Res & 0xFFFFFFFF00000000ULL;
}
... which would only do one 32-bit XOR per loop iteration instead of two.
It would also be nice to recognize that reg->size doesn't alias reg->node[i].

Definition at line 179 of file README.txt.
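
A hedged sketch of what "one 32-bit XOR per iteration" would look like if written by hand (splitting the 64-bit state into explicit 32-bit halves is an assumption made only for illustration): because the mask's low 32 bits are zero, only the high word of each element needs to change.

struct node { unsigned int state_lo, state_hi; };
struct reg  { int size; struct node *node; };

void xor_high_words(struct reg *reg, unsigned long long Res) {
  unsigned int hi = (unsigned int)(Res >> 32);
  for (int i = 0; i < reg->size; i++)
    reg->node[i].state_hi ^= hi;   /* the low word is never touched */
}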

◆ example

The same transformation can work with an even modulo with the addition of a and shrink the compare RHS by the same amount. Unless the target supports that, the transformation probably isn't worthwhile. The transformation can also easily be made to work with non-zero equality, for example

Definition at line 74 of file README.txt.

◆ false

Clang compiles this i1 false = getelementptr [8 x i64]* %input

Definition at line 505 of file README.txt.

◆ First

into llvm.powi, allowing the code generator to produce balanced multiplication trees. First

Definition at line 54 of file README.txt.
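
A hedged illustration of the balanced expansion (the exponent 8 is an arbitrary choice): expanding llvm.powi as a balanced tree needs only about log2(n) multiplies, versus n - 1 for a left-to-right chain.

double pow8_balanced(double x) {
  double x2 = x * x;    /* x^2 */
  double x4 = x2 * x2;  /* x^4 */
  return x4 * x4;       /* x^8: 3 multiplies instead of 7 */
}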

Referenced by llvm::addLocationToRemarks(), llvm::HashBuilderImpl< HasherT, Endianness >::addRange(), llvm::HashBuilderImpl< HasherT, Endianness >::addRangeElements(), adjustCostForPairing(), areBothVectorOrScalar(), areSlicesNextToEachOther(), PODSmallVector< Node *, 8 >::back(), StringView::begin(), PODSmallVector< Node *, 8 >::begin(), BracedRangeExpr::BracedRangeExpr(), llvm::SwitchCG::SwitchLowering::buildBitTests(), llvm::SwitchCG::SwitchLowering::buildJumpTable(), llvm::HexagonInstrInfo::canExecuteInBundle(), CanMergeValues(), PODSmallVector< Node *, 8 >::clear(), llvm::DIExpression::constantFold(), copy_if_else(), llvm::TargetInstrInfo::createMIROperandComment(), llvm::detail::DoubleAPFloat::DoubleAPFloat(), StringView::dropBack(), PODSmallVector< Node *, 8 >::dropBack(), StringView::dropFront(), llvm::dwarf::RegisterLocations::dump(), llvm::dumpBytes(), StringView::empty(), PODSmallVector< Node *, 8 >::empty(), llvm::gsym::LineTable::encode(), llvm::AllocatorList< Token >::erase(), llvm::simple_ilist< MachineInstr, Options... >::erase(), llvm::MCInst::erase(), llvm::msgpack::MapDocNode::erase(), llvm::simple_ilist< MachineInstr, Options... >::eraseAndDispose(), llvm::AMDGPURegisterBankInfo::executeInWaterfallLoop(), expandBounds(), llvm::HexagonInstrInfo::expandVGatherPseudo(), StringView::find(), find_best(), llvm::SparseBitVector< ElementSize >::find_first(), llvm::SwitchCG::SwitchLowering::findBitTestClusters(), llvm::SwitchCG::SwitchLowering::findJumpTables(), llvm::detail::frexp(), llvm::RecordRecTy::getAsString(), llvm::getCallSiteLocation(), llvm::SwitchCG::getJumpTableNumCases(), llvm::SwitchCG::getJumpTableRange(), llvm::HexagonBlockRanges::InstrIndexMap::getPrevIndex(), llvm::object::ELFObjectFile< ELFT >::getSectionIndex(), INITIALIZE_PASS(), llvm::simple_ilist< MachineInstr, Options... 
>::insert(), llvm::AllocatorList< Token >::insert(), llvm::StringMap< std::unique_ptr< llvm::vfs::detail::InMemoryNode > >::insert(), isDefBetween(), llvm::SmallVectorTemplateCommon< T >::isRangeInStorage(), llvm::SmallVectorTemplateCommon< T >::isReferenceToRange(), llvm::ARM_AM::isSOImmTwoPartValNeg(), isStringOfOnes(), loadM0FromVGPR(), LowerBuildVectorAsInsert(), llvm::PatternMatch::m_SplatOrUndefMask::match(), BracedRangeExpr::match(), nodes_for_root(), llvm::HexagonBlockRanges::IndexType::operator unsigned(), llvm::gsym::operator<<(), PODSmallVector< Node *, 8 >::operator=(), packCmovGroup(), AbstractManglingParser< ManglingParser< Alloc >, Alloc >::parse(), AbstractManglingParser< ManglingParser< Alloc >, Alloc >::parseBareSourceName(), AbstractManglingParser< ManglingParser< Alloc >, Alloc >::parseBracedExpr(), AbstractManglingParser< ManglingParser< Alloc >, Alloc >::parseCtorDtorName(), AbstractManglingParser< ManglingParser< Alloc >, Alloc >::parseExpr(), AbstractManglingParser< ManglingParser< Alloc >, Alloc >::parseExprPrimary(), AbstractManglingParser< ManglingParser< Alloc >, Alloc >::parseFloatingLiteral(), AbstractManglingParser< ManglingParser< Alloc >, Alloc >::parseFoldExpr(), AbstractManglingParser< ManglingParser< Alloc >, Alloc >::parseLocalName(), AbstractManglingParser< ManglingParser< Alloc >, Alloc >::parseNumber(), AbstractManglingParser< ManglingParser< Alloc >, Alloc >::parseOperatorName(), AbstractManglingParser< ManglingParser< Alloc >, Alloc >::parseQualifiedType(), AbstractManglingParser< ManglingParser< Alloc >, Alloc >::parseSeqId(), AbstractManglingParser< ManglingParser< Alloc >, Alloc >::parseSourceName(), AbstractManglingParser< ManglingParser< Alloc >, Alloc >::parseSpecialName(), AbstractManglingParser< ManglingParser< Alloc >, Alloc >::parseSubstitution(), AbstractManglingParser< ManglingParser< Alloc >, Alloc >::parseTemplateArg(), AbstractManglingParser< ManglingParser< Alloc >, Alloc >::parseType(), llvm::RegisterBankInfo::ValueMapping::partsAllUniform(), PODSmallVector< Node *, 8 >::PODSmallVector(), PODSmallVector< Node *, 8 >::pop_back(), StringView::popFront(), llvm::MIPrinter::print(), llvm::ScalarEvolution::print(), llvm::RuntimePointerChecking::printChecks(), BracedRangeExpr::printLeft(), llvm::SparcInstPrinter::printMembarTag(), printTypes(), llvm::msf::MappedBlockStream::readLongestContiguousChunk(), llvm::ilist_base< EnableSentinelTracking >::removeRange(), llvm::ilist_base< EnableSentinelTracking >::removeRangeImpl(), llvm::object::ELFFile< ELFT >::sections(), StringView::size(), PODSmallVector< Node *, 8 >::size(), llvm::simple_ilist< MachineInstr, Options... >::splice(), llvm::BinaryStreamWriter::split(), llvm::BinaryStreamReader::split(), llvm::stable_hash_combine_range(), StringView::StringView(), llvm::GVNExpression::BasicExpression::swapOperands(), llvm::ilist_base< EnableSentinelTracking >::transferBefore(), llvm::ilist_base< EnableSentinelTracking >::transferBeforeImpl(), llvm::SelectionDAGBuilder::UpdateSplitBlock(), wasEscaped(), and PODSmallVector< Node *, 8 >::~PODSmallVector().

◆ following

This is equivalent to the following

Definition at line 671 of file README.txt.

◆ foo

int foo = call i32 @foo( i32* %x )

◆ for

this could be done in SelectionDAGISel along with other special cases, for

◆ function

we compile this: esp call L1 $pb L1 esp je LBB1_2 esp ret. The picbase is currently always computed in the entry block; it would be better to sink the picbase computation down into the block that uses it, since it is the only one that does. This happens for a lot of code with early outs. Another example is loads, which are usually emitted into the entry block on targets like x86. If not used in all paths through a function

Definition at line 427 of file README.txt.

◆ functions

aka conv or ret i32 or6 or even i, depending on the speed of the multiplier. The best way to handle this is to canonicalize it to a multiply in IR and have codegen handle lowering multiplies to shifts on CPUs where shifts are faster. We do a number of simplifications in simplify-libcalls to strength-reduce standard library functions

Definition at line 631 of file README.txt.
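
One hedged example of the kind of libcall strength reduction meant here (printf-to-puts is a standard one, though the README entry may have a different pair in mind): a printf whose format string is exactly "%s\n" can be replaced by the cheaper puts, avoiding format-string parsing at run time.

#include <stdio.h>

void greet(const char *name) {
  printf("%s\n", name);   /* a candidate for rewriting into puts(name) */
}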

◆ handle

then ret i32 result. Tail recursion elimination should handle
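
A hedged illustration of the sort of case meant (not the README's testcase): the recursive call is followed by an add, so it is not a tail call as written, but tail recursion elimination can introduce an accumulator and turn it into a loop.

int sum_to(int n) {
  if (n == 0)
    return 0;
  return n + sum_to(n - 1);   /* accumulator-style TRE turns this into a loop */
}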

◆ happened

optimized out of the function after the taildup happened

Definition at line 332 of file README.txt.

◆ http

Clang compiles this: i1 i64 store i64 i64 store i64 i64 store i64 i64 store i64 align, which gets codegen'd as xmm0 movaps rbp movaps rbp movaps rbp movaps rbp rbp rbp rbp rbp. It would be better to have movq's instead of the movaps's. http

Definition at line 532 of file README.txt.

◆ i

int i
Initial value:
{
x = 1ULL << i;
}

Definition at line 29 of file README.txt.
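
A hedged sketch of the strength reduction meant here (the loop and its bound are assumptions made only for illustration): instead of recomputing 1ULL << i on every iteration, keep a running value and double it, turning the variable shift into an add.

unsigned long long sum_of_powers_of_two(int n) {
  unsigned long long sum = 0, tmp = 1;      /* tmp tracks 1ULL << i */
  for (int i = 0; i < n; i++, tmp += tmp)
    sum += tmp;
  return sum;
}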

Referenced by llvm::AliasSetTracker::add(), AddAliasScopeMetadata(), addAndInterleaveWithUnsupported(), llvm::Function::addAttributeAtIndex(), llvm::CallBase::addAttributeAtIndex(), AddCombineBUILD_VECTORToVPADDL(), addConstantComments(), llvm::DwarfExpression::addConstantFP(), llvm::DwarfUnit::addConstantValue(), llvm::CallBase::addDereferenceableParamAttr(), llvm::SourceMgr::AddIncludeFile(), addIncomingValuesToPHIs(), llvm::RegsForValue::AddInlineAsmOperands(), llvm::LiveIntervals::addKillFlags(), llvm::LiveVariables::addNewBlock(), AddNodeIDCustom(), addOperands(), addOptionalImmOperand(), llvm::MachineInstr::addRegisterDead(), llvm::MachineInstr::addRegisterKilled(), AddRuntimeUnrollDisableMetaData(), addSaveRestoreRegs(), addShuffleForVecExtend(), addStackMapLiveVars(), llvm::addStringMetadataToLoop(), AddThumb1SBit(), llvm::SCCPInstVisitor::addTrackedFunction(), llvm::MachO::InterfaceFile::addUUID(), llvm::ARMBasicBlockUtils::adjustBBOffsetsAfter(), AdjustBlendMask(), llvm::ARMAsmBackend::adjustFixupValue(), llvm::GCNHazardRecognizer::AdvanceCycle(), llvm::AggressiveAntiDepBreaker::AggressiveAntiDepBreaker(), llvm::AggressiveAntiDepState::AggressiveAntiDepState(), llvm::AliasSet::aliasesPointer(), llvm::AliasSet::aliasesUnknownInst(), llvm::BitVector::all(), llvm::BitsInit::allInComplete(), llvm::CCState::AllocateStack(), allocset(), allSameType(), llvm::LiveRangeEdit::allUsesAvailableAt(), AnalyzeArguments(), llvm::analyzeArguments(), llvm::SystemZCCState::AnalyzeCallOperands(), llvm::CCState::AnalyzeCallOperands(), llvm::CCState::AnalyzeCallResult(), llvm::SystemZCCState::AnalyzeFormalArguments(), llvm::CCState::AnalyzeFormalArguments(), llvm::AMDGPUTargetLowering::analyzeFormalArgumentsCompute(), llvm::CCState::AnalyzeReturn(), llvm::analyzeReturnValues(), llvm::BitVector::anyCommon(), llvm::SmallBitVector::anyCommon(), appendToGlobalArray(), llvm::HexagonSubtarget::CallMutation::apply(), llvm::HexagonSubtarget::BankConflictMutation::apply(), llvm::AVRAsmBackend::applyFixup(), llvm::MipsAsmBackend::applyFixup(), llvm::ARMAsmBackend::applyFixup(), llvm::RISCVAsmBackend::applyFixup(), llvm::AMDGPURegisterBankInfo::applyMappingSBufferLoad(), llvm::PBQP::applyR1(), llvm::PBQP::applyR2(), llvm::DomTreeBuilder::SemiNCAInfo< DomTreeT >::ApplyUpdates(), ApplyX86MaskOn1BitsVec(), AreEquivalentPhiNodes(), areInverseVectorBitmasks(), areOuterLoopExitPHIsSupported(), llvm::HexagonPacketizerList::arePredicatesComplements(), ARM64EmitUnwindInfo(), llvm::ARMBaseInstrInfo::ARMBaseInstrInfo(), llvm::X86FrameLowering::assignCalleeSavedSpillSlots(), llvm::M68kFrameLowering::assignCalleeSavedSpillSlots(), llvm::HexagonFrameLowering::assignCalleeSavedSpillSlots(), llvm::PPCFrameLowering::assignCalleeSavedSpillSlots(), assignCalleeSavedSpillSlots(), AssignProtectedObjSet(), llvm::IntervalMapImpl::Path::atBegin(), llvm::DomTreeBuilder::SemiNCAInfo< DomTreeT >::attachNewSubtree(), llvm::GCOVBlock::augmentOneCycle(), llvm::AVRDAGToDAGISel::select< AVRISD::CALL >(), llvm::BitstreamWriter::BackpatchWord(), BatchCommitValueTo(), llvm::sys::path::begin(), bigEndianByteAt(), BinomialCoefficient(), BitCastConstantVector(), llvm::SwitchCG::SwitchLowering::buildBitTests(), buildCallOperands(), buildClonedLoopBlocks(), BuildConstantFromSCEV(), buildCopyToRegs(), buildFromShuffleMostly(), llvm::VPlanSlp::buildGraph(), buildOrChain(), llvm::MachineIRBuilder::buildSequence(), llvm::R600InstrInfo::buildSlotOfVectorInstruction(), llvm::SIRegisterInfo::buildSpillLoadStore(), BuildSubAggregate(), BuildVSLDOI(), 
llvm::cacheAnnotationFromMD(), calculateMMLEIndex(), llvm::LoopVectorizationCostModel::calculateRegisterUsage(), llvm::Interpreter::callFunction(), canBeFeederToNewValueJump(), canClobberPhysRegDefs(), canCreateUndefOrPoison(), canEvaluateShuffled(), canEvaluateZExtd(), llvm::HexagonInstrInfo::canExecuteInBundle(), canFoldIntoSelect(), canLoopBeDeleted(), canLowerByDroppingEvenElements(), llvm::RISCVTargetLowering::CanLowerReturn(), cannotBeOrderedLessThanZeroImpl(), canonicalizeBitSelect(), canonicalizeInsertSplat(), canonicalizeShuffleMaskWithCommute(), canonicalizeShuffleMaskWithHorizOp(), canReduceVMulWidth(), canReplaceGEPIdxWithZero(), canTrapImpl(), canWidenShuffleElements(), llvm::BitTracker::RegisterCell::cat(), CC_PPC32_SPE_CustomSplitFP64(), CC_PPC32_SPE_RetF64(), CC_PPC32_SVR4_Custom_SkipLastArgRegsPPCF128(), chainLoadsAndStoresForMemcpy(), CheckAndCreateOffsetAdd(), checkBitsConcrete(), checkDyldCommand(), checkDylibCommand(), CheckForLiveRegDefMasked(), checkLinkerOptCommand(), checkLowRegisterList(), checkOffsetSize(), checkRegOnlyPHIInputs(), llvm::CCState::CheckReturn(), checkRpathCommand(), llvm::object::BindRebaseSegInfo::checkSegAndOffsets(), checkSubCommand(), Choose(), chooseConstraint(), ChooseConstraint(), CleanupPointerRootUsers(), llvm::LiveIntervalUnion::Array::clear(), llvm::MachineRegisterInfo::clearVirtRegs(), clobbersCTR(), llvm::ARMBaseInstrInfo::ClobbersPredicate(), llvm::PPCInstrInfo::ClobbersPredicate(), llvm::CloneAndPruneIntoFromInst(), cloneInstructionInExitBlock(), llvm::JumpThreadingPass::cloneInstructions(), llvm::CloneModule(), llvm::FunctionComparator::cmpBasicBlocks(), llvm::FunctionComparator::cmpConstants(), llvm::FunctionComparator::cmpOperations(), llvm::FunctionComparator::cmpTypes(), llvm::OpenMPIRBuilder::collapseLoops(), CollectAddOperandsWithScales(), collectInsertionElements(), CollectOpsToWiden(), collectShuffleElements(), collectSingleShuffleElements(), llvm::sys::unicode::columnWidthUTF8(), combineAddOfPMADDWD(), combineAnd(), combineArithReduction(), combineBasicSADPattern(), combineBitcast(), combineBVOfConsecutiveLoads(), combineBVOfVecSExt(), combineConcatVectorOfExtracts(), combineConcatVectorOps(), combineExtractWithShuffle(), combineInsertSubvector(), combineSelect(), combineShuffleOfSplatVal(), combineShuffleToVectorExtend(), combineTargetShuffle(), combineToConsecutiveLoads(), combineToExtendBoolVectorInReg(), combineTruncationShuffle(), combineVectorShiftImm(), combineX86ShuffleChain(), combineX86ShuffleChainWithExtract(), combineX86ShufflesConstants(), combineX86ShufflesRecursively(), llvm::ShuffleVectorInst::commute(), llvm::ShuffleVectorSDNode::commuteMask(), CompactSwizzlableVector(), llvm::FunctionComparator::compare(), CompareSCEVComplexity(), completeEphemeralValues(), llvm::IntEqClasses::compress(), llvm::computeAccessFunctions(), llvm::ComputeASanStackFrameLayout(), computeCalleeSaveRegisterPairs(), llvm::ComputeEditDistance(), computeExcessPressureDelta(), computeFreeStackSlots(), llvm::SelectionDAG::computeKnownBits(), computeKnownBits(), llvm::X86TargetLowering::computeKnownBitsForTargetNode(), computeKnownBitsFromOperator(), llvm::computeKnownBitsFromRangeMetadata(), llvm::GISelKnownBits::computeKnownBitsImpl(), llvm::rdf::Liveness::computeLiveIns(), computeMaxPressureDelta(), llvm::SelectionDAG::ComputeNumSignBits(), llvm::X86TargetLowering::ComputeNumSignBitsForTargetNode(), ComputeNumSignBitsImpl(), computeNumSignBitsVectorConstant(), llvm::EHStreamer::computePadMap(), llvm::rdf::Liveness::computePhiInfo(), 
llvm::FunctionLoweringInfo::ComputePHILiveOutRegInfo(), ComputePTXValueVTs(), llvm::TargetLoweringBase::computeRegisterProperties(), llvm::SMSchedule::computeStart(), llvm::JumpThreadingPass::computeValueKnownInPredecessorsImpl(), llvm::computeValueLLTs(), llvm::ComputeValueVTs(), computeZeroableShuffleElements(), llvm::concatenateVectors(), concatSubVector(), ConsecutiveRegisters(), llvm::ConstantFoldBinaryInstruction(), llvm::ConstantFoldCastInstruction(), llvm::ConstantFoldExtractElementInstruction(), llvm::ConstantFoldGetElementPtr(), llvm::ConstantFoldInsertElementInstruction(), llvm::ConstantFoldInsertValueInstruction(), llvm::ConstantFoldLoadThroughGEPConstantExpr(), llvm::ConstantFoldSelectInstruction(), llvm::ConstantFoldShuffleVectorInstruction(), llvm::ConstantFoldTerminator(), llvm::ConstantFoldUnaryInstruction(), llvm::DwarfUnit::constructSubprogramArguments(), consume(), llvm::Constant::containsConstantExpression(), containsNoDependence(), containsUndefinedElement(), llvm::convertAddSubFlagsOpcode(), llvm::BitsInit::convertInitializerBitRange(), llvm::IntInit::convertInitializerBitRange(), llvm::BitsInit::convertInitializerTo(), llvm::IntInit::convertInitializerTo(), ConvertSelectToConcatVector(), convertShiftLeftToScale(), convertToGuardPredicates(), llvm::ARMBaseInstrInfo::convertToThreeAddress(), llvm::IntervalMapImpl::NodeBase< std::pair< KeyT, KeyT >, ValT, N >::copy(), copyExtraImplicitOps(), llvm::DenseMapBase< DenseMap< llvm::VPInstruction *, llvm::InterleaveGroup< llvm::VPInstruction > *, DenseMapInfo< llvm::VPInstruction * >, llvm::detail::DenseMapPair< llvm::VPInstruction *, llvm::InterleaveGroup< llvm::VPInstruction > * > >, llvm::VPInstruction *, llvm::InterleaveGroup< llvm::VPInstruction > *, DenseMapInfo< llvm::VPInstruction * >, llvm::detail::DenseMapPair< llvm::VPInstruction *, llvm::InterleaveGroup< llvm::VPInstruction > * > >::copyFrom(), llvm::MachineInstr::copyImplicitOps(), llvm::SparcInstrInfo::copyPhysReg(), llvm::ARMBaseInstrInfo::copyPhysReg(), llvm::MachO::ArchitectureSet::count(), llvm::SparseBitVectorElement< ElementSize >::count(), llvm::StringRef::count(), llvm::CallBase::countOperandBundlesOfType(), llvm::CallBase::Create(), llvm::IRBuilderBase::CreateAggregateRet(), llvm::IRBuilderBase::CreateAnd(), llvm::createBitMaskForGaps(), llvm::MDBuilder::createBranchWeights(), CreateGCRelocates(), llvm::IRBuilderBase::CreateGEP(), createGuardBlocks(), llvm::IRBuilderBase::CreateInBoundsGEP(), createIndexMap(), llvm::createInterleaveMask(), createLoweredInitializer(), llvm::createMemLibcall(), createMMXBuildVector(), llvm::IRBuilderBase::CreateOr(), llvm::FunctionLoweringInfo::CreateRegs(), llvm::createReplicatedMask(), createShuffleMaskFromVSELECT(), createShuffleStride(), llvm::createSplat2ShuffleMask(), llvm::IRBuilderBase::CreateStepVector(), llvm::MDBuilder::createTBAAStructNode(), llvm::MDBuilder::createTBAAStructTypeNode(), llvm::sys::fs::createUniquePath(), llvm::createUnpackShuffleMask(), createVariablePermute(), llvm::IRBuilderBase::CreateVectorReverse(), llvm::CallBase::dataOperandHasImpliedAttr(), llvm::DbgValueLocEntry::DbgValueLocEntry(), DCEInstruction(), llvm::DecodeBLENDMask(), DecodeDPRRegListOperand(), llvm::DecodeEXTRQIMask(), DecodeFixedType(), DecodeIITType(), llvm::DecodeInsertElementMask(), llvm::DecodeINSERTQIMask(), llvm::AArch64_AM::decodeLogicalImmediate(), llvm::DecodeMOVDDUPMask(), llvm::DecodeMOVHLPSMask(), llvm::DecodeMOVLHPSMask(), llvm::DecodeMOVSHDUPMask(), llvm::DecodeMOVSLDUPMask(), llvm::DecodePALIGNRMask(), 
DecodePALIGNRMask(), llvm::DecodePSHUFBMask(), llvm::DecodePSHUFHWMask(), llvm::DecodePSHUFLWMask(), llvm::DecodePSHUFMask(), llvm::DecodePSLLDQMask(), llvm::DecodePSRLDQMask(), DecodeRegListOperand(), DecodeRegListOperand16(), llvm::DecodeScalarMoveMask(), llvm::DecodeSHUFPMask(), DecodeSPRRegListOperand(), llvm::DecodeSubVectorBroadcast(), llvm::DecodeUNPCKHMask(), llvm::DecodeUNPCKLMask(), llvm::DecodeVALIGNMask(), llvm::DecodeVPERM2X128Mask(), llvm::DecodeVPERMIL2PMask(), llvm::DecodeVPERMILPMask(), llvm::DecodeVPERMMask(), llvm::DecodeVPERMV3Mask(), llvm::DecodeVPERMVMask(), llvm::DecodeVPPERMMask(), DecodeVPTMaskOperand(), llvm::decodeVSHUF64x2FamilyMask(), llvm::DecodeZeroExtendMask(), llvm::LegacyLegalizerInfo::decreaseToSmallerTypesAndIncreaseToSmallest(), DeleteBasicBlock(), llvm::deleteDeadLoop(), llvm::DeleteDeadPHIs(), llvm::DomTreeBuilder::SemiNCAInfo< DomTreeT >::DeleteUnreachable(), llvm::DemotePHIToStack(), llvm::DemoteRegToStack(), detectAVGPattern(), detectPMADDUBSW(), llvm::CallLowering::determineAssignments(), llvm::RISCVFrameLowering::determineCalleeSaves(), llvm::ARMFrameLowering::determineCalleeSaves(), llvm::AArch64FrameLowering::determineCalleeSaves(), llvm::HexagonFrameLowering::determineCalleeSaves(), llvm::TargetFrameLowering::determineCalleeSaves(), llvm::diagnoseDontCall(), llvm::MD5::MD5Result::digest(), DiscoverDependentGlobals(), llvm::DistributeRange(), llvm::NVPTXAsmPrinter::doFinalization(), doinsert(), doNotCSE(), doPromotion(), dropInstructionKeepingImpDefs(), llvm::PHITransAddr::dump(), llvm::MCFragment::dump(), llvm::LiveVariables::VarInfo::dump(), llvm::LexicalScope::dump(), llvm::DWARFUnitIndex::dump(), llvm::AppleAcceleratorTable::dump(), llvm::SlotIndexes::dump(), llvm::dwarf::CFIProgram::dump(), llvm::MCInst::dump_pretty(), llvm::dumpAmdKernelCode(), llvm::dumpBytes(), dumpBytes(), dumpDataAux(), llvm::vfs::RedirectingFileSystem::dumpEntry(), llvm::PMTopLevelManager::dumpPasses(), llvm::SystemZHazardRecognizer::dumpProcResourceCounters(), llvm::dumpRegSetPressure(), llvm::ScheduleDAGSDNodes::dumpSchedule(), llvm::JumpThreadingPass::duplicateCondBranchOnPHIIntoPred(), llvm::DuplicateInstructionsInSplitBetween(), llvm::BitTracker::MachineEvaluator::eAND(), llvm::BitTracker::MachineEvaluator::eIMM(), llvm::LiveRangeEdit::eliminateDeadDefs(), llvm::BPFRegisterInfo::eliminateFrameIndex(), EltsFromConsecutiveLoads(), llvm::StringMatcher::Emit(), llvm::DIEAbbrev::Emit(), emitAlignedDPRCS2Restores(), emitAlignedDPRCS2Spills(), llvm::emitAMDGPUPrintfCall(), llvm::EmitAnyX86InstComments(), emitConstant(), llvm::AsmPrinter::emitConstantPool(), llvm::MipsSEFrameLowering::emitEpilogue(), llvm::XCoreFrameLowering::emitEpilogue(), llvm::PPCFrameLowering::emitEpilogue(), llvm::MachineInstr::emitError(), llvm::EHStreamer::emitExceptionTable(), llvm::MCObjectStreamer::emitFill(), EmitGenDwarfAranges(), llvm::EmitGEPOffset(), emitGlobalConstantArray(), emitGlobalConstantDataSequential(), emitGlobalConstantLargeInt(), emitGlobalConstantStruct(), emitGlobalConstantVector(), llvm::ExecutionEngine::emitGlobals(), llvm::MipsMCCodeEmitter::emitInstruction(), llvm::ScoreboardHazardRecognizer::EmitInstruction(), llvm::MCStreamer::emitInstruction(), llvm::R600TargetLowering::EmitInstrWithCustomInserter(), llvm::ARMTargetLowering::EmitInstrWithCustomInserter(), llvm::MCWinCOFFStreamer::emitInstToData(), emitKill(), llvm::MachineRegisterInfo::EmitLiveInCopies(), llvm::ScheduleHazardRecognizer::EmitNoops(), emitNop(), llvm::TargetLoweringBase::emitPatchPoint(), 
llvm::Thumb1FrameLowering::emitPrologue(), llvm::MipsSEFrameLowering::emitPrologue(), llvm::XCoreFrameLowering::emitPrologue(), llvm::ARMFrameLowering::emitPrologue(), llvm::SystemZELFFrameLowering::emitPrologue(), llvm::BitstreamWriter::EmitRecord(), llvm::GraphWriter< GraphType >::emitSimpleNode(), llvm::BufferByteStreamer::emitSLEB128(), llvm::StringToOffsetTable::EmitString(), llvm::HexagonMCELFStreamer::EmitSymbol(), llvm::ARMSelectionDAGInfo::EmitTargetCodeForMemcpy(), llvm::BufferByteStreamer::emitULEB128(), emitWideAPInt(), llvm::SparseBitVectorElement< ElementSize >::empty(), llvm::encodeBase64(), encodeBase64StringEntry(), llvm::sys::path::end(), llvm::BitTracker::MachineEvaluator::eNOT(), llvm::BitTracker::MachineEvaluator::eORL(), llvm::IntervalMapImpl::NodeBase< std::pair< KeyT, KeyT >, ValT, N >::erase(), llvm::PriorityQueue< T, Sequence, Compare >::erase_one(), llvm::yaml::escape(), llvm::Regex::escape(), llvm::DOT::EscapeString(), estimateRSStackSizeLimit(), llvm::MachineFrameInfo::estimateStackSize(), llvm::HexagonEvaluator::evaluate(), llvm::SCEVAddRecExpr::evaluateAtIteration(), EvaluateExpression(), evaluateGEPOffsetExpression(), evaluateICmpRelation(), evaluateInDifferentElementOrder(), llvm::InstCombinerImpl::EvaluateInDifferentType(), EvaluateStoreInto(), executeSelectInst(), llvm::BitTracker::MachineEvaluator::eXOR(), llvm::ModuloScheduleExpander::expand(), llvm::TargetLowering::expandUnalignedLoad(), llvm::TargetLowering::expandUnalignedStore(), ExtendToType(), ExtendUsesToFormExtLoad(), llvm::AppleAcceleratorTable::extract(), llvm::BitTracker::RegisterCell::extract(), llvm::extractConstantMask(), llvm::Instruction::extractProfTotalWeight(), extractVector(), f64AssignAAPCS(), f64RetAssign(), llvm::AArch64TargetLowering::fallBackToDAGISel(), llvm::LegalizerHelper::fewerElementsVectorSelect(), llvm::SHA1::final(), llvm::SHA256::final(), llvm::GISelWorkList< 8 >::finalize(), llvm::UnwindOpcodeAssembler::Finalize(), llvm::finalizeBundle(), llvm::RuntimeDyldELF::finalizeLoad(), llvm::SMSchedule::finalizeSchedule(), llvm::SparseBitVectorElement< ElementSize >::find_first(), llvm::StringRef::find_first_not_of(), llvm::StringRef::find_first_of(), llvm::OnDiskChainedHashTable< Info >::find_hashed(), llvm::StringRef::find_last_not_of(), llvm::StringRef::find_last_of(), llvm::SparseBitVectorElement< ElementSize >::find_next(), llvm::DWARFAbbreviationDeclaration::findAttributeIndex(), llvm::SwitchCG::SwitchLowering::findBitTestClusters(), llvm::SourceMgr::FindBufferContainingLoc(), findCorrespondingPred(), findDeadCallerSavedReg(), findDefIdx(), findDemandedEltsBySingleUser(), FindFirstNonCommonLetter(), llvm::MCInstrDesc::findFirstPredOperandIdx(), llvm::MachineInstr::findFirstPredOperandIdx(), findFirstVectorPredOperandIdx(), llvm::findFirstVPTPredOperandIdx(), llvm::IntervalMapImpl::LeafNode< KeyT, ValT, N, Traits >::findFrom(), llvm::IntervalMapImpl::BranchNode< KeyT, ValT, RootBranchCap, Traits >::findFrom(), findFuncPointers(), llvm::Mips16HardFloatInfo::findFuncSignature(), llvm::ExecutionEngine::FindFunctionNamed(), llvm::sampleprof::FunctionSamples::findFunctionSamples(), llvm::ExecutionEngine::FindGlobalVariableNamed(), llvm::SparseSet< RootData >::findIndex(), llvm::SparseMultiSet< VReg2SUnit, VirtReg2IndexFunctor >::findIndex(), llvm::MachineInstr::findInlineAsmFlagIdx(), FindInOperandList(), llvm::FindInsertedValue(), llvm::SwitchCG::SwitchLowering::findJumpTables(), llvm::LiveVariables::VarInfo::findKill(), findLiveReferences(), FindMatchingEpilog(), findmust(), 
llvm::cl::generic_parser_base::findOption(), llvm::findOptionMDForLoopID(), llvm::RuntimeDyldImpl::findOrEmitSection(), findPartitions(), findPHIToPartitionLoops(), llvm::MachineInstr::findRegisterDefOperandIdx(), llvm::MachineInstr::findRegisterUseOperandIdx(), llvm::TargetLoweringBase::findRepresentativeClass(), llvm::DomTreeBuilder::SemiNCAInfo< DomTreeT >::FindRoots(), findScratchNonCalleeSaveRegister(), llvm::MachineInstr::findTiedOperandIdx(), FindUsedValues(), findUsedValues(), findUseIdx(), llvm::SplitEditor::finish(), firstch(), llvm::R600InstrInfo::fitsConstReadLimitations(), llvm::R600InstrInfo::fitsReadPortLimitations(), llvm::InnerLoopVectorizer::fixNonInductionPHIs(), llvm::PPCInstrInfo::fixupIsDeadOrKill(), llvm::SwingSchedulerDAG::fixupRegisterOverlaps(), fixupShuffleMaskForPermutedSToV(), fixupVariableFloatArgs(), FlattenVectorShuffle(), llvm::BinOpInit::Fold(), llvm::TernOpInit::Fold(), llvm::CondOpInit::Fold(), FoldBUILD_VECTOR(), llvm::InstCombinerImpl::foldCmpLoadFromIndexedGlobal(), foldCONCAT_VECTORS(), FoldCondBranchOnPHIImpl(), llvm::InstCombinerImpl::foldGEPICmp(), foldICmpWithLowBitMaskedVal(), foldIdentityExtractShuffle(), foldIdentityPaddedShuffles(), foldInsEltIntoIdentityShuffle(), foldInsEltIntoSplat(), foldInsSequenceIntoSplat(), llvm::InstCombinerImpl::foldIntegerTypedPHI(), llvm::InstCombinerImpl::foldOpIntoPhi(), foldPatchpoint(), llvm::InstCombinerImpl::foldPHIArgBinOpIntoPHI(), llvm::InstCombinerImpl::foldPHIArgExtractValueInstructionIntoPHI(), llvm::InstCombinerImpl::foldPHIArgGEPIntoPHI(), llvm::InstCombinerImpl::foldPHIArgInsertValueInstructionIntoPHI(), llvm::InstCombinerImpl::foldPHIArgLoadIntoPHI(), llvm::InstCombinerImpl::foldPHIArgOpIntoPHI(), llvm::InstCombinerImpl::foldPHIArgZextsIntoPHI(), foldShuffleOfConcatUndefs(), foldShuffleWithInsert(), foldTruncShuffle(), for(), ForeachDagApply(), formSplatFromShuffles(), llvm::PMDataManager::freePass(), freeset(), freezeset(), llvm::FunctionComparator::functionHash(), FuseInst(), FuseTwoAddrInst(), gatherIncomingValuesToPhi(), llvm::ARMAsmBackendDarwin::generateCompactUnwindEncoding(), GeneratePerfectShuffle(), genShuffleBland(), llvm::gsym::LineTable::get(), llvm::RecordRecTy::get(), llvm::ConstantStruct::get(), llvm::PPC::get_VSPLTI_elt(), llvm::LegacyLegalizerInfo::getAction(), llvm::LegalizerInfo::getAction(), llvm::ScalarEvolution::getAddExpr(), llvm::ScalarEvolution::getAddRecExpr(), getAddressAccessSCEV(), llvm::rdf::PhysicalRegisterInfo::getAliasSet(), llvm::getAlign(), getAllocatableSetForRC(), llvm::Function::getArg(), llvm::VarDefInit::getArg(), GetArgMD(), llvm::CallBase::getArgOperand(), llvm::FuncletPadInst::getArgOperand(), llvm::CallBase::getArgOperandUse(), getArrayElements(), llvm::BitsInit::getAsString(), llvm::CondOpInit::getAsString(), llvm::DagInit::getAsString(), llvm::Function::getAttributeAtIndex(), llvm::CallBase::getAttributeAtIndex(), llvm::PHINode::getBasicBlockIndex(), getBBClusterInfo(), getBestDestForJumpOnUndef(), llvm::Trace::getBlock(), llvm::Trace::getBlockIndex(), llvm::BitstreamBlockInfo::getBlockInfo(), llvm::BitstreamWriter::getBlockInfo(), GetBranchWeights(), llvm::SourceMgr::getBufferInfo(), getBuildDwordsVector(), getBuildPairElt(), GetCodeName(), llvm::SelectionDAG::getConstant(), llvm::SDValue::getConstantOperandAPInt(), llvm::SDValue::getConstantOperandVal(), llvm::MachineConstantPool::getConstantPoolIndex(), llvm::getConstantRangeFromMetadata(), llvm::ExecutionEngine::getConstantValue(), getConstantVector(), getConstVector(), 
llvm::Type::getContainedType(), llvm::DWARFUnitIndex::Entry::getContribution(), getCopyFromPartsVector(), llvm::RegsForValue::getCopyFromRegs(), getCopyToParts(), getCopyToPartsVector(), llvm::RegsForValue::getCopyToRegs(), llvm::AMDGPURegisterBankInfo::getDefaultMappingSOP(), llvm::AMDGPURegisterBankInfo::getDefaultMappingVOP(), llvm::IndirectBrInst::getDestination(), getEdgeValueLocal(), llvm::ListInit::getElement(), llvm::ListInit::getElementAsRecord(), llvm::ARMConstantPoolValue::getExistingMachineCPValueImpl(), llvm::cl::generic_parser_base::getExtraOptionNames(), getFauxShuffleMask(), llvm::X86::getFeaturesForCPU(), llvm::MachineFunction::getFilterIDFor(), llvm::CCState::getFirstUnallocated(), llvm::InstCombiner::getFlippedStrictnessPredicateAndConstant(), getFrameIndexOperandNum(), llvm::R600FrameLowering::getFrameIndexReference(), llvm::DWARFUnitIndex::getFromOffset(), llvm::Type::getFunctionParamType(), getGEPSmallConstantIntOffsetV(), llvm::ExecutionEngine::getGlobalValueAtAddress(), llvm::getGuaranteedWellDefinedOps(), getHalfShuffleMask(), llvm::ScoreboardHazardRecognizer::getHazardType(), getHiPELiteral(), getHopForBuildVector(), llvm::NVPTXMachineFunctionInfo::getImageHandleSymbolIndex(), getImpliedDisabledFeatures(), getImpliedEnabledFeatures(), llvm::Function::getImportGUIDs(), llvm::PHINode::getIncomingBlock(), llvm::PHINode::getIncomingValue(), llvm::PHINode::getIncomingValueNumForOperand(), llvm::getIndexExpressionsFromGEP(), llvm::CallBrInst::getIndirectDest(), llvm::CallBrInst::getIndirectDestLabel(), llvm::CallBrInst::getIndirectDestLabelUse(), llvm::CallBrInst::getIndirectDests(), getInitPhiReg(), llvm::DWARFContext::getInliningInfoForAddress(), getInputChainForNode(), getInsertPointForUses(), llvm::PPCInstrInfo::getInstrLatency(), llvm::ARMRegisterBankInfo::getInstrMapping(), llvm::AMDGPURegisterBankInfo::getInstrMapping(), llvm::AMDGPUDisassembler::getInstruction(), getKnownUndefForVectorBinop(), llvm::object::MachOObjectFile::getLibraryShortNameByIndex(), llvm::MachineFrameInfo::getLocalFrameObjectMap(), getLoopPhiReg(), llvm::HexagonMCCodeEmitter::getMachineOpValue(), getMangledTypeStr(), llvm::AMDGPURegisterBankInfo::getMappingType(), getMemcpyLoadsAndStores(), llvm::TargetTransformInfoImplBase::getMemcpyLoopResidualLoweringType(), getMemmoveLoadsAndStores(), llvm::SourceMgr::getMemoryBuffer(), llvm::SystemZTTIImpl::getMemoryOpCost(), llvm::PPCTTIImpl::getMemoryOpCost(), getMemsetStores(), getMemsetStringVal(), llvm::SourceMgr::GetMessage(), llvm::ScalarEvolution::getMinMaxExpr(), llvm::MDNode::getMostGenericRange(), getMOVL(), llvm::AMDGPU::SendMsg::getMsgId(), llvm::AMDGPU::SendMsg::getMsgOpId(), llvm::ScalarEvolution::getMulExpr(), llvm::TargetLowering::getMultipleConstraintMatchWeight(), llvm::X86TargetLowering::getNegatedExpression(), getNewSource(), llvm::NodeSet::getNode(), llvm::SelectionDAG::getNode(), llvm::MCInstrDesc::getNumImplicitDefs(), llvm::MCInstrDesc::getNumImplicitUses(), getOffsetFromIndex(), getOffsetFromIndices(), getOneTrueElt(), getOpenCLAlignment(), llvm::SCEVCastExpr::getOperand(), llvm::User::getOperand(), llvm::SCEVNAryExpr::getOperand(), llvm::SDValue::getOperand(), llvm::MCInst::getOperand(), llvm::SCEVUDivExpr::getOperand(), llvm::MachineInstr::getOperand(), llvm::UnOpInit::getOperand(), llvm::BinOpInit::getOperand(), llvm::TernOpInit::getOperand(), llvm::NamedMDNode::getOperand(), llvm::CallBase::getOperandBundle(), llvm::CallBase::getOperandBundlesAsDefs(), llvm::RegisterBankInfo::InstructionMapping::getOperandMapping(), 
llvm::PHINode::getOperandNumForIncomingValue(), llvm::User::getOperandUse(), getOpIdxForMO(), llvm::MipsTargetLowering::getOpndList(), getOptionHelpName(), llvm::cl::generic_parser_base::getOptionWidth(), llvm::MachineFunction::getOrCreateLandingPadInfo(), llvm::AArch64TTIImpl::getOrCreateResultFromMemIntrinsic(), llvm::DIBuilder::getOrCreateTypeArray(), getOtherIncomingValue(), getOutliningPenalty(), llvm::CallBase::getParamDereferenceableBytes(), llvm::CallBase::getParamDereferenceableOrNullBytes(), llvm::FunctionType::getParamType(), llvm::SourceMgr::getParentIncludeLoc(), llvm::LessRecordRegister::RecordParts::getPart(), getPHIDeps(), getPhiRegs(), getPHISrcRegOpIdx(), llvm::SCEVAddRecExpr::getPostIncExpr(), getPowerOf2Factor(), llvm::LazyValueInfo::getPredicateAt(), llvm::NVPTXTargetLowering::getPrototype(), getPSHUFShuffleMask(), llvm::MCRegisterInfo::getRegClass(), llvm::TargetRegisterInfo::getRegClass(), llvm::MachineInstr::getRegClassConstraintEffectForVReg(), llvm::MCRegisterClass::getRegister(), llvm::TargetRegisterClass::getRegister(), llvm::M68kRegisterInfo::getRegisterOrder(), getRegsUsedByPHIs(), llvm::SIRegisterInfo::getReservedRegs(), llvm::AArch64RegisterInfo::getReservedRegs(), llvm::GetReturnInfo(), llvm::InstCombiner::getSafeVectorConstantForBinop(), llvm::X86TTIImpl::getScalarizationOverhead(), llvm::BasicTTIImplBase< AMDGPUTTIImpl >::getScalarizationOverhead(), llvm::ARMTargetLowering::getSchedulingPreference(), llvm::AArch64RegisterInfo::getSEHRegNum(), llvm::X86RegisterInfo::getSEHRegNum(), getSetupCost(), getShiftedValue(), getShuffleComment(), getShuffleDemandedElts(), llvm::ShuffleVectorInst::getShuffleMask(), getShuffleMaskIndexOfOneElementFromOp0IntoOp1(), llvm::getShuffleReduction(), getShuffleVectorZeroOrUndef(), llvm::ShuffleVectorSDNode::getSplatIndex(), llvm::BuildVectorSDNode::getSplatValue(), llvm::SelectionDAG::getStepVector(), llvm::SCCPInstVisitor::getStructLatticeValueFor(), llvm::VectorType::getSubdividedVectorType(), llvm::BranchInst::getSuccessor(), llvm::IndirectBrInst::getSuccessor(), llvm::InvokeInst::getSuccessor(), llvm::CallBrInst::getSuccessor(), llvm::GetSuccessorNumber(), llvm::sys::getSwappedBytes(), getTargetConstantBitsFromNode(), getTargetShuffleAndZeroables(), getTargetVShiftByConstNode(), llvm::ScalarEvolution::getTruncateExpr(), llvm::BasicTTIImplBase< AMDGPUTTIImpl >::getTypeBasedIntrinsicInstrCost(), llvm::ConstantStruct::getTypeForElements(), llvm::MachineFunction::getTypeIDFor(), llvm::ScalarEvolution::getUDivExactExpr(), llvm::ScalarEvolution::getUDivExpr(), llvm::GetUnrollMetadata(), llvm::SelectionDAG::getValidMaximumShiftAmountConstant(), llvm::SelectionDAG::getValidMinimumShiftAmountConstant(), llvm::PackedVectorBase< T, BitNum, BitVectorTy, false >::getValue(), llvm::PackedVectorBase< T, BitNum, BitVectorTy, true >::getValue(), llvm::yaml::ScalarNode::getValue(), llvm::FunctionLoweringInfo::getValueFromVirtualReg(), llvm::SelectionDAGBuilder::getValueImpl(), llvm::MachineSSAUpdater::GetValueInMiddleOfBlock(), llvm::SSAUpdater::GetValueInMiddleOfBlock(), llvm::SelectionDAG::getVectorShuffle(), getX86MaskVec(), group2Shuffle(), GroupByComplexity(), llvm::PBQP::RegAlloc::NodeMetadata::handleAddEdge(), llvm::CallLowering::handleAssignments(), handlePhiDef(), llvm::PBQP::RegAlloc::NodeMetadata::handleRemoveEdge(), llvm::LiveVariables::HandleVirtRegUse(), llvm::GetElementPtrInst::hasAllConstantIndices(), llvm::GetElementPtrInst::hasAllZeroIndices(), HasConditionalBranch(), llvm::PHINode::hasConstantOrUndefValue(), 
llvm::PHINode::hasConstantValue(), llvm::MCInstrDesc::hasDefOfPhysReg(), HashMachineInstr(), hasIdenticalHalvesShuffleMask(), llvm::X86InstrInfo::hasLiveCondCodeDef(), hasNormalLoadOperand(), llvm::CallBase::hasOperandBundlesOtherThan(), hasRegisterDependency(), llvm::MachineInstr::hasRegisterImplicitUseOperand(), llvm::PBQP::hasRegisterOptions(), hasVectorOperands(), haveEfficientBuildVectorPattern(), haveSameOperands(), llvm::HexagonLowerToMC(), llvm::hoistRegion(), llvm::rdf::NodeAllocator::id(), incDecVectorConstant(), llvm::ValueEnumerator::incorporateFunction(), llvm::LegacyLegalizerInfo::increaseToLargerTypesAndDecreaseToLargest(), llvm::InterferenceCache::init(), llvm::SchedBoundary::init(), llvm::ConvergingVLIWScheduler::initialize(), INITIALIZE_PASS(), llvm::ExecutionEngine::InitializeMemory(), llvm::ResourcePriorityQueue::initNodes(), llvm::ScheduleDAGMILive::initRegPressure(), llvm::InlineFunction(), llvm::SystemZELFFrameLowering::inlineStackProbe(), llvm::PPCFrameLowering::inlineStackProbe(), llvm::PriorityWorklist< llvm::LazyCallGraph::SCC *, SmallVector< llvm::LazyCallGraph::SCC *, N >, SmallDenseMap< llvm::LazyCallGraph::SCC *, ptrdiff_t > >::insert(), llvm::BitTracker::RegisterCell::insert(), llvm::APInt::insertBits(), llvm::ARCInstrInfo::insertBranch(), llvm::IntervalMapImpl::LeafNode< KeyT, ValT, N, Traits >::insertFrom(), llvm::TargetInstrInfo::insertNoops(), insertParsePoints(), insertUniqueBackedgeBlock(), insertVector(), installExceptionOrSignalHandlers(), llvm::PPCTTIImpl::instCombineIntrinsic(), llvm::SparseBitVectorElement< ElementSize >::intersects(), llvm::SparseBitVectorElement< ElementSize >::intersectWith(), llvm::SparseBitVectorElement< ElementSize >::intersectWithComplement(), llvm::IntervalPartition::IntervalPartition(), InTreeUserNeedToExtract(), llvm::MachineTraceMetrics::invalidate(), is128BitUnpackShuffleMask(), is16BitEquivalent(), is64Bit(), isACalleeSavedRegister(), isAddSubOrSubAdd(), isAddSubOrSubAddMask(), isAllConstantBuildVector(), isAlternatingShuffMask(), isBigEndian(), llvm::isCalleeSavedRegister(), llvm::SITargetLowering::isCanonicalized(), llvm::BitsInit::isComplete(), llvm::BitsInit::isConcrete(), isConstantIntVector(), isConstantOrUndefBUILD_VECTOR(), llvm::BuildVectorSDNode::isConstantSplat(), llvm::X86::isConstantSplat(), llvm::ISD::isConstantSplatVectorAllOnes(), llvm::MachineInstr::isConstantValuePHI(), isConstCompatible(), llvm::IsCPSRDead< MachineInstr >(), isDeInterleaveMaskOfFactor(), llvm::ArgumentPromotionPass::isDenselyPacked(), llvm::SparseSolver< LatticeKey, LatticeVal, KeyInfo >::isEdgeFeasible(), llvm::Type::isEmptyTy(), isEndbrImm64(), IsEquivalentPHI(), isExtendedBUILD_VECTOR(), llvm::ShuffleVectorInst::isExtractSubvectorMask(), llvm::X86TargetLowering::isFPImmLegal(), llvm::ARMBaseRegisterInfo::isFrameOffsetLegal(), isFunctionMallocLike(), isFusableLoadOpStorePattern(), llvm::SelectionDAG::isGuaranteedNotToBeUndefOrPoison(), isGuaranteedNotToBeUndefOrPoison(), isHomogeneousAggregate(), isHopBuildVector(), isHorizontalBinOp(), isHorizontalBinOpPart(), llvm::MachineInstr::isIdenticalTo(), isIdentityMaskImpl(), llvm::ShuffleVectorInst::isIdentityWithPadding(), isInBoundsIndices(), llvm::ShuffleVectorInst::isInsertSubvectorMask(), isinsets(), isINSMask(), llvm::BitTracker::MachineEvaluator::isInt(), isIntersect(), llvm::LiveRangeCalc::isJointlyDominated(), llvm::isKnownNeverInfinity(), llvm::isKnownNeverNaN(), isKnownNonZero(), isLaneCrossingShuffleMask(), llvm::HexagonPacketizerList::isLegalToPacketizeTogether(), 
llvm::R600InstrInfo::isLegalUpTo(), llvm::LiveVariables::isLiveOut(), llvm::SMSchedule::isLoopCarriedDefOfUse(), isMaybeZeroSizedType(), isMultiLaneShuffleMask(), isNByteElemShuffleMask(), llvm::ARM_AM::isNEONBytesplat(), isNonZeroElementsInOrder(), isNoopShuffleMask(), IsNullTerminatedString(), IsOperandAMemoryOperand(), llvm::SIInstrInfo::isOperandLegal(), isOuterMostDepPositive(), llvm::RISCVSubtarget::isRegisterReservedByUser(), isRepeatedByteSequence(), isRepeatedShuffleMask(), isRepeatedTargetShuffleMask(), llvm::VLIWResourceModel::isResourceAvailable(), llvm::ResourcePriorityQueue::isResourceAvailable(), isReturnNonNull(), llvm::ShuffleVectorInst::isReverseMask(), isReverseMask(), isREVMask(), llvm::isSafeToSpeculativelyExecute(), llvm::AMDGPURegisterBankInfo::isSALUMapping(), llvm::PPCInstrInfo::isSameClassPhysRegCopy(), llvm::Instruction::isSameOperationAs(), llvm::ShuffleVectorInst::isSelectMask(), isSequentialOrUndefInRange(), isSequentialOrUndefOrZeroInRange(), isSETCCorConvertedSETCC(), isShuffleEquivalent(), isShuffleEquivalentToSelect(), isShuffleMaskInputInPlace(), llvm::AArch64TargetLowering::isShuffleMaskLegal(), llvm::ARMTargetLowering::isShuffleMaskLegal(), isSimpleEnoughValueToCommitHelper(), llvm::BasicBlockEdge::isSingleEdge(), isSingletonEXTMask(), isSingletonVEXTMask(), isSplat(), isSplatBV(), llvm::ShuffleVectorSDNode::isSplatMask(), llvm::PPC::isSplatShuffleMask(), llvm::SelectionDAG::isSplatValue(), llvm::SCCPInstVisitor::isStructLatticeConstant(), isSupportedType(), isTargetNullPtr(), isTargetShuffleEquivalent(), llvm::ShuffleVectorInst::isTransposeMask(), isTRN_v_undef_Mask(), isTRNMask(), isTwoAddrUse(), isUZP_v_undef_Mask(), isUZPMask(), llvm::ShuffleVectorInst::isValidOperands(), isVectorElementSwap(), isVectorPredicable(), isVEXTMask(), isVMerge(), isVMOVNMask(), isVMOVNTruncMask(), llvm::PPC::isVPKUDUMShuffleMask(), llvm::PPC::isVPKUHUMShuffleMask(), llvm::PPC::isVPKUWUMShuffleMask(), llvm::isVREVMask(), llvm::PPC::isVSLDOIShuffleMask(), isVTRN_v_undef_Mask(), isVTRNMask(), isVUZP_v_undef_Mask(), isVUZPMask(), isVZIP_v_undef_Mask(), isVZIPMask(), isWideTypeMask(), llvm::AArch64Subtarget::isXRegCustomCalleeSaved(), llvm::AArch64Subtarget::isXRegisterReserved(), isXXBRShuffleMaskHelper(), llvm::ShuffleVectorInst::isZeroEltSplatMask(), isZIP_v_undef_Mask(), isZipMask(), isZIPMask(), iterativelySimplifyCFG(), IVUseShouldUsePostIncValue(), llvm::LiveRange::join(), KnuthDiv(), llvm::MCAssembler::layout(), layoutCOFF(), llvm::SIInstrInfo::legalizeOperands(), llvm::SIInstrInfo::legalizeOperandsVOP3(), llvm::cfg::LegalizeUpdates(), LinearizeExprTree(), listContainsReg(), littleEndianByteAt(), lle_X_scanf(), lle_X_sscanf(), llvm_getMetadata(), llvm_regcomp(), LLVMCopyModuleFlagsMetadata(), LLVMGetArgOperand(), LLVMGetMDNodeOperands(), LLVMGetNamedMetadataOperands(), LLVMGetSubtypes(), LLVMGetSuccessor(), LLVMSetArgOperand(), LLVMSetSuccessor(), LLVMStructGetTypeAtIndex(), llvm::Mips16InstrInfo::loadImmediate(), llvm::PPCInstrInfo::loadRegFromStackSlotNoUpd(), llvm::ExecutionEngine::LoadValueFromMemory(), LookForIdenticalPHI(), llvm::XCoreMCInstLower::Lower(), llvm::BPFMCInstLower::Lower(), llvm::MSP430MCInstLower::Lower(), llvm::ARCMCInstLower::Lower(), llvm::MipsMCInstLower::Lower(), llvm::M68kMCInstLower::Lower(), lower1BitShuffle(), lower1BitShuffleAsKSHIFTR(), LowerAVXCONCAT_VECTORS(), llvm::LegalizerHelper::lowerBitCount(), LowerBITREVERSE(), LowerBITREVERSE_XOP(), llvm::LegalizerHelper::lowerBswap(), llvm::HexagonTargetLowering::LowerBUILD_VECTOR(), 
LowerBUILD_VECTOR_i1(), LowerBUILD_VECTORvXi1(), LowerBuildVectorAsInsert(), LowerBuildVectorOfFPExt(), LowerBuildVectorOfFPTrunc(), lowerBuildVectorToBitOp(), LowerBuildVectorv16i8(), LowerBuildVectorv4x32(), llvm::VETargetLowering::LowerCall(), llvm::HexagonTargetLowering::LowerCall(), llvm::SITargetLowering::LowerCall(), llvm::RISCVTargetLowering::LowerCall(), llvm::NVPTXTargetLowering::LowerCall(), llvm::CallLowering::lowerCall(), llvm::SparcTargetLowering::LowerCall_32(), llvm::SparcTargetLowering::LowerCall_64(), llvm::HexagonTargetLowering::LowerCallResult(), llvm::SITargetLowering::LowerCallResult(), lowerCallResult(), LowerCallResult(), llvm::FastISel::lowerCallTo(), llvm::TargetLowering::LowerCallTo(), llvm::HexagonTargetLowering::LowerCONCAT_VECTORS(), LowerCONCAT_VECTORS_i1(), LowerCONCAT_VECTORSvXi1(), llvm::HexagonTargetLowering::LowerConstantPool(), LowerCTLZ(), LowerCTPOP(), LowerEXTEND_VECTOR_INREG(), LowerEXTRACT_SUBVECTOR(), llvm::MipsCallLowering::lowerFormalArguments(), llvm::AArch64CallLowering::lowerFormalArguments(), llvm::R600TargetLowering::LowerFormalArguments(), llvm::VETargetLowering::LowerFormalArguments(), llvm::HexagonTargetLowering::LowerFormalArguments(), llvm::SITargetLowering::LowerFormalArguments(), llvm::RISCVTargetLowering::LowerFormalArguments(), llvm::NVPTXTargetLowering::LowerFormalArguments(), llvm::SparcTargetLowering::LowerFormalArguments_32(), llvm::SparcTargetLowering::LowerFormalArguments_64(), llvm::AMDGPUCallLowering::lowerFormalArgumentsKernel(), llvm::InlineAsmLowering::lowerInlineAsm(), llvm::HexagonTargetLowering::LowerINLINEASM(), lowerINT_TO_FP_vXi64(), llvm::AArch64TargetLowering::lowerInterleavedLoad(), llvm::ARMTargetLowering::lowerInterleavedLoad(), llvm::AArch64TargetLowering::lowerInterleavedStore(), llvm::ARMTargetLowering::lowerInterleavedStore(), llvm::X86TargetLowering::lowerInterleavedStore(), LowerMUL(), LowerMULH(), llvm::LowerPPCMachineInstrToMCInst(), llvm::AArch64CallLowering::lowerReturn(), llvm::VETargetLowering::LowerReturn(), llvm::HexagonTargetLowering::LowerReturn(), llvm::RISCVTargetLowering::LowerReturn(), llvm::NVPTXTargetLowering::LowerReturn(), llvm::SparcTargetLowering::LowerReturn_32(), llvm::SparcTargetLowering::LowerReturn_64(), LowerReverse_VECTOR_SHUFFLE(), LowerScalarVariableShift(), LowerShift(), lowerShuffleAsBitBlend(), lowerShuffleAsBitMask(), lowerShuffleAsBlend(), lowerShuffleAsBlendAndPermute(), lowerShuffleAsBlendOfPSHUFBs(), lowerShuffleAsDecomposedShuffleMerge(), lowerShuffleAsElementInsertion(), lowerShuffleAsLanePermuteAndPermute(), lowerShuffleAsLanePermuteAndRepeatedMask(), lowerShuffleAsLanePermuteAndShuffle(), lowerShuffleAsLanePermuteAndSHUFP(), lowerShuffleAsPermuteAndUnpack(), lowerShuffleAsRepeatedMaskAndLanePermute(), lowerShuffleAsSpecificZeroOrAnyExtend(), lowerShuffleAsSplitOrBlend(), lowerShuffleAsZeroOrAnyExtend(), lowerShuffleWithPACK(), lowerShuffleWithPSHUFB(), LowerSIGN_EXTEND(), llvm::LowerSparcMachineInstrToMCInst(), lowerStatepointMetaArgs(), LowerToHorizontalOp(), lowerV16I8Shuffle(), lowerV4X128Shuffle(), lowerV8I16GeneralSingleInputShuffle(), lowerV8I16Shuffle(), llvm::HexagonTargetLowering::LowerVECTOR_SHUFFLE(), LowerVECTOR_SHUFFLE(), lowerVECTOR_SHUFFLE(), lowerVECTOR_SHUFFLE_SHF(), lowerVECTOR_SHUFFLE_VSHF(), LowerVECTOR_SHUFFLEUsingOneOff(), LowerVectorCTLZInRegLUT(), LowerVectorCTPOPInRegLUT(), llvm::LowerVEMachineInstrToMCInst(), LowervXi8MulWithUNPCK(), makeAllBits(), llvm::makePostTransformationMetadata(), llvm::RuntimeDyldImpl::mapSectionAddress(), 
llvm::SCCPInstVisitor::markOverdefined(), llvm::LiveVariables::MarkVirtRegAliveInBlock(), llvm::Regex::match(), llvm::PatternMatch::cstval_pred_ty< Predicate, ConstantVal >::match(), matchAddReduction(), matchBinaryPermuteShuffle(), llvm::ISD::matchBinaryPredicate(), matchBinaryShuffle(), llvm::SelectionDAG::matchBinOpReduction(), llvm::CombinerHelper::matchCombineShuffleVector(), matchIntrinsicType(), matchPMADDWD(), matchPMADDWD_2(), matchShuffleAsBitRotate(), matchShuffleAsBlend(), matchShuffleAsElementRotate(), matchShuffleAsEXTRQ(), matchShuffleAsInsertPS(), matchShuffleAsShift(), matchShuffleWithSHUFPD(), matchShuffleWithUNPCK(), matchStridedConstant(), llvm::CombinerHelper::matchTruncStoreMerge(), matchUnaryPermuteShuffle(), llvm::ISD::matchUnaryPredicate(), matchUnaryShuffle(), llvm::PBQP::RegAlloc::MatrixMetadata::MatrixMetadata(), llvm::BitTracker::RegisterCell::meet(), mergeConstants(), llvm::TargetTransformInfoImplBase::minRequiredElementSize(), llvm::MipsSETargetLowering::MipsSETargetLowering(), moveBelowOrigChain(), llvm::IntervalMapImpl::NodeBase< std::pair< KeyT, KeyT >, ValT, N >::moveLeft(), llvm::IntervalMapImpl::NodeBase< std::pair< KeyT, KeyT >, ValT, N >::moveRight(), llvm::LoopBase< BasicBlock, Loop >::moveToHeader(), llvm::SIInstrInfo::moveToVALU(), llvm::APInt::multiplicativeInverse(), llvm::LegalizerHelper::narrowScalar(), llvm::LegalizerHelper::narrowScalarAddSub(), llvm::LegalizerHelper::narrowScalarExtract(), nch(), llvm::AArch64RegisterInfo::needsFrameBaseReg(), llvm::ARMBaseRegisterInfo::needsFrameBaseReg(), llvm::PHITransAddr::NeedsPHITranslationFromBlock(), NextPossibleSolution(), llvm::NodeSet::NodeSet(), llvm::Triple::normalize(), llvm::DAGTypeLegalizer::NoteDeletion(), llvm::BitVector::operator&=(), llvm::orc::BlockFreqQuery::operator()(), llvm::PBQP::operator<<(), llvm::SparseBitVectorElement< ElementSize >::operator==(), llvm::object::ExportEntry::operator==(), llvm::BitTracker::RegisterCell::operator==(), llvm::Trace::operator[](), llvm::BitcodeReaderValueList::operator[](), llvm::gsym::AddressRanges::operator[](), llvm::gsym::LineTable::operator[](), llvm::CallGraphNode::operator[](), llvm::TinyPtrVector< llvm::VPValue * >::operator[](), OptimizeAndOrXor(), OptimizeAwayTrappingUsesOfValue(), llvm::ARMBaseInstrInfo::optimizeCompareInstr(), llvm::PPCInstrInfo::optimizeCompareInstr(), llvm::BranchFolder::OptimizeFunction(), llvm::optimizeGlobalCtorsList(), optimizeIntegerToVectorInsertions(), llvm::X86InstrInfo::optimizeLoadInstr(), llvm::LanaiInstrInfo::optimizeSelect(), llvm::ARMBaseInstrInfo::optimizeSelect(), llvm::opt::OptTable::OptTable(), llvm::SMSchedule::orderDependence(), llvm::AArch64FrameLowering::orderFrameObjects(), llvm::X86FrameLowering::orderFrameObjects(), llvm::LiveRange::overlapsFrom(), p_b_term(), p_bracket(), p_simp_re(), llvm::CallLowering::parametersInCSRMatch(), llvm::cl::parser< const PassInfo * >::parse(), llvm::MachO::PackedVersion::parse32(), llvm::MachO::PackedVersion::parse64(), llvm::remarks::BitstreamParserHelper::parseMagic(), parseOperands(), parseSectionFlags(), parseVersionFromName(), partitionShuffleOfConcats(), passingValueIsAlwaysUndefined(), llvm::IntervalMap< KeyT, ValT, N, Traits >::const_iterator::pathFillFind(), PerformBUILD_VECTORCombine(), performConcatVectorsCombine(), llvm::PPCTargetLowering::PerformDAGCombine(), llvm::ARMTargetLowering::PerformIntrinsicCombine(), llvm::ARMTargetLowering::PerformMVETruncCombine(), PerformMVEVLDCombine(), performNEONPostLDSTCombine(), 
PerformSplittingMVEEXTToWideningLoad(), PerformSplittingMVETruncToNarrowingStores(), PerformSplittingToNarrowingStores(), PerformSplittingToWideningLoad(), PerformTruncatingStoreCombine(), llvm::InstCombinerImpl::PHIArgMergedDebugLoc(), llvm::R600SchedStrategy::pickNode(), llvm::SchedBoundary::pickOnlyChoice(), placeSplitBlockCarefully(), llvm::RuntimeDyldMachO::populateIndirectSymbolPointersSection(), llvm::possiblyDemandedEltsInMask(), llvm::HexagonInstrInfo::PredicateInstruction(), llvm::TargetInstrInfo::PredicateInstruction(), llvm::IntervalPartition::print(), llvm::safestack::StackLayout::print(), llvm::DWARFExpression::Operation::print(), llvm::Trace::print(), llvm::MachineJumpTableInfo::print(), llvm::DIEAbbrev::print(), llvm::opt::Arg::print(), llvm::MachineConstantPool::print(), llvm::SCEV::print(), llvm::VirtRegMap::print(), llvm::MCInst::print(), llvm::AliasSet::print(), llvm::SMDiagnostic::print(), llvm::LiveIntervals::print(), llvm::MachineOperand::print(), llvm::MachineTraceMetrics::Ensemble::print(), llvm::LoopBase< BasicBlock, Loop >::print(), llvm::BasicAAResult::DecomposedGEP::print(), llvm::LiveRange::print(), llvm::MachineFrameInfo::print(), llvm::AttributeList::print(), llvm::LoopInfoBase< BasicBlock, Loop >::print(), llvm::VPInterleaveRecipe::print(), llvm::MachineInstr::print(), llvm::SDNode::print_details(), llvm::SDNode::print_types(), printCFI(), PrintCFIEscape(), printConstant(), llvm::ARMInstPrinter::printCPSIFlag(), llvm::PBQP::RegAlloc::PBQPRAGraph::printDot(), llvm::cl::generic_parser_base::printGenericOptionDiff(), PrintHelpOptionList(), printHex32(), llvm::ScopedPrinter::printIndent(), llvm::ARMInstPrinter::printInst(), llvm::GVNExpression::BasicExpression::printInternal(), llvm::GVNExpression::AggregateValueExpression::printInternal(), printLine(), llvm::printLLVMNameWithoutPrefix(), llvm::SparcInstPrinter::printMembarTag(), llvm::PrintMessage(), printMetadataIdentifier(), llvm::ARMInstPrinter::printMVEVectorList(), PrintOps(), llvm::cl::generic_parser_base::printOptionInfo(), llvm::TargetRegistry::printRegisteredTargetsForVersion(), llvm::MipsAsmPrinter::printRegisterList(), llvm::ARMInstPrinter::printRegisterList(), printSourceLine(), printStackObjectDbgInfo(), llvm::PrintStatistics(), llvm::BitcodeAnalyzer::printStats(), llvm::MCSectionMachO::PrintSwitchToSection(), printSymbolizedStackTrace(), llvm::AArch64InstPrinter::printVectorList(), llvm::JumpThreadingPass::processBlock(), llvm::JumpThreadingPass::processBranchOnPHI(), llvm::HexagonFrameLowering::processFunctionBeforeFrameFinalized(), llvm::PPCFrameLowering::processFunctionBeforeFrameFinalized(), processPHI(), llvm::RuntimeDyldELF::processRelocationRef(), llvm::ARMBaseInstrInfo::produceSameValue(), llvm::DIEAbbrev::Profile(), llvm::APInt::Profile(), llvm::SampleProfileLoaderBaseImpl< MachineBasicBlock >::propagateThroughEdges(), ProvideOption(), llvm::cl::ProvidePositionalOption(), llvm::ValueEnumerator::purgeFunction(), PushArgMD(), rangeMetadataExcludesValue(), llvm::ResourcePriorityQueue::rawRegPressureDelta(), llvm::BitstreamCursor::ReadAbbrevRecord(), llvm::GCOVFile::readGCDA(), llvm::GCOVFile::readGCNO(), llvm::SIInstrInfo::readlaneVGPRToSGPR(), llvm::BitstreamCursor::readRecord(), llvm::sampleprof::SampleProfileReaderExtBinaryBase::readSecHdrTable(), llvm::MachineInstr::readsWritesVirtualRegister(), llvm::DomTreeBuilder::SemiNCAInfo< DomTreeT >::reattachExistingSubtree(), recomputeLiveInValues(), llvm::AArch64TargetLowering::ReconstructShuffle(), 
llvm::PMDataManager::recordAvailableAnalysis(), llvm::StackMaps::recordPatchPoint(), llvm::ImutAVLFactory< ImutInfo >::recoverNodes(), redirectValuesFromPredecessorsToPhi(), reduceBuildVecToShuffleWithZero(), reduceVMULWidth(), llvm::BitTracker::RegisterCell::ref(), llvm::BitTracker::RegisterCell::regify(), llvm::RuntimeDyldMachOCRTPBase< RuntimeDyldMachOX86_64 >::registerEHFrames(), llvm::RuntimeDyldELF::registerEHFrames(), llvm::RegsForValue::RegsForValue(), llvm::MachineTraceMetrics::releaseMemory(), llvm::LiveIntervals::releaseMemory(), relocationViaAlloca(), llvm::CallGraphNode::removeAnyCallEdgeTo(), llvm::Function::removeAttributeAtIndex(), llvm::CallBase::removeAttributeAtIndex(), llvm::SIMachineFunctionInfo::removeDeadFrameIndices(), llvm::LazyCallGraph::removeDeadFunction(), llvm::LazyCallGraph::RefSCC::removeInternalRefEdge(), llvm::MachineInstr::RemoveOperand(), removeOperands(), removePhis(), llvm::DomTreeBuilder::SemiNCAInfo< DomTreeT >::RemoveRedundantRoots(), RemoveSwitchAfterSelectConversion(), removeTemplateArgs(), removeUndefIntroducingPredecessor(), llvm::Record::removeValue(), llvm::SIMachineFunctionInfo::removeVGPRForSGPRSpill(), llvm::LiveVariables::removeVirtualRegisterDead(), llvm::LiveVariables::removeVirtualRegisterKilled(), llvm::LiveVariables::removeVirtualRegistersKilled(), llvm::opt::Arg::render(), reorderSubVector(), ReorganizeVector(), llvm::SelectionDAG::ReplaceAllUsesOfValuesWith(), llvm::SelectionDAG::ReplaceAllUsesWith(), replaceExtractElements(), replaceInChain(), llvm::PPCInstrInfo::replaceInstrWithLI(), ReplaceINTRINSIC_W_CHAIN(), ReplaceLoadVector(), llvm::MachineJumpTableInfo::ReplaceMBBInJumpTables(), llvm::CallGraphSCC::ReplaceNode(), llvm::X86TargetLowering::ReplaceNodeResults(), llvm::MachineBasicBlock::replacePhiUsesWith(), llvm::Constant::replaceUndefsWith(), replaceUndefValuesInPhi(), llvm::SelectionDAGISel::ReplaceUses(), llvm::MachineBasicBlock::ReplaceUsesOfBlockWith(), llvm::User::replaceUsesOfWith(), replaceUsesOfWith(), llvm::PPCRegisterInfo::requiresFrameIndexScavenging(), rescheduleCanonically(), rescheduleLexographically(), llvm::VLIWResourceModel::reserveResources(), llvm::BitVector::reset(), llvm::SmallBitVector::reset(), llvm::DIInliningInfo::resize(), resolveBuildVector(), llvm::SCCPInstVisitor::resolvedUndefsIn(), llvm::ThumbRegisterInfo::resolveFrameIndex(), llvm::AArch64RegisterInfo::resolveFrameIndex(), llvm::ARMBaseRegisterInfo::resolveFrameIndex(), llvm::BitsInit::resolveReferences(), llvm::RuntimeDyldImpl::resolveRelocationList(), llvm::RuntimeDyldImpl::resolveRelocations(), resolveTargetShuffleFromZeroables(), resolveTargetShuffleInputsAndMask(), resolveZeroablesFromTargetShuffle(), llvm::Thumb1FrameLowering::restoreCalleeSavedRegisters(), llvm::MSP430FrameLowering::restoreCalleeSavedRegisters(), llvm::X86FrameLowering::restoreCalleeSavedRegisters(), llvm::PPCFrameLowering::restoreCalleeSavedRegisters(), llvm::SIRegisterInfo::restoreSGPR(), RestoreSpillList(), llvm::SIScheduleDAGMI::restoreSULinksLeft(), llvm::HexagonPacketizerList::restrictingDepExistInPacket(), llvm::CallLowering::resultsCompatible(), returnEdge(), llvm::reverseBits(), rewrite(), llvm::rewriteLoopExitValues(), rewritePHINodesForExitAndUnswitchedBlocks(), rewritePHINodesForUnswitchedExitBlock(), llvm::StringRef::rfind(), llvm::StringRef::rfind_insensitive(), llvm::BitTracker::RegisterCell::rol(), llvm::InstCombinerImpl::run(), llvm::RepeatedPass< PassT >::run(), llvm::ExecutionEngine::runFunctionAsMain(), llvm::runIPSCCP(), 
llvm::RewriteStatepointsForGC::runOnFunction(), llvm::IntervalPartition::runOnFunction(), llvm::SelectionDAGISel::runOnMachineFunction(), llvm::AMDGPUAsmPrinter::runOnMachineFunction(), llvm::ExecutionDomainFix::runOnMachineFunction(), llvm::LiveIntervals::runOnMachineFunction(), llvm::AVRFrameAnalyzer::runOnMachineFunction(), llvm::DomTreeBuilder::SemiNCAInfo< DomTreeT >::runSemiNCA(), llvm::ExecutionEngine::runStaticConstructorsDestructors(), llvm::IntervalMapImpl::LeafNode< KeyT, ValT, N, Traits >::safeFind(), llvm::IntervalMapImpl::BranchNode< KeyT, ValT, RootBranchCap, Traits >::safeFind(), llvm::IntervalMapImpl::LeafNode< KeyT, ValT, N, Traits >::safeLookup(), llvm::SelectionDAG::salvageDebugInfo(), samesets(), scalarizeVectorStore(), llvm::APIntOps::ScaleBitMask(), ScaleVectorOffset(), scaleVectorShuffleBlendMask(), llvm::SIScheduleDAGMI::schedule(), llvm::ResourcePriorityQueue::scheduledNode(), llvm::sys::DynamicLibrary::SearchForAddressOfSymbol(), AMDGPUDAGToDAGISel::SelectBuildVector(), llvm::SelectionDAGISel::SelectCodeCommon(), llvm::FastISel::selectExtractValue(), llvm::SelectionDAGISel::SelectInlineAsmMemoryOperands(), llvm::FastISel::selectInstruction(), llvm::FastISel::selectPatchpoint(), llvm::FastISel::selectStackmap(), llvm::EngineBuilder::selectTarget(), llvm::LoopVectorizationCostModel::selectVectorizationFactor(), llvm::BitTracker::RegisterCell::self(), separateNestedLoop(), llvm::FunctionLoweringInfo::set(), llvm::CallBase::setArgOperand(), llvm::FuncletPadInst::setArgOperand(), llvm::LTOCodeGenerator::setAsmUndefinedRefs(), llvm::ARMBaseInstrInfo::setExecutionDomain(), setGroupSize(), llvm::PHINode::setIncomingBlock(), llvm::PHINode::setIncomingValue(), llvm::CallBrInst::setIndirectDest(), llvm::WebAssemblyFunctionInfo::setLocal(), setMemoryPhiValueForBlock(), llvm::User::setOperand(), llvm::IndirectBrInst::setSuccessor(), llvm::InvokeInst::setSuccessor(), llvm::CallBrInst::setSuccessor(), llvm::PackedVectorBase< T, BitNum, BitVectorTy, false >::setValue(), llvm::PackedVectorBase< T, BitNum, BitVectorTy, true >::setValue(), llvm::LoopVectorizationCostModel::setWideningDecision(), llvm::IntervalMapImpl::NodeBase< std::pair< KeyT, KeyT >, ValT, N >::shift(), llvm::AAMDNodes::shiftTBAAStruct(), shrinkFPConstantVector(), simpleLibcall(), SimplifyAddOperands(), simplifyAndDCEInstruction(), simplifyCommonValuePhi(), llvm::TargetLowering::SimplifyDemandedBits(), llvm::InstCombinerImpl::SimplifyDemandedVectorElts(), llvm::TargetLowering::SimplifyDemandedVectorElts(), llvm::X86TargetLowering::SimplifyDemandedVectorEltsForTargetNode(), llvm::X86TargetLowering::SimplifyDemandedVectorEltsForTargetShuffle(), simplifyDivRem(), llvm::TargetLowering::SimplifyMultipleUseDemandedBits(), llvm::X86TargetLowering::SimplifyMultipleUseDemandedBitsForTargetNode(), simplifyOneLoop(), SimplifySelectInst(), simplifySetCCWithCTPOP(), simplifyShuffleOfShuffle(), SimplifyShuffleVectorInst(), simplifyX86immShift(), simplifyX86insertps(), sinkLoopInvariantInstructions(), llvm::SIScheduleBlockScheduler::SIScheduleBlockScheduler(), sizeOfSPAdjustment(), llvm::yaml::Stream::skip(), llvm::yaml::skip(), skipExtensionForVectorMULL(), SkipExtensionForVMULL(), llvm::BitstreamCursor::skipRecord(), llvm::InstCombinerImpl::SliceUpIllegalIntegerPHI(), llvm::AVRFrameLowering::spillCalleeSavedRegisters(), llvm::Thumb1FrameLowering::spillCalleeSavedRegisters(), llvm::Mips16FrameLowering::spillCalleeSavedRegisters(), llvm::MipsSEFrameLowering::spillCalleeSavedRegisters(), 
llvm::MSP430FrameLowering::spillCalleeSavedRegisters(), llvm::X86FrameLowering::spillCalleeSavedRegisters(), llvm::PPCFrameLowering::spillCalleeSavedRegisters(), llvm::SIRegisterInfo::spillEmergencySGPR(), llvm::SIRegisterInfo::spillSGPR(), llvm::DominatorTreeBase< BasicBlock, IsPostDom >::Split(), SplitAddRecs(), llvm::SplitAllCriticalEdges(), splitAndLowerShuffle(), SplitBlockPredecessorsImpl(), splitCallSite(), llvm::SplitCriticalEdge(), llvm::MachineBasicBlock::SplitCriticalEdge(), SplitCriticalSideEffectEdges(), llvm::SplitKnownCriticalEdge(), SplitLandingPadPredecessorsImpl(), llvm::splitLoopBound(), SplitOpsAndApply(), splitRetconCoroutine(), llvm::CallLowering::splitToValueTypes(), llvm::APInt::sqrt(), SRAGlobal(), llvm::sampleprof::SampleProfileWriterBinary::stablizeNameTable(), StackMallocSizeClass(), llvm::IntervalMapImpl::LeafNode< KeyT, ValT, N, Traits >::start(), llvm::CriticalAntiDepBreaker::StartBlock(), llvm::IntervalMapImpl::LeafNode< KeyT, ValT, N, Traits >::stop(), llvm::IntervalMapImpl::BranchNode< KeyT, ValT, RootBranchCap, Traits >::stop(), llvm::PPCInstrInfo::storeRegToStackSlotNoUpd(), StoreTailCallArgumentsToStackSlot(), llvm::ExecutionEngine::StoreValueToMemory(), llvm::remarks::StringTable::StringTable(), llvm::stripGetElementPtr(), stripNonValidDataFromBody(), StripTypeNames(), llvm::BitTracker::subst(), llvm::IntervalMapImpl::NodeRef::subtree(), llvm::IntervalMapImpl::BranchNode< KeyT, ValT, RootBranchCap, Traits >::subtree(), llvm::SmallVectorImpl< llvm::VectorizationFactor >::swap(), llvm::SmallDenseMap< llvm::Value *, llvm::Value * >::swap(), swapAntiDependences(), swapMIOperands(), llvm::MachO::swapStruct(), SwitchToLookupTable(), llvm::TailDuplicator::tailDuplicateAndUpdate(), llvm::X86TargetLowering::targetShrinkDemandedConstant(), llvm::APInt::tcAdd(), llvm::APInt::tcAddPart(), llvm::APInt::tcAssign(), tcComplement(), llvm::APInt::tcFullMultiply(), llvm::APInt::tcIsZero(), llvm::APInt::tcLSB(), llvm::APInt::tcMultiply(), llvm::APInt::tcMultiplyPart(), llvm::APInt::tcSet(), llvm::detail::tcSetLeastSignificantBits(), llvm::APInt::tcShiftRight(), llvm::APInt::tcSubtract(), llvm::APInt::tcSubtractPart(), llvm::BitVector::test(), llvm::SmallBitVector::test(), llvm::JumpThreadingPass::threadEdge(), llvm::JumpThreadingPass::threadThroughTwoBasicBlocks(), llvm::MachineFunction::tidyLandingPads(), llvm::OpenMPIRBuilder::tileLoops(), llvm::BitTracker::MachineEvaluator::toInt(), llvm::BitTracker::RegisterCell::top(), llvm::DbgValueHistoryMap::trimLocationRanges(), llvm::APInt::trunc(), tryAddToFoldList(), TryCombineBaseUpdate(), tryCombineToBSL(), llvm::LegalizationArtifactCombiner::tryCombineTrunc(), llvm::tryFoldSPUpdateIntoPushPop(), tryToFoldExtendOfConstant(), tryToReplaceWithConstant(), llvm::TryToSimplifyUncondBranchFromEmptyBlock(), umul_ov(), llvm::IntEqClasses::uncompress(), llvm::X86InstrInfo::unfoldMemoryOperand(), uninstallExceptionOrSignalHandlers(), llvm::SparseBitVectorElement< ElementSize >::unionWith(), unpackLoadToAggregate(), unpackStoreToAggregate(), llvm::UnrollLoop(), llvm::UnrollRuntimeLoopRemainder(), llvm::SelectionDAG::UnrollVectorOp(), llvm::SelectionDAG::UnrollVectorOverflowOp(), unrollVectorShift(), llvm::AArch64RegisterInfo::UpdateCustomCalleeSavedRegs(), llvm::AArch64RegisterInfo::UpdateCustomCallPreservedMask(), llvm::X86::updateImpliedFeatures(), updateLoopMetadataDebugLocationsImpl(), llvm::SelectionDAG::UpdateNodeOperands(), updateOperand(), UpdatePHINodes(), updatePHIs(), updatePostorderSequenceForEdgeInsertion(), 
updatePredecessorProfileMetadata(), llvm::CallInst::updateProfWeight(), llvm::SelectionDAGBuilder::UpdateSplitBlock(), llvm::FastISel::updateValueMap(), llvm::UpgradeGlobalVariable(), UpgradeX86ALIGNIntrinsics(), UpgradeX86PSLLDQIntrinsics(), UpgradeX86PSRLDQIntrinsics(), llvm::IntervalMapImpl::LeafNode< KeyT, ValT, N, Traits >::value(), llvm::ValueEnumerator::ValueEnumerator(), ValuesOverlap(), llvm::InnerLoopVectorizer::vectorizeInterleaveGroup(), llvm::slpvectorizer::BoUpSLP::vectorizeTree(), llvm::InlineAsm::Verify(), llvm::PHITransAddr::Verify(), llvm::MachineTraceMetrics::verifyAnalysis(), llvm::SIInstrInfo::verifyInstruction(), llvm::LoopBase< BasicBlock, Loop >::verifyLoop(), VerifyLowRegs(), VerifyPHIs(), llvm::ScheduleDAGSDNodes::VerifyScheduledSequence(), llvm::MachineRegisterInfo::verifyUseLists(), VerifyVectorTypes(), llvm::InstCombinerImpl::visitAllocSite(), llvm::Interpreter::visitAShr(), llvm::Interpreter::visitBinaryOperator(), llvm::SelectionDAGBuilder::visitBitTestHeader(), llvm::InstCombinerImpl::visitCallInst(), llvm::Interpreter::visitExtractValueInst(), VisitGlobalVariableForEmission(), llvm::Interpreter::visitInsertValueInst(), llvm::InstCombinerImpl::visitLandingPadInst(), llvm::Interpreter::visitLShr(), llvm::InstCombinerImpl::visitPHINode(), llvm::ObjectSizeOffsetEvaluator::visitPHINode(), llvm::Interpreter::visitShl(), llvm::Interpreter::visitShuffleVectorInst(), llvm::InstCombinerImpl::visitShuffleVectorInst(), llvm::InstCombinerImpl::visitSRem(), llvm::InstCombinerImpl::visitUDiv(), llvm::Interpreter::visitUnaryOperator(), llvm::widenShuffleMaskElts(), widenVec(), willShiftRightEliminate(), llvm::msgpack::Writer::write(), writeCOFF(), WriteConstantInternal(), llvm::ARMAsmBackend::writeNopData(), llvm::MCAssembler::writeSectionData(), writeStringRecord(), llvm::opt::Arg::~Arg(), llvm::CrashRecoveryContext::~CrashRecoveryContext(), and llvm::MachineConstantPool::~MachineConstantPool().

◆ i32

Clang compiles this i32

Definition at line 504 of file README.txt.

◆ i64

Clang compiles this i1 i64 store i64 i64 store i64 i64 store i64 i64 store i64

Definition at line 504 of file README.txt.

Referenced by getBaseWithOffsetUsingSplitOR(), and AMDGPUDAGToDAGISel::Select().

◆ i8

Clang compiles this i8

Definition at line 504 of file README.txt.

Referenced by AMDGPUDAGToDAGISel::matchLoadD16FromBuildVector(), and to().

◆ increments

Add support for conditional increments

Definition at line 131 of file README.txt.
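As a rough sketch of the pattern (the function and values here are invented for illustration, not taken from the README), a conditional increment is the kind of code below; a target with the right support can fold the compare result straight into the add instead of branching:

int count_match(int a, int b, int acc) {
  if (a == b)          /* branchy form */
    acc++;
  return acc;          /* branchless equivalent: return acc + (a == b); */
}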

◆ int

Clang compiles this i1 i64 store i64 i64 store i64 i64 store i64 i64 store i64 align, which gets codegen'd as xmm0 movaps rbp movaps rbp movaps rbp movaps rbp rbp rbp rbp rbp. It would be better to have movq's instead of the movaps's. LLVM produces ret int

Definition at line 536 of file README.txt.

Referenced by llvm::pdb::DbiStreamBuilder::addDbgStream(), llvm::ScheduleDAGInstrs::addPhysRegDataDeps(), analyzeLoopUnrollCost(), llvm::AVRDAGToDAGISel::select< ISD::STORE >(), BUCompareLatency(), CalculateTailCallSPDiff(), checkResourceLimit(), combineAddOfPMADDWD(), combineExtractWithShuffle(), combineShuffleOfScalars(), combineShuffleToVectorExtend(), combineTargetShuffle(), combineTruncationShuffle(), combineX86ShufflesRecursively(), llvm::ShuffleVectorInst::commuteShuffleMask(), CompareSCEVComplexity(), CompareValueComplexity(), completeEphemeralValues(), computeExcessPressureDelta(), computeFreeStackSlots(), ComputeImportForModule(), computeMaxPressureDelta(), constructDup(), createShuffleMaskFromVSELECT(), llvm::MachineFrameInfo::CreateSpillStackObject(), llvm::MachineFrameInfo::CreateStackObject(), llvm::DecodeEXTRQIMask(), llvm::DecodeINSERTQIMask(), DecodeInsSize(), llvm::DecodeVPERM2X128Mask(), llvm::M68kRegisterInfo::eliminateFrameIndex(), llvm::X86RegisterInfo::eliminateFrameIndex(), llvm::EmitAnyX86InstComments(), llvm::Thumb1FrameLowering::emitEpilogue(), llvm::SparcFrameLowering::emitEpilogue(), llvm::ARMFrameLowering::emitEpilogue(), emitFrameOffsetAdj(), llvm::WebAssemblyAsmPrinter::EmitProducerInfo(), llvm::SparcFrameLowering::emitPrologue(), llvm::AArch64SelectionDAGInfo::EmitTargetCodeForSetTag(), llvm::pdb::DbiStreamBuilder::finalizeMsfLayout(), FoldIntToFPToInt(), llvm::InstCombinerImpl::foldItoFPtoI(), foldShuffleOfConcatUndefs(), llvm::ARM_AM::getAM2Opc(), getARMIndexedAddressParts(), llvm::SlotTracker::getAttributeGroupSlot(), getFauxShuffleMask(), llvm::SlotTracker::getGlobalSlot(), llvm::SlotTracker::getGUIDSlot(), llvm::ScoreboardHazardRecognizer::getHazardType(), llvm::SlotTracker::getLocalSlot(), llvm::SlotTracker::getMetadataSlot(), llvm::SlotTracker::getModulePathSlot(), getMVEIndexedAddressParts(), getOrCreateFrameHelper(), getShuffleComment(), getShuffleScalarElt(), getT2IndexedAddressParts(), getTargetConstantBitsFromNode(), llvm::SlotTracker::getTypeIdSlot(), llvm::RegPressureTracker::getUpwardPressureDelta(), handleIndirectSymViaGOTPCRel(), llvm::MachineInstr::hasComplexRegisterTies(), iJIT_NotifyEvent(), llvm::MachineFrameInfo::isFixedObjectIndex(), isKnownExactCastIntToFP(), llvm::SelectionDAG::isSplatValue(), IsValueFullyAvailableInBlock(), llvm::cfg::LegalizeUpdates(), lle_X_memset(), llvm::ARMTargetLowering::LowerAsmOperandForConstraint(), llvm::MipsCallLowering::lowerFormalArguments(), LowerMULH(), lowerShuffleAsBroadcast(), lowerShuffleAsByteShiftMask(), lowerShuffleAsElementInsertion(), lowerVECTOR_SHUFFLE(), LowerVECTOR_SHUFFLE(), llvm::SelectionDAG::matchBinOpReduction(), AbstractManglingParser< ManglingParser< Alloc >, Alloc >::parseTemplateParamDecl(), partitionShuffleOfConcats(), llvm::SystemZConstantPoolValue::print(), llvm::MachineFrameInfo::print(), printBitField(), llvm::SparcInstPrinter::printCCOperand(), llvm::VEInstPrinter::printCCOperand(), llvm::printCompactDWARFExpr(), printField(), PrintHelpOptionList(), llvm::PPCInstPrinter::printImmZeroOperand(), llvm::PPCInstPrinter::printInst(), llvm::NVPTXInstPrinter::printLdStCode(), llvm::VEInstPrinter::printMImmOperand(), llvm::NVPTXInstPrinter::printMmaCode(), llvm::ScopedPrinter::printNumber(), llvm::SparcInstPrinter::printOperand(), llvm::VEInstPrinter::printRDOperand(), llvm::PPCInstPrinter::printS5ImmOperand(), llvm::PPCInstPrinter::printU1ImmOperand(), llvm::PPCInstPrinter::printU2ImmOperand(), llvm::PPCInstPrinter::printU3ImmOperand(), llvm::PPCInstPrinter::printU4ImmOperand(), 
llvm::PPCInstPrinter::printU5ImmOperand(), llvm::PPCInstPrinter::printU6ImmOperand(), llvm::PPCInstPrinter::printU7ImmOperand(), llvm::PPCInstPrinter::printU8ImmOperand(), llvm::readExponent(), llvm::AArch64FrameLowering::resolveFrameOffsetReference(), llvm::orc::SelfExecutorProcessControl::runAsMain(), llvm::orc::rt_bootstrap::runAsMainWrapper(), llvm::MCJIT::runFunction(), llvm::AVRDAGToDAGISel::SelectAddr(), llvm::PPCTargetLowering::SelectAddressRegImm(), llvm::ImportedFunctionsInliningStatistics::setModuleInfo(), llvm::MachineOperand::setOffset(), llvm::GCNHazardRecognizer::ShouldPreferAnother(), llvm::TargetLowering::SimplifyMultipleUseDemandedBits(), llvm::Register::stackSlot2Index(), this(), and llvm::detail::IEEEFloat::toString().

◆ integers

into llvm.powi, allowing the code generator to produce balanced multiplication trees; the intrinsic needs to be extended to support integers

Definition at line 54 of file README.txt.
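As a hedged illustration of why balanced multiplication trees help (the pow8 example is made up, not from the README): expanding x**8 one factor at a time costs seven multiplies, while a tree of squarings costs three.

double pow8_linear(double x) {
  /* evaluated left to right: 7 multiplies */
  return x*x*x*x*x*x*x*x;
}

double pow8_balanced(double x) {
  double x2 = x * x;    /* x^2 */
  double x4 = x2 * x2;  /* x^4 */
  return x4 * x4;       /* x^8 in 3 multiplies */
}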

◆ into

In fpstack this compiles into

Definition at line 504 of file README.txt.

◆ l

This requires reassociating to forms of expressions that are already something that reassoc doesn't think about yet. These two functions should generate the same code on big-endian: int* l { return memcmp(j,l,4)

◆ LBB16_1

Add support for conditional increments and other related patterns. Instead: eax eax je LBB16_2 LBB16_1

Definition at line 142 of file README.txt.

◆ LBB1_1

we compile this esp call L1 $pb L1 esp je LBB1_2 LBB1_1

Definition at line 414 of file README.txt.

◆ LBB1_2

we compile this esp call L1 $pb L1 esp je LBB1_2 esp ret LBB1_2

Definition at line 420 of file README.txt.

◆ like

Where MAX_UNSIGNED state is a bit int. On a bit platform it would be just so cool to turn it into something like

Definition at line 48 of file README.txt.

◆ loop

is twice as slow as this loop

Definition at line 205 of file README.txt.

◆ m_HotKey

THotKey m_HotKey

Referenced by GetHotKey().

◆ mode

We should investigate an instruction sinking pass. Consider this silly example in pic mode:

Definition at line 400 of file README.txt.

◆ movq

Clang compiles this i1 i64 store i64 i64 store i64 i64 store i64 i64 store i64 align, which gets codegen'd as xmm0 movaps rbp movaps rbp movaps rbp movaps rbp rbp rbp rbp movq

Definition at line 521 of file README.txt.

◆ n

The same transformation can work with an even modulo with the addition of a rotate, shrinking the compare RHS by the same amount. Unless the target supports rotates, that transformation probably isn't worthwhile. The transformation can also easily be made to work with non-zero equality for n

Definition at line 685 of file README.txt.

Referenced by llvm::IntervalMapImpl::adjustSiblingSizes(), appendToGlobalArray(), llvm::HexagonFrameLowering::assignCalleeSavedSpillSlots(), llvm::pointer_union_detail::bitsRequired(), llvm::ScheduleDAGInstrs::buildSchedGraph(), Choose(), CombineVLDDUP(), llvm::ComputeEditDistance(), llvm::decodeSLEB128(), llvm::decodeULEB128(), llvm::IntervalMapImpl::distribute(), llvm::NVPTXAsmPrinter::doFinalization(), llvm::detail::indexed_accessor_range_base< DerivedT, std::pair< BaseT, ptrdiff_t >, T, T *, T & >::drop_back(), llvm::detail::indexed_accessor_range_base< DerivedT, std::pair< BaseT, ptrdiff_t >, T, T *, T & >::drop_front(), llvm::encodeBase64(), llvm::SpillPlacement::finish(), llvm::R600InstrInfo::fitsConstReadLimitations(), foo(), llvm::getAlign(), llvm::SpillPlacement::Node::getDissentingNeighbors(), llvm::DWARFContext::getInliningInfoForAddress(), llvm::HexagonMCCodeEmitter::getMachineOpValue(), llvm::df_iterator< T, std::set< typename GraphTraits< T >::NodeRef >, true >::getPath(), llvm::SparcRegisterInfo::getReservedRegs(), llvm::X86RegisterInfo::getReservedRegs(), llvm::IndexedMap< unsigned, llvm::VirtReg2IndexFunctor >::grow(), handleSwitchExpect(), llvm::rdf::NodeAllocator::id(), llvm::IndexedMap< unsigned, llvm::VirtReg2IndexFunctor >::inBounds(), INITIALIZE_PASS(), llvm::yaml::ScalarTraits< MaybeAlign >::input(), is(), llvm::BasicBlockEdge::isSingleEdge(), llvm::SpillPlacement::iterate(), KnuthDiv(), llvm_strlcpy(), LowerCTPOP(), llvm::BitTracker::RegisterCell::meet(), memcpy(), nch(), llvm::IntervalMapImpl::NodeRef::NodeRef(), llvm::iterator_facade_base< partition_iterator, std::forward_iterator_tag, Partition >::operator+(), llvm::iterator_adaptor_base< GSIHashIterator, FixedStreamArrayIterator< PSHashRecord >, std::random_access_iterator_tag, const uint32_t >::operator+=(), llvm::iterator_facade_base< partition_iterator, std::forward_iterator_tag, Partition >::operator-(), llvm::iterator_adaptor_base< GSIHashIterator, FixedStreamArrayIterator< PSHashRecord >, std::random_access_iterator_tag, const uint32_t >::operator-=(), llvm::operator<<(), llvm::IndexedMap< unsigned, llvm::VirtReg2IndexFunctor >::operator[](), llvm::SetVector< llvm::ElementCount, SmallVector< llvm::ElementCount, N >, SmallDenseSet< llvm::ElementCount, N > >::operator[](), llvm::iterator_facade_base< partition_iterator, std::forward_iterator_tag, Partition >::operator[](), PerformMVEVLDCombine(), performNEONPostLDSTCombine(), PerformVECTOR_SHUFFLECombine(), llvm::powerOf5(), llvm::HexagonInstrInfo::PredicateInstruction(), FloatLiteralImpl< Float >::printLeft(), llvm::ImutAVLFactory< ImutInfo >::recoverNodes(), llvm::BitTracker::RegisterCell::regify(), llvm::APInt::roundToDouble(), ScaleVectorOffset(), llvm::SpillPlacement::scanActiveBundles(), llvm::cl::Option::setNumAdditionalVals(), llvm::cl::list< DataType, StorageClass, ParserClass >::setNumAdditionalVals(), llvm::ARMFunctionInfo::setNumAlignedDPRCS2Regs(), llvm::IntervalMapImpl::NodeRef::setSize(), llvm::ImutAVLTree< ImutInfo >::size(), llvm::detail::indexed_accessor_range_base< DerivedT, std::pair< BaseT, ptrdiff_t >, T, T *, T & >::slice(), llvm::detail::indexed_accessor_range_base< DerivedT, std::pair< BaseT, ptrdiff_t >, T, T *, T & >::take_back(), llvm::detail::indexed_accessor_range_base< DerivedT, std::pair< BaseT, ptrdiff_t >, T, T *, T & >::take_front(), llvm::APInt::tcDivide(), llvm::APInt::tcExtract(), llvm::APInt::tcLSB(), llvm::APInt::tcMSB(), llvm::APInt::tcMultiplyPart(), TryCombineBaseUpdate(), llvm::writeUnsignedDecimal(), x(), and 
y().

◆ of

This is equivalent to the form where the constant is the multiplicative inverse of the divisor

Definition at line 134 of file README.txt.
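A concrete sketch of the multiplicative-inverse trick for one divisor; the constants below are for divisor 3 with 32-bit unsigned arithmetic and are illustrative, not taken from the README:

#include <stdint.h>

/* n % 3 == 0 exactly when n * inverse(3) (mod 2^32) falls in the low third
 * of the range: 0xAAAAAAAB is the inverse of 3 mod 2^32, and 0x55555555 is
 * (2^32 - 1) / 3. */
int divisible_by_3(uint32_t n) {
  return n * 0xAAAAAAABu <= 0x55555555u;
}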

◆ Opportunities

Target Independent Opportunities

Definition at line 8 of file README.txt.

◆ or

aka or = or i32 %shl9

Definition at line 606 of file README.txt.

◆ or10

compiles conv shl5 or10 = or i32 %or6

Definition at line 608 of file README.txt.

◆ or6

aka conv or6 = or i32 %or

Definition at line 607 of file README.txt.

◆ patch

This could be done in SelectionDAGISel along with other special bytes. It would be nice to revert this patch

Definition at line 104 of file README.txt.

◆ ppc

Unrolling would eliminate the & in both, leading to a net reduction in code size. The resultant code would then also be suitable for exit value computation. We miss a bunch of rotate opportunities on various targets, including ppc

Definition at line 567 of file README.txt.

◆ ppc32

This would be a win on ppc32

Definition at line 37 of file README.txt.

◆ PR16157

This is blocked on not handling X*X*X, which is the same number of multiplies, and is because the *X has multiple uses. Here's a simple X1 B ret i32 C. Reassociate should handle the example in GCC PR16157

Definition at line 84 of file README.txt.

◆ PR17886

Unrolling would eliminate the & in both, leading to a net reduction in code size. The resultant code would then also be suitable for exit value computation. We miss a bunch of rotate opportunities on various targets. On X86 we miss a bunch of rotate-by-variable cases because the rotate matching code in dag combine doesn't look through truncates aggressively enough. Here are some testcases reduced from GCC PR17886

Definition at line 572 of file README.txt.

◆ preds

preds
Initial value:
%tmp.5 = add i32 %a

Definition at line 340 of file README.txt.

Referenced by abort_gzip(), foo(), llvm::MIPatternMatch::m_all_of(), and llvm::MIPatternMatch::m_any_of().

◆ reassoc

into llvm.powi, allowing the code generator to produce balanced multiplication trees; the intrinsic needs to be extended to support integers, and second the code generator needs to be enhanced to lower these to multiplication trees. Interesting testcase for add/shift/mul reassoc

Definition at line 61 of file README.txt.

◆ result

then result = phi i32 [ 0, %else.0 ]

Definition at line 355 of file README.txt.

◆ return

< i32 > br label return return

◆ rotate

The same transformation can work with an even modulo with the addition of a rotate

Definition at line 680 of file README.txt.

Referenced by DecodeNEONComplexLane64Instruction().

◆ rotates

The same transformation can work with an even modulo with the addition of a rotate, shrinking the compare RHS by the same amount. Unless the target supports rotates

Definition at line 681 of file README.txt.
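A hedged sketch of the even-modulo variant: multiply by the inverse of the odd factor, rotate right by the divisor's number of trailing zero bits, and shrink the compare RHS accordingly. The constants below are for divisor 6 on 32-bit unsigned values and are illustrative only:

#include <stdint.h>

static uint32_t ror1(uint32_t v) {
  return (v >> 1) | (v << 31);   /* rotate right by one bit */
}

/* n % 6 == 0: multiply by the inverse of the odd factor 3 (0xAAAAAAAB),
 * rotate right by one for the factor of 2, compare against (2^32 - 1) / 6. */
int divisible_by_6(uint32_t n) {
  return ror1(n * 0xAAAAAAABu) <= 0x2AAAAAAAu;
}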

◆ s

multiplies can be turned into SHLs

Definition at line 370 of file README.txt.

Referenced by llvm::PBQP::backpropagate(), llvm::LoopVectorizationCostModel::calculateRegisterUsage(), llvm::sys::path::convert_to_slash(), llvm::ARMConstantPoolSymbol::Create(), llvm::hashing::detail::hash_state::create(), llvm::DecodeSHUFPMask(), doinsert(), findmust(), llvm::ScheduleDAGSDNodes::getGraphNodeLabel(), llvm::ScheduleDAGInstrs::getGraphNodeLabel(), llvm::MachineFunction::getMachineMemOperand(), llvm::SlotIndex::getNextSlot(), llvm::SelectionDAG::getNode(), llvm::SlotIndex::getPrevSlot(), llvm::ScheduleDAGTopologicalSort::GetSubGraph(), llvm::hashing::detail::hash_17to32_bytes(), llvm::hashing::detail::hash_1to3_bytes(), llvm::hashing::detail::hash_33to64_bytes(), llvm::hashing::detail::hash_4to8_bytes(), llvm::hashing::detail::hash_9to16_bytes(), llvm::hashing::detail::hash_integer_value(), llvm::hashing::detail::hash_short(), llvm::HexagonResource::HexagonResource(), llvm::AsmLexer::LexToken(), llvm_regerror(), llvm_strlcpy(), llvm::MCInstPrinter::markup(), llvm::hashing::detail::hash_state::mix(), llvm::hashing::detail::hash_state::mix_32_bytes(), llvm::GlobalVariable::operator new(), llvm::MCSymbol::operator new(), parseSegmentLoadCommand(), pluscount(), llvm::IndexedMap< unsigned, llvm::VirtReg2IndexFunctor >::reserve(), llvm::IndexedMap< unsigned, llvm::VirtReg2IndexFunctor >::resize(), llvm::ARMFunctionInfo::setArgRegsSaveSize(), llvm::AArch64FunctionInfo::setCalleeSaveStackHasFreeSpace(), llvm::ARMFunctionInfo::setDPRCalleeSavedAreaSize(), llvm::ARMFunctionInfo::setDPRCalleeSavedGapSize(), llvm::ARMFunctionInfo::setFPCXTSaveAreaSize(), llvm::ARMFunctionInfo::setGPRCalleeSavedArea1Size(), llvm::ARMFunctionInfo::setGPRCalleeSavedArea2Size(), llvm::MachineFrameInfo::setHasPatchPoint(), llvm::AArch64FunctionInfo::setHasRedZone(), llvm::ARMFunctionInfo::setHasStackFrame(), llvm::AArch64FunctionInfo::setHasStackFrame(), llvm::MachineFrameInfo::setHasStackMap(), llvm::X86MachineFunctionInfo::setIsSplitCSR(), llvm::AArch64FunctionInfo::setIsSplitCSR(), llvm::ARMFunctionInfo::setIsSplitCSR(), llvm::ARMFunctionInfo::setLRIsSpilled(), llvm::MachineFrameInfo::setReturnAddressIsTaken(), llvm::ARMFunctionInfo::setReturnRegsCount(), llvm::ARMFunctionInfo::setShouldRestoreSPFromFP(), llvm::AArch64FunctionInfo::setStackRealigned(), llvm::object::Elf_Rel_Impl< ELFType< TargetEndianness, false >, false >::setSymbol(), llvm::object::Elf_Rel_Impl< ELFType< TargetEndianness, true >, false >::setSymbol(), llvm::ELF::Elf32_Rel::setSymbol(), llvm::ELF::Elf32_Rela::setSymbol(), llvm::ELF::Elf64_Rel::setSymbol(), llvm::ELF::Elf64_Rela::setSymbol(), llvm::object::Elf_Rel_Impl< ELFType< TargetEndianness, false >, false >::setSymbolAndType(), llvm::object::Elf_Rel_Impl< ELFType< TargetEndianness, true >, false >::setSymbolAndType(), llvm::ELF::Elf32_Rel::setSymbolAndType(), llvm::ELF::Elf32_Rela::setSymbolAndType(), llvm::ELF::Elf64_Rel::setSymbolAndType(), llvm::ELF::Elf64_Rela::setSymbolAndType(), llvm::HexagonResource::setUnits(), llvm::HexagonResource::setWeight(), llvm::SmallBitVector::SmallBitVector(), llvm::sys::fs::status_known(), llvm::MachO::swapStruct(), wordsOfString(), llvm::msgpack::Writer::write(), and llvm::write32AArch64Addr().

◆ sbbl

Add support for conditional increments and other related patterns. Instead: eax eax je LBB16_2 eax edi sbbl

Definition at line 144 of file README.txt.

◆ Shift

bool Shift

Definition at line 468 of file README.txt.

Referenced by llvm::KnownBits::ashr(), BuildExactSDIV(), llvm::TargetLowering::BuildSDIV(), canShiftBinOpWithConstantRHS(), collectInsertionElements(), combineAndMaskToShift(), combineMulSpecial(), combineShiftOfShiftedLogic(), llvm::X86TargetLowering::computeKnownBitsForTargetNode(), llvm::detail::TrailingZerosCounter< T, SizeOfT >::count(), llvm::detail::LeadingZerosCounter< T, SizeOfT >::count(), DecodeImm8OptLsl(), llvm::decodeSLEB128(), DecodeSORegImmOperand(), DecodeSORegRegOperand(), llvm::decodeULEB128(), llvm::PPCTargetLowering::decomposeMulByConstant(), llvm::AArch64InstrInfo::describeLoadedValue(), llvm::ScaledNumbers::divide32(), llvm::ScaledNumbers::divide64(), dumpApplePropertyAttribute(), llvm::MipsMCCodeEmitter::emitInstruction(), llvm::TargetLowering::expandABS(), llvm::TargetLowering::expandBITREVERSE(), llvm::AArch64_IMM::expandMOVImm(), expandMOVImmSimple(), llvm::TargetLowering::expandMUL_LOHI(), llvm::bitfields_details::Impl< Bitfield, StorageType >::extract(), extractMaskedValue(), llvm::InstCombinerImpl::foldICmpAndShift(), llvm::InstCombinerImpl::foldICmpShlConstConst(), llvm::InstCombinerImpl::foldICmpShrConstConst(), foldMaskAndShiftToExtract(), foldMaskAndShiftToScale(), foldMaskedShiftToBEXTR(), foldMaskedShiftToScaledMask(), foldVectorXorShiftIntoCmp(), foldXorTruncShiftIntoCmp(), generateSignedDivisionCode(), generateSignedRemainderCode(), llvm::MipsTargetLowering::getAddrNonPICSym64(), llvm::ScaledNumbers::getAdjusted(), llvm::object::coff_section::getAlignment(), llvm::object::coff_tls_directory< IntTy >::getAlignment(), getCmpOperandFoldingProfit(), llvm::TargetLoweringBase::getCondCodeAction(), getInputSegmentList(), llvm::TargetLoweringBase::getLoadExtAction(), llvm::SelectionDAG::getNode(), getScaledOffsetForBitWidth(), GetVBR(), INITIALIZE_PASS(), InRange(), insertMaskedValue(), llvm::X86TTIImpl::instCombineIntrinsic(), llvm::AArch64InstrInfo::isAddImmediate(), llvm::AArch64_AM::isAnyMOVZMovAlias(), isKnownNonZero(), llvm::AArch64_AM::isMOVNMovAlias(), llvm::AArch64_AM::isMOVZMovAlias(), isShlDoublePermute(), isSimpleShift(), llvm::MipsLegalizerInfo::legalizeCustom(), llvm::AMDGPULegalizerInfo::loadInputValue(), llvm::AMDGPUTargetLowering::loadInputValue(), lower1BitShuffle(), llvm::LegalizerHelper::lowerAbsToAddXor(), llvm::VETargetLowering::lowerEXTRACT_VECTOR_ELT(), LowerEXTRACT_VECTOR_ELT_i1(), llvm::LegalizerHelper::lowerFCopySign(), llvm::VETargetLowering::lowerINSERT_VECTOR_ELT(), LowerLargeShift(), llvm::LegalizerHelper::lowerLoad(), llvm::MSP430TargetLowering::LowerSETCC(), lowerShuffleAsByteShiftMask(), llvm::LegalizerHelper::lowerUnmergeValues(), lowerV16I16Shuffle(), lowerV16I32Shuffle(), lowerV16I8Shuffle(), lowerV2I64Shuffle(), lowerV32I16Shuffle(), lowerV32I8Shuffle(), lowerV4I32Shuffle(), lowerV4I64Shuffle(), lowerV64I8Shuffle(), lowerV8I16Shuffle(), lowerV8I32Shuffle(), lowerV8I64Shuffle(), LowerVectorCTLZInRegLUT(), llvm::KnownBits::lshr(), match1BitShuffleAsKSHIFT(), matchAArch64MulConstCombine(), matchIntPart(), matchLoadAndBytePosition(), matchRotateHalf(), matchShuffleAsShift(), llvm::ScaledNumbers::multiply64(), llvm::BlockFrequencyInfoImplBase::Distribution::normalize(), llvm::PointerEmbeddedInt< IntT, Bits >::operator IntT(), llvm::operator<<(), llvm::ScaledNumber< uint64_t >::operator<<=(), llvm::PointerEmbeddedInt< IntT, Bits >::operator=(), llvm::operator>>(), llvm::ScaledNumber< uint64_t >::operator>>=(), packSegmentMask(), ParseBFI(), llvm::PPCTargetLowering::PerformDAGCombine(), performVectorTruncateCombine(), 
performVSelectCombine(), llvm::AArch64InstPrinter::printAddSubImm(), llvm::AArch64InstPrinter::printImm8OptLsl(), llvm::AArch64InstPrinter::printInst(), ReduceSwitchRange(), llvm::M68kFrameLowering::restoreCalleeSavedRegisters(), selectI64ImmDirect(), selectI64ImmDirectPrefix(), llvm::TargetLoweringBase::setCondCodeAction(), llvm::TargetLoweringBase::setLoadExtAction(), shiftRightAndRound(), llvm::KnownBits::shl(), llvm::AArch64TargetLowering::shouldConvertConstantLoadToIntImm(), llvm::X86TargetLowering::SimplifyDemandedBitsForTargetNode(), llvm::TargetLowering::SimplifySetCC(), llvm::InstCombinerImpl::SliceUpIllegalIntegerPHI(), llvm::M68kFrameLowering::spillCalleeSavedRegisters(), llvm::OpenMPIRBuilder::tileLoops(), llvm::ScaledNumberBase::toString(), toStringAPFloat(), tryAdvSIMDModImm16(), tryAdvSIMDModImm32(), tryAdvSIMDModImm321s(), tryBitfieldInsertOpFromOrAndImm(), tryCombineFixedPointConvert(), tryLowerToSLI(), llvm::bitfields_details::Impl< Bitfield, StorageType >::update(), UpgradeX86ALIGNIntrinsics(), UpgradeX86PSLLDQIntrinsics(), UpgradeX86PSRLDQIntrinsics(), llvm::InstCombinerImpl::visitTrunc(), and llvm::LegalizerHelper::widenScalar().

◆ shl5

aka conv shl5 = shl i32 %conv

Definition at line 604 of file README.txt.

◆ shl9

compiles shl9 = shl i32 %conv

Definition at line 605 of file README.txt.

◆ Shrink

This would be a win on ppc32, but not x86 or ppc64. Shrink

Definition at line 41 of file README.txt.

◆ size

i<reg-> size

Definition at line 166 of file README.txt.

Referenced by enlarge(), operator new(), llvm::orc::shared::detail::serializeViaSPSToWrapperFunctionResult(), llvm::orc::shared::SPSSerializationTraits< SPSRemoteSymbolLookupSetElement, SymbolLookupSet::value_type >::size(), llvm::orc::shared::SPSSerializationTraits< SPSRemoteSymbolLookup, ExecutorProcessControl::LookupRequest >::size(), llvm::orc::shared::SPSSerializationTraits< SPSExecutorAddr, jitlink::JITLinkMemoryManager::FinalizedAlloc >::size(), llvm::orc::shared::SPSArgList< SPSTagT, SPSTagTs... >::size(), llvm::orc::shared::SPSSerializationTraits< SPSExecutorAddr, ExecutorAddr >::size(), llvm::orc::shared::SPSSerializationTraits< SPSRemoteSymbolLookupSetElement, RemoteSymbolLookupSetElement >::size(), llvm::orc::shared::SPSSerializationTraits< SPSMemoryProtectionFlags, tpctypes::WireProtectionFlags >::size(), llvm::orc::shared::SPSSerializationTraits< SPSRemoteSymbolLookup, RemoteSymbolLookup >::size(), llvm::orc::shared::SPSSerializationTraits< SPSExecutorAddrRange, ExecutorAddrRange >::size(), llvm::orc::shared::SPSSerializationTraits< SPSSupportFunctionCall, tpctypes::SupportFunctionCall >::size(), llvm::orc::shared::SPSSerializationTraits< SPSSimpleRemoteEPCExecutorInfo, SimpleRemoteEPCExecutorInfo >::size(), llvm::orc::shared::SPSSerializationTraits< SPSAllocationActionsPair, tpctypes::AllocationActionsPair >::size(), llvm::orc::shared::SPSSerializationTraits< SPSSegFinalizeRequest, tpctypes::SegFinalizeRequest >::size(), llvm::orc::shared::SPSSerializationTraits< SPSELFPerObjectSectionsToRegister, ELFPerObjectSectionsToRegister >::size(), llvm::orc::shared::SPSSerializationTraits< SPSMachOPerObjectSectionsToRegister, MachOPerObjectSectionsToRegister >::size(), llvm::orc::shared::SPSSerializationTraits< SPSFinalizeRequest, tpctypes::FinalizeRequest >::size(), llvm::orc::shared::SPSSerializationTraits< SPSMemoryAccessUIntWrite< T >, tpctypes::UIntWrite< T > >::size(), llvm::orc::shared::SPSSerializationTraits< SPSELFNixJITDylibInitializers, ELFNixJITDylibInitializers >::size(), llvm::orc::shared::SPSSerializationTraits< SPSMachOJITDylibInitializers, MachOJITDylibInitializers >::size(), llvm::orc::shared::SPSSerializationTraits< SPSMemoryAccessBufferWrite, tpctypes::BufferWrite >::size(), llvm::orc::shared::SPSSerializationTraits< SPSSequence< char >, ArrayRef< char > >::size(), llvm::orc::shared::SPSSerializationTraits< SPSSequence< SPSElementTagT >, SequenceT, std::enable_if_t< TrivialSPSSequenceSerialization< SPSElementTagT, SequenceT >::available > >::size(), llvm::orc::shared::SPSSerializationTraits< SPSTuple< SPSTagT1, SPSTagT2 >, std::pair< T1, T2 > >::size(), llvm::orc::shared::SPSSerializationTraits< SPSString, StringRef >::size(), llvm::orc::shared::SPSSerializationTraits< SPSSequence< SPSTuple< SPSString, SPSValueT > >, StringMap< ValueT > >::size(), llvm::orc::shared::SPSSerializationTraits< SPSError, detail::SPSSerializableError >::size(), llvm::orc::shared::SPSSerializationTraits< SPSExpected< SPSTagT >, detail::SPSSerializableExpected< T > >::size(), llvm::orc::shared::SPSSerializationTraits< SPSExpected< SPSTagT >, detail::SPSSerializableError >::size(), and llvm::orc::shared::SPSSerializationTraits< SPSExpected< SPSTagT >, T >::size().

◆ systems

Note that only the low bits of effective_addr2 are used. On bit systems

Definition at line 100 of file README.txt.

◆ tables

we compile this esp call L1 $pb L1 esp je LBB1_2 esp ret, but the picbase is currently always computed in the entry block. It would be better to sink the picbase computation down into the block that needs it, as it is the only one that uses it. This happens for a lot of code with early outs. Another example is loads of arguments, which are usually emitted into the entry block on targets like x86. If not used in all paths through a function, they should be sunk into the ones that do. In this case, whole-function isel would also handle this. Investigate lowering of sparse switch statements into perfect hash tables

Definition at line 439 of file README.txt.
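As a hedged sketch of the sparse-switch idea (keys, hash, and values below are invented for illustration): when the case values are far apart, a dense jump table is impractical, but a small perfect hash plus a key re-check gives a branch-free lookup.

int classify_switch(unsigned x) {
  switch (x) {               /* sparse cases: no dense jump table possible */
  case 16:       return 1;
  case 1001:     return 2;
  case 70002:    return 3;
  case 12345679: return 4;
  default:       return 0;
  }
}

int classify_hashed(unsigned x) {
  /* x & 3 happens to send each live key to a distinct slot; re-checking the
   * stored key sends every other input to the default value. */
  static const unsigned keys[4]   = { 16, 1001, 70002, 12345679 };
  static const int      values[4] = { 1,  2,    3,     4 };
  unsigned slot = x & 3;
  return keys[slot] == x ? values[slot] : 0;
}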

◆ targets

Unrolling would eliminate the & in both, leading to a net reduction in code size. The resultant code would then also be suitable for exit value computation. We miss a bunch of rotate opportunities on various targets

Definition at line 567 of file README.txt.

◆ then

< i1 > br i1 label label return then

Definition at line 338 of file README.txt.

◆ this

multiplies can be turned into SHLs, so they should be handled as if they were associative; return like this

Definition at line 378 of file README.txt.
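A small hedged example of treating a constant shift as a multiply for reassociation purposes (functions invented for illustration): both forms below compute x * 36 and should be canonicalized the same way.

unsigned scale_mul(unsigned x)   { return (x * 4) * 9;  }  /* 4 * 9 folds to 36 */
unsigned scale_shift(unsigned x) { return (x << 2) * 9; }  /* x << 2 is x * 4   */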

◆ though

The same transformation can work with an even modulo with the addition of a rotate, shrinking the compare RHS by the same amount. Unless the target supports rotates, though

Definition at line 681 of file README.txt.

◆ tmp

<i32> tmp = icmp ne i32 %tmp.1

Definition at line 337 of file README.txt.

◆ to

we compile this to

Definition at line 406 of file README.txt.

◆ transform

bool LoopInterchangeTransform::transform

◆ U64

instcombine should handle this transform when X and C2 are unsigned. Similarly for udiv and signed operands. Currently InstCombine avoids this transform but will do it when the signs of the operands and the sign of the divide match. See the FIXME in InstructionCombining.cpp in the visitSetCondInst method after the switch case for Instruction::UDiv (around line 4447) for more details. The SingleSource/Benchmarks/Shootout-C++/hash and hash2 tests have examples of this construct. [LOOP OPTIMIZATION] SingleSource/Benchmarks/Misc/dt.c shows several interesting optimization opportunities in its double_array_divs_variable function. typedef unsigned long long U64

Definition at line 268 of file README.txt.

Referenced by parseBasicType().
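A hedged sketch of the compare-of-divide idea with made-up constants: comparing an unsigned divide against a constant is equivalent to one range check on the dividend, so the division can be dropped entirely.

int by_divide(unsigned x) { return x / 10 == 5; }   /* true for x in [50, 59] */
int by_range(unsigned x)  { return x - 50 < 10; }   /* same set, no divide    */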

◆ X

instcombine should handle this C2 when X

Definition at line 263 of file README.txt.

Referenced by test().

◆ x

This should optimize to x

Definition at line 767 of file README.txt.

◆ x86

There are other cases in various .td files. Take something like the following on x86:

◆ X86

Unrolling would eliminate the & in both, leading to a net reduction in code size. The resultant code would then also be suitable for exit value computation. We miss a bunch of rotate opportunities on various targets. On X86

◆ xmm0

gets compiled into this on rsp movaps rsp movaps rsp movaps rsp movaps rsp movaps rsp movaps rsp movaps rsp movaps xmm0

Definition at line 517 of file README.txt.

◆ y

The same transformation can work with an even modulo with the addition of a rotate, shrinking the compare RHS by the same amount. Unless the target supports rotates, that transformation probably isn't worthwhile. The transformation can also easily be made to work with non-zero equality. The first function produces better code on X86. From GCC: int y