LLVM 15.0.0git
Target Independent Opportunities (lib/Target/README.txt)

We should recognize various "overflow detection" idioms and translate them
into llvm.uadd.with.overflow and similar intrinsics.  Here is a multiply
idiom:

unsigned int mul(unsigned int a, unsigned int b) {
  unsigned int c = a * b;
  if (a != 0 && c / a != b)
    printf("multiply overflowed!\n");
  return c;
}
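As an added sketch (not part of the original note): from C, the recognized
form is what the GCC/Clang overflow builtin expresses directly, and clang
lowers it to the llvm.umul.with.overflow intrinsic.  The function name here
is illustrative:

#include <stdio.h>

unsigned int mul_checked(unsigned int a, unsigned int b) {
  unsigned int c;
  /* __builtin_umul_overflow reports wraparound without the c/a != b dance. */
  if (__builtin_umul_overflow(a, b, &c))
    printf("multiply overflowed!\n");
  return c;
}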
The legalization code for mul-with-overflow needs to be made more robust
before this can be implemented, though.

Get the C front end to expand hypot(x,y) -> llvm.sqrt(x*x+y*y) when errno and
precision don't matter (-ffast-math).  Misc/mandel will like this. :)  This
isn't safe in general, even on darwin.  See the libm implementation of hypot
for examples (which special-case when x/y are exactly zero to get signed
zeros etc. right).

On targets with expensive 64-bit multiply, we could LSR this:

for (i = ...; ++i) {
   x = 1ULL << i;

into:

 long long tmp = 1;
 for (i = ...; ++i, tmp += tmp)
   x = tmp;

This would be a win on ppc32, but not x86 or ppc64.

Shrink: (setlt (loadi32 P), 0) -> (setlt (loadi8 (add P, 3)), 0)
Reassociate should turn things like:

int factorial(int X) {
 return X*X*X*X*X*X*X*X;
}

into llvm.powi calls, allowing the code generator to produce balanced
multiplication trees.

First, the intrinsic needs to be extended to support integers, and second the
code generator needs to be enhanced to lower these to multiplication trees.

Interesting testcase for add/shift/mul reassoc:

int bar(int x, int y) {
  return x*x*x + y + x*x*x*x*x*y*y*y*y;
}
int foo(int z, int n) {
  return bar(z, n) + bar(2*z, 2*n);
}

This is blocked on not handling X*X*X -> powi(X, 3) (see note above).  The
issue is that we end up getting t = 2*X, s = t*t, and don't turn this into
4*X*X, which is the same number of multiplies and is canonical, because the
2*X has multiple uses.  Here's a simple example:

define i32 @test15(i32 %X1) {
  %B = mul i32 %X1, 47		; X1*47
  %C = mul i32 %B, %B
  ret i32 %C
}

Reassociate should handle the example in GCC PR16157:

extern int a0, a1, a2, a3, a4;
extern int b0, b1, b2, b3, b4;
void f () {
  b4 = a4 + a3 + a2 + a1 + a0;
  b3 = a3 + a2 + a1 + a0;
  b2 = a2 + a1 + a0;
  b1 = a1 + a0;
  b0 = a0;
}

This requires reassociating to forms of expressions that are already
available, something that reassoc doesn't think about yet.
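An added sketch (not from the README) of the four-addition form that reusing
the already-available subexpressions would produce for the PR16157 example:

extern int a0, a1, a2, a3, a4;
extern int b0, b1, b2, b3, b4;

void f_reassociated(void) {
  /* Each sum extends the previous, already-available one. */
  b1 = a1 + a0;
  b2 = a2 + b1;
  b3 = a3 + b2;
  b4 = a4 + b3;
}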
These two functions should generate the same code on big-endian systems:

int g(int *j, int *l) { return memcmp(j, l, 4); }
int h(int *j, int *l) { return *j - *l; }

this could be done in SelectionDAGISel.cpp, along with other special cases,
for 1,2,4,8 bytes.

It would be nice to revert this patch: ...

Combine: a = sin(x), b = cos(x) into a,b = sincos(x).

Expand these to calls of sin/cos and stores:

      float sincosf(float x, float *sin, float *cos);
      long double sincosl(long double x, long double *sin, long double *cos);

Doing so could allow SROA of the destination pointers.  See also: ...
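An added illustration (not from the README) of the kind of call pair the
sin/cos -> sincos combine targets; polar_to_xy is a hypothetical example
function:

#include <math.h>

void polar_to_xy(double r, double t, double *x, double *y) {
  *x = r * cos(t);  /* this cos(t) ...                                  */
  *y = r * sin(t);  /* ... and this sin(t) could be one sincos(t) call. */
}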
... if (target < 32) ... but this requires TBAA.

This isn't recognized as bswap by instcombine (yes, it really is bswap):

unsigned long reverse(unsigned v) {
    unsigned t;
    t = v ^ ((v << 16) | (v >> 16));
    t &= ~0xff0000;
    v = (v << 24) | (v >> 8);
    return v ^ (t >> 8);
}

We don't delete this output-free loop, because trip count analysis doesn't
realize that it is finite (if it were infinite, it would be undefined).  Not
having this blocks Loop Idiom from matching strlen and friends:

void foo(char *C) {
  int x = 0;
  while (*C)
    ++x, ++C;
}

These idioms should be recognized as popcount (see PR1488):

unsigned int popcount(unsigned int input) {
  unsigned int count = 0;
  for (unsigned int i = 0; i < 4 * 8; i++)
    count += (input >> i) & 1;
  return count;
}

This should be recognized as CLZ:

int f(unsigned a) {
  int i;
  for (i = 0; i < 32; i++)
    if (a & (1 << (31 - i)))
      return i;
  return 32;
}

This sort of thing should be added to the loop idiom pass.
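For reference (added, not in the original notes), the builtin forms these
loop idioms should collapse to; the function names are illustrative:

unsigned popcount_builtin(unsigned v) {
  return __builtin_popcount(v);        /* lowers to ctpop */
}

int clz_builtin(unsigned a) {
  return a ? __builtin_clz(a) : 32;    /* __builtin_clz(0) is undefined */
}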
These should turn into single 16-bit (unaligned?) loads on little/big endian
processors:

unsigned short read_16_le(const unsigned char *adr) {
  return adr[0] | (adr[1] << 8);
}

unsigned short read_16_be(const unsigned char *adr) {
  return (adr[0] << 8) | adr[1];
}
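An added sketch (not from the README): the memcpy form that reliably becomes
a single 16-bit load, in host byte order; read_16_native is a hypothetical
name:

#include <string.h>

unsigned short read_16_native(const unsigned char *adr) {
  unsigned short v;
  memcpy(&v, adr, sizeof v);  /* compiles to one (possibly unaligned) load */
  return v;
}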
In this code:

typedef unsigned long long U64;
typedef unsigned int U32;

int test(U32 *inst, U64 *regs) {
    U64 effective_addr2;
    U32 temp = *inst;
    int r1 = (temp >> 20) & 0xf;
    int b2 = (temp >> 16) & 0xf;
    effective_addr2 = temp & 0xfff;
    if (b2) effective_addr2 += regs[b2];
    b2 = (temp >> 12) & 0xf;
    if (b2) effective_addr2 += regs[b2];
    effective_addr2 &= regs[4];
    if ((effective_addr2 & 3) == 0)
        return 1;
    return 0;
}

Note that only the low bits of effective_addr2 are used.  On 32-bit targets,
we don't eliminate the computation of the top half of effective_addr2
because we don't have whole-function selection dags.  On x86, this means we
use one extra register for the function when effective_addr2 is declared as
U64 than when it is declared U32.  PHI Slicing could be extended to do this.

Tail call elim should be more aggressive, checking to see if the call is
followed by an uncond branch to an exit block.

; This testcase is due to tail-duplication doing a copy propagation
; nastiness: we are unable to propagate the load instruction into the
; terminating blocks because there was other code optimized out of the
; function after the taildup happened.
; RUN: ...

define i32 @t4(i32 %a) {
entry:
	%tmp.1 = and i32 %a, 1		; <i32> [#uses=1]
	%tmp.2 = icmp ne i32 %tmp.1, 0	; <i1> [#uses=1]
	br i1 %tmp.2, label %then.0, label %else.0

then.0:		; preds = %entry
	...

else.0:		; preds = %entry
	...
	%result = phi i32 [ 0, %else.0 ], ...
	ret i32 %result
}

Tail recursion elimination should handle:

int pow2m1(int n) {
  if (n == 0)
    return 0;
  return 2 * pow2m1(n - 1) + 1;
}

Also, multiplies can be turned into SHL's, so they should be handled as if
they were associative.
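An added sketch (not from the README) of what accumulator-based tail
recursion elimination would turn pow2m1 into:

int pow2m1_loop(int n) {
  int acc = 0;
  while (n-- != 0)
    acc = 2 * acc + 1;  /* same recurrence, applied iteratively */
  return acc;
}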
Argument promotion should promote arguments for recursive functions, like
this:

; RUN: ...

define internal i32 @foo(i32* %x) {
entry:
	%tmp = load i32* %x			; <i32> [#uses=1]
	%tmp.foo = call i32 @foo( i32* %x )	; <i32> [#uses=1]
	%tmp1 = add i32 %tmp.foo, %tmp		; <i32> [#uses=1]
	ret i32 %tmp1
}

define i32 @bar(i32* %x) {
entry:
	%tmp3 = call i32 @foo( i32* %x )	; <i32> [#uses=1]
	ret i32 %tmp3
}
We compile this:

struct THotKey { short Key; bool Control; bool Shift; bool Alt; };
extern THotKey m_HotKey;

THotKey GetHotKey () { return m_HotKey; }

into (-m64 -O3 -fno-exceptions -static -fomit-frame-pointer):

	...
... if (x == 3) y = 0; ...
The loop unroller should partially unroll loops (instead of peeling them)
when code growth isn't too bad and when an unroll count allows simplification
of some code within the loop.  One trivial example is:

#include <stdio.h>
int main() {
    int nRet = 17;
    int nLoop;
    for ( nLoop = 0; nLoop < 1000; nLoop++ ) {
        if ( nLoop & 1 )
            nRet += 2;
        else
            nRet -= 1;
    }
    return nRet;
}

Unrolling by 2 would eliminate the '&1' in both copies, leading to a net
reduction in code size.  The resultant code would then also be suitable for
exit value computation.
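An added sketch (not from the README) of the loop after unrolling by 2; the
(nLoop & 1) test folds to a constant in each copy:

int main_unrolled(void) {
  int nRet = 17;
  int nLoop;
  for (nLoop = 0; nLoop < 1000; nLoop += 2) {
    nRet -= 1;  /* copy for even nLoop: (nLoop & 1) == 0 */
    nRet += 2;  /* copy for odd nLoop:  (nLoop & 1) == 1 */
  }
  return nRet;
}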
We miss a bunch of rotate opportunities on various targets, including ppc,
x86, etc.  On X86, we miss a bunch of 'rotate by variable' cases because the
rotate matching code in dag combine doesn't look through truncates
aggressively enough.  Here are some testcases reduced from GCC PR17886:

unsigned long long f5(unsigned long long x, unsigned long long y) {
  return (x << 8) | ((y >> 48) & 0xffull);
}
unsigned long long f6(unsigned long long x, unsigned long long y, int z) {
  switch(z) {
  case 1:
    return (x << 8) | ((y >> 48) & 0xffull);
  case 2:
    return (x << 16) | ((y >> 40) & 0xffffull);
  case 3:
    return (x << 24) | ((y >> 32) & 0xffffffull);
  case 4:
    return (x << 32) | ((y >> 24) & 0xffffffffull);
  default:
    return (x << 40) | ((y >> 16) & 0xffffffffffull);
  }
}
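For comparison (added, not in the original notes), the canonical
rotate-by-variable idiom that does match, assuming 32-bit unsigned:

unsigned rotl32(unsigned x, unsigned n) {
  n &= 31;  /* masking avoids the undefined x >> 32 case when n == 0 */
  return (x << n) | (x >> ((32 - n) & 31));
}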
... return j | (j << 16); ...

This (and similar related idioms):

unsigned int foo(unsigned char i) {
  return i | (i<<8) | (i<<16) | (i<<24);
}

compiles into:

define i32 @foo(i8 zeroext %i) nounwind readnone ssp noredzone {
entry:
  %conv = zext i8 %i to i32
  %shl = shl i32 %conv, 8
  %shl5 = shl i32 %conv, 16
  %shl9 = shl i32 %conv, 24
  %or = or i32 %shl9, %conv
  %or6 = or i32 %or, %shl
  %or10 = or i32 %or6, %shl5
  ret i32 %or10
}

it would be better as:

entry:
  %conv = zext i8 %i to i32
  %or6 = mul i32 %conv, 16843009	; 0x01010101
  ret i32 %or6

aka "i * 0x01010101" ..., depending on the speed of the multiplier.  The best
way to handle this is to canonicalize it to a multiply in IR and have codegen
handle lowering multiplies to shifts on cpus where shifts are faster than
multiplies.

We do a number of simplifications in simplify libcalls to strength reduce
standard library functions, but we don't currently merge them together.  For
example, it is useful to merge memcpy(a,b,strlen(b)) -> strcpy.  This can
only be done safely if "b" isn't modified between the strlen and memcpy of
course.

We compile this program: (from GCC PR11680)
http:

Into code that runs the same speed in fast/slow modes, but both modes run 2x
slower than when compiled with GCC (either 4.0 or 4.2):

$ llvm-g++ perf.cpp -O3 -fno-exceptions
$ time ./a.out fast
1.821u 0.003s 0:01.82 100.0%	0+0k 0+0io 0pf+0w

$ g++ perf.cpp -O3 -fno-exceptions
$ time ./a.out fast
0.821u 0.001s 0:00.82 100.0%	0+0k 0+0io 0pf+0w

It looks like we are making the same inlining decisions, so this may be raw
codegen badness or something else (haven't investigated).

Divisibility by constant can be simplified (according to GCC PR12849) from
being a mulhi to being a mullo (cheaper).  Testcase:

void bar(unsigned n) {
  if (n % 3 == 0)
    true();
}

This is equivalent to the following, where 2863311531 is the multiplicative
inverse of 3, and 1431655766 is ((2^32)-1)/3+1:

void bar(unsigned n) {
  if (n * 2863311531U < 1431655766U)
    true();
}

The same transformation can work with an even modulo with the addition of a
rotate: rotate the result of the multiply to the right by the number of bits
which need to be zero for the condition to be true, and shrink the compare
RHS by the same amount.  Unless the target supports rotates, though, that
transformation probably isn't worthwhile.

The transformation can also easily be made to work with non-zero equality
comparisons: just transform, for example, "n % 3 == 1" into
"(n - 1) % 3 == 0".
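An added sanity check (not from the README) that the multiplicative-inverse
rewrite of "n % 3 == 0" holds; the constants are the ones derived above:

#include <assert.h>

int main(void) {
  unsigned n;
  for (n = 0; n < 1000000u; n++)
    assert((n % 3 == 0) == (n * 2863311531u < 1431655766u));
  return 0;
}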
This should optimize to ..., or at least something sane.  Currently not
optimized with "clang -emit-llvm-bc | opt -O3".

int a(int a, int b, int c) { ... }

Should fold to "a && b ... c".  Currently not optimized with "clang
-emit-llvm-bc | opt -O3".

int a(int x) { ... }

Should combine to "x & ...".  Currently not optimized with "clang
-emit-llvm-bc | opt -O3".

unsigned a(unsigned a) { ... }

Should combine to "a * ...".  Currently not optimized with "clang
-emit-llvm-bc | opt -O3".

unsigned a(char *x) { ... }

There's an unnecessary zext in the generated code with "clang -emit-llvm-bc |
opt -O3".

#define PMD_MASK (~((1UL << 23) - 1))

unsigned a(unsigned long long x) { ... }

Should combine to "... (unsigned)x & ...".  Currently not optimized with
"clang -emit-llvm-bc | opt -O3".

int g(int x) { return (x - 10) < 0; }
Should combine to x<=9" (the sub has nsw). Currently notoptimized with "clang -emit-llvm-bc|opt -O3".int g(int x) { return (x + 10) < 0; }Should combine to "x< -10" (the add has nsw). Currently notoptimized with "clang -emit-llvm-bc|opt -O3".int f(int i, int j) { return i < j + 1; }int g(int i, int j) { return j > i - 1; }Should combine to "i<=j" (the add/sub has nsw). Currently notoptimized with "clang -emit-llvm-bc|opt -O3".unsigned f(unsigned x) { return ((x & 7) + 1) & 15; }The & 15 part should be optimized away, it doesn't change the result. Currentlynot optimized with "clang -emit-llvm-bc|opt -O3".This was noticed in the entryblock for grokdeclarator in 403.gcc: %tmp = icmp eq i32 %decl_context, 4 %decl_context_addr.0 = select i1 %tmp, i32 3, i32 %decl_context %tmp1 = icmp eq i32 %decl_context_addr.0, 1 %decl_context_addr.1 = select i1 %tmp1, i32 0, i32 %decl_context_addr.0tmp1 should be simplified to something like: (!tmp || decl_context == 1)This allows recursive simplifications, tmp1 is used all over the place inthe function, e.g. by: %tmp23 = icmp eq i32 %decl_context_addr.1, 0 ; <i1> [#uses=1] %tmp24 = xor i1 %tmp1, true ; <i1> [#uses=1] %or.cond8 = and i1 %tmp23, %tmp24 ; <i1> [#uses=1]later.[STORE SINKING]Store sinking: This code:void f (int n, int *cond, int *res) { int i; *res = 0; for (i = 0; i < n; i++) if (*cond) *res ^= 234; }On this function GVN hoists the fully redundant value of *res, but nothingmoves the store out. This gives us this code:bb: ; preds = %bb2, %entry %.rle = phi i32 [ 0, %entry ], [ %.rle6, %bb2 ] %i.05 = phi i32 [ 0, %entry ], [ %indvar.next, %bb2 ] %1 = load i32* %cond, align 4 %2 = icmp eq i32 %1, 0 br i1 %2, label %bb2, label %bb1bb1: ; preds = %bb %3 = xor i32 %.rle, 234 store i32 %3, i32* %res, align 4 br label %bb2bb2: ; preds = %bb, %bb1 %.rle6 = phi i32 [ %3, %bb1 ], [ %.rle, %bb ] %indvar.next = add i32 %i.05, 1 %exitcond = icmp eq i32 %indvar.next, %n br i1 %exitcond, label %return, label %bbDSE should sink partially dead stores to get the store out of the loop.Here's another partial dead case:http:Scalar PRE hoists the mul in the common block up to the else:int test (int a, int b, int c, int g) { int d, e; if (a) d = b * c; else d = b - c; e = b * c + g; return d + e;}It would be better to do the mul once to reduce codesize above the if.This is GCC PR38204.This simple function from 179.art:int winner, numf2s;struct { double y; int reset; } *Y;void find_match() { int i; winner = 0; for (i=0;i<numf2s;i++) if (Y[i].y > Y[winner].y) winner =i;}Compiles into (with clang TBAA):for.body: ; preds = %for.inc, %bb.nph %indvar = phi i64 [ 0, %bb.nph ], [ %indvar.next, %for.inc ] %i.01718 = phi i32 [ 0, %bb.nph ], [ %i.01719, %for.inc ] %tmp4 = getelementptr inbounds %struct.anon* %tmp3, i64 %indvar, i32 0 %tmp5 = load double* %tmp4, align 8, !tbaa !4 %idxprom7 = sext i32 %i.01718 to i64 %tmp10 = getelementptr inbounds %struct.anon* %tmp3, i64 %idxprom7, i32 0 %tmp11 = load double* %tmp10, align 8, !tbaa !4 %cmp12 = fcmp ogt double %tmp5, %tmp11 br i1 %cmp12, label %if.then, label %for.incif.then: ; preds = %for.body %i.017 = trunc i64 %indvar to i32 br label %for.incfor.inc: ; preds = %for.body, %if.then %i.01719 = phi i32 [ %i.01718, %for.body ], [ %i.017, %if.then ] %indvar.next = add i64 %indvar, 1 %exitcond = icmp eq i64 %indvar.next, %tmp22 br i1 %exitcond, label %for.cond.for.end_crit_edge, label %for.bodyIt is good that we hoisted the reloads of numf2's, and Y out of the loop andsunk the store to winner out.However, this is awful on several 
levels: the conditional truncate in the loop(-indvars at fault? why can't we completely promote the IV to i64?).Beyond that, we have a partially redundant load in the loop: if "winner" (aka %i.01718) isn't updated, we reload Y[winner].y the next time through the loop.Similarly, the addressing that feeds it (including the sext) is redundant. Inthe end we get this generated assembly:LBB0_2: ## %for.body ## =>This Inner Loop Header: Depth=1 movsd (%rdi), %xmm0 movslq %edx, %r8 shlq $4, %r8 ucomisd (%rcx,%r8), %xmm0 jbe LBB0_4 movl %esi, %edxLBB0_4: ## %for.inc addq $16, %rdi incq %rsi cmpq %rsi, %rax jne LBB0_2All things considered this isn't too bad, but we shouldn't need the movslq orthe shlq instruction, or the load folded into ucomisd every time through theloop.On an x86-specific topic, if the loop can't be restructure, the movl should be acmov.[STORE SINKING]GCC PR37810 is an interesting case where we should sink load/store reloadinto the if block and outside the loop, so we don't reload/store it on thenon-call path.for () { *P += 1; if () call(); else ...->tmp = *Pfor () { tmp += 1; if () { *P = tmp; call(); tmp = *P; } else ...}*P = tmp;We now hoist the reload after the call (Transforms/GVN/lpre-call-wrap.ll), butwe don't sink the store. We need partially dead store sinking.[LOAD PRE CRIT EDGE SPLITTING]GCC PR37166: Sinking of loads prevents SROA'ing the "g" struct on the stackleading to excess stack traffic. This could be handled by GVN with some crazysymbolic phi translation. The code we get looks like (g is on the stack):bb2: ; preds = %bb1.. %9 = getelementptr %struct.f* %g, i32 0, i32 0 store i32 %8, i32* %9, align bel %bb3bb3: ; preds = %bb1, %bb2, %bb %c_addr.0 = phi %struct.f* [ %g, %bb2 ], [ %c, %bb ], [ %c, %bb1 ] %b_addr.0 = phi %struct.f* [ %b, %bb2 ], [ %g, %bb ], [ %b, %bb1 ] %10 = getelementptr %struct.f* %c_addr.0, i32 0, i32 0 %11 = load i32* %10, align 4%11 is partially redundant, an in BB2 it should have the value %8.GCC PR33344 and PR35287 are similar cases.[LOAD PRE]There are many load PRE testcases in testsuite/gcc.dg/tree-ssa/loadpre* in theGCC testsuite, ones we don't get yet are (checked through loadpre25):[CRIT EDGE BREAKING]predcom-4.c[PRE OF READONLY CALL]loadpre5.c[TURN SELECT INTO BRANCH]loadpre14.c loadpre15.c actually a conditional increment: loadpre18.c loadpre19.c[LOAD PRE / STORE SINKING / SPEC HACK]This is a chunk of code from 456.hmmer:int f(int M, int *mc, int *mpp, int *tpmm, int *ip, int *tpim, int *dpp, int *tpdm, int xmb, int *bp, int *ms) { int k, sc; for (k = 1; k <= M; k++) { mc[k] = mpp[k-1] + tpmm[k-1]; if ((sc = ip[k-1] + tpim[k-1]) > mc[k]) mc[k] = sc; if ((sc = dpp[k-1] + tpdm[k-1]) > mc[k]) mc[k] = sc; if ((sc = xmb + bp[k]) > mc[k]) mc[k] = sc; mc[k] += ms[k]; }}It is very profitable for this benchmark to turn the conditional stores to mc[k]into a conditional move (select instr in IR) and allow the final store to do thestore. See GCC PR27313 for more details. Note that this is valid to xform evenwith the new C++ memory model, since mc[k] is previously loaded and laterstored.[SCALAR PRE]There are many PRE testcases in testsuite/gcc.dg/tree-ssa/ssa-pre-*.c in theGCC testsuite.There are some interesting cases in testsuite/gcc.dg/tree-ssa/pred-comm* in theGCC testsuite. 
For example, we get the first example in predcom-1.c, but miss the second one:unsigned fib[1000];unsigned avg[1000];__attribute__ ((noinline))void count_averages(int n) { int i; for (i = 1; i < n; i++) avg[i] = (((unsigned long) fib[i - 1] + fib[i] + fib[i + 1]) / 3) & 0xffff;}which compiles into two loads instead of one in the loop.predcom-2.c is the same as predcom-1.cpredcom-3.c is very similar but needs loads feeding each other instead ofstore->load.[ALIAS ANALYSIS]Type based alias analysis:http:We should do better analysis of posix_memalign. At the least it shouldno-capture its pointer argument, at best, we should know that the out-valueresult doesn't point to anything (like malloc). One example of this is inSingleSource/Benchmarks/Misc/dt.cInteresting missed case because of control flow flattening (should be 2 loads):http:With: llvm-gcc t2.c -S -o - -O0 -emit-llvm | llvm-as | opt -mem2reg -gvn -instcombine | llvm-diswe miss it because we need 1) CRIT EDGE 2) MULTIPLE DIFFERENTVALS PRODUCED BY ONE BLOCK OVER DIFFERENT PATHShttp:We could eliminate the branch condition here, loading from null is undefined:struct S { int w, x, y, z; };struct T { int r; struct S s; };void bar (struct S, int);void foo (int a, struct T b){ struct S *c = 0; if (a) c = &b.s; bar (*c, a);}simplifylibcalls should do several optimizations for strspn/strcspn:strcspn(x, "a") -> inlined loop for up to | letters (similarly for strspn) |
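An added sketch (not from the README) of the store-sunk form of f() from the
[STORE SINKING] note above; it assumes cond and res don't alias, which is
what makes the sinking legal:

void f_sunk(int n, int *cond, int *res) {
  int tmp = 0;
  int i;
  for (i = 0; i < n; i++)
    if (*cond)
      tmp ^= 234;
  *res = tmp;  /* the partially dead store, sunk out of the loop */
}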
This should turn into a switch on the character.  See PR3253 for some notes
on codegen.

hmmer apparently uses strcspn and strspn a lot.  omnetpp uses strspn.

simplifylibcalls should turn these snprintf idioms into memcpy (GCC PR47917):

char buf1[6], buf2[6], buf3[4], buf4[4];

int foo (void) {
  int ret = snprintf (buf1, sizeof buf1, "abcde");
  ...
  return ret;
}
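An added sketch (not from the README) of the memcpy form for the first
snprintf above: the format is a string literal that fits buf1, so both the
copy and the return value are known at compile time:

#include <string.h>

extern char buf1[6];

int foo_first_lowered(void) {
  memcpy(buf1, "abcde", 6);  /* 5 chars plus the NUL fit exactly */
  return 5;                  /* snprintf returns strlen("abcde") */
}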
instcombine should handle this transform:

   icmp pred (sdiv X / C1), C2

when X, C1, and C2 are unsigned.  Similarly for udiv and signed operands.

Currently InstCombine avoids this transform but will do it when the signs of
the operands and the sign of the divide match.  See the FIXME in
InstructionCombining.cpp in the visitSetCondInst method after the switch
case for Instruction::UDiv (around line 4447) for more details.

The SingleSource/Benchmarks/Shootout-C++/hash and hash2 tests have examples
of this construct.

[LOOP OPTIMIZATION]

SingleSource/Benchmarks/Misc/dt.c shows several interesting optimization
opportunities in its double_array_divs_variable function.
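An added illustration (not from the README) of the icmp-of-div fold in C
terms, for the unsigned case; both functions compute the same predicate:

unsigned cmp_div(unsigned x)   { return x / 10 == 5; }
unsigned cmp_range(unsigned x) { return x - 50u < 10u; }  /* same result */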
Add support for conditional increments, and other related patterns.  Instead
of:

	movl 136(%esp), %eax
	cmpl $0, %eax
	je LBB16_2	#cond_next
LBB16_1:	#cond_true
	incl _foo
LBB16_2:	#cond_next

emit:

	movl	_foo, %eax
	cmpl	$1, %edi
	sbbl	$-1, %eax
	movl	%eax, _foo
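An added sketch (not from the README) of the branchless shape the
conditional increment should get:

int cond_inc(int foo, int x) {
  return foo + (x != 0);  /* a compare plus setcc/sbb, no branch */
}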
We should investigate an instruction sinking pass.  Consider this silly
example in pic mode:

#include <assert.h>
void foo(int x) {
  assert(x);
  //...
}

we compile this to:

_foo:
	subl	$28, %esp
	call	"L1$pb"
"L1$pb":
	popl	%eax
	cmpl	$0, 32(%esp)
	je	LBB1_2	# cond_true
LBB1_1:	# return
	# ...
	addl	$28, %esp
	ret
LBB1_2:	# cond_true
	...

The PIC base computation (call+popl) is only used on one path through the
code, but is currently always computed in the entry block.  It would be
better to sink the picbase computation down into the block for the
assertion, as it is the only one that uses it.  This happens for a lot of
code with early outs.

Another example is loads of arguments, which are usually emitted into the
entry block on targets like x86.  If not used in all paths through a
function, they should be sunk into the ones that do.  In this case,
whole-function isel would also handle this.

Investigate lowering of sparse switch statements into perfect hash tables:
http://burtleburtle.net/bob/hash/perfect.html
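An added illustration (not from the README) of the early-out pattern the
note describes; the printf path is the only one that needs the entry-block
setup, so that work should be sunk into it:

#include <stdio.h>

void check(int ok, const int *detail) {
  if (ok)
    return;  /* common path: picbase/argument loads are wasted here */
  printf("check failed: %d\n", *detail);  /* sink the setup into this block */
}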
Clang compiles this:

  int result;
  long long input[8] = {1, 0, 1, 0, 1, 0, 1, 0};
  ...

into:

  call void @llvm.memset.p0i8.i64(i8* %tmp, i8 0, i64 64, i32 16, i1 false)
  %0 = getelementptr [8 x i64]* %input, i64 0, i64 0
  store i64 1, i64* %0, align 16
  %1 = getelementptr [8 x i64]* %input, i64 0, i64 2
  store i64 1, i64* %1, align 16
  %2 = getelementptr [8 x i64]* %input, i64 0, i64 4
  store i64 1, i64* %2, align 16
  %3 = getelementptr [8 x i64]* %input, i64 0, i64 6
  store i64 1, i64* %3, align 16

which gets codegen'd into:

	...
	movaps	%xmm0, ...(%rbp)
	movaps	%xmm0, ...(%rbp)
	movaps	%xmm0, ...(%rbp)
	movaps	%xmm0, ...(%rbp)
	movq	$1, ...(%rbp)
	movq	$1, ...(%rbp)
	movq	$1, ...(%rbp)
	movq	$1, ...(%rbp)

It would be better to have movq's of 0 instead of the movaps's.
http:
... the first function produces better code on X86.

From GCC Bug ...
... we therefore end up with:

	llgh	%r3, ...
	lr	%r0, ...
	br	%r14

but truncating the load would give:

	lh	%r3, ...
	br	%r14
we compile this esp call L1 $pb L1 esp je LBB1_2 esp ret but is currently always computed in the entry block It would be better to sink the picbase computation down into the block for the assertion |
Definition at line 422 of file README.txt.
This requires reassociating to forms of expressions that are already available |
Definition at line 92 of file README.txt.
Add support for conditional and other related patterns Instead eax eax je LBB16_2 eax edi eax movl _foo b |
Definition at line 8 of file README.txt.
int b0 |
int b1 |
Definition at line 84 of file README.txt.
Referenced by f(), and llvm::findMaximalSubpartOfIllFormedUTF8Sequence().
int b2 |
Definition at line 84 of file README.txt.
Referenced by EmitUnwindCode(), f(), and llvm::findMaximalSubpartOfIllFormedUTF8Sequence().
int b3 |
Definition at line 84 of file README.txt.
Referenced by f(), and llvm::findMaximalSubpartOfIllFormedUTF8Sequence().
int b4 |
Definition at line 84 of file README.txt.
Referenced by f().
Note that only the low bits of effective_addr2 are used On bit we don t eliminate the computation of the top half of effective_addr2 because we don t have whole function selection dags On this means we use one extra register for the function when effective_addr2 is declared as U64 than when it is declared U32 PHI Slicing could be extended to do this Tail call elim should be more checking to see if the call is followed by an uncond branch to an exit block |
Definition at line 329 of file README.txt.
This should turn into a switch on the character See PR3253 for some notes on codegen hmmer apparently uses strcspn and strspn a lot omnetpp uses strspn simplifylibcalls should turn these snprintf idioms into buf2[6] |
Definition at line 1253 of file README.txt.
This should turn into a switch on the character See PR3253 for some notes on codegen hmmer apparently uses strcspn and strspn a lot omnetpp uses strspn simplifylibcalls should turn these snprintf idioms into buf3[4] |
Definition at line 1253 of file README.txt.
This should turn into a switch on the character See PR3253 for some notes on codegen hmmer apparently uses strcspn and strspn a lot omnetpp uses strspn simplifylibcalls should turn these snprintf idioms into buf4[4] |
Definition at line 1253 of file README.txt.
The same transformation can work with an even modulo with the addition of a and shrink the compare RHS by the same amount Unless the target supports that transformation probably isn t worthwhile The transformation can also easily be made to work with non zero equality for the first function produces better code on X86 From GCC Bug |
Definition at line 763 of file README.txt.
This is blocked on not handling X* X* X which is the same number of multiplies and is because the* X has multiple uses Here s a simple X1* C = mul i32 %B |
Definition at line 75 of file README.txt.
Referenced by zero().
instcombine should handle this C2 when C1 |
Definition at line 263 of file README.txt.
Referenced by AddCombineBUILD_VECTORToVPADDL(), adjustForFNeg(), adjustForLTGFR(), llvm::AliasSet::aliasesUnknownInst(), llvm::ScalarEvolution::applyLoopGuards(), areInverseVectorBitmasks(), llvm::BinaryConstantExpr::BinaryConstantExpr(), checkForNegativeOperand(), CombineANDShift(), combineShiftOfShiftedLogic(), llvm::InstCombinerImpl::commonIDivTransforms(), llvm::InstCombinerImpl::commonShiftTransforms(), llvm::ScalarEvolution::computeConstantDifference(), llvm::ConstantFoldBinaryInstruction(), llvm::ConstantFoldBinOp(), llvm::ConstantFoldCompareInstOperands(), llvm::ConstantFoldCompareInstruction(), llvm::ConstantFoldFPBinOp(), detectUSatPattern(), llvm::ExtractElementConstantExpr::ExtractElementConstantExpr(), foldAddSubBoolOfMaskedVal(), llvm::InstCombinerImpl::foldBinopWithPhiOperands(), foldClampRangeOfTwo(), llvm::InstCombinerImpl::foldCmpLoadFromIndexedGlobal(), llvm::SelectionDAG::FoldConstantArithmetic(), llvm::SelectionDAG::foldConstantFPMath(), llvm::InstCombinerImpl::foldICmpAndConstConst(), llvm::InstCombinerImpl::foldICmpAndShift(), llvm::InstCombinerImpl::foldICmpEquality(), llvm::InstCombinerImpl::foldICmpUsingKnownBits(), foldICmpWithTruncSignExtendedVal(), foldLogOpOfMaskedICmps_NotAllZeros_BMask_Mixed(), foldNoWrapAdd(), foldSelectICmpAndOr(), foldSelectOfConstantsUsingSra(), llvm::InstCombinerImpl::foldSelectShuffle(), llvm::SelectionDAG::FoldSetCC(), foldSetCCWithFunnelShift(), foldSetCCWithRotate(), llvm::InstCombinerImpl::FoldShiftByConstant(), foldShiftedShift(), foldShiftOfShiftedLogic(), llvm::ConstantFolder::FoldShuffleVector(), llvm::TargetFolder::FoldShuffleVector(), FoldValue(), llvm::InstCombinerImpl::foldVariableSignZeroExtensionOfVariableHighBitExtract(), gcd(), llvm::ConstantExpr::get(), llvm::ConstantExpr::getAdd(), llvm::ScalarEvolution::getAddExpr(), llvm::ConstantExpr::getAnd(), llvm::ConstantExpr::getAShr(), llvm::ConstantExpr::getCompare(), llvm::ConstantExpr::getExactAShr(), llvm::ConstantExpr::getExactLShr(), llvm::ConstantExpr::getExactSDiv(), llvm::ConstantExpr::getExactUDiv(), llvm::ConstantExpr::getFAdd(), llvm::ConstantExpr::getFDiv(), llvm::ConstantExpr::getFMul(), llvm::ConstantExpr::getFRem(), llvm::ConstantExpr::getFSub(), getKnownUndefForVectorBinop(), llvm::ConstantExpr::getLShr(), llvm::ConstantExpr::getMul(), llvm::ConstantExpr::getNSWAdd(), llvm::ConstantExpr::getNSWMul(), llvm::ConstantExpr::getNSWShl(), llvm::ConstantExpr::getNSWSub(), llvm::ConstantExpr::getNUWAdd(), llvm::ConstantExpr::getNUWMul(), llvm::ConstantExpr::getNUWShl(), llvm::ConstantExpr::getNUWSub(), llvm::ConstantExpr::getOr(), llvm::ConstantExpr::getSDiv(), llvm::slpvectorizer::BoUpSLP::LookAheadHeuristics::getShallowScore(), llvm::ConstantExpr::getShl(), llvm::ConstantExpr::getSRem(), llvm::ConstantExpr::getSub(), llvm::ConstantExpr::getUDiv(), llvm::ConstantExpr::getUMin(), llvm::ConstantExpr::getURem(), llvm::ConstantExpr::getXor(), llvm::InsertElementConstantExpr::InsertElementConstantExpr(), llvm::X86TTIImpl::instCombineIntrinsic(), llvm::GCNTTIImpl::instCombineIntrinsic(), llvm::RISCVTargetLowering::isDesirableToCommuteWithShift(), llvm::Constant::isElementWiseEqual(), isImpliedCondMatchingImmOperands(), llvm::SystemZTTIImpl::isLSRCostLess(), llvm::PPCTTIImpl::isLSRCostLess(), llvm::TargetTransformInfoImplBase::isLSRCostLess(), llvm::X86TTIImpl::isLSRCostLess(), llvm::BasicTTIImplBase< AMDGPUTTIImpl >::isLSRCostLess(), llvm::TargetTransformInfo::isLSRCostLess(), llvm::RISCVTargetLowering::isMulAddWithConstProfitable(), 
llvm::AArch64TargetLowering::isMulAddWithConstProfitable(), isMultiple(), isNonEqualPHIs(), isSaturatingMinMax(), isStrictSubset(), isSubset(), llvm::AMDGPULegalizerInfo::legalizeFDIVFastIntrin(), llvm::AMDGPULegalizerInfo::legalizeFrint(), llvm::AMDGPULegalizerInfo::legalizeUnsignedDIV_REM64Impl(), llvm::AMDGPUTargetLowering::LowerFRINT(), LowerUINT_TO_FP_i64(), LowerVSETCC(), matchClamp(), matchMinMax(), llvm::CombinerHelper::matchOverlappingAnd(), llvm::CombinerHelper::matchReassocFoldConstantsInSubTree(), llvm::CombinerHelper::matchShiftOfShiftedLogic(), moveAddAfterMinMax(), multiplyOverflows(), MulWillOverflow(), llvm::HexagonTargetLowering::PerformDAGCombine(), PerformUMinFpToSatCombine(), reassociateMinMaxWithConstants(), llvm::SelectConstantExpr::SelectConstantExpr(), llvm::HexagonDAGToDAGISel::SelectSHL(), llvm::RISCVDAGToDAGISel::selectSHXADDOp(), llvm::AArch64TargetLowering::shouldFoldConstantShiftPairToMask(), llvm::ShuffleVectorConstantExpr::ShuffleVectorConstantExpr(), llvm::InstCombinerImpl::SimplifyAddWithRemainder(), simplifyAndOfICmpsWithAdd(), simplifyAndOrOfICmpsWithConstants(), simplifyAssocCastAssoc(), llvm::InstCombinerImpl::SimplifyAssociativeOrCommutative(), simplifyBinaryIntrinsic(), llvm::TargetLowering::SimplifyDemandedBits(), llvm::X86TargetLowering::SimplifyDemandedVectorEltsForTargetNode(), simplifyDiv(), simplifyICmpWithBinOpOnLHS(), simplifyLogicOfAddSub(), simplifyOrInst(), simplifyOrOfICmpsWithAdd(), llvm::TargetLowering::SimplifySetCC(), simplifySetCCWithCTPOP(), SolveQuadraticAddRecRange(), llvm::AMDGPURegisterBankInfo::splitBufferOffsets(), transformAddImmMulImm(), transformAddShlImm(), tryLowerToSLI(), trySimplifyICmpWithAdds(), UpgradeARMIntrinsicCall(), ValuesOverlap(), llvm::InstCombinerImpl::visitAdd(), llvm::InstCombinerImpl::visitAnd(), llvm::InstCombinerImpl::visitCallInst(), llvm::InstCombinerImpl::visitFMul(), llvm::InstCombinerImpl::visitLShr(), llvm::InstCombinerImpl::visitMul(), llvm::InstCombinerImpl::visitOr(), llvm::InstCombinerImpl::visitShl(), llvm::InstCombinerImpl::visitUDiv(), and llvm::InstCombinerImpl::visitXor().
Definition at line 51 of file README.txt.
Definition at line 70 of file README.txt.
we compile this esp call L1 $pb L1 esp je LBB1_2 esp ret but is currently always computed in the entry block It would be better to sink the picbase computation down into the block for the as it is the only one that uses it This happens for a lot of code with early outs Another example is loads of which are usually emitted into the entry block on targets like x86 If not used in all paths through a they should be sunk into the ones that do In this case |
Definition at line 429 of file README.txt.
Definition at line 103 of file README.txt.
Definition at line 238 of file README.txt.
Referenced by isSignExtendingOpW().
Definition at line 135 of file README.txt.
instruction into the terminating blocks because there was other code |
Definition at line 331 of file README.txt.
Add support for conditional and other related patterns Instead eax eax je LBB16_2 eax edi eax movl _foo Combine |
Definition at line 149 of file README.txt.
The same transformation can work with an even modulo with the addition of a and shrink the compare RHS by the same amount Unless the target supports that transformation probably isn t worthwhile The transformation can also easily be made to work with non zero equality comparisons |
Definition at line 685 of file README.txt.
bool Control |
Definition at line 468 of file README.txt.
Unrolling by would eliminate the& in both copies |
Definition at line 561 of file README.txt.
Definition at line 103 of file README.txt.
currently compiles eax eax je LBB0_3 testl eax |
Definition at line 145 of file README.txt.
Referenced by get_cpu_features().
The same transformation can work with an even modulo with the addition of a and shrink the compare RHS by the same amount Unless the target supports that transformation probably isn t worthwhile The transformation can also easily be made to work with non zero equality for example |
Definition at line 74 of file README.txt.
Definition at line 505 of file README.txt.
Definition at line 54 of file README.txt.
Definition at line 671 of file README.txt.
Definition at line 383 of file README.txt.
Definition at line 104 of file README.txt.
…then: ret i32 %result. Tail recursion elimination should handle accumulator recursion such as pow2m1 above; multiplies can be turned into shifts, so they should be handled as if they were associative.
Definition at line 355 of file README.txt.
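The pow2m1 recursion referenced earlier in this README, with the accumulator loop that tail recursion elimination could produce once the *2 is treated as an associative shift (the loop form is a sketch, not the pass's literal output):

int pow2m1(int n) {
  if (n == 0)
    return 0;
  return 2 * pow2m1(n - 1) + 1;
}

int pow2m1_tre(int n) {
  int acc = 0;
  for (; n != 0; n--)
    acc = 2 * acc + 1;   /* replays the deferred *2+1 steps as a loop */
  return acc;
}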
Definition at line 332 of file README.txt.
Clang compiles this (IR elided: four aligned 'store i64 0' instructions to adjacent stack slots), which gets codegen'd into code that zeroes %xmm0 and issues a series of movaps stores through %rbp. It would be better to have movq's of 0 instead of the movaps's.
Definition at line 532 of file README.txt.
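A C-level sketch of the pattern (hypothetical type and names): zero-initializing four adjacent i64 slots. Direct movq $0 immediate stores would avoid zeroing an XMM register just to store it.

struct s { long long a, b, c, d; };

void zero(struct s *p) {
  /* four 'store i64 0' in IR; currently codegen'd by zeroing %xmm0
     and issuing movaps stores instead of direct movq $0 stores */
  p->a = 0;
  p->b = 0;
  p->c = 0;
  p->d = 0;
}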
int i |
Definition at line 29 of file README.txt.
Clang compiles this (i32).
Definition at line 504 of file README.txt.
Clang compiles this (i8).
Definition at line 504 of file README.txt.
Definition at line 131 of file README.txt.
Clang compiles this into four aligned "store i64 0" instructions, which get codegen'd as movaps copies of a zeroed xmm0 into stack slots off rbp. It would be better to have movq stores of 0 instead of the movaps's.
Definition at line 536 of file README.txt.
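A minimal sketch of the kind of source behind this entry (the struct and
function names are hypothetical, not the original testcase): zeroing a
32-byte aggregate produces four i64 stores of zero, which codegen then
widens into movaps spills of a zeroed xmm register.

  /* Hypothetical reduction: four 8-byte zero stores. */
  struct S { unsigned long long a[4]; };
  void zero_s(struct S *s) { *s = (struct S){0}; }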
into llvm.powi calls, allowing the code generator to produce balanced multiplication trees. First, the intrinsic needs to be extended to support integers.
Definition at line 54 of file README.txt.
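For illustration, a minimal sketch of the repeated-multiply shape this entry
describes (assuming a floating-point self-product; the exact testcase is in
README.txt):

  /* Could be canonicalized to llvm.powi(a, 6), which codegen could then
     lower as a balanced tree: t = a*a; t2 = t*t; result = t2*t. */
  double sixth(double a) { return a * a * a * a * a * a; }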
In fpstack mode, this compiles into:
Definition at line 504 of file README.txt.
This requires reassociating to forms of expressions that are already something that reassoc doesn't think about yet. These two functions should generate the same code on big-endian systems: int g(int *j, int *l) { return memcmp(j, l, 4); }
Definition at line 100 of file README.txt.
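A sketch of the pair of functions this entry compares; the second follows
the int h(int *j, int *l) signature listed in this file. On a big-endian
target both should reduce to the same word subtraction, but proving the
memcmp can be replaced requires TBAA:

  #include <string.h>
  int g(int *j, int *l) { return memcmp(j, l, 4); }
  int h(int *j, int *l) { return *j - *l; }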
Definition at line 142 of file README.txt.
Definition at line 420 of file README.txt.
where MAX_UNSIGNED and state are 32-bit ints. On a 64-bit platform it would be just so cool to turn it into something like:
Definition at line 48 of file README.txt.
THotKey m_HotKey;
Referenced by GetHotKey().
Definition at line 400 of file README.txt.
Clang compiles this into four aligned "store i64 0" instructions, which get codegen'd as movaps copies of a zeroed xmm0 into stack slots off rbp, where a movq of 0 would do.
Definition at line 521 of file README.txt.
The same transformation can work with an even modulo with the addition of a rotate: rotate the result of the multiply right by the number of bits that must be zero for the modulo to hold, and shrink the compare RHS by the same amount. Unless the target supports rotates, though, that transformation probably isn't worthwhile. The transformation can also easily be made to work with non-zero equality comparisons: for example, transform "n % 3 == 1" into "(n - 1) % 3 == 0".
Definition at line 685 of file README.txt.
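A sketch of the rewrite with concrete constants for divisor 3 on 32-bit
unsigned (the function name is invented): 2863311531 (0xAAAAAAAB) is the
multiplicative inverse of 3 mod 2^32, and 1431655766 is (2^32 - 1)/3 + 1.

  /* n % 3 == 0 via mullo + unsigned compare; no divide needed. */
  int divisible_by_3(unsigned n) {
    return n * 2863311531u < 1431655766u;
  }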
Definition at line 134 of file README.txt.
Target Independent Opportunities
Definition at line 8 of file README.txt.
Definition at line 606 of file README.txt.
Definition at line 607 of file README.txt.
This could be done in SelectionDAGISel along with the other special byte sizes. It would be nice to revert this patch:
Definition at line 104 of file README.txt.
Unrolling would eliminate the '&' in both, leading to a net reduction in code size. The resultant code would then also be suitable for exit value computation. We miss a bunch of rotate opportunities on various targets, including ppc.
Definition at line 567 of file README.txt.
Definition at line 37 of file README.txt.
This is blocked on not handling X*X*X, which is the same number of multiplies and is canonical because the X*X has multiple uses. Here's a simple example, a function of %X1 and %B ending in 'ret i32 %C'. Reassociate should handle the example in GCC PR16157.
Definition at line 84 of file README.txt.
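As a minimal sketch of why the rewrite stalls (names hypothetical): the
canonical form already exposes the square, so forming powi would not remove
any multiplies.

  double cube(double x) {
    double t = x * x;   /* canonical form keeps this explicit multiply */
    return t * x;       /* powi(x, 3) lowers to the same two multiplies */
  }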
Unrolling would eliminate the '&' in both, leading to a net reduction in code size. The resultant code would then also be suitable for exit value computation. We miss a bunch of rotate opportunities on various targets, including ppc, x86, etc. On X86, we miss a bunch of 'rotate by variable' cases because the rotate matching code in dag combine doesn't look through truncates aggressively enough. Here are some testcases reduced from GCC PR17886:
Definition at line 572 of file README.txt.
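A hedged reconstruction of the missed pattern, in the spirit of the GCC
PR17886 testcases (not the original source): a rotate-by-variable where the
shift amount passes through a truncate.

  /* Should be matched as a single 64-bit rotate-left. */
  unsigned long long rotl64(unsigned long long x, int n) {
    return (x << n) | (x >> (64 - n));   /* assumes 0 < n < 64 */
  }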
preds
Definition at line 340 of file README.txt.
into llvm.powi calls, allowing the code generator to produce balanced multiplication trees. First, the intrinsic needs to be extended to support integers, and second the code generator needs to be enhanced to lower these to multiplication trees. Interesting testcase for add/shift/mul reassoc; a hypothetical sketch follows below:
Definition at line 61 of file README.txt.
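A hypothetical testcase of that add/shift/mul flavor (invented for
illustration, not the README's original): treating the shift as the multiply
it is lets reassociation collapse everything into one multiply.

  /* (x << 2) is x*4, so the whole expression is x*9. */
  int addshiftmul(int x) { return (x << 2) + x * 4 + x; }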
Definition at line 355 of file README.txt.
<i32> br label %return; return:
Definition at line 242 of file README.txt.
Definition at line 680 of file README.txt.
The same transformation can work with an even modulo with the addition of a rotate; unless the target supports rotates, though, it probably isn't worthwhile.
Definition at line 681 of file README.txt.
Definition at line 370 of file README.txt.
Definition at line 144 of file README.txt.
bool Shift;
Definition at line 468 of file README.txt.
Definition at line 604 of file README.txt.
Definition at line 605 of file README.txt.
i < reg->size
Definition at line 166 of file README.txt.
Definition at line 100 of file README.txt.
We compile this into code that materializes the picbase in the entry block (call "L1$pb"; "L1$pb": pop) before the compare and 'je LBB1_2', even on the path that just returns. The picbase is currently always computed in the entry block; it would be better to sink the picbase computation down into the block that uses it, as it is the only one that does. This happens for a lot of code with early outs. Another example is loads of arguments, which are usually emitted into the entry block on targets like x86; if not used in all paths through a function, they should be sunk into the ones that do. In this case, whole-function isel would also handle this. Investigate lowering of sparse switch statements into perfect hash tables.
Definition at line 439 of file README.txt.
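A minimal early-out sketch of the picbase problem (names hypothetical): on
i386 PIC, the picbase call/pop lands in the entry block even though only one
successor needs it.

  extern int G;             /* PIC-addressed global */
  int f(int x) {
    if (x == 0) return 0;   /* early out: no picbase needed here */
    return G;               /* the only user of the picbase */
  }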
Unrolling would eliminate the '&' in both, leading to a net reduction in code size. The resultant code would then also be suitable for exit value computation. We miss a bunch of rotate opportunities on various targets.
Definition at line 567 of file README.txt.
Definition at line 338 of file README.txt.
Multiplies can be turned into SHLs, so they should be handled as if they were associative; 'return' code like this:
Definition at line 378 of file README.txt.
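A small sketch of the SHL-as-multiply point (an assumed example, not the
README's): two shifts compose the same way the equivalent multiplies do.

  /* (x*2)*4 == x*8, so the pair should fold to a single shift. */
  int mul8(int x) { return (x << 1) << 2; }   /* = x << 3 */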
Definition at line 337 of file README.txt.
We compile this to:
Definition at line 406 of file README.txt.
bool LoopInterchangeTransform::transform
Definition at line 262 of file README.txt.
instcombine should handle this transform: icmp pred (sdiv X, C1), C2, when X, C1, and C2 are unsigned. Similarly for udiv and signed operands. Currently InstCombine avoids this transform but will do it when the signs of the operands and the sign of the divide match. See the FIXME in InstructionCombining.cpp in the visitSetCondInst method after the switch case for Instruction::UDiv (around line 4447) for more details. The SingleSource/Benchmarks/Shootout-C++/hash and hash2 tests have examples of this construct. [LOOP OPTIMIZATION] SingleSource/Benchmarks/Misc/dt.c shows several interesting optimization opportunities in its double_array_divs_variable function. typedef unsigned long long U64;
Definition at line 268 of file README.txt.
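A sketch of the missing fold with invented constants: for unsigned x, the
divide-and-compare is exactly a range check, which avoids the divide.

  int q(unsigned x) { return x / 10 == 5; }          /* divide */
  int r(unsigned x) { return x >= 50 && x < 60; }    /* same predicate */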
instcombine should handle the transform above when X, C1, and C2 are unsigned.
Definition at line 263 of file README.txt.
Definition at line 767 of file README.txt.
Definition at line 318 of file README.txt.
Unrolling would eliminate the '&' in both, leading to a net reduction in code size. The resultant code would then also be suitable for exit value computation. We miss a bunch of rotate opportunities on various targets, including x86, etc.
Definition at line 568 of file README.txt.
gets compiled into this: a run of movaps stores of xmm0 into stack slots off rsp.
Definition at line 517 of file README.txt.
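One hypothetical way to reproduce such a run of movaps stores through rsp
(struct and names invented for illustration): copying a large struct by
value.

  struct Big { double v[16]; };            /* 128 bytes */
  void sink(struct Big);
  void copy(struct Big *p) { sink(*p); }   /* by-value copy */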
The same transformation can work with an even modulo with the addition of a rotate, shrinking the compare RHS by the same amount; unless the target supports rotates, that transformation probably isn't worthwhile. The transformation can also easily be made to work with non-zero equality comparisons. The first function produces better code on X86. From GCC: int y;
Definition at line 61 of file README.txt.