LLVM 22.0.0git
MemorySanitizer.cpp
Go to the documentation of this file.
1//===- MemorySanitizer.cpp - detector of uninitialized reads --------------===//
2//
3// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
4// See https://llvm.org/LICENSE.txt for license information.
5// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
6//
7//===----------------------------------------------------------------------===//
8//
9/// \file
10/// This file is a part of MemorySanitizer, a detector of uninitialized
11/// reads.
12///
13/// The algorithm of the tool is similar to Memcheck
14/// (https://static.usenix.org/event/usenix05/tech/general/full_papers/seward/seward_html/usenix2005.html)
15/// We associate a few shadow bits with every byte of the application memory,
16/// poison the shadow of the malloc-ed or alloca-ed memory, load the shadow,
17/// bits on every memory read, propagate the shadow bits through some of the
18/// arithmetic instruction (including MOV), store the shadow bits on every
19/// memory write, report a bug on some other instructions (e.g. JMP) if the
20/// associated shadow is poisoned.
21///
22/// But there are differences too. The first and the major one:
23/// compiler instrumentation instead of binary instrumentation. This
24/// gives us much better register allocation, possible compiler
25/// optimizations and a fast start-up. But this brings the major issue
26/// as well: msan needs to see all program events, including system
27/// calls and reads/writes in system libraries, so we either need to
28/// compile *everything* with msan or use a binary translation
29/// component (e.g. DynamoRIO) to instrument pre-built libraries.
30/// Another difference from Memcheck is that we use 8 shadow bits per
31/// byte of application memory and use a direct shadow mapping. This
32/// greatly simplifies the instrumentation code and avoids races on
33/// shadow updates (Memcheck is single-threaded so races are not a
34/// concern there. Memcheck uses 2 shadow bits per byte with a slow
35/// path storage that uses 8 bits per byte).
36///
37/// The default value of shadow is 0, which means "clean" (not poisoned).
38///
39/// Every module initializer should call __msan_init to ensure that the
40/// shadow memory is ready. On error, __msan_warning is called. Since
41/// parameters and return values may be passed via registers, we have a
42/// specialized thread-local shadow for return values
43/// (__msan_retval_tls) and parameters (__msan_param_tls).
44///
45/// Origin tracking.
46///
47/// MemorySanitizer can track origins (allocation points) of all uninitialized
48/// values. This behavior is controlled with a flag (msan-track-origins) and is
49/// disabled by default.
50///
51/// Origins are 4-byte values created and interpreted by the runtime library.
52/// They are stored in a second shadow mapping, one 4-byte value for 4 bytes
53/// of application memory. Propagation of origins is basically a bunch of
54/// "select" instructions that pick the origin of a dirty argument, if an
55/// instruction has one.
56///
57/// Every 4 aligned, consecutive bytes of application memory have one origin
58/// value associated with them. If these bytes contain uninitialized data
59/// coming from 2 different allocations, the last store wins. Because of this,
60/// MemorySanitizer reports can show unrelated origins, but this is unlikely in
61/// practice.
62///
63/// Origins are meaningless for fully initialized values, so MemorySanitizer
64/// avoids storing origin to memory when a fully initialized value is stored.
65/// This way it avoids needless overwriting origin of the 4-byte region on
66/// a short (i.e. 1 byte) clean store, and it is also good for performance.
67///
68/// Atomic handling.
69///
70/// Ideally, every atomic store of application value should update the
71/// corresponding shadow location in an atomic way. Unfortunately, atomic store
72/// of two disjoint locations can not be done without severe slowdown.
73///
74/// Therefore, we implement an approximation that may err on the safe side.
75/// In this implementation, every atomically accessed location in the program
76/// may only change from (partially) uninitialized to fully initialized, but
77/// not the other way around. We load the shadow _after_ the application load,
78/// and we store the shadow _before_ the app store. Also, we always store clean
79/// shadow (if the application store is atomic). This way, if the store-load
80/// pair constitutes a happens-before arc, shadow store and load are correctly
81/// ordered such that the load will get either the value that was stored, or
82/// some later value (which is always clean).
83///
84/// This does not work very well with Compare-And-Swap (CAS) and
85/// Read-Modify-Write (RMW) operations. To follow the above logic, CAS and RMW
86/// must store the new shadow before the app operation, and load the shadow
87/// after the app operation. Computers don't work this way. Current
88/// implementation ignores the load aspect of CAS/RMW, always returning a clean
89/// value. It implements the store part as a simple atomic store by storing a
90/// clean shadow.
91///
92/// Instrumenting inline assembly.
93///
94/// For inline assembly code LLVM has little idea about which memory locations
95/// become initialized depending on the arguments. It can be possible to figure
96/// out which arguments are meant to point to inputs and outputs, but the
97/// actual semantics can be only visible at runtime. In the Linux kernel it's
98/// also possible that the arguments only indicate the offset for a base taken
99/// from a segment register, so it's dangerous to treat any asm() arguments as
100/// pointers. We take a conservative approach generating calls to
101/// __msan_instrument_asm_store(ptr, size)
102/// , which defer the memory unpoisoning to the runtime library.
103/// The latter can perform more complex address checks to figure out whether
104/// it's safe to touch the shadow memory.
105/// Like with atomic operations, we call __msan_instrument_asm_store() before
106/// the assembly call, so that changes to the shadow memory will be seen by
107/// other threads together with main memory initialization.
108///
109/// KernelMemorySanitizer (KMSAN) implementation.
110///
111/// The major differences between KMSAN and MSan instrumentation are:
112/// - KMSAN always tracks the origins and implies msan-keep-going=true;
113/// - KMSAN allocates shadow and origin memory for each page separately, so
114/// there are no explicit accesses to shadow and origin in the
115/// instrumentation.
116/// Shadow and origin values for a particular X-byte memory location
117/// (X=1,2,4,8) are accessed through pointers obtained via the
118/// __msan_metadata_ptr_for_load_X(ptr)
119/// __msan_metadata_ptr_for_store_X(ptr)
120/// functions. The corresponding functions check that the X-byte accesses
121/// are possible and returns the pointers to shadow and origin memory.
122/// Arbitrary sized accesses are handled with:
123/// __msan_metadata_ptr_for_load_n(ptr, size)
124/// __msan_metadata_ptr_for_store_n(ptr, size);
125/// Note that the sanitizer code has to deal with how shadow/origin pairs
126/// returned by the these functions are represented in different ABIs. In
127/// the X86_64 ABI they are returned in RDX:RAX, in PowerPC64 they are
128/// returned in r3 and r4, and in the SystemZ ABI they are written to memory
129/// pointed to by a hidden parameter.
130/// - TLS variables are stored in a single per-task struct. A call to a
131/// function __msan_get_context_state() returning a pointer to that struct
132/// is inserted into every instrumented function before the entry block;
133/// - __msan_warning() takes a 32-bit origin parameter;
134/// - local variables are poisoned with __msan_poison_alloca() upon function
135/// entry and unpoisoned with __msan_unpoison_alloca() before leaving the
136/// function;
137/// - the pass doesn't declare any global variables or add global constructors
138/// to the translation unit.
139///
140/// Also, KMSAN currently ignores uninitialized memory passed into inline asm
141/// calls, making sure we're on the safe side wrt. possible false positives.
142///
143/// KernelMemorySanitizer only supports X86_64, SystemZ and PowerPC64 at the
144/// moment.
145///
146//
147// FIXME: This sanitizer does not yet handle scalable vectors
148//
149//===----------------------------------------------------------------------===//
150
152#include "llvm/ADT/APInt.h"
153#include "llvm/ADT/ArrayRef.h"
154#include "llvm/ADT/DenseMap.h"
156#include "llvm/ADT/SetVector.h"
157#include "llvm/ADT/SmallPtrSet.h"
158#include "llvm/ADT/SmallVector.h"
160#include "llvm/ADT/StringRef.h"
164#include "llvm/IR/Argument.h"
166#include "llvm/IR/Attributes.h"
167#include "llvm/IR/BasicBlock.h"
168#include "llvm/IR/CallingConv.h"
169#include "llvm/IR/Constant.h"
170#include "llvm/IR/Constants.h"
171#include "llvm/IR/DataLayout.h"
172#include "llvm/IR/DerivedTypes.h"
173#include "llvm/IR/Function.h"
174#include "llvm/IR/GlobalValue.h"
176#include "llvm/IR/IRBuilder.h"
177#include "llvm/IR/InlineAsm.h"
178#include "llvm/IR/InstVisitor.h"
179#include "llvm/IR/InstrTypes.h"
180#include "llvm/IR/Instruction.h"
181#include "llvm/IR/Instructions.h"
183#include "llvm/IR/Intrinsics.h"
184#include "llvm/IR/IntrinsicsAArch64.h"
185#include "llvm/IR/IntrinsicsX86.h"
186#include "llvm/IR/MDBuilder.h"
187#include "llvm/IR/Module.h"
188#include "llvm/IR/Type.h"
189#include "llvm/IR/Value.h"
190#include "llvm/IR/ValueMap.h"
193#include "llvm/Support/Casting.h"
195#include "llvm/Support/Debug.h"
205#include <algorithm>
206#include <cassert>
207#include <cstddef>
208#include <cstdint>
209#include <memory>
210#include <numeric>
211#include <string>
212#include <tuple>
213
214using namespace llvm;
215
216#define DEBUG_TYPE "msan"
217
218DEBUG_COUNTER(DebugInsertCheck, "msan-insert-check",
219 "Controls which checks to insert");
220
221DEBUG_COUNTER(DebugInstrumentInstruction, "msan-instrument-instruction",
222 "Controls which instruction to instrument");
223
224static const unsigned kOriginSize = 4;
227
228// These constants must be kept in sync with the ones in msan.h.
229// TODO: increase size to match SVE/SVE2/SME/SME2 limits
230static const unsigned kParamTLSSize = 800;
231static const unsigned kRetvalTLSSize = 800;
232
233// Accesses sizes are powers of two: 1, 2, 4, 8.
234static const size_t kNumberOfAccessSizes = 4;
235
236/// Track origins of uninitialized values.
237///
238/// Adds a section to MemorySanitizer report that points to the allocation
239/// (stack or heap) the uninitialized bits came from originally.
241 "msan-track-origins",
242 cl::desc("Track origins (allocation sites) of poisoned memory"), cl::Hidden,
243 cl::init(0));
244
245static cl::opt<bool> ClKeepGoing("msan-keep-going",
246 cl::desc("keep going after reporting a UMR"),
247 cl::Hidden, cl::init(false));
248
249static cl::opt<bool>
250 ClPoisonStack("msan-poison-stack",
251 cl::desc("poison uninitialized stack variables"), cl::Hidden,
252 cl::init(true));
253
255 "msan-poison-stack-with-call",
256 cl::desc("poison uninitialized stack variables with a call"), cl::Hidden,
257 cl::init(false));
258
260 "msan-poison-stack-pattern",
261 cl::desc("poison uninitialized stack variables with the given pattern"),
262 cl::Hidden, cl::init(0xff));
263
264static cl::opt<bool>
265 ClPrintStackNames("msan-print-stack-names",
266 cl::desc("Print name of local stack variable"),
267 cl::Hidden, cl::init(true));
268
269static cl::opt<bool>
270 ClPoisonUndef("msan-poison-undef",
271 cl::desc("Poison fully undef temporary values. "
272 "Partially undefined constant vectors "
273 "are unaffected by this flag (see "
274 "-msan-poison-undef-vectors)."),
275 cl::Hidden, cl::init(true));
276
278 "msan-poison-undef-vectors",
279 cl::desc("Precisely poison partially undefined constant vectors. "
280 "If false (legacy behavior), the entire vector is "
281 "considered fully initialized, which may lead to false "
282 "negatives. Fully undefined constant vectors are "
283 "unaffected by this flag (see -msan-poison-undef)."),
284 cl::Hidden, cl::init(false));
285
287 "msan-precise-disjoint-or",
288 cl::desc("Precisely poison disjoint OR. If false (legacy behavior), "
289 "disjointedness is ignored (i.e., 1|1 is initialized)."),
290 cl::Hidden, cl::init(false));
291
292static cl::opt<bool>
293 ClHandleICmp("msan-handle-icmp",
294 cl::desc("propagate shadow through ICmpEQ and ICmpNE"),
295 cl::Hidden, cl::init(true));
296
297static cl::opt<bool>
298 ClHandleICmpExact("msan-handle-icmp-exact",
299 cl::desc("exact handling of relational integer ICmp"),
300 cl::Hidden, cl::init(true));
301
303 "msan-handle-lifetime-intrinsics",
304 cl::desc(
305 "when possible, poison scoped variables at the beginning of the scope "
306 "(slower, but more precise)"),
307 cl::Hidden, cl::init(true));
308
309// When compiling the Linux kernel, we sometimes see false positives related to
310// MSan being unable to understand that inline assembly calls may initialize
311// local variables.
312// This flag makes the compiler conservatively unpoison every memory location
313// passed into an assembly call. Note that this may cause false positives.
314// Because it's impossible to figure out the array sizes, we can only unpoison
315// the first sizeof(type) bytes for each type* pointer.
317 "msan-handle-asm-conservative",
318 cl::desc("conservative handling of inline assembly"), cl::Hidden,
319 cl::init(true));
320
321// This flag controls whether we check the shadow of the address
322// operand of load or store. Such bugs are very rare, since load from
323// a garbage address typically results in SEGV, but still happen
324// (e.g. only lower bits of address are garbage, or the access happens
325// early at program startup where malloc-ed memory is more likely to
326// be zeroed. As of 2012-08-28 this flag adds 20% slowdown.
328 "msan-check-access-address",
329 cl::desc("report accesses through a pointer which has poisoned shadow"),
330 cl::Hidden, cl::init(true));
331
333 "msan-eager-checks",
334 cl::desc("check arguments and return values at function call boundaries"),
335 cl::Hidden, cl::init(false));
336
338 "msan-dump-strict-instructions",
339 cl::desc("print out instructions with default strict semantics i.e.,"
340 "check that all the inputs are fully initialized, and mark "
341 "the output as fully initialized. These semantics are applied "
342 "to instructions that could not be handled explicitly nor "
343 "heuristically."),
344 cl::Hidden, cl::init(false));
345
346// Currently, all the heuristically handled instructions are specifically
347// IntrinsicInst. However, we use the broader "HeuristicInstructions" name
348// to parallel 'msan-dump-strict-instructions', and to keep the door open to
349// handling non-intrinsic instructions heuristically.
351 "msan-dump-heuristic-instructions",
352 cl::desc("Prints 'unknown' instructions that were handled heuristically. "
353 "Use -msan-dump-strict-instructions to print instructions that "
354 "could not be handled explicitly nor heuristically."),
355 cl::Hidden, cl::init(false));
356
358 "msan-instrumentation-with-call-threshold",
359 cl::desc(
360 "If the function being instrumented requires more than "
361 "this number of checks and origin stores, use callbacks instead of "
362 "inline checks (-1 means never use callbacks)."),
363 cl::Hidden, cl::init(3500));
364
365static cl::opt<bool>
366 ClEnableKmsan("msan-kernel",
367 cl::desc("Enable KernelMemorySanitizer instrumentation"),
368 cl::Hidden, cl::init(false));
369
370static cl::opt<bool>
371 ClDisableChecks("msan-disable-checks",
372 cl::desc("Apply no_sanitize to the whole file"), cl::Hidden,
373 cl::init(false));
374
375static cl::opt<bool>
376 ClCheckConstantShadow("msan-check-constant-shadow",
377 cl::desc("Insert checks for constant shadow values"),
378 cl::Hidden, cl::init(true));
379
380// This is off by default because of a bug in gold:
381// https://sourceware.org/bugzilla/show_bug.cgi?id=19002
382static cl::opt<bool>
383 ClWithComdat("msan-with-comdat",
384 cl::desc("Place MSan constructors in comdat sections"),
385 cl::Hidden, cl::init(false));
386
387// These options allow to specify custom memory map parameters
388// See MemoryMapParams for details.
389static cl::opt<uint64_t> ClAndMask("msan-and-mask",
390 cl::desc("Define custom MSan AndMask"),
391 cl::Hidden, cl::init(0));
392
393static cl::opt<uint64_t> ClXorMask("msan-xor-mask",
394 cl::desc("Define custom MSan XorMask"),
395 cl::Hidden, cl::init(0));
396
397static cl::opt<uint64_t> ClShadowBase("msan-shadow-base",
398 cl::desc("Define custom MSan ShadowBase"),
399 cl::Hidden, cl::init(0));
400
401static cl::opt<uint64_t> ClOriginBase("msan-origin-base",
402 cl::desc("Define custom MSan OriginBase"),
403 cl::Hidden, cl::init(0));
404
405static cl::opt<int>
406 ClDisambiguateWarning("msan-disambiguate-warning-threshold",
407 cl::desc("Define threshold for number of checks per "
408 "debug location to force origin update."),
409 cl::Hidden, cl::init(3));
410
411const char kMsanModuleCtorName[] = "msan.module_ctor";
412const char kMsanInitName[] = "__msan_init";
413
414namespace {
415
416// Memory map parameters used in application-to-shadow address calculation.
417// Offset = (Addr & ~AndMask) ^ XorMask
418// Shadow = ShadowBase + Offset
419// Origin = OriginBase + Offset
420struct MemoryMapParams {
421 uint64_t AndMask;
422 uint64_t XorMask;
423 uint64_t ShadowBase;
424 uint64_t OriginBase;
425};
426
427struct PlatformMemoryMapParams {
428 const MemoryMapParams *bits32;
429 const MemoryMapParams *bits64;
430};
431
432} // end anonymous namespace
433
434// i386 Linux
435static const MemoryMapParams Linux_I386_MemoryMapParams = {
436 0x000080000000, // AndMask
437 0, // XorMask (not used)
438 0, // ShadowBase (not used)
439 0x000040000000, // OriginBase
440};
441
442// x86_64 Linux
443static const MemoryMapParams Linux_X86_64_MemoryMapParams = {
444 0, // AndMask (not used)
445 0x500000000000, // XorMask
446 0, // ShadowBase (not used)
447 0x100000000000, // OriginBase
448};
449
450// mips32 Linux
451// FIXME: Remove -msan-origin-base -msan-and-mask added by PR #109284 to tests
452// after picking good constants
453
454// mips64 Linux
455static const MemoryMapParams Linux_MIPS64_MemoryMapParams = {
456 0, // AndMask (not used)
457 0x008000000000, // XorMask
458 0, // ShadowBase (not used)
459 0x002000000000, // OriginBase
460};
461
462// ppc32 Linux
463// FIXME: Remove -msan-origin-base -msan-and-mask added by PR #109284 to tests
464// after picking good constants
465
466// ppc64 Linux
467static const MemoryMapParams Linux_PowerPC64_MemoryMapParams = {
468 0xE00000000000, // AndMask
469 0x100000000000, // XorMask
470 0x080000000000, // ShadowBase
471 0x1C0000000000, // OriginBase
472};
473
474// s390x Linux
475static const MemoryMapParams Linux_S390X_MemoryMapParams = {
476 0xC00000000000, // AndMask
477 0, // XorMask (not used)
478 0x080000000000, // ShadowBase
479 0x1C0000000000, // OriginBase
480};
481
482// arm32 Linux
483// FIXME: Remove -msan-origin-base -msan-and-mask added by PR #109284 to tests
484// after picking good constants
485
486// aarch64 Linux
487static const MemoryMapParams Linux_AArch64_MemoryMapParams = {
488 0, // AndMask (not used)
489 0x0B00000000000, // XorMask
490 0, // ShadowBase (not used)
491 0x0200000000000, // OriginBase
492};
493
494// loongarch64 Linux
495static const MemoryMapParams Linux_LoongArch64_MemoryMapParams = {
496 0, // AndMask (not used)
497 0x500000000000, // XorMask
498 0, // ShadowBase (not used)
499 0x100000000000, // OriginBase
500};
501
502// riscv32 Linux
503// FIXME: Remove -msan-origin-base -msan-and-mask added by PR #109284 to tests
504// after picking good constants
505
506// aarch64 FreeBSD
507static const MemoryMapParams FreeBSD_AArch64_MemoryMapParams = {
508 0x1800000000000, // AndMask
509 0x0400000000000, // XorMask
510 0x0200000000000, // ShadowBase
511 0x0700000000000, // OriginBase
512};
513
514// i386 FreeBSD
515static const MemoryMapParams FreeBSD_I386_MemoryMapParams = {
516 0x000180000000, // AndMask
517 0x000040000000, // XorMask
518 0x000020000000, // ShadowBase
519 0x000700000000, // OriginBase
520};
521
522// x86_64 FreeBSD
523static const MemoryMapParams FreeBSD_X86_64_MemoryMapParams = {
524 0xc00000000000, // AndMask
525 0x200000000000, // XorMask
526 0x100000000000, // ShadowBase
527 0x380000000000, // OriginBase
528};
529
530// x86_64 NetBSD
531static const MemoryMapParams NetBSD_X86_64_MemoryMapParams = {
532 0, // AndMask
533 0x500000000000, // XorMask
534 0, // ShadowBase
535 0x100000000000, // OriginBase
536};
537
538static const PlatformMemoryMapParams Linux_X86_MemoryMapParams = {
541};
542
543static const PlatformMemoryMapParams Linux_MIPS_MemoryMapParams = {
544 nullptr,
546};
547
548static const PlatformMemoryMapParams Linux_PowerPC_MemoryMapParams = {
549 nullptr,
551};
552
553static const PlatformMemoryMapParams Linux_S390_MemoryMapParams = {
554 nullptr,
556};
557
558static const PlatformMemoryMapParams Linux_ARM_MemoryMapParams = {
559 nullptr,
561};
562
563static const PlatformMemoryMapParams Linux_LoongArch_MemoryMapParams = {
564 nullptr,
566};
567
568static const PlatformMemoryMapParams FreeBSD_ARM_MemoryMapParams = {
569 nullptr,
571};
572
573static const PlatformMemoryMapParams FreeBSD_X86_MemoryMapParams = {
576};
577
578static const PlatformMemoryMapParams NetBSD_X86_MemoryMapParams = {
579 nullptr,
581};
582
583namespace {
584
585/// Instrument functions of a module to detect uninitialized reads.
586///
587/// Instantiating MemorySanitizer inserts the msan runtime library API function
588/// declarations into the module if they don't exist already. Instantiating
589/// ensures the __msan_init function is in the list of global constructors for
590/// the module.
591class MemorySanitizer {
592public:
593 MemorySanitizer(Module &M, MemorySanitizerOptions Options)
594 : CompileKernel(Options.Kernel), TrackOrigins(Options.TrackOrigins),
595 Recover(Options.Recover), EagerChecks(Options.EagerChecks) {
596 initializeModule(M);
597 }
598
599 // MSan cannot be moved or copied because of MapParams.
600 MemorySanitizer(MemorySanitizer &&) = delete;
601 MemorySanitizer &operator=(MemorySanitizer &&) = delete;
602 MemorySanitizer(const MemorySanitizer &) = delete;
603 MemorySanitizer &operator=(const MemorySanitizer &) = delete;
604
605 bool sanitizeFunction(Function &F, TargetLibraryInfo &TLI);
606
607private:
608 friend struct MemorySanitizerVisitor;
609 friend struct VarArgHelperBase;
610 friend struct VarArgAMD64Helper;
611 friend struct VarArgAArch64Helper;
612 friend struct VarArgPowerPC64Helper;
613 friend struct VarArgPowerPC32Helper;
614 friend struct VarArgSystemZHelper;
615 friend struct VarArgI386Helper;
616 friend struct VarArgGenericHelper;
617
618 void initializeModule(Module &M);
619 void initializeCallbacks(Module &M, const TargetLibraryInfo &TLI);
620 void createKernelApi(Module &M, const TargetLibraryInfo &TLI);
621 void createUserspaceApi(Module &M, const TargetLibraryInfo &TLI);
622
623 template <typename... ArgsTy>
624 FunctionCallee getOrInsertMsanMetadataFunction(Module &M, StringRef Name,
625 ArgsTy... Args);
626
627 /// True if we're compiling the Linux kernel.
628 bool CompileKernel;
629 /// Track origins (allocation points) of uninitialized values.
630 int TrackOrigins;
631 bool Recover;
632 bool EagerChecks;
633
634 Triple TargetTriple;
635 LLVMContext *C;
636 Type *IntptrTy; ///< Integer type with the size of a ptr in default AS.
637 Type *OriginTy;
638 PointerType *PtrTy; ///< Integer type with the size of a ptr in default AS.
639
640 // XxxTLS variables represent the per-thread state in MSan and per-task state
641 // in KMSAN.
642 // For the userspace these point to thread-local globals. In the kernel land
643 // they point to the members of a per-task struct obtained via a call to
644 // __msan_get_context_state().
645
646 /// Thread-local shadow storage for function parameters.
647 Value *ParamTLS;
648
649 /// Thread-local origin storage for function parameters.
650 Value *ParamOriginTLS;
651
652 /// Thread-local shadow storage for function return value.
653 Value *RetvalTLS;
654
655 /// Thread-local origin storage for function return value.
656 Value *RetvalOriginTLS;
657
658 /// Thread-local shadow storage for in-register va_arg function.
659 Value *VAArgTLS;
660
661 /// Thread-local shadow storage for in-register va_arg function.
662 Value *VAArgOriginTLS;
663
664 /// Thread-local shadow storage for va_arg overflow area.
665 Value *VAArgOverflowSizeTLS;
666
667 /// Are the instrumentation callbacks set up?
668 bool CallbacksInitialized = false;
669
670 /// The run-time callback to print a warning.
671 FunctionCallee WarningFn;
672
673 // These arrays are indexed by log2(AccessSize).
674 FunctionCallee MaybeWarningFn[kNumberOfAccessSizes];
675 FunctionCallee MaybeWarningVarSizeFn;
676 FunctionCallee MaybeStoreOriginFn[kNumberOfAccessSizes];
677
678 /// Run-time helper that generates a new origin value for a stack
679 /// allocation.
680 FunctionCallee MsanSetAllocaOriginWithDescriptionFn;
681 // No description version
682 FunctionCallee MsanSetAllocaOriginNoDescriptionFn;
683
684 /// Run-time helper that poisons stack on function entry.
685 FunctionCallee MsanPoisonStackFn;
686
687 /// Run-time helper that records a store (or any event) of an
688 /// uninitialized value and returns an updated origin id encoding this info.
689 FunctionCallee MsanChainOriginFn;
690
691 /// Run-time helper that paints an origin over a region.
692 FunctionCallee MsanSetOriginFn;
693
694 /// MSan runtime replacements for memmove, memcpy and memset.
695 FunctionCallee MemmoveFn, MemcpyFn, MemsetFn;
696
697 /// KMSAN callback for task-local function argument shadow.
698 StructType *MsanContextStateTy;
699 FunctionCallee MsanGetContextStateFn;
700
701 /// Functions for poisoning/unpoisoning local variables
702 FunctionCallee MsanPoisonAllocaFn, MsanUnpoisonAllocaFn;
703
704 /// Pair of shadow/origin pointers.
705 Type *MsanMetadata;
706
707 /// Each of the MsanMetadataPtrXxx functions returns a MsanMetadata.
708 FunctionCallee MsanMetadataPtrForLoadN, MsanMetadataPtrForStoreN;
709 FunctionCallee MsanMetadataPtrForLoad_1_8[4];
710 FunctionCallee MsanMetadataPtrForStore_1_8[4];
711 FunctionCallee MsanInstrumentAsmStoreFn;
712
713 /// Storage for return values of the MsanMetadataPtrXxx functions.
714 Value *MsanMetadataAlloca;
715
716 /// Helper to choose between different MsanMetadataPtrXxx().
717 FunctionCallee getKmsanShadowOriginAccessFn(bool isStore, int size);
718
719 /// Memory map parameters used in application-to-shadow calculation.
720 const MemoryMapParams *MapParams;
721
722 /// Custom memory map parameters used when -msan-shadow-base or
723 // -msan-origin-base is provided.
724 MemoryMapParams CustomMapParams;
725
726 MDNode *ColdCallWeights;
727
728 /// Branch weights for origin store.
729 MDNode *OriginStoreWeights;
730};
731
732void insertModuleCtor(Module &M) {
735 /*InitArgTypes=*/{},
736 /*InitArgs=*/{},
737 // This callback is invoked when the functions are created the first
738 // time. Hook them into the global ctors list in that case:
739 [&](Function *Ctor, FunctionCallee) {
740 if (!ClWithComdat) {
741 appendToGlobalCtors(M, Ctor, 0);
742 return;
743 }
744 Comdat *MsanCtorComdat = M.getOrInsertComdat(kMsanModuleCtorName);
745 Ctor->setComdat(MsanCtorComdat);
746 appendToGlobalCtors(M, Ctor, 0, Ctor);
747 });
748}
749
750template <class T> T getOptOrDefault(const cl::opt<T> &Opt, T Default) {
751 return (Opt.getNumOccurrences() > 0) ? Opt : Default;
752}
753
754} // end anonymous namespace
755
757 bool EagerChecks)
758 : Kernel(getOptOrDefault(ClEnableKmsan, K)),
759 TrackOrigins(getOptOrDefault(ClTrackOrigins, Kernel ? 2 : TO)),
760 Recover(getOptOrDefault(ClKeepGoing, Kernel || R)),
761 EagerChecks(getOptOrDefault(ClEagerChecks, EagerChecks)) {}
762
765 // Return early if nosanitize_memory module flag is present for the module.
766 if (checkIfAlreadyInstrumented(M, "nosanitize_memory"))
767 return PreservedAnalyses::all();
768 bool Modified = false;
769 if (!Options.Kernel) {
770 insertModuleCtor(M);
771 Modified = true;
772 }
773
774 auto &FAM = AM.getResult<FunctionAnalysisManagerModuleProxy>(M).getManager();
775 for (Function &F : M) {
776 if (F.empty())
777 continue;
778 MemorySanitizer Msan(*F.getParent(), Options);
779 Modified |=
780 Msan.sanitizeFunction(F, FAM.getResult<TargetLibraryAnalysis>(F));
781 }
782
783 if (!Modified)
784 return PreservedAnalyses::all();
785
787 // GlobalsAA is considered stateless and does not get invalidated unless
788 // explicitly invalidated; PreservedAnalyses::none() is not enough. Sanitizers
789 // make changes that require GlobalsAA to be invalidated.
790 PA.abandon<GlobalsAA>();
791 return PA;
792}
793
795 raw_ostream &OS, function_ref<StringRef(StringRef)> MapClassName2PassName) {
797 OS, MapClassName2PassName);
798 OS << '<';
799 if (Options.Recover)
800 OS << "recover;";
801 if (Options.Kernel)
802 OS << "kernel;";
803 if (Options.EagerChecks)
804 OS << "eager-checks;";
805 OS << "track-origins=" << Options.TrackOrigins;
806 OS << '>';
807}
808
809/// Create a non-const global initialized with the given string.
810///
811/// Creates a writable global for Str so that we can pass it to the
812/// run-time lib. Runtime uses first 4 bytes of the string to store the
813/// frame ID, so the string needs to be mutable.
815 StringRef Str) {
816 Constant *StrConst = ConstantDataArray::getString(M.getContext(), Str);
817 return new GlobalVariable(M, StrConst->getType(), /*isConstant=*/true,
818 GlobalValue::PrivateLinkage, StrConst, "");
819}
820
821template <typename... ArgsTy>
823MemorySanitizer::getOrInsertMsanMetadataFunction(Module &M, StringRef Name,
824 ArgsTy... Args) {
825 if (TargetTriple.getArch() == Triple::systemz) {
826 // SystemZ ABI: shadow/origin pair is returned via a hidden parameter.
827 return M.getOrInsertFunction(Name, Type::getVoidTy(*C), PtrTy,
828 std::forward<ArgsTy>(Args)...);
829 }
830
831 return M.getOrInsertFunction(Name, MsanMetadata,
832 std::forward<ArgsTy>(Args)...);
833}
834
835/// Create KMSAN API callbacks.
836void MemorySanitizer::createKernelApi(Module &M, const TargetLibraryInfo &TLI) {
837 IRBuilder<> IRB(*C);
838
839 // These will be initialized in insertKmsanPrologue().
840 RetvalTLS = nullptr;
841 RetvalOriginTLS = nullptr;
842 ParamTLS = nullptr;
843 ParamOriginTLS = nullptr;
844 VAArgTLS = nullptr;
845 VAArgOriginTLS = nullptr;
846 VAArgOverflowSizeTLS = nullptr;
847
848 WarningFn = M.getOrInsertFunction("__msan_warning",
849 TLI.getAttrList(C, {0}, /*Signed=*/false),
850 IRB.getVoidTy(), IRB.getInt32Ty());
851
852 // Requests the per-task context state (kmsan_context_state*) from the
853 // runtime library.
854 MsanContextStateTy = StructType::get(
855 ArrayType::get(IRB.getInt64Ty(), kParamTLSSize / 8),
856 ArrayType::get(IRB.getInt64Ty(), kRetvalTLSSize / 8),
857 ArrayType::get(IRB.getInt64Ty(), kParamTLSSize / 8),
858 ArrayType::get(IRB.getInt64Ty(), kParamTLSSize / 8), /* va_arg_origin */
859 IRB.getInt64Ty(), ArrayType::get(OriginTy, kParamTLSSize / 4), OriginTy,
860 OriginTy);
861 MsanGetContextStateFn =
862 M.getOrInsertFunction("__msan_get_context_state", PtrTy);
863
864 MsanMetadata = StructType::get(PtrTy, PtrTy);
865
866 for (int ind = 0, size = 1; ind < 4; ind++, size <<= 1) {
867 std::string name_load =
868 "__msan_metadata_ptr_for_load_" + std::to_string(size);
869 std::string name_store =
870 "__msan_metadata_ptr_for_store_" + std::to_string(size);
871 MsanMetadataPtrForLoad_1_8[ind] =
872 getOrInsertMsanMetadataFunction(M, name_load, PtrTy);
873 MsanMetadataPtrForStore_1_8[ind] =
874 getOrInsertMsanMetadataFunction(M, name_store, PtrTy);
875 }
876
877 MsanMetadataPtrForLoadN = getOrInsertMsanMetadataFunction(
878 M, "__msan_metadata_ptr_for_load_n", PtrTy, IntptrTy);
879 MsanMetadataPtrForStoreN = getOrInsertMsanMetadataFunction(
880 M, "__msan_metadata_ptr_for_store_n", PtrTy, IntptrTy);
881
882 // Functions for poisoning and unpoisoning memory.
883 MsanPoisonAllocaFn = M.getOrInsertFunction(
884 "__msan_poison_alloca", IRB.getVoidTy(), PtrTy, IntptrTy, PtrTy);
885 MsanUnpoisonAllocaFn = M.getOrInsertFunction(
886 "__msan_unpoison_alloca", IRB.getVoidTy(), PtrTy, IntptrTy);
887}
888
890 return M.getOrInsertGlobal(Name, Ty, [&] {
891 return new GlobalVariable(M, Ty, false, GlobalVariable::ExternalLinkage,
892 nullptr, Name, nullptr,
894 });
895}
896
897/// Insert declarations for userspace-specific functions and globals.
898void MemorySanitizer::createUserspaceApi(Module &M,
899 const TargetLibraryInfo &TLI) {
900 IRBuilder<> IRB(*C);
901
902 // Create the callback.
903 // FIXME: this function should have "Cold" calling conv,
904 // which is not yet implemented.
905 if (TrackOrigins) {
906 StringRef WarningFnName = Recover ? "__msan_warning_with_origin"
907 : "__msan_warning_with_origin_noreturn";
908 WarningFn = M.getOrInsertFunction(WarningFnName,
909 TLI.getAttrList(C, {0}, /*Signed=*/false),
910 IRB.getVoidTy(), IRB.getInt32Ty());
911 } else {
912 StringRef WarningFnName =
913 Recover ? "__msan_warning" : "__msan_warning_noreturn";
914 WarningFn = M.getOrInsertFunction(WarningFnName, IRB.getVoidTy());
915 }
916
917 // Create the global TLS variables.
918 RetvalTLS =
919 getOrInsertGlobal(M, "__msan_retval_tls",
920 ArrayType::get(IRB.getInt64Ty(), kRetvalTLSSize / 8));
921
922 RetvalOriginTLS = getOrInsertGlobal(M, "__msan_retval_origin_tls", OriginTy);
923
924 ParamTLS =
925 getOrInsertGlobal(M, "__msan_param_tls",
926 ArrayType::get(IRB.getInt64Ty(), kParamTLSSize / 8));
927
928 ParamOriginTLS =
929 getOrInsertGlobal(M, "__msan_param_origin_tls",
930 ArrayType::get(OriginTy, kParamTLSSize / 4));
931
932 VAArgTLS =
933 getOrInsertGlobal(M, "__msan_va_arg_tls",
934 ArrayType::get(IRB.getInt64Ty(), kParamTLSSize / 8));
935
936 VAArgOriginTLS =
937 getOrInsertGlobal(M, "__msan_va_arg_origin_tls",
938 ArrayType::get(OriginTy, kParamTLSSize / 4));
939
940 VAArgOverflowSizeTLS = getOrInsertGlobal(M, "__msan_va_arg_overflow_size_tls",
941 IRB.getIntPtrTy(M.getDataLayout()));
942
943 for (size_t AccessSizeIndex = 0; AccessSizeIndex < kNumberOfAccessSizes;
944 AccessSizeIndex++) {
945 unsigned AccessSize = 1 << AccessSizeIndex;
946 std::string FunctionName = "__msan_maybe_warning_" + itostr(AccessSize);
947 MaybeWarningFn[AccessSizeIndex] = M.getOrInsertFunction(
948 FunctionName, TLI.getAttrList(C, {0, 1}, /*Signed=*/false),
949 IRB.getVoidTy(), IRB.getIntNTy(AccessSize * 8), IRB.getInt32Ty());
950 MaybeWarningVarSizeFn = M.getOrInsertFunction(
951 "__msan_maybe_warning_N", TLI.getAttrList(C, {}, /*Signed=*/false),
952 IRB.getVoidTy(), PtrTy, IRB.getInt64Ty(), IRB.getInt32Ty());
953 FunctionName = "__msan_maybe_store_origin_" + itostr(AccessSize);
954 MaybeStoreOriginFn[AccessSizeIndex] = M.getOrInsertFunction(
955 FunctionName, TLI.getAttrList(C, {0, 2}, /*Signed=*/false),
956 IRB.getVoidTy(), IRB.getIntNTy(AccessSize * 8), PtrTy,
957 IRB.getInt32Ty());
958 }
959
960 MsanSetAllocaOriginWithDescriptionFn =
961 M.getOrInsertFunction("__msan_set_alloca_origin_with_descr",
962 IRB.getVoidTy(), PtrTy, IntptrTy, PtrTy, PtrTy);
963 MsanSetAllocaOriginNoDescriptionFn =
964 M.getOrInsertFunction("__msan_set_alloca_origin_no_descr",
965 IRB.getVoidTy(), PtrTy, IntptrTy, PtrTy);
966 MsanPoisonStackFn = M.getOrInsertFunction("__msan_poison_stack",
967 IRB.getVoidTy(), PtrTy, IntptrTy);
968}
969
970/// Insert extern declaration of runtime-provided functions and globals.
971void MemorySanitizer::initializeCallbacks(Module &M,
972 const TargetLibraryInfo &TLI) {
973 // Only do this once.
974 if (CallbacksInitialized)
975 return;
976
977 IRBuilder<> IRB(*C);
978 // Initialize callbacks that are common for kernel and userspace
979 // instrumentation.
980 MsanChainOriginFn = M.getOrInsertFunction(
981 "__msan_chain_origin",
982 TLI.getAttrList(C, {0}, /*Signed=*/false, /*Ret=*/true), IRB.getInt32Ty(),
983 IRB.getInt32Ty());
984 MsanSetOriginFn = M.getOrInsertFunction(
985 "__msan_set_origin", TLI.getAttrList(C, {2}, /*Signed=*/false),
986 IRB.getVoidTy(), PtrTy, IntptrTy, IRB.getInt32Ty());
987 MemmoveFn =
988 M.getOrInsertFunction("__msan_memmove", PtrTy, PtrTy, PtrTy, IntptrTy);
989 MemcpyFn =
990 M.getOrInsertFunction("__msan_memcpy", PtrTy, PtrTy, PtrTy, IntptrTy);
991 MemsetFn = M.getOrInsertFunction("__msan_memset",
992 TLI.getAttrList(C, {1}, /*Signed=*/true),
993 PtrTy, PtrTy, IRB.getInt32Ty(), IntptrTy);
994
995 MsanInstrumentAsmStoreFn = M.getOrInsertFunction(
996 "__msan_instrument_asm_store", IRB.getVoidTy(), PtrTy, IntptrTy);
997
998 if (CompileKernel) {
999 createKernelApi(M, TLI);
1000 } else {
1001 createUserspaceApi(M, TLI);
1002 }
1003 CallbacksInitialized = true;
1004}
1005
1006FunctionCallee MemorySanitizer::getKmsanShadowOriginAccessFn(bool isStore,
1007 int size) {
1008 FunctionCallee *Fns =
1009 isStore ? MsanMetadataPtrForStore_1_8 : MsanMetadataPtrForLoad_1_8;
1010 switch (size) {
1011 case 1:
1012 return Fns[0];
1013 case 2:
1014 return Fns[1];
1015 case 4:
1016 return Fns[2];
1017 case 8:
1018 return Fns[3];
1019 default:
1020 return nullptr;
1021 }
1022}
1023
1024/// Module-level initialization.
1025///
1026/// inserts a call to __msan_init to the module's constructor list.
1027void MemorySanitizer::initializeModule(Module &M) {
1028 auto &DL = M.getDataLayout();
1029
1030 TargetTriple = M.getTargetTriple();
1031
1032 bool ShadowPassed = ClShadowBase.getNumOccurrences() > 0;
1033 bool OriginPassed = ClOriginBase.getNumOccurrences() > 0;
1034 // Check the overrides first
1035 if (ShadowPassed || OriginPassed) {
1036 CustomMapParams.AndMask = ClAndMask;
1037 CustomMapParams.XorMask = ClXorMask;
1038 CustomMapParams.ShadowBase = ClShadowBase;
1039 CustomMapParams.OriginBase = ClOriginBase;
1040 MapParams = &CustomMapParams;
1041 } else {
1042 switch (TargetTriple.getOS()) {
1043 case Triple::FreeBSD:
1044 switch (TargetTriple.getArch()) {
1045 case Triple::aarch64:
1046 MapParams = FreeBSD_ARM_MemoryMapParams.bits64;
1047 break;
1048 case Triple::x86_64:
1049 MapParams = FreeBSD_X86_MemoryMapParams.bits64;
1050 break;
1051 case Triple::x86:
1052 MapParams = FreeBSD_X86_MemoryMapParams.bits32;
1053 break;
1054 default:
1055 report_fatal_error("unsupported architecture");
1056 }
1057 break;
1058 case Triple::NetBSD:
1059 switch (TargetTriple.getArch()) {
1060 case Triple::x86_64:
1061 MapParams = NetBSD_X86_MemoryMapParams.bits64;
1062 break;
1063 default:
1064 report_fatal_error("unsupported architecture");
1065 }
1066 break;
1067 case Triple::Linux:
1068 switch (TargetTriple.getArch()) {
1069 case Triple::x86_64:
1070 MapParams = Linux_X86_MemoryMapParams.bits64;
1071 break;
1072 case Triple::x86:
1073 MapParams = Linux_X86_MemoryMapParams.bits32;
1074 break;
1075 case Triple::mips64:
1076 case Triple::mips64el:
1077 MapParams = Linux_MIPS_MemoryMapParams.bits64;
1078 break;
1079 case Triple::ppc64:
1080 case Triple::ppc64le:
1081 MapParams = Linux_PowerPC_MemoryMapParams.bits64;
1082 break;
1083 case Triple::systemz:
1084 MapParams = Linux_S390_MemoryMapParams.bits64;
1085 break;
1086 case Triple::aarch64:
1087 case Triple::aarch64_be:
1088 MapParams = Linux_ARM_MemoryMapParams.bits64;
1089 break;
1091 MapParams = Linux_LoongArch_MemoryMapParams.bits64;
1092 break;
1093 default:
1094 report_fatal_error("unsupported architecture");
1095 }
1096 break;
1097 default:
1098 report_fatal_error("unsupported operating system");
1099 }
1100 }
1101
1102 C = &(M.getContext());
1103 IRBuilder<> IRB(*C);
1104 IntptrTy = IRB.getIntPtrTy(DL);
1105 OriginTy = IRB.getInt32Ty();
1106 PtrTy = IRB.getPtrTy();
1107
1108 ColdCallWeights = MDBuilder(*C).createUnlikelyBranchWeights();
1109 OriginStoreWeights = MDBuilder(*C).createUnlikelyBranchWeights();
1110
1111 if (!CompileKernel) {
1112 if (TrackOrigins)
1113 M.getOrInsertGlobal("__msan_track_origins", IRB.getInt32Ty(), [&] {
1114 return new GlobalVariable(
1115 M, IRB.getInt32Ty(), true, GlobalValue::WeakODRLinkage,
1116 IRB.getInt32(TrackOrigins), "__msan_track_origins");
1117 });
1118
1119 if (Recover)
1120 M.getOrInsertGlobal("__msan_keep_going", IRB.getInt32Ty(), [&] {
1121 return new GlobalVariable(M, IRB.getInt32Ty(), true,
1122 GlobalValue::WeakODRLinkage,
1123 IRB.getInt32(Recover), "__msan_keep_going");
1124 });
1125 }
1126}
1127
1128namespace {
1129
1130/// A helper class that handles instrumentation of VarArg
1131/// functions on a particular platform.
1132///
1133/// Implementations are expected to insert the instrumentation
1134/// necessary to propagate argument shadow through VarArg function
1135/// calls. Visit* methods are called during an InstVisitor pass over
1136/// the function, and should avoid creating new basic blocks. A new
1137/// instance of this class is created for each instrumented function.
1138struct VarArgHelper {
1139 virtual ~VarArgHelper() = default;
1140
1141 /// Visit a CallBase.
1142 virtual void visitCallBase(CallBase &CB, IRBuilder<> &IRB) = 0;
1143
1144 /// Visit a va_start call.
1145 virtual void visitVAStartInst(VAStartInst &I) = 0;
1146
1147 /// Visit a va_copy call.
1148 virtual void visitVACopyInst(VACopyInst &I) = 0;
1149
1150 /// Finalize function instrumentation.
1151 ///
1152 /// This method is called after visiting all interesting (see above)
1153 /// instructions in a function.
1154 virtual void finalizeInstrumentation() = 0;
1155};
1156
1157struct MemorySanitizerVisitor;
1158
1159} // end anonymous namespace
1160
1161static VarArgHelper *CreateVarArgHelper(Function &Func, MemorySanitizer &Msan,
1162 MemorySanitizerVisitor &Visitor);
1163
1164static unsigned TypeSizeToSizeIndex(TypeSize TS) {
1165 if (TS.isScalable())
1166 // Scalable types unconditionally take slowpaths.
1167 return kNumberOfAccessSizes;
1168 unsigned TypeSizeFixed = TS.getFixedValue();
1169 if (TypeSizeFixed <= 8)
1170 return 0;
1171 return Log2_32_Ceil((TypeSizeFixed + 7) / 8);
1172}
1173
1174namespace {
1175
1176/// Helper class to attach debug information of the given instruction onto new
1177/// instructions inserted after.
1178class NextNodeIRBuilder : public IRBuilder<> {
1179public:
1180 explicit NextNodeIRBuilder(Instruction *IP) : IRBuilder<>(IP->getNextNode()) {
1181 SetCurrentDebugLocation(IP->getDebugLoc());
1182 }
1183};
1184
1185/// This class does all the work for a given function. Store and Load
1186/// instructions store and load corresponding shadow and origin
1187/// values. Most instructions propagate shadow from arguments to their
1188/// return values. Certain instructions (most importantly, BranchInst)
1189/// test their argument shadow and print reports (with a runtime call) if it's
1190/// non-zero.
1191struct MemorySanitizerVisitor : public InstVisitor<MemorySanitizerVisitor> {
1192 Function &F;
1193 MemorySanitizer &MS;
1194 SmallVector<PHINode *, 16> ShadowPHINodes, OriginPHINodes;
1195 ValueMap<Value *, Value *> ShadowMap, OriginMap;
1196 std::unique_ptr<VarArgHelper> VAHelper;
1197 const TargetLibraryInfo *TLI;
1198 Instruction *FnPrologueEnd;
1199 SmallVector<Instruction *, 16> Instructions;
1200
1201 // The following flags disable parts of MSan instrumentation based on
1202 // exclusion list contents and command-line options.
1203 bool InsertChecks;
1204 bool PropagateShadow;
1205 bool PoisonStack;
1206 bool PoisonUndef;
1207 bool PoisonUndefVectors;
1208
1209 struct ShadowOriginAndInsertPoint {
1210 Value *Shadow;
1211 Value *Origin;
1212 Instruction *OrigIns;
1213
1214 ShadowOriginAndInsertPoint(Value *S, Value *O, Instruction *I)
1215 : Shadow(S), Origin(O), OrigIns(I) {}
1216 };
1218 DenseMap<const DILocation *, int> LazyWarningDebugLocationCount;
1219 SmallSetVector<AllocaInst *, 16> AllocaSet;
1222 int64_t SplittableBlocksCount = 0;
1223
1224 MemorySanitizerVisitor(Function &F, MemorySanitizer &MS,
1225 const TargetLibraryInfo &TLI)
1226 : F(F), MS(MS), VAHelper(CreateVarArgHelper(F, MS, *this)), TLI(&TLI) {
1227 bool SanitizeFunction =
1228 F.hasFnAttribute(Attribute::SanitizeMemory) && !ClDisableChecks;
1229 InsertChecks = SanitizeFunction;
1230 PropagateShadow = SanitizeFunction;
1231 PoisonStack = SanitizeFunction && ClPoisonStack;
1232 PoisonUndef = SanitizeFunction && ClPoisonUndef;
1233 PoisonUndefVectors = SanitizeFunction && ClPoisonUndefVectors;
1234
1235 // In the presence of unreachable blocks, we may see Phi nodes with
1236 // incoming nodes from such blocks. Since InstVisitor skips unreachable
1237 // blocks, such nodes will not have any shadow value associated with them.
1238 // It's easier to remove unreachable blocks than deal with missing shadow.
1240
1241 MS.initializeCallbacks(*F.getParent(), TLI);
1242 FnPrologueEnd =
1243 IRBuilder<>(&F.getEntryBlock(), F.getEntryBlock().getFirstNonPHIIt())
1244 .CreateIntrinsic(Intrinsic::donothing, {});
1245
1246 if (MS.CompileKernel) {
1247 IRBuilder<> IRB(FnPrologueEnd);
1248 insertKmsanPrologue(IRB);
1249 }
1250
1251 LLVM_DEBUG(if (!InsertChecks) dbgs()
1252 << "MemorySanitizer is not inserting checks into '"
1253 << F.getName() << "'\n");
1254 }
1255
1256 bool instrumentWithCalls(Value *V) {
1257 // Constants likely will be eliminated by follow-up passes.
1258 if (isa<Constant>(V))
1259 return false;
1260 ++SplittableBlocksCount;
1262 SplittableBlocksCount > ClInstrumentationWithCallThreshold;
1263 }
1264
1265 bool isInPrologue(Instruction &I) {
1266 return I.getParent() == FnPrologueEnd->getParent() &&
1267 (&I == FnPrologueEnd || I.comesBefore(FnPrologueEnd));
1268 }
1269
1270 // Creates a new origin and records the stack trace. In general we can call
1271 // this function for any origin manipulation we like. However it will cost
1272 // runtime resources. So use this wisely only if it can provide additional
1273 // information helpful to a user.
1274 Value *updateOrigin(Value *V, IRBuilder<> &IRB) {
1275 if (MS.TrackOrigins <= 1)
1276 return V;
1277 return IRB.CreateCall(MS.MsanChainOriginFn, V);
1278 }
1279
1280 Value *originToIntptr(IRBuilder<> &IRB, Value *Origin) {
1281 const DataLayout &DL = F.getDataLayout();
1282 unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
1283 if (IntptrSize == kOriginSize)
1284 return Origin;
1285 assert(IntptrSize == kOriginSize * 2);
1286 Origin = IRB.CreateIntCast(Origin, MS.IntptrTy, /* isSigned */ false);
1287 return IRB.CreateOr(Origin, IRB.CreateShl(Origin, kOriginSize * 8));
1288 }
1289
1290 /// Fill memory range with the given origin value.
1291 void paintOrigin(IRBuilder<> &IRB, Value *Origin, Value *OriginPtr,
1292 TypeSize TS, Align Alignment) {
1293 const DataLayout &DL = F.getDataLayout();
1294 const Align IntptrAlignment = DL.getABITypeAlign(MS.IntptrTy);
1295 unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
1296 assert(IntptrAlignment >= kMinOriginAlignment);
1297 assert(IntptrSize >= kOriginSize);
1298
1299 // Note: The loop based formation works for fixed length vectors too,
1300 // however we prefer to unroll and specialize alignment below.
1301 if (TS.isScalable()) {
1302 Value *Size = IRB.CreateTypeSize(MS.IntptrTy, TS);
1303 Value *RoundUp =
1304 IRB.CreateAdd(Size, ConstantInt::get(MS.IntptrTy, kOriginSize - 1));
1305 Value *End =
1306 IRB.CreateUDiv(RoundUp, ConstantInt::get(MS.IntptrTy, kOriginSize));
1307 auto [InsertPt, Index] =
1309 IRB.SetInsertPoint(InsertPt);
1310
1311 Value *GEP = IRB.CreateGEP(MS.OriginTy, OriginPtr, Index);
1313 return;
1314 }
1315
1316 unsigned Size = TS.getFixedValue();
1317
1318 unsigned Ofs = 0;
1319 Align CurrentAlignment = Alignment;
1320 if (Alignment >= IntptrAlignment && IntptrSize > kOriginSize) {
1321 Value *IntptrOrigin = originToIntptr(IRB, Origin);
1322 Value *IntptrOriginPtr = IRB.CreatePointerCast(OriginPtr, MS.PtrTy);
1323 for (unsigned i = 0; i < Size / IntptrSize; ++i) {
1324 Value *Ptr = i ? IRB.CreateConstGEP1_32(MS.IntptrTy, IntptrOriginPtr, i)
1325 : IntptrOriginPtr;
1326 IRB.CreateAlignedStore(IntptrOrigin, Ptr, CurrentAlignment);
1327 Ofs += IntptrSize / kOriginSize;
1328 CurrentAlignment = IntptrAlignment;
1329 }
1330 }
1331
1332 for (unsigned i = Ofs; i < (Size + kOriginSize - 1) / kOriginSize; ++i) {
1333 Value *GEP =
1334 i ? IRB.CreateConstGEP1_32(MS.OriginTy, OriginPtr, i) : OriginPtr;
1335 IRB.CreateAlignedStore(Origin, GEP, CurrentAlignment);
1336 CurrentAlignment = kMinOriginAlignment;
1337 }
1338 }
1339
1340 void storeOrigin(IRBuilder<> &IRB, Value *Addr, Value *Shadow, Value *Origin,
1341 Value *OriginPtr, Align Alignment) {
1342 const DataLayout &DL = F.getDataLayout();
1343 const Align OriginAlignment = std::max(kMinOriginAlignment, Alignment);
1344 TypeSize StoreSize = DL.getTypeStoreSize(Shadow->getType());
1345 // ZExt cannot convert between vector and scalar
1346 Value *ConvertedShadow = convertShadowToScalar(Shadow, IRB);
1347 if (auto *ConstantShadow = dyn_cast<Constant>(ConvertedShadow)) {
1348 if (!ClCheckConstantShadow || ConstantShadow->isZeroValue()) {
1349 // Origin is not needed: value is initialized or const shadow is
1350 // ignored.
1351 return;
1352 }
1353 if (llvm::isKnownNonZero(ConvertedShadow, DL)) {
1354 // Copy origin as the value is definitely uninitialized.
1355 paintOrigin(IRB, updateOrigin(Origin, IRB), OriginPtr, StoreSize,
1356 OriginAlignment);
1357 return;
1358 }
1359 // Fallback to runtime check, which still can be optimized out later.
1360 }
1361
1362 TypeSize TypeSizeInBits = DL.getTypeSizeInBits(ConvertedShadow->getType());
1363 unsigned SizeIndex = TypeSizeToSizeIndex(TypeSizeInBits);
1364 if (instrumentWithCalls(ConvertedShadow) &&
1365 SizeIndex < kNumberOfAccessSizes && !MS.CompileKernel) {
1366 FunctionCallee Fn = MS.MaybeStoreOriginFn[SizeIndex];
1367 Value *ConvertedShadow2 =
1368 IRB.CreateZExt(ConvertedShadow, IRB.getIntNTy(8 * (1 << SizeIndex)));
1369 CallBase *CB = IRB.CreateCall(Fn, {ConvertedShadow2, Addr, Origin});
1370 CB->addParamAttr(0, Attribute::ZExt);
1371 CB->addParamAttr(2, Attribute::ZExt);
1372 } else {
1373 Value *Cmp = convertToBool(ConvertedShadow, IRB, "_mscmp");
1375 Cmp, &*IRB.GetInsertPoint(), false, MS.OriginStoreWeights);
1376 IRBuilder<> IRBNew(CheckTerm);
1377 paintOrigin(IRBNew, updateOrigin(Origin, IRBNew), OriginPtr, StoreSize,
1378 OriginAlignment);
1379 }
1380 }
1381
1382 void materializeStores() {
1383 for (StoreInst *SI : StoreList) {
1384 IRBuilder<> IRB(SI);
1385 Value *Val = SI->getValueOperand();
1386 Value *Addr = SI->getPointerOperand();
1387 Value *Shadow = SI->isAtomic() ? getCleanShadow(Val) : getShadow(Val);
1388 Value *ShadowPtr, *OriginPtr;
1389 Type *ShadowTy = Shadow->getType();
1390 const Align Alignment = SI->getAlign();
1391 const Align OriginAlignment = std::max(kMinOriginAlignment, Alignment);
1392 std::tie(ShadowPtr, OriginPtr) =
1393 getShadowOriginPtr(Addr, IRB, ShadowTy, Alignment, /*isStore*/ true);
1394
1395 [[maybe_unused]] StoreInst *NewSI =
1396 IRB.CreateAlignedStore(Shadow, ShadowPtr, Alignment);
1397 LLVM_DEBUG(dbgs() << " STORE: " << *NewSI << "\n");
1398
1399 if (SI->isAtomic())
1400 SI->setOrdering(addReleaseOrdering(SI->getOrdering()));
1401
1402 if (MS.TrackOrigins && !SI->isAtomic())
1403 storeOrigin(IRB, Addr, Shadow, getOrigin(Val), OriginPtr,
1404 OriginAlignment);
1405 }
1406 }
1407
1408 // Returns true if Debug Location corresponds to multiple warnings.
1409 bool shouldDisambiguateWarningLocation(const DebugLoc &DebugLoc) {
1410 if (MS.TrackOrigins < 2)
1411 return false;
1412
1413 if (LazyWarningDebugLocationCount.empty())
1414 for (const auto &I : InstrumentationList)
1415 ++LazyWarningDebugLocationCount[I.OrigIns->getDebugLoc()];
1416
1417 return LazyWarningDebugLocationCount[DebugLoc] >= ClDisambiguateWarning;
1418 }
1419
1420 /// Helper function to insert a warning at IRB's current insert point.
1421 void insertWarningFn(IRBuilder<> &IRB, Value *Origin) {
1422 if (!Origin)
1423 Origin = (Value *)IRB.getInt32(0);
1424 assert(Origin->getType()->isIntegerTy());
1425
1426 if (shouldDisambiguateWarningLocation(IRB.getCurrentDebugLocation())) {
1427 // Try to create additional origin with debug info of the last origin
1428 // instruction. It may provide additional information to the user.
1429 if (Instruction *OI = dyn_cast_or_null<Instruction>(Origin)) {
1430 assert(MS.TrackOrigins);
1431 auto NewDebugLoc = OI->getDebugLoc();
1432 // Origin update with missing or the same debug location provides no
1433 // additional value.
1434 if (NewDebugLoc && NewDebugLoc != IRB.getCurrentDebugLocation()) {
1435 // Insert update just before the check, so we call runtime only just
1436 // before the report.
1437 IRBuilder<> IRBOrigin(&*IRB.GetInsertPoint());
1438 IRBOrigin.SetCurrentDebugLocation(NewDebugLoc);
1439 Origin = updateOrigin(Origin, IRBOrigin);
1440 }
1441 }
1442 }
1443
1444 if (MS.CompileKernel || MS.TrackOrigins)
1445 IRB.CreateCall(MS.WarningFn, Origin)->setCannotMerge();
1446 else
1447 IRB.CreateCall(MS.WarningFn)->setCannotMerge();
1448 // FIXME: Insert UnreachableInst if !MS.Recover?
1449 // This may invalidate some of the following checks and needs to be done
1450 // at the very end.
1451 }
1452
1453 void materializeOneCheck(IRBuilder<> &IRB, Value *ConvertedShadow,
1454 Value *Origin) {
1455 const DataLayout &DL = F.getDataLayout();
1456 TypeSize TypeSizeInBits = DL.getTypeSizeInBits(ConvertedShadow->getType());
1457 unsigned SizeIndex = TypeSizeToSizeIndex(TypeSizeInBits);
1458 if (instrumentWithCalls(ConvertedShadow) && !MS.CompileKernel) {
1459 // ZExt cannot convert between vector and scalar
1460 ConvertedShadow = convertShadowToScalar(ConvertedShadow, IRB);
1461 Value *ConvertedShadow2 =
1462 IRB.CreateZExt(ConvertedShadow, IRB.getIntNTy(8 * (1 << SizeIndex)));
1463
1464 if (SizeIndex < kNumberOfAccessSizes) {
1465 FunctionCallee Fn = MS.MaybeWarningFn[SizeIndex];
1466 CallBase *CB = IRB.CreateCall(
1467 Fn,
1468 {ConvertedShadow2,
1469 MS.TrackOrigins && Origin ? Origin : (Value *)IRB.getInt32(0)});
1470 CB->addParamAttr(0, Attribute::ZExt);
1471 CB->addParamAttr(1, Attribute::ZExt);
1472 } else {
1473 FunctionCallee Fn = MS.MaybeWarningVarSizeFn;
1474 Value *ShadowAlloca = IRB.CreateAlloca(ConvertedShadow2->getType(), 0u);
1475 IRB.CreateStore(ConvertedShadow2, ShadowAlloca);
1476 unsigned ShadowSize = DL.getTypeAllocSize(ConvertedShadow2->getType());
1477 CallBase *CB = IRB.CreateCall(
1478 Fn,
1479 {ShadowAlloca, ConstantInt::get(IRB.getInt64Ty(), ShadowSize),
1480 MS.TrackOrigins && Origin ? Origin : (Value *)IRB.getInt32(0)});
1481 CB->addParamAttr(1, Attribute::ZExt);
1482 CB->addParamAttr(2, Attribute::ZExt);
1483 }
1484 } else {
1485 Value *Cmp = convertToBool(ConvertedShadow, IRB, "_mscmp");
1487 Cmp, &*IRB.GetInsertPoint(),
1488 /* Unreachable */ !MS.Recover, MS.ColdCallWeights);
1489
1490 IRB.SetInsertPoint(CheckTerm);
1491 insertWarningFn(IRB, Origin);
1492 LLVM_DEBUG(dbgs() << " CHECK: " << *Cmp << "\n");
1493 }
1494 }
1495
1496 void materializeInstructionChecks(
1497 ArrayRef<ShadowOriginAndInsertPoint> InstructionChecks) {
1498 const DataLayout &DL = F.getDataLayout();
1499 // Disable combining in some cases. TrackOrigins checks each shadow to pick
1500 // correct origin.
1501 bool Combine = !MS.TrackOrigins;
1502 Instruction *Instruction = InstructionChecks.front().OrigIns;
1503 Value *Shadow = nullptr;
1504 for (const auto &ShadowData : InstructionChecks) {
1505 assert(ShadowData.OrigIns == Instruction);
1506 IRBuilder<> IRB(Instruction);
1507
1508 Value *ConvertedShadow = ShadowData.Shadow;
1509
1510 if (auto *ConstantShadow = dyn_cast<Constant>(ConvertedShadow)) {
1511 if (!ClCheckConstantShadow || ConstantShadow->isZeroValue()) {
1512 // Skip, value is initialized or const shadow is ignored.
1513 continue;
1514 }
1515 if (llvm::isKnownNonZero(ConvertedShadow, DL)) {
1516 // Report as the value is definitely uninitialized.
1517 insertWarningFn(IRB, ShadowData.Origin);
1518 if (!MS.Recover)
1519 return; // Always fail and stop here, not need to check the rest.
1520 // Skip entire instruction,
1521 continue;
1522 }
1523 // Fallback to runtime check, which still can be optimized out later.
1524 }
1525
1526 if (!Combine) {
1527 materializeOneCheck(IRB, ConvertedShadow, ShadowData.Origin);
1528 continue;
1529 }
1530
1531 if (!Shadow) {
1532 Shadow = ConvertedShadow;
1533 continue;
1534 }
1535
1536 Shadow = convertToBool(Shadow, IRB, "_mscmp");
1537 ConvertedShadow = convertToBool(ConvertedShadow, IRB, "_mscmp");
1538 Shadow = IRB.CreateOr(Shadow, ConvertedShadow, "_msor");
1539 }
1540
1541 if (Shadow) {
1542 assert(Combine);
1543 IRBuilder<> IRB(Instruction);
1544 materializeOneCheck(IRB, Shadow, nullptr);
1545 }
1546 }
1547
1548 static bool isAArch64SVCount(Type *Ty) {
1549 if (TargetExtType *TTy = dyn_cast<TargetExtType>(Ty))
1550 return TTy->getName() == "aarch64.svcount";
1551 return false;
1552 }
1553
1554 // This is intended to match the "AArch64 Predicate-as-Counter Type" (aka
1555 // 'target("aarch64.svcount")', but not e.g., <vscale x 4 x i32>.
1556 static bool isScalableNonVectorType(Type *Ty) {
1557 if (!isAArch64SVCount(Ty))
1558 LLVM_DEBUG(dbgs() << "isScalableNonVectorType: Unexpected type " << *Ty
1559 << "\n");
1560
1561 return Ty->isScalableTy() && !isa<VectorType>(Ty);
1562 }
1563
1564 void materializeChecks() {
1565#ifndef NDEBUG
1566 // For assert below.
1567 SmallPtrSet<Instruction *, 16> Done;
1568#endif
1569
1570 for (auto I = InstrumentationList.begin();
1571 I != InstrumentationList.end();) {
1572 auto OrigIns = I->OrigIns;
1573 // Checks are grouped by the original instruction. We call all
1574 // `insertShadowCheck` for an instruction at once.
1575 assert(Done.insert(OrigIns).second);
1576 auto J = std::find_if(I + 1, InstrumentationList.end(),
1577 [OrigIns](const ShadowOriginAndInsertPoint &R) {
1578 return OrigIns != R.OrigIns;
1579 });
1580 // Process all checks of instruction at once.
1581 materializeInstructionChecks(ArrayRef<ShadowOriginAndInsertPoint>(I, J));
1582 I = J;
1583 }
1584
1585 LLVM_DEBUG(dbgs() << "DONE:\n" << F);
1586 }
1587
1588 // Returns the last instruction in the new prologue
1589 void insertKmsanPrologue(IRBuilder<> &IRB) {
1590 Value *ContextState = IRB.CreateCall(MS.MsanGetContextStateFn, {});
1591 Constant *Zero = IRB.getInt32(0);
1592 MS.ParamTLS = IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
1593 {Zero, IRB.getInt32(0)}, "param_shadow");
1594 MS.RetvalTLS = IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
1595 {Zero, IRB.getInt32(1)}, "retval_shadow");
1596 MS.VAArgTLS = IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
1597 {Zero, IRB.getInt32(2)}, "va_arg_shadow");
1598 MS.VAArgOriginTLS = IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
1599 {Zero, IRB.getInt32(3)}, "va_arg_origin");
1600 MS.VAArgOverflowSizeTLS =
1601 IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
1602 {Zero, IRB.getInt32(4)}, "va_arg_overflow_size");
1603 MS.ParamOriginTLS = IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
1604 {Zero, IRB.getInt32(5)}, "param_origin");
1605 MS.RetvalOriginTLS =
1606 IRB.CreateGEP(MS.MsanContextStateTy, ContextState,
1607 {Zero, IRB.getInt32(6)}, "retval_origin");
1608 if (MS.TargetTriple.getArch() == Triple::systemz)
1609 MS.MsanMetadataAlloca = IRB.CreateAlloca(MS.MsanMetadata, 0u);
1610 }
1611
1612 /// Add MemorySanitizer instrumentation to a function.
1613 bool runOnFunction() {
1614 // Iterate all BBs in depth-first order and create shadow instructions
1615 // for all instructions (where applicable).
1616 // For PHI nodes we create dummy shadow PHIs which will be finalized later.
1617 for (BasicBlock *BB : depth_first(FnPrologueEnd->getParent()))
1618 visit(*BB);
1619
1620 // `visit` above only collects instructions. Process them after iterating
1621 // CFG to avoid requirement on CFG transformations.
1622 for (Instruction *I : Instructions)
1624
1625 // Finalize PHI nodes.
1626 for (PHINode *PN : ShadowPHINodes) {
1627 PHINode *PNS = cast<PHINode>(getShadow(PN));
1628 PHINode *PNO = MS.TrackOrigins ? cast<PHINode>(getOrigin(PN)) : nullptr;
1629 size_t NumValues = PN->getNumIncomingValues();
1630 for (size_t v = 0; v < NumValues; v++) {
1631 PNS->addIncoming(getShadow(PN, v), PN->getIncomingBlock(v));
1632 if (PNO)
1633 PNO->addIncoming(getOrigin(PN, v), PN->getIncomingBlock(v));
1634 }
1635 }
1636
1637 VAHelper->finalizeInstrumentation();
1638
1639 // Poison llvm.lifetime.start intrinsics, if we haven't fallen back to
1640 // instrumenting only allocas.
1642 for (auto Item : LifetimeStartList) {
1643 instrumentAlloca(*Item.second, Item.first);
1644 AllocaSet.remove(Item.second);
1645 }
1646 }
1647 // Poison the allocas for which we didn't instrument the corresponding
1648 // lifetime intrinsics.
1649 for (AllocaInst *AI : AllocaSet)
1650 instrumentAlloca(*AI);
1651
1652 // Insert shadow value checks.
1653 materializeChecks();
1654
1655 // Delayed instrumentation of StoreInst.
1656 // This may not add new address checks.
1657 materializeStores();
1658
1659 return true;
1660 }
1661
1662 /// Compute the shadow type that corresponds to a given Value.
1663 Type *getShadowTy(Value *V) { return getShadowTy(V->getType()); }
1664
1665 /// Compute the shadow type that corresponds to a given Type.
1666 Type *getShadowTy(Type *OrigTy) {
1667 if (!OrigTy->isSized()) {
1668 return nullptr;
1669 }
1670 // For integer type, shadow is the same as the original type.
1671 // This may return weird-sized types like i1.
1672 if (IntegerType *IT = dyn_cast<IntegerType>(OrigTy))
1673 return IT;
1674 const DataLayout &DL = F.getDataLayout();
1675 if (VectorType *VT = dyn_cast<VectorType>(OrigTy)) {
1676 uint32_t EltSize = DL.getTypeSizeInBits(VT->getElementType());
1677 return VectorType::get(IntegerType::get(*MS.C, EltSize),
1678 VT->getElementCount());
1679 }
1680 if (ArrayType *AT = dyn_cast<ArrayType>(OrigTy)) {
1681 return ArrayType::get(getShadowTy(AT->getElementType()),
1682 AT->getNumElements());
1683 }
1684 if (StructType *ST = dyn_cast<StructType>(OrigTy)) {
1686 for (unsigned i = 0, n = ST->getNumElements(); i < n; i++)
1687 Elements.push_back(getShadowTy(ST->getElementType(i)));
1688 StructType *Res = StructType::get(*MS.C, Elements, ST->isPacked());
1689 LLVM_DEBUG(dbgs() << "getShadowTy: " << *ST << " ===> " << *Res << "\n");
1690 return Res;
1691 }
1692 if (isScalableNonVectorType(OrigTy)) {
1693 LLVM_DEBUG(dbgs() << "getShadowTy: Scalable non-vector type: " << *OrigTy
1694 << "\n");
1695 return OrigTy;
1696 }
1697
1698 uint32_t TypeSize = DL.getTypeSizeInBits(OrigTy);
1699 return IntegerType::get(*MS.C, TypeSize);
1700 }
1701
1702 /// Extract combined shadow of struct elements as a bool
1703 Value *collapseStructShadow(StructType *Struct, Value *Shadow,
1704 IRBuilder<> &IRB) {
1705 Value *FalseVal = IRB.getIntN(/* width */ 1, /* value */ 0);
1706 Value *Aggregator = FalseVal;
1707
1708 for (unsigned Idx = 0; Idx < Struct->getNumElements(); Idx++) {
1709 // Combine by ORing together each element's bool shadow
1710 Value *ShadowItem = IRB.CreateExtractValue(Shadow, Idx);
1711 Value *ShadowBool = convertToBool(ShadowItem, IRB);
1712
1713 if (Aggregator != FalseVal)
1714 Aggregator = IRB.CreateOr(Aggregator, ShadowBool);
1715 else
1716 Aggregator = ShadowBool;
1717 }
1718
1719 return Aggregator;
1720 }
1721
1722 // Extract combined shadow of array elements
1723 Value *collapseArrayShadow(ArrayType *Array, Value *Shadow,
1724 IRBuilder<> &IRB) {
1725 if (!Array->getNumElements())
1726 return IRB.getIntN(/* width */ 1, /* value */ 0);
1727
1728 Value *FirstItem = IRB.CreateExtractValue(Shadow, 0);
1729 Value *Aggregator = convertShadowToScalar(FirstItem, IRB);
1730
1731 for (unsigned Idx = 1; Idx < Array->getNumElements(); Idx++) {
1732 Value *ShadowItem = IRB.CreateExtractValue(Shadow, Idx);
1733 Value *ShadowInner = convertShadowToScalar(ShadowItem, IRB);
1734 Aggregator = IRB.CreateOr(Aggregator, ShadowInner);
1735 }
1736 return Aggregator;
1737 }
1738
1739 /// Convert a shadow value to it's flattened variant. The resulting
1740 /// shadow may not necessarily have the same bit width as the input
1741 /// value, but it will always be comparable to zero.
1742 Value *convertShadowToScalar(Value *V, IRBuilder<> &IRB) {
1743 if (StructType *Struct = dyn_cast<StructType>(V->getType()))
1744 return collapseStructShadow(Struct, V, IRB);
1745 if (ArrayType *Array = dyn_cast<ArrayType>(V->getType()))
1746 return collapseArrayShadow(Array, V, IRB);
1747 if (isa<VectorType>(V->getType())) {
1748 if (isa<ScalableVectorType>(V->getType()))
1749 return convertShadowToScalar(IRB.CreateOrReduce(V), IRB);
1750 unsigned BitWidth =
1751 V->getType()->getPrimitiveSizeInBits().getFixedValue();
1752 return IRB.CreateBitCast(V, IntegerType::get(*MS.C, BitWidth));
1753 }
1754 return V;
1755 }
1756
1757 // Convert a scalar value to an i1 by comparing with 0
1758 Value *convertToBool(Value *V, IRBuilder<> &IRB, const Twine &name = "") {
1759 Type *VTy = V->getType();
1760 if (!VTy->isIntegerTy())
1761 return convertToBool(convertShadowToScalar(V, IRB), IRB, name);
1762 if (VTy->getIntegerBitWidth() == 1)
1763 // Just converting a bool to a bool, so do nothing.
1764 return V;
1765 return IRB.CreateICmpNE(V, ConstantInt::get(VTy, 0), name);
1766 }
1767
1768 Type *ptrToIntPtrType(Type *PtrTy) const {
1769 if (VectorType *VectTy = dyn_cast<VectorType>(PtrTy)) {
1770 return VectorType::get(ptrToIntPtrType(VectTy->getElementType()),
1771 VectTy->getElementCount());
1772 }
1773 assert(PtrTy->isIntOrPtrTy());
1774 return MS.IntptrTy;
1775 }
1776
1777 Type *getPtrToShadowPtrType(Type *IntPtrTy, Type *ShadowTy) const {
1778 if (VectorType *VectTy = dyn_cast<VectorType>(IntPtrTy)) {
1779 return VectorType::get(
1780 getPtrToShadowPtrType(VectTy->getElementType(), ShadowTy),
1781 VectTy->getElementCount());
1782 }
1783 assert(IntPtrTy == MS.IntptrTy);
1784 return MS.PtrTy;
1785 }
1786
1787 Constant *constToIntPtr(Type *IntPtrTy, uint64_t C) const {
1788 if (VectorType *VectTy = dyn_cast<VectorType>(IntPtrTy)) {
1790 VectTy->getElementCount(),
1791 constToIntPtr(VectTy->getElementType(), C));
1792 }
1793 assert(IntPtrTy == MS.IntptrTy);
1794 // TODO: Avoid implicit trunc?
1795 // See https://github.com/llvm/llvm-project/issues/112510.
1796 return ConstantInt::get(MS.IntptrTy, C, /*IsSigned=*/false,
1797 /*ImplicitTrunc=*/true);
1798 }
1799
1800 /// Returns the integer shadow offset that corresponds to a given
1801 /// application address, whereby:
1802 ///
1803 /// Offset = (Addr & ~AndMask) ^ XorMask
1804 /// Shadow = ShadowBase + Offset
1805 /// Origin = (OriginBase + Offset) & ~Alignment
1806 ///
1807 /// Note: for efficiency, many shadow mappings only require use the XorMask
1808 /// and OriginBase; the AndMask and ShadowBase are often zero.
1809 Value *getShadowPtrOffset(Value *Addr, IRBuilder<> &IRB) {
1810 Type *IntptrTy = ptrToIntPtrType(Addr->getType());
1811 Value *OffsetLong = IRB.CreatePointerCast(Addr, IntptrTy);
1812
1813 if (uint64_t AndMask = MS.MapParams->AndMask)
1814 OffsetLong = IRB.CreateAnd(OffsetLong, constToIntPtr(IntptrTy, ~AndMask));
1815
1816 if (uint64_t XorMask = MS.MapParams->XorMask)
1817 OffsetLong = IRB.CreateXor(OffsetLong, constToIntPtr(IntptrTy, XorMask));
1818 return OffsetLong;
1819 }
1820
1821 /// Compute the shadow and origin addresses corresponding to a given
1822 /// application address.
1823 ///
1824 /// Shadow = ShadowBase + Offset
1825 /// Origin = (OriginBase + Offset) & ~3ULL
1826 /// Addr can be a ptr or <N x ptr>. In both cases ShadowTy the shadow type of
1827 /// a single pointee.
1828 /// Returns <shadow_ptr, origin_ptr> or <<N x shadow_ptr>, <N x origin_ptr>>.
1829 std::pair<Value *, Value *>
1830 getShadowOriginPtrUserspace(Value *Addr, IRBuilder<> &IRB, Type *ShadowTy,
1831 MaybeAlign Alignment) {
1832 VectorType *VectTy = dyn_cast<VectorType>(Addr->getType());
1833 if (!VectTy) {
1834 assert(Addr->getType()->isPointerTy());
1835 } else {
1836 assert(VectTy->getElementType()->isPointerTy());
1837 }
1838 Type *IntptrTy = ptrToIntPtrType(Addr->getType());
1839 Value *ShadowOffset = getShadowPtrOffset(Addr, IRB);
1840 Value *ShadowLong = ShadowOffset;
1841 if (uint64_t ShadowBase = MS.MapParams->ShadowBase) {
1842 ShadowLong =
1843 IRB.CreateAdd(ShadowLong, constToIntPtr(IntptrTy, ShadowBase));
1844 }
1845 Value *ShadowPtr = IRB.CreateIntToPtr(
1846 ShadowLong, getPtrToShadowPtrType(IntptrTy, ShadowTy));
1847
1848 Value *OriginPtr = nullptr;
1849 if (MS.TrackOrigins) {
1850 Value *OriginLong = ShadowOffset;
1851 uint64_t OriginBase = MS.MapParams->OriginBase;
1852 if (OriginBase != 0)
1853 OriginLong =
1854 IRB.CreateAdd(OriginLong, constToIntPtr(IntptrTy, OriginBase));
1855 if (!Alignment || *Alignment < kMinOriginAlignment) {
1856 uint64_t Mask = kMinOriginAlignment.value() - 1;
1857 OriginLong = IRB.CreateAnd(OriginLong, constToIntPtr(IntptrTy, ~Mask));
1858 }
1859 OriginPtr = IRB.CreateIntToPtr(
1860 OriginLong, getPtrToShadowPtrType(IntptrTy, MS.OriginTy));
1861 }
1862 return std::make_pair(ShadowPtr, OriginPtr);
1863 }
1864
1865 template <typename... ArgsTy>
1866 Value *createMetadataCall(IRBuilder<> &IRB, FunctionCallee Callee,
1867 ArgsTy... Args) {
1868 if (MS.TargetTriple.getArch() == Triple::systemz) {
1869 IRB.CreateCall(Callee,
1870 {MS.MsanMetadataAlloca, std::forward<ArgsTy>(Args)...});
1871 return IRB.CreateLoad(MS.MsanMetadata, MS.MsanMetadataAlloca);
1872 }
1873
1874 return IRB.CreateCall(Callee, {std::forward<ArgsTy>(Args)...});
1875 }
1876
1877 std::pair<Value *, Value *> getShadowOriginPtrKernelNoVec(Value *Addr,
1878 IRBuilder<> &IRB,
1879 Type *ShadowTy,
1880 bool isStore) {
1881 Value *ShadowOriginPtrs;
1882 const DataLayout &DL = F.getDataLayout();
1883 TypeSize Size = DL.getTypeStoreSize(ShadowTy);
1884
1885 FunctionCallee Getter = MS.getKmsanShadowOriginAccessFn(isStore, Size);
1886 Value *AddrCast = IRB.CreatePointerCast(Addr, MS.PtrTy);
1887 if (Getter) {
1888 ShadowOriginPtrs = createMetadataCall(IRB, Getter, AddrCast);
1889 } else {
1890 Value *SizeVal = ConstantInt::get(MS.IntptrTy, Size);
1891 ShadowOriginPtrs = createMetadataCall(
1892 IRB,
1893 isStore ? MS.MsanMetadataPtrForStoreN : MS.MsanMetadataPtrForLoadN,
1894 AddrCast, SizeVal);
1895 }
1896 Value *ShadowPtr = IRB.CreateExtractValue(ShadowOriginPtrs, 0);
1897 ShadowPtr = IRB.CreatePointerCast(ShadowPtr, MS.PtrTy);
1898 Value *OriginPtr = IRB.CreateExtractValue(ShadowOriginPtrs, 1);
1899
1900 return std::make_pair(ShadowPtr, OriginPtr);
1901 }
1902
1903 /// Addr can be a ptr or <N x ptr>. In both cases ShadowTy the shadow type of
1904 /// a single pointee.
1905 /// Returns <shadow_ptr, origin_ptr> or <<N x shadow_ptr>, <N x origin_ptr>>.
1906 std::pair<Value *, Value *> getShadowOriginPtrKernel(Value *Addr,
1907 IRBuilder<> &IRB,
1908 Type *ShadowTy,
1909 bool isStore) {
1910 VectorType *VectTy = dyn_cast<VectorType>(Addr->getType());
1911 if (!VectTy) {
1912 assert(Addr->getType()->isPointerTy());
1913 return getShadowOriginPtrKernelNoVec(Addr, IRB, ShadowTy, isStore);
1914 }
1915
1916 // TODO: Support callbacs with vectors of addresses.
1917 unsigned NumElements = cast<FixedVectorType>(VectTy)->getNumElements();
1918 Value *ShadowPtrs = ConstantInt::getNullValue(
1919 FixedVectorType::get(IRB.getPtrTy(), NumElements));
1920 Value *OriginPtrs = nullptr;
1921 if (MS.TrackOrigins)
1922 OriginPtrs = ConstantInt::getNullValue(
1923 FixedVectorType::get(IRB.getPtrTy(), NumElements));
1924 for (unsigned i = 0; i < NumElements; ++i) {
1925 Value *OneAddr =
1926 IRB.CreateExtractElement(Addr, ConstantInt::get(IRB.getInt32Ty(), i));
1927 auto [ShadowPtr, OriginPtr] =
1928 getShadowOriginPtrKernelNoVec(OneAddr, IRB, ShadowTy, isStore);
1929
1930 ShadowPtrs = IRB.CreateInsertElement(
1931 ShadowPtrs, ShadowPtr, ConstantInt::get(IRB.getInt32Ty(), i));
1932 if (MS.TrackOrigins)
1933 OriginPtrs = IRB.CreateInsertElement(
1934 OriginPtrs, OriginPtr, ConstantInt::get(IRB.getInt32Ty(), i));
1935 }
1936 return {ShadowPtrs, OriginPtrs};
1937 }
1938
1939 std::pair<Value *, Value *> getShadowOriginPtr(Value *Addr, IRBuilder<> &IRB,
1940 Type *ShadowTy,
1941 MaybeAlign Alignment,
1942 bool isStore) {
1943 if (MS.CompileKernel)
1944 return getShadowOriginPtrKernel(Addr, IRB, ShadowTy, isStore);
1945 return getShadowOriginPtrUserspace(Addr, IRB, ShadowTy, Alignment);
1946 }
1947
1948 /// Compute the shadow address for a given function argument.
1949 ///
1950 /// Shadow = ParamTLS+ArgOffset.
1951 Value *getShadowPtrForArgument(IRBuilder<> &IRB, int ArgOffset) {
1952 return IRB.CreatePtrAdd(MS.ParamTLS,
1953 ConstantInt::get(MS.IntptrTy, ArgOffset), "_msarg");
1954 }
1955
1956 /// Compute the origin address for a given function argument.
1957 Value *getOriginPtrForArgument(IRBuilder<> &IRB, int ArgOffset) {
1958 if (!MS.TrackOrigins)
1959 return nullptr;
1960 return IRB.CreatePtrAdd(MS.ParamOriginTLS,
1961 ConstantInt::get(MS.IntptrTy, ArgOffset),
1962 "_msarg_o");
1963 }
1964
1965 /// Compute the shadow address for a retval.
1966 Value *getShadowPtrForRetval(IRBuilder<> &IRB) {
1967 return IRB.CreatePointerCast(MS.RetvalTLS, IRB.getPtrTy(0), "_msret");
1968 }
1969
1970 /// Compute the origin address for a retval.
1971 Value *getOriginPtrForRetval() {
1972 // We keep a single origin for the entire retval. Might be too optimistic.
1973 return MS.RetvalOriginTLS;
1974 }
1975
1976 /// Set SV to be the shadow value for V.
1977 void setShadow(Value *V, Value *SV) {
1978 assert(!ShadowMap.count(V) && "Values may only have one shadow");
1979 ShadowMap[V] = PropagateShadow ? SV : getCleanShadow(V);
1980 }
1981
1982 /// Set Origin to be the origin value for V.
1983 void setOrigin(Value *V, Value *Origin) {
1984 if (!MS.TrackOrigins)
1985 return;
1986 assert(!OriginMap.count(V) && "Values may only have one origin");
1987 LLVM_DEBUG(dbgs() << "ORIGIN: " << *V << " ==> " << *Origin << "\n");
1988 OriginMap[V] = Origin;
1989 }
1990
1991 Constant *getCleanShadow(Type *OrigTy) {
1992 Type *ShadowTy = getShadowTy(OrigTy);
1993 if (!ShadowTy)
1994 return nullptr;
1995 return Constant::getNullValue(ShadowTy);
1996 }
1997
1998 /// Create a clean shadow value for a given value.
1999 ///
2000 /// Clean shadow (all zeroes) means all bits of the value are defined
2001 /// (initialized).
2002 Constant *getCleanShadow(Value *V) { return getCleanShadow(V->getType()); }
2003
2004 /// Create a dirty shadow of a given shadow type.
2005 Constant *getPoisonedShadow(Type *ShadowTy) {
2006 assert(ShadowTy);
2007 if (isa<IntegerType>(ShadowTy) || isa<VectorType>(ShadowTy))
2008 return Constant::getAllOnesValue(ShadowTy);
2009 if (ArrayType *AT = dyn_cast<ArrayType>(ShadowTy)) {
2010 SmallVector<Constant *, 4> Vals(AT->getNumElements(),
2011 getPoisonedShadow(AT->getElementType()));
2012 return ConstantArray::get(AT, Vals);
2013 }
2014 if (StructType *ST = dyn_cast<StructType>(ShadowTy)) {
2016 for (unsigned i = 0, n = ST->getNumElements(); i < n; i++)
2017 Vals.push_back(getPoisonedShadow(ST->getElementType(i)));
2018 return ConstantStruct::get(ST, Vals);
2019 }
2020 llvm_unreachable("Unexpected shadow type");
2021 }
2022
2023 /// Create a dirty shadow for a given value.
2024 Constant *getPoisonedShadow(Value *V) {
2025 Type *ShadowTy = getShadowTy(V);
2026 if (!ShadowTy)
2027 return nullptr;
2028 return getPoisonedShadow(ShadowTy);
2029 }
2030
2031 /// Create a clean (zero) origin.
2032 Value *getCleanOrigin() { return Constant::getNullValue(MS.OriginTy); }
2033
2034 /// Get the shadow value for a given Value.
2035 ///
2036 /// This function either returns the value set earlier with setShadow,
2037 /// or extracts if from ParamTLS (for function arguments).
2038 Value *getShadow(Value *V) {
2039 if (Instruction *I = dyn_cast<Instruction>(V)) {
2040 if (!PropagateShadow || I->getMetadata(LLVMContext::MD_nosanitize))
2041 return getCleanShadow(V);
2042 // For instructions the shadow is already stored in the map.
2043 Value *Shadow = ShadowMap[V];
2044 if (!Shadow) {
2045 LLVM_DEBUG(dbgs() << "No shadow: " << *V << "\n" << *(I->getParent()));
2046 assert(Shadow && "No shadow for a value");
2047 }
2048 return Shadow;
2049 }
2050 // Handle fully undefined values
2051 // (partially undefined constant vectors are handled later)
2052 if ([[maybe_unused]] UndefValue *U = dyn_cast<UndefValue>(V)) {
2053 Value *AllOnes = (PropagateShadow && PoisonUndef) ? getPoisonedShadow(V)
2054 : getCleanShadow(V);
2055 LLVM_DEBUG(dbgs() << "Undef: " << *U << " ==> " << *AllOnes << "\n");
2056 return AllOnes;
2057 }
2058 if (Argument *A = dyn_cast<Argument>(V)) {
2059 // For arguments we compute the shadow on demand and store it in the map.
2060 Value *&ShadowPtr = ShadowMap[V];
2061 if (ShadowPtr)
2062 return ShadowPtr;
2063 Function *F = A->getParent();
2064 IRBuilder<> EntryIRB(FnPrologueEnd);
2065 unsigned ArgOffset = 0;
2066 const DataLayout &DL = F->getDataLayout();
2067 for (auto &FArg : F->args()) {
2068 if (!FArg.getType()->isSized() || FArg.getType()->isScalableTy()) {
2069 LLVM_DEBUG(dbgs() << (FArg.getType()->isScalableTy()
2070 ? "vscale not fully supported\n"
2071 : "Arg is not sized\n"));
2072 if (A == &FArg) {
2073 ShadowPtr = getCleanShadow(V);
2074 setOrigin(A, getCleanOrigin());
2075 break;
2076 }
2077 continue;
2078 }
2079
2080 unsigned Size = FArg.hasByValAttr()
2081 ? DL.getTypeAllocSize(FArg.getParamByValType())
2082 : DL.getTypeAllocSize(FArg.getType());
2083
2084 if (A == &FArg) {
2085 bool Overflow = ArgOffset + Size > kParamTLSSize;
2086 if (FArg.hasByValAttr()) {
2087 // ByVal pointer itself has clean shadow. We copy the actual
2088 // argument shadow to the underlying memory.
2089 // Figure out maximal valid memcpy alignment.
2090 const Align ArgAlign = DL.getValueOrABITypeAlignment(
2091 FArg.getParamAlign(), FArg.getParamByValType());
2092 Value *CpShadowPtr, *CpOriginPtr;
2093 std::tie(CpShadowPtr, CpOriginPtr) =
2094 getShadowOriginPtr(V, EntryIRB, EntryIRB.getInt8Ty(), ArgAlign,
2095 /*isStore*/ true);
2096 if (!PropagateShadow || Overflow) {
2097 // ParamTLS overflow.
2098 EntryIRB.CreateMemSet(
2099 CpShadowPtr, Constant::getNullValue(EntryIRB.getInt8Ty()),
2100 Size, ArgAlign);
2101 } else {
2102 Value *Base = getShadowPtrForArgument(EntryIRB, ArgOffset);
2103 const Align CopyAlign = std::min(ArgAlign, kShadowTLSAlignment);
2104 [[maybe_unused]] Value *Cpy = EntryIRB.CreateMemCpy(
2105 CpShadowPtr, CopyAlign, Base, CopyAlign, Size);
2106 LLVM_DEBUG(dbgs() << " ByValCpy: " << *Cpy << "\n");
2107
2108 if (MS.TrackOrigins) {
2109 Value *OriginPtr = getOriginPtrForArgument(EntryIRB, ArgOffset);
2110 // FIXME: OriginSize should be:
2111 // alignTo(V % kMinOriginAlignment + Size, kMinOriginAlignment)
2112 unsigned OriginSize = alignTo(Size, kMinOriginAlignment);
2113 EntryIRB.CreateMemCpy(
2114 CpOriginPtr,
2115 /* by getShadowOriginPtr */ kMinOriginAlignment, OriginPtr,
2116 /* by origin_tls[ArgOffset] */ kMinOriginAlignment,
2117 OriginSize);
2118 }
2119 }
2120 }
2121
2122 if (!PropagateShadow || Overflow || FArg.hasByValAttr() ||
2123 (MS.EagerChecks && FArg.hasAttribute(Attribute::NoUndef))) {
2124 ShadowPtr = getCleanShadow(V);
2125 setOrigin(A, getCleanOrigin());
2126 } else {
2127 // Shadow over TLS
2128 Value *Base = getShadowPtrForArgument(EntryIRB, ArgOffset);
2129 ShadowPtr = EntryIRB.CreateAlignedLoad(getShadowTy(&FArg), Base,
2131 if (MS.TrackOrigins) {
2132 Value *OriginPtr = getOriginPtrForArgument(EntryIRB, ArgOffset);
2133 setOrigin(A, EntryIRB.CreateLoad(MS.OriginTy, OriginPtr));
2134 }
2135 }
2137 << " ARG: " << FArg << " ==> " << *ShadowPtr << "\n");
2138 break;
2139 }
2140
2141 ArgOffset += alignTo(Size, kShadowTLSAlignment);
2142 }
2143 assert(ShadowPtr && "Could not find shadow for an argument");
2144 return ShadowPtr;
2145 }
2146
2147 // Check for partially-undefined constant vectors
2148 // TODO: scalable vectors (this is hard because we do not have IRBuilder)
2149 if (isa<FixedVectorType>(V->getType()) && isa<Constant>(V) &&
2150 cast<Constant>(V)->containsUndefOrPoisonElement() && PropagateShadow &&
2151 PoisonUndefVectors) {
2152 unsigned NumElems = cast<FixedVectorType>(V->getType())->getNumElements();
2153 SmallVector<Constant *, 32> ShadowVector(NumElems);
2154 for (unsigned i = 0; i != NumElems; ++i) {
2155 Constant *Elem = cast<Constant>(V)->getAggregateElement(i);
2156 ShadowVector[i] = isa<UndefValue>(Elem) ? getPoisonedShadow(Elem)
2157 : getCleanShadow(Elem);
2158 }
2159
2160 Value *ShadowConstant = ConstantVector::get(ShadowVector);
2161 LLVM_DEBUG(dbgs() << "Partial undef constant vector: " << *V << " ==> "
2162 << *ShadowConstant << "\n");
2163
2164 return ShadowConstant;
2165 }
2166
2167 // TODO: partially-undefined constant arrays, structures, and nested types
2168
2169 // For everything else the shadow is zero.
2170 return getCleanShadow(V);
2171 }
2172
2173 /// Get the shadow for i-th argument of the instruction I.
2174 Value *getShadow(Instruction *I, int i) {
2175 return getShadow(I->getOperand(i));
2176 }
2177
2178 /// Get the origin for a value.
2179 Value *getOrigin(Value *V) {
2180 if (!MS.TrackOrigins)
2181 return nullptr;
2182 if (!PropagateShadow || isa<Constant>(V) || isa<InlineAsm>(V))
2183 return getCleanOrigin();
2185 "Unexpected value type in getOrigin()");
2186 if (Instruction *I = dyn_cast<Instruction>(V)) {
2187 if (I->getMetadata(LLVMContext::MD_nosanitize))
2188 return getCleanOrigin();
2189 }
2190 Value *Origin = OriginMap[V];
2191 assert(Origin && "Missing origin");
2192 return Origin;
2193 }
2194
2195 /// Get the origin for i-th argument of the instruction I.
2196 Value *getOrigin(Instruction *I, int i) {
2197 return getOrigin(I->getOperand(i));
2198 }
2199
2200 /// Remember the place where a shadow check should be inserted.
2201 ///
2202 /// This location will be later instrumented with a check that will print a
2203 /// UMR warning in runtime if the shadow value is not 0.
2204 void insertCheckShadow(Value *Shadow, Value *Origin, Instruction *OrigIns) {
2205 assert(Shadow);
2206 if (!InsertChecks)
2207 return;
2208
2209 if (!DebugCounter::shouldExecute(DebugInsertCheck)) {
2210 LLVM_DEBUG(dbgs() << "Skipping check of " << *Shadow << " before "
2211 << *OrigIns << "\n");
2212 return;
2213 }
2214
2215 Type *ShadowTy = Shadow->getType();
2216 if (isScalableNonVectorType(ShadowTy)) {
2217 LLVM_DEBUG(dbgs() << "Skipping check of scalable non-vector " << *Shadow
2218 << " before " << *OrigIns << "\n");
2219 return;
2220 }
2221#ifndef NDEBUG
2222 assert((isa<IntegerType>(ShadowTy) || isa<VectorType>(ShadowTy) ||
2223 isa<StructType>(ShadowTy) || isa<ArrayType>(ShadowTy)) &&
2224 "Can only insert checks for integer, vector, and aggregate shadow "
2225 "types");
2226#endif
2227 InstrumentationList.push_back(
2228 ShadowOriginAndInsertPoint(Shadow, Origin, OrigIns));
2229 }
2230
2231 /// Get shadow for value, and remember the place where a shadow check should
2232 /// be inserted.
2233 ///
2234 /// This location will be later instrumented with a check that will print a
2235 /// UMR warning in runtime if the value is not fully defined.
2236 void insertCheckShadowOf(Value *Val, Instruction *OrigIns) {
2237 assert(Val);
2238 Value *Shadow, *Origin;
2240 Shadow = getShadow(Val);
2241 if (!Shadow)
2242 return;
2243 Origin = getOrigin(Val);
2244 } else {
2245 Shadow = dyn_cast_or_null<Instruction>(getShadow(Val));
2246 if (!Shadow)
2247 return;
2248 Origin = dyn_cast_or_null<Instruction>(getOrigin(Val));
2249 }
2250 insertCheckShadow(Shadow, Origin, OrigIns);
2251 }
2252
2254 switch (a) {
2255 case AtomicOrdering::NotAtomic:
2256 return AtomicOrdering::NotAtomic;
2257 case AtomicOrdering::Unordered:
2258 case AtomicOrdering::Monotonic:
2259 case AtomicOrdering::Release:
2260 return AtomicOrdering::Release;
2261 case AtomicOrdering::Acquire:
2262 case AtomicOrdering::AcquireRelease:
2263 return AtomicOrdering::AcquireRelease;
2264 case AtomicOrdering::SequentiallyConsistent:
2265 return AtomicOrdering::SequentiallyConsistent;
2266 }
2267 llvm_unreachable("Unknown ordering");
2268 }
2269
2270 Value *makeAddReleaseOrderingTable(IRBuilder<> &IRB) {
2271 constexpr int NumOrderings = (int)AtomicOrderingCABI::seq_cst + 1;
2272 uint32_t OrderingTable[NumOrderings] = {};
2273
2274 OrderingTable[(int)AtomicOrderingCABI::relaxed] =
2275 OrderingTable[(int)AtomicOrderingCABI::release] =
2276 (int)AtomicOrderingCABI::release;
2277 OrderingTable[(int)AtomicOrderingCABI::consume] =
2278 OrderingTable[(int)AtomicOrderingCABI::acquire] =
2279 OrderingTable[(int)AtomicOrderingCABI::acq_rel] =
2280 (int)AtomicOrderingCABI::acq_rel;
2281 OrderingTable[(int)AtomicOrderingCABI::seq_cst] =
2282 (int)AtomicOrderingCABI::seq_cst;
2283
2284 return ConstantDataVector::get(IRB.getContext(), OrderingTable);
2285 }
2286
2288 switch (a) {
2289 case AtomicOrdering::NotAtomic:
2290 return AtomicOrdering::NotAtomic;
2291 case AtomicOrdering::Unordered:
2292 case AtomicOrdering::Monotonic:
2293 case AtomicOrdering::Acquire:
2294 return AtomicOrdering::Acquire;
2295 case AtomicOrdering::Release:
2296 case AtomicOrdering::AcquireRelease:
2297 return AtomicOrdering::AcquireRelease;
2298 case AtomicOrdering::SequentiallyConsistent:
2299 return AtomicOrdering::SequentiallyConsistent;
2300 }
2301 llvm_unreachable("Unknown ordering");
2302 }
2303
2304 Value *makeAddAcquireOrderingTable(IRBuilder<> &IRB) {
2305 constexpr int NumOrderings = (int)AtomicOrderingCABI::seq_cst + 1;
2306 uint32_t OrderingTable[NumOrderings] = {};
2307
2308 OrderingTable[(int)AtomicOrderingCABI::relaxed] =
2309 OrderingTable[(int)AtomicOrderingCABI::acquire] =
2310 OrderingTable[(int)AtomicOrderingCABI::consume] =
2311 (int)AtomicOrderingCABI::acquire;
2312 OrderingTable[(int)AtomicOrderingCABI::release] =
2313 OrderingTable[(int)AtomicOrderingCABI::acq_rel] =
2314 (int)AtomicOrderingCABI::acq_rel;
2315 OrderingTable[(int)AtomicOrderingCABI::seq_cst] =
2316 (int)AtomicOrderingCABI::seq_cst;
2317
2318 return ConstantDataVector::get(IRB.getContext(), OrderingTable);
2319 }
2320
2321 // ------------------- Visitors.
2322 using InstVisitor<MemorySanitizerVisitor>::visit;
2323 void visit(Instruction &I) {
2324 if (I.getMetadata(LLVMContext::MD_nosanitize))
2325 return;
2326 // Don't want to visit if we're in the prologue
2327 if (isInPrologue(I))
2328 return;
2329 if (!DebugCounter::shouldExecute(DebugInstrumentInstruction)) {
2330 LLVM_DEBUG(dbgs() << "Skipping instruction: " << I << "\n");
2331 // We still need to set the shadow and origin to clean values.
2332 setShadow(&I, getCleanShadow(&I));
2333 setOrigin(&I, getCleanOrigin());
2334 return;
2335 }
2336
2337 Instructions.push_back(&I);
2338 }
2339
2340 /// Instrument LoadInst
2341 ///
2342 /// Loads the corresponding shadow and (optionally) origin.
2343 /// Optionally, checks that the load address is fully defined.
2344 void visitLoadInst(LoadInst &I) {
2345 assert(I.getType()->isSized() && "Load type must have size");
2346 assert(!I.getMetadata(LLVMContext::MD_nosanitize));
2347 NextNodeIRBuilder IRB(&I);
2348 Type *ShadowTy = getShadowTy(&I);
2349 Value *Addr = I.getPointerOperand();
2350 Value *ShadowPtr = nullptr, *OriginPtr = nullptr;
2351 const Align Alignment = I.getAlign();
2352 if (PropagateShadow) {
2353 std::tie(ShadowPtr, OriginPtr) =
2354 getShadowOriginPtr(Addr, IRB, ShadowTy, Alignment, /*isStore*/ false);
2355 setShadow(&I,
2356 IRB.CreateAlignedLoad(ShadowTy, ShadowPtr, Alignment, "_msld"));
2357 } else {
2358 setShadow(&I, getCleanShadow(&I));
2359 }
2360
2362 insertCheckShadowOf(I.getPointerOperand(), &I);
2363
2364 if (I.isAtomic())
2365 I.setOrdering(addAcquireOrdering(I.getOrdering()));
2366
2367 if (MS.TrackOrigins) {
2368 if (PropagateShadow) {
2369 const Align OriginAlignment = std::max(kMinOriginAlignment, Alignment);
2370 setOrigin(
2371 &I, IRB.CreateAlignedLoad(MS.OriginTy, OriginPtr, OriginAlignment));
2372 } else {
2373 setOrigin(&I, getCleanOrigin());
2374 }
2375 }
2376 }
2377
2378 /// Instrument StoreInst
2379 ///
2380 /// Stores the corresponding shadow and (optionally) origin.
2381 /// Optionally, checks that the store address is fully defined.
2382 void visitStoreInst(StoreInst &I) {
2383 StoreList.push_back(&I);
2385 insertCheckShadowOf(I.getPointerOperand(), &I);
2386 }
2387
2388 void handleCASOrRMW(Instruction &I) {
2390
2391 IRBuilder<> IRB(&I);
2392 Value *Addr = I.getOperand(0);
2393 Value *Val = I.getOperand(1);
2394 Value *ShadowPtr = getShadowOriginPtr(Addr, IRB, getShadowTy(Val), Align(1),
2395 /*isStore*/ true)
2396 .first;
2397
2399 insertCheckShadowOf(Addr, &I);
2400
2401 // Only test the conditional argument of cmpxchg instruction.
2402 // The other argument can potentially be uninitialized, but we can not
2403 // detect this situation reliably without possible false positives.
2405 insertCheckShadowOf(Val, &I);
2406
2407 IRB.CreateStore(getCleanShadow(Val), ShadowPtr);
2408
2409 setShadow(&I, getCleanShadow(&I));
2410 setOrigin(&I, getCleanOrigin());
2411 }
2412
2413 void visitAtomicRMWInst(AtomicRMWInst &I) {
2414 handleCASOrRMW(I);
2415 I.setOrdering(addReleaseOrdering(I.getOrdering()));
2416 }
2417
2418 void visitAtomicCmpXchgInst(AtomicCmpXchgInst &I) {
2419 handleCASOrRMW(I);
2420 I.setSuccessOrdering(addReleaseOrdering(I.getSuccessOrdering()));
2421 }
2422
2423 // Vector manipulation.
2424 void visitExtractElementInst(ExtractElementInst &I) {
2425 insertCheckShadowOf(I.getOperand(1), &I);
2426 IRBuilder<> IRB(&I);
2427 setShadow(&I, IRB.CreateExtractElement(getShadow(&I, 0), I.getOperand(1),
2428 "_msprop"));
2429 setOrigin(&I, getOrigin(&I, 0));
2430 }
2431
2432 void visitInsertElementInst(InsertElementInst &I) {
2433 insertCheckShadowOf(I.getOperand(2), &I);
2434 IRBuilder<> IRB(&I);
2435 auto *Shadow0 = getShadow(&I, 0);
2436 auto *Shadow1 = getShadow(&I, 1);
2437 setShadow(&I, IRB.CreateInsertElement(Shadow0, Shadow1, I.getOperand(2),
2438 "_msprop"));
2439 setOriginForNaryOp(I);
2440 }
2441
2442 void visitShuffleVectorInst(ShuffleVectorInst &I) {
2443 IRBuilder<> IRB(&I);
2444 auto *Shadow0 = getShadow(&I, 0);
2445 auto *Shadow1 = getShadow(&I, 1);
2446 setShadow(&I, IRB.CreateShuffleVector(Shadow0, Shadow1, I.getShuffleMask(),
2447 "_msprop"));
2448 setOriginForNaryOp(I);
2449 }
2450
2451 // Casts.
2452 void visitSExtInst(SExtInst &I) {
2453 IRBuilder<> IRB(&I);
2454 setShadow(&I, IRB.CreateSExt(getShadow(&I, 0), I.getType(), "_msprop"));
2455 setOrigin(&I, getOrigin(&I, 0));
2456 }
2457
2458 void visitZExtInst(ZExtInst &I) {
2459 IRBuilder<> IRB(&I);
2460 setShadow(&I, IRB.CreateZExt(getShadow(&I, 0), I.getType(), "_msprop"));
2461 setOrigin(&I, getOrigin(&I, 0));
2462 }
2463
2464 void visitTruncInst(TruncInst &I) {
2465 IRBuilder<> IRB(&I);
2466 setShadow(&I, IRB.CreateTrunc(getShadow(&I, 0), I.getType(), "_msprop"));
2467 setOrigin(&I, getOrigin(&I, 0));
2468 }
2469
2470 void visitBitCastInst(BitCastInst &I) {
2471 // Special case: if this is the bitcast (there is exactly 1 allowed) between
2472 // a musttail call and a ret, don't instrument. New instructions are not
2473 // allowed after a musttail call.
2474 if (auto *CI = dyn_cast<CallInst>(I.getOperand(0)))
2475 if (CI->isMustTailCall())
2476 return;
2477 IRBuilder<> IRB(&I);
2478 setShadow(&I, IRB.CreateBitCast(getShadow(&I, 0), getShadowTy(&I)));
2479 setOrigin(&I, getOrigin(&I, 0));
2480 }
2481
2482 void visitPtrToIntInst(PtrToIntInst &I) {
2483 IRBuilder<> IRB(&I);
2484 setShadow(&I, IRB.CreateIntCast(getShadow(&I, 0), getShadowTy(&I), false,
2485 "_msprop_ptrtoint"));
2486 setOrigin(&I, getOrigin(&I, 0));
2487 }
2488
2489 void visitIntToPtrInst(IntToPtrInst &I) {
2490 IRBuilder<> IRB(&I);
2491 setShadow(&I, IRB.CreateIntCast(getShadow(&I, 0), getShadowTy(&I), false,
2492 "_msprop_inttoptr"));
2493 setOrigin(&I, getOrigin(&I, 0));
2494 }
2495
2496 void visitFPToSIInst(CastInst &I) { handleShadowOr(I); }
2497 void visitFPToUIInst(CastInst &I) { handleShadowOr(I); }
2498 void visitSIToFPInst(CastInst &I) { handleShadowOr(I); }
2499 void visitUIToFPInst(CastInst &I) { handleShadowOr(I); }
2500 void visitFPExtInst(CastInst &I) { handleShadowOr(I); }
2501 void visitFPTruncInst(CastInst &I) { handleShadowOr(I); }
2502
2503 /// Propagate shadow for bitwise AND.
2504 ///
2505 /// This code is exact, i.e. if, for example, a bit in the left argument
2506 /// is defined and 0, then neither the value not definedness of the
2507 /// corresponding bit in B don't affect the resulting shadow.
2508 void visitAnd(BinaryOperator &I) {
2509 IRBuilder<> IRB(&I);
2510 // "And" of 0 and a poisoned value results in unpoisoned value.
2511 // 1&1 => 1; 0&1 => 0; p&1 => p;
2512 // 1&0 => 0; 0&0 => 0; p&0 => 0;
2513 // 1&p => p; 0&p => 0; p&p => p;
2514 // S = (S1 & S2) | (V1 & S2) | (S1 & V2)
2515 Value *S1 = getShadow(&I, 0);
2516 Value *S2 = getShadow(&I, 1);
2517 Value *V1 = I.getOperand(0);
2518 Value *V2 = I.getOperand(1);
2519 if (V1->getType() != S1->getType()) {
2520 V1 = IRB.CreateIntCast(V1, S1->getType(), false);
2521 V2 = IRB.CreateIntCast(V2, S2->getType(), false);
2522 }
2523 Value *S1S2 = IRB.CreateAnd(S1, S2);
2524 Value *V1S2 = IRB.CreateAnd(V1, S2);
2525 Value *S1V2 = IRB.CreateAnd(S1, V2);
2526 setShadow(&I, IRB.CreateOr({S1S2, V1S2, S1V2}));
2527 setOriginForNaryOp(I);
2528 }
2529
2530 void visitOr(BinaryOperator &I) {
2531 IRBuilder<> IRB(&I);
2532 // "Or" of 1 and a poisoned value results in unpoisoned value:
2533 // 1|1 => 1; 0|1 => 1; p|1 => 1;
2534 // 1|0 => 1; 0|0 => 0; p|0 => p;
2535 // 1|p => 1; 0|p => p; p|p => p;
2536 //
2537 // S = (S1 & S2) | (~V1 & S2) | (S1 & ~V2)
2538 //
2539 // If the "disjoint OR" property is violated, the result is poison, and
2540 // hence the entire shadow is uninitialized:
2541 // S = S | SignExt(V1 & V2 != 0)
2542 Value *S1 = getShadow(&I, 0);
2543 Value *S2 = getShadow(&I, 1);
2544 Value *V1 = I.getOperand(0);
2545 Value *V2 = I.getOperand(1);
2546 if (V1->getType() != S1->getType()) {
2547 V1 = IRB.CreateIntCast(V1, S1->getType(), false);
2548 V2 = IRB.CreateIntCast(V2, S2->getType(), false);
2549 }
2550
2551 Value *NotV1 = IRB.CreateNot(V1);
2552 Value *NotV2 = IRB.CreateNot(V2);
2553
2554 Value *S1S2 = IRB.CreateAnd(S1, S2);
2555 Value *S2NotV1 = IRB.CreateAnd(NotV1, S2);
2556 Value *S1NotV2 = IRB.CreateAnd(S1, NotV2);
2557
2558 Value *S = IRB.CreateOr({S1S2, S2NotV1, S1NotV2});
2559
2560 if (ClPreciseDisjointOr && cast<PossiblyDisjointInst>(&I)->isDisjoint()) {
2561 Value *V1V2 = IRB.CreateAnd(V1, V2);
2562 Value *DisjointOrShadow = IRB.CreateSExt(
2563 IRB.CreateICmpNE(V1V2, getCleanShadow(V1V2)), V1V2->getType());
2564 S = IRB.CreateOr(S, DisjointOrShadow, "_ms_disjoint");
2565 }
2566
2567 setShadow(&I, S);
2568 setOriginForNaryOp(I);
2569 }
2570
2571 /// Default propagation of shadow and/or origin.
2572 ///
2573 /// This class implements the general case of shadow propagation, used in all
2574 /// cases where we don't know and/or don't care about what the operation
2575 /// actually does. It converts all input shadow values to a common type
2576 /// (extending or truncating as necessary), and bitwise OR's them.
2577 ///
2578 /// This is much cheaper than inserting checks (i.e. requiring inputs to be
2579 /// fully initialized), and less prone to false positives.
2580 ///
2581 /// This class also implements the general case of origin propagation. For a
2582 /// Nary operation, result origin is set to the origin of an argument that is
2583 /// not entirely initialized. If there is more than one such arguments, the
2584 /// rightmost of them is picked. It does not matter which one is picked if all
2585 /// arguments are initialized.
2586 template <bool CombineShadow> class Combiner {
2587 Value *Shadow = nullptr;
2588 Value *Origin = nullptr;
2589 IRBuilder<> &IRB;
2590 MemorySanitizerVisitor *MSV;
2591
2592 public:
2593 Combiner(MemorySanitizerVisitor *MSV, IRBuilder<> &IRB)
2594 : IRB(IRB), MSV(MSV) {}
2595
2596 /// Add a pair of shadow and origin values to the mix.
2597 Combiner &Add(Value *OpShadow, Value *OpOrigin) {
2598 if (CombineShadow) {
2599 assert(OpShadow);
2600 if (!Shadow)
2601 Shadow = OpShadow;
2602 else {
2603 OpShadow = MSV->CreateShadowCast(IRB, OpShadow, Shadow->getType());
2604 Shadow = IRB.CreateOr(Shadow, OpShadow, "_msprop");
2605 }
2606 }
2607
2608 if (MSV->MS.TrackOrigins) {
2609 assert(OpOrigin);
2610 if (!Origin) {
2611 Origin = OpOrigin;
2612 } else {
2613 Constant *ConstOrigin = dyn_cast<Constant>(OpOrigin);
2614 // No point in adding something that might result in 0 origin value.
2615 if (!ConstOrigin || !ConstOrigin->isNullValue()) {
2616 Value *Cond = MSV->convertToBool(OpShadow, IRB);
2617 Origin = IRB.CreateSelect(Cond, OpOrigin, Origin);
2618 }
2619 }
2620 }
2621 return *this;
2622 }
2623
2624 /// Add an application value to the mix.
2625 Combiner &Add(Value *V) {
2626 Value *OpShadow = MSV->getShadow(V);
2627 Value *OpOrigin = MSV->MS.TrackOrigins ? MSV->getOrigin(V) : nullptr;
2628 return Add(OpShadow, OpOrigin);
2629 }
2630
2631 /// Set the current combined values as the given instruction's shadow
2632 /// and origin.
2633 void Done(Instruction *I) {
2634 if (CombineShadow) {
2635 assert(Shadow);
2636 Shadow = MSV->CreateShadowCast(IRB, Shadow, MSV->getShadowTy(I));
2637 MSV->setShadow(I, Shadow);
2638 }
2639 if (MSV->MS.TrackOrigins) {
2640 assert(Origin);
2641 MSV->setOrigin(I, Origin);
2642 }
2643 }
2644
2645 /// Store the current combined value at the specified origin
2646 /// location.
2647 void DoneAndStoreOrigin(TypeSize TS, Value *OriginPtr) {
2648 if (MSV->MS.TrackOrigins) {
2649 assert(Origin);
2650 MSV->paintOrigin(IRB, Origin, OriginPtr, TS, kMinOriginAlignment);
2651 }
2652 }
2653 };
2654
2655 using ShadowAndOriginCombiner = Combiner<true>;
2656 using OriginCombiner = Combiner<false>;
2657
2658 /// Propagate origin for arbitrary operation.
2659 void setOriginForNaryOp(Instruction &I) {
2660 if (!MS.TrackOrigins)
2661 return;
2662 IRBuilder<> IRB(&I);
2663 OriginCombiner OC(this, IRB);
2664 for (Use &Op : I.operands())
2665 OC.Add(Op.get());
2666 OC.Done(&I);
2667 }
2668
2669 size_t VectorOrPrimitiveTypeSizeInBits(Type *Ty) {
2670 assert(!(Ty->isVectorTy() && Ty->getScalarType()->isPointerTy()) &&
2671 "Vector of pointers is not a valid shadow type");
2672 return Ty->isVectorTy() ? cast<FixedVectorType>(Ty)->getNumElements() *
2674 : Ty->getPrimitiveSizeInBits();
2675 }
2676
2677 /// Cast between two shadow types, extending or truncating as
2678 /// necessary.
2679 Value *CreateShadowCast(IRBuilder<> &IRB, Value *V, Type *dstTy,
2680 bool Signed = false) {
2681 Type *srcTy = V->getType();
2682 if (srcTy == dstTy)
2683 return V;
2684 size_t srcSizeInBits = VectorOrPrimitiveTypeSizeInBits(srcTy);
2685 size_t dstSizeInBits = VectorOrPrimitiveTypeSizeInBits(dstTy);
2686 if (srcSizeInBits > 1 && dstSizeInBits == 1)
2687 return IRB.CreateICmpNE(V, getCleanShadow(V));
2688
2689 if (dstTy->isIntegerTy() && srcTy->isIntegerTy())
2690 return IRB.CreateIntCast(V, dstTy, Signed);
2691 if (dstTy->isVectorTy() && srcTy->isVectorTy() &&
2692 cast<VectorType>(dstTy)->getElementCount() ==
2693 cast<VectorType>(srcTy)->getElementCount())
2694 return IRB.CreateIntCast(V, dstTy, Signed);
2695 Value *V1 = IRB.CreateBitCast(V, Type::getIntNTy(*MS.C, srcSizeInBits));
2696 Value *V2 =
2697 IRB.CreateIntCast(V1, Type::getIntNTy(*MS.C, dstSizeInBits), Signed);
2698 return IRB.CreateBitCast(V2, dstTy);
2699 // TODO: handle struct types.
2700 }
2701
2702 /// Cast an application value to the type of its own shadow.
2703 Value *CreateAppToShadowCast(IRBuilder<> &IRB, Value *V) {
2704 Type *ShadowTy = getShadowTy(V);
2705 if (V->getType() == ShadowTy)
2706 return V;
2707 if (V->getType()->isPtrOrPtrVectorTy())
2708 return IRB.CreatePtrToInt(V, ShadowTy);
2709 else
2710 return IRB.CreateBitCast(V, ShadowTy);
2711 }
2712
2713 /// Propagate shadow for arbitrary operation.
2714 void handleShadowOr(Instruction &I) {
2715 IRBuilder<> IRB(&I);
2716 ShadowAndOriginCombiner SC(this, IRB);
2717 for (Use &Op : I.operands())
2718 SC.Add(Op.get());
2719 SC.Done(&I);
2720 }
2721
2722 // Perform a bitwise OR on the horizontal pairs (or other specified grouping)
2723 // of elements.
2724 //
2725 // For example, suppose we have:
2726 // VectorA: <a0, a1, a2, a3, a4, a5>
2727 // VectorB: <b0, b1, b2, b3, b4, b5>
2728 // ReductionFactor: 3
2729 // Shards: 1
2730 // The output would be:
2731 // <a0|a1|a2, a3|a4|a5, b0|b1|b2, b3|b4|b5>
2732 //
2733 // If we have:
2734 // VectorA: <a0, a1, a2, a3, a4, a5, a6, a7>
2735 // VectorB: <b0, b1, b2, b3, b4, b5, b6, b7>
2736 // ReductionFactor: 2
2737 // Shards: 2
2738 // then a and be each have 2 "shards", resulting in the output being
2739 // interleaved:
2740 // <a0|a1, a2|a3, b0|b1, b2|b3, a4|a5, a6|a7, b4|b5, b6|b7>
2741 //
2742 // This is convenient for instrumenting horizontal add/sub.
2743 // For bitwise OR on "vertical" pairs, see maybeHandleSimpleNomemIntrinsic().
2744 Value *horizontalReduce(IntrinsicInst &I, unsigned ReductionFactor,
2745 unsigned Shards, Value *VectorA, Value *VectorB) {
2746 assert(isa<FixedVectorType>(VectorA->getType()));
2747 unsigned NumElems =
2748 cast<FixedVectorType>(VectorA->getType())->getNumElements();
2749
2750 [[maybe_unused]] unsigned TotalNumElems = NumElems;
2751 if (VectorB) {
2752 assert(VectorA->getType() == VectorB->getType());
2753 TotalNumElems *= 2;
2754 }
2755
2756 assert(NumElems % (ReductionFactor * Shards) == 0);
2757
2758 Value *Or = nullptr;
2759
2760 IRBuilder<> IRB(&I);
2761 for (unsigned i = 0; i < ReductionFactor; i++) {
2762 SmallVector<int, 16> Mask;
2763
2764 for (unsigned j = 0; j < Shards; j++) {
2765 unsigned Offset = NumElems / Shards * j;
2766
2767 for (unsigned X = 0; X < NumElems / Shards; X += ReductionFactor)
2768 Mask.push_back(Offset + X + i);
2769
2770 if (VectorB) {
2771 for (unsigned X = 0; X < NumElems / Shards; X += ReductionFactor)
2772 Mask.push_back(NumElems + Offset + X + i);
2773 }
2774 }
2775
2776 Value *Masked;
2777 if (VectorB)
2778 Masked = IRB.CreateShuffleVector(VectorA, VectorB, Mask);
2779 else
2780 Masked = IRB.CreateShuffleVector(VectorA, Mask);
2781
2782 if (Or)
2783 Or = IRB.CreateOr(Or, Masked);
2784 else
2785 Or = Masked;
2786 }
2787
2788 return Or;
2789 }
2790
2791 /// Propagate shadow for 1- or 2-vector intrinsics that combine adjacent
2792 /// fields.
2793 ///
2794 /// e.g., <2 x i32> @llvm.aarch64.neon.saddlp.v2i32.v4i16(<4 x i16>)
2795 /// <16 x i8> @llvm.aarch64.neon.addp.v16i8(<16 x i8>, <16 x i8>)
2796 void handlePairwiseShadowOrIntrinsic(IntrinsicInst &I, unsigned Shards) {
2797 assert(I.arg_size() == 1 || I.arg_size() == 2);
2798
2799 assert(I.getType()->isVectorTy());
2800 assert(I.getArgOperand(0)->getType()->isVectorTy());
2801
2802 [[maybe_unused]] FixedVectorType *ParamType =
2803 cast<FixedVectorType>(I.getArgOperand(0)->getType());
2804 assert((I.arg_size() != 2) ||
2805 (ParamType == cast<FixedVectorType>(I.getArgOperand(1)->getType())));
2806 [[maybe_unused]] FixedVectorType *ReturnType =
2807 cast<FixedVectorType>(I.getType());
2808 assert(ParamType->getNumElements() * I.arg_size() ==
2809 2 * ReturnType->getNumElements());
2810
2811 IRBuilder<> IRB(&I);
2812
2813 // Horizontal OR of shadow
2814 Value *FirstArgShadow = getShadow(&I, 0);
2815 Value *SecondArgShadow = nullptr;
2816 if (I.arg_size() == 2)
2817 SecondArgShadow = getShadow(&I, 1);
2818
2819 Value *OrShadow = horizontalReduce(I, /*ReductionFactor=*/2, Shards,
2820 FirstArgShadow, SecondArgShadow);
2821
2822 OrShadow = CreateShadowCast(IRB, OrShadow, getShadowTy(&I));
2823
2824 setShadow(&I, OrShadow);
2825 setOriginForNaryOp(I);
2826 }
2827
2828 /// Propagate shadow for 1- or 2-vector intrinsics that combine adjacent
2829 /// fields, with the parameters reinterpreted to have elements of a specified
2830 /// width. For example:
2831 /// @llvm.x86.ssse3.phadd.w(<1 x i64> [[VAR1]], <1 x i64> [[VAR2]])
2832 /// conceptually operates on
2833 /// (<4 x i16> [[VAR1]], <4 x i16> [[VAR2]])
2834 /// and can be handled with ReinterpretElemWidth == 16.
2835 void handlePairwiseShadowOrIntrinsic(IntrinsicInst &I, unsigned Shards,
2836 int ReinterpretElemWidth) {
2837 assert(I.arg_size() == 1 || I.arg_size() == 2);
2838
2839 assert(I.getType()->isVectorTy());
2840 assert(I.getArgOperand(0)->getType()->isVectorTy());
2841
2842 FixedVectorType *ParamType =
2843 cast<FixedVectorType>(I.getArgOperand(0)->getType());
2844 assert((I.arg_size() != 2) ||
2845 (ParamType == cast<FixedVectorType>(I.getArgOperand(1)->getType())));
2846
2847 [[maybe_unused]] FixedVectorType *ReturnType =
2848 cast<FixedVectorType>(I.getType());
2849 assert(ParamType->getNumElements() * I.arg_size() ==
2850 2 * ReturnType->getNumElements());
2851
2852 IRBuilder<> IRB(&I);
2853
2854 FixedVectorType *ReinterpretShadowTy = nullptr;
2855 assert(isAligned(Align(ReinterpretElemWidth),
2856 ParamType->getPrimitiveSizeInBits()));
2857 ReinterpretShadowTy = FixedVectorType::get(
2858 IRB.getIntNTy(ReinterpretElemWidth),
2859 ParamType->getPrimitiveSizeInBits() / ReinterpretElemWidth);
2860
2861 // Horizontal OR of shadow
2862 Value *FirstArgShadow = getShadow(&I, 0);
2863 FirstArgShadow = IRB.CreateBitCast(FirstArgShadow, ReinterpretShadowTy);
2864
2865 // If we had two parameters each with an odd number of elements, the total
2866 // number of elements is even, but we have never seen this in extant
2867 // instruction sets, so we enforce that each parameter must have an even
2868 // number of elements.
2870 Align(2),
2871 cast<FixedVectorType>(FirstArgShadow->getType())->getNumElements()));
2872
2873 Value *SecondArgShadow = nullptr;
2874 if (I.arg_size() == 2) {
2875 SecondArgShadow = getShadow(&I, 1);
2876 SecondArgShadow = IRB.CreateBitCast(SecondArgShadow, ReinterpretShadowTy);
2877 }
2878
2879 Value *OrShadow = horizontalReduce(I, /*ReductionFactor=*/2, Shards,
2880 FirstArgShadow, SecondArgShadow);
2881
2882 OrShadow = CreateShadowCast(IRB, OrShadow, getShadowTy(&I));
2883
2884 setShadow(&I, OrShadow);
2885 setOriginForNaryOp(I);
2886 }
2887
2888 void visitFNeg(UnaryOperator &I) { handleShadowOr(I); }
2889
2890 // Handle multiplication by constant.
2891 //
2892 // Handle a special case of multiplication by constant that may have one or
2893 // more zeros in the lower bits. This makes corresponding number of lower bits
2894 // of the result zero as well. We model it by shifting the other operand
2895 // shadow left by the required number of bits. Effectively, we transform
2896 // (X * (A * 2**B)) to ((X << B) * A) and instrument (X << B) as (Sx << B).
2897 // We use multiplication by 2**N instead of shift to cover the case of
2898 // multiplication by 0, which may occur in some elements of a vector operand.
2899 void handleMulByConstant(BinaryOperator &I, Constant *ConstArg,
2900 Value *OtherArg) {
2901 Constant *ShadowMul;
2902 Type *Ty = ConstArg->getType();
2903 if (auto *VTy = dyn_cast<VectorType>(Ty)) {
2904 unsigned NumElements = cast<FixedVectorType>(VTy)->getNumElements();
2905 Type *EltTy = VTy->getElementType();
2907 for (unsigned Idx = 0; Idx < NumElements; ++Idx) {
2908 if (ConstantInt *Elt =
2910 const APInt &V = Elt->getValue();
2911 APInt V2 = APInt(V.getBitWidth(), 1) << V.countr_zero();
2912 Elements.push_back(ConstantInt::get(EltTy, V2));
2913 } else {
2914 Elements.push_back(ConstantInt::get(EltTy, 1));
2915 }
2916 }
2917 ShadowMul = ConstantVector::get(Elements);
2918 } else {
2919 if (ConstantInt *Elt = dyn_cast<ConstantInt>(ConstArg)) {
2920 const APInt &V = Elt->getValue();
2921 APInt V2 = APInt(V.getBitWidth(), 1) << V.countr_zero();
2922 ShadowMul = ConstantInt::get(Ty, V2);
2923 } else {
2924 ShadowMul = ConstantInt::get(Ty, 1);
2925 }
2926 }
2927
2928 IRBuilder<> IRB(&I);
2929 setShadow(&I,
2930 IRB.CreateMul(getShadow(OtherArg), ShadowMul, "msprop_mul_cst"));
2931 setOrigin(&I, getOrigin(OtherArg));
2932 }
2933
2934 void visitMul(BinaryOperator &I) {
2935 Constant *constOp0 = dyn_cast<Constant>(I.getOperand(0));
2936 Constant *constOp1 = dyn_cast<Constant>(I.getOperand(1));
2937 if (constOp0 && !constOp1)
2938 handleMulByConstant(I, constOp0, I.getOperand(1));
2939 else if (constOp1 && !constOp0)
2940 handleMulByConstant(I, constOp1, I.getOperand(0));
2941 else
2942 handleShadowOr(I);
2943 }
2944
2945 void visitFAdd(BinaryOperator &I) { handleShadowOr(I); }
2946 void visitFSub(BinaryOperator &I) { handleShadowOr(I); }
2947 void visitFMul(BinaryOperator &I) { handleShadowOr(I); }
2948 void visitAdd(BinaryOperator &I) { handleShadowOr(I); }
2949 void visitSub(BinaryOperator &I) { handleShadowOr(I); }
2950 void visitXor(BinaryOperator &I) { handleShadowOr(I); }
2951
2952 void handleIntegerDiv(Instruction &I) {
2953 IRBuilder<> IRB(&I);
2954 // Strict on the second argument.
2955 insertCheckShadowOf(I.getOperand(1), &I);
2956 setShadow(&I, getShadow(&I, 0));
2957 setOrigin(&I, getOrigin(&I, 0));
2958 }
2959
2960 void visitUDiv(BinaryOperator &I) { handleIntegerDiv(I); }
2961 void visitSDiv(BinaryOperator &I) { handleIntegerDiv(I); }
2962 void visitURem(BinaryOperator &I) { handleIntegerDiv(I); }
2963 void visitSRem(BinaryOperator &I) { handleIntegerDiv(I); }
2964
2965 // Floating point division is side-effect free. We can not require that the
2966 // divisor is fully initialized and must propagate shadow. See PR37523.
2967 void visitFDiv(BinaryOperator &I) { handleShadowOr(I); }
2968 void visitFRem(BinaryOperator &I) { handleShadowOr(I); }
2969
2970 /// Instrument == and != comparisons.
2971 ///
2972 /// Sometimes the comparison result is known even if some of the bits of the
2973 /// arguments are not.
2974 void handleEqualityComparison(ICmpInst &I) {
2975 IRBuilder<> IRB(&I);
2976 Value *A = I.getOperand(0);
2977 Value *B = I.getOperand(1);
2978 Value *Sa = getShadow(A);
2979 Value *Sb = getShadow(B);
2980
2981 // Get rid of pointers and vectors of pointers.
2982 // For ints (and vectors of ints), types of A and Sa match,
2983 // and this is a no-op.
2984 A = IRB.CreatePointerCast(A, Sa->getType());
2985 B = IRB.CreatePointerCast(B, Sb->getType());
2986
2987 // A == B <==> (C = A^B) == 0
2988 // A != B <==> (C = A^B) != 0
2989 // Sc = Sa | Sb
2990 Value *C = IRB.CreateXor(A, B);
2991 Value *Sc = IRB.CreateOr(Sa, Sb);
2992 // Now dealing with i = (C == 0) comparison (or C != 0, does not matter now)
2993 // Result is defined if one of the following is true
2994 // * there is a defined 1 bit in C
2995 // * C is fully defined
2996 // Si = !(C & ~Sc) && Sc
2998 Value *MinusOne = Constant::getAllOnesValue(Sc->getType());
2999 Value *LHS = IRB.CreateICmpNE(Sc, Zero);
3000 Value *RHS =
3001 IRB.CreateICmpEQ(IRB.CreateAnd(IRB.CreateXor(Sc, MinusOne), C), Zero);
3002 Value *Si = IRB.CreateAnd(LHS, RHS);
3003 Si->setName("_msprop_icmp");
3004 setShadow(&I, Si);
3005 setOriginForNaryOp(I);
3006 }
3007
3008 /// Instrument relational comparisons.
3009 ///
3010 /// This function does exact shadow propagation for all relational
3011 /// comparisons of integers, pointers and vectors of those.
3012 /// FIXME: output seems suboptimal when one of the operands is a constant
3013 void handleRelationalComparisonExact(ICmpInst &I) {
3014 IRBuilder<> IRB(&I);
3015 Value *A = I.getOperand(0);
3016 Value *B = I.getOperand(1);
3017 Value *Sa = getShadow(A);
3018 Value *Sb = getShadow(B);
3019
3020 // Get rid of pointers and vectors of pointers.
3021 // For ints (and vectors of ints), types of A and Sa match,
3022 // and this is a no-op.
3023 A = IRB.CreatePointerCast(A, Sa->getType());
3024 B = IRB.CreatePointerCast(B, Sb->getType());
3025
3026 // Let [a0, a1] be the interval of possible values of A, taking into account
3027 // its undefined bits. Let [b0, b1] be the interval of possible values of B.
3028 // Then (A cmp B) is defined iff (a0 cmp b1) == (a1 cmp b0).
3029 bool IsSigned = I.isSigned();
3030
3031 auto GetMinMaxUnsigned = [&](Value *V, Value *S) {
3032 if (IsSigned) {
3033 // Sign-flip to map from signed range to unsigned range. Relation A vs B
3034 // should be preserved, if checked with `getUnsignedPredicate()`.
3035 // Relationship between Amin, Amax, Bmin, Bmax also will not be
3036 // affected, as they are created by effectively adding/substructing from
3037 // A (or B) a value, derived from shadow, with no overflow, either
3038 // before or after sign flip.
3039 APInt MinVal =
3040 APInt::getSignedMinValue(V->getType()->getScalarSizeInBits());
3041 V = IRB.CreateXor(V, ConstantInt::get(V->getType(), MinVal));
3042 }
3043 // Minimize undefined bits.
3044 Value *Min = IRB.CreateAnd(V, IRB.CreateNot(S));
3045 Value *Max = IRB.CreateOr(V, S);
3046 return std::make_pair(Min, Max);
3047 };
3048
3049 auto [Amin, Amax] = GetMinMaxUnsigned(A, Sa);
3050 auto [Bmin, Bmax] = GetMinMaxUnsigned(B, Sb);
3051 Value *S1 = IRB.CreateICmp(I.getUnsignedPredicate(), Amin, Bmax);
3052 Value *S2 = IRB.CreateICmp(I.getUnsignedPredicate(), Amax, Bmin);
3053
3054 Value *Si = IRB.CreateXor(S1, S2);
3055 setShadow(&I, Si);
3056 setOriginForNaryOp(I);
3057 }
3058
3059 /// Instrument signed relational comparisons.
3060 ///
3061 /// Handle sign bit tests: x<0, x>=0, x<=-1, x>-1 by propagating the highest
3062 /// bit of the shadow. Everything else is delegated to handleShadowOr().
3063 void handleSignedRelationalComparison(ICmpInst &I) {
3064 Constant *constOp;
3065 Value *op = nullptr;
3067 if ((constOp = dyn_cast<Constant>(I.getOperand(1)))) {
3068 op = I.getOperand(0);
3069 pre = I.getPredicate();
3070 } else if ((constOp = dyn_cast<Constant>(I.getOperand(0)))) {
3071 op = I.getOperand(1);
3072 pre = I.getSwappedPredicate();
3073 } else {
3074 handleShadowOr(I);
3075 return;
3076 }
3077
3078 if ((constOp->isNullValue() &&
3079 (pre == CmpInst::ICMP_SLT || pre == CmpInst::ICMP_SGE)) ||
3080 (constOp->isAllOnesValue() &&
3081 (pre == CmpInst::ICMP_SGT || pre == CmpInst::ICMP_SLE))) {
3082 IRBuilder<> IRB(&I);
3083 Value *Shadow = IRB.CreateICmpSLT(getShadow(op), getCleanShadow(op),
3084 "_msprop_icmp_s");
3085 setShadow(&I, Shadow);
3086 setOrigin(&I, getOrigin(op));
3087 } else {
3088 handleShadowOr(I);
3089 }
3090 }
3091
3092 void visitICmpInst(ICmpInst &I) {
3093 if (!ClHandleICmp) {
3094 handleShadowOr(I);
3095 return;
3096 }
3097 if (I.isEquality()) {
3098 handleEqualityComparison(I);
3099 return;
3100 }
3101
3102 assert(I.isRelational());
3103 if (ClHandleICmpExact) {
3104 handleRelationalComparisonExact(I);
3105 return;
3106 }
3107 if (I.isSigned()) {
3108 handleSignedRelationalComparison(I);
3109 return;
3110 }
3111
3112 assert(I.isUnsigned());
3113 if ((isa<Constant>(I.getOperand(0)) || isa<Constant>(I.getOperand(1)))) {
3114 handleRelationalComparisonExact(I);
3115 return;
3116 }
3117
3118 handleShadowOr(I);
3119 }
3120
3121 void visitFCmpInst(FCmpInst &I) { handleShadowOr(I); }
3122
3123 void handleShift(BinaryOperator &I) {
3124 IRBuilder<> IRB(&I);
3125 // If any of the S2 bits are poisoned, the whole thing is poisoned.
3126 // Otherwise perform the same shift on S1.
3127 Value *S1 = getShadow(&I, 0);
3128 Value *S2 = getShadow(&I, 1);
3129 Value *S2Conv =
3130 IRB.CreateSExt(IRB.CreateICmpNE(S2, getCleanShadow(S2)), S2->getType());
3131 Value *V2 = I.getOperand(1);
3132 Value *Shift = IRB.CreateBinOp(I.getOpcode(), S1, V2);
3133 setShadow(&I, IRB.CreateOr(Shift, S2Conv));
3134 setOriginForNaryOp(I);
3135 }
3136
3137 void visitShl(BinaryOperator &I) { handleShift(I); }
3138 void visitAShr(BinaryOperator &I) { handleShift(I); }
3139 void visitLShr(BinaryOperator &I) { handleShift(I); }
3140
3141 void handleFunnelShift(IntrinsicInst &I) {
3142 IRBuilder<> IRB(&I);
3143 // If any of the S2 bits are poisoned, the whole thing is poisoned.
3144 // Otherwise perform the same shift on S0 and S1.
3145 Value *S0 = getShadow(&I, 0);
3146 Value *S1 = getShadow(&I, 1);
3147 Value *S2 = getShadow(&I, 2);
3148 Value *S2Conv =
3149 IRB.CreateSExt(IRB.CreateICmpNE(S2, getCleanShadow(S2)), S2->getType());
3150 Value *V2 = I.getOperand(2);
3151 Value *Shift = IRB.CreateIntrinsic(I.getIntrinsicID(), S2Conv->getType(),
3152 {S0, S1, V2});
3153 setShadow(&I, IRB.CreateOr(Shift, S2Conv));
3154 setOriginForNaryOp(I);
3155 }
3156
3157 /// Instrument llvm.memmove
3158 ///
3159 /// At this point we don't know if llvm.memmove will be inlined or not.
3160 /// If we don't instrument it and it gets inlined,
3161 /// our interceptor will not kick in and we will lose the memmove.
3162 /// If we instrument the call here, but it does not get inlined,
3163 /// we will memmove the shadow twice: which is bad in case
3164 /// of overlapping regions. So, we simply lower the intrinsic to a call.
3165 ///
3166 /// Similar situation exists for memcpy and memset.
3167 void visitMemMoveInst(MemMoveInst &I) {
3168 getShadow(I.getArgOperand(1)); // Ensure shadow initialized
3169 IRBuilder<> IRB(&I);
3170 IRB.CreateCall(MS.MemmoveFn,
3171 {I.getArgOperand(0), I.getArgOperand(1),
3172 IRB.CreateIntCast(I.getArgOperand(2), MS.IntptrTy, false)});
3174 }
3175
3176 /// Instrument memcpy
3177 ///
3178 /// Similar to memmove: avoid copying shadow twice. This is somewhat
3179 /// unfortunate as it may slowdown small constant memcpys.
3180 /// FIXME: consider doing manual inline for small constant sizes and proper
3181 /// alignment.
3182 ///
3183 /// Note: This also handles memcpy.inline, which promises no calls to external
3184 /// functions as an optimization. However, with instrumentation enabled this
3185 /// is difficult to promise; additionally, we know that the MSan runtime
3186 /// exists and provides __msan_memcpy(). Therefore, we assume that with
3187 /// instrumentation it's safe to turn memcpy.inline into a call to
3188 /// __msan_memcpy(). Should this be wrong, such as when implementing memcpy()
3189 /// itself, instrumentation should be disabled with the no_sanitize attribute.
3190 void visitMemCpyInst(MemCpyInst &I) {
3191 getShadow(I.getArgOperand(1)); // Ensure shadow initialized
3192 IRBuilder<> IRB(&I);
3193 IRB.CreateCall(MS.MemcpyFn,
3194 {I.getArgOperand(0), I.getArgOperand(1),
3195 IRB.CreateIntCast(I.getArgOperand(2), MS.IntptrTy, false)});
3197 }
3198
3199 // Same as memcpy.
3200 void visitMemSetInst(MemSetInst &I) {
3201 IRBuilder<> IRB(&I);
3202 IRB.CreateCall(
3203 MS.MemsetFn,
3204 {I.getArgOperand(0),
3205 IRB.CreateIntCast(I.getArgOperand(1), IRB.getInt32Ty(), false),
3206 IRB.CreateIntCast(I.getArgOperand(2), MS.IntptrTy, false)});
3208 }
3209
3210 void visitVAStartInst(VAStartInst &I) { VAHelper->visitVAStartInst(I); }
3211
3212 void visitVACopyInst(VACopyInst &I) { VAHelper->visitVACopyInst(I); }
3213
3214 /// Handle vector store-like intrinsics.
3215 ///
3216 /// Instrument intrinsics that look like a simple SIMD store: writes memory,
3217 /// has 1 pointer argument and 1 vector argument, returns void.
3218 bool handleVectorStoreIntrinsic(IntrinsicInst &I) {
3219 assert(I.arg_size() == 2);
3220
3221 IRBuilder<> IRB(&I);
3222 Value *Addr = I.getArgOperand(0);
3223 Value *Shadow = getShadow(&I, 1);
3224 Value *ShadowPtr, *OriginPtr;
3225
3226 // We don't know the pointer alignment (could be unaligned SSE store!).
3227 // Have to assume to worst case.
3228 std::tie(ShadowPtr, OriginPtr) = getShadowOriginPtr(
3229 Addr, IRB, Shadow->getType(), Align(1), /*isStore*/ true);
3230 IRB.CreateAlignedStore(Shadow, ShadowPtr, Align(1));
3231
3233 insertCheckShadowOf(Addr, &I);
3234
3235 // FIXME: factor out common code from materializeStores
3236 if (MS.TrackOrigins)
3237 IRB.CreateStore(getOrigin(&I, 1), OriginPtr);
3238 return true;
3239 }
3240
3241 /// Handle vector load-like intrinsics.
3242 ///
3243 /// Instrument intrinsics that look like a simple SIMD load: reads memory,
3244 /// has 1 pointer argument, returns a vector.
3245 bool handleVectorLoadIntrinsic(IntrinsicInst &I) {
3246 assert(I.arg_size() == 1);
3247
3248 IRBuilder<> IRB(&I);
3249 Value *Addr = I.getArgOperand(0);
3250
3251 Type *ShadowTy = getShadowTy(&I);
3252 Value *ShadowPtr = nullptr, *OriginPtr = nullptr;
3253 if (PropagateShadow) {
3254 // We don't know the pointer alignment (could be unaligned SSE load!).
3255 // Have to assume to worst case.
3256 const Align Alignment = Align(1);
3257 std::tie(ShadowPtr, OriginPtr) =
3258 getShadowOriginPtr(Addr, IRB, ShadowTy, Alignment, /*isStore*/ false);
3259 setShadow(&I,
3260 IRB.CreateAlignedLoad(ShadowTy, ShadowPtr, Alignment, "_msld"));
3261 } else {
3262 setShadow(&I, getCleanShadow(&I));
3263 }
3264
3266 insertCheckShadowOf(Addr, &I);
3267
3268 if (MS.TrackOrigins) {
3269 if (PropagateShadow)
3270 setOrigin(&I, IRB.CreateLoad(MS.OriginTy, OriginPtr));
3271 else
3272 setOrigin(&I, getCleanOrigin());
3273 }
3274 return true;
3275 }
3276
3277 /// Handle (SIMD arithmetic)-like intrinsics.
3278 ///
3279 /// Instrument intrinsics with any number of arguments of the same type [*],
3280 /// equal to the return type, plus a specified number of trailing flags of
3281 /// any type.
3282 ///
3283 /// [*] The type should be simple (no aggregates or pointers; vectors are
3284 /// fine).
3285 ///
3286 /// Caller guarantees that this intrinsic does not access memory.
3287 ///
3288 /// TODO: "horizontal"/"pairwise" intrinsics are often incorrectly matched by
3289 /// by this handler. See horizontalReduce().
3290 ///
3291 /// TODO: permutation intrinsics are also often incorrectly matched.
3292 [[maybe_unused]] bool
3293 maybeHandleSimpleNomemIntrinsic(IntrinsicInst &I,
3294 unsigned int trailingFlags) {
3295 Type *RetTy = I.getType();
3296 if (!(RetTy->isIntOrIntVectorTy() || RetTy->isFPOrFPVectorTy()))
3297 return false;
3298
3299 unsigned NumArgOperands = I.arg_size();
3300 assert(NumArgOperands >= trailingFlags);
3301 for (unsigned i = 0; i < NumArgOperands - trailingFlags; ++i) {
3302 Type *Ty = I.getArgOperand(i)->getType();
3303 if (Ty != RetTy)
3304 return false;
3305 }
3306
3307 IRBuilder<> IRB(&I);
3308 ShadowAndOriginCombiner SC(this, IRB);
3309 for (unsigned i = 0; i < NumArgOperands; ++i)
3310 SC.Add(I.getArgOperand(i));
3311 SC.Done(&I);
3312
3313 return true;
3314 }
3315
3316 /// Returns whether it was able to heuristically instrument unknown
3317 /// intrinsics.
3318 ///
3319 /// The main purpose of this code is to do something reasonable with all
3320 /// random intrinsics we might encounter, most importantly - SIMD intrinsics.
3321 /// We recognize several classes of intrinsics by their argument types and
3322 /// ModRefBehaviour and apply special instrumentation when we are reasonably
3323 /// sure that we know what the intrinsic does.
3324 ///
3325 /// We special-case intrinsics where this approach fails. See llvm.bswap
3326 /// handling as an example of that.
3327 bool maybeHandleUnknownIntrinsicUnlogged(IntrinsicInst &I) {
3328 unsigned NumArgOperands = I.arg_size();
3329 if (NumArgOperands == 0)
3330 return false;
3331
3332 if (NumArgOperands == 2 && I.getArgOperand(0)->getType()->isPointerTy() &&
3333 I.getArgOperand(1)->getType()->isVectorTy() &&
3334 I.getType()->isVoidTy() && !I.onlyReadsMemory()) {
3335 // This looks like a vector store.
3336 return handleVectorStoreIntrinsic(I);
3337 }
3338
3339 if (NumArgOperands == 1 && I.getArgOperand(0)->getType()->isPointerTy() &&
3340 I.getType()->isVectorTy() && I.onlyReadsMemory()) {
3341 // This looks like a vector load.
3342 return handleVectorLoadIntrinsic(I);
3343 }
3344
3345 if (I.doesNotAccessMemory())
3346 if (maybeHandleSimpleNomemIntrinsic(I, /*trailingFlags=*/0))
3347 return true;
3348
3349 // FIXME: detect and handle SSE maskstore/maskload?
3350 // Some cases are now handled in handleAVXMasked{Load,Store}.
3351 return false;
3352 }
3353
3354 bool maybeHandleUnknownIntrinsic(IntrinsicInst &I) {
3355 if (maybeHandleUnknownIntrinsicUnlogged(I)) {
3357 dumpInst(I);
3358
3359 LLVM_DEBUG(dbgs() << "UNKNOWN INSTRUCTION HANDLED HEURISTICALLY: " << I
3360 << "\n");
3361 return true;
3362 } else
3363 return false;
3364 }
3365
3366 void handleInvariantGroup(IntrinsicInst &I) {
3367 setShadow(&I, getShadow(&I, 0));
3368 setOrigin(&I, getOrigin(&I, 0));
3369 }
3370
3371 void handleLifetimeStart(IntrinsicInst &I) {
3372 if (!PoisonStack)
3373 return;
3374 AllocaInst *AI = dyn_cast<AllocaInst>(I.getArgOperand(0));
3375 if (AI)
3376 LifetimeStartList.push_back(std::make_pair(&I, AI));
3377 }
3378
3379 void handleBswap(IntrinsicInst &I) {
3380 IRBuilder<> IRB(&I);
3381 Value *Op = I.getArgOperand(0);
3382 Type *OpType = Op->getType();
3383 setShadow(&I, IRB.CreateIntrinsic(Intrinsic::bswap, ArrayRef(&OpType, 1),
3384 getShadow(Op)));
3385 setOrigin(&I, getOrigin(Op));
3386 }
3387
3388 // Uninitialized bits are ok if they appear after the leading/trailing 0's
3389 // and a 1. If the input is all zero, it is fully initialized iff
3390 // !is_zero_poison.
3391 //
3392 // e.g., for ctlz, with little-endian, if 0/1 are initialized bits with
3393 // concrete value 0/1, and ? is an uninitialized bit:
3394 // - 0001 0??? is fully initialized
3395 // - 000? ???? is fully uninitialized (*)
3396 // - ???? ???? is fully uninitialized
3397 // - 0000 0000 is fully uninitialized if is_zero_poison,
3398 // fully initialized otherwise
3399 //
3400 // (*) TODO: arguably, since the number of zeros is in the range [3, 8], we
3401 // only need to poison 4 bits.
3402 //
3403 // OutputShadow =
3404 // ((ConcreteZerosCount >= ShadowZerosCount) && !AllZeroShadow)
3405 // || (is_zero_poison && AllZeroSrc)
3406 void handleCountLeadingTrailingZeros(IntrinsicInst &I) {
3407 IRBuilder<> IRB(&I);
3408 Value *Src = I.getArgOperand(0);
3409 Value *SrcShadow = getShadow(Src);
3410
3411 Value *False = IRB.getInt1(false);
3412 Value *ConcreteZerosCount = IRB.CreateIntrinsic(
3413 I.getType(), I.getIntrinsicID(), {Src, /*is_zero_poison=*/False});
3414 Value *ShadowZerosCount = IRB.CreateIntrinsic(
3415 I.getType(), I.getIntrinsicID(), {SrcShadow, /*is_zero_poison=*/False});
3416
3417 Value *CompareConcreteZeros = IRB.CreateICmpUGE(
3418 ConcreteZerosCount, ShadowZerosCount, "_mscz_cmp_zeros");
3419
3420 Value *NotAllZeroShadow =
3421 IRB.CreateIsNotNull(SrcShadow, "_mscz_shadow_not_null");
3422 Value *OutputShadow =
3423 IRB.CreateAnd(CompareConcreteZeros, NotAllZeroShadow, "_mscz_main");
3424
3425 // If zero poison is requested, mix in with the shadow
3426 Constant *IsZeroPoison = cast<Constant>(I.getOperand(1));
3427 if (!IsZeroPoison->isZeroValue()) {
3428 Value *BoolZeroPoison = IRB.CreateIsNull(Src, "_mscz_bzp");
3429 OutputShadow = IRB.CreateOr(OutputShadow, BoolZeroPoison, "_mscz_bs");
3430 }
3431
3432 OutputShadow = IRB.CreateSExt(OutputShadow, getShadowTy(Src), "_mscz_os");
3433
3434 setShadow(&I, OutputShadow);
3435 setOriginForNaryOp(I);
3436 }
3437
3438 /// Handle Arm NEON vector convert intrinsics.
3439 ///
3440 /// e.g., <4 x i32> @llvm.aarch64.neon.fcvtpu.v4i32.v4f32(<4 x float>)
3441 /// i32 @llvm.aarch64.neon.fcvtms.i32.f64(double)
3442 ///
3443 /// For x86 SSE vector convert intrinsics, see
3444 /// handleSSEVectorConvertIntrinsic().
3445 void handleNEONVectorConvertIntrinsic(IntrinsicInst &I) {
3446 assert(I.arg_size() == 1);
3447
3448 IRBuilder<> IRB(&I);
3449 Value *S0 = getShadow(&I, 0);
3450
3451 /// For scalars:
3452 /// Since they are converting from floating-point to integer, the output is
3453 /// - fully uninitialized if *any* bit of the input is uninitialized
3454 /// - fully ininitialized if all bits of the input are ininitialized
3455 /// We apply the same principle on a per-field basis for vectors.
3456 Value *OutShadow = IRB.CreateSExt(IRB.CreateICmpNE(S0, getCleanShadow(S0)),
3457 getShadowTy(&I));
3458 setShadow(&I, OutShadow);
3459 setOriginForNaryOp(I);
3460 }
3461
3462 /// Some instructions have additional zero-elements in the return type
3463 /// e.g., <16 x i8> @llvm.x86.avx512.mask.pmov.qb.512(<8 x i64>, ...)
3464 ///
3465 /// This function will return a vector type with the same number of elements
3466 /// as the input, but same per-element width as the return value e.g.,
3467 /// <8 x i8>.
3468 FixedVectorType *maybeShrinkVectorShadowType(Value *Src, IntrinsicInst &I) {
3469 assert(isa<FixedVectorType>(getShadowTy(&I)));
3470 FixedVectorType *ShadowType = cast<FixedVectorType>(getShadowTy(&I));
3471
3472 // TODO: generalize beyond 2x?
3473 if (ShadowType->getElementCount() ==
3474 cast<VectorType>(Src->getType())->getElementCount() * 2)
3475 ShadowType = FixedVectorType::getHalfElementsVectorType(ShadowType);
3476
3477 assert(ShadowType->getElementCount() ==
3478 cast<VectorType>(Src->getType())->getElementCount());
3479
3480 return ShadowType;
3481 }
3482
3483 /// Doubles the length of a vector shadow (extending with zeros) if necessary
3484 /// to match the length of the shadow for the instruction.
3485 /// If scalar types of the vectors are different, it will use the type of the
3486 /// input vector.
3487 /// This is more type-safe than CreateShadowCast().
3488 Value *maybeExtendVectorShadowWithZeros(Value *Shadow, IntrinsicInst &I) {
3489 IRBuilder<> IRB(&I);
3491 assert(isa<FixedVectorType>(I.getType()));
3492
3493 Value *FullShadow = getCleanShadow(&I);
3494 unsigned ShadowNumElems =
3495 cast<FixedVectorType>(Shadow->getType())->getNumElements();
3496 unsigned FullShadowNumElems =
3497 cast<FixedVectorType>(FullShadow->getType())->getNumElements();
3498
3499 assert((ShadowNumElems == FullShadowNumElems) ||
3500 (ShadowNumElems * 2 == FullShadowNumElems));
3501
3502 if (ShadowNumElems == FullShadowNumElems) {
3503 FullShadow = Shadow;
3504 } else {
3505 // TODO: generalize beyond 2x?
3506 SmallVector<int, 32> ShadowMask(FullShadowNumElems);
3507 std::iota(ShadowMask.begin(), ShadowMask.end(), 0);
3508
3509 // Append zeros
3510 FullShadow =
3511 IRB.CreateShuffleVector(Shadow, getCleanShadow(Shadow), ShadowMask);
3512 }
3513
3514 return FullShadow;
3515 }
3516
3517 /// Handle x86 SSE vector conversion.
3518 ///
3519 /// e.g., single-precision to half-precision conversion:
3520 /// <8 x i16> @llvm.x86.vcvtps2ph.256(<8 x float> %a0, i32 0)
3521 /// <8 x i16> @llvm.x86.vcvtps2ph.128(<4 x float> %a0, i32 0)
3522 ///
3523 /// floating-point to integer:
3524 /// <4 x i32> @llvm.x86.sse2.cvtps2dq(<4 x float>)
3525 /// <4 x i32> @llvm.x86.sse2.cvtpd2dq(<2 x double>)
3526 ///
3527 /// Note: if the output has more elements, they are zero-initialized (and
3528 /// therefore the shadow will also be initialized).
3529 ///
3530 /// This differs from handleSSEVectorConvertIntrinsic() because it
3531 /// propagates uninitialized shadow (instead of checking the shadow).
3532 void handleSSEVectorConvertIntrinsicByProp(IntrinsicInst &I,
3533 bool HasRoundingMode) {
3534 if (HasRoundingMode) {
3535 assert(I.arg_size() == 2);
3536 [[maybe_unused]] Value *RoundingMode = I.getArgOperand(1);
3537 assert(RoundingMode->getType()->isIntegerTy());
3538 } else {
3539 assert(I.arg_size() == 1);
3540 }
3541
3542 Value *Src = I.getArgOperand(0);
3543 assert(Src->getType()->isVectorTy());
3544
3545 // The return type might have more elements than the input.
3546 // Temporarily shrink the return type's number of elements.
3547 VectorType *ShadowType = maybeShrinkVectorShadowType(Src, I);
3548
3549 IRBuilder<> IRB(&I);
3550 Value *S0 = getShadow(&I, 0);
3551
3552 /// For scalars:
3553 /// Since they are converting to and/or from floating-point, the output is:
3554 /// - fully uninitialized if *any* bit of the input is uninitialized
3555 /// - fully ininitialized if all bits of the input are ininitialized
3556 /// We apply the same principle on a per-field basis for vectors.
3557 Value *Shadow =
3558 IRB.CreateSExt(IRB.CreateICmpNE(S0, getCleanShadow(S0)), ShadowType);
3559
3560 // The return type might have more elements than the input.
3561 // Extend the return type back to its original width if necessary.
3562 Value *FullShadow = maybeExtendVectorShadowWithZeros(Shadow, I);
3563
3564 setShadow(&I, FullShadow);
3565 setOriginForNaryOp(I);
3566 }
3567
3568 // Instrument x86 SSE vector convert intrinsic.
3569 //
3570 // This function instruments intrinsics like cvtsi2ss:
3571 // %Out = int_xxx_cvtyyy(%ConvertOp)
3572 // or
3573 // %Out = int_xxx_cvtyyy(%CopyOp, %ConvertOp)
3574 // Intrinsic converts \p NumUsedElements elements of \p ConvertOp to the same
3575 // number \p Out elements, and (if has 2 arguments) copies the rest of the
3576 // elements from \p CopyOp.
3577 // In most cases conversion involves floating-point value which may trigger a
3578 // hardware exception when not fully initialized. For this reason we require
3579 // \p ConvertOp[0:NumUsedElements] to be fully initialized and trap otherwise.
3580 // We copy the shadow of \p CopyOp[NumUsedElements:] to \p
3581 // Out[NumUsedElements:]. This means that intrinsics without \p CopyOp always
3582 // return a fully initialized value.
3583 //
3584 // For Arm NEON vector convert intrinsics, see
3585 // handleNEONVectorConvertIntrinsic().
3586 void handleSSEVectorConvertIntrinsic(IntrinsicInst &I, int NumUsedElements,
3587 bool HasRoundingMode = false) {
3588 IRBuilder<> IRB(&I);
3589 Value *CopyOp, *ConvertOp;
3590
3591 assert((!HasRoundingMode ||
3592 isa<ConstantInt>(I.getArgOperand(I.arg_size() - 1))) &&
3593 "Invalid rounding mode");
3594
3595 switch (I.arg_size() - HasRoundingMode) {
3596 case 2:
3597 CopyOp = I.getArgOperand(0);
3598 ConvertOp = I.getArgOperand(1);
3599 break;
3600 case 1:
3601 ConvertOp = I.getArgOperand(0);
3602 CopyOp = nullptr;
3603 break;
3604 default:
3605 llvm_unreachable("Cvt intrinsic with unsupported number of arguments.");
3606 }
3607
3608 // The first *NumUsedElements* elements of ConvertOp are converted to the
3609 // same number of output elements. The rest of the output is copied from
3610 // CopyOp, or (if not available) filled with zeroes.
3611 // Combine shadow for elements of ConvertOp that are used in this operation,
3612 // and insert a check.
3613 // FIXME: consider propagating shadow of ConvertOp, at least in the case of
3614 // int->any conversion.
3615 Value *ConvertShadow = getShadow(ConvertOp);
3616 Value *AggShadow = nullptr;
3617 if (ConvertOp->getType()->isVectorTy()) {
3618 AggShadow = IRB.CreateExtractElement(
3619 ConvertShadow, ConstantInt::get(IRB.getInt32Ty(), 0));
3620 for (int i = 1; i < NumUsedElements; ++i) {
3621 Value *MoreShadow = IRB.CreateExtractElement(
3622 ConvertShadow, ConstantInt::get(IRB.getInt32Ty(), i));
3623 AggShadow = IRB.CreateOr(AggShadow, MoreShadow);
3624 }
3625 } else {
3626 AggShadow = ConvertShadow;
3627 }
3628 assert(AggShadow->getType()->isIntegerTy());
3629 insertCheckShadow(AggShadow, getOrigin(ConvertOp), &I);
3630
3631 // Build result shadow by zero-filling parts of CopyOp shadow that come from
3632 // ConvertOp.
3633 if (CopyOp) {
3634 assert(CopyOp->getType() == I.getType());
3635 assert(CopyOp->getType()->isVectorTy());
3636 Value *ResultShadow = getShadow(CopyOp);
3637 Type *EltTy = cast<VectorType>(ResultShadow->getType())->getElementType();
3638 for (int i = 0; i < NumUsedElements; ++i) {
3639 ResultShadow = IRB.CreateInsertElement(
3640 ResultShadow, ConstantInt::getNullValue(EltTy),
3641 ConstantInt::get(IRB.getInt32Ty(), i));
3642 }
3643 setShadow(&I, ResultShadow);
3644 setOrigin(&I, getOrigin(CopyOp));
3645 } else {
3646 setShadow(&I, getCleanShadow(&I));
3647 setOrigin(&I, getCleanOrigin());
3648 }
3649 }
3650
3651 // Given a scalar or vector, extract lower 64 bits (or less), and return all
3652 // zeroes if it is zero, and all ones otherwise.
3653 Value *Lower64ShadowExtend(IRBuilder<> &IRB, Value *S, Type *T) {
3654 if (S->getType()->isVectorTy())
3655 S = CreateShadowCast(IRB, S, IRB.getInt64Ty(), /* Signed */ true);
3656 assert(S->getType()->getPrimitiveSizeInBits() <= 64);
3657 Value *S2 = IRB.CreateICmpNE(S, getCleanShadow(S));
3658 return CreateShadowCast(IRB, S2, T, /* Signed */ true);
3659 }
3660
3661 // Given a vector, extract its first element, and return all
3662 // zeroes if it is zero, and all ones otherwise.
3663 Value *LowerElementShadowExtend(IRBuilder<> &IRB, Value *S, Type *T) {
3664 Value *S1 = IRB.CreateExtractElement(S, (uint64_t)0);
3665 Value *S2 = IRB.CreateICmpNE(S1, getCleanShadow(S1));
3666 return CreateShadowCast(IRB, S2, T, /* Signed */ true);
3667 }
3668
3669 Value *VariableShadowExtend(IRBuilder<> &IRB, Value *S) {
3670 Type *T = S->getType();
3671 assert(T->isVectorTy());
3672 Value *S2 = IRB.CreateICmpNE(S, getCleanShadow(S));
3673 return IRB.CreateSExt(S2, T);
3674 }
3675
3676 // Instrument vector shift intrinsic.
3677 //
3678 // This function instruments intrinsics like int_x86_avx2_psll_w.
3679 // Intrinsic shifts %In by %ShiftSize bits.
3680 // %ShiftSize may be a vector. In that case the lower 64 bits determine shift
3681 // size, and the rest is ignored. Behavior is defined even if shift size is
3682 // greater than register (or field) width.
3683 void handleVectorShiftIntrinsic(IntrinsicInst &I, bool Variable) {
3684 assert(I.arg_size() == 2);
3685 IRBuilder<> IRB(&I);
3686 // If any of the S2 bits are poisoned, the whole thing is poisoned.
3687 // Otherwise perform the same shift on S1.
3688 Value *S1 = getShadow(&I, 0);
3689 Value *S2 = getShadow(&I, 1);
3690 Value *S2Conv = Variable ? VariableShadowExtend(IRB, S2)
3691 : Lower64ShadowExtend(IRB, S2, getShadowTy(&I));
3692 Value *V1 = I.getOperand(0);
3693 Value *V2 = I.getOperand(1);
3694 Value *Shift = IRB.CreateCall(I.getFunctionType(), I.getCalledOperand(),
3695 {IRB.CreateBitCast(S1, V1->getType()), V2});
3696 Shift = IRB.CreateBitCast(Shift, getShadowTy(&I));
3697 setShadow(&I, IRB.CreateOr(Shift, S2Conv));
3698 setOriginForNaryOp(I);
3699 }
3700
3701 // Get an MMX-sized (64-bit) vector type, or optionally, other sized
3702 // vectors.
3703 Type *getMMXVectorTy(unsigned EltSizeInBits,
3704 unsigned X86_MMXSizeInBits = 64) {
3705 assert(EltSizeInBits != 0 && (X86_MMXSizeInBits % EltSizeInBits) == 0 &&
3706 "Illegal MMX vector element size");
3707 return FixedVectorType::get(IntegerType::get(*MS.C, EltSizeInBits),
3708 X86_MMXSizeInBits / EltSizeInBits);
3709 }
3710
3711 // Returns a signed counterpart for an (un)signed-saturate-and-pack
3712 // intrinsic.
3713 Intrinsic::ID getSignedPackIntrinsic(Intrinsic::ID id) {
3714 switch (id) {
3715 case Intrinsic::x86_sse2_packsswb_128:
3716 case Intrinsic::x86_sse2_packuswb_128:
3717 return Intrinsic::x86_sse2_packsswb_128;
3718
3719 case Intrinsic::x86_sse2_packssdw_128:
3720 case Intrinsic::x86_sse41_packusdw:
3721 return Intrinsic::x86_sse2_packssdw_128;
3722
3723 case Intrinsic::x86_avx2_packsswb:
3724 case Intrinsic::x86_avx2_packuswb:
3725 return Intrinsic::x86_avx2_packsswb;
3726
3727 case Intrinsic::x86_avx2_packssdw:
3728 case Intrinsic::x86_avx2_packusdw:
3729 return Intrinsic::x86_avx2_packssdw;
3730
3731 case Intrinsic::x86_mmx_packsswb:
3732 case Intrinsic::x86_mmx_packuswb:
3733 return Intrinsic::x86_mmx_packsswb;
3734
3735 case Intrinsic::x86_mmx_packssdw:
3736 return Intrinsic::x86_mmx_packssdw;
3737
3738 case Intrinsic::x86_avx512_packssdw_512:
3739 case Intrinsic::x86_avx512_packusdw_512:
3740 return Intrinsic::x86_avx512_packssdw_512;
3741
3742 case Intrinsic::x86_avx512_packsswb_512:
3743 case Intrinsic::x86_avx512_packuswb_512:
3744 return Intrinsic::x86_avx512_packsswb_512;
3745
3746 default:
3747 llvm_unreachable("unexpected intrinsic id");
3748 }
3749 }
3750
3751 // Instrument vector pack intrinsic.
3752 //
3753 // This function instruments intrinsics like x86_mmx_packsswb, that
3754 // packs elements of 2 input vectors into half as many bits with saturation.
3755 // Shadow is propagated with the signed variant of the same intrinsic applied
3756 // to sext(Sa != zeroinitializer), sext(Sb != zeroinitializer).
3757 // MMXEltSizeInBits is used only for x86mmx arguments.
3758 //
3759 // TODO: consider using GetMinMaxUnsigned() to handle saturation precisely
3760 void handleVectorPackIntrinsic(IntrinsicInst &I,
3761 unsigned MMXEltSizeInBits = 0) {
3762 assert(I.arg_size() == 2);
3763 IRBuilder<> IRB(&I);
3764 Value *S1 = getShadow(&I, 0);
3765 Value *S2 = getShadow(&I, 1);
3766 assert(S1->getType()->isVectorTy());
3767
3768 // SExt and ICmpNE below must apply to individual elements of input vectors.
3769 // In case of x86mmx arguments, cast them to appropriate vector types and
3770 // back.
3771 Type *T =
3772 MMXEltSizeInBits ? getMMXVectorTy(MMXEltSizeInBits) : S1->getType();
3773 if (MMXEltSizeInBits) {
3774 S1 = IRB.CreateBitCast(S1, T);
3775 S2 = IRB.CreateBitCast(S2, T);
3776 }
3777 Value *S1_ext =
3779 Value *S2_ext =
3781 if (MMXEltSizeInBits) {
3782 S1_ext = IRB.CreateBitCast(S1_ext, getMMXVectorTy(64));
3783 S2_ext = IRB.CreateBitCast(S2_ext, getMMXVectorTy(64));
3784 }
3785
3786 Value *S = IRB.CreateIntrinsic(getSignedPackIntrinsic(I.getIntrinsicID()),
3787 {S1_ext, S2_ext}, /*FMFSource=*/nullptr,
3788 "_msprop_vector_pack");
3789 if (MMXEltSizeInBits)
3790 S = IRB.CreateBitCast(S, getShadowTy(&I));
3791 setShadow(&I, S);
3792 setOriginForNaryOp(I);
3793 }
3794
3795 // Convert `Mask` into `<n x i1>`.
3796 Constant *createDppMask(unsigned Width, unsigned Mask) {
3798 for (auto &M : R) {
3799 M = ConstantInt::getBool(F.getContext(), Mask & 1);
3800 Mask >>= 1;
3801 }
3802 return ConstantVector::get(R);
3803 }
3804
3805 // Calculate output shadow as array of booleans `<n x i1>`, assuming if any
3806 // arg is poisoned, entire dot product is poisoned.
3807 Value *findDppPoisonedOutput(IRBuilder<> &IRB, Value *S, unsigned SrcMask,
3808 unsigned DstMask) {
3809 const unsigned Width =
3810 cast<FixedVectorType>(S->getType())->getNumElements();
3811
3812 S = IRB.CreateSelect(createDppMask(Width, SrcMask), S,
3814 Value *SElem = IRB.CreateOrReduce(S);
3815 Value *IsClean = IRB.CreateIsNull(SElem, "_msdpp");
3816 Value *DstMaskV = createDppMask(Width, DstMask);
3817
3818 return IRB.CreateSelect(
3819 IsClean, Constant::getNullValue(DstMaskV->getType()), DstMaskV);
3820 }
3821
3822 // See `Intel Intrinsics Guide` for `_dp_p*` instructions.
3823 //
3824 // 2 and 4 element versions produce single scalar of dot product, and then
3825 // puts it into elements of output vector, selected by 4 lowest bits of the
3826 // mask. Top 4 bits of the mask control which elements of input to use for dot
3827 // product.
3828 //
3829 // 8 element version mask still has only 4 bit for input, and 4 bit for output
3830 // mask. According to the spec it just operates as 4 element version on first
3831 // 4 elements of inputs and output, and then on last 4 elements of inputs and
3832 // output.
3833 void handleDppIntrinsic(IntrinsicInst &I) {
3834 IRBuilder<> IRB(&I);
3835
3836 Value *S0 = getShadow(&I, 0);
3837 Value *S1 = getShadow(&I, 1);
3838 Value *S = IRB.CreateOr(S0, S1);
3839
3840 const unsigned Width =
3841 cast<FixedVectorType>(S->getType())->getNumElements();
3842 assert(Width == 2 || Width == 4 || Width == 8);
3843
3844 const unsigned Mask = cast<ConstantInt>(I.getArgOperand(2))->getZExtValue();
3845 const unsigned SrcMask = Mask >> 4;
3846 const unsigned DstMask = Mask & 0xf;
3847
3848 // Calculate shadow as `<n x i1>`.
3849 Value *SI1 = findDppPoisonedOutput(IRB, S, SrcMask, DstMask);
3850 if (Width == 8) {
3851 // First 4 elements of shadow are already calculated. `makeDppShadow`
3852 // operats on 32 bit masks, so we can just shift masks, and repeat.
3853 SI1 = IRB.CreateOr(
3854 SI1, findDppPoisonedOutput(IRB, S, SrcMask << 4, DstMask << 4));
3855 }
3856 // Extend to real size of shadow, poisoning either all or none bits of an
3857 // element.
3858 S = IRB.CreateSExt(SI1, S->getType(), "_msdpp");
3859
3860 setShadow(&I, S);
3861 setOriginForNaryOp(I);
3862 }
3863
3864 Value *convertBlendvToSelectMask(IRBuilder<> &IRB, Value *C) {
3865 C = CreateAppToShadowCast(IRB, C);
3866 FixedVectorType *FVT = cast<FixedVectorType>(C->getType());
3867 unsigned ElSize = FVT->getElementType()->getPrimitiveSizeInBits();
3868 C = IRB.CreateAShr(C, ElSize - 1);
3869 FVT = FixedVectorType::get(IRB.getInt1Ty(), FVT->getNumElements());
3870 return IRB.CreateTrunc(C, FVT);
3871 }
3872
3873 // `blendv(f, t, c)` is effectively `select(c[top_bit], t, f)`.
3874 void handleBlendvIntrinsic(IntrinsicInst &I) {
3875 Value *C = I.getOperand(2);
3876 Value *T = I.getOperand(1);
3877 Value *F = I.getOperand(0);
3878
3879 Value *Sc = getShadow(&I, 2);
3880 Value *Oc = MS.TrackOrigins ? getOrigin(C) : nullptr;
3881
3882 {
3883 IRBuilder<> IRB(&I);
3884 // Extract top bit from condition and its shadow.
3885 C = convertBlendvToSelectMask(IRB, C);
3886 Sc = convertBlendvToSelectMask(IRB, Sc);
3887
3888 setShadow(C, Sc);
3889 setOrigin(C, Oc);
3890 }
3891
3892 handleSelectLikeInst(I, C, T, F);
3893 }
3894
3895 // Instrument sum-of-absolute-differences intrinsic.
3896 void handleVectorSadIntrinsic(IntrinsicInst &I, bool IsMMX = false) {
3897 const unsigned SignificantBitsPerResultElement = 16;
3898 Type *ResTy = IsMMX ? IntegerType::get(*MS.C, 64) : I.getType();
3899 unsigned ZeroBitsPerResultElement =
3900 ResTy->getScalarSizeInBits() - SignificantBitsPerResultElement;
3901
3902 IRBuilder<> IRB(&I);
3903 auto *Shadow0 = getShadow(&I, 0);
3904 auto *Shadow1 = getShadow(&I, 1);
3905 Value *S = IRB.CreateOr(Shadow0, Shadow1);
3906 S = IRB.CreateBitCast(S, ResTy);
3907 S = IRB.CreateSExt(IRB.CreateICmpNE(S, Constant::getNullValue(ResTy)),
3908 ResTy);
3909 S = IRB.CreateLShr(S, ZeroBitsPerResultElement);
3910 S = IRB.CreateBitCast(S, getShadowTy(&I));
3911 setShadow(&I, S);
3912 setOriginForNaryOp(I);
3913 }
3914
3915 // Instrument multiply-add(-accumulate)? intrinsics.
3916 //
3917 // e.g., Two operands:
3918 // <4 x i32> @llvm.x86.sse2.pmadd.wd(<8 x i16> %a, <8 x i16> %b)
3919 //
3920 // Two operands which require an EltSizeInBits override:
3921 // <1 x i64> @llvm.x86.mmx.pmadd.wd(<1 x i64> %a, <1 x i64> %b)
3922 //
3923 // Three operands:
3924 // <4 x i32> @llvm.x86.avx512.vpdpbusd.128
3925 // (<4 x i32> %s, <16 x i8> %a, <16 x i8> %b)
3926 // (this is equivalent to multiply-add on %a and %b, followed by
3927 // adding/"accumulating" %s. "Accumulation" stores the result in one
3928 // of the source registers, but this accumulate vs. add distinction
3929 // is lost when dealing with LLVM intrinsics.)
3930 //
3931 // ZeroPurifies means that multiplying a known-zero with an uninitialized
3932 // value results in an initialized value. This is applicable for integer
3933 // multiplication, but not floating-point (counter-example: NaN).
3934 void handleVectorPmaddIntrinsic(IntrinsicInst &I, unsigned ReductionFactor,
3935 bool ZeroPurifies,
3936 unsigned EltSizeInBits = 0) {
3937 IRBuilder<> IRB(&I);
3938
3939 [[maybe_unused]] FixedVectorType *ReturnType =
3940 cast<FixedVectorType>(I.getType());
3941 assert(isa<FixedVectorType>(ReturnType));
3942
3943 // Vectors A and B, and shadows
3944 Value *Va = nullptr;
3945 Value *Vb = nullptr;
3946 Value *Sa = nullptr;
3947 Value *Sb = nullptr;
3948
3949 assert(I.arg_size() == 2 || I.arg_size() == 3);
3950 if (I.arg_size() == 2) {
3951 Va = I.getOperand(0);
3952 Vb = I.getOperand(1);
3953
3954 Sa = getShadow(&I, 0);
3955 Sb = getShadow(&I, 1);
3956 } else if (I.arg_size() == 3) {
3957 // Operand 0 is the accumulator. We will deal with that below.
3958 Va = I.getOperand(1);
3959 Vb = I.getOperand(2);
3960
3961 Sa = getShadow(&I, 1);
3962 Sb = getShadow(&I, 2);
3963 }
3964
3965 FixedVectorType *ParamType = cast<FixedVectorType>(Va->getType());
3966 assert(ParamType == Vb->getType());
3967
3968 assert(ParamType->getPrimitiveSizeInBits() ==
3969 ReturnType->getPrimitiveSizeInBits());
3970
3971 if (I.arg_size() == 3) {
3972 [[maybe_unused]] auto *AccumulatorType =
3973 cast<FixedVectorType>(I.getOperand(0)->getType());
3974 assert(AccumulatorType == ReturnType);
3975 }
3976
3977 FixedVectorType *ImplicitReturnType =
3978 cast<FixedVectorType>(getShadowTy(ReturnType));
3979 // Step 1: instrument multiplication of corresponding vector elements
3980 if (EltSizeInBits) {
3981 ImplicitReturnType = cast<FixedVectorType>(
3982 getMMXVectorTy(EltSizeInBits * ReductionFactor,
3983 ParamType->getPrimitiveSizeInBits()));
3984 ParamType = cast<FixedVectorType>(
3985 getMMXVectorTy(EltSizeInBits, ParamType->getPrimitiveSizeInBits()));
3986
3987 Va = IRB.CreateBitCast(Va, ParamType);
3988 Vb = IRB.CreateBitCast(Vb, ParamType);
3989
3990 Sa = IRB.CreateBitCast(Sa, getShadowTy(ParamType));
3991 Sb = IRB.CreateBitCast(Sb, getShadowTy(ParamType));
3992 } else {
3993 assert(ParamType->getNumElements() ==
3994 ReturnType->getNumElements() * ReductionFactor);
3995 }
3996
3997 // Each element of the vector is represented by a single bit (poisoned or
3998 // not) e.g., <8 x i1>.
3999 Value *SaNonZero = IRB.CreateIsNotNull(Sa);
4000 Value *SbNonZero = IRB.CreateIsNotNull(Sb);
4001 Value *And;
4002 if (ZeroPurifies) {
4003 // Multiplying an *initialized* zero by an uninitialized element results
4004 // in an initialized zero element.
4005 //
4006 // This is analogous to bitwise AND, where "AND" of 0 and a poisoned value
4007 // results in an unpoisoned value. We can therefore adapt the visitAnd()
4008 // instrumentation:
4009 // OutShadow = (SaNonZero & SbNonZero)
4010 // | (VaNonZero & SbNonZero)
4011 // | (SaNonZero & VbNonZero)
4012 // where non-zero is checked on a per-element basis (not per bit).
4013 Value *VaInt = Va;
4014 Value *VbInt = Vb;
4015 if (!Va->getType()->isIntegerTy()) {
4016 VaInt = CreateAppToShadowCast(IRB, Va);
4017 VbInt = CreateAppToShadowCast(IRB, Vb);
4018 }
4019
4020 Value *VaNonZero = IRB.CreateIsNotNull(VaInt);
4021 Value *VbNonZero = IRB.CreateIsNotNull(VbInt);
4022
4023 Value *SaAndSbNonZero = IRB.CreateAnd(SaNonZero, SbNonZero);
4024 Value *VaAndSbNonZero = IRB.CreateAnd(VaNonZero, SbNonZero);
4025 Value *SaAndVbNonZero = IRB.CreateAnd(SaNonZero, VbNonZero);
4026
4027 And = IRB.CreateOr({SaAndSbNonZero, VaAndSbNonZero, SaAndVbNonZero});
4028 } else {
4029 And = IRB.CreateOr({SaNonZero, SbNonZero});
4030 }
4031
4032 // Extend <8 x i1> to <8 x i16>.
4033 // (The real pmadd intrinsic would have computed intermediate values of
4034 // <8 x i32>, but that is irrelevant for our shadow purposes because we
4035 // consider each element to be either fully initialized or fully
4036 // uninitialized.)
4037 And = IRB.CreateSExt(And, Sa->getType());
4038
4039 // Step 2: instrument horizontal add
4040 // We don't need bit-precise horizontalReduce because we only want to check
4041 // if each pair/quad of elements is fully zero.
4042 // Cast to <4 x i32>.
4043 Value *Horizontal = IRB.CreateBitCast(And, ImplicitReturnType);
4044
4045 // Compute <4 x i1>, then extend back to <4 x i32>.
4046 Value *OutShadow = IRB.CreateSExt(
4047 IRB.CreateICmpNE(Horizontal,
4048 Constant::getNullValue(Horizontal->getType())),
4049 ImplicitReturnType);
4050
4051 // Cast it back to the required fake return type (if MMX: <1 x i64>; for
4052 // AVX, it is already correct).
4053 if (EltSizeInBits)
4054 OutShadow = CreateShadowCast(IRB, OutShadow, getShadowTy(&I));
4055
4056 // Step 3 (if applicable): instrument accumulator
4057 if (I.arg_size() == 3)
4058 OutShadow = IRB.CreateOr(OutShadow, getShadow(&I, 0));
4059
4060 setShadow(&I, OutShadow);
4061 setOriginForNaryOp(I);
4062 }
4063
4064 // Instrument compare-packed intrinsic.
4065 // Basically, an or followed by sext(icmp ne 0) to end up with all-zeros or
4066 // all-ones shadow.
4067 void handleVectorComparePackedIntrinsic(IntrinsicInst &I) {
4068 IRBuilder<> IRB(&I);
4069 Type *ResTy = getShadowTy(&I);
4070 auto *Shadow0 = getShadow(&I, 0);
4071 auto *Shadow1 = getShadow(&I, 1);
4072 Value *S0 = IRB.CreateOr(Shadow0, Shadow1);
4073 Value *S = IRB.CreateSExt(
4074 IRB.CreateICmpNE(S0, Constant::getNullValue(ResTy)), ResTy);
4075 setShadow(&I, S);
4076 setOriginForNaryOp(I);
4077 }
4078
4079 // Instrument compare-scalar intrinsic.
4080 // This handles both cmp* intrinsics which return the result in the first
4081 // element of a vector, and comi* which return the result as i32.
4082 void handleVectorCompareScalarIntrinsic(IntrinsicInst &I) {
4083 IRBuilder<> IRB(&I);
4084 auto *Shadow0 = getShadow(&I, 0);
4085 auto *Shadow1 = getShadow(&I, 1);
4086 Value *S0 = IRB.CreateOr(Shadow0, Shadow1);
4087 Value *S = LowerElementShadowExtend(IRB, S0, getShadowTy(&I));
4088 setShadow(&I, S);
4089 setOriginForNaryOp(I);
4090 }
4091
4092 // Instrument generic vector reduction intrinsics
4093 // by ORing together all their fields.
4094 //
4095 // If AllowShadowCast is true, the return type does not need to be the same
4096 // type as the fields
4097 // e.g., declare i32 @llvm.aarch64.neon.uaddv.i32.v16i8(<16 x i8>)
4098 void handleVectorReduceIntrinsic(IntrinsicInst &I, bool AllowShadowCast) {
4099 assert(I.arg_size() == 1);
4100
4101 IRBuilder<> IRB(&I);
4102 Value *S = IRB.CreateOrReduce(getShadow(&I, 0));
4103 if (AllowShadowCast)
4104 S = CreateShadowCast(IRB, S, getShadowTy(&I));
4105 else
4106 assert(S->getType() == getShadowTy(&I));
4107 setShadow(&I, S);
4108 setOriginForNaryOp(I);
4109 }
4110
4111 // Similar to handleVectorReduceIntrinsic but with an initial starting value.
4112 // e.g., call float @llvm.vector.reduce.fadd.f32.v2f32(float %a0, <2 x float>
4113 // %a1)
4114 // shadow = shadow[a0] | shadow[a1.0] | shadow[a1.1]
4115 //
4116 // The type of the return value, initial starting value, and elements of the
4117 // vector must be identical.
4118 void handleVectorReduceWithStarterIntrinsic(IntrinsicInst &I) {
4119 assert(I.arg_size() == 2);
4120
4121 IRBuilder<> IRB(&I);
4122 Value *Shadow0 = getShadow(&I, 0);
4123 Value *Shadow1 = IRB.CreateOrReduce(getShadow(&I, 1));
4124 assert(Shadow0->getType() == Shadow1->getType());
4125 Value *S = IRB.CreateOr(Shadow0, Shadow1);
4126 assert(S->getType() == getShadowTy(&I));
4127 setShadow(&I, S);
4128 setOriginForNaryOp(I);
4129 }
4130
4131 // Instrument vector.reduce.or intrinsic.
4132 // Valid (non-poisoned) set bits in the operand pull low the
4133 // corresponding shadow bits.
4134 void handleVectorReduceOrIntrinsic(IntrinsicInst &I) {
4135 assert(I.arg_size() == 1);
4136
4137 IRBuilder<> IRB(&I);
4138 Value *OperandShadow = getShadow(&I, 0);
4139 Value *OperandUnsetBits = IRB.CreateNot(I.getOperand(0));
4140 Value *OperandUnsetOrPoison = IRB.CreateOr(OperandUnsetBits, OperandShadow);
4141 // Bit N is clean if any field's bit N is 1 and unpoison
4142 Value *OutShadowMask = IRB.CreateAndReduce(OperandUnsetOrPoison);
4143 // Otherwise, it is clean if every field's bit N is unpoison
4144 Value *OrShadow = IRB.CreateOrReduce(OperandShadow);
4145 Value *S = IRB.CreateAnd(OutShadowMask, OrShadow);
4146
4147 setShadow(&I, S);
4148 setOrigin(&I, getOrigin(&I, 0));
4149 }
4150
4151 // Instrument vector.reduce.and intrinsic.
4152 // Valid (non-poisoned) unset bits in the operand pull down the
4153 // corresponding shadow bits.
4154 void handleVectorReduceAndIntrinsic(IntrinsicInst &I) {
4155 assert(I.arg_size() == 1);
4156
4157 IRBuilder<> IRB(&I);
4158 Value *OperandShadow = getShadow(&I, 0);
4159 Value *OperandSetOrPoison = IRB.CreateOr(I.getOperand(0), OperandShadow);
4160 // Bit N is clean if any field's bit N is 0 and unpoison
4161 Value *OutShadowMask = IRB.CreateAndReduce(OperandSetOrPoison);
4162 // Otherwise, it is clean if every field's bit N is unpoison
4163 Value *OrShadow = IRB.CreateOrReduce(OperandShadow);
4164 Value *S = IRB.CreateAnd(OutShadowMask, OrShadow);
4165
4166 setShadow(&I, S);
4167 setOrigin(&I, getOrigin(&I, 0));
4168 }
4169
4170 void handleStmxcsr(IntrinsicInst &I) {
4171 IRBuilder<> IRB(&I);
4172 Value *Addr = I.getArgOperand(0);
4173 Type *Ty = IRB.getInt32Ty();
4174 Value *ShadowPtr =
4175 getShadowOriginPtr(Addr, IRB, Ty, Align(1), /*isStore*/ true).first;
4176
4177 IRB.CreateStore(getCleanShadow(Ty), ShadowPtr);
4178
4180 insertCheckShadowOf(Addr, &I);
4181 }
4182
4183 void handleLdmxcsr(IntrinsicInst &I) {
4184 if (!InsertChecks)
4185 return;
4186
4187 IRBuilder<> IRB(&I);
4188 Value *Addr = I.getArgOperand(0);
4189 Type *Ty = IRB.getInt32Ty();
4190 const Align Alignment = Align(1);
4191 Value *ShadowPtr, *OriginPtr;
4192 std::tie(ShadowPtr, OriginPtr) =
4193 getShadowOriginPtr(Addr, IRB, Ty, Alignment, /*isStore*/ false);
4194
4196 insertCheckShadowOf(Addr, &I);
4197
4198 Value *Shadow = IRB.CreateAlignedLoad(Ty, ShadowPtr, Alignment, "_ldmxcsr");
4199 Value *Origin = MS.TrackOrigins ? IRB.CreateLoad(MS.OriginTy, OriginPtr)
4200 : getCleanOrigin();
4201 insertCheckShadow(Shadow, Origin, &I);
4202 }
4203
4204 void handleMaskedExpandLoad(IntrinsicInst &I) {
4205 IRBuilder<> IRB(&I);
4206 Value *Ptr = I.getArgOperand(0);
4207 MaybeAlign Align = I.getParamAlign(0);
4208 Value *Mask = I.getArgOperand(1);
4209 Value *PassThru = I.getArgOperand(2);
4210
4212 insertCheckShadowOf(Ptr, &I);
4213 insertCheckShadowOf(Mask, &I);
4214 }
4215
4216 if (!PropagateShadow) {
4217 setShadow(&I, getCleanShadow(&I));
4218 setOrigin(&I, getCleanOrigin());
4219 return;
4220 }
4221
4222 Type *ShadowTy = getShadowTy(&I);
4223 Type *ElementShadowTy = cast<VectorType>(ShadowTy)->getElementType();
4224 auto [ShadowPtr, OriginPtr] =
4225 getShadowOriginPtr(Ptr, IRB, ElementShadowTy, Align, /*isStore*/ false);
4226
4227 Value *Shadow =
4228 IRB.CreateMaskedExpandLoad(ShadowTy, ShadowPtr, Align, Mask,
4229 getShadow(PassThru), "_msmaskedexpload");
4230
4231 setShadow(&I, Shadow);
4232
4233 // TODO: Store origins.
4234 setOrigin(&I, getCleanOrigin());
4235 }
4236
4237 void handleMaskedCompressStore(IntrinsicInst &I) {
4238 IRBuilder<> IRB(&I);
4239 Value *Values = I.getArgOperand(0);
4240 Value *Ptr = I.getArgOperand(1);
4241 MaybeAlign Align = I.getParamAlign(1);
4242 Value *Mask = I.getArgOperand(2);
4243
4245 insertCheckShadowOf(Ptr, &I);
4246 insertCheckShadowOf(Mask, &I);
4247 }
4248
4249 Value *Shadow = getShadow(Values);
4250 Type *ElementShadowTy =
4251 getShadowTy(cast<VectorType>(Values->getType())->getElementType());
4252 auto [ShadowPtr, OriginPtrs] =
4253 getShadowOriginPtr(Ptr, IRB, ElementShadowTy, Align, /*isStore*/ true);
4254
4255 IRB.CreateMaskedCompressStore(Shadow, ShadowPtr, Align, Mask);
4256
4257 // TODO: Store origins.
4258 }
4259
4260 void handleMaskedGather(IntrinsicInst &I) {
4261 IRBuilder<> IRB(&I);
4262 Value *Ptrs = I.getArgOperand(0);
4263 const Align Alignment = I.getParamAlign(0).valueOrOne();
4264 Value *Mask = I.getArgOperand(1);
4265 Value *PassThru = I.getArgOperand(2);
4266
4267 Type *PtrsShadowTy = getShadowTy(Ptrs);
4269 insertCheckShadowOf(Mask, &I);
4270 Value *MaskedPtrShadow = IRB.CreateSelect(
4271 Mask, getShadow(Ptrs), Constant::getNullValue((PtrsShadowTy)),
4272 "_msmaskedptrs");
4273 insertCheckShadow(MaskedPtrShadow, getOrigin(Ptrs), &I);
4274 }
4275
4276 if (!PropagateShadow) {
4277 setShadow(&I, getCleanShadow(&I));
4278 setOrigin(&I, getCleanOrigin());
4279 return;
4280 }
4281
4282 Type *ShadowTy = getShadowTy(&I);
4283 Type *ElementShadowTy = cast<VectorType>(ShadowTy)->getElementType();
4284 auto [ShadowPtrs, OriginPtrs] = getShadowOriginPtr(
4285 Ptrs, IRB, ElementShadowTy, Alignment, /*isStore*/ false);
4286
4287 Value *Shadow =
4288 IRB.CreateMaskedGather(ShadowTy, ShadowPtrs, Alignment, Mask,
4289 getShadow(PassThru), "_msmaskedgather");
4290
4291 setShadow(&I, Shadow);
4292
4293 // TODO: Store origins.
4294 setOrigin(&I, getCleanOrigin());
4295 }
4296
4297 void handleMaskedScatter(IntrinsicInst &I) {
4298 IRBuilder<> IRB(&I);
4299 Value *Values = I.getArgOperand(0);
4300 Value *Ptrs = I.getArgOperand(1);
4301 const Align Alignment = I.getParamAlign(1).valueOrOne();
4302 Value *Mask = I.getArgOperand(2);
4303
4304 Type *PtrsShadowTy = getShadowTy(Ptrs);
4306 insertCheckShadowOf(Mask, &I);
4307 Value *MaskedPtrShadow = IRB.CreateSelect(
4308 Mask, getShadow(Ptrs), Constant::getNullValue((PtrsShadowTy)),
4309 "_msmaskedptrs");
4310 insertCheckShadow(MaskedPtrShadow, getOrigin(Ptrs), &I);
4311 }
4312
4313 Value *Shadow = getShadow(Values);
4314 Type *ElementShadowTy =
4315 getShadowTy(cast<VectorType>(Values->getType())->getElementType());
4316 auto [ShadowPtrs, OriginPtrs] = getShadowOriginPtr(
4317 Ptrs, IRB, ElementShadowTy, Alignment, /*isStore*/ true);
4318
4319 IRB.CreateMaskedScatter(Shadow, ShadowPtrs, Alignment, Mask);
4320
4321 // TODO: Store origin.
4322 }
4323
4324 // Intrinsic::masked_store
4325 //
4326 // Note: handleAVXMaskedStore handles AVX/AVX2 variants, though AVX512 masked
4327 // stores are lowered to Intrinsic::masked_store.
4328 void handleMaskedStore(IntrinsicInst &I) {
4329 IRBuilder<> IRB(&I);
4330 Value *V = I.getArgOperand(0);
4331 Value *Ptr = I.getArgOperand(1);
4332 const Align Alignment = I.getParamAlign(1).valueOrOne();
4333 Value *Mask = I.getArgOperand(2);
4334 Value *Shadow = getShadow(V);
4335
4337 insertCheckShadowOf(Ptr, &I);
4338 insertCheckShadowOf(Mask, &I);
4339 }
4340
4341 Value *ShadowPtr;
4342 Value *OriginPtr;
4343 std::tie(ShadowPtr, OriginPtr) = getShadowOriginPtr(
4344 Ptr, IRB, Shadow->getType(), Alignment, /*isStore*/ true);
4345
4346 IRB.CreateMaskedStore(Shadow, ShadowPtr, Alignment, Mask);
4347
4348 if (!MS.TrackOrigins)
4349 return;
4350
4351 auto &DL = F.getDataLayout();
4352 paintOrigin(IRB, getOrigin(V), OriginPtr,
4353 DL.getTypeStoreSize(Shadow->getType()),
4354 std::max(Alignment, kMinOriginAlignment));
4355 }
4356
4357 // Intrinsic::masked_load
4358 //
4359 // Note: handleAVXMaskedLoad handles AVX/AVX2 variants, though AVX512 masked
4360 // loads are lowered to Intrinsic::masked_load.
4361 void handleMaskedLoad(IntrinsicInst &I) {
4362 IRBuilder<> IRB(&I);
4363 Value *Ptr = I.getArgOperand(0);
4364 const Align Alignment = I.getParamAlign(0).valueOrOne();
4365 Value *Mask = I.getArgOperand(1);
4366 Value *PassThru = I.getArgOperand(2);
4367
4369 insertCheckShadowOf(Ptr, &I);
4370 insertCheckShadowOf(Mask, &I);
4371 }
4372
4373 if (!PropagateShadow) {
4374 setShadow(&I, getCleanShadow(&I));
4375 setOrigin(&I, getCleanOrigin());
4376 return;
4377 }
4378
4379 Type *ShadowTy = getShadowTy(&I);
4380 Value *ShadowPtr, *OriginPtr;
4381 std::tie(ShadowPtr, OriginPtr) =
4382 getShadowOriginPtr(Ptr, IRB, ShadowTy, Alignment, /*isStore*/ false);
4383 setShadow(&I, IRB.CreateMaskedLoad(ShadowTy, ShadowPtr, Alignment, Mask,
4384 getShadow(PassThru), "_msmaskedld"));
4385
4386 if (!MS.TrackOrigins)
4387 return;
4388
4389 // Choose between PassThru's and the loaded value's origins.
4390 Value *MaskedPassThruShadow = IRB.CreateAnd(
4391 getShadow(PassThru), IRB.CreateSExt(IRB.CreateNeg(Mask), ShadowTy));
4392
4393 Value *NotNull = convertToBool(MaskedPassThruShadow, IRB, "_mscmp");
4394
4395 Value *PtrOrigin = IRB.CreateLoad(MS.OriginTy, OriginPtr);
4396 Value *Origin = IRB.CreateSelect(NotNull, getOrigin(PassThru), PtrOrigin);
4397
4398 setOrigin(&I, Origin);
4399 }
4400
4401 // e.g., void @llvm.x86.avx.maskstore.ps.256(ptr, <8 x i32>, <8 x float>)
4402 // dst mask src
4403 //
4404 // AVX512 masked stores are lowered to Intrinsic::masked_load and are handled
4405 // by handleMaskedStore.
4406 //
4407 // This function handles AVX and AVX2 masked stores; these use the MSBs of a
4408 // vector of integers, unlike the LLVM masked intrinsics, which require a
4409 // vector of booleans. X86InstCombineIntrinsic.cpp::simplifyX86MaskedLoad
4410 // mentions that the x86 backend does not know how to efficiently convert
4411 // from a vector of booleans back into the AVX mask format; therefore, they
4412 // (and we) do not reduce AVX/AVX2 masked intrinsics into LLVM masked
4413 // intrinsics.
4414 void handleAVXMaskedStore(IntrinsicInst &I) {
4415 assert(I.arg_size() == 3);
4416
4417 IRBuilder<> IRB(&I);
4418
4419 Value *Dst = I.getArgOperand(0);
4420 assert(Dst->getType()->isPointerTy() && "Destination is not a pointer!");
4421
4422 Value *Mask = I.getArgOperand(1);
4423 assert(isa<VectorType>(Mask->getType()) && "Mask is not a vector!");
4424
4425 Value *Src = I.getArgOperand(2);
4426 assert(isa<VectorType>(Src->getType()) && "Source is not a vector!");
4427
4428 const Align Alignment = Align(1);
4429
4430 Value *SrcShadow = getShadow(Src);
4431
4433 insertCheckShadowOf(Dst, &I);
4434 insertCheckShadowOf(Mask, &I);
4435 }
4436
4437 Value *DstShadowPtr;
4438 Value *DstOriginPtr;
4439 std::tie(DstShadowPtr, DstOriginPtr) = getShadowOriginPtr(
4440 Dst, IRB, SrcShadow->getType(), Alignment, /*isStore*/ true);
4441
4442 SmallVector<Value *, 2> ShadowArgs;
4443 ShadowArgs.append(1, DstShadowPtr);
4444 ShadowArgs.append(1, Mask);
4445 // The intrinsic may require floating-point but shadows can be arbitrary
4446 // bit patterns, of which some would be interpreted as "invalid"
4447 // floating-point values (NaN etc.); we assume the intrinsic will happily
4448 // copy them.
4449 ShadowArgs.append(1, IRB.CreateBitCast(SrcShadow, Src->getType()));
4450
4451 CallInst *CI =
4452 IRB.CreateIntrinsic(IRB.getVoidTy(), I.getIntrinsicID(), ShadowArgs);
4453 setShadow(&I, CI);
4454
4455 if (!MS.TrackOrigins)
4456 return;
4457
4458 // Approximation only
4459 auto &DL = F.getDataLayout();
4460 paintOrigin(IRB, getOrigin(Src), DstOriginPtr,
4461 DL.getTypeStoreSize(SrcShadow->getType()),
4462 std::max(Alignment, kMinOriginAlignment));
4463 }
4464
4465 // e.g., <8 x float> @llvm.x86.avx.maskload.ps.256(ptr, <8 x i32>)
4466 // return src mask
4467 //
4468 // Masked-off values are replaced with 0, which conveniently also represents
4469 // initialized memory.
4470 //
4471 // AVX512 masked stores are lowered to Intrinsic::masked_load and are handled
4472 // by handleMaskedStore.
4473 //
4474 // We do not combine this with handleMaskedLoad; see comment in
4475 // handleAVXMaskedStore for the rationale.
4476 //
4477 // This is subtly different than handleIntrinsicByApplyingToShadow(I, 1)
4478 // because we need to apply getShadowOriginPtr, not getShadow, to the first
4479 // parameter.
4480 void handleAVXMaskedLoad(IntrinsicInst &I) {
4481 assert(I.arg_size() == 2);
4482
4483 IRBuilder<> IRB(&I);
4484
4485 Value *Src = I.getArgOperand(0);
4486 assert(Src->getType()->isPointerTy() && "Source is not a pointer!");
4487
4488 Value *Mask = I.getArgOperand(1);
4489 assert(isa<VectorType>(Mask->getType()) && "Mask is not a vector!");
4490
4491 const Align Alignment = Align(1);
4492
4494 insertCheckShadowOf(Mask, &I);
4495 }
4496
4497 Type *SrcShadowTy = getShadowTy(Src);
4498 Value *SrcShadowPtr, *SrcOriginPtr;
4499 std::tie(SrcShadowPtr, SrcOriginPtr) =
4500 getShadowOriginPtr(Src, IRB, SrcShadowTy, Alignment, /*isStore*/ false);
4501
4502 SmallVector<Value *, 2> ShadowArgs;
4503 ShadowArgs.append(1, SrcShadowPtr);
4504 ShadowArgs.append(1, Mask);
4505
4506 CallInst *CI =
4507 IRB.CreateIntrinsic(I.getType(), I.getIntrinsicID(), ShadowArgs);
4508 // The AVX masked load intrinsics do not have integer variants. We use the
4509 // floating-point variants, which will happily copy the shadows even if
4510 // they are interpreted as "invalid" floating-point values (NaN etc.).
4511 setShadow(&I, IRB.CreateBitCast(CI, getShadowTy(&I)));
4512
4513 if (!MS.TrackOrigins)
4514 return;
4515
4516 // The "pass-through" value is always zero (initialized). To the extent
4517 // that that results in initialized aligned 4-byte chunks, the origin value
4518 // is ignored. It is therefore correct to simply copy the origin from src.
4519 Value *PtrSrcOrigin = IRB.CreateLoad(MS.OriginTy, SrcOriginPtr);
4520 setOrigin(&I, PtrSrcOrigin);
4521 }
4522
4523 // Test whether the mask indices are initialized, only checking the bits that
4524 // are actually used.
4525 //
4526 // e.g., if Idx is <32 x i16>, only (log2(32) == 5) bits of each index are
4527 // used/checked.
4528 void maskedCheckAVXIndexShadow(IRBuilder<> &IRB, Value *Idx, Instruction *I) {
4529 assert(isFixedIntVector(Idx));
4530 auto IdxVectorSize =
4531 cast<FixedVectorType>(Idx->getType())->getNumElements();
4532 assert(isPowerOf2_64(IdxVectorSize));
4533
4534 // Compiler isn't smart enough, let's help it
4535 if (isa<Constant>(Idx))
4536 return;
4537
4538 auto *IdxShadow = getShadow(Idx);
4539 Value *Truncated = IRB.CreateTrunc(
4540 IdxShadow,
4541 FixedVectorType::get(Type::getIntNTy(*MS.C, Log2_64(IdxVectorSize)),
4542 IdxVectorSize));
4543 insertCheckShadow(Truncated, getOrigin(Idx), I);
4544 }
4545
4546 // Instrument AVX permutation intrinsic.
4547 // We apply the same permutation (argument index 1) to the shadow.
4548 void handleAVXVpermilvar(IntrinsicInst &I) {
4549 IRBuilder<> IRB(&I);
4550 Value *Shadow = getShadow(&I, 0);
4551 maskedCheckAVXIndexShadow(IRB, I.getArgOperand(1), &I);
4552
4553 // Shadows are integer-ish types but some intrinsics require a
4554 // different (e.g., floating-point) type.
4555 Shadow = IRB.CreateBitCast(Shadow, I.getArgOperand(0)->getType());
4556 CallInst *CI = IRB.CreateIntrinsic(I.getType(), I.getIntrinsicID(),
4557 {Shadow, I.getArgOperand(1)});
4558
4559 setShadow(&I, IRB.CreateBitCast(CI, getShadowTy(&I)));
4560 setOriginForNaryOp(I);
4561 }
4562
4563 // Instrument AVX permutation intrinsic.
4564 // We apply the same permutation (argument index 1) to the shadows.
4565 void handleAVXVpermi2var(IntrinsicInst &I) {
4566 assert(I.arg_size() == 3);
4567 assert(isa<FixedVectorType>(I.getArgOperand(0)->getType()));
4568 assert(isa<FixedVectorType>(I.getArgOperand(1)->getType()));
4569 assert(isa<FixedVectorType>(I.getArgOperand(2)->getType()));
4570 [[maybe_unused]] auto ArgVectorSize =
4571 cast<FixedVectorType>(I.getArgOperand(0)->getType())->getNumElements();
4572 assert(cast<FixedVectorType>(I.getArgOperand(1)->getType())
4573 ->getNumElements() == ArgVectorSize);
4574 assert(cast<FixedVectorType>(I.getArgOperand(2)->getType())
4575 ->getNumElements() == ArgVectorSize);
4576 assert(I.getArgOperand(0)->getType() == I.getArgOperand(2)->getType());
4577 assert(I.getType() == I.getArgOperand(0)->getType());
4578 assert(I.getArgOperand(1)->getType()->isIntOrIntVectorTy());
4579 IRBuilder<> IRB(&I);
4580 Value *AShadow = getShadow(&I, 0);
4581 Value *Idx = I.getArgOperand(1);
4582 Value *BShadow = getShadow(&I, 2);
4583
4584 maskedCheckAVXIndexShadow(IRB, Idx, &I);
4585
4586 // Shadows are integer-ish types but some intrinsics require a
4587 // different (e.g., floating-point) type.
4588 AShadow = IRB.CreateBitCast(AShadow, I.getArgOperand(0)->getType());
4589 BShadow = IRB.CreateBitCast(BShadow, I.getArgOperand(2)->getType());
4590 CallInst *CI = IRB.CreateIntrinsic(I.getType(), I.getIntrinsicID(),
4591 {AShadow, Idx, BShadow});
4592 setShadow(&I, IRB.CreateBitCast(CI, getShadowTy(&I)));
4593 setOriginForNaryOp(I);
4594 }
4595
4596 [[maybe_unused]] static bool isFixedIntVectorTy(const Type *T) {
4597 return isa<FixedVectorType>(T) && T->isIntOrIntVectorTy();
4598 }
4599
4600 [[maybe_unused]] static bool isFixedFPVectorTy(const Type *T) {
4601 return isa<FixedVectorType>(T) && T->isFPOrFPVectorTy();
4602 }
4603
4604 [[maybe_unused]] static bool isFixedIntVector(const Value *V) {
4605 return isFixedIntVectorTy(V->getType());
4606 }
4607
4608 [[maybe_unused]] static bool isFixedFPVector(const Value *V) {
4609 return isFixedFPVectorTy(V->getType());
4610 }
4611
4612 // e.g., <16 x i32> @llvm.x86.avx512.mask.cvtps2dq.512
4613 // (<16 x float> a, <16 x i32> writethru, i16 mask,
4614 // i32 rounding)
4615 //
4616 // Inconveniently, some similar intrinsics have a different operand order:
4617 // <16 x i16> @llvm.x86.avx512.mask.vcvtps2ph.512
4618 // (<16 x float> a, i32 rounding, <16 x i16> writethru,
4619 // i16 mask)
4620 //
4621 // If the return type has more elements than A, the excess elements are
4622 // zeroed (and the corresponding shadow is initialized).
4623 // <8 x i16> @llvm.x86.avx512.mask.vcvtps2ph.128
4624 // (<4 x float> a, i32 rounding, <8 x i16> writethru,
4625 // i8 mask)
4626 //
4627 // dst[i] = mask[i] ? convert(a[i]) : writethru[i]
4628 // dst_shadow[i] = mask[i] ? all_or_nothing(a_shadow[i]) : writethru_shadow[i]
4629 // where all_or_nothing(x) is fully uninitialized if x has any
4630 // uninitialized bits
4631 void handleAVX512VectorConvertFPToInt(IntrinsicInst &I, bool LastMask) {
4632 IRBuilder<> IRB(&I);
4633
4634 assert(I.arg_size() == 4);
4635 Value *A = I.getOperand(0);
4636 Value *WriteThrough;
4637 Value *Mask;
4639 if (LastMask) {
4640 WriteThrough = I.getOperand(2);
4641 Mask = I.getOperand(3);
4642 RoundingMode = I.getOperand(1);
4643 } else {
4644 WriteThrough = I.getOperand(1);
4645 Mask = I.getOperand(2);
4646 RoundingMode = I.getOperand(3);
4647 }
4648
4649 assert(isFixedFPVector(A));
4650 assert(isFixedIntVector(WriteThrough));
4651
4652 unsigned ANumElements =
4653 cast<FixedVectorType>(A->getType())->getNumElements();
4654 [[maybe_unused]] unsigned WriteThruNumElements =
4655 cast<FixedVectorType>(WriteThrough->getType())->getNumElements();
4656 assert(ANumElements == WriteThruNumElements ||
4657 ANumElements * 2 == WriteThruNumElements);
4658
4659 assert(Mask->getType()->isIntegerTy());
4660 unsigned MaskNumElements = Mask->getType()->getScalarSizeInBits();
4661 assert(ANumElements == MaskNumElements ||
4662 ANumElements * 2 == MaskNumElements);
4663
4664 assert(WriteThruNumElements == MaskNumElements);
4665
4666 // Some bits of the mask may be unused, though it's unusual to have partly
4667 // uninitialized bits.
4668 insertCheckShadowOf(Mask, &I);
4669
4670 assert(RoundingMode->getType()->isIntegerTy());
4671 // Only some bits of the rounding mode are used, though it's very
4672 // unusual to have uninitialized bits there (more commonly, it's a
4673 // constant).
4674 insertCheckShadowOf(RoundingMode, &I);
4675
4676 assert(I.getType() == WriteThrough->getType());
4677
4678 Value *AShadow = getShadow(A);
4679 AShadow = maybeExtendVectorShadowWithZeros(AShadow, I);
4680
4681 if (ANumElements * 2 == MaskNumElements) {
4682 // Ensure that the irrelevant bits of the mask are zero, hence selecting
4683 // from the zeroed shadow instead of the writethrough's shadow.
4684 Mask =
4685 IRB.CreateTrunc(Mask, IRB.getIntNTy(ANumElements), "_ms_mask_trunc");
4686 Mask =
4687 IRB.CreateZExt(Mask, IRB.getIntNTy(MaskNumElements), "_ms_mask_zext");
4688 }
4689
4690 // Convert i16 mask to <16 x i1>
4691 Mask = IRB.CreateBitCast(
4692 Mask, FixedVectorType::get(IRB.getInt1Ty(), MaskNumElements),
4693 "_ms_mask_bitcast");
4694
4695 /// For floating-point to integer conversion, the output is:
4696 /// - fully uninitialized if *any* bit of the input is uninitialized
4697 /// - fully ininitialized if all bits of the input are ininitialized
4698 /// We apply the same principle on a per-element basis for vectors.
4699 ///
4700 /// We use the scalar width of the return type instead of A's.
4701 AShadow = IRB.CreateSExt(
4702 IRB.CreateICmpNE(AShadow, getCleanShadow(AShadow->getType())),
4703 getShadowTy(&I), "_ms_a_shadow");
4704
4705 Value *WriteThroughShadow = getShadow(WriteThrough);
4706 Value *Shadow = IRB.CreateSelect(Mask, AShadow, WriteThroughShadow,
4707 "_ms_writethru_select");
4708
4709 setShadow(&I, Shadow);
4710 setOriginForNaryOp(I);
4711 }
4712
4713 // Instrument BMI / BMI2 intrinsics.
4714 // All of these intrinsics are Z = I(X, Y)
4715 // where the types of all operands and the result match, and are either i32 or
4716 // i64. The following instrumentation happens to work for all of them:
4717 // Sz = I(Sx, Y) | (sext (Sy != 0))
4718 void handleBmiIntrinsic(IntrinsicInst &I) {
4719 IRBuilder<> IRB(&I);
4720 Type *ShadowTy = getShadowTy(&I);
4721
4722 // If any bit of the mask operand is poisoned, then the whole thing is.
4723 Value *SMask = getShadow(&I, 1);
4724 SMask = IRB.CreateSExt(IRB.CreateICmpNE(SMask, getCleanShadow(ShadowTy)),
4725 ShadowTy);
4726 // Apply the same intrinsic to the shadow of the first operand.
4727 Value *S = IRB.CreateCall(I.getCalledFunction(),
4728 {getShadow(&I, 0), I.getOperand(1)});
4729 S = IRB.CreateOr(SMask, S);
4730 setShadow(&I, S);
4731 setOriginForNaryOp(I);
4732 }
4733
4734 static SmallVector<int, 8> getPclmulMask(unsigned Width, bool OddElements) {
4735 SmallVector<int, 8> Mask;
4736 for (unsigned X = OddElements ? 1 : 0; X < Width; X += 2) {
4737 Mask.append(2, X);
4738 }
4739 return Mask;
4740 }
4741
4742 // Instrument pclmul intrinsics.
4743 // These intrinsics operate either on odd or on even elements of the input
4744 // vectors, depending on the constant in the 3rd argument, ignoring the rest.
4745 // Replace the unused elements with copies of the used ones, ex:
4746 // (0, 1, 2, 3) -> (0, 0, 2, 2) (even case)
4747 // or
4748 // (0, 1, 2, 3) -> (1, 1, 3, 3) (odd case)
4749 // and then apply the usual shadow combining logic.
4750 void handlePclmulIntrinsic(IntrinsicInst &I) {
4751 IRBuilder<> IRB(&I);
4752 unsigned Width =
4753 cast<FixedVectorType>(I.getArgOperand(0)->getType())->getNumElements();
4754 assert(isa<ConstantInt>(I.getArgOperand(2)) &&
4755 "pclmul 3rd operand must be a constant");
4756 unsigned Imm = cast<ConstantInt>(I.getArgOperand(2))->getZExtValue();
4757 Value *Shuf0 = IRB.CreateShuffleVector(getShadow(&I, 0),
4758 getPclmulMask(Width, Imm & 0x01));
4759 Value *Shuf1 = IRB.CreateShuffleVector(getShadow(&I, 1),
4760 getPclmulMask(Width, Imm & 0x10));
4761 ShadowAndOriginCombiner SOC(this, IRB);
4762 SOC.Add(Shuf0, getOrigin(&I, 0));
4763 SOC.Add(Shuf1, getOrigin(&I, 1));
4764 SOC.Done(&I);
4765 }
4766
4767 // Instrument _mm_*_sd|ss intrinsics
4768 void handleUnarySdSsIntrinsic(IntrinsicInst &I) {
4769 IRBuilder<> IRB(&I);
4770 unsigned Width =
4771 cast<FixedVectorType>(I.getArgOperand(0)->getType())->getNumElements();
4772 Value *First = getShadow(&I, 0);
4773 Value *Second = getShadow(&I, 1);
4774 // First element of second operand, remaining elements of first operand
4775 SmallVector<int, 16> Mask;
4776 Mask.push_back(Width);
4777 for (unsigned i = 1; i < Width; i++)
4778 Mask.push_back(i);
4779 Value *Shadow = IRB.CreateShuffleVector(First, Second, Mask);
4780
4781 setShadow(&I, Shadow);
4782 setOriginForNaryOp(I);
4783 }
4784
4785 void handleVtestIntrinsic(IntrinsicInst &I) {
4786 IRBuilder<> IRB(&I);
4787 Value *Shadow0 = getShadow(&I, 0);
4788 Value *Shadow1 = getShadow(&I, 1);
4789 Value *Or = IRB.CreateOr(Shadow0, Shadow1);
4790 Value *NZ = IRB.CreateICmpNE(Or, Constant::getNullValue(Or->getType()));
4791 Value *Scalar = convertShadowToScalar(NZ, IRB);
4792 Value *Shadow = IRB.CreateZExt(Scalar, getShadowTy(&I));
4793
4794 setShadow(&I, Shadow);
4795 setOriginForNaryOp(I);
4796 }
4797
4798 void handleBinarySdSsIntrinsic(IntrinsicInst &I) {
4799 IRBuilder<> IRB(&I);
4800 unsigned Width =
4801 cast<FixedVectorType>(I.getArgOperand(0)->getType())->getNumElements();
4802 Value *First = getShadow(&I, 0);
4803 Value *Second = getShadow(&I, 1);
4804 Value *OrShadow = IRB.CreateOr(First, Second);
4805 // First element of both OR'd together, remaining elements of first operand
4806 SmallVector<int, 16> Mask;
4807 Mask.push_back(Width);
4808 for (unsigned i = 1; i < Width; i++)
4809 Mask.push_back(i);
4810 Value *Shadow = IRB.CreateShuffleVector(First, OrShadow, Mask);
4811
4812 setShadow(&I, Shadow);
4813 setOriginForNaryOp(I);
4814 }
4815
4816 // _mm_round_ps / _mm_round_ps.
4817 // Similar to maybeHandleSimpleNomemIntrinsic except
4818 // the second argument is guaranteed to be a constant integer.
4819 void handleRoundPdPsIntrinsic(IntrinsicInst &I) {
4820 assert(I.getArgOperand(0)->getType() == I.getType());
4821 assert(I.arg_size() == 2);
4822 assert(isa<ConstantInt>(I.getArgOperand(1)));
4823
4824 IRBuilder<> IRB(&I);
4825 ShadowAndOriginCombiner SC(this, IRB);
4826 SC.Add(I.getArgOperand(0));
4827 SC.Done(&I);
4828 }
4829
4830 // Instrument @llvm.abs intrinsic.
4831 //
4832 // e.g., i32 @llvm.abs.i32 (i32 <Src>, i1 <is_int_min_poison>)
4833 // <4 x i32> @llvm.abs.v4i32(<4 x i32> <Src>, i1 <is_int_min_poison>)
4834 void handleAbsIntrinsic(IntrinsicInst &I) {
4835 assert(I.arg_size() == 2);
4836 Value *Src = I.getArgOperand(0);
4837 Value *IsIntMinPoison = I.getArgOperand(1);
4838
4839 assert(I.getType()->isIntOrIntVectorTy());
4840
4841 assert(Src->getType() == I.getType());
4842
4843 assert(IsIntMinPoison->getType()->isIntegerTy());
4844 assert(IsIntMinPoison->getType()->getIntegerBitWidth() == 1);
4845
4846 IRBuilder<> IRB(&I);
4847 Value *SrcShadow = getShadow(Src);
4848
4849 APInt MinVal =
4850 APInt::getSignedMinValue(Src->getType()->getScalarSizeInBits());
4851 Value *MinValVec = ConstantInt::get(Src->getType(), MinVal);
4852 Value *SrcIsMin = IRB.CreateICmp(CmpInst::ICMP_EQ, Src, MinValVec);
4853
4854 Value *PoisonedShadow = getPoisonedShadow(Src);
4855 Value *PoisonedIfIntMinShadow =
4856 IRB.CreateSelect(SrcIsMin, PoisonedShadow, SrcShadow);
4857 Value *Shadow =
4858 IRB.CreateSelect(IsIntMinPoison, PoisonedIfIntMinShadow, SrcShadow);
4859
4860 setShadow(&I, Shadow);
4861 setOrigin(&I, getOrigin(&I, 0));
4862 }
4863
4864 void handleIsFpClass(IntrinsicInst &I) {
4865 IRBuilder<> IRB(&I);
4866 Value *Shadow = getShadow(&I, 0);
4867 setShadow(&I, IRB.CreateICmpNE(Shadow, getCleanShadow(Shadow)));
4868 setOrigin(&I, getOrigin(&I, 0));
4869 }
4870
4871 void handleArithmeticWithOverflow(IntrinsicInst &I) {
4872 IRBuilder<> IRB(&I);
4873 Value *Shadow0 = getShadow(&I, 0);
4874 Value *Shadow1 = getShadow(&I, 1);
4875 Value *ShadowElt0 = IRB.CreateOr(Shadow0, Shadow1);
4876 Value *ShadowElt1 =
4877 IRB.CreateICmpNE(ShadowElt0, getCleanShadow(ShadowElt0));
4878
4879 Value *Shadow = PoisonValue::get(getShadowTy(&I));
4880 Shadow = IRB.CreateInsertValue(Shadow, ShadowElt0, 0);
4881 Shadow = IRB.CreateInsertValue(Shadow, ShadowElt1, 1);
4882
4883 setShadow(&I, Shadow);
4884 setOriginForNaryOp(I);
4885 }
4886
4887 Value *extractLowerShadow(IRBuilder<> &IRB, Value *V) {
4888 assert(isa<FixedVectorType>(V->getType()));
4889 assert(cast<FixedVectorType>(V->getType())->getNumElements() > 0);
4890 Value *Shadow = getShadow(V);
4891 return IRB.CreateExtractElement(Shadow,
4892 ConstantInt::get(IRB.getInt32Ty(), 0));
4893 }
4894
4895 // Handle llvm.x86.avx512.mask.pmov{,s,us}.*.512
4896 //
4897 // e.g., call <16 x i8> @llvm.x86.avx512.mask.pmov.qb.512
4898 // (<8 x i64>, <16 x i8>, i8)
4899 // A WriteThru Mask
4900 //
4901 // call <16 x i8> @llvm.x86.avx512.mask.pmovs.db.512
4902 // (<16 x i32>, <16 x i8>, i16)
4903 //
4904 // Dst[i] = Mask[i] ? truncate_or_saturate(A[i]) : WriteThru[i]
4905 // Dst_shadow[i] = Mask[i] ? truncate(A_shadow[i]) : WriteThru_shadow[i]
4906 //
4907 // If Dst has more elements than A, the excess elements are zeroed (and the
4908 // corresponding shadow is initialized).
4909 //
4910 // Note: for PMOV (truncation), handleIntrinsicByApplyingToShadow is precise
4911 // and is much faster than this handler.
4912 void handleAVX512VectorDownConvert(IntrinsicInst &I) {
4913 IRBuilder<> IRB(&I);
4914
4915 assert(I.arg_size() == 3);
4916 Value *A = I.getOperand(0);
4917 Value *WriteThrough = I.getOperand(1);
4918 Value *Mask = I.getOperand(2);
4919
4920 assert(isFixedIntVector(A));
4921 assert(isFixedIntVector(WriteThrough));
4922
4923 unsigned ANumElements =
4924 cast<FixedVectorType>(A->getType())->getNumElements();
4925 unsigned OutputNumElements =
4926 cast<FixedVectorType>(WriteThrough->getType())->getNumElements();
4927 assert(ANumElements == OutputNumElements ||
4928 ANumElements * 2 == OutputNumElements);
4929
4930 assert(Mask->getType()->isIntegerTy());
4931 assert(Mask->getType()->getScalarSizeInBits() == ANumElements);
4932 insertCheckShadowOf(Mask, &I);
4933
4934 assert(I.getType() == WriteThrough->getType());
4935
4936 // Widen the mask, if necessary, to have one bit per element of the output
4937 // vector.
4938 // We want the extra bits to have '1's, so that the CreateSelect will
4939 // select the values from AShadow instead of WriteThroughShadow ("maskless"
4940 // versions of the intrinsics are sometimes implemented using an all-1's
4941 // mask and an undefined value for WriteThroughShadow). We accomplish this
4942 // by using bitwise NOT before and after the ZExt.
4943 if (ANumElements != OutputNumElements) {
4944 Mask = IRB.CreateNot(Mask);
4945 Mask = IRB.CreateZExt(Mask, Type::getIntNTy(*MS.C, OutputNumElements),
4946 "_ms_widen_mask");
4947 Mask = IRB.CreateNot(Mask);
4948 }
4949 Mask = IRB.CreateBitCast(
4950 Mask, FixedVectorType::get(IRB.getInt1Ty(), OutputNumElements));
4951
4952 Value *AShadow = getShadow(A);
4953
4954 // The return type might have more elements than the input.
4955 // Temporarily shrink the return type's number of elements.
4956 VectorType *ShadowType = maybeShrinkVectorShadowType(A, I);
4957
4958 // PMOV truncates; PMOVS/PMOVUS uses signed/unsigned saturation.
4959 // This handler treats them all as truncation, which leads to some rare
4960 // false positives in the cases where the truncated bytes could
4961 // unambiguously saturate the value e.g., if A = ??????10 ????????
4962 // (big-endian), the unsigned saturated byte conversion is 11111111 i.e.,
4963 // fully defined, but the truncated byte is ????????.
4964 //
4965 // TODO: use GetMinMaxUnsigned() to handle saturation precisely.
4966 AShadow = IRB.CreateTrunc(AShadow, ShadowType, "_ms_trunc_shadow");
4967 AShadow = maybeExtendVectorShadowWithZeros(AShadow, I);
4968
4969 Value *WriteThroughShadow = getShadow(WriteThrough);
4970
4971 Value *Shadow = IRB.CreateSelect(Mask, AShadow, WriteThroughShadow);
4972 setShadow(&I, Shadow);
4973 setOriginForNaryOp(I);
4974 }
4975
4976 // Handle llvm.x86.avx512.* instructions that take a vector of floating-point
4977 // values and perform an operation whose shadow propagation should be handled
4978 // as all-or-nothing [*], with masking provided by a vector and a mask
4979 // supplied as an integer.
4980 //
4981 // [*] if all bits of a vector element are initialized, the output is fully
4982 // initialized; otherwise, the output is fully uninitialized
4983 //
4984 // e.g., <16 x float> @llvm.x86.avx512.rsqrt14.ps.512
4985 // (<16 x float>, <16 x float>, i16)
4986 // A WriteThru Mask
4987 //
4988 // <2 x double> @llvm.x86.avx512.rcp14.pd.128
4989 // (<2 x double>, <2 x double>, i8)
4990 //
4991 // <8 x double> @llvm.x86.avx512.mask.rndscale.pd.512
4992 // (<8 x double>, i32, <8 x double>, i8, i32)
4993 // A Imm WriteThru Mask Rounding
4994 //
4995 // All operands other than A and WriteThru (e.g., Mask, Imm, Rounding) must
4996 // be fully initialized.
4997 //
4998 // Dst[i] = Mask[i] ? some_op(A[i]) : WriteThru[i]
4999 // Dst_shadow[i] = Mask[i] ? all_or_nothing(A_shadow[i]) : WriteThru_shadow[i]
5000 void handleAVX512VectorGenericMaskedFP(IntrinsicInst &I, unsigned AIndex,
5001 unsigned WriteThruIndex,
5002 unsigned MaskIndex) {
5003 IRBuilder<> IRB(&I);
5004
5005 unsigned NumArgs = I.arg_size();
5006 assert(AIndex < NumArgs);
5007 assert(WriteThruIndex < NumArgs);
5008 assert(MaskIndex < NumArgs);
5009 assert(AIndex != WriteThruIndex);
5010 assert(AIndex != MaskIndex);
5011 assert(WriteThruIndex != MaskIndex);
5012
5013 Value *A = I.getOperand(AIndex);
5014 Value *WriteThru = I.getOperand(WriteThruIndex);
5015 Value *Mask = I.getOperand(MaskIndex);
5016
5017 assert(isFixedFPVector(A));
5018 assert(isFixedFPVector(WriteThru));
5019
5020 [[maybe_unused]] unsigned ANumElements =
5021 cast<FixedVectorType>(A->getType())->getNumElements();
5022 unsigned OutputNumElements =
5023 cast<FixedVectorType>(WriteThru->getType())->getNumElements();
5024 assert(ANumElements == OutputNumElements);
5025
5026 for (unsigned i = 0; i < NumArgs; ++i) {
5027 if (i != AIndex && i != WriteThruIndex) {
5028 // Imm, Mask, Rounding etc. are "control" data, hence we require that
5029 // they be fully initialized.
5030 assert(I.getOperand(i)->getType()->isIntegerTy());
5031 insertCheckShadowOf(I.getOperand(i), &I);
5032 }
5033 }
5034
5035 // The mask has 1 bit per element of A, but a minimum of 8 bits.
5036 if (Mask->getType()->getScalarSizeInBits() == 8 && ANumElements < 8)
5037 Mask = IRB.CreateTrunc(Mask, Type::getIntNTy(*MS.C, ANumElements));
5038 assert(Mask->getType()->getScalarSizeInBits() == ANumElements);
5039
5040 assert(I.getType() == WriteThru->getType());
5041
5042 Mask = IRB.CreateBitCast(
5043 Mask, FixedVectorType::get(IRB.getInt1Ty(), OutputNumElements));
5044
5045 Value *AShadow = getShadow(A);
5046
5047 // All-or-nothing shadow
5048 AShadow = IRB.CreateSExt(IRB.CreateICmpNE(AShadow, getCleanShadow(AShadow)),
5049 AShadow->getType());
5050
5051 Value *WriteThruShadow = getShadow(WriteThru);
5052
5053 Value *Shadow = IRB.CreateSelect(Mask, AShadow, WriteThruShadow);
5054 setShadow(&I, Shadow);
5055
5056 setOriginForNaryOp(I);
5057 }
5058
5059 // For sh.* compiler intrinsics:
5060 // llvm.x86.avx512fp16.mask.{add/sub/mul/div/max/min}.sh.round
5061 // (<8 x half>, <8 x half>, <8 x half>, i8, i32)
5062 // A B WriteThru Mask RoundingMode
5063 //
5064 // DstShadow[0] = Mask[0] ? (AShadow[0] | BShadow[0]) : WriteThruShadow[0]
5065 // DstShadow[1..7] = AShadow[1..7]
5066 void visitGenericScalarHalfwordInst(IntrinsicInst &I) {
5067 IRBuilder<> IRB(&I);
5068
5069 assert(I.arg_size() == 5);
5070 Value *A = I.getOperand(0);
5071 Value *B = I.getOperand(1);
5072 Value *WriteThrough = I.getOperand(2);
5073 Value *Mask = I.getOperand(3);
5074 Value *RoundingMode = I.getOperand(4);
5075
5076 // Technically, we could probably just check whether the LSB is
5077 // initialized, but intuitively it feels like a partly uninitialized mask
5078 // is unintended, and we should warn the user immediately.
5079 insertCheckShadowOf(Mask, &I);
5080 insertCheckShadowOf(RoundingMode, &I);
5081
5082 assert(isa<FixedVectorType>(A->getType()));
5083 unsigned NumElements =
5084 cast<FixedVectorType>(A->getType())->getNumElements();
5085 assert(NumElements == 8);
5086 assert(A->getType() == B->getType());
5087 assert(B->getType() == WriteThrough->getType());
5088 assert(Mask->getType()->getPrimitiveSizeInBits() == NumElements);
5089 assert(RoundingMode->getType()->isIntegerTy());
5090
5091 Value *ALowerShadow = extractLowerShadow(IRB, A);
5092 Value *BLowerShadow = extractLowerShadow(IRB, B);
5093
5094 Value *ABLowerShadow = IRB.CreateOr(ALowerShadow, BLowerShadow);
5095
5096 Value *WriteThroughLowerShadow = extractLowerShadow(IRB, WriteThrough);
5097
5098 Mask = IRB.CreateBitCast(
5099 Mask, FixedVectorType::get(IRB.getInt1Ty(), NumElements));
5100 Value *MaskLower =
5101 IRB.CreateExtractElement(Mask, ConstantInt::get(IRB.getInt32Ty(), 0));
5102
5103 Value *AShadow = getShadow(A);
5104 Value *DstLowerShadow =
5105 IRB.CreateSelect(MaskLower, ABLowerShadow, WriteThroughLowerShadow);
5106 Value *DstShadow = IRB.CreateInsertElement(
5107 AShadow, DstLowerShadow, ConstantInt::get(IRB.getInt32Ty(), 0),
5108 "_msprop");
5109
5110 setShadow(&I, DstShadow);
5111 setOriginForNaryOp(I);
5112 }
5113
5114 // Approximately handle AVX Galois Field Affine Transformation
5115 //
5116 // e.g.,
5117 // <16 x i8> @llvm.x86.vgf2p8affineqb.128(<16 x i8>, <16 x i8>, i8)
5118 // <32 x i8> @llvm.x86.vgf2p8affineqb.256(<32 x i8>, <32 x i8>, i8)
5119 // <64 x i8> @llvm.x86.vgf2p8affineqb.512(<64 x i8>, <64 x i8>, i8)
5120 // Out A x b
5121 // where A and x are packed matrices, b is a vector,
5122 // Out = A * x + b in GF(2)
5123 //
5124 // Multiplication in GF(2) is equivalent to bitwise AND. However, the matrix
5125 // computation also includes a parity calculation.
5126 //
5127 // For the bitwise AND of bits V1 and V2, the exact shadow is:
5128 // Out_Shadow = (V1_Shadow & V2_Shadow)
5129 // | (V1 & V2_Shadow)
5130 // | (V1_Shadow & V2 )
5131 //
5132 // We approximate the shadow of gf2p8affineqb using:
5133 // Out_Shadow = gf2p8affineqb(x_Shadow, A_shadow, 0)
5134 // | gf2p8affineqb(x, A_shadow, 0)
5135 // | gf2p8affineqb(x_Shadow, A, 0)
5136 // | set1_epi8(b_Shadow)
5137 //
5138 // This approximation has false negatives: if an intermediate dot-product
5139 // contains an even number of 1's, the parity is 0.
5140 // It has no false positives.
5141 void handleAVXGF2P8Affine(IntrinsicInst &I) {
5142 IRBuilder<> IRB(&I);
5143
5144 assert(I.arg_size() == 3);
5145 Value *A = I.getOperand(0);
5146 Value *X = I.getOperand(1);
5147 Value *B = I.getOperand(2);
5148
5149 assert(isFixedIntVector(A));
5150 assert(cast<VectorType>(A->getType())
5151 ->getElementType()
5152 ->getScalarSizeInBits() == 8);
5153
5154 assert(A->getType() == X->getType());
5155
5156 assert(B->getType()->isIntegerTy());
5157 assert(B->getType()->getScalarSizeInBits() == 8);
5158
5159 assert(I.getType() == A->getType());
5160
5161 Value *AShadow = getShadow(A);
5162 Value *XShadow = getShadow(X);
5163 Value *BZeroShadow = getCleanShadow(B);
5164
5165 CallInst *AShadowXShadow = IRB.CreateIntrinsic(
5166 I.getType(), I.getIntrinsicID(), {XShadow, AShadow, BZeroShadow});
5167 CallInst *AShadowX = IRB.CreateIntrinsic(I.getType(), I.getIntrinsicID(),
5168 {X, AShadow, BZeroShadow});
5169 CallInst *XShadowA = IRB.CreateIntrinsic(I.getType(), I.getIntrinsicID(),
5170 {XShadow, A, BZeroShadow});
5171
5172 unsigned NumElements = cast<FixedVectorType>(I.getType())->getNumElements();
5173 Value *BShadow = getShadow(B);
5174 Value *BBroadcastShadow = getCleanShadow(AShadow);
5175 // There is no LLVM IR intrinsic for _mm512_set1_epi8.
5176 // This loop generates a lot of LLVM IR, which we expect that CodeGen will
5177 // lower appropriately (e.g., VPBROADCASTB).
5178 // Besides, b is often a constant, in which case it is fully initialized.
5179 for (unsigned i = 0; i < NumElements; i++)
5180 BBroadcastShadow = IRB.CreateInsertElement(BBroadcastShadow, BShadow, i);
5181
5182 setShadow(&I, IRB.CreateOr(
5183 {AShadowXShadow, AShadowX, XShadowA, BBroadcastShadow}));
5184 setOriginForNaryOp(I);
5185 }
5186
5187 // Handle Arm NEON vector load intrinsics (vld*).
5188 //
5189 // The WithLane instructions (ld[234]lane) are similar to:
5190 // call {<4 x i32>, <4 x i32>, <4 x i32>}
5191 // @llvm.aarch64.neon.ld3lane.v4i32.p0
5192 // (<4 x i32> %L1, <4 x i32> %L2, <4 x i32> %L3, i64 %lane, ptr
5193 // %A)
5194 //
5195 // The non-WithLane instructions (ld[234], ld1x[234], ld[234]r) are similar
5196 // to:
5197 // call {<8 x i8>, <8 x i8>} @llvm.aarch64.neon.ld2.v8i8.p0(ptr %A)
5198 void handleNEONVectorLoad(IntrinsicInst &I, bool WithLane) {
5199 unsigned int numArgs = I.arg_size();
5200
5201 // Return type is a struct of vectors of integers or floating-point
5202 assert(I.getType()->isStructTy());
5203 [[maybe_unused]] StructType *RetTy = cast<StructType>(I.getType());
5204 assert(RetTy->getNumElements() > 0);
5206 RetTy->getElementType(0)->isFPOrFPVectorTy());
5207 for (unsigned int i = 0; i < RetTy->getNumElements(); i++)
5208 assert(RetTy->getElementType(i) == RetTy->getElementType(0));
5209
5210 if (WithLane) {
5211 // 2, 3 or 4 vectors, plus lane number, plus input pointer
5212 assert(4 <= numArgs && numArgs <= 6);
5213
5214 // Return type is a struct of the input vectors
5215 assert(RetTy->getNumElements() + 2 == numArgs);
5216 for (unsigned int i = 0; i < RetTy->getNumElements(); i++)
5217 assert(I.getArgOperand(i)->getType() == RetTy->getElementType(0));
5218 } else {
5219 assert(numArgs == 1);
5220 }
5221
5222 IRBuilder<> IRB(&I);
5223
5224 SmallVector<Value *, 6> ShadowArgs;
5225 if (WithLane) {
5226 for (unsigned int i = 0; i < numArgs - 2; i++)
5227 ShadowArgs.push_back(getShadow(I.getArgOperand(i)));
5228
5229 // Lane number, passed verbatim
5230 Value *LaneNumber = I.getArgOperand(numArgs - 2);
5231 ShadowArgs.push_back(LaneNumber);
5232
5233 // TODO: blend shadow of lane number into output shadow?
5234 insertCheckShadowOf(LaneNumber, &I);
5235 }
5236
5237 Value *Src = I.getArgOperand(numArgs - 1);
5238 assert(Src->getType()->isPointerTy() && "Source is not a pointer!");
5239
5240 Type *SrcShadowTy = getShadowTy(Src);
5241 auto [SrcShadowPtr, SrcOriginPtr] =
5242 getShadowOriginPtr(Src, IRB, SrcShadowTy, Align(1), /*isStore*/ false);
5243 ShadowArgs.push_back(SrcShadowPtr);
5244
5245 // The NEON vector load instructions handled by this function all have
5246 // integer variants. It is easier to use those rather than trying to cast
5247 // a struct of vectors of floats into a struct of vectors of integers.
5248 CallInst *CI =
5249 IRB.CreateIntrinsic(getShadowTy(&I), I.getIntrinsicID(), ShadowArgs);
5250 setShadow(&I, CI);
5251
5252 if (!MS.TrackOrigins)
5253 return;
5254
5255 Value *PtrSrcOrigin = IRB.CreateLoad(MS.OriginTy, SrcOriginPtr);
5256 setOrigin(&I, PtrSrcOrigin);
5257 }
5258
5259 /// Handle Arm NEON vector store intrinsics (vst{2,3,4}, vst1x_{2,3,4},
5260 /// and vst{2,3,4}lane).
5261 ///
5262 /// Arm NEON vector store intrinsics have the output address (pointer) as the
5263 /// last argument, with the initial arguments being the inputs (and lane
5264 /// number for vst{2,3,4}lane). They return void.
5265 ///
5266 /// - st4 interleaves the output e.g., st4 (inA, inB, inC, inD, outP) writes
5267 /// abcdabcdabcdabcd... into *outP
5268 /// - st1_x4 is non-interleaved e.g., st1_x4 (inA, inB, inC, inD, outP)
5269 /// writes aaaa...bbbb...cccc...dddd... into *outP
5270 /// - st4lane has arguments of (inA, inB, inC, inD, lane, outP)
5271 /// These instructions can all be instrumented with essentially the same
5272 /// MSan logic, simply by applying the corresponding intrinsic to the shadow.
5273 void handleNEONVectorStoreIntrinsic(IntrinsicInst &I, bool useLane) {
5274 IRBuilder<> IRB(&I);
5275
5276 // Don't use getNumOperands() because it includes the callee
5277 int numArgOperands = I.arg_size();
5278
5279 // The last arg operand is the output (pointer)
5280 assert(numArgOperands >= 1);
5281 Value *Addr = I.getArgOperand(numArgOperands - 1);
5282 assert(Addr->getType()->isPointerTy());
5283 int skipTrailingOperands = 1;
5284
5286 insertCheckShadowOf(Addr, &I);
5287
5288 // Second-last operand is the lane number (for vst{2,3,4}lane)
5289 if (useLane) {
5290 skipTrailingOperands++;
5291 assert(numArgOperands >= static_cast<int>(skipTrailingOperands));
5293 I.getArgOperand(numArgOperands - skipTrailingOperands)->getType()));
5294 }
5295
5296 SmallVector<Value *, 8> ShadowArgs;
5297 // All the initial operands are the inputs
5298 for (int i = 0; i < numArgOperands - skipTrailingOperands; i++) {
5299 assert(isa<FixedVectorType>(I.getArgOperand(i)->getType()));
5300 Value *Shadow = getShadow(&I, i);
5301 ShadowArgs.append(1, Shadow);
5302 }
5303
5304 // MSan's GetShadowTy assumes the LHS is the type we want the shadow for
5305 // e.g., for:
5306 // [[TMP5:%.*]] = bitcast <16 x i8> [[TMP2]] to i128
5307 // we know the type of the output (and its shadow) is <16 x i8>.
5308 //
5309 // Arm NEON VST is unusual because the last argument is the output address:
5310 // define void @st2_16b(<16 x i8> %A, <16 x i8> %B, ptr %P) {
5311 // call void @llvm.aarch64.neon.st2.v16i8.p0
5312 // (<16 x i8> [[A]], <16 x i8> [[B]], ptr [[P]])
5313 // and we have no type information about P's operand. We must manually
5314 // compute the type (<16 x i8> x 2).
5315 FixedVectorType *OutputVectorTy = FixedVectorType::get(
5316 cast<FixedVectorType>(I.getArgOperand(0)->getType())->getElementType(),
5317 cast<FixedVectorType>(I.getArgOperand(0)->getType())->getNumElements() *
5318 (numArgOperands - skipTrailingOperands));
5319 Type *OutputShadowTy = getShadowTy(OutputVectorTy);
5320
5321 if (useLane)
5322 ShadowArgs.append(1,
5323 I.getArgOperand(numArgOperands - skipTrailingOperands));
5324
5325 Value *OutputShadowPtr, *OutputOriginPtr;
5326 // AArch64 NEON does not need alignment (unless OS requires it)
5327 std::tie(OutputShadowPtr, OutputOriginPtr) = getShadowOriginPtr(
5328 Addr, IRB, OutputShadowTy, Align(1), /*isStore*/ true);
5329 ShadowArgs.append(1, OutputShadowPtr);
5330
5331 CallInst *CI =
5332 IRB.CreateIntrinsic(IRB.getVoidTy(), I.getIntrinsicID(), ShadowArgs);
5333 setShadow(&I, CI);
5334
5335 if (MS.TrackOrigins) {
5336 // TODO: if we modelled the vst* instruction more precisely, we could
5337 // more accurately track the origins (e.g., if both inputs are
5338 // uninitialized for vst2, we currently blame the second input, even
5339 // though part of the output depends only on the first input).
5340 //
5341 // This is particularly imprecise for vst{2,3,4}lane, since only one
5342 // lane of each input is actually copied to the output.
5343 OriginCombiner OC(this, IRB);
5344 for (int i = 0; i < numArgOperands - skipTrailingOperands; i++)
5345 OC.Add(I.getArgOperand(i));
5346
5347 const DataLayout &DL = F.getDataLayout();
5348 OC.DoneAndStoreOrigin(DL.getTypeStoreSize(OutputVectorTy),
5349 OutputOriginPtr);
5350 }
5351 }
5352
5353 /// Handle intrinsics by applying the intrinsic to the shadows.
5354 ///
5355 /// The trailing arguments are passed verbatim to the intrinsic, though any
5356 /// uninitialized trailing arguments can also taint the shadow e.g., for an
5357 /// intrinsic with one trailing verbatim argument:
5358 /// out = intrinsic(var1, var2, opType)
5359 /// we compute:
5360 /// shadow[out] =
5361 /// intrinsic(shadow[var1], shadow[var2], opType) | shadow[opType]
5362 ///
5363 /// Typically, shadowIntrinsicID will be specified by the caller to be
5364 /// I.getIntrinsicID(), but the caller can choose to replace it with another
5365 /// intrinsic of the same type.
5366 ///
5367 /// CAUTION: this assumes that the intrinsic will handle arbitrary
5368 /// bit-patterns (for example, if the intrinsic accepts floats for
5369 /// var1, we require that it doesn't care if inputs are NaNs).
5370 ///
5371 /// For example, this can be applied to the Arm NEON vector table intrinsics
5372 /// (tbl{1,2,3,4}).
5373 ///
5374 /// The origin is approximated using setOriginForNaryOp.
5375 void handleIntrinsicByApplyingToShadow(IntrinsicInst &I,
5376 Intrinsic::ID shadowIntrinsicID,
5377 unsigned int trailingVerbatimArgs) {
5378 IRBuilder<> IRB(&I);
5379
5380 assert(trailingVerbatimArgs < I.arg_size());
5381
5382 SmallVector<Value *, 8> ShadowArgs;
5383 // Don't use getNumOperands() because it includes the callee
5384 for (unsigned int i = 0; i < I.arg_size() - trailingVerbatimArgs; i++) {
5385 Value *Shadow = getShadow(&I, i);
5386
5387 // Shadows are integer-ish types but some intrinsics require a
5388 // different (e.g., floating-point) type.
5389 ShadowArgs.push_back(
5390 IRB.CreateBitCast(Shadow, I.getArgOperand(i)->getType()));
5391 }
5392
5393 for (unsigned int i = I.arg_size() - trailingVerbatimArgs; i < I.arg_size();
5394 i++) {
5395 Value *Arg = I.getArgOperand(i);
5396 ShadowArgs.push_back(Arg);
5397 }
5398
5399 CallInst *CI =
5400 IRB.CreateIntrinsic(I.getType(), shadowIntrinsicID, ShadowArgs);
5401 Value *CombinedShadow = CI;
5402
5403 // Combine the computed shadow with the shadow of trailing args
5404 for (unsigned int i = I.arg_size() - trailingVerbatimArgs; i < I.arg_size();
5405 i++) {
5406 Value *Shadow =
5407 CreateShadowCast(IRB, getShadow(&I, i), CombinedShadow->getType());
5408 CombinedShadow = IRB.CreateOr(Shadow, CombinedShadow, "_msprop");
5409 }
5410
5411 setShadow(&I, IRB.CreateBitCast(CombinedShadow, getShadowTy(&I)));
5412
5413 setOriginForNaryOp(I);
5414 }
5415
5416 // Approximation only
5417 //
5418 // e.g., <16 x i8> @llvm.aarch64.neon.pmull64(i64, i64)
5419 void handleNEONVectorMultiplyIntrinsic(IntrinsicInst &I) {
5420 assert(I.arg_size() == 2);
5421
5422 handleShadowOr(I);
5423 }
5424
5425 bool maybeHandleCrossPlatformIntrinsic(IntrinsicInst &I) {
5426 switch (I.getIntrinsicID()) {
5427 case Intrinsic::uadd_with_overflow:
5428 case Intrinsic::sadd_with_overflow:
5429 case Intrinsic::usub_with_overflow:
5430 case Intrinsic::ssub_with_overflow:
5431 case Intrinsic::umul_with_overflow:
5432 case Intrinsic::smul_with_overflow:
5433 handleArithmeticWithOverflow(I);
5434 break;
5435 case Intrinsic::abs:
5436 handleAbsIntrinsic(I);
5437 break;
5438 case Intrinsic::bitreverse:
5439 handleIntrinsicByApplyingToShadow(I, I.getIntrinsicID(),
5440 /*trailingVerbatimArgs*/ 0);
5441 break;
5442 case Intrinsic::is_fpclass:
5443 handleIsFpClass(I);
5444 break;
5445 case Intrinsic::lifetime_start:
5446 handleLifetimeStart(I);
5447 break;
5448 case Intrinsic::launder_invariant_group:
5449 case Intrinsic::strip_invariant_group:
5450 handleInvariantGroup(I);
5451 break;
5452 case Intrinsic::bswap:
5453 handleBswap(I);
5454 break;
5455 case Intrinsic::ctlz:
5456 case Intrinsic::cttz:
5457 handleCountLeadingTrailingZeros(I);
5458 break;
5459 case Intrinsic::masked_compressstore:
5460 handleMaskedCompressStore(I);
5461 break;
5462 case Intrinsic::masked_expandload:
5463 handleMaskedExpandLoad(I);
5464 break;
5465 case Intrinsic::masked_gather:
5466 handleMaskedGather(I);
5467 break;
5468 case Intrinsic::masked_scatter:
5469 handleMaskedScatter(I);
5470 break;
5471 case Intrinsic::masked_store:
5472 handleMaskedStore(I);
5473 break;
5474 case Intrinsic::masked_load:
5475 handleMaskedLoad(I);
5476 break;
5477 case Intrinsic::vector_reduce_and:
5478 handleVectorReduceAndIntrinsic(I);
5479 break;
5480 case Intrinsic::vector_reduce_or:
5481 handleVectorReduceOrIntrinsic(I);
5482 break;
5483
5484 case Intrinsic::vector_reduce_add:
5485 case Intrinsic::vector_reduce_xor:
5486 case Intrinsic::vector_reduce_mul:
5487 // Signed/Unsigned Min/Max
5488 // TODO: handling similarly to AND/OR may be more precise.
5489 case Intrinsic::vector_reduce_smax:
5490 case Intrinsic::vector_reduce_smin:
5491 case Intrinsic::vector_reduce_umax:
5492 case Intrinsic::vector_reduce_umin:
5493 // TODO: this has no false positives, but arguably we should check that all
5494 // the bits are initialized.
5495 case Intrinsic::vector_reduce_fmax:
5496 case Intrinsic::vector_reduce_fmin:
5497 handleVectorReduceIntrinsic(I, /*AllowShadowCast=*/false);
5498 break;
5499
5500 case Intrinsic::vector_reduce_fadd:
5501 case Intrinsic::vector_reduce_fmul:
5502 handleVectorReduceWithStarterIntrinsic(I);
5503 break;
5504
5505 case Intrinsic::scmp:
5506 case Intrinsic::ucmp: {
5507 handleShadowOr(I);
5508 break;
5509 }
5510
5511 case Intrinsic::fshl:
5512 case Intrinsic::fshr:
5513 handleFunnelShift(I);
5514 break;
5515
5516 case Intrinsic::is_constant:
5517 // The result of llvm.is.constant() is always defined.
5518 setShadow(&I, getCleanShadow(&I));
5519 setOrigin(&I, getCleanOrigin());
5520 break;
5521
5522 default:
5523 return false;
5524 }
5525
5526 return true;
5527 }
5528
5529 bool maybeHandleX86SIMDIntrinsic(IntrinsicInst &I) {
5530 switch (I.getIntrinsicID()) {
5531 case Intrinsic::x86_sse_stmxcsr:
5532 handleStmxcsr(I);
5533 break;
5534 case Intrinsic::x86_sse_ldmxcsr:
5535 handleLdmxcsr(I);
5536 break;
5537
5538 // Convert Scalar Double Precision Floating-Point Value
5539 // to Unsigned Doubleword Integer
5540 // etc.
5541 case Intrinsic::x86_avx512_vcvtsd2usi64:
5542 case Intrinsic::x86_avx512_vcvtsd2usi32:
5543 case Intrinsic::x86_avx512_vcvtss2usi64:
5544 case Intrinsic::x86_avx512_vcvtss2usi32:
5545 case Intrinsic::x86_avx512_cvttss2usi64:
5546 case Intrinsic::x86_avx512_cvttss2usi:
5547 case Intrinsic::x86_avx512_cvttsd2usi64:
5548 case Intrinsic::x86_avx512_cvttsd2usi:
5549 case Intrinsic::x86_avx512_cvtusi2ss:
5550 case Intrinsic::x86_avx512_cvtusi642sd:
5551 case Intrinsic::x86_avx512_cvtusi642ss:
5552 handleSSEVectorConvertIntrinsic(I, 1, true);
5553 break;
5554 case Intrinsic::x86_sse2_cvtsd2si64:
5555 case Intrinsic::x86_sse2_cvtsd2si:
5556 case Intrinsic::x86_sse2_cvtsd2ss:
5557 case Intrinsic::x86_sse2_cvttsd2si64:
5558 case Intrinsic::x86_sse2_cvttsd2si:
5559 case Intrinsic::x86_sse_cvtss2si64:
5560 case Intrinsic::x86_sse_cvtss2si:
5561 case Intrinsic::x86_sse_cvttss2si64:
5562 case Intrinsic::x86_sse_cvttss2si:
5563 handleSSEVectorConvertIntrinsic(I, 1);
5564 break;
5565 case Intrinsic::x86_sse_cvtps2pi:
5566 case Intrinsic::x86_sse_cvttps2pi:
5567 handleSSEVectorConvertIntrinsic(I, 2);
5568 break;
5569
5570 // TODO:
5571 // <1 x i64> @llvm.x86.sse.cvtpd2pi(<2 x double>)
5572 // <2 x double> @llvm.x86.sse.cvtpi2pd(<1 x i64>)
5573 // <4 x float> @llvm.x86.sse.cvtpi2ps(<4 x float>, <1 x i64>)
5574
5575 case Intrinsic::x86_vcvtps2ph_128:
5576 case Intrinsic::x86_vcvtps2ph_256: {
5577 handleSSEVectorConvertIntrinsicByProp(I, /*HasRoundingMode=*/true);
5578 break;
5579 }
5580
5581 // Convert Packed Single Precision Floating-Point Values
5582 // to Packed Signed Doubleword Integer Values
5583 //
5584 // <16 x i32> @llvm.x86.avx512.mask.cvtps2dq.512
5585 // (<16 x float>, <16 x i32>, i16, i32)
5586 case Intrinsic::x86_avx512_mask_cvtps2dq_512:
5587 handleAVX512VectorConvertFPToInt(I, /*LastMask=*/false);
5588 break;
5589
5590 // Convert Packed Double Precision Floating-Point Values
5591 // to Packed Single Precision Floating-Point Values
5592 case Intrinsic::x86_sse2_cvtpd2ps:
5593 case Intrinsic::x86_sse2_cvtps2dq:
5594 case Intrinsic::x86_sse2_cvtpd2dq:
5595 case Intrinsic::x86_sse2_cvttps2dq:
5596 case Intrinsic::x86_sse2_cvttpd2dq:
5597 case Intrinsic::x86_avx_cvt_pd2_ps_256:
5598 case Intrinsic::x86_avx_cvt_ps2dq_256:
5599 case Intrinsic::x86_avx_cvt_pd2dq_256:
5600 case Intrinsic::x86_avx_cvtt_ps2dq_256:
5601 case Intrinsic::x86_avx_cvtt_pd2dq_256: {
5602 handleSSEVectorConvertIntrinsicByProp(I, /*HasRoundingMode=*/false);
5603 break;
5604 }
5605
5606 // Convert Single-Precision FP Value to 16-bit FP Value
5607 // <16 x i16> @llvm.x86.avx512.mask.vcvtps2ph.512
5608 // (<16 x float>, i32, <16 x i16>, i16)
5609 // <8 x i16> @llvm.x86.avx512.mask.vcvtps2ph.128
5610 // (<4 x float>, i32, <8 x i16>, i8)
5611 // <8 x i16> @llvm.x86.avx512.mask.vcvtps2ph.256
5612 // (<8 x float>, i32, <8 x i16>, i8)
5613 case Intrinsic::x86_avx512_mask_vcvtps2ph_512:
5614 case Intrinsic::x86_avx512_mask_vcvtps2ph_256:
5615 case Intrinsic::x86_avx512_mask_vcvtps2ph_128:
5616 handleAVX512VectorConvertFPToInt(I, /*LastMask=*/true);
5617 break;
5618
5619 // Shift Packed Data (Left Logical, Right Arithmetic, Right Logical)
5620 case Intrinsic::x86_avx512_psll_w_512:
5621 case Intrinsic::x86_avx512_psll_d_512:
5622 case Intrinsic::x86_avx512_psll_q_512:
5623 case Intrinsic::x86_avx512_pslli_w_512:
5624 case Intrinsic::x86_avx512_pslli_d_512:
5625 case Intrinsic::x86_avx512_pslli_q_512:
5626 case Intrinsic::x86_avx512_psrl_w_512:
5627 case Intrinsic::x86_avx512_psrl_d_512:
5628 case Intrinsic::x86_avx512_psrl_q_512:
5629 case Intrinsic::x86_avx512_psra_w_512:
5630 case Intrinsic::x86_avx512_psra_d_512:
5631 case Intrinsic::x86_avx512_psra_q_512:
5632 case Intrinsic::x86_avx512_psrli_w_512:
5633 case Intrinsic::x86_avx512_psrli_d_512:
5634 case Intrinsic::x86_avx512_psrli_q_512:
5635 case Intrinsic::x86_avx512_psrai_w_512:
5636 case Intrinsic::x86_avx512_psrai_d_512:
5637 case Intrinsic::x86_avx512_psrai_q_512:
5638 case Intrinsic::x86_avx512_psra_q_256:
5639 case Intrinsic::x86_avx512_psra_q_128:
5640 case Intrinsic::x86_avx512_psrai_q_256:
5641 case Intrinsic::x86_avx512_psrai_q_128:
5642 case Intrinsic::x86_avx2_psll_w:
5643 case Intrinsic::x86_avx2_psll_d:
5644 case Intrinsic::x86_avx2_psll_q:
5645 case Intrinsic::x86_avx2_pslli_w:
5646 case Intrinsic::x86_avx2_pslli_d:
5647 case Intrinsic::x86_avx2_pslli_q:
5648 case Intrinsic::x86_avx2_psrl_w:
5649 case Intrinsic::x86_avx2_psrl_d:
5650 case Intrinsic::x86_avx2_psrl_q:
5651 case Intrinsic::x86_avx2_psra_w:
5652 case Intrinsic::x86_avx2_psra_d:
5653 case Intrinsic::x86_avx2_psrli_w:
5654 case Intrinsic::x86_avx2_psrli_d:
5655 case Intrinsic::x86_avx2_psrli_q:
5656 case Intrinsic::x86_avx2_psrai_w:
5657 case Intrinsic::x86_avx2_psrai_d:
5658 case Intrinsic::x86_sse2_psll_w:
5659 case Intrinsic::x86_sse2_psll_d:
5660 case Intrinsic::x86_sse2_psll_q:
5661 case Intrinsic::x86_sse2_pslli_w:
5662 case Intrinsic::x86_sse2_pslli_d:
5663 case Intrinsic::x86_sse2_pslli_q:
5664 case Intrinsic::x86_sse2_psrl_w:
5665 case Intrinsic::x86_sse2_psrl_d:
5666 case Intrinsic::x86_sse2_psrl_q:
5667 case Intrinsic::x86_sse2_psra_w:
5668 case Intrinsic::x86_sse2_psra_d:
5669 case Intrinsic::x86_sse2_psrli_w:
5670 case Intrinsic::x86_sse2_psrli_d:
5671 case Intrinsic::x86_sse2_psrli_q:
5672 case Intrinsic::x86_sse2_psrai_w:
5673 case Intrinsic::x86_sse2_psrai_d:
5674 case Intrinsic::x86_mmx_psll_w:
5675 case Intrinsic::x86_mmx_psll_d:
5676 case Intrinsic::x86_mmx_psll_q:
5677 case Intrinsic::x86_mmx_pslli_w:
5678 case Intrinsic::x86_mmx_pslli_d:
5679 case Intrinsic::x86_mmx_pslli_q:
5680 case Intrinsic::x86_mmx_psrl_w:
5681 case Intrinsic::x86_mmx_psrl_d:
5682 case Intrinsic::x86_mmx_psrl_q:
5683 case Intrinsic::x86_mmx_psra_w:
5684 case Intrinsic::x86_mmx_psra_d:
5685 case Intrinsic::x86_mmx_psrli_w:
5686 case Intrinsic::x86_mmx_psrli_d:
5687 case Intrinsic::x86_mmx_psrli_q:
5688 case Intrinsic::x86_mmx_psrai_w:
5689 case Intrinsic::x86_mmx_psrai_d:
5690 handleVectorShiftIntrinsic(I, /* Variable */ false);
5691 break;
5692 case Intrinsic::x86_avx2_psllv_d:
5693 case Intrinsic::x86_avx2_psllv_d_256:
5694 case Intrinsic::x86_avx512_psllv_d_512:
5695 case Intrinsic::x86_avx2_psllv_q:
5696 case Intrinsic::x86_avx2_psllv_q_256:
5697 case Intrinsic::x86_avx512_psllv_q_512:
5698 case Intrinsic::x86_avx2_psrlv_d:
5699 case Intrinsic::x86_avx2_psrlv_d_256:
5700 case Intrinsic::x86_avx512_psrlv_d_512:
5701 case Intrinsic::x86_avx2_psrlv_q:
5702 case Intrinsic::x86_avx2_psrlv_q_256:
5703 case Intrinsic::x86_avx512_psrlv_q_512:
5704 case Intrinsic::x86_avx2_psrav_d:
5705 case Intrinsic::x86_avx2_psrav_d_256:
5706 case Intrinsic::x86_avx512_psrav_d_512:
5707 case Intrinsic::x86_avx512_psrav_q_128:
5708 case Intrinsic::x86_avx512_psrav_q_256:
5709 case Intrinsic::x86_avx512_psrav_q_512:
5710 handleVectorShiftIntrinsic(I, /* Variable */ true);
5711 break;
5712
5713 // Pack with Signed/Unsigned Saturation
5714 case Intrinsic::x86_sse2_packsswb_128:
5715 case Intrinsic::x86_sse2_packssdw_128:
5716 case Intrinsic::x86_sse2_packuswb_128:
5717 case Intrinsic::x86_sse41_packusdw:
5718 case Intrinsic::x86_avx2_packsswb:
5719 case Intrinsic::x86_avx2_packssdw:
5720 case Intrinsic::x86_avx2_packuswb:
5721 case Intrinsic::x86_avx2_packusdw:
5722 // e.g., <64 x i8> @llvm.x86.avx512.packsswb.512
5723 // (<32 x i16> %a, <32 x i16> %b)
5724 // <32 x i16> @llvm.x86.avx512.packssdw.512
5725 // (<16 x i32> %a, <16 x i32> %b)
5726 // Note: AVX512 masked variants are auto-upgraded by LLVM.
5727 case Intrinsic::x86_avx512_packsswb_512:
5728 case Intrinsic::x86_avx512_packssdw_512:
5729 case Intrinsic::x86_avx512_packuswb_512:
5730 case Intrinsic::x86_avx512_packusdw_512:
5731 handleVectorPackIntrinsic(I);
5732 break;
5733
5734 case Intrinsic::x86_sse41_pblendvb:
5735 case Intrinsic::x86_sse41_blendvpd:
5736 case Intrinsic::x86_sse41_blendvps:
5737 case Intrinsic::x86_avx_blendv_pd_256:
5738 case Intrinsic::x86_avx_blendv_ps_256:
5739 case Intrinsic::x86_avx2_pblendvb:
5740 handleBlendvIntrinsic(I);
5741 break;
5742
5743 case Intrinsic::x86_avx_dp_ps_256:
5744 case Intrinsic::x86_sse41_dppd:
5745 case Intrinsic::x86_sse41_dpps:
5746 handleDppIntrinsic(I);
5747 break;
5748
5749 case Intrinsic::x86_mmx_packsswb:
5750 case Intrinsic::x86_mmx_packuswb:
5751 handleVectorPackIntrinsic(I, 16);
5752 break;
5753
5754 case Intrinsic::x86_mmx_packssdw:
5755 handleVectorPackIntrinsic(I, 32);
5756 break;
5757
5758 case Intrinsic::x86_mmx_psad_bw:
5759 handleVectorSadIntrinsic(I, true);
5760 break;
5761 case Intrinsic::x86_sse2_psad_bw:
5762 case Intrinsic::x86_avx2_psad_bw:
5763 handleVectorSadIntrinsic(I);
5764 break;
5765
5766 // Multiply and Add Packed Words
5767 // < 4 x i32> @llvm.x86.sse2.pmadd.wd(<8 x i16>, <8 x i16>)
5768 // < 8 x i32> @llvm.x86.avx2.pmadd.wd(<16 x i16>, <16 x i16>)
5769 // <16 x i32> @llvm.x86.avx512.pmaddw.d.512(<32 x i16>, <32 x i16>)
5770 //
5771 // Multiply and Add Packed Signed and Unsigned Bytes
5772 // < 8 x i16> @llvm.x86.ssse3.pmadd.ub.sw.128(<16 x i8>, <16 x i8>)
5773 // <16 x i16> @llvm.x86.avx2.pmadd.ub.sw(<32 x i8>, <32 x i8>)
5774 // <32 x i16> @llvm.x86.avx512.pmaddubs.w.512(<64 x i8>, <64 x i8>)
5775 //
5776 // These intrinsics are auto-upgraded into non-masked forms:
5777 // < 4 x i32> @llvm.x86.avx512.mask.pmaddw.d.128
5778 // (<8 x i16>, <8 x i16>, <4 x i32>, i8)
5779 // < 8 x i32> @llvm.x86.avx512.mask.pmaddw.d.256
5780 // (<16 x i16>, <16 x i16>, <8 x i32>, i8)
5781 // <16 x i32> @llvm.x86.avx512.mask.pmaddw.d.512
5782 // (<32 x i16>, <32 x i16>, <16 x i32>, i16)
5783 // < 8 x i16> @llvm.x86.avx512.mask.pmaddubs.w.128
5784 // (<16 x i8>, <16 x i8>, <8 x i16>, i8)
5785 // <16 x i16> @llvm.x86.avx512.mask.pmaddubs.w.256
5786 // (<32 x i8>, <32 x i8>, <16 x i16>, i16)
5787 // <32 x i16> @llvm.x86.avx512.mask.pmaddubs.w.512
5788 // (<64 x i8>, <64 x i8>, <32 x i16>, i32)
5789 case Intrinsic::x86_sse2_pmadd_wd:
5790 case Intrinsic::x86_avx2_pmadd_wd:
5791 case Intrinsic::x86_avx512_pmaddw_d_512:
5792 case Intrinsic::x86_ssse3_pmadd_ub_sw_128:
5793 case Intrinsic::x86_avx2_pmadd_ub_sw:
5794 case Intrinsic::x86_avx512_pmaddubs_w_512:
5795 handleVectorPmaddIntrinsic(I, /*ReductionFactor=*/2,
5796 /*ZeroPurifies=*/true);
5797 break;
5798
5799 // <1 x i64> @llvm.x86.ssse3.pmadd.ub.sw(<1 x i64>, <1 x i64>)
5800 case Intrinsic::x86_ssse3_pmadd_ub_sw:
5801 handleVectorPmaddIntrinsic(I, /*ReductionFactor=*/2,
5802 /*ZeroPurifies=*/true, /*EltSizeInBits=*/8);
5803 break;
5804
5805 // <1 x i64> @llvm.x86.mmx.pmadd.wd(<1 x i64>, <1 x i64>)
5806 case Intrinsic::x86_mmx_pmadd_wd:
5807 handleVectorPmaddIntrinsic(I, /*ReductionFactor=*/2,
5808 /*ZeroPurifies=*/true, /*EltSizeInBits=*/16);
5809 break;
5810
5811 // AVX Vector Neural Network Instructions: bytes
5812 //
5813 // Multiply and Add Signed Bytes
5814 // < 4 x i32> @llvm.x86.avx2.vpdpbssd.128
5815 // (< 4 x i32>, <16 x i8>, <16 x i8>)
5816 // < 8 x i32> @llvm.x86.avx2.vpdpbssd.256
5817 // (< 8 x i32>, <32 x i8>, <32 x i8>)
5818 // <16 x i32> @llvm.x86.avx10.vpdpbssd.512
5819 // (<16 x i32>, <64 x i8>, <64 x i8>)
5820 //
5821 // Multiply and Add Signed Bytes With Saturation
5822 // < 4 x i32> @llvm.x86.avx2.vpdpbssds.128
5823 // (< 4 x i32>, <16 x i8>, <16 x i8>)
5824 // < 8 x i32> @llvm.x86.avx2.vpdpbssds.256
5825 // (< 8 x i32>, <32 x i8>, <32 x i8>)
5826 // <16 x i32> @llvm.x86.avx10.vpdpbssds.512
5827 // (<16 x i32>, <64 x i8>, <64 x i8>)
5828 //
5829 // Multiply and Add Signed and Unsigned Bytes
5830 // < 4 x i32> @llvm.x86.avx2.vpdpbsud.128
5831 // (< 4 x i32>, <16 x i8>, <16 x i8>)
5832 // < 8 x i32> @llvm.x86.avx2.vpdpbsud.256
5833 // (< 8 x i32>, <32 x i8>, <32 x i8>)
5834 // <16 x i32> @llvm.x86.avx10.vpdpbsud.512
5835 // (<16 x i32>, <64 x i8>, <64 x i8>)
5836 //
5837 // Multiply and Add Signed and Unsigned Bytes With Saturation
5838 // < 4 x i32> @llvm.x86.avx2.vpdpbsuds.128
5839 // (< 4 x i32>, <16 x i8>, <16 x i8>)
5840 // < 8 x i32> @llvm.x86.avx2.vpdpbsuds.256
5841 // (< 8 x i32>, <32 x i8>, <32 x i8>)
5842 // <16 x i32> @llvm.x86.avx512.vpdpbusds.512
5843 // (<16 x i32>, <64 x i8>, <64 x i8>)
5844 //
5845 // Multiply and Add Unsigned and Signed Bytes
5846 // < 4 x i32> @llvm.x86.avx512.vpdpbusd.128
5847 // (< 4 x i32>, <16 x i8>, <16 x i8>)
5848 // < 8 x i32> @llvm.x86.avx512.vpdpbusd.256
5849 // (< 8 x i32>, <32 x i8>, <32 x i8>)
5850 // <16 x i32> @llvm.x86.avx512.vpdpbusd.512
5851 // (<16 x i32>, <64 x i8>, <64 x i8>)
5852 //
5853 // Multiply and Add Unsigned and Signed Bytes With Saturation
5854 // < 4 x i32> @llvm.x86.avx512.vpdpbusds.128
5855 // (< 4 x i32>, <16 x i8>, <16 x i8>)
5856 // < 8 x i32> @llvm.x86.avx512.vpdpbusds.256
5857 // (< 8 x i32>, <32 x i8>, <32 x i8>)
5858 // <16 x i32> @llvm.x86.avx10.vpdpbsuds.512
5859 // (<16 x i32>, <64 x i8>, <64 x i8>)
5860 //
5861 // Multiply and Add Unsigned Bytes
5862 // < 4 x i32> @llvm.x86.avx2.vpdpbuud.128
5863 // (< 4 x i32>, <16 x i8>, <16 x i8>)
5864 // < 8 x i32> @llvm.x86.avx2.vpdpbuud.256
5865 // (< 8 x i32>, <32 x i8>, <32 x i8>)
5866 // <16 x i32> @llvm.x86.avx10.vpdpbuud.512
5867 // (<16 x i32>, <64 x i8>, <64 x i8>)
5868 //
5869 // Multiply and Add Unsigned Bytes With Saturation
5870 // < 4 x i32> @llvm.x86.avx2.vpdpbuuds.128
5871 // (< 4 x i32>, <16 x i8>, <16 x i8>)
5872 // < 8 x i32> @llvm.x86.avx2.vpdpbuuds.256
5873 // (< 8 x i32>, <32 x i8>, <32 x i8>)
5874 // <16 x i32> @llvm.x86.avx10.vpdpbuuds.512
5875 // (<16 x i32>, <64 x i8>, <64 x i8>)
5876 //
5877 // These intrinsics are auto-upgraded into non-masked forms:
5878 // <4 x i32> @llvm.x86.avx512.mask.vpdpbusd.128
5879 // (<4 x i32>, <16 x i8>, <16 x i8>, i8)
5880 // <4 x i32> @llvm.x86.avx512.maskz.vpdpbusd.128
5881 // (<4 x i32>, <16 x i8>, <16 x i8>, i8)
5882 // <8 x i32> @llvm.x86.avx512.mask.vpdpbusd.256
5883 // (<8 x i32>, <32 x i8>, <32 x i8>, i8)
5884 // <8 x i32> @llvm.x86.avx512.maskz.vpdpbusd.256
5885 // (<8 x i32>, <32 x i8>, <32 x i8>, i8)
5886 // <16 x i32> @llvm.x86.avx512.mask.vpdpbusd.512
5887 // (<16 x i32>, <64 x i8>, <64 x i8>, i16)
5888 // <16 x i32> @llvm.x86.avx512.maskz.vpdpbusd.512
5889 // (<16 x i32>, <64 x i8>, <64 x i8>, i16)
5890 //
5891 // <4 x i32> @llvm.x86.avx512.mask.vpdpbusds.128
5892 // (<4 x i32>, <16 x i8>, <16 x i8>, i8)
5893 // <4 x i32> @llvm.x86.avx512.maskz.vpdpbusds.128
5894 // (<4 x i32>, <16 x i8>, <16 x i8>, i8)
5895 // <8 x i32> @llvm.x86.avx512.mask.vpdpbusds.256
5896 // (<8 x i32>, <32 x i8>, <32 x i8>, i8)
5897 // <8 x i32> @llvm.x86.avx512.maskz.vpdpbusds.256
5898 // (<8 x i32>, <32 x i8>, <32 x i8>, i8)
5899 // <16 x i32> @llvm.x86.avx512.mask.vpdpbusds.512
5900 // (<16 x i32>, <64 x i8>, <64 x i8>, i16)
5901 // <16 x i32> @llvm.x86.avx512.maskz.vpdpbusds.512
5902 // (<16 x i32>, <64 x i8>, <64 x i8>, i16)
5903 case Intrinsic::x86_avx512_vpdpbusd_128:
5904 case Intrinsic::x86_avx512_vpdpbusd_256:
5905 case Intrinsic::x86_avx512_vpdpbusd_512:
5906 case Intrinsic::x86_avx512_vpdpbusds_128:
5907 case Intrinsic::x86_avx512_vpdpbusds_256:
5908 case Intrinsic::x86_avx512_vpdpbusds_512:
5909 case Intrinsic::x86_avx2_vpdpbssd_128:
5910 case Intrinsic::x86_avx2_vpdpbssd_256:
5911 case Intrinsic::x86_avx10_vpdpbssd_512:
5912 case Intrinsic::x86_avx2_vpdpbssds_128:
5913 case Intrinsic::x86_avx2_vpdpbssds_256:
5914 case Intrinsic::x86_avx10_vpdpbssds_512:
5915 case Intrinsic::x86_avx2_vpdpbsud_128:
5916 case Intrinsic::x86_avx2_vpdpbsud_256:
5917 case Intrinsic::x86_avx10_vpdpbsud_512:
5918 case Intrinsic::x86_avx2_vpdpbsuds_128:
5919 case Intrinsic::x86_avx2_vpdpbsuds_256:
5920 case Intrinsic::x86_avx10_vpdpbsuds_512:
5921 case Intrinsic::x86_avx2_vpdpbuud_128:
5922 case Intrinsic::x86_avx2_vpdpbuud_256:
5923 case Intrinsic::x86_avx10_vpdpbuud_512:
5924 case Intrinsic::x86_avx2_vpdpbuuds_128:
5925 case Intrinsic::x86_avx2_vpdpbuuds_256:
5926 case Intrinsic::x86_avx10_vpdpbuuds_512:
5927 handleVectorPmaddIntrinsic(I, /*ReductionFactor=*/4,
5928 /*ZeroPurifies=*/true);
5929 break;
5930
5931 // AVX Vector Neural Network Instructions: words
5932 //
5933 // Multiply and Add Signed Word Integers
5934 // < 4 x i32> @llvm.x86.avx512.vpdpwssd.128
5935 // (< 4 x i32>, < 8 x i16>, < 8 x i16>)
5936 // < 8 x i32> @llvm.x86.avx512.vpdpwssd.256
5937 // (< 8 x i32>, <16 x i16>, <16 x i16>)
5938 // <16 x i32> @llvm.x86.avx512.vpdpwssd.512
5939 // (<16 x i32>, <32 x i16>, <32 x i16>)
5940 //
5941 // Multiply and Add Signed Word Integers With Saturation
5942 // < 4 x i32> @llvm.x86.avx512.vpdpwssds.128
5943 // (< 4 x i32>, < 8 x i16>, < 8 x i16>)
5944 // < 8 x i32> @llvm.x86.avx512.vpdpwssds.256
5945 // (< 8 x i32>, <16 x i16>, <16 x i16>)
5946 // <16 x i32> @llvm.x86.avx512.vpdpwssds.512
5947 // (<16 x i32>, <32 x i16>, <32 x i16>)
5948 //
5949 // Multiply and Add Signed and Unsigned Word Integers
5950 // < 4 x i32> @llvm.x86.avx2.vpdpwsud.128
5951 // (< 4 x i32>, < 8 x i16>, < 8 x i16>)
5952 // < 8 x i32> @llvm.x86.avx2.vpdpwsud.256
5953 // (< 8 x i32>, <16 x i16>, <16 x i16>)
5954 // <16 x i32> @llvm.x86.avx10.vpdpwsud.512
5955 // (<16 x i32>, <32 x i16>, <32 x i16>)
5956 //
5957 // Multiply and Add Signed and Unsigned Word Integers With Saturation
5958 // < 4 x i32> @llvm.x86.avx2.vpdpwsuds.128
5959 // (< 4 x i32>, < 8 x i16>, < 8 x i16>)
5960 // < 8 x i32> @llvm.x86.avx2.vpdpwsuds.256
5961 // (< 8 x i32>, <16 x i16>, <16 x i16>)
5962 // <16 x i32> @llvm.x86.avx10.vpdpwsuds.512
5963 // (<16 x i32>, <32 x i16>, <32 x i16>)
5964 //
5965 // Multiply and Add Unsigned and Signed Word Integers
5966 // < 4 x i32> @llvm.x86.avx2.vpdpwusd.128
5967 // (< 4 x i32>, < 8 x i16>, < 8 x i16>)
5968 // < 8 x i32> @llvm.x86.avx2.vpdpwusd.256
5969 // (< 8 x i32>, <16 x i16>, <16 x i16>)
5970 // <16 x i32> @llvm.x86.avx10.vpdpwusd.512
5971 // (<16 x i32>, <32 x i16>, <32 x i16>)
5972 //
5973 // Multiply and Add Unsigned and Signed Word Integers With Saturation
5974 // < 4 x i32> @llvm.x86.avx2.vpdpwusds.128
5975 // (< 4 x i32>, < 8 x i16>, < 8 x i16>)
5976 // < 8 x i32> @llvm.x86.avx2.vpdpwusds.256
5977 // (< 8 x i32>, <16 x i16>, <16 x i16>)
5978 // <16 x i32> @llvm.x86.avx10.vpdpwusds.512
5979 // (<16 x i32>, <32 x i16>, <32 x i16>)
5980 //
5981 // Multiply and Add Unsigned and Unsigned Word Integers
5982 // < 4 x i32> @llvm.x86.avx2.vpdpwuud.128
5983 // (< 4 x i32>, < 8 x i16>, < 8 x i16>)
5984 // < 8 x i32> @llvm.x86.avx2.vpdpwuud.256
5985 // (< 8 x i32>, <16 x i16>, <16 x i16>)
5986 // <16 x i32> @llvm.x86.avx10.vpdpwuud.512
5987 // (<16 x i32>, <32 x i16>, <32 x i16>)
5988 //
5989 // Multiply and Add Unsigned and Unsigned Word Integers With Saturation
5990 // < 4 x i32> @llvm.x86.avx2.vpdpwuuds.128
5991 // (< 4 x i32>, < 8 x i16>, < 8 x i16>)
5992 // < 8 x i32> @llvm.x86.avx2.vpdpwuuds.256
5993 // (< 8 x i32>, <16 x i16>, <16 x i16>)
5994 // <16 x i32> @llvm.x86.avx10.vpdpwuuds.512
5995 // (<16 x i32>, <32 x i16>, <32 x i16>)
5996 //
5997 // These intrinsics are auto-upgraded into non-masked forms:
5998 // <4 x i32> @llvm.x86.avx512.mask.vpdpwssd.128
5999 // (<4 x i32>, <8 x i16>, <8 x i16>, i8)
6000 // <4 x i32> @llvm.x86.avx512.maskz.vpdpwssd.128
6001 // (<4 x i32>, <8 x i16>, <8 x i16>, i8)
6002 // <8 x i32> @llvm.x86.avx512.mask.vpdpwssd.256
6003 // (<8 x i32>, <16 x i16>, <16 x i16>, i8)
6004 // <8 x i32> @llvm.x86.avx512.maskz.vpdpwssd.256
6005 // (<8 x i32>, <16 x i16>, <16 x i16>, i8)
6006 // <16 x i32> @llvm.x86.avx512.mask.vpdpwssd.512
6007 // (<16 x i32>, <32 x i16>, <32 x i16>, i16)
6008 // <16 x i32> @llvm.x86.avx512.maskz.vpdpwssd.512
6009 // (<16 x i32>, <32 x i16>, <32 x i16>, i16)
6010 //
6011 // <4 x i32> @llvm.x86.avx512.mask.vpdpwssds.128
6012 // (<4 x i32>, <8 x i16>, <8 x i16>, i8)
6013 // <4 x i32> @llvm.x86.avx512.maskz.vpdpwssds.128
6014 // (<4 x i32>, <8 x i16>, <8 x i16>, i8)
6015 // <8 x i32> @llvm.x86.avx512.mask.vpdpwssds.256
6016 // (<8 x i32>, <16 x i16>, <16 x i16>, i8)
6017 // <8 x i32> @llvm.x86.avx512.maskz.vpdpwssds.256
6018 // (<8 x i32>, <16 x i16>, <16 x i16>, i8)
6019 // <16 x i32> @llvm.x86.avx512.mask.vpdpwssds.512
6020 // (<16 x i32>, <32 x i16>, <32 x i16>, i16)
6021 // <16 x i32> @llvm.x86.avx512.maskz.vpdpwssds.512
6022 // (<16 x i32>, <32 x i16>, <32 x i16>, i16)
6023 case Intrinsic::x86_avx512_vpdpwssd_128:
6024 case Intrinsic::x86_avx512_vpdpwssd_256:
6025 case Intrinsic::x86_avx512_vpdpwssd_512:
6026 case Intrinsic::x86_avx512_vpdpwssds_128:
6027 case Intrinsic::x86_avx512_vpdpwssds_256:
6028 case Intrinsic::x86_avx512_vpdpwssds_512:
6029 case Intrinsic::x86_avx2_vpdpwsud_128:
6030 case Intrinsic::x86_avx2_vpdpwsud_256:
6031 case Intrinsic::x86_avx10_vpdpwsud_512:
6032 case Intrinsic::x86_avx2_vpdpwsuds_128:
6033 case Intrinsic::x86_avx2_vpdpwsuds_256:
6034 case Intrinsic::x86_avx10_vpdpwsuds_512:
6035 case Intrinsic::x86_avx2_vpdpwusd_128:
6036 case Intrinsic::x86_avx2_vpdpwusd_256:
6037 case Intrinsic::x86_avx10_vpdpwusd_512:
6038 case Intrinsic::x86_avx2_vpdpwusds_128:
6039 case Intrinsic::x86_avx2_vpdpwusds_256:
6040 case Intrinsic::x86_avx10_vpdpwusds_512:
6041 case Intrinsic::x86_avx2_vpdpwuud_128:
6042 case Intrinsic::x86_avx2_vpdpwuud_256:
6043 case Intrinsic::x86_avx10_vpdpwuud_512:
6044 case Intrinsic::x86_avx2_vpdpwuuds_128:
6045 case Intrinsic::x86_avx2_vpdpwuuds_256:
6046 case Intrinsic::x86_avx10_vpdpwuuds_512:
6047 handleVectorPmaddIntrinsic(I, /*ReductionFactor=*/2,
6048 /*ZeroPurifies=*/true);
6049 break;
6050
6051 // Dot Product of BF16 Pairs Accumulated Into Packed Single
6052 // Precision
6053 // <4 x float> @llvm.x86.avx512bf16.dpbf16ps.128
6054 // (<4 x float>, <8 x bfloat>, <8 x bfloat>)
6055 // <8 x float> @llvm.x86.avx512bf16.dpbf16ps.256
6056 // (<8 x float>, <16 x bfloat>, <16 x bfloat>)
6057 // <16 x float> @llvm.x86.avx512bf16.dpbf16ps.512
6058 // (<16 x float>, <32 x bfloat>, <32 x bfloat>)
6059 case Intrinsic::x86_avx512bf16_dpbf16ps_128:
6060 case Intrinsic::x86_avx512bf16_dpbf16ps_256:
6061 case Intrinsic::x86_avx512bf16_dpbf16ps_512:
6062 handleVectorPmaddIntrinsic(I, /*ReductionFactor=*/2,
6063 /*ZeroPurifies=*/false);
6064 break;
6065
6066 case Intrinsic::x86_sse_cmp_ss:
6067 case Intrinsic::x86_sse2_cmp_sd:
6068 case Intrinsic::x86_sse_comieq_ss:
6069 case Intrinsic::x86_sse_comilt_ss:
6070 case Intrinsic::x86_sse_comile_ss:
6071 case Intrinsic::x86_sse_comigt_ss:
6072 case Intrinsic::x86_sse_comige_ss:
6073 case Intrinsic::x86_sse_comineq_ss:
6074 case Intrinsic::x86_sse_ucomieq_ss:
6075 case Intrinsic::x86_sse_ucomilt_ss:
6076 case Intrinsic::x86_sse_ucomile_ss:
6077 case Intrinsic::x86_sse_ucomigt_ss:
6078 case Intrinsic::x86_sse_ucomige_ss:
6079 case Intrinsic::x86_sse_ucomineq_ss:
6080 case Intrinsic::x86_sse2_comieq_sd:
6081 case Intrinsic::x86_sse2_comilt_sd:
6082 case Intrinsic::x86_sse2_comile_sd:
6083 case Intrinsic::x86_sse2_comigt_sd:
6084 case Intrinsic::x86_sse2_comige_sd:
6085 case Intrinsic::x86_sse2_comineq_sd:
6086 case Intrinsic::x86_sse2_ucomieq_sd:
6087 case Intrinsic::x86_sse2_ucomilt_sd:
6088 case Intrinsic::x86_sse2_ucomile_sd:
6089 case Intrinsic::x86_sse2_ucomigt_sd:
6090 case Intrinsic::x86_sse2_ucomige_sd:
6091 case Intrinsic::x86_sse2_ucomineq_sd:
6092 handleVectorCompareScalarIntrinsic(I);
6093 break;
6094
6095 case Intrinsic::x86_avx_cmp_pd_256:
6096 case Intrinsic::x86_avx_cmp_ps_256:
6097 case Intrinsic::x86_sse2_cmp_pd:
6098 case Intrinsic::x86_sse_cmp_ps:
6099 handleVectorComparePackedIntrinsic(I);
6100 break;
6101
6102 case Intrinsic::x86_bmi_bextr_32:
6103 case Intrinsic::x86_bmi_bextr_64:
6104 case Intrinsic::x86_bmi_bzhi_32:
6105 case Intrinsic::x86_bmi_bzhi_64:
6106 case Intrinsic::x86_bmi_pdep_32:
6107 case Intrinsic::x86_bmi_pdep_64:
6108 case Intrinsic::x86_bmi_pext_32:
6109 case Intrinsic::x86_bmi_pext_64:
6110 handleBmiIntrinsic(I);
6111 break;
6112
6113 case Intrinsic::x86_pclmulqdq:
6114 case Intrinsic::x86_pclmulqdq_256:
6115 case Intrinsic::x86_pclmulqdq_512:
6116 handlePclmulIntrinsic(I);
6117 break;
6118
6119 case Intrinsic::x86_avx_round_pd_256:
6120 case Intrinsic::x86_avx_round_ps_256:
6121 case Intrinsic::x86_sse41_round_pd:
6122 case Intrinsic::x86_sse41_round_ps:
6123 handleRoundPdPsIntrinsic(I);
6124 break;
6125
6126 case Intrinsic::x86_sse41_round_sd:
6127 case Intrinsic::x86_sse41_round_ss:
6128 handleUnarySdSsIntrinsic(I);
6129 break;
6130
6131 case Intrinsic::x86_sse2_max_sd:
6132 case Intrinsic::x86_sse_max_ss:
6133 case Intrinsic::x86_sse2_min_sd:
6134 case Intrinsic::x86_sse_min_ss:
6135 handleBinarySdSsIntrinsic(I);
6136 break;
6137
6138 case Intrinsic::x86_avx_vtestc_pd:
6139 case Intrinsic::x86_avx_vtestc_pd_256:
6140 case Intrinsic::x86_avx_vtestc_ps:
6141 case Intrinsic::x86_avx_vtestc_ps_256:
6142 case Intrinsic::x86_avx_vtestnzc_pd:
6143 case Intrinsic::x86_avx_vtestnzc_pd_256:
6144 case Intrinsic::x86_avx_vtestnzc_ps:
6145 case Intrinsic::x86_avx_vtestnzc_ps_256:
6146 case Intrinsic::x86_avx_vtestz_pd:
6147 case Intrinsic::x86_avx_vtestz_pd_256:
6148 case Intrinsic::x86_avx_vtestz_ps:
6149 case Intrinsic::x86_avx_vtestz_ps_256:
6150 case Intrinsic::x86_avx_ptestc_256:
6151 case Intrinsic::x86_avx_ptestnzc_256:
6152 case Intrinsic::x86_avx_ptestz_256:
6153 case Intrinsic::x86_sse41_ptestc:
6154 case Intrinsic::x86_sse41_ptestnzc:
6155 case Intrinsic::x86_sse41_ptestz:
6156 handleVtestIntrinsic(I);
6157 break;
6158
6159 // Packed Horizontal Add/Subtract
6160 case Intrinsic::x86_ssse3_phadd_w:
6161 case Intrinsic::x86_ssse3_phadd_w_128:
6162 case Intrinsic::x86_ssse3_phsub_w:
6163 case Intrinsic::x86_ssse3_phsub_w_128:
6164 handlePairwiseShadowOrIntrinsic(I, /*Shards=*/1,
6165 /*ReinterpretElemWidth=*/16);
6166 break;
6167
6168 case Intrinsic::x86_avx2_phadd_w:
6169 case Intrinsic::x86_avx2_phsub_w:
6170 handlePairwiseShadowOrIntrinsic(I, /*Shards=*/2,
6171 /*ReinterpretElemWidth=*/16);
6172 break;
6173
6174 // Packed Horizontal Add/Subtract
6175 case Intrinsic::x86_ssse3_phadd_d:
6176 case Intrinsic::x86_ssse3_phadd_d_128:
6177 case Intrinsic::x86_ssse3_phsub_d:
6178 case Intrinsic::x86_ssse3_phsub_d_128:
6179 handlePairwiseShadowOrIntrinsic(I, /*Shards=*/1,
6180 /*ReinterpretElemWidth=*/32);
6181 break;
6182
6183 case Intrinsic::x86_avx2_phadd_d:
6184 case Intrinsic::x86_avx2_phsub_d:
6185 handlePairwiseShadowOrIntrinsic(I, /*Shards=*/2,
6186 /*ReinterpretElemWidth=*/32);
6187 break;
6188
6189 // Packed Horizontal Add/Subtract and Saturate
6190 case Intrinsic::x86_ssse3_phadd_sw:
6191 case Intrinsic::x86_ssse3_phadd_sw_128:
6192 case Intrinsic::x86_ssse3_phsub_sw:
6193 case Intrinsic::x86_ssse3_phsub_sw_128:
6194 handlePairwiseShadowOrIntrinsic(I, /*Shards=*/1,
6195 /*ReinterpretElemWidth=*/16);
6196 break;
6197
6198 case Intrinsic::x86_avx2_phadd_sw:
6199 case Intrinsic::x86_avx2_phsub_sw:
6200 handlePairwiseShadowOrIntrinsic(I, /*Shards=*/2,
6201 /*ReinterpretElemWidth=*/16);
6202 break;
6203
6204 // Packed Single/Double Precision Floating-Point Horizontal Add
6205 case Intrinsic::x86_sse3_hadd_ps:
6206 case Intrinsic::x86_sse3_hadd_pd:
6207 case Intrinsic::x86_sse3_hsub_ps:
6208 case Intrinsic::x86_sse3_hsub_pd:
6209 handlePairwiseShadowOrIntrinsic(I, /*Shards=*/1);
6210 break;
6211
6212 case Intrinsic::x86_avx_hadd_pd_256:
6213 case Intrinsic::x86_avx_hadd_ps_256:
6214 case Intrinsic::x86_avx_hsub_pd_256:
6215 case Intrinsic::x86_avx_hsub_ps_256:
6216 handlePairwiseShadowOrIntrinsic(I, /*Shards=*/2);
6217 break;
6218
6219 case Intrinsic::x86_avx_maskstore_ps:
6220 case Intrinsic::x86_avx_maskstore_pd:
6221 case Intrinsic::x86_avx_maskstore_ps_256:
6222 case Intrinsic::x86_avx_maskstore_pd_256:
6223 case Intrinsic::x86_avx2_maskstore_d:
6224 case Intrinsic::x86_avx2_maskstore_q:
6225 case Intrinsic::x86_avx2_maskstore_d_256:
6226 case Intrinsic::x86_avx2_maskstore_q_256: {
6227 handleAVXMaskedStore(I);
6228 break;
6229 }
6230
6231 case Intrinsic::x86_avx_maskload_ps:
6232 case Intrinsic::x86_avx_maskload_pd:
6233 case Intrinsic::x86_avx_maskload_ps_256:
6234 case Intrinsic::x86_avx_maskload_pd_256:
6235 case Intrinsic::x86_avx2_maskload_d:
6236 case Intrinsic::x86_avx2_maskload_q:
6237 case Intrinsic::x86_avx2_maskload_d_256:
6238 case Intrinsic::x86_avx2_maskload_q_256: {
6239 handleAVXMaskedLoad(I);
6240 break;
6241 }
6242
6243 // Packed
6244 case Intrinsic::x86_avx512fp16_add_ph_512:
6245 case Intrinsic::x86_avx512fp16_sub_ph_512:
6246 case Intrinsic::x86_avx512fp16_mul_ph_512:
6247 case Intrinsic::x86_avx512fp16_div_ph_512:
6248 case Intrinsic::x86_avx512fp16_max_ph_512:
6249 case Intrinsic::x86_avx512fp16_min_ph_512:
6250 case Intrinsic::x86_avx512_min_ps_512:
6251 case Intrinsic::x86_avx512_min_pd_512:
6252 case Intrinsic::x86_avx512_max_ps_512:
6253 case Intrinsic::x86_avx512_max_pd_512: {
6254 // These AVX512 variants contain the rounding mode as a trailing flag.
6255 // Earlier variants do not have a trailing flag and are already handled
6256 // by maybeHandleSimpleNomemIntrinsic(I, 0) via
6257 // maybeHandleUnknownIntrinsic.
6258 [[maybe_unused]] bool Success =
6259 maybeHandleSimpleNomemIntrinsic(I, /*trailingFlags=*/1);
6260 assert(Success);
6261 break;
6262 }
6263
6264 case Intrinsic::x86_avx_vpermilvar_pd:
6265 case Intrinsic::x86_avx_vpermilvar_pd_256:
6266 case Intrinsic::x86_avx512_vpermilvar_pd_512:
6267 case Intrinsic::x86_avx_vpermilvar_ps:
6268 case Intrinsic::x86_avx_vpermilvar_ps_256:
6269 case Intrinsic::x86_avx512_vpermilvar_ps_512: {
6270 handleAVXVpermilvar(I);
6271 break;
6272 }
6273
6274 case Intrinsic::x86_avx512_vpermi2var_d_128:
6275 case Intrinsic::x86_avx512_vpermi2var_d_256:
6276 case Intrinsic::x86_avx512_vpermi2var_d_512:
6277 case Intrinsic::x86_avx512_vpermi2var_hi_128:
6278 case Intrinsic::x86_avx512_vpermi2var_hi_256:
6279 case Intrinsic::x86_avx512_vpermi2var_hi_512:
6280 case Intrinsic::x86_avx512_vpermi2var_pd_128:
6281 case Intrinsic::x86_avx512_vpermi2var_pd_256:
6282 case Intrinsic::x86_avx512_vpermi2var_pd_512:
6283 case Intrinsic::x86_avx512_vpermi2var_ps_128:
6284 case Intrinsic::x86_avx512_vpermi2var_ps_256:
6285 case Intrinsic::x86_avx512_vpermi2var_ps_512:
6286 case Intrinsic::x86_avx512_vpermi2var_q_128:
6287 case Intrinsic::x86_avx512_vpermi2var_q_256:
6288 case Intrinsic::x86_avx512_vpermi2var_q_512:
6289 case Intrinsic::x86_avx512_vpermi2var_qi_128:
6290 case Intrinsic::x86_avx512_vpermi2var_qi_256:
6291 case Intrinsic::x86_avx512_vpermi2var_qi_512:
6292 handleAVXVpermi2var(I);
6293 break;
6294
6295 // Packed Shuffle
6296 // llvm.x86.sse.pshuf.w(<1 x i64>, i8)
6297 // llvm.x86.ssse3.pshuf.b(<1 x i64>, <1 x i64>)
6298 // llvm.x86.ssse3.pshuf.b.128(<16 x i8>, <16 x i8>)
6299 // llvm.x86.avx2.pshuf.b(<32 x i8>, <32 x i8>)
6300 // llvm.x86.avx512.pshuf.b.512(<64 x i8>, <64 x i8>)
6301 //
6302 // The following intrinsics are auto-upgraded:
6303 // llvm.x86.sse2.pshuf.d(<4 x i32>, i8)
6304 // llvm.x86.sse2.gpshufh.w(<8 x i16>, i8)
6305 // llvm.x86.sse2.pshufl.w(<8 x i16>, i8)
6306 case Intrinsic::x86_avx2_pshuf_b:
6307 case Intrinsic::x86_sse_pshuf_w:
6308 case Intrinsic::x86_ssse3_pshuf_b_128:
6309 case Intrinsic::x86_ssse3_pshuf_b:
6310 case Intrinsic::x86_avx512_pshuf_b_512:
6311 handleIntrinsicByApplyingToShadow(I, I.getIntrinsicID(),
6312 /*trailingVerbatimArgs=*/1);
6313 break;
6314
6315 // AVX512 PMOV: Packed MOV, with truncation
6316 // Precisely handled by applying the same intrinsic to the shadow
6317 case Intrinsic::x86_avx512_mask_pmov_dw_512:
6318 case Intrinsic::x86_avx512_mask_pmov_db_512:
6319 case Intrinsic::x86_avx512_mask_pmov_qb_512:
6320 case Intrinsic::x86_avx512_mask_pmov_qw_512: {
6321 // Intrinsic::x86_avx512_mask_pmov_{qd,wb}_512 were removed in
6322 // f608dc1f5775ee880e8ea30e2d06ab5a4a935c22
6323 handleIntrinsicByApplyingToShadow(I, I.getIntrinsicID(),
6324 /*trailingVerbatimArgs=*/1);
6325 break;
6326 }
6327
6328 // AVX512 PMVOV{S,US}: Packed MOV, with signed/unsigned saturation
6329 // Approximately handled using the corresponding truncation intrinsic
6330 // TODO: improve handleAVX512VectorDownConvert to precisely model saturation
6331 case Intrinsic::x86_avx512_mask_pmovs_dw_512:
6332 case Intrinsic::x86_avx512_mask_pmovus_dw_512: {
6333 handleIntrinsicByApplyingToShadow(I,
6334 Intrinsic::x86_avx512_mask_pmov_dw_512,
6335 /* trailingVerbatimArgs=*/1);
6336 break;
6337 }
6338
6339 case Intrinsic::x86_avx512_mask_pmovs_db_512:
6340 case Intrinsic::x86_avx512_mask_pmovus_db_512: {
6341 handleIntrinsicByApplyingToShadow(I,
6342 Intrinsic::x86_avx512_mask_pmov_db_512,
6343 /* trailingVerbatimArgs=*/1);
6344 break;
6345 }
6346
6347 case Intrinsic::x86_avx512_mask_pmovs_qb_512:
6348 case Intrinsic::x86_avx512_mask_pmovus_qb_512: {
6349 handleIntrinsicByApplyingToShadow(I,
6350 Intrinsic::x86_avx512_mask_pmov_qb_512,
6351 /* trailingVerbatimArgs=*/1);
6352 break;
6353 }
6354
6355 case Intrinsic::x86_avx512_mask_pmovs_qw_512:
6356 case Intrinsic::x86_avx512_mask_pmovus_qw_512: {
6357 handleIntrinsicByApplyingToShadow(I,
6358 Intrinsic::x86_avx512_mask_pmov_qw_512,
6359 /* trailingVerbatimArgs=*/1);
6360 break;
6361 }
6362
6363 case Intrinsic::x86_avx512_mask_pmovs_qd_512:
6364 case Intrinsic::x86_avx512_mask_pmovus_qd_512:
6365 case Intrinsic::x86_avx512_mask_pmovs_wb_512:
6366 case Intrinsic::x86_avx512_mask_pmovus_wb_512: {
6367 // Since Intrinsic::x86_avx512_mask_pmov_{qd,wb}_512 do not exist, we
6368 // cannot use handleIntrinsicByApplyingToShadow. Instead, we call the
6369 // slow-path handler.
6370 handleAVX512VectorDownConvert(I);
6371 break;
6372 }
6373
6374 // AVX512/AVX10 Reciprocal
6375 // <16 x float> @llvm.x86.avx512.rsqrt14.ps.512
6376 // (<16 x float>, <16 x float>, i16)
6377 // <8 x float> @llvm.x86.avx512.rsqrt14.ps.256
6378 // (<8 x float>, <8 x float>, i8)
6379 // <4 x float> @llvm.x86.avx512.rsqrt14.ps.128
6380 // (<4 x float>, <4 x float>, i8)
6381 //
6382 // <8 x double> @llvm.x86.avx512.rsqrt14.pd.512
6383 // (<8 x double>, <8 x double>, i8)
6384 // <4 x double> @llvm.x86.avx512.rsqrt14.pd.256
6385 // (<4 x double>, <4 x double>, i8)
6386 // <2 x double> @llvm.x86.avx512.rsqrt14.pd.128
6387 // (<2 x double>, <2 x double>, i8)
6388 //
6389 // <32 x bfloat> @llvm.x86.avx10.mask.rsqrt.bf16.512
6390 // (<32 x bfloat>, <32 x bfloat>, i32)
6391 // <16 x bfloat> @llvm.x86.avx10.mask.rsqrt.bf16.256
6392 // (<16 x bfloat>, <16 x bfloat>, i16)
6393 // <8 x bfloat> @llvm.x86.avx10.mask.rsqrt.bf16.128
6394 // (<8 x bfloat>, <8 x bfloat>, i8)
6395 //
6396 // <32 x half> @llvm.x86.avx512fp16.mask.rsqrt.ph.512
6397 // (<32 x half>, <32 x half>, i32)
6398 // <16 x half> @llvm.x86.avx512fp16.mask.rsqrt.ph.256
6399 // (<16 x half>, <16 x half>, i16)
6400 // <8 x half> @llvm.x86.avx512fp16.mask.rsqrt.ph.128
6401 // (<8 x half>, <8 x half>, i8)
6402 //
6403 // TODO: 3-operand variants are not handled:
6404 // <2 x double> @llvm.x86.avx512.rsqrt14.sd
6405 // (<2 x double>, <2 x double>, <2 x double>, i8)
6406 // <4 x float> @llvm.x86.avx512.rsqrt14.ss
6407 // (<4 x float>, <4 x float>, <4 x float>, i8)
6408 // <8 x half> @llvm.x86.avx512fp16.mask.rsqrt.sh
6409 // (<8 x half>, <8 x half>, <8 x half>, i8)
6410 case Intrinsic::x86_avx512_rsqrt14_ps_512:
6411 case Intrinsic::x86_avx512_rsqrt14_ps_256:
6412 case Intrinsic::x86_avx512_rsqrt14_ps_128:
6413 case Intrinsic::x86_avx512_rsqrt14_pd_512:
6414 case Intrinsic::x86_avx512_rsqrt14_pd_256:
6415 case Intrinsic::x86_avx512_rsqrt14_pd_128:
6416 case Intrinsic::x86_avx10_mask_rsqrt_bf16_512:
6417 case Intrinsic::x86_avx10_mask_rsqrt_bf16_256:
6418 case Intrinsic::x86_avx10_mask_rsqrt_bf16_128:
6419 case Intrinsic::x86_avx512fp16_mask_rsqrt_ph_512:
6420 case Intrinsic::x86_avx512fp16_mask_rsqrt_ph_256:
6421 case Intrinsic::x86_avx512fp16_mask_rsqrt_ph_128:
6422 handleAVX512VectorGenericMaskedFP(I, /*AIndex=*/0, /*WriteThruIndex=*/1,
6423 /*MaskIndex=*/2);
6424 break;
6425
6426 // AVX512/AVX10 Reciprocal Square Root
6427 // <16 x float> @llvm.x86.avx512.rcp14.ps.512
6428 // (<16 x float>, <16 x float>, i16)
6429 // <8 x float> @llvm.x86.avx512.rcp14.ps.256
6430 // (<8 x float>, <8 x float>, i8)
6431 // <4 x float> @llvm.x86.avx512.rcp14.ps.128
6432 // (<4 x float>, <4 x float>, i8)
6433 //
6434 // <8 x double> @llvm.x86.avx512.rcp14.pd.512
6435 // (<8 x double>, <8 x double>, i8)
6436 // <4 x double> @llvm.x86.avx512.rcp14.pd.256
6437 // (<4 x double>, <4 x double>, i8)
6438 // <2 x double> @llvm.x86.avx512.rcp14.pd.128
6439 // (<2 x double>, <2 x double>, i8)
6440 //
6441 // <32 x bfloat> @llvm.x86.avx10.mask.rcp.bf16.512
6442 // (<32 x bfloat>, <32 x bfloat>, i32)
6443 // <16 x bfloat> @llvm.x86.avx10.mask.rcp.bf16.256
6444 // (<16 x bfloat>, <16 x bfloat>, i16)
6445 // <8 x bfloat> @llvm.x86.avx10.mask.rcp.bf16.128
6446 // (<8 x bfloat>, <8 x bfloat>, i8)
6447 //
6448 // <32 x half> @llvm.x86.avx512fp16.mask.rcp.ph.512
6449 // (<32 x half>, <32 x half>, i32)
6450 // <16 x half> @llvm.x86.avx512fp16.mask.rcp.ph.256
6451 // (<16 x half>, <16 x half>, i16)
6452 // <8 x half> @llvm.x86.avx512fp16.mask.rcp.ph.128
6453 // (<8 x half>, <8 x half>, i8)
6454 //
6455 // TODO: 3-operand variants are not handled:
6456 // <2 x double> @llvm.x86.avx512.rcp14.sd
6457 // (<2 x double>, <2 x double>, <2 x double>, i8)
6458 // <4 x float> @llvm.x86.avx512.rcp14.ss
6459 // (<4 x float>, <4 x float>, <4 x float>, i8)
6460 // <8 x half> @llvm.x86.avx512fp16.mask.rcp.sh
6461 // (<8 x half>, <8 x half>, <8 x half>, i8)
6462 case Intrinsic::x86_avx512_rcp14_ps_512:
6463 case Intrinsic::x86_avx512_rcp14_ps_256:
6464 case Intrinsic::x86_avx512_rcp14_ps_128:
6465 case Intrinsic::x86_avx512_rcp14_pd_512:
6466 case Intrinsic::x86_avx512_rcp14_pd_256:
6467 case Intrinsic::x86_avx512_rcp14_pd_128:
6468 case Intrinsic::x86_avx10_mask_rcp_bf16_512:
6469 case Intrinsic::x86_avx10_mask_rcp_bf16_256:
6470 case Intrinsic::x86_avx10_mask_rcp_bf16_128:
6471 case Intrinsic::x86_avx512fp16_mask_rcp_ph_512:
6472 case Intrinsic::x86_avx512fp16_mask_rcp_ph_256:
6473 case Intrinsic::x86_avx512fp16_mask_rcp_ph_128:
6474 handleAVX512VectorGenericMaskedFP(I, /*AIndex=*/0, /*WriteThruIndex=*/1,
6475 /*MaskIndex=*/2);
6476 break;
6477
6478 // <32 x half> @llvm.x86.avx512fp16.mask.rndscale.ph.512
6479 // (<32 x half>, i32, <32 x half>, i32, i32)
6480 // <16 x half> @llvm.x86.avx512fp16.mask.rndscale.ph.256
6481 // (<16 x half>, i32, <16 x half>, i32, i16)
6482 // <8 x half> @llvm.x86.avx512fp16.mask.rndscale.ph.128
6483 // (<8 x half>, i32, <8 x half>, i32, i8)
6484 //
6485 // <16 x float> @llvm.x86.avx512.mask.rndscale.ps.512
6486 // (<16 x float>, i32, <16 x float>, i16, i32)
6487 // <8 x float> @llvm.x86.avx512.mask.rndscale.ps.256
6488 // (<8 x float>, i32, <8 x float>, i8)
6489 // <4 x float> @llvm.x86.avx512.mask.rndscale.ps.128
6490 // (<4 x float>, i32, <4 x float>, i8)
6491 //
6492 // <8 x double> @llvm.x86.avx512.mask.rndscale.pd.512
6493 // (<8 x double>, i32, <8 x double>, i8, i32)
6494 // A Imm WriteThru Mask Rounding
6495 // <4 x double> @llvm.x86.avx512.mask.rndscale.pd.256
6496 // (<4 x double>, i32, <4 x double>, i8)
6497 // <2 x double> @llvm.x86.avx512.mask.rndscale.pd.128
6498 // (<2 x double>, i32, <2 x double>, i8)
6499 // A Imm WriteThru Mask
6500 //
6501 // <32 x bfloat> @llvm.x86.avx10.mask.rndscale.bf16.512
6502 // (<32 x bfloat>, i32, <32 x bfloat>, i32)
6503 // <16 x bfloat> @llvm.x86.avx10.mask.rndscale.bf16.256
6504 // (<16 x bfloat>, i32, <16 x bfloat>, i16)
6505 // <8 x bfloat> @llvm.x86.avx10.mask.rndscale.bf16.128
6506 // (<8 x bfloat>, i32, <8 x bfloat>, i8)
6507 //
6508 // Not supported: three vectors
6509 // - <8 x half> @llvm.x86.avx512fp16.mask.rndscale.sh
6510 // (<8 x half>, <8 x half>,<8 x half>, i8, i32, i32)
6511 // - <4 x float> @llvm.x86.avx512.mask.rndscale.ss
6512 // (<4 x float>, <4 x float>, <4 x float>, i8, i32, i32)
6513 // - <2 x double> @llvm.x86.avx512.mask.rndscale.sd
6514 // (<2 x double>, <2 x double>, <2 x double>, i8, i32,
6515 // i32)
6516 // A B WriteThru Mask Imm
6517 // Rounding
6518 case Intrinsic::x86_avx512fp16_mask_rndscale_ph_512:
6519 case Intrinsic::x86_avx512fp16_mask_rndscale_ph_256:
6520 case Intrinsic::x86_avx512fp16_mask_rndscale_ph_128:
6521 case Intrinsic::x86_avx512_mask_rndscale_ps_512:
6522 case Intrinsic::x86_avx512_mask_rndscale_ps_256:
6523 case Intrinsic::x86_avx512_mask_rndscale_ps_128:
6524 case Intrinsic::x86_avx512_mask_rndscale_pd_512:
6525 case Intrinsic::x86_avx512_mask_rndscale_pd_256:
6526 case Intrinsic::x86_avx512_mask_rndscale_pd_128:
6527 case Intrinsic::x86_avx10_mask_rndscale_bf16_512:
6528 case Intrinsic::x86_avx10_mask_rndscale_bf16_256:
6529 case Intrinsic::x86_avx10_mask_rndscale_bf16_128:
6530 handleAVX512VectorGenericMaskedFP(I, /*AIndex=*/0, /*WriteThruIndex=*/2,
6531 /*MaskIndex=*/3);
6532 break;
6533
6534 // AVX512 FP16 Arithmetic
6535 case Intrinsic::x86_avx512fp16_mask_add_sh_round:
6536 case Intrinsic::x86_avx512fp16_mask_sub_sh_round:
6537 case Intrinsic::x86_avx512fp16_mask_mul_sh_round:
6538 case Intrinsic::x86_avx512fp16_mask_div_sh_round:
6539 case Intrinsic::x86_avx512fp16_mask_max_sh_round:
6540 case Intrinsic::x86_avx512fp16_mask_min_sh_round: {
6541 visitGenericScalarHalfwordInst(I);
6542 break;
6543 }
6544
6545 // AVX Galois Field New Instructions
6546 case Intrinsic::x86_vgf2p8affineqb_128:
6547 case Intrinsic::x86_vgf2p8affineqb_256:
6548 case Intrinsic::x86_vgf2p8affineqb_512:
6549 handleAVXGF2P8Affine(I);
6550 break;
6551
6552 default:
6553 return false;
6554 }
6555
6556 return true;
6557 }
6558
6559 bool maybeHandleArmSIMDIntrinsic(IntrinsicInst &I) {
6560 switch (I.getIntrinsicID()) {
6561 case Intrinsic::aarch64_neon_rshrn:
6562 case Intrinsic::aarch64_neon_sqrshl:
6563 case Intrinsic::aarch64_neon_sqrshrn:
6564 case Intrinsic::aarch64_neon_sqrshrun:
6565 case Intrinsic::aarch64_neon_sqshl:
6566 case Intrinsic::aarch64_neon_sqshlu:
6567 case Intrinsic::aarch64_neon_sqshrn:
6568 case Intrinsic::aarch64_neon_sqshrun:
6569 case Intrinsic::aarch64_neon_srshl:
6570 case Intrinsic::aarch64_neon_sshl:
6571 case Intrinsic::aarch64_neon_uqrshl:
6572 case Intrinsic::aarch64_neon_uqrshrn:
6573 case Intrinsic::aarch64_neon_uqshl:
6574 case Intrinsic::aarch64_neon_uqshrn:
6575 case Intrinsic::aarch64_neon_urshl:
6576 case Intrinsic::aarch64_neon_ushl:
6577 // Not handled here: aarch64_neon_vsli (vector shift left and insert)
6578 handleVectorShiftIntrinsic(I, /* Variable */ false);
6579 break;
6580
6581 // TODO: handling max/min similarly to AND/OR may be more precise
6582 // Floating-Point Maximum/Minimum Pairwise
6583 case Intrinsic::aarch64_neon_fmaxp:
6584 case Intrinsic::aarch64_neon_fminp:
6585 // Floating-Point Maximum/Minimum Number Pairwise
6586 case Intrinsic::aarch64_neon_fmaxnmp:
6587 case Intrinsic::aarch64_neon_fminnmp:
6588 // Signed/Unsigned Maximum/Minimum Pairwise
6589 case Intrinsic::aarch64_neon_smaxp:
6590 case Intrinsic::aarch64_neon_sminp:
6591 case Intrinsic::aarch64_neon_umaxp:
6592 case Intrinsic::aarch64_neon_uminp:
6593 // Add Pairwise
6594 case Intrinsic::aarch64_neon_addp:
6595 // Floating-point Add Pairwise
6596 case Intrinsic::aarch64_neon_faddp:
6597 // Add Long Pairwise
6598 case Intrinsic::aarch64_neon_saddlp:
6599 case Intrinsic::aarch64_neon_uaddlp: {
6600 handlePairwiseShadowOrIntrinsic(I, /*Shards=*/1);
6601 break;
6602 }
6603
6604 // Floating-point Convert to integer, rounding to nearest with ties to Away
6605 case Intrinsic::aarch64_neon_fcvtas:
6606 case Intrinsic::aarch64_neon_fcvtau:
6607 // Floating-point convert to integer, rounding toward minus infinity
6608 case Intrinsic::aarch64_neon_fcvtms:
6609 case Intrinsic::aarch64_neon_fcvtmu:
6610 // Floating-point convert to integer, rounding to nearest with ties to even
6611 case Intrinsic::aarch64_neon_fcvtns:
6612 case Intrinsic::aarch64_neon_fcvtnu:
6613 // Floating-point convert to integer, rounding toward plus infinity
6614 case Intrinsic::aarch64_neon_fcvtps:
6615 case Intrinsic::aarch64_neon_fcvtpu:
6616 // Floating-point Convert to integer, rounding toward Zero
6617 case Intrinsic::aarch64_neon_fcvtzs:
6618 case Intrinsic::aarch64_neon_fcvtzu:
6619 // Floating-point convert to lower precision narrow, rounding to odd
6620 case Intrinsic::aarch64_neon_fcvtxn: {
6621 handleNEONVectorConvertIntrinsic(I);
6622 break;
6623 }
6624
6625 // Add reduction to scalar
6626 case Intrinsic::aarch64_neon_faddv:
6627 case Intrinsic::aarch64_neon_saddv:
6628 case Intrinsic::aarch64_neon_uaddv:
6629 // Signed/Unsigned min/max (Vector)
6630 // TODO: handling similarly to AND/OR may be more precise.
6631 case Intrinsic::aarch64_neon_smaxv:
6632 case Intrinsic::aarch64_neon_sminv:
6633 case Intrinsic::aarch64_neon_umaxv:
6634 case Intrinsic::aarch64_neon_uminv:
6635 // Floating-point min/max (vector)
6636 // The f{min,max}"nm"v variants handle NaN differently than f{min,max}v,
6637 // but our shadow propagation is the same.
6638 case Intrinsic::aarch64_neon_fmaxv:
6639 case Intrinsic::aarch64_neon_fminv:
6640 case Intrinsic::aarch64_neon_fmaxnmv:
6641 case Intrinsic::aarch64_neon_fminnmv:
6642 // Sum long across vector
6643 case Intrinsic::aarch64_neon_saddlv:
6644 case Intrinsic::aarch64_neon_uaddlv:
6645 handleVectorReduceIntrinsic(I, /*AllowShadowCast=*/true);
6646 break;
6647
6648 case Intrinsic::aarch64_neon_ld1x2:
6649 case Intrinsic::aarch64_neon_ld1x3:
6650 case Intrinsic::aarch64_neon_ld1x4:
6651 case Intrinsic::aarch64_neon_ld2:
6652 case Intrinsic::aarch64_neon_ld3:
6653 case Intrinsic::aarch64_neon_ld4:
6654 case Intrinsic::aarch64_neon_ld2r:
6655 case Intrinsic::aarch64_neon_ld3r:
6656 case Intrinsic::aarch64_neon_ld4r: {
6657 handleNEONVectorLoad(I, /*WithLane=*/false);
6658 break;
6659 }
6660
6661 case Intrinsic::aarch64_neon_ld2lane:
6662 case Intrinsic::aarch64_neon_ld3lane:
6663 case Intrinsic::aarch64_neon_ld4lane: {
6664 handleNEONVectorLoad(I, /*WithLane=*/true);
6665 break;
6666 }
6667
6668 // Saturating extract narrow
6669 case Intrinsic::aarch64_neon_sqxtn:
6670 case Intrinsic::aarch64_neon_sqxtun:
6671 case Intrinsic::aarch64_neon_uqxtn:
6672 // These only have one argument, but we (ab)use handleShadowOr because it
6673 // does work on single argument intrinsics and will typecast the shadow
6674 // (and update the origin).
6675 handleShadowOr(I);
6676 break;
6677
6678 case Intrinsic::aarch64_neon_st1x2:
6679 case Intrinsic::aarch64_neon_st1x3:
6680 case Intrinsic::aarch64_neon_st1x4:
6681 case Intrinsic::aarch64_neon_st2:
6682 case Intrinsic::aarch64_neon_st3:
6683 case Intrinsic::aarch64_neon_st4: {
6684 handleNEONVectorStoreIntrinsic(I, false);
6685 break;
6686 }
6687
6688 case Intrinsic::aarch64_neon_st2lane:
6689 case Intrinsic::aarch64_neon_st3lane:
6690 case Intrinsic::aarch64_neon_st4lane: {
6691 handleNEONVectorStoreIntrinsic(I, true);
6692 break;
6693 }
6694
6695 // Arm NEON vector table intrinsics have the source/table register(s) as
6696 // arguments, followed by the index register. They return the output.
6697 //
6698 // 'TBL writes a zero if an index is out-of-range, while TBX leaves the
6699 // original value unchanged in the destination register.'
6700 // Conveniently, zero denotes a clean shadow, which means out-of-range
6701 // indices for TBL will initialize the user data with zero and also clean
6702 // the shadow. (For TBX, neither the user data nor the shadow will be
6703 // updated, which is also correct.)
6704 case Intrinsic::aarch64_neon_tbl1:
6705 case Intrinsic::aarch64_neon_tbl2:
6706 case Intrinsic::aarch64_neon_tbl3:
6707 case Intrinsic::aarch64_neon_tbl4:
6708 case Intrinsic::aarch64_neon_tbx1:
6709 case Intrinsic::aarch64_neon_tbx2:
6710 case Intrinsic::aarch64_neon_tbx3:
6711 case Intrinsic::aarch64_neon_tbx4: {
6712 // The last trailing argument (index register) should be handled verbatim
6713 handleIntrinsicByApplyingToShadow(
6714 I, /*shadowIntrinsicID=*/I.getIntrinsicID(),
6715 /*trailingVerbatimArgs*/ 1);
6716 break;
6717 }
6718
6719 case Intrinsic::aarch64_neon_fmulx:
6720 case Intrinsic::aarch64_neon_pmul:
6721 case Intrinsic::aarch64_neon_pmull:
6722 case Intrinsic::aarch64_neon_smull:
6723 case Intrinsic::aarch64_neon_pmull64:
6724 case Intrinsic::aarch64_neon_umull: {
6725 handleNEONVectorMultiplyIntrinsic(I);
6726 break;
6727 }
6728
6729 default:
6730 return false;
6731 }
6732
6733 return true;
6734 }
6735
6736 void visitIntrinsicInst(IntrinsicInst &I) {
6737 if (maybeHandleCrossPlatformIntrinsic(I))
6738 return;
6739
6740 if (maybeHandleX86SIMDIntrinsic(I))
6741 return;
6742
6743 if (maybeHandleArmSIMDIntrinsic(I))
6744 return;
6745
6746 if (maybeHandleUnknownIntrinsic(I))
6747 return;
6748
6749 visitInstruction(I);
6750 }
6751
6752 void visitLibAtomicLoad(CallBase &CB) {
6753 // Since we use getNextNode here, we can't have CB terminate the BB.
6754 assert(isa<CallInst>(CB));
6755
6756 IRBuilder<> IRB(&CB);
6757 Value *Size = CB.getArgOperand(0);
6758 Value *SrcPtr = CB.getArgOperand(1);
6759 Value *DstPtr = CB.getArgOperand(2);
6760 Value *Ordering = CB.getArgOperand(3);
6761 // Convert the call to have at least Acquire ordering to make sure
6762 // the shadow operations aren't reordered before it.
6763 Value *NewOrdering =
6764 IRB.CreateExtractElement(makeAddAcquireOrderingTable(IRB), Ordering);
6765 CB.setArgOperand(3, NewOrdering);
6766
6767 NextNodeIRBuilder NextIRB(&CB);
6768 Value *SrcShadowPtr, *SrcOriginPtr;
6769 std::tie(SrcShadowPtr, SrcOriginPtr) =
6770 getShadowOriginPtr(SrcPtr, NextIRB, NextIRB.getInt8Ty(), Align(1),
6771 /*isStore*/ false);
6772 Value *DstShadowPtr =
6773 getShadowOriginPtr(DstPtr, NextIRB, NextIRB.getInt8Ty(), Align(1),
6774 /*isStore*/ true)
6775 .first;
6776
6777 NextIRB.CreateMemCpy(DstShadowPtr, Align(1), SrcShadowPtr, Align(1), Size);
6778 if (MS.TrackOrigins) {
6779 Value *SrcOrigin = NextIRB.CreateAlignedLoad(MS.OriginTy, SrcOriginPtr,
6781 Value *NewOrigin = updateOrigin(SrcOrigin, NextIRB);
6782 NextIRB.CreateCall(MS.MsanSetOriginFn, {DstPtr, Size, NewOrigin});
6783 }
6784 }
6785
6786 void visitLibAtomicStore(CallBase &CB) {
6787 IRBuilder<> IRB(&CB);
6788 Value *Size = CB.getArgOperand(0);
6789 Value *DstPtr = CB.getArgOperand(2);
6790 Value *Ordering = CB.getArgOperand(3);
6791 // Convert the call to have at least Release ordering to make sure
6792 // the shadow operations aren't reordered after it.
6793 Value *NewOrdering =
6794 IRB.CreateExtractElement(makeAddReleaseOrderingTable(IRB), Ordering);
6795 CB.setArgOperand(3, NewOrdering);
6796
6797 Value *DstShadowPtr =
6798 getShadowOriginPtr(DstPtr, IRB, IRB.getInt8Ty(), Align(1),
6799 /*isStore*/ true)
6800 .first;
6801
6802 // Atomic store always paints clean shadow/origin. See file header.
6803 IRB.CreateMemSet(DstShadowPtr, getCleanShadow(IRB.getInt8Ty()), Size,
6804 Align(1));
6805 }
6806
6807 void visitCallBase(CallBase &CB) {
6808 assert(!CB.getMetadata(LLVMContext::MD_nosanitize));
6809 if (CB.isInlineAsm()) {
6810 // For inline asm (either a call to asm function, or callbr instruction),
6811 // do the usual thing: check argument shadow and mark all outputs as
6812 // clean. Note that any side effects of the inline asm that are not
6813 // immediately visible in its constraints are not handled.
6815 visitAsmInstruction(CB);
6816 else
6817 visitInstruction(CB);
6818 return;
6819 }
6820 LibFunc LF;
6821 if (TLI->getLibFunc(CB, LF)) {
6822 // libatomic.a functions need to have special handling because there isn't
6823 // a good way to intercept them or compile the library with
6824 // instrumentation.
6825 switch (LF) {
6826 case LibFunc_atomic_load:
6827 if (!isa<CallInst>(CB)) {
6828 llvm::errs() << "MSAN -- cannot instrument invoke of libatomic load."
6829 "Ignoring!\n";
6830 break;
6831 }
6832 visitLibAtomicLoad(CB);
6833 return;
6834 case LibFunc_atomic_store:
6835 visitLibAtomicStore(CB);
6836 return;
6837 default:
6838 break;
6839 }
6840 }
6841
6842 if (auto *Call = dyn_cast<CallInst>(&CB)) {
6843 assert(!isa<IntrinsicInst>(Call) && "intrinsics are handled elsewhere");
6844
6845 // We are going to insert code that relies on the fact that the callee
6846 // will become a non-readonly function after it is instrumented by us. To
6847 // prevent this code from being optimized out, mark that function
6848 // non-readonly in advance.
6849 // TODO: We can likely do better than dropping memory() completely here.
6850 AttributeMask B;
6851 B.addAttribute(Attribute::Memory).addAttribute(Attribute::Speculatable);
6852
6854 if (Function *Func = Call->getCalledFunction()) {
6855 Func->removeFnAttrs(B);
6856 }
6857
6859 }
6860 IRBuilder<> IRB(&CB);
6861 bool MayCheckCall = MS.EagerChecks;
6862 if (Function *Func = CB.getCalledFunction()) {
6863 // __sanitizer_unaligned_{load,store} functions may be called by users
6864 // and always expects shadows in the TLS. So don't check them.
6865 MayCheckCall &= !Func->getName().starts_with("__sanitizer_unaligned_");
6866 }
6867
6868 unsigned ArgOffset = 0;
6869 LLVM_DEBUG(dbgs() << " CallSite: " << CB << "\n");
6870 for (const auto &[i, A] : llvm::enumerate(CB.args())) {
6871 if (!A->getType()->isSized()) {
6872 LLVM_DEBUG(dbgs() << "Arg " << i << " is not sized: " << CB << "\n");
6873 continue;
6874 }
6875
6876 if (A->getType()->isScalableTy()) {
6877 LLVM_DEBUG(dbgs() << "Arg " << i << " is vscale: " << CB << "\n");
6878 // Handle as noundef, but don't reserve tls slots.
6879 insertCheckShadowOf(A, &CB);
6880 continue;
6881 }
6882
6883 unsigned Size = 0;
6884 const DataLayout &DL = F.getDataLayout();
6885
6886 bool ByVal = CB.paramHasAttr(i, Attribute::ByVal);
6887 bool NoUndef = CB.paramHasAttr(i, Attribute::NoUndef);
6888 bool EagerCheck = MayCheckCall && !ByVal && NoUndef;
6889
6890 if (EagerCheck) {
6891 insertCheckShadowOf(A, &CB);
6892 Size = DL.getTypeAllocSize(A->getType());
6893 } else {
6894 [[maybe_unused]] Value *Store = nullptr;
6895 // Compute the Shadow for arg even if it is ByVal, because
6896 // in that case getShadow() will copy the actual arg shadow to
6897 // __msan_param_tls.
6898 Value *ArgShadow = getShadow(A);
6899 Value *ArgShadowBase = getShadowPtrForArgument(IRB, ArgOffset);
6900 LLVM_DEBUG(dbgs() << " Arg#" << i << ": " << *A
6901 << " Shadow: " << *ArgShadow << "\n");
6902 if (ByVal) {
6903 // ByVal requires some special handling as it's too big for a single
6904 // load
6905 assert(A->getType()->isPointerTy() &&
6906 "ByVal argument is not a pointer!");
6907 Size = DL.getTypeAllocSize(CB.getParamByValType(i));
6908 if (ArgOffset + Size > kParamTLSSize)
6909 break;
6910 const MaybeAlign ParamAlignment(CB.getParamAlign(i));
6911 MaybeAlign Alignment = std::nullopt;
6912 if (ParamAlignment)
6913 Alignment = std::min(*ParamAlignment, kShadowTLSAlignment);
6914 Value *AShadowPtr, *AOriginPtr;
6915 std::tie(AShadowPtr, AOriginPtr) =
6916 getShadowOriginPtr(A, IRB, IRB.getInt8Ty(), Alignment,
6917 /*isStore*/ false);
6918 if (!PropagateShadow) {
6919 Store = IRB.CreateMemSet(ArgShadowBase,
6921 Size, Alignment);
6922 } else {
6923 Store = IRB.CreateMemCpy(ArgShadowBase, Alignment, AShadowPtr,
6924 Alignment, Size);
6925 if (MS.TrackOrigins) {
6926 Value *ArgOriginBase = getOriginPtrForArgument(IRB, ArgOffset);
6927 // FIXME: OriginSize should be:
6928 // alignTo(A % kMinOriginAlignment + Size, kMinOriginAlignment)
6929 unsigned OriginSize = alignTo(Size, kMinOriginAlignment);
6930 IRB.CreateMemCpy(
6931 ArgOriginBase,
6932 /* by origin_tls[ArgOffset] */ kMinOriginAlignment,
6933 AOriginPtr,
6934 /* by getShadowOriginPtr */ kMinOriginAlignment, OriginSize);
6935 }
6936 }
6937 } else {
6938 // Any other parameters mean we need bit-grained tracking of uninit
6939 // data
6940 Size = DL.getTypeAllocSize(A->getType());
6941 if (ArgOffset + Size > kParamTLSSize)
6942 break;
6943 Store = IRB.CreateAlignedStore(ArgShadow, ArgShadowBase,
6945 Constant *Cst = dyn_cast<Constant>(ArgShadow);
6946 if (MS.TrackOrigins && !(Cst && Cst->isNullValue())) {
6947 IRB.CreateStore(getOrigin(A),
6948 getOriginPtrForArgument(IRB, ArgOffset));
6949 }
6950 }
6951 assert(Store != nullptr);
6952 LLVM_DEBUG(dbgs() << " Param:" << *Store << "\n");
6953 }
6954 assert(Size != 0);
6955 ArgOffset += alignTo(Size, kShadowTLSAlignment);
6956 }
6957 LLVM_DEBUG(dbgs() << " done with call args\n");
6958
6959 FunctionType *FT = CB.getFunctionType();
6960 if (FT->isVarArg()) {
6961 VAHelper->visitCallBase(CB, IRB);
6962 }
6963
6964 // Now, get the shadow for the RetVal.
6965 if (!CB.getType()->isSized())
6966 return;
6967 // Don't emit the epilogue for musttail call returns.
6968 if (isa<CallInst>(CB) && cast<CallInst>(CB).isMustTailCall())
6969 return;
6970
6971 if (MayCheckCall && CB.hasRetAttr(Attribute::NoUndef)) {
6972 setShadow(&CB, getCleanShadow(&CB));
6973 setOrigin(&CB, getCleanOrigin());
6974 return;
6975 }
6976
6977 IRBuilder<> IRBBefore(&CB);
6978 // Until we have full dynamic coverage, make sure the retval shadow is 0.
6979 Value *Base = getShadowPtrForRetval(IRBBefore);
6980 IRBBefore.CreateAlignedStore(getCleanShadow(&CB), Base,
6982 BasicBlock::iterator NextInsn;
6983 if (isa<CallInst>(CB)) {
6984 NextInsn = ++CB.getIterator();
6985 assert(NextInsn != CB.getParent()->end());
6986 } else {
6987 BasicBlock *NormalDest = cast<InvokeInst>(CB).getNormalDest();
6988 if (!NormalDest->getSinglePredecessor()) {
6989 // FIXME: this case is tricky, so we are just conservative here.
6990 // Perhaps we need to split the edge between this BB and NormalDest,
6991 // but a naive attempt to use SplitEdge leads to a crash.
6992 setShadow(&CB, getCleanShadow(&CB));
6993 setOrigin(&CB, getCleanOrigin());
6994 return;
6995 }
6996 // FIXME: NextInsn is likely in a basic block that has not been visited
6997 // yet. Anything inserted there will be instrumented by MSan later!
6998 NextInsn = NormalDest->getFirstInsertionPt();
6999 assert(NextInsn != NormalDest->end() &&
7000 "Could not find insertion point for retval shadow load");
7001 }
7002 IRBuilder<> IRBAfter(&*NextInsn);
7003 Value *RetvalShadow = IRBAfter.CreateAlignedLoad(
7004 getShadowTy(&CB), getShadowPtrForRetval(IRBAfter), kShadowTLSAlignment,
7005 "_msret");
7006 setShadow(&CB, RetvalShadow);
7007 if (MS.TrackOrigins)
7008 setOrigin(&CB, IRBAfter.CreateLoad(MS.OriginTy, getOriginPtrForRetval()));
7009 }
7010
7011 bool isAMustTailRetVal(Value *RetVal) {
7012 if (auto *I = dyn_cast<BitCastInst>(RetVal)) {
7013 RetVal = I->getOperand(0);
7014 }
7015 if (auto *I = dyn_cast<CallInst>(RetVal)) {
7016 return I->isMustTailCall();
7017 }
7018 return false;
7019 }
7020
7021 void visitReturnInst(ReturnInst &I) {
7022 IRBuilder<> IRB(&I);
7023 Value *RetVal = I.getReturnValue();
7024 if (!RetVal)
7025 return;
7026 // Don't emit the epilogue for musttail call returns.
7027 if (isAMustTailRetVal(RetVal))
7028 return;
7029 Value *ShadowPtr = getShadowPtrForRetval(IRB);
7030 bool HasNoUndef = F.hasRetAttribute(Attribute::NoUndef);
7031 bool StoreShadow = !(MS.EagerChecks && HasNoUndef);
7032 // FIXME: Consider using SpecialCaseList to specify a list of functions that
7033 // must always return fully initialized values. For now, we hardcode "main".
7034 bool EagerCheck = (MS.EagerChecks && HasNoUndef) || (F.getName() == "main");
7035
7036 Value *Shadow = getShadow(RetVal);
7037 bool StoreOrigin = true;
7038 if (EagerCheck) {
7039 insertCheckShadowOf(RetVal, &I);
7040 Shadow = getCleanShadow(RetVal);
7041 StoreOrigin = false;
7042 }
7043
7044 // The caller may still expect information passed over TLS if we pass our
7045 // check
7046 if (StoreShadow) {
7047 IRB.CreateAlignedStore(Shadow, ShadowPtr, kShadowTLSAlignment);
7048 if (MS.TrackOrigins && StoreOrigin)
7049 IRB.CreateStore(getOrigin(RetVal), getOriginPtrForRetval());
7050 }
7051 }
7052
7053 void visitPHINode(PHINode &I) {
7054 IRBuilder<> IRB(&I);
7055 if (!PropagateShadow) {
7056 setShadow(&I, getCleanShadow(&I));
7057 setOrigin(&I, getCleanOrigin());
7058 return;
7059 }
7060
7061 ShadowPHINodes.push_back(&I);
7062 setShadow(&I, IRB.CreatePHI(getShadowTy(&I), I.getNumIncomingValues(),
7063 "_msphi_s"));
7064 if (MS.TrackOrigins)
7065 setOrigin(
7066 &I, IRB.CreatePHI(MS.OriginTy, I.getNumIncomingValues(), "_msphi_o"));
7067 }
7068
7069 Value *getLocalVarIdptr(AllocaInst &I) {
7070 ConstantInt *IntConst =
7071 ConstantInt::get(Type::getInt32Ty((*F.getParent()).getContext()), 0);
7072 return new GlobalVariable(*F.getParent(), IntConst->getType(),
7073 /*isConstant=*/false, GlobalValue::PrivateLinkage,
7074 IntConst);
7075 }
7076
7077 Value *getLocalVarDescription(AllocaInst &I) {
7078 return createPrivateConstGlobalForString(*F.getParent(), I.getName());
7079 }
7080
7081 void poisonAllocaUserspace(AllocaInst &I, IRBuilder<> &IRB, Value *Len) {
7082 if (PoisonStack && ClPoisonStackWithCall) {
7083 IRB.CreateCall(MS.MsanPoisonStackFn, {&I, Len});
7084 } else {
7085 Value *ShadowBase, *OriginBase;
7086 std::tie(ShadowBase, OriginBase) = getShadowOriginPtr(
7087 &I, IRB, IRB.getInt8Ty(), Align(1), /*isStore*/ true);
7088
7089 Value *PoisonValue = IRB.getInt8(PoisonStack ? ClPoisonStackPattern : 0);
7090 IRB.CreateMemSet(ShadowBase, PoisonValue, Len, I.getAlign());
7091 }
7092
7093 if (PoisonStack && MS.TrackOrigins) {
7094 Value *Idptr = getLocalVarIdptr(I);
7095 if (ClPrintStackNames) {
7096 Value *Descr = getLocalVarDescription(I);
7097 IRB.CreateCall(MS.MsanSetAllocaOriginWithDescriptionFn,
7098 {&I, Len, Idptr, Descr});
7099 } else {
7100 IRB.CreateCall(MS.MsanSetAllocaOriginNoDescriptionFn, {&I, Len, Idptr});
7101 }
7102 }
7103 }
7104
7105 void poisonAllocaKmsan(AllocaInst &I, IRBuilder<> &IRB, Value *Len) {
7106 Value *Descr = getLocalVarDescription(I);
7107 if (PoisonStack) {
7108 IRB.CreateCall(MS.MsanPoisonAllocaFn, {&I, Len, Descr});
7109 } else {
7110 IRB.CreateCall(MS.MsanUnpoisonAllocaFn, {&I, Len});
7111 }
7112 }
7113
7114 void instrumentAlloca(AllocaInst &I, Instruction *InsPoint = nullptr) {
7115 if (!InsPoint)
7116 InsPoint = &I;
7117 NextNodeIRBuilder IRB(InsPoint);
7118 const DataLayout &DL = F.getDataLayout();
7119 TypeSize TS = DL.getTypeAllocSize(I.getAllocatedType());
7120 Value *Len = IRB.CreateTypeSize(MS.IntptrTy, TS);
7121 if (I.isArrayAllocation())
7122 Len = IRB.CreateMul(Len,
7123 IRB.CreateZExtOrTrunc(I.getArraySize(), MS.IntptrTy));
7124
7125 if (MS.CompileKernel)
7126 poisonAllocaKmsan(I, IRB, Len);
7127 else
7128 poisonAllocaUserspace(I, IRB, Len);
7129 }
7130
7131 void visitAllocaInst(AllocaInst &I) {
7132 setShadow(&I, getCleanShadow(&I));
7133 setOrigin(&I, getCleanOrigin());
7134 // We'll get to this alloca later unless it's poisoned at the corresponding
7135 // llvm.lifetime.start.
7136 AllocaSet.insert(&I);
7137 }
7138
7139 void visitSelectInst(SelectInst &I) {
7140 // a = select b, c, d
7141 Value *B = I.getCondition();
7142 Value *C = I.getTrueValue();
7143 Value *D = I.getFalseValue();
7144
7145 handleSelectLikeInst(I, B, C, D);
7146 }
7147
7148 void handleSelectLikeInst(Instruction &I, Value *B, Value *C, Value *D) {
7149 IRBuilder<> IRB(&I);
7150
7151 Value *Sb = getShadow(B);
7152 Value *Sc = getShadow(C);
7153 Value *Sd = getShadow(D);
7154
7155 Value *Ob = MS.TrackOrigins ? getOrigin(B) : nullptr;
7156 Value *Oc = MS.TrackOrigins ? getOrigin(C) : nullptr;
7157 Value *Od = MS.TrackOrigins ? getOrigin(D) : nullptr;
7158
7159 // Result shadow if condition shadow is 0.
7160 Value *Sa0 = IRB.CreateSelect(B, Sc, Sd);
7161 Value *Sa1;
7162 if (I.getType()->isAggregateType()) {
7163 // To avoid "sign extending" i1 to an arbitrary aggregate type, we just do
7164 // an extra "select". This results in much more compact IR.
7165 // Sa = select Sb, poisoned, (select b, Sc, Sd)
7166 Sa1 = getPoisonedShadow(getShadowTy(I.getType()));
7167 } else if (isScalableNonVectorType(I.getType())) {
7168 // This is intended to handle target("aarch64.svcount"), which can't be
7169 // handled in the else branch because of incompatibility with CreateXor
7170 // ("The supported LLVM operations on this type are limited to load,
7171 // store, phi, select and alloca instructions").
7172
7173 // TODO: this currently underapproximates. Use Arm SVE EOR in the else
7174 // branch as needed instead.
7175 Sa1 = getCleanShadow(getShadowTy(I.getType()));
7176 } else {
7177 // Sa = select Sb, [ (c^d) | Sc | Sd ], [ b ? Sc : Sd ]
7178 // If Sb (condition is poisoned), look for bits in c and d that are equal
7179 // and both unpoisoned.
7180 // If !Sb (condition is unpoisoned), simply pick one of Sc and Sd.
7181
7182 // Cast arguments to shadow-compatible type.
7183 C = CreateAppToShadowCast(IRB, C);
7184 D = CreateAppToShadowCast(IRB, D);
7185
7186 // Result shadow if condition shadow is 1.
7187 Sa1 = IRB.CreateOr({IRB.CreateXor(C, D), Sc, Sd});
7188 }
7189 Value *Sa = IRB.CreateSelect(Sb, Sa1, Sa0, "_msprop_select");
7190 setShadow(&I, Sa);
7191 if (MS.TrackOrigins) {
7192 // Origins are always i32, so any vector conditions must be flattened.
7193 // FIXME: consider tracking vector origins for app vectors?
7194 if (B->getType()->isVectorTy()) {
7195 B = convertToBool(B, IRB);
7196 Sb = convertToBool(Sb, IRB);
7197 }
7198 // a = select b, c, d
7199 // Oa = Sb ? Ob : (b ? Oc : Od)
7200 setOrigin(&I, IRB.CreateSelect(Sb, Ob, IRB.CreateSelect(B, Oc, Od)));
7201 }
7202 }
7203
7204 void visitLandingPadInst(LandingPadInst &I) {
7205 // Do nothing.
7206 // See https://github.com/google/sanitizers/issues/504
7207 setShadow(&I, getCleanShadow(&I));
7208 setOrigin(&I, getCleanOrigin());
7209 }
7210
7211 void visitCatchSwitchInst(CatchSwitchInst &I) {
7212 setShadow(&I, getCleanShadow(&I));
7213 setOrigin(&I, getCleanOrigin());
7214 }
7215
7216 void visitFuncletPadInst(FuncletPadInst &I) {
7217 setShadow(&I, getCleanShadow(&I));
7218 setOrigin(&I, getCleanOrigin());
7219 }
7220
7221 void visitGetElementPtrInst(GetElementPtrInst &I) { handleShadowOr(I); }
7222
7223 void visitExtractValueInst(ExtractValueInst &I) {
7224 IRBuilder<> IRB(&I);
7225 Value *Agg = I.getAggregateOperand();
7226 LLVM_DEBUG(dbgs() << "ExtractValue: " << I << "\n");
7227 Value *AggShadow = getShadow(Agg);
7228 LLVM_DEBUG(dbgs() << " AggShadow: " << *AggShadow << "\n");
7229 Value *ResShadow = IRB.CreateExtractValue(AggShadow, I.getIndices());
7230 LLVM_DEBUG(dbgs() << " ResShadow: " << *ResShadow << "\n");
7231 setShadow(&I, ResShadow);
7232 setOriginForNaryOp(I);
7233 }
7234
7235 void visitInsertValueInst(InsertValueInst &I) {
7236 IRBuilder<> IRB(&I);
7237 LLVM_DEBUG(dbgs() << "InsertValue: " << I << "\n");
7238 Value *AggShadow = getShadow(I.getAggregateOperand());
7239 Value *InsShadow = getShadow(I.getInsertedValueOperand());
7240 LLVM_DEBUG(dbgs() << " AggShadow: " << *AggShadow << "\n");
7241 LLVM_DEBUG(dbgs() << " InsShadow: " << *InsShadow << "\n");
7242 Value *Res = IRB.CreateInsertValue(AggShadow, InsShadow, I.getIndices());
7243 LLVM_DEBUG(dbgs() << " Res: " << *Res << "\n");
7244 setShadow(&I, Res);
7245 setOriginForNaryOp(I);
7246 }
7247
7248 void dumpInst(Instruction &I) {
7249 if (CallInst *CI = dyn_cast<CallInst>(&I)) {
7250 errs() << "ZZZ call " << CI->getCalledFunction()->getName() << "\n";
7251 } else {
7252 errs() << "ZZZ " << I.getOpcodeName() << "\n";
7253 }
7254 errs() << "QQQ " << I << "\n";
7255 }
7256
7257 void visitResumeInst(ResumeInst &I) {
7258 LLVM_DEBUG(dbgs() << "Resume: " << I << "\n");
7259 // Nothing to do here.
7260 }
7261
7262 void visitCleanupReturnInst(CleanupReturnInst &CRI) {
7263 LLVM_DEBUG(dbgs() << "CleanupReturn: " << CRI << "\n");
7264 // Nothing to do here.
7265 }
7266
7267 void visitCatchReturnInst(CatchReturnInst &CRI) {
7268 LLVM_DEBUG(dbgs() << "CatchReturn: " << CRI << "\n");
7269 // Nothing to do here.
7270 }
7271
7272 void instrumentAsmArgument(Value *Operand, Type *ElemTy, Instruction &I,
7273 IRBuilder<> &IRB, const DataLayout &DL,
7274 bool isOutput) {
7275 // For each assembly argument, we check its value for being initialized.
7276 // If the argument is a pointer, we assume it points to a single element
7277 // of the corresponding type (or to a 8-byte word, if the type is unsized).
7278 // Each such pointer is instrumented with a call to the runtime library.
7279 Type *OpType = Operand->getType();
7280 // Check the operand value itself.
7281 insertCheckShadowOf(Operand, &I);
7282 if (!OpType->isPointerTy() || !isOutput) {
7283 assert(!isOutput);
7284 return;
7285 }
7286 if (!ElemTy->isSized())
7287 return;
7288 auto Size = DL.getTypeStoreSize(ElemTy);
7289 Value *SizeVal = IRB.CreateTypeSize(MS.IntptrTy, Size);
7290 if (MS.CompileKernel) {
7291 IRB.CreateCall(MS.MsanInstrumentAsmStoreFn, {Operand, SizeVal});
7292 } else {
7293 // ElemTy, derived from elementtype(), does not encode the alignment of
7294 // the pointer. Conservatively assume that the shadow memory is unaligned.
7295 // When Size is large, avoid StoreInst as it would expand to many
7296 // instructions.
7297 auto [ShadowPtr, _] =
7298 getShadowOriginPtrUserspace(Operand, IRB, IRB.getInt8Ty(), Align(1));
7299 if (Size <= 32)
7300 IRB.CreateAlignedStore(getCleanShadow(ElemTy), ShadowPtr, Align(1));
7301 else
7302 IRB.CreateMemSet(ShadowPtr, ConstantInt::getNullValue(IRB.getInt8Ty()),
7303 SizeVal, Align(1));
7304 }
7305 }
7306
7307 /// Get the number of output arguments returned by pointers.
7308 int getNumOutputArgs(InlineAsm *IA, CallBase *CB) {
7309 int NumRetOutputs = 0;
7310 int NumOutputs = 0;
7311 Type *RetTy = cast<Value>(CB)->getType();
7312 if (!RetTy->isVoidTy()) {
7313 // Register outputs are returned via the CallInst return value.
7314 auto *ST = dyn_cast<StructType>(RetTy);
7315 if (ST)
7316 NumRetOutputs = ST->getNumElements();
7317 else
7318 NumRetOutputs = 1;
7319 }
7320 InlineAsm::ConstraintInfoVector Constraints = IA->ParseConstraints();
7321 for (const InlineAsm::ConstraintInfo &Info : Constraints) {
7322 switch (Info.Type) {
7324 NumOutputs++;
7325 break;
7326 default:
7327 break;
7328 }
7329 }
7330 return NumOutputs - NumRetOutputs;
7331 }
7332
7333 void visitAsmInstruction(Instruction &I) {
7334 // Conservative inline assembly handling: check for poisoned shadow of
7335 // asm() arguments, then unpoison the result and all the memory locations
7336 // pointed to by those arguments.
7337 // An inline asm() statement in C++ contains lists of input and output
7338 // arguments used by the assembly code. These are mapped to operands of the
7339 // CallInst as follows:
7340 // - nR register outputs ("=r) are returned by value in a single structure
7341 // (SSA value of the CallInst);
7342 // - nO other outputs ("=m" and others) are returned by pointer as first
7343 // nO operands of the CallInst;
7344 // - nI inputs ("r", "m" and others) are passed to CallInst as the
7345 // remaining nI operands.
7346 // The total number of asm() arguments in the source is nR+nO+nI, and the
7347 // corresponding CallInst has nO+nI+1 operands (the last operand is the
7348 // function to be called).
7349 const DataLayout &DL = F.getDataLayout();
7350 CallBase *CB = cast<CallBase>(&I);
7351 IRBuilder<> IRB(&I);
7352 InlineAsm *IA = cast<InlineAsm>(CB->getCalledOperand());
7353 int OutputArgs = getNumOutputArgs(IA, CB);
7354 // The last operand of a CallInst is the function itself.
7355 int NumOperands = CB->getNumOperands() - 1;
7356
7357 // Check input arguments. Doing so before unpoisoning output arguments, so
7358 // that we won't overwrite uninit values before checking them.
7359 for (int i = OutputArgs; i < NumOperands; i++) {
7360 Value *Operand = CB->getOperand(i);
7361 instrumentAsmArgument(Operand, CB->getParamElementType(i), I, IRB, DL,
7362 /*isOutput*/ false);
7363 }
7364 // Unpoison output arguments. This must happen before the actual InlineAsm
7365 // call, so that the shadow for memory published in the asm() statement
7366 // remains valid.
7367 for (int i = 0; i < OutputArgs; i++) {
7368 Value *Operand = CB->getOperand(i);
7369 instrumentAsmArgument(Operand, CB->getParamElementType(i), I, IRB, DL,
7370 /*isOutput*/ true);
7371 }
7372
7373 setShadow(&I, getCleanShadow(&I));
7374 setOrigin(&I, getCleanOrigin());
7375 }
7376
7377 void visitFreezeInst(FreezeInst &I) {
7378 // Freeze always returns a fully defined value.
7379 setShadow(&I, getCleanShadow(&I));
7380 setOrigin(&I, getCleanOrigin());
7381 }
7382
7383 void visitInstruction(Instruction &I) {
7384 // Everything else: stop propagating and check for poisoned shadow.
7386 dumpInst(I);
7387 LLVM_DEBUG(dbgs() << "DEFAULT: " << I << "\n");
7388 for (size_t i = 0, n = I.getNumOperands(); i < n; i++) {
7389 Value *Operand = I.getOperand(i);
7390 if (Operand->getType()->isSized())
7391 insertCheckShadowOf(Operand, &I);
7392 }
7393 setShadow(&I, getCleanShadow(&I));
7394 setOrigin(&I, getCleanOrigin());
7395 }
7396};
7397
7398struct VarArgHelperBase : public VarArgHelper {
7399 Function &F;
7400 MemorySanitizer &MS;
7401 MemorySanitizerVisitor &MSV;
7402 SmallVector<CallInst *, 16> VAStartInstrumentationList;
7403 const unsigned VAListTagSize;
7404
7405 VarArgHelperBase(Function &F, MemorySanitizer &MS,
7406 MemorySanitizerVisitor &MSV, unsigned VAListTagSize)
7407 : F(F), MS(MS), MSV(MSV), VAListTagSize(VAListTagSize) {}
7408
7409 Value *getShadowAddrForVAArgument(IRBuilder<> &IRB, unsigned ArgOffset) {
7410 Value *Base = IRB.CreatePointerCast(MS.VAArgTLS, MS.IntptrTy);
7411 return IRB.CreateAdd(Base, ConstantInt::get(MS.IntptrTy, ArgOffset));
7412 }
7413
7414 /// Compute the shadow address for a given va_arg.
7415 Value *getShadowPtrForVAArgument(IRBuilder<> &IRB, unsigned ArgOffset) {
7416 return IRB.CreatePtrAdd(
7417 MS.VAArgTLS, ConstantInt::get(MS.IntptrTy, ArgOffset), "_msarg_va_s");
7418 }
7419
7420 /// Compute the shadow address for a given va_arg.
7421 Value *getShadowPtrForVAArgument(IRBuilder<> &IRB, unsigned ArgOffset,
7422 unsigned ArgSize) {
7423 // Make sure we don't overflow __msan_va_arg_tls.
7424 if (ArgOffset + ArgSize > kParamTLSSize)
7425 return nullptr;
7426 return getShadowPtrForVAArgument(IRB, ArgOffset);
7427 }
7428
7429 /// Compute the origin address for a given va_arg.
7430 Value *getOriginPtrForVAArgument(IRBuilder<> &IRB, int ArgOffset) {
7431 // getOriginPtrForVAArgument() is always called after
7432 // getShadowPtrForVAArgument(), so __msan_va_arg_origin_tls can never
7433 // overflow.
7434 return IRB.CreatePtrAdd(MS.VAArgOriginTLS,
7435 ConstantInt::get(MS.IntptrTy, ArgOffset),
7436 "_msarg_va_o");
7437 }
7438
7439 void CleanUnusedTLS(IRBuilder<> &IRB, Value *ShadowBase,
7440 unsigned BaseOffset) {
7441 // The tails of __msan_va_arg_tls is not large enough to fit full
7442 // value shadow, but it will be copied to backup anyway. Make it
7443 // clean.
7444 if (BaseOffset >= kParamTLSSize)
7445 return;
7446 Value *TailSize =
7447 ConstantInt::getSigned(IRB.getInt32Ty(), kParamTLSSize - BaseOffset);
7448 IRB.CreateMemSet(ShadowBase, ConstantInt::getNullValue(IRB.getInt8Ty()),
7449 TailSize, Align(8));
7450 }
7451
7452 void unpoisonVAListTagForInst(IntrinsicInst &I) {
7453 IRBuilder<> IRB(&I);
7454 Value *VAListTag = I.getArgOperand(0);
7455 const Align Alignment = Align(8);
7456 auto [ShadowPtr, OriginPtr] = MSV.getShadowOriginPtr(
7457 VAListTag, IRB, IRB.getInt8Ty(), Alignment, /*isStore*/ true);
7458 // Unpoison the whole __va_list_tag.
7459 IRB.CreateMemSet(ShadowPtr, Constant::getNullValue(IRB.getInt8Ty()),
7460 VAListTagSize, Alignment, false);
7461 }
7462
7463 void visitVAStartInst(VAStartInst &I) override {
7464 if (F.getCallingConv() == CallingConv::Win64)
7465 return;
7466 VAStartInstrumentationList.push_back(&I);
7467 unpoisonVAListTagForInst(I);
7468 }
7469
7470 void visitVACopyInst(VACopyInst &I) override {
7471 if (F.getCallingConv() == CallingConv::Win64)
7472 return;
7473 unpoisonVAListTagForInst(I);
7474 }
7475};
7476
7477/// AMD64-specific implementation of VarArgHelper.
7478struct VarArgAMD64Helper : public VarArgHelperBase {
7479 // An unfortunate workaround for asymmetric lowering of va_arg stuff.
7480 // See a comment in visitCallBase for more details.
7481 static const unsigned AMD64GpEndOffset = 48; // AMD64 ABI Draft 0.99.6 p3.5.7
7482 static const unsigned AMD64FpEndOffsetSSE = 176;
7483 // If SSE is disabled, fp_offset in va_list is zero.
7484 static const unsigned AMD64FpEndOffsetNoSSE = AMD64GpEndOffset;
7485
7486 unsigned AMD64FpEndOffset;
7487 AllocaInst *VAArgTLSCopy = nullptr;
7488 AllocaInst *VAArgTLSOriginCopy = nullptr;
7489 Value *VAArgOverflowSize = nullptr;
7490
7491 enum ArgKind { AK_GeneralPurpose, AK_FloatingPoint, AK_Memory };
7492
7493 VarArgAMD64Helper(Function &F, MemorySanitizer &MS,
7494 MemorySanitizerVisitor &MSV)
7495 : VarArgHelperBase(F, MS, MSV, /*VAListTagSize=*/24) {
7496 AMD64FpEndOffset = AMD64FpEndOffsetSSE;
7497 for (const auto &Attr : F.getAttributes().getFnAttrs()) {
7498 if (Attr.isStringAttribute() &&
7499 (Attr.getKindAsString() == "target-features")) {
7500 if (Attr.getValueAsString().contains("-sse"))
7501 AMD64FpEndOffset = AMD64FpEndOffsetNoSSE;
7502 break;
7503 }
7504 }
7505 }
7506
7507 ArgKind classifyArgument(Value *arg) {
7508 // A very rough approximation of X86_64 argument classification rules.
7509 Type *T = arg->getType();
7510 if (T->isX86_FP80Ty())
7511 return AK_Memory;
7512 if (T->isFPOrFPVectorTy())
7513 return AK_FloatingPoint;
7514 if (T->isIntegerTy() && T->getPrimitiveSizeInBits() <= 64)
7515 return AK_GeneralPurpose;
7516 if (T->isPointerTy())
7517 return AK_GeneralPurpose;
7518 return AK_Memory;
7519 }
7520
7521 // For VarArg functions, store the argument shadow in an ABI-specific format
7522 // that corresponds to va_list layout.
7523 // We do this because Clang lowers va_arg in the frontend, and this pass
7524 // only sees the low level code that deals with va_list internals.
7525 // A much easier alternative (provided that Clang emits va_arg instructions)
7526 // would have been to associate each live instance of va_list with a copy of
7527 // MSanParamTLS, and extract shadow on va_arg() call in the argument list
7528 // order.
7529 void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
7530 unsigned GpOffset = 0;
7531 unsigned FpOffset = AMD64GpEndOffset;
7532 unsigned OverflowOffset = AMD64FpEndOffset;
7533 const DataLayout &DL = F.getDataLayout();
7534
7535 for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
7536 bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
7537 bool IsByVal = CB.paramHasAttr(ArgNo, Attribute::ByVal);
7538 if (IsByVal) {
7539 // ByVal arguments always go to the overflow area.
7540 // Fixed arguments passed through the overflow area will be stepped
7541 // over by va_start, so don't count them towards the offset.
7542 if (IsFixed)
7543 continue;
7544 assert(A->getType()->isPointerTy());
7545 Type *RealTy = CB.getParamByValType(ArgNo);
7546 uint64_t ArgSize = DL.getTypeAllocSize(RealTy);
7547 uint64_t AlignedSize = alignTo(ArgSize, 8);
7548 unsigned BaseOffset = OverflowOffset;
7549 Value *ShadowBase = getShadowPtrForVAArgument(IRB, OverflowOffset);
7550 Value *OriginBase = nullptr;
7551 if (MS.TrackOrigins)
7552 OriginBase = getOriginPtrForVAArgument(IRB, OverflowOffset);
7553 OverflowOffset += AlignedSize;
7554
7555 if (OverflowOffset > kParamTLSSize) {
7556 CleanUnusedTLS(IRB, ShadowBase, BaseOffset);
7557 continue; // We have no space to copy shadow there.
7558 }
7559
7560 Value *ShadowPtr, *OriginPtr;
7561 std::tie(ShadowPtr, OriginPtr) =
7562 MSV.getShadowOriginPtr(A, IRB, IRB.getInt8Ty(), kShadowTLSAlignment,
7563 /*isStore*/ false);
7564 IRB.CreateMemCpy(ShadowBase, kShadowTLSAlignment, ShadowPtr,
7565 kShadowTLSAlignment, ArgSize);
7566 if (MS.TrackOrigins)
7567 IRB.CreateMemCpy(OriginBase, kShadowTLSAlignment, OriginPtr,
7568 kShadowTLSAlignment, ArgSize);
7569 } else {
7570 ArgKind AK = classifyArgument(A);
7571 if (AK == AK_GeneralPurpose && GpOffset >= AMD64GpEndOffset)
7572 AK = AK_Memory;
7573 if (AK == AK_FloatingPoint && FpOffset >= AMD64FpEndOffset)
7574 AK = AK_Memory;
7575 Value *ShadowBase, *OriginBase = nullptr;
7576 switch (AK) {
7577 case AK_GeneralPurpose:
7578 ShadowBase = getShadowPtrForVAArgument(IRB, GpOffset);
7579 if (MS.TrackOrigins)
7580 OriginBase = getOriginPtrForVAArgument(IRB, GpOffset);
7581 GpOffset += 8;
7582 assert(GpOffset <= kParamTLSSize);
7583 break;
7584 case AK_FloatingPoint:
7585 ShadowBase = getShadowPtrForVAArgument(IRB, FpOffset);
7586 if (MS.TrackOrigins)
7587 OriginBase = getOriginPtrForVAArgument(IRB, FpOffset);
7588 FpOffset += 16;
7589 assert(FpOffset <= kParamTLSSize);
7590 break;
7591 case AK_Memory:
7592 if (IsFixed)
7593 continue;
7594 uint64_t ArgSize = DL.getTypeAllocSize(A->getType());
7595 uint64_t AlignedSize = alignTo(ArgSize, 8);
7596 unsigned BaseOffset = OverflowOffset;
7597 ShadowBase = getShadowPtrForVAArgument(IRB, OverflowOffset);
7598 if (MS.TrackOrigins) {
7599 OriginBase = getOriginPtrForVAArgument(IRB, OverflowOffset);
7600 }
7601 OverflowOffset += AlignedSize;
7602 if (OverflowOffset > kParamTLSSize) {
7603 // We have no space to copy shadow there.
7604 CleanUnusedTLS(IRB, ShadowBase, BaseOffset);
7605 continue;
7606 }
7607 }
7608 // Take fixed arguments into account for GpOffset and FpOffset,
7609 // but don't actually store shadows for them.
7610 // TODO(glider): don't call get*PtrForVAArgument() for them.
7611 if (IsFixed)
7612 continue;
7613 Value *Shadow = MSV.getShadow(A);
7614 IRB.CreateAlignedStore(Shadow, ShadowBase, kShadowTLSAlignment);
7615 if (MS.TrackOrigins) {
7616 Value *Origin = MSV.getOrigin(A);
7617 TypeSize StoreSize = DL.getTypeStoreSize(Shadow->getType());
7618 MSV.paintOrigin(IRB, Origin, OriginBase, StoreSize,
7620 }
7621 }
7622 }
7623 Constant *OverflowSize =
7624 ConstantInt::get(IRB.getInt64Ty(), OverflowOffset - AMD64FpEndOffset);
7625 IRB.CreateStore(OverflowSize, MS.VAArgOverflowSizeTLS);
7626 }
7627
7628 void finalizeInstrumentation() override {
7629 assert(!VAArgOverflowSize && !VAArgTLSCopy &&
7630 "finalizeInstrumentation called twice");
7631 if (!VAStartInstrumentationList.empty()) {
7632 // If there is a va_start in this function, make a backup copy of
7633 // va_arg_tls somewhere in the function entry block.
7634 IRBuilder<> IRB(MSV.FnPrologueEnd);
7635 VAArgOverflowSize =
7636 IRB.CreateLoad(IRB.getInt64Ty(), MS.VAArgOverflowSizeTLS);
7637 Value *CopySize = IRB.CreateAdd(
7638 ConstantInt::get(MS.IntptrTy, AMD64FpEndOffset), VAArgOverflowSize);
7639 VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
7640 VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
7641 IRB.CreateMemSet(VAArgTLSCopy, Constant::getNullValue(IRB.getInt8Ty()),
7642 CopySize, kShadowTLSAlignment, false);
7643
7644 Value *SrcSize = IRB.CreateBinaryIntrinsic(
7645 Intrinsic::umin, CopySize,
7646 ConstantInt::get(MS.IntptrTy, kParamTLSSize));
7647 IRB.CreateMemCpy(VAArgTLSCopy, kShadowTLSAlignment, MS.VAArgTLS,
7648 kShadowTLSAlignment, SrcSize);
7649 if (MS.TrackOrigins) {
7650 VAArgTLSOriginCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
7651 VAArgTLSOriginCopy->setAlignment(kShadowTLSAlignment);
7652 IRB.CreateMemCpy(VAArgTLSOriginCopy, kShadowTLSAlignment,
7653 MS.VAArgOriginTLS, kShadowTLSAlignment, SrcSize);
7654 }
7655 }
7656
7657 // Instrument va_start.
7658 // Copy va_list shadow from the backup copy of the TLS contents.
7659 for (CallInst *OrigInst : VAStartInstrumentationList) {
7660 NextNodeIRBuilder IRB(OrigInst);
7661 Value *VAListTag = OrigInst->getArgOperand(0);
7662
7663 Value *RegSaveAreaPtrPtr =
7664 IRB.CreatePtrAdd(VAListTag, ConstantInt::get(MS.IntptrTy, 16));
7665 Value *RegSaveAreaPtr = IRB.CreateLoad(MS.PtrTy, RegSaveAreaPtrPtr);
7666 Value *RegSaveAreaShadowPtr, *RegSaveAreaOriginPtr;
7667 const Align Alignment = Align(16);
7668 std::tie(RegSaveAreaShadowPtr, RegSaveAreaOriginPtr) =
7669 MSV.getShadowOriginPtr(RegSaveAreaPtr, IRB, IRB.getInt8Ty(),
7670 Alignment, /*isStore*/ true);
7671 IRB.CreateMemCpy(RegSaveAreaShadowPtr, Alignment, VAArgTLSCopy, Alignment,
7672 AMD64FpEndOffset);
7673 if (MS.TrackOrigins)
7674 IRB.CreateMemCpy(RegSaveAreaOriginPtr, Alignment, VAArgTLSOriginCopy,
7675 Alignment, AMD64FpEndOffset);
7676 Value *OverflowArgAreaPtrPtr =
7677 IRB.CreatePtrAdd(VAListTag, ConstantInt::get(MS.IntptrTy, 8));
7678 Value *OverflowArgAreaPtr =
7679 IRB.CreateLoad(MS.PtrTy, OverflowArgAreaPtrPtr);
7680 Value *OverflowArgAreaShadowPtr, *OverflowArgAreaOriginPtr;
7681 std::tie(OverflowArgAreaShadowPtr, OverflowArgAreaOriginPtr) =
7682 MSV.getShadowOriginPtr(OverflowArgAreaPtr, IRB, IRB.getInt8Ty(),
7683 Alignment, /*isStore*/ true);
7684 Value *SrcPtr = IRB.CreateConstGEP1_32(IRB.getInt8Ty(), VAArgTLSCopy,
7685 AMD64FpEndOffset);
7686 IRB.CreateMemCpy(OverflowArgAreaShadowPtr, Alignment, SrcPtr, Alignment,
7687 VAArgOverflowSize);
7688 if (MS.TrackOrigins) {
7689 SrcPtr = IRB.CreateConstGEP1_32(IRB.getInt8Ty(), VAArgTLSOriginCopy,
7690 AMD64FpEndOffset);
7691 IRB.CreateMemCpy(OverflowArgAreaOriginPtr, Alignment, SrcPtr, Alignment,
7692 VAArgOverflowSize);
7693 }
7694 }
7695 }
7696};
7697
7698/// AArch64-specific implementation of VarArgHelper.
7699struct VarArgAArch64Helper : public VarArgHelperBase {
7700 static const unsigned kAArch64GrArgSize = 64;
7701 static const unsigned kAArch64VrArgSize = 128;
7702
7703 static const unsigned AArch64GrBegOffset = 0;
7704 static const unsigned AArch64GrEndOffset = kAArch64GrArgSize;
7705 // Make VR space aligned to 16 bytes.
7706 static const unsigned AArch64VrBegOffset = AArch64GrEndOffset;
7707 static const unsigned AArch64VrEndOffset =
7708 AArch64VrBegOffset + kAArch64VrArgSize;
7709 static const unsigned AArch64VAEndOffset = AArch64VrEndOffset;
7710
7711 AllocaInst *VAArgTLSCopy = nullptr;
7712 Value *VAArgOverflowSize = nullptr;
7713
7714 enum ArgKind { AK_GeneralPurpose, AK_FloatingPoint, AK_Memory };
7715
7716 VarArgAArch64Helper(Function &F, MemorySanitizer &MS,
7717 MemorySanitizerVisitor &MSV)
7718 : VarArgHelperBase(F, MS, MSV, /*VAListTagSize=*/32) {}
7719
7720 // A very rough approximation of aarch64 argument classification rules.
7721 std::pair<ArgKind, uint64_t> classifyArgument(Type *T) {
7722 if (T->isIntOrPtrTy() && T->getPrimitiveSizeInBits() <= 64)
7723 return {AK_GeneralPurpose, 1};
7724 if (T->isFloatingPointTy() && T->getPrimitiveSizeInBits() <= 128)
7725 return {AK_FloatingPoint, 1};
7726
7727 if (T->isArrayTy()) {
7728 auto R = classifyArgument(T->getArrayElementType());
7729 R.second *= T->getScalarType()->getArrayNumElements();
7730 return R;
7731 }
7732
7733 if (const FixedVectorType *FV = dyn_cast<FixedVectorType>(T)) {
7734 auto R = classifyArgument(FV->getScalarType());
7735 R.second *= FV->getNumElements();
7736 return R;
7737 }
7738
7739 LLVM_DEBUG(errs() << "Unknown vararg type: " << *T << "\n");
7740 return {AK_Memory, 0};
7741 }
7742
7743 // The instrumentation stores the argument shadow in a non ABI-specific
7744 // format because it does not know which argument is named (since Clang,
7745 // like x86_64 case, lowers the va_args in the frontend and this pass only
7746 // sees the low level code that deals with va_list internals).
7747 // The first seven GR registers are saved in the first 56 bytes of the
7748 // va_arg tls arra, followed by the first 8 FP/SIMD registers, and then
7749 // the remaining arguments.
7750 // Using constant offset within the va_arg TLS array allows fast copy
7751 // in the finalize instrumentation.
7752 void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
7753 unsigned GrOffset = AArch64GrBegOffset;
7754 unsigned VrOffset = AArch64VrBegOffset;
7755 unsigned OverflowOffset = AArch64VAEndOffset;
7756
7757 const DataLayout &DL = F.getDataLayout();
7758 for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
7759 bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
7760 auto [AK, RegNum] = classifyArgument(A->getType());
7761 if (AK == AK_GeneralPurpose &&
7762 (GrOffset + RegNum * 8) > AArch64GrEndOffset)
7763 AK = AK_Memory;
7764 if (AK == AK_FloatingPoint &&
7765 (VrOffset + RegNum * 16) > AArch64VrEndOffset)
7766 AK = AK_Memory;
7767 Value *Base;
7768 switch (AK) {
7769 case AK_GeneralPurpose:
7770 Base = getShadowPtrForVAArgument(IRB, GrOffset);
7771 GrOffset += 8 * RegNum;
7772 break;
7773 case AK_FloatingPoint:
7774 Base = getShadowPtrForVAArgument(IRB, VrOffset);
7775 VrOffset += 16 * RegNum;
7776 break;
7777 case AK_Memory:
7778 // Don't count fixed arguments in the overflow area - va_start will
7779 // skip right over them.
7780 if (IsFixed)
7781 continue;
7782 uint64_t ArgSize = DL.getTypeAllocSize(A->getType());
7783 uint64_t AlignedSize = alignTo(ArgSize, 8);
7784 unsigned BaseOffset = OverflowOffset;
7785 Base = getShadowPtrForVAArgument(IRB, BaseOffset);
7786 OverflowOffset += AlignedSize;
7787 if (OverflowOffset > kParamTLSSize) {
7788 // We have no space to copy shadow there.
7789 CleanUnusedTLS(IRB, Base, BaseOffset);
7790 continue;
7791 }
7792 break;
7793 }
7794 // Count Gp/Vr fixed arguments to their respective offsets, but don't
7795 // bother to actually store a shadow.
7796 if (IsFixed)
7797 continue;
7798 IRB.CreateAlignedStore(MSV.getShadow(A), Base, kShadowTLSAlignment);
7799 }
7800 Constant *OverflowSize =
7801 ConstantInt::get(IRB.getInt64Ty(), OverflowOffset - AArch64VAEndOffset);
7802 IRB.CreateStore(OverflowSize, MS.VAArgOverflowSizeTLS);
7803 }
7804
7805 // Retrieve a va_list field of 'void*' size.
7806 Value *getVAField64(IRBuilder<> &IRB, Value *VAListTag, int offset) {
7807 Value *SaveAreaPtrPtr =
7808 IRB.CreatePtrAdd(VAListTag, ConstantInt::get(MS.IntptrTy, offset));
7809 return IRB.CreateLoad(Type::getInt64Ty(*MS.C), SaveAreaPtrPtr);
7810 }
7811
7812 // Retrieve a va_list field of 'int' size.
7813 Value *getVAField32(IRBuilder<> &IRB, Value *VAListTag, int offset) {
7814 Value *SaveAreaPtr =
7815 IRB.CreatePtrAdd(VAListTag, ConstantInt::get(MS.IntptrTy, offset));
7816 Value *SaveArea32 = IRB.CreateLoad(IRB.getInt32Ty(), SaveAreaPtr);
7817 return IRB.CreateSExt(SaveArea32, MS.IntptrTy);
7818 }
7819
7820 void finalizeInstrumentation() override {
7821 assert(!VAArgOverflowSize && !VAArgTLSCopy &&
7822 "finalizeInstrumentation called twice");
7823 if (!VAStartInstrumentationList.empty()) {
7824 // If there is a va_start in this function, make a backup copy of
7825 // va_arg_tls somewhere in the function entry block.
7826 IRBuilder<> IRB(MSV.FnPrologueEnd);
7827 VAArgOverflowSize =
7828 IRB.CreateLoad(IRB.getInt64Ty(), MS.VAArgOverflowSizeTLS);
7829 Value *CopySize = IRB.CreateAdd(
7830 ConstantInt::get(MS.IntptrTy, AArch64VAEndOffset), VAArgOverflowSize);
7831 VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
7832 VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
7833 IRB.CreateMemSet(VAArgTLSCopy, Constant::getNullValue(IRB.getInt8Ty()),
7834 CopySize, kShadowTLSAlignment, false);
7835
7836 Value *SrcSize = IRB.CreateBinaryIntrinsic(
7837 Intrinsic::umin, CopySize,
7838 ConstantInt::get(MS.IntptrTy, kParamTLSSize));
7839 IRB.CreateMemCpy(VAArgTLSCopy, kShadowTLSAlignment, MS.VAArgTLS,
7840 kShadowTLSAlignment, SrcSize);
7841 }
7842
7843 Value *GrArgSize = ConstantInt::get(MS.IntptrTy, kAArch64GrArgSize);
7844 Value *VrArgSize = ConstantInt::get(MS.IntptrTy, kAArch64VrArgSize);
7845
7846 // Instrument va_start, copy va_list shadow from the backup copy of
7847 // the TLS contents.
7848 for (CallInst *OrigInst : VAStartInstrumentationList) {
7849 NextNodeIRBuilder IRB(OrigInst);
7850
7851 Value *VAListTag = OrigInst->getArgOperand(0);
7852
7853 // The variadic ABI for AArch64 creates two areas to save the incoming
7854 // argument registers (one for 64-bit general register xn-x7 and another
7855 // for 128-bit FP/SIMD vn-v7).
7856 // We need then to propagate the shadow arguments on both regions
7857 // 'va::__gr_top + va::__gr_offs' and 'va::__vr_top + va::__vr_offs'.
7858 // The remaining arguments are saved on shadow for 'va::stack'.
7859 // One caveat is it requires only to propagate the non-named arguments,
7860 // however on the call site instrumentation 'all' the arguments are
7861 // saved. So to copy the shadow values from the va_arg TLS array
7862 // we need to adjust the offset for both GR and VR fields based on
7863 // the __{gr,vr}_offs value (since they are stores based on incoming
7864 // named arguments).
7865 Type *RegSaveAreaPtrTy = IRB.getPtrTy();
7866
7867 // Read the stack pointer from the va_list.
7868 Value *StackSaveAreaPtr =
7869 IRB.CreateIntToPtr(getVAField64(IRB, VAListTag, 0), RegSaveAreaPtrTy);
7870
7871 // Read both the __gr_top and __gr_off and add them up.
7872 Value *GrTopSaveAreaPtr = getVAField64(IRB, VAListTag, 8);
7873 Value *GrOffSaveArea = getVAField32(IRB, VAListTag, 24);
7874
7875 Value *GrRegSaveAreaPtr = IRB.CreateIntToPtr(
7876 IRB.CreateAdd(GrTopSaveAreaPtr, GrOffSaveArea), RegSaveAreaPtrTy);
7877
7878 // Read both the __vr_top and __vr_off and add them up.
7879 Value *VrTopSaveAreaPtr = getVAField64(IRB, VAListTag, 16);
7880 Value *VrOffSaveArea = getVAField32(IRB, VAListTag, 28);
7881
7882 Value *VrRegSaveAreaPtr = IRB.CreateIntToPtr(
7883 IRB.CreateAdd(VrTopSaveAreaPtr, VrOffSaveArea), RegSaveAreaPtrTy);
7884
7885 // It does not know how many named arguments is being used and, on the
7886 // callsite all the arguments were saved. Since __gr_off is defined as
7887 // '0 - ((8 - named_gr) * 8)', the idea is to just propagate the variadic
7888 // argument by ignoring the bytes of shadow from named arguments.
7889 Value *GrRegSaveAreaShadowPtrOff =
7890 IRB.CreateAdd(GrArgSize, GrOffSaveArea);
7891
7892 Value *GrRegSaveAreaShadowPtr =
7893 MSV.getShadowOriginPtr(GrRegSaveAreaPtr, IRB, IRB.getInt8Ty(),
7894 Align(8), /*isStore*/ true)
7895 .first;
7896
7897 Value *GrSrcPtr =
7898 IRB.CreateInBoundsPtrAdd(VAArgTLSCopy, GrRegSaveAreaShadowPtrOff);
7899 Value *GrCopySize = IRB.CreateSub(GrArgSize, GrRegSaveAreaShadowPtrOff);
7900
7901 IRB.CreateMemCpy(GrRegSaveAreaShadowPtr, Align(8), GrSrcPtr, Align(8),
7902 GrCopySize);
7903
7904 // Again, but for FP/SIMD values.
7905 Value *VrRegSaveAreaShadowPtrOff =
7906 IRB.CreateAdd(VrArgSize, VrOffSaveArea);
7907
7908 Value *VrRegSaveAreaShadowPtr =
7909 MSV.getShadowOriginPtr(VrRegSaveAreaPtr, IRB, IRB.getInt8Ty(),
7910 Align(8), /*isStore*/ true)
7911 .first;
7912
7913 Value *VrSrcPtr = IRB.CreateInBoundsPtrAdd(
7914 IRB.CreateInBoundsPtrAdd(VAArgTLSCopy,
7915 IRB.getInt32(AArch64VrBegOffset)),
7916 VrRegSaveAreaShadowPtrOff);
7917 Value *VrCopySize = IRB.CreateSub(VrArgSize, VrRegSaveAreaShadowPtrOff);
7918
7919 IRB.CreateMemCpy(VrRegSaveAreaShadowPtr, Align(8), VrSrcPtr, Align(8),
7920 VrCopySize);
7921
7922 // And finally for remaining arguments.
7923 Value *StackSaveAreaShadowPtr =
7924 MSV.getShadowOriginPtr(StackSaveAreaPtr, IRB, IRB.getInt8Ty(),
7925 Align(16), /*isStore*/ true)
7926 .first;
7927
7928 Value *StackSrcPtr = IRB.CreateInBoundsPtrAdd(
7929 VAArgTLSCopy, IRB.getInt32(AArch64VAEndOffset));
7930
7931 IRB.CreateMemCpy(StackSaveAreaShadowPtr, Align(16), StackSrcPtr,
7932 Align(16), VAArgOverflowSize);
7933 }
7934 }
7935};
7936
7937/// PowerPC64-specific implementation of VarArgHelper.
7938struct VarArgPowerPC64Helper : public VarArgHelperBase {
7939 AllocaInst *VAArgTLSCopy = nullptr;
7940 Value *VAArgSize = nullptr;
7941
7942 VarArgPowerPC64Helper(Function &F, MemorySanitizer &MS,
7943 MemorySanitizerVisitor &MSV)
7944 : VarArgHelperBase(F, MS, MSV, /*VAListTagSize=*/8) {}
7945
7946 void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
7947 // For PowerPC, we need to deal with alignment of stack arguments -
7948 // they are mostly aligned to 8 bytes, but vectors and i128 arrays
7949 // are aligned to 16 bytes, byvals can be aligned to 8 or 16 bytes,
7950 // For that reason, we compute current offset from stack pointer (which is
7951 // always properly aligned), and offset for the first vararg, then subtract
7952 // them.
7953 unsigned VAArgBase;
7954 Triple TargetTriple(F.getParent()->getTargetTriple());
7955 // Parameter save area starts at 48 bytes from frame pointer for ABIv1,
7956 // and 32 bytes for ABIv2. This is usually determined by target
7957 // endianness, but in theory could be overridden by function attribute.
7958 if (TargetTriple.isPPC64ELFv2ABI())
7959 VAArgBase = 32;
7960 else
7961 VAArgBase = 48;
7962 unsigned VAArgOffset = VAArgBase;
7963 const DataLayout &DL = F.getDataLayout();
7964 for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
7965 bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
7966 bool IsByVal = CB.paramHasAttr(ArgNo, Attribute::ByVal);
7967 if (IsByVal) {
7968 assert(A->getType()->isPointerTy());
7969 Type *RealTy = CB.getParamByValType(ArgNo);
7970 uint64_t ArgSize = DL.getTypeAllocSize(RealTy);
7971 Align ArgAlign = CB.getParamAlign(ArgNo).value_or(Align(8));
7972 if (ArgAlign < 8)
7973 ArgAlign = Align(8);
7974 VAArgOffset = alignTo(VAArgOffset, ArgAlign);
7975 if (!IsFixed) {
7976 Value *Base =
7977 getShadowPtrForVAArgument(IRB, VAArgOffset - VAArgBase, ArgSize);
7978 if (Base) {
7979 Value *AShadowPtr, *AOriginPtr;
7980 std::tie(AShadowPtr, AOriginPtr) =
7981 MSV.getShadowOriginPtr(A, IRB, IRB.getInt8Ty(),
7982 kShadowTLSAlignment, /*isStore*/ false);
7983
7984 IRB.CreateMemCpy(Base, kShadowTLSAlignment, AShadowPtr,
7985 kShadowTLSAlignment, ArgSize);
7986 }
7987 }
7988 VAArgOffset += alignTo(ArgSize, Align(8));
7989 } else {
7990 Value *Base;
7991 uint64_t ArgSize = DL.getTypeAllocSize(A->getType());
7992 Align ArgAlign = Align(8);
7993 if (A->getType()->isArrayTy()) {
7994 // Arrays are aligned to element size, except for long double
7995 // arrays, which are aligned to 8 bytes.
7996 Type *ElementTy = A->getType()->getArrayElementType();
7997 if (!ElementTy->isPPC_FP128Ty())
7998 ArgAlign = Align(DL.getTypeAllocSize(ElementTy));
7999 } else if (A->getType()->isVectorTy()) {
8000 // Vectors are naturally aligned.
8001 ArgAlign = Align(ArgSize);
8002 }
8003 if (ArgAlign < 8)
8004 ArgAlign = Align(8);
8005 VAArgOffset = alignTo(VAArgOffset, ArgAlign);
8006 if (DL.isBigEndian()) {
8007 // Adjusting the shadow for argument with size < 8 to match the
8008 // placement of bits in big endian system
8009 if (ArgSize < 8)
8010 VAArgOffset += (8 - ArgSize);
8011 }
8012 if (!IsFixed) {
8013 Base =
8014 getShadowPtrForVAArgument(IRB, VAArgOffset - VAArgBase, ArgSize);
8015 if (Base)
8016 IRB.CreateAlignedStore(MSV.getShadow(A), Base, kShadowTLSAlignment);
8017 }
8018 VAArgOffset += ArgSize;
8019 VAArgOffset = alignTo(VAArgOffset, Align(8));
8020 }
8021 if (IsFixed)
8022 VAArgBase = VAArgOffset;
8023 }
8024
8025 Constant *TotalVAArgSize =
8026 ConstantInt::get(MS.IntptrTy, VAArgOffset - VAArgBase);
8027 // Here using VAArgOverflowSizeTLS as VAArgSizeTLS to avoid creation of
8028 // a new class member i.e. it is the total size of all VarArgs.
8029 IRB.CreateStore(TotalVAArgSize, MS.VAArgOverflowSizeTLS);
8030 }
8031
8032 void finalizeInstrumentation() override {
8033 assert(!VAArgSize && !VAArgTLSCopy &&
8034 "finalizeInstrumentation called twice");
8035 IRBuilder<> IRB(MSV.FnPrologueEnd);
8036 VAArgSize = IRB.CreateLoad(IRB.getInt64Ty(), MS.VAArgOverflowSizeTLS);
8037 Value *CopySize = VAArgSize;
8038
8039 if (!VAStartInstrumentationList.empty()) {
8040 // If there is a va_start in this function, make a backup copy of
8041 // va_arg_tls somewhere in the function entry block.
8042
8043 VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
8044 VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
8045 IRB.CreateMemSet(VAArgTLSCopy, Constant::getNullValue(IRB.getInt8Ty()),
8046 CopySize, kShadowTLSAlignment, false);
8047
8048 Value *SrcSize = IRB.CreateBinaryIntrinsic(
8049 Intrinsic::umin, CopySize,
8050 ConstantInt::get(IRB.getInt64Ty(), kParamTLSSize));
8051 IRB.CreateMemCpy(VAArgTLSCopy, kShadowTLSAlignment, MS.VAArgTLS,
8052 kShadowTLSAlignment, SrcSize);
8053 }
8054
8055 // Instrument va_start.
8056 // Copy va_list shadow from the backup copy of the TLS contents.
8057 for (CallInst *OrigInst : VAStartInstrumentationList) {
8058 NextNodeIRBuilder IRB(OrigInst);
8059 Value *VAListTag = OrigInst->getArgOperand(0);
8060 Value *RegSaveAreaPtrPtr = IRB.CreatePtrToInt(VAListTag, MS.IntptrTy);
8061
8062 RegSaveAreaPtrPtr = IRB.CreateIntToPtr(RegSaveAreaPtrPtr, MS.PtrTy);
8063
8064 Value *RegSaveAreaPtr = IRB.CreateLoad(MS.PtrTy, RegSaveAreaPtrPtr);
8065 Value *RegSaveAreaShadowPtr, *RegSaveAreaOriginPtr;
8066 const DataLayout &DL = F.getDataLayout();
8067 unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
8068 const Align Alignment = Align(IntptrSize);
8069 std::tie(RegSaveAreaShadowPtr, RegSaveAreaOriginPtr) =
8070 MSV.getShadowOriginPtr(RegSaveAreaPtr, IRB, IRB.getInt8Ty(),
8071 Alignment, /*isStore*/ true);
8072 IRB.CreateMemCpy(RegSaveAreaShadowPtr, Alignment, VAArgTLSCopy, Alignment,
8073 CopySize);
8074 }
8075 }
8076};
8077
8078/// PowerPC32-specific implementation of VarArgHelper.
8079struct VarArgPowerPC32Helper : public VarArgHelperBase {
8080 AllocaInst *VAArgTLSCopy = nullptr;
8081 Value *VAArgSize = nullptr;
8082
8083 VarArgPowerPC32Helper(Function &F, MemorySanitizer &MS,
8084 MemorySanitizerVisitor &MSV)
8085 : VarArgHelperBase(F, MS, MSV, /*VAListTagSize=*/12) {}
8086
8087 void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
8088 unsigned VAArgBase;
8089 // Parameter save area is 8 bytes from frame pointer in PPC32
8090 VAArgBase = 8;
8091 unsigned VAArgOffset = VAArgBase;
8092 const DataLayout &DL = F.getDataLayout();
8093 unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
8094 for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
8095 bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
8096 bool IsByVal = CB.paramHasAttr(ArgNo, Attribute::ByVal);
8097 if (IsByVal) {
8098 assert(A->getType()->isPointerTy());
8099 Type *RealTy = CB.getParamByValType(ArgNo);
8100 uint64_t ArgSize = DL.getTypeAllocSize(RealTy);
8101 Align ArgAlign = CB.getParamAlign(ArgNo).value_or(Align(IntptrSize));
8102 if (ArgAlign < IntptrSize)
8103 ArgAlign = Align(IntptrSize);
8104 VAArgOffset = alignTo(VAArgOffset, ArgAlign);
8105 if (!IsFixed) {
8106 Value *Base =
8107 getShadowPtrForVAArgument(IRB, VAArgOffset - VAArgBase, ArgSize);
8108 if (Base) {
8109 Value *AShadowPtr, *AOriginPtr;
8110 std::tie(AShadowPtr, AOriginPtr) =
8111 MSV.getShadowOriginPtr(A, IRB, IRB.getInt8Ty(),
8112 kShadowTLSAlignment, /*isStore*/ false);
8113
8114 IRB.CreateMemCpy(Base, kShadowTLSAlignment, AShadowPtr,
8115 kShadowTLSAlignment, ArgSize);
8116 }
8117 }
8118 VAArgOffset += alignTo(ArgSize, Align(IntptrSize));
8119 } else {
8120 Value *Base;
8121 Type *ArgTy = A->getType();
8122
8123 // On PPC 32 floating point variable arguments are stored in separate
8124 // area: fp_save_area = reg_save_area + 4*8. We do not copy shaodow for
8125 // them as they will be found when checking call arguments.
8126 if (!ArgTy->isFloatingPointTy()) {
8127 uint64_t ArgSize = DL.getTypeAllocSize(ArgTy);
8128 Align ArgAlign = Align(IntptrSize);
8129 if (ArgTy->isArrayTy()) {
8130 // Arrays are aligned to element size, except for long double
8131 // arrays, which are aligned to 8 bytes.
8132 Type *ElementTy = ArgTy->getArrayElementType();
8133 if (!ElementTy->isPPC_FP128Ty())
8134 ArgAlign = Align(DL.getTypeAllocSize(ElementTy));
8135 } else if (ArgTy->isVectorTy()) {
8136 // Vectors are naturally aligned.
8137 ArgAlign = Align(ArgSize);
8138 }
8139 if (ArgAlign < IntptrSize)
8140 ArgAlign = Align(IntptrSize);
8141 VAArgOffset = alignTo(VAArgOffset, ArgAlign);
8142 if (DL.isBigEndian()) {
8143 // Adjusting the shadow for argument with size < IntptrSize to match
8144 // the placement of bits in big endian system
8145 if (ArgSize < IntptrSize)
8146 VAArgOffset += (IntptrSize - ArgSize);
8147 }
8148 if (!IsFixed) {
8149 Base = getShadowPtrForVAArgument(IRB, VAArgOffset - VAArgBase,
8150 ArgSize);
8151 if (Base)
8152 IRB.CreateAlignedStore(MSV.getShadow(A), Base,
8154 }
8155 VAArgOffset += ArgSize;
8156 VAArgOffset = alignTo(VAArgOffset, Align(IntptrSize));
8157 }
8158 }
8159 }
8160
8161 Constant *TotalVAArgSize =
8162 ConstantInt::get(MS.IntptrTy, VAArgOffset - VAArgBase);
8163 // Here using VAArgOverflowSizeTLS as VAArgSizeTLS to avoid creation of
8164 // a new class member i.e. it is the total size of all VarArgs.
8165 IRB.CreateStore(TotalVAArgSize, MS.VAArgOverflowSizeTLS);
8166 }
8167
8168 void finalizeInstrumentation() override {
8169 assert(!VAArgSize && !VAArgTLSCopy &&
8170 "finalizeInstrumentation called twice");
8171 IRBuilder<> IRB(MSV.FnPrologueEnd);
8172 VAArgSize = IRB.CreateLoad(MS.IntptrTy, MS.VAArgOverflowSizeTLS);
8173 Value *CopySize = VAArgSize;
8174
8175 if (!VAStartInstrumentationList.empty()) {
8176 // If there is a va_start in this function, make a backup copy of
8177 // va_arg_tls somewhere in the function entry block.
8178
8179 VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
8180 VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
8181 IRB.CreateMemSet(VAArgTLSCopy, Constant::getNullValue(IRB.getInt8Ty()),
8182 CopySize, kShadowTLSAlignment, false);
8183
8184 Value *SrcSize = IRB.CreateBinaryIntrinsic(
8185 Intrinsic::umin, CopySize,
8186 ConstantInt::get(MS.IntptrTy, kParamTLSSize));
8187 IRB.CreateMemCpy(VAArgTLSCopy, kShadowTLSAlignment, MS.VAArgTLS,
8188 kShadowTLSAlignment, SrcSize);
8189 }
8190
8191 // Instrument va_start.
8192 // Copy va_list shadow from the backup copy of the TLS contents.
8193 for (CallInst *OrigInst : VAStartInstrumentationList) {
8194 NextNodeIRBuilder IRB(OrigInst);
8195 Value *VAListTag = OrigInst->getArgOperand(0);
8196 Value *RegSaveAreaPtrPtr = IRB.CreatePtrToInt(VAListTag, MS.IntptrTy);
8197 Value *RegSaveAreaSize = CopySize;
8198
8199 // In PPC32 va_list_tag is a struct
8200 RegSaveAreaPtrPtr =
8201 IRB.CreateAdd(RegSaveAreaPtrPtr, ConstantInt::get(MS.IntptrTy, 8));
8202
8203 // On PPC 32 reg_save_area can only hold 32 bytes of data
8204 RegSaveAreaSize = IRB.CreateBinaryIntrinsic(
8205 Intrinsic::umin, CopySize, ConstantInt::get(MS.IntptrTy, 32));
8206
8207 RegSaveAreaPtrPtr = IRB.CreateIntToPtr(RegSaveAreaPtrPtr, MS.PtrTy);
8208 Value *RegSaveAreaPtr = IRB.CreateLoad(MS.PtrTy, RegSaveAreaPtrPtr);
8209
8210 const DataLayout &DL = F.getDataLayout();
8211 unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
8212 const Align Alignment = Align(IntptrSize);
8213
8214 { // Copy reg save area
8215 Value *RegSaveAreaShadowPtr, *RegSaveAreaOriginPtr;
8216 std::tie(RegSaveAreaShadowPtr, RegSaveAreaOriginPtr) =
8217 MSV.getShadowOriginPtr(RegSaveAreaPtr, IRB, IRB.getInt8Ty(),
8218 Alignment, /*isStore*/ true);
8219 IRB.CreateMemCpy(RegSaveAreaShadowPtr, Alignment, VAArgTLSCopy,
8220 Alignment, RegSaveAreaSize);
8221
8222 RegSaveAreaShadowPtr =
8223 IRB.CreatePtrToInt(RegSaveAreaShadowPtr, MS.IntptrTy);
8224 Value *FPSaveArea = IRB.CreateAdd(RegSaveAreaShadowPtr,
8225 ConstantInt::get(MS.IntptrTy, 32));
8226 FPSaveArea = IRB.CreateIntToPtr(FPSaveArea, MS.PtrTy);
8227 // We fill fp shadow with zeroes as uninitialized fp args should have
8228 // been found during call base check
8229 IRB.CreateMemSet(FPSaveArea, ConstantInt::getNullValue(IRB.getInt8Ty()),
8230 ConstantInt::get(MS.IntptrTy, 32), Alignment);
8231 }
8232
8233 { // Copy overflow area
8234 // RegSaveAreaSize is min(CopySize, 32) -> no overflow can occur
8235 Value *OverflowAreaSize = IRB.CreateSub(CopySize, RegSaveAreaSize);
8236
8237 Value *OverflowAreaPtrPtr = IRB.CreatePtrToInt(VAListTag, MS.IntptrTy);
8238 OverflowAreaPtrPtr =
8239 IRB.CreateAdd(OverflowAreaPtrPtr, ConstantInt::get(MS.IntptrTy, 4));
8240 OverflowAreaPtrPtr = IRB.CreateIntToPtr(OverflowAreaPtrPtr, MS.PtrTy);
8241
8242 Value *OverflowAreaPtr = IRB.CreateLoad(MS.PtrTy, OverflowAreaPtrPtr);
8243
8244 Value *OverflowAreaShadowPtr, *OverflowAreaOriginPtr;
8245 std::tie(OverflowAreaShadowPtr, OverflowAreaOriginPtr) =
8246 MSV.getShadowOriginPtr(OverflowAreaPtr, IRB, IRB.getInt8Ty(),
8247 Alignment, /*isStore*/ true);
8248
8249 Value *OverflowVAArgTLSCopyPtr =
8250 IRB.CreatePtrToInt(VAArgTLSCopy, MS.IntptrTy);
8251 OverflowVAArgTLSCopyPtr =
8252 IRB.CreateAdd(OverflowVAArgTLSCopyPtr, RegSaveAreaSize);
8253
8254 OverflowVAArgTLSCopyPtr =
8255 IRB.CreateIntToPtr(OverflowVAArgTLSCopyPtr, MS.PtrTy);
8256 IRB.CreateMemCpy(OverflowAreaShadowPtr, Alignment,
8257 OverflowVAArgTLSCopyPtr, Alignment, OverflowAreaSize);
8258 }
8259 }
8260 }
8261};
8262
8263/// SystemZ-specific implementation of VarArgHelper.
8264struct VarArgSystemZHelper : public VarArgHelperBase {
8265 static const unsigned SystemZGpOffset = 16;
8266 static const unsigned SystemZGpEndOffset = 56;
8267 static const unsigned SystemZFpOffset = 128;
8268 static const unsigned SystemZFpEndOffset = 160;
8269 static const unsigned SystemZMaxVrArgs = 8;
8270 static const unsigned SystemZRegSaveAreaSize = 160;
8271 static const unsigned SystemZOverflowOffset = 160;
8272 static const unsigned SystemZVAListTagSize = 32;
8273 static const unsigned SystemZOverflowArgAreaPtrOffset = 16;
8274 static const unsigned SystemZRegSaveAreaPtrOffset = 24;
8275
8276 bool IsSoftFloatABI;
8277 AllocaInst *VAArgTLSCopy = nullptr;
8278 AllocaInst *VAArgTLSOriginCopy = nullptr;
8279 Value *VAArgOverflowSize = nullptr;
8280
8281 enum class ArgKind {
8282 GeneralPurpose,
8283 FloatingPoint,
8284 Vector,
8285 Memory,
8286 Indirect,
8287 };
8288
8289 enum class ShadowExtension { None, Zero, Sign };
8290
8291 VarArgSystemZHelper(Function &F, MemorySanitizer &MS,
8292 MemorySanitizerVisitor &MSV)
8293 : VarArgHelperBase(F, MS, MSV, SystemZVAListTagSize),
8294 IsSoftFloatABI(F.getFnAttribute("use-soft-float").getValueAsBool()) {}
8295
8296 ArgKind classifyArgument(Type *T) {
8297 // T is a SystemZABIInfo::classifyArgumentType() output, and there are
8298 // only a few possibilities of what it can be. In particular, enums, single
8299 // element structs and large types have already been taken care of.
8300
8301 // Some i128 and fp128 arguments are converted to pointers only in the
8302 // back end.
8303 if (T->isIntegerTy(128) || T->isFP128Ty())
8304 return ArgKind::Indirect;
8305 if (T->isFloatingPointTy())
8306 return IsSoftFloatABI ? ArgKind::GeneralPurpose : ArgKind::FloatingPoint;
8307 if (T->isIntegerTy() || T->isPointerTy())
8308 return ArgKind::GeneralPurpose;
8309 if (T->isVectorTy())
8310 return ArgKind::Vector;
8311 return ArgKind::Memory;
8312 }
8313
8314 ShadowExtension getShadowExtension(const CallBase &CB, unsigned ArgNo) {
8315 // ABI says: "One of the simple integer types no more than 64 bits wide.
8316 // ... If such an argument is shorter than 64 bits, replace it by a full
8317 // 64-bit integer representing the same number, using sign or zero
8318 // extension". Shadow for an integer argument has the same type as the
8319 // argument itself, so it can be sign or zero extended as well.
8320 bool ZExt = CB.paramHasAttr(ArgNo, Attribute::ZExt);
8321 bool SExt = CB.paramHasAttr(ArgNo, Attribute::SExt);
8322 if (ZExt) {
8323 assert(!SExt);
8324 return ShadowExtension::Zero;
8325 }
8326 if (SExt) {
8327 assert(!ZExt);
8328 return ShadowExtension::Sign;
8329 }
8330 return ShadowExtension::None;
8331 }
8332
8333 void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
8334 unsigned GpOffset = SystemZGpOffset;
8335 unsigned FpOffset = SystemZFpOffset;
8336 unsigned VrIndex = 0;
8337 unsigned OverflowOffset = SystemZOverflowOffset;
8338 const DataLayout &DL = F.getDataLayout();
8339 for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
8340 bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
8341 // SystemZABIInfo does not produce ByVal parameters.
8342 assert(!CB.paramHasAttr(ArgNo, Attribute::ByVal));
8343 Type *T = A->getType();
8344 ArgKind AK = classifyArgument(T);
8345 if (AK == ArgKind::Indirect) {
8346 T = MS.PtrTy;
8347 AK = ArgKind::GeneralPurpose;
8348 }
8349 if (AK == ArgKind::GeneralPurpose && GpOffset >= SystemZGpEndOffset)
8350 AK = ArgKind::Memory;
8351 if (AK == ArgKind::FloatingPoint && FpOffset >= SystemZFpEndOffset)
8352 AK = ArgKind::Memory;
8353 if (AK == ArgKind::Vector && (VrIndex >= SystemZMaxVrArgs || !IsFixed))
8354 AK = ArgKind::Memory;
8355 Value *ShadowBase = nullptr;
8356 Value *OriginBase = nullptr;
8357 ShadowExtension SE = ShadowExtension::None;
8358 switch (AK) {
8359 case ArgKind::GeneralPurpose: {
8360 // Always keep track of GpOffset, but store shadow only for varargs.
8361 uint64_t ArgSize = 8;
8362 if (GpOffset + ArgSize <= kParamTLSSize) {
8363 if (!IsFixed) {
8364 SE = getShadowExtension(CB, ArgNo);
8365 uint64_t GapSize = 0;
8366 if (SE == ShadowExtension::None) {
8367 uint64_t ArgAllocSize = DL.getTypeAllocSize(T);
8368 assert(ArgAllocSize <= ArgSize);
8369 GapSize = ArgSize - ArgAllocSize;
8370 }
8371 ShadowBase = getShadowAddrForVAArgument(IRB, GpOffset + GapSize);
8372 if (MS.TrackOrigins)
8373 OriginBase = getOriginPtrForVAArgument(IRB, GpOffset + GapSize);
8374 }
8375 GpOffset += ArgSize;
8376 } else {
8377 GpOffset = kParamTLSSize;
8378 }
8379 break;
8380 }
8381 case ArgKind::FloatingPoint: {
8382 // Always keep track of FpOffset, but store shadow only for varargs.
8383 uint64_t ArgSize = 8;
8384 if (FpOffset + ArgSize <= kParamTLSSize) {
8385 if (!IsFixed) {
8386 // PoP says: "A short floating-point datum requires only the
8387 // left-most 32 bit positions of a floating-point register".
8388 // Therefore, in contrast to AK_GeneralPurpose and AK_Memory,
8389 // don't extend shadow and don't mind the gap.
8390 ShadowBase = getShadowAddrForVAArgument(IRB, FpOffset);
8391 if (MS.TrackOrigins)
8392 OriginBase = getOriginPtrForVAArgument(IRB, FpOffset);
8393 }
8394 FpOffset += ArgSize;
8395 } else {
8396 FpOffset = kParamTLSSize;
8397 }
8398 break;
8399 }
8400 case ArgKind::Vector: {
8401 // Keep track of VrIndex. No need to store shadow, since vector varargs
8402 // go through AK_Memory.
8403 assert(IsFixed);
8404 VrIndex++;
8405 break;
8406 }
8407 case ArgKind::Memory: {
8408 // Keep track of OverflowOffset and store shadow only for varargs.
8409 // Ignore fixed args, since we need to copy only the vararg portion of
8410 // the overflow area shadow.
8411 if (!IsFixed) {
8412 uint64_t ArgAllocSize = DL.getTypeAllocSize(T);
8413 uint64_t ArgSize = alignTo(ArgAllocSize, 8);
8414 if (OverflowOffset + ArgSize <= kParamTLSSize) {
8415 SE = getShadowExtension(CB, ArgNo);
8416 uint64_t GapSize =
8417 SE == ShadowExtension::None ? ArgSize - ArgAllocSize : 0;
8418 ShadowBase =
8419 getShadowAddrForVAArgument(IRB, OverflowOffset + GapSize);
8420 if (MS.TrackOrigins)
8421 OriginBase =
8422 getOriginPtrForVAArgument(IRB, OverflowOffset + GapSize);
8423 OverflowOffset += ArgSize;
8424 } else {
8425 OverflowOffset = kParamTLSSize;
8426 }
8427 }
8428 break;
8429 }
8430 case ArgKind::Indirect:
8431 llvm_unreachable("Indirect must be converted to GeneralPurpose");
8432 }
8433 if (ShadowBase == nullptr)
8434 continue;
8435 Value *Shadow = MSV.getShadow(A);
8436 if (SE != ShadowExtension::None)
8437 Shadow = MSV.CreateShadowCast(IRB, Shadow, IRB.getInt64Ty(),
8438 /*Signed*/ SE == ShadowExtension::Sign);
8439 ShadowBase = IRB.CreateIntToPtr(ShadowBase, MS.PtrTy, "_msarg_va_s");
8440 IRB.CreateStore(Shadow, ShadowBase);
8441 if (MS.TrackOrigins) {
8442 Value *Origin = MSV.getOrigin(A);
8443 TypeSize StoreSize = DL.getTypeStoreSize(Shadow->getType());
8444 MSV.paintOrigin(IRB, Origin, OriginBase, StoreSize,
8446 }
8447 }
8448 Constant *OverflowSize = ConstantInt::get(
8449 IRB.getInt64Ty(), OverflowOffset - SystemZOverflowOffset);
8450 IRB.CreateStore(OverflowSize, MS.VAArgOverflowSizeTLS);
8451 }
8452
8453 void copyRegSaveArea(IRBuilder<> &IRB, Value *VAListTag) {
8454 Value *RegSaveAreaPtrPtr = IRB.CreateIntToPtr(
8455 IRB.CreateAdd(
8456 IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
8457 ConstantInt::get(MS.IntptrTy, SystemZRegSaveAreaPtrOffset)),
8458 MS.PtrTy);
8459 Value *RegSaveAreaPtr = IRB.CreateLoad(MS.PtrTy, RegSaveAreaPtrPtr);
8460 Value *RegSaveAreaShadowPtr, *RegSaveAreaOriginPtr;
8461 const Align Alignment = Align(8);
8462 std::tie(RegSaveAreaShadowPtr, RegSaveAreaOriginPtr) =
8463 MSV.getShadowOriginPtr(RegSaveAreaPtr, IRB, IRB.getInt8Ty(), Alignment,
8464 /*isStore*/ true);
8465 // TODO(iii): copy only fragments filled by visitCallBase()
8466 // TODO(iii): support packed-stack && !use-soft-float
8467 // For use-soft-float functions, it is enough to copy just the GPRs.
8468 unsigned RegSaveAreaSize =
8469 IsSoftFloatABI ? SystemZGpEndOffset : SystemZRegSaveAreaSize;
8470 IRB.CreateMemCpy(RegSaveAreaShadowPtr, Alignment, VAArgTLSCopy, Alignment,
8471 RegSaveAreaSize);
8472 if (MS.TrackOrigins)
8473 IRB.CreateMemCpy(RegSaveAreaOriginPtr, Alignment, VAArgTLSOriginCopy,
8474 Alignment, RegSaveAreaSize);
8475 }
8476
8477 // FIXME: This implementation limits OverflowOffset to kParamTLSSize, so we
8478 // don't know real overflow size and can't clear shadow beyond kParamTLSSize.
8479 void copyOverflowArea(IRBuilder<> &IRB, Value *VAListTag) {
8480 Value *OverflowArgAreaPtrPtr = IRB.CreateIntToPtr(
8481 IRB.CreateAdd(
8482 IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
8483 ConstantInt::get(MS.IntptrTy, SystemZOverflowArgAreaPtrOffset)),
8484 MS.PtrTy);
8485 Value *OverflowArgAreaPtr = IRB.CreateLoad(MS.PtrTy, OverflowArgAreaPtrPtr);
8486 Value *OverflowArgAreaShadowPtr, *OverflowArgAreaOriginPtr;
8487 const Align Alignment = Align(8);
8488 std::tie(OverflowArgAreaShadowPtr, OverflowArgAreaOriginPtr) =
8489 MSV.getShadowOriginPtr(OverflowArgAreaPtr, IRB, IRB.getInt8Ty(),
8490 Alignment, /*isStore*/ true);
8491 Value *SrcPtr = IRB.CreateConstGEP1_32(IRB.getInt8Ty(), VAArgTLSCopy,
8492 SystemZOverflowOffset);
8493 IRB.CreateMemCpy(OverflowArgAreaShadowPtr, Alignment, SrcPtr, Alignment,
8494 VAArgOverflowSize);
8495 if (MS.TrackOrigins) {
8496 SrcPtr = IRB.CreateConstGEP1_32(IRB.getInt8Ty(), VAArgTLSOriginCopy,
8497 SystemZOverflowOffset);
8498 IRB.CreateMemCpy(OverflowArgAreaOriginPtr, Alignment, SrcPtr, Alignment,
8499 VAArgOverflowSize);
8500 }
8501 }
8502
8503 void finalizeInstrumentation() override {
8504 assert(!VAArgOverflowSize && !VAArgTLSCopy &&
8505 "finalizeInstrumentation called twice");
8506 if (!VAStartInstrumentationList.empty()) {
8507 // If there is a va_start in this function, make a backup copy of
8508 // va_arg_tls somewhere in the function entry block.
8509 IRBuilder<> IRB(MSV.FnPrologueEnd);
8510 VAArgOverflowSize =
8511 IRB.CreateLoad(IRB.getInt64Ty(), MS.VAArgOverflowSizeTLS);
8512 Value *CopySize =
8513 IRB.CreateAdd(ConstantInt::get(MS.IntptrTy, SystemZOverflowOffset),
8514 VAArgOverflowSize);
8515 VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
8516 VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
8517 IRB.CreateMemSet(VAArgTLSCopy, Constant::getNullValue(IRB.getInt8Ty()),
8518 CopySize, kShadowTLSAlignment, false);
8519
8520 Value *SrcSize = IRB.CreateBinaryIntrinsic(
8521 Intrinsic::umin, CopySize,
8522 ConstantInt::get(MS.IntptrTy, kParamTLSSize));
8523 IRB.CreateMemCpy(VAArgTLSCopy, kShadowTLSAlignment, MS.VAArgTLS,
8524 kShadowTLSAlignment, SrcSize);
8525 if (MS.TrackOrigins) {
8526 VAArgTLSOriginCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
8527 VAArgTLSOriginCopy->setAlignment(kShadowTLSAlignment);
8528 IRB.CreateMemCpy(VAArgTLSOriginCopy, kShadowTLSAlignment,
8529 MS.VAArgOriginTLS, kShadowTLSAlignment, SrcSize);
8530 }
8531 }
8532
8533 // Instrument va_start.
8534 // Copy va_list shadow from the backup copy of the TLS contents.
8535 for (CallInst *OrigInst : VAStartInstrumentationList) {
8536 NextNodeIRBuilder IRB(OrigInst);
8537 Value *VAListTag = OrigInst->getArgOperand(0);
8538 copyRegSaveArea(IRB, VAListTag);
8539 copyOverflowArea(IRB, VAListTag);
8540 }
8541 }
8542};
8543
8544/// i386-specific implementation of VarArgHelper.
8545struct VarArgI386Helper : public VarArgHelperBase {
8546 AllocaInst *VAArgTLSCopy = nullptr;
8547 Value *VAArgSize = nullptr;
8548
8549 VarArgI386Helper(Function &F, MemorySanitizer &MS,
8550 MemorySanitizerVisitor &MSV)
8551 : VarArgHelperBase(F, MS, MSV, /*VAListTagSize=*/4) {}
8552
8553 void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
8554 const DataLayout &DL = F.getDataLayout();
8555 unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
8556 unsigned VAArgOffset = 0;
8557 for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
8558 bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
8559 bool IsByVal = CB.paramHasAttr(ArgNo, Attribute::ByVal);
8560 if (IsByVal) {
8561 assert(A->getType()->isPointerTy());
8562 Type *RealTy = CB.getParamByValType(ArgNo);
8563 uint64_t ArgSize = DL.getTypeAllocSize(RealTy);
8564 Align ArgAlign = CB.getParamAlign(ArgNo).value_or(Align(IntptrSize));
8565 if (ArgAlign < IntptrSize)
8566 ArgAlign = Align(IntptrSize);
8567 VAArgOffset = alignTo(VAArgOffset, ArgAlign);
8568 if (!IsFixed) {
8569 Value *Base = getShadowPtrForVAArgument(IRB, VAArgOffset, ArgSize);
8570 if (Base) {
8571 Value *AShadowPtr, *AOriginPtr;
8572 std::tie(AShadowPtr, AOriginPtr) =
8573 MSV.getShadowOriginPtr(A, IRB, IRB.getInt8Ty(),
8574 kShadowTLSAlignment, /*isStore*/ false);
8575
8576 IRB.CreateMemCpy(Base, kShadowTLSAlignment, AShadowPtr,
8577 kShadowTLSAlignment, ArgSize);
8578 }
8579 VAArgOffset += alignTo(ArgSize, Align(IntptrSize));
8580 }
8581 } else {
8582 Value *Base;
8583 uint64_t ArgSize = DL.getTypeAllocSize(A->getType());
8584 Align ArgAlign = Align(IntptrSize);
8585 VAArgOffset = alignTo(VAArgOffset, ArgAlign);
8586 if (DL.isBigEndian()) {
8587 // Adjusting the shadow for argument with size < IntptrSize to match
8588 // the placement of bits in big endian system
8589 if (ArgSize < IntptrSize)
8590 VAArgOffset += (IntptrSize - ArgSize);
8591 }
8592 if (!IsFixed) {
8593 Base = getShadowPtrForVAArgument(IRB, VAArgOffset, ArgSize);
8594 if (Base)
8595 IRB.CreateAlignedStore(MSV.getShadow(A), Base, kShadowTLSAlignment);
8596 VAArgOffset += ArgSize;
8597 VAArgOffset = alignTo(VAArgOffset, Align(IntptrSize));
8598 }
8599 }
8600 }
8601
8602 Constant *TotalVAArgSize = ConstantInt::get(MS.IntptrTy, VAArgOffset);
8603 // Here using VAArgOverflowSizeTLS as VAArgSizeTLS to avoid creation of
8604 // a new class member i.e. it is the total size of all VarArgs.
8605 IRB.CreateStore(TotalVAArgSize, MS.VAArgOverflowSizeTLS);
8606 }
8607
8608 void finalizeInstrumentation() override {
8609 assert(!VAArgSize && !VAArgTLSCopy &&
8610 "finalizeInstrumentation called twice");
8611 IRBuilder<> IRB(MSV.FnPrologueEnd);
8612 VAArgSize = IRB.CreateLoad(MS.IntptrTy, MS.VAArgOverflowSizeTLS);
8613 Value *CopySize = VAArgSize;
8614
8615 if (!VAStartInstrumentationList.empty()) {
8616 // If there is a va_start in this function, make a backup copy of
8617 // va_arg_tls somewhere in the function entry block.
8618 VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
8619 VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
8620 IRB.CreateMemSet(VAArgTLSCopy, Constant::getNullValue(IRB.getInt8Ty()),
8621 CopySize, kShadowTLSAlignment, false);
8622
8623 Value *SrcSize = IRB.CreateBinaryIntrinsic(
8624 Intrinsic::umin, CopySize,
8625 ConstantInt::get(MS.IntptrTy, kParamTLSSize));
8626 IRB.CreateMemCpy(VAArgTLSCopy, kShadowTLSAlignment, MS.VAArgTLS,
8627 kShadowTLSAlignment, SrcSize);
8628 }
8629
8630 // Instrument va_start.
8631 // Copy va_list shadow from the backup copy of the TLS contents.
8632 for (CallInst *OrigInst : VAStartInstrumentationList) {
8633 NextNodeIRBuilder IRB(OrigInst);
8634 Value *VAListTag = OrigInst->getArgOperand(0);
8635 Type *RegSaveAreaPtrTy = PointerType::getUnqual(*MS.C);
8636 Value *RegSaveAreaPtrPtr =
8637 IRB.CreateIntToPtr(IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
8638 PointerType::get(*MS.C, 0));
8639 Value *RegSaveAreaPtr =
8640 IRB.CreateLoad(RegSaveAreaPtrTy, RegSaveAreaPtrPtr);
8641 Value *RegSaveAreaShadowPtr, *RegSaveAreaOriginPtr;
8642 const DataLayout &DL = F.getDataLayout();
8643 unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
8644 const Align Alignment = Align(IntptrSize);
8645 std::tie(RegSaveAreaShadowPtr, RegSaveAreaOriginPtr) =
8646 MSV.getShadowOriginPtr(RegSaveAreaPtr, IRB, IRB.getInt8Ty(),
8647 Alignment, /*isStore*/ true);
8648 IRB.CreateMemCpy(RegSaveAreaShadowPtr, Alignment, VAArgTLSCopy, Alignment,
8649 CopySize);
8650 }
8651 }
8652};
8653
8654/// Implementation of VarArgHelper that is used for ARM32, MIPS, RISCV,
8655/// LoongArch64.
8656struct VarArgGenericHelper : public VarArgHelperBase {
8657 AllocaInst *VAArgTLSCopy = nullptr;
8658 Value *VAArgSize = nullptr;
8659
8660 VarArgGenericHelper(Function &F, MemorySanitizer &MS,
8661 MemorySanitizerVisitor &MSV, const unsigned VAListTagSize)
8662 : VarArgHelperBase(F, MS, MSV, VAListTagSize) {}
8663
8664 void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {
8665 unsigned VAArgOffset = 0;
8666 const DataLayout &DL = F.getDataLayout();
8667 unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
8668 for (const auto &[ArgNo, A] : llvm::enumerate(CB.args())) {
8669 bool IsFixed = ArgNo < CB.getFunctionType()->getNumParams();
8670 if (IsFixed)
8671 continue;
8672 uint64_t ArgSize = DL.getTypeAllocSize(A->getType());
8673 if (DL.isBigEndian()) {
8674 // Adjusting the shadow for argument with size < IntptrSize to match the
8675 // placement of bits in big endian system
8676 if (ArgSize < IntptrSize)
8677 VAArgOffset += (IntptrSize - ArgSize);
8678 }
8679 Value *Base = getShadowPtrForVAArgument(IRB, VAArgOffset, ArgSize);
8680 VAArgOffset += ArgSize;
8681 VAArgOffset = alignTo(VAArgOffset, IntptrSize);
8682 if (!Base)
8683 continue;
8684 IRB.CreateAlignedStore(MSV.getShadow(A), Base, kShadowTLSAlignment);
8685 }
8686
8687 Constant *TotalVAArgSize = ConstantInt::get(MS.IntptrTy, VAArgOffset);
8688 // Here using VAArgOverflowSizeTLS as VAArgSizeTLS to avoid creation of
8689 // a new class member i.e. it is the total size of all VarArgs.
8690 IRB.CreateStore(TotalVAArgSize, MS.VAArgOverflowSizeTLS);
8691 }
8692
8693 void finalizeInstrumentation() override {
8694 assert(!VAArgSize && !VAArgTLSCopy &&
8695 "finalizeInstrumentation called twice");
8696 IRBuilder<> IRB(MSV.FnPrologueEnd);
8697 VAArgSize = IRB.CreateLoad(MS.IntptrTy, MS.VAArgOverflowSizeTLS);
8698 Value *CopySize = VAArgSize;
8699
8700 if (!VAStartInstrumentationList.empty()) {
8701 // If there is a va_start in this function, make a backup copy of
8702 // va_arg_tls somewhere in the function entry block.
8703 VAArgTLSCopy = IRB.CreateAlloca(Type::getInt8Ty(*MS.C), CopySize);
8704 VAArgTLSCopy->setAlignment(kShadowTLSAlignment);
8705 IRB.CreateMemSet(VAArgTLSCopy, Constant::getNullValue(IRB.getInt8Ty()),
8706 CopySize, kShadowTLSAlignment, false);
8707
8708 Value *SrcSize = IRB.CreateBinaryIntrinsic(
8709 Intrinsic::umin, CopySize,
8710 ConstantInt::get(MS.IntptrTy, kParamTLSSize));
8711 IRB.CreateMemCpy(VAArgTLSCopy, kShadowTLSAlignment, MS.VAArgTLS,
8712 kShadowTLSAlignment, SrcSize);
8713 }
8714
8715 // Instrument va_start.
8716 // Copy va_list shadow from the backup copy of the TLS contents.
8717 for (CallInst *OrigInst : VAStartInstrumentationList) {
8718 NextNodeIRBuilder IRB(OrigInst);
8719 Value *VAListTag = OrigInst->getArgOperand(0);
8720 Type *RegSaveAreaPtrTy = PointerType::getUnqual(*MS.C);
8721 Value *RegSaveAreaPtrPtr =
8722 IRB.CreateIntToPtr(IRB.CreatePtrToInt(VAListTag, MS.IntptrTy),
8723 PointerType::get(*MS.C, 0));
8724 Value *RegSaveAreaPtr =
8725 IRB.CreateLoad(RegSaveAreaPtrTy, RegSaveAreaPtrPtr);
8726 Value *RegSaveAreaShadowPtr, *RegSaveAreaOriginPtr;
8727 const DataLayout &DL = F.getDataLayout();
8728 unsigned IntptrSize = DL.getTypeStoreSize(MS.IntptrTy);
8729 const Align Alignment = Align(IntptrSize);
8730 std::tie(RegSaveAreaShadowPtr, RegSaveAreaOriginPtr) =
8731 MSV.getShadowOriginPtr(RegSaveAreaPtr, IRB, IRB.getInt8Ty(),
8732 Alignment, /*isStore*/ true);
8733 IRB.CreateMemCpy(RegSaveAreaShadowPtr, Alignment, VAArgTLSCopy, Alignment,
8734 CopySize);
8735 }
8736 }
8737};
8738
8739// ARM32, Loongarch64, MIPS and RISCV share the same calling conventions
8740// regarding VAArgs.
8741using VarArgARM32Helper = VarArgGenericHelper;
8742using VarArgRISCVHelper = VarArgGenericHelper;
8743using VarArgMIPSHelper = VarArgGenericHelper;
8744using VarArgLoongArch64Helper = VarArgGenericHelper;
8745
8746/// A no-op implementation of VarArgHelper.
8747struct VarArgNoOpHelper : public VarArgHelper {
8748 VarArgNoOpHelper(Function &F, MemorySanitizer &MS,
8749 MemorySanitizerVisitor &MSV) {}
8750
8751 void visitCallBase(CallBase &CB, IRBuilder<> &IRB) override {}
8752
8753 void visitVAStartInst(VAStartInst &I) override {}
8754
8755 void visitVACopyInst(VACopyInst &I) override {}
8756
8757 void finalizeInstrumentation() override {}
8758};
8759
8760} // end anonymous namespace
8761
8762static VarArgHelper *CreateVarArgHelper(Function &Func, MemorySanitizer &Msan,
8763 MemorySanitizerVisitor &Visitor) {
8764 // VarArg handling is only implemented on AMD64. False positives are possible
8765 // on other platforms.
8766 Triple TargetTriple(Func.getParent()->getTargetTriple());
8767
8768 if (TargetTriple.getArch() == Triple::x86)
8769 return new VarArgI386Helper(Func, Msan, Visitor);
8770
8771 if (TargetTriple.getArch() == Triple::x86_64)
8772 return new VarArgAMD64Helper(Func, Msan, Visitor);
8773
8774 if (TargetTriple.isARM())
8775 return new VarArgARM32Helper(Func, Msan, Visitor, /*VAListTagSize=*/4);
8776
8777 if (TargetTriple.isAArch64())
8778 return new VarArgAArch64Helper(Func, Msan, Visitor);
8779
8780 if (TargetTriple.isSystemZ())
8781 return new VarArgSystemZHelper(Func, Msan, Visitor);
8782
8783 // On PowerPC32 VAListTag is a struct
8784 // {char, char, i16 padding, char *, char *}
8785 if (TargetTriple.isPPC32())
8786 return new VarArgPowerPC32Helper(Func, Msan, Visitor);
8787
8788 if (TargetTriple.isPPC64())
8789 return new VarArgPowerPC64Helper(Func, Msan, Visitor);
8790
8791 if (TargetTriple.isRISCV32())
8792 return new VarArgRISCVHelper(Func, Msan, Visitor, /*VAListTagSize=*/4);
8793
8794 if (TargetTriple.isRISCV64())
8795 return new VarArgRISCVHelper(Func, Msan, Visitor, /*VAListTagSize=*/8);
8796
8797 if (TargetTriple.isMIPS32())
8798 return new VarArgMIPSHelper(Func, Msan, Visitor, /*VAListTagSize=*/4);
8799
8800 if (TargetTriple.isMIPS64())
8801 return new VarArgMIPSHelper(Func, Msan, Visitor, /*VAListTagSize=*/8);
8802
8803 if (TargetTriple.isLoongArch64())
8804 return new VarArgLoongArch64Helper(Func, Msan, Visitor,
8805 /*VAListTagSize=*/8);
8806
8807 return new VarArgNoOpHelper(Func, Msan, Visitor);
8808}
8809
8810bool MemorySanitizer::sanitizeFunction(Function &F, TargetLibraryInfo &TLI) {
8811 if (!CompileKernel && F.getName() == kMsanModuleCtorName)
8812 return false;
8813
8814 if (F.hasFnAttribute(Attribute::DisableSanitizerInstrumentation))
8815 return false;
8816
8817 MemorySanitizerVisitor Visitor(F, *this, TLI);
8818
8819 // Clear out memory attributes.
8821 B.addAttribute(Attribute::Memory).addAttribute(Attribute::Speculatable);
8822 F.removeFnAttrs(B);
8823
8824 return Visitor.runOnFunction();
8825}
#define Success
assert(UImm &&(UImm !=~static_cast< T >(0)) &&"Invalid immediate!")
constexpr LLT S1
AMDGPU Uniform Intrinsic Combine
This file implements a class to represent arbitrary precision integral constant values and operations...
static bool isStore(int Opcode)
MachineBasicBlock MachineBasicBlock::iterator DebugLoc DL
static cl::opt< ITMode > IT(cl::desc("IT block support"), cl::Hidden, cl::init(DefaultIT), cl::values(clEnumValN(DefaultIT, "arm-default-it", "Generate any type of IT block"), clEnumValN(RestrictedIT, "arm-restrict-it", "Disallow complex IT blocks")))
static const size_t kNumberOfAccessSizes
static cl::opt< bool > ClWithComdat("asan-with-comdat", cl::desc("Place ASan constructors in comdat sections"), cl::Hidden, cl::init(true))
VarLocInsertPt getNextNode(const DbgRecord *DVR)
Atomic ordering constants.
This file contains the simple types necessary to represent the attributes associated with functions a...
static GCRegistry::Add< ErlangGC > A("erlang", "erlang-compatible garbage collector")
static GCRegistry::Add< StatepointGC > D("statepoint-example", "an example strategy for statepoint")
static GCRegistry::Add< OcamlGC > B("ocaml", "ocaml 3.10-compatible GC")
Analysis containing CSE Info
Definition CSEInfo.cpp:27
This file contains the declarations for the subclasses of Constant, which represent the different fla...
const MemoryMapParams Linux_LoongArch64_MemoryMapParams
const MemoryMapParams Linux_X86_64_MemoryMapParams
static cl::opt< int > ClTrackOrigins("dfsan-track-origins", cl::desc("Track origins of labels"), cl::Hidden, cl::init(0))
static AtomicOrdering addReleaseOrdering(AtomicOrdering AO)
static AtomicOrdering addAcquireOrdering(AtomicOrdering AO)
const MemoryMapParams Linux_AArch64_MemoryMapParams
static bool isAMustTailRetVal(Value *RetVal)
This file provides an implementation of debug counters.
#define DEBUG_COUNTER(VARNAME, COUNTERNAME, DESC)
This file defines the DenseMap class.
This file builds on the ADT/GraphTraits.h file to build generic depth first graph iterator.
@ Default
static bool runOnFunction(Function &F, bool PostInlining)
This is the interface for a simple mod/ref and alias analysis over globals.
static size_t TypeSizeToSizeIndex(uint32_t TypeSize)
#define op(i)
Hexagon Common GEP
#define _
Module.h This file contains the declarations for the Module class.
static LVOptions Options
Definition LVOptions.cpp:25
#define F(x, y, z)
Definition MD5.cpp:54
#define I(x, y, z)
Definition MD5.cpp:57
Machine Check Debug Module
static const PlatformMemoryMapParams Linux_S390_MemoryMapParams
static const Align kMinOriginAlignment
static cl::opt< uint64_t > ClShadowBase("msan-shadow-base", cl::desc("Define custom MSan ShadowBase"), cl::Hidden, cl::init(0))
static cl::opt< bool > ClPoisonUndef("msan-poison-undef", cl::desc("Poison fully undef temporary values. " "Partially undefined constant vectors " "are unaffected by this flag (see " "-msan-poison-undef-vectors)."), cl::Hidden, cl::init(true))
static const PlatformMemoryMapParams Linux_X86_MemoryMapParams
static cl::opt< uint64_t > ClOriginBase("msan-origin-base", cl::desc("Define custom MSan OriginBase"), cl::Hidden, cl::init(0))
static cl::opt< bool > ClCheckConstantShadow("msan-check-constant-shadow", cl::desc("Insert checks for constant shadow values"), cl::Hidden, cl::init(true))
static const PlatformMemoryMapParams Linux_LoongArch_MemoryMapParams
static const MemoryMapParams NetBSD_X86_64_MemoryMapParams
static const PlatformMemoryMapParams Linux_MIPS_MemoryMapParams
static const unsigned kOriginSize
static cl::opt< bool > ClWithComdat("msan-with-comdat", cl::desc("Place MSan constructors in comdat sections"), cl::Hidden, cl::init(false))
static cl::opt< int > ClTrackOrigins("msan-track-origins", cl::desc("Track origins (allocation sites) of poisoned memory"), cl::Hidden, cl::init(0))
Track origins of uninitialized values.
static cl::opt< int > ClInstrumentationWithCallThreshold("msan-instrumentation-with-call-threshold", cl::desc("If the function being instrumented requires more than " "this number of checks and origin stores, use callbacks instead of " "inline checks (-1 means never use callbacks)."), cl::Hidden, cl::init(3500))
static cl::opt< int > ClPoisonStackPattern("msan-poison-stack-pattern", cl::desc("poison uninitialized stack variables with the given pattern"), cl::Hidden, cl::init(0xff))
static const Align kShadowTLSAlignment
static cl::opt< bool > ClHandleICmpExact("msan-handle-icmp-exact", cl::desc("exact handling of relational integer ICmp"), cl::Hidden, cl::init(true))
static const PlatformMemoryMapParams Linux_ARM_MemoryMapParams
static cl::opt< bool > ClDumpStrictInstructions("msan-dump-strict-instructions", cl::desc("print out instructions with default strict semantics i.e.," "check that all the inputs are fully initialized, and mark " "the output as fully initialized. These semantics are applied " "to instructions that could not be handled explicitly nor " "heuristically."), cl::Hidden, cl::init(false))
static Constant * getOrInsertGlobal(Module &M, StringRef Name, Type *Ty)
static cl::opt< bool > ClPreciseDisjointOr("msan-precise-disjoint-or", cl::desc("Precisely poison disjoint OR. If false (legacy behavior), " "disjointedness is ignored (i.e., 1|1 is initialized)."), cl::Hidden, cl::init(false))
static const MemoryMapParams Linux_S390X_MemoryMapParams
static cl::opt< bool > ClPoisonStack("msan-poison-stack", cl::desc("poison uninitialized stack variables"), cl::Hidden, cl::init(true))
static const MemoryMapParams Linux_I386_MemoryMapParams
const char kMsanInitName[]
static cl::opt< bool > ClPoisonUndefVectors("msan-poison-undef-vectors", cl::desc("Precisely poison partially undefined constant vectors. " "If false (legacy behavior), the entire vector is " "considered fully initialized, which may lead to false " "negatives. Fully undefined constant vectors are " "unaffected by this flag (see -msan-poison-undef)."), cl::Hidden, cl::init(false))
static cl::opt< bool > ClPrintStackNames("msan-print-stack-names", cl::desc("Print name of local stack variable"), cl::Hidden, cl::init(true))
static cl::opt< uint64_t > ClAndMask("msan-and-mask", cl::desc("Define custom MSan AndMask"), cl::Hidden, cl::init(0))
static cl::opt< bool > ClHandleLifetimeIntrinsics("msan-handle-lifetime-intrinsics", cl::desc("when possible, poison scoped variables at the beginning of the scope " "(slower, but more precise)"), cl::Hidden, cl::init(true))
static cl::opt< bool > ClKeepGoing("msan-keep-going", cl::desc("keep going after reporting a UMR"), cl::Hidden, cl::init(false))
static const MemoryMapParams FreeBSD_X86_64_MemoryMapParams
static GlobalVariable * createPrivateConstGlobalForString(Module &M, StringRef Str)
Create a non-const global initialized with the given string.
static const PlatformMemoryMapParams Linux_PowerPC_MemoryMapParams
static const size_t kNumberOfAccessSizes
static cl::opt< bool > ClEagerChecks("msan-eager-checks", cl::desc("check arguments and return values at function call boundaries"), cl::Hidden, cl::init(false))
static cl::opt< int > ClDisambiguateWarning("msan-disambiguate-warning-threshold", cl::desc("Define threshold for number of checks per " "debug location to force origin update."), cl::Hidden, cl::init(3))
static VarArgHelper * CreateVarArgHelper(Function &Func, MemorySanitizer &Msan, MemorySanitizerVisitor &Visitor)
static const MemoryMapParams Linux_MIPS64_MemoryMapParams
static const MemoryMapParams Linux_PowerPC64_MemoryMapParams
static cl::opt< uint64_t > ClXorMask("msan-xor-mask", cl::desc("Define custom MSan XorMask"), cl::Hidden, cl::init(0))
static cl::opt< bool > ClHandleAsmConservative("msan-handle-asm-conservative", cl::desc("conservative handling of inline assembly"), cl::Hidden, cl::init(true))
static const PlatformMemoryMapParams FreeBSD_X86_MemoryMapParams
static const PlatformMemoryMapParams FreeBSD_ARM_MemoryMapParams
static const unsigned kParamTLSSize
static cl::opt< bool > ClHandleICmp("msan-handle-icmp", cl::desc("propagate shadow through ICmpEQ and ICmpNE"), cl::Hidden, cl::init(true))
static cl::opt< bool > ClEnableKmsan("msan-kernel", cl::desc("Enable KernelMemorySanitizer instrumentation"), cl::Hidden, cl::init(false))
static cl::opt< bool > ClPoisonStackWithCall("msan-poison-stack-with-call", cl::desc("poison uninitialized stack variables with a call"), cl::Hidden, cl::init(false))
static const PlatformMemoryMapParams NetBSD_X86_MemoryMapParams
static cl::opt< bool > ClDumpHeuristicInstructions("msan-dump-heuristic-instructions", cl::desc("Prints 'unknown' instructions that were handled heuristically. " "Use -msan-dump-strict-instructions to print instructions that " "could not be handled explicitly nor heuristically."), cl::Hidden, cl::init(false))
static const unsigned kRetvalTLSSize
static const MemoryMapParams FreeBSD_AArch64_MemoryMapParams
const char kMsanModuleCtorName[]
static const MemoryMapParams FreeBSD_I386_MemoryMapParams
static cl::opt< bool > ClCheckAccessAddress("msan-check-access-address", cl::desc("report accesses through a pointer which has poisoned shadow"), cl::Hidden, cl::init(true))
static cl::opt< bool > ClDisableChecks("msan-disable-checks", cl::desc("Apply no_sanitize to the whole file"), cl::Hidden, cl::init(false))
#define T
FunctionAnalysisManager FAM
if(PassOpts->AAPipeline)
const SmallVectorImpl< MachineOperand > & Cond
static const char * name
void visit(MachineFunction &MF, MachineBasicBlock &Start, std::function< void(MachineBasicBlock *)> op)
This file implements a set that has insertion order iteration characteristics.
This file defines the SmallPtrSet class.
This file defines the SmallVector class.
This file contains some functions that are useful when dealing with strings.
#define LLVM_DEBUG(...)
Definition Debug.h:114
static TableGen::Emitter::OptClass< SkeletonEmitter > X("gen-skeleton-class", "Generate example skeleton class")
static SymbolRef::Type getType(const Symbol *Sym)
Definition TapiFile.cpp:39
Value * RHS
Value * LHS
static APInt getSignedMinValue(unsigned numBits)
Gets minimum signed value of APInt for a specific bit width.
Definition APInt.h:220
void setAlignment(Align Align)
PassT::Result & getResult(IRUnitT &IR, ExtraArgTs... ExtraArgs)
Get the result of an analysis pass for a given IR unit.
const T & front() const
front - Get the first element.
Definition ArrayRef.h:145
static LLVM_ABI ArrayType * get(Type *ElementType, uint64_t NumElements)
This static method is the primary way to construct an ArrayType.
This class stores enough information to efficiently remove some attributes from an existing AttrBuild...
AttributeMask & addAttribute(Attribute::AttrKind Val)
Add an attribute to the mask.
iterator end()
Definition BasicBlock.h:472
LLVM_ABI const_iterator getFirstInsertionPt() const
Returns an iterator to the first instruction in this block that is suitable for inserting a non-PHI i...
LLVM_ABI const BasicBlock * getSinglePredecessor() const
Return the predecessor of this block if it has a single predecessor block.
InstListType::iterator iterator
Instruction iterators...
Definition BasicBlock.h:170
bool isInlineAsm() const
Check if this call is an inline asm statement.
Function * getCalledFunction() const
Returns the function called, or null if this is an indirect function invocation or the function signa...
bool hasRetAttr(Attribute::AttrKind Kind) const
Determine whether the return value has the given attribute.
LLVM_ABI bool paramHasAttr(unsigned ArgNo, Attribute::AttrKind Kind) const
Determine whether the argument or parameter has the given attribute.
void removeFnAttrs(const AttributeMask &AttrsToRemove)
Removes the attributes from the function.
void setCannotMerge()
MaybeAlign getParamAlign(unsigned ArgNo) const
Extract the alignment for a call or parameter (0=unknown).
Type * getParamByValType(unsigned ArgNo) const
Extract the byval type for a call or parameter.
Value * getCalledOperand() const
Type * getParamElementType(unsigned ArgNo) const
Extract the elementtype type for a parameter.
Value * getArgOperand(unsigned i) const
void setArgOperand(unsigned i, Value *v)
FunctionType * getFunctionType() const
iterator_range< User::op_iterator > args()
Iteration adapter for range-for loops.
void addParamAttr(unsigned ArgNo, Attribute::AttrKind Kind)
Adds the attribute to the indicated argument.
Predicate
This enumeration lists the possible predicates for CmpInst subclasses.
Definition InstrTypes.h:676
@ ICMP_SLT
signed less than
Definition InstrTypes.h:705
@ ICMP_SLE
signed less or equal
Definition InstrTypes.h:706
@ ICMP_SGT
signed greater than
Definition InstrTypes.h:703
@ ICMP_SGE
signed greater or equal
Definition InstrTypes.h:704
static LLVM_ABI Constant * get(ArrayType *T, ArrayRef< Constant * > V)
static LLVM_ABI Constant * getString(LLVMContext &Context, StringRef Initializer, bool AddNull=true)
This method constructs a CDS and initializes it with a text string.
static LLVM_ABI Constant * get(LLVMContext &Context, ArrayRef< uint8_t > Elts)
get() constructors - Return a constant with vector type with an element count and element type matchi...
static ConstantInt * getSigned(IntegerType *Ty, int64_t V, bool ImplicitTrunc=false)
Return a ConstantInt with the specified value for the specified type.
Definition Constants.h:135
static LLVM_ABI ConstantInt * getBool(LLVMContext &Context, bool V)
static LLVM_ABI Constant * get(StructType *T, ArrayRef< Constant * > V)
static LLVM_ABI Constant * getSplat(ElementCount EC, Constant *Elt)
Return a ConstantVector with the specified constant in each element.
static LLVM_ABI Constant * get(ArrayRef< Constant * > V)
This is an important base class in LLVM.
Definition Constant.h:43
static LLVM_ABI Constant * getAllOnesValue(Type *Ty)
LLVM_ABI bool isAllOnesValue() const
Return true if this is the value that would be returned by getAllOnesValue.
static LLVM_ABI Constant * getNullValue(Type *Ty)
Constructor to create a '0' constant of arbitrary type.
LLVM_ABI Constant * getAggregateElement(unsigned Elt) const
For aggregates (struct/array/vector) return the constant that corresponds to the specified element if...
LLVM_ABI bool isZeroValue() const
Return true if the value is negative zero or null value.
Definition Constants.cpp:76
LLVM_ABI bool isNullValue() const
Return true if this is the value that would be returned by getNullValue.
Definition Constants.cpp:90
static bool shouldExecute(CounterInfo &Counter)
bool empty() const
Definition DenseMap.h:109
unsigned getNumElements() const
static LLVM_ABI FixedVectorType * get(Type *ElementType, unsigned NumElts)
Definition Type.cpp:802
static FixedVectorType * getHalfElementsVectorType(FixedVectorType *VTy)
A handy container for a FunctionType+Callee-pointer pair, which can be passed around as a single enti...
unsigned getNumParams() const
Return the number of fixed parameters this function type requires.
LLVM_ABI void setComdat(Comdat *C)
Definition Globals.cpp:214
@ PrivateLinkage
Like Internal, but omit from symbol table.
Definition GlobalValue.h:61
@ ExternalLinkage
Externally visible function.
Definition GlobalValue.h:53
Analysis pass providing a never-invalidated alias analysis result.
ConstantInt * getInt1(bool V)
Get a constant value representing either true or false.
Definition IRBuilder.h:497
Value * CreateInsertElement(Type *VecTy, Value *NewElt, Value *Idx, const Twine &Name="")
Definition IRBuilder.h:2585
Value * CreateConstGEP1_32(Type *Ty, Value *Ptr, unsigned Idx0, const Twine &Name="")
Definition IRBuilder.h:1939
AllocaInst * CreateAlloca(Type *Ty, unsigned AddrSpace, Value *ArraySize=nullptr, const Twine &Name="")
Definition IRBuilder.h:1833
IntegerType * getInt1Ty()
Fetch the type representing a single bit.
Definition IRBuilder.h:547
LLVM_ABI CallInst * CreateMaskedCompressStore(Value *Val, Value *Ptr, MaybeAlign Align, Value *Mask=nullptr)
Create a call to Masked Compress Store intrinsic.
Value * CreateInsertValue(Value *Agg, Value *Val, ArrayRef< unsigned > Idxs, const Twine &Name="")
Definition IRBuilder.h:2639
Value * CreateExtractElement(Value *Vec, Value *Idx, const Twine &Name="")
Definition IRBuilder.h:2573
IntegerType * getIntNTy(unsigned N)
Fetch the type representing an N-bit integer.
Definition IRBuilder.h:575
LoadInst * CreateAlignedLoad(Type *Ty, Value *Ptr, MaybeAlign Align, const char *Name)
Definition IRBuilder.h:1867
Value * CreateZExtOrTrunc(Value *V, Type *DestTy, const Twine &Name="")
Create a ZExt or Trunc from the integer value V to DestTy.
Definition IRBuilder.h:2103
CallInst * CreateMemCpy(Value *Dst, MaybeAlign DstAlign, Value *Src, MaybeAlign SrcAlign, uint64_t Size, bool isVolatile=false, const AAMDNodes &AAInfo=AAMDNodes())
Create and insert a memcpy between the specified pointers.
Definition IRBuilder.h:687
LLVM_ABI CallInst * CreateAndReduce(Value *Src)
Create a vector int AND reduction intrinsic of the source vector.
Value * CreatePointerCast(Value *V, Type *DestTy, const Twine &Name="")
Definition IRBuilder.h:2254
Value * CreateExtractValue(Value *Agg, ArrayRef< unsigned > Idxs, const Twine &Name="")
Definition IRBuilder.h:2632
LLVM_ABI CallInst * CreateMaskedLoad(Type *Ty, Value *Ptr, Align Alignment, Value *Mask, Value *PassThru=nullptr, const Twine &Name="")
Create a call to Masked Load intrinsic.
LLVM_ABI Value * CreateSelect(Value *C, Value *True, Value *False, const Twine &Name="", Instruction *MDFrom=nullptr)
BasicBlock::iterator GetInsertPoint() const
Definition IRBuilder.h:202
Value * CreateSExt(Value *V, Type *DestTy, const Twine &Name="")
Definition IRBuilder.h:2097
Value * CreateIntToPtr(Value *V, Type *DestTy, const Twine &Name="")
Definition IRBuilder.h:2202
Value * CreateLShr(Value *LHS, Value *RHS, const Twine &Name="", bool isExact=false)
Definition IRBuilder.h:1513
IntegerType * getInt32Ty()
Fetch the type representing a 32-bit integer.
Definition IRBuilder.h:562
ConstantInt * getInt8(uint8_t C)
Get a constant 8-bit value.
Definition IRBuilder.h:512
Value * CreatePtrAdd(Value *Ptr, Value *Offset, const Twine &Name="", GEPNoWrapFlags NW=GEPNoWrapFlags::none())
Definition IRBuilder.h:2039
IntegerType * getInt64Ty()
Fetch the type representing a 64-bit integer.
Definition IRBuilder.h:567
Value * CreateUDiv(Value *LHS, Value *RHS, const Twine &Name="", bool isExact=false)
Definition IRBuilder.h:1454
Value * CreateICmpNE(Value *LHS, Value *RHS, const Twine &Name="")
Definition IRBuilder.h:2336
Value * CreateGEP(Type *Ty, Value *Ptr, ArrayRef< Value * > IdxList, const Twine &Name="", GEPNoWrapFlags NW=GEPNoWrapFlags::none())
Definition IRBuilder.h:1926
Value * CreateNeg(Value *V, const Twine &Name="", bool HasNSW=false)
Definition IRBuilder.h:1784
LLVM_ABI CallInst * CreateOrReduce(Value *Src)
Create a vector int OR reduction intrinsic of the source vector.
LLVM_ABI Value * CreateBinaryIntrinsic(Intrinsic::ID ID, Value *LHS, Value *RHS, FMFSource FMFSource={}, const Twine &Name="")
Create a call to intrinsic ID with 2 operands which is mangled on the first type.
LLVM_ABI CallInst * CreateIntrinsic(Intrinsic::ID ID, ArrayRef< Type * > Types, ArrayRef< Value * > Args, FMFSource FMFSource={}, const Twine &Name="")
Create a call to intrinsic ID with Args, mangled using Types.
ConstantInt * getInt32(uint32_t C)
Get a constant 32-bit value.
Definition IRBuilder.h:522
PHINode * CreatePHI(Type *Ty, unsigned NumReservedValues, const Twine &Name="")
Definition IRBuilder.h:2497
Value * CreateNot(Value *V, const Twine &Name="")
Definition IRBuilder.h:1808
Value * CreateICmpEQ(Value *LHS, Value *RHS, const Twine &Name="")
Definition IRBuilder.h:2332
LLVM_ABI DebugLoc getCurrentDebugLocation() const
Get location information used by debugging information.
Definition IRBuilder.cpp:64
Value * CreateSub(Value *LHS, Value *RHS, const Twine &Name="", bool HasNUW=false, bool HasNSW=false)
Definition IRBuilder.h:1420
Value * CreateBitCast(Value *V, Type *DestTy, const Twine &Name="")
Definition IRBuilder.h:2207
ConstantInt * getIntN(unsigned N, uint64_t C)
Get a constant N-bit value, zero extended or truncated from a 64-bit value.
Definition IRBuilder.h:533
LoadInst * CreateLoad(Type *Ty, Value *Ptr, const char *Name)
Provided to resolve 'CreateLoad(Ty, Ptr, "...")' correctly, instead of converting the string to 'bool...
Definition IRBuilder.h:1850
Value * CreateShl(Value *LHS, Value *RHS, const Twine &Name="", bool HasNUW=false, bool HasNSW=false)
Definition IRBuilder.h:1492
CallInst * CreateMemSet(Value *Ptr, Value *Val, uint64_t Size, MaybeAlign Align, bool isVolatile=false, const AAMDNodes &AAInfo=AAMDNodes())
Create and insert a memset to the specified pointer and the specified value.
Definition IRBuilder.h:630
Value * CreateZExt(Value *V, Type *DestTy, const Twine &Name="", bool IsNonNeg=false)
Definition IRBuilder.h:2085
Value * CreateShuffleVector(Value *V1, Value *V2, Value *Mask, const Twine &Name="")
Definition IRBuilder.h:2607
LLVMContext & getContext() const
Definition IRBuilder.h:203
Value * CreateAnd(Value *LHS, Value *RHS, const Twine &Name="")
Definition IRBuilder.h:1551
StoreInst * CreateStore(Value *Val, Value *Ptr, bool isVolatile=false)
Definition IRBuilder.h:1863
LLVM_ABI CallInst * CreateMaskedStore(Value *Val, Value *Ptr, Align Alignment, Value *Mask)
Create a call to Masked Store intrinsic.
Value * CreateAdd(Value *LHS, Value *RHS, const Twine &Name="", bool HasNUW=false, bool HasNSW=false)
Definition IRBuilder.h:1403
Value * CreatePtrToInt(Value *V, Type *DestTy, const Twine &Name="")
Definition IRBuilder.h:2197
Value * CreateIsNotNull(Value *Arg, const Twine &Name="")
Return a boolean value testing if Arg != 0.
Definition IRBuilder.h:2665
CallInst * CreateCall(FunctionType *FTy, Value *Callee, ArrayRef< Value * > Args={}, const Twine &Name="", MDNode *FPMathTag=nullptr)
Definition IRBuilder.h:2511
Value * CreateTrunc(Value *V, Type *DestTy, const Twine &Name="", bool IsNUW=false, bool IsNSW=false)
Definition IRBuilder.h:2071
PointerType * getPtrTy(unsigned AddrSpace=0)
Fetch the type representing a pointer.
Definition IRBuilder.h:605
Value * CreateBinOp(Instruction::BinaryOps Opc, Value *LHS, Value *RHS, const Twine &Name="", MDNode *FPMathTag=nullptr)
Definition IRBuilder.h:1708
Value * CreateICmpSLT(Value *LHS, Value *RHS, const Twine &Name="")
Definition IRBuilder.h:2364
LLVM_ABI Value * CreateTypeSize(Type *Ty, TypeSize Size)
Create an expression which evaluates to the number of units in Size at runtime.
Value * CreateICmpUGE(Value *LHS, Value *RHS, const Twine &Name="")
Definition IRBuilder.h:2344
Value * CreateIntCast(Value *V, Type *DestTy, bool isSigned, const Twine &Name="")
Definition IRBuilder.h:2280
Value * CreateIsNull(Value *Arg, const Twine &Name="")
Return a boolean value testing if Arg == 0.
Definition IRBuilder.h:2660
void SetInsertPoint(BasicBlock *TheBB)
This specifies that created instructions should be appended to the end of the specified block.
Definition IRBuilder.h:207
Type * getVoidTy()
Fetch the type representing void.
Definition IRBuilder.h:600
StoreInst * CreateAlignedStore(Value *Val, Value *Ptr, MaybeAlign Align, bool isVolatile=false)
Definition IRBuilder.h:1886
LLVM_ABI CallInst * CreateMaskedExpandLoad(Type *Ty, Value *Ptr, MaybeAlign Align, Value *Mask=nullptr, Value *PassThru=nullptr, const Twine &Name="")
Create a call to Masked Expand Load intrinsic.
Value * CreateInBoundsPtrAdd(Value *Ptr, Value *Offset, const Twine &Name="")
Definition IRBuilder.h:2044
Value * CreateAShr(Value *LHS, Value *RHS, const Twine &Name="", bool isExact=false)
Definition IRBuilder.h:1532
Value * CreateXor(Value *LHS, Value *RHS, const Twine &Name="")
Definition IRBuilder.h:1599
Value * CreateICmp(CmpInst::Predicate P, Value *LHS, Value *RHS, const Twine &Name="")
Definition IRBuilder.h:2442
Value * CreateOr(Value *LHS, Value *RHS, const Twine &Name="", bool IsDisjoint=false)
Definition IRBuilder.h:1573
IntegerType * getInt8Ty()
Fetch the type representing an 8-bit integer.
Definition IRBuilder.h:552
Value * CreateMul(Value *LHS, Value *RHS, const Twine &Name="", bool HasNUW=false, bool HasNSW=false)
Definition IRBuilder.h:1437
LLVM_ABI CallInst * CreateMaskedScatter(Value *Val, Value *Ptrs, Align Alignment, Value *Mask=nullptr)
Create a call to Masked Scatter intrinsic.
LLVM_ABI CallInst * CreateMaskedGather(Type *Ty, Value *Ptrs, Align Alignment, Value *Mask=nullptr, Value *PassThru=nullptr, const Twine &Name="")
Create a call to Masked Gather intrinsic.
This provides a uniform API for creating instructions and inserting them into a basic block: either a...
Definition IRBuilder.h:2794
std::vector< ConstraintInfo > ConstraintInfoVector
Definition InlineAsm.h:123
void visit(Iterator Start, Iterator End)
Definition InstVisitor.h:87
const DebugLoc & getDebugLoc() const
Return the debug location for this node as a DebugLoc.
LLVM_ABI InstListType::iterator eraseFromParent()
This method unlinks 'this' from the containing basic block and deletes it.
MDNode * getMetadata(unsigned KindID) const
Get the metadata of given kind attached to this Instruction.
LLVM_ABI bool comesBefore(const Instruction *Other) const
Given an instruction Other in the same basic block as this instruction, return true if this instructi...
static LLVM_ABI IntegerType * get(LLVMContext &C, unsigned NumBits)
This static method is the primary way of constructing an IntegerType.
Definition Type.cpp:318
LLVM_ABI MDNode * createUnlikelyBranchWeights()
Return metadata containing two branch weights, with significant bias towards false destination.
Definition MDBuilder.cpp:48
A Module instance is used to store all the information related to an LLVM module.
Definition Module.h:67
void addIncoming(Value *V, BasicBlock *BB)
Add an incoming value to the end of the PHI list.
static LLVM_ABI PoisonValue * get(Type *T)
Static factory methods - Return an 'poison' object of the specified type.
A set of analyses that are preserved following a run of a transformation pass.
Definition Analysis.h:112
static PreservedAnalyses none()
Convenience factory function for the empty preserved set.
Definition Analysis.h:115
static PreservedAnalyses all()
Construct a special preserved set that preserves all passes.
Definition Analysis.h:118
PreservedAnalyses & abandon()
Mark an analysis as abandoned.
Definition Analysis.h:171
bool remove(const value_type &X)
Remove an item from the set vector.
Definition SetVector.h:181
bool insert(const value_type &X)
Insert a new element into the SetVector.
Definition SetVector.h:151
void append(ItTy in_start, ItTy in_end)
Add the specified range to the end of the SmallVector.
void push_back(const T &Elt)
StringRef - Represent a constant reference to a string, i.e.
Definition StringRef.h:55
static LLVM_ABI StructType * get(LLVMContext &Context, ArrayRef< Type * > Elements, bool isPacked=false)
This static method is the primary way to create a literal StructType.
Definition Type.cpp:413
unsigned getNumElements() const
Random access to the elements.
Type * getElementType(unsigned N) const
Analysis pass providing the TargetLibraryInfo.
Provides information about what library functions are available for the current target.
AttributeList getAttrList(LLVMContext *C, ArrayRef< unsigned > ArgNos, bool Signed, bool Ret=false, AttributeList AL=AttributeList()) const
bool getLibFunc(StringRef funcName, LibFunc &F) const
Searches for a particular function name.
Triple - Helper class for working with autoconf configuration names.
Definition Triple.h:47
bool isMIPS64() const
Tests whether the target is MIPS 64-bit (little and big endian).
Definition Triple.h:1061
@ loongarch64
Definition Triple.h:65
bool isRISCV32() const
Tests whether the target is 32-bit RISC-V.
Definition Triple.h:1104
bool isPPC32() const
Tests whether the target is 32-bit PowerPC (little and big endian).
Definition Triple.h:1077
ArchType getArch() const
Get the parsed architecture type of this triple.
Definition Triple.h:414
bool isRISCV64() const
Tests whether the target is 64-bit RISC-V.
Definition Triple.h:1109
bool isLoongArch64() const
Tests whether the target is 64-bit LoongArch.
Definition Triple.h:1050
bool isMIPS32() const
Tests whether the target is MIPS 32-bit (little and big endian).
Definition Triple.h:1056
bool isARM() const
Tests whether the target is ARM (little and big endian).
Definition Triple.h:938
bool isPPC64() const
Tests whether the target is 64-bit PowerPC (little and big endian).
Definition Triple.h:1082
bool isAArch64() const
Tests whether the target is AArch64 (little and big endian).
Definition Triple.h:1029
bool isSystemZ() const
Tests whether the target is SystemZ.
Definition Triple.h:1128
The instances of the Type class are immutable: once they are created, they are never changed.
Definition Type.h:45
LLVM_ABI unsigned getIntegerBitWidth() const
bool isVectorTy() const
True if this is an instance of VectorType.
Definition Type.h:273
bool isArrayTy() const
True if this is an instance of ArrayType.
Definition Type.h:264
LLVM_ABI bool isScalableTy(SmallPtrSetImpl< const Type * > &Visited) const
Return true if this is a type whose size is a known multiple of vscale.
Definition Type.cpp:61
bool isIntOrIntVectorTy() const
Return true if this is an integer type or a vector of integer types.
Definition Type.h:246
bool isPointerTy() const
True if this is an instance of PointerType.
Definition Type.h:267
Type * getArrayElementType() const
Definition Type.h:408
bool isPPC_FP128Ty() const
Return true if this is powerpc long double.
Definition Type.h:165
static LLVM_ABI Type * getVoidTy(LLVMContext &C)
Definition Type.cpp:280
Type * getScalarType() const
If this is a vector type, return the element type, otherwise return 'this'.
Definition Type.h:352
LLVM_ABI TypeSize getPrimitiveSizeInBits() const LLVM_READONLY
Return the basic size of this type if it is a primitive type.
Definition Type.cpp:197
bool isSized(SmallPtrSetImpl< Type * > *Visited=nullptr) const
Return true if it makes sense to take the size of this type.
Definition Type.h:311
LLVM_ABI unsigned getScalarSizeInBits() const LLVM_READONLY
If this is a vector type, return the getPrimitiveSizeInBits value for the element type.
Definition Type.cpp:230
bool isFloatingPointTy() const
Return true if this is one of the floating-point types.
Definition Type.h:184
bool isIntOrPtrTy() const
Return true if this is an integer type or a pointer type.
Definition Type.h:255
bool isIntegerTy() const
True if this is an instance of IntegerType.
Definition Type.h:240
bool isFPOrFPVectorTy() const
Return true if this is a FP type or a vector of FP.
Definition Type.h:225
bool isVoidTy() const
Return true if this is 'void'.
Definition Type.h:139
Value * getOperand(unsigned i) const
Definition User.h:233
unsigned getNumOperands() const
Definition User.h:255
size_type count(const KeyT &Val) const
Return 1 if the specified key is in the map, 0 otherwise.
Definition ValueMap.h:156
Type * getType() const
All values are typed, get the type of this value.
Definition Value.h:256
LLVM_ABI void setName(const Twine &Name)
Change the name of the value.
Definition Value.cpp:397
LLVM_ABI StringRef getName() const
Return a constant reference to the value's name.
Definition Value.cpp:322
ElementCount getElementCount() const
Return an ElementCount instance to represent the (possibly scalable) number of elements in the vector...
Type * getElementType() const
int getNumOccurrences() const
constexpr ScalarTy getFixedValue() const
Definition TypeSize.h:200
constexpr bool isScalable() const
Returns whether the quantity is scaled by a runtime quantity (vscale).
Definition TypeSize.h:168
An efficient, type-erasing, non-owning reference to a callable.
const ParentTy * getParent() const
Definition ilist_node.h:34
self_iterator getIterator()
Definition ilist_node.h:123
This class implements an extremely fast bulk output stream that can only output to a stream.
Definition raw_ostream.h:53
CallInst * Call
#define llvm_unreachable(msg)
Marks that the current location is not supposed to be reachable.
constexpr char Align[]
Key for Kernel::Arg::Metadata::mAlign.
constexpr std::underlying_type_t< E > Mask()
Get a bitmask with 1s in all places up to the high-order bit of E's largest value.
@ C
The default llvm calling convention, compatible with C.
Definition CallingConv.h:34
@ BasicBlock
Various leaf nodes.
Definition ISDOpcodes.h:81
initializer< Ty > init(const Ty &Val)
Function * Kernel
Summary of a kernel (=entry point for target offloading).
Definition OpenMPOpt.h:21
NodeAddr< FuncNode * > Func
Definition RDFGraph.h:393
friend class Instruction
Iterator for Instructions in a `BasicBlock.
Definition BasicBlock.h:73
This is an optimization pass for GlobalISel generic memory operations.
Definition Types.h:26
unsigned Log2_32_Ceil(uint32_t Value)
Return the ceil log base 2 of the specified value, 32 if the value is zero.
Definition MathExtras.h:344
@ Offset
Definition DWP.cpp:532
FunctionAddr VTableAddr Value
Definition InstrProf.h:137
auto size(R &&Range, std::enable_if_t< std::is_base_of< std::random_access_iterator_tag, typename std::iterator_traits< decltype(Range.begin())>::iterator_category >::value, void > *=nullptr)
Get the size of a range.
Definition STLExtras.h:1667
auto enumerate(FirstRange &&First, RestRanges &&...Rest)
Given two or more input ranges, returns a new range whose values are tuples (A, B,...
Definition STLExtras.h:2530
decltype(auto) dyn_cast(const From &Val)
dyn_cast<X> - Return the argument parameter cast to the specified type.
Definition Casting.h:643
@ Done
Definition Threading.h:60
bool isAligned(Align Lhs, uint64_t SizeInBytes)
Checks that SizeInBytes is a multiple of the alignment.
Definition Alignment.h:134
LLVM_ABI std::pair< Instruction *, Value * > SplitBlockAndInsertSimpleForLoop(Value *End, BasicBlock::iterator SplitBefore)
Insert a for (int i = 0; i < End; i++) loop structure (with the exception that End is assumed > 0,...
InnerAnalysisManagerProxy< FunctionAnalysisManager, Module > FunctionAnalysisManagerModuleProxy
Provide the FunctionAnalysisManager to Module proxy.
constexpr bool isPowerOf2_64(uint64_t Value)
Return true if the argument is a power of two > 0 (64 bit edition.)
Definition MathExtras.h:284
unsigned Log2_64(uint64_t Value)
Return the floor log base 2 of the specified value, -1 if the value is zero.
Definition MathExtras.h:337
auto dyn_cast_or_null(const Y &Val)
Definition Casting.h:753
LLVM_ABI std::pair< Function *, FunctionCallee > getOrCreateSanitizerCtorAndInitFunctions(Module &M, StringRef CtorName, StringRef InitName, ArrayRef< Type * > InitArgTypes, ArrayRef< Value * > InitArgs, function_ref< void(Function *, FunctionCallee)> FunctionsCreatedCallback, StringRef VersionCheckName=StringRef(), bool Weak=false)
Creates sanitizer constructor function lazily.
LLVM_ABI raw_ostream & dbgs()
dbgs() - This returns a reference to a raw_ostream for debugging messages.
Definition Debug.cpp:207
LLVM_ABI void report_fatal_error(Error Err, bool gen_crash_diag=true)
Definition Error.cpp:167
class LLVM_GSL_OWNER SmallVector
Forward declaration of SmallVector so that calculateSmallVectorDefaultInlinedElements can reference s...
bool isa(const From &Val)
isa<X> - Return true if the parameter to the template is an instance of one of the template type argu...
Definition Casting.h:547
LLVM_ABI bool isKnownNonZero(const Value *V, const SimplifyQuery &Q, unsigned Depth=0)
Return true if the given value is known to be non-zero when defined.
LLVM_ABI raw_fd_ostream & errs()
This returns a reference to a raw_ostream for standard error.
AtomicOrdering
Atomic ordering for LLVM's memory model.
@ First
Helpers to iterate all locations in the MemoryEffectsBase class.
Definition ModRef.h:74
IRBuilder(LLVMContext &, FolderTy, InserterTy, MDNode *, ArrayRef< OperandBundleDef >) -> IRBuilder< FolderTy, InserterTy >
@ Or
Bitwise or logical OR of integers.
@ And
Bitwise or logical AND of integers.
@ Add
Sum of integers.
uint64_t alignTo(uint64_t Size, Align A)
Returns a multiple of A needed to store Size bytes.
Definition Alignment.h:144
DWARFExpression::Operation Op
RoundingMode
Rounding mode.
ArrayRef(const T &OneElt) -> ArrayRef< T >
constexpr unsigned BitWidth
LLVM_ABI void appendToGlobalCtors(Module &M, Function *F, int Priority, Constant *Data=nullptr)
Append F to the list of global ctors of module M with the given Priority.
decltype(auto) cast(const From &Val)
cast<X> - Return the argument parameter cast to the specified type.
Definition Casting.h:559
iterator_range< df_iterator< T > > depth_first(const T &G)
LLVM_ABI Instruction * SplitBlockAndInsertIfThen(Value *Cond, BasicBlock::iterator SplitBefore, bool Unreachable, MDNode *BranchWeights=nullptr, DomTreeUpdater *DTU=nullptr, LoopInfo *LI=nullptr, BasicBlock *ThenBlock=nullptr)
Split the containing block at the specified instruction - everything before SplitBefore stays in the ...
LLVM_ABI void maybeMarkSanitizerLibraryCallNoBuiltin(CallInst *CI, const TargetLibraryInfo *TLI)
Given a CallInst, check if it calls a string function known to CodeGen, and mark it with NoBuiltin if...
Definition Local.cpp:3865
LLVM_ABI bool removeUnreachableBlocks(Function &F, DomTreeUpdater *DTU=nullptr, MemorySSAUpdater *MSSAU=nullptr)
Remove all blocks that can not be reached from the function's entry.
Definition Local.cpp:2883
LLVM_ABI bool checkIfAlreadyInstrumented(Module &M, StringRef Flag)
Check if module has flag attached, if not add the flag.
std::string itostr(int64_t X)
AnalysisManager< Module > ModuleAnalysisManager
Convenience typedef for the Module analysis manager.
Definition MIRParser.h:39
This struct is a compact representation of a valid (non-zero power of two) alignment.
Definition Alignment.h:39
constexpr uint64_t value() const
This is a hole in the type system and should not be abused.
Definition Alignment.h:77
LLVM_ABI void printPipeline(raw_ostream &OS, function_ref< StringRef(StringRef)> MapClassName2PassName)
LLVM_ABI PreservedAnalyses run(Module &M, ModuleAnalysisManager &AM)
A CRTP mix-in to automatically provide informational APIs needed for passes.
Definition PassManager.h:70