$ cat 1.cc #include <stdio.h> #include <sanitizer/msan_interface.h> struct Map { unsigned Small; long Buckets; long *Ptr; unsigned NumBuckets; Map() { Small = true; Buckets = 42; __msan_allocated_memory(&NumBuckets, sizeof(NumBuckets)); } long *getBuckets() { return Small ? &Buckets : Ptr; } unsigned getNumBuckets() { return Small ? 1 : NumBuckets; } }; Map M; int main() { for (int i = 0; i < 1; ++i) { if (M.getNumBuckets() != 0) { if (42 == *M.getBuckets()) return 1; } } if (!M.Small) printf("zz\n"); } $ clang++ 1.cc -std=c++14 -g -O3 -fsanitize=memory && ./a.out WARNING: MemorySanitizer: use-of-uninitialized-value #0 0x4879c3 in main 1.cc:30:9 #1 0x7f12aeff9f44 in __libc_start_main #2 0x419b11 in _start Looking at the bitcode, the first thing the program does is compare NumBuckets with 0 and branch on the result. That's what MSan is complaining about. The source code reads NumBuckets only when Small == 0, which is impossible. This happens only at -O3. $ clang++ 1.cc -std=c++14 -g -O3 -S -emit-llvm -o - ... %struct.Map = type <{ i32, [4 x i8], i64, i64*, i32, [4 x i8] }> ... define i32 @main() #0 { entry: %0 = load i32, i32* getelementptr inbounds (%struct.Map, %struct.Map* @M, i64 0, i32 0), align 8 %tobool.i20 = icmp eq i32 %0, 0 %1 = load i32, i32* getelementptr inbounds (%struct.Map, %struct.Map* @M, i64 0, i32 4), align 8 %cmp1 = icmp eq i32 %1, 0 %2 = load i64*, i64** getelementptr inbounds (%struct.Map, %struct.Map* @M, i64 0, i32 3), align 8 br i1 %cmp1, label %entry.split.us, label %entry.split
Full bitcode for main() at -O3: define i32 @main() #0 { entry: %0 = load i32, i32* getelementptr inbounds (%struct.Map, %struct.Map* @M, i64 0, i32 0), align 8 %tobool.i20 = icmp eq i32 %0, 0 %1 = load i32, i32* getelementptr inbounds (%struct.Map, %struct.Map* @M, i64 0, i32 4), align 8 %cmp1 = icmp eq i32 %1, 0 %2 = load i64*, i64** getelementptr inbounds (%struct.Map, %struct.Map* @M, i64 0, i32 3), align 8 br i1 %cmp1, label %entry.split.us, label %entry.split entry.split.us: ; preds = %entry br i1 %tobool.i20, label %if.then6, label %for.body.us.preheader for.body.us.preheader: ; preds = %entry.split.us %3 = load i64, i64* getelementptr inbounds (%struct.Map, %struct.Map* @M, i64 0, i32 2), align 8 %cmp3.us = icmp eq i64 %3, 42 %. = zext i1 %cmp3.us to i32 br label %if.end8 entry.split: ; preds = %entry br i1 %tobool.i20, label %for.body.us23.preheader, label %for.body.preheader for.body.preheader: ; preds = %entry.split %4 = load i64, i64* getelementptr inbounds (%struct.Map, %struct.Map* @M, i64 0, i32 2), align 8, !tbaa !1 %cmp3 = icmp eq i64 %4, 42 %.42 = zext i1 %cmp3 to i32 ret i32 %.42 for.body.us23.preheader: ; preds = %entry.split %5 = load i64, i64* %2, align 8, !tbaa !1 %cmp3.us31 = icmp eq i64 %5, 42 br i1 %cmp3.us31, label %if.end8, label %if.then6 if.then6: ; preds = %for.body.us23.preheader, %entry.split.us %puts = tail call i32 @puts(i8* getelementptr inbounds ([3 x i8], [3 x i8]* @str, i64 0, i64 0)) br label %if.end8 if.end8: ; preds = %for.body.us.preheader, %for.body.us23.preheader, %if.then6 %retval.017 = phi i32 [ 0, %if.then6 ], [ 1, %for.body.us23.preheader ], [ %., %for.body.us.preheader ] ret i32 %retval.017 }
Created attachment 16494 [details] IR test case The problem is introduced in the loop-unswitch pass. $ opt 2.ll -loop-unswitch -S -o - ... define i32 @main() #0 { entry: %0 = load i32, i32* getelementptr inbounds (%struct.Map, %struct.Map* @M, i64 0, i32 0), align 8 %tobool.i12 = icmp eq i32 %0, 0 %1 = load i32, i32* getelementptr inbounds (%struct.Map, %struct.Map* @M, i64 0, i32 4), align 8 %cmp1 = icmp eq i32 %1, 0 %.pr = load i32, i32* getelementptr inbounds (%struct.Map, %struct.Map* @M, i64 0, i32 0), align 8 %tobool.i1 = icmp eq i32 %.pr, 0 %2 = load i64*, i64** getelementptr inbounds (%struct.Map, %struct.Map* @M, i64 0, i32 3), align 8 %3 = load i32, i32* getelementptr inbounds (%struct.Map, %struct.Map* @M, i64 0, i32 0), align 8 %tobool.i = icmp eq i32 %3, 0 br i1 %cmp1, label %entry.split.us, label %entry.entry.split_crit_edge ...
This is causing the failures on the sanitizer-bootstrap bot that started in the build with the revision range (271398, 271414], but with this test case I can reproduce the bug a lot earlier, at least at r246152 (Aug 2015).
Simplified test case: volatile int sink; long y; void f(bool x, int *p, int *q) { volatile bool x2 = x; for (int i = 0; i < 1; ++i) { if (x2) { if (y) sink = *p; else sink = *q; } } } This loop is unswitched on (y), this creates an unconditional load (and "msan-use") of y independent on the value of x. This is bad for MSan (y may not be initialized if x == false) and for TSan (there could be concurrent writes to y if x == false). My idea is to disable this optimization for MSan and TSan if the unswitch condition does not post-dominate the loop header, i.e. unswitching is safe if the condition is evaluated in all executions.
Proposed fix: http://reviews.llvm.org/D21165 To correct the last comment, TSan is fine - it would never speculate the load of "y". MSan is fine with speculative loads and stores, but it is unhappy about speculative branch instructions - that's when MSan checks the shadow. In the example below "if (y)" can not be moved out of the loop, even though there are no loads. volatile int sink; void f(bool x, bool y, int *p, int *q) { volatile bool x2 = x; for (int i = 0; i < 1; ++i) { if (x2) { if (y) sink = *p; else sink = *q; } } }
Here is another example where it is basically impossible for MSan to propagate initializeness information exactly after the unswitching optimization. This loop is unswitched on y, then we have 2 copies of the loop calculating "a" and phi + ret at the end. When y is undef, if we just track a single-bit "control flow is poisoned" state, we end up with "a" being undef, too. To conclude that a is defined MSan has to prove that both loops calculate "a" with the same sequence of operations which sounds impossible in the general case. volatile int sink; // The interesting execution has y = false, z = 1. int f(bool y, unsigned z) { int a = 42; for (int i = 0; i < z; ++i) { if (i) if (y) sink = 1; a ^= 123; } return a; } The root of the problem is in the difference between the LLVM semantics for undef, and the semantics enforced by MSan. MSan traps on the branch with and undefined predicate. LLVM IR, on the other hand, has undefined behaviour only when the two sides of such branch have different observable behaviour. This property seems very hard to check. The fix in http://reviews.llvm.org/D21165 should be enough short-term. For a long term fix, these 2 options came up in a discussion with Chandler: 1. Teach MSan not to report use-of-uninit in a branch predicate unless the branches diverge. This is a runtime property, i.e. the branches may diverge in the executions when the predicate is initialized. This sounds like a very hard problem, with unclear performance implications. 2. Get some help from the frontend, which can introduce "observation points" - intrinsics that say "this value must be initialized". Then MSan could do best-effort propagation, reverting to fully initialized when it find something it can not handle, and insert checks only at observation points. In the first iteration of this approach, MSan can ignore uninitialized branch conditions (that aren't observation points). As an improvement, it can have an i1 control_flow_is_poisoned flag which would affect "hard" side-effects but not the regular SSA computations. As a big plus of this approach, it would make MSan a lot more predictable from the user point of view. It would start detecting all those bugs that are missed because of the bad access being optimized away due to undef propagation outside of the MemorySanitizer pass. On the negative side, MSan will become less useful as a tool to catch miscompilations. As I hear, a lot of miscompilation bugs in LLVM manifest as mysterious MSan (or ASan, TSan) reports.