LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filled before November 26, 2021. Use github to submit LLVM bugs

Bug 47480 - 7 GWP-ASan lit test failures in linux self build after commit b22910daab95
Summary: 7 GWP-ASan lit test failures in linux self build after commit b22910daab95
Status: NEW
Alias: None
Product: libraries
Classification: Unclassified
Component: Scalar Optimizations (show other bugs)
Version: trunk
Hardware: PC All
: P normal
Assignee: Unassigned LLVM Bugs
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-09-10 00:51 PDT by Douglas Yung
Modified: 2021-07-08 12:53 PDT (History)
13 users (show)

See Also:
Fixed By Commit(s):


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Douglas Yung 2020-09-10 00:51:57 PDT
As part of our testing internally, we do a self-host build of the compiler on linux and run the lit tests to make sure that everything works. Recently this self-host build started failing 7 GWP-ASan tests in the stage 1 build of our self-host build. I bisected the test failure to recent commit b22910daab95be1ebc6ab8a74190e38130b0e6ef.

The 7 tests that are failing are:

1. GWP-ASan-x86_64 :: double_delete.cpp
2. GWP-ASan-x86_64 :: double_deletea.cpp
3. GWP-ASan-x86_64 :: double_free.cpp
4. GWP-ASan-x86_64 :: invalid_free_left.cpp
5. GWP-ASan-x86_64 :: invalid_free_right.cpp
6. GWP-ASan-x86_64 :: realloc.cpp
7. GwpAsan-Unittest :: ./GwpAsan-x86_64- Test/BacktraceGuardedPoolAllocator.DoubleFree

I was able to reproduce this locally on my linux machine running Ubuntu 20.04 using both Gcc 9.3.0 and clang 10.0.0 to build the initial compiler.

Steps to reproduce the failure:
1. Build clang using the system compiler (gcc 9.3 or clang 10 in my case)
2. Use the clang built in step 1 to rebuild clang
3. Using the version of clang built in step 2, run the lit tests

The cmake command I used to build was the following:
cmake -G Ninja -DLLVM_INCLUDE_EXAMPLES=OFF -DLLVM_VERSION_SUFFIX= -DLLVM_BUILD_RUNTIME=ON -DLLVM_TOOL_LLD_BUILD=OFF "-DLLVM_LIT_ARGS=--verbose -j150" -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=ON -DLLVM_OPTIMIZED_TABLEGEN=ON -DLLVM_TARGETS_TO_BUILD=all "-DLLVM_ENABLE_PROJECTS=clang;compiler-rt" -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ <path to llvm sources>

The Unittest that failed produced the following output:
FAIL: GwpAsan-Unittest :: ./GwpAsan-x86_64-Test/BacktraceGuardedPoolAllocator.DoubleFree (32710 of 69710)
******************** TEST 'GwpAsan-Unittest :: ./GwpAsan-x86_64-Test/BacktraceGuardedPoolAllocator.DoubleFree' FAILED ********************
Note: Google Test filter = BacktraceGuardedPoolAllocator.DoubleFree
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from BacktraceGuardedPoolAllocator
[ RUN      ] BacktraceGuardedPoolAllocator.DoubleFree
/home/dyung/src/upstream/llvm_clean_git/compiler-rt/lib/gwp_asan/tests/backtrace.cpp:45: Failure
Death test: DeallocateMemory2(GPA, Ptr)
    Result: failed to die.
 Error msg:
[  DEATH   ]
[  FAILED  ] BacktraceGuardedPoolAllocator.DoubleFree (62 ms)
[----------] 1 test from BacktraceGuardedPoolAllocator (62 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (69 ms total)
[  PASSED  ] 0 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] BacktraceGuardedPoolAllocator.DoubleFree

 1 FAILED TEST


While the regular lit tests that failed all seemed to not produce any output:
FileCheck error: '<stdin>' is empty.
FileCheck command line:  FileCheck /home/dyung/src/upstream/llvm_clean_git/compiler-rt/test/gwp_asan/double_delete.cpp
Comment 1 Roman Lebedev 2020-09-10 00:55:07 PDT
See ongoing disscussion in https://reviews.llvm.org/D87149
I believe TLDR is that unless these miscompiles are for volatile stores,
it is GWP-ASan that needs to be fixed.
Comment 2 Sanjay Patel 2020-09-10 06:42:57 PDT
Warning: I don't know anything about GWP-ASan, and I'm not experienced in the UB subtleties exposed by my patch...
but a first grep of that dir shows the likely problematic function:

void GuardedPoolAllocator::trapOnAddress(uintptr_t Address, Error E) {
  State.FailureType = E;
  State.FailureAddress = Address;

  // Raise a SEGV by touching first guard page.
  volatile char *p = reinterpret_cast<char *>(State.GuardedPagePool);
  *p = 0;
  __builtin_unreachable();
}


IIUC, we are probably going to change LLVM to allow this usage. But would it be safer/better to use __builtin_trap() or __builtin_debugtrap() in this function?
Comment 3 Sanjay Patel 2020-09-10 06:57:58 PDT
cc'ing people that I hopefully correctly cross-referenced from:
https://reviews.llvm.org/D60593
Comment 4 Nuno Lopes 2020-09-10 07:01:57 PDT
(In reply to Sanjay Patel from comment #2)
> Warning: I don't know anything about GWP-ASan, and I'm not experienced in
> the UB subtleties exposed by my patch...
> but a first grep of that dir shows the likely problematic function:
> 
> void GuardedPoolAllocator::trapOnAddress(uintptr_t Address, Error E) {
>   State.FailureType = E;
>   State.FailureAddress = Address;
> 
>   // Raise a SEGV by touching first guard page.
>   volatile char *p = reinterpret_cast<char *>(State.GuardedPagePool);
>   *p = 0;
>   __builtin_unreachable();
> }
> 
> 
> IIUC, we are probably going to change LLVM to allow this usage. But would it
> be safer/better to use __builtin_trap() or __builtin_debugtrap() in this
> function?

Nice catch, thanks!
Using __builtin_trap() would be better indeed. unreachable is quite nasty.
Comment 5 Douglas Yung 2020-09-10 10:35:15 PDT
(In reply to Nuno Lopes from comment #4)
> (In reply to Sanjay Patel from comment #2)
> > Warning: I don't know anything about GWP-ASan, and I'm not experienced in
> > the UB subtleties exposed by my patch...
> > but a first grep of that dir shows the likely problematic function:
> > 
> > void GuardedPoolAllocator::trapOnAddress(uintptr_t Address, Error E) {
> >   State.FailureType = E;
> >   State.FailureAddress = Address;
> > 
> >   // Raise a SEGV by touching first guard page.
> >   volatile char *p = reinterpret_cast<char *>(State.GuardedPagePool);
> >   *p = 0;
> >   __builtin_unreachable();
> > }
> > 
> > 
> > IIUC, we are probably going to change LLVM to allow this usage. But would it
> > be safer/better to use __builtin_trap() or __builtin_debugtrap() in this
> > function?
> 
> Nice catch, thanks!
> Using __builtin_trap() would be better indeed. unreachable is quite nasty.

I can confirm that changing the __builtin_unreachable() to __builtin_trap() does seem to fix the problem!
Comment 6 Johannes Doerfert 2020-09-10 10:56:35 PDT
While the "problem" will be address either way, I would generally think `__builtin_trap()` is the better way to cause a trap.
Comment 7 Sanjay Patel 2020-09-10 10:58:03 PDT
Note that we have at least temporarily restored the failing tests to passing:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/29534

...with this patch:
https://github.com/llvm/llvm-project/commit/4e413e16216d0c94ada2171f3c59e0a85f4fa4b6

But it sounds like changing the GWP-ASan source would be safer.
Comment 8 Mitch Phillips 2020-09-10 16:31:30 PDT
The volatile store in GWP-ASan is (by designed) supposed to crash the program with a segv at a specific address.

The __builtin_unreachable() is there to guide the compiler that it can tail-call optimise this function.

It seems to me that the root cause is that the __builtin_unreachable() shouldn't be allowed to dead-code eliminate things with side effects (like the volatile store)...
Comment 9 Johannes Doerfert 2020-09-10 16:39:03 PDT
(In reply to Mitch Phillips from comment #8)
> The volatile store in GWP-ASan is (by designed) supposed to crash the
> program with a segv at a specific address.
> 
> The __builtin_unreachable() is there to guide the compiler that it can
> tail-call optimise this function.
> 
> It seems to me that the root cause is that the __builtin_unreachable()
> shouldn't be allowed to dead-code eliminate things with side effects (like
> the volatile store)...

As we discuss in the thread https://reviews.llvm.org/D87149, the LLVM LangRef needs to be updated to provide the above guarantees. That said, generic "side effects", in the LLVM-IR sense, will *not* be preserved if they are inevitable followed by an unreachable. That is, a regular store to a global for example is a side effect but it won't be preserved. volatile accesses are different and you will probably get what you expect here, we'll update this once the discussion is settled.
Comment 10 Roman Lebedev 2021-07-08 10:03:45 PDT
(In reply to Johannes Doerfert from comment #9)
> (In reply to Mitch Phillips from comment #8)
> > The volatile store in GWP-ASan is (by designed) supposed to crash the
> > program with a segv at a specific address.
> > 
> > The __builtin_unreachable() is there to guide the compiler that it can
> > tail-call optimise this function.
> > 
> > It seems to me that the root cause is that the __builtin_unreachable()
> > shouldn't be allowed to dead-code eliminate things with side effects (like
> > the volatile store)...
> 
> As we discuss in the thread https://reviews.llvm.org/D87149, the LLVM
> LangRef needs to be updated to provide the above guarantees. That said,
> generic "side effects", in the LLVM-IR sense, will *not* be preserved if
> they are inevitable followed by an unreachable. That is, a regular store to
> a global for example is a side effect but it won't be preserved. volatile
> accesses are different and you will probably get what you expect here, we'll
> update this once the discussion is settled.

FYI looks like GWP-ASan still hasn't been fixed,
nor did anybody submit an RFC to change LangRef.
Chances are, these tests will break again tomorrow,
once the temporary patch expires as part of https://reviews.llvm.org/D105338
Comment 11 Mitch Phillips 2021-07-08 12:53:19 PDT
Sent https://reviews.llvm.org/D105654 to add a workaround in GWP-ASan, but __builtin_unreachable() optimising away code with side effects still needs to be fixed.