This is a regression, this test executes correctly with clang 6.0.1. Running the command: ${TEST_SUITE_BUILDDIR}/MultiSource/Applications/siod/siod -v1 ${TEST_SUITE_SRCDIR}/test-suite/MultiSource/Applications/siod/test.scm Produces an error: Content-type: text/plain 10 9 8 7 6 5 4 3 2 1 -42 is negative 0 is zero 42 is positive the value of (get 'answer 'value) is 42 the value of (get 'answer 'value) is xyzzy the 33rd Fibonacci number is ERROR: damaged frame (see errobj) I am going to try to bisect this.
This failure was introduced by: r322957.
Can you attach pre-processed source or IR to repro the problem?
Created attachment 20757 [details] Pre-processed source Here is the pre-processed source. Here are the compile arguments that are used: clang -DNDEBUG -O2 -DNDEBUG -w -Werror=date-time -D__USE_MISC -D__USE_GNU -D__USE_SVID -D__USE_XOPEN_EXTENDED -D__USE_XOPEN -Dunix -o slib.c.o -c slib.i
(In reply to Tom Stellard from comment #3) > Created attachment 20757 [details] > Pre-processed source > > Here is the pre-processed source. Here are the compile arguments that are > used: > > clang -DNDEBUG -O2 -DNDEBUG -w -Werror=date-time -D__USE_MISC -D__USE_GNU > -D__USE_SVID -D__USE_XOPEN_EXTENDED -D__USE_XOPEN -Dunix -o slib.c.o -c > slib.i Thanks! To make it easier to A/B test this, I added a flag that disables the optimization that was added with r322957 here: https://reviews.llvm.org/rL340540 At first glance, that optimization is doing what we expected in _c_mark_and_sweep(): movabsq $9223372036854775800, %r14 ## imm = 0x7FFFFFFFFFFFFFF8 shrq %rdx andq %r14, %rdx becomes: andq $-8, %rdx ; avoided a giant constant and saved a register But there's another diff later: movabsq $-6148914691236517205, %r10 ## imm = 0xAAAAAAAAAAAAAAAB becomes: movq _heaps@GOTPCREL(%rip), %r10 That seems...odd. cc'ing Craig for ideas about how to reduce/debug this.
I'm not able to reproduce this error. I downloaded the 7.0-rc1 source and built it on linux, but this is the output i got from the test Content-type: text/plain 10 9 8 7 6 5 4 3 2 1 -42 is negative 0 is zero 42 is positive the value of (get 'answer 'value) is 42 the value of (get 'answer 'value) is xyzzy the 33rd Fibonacci number is 3.52458e+06
Let's make sure we're compiling the same way - the "-O2 NDEBUG" appears to be necessary (and I'm not sure if that's how test-suite would build this by default). I set up test-suite on Linux x86-64, and I can reproduce the problem, but I'm not seeing that same suspicious asm that I saw before (that was on macOS using the attached source file). Here's how I built/ran the siod benchmark: ~/mysandbox $ ./bin/lnt runtest test-suite --sandbox ~/mysandbox/ --cc ~/myllvm/release/bin/clang --test-suite ~/llvm-test-suite --only-test MultiSource/Applications/siod --use-lit ~/myllvm/release/bin/llvm-lit --use-cmake cmake --build-threads 4 --threads 4 --cflags="-O2 -DNDEBUG -mllvm -x86-and-imm-shrink=0 -mllvm -stats -save-temps" This passes as shown in comment 5. Now, allow mask shrinking: ~/mysandbox $ ./bin/lnt runtest test-suite --sandbox ~/mysandbox/ --cc ~/myllvm/release/bin/clang --test-suite ~/llvm-test-suite --only-test MultiSource/Applications/siod --use-lit ~/myllvm/release/bin/llvm-lit --use-cmake cmake --build-threads 4 --threads 4 --cflags="-O2 -DNDEBUG -mllvm -x86-and-imm-shrink=1 -mllvm -stats -save-temps" This fails as shown in the description.
I managed to get it to fail with and shrinking enabled, but I got "ERROR: too few arguements" If you modify test.scm to calculate a different fibonacci number, do you find that it stops failing for smaller values? For me the failure doesn't happen calculating the 20th fibonacci number, but fails for 21.
(In reply to Craig Topper from comment #7) > I managed to get it to fail with and shrinking enabled, but I got "ERROR: > too few arguements" > > If you modify test.scm to calculate a different fibonacci number, do you > find that it stops failing for smaller values? For me the failure doesn't > happen calculating the 20th fibonacci number, but fails for 21. For me the lowest number that fails for me is 23.
Created attachment 20784 [details] Reproducer script Here is the script I've been using to reproduce. You may have to adjust some of the paths if you haven't run cmake.
(In reply to Tom Stellard from comment #8) > (In reply to Craig Topper from comment #7) > > I managed to get it to fail with and shrinking enabled, but I got "ERROR: > > too few arguements" > > > > If you modify test.scm to calculate a different fibonacci number, do you > > find that it stops failing for smaller values? For me the failure doesn't > > happen calculating the 20th fibonacci number, but fails for 21. > > For me the lowest number that fails for me is 23. That's what I'm seeing too. There were some unnecessary params in my earlier test-suite invocation. It can be reduced to: ~/mysandbox$ ./bin/lnt runtest test-suite --sandbox ~/mysandbox/ --cc ~/myllvm/release/bin/clang --test-suite ~/llvm-test-suite --only-test MultiSource/Applications/siod --use-lit ~/myllvm/release/bin/llvm-lit --use-cmake cmake --cflags="-O2 -DNDEBUG -mllvm -x86-and-imm-shrink=1"
Do you think we can fix this for the release? It seems bad if it's a regression from 6.0.1.
I've looked a little closer and the ands that got optimized changed their mask from 0x7fff'ffff'ffff'fff8 in a register created by a movabsq that was hoisted outside of loop to using an and with a -8 immediate. The instruction proceeding all of the ands was a right shift by 1. So this seems correct to me. But this freed up a register and changed the stack spills and reloads created by the register allocator. The program has some questionable stack memory access behavior. Valgrind reports many conditional operations using uninitialized memory. I haven't checked it with address sanitizer yet. I wonder if the change in spills/reloads changed the behavior.
The gc_mark_and_sweep function creates a variable named "end_of_stack", but doesn't initialize it. It then passes the address of this variable as an end pointer to a function called mark_locations. The "start" address passed comes from a global variable that points to another uninitialized object in an earlier callers stack frame. The mark_locations code will note that start and end are out of order due to stack growing down and will swap them. Making start then point to the beginning of the end_of_stack object. Then it calls mark_locations_array which will read from start to end looking for pointers. So this will read the uninitialized value from end_of_stack and it will read the return addresses and other spills, etc. along the way.
(In reply to Craig Topper from comment #13) > The gc_mark_and_sweep function creates a variable named "end_of_stack", but > doesn't initialize it. It then passes the address of this variable as an end > pointer to a function called mark_locations. The "start" address passed > comes from a global variable that points to another uninitialized object in > an earlier callers stack frame. The mark_locations code will note that start > and end are out of order due to stack growing down and will swap them. > Making start then point to the beginning of the end_of_stack object. Then it > calls mark_locations_array which will read from start to end looking for > pointers. So this will read the uninitialized value from end_of_stack and it > will read the return addresses and other spills, etc. along the way. Nice detective work! Not sure what the policy is for this sort of thing. Do we resolve this bug as invalid? Are we supposed to flag the test as potentially misbehaving or fix the test itself?
(In reply to Sanjay Patel from comment #14) > (In reply to Craig Topper from comment #13) > > The gc_mark_and_sweep function creates a variable named "end_of_stack", but > > doesn't initialize it. It then passes the address of this variable as an end > > pointer to a function called mark_locations. The "start" address passed > > comes from a global variable that points to another uninitialized object in > > an earlier callers stack frame. The mark_locations code will note that start > > and end are out of order due to stack growing down and will swap them. > > Making start then point to the beginning of the end_of_stack object. Then it > > calls mark_locations_array which will read from start to end looking for > > pointers. So this will read the uninitialized value from end_of_stack and it > > will read the return addresses and other spills, etc. along the way. > > Nice detective work! +1, thanks for digging into this! > > Not sure what the policy is for this sort of thing. Do we resolve this bug > as invalid? Are we supposed to flag the test as potentially misbehaving or > fix the test itself? At least it means I won't worry about it for 7.0.0, so unblocking :-) I think for the test-suite, it's common to fix the test to not misbehave if possible. If that's not possible / too inconvenient, I suppose we should flag or delete it since it's not adding value.
One our engineers finally figured out the real problem here https://reviews.llvm.org/D60039 If I remember right from my analysis, Sanjay's patch did change some spill behavior here. I just didn't understand at the time how that would effect things.
Hi James, Do you think this is OK to merge into 8.0.1?
Yes, should be fine.
Merged: r362747 and r362751.