39550 – Miscompilations on AArch64 since "[SimplifyCFG] don't sink common insts too soon"

LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filled before November 26, 2021. Use github to submit LLVM bugs

Bug 39550 - Miscompilations on AArch64 since "[SimplifyCFG] don't sink common insts too soon"

Summary: Miscompilations on AArch64 since "[SimplifyCFG] don't sink common insts too s...

Status:	RESOLVED FIXED

Alias:	None

Product:	libraries
Classification:	Unclassified
Component:	Scalar Optimizations (show other bugs)
Version:	trunk
Hardware:	PC All

Importance:	P enhancement
Assignee:	Tim Northover

URL:
Keywords:

Depends on:
Blocks:	release-7.0.1
	Show dependency tree

Reported:	2018-11-03 15:39 PDT by Martin Storsjö
Modified:	2018-12-07 12:42 PST (History)
CC List:	9 users (show)

See Also:
Fixed By Commit(s):	r346203 r348444 r348636 r348642

Attachments
Reproduction code (41.69 KB, text/plain) 2018-11-03 15:39 PDT, Martin Storsjö	Details
Reduced test case (1.75 KB, text/x-csrc) 2018-11-05 00:04 PST, Martin Storsjö	Details
Show Obsolete (1) Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Martin Storsjö 2018-11-03 15:39:01 PDT

Created attachment 21080 [details]
Reproduction code

I'm running into a case where code seems to miscompile after SVN r320749, "[SimplifyCFG] don't sink common insts too soon (PR34603)" (part of LLVM 6.0, but the same issues can be reproduced with the latest trunk version as well).

I can't (yet) point out exactly where the new generated code is wrong, but this commit changed the outcome of the attached code.

To reproduce (somewhat), compile the attached sample with "clang -std=c99 -O3 -fvisibility=hidden -fomit-frame-pointer -ffast-math --target=aarch64-linux-gnu -c ref_mvs-preproc.c". If compiled with clang built from before SVN r320749, the compiled code does what it is supposed to, while if compiled with a later version, it produces incorrect results.

I have tried looking at the output from compiling with `-mllvm -print-after-all` to look at differences between before and after this commit, and there obviously are differences, but nothing that I could spot that stands out as obviously incorrect.

Surprisingly, the same code built for other architectures (both 32 and 64 bit x86, and armv7) with newer clang/llvm versions run just fine without any of the misbehaviour as I run into on AArch64.

Can someone spot what this SimplifyCFG change does wrt to this code sample, if there's some overlooked case? Or are the transformations correct and it just happens to trigger buggy codepaths in the AArch64 target after the transformation?

Comment 1 Sanjay Patel 2018-11-04 07:32:49 PST

To narrow down the problem, we can compile the source to optimized IR:
$ clang -O3 ... -emit-llvm -S

Then pass that IR to the backend:
$ llc ref_mvs-preproc.ll -o - | clang -x assembler -

And run the code:
$ a.out

Does that produce the wrong result? If yes, turn off optimization in the backend:
$ llc -O0 ref_mvs-preproc.ll -o - | clang -x assembler - ; ./a.out


Does that produce the wrong result? If yes, the bug is probably in the IR. If not, the bug is probably in the backend...and so r320749 is probably not the problem. (It's possible that there are multiple bugs interacting, but let's hope not...)

Comment 2 Martin Storsjö 2018-11-04 11:47:58 PST

(In reply to Sanjay Patel from comment #1)
> To narrow down the problem, we can compile the source to optimized IR:
> $ clang -O3 ... -emit-llvm -S
> 
> Then pass that IR to the backend:
> $ llc ref_mvs-preproc.ll -o - | clang -x assembler -
> 
> And run the code:
> $ a.out
> 
> Does that produce the wrong result? If yes, turn off optimization in the
> backend:
> $ llc -O0 ref_mvs-preproc.ll -o - | clang -x assembler - ; ./a.out
> 
> Does that produce the wrong result? If yes, the bug is probably in the IR.
> If not, the bug is probably in the backend...and so r320749 is probably not
> the problem. (It's possible that there are multiple bugs interacting, but
> let's hope not...)

Thanks for these points! It does indeed look like this commit isn't to blame, as the issue disappears if building the IR with llc -O0. That would also explain why it only appears on AArch64. Unfortunately, that also gets me more or less back to square one about figuring out the reason for it, but I guess I'll have to strip down the optimization behaviour change further to zoom in on what might be causing it.

Comment 3 Martin Storsjö 2018-11-05 00:04:39 PST

Created attachment 21086 [details]
Reduced test case

With this test case, the issue is much clearer.

Comment 4 Martin Storsjö 2018-11-05 00:18:05 PST

After reducing the issue further, it is clear that the issue lies somewhere around AArch64 CCMP instructions. CC:ing a few people who probably know a bit or two around that.

The issue seems to be that there is a condition that looks like this:

    (mode == 15 || mode == 20) && type > 0 && block_size_allowed

After lowering this to CCMPs, the effective form of the condition turns out into this:

    (((block_size_allowed && mode == 20) || mode == 15) && type > 0)

To reproduce, compile the attached reproduction case e.g. like this:

$ clang -O2 -target aarch64-linux-gnu ref_mvs-preproc.c -o repro
$ ./repro
ret aaaa
$ clang -O1 -target aarch64-linux-gnu ref_mvs-preproc.c -o repro
$ ./repro
ret bbbb

The generated assembly shows the issue rather clearly:
$ clang -O2 -target aarch64-linux-gnu ref_mvs-preproc.c -S -o -
...
        ldrb    w9, [x9, x8]
        ldrb    w8, [x10, x8]
        cmp     w9, w8
        csel    w8, w9, w8, lo
        cmp     w8, #7          // block_size_allowed
        ccmp    w2, #20, #0, hi // block_size_allowed && mode == 20
        ccmp    w2, #15, #4, ne // (block_size_allowed && mode == 20) || mode == 15
        ccmp    w4, #0, #4, eq  // ... && type > 0
        csel    x8, x0, x1, gt

While the issue is clearly visible here, I do see in the comment for emitConditionalComparison in AArch64ISelLowering.cpp that it is prepared to handle mixed AND/OR conditions, but apparently something goes wrong here.

Comment 5 Matthias Braun 2018-11-05 16:17:36 PST

You cannot directly express `||` with CCMP instructions. But in principle AArch64ISelLowering.cpp should transform `(mode == 15 || mode == 20) && type > 0 && block_size_allowed` into `!(mode != 15 && mode != 20) && type > 0 && block_size_allowed` (see also the long explanation I wrote in AArch64ISelLowering (search for AArch64CCMP).

It seems though something goes wrong with that computation...

Comment 6 Matthias Braun 2018-11-05 16:36:16 PST

I think for this to work properly we would need to emit the expression as:
type > 0 && block_size_allowed && !(mode != 15 && mode != 20)     (or with the type/block_size_allowed checks swapped, the important part if that the !() expression comes last in the conjunction. It seems the code fails to reorder the expressions as necessary.

Comment 7 Matthias Braun 2018-11-05 18:50:07 PST

I've got a fix for the issue now. Just need to cleanup the code some more and create a testcase before I can upload a review.

Comment 8 Martin Storsjö 2018-11-05 23:15:34 PST

(In reply to Matthias Braun from comment #5)
> You cannot directly express `||` with CCMP instructions. But in principle
> AArch64ISelLowering.cpp should transform `(mode == 15 || mode == 20) && type
> > 0 && block_size_allowed` into `!(mode != 15 && mode != 20) && type > 0 &&
> block_size_allowed` (see also the long explanation I wrote in
> AArch64ISelLowering (search for AArch64CCMP).
> 
> It seems though something goes wrong with that computation...

Thanks for looking at this! Yes, I found these functions; but the code seemed like it already meant to take care of this and it wasn't really obvious where the assumptions went wrong.

Comment 9 Matthias Braun 2018-11-07 16:40:58 PST

https://reviews.llvm.org/D54137 should fix this.

Comment 10 Martin Storsjö 2018-11-08 01:30:46 PST

(In reply to Matthias Braun from comment #9)
> https://reviews.llvm.org/D54137 should fix this.

I can confirm that the current version of the patch fixes this issue - thanks!

Comment 11 Tom Stellard 2018-12-05 19:18:06 PST

Tim, is this OK to merge to the release_70 branch?

https://reviews.llvm.org/rL348444

Comment 12 Martin Storsjö 2018-12-05 22:22:27 PST

FWIW, merging that fix also requires merging https://reviews.llvm.org/rL346203 as a preceding cleanup.

Comment 13 Tim Northover 2018-12-07 01:55:35 PST

Yep, I think they're both good.

Comment 14 Tom Stellard 2018-12-07 12:42:34 PST

Merged: r348636 r348642