New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ARM] SCE optimization pass hoisting DIV operation outside of is-else BB as a speculative execution #41750
Comments
assigned to @anton-afanasyev |
Why is this an issue for AArch64? Only (32-bit) M-class and R-class CPUs can trap on division by zero. |
X86 (didn't check other targets) for example mark SDIV/UDIV as hasSideEffect and it prevent the optimization/bug. |
I'm asking why this is a bug at all on AArch64 (the target in your llc invocation). I could see it being a problem on some ARMv7, possibly also resolved by marking division as having side-effects. |
I am not familiar with AArch64 target, we only use it as reproducer to show the problem ( we have out of tree target) . |
I believe adding |
I think that's what I'm leaning towards too. It's a shame "mayRaiseFPException" is named so specifically because it feels even more apt otherwise. |
I don't think that UnmodeledSideEffects is the solution, it limit Scheduler a lot. |
I'm sympathetic, but unless you can come up with a way it is modeled there's not really any alternative. It clearly has some undesirable side-effect. Though the more I look at mayRaiseFPException, the more appropriate it looks for this use-case except for the name. In principle we can even arrange the "FPExcept" flag to be set correctly based on the architecture in ARM. |
It's performance optimization primarily, since it eliminates partial redundancy of computation. I can also add special condition to prevent hoisting for such CFG (not so strong as suggested here https://reviews.llvm.org/rL362901#663831 though) since it makes no performance effect (only code size one). This would be a workaround for CSE step possibly not being able to recognize regs pressure causing by elimination, but it is an other issue, not related to the original bug (though it fixes it). The original bug should be fixed by something like "mayRaiseException" flag. |
IMHO, such an optimization, which adds a speculative execution of an operation, especially when done in Machine instruction level, should be very careful when hoisting an operation as a speculative execute. |
Btw, I've made a modification to PRE condition preventing it from useless hoisting: https://reviews.llvm.org/D63934. This change fixes this bug implicitly but I still believe the actual solution should involve checking special flag like "mayRaiseFPException". |
FWIW, the failure I'm seeing in Halide due to https://reviews.llvm.org/rL362901 is not resolved https://reviews.llvm.org/D63934. |
Hi Alina, could you provide me with test case or at least its control flow graph? We can separate your case from this issue. |
Yes, I am working on it. |
(Moving discussion back to this thread from Ping! Hi Alina, Ayman, Igor, I've asked before (https://reviews.llvm.org/D63934#1592654) whether the new patch from Kai Luo (https://reviews.llvm.org/D64394) fixes your particular issues. I can revert my original patch if it is not the case. |
Hi Anton, Apologies for the delayed reply. We no longer see the original failure, but it appears it is not due to changes in your original patch, that triggered the failure. The possible source of the fix may be in the HVX backend. |
Hi Anton, Apologies for the delayed reply. |
Hi all, I've benchmarked the effect of the revertion my and @lkail patches. |
Hi Igor, Does this patch https://reviews.llvm.org/D66132 is what you need to fix the issue? You can set |
Yes, Thank you! |
Hi, Is this fixed? I think it is - https://www.diffchecker.com/SvD3hVXM/. |
Extended Description
Attached is an MIR test (reproducer.mir) that shows a case where the original sequence performs the DIV operation only if the divisor is not 0. But after running MachineCSE pass we can see that the DIV was hoisted outside of the if-else BBs, what would result in an integer division by zero.
The command line for running the pass is:
llc -mtriple=aarch64-none-linux-gnu -run-pass=machine-cse -asm-verbose=1 -verify-machineinstrs reproducer.mir -o after-pass
Most likely that this bug was introduced in: https://reviews.llvm.org/rL362901.
Owner of the change is already investigating the issue.
The text was updated successfully, but these errors were encountered: