-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gstreamer causes assert "Wrong topological sorting", function InitDAGTopologicalSorting #16212
Comments
Reduced test case, compile with llc -march=x86 -mcpu=i486 < t.ll: define void @gst_atomic_queue_push() { while.body: if.then: |
fix for assertion failure |
proposed fix |
The patch to fix the assertion failure turns not to be necessary but it's still a valid patch I think. |
Proposed fix, updated for ToT llvm[3]: Compiling X86ISelDAGToDAG.cpp for Release+Asserts build and so on for all the possible overloads. Apparently llvm:DebugLoc has been replaced with llvm::SDLoc after the 3.3 release. I have updated the patch to compile with top-of-trunk. Now the only thing missing is a good testcase... And I agree, your earlier patch replacing vector with queue should also go in, as a separate commit of course. |
For some reason, this fix also changes the behavior of the codegen a little, since it makes test/CodeGen/X86/2012-01-16-mfence-nosse-flags.ll fail. This is because instead of the expected "cmpl $0, %eax", it now generates "testl %eax, %eax". I have no idea why yet. If it is correct, that other test must also be updated. |
I'm not sufficiently familiar with llvm internals to know if this is a correct explanation but I suspect the reason is that the fence used to be "xorl r,r \ lock orl r,(%esp)" whereas now it is "lock orl $O,(%esp)" (in the old code there was getConstant(0) and now it's getTargetConstant(0)). In the test case the comparison comes before the fence which means initially llvm expected to be able to reuse the zero for the fence in the comparison (cmpl r,%eax) and this isn't optimised into a testl. Later the comparison is rescheduled and llvm doesn't know r isn't clobbered so it becomes cmpl $0,%eax. With the patch the fence doesn't require a zero so initially there's cmpl $0,%eax which is optimised into a testl that is then rescheduled. First I tried to keep the getConstant(0), as an operand to X86ISD::FENCE and then instruction selection would choose between LOCK_OR32mi8 and LOCK_OR32mr based on whether the zero was used only once or not. This basically worked but I couldn't figure out what to put in the .td file for X86ISD::FENCE with an operand and I couldn't get it to optimise {__sync_synchronize();return 0;} which had two "xorl %eax,%eax" (llvm doesn't seem to know the src operand in LOCK_OR isn't clobbered). As for a test case, wouldn't Benjamin Kramer's test above be good enough? |
Proposed fix, updated for ToT, with testcase I also had to update the testcase for bug 11768, which now generates more efficient code, e.g. before the fix it generated:
after the fix it generates:
|
I've done some digging on exactly why it's failing, so I'll record my discoveries in case I forget over the weekend. The issue is that SelectionDAG comes across:
in chain order. When deciding whether it can merge the sub and the "A load", as is x86's wont, it tries to check if combining those chains would create a cycle. Unfortunately it stops looking when it reaches a MachineNode, thinking that everything beyond that point must be selected and hence OK (SelectionDAG.cpp:1863). That is an incorrect assumption in this case and a loopy SUB32rm is emitted. So the patch works by making sure there isn't a pre-selected MachineNode to find during this check. It would be nice to assume that MachineNodes only exist after selection, but if so, we've got quite a bit of refactoring in multiple backends to get there. I'll see if I can work out what the performance implications of doing a more thorough search here are. |
Fixed in r191165. Thanks Tim! |
mentioned in issue llvm/llvm-bugzilla-archive#15981 |
Extended Description
Reduced testcase attached. Compiling with "-target i386-unknown-freebsd10 -O2 -c gstatomicqueue-testcase.c" results in:
Assertion failed: (Node2Index[SU->NodeNum] > Node2Index[I->getSUnit()->NodeNum] && "Wrong topological sorting"), function InitDAGTopologicalSorting, file /share/dim/src/llvm/trunk/lib/CodeGen/ScheduleDAG.cpp, line 511.
Stack dump:
0. Program arguments: /home/dim/obj/llvm-180108M-trunk-freebsd10-i386-aconf-rel-1/Release+Asserts/bin/clang -cc1 -triple i386-unknown-freebsd10 -emit-obj -disable-free -main-file-name gstatomicqueue-testcase.c -mrelocation-model static -mdisable-fp-elim -masm-verbose -mconstructor-aliases -target-cpu i486 -target-linker-version 2.23.1 -coverage-file /share/dim/src/llvm/bugs/gstreamer/gstatomicqueue-testcase.o -resource-dir /home/dim/obj/llvm-180108M-trunk-freebsd10-i386-aconf-rel-1/Release+Asserts/bin/../lib/clang/3.3 -O2 -fdebug-compilation-dir /share/dim/src/llvm/bugs/gstreamer -ferror-limit 19 -fmessage-length 278 -mstackrealign -fobjc-runtime=gnustep -fobjc-default-synthesize-properties -fdiagnostics-show-option -fcolor-diagnostics -backend-option -vectorize-loops -o gstatomicqueue-testcase.o -x c gstatomicqueue-testcase.c
clang: error: unable to execute command: Abort trap (core dumped)
clang: error: clang frontend command failed due to signal (use -v to see invocation)
clang version 3.3 (trunk 180108)
Target: i386-unknown-freebsd10
Thread model: posix
The text was updated successfully, but these errors were encountered: