Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PPC] Diamond if-converter fails when two BB terminate in ret instead of branches. #36173

Closed
vchuravy opened this issue Mar 20, 2018 · 14 comments
Labels
bugzilla Issues migrated from bugzilla

Comments

@vchuravy
Copy link
Contributor

Bugzilla Link 36825
Resolution FIXED
Resolved on Nov 02, 2018 14:47
Version 6.0
OS Linux
CC @DimitryAndric,@hfinkel,@nemanjai,@vtjnash

Extended Description

Found while upgrading the Julia frontend to LLVM 6.0.

/raid1/home/ibm/valentin/src/julia/deps/srccache/llvm-6.0.0/include/llvm/ADT/ilist_iterator.h:140: llvm::ilist_iterator<OptionsT, IsReverse, IsConst>::reference llvm::ilist_iterator<OptionsT, IsReverse, IsConst>::operator*() const [with OptionsT = llvm::ilist_detail::node_options<llvm::MachineInstr, true, true, void>; bool IsReverse = false; bool IsConst = false; llvm::ilist_iterator<OptionsT, IsReverse, IsConst>::reference = llvm::MachineInstr&]: Assertion `!NodePtr->isKnownSentinel()' failed.

signal (6): Aborted
in expression starting at sysimg.jl:213
gsignal at /lib64/libc.so.6 (unknown line)
abort at /lib64/libc.so.6 (unknown line)
__assert_fail_base at /lib64/libc.so.6 (unknown line)
__assert_fail at /lib64/libc.so.6 (unknown line)
operator* at /raid1/home/ibm/valentin/src/julia/deps/srccache/llvm-6.0.0/include/llvm/ADT/ilist_iterator.h:140
operator* at /raid1/home/ibm/valentin/src/julia/deps/srccache/llvm-6.0.0/include/llvm/CodeGen/MachineInstrBundleIterator.h:134 [inlined]
operator* at /raid1/home/ibm/valentin/src/julia/deps/srccache/llvm-6.0.0/include/llvm/CodeGen/MachineInstrBundleIterator.h:179 [inlined]
operator-> at /raid1/home/ibm/valentin/src/julia/deps/srccache/llvm-6.0.0/include/llvm/CodeGen/MachineInstrBundleIterator.h:180 [inlined]
IfConvertDiamondCommon at /raid1/home/ibm/valentin/src/julia/deps/srccache/llvm-6.0.0/lib/CodeGen/IfConversion.cpp:1726
IfConvertDiamond at /raid1/home/ibm/valentin/src/julia/deps/srccache/llvm-6.0.0/lib/CodeGen/IfConversion.cpp:1910 [inlined]
runOnMachineFunction at /raid1/home/ibm/valentin/src/julia/deps/srccache/llvm-6.0.0/lib/CodeGen/IfConversion.cpp:461
runOnFunction at /raid1/home/ibm/valentin/src/julia/deps/srccache/llvm-6.0.0/lib/CodeGen/MachineFunctionPass.cpp:62
runOnFunction at /raid1/home/ibm/valentin/src/julia/deps/srccache/llvm-6.0.0/lib/IR/LegacyPassManager.cpp:1520
runOnModule at /raid1/home/ibm/valentin/src/julia/deps/srccache/llvm-6.0.0/lib/IR/LegacyPassManager.cpp:1541
run at /raid1/home/ibm/valentin/src/julia/deps/srccache/llvm-6.0.0/lib/IR/LegacyPassManager.cpp:1597
run at /raid1/home/ibm/valentin/src/julia/deps/srccache/llvm-6.0.0/lib/IR/LegacyPassManager.cpp:1731
operator() at /raid1/home/ibm/valentin/src/julia/src/jitlayers.cpp:466
Allocations: 43736398 (Pool: 43715287; Big: 21111); GC: 111

I am failing to isolate this failure for the last two weeks (also see http://lists.llvm.org/pipermail/llvm-dev/2018-March/121932.html)

I nailed it down to the function japi1__require_7687 (in gdb MF->getName) and I put the log online (124MB) https://drive.google.com/open?id=1Br0s9Qvr4tzPv8nqbnV_nWezpEH5Ci7B

The setup of the target machine is roughly equivalent to:

llc -O3 -fast-isel=0 -code-model=medium -mcpu=pwr8 -relocation-model=static -emulated-tls

and the last Julia specific pass in the log is LowerPTLS. So a pipeline starting from bool-ret-to-int should be able to re-trigger the assertion, but to no avail.

I compared the output of debug-pass=Arguments and the only difference between the pipeline used by llc and julia post bool-ret-to-int is:

llc: -basicaa -aa -loops -branch-prob -isel -machinedomtree -ppc-ctr-loops-verify -ppc-vsx-copy -expand-isel-pseudos
Julia: -basicaa -aa -loops -branch-prob -machinedomtree -ppc-ctr-loops-verify -ppc-vsx-copy -expand-isel-pseudos

Note the additional isel in the llc pipeline that is missing in Julia, but I can't find a reason why that would be true.

Some other breadcrumps:

Ifcvt: function (4486) 'japi1__require_7687'
Ifcvt (Diamond): %bb.38 (T:44,F:39) 

(gdb) p BBI.BB->dump()
%bb.38: derived from LLVM BB %L762
    Live Ins: %x23
    Predecessors according to CFG: %bb.37
        renamable %x3 = LD 504, %x31; mem:Volatile LD8[%"#temp#15"](dereferenceable) dbg:loading.jl:1024
        renamable %x4 = XORI8 renamable %x3, 1; dbg:loading.jl:1024
        renamable %cr0 = CMPLDI renamable %x3, 1; dbg:loading.jl:1024
        renamable %x4 = CNTLZD killed renamable %x4; dbg:loading.jl:1024
        renamable %x4 = RLDICL killed renamable %x4, 58, 63; dbg:loading.jl:1024
        STD killed renamable %x3, 1352, %x31; mem:ST8[%33] dbg:loading.jl:1024
        STB renamable %r4, 1351, %x31, implicit killed %x4; mem:ST1[%34] dbg:loading.jl:1024
        renamable %x3 = ADDIStocHA %x2, @"jl_global#10"; dbg:loading.jl:1024
        renamable %x3 = LDtocL @"jl_global#10", killed renamable %x3, implicit %x2; mem:LD8[GOT] dbg:loading.jl:1024
    Successors according to CFG: %bb.39(0x40000000 / 0x80000000 = 50.00%) %bb.44(0x40000000 / 0x80000000 = 50.00%)
@vchuravy
Copy link
Contributor Author

I should have added that this is with https://reviews.llvm.org/D43781 backported to 6.0.0

@vchuravy
Copy link
Contributor Author

Any suggestions/hints how I can isolate this bug or how I can help with debugging?

@nemanjai
Copy link
Member

Valentin, perhaps it would be good if you can provide instructions on how I can set up an environment similar to yours (what to download, build, run) and I can try to investigate as well.

It is definitely difficult to investigate when we can't reproduce it with llc, but hopefully we can make some sense out of it.

@vchuravy
Copy link
Contributor Author

Nemanja, that I can certainly do. Let me know if you run into any issues
The commands below setup Julia with llvm6 and a Release + DebugInfo build.

git clone https://github.com/JuliaLang/julia
cd julia
git checkout vc/ppc-llvm6
echo """
override LLVM_CMAKE_BUILDTYPE=RelWithDebInfo
override LLVM_BUILDTYPE=RelWithDebInfo+Asserts
LLVM_ASSERTIONS=1
FORCE_ASSERTIONS=1
""" > Make.user
make -j160 julia-deps

This will build all the necessary dependencies including LLVM

all the llvm tooling will be in ./usr/tools

You will want to apply the following patch before continuing:
diff --git a/src/codegen.cpp b/src/codegen.cpp
index 915474b..10cd522 100644
--- a/src/codegen.cpp
+++ b/src/codegen.cpp
@@ -6950,9 +6950,7 @@ extern "C" void *jl_init_llvm(void)
const char *const argv_copyprop[] = {"", "-disable-copyprop"}; // llvm bug 21743
cl::ParseCommandLineOptions(sizeof(argv_copyprop)/sizeof(argv_copyprop[0]), argv_copyprop, "disable-copyprop\n");
#endif
-#if defined(JL_DEBUG_BUILD) || defined(USE_POLLY)
cl::ParseEnvironmentOptions("Julia", "JULIA_LLVM_ARGS");
-#endif

So that you can pass JULIA_LLVM_ARGS in the environment to dump IR, etc.

make -j16 julia-base-compiler

This will compile the julia compiler and add a ./usr/lib/libjulia.so

which can be loaded by opt to get access to the Julia specific LLVM passes

make VERBOSE=1 julia-sysimg-release # towards the end of this you should hit the seqfault

@vchuravy
Copy link
Contributor Author

vchuravy commented Apr 3, 2018

reduced mir test-case
Run with:

llc -mtriple=powerpc64le-unknown-linux-gnu -fast-isel=0 -code-model=medium -mcpu=pwr8 -relocation-model=static -emulated-tls -run-pass=if-converter -debug-only=if-converter if-conversion.mir

@vchuravy
Copy link
Contributor Author

vchuravy commented Apr 3, 2018

Thanks to @​vtjnash we managed to pinpoint the failure and construct a test-case.

Upon closer inspects the two BB blocks that were deemed eligible for diamond if conversion terminated in return statements and not branches. But the if-converter pass looks expects that BB are terminated by branches.

@vchuravy
Copy link
Contributor Author

vchuravy commented Apr 3, 2018

test case for trunk
Since the MIR format changed between 6.0 and trunk I minimised and created a new test-case.

I also proposed a minimal fix at https://reviews.llvm.org/D45218

@nemanjai
Copy link
Member

nemanjai commented Apr 3, 2018

I don't think there's a really good reason to stop if-conversion when we've encountered a function return. The patch certainly causes a number of test cases to fail when I apply it and try it (at least one fails with a crash).

The problem in this case seems to be that the two blocks are identical. What will happen in-turn is that we will erase all the instructions out of both blocks. So this is certainly something we'd want to do.

I think we can fix this by noticing that the two blocks are empty after the splice/erase operations and just get rid of them. The below patch fixes the reduced test case for me and introduces no new failures.

Could you try this patch to see if it solves the problem:

Index: lib/CodeGen/IfConversion.cpp

--- lib/CodeGen/IfConversion.cpp (revision 328620)
+++ lib/CodeGen/IfConversion.cpp (working copy)
@@ -1728,6 +1728,8 @@ bool IfConverter::IfConvertDiamondCommon(
}
while (NumDups1 != 0) {
++DI2;

  • if (DI2 == MBB2.end())

  •  break;
    

    if (!DI2->isDebugValue())
    --NumDups1;
    }
    @@ -1762,6 +1764,12 @@ bool IfConverter::IfConvertDiamondCommon(
    }
    MBB1.erase(DI1, MBB1.end());

  • // Identical blocks, already moved all the instructions up.

  • if (MBB1.empty() && MBB2.empty()) {

  • BBI.BB->removeSuccessor(&MBB1, true);

  • BBI.BB->removeSuccessor(&MBB2, true);

  • return true;

  • }
    DI2 = BBI2->BB->end();
    // The branches have been checked to match. Skip over the branch in the false
    // block so that we don't try to predicate it.

@vchuravy
Copy link
Contributor Author

vchuravy commented Apr 4, 2018

You are right. I forgot to enable all the other targets in my build and didn't see the other test failures. Sorry about that.

Your approach is similar to my first attempt at mitigating this, but I didn't think of just eliminating the two BBs, but I think the patch would need to be:

diff --git a/lib/CodeGen/IfConversion.cpp b/lib/CodeGen/IfConversion.cpp
index a22ce0dab9c..527eeb01ddb 100644
--- a/lib/CodeGen/IfConversion.cpp
+++ b/lib/CodeGen/IfConversion.cpp
@@ -1723,11 +1723,15 @@ bool IfConverter::IfConvertDiamondCommon(
// Skip past the dups on each side separately since there may be
// differing dbg_value entries.
for (unsigned i = 0; i < NumDups1; ++DI1) {

  • if (DI1 == MBB1.end())

  •  break;
    

    if (!DI1->isDebugValue())
    ++i;
    }
    while (NumDups1 != 0) {
    ++DI2;

  • if (DI2 == MBB2.end())

  •  break;
    

    if (!DI2->isDebugValue())
    --NumDups1;
    }
    @@ -1762,6 +1766,12 @@ bool IfConverter::IfConvertDiamondCommon(
    }
    MBB1.erase(DI1, MBB1.end());

  • // Identical blocks, already moved all the instructions up.

  • if (MBB1.empty() && MBB2.empty()) {

  • BBI.BB->removeSuccessor(&MBB1, true);

  • BBI.BB->removeSuccessor(&MBB2, true);

  • return true;

  • }
    DI2 = BBI2->BB->end();
    // The branches have been checked to match. Skip over the branch in the false
    // block so that we don't try to predicate it.
    --
    2.16.3

Should I update my diff or do you want to commit it with an appropriate test-case?

@nemanjai
Copy link
Member

nemanjai commented Apr 4, 2018

If you don't mind updating, that would be great. I'd certainly like to get this reviewed.

@vchuravy
Copy link
Contributor Author

Fixed by https://reviews.llvm.org/rL330345

@DimitryAndric
Copy link
Collaborator

*** Bug #35355 has been marked as a duplicate of this bug. ***

@DimitryAndric
Copy link
Collaborator

*** Bug #29594 has been marked as a duplicate of this bug. ***

@DimitryAndric
Copy link
Collaborator

*** Bug #32860 has been marked as a duplicate of this bug. ***

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 10, 2021
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugzilla Issues migrated from bugzilla
Projects
None yet
Development

No branches or pull requests

3 participants