OOM crash in vectorizer (with late vectorization and relaxed aliasing) #18410

Closed
llvmbot opened this issue Nov 23, 2013 · 10 comments
Labels
bugzilla Issues migrated from bugzilla clang:frontend Language frontend issues, e.g. anything involving "Sema"

Comments

@llvmbot
Collaborator

llvmbot commented Nov 23, 2013

Bugzilla Link: 18036
Resolution: FIXED
Resolved on: Mar 30, 2014 09:53
Version: 3.3
OS: FreeBSD
Attachments: run script
Reporter: LLVM Bugzilla Contributor
CC: @DimitryAndric, @hfinkel, @jsonn, @rengolin

Extended Description

This happens with the base FreeBSD 10-STABLE clang,

FreeBSD clang version 3.3 (tags/RELEASE_33/final 183502) 2013061
Target: i386-unknown-freebsd10.0
Thread model: posix

Relevant trace files attached.

This happens while compiling rawtherapee, an open source Raw photo processor.

@llvmbot
Collaborator Author

llvmbot commented Nov 23, 2013

Due to size restrictions, even an xz --best compressed .cpp file cannot be attached to this bug, so I have uploaded it to http://people.freebsd.org/~mandree/FTblockDN-4LhQBL.cpp.xz

@DimitryAndric
Collaborator

Just so there is some progress on this bug: I can reproduce it with the latest trunk (r195641). Clang dies because it runs out of memory very quickly. This can easily be seen by lowering the virtual memory ulimit to, say, 256M:

$ ulimit -v $((256*1024))
$ gdb --args ~/obj/llvm-195641-trunk-freebsd11-i386-ninja-rel-1/bin/clang -cc1 -triple i386-unknown-freebsd10.0 -emit-obj -disable-free -disable-llvm-verifier -main-file-name FTblockDN.cc -mrelocation-model pic -pic-level 2 -mdisable-fp-elim -relaxed-aliasing -menable-no-infs -menable-no-nans -menable-unsafe-fp-math -ffp-contract=fast -ffast-math -masm-verbose -mconstructor-aliases -target-cpu i486 -target-feature +sse -D "BZIP_SUPPORT" -D "MYFILE_MMAP" -D "NDEBUG" -D "_DNDEBUG" -O3 -fdeprecated-macro -ferror-limit 19 -fmessage-length 0 -funroll-loops -mstackrealign -fobjc-runtime=gnustep -fcxx-exceptions -fexceptions -fdiagnostics-show-option -vectorize-loops -x c++ FTblockDN-4LhQBL.cpp
GNU gdb (GDB) 7.6.1 [GDB v7.6.1 for FreeBSD]
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i386-portbld-freebsd11.0".
For bug reporting instructions, please see:
http://www.gnu.org/software/gdb/bugs/...
Reading symbols from /home/dim/obj/llvm-195641-trunk-freebsd11-i386-ninja-rel-1/bin/clang-3.5...(no debugging symbols found)...done.
(gdb) run
Starting program: /home/dim/obj/llvm-195641-trunk-freebsd11-i386-ninja-rel-1/bin/clang -cc1 -triple i386-unknown-freebsd10.0 -emit-obj -disable-free -disable-llvm-verifier -main-file-name FTblockDN.cc -mrelocation-model pic -pic-level 2 -mdisable-fp-elim -relaxed-aliasing -menable-no-infs -menable-no-nans -menable-unsafe-fp-math -ffp-contract=fast -ffast-math -masm-verbose -mconstructor-aliases -target-cpu i486 -target-feature +sse -D BZIP_SUPPORT -D MYFILE_MMAP -D NDEBUG -D _DNDEBUG -O3 -fdeprecated-macro -ferror-limit 19 -fmessage-length 0 -funroll-loops -mstackrealign -fobjc-runtime=gnustep -fcxx-exceptions -fexceptions -fdiagnostics-show-option -vectorize-loops -x c++ FTblockDN-4LhQBL.cpp
[New LWP 100281]
[New Thread 2a803080 (LWP 100281)]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 2a803080 (LWP 100281)]
0x08d0ccb0 in llvm::MallocSlabAllocator::Allocate(unsigned int) ()
(gdb) bt
#0 0x08d0ccb0 in llvm::MallocSlabAllocator::Allocate(unsigned int) ()
#1 0x08d0c997 in llvm::BumpPtrAllocator::Allocate(unsigned int, unsigned int) ()
#2 0x087087cf in llvm::SelectionDAG::getNode(unsigned int, llvm::SDLoc, llvm::EVT, llvm::SDValue, llvm::SDValue, llvm::SDValue) ()
#3 0x087c02cd in llvm::DAGTypeLegalizer::SplitVecOp_VSELECT(llvm::SDNode*, unsigned int) ()
#4 0x087bd6d0 in llvm::DAGTypeLegalizer::SplitVectorOperand(llvm::SDNode*, unsigned int) ()
#5 0x087a7f8c in llvm::DAGTypeLegalizer::run() ()
#6 0x087ac706 in llvm::SelectionDAG::LegalizeTypes() ()
#7 0x0875ee06 in llvm::SelectionDAGISel::CodeGenAndEmitDAG() ()
#8 0x0875eadc in llvm::SelectionDAGISel::SelectAllBasicBlocks(llvm::Function const&) ()
#9 0x0875d76a in llvm::SelectionDAGISel::runOnMachineFunction(llvm::MachineFunction&) ()
#10 0x088c4669 in llvm::MachineFunctionPass::runOnFunction(llvm::Function&) ()
#11 0x08c75ae8 in llvm::FPPassManager::runOnFunction(llvm::Function&) ()
#12 0x08c75cdf in llvm::FPPassManager::runOnModule(llvm::Module&) ()
#13 0x08c76120 in llvm::legacy::PassManagerImpl::run(llvm::Module&) ()
#14 0x08c766db in llvm::legacy::PassManager::run(llvm::Module&) ()
#15 0x08d6c103 in clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::Module*, clang::BackendAction, llvm::raw_ostream*) ()
#16 0x08d696a9 in clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) ()
#17 0x08fea061 in clang::ParseAST(clang::Sema&, bool, bool) ()
#18 0x08f47ed4 in clang::ASTFrontendAction::ExecuteAction() ()
#19 0x08d68bd0 in clang::CodeGenAction::ExecuteAction() ()
#20 0x08f47add in clang::FrontendAction::Execute() ()
#21 0x08f28d68 in clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) ()
#22 0x08d53609 in clang::ExecuteCompilerInvocation(clang::CompilerInstance*) ()
#23 0x08317346 in cc1_main(char const**, char const**, char const*, void*) ()
#24 0x083162d1 in main ()

I'm currently attempting to minimize this testcase, but running creduce will take quite a while.

There also seems to be a relation with bug 18037 (and maybe bug 17525), but those can only be triggered by turning late vectorization off (which became the default with r189858).
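
For anyone who wants to script the memory-capped reproduction instead of typing `ulimit -v` by hand, here is a minimal sketch in C++. It assumes a POSIX system with `RLIMIT_AS`; the helper name `run_with_memory_cap` and its return-value convention are made up for this illustration and are not part of clang or any existing tool.

```cpp
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: run a command with its address space capped at
// capBytes, roughly what "ulimit -v" does for a shell's children.
// Returns the exit code, 128 + signal number if the child was killed by a
// signal (e.g. SIGSEGV on a failed allocation), or -1 on fork/wait failure.
int run_with_memory_cap(const std::vector<std::string> &args,
                        std::size_t capBytes) {
  if (args.empty())
    return -1;

  // Build the argv array before forking so the child only has to exec.
  std::vector<char *> argv;
  for (const std::string &a : args)
    argv.push_back(const_cast<char *>(a.c_str()));
  argv.push_back(nullptr);

  pid_t pid = fork();
  if (pid < 0)
    return -1;

  if (pid == 0) {
    // Child: lower the limit, then replace the process image.
    rlimit lim;
    lim.rlim_cur = static_cast<rlim_t>(capBytes);
    lim.rlim_max = static_cast<rlim_t>(capBytes);
    if (setrlimit(RLIMIT_AS, &lim) != 0)
      _exit(126);
    execvp(argv[0], argv.data());
    _exit(127); // exec failed
  }

  int status = 0;
  if (waitpid(pid, &status, 0) != pid)
    return -1;
  if (WIFEXITED(status))
    return WEXITSTATUS(status);
  if (WIFSIGNALED(status))
    return 128 + WTERMSIG(status);
  return -1;
}
```

Note that `ulimit -v` takes KiB while `setrlimit` takes bytes, so the 256M limit above corresponds to `256u * 1024 * 1024` here. With such a cap, a runaway compile should fail quickly instead of pushing the machine into swap.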

@DimitryAndric
Collaborator

New test case, reduced from FTblockDN-4LhQBL.cpp
Here is a new test case, reduced to 25 lines by creduce (nice job!). The problem can be triggered with:

clang -cc1 -triple i386-unknown-freebsd10.0 -emit-obj -relaxed-aliasing -target-cpu i486 -target-feature +sse -O3 -fcxx-exceptions -fexceptions -vectorize-loops -x c++ pr18036-1.cpp

@DimitryAndric
Collaborator

Ping. This still occurs with trunk r202496.

@llvmbot
Collaborator Author

llvmbot commented Mar 5, 2014

A simple test case like this breaks when only SSE1 is available (that is, without SSE2 or later).

release/Release+Asserts/bin/llc -mcpu=i686 -mattr=+sse t.ll -o -

target datalayout = "e-m:e-p:32:32-f64:32:64-f80:32-n8:16:32-S128"
target triple = "i386-unknown-freebsd10.0"

define <4 x float> @foo(<4 x float>* %p, <4 x i32> %q) {
entry:
%a1 = icmp eq <4 x i32> %q, zeroinitializer
%a18 = select <4 x i1> %a1, <4 x float> <float 1.000000e+00, float 0.000000e+00, float 1.000000e+00, float 1.000000e+0> , <4 x float> zeroinitializer
ret <4 x float> %a18
}
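
For readers more at home in C++ than in IR, the following is a rough sketch of the kind of scalar loop that a loop vectorizer may turn into an integer vector compare feeding a float vector select, as in the IR above. It is a hypothetical stand-in written for illustration: it is not the reduced pr18036-1.cpp attachment, and the function name is invented.

```cpp
#include <cstddef>

// Hypothetical illustration only, not the reporter's code.
// Each element is chosen by an integer comparison but has a float result,
// so a 4-wide vectorized form compares <4 x i32> and selects <4 x float>.
void select_one_or_zero(float *out, const int *q, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    out[i] = (q[i] == 0) ? 1.0f : 0.0f;
}
```

The relevant point is the mix of types: the condition comes from integer lanes while the selected values are float lanes, which is where plain SSE (float vectors but no integer vector support) gets into trouble.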

I would argue that this shows that < sse2 vector lowering is not well tested and we should disable vectorization on such platforms as we probably don't have the bandwidth to fix/test sse1 vector lowering.

As far as the test case goes, we seem to run into an infinite ping-pong between widening types to v4f32 and splitting v4 integer types:

Widen node result 0: 0x7ff392802f10: v2f32 = vselect 0x7ff392802c10, 0x7ff391046410, 0x7ff391046610 [ORD=3] [ID=0]

Split node result: 0x7ff392803410: v4i1 = concat_vectors 0x7ff392802c10, 0x7ff391800110 [ORD=3] [ID=0]

Split node operand: 0x7ff392803510: v4f32 = vselect 0x7ff392803410, 0x7ff391045310, 0x7ff391045410 [ORD=3] [ID=0]

Split node result: 0x7ff392803610: v2i1 = extract_subvector 0x7ff392803410, 0x7ff391042010 [ORD=3] [ID=0]

Widen node result 0: 0x7ff392803810: v2f32 = vselect 0x7ff392803610, 0x7ff391046410, 0x7ff391046610 [ORD=3] [ID=0]

Split node result: 0x7ff392003810: v4i1 = concat_vectors 0x7ff392803610, 0x7ff391800110 [ORD=3] [ID=0]

Split node operand: 0x7ff392003910: v4f32 = vselect 0x7ff392003810, 0x7ff391045310, 0x7ff391045410 [ORD=3] [ID=0]

Split node result: 0x7ff392003a10: v2i1 = extract_subvector 0x7ff392003810, 0x7ff391042010 [ORD=3] [ID=0]

Widen node result 0: 0x7ff392003c10: v2f32 = vselect 0x7ff392003a10, 0x7ff391046410, 0x7ff391046610 [ORD=3] [ID=0]

Split node result: 0x7ff392004110: v4i1 = concat_vectors 0x7ff392003a10, 0x7ff391800110 [ORD=3] [ID=0]

Split node operand: 0x7ff392004210: v4f32 = vselect 0x7ff392004110, 0x7ff391045310, 0x7ff391045410 [ORD=3] [ID=0]

Split node result: 0x7ff392004310: v2i1 = extract_subvector 0x7ff392004110, 0x7ff391042010 [ORD=3] [ID=0]

Widen node result 0: 0x7ff392004510: v2f32 = vselect 0x7ff392004310, 0x7ff391046410, 0x7ff391046610 [ORD=3] [ID=0]

Split node result: 0x7ff392004b10: v4i1 = concat_vectors 0x7ff392004310, 0x7ff391800110 [ORD=3] [ID=0]

Split node operand: 0x7ff392004c10: v4f32 = vselect 0x7ff392004b10, 0x7ff391045310, 0x7ff391045410 [ORD=3] [ID=0]

Split node result: 0x7ff392004d10: v2i1 = extract_subvector 0x7ff392004b10, 0x7ff391042010 [ORD=3] [ID=0]

Widen node result 0: 0x7ff392004f10: v2f32 = vselect 0x7ff392004d10, 0x7ff391046410, 0x7ff391046610 [ORD=3] [ID=0]

Split node result: 0x7ff392005410: v4i1 = concat_vectors 0x7ff392004d10, 0x7ff391800110 [ORD=3] [ID=0]

Split node operand: 0x7ff392005510: v4f32 = vselect 0x7ff392005410, 0x7ff391045310, 0x7ff391045410 [ORD=3] [ID=0]

Split node result: 0x7ff392005610: v2i1 = extract_subvector 0x7ff392005410, 0x7ff391042010 [ORD=3] [ID=0]
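
The cycle in the log above can be modelled with a toy worklist, sketched below in C++. This is not LLVM code: `ToyVSelect`, `classify`, and `legalize` are invented names, and the two rules are assumptions that merely mirror the log (a 2-lane vselect gets its result widened, a 4-lane vselect gets its mask operand split). The sketch shows why the legalizer never reaches a fixed point and keeps creating nodes until memory runs out.

```cpp
#include <cstddef>

// Toy stand-in for a vselect node, described only by its lane count.
struct ToyVSelect {
  unsigned lanes;
};

enum class Action { Legal, WidenResult, SplitOperand };

// Assumed policy mirroring the log: v2f32 results are widened to v4f32,
// but the v4i1 mask of a 4-lane vselect has to be split again.
Action classify(const ToyVSelect &n) {
  if (n.lanes == 2)
    return Action::WidenResult;
  if (n.lanes == 4)
    return Action::SplitOperand;
  return Action::Legal;
}

struct Outcome {
  bool converged;           // true if a legal node was reached
  std::size_t nodesCreated; // each rewrite allocates one replacement node
};

// Apply the rules until the node is legal or maxSteps rewrites were done.
// The step cap is what the real code lacked in effect: without it the
// widen/split pair below alternates forever.
Outcome legalize(ToyVSelect n, std::size_t maxSteps) {
  std::size_t created = 0;
  for (std::size_t step = 0; step < maxSteps; ++step) {
    switch (classify(n)) {
    case Action::Legal:
      return {true, created};
    case Action::WidenResult:
      n.lanes *= 2; // v2 -> v4
      break;
    case Action::SplitOperand:
      n.lanes /= 2; // v4 -> v2
      break;
    }
    ++created;
  }
  return {false, created};
}
```

In this model, `legalize(ToyVSelect{2u}, 1000)` uses up all 1000 steps without converging, and every step stands for a freshly allocated node, which matches the steadily increasing node addresses in the log and the allocator frames at the top of the backtrace.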

@rengolin
Member

rengolin commented Mar 5, 2014

I would argue that this shows that < sse2 vector lowering is not well tested and we should disable vectorization on such platforms as we probably don't have the bandwidth to fix/test sse1 vector lowering.

I agree. Conformance before performance.

Interested parties should get buildbots running green and keep them that way, as with any other major platform.

@llvmbot
Collaborator Author

llvmbot commented Mar 8, 2014

I committed a fix for the type legalization issue in r203311.

@DimitryAndric
Collaborator

*** Bug llvm/llvm-bugzilla-archive#19283 has been marked as a duplicate of this bug. ***

@DimitryAndric
Collaborator

Forgot to mark this bug as fixed by r203311.

@DimitryAndric
Collaborator

mentioned in issue llvm/llvm-bugzilla-archive#19283

@llvmbot transferred this issue from llvm/llvm-bugzilla-archive on Dec 9, 2021
This issue was closed.