OOM crash in vectorizer (with late vectorization and relaxed aliasing) #18410
Comments
Due to size restrictions, even an xz --best compressed .cpp cannot be attached to this bug, so I have uploaded it to http://people.freebsd.org/~mandree/FTblockDN-4LhQBL.cpp.xz |
Just so there is some progress on this bug: I can reproduce it with the latest trunk (r195641). Clang dies because it runs out of memory very quickly. This can easily be seen by lowering the virtual memory ulimit to, say, 256M:
$ ulimit -v $((256*1024))
Program received signal SIGSEGV, Segmentation fault.
I'm currently attempting to minimize this testcase, but running creduce will take quite a while. There also seems to be a relation with bug 18037 (and bug 17525, maybe), but those can only be triggered by turning late vectorization off (which became the default with r189858). |
New test case, reduced from FTblockDN-4LhQBL.cpp:
clang -cc1 -triple i386-unknown-freebsd10.0 -emit-obj -relaxed-aliasing -target-cpu i486 -target-feature +sse -O3 -fcxx-exceptions -fexceptions -vectorize-loops -x c++ pr18036-1.cpp |
Ping. This still occurs with trunk r202496. |
A simple test case like this breaks with SSE (not >=SSE2).
release/Release+Asserts/bin/llc -mcpu=i686 -mattr=+sse t.ll -o -
target datalayout = "e-m:e-p:32:32-f64:32:64-f80:32-n8:16:32-S128"
define <4 x float> @foo(<4 x float>*%p, <4 x i32> %q) {
I would argue that this shows that < sse2 vector lowering is not well tested and we should disable vectorization on such platforms as we probably don't have the bandwidth to fix/test sse1 vector lowering.
As far as the test case goes we seem to run into an infinite ping pong of wanting to widen types to v4f32 and splitting v4integer types:
Widen node result 0: 0x7ff392802f10: v2f32 = vselect 0x7ff392802c10, 0x7ff391046410, 0x7ff391046610 [ORD=3] [ID=0]
Split node result: 0x7ff392803410: v4i1 = concat_vectors 0x7ff392802c10, 0x7ff391800110 [ORD=3] [ID=0]
Split node operand: 0x7ff392803510: v4f32 = vselect 0x7ff392803410, 0x7ff391045310, 0x7ff391045410 [ORD=3] [ID=0]
Split node result: 0x7ff392803610: v2i1 = extract_subvector 0x7ff392803410, 0x7ff391042010 [ORD=3] [ID=0]
Widen node result 0: 0x7ff392803810: v2f32 = vselect 0x7ff392803610, 0x7ff391046410, 0x7ff391046610 [ORD=3] [ID=0]
Split node result: 0x7ff392003810: v4i1 = concat_vectors 0x7ff392803610, 0x7ff391800110 [ORD=3] [ID=0]
Split node operand: 0x7ff392003910: v4f32 = vselect 0x7ff392003810, 0x7ff391045310, 0x7ff391045410 [ORD=3] [ID=0]
Split node result: 0x7ff392003a10: v2i1 = extract_subvector 0x7ff392003810, 0x7ff391042010 [ORD=3] [ID=0]
Widen node result 0: 0x7ff392003c10: v2f32 = vselect 0x7ff392003a10, 0x7ff391046410, 0x7ff391046610 [ORD=3] [ID=0]
Split node result: 0x7ff392004110: v4i1 = concat_vectors 0x7ff392003a10, 0x7ff391800110 [ORD=3] [ID=0]
Split node operand: 0x7ff392004210: v4f32 = vselect 0x7ff392004110, 0x7ff391045310, 0x7ff391045410 [ORD=3] [ID=0]
Split node result: 0x7ff392004310: v2i1 = extract_subvector 0x7ff392004110, 0x7ff391042010 [ORD=3] [ID=0]
Widen node result 0: 0x7ff392004510: v2f32 = vselect 0x7ff392004310, 0x7ff391046410, 0x7ff391046610 [ORD=3] [ID=0]
Split node result: 0x7ff392004b10: v4i1 = concat_vectors 0x7ff392004310, 0x7ff391800110 [ORD=3] [ID=0]
Split node operand: 0x7ff392004c10: v4f32 = vselect 0x7ff392004b10, 0x7ff391045310, 0x7ff391045410 [ORD=3] [ID=0]
Split node result: 0x7ff392004d10: v2i1 = extract_subvector 0x7ff392004b10, 0x7ff391042010 [ORD=3] [ID=0]
Widen node result 0: 0x7ff392004f10: v2f32 = vselect 0x7ff392004d10, 0x7ff391046410, 0x7ff391046610 [ORD=3] [ID=0]
Split node result: 0x7ff392005410: v4i1 = concat_vectors 0x7ff392004d10, 0x7ff391800110 [ORD=3] [ID=0]
Split node operand: 0x7ff392005510: v4f32 = vselect 0x7ff392005410, 0x7ff391045310, 0x7ff391045410 [ORD=3] [ID=0]
Split node result: 0x7ff392005610: v2i1 = extract_subvector 0x7ff392005410, 0x7ff391042010 [ORD=3] [ID=0] |
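To make the ping-pong in the log above easier to follow, here is a toy model of it. This is only a sketch under assumptions: `ToyNode`, `legalizeStep`, and `runLegalizer` are hypothetical names, not LLVM APIs, and the two rules are simplified stand-ins for the "Widen node result" and "Split node" steps seen in the trace (a vselect producing v2f32 is widened to v4f32 with a v4i1 mask, and the v4i1 mask is then split, which brings back the v2f32 vselect). Real type legalization keeps allocating fresh nodes on each round, which would be consistent with the memory exhaustion reported here; the toy version instead records visited states so that it can stop and report the cycle.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <utility>

// Toy model only: a "node" is just (result type, mask type) of a vselect.
using ToyNode = std::pair<std::string, std::string>;

enum class Outcome { Legal, Cycle };

// Apply one hypothetical legalization rule. Returns true if the node changed.
bool legalizeStep(ToyNode &node) {
    if (node.first == "v2f32") {
        // "Widen": the v2f32 result becomes v4f32, and the mask becomes v4i1.
        node = {"v4f32", "v4i1"};
        return true;
    }
    if (node.second == "v4i1") {
        // "Split": the v4i1 mask is halved, dragging the vselect back to v2f32.
        node = {"v2f32", "v2i1"};
        return true;
    }
    return false; // Nothing to do: treated as legal in this toy model.
}

// Run rules until the node is legal or a previously seen state reappears.
// `steps` receives the number of rule applications performed.
Outcome runLegalizer(ToyNode node, std::size_t &steps) {
    std::set<ToyNode> seen;
    steps = 0;
    seen.insert(node);
    while (legalizeStep(node)) {
        ++steps;
        if (!seen.insert(node).second)
            return Outcome::Cycle; // Same state again: the rules never converge.
    }
    return Outcome::Legal;
}
```

Starting from `{"v2f32", "v2i1"}`, the model returns to its starting state after one widen and one split, which corresponds to the repeating four-line pattern in the trace.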
I agree. Conformance before performance. Interested parties should get buildbots running green and keep them that way, as with any other major platform. |
I committed a fix for the type legalization issue in r203311. |
*** Bug llvm/llvm-bugzilla-archive#19283 has been marked as a duplicate of this bug. *** |
Forgot to mark this bug as fixed by r203311. |
mentioned in issue llvm/llvm-bugzilla-archive#19283 |
Extended Description
This happens with the base FreeBSD 10-STABLE clang,
FreeBSD clang version 3.3 (tags/RELEASE_33/final 183502) 20130610
Target: i386-unknown-freebsd10.0
Thread model: posix
Relevant trace files attached.
This happens while compiling rawtherapee, an open source Raw photo processor.