Building clang 8.0.0rc1 for fedora using gcc 9 compiles file, but validation fails in ~150 places. When building with -O0, everything validates fine. When building with -O1, everything validates fine. When building with -O2, validation fails. When building with -O2 -fsanitize=address, everything validates fine. Running clang under valgrind gives good hint on the error location: valgrind /builddir/build/BUILD/cfe-8.0.0rc1.src/_build/bin/clang -cc1 -internal-isystem /builddir/build/BUILD/cfe-8.0.0rc1.src/_build/lib64/clang/8.0.0/include -nostdsysteminc -fopenmp -x c++ -triple powerpc64le-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm /builddir/build/BUILD/cfe-8.0.0rc1.src/test/OpenMP/nvptx_declare_target_var_ctor_dtor_codegen.cpp -o - ==2582== Conditional jump or move depends on uninitialised value(s) ==2582== at 0x880DC79: clang::CodeGen::CodeGenModule::ConstructAttributeList(llvm::StringRef, clang::CodeGen::CGFunctionInfo const&, clang::CodeGen::CGCalleeInfo, llvm::AttributeList&, unsigned int&, bool) (CGCall.cpp:2060) ==2582== by 0x8A1E749: clang::CodeGen::CodeGenModule::SetLLVMFunctionAttributes(clang::GlobalDecl, clang::CodeGen::CGFunctionInfo const&, llvm::Function*) (CodeGenModule.cpp:1171) ==2582== by 0x8A238AA: clang::CodeGen::CodeGenModule::SetFunctionAttributes(clang::GlobalDecl, llvm::Function*, bool, bool) (CodeGenModule.cpp:1548) ==2582== by 0x8A5CA16: clang::CodeGen::CodeGenModule::GetOrCreateLLVMFunction(llvm::StringRef, llvm::Type*, clang::GlobalDecl, bool, bool, bool, llvm::AttributeList, clang::CodeGen::ForDefinition_t) (CodeGenModule.cpp:2824) ==2582== by 0x87F7FAC: clang::CodeGen::CodeGenModule::getAddrOfCXXStructor(clang::CXXMethodDecl const*, clang::CodeGen::StructorType, clang::CodeGen::CGFunctionInfo const*, llvm::FunctionType*, bool, clang::CodeGen::ForDefinition_t) (CGCXX.cpp:253) ==2582== by 0x882255E: clang::CodeGen::CodeGenFunction::EmitCXXConstructorCall(clang::CXXConstructorDecl const*, clang::CXXCtorType, bool, bool, clang::CodeGen::Address, clang::CodeGen::CallArgList&, clang::CodeGen::AggValueSlot::Overlap_t, clang::SourceLocation, bool) (CGClass.cpp:2134) ==2582== by 0x882343C: clang::CodeGen::CodeGenFunction::EmitCXXConstructorCall(clang::CXXConstructorDecl const*, clang::CXXCtorType, bool, bool, clang::CodeGen::Address, clang::CXXConstructExpr const*, clang::CodeGen::AggValueSlot::Overlap_t, bool) (CGClass.cpp:2052) ==2582== by 0x88B1B79: clang::CodeGen::CodeGenFunction::EmitCXXConstructEx (...)
The problem lies in the initialization of the bitfield members of ABIArgInfo. Rewriting the default constructor to properly set each member fixes the error.
+hans for release bugs.
Fix available in https://reviews.llvm.org/D57523
(In reply to sguelton from comment #0) > ==2582== Conditional jump or move depends on uninitialised value(s) > ==2582== at 0x880DC79: > clang::CodeGen::CodeGenModule::ConstructAttributeList(llvm::StringRef, > clang::CodeGen::CGFunctionInfo const&, clang::CodeGen::CGCalleeInfo, > llvm::AttributeList&, unsigned int&, bool) (CGCall.cpp:2060) CGCall.cpp:2060 at 8.0.0-rc1 is this line: else if (AI.getInReg()) that returns the InReg bitfield. But that bitfield does get initialized (to false) in both of ABIArgInfo's constructors. So I don't understand what the error is here exactly?
Can you provide more information about exactly what gcc version you're using (I believe 9 doesn't even have a release branch yet), how your llvm build is configured, and in what way "validation" fails, i.e. an example of a test that's failing.
Sure. GCC version is from the gcc-9 branch, that's the one used on fedora rawhide: svn export svn://gcc.gnu.org/svn/gcc/branches/redhat/gcc-9-branch@268371 The llvm build uses RelWithDebInfo You can get the whole setup here: https://koji.fedoraproject.org/koji/taskinfo?taskID=32372028
EDIT: the above link is for a rebuild of 7.01, but I had the same issue with 8.0.0rc1.
(In reply to sguelton from comment #6) > Sure. > > GCC version is from the gcc-9 branch, that's the one used on fedora rawhide: > > svn export svn://gcc.gnu.org/svn/gcc/branches/redhat/gcc-9-branch@268371 > > The llvm build uses RelWithDebInfo > > You can get the whole setup here: > > https://koji.fedoraproject.org/koji/taskinfo?taskID=32372028 Perfect, I'll take a look.
@hans: same problem when rebuilding from master with the following setup: gcc built from svn export svn://gcc.gnu.org/svn/gcc/branches/redhat/gcc-9-branch@268371 llvm master CXXFLAGS -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection CFLAGS=CXXFLAGS cmake options: ../llvm-project/llvm -DCMAKE_BUILD_TYPE=RelWithDebInfo -DLLVM_ENABLE_PROJECTS=clang -DCMAKE_CXX_COMPILER=/opt/notnfs/sergesanspaille/install/bin/g++ -DCMAKE_C_COMPILER=/opt/notnfs/sergesanspaille/install/bin/gcc
I was able to reproduce this with the specified gcc version, except I only got four test failures: Failing Tests (4): Clang :: CodeGen/ppc64-extend.c Clang :: CodeGen/ppc64-inline-asm.c Clang :: CodeGenCXX/mips-size_t-ptrdiff_t.cpp Clang :: Driver/le32-unknown-nacl.cpp I'm still not sure what's going on though. Maybe the next step is to try bisecting gcc.
> I'm still not sure what's going on though. Maybe the next step is to try > bisecting gcc. Does the redhat/gcc-9-branch branch exist in gcc's git mirror?
(In reply to Hans Wennborg from comment #11) > > I'm still not sure what's going on though. Maybe the next step is to try > > bisecting gcc. > > Does the redhat/gcc-9-branch branch exist in gcc's git mirror? Never mind, I can reproduce with trunk gcc.
My bisection last night failed, and I'll probably not have time to try again today, so if someone else wants to give it a try, please do so.
I think my previous bisection failed because the test failure isn't completely deterministic. The same test doesn't fail each time. I tried bisecting again, building gcc, then clang, then running check-all at each step, and it points to this: --- 75a6f948423540b728611660963d1740440d87eb is the first bad commit commit 75a6f948423540b728611660963d1740440d87eb Author: hubicka <hubicka@138bc75d-0d04-0410-961f-82ee72b054a4> Date: Sat Jan 12 18:18:23 2019 +0000 * params.def (inline-unit-growth): Set to 40. git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@267883 138bc75d-0d04-0410-961f-82ee72b054a4 :040000 040000 73c34e4b0fc97de49ad228c609d495461e496b90 5882541f2e6fbcd712fd641f535238b4856ff045 M gcc --- That doesn't seem very interesting. If it is indeed the culprit, I guess it only uncovered some already-existing problem :-/
I can't really spend more time on this. Landing https://reviews.llvm.org/D57523 seems like the way to go.
Patch merged, thanks Hans!
Looks like it was reverted in r354549. Let's keep this open until it sticks and also makes it to the release_80 branch.
Serge, do you have any ideas for how to do this without breaking various GCC versions? :-)
Not quite :-) As gcc 9 is not officially out yet, as of https://gcc.gnu.org/releases.html at least, I guess it's okay to still do the release, and eventually backport a patch if needed?
(In reply to sguelton from comment #19) > Not quite :-) As gcc 9 is not officially out yet, as of > https://gcc.gnu.org/releases.html at least, I guess it's okay to still do > the release, and eventually backport a patch if needed? As we have an idea of how to work around it, it would still be nice to fix before release if possible. Of course it would be even better if GCC was fixed before released. Do you know if there's a bug tracking this on the GCC side?
> Of course it would be even better if GCC was fixed before released. Do you know if there's a bug tracking this on the GCC side? Ping?
I'll give it another try today :-)
Unblocking the release from this.
On Ubuntu 18.08, I'm seeing the following Failing Tests (5): Clang :: CodeGen/arm-fp16-arguments.c Clang :: CodeGen/lanai-regparm.c caused by https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=267883 * params.def (inline-unit-growth): Set to 40. Clang :: CodeGen/ppc64-complex-parms.c Clang :: CodeGen/ppc64-complex-return.c Clang :: CodeGen/ppc64-vector.c caused by https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=268448 * parms.def (MAX_INLINE_INSNS_SINGLE): Reduce from 400 to 200. But these changes only caused failures in clang 7+. Bisecting clan on Clang :: CodeGen/ppc64-complex-parms.c Clang :: CodeGen/ppc64-complex-return.c Clang :: CodeGen/ppc64-vector.c I get: http://llvm.org/viewvc/llvm-project?view=revision&revision=322396 Refactor handling of signext/zeroext in ABIArgInfo Alex Bradbury, Could you take a look at this to see why gcc is over optimizing your code after r268448?
*** Bug 42927 has been marked as a duplicate of this bug. ***
I'm still getting around 129 test failures when building clang with gcc-9 on Fedora 31.
Also, re-applying https://reviews.llvm.org/rL354546 fixes the failures for me.
The patch changing inliner metrics should not trigger wrong code, so indeed it probably just uncovers latent problem. Is it understood whether this is a GCC bug and if so, do you have a testcase? Note that I have also run into misoptimization issues while compiling clang with GCC 9 and FDO. I had no chance to debug it, but it was crashing after parsing even quite simple source file. It would be great to figure out where the problem is. To compile with profile feedback you need to pass -fprofile-generate in addition to standard flags and build whole tree, then train by testsuite and then rebuilding with -fprofile-use.
(In reply to Tom Stellard from comment #27) > Also, re-applying https://reviews.llvm.org/rL354546 fixes the failures for > me. I had trouble re-applying that cleanly due to the changes in https://reviews.llvm.org/D60349 and created the attached patch. I'm not a programmer or coder, please verify the attached patch.
Created attachment 22381 [details] adjusted patch to apply against latest trunk
One more data point. I bisected Clang against: Clang :: CodeGen/arm-fp16-arguments.c Clang :: CodeGen/lanai-regparm.c which pointed to: http://llvm.org/viewvc/llvm-project?view=revision&revision=337514 Reapply "ADT: Shrink size of SmallVector by 8B on 64-bit platforms" Duncan, This commit appears to be be causing gcc to over optimize.
(In reply to Luke from comment #31) > One more data point. I bisected Clang against: > Clang :: CodeGen/arm-fp16-arguments.c > Clang :: CodeGen/lanai-regparm.c > > which pointed to: > http://llvm.org/viewvc/llvm-project?view=revision&revision=337514 > Reapply "ADT: Shrink size of SmallVector by 8B on 64-bit platforms" > > Duncan, > This commit appears to be be causing gcc to over optimize. Interesting! It caused problems in very old (unsupported) versions of clang too, likely due to a bug in the AArch64 backend. I won’t be able to investigate GCC to figure out the bug, but let me know if you want more details about what that commit did. I suspect it’s a recent regression in GCC since that commit landed over a year ago in Clang.
I took one last stab at bisecting gcc. This time rather trace back a current failure, I bisected against the first validation failures, which yielded: https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=261090 * c-cppbuiltin.c (c_cpp_builtins): Bump __cpp_deduction_guides to 201703
It would be nice to get to the bottom of this, but I don't think we can block the release on it.
I made another attempt to bisect gcc. Rather target a specific failure, this time I bisected against any validation failure. This points to https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=261089 gimple-ssa-store-merging.c: Include gimple-fold.h. While subsequent gcc tweaks resulted in different tests failing, this commit appears to be the genesis of all of the failures.
(In reply to Luke from comment #35) > I made another attempt to bisect gcc. Rather target a specific failure, this > time I bisected against any validation failure. This points to > https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=261089 > gimple-ssa-store-merging.c: Include gimple-fold.h. > > While subsequent gcc tweaks resulted in different tests failing, this commit > appears to be the genesis of all of the failures. Have you filed a gcc bug for this?
(In reply to Tom Stellard from comment #36) > Have you filed a gcc bug for this? Done. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91758 Please add any useful information that I may have left out, as I am neither a clang nor gcc developer.
Ok, I've got a patch. It's an issue in clang where ABIArgInfo constructor does not initialize some of the bit fields and then we end up with usage of an uninitialized variables. Following patch should fix it: diff --git a/clang/include/clang/CodeGen/CGFunctionInfo.h b/clang/include/clang/CodeGen/CGFunctionInfo.h index 1f81072e23d..ec2567555a3 100644 --- a/clang/include/clang/CodeGen/CGFunctionInfo.h +++ b/clang/include/clang/CodeGen/CGFunctionInfo.h @@ -109,14 +109,11 @@ private: UnpaddedCoerceAndExpandType = T; } - ABIArgInfo(Kind K) - : TheKind(K), PaddingInReg(false), InReg(false) { - } - public: - ABIArgInfo() - : TypeData(nullptr), PaddingType(nullptr), DirectOffset(0), - TheKind(Direct), PaddingInReg(false), InReg(false) {} + ABIArgInfo(Kind K = Direct) + : TheKind(K), PaddingInReg(false), InAllocaSRet(false), + IndirectByVal(false), IndirectRealign(false), SRetAfterThis(false), + InReg(false), CanBeFlattened(false), SignExt(false) {} static ABIArgInfo getDirect(llvm::Type *T = nullptr, unsigned Offset = 0, llvm::Type *Padding = nullptr, I'm going to send the patch to Phabricator.
Have you tested this patch with gcc-7? This similar patch was committed and reverted due to ICE in gcc-7: https://reviews.llvm.org/D57523
(In reply to Tom Stellard from comment #39) > Have you tested this patch with gcc-7? This similar patch was committed and > reverted due to ICE in gcc-7: https://reviews.llvm.org/D57523 Ah, you had the patch. Yes, the patch should be re-applied! I'll rebuild it with gcc-7 to see the ICE. Do you have a link to the ICE please?
(In reply to Martin Liška from comment #40) > (In reply to Tom Stellard from comment #39) > > Have you tested this patch with gcc-7? This similar patch was committed and > > reverted due to ICE in gcc-7: https://reviews.llvm.org/D57523 > > Ah, you had the patch. Yes, the patch should be re-applied! > I'll rebuild it with gcc-7 to see the ICE. > Do you have a link to the ICE please? There was a link in the revert commit, but the logs have been deleted now: https://reviews.llvm.org/rL354549
> There was a link in the revert commit, but the logs have been deleted now: > https://reviews.llvm.org/rL354549 Ok, I'm rebuilding clang with the CC=gcc-7 CXX=g++-7 with the very same instructions like here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91758#c0 The ICE was a normal ICE in GCC compiler during compilation of a clang object file, right?
(In reply to Martin Liška from comment #42) > > There was a link in the revert commit, but the logs have been deleted now: > > https://reviews.llvm.org/rL354549 > > Ok, I'm rebuilding clang with the CC=gcc-7 CXX=g++-7 with the very same > instructions like here: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91758#c0 > > The ICE was a normal ICE in GCC compiler during compilation of a clang > object file, right? I can't see the ICE with: g++-7 --version g++-7 (SUSE Linux) 7.4.1 20190725 [gcc-7-branch revision 273795] So please reapply the patch back.
(In reply to Martin Liška from comment #43) > (In reply to Martin Liška from comment #42) > > > There was a link in the revert commit, but the logs have been deleted now: > > > https://reviews.llvm.org/rL354549 > > > > Ok, I'm rebuilding clang with the CC=gcc-7 CXX=g++-7 with the very same > > instructions like here: > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91758#c0 > > > > The ICE was a normal ICE in GCC compiler during compilation of a clang > > object file, right? > > I can't see the ICE with: > g++-7 --version > g++-7 (SUSE Linux) 7.4.1 20190725 [gcc-7-branch revision 273795] > > So please reapply the patch back. Can you submit your patch? The old patch doesn't apply cleanly any more.
Created attachment 22516 [details] Proposed patch Please make a patch submission on my behalf. Thanks.
(In reply to Martin Liška from comment #38) > Ok, I've got a patch. It's an issue in clang where ABIArgInfo constructor > does not initialize some of the bit fields and then we end up with usage of > an uninitialized variables. Can you point to where the uninitialized variable gets used? I stared hard at the code but couldn't find it. Maybe initializing all of them is a good idea in any case though.
(In reply to Hans Wennborg from comment #46) > (In reply to Martin Liška from comment #38) > > Ok, I've got a patch. It's an issue in clang where ABIArgInfo constructor > > does not initialize some of the bit fields and then we end up with usage of > > an uninitialized variables. > > Can you point to where the uninitialized variable gets used? I stared hard > at the code but couldn't find it. It happens in void computeInfo(CGFunctionInfo &FI) const override {: I.info = classifyArgumentType(I.type, State); Here there's an assignment which eventually will copy members of the ABIArgInfo. > > Maybe initializing all of them is a good idea in any case though.
Commited as https://reviews.llvm.org/rGe93aded7f02d661234ad81aac0785ccdef6a79dd Let's hope there's no gcc regression this time :-/
Verified that r372281 fixes the issue. Both gcc-7: gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0 gcc-10: gcc-8.3 (GCC) 10.0.0 20190822 (experimental) 'make check-all' for clang r372468 with no failures.
Can we get this backported to 9.0.1 (if it isn't yet)?
Merged: 7fc9f12