Created attachment 3688 [details]
bc that crashes lli

When called from lli, the attached bitcode file crashes in BitcodeReader::ParseMetadataAttachment().

bugpoint also crashed on OSX trying to reduce the test case:

$ Debug/bin/bugpoint -run-jit fib.bc
Read input file : 'fib.bc'
*** All input ok
Initializing execution environment: Found lli: /Users/jyasskin/src/llvm/trunk/obj/Debug/bin/lli
Running the code generator to test for a crash:
Generating reference output from raw program: <cbe><gcc><program>
Reference output is: bugpoint.reference.out-PpLumv
*** Checking the code generator...
<jit>
*** Input program does not match reference diff!
Debugging code generator problem!
Checking to see if the program is misoptimized when these functions are run through the passes: fib_left fib_right fib main
<cbe><gcc>
Error running tool:
/opt/local/bin/gcc -x c -fno-strict-aliasing bugpoint.safe.bc-qp2574.cbe.c -x none -shared -fPIC -o bugpoint.safe.bc-qp2574.cbe.c.dylib -O2
Undefined symbols:
"_main", referenced from:
start in crt1.10.5.o
ld: symbol(s) not found
collect2: ld returned 1 exit status
*** Debugging code generator crash!
Checking to see if we can delete global inits: - Removing all global inits hides problem!
*** Attempting to reduce the number of functions in the testcase
Checking for crash with only these functions: fib_left fib_right fib main:
Checking for crash with only these blocks: entry if.then if.end return entry if.then if.end return entry if.then... <13 total>:
Checking for crash with only 62 instructions:
*** Attempting to reduce testcase by deleting instructions: Simplification Level #1
Checking instruction: %retval = alloca i32 ; <i32*> [#uses=3]
Checking instruction: %i.addr = alloca i32 ; <i32*> [#uses=5]
Checking instruction: store i32 %i, i32* %i.addr
Checking instruction: %0 = bitcast i32* %i.addr to { }* ; <{ }*> [#uses=1]
invalid llvm.dbg.declare intrinsic call
call void @llvm.dbg.declare({ }* null, metadata !0)
Broken module found, compilation aborted!

$ /opt/local/bin/gcc --version
i686-apple-darwin9-gcc-4.0.1 (GCC) 4.0.1 (Apple Inc. build 5488)
There is a bug in llvm::BitcodeReader::ParseMetadataAttachment():

File lib/Bitcode/Reader/BitcodeReader.cpp (SVN r85709):

1601     case bitc::METADATA_ATTACHMENT: {
1602       unsigned RecordLength = Record.size();
1603       if (Record.empty() || (RecordLength - 1) % 2 == 1)
1604         return Error ("Invalid METADATA_ATTACHMENT reader!");
1605       Instruction *Inst = InstructionList[Record[0]];
1606       for (unsigned i = 1; i != RecordLength; i = i+2) {
1607         unsigned Kind = Record[i];
1608         Value *Node = MDValueList.getValueFwdRef(Record[i+1]);
1609         TheMetadata.addMD(Kind, cast<MDNode>(Node), Inst);
1610       }
1611       break;
1612     }

The loop at 1606 never ends because RecordLength is even and i is odd.
This might be the root cause of this problem.

See also: http://lists.cs.uiuc.edu/pipermail/llvmdev/2009-November/026880.html
(In reply to comment #1)
> There is a bug in llvm::BitcodeReader::ParseMetadataAttachment():
>
> File lib/Bitcode/Reader/BitcodeReader.cpp (SVN r85709):
>
> 1601     case bitc::METADATA_ATTACHMENT: {
> 1602       unsigned RecordLength = Record.size();
> 1603       if (Record.empty() || (RecordLength - 1) % 2 == 1)
> 1604         return Error ("Invalid METADATA_ATTACHMENT reader!");
> 1605       Instruction *Inst = InstructionList[Record[0]];
> 1606       for (unsigned i = 1; i != RecordLength; i = i+2) {
> 1607         unsigned Kind = Record[i];
> 1608         Value *Node = MDValueList.getValueFwdRef(Record[i+1]);
> 1609         TheMetadata.addMD(Kind, cast<MDNode>(Node), Inst);
> 1610       }
> 1611       break;
> 1612     }
>
> The loop at 1606 never ends because RecordLength is even and i is odd.

There's a check (with ugly inverted logic) on line 1603 so RecordLength cannot be even.

> This might be the root cause of this problem.
>
> See also: http://lists.cs.uiuc.edu/pipermail/llvmdev/2009-November/026880.html
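To illustrate the point about line 1603, here is a minimal, self-contained sketch of that check and of the loop on line 1606, using a plain std::vector in place of the reader's Record. The names isValidAttachmentRecord and countAttachmentPairs are hypothetical and not part of LLVM. Assuming the check behaves as quoted, only odd record lengths reach the loop, so i (which starts at 1 and advances by 2) lands exactly on RecordLength and the loop terminates.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the check on line 1603, written without the
// inverted logic: a record is accepted only if it is non-empty and
// (size - 1) is even, i.e. the size is odd.
bool isValidAttachmentRecord(const std::vector<unsigned> &Record) {
  return !Record.empty() && (Record.size() - 1) % 2 == 0;
}

// Counts the (Kind, Node) pairs that the loop on line 1606 would visit.
// Returns -1 for records that the check rejects.
int countAttachmentPairs(const std::vector<unsigned> &Record) {
  if (!isValidAttachmentRecord(Record))
    return -1;
  int Pairs = 0;
  // Record.size() is odd here, so i reaches it exactly after stepping by 2.
  for (std::size_t i = 1; i != Record.size(); i += 2)
    ++Pairs;
  return Pairs;
}
```

An even-length record such as {8, 0} is rejected before the loop runs, which is why the infinite-loop theory does not hold.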
> There's a check (with ugly inverted logic) on line 1603 so RecordLength cannot
> be even.

Ah, you're right; sorry.

On further thought, the only reason for the assertion (see the stack trace below, extracted from the mailing list thread linked in Comment #1) would be the indexing of "InstructionList" with index=Record[0]. The assertion was triggered on an llvm::SmallVectorImpl<llvm::Instruction*>, so it has to be InstructionList (another reason that it can't be line 1607/1608).

-------
0 lli 0x0000000000feda16
1 lli 0x0000000000fed88f
2 libpthread.so.0 0x0000003df340eee0
3 libc.so.6 0x0000003df28332f5 gsignal + 53
4 libc.so.6 0x0000003df2834b20 abort + 384
5 libc.so.6 0x0000003df282c2fa __assert_fail + 234
6 lli 0x000000000085ece9 llvm::SmallVectorImpl<llvm::Instruction*>::operator[](unsigned int) + 77
7 lli 0x0000000000850ce0 llvm::BitcodeReader::ParseMetadataAttachment() + 448
8 lli 0x0000000000851043 llvm::BitcodeReader::ParseFunctionBody(llvm::Function*) + 677
9 lli 0x0000000000854b29 llvm::BitcodeReader::materializeFunction(llvm::Function*, std::string*) + 323
10 lli 0x0000000000c6a073 llvm::JIT::getPointerToFunction(llvm::Function*) + 421
11 lli 0x0000000000c68ae6 llvm::JIT::runFunction(llvm::Function*, std::vector<llvm::GenericValue, std::allocator<llvm::GenericValue> > const&) + 120
12 lli 0x0000000000c91b25 llvm::ExecutionEngine::runFunctionAsMain(llvm::Function*, std::vector<std::string, std::allocator<std::string> > const&, char const* const*) + 1091
13 lli 0x00000000008448ea main + 1841
14 libc.so.6 0x0000003df281ea2d __libc_start_main + 253
15 lli 0x0000000000843fd9
I think I hit the same problem but I don't have access to 64-bit OS X so I can't test your bitcode file. Here's what I see:

Steps to reproduce:
1) cat > testcase.c <<EOF
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>
void d(void) {
    puts("aa");
}
int main(int argc, char *argv[]) {
    d();
    return 0;
}
EOF
2) clang -emit-llvm -c -o testcase.bc -g testcase.c
3) lli testcase.bc

Expected results:
3) program prints "aa"

Actual results:
3) lli crashes with

lli: /local/lindi/build/86985.86985/include/llvm/ADT/SmallVector.h:124: T& llvm::SmallVectorImpl<T>::operator[](unsigned int) [with T = llvm::Instruction*]: Assertion `Begin + idx < End' failed.
0 lli 0x08bf8721
1 lli 0x08bf8ced
2 0xb7eed400 __kernel_sigreturn + 0
3 libc.so.6 0xb7c70018 abort + 392
4 libc.so.6 0xb7c675be __assert_fail + 238
5 lli 0x084bb6d1 llvm::SmallVectorImpl<llvm::Instruction*>::operator[](unsigned int) + 83
6 lli 0x084a409b llvm::BitcodeReader::ParseMetadataAttachment() + 451
7 lli 0x084a7906 llvm::BitcodeReader::ParseFunctionBody(llvm::Function*) + 646
8 lli 0x084ab997 llvm::BitcodeReader::materializeFunction(llvm::Function*, std::string*) + 235
9 lli 0x088750a6 llvm::JIT::getPointerToFunction(llvm::Function*) + 372
10 lli 0x08875442 llvm::JIT::runFunction(llvm::Function*, std::vector<llvm::GenericValue, std::allocator<llvm::GenericValue> > const&) + 96
11 lli 0x0889d609 llvm::ExecutionEngine::runFunctionAsMain(llvm::Function*, std::vector<std::string, std::allocator<std::string> > const&, char const* const*) + 1025
12 lli 0x0849c8ef main + 2093
13 libc.so.6 0xb7c59455 __libc_start_main + 229
14 lli 0x0849b9f1
Stack dump:
0. Program arguments: /home/lindi/llvm-install/86985.86985/bin/lli testcase.bc

More info:
1) distro is debian stable
2) architecture is x86
3) llvm version is 86985
4) clang version is 86985
5) I see the following things take place:
5.1) materializeFunction is called for "main"
5.2) 14 FUNC_CODE_INST_* RECORDs are seen and 14 instructions get added to InstructionList
5.3) SUBBLOCK with METADATA_ATTACHMENT_ID is seen and contains 6 METADATA_ATTACHMENTs
5.4) They refer to instructions 8, 11, 12, 13, 14 and 15.
5.5) ParseMetadataAttachment tries to use these to index InstructionList and of course fails with indexes 14 and 15 since we only have 14 instructions in InstructionList so far

=> How should instructions be numbered? If you materialize "main" first then the first instruction of main will be at InstructionList[0]. My untested guess is that if you materialize "d" first then its first instruction will be InstructionList[0], right? Having the numbering depend on the order in which the functions are materialized obviously does not work since the METADATA_ATTACHMENTs refer to the instructions using some fixed numbering (which I don't yet understand, maybe it's the order in which the instructions appear in the file?).
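The observations in 5.2 to 5.5 can be sketched as a plain bounds check. This is only an illustration: outOfRangeIndices is a hypothetical helper, not an LLVM function, and it assumes that InstructionList holds just the 14 instructions of the function being materialized and that the attachment indices are used directly as subscripts.

```cpp
#include <cstddef>
#include <vector>

// Returns the attachment indices that fall outside an instruction list
// holding NumInstructions entries (valid subscripts are 0 .. N-1).
std::vector<unsigned> outOfRangeIndices(std::size_t NumInstructions,
                                        const std::vector<unsigned> &Indices) {
  std::vector<unsigned> Bad;
  for (unsigned Idx : Indices)
    if (Idx >= NumInstructions)
      Bad.push_back(Idx);
  return Bad;
}
```

With 14 instructions and the indices 8, 11, 12, 13, 14 and 15 reported above, the helper returns 14 and 15, the two subscripts that trip the `Begin + idx < End' assertion.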
Created attachment 3817 [details] another bc file that crashes lli, source code in comment #4, built for x86 linux with clang
A workaround is to pass the --disable-lazy-compilation flag to lli, but it would still be nice to be able to lazy-load bitcode containing debug info.
Fixed in r97132: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20100222/096821.html