On x86-64, lli -jit-kind=orc executes the following code incorrectly. The return value from main should be 0. Instead, it returns 16. lli -jit-kind=mcjit and llc produce the correct output.

define i32 @main() #0 {
  br label %1

1:                                                ; preds = %1, %0
  %2 = phi i64 [ 0, %0 ], [ %6, %1 ]
  %3 = phi <4 x i64> [ <i64 -16, i64 -16, i64 -16, i64 -16>, %0 ], [ %5, %1 ]
  %4 = trunc <4 x i64> %3 to <4 x i32>
  %5 = call <4 x i64> @new_function(<4 x i64> %3)
  %6 = add i64 %2, 1
  %7 = icmp eq i64 %6, 2
  br i1 %7, label %8, label %1

8:                                                ; preds = %1
  %9 = extractelement <4 x i32> %4, i32 3
  ret i32 %9
}

define <4 x i64> @new_function(<4 x i64> %0) unnamed_addr {
  %2 = add <4 x i64> %0, <i64 16, i64 16, i64 16, i64 16>
  ret <4 x i64> %2
}

attributes #0 = { "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" }
Simplified version:

define i32 @main() #0 {
  %1 = call <4 x i64> @new_function(<4 x i64> <i64 -16, i64 -16, i64 -16, i64 -16>)
  %2 = extractelement <4 x i64> %1, i32 3
  %3 = trunc i64 %2 to i32
  ret i32 %3
}

define <4 x i64> @new_function(<4 x i64> %0) unnamed_addr {
  %2 = add <4 x i64> %0, <i64 16, i64 16, i64 16, i64 16>
  ret <4 x i64> %2
}

attributes #0 = { "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" }

Instructions generated for the call:

  $xmm0 = COPY %1:vr128
  $xmm1 = COPY %1:vr128
  CALL64r killed %2:gr64, <regmask $bh $bl $bp $bph $bpl $bx $ebp $ebx $hbp $hbx $rbp $rbx $r12 $r13 $r14 $r15 $r12b $r13b $r14b $r15b $r12bh $r13bh $r14bh $r15bh $r12d $r13d $r14d $r15d $r12w $r13w $r14w $r15w $r12wh and 3 more...>, implicit $rsp, implicit $ssp, implicit $xmm0, implicit $xmm1, implicit-def $rsp, implicit-def $ssp, implicit-def $xmm0, implicit-def $xmm1
  ADJCALLSTACKUP64 0, 0, implicit-def dead $rsp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $rsp, implicit $ssp
  %3:vr128 = COPY $xmm0
  %4:vr128 = COPY $xmm1

Instructions generated for new_function:

  liveins: $ymm0
  %0:vr256 = COPY $ymm0
  %1:gr64 = MOV64ri %const.0
  %2:vr128 = VMOVDQArm killed %1:gr64, 1, $noreg, 0, $noreg :: (load (s128) from constant-pool)
  %3:vr128 = VEXTRACTF128rr %0:vr256, 1
  %4:vr128 = VPADDQrr killed %3:vr128, %2:vr128
  %5:vr128 = COPY %0.sub_xmm:vr256
  %6:vr128 = VPADDQrr killed %5:vr128, %2:vr128
  %8:vr256 = IMPLICIT_DEF
  %7:vr256 = INSERT_SUBREG %8:vr256(tied-def 0), killed %6:vr128, %subreg.sub_xmm
  %9:vr256 = VINSERTF128rr killed %7:vr256, killed %4:vr128, 1
  $ymm0 = COPY %9:vr256
  RET 0, $ymm0

So the call is using XMM0 and XMM1 for the argument and return value, but the function definition is using YMM0 only.
This *might* be a compiler setup bug in JITTargetMachineBuilder, but that system is pretty simple. It seems more likely that this is a code generator bug. How did you generate the MIR in Comment #1? Is that coming out of the JIT?
Yes, I ran lli --print-after-isel.
Oh -- You're missing attribute #0 on new_function. I think that's the problem. Could you add it and see if it works?
Yes, that fixed it. I can change my code to avoid producing the "target-features" mismatch, but I still suggest adding a check to OrcJIT, even if it's just an assertion.
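For reference, a sketch of the simplified test case with the mismatch removed. It assumes the only change needed is attaching attribute #0 to new_function, so that caller and callee are compiled with the same "target-features" string and therefore should agree on how the <4 x i64> value is passed and returned:

```llvm
; Caller: compiled with the restricted feature set in #0.
define i32 @main() #0 {
  %1 = call <4 x i64> @new_function(<4 x i64> <i64 -16, i64 -16, i64 -16, i64 -16>)
  %2 = extractelement <4 x i64> %1, i32 3
  %3 = trunc i64 %2 to i32
  ret i32 %3
}

; Callee: now also carries #0, so it uses the same feature set as the caller.
define <4 x i64> @new_function(<4 x i64> %0) unnamed_addr #0 {
  %2 = add <4 x i64> %0, <i64 16, i64 16, i64 16, i64 16>
  ret <4 x i64> %2
}

attributes #0 = { "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" }
```

With matching attributes, main is expected to return 0 (-16 + 16 in each lane), as described in the original report.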
The proper place to catch this would be in the module verifier, which we do run on every module in LLI. Unfortunately it doesn't seem to catch this bug. I suspect that target-features attributes are opaque to the verifier, so it can't recognize the incompatibility. I've filed https://llvm.org/PR52156 to investigate making this detectable, but it's outside my wheelhouse -- we'll have to wait for someone else to pick this up and run with it.