LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filled before November 26, 2021. Use github to submit LLVM bugs

Bug 44337 - IR with double constant doesn't work with OrcJIT build by MSVC
Summary: IR with double constant doesn't work with OrcJIT build by MSVC
Status: RESOLVED FIXED
Alias: None
Product: libraries
Classification: Unclassified
Component: OrcJIT (show other bugs)
Version: 9.0
Hardware: PC Windows XP
: P normal
Assignee: Unassigned LLVM Bugs
URL:
Keywords:
Depends on: 44336
Blocks:
  Show dependency tree
 
Reported: 2019-12-18 09:28 PST by Alexander Kornilov
Modified: 2020-01-22 15:01 PST (History)
4 users (show)

See Also:
Fixed By Commit(s):


Attachments
The IR module (708 bytes, text/plain)
2019-12-18 09:28 PST, Alexander Kornilov
Details
Binary IR module (1.08 KB, application/octet-stream)
2019-12-18 09:29 PST, Alexander Kornilov
Details
The x86-64 assembler version of module (1.22 KB, text/plain)
2019-12-18 09:30 PST, Alexander Kornilov
Details
The patch which fixes the issue. (2.17 KB, patch)
2019-12-20 08:34 PST, Alexander Kornilov
Details
The x86-64 assembler version of module (Linux) (1.23 KB, text/x-tex)
2019-12-22 02:33 PST, Alexander Kornilov
Details
The patch which fixes the issue [updated]. (2.26 KB, patch)
2019-12-27 01:03 PST, Alexander Kornilov
Details
Patch to auto-claim new symbols discovered by the linking layer. (850 bytes, patch)
2020-01-02 16:19 PST, Lang Hames
Details
Patch to auto-claim new symbols discovered by the linking layer LLVM 9.0.0 (808 bytes, patch)
2020-01-09 07:25 PST, Alexander Kornilov
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Alexander Kornilov 2019-12-18 09:28:49 PST
Created attachment 22946 [details]
The IR module

Precondition:
LLVM 9.0.0 built by MSVC 2019.

Steps to reproduce:
1) Load 'module.bc' to memory from the file.
2) Add module to LLLazyJIT.
3) Request address of function 'calculate'.
4) Call function 'calculate'.

Expected result:
The function should be executed fine.

Actual result:
The OrcJIT failed on an assertion Core.cpp:867:
Assertion failed: I != Symbols.end() && "Symbol not found", file C:\Workspace\llvm-9.0.0.src\lib\ExecutionEngine\Orc\Core.cpp, line 867

Note:
I figured out that the missed symbol is __real@40091eb851eb851f. It appears after IR code is compiled by llc to assembler.

This issue is not observed on Linux and MinGW builds on Windows.
Comment 1 Alexander Kornilov 2019-12-18 09:29:49 PST
Created attachment 22947 [details]
Binary IR module
Comment 2 Alexander Kornilov 2019-12-18 09:30:21 PST
Created attachment 22948 [details]
The x86-64 assembler version of module
Comment 3 Alexander Kornilov 2019-12-18 09:30:43 PST
; ModuleID = 'module.bc'
source_filename = "module.ll"

@PI = local_unnamed_addr constant double 3.140000e+00
@cachedValue = private unnamed_addr global double 0.000000e+00

; Function Attrs: norecurse nounwind writeonly
define double @"calculate"(i32) local_unnamed_addr #0 {
  %2 = sitofp i32 %0 to double
  %3 = fmul double %2, 3.140000e+00
  %4 = fmul double %3, %2
  store double %4, double* @cachedValue, align 8
  ret double %4
}

; Function Attrs: norecurse nounwind readonly
define double @"getCachedValue"() local_unnamed_addr #1 {
  %1 = load double, double* @cachedValue, align 8
  ret double %1
}

attributes #0 = { norecurse nounwind writeonly }
attributes #1 = { norecurse nounwind readonly }
Comment 4 Alexander Kornilov 2019-12-18 09:31:03 PST
	.text
	.def	 @feat.00;
	.scl	3;
	.type	0;
	.endef
	.globl	@feat.00
.set @feat.00, 0
	.file	"module.ll"
	.def	 calculate;
	.scl	2;
	.type	32;
	.endef
	.globl	__real@40091eb851eb851f # -- Begin function calculate
	.section	.rdata,"dr",discard,__real@40091eb851eb851f
	.p2align	3
__real@40091eb851eb851f:
	.quad	4614253070214989087     # double 3.1400000000000001
	.text
	.globl	calculate
	.p2align	4, 0x90
calculate:                              # @calculate
# %bb.0:
	cvtsi2sdl	%ecx, %xmm1
	movsd	__real@40091eb851eb851f(%rip), %xmm0 # xmm0 = mem[0],zero
	mulsd	%xmm1, %xmm0
	mulsd	%xmm1, %xmm0
	movsd	%xmm0, .LcachedValue(%rip)
	retq
                                        # -- End function
	.def	 getCachedValue;
	.scl	2;
	.type	32;
	.endef
	.globl	getCachedValue          # -- Begin function getCachedValue
	.p2align	4, 0x90
getCachedValue:                         # @getCachedValue
# %bb.0:
	movsd	.LcachedValue(%rip), %xmm0 # xmm0 = mem[0],zero
	retq
                                        # -- End function
	.section	.rdata,"dr"
	.globl	PI                      # @PI
	.p2align	3
PI:
	.quad	4614253070214989087     # double 3.1400000000000001

	.lcomm	.LcachedValue,8,8       # @cachedValue
Comment 5 Alexander Kornilov 2019-12-20 08:34:13 PST
Created attachment 22955 [details]
The patch which fixes the issue.
Comment 6 Lang Hames 2019-12-20 15:27:42 PST
Hi Alexander,

Thanks for looking in to this.

The proposed patch is not the right fix: It weakens responsibility checking and dynamically injects new definitions. I'm trying to wrap my head around what has gone wrong so that I can figure out the correct fix.

From the original assertion it looks like the assertion that fired was in a call to JITDylib::resolve that included a symbol that wasn't defined in the JITDylib, but I'm not sure how that could have happened: We should have hit the corresponding assertion in MaterializationResponsibility::notifyResolved before we got to JITDylib::resolve. Would you be able to run this under a debugger and obtain a backtrace for the original assertion?

If this analysis is correct: That the symbol has somehow made it into the responsibility set for the MaterializationResponsibility, but is missing from the JITDylib, then we need to track down where this correspondence broke. They should move in lock-step.
Comment 7 Alexander Kornilov 2019-12-22 02:33:29 PST
Created attachment 22960 [details]
The x86-64 assembler version of module (Linux)
Comment 8 Alexander Kornilov 2019-12-22 02:33:59 PST
	.text
	.file	"module.ll"
	.section	.rodata.cst8,"aM",@progbits,8
	.p2align	3               # -- Begin function calculate
.LCPI0_0:
	.quad	4614253070214989087     # double 3.1400000000000001
	.text
	.globl	calculate
	.p2align	4, 0x90
	.type	calculate,@function
calculate:                              # @calculate
# %bb.0:
	cvtsi2sd	%edi, %xmm1
	movsd	.LCPI0_0(%rip), %xmm0   # xmm0 = mem[0],zero
	mulsd	%xmm1, %xmm0
	mulsd	%xmm1, %xmm0
	movsd	%xmm0, .LcachedValue(%rip)
	retq
.Lfunc_end0:
	.size	calculate, .Lfunc_end0-calculate
                                        # -- End function
	.globl	getCachedValue          # -- Begin function getCachedValue
	.p2align	4, 0x90
	.type	getCachedValue,@function
getCachedValue:                         # @getCachedValue
# %bb.0:
	movsd	.LcachedValue(%rip), %xmm0 # xmm0 = mem[0],zero
	retq
.Lfunc_end1:
	.size	getCachedValue, .Lfunc_end1-getCachedValue
                                        # -- End function
	.type	PI,@object              # @PI
	.section	.rodata,"a",@progbits
	.globl	PI
	.p2align	3
PI:
	.quad	4614253070214989087     # double 3.1400000000000001
	.size	PI, 8

	.type	.LcachedValue,@object   # @cachedValue
	.local	.LcachedValue
	.comm	.LcachedValue,8,8
	.section	".note.GNU-stack","",@progbits
Comment 9 Alexander Kornilov 2019-12-22 02:36:19 PST
Hi Lang,

Thank you for the fast response.

I had some debug session before I prepared my "hot fix" :)
So per my light analysis, the root cause of the issue is a difference between low-level code generation on MSVC (COFF) and Linux/MinGW.
I have attached the assembler version of the module which compiled by LLC on Linux.
You can see a difference: MSVC version has additional constant __real@40091eb851eb851f(%rip) in function 'calculate'.
This constant wasn't declared in the IR module, but was added during compilation to platform-specific code. Maybe it connected with some rules of compilation for the MSVC target.

So what is happened in code in my understanding:
1) We requested to materialize function 'calculate' and inserted the function name 'calculate' in the 'Symbols' map.
2) We have got 'notifyResolved' notification from RTDyldObjectLinkingLayer.cpp with list of resolved symbols.
3) We expected to see in resolved symbols list only symbols which we requested (which contains in 'Symbols') and assertion failed.
But because a new constant '__real@40091eb851eb851f' was generated during compilation it also presents in the resolved list which comes from the linking layer (RTDyldObjectLinkingLayer.cpp).

So in my opinion, 'calculation' depends on constant '__real@40091eb851eb851f' and we have to extend our 'Symbol' map.
And you right, my patch injects all additional symbols which come from the linking layer. Because all symbols already have real addresses it is safe.

---
Best wishes,
Alexander
Comment 10 Alexander Kornilov 2019-12-25 10:30:12 PST
Another way just discard all unexpected symbols, but I don't enough familiar with OrcJIT architecture to say what way is proper.
Comment 11 Alexander Kornilov 2019-12-27 01:03:06 PST
Created attachment 22965 [details]
The patch which fixes the issue [updated].

Ignore already resolved symbols.
Comment 12 Alexander Kornilov 2019-12-27 09:04:09 PST
Hi Lang,

Is there any chance that this patch (or "true" fix) will be in LLVM 10?
Comment 13 Lang Hames 2020-01-02 10:35:47 PST
Hi Alexander,

I'm just back from vacation. I'll try to get you an update on this today.

-- Lang.
Comment 14 Lang Hames 2020-01-02 16:18:23 PST
Hi Alexander,

> I had some debug session before I prepared my "hot fix" :)
> So per my light analysis, the root cause of the issue is a difference between low-level code generation on MSVC (COFF) and Linux/MinGW.
> I have attached the assembler version of the module which compiled by LLC on Linux. You can see a difference: MSVC version has additional constant
__real@40091eb851eb851f(%rip) in function 'calculate'.
>This constant wasn't declared in the IR module, but was added during compilation to platform-specific code. Maybe it connected with some rules of compilation for the MSVC target.

That all sounds right to me. I guess the code generator transformed this into a named constant (probably intended to be non-exported from the final symbol table).

The thing that has me confused is the specific assertion that you saw: It is in JITDylib::resolve, but this is only called from MaterializationResponsibility::notifyResolved which contains what should have been an equivalent assertion: That the symbol appears in the SymbolFlags map of the MaterializationResponsibility. So what I want to know is how we managed to trigger the second assertion without triggering the first? Are you able to debug your LLVM build and tell me whether the loop in MaterializationResponsibility::notifyResolved is running?

That mystery aside, I think the right answer here is to have RTDyldObjectLinkingLayer auto-claim responsibility for new symbols in the object. I've attached a patch to do that (auto-claim-obj-syms.patch). Would you be able to test it out for me? I do not have access to a Windows machine.

-- Lang.
Comment 15 Lang Hames 2020-01-02 16:19:19 PST
Created attachment 22978 [details]
Patch to auto-claim new symbols discovered by the linking layer.
Comment 16 Alexander Kornilov 2020-01-09 02:07:52 PST
We had a winter holidays.
Sure. I'll try your patch.
Could you please also take a look https://bugs.llvm.org/show_bug.cgi?id=44336 (issue with an assertion of Export flag)?
Because this bug depends on 44336.
Comment 17 Alexander Kornilov 2020-01-09 02:52:17 PST
What version of LLVM is your patch for?
I use the 9.0.0 release.
Could you please prepare a patch for 9.0.0?
Comment 18 Alexander Kornilov 2020-01-09 04:45:32 PST
> The thing that has me confused is the specific assertion that you saw: It is
> in JITDylib::resolve, but this is only called from
> MaterializationResponsibility::notifyResolved which contains what should
> have been an equivalent assertion: That the symbol appears in the
> SymbolFlags map of the MaterializationResponsibility. So what I want to know
> is how we managed to trigger the second assertion without triggering the
> first? Are you able to debug your LLVM build and tell me whether the loop in
> MaterializationResponsibility::notifyResolved is running?

Sorry for the confusion. This part of the patch fixes the issue https://bugs.llvm.org/show_bug.cgi?id=44336 (The root cause of the issue that the resolved symbol on COFF doesn't have flag JITSymbolFlags::Exported).
Because without such fix I can't debug issue with extra symbols.
Comment 19 Alexander Kornilov 2020-01-09 07:25:30 PST
Created attachment 23004 [details]
Patch to auto-claim new symbols discovered by the linking layer LLVM 9.0.0
Comment 20 Alexander Kornilov 2020-01-09 07:27:41 PST
Lang,

I have adapted your patch for LLVM 9.0.0 (please take a look is it right) and have got the error at runtime:
JIT session error: Duplicate definition of symbol '__real@40091eb851eb851f'
Comment 21 Alexander Kornilov 2020-01-10 04:39:18 PST
I think root cause of this error (Duplicate definition of symbol '__real@40091eb851eb851f') is here (return make_error<DuplicateDefinition>(*KV.first);):

Error JITDylib::defineMaterializing(const SymbolFlagsMap &SymbolFlags) {
  return ES.runSessionLocked([&]() -> Error {
    std::vector<SymbolTable::iterator> AddedSyms;

    for (auto &KV : SymbolFlags) {
      SymbolTable::iterator EntryItr;
      bool Added;

      std::tie(EntryItr, Added) =
          Symbols.insert(std::make_pair(KV.first, SymbolTableEntry(KV.second)));

      if (Added) {
        AddedSyms.push_back(EntryItr);
        EntryItr->second.setState(SymbolState::Materializing);
      } else {
        // Remove any symbols already added.
        for (auto &SI : AddedSyms)
          Symbols.erase(SI);

        // FIXME: Return all duplicates.
        return make_error<DuplicateDefinition>(*KV.first);
      }
    }

    return Error::success();
  });
}

Called from:
Error RTDyldObjectLinkingLayer::onObjLoad(
    VModuleKey K, MaterializationResponsibility &R, object::ObjectFile &Obj,
    std::unique_ptr<RuntimeDyld::LoadedObjectInfo> LoadedObjInfo,
    std::map<StringRef, JITEvaluatedSymbol> Resolved,
    std::set<StringRef> &InternalSymbols) {
  SymbolFlagsMap ExtraSymbolsToClaim;
  SymbolMap Symbols;
  for (auto &KV : Resolved) {
    // Scan the symbols and add them to the Symbols map for resolution.

    // We never claim internal symbols.
    if (InternalSymbols.count(KV.first))
      continue;

    auto InternedName = getExecutionSession().intern(KV.first);
    auto Flags = KV.second.getFlags();

    // Override object flags and claim responsibility for symbols if
    // requested.
    if (OverrideObjectFlags || AutoClaimObjectSymbols) {
      auto I = R.getSymbols().find(InternedName);

      if (OverrideObjectFlags && I != R.getSymbols().end())
        Flags = I->second;
      else if (AutoClaimObjectSymbols && I == R.getSymbols().end())
        ExtraSymbolsToClaim[InternedName] = Flags;
    }

    Symbols[InternedName] = JITEvaluatedSymbol(KV.second.getAddress(), Flags);
  }

  if (!ExtraSymbolsToClaim.empty())
    if (auto Err = R.defineMaterializing(ExtraSymbolsToClaim))
      return Err;

  R.notifyResolved(Symbols);

  if (NotifyLoaded)
    NotifyLoaded(K, Obj, *LoadedObjInfo);

  return Error::success();
}

The second issue is here:
      if (OverrideObjectFlags && I != R.getSymbols().end())
        Flags = I->second;
      else if (AutoClaimObjectSymbols && I == R.getSymbols().end())
        ExtraSymbolsToClaim[InternedName] = Flags;
    }

    Symbols[InternedName] = JITEvaluatedSymbol(KV.second.getAddress(), Flags);

Despite the AutoClaimObjectSymbols flag is true extra symbol will be added to Symbols[InternedName].
Comment 22 Lang Hames 2020-01-21 14:04:30 PST
Hi Alexander,

Someone has reported a similar bug in http://bugs.llvm.org/PR40074. I have committed a (likely) fix for their issue in 84217ad66115cc31b184374a03c8333e4578996f. Would you be able to test and see if that commit fixes your issue too?

-- Lang.
Comment 23 Alexander Kornilov 2020-01-22 07:16:27 PST
(In reply to Lang Hames from comment #22)
> Hi Alexander,
> 
> Someone has reported a similar bug in http://bugs.llvm.org/PR40074. I have
> committed a (likely) fix for their issue in
> 84217ad66115cc31b184374a03c8333e4578996f. Would you be able to test and see
> if that commit fixes your issue too?
> 
> -- Lang.

Hi Lang,

I have ported your patch from #40074 to LLVM 9.0.0 and tested it in the mine environment.
Your patch is working fine for me.
Also, it fixes bug #44336.

But are you sure that it is a good idea to pass a parameter by value?
Error defineMaterializing(SymbolFlagsMap SymbolFlags);
If you don't want to modify the map, a constant reference is a better solution, if you want to modify, a non-constant reference or pointer is more applicable here, in my opinion.

One more question for you: is removing a module from OrcJIT will be available in LLVM 10?

---
Best wishes,
Alexander
Comment 24 Lang Hames 2020-01-22 15:01:19 PST
> I have ported your patch from #40074 to LLVM 9.0.0 and tested it in the mine environment.
> Your patch is working fine for me.
> Also, it fixes bug #44336.

That's great news. :)
Ok -- closing this as fixed by 84217ad6611.

> But are you sure that it is a good idea to pass a parameter by value?
> Error defineMaterializing(SymbolFlagsMap SymbolFlags);
> If you don't want to modify the map, a constant reference is a better solution, if you want to modify, a non-constant reference or pointer is more applicable here, in my opinion.

The JITDylib::defineMaterializing method makes modifications to the map based on what definitions are accepted and rejected, and returns these to MaterializationResponsibilty::defineMaterializing so that it can update the responsibility set. Since these modifications are not interesting to the client, it makes sense to pass by value: It is cheap, and if the client does not need their value any more then no copies are required.

> One more question for you: is removing a module from OrcJIT will be available in LLVM 10?

Unfortunately support for removing modules will not make it in to LLVM 10. The best solution for now is to maintain a separate ExecutionSession for code that you want to throw away. I am working on removable module support, and hope to get it in to LLVM 11.