Improve TLV performance in ORC Runtime #51162

Open
lhames opened this issue Sep 11, 2021 · 0 comments
Labels
bugzilla Issues migrated from bugzilla orcjit

Comments

lhames commented Sep 11, 2021

Bugzilla Link 51820
Version trunk
OS All
CC @AlexDenisov

Extended Description

For convenience, the initial TLV support in the ORC runtime (for both macho_platform and elfnix_platform) just saves all registers, then jumps into a function that does a table lookup for the requested TLV instance address. I think the following scheme should provide better performance, while still allowing us to add new TLVs at runtime:

  1. Instead of pointing directly at a TLV manager object, the thread data pointer for the current thread should point to a table structure containing N + 2 pointer-sized objects. The first two entries will contain a pointer to the TLV manager object and the table size, and all subsequent entries will contain the addresses of TLV instances for the current thread. As new TLVs are added to the JIT they are assigned indexes within this table (these indexes may extend past the end of the currently allocated table).
tlv_table:
  <pointer to TLV manager>
  <number of elements N>
  <element 0>
  ...
  <element N - 1>
  2. TLV descriptors (in the current MachO parlance) should become a triple of (get_addr_fn, key, index) (instead of the current (get_addr_fn, key, addr)), and...

  3. The new tlv_get_addr function should look something like this:

if (desc->idx >= table->size)
  return expandTableAndGet(desc);
else if (table->elements[desc->idx] == 0)
  return allocateElementAndGet(desc);
return table->elements[desc->idx];

Or, in x86-64 assembly, something like:

_tlv_get_addr:
        movq    8(%rdi), %rax                   // get key from desc
        movq    %gs:0x0(,%rax,8), %rsi          // get table for thread
        movq    16(%rdi), %rax                  // get index from desc
        cmpq    8(%rsi), %rax                   // compare index and table size
        jae     LexpandTableAndGet              // if index >= size then expand table
        movq    16(%rsi,%rax,8), %rax           // otherwise get table entry
        testq   %rax, %rax                      // check for a null entry
        je      LallocateElementAndGet          // if null then allocate entry
        ret                                     // Non-null entry: we have our result.
LexpandTableAndGet:
        jmp __resize_table
LallocateElementAndGet:
        jmp __allocate_element
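
To make the proposed layout and the two slow paths concrete, here is a rough C++ sketch of the scheme. It is an illustration only, not ORC runtime code. The names TLVManager, TLVDescriptor, ThreadTable, makeTable and tlvGetAddr are hypothetical. A thread_local pointer stands in for the %gs-relative thread data slot (so the descriptor key is unused here), and every TLV instance is assumed to be zero-initialized and of the single size stored in the manager.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical manager: the only thing this sketch needs from it is the
// size of one TLV instance (assumed identical for all TLVs).
struct TLVManager {
  std::size_t InstanceSize;
};

// Descriptor triple (get_addr_fn, key, index) from step 2.
struct TLVDescriptor {
  void *(*GetAddr)(TLVDescriptor *);
  std::uintptr_t Key;   // unused here; the real code would index thread data
  std::uintptr_t Index; // slot assigned to this TLV when it was added
};
static_assert(offsetof(TLVDescriptor, Index) == 2 * sizeof(void *),
              "index is expected at the third pointer-sized word");

// Table layout: word 0 = manager, word 1 = N, words 2..N+1 = elements.
constexpr std::size_t HeaderWords = 2;

// Stand-in for the per-thread data pointer (assumption of this sketch).
thread_local std::uintptr_t *ThreadTable = nullptr;

std::uintptr_t *makeTable(TLVManager *M, std::size_t N) {
  auto *T = static_cast<std::uintptr_t *>(
      std::calloc(HeaderWords + N, sizeof(std::uintptr_t)));
  assert(T && "out of memory");
  T[0] = reinterpret_cast<std::uintptr_t>(M);
  T[1] = N;
  return T;
}

// Slow path: the slot exists but has no instance for this thread yet.
// Instances are never freed in this sketch.
void *allocateElementAndGet(TLVDescriptor *D) {
  std::uintptr_t *T = ThreadTable;
  auto *M = reinterpret_cast<TLVManager *>(T[0]);
  void *Inst = std::calloc(1, M->InstanceSize);
  T[HeaderWords + D->Index] = reinterpret_cast<std::uintptr_t>(Inst);
  return Inst;
}

// Slow path: the index lies past the end of the table, so grow the table,
// copy the existing slots across, then allocate the requested instance.
void *expandTableAndGet(TLVDescriptor *D) {
  std::uintptr_t *Old = ThreadTable;
  std::size_t OldN = Old[1];
  std::uintptr_t *New =
      makeTable(reinterpret_cast<TLVManager *>(Old[0]), D->Index + 1);
  std::memcpy(New + HeaderWords, Old + HeaderWords,
              OldN * sizeof(std::uintptr_t));
  std::free(Old);
  ThreadTable = New;
  return allocateElementAndGet(D);
}

// Fast path: one bounds check and one null check, as in the assembly.
void *tlvGetAddr(TLVDescriptor *D) {
  std::uintptr_t *T = ThreadTable;
  if (D->Index >= T[1])
    return expandTableAndGet(D);
  if (T[HeaderWords + D->Index] == 0)
    return allocateElementAndGet(D);
  return reinterpret_cast<void *>(T[HeaderWords + D->Index]);
}
```

The sketch grows the table to exactly index + 1 slots for simplicity. A real implementation would presumably grow it more aggressively (or size it from the manager's current TLV count) so that adding several TLVs does not trigger a resize on each first access.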
@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 11, 2021