5064 – Support byval the msvc++ way.

LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filled before November 26, 2021. Use github to submit LLVM bugs

Bug 5064 - Support byval the msvc++ way.

Summary: Support byval the msvc++ way.

Status:	RESOLVED FIXED

Alias:	None

Product:	libraries
Classification:	Unclassified
Component:	Backend: X86 (show other bugs)
Version:	trunk
Hardware:	PC Windows XP

Importance:	P normal
Assignee:	Reid Kleckner

URL:
Keywords:

Duplicates (1):	17167 (view as bug list)
Depends on:
Blocks:	12477 17167
	Show dependency tree

Reported:	2009-09-26 22:15 PDT by Óscar Fuentes
Modified:	2014-02-27 18:23 PST (History)
CC List:	10 users (show)

See Also:
Fixed By Commit(s):

Attachments
Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Óscar Fuentes 2009-09-26 22:15:32 PDT

When a C++ struct function argument has a destructor or copy constructor and is passed by value, GCC creates a temporary copy somewhere and passes a pointer to the temporary copy to the function. LLVM does not support this directly and is the front-end the responsible of creating the temporary and use the pointer as the function's parameter.

MSVC++ does not use this trick: the C++ class is put on the stack along with the rest of function parameters. As LLVM is the responsible of arranging the parameters on the stack for the function call, this requires that LLVM know about the copy constructor and the destructor.

Right now there is no workaround for this, so LLVM code is unable to call functions compiled with MSVC++ that take C++ classes by value (when those classes have a copy constructor and/or destructor).

LLVM's byval should be extended to allow the front-end to specify a copy constructor and a destructor, and insert calls to them when the temporary is created and after the call is performed.

Comment 1 Anton Korobeynikov 2009-09-27 03:52:36 PDT

(In reply to comment #0)
> Right now there is no workaround for this, so LLVM code is unable to call
> functions compiled with MSVC++ that take C++ classes by value (when those
> classes have a copy constructor and/or destructor).
How you're going to call these functions? Name mangling used by VCPP is undocumented. Even worse, I don't think this is possible at all -VCPP C++ ABI is undocumented as well.

I really think this PR should be closed as 'wontfix'.

Comment 2 Duncan Sands 2009-09-27 07:15:41 PDT

I don't see how this is an LLVM issue.  LLVM is low-level, it is up to the
front-end to generate explicit calls to constructors and destructors.  Yes,
this means the front-end needs to know about the target ABI but that's
already the case.  I think this should be closed as wontfix.

Comment 3 Óscar Fuentes 2009-09-27 07:53:38 PDT

Name mangling is not LLVM's bussiness. (Does LLVM know about GCC's name mangling?). If you have a pointer to a function, you don't care about its name mangling.

This PR is not about supporting the full VC++ ABI, just about being able to do one precise thing, not necessarily exclusive of VC++.

With current LLVM, the front end is unable to implement this feature because LLVM makes it impossible. In other words, LLVM lacks features required for managing certain function parameters on a certain, natural, way. GCC uses a hack, possibly forced by its legacy, multi-layered, increasingly generic, internal architecture. The GCC hack is not elegant nor efficient (two pointer indirections instead of one for referencing the C++ class).

Closing this as wontfix is handwaving. Is saying "I don't care about LLVM's current limitations as long as it is hack-compatible with GCC".

I opened this PR not hoping for a fix overnight. This is more geared towards documenting a fact and influencing future evolution of LLVM design.

Comment 4 Anton Korobeynikov 2009-09-27 08:06:36 PDT

(In reply to comment #3)
> With current LLVM, the front end is unable to implement this feature because
> LLVM makes it impossible.
You definitely can. See, how structs are passed by value, you need to do the same thing in your frontend, e.g.:

struct Foo {
  int a;
  double b;
};

void bar(Foo f);

void foo(void) {
  Foo f = {0, 0};
  bar(f);
}

is turned into:

define void @_Z3foov() {
entry:
        tail call void @_Z3bar3Foo( i32 0, double 0.000000e+00 )
        ret void
}

declare void @_Z3bar3Foo(i32, double)

> In other words, LLVM lacks features required for
> managing certain function parameters on a certain, natural, way. GCC uses a
> hack, possibly forced by its legacy, multi-layered, increasingly generic,
> internal architecture. The GCC hack is not elegant nor efficient (two pointer
> indirections instead of one for referencing the C++ class).
This is neither called "GCC hack" nor "legacy, multi-layered, increasingly generic, internal architecture". It's called Itanium C++ ABI.

Comment 5 Óscar Fuentes 2009-09-27 08:51:41 PDT

(In reply to comment #4)
> (In reply to comment #3)
> > With current LLVM, the front end is unable to implement this feature because
> > LLVM makes it impossible.
> You definitely can. See, how structs are passed by value, you need to do the
> same thing in your frontend, e.g.:
> 
> struct Foo {
>   int a;
>   double b;
> };
> 
> void bar(Foo f);
> 
> void foo(void) {
>   Foo f = {0, 0};
>   bar(f);
> }
> 
> is turned into:
> 
> define void @_Z3foov() {
> entry:
>         tail call void @_Z3bar3Foo( i32 0, double 0.000000e+00 )
>         ret void
> }
> 
> declare void @_Z3bar3Foo(i32, double)

Wouldn't this create data alignment issues? Register/stack confusions (the callee expects the data on the stack, but due to the hacked signature the caller passes data on registers). Performance issues (pushing class data members one by one)? Debug info nuisances? Headaches while reading LLVM or assembler code? :-)

> > In other words, LLVM lacks features required for
> > managing certain function parameters on a certain, natural, way. GCC uses a
> > hack, possibly forced by its legacy, multi-layered, increasingly generic,
> > internal architecture. The GCC hack is not elegant nor efficient (two pointer
> > indirections instead of one for referencing the C++ class).
> This is neither called "GCC hack" nor "legacy, multi-layered, increasingly
> generic, internal architecture". It's called Itanium C++ ABI.

... which was heavily influenced by how GCC does things. But I don't care. Call it GCC, Itanium C++ ABI or whatever, it is a hack.

You would agree on that passing C++ classes by "referenced value" has no justification on terms of efficiency. Quite the contrary.

Comment 6 Anton Korobeynikov 2013-06-20 14:33:11 PDT

*** Bug 16390 has been marked as a duplicate of this bug. ***

Comment 7 Reid Kleckner 2013-06-20 15:01:51 PDT

There some discussion here about how to do fix this in LLVM:
http://llvm.org/bugs/show_bug.cgi?id=16226#c2

The consensus seems to be that LLVM needs some way to expose the stack memory used for the call args.  Then the frontend can emit the copy ctor call.

It's not pretty, it's not portable, but it's probably better than teaching LLVM how to emit a copy ctor, especially given the need to emit EH cleanups.

There could also be some side benefits of letting the IR optimizers work on the memory accesses.

Comment 8 Eli Friedman 2013-09-09 15:59:08 PDT

*** Bug 17167 has been marked as a duplicate of this bug. ***

Comment 9 Reid Kleckner 2013-09-26 13:10:31 PDT

llvm/utils/TableGen/CodeGenDAGPatterns.h uses StringMap<SmallVector<...>>, which clang miscompiles due to this bug.  We'll have to tackle it sooner rather than later.

Comment 10 James Baker 2013-11-04 08:16:25 PST

(In reply to comment #9)
> llvm/utils/TableGen/CodeGenDAGPatterns.h uses StringMap<SmallVector<...>>,
> which clang miscompiles due to this bug.  We'll have to tackle it sooner
> rather than later.

Bug 17167, which was set as a Release Blocker was tagged as a duplicate of this one.  In my opinion, that should make this a release blocker as well.

It means that Clang/llvm can't be used in a production environment.

I haven't looked at the code in llvm/utils/TableGen/CodeGenDAGPatterns.h, but would it be feasible to rework it there so that it used a different method that would allow a debug version to work?

Comment 11 Reid Kleckner 2013-11-04 11:24:09 PST

(In reply to comment #10)
> Bug 17167, which was set as a Release Blocker was tagged as a duplicate of
> this one.  In my opinion, that should make this a release blocker as well.

Yes, I agree, this is basically the most important bug for MSVC compatibility, and I'm working on it.  LLVM does time-based releases, so I don't think it will block the 3.4 release.  I'd like to fix it by then, but that's up to me.

> It means that Clang/llvm can't be used in a production environment.

Yes, the Microsoft C++ ABI support is experimental and can't really be used in a production environment.

> I haven't looked at the code in llvm/utils/TableGen/CodeGenDAGPatterns.h,
> but would it be feasible to rework it there so that it used a different
> method that would allow a debug version to work?

CodeGenDAGPatterns.h doesn't cause the bug.  It exhibits the bug when we compile it with clang.

Comment 12 Reid Kleckner 2014-02-27 18:23:11 PST

This is basically fixed with inalloca: http://llvm.org/docs/InAlloca.html

I've been amazed at how few miscompiles we've hit with it so far.  A lot of work is needed to teach LLVM how to optimize it well, especially in the presence of exceptions, but that's TODO.