LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filled before November 26, 2021. Use github to submit LLVM bugs

Bug 10367 - Fix the design of GlobalAlias to not require dest type to match source type
Summary: Fix the design of GlobalAlias to not require dest type to match source type
Status: RESOLVED FIXED
Alias: None
Product: libraries
Classification: Unclassified
Component: Core LLVM classes (show other bugs)
Version: 1.0
Hardware: PC All
: P normal
Assignee: Unassigned LLVM Bugs
URL:
Keywords: code-cleanup
Depends on:
Blocks:
 
Reported: 2011-07-15 01:25 PDT by Chris Lattner
Modified: 2014-06-02 21:54 PDT (History)
12 users (show)

See Also:
Fixed By Commit(s):


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Chris Lattner 2011-07-15 01:25:33 PDT
Global alias is currently defined to have its own type, and then have an initializer of the same type.  The initializer is a "Constant*" which is either a) a global value, b) a constant expr bitcast, c) a constantexpr gep with all zero indices, d) null (which isn't valid, but transiently happens).

This doesn't make sense for a number of reasons.  Instead, the initializer of a GlobalAlias should be required to be a GlobalValue, but the type of the source and dest of the alias should not be required to be the same, they should just be completely decoupled.

-Chris
Comment 1 Anton Korobeynikov 2011-07-15 04:58:38 PDT
But what are these reasons? Right now everyone can look "through" alias without thinking how it should interpret the stuff.
Comment 2 Chris Lattner 2011-07-15 10:48:27 PDT
Because the design is broken.  Aliases are aliases to *a global* not a constant expression.  The fact that we have to allow getelementptr (for example) is a sign that this is a hack.
Comment 3 Chris Lattner 2011-07-15 10:48:50 PDT
I'm not saying that this is a huge priority to fix. I know that the current implementation works fine.
Comment 4 Anton Korobeynikov 2011-07-15 14:42:45 PDT
Originally only GlobalValue was allowed as an aliasee and types should match. This was relaxed lately, but I don't remember the exact reason. Probably because it was implemented in gcc in such a wrong way that people often 'exploited' this and even considered as a *feature* (that types of alias / aliasee might now match), thus it was changed lately...

Maybe we should just be more conservative on this matter?
Comment 5 Chris Lattner 2011-07-15 15:05:38 PDT
The types *have* to be able to mismatch.  This is important for the IR linker among other things.  I'm just saying that a constantexpr isn't the right way to represent this.
Comment 6 Duncan Sands 2011-07-25 05:26:35 PDT
How would this interact with RAUW?  Suppose you have

  @a = i32 alias i64 @b

(for example), then you use RAUW to replace @b with (eg) a ptrtoint:

  @a = i32 alias (ptrtoint i8* @b.1 to i64)

Then you would get an illegal alias.
Comment 7 Chris Lattner 2011-07-25 11:57:00 PDT
Globals are always pointers, but your issue is a valid one none-the-less with bitcasts.
Comment 8 Jay Foad 2011-07-29 03:11:10 PDT
I posted a patch here to make GlobalAlias's API look like the aliasee is always a GlobalValue; that is, setAliasee() takes a GlobalValue, and getAliasee() always returns a GlobalValue:

http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20110725/125031.html
Comment 9 Chris Lattner 2011-07-29 15:33:40 PDT
I like this approach, this seems better than my suggestion in the description above.
Comment 10 Jay Foad 2011-08-10 03:12:23 PDT
> I posted a patch here to make GlobalAlias's API look like the aliasee is always
> a GlobalValue; that is, setAliasee() takes a GlobalValue, and getAliasee()
> always returns a GlobalValue:
> 
> http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20110725/125031.html

Duncan argued that it would be better if the type of the GlobalAlias always matched the type of its operand. I updated the patch accordingly:

http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20110801/125477.html
Comment 11 Rafael Ávila de Espíndola 2013-10-30 17:01:36 PDT
I was thinking about this a bit. The object files we support have a different view of what an alias is.

On Mach-O there is (or there will be once the linker implements it) N_INDR. It is just a directive that tells the linker that this symbol should have the same address as another one.

The consequences of implementing alias like that are

* Alias can point to symbols in other files.
* Alias must point to a symbol.

On ELF (and I am almost sure on COFF), an alias is just another symbol. It just happens to point to a position also pointed by another symbol.

The consequences of implementing alias like that are:

* Alias can only point to symbols in this file.
* Alias can point inside an object.

This could be useful for representing cases where the abi mandates a symbol inside a particular data structure (like microsoft vtables).

My suggestion is then:
* Have ELF/COFF backends error on an attempt to alias a declaration. (right now they miscompile).
* Implement explicit COMDATs at the llvm IR. This is needed to have the notion of a set of symbols/data that can be dropped as a group.
* Allow alias with non-zero GEPs, but have the Mach-O backend error on them.
Comment 12 Tim Northover 2013-10-30 17:18:05 PDT
For MachO, I was toying with only producing an N_INDR symbol when the destination couldn't be resolved within the file. I don't think I'd implemented it like that yet, because I'd viewed it as more of an optimisation; but I now see it affects semantics and permitted forms.

Making that a policy decision would seem to remove at least some of the special cases if I'm reading your description properly. If so, I'd be happy to make sure it *does* work like that, if it would help.
Comment 13 Rafael Ávila de Espíndola 2013-10-30 17:27:55 PDT
(In reply to comment #12)
> For MachO, I was toying with only producing an N_INDR symbol when the
> destination couldn't be resolved within the file. I don't think I'd
> implemented it like that yet, because I'd viewed it as more of an
> optimisation; but I now see it affects semantics and permitted forms.
> 
> Making that a policy decision would seem to remove at least some of the
> special cases if I'm reading your description properly. If so, I'd be happy
> to make sure it *does* work like that, if it would help.

It would help in two ways
* Knowing what can be implemented in MachO
* Having any alias support at all in MachO :-)

Thanks!
Comment 14 Rafael Ávila de Espíndola 2014-01-14 18:20:32 PST
Starting with r198349 we also accept addrspacecast, and that does look like a valid use (see the thread following the commit).

For this bug this probably means that the "new alias" should also have an optional address space.
Comment 15 Reid Kleckner 2014-02-12 20:02:55 PST
One other design issue with GlobalAlias is that the LangRef doesn't give them a section, but in practice, I can construct an alias with a section using LLVM's C++ API because GlobalAlias inherits from GlobalValue.  However, the section doesn't round trip through bitcode or LLVM asm because it was skipped.

There's a few other attributes on GlobalValue that aren't present in LangRef, but they probably don't matter as much.
Comment 16 Rafael Ávila de Espíndola 2014-06-02 21:54:16 PDT
This was fixed, but not in the way that is discussed in this bug. Instead, alias are defined to point to arbitrary expressions and we changed all the code that thought it was possible to, for example, find the aliased symbol.

The argument that convinced me to go this ways is that at the assembly level expressions like

a = b - c

are sometime valid depending on what b and c are. Before r210062 there was no way to produce such an expression in LLVM. What we now have is

* GlobalObjecs introduce new data. They are functions or variables. They normally have a symbol (except for some cases with private linkage).
* GlobalAliases define a label pointing somewhere. At the LLVM level it is not possible to know if the expression defining the location is valid or not.

I would love to be able to tell if a given expression is valid at the IR level without dropping support for representing any expression an assembler supports, but that would be quite a massive change:

With pr10368 we would basically get expressions to represent what is representable with relocations in object files. In addition to that we would need to be have a different expression type for all the cases that don't need a relocation. This would require knowing at the llvm level what section/atom a label is in.