Global alias is currently defined to have its own type, and then have an initializer of the same type. The initializer is a "Constant*" which is either a) a global value, b) a constant expr bitcast, c) a constantexpr gep with all zero indices, d) null (which isn't valid, but transiently happens). This doesn't make sense for a number of reasons. Instead, the initializer of a GlobalAlias should be required to be a GlobalValue, but the type of the source and dest of the alias should not be required to be the same, they should just be completely decoupled. -Chris
But what are these reasons? Right now everyone can look "through" alias without thinking how it should interpret the stuff.
Because the design is broken. Aliases are aliases to *a global* not a constant expression. The fact that we have to allow getelementptr (for example) is a sign that this is a hack.
I'm not saying that this is a huge priority to fix. I know that the current implementation works fine.
Originally only GlobalValue was allowed as an aliasee and types should match. This was relaxed lately, but I don't remember the exact reason. Probably because it was implemented in gcc in such a wrong way that people often 'exploited' this and even considered as a *feature* (that types of alias / aliasee might now match), thus it was changed lately... Maybe we should just be more conservative on this matter?
The types *have* to be able to mismatch. This is important for the IR linker among other things. I'm just saying that a constantexpr isn't the right way to represent this.
How would this interact with RAUW? Suppose you have @a = i32 alias i64 @b (for example), then you use RAUW to replace @b with (eg) a ptrtoint: @a = i32 alias (ptrtoint i8* @b.1 to i64) Then you would get an illegal alias.
Globals are always pointers, but your issue is a valid one none-the-less with bitcasts.
I posted a patch here to make GlobalAlias's API look like the aliasee is always a GlobalValue; that is, setAliasee() takes a GlobalValue, and getAliasee() always returns a GlobalValue: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20110725/125031.html
I like this approach, this seems better than my suggestion in the description above.
> I posted a patch here to make GlobalAlias's API look like the aliasee is always > a GlobalValue; that is, setAliasee() takes a GlobalValue, and getAliasee() > always returns a GlobalValue: > > http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20110725/125031.html Duncan argued that it would be better if the type of the GlobalAlias always matched the type of its operand. I updated the patch accordingly: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20110801/125477.html
I was thinking about this a bit. The object files we support have a different view of what an alias is. On Mach-O there is (or there will be once the linker implements it) N_INDR. It is just a directive that tells the linker that this symbol should have the same address as another one. The consequences of implementing alias like that are * Alias can point to symbols in other files. * Alias must point to a symbol. On ELF (and I am almost sure on COFF), an alias is just another symbol. It just happens to point to a position also pointed by another symbol. The consequences of implementing alias like that are: * Alias can only point to symbols in this file. * Alias can point inside an object. This could be useful for representing cases where the abi mandates a symbol inside a particular data structure (like microsoft vtables). My suggestion is then: * Have ELF/COFF backends error on an attempt to alias a declaration. (right now they miscompile). * Implement explicit COMDATs at the llvm IR. This is needed to have the notion of a set of symbols/data that can be dropped as a group. * Allow alias with non-zero GEPs, but have the Mach-O backend error on them.
For MachO, I was toying with only producing an N_INDR symbol when the destination couldn't be resolved within the file. I don't think I'd implemented it like that yet, because I'd viewed it as more of an optimisation; but I now see it affects semantics and permitted forms. Making that a policy decision would seem to remove at least some of the special cases if I'm reading your description properly. If so, I'd be happy to make sure it *does* work like that, if it would help.
(In reply to comment #12) > For MachO, I was toying with only producing an N_INDR symbol when the > destination couldn't be resolved within the file. I don't think I'd > implemented it like that yet, because I'd viewed it as more of an > optimisation; but I now see it affects semantics and permitted forms. > > Making that a policy decision would seem to remove at least some of the > special cases if I'm reading your description properly. If so, I'd be happy > to make sure it *does* work like that, if it would help. It would help in two ways * Knowing what can be implemented in MachO * Having any alias support at all in MachO :-) Thanks!
Starting with r198349 we also accept addrspacecast, and that does look like a valid use (see the thread following the commit). For this bug this probably means that the "new alias" should also have an optional address space.
One other design issue with GlobalAlias is that the LangRef doesn't give them a section, but in practice, I can construct an alias with a section using LLVM's C++ API because GlobalAlias inherits from GlobalValue. However, the section doesn't round trip through bitcode or LLVM asm because it was skipped. There's a few other attributes on GlobalValue that aren't present in LangRef, but they probably don't matter as much.
This was fixed, but not in the way that is discussed in this bug. Instead, alias are defined to point to arbitrary expressions and we changed all the code that thought it was possible to, for example, find the aliased symbol. The argument that convinced me to go this ways is that at the assembly level expressions like a = b - c are sometime valid depending on what b and c are. Before r210062 there was no way to produce such an expression in LLVM. What we now have is * GlobalObjecs introduce new data. They are functions or variables. They normally have a symbol (except for some cases with private linkage). * GlobalAliases define a label pointing somewhere. At the LLVM level it is not possible to know if the expression defining the location is valid or not. I would love to be able to tell if a given expression is valid at the IR level without dropping support for representing any expression an assembler supports, but that would be quite a massive change: With pr10368 we would basically get expressions to represent what is representable with relocations in object files. In addition to that we would need to be have a different expression type for all the cases that don't need a relocation. This would require knowing at the llvm level what section/atom a label is in.