First Last Prev Next    No search results available
Details
: ambiguity parsing local vs global symbols
Bug#: 645
: libraries
: Core LLVM classes
Status: RESOLVED
Resolution: FIXED
: All
: All
: 1.0
: P2
: normal
: 2.0

:
: compile-fail
:
: 761
  Show dependency tree - Show dependency graph
People
Reporter: Markus F.X.J. Oberhumer <markus@oberhumer.com>
Assigned To: Reid Spencer <rspencer@reidspencer.com>
:

Attachments
unstripped.bc (36.68 KB, application/octet-stream)
2005-10-31 07:23, Markus F.X.J. Oberhumer
Details
vmcore + asmparser patch (31.55 KB, patch)
2007-01-11 01:04, Chris Lattner
Details


Note

You need to log in before you can comment on or make changes to this bug.

Related actions


Description:   Opened: 2005-10-31 07:22
Please try the following with the attached unstripped.bc:

opt -strip -o stripped.bc unstripped.bc
llvm-dis stripped.bc
gccas stripped.ll

=> stripped.ll:4104: Reference to an invalid definition: #5 of type 'void () *'

The offending function now looks like this:

int %_Z20test_inline_funcdiffPFvvEPFjjE(void ()*, uint (uint)*) {
        cast void ()* %5 to int         ; <int>:0 [#uses=1]
        cast uint (uint)* %2 to int             ; <int>:1 [#uses=1]
        sub int %0, %1          ; <int>:2 [#uses=1]
        ret int %2
}

~Markus
------- Comment #1 From Markus F.X.J. Oberhumer 2005-10-31 07:23:25 -------
Created an attachment (id=279) [details]
unstripped.bc
------- Comment #2 From Chris Lattner 2005-10-31 12:48:05 -------
Here is a reduced testcase:

---
; RUN: llvm-as < %s | opt -strip | llvm-dis | llvm-as
int %foo(int()* %P) {
        %X = call int %P()
        ret int %X
}
internal int %XX() {
        ret int 1
}
---

This looks like a bug in the ll writer.

-Chris
------- Comment #3 From Chris Lattner 2005-10-31 12:53:55 -------
It looks like this is going to take some significant restructuring.  I'm going
to defer it to 1.7.  Sorry :(

-Chris
------- Comment #4 From Chris Lattner 2006-12-05 11:58:56 -------
The only way to fix this reasonably is to give global symbols a different
prefix (e.g. $X instead of %X).  This 
would also remove hacks needed to support
Assembler/2004-12-05-LocalGlobalSymtabConflict.ll
------- Comment #5 From Reid Spencer 2006-12-05 13:20:51 -------
Initial support for llvm-upgrade has been implemented with this patch:

http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20061204/040812.html

When more details are available, the upgrade will be implemented.
------- Comment #6 From Reid Spencer 2006-12-05 16:17:34 -------
Idea:

Continue to use % for all global names (vars, constants, types, functions). Make
use of % illegal for register names inside functions where $ might be more
appropriate (more assemblerish). 

------- Comment #7 From Chris Lattner 2006-12-05 16:20:29 -------
I actually am gravitating towards the %% idea.

It seems that most simple global names can use %%foo, which has the added advantage that they stand 
out a bit. 

Complex global names (cases where we use "foo bar" instead of %), can use %"..".  This would allow us 
to continue the use of any character in the name.

Reasonable?

-Chris
------- Comment #8 From Reid Spencer 2006-12-05 16:31:15 -------
Yup, makes sense that way too, but I'd still vote for not allowing % as a
register name. That is %% or %"..." is always global.  "..." or $... is local.
That way its even more distinct.
------- Comment #9 From Chris Lattner 2006-12-05 17:12:10 -------
The problem with disallowing stuff is that it means we'd have to check it
somewhere, slowing things 
down.

Is there a problem with:

%%G and %"G" being global and
%L and "L" being local?

In the asmprinter, if % is used in a local symbol, you could always use "%"
like we do today.

Today, this is valid syntax:
"%" = global int 4

-Chris
------- Comment #10 From Reid Spencer 2006-12-06 00:38:31 -------
> The problem with disallowing stuff is that it means we'd have to check it
> somewhere, slowing things down.

Huh? All this means is turning the Name production into a GlobalName and a
LocalName production. Same for OptAssign -> GlobalOptAssign and LocalOptAssign.

You're worried about the parser doing extra checking?

> Is there a problem with:
> %%G and %"G" being global and
> %L and "L" being local?

No, there's nothing "wrong" with it. Parsing wise its not ambiguous, but when
humans read it, its much more clear with:

%G and %"G" being global and
$L and $"L" being local. 

Simple rule: $ means local, % means global.


> In the asmprinter, if % is used in a local symbol, you could always use 
> "%" like we do today.

Sure.

> Today, this is valid syntax:
> "%" = global int 4

Sure, I'm just trying to simplify things so that when someone reads the assembly
its always clear: $=local, %=global. Period, no special cases.

But, its not a big deal. Implement what you like.
------- Comment #11 From Chris Lattner 2006-12-10 14:59:47 -------
Ok, I misunderstood what you said.  I have no inherent preference for the
prefix we use.  The 
disadvantage of using $ as a prefix is that $ is a valid (and commonly used on
macos) character in some 
identifiers.  Using $ as a prefix would force those to print with "" syntax
instead of the simple prefix 
syntax.
------- Comment #12 From Reid Spencer 2006-12-10 15:11:30 -------
It doesn't have to be $. My whole point was that % and %% might confused by some
humans. Using % for globals and something else for locals would alleviate that
confusion.

Anyway, I reiterate, implement what you like. Its not that big a deal.

Reid.
------- Comment #13 From Chris Lattner 2006-12-10 15:16:19 -------
What do you think of @global and %local?

-Chris
------- Comment #14 From Reid Spencer 2006-12-10 15:49:33 -------
Works for me.
------- Comment #15 From Chris Lattner 2007-01-11 00:23:05 -------
Another hard case:
define i32 @foo(i32()* %P) {
        %X = call i32 %P()
        %Y = call i32 @XX()
        ret i32 %X
}
define internal i32 @XX() {
        ret i32 1
}
------- Comment #16 From Chris Lattner 2007-01-11 00:38:17 -------
The .ll file printer + asmparser bits are implemented, I'll attach a patch
soon.  Unfortunately I can't commit 
these until llvm-upgrade is implemented, as doing so would massacre llvm/test. 
Reid volenteered to do 
the llvm-upgrade part.
------- Comment #17 From Chris Lattner 2007-01-11 01:04:47 -------
Created an attachment (id=557) [details]
vmcore + asmparser patch

Hold your breath, here it is.
------- Comment #18 From Reid Spencer 2007-01-12 12:36:28 -------
Mine.
------- Comment #19 From Reid Spencer 2007-01-12 12:58:39 -------
Are module asm blocks considered global ? I'm assuming they are.
------- Comment #20 From Reid Spencer 2007-01-12 13:07:32 -------
Never mind, silly question. They don't have names :)
------- Comment #21 From Reid Spencer 2007-01-12 13:49:27 -------
This is an even harder test case (for llvm-upgrade):

int %foo(int()* %P) {
        %XX = call int %P()
        %Y = call uint %XX()
        ret int %XX
}

internal uint "XX"() {
        ret uint 1
}

------- Comment #22 From Reid Spencer 2007-01-12 14:38:57 -------
This is basically going to take a rewrite of llvm-upgrade. However, I'm not even
sure that will help because the ambiguity that PR645 is solving is still present
in llvm-ugprade's input. Consider this:

void %func(int* %X) {
entry:
    br next

more:
    %ptr = getelementptr long* %X, int 0
    %ptr2 = getelementptr int* %Y, int 0
    br done

next:
    %Y = getelementptr int* %X, int 0
    br more

done:
   ret void
}

%X = internal constant long 5

When we hit the more block we have two problems. First, we have no idea that
long* %X is forward referenced. It could be either local or global. Furthermore
there's a %X local symbol of a different type.  Second, LLVM allows instructions
to be forward referenced too. This is illustrated in the next instruction in
which case %Y is local but it hasn't been seen yet by the parser. This prevents
use from using a rule like: "if you haven't seen it yet, its global" because it
might be a forward referenced local. 

These issues require llvm-upgrade to keep a model of each function until the
entire function is finished parsing. Then we have to go back and "Fixup" any
references that were not defined by the function. llvm-upgrade wasn't designed
to do this. It emits instructions as soon as they are read. 

Consequently, its going to be a while before this can get resolved.
------- Comment #23 From Chris Lattner 2007-01-12 15:35:08 -------
I assume you should use the same approach that asmparser used to.  Basically,
if you see a name you:

1. Check to see if it's already defined, if so, use it.
2. If not, see if we have a placeholder for it already, if so, use it.
3. If not, create a new placeholder.  Add it to the list of unresolved names.
4. When done parsing each function, scan through the list of unresolved names. 
If any instructions match 
an unresolved name, resolve it.
5. When done parsing the module, scan through the list of unresolved names, any
unresolved names get 
resolved to globals.  If something isn't resolvable, it's an error.

-Chris
------- Comment #24 From Reid Spencer 2007-01-12 15:56:30 -------
Nope, it doesn't. Until now it was unnecessary. I decoupled llvm-upgrade from
LLVM so that it is not affected by IR changes or any "new" rules. For the most
part it just passes strings through on the assumption that the input is valid
and the number of changes required are minimal.

With this change, I'm going to do exactly what you just described. Hence, the
rewrite.
------- Comment #25 From Chris Lattner 2007-01-12 16:04:38 -------
ok
------- Comment #26 From Chris Lattner 2007-01-25 11:37:34 -------
Note: when this lands, we should remove Function::renameLocalSymbols and all
uses.  This fix obsoletes 
that hack.

-Chris
------- Comment #27 From Reid Spencer 2007-01-25 12:03:15 -------
Okay. This should land this week if I can get through the llvm-upgrade portion
of it. I'll make the change now so that it gets included with the commit.
------- Comment #28 From Chris Lattner 2007-01-25 12:04:35 -------
no worries, we can make the change after it lands.  I just didn't want to
forget about it. :)
------- Comment #29 From Reid Spencer 2007-01-26 02:32:19 -------
This has now been implemented via joint effort. The llvm-upgrade changes are in
this patch:

http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20070122/043254.html

First Last Prev Next    No search results available