Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ambiguity parsing local vs global symbols #1017

Closed
markus-oberhumer opened this issue Oct 31, 2005 · 29 comments
Closed

ambiguity parsing local vs global symbols #1017

markus-oberhumer opened this issue Oct 31, 2005 · 29 comments
Labels
bugzilla Issues migrated from bugzilla compile-fail Use [accepts-invalid] and [rejects-valid] instead llvm:core

Comments

@markus-oberhumer
Copy link

Bugzilla Link 645
Resolution FIXED
Resolved on Feb 22, 2010 12:50
Version 1.0
OS All
Blocks llvm/llvm-bugzilla-archive#761
Attachments unstripped.bc
CC @lattner

Extended Description

Please try the following with the attached unstripped.bc:

opt -strip -o stripped.bc unstripped.bc
llvm-dis stripped.bc
gccas stripped.ll

=> stripped.ll:4104: Reference to an invalid definition: #​5 of type 'void () *'

The offending function now looks like this:

int %_Z20test_inline_funcdiffPFvvEPFjjE(void (), uint (uint)) {
cast void ()* %5 to int ; :0 [#uses=1]
cast uint (uint)* %2 to int ; :1 [#uses=1]
sub int %0, %1 ; :2 [#uses=1]
ret int %2
}

~Markus

@lattner
Copy link
Collaborator

lattner commented Oct 31, 2005

Here is a reduced testcase:


; RUN: llvm-as < %s | opt -strip | llvm-dis | llvm-as
int %foo(int()* %P) {
%X = call int %P()
ret int %X
}
internal int %XX() {
ret int 1
}

This looks like a bug in the ll writer.

-Chris

@lattner
Copy link
Collaborator

lattner commented Oct 31, 2005

It looks like this is going to take some significant restructuring. I'm going to defer it to 1.7. Sorry :(

-Chris

@lattner
Copy link
Collaborator

lattner commented Dec 5, 2006

The only way to fix this reasonably is to give global symbols a different prefix (e.g. $X instead of %X). This
would also remove hacks needed to support Assembler/2004-12-05-LocalGlobalSymtabConflict.ll

@llvmbot
Copy link
Collaborator

llvmbot commented Dec 5, 2006

Initial support for llvm-upgrade has been implemented with this patch:

http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20061204/040812.html

When more details are available, the upgrade will be implemented.

@llvmbot
Copy link
Collaborator

llvmbot commented Dec 6, 2006

Idea:

Continue to use % for all global names (vars, constants, types, functions). Make
use of % illegal for register names inside functions where $ might be more
appropriate (more assemblerish).

@lattner
Copy link
Collaborator

lattner commented Dec 6, 2006

I actually am gravitating towards the %% idea.

It seems that most simple global names can use %%foo, which has the added advantage that they stand
out a bit.

Complex global names (cases where we use "foo bar" instead of %), can use %"..". This would allow us
to continue the use of any character in the name.

Reasonable?

-Chris

@llvmbot
Copy link
Collaborator

llvmbot commented Dec 6, 2006

Yup, makes sense that way too, but I'd still vote for not allowing % as a
register name. That is %% or %"..." is always global. "..." or $... is local.
That way its even more distinct.

@lattner
Copy link
Collaborator

lattner commented Dec 6, 2006

The problem with disallowing stuff is that it means we'd have to check it somewhere, slowing things
down.

Is there a problem with:

%%G and %"G" being global and
%L and "L" being local?

In the asmprinter, if % is used in a local symbol, you could always use "%" like we do today.

Today, this is valid syntax:
"%" = global int 4

-Chris

@llvmbot
Copy link
Collaborator

llvmbot commented Dec 6, 2006

The problem with disallowing stuff is that it means we'd have to check it
somewhere, slowing things down.

Huh? All this means is turning the Name production into a GlobalName and a
LocalName production. Same for OptAssign -> GlobalOptAssign and LocalOptAssign.

You're worried about the parser doing extra checking?

Is there a problem with:
%%G and %"G" being global and
%L and "L" being local?

No, there's nothing "wrong" with it. Parsing wise its not ambiguous, but when
humans read it, its much more clear with:

%G and %"G" being global and
$L and $"L" being local.

Simple rule: $ means local, % means global.

In the asmprinter, if % is used in a local symbol, you could always use
"%" like we do today.

Sure.

Today, this is valid syntax:
"%" = global int 4

Sure, I'm just trying to simplify things so that when someone reads the assembly
its always clear: $=local, %=global. Period, no special cases.

But, its not a big deal. Implement what you like.

@lattner
Copy link
Collaborator

lattner commented Dec 10, 2006

Ok, I misunderstood what you said. I have no inherent preference for the prefix we use. The
disadvantage of using $ as a prefix is that $ is a valid (and commonly used on macos) character in some
identifiers. Using $ as a prefix would force those to print with "" syntax instead of the simple prefix
syntax.

@llvmbot
Copy link
Collaborator

llvmbot commented Dec 10, 2006

It doesn't have to be $. My whole point was that % and %% might confused by some
humans. Using % for globals and something else for locals would alleviate that
confusion.

Anyway, I reiterate, implement what you like. Its not that big a deal.

Reid.

@lattner
Copy link
Collaborator

lattner commented Dec 10, 2006

What do you think of @​global and %local?

-Chris

@llvmbot
Copy link
Collaborator

llvmbot commented Dec 10, 2006

Works for me.

@lattner
Copy link
Collaborator

lattner commented Jan 11, 2007

Another hard case:
define i32 @​foo(i32()* %P) {
%X = call i32 %P()
%Y = call i32 @​XX()
ret i32 %X
}
define internal i32 @​XX() {
ret i32 1
}

@lattner
Copy link
Collaborator

lattner commented Jan 11, 2007

The .ll file printer + asmparser bits are implemented, I'll attach a patch soon. Unfortunately I can't commit
these until llvm-upgrade is implemented, as doing so would massacre llvm/test. Reid volenteered to do
the llvm-upgrade part.

@lattner
Copy link
Collaborator

lattner commented Jan 11, 2007

vmcore + asmparser patch
Hold your breath, here it is.

@llvmbot
Copy link
Collaborator

llvmbot commented Jan 12, 2007

Mine.

@llvmbot
Copy link
Collaborator

llvmbot commented Jan 12, 2007

Are module asm blocks considered global ? I'm assuming they are.

@llvmbot
Copy link
Collaborator

llvmbot commented Jan 12, 2007

Never mind, silly question. They don't have names :)

@llvmbot
Copy link
Collaborator

llvmbot commented Jan 12, 2007

This is an even harder test case (for llvm-upgrade):

int %foo(int()* %P) {
%XX = call int %P()
%Y = call uint %XX()
ret int %XX
}

internal uint "XX"() {
ret uint 1
}

@llvmbot
Copy link
Collaborator

llvmbot commented Jan 12, 2007

This is basically going to take a rewrite of llvm-upgrade. However, I'm not even
sure that will help because the ambiguity that llvm/llvm-bugzilla-archive#645 is solving is still present
in llvm-ugprade's input. Consider this:

void %func(int* %X) {
entry:
br next

more:
%ptr = getelementptr long* %X, int 0
%ptr2 = getelementptr int* %Y, int 0
br done

next:
%Y = getelementptr int* %X, int 0
br more

done:
ret void
}

%X = internal constant long 5

When we hit the more block we have two problems. First, we have no idea that
long* %X is forward referenced. It could be either local or global. Furthermore
there's a %X local symbol of a different type. Second, LLVM allows instructions
to be forward referenced too. This is illustrated in the next instruction in
which case %Y is local but it hasn't been seen yet by the parser. This prevents
use from using a rule like: "if you haven't seen it yet, its global" because it
might be a forward referenced local.

These issues require llvm-upgrade to keep a model of each function until the
entire function is finished parsing. Then we have to go back and "Fixup" any
references that were not defined by the function. llvm-upgrade wasn't designed
to do this. It emits instructions as soon as they are read.

Consequently, its going to be a while before this can get resolved.

@lattner
Copy link
Collaborator

lattner commented Jan 12, 2007

I assume you should use the same approach that asmparser used to. Basically, if you see a name you:

  1. Check to see if it's already defined, if so, use it.
  2. If not, see if we have a placeholder for it already, if so, use it.
  3. If not, create a new placeholder. Add it to the list of unresolved names.
  4. When done parsing each function, scan through the list of unresolved names. If any instructions match
    an unresolved name, resolve it.
  5. When done parsing the module, scan through the list of unresolved names, any unresolved names get
    resolved to globals. If something isn't resolvable, it's an error.

-Chris

@llvmbot
Copy link
Collaborator

llvmbot commented Jan 12, 2007

Nope, it doesn't. Until now it was unnecessary. I decoupled llvm-upgrade from
LLVM so that it is not affected by IR changes or any "new" rules. For the most
part it just passes strings through on the assumption that the input is valid
and the number of changes required are minimal.

With this change, I'm going to do exactly what you just described. Hence, the
rewrite.

@lattner
Copy link
Collaborator

lattner commented Jan 13, 2007

ok

@lattner
Copy link
Collaborator

lattner commented Jan 25, 2007

Note: when this lands, we should remove Function::renameLocalSymbols and all uses. This fix obsoletes
that hack.

-Chris

@llvmbot
Copy link
Collaborator

llvmbot commented Jan 25, 2007

Okay. This should land this week if I can get through the llvm-upgrade portion
of it. I'll make the change now so that it gets included with the commit.

@lattner
Copy link
Collaborator

lattner commented Jan 25, 2007

no worries, we can make the change after it lands. I just didn't want to forget about it. :)

@llvmbot
Copy link
Collaborator

llvmbot commented Jan 26, 2007

This has now been implemented via joint effort. The llvm-upgrade changes are in
this patch:

http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20070122/043254.html

@lattner
Copy link
Collaborator

lattner commented Nov 27, 2021

mentioned in issue llvm/llvm-bugzilla-archive#761

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 3, 2021
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugzilla Issues migrated from bugzilla compile-fail Use [accepts-invalid] and [rejects-valid] instead llvm:core
Projects
None yet
Development

No branches or pull requests

3 participants