First Last Prev Next    No search results available
Details
: Add support for "non-call" exceptions, eliminate invoke
Bug#: 1269
: libraries
: Core LLVM classes
Status: NEW
Resolution:
: All
: All
: 1.0
: P2
: enhancement
: ---

: http://nondot.org/~sabre/LLVMNotes/ExceptionHandlingChang...
: new-feature
:
:
  Show dependency tree - Show dependency graph
People
Reporter: Reid Spencer <rspencer@reidspencer.com>
Assigned To: Unassigned LLVM Bugs <unassignedbugs@nondot.org>
:

Attachments
Patch for steps 1. to 5. (7.45 KB, patch)
2007-10-12 23:46, varth
Details
Patch for steps 1. to 5., 7. and 8. to 11. (16.47 KB, patch)
2007-10-13 19:19, varth
Details
Throwing instructions as terminators (94.76 KB, patch)
2007-12-04 04:54, varth
Details
add unwindDest to BasicBlock, making it a User (12.57 KB, patch)
2008-03-01 17:08, Nick Lewycky
Details
update Support/CFG.h to handle unwindTo as predecessor / successor (2.30 KB, patch)
2008-03-01 21:42, Nick Lewycky
Details
teach prune-eh to prune unwind_to (2.49 KB, patch)
2008-03-06 01:00, Nick Lewycky
Details
make the inliner and simplifycfg respect unwind_to (4.30 KB, patch)
2008-03-08 22:59, Nick Lewycky
Details
update the cloner (fixes bugpoint) and its biggest user, the loop optz'ns (7.91 KB, patch)
2008-03-08 23:15, Nick Lewycky
Details


Note

You need to log in before you can comment on or make changes to this bug.

Related actions


Description:   Opened: 2007-03-24 16:12
I would like to see Chris' notes on Exception Handling implemented (see the link
above), preferably for 2.0. My reasons for this are two fold:

1. It fits in the same category of disruptive assembly/bytecode changes as the 
   other major changes in LLVM 2.0; so, including it in 2.0 means less future
   impact for 2.0 users. 
2. As I start to develop the HLVM exception model, I'd prefer to do it on an EH
   model in LLVM that is simpler for the kinds of languages HLVM targets.

Yes, I'm signing up to do this; hopefully, before 2.0 is released.
------- Comment #1 From Chris Lattner 2007-03-24 16:57:00 -------
Very cool.
------- Comment #2 From Reid Spencer 2007-03-24 17:12:43 -------
Some increments to map this work out (feedback would be welcome):

 1. Add an "unwindTo" member to BasicBlock.
 2. Add "unwind to" keywords to block label in assembly parser
 3. Add "unwind to" keywords to block label in assembly writer
 4. Add support for "unwind to" to bytecode reader
 5. Add support for "unwind to" to bytecode writer.
 6. Fix prune-eh to deal with "unwindTo" in BasicBlock.
 7. Remove generation of invoke instructions in llvm-gcc
 8. Remove invoke instruction from assembly parser
 9. Remove invoke instruction from assembly writer
10. Remove invoke instruction from bytecode reader
11. Remove invoke instruction from bytecode writer
12. Replace all uses of invoke instruction in LLVM passes with Call instruction
13. Remove all uses of CallSite class 
14. Remove CallSite class.
15. Remove invoke instruction from LLVM IR

Anything I forgot, or anything that would lead to an inconsistent state?
------- Comment #3 From Reid Spencer 2007-04-25 23:32:52 -------
Unfortunately, I won't get time to work on this for 2.0
------- Comment #4 From Chris Lattner 2007-05-23 02:58:03 -------
Here are the notes in question:
http://nondot.org/sabre/LLVMNotes/ExceptionHandlingChanges.txt
------- Comment #5 From Anton Korobeynikov 2007-05-23 17:47:56 -------
Btw, this should simplify debug information emission a lot, since DWARF debug
information operates on "ranges".
------- Comment #6 From Reid Spencer 2007-09-06 21:20:48 -------
Looks like I'm not going to get to this any time soon.
------- Comment #7 From varth 2007-10-12 23:46:12 -------
Created an attachment (id=1157) [details]
Patch for steps 1. to 5.
------- Comment #8 From varth 2007-10-12 23:47:23 -------
I would like to sign on this. I'm not entirely familiar with everything so
currently I believe that 1. to 15. are the right steps to accomplish it.

Here's a first patch that addresses 1. to 5. I wait for your comments. If
everything is fine, I'll assign myself to the bug (do I have the rights to do
that?)

Before addressing 6., we must think on how we represent control flow when a
basic block unwinds to another basic block. The issue is that a BasicBlock is
currently not a User object, so it can not be added to the use objects of
another basic block.

I see three windows here on how to implement this:

1) Following Chris notes, we could add a bit to each instruction to say if it
can raise an exception. In that case, the instructions that raise an exception
of a basic block are all "uses" of the basic block to which they may unwind to
(but this may change all these instructions to terminators instructions in LLVM
design)

2) We add a new field to the BasicBlock class, which is an array of
BasicBlocks, listing all predecessors (I think a BasicBlock only needs to know
the basic blocks that are predecessors, not the specific instructions that may
jump to the basic block. I may be wrong here)

3) Make the BasicBlock class inherits the User class (but does it make sense?)


I've been dying on having this feature implemented. I think once we solve this
implementation of uses issue, steps 6. to 15. shouldn't be hard to implement.

And by the way, I must say, it's a real pleasure to hack LLVM. You've done a
really great job for having so, so, so readable code (I'm talking about the
code that I changed for 1. to 5. OK, it's just parsing, but well, I've found so
unreadable code on other projects....)
------- Comment #9 From varth 2007-10-13 19:19:15 -------
Created an attachment (id=1159) [details]
Patch for steps 1. to 5., 7. and 8. to 11.

This patch corrects the previous patch (a bug in AsmWriter) and implements 8.
to 11. It also contains the patch for step 7. I think once we solve the issue
on my first comment, we can plan on removing the InvokeInstruction visible to
users (since Invoke instructions are spread all over LLVM, it may take a long
time before removing them all).

Once I get your feedback on the patches, I'll apply steps 1. to 5., and then
mail the mailing list, informing of the changes. I hope it does not imply many
changes in the codegen.

For step 7, since in gcc call sites only can throw exceptions, the changes were
not difficult. However, my machine can not compile llvm-gcc, can someone look
if my patch compiles fine?
------- Comment #10 From Duncan Sands 2007-10-15 03:30:38 -------
One issue not addressed in steps 1 .. 15 is the problem of
instruction re-ordering.  For example, consider

...
  %tmp = some_useful_value
  invoke some_routine unwind %some_unwind_block
...
some_unwind_block:
  ...use %tmp...

It is essential that codegen does not allow the assignment
to %tmp to be re-ordered after the invoke.  This is carefully
taken care of in SelectionDAGISel.  The difficulty is that since
%tmp is a register, and isn't used as an argument to the invoke
call, a priori there's no reason why it shouldn't be re-ordered
after the call.  This is why special logic is required, which
basically says that an invoke implies a compiler barrier.  How
to handle this in the new setup?  Must all potentially throwing
instructions imply compiler barriers?
------- Comment #11 From Duncan Sands 2007-10-15 03:39:49 -------
> 1) Following Chris notes, we could add a bit to each instruction to say if
> it can raise an exception. In that case, the instructions that raise an
> exception of a basic block are all "uses" of the basic block to which
> they may unwind to (but this may change all these instructions to
> terminators instructions in LLVM design)

I think they have to become terminators: if you allow control flow
to branch in the middle of basic blocks then you have discarded the
whole notion of "basic block" which seems like a very dangerous thing
to do.
------- Comment #12 From Duncan Sands 2007-10-15 03:45:29 -------
> For step 7, since in gcc call sites only can throw exceptions, the changes
> were not difficult.

This is completely wrong: the whole reason for this PR is that in gcc non
call sites can throw exceptions!  See tree_could_throw_p in tree-eh.c.
------- Comment #13 From varth 2007-10-15 20:31:41 -------
(In reply to comment #12)
> > For step 7, since in gcc call sites only can throw exceptions, the changes
> > were not difficult.
> 
> This is completely wrong: the whole reason for this PR is that in gcc non
> call sites can throw exceptions!  See tree_could_throw_p in tree-eh.c.
> 

I am tempted to believe you (in c++, non call sites can throw exceptions) since
I'm not a lot familiar with c++ exceptions, but following Chris notes, you're
wrong.

I'm more familiar with Java and c# exceptions, and they definitely need this
kind of behavior. This is why the PR is here.

On an other note, if you are right, then that does not matter to me :) I am
only changing llvm-gcc so that it compiles and generates code that conforms to
the exception model of LLVM. So in that sense, the patch for 7. should be OK.
------- Comment #14 From varth 2007-10-15 20:35:47 -------
(In reply to comment #11)
> I think they have to become terminators: if you allow control flow
> to branch in the middle of basic blocks 

Err, I do not allow control flow to branch in the middle of basic blocks. I
allow (well actually, this design) non-terminator instructions to branch to a
basic block.

Actually, still following Chris notes, they should not be terminator inst,
since this would not avoid having things like (replace invoke by calls):


  invoke bar() to cont unwind %BarThrew

BarThrew:
   invoke ~X(&F) to cont2 unwind %unexp
cont2:
   invoke ~X(&E) to cont3 unwind %unexp
cont3:
   invoke ~X(&D) to cont4 unwind %unexp
cont4:
   invoke ~X(&C) to cont5 unwind %unexp
cont5:
   invoke ~X(&B) to cont6 unwind %unexp
cont6:
   invoke ~X(&A) to cont7 unwind %unexp
cont7:
   unwind
unexp:
   call std::unexpected()
   unwind
------- Comment #15 From varth 2007-10-15 20:41:44 -------
(In reply to comment #10)
> to handle this in the new setup?  Must all potentially throwing
> instructions imply compiler barriers?
> 

For that I am tempted to apply the 1. to 5. patches (preparing for changes),
email the list for talking about this kind of issue (involving codegen folks)
and magically all will work!

Since this will surely not be the case, we need to have a discussion of wether
or not we want this feature. I definitely want it, and I definitely want it to
be in 2.2. However, if it's not a desirable feature for now (because it
involves too many parts of LLVM), we could delay, but I'm afraid it will be
harder and harder to switch the exception model in such a case.
------- Comment #16 From varth 2007-12-04 04:49:23 -------
Following discussions with Chris and Duncan, the file exceptionsBlock.patch
sets throwing instructions (calls or non calls) as terminator instructions.

The main modifications are:
1) Removes the TerminatorInst class
2) Sets the BasicBlock class as a User. A BasicBlock now uses two blocks: and
unwind destination and a normal destination (which can be set to null).
3) Modifies the .ll and .bc parsers and writers (but should be compatible with
old .bc and .ll files)
4) Modifies SelectionDAGISel.cpp to generate exception DAGs for throwing
instructions instead of only Invoke instructions.
5) You can set non call exceptions with the -non-call-exceptions command line
(defined in Instruction.cpp)

I tried to provide a compatibility layer with Invoke instructions, but this
involves too many workaround for an instruction that will disappear. .ll or .bc
files which have invoke instructions will not work with this patch. We need a
tool to upgrade these files.
------- Comment #17 From varth 2007-12-04 04:54:43 -------
Created an attachment (id=1278) [details]
Throwing instructions as terminators
------- Comment #18 From Nick Lewycky 2008-02-28 01:23:34 -------
Please don't remove TerminatorInst. We still have non-Invoke terminators and
it's useful to distinguish the class of instructions that represent
control-flow.

Removing invoke is fine, but you still have to upgrade it, for bitcode file
format compatibility. If you encounter an invoke, construct a branch to a new
BB with a call and a branch in it.

I'd suggest leaving invoke *in* for now, and just adding the unwindTo support
to basic blocks and getting that reviewed and checked into the tree.
------- Comment #19 From varth 2008-02-28 03:14:42 -------
(In reply to comment #18)
> Please don't remove TerminatorInst. We still have non-Invoke terminators and
> it's useful to distinguish the class of instructions that represent
> control-flow.
> 

Actually, only the class hierarchy with TerminatorInst is removed. You still
get a isTerminator() function which returns true if the instruction is a
control flow instruction or a throwing instruction.

> Removing invoke is fine, but you still have to upgrade it, for bitcode file
> format compatibility. If you encounter an invoke, construct a branch to a new
> BB with a call and a branch in it.

Yes, from what I remember that was easy to imagine, but painfull to do ;-)

> 
> I'd suggest leaving invoke *in* for now, and just adding the unwindTo support
> to basic blocks and getting that reviewed and checked into the tree.
> 

The patch is rather old, and I don't think it is still valid on TOT. Consider
this patch as a starter for someone who desperately wants the feature (like
me). But really applying it needs someone with better LLVM insights.
------- Comment #20 From Nick Lewycky 2008-03-01 17:08:37 -------
Created an attachment (id=1482) [details]
add unwindDest to BasicBlock, making it a User

This patch adds the unwind block to the BasicBlock structure, turning
BasicBlock into a User instead of a Value. The asm parser and writer as well as
the bitcode reader and writer are updated to handle this field properly.

Nothing else is changed. Most importantly, no semantics of any instruction are
changed (such as the "unwind" instruction actually following the unwindDest on
a BB), nor does the unwind-block show up as a successor or predecessor in the
graph traversal. I'm saving that for the nest patch.

Please let me know whether I can commit this change.
------- Comment #21 From Chris Lattner 2008-03-01 18:42:34 -------
Overall, the patch looks great.  One minor quibble: instead of making
FUNC_CODE_BLOCK_UNWINDDEST be a block in funcinfo, why not make this be a
record emitted into the function stream?  At the start of each basic block you
could emit an "INST_BB_UNWINDDEST <bb#>" if the destination is not null, and
the reader could easily pick this up.  It seems that it would simplify some of
the input/out and be more efficient in terms of file size.  

Also, "foo: on unwind %bar" is somewhat awkward, how about "foo: unwind_dest
%bar" or "foo: unwind_to %bar" ?

Thanks for working on this!
------- Comment #22 From Nick Lewycky 2008-03-01 21:42:48 -------
Created an attachment (id=1483) [details]
update Support/CFG.h to handle unwindTo as predecessor / successor

This updates CFG.h in the obvious way to include the unwind dest in the CFG.
------- Comment #23 From Chris Lattner 2008-03-02 13:56:09 -------
This patch looks fine, but I don't understand why this can't use dyn_cast:

+    if (isa<TerminatorInst>(*It))      // not dyn_cast due to
const-correctness
+      return cast<TerminatorInst>(*It)->getParent();

Also, "atEnd()" should be updated to handle the BB case as well.  If the block
has a null unwind dest, then idx = # succs is end, otherwise it isn't.

-Chris
------- Comment #24 From varth 2008-03-02 15:52:29 -------
Thanx for doing this Nick. 

What are your plans after handling syntactic changes? Things can get very
tricky when "throwing instructions must be terminator", and I haven't looked at
what this would involve in Codegen. Are you planning on implementing the new EH
model? (that would be great by the way!)
------- Comment #25 From Nick Lewycky 2008-03-02 16:25:01 -------
> This patch looks fine, but I don't understand why this can't use dyn_cast:
> 
> +    if (isa<TerminatorInst>(*It))      // not dyn_cast due to const-correctness
> +      return cast<TerminatorInst>(*It)->getParent();

This code works because cast<> preserves const-ness. If I were to rewrite it
as:

  if (TerminatorInst *TI = dyn_cast<TerminatorInst>(*It))

then I'd be forcing it to be non-const which is wrong (compile error) in one
case, or if I did use "const TerminatorInst", it'd be wrong (compile error)
trying to cast a const BasicBlock* to a non-const BB*.

I found the template code hard to follow, but I think this is what's going
wrong. It certainly fails experimentally.

> Also, "atEnd()" should be updated to handle the BB case as well.  If the block
> has a null unwind dest, then idx = # succs is end, otherwise it isn't.

I don't follow. Neither PredIterator nor SuccIterator have an "atEnd()" method.
succ_iterator has a constructor for the end iterator which has already been
updated to add one to idx when the BB has an unwind dest.

The It.atEnd() doesn't need to be modified since it's a simple use iterator. It
already includes BBs that use this one, much like it includes references to PHI
nodes.
------- Comment #26 From Nick Lewycky 2008-03-02 16:40:21 -------
No problem, Nicolas!

I'd like the new EH model too, but that's a long way off. Right now I'm
focussing on removing "invoke" from the front and middle. There's a lot of work
just to fix all the optimizations to handle unwind-dest properly anyways.

Allowing arbitrary instructions to trap is a whole other ball of wax that I
won't even think about until this part is done.
------- Comment #27 From Chris Lattner 2008-03-06 00:52:31 -------
Re comment 25, "ok and ok".
------- Comment #28 From Nick Lewycky 2008-03-06 01:00:01 -------
Created an attachment (id=1497) [details]
teach prune-eh to prune unwind_to

This is a rather inelegant implementation of the idea that a BB with no calls
(or only calls marked nounwind) doesn't need to have an unwind_to on it.
------- Comment #29 From Nick Lewycky 2008-03-08 22:59:51 -------
Created an attachment (id=1510) [details]
make the inliner and simplifycfg respect unwind_to
------- Comment #30 From Nick Lewycky 2008-03-08 23:15:14 -------
Created an attachment (id=1511) [details]
update the cloner (fixes bugpoint) and its biggest user, the loop optz'ns

No test cases this time, sorry. They're hard to write for this sort of change.

First Last Prev Next    No search results available