Chris pointed out that it would be useful to have a "nocapture" parameter attribute. Its semantics would be that a pointer value passed in with the "nocapture" attribute can be assumed never to escape from the callee (never stored to a global or passed to another function where it could escape).
This could make AddressMightEscape more aggressive in BasicAA, giving better mod/ref info for calls. This property is true of a huge variety of pointer arguments to functions, e.g. strlen, memcpy, etc. Knowledge of this property allows us to optimize things like this:

  char buffer[100];
  buffer[0] = 0;
  strlen(buffer);   // buffer doesn't escape
  x = buffer[0];
  foo();            // can't modify buffer because strlen doesn't cause it to escape
  y = buffer[0];    // always = x

This property could also be inferred for a lot of functions by globalsmodref.
Since strlen is 'readonly' this doesn't seem like a good example.
The point here is that strlen doesn't capture the pointer, so foo couldn't modify it. nocapture and readonly are orthogonal concepts.
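To make the contrast concrete, here is a minimal sketch of the strlen example next to the same code with a capturing callee. The names `remember`, `g_saved`, and the body of `foo` are hypothetical stand-ins invented for illustration; in the original example `foo` is simply an opaque call.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical global through which a captured pointer escapes.
static char *g_saved = nullptr;

// Captures its argument: the pointer outlives the call via g_saved.
void remember(char *p) { g_saved = p; }

// Stand-in for an opaque call: it can reach any pointer that has escaped.
void foo() {
  if (g_saved)
    g_saved[0] = 'Z';
}

// Mirrors the strlen example: nothing captured buffer, so foo() cannot touch it.
bool unchanged_after_strlen() {
  g_saved = nullptr;
  char buffer[100];
  buffer[0] = 0;
  (void)std::strlen(buffer);  // reads the buffer, does not capture it
  char x = buffer[0];
  foo();
  char y = buffer[0];
  return x == y;
}

// Same shape, but the callee captures, so the second load must be redone.
bool unchanged_after_remember() {
  char buffer[100];
  buffer[0] = 0;
  remember(buffer);  // buffer escapes here
  char x = buffer[0];
  foo();             // writes through the escaped pointer
  char y = buffer[0];
  g_saved = nullptr;  // do not leave a dangling pointer behind
  return x == y;
}

// Usage: the two cases differ only in whether the callee captures.
void check_capture_examples() {
  assert(unchanged_after_strlen());
  assert(!unchanged_after_remember());
}
```

Whether the callee reads or writes memory plays no part in the difference between the two cases; only the capture does.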
Can a readonly function capture a pointer? It can't write it into a global variable because it is readonly. I guess it could encode the pointer in the value it returns...
This is readnone but does capture its argument:

  int *foo(int *X) { return X; }
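A small sketch of both directions of the orthogonality, assuming nothing beyond plain C++. `identity` is the function above under a different name; `set_to_one` and the two driver functions are hypothetical helpers added for illustration.

```cpp
#include <cassert>

// Touches no memory at all, yet the pointer escapes through the return value.
int *identity(int *X) { return X; }

// Writes through the pointer but does not keep it: not readonly, yet nocapture.
void set_to_one(int *X) { *X = 1; }

int through_identity() {
  int local = 0;
  int *alias = identity(&local);  // alias points at local
  *alias = 42;                    // local is modified after the call returned
  return local;
}

int through_set_to_one() {
  int local = 0;
  set_to_one(&local);  // the only write happens during the call itself
  return local;
}

// Usage: each function shows one attribute holding without the other.
void check_orthogonal_examples() {
  assert(through_identity() == 42);
  assert(through_set_to_one() == 1);
}
```

After `identity` returns, an optimizer has to treat the returned pointer as an alias of `local`. After `set_to_one` returns, nothing else can reach `local`.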
OK, but llvm-gcc nonetheless seems to have no trouble simplifying your example! I presume the plan is to enhance the gcc tables (which list properties like 'const' for a bunch of functions) so they also have 'nocapture' info.

Also, what is the policy with attribute propagation vs having an analysis? We get some attributes from the front-end, which set them using special front-end knowledge. I'm thinking of 'readnone' and 'readonly'. There is an analysis pass GlobalsModRef which can (presumably) deduce readnone/readonly for some functions for which we have the body. A transform could be built on top of GlobalsModRef that sets readnone/readonly on those functions. But what's the point of propagating attributes in this way if we have the analysis anyway? My answer is that it speeds things up if later the module is linked with another one and reoptimized.

I like the structure of having (1) an attribute deducing analysis and (2) an attribute setting transform. Currently for 'nounwind' these two are lumped into one (PruneEH). Logically it should probably be split, and places that want to know if a function can unwind should query the analysis. Right now this is probably not worth it, but once trapping instructions are implemented that might change.
> OK, but llvm-gcc nonetheless seems to have no trouble
> simplifying your example!

It must be 'proving' that it is unnecessary some other way. It would be interesting to find out how it decides it is useless.

> I presume the plan is to enhance the gcc tables (which list properties like 'const'
> for a bunch of functions) so they also have 'nocapture' info.

That would make sense; alternatively we could add it to simplifylibcalls (yuck), or add an llvm-specific table somewhere in the front-end (less yuck, but still strange). I have no preference for how it is implemented in llvm-gcc.

> Also, what is the policy with attribute propagation vs having
> an analysis? We get some attributes from the front-end, which
> set them using special front-end knowledge. I'm thinking of
> 'readnone' and 'readonly'.

Right.

> There is an analysis pass GlobalsModRef
> which can (presumably) deduce readnone/readonly for some functions
> for which we have the body. A transform could be built on top of
> GlobalsModRef that sets readnone/readonly on those functions. But
> what's the point of propagating attributes in this way if we have
> the analysis anyway?

I'm not sure what you mean by "we have the analysis anyway". An analysis and attrs are two very different things. Attributes are nice because they can be cheaply computed by a variety of means (e.g. tables) and are efficient to propagate in this case. Analyses are nice because they can do arbitrary weird things, like deciding that the function writes, but only reads one argument, or something. You can also put ad-hoc knowledge into passes, such as the mod/ref behavior of printf or scanf, which requires analyzing the format string.

The current GlobalsModRef was written before we had attributes, so it is probably doing too much or doing what it does in the wrong way now. It should be revisited. Ideally running it wouldn't build up heavy heap data structures; it would just add, propagate, and update tags on the functions.
> I like the structure of having (1) an attribute deducing analysis
> and (2) an attribute setting transform.

Me too :)

> Currently for 'nounwind'
> these two are lumped into one (PruneEH). Logically it should probably
> be split, and places that want to know if a function can unwind should
> query the analysis. Right now this is probably not worth it, but once
> trapping instructions are implemented that might change.

That would be reasonable too. At this point, I'm mostly concerned with optimizing various memory things, like your crazy memcpy cases, so I'm focused on readonly/nocapture etc, but optimizing and improving EH info is also useful.

-Chris
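The analysis/transform split discussed above can be pictured with a toy model. This is only a sketch under stated assumptions: `ToyFunction`, `ToyModule`, `deduceReadOnly`, `applyReadOnly`, and the example module are all hypothetical and are not LLVM APIs or the way GlobalsModRef works. The model only tracks "may write memory" and ignores everything else a real pass would have to consider.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy per-function summary. Hypothetical; not an LLVM data structure.
struct ToyFunction {
  bool hasBody = false;              // false for an external declaration
  bool writesMemory = false;         // the body contains a direct store
  bool readOnlyAttr = false;         // attribute already present (e.g. from a front-end table)
  std::vector<std::string> callees;  // names of functions called from the body
};

using ToyModule = std::map<std::string, ToyFunction>;

// (1) The analysis: returns the functions shown not to write memory.
// It only reads the module, so clients could query the result directly.
std::set<std::string> deduceReadOnly(const ToyModule &M) {
  std::set<std::string> readOnly;
  // Optimistic start: trust existing attributes and store-free bodies.
  for (const auto &entry : M)
    if (entry.second.readOnlyAttr ||
        (entry.second.hasBody && !entry.second.writesMemory))
      readOnly.insert(entry.first);
  // Iterate to a fixed point, dropping callers of anything that may write.
  bool changed = true;
  while (changed) {
    changed = false;
    for (const auto &entry : M) {
      if (entry.second.readOnlyAttr || !readOnly.count(entry.first))
        continue;
      for (const auto &callee : entry.second.callees) {
        if (!readOnly.count(callee)) {
          readOnly.erase(entry.first);
          changed = true;
          break;
        }
      }
    }
  }
  return readOnly;
}

// (2) The transform: records the analysis result as attributes.
// Returns how many attributes were newly set.
unsigned applyReadOnly(ToyModule &M, const std::set<std::string> &readOnly) {
  unsigned numSet = 0;
  for (auto &entry : M) {
    if (readOnly.count(entry.first) && !entry.second.readOnlyAttr) {
      entry.second.readOnlyAttr = true;
      ++numSet;
    }
  }
  return numSet;
}

// Hypothetical module used only to exercise the two steps.
ToyModule makeExampleModule() {
  ToyModule M;
  M["strlen"].readOnlyAttr = true;  // declaration, attribute from a table
  M["opaque"];                      // declaration, nothing known
  M["length_plus_one"].hasBody = true;
  M["length_plus_one"].callees.push_back("strlen");
  M["log_it"].hasBody = true;
  M["log_it"].writesMemory = true;
  M["wrapper"].hasBody = true;
  M["wrapper"].callees.push_back("log_it");
  return M;
}

void check_toy_attribute_passes() {
  ToyModule M = makeExampleModule();
  std::set<std::string> readOnly = deduceReadOnly(M);
  assert(readOnly.count("length_plus_one") == 1);
  assert(readOnly.count("wrapper") == 0);
  assert(applyReadOnly(M, readOnly) == 1);
}
```

Once the transform has run, the attribute on `length_plus_one` is available without rerunning the analysis, for example after the module is linked with another one and reoptimized.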
The other useful thing about this is that it lets us do partial mem2reg/sroa for stuff that would otherwise be prevented due to byref args. For example:

  void foo(int &X) NOINLINE { X = 1; }

  int bar() {
    int X;
    foo(X);
    for (i = 0 .. 100)
      X += i;
    return X;
  }

Right now, foo pins X to the stack so mem2reg can't promote it in the loop. If we had nocapture, we could promote allocas and just store/reload around calls like this.

-Chris
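As a rough picture of what "store/reload around calls" would mean for this example, here is a hand-written before/after pair. This is a sketch of the intended effect, not the output of any existing pass. It assumes the loop runs for 0 <= i < 100 (the original range notation leaves the upper bound ambiguous) and spells NOINLINE with the GCC/Clang `noinline` attribute. `bar_promoted` and `slot` are hypothetical names.

```cpp
#include <cassert>

// Assumption: NOINLINE maps to the GCC/Clang attribute; empty elsewhere.
#if defined(__GNUC__)
#define NOINLINE __attribute__((noinline))
#else
#define NOINLINE
#endif

// Writes through the reference but does not keep the address.
NOINLINE void foo(int &X) { X = 1; }

// Original shape: X is pinned in memory for its whole lifetime.
int bar() {
  int X;
  foo(X);
  for (int i = 0; i < 100; ++i)
    X += i;
  return X;
}

// Hand-promoted shape: memory is needed only around the call.
int bar_promoted() {
  int slot;      // stack slot that exists just for the call
  foo(slot);     // X was uninitialized, so nothing to store beforehand
  int x = slot;  // reload once after the call
  for (int i = 0; i < 100; ++i)
    x += i;      // x can stay in a register throughout the loop
  return x;
}

// Usage: both shapes compute 1 + (0 + 1 + ... + 99).
void check_promotion_example() {
  assert(bar() == 4951);
  assert(bar_promoted() == bar());
}
```

The rewrite is only valid because `foo` does not retain the address of its argument; if it did, a later call could observe or modify the slot behind the promoted value's back.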
I added this a while ago. Missed optz'ns with nocapture should get their own bugs.