-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve debug info accuracy for locals after inlining escaping function #47290
Comments
assigned to @OCHyams |
There are alternative proposals for how to track stores, but I'm not sure how far along these are: |
In case it helps anyone I've written up a walkthrough of the example with IR here: https://gist.github.com/OCHyams/29ec463e82b70955c9fc5542f0b69cd7 |
I started prototyping this and ran into a problem. One thing we must consider is that we need to find the variables associated with each of the alloca addresses passed to the callee. Currently I rely on the assumption that we'll have a dbg.value+deref before the callsite for these variables, thanks to LowerDbgDeclare, and do this:
However, it turns out this assumption doesn't always hold. If we have the following pattern after LowerDbgDeclare converts a dbg.declare to dbg.values: dbg.value(%alloca, "x", [DW_OP_deref]) RemoveRedundantDbgInstrs will remove the second dbg.value. See comment from BasicBlockUtils.cpp below. /// ForwardScan strategy: And we're left with the following: dbg.value(%alloca, "x", [DW_OP_deref]) If we then use the the prototype method when inlining 'two' we won't find any variable associated with %alloca. So, I think we will have to either: Option A doesn't feel very nice. Adding in more hard-to-enforce debug info rules doesn't sound good. Additionally, leaving a dbg.value+deref before the call to 'two' may cause misleading debug info in some cases. However, option B may be difficult. For example, scanning back further in the block isn't enough when inlining 'two' if 'one' was inlined first and has multiple blocks. Does anyone have any ideas or suggestions? |
Orlando wrote:
IMO this is the correct path -- we usually treat dbg.values as an assignment of a value that is then constant until changed by another assignment, but the presence of DW_OP_deref invalidates that treatment. It really does need a different set of behaviours. I've been plugging this elsewhere: the correct intrinsic for this IMO is dbg.addr, the implementation for which is just "dbg.value with a deref". The fact that it's a different intrinsic signifies it needs to be treated differently, because it refers to a value stored in an alloca that can change. dbg.addr is also only supposed to associate with allocas, and we wouldn't have to dig through DIExpressions to identify intrinsics that can refer to memory. |
SGTM. However, after some prototyping it seems like dbg.addr is not working properly, or at least isn't currently suitable for this task. For example, I found that EarlyCSE drops all dbg.addrs, and we just drop undef dbg.addrs in the SelectionDAG builder phase instead of emitting an undef DBG_VALUE. The former is an easy fix, and the latter might be too, but the fact that these issues are so shallow has me worried that dbg.addr isn't ready to be a drop-in replacement for dbg.value+deref without further investigation and work. With this in mind I plan to work around the dbg.value+deref problem for now, treating dbg.value+derefs as special. |
I've had a go at implementing this, patch stack starts at D91423. |
For future readers, |
@llvm/issue-subscribers-debuginfo |
Extended Description
Version:
clang @ 234c47a with D85555 and D89810 applied (both currently approved).
The problem:
InstCombine's LowerDbgDeclare effectively means that we follow all the source assignment values for any given variable throughout a function. That is no longer true after inlining if a caller-local variable is written to by the callee because we don't track the callee's uses of the variable in the same way, which is undesirable. The purpose of this ticket is to document this and to propose a solution (mentioned here [1]).
Prior to D85555 and D89810 the root problem still existed but it manifested in a different way.
An example manifestation of the problem:
$ cat -n test.c
1 int g;
2 attribute((always_inline))
3 static void use(int* p) {
4 g = *p;
5 *p = 0xff;
6 }
7 attribute((noinline))
8 void fun(int param) {
9 use(¶m);
10 }
11 int main() {
12 fun(5);
13 }
Compiling with -O2 -g, the combination of LowerDbgDeclare and inlining cause us to show the value 5 instead of 255 for 'param' after, and while, stepping through inlined function 'use' in 'fun'.
Early in the pipeline, 'param' in 'fun' lives on the stack because it is escaped by 'use'. InstCombine runs LowerDbgDeclare, inserting a dbg.value(%stored_value) before the store to the alloca, and a dbg.value(%alloca_addr, [DW_OP_deref]) before the 'use' call. The inliner then pulls the body of 'use' into 'fun'. mem2reg notices that the 0xff store is redundant and that the remaining store/load pair can be promoted, which means that it can remove param's alloca. With D89810, we remove the dbg.value(%alloca_addr, [DW_OP_deref]), which would've otherwise become undef. Now the location dbg.value(%store_value) will cover the inlined scope and the rest of the function, including after the DSE'd 0xff assignment, which is misleading.
Proposed solution:
Teach the inliner to insert dbg.values in the same pattern as LowerDbgDeclare [2] for uses of caller-local allocas. For the example above, we would have the inliner insert a dbg.value(%stored_value_0xff, "param", DIExpression()) before the inlined store *p = 0xff. After the store is deleted we'd end up with a salvagedOrUndef dbg.value at the store site.
[1] https://reviews.llvm.org/D89810#2343831
[2] LowerDbgDeclare
llvm-project/llvm/lib/Transforms/Utils/Local.cpp
Line 1499 in e2858bf
The text was updated successfully, but these errors were encountered: