47946 – Improve debug info accuracy for locals after inlining escaping function

LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filled before November 26, 2021. Use github to submit LLVM bugs

Bug 47946 - Improve debug info accuracy for locals after inlining escaping function

Summary: Improve debug info accuracy for locals after inlining escaping function

Status:	NEW

Alias:	None

Product:	libraries
Classification:	Unclassified
Component:	Transformation Utilities (show other bugs)
Version:	trunk
Hardware:	PC Windows NT

Importance:	P enhancement
Assignee:	Orlando Cazalet-Hyams

URL:
Keywords:

Depends on:
Blocks:

Reported:	2020-10-22 08:30 PDT by Orlando Cazalet-Hyams
Modified:	2020-11-13 07:22 PST (History)
CC List:	5 users (show)

See Also:
Fixed By Commit(s):

Attachments
Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Orlando Cazalet-Hyams 2020-10-22 08:30:47 PDT

Version:
clang @ 234c47ae2a8e with D85555 and D89810 applied (both currently approved).

The problem:
InstCombine's LowerDbgDeclare effectively means that we follow all the source assignment values for any given variable throughout a function. That is no longer true after inlining if a caller-local variable is written to by the callee because we don't track the callee's uses of the variable in the same way, which is undesirable. The purpose of this ticket is to document this and to propose a solution (mentioned here [1]).

Prior to D85555 and D89810 the root problem still existed but it manifested in a different way.

An example manifestation of the problem:
$ cat -n test.c
     1	int g;
     2	__attribute__((__always_inline__))
     3	static void use(int* p) {
     4	  g = *p;
     5	  *p = 0xff;
     6	}
     7	__attribute__((__noinline__))
     8	void fun(int param) {
     9	  use(&param);
    10	}
    11	int main() {
    12	  fun(5);
    13	}

Compiling with -O2 -g, the combination of LowerDbgDeclare and inlining cause us to show the value 5 instead of 255 for 'param' after, and while, stepping through inlined function 'use' in 'fun'.

Early in the pipeline, 'param' in 'fun' lives on the stack because it is escaped by 'use'. InstCombine runs LowerDbgDeclare, inserting a dbg.value(%stored_value) before the store to the alloca, and a dbg.value(%alloca_addr, [DW_OP_deref]) before the 'use' call. The inliner then pulls the body of 'use' into 'fun'. mem2reg notices that the 0xff store is redundant and that the remaining store/load pair can be promoted, which means that it can remove param's alloca. With D89810, we remove the dbg.value(%alloca_addr, [DW_OP_deref]), which would've otherwise become undef. Now the location dbg.value(%store_value) will cover the inlined scope and the rest of the function, including after the DSE'd 0xff assignment, which is misleading.

Proposed solution:
Teach the inliner to insert dbg.values in the same pattern as LowerDbgDeclare [2] for uses of caller-local allocas. For the example above, we would have the inliner insert a dbg.value(%stored_value_0xff, "param", DIExpression()) before the inlined store *p = 0xff. After the store is deleted we'd end up with a salvagedOrUndef dbg.value at the store site.

[1] https://reviews.llvm.org/D89810#2343831
[2] LowerDbgDeclare https://github.com/llvm/llvm-project/blob/e2858bf633b548cbf4e31b6b10852fccde940270/llvm/lib/Transforms/Utils/Local.cpp#L1499

Comment 1 Vedant Kumar 2020-10-22 10:47:49 PDT

+ 1. I'm not aware of an alternative solution that's compatible with our use of LowerDbgDeclare to track stores. Note that the usual salvageDI() fallback isn't applicable because the solution needs to be inlining-aware.

There are alternative proposals for how to track stores, but I'm not sure how far along these are:
- -gdbg-value from llvm.org/PR34136
- https://gist.github.com/Keno/480b8057df1b7c63c321

Comment 2 Orlando Cazalet-Hyams 2020-10-22 11:23:56 PDT

(In reply to Vedant Kumar from comment #1)
> + 1. I'm not aware of an alternative solution that's compatible with our use
> of LowerDbgDeclare to track stores. Note that the usual salvageDI() fallback
> isn't applicable because the solution needs to be inlining-aware.
> 
> There are alternative proposals for how to track stores, but I'm not sure
> how far along these are:
> - -gdbg-value from llvm.org/PR34136
> - https://gist.github.com/Keno/480b8057df1b7c63c321
I agree that this is (hopefully) more of an "in the meantime" fix while we work out an alternative design.

In case it helps anyone I've written up a walkthrough of the example with IR here: https://gist.github.com/OCHyams/29ec463e82b70955c9fc5542f0b69cd7

Comment 3 Orlando Cazalet-Hyams 2020-10-28 04:04:45 PDT

I started prototyping this and ran into a problem. One thing we must consider is that we need to find the variables associated with each of the alloca addresses passed to the callee. Currently I rely on the assumption that we'll have a dbg.value+deref before the callsite for these variables, thanks to LowerDbgDeclare, and do this:

  1. for inst backwards from (callsite - 1) to (block start):
    2. if isnt is not a dbg.value, exit loop.
    3. else if inst is a dbg.value+deref of an alloca which is an arg:
      4. save the { alloca, variable } pair in a map for later.

However, it turns out this assumption doesn't always hold. If we have the following pattern after LowerDbgDeclare converts a dbg.declare to dbg.values:

  dbg.value(%alloca, "x", [DW_OP_deref])
  call one(%alloca)
  dbg.value(%alloca, "x", [DW_OP_deref])
  call two(%alloca)

RemoveRedundantDbgInstrs will remove the second dbg.value. See comment from BasicBlockUtils.cpp below.

  /// ForwardScan strategy:
  /// ---------------------
  /// Given two identical dbg.value instructions, separated by a block of
  /// instructions that isn't describing the same variable, like this
  ///
  ///   dbg.value X1, "x", FragmentX1  (**)
  ///   <block of instructions, none being "dbg.value ..., "x", ...">
  ///   dbg.value X1, "x", FragmentX1  (*)

And we're left with the following:

  dbg.value(%alloca, "x", [DW_OP_deref])
  call one(%alloca)
  call two(%alloca)

If we then use the the prototype method when inlining 'two' we won't find any variable associated with %alloca. So, I think we will have to either:
  A) treat dbg.value+deref as special and prevent their removal, or
  B) be more intelligent about how we find the DILocalVariables associated with each argument value.

Option A doesn't feel very nice. Adding in more hard-to-enforce debug info rules doesn't sound good. Additionally, leaving a dbg.value+deref before the call to 'two' may cause misleading debug info in some cases. However, option B may be difficult. For example, scanning back further in the block isn't enough when inlining 'two' if 'one' was inlined first and has multiple blocks.

Does anyone have any ideas or suggestions?

Comment 4 Jeremy Morse 2020-10-28 04:19:31 PDT

Orlando wrote:
>   A) treat dbg.value+deref as special and prevent their removal, or

IMO this is the correct path -- we usually treat dbg.values as an assignment of a value that is then constant until changed by another assignment, but the presence of DW_OP_deref invalidates that treatment. It really does need a different set of behaviours.

I've been plugging this elsewhere: the correct intrinsic for this IMO is dbg.addr, the implementation for which is just "dbg.value with a deref". The fact that it's a different intrinsic signifies it needs to be treated differently, because it refers to a value stored in an alloca that can change. dbg.addr is also only supposed to associate with allocas, and we wouldn't have to dig through DIExpressions to identify intrinsics that can refer to memory.

Comment 5 Orlando Cazalet-Hyams 2020-11-02 09:40:08 PST

(In reply to Jeremy Morse from comment #4)
> Orlando wrote:
> >   A) treat dbg.value+deref as special and prevent their removal, or
> 
> IMO this is the correct path -- we usually treat dbg.values as an assignment
> of a value that is then constant until changed by another assignment, but
> the presence of DW_OP_deref invalidates that treatment. It really does need
> a different set of behaviours.
> 
> I've been plugging this elsewhere: the correct intrinsic for this IMO is
> dbg.addr, the implementation for which is just "dbg.value with a deref". The
> fact that it's a different intrinsic signifies it needs to be treated
> differently, because it refers to a value stored in an alloca that can
> change. dbg.addr is also only supposed to associate with allocas, and we
> wouldn't have to dig through DIExpressions to identify intrinsics that can
> refer to memory.

SGTM. However, after some prototyping it seems like dbg.addr is not working properly, or at least isn't currently suitable for this task. For example, I found that EarlyCSE drops all dbg.addrs, and we just drop undef dbg.addrs in the SelectionDAG builder phase instead of emitting an undef DBG_VALUE. The former is an easy fix, and the latter might be too, but the fact that these issues are so shallow has me worried that dbg.addr isn't ready to be a drop-in replacement for dbg.value+deref without further investigation and work.

With this in mind I plan to work around the dbg.value+deref problem for now, treating dbg.value+derefs as special.

Comment 6 Orlando Cazalet-Hyams 2020-11-13 07:22:28 PST

I've had a go at implementing this, patch stack starts at D91423.

https://reviews.llvm.org/D91423