[DebugInfo@O2] Filtering out-of-scope variables drops many legitimate locations #47435
Comments
Apologies for the naive interjection here, but do you have an example of the changes in the final DWARF? A small example would be great.
Reproducer
Here's the assembly produced: (Again, pretty meaningless). Here's the DWARF in LLVM as it stands today, for the lexical block with the variable:
And if you disable this block of code, from line 1650 to 1665 in
DW_TAG_lexical_block
The variable location covers addresses 14->16 (the load / mov (%rax)), where it didn't before. The underlying reason for this is that we don't track that variable location across the instructions at 0x10 and 0x12. (The location can be found from 0x16 onwards due to some tail-duplication shenanigans).

I tried and failed to make a C reproducer to nicely demonstrate this; I also haven't examined the un-reduced IR to see whether it'd actually be observable / a problem for a developer. Probably the best test for whether this is a real problem would be finding the largest change in coverage caused by the scope filter and seeing if it's something a developer would be annoyed by. (I probably won't get to look at this for a while).
(... which corresponds to the "loopexit" block in the LLVM-IR, for which all the instructions are outside the variable's scope...)
NB: some of this is fixed by https://reviews.llvm.org/D117877, although not in the general case.
Extended Description
The VarLoc based LiveDebugValues implementation has a filter in the 'join' method, to cease propagating variable locations that have gone out of scope. I believe the aim of this is to prevent LiveDebugValues needlessly computing the location of every variable for every instruction, even those that are only in scope for a few instructions.
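A toy model may make the filter's behaviour easier to picture. The sketch below is not the LiveDebugValues code: the function name `live_in_sets`, the per-block scope table, and the single pass over an acyclic CFG are all assumptions made for illustration. It assumes the join is an intersection of the predecessors' out-sets, after which variables whose scope does not cover the block are optionally discarded.

```python
def live_in_sets(order, preds, defs, scope_blocks, use_scope_filter):
    """Return the set of variables with a known location on entry to each block.

    order            -- blocks in reverse post-order (CFG assumed acyclic)
    preds            -- block -> list of predecessor blocks
    defs             -- block -> variables that gain a location in that block
    scope_blocks     -- variable -> blocks containing instructions in its scope
    use_scope_filter -- drop out-of-scope variables during the join if True
    """
    live_in, live_out = {}, {}
    for bb in order:
        pred_outs = [live_out[p] for p in preds[bb]]
        # Join: a location survives only if every predecessor agrees on it.
        joined = set.intersection(*pred_outs) if pred_outs else set()
        if use_scope_filter:
            # Hypothetical stand-in for the out-of-scope filter.
            joined = {v for v in joined if bb in scope_blocks[v]}
        live_in[bb] = joined
        live_out[bb] = joined | defs.get(bb, set())
    return live_in


# A diamond: bb.0 branches to bb.1 or straight to bb.2; bb.1 falls into bb.2.
order = ["bb.0", "bb.1", "bb.2"]
preds = {"bb.0": [], "bb.1": ["bb.0"], "bb.2": ["bb.0", "bb.1"]}
defs = {"bb.0": {"x"}}                  # x gets its location in bb.0
scope_blocks = {"x": {"bb.0", "bb.2"}}  # no instruction in bb.1 is in x's scope

with_filter = live_in_sets(order, preds, defs, scope_blocks, True)
without_filter = live_in_sets(order, preds, defs, scope_blocks, False)
```

In this model, enabling the filter removes `x` from the live-in set of bb.2 even though bb.2 is inside its scope, because the location was discarded on the path through bb.1. With the filter disabled, `x` survives into bb.2.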
Clearly this is effective at moderating compile times; however, it also seems to damage location coverage. Specifically: if you have, say, an X86 CMOV64 instruction:
CMP64rr $rbx, $rcx, implicit-def $eflags
$rbx = CMOV64rr $rbx (tied-def 0), $rcx, 7, implicit $eflags, debug-location !1
Then occasionally the X86 backend will implement this with control flow:
CMP64rr $rbx, $rcx, implicit-def $eflags
JCC_1 %bb.2, 7, debug-location !1
bb.1:
$rbx = COPY $rcx, debug-location !1
bb.2:
...
Unfortunately, this effectively becomes a notch filter for variables that are in scope for DILocation !1. Any variables that aren't in scope in bb.1 will be dropped there, which in turn causes the same variables to be dropped from bb.2 onwards. That means any LLVM-IR select instruction can potentially lead to neighbouring variable locations being dropped, depending on their scopes.

In terms of impact, here's llvm-locstats for a clang-3.4 build with and without the out-of-scope filter. With filter:
=================================================
cov%           samples    percentage(~)
0%              765406    22%
(0%,10%)         45179     1%
[10%,20%)        51699     1%
[20%,30%)        52044     1%
[30%,40%)        46905     1%
[40%,50%)        48292     1%
[50%,60%)        61342     1%
[60%,70%)        58315     1%
[70%,80%)        69848     2%
[80%,90%)        81937     2%
[90%,100%)      101384     2%
100%           2032034    59%
-the number of debug variables processed: 3414385
-PC ranges covered: 61%
-total availability: 64%
Without:
=================================================
cov%           samples    percentage(~)
0%              765357    22%
(0%,10%)         45168     1%
[10%,20%)        51584     1%
[20%,30%)        51911     1%
[30%,40%)        46569     1%
[40%,50%)        47978     1%
[50%,60%)        60356     1%
[60%,70%)        53071     1%
[70%,80%)        56018     1%
[80%,90%)        79131     2%
[90%,100%)      103664     3%
100%           2059075    60%
-the number of debug variables processed: 3419882
-PC ranges covered: 61%
-total availability: 65%
Importantly, there are an additional ~27k variables in the 100% range bucket, mostly moved up from the [60%,90%) buckets (the [90%,100%) bucket also grows slightly). Presumably these are long-lived variables that have their lifetimes artificially interrupted by the scope filter. A similar amount of improvement occurs for InstrRefBasedLDV.
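For anyone wanting to check the bucket movement, the per-bucket differences can be computed directly from the two tables above. This is only arithmetic on the figures already quoted; the variable names are made up for the sketch.

```python
buckets = [
    "0%", "(0%,10%)", "[10%,20%)", "[20%,30%)", "[30%,40%)", "[40%,50%)",
    "[50%,60%)", "[60%,70%)", "[70%,80%)", "[80%,90%)", "[90%,100%)", "100%",
]
# Sample counts copied from the llvm-locstats output quoted above.
with_filter = [765406, 45179, 51699, 52044, 46905, 48292,
               61342, 58315, 69848, 81937, 101384, 2032034]
without_filter = [765357, 45168, 51584, 51911, 46569, 47978,
                  60356, 53071, 56018, 79131, 103664, 2059075]

# Positive delta: the bucket gains variables once the filter is disabled.
delta = {b: wo - w for b, w, wo in zip(buckets, with_filter, without_filter)}

for bucket, change in delta.items():
    print(f"{bucket:<12}{change:+d}")
```

The 100% bucket gains 27041 variables, while [60%,70%), [70%,80%) and [80%,90%) together lose 21880. The remaining difference is accounted for by smaller losses in the lower buckets and by the 5497 extra variables processed when the filter is off.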
IMO, this is eminently solvable in InstrRefBasedLDV -- as it deals with things in terms of transfer functions, we should be able to invent transfer functions between isolated segments of particular scopes using dominance information.