Single-stepping through clang-cl code randomly drops into assembly language #42875

Closed
randomascii mannequin opened this issue Oct 1, 2019 · 9 comments
Labels
bugzilla Issues migrated from bugzilla clang Clang issues not falling into any other category

Comments

@randomascii
Mannequin

randomascii mannequin commented Oct 1, 2019

Bugzilla Link 43530
Resolution FIXED
Resolved on Jan 03, 2020 16:29
Version unspecified
OS Windows NT
CC @amykhuang,@dwblaikie,@jmorse,@zygoloid,@rnk

Extended Description

While debugging either compiler_proxy.exe or chrome.exe I have noticed that sometimes single-stepping through plain C++ code will suddenly drop me into the disassembly window. The exact details depend on user settings and whether you are debugging a locally built binary or one from the build machines. Some people have auto-switch to disassembly disabled so they just see a warning page saying:

cpu.cc not found

You need to find cpu.cc to view the source for the current call stack frame

Viewing details gives me this:

Locating source for 'c:\src\chromium3\src\base\cpu.cc'. Checksum: MD5 {f2 8b 7a f0 cb f 9d f3 29 47 b3 b6 9c c0 2b 33}
The file 'c:\src\chromium3\src\base\cpu.cc' exists.
Determining whether the checksum matches for the following locations:
1: c:\src\chromium3\src\base\cpu.cc Checksum: MD5 {f2 8b 7a f0 cb f 9d f3 29 47 b3 b6 9c c0 2b 33} Checksum matches.
The debugger found source in the following locations:
1: c:\src\chromium3\src\base\cpu.cc Checksum: MD5 {f2 8b 7a f0 cb f 9d f3 29 47 b3 b6 9c c0 2b 33}
The debugger will use the source at location 1.

There is a Browse and find cpu.cc link but it doesn't work.

It is odd that the debugger says that the source matches and that it will use it, but then doesn't. It might be a debugger bug, but it seems suspicious that the debugger happily steps through code and then switches to assembly language. It feels like there is a discontinuity in the source mappings. I have not seen this behavior when debugging VC++-generated code.

If I use Ctrl+O to load the specified file it loads but doesn't show any "execution is here" cursor. If I use Ctrl+F11 to switch to assembly mode it asks me to disambiguate and then takes me here (the cmp instruction):

static_assert(kParameterSize * sizeof(cpu_info) + 1 == base::size(cpu_string),
"cpu_string has wrong size");

if (max_parameter >= kParameterEnd) {
001A3C49 3D 04 00 00 80 cmp eax,80000004h
001A3C4E 7C 68 jl base::CPU::Initialize+238h (01A3CB8h)
// Copyright (c) 2012 The Chromium Authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.

Again, no cursor showing that execution is there. Right-clicking and selecting Show Next Statement takes me here - on the first mov instruction:

return PENTIUM;
}

} // namespace base

001A3C50 89 C7 mov edi,eax
size_t i = 0;
for (int parameter = kParameterStart; parameter <= kParameterEnd;
++parameter) {
__cpuid(cpu_info, parameter);
001A3C52 B8 02 00 00 80 mov eax,80000002h

If I single-step to the next instruction then Ctrl+F11 works and I can go back to source code with an execution cursor and all is well.

So, it looks like there is a source-to-assembly discontinuity on the instruction at address 001A3C50.

The easiest way to reproduce this is to set a breakpoint on line 255 of base\cpu.cc then single step (F10). That's it.

I reproduced this at Chromium hash 8a9aaaa8c338d22a97442e87518f23a661bef002. I don't know how stable this exact repro is. Here are the gn args I used:

is_component_build = false
is_debug = false
target_cpu = "x86"
enable_nacl = false
dcheck_always_on = true
use_goma = true
blink_symbol_level = 1

@rnk
Collaborator

rnk commented Oct 9, 2019

I was able to reproduce this with base_unittests and will dig into it a bit.

@rnk
Collaborator

rnk commented Oct 9, 2019

It looks like line zero upsets Visual Studio, which is unfortunate, since the DWARF folks have been tirelessly working to use line zero in more places when we don't know the current source location. This is the compiler-generated assembly for that bit of code:

    .cv_loc 20 1 255 0              # ../../base/cpu.cc:255:0
    cmp     eax, -2147483644
    jl      .LBB2_20
.Ltmp87:
# %bb.18:
    .cv_loc 20 1 0 0                # ../../base/cpu.cc:0:0
    mov     edi, eax
.Ltmp88:
    .cv_loc 20 1 259 0              # ../../base/cpu.cc:259:0
    mov     eax, -2147483646
    xor     ecx, ecx
    #APP
    cpuid
    #NO_APP

I was able to make VS do the same thing by artificially setting the line number to zero like this:

#include <stdio.h>
int main() {
  printf("%d\n", __LINE__);
#line 0
  printf("%d\n", __LINE__);
#line 7
  printf("%d\n", __LINE__);
  printf("%d\n", __LINE__);
}

I tried compiling that sample with MSVC, but it elides the "line zero" locations. The line table with VS looks like this:

C:\src\llvm-project\build\t.cpp (MD5: 951FC80309A6E680E37501997E65A1DF)
0001:000003A4-000003F0, line/addr entries = 5
2 000003A4 3 000003A8 7 000003C7 8 000003D8 9 000003E9

So, prologue, first printf, then third printf, then fourth, then epilogue.
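To illustrate how a table like the one above gets read, here is a minimal sketch (an assumption about the general technique, not Visual Studio's actual lookup code) that maps a code offset to the line of the nearest preceding entry. `AddrLine` and `lineForOffset` are hypothetical names; the offsets and line numbers are the ones printed above.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <vector>

// Hypothetical line-table entry: code offset and the source line that starts there.
struct AddrLine {
    std::uint32_t offset;
    std::uint32_t line;
};

// Returns the line of the last entry whose offset is <= the queried offset.
// Assumes `table` is sorted by offset. Returns 0 if the offset precedes
// every entry (an illustrative convention, not a CodeView rule).
std::uint32_t lineForOffset(const std::vector<AddrLine>& table,
                            std::uint32_t offset) {
    auto it = std::upper_bound(
        table.begin(), table.end(), offset,
        [](std::uint32_t off, const AddrLine& e) { return off < e.offset; });
    if (it == table.begin()) {
        return 0;
    }
    return std::prev(it)->line;
}
```

With the five entries above, any offset from 0x3A8 up to (but not including) 0x3C7 resolves to line 3, which is how the second printf ends up attributed to the first printf's line once MSVC elides the line-zero location.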

@dwblaikie
Collaborator

It looks like line zero upsets Visual Studio, which is unfortunate, since
the DWARF folks have been tirelessly working to use line zero in more places
when we don't know the current source location.

I don't think that's necessarily a problem or in tension. If CodeView has no way to represent ambiguous/unlocated instructions, then line zero can be ignored and the previous instruction's location can be propagated, as is already the behavior for some instructions that come from no particular location. (I think the frontend creates some instructions with no location: not location zero, but no location at all. So this is a supported flow, not a bug that's going to be fixed.)
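The propagation idea can be sketched roughly as follows. This is a simplified illustration under assumed types, not the code that went into LLVM: `LineEntry` and `dropLineZero` are hypothetical names, and the offsets in the usage note are taken from the disassembly in the original report.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical line-table entry; line == 0 means "no known source location".
struct LineEntry {
    std::uint32_t offset;
    std::uint32_t line;
};

// Drops every line-zero entry. Because a consumer attributes each instruction
// to the nearest preceding entry, removing an entry makes the previous
// location extend over those instructions.
std::vector<LineEntry> dropLineZero(const std::vector<LineEntry>& in) {
    std::vector<LineEntry> out;
    for (const LineEntry& e : in) {
        if (e.line == 0) {
            continue;  // skip: the previous entry now covers this offset
        }
        out.push_back(e);
    }
    return out;
}
```

For the entries (0x3C49, line 255), (0x3C50, line 0), (0x3C52, line 259), the result keeps only the first and last, so the `mov edi,eax` at 0x3C50 would be shown as part of line 255.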

@rnk
Collaborator

rnk commented Oct 10, 2019

I committed r374267 which avoids putting line zero in cv line tables.

I don't think that's necessarily a problem/in tension. If CodeView has no
way to represent ambiguous/unlocated instructions - then the line zero can
be ignored & the previous instructions location can be propagated as is the
behavior for some instructions that come from no particular location (I
think the frontend creates some instructions with no location (not location
zero, but no location at all) - so this is a supported flow/not a bug that's
going to be fixed/etc)

Well, the more location-less instructions we have, the more important it becomes to implement some kind of data flow analysis to backfill source locations, when it might have been easier to keep them around during optimization. This is what I wrote in the commit message of r374267:

The fix is incomplete, because it's possible to have a basic block with
no source locations at all. In this case, we don't emit a .cv_loc, but
that will result in wrong stepping behavior in the debugger if the
layout predecessor of the location-less BB has an unrelated source
location. We could try harder to find a valid location that dominates or
post-dominates the current BB, but in general it's a dataflow problem,
and one still might not exist. I left a FIXME about this.

I think what I described there is a problem for DWARF as well. This is a sketch of what the situation would look like in assembly:

bbA:
    .loc 1
    jmp shared
bbB:
    .loc 2
    jmp shared
bbC:
    .loc 3
    ret
bbD:
    .loc 4
    ret
shared:
    # no loc
    cmp
    jcc bbC
    jmp bbD

In this case, the location-less block 'shared' would pick up the location from its layout predecessor, bbD, which depends on what block placement decides to do. Layout tries to achieve as much fallthrough as possible, so I guess in practice most line tables end up being smooth enough that we don't notice. Anyway, there's a FIXME in the code for that case, but I think in practice there are very few location-less BBs.
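A small model of that effect, as a sketch only: `Block` and `inheritedLocs` are hypothetical names, and `loc == 0` is used here to mean "no location emitted", which is an assumption of the example rather than anything in LLVM.

```cpp
#include <string>
#include <vector>

// Hypothetical basic block in final layout order; loc == 0 means the block
// emits no location directive of its own.
struct Block {
    std::string name;
    int loc;
};

// Computes the location a line-table consumer would attribute to each block
// when a location-less block simply continues the previous block's location.
std::vector<int> inheritedLocs(const std::vector<Block>& layout) {
    std::vector<int> result;
    int current = 0;  // nothing seen yet
    for (const Block& b : layout) {
        if (b.loc != 0) {
            current = b.loc;
        }
        result.push_back(current);
    }
    return result;
}
```

With the layout bbA, bbB, bbC, bbD, shared, the 'shared' block is attributed to location 4. If block placement instead put 'shared' right after bbA, it would be attributed to location 1, even though control can also reach it from bbB.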

@dwblaikie
Collaborator

I committed r374267 which avoids putting line zero in cv line tables.

I don't think that's necessarily a problem/in tension. If CodeView has no
way to represent ambiguous/unlocated instructions - then the line zero can
be ignored & the previous instructions location can be propagated as is the
behavior for some instructions that come from no particular location (I
think the frontend creates some instructions with no location (not location
zero, but no location at all) - so this is a supported flow/not a bug that's
going to be fixed/etc)

Well, the more location-less instructions we have, the more important it
becomes to implement some kind of data flow analysis to backfill source
locations, when it might have been easier to keep them around during
optimization.

Yep - agreed on all counts.

@randomascii
Mannequin Author

randomascii mannequin commented Oct 10, 2019

Is the VS behavior a code-view limitation or a UI limitation?

Either way it seems like there might be a need for a VS bug. The VS message on this was extremely confusing, presumably because it had found the right file but fell down on the line number and didn't know how to express that.

I'm not sure if we should file a bug saying that their error message is rubbish, or that they should support #line 0 data, or if this is just pointless since clang-cl is fixed.

Thanks for the fix. This was quite painful when stepping through goma code.

@rnk
Collaborator

rnk commented Jan 4, 2020

*** Bug llvm/llvm-bugzilla-archive#44300 has been marked as a duplicate of this bug. ***

@rnk
Collaborator

rnk commented Nov 27, 2021

mentioned in issue llvm/llvm-bugzilla-archive#44300

@rnk
Collaborator

rnk commented Nov 27, 2021

mentioned in issue llvm/llvm-bugzilla-archive#44522

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 10, 2021
This issue was closed.