While debugging either compiler_proxy.exe or chrome.exe I have noticed that sometimes single-stepping through plain C++ code will suddenly drop me into the disassembly window. The exact details depend on user settings and whether you are debugging a locally built binary or one from the build machines. Some people have auto-switch to disassembly disabled so they just see a warning page saying:

    cpu.cc not found
    You need to find cpu.cc to view the source for the current call stack frame

Viewing details gives me this:

    Locating source for 'c:\src\chromium3\src\base\cpu.cc'. Checksum: MD5 {f2 8b 7a f0 cb f 9d f3 29 47 b3 b6 9c c0 2b 33}
    The file 'c:\src\chromium3\src\base\cpu.cc' exists.
    Determining whether the checksum matches for the following locations:
    1: c:\src\chromium3\src\base\cpu.cc Checksum: MD5 {f2 8b 7a f0 cb f 9d f3 29 47 b3 b6 9c c0 2b 33} Checksum matches.
    The debugger found source in the following locations:
    1: c:\src\chromium3\src\base\cpu.cc Checksum: MD5 {f2 8b 7a f0 cb f 9d f3 29 47 b3 b6 9c c0 2b 33}
    The debugger will use the source at location 1.

There is a "Browse and find cpu.cc" link but it doesn't work.

It is odd that the debugger says that the source matches and that it will use it, but then it doesn't. It *might* be a debugger bug, but it seems suspicious that the debugger happily steps through code and then switches to assembly language. It feels like there is a discontinuity in the source mappings. When debugging VC++-generated code I have not seen this behavior.

If I use Ctrl+O to load the specified file it loads but doesn't show any "execution is here" cursor.

If I use Ctrl+F11 to switch to assembly mode it asks me to disambiguate and then takes me here (the cmp instruction):

      static_assert(kParameterSize * sizeof(cpu_info) + 1 == base::size(cpu_string),
                    "cpu_string has wrong size");

      if (max_parameter >= kParameterEnd) {
    001A3C49 3D 04 00 00 80       cmp         eax,80000004h
    001A3C4E 7C 68                jl          base::CPU::Initialize+238h (01A3CB8h)
    // Copyright (c) 2012 The Chromium Authors. All rights reserved.
    // Use of this source code is governed by a BSD-style license that can be
    // found in the LICENSE file.

Again, no cursor showing that execution is there. Right-clicking and selecting Show Next Statement takes me here, on the first mov instruction:

      return PENTIUM;
    }

    }  // namespace base
    001A3C50 89 C7                mov         edi,eax
        size_t i = 0;
        for (int parameter = kParameterStart; parameter <= kParameterEnd;
             ++parameter) {
          __cpuid(cpu_info, parameter);
    001A3C52 B8 02 00 00 80       mov         eax,80000002h

If I single-step to the next instruction then Ctrl+F11 works and I can go back to source code with an execution cursor and all is well. So, it looks like there is a source-to-assembly discontinuity on the instruction at address 001A3C50.

The easiest way to reproduce this is to set a breakpoint on line 255 of base\cpu.cc then single step (F10). That's it. I reproduced this at Chromium hash 8a9aaaa8c338d22a97442e87518f23a661bef002. I don't know how stable this exact repro is. Here are the gn args I used:

    is_component_build = false
    is_debug = false
    target_cpu = "x86"
    enable_nacl = false
    dcheck_always_on = true
    use_goma = true
    blink_symbol_level = 1
I was able to reproduce this with base_unittests and will dig into it a bit.
It looks like line zero upsets Visual Studio, which is unfortunate, since the DWARF folks have been tirelessly working to use line zero in more places when we don't know the current source location.

This is the compiler-generated assembly for that bit of code:

            .cv_loc 20 1 255 0              # ../../base/cpu.cc:255:0
            cmp     eax, -2147483644
            jl      .LBB2_20
    .Ltmp87:
    # %bb.18:
            .cv_loc 20 1 0 0                # ../../base/cpu.cc:0:0
            mov     edi, eax
    .Ltmp88:
            .cv_loc 20 1 259 0              # ../../base/cpu.cc:259:0
            mov     eax, -2147483646
            xor     ecx, ecx
            #APP
            cpuid
            #NO_APP

I was able to make VS do the same thing by artificially setting the line number to zero like this:

    #include <stdio.h>
    int main() {
      printf("%d\n", __LINE__);
    #line 0
      printf("%d\n", __LINE__);
    #line 7
      printf("%d\n", __LINE__);
      printf("%d\n", __LINE__);
    }

I tried compiling that sample with MSVC, but it elides the "line zero" locations. The line table with VS looks like this:

    C:\src\llvm-project\build\t.cpp (MD5: 951FC80309A6E680E37501997E65A1DF)
    0001:000003A4-000003F0, line/addr entries = 5
        2 000003A4     3 000003A8     7 000003C7     8 000003D8     9 000003E9

So, prologue, first printf, then third printf, then fourth, then epilogue.
(In reply to Reid Kleckner from comment #2)
> It looks like line zero upsets Visual Studio, which is unfortunate, since
> the DWARF folks have been tirelessly working to use line zero in more places
> when we don't know the current source location.

I don't think that's necessarily a problem/in tension. If CodeView has no way to represent ambiguous/unlocated instructions, then the line zero can be ignored and the previous instruction's location can be propagated, as is the behavior for some instructions that come from no particular location. (I think the frontend creates some instructions with no location (not location zero, but no location at all), so this is a supported flow, not a bug that's going to be fixed.)
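For illustration only, the propagation described in this comment can be sketched on a simplified line table. The `LineEntry` type and `dropLineZero` function are hypothetical names made up for this sketch; it is not the code in LLVM's CodeView emitter, and a real line table carries more than an offset and a line.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified line-table entry: a code offset and a source line.
// A line of 0 models "no known source location".
struct LineEntry {
  uint32_t offset;
  uint32_t line;
};

// Drop every line-zero entry. Because each surviving entry covers the code up
// to the next entry, removing a line-zero entry makes the previous entry's
// location extend over the unlocated instructions.
std::vector<LineEntry> dropLineZero(const std::vector<LineEntry>& table) {
  std::vector<LineEntry> result;
  for (const LineEntry& entry : table) {
    if (entry.line == 0)
      continue;  // previous location stays in effect
    result.push_back(entry);
  }
  return result;
}
```

Applied to three entries modeled on the reported code (line 255, then line 0, then line 259), this keeps only the 255 and 259 entries, so the `mov edi, eax` instruction would be attributed to line 255 instead of line 0.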
I committed r374267 which avoids putting line zero in cv line tables.

(In reply to David Blaikie from comment #3)
> I don't think that's necessarily a problem/in tension. If CodeView has no
> way to represent ambiguous/unlocated instructions - then the line zero can
> be ignored & the previous instructions location can be propagated as is the
> behavior for some instructions that come from no particular location (I
> think the frontend creates some instructions with no location (not location
> zero, but no location at all) - so this is a supported flow/not a bug that's
> going to be fixed/etc)

Well, the more location-less instructions we have, the more important it becomes to implement some kind of data flow analysis to backfill source locations, when it might have been easier to keep them around during optimization.

This is what I wrote in the commit message of r374267:

    The fix is incomplete, because it's possible to have a basic block with
    no source locations at all. In this case, we don't emit a .cv_loc, but
    that will result in wrong stepping behavior in the debugger if the
    layout predecessor of the location-less BB has an unrelated source
    location. We could try harder to find a valid location that dominates
    or post-dominates the current BB, but in general it's a dataflow
    problem, and one still might not exist. I left a FIXME about this.

I think what I described there is a problem for DWARF as well. This is a sketch of what the situation would look like in assembly:

    bbA:
      .loc 1
      jmp shared
    bbB:
      .loc 2
      jmp shared
    bbC:
      .loc 3
      ret
    bbD:
      .loc 4
      ret
    shared:
      # no loc
      cmp
      jcc bbC
      jmp bbD

In this case, the location-less block 'shared' would pick up the location from its layout predecessor, bbD, and which block that is depends on what block placement decides to do. Layout tries to achieve as much fallthrough as possible, so I guess in practice most line tables end up being smooth enough that we don't notice.

Anyway, there's a FIXME in the code for that case, but I think in practice there are very few location-less BBs.
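To make the layout dependence concrete, here is a small model of the situation sketched above. The `Block` type and `effectiveLine` function are hypothetical and exist only for this illustration. The model assumes that a block with no location of its own is attributed to whatever location was most recently emitted before it in layout order.

```cpp
#include <string>
#include <vector>

// Hypothetical model of a basic block in final layout order.
// A line of 0 models a block that emits no .loc / .cv_loc at all.
struct Block {
  std::string name;
  unsigned line;
};

// Return the line a debugger would attribute to the block called `name`:
// its own line if it has one, otherwise the most recent line emitted by an
// earlier block in the layout. Returns 0 if no location precedes it.
unsigned effectiveLine(const std::vector<Block>& layout,
                       const std::string& name) {
  unsigned current = 0;
  for (const Block& block : layout) {
    if (block.line != 0)
      current = block.line;  // a new location takes effect
    if (block.name == name)
      return current;
  }
  return 0;  // block not found in this layout
}
```

With the layout bbA, bbB, bbC, bbD, shared, the call `effectiveLine(layout, "shared")` yields 4, the location of bbD. If block placement instead put `shared` right after bbA, the same call would yield 1, even though neither line is related to the code in `shared`.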
(In reply to Reid Kleckner from comment #4)
> I committed r374267 which avoids putting line zero in cv line tables.
>
> (In reply to David Blaikie from comment #3)
> > I don't think that's necessarily a problem/in tension. If CodeView has no
> > way to represent ambiguous/unlocated instructions - then the line zero can
> > be ignored & the previous instructions location can be propagated as is the
> > behavior for some instructions that come from no particular location (I
> > think the frontend creates some instructions with no location (not location
> > zero, but no location at all) - so this is a supported flow/not a bug that's
> > going to be fixed/etc)
>
> Well, the more location-less instructions we have, the more important it
> becomes to implement some kind of data flow analysis to backfill source
> locations, when it might have been easier to keep them around during
> optimization.

Yep - agreed on all counts.
Is the VS behavior a CodeView limitation or a UI limitation? Either way it seems like there might be a need to file a VS bug. The VS message on this was extremely confusing, presumably because it had found the right file but fell down on the line number and didn't know how to express that.

I'm not sure if we should file a bug saying that their error message is rubbish, or that they should support #line 0 data, or if this is just pointless since clang-cl is fixed.

Thanks for the fix. This was quite painful when stepping through goma code.
*** Bug 44300 has been marked as a duplicate of this bug. ***