LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filed before November 26, 2021. Use GitHub to submit LLVM bugs.

Bug 30553 - Debug info generated for arrays is not what GDB expects (not as good as GCC's)
Status: REOPENED
Alias: None
Product: clang
Classification: Unclassified
Component: -New Bugs
Version: trunk
Hardware: PC Linux
Importance: P normal
Assignee: Adrian Prantl
URL:
Keywords:
Depends on:
Blocks: 24345
Reported: 2016-09-28 12:16 PDT by Caroline Tice
Modified: 2020-12-10 08:05 PST

See Also:
Fixed By Commit(s): 323952


Description Caroline Tice 2016-09-28 12:16:13 PDT
There is a very annoying difference in the way GCC and LLVM generate debug information for arrays. The symptom: when GDB is asked to print an array-type variable from a program compiled with GCC, it shows the array and its contents; when asked to print the same variable from the same program compiled with LLVM, all you get is a pointer:

GCC version:
(gdb) print vla
$1 = {5, 7, 9}
(gdb) print vlaref
$2 = (int (&)[3]) @0x7fffffffdc30: {5, 7, 9}
(gdb) print vlaref2
$3 = (const vlareftypedef) @0x7fffffffdc30: {5, 7, 9}

LLVM version:
(gdb) print vla
$1 = 0x7fffffffdc20
(gdb) print vlaref
$2 = (int (&)[]) @0x7fffffffdc20: 0x7fffffffdc20
(gdb) print vlaref2
$3 = (vlareftypedef) @0x7fffffffdc20: 0x7fffffffdc20

LLVM can't even tell gdb the length of the array, much less its contents!

In discussing this with Eric Christopher, he said:

A simple testcase is:

int foo(int a) {
  int vla[a];
  int sum = 0;

  for (int i = 0; i < a; ++i)
    vla[i] = i;
  for (int j = 0; j < a; ++j)
    sum += vla[j];

  return sum;
}

int main (void) {
  return foo(4);
}

What's happening is that we're not adding a DW_AT_upper_bound of type DW_FORM_expr/exprloc with the upper bound of the array.
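
A small sketch of the relationship between an array's element count and the bound described above. It assumes C's default lower bound of 0; the helper name upper_bound_for_count is hypothetical and is not part of LLVM, Clang, or GDB.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, for illustration only.
// A C array has a lower bound of 0, so its upper bound (the index of the
// last element) is one less than the element count.
std::uint64_t upper_bound_for_count(std::uint64_t count) {
  assert(count > 0 && "an empty array has no last index");
  return count - 1;
}
```

In the test case above, foo(4) creates vla with 4 elements, so the upper bound has to evaluate to 3 at run time. Because the value depends on the argument a, a compile-time constant cannot describe it.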
Comment 1 Florian Hahn 2018-01-03 03:13:44 PST
Sander is working on this and has put up a set of patches https://reviews.llvm.org/D41698
Comment 2 Florian Hahn 2018-02-05 02:16:37 PST
This was fixed in https://reviews.llvm.org/rL323952 

Thanks Sander!
Comment 3 Paul Robinson 2018-02-09 10:32:21 PST
Reopened on behalf of Carlos Enciso, who made this comment on Phabricator
post-commit:

Hi @sdesmalen!

First of all my apologies for commenting after the issue has been closed, but I do not have an account to add a comment to the associated bugzilla.

I have found what seems to be an issue with the current implementation.

For the given test case

  int main() {
    int size = 2;
  
    int vla_expr[size];
    vla_expr[1] = 1;
  
    return 0;
  }

and while debugging with LLDB, the following error is generated:

  (lldb) n
  Process 21014 stopped
  * thread #1, name = 'bad.out', stop reason = step over
      frame #0: 0x0000000000400502 bad.out`main at vla_2.cpp:7
     4   	  int vla_expr[size];
     5   	  vla_expr[1] = 1;
     6   	
  -> 7   	  return 0;
     8   	}
  
  (lldb) p vla_expr
  (unsigned long) $0 = 2
  
  (lldb) p vla_expr[1]
  error: subscripted value is not an array, pointer, or vector
  
  (lldb) 

Looking at the generated DWARF, there are 2 variables with the same name in the same scope:

  DW_TAG_subprogram "main"
    ...
    DW_TAG_variable "size"
    DW_TAG_variable "vla_expr"
    DW_TAG_variable "vla_expr"

I think there are 2 issues:

The compiler-generated variable 'vla_expr':

- should be flagged as artificial (DW_AT_artificial)
- its name should start with double underscore to avoid conflicting with user-defined names.

Thanks,
Carlos
Comment 4 Davide Italiano 2018-04-19 09:58:54 PDT
I really can't reproduce this on ToT, but I hit a different issue.

(lldb) frame var
(int) size = 2
(unsigned long) __vla_expr = 2
(int [81]) vla_expr = {
  [0] = 12872
  [1] = 1
  [2] = 0
  [3] = 0
  [4] = 0
  [5] = 0
  [6] = 0
  [7] = 0
  [8] = 2
  [9] = 0
  [10] = -272631232
  [11] = 32766
  [12] = 2
  [13] = 0
  [14] = 1738604614
  [15] = 1882609363
  [16] = -272631160
  [17] = 32766
  [18] = 1970217237
  [19] = 32767
  [20] = 1970217237
  [21] = 32767
  [22] = 0
  [23] = 0
  [24] = 1
  [25] = 0
  [26] = -272630848
  [27] = 32766
  [28] = 0
  [29] = 0
  [30] = -272630841
[...]

So the 81-element array is definitely off.
Comment 5 Davide Italiano 2018-04-19 10:01:16 PDT
And, FWIW, we already synthesize the variable as artificial and put two underscores in front of vla_expr:


0x00000051:         TAG_variable [4]
                     AT_location( fbreg -32 )
                     AT_name( "__vla_expr" )
                     AT_type( {0x00000074} ( long unsigned int ) )
                     AT_artificial( true )

0x0000005d:         TAG_variable [5]
                     AT_location( 0x00000000
                        0x0000000100000f49 - 0x0000000100000f6f: rsi+0 )
                     AT_name( "vla_expr" )
                     AT_decl_file( "/Users/davide/work/llvm-monorepo/build/bin/blah.c" )
                     AT_decl_line( 4 )
                     AT_type( {0x0000007b} ( int[] ) )
Comment 6 Davide Italiano 2018-04-19 10:34:32 PDT
Looking at the DWARF more closely, this is still a debug info generation bug.
For some reason, we emit an array with 0x51 elements:


0x0000007b:     TAG_array_type [7] *
                 AT_type( {0x0000006d} ( int ) )

0x00000080:         TAG_subrange_type [8]
                     AT_type( {0x0000008a} ( __ARRAY_SIZE_TYPE__ ) )
                     AT_count( {0x00000051} )

0x00000089:         NULL
Comment 7 Adrian Prantl 2018-04-19 10:39:15 PDT
What does llvm-dwarfdump --debug-info=0x51 say?
Comment 8 Davide Italiano 2018-04-19 10:44:33 PDT
davide@Davidinos-Mac-Pro ~/w/l/b/bin> ./llvm-dwarfdump --debug-info=0x51 ./blah.dSYM
blah.dSYM/Contents/Resources/DWARF/blah:	file format Mach-O 64-bit x86-64

.debug_info contents:

0x00000051: DW_TAG_variable
              DW_AT_location	(DW_OP_fbreg -32)
              DW_AT_name	("__vla_expr")
              DW_AT_type	(0x00000074 "long unsigned int")
              DW_AT_artificial	(true)
Comment 9 Adrian Prantl 2018-04-19 10:51:13 PDT
It looks like LLDB may be misinterpreting the DIE reference in the DW_AT_count attribute for a constant.
Comment 10 Davide Italiano 2018-04-19 12:03:13 PDT
I completely agree. I tried GDB and this is what I got:

(gdb) r
Starting program: /home/davide/llvm-work/build/bin/blah

Breakpoint 1, main () at blah.c:2
2               int size = 2;
(gdb) n
3               int blah[size];
(gdb) n
4               blah[1] = 2;
(gdb) n
5               return 0;
(gdb) p blah
$1 = {0, 2}

So, yes, lldb is missing support for this.
Comment 11 Davide Italiano 2018-04-19 13:44:13 PDT
This is the culprit.

Process 62787 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = step over
    frame #0: 0x00000001091cdbd4 liblldb.7.0.0.dylib`DWARFASTParserClang::ParseChildArrayInfo(this=0x00007facf23009d0, sc=0x00007ffee94cd4d8, parent_die=0x00007ffee94cd808, first_index=0x00007ffee94c95f0, element_orders=size=1, byte_stride=0x00007ffee94c95ec, bit_stride=0x00007ffee94c95e8) at DWARFASTParserClang.cpp:3702
   3699               break;
   3700
   3701             case DW_AT_count:
-> 3702               num_elements = form_value.Unsigned();
   3703               break;
   3704
   3705             case DW_AT_bit_stride:
Target 0: (lldb) stopped.

I wonder if this has ever worked :)
Comment 12 Davide Italiano 2018-04-19 13:52:11 PDT
The DWARF standard says (thanks to Adrian for pointing this out!):

The subrange entry may have the attributes DW_AT_lower_bound and DW_AT_upper_bound to specify, respectively, the lower and upper bound values of the subrange. The DW_AT_upper_bound attribute may be replaced by a DW_AT_count attribute, whose value describes the number of elements in the subrange rather than the value of the last element. The value of each of these attributes is determined as described in Section 2.19 on page 55.

2.19 Static and Dynamic Values of Attributes
[...]
The value of these attributes is determined based on the class as follows:
* For a constant, the value of the constant is the value of the attribute.
* For a reference, the value is a reference to another debugging information entry. This entry may:
– describe a constant which is the attribute value,
– describe a variable which contains the attribute value, or
– contain a DW_AT_location attribute whose value is a DWARF expression which computes the attribute value (for example, a DW_TAG_dwarf_procedure entry).
* For an exprloc, the value is interpreted as a DWARF expression; evaluation of the expression yields the value of the attribute.

lldb currently handles only the first case correctly.
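
The three cases quoted above can be sketched as a dispatch on the attribute's class. This is a toy model under stated assumptions, not LLDB's implementation: FormClass, AttrValue, VariableValues, count_as_constant, resolve_count and check_thread_example are all hypothetical names, the variable lookup is faked with a map from DIE offset to runtime value, and exprloc evaluation is not modeled.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>

// Toy model only: none of these types are LLDB or LLVM API.
enum class FormClass { Constant, Reference, Exprloc };

struct AttrValue {
  FormClass form_class;
  std::uint64_t raw; // constant value, or DIE offset when form_class is Reference
};

// Stand-in for "read the variable described by the DIE at this offset":
// maps a DIE offset to that variable's current runtime value.
using VariableValues = std::map<std::uint64_t, std::uint64_t>;

// The misreading discussed in this thread: every form is treated as a constant,
// so a DIE offset ends up being used as the element count.
std::uint64_t count_as_constant(const AttrValue &attr) { return attr.raw; }

// Class-aware reading: constants are used directly, references are followed
// to the variable they name. Returns nullopt when the value cannot be resolved.
std::optional<std::uint64_t> resolve_count(const AttrValue &attr,
                                           const VariableValues &vars) {
  switch (attr.form_class) {
  case FormClass::Constant:
    return attr.raw;
  case FormClass::Reference: {
    auto it = vars.find(attr.raw);
    if (it == vars.end())
      return std::nullopt; // referenced DIE not available
    return it->second;
  }
  case FormClass::Exprloc:
    return std::nullopt; // DWARF expression evaluation is not modeled here
  }
  return std::nullopt;
}

// Replays the numbers reported in this thread.
void check_thread_example() {
  const AttrValue count{FormClass::Reference, 0x51}; // DW_AT_count [DW_FORM_ref4] (0x00000051)
  const VariableValues vars{{0x51, 2}};              // __vla_expr = 2
  assert(count_as_constant(count) == 81);            // the int [81] seen earlier
  assert(resolve_count(count, vars).value() == 2);   // the actual element count
}
```

With the values from this thread (DW_AT_count referring to the DIE at 0x51, and __vla_expr holding 2), the constant reading yields 81, which matches the int [81] shown in comment 4, while following the reference yields 2.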
Comment 13 Davide Italiano 2018-04-19 13:53:46 PDT
dwarfdump -F makes this more clear (the fact that this is a reference):

0x0000007b:   DW_TAG_array_type
                DW_AT_type [DW_FORM_ref4]       (0x0000006d "int")

0x00000080:     DW_TAG_subrange_type
                  DW_AT_type [DW_FORM_ref4]     (0x0000008a "__ARRAY_SIZE_TYPE__")
                  DW_AT_count [DW_FORM_ref4]    (0x00000051)
Comment 14 Davide Italiano 2018-12-04 09:44:40 PST
Adrian, you fixed this one, didn't you?
Comment 15 Francois Pichet 2020-12-10 08:05:25 PST

I investigated some debug info problems with VLAs for an out-of-tree target and noticed:

- DEBUG_VALUEs associated with VLAs are not always propagated correctly across MBBs at O0.
- When a VLA is a parameter (i.e. int sumAll(int n, int A[n])), the debug info for A is just a regular pointer.