42914 – [AST] AutoType within AutoTypeLoc is missing deduced type.

LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filled before November 26, 2021. Use github to submit LLVM bugs

Bug 42914 - [AST] AutoType within AutoTypeLoc is missing deduced type.

Summary: [AST] AutoType within AutoTypeLoc is missing deduced type.

Status:	NEW

Alias:	None

Product:	clang
Classification:	Unclassified
Component:	-New Bugs (show other bugs)
Version:	trunk
Hardware:	PC Linux

Importance:	P enhancement
Assignee:	Unassigned Clang Bugs

URL:
Keywords:

Depends on:
Blocks:

Reported:	2019-08-07 06:13 PDT by Sam McCall
Modified:	2021-09-24 05:18 PDT (History)
CC List:	8 users (show)

See Also:
Fixed By Commit(s):

Attachments
Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Sam McCall 2019-08-07 06:13:34 PDT

the program `auto x = 4;` has an AST that looks like:

VarDecl type=T1
 - typeloc type=T2
 - integerliteral

T1 is correctly an auto type that wraps int.
However T2 is undeduced: it's ASTContext::getAutoDeductType().

This irregularity makes it harder/slower to write tools that (e.g.) care what an `auto` under the cursor expands to.

Comment 1 Nathan Ridge 2020-01-01 23:21:34 PST

I investigated this a bit. What seems to be happening is:

  * At the time the VarDecl node is created, we haven't seen the
    initializer (and thus computed the deduced type) yet, so
    we set the VarDecl's type to an empty DeducedType. The
    TypeSourceInfo stored in the DeclaratorDecl is set based on
    this as well.
 
  * Later, we encounter the initializer and compute the deduced
    type. This happens in Sema::DeduceVariableDeclarationType().

  * We update the deduced type on the VarDecl via
    ValueDecl::setType() [1]. This updates ValueDecl::DeclType,
    but it does not update DeclaratorDecl::DeclInfo, which
    continues to point to the TypeSourceInfo wrapping the old,
    empty DeducedType.

  * When RecursiveASTVisitor (which is what clangd uses to 
    get at the TypeLoc) encounters a DeclaratorDecl, it prefers
    to get the TypeLoc via the TypeSourceInfo if there is one [2],
    only falling back to ValueDecl::DeclType otherwise.
    Therefore, it gives us the undeduced, empty TypeLoc.

[1] https://github.com/llvm/llvm-project/blob/a2976c490da3b6d7253d4034ae507a760457ea18/clang/lib/Sema/SemaDecl.cpp#L11353
[2] https://github.com/llvm/llvm-project/blob/a2976c490da3b6d7253d4034ae507a760457ea18/clang/include/clang/AST/RecursiveASTVisitor.h#L1950

Comment 2 Nathan Ridge 2020-01-01 23:24:28 PST

Naively, I would think the solution ought to be that, in addition to calling VDecl->setType(), Sema::DeduceVariableDeclarationType() should also call VDecl->setTypeSourceInfo().

However, I don't at this stage understand how these types (Type / TypeLoc / TypeSourceInfo) interact well enough to know how to construct a TypeSourceInfo in that place.

Comment 3 Sam McCall 2020-01-02 06:37:46 PST

I heard that Ilya Biryukov and Richard Smith had a conversation about this bug, and it wasn't totally trivial.
But I'm not sure that was written down anywhere, it'd be good to get Richard's take (even if second-hand) on what needs to happen here.

Comment 4 Richard Smith 2020-01-10 14:47:23 PST

A couple of issues:

 1) For functions with deduced return types, it's important that the AST representation for the TypeSourceInfo does *not* contain the deduced type. Redeclaration matching requires that we preserve the non-deduced auto in the declared return type. So tooling that wants to inspect auto types needs to be able to look into the type, not just the type source info, in general.

 2) For variables with deduced types, it's important to serialization that we don't include the deduced type in the type source info. We load the type source info early (before merging), and doing so must not deserialize (say) a lambda used to initialize the variable, because that would introduce a deserialization cycle. We could perhaps avoid that for variables, by removing the type-checks we perform when merging declarations of variables across modules (those checks are strictly-speaking wrong anyway, since they mean we won't merge 'extern int n;' with 'auto n = 0;').

So, I think we could change our model so that we store the deduced type in the TypeSourceInfo of a variable. It would require some contortions in deserialization, and perhaps elsewhere. But I don't think we want to change our model for functions, where it's important that we preserve the declared type. So far, the consistency argument (the TypeSourceInfo should always represent the declared type) has won out over the convenience-for-tooling argument, especially given that tooling will need to deal with the deduced-type-is-not-in-the-TSI case *anyway* to properly handle functions with deduced return types. But perhaps someone can come up with a smart design that gives us the best of both worlds (perhaps we could represent the auto type in the TSI as a type that is canonically an undeduced auto but that tracks the deduced type via some type sugar?)

Comment 5 Nathan Ridge 2020-01-10 15:13:32 PST

(In reply to Richard Smith from comment #4)
> especially given that tooling will need to deal with the
> deduced-type-is-not-in-the-TSI case *anyway* to properly handle functions
> with deduced return types.

Note, it's RecursiveASTVisitor that ignores the type and visits the TSI in the case where we have a TSI.

Would it be reasonable to change the behaviour of RecursiveASTVisitor here? The actual tooling use cases (at least, the ones we've come across in clangd that I'm aware of) should "just work" if RAV visited the deduced type.