Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Memory leak in std::string comparison after libc++ aaef3b82f4f0 #60709

Closed
zmodem opened this issue Feb 13, 2023 · 15 comments
Closed

Memory leak in std::string comparison after libc++ aaef3b82f4f0 #60709

zmodem opened this issue Feb 13, 2023 · 15 comments
Assignees
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" consteval C++20 consteval

Comments

@zmodem
Copy link
Collaborator

zmodem commented Feb 13, 2023

It's mysterious, but bisection clearly points to aaef3b8. Here is a reduced reproducer:

$ cat /tmp/a.cc
#include <memory>
#include <string>

struct S {
  std::string ToString() { return std::string(data, sizeof(data)); }
  void inc() { data[0]++; }
  char data[40] = {'a'};
};

int main() {
  std::unique_ptr<S> s = std::make_unique<S>();
  for (; s->ToString() < "foo"; s->inc()) {
  }
  return 0;
}

$ build2/bin/clang++ -g -fsanitize=address -stdlib=libc++ /tmp/a.cc -std=c++20
$ ASAN_OPTIONS="detect_leaks=1" ASAN_SYMBOLIZER_PATH=build2/bin/llvm-symbolizer LD_LIBRARY_PATH=build2/lib/x86_64-unknown-linux-gnu ./a.out

=================================================================
==245534==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 48 byte(s) in 1 object(s) allocated from:
    #0 0x5622d7ad056d in operator new(unsigned long) /work/llvm-project/compiler-rt/lib/asan/asan_new_delete.cpp:95:3
    #1 0x7f6f1fe6ffa1 in std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>::__init(char const*, unsigned long) (build2/lib/x86_64-unknown-linux-gnu/libc++.so.1+0x5afa1)
    #2 0x5622d7ad3494 in S::ToString() /tmp/a.cc:5:35
    #3 0x5622d7ad2c5f in main /tmp/a.cc:12:13
    #4 0x7f6f1faea189 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16

SUMMARY: AddressSanitizer: 48 byte(s) leaked in 1 allocation(s).

Dropping the -std=c++20 flag makes the leak go away.

The diff below also makes it go away.

diff --git a/libcxx/include/__compare/ordering.h b/libcxx/include/__compare/ordering.h
index 572a15cf3485..1d02bc570d9f 100644
--- a/libcxx/include/__compare/ordering.h
+++ b/libcxx/include/__compare/ordering.h
@@ -40,7 +40,7 @@ template<class _Tp, class... _Args>
 inline constexpr bool __one_of_v = (is_same_v<_Tp, _Args> || ...);
 
 struct _CmpUnspecifiedParam {
-  _LIBCPP_HIDE_FROM_ABI consteval
+  _LIBCPP_HIDE_FROM_ABI constexpr
   _CmpUnspecifiedParam(int _CmpUnspecifiedParam::*) noexcept {}
 
   template<class _Tp, class = enable_if_t<!__one_of_v<_Tp, int, partial_ordering, weak_ordering, strong_ordering>>>
@zmodem zmodem added the libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. label Feb 13, 2023
zmodem added a commit that referenced this issue Feb 13, 2023
It causes mysterious memory leaks when comparing std::string, see GitHub
Issue #60709 and the code review.

> All supported compilers support `consteval`, so there is no more need for the macro.
>
> Reviewed By: ldionne, Mordante, #libc
>
> Spies: libcxx-commits
>
> Differential Revision: https://reviews.llvm.org/D143489

This reverts commit aaef3b8.
@philnik777
Copy link
Contributor

I'm pretty sure this is a compiler problem. Changing consteval to constexpr should never result in a memory leak.

@philnik777 philnik777 added the clang:frontend Language frontend issues, e.g. anything involving "Sema" label Feb 13, 2023
@llvmbot
Copy link
Collaborator

llvmbot commented Feb 13, 2023

@llvm/issue-subscribers-clang-frontend

@AaronBallman
Copy link
Collaborator

I'm pretty sure this is a compiler problem. Changing consteval to constexpr should never result in a memory leak.

constexpr functions can be deferred to runtime, whereas consteval functions cannot. So changing consteval into constexpr can result in memory leaks. Consider: https://godbolt.org/z/xdcK35jbY

It's also worth noting that Clang doesn't currently define __cpp_consteval: https://godbolt.org/z/Y1nfPr5jW

@philnik777
Copy link
Contributor

philnik777 commented Feb 22, 2023

I'm pretty sure this is a compiler problem. Changing consteval to constexpr should never result in a memory leak.

constexpr functions can be deferred to runtime, whereas consteval functions cannot. So changing consteval into constexpr can result in memory leaks. Consider: https://godbolt.org/z/xdcK35jbY

Sorry, I meant the other way around. Making the function consteval apparently resulted in a memory leak. And changing that back to constexpr resolved the issue.

It's also worth noting that Clang doesn't currently define __cpp_consteval: https://godbolt.org/z/Y1nfPr5jW

I am aware of that. The memory leak is still very surprising.

@AaronBallman
Copy link
Collaborator

I'm pretty sure this is a compiler problem. Changing consteval to constexpr should never result in a memory leak.
constexpr functions can be deferred to runtime, whereas consteval functions cannot. So changing consteval into constexpr can result in memory leaks. Consider: https://godbolt.org/z/xdcK35jbY

Sorry, I meant the other way around. Making the function consteval apparently resulted in a memory leak. And changing that back to constexpr resolved the issue.

Oh yeah that IS surprising!

It's also worth noting that Clang doesn't currently define __cpp_consteval: https://godbolt.org/z/Y1nfPr5jW

I am aware of that. The memory leak is still very surprising.

Agreed, and I'm not suggesting this isn't necessarily a frontend issue in Clang.

@ldionne ldionne removed the libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. label Mar 30, 2023
@ldionne
Copy link
Member

ldionne commented Mar 30, 2023

Removing the libc++ tag since this was fixed in libc++ by using constexpr instead of consteval in c6b12b7. I still think this is worth investigating on Clang's side cause it's kinda scary.

@cor3ntin cor3ntin added consteval C++20 consteval needs-reduction Large reproducer that should be reduced into a simpler form labels Jun 5, 2023
@cor3ntin
Copy link
Contributor

cor3ntin commented Jun 6, 2023

I wonder if that could be related to #46711 - IE, code gen bug

@Fznamznon
Copy link
Contributor

Fznamznon commented Jun 6, 2023

Still reproducible with trunk and the diff

diff --git a/libcxx/include/__compare/ordering.h b/libcxx/include/__compare/ordering.h
index c348f0433a32..e0104c018879 100644
--- a/libcxx/include/__compare/ordering.h
+++ b/libcxx/include/__compare/ordering.h
@@ -40,7 +40,7 @@ template<class _Tp, class... _Args>
 inline constexpr bool __one_of_v = (is_same_v<_Tp, _Args> || ...);

 struct _CmpUnspecifiedParam {
-  _LIBCPP_HIDE_FROM_ABI constexpr
+  _LIBCPP_HIDE_FROM_ABI consteval
   _CmpUnspecifiedParam(int _CmpUnspecifiedParam::*) noexcept {}

   template<class _Tp, class = enable_if_t<!__one_of_v<_Tp, int, partial_ordering, weak_ordering, strong_ordering>>>

I got the exact same error.

==271377==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 48 byte(s) in 1 object(s) allocated from:
    #0 0x5583912d0c3d in operator new(unsigned long) /source/llorg/llvm-project/compiler-rt/lib/asan/asan_new_delete.cpp:95:3
    #1 0x7f758397e681 in std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>::__init(char const*, unsigned long) (/source/llorg/llvm-project/build/lib/x86_64-unknown-linux-gnu/libc++.so.1+0x5c681)
    #2 0x5583912d3884 in S::ToString() /testing/60709/t.cpp:5:35
    #3 0x5583912d304f in main /testing/60709/t.cpp:12:13
    #4 0x7f7582562492 in __libc_start_main (/lib64/libc.so.6+0x23492) (BuildId: a91d7b51b23a64e86b82a91511c446c137d3ec8e)

SUMMARY: AddressSanitizer: 48 byte(s) leaked in 1 allocation(s).

@Fznamznon Fznamznon self-assigned this Jun 12, 2023
@Fznamznon
Copy link
Contributor

Fznamznon commented Jun 13, 2023

Reduced:

struct _CmpUnspecifiedParam {
    consteval _CmpUnspecifiedParam(int _CmpUnspecifiedParam::*) noexcept {} // Change here to constexpr to make leak go away
};

struct ordering {
    int value;
    friend constexpr bool operator<(ordering lhs, _CmpUnspecifiedParam) { return lhs.value > 0; }
};

class string {
public:
  string(const char* data, unsigned int size) {Data = new char[size];}

  friend ordering operator<=>(string lhs, const char* rhs) { return ordering{1}; }
  ~string() { delete[] Data;}
  private:
  char *Data;
};

struct S {
  string ToString() { return string(data, sizeof(data)); }
  char data[10] = {'a'};
};

int main() {
  S s;
  int i = 0;
  do {
    ++i;
  } while (i < 2 && s.ToString() < "whatev");
  return 0;
}

It seems changing constructor of _CmpUnspecifiedParam to consteval causes missing destructor for temporary string created in loop condition. Can be seen in IR. In AST can be observed as moving of ExprWithCleanups from above of the whole && operator to above of _CmpUnspecifiedParam constant expression. Error message is:

Direct leak of 10 byte(s) in 1 object(s) allocated from:
    #0 0x560095bf303d in operator new[](unsigned long) /source/llorg/llvm-project/compiler-rt/lib/asan/asan_new_delete.cpp:98:3
    #1 0x560095bf5a14 in string::string(char const*, unsigned int) /testing/60709/red.cpp:12:55
    #2 0x560095bf5984 in S::ToString() /testing/60709/red.cpp:21:30
    #3 0x560095bf534d in main /testing/60709/red.cpp:30:23
    #4 0x7fcd1885f492 in __libc_start_main (/lib64/libc.so.6+0x23492) (BuildId: a91d7b51b23a64e86b82a91511c446c137d3ec8e)

SUMMARY: AddressSanitizer: 10 byte(s) leaked in 1 allocation(s).

Looking further.

@Fznamznon Fznamznon removed the needs-reduction Large reproducer that should be reduced into a simpler form label Jun 13, 2023
@Fznamznon
Copy link
Contributor

Fznamznon commented Jun 14, 2023

The problematic bit seems to be this:

E = MaybeCreateExprWithCleanups(E);

I'm not quite sure why immediate invocation would need a cleanup, but calling MaybeCreateExprWithCleanups causes call to DiscardCleanupsInEvaluationContext which discards later required cleanup for temporary string created by loop condition.
Removing this call to MaybeCreateExprWithCleanups fixes memory leak in reduced and original example and passes check-clang. Still not quite sure if this a correct thing to do though.

@shafik
Copy link
Collaborator

shafik commented Jun 14, 2023

CC @zygoloid for any insights he might have based on the information you gathered so far

@ilya-biryukov
Copy link
Contributor

For the record, I'm also looking into it. I can confirm @Fznamznon's conclusions and I also think the line mentioned above is probably wrong. I would assume that immediate invocations never need to run the cleanups themselves (there are no consteval destructors) and any destructors they have to run need to run at full expressions positions, just like any other.
Removing the line seems like the right fix. @zygoloid are we missing anything here?

Here are a few more details. A smaller repro is:

struct iparam {
  consteval iparam() {}
};
struct string {
    string(int v) { this->data = new int(v); }
    ~string() { delete data; }
private:
    int *data;
};
int main() {
  for (;string(1), iparam(), false;);
  return 0;
}

See godbolt.
if iparam is not consteval the error goes away. This confirms what @Fznamznon observed earlier, adding an immediate invocation makes ExprWithCleanups wrap the immediate invocation instead of the full expression.
AST becomes:

    |-ForStmt 0x55929cd8ae98 <line:11:3, col:37>
    …
    | |-BinaryOperator 0x55929cd8ae70 <col:9, col:30> 'bool' ','
    | | |-BinaryOperator 0x55929cd8ae40 <col:9, col:27> 'iparam':'iparam' ','
    …
    | | |   `-ConstantExpr 0x55929cd8acb0 <col:20, col:27> 'iparam':'iparam'
    | | |     |-value: Struct
    | | |     `-ExprWithCleanups 0x55929cd8ac98 <col:20, col:27> 'iparam':'iparam'
    | | |       `-CXXTemporaryObjectExpr 0x55929cd8ac68 <col:20, col:27> 'iparam':'iparam' 'void ()'

If constructor is not consteval, the AST is:

    |-ForStmt 0x564ab2279a78 <line:11:3, col:37>
    …
    | |-ExprWithCleanups 0x564ab2279a58 <col:9, col:30> 'bool'
    | | `-BinaryOperator 0x564ab2279a38 <col:9, col:30> 'bool' ','
    | |   |-BinaryOperator 0x564ab2279a08 <col:9, col:27> 'iparam':'iparam' ','
    …
    | |   | `-CXXTemporaryObjectExpr 0x564ab22798c8 <col:20, col:27> 'iparam':'iparam' 'void ()'

Note that ExprWithCleanups is unnecessary for the ConstantExpr, I suspect it's created by mistake whenever sibling expressions we saw earlier (in traversal order) need cleanup. E.g. the error goes away if the temporary is created after immediate invocation:

  for (;iparam(), string(1), false;);

Or if the temporary is created both before and after the immediate invocation:

  for (;string(1), iparam(), string(1), false;);

In this case, we get actually get two ExprWithCleanups:

    |-ForStmt 0x55eb1a703e50 <line:11:3, col:48>
    …
    | |-ExprWithCleanups 0x55eb1a703e30 <col:9, col:41> 'bool'
    | | `-BinaryOperator 0x55eb1a703e10 <col:9, col:41> 'bool' ','
    | |   |-BinaryOperator 0x55eb1a703de0 <col:9, col:38> 'string':'string' ','
    | |   | |-BinaryOperator 0x55eb1a703d10 <col:9, col:27> 'iparam':'iparam' ','
    | |   | | |-CXXFunctionalCastExpr 0x55eb1a7036a8 <col:9, col:17> 'string':'string' functional cast to string <ConstructorConversion>
    | |   | | | `-CXXBindTemporaryExpr 0x55eb1a703688 <col:9, col:17> 'string':'string' (CXXTemporary 0x55eb1a703688)
    | |   | | |   `-CXXConstructExpr 0x55eb1a703650 <col:9, col:17> 'string':'string' 'void (int)'
    | |   | | |     `-IntegerLiteral 0x55eb1a703340 <col:16> 'int' 1
    | |   | | `-CXXFunctionalCastExpr 0x55eb1a703ce8 <col:20, col:27> 'iparam':'iparam' functional cast to iparam <NoOp>
    | |   | |   `-ConstantExpr 0x55eb1a703b80 <col:20, col:27> 'iparam':'iparam'
    | |   | |     |-value: Struct
    | |   | |     `-ExprWithCleanups 0x55eb1a703b68 <col:20, col:27> 'iparam':'iparam'
    | |   | |       `-CXXTemporaryObjectExpr 0x55eb1a703b38 <col:20, col:27> 'iparam':'iparam' 'void ()'
    | |   | `-CXXFunctionalCastExpr 0x55eb1a703db8 <col:30, col:38> 'string':'string' functional cast to string <ConstructorConversion>
    | |   |   `-CXXBindTemporaryExpr 0x55eb1a703d98 <col:30, col:38> 'string':'string' (CXXTemporary 0x55eb1a703d98)
    | |   |     `-CXXConstructExpr 0x55eb1a703d60 <col:30, col:38> 'string':'string' 'void (int)'
    | |   |       `-IntegerLiteral 0x55eb1a703d40 <col:37> 'int' 1

@Fznamznon
Copy link
Contributor

I also have another guess, that even though the line I mentioned above may not be correct, it should bring no harm. I suspect that Cleanup.exprNeedsCleanups() shouldn't return true for the immediate invocation.

@ilya-biryukov given that your findings confirm my earlier conclusion, what about I post a patch removing the problematic bit?

@ilya-biryukov
Copy link
Contributor

I suspect that Cleanup.exprNeedsCleanups() shouldn't return true for the immediate invocation.

IIUC the Cleanup aims to capture whether a full expression being processed needs cleanups, so Cleanup.exprNeedsCleanups() == true at a point where immediate invocation is being processed seems fine. It marks that previously processed parts of full expression needed cleanup, exactly as expected.
Also, certain immediate invocations might require a cleanup, see this example:

struct param {
    consteval param() { val = nullptr; allocated = false; } 
    ~param() { delete val; }

    param& assign(int v) {
        if (!allocated) {
            val = new int;
            allocated = true;
        }
        *val = v;
        return *this;
    }

    int get() { return *val; }

private:
    int *val;
    bool allocated;
};

int main() {
    // param() is an immediate invocation,
    // but we need to run the destructor for a temporary produced by it.
    for (; param().assign(10).get() < 5;) {}
}

what about I post a patch removing the problematic bit?

@Fznamznon this sounds good, let's try this out. I think it would also be valuable to add tests that check AST and codegen is doing the right thing here. We could continue the discussion in Phabricator.

@Fznamznon
Copy link
Contributor

I've created https://reviews.llvm.org/D153294

IIUC the Cleanup aims to capture whether a full expression being processed needs cleanups, so Cleanup.exprNeedsCleanups() == true at a point where immediate invocation is being processed seems fine.

I see! Thank you for the explanation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" consteval C++20 consteval
Projects
Status: Done
Development

No branches or pull requests

9 participants