Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

clang 14/15 libc++: std::vector memory leak in case of iterator exception #58392

Closed
pkl97 opened this issue Oct 16, 2022 · 7 comments
Closed
Assignees
Labels
libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.

Comments

@pkl97
Copy link

pkl97 commented Oct 16, 2022

The program below shows this memory leak when executed in Valgrind:

[user@host]$ valgrind --leak-check=full ./LeakTest
==725012== Memcheck, a memory error detector
==725012== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==725012== Using Valgrind-3.20.0.GIT and LibVEX; rerun with -h for copyright info
==725012== Command: ./LeakTest
==725012==
Exceeded limit!
==725012==
==725012== HEAP SUMMARY:
==725012==     in use at exit: 40 bytes in 2 blocks
==725012==   total heap usage: 6 allocs, 4 frees, 1,272 bytes allocated
==725012==
==725012== 24 bytes in 1 blocks are definitely lost in loss record 2 of 2
==725012==    at 0x4C3761F: operator new(unsigned long) (vg_replace_malloc.c:432)
==725012==    by 0x10CF74: void* std::__1::__libcpp_operator_new[abi:v15002]<unsigned long>(unsigned long) (clang15el8/bin/../include/c++/v1/new:246)
==725012==    by 0x10CEFC: std::__1::__libcpp_allocate[abi:v15002](unsigned long, unsigned long) (clang15el8/bin/../include/c++/v1/new:272)
==725012==    by 0x10CE67: std::__1::allocator<unsigned int>::allocate[abi:v15002](unsigned long) (clang15el8/bin/../include/c++/v1/__memory/allocator.h:112)
==725012==    by 0x10CACC: std::__1::__allocation_result<std::__1::allocator_traits<std::__1::allocator<unsigned int> >::pointer> std::__1::__allocate_at_least[abi:v15002]<std::__1::allocator<unsigned int> >(std::__1::allocator<unsigned int>&, unsigned long) (clang15el8/bin/../include/c++/v1/__memory/allocate_at_least.h:54)
==725012==    by 0x10C852: std::__1::vector<unsigned int, std::__1::allocator<unsigned int> >::__vallocate[abi:v15002](unsigned long) (clang15el8/bin/../include/c++/v1/vector:682)
==725012==    by 0x10C5E7: std::__1::vector<unsigned int, std::__1::allocator<unsigned int> >::vector<MyIterator>(MyIterator, std::__1::enable_if<__is_cpp17_forward_iterator<MyIterator>::value&&is_constructible<unsigned int, std::__1::iterator_traits<MyIterator>::reference>::value, MyIterator>::type) (clang15el8/bin/../include/c++/v1/vector:1157)
==725012==    by 0x10C23C: main (main.cpp:57)
==725012==
==725012== LEAK SUMMARY:
==725012==    definitely lost: 24 bytes in 1 blocks
==725012==    indirectly lost: 0 bytes in 0 blocks
==725012==      possibly lost: 0 bytes in 0 blocks
==725012==    still reachable: 16 bytes in 1 blocks
==725012==         suppressed: 0 bytes in 0 blocks
==725012== Reachable blocks (those to which a pointer was found) are not shown.
==725012== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==725012==
==725012== For lists of detected and suppressed errors, rerun with: -s
==725012== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

The program defines an iterator that at some point throws an exception in operator*(). Such behavior is common e.g. for Unicode iterators (UTF-8 and UTF-16) if they encounter illegal byte sequences (e.g. missing parts of surrogate pairs or incorrect leading/continuation bytes). It can be triggered with Boost.Regex when combined with the ICU library.
The iterator constructor of std::vector is used to actually trigger the memory leak.

Combinations where the leak occurs:

  • clang 15.0.2 with libc++ (clang++ -Wall -Wextra -std=c++20 -stdlib=libc++ main.cpp -o LeakTest)
  • clang 14.0.6 with libc++

Combinations where no leak occurs:

  • clang 11.0.1 with libc++
  • clang 14.0.6 with libstdc++ (clang++ -Wall -Wextra -std=c++20 -stdlib=libstdc++ main.cpp -o LeakTest)
  • gcc 12.2 with libstdc++ (g++ -Wall -Wextra -std=c++20 main.cpp -o LeakTest)

This looks like a regression in recent versions of libc++.

The program to reproduce the issue:

#include <iostream>
#include <vector>

class MyIterator
{
public:
    using iterator_category = std::forward_iterator_tag;
    using difference_type = std::ptrdiff_t;
    using value_type = unsigned int;
    using pointer = value_type*;
    using reference = value_type&;
    using const_reference = const value_type&;

    using ContainerType = std::vector<value_type>;
    using ContainerTypeConstIterator = typename ContainerType::const_iterator;

    const_reference operator*() const
    {
        static unsigned int s_counter = 0;
        if (++s_counter == 2U)
            throw std::runtime_error("Exceeded limit!");
        return *m_itValue;
    }

    bool operator==(const MyIterator&) const = default;

    MyIterator& operator++()
    {
        ++m_itValue;
        return *this;
    }

    static MyIterator begin(const ContainerType& p_values)
    {
        return MyIterator(p_values.begin(), p_values.end());
    }

    static MyIterator end(const ContainerType& p_values)
    {
        return MyIterator(p_values.end(), p_values.end());
    }

private:
    MyIterator(const ContainerTypeConstIterator& p_itValue, const ContainerTypeConstIterator& p_itEnd) : m_itValue(p_itValue), m_itEnd(p_itEnd)
    {
    }

    ContainerTypeConstIterator m_itValue;
    const ContainerTypeConstIterator m_itEnd;
};

int main()
{
    try
    {
        const std::vector<unsigned int> values{ 0, 1, 2, 3, 4, 5 };
        const std::vector<unsigned int> copy(MyIterator::begin(values), MyIterator::end(values));
    }
    catch (const std::runtime_error& e) {
        std::cout << e.what() << std::endl;
    }
    return 0;
}
@EugeneZelenko EugeneZelenko added libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. and removed new issue labels Oct 16, 2022
@JoeLoser
Copy link
Member

CC: @philnik777

@pkl97
Copy link
Author

pkl97 commented Oct 17, 2022

Did some more testing. The problem does not occur with clang 12.0.1 and 13.0.1 together with libc++, so it looks like a regression starting with clang 14.
It is however independent of the C++ Standard used. It also occurs when using C++17 or C++14 (if operator==() is properly implemented of course).

Here is a sample program that uses Boost.Regex in combination with ICU:

#include <boost/regex/icu.hpp>

int main()
{
    try
    {
        // clang 15 leaks 12 bytes of memory
        boost::make_u32regex("^\xed\xa0\x80$"); // single lowest lead surrogate U+D800
    }
    catch (...) {
    }
    return EXIT_SUCCESS;
}

It leaks 12 bytes when run. The given string is the UTF-8 encoding of a single lead surrogate which is illegal because it is only one half of a character. Internally operator*() throws in Boost.Regex.

@EugeneZelenko EugeneZelenko added clang:codegen and removed libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. labels Oct 17, 2022
@llvmbot
Copy link
Collaborator

llvmbot commented Oct 17, 2022

@llvm/issue-subscribers-clang-codegen

@philnik777
Copy link
Contributor

This is actually a libc++ bug. We removed the base class in LLVM14, which caused this bug. I'll look into how to fix the bug in a nice way. Unfortunately we don't have delegating constructors yet in C++03. @var-const do you have interest in adding that to C++03? You mentioned something in a patch. Pinging @ldionne for awareness.

@philnik777 philnik777 added libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. and removed clang:codegen labels Oct 17, 2022
@philnik777 philnik777 self-assigned this Oct 17, 2022
@pkl97
Copy link
Author

pkl97 commented Nov 3, 2022

Any idea when a fix for this issue could be available?

I assume there is no chance for a backport to clang 15?

@philnik777
Copy link
Contributor

Sorry, I had something else to do and then forgot about it. I've got https://reviews.llvm.org/D138601 to fix the issue now. I hope we can get it into LLVM15, since it's a pretty serious issue.

tstellar pushed a commit that referenced this issue Jan 12, 2023
Fixes #58392

Reviewed By: ldionne, #libc

Spies: alexfh, hans, joanahalili, dblaikie, libcxx-commits

Differential Revision: https://reviews.llvm.org/D138601
@pkl97
Copy link
Author

pkl97 commented Jan 14, 2023

Thank you for the fix and the backport to clang 15.0.7!

Works like a charm.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.
Projects
None yet
Development

No branches or pull requests

5 participants