LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filled before November 26, 2021. Use github to submit LLVM bugs

Bug 23293 - libcxx: racy use-after-free on condition_variable
Summary: libcxx: racy use-after-free on condition_variable
Status: RESOLVED FIXED
Alias: None
Product: libc++
Classification: Unclassified
Component: All Bugs (show other bugs)
Version: unspecified
Hardware: PC Linux
: P release blocker
Assignee: Eric Fiselier
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-04-20 05:43 PDT by Dmitry Vyukov
Modified: 2015-06-12 09:18 PDT (History)
8 users (show)

See Also:
Fixed By Commit(s):


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Dmitry Vyukov 2015-04-20 05:43:53 PDT
clang version 3.7.0 (trunk 235293)
Target: x86_64-unknown-linux-gnu

The program is:

#include <iostream>
#include <memory>
#include <future>
#include <atomic>

std::atomic<int> _counter;

int main()
{
    _counter = 1;

    auto f = std::async(std::launch::async,
                        []{
                            _counter += 2;
                            return true;
                        });
    std::cout << "Result: " << std::boolalpha << f.get() << "\n";
    return 0;
}

ThreadSanitizer says:

==================
WARNING: ThreadSanitizer: data race (pid=19479)
  Write of size 8 at 0x7d200000bfc0 by main thread:
    #0 pthread_cond_destroy /ssd/src/llvm/build/../projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:1153 (a.out+0x00000044aa8c)
    #1 std::__1::condition_variable::~condition_variable() build/../projects/libcxx/src/condition_variable.cpp:23:5 (libc++.so.1+0x00000008cb69)
    #2 std::__1::__assoc_sub_state::~__assoc_sub_state() build/bin/../include/c++/v1/future:515:24 (a.out+0x00000049999a)
    #3 std::__1::__async_assoc_state<bool, std::__1::__async_func<main::$_0> >::~__async_assoc_state() build/bin/../include/c++/v1/future:940:7 (a.out+0x0000004992d5)
    #4 std::__1::__assoc_state<bool>::__on_zero_shared() build/bin/../include/c++/v1/future:635:5 (a.out+0x000000499a63)
    #5 std::__1::__async_assoc_state<bool, std::__1::__async_func<main::$_0> >::__on_zero_shared() build/bin/../include/c++/v1/future:990:5 (a.out+0x00000049930d)
    #6 std::__1::__shared_count::__release_shared() build/../projects/libcxx/src/memory.cpp:63:9 (libc++.so.1+0x00000008eb20)
    #7 std::__1::__release_shared_count::operator()(std::__1::__shared_count*) build/bin/../include/c++/v1/future:1155:41 (a.out+0x00000049a155)
    #8 std::__1::unique_ptr<std::__1::__shared_count, std::__1::__release_shared_count>::reset(std::__1::__shared_count*) build/bin/../include/c++/v1/memory:2669:13 (a.out+0x0000004996bd)
    #9 std::__1::unique_ptr<std::__1::__shared_count, std::__1::__release_shared_count>::~unique_ptr() build/bin/../include/c++/v1/memory:2637 (a.out+0x0000004996bd)
    #10 std::__1::future<bool>::get() build/bin/../include/c++/v1/future:1173 (a.out+0x0000004996bd)
    #11 main /tmp/shared_ptr.cc:17:50 (a.out+0x000000498f62)

  Previous read of size 8 at 0x7d200000bfc0 by thread T1:
    #0 pthread_cond_broadcast /ssd/src/llvm/build/../projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:1146 (a.out+0x000000447e40)
    #1 std::__1::condition_variable::notify_all() build/../projects/libcxx/src/condition_variable.cpp:35:5 (libc++.so.1+0x00000008cbc9)
    #2 void std::__1::__assoc_state<bool>::set_value<bool>(bool&&) build/bin/../include/c++/v1/future:655:5 (a.out+0x000000499c01)
    #3 std::__1::__async_assoc_state<bool, std::__1::__async_func<main::$_0> >::__execute() build/bin/../include/c++/v1/future:975:9 (a.out+0x00000049934e)
    #4 _ZNSt3__18__invokeIMNS_19__async_assoc_stateIbNS_12__async_funcIZ4mainE3$_0JEEEEEFvvEPS5_JEvEEDTcldsdeclsr3std3__1E7forwardIT0_Efp0_Efp_spclsr3std3__1E7forwardIT1_Efp1_EEEOT_OS9_DpOSA_ build/bin/../include/c++/v1/__functional_base:382:12 (a.out+0x0000004994a1)
    #5 void std::__1::__thread_execute<void (std::__1::__async_assoc_state<bool, std::__1::__async_func<main::$_0> >::*)(), std::__1::__async_assoc_state<bool, std::__1::__async_func<main::$_0> >*, 1ul>(std::__1::tuple<void (std::__1::__async_assoc_state<bool, std::__1::__async_func<main::$_0> >::*)(), std::__1::__async_assoc_state<bool, std::__1::__async_func<main::$_0> >*>&, std::__1::__tuple_indices<1ul>) build/bin/../include/c++/v1/thread:337 (a.out+0x0000004994a1)
    #6 void* std::__1::__thread_proxy<std::__1::tuple<void (std::__1::__async_assoc_state<bool, std::__1::__async_func<main::$_0> >::*)(), std::__1::__async_assoc_state<bool, std::__1::__async_func<main::$_0> >*> >(void*) build/bin/../include/c++/v1/thread:347 (a.out+0x0000004994a1)

  Location is heap block of size 120 at 0x7d200000bf80 allocated by main thread:
    #0 operator new(unsigned long) /ssd/src/llvm/build/../projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:615 (a.out+0x00000043e08f)
    #1 std::__1::future<bool> std::__1::__make_async_assoc_state<bool, std::__1::__async_func<main::$_0> >(std::__1::__async_func<main::$_0>&&) build/bin/../include/c++/v1/future:2312:13 (a.out+0x000000499035)
    #2 std::__1::future<std::__1::__invoke_of<std::__1::decay<main::$_0>::type>::type> std::__1::async<main::$_0>(std::__1::launch, main::$_0&&) build/bin/../include/c++/v1/future:2361:16 (a.out+0x000000498fce)
    #3 main /tmp/shared_ptr.cc:12:14 (a.out+0x000000498f0d)

  Thread T1 (tid=19481, finished) created by main thread at:
    #0 pthread_create /ssd/src/llvm/build/../projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:951 (a.out+0x000000447123)
    #1 std::__1::thread::thread<void (std::__1::__async_assoc_state<bool, std::__1::__async_func<main::$_0> >::*)(), std::__1::__async_assoc_state<bool, std::__1::__async_func<main::$_0> >*, void>(void (std::__1::__async_assoc_state<bool, std::__1::__async_func<main::$_0> >::*&&)(), std::__1::__async_assoc_state<bool, std::__1::__async_func<main::$_0> >*&&) build/bin/../include/c++/v1/thread:359:16 (a.out+0x000000499282)
    #2 std::__1::future<bool> std::__1::__make_async_assoc_state<bool, std::__1::__async_func<main::$_0> >(std::__1::__async_func<main::$_0>&&) build/bin/../include/c++/v1/future:2313:5 (a.out+0x0000004990a4)
    #3 std::__1::future<std::__1::__invoke_of<std::__1::decay<main::$_0>::type>::type> std::__1::async<main::$_0>(std::__1::launch, main::$_0&&) build/bin/../include/c++/v1/future:2361:16 (a.out+0x000000498fce)
    #4 main /tmp/shared_ptr.cc:12:14 (a.out+0x000000498f0d)

SUMMARY: ThreadSanitizer: data race build/../projects/libcxx/src/condition_variable.cpp:23:5 in std::__1::condition_variable::~condition_variable()
==================

The program was built as:

clang++ test.cc -fsanitize=thread -std=c++14 -stdlib=libc++ -lc++abi -Wall -g -O1

with tsan-instrumented libcxx (see instructions here https://code.google.com/p/memory-sanitizer/wiki/LibcxxHowTo).

The problem is here:

__assoc_state<_Rp>::set_value(_Arg&& __arg)
{
    unique_lock<mutex> __lk(this->__mut_);
#ifndef _LIBCPP_NO_EXCEPTIONS
    if (this->__has_value())
        throw future_error(make_error_code(future_errc::promise_already_satisfied));
#endif
    ::new(&__value_) _Rp(_VSTD::forward<_Arg>(__arg));
    this->__state_ |= base::__constructed | base::ready;
    __lk.unlock();
    __cv_.notify_all();
}

The function first unlocks the mutex and after than signals the cond var. But once the mutex is unlocked the object can be destroyed as it is in ready state.
Comment 1 Eric Fiselier 2015-04-20 06:07:18 PDT
Howard is no longer active in libc++. I'll take a look at this. I am already somewhat familiar with the issue.
Comment 2 Eric Fiselier 2015-05-12 17:30:48 PDT
I'm not 100% convinced that this is a real defect. TSAN is reporting the race condition on the pthread_condvar_t in std::condition_variable. I think that TSAN's semantics for condition_variable are incorrect is some cases and this may be one of those.

Could you provide a pseudo-example where you think this will cause an error? I just want to fully understand this before I make any changes.
Comment 3 Dmitry Vyukov 2015-05-13 01:37:20 PDT
What's wrong with my reasoning? Condition variables are semantical no-ops, so there is nothing to model.
Comment 4 Eric Fiselier 2015-05-13 14:19:20 PDT
> What's wrong with my reasoning?

I just don't see the race condition in the example you gave. I'm not saying it is not there but it isn't clear to me yet. I don't see how the __assoc_state can get destroyed before it exits the call to get_value(...) even if it releases the lock.

However I don't think the TSAN diagnostics are caused by the race condition you described.

> Condition variables are semantical no-ops, so there is nothing to model.

Sorry, I see you've done a bunch of work on how TSAN handles condition variables recently. I wasn't aware that a lot had changed in the past month. Why is it that TSAN's interceptor for pthread_cond_wait(Cv, Mut) releases the mutex before marking Cv as read? 
(see: https://github.com/llvm-mirror/compiler-rt/blob/master/lib/tsan/rtl/tsan_interceptors.cc#L1005)
Comment 5 Eric Fiselier 2015-05-19 18:46:01 PDT
Ping Dimitry, Kostya, Eugenis:

Why is it that TSAN's interceptor for pthread_cond_wait(Cv, Mut) releases the mutex before marking Cv as read?
Comment 6 Dmitry Vyukov 2015-05-20 03:44:23 PDT
Eric,

Sorry for the delay, I was OOO.

I don't think that I did any work on condvar in the last months. cond_wait interceptor always released and reaqcuired the mutex. Why? Because that's what pthread_cond_wait itself does -- it releases the mutex before waiting, otherwise the mutex-protected predicate won't be able to change.

> I don't see how the __assoc_state can get destroyed before it exits the call to get_value(...).

You misread the report and my explanation. It is not get_value that runs concurrently with destruction, it is set_value that runs concurrently with destruction.

The report says that:
1. set_value changes object state to "ready" and releases the mutex.
2. then get_value observes the "ready" state and destroys the object.
3. then set_value signals on the condvar which is already destroyed.
Comment 7 Eric Fiselier 2015-05-21 07:00:13 PDT
>You misread the report and my explanation. It is not get_value that runs 
>concurrently with destruction, it is set_value that runs concurrently with 
>destruction.
>
>The report says that:
>1. set_value changes object state to "ready" and releases the mutex.
>2. then get_value observes the "ready" state and destroys the object.
>3. then set_value signals on the condvar which is already destroyed.

Thanks for pointing out my mistake. I incorrectly assumed that the producer that called set_value had a reference to the shared state. I'll try and push a fix for this today.
Comment 8 Eric Fiselier 2015-05-21 10:01:18 PDT
FYI, I filed a bug against compiler-rt that documents my concerns with how TSAN handles condition variables.

https://llvm.org/bugs/show_bug.cgi?id=23616
Comment 9 Eric Fiselier 2015-06-12 09:18:27 PDT
Fixed in r239577. Thanks for the analysis and sorry for the delay.