Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libcxx: racy use-after-free on condition_variable #23667

Closed
KernelAddress mannequin opened this issue Apr 20, 2015 · 9 comments
Closed

libcxx: racy use-after-free on condition_variable #23667

KernelAddress mannequin opened this issue Apr 20, 2015 · 9 comments
Labels
bugzilla Issues migrated from bugzilla libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.

Comments

@KernelAddress
Copy link
Mannequin

KernelAddress mannequin commented Apr 20, 2015

Bugzilla Link 23293
Resolution FIXED
Resolved on Jun 12, 2015 09:18
Version unspecified
OS Linux
CC @KernelAddress,@eugenis,@ramosian-glider,@kcc,@mclow

Extended Description

clang version 3.7.0 (trunk 235293)
Target: x86_64-unknown-linux-gnu

The program is:

#include
#include
#include
#include

std::atomic _counter;

int main()
{
_counter = 1;

auto f = std::async(std::launch::async,
                    []{
                        _counter += 2;
                        return true;
                    });
std::cout << "Result: " << std::boolalpha << f.get() << "\n";
return 0;

}

ThreadSanitizer says:

==================
WARNING: ThreadSanitizer: data race (pid=19479)
Write of size 8 at 0x7d200000bfc0 by main thread:
#​0 pthread_cond_destroy /ssd/src/llvm/build/../projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:1153 (a.out+0x00000044aa8c)
#​1 std::__1::condition_variable::condition_variable() build/../projects/libcxx/src/condition_variable.cpp:23:5 (libc++.so.1+0x00000008cb69)
#​2 std::__1::__assoc_sub_state::
__assoc_sub_state() build/bin/../include/c++/v1/future:515:24 (a.out+0x00000049999a)
#​3 std::__1::__async_assoc_state<bool, std::__1::__async_funcmain::$_0 >::~__async_assoc_state() build/bin/../include/c++/v1/future:940:7 (a.out+0x0000004992d5)
#​4 std::__1::__assoc_state::__on_zero_shared() build/bin/../include/c++/v1/future:635:5 (a.out+0x000000499a63)
#​5 std::__1::__async_assoc_state<bool, std::__1::__async_funcmain::$_0 >::__on_zero_shared() build/bin/../include/c++/v1/future:990:5 (a.out+0x00000049930d)
#​6 std::__1::__shared_count::__release_shared() build/../projects/libcxx/src/memory.cpp:63:9 (libc++.so.1+0x00000008eb20)
#​7 std::__1::__release_shared_count::operator()(std::__1::__shared_count*) build/bin/../include/c++/v1/future:1155:41 (a.out+0x00000049a155)
#​8 std::__1::unique_ptr<std::__1::__shared_count, std::__1::__release_shared_count>::reset(std::__1::__shared_count*) build/bin/../include/c++/v1/memory:2669:13 (a.out+0x0000004996bd)
#​9 std::__1::unique_ptr<std::__1::__shared_count, std::__1::__release_shared_count>::~unique_ptr() build/bin/../include/c++/v1/memory:2637 (a.out+0x0000004996bd)
#​10 std::__1::future::get() build/bin/../include/c++/v1/future:1173 (a.out+0x0000004996bd)
#​11 main /tmp/shared_ptr.cc:17:50 (a.out+0x000000498f62)

Previous read of size 8 at 0x7d200000bfc0 by thread T1:
#​0 pthread_cond_broadcast /ssd/src/llvm/build/../projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:1146 (a.out+0x000000447e40)
#​1 std::__1::condition_variable::notify_all() build/../projects/libcxx/src/condition_variable.cpp:35:5 (libc++.so.1+0x00000008cbc9)
#​2 void std::__1::__assoc_state::set_value(bool&&) build/bin/../include/c++/v1/future:655:5 (a.out+0x000000499c01)
#​3 std::__1::__async_assoc_state<bool, std::__1::__async_funcmain::$_0 >::__execute() build/bin/../include/c++/v1/future:975:9 (a.out+0x00000049934e)
#​4 _ZNSt3__18__invokeIMNS_19__async_assoc_stateIbNS_12__async_funcIZ4mainE3$0JEEEEEFvvEPS5_JEvEEDTcldsdeclsr3std3__1E7forwardIT0_Efp0_Efp_spclsr3std3__1E7forwardIT1_Efp1_EEEOT_OS9_DpOSA build/bin/../include/c++/v1/__functional_base:382:12 (a.out+0x0000004994a1)
#​5 void std::__1::__thread_execute<void (std::__1::__async_assoc_state<bool, std::__1::__async_funcmain::$_0 >::)(), std::__1::__async_assoc_state<bool, std::__1::__async_funcmain::$_0 >, 1ul>(std::__1::tuple<void (std::__1::__async_assoc_state<bool, std::__1::__async_funcmain::$_0 >::)(), std::__1::__async_assoc_state<bool, std::__1::__async_funcmain::$_0 >>&, std::__1::__tuple_indices<1ul>) build/bin/../include/c++/v1/thread:337 (a.out+0x0000004994a1)
#​6 void* std::__1::__thread_proxy<std::__1::tuple<void (std::__1::__async_assoc_state<bool, std::__1::__async_funcmain::$_0 >::)(), std::__1::__async_assoc_state<bool, std::__1::__async_funcmain::$_0 >> >(void*) build/bin/../include/c++/v1/thread:347 (a.out+0x0000004994a1)

Location is heap block of size 120 at 0x7d200000bf80 allocated by main thread:
#​0 operator new(unsigned long) /ssd/src/llvm/build/../projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:615 (a.out+0x00000043e08f)
#​1 std::__1::future std::__1::__make_async_assoc_state<bool, std::__1::__async_funcmain::$_0 >(std::__1::__async_funcmain::$_0&&) build/bin/../include/c++/v1/future:2312:13 (a.out+0x000000499035)
#​2 std::__1::future<std::__1::__invoke_of<std::__1::decaymain::$_0::type>::type> std::__1::asyncmain::$_0(std::__1::launch, main::$_0&&) build/bin/../include/c++/v1/future:2361:16 (a.out+0x000000498fce)
#​3 main /tmp/shared_ptr.cc:12:14 (a.out+0x000000498f0d)

Thread T1 (tid=19481, finished) created by main thread at:
#​0 pthread_create /ssd/src/llvm/build/../projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:951 (a.out+0x000000447123)
#​1 std::__1::thread::thread<void (std::__1::__async_assoc_state<bool, std::__1::__async_funcmain::$_0 >::)(), std::__1::__async_assoc_state<bool, std::__1::__async_funcmain::$_0 >, void>(void (std::__1::__async_assoc_state<bool, std::__1::__async_funcmain::$_0 >::&&)(), std::__1::__async_assoc_state<bool, std::__1::__async_funcmain::$_0 >&&) build/bin/../include/c++/v1/thread:359:16 (a.out+0x000000499282)
#​2 std::__1::future std::__1::__make_async_assoc_state<bool, std::__1::__async_funcmain::$_0 >(std::__1::__async_funcmain::$_0&&) build/bin/../include/c++/v1/future:2313:5 (a.out+0x0000004990a4)
#​3 std::__1::future<std::__1::__invoke_of<std::__1::decaymain::$_0::type>::type> std::__1::asyncmain::$_0(std::__1::launch, main::$_0&&) build/bin/../include/c++/v1/future:2361:16 (a.out+0x000000498fce)
#​4 main /tmp/shared_ptr.cc:12:14 (a.out+0x000000498f0d)

SUMMARY: ThreadSanitizer: data race build/../projects/libcxx/src/condition_variable.cpp:23:5 in std::__1::condition_variable::~condition_variable()

The program was built as:

clang++ test.cc -fsanitize=thread -std=c++14 -stdlib=libc++ -lc++abi -Wall -g -O1

with tsan-instrumented libcxx (see instructions here https://code.google.com/p/memory-sanitizer/wiki/LibcxxHowTo).

The problem is here:

__assoc_state<_Rp>::set_value(_Arg&& __arg)
{
unique_lock __lk(this->_mut);
#ifndef _LIBCPP_NO_EXCEPTIONS
if (this->__has_value())
throw future_error(make_error_code(future_errc::promise_already_satisfied));
#endif
::new(&_value) _Rp(_VSTD::forward<_Arg>(__arg));
this->_state |= base::__constructed | base::ready;
__lk.unlock();
_cv.notify_all();
}

The function first unlocks the mutex and after than signals the cond var. But once the mutex is unlocked the object can be destroyed as it is in ready state.

@llvmbot
Copy link
Collaborator

llvmbot commented Apr 20, 2015

Howard is no longer active in libc++. I'll take a look at this. I am already somewhat familiar with the issue.

@llvmbot
Copy link
Collaborator

llvmbot commented May 13, 2015

I'm not 100% convinced that this is a real defect. TSAN is reporting the race condition on the pthread_condvar_t in std::condition_variable. I think that TSAN's semantics for condition_variable are incorrect is some cases and this may be one of those.

Could you provide a pseudo-example where you think this will cause an error? I just want to fully understand this before I make any changes.

@KernelAddress
Copy link
Mannequin Author

KernelAddress mannequin commented May 13, 2015

What's wrong with my reasoning? Condition variables are semantical no-ops, so there is nothing to model.

@llvmbot
Copy link
Collaborator

llvmbot commented May 13, 2015

What's wrong with my reasoning?

I just don't see the race condition in the example you gave. I'm not saying it is not there but it isn't clear to me yet. I don't see how the __assoc_state can get destroyed before it exits the call to get_value(...) even if it releases the lock.

However I don't think the TSAN diagnostics are caused by the race condition you described.

Condition variables are semantical no-ops, so there is nothing to model.

Sorry, I see you've done a bunch of work on how TSAN handles condition variables recently. I wasn't aware that a lot had changed in the past month. Why is it that TSAN's interceptor for pthread_cond_wait(Cv, Mut) releases the mutex before marking Cv as read?
(see: https://github.com/llvm-mirror/compiler-rt/blob/master/lib/tsan/rtl/tsan_interceptors.cc#L1005)

@llvmbot
Copy link
Collaborator

llvmbot commented May 20, 2015

Ping Dimitry, Kostya, Eugenis:

Why is it that TSAN's interceptor for pthread_cond_wait(Cv, Mut) releases the mutex before marking Cv as read?

@KernelAddress
Copy link
Mannequin Author

KernelAddress mannequin commented May 20, 2015

Eric,

Sorry for the delay, I was OOO.

I don't think that I did any work on condvar in the last months. cond_wait interceptor always released and reaqcuired the mutex. Why? Because that's what pthread_cond_wait itself does -- it releases the mutex before waiting, otherwise the mutex-protected predicate won't be able to change.

I don't see how the __assoc_state can get destroyed before it exits the call to get_value(...).

You misread the report and my explanation. It is not get_value that runs concurrently with destruction, it is set_value that runs concurrently with destruction.

The report says that:

  1. set_value changes object state to "ready" and releases the mutex.
  2. then get_value observes the "ready" state and destroys the object.
  3. then set_value signals on the condvar which is already destroyed.

@llvmbot
Copy link
Collaborator

llvmbot commented May 21, 2015

You misread the report and my explanation. It is not get_value that runs
concurrently with destruction, it is set_value that runs concurrently with
destruction.

The report says that:

  1. set_value changes object state to "ready" and releases the mutex.
  2. then get_value observes the "ready" state and destroys the object.
  3. then set_value signals on the condvar which is already destroyed.

Thanks for pointing out my mistake. I incorrectly assumed that the producer that called set_value had a reference to the shared state. I'll try and push a fix for this today.

@llvmbot
Copy link
Collaborator

llvmbot commented May 21, 2015

FYI, I filed a bug against compiler-rt that documents my concerns with how TSAN handles condition variables.

https://llvm.org/bugs/show_bug.cgi?id=23616

@llvmbot
Copy link
Collaborator

llvmbot commented Jun 12, 2015

Fixed in r239577. Thanks for the analysis and sorry for the delay.

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 10, 2021
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugzilla Issues migrated from bugzilla libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.
Projects
None yet
Development

No branches or pull requests

1 participant