Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libc++ std::sort has O(n^2) worst case, standard mandates O(N log(N)) #21211

Closed
llvmbot opened this issue Sep 2, 2014 · 13 comments
Closed

libc++ std::sort has O(n^2) worst case, standard mandates O(N log(N)) #21211

llvmbot opened this issue Sep 2, 2014 · 13 comments
Assignees
Labels
bugzilla Issues migrated from bugzilla libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.

Comments

@llvmbot
Copy link
Collaborator

llvmbot commented Sep 2, 2014

Bugzilla Link 20837
Resolution FIXED
Resolved on Nov 23, 2021 04:42
Version unspecified
OS All
Attachments Evidence for quadratic worst case., A patch that fixes the quadratic behaviour by implementing introsort.
Reporter LLVM Bugzilla Contributor
CC @Dushistov,@hfinkel,@hiraditya,@danlark1,@mclow,@nico,@sebpop,@zamazan4ik

Extended Description

The C++ standard mandates that std::sort has complexity O(N log(N)), but libc++'s implementation takes O(N^2) in the worst case.

As proof I've attached a program that constructs a worst case input for several sizes. It also instruments the number of comparisons used to sort these worst cases and prints the results. The technique used is described in the 1999 paper "A Killer Adversary for Quicksort" by M. D. McIlroy.

Compiling and running:

$ clang++ -O2 -m64 -march=native -std=c++11 -stdlib=libc++ main.cpp -nodefaultlibs -lc++ -lc++abi -lm -lc -lgcc_s -lgcc && ./a.out
N: comparisons
100: 2479
200: 10229
400: 40729
800: 161729
1600: 448698
3200: 1413898
6400: 5264298

This is clearly quadratic. To illustrate this defect more, these are the comparison counts given by compiling using libstdc++:

$ clang++ -O2 -m64 -march=native -std=c++11 main.cpp && ./a.out
N: comparisons
100: 1742
200: 4260
400: 10035
800: 22784
1600: 51049
3200: 112386
6400: 244850

Inspecting the source of shows the cause of the issue: there is no introsort implemented, only quicksort (albeit with non-trivial optimizations, but nothing preventing the worst case). I've written a patch that augments the current implementation to make it work as an introspective sort, switching to heapsort if the recursion depth exceeds 2*floor(log(N)). This post can only have one attachment, so I'll post it as an attachment to a comment.

Having not contributed to libc++ before I'm not 100% familiar with all coding styles/(un)written rules. My changes are rather minimal though, so I've just followed what style could be found in context. And of course I knowingly submit my code under the libc++ licenses, the MIT license and the UIUC License.

@llvmbot
Copy link
Collaborator Author

llvmbot commented Sep 2, 2014

assigned to @mclow

@hfinkel
Copy link
Collaborator

hfinkel commented Sep 2, 2014

Hi Orson,

Please mail this patch to the cfe-commits list for review. Please make sure the patch is sent as an attachment to the e-mail (not inline text), and please make sure the repeat the background information you've included here in the e-mail itself.

For more information on the development policy and workflow, please see: http://llvm.org/docs/DeveloperPolicy.html

Thanks!

@llvmbot
Copy link
Collaborator Author

llvmbot commented Sep 2, 2014

Hi Hal,

I've sent the patch to the mailing list, where it is waiting for moderator approval since I'm not a member of the list.

Greetings,

Orson Peters

@mclow
Copy link
Contributor

mclow commented Mar 3, 2015

An update. EricWF is working on a set of benchmarks for std::sort and other algorithms/containers. He expects to finish this quite soon, and I'll be using that to make sure this doesn't introduce a performance regression - and will be adding this to the test suite.

So I haven't forgotten about it.

@hfinkel
Copy link
Collaborator

hfinkel commented Jul 1, 2016

An update. EricWF is working on a set of benchmarks for std::sort and other
algorithms/containers. He expects to finish this quite soon, and I'll be
using that to make sure this doesn't introduce a performance regression -
and will be adding this to the test suite.

So I haven't forgotten about it.

What's the status on this?

@sebpop
Copy link
Mannequin

sebpop mannequin commented Jul 14, 2016

I think std::sort() is still broken in trunk as of today. std::sort() should not call the compare function on the same element: it cannot be O(N log N) if it does.

Related bug: https://llvm.org/bugs/show_bug.cgi?id=28551

Reduced testcase:
$ cat a.cc
#include
#include
#include

int main(int argc, char** argv) {
int size = 100;
std::vector v(size);

// Elements in v are unique.
for (int i = 0; i < size; ++i)
v[i] = i;

std::sort(v.begin(), v.end(),
[&](int x, int y) { assert(x != y); return x < y; });

return 0;
}
$ clang++ -std=c++11 -stdlib=libc++ a.cc -o a.out
$ ./a.out
a.out: a.cc:14: auto main(int, char **)::(anonymous class)::operator()(int, int) const: Assertion `x != y' failed.
./go.sh: line 5: 19447 Aborted (core dumped) ./a.out

$ clang++ -std=c++11 -stdlib=libstdc++ a.cc -o a.out
$ ./a.out
Works fine with libstdc++.

@sebpop
Copy link
Mannequin

sebpop mannequin commented Jul 14, 2016

*** Bug llvm/llvm-bugzilla-archive#28551 has been marked as a duplicate of this bug. ***

@llvmbot
Copy link
Collaborator Author

llvmbot commented Jul 18, 2016

I would like to point out that our std::sort implementation is between 2x and 5x faster on already sorted inputs than libstdc++ is.

Some benchmark results:
Run on (4 X 4228.3 MHz CPU s)

// libc++ //
Benchmark Time CPU Iterations

BM_Sort/random_uint32/1024 170614 ns 174629 ns 4375
BM_Sort/sorted_ascending_uint32/1024 12902 ns 13276 ns 53030
BM_Sort/sorted_descending_uint32/1024 20919 ns 21712 ns 29661
BM_Sort/single_element_uint32/1024 13747 ns 14384 ns 60345
BM_Sort/random_strings/1024 1097028 ns 998514 ns 673
BM_Sort/sorted_ascending_strings/1024 233688 ns 208469 ns 3070
BM_Sort/sorted_descending_strings/1024 341642 ns 361934 ns 1923
BM_Sort/single_element_strings/1024 1081605 ns 1126143 ns 547

// libstdc++ //
Benchmark Time CPU Iterations

BM_Sort/random_uint32/1024 156709 ns 161200 ns 4268
BM_Sort/sorted_ascending_uint32/1024 49431 ns 53600 ns 10000
BM_Sort/sorted_descending_uint32/1024 42401 ns 41201 ns 16990
BM_Sort/single_element_uint32/1024 32115 ns 30892 ns 20588
BM_Sort/random_strings/1024 878349 ns 796434 ns 673
BM_Sort/sorted_ascending_strings/1024 576253 ns 532000 ns 1000
BM_Sort/sorted_descending_strings/1024 506059 ns 368390 ns 1683
BM_Sort/single_element_strings/1024 2453953 ns 2542751 ns 269

The benchmarks used can be found here: https://reviews.llvm.org/D22240

@llvmbot
Copy link
Collaborator Author

llvmbot commented Jul 18, 2016

Eric Fiselier, can you run that same benchmark with the 'block' branch of my pattern-defeating quicksort algorithm? https://github.com/orlp/pdqsort/tree/block

I've been waiting for a while now on this benchmark suite to start discussion of integrating the algorithm to libc++, I wasn't aware you finished the benchmark.

@sebpop
Copy link
Mannequin

sebpop mannequin commented Jul 19, 2016

Benchmark Time CPU Iterations

BM_Sort/random_uint32/1024 170614 ns 174629 ns 4375
BM_Sort/sorted_ascending_uint32/1024 12902 ns 13276 ns 53030
[...]

Could you please measure the number of instructions executed, data reads, and data writes, instead of the number of nano seconds of execution? I think those are a more stable metrics than the execution time that may vary with the state of the machine, daemons, noise, etc.

$ valgrind --tool=cachegrind ./a.out
$ cg_annotate cachegrind.out

@zamazan4ik
Copy link

I double Orson's request about using pdqsort as internal algorithm for std::sort. You can see here, how pdqsort is fast (https://github.com/orlp/pdqsort).

@danlark1
Copy link
Contributor

Fixed with https://reviews.llvm.org/D113413, should land in llvm 14

If you have stability sorting guarantees that you cannot fix and need migration, use -D_LIBCPP_DEBUG_RANDOMIZE_UNSPECIFIED_STABILITY or -D_LIBCPP_DEBUG=1 from https://reviews.llvm.org/D96946 to randomize the input range before sorting

@sebpop
Copy link
Mannequin

sebpop mannequin commented Nov 26, 2021

mentioned in issue llvm/llvm-bugzilla-archive#28551

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 9, 2021
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugzilla Issues migrated from bugzilla libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.
Projects
None yet
Development

No branches or pull requests

5 participants