Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

std::minmax_element is 3 times slower than hand written loop #39879

Closed
apolukhin opened this issue Jan 30, 2019 · 5 comments
Closed

std::minmax_element is 3 times slower than hand written loop #39879

apolukhin opened this issue Jan 30, 2019 · 5 comments
Assignees
Labels
bugzilla Issues migrated from bugzilla invalid Resolved as invalid, i.e. not a bug libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.

Comments

@apolukhin
Copy link
Member

Bugzilla Link 40533
Resolution INVALID
Resolved on Feb 07, 2019 10:10
Version unspecified
OS Linux
CC @apolukhin,@mclow

Extended Description

Moreover, std::minmax_element is slower than calling std::min_element+std::max_element if there's a lot of data and it does not fit into the CPU cache: http://quick-bench.com/tlgxCx9CUMZgOfYbwhFaEI0WNOg

@apolukhin
Copy link
Member Author

assigned to @mclow

@mclow
Copy link
Contributor

mclow commented Jan 30, 2019

When I visited that Quickbench link, I noticed that it was set to GCC and to libstdc++.

When I changed it to clang and libc++, the standard library implementation was faster than the hand-written version.

Did you mean to report this against libstdc++?

@apolukhin
Copy link
Member Author

Sorry, wrong link. Here's the right one http://quick-bench.com/RQrBIzB8sBYS932z90b7DTEyEks

There's a good comment from Marc Glisse with description of close behaviour in GCC https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89120#c1

However, it looks like there's still a room for improvement. If std::minmax_element is called with standard comparators and for fundamental type, then using a simple loop (with more comparisons) will improve performance (for CPUs that have branch predictors and non costly comparison instructions for builtins).

Although I'm not sure that this could/should be solved on the library level - too many CPU specific knowledge in libc++ does not seem right.

@mclow
Copy link
Contributor

mclow commented Jan 30, 2019

Sorry, wrong link. Here's the right one
http://quick-bench.com/RQrBIzB8sBYS932z90b7DTEyEks

That's better; but you are testing different things.
Your "hand written code" returns (in pointers) two values.
The "standard one" returns a pair of iterators.

Also, despite being faster, you're doing more comparisons than the one in the standard:
You: 2N comparisons.
Std: 3N/2 comparisons.

There's a good comment from Marc Glisse with description of close behaviour
in GCC https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89120#c1

And now I see I've made the same points that he has.

@mclow
Copy link
Contributor

mclow commented Feb 7, 2019

Closing as "Not a bug"

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 10, 2021
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugzilla Issues migrated from bugzilla invalid Resolved as invalid, i.e. not a bug libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.
Projects
None yet
Development

No branches or pull requests

2 participants