LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filled before November 26, 2021. Use github to submit LLVM bugs

Bug 40533 - std::minmax_element is 3 times slower than hand written loop
Summary: std::minmax_element is 3 times slower than hand written loop
Status: RESOLVED INVALID
Alias: None
Product: libc++
Classification: Unclassified
Component: All Bugs (show other bugs)
Version: unspecified
Hardware: PC Linux
: P enhancement
Assignee: Marshall Clow (home)
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-01-30 09:23 PST by Antony Polukhin
Modified: 2019-02-07 10:10 PST (History)
3 users (show)

See Also:
Fixed By Commit(s):


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Antony Polukhin 2019-01-30 09:23:05 PST
Moreover, std::minmax_element is slower than calling std::min_element+std::max_element if there's a lot of data and it does not fit into the CPU cache: http://quick-bench.com/tlgxCx9CUMZgOfYbwhFaEI0WNOg
Comment 1 Marshall Clow (home) 2019-01-30 11:15:00 PST
When I visited that Quickbench link, I noticed that it was set to GCC and to libstdc++.

When I changed it to clang and libc++, the standard library implementation was faster than the hand-written version.

Did you mean to report this against libstdc++?
Comment 2 Antony Polukhin 2019-01-30 13:08:31 PST
Sorry, wrong link. Here's the right one http://quick-bench.com/RQrBIzB8sBYS932z90b7DTEyEks

There's a good comment from Marc Glisse with description of close behaviour in GCC https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89120#c1

However, it looks like there's still a room for improvement. If std::minmax_element is called with standard comparators and for fundamental type, then using a simple loop (with more comparisons) will improve performance (for CPUs that have branch predictors and non costly comparison instructions for builtins).

Although I'm not sure that this could/should be solved on the library level - too many CPU specific knowledge in libc++ does not seem right.
Comment 3 Marshall Clow (home) 2019-01-30 13:58:30 PST
(In reply to Antony Polukhin from comment #2)
> Sorry, wrong link. Here's the right one
> http://quick-bench.com/RQrBIzB8sBYS932z90b7DTEyEks

That's better; but you are testing different things.
Your "hand written code" returns (in pointers) two values.
The "standard one" returns a pair of iterators.

Also, despite being faster, you're doing more comparisons than the one in the standard:
You: 2N comparisons.
Std: 3N/2 comparisons. 

> There's a good comment from Marc Glisse with description of close behaviour
> in GCC https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89120#c1

And now I see I've made the same points that he has.
Comment 4 Marshall Clow (home) 2019-02-07 10:10:32 PST
Closing as "Not a bug"