This is an archive of the discontinued LLVM Phabricator instance.

[libc++] Implement ranges::find_end, ranges::search{, _n}
ClosedPublic

Authored by philnik on Apr 20 2022, 2:52 AM.

Diff Detail

Event Timeline

philnik created this revision.Apr 20 2022, 2:52 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 20 2022, 2:52 AM
Herald added a subscriber: mgorny. · View Herald Transcript
philnik requested review of this revision.Apr 20 2022, 2:52 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 20 2022, 2:52 AM
Herald added a reviewer: Restricted Project. · View Herald Transcript
philnik updated this revision to Diff 423861.Apr 20 2022, 3:22 AM
  • Fix some pre-C++20 code
philnik updated this revision to Diff 424157.Apr 21 2022, 5:07 AM
  • run file generation
philnik updated this revision to Diff 424569.Apr 22 2022, 12:18 PM
  • Try to fix CI

(Note: this is on my radar but I'll probably won't be able to get to it for a few days)

Is there a reason to combine these two algorithms in a single patch? Even on their own, each one is sizable, and they don't seem to be dependent on each other or even too closely related.

(Note: this is on my radar but I'll probably won't be able to get to it for a few days)

Is there a reason to combine these two algorithms in a single patch? Even on their own, each one is sizable, and they don't seem to be dependent on each other or even too closely related.

Which two do you mean? search and find_end are very similar. I'm even using the random access search implementation in find_end. If you mean search_n and the other two, they are relatively closely related still. Most of the algorithm is almost identical, although they don't share any code in the current implementation. But if you want I can put search_n into it's own PR.

philnik updated this revision to Diff 431172.May 21 2022, 12:51 PM
  • Next try
var-const requested changes to this revision.Jun 10 2022, 3:53 PM

I apologize this review took so long. Generally looks good but there are a few missing parts to address.

libcxx/include/__algorithm/ranges_find_end.h
46

Nit: move?

47

Same optional nit re. moving here.

libcxx/include/__algorithm/ranges_search.h
79

Should this be in an else block of the if constexpr?

var-const added inline comments.Jun 10 2022, 3:53 PM
libcxx/include/__algorithm/ranges_search.h
66

Question: would it be possible/worthwhile to do a similar optimization for the iterator + sentinel overload? It seems it should be possible to use std::size for the case where the sentinel is the same type as the iterator, or perhaps use subrange (presuming the overhead is negligible and easily optimizable).

69

Hmm, I'd like to double-check this. IIUC, the relevant section is

{i, i + (last2 - first2)}, where i is the first iterator in the range [first1, last1 - (last2 - first2)) such that for every non-negative integer n less than last2 - first2 the condition

bool(invoke(pred, invoke(proj1, *(i + n)), invoke(proj2, *(first2 + n)))) is true.

However, if last2 - first2 is zero, there exists no non-negative integer n less than that. It seems like it would mean that no such iterator i can exist and thus the algorithm should go to the next clause and return {last1, last1}.

libcxx/include/__algorithm/search.h
35

Hide from ABI?

60–66

Question: why is __m1 incremented? It was not incremented in the original.

152–154

Does this need to be in an else branch for this code to be guaranteed to be eliminated? Perhaps you might need to add another if constexpr (sized_sentinel_for<_Sent1, _Iter1> || sized_sentinel_for<_Sent2, _Iter2>) { condition to wrap up the optimizations?

189

Nit: consider moving the iterators.

libcxx/include/__algorithm/search_n.h
76

Can you remove the trailing underscore from the name? (I know it's a preexistent thing)

80

Can you use using?

152

Same comment as in search.h regarding whether this should be an else branch of the if constexpr.

171

Nit: unnecessary blank line.

libcxx/include/__iterator/advance.h
204 ↗(On Diff #431172)

Should this work if _Sent is a different type from _Iter? If not, perhaps static assert?

libcxx/include/__iterator/next.h
90 ↗(On Diff #431172)

Same question re. static_assert here.

libcxx/include/__iterator/reverse_iterator.h
333 ↗(On Diff #431172)

I think I'd rather avoid adding these two functions. __make_begin_reverse_iterator seems to exist only for symmetry, but more importantly, the difference between language versions that is being abstracted away by __make_end_reverse_iterator is not specific to reverse iterators. How about something like

template <class _Iter, class _Sent>
decltype(auto) __end_iterator(const _Iter& __first, _Sent& __last) {
#if _LIBCPP_STD_VER > 17
    return ranges::next(__first, __last);
#else
    static_assert(is_same<_Iter, _Sent>::value, "Iterator and Sentinel have to be the same type pre-C++20");
    return __last;
#endif
}

and then use this function when creating the reverse iterators at the call site?

libcxx/include/algorithm
1438

Please update the synopsis.

libcxx/test/std/algorithms/alg.nonmodifying/alg.find.end/ranges.find_end.pass.cpp
10

Missing synopsis.

20

Please check SFINAE as well.

25

Consider a helper function to take care of both the iterator overload and the range overload (can be done in a follow-up if you prefer).

283

Should be find_end.

297

Nit: decltype(auto).

libcxx/test/std/algorithms/alg.nonmodifying/alg.search/ranges.search_n.pass.cpp
250

Should be search_n.

This revision now requires changes to proceed.Jun 10 2022, 3:53 PM
philnik updated this revision to Diff 439724.Jun 24 2022, 6:17 AM
philnik marked 22 inline comments as done.
  • Address comments
libcxx/include/__algorithm/ranges_search.h
66

This was already done in search.h. I've moved it now here, since it would have broken the classic algorithms, as I now know.

69

Yes, the non-negative integers less than zero is the empty set. Normally "all of an empty set" is trivially true.

79

No. The stuff inside the if constexpr just checks if there even can be a match or if it's trivial because the size of __range2 is zero, which means the beginning is always a match.

libcxx/include/__algorithm/search.h
60–66

Yes, that was a bug in the original. The second element of pair was never used anywhere. I don't know why __search ever returned a pair.

152–154

An else branch would be required for the guarantee, but it makes the code a lot more complicated and clang eliminated dead code even at -O0, so I don't se much of an upside.

189

I would avoid moving anything in the classic algorithm parts. It probably doesn't do much and might be a portability pit-fall.

libcxx/include/__algorithm/search_n.h
152

Answer is also the same.

libcxx/include/__iterator/advance.h
204 ↗(On Diff #431172)

Doesn't apply anymore because I'm now using _IterOps.

libcxx/include/__iterator/next.h
90 ↗(On Diff #431172)

Again, doesn't apply anymore because of _IterOps.

libcxx/include/__iterator/reverse_iterator.h
333 ↗(On Diff #431172)

Doesn't apply anymore, since we can't use reverse_iterator in classic algos.

libcxx/test/std/algorithms/alg.nonmodifying/alg.find.end/ranges.find_end.pass.cpp
25

I'll do it in a follow-up.

huixie90 requested changes to this revision.Jun 26 2022, 5:02 PM
huixie90 added a subscriber: huixie90.
huixie90 added inline comments.
libcxx/include/__algorithm/ranges_find_end.h
55–56

This looks a bug to me.

In general iterator_traits<_Iter1>::iterator_concept() doesn't exist. (this probably indicates lack of tests in this change)
There is a util in the spec ITER_CONCEPT https://eel.is/c++draft/iterator.concepts.general
but that is not even meant to use for checking the exact concept. It is only meant to be used as part of the real iterator concepts together with all syntactic constraints. Note the default is std::random_access_iterator_tag

In general, for c++20 code, use the real concepts instead of these tags.

https://godbolt.org/z/3b4hGo8h7

This revision now requires changes to proceed.Jun 26 2022, 5:02 PM
philnik updated this revision to Diff 442520.Jul 6 2022, 4:59 AM
philnik marked an inline comment as done.
  • Rebased
  • Address comments
huixie90 added inline comments.Jul 6 2022, 1:41 PM
libcxx/include/__algorithm/ranges_search.h
100

why <=? as the size is unsigned, is it just ==0?

libcxx/include/__iterator/advance.h
200 ↗(On Diff #442520)

does it need to check _LIBCPP_HAS_NO_INCOMPLETE_RANGES too?

201 ↗(On Diff #442520)

What if this line is called on C++20 mode with the classic algorithm/classic iterator?

C++20 sentinel_for requires the sentinel to be semiregular. But C++ 17's iterator are not required to be default constructible.

what happens if you call the classic algorithm with a classic iterator that is not default constructible?

var-const accepted this revision as: var-const.Jul 6 2022, 2:55 PM

LGTM with one remaining question.

libcxx/include/__algorithm/ranges_search.h
69

Hmm, I might be missing something (I wrote the original comment a while ago), but it seems that if last2 - first2 == 0, the current implementation would return {first1, first1}, whereas it should return {last1, last1} (assuming I'm correct that the "if no such iterator exists" cause applies). first1 isn't necessarily equal to last1 in this case, so returning {first1, first1} isn't equivalent to returning {last1, last1}, even if both are empty sets.

philnik updated this revision to Diff 443229.Jul 8 2022, 6:23 AM
philnik marked 3 inline comments as done.
  • Address comments
libcxx/include/__algorithm/ranges_search.h
69

I think you are wrong about the condition. IIUC the condition applies for every non-negative integer, because there are no non-negative integers less than zero. It's the same way that ∀x∈∅. ⊥ is true.

libcxx/include/__iterator/advance.h
201 ↗(On Diff #442520)

Oops, this shouldn't have been here anymore.

huixie90 accepted this revision.Jul 10 2022, 11:27 PM
huixie90 added inline comments.
libcxx/include/__algorithm/ranges_search.h
48

nit: I think it might be better to just use operator- since you've already checked it is sized_sentinel_for. If somehow we forgot to check the sized_sentinel_for concept, operator- would result in a compile time error if the type does not model it, but ranges::distance would result in a runtime linear complexity computation

libcxx/include/__algorithm/ranges_search_n.h
65–70

It would be nice if you can restructure the code so that this block of code can be in an else block of if constexpr, in a way that only one of the two __search_n_random_access_impl and __search_n_forward_impl would be instantiated. IIUC, with this setup, even if the code path calls __search_n_random_access_impl, the __search_n_forward_impl has still need to be instantiated because it is not guarded by if constexpr

philnik updated this revision to Diff 443883.Jul 12 2022, 2:31 AM
  • Try to fix CI
libcxx/include/__algorithm/ranges_search.h
48

Not necessarily. You can disable_sized_sentinel_for and are still allowed to have operator-. If we forget the check in that case our algorithm wouldn't just be slow, it would be incorrect.

libcxx/include/__algorithm/ranges_search_n.h
65–70

Doing that would mean that I have to either duplicate the __search_n_forward_impl call or the size check. I don't think the added complexity is justified for that potentially small increase in compilation speed.

h-vetinari added inline comments.
libcxx/docs/Status/RangesAlgorithms.csv
25–26

This line can be set to done as well...

var-const accepted this revision.Jul 12 2022, 9:46 AM
This revision is now accepted and ready to land.Jul 12 2022, 9:46 AM
philnik updated this revision to Diff 444047.Jul 12 2022, 12:29 PM
philnik marked an inline comment as done.
  • Rebased
  • Fix RangesAlgorithms
  • Fix CI
This revision was automatically updated to reflect the committed changes.
thakis added a subscriber: thakis.Jul 13 2022, 4:36 AM

This doesn't build over here: http://45.33.8.238/linux/81035/step_4.txt

Do you think that's a problem on my side, or is that a problem in the patch?

This doesn't build over here: http://45.33.8.238/linux/81035/step_4.txt

Do you think that's a problem on my side, or is that a problem in the patch?

That's a merge conflict in libc++. I'll revert for now.

philnik reopened this revision.Jul 13 2022, 4:42 AM
This revision is now accepted and ready to land.Jul 13 2022, 4:42 AM
philnik updated this revision to Diff 444297.Jul 13 2022, 9:07 AM
  • Try to fix CI
This revision was automatically updated to reflect the committed changes.
bgraur added a subscriber: bgraur.Jul 19 2022, 12:36 AM

Hi @philnik,

This commit breaks std::search_n().

Reproducer:

#include <algorithm>
#include <vector>

class A {
 public:
  A(int x, int y) : x_(x), y_(y) {}
  int x() const { return x_; }
  int y() const { return y_; }

 private:
  int x_;
  int y_;
};

bool search(const std::vector<A>& a, int value) {
  return std::search_n(a.begin(), a.end(), 1, value,
                       [](const A& l, int r) { return l.x() == r; }) == a.end();
}

Compile command:

clang  -std=gnu++17  -c /tmp/test.cc  -o /tmp/a.o

That should compile according to https://en.cppreference.com/w/cpp/algorithm/search_n but it no longer does after this commit.
It compiles correctly with the commit before.

@philnik considering this commit breaks std::search_n() could you please revert?

huixie90 added inline comments.Jul 19 2022, 1:33 AM
libcxx/include/__algorithm/iterator_operations.h
40

question: why do we need another alias for ranges::advance? we could have another overload for classic one with the same name

libcxx/include/__algorithm/search_n.h
166

re @bgraur's bug report. this line should be

static_assert(__is_callable<_BinaryPredicate, decltype(*__first),  const _Tp&>::value, "BinaryPredicate has to be callable");

Let's not revert since we have a fix in D130124. Thanks a bunch @huixie90 for jumping on this.