This is an archive of the discontinued LLVM Phabricator instance.

[libc++] Add a way to enable lightweight assertions independent of the debug mode
AbandonedPublic

Authored by ldionne on Mar 7 2022, 8:05 AM.

Details

Reviewers
None
Group Reviewers
Restricted Project
Summary

This is a (IMO superior) alternative to https://reviews.llvm.org/D119969.
This patch is more involved than D119969, but it has the benefit of being
cleaner in the long run, especially since we know our current Debug mode
doesn't work -- so perhaps we shouldn't try to reuse too many of its
mechanisms.

In particular, this is better than reusing the existing debug mode because
we now have the ability to set the assertion handler at compile-time,
which means that assertions can be enabled even in code that targets
platforms on which the debug mode support had not been shipped yet.

As-is, this patch provides backwards compatibility with any user that
might have been setting _LIBCPP_DEBUG=0 in order to get basic assertions
in their code. Hence, this patch should be entirely backwards compatible.

However, given that our debug mode has been broken since day 1, I think
we'll need a discussion around either fixing it or removing it, since the
status quo is good for nobody. In particular, an immediate next step to
simplify the library would be to remove __libcpp_debug_info and friends.
We can keep the symbols in the dylib to avoid linker errors, but we should
not support two distinct ways of setting an assertion handler.

Diff Detail

Event Timeline

ldionne created this revision.Mar 7 2022, 8:05 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 7 2022, 8:05 AM
ldionne requested review of this revision.Mar 7 2022, 8:05 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 7 2022, 8:05 AM
Herald added a reviewer: Restricted Project. · View Herald Transcript
ldionne added inline comments.Mar 7 2022, 8:13 AM
libcxx/CMakeLists.txt
85–88

A point of discussion I expect is whether LIBCXX_ENABLE_ASSERTIONS at CMake configure-time should control *both* whether the dylib contains assertions and what the default is for user code. I've done it that way for now because I didn't really expect the need for additional customizeability, but I could be convinced otherwise.

TLDR, the one thing you can't do with the current design is build libc++.dylib with assertions enabled, but have assertions disabled inside user code by default (or the other way around, i.e. no assertions in the dylib, but user code builds with assertions by default).

libcxx/docs/UsingLibcxx.rst
167

I am not very happy about requiring users to include an internal header to customize the assertion handler. IMO that's kind of hacky. I thought about using <version> instead, but that would require <version> to include <__assert>, and unfortunately that's a no-go because <__assert> needs to include <iosfwd> right now (due to the legacy debug mode support).

libcxx/src/debug.cpp
27

In retrospect, I should have left this code alone in D119769, but I didn't know I would end up creating a new assertion handler.

ldionne added a subscriber: Restricted Project.EditedMar 7 2022, 8:14 AM

Sorry for pinging the vendors so often lately, but this change is something they'll want to be aware of, too.

philnik added a subscriber: philnik.Mar 7 2022, 8:23 AM
philnik added inline comments.
libcxx/CMakeLists.txt
85–88

I would expect it to be off by default regardless of how the vendor configures it. What's the point of a default if it can be different on every platform anyways?

libcxx/docs/UsingLibcxx.rst
167

Why not put it in <cassert>? That would make the most sense IMO.

ldionne marked 2 inline comments as done.Mar 7 2022, 8:42 AM
ldionne added inline comments.
libcxx/CMakeLists.txt
85–88

I understand this is a bit surprising, however from a vendor's perspective, it may be relevant to ship a "hardened" library where assertions are enabled by default *for everyone*, by default. I know there are platforms like that out there, and in fact it is one of the use cases motivating this work.

libcxx/docs/UsingLibcxx.rst
167

I agree it makes some sense, however we still have the problem that we need to include <iosfwd> from this header.

In other words, I don't think we can easily change this part of the user-facing API unless we agree to drop <iosfwd> from <__assert> (and in that case we can even include it from e.g. <__config>).

I have no particular opinion on whether this is a good direction that will eventually lead somewhere useful, or just a random walk. No particular objection, either.

libcxx/CMakeLists.txt
681–685

I see you're no longer fiddling with NDEBUG. Why then are you keeping the fiddling with _DEBUG? What goes wrong if you remove these 5 lines too?

libcxx/docs/BuildingLibcxx.rst
219–222

Here and several times below, where you talk about "the library," I think it's important to find a phrase that clarifies we're talking about the .so or .dylib (or I guess the .a, which is not a "shared library") — we're not just talking about code that happens to be inside "libc++," like in the headers or in the std namespace.
Given that this phrase has to work for both .so and .a, though, I have no great suggestion. :( The best I've thought of so far is "the compiled library."

libcxx/docs/ReleaseNotes.rst
71–72

More flexibility than what? Personally I'd cut this clause and just start the sentence at "Vendors can select..."

libcxx/docs/UsingLibcxx.rst
132

I thought some of them do majorly change the complexity of algorithms! E.g. __do_compare_assert in the debug comparator will definitely change some algorithms on vectors-of-vectors-of-vectors from linearish (each step does 1^k comparisons) to exponentialish (each step does 2^k comparisons).
I might be wrong about this (e.g. if your new assert mode doesn't turn on debug comparators). But if this is an intended invariant, this needs to be called out vocally and explicitly, like maybe in a bulleted list of "things libc++ guarantees about our assertions," so that it's very clear to users and maintainers what'll count as a bug here.

158–161
// In HelloWorldConfig.h (this must always be included before any other libc++ header)

I think this has come up before, like in the past six months: We shouldn't encourage users to #define _LIBCPP_WHATEVER at the top of their file, because that will tend to get moved down, or not work when their file is a .h, or break when they move to modules, or whatever. We should encourage them to pass -D_LIBCPP_WHATEVER in their build system's C(XX)FLAGS instead.

(Yikes, you're even suggesting that they do put it in a .h, with a note that this header must be included first in any .cpp file that needs it. That's asking for trouble.)

167

Tangent FWIW: it seems like we could use a <__stringfwd> helper header.

libcxx/src/assert.cpp
12–13

<string> is no longer needed.

libcxx/src/debug.cpp
43–46

I think it's weird that __libcpp_set_debug_function is a dylib function, when all it actually does is write to the extern-linkage global variable __libcpp_debug_function which is already accessed from many places in the header code. Consider making this an inline function in the header, instead?
Or is there a benefit to having it in one place in the dylib where I guess someone could set a breakpoint on it or something?

ldionne marked 8 inline comments as done.Mar 7 2022, 11:54 AM

I have no particular opinion on whether this is a good direction that will eventually lead somewhere useful, or just a random walk. No particular objection, either.

It does lead somewhere useful today. Indeed, as you can see, it adds an assertion handler in the dylib under all configurations, and _LIBCPP_ENABLE_ASSERTIONS is such that it can be defined by the user regardless of how the library was built. This, alone, fixes most of the issues with the debug mode.

Another way to look at this patch is that it makes the bits of the debug mode that matter the most (i.e. non-complexity-modifying assertions) available to everyone, outside of the debug mode. This sets up the stage for either removing the remaining debug mode assertions (which are broken as everyone knows), or fixing them but keeping them separate from basic assertions. This is all cleanup / future work, but the basic assertions this allows us to enable is a real benefit that we get today.

libcxx/CMakeLists.txt
681–685

Because that is related to LIBCXX_DEBUG_BUILD, and that appears to still be used (in principle) on Windows (see for example https://github.com/llvm/llvm-project/blob/main/libcxx/utils/libcxx/test/config.py#L273).

Basically, I'm trying to clean some stuff up, but I'm also trying to be careful not to change anything unrelated to this PR specifically. I do definitely plan to look into it when I get to it.

libcxx/docs/BuildingLibcxx.rst
219–222

Agreed, will change to "the compiled library".

libcxx/docs/UsingLibcxx.rst
132

You are right, however D121129 fixes that and makes __do_compare_assert a debug-mode assertion instead of a regular assertion.

IMO the current description is sufficient: it says that we aim not to change the complexity of algorithm, and I think that should be the criterion that we start with.

158–161

I agree, this is error prone and not very ergonomic. It's also not simple to change this:

  • The assertion handler needs to have been declared before any libc++ code is included. Otherwise, the compiler will complain that we are using ::HelloWorldAssertionHandler in the code but it hasn't been declared yet.
  • The assertion handler needs to be aware of the type std::__libcpp_assertion_info, which means that the declaration of HelloWorldAssertionHandler must come after including <__assert>, and before including anything else. That's why we have this weird include sandwich.

I think I can get rid of the second requirement by passing primitive types to the assertion handler, i.e. passing char const* file, int line, char const* predicate, char const* message) directly to the function. However, that prevents me from being backwards compatible with users that might have been setting __libcpp_debug_function (because I use the conversion operator in __libcpp_assertion_info to achieve that).

I'm not sure how to solve the first problem, though. In practice, I would expect that people use -include <path-to-header-that-declares-the-assertion-handler>.

To make progress, would you be OK with me changing the documentation to:

// In HelloWorldAssert.h
#define _LIBCPP_ASSERTION_HANDLER(info) ::HelloWorldAssertionHandler(info)
#include <__assert>

// This could also be defined out-of-line, but we're defining it inline
// for simplicity in this example.
inline void HelloWorldAssertionHandler(std::__libcpp_assertion_info info) {
  std::printf("Assertion %s failed at %s:%d, more info: %s",
              info.__predicate(), info.file(), info.__line(), info.__message());
  std::abort();
}

// In HelloWorld.cpp
// Make sure to pass `-include path/to/HelloWorldAssert.h` to the compiler
#include <vector>

int main() {
  std::vector<int> v;
  int& x = v[0]; // Your assertion handler will be called here if _LIBCPP_ENABLE_ASSERTIONS=1
}

If we decide to break backwards compatibility with the debug mode, we can simplify this to (I'm happy to do it, but I would prefer doing it separately since it's a breaking change):

// In HelloWorldAssert.h
#define _LIBCPP_ASSERTION_HANDLER(info) ::HelloWorldAssertionHandler(info)
// This could also be defined out-of-line, but we're defining it inline
// for simplicity in this example.
inline void HelloWorldAssertionHandler(char const* file, int line, char const* expression, char const* message) {
  std::printf("Assertion %s failed at %s:%d, more info: %s", expression, file, line, message);
  std::abort();
}

// Same story from HelloWorld.cpp, still need `-include path/to/HelloWorldAssert.h`
167

I agree, and it's useful for e.g. D116950 anyways.

libcxx/src/debug.cpp
43–46

Well, we can't really remove it from the dylib because some folks may depend on that symbol being defined. So it has to stay in the dylib, at the very least.

And as I've hinted at, I believe we should remove those from our headers in the future: __libcpp_set_debug_function, __libcpp_debug_function and __libcpp_abort_debug_function. Indeed, those are effectively supplanted by this new suggested assertion handler.

I haven't looked at the code and design in depth, but have some remarks already.

libcxx/docs/UsingLibcxx.rst
166

Do users need to do this in every translation unit? (Looking at the macro I assume yes.)

167

I think when we need an internal header it would be a good idea to add a new header specifically for that purpose.
__libcpp_assertion_hander for example. That header can then include __assert, but that way we can change our implementation without breaking user code.

171

How about adding a [[noreturn]] attribute on the function?

libcxx/include/__assert
21

I think this looks better in code that is visible to users.

libcxx/include/__availability
87

What happens when a user enables this on Apple? Will it result in a compilation or a linkage error?

libcxx/docs/UsingLibcxx.rst
158–161

Ah, I see that I did forget something: you can't just -D_LIBCPP_ASSERTION_HANDLER=foo and then include libc++ headers, because you also need to provide a forward declaration of foo (otherwise it's an undeclared identifier). Gross.
(Backstory: My previous experiences with defining handlers for things via macros is all from https://github.com/troydhanson/uthash , where the handlers are used only within macro expansions themselves, and thus forward declarations are never needed; you just have to declare foo sometime before the first call to HASH_ADD-or-whatever.)

In absolute terms I'm not a huge fan of -include either... but yeah, I think putting -include HelloWorld.h into your build flags is a big improvement over putting forward declarations and stuff above the libc++ headers in every translation unit. :)

167

+1 Mordante's idea of <__libcpp_assertion_handler>, except that I worry that "one header per entry point" is too fine a level of granularity. Like maybe it should be <__libcpp_assert> and pull in all the stuff relevant to assertion handlers (__libcpp_assertion_handler, __libcpp_assertion_handler_type, __libcpp_set_assertion_handler...). And then that's just <__assert> (i.e. what Louis has done) but with a libcpp on the front. Which might still be a nice touch.

If we were to rename __assert to __libcpp_assert, would that hurt any users? (No, because __assert hasn't shipped yet?)
If we were to rename it, what would be connoted by the symmetry between __libcpp_assert and __libcpp_version? Would that be a good thing or a bad thing? (Is __libcpp_version even shipped? I have never bothered to understand its job.)

Is this a job for a subdirectory, like <libcpp/assert.h> or <ext/libcpp_assertions.h> or something?

ldionne marked 10 inline comments as done.Mar 8 2022, 7:08 AM
ldionne added inline comments.
libcxx/docs/UsingLibcxx.rst
166

Yes, they do, but as we've mentioned, this should be done from the compiler command-line.

167

I'm not sure I understand the request here. We already have a header that includes "everything related to assertions", and the name of that header is <__assert>. We can discuss changing the name of that header, no problem, but I'm not sure I understand your request if it's not just to rename the header. What would < __libcpp_assertion_hander> contain?

libcxx/include/__availability
87

If a user tries to enable assertions on Apple and is deploying to a platform that doesn't support the default handler (which is all platforms for the time being), they will get a compilation error saying they are using the default handler, which hasn't been introduced yet.

If they define their own custom handler, they won't get any error.

Eventually, once we actually ship a dylib with support for the assertion handler, this will move from __attribute__((unavailable)) on line 160 below to __attribute__((availability(macosx,strict,introduced=SOME VERSION))).

ldionne updated this revision to Diff 413807.Mar 8 2022, 7:49 AM
ldionne marked 2 inline comments as done.

Address review comments except the ask for <__libcpp_assertion_hander>, which I don't understand yet.

Quuxplusone added inline comments.Mar 8 2022, 8:20 AM
libcxx/docs/UsingLibcxx.rst
167

FWIW, I interpreted @Mordante's comment as a reply to @philnik's suggestion; i.e. my interpretation of this thread has been [all words are my own and quite possibly not what people were actually intending to convey]:

L: "I'm not thrilled about making people include <__assert>, because it isn't very discoverable, and it might encourage them to include other detail headers like <__string>, which we do not want them to."
P: "How about <cassert> then? it's more discoverable, and obviously user-facing."
M: "How about <__libcpp_assertion_handler>? it's named exactly the same as the facility they must already know about because they're trying to use it, and maybe the libcpp prefix indicates that it's more implementation-specific and also more special than <__string> and so on."
Q: "Yeah, not <cassert>, and I like the libcpp prefix idea. Or at least some sort of prefix that separates this thing from <__string> and <__hash_table>. How about put it in a subdirectory?"

@ldionne wrote:

I'm not sure I understand your request if it's not just to rename the header.

My interpretation was that it did boil down to just a request to rename the header, but I could be wrong.

I've given this review some more thoughts and I'm not sure whether this is the best approach. I've send you a message to look at a slightly different approach.

libcxx/benchmarks/algorithms.bench.cpp
154

Is this intended or for testing. When intended please commit it separately.

libcxx/docs/UsingLibcxx.rst
167

Correct. Except I wasn't aware __debug was recent. Hence my proposal to have a separate header to include __debug. When __debug is still new renaming seems better than adding an additional header.

ldionne marked an inline comment as done.Mar 8 2022, 12:22 PM
ldionne added inline comments.
libcxx/benchmarks/algorithms.bench.cpp
154

Agreed -- see D121244.

libcxx/docs/UsingLibcxx.rst
167

I'm OK with having a header named __libcpp_assertion_handler or ideally __libcpp_assert for users to include, instead of <__assert>. However, I do not want to merge __debug and __assert (however it ends up being named), specifically because I want to keep a clear distinction between the notion of assertions and the notion of a debugging mode. This lack of distinction is what has been muddying the waters for so long, and specifically what I'm trying to fix.

Most of our code wants the ability to add assertions for preconditions, UB and stuff like that. Only a tiny fraction of our code cares about the full debug mode -- basically only containers and a couple of algorithms.

In general LGTM, I want to wait with approval after the CI is green. (I also like to see whether you think the weak-def version is a better solution.)

libcxx/docs/UsingLibcxx.rst
167

Sorry my bad I meant __assert instead of __debug.

170

The current version is ill-formed.

ldionne abandoned this revision.Mar 16 2022, 11:56 AM
ldionne marked an inline comment as done.

Abandoning in favour of the third design I've tried: D121478