Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

C++20 coroutines marked always_inline are (partially) inlined at -O1, but not at -O0 #53413

Closed
ChuanqiXu9 opened this issue Jan 26, 2022 · 11 comments

Comments

@ChuanqiXu9
Copy link
Member

ChuanqiXu9 commented Jan 26, 2022

The decision about the issue has been made actually. The intention of the issue is to make it clear in case the user is confused. It would be easier for them to search.

The core problem is about pipeline ordering. In O0 pipeline, the AlwaysInline pass run before CoroSplit pass. And the coroutine haven't been split couldn't be inlined due to the current design of coroutine. So here is the problem.

Note that in On (n != 0) pipeline, the AlwaysInline pass would run after CoroSplit pass.

I guess there would be two question:
(1) Could we make coroutine to be able to be inlined before to be split?

Theoretical possible. But it is hard in practice. Also there are other issues in coroutine now. So I would say it wouldn't be possible in near future.

(2) Could we run CoroSplit pass before AlwaysInline pass?

Maybe yes. The reason why we didn't do this is that LLVM Coroutine is shared by C++ Coroutine, swift coroutine and MLIR for now. The list might be longer in the future (maybe won't). And we need to keep consistent with what other language do here.

Another reason to do the decision is that we don't think this is a key issue. Generally we expect to get performance improvement if we marked a function with always_inline. But if we care about performance, we shouldn't use O0. Also always_inline is not a standard feature, so it shouldn't be a bug if it doesn't work under O0.

If you have a strong reason to inline coroutine with always_inline at O0, you could state your reason and reopen the issue.

BTW, GCC made a more aggressive decision: disable the use of always_inline with coroutine no matter what the optimization level is: https://godbolt.org/z/df8WEj9jb

@llvmbot
Copy link
Collaborator

llvmbot commented Jan 26, 2022

@llvm/issue-subscribers-c-20

@Quuxplusone Quuxplusone changed the title The C++ coroutine marked with always_inline couldn't be inlined at O0 C++20 coroutines marked always_inline are (partially) inlined at -O1, but not at -O0 Jan 26, 2022
@dwblaikie
Copy link
Collaborator

Not sure that filing and closing bugs is maybe the best way to document intended behavior - presumably that sort of thing should go in some documentation somewhere in Clang? (maybe in the documentation for attributes)

@AaronBallman
Copy link
Collaborator

I agree that we should make sure our docs are sufficiently clear on this.

BTW, GCC made a more aggressive decision: disable the use of always_inline with coroutine no matter what the optimization level is: https://godbolt.org/z/df8WEj9jb

Is there a reason we don't want to be as aggressive?

@ChuanqiXu9
Copy link
Member Author

ChuanqiXu9 commented Jan 27, 2022

@dwblaikie @AaronBallman I put the conclusion at https://reviews.llvm.org/D115867#change-3MBnnJ2r7Yn8. I didn't put the analysis part in the document since I feel it is lengthy and wordy... Given that a coroutine is a special function, which wouldn't take a big part of all functions. It looks odd to me as a user that the document for 'always_inline' talks about coroutine mostly. How do you think about this?

BTW, GCC made a more aggressive decision: disable the use of always_inline with coroutine no matter what the optimization level is: https://godbolt.org/z/df8WEj9jb

Is there a reason we don't want to be as aggressive?

Yeah, there are some reasons:
(1) It looks not bad to inline nothing under O0. The reason I has stated in the issue.
(2) It is generally (not always) good to inline something. At least we could inline something under On.
(3) Most importantly, there is an optimization pass called CoroElide. This pass would try to eliminate the heap allocation of the coroutine, which is an important optimization for coroutines. And a necessary (not sufficient) condition for CoroElide is that the coroutine ramp function need to be inlined. So it is meaningful for a user to mark a coroutine as always_inline to increase the probability to make it get optimized. And GCC wouldn't implement such optimizations. So I guess it is the reason.

@AaronBallman
Copy link
Collaborator

It looks odd to me as a user that the document for 'always_inline' talks about coroutine mostly. How do you think about this?

I think we should explicitly document the circumstances under which always_inline does something other than always inlining the function, to the best extent possible.

(2) It is generally (not always) good to inline something. At least we could inline something under On.

The whole point to the always_inline attribute is to give the developer a way to say "no, really, I mean it, inline this thing unless you absolutely can't. I prefer the GCC behavior where the user is given a diagnostic so they know they've done something that isn't going to be honored by the compiler.

@ChuanqiXu9
Copy link
Member Author

It looks odd to me as a user that the document for 'always_inline' talks about coroutine mostly. How do you think about this?

I think we should explicitly document the circumstances under which always_inline does something other than always inlining the function, to the best extent possible.

I think the AttrDocs.doc document is for user of clang instead of developer of clang/llvm. I mean we shouldn't leak some implementation details in the document. For example, in the coroutine part of C++ standard, it talks nothing about implementation details like splitting, coroutine frame and so on. The users could avoid to learn them to use coroutine. I think this is the intention of the design of C++ coroutine. Similarly, I feel it is odd to talk about pass pipeline, indirect call in AttrDocs.doc. I don't think it is something the ordinary user should know. It is a burden to them.

(2) It is generally (not always) good to inline something. At least we could inline something under On.

The whole point to the always_inline attribute is to give the developer a way to say "no, really, I mean it, inline this thing unless you absolutely can't. I prefer the GCC behavior where the user is given a diagnostic so they know they've done something that isn't going to be honored by the compiler.

Yeah, we did give a diagnostic in https://reviews.llvm.org/D115867. Although it is a warning instead of an error. I prefer the warning since that there is an optimization depending on inlining (some part of) coroutine. Yeah, I agree that there is a really mismatch between the behavior and the intention of always_inline. That's the reason why I give a warning.

@AaronBallman
Copy link
Collaborator

I think the AttrDocs.doc document is for user of clang instead of developer of clang/llvm.

It's for both (we usually mention when an attribute has implementation-detail effects that users generally shouldn't care about. e.g., https://github.com/llvm/llvm-project/blob/main/clang/include/clang/Basic/AttrDocs.td#L430).

Similarly, I feel it is odd to talk about pass pipeline, indirect call in AttrDocs.doc. I don't think it is something the ordinary user should know. It is a burden to them.

It's possible that I'm confused. My understanding of the issue is that when the user writes always_inline on their own coroutine function (calls co_await), they get a diagnostic because the always_inline is ignored. If that's correct, then this is not an implementation detail at all, it affects user code that is reasonable for them to write. Have I misunderstood?

Yeah, we did give a diagnostic in https://reviews.llvm.org/D115867. Although it is a warning instead of an error. I prefer the warning since that there is an optimization depending on inlining (some part of) coroutine. Yeah, I agree that there is a really mismatch between the behavior and the intention of always_inline. That's the reason why I give a warning.

Ah, it's great that we give a diagnostic (I don't have a strong opinion on error vs warning), but that's all the more reason why this behavior should be documented.

@ChuanqiXu9
Copy link
Member Author

Similarly, I feel it is odd to talk about pass pipeline, indirect call in AttrDocs.doc. I don't think it is something the ordinary user should know. It is a burden to them.

It's possible that I'm confused. My understanding of the issue is that when the user writes always_inline on their own coroutine function (calls co_await), they get a diagnostic because the always_inline is ignored. If that's correct, then this is not an implementation detail at all, it affects user code that is reasonable for them to write. Have I misunderstood?

Yeah, the actual behavior would be that the always_inline attribute on coroutine function would be ignored under O0. With higher optimization, due to the coroutine function would be split into pieces, not all the function is guarantee to be inlined due to some call to the coroutine would be indirect call, which is not impossible to inline. From the perspective of user, assume the user is an expert who would read the assembler to get the behavior of compiler, he would found the coroutine marked with always_inline get inlined partly. I guess he must be surprising.

Although the always_inline has already said that it doesn't guarantee that the inline substitution actually occurs: https://github.com/llvm/llvm-project/blob/main/clang/include/clang/Basic/AttrDocs.td#L6203. I think it should be a surprise that people find their code are inlined partly. So here is the new wording I added to the always_inline docs.

Do you accept to add a link to the issue in the AttrDocs? If yes, I would like to add it. If no, I still want to say that the most user of always_inline wouldn't like to see such a wordy and lengthy description on things they don't care really.

@AaronBallman
Copy link
Collaborator

Do you accept to add a link to the issue in the AttrDocs? If yes, I would like to add it. If no, I still want to say that the most user of always_inline wouldn't like to see such a wordy and lengthy description on things they don't care really.

No, I don't think the documentation should link to issues (unless the issue is a major one that we're committed to solving and awareness is critical, or some other extenuating circumstance). I don't think we need a wordy or lengthy description here, but I also disagree with "don't care really" either. If we thought it was worth adding a warning over (which I agree was the right move), it's worth documenting the behavior to the user so they understand why they get the warning and what they can or can't do about it.

I'm thinking something as simple as:

Does not guarantee that inline substitution actually occurs. <ins>Note: applying this attribute to a coroutine at the `-O0` optimization level has no effect; other optimization levels may only partially inline and result in a diagnostic.</ins>

or something to that effect.

@ChuanqiXu9
Copy link
Member Author

Do you accept to add a link to the issue in the AttrDocs? If yes, I would like to add it. If no, I still want to say that the most user of always_inline wouldn't like to see such a wordy and lengthy description on things they don't care really.

No, I don't think the documentation should link to issues (unless the issue is a major one that we're committed to solving and awareness is critical, or some other extenuating circumstance). I don't think we need a wordy or lengthy description here, but I also disagree with "don't care really" either. If we thought it was worth adding a warning over (which I agree was the right move), it's worth documenting the behavior to the user so they understand why they get the warning and what they can or can't do about it.

I'm thinking something as simple as:

Does not guarantee that inline substitution actually occurs. <ins>Note: applying this attribute to a coroutine at the `-O0` optimization level has no effect; other optimization levels may only partially inline and result in a diagnostic.</ins>

or something to that effect.

I've addressed your opinion in https://reviews.llvm.org/D115867. Thanks for helping wording!

@llvmbot
Copy link
Collaborator

llvmbot commented Feb 7, 2022

@llvm/issue-subscribers-clang-codegen

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants