New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conditional branches are exchanged unoptimally #48680
Comments
Was it reordered in IR or the backend? |
In LLVM IR the two conditions are combined %cmp2 = icmp slt i32 %op.addr.0, %conv In instruction selection, it was changed to two conditional branches in bad order. |
As far as I know the backend just processes the operands of the AND in order. This is done in SelectionDAGBuilder::visitBR I think. It looks like the != check is first. I'm guessing this was reordered by the reassociate pass in IR. Does this patch fix it https://reviews.llvm.org/D94089 |
I still don't agree that we should be viewing this kind of problem as if
? |
That sounds right to me. This might require a pipeline adjustment as the first step: we run -simplifycfg before -lower-expect for some reason. So we've already logically combined the conditions before turning the expect intrinsic into metadata: %cmp = icmp slt i32 %0, 0 And if I'm seeing it correctly, -lower-expect just drops the intrinsic for that pattern. No profile metadata is created, so there's no hope that any later pass would be able to do anything special. |
|
Craig, the patch https://reviews.llvm.org/D94089 can fix this problem. Sanjay commented in that patch: "If the expression order matters, then we want the programmer to indicate that explicitly with something like "__builtin_expect" or profile metadata." It looks __builtin_expect doesn't actually work in this case. I think Sanjai's analysis in #48680 #c5 is correct. The LowerExpect pass silently dropped the call to @llvm.expect.i64 without any modification to the IR or metadata. So from then on we lost the expect information. |
Any recommended solution? |
We have to start by making SimplifyCFG smarter about profile metadata. I'm looking at that now. That first fix won't change anything for the example here though. We'll either need to modify the pass pipeline or limit SimplifyCFG even more. Then, we'll have to make sure the metadata is respected through codegen. |
|
Proposal to change pass ordering: |
This example appears to be solved with: |
Extended Description
The following code is extracted from zippy. Compile it with -O2
#include
#define PREDICT_FALSE(x) __builtin_expect((x), 0)
volatile int g;
std::pair<char*, int>
foo(char* ip, char* ip_limit, int op) {
do {
char* old_ip = ip;
int tag_type = g;
ip = ip + 1;
int offset = *old_ip;
if (offset < 8) {
ip = ip + 2;
}
int delta = op - offset;
if (PREDICT_FALSE(delta < 0)) {
if (tag_type != 0) {
ip = old_ip;
break;
}
}
op += 8;
} while (ip < ip_limit);
return {ip, op};
}
LLVM generates:
_Z3fooPcS_i: # @_Z3fooPcS_i
.cfi_startproc
%bb.0: # %entry
.LBB0_3: # %cleanup
# in Loop: Header=BB0_1 Depth=1
movb %r8b, %cl
leaq (%rax,%rcx,2), %rax
addq $1, %rax
addl $8, %edx
cmpq %rsi, %rax
jae .LBB0_4
.LBB0_1: # %do.body
# =>This Inner Loop Header: Depth=1
movl g(%rip), %r9d
movsbl (%rax), %edi
xorl %ecx, %ecx
cmpl $8, %edi
setl %r8b
testl %r9d, %r9d // if (tag_type != 0)
je .LBB0_3
%bb.2: # %do.body
.LBB0_4: # %do.end
retq
The condition (tag_type != 0) is unpredictable, but (delta < 0) is predicted as false, so the unpredictable condition (tag_type != 0) is rarely executed.
But in the generated code, llvm changed the order of the two conditions, so the unpredictable condition is always executed, causes more branch mis-prediction.
The text was updated successfully, but these errors were encountered: