Consider this function (very similar to strchr):

void foo(unsigned char C);

const char *FindChar(const char *CurPtr) {
  unsigned char C;
  do
    C = *CurPtr++;
  while (C != 'x' && C != '\0');
  foo(C);
  return CurPtr;
}

We currently compile the loop to:

LBB1_1: #bb
        movb (%esi), %al
        incl %esi
        cmpb $119, %al
        jg LBB1_4 #bb
LBB1_3: #bb
        testb %al, %al
        je LBB1_2 #bb7
        jmp LBB1_1 #bb
LBB1_4: #bb
        cmpb $120, %al
        jne LBB1_1 #bb

It seems that switch lowering could produce something like:

        cmpb $120, %al
        je out
        testb %al, %al
        jnz LBB1_1 #bb7

which would be much faster.

-Chris
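For reference, the requested control flow can be written out in C. This is only a sketch of the idea, not the compiler's output: find_char_original, find_char_two_tests, foo_stub and last_seen are hypothetical names introduced here, and foo_stub stands in for the external foo() from the report. The second function spells out the two-test sequence (an equality test against 'x', then a zero test) and should behave identically to the first.

```c
/* Hypothetical stand-in for the external foo() in the report:
   it just records the character it was given. */
static unsigned char last_seen;
static void foo_stub(unsigned char c) { last_seen = c; }

/* The loop as written in the report. */
const char *find_char_original(const char *cur) {
    unsigned char c;
    do
        c = *cur++;
    while (c != 'x' && c != '\0');
    foo_stub(c);
    return cur;
}

/* The same loop written as the two-test sequence suggested in the report. */
const char *find_char_two_tests(const char *cur) {
    unsigned char c;
    for (;;) {
        c = *cur++;
        if (c == 'x')    /* corresponds to: cmpb $120, %al ; je out */
            break;
        if (c == '\0')   /* corresponds to: testb %al, %al ; jnz loop */
            break;
    }
    foo_stub(c);
    return cur;
}
```

Both versions return a pointer one past the character that stopped the scan, so for the string "abxcd" each returns a pointer to "cd".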
GCC generates code very close to the desired code:

L3:
        movzbl (%esi), %eax
        addl $1, %esi
        cmpb $120, %al
        je L4
        testb %al, %al
        jne L3
Implemented.

Testcase here: Regression/CodeGen/Generic/SwitchLowering.ll
Patch here: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20061016/038932.html

We now compile the inner loop to:

LBB1_1: #bb
        movb (%esi), %al
        incl %esi
        testb %al, %al
        je LBB1_2 #bb7
LBB1_3: #bb
        cmpb $120, %al
        jne LBB1_1 #bb

-Chris