New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
r7 not saved/restored around inline asm blocks that kill it #40920
Comments
If you have a "register" variable which refers to a register that isn't allocatable, the compiler assumes you know what you're doing when you read/modify it. This is generally useful in various situations: you can read the value of the stack pointer, access x18 on AArch64 with -ffixed-x18, etc. There's maybe some argument that frame/base pointers in particular should be treated specially, since it isn't necessarily easy for the user to predict which registers will be reserved, especially on ARM. But then you get into weird edge cases. If the function contains dynamic stack allocation, and an inline asm is using the registers we would normally use for the frame/base pointers, what do we do? Temporarily reassign them to some other registers? I guess that's possible, if there are enough other registers available, but it would be really complicated to implement and test. |
We fairly recently added a warning when a reserved register was put into a clobber list of an asm block. This seems like it should do the same. |
Note that compiler-rt uses very similar inline assembly, so this might also be broken then: |
I think the only documentation for this feature is https://gcc.gnu.org/onlinedocs/gcc/Local-Register-Variables.html. The documentation doesn't state anything about whether the compiler should aim to avoid conflicts when the register is already used for other reasons, beyond just passing in a value in a specific register to an extended assembly block. The work-around for the miscompile reported above seems to be to not use the "specified register for a local variable"-feature, but achieve getting the needed value into r7 in another way. Indeed, compiler-rt seems to have very similar code. libgcc seems to implement the entire function in assembly. |
Does anyone have a suggestion what we can do to make __clear_cache in compiler-rt work? Since both clang and compiler-rt are part of the LLVM project, we really should try to come up with something that allows both parts to cooperate :) |
Three possible ideas on the compiler-rt side:
None of them are entirely ideal. |
In Chrome, we've settled on the workaround of marking |
Extended Description
With clang trunk:
$ cat /tmp/repro.c
void escape(void *);
void Bar(int p0, int p1, int p2, int p3, int p4, int p5) {
register unsigned magic asm("r7") = 0x000f0002;
asm volatile("svc 0\n" : : "r"(magic) : "memory");
int local = p5;
escape(&local);
}
$ clang --target=arm-linux-gnueabihf -march=armv7-a -mthumb /tmp/repro.c -S -o - -O2 -emit-llvm
<< snip >>
; Function Attrs: nounwind
define dso_local void @Bar(i32 %p0, i32 %p1, i32 %p2, i32 %p3, i32 %p4, i32 %p5) local_unnamed_addr #0 {
entry:
%local = alloca i32, align 4
tail call void asm sideeffect "svc 0\0A", "{r7},~{memory}"(i32 983042) #3, !srcloc !3
%0 = bitcast i32* %local to i8*
call void @llvm.lifetime.start.p0i8(i64 4, i8* nonnull %0) #3
store i32 %p5, i32* %local, align 4, !tbaa !4
call void @escape(i8* nonnull %0) #3
call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull %0) #3
ret void
}
$ clang --target=arm-linux-gnueabihf -march=armv7-a -mthumb /tmp/repro.c -S -o - -O2
<< snip >>
Bar:
.fnstart
@ %bb.0: @ %entry
.save {r6, r7, lr}
push {r6, r7, lr}
.setfp r7, sp, #4
add r7, sp, #4
.pad #4
sub sp, #4
movs r7, #2
movt r7, #15
@APP
svc #0
So, we set up r7 as our fp, it gets clobbered by
magic
for inline asm to do stuff with, and we proceed to use themagic
value as though it was still our pc in a later ldr.As one who's not super familiar with our rules around inline asm, I'd assume that
tail call void asm sideeffect "svc 0\0A", "{r7},~{memory}"(i32 983042) #​3, !srcloc !​3
should emit code to save/restore r7, instead of leaving it in a clobbered state.FWIW, we have code that looks very similar to the non-reduced repro in compiler-rt's clear_cache.c. That appears to work because r7 is a callee-saved register, so I presume we save/restore everything properly there (and don't use r7 as a fp).
The text was updated successfully, but these errors were encountered: