LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filled before November 26, 2021. Use github to submit LLVM bugs

Bug 41575 - r7 not saved/restored around inline asm blocks that kill it
Summary: r7 not saved/restored around inline asm blocks that kill it
Status: CONFIRMED
Alias: None
Product: libraries
Classification: Unclassified
Component: Backend: ARM (show other bugs)
Version: trunk
Hardware: PC Linux
: P enhancement
Assignee: Unassigned LLVM Bugs
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-04-23 18:31 PDT by George Burgess
Modified: 2019-05-23 10:48 PDT (History)
10 users (show)

See Also:
Fixed By Commit(s):


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description George Burgess 2019-04-23 18:31:55 PDT
With clang trunk:

$ cat /tmp/repro.c
void escape(void *);

void Bar(int p0, int p1, int p2, int p3, int p4, int p5) {
  register unsigned magic asm("r7") = 0x000f0002;
  asm volatile("svc 0\n" : : "r"(magic) : "memory");

  int local = p5;
  escape(&local);
}

$ clang --target=arm-linux-gnueabihf -march=armv7-a -mthumb /tmp/repro.c -S -o - -O2 -emit-llvm 

<< snip >>

; Function Attrs: nounwind
define dso_local void @Bar(i32 %p0, i32 %p1, i32 %p2, i32 %p3, i32 %p4, i32 %p5) local_unnamed_addr #0 {
entry:
  %local = alloca i32, align 4
  tail call void asm sideeffect "svc 0\0A", "{r7},~{memory}"(i32 983042) #3, !srcloc !3
  %0 = bitcast i32* %local to i8*
  call void @llvm.lifetime.start.p0i8(i64 4, i8* nonnull %0) #3
  store i32 %p5, i32* %local, align 4, !tbaa !4
  call void @escape(i8* nonnull %0) #3
  call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull %0) #3
  ret void
}


$ clang --target=arm-linux-gnueabihf -march=armv7-a -mthumb /tmp/repro.c -S -o - -O2

<< snip >>

Bar:
        .fnstart
@ %bb.0:                                @ %entry
        .save   {r6, r7, lr}
        push    {r6, r7, lr}
        .setfp  r7, sp, #4
        add     r7, sp, #4
        .pad    #4
        sub     sp, #4
        movs    r7, #2
        movt    r7, #15
        @APP
        svc     #0

        @NO_APP
        ldr     r0, [r7, #12]
        str     r0, [sp]
        mov     r0, sp
        bl      escape
        add     sp, #4
        pop     {r6, r7, pc}


So, we set up r7 as our fp, it gets clobbered by `magic` for inline asm to do stuff with, and we proceed to use the `magic` value as though it was still our pc in a later ldr.

As one who's not super familiar with our rules around inline asm, I'd assume that `tail call void asm sideeffect "svc 0\0A", "{r7},~{memory}"(i32 983042) #3, !srcloc !3` should emit code to save/restore r7, instead of leaving it in a clobbered state.

FWIW, we have code that looks very similar to the non-reduced repro in compiler-rt's clear_cache.c. That appears to work because r7 is a callee-saved register, so I presume we save/restore everything properly there (and don't use r7 as a fp).
Comment 1 Eli Friedman 2019-04-24 00:03:13 PDT
If you have a "register" variable which refers to a register that isn't allocatable, the compiler assumes you know what you're doing when you read/modify it.  This is generally useful in various situations: you can read the value of the stack pointer, access x18 on AArch64 with -ffixed-x18, etc.

There's maybe some argument that frame/base pointers in particular should be treated specially, since it isn't necessarily easy for the user to predict which registers will be reserved, especially on ARM.  But then you get into weird edge cases. If the function contains dynamic stack allocation, and an inline asm is using the registers we would normally use for the frame/base pointers, what do we do?  Temporarily reassign them to some other registers?  I guess that's possible, if there are enough other registers available, but it would be really complicated to implement and test.
Comment 2 Tim Northover 2019-04-24 02:34:13 PDT
We fairly recently added a warning when a reserved register was put into a clobber list of an asm block. This seems like it should do the same.
Comment 3 Clemens Hammacher 2019-04-25 01:55:12 PDT
Note that compiler-rt uses very similar inline assembly, so this might also be broken then:
https://github.com/llvm-mirror/compiler-rt/blob/master/lib/builtins/clear_cache.c#L122
Comment 4 Kristof Beyls 2019-04-25 05:31:32 PDT
(In reply to Eli Friedman from comment #1)
> If you have a "register" variable which refers to a register that isn't
> allocatable, the compiler assumes you know what you're doing when you
> read/modify it.  This is generally useful in various situations: you can
> read the value of the stack pointer, access x18 on AArch64 with -ffixed-x18,
> etc.
> 
> There's maybe some argument that frame/base pointers in particular should be
> treated specially, since it isn't necessarily easy for the user to predict
> which registers will be reserved, especially on ARM.  But then you get into
> weird edge cases. If the function contains dynamic stack allocation, and an
> inline asm is using the registers we would normally use for the frame/base
> pointers, what do we do?  Temporarily reassign them to some other registers?
> I guess that's possible, if there are enough other registers available, but
> it would be really complicated to implement and test.

I think the only documentation for this feature is https://gcc.gnu.org/onlinedocs/gcc/Local-Register-Variables.html.
There, it states that "The only supported use for this feature is to specify registers for input and output operands when calling Extended asm".
It seems to me that that documentation might imply that uses such as trying to read the value of the stack pointer or accessing x18 on AArch64 with -ffixed-x18 are not supported.

The documentation doesn't state anything about whether the compiler should aim to avoid conflicts when the register is already used for other reasons, beyond just passing in a value in a specific register to an extended assembly block.
If really the only supported use case for this feature is to pass in/out values from an extended assembly block, maybe it wouldn't be unreasonable to expect the compiler to save/restore other live values in that register around the inline asm block. That being said, I haven't look into how hard it would be to implement that.
Implementing this behaviour may indeed break existing code that relies on behaviour that according to the documentation isn't supported.

The work-around for the miscompile reported above seems to be to not use the "specified register for a local variable"-feature, but achieve getting the needed value into r7 in another way. Indeed, compiler-rt seems to have very similar code. libgcc seems to implement the entire function in assembly.
Comment 5 Nico Weber 2019-04-26 04:26:00 PDT
Does anyone have a suggestion what we can do to make __clear_cache in compiler-rt work? Since both clang and compiler-rt are part of the LLVM project, we really should try to come up with something that allows both parts to cooperate :)
Comment 6 Tim Northover 2019-04-26 04:35:00 PDT
Three possible ideas on the compiler-rt side:

1. Switch to assembly.
2. Use -marm, where r11 is the frame pointer on Linux.
3. -fomit-frame-pointer (though note iOS actively disallows that and will warn if the flag is applied for it).

None of them are entirely ideal.
Comment 7 George Burgess 2019-05-23 10:48:12 PDT
In Chrome, we've settled on the workaround of marking `__clear_cache` with `noinline` for the moment, since the calling convention appears to produce the appropriate r7 save. To Nico's point, we should keep pieces of LLVM working together harmoniously, so is anyone against me doing that while we decide on/develop a more ideal fix?