Testcase: #include <stdlib.h> #include <assert.h> static const char *s = " 123 10"; int main() { char *p; strtoul(s, &p, 8); asm( "nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;" "nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;" "nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;" "nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;" "nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;" "nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;" "nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;" "nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;" "nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;" "nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;" "nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;" "nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;" "nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;" "nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;" "nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;" "nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;nop;" ::: "r1", "r2", "r3", "r4", "r5", "r6", "r7", "r8", "r9", "r10", "r11", "r12", "r13"); if (p) assert(p == s+12); } (The giant asm statement is there to force the generation of two constant pools; maybe there's some shorter way to do that?) Compile with clang -O2 for an ARM target, and the assertion fails. Compile with -arm-promote-constant=false, and the assertion passes. Not sure what the right answer is here; maybe we shouldn't be marking the string "unnamed_addr"?
Note that several libc++ tests may fail due to this issue. The issue may or may not present depending on the register allocation pattern of the function.
> Not sure what the right answer is here; maybe we shouldn't be marking the string "unnamed_addr"? IMO we should proceed this way. Per LangRef, the semantics of "unnamed_addr" is that the address does not matter. Here is certainly does. The main problem is that it would force almost all string literals to lost "unnamed_addr" specifier, because we'd need to distinguish the case when the address might be captured and when - might not.
Suppose we split unnamed_addr into two bits, "mergeable" and "duplicatable", where "mergeable" means a constant can be merged with other constants, and "duplicatable" means every reference to the global can produce a different pointer value. The ARM constant pool transform requires "duplicatable"; constant merging, function merging, and putting a constant into a mergeable section require "mergeable". (I don't know of any other relevant transformations.) The "duplicatable" attribute seems too dangerous to be usable: operations which don't obviously involve pointer comparisons can introduce them, e.g. turning a library call into a loop whose exit condition is a pointer comparison introduces UB.
(In reply to Eli Friedman from comment #3) > The "duplicatable" attribute seems too dangerous to be usable: operations > which don't obviously involve pointer comparisons can introduce them, e.g. > turning a library call into a loop whose exit condition is a pointer > comparison introduces UB. Right. Therefore it seems that the -arm-promote-constant should be always false and we're safe to delete this transform as being non-safe. This certainly should be integrated into 4.0.1 - currently we're producing very subtle broken code for large functions on ARM.
It might be possible to make the constant islands pass avoid duplicating constants in this case.... but I guess that's kind of tricky: in general we end up putting the offset into the constant pool.
(In reply to Eli Friedman from comment #5) > It might be possible to make the constant islands pass avoid duplicating > constants in this case.... but I guess that's kind of tricky: in general we > end up putting the offset into the constant pool. Alternatively, we could restrict optimization only to string literals that have single use. In this case the address is taken, but only once and we're fine.
Hi Anton, Restricting to arrays that only have one use seems like a good bailout to me. Apologies, I didn't even consider this case when writing the transform :/ It would be nice if we could somehow demote back to a Global if the constant pool entry needed to be cloned, but that would break all manner of things in the backend :(
See https://reviews.llvm.org/D32567.
*** Bug 33074 has been marked as a duplicate of this bug. ***
As D32567 has not been accepted and 4.0.1 is apporaching, and taking into account the severity level of the problem, another patch is suggested to temporarily disable globals promotion: https://reviews.llvm.org/D33446
Thanks Oleg. Once approved, applying D33446 to 4.0.1 to fix the breakage should be fine.
I had a go at fixing this by leaving the decision to put the string in the constant pool as it was, and instead in ARMConstantIslands if the constant pool goes out of range instead of cloning the constant to a new constant pool insert the address of the constant into a new constant pool and use that i.e. turn adr r0, .LCPI0_0 <... too far ...> adr r0, .LCPI0_0 <...> .LCPI0_0: . asciz "stuff" into ldr r0, .LCPI0_1 // gets the same address as adr r0, .LCPI0_0 would <...> .LCPI0_1: .long .LCIP0_0 <...> adr r0, .LCPI0_0 <...> .LCPI0_0: . asciz "stuff" That seems to work, but the tricky part is putting the address of one constant pool label into another constant pool. The two possibilities I see are using ARMConstantPoolSymbol and somehow getting the name of the label (currently only AsmPrinter::GetCPISymbol knows what the symbol name is) or adding a new ARMConstantPoolValue subclass for the address of a constant pool label.
Thank you for disabling, I am finally able to use llvm 4.0.1 for ARM now, because of bug 32394. I hope someone will look into that one before re-enabling.
Kristof, re-enabling this optimization is something that is potentially interesting for Android as well. Any thoughts about this? This came up in the context of an NDK bug that was recently filed (https://github.com/android-ndk/ndk/issues/642).
(In reply to Stephen Hines from comment #14) > Kristof, re-enabling this optimization is something that is potentially > interesting for Android as well. Any thoughts about this? This came up in > the context of an NDK bug that was recently filed > (https://github.com/android-ndk/ndk/issues/642). My understanding is that this (disabled) optimization mainly results in a potential code size optimization for really small embedded systems. My understanding is that for Android applications, it probably won't result in major improvements. You could try of course using the command line option to re-enable this optimization, and see if it does give an improvement. Given that this optimization is expected to mostly affect only very small embedded programs, and that this is an optimization in the ARMConstantIslandPass - known for its complexity and error-proneness - I don't think doing whatever is needed to re-enable this optimization is high priority.