-expandmemcmp generates loads with incorrect alignment #43225

aqjune · 2019-11-02T05:18:54Z


Bugzilla Link	43880
Resolution	FIXED
Resolved on	Mar 16, 2020 06:50
Version	trunk
OS	All
CC	@legrosbuffle,@RKSimon,@nunoplopes,@rotateright

Extended Description

--
$ cat memcmp.ll
target datalayout = "e-i8:8:8-i16:16:16"
target triple = "x86_64-unknown-unknown"
declare i32 @memcmp(i8* nocapture, i8* nocapture, i64)

define i32 @cmp2(i8* nocapture readonly %x, i8* nocapture readonly %y) {
  %call = tail call i32 @memcmp(i8* %x, i8* %y, i64 2)
  ret i32 %call
}

$ ./llvm/bin/opt -expandmemcmp -S -o - memcmp.ll
; ModuleID = 'memcmp.ll'
source_filename = "memcmp.ll"
target datalayout = "e-i8:8:8-i16:16:16"
target triple = "x86_64-unknown-unknown"

declare i32 @memcmp(i8* nocapture, i8* nocapture, i64)

define i32 @cmp2(i8* nocapture readonly %x, i8* nocapture readonly %y) {
  %1 = bitcast i8* %x to i16*
  %2 = bitcast i8* %y to i16*
  %3 = load i16, i16* %1
  %4 = load i16, i16* %2
  %5 = call i16 @llvm.bswap.i16(i16 %3)
  %6 = call i16 @llvm.bswap.i16(i16 %4)
  %7 = zext i16 %5 to i32
  %8 = zext i16 %6 to i32
  %9 = sub i32 %7, %8
  ret i32 %9
}

; Function Attrs: nounwind readnone speculatable willreturn
declare i16 @llvm.bswap.i16(i16) #0

attributes #0 = { nounwind readnone speculatable willreturn }

This is incorrect because the loads in the output have align 2 (a load with no explicit alignment is assumed to have the ABI alignment, which this data layout sets to 16 bits for i16). If %x or %y is not 2-byte aligned, the optimized code has undefined behavior, whereas the source does not.

legrosbuffle · 2020-03-09T10:35:05Z

Unaligned loads are both valid and fast on X86, so we do expand memcmp.
ARMv6, for example, does not allow unaligned loads, so we do not expand there.

aqjune · 2020-03-09T13:12:28Z

> Unaligned loads are both valid and fast on X86, so we do expand memcmp.
> ARMv6, for example, does not allow unaligned loads, so we do not expand there.

Hello Clement,
Thank you for the info - so unaligned loads are allowed in this case.
Would it make sense to give the two loads an explicit align 1?

  %3 = load i16, i16* %1, align 1
  %4 = load i16, i16* %2, align 1

If the alignment is not attached, the pointers are assumed to be 2-byte aligned in this case, which may not be correct.
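
For readers more familiar with C, the distinction can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code the pass emits: `load16_unaligned`, `to_big_endian16`, and `cmp2_expanded` are hypothetical names. The `memcpy`-based load plays the role of `load i16, ..., align 1`, whereas dereferencing a `uint16_t *` obtained by casting a possibly odd address would correspond to the load with no alignment attached.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read 2 bytes without assuming any alignment.
   memcpy is the portable C way to express an align-1 load. */
static uint16_t load16_unaligned(const void *p) {
    uint16_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical helper: put the loaded value into big-endian order so that
   integer comparison agrees with memcmp's byte-by-byte order. On a
   little-endian host this is the byte swap seen in the expanded IR. */
static uint16_t to_big_endian16(uint16_t v) {
    const uint16_t probe = 1;
    unsigned char first_byte;
    memcpy(&first_byte, &probe, 1);
    if (first_byte == 1) {               /* little-endian host */
        return (uint16_t)((v << 8) | (v >> 8));
    }
    return v;                            /* big-endian host */
}

/* Sketch of memcmp(x, y, 2) expanded into two loads and a subtraction. */
static int cmp2_expanded(const void *x, const void *y) {
    assert(x != NULL && y != NULL);
    uint16_t a = to_big_endian16(load16_unaligned(x));
    uint16_t b = to_big_endian16(load16_unaligned(y));
    return (int)a - (int)b;
}
```

Called with an odd address such as `buf + 1`, `cmp2_expanded` stays well defined, which is the property the explicit align 1 is meant to preserve in the IR.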

aqjune · 2020-03-15T10:52:07Z

Made a patch here: https://reviews.llvm.org/D76113

It turns out that expandmemcmp can also be activated for the aarch64 and powerpc targets (test/CodeGen/AArch64/bcmp-inline-small.ll, etc.). On aarch64, an alignment mismatch may cause a trap.

legrosbuffle · 2020-03-16T07:47:41Z

I was wrong, actually: aarch64 does indeed expand even when strict loads are required. What it does is check for strict loads only to decide whether to allow overlapping loads; the underlying assumption here is that input buffers are always aligned :(

aqjune · 2020-03-16T13:49:13Z

Fixed via https://reviews.llvm.org/rGacdcd23b7b07 , thanks!

aqjune · 2020-03-16T13:50:51Z

Sorry, wrong link. It is 7aecf23

llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 10, 2021

This issue was closed.
