LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filed before November 26, 2021. Use GitHub to submit LLVM bugs.

Bug 1286 - insertelement of undef generates horrible code (e.g. _mm_sll_epi16)
Summary: insertelement of undef generates horrible code (e.g. _mm_sll_epi16)
Status: RESOLVED FIXED
Alias: None
Product: libraries
Classification: Unclassified
Component: Scalar Optimizations
Version: trunk
Hardware: Macintosh MacOS X
Importance: P normal
Assignee: Chris Lattner
URL:
Keywords: code-quality
Depends on:
Blocks:
 
Reported: 2007-03-28 14:07 PDT by Bill Wendling
Modified: 2010-02-22 12:48 PST

See Also:
Fixed By Commit(s):


Description Bill Wendling 2007-03-28 14:07:06 PDT
This:

#include <xmmintrin.h>

void foo(__m128i *A, __m128i *B) {
  *A = _mm_sll_epi16 (*A, *B);
}

generates this code:

_foo:
        movl 8(%esp), %eax
        movdqa (%eax), %xmm0
        #IMPLICIT_DEF %eax
        pinsrw $2, %eax, %xmm0
        xorl %ecx, %ecx
        pinsrw $3, %ecx, %xmm0
        pinsrw $4, %eax, %xmm0
        pinsrw $5, %ecx, %xmm0
        pinsrw $6, %eax, %xmm0
        pinsrw $7, %ecx, %xmm0
        movl 4(%esp), %eax
        movdqa (%eax), %xmm1
        psllw %xmm0, %xmm1
        movdqa %xmm1, (%eax)
        ret

This is obviously bad! We should be at least as good as GCC.
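
For reference when comparing the listings in this report, here is a minimal scalar sketch of what psllw computes when the count is a register or memory operand, as documented for the instruction: the shift count is the low 64 bits of the second operand, and any count above 15 zeroes every lane. The type vec8x16 and the function model_sll_epi16 are hypothetical names made up for this illustration; they are not part of any header or of LLVM.

```c
#include <stdint.h>

/* Hypothetical stand-in for __m128i: eight 16-bit lanes, lane 0 is lowest. */
typedef struct { uint16_t lane[8]; } vec8x16;

/* Scalar model of psllw with a vector count operand (illustration only). */
static vec8x16 model_sll_epi16(vec8x16 a, vec8x16 count)
{
    /* The count is the low 64 bits of the operand, i.e. lanes 0..3. */
    uint64_t n = 0;
    for (int i = 3; i >= 0; --i)
        n = (n << 16) | count.lane[i];

    vec8x16 r;
    for (int i = 0; i < 8; ++i) {
        /* Counts above 15 shift every bit out of a 16-bit lane. */
        r.lane[i] = (n > 15) ? 0 : (uint16_t)((uint32_t)a.lane[i] << n);
    }
    return r;
}
```

In this model lanes 4 through 7 of the count operand never influence the result, so a sequence that loads the count vector unchanged and one that rebuilds it lane by lane can be checked against the same reference.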
Comment 1 Chris Lattner 2007-04-08 20:12:55 PDT
Fixed, patch here:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20070402/047039.html

Testcase here: Transforms/InstCombine/vec_insertelt.ll

We now compile this to:

_foo:
        movl 4(%esp), %eax
        movdqa (%eax), %xmm0
        movl 8(%esp), %ecx
        psllw (%ecx), %xmm0
        movdqa %xmm0, (%eax)
        ret

GCC manages:

_foo:
        subl    $12, %esp
        movl    16(%esp), %edx
        movl    20(%esp), %eax
        movdqa  (%eax), %xmm0
        movdqa  (%edx), %xmm1
        psllw   %xmm0, %xmm1
        movdqa  %xmm1, (%edx)
        addl    $12, %esp
        ret

-Chris