This:

#include <xmmintrin.h>

void foo(__m128i *A, __m128i *B) {
  *A = _mm_sll_epi16 (*A, *B);
}

generates this code:

_foo:
        movl 8(%esp), %eax
        movdqa (%eax), %xmm0
        #IMPLICIT_DEF %eax
        pinsrw $2, %eax, %xmm0
        xorl %ecx, %ecx
        pinsrw $3, %ecx, %xmm0
        pinsrw $4, %eax, %xmm0
        pinsrw $5, %ecx, %xmm0
        pinsrw $6, %eax, %xmm0
        pinsrw $7, %ecx, %xmm0
        movl 4(%esp), %eax
        movdqa (%eax), %xmm1
        psllw %xmm0, %xmm1
        movdqa %xmm1, (%eax)
        ret

This is obviously bad! We should be at least as good as GCC.
Fixed, patch here:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20070402/047039.html

Testcase here: Transforms/InstCombine/vec_insertelt.ll

We now compile this to:

_foo:
        movl 4(%esp), %eax
        movdqa (%eax), %xmm0
        movl 8(%esp), %ecx
        psllw (%ecx), %xmm0
        movdqa %xmm0, (%eax)
        ret

GCC manages:

_foo:
        subl $12, %esp
        movl 16(%esp), %edx
        movl 20(%esp), %eax
        movdqa (%eax), %xmm0
        movdqa (%edx), %xmm1
        psllw %xmm0, %xmm1
        movdqa %xmm1, (%edx)
        addl $12, %esp
        ret

-Chris
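
For anyone re-checking this after a codegen change, here is a rough sketch of a harness that compares foo() against a scalar reference, so the shorter instruction sequence can be confirmed to compute the same result. It is not part of the patch or the testcase above. The helper name foo_matches_scalar and the input values are made up for illustration, and it includes <emmintrin.h> (the SSE2 header) directly on the assumption that the toolchain does not pull it in through <xmmintrin.h>.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2: __m128i, _mm_sll_epi16, _mm_cvtsi32_si128 */
#include <stdint.h>
#include <string.h>

/* Function from the report, unchanged apart from the header. */
void foo(__m128i *A, __m128i *B) {
  *A = _mm_sll_epi16(*A, *B);
}

/* Hypothetical helper (not from the report): returns 1 if foo() agrees
   with a scalar 16-bit left shift for the given count, 0 otherwise. */
int foo_matches_scalar(unsigned count) {
  /* Arbitrary illustrative inputs covering small values and high bits. */
  uint16_t in[8] = {1, 2, 3, 0x00ff, 0x8000, 0x7fff, 0xffff, 0};
  uint16_t out[8];
  __m128i a, b;

  memcpy(&a, in, sizeof a);
  /* psllw reads the shift count from the low 64 bits of its operand;
     _mm_cvtsi32_si128 puts the count there and zeroes the rest. */
  b = _mm_cvtsi32_si128((int)count);

  foo(&a, &b);
  memcpy(out, &a, sizeof out);

  for (int i = 0; i < 8; ++i) {
    /* Counts above 15 clear every 16-bit lane. */
    uint16_t expect = count > 15 ? 0 : (uint16_t)(in[i] << count);
    if (out[i] != expect)
      return 0;
  }
  return 1;
}
```

Calling foo_matches_scalar() with a few counts (for example 0, 3, 15 and 16) exercises both the in-range and the saturating case. It only checks results, so the generated assembly still needs to be inspected by hand.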