LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filed before November 26, 2021. Use GitHub to submit LLVM bugs.

Bug 47464 - [x86] After 16f3d698f2af, -march=amdfam10 incorrectly implies SSSE3 instructions
Summary: [x86] After 16f3d698f2af, -march=amdfam10 incorrectly implies SSSE3 instructions
Status: RESOLVED FIXED
Alias: None
Product: libraries
Classification: Unclassified
Component: Backend: X86
Version: 11.0
Hardware: PC All
Importance: P enhancement
Assignee: Unassigned LLVM Bugs
URL:
Keywords:
Depends on:
Blocks: release-11.0.0
Reported: 2020-09-08 10:06 PDT by Dimitry Andric
Modified: 2020-09-08 12:00 PDT
CC List: 8 users

See Also:
Fixed By Commit(s): e6bb4c8e7b3e27f214c9665763a2dd09aa96a5ac


Description Dimitry Andric 2020-09-08 10:06:04 PDT
After https://reviews.llvm.org/D83273 /
https://github.com/llvm/llvm-project/commit/16f3d698f2af ("[X86] Move
the feature dependency handling in X86TargetInfo::setFeatureEnabledImpl
to a table based lookup in X86TargetParser.cpp"), compiling with -march=amdfam10 incorrectly starts using SSSE3 instructions, in particular pshufb.

Test case:

#include <xmmintrin.h>

__m128i foo(unsigned char c) {
  return _mm_set1_epi8(c);
}

Asm output before:

foo:                                    # @foo
        .cfi_startproc
# %bb.0:                                # %entry
        movd    %edi, %xmm0
        punpcklbw       %xmm0, %xmm0            # xmm0 = xmm0[0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7]
        pshuflw $224, %xmm0, %xmm0              # xmm0 = xmm0[0,0,2,3,4,5,6,7]
        pshufd  $0, %xmm0, %xmm0                # xmm0 = xmm0[0,0,0,0]
        retq

After:

foo:                                    # @foo
        .cfi_startproc
# %bb.0:                                # %entry
        movd    %edi, %xmm0
        pxor    %xmm1, %xmm1
        pshufb  %xmm1, %xmm0
        retq
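
A complementary check that does not require reading assembly is to test the predefined macro __SSSE3__, which GCC and Clang define when SSSE3 is enabled for the translation unit. The sketch below is only an illustration: the function name is hypothetical, and this report does not state whether the macro was actually defined under the regression.

```c
/* Hypothetical helper: reports whether the compiler enabled SSSE3
 * for this translation unit, based on the predefined __SSSE3__ macro. */
int ssse3_enabled_at_compile_time(void) {
#ifdef __SSSE3__
    return 1;   /* SSSE3 instructions such as pshufb may be emitted */
#else
    return 0;   /* SSSE3 is not part of the enabled feature set */
#endif
}
```

Compiled with -march=amdfam10, the expectation would be a return value of 0, since that target should not enable SSSE3.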

Apparently https://github.com/llvm/llvm-project/commit/16f3d698f2afbbea43e0c3df81df6f2a640ce806#diff-045a11495d5fad4e01af59c60f6a387aR500 makes SSE4_A now imply SSSE3, which is not correct (at least not always, I think).
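
To illustrate why a single table entry matters, here is a minimal sketch of a table-based implication lookup. It is not LLVM's actual code: the enum, the tables, and enable_feature are hypothetical names, and the "fixed" table assumes the intended entry was SSE3, as comment 2 suggests.

```c
/* Hypothetical feature list for illustration; not LLVM's real enum. */
enum Feature { F_SSE, F_SSE2, F_SSE3, F_SSSE3, F_SSE4A, F_COUNT };

#define BIT(f) (1u << (f))

/* Each entry lists the features directly implied by that feature. */
static const unsigned implied_buggy[F_COUNT] = {
    [F_SSE]   = 0,
    [F_SSE2]  = BIT(F_SSE),
    [F_SSE3]  = BIT(F_SSE2),
    [F_SSSE3] = BIT(F_SSE3),
    [F_SSE4A] = BIT(F_SSSE3),   /* the suspected wrong entry */
};

static const unsigned implied_fixed[F_COUNT] = {
    [F_SSE]   = 0,
    [F_SSE2]  = BIT(F_SSE),
    [F_SSE3]  = BIT(F_SSE2),
    [F_SSSE3] = BIT(F_SSE3),
    [F_SSE4A] = BIT(F_SSE3),    /* assumed intended entry */
};

/* Enable feature f and everything it transitively implies. */
unsigned enable_feature(const unsigned table[F_COUNT], unsigned enabled,
                        enum Feature f) {
    unsigned pending = BIT(f);
    while (pending != 0) {
        for (int i = 0; i < F_COUNT; ++i) {
            if (!(pending & BIT(i)))
                continue;
            pending &= ~BIT(i);
            if (enabled & BIT(i))
                continue;               /* already handled */
            enabled |= BIT(i);
            pending |= table[i] & ~enabled;
        }
    }
    return enabled;
}
```

With implied_buggy, enabling F_SSE4A also sets the F_SSSE3 bit; with implied_fixed it stops at F_SSE3, so an instruction selector consulting the result would no longer consider pshufb available.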
Comment 1 Hans Wennborg 2020-09-08 10:18:26 PDT
Craig, can you take a look?
Comment 2 Andriy Gapon 2020-09-08 10:25:37 PDT
Perhaps it was just a typo -- SSE3 vs SSSE3?
Comment 3 Craig Topper 2020-09-08 10:56:46 PDT
Fixed by e6bb4c8e7b3e27f214c9665763a2dd09aa96a5ac
Comment 4 Hans Wennborg 2020-09-08 12:00:46 PDT
(In reply to Craig Topper from comment #3)
> Fixed by e6bb4c8e7b3e27f214c9665763a2dd09aa96a5ac

Thanks! Pushed to 11.x as 6f1dbbc17c03206040eeaaee71e5db961f2cac30.