19429 – Mesa llvmpipe lp_test_arit regression

Summary: Mesa llvmpipe lp_test_arit regression

Status:	RESOLVED INVALID

Alias:	None

Product:	new-bugs
Classification:	Unclassified
Component:	new bugs
Version:	trunk
Hardware:	PC Linux

Importance:	P normal
Assignee:	Unassigned LLVM Bugs

Keywords:	regression

Reported:	2014-04-14 19:55 PDT by Vinson Lee
Modified:	2016-08-05 09:54 PDT
CC List:	8 users

See Also:	8999

Description Vinson Lee 2014-04-14 19:55:39 PDT

The Mesa unit test lp_test_arit fails starting with llvm-3.5svn r206094.

$ ./build/linux-x86_64-debug/bin/lp_test_arit -v
neg(-inf): ref = inf, out = inf, precision = 24.000000 bits, PASS
neg(-60): ref = 60, out = 60, precision = 24.000000 bits, PASS
neg(-4): ref = 4, out = 4, precision = 24.000000 bits, PASS
neg(-2): ref = 2, out = 2, precision = 24.000000 bits, PASS
neg(-1): ref = 1, out = 1, precision = 24.000000 bits, PASS
neg(-1.00000001e-07): ref = 1.00000001e-07, out = 1.00000001e-07, precision = 24.000000 bits, PASS
neg(0): ref = -0, out = -0, precision = 24.000000 bits, PASS
neg(1.00000001e-07): ref = -1.00000001e-07, out = -1.00000001e-07, precision = 24.000000 bits, PASS
neg(0.00999999978): ref = -0.00999999978, out = -0.00999999978, precision = 24.000000 bits, PASS
neg(0.100000001): ref = -0.100000001, out = -0.100000001, precision = 24.000000 bits, PASS
neg(0.899999976): ref = -0.899999976, out = -0.899999976, precision = 24.000000 bits, PASS
neg(0.99000001): ref = -0.99000001, out = -0.99000001, precision = 24.000000 bits, PASS
neg(1): ref = -1, out = -1, precision = 24.000000 bits, PASS
neg(2): ref = -2, out = -2, precision = 24.000000 bits, PASS
neg(4): ref = -4, out = -4, precision = 24.000000 bits, PASS
neg(60): ref = -60, out = -60, precision = 24.000000 bits, PASS
neg(inf): ref = -inf, out = -inf, precision = 24.000000 bits, PASS
LLVM ERROR: Cannot select: intrinsic %llvm.x86.sse41.round.ps


6bb00df864ea4e2f74f47c088b65baaff962cca5 is the first bad commit
commit 6bb00df864ea4e2f74f47c088b65baaff962cca5
Author: Jim Grosbach <grosbach@apple.com>
Date:   Sat Apr 12 01:34:29 2014 +0000

    X86: Remove TargetMachine CPU auto-detection.
    
    This logic is properly in the realm of whatever is creating the
    TargetMachine. This makes plain 'llc foo.ll' consistent across
    heterogenous machines.
    
    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206094 91177308-0d34-0410-b5e6-96231b3b80d8

:040000 040000 2cae884260c943f7fc7bb85b5a3231cbc4977fdc f9e5bdc9dc900207f424347deb3c62a5b5ffe833 M	lib
bisect run success

Comment 1 Eric Christopher 2014-04-14 21:28:22 PDT

Seems like Mesa will need to be updated to set all of the machine specific bits when calling into llvm.

Comment 2 Jim Grosbach 2014-04-15 11:32:50 PDT

Correct. JIT clients need to specify a CPU and/or target features.

Note that the old JIT's use of autodetection also led to frequent backend failures on newer hardware (Sandy Bridge and later), because the old JIT doesn't know how to handle many of the new instructions. That won't happen anymore.
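
The "Cannot select" error in the description is consistent with a client emitting an SSE4.1 intrinsic while the target machine was created without SSE4.1 enabled. Below is a minimal sketch of a guard a JIT client could apply before emitting such an intrinsic. The helper name featureEnabled and the approach of inspecting the attribute string directly are assumptions made for illustration; this is not Mesa or LLVM code, and it does not account for features implied by a CPU name.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: returns true if `name` is switched on in an
// LLVM-style attribute string such as "+sse4.1,-avx".
// Entries are comma-separated with a '+' or '-' prefix; a later entry
// for the same feature overrides an earlier one.
bool featureEnabled(const std::string &attrs, const std::string &name) {
    bool enabled = false;
    std::istringstream in(attrs);
    std::string item;
    while (std::getline(in, item, ',')) {
        if (item.size() < 2)
            continue;  // too short to hold a prefix plus a name
        if (item.compare(1, std::string::npos, name) != 0)
            continue;  // some other feature
        if (item[0] == '+')
            enabled = true;
        else if (item[0] == '-')
            enabled = false;
    }
    return enabled;
}
```

With this sketch, a client would emit llvm.x86.sse41.round.ps only when featureEnabled(attrs, "sse4.1") holds for the same attribute string it hands to LLVM, and fall back to a generic code path otherwise.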

Comment 3 sroland 2014-04-15 21:18:45 PDT

So is it necessary to pass in everything the chip can do? So for Haswell something like "+avx, +avx2, +bmi, +bmi2, +cmov, +f16c, +fma, +movbe, +lzcnt, +popcnt, +sse, +sse2, +sse3, +ssse3, +sse4.1, +sse4.2" and the dozen others I forgot?
I guess I'll need to figure out how to get that from somewhere. Right now we only detect the features for which we emit intrinsics ourselves (notably most SSE versions and AVX). The same probably goes for the CPU model, though I doubt it makes much of a difference for the generated code, as nearly all of it is SIMD.

Comment 4 Jose Fonseca 2014-04-16 08:13:24 PDT

(In reply to comment #1)
> Seems like Mesa will need to be updated to set all of the machine specific
> bits when calling into llvm.

(In reply to comment #2)
> Correct. JIT clients need to specify a CPU and/or target features.

Is there a standard way to do that through LLVM C bindings?

Comment 5 Jim Grosbach 2014-04-16 12:54:29 PDT

(In reply to comment #3)
> So is it necessary to pass in everything the chip can do? So for Haswell
> something like "+avx, +avx2, +bmi, +bmi2, +cmov, +f16c, +fma, +movbe,
> +lzcnt, +popcnt, +sse, +sse2, +sse3 +ssse3, sse4.1, sse4.2" and the dozen
> others I forgot?

Just specify the CPU. Don't worry about the specific subtarget features.

That said, as mentioned above, unless you're using MCJIT, you don't want to specify newer hardware, because the old JIT can't encode some of those instructions and you'll see crashes as a result.

> I guess I'll need to figure out how to get that from somewhere - right now
> we only detect some stuff for which we emit intrinsics ourselves (notably
> most sse versions, avx). Probably the same for the cpu model though I doubt
> it makes much of a difference for the generated code as nearly all of it is
> simd).

Use llvm::sys::getHostCPUName() and llvm::sys::getHostCPUFeatures() to query the host.
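
If a client does want to pass explicit features in addition to the CPU name, the per-feature flags have to be flattened into a single comma-separated attribute string with "+" or "-" prefixes and no spaces. The sketch below shows that flattening in plain C++. The helper name buildFeatureString is hypothetical, and std::map stands in here for whatever name-to-bool container the host query fills; neither is taken from Mesa or LLVM.

```cpp
#include <map>
#include <string>

// Hypothetical helper: flatten a feature -> enabled map into an
// LLVM-style attribute string, e.g. "+avx,-avx2,+sse4.1".
// std::map iterates in sorted key order, so the output is deterministic.
std::string buildFeatureString(const std::map<std::string, bool> &features) {
    std::string out;
    for (const auto &kv : features) {
        if (!out.empty())
            out += ',';                  // separator, no spaces
        out += kv.second ? '+' : '-';    // enabled or disabled
        out += kv.first;                 // feature name
    }
    return out;
}
```

For example, a map holding avx=true, avx2=false and sse4.1=true yields "+avx,-avx2,+sse4.1".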

Comment 6 Jim Grosbach 2014-04-16 12:56:29 PDT

(In reply to comment #4)
> (In reply to comment #1)
> > Seems like Mesa will need to be updated to set all of the machine specific
> > bits when calling into llvm.
> 
> (In reply to comment #2)
> > Correct. JIT clients need to specify a CPU and/or target features.
> 
> Is there a standard way to do that through LLVM C bindings?


If there's not, we should add one. That's something that should be there regardless of this change.

Comment 7 sroland 2014-04-16 13:15:41 PDT

(In reply to comment #5)
> (In reply to comment #3)
> > So is it necessary to pass in everything the chip can do? So for Haswell
> > something like "+avx, +avx2, +bmi, +bmi2, +cmov, +f16c, +fma, +movbe,
> > +lzcnt, +popcnt, +sse, +sse2, +sse3 +ssse3, sse4.1, sse4.2" and the dozen
> > others I forgot?
> 
> Just specify the CPU. Don't worry about the specific subtarget features.
> 
> That said, as mentioned above, unless you're using the MCJIT, you don't want
> to specify newer hardware, as the old JIT can't encode some of those
> instructions and you'll see crashes as a result.
We can use either MCJIT or the old JIT, though we generally don't use MCJIT on x86 (we used MCJIT at a time when the old JIT couldn't handle AVX).
So I guess we need to "clamp" the CPU, though I wonder what the old JIT can't handle: all the new features of Haswell? We actually have code that assumes LLVM will emit AVX2 instructions; otherwise we use workarounds to emulate true vector shifts. I guess that didn't actually work...

> 
> > I guess I'll need to figure out how to get that from somewhere - right now
> > we only detect some stuff for which we emit intrinsics ourselves (notably
> > most sse versions, avx). Probably the same for the cpu model though I doubt
> > it makes much of a difference for the generated code as nearly all of it is
> > simd).
> 
> llvm::sys::getHostCPUName() and llvm::sys::getHostCPUFeatures()
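
The "clamping" idea from this comment could look roughly like the sketch below: pass the host CPU through unchanged for MCJIT, but substitute an older model for the old JIT. Everything here is an assumption for illustration: the function name chooseJitCpu is hypothetical, the set of "too new" CPU names is only a guess at names LLVM used around 3.5, and "corei7" as the fallback is an arbitrary conservative choice, not something stated in this thread.

```cpp
#include <set>
#include <string>

// Hypothetical policy: pick the CPU name to hand to LLVM.
// MCJIT gets the detected host CPU; the old JIT gets a clamped model
// when the host is newer than what it is assumed to encode reliably.
std::string chooseJitCpu(const std::string &hostCpu, bool useMCJIT) {
    if (useMCJIT)
        return hostCpu;

    // Assumed list of CPU names considered too new for the old JIT.
    static const std::set<std::string> tooNew = {
        "corei7-avx",  // Sandy Bridge
        "core-avx-i",  // Ivy Bridge
        "core-avx2",   // Haswell
    };
    return tooNew.count(hostCpu) ? std::string("corei7") : hostCpu;
}
```

Under these assumptions, chooseJitCpu("core-avx2", false) returns "corei7", while chooseJitCpu("core-avx2", true) returns "core-avx2" unchanged. A real client would also need to make sure its own intrinsic selection agrees with the clamped CPU.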

Comment 8 Jim Grosbach 2014-04-16 13:43:52 PDT

Some of the new instructions probably work fine. Anything that the generic tblgen'erated binary encoding function handles fully will fall out OK. It's the stuff that needs anything more than that which will fall over. In particular, I've seen failures with some of the BMI instructions.
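
One way a client might act on this, short of clamping the whole CPU model, is to switch off only the problematic features by appending "-name" entries to the attribute string. The sketch below assumes that later entries override earlier ones and that "bmi" and "bmi2" are the relevant feature names; the helper name withDisabled is hypothetical and the choice of features to disable is a guess based on this comment, not a verified list.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: append "-feature" entries so that the named
// features end up disabled regardless of what came earlier in `attrs`.
std::string withDisabled(std::string attrs,
                         const std::vector<std::string> &off) {
    for (const auto &name : off) {
        if (!attrs.empty())
            attrs += ',';
        attrs += '-';
        attrs += name;
    }
    return attrs;
}
```

For instance, withDisabled("+avx2", {"bmi", "bmi2"}) produces "+avx2,-bmi,-bmi2".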