21414 – clang shouldn't use aligned SSE instructions on 32bit x86 linux

Status:	NEW

Alias:	None

Product:	libraries
Classification:	Unclassified
Component:	Backend: X86
Version:	trunk
Hardware:	PC All

Importance:	P normal
Assignee:	Unassigned LLVM Bugs

Reported:	2014-10-29 18:39 PDT by Nico Weber
Modified:	2014-11-17 08:44 PST
CC List:	8 users

Description Nico Weber 2014-10-29 18:39:15 PDT

We recently enabled clang as the default compiler for Chrome on Linux. It mostly went well. One problem we ran into was that Chrome 38 crashed on old Debian releases. rnk explained this to me, and from what I understand (I probably got some of it wrong) it was due to clang using aligned SSE instructions.

gcc changed its stack alignment ABI on Linux a while ago, and the current ABI does guarantee alignment. The old one doesn't, and I suppose system libraries on Debian still use the old ABI. gcc emits unaligned SSE loads on 32-bit Linux; clang should too.

As is, everyone who distributes prebuilt binaries on Linux will run into this and then work around it somehow. We ended up adding tons of stack adjustments, which bloats binary size, is bad for the icache, etc. Unaligned SSE reads are almost as fast as aligned ones, so clang should do the thing that Just Works instead of giving its users a bad experience.

https://code.google.com/p/nativeclient/issues/detail?id=3935
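
For readers unfamiliar with the distinction, here is a minimal sketch of the two kinds of SSE load being discussed. It is not taken from Chrome or from either compiler; the function names `sum4_aligned` and `sum4_unaligned` are made up for illustration, and the scalar fallback exists only so the sketch builds on non-SSE targets. `_mm_load_ps` requires a 16-byte-aligned address and faults otherwise, while `_mm_loadu_ps` accepts any address.

```c
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

/* Hypothetical helper: sums four floats starting at p.
 * Uses the aligned load, so p MUST be 16-byte aligned; on x86 a
 * misaligned p raises a fault here. That is the kind of crash a
 * misaligned stack slot can trigger when the compiler assumes alignment. */
static float sum4_aligned(const float *p) {
#if defined(__SSE__)
    __m128 v = _mm_load_ps(p);    /* aligned load (movaps) */
    float out[4];
    _mm_storeu_ps(out, v);
    return out[0] + out[1] + out[2] + out[3];
#else
    return p[0] + p[1] + p[2] + p[3];  /* scalar fallback, no SSE */
#endif
}

/* Hypothetical helper: same sum, but p may have any float alignment. */
static float sum4_unaligned(const float *p) {
#if defined(__SSE__)
    __m128 v = _mm_loadu_ps(p);   /* unaligned load (movups) */
    float out[4];
    _mm_storeu_ps(out, v);
    return out[0] + out[1] + out[2] + out[3];
#else
    return p[0] + p[1] + p[2] + p[3];  /* scalar fallback, no SSE */
#endif
}
```

Given `_Alignas(16) float buf[8]`, calling `sum4_unaligned(buf + 1)` is safe even though `buf + 1` is only 4-byte aligned, whereas `sum4_aligned` may only be called on `buf` or `buf + 4`. The request in this bug amounts to having the compiler pick the second form for stack accesses on 32-bit x86 Linux.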

Comment 1 Dimitry Andric 2014-10-30 11:11:36 PDT

Since bug 8969, 32-bit Linux assumes a 16-byte-aligned stack. Upstream gcc also defaults to this, so why would it ever choose unaligned SSE loads?

Comment 2 Nico Weber 2014-10-30 12:33:52 PDT

See comment 0. Binaries built with old gccs don't conform to this abi, so gcc uses unaligned reads to be abi compatible with old versions of itself.

Comment 3 Reid Kleckner 2014-10-30 13:23:55 PDT

(In reply to comment #2)
> See comment 0. Binaries built with old gccs don't conform to this abi, so
> gcc uses unaligned reads to be abi compatible with old versions of itself.

I think it's a little more nuanced than that. Here's the history as I understand it:

- Long ago (2006?) gcc increased the "preferred" stack alignment to 16.
- They waited a while with this setting.
- In the next release, they began assuming the stack was 16-byte aligned.
- Users reported bugs, and their compromise solution was "we'll keep assuming 16-byte alignment, but we'll use unaligned loads and stores so that users stop filing bugs on us".

I don't consider it correct to say that they are ABI-compatible with old GCC binaries. It's just that they have made an engineering tradeoff to avoid the most obvious forms of breakage for people on systems that aren't using the new ABI.

I think, in retrospect, we should probably also make this compromise. It's the practical thing to do.

See also:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40838

If you follow the breadcrumbs, it looks like GCC started aligning the stack back in 2006:
https://gcc.gnu.org/ml/gcc-patches/2006-09/msg00252.html

Comment 4 Reid Kleckner 2014-10-30 13:25:08 PDT

Also, to be absolutely clear, this is for 32-bit x86 Linux, not x86_64, which has always had a 16-byte stack alignment requirement in the ABI.