This is an archive of the discontinued LLVM Phabricator instance.

[X86][AsmParser] re-introduce 'offset' operator
ClosedPublic

Authored by epastor on Dec 12 2019, 2:21 PM.

Details

Summary

Amend MS offset operator implementation, to more closely fit with its MS counterpart:

  1. InlineAsm: evaluate non-local source entities to their (address) location
  2. Provide a mean with which one may acquire the address of an assembly label via MS syntax, rather than yielding a memory reference (i.e. "offset asm_label" and "$asm_label" should be synonymous
  3. address PR32530

Based on http://llvm.org/D37461

Fix broken test where the break appears unrelated.

  • Set up appropriate memory-input rewrites for variable references.
  • Intel-dialect assembly printing now correctly handles addresses by adding "offset".
  • Pass offsets as immediate operands (using "r" constraint for offsets of locals).

Diff Detail

Event Timeline

epastor created this revision.Dec 12 2019, 2:21 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 12 2019, 2:21 PM
epastor added a comment.EditedDec 12 2019, 7:29 PM

Note: this currently breaks several test cases, at least some of which are valid.

Example: (minimized from ms-inline-asm-redundant-clobber.ll)
call void asm inteldialect "cmpxchg8b $0", "=*m"(i32* @Bar)
previously emitted
cmpxchg8b Bar(%rip)
and now (after the change to X86AsmPrinter.cpp) emits

<inline asm>:2:29: error: unknown token in expression
        cmpxchg8b [rip + offset Bar]
                                   ^

I'm not sure how to fix this... but among other problems, it seems we've ended up emitting offset even when emitting AT&T-syntax assembly. (The same error occurs if we pass -x86-asm-syntax=intel, too.) This appears to affect all indirect memory operands.

EDIT: This was a bad interpretation; the issue was that the operand was expanded into an inteldialect Intel expression, and the offset was immediately followed by ']'. Making offset legal before ']' and ')' fixes that problem.

epastor updated this revision to Diff 233778.Dec 13 2019, 4:46 AM
  • Allow offsets at the end of parenthesized expressions
epastor added a reviewer: rnk.Dec 13 2019, 5:14 AM

Remaining test failures include:

call void asm sideeffect inteldialect "lea edi, dword ptr $0", "*m,~{edi},~{dirflag},~{fpsr},~{flags}"([2 x i32]* @results) nounwind
previously expanded as
lea edi, dword ptr [_results]
and now expands as
lea edi, dword ptr [offset _results]
This may be correct.

In C & C++,
__asm mov [eax], offset var
is now ambiguous, since "offset var" is not aliased directly to a pointer-sized register. I think this is technically correct, but I don't think I should make that decision without input.

epastor updated this revision to Diff 233786.Dec 13 2019, 5:24 AM
  • Fix tests expecting use of 'r' constraints for offset operands
Herald added a project: Restricted Project. · View Herald TranscriptDec 13 2019, 5:24 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
epastor updated this revision to Diff 233806.Dec 13 2019, 7:58 AM
  • Fix Intel named operator parsing

Unit tests: fail. 60824 tests passed, 4 failed and 726 were skipped.

failed: Clang.CodeGen/ms-inline-asm-64.c
failed: Clang.CodeGen/ms-inline-asm.c
failed: Clang.SemaTemplate/instantiation-depth-default.cpp
failed: LLVM.CodeGen/X86/ms-inline-asm.ll

clang-format: pass.

Build artifacts: console-log.txt, CMakeCache.txt, test-results.xml, diff.json

epastor updated this revision to Diff 233813.Dec 13 2019, 8:29 AM
  • Respect correct usage of offset.
  • Don't assume implicit sizing on offset.
  • Check that offset works in compound expressions.

Unit tests: pass. 60854 tests passed, 0 failed and 726 were skipped.

clang-format: fail. Please format your changes with clang-format by running git-clang-format HEAD^ or apply this patch.

Build artifacts: console-log.txt, CMakeCache.txt, clang-format.patch, test-results.xml, diff.json

epastor updated this revision to Diff 234067.Dec 16 2019, 8:03 AM
  • Apply formatting fixes where reasonable

Unit tests: pass. 60854 tests passed, 0 failed and 726 were skipped.

clang-format: fail. Please format your changes with clang-format by running git-clang-format HEAD^ or apply this patch.

Build artifacts: diff.json, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

rnk added a comment.Dec 16 2019, 4:13 PM

Thanks! I made some comments. This code is convoluted. I don't think I can reason through all the edge cases, but this seems like an improvement. I'd suggest adding the recommended tests and any other corner case inputs you can think of, and after another round of review we can land it.

clang/test/CodeGen/ms-inline-asm-64.c
18

I guess this has the same semantics as it did before.

llvm/include/llvm/MC/MCParser/MCTargetAsmParser.h
74

These explicit empty StringRef initializations seem unnecessary, I'd suggest removing them.

99–100

nit, pack the bool after the unsigned for a smaller struct. Hey, we make a vector of these. :)

llvm/lib/MC/MCParser/AsmParser.cpp
5938–5943

Maybe this would be simpler with

auto rewrite_it = find_if(it, AsmStrRewrites.end(), [&](const AsmRewrite &OffsetAR) {
  // ... conditions below
});
llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
1481

While this doesn't trigger warnings, it seems like it would be simpler to do the assignment as its own statement and then if (ParseErrror) return true;

I see that the local code style uses this pattern already, but I'm not sure that's a good reason to promote it and keep using it.

1484

ditto

1823

Please test this corner case, I imagine it looks like:

mov eax, offset 3
1827

Please add a test for this corner case, I'm curious to see if the error reporting really works.

epastor updated this revision to Diff 234371.Dec 17 2019, 12:59 PM
epastor marked 6 inline comments as done.
  • Address refactoring comments

Thanks for feedback!

clang/test/CodeGen/ms-inline-asm-64.c
18

It does; this previously resolved to a "qword ptr" mov automatically, since $0 unpacked as a (forced) 64-bit register name. Now we need to specify the size of the move, since the operand is an immediate rather than a register.

llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
1823

Interesting. This corner case didn't trigger in that scenario; we get an "expected identifier" error message with good source location, followed by another error "use of undeclared label '3'" in debug builds... and in release builds, we instead get a crash. On tracing the crash, it's a AsmStrRewrite applying to a SMLoc not coming from the same string...

As near as I can tell, the issue is that we end up trying to parse "3" as a not-yet-declared label, as such expand it to __MSASMLABEL_.${:uid}__3, and then end up in a bad state because the operand rewrite is applying to the expanded symbol... which isn't in the same AsmString. If you use an actual undeclared label, you hit the same crash in release builds.

This is going to take some work; I'll get back to it in a day or two.

1827

This error reporting doesn't work, in fact. We instead get "cannot take the address of an rvalue of type 'int'", with bad source location. Will investigate why we end up there.

Unit tests: pass. 60854 tests passed, 0 failed and 726 were skipped.

clang-tidy: fail. Please fix clang-tidy findings.

clang-format: fail. Please format your changes with clang-format by running git-clang-format HEAD^ or applying this patch.

Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

epastor updated this revision to Diff 234566.Dec 18 2019, 10:00 AM
epastor marked 2 inline comments as done.

Fix issues around enum values and labels

  • Only rewrite variables as offset operands; fixes crash due to conflicting rewrites for labels
  • Recognize inline assembly references to enum values
llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
1823

Fixed; we now get the same errors in this scenario as we do in the current LLVM trunk, reporting "expected identifier" and "use of undeclared label '3'".

On the other hand, I'm still working on finding a scenario that DOES trigger this corner case.

1827

Turns out Sema wasn't successfully recognizing inline-assembly references to enum values, so we were processing them through as labels, and only recognizing the problem late in processing. Fixed now, with accurate error reporting; test added.

epastor updated this revision to Diff 234567.Dec 18 2019, 10:04 AM
  • Removing accidental artifacts of local testing...

Unit tests: unknown.

clang-tidy: fail. Please fix clang-tidy findings.

clang-format: fail. Please format your changes with clang-format by running git-clang-format HEAD^ or applying this patch.

Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt

Unit tests: pass. 60854 tests passed, 0 failed and 726 were skipped.

clang-tidy: fail. Please fix clang-tidy findings.

clang-format: fail. Please format your changes with clang-format by running git-clang-format HEAD^ or applying this patch.

Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

rnk accepted this revision.Dec 28 2019, 1:04 PM

lgtm, thanks!

llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
1823

I see. Well, it sounds like it was a useful test. :)

This revision is now accepted and ready to land.Dec 28 2019, 1:04 PM
epastor updated this revision to Diff 235617.Dec 30 2019, 10:06 AM

Rebase on top of HEAD

  • Add compatibility with dc5b614fa9a1 (which changed handling for call operands in x86 assembly)
  • Remove another artifact of local testing
epastor updated this revision to Diff 235619.Dec 30 2019, 10:18 AM
epastor marked 5 inline comments as done.
  • Fix a few final issues.
    • Pack the AsmRewrite struct more optimally.
    • Only check if we can fuse rewrites with ones we haven't already reached!
llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
1823

Very useful indeed!

Unit tests: pass. 61144 tests passed, 0 failed and 728 were skipped.

clang-tidy: fail. Please fix clang-tidy findings.

clang-format: fail. Please format your changes with clang-format by running git-clang-format HEAD^ or applying this patch.

Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

Unit tests: unknown.

clang-tidy: fail. Please fix clang-tidy findings.

clang-format: fail. Please format your changes with clang-format by running git-clang-format HEAD^ or applying this patch.

Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt

epastor updated this revision to Diff 235620.Dec 30 2019, 10:30 AM
  • The three-argument version of find_if is std::find_if, and the qualification is required.

Unit tests: pass. 61144 tests passed, 0 failed and 728 were skipped.

clang-tidy: fail. Please fix clang-tidy findings.

clang-format: fail. Please format your changes with clang-format by running git-clang-format HEAD^ or applying this patch.

Build artifacts: diff.json, clang-tidy.txt, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

This revision was automatically updated to reflect the committed changes.