This is an archive of the discontinued LLVM Phabricator instance.

ConstantMerge: merge common initial sequences
AbandonedPublic

Authored by jfb on Aug 10 2018, 5:22 PM.

Details

Reviewers
hiraditya
Summary

Global constants are currently merged only if they're idential, with appropriate update of their alignment and debug info, and only if semantically correct. We can do better by merging global constants which are 'smaller' into equivalent 'bigger' global constants, under the same update / semantic restrictions as before. This patch implements such merging for global constants with common initial sequences of the exact same type.

As noted in the patch there are other common global constant merging opportunities which this patch chooses not to pursue. They might pay off, but might be more computationally expensive.

Because the merged globals are constants this is a gain in final executable size, and can help reduce the dynamic amount of data cache usage because there might be some temporal redundancy in global constant access.

rdar://problem/43163738

Diff Detail

Event Timeline

jfb created this revision.Aug 10 2018, 5:22 PM

Overall this is ready to review. I ran a bunch of tests and tried it out with asan, everything seems happy, but I'm a bit paranoid. I'm re-running the test-suite before / after the change to make sure everything works as I expect it to.

test/Transforms/ConstantMerge/initial-match.ll
79

I'll look into this soon :)

Have you tested this patch with some benchmarks or an open source project? What is the impact on compilation time?

lib/Transforms/IPO/ConstantMerge.cpp
111

if-else instead of switch?

181

Do we really need a goto here?

183

I think we should split the nested loop to separate function that may help remove goto.

test/Transforms/ConstantMerge/initial-match.ll
2

Can we reduce the size of this test case and maybe make multiple functions.

xbolva00 added inline comments.
lib/Transforms/IPO/ConstantMerge.cpp
183

Yes, please split it.

365–366

while here, modernize to range based loop?

382

cannot we use for (for &I : CommonInitialSequenceRewrite)?

jfb updated this revision to Diff 160641.Aug 14 2018, 11:02 AM
jfb marked 4 inline comments as done.
  • Address comments.
jfb added a comment.Aug 14 2018, 11:02 AM

Have you tested this patch with some benchmarks or an open source project? What is the impact on compilation time?

Yes! I did so over the weekend and then discarded my results on Monday because I'm Smart. I'll re-run a stage 2 bootstrap before / after.

lib/Transforms/IPO/ConstantMerge.cpp
111

This is on purpose: a switch without default will cause a diagnostic if a new entry is added to the enum class but isn't handled here.

181

I like goto to break out of loop nests. :-)

183

Done... but I still like goto for loop nests.

382

erase needs an iterator, and I was previously using another container type. Now that I use MapVector it's better to do a first-pass erase_if and then do the rest of the loop with a range-based for, so I updated the code to do so.

test/Transforms/ConstantMerge/initial-match.ll
2

I could, but one of the goals is to make sure things work through multiple iterations of the outer while (because an earlier implementation I had didn't). If I break it up then you end up with some single-iteration things. These (if implemented improperly) would break as tested, but wouldn't break in single-iteration.

In particular, removeDeadConstantUsers and eraseFromParent are trip hazards because they can invalidate values we discover. For example, you might be tempted to cache initial sequence matches found from one iteration to another (so you don't re-discover them every iteration), but that would break if removeDeadConstantUsers kicks in.

Do you have numbers of the size win? (maybe on clang/llvm itself)

test/Transforms/ConstantMerge/initial-match.ll
11–25

CHECK-NOT are really brittle, is there any way you can make this slightly more robust?

If you have the code-size numbers, please provide that then I can accept the patch.

jfb planned changes to this revision.Sep 10 2018, 10:55 AM

If you have the code-size numbers, please provide that then I can accept the patch.

I want to get back to this after addressing a few related things. My main worry is that this is only useful with LTO because the linker should do this work otherwise, so at a minimum I need to change the code accordingly. I'm working on other ways to achieve the size wins I want, and then I'll fiddle with LTO to see how much can be done.

My main worry is that this is only useful with LTO because the linker should do this work otherwise, so at a minimum I need to change the code accordingly.

This is still useful even if it helps during LTO. As long as you have numbers to validate, it should be fine. We can add a flag to disable this by default.

vsk added a subscriber: vsk.Jan 22 2019, 11:11 AM
jfb abandoned this revision.Aug 26 2020, 8:41 AM

This is probably still useful, if someone wants to pick it up.

Herald added a project: Restricted Project. · View Herald TranscriptAug 26 2020, 8:41 AM
Herald added a subscriber: ributzka. · View Herald Transcript