26774 – Doing certain kinds of IPO over comdat functions is unsound

LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filled before November 26, 2021. Use github to submit LLVM bugs

Bug 26774 - Doing certain kinds of IPO over comdat functions is unsound

Summary: Doing certain kinds of IPO over comdat functions is unsound

Status:	NEW

Alias:	None

Product:	libraries
Classification:	Unclassified
Component:	Interprocedural Optimizations (show other bugs)
Version:	trunk
Hardware:	PC All

Importance:	P normal
Assignee:	Unassigned LLVM Bugs

URL:
Keywords:

Depends on:
Blocks:

Reported:	2016-02-29 11:41 PST by Sanjoy Das
Modified:	2016-05-17 17:57 PDT (History)
CC List:	6 users (show)

See Also:
Fixed By Commit(s):

Attachments
Reproducer (2.41 KB, application/zip) 2016-02-29 11:41 PST, Sanjoy Das	Details
Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Sanjoy Das 2016-02-29 11:41:35 PST

Created attachment 15960 [details]
Reproducer

This was discussed on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2016-February/095833.html

Description copy-pasted from the thread:

"""
Let's start with an example that shows that we have a problem (direct
copy/paste from the guard intrinsics thread). Say we have:

```
void foo() available_externally {
  %t0 = load atomic %ptr
  %t1 = load atomic %ptr
  if (%t0 != %t1) print("X");
}
void main() {
  foo();
  print("Y");
}
```

The possible behaviors of the above program are {print("X"),
print("Y")} or {print("Y")}.  But if we run opt then we have

```
void foo() available_externally readnone nounwind {
  ;; After CSE'ing the two loads and folding the condition
}
void main() {
  foo();
  print("Y");
}
```

and some generic reordering

```
void foo() available_externally readnone nounwind {
  ;; After CSE'ing the two loads and folding the condition
}
void main() {
  print("Y");
  foo();  // legal since we're moving a readnone nounwind function that
          // was guaranteed to execute (hence can't have UB)
}
```

If we do not inline @foo(), and instead re-link the call site in @main
to some non-optimized copy (or differently optimized copy) of @foo,
then it is possible for the program to have the behavior {print("Y");
print ("X")}, which was disallowed in the earlier program.

In other words, opt refined the semantics of @foo() (i.e. reduced the
set of behaviors it may have) in ways that would make later
optimizations invalid if we de-refine the implementation of @foo().

The above example is clearly fabricated, but such cases can come up
even if everything is optimized to the same level.  E.g. one of the
atomic loads in the unrefined implementation of @foo() could have been
hidden behind a function call, whose body existed in only one module.
That module would then be able to refine @foo() to `ret void` but
other modules won't.
"""

Comment 1 Duncan 2016-02-29 12:51:21 PST

Adding a link to a complete reproducer (also from Sanjoy) that has nothing to do with atomics:
https://github.com/sanjoy/comdat-ipo

Comment 2 Sanjoy Das 2016-02-29 12:54:49 PST

(In reply to comment #1)
> Adding a link to a complete reproducer (also from Sanjoy) that has nothing
> to do with atomics:
> https://github.com/sanjoy/comdat-ipo

FYI: that's the zip file I've attached to this bug.

Comment 3 Reid Kleckner 2016-02-29 14:29:31 PST

I vote that we stop inferring readnone/readonly/etc on functions with isWeakForLinker linkage. I expect that in most places where we can infer such attributes, inlining is likely to happen anyway, and be just as profitable.

Are we aware of any other IPO soundness issues with other passes that aren't functionattrs? I think IPSCP is fine for ODR linkage functions, they all have to return the same constant regardless of optimization.

Comment 4 Mehdi Amini 2016-02-29 14:37:25 PST

Just a note: be careful with isWeakForLinker(), it is not the right one (as I wrote in the email thread). I think the right condition is !isStrongDefinitionForLinker().
(isWeakForLinker() does not catch available_externally)

Comment 5 Sanjoy Das 2016-02-29 15:01:29 PST

(In reply to comment #3)
> Are we aware of any other IPO soundness issues with other passes that aren't
> functionattrs? I think IPSCP is fine for ODR linkage functions, they all
> have to return the same constant regardless of optimization.

I think both ipsccp and ipconstprop are bad.  Consider a variant of
the original example:

void f(i32* %ptr) available_externally {
  %v0 = load atomic i32* %ptr
  %v1 = load atomic i32* %ptr
  %sub = sub i32 %v0, %v1
  call @print(i32 %sub)
  ret i32 %sub
}

void caller() {
  %t = call @f(%ptr)
  print(%t)
}

function local opts ==>

void f(i32* %ptr) available_externally {
  call @print(i32 0)
  ret i32 0
}

void caller() {
  %t = call @f(%ptr)
  print(%t)
}


ipsccp or ipconstprop ==>

void f(i32* %ptr) available_externally {
  call @print(i32 0)
  ret i32 0
}

void caller() {
  %t = call @f(%ptr)
  print(i32 0)
}

Now if we link to the unoptimized @f, we can print {<non zero value>,
<0>}, when in the original program the two values printed would have
had to be the same.

I think (haven't tried) we can construct similar cases with undef
(no atomics) as well.

Comment 6 Hal Finkel 2016-02-29 15:19:45 PST

(In reply to comment #3)
> I vote that we stop inferring readnone/readonly/etc on functions with
> isWeakForLinker linkage. I expect that in most places where we can infer
> such attributes, inlining is likely to happen anyway, and be just as
> profitable.

I think we need to be careful here; you could be right, but there seem to be some attributes that we do commonly prove for inline functions, even those too large to inline. We can prove 'nocapture' on the implicit this pointer, for example. readonly seems more common than readnone.

I really think we need to benchmark this against privatizing the functions to make an informed decision.

> 
> Are we aware of any other IPO soundness issues with other passes that aren't
> functionattrs? I think IPSCP is fine for ODR linkage functions, they all
> have to return the same constant regardless of optimization.

I suspect that GlobalsAA has the same problem as attribute propagation (in one TU, a function's access to a global could be optimized out, but the linker could pick a version of the function which still accesses the global). [And privatizing just got harder?...]

Comment 7 Warren Ristow 2016-05-17 17:57:20 PDT

I see a change was made at r265762 to address this issue.  That's causing a
lost optimization in variadic cases, that I believe can still safely be
optimized.  I've made bug 27796 for it.