965 – Infinite loops are optimized out in languages without forward progress guarantees (C, Rust)

LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filled before November 26, 2021. Use github to submit LLVM bugs

Bug 965 - Infinite loops are optimized out in languages without forward progress guarantees (C, Rust)

Summary: Infinite loops are optimized out in languages without forward progress guaran...

Status:	RESOLVED FIXED

Alias:	None

Product:	libraries
Classification:	Unclassified
Component:	Interprocedural Analyses (show other bugs)
Version:	trunk
Hardware:	All All

Importance:	P normal
Assignee:	Unassigned LLVM Bugs

URL:
Keywords:	miscompilation

Duplicates (10):	3593 5387 5644 10858 12033 14454 14973 16733 31217 47189 (view as bug list)
Depends on:
Blocks:

Reported:	2006-10-22 19:33 PDT by Nikhil Patil
Modified:	2021-01-24 14:54 PST (History)
CC List:	40 users (show)

See Also:	47189
Fixed By Commit(s):

Attachments
Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Nikhil Patil 2006-10-22 19:33:40 PDT

Hi,

I have been playing with functions marked __attribute__((noreturn));
Specifically consider a following code:

void f(void) __attribute__((noreturn));

int call_f()
{
    f();
}

void f()
{
    while(1)
        ;
}

With "opt -globalsmodref-aa -adce" this gets compiled to:

int %call_f() {
entry:
    %retval = alloca int, align 4           ; <int*> [#uses=1]
    unreachable ; **** WRONG!!
    
return:         ; No predecessors!
    %retval = load int* %retval             ; <int> [#uses=1]
    ret int %retval
    
void %f() {
entry:
    br label %bb
    
bb:             ; preds = %bb, %entry
    br label %bb
    
return:         ; No predecessors!
    ret void    
}

Comment 1 Chris Lattner 2006-10-23 16:33:42 PDT

Please attach the .bc file you are inputting to opt here.  I get:

void %f() {
entry:
        br label %bb

bb:             ; preds = %bb, %entry
        br label %bb
}

int %call_f() {
entry:
        br label %bb

bb:             ; preds = %bb, %entry
        br label %bb
}

which is right.

Comment 2 Chris Lattner 2006-10-23 16:35:03 PDT

Ah, n/m I figured it out, you're using -O0.

Comment 3 Chris Lattner 2006-10-23 16:40:31 PDT

In this case, it don't really have anything to do with noreturn.  Here globalmodrefaa determines that f() 
has no side effects and produces no return value, so it deletes the call.

For example:

int call_f() {
    f();
    return 17;
}

void f() {
    while(1);
}  

Compiles to:

int %call_f() {
entry:
        %retval = alloca int, align 4           ; <int*> [#uses=2]
        %tmp = alloca int, align 4              ; <int*> [#uses=2]
        store int 17, int* %tmp
        %tmp = load int* %tmp           ; <int> [#uses=1]
        store int %tmp, int* %retval
        br label %return

return:         ; preds = %entry
        %retval = load int* %retval             ; <int> [#uses=1]
        ret int %retval
}

With: llvm-gcc dce.c -emit-llvm -S -o - | llvm-as | opt -globalsmodref-aa -adce | llvm-dis
-Chris

Comment 4 Eli Friedman 2008-06-18 03:37:35 PDT

How is this an issue with alias analysis?  The function *doesn't* read or write memory.  The issue is that ADCE is making bad assumptions.

The new version of ADCE has switched to using "mayWriteToMemory" instead of alias analysis... it still has the same issue, though, if you put the readnone attribute on f.  I hesitate to suggest a new function attribute, but having an attribute which guarantees a function will eventually return (the opposite of noreturn, I guess), would make some optimizations more feasible, like a correct ADCE and LICM with potentially trapping instructions.

A similar sort of testcase for LICM (where it assumes calls always return):
int b(void), c(int);
int a(int a) {
int k; for (int i = 0; i < 10; i++) {b(); k = 5/a; c(k);} return k;
}

Comment 5 Eli Friedman 2008-08-06 17:43:34 PDT

Same issue in LoopStrengthReduce:

declare i32 @a(i32) readonly

define void @b() {
entry:
  br label %loopstart

loopstart:
  %indvar = phi i32 [0, %entry], [%indvarinc, %loopstart]
  %randomphi = phi i32 [0, %entry], [%randomvar, %loopstart]
  %randomvar = call i32 @a(i32 %randomphi)
  %indvarinc = add i32 %indvar, 1
  %cmp = icmp eq i32 %indvarinc, 10
  br i1 %cmp, label %loopstart, label %exit

exit:
  ret void
}

Run opt -loop-reduce to reproduce.

Maybe we should add a function to Instruction, something like hasNoSideEffects, which returns false for all call instructions.  Then, we can make it a bit smarter if there's eventually a way to indicate that a function doesn't have any infinite loops.

Comment 6 Eli Friedman 2008-09-19 12:23:07 PDT

This is likely to trigger much more frequently with the new addreadattrs pass.

Comment 7 Duncan Sands 2008-09-19 12:39:18 PDT

Is this actually a bug?

Comment 8 Eli Friedman 2008-09-19 12:51:23 PDT

(In reply to comment #7)
> Is this actually a bug?

I'm pretty sure turning a well-defined infinite loop into a program with undefined behavior qualifies as a bug.

Comment 9 Duncan Sands 2008-09-21 04:29:51 PDT

Fair enough.  There is a similar issue with
raising exceptions: a function containing an
unwind instruction can be marked readnone/readonly.
But it should not be deleted if the result is not
used, since calls to it can modify control flow.

Here's what I suggest
(1) a new function attribute "willreturn" indicating that
calls to the function are sure to return.
(2) mark intrinsics with this, plus generate it in llvm-gcc
for a bunch of standard library functions.
(3) add a pass like PruneEH and AddReadAttrs which marks
functions willreturn if it can prove that they will not
infinite loop.
(4) add a mayChangeControlFlow method which says that
calls may change control flow except if they are marked
both nounwind and willreturn.
(5) create a hasNoSideEffects method which checks
mayWriteToMemory (or whatever) and mayChangeControlFlow.
Replace isTriviallyDead with this, or use it in
isTriviallyDead.

Comment 10 Pekka Jääskeläinen 2008-10-14 07:49:09 PDT

This affects our embedded target's simulator where we stop simulation immediately when program enters _exit() from _start() with a crt0() like 
this:

void _start(void) __attribute__((noinline));
void _exit(int) __attribute__((noinline,noreturn));

void _start(void)
{
    _exit(main());
}

/* override normal exit. simulator should stop when _exit() is seen */
void _exit(int retval) {
    while(1); /* less instructions are generated, when this while exists */
}

When compiled with llvm-gcc -O0 this results in:

define void @_start() nounwind noinline {
entry:
        %0 = call i32 (...)* @main() nounwind           ; <i32> [#uses=1]
        call void @_exit(i32 %0) noreturn nounwind
        unreachable

return:         ; No predecessors!
        ret void
}

declare i32 @main(...)

define void @_exit(i32 %retval) noreturn nounwind noinline {
entry:
        %retval_addr = alloca i32               ; <i32*> [#uses=1]
        %"alloca point" = bitcast i32 0 to i32          ; <i32> [#uses=0]
        store i32 %retval, i32* %retval_addr
        br label %bb

bb:             ; preds = %bb, %entry
        br label %bb

return:         ; No predecessors!
        ret void
}

When compiling with -O2 the call to _exit() is removed and our simulator
flows from _start() after returning to it from main() which causes an infinite loop to main() (as it's the next function in memory) and not being able to stop simulation automatically:

define void @_start() nounwind noinline {
entry:
        %0 = tail call i32 (...)* @main() nounwind              ; <i32> [#uses=0]
        unreachable
}

define void @_exit(i32 %retval) noreturn nounwind readnone noinline {
entry:
        br label %bb

bb:             ; preds = %bb, %entry
        br label %bb
}

We can workaround this issue by compiling crt0() with -O0 so it's not a serious problem for us.

Comment 11 Nick Lewycky 2009-02-16 23:12:19 PST

*** Bug 3593 has been marked as a duplicate of this bug. ***

Comment 12 Nick Lewycky 2009-02-16 23:14:43 PST

Most functions that are obviously readnone/readonly will also be obviously halting. I think adding that annotation and a utility method for 'hasSideEffect' is the right way to go. Just make sure that it's invalid to declare noreturn+halting at the same time.

Comment 13 Duncan Sands 2009-11-04 04:16:35 PST

*** Bug 5387 has been marked as a duplicate of this bug. ***

Comment 14 John Regehr 2009-11-04 15:51:05 PST

As a random note, we recently started looking for wrongful-termination and wrongful-nontermination code generation bugs in LLVM, and ran across this.  

I don't want to be a whiner but since we hit this bug a lot, we have to avoid doing any more of this kind of testing until this problem is fixed.

Comment 15 Chris Lattner 2009-12-31 21:59:18 PST

*** Bug 5644 has been marked as a duplicate of this bug. ***

Comment 16 Evan Cheng 2010-09-07 17:28:09 PDT

rdar://8401758

Comment 17 Duncan Sands 2010-09-08 03:06:15 PDT

A fix for this was posted to the mailing list:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20100705/103672.html

Comment 18 Russell Harmon 2010-11-23 15:55:47 PST

(In reply to comment #17)
> A fix for this was posted to the mailing list:
> http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20100705/103672.html

The correct link is http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20100705/103670.html

Comment 19 Matti Niemenmaa 2011-09-04 13:51:49 PDT

*** Bug 10858 has been marked as a duplicate of this bug. ***

Comment 20 Benjamin Kramer 2012-02-19 04:46:02 PST

*** Bug 12033 has been marked as a duplicate of this bug. ***

Comment 21 Eli Friedman 2012-11-29 03:00:27 PST

*** Bug 14454 has been marked as a duplicate of this bug. ***

Comment 22 Richard Smith 2013-01-16 19:28:08 PST

*** Bug 14973 has been marked as a duplicate of this bug. ***

Comment 23 Matti Niemenmaa 2013-07-29 11:39:55 PDT

*** Bug 16733 has been marked as a duplicate of this bug. ***

Comment 24 Reid Kleckner 2016-11-30 14:40:34 PST

*** Bug 31217 has been marked as a duplicate of this bug. ***

Comment 25 Reid Kleckner 2016-11-30 14:48:18 PST

I think the last time we discussed this seriously was in July 2015:
http://lists.llvm.org/pipermail/llvm-dev/2015-July/088095.html

I have a different proposal. The forward progress required by C++ is the odd duck here. We should ask frontends to put a function attribute on functions that they want to receive this kind of optimization. It should be safe to remove this attribute when doing cross-language inlining during LTO. Passes like ADCE or functionattrs can power down when a function lacks this attribute, so we don't delete infinite loops or infer that infinite loop functions are readnone in C.

Comment 26 Dan Gohman 2017-09-27 15:09:10 PDT

I posted a patch implementing @llvm.sideeffect(): https://reviews.llvm.org/D38336

Comment 27 simonas+llvm.org 2019-02-17 04:19:01 PST

Light testing suggests that considering `noreturn` side-effectful and not infering `readnone`, `readonly` or `writeonly` attributes for such functions (in the derive function attributes pass) yields fairly good results and goes most of the way towards resolving this issue.

Would a patch implementing this idea be accepted?

Comment 28 Nick Lewycky 2019-02-17 14:14:43 PST

Simon, for your proposal to be correct you would have to treat all non-returning functions as side-effecting, but functions which lack the 'noreturn' attribute might not return. As is, your proposal would decrease the chance that a user encounters a miscompile, but not fix it entirely. I don't know whether llvm-commits would accept such a patch.

There's a more broad rule to follow when designing IR attributes. LLVM IR is designed to be cross-language at function call boundaries (and possibly within a function after inlining), so you shouldn't assume anything about a function's behaviour unless you can see the definition, or have an attribute on the declaration or call site. It follows that we have to make the most pessimistic assumptions in the absence of attributes. Function attributes should be designed to constrain the possible behaviour of the function, and we can only optimize in the presence of the attributes.

Reid's proposal is correct. We can keep 'noreturn' since it constrains the function (it must not return) while also adding a 'forwards-progress' which constrains the function to either perform a side-effect or return. A single function could have both attributes, for example a function containing the select loop of a server.

In C++ mode clang could add 'forwards-progress' on all function declarations except those marked extern "C".

Comment 29 Todd Snider 2019-07-19 14:23:29 PDT

Are there any plans to move ahead with Reid Kleckner's proposal to add a "forwards-progress" attribute to a C / C++ function?

I confess that I am one of the clang users/developers that would prefer that clang provide some way of leaving a noreturn function which contains only an infinite loop intact.

Some applications that we've encountered use such functions as "parking" functions where an error condition can go to wait until serviced by an interrupt.

We currently insert an asm("") statement into such functions to fake a side effect so that the optimizer will not remove calls in the same compilation unit to the "parking" function.

If the "forwards-progress" attribute that Reid proposes were implemented, it seems that if we want a "parking" function to stay intact in C++ mode, we'd still need some way to turn off the "forwards-progress" attribute, maybe a "waiting-for-service" attribute? Maybe an C/C++-callabel intrinsic that maps to the @llvm.side-effect() intinsic?

Comment 30 Florian Hahn 2020-08-17 10:59:31 PDT

IICU there is another push towards resolving this, by adding a `no progress` attribute https://reviews.llvm.org/D85393

Comment 31 David Blaikie 2020-08-17 11:11:17 PDT

*** Bug 47189 has been marked as a duplicate of this bug. ***

Comment 32 Nikita Popov 2021-01-24 04:57:55 PST

This issue is fixed in LLVM 12 by the introduction of the mustprogress attribute and loop metadata, inference of willreturn and finally limitation of call DCE to willreturn functions.

There might still be some incorrect assumptions left over here and there, but the core issue is fixed now, and we should track remaining issues separately.

Comment 33 Chris Lattner 2021-01-24 14:54:27 PST

Very nice, congratulations!

baldrick
bloodjazman
butchelor
chenyang
clattner
dhazeghi
dwelch
efriedma
evan.cheng
florian_hahn
geek4civic
id
josh
jryans
jyasskin
kalvdans
llvm-bugs
llvm
matti.niemenmaa+llvmbugs
nicholas
nikita.ppv
nlewycky
npmazzuca
pawel
pekka.jaaskelainen
penguinofthegods
post+llvm
programmerjake
quickslyver
regehr
rnk
russ
scoopr
simonas+llvm.org
t-snider
t.p.northover
troels.roennow
uenoku.tokotoko
wenhan.gu
yuanfang.chen