
Miscompile lencod (after Sanjoy's SCEV changes?) #28318

Closed
tobiasgrosser opened this issue May 31, 2016 · 7 comments
Labels
bugzilla (issues migrated from Bugzilla), polly

Comments

@tobiasgrosser
Contributor

Bugzilla Link 27944
Resolution FIXED
Resolved on Jul 11, 2016 07:21
Version unspecified
OS Linux
CC @hfinkel, @jdoerfert, @sanjoy

Extended Description

It seems that after Sanjoy's SCEV changes, which happened between 115336 and 115351, a miscompile was introduced, showing up as FAIL: lencod.execution_time

http://lab.llvm.org:8011/builders/perf-x86_64-penryn-O3-polly-before-vectorizer-unprofitable/builds/3034

Due to the build breakage that happened before, it is not easily possible to track down the precise version.

@sanjoy
Contributor

sanjoy commented May 31, 2016

Hi Tobias,

It looks like it broke earlier, in http://lab.llvm.org:8011/builders/perf-x86_64-penryn-O3-polly-before-vectorizer-unprofitable/builds/3026/, and the only SCEV change in that build is 271151. I didn't notice this earlier because I thought this too was a failed build due to removing the apply method.

Do you know why the test is failing? Is it a performance regression, or is the benchmark infinite looping?

Finally, can you point me to some instructions on how I can reproduce this?

@tobiasgrosser
Contributor Author

Hi Sanjoy,

you are right. This is really just the one commit. I tried to revert it locally, but unfortunately I already get conflicts. The command to try this locally should be:

lnt runtest nt --sandbox /tmp --cc llvm_build/bin/clang-3.9 --test-suite test-suite/ --only-test MultiSource/Applications/JM/lencod --mllvm=-polly --mllvm=-polly-position=before-vectorizer

I am currently traveling, so I won't be able to debug this until next week. As the SCEV nsw stuff has many corner cases, it might very well be that this just triggers a Polly bug. So don't feel obliged to debug this through. However, if you have some insights, I would be glad to hear them.

Best,
Tobias

@tobiasgrosser
Contributor Author

*** Bug llvm/llvm-bugzilla-archive#27980 has been marked as a duplicate of this bug. ***

@tobiasgrosser
Contributor Author

@​Johannes, this bug seems to be in code which you know best. Could you possibly have a look?

OK. This is really a nasty bug, but I can reproduce it. I first tried to create a big .bc file and then just wanted to run bugpoint to reduce it. Unfortunately, this did not work for two reasons: 1) opt and llc/lli took minutes to run this test case even once, so everything took ages. 2) When specifying the individual passes one-by-one instead of using -O3, the bug disappears. :( So I had to resort to manually bisecting this bug. After a couple of hours of debugging, this is what I got:

  1. https://llvm.org/svn/llvm-project/llvm/trunk@271151 is indeed the change that triggers this bug.

  2. There is only a single miscompile:

File: block.c
Function: dct_chroma
Region: %for.body660---%if.end1256

  3. The bug disappears if I add -polly-invariant-load-hoisting=false

Before Sanjoy's patch we get:

Invariant Accesses: {
        ReadAccess :=	[Reduction Type: NONE] [Scalar: 0]
            [qp_per_dc.0] -> { Stmt_if_else764[i0] -> MemRef_197[0] };
        Execution Context: [qp_per_dc.0] -> {  :  }
        ReadAccess :=	[Reduction Type: NONE] [Scalar: 0]
            [qp_per_dc.0] -> { Stmt_for_inc826_for_body660_crit_edge[i0] -> MemRef_dct_chroma_m4[1, 0] };
        Execution Context: [qp_per_dc.0] -> {  :  }
        ReadAccess :=	[Reduction Type: NONE] [Scalar: 0]
            [qp_per_dc.0] -> { Stmt_for_inc826_for_body660_crit_edge[i0] -> MemRef_dct_chroma_m4[1, 2] };
        Execution Context: [qp_per_dc.0] -> {  :  }
        ReadAccess :=	[Reduction Type: NONE] [Scalar: 0]
            [qp_per_dc.0] -> { Stmt_for_inc826_for_body660_crit_edge[i0] -> MemRef_dct_chroma_m4[1, 1] };
        Execution Context: [qp_per_dc.0] -> {  :  }
        ReadAccess :=	[Reduction Type: NONE] [Scalar: 0]
            [qp_per_dc.0] -> { Stmt_for_inc826_for_body660_crit_edge[i0] -> MemRef_dct_chroma_m4[1, 3] };
        Execution Context: [qp_per_dc.0] -> {  :  }
}

After Sanjoy's patch we get:

Invariant Accesses: {
        ReadAccess :=	[Reduction Type: NONE] [Scalar: 0]
            [qp_per_dc.0] -> { Stmt_if_else764[i0] -> MemRef_197[0] };
        Execution Context: [qp_per_dc.0] -> {  :  }
        ReadAccess :=	[Reduction Type: NONE] [Scalar: 0]
            [qp_per_dc.0] -> { Stmt_for_inc826_for_body660_crit_edge[i0] -> MemRef_dct_chroma_m4[1, 0] };
        Execution Context: [qp_per_dc.0] -> {  : 1 = 0 }
        ReadAccess :=	[Reduction Type: NONE] [Scalar: 0]
            [qp_per_dc.0] -> { Stmt_for_inc826_for_body660_crit_edge[i0] -> MemRef_dct_chroma_m4[1, 2] };
        Execution Context: [qp_per_dc.0] -> {  : 1 = 0 }
        ReadAccess :=	[Reduction Type: NONE] [Scalar: 0]
            [qp_per_dc.0] -> { Stmt_for_inc826_for_body660_crit_edge[i0] -> MemRef_dct_chroma_m4[1, 1] };
        Execution Context: [qp_per_dc.0] -> {  : 1 = 0 }
        ReadAccess :=	[Reduction Type: NONE] [Scalar: 0]
            [qp_per_dc.0] -> { Stmt_for_inc826_for_body660_crit_edge[i0] -> MemRef_dct_chroma_m4[1, 3] };
        Execution Context: [qp_per_dc.0] -> {  : 1 = 0 }
}

Without Load Hoisting we get:

Invariant Accesses: {                                                        
}                                                                            
Context:                                                                     
[qp_per_dc.0] -> {  : -2147483648 <= qp_per_dc.0 <= 2147483647 }             
Assumed Context:                                                             
[qp_per_dc.0] -> {  :  }                                                     
Invalid Context:                                                             
[qp_per_dc.0] -> {  : 1 = 0 }                                                
p0: %qp_per_dc.0     

How to reproduce the bug:

Go to /home/grosser/Projects/polly/test-suite/MultiSource/Applications/JM/lencod and run:

for i in `ls *.c`; do echo $i; /home/grosser/tmp/llvm-project/llvm-project/build/bin/clang -O3 -mllvm -polly -mllvm -polly-process-unprofitable -mllvm -polly-position=before-vectorizer $i -c -o $i.o; done && clang *.o -o executable && ./executable -d data/encoder_small.cfg -p InputFile=data/foreman_part_qcif_444.yuv -p LeakyBucketRateFile=data/leakybucketrate.cfg -p QmatrixFile=data/q_matrix.cfg > executable.out && diff lencod.reference_output.small executable.out

@tobiasgrosser
Contributor Author

Reduced test case
Hi guys,

I just managed to reduce this test case further. Interestingly, the new test case failed already before Sanjoy's patch.

I see the following with latest Polly (274430):

$ opt test.ll -polly-scops -polly-process-unprofitable -polly-invariant-load-hoisting=false -analyze

Invariant Accesses: {
}
Context:
{  :  }
Assumed Context:
{  :  }
Invalid Context:
{  : 1 = 0 }
Alias Groups (0):
    n/a
Statements {
	Stmt_loop

...
Stmt_loop_next
Domain :=
{ Stmt_loop_next[0] };
Schedule :=
{ Stmt_loop_next[i0] -> [0, 1] };
MustWriteAccess := [Reduction Type: NONE] [Scalar: 1]
{ Stmt_loop_next[i0] -> MemRef_val__phi[] };
ReadAccess := [Reduction Type: NONE] [Scalar: 0]
{ Stmt_loop_next[i0] -> MemRef_a[1] };
}

$ opt test.ll -polly-scops -polly-process-unprofitable -analyze
Invariant Accesses: {
ReadAccess := [Reduction Type: NONE] [Scalar: 0]
{ Stmt_loop_next[i0] -> MemRef_a[1] };
Execution Context: { : 1 = 0 }
}
Context:
{ : }
Assumed Context:
{ : }
Invalid Context:
{ : 1 = 0 }
Statements {
Stmt_loop
...
Stmt_loop_next
Domain :=
{ Stmt_loop_next[0] };
Schedule :=
{ Stmt_loop_next[i0] -> [0, 1] };
MustWriteAccess := [Reduction Type: NONE] [Scalar: 1]
{ Stmt_loop_next[i0] -> MemRef_val__phi[] };
}

Even though Stmt_loop_next is always executed, it has an empty execution context when hoisted. This seems incorrect.

It seems the reason is that the subscript expression of the hoisted memory access is "{0,+,4}<%loop>" and does not carry nsw/nuw annotations. As a result, the invalid context computed in MemoryAccess::getPwAff is non-empty, namely '{ [i0] : i0 <= -2305843009213693953 or i0 >= 2305843009213693952 }', and, when projected onto the parameters, it indicates that any execution of this memory access is invalid and consequently won't happen.

It seems we have a mismatch here. The invalid-context of the entire scop still allows the scop to be executed while we assume locally that some parts won't be executed.
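
To make the projection step concrete, here is a small standalone sketch using the isl C API (not part of the original report; the set string is copied from the invalid context above, and the printed result reflects my reading of the situation):

/* Sketch only: project the per-access invalid set onto the (here
 * zero-dimensional) parameter space, as described above. */
#include <isl/ctx.h>
#include <isl/set.h>
#include <stdio.h>
#include <stdlib.h>

int main(void) {
  isl_ctx *ctx = isl_ctx_alloc();

  /* Invalid set computed for the access with subscript {0,+,4}<%loop>
   * that lacks nsw/nuw flags. */
  isl_set *Invalid = isl_set_read_from_str(
      ctx, "{ [i0] : i0 <= -2305843009213693953 or "
           "i0 >= 2305843009213693952 }");

  /* Project out the instance dimension i0; only parameters remain. */
  isl_set *Params = isl_set_params(isl_set_copy(Invalid));

  /* Prints "{  :  }": every parameter value is marked invalid, so the
   * execution context of the hoisted load collapses to { : 1 = 0 }. */
  char *Str = isl_set_to_str(Params);
  printf("%s\n", Str);

  free(Str);
  isl_set_free(Params);
  isl_set_free(Invalid);
  isl_ctx_free(ctx);
  return 0;
}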

The following patch fixes this bug as well as the LNT bug we see in MultiSource/Applications/JM/lencod:

--- a/lib/Analysis/ScopInfo.cpp
+++ b/lib/Analysis/ScopInfo.cpp
@@ -869,7 +869,9 @@ void MemoryAccess::dump() const { print(errs()); }
 __isl_give isl_pw_aff *MemoryAccess::getPwAff(const SCEV *E) {
   auto *Stmt = getStatement();
   PWACtx PWAC = Stmt->getParent()->getPwAff(E, Stmt->getEntryBlock());
 
-  InvalidDomain = isl_set_union(InvalidDomain, PWAC.second);
+  isl_set *StmtDom = isl_set_reset_tuple_id(getStatement()->getDomain());
+  isl_set *NewInvalidDom = isl_set_intersect(StmtDom, PWAC.second);
+  InvalidDomain = isl_set_union(InvalidDomain, NewInvalidDom);
   return PWAC.first;
 }

It restricts the invalid domain to the statement instances that are actually executed, which in this case is enough to fix the issue: the only value of i0 that is ever executed is i0 = 0, and the problematic values of i0 are larger.
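
To illustrate why this intersection is sufficient here, the following standalone sketch (again not part of the original report) mimics the patch with the isl C API: it restricts the invalid set to the single instance that actually runs and checks that nothing remains:

#include <isl/ctx.h>
#include <isl/set.h>
#include <stdio.h>

int main(void) {
  isl_ctx *ctx = isl_ctx_alloc();

  /* Per-access invalid set for the hoisted load (from the dump above). */
  isl_set *Invalid = isl_set_read_from_str(
      ctx, "{ [i0] : i0 <= -2305843009213693953 or "
           "i0 >= 2305843009213693952 }");

  /* Stmt_loop_next runs exactly once, for i0 = 0; with its tuple id
   * reset (as the patch does) the domain is just { [0] }. */
  isl_set *Domain = isl_set_read_from_str(ctx, "{ [0] }");

  /* Restrict the invalid set to iterations that are actually executed. */
  isl_set *Restricted = isl_set_intersect(Invalid, Domain);

  /* Prints "empty: 1": the problematic iterations never run, so the
   * accumulated invalid domain stays empty and the execution context of
   * the hoisted load is no longer forced to { : 1 = 0 }. */
  printf("empty: %d\n", isl_set_is_empty(Restricted));

  isl_set_free(Restricted);
  isl_ctx_free(ctx);
  return 0;
}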

It would probably also make sense to collect the invalid context from the MemoryAccesses to ensure we won't get into a situation again where these invalid contexts are used, but not enforced by our run-time checks.

Johannes, any thoughts on this?

Best,
Tobias

@tobiasgrosser
Contributor Author

I committed the fix proposed above in r275053. This addresses the most obvious issues and should make the buildbots green again.

We may still need to review the load hoisting code more carefully. The fact that the run-time check we construct does not take into account the invalid-assumptions we derive for memory accesses looks rather fragile and would clearly benefit from some further review and testing. Unfortunately, Johannes is currently not available to help with these issues.

@tobiasgrosser
Contributor Author

mentioned in issue llvm/llvm-bugzilla-archive#27980

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 10, 2021
This issue was closed.