Created attachment 10266 [details] C source file, kernel operations on 2D image While opt -polly-codegen -polly-opt-isl -polly-ignore-aliasing kernel.ll -polly-detect-only=apply_kernel_3x3 works fine, opt -polly-codegen -polly-opt-isl -polly-ignore-aliasing kernel.ll -polly-detect-only=apply_kernel_5x5 seems to make no progress at all. Polly git: 331e829324157cca54491d268410f8ee2d33cd65 Cloog git: c7721fc941db89dd1afc6240eaceea46d0bcad17 isl git: 9f82ab3cd18ac34f883c30594111e4eb17426e11 kernel.ll is generated by: clang -emit-llvm -S kernel.c -o kernel.bc opt -S kernel.bc -polly-prepare -mem2reg -polly-indvars -o kernel.ll Stacktrace after running opt for a while: ============================================================== #0 0x00007ffff65bb724 in isl_seq_combine () from /home/johannes/git/POLLY/install/lib/libisl.so.10 #1 0x00007ffff65bff9a in isl_tab_add_row () from /home/johannes/git/POLLY/install/lib/libisl.so.10 #2 0x00007ffff65c0202 in isl_tab_add_ineq () from /home/johannes/git/POLLY/install/lib/libisl.so.10 #3 0x00007ffff6588509 in basic_map_collect_diff.part.1 () from /home/johannes/git/POLLY/install/lib/libisl.so.10 #4 0x00007ffff6588b3c in map_subtract () from /home/johannes/git/POLLY/install/lib/libisl.so.10 #5 0x00007ffff6552adf in isl_access_info_compute_flow () from /home/johannes/git/POLLY/install/lib/libisl.so.10 #6 0x00007ffff65537f3 in compute_flow () from /home/johannes/git/POLLY/install/lib/libisl.so.10 #7 0x00007ffff655b109 in isl_hash_table_foreach () from /home/johannes/git/POLLY/install/lib/libisl.so.10 #8 0x00007ffff65d42ed in isl_union_map_foreach_map () from /home/johannes/git/POLLY/install/lib/libisl.so.10 #9 0x00007ffff6553bd0 in isl_union_map_compute_flow () from /home/johannes/git/POLLY/install/lib/libisl.so.10 #10 0x00007ffff68fd8e3 in polly::Dependences::calculateDependences(polly::Scop&) () from install/lib/LLVMPolly.so #11 0x00007ffff68fdba4 in polly::Dependences::runOnScop(polly::Scop&) () from install/lib/LLVMPolly.so #12 0x00007ffff691e946 in polly::ScopPass::runOnRegion(llvm::Region*, llvm::RGPassManager&) () from install/lib/LLVMPolly.so #13 0x000000000151ea44 in llvm::RGPassManager::runOnFunction(llvm::Function&) () #14 0x000000000167cf29 in llvm::FPPassManager::runOnFunction(llvm::Function&) () #15 0x000000000167d11a in llvm::FPPassManager::runOnModule(llvm::Module&) () #16 0x000000000167d477 in llvm::MPPassManager::runOnModule(llvm::Module&) () #17 0x000000000167da79 in llvm::PassManagerImpl::run(llvm::Module&) () #18 0x000000000167dc8b in llvm::PassManager::run(llvm::Module&) () #19 0x00000000008bdf05 in main ()
Created attachment 10267 [details] LLVM-IR kernel file
The kernel looks as follows (the 5x5 is similar) void apply_kernel_3x3(double* u, double* dx, double* dy, long nx, long ny) { long i,j; for (i = 1; i < nx - 1; i++) { for (j = 1; j < ny - 1; j++) { dx[i*ny+j] = u[(i-1)*ny+j-1] + 2 * u[(i-1)*ny+j] + u[(i-1)*ny+j+1]; dx[i*ny+j] += - u[(i+1)*ny+j-1] - 2 * u[(i+1)*ny+j] - u[(i+1)*ny+j+1]; dy[i*ny+j] = - u[(i-1)*ny+j-1] - 2 * u[(i)*ny+j-1] - u[(i+1)*ny+j-1]; dy[i*ny+j] += u[(i-1)*ny+j+1] + 2 * u[(i)*ny+j+1] + u[(i+1)*ny+j+1]; } } } Due to the linearized variable sized arrays we get expressions such as i * ny. Those expressions can not be expressed in the polyhedral model. As a result we only detect the j-loop as a SCoP and hide the i * ny style expressions in scop invariant parameters. This has two problems: 1) We do not detect the full kernel If we would understand the multi-dimensionality of the array references, we could detect the full kernel and could analyze it in detail 2) Large number of parameters causes slow downs Without delinearization we get a large number of independent parameters. With increasing dimensionality the number of parameters grows fast, such that even at 3D the dependence analysis becomes slow. There are several solutions: 1) Use fixed size arrays This makes the problem go away and allows us to fully optimize the kernel 2) Make Polly understand the multi-dimensionality of the arrays This needs some work in Polly, but it would be the optimal solution as we should both be fast and precise after detecting the multi-dimensionality. 3) Fix the slow analysis We should find a way to either do our analysis fast or just bail out when we realize we can not do anything useful here. Cheers, Tobias
I just checked with 197560. The number of parameters does not explode any more and dependence analysis is now performed very quick (< 0.1 sec). We still do not detect the whole scop. This could be fixed with Sebastian's delinearization pass. However, as the bug report was about the slowness, we close it now. Please open a new bug report to track the delinearization issue.
Move to Polly Product.