943 – Too much spill code generated after loop unrolling

Status:	RESOLVED FIXED

Alias:	None

Product:	libraries
Classification:	Unclassified
Component:	Common Code Generator Code
Version:	trunk
Hardware:	All All

Importance:	P normal
Assignee:	Chris Lattner

URL:
Keywords:	code-quality

Depends on:
Blocks:

Reported:	2006-10-11 18:48 PDT by Bill Wendling
Modified:	2010-03-06 14:00 PST
CC List:	1 user

See Also:
Fixed By Commit(s):

Attachments
Extracted function (721 bytes, application/octet-stream) 2006-10-11 18:50 PDT, Bill Wendling

Description Bill Wendling 2006-10-11 18:48:44 PDT

This code:

int value(uint64 b1, uint64 b2)
{
  int i, j, k;
  int value = 0;

  for (k = 0; k < 2; k++)
    for (i = 0; i < 6; i++)
      for (j = 0; j < 2; j++)
        if ((b2 & 0xf << (j + i * 6)) == 0xf << (j + i * 6))
          value += 1000;

  return value;
}

generates too many unnecessary spills/reloads when rerun through the optimizer (at -O2). See the lines marked "***" below:

LBB1_1: #bb17.preheader
        movl $15, %eax
***     movb 39(%esp), %cl
        movl %eax, %edx
        shll %cl, %edx
***     movb 39(%esp), %cl
        incb %cl
        shll %cl, %eax
        movl %edx, %ecx

As a result, the loop runs slower than it does when the optimizer is run only once.
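
For anyone trying to compile the function outside its original source file, a self-contained variant might look like the sketch below. It is not the attached testcase. The uint64 typedef is an assumption (the report does not show how the type is defined), and the unsigned literal 0xfu is also an assumption: it keeps the shift 32 bits wide, as in the shll instructions above, while avoiding signed overflow when the shift count reaches 30 or 31. A 32-bit int is assumed throughout.

```c
#include <stdint.h>

/* Assumption: the report does not show this typedef. */
typedef uint64_t uint64;

/* Same loop nest as in the report. b1 and k are unused in the original as
   well; they are kept so the signature and loop structure stay the same. */
int value(uint64 b1, uint64 b2)
{
  int i, j, k;
  int value = 0;

  (void)b1; /* silence the unused-parameter warning */

  for (k = 0; k < 2; k++)
    for (i = 0; i < 6; i++)
      for (j = 0; j < 2; j++) {
        /* Assumption: an unsigned 32-bit mask instead of the original int. */
        unsigned int mask = 0xfu << (j + i * 6);
        if ((b2 & mask) == mask)
          value += 1000;
      }

  return value;
}
```

Because the mask is unsigned here, results can differ from the original for inputs with bits set above bit 31, so this sketch is only meant for inspecting the generated shift code, not for checking return values against the original program.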

Comment 1 Bill Wendling 2006-10-11 18:50:03 PDT

Created attachment 409
Extracted function

Comment 2 Chris Lattner 2006-10-11 20:46:14 PDT

Taking a look.

Comment 3 Chris Lattner 2006-10-11 21:36:18 PDT

Fixed.  Patch here:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20061009/038511.html

The diff of produced code:

--- t.s 2006-10-11 18:57:23.000000000 -0700
+++ t2.s        2006-10-11 19:29:35.000000000 -0700
@@ -22,7 +22,6 @@
        movb 39(%esp), %cl
        movl %eax, %edx
        shll %cl, %edx
-       movb 39(%esp), %cl
        incb %cl
        shll %cl, %eax
        movl %edx, %ecx
@@ -73,7 +72,6 @@
        movb 15(%esp), %cl
        movl %esi, %eax
        shll %cl, %eax
-       movb 15(%esp), %cl
        incb %cl
        shll %cl, %esi
        movl %eax, %ecx


-Chris

Comment 4 Bill Wendling 2006-10-11 22:38:54 PDT

Rockin'!

Comment 5 Bill Wendling 2006-10-12 17:10:42 PDT

This sped up the fourinarow test point on x86. It now has a smaller gap (~0.5s instead of ~0.9s) between the times.