On Windows, most of Chromium builds with /O1 /Os (optimize for size), but some of it builds with /O2 /Ot (optimize for speed).

For this code:

    void f(int x, int y);
    void g(int x) { f(x, 1); }

clang -target i686-pc-win32 -Os generates:

    00000000 <?g@@YAXH@Z>:
       0: 6a 01                 push $0x1
       2: ff 74 24 08           pushl 0x8(%esp)
       6: e8 00 00 00 00        call b <?g@@YAXH@Z+0xb>
       b: 83 c4 08              add $0x8,%esp
       e: c3                    ret

but with -O2 it generates:

    00000000 <?g@@YAXH@Z>:
       0: 83 ec 08              sub $0x8,%esp
       3: 8b 44 24 0c           mov 0xc(%esp),%eax
       7: 89 04 24              mov %eax,(%esp)
       a: c7 44 24 04 01 00 00  movl $0x1,0x4(%esp)
      11: 00
      12: e8 00 00 00 00        call 17 <?g@@YAXH@Z+0x17>
      17: 83 c4 08              add $0x8,%esp
      1a: c3                    ret

Is the -O2 version actually faster by enough to justify the size increase here (27 bytes versus 15)?
I tend to agree. I originally wanted it for all opt levels, but there was some pushback because of potential performance concerns. And since my main concern was -Os, I just never got to benchmarking it for -O2. Adding Dave, who's probably a more appropriate contact for this than I am right now.
r264966