2013 LLVM Developers' Meeting
The meeting serves as a forum for LLVM,
Clang, LLDB and
other LLVM project developers and users to get acquainted, learn how LLVM is used, and
exchange ideas about LLVM and its (potential) applications. More broadly, we
believe the event will be of particular interest to the following people:
- Active developers of projects in the LLVM umbrella
(LLVM core, Clang, LLDB, libc++, compiler-rt, klee, dragonegg, lld, etc.).
- Anyone interested in using these as part of another project.
- Compiler, programming language, and runtime enthusiasts.
- Those interested in using compiler and toolchain technology in novel
and interesting ways.
We also invite you to sign up for the official Developer Meeting mailing list to be kept informed of updates concerning the meeting.
November 7 - Meeting Agenda
|LLVM: 10 years and going strong|
Chris Lattner, Apple
Vikram Adve, University of Illinois, Urbana-Champaign
Alon Zakai, Mozilla
|Code Size Reduction using Similar Function Merging|
Tobias Edler von Koch, University of Edinburgh / QuIC
Pranav Bhandarkar, QuIC
|BOF: Performance Tracking & Benchmarking Infrastructure|
Kristof Beyls, ARM
|Julia: An LLVM-based approach to scientific computing|
Keno Fischer, Harvard College/MIT CSAIL
|Verifying optimizations using SMT solvers|
Nuno Lopes, INESC-ID / U. Lisboa
Mihail Popa, ARM
|New Address Sanitizer Features|
Alexey Samsonov, Google
|A Detailed Look at the R600 Backend|
Tom Stellard, Advanced Micro Devices Inc.
|BOF: Debug Info|
Eric Christopher, Google
|Developer Toolchain for the PlayStation®4|
Paul T. Robinson, Sony Computer Entertainment America
|Annotations for Safe Parallelism in Clang|
Alexandros Tzannes, University of Illinois, Urbana-Champaign
|BOF: Extending the Sanitizer tools and porting them to other platforms|
Kostya Serebryany, Google
Alexey Samsonov, Google
Evgeniy Stepanov, Google
|Vectorization in LLVM|
Nadav Rotem, Apple
Arnold Schwaighofer, Apple
|Bringing clang and LLVM to Visual C++ users|
Reid Kleckner, Google
|BOF: High Level Loop Optimization / Polly|
Tobias Grosser, INRIA
Sebastian Pop, QuIC
Zino Benaissa, QuIC
|Building a Modern Database with LLVM|
Skye Wanderman-Milne, Cloudera
|Adapting LLDB for your hardware: Remote Debugging the Hexagon DSP|
Colin Riley, Codeplay
|BOF: Optimizations using LTO|
Zino Benaissa, QuIC
Tony Linthicum, QuIC
|PGO in LLVM: Status and Current Work|
Bob Wilson, Apple
Chandler Carruth, Google
Diego Novillo, Google
|Lightning Talks|
|BOF: JIT & MCJIT|
Andy Kaylor, Intel Corporation
LLVM: 10 years and going strong
Chris Lattner - Apple,
Vikram Adve - University of Illinois, Urbana-Champaign
Keynote talk celebrating the 10th anniversary of LLVM 1.0.
Alon Zakai - Mozilla
Code Size Reduction using Similar Function Merging
Tobias Edler von Koch - University of Edinburgh / QuIC, Pranav Bhandarkar - QuIC
Code size reduction is a critical goal for compiler optimizations targeting embedded applications. While LLVM continues to improve its performance optimization capabilities, it is currently still lacking a robust set of optimizations specifically targeting code size. In our talk, we will describe an optimization pass that aims to reduce code size by merging similar functions at the IR level. Significantly extending the existing MergeFunctions optimization, the pass is capable of merging multiple functions even if there are minor differences between them. A number of heuristics are used to determine when merging of functions is profitable. Alongside hash tables, these also ensure that compilation time remains at an acceptable level. We will describe our experience of using this new optimization pass to reduce the code size of a significant embedded application at Qualcomm Innovation Center by 2%.
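As a sketch of the idea (in Python rather than LLVM IR, with hypothetical function names), merging two similar functions amounts to parameterizing their difference and keeping thin wrappers for the original entry points:

```python
# Two "similar" functions that differ only in a constant.
def scale_by_3(values):
    return [v * 3 for v in values]

def scale_by_5(values):
    return [v * 5 for v in values]

# Merged form: the differing constant becomes an extra parameter, and
# thin wrappers preserve the original entry points (much as the IR-level
# pass would emit thunks calling a single merged body).
def _scale_merged(values, factor):
    return [v * factor for v in values]

def scale_by_3_merged(values):
    return _scale_merged(values, 3)

def scale_by_5_merged(values):
    return _scale_merged(values, 5)
```

Only the merged body and two short thunks need to be emitted, which is where the code-size saving comes from when the bodies are large.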
Julia: An LLVM-based approach to scientific computing
Keno Fischer - Harvard College/MIT CSAIL
Julia is a new high-level dynamic programming language specifically designed for
scientific and technical computing, while at the same time not ignoring the
need for the expressiveness and power of a modern general-purpose language.
Thanks to LLVM's JIT compilation capabilities, around which Julia was designed
from the ground up, Julia can achieve a level of performance usually reserved
for compiled programs written in C, C++ or other compiled languages. It thus
manages to bridge the gap between very high-level languages such as MATLAB, R or
Python, usually used for algorithm prototyping, and the languages used when
performance is of the essence, reducing development time and the possibility of
subtle differences between the prototype and the production algorithms.
Verifying optimizations using SMT solvers
Nuno Lopes - INESC-ID / U. Lisboa
Instcombine and Selection DAG optimizations, although usually simple, can easily hide bugs.
We've had many cases in the past where these optimizers were producing wrong code in certain corner cases.
In this talk I'll describe a way to prove the correctness of such optimizations using an off-the-shelf SMT solver (bit-vector theory). I'll give examples of past bugs found in these optimizations, how to encode them in the SMT-LIB 2 format, and how to spot the bugs.
The encoding to the SMT format, although manual, is straightforward and consumes little time. The verification is then automatic.
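The talk's approach relies on an SMT solver over the bit-vector theory; as a self-contained stand-in, the sketch below checks candidate rewrites exhaustively over all 8-bit values, which poses the same correctness question at a toy bit-width (the rewrites shown are illustrative, not the talk's examples):

```python
MASK = 0xFF  # model an i8 value with wraparound semantics

def before(x):
    # (x ^ -1) + 1, i.e. bitwise-not then increment
    return ((x ^ MASK) + 1) & MASK

def after(x):
    # proposed replacement: 0 - x (negation)
    return (0 - x) & MASK

def verify(f, g):
    """Return a counterexample input, or None if f == g on all i8 values."""
    for x in range(256):
        if f(x) != g(x):
            return x
    return None

def buggy(x):
    # a bogus rewrite of the identity: (x >> 1) << 1 drops the low bit
    return ((x >> 1) << 1) & MASK
```

Here `verify(before, after)` returns `None` (the two's-complement rewrite is correct for i8), while `verify(lambda x: x, buggy)` returns `1`, the smallest input on which the bogus rewrite disagrees; an SMT solver answers the same question symbolically, at full bit-width.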
New Address Sanitizer Features
Kostya Serebryany - Google,
Alexey Samsonov - Google
AddressSanitizer is a fast memory error detector that uses LLVM for compile-time instrumentation. In this talk we will present several new features in AddressSanitizer.
- Initialization order checker finds bugs where the program behavior depends on the order in which global variables from different modules are initialized.
- Stack-use-after-scope detector finds uses of stack-allocated objects outside of the scope where they are defined.
- Similarly, stack-use-after-return detector finds uses of stack variables after the functions they are defined in have exited.
- LeakSanitizer finds heap memory leaks; it is built on top of the AddressSanitizer memory allocator.
- We will also give an update on AddressSanitizer for the Linux kernel.
A Detailed Look at the R600 Backend
Tom Stellard - Advanced Micro Devices Inc.
The R600 backend, which targets AMD GPUs, was merged into LLVM prior to
the 3.3 release. It is one component of AMD's open source GPU drivers
which provide support for several popular graphics and compute APIs.
The backend supports two different generations of GPUs: the older
VLIW4/VLIW5 architecture and the more recent GCN architecture. In this
talk, I will discuss the history of the R600 backend, how it is used,
and why we chose to use LLVM for our open source drivers. Additionally,
I'll give an in-depth look at the backend and its features and present an
overview of the unique architecture of supported GPUs. I will describe
the challenges this architecture presented in writing an LLVM backend and
the approaches we have taken for instruction selection and scheduling.
I will also look at the future goals for this backend and areas for
improvement in the backend as well as core LLVM.
Developer Toolchain for the PlayStation®4
Paul T. Robinson - Sony Computer Entertainment America
The PlayStation®4 has a developer toolchain centered on Clang as the CPU compiler. We describe how Clang/LLVM fits into Sony Computer Entertainment's (mostly proprietary) toolchain, focusing on customizations, game-developer experience, and working with the open-source community.
Annotations for Safe Parallelism in Clang
Alexandros Tzannes -
University of Illinois, Urbana-Champaign
The Annotations for Safe Parallelism (ASaP) project at UIUC is implementing a static checker in Clang to allow writing provably safe parallel code. ASaP is inspired by DPJ (Deterministic Parallel Java) but, unlike it, does not extend the base language. Instead, we rely on the rich C++11 attribute system to enrich C++ types and to pass information to our ASaP checker. The ASaP checker gives strong guarantees such as race freedom, *strong* atomicity, and deadlock freedom for commonly used parallelism patterns, and it is at the prototype stage, where we can prove the parallel safety of simple TBB programs. We are evolving ASaP in collaboration with our Autodesk partners, who help guide its design in order to solve increasingly complex problems faced by real software teams in industry. In this presentation, I will give an overview of how the checker works, what is currently supported, what we have "in the works", and some discussion about incorporating ideas from the thread safety annotations to assist our analysis.
Vectorization in LLVM
Nadav Rotem - Apple, Arnold Schwaighofer - Apple
Vectorization is a powerful optimization that can accelerate programs in multiple domains. Over the last year two new vectorization passes were added to LLVM: the Loop-vectorizer, which vectorizes loops, and the SLP-vectorizer, which combines independent scalar calculations into a vector. Both of these optimizations together show a significant performance increase on many applications. In this talk we’ll present our work on the vectorizers in the past year. We’ll discuss the overall architecture of these passes, the cost model for deciding when vectorization is profitable, and describe some interesting design tradeoffs. Finally, we want to talk about some ideas to further improve the vectorization infrastructure.
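As a rough illustration of what the loop vectorizer does (a toy Python model with an assumed vectorization factor of 4, not LLVM's implementation), a scalar loop is rewritten into a wide vector body plus a scalar epilogue for the leftover iterations:

```python
def saxpy_scalar(a, xs, ys):
    # the original scalar loop: one multiply-add per element
    return [a * x + y for x, y in zip(xs, ys)]

VF = 4  # hypothetical vectorization factor

def vec_fma(a, xv, yv):
    # stands in for a single 4-wide fused multiply-add instruction
    return [a * x + y for x, y in zip(xv, yv)]

def saxpy_vectorized(a, xs, ys):
    out = []
    n = len(xs)
    main = n - n % VF
    # vector body: one "wide instruction" per chunk of VF elements
    for i in range(0, main, VF):
        out.extend(vec_fma(a, xs[i:i + VF], ys[i:i + VF]))
    # scalar epilogue: handle the remainder, as the vectorizer's tail loop does
    for i in range(main, n):
        out.append(a * xs[i] + ys[i])
    return out
```

The cost model's job is to decide whether the vector body (plus epilogue overhead) actually beats the scalar loop on the target.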
Bringing clang and LLVM to Visual C++ users
Reid Kleckner - Google
This talk covers the work we've been doing to help make clang and LLVM more
compatible with Microsoft's Visual C++ toolchain. With a compatible toolchain,
we can deliver all of the features that clang and LLVM have to offer, such as
AddressSanitizer. Perhaps the most important point of compatibility is the C++
ABI, which is a huge and complicated beast that covers name mangling, calling
conventions, record layout, vtable layout, virtual inheritance, and more. This
talk will go into detail about some of the more interesting parts of the ABI.
Building a Modern Database with LLVM
Skye Wanderman-Milne - Cloudera
Cloudera Impala is a low-latency SQL query engine for Apache Hadoop. In order to achieve optimal CPU efficiency and query execution times, Impala uses LLVM to perform JIT code generation, taking advantage of query-specific information unavailable at compile time. For example, code generation allows us to remove many conditionals (and the associated branch misprediction overhead) necessary for handling multiple types, operators, functions, etc.; inline what would otherwise be virtual function calls; and propagate query-specific constants. These optimizations can improve overall query times by almost 3x.
In this talk, I'll outline the motivation for using LLVM within Impala and go over some examples and results of JIT optimizations we currently perform, as well as ones we'd like to implement in the future.
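The specialization idea can be sketched in Python (a hypothetical mini-engine, not Impala's actual code): a generic evaluator branches on the column type for every row, while a "generated" evaluator resolves the type once at query-preparation time, eliminating the per-row dispatch that JIT code generation removes:

```python
def eval_generic(rows, col_type):
    # Generic interpreter: the type check runs once per row.
    # This per-row branch is exactly what codegen eliminates.
    total = 0
    for v in rows:
        if col_type == "int":
            total += v
        elif col_type == "float":
            total += int(v)
    return total

def compile_eval(col_type):
    # "Code generation": resolve the column type once, when the query
    # is prepared, and return a specialized evaluator with no dispatch.
    if col_type == "int":
        return lambda rows: sum(rows)
    return lambda rows: sum(int(v) for v in rows)
```

Usage mirrors a JIT: `evaluator = compile_eval("int")` once per query, then `evaluator(rows)` per batch with no type branches inside the hot loop.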
Adapting LLDB for your hardware: Remote Debugging the Hexagon DSP
Colin Riley - Codeplay
LLDB is at the stage of development where support is being added for a wide range of hardware devices. Its modular approach means adapting it to debug a new system has a well-defined step-by-step process, which can progress fairly quickly. Presented is a guide of what implementation steps are required to get your hardware supported via LLDB using Remote Debugging, giving examples from work we are doing to support the Hexagon DSP within LLDB.
PGO in LLVM: Status and Current Work
Bob Wilson - Apple,
Chandler Carruth - Google,
Diego Novillo - Google
The lack of Profile Guided Optimization (PGO) is one of the most fundamental weaknesses in the LLVM optimization portfolio. We have had several attempts to build it, and to this day we still lack a holistic platform for driving optimizations through profiling. This talk will consist of three light-speed crash courses on where PGO is in LLVM, where it needs to be, and how several of us are working to get it there.
First, we will present some motivational background on what PGO is good for and what it isn't. We will cover exactly how profile information interacts with the LLVM optimizations, the strategies we use at a high level to organize and use profile information, and the specific optimizations that are in turn driven by it. Much of this will cover infrastructure as it exists today, with some forward-looking information added into the mix.
Next, we will cover one planned technique for getting profile information into LLVM: AutoProfile. This technique simplifies the use and deployment of PGO by using external profile sources such as Linux perf events or other sample-based external profilers. When available, it has some key advantages: no instrumentation build mode, reduced instrumentation overhead, and more predictable application behavior by using hardware to assist the profiling.
Finally, we will cover an alternate strategy to provide more traditional and detailed profiling through compiler-inserted instrumentation. This approach will also strive toward two fundamental goals: resilience of the profile to both source code and compiler changes, and visualization of the profile by developers to understand how their code is being exercised. The second draws obvious parallels with code coverage tools, and the design tries to unify these two use cases in a way that the same infrastructure can drive both.
Finding a few needles in some large haystacks: Identifying missing target optimizations using a superoptimizer
Hal Finkel - Argonne National Laboratory
So you're developing an LLVM backend, and you've added a bunch of TableGen patterns, custom DAG combines and other lowering code; are you done? This poster describes the development of a specialized superoptimizer, applied to the output of the compiler on large codebases, to look for missing optimizations in the PowerPC backend. This superoptimizer extracts potentially-interesting instruction sequences from assembly code, and then uses the open-source CVC4 SMT solver to search for provably-correct shorter alternatives.
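A miniature version of the idea, assuming an invented three-instruction toy ISA and exhaustive 8-bit checking in place of the CVC4 SMT solver, might look like:

```python
from itertools import product

MASK = 0xFF  # model 8-bit registers

# An invented toy instruction set (not PowerPC): each op maps i8 -> i8.
OPS = {
    "neg": lambda x: (-x) & MASK,
    "not": lambda x: (x ^ MASK),
    "inc": lambda x: (x + 1) & MASK,
}

def run(seq, x):
    # Execute an instruction sequence on a single register value.
    for op in seq:
        x = OPS[op](x)
    return x

def superoptimize(target_seq):
    """Find the shortest op sequence equivalent to target_seq on all i8 inputs."""
    for length in range(len(target_seq)):
        for cand in product(OPS, repeat=length):
            if all(run(cand, x) == run(target_seq, x) for x in range(256)):
                return list(cand)
    return list(target_seq)  # nothing shorter exists

# ["not", "inc"] computes -x by two's complement, so a single "neg" suffices:
# superoptimize(["not", "inc"]) == ["neg"]
```

The real tool searches over actual PowerPC sequences extracted from compiled codebases and discharges the equivalence check to CVC4 rather than enumerating inputs.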
Intel® AVX-512 Architecture. Comprehensive vector extension for HPC and enterprise
Elena Demikhovsky, Intel® Software and Services Group - Israel
Knights Landing (KNL) is the second generation of Intel® MIC architecture-based products. KNL will support the Intel® Advanced Vector Extensions 512 (AVX-512) instruction set architecture, a significant leap in SIMD support. This new ISA, designed with an unprecedented level of richness, offers a new level of support and opportunities for vectorizing compilers to target efficiently. The poster presents the Intel® AVX-512 ISA and shows how its new capabilities may be used in the LLVM compiler.
Fracture: Inverting the Target Independent Code Generator
Richard T. Carback III – Charles Stark Draper Laboratories
Fracture is a TableGen backend and associated library that ingests a basic block of target instructions and emits a DAG which resembles the post-legalization phase of LLVM’s SelectionDAG instruction selection process. It leverages the pre-existing target TableGen definitions, without modification, to provide a generic way to abstract LLVM IR efficiently from different target instruction sets. Fracture can speed up a variety of applications and also enable generic implementations of a number of static and dynamic analysis tools. Examples include interactive debuggers or disassemblers that provide LLVM IR representations to users unfamiliar with the instruction set, static analysis algorithms that solve indirect control transfer (ICT) problems modified for IR to use KLEE or other LLVM technologies, and IR-based decompilers or emulators extended to work on machine binaries.
Automatic generation of LLVM backends from LISA
Jeroen Dobbelaere - Synopsys
LISA (Language for Instruction-Set Architectures) allows for the efficient specification of processor architectures,
including non-standard, customized architectures. From a LISA input specification, designers can automatically
generate an instruction-set simulator, assembler, linker, and debugger interface, as well as RTL.
We have extended LISA to allow for the generation of an LLVM compiler backend tailored to the custom architecture.
This work includes the development of a new scheduler that is able to handle hazards with high latency and delay slots,
expanding the applicability of LLVM to a wider range of architectures. The LISA-based design flow allows for rapid
architectural exploration, profiling dozens of different processor architectures within hours, with the automatic
generation of an LLVM compiler being a key enabler of this design methodology.
clad - Automatic Differentiation with Clang
Violeta Ilieva (Princeton University), CERN; Vassil Vassilev, CERN
Automatic differentiation (AD) evaluates the derivative of a function specified in a computer program by applying a set of techniques to change the semantics of that function. Unlike other methods for differentiation, such as numerical and symbolic differentiation, AD yields machine-precision derivatives even of complicated functions, at relatively low processing and storage costs. We would like to present our AD tool, clad - a Clang plugin that differentiates C++ functions by performing source code transformation and employing the chain rule of differential calculus in its forward mode. That is, clad decomposes the original functions into elementary statements and generates their derivatives with respect to the user-defined independent variables. The combination of these intermediate expressions forms additional source code, built by modifying Clang's abstract syntax tree (AST) along the control flow. Compared to other tools, clad has the advantage of relying on the Clang and LLVM modules for parsing the original program. It uses Clang's plugin mechanism for constructing the derivative's AST representation, for generating executable code, and for performing global analysis. This results in low maintenance, high compatibility, and excellent performance.
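clad works by source transformation inside Clang; as a compact stand-in for the same forward-mode chain rule, the following sketch (plain Python, not clad) propagates (value, derivative) pairs through arithmetic:

```python
class Dual:
    """A value paired with its derivative with respect to one variable."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot

    def __add__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        # sum rule: (u + v)' = u' + v'
        return Dual(self.val + o.val, self.dot + o.dot)

    __radd__ = __add__

    def __mul__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        # product rule: (u * v)' = u' * v + u * v'
        return Dual(self.val * o.val, self.dot * o.val + self.val * o.dot)

    __rmul__ = __mul__

def derivative(f, x):
    """d f / d x at x, seeding the independent variable with dot = 1."""
    return f(Dual(x, 1.0)).dot

# f(x) = 3x^2 + 2x  =>  f'(x) = 6x + 2
f = lambda x: 3 * x * x + 2 * x
```

Forward mode, as in clad, computes elementary derivatives alongside the elementary statements of the function; clad does this by generating new C++ source for the derivative rather than by overloading at run time.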
Lightning Talk Abstracts
Fixing MC for ARM v7-A: Just a few corner cases – how hard can it be?
Mihail Popa - ARM
In 2012, MC Hammer was presented as a testing infrastructure to exhaustively verify the MC layer implementation for the ARM backend. Within ARM we have been working to fix any bugs, and we have reached the point where only one problem remains unsolved. Some of the issues discovered in this process have proven to be excessively difficult to fix. The purpose of the presentation is to give a brief rundown of the major headaches and to suggest possible courses of action for improving the LLVM infrastructure.
VLIW Support in the MC Layer
Mario Guerra - Qualcomm Innovation Center, Incorporated
Modern DSP architectures such as Hexagon use VLIW instruction packets, which are not well suited to the single instruction streaming model of the LLVM MC layer. Developing an assembler for Hexagon presents unique challenges in the MC layer, especially since Hexagon leverages an optimizing assembler to achieve maximum performance. It is possible to support VLIW within the MC layer by treating every MC instruction as a bundle, and adding all instructions in a packet as sub instruction operands. Furthermore, subclassing MCInst to create a target-specific type of MCInst allows us to capture packet information that will be used to make optimization decisions prior to emitting the code to object format.
Link-Time Optimization without Linker Support
Yunzhong Gao - Sony Computer Entertainment America
LLVM's plugin for the Gold linker enables link-time optimization (LTO). But the toolchain for PlayStation®4 does not include Gold. Here's how we achieved LTO without a bitcode-aware linker.
A comparison of the DWARF debugging information produced by LLVM and GCC
Keith Walker, ARM
This talk explores the quality of the DWARF debugging information generated by LLVM by
comparing it with that produced by GCC for ARM/AArch64 based targets. It highlights where LLVM's debugging information is superior to that generated by GCC
and also where there are deficiencies and scope for further development.
I will also explain how these differences translate into good or bad debug experiences
for users of LLVM.
AArch64 NEON work
Ana Pazos - QuIC, Jiangning Liu - ARM
ARM and Qualcomm are implementing the AArch64 Advanced SIMD (NEON) instruction set. As a joint team, we will be implementing all 25 classes of NEON instructions at the MC layer, as well as all of the ACLE (ARM C Language Extensions) intrinsics at the C level. Our talk will highlight the design choice of a single arm_neon.h for both ARM (AArch32) and AArch64, the decisions about value types in LLVM IR for generating the SISD instruction classes, improving the quality of the patterns in the .td files by reducing the use of LLVM IR intrinsics, and the test categories needed to build a robust backend. Finally, we'd like to mention future plans, such as enabling the machine-instruction-based scheduler and performance tuning.
Filip Pizlo - Apple Inc.
Debug Info Quick Update
Eric Christopher - Google Inc.
A quick update on what's been going on in debug info support since the Euro meeting.
lld: a linker framework
Shankar Easwaran - Qualcomm Innovation Center
The lld project, currently under heavy development, is working towards becoming a production-quality linker targeting the PECOFF, Darwin, and ELF formats. The talk discusses how lld achieves universal linking and how it is moving towards becoming a linker framework that could be an integral part of LLVM. It then explores the new opportunities lld opens up: lld APIs, symbol resolution improvements, link-time optimization (LTO), and enhancing the user experience through diagnostics and user-driven inputs that control linker behavior.
BOF: Performance Tracking & Benchmarking Infrastructure
Kristof Beyls - ARM
We lack a good public infrastructure to efficiently track performance
improvements and regressions. As a small step to improve on the
current situation, I propose a BoF to discuss mainly the following questions:
(a) What advantages do we want the performance tracking and
benchmarking infrastructure to give us?
(b) What are the main technical and non-technical challenges we expect
in setting up such an infrastructure?
Mihail Popa - ARM
TableGen is an essential component of the LLVM ecosystem, and the time has come to consider its evolution. The largest issues are the lack of a formal specification, the mixing of logical concepts, and the unsuitability for automated generation. The aim of this BoF is to gather ideas toward an improved specification language which follows the generally accepted criteria for domain-specific languages: well-defined domain meta-models, formally defined semantics, simplicity, expressiveness, and lack of redundancy.
BOF: Debug Info
Eric Christopher - Google
BOF: Extending the Sanitizer tools and porting them to other platforms
Kostya Serebryany - Google,
Alexey Samsonov - Google,
Evgeniy Stepanov - Google
BOF: High Level Loop Optimization / Polly
Tobias Grosser - INRIA,
Sebastian Pop - QuIC,
Zino Benaissa - QuIC
Discussions about loop optimizations, both generic ones and the
polyhedral loop optimizations implemented in Polly. Topics include
the pass order for high-level loop optimizations, scalar evolution,
dependence analysis, high-level loop optimizations in core LLVM, and the polyhedral infrastructure of Polly as well as the isl polyhedral support library.
BOF: Optimizations using LTO
Zino Benaissa - QuIC
BOF: JIT & MCJIT
Andy Kaylor - Intel Corporation