LLVM 2.8 Release Notes
  1. Introduction
  2. Sub-project Status Update
  3. External Projects Using LLVM 2.8
  4. What's New in LLVM 2.8?
  5. Installation Instructions
  6. Known Problems
  7. Additional Information

Written by the LLVM Team

Introduction

This document contains the release notes for the LLVM Compiler Infrastructure, release 2.8. Here we describe the status of LLVM, including major improvements from the previous release and significant known problems. All LLVM releases may be downloaded from the LLVM releases web site.

For more information about LLVM, including information about the latest release, please check out the main LLVM web site. If you have questions or comments, the LLVM Developer's Mailing List is a good place to send them.

Note that if you are reading this file from a Subversion checkout or the main LLVM web page, this document applies to the next release, not the current one. To see the release notes for a specific release, please see the releases page.

Sub-project Status Update

The LLVM 2.8 distribution currently consists of code from the core LLVM repository (which roughly includes the LLVM optimizers, code generators and supporting tools), the Clang repository and the llvm-gcc repository. In addition to this code, the LLVM Project includes other sub-projects that are in development. Here we include updates on these subprojects.

Clang: C/C++/Objective-C Frontend Toolkit

Clang is an LLVM front end for the C, C++, and Objective-C languages. Clang aims to provide a better user experience through expressive diagnostics, a high level of conformance to language standards, fast compilation, and low memory use. Like LLVM, Clang provides a modular, library-based architecture that makes it suitable for creating or integrating with other development tools. Clang is considered a production-quality compiler for C, Objective-C, C++ and Objective-C++ on x86 (32- and 64-bit), and for darwin-arm targets.

In the LLVM 2.8 time-frame, the Clang team has made many improvements:

Clang Static Analyzer

The Clang Static Analyzer project is an effort to use static source code analysis techniques to automatically find bugs in C and Objective-C programs (and hopefully C++ in the future!). The tool is very good at finding bugs that occur on specific paths through code, such as on error conditions.

The LLVM 2.8 release fixes a number of bugs and slightly improves precision over 2.7, but there are no major new features in the release.

DragonEgg: llvm-gcc ported to gcc-4.5

DragonEgg is a port of llvm-gcc to gcc-4.5. Unlike llvm-gcc, dragonegg in theory does not require any gcc-4.5 modifications whatsoever (currently one small patch is needed) thanks to the new gcc plugin architecture. DragonEgg is a gcc plugin that makes gcc-4.5 use the LLVM optimizers and code generators instead of gcc's, just like with llvm-gcc.

DragonEgg is still a work in progress, but it is able to compile a lot of code, for example all of gcc, LLVM and clang. Currently Ada, C, C++ and Fortran work well, while all other languages either don't work at all or only work poorly. For the moment only the x86-32 and x86-64 targets are supported, and only on linux and darwin (darwin may need additional gcc patches).

The 2.8 release has the following notable changes:

VMKit: JVM/CLI Virtual Machine Implementation

The VMKit project is an implementation of a Java Virtual Machine (Java VM or JVM) that uses LLVM for static and just-in-time compilation. As of LLVM 2.8, VMKit supports copying garbage collectors and can be configured to use MMTk's copy mark-sweep garbage collector. As of LLVM 2.8, the VMKit .NET VM is no longer maintained.

compiler-rt: Compiler Runtime Library

The new LLVM compiler-rt project is a simple library that provides an implementation of the low-level target-specific hooks required by code generation and other runtime components. For example, when compiling for a 32-bit target, converting a double to a 64-bit unsigned integer is compiled into a runtime call to the "__fixunsdfdi" function. The compiler-rt library provides highly optimized implementations of this and other low-level routines (some are 3x faster than the equivalent libgcc routines).
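
As a minimal sketch of the conversion described above, the following C function casts a double to a 64-bit unsigned integer. The function name double_to_u64 is hypothetical and chosen only for illustration. When this code is built for a 32-bit target that has no native instruction for the conversion, a compiler would typically lower the cast to a call to "__fixunsdfdi"; on 64-bit targets the conversion is usually done inline, so the exact lowering depends on the target and compiler.

```c
#include <stdint.h>

/* Hypothetical helper (not part of compiler-rt): converts a double to a
   64-bit unsigned integer, truncating toward zero. On a 32-bit target the
   cast below may be compiled into a runtime call such as __fixunsdfdi;
   that lowering is an assumption about the target, not something this
   source code controls. The result is only defined for inputs whose
   truncated value is representable in a uint64_t. */
uint64_t double_to_u64(double value) {
    return (uint64_t)value;
}
```

Calling double_to_u64(4294967296.0) yields 2^32, a value that does not fit in a 32-bit register, which is why a 32-bit target may need the runtime routine rather than a single instruction.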

All of the code in the compiler-rt project is available under the standard LLVM License, a "BSD-style" license. New in LLVM 2.8, compiler-rt supports soft floating point (for targets that don't have a real floating point unit) and includes an extensive testsuite for the "blocks" language feature and the blocks runtime included in compiler-rt.

LLDB: Low Level Debugger

LLDB is a brand new member of the LLVM umbrella of projects. LLDB is a next generation, high-performance debugger. It is built as a set of reusable components that make heavy use of existing libraries in the larger LLVM Project, such as the Clang expression parser, the LLVM disassembler and the LLVM JIT.

LLDB is in early development and not included as part of the LLVM 2.8 release, but is mature enough to support basic debugging scenarios on Mac OS X in C, Objective-C and C++. We'd really like help extending and expanding LLDB to support new platforms, new languages, new architectures, and new features.

libc++: C++ Standard Library

libc++ is another new member of the LLVM family. It is an implementation of the C++ standard library, written from the ground up to specifically target the forthcoming C++0x standard and focus on delivering great performance.

As of the LLVM 2.8 release, libc++ is virtually feature complete, but would benefit from more testing and better integration with Clang++. It is also looking forward to the C++ committee finalizing the C++0x standard.

KLEE: A Symbolic Execution Virtual Machine

KLEE is a symbolic execution framework for programs in LLVM bitcode form. KLEE tries to symbolically evaluate "all" paths through the application and records state transitions that lead to fault states. This allows it to construct testcases that lead to faults and can even be used to verify some algorithms.

Although KLEE does not have any major new features as of 2.8, we have made various minor improvements, particularly to ease development:

External Open Source Projects Using LLVM 2.8

An exciting aspect of LLVM is that it is used as an enabling technology for a lot of other language and tools projects. This section lists some of the projects that have already been updated to work with LLVM 2.8.

TTA-based Codesign Environment (TCE)

TCE is a toolset for designing application-specific processors (ASP) based on the Transport triggered architecture (TTA). The toolset provides a complete co-design flow from C/C++ programs down to synthesizable VHDL and parallel program binaries. Processor customization points include the register files, function units, supported operations, and the interconnection network.

TCE uses llvm-gcc/Clang and LLVM for C/C++ language support, target independent optimizations and also for parts of code generation. It generates new LLVM-based code generators "on the fly" for the designed TTA processors and loads them into the compiler backend as runtime libraries to avoid per-target recompilation of larger parts of the compiler chain.

Horizon Bytecode Compiler

Horizon is a bytecode language and compiler written on top of LLVM, intended for producing single-address-space managed code operating systems that run faster than the equivalent multiple-address-space C systems. A more in-depth description is available on the wiki.

Clam AntiVirus

Clam AntiVirus is an open source (GPL) anti-virus toolkit for UNIX, designed especially for e-mail scanning on mail gateways. Since version 0.96 it has bytecode signatures that allow writing detections for complex malware. It uses LLVM's JIT to speed up the execution of bytecode on X86, X86-64, and PPC32/64, falling back to its own interpreter otherwise. The git version was updated to work with LLVM 2.8.

The ClamAV bytecode compiler uses Clang and LLVM to compile a C-like language, insert runtime checks, and generate ClamAV bytecode.

Pure

Pure is an algebraic/functional programming language based on term rewriting. Programs are collections of equations which are used to evaluate expressions in a symbolic fashion. Pure offers dynamic typing, eager and lazy evaluation, lexical closures, a hygienic macro system (also based on term rewriting), built-in list and matrix support (including list and matrix comprehensions) and an easy-to-use C interface. The interpreter uses LLVM as a backend to JIT-compile Pure programs to fast native code.

Pure versions 0.44 and later have been tested and are known to work with LLVM 2.8 (and continue to work with older LLVM releases >= 2.5).

Glasgow Haskell Compiler (GHC)

GHC is an open source, state-of-the-art programming suite for Haskell, a standard lazy functional programming language. It includes an optimizing static compiler generating good code for a variety of platforms, together with an interactive system for convenient, quick development.

In addition to the existing C and native code generators, GHC 7.0 now supports an LLVM code generator. GHC supports LLVM 2.7 and later.

Clay Programming Language

Clay is a new systems programming language that is specifically designed for generic programming. It makes generic programming very concise thanks to whole program type propagation. It uses LLVM as its backend.

llvm-py Python Bindings for LLVM

llvm-py has been updated to work with LLVM 2.8. llvm-py provides Python bindings for LLVM, allowing you to write a compiler backend or a VM in Python.

FAUST Real-Time Audio Signal Processing Language

FAUST is a compiled language for real-time audio signal processing. The name FAUST stands for Functional AUdio STream. Its programming model combines two approaches: functional programming and block diagram composition. In addition to the C, C++, and Java output formats, the Faust compiler can now generate LLVM bitcode, and works with LLVM 2.7 and 2.8.

Jade Just-in-time Adaptive Decoder Engine

Jade (Just-in-time Adaptive Decoder Engine) is a generic video decoder engine using LLVM for just-in-time compilation of video decoder configurations. Those configurations are designed by the MPEG Reconfigurable Video Coding (RVC) committee. The MPEG RVC standard is built on a stream-based dataflow representation of decoders. It is composed of a standard library of coding tools written in the RVC-CAL language and a dataflow configuration (a block diagram) of a decoder.

The Jade project is hosted as part of the Open RVC-CAL Compiler and requires it to translate the RVC-CAL standard library of video coding tools into LLVM assembly code.

LLVM JIT for Neko VM

Neko LLVM JIT replaces the standard Neko JIT with an LLVM-based implementation. While not fully complete, it is already providing a 1.5x speedup on 64-bit systems. Neko LLVM JIT requires LLVM 2.8 or later.

Crack Scripting Language

Crack aims to provide the ease of development of a scripting language with the performance of a compiled language. The language derives concepts from C++, Java and Python, incorporating object-oriented programming, operator overloading and strong typing. Crack 0.2 works with LLVM 2.7, and the forthcoming Crack 0.2.1 release builds on LLVM 2.8.

Dresden TM Compiler (DTMC)

DTMC provides support for Transactional Memory, which is an easy-to-use and efficient way to synchronize accesses to shared memory. Transactions can contain normal C/C++ code (e.g., __transaction { list.remove(x); x.refCount--; }) and will be executed virtually atomically and isolated from other transactions.

Kai Programming Language

Kai (Japanese 会 for meeting/gathering) is an experimental interpreter that provides a highly extensible runtime environment and explicit control over the compilation process. Programs are defined using nested symbolic expressions, which are all parsed into first-class values with minimal intrinsic semantics. Kai can generate optimised code at run-time (using LLVM) in order to exploit the nature of the underlying hardware and to integrate with external software libraries. It is a unique exploration into the world of dynamic code compilation, and the interaction between high level and low level semantics.

OSL: Open Shading Language

OSL is a shading language designed for use in physically based renderers and in particular production rendering. By using LLVM instead of the interpreter, it was able to meet its performance goals (at least as fast as equivalent C code) while retaining the benefits of runtime specialization and a portable high-level language.

What's New in LLVM 2.8?

This release includes a huge number of bug fixes, performance tweaks and minor improvements. Some of the major improvements and new features are listed in this section.

Major New Features

LLVM 2.8 includes several major new capabilities:

LLVM IR and Core Improvements

LLVM IR has several new features that better support new targets and expose new optimization opportunities:

Optimizer Improvements

In addition to a large array of minor performance tweaks and bug fixes, this release includes a few major enhancements and additions to the optimizers:

MC Level Improvements

The LLVM Machine Code (aka MC) subsystem was created to solve a number of problems in the realm of assembly, disassembly, object file format handling, and a number of other related areas that CPU instruction-set level tools work in.

The MC subproject has made great leaps in LLVM 2.8. For example, support for directly writing .o files from LLC (and clang) now works reliably for darwin/x86[-64] (including inline assembly support) and the integrated assembler is turned on by default in Clang for these targets. This provides improved compile times among other things.

For more information, please see the Intro to the LLVM MC Project Blog Post.

Target Independent Code Generator Improvements

We have put a significant amount of work into the code generator infrastructure, which allows us to implement more aggressive algorithms and make it run faster:

X86-32 and X86-64 Target Improvements

New features and major changes in the X86 target include:

ARM Target Improvements

New features of the ARM target include:

Major Changes and Removed Features

If you're already an LLVM user or developer with out-of-tree changes based on LLVM 2.7, this section lists some "gotchas" that you may run into upgrading from the previous release.

In addition, many APIs have changed in this release. Some of the major LLVM API changes are:

Development Infrastructure Changes

This section lists changes to the LLVM development infrastructure. This mostly impacts users who actively work on LLVM or follow development on mainline, but may also impact users who leverage the LLVM build infrastructure or are interested in LLVM qualification.

Known Problems

This section contains significant known problems with the LLVM system, listed by component. If you run into a problem, please check the LLVM bug database and submit a bug if there isn't already one.

Experimental features included with this release

The following components of this LLVM release are either untested, known to be broken or unreliable, or are in early development. These components should not be relied on, and bugs should not be filed against them, but they may be useful to some people. In particular, if you would like to work on one of these components, please contact us on the LLVMdev list.

Known problems with the X86 back-end
Known problems with the PowerPC back-end
Known problems with the ARM back-end
Known problems with the SPARC back-end
Known problems with the MIPS back-end
Known problems with the Alpha back-end
Known problems with the C back-end

The C backend has numerous problems and is not being actively maintained. Depending on it for anything serious is not advised.

Known problems with the llvm-gcc front-end

llvm-gcc is generally very stable for the C family of languages. The only major language feature of GCC not supported by llvm-gcc is the __builtin_apply family of builtins. However, some extensions are only supported on some targets. For example, trampolines are only supported on some targets (these are used when you take the address of a nested function).

Fortran support generally works, but there are still several unresolved bugs in Bugzilla. Please see the tools/gfortran component for details. Note that llvm-gcc is missing major Fortran performance work in the frontend and library that went into GCC after 4.2. If you are interested in Fortran, we recommend that you consider using dragonegg instead.

The llvm-gcc 4.2 Ada compiler has basic functionality, but is no longer being actively maintained. If you are interested in Ada, we recommend that you consider using dragonegg instead.

Additional Information

A wide variety of additional information is available on the LLVM web page, in particular in the documentation section. The web page also contains versions of the API documentation that are up to date with the Subversion version of the source code. You can access versions of these documents specific to this release by going into the "llvm/doc/" directory in the LLVM tree.

If you have any questions or comments about LLVM, please feel free to contact us via the mailing lists.


Last modified: $Date: 2010-10-04 15:41:06 -0500 (Mon, 04 Oct 2010) $