The Low Level Virtual Machine (LLVM) is a compiler infrastructure, written in C++, which is designed for compile-time, link-time, run-time, and “idle-time” optimization of programs written in arbitrary programming languages. Originally implemented for C/C++, the language-independent design (and the success) of LLVM has since spawned a wide variety of front ends, including Objective-C, Fortran, Ada, Haskell, Java bytecode, Python, Ruby, ActionScript, GLSL, and others.
LLVM can provide the middle layers of a complete compiler system, taking intermediate form (IF) code from a compiler and outputting an optimized IF that can then be converted and linked into machine-dependent assembler code for a target platform. LLVM can accept the IF from the GCC toolchain, allowing it to be used with a wide array of existing compilers written for that project.
LLVM can also generate relocatable machine code at compile-time or link-time or even binary machine code at run-time.
LLVM supports a language-independent instruction set and type system. Each instruction is in static single assignment form (SSA), meaning that each variable (called a typed register) is assigned once and is frozen. This helps simplify the analysis of dependencies among variables. LLVM allows code to be compiled statically, as it is under the traditional GCC system, or left for late-compiling from the IF to machine code in a just-in-time compiler (JIT) in a fashion similar to Java. The type system consists of basic types such as integers or floats and five derived types: pointers, arrays, vectors, structures, and functions. A type construct in a concrete language can be represented by combining these basic types in LLVM. For example, a class in C++ can be represented by a combination of structures, functions and arrays of function pointers.
Additional links: