Documentation for the LLVM System at SVN head

If you are using a released version of LLVM, see the download page to find your documentation.

• LLVM Design
• LLVM Publications
• LLVM User Guides
• General LLVM Programming Documentation
• LLVM Subsystem Documentation
• LLVM Mailing Lists
Written by The LLVM Team

LLVM Design & Overview

• LLVM Language Reference Manual - Defines the LLVM intermediate representation.
• Introduction to the LLVM Compiler - Presentation describing LLVM.
• The LLVM Compiler Framework and Infrastructure Tutorial - Tutorial for writing passes, exploring
the system.
• LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation - Design
overview.
• LLVM: An Infrastructure for Multi-Stage Optimization - More details (quite old now).
• GetElementPtr FAQ - Answers to some very frequent questions about LLVM's most frequently
misunderstood instruction.

LLVM User Guides

• The LLVM Getting Started Guide - Discusses how to get up and running quickly with the LLVM
infrastructure. Everything from unpacking and compilation of the distribution to execution of some
tools.
• Getting Started with the LLVM System using Microsoft Visual Studio - An addendum to the main
Getting Started guide for those using Visual Studio on Windows.
• LLVM Tutorial - A walk through the process of using LLVM for a custom language, and the facilities
LLVM offers in tutorial form.
• Developer Policy - The LLVM project's policy towards developers and their contributions.
• LLVM Command Guide - A reference manual for the LLVM command line utilities ("man" pages for
LLVM tools).
Current tools: llvm-ar, llvm-as, llvm-dis, llvm-extract, llvm-ld, llvm-link, llvm-nm, llvm-prof,
llvm-ranlib, opt, llc, lli, llvmc, llvm-gcc, llvm-g++, bugpoint, llvm-bcanalyzer.
• LLVM's Analysis and Transform Passes - A list of optimizations and analyses implemented in
LLVM.
• Frequently Asked Questions - A list of common questions and problems and their solutions.
• Release notes for the current release - This describes new features, known bugs, and other limitations.
• How to Submit A Bug Report - Instructions for properly submitting information about any bugs you
run into in the LLVM system.
• LLVM Testing Infrastructure Guide - A reference manual for using the LLVM testing infrastructure.
• How to build the Ada/C/C++/Fortran front-ends - Instructions for building gcc front-ends from
source.
• Packaging guide - Advice on packaging LLVM into a distribution.
• The LLVM Lexicon - Definition of acronyms, terms and concepts used in LLVM.
• You can probably find help on the unofficial LLVM IRC channel. We are often on irc.oftc.net in the
#llvm channel. If you are using the Mozilla browser and have ChatZilla installed, you can join #llvm
on irc.oftc.net directly.

General LLVM Programming Documentation

• LLVM Language Reference Manual - Defines the LLVM intermediate representation and the
assembly form of the different nodes.
• The LLVM Programmers Manual - Introduction to the general layout of the LLVM sourcebase,
important classes and APIs, and some tips & tricks.
• LLVM Project Guide - How-to guide and templates for new projects that use the LLVM
infrastructure. The templates (directory organization, Makefiles, and test tree) allow the project code
to be located outside (or inside) the llvm/ tree, while using LLVM header files and libraries.
• LLVM Makefile Guide - Describes how the LLVM makefiles work and how to use them.
• CommandLine library Reference Manual - Provides information on using the command line parsing
library.
• LLVM Coding standards - Details the LLVM coding standards and provides useful information on
writing efficient C++ code.
• Extending LLVM - Look here to see how to add instructions and intrinsics to LLVM.
• Using LLVM Libraries - Look here to understand how to use the libraries produced when LLVM is
compiled.
• How To Release LLVM To The Public - This is a guide to preparing LLVM releases. Most
developers can ignore it.
• Doxygen generated documentation (classes) (tarball)
• ViewVC Repository Browser

LLVM Subsystem Documentation

• Writing an LLVM Pass - Information on how to write LLVM transformations and analyses.
• Writing an LLVM Backend - Information on how to write LLVM backends for machine targets.
• The LLVM Target-Independent Code Generator - The design and implementation of the LLVM code
generator. Useful if you are working on retargeting LLVM to a new architecture, designing a new
codegen pass, or enhancing existing components.
• TableGen Fundamentals - Describes the TableGen tool, which is used heavily by the LLVM code
generator.
• Alias Analysis in LLVM - Information on how to write a new alias analysis implementation or how to
use existing analyses.
• Accurate Garbage Collection with LLVM - The interfaces source-language compilers should use for
compiling GC'd programs.
• Source Level Debugging with LLVM - This document describes the design and philosophy behind
the LLVM source-level debugger.
• Zero Cost Exception handling in LLVM - This document describes the design and implementation of
exception handling in LLVM.
• Bugpoint - automatic bug finder and test-case reducer description and usage information.
• Compiler Driver (llvmc) Tutorial - This document is a tutorial introduction to the usage and
configuration of the LLVM compiler driver tool, llvmc.
• Compiler Driver (llvmc) Reference - This document describes the design and configuration of llvmc
in more detail.
• LLVM Bitcode File Format - This describes the file format and encoding used for LLVM "bc" files.
• System Library - This document describes the LLVM System Library (lib/System) and how to
keep LLVM source code portable.

• Link Time Optimization - This document describes the interface between the LLVM intermodular
optimizer and the linker, and its design.
• The LLVM gold plugin - How to build your programs with link-time optimization on Linux.
• The GDB JIT interface - How to debug JITed code with GDB.

LLVM Mailing Lists

• The LLVM Announcements List: This is a low volume list that provides important announcements
regarding LLVM. It gets email about once a month.
• The Developer's List: This list is for people who want to be included in technical discussions of
LLVM. People post to this list when they have questions about writing code for or using the LLVM
tools. It is relatively low volume.
• The Bugs & Patches Archive: This list gets emailed every time a bug is opened and closed, and when
people submit patches to be included in LLVM. It is higher volume than the LLVMdev list.
• The Commits Archive: This list contains all commit messages that are made when LLVM developers
commit code changes to the repository. It is useful for those who want to stay on the bleeding edge of
LLVM development. This list is very high volume.
• The Test Results Archive: A message is automatically sent to this list by every active nightly tester
when it completes. As such, this list gets email several times each day, making it a high volume list.

LLVM Compiler Infrastructure
Last modified: $Date: 2010-02-25 18:54:42 -0600 (Thu, 25 Feb 2010) $

LLVM Language Reference Manual

1. Abstract
2. Introduction
3. Identifiers
4. High Level Structure
1. Module Structure
2. Linkage Types
1. 'private' Linkage
2. 'linker_private' Linkage
3. 'internal' Linkage
4. 'available_externally' Linkage
5. 'linkonce' Linkage
6. 'common' Linkage
7. 'weak' Linkage
8. 'appending' Linkage
9. 'extern_weak' Linkage
10. 'linkonce_odr' Linkage
11. 'weak_odr' Linkage
12. 'externally visible' Linkage
13. 'dllimport' Linkage
14. 'dllexport' Linkage
3. Calling Conventions
4. Named Types
5. Global Variables
6. Functions
7. Aliases
8. Named Metadata
9. Parameter Attributes
10. Function Attributes
11. Garbage Collector Names
12. Module-Level Inline Assembly
13. Data Layout
14. Pointer Aliasing Rules
5. Type System
1. Type Classifications
2. Primitive Types
1. Integer Type
2. Floating Point Types
3. Void Type
4. Label Type
5. Metadata Type
3. Derived Types
1. Aggregate Types
1. Array Type
2. Structure Type
3. Packed Structure Type
4. Union Type
5. Vector Type
2. Function Type
3. Pointer Type

4. Opaque Type
4. Type Up-references
6. Constants
1. Simple Constants
2. Complex Constants
3. Global Variable and Function Addresses
4. Undefined Values
5. Addresses of Basic Blocks
6. Constant Expressions
7. Other Values
1. Inline Assembler Expressions
2. Metadata Nodes and Metadata Strings
8. Intrinsic Global Variables
1. The 'llvm.used' Global Variable
2. The 'llvm.compiler.used' Global Variable
3. The 'llvm.global_ctors' Global Variable
4. The 'llvm.global_dtors' Global Variable
9. Instruction Reference
1. Terminator Instructions
1. 'ret' Instruction
2. 'br' Instruction
3. 'switch' Instruction
4. 'indirectbr' Instruction
5. 'invoke' Instruction
6. 'unwind' Instruction
7. 'unreachable' Instruction
2. Binary Operations
1. 'add' Instruction
2. 'fadd' Instruction
3. 'sub' Instruction
4. 'fsub' Instruction
5. 'mul' Instruction
6. 'fmul' Instruction
7. 'udiv' Instruction
8. 'sdiv' Instruction
9. 'fdiv' Instruction
10. 'urem' Instruction
11. 'srem' Instruction
12. 'frem' Instruction
3. Bitwise Binary Operations
1. 'shl' Instruction
2. 'lshr' Instruction
3. 'ashr' Instruction
4. 'and' Instruction
5. 'or' Instruction
6. 'xor' Instruction
4. Vector Operations
1. 'extractelement' Instruction
2. 'insertelement' Instruction
3. 'shufflevector' Instruction
5. Aggregate Operations

1. 'extractvalue' Instruction
2. 'insertvalue' Instruction
6. Memory Access and Addressing Operations
1. 'alloca' Instruction
2. 'load' Instruction
3. 'store' Instruction
4. 'getelementptr' Instruction
7. Conversion Operations
1. 'trunc .. to' Instruction
2. 'zext .. to' Instruction
3. 'sext .. to' Instruction
4. 'fptrunc .. to' Instruction
5. 'fpext .. to' Instruction
6. 'fptoui .. to' Instruction
7. 'fptosi .. to' Instruction
8. 'uitofp .. to' Instruction
9. 'sitofp .. to' Instruction
10. 'ptrtoint .. to' Instruction
11. 'inttoptr .. to' Instruction
12. 'bitcast .. to' Instruction
8. Other Operations
1. 'icmp' Instruction
2. 'fcmp' Instruction
3. 'phi' Instruction
4. 'select' Instruction
5. 'call' Instruction
6. 'va_arg' Instruction
10. Intrinsic Functions
1. Variable Argument Handling Intrinsics
1. 'llvm.va_start' Intrinsic
2. 'llvm.va_end' Intrinsic
3. 'llvm.va_copy' Intrinsic
2. Accurate Garbage Collection Intrinsics
1. 'llvm.gcroot' Intrinsic
2. 'llvm.gcread' Intrinsic
3. 'llvm.gcwrite' Intrinsic
3. Code Generator Intrinsics
1. 'llvm.returnaddress' Intrinsic
2. 'llvm.frameaddress' Intrinsic
3. 'llvm.stacksave' Intrinsic
4. 'llvm.stackrestore' Intrinsic
5. 'llvm.prefetch' Intrinsic
6. 'llvm.pcmarker' Intrinsic
7. 'llvm.readcyclecounter' Intrinsic
4. Standard C Library Intrinsics
1. 'llvm.memcpy.*' Intrinsic
2. 'llvm.memmove.*' Intrinsic
3. 'llvm.memset.*' Intrinsic
4. 'llvm.sqrt.*' Intrinsic
5. 'llvm.powi.*' Intrinsic
6. 'llvm.sin.*' Intrinsic

7. 'llvm.cos.*' Intrinsic
8. 'llvm.pow.*' Intrinsic
5. Bit Manipulation Intrinsics
1. 'llvm.bswap.*' Intrinsics
2. 'llvm.ctpop.*' Intrinsic
3. 'llvm.ctlz.*' Intrinsic
4. 'llvm.cttz.*' Intrinsic
6. Arithmetic with Overflow Intrinsics
1. 'llvm.sadd.with.overflow.*' Intrinsics
2. 'llvm.uadd.with.overflow.*' Intrinsics
3. 'llvm.ssub.with.overflow.*' Intrinsics
4. 'llvm.usub.with.overflow.*' Intrinsics
5. 'llvm.smul.with.overflow.*' Intrinsics
6. 'llvm.umul.with.overflow.*' Intrinsics
7. Debugger intrinsics
8. Exception Handling intrinsics
9. Trampoline Intrinsic
1. 'llvm.init.trampoline' Intrinsic
10. Atomic intrinsics
1. llvm.memory_barrier
2. llvm.atomic.cmp.swap
3. llvm.atomic.swap
4. llvm.atomic.load.add
5. llvm.atomic.load.sub
6. llvm.atomic.load.and
7. llvm.atomic.load.nand
8. llvm.atomic.load.or
9. llvm.atomic.load.xor
10. llvm.atomic.load.max
11. llvm.atomic.load.min
12. llvm.atomic.load.umax
13. llvm.atomic.load.umin
11. Memory Use Markers
1. llvm.lifetime.start
2. llvm.lifetime.end
3. llvm.invariant.start
4. llvm.invariant.end
12. General intrinsics
1. 'llvm.var.annotation' Intrinsic
2. 'llvm.annotation.*' Intrinsic
3. 'llvm.trap' Intrinsic
4. 'llvm.stackprotector' Intrinsic
5. 'llvm.objectsize' Intrinsic

Written by Chris Lattner and Vikram Adve

Abstract
This document is a reference manual for the LLVM assembly language. LLVM is a Static Single Assignment
(SSA) based representation that provides type safety, low-level operations, flexibility, and the capability of
representing 'all' high-level languages cleanly. It is the common code representation used throughout all
phases of the LLVM compilation strategy.

Introduction
The LLVM code representation is designed to be used in three different forms: as an in-memory compiler IR,
as an on-disk bitcode representation (suitable for fast loading by a Just-In-Time compiler), and as a human
readable assembly language representation. This allows LLVM to provide a powerful intermediate
representation for efficient compiler transformations and analysis, while providing a natural means to debug
and visualize the transformations. The three different forms of LLVM are all equivalent. This document
describes the human readable representation and notation.

The LLVM representation aims to be light-weight and low-level while being expressive, typed, and extensible
at the same time. It aims to be a "universal IR" of sorts, by being at a low enough level that high-level ideas
may be cleanly mapped to it (similar to how microprocessors are "universal IR's", allowing many source
languages to be mapped to them). By providing type information, LLVM can be used as the target of
optimizations: for example, through pointer analysis, it can be proven that a C automatic variable is never
accessed outside of the current function, allowing it to be promoted to a simple SSA value instead of a
memory location.

Well-Formedness
It is important to note that this document describes 'well formed' LLVM assembly language. There is a
difference between what the parser accepts and what is considered 'well formed'. For example, the following
instruction is syntactically okay, but not well formed:

%x = add i32 1, %x

because the definition of %x does not dominate all of its uses. The LLVM infrastructure provides a
verification pass that may be used to verify that an LLVM module is well formed. This pass is automatically
run by the parser after parsing input assembly and by the optimizer before it outputs bitcode. The violations
pointed out by the verifier pass indicate bugs in transformation passes or input to the parser.

Identifiers
LLVM identifiers come in two basic types: global and local. Global identifiers (functions, global variables)
begin with the '@' character. Local identifiers (register names, types) begin with the '%' character.
Additionally, there are three different formats for identifiers, for different purposes:

1. Named values are represented as a string of characters with their prefix. For example, %foo,
@DivisionByZero, %a.really.long.identifier. The actual regular expression used is
'[%@][a-zA-Z$._][a-zA-Z$._0-9]*'. Identifiers which require other characters in their
names can be surrounded with quotes. Special characters may be escaped using "\xx" where xx is
the ASCII code for the character in hexadecimal. In this way, any character can be used in a name
value, even quotes themselves.
2. Unnamed values are represented as an unsigned numeric value with their prefix. For example, %12,
@2, %44.
3. Constants, which are described in a section about constants, below.

LLVM requires that values start with a prefix for two reasons: Compilers don't need to worry about name
clashes with reserved words, and the set of reserved words may be expanded in the future without penalty.
Additionally, unnamed identifiers allow a compiler to quickly come up with a temporary variable without
having to avoid symbol table conflicts.
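
As an illustration of the quoting and escape rules described above, here is a small invented fragment (the
names carry no special meaning):

@plain.name = global i32 0                 ; an ordinary named global
@"name with spaces" = global i32 1         ; quotes admit characters outside the usual set
@"star\2Aname" = global i32 2              ; "\2A" is the hex escape for the '*' character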

Reserved words in LLVM are very similar to reserved words in other languages. There are keywords for
different opcodes ('add', 'bitcast', 'ret', etc...), for primitive type names ('void', 'i32', etc...), and others.
These reserved words cannot conflict with variable names, because none of them start with a prefix character
('%' or '@').

Here is an example of LLVM code to multiply the integer variable '%X' by 8:

The easy way:

%result = mul i32 %X, 8

After strength reduction:

%result = shl i32 %X, 3

And the hard way:

%0 = add i32 %X, %X ; yields {i32}:%0
%1 = add i32 %0, %0 ; yields {i32}:%1
%result = add i32 %1, %1

This last way of multiplying %X by 8 illustrates several important lexical features of LLVM:

1. Comments are delimited with a ';' and go until the end of line.
2. Unnamed temporaries are created when the result of a computation is not assigned to a named value.
3. Unnamed temporaries are numbered sequentially.

It also shows a convention that we follow in this document. When demonstrating instructions, we will follow
an instruction with a comment that defines the type and name of value produced. Comments are shown in
italic text.

High Level Structure
Module Structure
LLVM programs are composed of "Module"s, each of which is a translation unit of the input programs. Each
module consists of functions, global variables, and symbol table entries. Modules may be combined together
with the LLVM linker, which merges function (and global variable) definitions, resolves forward declarations,
and merges symbol table entries. Here is an example of the "hello world" module:

; Declare the string constant as a global constant.
@.LC0 = internal constant [13 x i8] c"hello world\0A\00" ; [13 x i8]*

; External declaration of the puts function
declare i32 @puts(i8 *) ; i32(i8 *)*

; Definition of main function
define i32 @main() { ; i32()*
; Convert [13 x i8]* to i8 *...
%cast210 = getelementptr [13 x i8]* @.LC0, i64 0, i64 0 ; i8 *

; Call puts function to write out the string to stdout.
call i32 @puts(i8 * %cast210) ; i32
ret i32 0
}

; Named metadata
!1 = metadata !{i32 41}
!foo = !{!1, null}

This example is made up of a global variable named ".LC0", an external declaration of the "puts" function,
a function definition for "main" and named metadata "foo".

In general, a module is made up of a list of global values, where both functions and global variables are global
values. Global values are represented by a pointer to a memory location (in this case, a pointer to an array of
char, and a pointer to a function), and have one of the following linkage types.

Linkage Types
All Global Variables and Functions have one of the following types of linkage:

private
Global values with private linkage are only directly accessible by objects in the current module. In
particular, linking code into a module with a private global value may cause the private to be
renamed as necessary to avoid collisions. Because the symbol is private to the module, all references
can be updated. This doesn't show up in any symbol table in the object file.
linker_private
Similar to private, but the symbol is passed through the assembler and removed by the linker after
evaluation. Note that (unlike private symbols) linker_private symbols are subject to coalescing by the
linker: weak symbols get merged and redefinitions are rejected. However, unlike normal strong
symbols, they are removed by the linker from the final linked image (executable or dynamic library).
internal
Similar to private, but the value shows as a local symbol (STB_LOCAL in the case of ELF) in the
object file. This corresponds to the notion of the 'static' keyword in C.
available_externally
Globals with "available_externally" linkage are never emitted into the object file
corresponding to the LLVM module. They exist to allow inlining and other optimizations to take
place given knowledge of the definition of the global, which is known to be somewhere outside the
module. Globals with available_externally linkage are allowed to be discarded at will, and
are otherwise the same as linkonce_odr. This linkage type is only allowed on definitions, not
declarations.
linkonce
Globals with "linkonce" linkage are merged with other globals of the same name when linkage
occurs. This can be used to implement some forms of inline functions, templates, or other code which
must be generated in each translation unit that uses it, but where the body may be overridden with a
more definitive definition later. Unreferenced linkonce globals are allowed to be discarded. Note
that linkonce linkage does not actually allow the optimizer to inline the body of this function into
callers because it doesn't know if this definition of the function is the definitive definition within the
program or whether it will be overridden by a stronger definition. To enable inlining and other
optimizations, use "linkonce_odr" linkage.
weak
"weak" linkage has the same merging semantics as linkonce linkage, except that unreferenced
globals with weak linkage may not be discarded. This is used for globals that are declared "weak" in
C source code.
common
"common" linkage is most similar to "weak" linkage, but they are used for tentative definitions in C,
such as "int X;" at global scope. Symbols with "common" linkage are merged in the same way as
weak symbols, and they may not be deleted if unreferenced. common symbols may not have an
explicit section, must have a zero initializer, and may not be marked 'constant'. Functions and
aliases may not have common linkage.
appending
"appending" linkage may only be applied to global variables of pointer to array type. When two
global variables with appending linkage are linked together, the two global arrays are appended
together. This is the LLVM, typesafe, equivalent of having the system linker append together
"sections" with identical names when .o files are linked.
extern_weak
The semantics of this linkage follow the ELF object file model: the symbol is weak until linked; if not
linked, the symbol becomes null instead of being an undefined reference.
linkonce_odr
weak_odr
Some languages allow differing globals to be merged, such as two functions with different semantics.
Other languages, such as C++, ensure that only equivalent globals are ever merged (the "one
definition rule" - "ODR"). Such languages can use the linkonce_odr and weak_odr linkage
types to indicate that the global will only be merged with equivalent globals. These linkage types are
otherwise the same as their non-odr versions.
externally visible:
If none of the above identifiers are used, the global is externally visible, meaning that it participates in
linkage and can be used to resolve external symbol references.

The next two types of linkage are targeted for Microsoft Windows platform only. They are designed to
support importing (exporting) symbols from (to) DLLs (Dynamic Link Libraries).

dllimport
"dllimport" linkage causes the compiler to reference a function or variable via a global pointer to
a pointer that is set up by the DLL exporting the symbol. On Microsoft Windows targets, the pointer
name is formed by combining __imp_ and the function or variable name.
dllexport
"dllexport" linkage causes the compiler to provide a global pointer to a pointer in a DLL, so that
it can be referenced with the dllimport attribute. On Microsoft Windows targets, the pointer name
is formed by combining __imp_ and the function or variable name.

For example, since the ".LC0" variable is defined to be internal, if another module defined a ".LC0" variable
and was linked with this one, one of the two would be renamed, preventing a collision. Since "main" and
"puts" are external (i.e., lacking any linkage declarations), they are accessible outside of the current module.

It is illegal for a function declaration to have any linkage type other than "externally visible", dllimport or
extern_weak.

Aliases can have only external, internal, weak or weak_odr linkages.
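
For a rough illustration of how several of the linkage types above look in practice, consider the following
invented module fragment (the symbol names are made up for this sketch):

@msg   = private constant [4 x i8] c"msg\00"   ; renameable, kept out of the object's symbol table
@count = internal global i32 0                 ; like C 'static': a local symbol
@tmpl  = linkonce_odr global i32 42            ; may be merged with equivalent definitions elsewhere
@weak1 = weak global i32 0                     ; mergeable, but never discarded if unreferenced
@ext   = external global i32                   ; declared here, defined in some other module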

Calling Conventions
LLVM functions, calls and invokes can all have an optional calling convention specified for the call. The
calling convention of any pair of dynamic caller/callee must match, or the behavior of the program is
undefined. The following calling conventions are supported by LLVM, and more may be added in the future:

"ccc" - The C calling convention:
This calling convention (the default if no other calling convention is specified) matches the target C
calling conventions. This calling convention supports varargs function calls and tolerates some
mismatch in the declared prototype and implemented declaration of the function (as does normal C).
"fastcc" - The fast calling convention:
This calling convention attempts to make calls as fast as possible (e.g. by passing things in registers).
This calling convention allows the target to use whatever tricks it wants to produce fast code for the
target, without having to conform to an externally specified ABI (Application Binary Interface). Tail
calls can only be optimized when this or the GHC convention is used. This calling convention does
not support varargs and requires the prototype of all callees to exactly match the prototype of the
function definition.
"coldcc" - The cold calling convention:
This calling convention attempts to make code in the caller as efficient as possible under the
assumption that the call is not commonly executed. As such, these calls often preserve all registers so
that the call does not break any live ranges in the caller side. This calling convention does not support
varargs and requires the prototype of all callees to exactly match the prototype of the function
definition.
"cc 10" - GHC convention:
This calling convention has been implemented specifically for use by the Glasgow Haskell Compiler
(GHC). It passes everything in registers, going to extremes to achieve this by disabling callee save
registers. This calling convention should not be used lightly but only for specific situations such as an
alternative to the register pinning performance technique often used when implementing functional
programming languages. At the moment only X86 supports this convention and it has the following
limitations:
◊ On X86-32 it supports only up to 4 bit type parameters. No floating point types are supported.
◊ On X86-64 it supports only up to 10 bit type parameters and 6 floating point parameters.
This calling convention supports tail call optimization but requires both the caller and callee are using
it.
"cc <n>" - Numbered convention:
Any calling convention may be specified by number, allowing target-specific calling conventions to
be used. Target specific calling conventions start at 64.

More calling conventions can be added/defined on an as-needed basis, to support Pascal conventions or any
other well-known target-independent convention.
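
As a minimal sketch (the function names are invented), a module could define a function with the fast
calling convention and call it; note that the convention must be repeated at every call site:

define fastcc i32 @sum(i32 %a, i32 %b) {
  %r = add i32 %a, %b
  ret i32 %r
}

define i32 @caller() {
  ; the call site uses the same calling convention as the callee
  %v = call fastcc i32 @sum(i32 1, i32 2)
  ret i32 %v
}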

Visibility Styles
All Global Variables and Functions have one of the following visibility styles:

"default" - Default style:
On targets that use the ELF object file format, default visibility means that the declaration is visible to
other modules and, in shared libraries, means that the declared entity may be overridden. On Darwin,
default visibility means that the declaration is visible to other modules. Default visibility corresponds
to "external linkage" in the language.
"hidden" - Hidden style:
Two declarations of an object with hidden visibility refer to the same object if they are in the same
shared object. Usually, hidden visibility indicates that the symbol will not be placed into the dynamic
symbol table, so no other module (executable or shared library) can reference it directly.
"protected" - Protected style:
On ELF, protected visibility indicates that the symbol will be placed in the dynamic symbol table, but
that references within the defining module will bind to the local symbol. That is, the symbol cannot be
overridden by another module.
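
A short invented example combining the three styles (the visibility keyword follows any linkage keyword
when both are present):

@exported   = global i32 0               ; default visibility
@hidden_var = hidden global i32 1        ; not placed in the dynamic symbol table
@fixed      = protected global i32 2     ; exported, but never preempted by another module

define hidden void @helper() {
  ret void
}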

Named Types
LLVM IR allows you to specify name aliases for certain types. This can make it easier to read the IR and
make the IR more condensed (particularly when recursive types are involved). An example of a name
specification is:

%mytype = type { %mytype*, i32 }

You may give a name to any type except "void". Type name aliases may be used anywhere a type is expected
with the syntax "%mytype".

Note that type names are aliases for the structural type that they indicate, and that you can therefore specify
multiple names for the same type. This often leads to confusing behavior when dumping out a .ll file. Since
LLVM IR uses structural typing, the name is not part of the type. When printing out LLVM IR, the printer
will pick one name to render all types of a particular shape. This means that if you have code where two
different source types end up having the same LLVM type, the dumper will sometimes print the "wrong"
or unexpected type. This is an important design point and isn't going to change.
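
For instance, the two invented names below denote the same structural type, and the printer may choose
either one when rendering values of that shape:

%pair  = type { i32, i32 }
%point = type { i32, i32 }    ; structurally identical to %pair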

Global Variables
Global variables define regions of memory allocated at compilation time instead of run-time. Global variables
may optionally be initialized, may have an explicit section to be placed in, and may have an optional explicit
alignment specified. A variable may be defined as "thread_local", which means that it will not be shared by
threads (each thread will have a separate copy of the variable). A variable may be defined as a global
"constant," which indicates that the contents of the variable will never be modified (enabling better
optimization, allowing the global data to be placed in the read-only section of an executable, etc). Note that
variables that need runtime initialization cannot be marked "constant" as there is a store to the variable.

LLVM explicitly allows declarations of global variables to be marked constant, even if the final definition of
the global is not. This capability can be used to enable slightly better optimization of the program, but requires
the language definition to guarantee that optimizations based on the 'constantness' are valid for the translation
units that do not include the definition.

As SSA values, global variables define pointer values that are in scope (i.e. they dominate) all basic blocks in
the program. Global variables always define a pointer to their "content" type because they describe a region of
memory, and all memory objects in LLVM are accessed through pointers.

A global variable may be declared to reside in a target-specific numbered address space. For targets that
support them, address spaces may affect how optimizations are performed and/or what target instructions are
used to access the variable. The default address space is zero. The address space qualifier must precede any
other attributes.

LLVM allows an explicit section to be specified for globals. If the target supports it, it will emit globals to the
section specified.

An explicit alignment may be specified for a global. If not present, or if the alignment is set to zero, the
alignment of the global is set by the target to whatever it feels convenient. If an explicit alignment is specified,
the global is forced to have at least that much alignment. All alignments must be a power of 2.

For example, the following defines a global in a numbered address space with an initializer, section, and
alignment:

@G = addrspace(5) constant float 1.0, section "foo", align 4
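
Two further illustrative declarations (invented for this sketch) showing the thread_local and constant
markers discussed above:

@counter = thread_local global i32 0                    ; each thread sees its own copy
@table   = constant [2 x i32] [i32 1, i32 2], align 8   ; contents are never modified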

Functions
LLVM function definitions consist of the "define" keyword, an optional linkage type, an optional visibility
style, an optional calling convention, a return type, an optional parameter attribute for the return type, a
function name, a (possibly empty) argument list (each with optional parameter attributes), optional function
attributes, an optional section, an optional alignment, an optional garbage collector name, an opening curly
brace, a list of basic blocks, and a closing curly brace.

LLVM function declarations consist of the "declare" keyword, an optional linkage type, an optional
visibility style, an optional calling convention, a return type, an optional parameter attribute for the return
type, a function name, a possibly empty list of arguments, an optional alignment, and an optional garbage
collector name.

A function definition contains a list of basic blocks, forming the CFG (Control Flow Graph) for the function.
Each basic block may optionally start with a label (giving the basic block a symbol table entry), contains a list
of instructions, and ends with a terminator instruction (such as a branch or function return).

The first basic block in a function is special in two ways: it is immediately executed on entrance to the
function, and it is not allowed to have predecessor basic blocks (i.e. there can not be any branches to the entry
block of a function). Because the block can have no predecessors, it also cannot have any PHI nodes.

LLVM allows an explicit section to be specified for functions. If the target supports it, it will emit functions to
the section specified.

An explicit alignment may be specified for a function. If not present, or if the alignment is set to zero, the
alignment of the function is set by the target to whatever it feels convenient. If an explicit alignment is
specified, the function is forced to have at least that much alignment. All alignments must be a power of 2.

Syntax:

define [linkage] [visibility]
[cconv] [ret attrs]
<ResultType> @<FunctionName> ([argument list])
[fn Attrs] [section "name"] [align N]
[gc] { ... }
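
Putting several of the optional pieces together, a definition might look like the following sketch (the
function name, section name, and alignment are chosen arbitrarily for illustration):

define internal fastcc i32 @square(i32 %x) nounwind section ".text.hot" align 16 {
entry:
  %r = mul i32 %x, %x
  ret i32 %r
}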

Aliases
Aliases act as a "second name" for the aliasee value (which can be either a function, a global variable, another alias
or bitcast of global value). Aliases may have an optional linkage type, and an optional visibility style.

Syntax:

@<Name> = alias [Linkage] [Visibility] <AliaseeTy> @<Aliasee>
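
For example (names invented), an alias can give an internal second name to an existing global:

@real      = global i32 0
@alternate = alias internal i32* @real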

Named Metadata
Named metadata is a collection of metadata. Metadata nodes (but not metadata strings) and null are the only
valid operands for a named metadata.

Syntax:

!1 = metadata !{metadata !"one"}
!name = !{null, !1}

Parameter Attributes
The return type and each parameter of a function type may have a set of parameter attributes associated with
them. Parameter attributes are used to communicate additional information about the result or parameters of a
function. Parameter attributes are considered to be part of the function, not of the function type, so functions
with different parameter attributes can have the same function type.

Parameter attributes are simple keywords that follow the type specified. If multiple parameter attributes are
needed, they are space separated. For example:

declare i32 @printf(i8* noalias nocapture, ...)
declare i32 @atoi(i8 zeroext)
declare signext i8 @returns_signed_char()

Note that any attributes for the function result (nounwind, readonly) come immediately after the
argument list.

Currently, only the following parameter attributes are defined:

zeroext
This indicates to the code generator that the parameter or return value should be zero-extended to a
32-bit value by the caller (for a parameter) or the callee (for a return value).
signext
This indicates to the code generator that the parameter or return value should be sign-extended to a
32-bit value by the caller (for a parameter) or the callee (for a return value).
inreg
This indicates that this parameter or return value should be treated in a special target-dependent
fashion while emitting code for a function call or return (usually, by putting it in a register as
opposed to memory, though some targets use it to distinguish between two different kinds of
registers). Use of this attribute is target-specific.
byval
This indicates that the pointer parameter should really be passed by value to the function. The
attribute implies that a hidden copy of the pointee is made between the caller and the callee, so the
callee is unable to modify the value in the caller. This attribute is only valid on LLVM pointer
arguments. It is generally used to pass structs and arrays by value, but is also valid on pointers to
scalars. The copy is considered to belong to the caller not the callee (for example, readonly
functions should not write to byval parameters). This is not a valid attribute for return values. The
byval attribute also supports specifying an alignment with the align attribute. This has a
target-specific effect on the code generator that usually indicates a desired alignment for the
synthesized stack slot.
sret
This indicates that the pointer parameter specifies the address of a structure that is the return value of
the function in the source program. This pointer must be guaranteed by the caller to be valid: loads
and stores to the structure may be assumed by the callee not to trap. This may only be applied to the
first parameter. This is not a valid attribute for return values.
noalias
This indicates that the pointer does not alias any global or any other parameter. The caller is
responsible for ensuring that this is the case. On a function return value, noalias additionally
indicates that the pointer does not alias any other pointers visible to the caller. For further details,
please see the discussion of the NoAlias response in alias analysis.
nocapture
This indicates that the callee does not make any copies of the pointer that outlive the callee itself. This
is not a valid attribute for return values.
nest
This indicates that the pointer parameter can be excised using the trampoline intrinsics. This is not a
valid attribute for return values.
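
As a sketch of how these attributes typically appear together (the struct and function names are invented),
a C function that returns a structure through a hidden pointer and takes another structure by copy might be
declared as:

%struct.Point = type { i32, i32 }

; first parameter: hidden return slot (sret); second parameter: passed by copy (byval)
declare void @make_point(%struct.Point* sret, %struct.Point* byval align 4)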

Garbage Collector Names
Each function may specify a garbage collector name, which is simply a string:

define void @f() gc "name" { ... }

The compiler declares the supported values of name. Specifying a collector will cause the compiler to
alter its output in order to support the named garbage collection algorithm.

Function Attributes
Function attributes are set to communicate additional information about a function. Function attributes are
considered to be part of the function, not of the function type, so functions with different function attributes
can have the same function type.

Function attributes are simple keywords that follow the type specified. If multiple attributes are needed, they
are space separated. For example:

define void @f() noinline { ... }
define void @f() alwaysinline { ... }
define void @f() alwaysinline optsize { ... }
define void @f() optsize { ... }

alignstack(<n>)
This attribute indicates that, when emitting the prologue and epilogue, the backend should forcibly
align the stack pointer. Specify the desired alignment, which must be a power of two, in parentheses.
alwaysinline
This attribute indicates that the inliner should attempt to inline this function into callers whenever
possible, ignoring any active inlining size threshold for this caller.
inlinehint
This attribute indicates that the source code contained a hint that inlining this function is desirable
(such as the "inline" keyword in C/C++). It is just a hint; it imposes no requirements on the inliner.
noinline
This attribute indicates that the inliner should never inline this function in any situation. This attribute
may not be used together with the alwaysinline attribute.
optsize
This attribute suggests that optimization passes and code generator passes make choices that keep the
code size of this function low, and otherwise do optimizations specifically to reduce code size.
noreturn
This function attribute indicates that the function never returns normally. This produces undefined
behavior at runtime if the function ever does dynamically return.
nounwind
This function attribute indicates that the function never returns with an unwind or exceptional control
flow. If the function does unwind, its runtime behavior is undefined.
readnone
This attribute indicates that the function computes its result (or decides to unwind an exception) based
strictly on its arguments, without dereferencing any pointer arguments or otherwise accessing any
mutable state (e.g. memory, control registers, etc) visible to caller functions. It does not write through
any pointer arguments (including byval arguments) and never changes any state visible to callers.
This means that it cannot unwind exceptions by calling the C++ exception throwing methods, but
could use the unwind instruction.
readonly
This attribute indicates that the function does not write through any pointer arguments (including
byval arguments) or otherwise modify any state (e.g. memory, control registers, etc) visible to caller
functions. It may dereference pointer arguments and read state that may be set in the caller. A
readonly function always returns the same value (or unwinds an exception identically) when called
with the same set of arguments and global state. It cannot unwind an exception by calling the C++
exception throwing methods, but may use the unwind instruction.

ssp
This attribute indicates that the function should emit a stack smashing protector. It is in the form of a
"canary"—a random value placed on the stack before the local variables that's checked upon return
from the function to see if it has been overwritten. A heuristic is used to determine if a function needs
stack protectors or not.

If a function that has an ssp attribute is inlined into a function that doesn't have an ssp attribute,
then the resulting function will have an ssp attribute.
sspreq
This attribute indicates that the function should always emit a stack smashing protector. This
overrides the ssp function attribute.

If a function that has an sspreq attribute is inlined into a function that doesn't have an sspreq
attribute or which has an ssp attribute, then the resulting function will have an sspreq attribute.
noredzone
This attribute indicates that the code generator should not use a red zone, even if the target-specific
ABI normally permits it.
noimplicitfloat
This attribute disables implicit floating point instructions.
naked
This attribute disables prologue / epilogue emission for the function. This can have very
system-specific consequences.

Module-Level Inline Assembly
Modules may contain "module-level inline asm" blocks, which correspond to the GCC "file scope inline
asm" blocks. These blocks are internally concatenated by LLVM and treated as a single unit, but may be
separated in the .ll file if desired. The syntax is very simple:

module asm "inline asm code goes here"
module asm "more can go here"

The strings can contain any character by escaping non-printable characters. The escape sequence used is
simply "\xx" where "xx" is the two digit hex code for the number.

The inline asm code is simply printed to the machine code .s file when assembly code is generated.

Data Layout
A module may specify a target specific data layout string that specifies how data is to be laid out in memory.
The syntax for the data layout is simply:

target datalayout = "layout specification"

The layout specification consists of a list of specifications separated by the minus sign character ('-'). Each
specification starts with a letter and may include other information after the letter to define some aspect of the
data layout. The specifications accepted are as follows:

E
Specifies that the target lays out data in big-endian form. That is, the bits with the most significance
have the lowest address location.
e
Specifies that the target lays out data in little-endian form. That is, the bits with the least significance
have the lowest address location.
p:<size>:<abi>:<pref>
This specifies the size of a pointer and its abi and preferred alignments. All sizes are in bits.
Specifying the pref alignment is optional. If omitted, the preceding : should be omitted too.
i<size>:<abi>:<pref>
This specifies the alignment for an integer type of a given bit size. The value of size must be in the
range [1,2^23).
v<size>:<abi>:<pref>
This specifies the alignment for a vector type of a given bit size.
f<size>:<abi>:<pref>
This specifies the alignment for a floating point type of a given bit size. The value of size must be
either 32 (float) or 64 (double).
a<size>:<abi>:<pref>
This specifies the alignment for an aggregate type of a given bit size.
s<size>:<abi>:<pref>
This specifies the alignment for a stack object of a given bit size.
n<size1>:<size2>:<size3>...
This specifies a set of native integer widths for the target CPU in bits. For example, it might contain
"n32" for 32-bit PowerPC, "n32:64" for PowerPC 64, or "n8:16:32:64" for X86-64. Elements of this
set are considered to support most general arithmetic operations efficiently.
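
For instance, a layout string resembling what a 32-bit x86 target might use (shown only to illustrate the
notation; consult the target for its authoritative string) is:

target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-n8:16:32"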

When constructing the data layout for a given target, LLVM starts with a default set of specifications which
are then (possibly) overridden by the specifications in the datalayout keyword. The default specifications
are given in this list:

• E - big endian
• p:64:64:64 - 64-bit pointers with 64-bit alignment
• i1:8:8 - i1 is 8-bit (byte) aligned
• i8:8:8 - i8 is 8-bit (byte) aligned
• i16:16:16 - i16 is 16-bit aligned
• i32:32:32 - i32 is 32-bit aligned
• i64:32:64 - i64 has ABI alignment of 32-bits but preferred alignment of 64-bits
• f32:32:32 - float is 32-bit aligned
• f64:64:64 - double is 64-bit aligned
• v64:64:64 - 64-bit vector is 64-bit aligned
• v128:128:128 - 128-bit vector is 128-bit aligned
• a0:0:1 - aggregates are 8-bit aligned
• s0:64:64 - stack objects are 64-bit aligned

When LLVM is determining the alignment for a given type, it uses the following rules:

1. If the type sought is an exact match for one of the specifications, that specification is used.
2. If no match is found, and the type sought is an integer type, then the smallest integer type that is larger
than the bitwidth of the sought type is used. If none of the specifications are larger than the bitwidth
then the largest integer type is used. For example, given the default specifications above, the i7
type will use the alignment of i8 (next largest) while both i65 and i256 will use the alignment of i64
(largest specified).
3. If no match is found, and the type sought is a vector type, then the largest vector type that is smaller
than the sought vector type will be used as a fall back. This happens because <128 x double> can be
implemented in terms of 64 <2 x double>, for example.

Pointer Aliasing Rules
Any memory access must be done through a pointer value associated with an address range of the memory
access, otherwise the behavior is undefined. Pointer values are associated with address ranges according to
the following rules:

• A pointer value formed from a getelementptr instruction is associated with the addresses associated
with the first operand of the getelementptr.
• An address of a global variable is associated with the address range of the variable's storage.
• The result value of an allocation instruction is associated with the address range of the allocated storage.
• A null pointer in the default address-space is associated with no address.
• A pointer value formed by an inttoptr is associated with all address ranges of all pointer values that
contribute (directly or indirectly) to the computation of the pointer's value.
• The result value of a bitcast is associated with all addresses associated with the operand of the bitcast.
• An integer constant other than zero or a pointer value returned from a function not defined within LLVM
may be associated with address ranges allocated through mechanisms other than those provided by LLVM.
Such ranges shall not overlap with any ranges of addresses allocated by mechanisms provided by LLVM.

LLVM IR does not associate types with memory. The result type of a load merely indicates the size and
alignment of the memory from which to load, as well as the interpretation of the value. The first operand of
a store similarly only indicates the size and alignment of the store.

Consequently, type-based alias analysis, aka TBAA, aka -fstrict-aliasing, is not applicable to general
unadorned LLVM IR. Metadata may be used to encode additional information which specialized optimization
passes may use to implement type-based alias analysis.

Type System
The LLVM type system is one of the most important features of the intermediate representation. Being typed
enables a number of optimizations to be performed on the intermediate representation directly, without having
to do extra analyses on the side before the transformation. A strong type system makes it easier to read the
generated code and enables novel analyses and transformations that are not feasible to perform on normal
three address code representations.

Type Classifications
The types fall into a few useful classifications:

Classification    Types
integer           i1, i2, i3, ... i8, i16, i32, i64, ...
floating point    float, double, x86_fp80, fp128, ppc_fp128
first class       integer, floating point, pointer, vector, structure, union, array, label, metadata
primitive         label, void, floating point, metadata
derived           integer, array, function, pointer, structure, packed structure, union, vector, opaque

The first class types are perhaps the most important. Values of these types are the only ones which can be
produced by instructions.

Primitive Types
The primitive types are the fundamental building blocks of the LLVM system.

Integer Type
Overview:
The integer type is a very simple type that simply specifies an arbitrary bit width for the integer type
desired. Any bit width from 1 bit to 2^23-1 (about 8 million) can be specified.
Syntax:
iN
The number of bits the integer will occupy is specified by the N value.
Examples:
i1           a single-bit integer.
i32          a 32-bit integer.
i1942652     a really big integer of over 1 million bits.

Floating Point Types
Type         Description
float        32-bit floating point value
double       64-bit floating point value
fp128        128-bit floating point value (112-bit mantissa)
x86_fp80     80-bit floating point value (X87)
ppc_fp128    128-bit floating point value (two 64-bits)

Void Type
Overview:
The void type does not represent any value and has no size.
Syntax:
void

Label Type
Overview:
The label type represents code labels.
Syntax:
label

Metadata Type

Overview:
The metadata type represents embedded metadata. No derived types may be created from metadata except for
function arguments.
Syntax:
metadata

Derived Types
The real power in LLVM comes from the derived types in the system. This is what allows a programmer to
represent arrays, functions, pointers, and other useful types. Each of these types contain one or more element
types which may be a primitive type, or another derived type. For example, it is possible to have a two
dimensional array, using an array as the element type of another array.

Aggregate Types
Aggregate Types are a subset of derived types that can contain multiple member types. Arrays, structs,
vectors and unions are aggregate types.

Array Type
Overview:
The array type is a very simple derived type that arranges elements sequentially in memory. The array type
requires a size (number of elements) and an underlying data type.
Syntax:
[<# elements> x <elementtype>]
The number of elements is a constant integer value. elementtype may be any type with a size.
Examples:
[40 x i32]    Array of 40 32-bit integer values.
[41 x i32]    Array of 41 32-bit integer values.
[4 x i8]      Array of 4 8-bit integer values.

Here are some examples of multidimensional arrays:
[3 x [4 x i32]]         3x4 array of 32-bit integer values.
[12 x [10 x float]]     12x10 array of single precision floating point values.
[2 x [3 x [4 x i16]]]   2x3x4 array of 16-bit integer values.

There is no restriction on indexing beyond the end of the array implied by a static type (though there are
restrictions on indexing beyond the bounds of an allocated object in some cases). This means that
single-dimension 'variable sized array' addressing can be implemented in LLVM with a zero length array type.
An implementation of 'pascal style arrays' in LLVM could use the type "{ i32, [0 x float]}", for example.

Function Type

Overview:
The function type can be thought of as a function signature. It consists of a return type and a list of formal
parameter types. The return type of a function type is a scalar type, a void type, a struct type, or a union
type. If the return type is a struct type then all struct elements must be of first class types, and the
struct must have at least one element. Variable argument functions can access their arguments with the
variable argument handling intrinsic functions.
Syntax:
<returntype> (<parameter list>)
...where '<parameter list>' is a comma-separated list of type specifiers. Optionally, the parameter list may
include a type ..., which indicates that the function takes a variable number of arguments.
'<returntype>' is any type except label.
Examples:
i32 (i32)                function taking an i32, returning an i32
float (i16, i32 *) *     Pointer to a function that takes an i16 and a pointer to i32, returning float.
i32 (i8*, ...)           A vararg function that takes at least one pointer to i8 (char in C), which
                         returns an integer. This is the signature for printf in LLVM.
{i32, i32} (i32)         A function taking an i32, returning a structure containing two i32 values

Structure Type
Overview:
The structure type is used to represent a collection of data members together in memory. The packing of the
field types is defined to match the ABI of the underlying processor. The elements of a structure may be any
type that has a size.
Structures in memory are accessed using 'load' and 'store' by getting a pointer to a field with the
'getelementptr' instruction. Structures in registers are accessed using the 'extractvalue' and
'insertvalue' instructions.
Syntax:
{ <type list> }
Examples:
{ i32, i32, i32 }        A triple of three i32 values
{ float, i32 (i32) * }   A pair, where the first element is a float and the second element is a pointer
                         to a function that takes an i32, returning an i32.

Packed Structure Type
Overview:
The packed structure type is used to represent a collection of data members together in memory. There is no
padding between fields. Further, the alignment of a packed structure is 1 byte. The elements of a packed
structure may be any type that has a size.

Structures are accessed using 'load' and 'store' by getting a pointer to a field with the 'getelementptr'
instruction.
Syntax:
< { <type list> } >
Examples:
< { i32, i32, i32 } >        A triple of three i32 values
< { float, i32 (i32)* } >    A pair, where the first element is a float and the second element is a
                             pointer to a function that takes an i32, returning an i32.

Union Type
Overview:
A union type describes an object with size and alignment suitable for an object of any one of a given set of
types (also known as an "untagged" union). It is similar in concept and usage to a struct, except that all
members of the union have an offset of zero. The elements of a union may be any type that has a size. Unions
must have at least one member - empty unions are not allowed.
The size of the union as a whole will be the size of its largest member, and the alignment requirements of the
union as a whole will be the largest alignment requirement of any member.
Union members are accessed using 'load' and 'store' by getting a pointer to a field with the 'getelementptr'
instruction. Since all members are at offset zero, the getelementptr instruction does not affect the address,
only the type of the resulting pointer.
Syntax:
union { <type list> }
Examples:
union { i32, i32*, float }   A union of three types: an i32, a pointer to an i32, and a float.

Pointer Type
Overview:
The pointer type is used to specify memory locations. Pointers are commonly used to reference objects in
memory.
Pointer types may have an optional address space attribute defining the numbered address space where the
pointed-to object resides. The default address space is number zero. The semantics of non-zero address spaces
are target-specific.
Note that LLVM does not permit pointers to void (void*) nor does it permit pointers to labels (label*).
Use i8* instead.

Syntax:
    <type> *

Examples:
    [4 x i32]*         A pointer to array of four i32 values.
    i32 (i32 *) *      A pointer to a function that takes an i32*, returning an i32.
    i32 addrspace(5)*  A pointer to an i32 value that resides in address space #5.

Vector Type

Overview: A vector type is a simple derived type that represents a vector of elements. Vector types are used when multiple primitive data are operated in parallel using a single instruction (SIMD). A vector type requires a size (number of elements) and an underlying primitive data type. Vector types are considered first class.

Syntax:
    < <# elements> x <elementtype> >

The number of elements is a constant integer value; elementtype may be any integer or floating point type.

Examples:
    <4 x i32>    Vector of 4 32-bit integer values.
    <8 x float>  Vector of 8 32-bit floating-point values.
    <2 x i64>    Vector of 2 64-bit integer values.

Opaque Type

Overview: Opaque types are used to represent unknown types in the system. This corresponds (for example) to the C notion of a forward declared structure type. In LLVM, opaque types can eventually be resolved to any type (not just a structure type).

Syntax:
    opaque

Examples:
    opaque    An opaque type.
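A short sketch of the forward-declaration use case (not from the original text; the %node name is hypothetical):

    %node = type opaque             ; layout unknown for now; may be resolved to any type later
    @head = external global %node*  ; pointers to an opaque type can be formed and passed around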

Type Up-references

Overview: An "up reference" allows you to refer to a lexically enclosing type without requiring it to have a name. For instance, a structure declaration may contain a pointer to any of the types it is lexically a member of. An up reference is needed by the asmprinter for printing out cyclic types when there is no declared name for a type in the cycle. Because the asmprinter does not want to print out an infinite type string, it needs a syntax to handle recursive types that have no names (all names are optional in LLVM IR).

Syntax:
    \<level>

The level is the count of the lexical type that is being referred to.

Examples of up references (with their equivalent as named type declarations) include:
    { \2 * }    %x = type { %x* }
    { \2 }*     %y = type { %y }*
    \1*         %z = type %z*

Examples:
    \1*                   Self-referential pointer.
    { { \3*, i8 }, i32 }  Recursive structure where the upref refers to the out-most structure.

Constants

LLVM has several different basic types of constants. This section describes them all and their syntax.

Simple Constants

Boolean constants: The two strings 'true' and 'false' are both valid constants of the i1 type.

Integer constants: Standard integers (such as '4') are constants of the integer type. Negative numbers may be used with integer types.

Floating point constants: Floating point constants use standard decimal notation (e.g. 123.421), exponential notation (e.g. 1.23421e+2), or a more precise hexadecimal notation (see below). The assembler requires the exact decimal value of a floating-point constant; for example, the assembler accepts 1.25 but rejects 1.3 because 1.3 is a repeating decimal in binary. Floating point constants must have a floating point type.

The one non-intuitive notation for constants is the hexadecimal form of floating point constants. For example, the form 'double 0x432ff973cafa8000' is equivalent to (but harder to read than) 'double 4.5e+15'. The only time hexadecimal floating point constants are required (and the only time that they are generated by the disassembler) is when a floating point constant must be emitted but it cannot be represented as a decimal floating point number in a reasonable number of digits. For example, NaN's, infinities, and other special values are represented in their IEEE hexadecimal format so that assembly and disassembly do not cause any bits to change in the constants. When using the hexadecimal form, constants of types float and double are represented using the 16-digit form shown above (which matches the IEEE754 representation for double); float values must, however, be exactly representable as IEEE754 single precision. Hexadecimal format is always used for long double, and there are three forms of long double. The 80-bit format used by x86 is represented as 0xK followed by 20 hexadecimal digits. The 128-bit format used by PowerPC (two adjacent doubles) is represented by 0xM followed by 32 hexadecimal digits. The IEEE 128-bit format is represented by 0xL followed by 32 hexadecimal digits; no currently supported target uses this format. Long doubles will only work if they match the long double format on your target. All hexadecimal formats are big-endian (sign bit at the left).

Null pointer constants: The identifier 'null' is recognized as a null pointer constant and must be of pointer type.
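To illustrate the simple constant forms above, a small sketch (not part of the original manual; the global names are hypothetical):

    @a = global float 1.25                 ; exact decimal constant (1.3 would be rejected)
    @b = global double 0x3FF8000000000000  ; hexadecimal form of the value 1.5
    @c = global double 0x7FF8000000000000  ; a quiet NaN, expressible only in hexadecimal
    @p = global i32* null                  ; the null pointer constant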

Complex Constants

Complex constants are a (potentially recursive) combination of simple constants and smaller complex constants.

Structure constants: Structure constants are represented with notation similar to structure type definitions (a comma separated list of elements, surrounded by braces ({})). For example: "{ i32 4, float 17.0, i32* @G }", where "@G" is declared as "@G = external global i32". Structure constants must have structure type, and the number and types of elements must match those specified by the type.

Union constants: Union constants are represented with notation similar to a structure with a single element - that is, a single typed element surrounded by braces ({})). For example: "{ i32 4 }". The union type can be initialized with a single-element struct as long as the type of the struct element matches the type of one of the union members.

Array constants: Array constants are represented with notation similar to array type definitions (a comma separated list of elements, surrounded by square brackets ([])). For example: "[ i32 42, i32 11, i32 74 ]". Array constants must have array type, and the number and types of elements must match those specified by the type.

Vector constants: Vector constants are represented with notation similar to vector type definitions (a comma separated list of elements, surrounded by less-than/greater-than's (<>)). For example: "< i32 42, i32 11, i32 74, i32 100 >". Vector constants must have vector type, and the number and types of elements must match those specified by the type.

Zero initialization: The string 'zeroinitializer' can be used to zero initialize a value to zero of any type, including scalar and aggregate types. This is often used to avoid having to print large zero initializers (e.g. for large arrays) and is always exactly equivalent to using explicit zero initializers.

Metadata node: A metadata node is a structure-like constant with metadata type. For example: "metadata !{ i32 0, metadata !"test" }". Unlike other constants that are meant to be interpreted as part of the instruction stream, metadata is a place to attach additional information such as debug info.

Global Variable and Function Addresses

The addresses of global variables and functions are always implicitly valid (link-time) constants. These constants are explicitly referenced when the identifier for the global is used and always have pointer type. For example, the following is a legal LLVM file:

    @X = global i32 17
    @Y = global i32 42
    @Z = global [2 x i32*] [ i32* @X, i32* @Y ]
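A few more global initializers combining the complex constant forms above (a sketch, not from the original text; the names are hypothetical):

    @vals = global [3 x i32] [ i32 1, i32 2, i32 3 ]
    @pt   = global { i32, float } { i32 4, float 17.0 }
    @vec  = constant <4 x i32> < i32 1, i32 2, i32 3, i32 4 >
    @big  = global [1000 x i32] zeroinitializer    ; same meaning as writing out 1000 explicit zeros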

Undefined Values

The string 'undef' can be used anywhere a constant is expected, and indicates that the user of the value may receive an unspecified bit-pattern. Undefined values may be of any type (other than label or void) and be used anywhere a constant is permitted. Undefined values are useful because they indicate to the compiler that the program is well defined no matter what value is used. This gives the compiler more freedom to optimize. Here are some examples of (potentially surprising) transformations that are valid (in pseudo IR):

    %A = add %X, undef
    %B = sub %X, undef
    %C = xor %X, undef
  Safe:
    %A = undef
    %B = undef
    %C = undef

This is safe because all of the output bits are affected by the undef bits. Any output bit can have a zero or one depending on the input bits.

    %A = or %X, undef
    %B = and %X, undef
  Safe:
    %A = -1
    %B = 0
  Unsafe:
    %A = undef
    %B = undef

These logical operations have bits that are not always affected by the input. For example, if "%X" has a zero bit, then the output of the 'and' operation will always be a zero, no matter what the corresponding bit from the undef is. As such, it is unsafe to optimize or assume that the result of the and is undef. However, it is safe to assume that all bits of the undef could be 0, and optimize the and to 0. Likewise, it is safe to assume that all the bits of the undef operand to the or could be set, allowing the or to be folded to -1.

    %A = select undef, %X, %Y
    %B = select undef, 42, %Y
    %C = select %X, %Y, undef
  Safe:
    %A = %X (or %Y)
    %B = 42 (or %Y)
    %C = %Y
  Unsafe:
    %A = undef
    %B = undef
    %C = undef

This set of examples shows that undefined select (and conditional branch) conditions can go "either way", but they have to come from one of the two operands. In the %A example, if %X and %Y were both known to have a clear low bit, then %A would have to have a cleared low bit. However, in the %C example, the optimizer is allowed to assume that the undef operand could be the same as %Y, allowing the whole select to be eliminated.

    %A = xor undef, undef
    %B = undef
    %C = xor %B, %B
    %D = undef
    %E = icmp lt %D, 4
    %F = icmp gte %D, 4
  Safe:
    %A = undef
    %B = undef
    %C = undef
    %D = undef
    %E = undef
    %F = undef

This example points out that two undef operands are not necessarily the same. This can be surprising to people (and also matches C semantics) where they assume that "X^X" is always zero, even if X is undef. This isn't true for a number of reasons, but the short answer is that an undef "variable" can arbitrarily change its value over its "live range". This is true because the "variable" doesn't actually have a live range. Instead, the value is logically read from arbitrary registers that happen to be around when needed, so the value is not necessarily consistent over time.

In fact, %A and %C need to have the same semantics or the core LLVM "replace all uses with" concept would not hold.

    %A = fdiv undef, %X
    %B = fdiv %X, undef
  Safe:
    %A = undef
  b: unreachable

These examples show the crucial difference between an undefined value and undefined behavior. An undefined value (like undef) is allowed to have an arbitrary bit-pattern. This means that the %A operation can be constant folded to undef, because the undef could be an SNaN, and fdiv is not (currently) defined on SNaN's. However, in the second example, we can make a more aggressive assumption: because the undef is allowed to be an arbitrary value, we are allowed to assume that it could be zero. Since a divide by zero has undefined behavior, we are allowed to assume that the operation does not execute at all. This allows us to delete the divide and all code after it: since the undefined operation "can't happen", the optimizer can assume that it occurs in dead code.

    a: store undef -> %X
    b: store %X -> undef
  Safe:
    a: <deleted>
    b: unreachable

These examples reiterate the fdiv example: a store "of" an undefined value can be assumed to not have any effect: we can assume that the value is overwritten with bits that happen to match what was already there. However, a store "to" an undefined location could clobber arbitrary memory; therefore, it has undefined behavior.

Addresses of Basic Blocks

    blockaddress(@function, %block)

The 'blockaddress' constant computes the address of the specified basic block in the specified function, and always has an i8* type. Taking the address of the entry block is illegal. This value only has defined behavior when used as an operand to the 'indirectbr' instruction or for comparisons against null. Pointer equality testing between label addresses is undefined behavior - though, again, comparison against null is ok, and no label is equal to the null pointer. This may also be passed around as an opaque pointer sized value as long as the bits are not inspected. This allows ptrtoint and arithmetic to be performed on these values so long as the original value is reconstituted before the indirectbr. Finally, some targets may provide defined semantics when using the value as the operand to an inline assembly, but that is target specific.

Constant Expressions

Constant expressions are used to allow expressions involving other constants to be used as constants. Constant expressions may be of any first class type and may involve any LLVM operation that does not have side effects (e.g. load and call are not supported). The following is the syntax for constant expressions:

trunc ( CST to TYPE )
    Truncate a constant to another type. The bit size of CST must be larger than the bit size of TYPE. Both types must be integers.
zext ( CST to TYPE )
    Zero extend a constant to another type. The bit size of CST must be smaller or equal to the bit size of TYPE. Both types must be integers.
sext ( CST to TYPE )
    Sign extend a constant to another type. The bit size of CST must be smaller or equal to the bit size of TYPE. Both types must be integers.
fptrunc ( CST to TYPE )
    Truncate a floating point constant to another floating point type. The size of CST must be larger than the size of TYPE. Both types must be floating point.
fpext ( CST to TYPE )
    Floating point extend a constant to another type. The size of CST must be smaller or equal to the size of TYPE. Both types must be floating point.
fptoui ( CST to TYPE )
    Convert a floating point constant to the corresponding unsigned integer constant. TYPE must be a scalar or vector integer type. CST must be of scalar or vector floating point type. Both CST and TYPE must be scalars, or vectors of the same number of elements. If the value won't fit in the integer type, the results are undefined.
fptosi ( CST to TYPE )
    Convert a floating point constant to the corresponding signed integer constant. TYPE must be a scalar or vector integer type. CST must be of scalar or vector floating point type. Both CST and TYPE must be scalars, or vectors of the same number of elements. If the value won't fit in the integer type, the results are undefined.
uitofp ( CST to TYPE )
    Convert an unsigned integer constant to the corresponding floating point constant. TYPE must be a scalar or vector floating point type. CST must be of scalar or vector integer type. Both CST and TYPE must be scalars, or vectors of the same number of elements. If the value won't fit in the floating point type, the results are undefined.
sitofp ( CST to TYPE )
    Convert a signed integer constant to the corresponding floating point constant. TYPE must be a scalar or vector floating point type. CST must be of scalar or vector integer type. Both CST and TYPE must be scalars, or vectors of the same number of elements. If the value won't fit in the floating point type, the results are undefined.
ptrtoint ( CST to TYPE )
    Convert a pointer typed constant to the corresponding integer constant. TYPE must be an integer type. CST must be of pointer type. The CST value is zero extended, truncated, or unchanged to make it fit in TYPE.
inttoptr ( CST to TYPE )
    Convert an integer constant to a pointer constant. TYPE must be a pointer type. CST must be of integer type. The CST value is zero extended, truncated, or unchanged to make it fit in a pointer size. This one is really dangerous!
bitcast ( CST to TYPE )
    Convert a constant, CST, to another TYPE. The constraints of the operands are the same as those for the bitcast instruction.

getelementptr ( CSTPTR, IDX0, IDX1, ... )
getelementptr inbounds ( CSTPTR, IDX0, IDX1, ... )
    Perform the getelementptr operation on constants. As with the getelementptr instruction, the index list may have zero or more indexes, which are required to make sense for the type of "CSTPTR".
select ( COND, VAL1, VAL2 )
    Perform the select operation on constants.
icmp COND ( VAL1, VAL2 )
    Performs the icmp operation on constants.
fcmp COND ( VAL1, VAL2 )
    Performs the fcmp operation on constants.
extractelement ( VAL, IDX )
    Perform the extractelement operation on constants.
insertelement ( VAL, ELT, IDX )
    Perform the insertelement operation on constants.
shufflevector ( VEC1, VEC2, IDXMASK )
    Perform the shufflevector operation on constants.
OPCODE ( LHS, RHS )
    Perform the specified operation of the LHS and RHS constants. OPCODE may be any of the binary or bitwise binary operations. The constraints on operands are the same as those for the corresponding instruction (e.g. no bitwise operations on floating point values are allowed).
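As an illustration of constant expressions used as global initializers, a small sketch (not from the original text; names are hypothetical):

    @X     = global i32 42
    @Xint  = global i64 ptrtoint (i32* @X to i64)                      ; address of @X as a 64-bit integer
    @Xbyte = global i8* bitcast (i32* @X to i8*)                       ; the same address viewed as i8*
    @arr   = global [4 x i32] zeroinitializer
    @third = global i32* getelementptr ([4 x i32]* @arr, i64 0, i64 2) ; address of the third element of @arr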

Other Values

Inline Assembler Expressions

LLVM supports inline assembler expressions (as opposed to Module-Level Inline Assembly) through the use of a special value. This value represents the inline assembler as a string (containing the instructions to emit), a list of operand constraints (stored as a string), a flag that indicates whether or not the inline asm expression has side effects, and a flag indicating whether the function containing the asm needs to align its stack conservatively. An example inline assembler expression is:

    i32 (i32) asm "bswap $0", "=r"

Inline assembler expressions may only be used as the callee operand of a call instruction. Thus, typically we have:

    %X = call i32 asm "bswap $0", "=r"(i32 %Y)

Inline asms with side effects not visible in the constraint list must be marked as having side effects. This is done through the use of the 'sideeffect' keyword, like so:

    call void asm sideeffect "eieio", ""()

In some cases inline asms will contain code that will not work unless the stack is aligned in some way, such as calls or SSE instructions on x86, yet will not contain code that does that alignment within the asm. The compiler should make conservative assumptions about what the asm might contain and should generate its usual stack alignment code in the prologue if the 'alignstack' keyword is present:

    call void asm alignstack "eieio", ""()

If both keywords appear, the 'sideeffect' keyword must come first.

TODO: The format of the asm and constraints string still need to be documented here. Constraints on what can be done (e.g. duplication, moving, etc.) need to be documented. This is probably best done by reference to another document that covers inline asm from a holistic perspective.

Metadata Nodes and Metadata Strings

LLVM IR allows metadata to be attached to instructions in the program that can convey extra information about the code to the optimizers and code generator. One example application of metadata is source-level debug information. There are two metadata primitives: strings and nodes. All metadata has the metadata type and is identified in syntax by a preceding exclamation point ('!').

A metadata string is a string surrounded by double quotes. It can contain any character by escaping non-printable characters with "\xx" where "xx" is the two digit hex code. For example: "!"test\00"".

Metadata nodes are represented with notation similar to structure constants (a comma separated list of elements, surrounded by braces and preceded by an exclamation point). For example: "!{ metadata !"test\00", i32 10}". Metadata nodes can have any values as their operand.

A named metadata is a collection of metadata nodes, which can be looked up in the module symbol table. For example: "!foo = metadata !{!4, !3}".

Metadata can be used as function arguments. Here the llvm.dbg.value function is passed two metadata arguments:

    call void @llvm.dbg.value(metadata !24, i64 0, metadata !25)

Metadata can be attached to an instruction. Here metadata node !21 is attached to the add instruction using the !dbg identifier:

    %indvar.next = add i64 %indvar, 1, !dbg !21

Intrinsic Global Variables

LLVM has a number of "magic" global variables that contain data that affect code generation or other IR semantics. These are documented here. All globals of this sort should have a section specified as "llvm.metadata". This section and all globals that start with "llvm." are reserved for use by LLVM.

The 'llvm.used' Global Variable

The @llvm.used global is an array with i8* element type which has appending linkage. This array contains a list of pointers to global variables and functions which may optionally have a pointer cast formed of bitcast or getelementptr. For example, a legal use of it is:

    @X = global i8 4
    @Y = global i32 123
    @llvm.used = appending global [2 x i8*] [ i8* @X, i8* bitcast (i32* @Y to i8*) ], section "llvm.metadata"

If a global variable appears in the @llvm.used list, then the compiler, assembler, and linker are required to treat the symbol as if there is a reference to the global that it cannot see. For example, if a variable has internal linkage and no references other than that from the @llvm.used list, it cannot be deleted. This is commonly used to represent references from inline asms and other things the compiler cannot "see", and corresponds to "attribute((used))" in GNU C.

On some targets, the code generator must emit a directive to the assembler or object file to prevent the assembler and linker from molesting the symbol.

The 'llvm.compiler.used' Global Variable

The @llvm.compiler.used directive is the same as the @llvm.used directive, except that it only prevents the compiler from touching the symbol. On targets that support it, this allows an intelligent linker to optimize references to the symbol without being impeded as it would be by @llvm.used. This is a rare construct that should only be used in rare circumstances, and should not be exposed to source languages.

The 'llvm.global_ctors' Global Variable

TODO: Describe this.

The 'llvm.global_dtors' Global Variable

TODO: Describe this.

Instruction Reference

The LLVM instruction set consists of several different classifications of instructions: terminator instructions, binary instructions, bitwise binary instructions, memory instructions, and other instructions.

Terminator Instructions

As mentioned previously, every basic block in a program ends with a "Terminator" instruction, which indicates which block should be executed after the current block is finished. These terminator instructions typically yield a 'void' value: they produce control flow, not values (the one exception being the 'invoke' instruction). There are seven different terminator instructions: the 'ret' instruction, the 'br' instruction, the 'switch' instruction, the 'indirectbr' instruction, the 'invoke' instruction, the 'unwind' instruction, and the 'unreachable' instruction.

'ret' Instruction

Syntax:
    ret <type> <value>    ; Return a value from a non-void function
    ret void              ; Return from void function

Overview: The 'ret' instruction is used to return control flow (and optionally a value) from a function back to the caller. There are two forms of the 'ret' instruction: one that returns a value and then causes control flow, and one that just causes control flow to occur.

Arguments: The 'ret' instruction optionally accepts a single argument, the return value. The type of the return value must be a 'first class' type. A function is not well formed if it has a non-void return type and contains a 'ret' instruction with no return value or a return value with a type that does not match its type, or if it has a void return type and contains a 'ret' instruction with a return value.

Semantics: When the 'ret' instruction is executed, control flow returns back to the calling function's context. If the caller is a "call" instruction, execution continues at the instruction after the call. If the caller was an "invoke" instruction, execution continues at the beginning of the "normal" destination block. If the instruction returns a value, that value shall set the call or invoke instruction's return value.

Example:
    ret i32 5                          ; Return an integer value of 5
    ret void                           ; Return from a void function
    ret { i32, i8 } { i32 4, i8 2 }    ; Return a struct of values 4 and 2

'br' Instruction

Syntax:
    br i1 <cond>, label <iftrue>, label <iffalse>
    br label <dest>    ; Unconditional branch

Overview: The 'br' instruction is used to cause control flow to transfer to a different basic block in the current function. There are two forms of this instruction, corresponding to a conditional branch and an unconditional branch.

Arguments: The conditional branch form of the 'br' instruction takes a single 'i1' value and two 'label' values. The unconditional form of the 'br' instruction takes a single 'label' value as a target.

Semantics: Upon execution of a conditional 'br' instruction, the 'i1' argument is evaluated. If the value is true, control flows to the 'iftrue' label argument. If "cond" is false, control flows to the 'iffalse' label argument.

Example:
    Test:
      %cond = icmp eq i32 %a, %b
      br i1 %cond, label %IfEqual, label %IfUnequal
    IfEqual:
      ret i32 1
    IfUnequal:
      ret i32 0

'switch' Instruction

Syntax:
    switch <intty> <value>, label <defaultdest> [ <intty> <val>, label <dest> ... ]

Overview: The 'switch' instruction is used to transfer control flow to one of several different places. It is a generalization of the 'br' instruction, allowing a branch to occur to one of many possible destinations.

Arguments: The 'switch' instruction uses three parameters: an integer comparison value 'value', a default 'label' destination, and an array of pairs of comparison value constants and 'label's. The table is not allowed to contain duplicate constant entries.

Semantics: The switch instruction specifies a table of values and destinations. When the 'switch' instruction is executed, this table is searched for the given value. If the value is found, control flow is transferred to the corresponding destination; otherwise, control flow is transferred to the default destination.

Implementation: Depending on properties of the target machine and the particular switch instruction, this instruction may be code generated in different ways. For example, it could be generated as a series of chained conditional branches or with a lookup table.

Example:
    ; Emulate a conditional br instruction
    %Val = zext i1 %value to i32
    switch i32 %Val, label %truedest [ i32 0, label %falsedest ]

    ; Emulate an unconditional br instruction
    switch i32 0, label %dest [ ]

    ; Implement a jump table:
    switch i32 %val, label %otherwise [ i32 0, label %onzero
                                        i32 1, label %onone
                                        i32 2, label %ontwo ]

'indirectbr' Instruction

Syntax:
    indirectbr <somety>* <address>, [ label <dest1>, label <dest2>, ... ]

Overview: The 'indirectbr' instruction implements an indirect branch to a label within the current function, whose address is specified by "address".

Arguments: The 'address' argument is the address of the label to jump to. The rest of the arguments indicate the full set of possible destinations that the address may point to. Blocks are allowed to occur multiple times in the destination list, though this isn't particularly useful. This destination list is required so that dataflow analysis has an accurate understanding of the CFG.

Semantics: Control transfers to the block specified in the address argument. All possible destination blocks must be listed in the label list, otherwise this instruction has undefined behavior. Address must be derived from a blockaddress constant. This implies that jumps to labels defined in other functions have undefined behavior as well.

Implementation: This is typically implemented with a jump through a register.

Example:
    indirectbr i8* %Addr, [ label %bb1, label %bb2, label %bb3 ]
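A small end-to-end sketch tying 'indirectbr' to the 'blockaddress' constant described earlier (not from the original text; names are hypothetical):

    define i32 @dispatch(i1 %which) {
    entry:
      %target = select i1 %which, i8* blockaddress(@dispatch, %one), i8* blockaddress(@dispatch, %two)
      indirectbr i8* %target, [ label %one, label %two ]
    one:
      ret i32 1
    two:
      ret i32 2
    }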

'invoke' Instruction

Syntax:
    <result> = invoke [cconv] [ret attrs] <ptr to function ty> <function ptr val>(<function args>)
                  to label <normal label> unwind label <exception label>

Overview: The 'invoke' instruction causes control to transfer to a specified function, with the possibility of control flow transfer to either the 'normal' label or the 'exception' label. If the callee function returns with the "ret" instruction, control flow will return to the "normal" label. If the callee (or any indirect callees) returns with the "unwind" instruction, control is interrupted and continued at the dynamically nearest "exception" label.

Arguments: This instruction requires several arguments:
1. The optional "cconv" marker indicates which calling convention the call should use. If none is specified, the call defaults to using C calling conventions.
2. The optional Parameter Attributes list for return values. Only 'zeroext', 'signext', and 'inreg' attributes are valid here.
3. 'ptr to function ty': shall be the signature of the pointer to function value being invoked. In most cases, this is a direct function invocation, but indirect invokes are just as possible, branching off an arbitrary pointer to function value.
4. 'function ptr val': An LLVM value containing a pointer to a function to be invoked.
5. 'function args': argument list whose types match the function signature argument types and parameter attributes. All arguments must be of first class type. If the function signature indicates the function accepts a variable number of arguments, the extra arguments can be specified.
6. 'normal label': the label reached when the called function executes a 'ret' instruction.
7. 'exception label': the label reached when a callee returns with the unwind instruction, which is used by the runtime library to unwind the stack.
8. The optional function attributes list. Only 'noreturn', 'nounwind', 'readonly' and 'readnone' attributes are valid here.

Semantics: This instruction is designed to operate as a standard 'call' instruction in most regards. The primary difference is that it establishes an association with a label, which is used by the runtime library to unwind the stack. This instruction is used in languages with destructors to ensure that proper cleanup is performed in the case of either a longjmp or a thrown exception. Additionally, this is important for implementation of 'catch' clauses in high-level languages that support them. Note that the code generator does not yet completely support unwind, and that the invoke/unwind semantics are likely to change in future versions.

For the purposes of the SSA form, the definition of the value returned by the 'invoke' instruction is deemed to occur on the edge from the current block to the "normal" label. If the callee unwinds then no return value is available.

Example:
    %retval = invoke i32 @Test(i32 15) to label %Continue
                unwind label %TestCleanup              ; {i32}:retval set
    %retval = invoke coldcc i32 %Testfnptr(i32 15) to label %Continue
                unwind label %TestCleanup              ; {i32}:retval set

'unwind' Instruction

Syntax:
    unwind

Overview: The 'unwind' instruction unwinds the stack, continuing control flow at the first callee in the dynamic call stack which used an invoke instruction to perform the call. This is primarily used to implement exception handling.

Semantics: The 'unwind' instruction causes execution of the current function to immediately halt. The dynamic call stack is then searched for the first invoke instruction on the call stack. Once found, execution continues at the "exceptional" destination block specified by the invoke instruction. If there is no invoke instruction in the dynamic call chain, undefined behavior results. Note that the code generator does not yet completely support unwind, and that the invoke/unwind semantics are likely to change in future versions.

'unreachable' Instruction

Syntax:
    unreachable

Overview: The 'unreachable' instruction has no defined semantics. This instruction is used to inform the optimizer that a particular portion of the code is not reachable. This can be used to indicate that the code after a no-return function cannot be reached, and other facts.

Semantics: The 'unreachable' instruction has no defined semantics.
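A typical use of 'unreachable' after a no-return call, sketched for illustration (not from the original text; @abort stands in for any noreturn function):

    declare void @abort() noreturn

    define i32 @must_be_positive(i32 %n) {
    entry:
      %ok = icmp sgt i32 %n, 0
      br i1 %ok, label %good, label %bad
    good:
      ret i32 %n
    bad:
      call void @abort()
      unreachable        ; control never falls through the noreturn call
    }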

Binary Operations

Binary operators are used to do most of the computation in a program. They require two operands of the same type, execute an operation on them, and produce a single value. The operands might represent multiple data, as is the case with the vector data type. The result value has the same type as its operands. There are several different binary operators:

'add' Instruction

Syntax:
    <result> = add <ty> <op1>, <op2>            ; yields {ty}:result
    <result> = add nuw <ty> <op1>, <op2>        ; yields {ty}:result
    <result> = add nsw <ty> <op1>, <op2>        ; yields {ty}:result
    <result> = add nuw nsw <ty> <op1>, <op2>    ; yields {ty}:result

Overview: The 'add' instruction returns the sum of its two operands.

Arguments: The two arguments to the 'add' instruction must be integer or vector of integer values. Both arguments must have identical types.

Semantics: The value produced is the integer sum of the two operands. If the sum has unsigned overflow, the result returned is the mathematical result modulo 2^n, where n is the bit width of the result. Because LLVM integers use a two's complement representation, this instruction is appropriate for both signed and unsigned integers. nuw and nsw stand for "No Unsigned Wrap" and "No Signed Wrap", respectively. If the nuw and/or nsw keywords are present, the result value of the add is undefined if unsigned and/or signed overflow, respectively, occurs.

Example:
    <result> = add i32 4, %var    ; yields {i32}:result = 4 + %var

'fadd' Instruction

Syntax:
    <result> = fadd <ty> <op1>, <op2>    ; yields {ty}:result

Overview: The 'fadd' instruction returns the sum of its two operands.

Arguments: The two arguments to the 'fadd' instruction must be floating point or vector of floating point values. Both arguments must have identical types.

Semantics: The value produced is the floating point sum of the two operands.

Example:
    <result> = fadd float 4.0, %var    ; yields {float}:result = 4.0 + %var
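Referring back to the nuw/nsw flags on 'add', a brief sketch of the difference (not from the original text):

    %a = add i32 %x, %y            ; wraps modulo 2^32 on overflow
    %b = add nsw i32 %x, %y        ; result undefined if signed overflow occurs
    %c = add nuw nsw i32 %x, %y    ; result undefined on either unsigned or signed overflow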

'sub' Instruction

Syntax:
    <result> = sub <ty> <op1>, <op2>            ; yields {ty}:result
    <result> = sub nuw <ty> <op1>, <op2>        ; yields {ty}:result
    <result> = sub nsw <ty> <op1>, <op2>        ; yields {ty}:result
    <result> = sub nuw nsw <ty> <op1>, <op2>    ; yields {ty}:result

Overview: The 'sub' instruction returns the difference of its two operands. Note that the 'sub' instruction is used to represent the 'neg' instruction present in most other intermediate representations.

Arguments: The two arguments to the 'sub' instruction must be integer or vector of integer values. Both arguments must have identical types.

Semantics: The value produced is the integer difference of the two operands. If the difference has unsigned overflow, the result returned is the mathematical result modulo 2^n, where n is the bit width of the result. Because LLVM integers use a two's complement representation, this instruction is appropriate for both signed and unsigned integers. nuw and nsw stand for "No Unsigned Wrap" and "No Signed Wrap", respectively. If the nuw and/or nsw keywords are present, the result value of the sub is undefined if unsigned and/or signed overflow, respectively, occurs.

Example:
    <result> = sub i32 4, %var    ; yields {i32}:result = 4 - %var
    <result> = sub i32 0, %var    ; yields {i32}:result = -%var

'fsub' Instruction

Syntax:
    <result> = fsub <ty> <op1>, <op2>    ; yields {ty}:result

Overview: The 'fsub' instruction returns the difference of its two operands. Note that the 'fsub' instruction is used to represent the 'fneg' instruction present in most other intermediate representations.

Arguments: The two arguments to the 'fsub' instruction must be floating point or vector of floating point values. Both arguments must have identical types.

Semantics: The value produced is the floating point difference of the two operands.

Example:
    <result> = fsub float 4.0, %var     ; yields {float}:result = 4.0 - %var
    <result> = fsub float -0.0, %var    ; yields {float}:result = -%var

'mul' Instruction

Syntax:
    <result> = mul <ty> <op1>, <op2>            ; yields {ty}:result
    <result> = mul nuw <ty> <op1>, <op2>        ; yields {ty}:result
    <result> = mul nsw <ty> <op1>, <op2>        ; yields {ty}:result
    <result> = mul nuw nsw <ty> <op1>, <op2>    ; yields {ty}:result

Overview: The 'mul' instruction returns the product of its two operands.

Arguments: The two arguments to the 'mul' instruction must be integer or vector of integer values. Both arguments must have identical types.

Semantics: The value produced is the integer product of the two operands. If the result of the multiplication has unsigned overflow, the result returned is the mathematical result modulo 2^n, where n is the bit width of the result. Because LLVM integers use a two's complement representation, and the result is the same width as the operands, this instruction returns the correct result for both signed and unsigned integers. If a full product (e.g. i32 x i32 -> i64) is needed, the operands should be sign-extended or zero-extended as appropriate to the width of the full product. nuw and nsw stand for "No Unsigned Wrap" and "No Signed Wrap", respectively. If the nuw and/or nsw keywords are present, the result value of the mul is undefined if unsigned and/or signed overflow, respectively, occurs.

Example:
    <result> = mul i32 4, %var    ; yields {i32}:result = 4 * %var

'fmul' Instruction

Syntax:
    <result> = fmul <ty> <op1>, <op2>    ; yields {ty}:result

Overview: The 'fmul' instruction returns the product of its two operands.

Arguments: The two arguments to the 'fmul' instruction must be floating point or vector of floating point values. Both arguments must have identical types.

Semantics: The value produced is the floating point product of the two operands.

Example:
    <result> = fmul float 4.0, %var    ; yields {float}:result = 4.0 * %var

'udiv' Instruction

Syntax:
    <result> = udiv <ty> <op1>, <op2>    ; yields {ty}:result

Overview: The 'udiv' instruction returns the quotient of its two operands.

Arguments: The two arguments to the 'udiv' instruction must be integer or vector of integer values. Both arguments must have identical types.

Semantics: The value produced is the unsigned integer quotient of the two operands. Note that unsigned integer division and signed integer division are distinct operations; for signed integer division, use 'sdiv'. Division by zero leads to undefined behavior.

Example:
    <result> = udiv i32 4, %var    ; yields {i32}:result = 4 / %var

'sdiv' Instruction

Syntax:
    <result> = sdiv <ty> <op1>, <op2>          ; yields {ty}:result
    <result> = sdiv exact <ty> <op1>, <op2>    ; yields {ty}:result

Overview: The 'sdiv' instruction returns the quotient of its two operands.

Arguments: The two arguments to the 'sdiv' instruction must be integer or vector of integer values. Both arguments must have identical types.

Semantics: The value produced is the signed integer quotient of the two operands rounded towards zero. Note that signed integer division and unsigned integer division are distinct operations; for unsigned integer division, use 'udiv'. Division by zero leads to undefined behavior. Overflow also leads to undefined behavior; this is a rare case, but can occur, for example, by doing a 32-bit division of -2147483648 by -1. If the exact keyword is present, the result value of the sdiv is undefined if the result would be rounded or if overflow would occur.

Example:
    <result> = sdiv i32 4, %var    ; yields {i32}:result = 4 / %var

'fdiv' Instruction

Syntax:
    <result> = fdiv <ty> <op1>, <op2>    ; yields {ty}:result

Overview: The 'fdiv' instruction returns the quotient of its two operands.

Arguments: The two arguments to the 'fdiv' instruction must be floating point or vector of floating point values. Both arguments must have identical types.

Semantics: The value produced is the floating point quotient of the two operands.

Example:
    <result> = fdiv float 4.0, %var    ; yields {float}:result = 4.0 / %var

'urem' Instruction

Syntax:
    <result> = urem <ty> <op1>, <op2>    ; yields {ty}:result

Overview: The 'urem' instruction returns the remainder from the unsigned division of its two arguments.

Arguments: The two arguments to the 'urem' instruction must be integer or vector of integer values. Both arguments must have identical types.

Semantics: This instruction returns the unsigned integer remainder of a division. This instruction always performs an unsigned division to get the remainder. Note that unsigned integer remainder and signed integer remainder are distinct operations; for signed integer remainder, use 'srem'. Taking the remainder of a division by zero leads to undefined behavior.

Example:
    <result> = urem i32 4, %var    ; yields {i32}:result = 4 % %var

'srem' Instruction

Syntax:
    <result> = srem <ty> <op1>, <op2>    ; yields {ty}:result

Overview: The 'srem' instruction returns the remainder from the signed division of its two operands. This instruction can also take vector versions of the values, in which case the elements must be integers.

Arguments: The two arguments to the 'srem' instruction must be integer or vector of integer values. Both arguments must have identical types.

Semantics: This instruction returns the remainder of a division (where the result has the same sign as the dividend, op1), not the modulo operator (where the result has the same sign as the divisor, op2) of a value. For more information about the difference, see The Math Forum. For a table of how this is implemented in various languages, please see Wikipedia: modulo operation.

Note that signed integer remainder and unsigned integer remainder are distinct operations; for unsigned integer remainder, use 'urem'. Taking the remainder of a division by zero leads to undefined behavior. Overflow also leads to undefined behavior; this is a rare case, but can occur, for example, by taking the remainder of a 32-bit division of -2147483648 by -1. (The remainder doesn't actually overflow, but this rule lets srem be implemented using instructions that return both the result of the division and the remainder.)

Example:
    <result> = srem i32 4, %var    ; yields {i32}:result = 4 % %var

'frem' Instruction

Syntax:
    <result> = frem <ty> <op1>, <op2>    ; yields {ty}:result

Overview: The 'frem' instruction returns the remainder from the division of its two operands.

Arguments: The two arguments to the 'frem' instruction must be floating point or vector of floating point values. Both arguments must have identical types.

Semantics: This instruction returns the remainder of a division. The remainder has the same sign as the dividend.

Example:
    <result> = frem float 4.0, %var    ; yields {float}:result = 4.0 % %var

Bitwise Binary Operations

Bitwise binary operators are used to do various forms of bit-twiddling in a program. They are generally very efficient instructions and can commonly be strength reduced from other instructions. They require two operands of the same type, execute an operation on them, and produce a single value. The resulting value is the same type as its operands.

'shl' Instruction

Syntax:
    <result> = shl <ty> <op1>, <op2>    ; yields {ty}:result

Overview: The 'shl' instruction returns the first operand shifted to the left a specified number of bits.

Arguments: Both arguments to the 'shl' instruction must be the same integer or vector of integer type. 'op2' is treated as an unsigned value.

Semantics: The value produced is op1 * 2^op2 mod 2^n, where n is the width of the result. If op2 is (statically or dynamically) negative or equal to or larger than the number of bits in op1, the result is undefined. If the arguments are vectors, each vector element of op1 is shifted by the corresponding shift amount in op2.

Example:
    <result> = shl i32 4, %var    ; yields {i32}: 4 << %var
    <result> = shl i32 4, 2       ; yields {i32}: 16
    <result> = shl i32 1, 10      ; yields {i32}: 1024
    <result> = shl i32 1, 32      ; undefined
    <result> = shl <2 x i32> < i32 1, i32 1>, < i32 1, i32 2>    ; yields: result=<2 x i32> < i32 2, i32 4>
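As an example of the strength reduction mentioned in the introduction above (a sketch, not from the original text), a multiplication by a power of two can be expressed as a left shift:

    %times8a = mul i32 %x, 8    ; multiply by 8 ...
    %times8b = shl i32 %x, 3    ; ... computes the same value as shifting left by 3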

'lshr' Instruction

Syntax:
    <result> = lshr <ty> <op1>, <op2>    ; yields {ty}:result

Overview: The 'lshr' instruction (logical shift right) returns the first operand shifted to the right a specified number of bits with zero fill.

Arguments: Both arguments to the 'lshr' instruction must be the same integer or vector of integer type. 'op2' is treated as an unsigned value.

Semantics: This instruction always performs a logical shift right operation. The most significant bits of the result will be filled with zero bits after the shift. If op2 is (statically or dynamically) equal to or larger than the number of bits in op1, the result is undefined. If the arguments are vectors, each vector element of op1 is shifted by the corresponding shift amount in op2.

Example:
    <result> = lshr i32 4, 1     ; yields {i32}:result = 2
    <result> = lshr i32 4, 2     ; yields {i32}:result = 1
    <result> = lshr i8 4, 3      ; yields {i8}:result = 0
    <result> = lshr i8 -2, 1     ; yields {i8}:result = 0x7F
    <result> = lshr i32 1, 32    ; undefined
    <result> = lshr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 2>    ; yields: result=<2 x i32> < i32 0x7FFFFFFF, i32 1>

'ashr' Instruction

Syntax:
    <result> = ashr <ty> <op1>, <op2>    ; yields {ty}:result

Overview: The 'ashr' instruction (arithmetic shift right) returns the first operand shifted to the right a specified number of bits with sign extension.

Arguments: Both arguments to the 'ashr' instruction must be the same integer or vector of integer type. 'op2' is treated as an unsigned value.

Semantics: This instruction always performs an arithmetic shift right operation. The most significant bits of the result will be filled with the sign bit of op1. If op2 is (statically or dynamically) equal to or larger than the number of bits in op1, the result is undefined. If the arguments are vectors, each vector element of op1 is shifted by the corresponding shift amount in op2.

Example:
    <result> = ashr i32 4, 1     ; yields {i32}:result = 2
    <result> = ashr i32 4, 2     ; yields {i32}:result = 1
    <result> = ashr i8 4, 3      ; yields {i8}:result = 0
    <result> = ashr i8 -2, 1     ; yields {i8}:result = -1
    <result> = ashr i32 1, 32    ; undefined
    <result> = ashr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 3>    ; yields: result=<2 x i32> < i32 -1, i32 0>

'and' Instruction

Syntax:
    <result> = and <ty> <op1>, <op2>    ; yields {ty}:result

Overview: The 'and' instruction returns the bitwise logical and of its two operands.

Arguments: The two arguments to the 'and' instruction must be integer or vector of integer values. Both arguments must have identical types.

Semantics: The truth table used for the 'and' instruction is:

    In0  In1  Out
     0    0    0
     0    1    0
     1    0    0
     1    1    1

Example:
    <result> = and i32 4, %var    ; yields {i32}:result = 4 & %var
    <result> = and i32 15, 40     ; yields {i32}:result = 8
    <result> = and i32 4, 8       ; yields {i32}:result = 0

'or' Instruction

Syntax:
    <result> = or <ty> <op1>, <op2>    ; yields {ty}:result

Overview: The 'or' instruction returns the bitwise logical inclusive or of its two operands.

Arguments: The two arguments to the 'or' instruction must be integer or vector of integer values. Both arguments must have identical types.

Semantics: The truth table used for the 'or' instruction is:

    In0  In1  Out
     0    0    0
     0    1    1
     1    0    1
     1    1    1

Example:
    <result> = or i32 4, %var    ; yields {i32}:result = 4 | %var
    <result> = or i32 15, 40     ; yields {i32}:result = 47
    <result> = or i32 4, 8       ; yields {i32}:result = 12

'xor' Instruction

Syntax:
    <result> = xor <ty> <op1>, <op2>    ; yields {ty}:result

Overview: The 'xor' instruction returns the bitwise logical exclusive or of its two operands. The xor is used to implement the "one's complement" operation, which is the "~" operator in C.

Arguments: The two arguments to the 'xor' instruction must be integer or vector of integer values. Both arguments must have identical types.

Semantics: The truth table used for the 'xor' instruction is:

    In0  In1  Out
     0    0    0
     0    1    1
     1    0    1
     1    1    0

Example:
    <result> = xor i32 4, %var    ; yields {i32}:result = 4 ^ %var
    <result> = xor i32 15, 40     ; yields {i32}:result = 39
    <result> = xor i32 4, 8       ; yields {i32}:result = 12
    <result> = xor i32 %V, -1     ; yields {i32}:result = ~%V
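Combining the three operations above to manipulate individual bits, a brief sketch (not from the original text):

    %m = and i32 %x, -16    ; clear the low 4 bits (-16 is 0xFFFFFFF0)
    %s = or  i32 %m, 1      ; set bit 0
    %f = xor i32 %s, 128    ; flip bit 7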

Vector Operations

LLVM supports several instructions to represent vector operations in a target-independent manner. These instructions cover the element-access and vector-specific operations needed to process vectors effectively. While LLVM does directly support these vector operations, many sophisticated algorithms will want to use target-specific intrinsics to take full advantage of a specific target.

'extractelement' Instruction

Syntax:
    <result> = extractelement <n x <ty>> <val>, i32 <idx>    ; yields <ty>

Overview: The 'extractelement' instruction extracts a single scalar element from a vector at a specified index.

Arguments: The first operand of an 'extractelement' instruction is a value of vector type. The second operand is an index indicating the position from which to extract the element. The index may be a variable.

Semantics: The result is a scalar of the same type as the element type of val. Its value is the value at position idx of val. If idx exceeds the length of val, the results are undefined.

Example:
    <result> = extractelement <4 x i32> %vec, i32 0    ; yields i32

'insertelement' Instruction

Syntax:
    <result> = insertelement <n x <ty>> <val>, <ty> <elt>, i32 <idx>    ; yields <n x <ty>>

Overview: The 'insertelement' instruction inserts a scalar element into a vector at a specified index.

Arguments: The first operand of an 'insertelement' instruction is a value of vector type. The second operand is a scalar value whose type must equal the element type of the first operand. The third operand is an index indicating the position at which to insert the value. The index may be a variable.

Semantics: The result is a vector of the same type as val. Its element values are those of val except at position idx, where it gets the value elt. If idx exceeds the length of val, the results are undefined.

Example:
    <result> = insertelement <4 x i32> %vec, i32 1, i32 0    ; yields <4 x i32>

'shufflevector' Instruction

Syntax:
    <result> = shufflevector <n x <ty>> <v1>, <n x <ty>> <v2>, <m x i32> <mask>    ; yields <m x <ty>>

Overview: The 'shufflevector' instruction constructs a permutation of elements from two input vectors, returning a vector with the same element type as the input and length that is the same as the shuffle mask.

Arguments: The first two operands of a 'shufflevector' instruction are vectors with types that match each other. The second operand may be undef if performing a shuffle from only one vector. The third argument is a shuffle mask whose element type is always 'i32'. The shuffle mask operand is required to be a constant vector with either constant integer or undef values. The element selector may be undef (meaning "don't care"). The result of the instruction is a vector whose length is the same as the shuffle mask and whose element type is the same as the element type of the first two operands.

Semantics: The elements of the two input vectors are numbered from left to right across both of the vectors. The shuffle mask operand specifies, for each element of the result vector, which element of the two input vectors the result element gets.

Example:
    <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
                             <4 x i32> <i32 0, i32 4, i32 1, i32 5>    ; yields <4 x i32>
    <result> = shufflevector <4 x i32> %v1, <4 x i32> undef,
                             <4 x i32> <i32 0, i32 1, i32 2, i32 3>    ; yields <4 x i32> - Identity shuffle.
    <result> = shufflevector <8 x i32> %v1, <8 x i32> undef,
                             <4 x i32> <i32 0, i32 1, i32 2, i32 3>    ; yields <4 x i32>
    <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
                             <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>    ; yields <8 x i32>

Aggregate Operations

LLVM supports several instructions for working with aggregate values.

'extractvalue' Instruction

Syntax:
    <result> = extractvalue <aggregate type> <val>, <idx>{, <idx>}*

Overview: The 'extractvalue' instruction extracts the value of a member field from an aggregate value.

Arguments: The first operand of an 'extractvalue' instruction is a value of struct, union or array type. The operands are constant indices to specify which value to extract in a similar manner as indices in a 'getelementptr' instruction.

Semantics: The result is the value at the position in the aggregate specified by the index operands.

Example:
    <result> = extractvalue {i32, float} %agg, 0    ; yields i32

'insertvalue' Instruction

Syntax:
    <result> = insertvalue <aggregate type> <val>, <ty> <elt>, <idx>    ; yields <aggregate type>

Overview: The 'insertvalue' instruction inserts a value into a member field in an aggregate value.

Arguments: The first operand of an 'insertvalue' instruction is a value of struct, union or array type. The second operand is a first-class value to insert. The following operands are constant indices indicating the position at which to insert the value in a similar manner as indices in a 'getelementptr' instruction. The value to insert must have the same type as the value identified by the indices.

Semantics: The result is an aggregate of the same type as val. Its value is that of val except that the value at the position specified by the indices is that of elt.

Example:
    %agg1 = insertvalue {i32, float} undef, i32 1, 0        ; yields {i32 1, float undef}
    %agg2 = insertvalue {i32, float} %agg1, float %val, 1   ; yields {i32 1, float %val}
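A small sketch of building up and reading back a nested aggregate (not from the original text):

    %inner = insertvalue [2 x float] undef, float 1.0, 0             ; [2 x float] with element 0 set
    %outer = insertvalue { i32, [2 x float] } undef, i32 7, 0
    %whole = insertvalue { i32, [2 x float] } %outer, [2 x float] %inner, 1
    %elem  = extractvalue { i32, [2 x float] } %whole, 1, 0          ; yields float 1.0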

Memory Access and Addressing Operations

A key design point of an SSA-based representation is how it represents memory. In LLVM, no memory locations are in SSA form, which makes things very simple. This section describes how to read, write, and allocate memory in LLVM.

'alloca' Instruction

Syntax:
    <result> = alloca <type>[, i32 <NumElements>][, align <alignment>]    ; yields {type*}:result

Overview: The 'alloca' instruction allocates memory on the stack frame of the currently executing function, to be automatically released when this function returns to its caller. The object is always allocated in the generic address space (address space zero).

Arguments: The 'alloca' instruction allocates sizeof(<type>)*NumElements bytes of memory on the runtime stack, returning a pointer of the appropriate type to the program. If "NumElements" is specified, it is the number of elements allocated, otherwise "NumElements" is defaulted to be one. If a constant alignment is specified, the value result of the allocation is guaranteed to be aligned to at least that boundary. If not specified, or if zero, the target can choose to align the allocation on any convenient boundary compatible with the type. 'type' may be any sized type.

Semantics: Memory is allocated; a pointer is returned. The operation is undefined if there is insufficient stack space for the allocation. 'alloca'd memory is automatically released when the function returns. The 'alloca' instruction is commonly used to represent automatic variables that must have an address available. When the function returns (either with the ret or unwind instructions), the memory is reclaimed. Allocating zero bytes is legal, but the result is undefined.

Example:
    %ptr = alloca i32                       ; yields {i32*}:ptr
    %ptr = alloca i32, i32 4                ; yields {i32*}:ptr
    %ptr = alloca i32, i32 4, align 1024    ; yields {i32*}:ptr
    %ptr = alloca i32, align 1024           ; yields {i32*}:ptr
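A short sketch of the "automatic variable" pattern mentioned above, combining 'alloca' with the 'store' and 'load' instructions described next (not from the original text; the function is hypothetical):

    define i32 @double_it(i32 %n) {
    entry:
      %tmp = alloca i32, align 4    ; a stack slot whose address can be taken
      store i32 %n, i32* %tmp
      %v = load i32* %tmp
      %r = add i32 %v, %v
      ret i32 %r
    }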

'load' Instruction

Syntax:
    <result> = load <ty>* <pointer>[, align <alignment>][, !nontemporal !<index>]
    <result> = volatile load <ty>* <pointer>[, align <alignment>][, !nontemporal !<index>]
    !<index> = !{ i32 1 }

Overview: The 'load' instruction is used to read from memory.

Arguments: The argument to the 'load' instruction specifies the memory address from which to load. The pointer must point to a first class type. If the load is marked as volatile, then the optimizer is not allowed to modify the number or order of execution of this load with other volatile load and store instructions.

The optional constant align argument specifies the alignment of the operation (that is, the alignment of the memory address). A value of 0 or an omitted align argument means that the operation has the preferential alignment for the target. It is the responsibility of the code emitter to ensure that the alignment information is correct. Overestimating the alignment results in undefined behavior. Underestimating the alignment may produce less efficient code. An alignment of 1 is always safe.

The optional !nontemporal metadata must reference a single metadata name <index> corresponding to a metadata node with one i32 entry of value 1. The existence of the !nontemporal metadata on the instruction tells the optimizer and code generator that this load is not expected to be reused in the cache. The code generator may select special instructions to save cache bandwidth, such as the MOVNT instruction on x86.

Semantics: The location of memory pointed to is loaded. If the value being loaded is of scalar type then the number of bytes read does not exceed the minimum number of bytes needed to hold all bits of the type. For example, loading an i24 reads at most three bytes. When loading a value of a type like i20 with a size that is not an integral number of bytes, the result is undefined if the value was not originally written using a store of the same type.

Examples:
    %ptr = alloca i32            ; yields {i32*}:ptr
    store i32 3, i32* %ptr       ; yields {void}
    %val = load i32* %ptr        ; yields {i32}:val = i32 3

'store' Instruction

Syntax:
    store <ty> <value>, <ty>* <pointer>[, align <alignment>][, !nontemporal !<index>]             ; yields {void}
    volatile store <ty> <value>, <ty>* <pointer>[, align <alignment>][, !nontemporal !<index>]    ; yields {void}

Overview: The 'store' instruction is used to write to memory.

Arguments: There are two arguments to the 'store' instruction: a value to store and an address at which to store it. The type of the '<pointer>' operand must be a pointer to the first class type of the '<value>' operand. If the store is marked as volatile, then the optimizer is not allowed to modify the number or order of execution of this store with other volatile load and store instructions.

The optional constant "align" argument specifies the alignment of the operation (that is, the alignment of the memory address). A value of 0 or an omitted "align" argument means that the operation has the preferential alignment for the target. It is the responsibility of the code emitter to ensure that the alignment information is correct. Overestimating the alignment results in undefined behavior. Underestimating the alignment may produce less efficient code. An alignment of 1 is always safe.

The optional !nontemporal metadata must reference a single metadata name corresponding to a metadata node with one i32 entry of value 1. The existence of the !nontemporal metadata on the instruction tells the optimizer and code generator that this store is not expected to be reused in the cache. The code generator may select special instructions to save cache bandwidth, such as the MOVNT instruction on x86.

since the first index can be non-zero). and they are not required to be constant. }. pointer or vector. yields {void} %val = load i32* %ptr . The type of each index argument depends on the type it is indexing into. The first type indexed into must be a pointer value. The remaining arguments are indices that indicate which of the elements of the aggregate object are indexed. and forms the basis of the calculation. char C. When writing a value of a type like i20 with a size that is not an integral number of bytes. <ty> <idx>}* Overview: The 'getelementptr' instruction is used to get the address of a subelement of an aggregate data structure. structs and unions. only i32 integer constants are allowed. When indexing into an array. since that would require loading the pointer before continuing calculation. struct RT Z. it is unspecified what happens to the extra bits that do not belong to the type. the second index indexes a value of the type pointed to (not necessarily the value directly pointed to. i32* %ptr . It performs address calculation only and does not access memory. Documentation for the LLVM System at SVN head select special instructions to save cache bandwidth. Arguments: The first argument is always a pointer. struct ST { int X. }. integers of any width are allowed. The interpretation of each index is dependent on the type being indexed into. Example: %ptr = alloca i32 . For example. For example. vectors. but they will typically be overwritten. double Y. int B[10][20]. etc. int *foo(struct ST *s) { 52 . Semantics: The contents of memory are updated to contain '<value>' at the location specified by the '<pointer>' operand. The first index always indexes the pointer value given as the first argument. let's consider a C code fragment and how it gets compiled to LLVM: struct RT { char A. storing an i24 writes at most three bytes. When indexing into a (optionally packed) structure or union. such as the MOVNT instruction on x86. Note that subsequent types being indexed into can never be pointers. If '<value>' is of scalar type then the number of bytes written does not exceed the minimum number of bytes needed to hold all bits of the type. <ty> <idx>}* <result> = getelementptr inbounds <pty>* <ptrval>{. yields {i32*}:ptr store i32 3. subsequent types can be arrays. yields {i32}:val = i32 3 'getelementptr' Instruction Syntax: <result> = getelementptr <pty>* <ptrval>{.

i8 } %ST = type { i32. The second index indexes into the third element of the structure. The 'getelementptr' instruction returns a pointer to this element. [10 x [20 x i32]]. a structure. i32 0. double. an array. which is a pointer. } The LLVM code generated by the GCC frontend is: %RT = type { i8 . Note that it is perfectly legal to index partially through a structure. [12 x i8]}* %saptr. The third index indexes into the second element of the structure. i32 2. Example: . i32 1. i32 13 .B[5][13]. see the getelementptr FAQ. yields [10 x [20 x i32]]*:%t3 %t4 = getelementptr [10 x [20 x i32]]* %t3. The getelementptr instruction is often confusing. yields i8*:vptr %vptr = getelementptr {i32. %RT } define i32* @foo(%ST* %s) { entry: %reg = getelementptr %ST* %s. returning a pointer to an inner element. yields [20 x i32]*:%t4 %t5 = getelementptr [20 x i32]* %t4. Because of this. the LLVM code for the given testcase is equivalent to: define i32* @foo(%ST* %s) { %t1 = getelementptr %ST* %s. another structure. even if it happens to point into allocated storage. yielding a '%ST' = '{ i32. and the result value of the getelementptr may be outside the object pointed to by the base pointer. the result value of the getelementptr is undefined if the base pointer is not an in bounds address of an allocated object. yields %RT*:%t2 %t3 = getelementptr %RT* %t2. i8 }' type. the first index is indexing into the '%ST*' type. i64 0. i32 5. i32 13 ret i32* %reg } Semantics: In the example above. i32 1 . [10 x [20 x i32]]. yields %ST*:%t1 %t2 = getelementptr %ST* %t1. The result value may not necessarily be used to access memory though. yielding a '%RT' = '{ i8 . i64 0. i32 2 . For some more insight into how it works. i32 0. i32 5 . %RT }' type. plus the address one byte past the end. yielding an 'i32' type. the offsets are added to the base address with silently-wrapping two's complement arithmetic. The two dimensions of the array are subscripted into. i32 1 53 . i32 1. <2 x i8>}* %svptr.Z. i32 0. or if any of the addresses that would be formed by successive addition of the offsets implied by the indices to the base address with infinitely precise arithmetic are not an in bounds address of that allocated object. i32 1 . Documentation for the LLVM System at SVN head return &s[1]. i32 1 . i32 0. yields [12 x i8]*:aptr %aptr = getelementptr {i32. The in bounds addresses for an allocated object are all the addresses that point into the object. See the Pointer Aliasing Rules section for more information. yields i32*:%t5 ret i32* %t5 } If the inbounds keyword is present. yielding a '[10 x [20 x i32]]' type. thus computing a value of 'i32*' type. If the inbounds keyword is not present. double. i32 1.

  %eptr = getelementptr [12 x i8]* %aptr, i64 0, i32 1          ; yields i8*:eptr
  %iptr = getelementptr [10 x i32]* @arr, i16 0, i16 0          ; yields i32*:iptr

Conversion Operations

The instructions in this category are the conversion instructions (casting) which all take a single operand and a type. They perform various bit conversions on the operand.

'trunc .. to' Instruction

Syntax:

  <result> = trunc <ty> <value> to <ty2>             ; yields ty2

Overview:

The 'trunc' instruction truncates its operand to the type ty2.

Arguments:

The 'trunc' instruction takes a value to trunc, which must be an integer type, and a type that specifies the size and type of the result, which must be an integer type. The bit size of value must be larger than the bit size of ty2. Equal sized types are not allowed.

Semantics:

The 'trunc' instruction truncates the high order bits in value and converts the remaining bits to ty2. Since the source size must be larger than the destination size, trunc cannot be a no-op cast. It will always truncate bits.

Example:

  %X = trunc i32 257 to i8              ; yields i8:1
  %Y = trunc i32 123 to i1              ; yields i1:true
  %Z = trunc i32 122 to i1              ; yields i1:false

'zext .. to' Instruction

Syntax:

  <result> = zext <ty> <value> to <ty2>              ; yields ty2

Overview:

The 'zext' instruction zero extends its operand to type ty2.

Arguments:

The 'zext' instruction takes a value to cast, which must be of integer type, and a type to cast it to, ty2, which must also be of integer type. The bit size of the value must be smaller than the bit size of the destination type, ty2.

Semantics:

The zext fills the high order bits of the value with zero bits until it reaches the size of the destination type, ty2. When zero extending from i1, the result will always be either 0 or 1.

Example:

  %X = zext i32 257 to i64              ; yields i64:257
  %Y = zext i1 true to i32              ; yields i32:1

'sext .. to' Instruction

Syntax:

  <result> = sext <ty> <value> to <ty2>             ; yields ty2

Overview:

The 'sext' sign extends value to the type ty2.

Arguments:

The 'sext' instruction takes a value to cast, which must be of integer type, and a type to cast it to, ty2, which must also be of integer type. The bit size of the value must be smaller than the bit size of the destination type, ty2.

Semantics:

The 'sext' instruction performs a sign extension by copying the sign bit (highest order bit) of the value until it reaches the bit size of the type ty2. When sign extending from i1, the extension always results in -1 or 0.

Example:

  %X = sext i8 -1 to i16                ; yields i16:65535
  %Y = sext i1 true to i32              ; yields i32:-1

'fptrunc .. to' Instruction

Syntax:

  <result> = fptrunc <ty> <value> to <ty2>          ; yields ty2

Overview:

The 'fptrunc' instruction truncates value to type ty2.

Arguments:

The 'fptrunc' instruction takes a floating point value to cast and a floating point type to cast it to. The size of value must be larger than the size of ty2. This implies that fptrunc cannot be used to make a no-op cast.
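To make the difference between the integer casts concrete, the following sketch (values chosen only for illustration) shows how the same 8-bit pattern widens differently under zext and sext, and how trunc discards the high bits again:

  %a = zext i8 200 to i32               ; zero extension: %a is i32 200
  %b = sext i8 200 to i32               ; sign extension: %b is i32 -56
  %c = trunc i32 %a to i8               ; back to 8 bits: %c is i8 200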

If the value cannot fit within the destination type. yields ty2 Overview: The 'fpext' extends a floating point value to a larger floating point value. yields undefined 'fpext . The source type must be smaller than the destination type. then the results are undefined.0 to float . to' Instruction Syntax: <result> = fptoui <ty> <value> to <ty2> . Arguments: The 'fptoui' instruction takes a value to cast. and a floating point type to cast it to. Example: %X = fptrunc double 123. The fpext cannot be used to make a no-op cast because it always changes bits. Semantics: The 'fpext' instruction extends the value from a smaller floating point type to a larger floating point type.1415 %Y = fpext float 1. Arguments: The 'fpext' instruction takes a floating point value to cast. Use bitcast to make a no-op cast for a floating point cast.0 to float . ty2 must be a vector integer type with the same number of elements as ty 56 . Documentation for the LLVM System at SVN head Semantics: The 'fptrunc' instruction truncates a value from a larger floating point type to a smaller floating point type. ty2. and a type to cast it to ty2.0E+300 to float .0 %Y = fptrunc double 1. to' Instruction Syntax: <result> = fpext <ty> <value> to <ty2> . yields double:3. yields ty2 Overview: The 'fptoui' converts a floating point value to its unsigned integer equivalent of type ty2. If ty is a vector floating point type.1415 to double .. yields float:1.. which must be a scalar or vector floating point value. which must be an integer type. Example: %X = fpext float 3.0 (no-op) 'fptoui . yields float:123.

the results are undefined. Documentation for the LLVM System at SVN head Semantics: The 'fptoui' instruction converts its floating point operand into the nearest (rounding towards zero) unsigned integer value. ty2 must be a vector floating point type with the same number of elements as ty 57 .. Arguments: The 'uitofp' instruction takes a value to cast. Arguments: The 'fptosi' instruction takes a value to cast. and a type to cast it to ty2. which must be a scalar or vector floating point value. If the value cannot fit in ty2.04E+17 to i8 . to' Instruction Syntax: <result> = fptosi <ty> <value> to <ty2> .0E+300 to i1 . yields i32:-123 %Y = fptosi float 1. to' Instruction Syntax: <result> = uitofp <ty> <value> to <ty2> . Example: %X = fptoui double 123. ty2 must be a vector integer type with the same number of elements as ty Semantics: The 'fptosi' instruction converts its floating point operand into the nearest (rounding towards zero) signed integer value. If the value cannot fit in ty2. yields i32:123 %Y = fptoui float 1. yields undefined:1 'fptosi .04E+17 to i8 . and a type to cast it to ty2. If ty is a vector integer type. which must be a scalar or vector integer value.0 to i32 . yields undefined:1 'uitofp .0 to i32 . which must be an integer type. the results are undefined. Example: %X = fptosi double -123. If ty is a vector floating point type. yields undefined:1 %Z = fptoui float 1. yields undefined:1 %Z = fptosi float 1. which must be an floating point type. yields ty2 Overview: The 'uitofp' instruction regards value as an unsigned integer and converts that value to the ty2 type.0E-247 to i1 .. yields ty2 Overview: The 'fptosi' instruction converts floating point value to type ty2.

yields double:255. to' Instruction Syntax: <result> = sitofp <ty> <value> to <ty2> . yields double:-1. to' Instruction Syntax: <result> = ptrtoint <ty> <value> to <ty2> . yields ty2 Overview: The 'sitofp' instruction regards value as a signed integer and converts that value to the ty2 type.0 %Y = sitofp i8 -1 to double .0 %Y = uitofp i8 -1 to double . Example: %X = sitofp i32 257 to float . which must be an integer type.. If value is smaller than ty2 then a zero extension is done. If they are the 58 .0 'sitofp . If ty is a vector integer type. which must be a scalar or vector integer value. If the value cannot fit in the floating point value. the results are undefined. If value is larger than ty2 then a truncation is done. ty2 must be a vector floating point type with the same number of elements as ty Semantics: The 'sitofp' instruction interprets its operand as a signed integer quantity and converts it to the corresponding floating point value. Example: %X = uitofp i32 257 to float . which must be a pointer value. yields float:257. the results are undefined. and a type to cast it to ty2. which must be an floating point type. Documentation for the LLVM System at SVN head Semantics: The 'uitofp' instruction interprets its operand as an unsigned integer quantity and converts it to the corresponding floating point value. yields ty2 Overview: The 'ptrtoint' instruction converts the pointer value to the integer type ty2. Semantics: The 'ptrtoint' instruction converts value to integer type ty2 by interpreting the pointer value as an integer and either truncating or zero extending that value to the size of the integer type. If the value cannot fit in the floating point value.0 'ptrtoint . yields float:257.. and a type to cast it to ty2. Arguments: The 'sitofp' instruction takes a value to cast. Arguments: The 'ptrtoint' instruction takes a value to cast.

If they are the same size. Semantics: The 'inttoptr' instruction converts value to type ty2 by applying either a zero extension or a truncation depending on the size of the integer value. 59 . Example: %X = ptrtoint i32* %X to i8 . Arguments: The 'bitcast' instruction takes a value to cast. and a type to cast it to. This instruction supports bitwise conversion of vectors to integers and to vectors of other types (as long as they have the same size). nothing is done (no-op cast). to' Instruction Syntax: <result> = inttoptr <ty> <value> to <ty2> . the destination type must also be a pointer. and a type to cast it to. The bit sizes of value and the destination type.. Documentation for the LLVM System at SVN head same size. yields ty2 Overview: The 'inttoptr' instruction converts an integer value to a pointer type. yields ty2 Overview: The 'bitcast' instruction converts value to type ty2 without changing any bits. which must be a non-aggregate first class value. Example: %X = inttoptr i32 255 to i32* . must be identical. which must also be a non-aggregate first class type. which must be a pointer type. yields no-op on 32-bit architecture %Z = inttoptr i64 0 to i32* . If value is smaller than the size of a pointer then a zero extension is done.. yields zero extension on 32-bit architecture 'inttoptr . yields zero extension on 64-bit architecture %Y = inttoptr i32 255 to i32* . yields truncation on 32-bit architecture %Y = ptrtoint i32* %x to i64 . If value is larger than the size of a pointer then a truncation is done. ty2. to' Instruction Syntax: <result> = bitcast <ty> <value> to <ty2> . Arguments: The 'inttoptr' instruction takes an integer value to cast. then nothing is done (no-op cast) other than a type change. ty2. If the source type is a pointer. yields truncation on 32-bit architecture 'bitcast .
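The following sketch (illustrative only, with invented names) shows the round trip described here: converting a pointer to an integer, adjusting it, and converting back. Note that getelementptr is normally the preferred way to perform address arithmetic; this pattern is shown only to illustrate the two casts.

  %addr = ptrtoint i32* %p to i64       ; pointer viewed as an integer
  %addr4 = add i64 %addr, 4             ; advance by four bytes
  %q = inttoptr i64 %addr4 to i32*      ; back to a pointer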

ugt: unsigned greater than 4. Documentation for the LLVM System at SVN head Semantics: The 'bitcast' instruction converts value to type ty2. yields i8 :-1 %Y = bitcast i32* %x to sint* . <op2> . The comparison performed always yields either an i1 or vector of i1 result. Example: %X = bitcast i8 255 to i8 . sge: signed greater or equal 9. integer vector. ule: unsigned less or equal 7. as follows: 1. It is not a value. 60 . sgt: signed greater than 8. false otherwise. use the inttoptr or ptrtoint instructions first. ne: not equal 3. They must also be identical types. Pointer types may only be converted to other pointer types with this instruction. The conversion is done as if the value had been stored to memory and read back as type ty2. sle: signed less or equal The remaining two arguments must be integer or pointer or integer vector typed. yields i64: %V Other Operations The instructions in this category are the "miscellaneous" instructions. eq: equal 2. uge: unsigned greater or equal 5. 'icmp' Instruction Syntax: <result> = icmp <cond> <ty> <op1>. To convert pointers to other types. or pointer operands. Arguments: The 'icmp' instruction takes three operands. ult: unsigned less than 6. eq: yields true if the operands are equal. Semantics: The 'icmp' compares op1 and op2 according to the condition code given as cond. which defy better classification. The possible condition code are: 1. just a keyword. . slt: signed less than 10. It is always a no-op cast because no bits change with this conversion. No sign interpretation is necessary or performed. yields {i1} or {<N x i1>}:result Overview: The 'icmp' instruction returns a boolean value or a vector of boolean values based on comparison of its two integer. yields sint*:%x %Z = bitcast <2 x int> %V to i64. The first operand is the condition code indicating the kind of comparison to perform.

ult: interprets the operands as unsigned values and yields true if op1 is less than op2. If the operands are pointer typed. Documentation for the LLVM System at SVN head 2. sle: interprets the operands as signed values and yields true if op1 is less than or equal to op2. ule: interprets the operands as unsigned values and yields true if op1 is less than or equal to op2. If the operands are integer vectors. the result is an i1. <op2> . 3. Example: <result> = icmp eq i32 4. The result is an i1 vector with the same number of elements as the values being compared. 'fcmp' Instruction Syntax: <result> = fcmp <cond> <ty> <op1>. yields: result=false <result> = icmp sge i16 4. 5 . Otherwise. yields: result=false <result> = icmp ne float* %X. 6. false otherwise. 5 . %X . olt: ordered and less than 61 . The first operand is the condition code indicating the kind of comparison to perform. 7. always returns false 2. yields: result=false <result> = icmp ule i16 -4. yields: result=false Note that the code generator does not yet support vector types with the icmp instruction. then the result type is a vector of boolean with the same number of elements as the operands being compared. oge: ordered and greater than or equal 5. 9. If the operands are floating point scalars. just a keyword. yields: result=true <result> = icmp sgt i16 4. ogt: ordered and greater than 4. slt: interprets the operands as signed values and yields true if op1 is less than op2. sgt: interprets the operands as signed values and yields true if op1 is greater than op2. oeq: ordered and equal 3. false: no comparison. 5. If the operands are floating point vectors. It is not a value. No sign interpretation is necessary or performed. then the result type is a boolean (i1). uge: interprets the operands as unsigned values and yields true if op1 is greater than or equal to op2. ugt: interprets the operands as unsigned values and yields true if op1 is greater than op2. 5 . ne: yields true if the operands are unequal. 5 . yields {i1} or {<N x i1>}:result Overview: The 'fcmp' instruction returns a boolean value or vector of boolean values based on comparison of its operands. 5 . The possible condition code are: 1. 4. sge: interprets the operands as signed values and yields true if op1 is greater than or equal to op2. 8. the pointer values are compared as if they were integers. 10. then they are compared element by element. Arguments: The 'fcmp' instruction takes three operands. yields: result=false <result> = icmp ult i16 4.

Each comparison performed always yields an i1 result. ugt: yields true if either operand is a QNAN or op1 is greater than op2. 2. always returns true Ordered means that neither operand is a QNAN while unordered means that either operand may be a QNAN. ule: unordered or less than or equal 14.0 . ogt: yields true if both operands are not a QNAN and op1 is greater than op2. ole: ordered and less than or equal 7. 14. 2. 5. yields: result=true <result> = fcmp olt float 4.0 . yields: result=true <result> = fcmp ueq double 1. 3. 11. 13. Example: <result> = fcmp oeq float 4. true: no comparison. uge: unordered or greater than or equal 12. une: yields true if either operand is a QNAN or op1 is not equal to op2. 7.0. 10. yields: result=false <result> = fcmp one float 4. 4.0. uge: yields true if either operand is a QNAN or op1 is greater than or equal to op2. oeq: yields true if both operands are not a QNAN and op1 is equal to op2. 8. une: unordered or not equal 15. ult: unordered or less than 13. ule: yields true if either operand is a QNAN or op1 is less than or equal to op2. olt: yields true if both operands are not a QNAN and op1 is less than op2. true: always yields true.0 . 16. then the vectors are compared element by element. ult: yields true if either operand is a QNAN or op1 is less than op2. ord: ordered (no nans) 9. 5. 6.0. 5. ord: yields true if both operands are not a QNAN. Each of val1 and val2 arguments must be either a floating point type or a vector of floating point type. 'phi' Instruction 62 . uno: unordered (either nans) 16. 15. as follows: 1. If the operands are vectors.0. Documentation for the LLVM System at SVN head 6. regardless of operands. oge: yields true if both operands are not a QNAN and op1 is greater than or equal to op2. yields: result=false Note that the code generator does not yet support vector types with the fcmp instruction. uno: yields true if either operand is a QNAN. ole: yields true if both operands are not a QNAN and op1 is less than or equal to op2. ueq: yields true if either operand is a QNAN or op1 is equal to op2. false: always yields false. regardless of operands. one: yields true if both operands are not a QNAN and op1 is not equal to op2. one: ordered and not equal 8. ueq: unordered or equal 10. ugt: unordered or greater than 11. 5. Semantics: The 'fcmp' instruction compares op1 and op2 according to the condition code given as cond.0 . 9. They must have identical types. 12.

<ty> <val2> .. <label0>]. not individual elements. the use of each incoming value is deemed to occur on the edge from the corresponding predecessor block to the current block (but after any definition of an 'invoke' instruction's return value on the same edge). <ty> <val1>. Arguments: The type of the incoming values is specified with the first type field. Overview: The 'phi' instruction is used to implement the φ node in the SSA graph representing the function. Only labels may be used as the label arguments. Arguments: The 'select' instruction requires an 'i1' value or a vector of 'i1' values indicating the condition.e. If the val1/val2 are vectors and the condition is a scalar. 63 . %indvar = phi i32 [ 0. without branching. 1 br label %Loop 'select' Instruction Syntax: <result> = select selty <cond>. and two values of the same first class type. %LoopHeader ].. %Loop ] %nextindvar = add i32 %indvar. then entire vectors are selected. . Infinite loop that counts from 0 on up. [ %nextindvar. the 'phi' instruction logically takes on the value specified by the pair corresponding to the predecessor basic block that executed just prior to the current block. Only values of first class type may be used as the value arguments to the PHI node. For the purposes of the SSA form.. Documentation for the LLVM System at SVN head Syntax: <result> = phi <ty> [ <val0>. with one pair for each predecessor basic block of the current block. Example: Loop: . There must be no non-phi instructions between the start of a basic block and the PHI instructions: i. PHI instructions must be first in a basic block. After this. yields ty selty is either i1 or {<N x i1>} Overview: The 'select' instruction is used to choose one value based on a condition. the 'phi' instruction takes a list of pairs as arguments. Semantics: At runtime..
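As a small illustration of the two value-selection mechanisms described here (the names are invented for this sketch), a branch-free maximum can be written with icmp and select, whereas the phi form shown above is used when control flow actually branches:

  %cmp = icmp sgt i32 %x, %y            ; is %x greater than %y?
  %max = select i1 %cmp, i32 %x, i32 %y ; pick the larger value without branching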

or 2) forced tail call optimization when the following extra requirements are met: ♦ Caller and callee both have the calling convention fastcc. Only 'zeroext'. Note that calls may be marked "tail" even if they do not occur before a ret instruction. 6. Example: %X = select i1 true. If the function signature indicates the 64 . Functions that return no value are marked void. If none is specified. 'signext'. The optional "tail" marker indicates that the callee function does not access any allocas or varargs in the caller. calling an arbitrary pointer to function value. i8 17. 5. then the value arguments must be vectors of the same size. the instruction returns the first value argument. the function call is eligible for tail call optimization. 'call' Instruction Syntax: <result> = [tail] call [cconv] [ret attrs] <ty> [<fnty>*] <fnptrval>(<function args>) [fn att Overview: The 'call' instruction represents a simple function call. 'function args': argument list whose types match the function signature argument types and parameter attributes. 4. If the condition is a vector of i1. 3. 'fnptrval': An LLVM value containing a pointer to a function to be invoked. The optional Parameter Attributes list for return values. but indirect calls are just as possible. The optional "cconv" marker indicates which calling convention the call should use. 'ty': the type of the call instruction itself which is also the type of the return value. This type can be omitted if the function is not varargs and if the function type does not return a pointer to a function. 'fnty': shall be the signature of the pointer to function value being invoked. If the "tail" marker is present. ♦ Platform specific constraints are met. ♦ The call is in tail position (ret immediately follows call and ret uses value of call or is void). and the selection is done element by element. The calling convention of the call must match the calling convention of the target function. yields i8:17 Note that the code generator does not yet support conditions with vector type. i8 42 . The code generator may optimize calls marked "tail" with either 1) automatic sibling call optimization when the caller and callee have matching signatures. Documentation for the LLVM System at SVN head Semantics: If the condition is an i1 and it evaluates to 1. and 'inreg' attributes are valid here. Arguments: This instruction requires several arguments: 1. 7. In most cases. The argument types must match the types implied by this signature. or llvm::GuaranteedTailCallOpt is true. ♦ Option -tailcallopt is enabled. All arguments must be of first class type. 2. this is a direct function invocation. otherwise. or else the behavior is undefined. the call defaults to using C calling conventions. but might not in fact be optimized into a jump. it returns the second value argument.

Semantics: The 'va_arg' instruction loads an argument of the specified type from the specified va_list and causes the va_list to point to the next argument. 'va_arg' Instruction Syntax: <resultval> = va_arg <va_list*> <arglist>. 'nounwind'. Return value is %zero extended llvm treats calls to some functions with names and arguments that match the standard C99 library as being the C99 library functions. and may perform optimizations or generate code for them under that assumption. 0 . 1 . Only 'noreturn'. yields i8 %Z = call void @foo() noreturn .)* @printf(i8 * %msg. see the variable argument handling Intrinsic Functions.A @foo() . indicates that %foo never returns normall %ZZ = call zeroext i32 @bar() . Arguments: This instruction takes a va_list* value and the type of the argument.A %r. The actual type of va_list is target specific.. . Upon a 'ret' instruction in the called function. Documentation for the LLVM System at SVN head function accepts a variable number of arguments. the extra arguments can be specified. This is something we'd like to change in the future to provide better support for freestanding environments and non-C-based languages. and the return value of the function is bound to the result argument. i8 } %gr = extractvalue %struct. 8. i8 } %r = call %struct. control flow continues with the instruction after the function call. yields i32 %gr1 = extractvalue %struct. yields i32 %X = tail call i32 @foo() . 65 . i8 42) . Example: %retval = call i32 @test(i32 %argc) call i32 (i8 *. It returns a value of the specified argument type and increments the va_list to point to the next argument. 'readonly' and 'readnone' attributes are valid here. For more information.A %r.A = type { i32.. The optional function attributes list. It is used to implement the va_arg macro in C. i32 12. yields i32 call void %foo(i8 97 signext) %struct. Semantics: The 'call' instruction is used to cause control flow to transfer to a specified function. yields i32 %Y = tail call fastcc i32 @foo() . with its incoming arguments bound to the specified values. <argty> Overview: The 'va_arg' instruction is used to access arguments passed through the "variable argument" area of a function call. yields { 32.
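Since the text notes that indirect calls through an arbitrary pointer to function are just as possible as direct ones, here is a minimal sketch of that case; the function type and the names %fpp and %fp are assumptions made up for the example:

  %fp = load i32 (i32)** %fpp           ; load a pointer to a function taking and returning i32
  %r = call i32 %fp(i32 10)             ; indirect call through the loaded pointer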

ctpop.h> header file. but needs all of them to be of the same type. it is required if any are added that they be documented here. Variable Argument Handling Intrinsics Variable argument support is defined in LLVM with the va_arg instruction and these three intrinsic functions. it does not require its own name suffix. the intrinsic represents a family of functions that perform the same operation but on different data types. to only be overloaded with respect to a single argument or the result. This leads to a family of functions such as i8 @llvm.ctpop. etc. Because the argument's type is matched against the return type. Only one type. These functions are related to the similarly named macros defined in the <stdarg. is overloaded.i29(i29 %val). and only one type suffix is required. each preceded by a period. Documentation for the LLVM System at SVN head It is legal for this instruction to be called in a function which does not take a variable number of arguments.i8(i8 %val) and i29 @llvm. so all transformations should be prepared to handle these functions regardless of the type used. This prefix is reserved in LLVM for intrinsic names. This allows an intrinsic function which accepts multiple arguments. function names may not begin with this prefix. Intrinsic functions may only be used in call or invoke instructions: it is illegal to take the address of an intrinsic function. Argument types may also be defined as exactly matching a previous argument's type or the result type. these intrinsics represent an extension mechanism for the LLVM language that does not require changing all of the transformations in LLVM when adding to the language (or the bitcode reader/writer. the parser. the llvm. Overall. Additionally. Some intrinsic functions can be overloaded. The LLVM assembly language reference manual does not define what this type is. Arguments whose type is matched against another type do not. Only those types which are overloaded result in a name suffix.. These functions have well known names and semantics and are required to follow certain restrictions. for example. va_arg is an LLVM instruction instead of an intrinsic function because it takes a type as an argument.. the vfprintf function.).ctpop function can take an integer of any width and returns an integer of exactly the same integer width. One or more of the argument types or the result type can be overloaded to accept any integer type. the return type. please see the Extending LLVM Guide.. thus. Intrinsic function names must all start with an "llvm. Because LLVM can represent over 8 million different integer types. 66 . overloading is used commonly to allow an intrinsic function to operate on any integer type. All of these functions operate on arguments that use a target-specific value type "va_list".e. Intrinsic functions must always be external functions: you cannot define the body of intrinsic functions. Note that the code generator does not yet fully support va_arg on many targets. To learn how to add an intrinsic function. Example: See the variable argument processing section." prefix. because intrinsic functions are part of the LLVM language. Overloaded intrinsics will have the names of its overloaded argument types encoded into its function name. Intrinsic Functions LLVM supports the notion of an "intrinsic function". Also. i. For example. it does not currently support va_arg with aggregate types on any target.

it initializes the va_list element to which the argument points.) { . Documentation for the LLVM System at SVN head This example shows how the va_arg instruction and the variable argument handling intrinsic functions are used. Initialize variable argument processing %ap = alloca i8* %ap2 = bitcast i8** %ap to i8* call void @llvm.va_copy(i8* %aq2.va_end(i8* <arglist>) Overview: The 'llvm.va_end' Intrinsic Syntax: declare void @llvm. In a target-dependent way.va_copy and llvm. this intrinsic does not need to know the last argument of the function as the compiler can figure that out.va_copy.va_start or llvm. Arguments: The argument is a pointer to a va_list element to initialize. Demonstrate usage of llvm.va_start(i8*) declare void @llvm..va_end' intrinsic destroys *<arglist>. call void @llvm.va_start' intrinsic works just like the va_start macro available in C.va_start' Intrinsic Syntax: declare void %llvm.va_start(i8* %ap2) . i32 . .va_end(i8* %ap2) ret i32 %tmp } declare void @llvm. 'llvm. so that the next call to va_arg will produce the first variable argument passed to the function. Read a single integer argument %tmp = va_arg i8** %ap. Unlike the C va_start macro. Stop processing of arguments. i8*) declare void @llvm.va_start(i8* <arglist>) Overview: The 'llvm.va_start' intrinsic initializes *<arglist> for subsequent use by va_arg.. Semantics: The 'llvm. 67 . i8* %ap2) call void @llvm. define i32 @test(i32 %X.va_end %aq = alloca i8* %aq2 = bitcast i8** %aq to i8* call void @llvm.va_copy(i8*.va_end(i8*) 'llvm.va_end(i8* %aq2) . which has been initialized previously with llvm.

for example. it destroys the va_list element to which the argument points. as well as garbage collector implementations that require read and write barriers.va_copy(i8* <destarglist>. i8* <srcarglist>) Overview: The 'llvm. Documentation for the LLVM System at SVN head Arguments: The argument is a pointer to a va_list to destroy. The second argument is a pointer to a va_list element to copy from.gcroot' intrinsic declares the existence of a GC root to the code generator.va_copy' intrinsic works just like the va_copy macro available in C. In a target-dependent way. The garbage collection intrinsics only operate on objects in the generic address space (address space zero). These intrinsics allow identification of GC roots on the stack. memory allocation. Semantics: The 'llvm. Semantics: The 'llvm.va_start and llvm. This intrinsic is necessary because the llvm.va_copy' intrinsic copies the current argument position from the source argument list to the destination argument list.va_copy' Intrinsic Syntax: declare void @llvm. it copies the source va_list element into the destination va_list element. 68 . see Accurate Garbage Collection with LLVM. and allows some metadata to be associated with it.va_start intrinsic may be arbitrarily complex and require.gcroot' Intrinsic Syntax: declare void @llvm. Arguments: The first argument is a pointer to a va_list element to initialize. For more details. Accurate Garbage Collection Intrinsics LLVM support for Accurate Garbage Collection (GC) requires the implementation and generation of these intrinsics. Calls to llvm. In a target-dependent way.va_copy must be matched exactly with calls to llvm.va_end. 'llvm.va_end' intrinsic works just like the va_end macro available in C. i8* %metadata) Overview: The 'llvm.gcroot(i8** %ptrloc. 'llvm. Front-ends for type-safe garbage collected languages should generate these intrinsics to make use of the LLVM garbage collectors.

gcread' intrinsic identifies reads of references from heap locations. The second pointer (which must be either a constant or a global value address) contains the meta-data to be associated with the root. if needed by the language runtime (otherwise null). i8* %Obj.gcroot' intrinsic may only be used in a function which specifies a GC algorithm. a call to this intrinsic stores a null pointer into the "ptrloc" location. the second is the start of the object to store it to. Semantics: The 'llvm. Semantics: At runtime.gcread' intrinsic may only be used in a function which specifies a GC algorithm. Arguments: The second argument is the address to read from. as needed. i8** %P2) Overview: The 'llvm.gcwrite' Intrinsic Syntax: declare void @llvm.gcwrite(i8* %P1. The 'llvm. 'llvm. Obj may be null. The 'llvm. Documentation for the LLVM System at SVN head Arguments: The first argument specifies the address of a stack object that contains the root pointer. Arguments: The first argument is the reference to store. allowing garbage collector implementations that require write barriers (such as generational or reference counting collectors). and the third is the address of the field of Obj to store to. i8** %Ptr) Overview: The 'llvm. 69 .gcread' Intrinsic Syntax: declare i8* @llvm.gcread' intrinsic has the same semantics as a load instruction.gcwrite' intrinsic identifies writes of references to heap locations.gcread(i8* %ObjPtr. allowing garbage collector implementations that require read barriers. The first object is a pointer to the start of the referenced object. which should be an address allocated from the garbage collector. If the runtime does not require a pointer to the object. At compile-time. 'llvm. the code generator generates information to allow the runtime to find the pointer at GC safe points. but may be replaced with substantially more complex code by the garbage collector runtime.
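A minimal sketch of how a front-end might declare a root, assuming a collector named "example" has been registered with the function; the collector name and the use of a null metadata pointer are illustrative only:

define void @frame() gc "example" {
entry:
  %obj = alloca i8*                                   ; stack slot holding the GC-visible pointer
  call void @llvm.gcroot(i8** %obj, i8* null)         ; register it as a root, with no metadata
  ret void
}

declare void @llvm.gcroot(i8** %ptrloc, i8* %metadata)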

Code Generator Intrinsics These intrinsics are provided by LLVM to expose special features that may only be implemented with code generator support. Zero indicates the calling function.returnaddress' Intrinsic Syntax: declare i8 *@llvm. 'llvm. Note that calling this intrinsic does not prevent function inlining or other aggressive transformations.frameaddress(i32 <level>) Overview: The 'llvm. or zero if it cannot be identified. Arguments: The argument to this intrinsic indicates which function to return the address for.returnaddress' intrinsic attempts to compute a target-specific value indicating the return address of the current function or one of its callers. one indicates its caller.frameaddress' Intrinsic Syntax: declare i8 *@llvm. so the value returned may not be that of the obvious source-language caller. The 'llvm. Semantics: The 'llvm. Zero indicates the calling function. The argument is required to be a constant integer value.returnaddress' intrinsic either returns a pointer indicating the return address of the specified call frame. as needed. The value returned by this intrinsic is likely to be incorrect or 0 for arguments other than zero. etc. 70 . 'llvm.returnaddress(i32 <level>) Overview: The 'llvm. Documentation for the LLVM System at SVN head Semantics: The 'llvm. so it should only be used for debugging purposes. one indicates its caller. Arguments: The argument to this intrinsic indicates which function to return the frame pointer for. but may be replaced with substantially more complex code by the garbage collector runtime. The argument is required to be a constant integer value.gcwrite' intrinsic has the same semantics as a store instruction.frameaddress' intrinsic attempts to return the target-specific frame pointer value for the specified stack frame.gcwrite' intrinsic may only be used in a function which specifies a GC algorithm. etc.
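As a sketch of how these two intrinsics are typically invoked (level 0, that is, the current function; the result names are invented for the example):

  %ra = call i8* @llvm.returnaddress(i32 0)   ; return address of the current function
  %fa = call i8* @llvm.frameaddress(i32 0)    ; frame pointer of the current function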

Documentation for the LLVM System at SVN head Semantics: The 'llvm.stacksave() Overview: The 'llvm. 'llvm.stacksave. this pops any alloca blocks from the stack that were allocated after the llvm.stacksave' intrinsic is used to remember the current state of the function stack.stackrestore intrinsic is executed with a value saved from llvm.stacksave intrinsic executed. it is a noop. 'llvm.stacksave intrinsic executed. Semantics: This intrinsic returns a opaque pointer value that can be passed to llvm. When an llvm. so it should only be used for debugging purposes. otherwise. 'llvm. i32 <rw>.stackrestore(i8 * %ptr) Overview: The 'llvm.stackrestore. 71 .stacksave. for use with llvm.prefetch(i8* <address>. The value returned by this intrinsic is likely to be incorrect or 0 for arguments other than zero.prefetch' intrinsic is a hint to the code generator to insert a prefetch instruction if supported. This is useful for implementing language features like scoped automatic variable sized arrays in C99. or zero if it cannot be identified. so the value returned may not be that of the obvious source-language caller.frameaddress' intrinsic either returns a pointer indicating the frame address of the specified call frame.stackrestore' intrinsic is used to restore the state of the function stack to the state it was in when the corresponding llvm.stacksave was executed. Semantics: See the description for llvm. In practice.stackrestore. i32 <locality>) Overview: The 'llvm. it effectively restores the state of the stack to the state it was in when the llvm. This is useful for implementing language features like scoped automatic variable sized arrays in C99.stacksave' Intrinsic Syntax: declare i8 *@llvm. Prefetches have no effect on the behavior of the program but can change its performance characteristics.stackrestore' Intrinsic Syntax: declare void @llvm. Note that calling this intrinsic does not prevent function inlining or other aggressive transformations.prefetch' Intrinsic Syntax: declare void @llvm.
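The following sketch shows the scoped-allocation pattern these two intrinsics enable, roughly what a front-end might emit for a C99 variable length array whose lifetime ends at the restore; the names and the element count %n are illustrative:

  %sp = call i8* @llvm.stacksave()              ; remember the current stack state
  %vla = alloca i32, i32 %n                     ; variable-sized allocation
  ; ... use %vla ...
  call void @llvm.stackrestore(i8* %sp)         ; release %vla and anything allocated after the save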

high accuracy clocks) on those targets that support it. In particular. It is possible that the presence of a marker will inhibit optimizations. Arguments: id is a numerical id identifying the marker. Implementations are allowed to either return a application specific value or a system wide value. prefetches cannot trap and do not produce a value. The method is target specific. Standard C Library Intrinsics 72 .pcmarker' Intrinsic Syntax: declare void @llvm. The rw and locality arguments must be constant integers. On targets that support this intrinsic. it should map to RDTSC. Semantics: When directly supported. but it is expected that the marker will use exported symbols to transmit the PC of the marker. The intended use is to be inserted after optimizations to allow correlations of simulation runs. Semantics: This intrinsic does not modify the behavior of the program. On X86.extremely local keep in cache.pcmarker(i32 <id>) Overview: The 'llvm.readcyclecounter( ) Overview: The 'llvm. this should only be used for small timings. to (3) . the prefetch can provide hints to the processor cache for better performance. reading the cycle counter should not modify any memory.readcyclecounter' Intrinsic Syntax: declare i64 @llvm. On backends without support. this is lowered to a constant 0. The marker makes no guarantees that it will remain with any specific instruction after optimizations. On Alpha. As the backing counters overflow quickly (on the order of 9 seconds on alpha).readcyclecounter' intrinsic provides access to the cycle counter register (or similar low latency. it should map to RPCC. Documentation for the LLVM System at SVN head Arguments: address is the address to be prefetched. Backends that do not support this intrinsic may ignore it.pcmarker' intrinsic is a method to export a Program Counter (PC) in a region of code to simulators and other tools. Semantics: This intrinsic does not modify the behavior of the program. 'llvm.no locality. 'llvm. rw is the specifier determining if the fetch should be for a read (0) or write (1). and locality is a temporal locality specifier ranging from (0) .

the second is a pointer to the source. i8 * <src>. Not all targets support all bit widths however. i8 <len>.memcpy. i32 <align>) declare void @llvm. the llvm. Documentation for the LLVM System at SVN head LLVM provides intrinsics for a few important standard C library functions. i32 <align>) declare void @llvm.memcpy on any integer bit width. Note that. It copies "len" bytes of memory over. i64 <len>.memmove on any integer bit width.memcpy. These intrinsics allow source-language front-ends to pass information about the alignment of the pointer arguments to the code generator. If the argument is known to be aligned to some boundary.memmove. i32 <len>. i16 <len>. i8 * <src>.* intrinsics do not return a value.memcpy. Arguments: The first argument is a pointer to the destination. i8 * <src>. unlike the standard libc function.*' intrinsics copy a block of memory from the source location to the destination location.i16(i8 * <dest>.*' intrinsics copy a block of memory from the source location to the destination location.i8(i8 * <dest>. which are not allowed to overlap.memcpy' Intrinsic Syntax: This is an overloaded intrinsic.memcpy. providing opportunity for more efficient code generation. 'llvm. i32 <align>) declare void @llvm. declare void @llvm. and takes an extra alignment argument. this can be specified as the fourth argument. i32 <align>) 73 . declare void @llvm.i32(i8 * <dest>. i8 * <src>.memcpy. then the caller guarantees that both the source and destination pointers are aligned to that boundary. i32 <len>. otherwise it should be set to 0 or 1.memmove. You can use llvm. Semantics: The 'llvm. and the fourth argument is the alignment of the source and destination locations. i32 <align>) Overview: The 'llvm. i8 * <src>. i16 <len>. 'llvm.i64(i8 * <dest>. i8 * <src>.memcpy. You can use llvm. Not all targets support all bit widths however.i8(i8 * <dest>.memmove.i16(i8 * <dest>. i32 <align>) declare void @llvm. If the call to this intrinsic has an alignment value that is not 0 or 1.i32(i8 * <dest>.memcpy. i32 <align>) declare void @llvm. The third argument is an integer argument specifying the number of bytes to copy. i8 <len>. i8 * <src>.memmove' Intrinsic Syntax: This is an overloaded intrinsic.

and the fourth argument is the alignment of the source and destination locations.memcpy' intrinsic but allows the two memory locations to overlap.memmove. which may overlap. i32 <len>. i8 <len>. otherwise it should be set to 0 or 1.*' intrinsics fill a block of memory with a particular byte value.i32(i8 * <dest>.i16(i8 * <dest>.* intrinsics do not return a value. i16 <len>. The third argument is an integer argument specifying the number of bytes to copy. 74 .memmove. declare void @llvm.i8(i8 * <dest>.memset. Note that. i32 <align>) declare void @llvm. this can be specified as the fourth argument. i32 <align>) declare void @llvm.i64(i8 * <dest>.memset. the llvm. the llvm.memset on any integer bit width. Arguments: The first argument is a pointer to the destination to fill.memset. It is similar to the 'llvm. Not all targets support all bit widths however. If the argument is known to be aligned to some boundary. i8 <val>. the second is the byte value to fill it with. Note that.memset. the third argument is an integer argument specifying the number of bytes to fill.memmove. and takes an extra alignment argument.memmove. i64 <len>. and takes an extra alignment argument.memset intrinsic does not return a value.*' intrinsics copy a block of memory from the source location to the destination location. You can use llvm. unlike the standard libc function. i8 <val>.*' intrinsics move a block of memory from the source location to the destination location. the second is a pointer to the source. Documentation for the LLVM System at SVN head declare void @llvm.memset. If the call to this intrinsic has an alignment value that is not 0 or 1.memset. Arguments: The first argument is a pointer to the destination. unlike the standard libc function. i8 <val>. 'llvm. i32 <align>) declare void @llvm. Semantics: The 'llvm. i32 <align>) Overview: The 'llvm. i8 * <src>. i64 <len>. and the fourth argument is the known alignment of destination location.*' Intrinsics Syntax: This is an overloaded intrinsic. It copies "len" bytes of memory over. i32 <align>) Overview: The 'llvm. i8 <val>. then the caller guarantees that the source and destination pointers are aligned to that boundary.i64(i8 * <dest>.
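To show how the length and alignment arguments are passed in practice, here is a sketch of a 32-byte copy using the i32-length variant described above; the buffers and their 4-byte alignment are assumptions of the example:

  %dst = bitcast [32 x i8]* %dstbuf to i8*
  %src = bitcast [32 x i8]* %srcbuf to i8*
  call void @llvm.memcpy.i32(i8* %dst, i8* %src, i32 32, i32 4)   ; copy 32 bytes, both pointers 4-byte aligned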

f64(double %Val) declare x86_fp80 @llvm.*' intrinsics return the first operand raised to the specified (positive or negative) power. 'llvm.sqrt.f128(fp128 %Val. 'llvm. then the caller guarantees that the destination pointer is aligned to that boundary.f80(x86_fp80 %Val) declare fp128 @llvm.f32(float %Val. llvm.0) is defined to return -0. i32 %power) Overview: The 'llvm.powi.f128(fp128 %Val) declare ppc_fp128 @llvm.powi on any floating point or vector of floating point type.memset. Not all targets support all types however.powi.powi.ppcf128(ppc_fp128 %Val. because there is no need to worry about errno being set).sqrt has undefined behavior for negative numbers other than -0. however.powi. this can be specified as the fourth argument.0 (which allows for better optimization.sqrt.powi. i32 %power) declare double @llvm. otherwise it should be set to 0 or 1. Arguments: The argument and return value are floating point numbers of the same type. The order of evaluation of multiplications is not defined. returning the same value as the libm 'sqrt' functions would.f80(x86_fp80 %Val. declare float @llvm.sqrt on any floating point or vector of floating point type. When a vector of floating point type is used. Unlike sqrt in libm. You can use llvm.*' Intrinsic Syntax: This is an overloaded intrinsic.*' intrinsics fill "len" bytes of memory starting at the destination location.powi. If the argument is known to be aligned to some boundary.0 like IEEE sqrt. the second argument remains a scalar integer value.f64(double %Val. Not all targets support all types however. declare float @llvm.sqrt.sqrt' intrinsics return the sqrt of the specified operand.sqrt.sqrt. Semantics: The 'llvm.ppcf128(ppc_fp128 %Val) Overview: The 'llvm.powi.sqrt.*' Intrinsic Syntax: This is an overloaded intrinsic. i32 %power) declare ppc_fp128 @llvm. i32 %power) declare fp128 @llvm. Documentation for the LLVM System at SVN head If the call to this intrinsic has an alignment value that is not 0 or 1. Semantics: This function returns the sqrt of the specified operand if it is a nonnegative floating point number.f32(float %Val) declare double @llvm. You can use llvm. 75 . i32 %power) declare x86_fp80 @llvm.sqrt(-0. llvm.
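A one-line illustration of the scalar f64 variant of llvm.sqrt declared above (the input value is arbitrary):

  %r = call double @llvm.sqrt.f64(double 16.0)        ; %r is 4.0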

f80(x86_fp80 %Val) declare fp128 @llvm.f128(fp128 %Val) declare ppc_fp128 @llvm.*' intrinsics return the sine of the operand.*' intrinsics return the cosine of the operand.sin.sin.cos.cos on any floating point or vector of floating point type. returning the same values as the libm sin functions would.cos. declare float @llvm.sin.cos.*' Intrinsic Syntax: This is an overloaded intrinsic.cos. You can use llvm. 'llvm. and handles error conditions in the same way. declare float @llvm.sin on any floating point or vector of floating point type.sin. Arguments: The argument and return value are floating point numbers of the same type. Semantics: This function returns the first value raised to the second power with an unspecified sequence of rounding operations.f64(double %Val) declare x86_fp80 @llvm. Semantics: This function returns the sine of the specified operand. Not all targets support all types however.ppcf128(ppc_fp128 %Val) Overview: The 'llvm.f64(double %Val) declare x86_fp80 @llvm.f32(float %Val) declare double @llvm. You can use llvm.sin.f80(x86_fp80 %Val) declare fp128 @llvm. Not all targets support all types however.cos. and the first is a value to raise to that power.f128(fp128 %Val) declare ppc_fp128 @llvm.ppcf128(ppc_fp128 %Val) Overview: The 'llvm. Documentation for the LLVM System at SVN head Arguments: The second argument is an integer power. 76 . Arguments: The argument and return value are floating point numbers of the same type.f32(float %Val) declare double @llvm.*' Intrinsic Syntax: This is an overloaded intrinsic. 'llvm.cos.sin.sin.cos.

fp128 %Power) declare ppc_fp128 @llvm.bswap.pow. Semantics: This function returns the first value raised to the second power.e.i32(i32 <id>) declare i64 @llvm.ppcf128(ppc_fp128 %Val. 'llvm. Bit Manipulation Intrinsics LLVM provides intrinsics for a few important bit manipulation operations. returning the same values as the libm cos functions would.bswap. Not all targets support all types however.pow. double %Power) declare x86_fp80 @llvm.pow. declare i16 @llvm. You can use bswap on any integer type that is an even number of bytes (i.*' Intrinsic Syntax: This is an overloaded intrinsic. Documentation for the LLVM System at SVN head Semantics: This function returns the cosine of the specified operand. 'llvm. 77 . BitWidth % 16 == 0).bswap.pow.pow.bswap.*' Intrinsics Syntax: This is an overloaded intrinsic function.pow.i16(i16 <id>) declare i32 @llvm.pow on any floating point or vector of floating point type. and the first is a value to raise to that power. These are useful for performing operations on data that is not in the target's native byte order. ppc_fp128 Power) Overview: The 'llvm. returning the same values as the libm pow functions would. Arguments: The second argument is a floating point power. and handles error conditions in the same way. and handles error conditions in the same way.f128(fp128 %Val.f32(float %Val.f80(x86_fp80 %Val.*' intrinsics return the first operand raised to the specified (positive or negative) power.bswap' family of intrinsics is used to byte swap integer values with an even number of bytes (positive multiple of 16 bits).i64(i64 <id>) Overview: The 'llvm.pow. These allow efficient code generation for some algorithms. You can use llvm.f64(double %Val. x86_fp80 %Power) declare fp128 @llvm. float %Power) declare double @llvm. declare float @llvm.
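As an illustration of the libm-style intrinsics above, a single f64 call to llvm.pow with arbitrarily chosen operands:

  %r = call double @llvm.pow.f64(double 2.0, double 10.0)     ; %r is 1024.0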

i16(i16 <src>) declare i32 @llvm.ctlz.i16(i16 <src>) declare i32 @llvm.i256(i256 <src>) Overview: The 'llvm.bswap. The argument may be of any integer type.ctlz.ctpop. Semantics: The 'llvm.ctlz' family of intrinsic functions counts the number of leading zeros in a variable. The return type must match the argument type.i48.bswap. Not all targets support all bit widths however. Arguments: The only argument is the value to be counted.ctpop' family of intrinsics counts the number of bits set in a value.ctpop.ctpop' intrinsic counts the 1's in a variable.bswap. 2. 'llvm. The argument may be of any integer type.*' Intrinsic Syntax: This is an overloaded intrinsic.ctlz.i32(i32 <src>) declare i64 @llvm. 3 then the returned i32 will have its bytes in 3. declare i8 @llvm.ctlz.i32(i32 <src>) declare i64 @llvm. llvm.i16 intrinsic returns an i16 value that has the high and low byte of the input i16 swapped.ctlz.ctpop. Documentation for the LLVM System at SVN head Semantics: The llvm.bswap.*' Intrinsic Syntax: This is an overloaded intrinsic.ctpop on any integer bit width. 1.ctpop. so that if the input bytes are numbered 0.i32 intrinsic returns an i32 value that has the four bytes of the input i32 swapped.i256(i256 <src>) Overview: The 'llvm.i64 and other intrinsics extend this concept to additional even-byte lengths (6 bytes. You can use llvm. 'llvm.i8(i8 <src>) declare i16 @llvm. respectively). 78 . You can use llvm. 8 bytes and more. Not all targets support all bit widths however. The return type must match the argument type. Similarly.ctpop. 1.ctpop. Arguments: The only argument is the value to be counted. 0 order.i64(i64 <src>) declare i256 @llvm. the llvm. The llvm. declare i8 @llvm.ctlz.ctlz on any integer bit width. 2.i64(i64 <src>) declare i256 @llvm.i8 (i8 <src>) declare i16 @llvm.
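A short sketch showing the i32 variants of the byte swap and population count intrinsics just described (input values are arbitrary):

  %s = call i32 @llvm.bswap.i32(i32 305419896)    ; 0x12345678 becomes 0x78563412
  %n = call i32 @llvm.ctpop.i32(i32 7)            ; three bits are set, so %n is 3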

Semantics: The 'llvm. If the src == 0 then the result is the size in bits of the type of src.cttz.ctlz' intrinsic counts the leading (most significant) zeros in a variable. i64 %b) Overview: The 'llvm. For example. declare i8 @llvm. llvm.i32(i32 <src>) declare i64 @llvm.overflow.i256(i256 <src>) Overview: The 'llvm. llvm. Arguments: The only argument is the value to be counted.overflow on any integer bit width.i64(i64 %a. You can use llvm.cttz. i1} @llvm.overflow.sadd. declare {i16. For example.cttz.cttz' intrinsic counts the trailing (least significant) zeros in a variable. Arguments: The arguments (%a and %b) and the first element of the result structure may be of integer types of any bit width. The second element of the result structure must be of type i1. i1} @llvm.cttz' family of intrinsic functions counts the number of trailing zeros. 79 .i16(i16 %a. Documentation for the LLVM System at SVN head Semantics: The 'llvm.with. The return type must match the argument type.cttz. %a and %b are the two values that will undergo signed addition. If the src == 0 then the result is the size in bits of the type of src.sadd.cttz(2) = 1.i8 (i8 <src>) declare i16 @llvm. i32 %b) declare {i64. but they must have the same bit width. i16 %b) declare {i32.ctlz(i32 2) = 30.sadd. 'llvm.overflow.overflow.i16(i16 <src>) declare i32 @llvm.with. You can use llvm.with.with. Not all targets support all bit widths however.i64(i64 <src>) declare i256 @llvm.cttz on any integer bit width.sadd.with.overflow' family of intrinsic functions perform a signed addition of the two arguments.cttz.sadd.with. The argument may be of any integer type. i1} @llvm.*' Intrinsic Syntax: This is an overloaded intrinsic. Arithmetic with Overflow Intrinsics LLVM provides intrinsics for some arithmetic with overflow operations. and indicate whether an overflow occurred during the signed summation. 'llvm.sadd.i32(i32 %a.cttz.*' Intrinsics Syntax: This is an overloaded intrinsic.

Arithmetic with Overflow Intrinsics

LLVM provides intrinsics for some arithmetic with overflow operations.

'llvm.sadd.with.overflow.*', 'llvm.uadd.with.overflow.*', 'llvm.ssub.with.overflow.*', 'llvm.usub.with.overflow.*', 'llvm.smul.with.overflow.*' and 'llvm.umul.with.overflow.*' Intrinsics

Syntax:
These are overloaded intrinsics. You can use each of them on any integer bit width. Every family is declared the same way; for signed addition, for example:

  declare {i16, i1} @llvm.sadd.with.overflow.i16(i16 %a, i16 %b)
  declare {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
  declare {i64, i1} @llvm.sadd.with.overflow.i64(i64 %a, i64 %b)

The uadd, ssub, usub, smul and umul families are declared identically, with only the intrinsic name changed.

Overview:
• 'llvm.sadd.with.overflow' and 'llvm.uadd.with.overflow' perform a signed or unsigned addition of the two arguments and indicate whether an overflow (for unsigned addition, a carry) occurred during the summation.
• 'llvm.ssub.with.overflow' and 'llvm.usub.with.overflow' perform a signed or unsigned subtraction of the two arguments and indicate whether an overflow occurred during the subtraction.
• 'llvm.smul.with.overflow' and 'llvm.umul.with.overflow' perform a signed or unsigned multiplication of the two arguments and indicate whether an overflow occurred during the multiplication.

Arguments:
In every case the arguments (%a and %b) and the first element of the result structure may be of integer types of any bit width, but they must have the same bit width. The second element of the result structure must be of type i1. %a and %b are the two values that will undergo the operation.

Semantics:
Each of these intrinsic families returns a structure: the first element is the result of the operation, and the second element is a bit specifying whether the operation resulted in an overflow (or, for the unsigned additions, a carry).

Examples:

  %res = call {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
  %sum = extractvalue {i32, i1} %res, 0
  %obit = extractvalue {i32, i1} %res, 1
  br i1 %obit, label %overflow, label %normal

The other families are used in exactly the same way, substituting the appropriate intrinsic name.
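A typical use of the unsigned variants is guarding a size computation before an allocation or a copy. The fragment below is a sketch, not part of the original manual; the value and label names are invented, and it assumes the i32 form of llvm.umul.with.overflow described above.

  ; compute %count * 4 bytes, diverting to an error block if the
  ; unsigned multiplication wrapped
  %res   = call {i32, i1} @llvm.umul.with.overflow.i32(i32 %count, i32 4)
  %bytes = extractvalue {i32, i1} %res, 0
  %ovfl  = extractvalue {i32, i1} %res, 1
  br i1 %ovfl, label %too_big, label %do_alloc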

Debugger Intrinsics

The LLVM debugger intrinsics (which all start with the llvm.dbg. prefix) are described in the LLVM Source Level Debugging document.

Exception Handling Intrinsics

The LLVM exception handling intrinsics (which all start with the llvm.eh. prefix) are described in the LLVM Exception Handling document.

Trampoline Intrinsic

This intrinsic makes it possible to excise one parameter, marked with the nest attribute, from a function. The result is a callable function pointer lacking the nest parameter - the caller does not need to provide a value for it. Instead, the value to use is stored in advance in a "trampoline", a block of memory usually allocated on the stack, which also contains code to splice the nest value into the argument list. This is used to implement the GCC nested function address extension.

For example, if the function is i32 f(i8* nest %c, i32 %x, i32 %y) then the resulting function pointer has signature i32 (i32, i32)*. It can be created as follows:

  %tramp = alloca [10 x i8], align 4 ; size and alignment only correct for X86
  %tramp1 = getelementptr [10 x i8]* %tramp, i32 0, i32 0
  %p = call i8* @llvm.init.trampoline(i8* %tramp1, i8* bitcast (i32 (i8* nest, i32, i32)* @f to i8*), i8* %nval)
  %fp = bitcast i8* %p to i32 (i32, i32)*

The call

  %val = call i32 %fp(i32 %x, i32 %y)

is then equivalent to

  %val = call i32 %f(i8* %nval, i32 %x, i32 %y)

'llvm.init.trampoline' Intrinsic

Syntax:

  declare i8* @llvm.init.trampoline(i8* <tramp>, i8* <func>, i8* <nval>)

Overview:
This fills the memory pointed to by tramp with code and returns a function pointer suitable for executing it.

Arguments:
The llvm.init.trampoline intrinsic takes three arguments, all pointers. The tramp argument must point to a sufficiently large and sufficiently aligned block of memory; this memory is written to by the intrinsic. Note that the size and the alignment are target-specific (LLVM currently provides no portable way of determining them), so a front-end that generates this intrinsic needs to have some target-specific knowledge. The func argument must hold a function bitcast to an i8*.

Semantics:
The block of memory pointed to by tramp is filled with target dependent code, turning it into a function. A pointer to this function is returned, but needs to be bitcast to an appropriate function pointer type before being called. The new function's signature is the same as that of func with any arguments marked with the nest attribute removed. At most one such nest argument is allowed, and it must be of pointer type. Calling the new function is equivalent to calling func with the same argument list, but with nval used for the missing nest argument. If, after calling llvm.init.trampoline, the memory pointed to by tramp is modified, then the effect of any later call to the returned function pointer is undefined.

Atomic Operations and Synchronization Intrinsics

These intrinsic functions expand the "universal IR" of LLVM to represent hardware constructs for atomic operations and memory synchronization. This provides an interface to the hardware, not an interface to the programmer. It is aimed at a low enough level to allow any programming models or APIs (Application Programming Interfaces) which need atomic behaviors to map cleanly onto it. It is also modeled primarily on hardware behavior. Just as hardware provides a "universal IR" for source languages, it also provides a starting point for developing a "universal" atomic operation and synchronization IR.

These do not form an API such as the high-level threading libraries, software transaction memory systems, atomic primitives, and intrinsic functions as found in BSD, GNU libc, atomic_ops, APR, and other system and application libraries. The hardware interface provided by LLVM should allow a clean implementation of all of these APIs and parallel programming models. No one model or paradigm should be selected above others unless the hardware itself ubiquitously does so.

'llvm.memory.barrier' Intrinsic

Syntax:

  declare void @llvm.memory.barrier(i1 <ll>, i1 <ls>, i1 <sl>, i1 <ss>, i1 <device>)

Overview:
The llvm.memory.barrier intrinsic guarantees ordering between specific pairs of memory access types.

Arguments:
The llvm.memory.barrier intrinsic requires five boolean arguments. The first four arguments enable a specific barrier as listed below; the fifth argument specifies that the barrier applies to io or device or uncached memory.

• ll: load-load barrier

• ls: load-store barrier
• sl: store-load barrier
• ss: store-store barrier
• device: barrier applies to device and uncached memory also

Semantics:
This intrinsic causes the system to enforce some ordering constraints upon the loads and stores of the program. For any of the specified pairs of load and store operations (f.ex. load-load, or store-load), all of the first operations preceding the barrier will complete before any of the second operations succeeding the barrier begin. Specifically the semantics for each pairing is as follows:

• ll: All loads before the barrier must complete before any load after the barrier begins.
• ls: All loads before the barrier must complete before any store after the barrier begins.
• sl: All stores before the barrier must complete before any load after the barrier begins.
• ss: All stores before the barrier must complete before any store after the barrier begins.

These semantics are applied with a logical "and" behavior when more than one is enabled in a single memory barrier intrinsic. This barrier does not indicate when any events will occur; it only enforces an order in which they occur. Backends may implement stronger barriers than those requested when they do not support as fine grained a barrier as requested. Some architectures do not need all types of barriers; on such architectures these become noops.

Example:

  %mallocP = tail call i8* @malloc(i32 ptrtoint (i32* getelementptr (i32* null, i32 1) to i32))
  %ptr     = bitcast i8* %mallocP to i32*
             store i32 4, %ptr
  %result1 = load i32* %ptr                  ; yields {i32}:result1 = 4
             call void @llvm.memory.barrier(i1 false, i1 true, i1 false, i1 false, i1 false)
                                             ; guarantee the above finishes
             store i32 8, %ptr               ; before this begins
  %result2 = load i32* %ptr                  ; yields {i32}:result2 = 8

'llvm.atomic.cmp.swap.*' Intrinsic

Syntax:
This is an overloaded intrinsic. You can use llvm.atomic.cmp.swap on any integer bit width and for different address spaces. Not all targets support all bit widths however.

  declare i8  @llvm.atomic.cmp.swap.i8.p0i8(i8* <ptr>, i8 <cmp>, i8 <val>)
  declare i16 @llvm.atomic.cmp.swap.i16.p0i16(i16* <ptr>, i16 <cmp>, i16 <val>)
  declare i32 @llvm.atomic.cmp.swap.i32.p0i32(i32* <ptr>, i32 <cmp>, i32 <val>)
  declare i64 @llvm.atomic.cmp.swap.i64.p0i64(i64* <ptr>, i64 <cmp>, i64 <val>)

Overview:
This loads a value in memory and compares it to a given value. If they are equal, it stores a new value into the memory.

Arguments:
The llvm.atomic.cmp.swap intrinsic takes three arguments. The result as well as both cmp and val must be integer values with the same bit width. The ptr argument must be a pointer to a value of this integer type. While any bit width integer may be used, targets may only lower representations they support in hardware.

Semantics:
This entire intrinsic must be executed atomically. It first loads the value in memory pointed to by ptr and compares it with the value cmp. If they are equal, val is stored into the memory. The loaded value is yielded in all cases. This provides the equivalent of an atomic compare-and-swap operation within the SSA framework.

Examples:

  %mallocP = tail call i8* @malloc(i32 ptrtoint (i32* getelementptr (i32* null, i32 1) to i32))
  %ptr     = bitcast i8* %mallocP to i32*
             store i32 4, %ptr

  %val1    = add i32 4, 4
  %result1 = call i32 @llvm.atomic.cmp.swap.i32.p0i32(i32* %ptr, i32 4, i32 %val1)
                                           ; yields {i32}:result1 = 4
  %stored1 = icmp eq i32 %result1, 4       ; yields {i1}:stored1 = true
  %memval1 = load i32* %ptr                ; yields {i32}:memval1 = 8

  %val2    = add i32 1, 4
  %result2 = call i32 @llvm.atomic.cmp.swap.i32.p0i32(i32* %ptr, i32 5, i32 %val2)
                                           ; yields {i32}:result2 = 8
  %stored2 = icmp eq i32 %result2, 5       ; yields {i1}:stored2 = false
  %memval2 = load i32* %ptr                ; yields {i32}:memval2 = 8

'llvm.atomic.swap.*' Intrinsic

Syntax:
This is an overloaded intrinsic. You can use llvm.atomic.swap on any integer bit width. Not all targets support all bit widths however.

  declare i8  @llvm.atomic.swap.i8.p0i8(i8* <ptr>, i8 <val>)
  declare i16 @llvm.atomic.swap.i16.p0i16(i16* <ptr>, i16 <val>)
  declare i32 @llvm.atomic.swap.i32.p0i32(i32* <ptr>, i32 <val>)
  declare i64 @llvm.atomic.swap.i64.p0i64(i64* <ptr>, i64 <val>)

Overview:
This intrinsic loads the value stored in memory at ptr and yields the value from memory. It then stores the value in val in the memory at ptr.

Arguments:
The llvm.atomic.swap intrinsic takes two arguments. Both the val argument and the result must be integers of the same bit width. The first argument, ptr, must be a pointer to a value of this integer type. The targets may only lower integer representations they support.

Semantics:
This intrinsic loads the value pointed to by ptr, yields it, and stores val back into ptr atomically. This provides the equivalent of an atomic swap operation within the SSA framework.

Examples:

  %mallocP = tail call i8* @malloc(i32 ptrtoint (i32* getelementptr (i32* null, i32 1) to i32))
  %ptr     = bitcast i8* %mallocP to i32*
             store i32 4, %ptr

  %val1    = add i32 4, 4
  %result1 = call i32 @llvm.atomic.swap.i32.p0i32(i32* %ptr, i32 %val1)
                                           ; yields {i32}:result1 = 4
  %stored1 = icmp eq i32 %result1, 4       ; yields {i1}:stored1 = true
  %memval1 = load i32* %ptr                ; yields {i32}:memval1 = 8

  %val2    = add i32 1, 4
  %result2 = call i32 @llvm.atomic.swap.i32.p0i32(i32* %ptr, i32 %val2)
                                           ; yields {i32}:result2 = 8
  %stored2 = icmp eq i32 %result2, 8       ; yields {i1}:stored2 = true
  %memval2 = load i32* %ptr                ; yields {i32}:memval2 = 5
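Taken together, llvm.atomic.swap and llvm.memory.barrier are enough to express a simple test-and-set lock. The function below is a sketch only, not part of the original manual: the lock convention (0 = free, 1 = held), the function name, and the label names are all invented for illustration.

  define void @acquire_lock(i32* %lock) {
  entry:
    br label %spin
  spin:
    ; atomically store 1 into the lock word and observe the previous value
    %old = call i32 @llvm.atomic.swap.i32.p0i32(i32* %lock, i32 1)
    %was_free = icmp eq i32 %old, 0
    br i1 %was_free, label %acquired, label %spin
  acquired:
    ; keep the protected loads and stores from starting before the lock is held
    call void @llvm.memory.barrier(i1 true, i1 true, i1 true, i1 true, i1 false)
    ret void
  }

  declare i32 @llvm.atomic.swap.i32.p0i32(i32*, i32)
  declare void @llvm.memory.barrier(i1, i1, i1, i1, i1)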

'llvm.atomic.load.add.*' Intrinsic

Syntax:
This is an overloaded intrinsic. You can use llvm.atomic.load.add on any integer bit width. Not all targets support all bit widths however.

  declare i8  @llvm.atomic.load.add.i8.p0i8(i8* <ptr>, i8 <delta>)
  declare i16 @llvm.atomic.load.add.i16.p0i16(i16* <ptr>, i16 <delta>)
  declare i32 @llvm.atomic.load.add.i32.p0i32(i32* <ptr>, i32 <delta>)
  declare i64 @llvm.atomic.load.add.i64.p0i64(i64* <ptr>, i64 <delta>)

Overview:
This intrinsic adds delta to the value stored in memory at ptr. It yields the original value stored at ptr.

Arguments:
The intrinsic takes two arguments, the first a pointer to an integer value and the second an integer value. The result is also an integer value. These integer types can have any bit width, but they must all have the same bit width. The targets may only lower integer representations they support.

Semantics:
This intrinsic does a series of operations atomically. It first loads the value stored at ptr, then adds delta, stores the result to ptr, and yields the original value stored at ptr.

Examples:

  %mallocP = tail call i8* @malloc(i32 ptrtoint (i32* getelementptr (i32* null, i32 1) to i32))
  %ptr     = bitcast i8* %mallocP to i32*
             store i32 4, %ptr
  %result1 = call i32 @llvm.atomic.load.add.i32.p0i32(i32* %ptr, i32 4)   ; yields {i32}:result1 = 4
  %result2 = call i32 @llvm.atomic.load.add.i32.p0i32(i32* %ptr, i32 2)   ; yields {i32}:result2 = 8
  %result3 = call i32 @llvm.atomic.load.add.i32.p0i32(i32* %ptr, i32 5)   ; yields {i32}:result3 = 10
  %memval1 = load i32* %ptr                                               ; yields {i32}:memval1 = 15

'llvm.atomic.load.sub.*', 'llvm.atomic.load.and.*', 'llvm.atomic.load.nand.*', 'llvm.atomic.load.or.*', 'llvm.atomic.load.xor.*', 'llvm.atomic.load.max.*', 'llvm.atomic.load.min.*', 'llvm.atomic.load.umax.*' and 'llvm.atomic.load.umin.*' Intrinsics

Syntax:
These are overloaded intrinsics with the same shape as llvm.atomic.load.add: each can be used on any integer bit width and for different address spaces, and each takes a pointer and a delta of the same integer type, for example:

  declare i32 @llvm.atomic.load.sub.i32.p0i32(i32* <ptr>, i32 <delta>)
  declare i32 @llvm.atomic.load.and.i32.p0i32(i32* <ptr>, i32 <delta>)
  declare i32 @llvm.atomic.load.max.i32.p0i32(i32* <ptr>, i32 <delta>)

Not all targets support all bit widths however.

Overview:
• 'llvm.atomic.load.sub' subtracts delta from the value stored in memory at ptr.
• 'llvm.atomic.load.and', 'llvm.atomic.load.nand', 'llvm.atomic.load.or' and 'llvm.atomic.load.xor' apply the corresponding bitwise operation between delta and the value stored in memory at ptr.
• 'llvm.atomic.load.max' and 'llvm.atomic.load.min' take the signed maximum or minimum of delta and the value stored in memory at ptr; 'llvm.atomic.load.umax' and 'llvm.atomic.load.umin' take the unsigned maximum or minimum.

Arguments and Semantics:
The arguments and semantics follow the same pattern as llvm.atomic.load.add: the two arguments are a pointer to an integer value and an integer value of the same bit width, the result is also an integer value of that width, and the targets may only lower integer representations they support. Each intrinsic does a series of operations atomically: it first loads the value stored at ptr, combines it with delta using the named operation, stores the result to ptr, and yields the original value stored at ptr.

Examples:

  %mallocP = tail call i8* @malloc(i32 ptrtoint (i32* getelementptr (i32* null, i32 1) to i32))
  %ptr     = bitcast i8* %mallocP to i32*
             store i32 0x0F0F, %ptr
  %result0 = call i32 @llvm.atomic.load.nand.i32.p0i32(i32* %ptr, i32 0xFF)   ; yields {i32}:result0 = 0x0F0F
  %result1 = call i32 @llvm.atomic.load.and.i32.p0i32(i32* %ptr, i32 0xFF)    ; yields {i32}:result1 = 0xFFFFFFF0
  %result2 = call i32 @llvm.atomic.load.or.i32.p0i32(i32* %ptr, i32 0xF)      ; yields {i32}:result2 = 0xF0
  %result3 = call i32 @llvm.atomic.load.xor.i32.p0i32(i32* %ptr, i32 0xF)     ; yields {i32}:result3 = 0xFF
  %memval1 = load i32* %ptr                                                   ; yields {i32}:memval1 = 0xF0

  %mallocP = tail call i8* @malloc(i32 ptrtoint (i32* getelementptr (i32* null, i32 1) to i32))
  %ptr     = bitcast i8* %mallocP to i32*
             store i32 7, %ptr
  %result0 = call i32 @llvm.atomic.load.min.i32.p0i32(i32* %ptr, i32 -2)      ; yields {i32}:result0 = 7
  %result1 = call i32 @llvm.atomic.load.max.i32.p0i32(i32* %ptr, i32 8)       ; yields {i32}:result1 = -2
  %result2 = call i32 @llvm.atomic.load.umin.i32.p0i32(i32* %ptr, i32 10)     ; yields {i32}:result2 = 8
  %result3 = call i32 @llvm.atomic.load.umax.i32.p0i32(i32* %ptr, i32 30)     ; yields {i32}:result3 = 8
  %memval1 = load i32* %ptr                                                   ; yields {i32}:memval1 = 30
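One everyday use of these read-modify-write intrinsics is a shared reference count. The fragment below is a sketch, not part of the original manual; %refcnt and the label names are invented. Because the intrinsic yields the value that was in memory before the subtraction, observing 1 means the caller released the last reference.

  ; drop one reference and find out whether it was the last one
  %old      = call i32 @llvm.atomic.load.sub.i32.p0i32(i32* %refcnt, i32 1)
  %was_last = icmp eq i32 %old, 1
  br i1 %was_last, label %destroy, label %done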

Memory Use Markers

This class of intrinsics exists to provide information about the lifetime of memory objects and ranges where variables are immutable.

'llvm.lifetime.start' Intrinsic

Syntax:

  declare void @llvm.lifetime.start(i64 <size>, i8* nocapture <ptr>)

Overview:
The 'llvm.lifetime.start' intrinsic specifies the start of a memory object's lifetime.

Arguments:
The first argument is a constant integer representing the size of the object, or -1 if it is variable sized. The second argument is a pointer to the object.

Semantics:
This intrinsic indicates that before this point in the code, the value of the memory pointed to by ptr is dead. This means that it is known to never be used and has an undefined value. A load from the pointer that precedes this intrinsic can be replaced with 'undef'.

'llvm.lifetime.end' Intrinsic

Syntax:

  declare void @llvm.lifetime.end(i64 <size>, i8* nocapture <ptr>)

Overview:
The 'llvm.lifetime.end' intrinsic specifies the end of a memory object's lifetime.

Arguments:
The first argument is a constant integer representing the size of the object, or -1 if it is variable sized. The second argument is a pointer to the object.

Semantics:
This intrinsic indicates that after this point in the code, the value of the memory pointed to by ptr is dead. This means that it is known to never be used and has an undefined value. Any stores into the memory object following this intrinsic may be removed as dead.
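For illustration, a front end might bracket the useful range of a stack buffer with these two markers so the optimizer can reuse the stack slot outside that range. This fragment is a sketch, not part of the original manual; the value names are invented.

  %buf = alloca [16 x i8]
  %p   = getelementptr [16 x i8]* %buf, i32 0, i32 0
  call void @llvm.lifetime.start(i64 16, i8* %p)
  ; ... %buf may be used here ...
  call void @llvm.lifetime.end(i64 16, i8* %p)
  ; after this point the contents of %buf are dead

  declare void @llvm.lifetime.start(i64, i8* nocapture)
  declare void @llvm.lifetime.end(i64, i8* nocapture)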

'llvm.invariant.start' Intrinsic

Syntax:

  declare {}* @llvm.invariant.start(i64 <size>, i8* nocapture <ptr>) readonly

Overview:
The 'llvm.invariant.start' intrinsic specifies that the contents of a memory object will not change.

Arguments:
The first argument is a constant integer representing the size of the object, or -1 if it is variable sized. The second argument is a pointer to the object.

Semantics:
This intrinsic indicates that until an llvm.invariant.end that uses the return value, the referenced memory location is constant and unchanging.

'llvm.invariant.end' Intrinsic

Syntax:

  declare void @llvm.invariant.end({}* <start>, i64 <size>, i8* nocapture <ptr>)

Overview:
The 'llvm.invariant.end' intrinsic specifies that the contents of a memory object are mutable.

Arguments:
The first argument is the matching llvm.invariant.start intrinsic. The second argument is a constant integer representing the size of the object, or -1 if it is variable sized, and the third argument is a pointer to the object.

Semantics:
This intrinsic indicates that the memory is mutable again.

General Intrinsics

This class of intrinsics is designed to be generic and has no specific purpose.

'llvm.var.annotation' Intrinsic

Syntax:

  declare void @llvm.var.annotation(i8* <val>, i8* <str>, i8* <str>, i32 <int>)

Overview:
The 'llvm.var.annotation' intrinsic attaches an annotation string and source location to a local variable.

Arguments:
The first argument is a pointer to a value, the second is a pointer to a global string, the third is a pointer to a global string which is the source file name, and the last argument is the line number.

Semantics:
This intrinsic allows annotation of local variables with arbitrary strings. This can be useful for special purpose optimizations that want to look for these annotations. These have no other defined use; they are ignored by code generation and optimization.

'llvm.annotation.*' Intrinsic

Syntax:
This is an overloaded intrinsic. You can use 'llvm.annotation' on any integer bit width.

  declare i8   @llvm.annotation.i8(i8 <val>, i8* <str>, i8* <str>, i32 <int>)
  declare i16  @llvm.annotation.i16(i16 <val>, i8* <str>, i8* <str>, i32 <int>)
  declare i32  @llvm.annotation.i32(i32 <val>, i8* <str>, i8* <str>, i32 <int>)
  declare i64  @llvm.annotation.i64(i64 <val>, i8* <str>, i8* <str>, i32 <int>)
  declare i256 @llvm.annotation.i256(i256 <val>, i8* <str>, i8* <str>, i32 <int>)

Overview:
The 'llvm.annotation' intrinsic attaches an annotation string and source location to an arbitrary integer expression.

Arguments:
The first argument is an integer value (result of some expression), the second is a pointer to a global string, the third is a pointer to a global string which is the source file name, and the last argument is the line number. It returns the value of the first argument.

Semantics:
This intrinsic allows annotations to be put on arbitrary expressions with arbitrary strings. This can be useful for special purpose optimizations that want to look for these annotations. These have no other defined use; they are ignored by code generation and optimization.

'llvm.trap' Intrinsic

Syntax:

  declare void @llvm.trap()

Overview:
The 'llvm.trap' intrinsic halts the program with a target trap.

Arguments:
None.

Semantics:
This intrinsic is lowered to the target dependent trap instruction. If the target does not have a trap instruction, this intrinsic will be lowered to the call of the abort() function.
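As a small illustration (not from the original manual; the function and label names are invented), a front end can pair llvm.trap with an unreachable terminator to abort when a checked condition fails:

  define void @check(i1 %ok) {
  entry:
    br i1 %ok, label %cont, label %die
  die:
    call void @llvm.trap()
    unreachable
  cont:
    ret void
  }

  declare void @llvm.trap()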

'llvm.stackprotector' Intrinsic

Syntax:

  declare void @llvm.stackprotector(i8* <guard>, i8** <slot>)

Overview:
The llvm.stackprotector intrinsic takes the guard and stores it onto the stack at slot. The stack slot is adjusted to ensure that it is placed on the stack before local variables.

Arguments:
The llvm.stackprotector intrinsic requires two pointer arguments. The first argument is the value loaded from the stack guard @__stack_chk_guard. The second variable is an alloca that has enough space to hold the value of the guard.

Semantics:
This intrinsic causes the prologue/epilogue inserter to force the position of the AllocaInst stack slot to be before local variables on the stack. This is to ensure that if a local variable on the stack is overwritten, it will destroy the value of the guard. When the function exits, the guard on the stack is checked against the original guard. If they're different, then the program aborts by calling the __stack_chk_fail() function.

'llvm.objectsize' Intrinsic

Syntax:

  declare i32 @llvm.objectsize.i32(i8* <object>, i1 <type>)
  declare i64 @llvm.objectsize.i64(i8* <object>, i1 <type>)

Overview:
The llvm.objectsize intrinsic is designed to provide information to the optimizers to discover at compile time either a) that an operation like memcpy will overflow a buffer that corresponds to an object, or b) that a runtime check for overflow isn't necessary. An object in this context means an allocation of a specific class, structure, array, or other object.

Arguments:
The llvm.objectsize intrinsic takes two arguments. The first argument is a pointer to or into the object. The second argument is a boolean 0 or 1. This argument determines whether you want the maximum (0) or minimum (1) bytes remaining. This needs to be a literal 0 or 1; variables are not allowed.

Semantics:
The llvm.objectsize intrinsic is lowered to either a constant representing the size of the object concerned, or to i32/i64 -1 or 0 (depending on the type argument) if the size cannot be determined at compile time.
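For illustration only (not part of the original manual; the value names are invented), the following asks for the maximum number of bytes reachable from a pointer into a fixed-size stack object, which an optimizer could fold to the constant 32:

  %obj = alloca [32 x i8]
  %p   = getelementptr [32 x i8]* %obj, i32 0, i32 0
  %sz  = call i32 @llvm.objectsize.i32(i8* %p, i1 false)   ; false (0) requests the maximum bytes remaining

  declare i32 @llvm.objectsize.i32(i8*, i1)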

Chris Lattner
The LLVM Compiler Infrastructure
Last modified: $Date: 2010-03-11 18:12:20 -0600 (Thu, 11 Mar 2010) $

The Often Misunderstood GEP Instruction

1. Introduction
2. Address Computation
   1. What is the first index of the GEP instruction?
   2. What is dereferenced by GEP?
   3. Why can you index through the first pointer but not subsequent ones?
   4. Why is the extra 0 index required?
   5. Why don't GEP x,0,0,1 and GEP x,1 alias?
   6. Why do GEP x,1,0,0 and GEP x,1 alias?
   7. Can GEP index into vector elements?
   8. Can GEP index into unions?
   9. What effect do address spaces have on GEPs?
   10. How is GEP different from ptrtoint, arithmetic, and inttoptr?
   11. I'm writing a backend for a target which needs custom lowering for GEP. How do I do this?
   12. How does VLA addressing work with GEPs?
3. Rules
   1. What happens if an array index is out of bounds?
   2. Can array indices be negative?
   3. Can I compare two values computed with GEPs?
   4. Can I do GEP with a different pointer type than the type of the underlying object?
   5. Can I cast an object's address to integer and add it to null?
   6. Can I compute the distance between two objects, and add that value to one address to compute the other address?
   7. Can I do type-based alias analysis on LLVM IR?
   8. What happens if a GEP computation overflows?
   9. How can I tell if my front-end is following the rules?
4. Rationale
   1. Why is GEP designed this way?
   2. Why do struct member indices always use i32?
   3. What's an uglygep?
5. Summary

Written by: Reid Spencer.

Introduction

This document seeks to dispel the mystery and confusion surrounding LLVM's GetElementPtr (GEP) instruction. Questions about the wily GEP instruction are probably the most frequently occurring questions once a developer gets down to coding with LLVM. Here we lay out the sources of confusion and show that the GEP instruction is really quite simple.

Address Computation

What is the first index of the GEP instruction?

Quick answer: The index stepping through the first operand.

When people are first confronted with the GEP instruction, they tend to relate it to known concepts from other programming paradigms, most notably C array indexing and field selection. GEP closely resembles C array indexing and field selection; however, it's a little different and this leads to the following questions. The confusion with the first index usually arises from thinking about the GetElementPtr instruction as if it was a C index operator. They aren't the same. For example, when we write, in "C":

  AType *Foo;
  ...
  X = &Foo->F;

it is natural to think that there is only one index, the selection of the field F. However, in this example, Foo is a pointer. That pointer must be indexed explicitly in LLVM. C, on the other hand, indices through it transparently. To arrive at the same address location as the C code, you would provide the GEP instruction with two index operands. The first operand indexes through the pointer; the second operand indexes the field F of the structure, just as if you wrote:

  X = &Foo[0].F;

Sometimes this question gets rephrased as:

  Why is it okay to index through the first pointer, but subsequent pointers won't be dereferenced?

The answer is simply because memory does not have to be accessed to perform the computation. The first operand to the GEP instruction must be a value of a pointer type. The value of the pointer is provided directly to the GEP instruction as an operand without any need for accessing memory. It must, therefore, be indexed and requires an index operand. Consider this example:

  struct munger_struct {
    int f1;
    int f2;
  };
  void munge(struct munger_struct *P) {
    P[0].f1 = P[1].f1 + P[2].f2;
  }
  ...
  munger_struct Array[3];
  ...
  munge(Array);

In this "C" example, the front end compiler (llvm-gcc) will generate three GEP instructions for the three indices through "P" in the assignment statement. The function argument P will be the first operand of each of these GEP instructions. The second operand indexes through that pointer. The third operand will be the field offset into the struct munger_struct type, for either the f1 or f2 field. So, in LLVM assembly the munge function looks like:

  void %munge(%struct.munger_struct* %P) {
  entry:
    %tmp = getelementptr %struct.munger_struct* %P, i32 1, i32 0
    %tmp5 = load i32* %tmp
    %tmp6 = getelementptr %struct.munger_struct* %P, i32 2, i32 1
    %tmp7 = load i32* %tmp6
    %tmp8 = add i32 %tmp7, %tmp5
    %tmp9 = getelementptr %struct.munger_struct* %P, i32 0, i32 0
    store i32 %tmp8, i32* %tmp9
    ret void
  }

In each case the first operand is the pointer through which the GEP instruction starts. The same is true whether the first operand is an argument, allocated memory, or a global variable.

that is MyVar+4. idx2 computes the address of the next structure after %MyVar. In this case. idx1 computes the address of the second integer in the array that is in the structure in %MyVar. i64 0. i32 0. i32 0 %arr = load [40 x i32]** %idx %idx = getelementptr [40 x i32]* %arr. i64 17 In this case. The type of idx2 is { [10 x i32] }* and its value is equivalent to MyVar + 40 because it indexes past the ten 4-byte integers in MyVar. %idx = getelementptr { [40 x i32] }*. this is actually an illegal GEP instruction. The GEP instruction seems to be accessing the 18th integer of the structure's array of ints.. However. The type of idx1 is i32*. i64 1 98 . Why do GEP x. we have to load the pointer in the structure with a load instruction before we can index into the array.0. If you look at the first indices in these GEP instructions you find that they are different (0 and 1).. the structure does not contain a pointer and the GEP instruction can index through the global variable. i64 0. i64 0. However. If the example was changed to: %MyVar = uninitialized global { [40 x i32 ] } . %MyVar that is a pointer to a structure containing a pointer to an array of 40 ints. we have a global variable.0. i64 0 %idx2 = getelementptr { [10 x i32 ] }* %MyVar. it does change the type. it is illegal. i32 0. the pointers don't alias.1. i64 0. Consider this example: %MyVar = global { [10 x i32 ] } %idx1 = getelementptr { [10 x i32 ] }* %MyVar.0 and GEP x. Obviously. %idx = getelementptr { [40 x i32]* }* %MyVar. Why don't GEP x. It won't compile. i64 1 In this example. Consider this example: %MyVar = global { [10 x i32 ] } %idx1 = getelementptr { [10 x i32 ] }* %MyVar. you would need to do the following: %idx = getelementptr { [40 x i32]* }* %. i64 0. The reason is that the pointer in the structure must be dereferenced in order to index into the array of 40 ints.. i64 17 then everything works fine. Since the GEP instruction never accesses memory. Documentation for the LLVM System at SVN head %MyVar = uninitialized global { [40 x i32 ]* } .0. i64 0. i64 1 %idx2 = getelementptr { [10 x i32 ] }* %MyVar. In order to access the 18th integer in the array. i32 0.1 and GEP x.1 alias? Quick Answer: They compute the same address location. i32 0. into the first field of the structure and access the 18th i32 in the array there.. These two GEP instructions will compute the same address because indexing through the 0th element does not change the address.1 alias? Quick Answer: They compute different address locations. therefore the address computation diverges with that index. i64 17 In this example. i64 1. However. in such a situation.

though it's not recommended. which targets can customize. How is GEP different from ptrtoint. specifically alias analysis) benefit from being able to rely on it. this is safe on everything LLVM supports (LLVM internally assumes pointers are never wider than 64 bits in many places). See the Rules section for more information. GEP is more concise in common cases. must be effectively lowered into a form like X[a*m+b*n+c]. there are only subtle differences. and the optimizer will actually narrow the i64 arithmetic down to the actual pointer size on targets which don't support 64-bit arithmetic in most cases. which are what GEP is lowered into. The integer computation implied by a GEP is target-independent. LLVM's type system is entirely static. It leads to awkward special cases in the optimizers. Typically what you'll need to do is make your backend pattern-match expressions trees involving ADD. And. One approach is to pick i64. you'll need to fix a lot of code in the backend. etc. With GEP you can avoid this problem. I'm writing a backend for a target which needs custom lowering for GEP. and dereference it. With ptrtoint. If you require support for addressing units which are not 8 bits. you have to pick an integer type. This has the advantage of letting your code work correctly in more cases. and consumers (optimizers. Can GEP index into vector elements? This hasn't always been forcefully disallowed. and GEP address computations are guided by an LLVM type.. For example. an expression like X[a][b][c]. and inttoptr? It's very similar. GEP does use target-dependent parameters for the size and layout of data types. IR producers (front-ends) must follow this rule. except that the address space qualifier on the first operand pointer type always matches the address space qualifier on the result type. Documentation for the LLVM System at SVN head In this example. How does VLA addressing work with GEPs? GEPs don't natively support VLAs. and fundamental inconsistency in the IR. there is no difference. 99 . with GEP lowering being only a small piece of the overall picture. for the underlying integer computation implied. The value of %idx2 is also MyVar+40 but its type is { [10 x i32] }*. Also. How do I do this? You don't. so that it appears to the GEP as a single-dimensional array reference. In the future. What effect do address spaces have on GEPs? None. VLA indices can be implemented as linearized indices. However. It's invalid to take a GEP from one object. Can GEP index into unions? Unknown. However. address into a different separately allocated object. the value of %idx1 is %MyVar+40 and its type is i32*. there are some cases where it doesn't do this. MUL. it will probably be outright disallowed. arithmetic. GEP carries additional pointer aliasing rules.

If both addresses are within the same allocated object. not the number of elements. Documentation for the LLVM System at SVN head This means if you want to write an analysis which understands array indices and you want to support VLAs. There is no problem with out of bounds indices in this sense. integer arithmetic wrapping may occur. There are no restrictions on bitcasting a pointer value to an arbitrary pointer type. Beyond that there are merely a hint to the optimizer indicating how the value will likely be used. Analysis passes which wish to understand array indexing should not assume that the static array type bounds are respected. This sense is unconnected with inbounds keyword. which always presents VLA and non-VLA indexing in the same manner. Indexing into an array only depends on the size of the array element. rather than high-level array indexing rules. This is basically a special case of array indices being out of bounds. 100 . Indices greater than the number of elements in the corresponding static array type are valid. it's perfectly valid to compute arbitrary element indices. loads and stores don't have to use the same types as the type of the underlying object. Can I do GEP with a different pointer type than the type of the underlying object? Yes. It's common to use array types with zero length to represent these. Furthermore. The types in a GEP serve only to define the parameters for the underlying integer computation. The fact that the static type says there are zero elements is irrelevant. your code will have to be prepared to reverse-engineer the linearization. Note that zero-sized arrays are not a special case here. But the GEP itself is only concerned with computing addresses. One way to solve this problem is to use the ScalarEvolution library. as the computation only depends on the size of the array element. performing a load or a store requires an address of allocated and sufficiently aligned memory. there are no restrictions on computing out-of-bounds addresses. there's the array type which comes from the (static) type of the first operand to the GEP. First. you'll get the comparison result you expect. so the comparison may not be meaningful. or one-past-the-end. Can I compare two values computed with GEPs? Yes. If either is outside of it. the result value of the GEP is undefined if the address is outside the actual underlying allocated object and not the address one-past-the-end. With the inbounds keyword. Without the inbounds keyword. They need not correspond with the actual type of the underlying object. Obviously. The second sense of being out of bounds is computing an address that's beyond the actual underlying allocated object. Can array indices be negative? Yes. Rules What happens if an array index is out of bounds? There are two senses in which an array index can be out of bounds. The inbounds keyword is designed to describe low-level pointer arithmetic overflow conditions. Types in this context serve only to specify memory size and alignment. not the number of elements. A common example of how this is used is arrays where the size is not known.

the result value is the result from evaluating the implied two's complement integer computation. However. the only way to do this is to manually check each place in your front-end where GetElementPtr operators are created. Most of GEP's special aliasing rules do not apply to pointers computed from ptrtoint. It's not possible to write a checker which could find all rule violations statically. unless the object is managed outside of LLVM. The underlying integer computation is sufficiently defined. How can I tell if my front-end is following the rules? There is currently no checker for the getelementptr rules. Currently. the result value is undefined. and inttoptr sequences. it's invalid to access (load from or store to) an LLVM-aware object with such a pointer. Otherwise. and add that value to one address to compute the other address? As with arithmetic on null. 101 . and objects pointed to by noalias pointers.and you can add whatever value you want to it. in rough unofficial order of priority: • Support C. you can't use that pointer to actually access the object. since there's no guarantee of where an object will be allocated in the address space. However. and use inttoptr to convert the result to an address. Documentation for the LLVM System at SVN head Can I cast an object's address to integer and add it to null? You can compute an address that way. arithmetic. no such checker exists today. If you really need this functionality. such values have limited meaning. Rationale Why is GEP designed this way? The design of GEP has the following goals. This includes GlobalVariables. and languages which can be conceptually lowered into C (this covers a lot). You can use GEP to compute an address that way. null has a defined value -. What happens if a GEP computation overflows? If the GEP has the inbounds keyword. However. Also as above. It would be possible to add special annotations to the IR. probably using metadata. It would be possible to write a checker which works by instrumenting the code with dynamic checks though. ptrtoint and inttoptr provide an alternative way to do this which do not have this restriction. you can do the arithmetic with explicit integer instructions. but if you use GEP to do the add. Allocas. Can I do type-based alias analysis on LLVM IR? You can't do type-based alias analysis using LLVM's built-in type system. but you can't use that pointer to actually access the object if you do. Alternatively. This is a much bigger undertaking though. to describe a different type system (such as the C type system). because LLVM has no restrictions on mixing types in addressing. loads or stores. C-like languages. it would be possible to write a static checker which catches a subset of possible problems. and do type-based aliasing on top of that.zero -. Can I compute the distance between two objects. unless the object is managed outside of LLVM.

This isn't pretty. • Minimize target-specific information in the IR. Trailing zero indices are superfluous for pointer aliasing. • Provide a consistent method for computing addresses so that address computations don't need to be a part of load and store instructions in the IR. so there's been no need to change it. Summary In summary. If they've made non-trivial changes. Leading zero indices are not superfluous for pointer aliasing nor the types of the pointers. Requiring that all struct indices be the same reduces the range of possibilities for cases where two GEPs are effectively the same but have distinct operand types. In such cases the optimizer instead will emit a GEP with the base pointer casted to a simple address-unit pointer. sometimes the underlying addressing doesn't correspond with the static type at all. 3. and it's sufficient to preserve the pointer aliasing guarantees that GEP provides. but it's just as valid. What's an uglygep? Some LLVM optimizers operate on GEPs by internally lowering them into more primitive integer expressions. The first operand to the GEP instruction is always a pointer and it must be indexed. but not for the types of the pointers. however it's wide enough for all practical purposes. which allows them to be combined with other integer expressions and/or split into multiple separate integer expressions. here's some things to always remember about the GetElementPtr instruction: 1. 5. It doesn't necessarily imply i32 address arithmetic. 25 Feb 2010) $ 102 . The GEP instruction never accesses memory. It isn't always possibly to fully reconstruct this structure. it only provides pointer computations. 2. There are no superfluous indices for the GEP instruction. to the extent that it doesn't interfere with other goals. 4. The LLVM Compiler Infrastructure Last modified: $Date: 2010-02-25 12:16:03 -0600 (Thu. translating back into LLVM IR can involve reverse-engineering the structure of the addressing in order to fit it into the static type of the original first operand. • Support non-C-like languages. Documentation for the LLVM System at SVN head • Support optimizations such as those that are common in C compilers. using the name "uglygep". it's just an identifier which identifies a field in a struct. Why do struct member indices always use i32? The specific type i32 is probably just a historical artifact.

optional piece called llvm-test. Currently. Local LLVM Configuration 7. The second piece is the GCC front end. Once compiled into LLVM bitcode. Setting Up Your Environment 3. llvm/projects 5. Misha Brukman. It is a suite of programs with a testing harness that can be used to further test LLVM's functionality and performance. Example with llvm-gcc4 • Common Problems • Links Written by: John Criswell. Overview Welcome to LLVM! In order to get started. The Location of LLVM Object Files 10. Install the GCC Front End 6. 103 . bitcode analyzer and bitcode optimizer. This component provides a version of GCC that compiles C and C++ code into LLVM bitcode. The first piece is the LLVM suite. llvm/tools 9. and Guochun Shi. First. There is a third. llvm/test 7. Hardware 2. Compiling the LLVM Suite Source Code 8. Vikram Adve. This contains all of the tools. llvm/lib 4. llvm/utils 10. llvm/include 3. libraries. Broken versions of GCC and other tools • Getting Started with LLVM 1. Optional Configuration Items • Program layout 1. and header files needed to use the low level virtual machine. the GCC front end uses the GCC parser to convert code to LLVM. Unpacking the LLVM Archives 4. Software 3. Chris Lattner. llvm/win32 • An Example Using the LLVM Tool Chain 1. llvm-test 8. Terminology and Notation 2. llvm/examples 2. you first need to know some basic information. It contains an assembler. Cross-Compiling LLVM 9. a program can be manipulated with the LLVM tools from the LLVM suite. LLVM comes in two pieces. It also contains a test suite that can be used to test the LLVM tools and the GCC front end. Documentation for the LLVM System at SVN head Getting Started with the LLVM System • Overview • Getting Started Quickly (A Summary) • Requirements 1. llvm/runtime 6. disassembler. Checkout LLVM from Subversion 5.

2 front end if you intend to compile C or C++ (see Install the GCC Front End for details): 1.bz" use bunzip2 instead of gunzip. Remember that you were warned twice about reading the documentation.2-version-platform. see below. gmake -k |& tee gnumake.tar. The SPEC2000 benchmarks should be available in directory. install-binutils-binary-from-MinGW (Windows only) 4. Go to Program Layout to learn about the layout of the source code tree. Read the documentation. use 7-Zip or a similar archiving tool. 2.gz | tar -xvf - 3. gunzip --stdout llvm-gcc-4. gunzip --stdout llvm-version. 3. 5.tar. Configure the LLVM Build Environment 1. cd llvm/projects 3. 5. Requirements 104 . If not specified. the PATH will be searched.gz | tar -xvf - 6. ◊ --enable-spec2000=directory Enable the SPEC2000 benchmarks for testing. specify for directory the full pathname of the C/C++ front end installation to use with this LLVM configuration. cd where-you-want-to-build-llvm 2. Add llvm-gcc's "bin" directory to your PATH environment variable. Consult the Getting Started with LLVM section for detailed information on configuring and compiling LLVM. cd where-you-want-the-C-front-end-to-live 2. If you get an "internal compiler error (ICE)" or test failures. 4. /path/to/llvm/configure [options] Some common options: ◊ --prefix=directory Specify for directory the full pathname of where you want the LLVM tools and libraries to be installed (default /usr/local). cd where-you-want-llvm-to-live 2. gunzip --stdout llvm-test-version. Build the LLVM Suite: 1. Read the documentation. This is only needed if you want to run the testsuite or do some special kinds of LLVM builds. Note: On Windows. Documentation for the LLVM System at SVN head Getting Started Quickly (A Summary) Here's the short story for getting up and running quickly with LLVM: 1.tar.out # this is csh or tcsh syntax 2.gz | tar -xvf - 7. [Optional] Get the Test Suite Source Code ♦ With the distributed files (or use SVN): 1. See Setting Up Your Environment for tips that simplify working with the GCC front end and LLVM tools. ◊ --with-llvmgccdir=directory Optionally. Install the llvm-gcc-4. 6. Get the LLVM Source Code ♦ With the distributed files (or use SVN): 1. Note: If the binary extension is ". 8. cd where-you-want-llvm-to-live 2.

Requirements

Before you begin to use the LLVM system, review the requirements given below. This may save you some trouble by knowing ahead of time what hardware and software you will need.

Hardware

LLVM is known to work on the following platforms:

OS              Arch                Compilers
AuroraUX        x86 (1)             GCC
Linux           x86 (1)             GCC
Linux           amd64               GCC
Solaris         V9 (Ultrasparc)     GCC
FreeBSD         x86 (1)             GCC
MacOS X (2)     PowerPC             GCC
MacOS X (2,9)   x86                 GCC
Cygwin/Win32    x86 (1,8,11)        GCC 3.4.X, binutils 2.20
MinGW/Win32     x86 (1,6,8,10)      GCC 3.4.X, binutils 2.20

LLVM has partial support for the following platforms:

OS              Arch                Compilers
Windows         x86 (1)             Visual Studio 2005 SP1 or higher (4,5)
AIX (3,4)       PowerPC             GCC
Linux (3,5)     PowerPC             GCC
Linux (7)       Alpha               GCC
Linux (7)       Itanium (IA-64)     GCC
HP-UX (7)       Itanium (IA-64)     HP aCC

Notes:

1. Code generation supported for Pentium processors and up
2. Code generation supported for 32-bit ABI only
3. No native code generation
4. Build is not complete: one or more tools do not link or function
5. The GCC-based C/C++ frontend does not build
6. The port is done using the MSYS shell.
7. Native code generation exists but is not complete.
8. Binutils 2.20 or later is required to build the assembler generated by LLVM properly.
9. XCode 2.5 and gcc 4.0.1 (Apple Build 5370) will trip internal LLVM assert messages when compiled for Release at optimization levels greater than 0 (i.e., "-O1" and higher). Add OPTIMIZE_OPTION="-O0" to the build command line if compiling for LLVM Release or bootstrapping the LLVM toolchain.
10. For MSYS/MinGW on Windows, be sure to install the MSYS version of the perl package, and be sure it appears in your path before any Windows-based versions such as Strawberry Perl and ActivePerl, as these have Windows-specifics that will cause the build to fail.
11. In general, LLVM modules requiring dynamic linking can not be built on Windows. However, you can build LLVM tools using "make tools-only".

Note that you will need about 1-3 GB of space for a full LLVM build in Debug mode, depending on the system (it is so large because of all the debugging information and the fact that the libraries are statically linked into multiple tools). If you do not need many of the tools and you are space-conscious, you can pass ONLY_TOOLS="tools you need" to make. The Release build requires considerably less space.

The LLVM suite may compile on other platforms, but it is not guaranteed to do so. If compilation is successful, the LLVM utilities should be able to assemble, disassemble, analyze, and optimize LLVM bitcode. Code generation should work as well, although the generated native code may not work on your platform.

The GCC front end is not very portable at the moment. If you want to get it to work on another platform, you can download a copy of the source and try to compile it on your platform.

Software

Compiling LLVM requires that you have several software packages installed. The table below lists those required packages. The Package column is the usual name for the software package that LLVM depends on. The Version column provides "known to work" versions of the package. The Notes column describes how LLVM uses the package and provides other details.

Package         Version         Notes
GNU Make        3.79, 3.79.1    Makefile/build processor
GCC             3.4.2           C/C++ compiler (1)
TeXinfo         4.5             For building the CFE
SVN             ≥1.3            Subversion access to LLVM (2)
DejaGnu         1.4.2           Automated test suite (3)
tcl             8.3, 8.4        Automated test suite (3)
expect          5.38.0          Automated test suite (3)
perl            ≥5.6.0          Nightly tester, utilities
GNU M4          1.4             Macro processor for configuration (4)
GNU Autoconf    2.60            Configuration script builder (4)
GNU Automake    1.9.6           aclocal macro generator (4)
libtool         1.5.22          Shared library manager (4)

Notes:

1. Only the C and C++ languages are needed, so there's no need to build the other languages for LLVM's purposes. See below for specific version info.
2. You only need Subversion if you intend to build from the latest LLVM sources. If you're working from a release distribution, you don't need Subversion.
3. Only needed if you want to run the automated test suite in the llvm/test directory.
4. If you want to make changes to the configure scripts, you will need GNU autoconf (2.59), and consequently, GNU M4 (version 1.4 or higher). You will also need automake (1.9.2). We only use aclocal from that package.
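Before configuring, it can save time to confirm that the packages above are recent enough. One illustrative way to do that on most systems is to ask each tool for its version (the exact output format varies by platform, and not every tool is required for every build):

% gmake --version
% gcc --version
% autoconf --version
% svn --version

If any of these report a version older than the table above, upgrade it, or point the build at a different installation, before building LLVM.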

Additionally, your compilation host is expected to have the usual plethora of Unix utilities. Specifically:

• ar - archive library builder
• bzip2* - bzip2 command for distribution generation
• bunzip2* - bunzip2 command for distribution checking
• cat - output concatenation utility
• chmod - change permissions on a file
• cp - copy files
• date - print the current date/time
• echo - print to standard output
• egrep - extended regular expression search utility
• find - find files/dirs in a file system
• grep - regular expression search utility
• gzip* - gzip command for distribution generation
• gunzip* - gunzip command for distribution checking
• install - install directories/files
• mkdir - create a directory
• mv - move (rename) files
• ranlib - symbol table builder for archive libraries
• rm - remove (delete) files and directories
• sed - stream editor for transforming output
• sh - Bourne shell for make build scripts
• tar - tape archive for distribution generation
• test - test things in file system
• unzip* - unzip command for distribution checking
• zip* - zip command for distribution generation

Broken versions of GCC and other tools

LLVM is very demanding of the host C++ compiler, and as such tends to expose bugs in the compiler. In particular, several versions of GCC crash when trying to compile LLVM. We routinely use GCC 3.3.3, 3.4.0, and Apple 4.0.1 successfully with them (however, see important notes below). Other versions of GCC will probably work as well. GCC versions listed here are known to not work. If you are using one of these versions, please try to upgrade your GCC to something more recent. If you run into a problem with a version of GCC not listed here, please let us know. Please use the "gcc -v" command to find out which version of GCC you are using.

GCC versions prior to 3.0: GCC 2.96.x and before had several problems in the STL that effectively prevent it from compiling LLVM.

SuSE GCC 3.3.3: The version of GCC 3.3.3 shipped with SuSE 9.1 (and possibly others) does not compile LLVM correctly (it appears that exception handling is broken in some cases). Please download the FSF 3.3.3 or upgrade to a newer version of GCC.

GCC 3.2.2 and 3.2.3: These versions of GCC fail to compile LLVM with a bogus template error. This was fixed in later GCCs.

GCC 3.3.2: This version of GCC suffered from a serious bug which causes it to crash in the "convert_from_eh_region_ranges_1" GCC function.

Cygwin GCC 3.3.3: The version of GCC 3.3.3 commonly shipped with Cygwin does not work.

GCC 3.4.0 on linux/x86 (32-bit): GCC miscompiles portions of the code generator, causing an infinite loop in the llvm-gcc build when built with optimizations enabled (i.e. a release build).

GCC 3.4.2 on linux/x86 (32-bit): GCC miscompiles portions of the code generator at -O3, as with 3.4.0. However gcc 3.4.2 (unlike 3.4.0) correctly compiles LLVM at -O2. A work around is to build release LLVM builds with "make ENABLE_OPTIMIZED=1 OPTIMIZE_OPTION=-O2 ..."

GCC 3.4.x on X86-64/amd64: GCC miscompiles portions of LLVM.

GCC 3.4.4 (CodeSourcery ARM 2005q3-2): this compiler miscompiles LLVM when building with optimizations enabled. It appears to work with "make ENABLE_OPTIMIZED=1 OPTIMIZE_OPTION=-O1" or build a debug build.

IA-64 GCC 4.0.0: The IA-64 version of GCC 4.0.0 is known to miscompile LLVM.

Apple Xcode 2.3: GCC crashes when compiling LLVM at -O3 (which is the default with ENABLE_OPTIMIZED=1). To work around this, build with "ENABLE_OPTIMIZED=1 OPTIMIZE_OPTION=-O2".

GCC 4.1.1: GCC fails to build LLVM with template concept check errors compiling some files. At the time of this writing, GCC mainline (4.2) did not share the problem.

GCC 4.1.1 on X86-64/amd64: GCC miscompiles portions of LLVM when compiling llvm itself into 64-bit code. LLVM will appear to mostly work but will be buggy, e.g. failing portions of its testsuite.

GCC 4.1.2 on OpenSUSE: Seg faults during libstdc++ build, and on x86_64 platforms compiling md5.c gets a mangled constant.

GCC 4.1.2 (20061115 (prerelease) (Debian 4.1.1-21)) on Debian: Appears to miscompile parts of LLVM 2.4. One symptom is ValueSymbolTable complaining about symbols remaining in the table on destruction.

GCC 4.1.2 20071124 (Red Hat 4.1.2-42): Suffers from the same symptoms as the previous one. It appears to work with ENABLE_OPTIMIZED=0 (the default).

Cygwin GCC 4.3.2 20080827 (beta) 2: Users reported various problems related with link errors when using this GCC version.

Debian GCC 4.3.2 on X86: Crashes building some files in LLVM 2.6.

GCC 4.3.3 (Debian 4.3.3-10) on ARM: Miscompiles parts of LLVM 2.6 when optimizations are turned on. The symptom is an infinite loop in FoldingSetImpl::RemoveNode while running the code generator.

GNU ld 2.16.X: Some 2.16.X versions of the ld linker will produce very long warning messages complaining that some ".gnu.linkonce.t.*" symbol was defined in a discarded section. You can safely ignore these messages as they are erroneous and the linkage is correct. These messages disappear using ld 2.17.

GNU binutils 2.17: Binutils 2.17 contains a bug which causes huge link times (minutes instead of seconds) when building LLVM. We recommend upgrading to a newer version (2.17.50.0.4 or later).

GNU Binutils 2.19.1 Gold: This version of Gold contained a bug which causes intermittent failures when building LLVM with position independent code. The symptom is an error about cyclic dependencies. We recommend upgrading to a newer version of Gold.
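If the default system compiler on your machine is one of the versions listed above, you do not have to replace it globally: the configure script honors the CC and CXX variables described later in this document, so you can point the LLVM build at a different, known-good compiler. This is only a hedged example, and the install path of the alternate GCC is a placeholder:

% CC=/opt/gcc-3.4.6/bin/gcc CXX=/opt/gcc-3.4.6/bin/g++ /path/to/llvm/configure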

Getting Started with LLVM

The remainder of this guide is meant to get you up and running with LLVM and to give you some basic information about the LLVM environment.

The later sections of this guide describe the general layout of the LLVM source tree, a simple example using the LLVM tool chain, and links to find more information about LLVM or to get help via e-mail.

Terminology and Notation

Throughout this manual, the following names are used to denote paths specific to the local system and working environment. These are not environment variables you need to set but just strings used in the rest of this document below. In any of the examples below, simply replace each of these names with the appropriate pathname on your local system. All these paths are absolute:

SRC_ROOT
  This is the top level directory of the LLVM source tree.

OBJ_ROOT
  This is the top level directory of the LLVM object tree (i.e. the tree where object files and compiled programs will be placed. It can be the same as SRC_ROOT).

LLVMGCCDIR
  This is where the LLVM GCC Front End is installed. For the pre-built GCC front end binaries, the LLVMGCCDIR is llvm-gcc/platform/llvm-gcc.

Setting Up Your Environment

In order to compile and use LLVM, you may need to set some environment variables.

LLVM_LIB_SEARCH_PATH=/path/to/your/bitcode/libs
  [Optional] This environment variable helps LLVM linking tools find the locations of your bitcode libraries. It is provided only as a convenience, since you can specify the paths using the -L options of the tools, and the C/C++ front-end will automatically use the bitcode files installed in its lib directory.

Unpacking the LLVM Archives

If you have the LLVM distribution, you will need to unpack it before you can begin to compile it. LLVM is distributed as a set of two files: the LLVM suite and the LLVM GCC front end compiled for your platform. There is an additional test suite that is optional. Each file is a TAR archive that is compressed with the gzip program.

The files are as follows, with x.y marking the version number:

llvm-x.y.tar.gz
  Source release for the LLVM libraries and tools.

llvm-test-x.y.tar.gz
  Source release for the LLVM test suite.

llvm-gcc-4.2-x.y.source.tar.gz
  Source release of the llvm-gcc-4.2 front end. See README.LLVM in the root directory for build instructions.

llvm-gcc-4.2-x.y-platform.tar.gz
  Binary release of the llvm-gcc-4.2 front end for a specific platform.

Checkout LLVM from Subversion

If you have access to our Subversion repository, you can get a fresh copy of the entire source code. All you need to do is check it out from Subversion as follows:

• cd where-you-want-llvm-to-live
• Read-Only: svn co http://llvm.org/svn/llvm-project/llvm/trunk llvm
• Read-Write: svn co https://user@llvm.org/svn/llvm-project/llvm/trunk llvm

This will create an 'llvm' directory in the current directory and fully populate it with the LLVM source code, Makefiles, test directories, and local copies of documentation files.

If you want to get a specific release (as opposed to the most recent revision), you can check it out from the 'tags' directory (instead of 'trunk'); an example checkout is sketched at the end of this section. The following releases are located in the following subdirectories of the 'tags' directory:

• Release 2.7: RELEASE_27
• Release 2.6: RELEASE_26
• Release 2.5: RELEASE_25
• Release 2.4: RELEASE_24
• Release 2.3: RELEASE_23
• Release 2.2: RELEASE_22
• Release 2.1: RELEASE_21
• Release 2.0: RELEASE_20
• Release 1.9: RELEASE_19
• Release 1.8: RELEASE_18
• Release 1.7: RELEASE_17
• Release 1.6: RELEASE_16
• Release 1.5: RELEASE_15
• Release 1.4: RELEASE_14
• Release 1.3: RELEASE_13
• Release 1.2: RELEASE_12
• Release 1.1: RELEASE_11
• Release 1.0: RELEASE_1

If you would like to get the LLVM test suite (a separate package as of 1.4), you get it from the Subversion repository:

% cd llvm/projects
% svn co http://llvm.org/svn/llvm-project/test-suite/trunk llvm-test

By placing it in llvm/projects, it will be automatically configured by the LLVM configure script as well as automatically updated when you run svn update.

If you would like to get the GCC front end source code, you can also get it and build it yourself. Please follow these instructions to successfully get and build the LLVM GCC front-end.
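For example, to work from the 2.6 release rather than trunk, the checkout above can be pointed at the corresponding tag instead. This is only an illustrative sketch, and the exact tag directory layout should be confirmed against the repository you are using:

% cd where-you-want-llvm-to-live
% svn co http://llvm.org/svn/llvm-project/llvm/tags/RELEASE_26 llvm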

Install the GCC Front End

Before configuring and compiling the LLVM suite (or if you want to use just the LLVM GCC front end), you can optionally extract the front end from the binary distribution. It is used for running the llvm-test testsuite and for compiling C/C++ programs. Note that you can optionally build llvm-gcc yourself after building the main LLVM repository.

To install the GCC front end, do the following (on Windows, use an archival tool like 7-zip that understands gzipped tars):

1. cd where-you-want-the-front-end-to-live
2. gunzip --stdout llvm-gcc-4.2-version-platform.tar.gz | tar -xvf -

Once the binary is uncompressed, if you're using a *nix-based system, add a symlink for llvm-gcc and llvm-g++ to some directory in your path. If you're using a Windows-based system, add the bin subdirectory of your front end installation directory to your PATH environment variable. For example, if you uncompressed the binary to c:\llvm-gcc, add c:\llvm-gcc\bin to your PATH.

If you now want to build LLVM from source, when you configure LLVM, it will automatically detect llvm-gcc's presence (if it is in your path), enabling its use in llvm-test. Note that you can always build or install llvm-gcc at any point after building the main LLVM repository: just reconfigure llvm and llvm-test will pick it up.

As a convenience for Windows users, the front end binaries for MinGW/x86 include versions of the required w32api and mingw-runtime binaries. The last remaining step for Windows users is to simply uncompress the binary binutils package from MinGW into your front end installation directory. While the front end installation steps are not quite the same as a typical manual MinGW installation, they should be similar enough to those who have previously installed MinGW on Windows systems.

To install binutils on Windows:

1. download GNU Binutils from MinGW Downloads
2. cd where-you-uncompressed-the-front-end
3. uncompress archived binutils directories (not the tar file) into the current directory

The binary versions of the LLVM GCC front end may not suit all of your needs. For example, the binary distribution may include an old version of a system header file, not "fix" a header file that needs to be fixed for GCC, or it may be linked with libraries not available on your system. In cases like these, you may want to try building the GCC front end from source. Thankfully, this is much easier now than it was in the past.

We also do not currently support updating of the GCC front end by manually overlaying newer versions of the w32api and mingw-runtime binary packages that may become available from MinGW. At this time, it's best to think of the MinGW LLVM GCC front end binary as a self-contained convenience package that requires Windows users to simply download and uncompress the GNU Binutils binary package from the MinGW project.

Regardless of your platform, if you discover that installing the LLVM GCC front end binaries is not as easy as previously described, or you would like to suggest improvements, please let us know how you would like to see things improved by dropping us a note on our mailing list.
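On a *nix-style system, making the front end visible on your PATH can be as simple as the following sketch; the installation directory and the link location are placeholders for wherever you unpacked the front end and wherever you keep local binaries:

% ln -s /where-you-installed-llvm-gcc/bin/llvm-gcc /usr/local/bin/llvm-gcc
% ln -s /where-you-installed-llvm-gcc/bin/llvm-g++ /usr/local/bin/llvm-g++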

Local LLVM Configuration

Once checked out from the Subversion repository, the LLVM suite source code must be configured via the configure script. This script sets variables in the various *.in files, most notably llvm/Makefile.config and llvm/include/Config/config.h. It also populates OBJ_ROOT with the Makefiles needed to begin building LLVM.

The following environment variables are used by the configure script to configure the build system:

CC
  Tells configure which C compiler to use. By default, configure will look for the first GCC C compiler in PATH. Use this variable to override configure's default behavior.

CXX
  Tells configure which C++ compiler to use. By default, configure will look for the first GCC C++ compiler in PATH. Use this variable to override configure's default behavior.

The following options can be used to set or enable LLVM specific options:

--with-llvmgccdir
  Path to the LLVM C/C++ FrontEnd to be used with this LLVM configuration. The value of this option should specify the full pathname of the C/C++ Front End to be used. If this option is not provided, the PATH will be searched for a program named llvm-gcc and the C/C++ FrontEnd install directory will be inferred from the path found. If the option is not given, and no llvm-gcc can be found in the path, then a warning will be produced by configure indicating this situation. LLVM may still be built with the tools-only target, but attempting to build the runtime libraries will fail as these libraries require llvm-gcc and llvm-g++. See Install the GCC Front End for details on installing the C/C++ Front End. See Bootstrapping the LLVM C/C++ Front-End for details on building the C/C++ Front End.

--with-tclinclude
  Path to the tcl include directory under which tclsh can be found. Use this if you have multiple tcl installations on your machine and you want to use a specific one (8.x) for LLVM. LLVM only uses tcl for running the dejagnu based test suite in llvm/test. If you don't specify this option, the LLVM configure script will search for the tcl 8.4 and 8.3 releases.

--enable-optimized
  Enables optimized compilation (debugging symbols are removed and GCC optimization flags are enabled). The default behavior of a Subversion checkout is to use an unoptimized build (also known as a debug build). Note that an optimized build is the default setting if you are using the LLVM distribution.

--enable-debug-runtime
  Enables debug symbols in the runtime libraries. The default is to strip debug symbols from the runtime libraries.

--enable-jit
  Compile the Just In Time (JIT) compiler functionality. This is not available on all platforms. The default is dependent on platform, so it is best to explicitly enable it if you want it.

--enable-doxygen
  Look for the doxygen program and enable construction of doxygen based documentation from the source code. This is disabled by default because generating the documentation can take a long time and produces 100s of megabytes of output.

--enable-targets=target-option
  Controls which targets will be built and linked into llc. The default value for target_options is "all", which builds and links all available targets. The value "host-only" can be specified to build only a native compiler (no cross-compiler targets available). The "native" target is selected as the target of the build host. You can also specify a comma separated list of target names that you want available in llc. The target names use all lower case. The current set of targets is: alpha, ia64, powerpc, skeleton, sparc, x86.

--with-udis86
  LLVM can use an external disassembler library for various purposes (currently it is used only for examining code produced by the JIT). This option will enable usage of the udis86 x86 (both 32 and 64 bits) disassembler library.
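Putting a few of these together, a typical configuration for an optimized build that only includes the targets you care about might look like the following. This is just an illustrative sketch: the build directory and install prefix are placeholders, and the target list is only an example.

% cd where-you-want-to-build-llvm
% /path/to/llvm/configure --prefix=/usr/local/llvm --enable-optimized --enable-targets=x86,powerpc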

To configure LLVM, follow these steps:

1. Change directory into the object root directory:
   % cd OBJ_ROOT
2. Run the configure script located in the LLVM source tree:
   % SRC_ROOT/configure --prefix=/install/path [other options]

Compiling the LLVM Suite Source Code

Once you have configured LLVM, you can build it by entering the OBJ_ROOT directory and issuing the following command:

% gmake

If the build fails, please check here to see if you are using a version of GCC that is known not to compile LLVM.

If you have multiple processors in your machine, you may wish to use some of the parallel build options provided by GNU Make. For example, you could use the command:

% gmake -j2

There are three types of builds:

Debug Builds
  These builds are the default when one is using a Subversion checkout and types gmake (unless the --enable-optimized option was used during configuration). The build system will compile the tools and libraries with debugging information. To get a Debug Build using the LLVM distribution, the --disable-optimized option must be passed to configure.

Release (Optimized) Builds
  These builds are enabled with the --enable-optimized option to configure or by specifying ENABLE_OPTIMIZED=1 on the gmake command line. For these builds, the build system will compile the tools and libraries with GCC optimizations enabled and strip debugging information from the libraries and executables it generates. Note that Release Builds are default when using an LLVM distribution.

Profile Builds
  These builds are for use with profiling. They compile profiling information into the code for use with programs like gprof. Profile builds must be started by specifying ENABLE_PROFILING=1 on the gmake command line.

There are several special targets which are useful when working with the LLVM source code:

gmake clean
  Removes all files generated by the build. This includes object files, generated C/C++ files, libraries, and executables.

gmake dist-clean
  Removes everything that gmake clean does, but also removes files generated by configure. It attempts to return the source tree to the original state in which it was shipped.

gmake install
  Installs LLVM header files, libraries, tools, and documentation in a hierarchy under $PREFIX, specified with ./configure --prefix=[dir], which defaults to /usr/local.

gmake -C runtime install-bytecode
  Assuming you built LLVM into $OBJDIR, when this command is run, it will install bitcode libraries into the GCC front end's bitcode library directory. If you need to update your bitcode libraries, this is the target to use once you've built them.

Please see the Makefile Guide for further details on these make targets and descriptions of other targets available.

It is also possible to override default values from configure by declaring variables on the command line. The following are some examples:

gmake ENABLE_OPTIMIZED=1
  Perform a Release (Optimized) build.

gmake ENABLE_OPTIMIZED=1 DISABLE_ASSERTIONS=1
  Perform a Release (Optimized) build without assertions enabled.

gmake ENABLE_OPTIMIZED=0
  Perform a Debug build.

gmake ENABLE_PROFILING=1
  Perform a Profiling build.

gmake VERBOSE=1
  Print what gmake is doing on standard output.

gmake TOOL_VERBOSE=1
  Ask each tool invoked by the makefiles to print out what it is doing on the standard output. This also implies VERBOSE=1.

Every directory in the LLVM object tree includes a Makefile to build it and any subdirectories that it contains. Entering any directory inside the LLVM object tree and typing gmake should rebuild anything in or below that directory that is out of date.

Cross-Compiling LLVM

It is possible to cross-compile LLVM itself. That is, you can create LLVM executables and libraries to be hosted on a platform different from the platform where they are built (a Canadian Cross build). To configure a cross-compile, supply the configure script with --build and --host options that are different. The values of these options must be legal target triples that your GCC compiler supports.
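As a concrete illustration of a cross-compile configuration, the invocation might look like the following. The triples here are only examples, not requirements; substitute the triples that describe your own build machine and the machine the tools should run on:

% SRC_ROOT/configure --build=i686-pc-linux-gnu --host=arm-none-linux-gnueabi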

The result of such a build is executables that are not runnable on the build host (--build option) but can be executed on the compile host (--host option).

The Location of LLVM Object Files

The LLVM build system is capable of sharing a single LLVM source tree among several LLVM builds. Hence, it is possible to build LLVM for several different platforms or configurations using the same source tree. This is accomplished in the typical autoconf manner:

• Change directory to where the LLVM object files should live:
  % cd OBJ_ROOT
• Run the configure script found in the LLVM source directory:
  % SRC_ROOT/configure

The LLVM build will place files underneath OBJ_ROOT in directories named after the build type:

Debug Builds
  Tools: OBJ_ROOT/Debug/bin
  Libraries: OBJ_ROOT/Debug/lib

Release Builds
  Tools: OBJ_ROOT/Release/bin
  Libraries: OBJ_ROOT/Release/lib

Profile Builds
  Tools: OBJ_ROOT/Profile/bin
  Libraries: OBJ_ROOT/Profile/lib

Optional Configuration Items

If you're running on a Linux system that supports the "binfmt_misc" module, and you have root access on the system, you can set your system up to execute LLVM bitcode files directly. To do this, use commands like this (the first command may not be required if you are already using the module):

$ mount -t binfmt_misc none /proc/sys/fs/binfmt_misc
$ echo ':llvm:M::BC::/path/to/lli:' > /proc/sys/fs/binfmt_misc/register
$ chmod u+x hello.bc   (if needed)
$ ./hello.bc

This allows you to execute LLVM bitcode files directly. On Debian, you can also use this command instead of the 'echo' command above:

$ sudo update-binfmts --install llvm /path/to/lli --magic 'BC'

Program Layout

One useful source of information about the LLVM source base is the LLVM doxygen documentation available at http://llvm.org/doxygen/. The following is a brief introduction to code layout:

llvm/examples
  This directory contains some simple examples of how to use the LLVM IR and JIT.

llvm/include
  This directory contains public header files exported from the LLVM library. The three main subdirectories of this directory are:

  llvm/include/llvm
    This directory contains all of the LLVM specific header files. This directory also has subdirectories for different portions of LLVM: Analysis, CodeGen, Target, Transforms, etc...

  llvm/include/llvm/Support
    This directory contains generic support libraries that are provided with LLVM but not necessarily specific to LLVM. For example, some C++ STL utilities and a Command Line option processing library store their header files here.

  llvm/include/llvm/Config
    This directory contains header files configured by the configure script. They wrap "standard" UNIX and C header files. Source code can include these header files, which automatically take care of the conditional #includes that the configure script generates.

llvm/lib
  This directory contains most of the source files of the LLVM system. In LLVM, almost all code exists in libraries, making it very easy to share code among the different tools.

  llvm/lib/VMCore/
    This directory holds the core LLVM source files that implement core classes like Instruction and BasicBlock.

  llvm/lib/AsmParser/
    This directory holds the source code for the LLVM assembly language parser library.

  llvm/lib/BitCode/
    This directory holds code for reading and writing LLVM bitcode.

  llvm/lib/Analysis/
    This directory contains a variety of different program analyses, such as Dominator Information, Call Graphs, Induction Variables, Interval Identification, Natural Loop Identification, etc.

  llvm/lib/Transforms/
    This directory contains the source code for the LLVM to LLVM program transformations, such as Aggressive Dead Code Elimination, Sparse Conditional Constant Propagation, Inlining, Loop Invariant Code Motion, Dead Global Elimination, and many others.

  llvm/lib/Target/
    This directory contains files that describe various target architectures for code generation. For example, the llvm/lib/Target/X86 directory holds the X86 machine description, while llvm/lib/Target/CBackend implements the LLVM-to-C converter.

  llvm/lib/CodeGen/
    This directory contains the major parts of the code generator: Instruction Selector, Instruction Scheduling, and Register Allocation.

  llvm/lib/Debugger/
    This directory contains the source level debugger library that makes it possible to instrument LLVM programs so that a debugger could identify source code locations at which the program is executing.

  llvm/lib/ExecutionEngine/
    This directory contains libraries for executing LLVM bitcode directly at runtime in both interpreted and JIT compiled fashions.

  llvm/lib/Support/
    This directory contains the source code that corresponds to the header files located in llvm/include/Support/.

  llvm/lib/System/
    This directory contains the operating system abstraction layer that shields LLVM from platform-specific coding.

llvm/projects
  This directory contains projects that are not strictly part of LLVM but are shipped with LLVM. This is also the directory where you should create your own LLVM-based projects. See llvm/projects/sample for an example of how to set up your own project.

llvm/runtime
  This directory contains libraries which are compiled into LLVM bitcode and used when linking programs with the GCC front end. Most of these libraries are skeleton versions of real libraries; for example, libc is a stripped down version of glibc. Unlike the rest of the LLVM suite, this directory needs the LLVM GCC front end to compile.

llvm/test
  This directory contains feature and regression tests and other basic sanity checks on the LLVM infrastructure. These are intended to run quickly and cover a lot of territory without being exhaustive.

test-suite
  This is not a directory in the normal llvm module; it is a separate Subversion module that must be checked out (usually to projects/test-suite). This module contains a comprehensive correctness, performance, and benchmarking test suite for LLVM. It is a separate Subversion module because not every LLVM user is interested in downloading or building such a comprehensive test suite. For further details on this test suite, please see the Testing Guide document.

llvm/tools
  The tools directory contains the executables built out of the libraries above, which form the main part of the user interface. You can always get help for a tool by typing tool_name -help. The following is a brief introduction to the most important tools; more detailed information is in the Command Guide.

  bugpoint
    bugpoint is used to debug optimization passes or code generation backends by narrowing down the given test case to the minimum number of passes and/or instructions that still cause a problem, whether it is a crash or miscompilation. See HowToSubmitABug.html for more information on using bugpoint.

  llvmc
    The LLVM Compiler Driver. This program can be configured to utilize both LLVM and non-LLVM compilation tools to enable pre-processing, translation, optimization, assembly, and linking of programs all from one command line. llvmc also takes care of processing the dependent libraries found in bitcode. This reduces the need to get the traditional -l<name> options right on the command line. Please note that this tool, while functional, is still experimental and not feature complete.

  llvm-ar
    The archiver produces an archive containing the given LLVM bitcode files, optionally with an index for faster lookup.

  llvm-as
    The assembler transforms the human readable LLVM assembly to LLVM bitcode.

  llvm-dis
    The disassembler transforms the LLVM bitcode to human readable LLVM assembly.

  llvm-ld
    llvm-ld is a general purpose and extensible linker for LLVM. It performs standard link time optimizations and allows optimization modules to be loaded and run so that language specific optimizations can be applied at link time. This is the linker invoked by llvmc.

  llvm-link
    llvm-link, not surprisingly, links multiple LLVM modules into a single program.

  lli
    lli is the LLVM interpreter, which can directly execute LLVM bitcode (although very slowly...). For architectures that support it (currently x86, Sparc, and PowerPC), by default, lli will function as a Just-In-Time compiler (if the functionality was compiled in), and will execute the code much faster than the interpreter.

  llc
    llc is the LLVM backend compiler, which translates LLVM bitcode to a native code assembly file or to C code (with the -march=c option).

  llvm-gcc
    llvm-gcc is a GCC-based C frontend that has been retargeted to use LLVM as its backend instead of GCC's RTL backend. It can also emit LLVM bitcode or assembly (with the -emit-llvm option) instead of the usual machine code output. It works just like any other GCC compiler, taking the typical -c, -S, -E, -o options that are typically used. Additionally, the source code for llvm-gcc is available as a separate Subversion module.

  opt
    opt reads LLVM bitcode, applies a series of LLVM to LLVM transformations (which are specified on the command line), and then outputs the resultant bitcode. The 'opt -help' command is a good way to get a list of the program transformations available in LLVM. opt can also be used to run a specific analysis on an input LLVM bitcode file and print out the results. It is primarily useful for debugging analyses, or familiarizing yourself with what an analysis does.
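To give a feel for how these tools fit together, here is an illustrative pipeline that assembles a hand-written module, optimizes it, inspects the result, links it with another module, and runs it under the JIT. The file names are placeholders, and -mem2reg is just one example pass; run 'opt -help' for the full list:

% llvm-as factorial.ll -o factorial.bc
% opt -mem2reg -stats factorial.bc -o factorial.opt.bc
% llvm-dis factorial.opt.bc -o factorial.opt.ll
% llvm-link factorial.opt.bc main.bc -o program.bc
% lli program.bc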

llvm/utils
  This directory contains utilities for working with LLVM source code; some of the utilities are actually required as part of the build process because they are code generators for parts of LLVM infrastructure.

  codegen-diff
    codegen-diff is a script that finds differences between code that LLC generates and code that LLI generates. This is a useful tool if you are debugging one of them, assuming that the other generates correct output. For the full user manual, run `perldoc codegen-diff'.

  emacs/
    The emacs directory contains syntax-highlighting files which will work with Emacs and XEmacs editors, providing syntax highlighting support for LLVM assembly files and TableGen description files. For information on how to use the syntax files, consult the README file in that directory.

  getsrcs.sh
    The getsrcs.sh script finds and outputs all non-generated source files, which is useful if one wishes to do a lot of development across directories and does not want to individually find each file. One way to use it is to run, for example: xemacs `utils/getsrcs.sh` from the top of your LLVM source tree.

  llvmgrep
    This little tool performs an "egrep -H -n" on each source file in LLVM and passes to it a regular expression provided on llvmgrep's command line. This is a very efficient way of searching the source base for a particular regular expression (a usage sketch appears at the end of this listing).

  makellvm
    The makellvm script compiles all files in the current directory and then compiles and links the tool that is the first argument. For example, assuming you are in the directory llvm/lib/Target/Sparc, if makellvm is in your path, simply running makellvm llc will make a build of the current directory, switch to directory llvm/tools/llc and build it, causing a re-linking of LLC.

  NewNightlyTest.pl and NightlyTestTemplate.html
    These files are used in a cron script to generate nightly status reports of the functionality of tools, and the results can be seen by following the appropriate link on the LLVM homepage.

  TableGen/
    The TableGen directory contains the tool used to generate register descriptions, instruction set descriptions, and even assemblers from common TableGen description files.

  vim/
    The vim directory contains syntax-highlighting files which will work with the VIM editor, providing syntax highlighting support for LLVM assembly files and TableGen description files. For information on how to use the syntax files, consult the README file in that directory.

llvm/win32
  This directory contains build scripts and project files for use with Visual C++. This allows developers on Windows to build LLVM without the need for Cygwin. The contents of this directory should be considered experimental at this time.
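As a usage sketch for the llvmgrep helper described above, a hypothetical query for a symbol name could be run from the top of the source tree like this; the search pattern is only an example:

% utils/llvmgrep 'getAnalysisUsage'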

An Example Using the LLVM Tool Chain

This section gives an example of using LLVM. llvm-gcc3 is now obsolete, so we only include instructions for llvm-gcc4.

Note: The gcc4 frontend's invocation is considerably different from the previous gcc3 frontend. In particular, the gcc4 frontend does not create bitcode by default: gcc4 produces native code. As the example below illustrates, the '--emit-llvm' flag is needed to produce LLVM bitcode output. For makefiles and configure scripts, the CFLAGS variable needs '--emit-llvm' to produce bitcode output.

Example with llvm-gcc4

1. First, create a simple C file, name it 'hello.c':

   #include <stdio.h>
   int main() {
     printf("hello world\n");
     return 0;
   }

2. Next, compile the C file into a native executable:

   % llvm-gcc hello.c -o hello

   Note that llvm-gcc works just like GCC by default. The standard -S and -c arguments work as usual (producing a native .s or .o file, respectively).

3. Next, compile the C file into a LLVM bitcode file:

   % llvm-gcc -O3 -emit-llvm hello.c -c -o hello.bc

   The -emit-llvm option can be used with the -S or -c options to emit an LLVM ".ll" or ".bc" file (respectively) for the code. This allows you to use the standard LLVM tools on the bitcode file. Unlike llvm-gcc3, llvm-gcc4 correctly responds to -O[0123] arguments.

4. Run the program in both forms. To run the program, use:

   % ./hello

   and

   % lli hello.bc

   The second example shows how to invoke the LLVM JIT, lli.

5. Use the llvm-dis utility to take a look at the LLVM assembly code:

   % llvm-dis < hello.bc | less

6. Compile the program to native assembly using the LLC code generator:

   % llc hello.bc -o hello.s

7. Assemble the native assembly language file into a program:

   Solaris: % /opt/SUNWspro/bin/cc -xarch=v9 hello.s -o hello.native
   Others:  % gcc hello.s -o hello.native

8. Execute the native code program:

   % ./hello.native

   Note that using llvm-gcc to compile directly to native code (i.e., when the -emit-llvm option is not present) does steps 6/7/8 for you.
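As a small addition to the walk-through above (not part of the original steps), the same source can be turned directly into textual LLVM assembly instead of bitcode by combining -emit-llvm with -S, which is convenient when you just want to read the IR; the output file name is a placeholder:

% llvm-gcc -O3 -emit-llvm hello.c -S -o hello.ll
% less hello.ll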

Common Problems

If you are having problems building or using LLVM, or if you have any other general questions about LLVM, please consult the Frequently Asked Questions page.

Links

This document is just an introduction on how to use LLVM to do some simple things. In addition, there are many more interesting and complicated things that you can do that aren't documented here (but we'll gladly accept a patch if you want to write something up!). For more information about LLVM, check out:

• LLVM homepage
• LLVM doxygen tree
• Starting a Project that Uses LLVM

Chris Lattner
Reid Spencer
The LLVM Compiler Infrastructure
Last modified: $Date: 2010-04-27 01:53:59 -0500 (Tue, 27 Apr 2010) $

Getting Started with the LLVM System using Microsoft Visual Studio

• Overview
• Getting Started Quickly (A Summary)
• Requirements
  1. Hardware
  2. Software
• Getting Started with LLVM
  1. Terminology and Notation
  2. The Location of LLVM Object Files
• An Example Using the LLVM Tool Chain
• Common Problems
• Links

Written by: Jeff Cohen

Overview

The Visual Studio port at this time is experimental. It is suitable for use only if you are writing your own compiler front end or otherwise have a need to dynamically generate machine code. The JIT and interpreter are functional, but it is currently not possible to generate assembly code which is then assembled into an executable. You can indirectly create executables by using the C back end.

To emphasize, there is no C/C++ front end currently available. llvm-gcc is based on GCC, which cannot be bootstrapped using VC++. Eventually there should be a llvm-gcc based on Cygwin or MinGW that is usable. There is also the option of generating bitcode files on Unix and copying them over to Windows. But be aware the odds of linking C++ code compiled with llvm-gcc with code compiled with VC++ is essentially zero.

The LLVM test suite cannot be run on the Visual Studio port at this time.

Most of the tools build and work. bugpoint does build, but does not work. The other tools 'should' work, but have not been fully tested.

Additional information about the LLVM directory structure and tool chain can be found on the main Getting Started page.

Getting Started Quickly (A Summary)

Here's the short story for getting up and running quickly with LLVM:

1. Read the documentation.
2. Seriously, read the documentation.
3. Remember that you were warned twice about reading the documentation.
4. Get the Source Code
   ♦ With the distributed files:
     1. cd where-you-want-llvm-to-live
     2. gunzip --stdout llvm-version.tar.gz | tar -xvf -, or use WinZip
     3. cd llvm
   ♦ With anonymous Subversion access:
     1. cd where-you-want-llvm-to-live
     2. svn co http://llvm.org/svn/llvm-project/llvm-top/trunk llvm-top
     3. make checkout MODULE=llvm
     4. cd llvm
5. Use CMake to generate up-to-date project files:
   ♦ This step is currently optional, as LLVM does still come with a normal Visual Studio solution file, but it is not always kept up-to-date and will soon be deprecated in favor of the multi-platform generator CMake.
   ♦ If CMake is installed, then the simplest way is to just start the CMake GUI, select the directory where you have LLVM extracted to, and the default options should all be fine. The one option you may really want to change, regardless of anything else, might be the CMAKE_INSTALL_PREFIX setting to select a directory to INSTALL to once compiling is complete.
6. Start Visual Studio
   ♦ If you did not use CMake, then simply double click on the solution file llvm/win32/llvm.sln.
   ♦ If you used CMake, then the directory in which you created the project files (the root directory) will have an llvm.sln file; just double-click on that to open Visual Studio.
7. Build the LLVM Suite:
   ♦ Simply build the solution.
   ♦ If you used CMake to generate the Visual Studio solution and project files, then the Solution will have a few extra options compared to the currently included one. The projects may still be built individually, but to build them all, do not just select all of them in batch build (as some are meant as configuration projects); rather, select and build just the ALL_BUILD project to build everything, or the INSTALL project, which first builds the ALL_BUILD project and then installs the LLVM headers, libs, and other useful things to the directory set by the CMAKE_INSTALL_PREFIX setting when you first configured CMake.
   ♦ The Fibonacci project is a sample program that uses the JIT. Modify the project's debugging properties to provide a numeric command line argument. The program will print the corresponding fibonacci value.

It is strongly encouraged that you get the latest version from Subversion, as changes are continually making the VS support better.

Requirements

Before you begin to use the LLVM system, review the requirements given below. This may save you some trouble by knowing ahead of time what hardware and software you will need.

Hardware

Any system that can adequately run Visual Studio .NET 2005 SP1 is fine. The LLVM source tree and object files, libraries and executables will consume approximately 3GB.

Software

You will need Visual Studio .NET 2005 SP1 or higher. Earlier versions of Visual Studio do not support the C++ standard well enough and will not work. VS2003 would work except (at last check) it has a bug with friend classes that you can work around with some minor code rewriting (and please submit a patch if you do). The VS2005 SP1 beta and the normal VS2005 still have bugs that are not completely compatible. You will also need the CMake build system, since it generates the project files you will use to build with.
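If you prefer the command line over the CMake GUI mentioned above, the equivalent generation step might look roughly like the following sketch. The generator name corresponds to Visual Studio 2005 and the source path is a placeholder; check cmake --help for the generators your CMake version actually offers:

% cmake -G "Visual Studio 8 2005" C:\path\to\llvm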

Do not install the LLVM directory tree into a path containing spaces (e.g. C:\Documents and Settings\...), as the configure step will fail.

Getting Started with LLVM

The remainder of this guide is meant to get you up and running with LLVM using Visual Studio and to give you some basic information about the LLVM environment.

Terminology and Notation

Throughout this manual, the following names are used to denote paths specific to the local system and working environment. These are not environment variables you need to set but just strings used in the rest of this document below. In any of the examples below, simply replace each of these names with the appropriate pathname on your local system. All these paths are absolute:

SRC_ROOT
  This is the top level directory of the LLVM source tree.

OBJ_ROOT
  This is the top level directory of the LLVM object tree (i.e. the tree where object files and compiled programs will be placed. It is fixed at SRC_ROOT/win32).

The Location of LLVM Object Files

The object files are placed under OBJ_ROOT/Debug for debug builds and OBJ_ROOT/Release for release (optimized) builds. These include both executables and libraries that your application can link against. The files that configure would create when building on Unix are created by the Configure project and placed in OBJ_ROOT/llvm. Your application must have OBJ_ROOT in its include search path just before SRC_ROOT/include.

An Example Using the LLVM Tool Chain

1. First, create a simple C file, name it 'hello.c':

   #include <stdio.h>
   int main() {
     printf("hello world\n");
     return 0;
   }

2. Next, compile the C file into a LLVM bitcode file:

   % llvm-gcc -c hello.c -emit-llvm -o hello.bc

   This will create the result file hello.bc, which is the LLVM bitcode that corresponds to the compiled program and the library facilities that it required. You can execute this file directly using the lli tool, compile it to native assembly with llc, optimize or analyze it further with the opt tool, etc.

   Note: while you cannot do this step on Windows, you can do it on a Unix system and transfer hello.bc to Windows. Important: transfer as a binary file!

3. Run the program using the just-in-time compiler:

   % lli hello.bc

   Note: this will only work for trivial C programs. Non-trivial programs (and any C++ program) will have dependencies on the GCC runtime that won't be satisfied by the Microsoft runtime libraries.

4. Use the llvm-dis utility to take a look at the LLVM assembly code:

   % llvm-dis < hello.bc | more

5. Compile the program to C using the LLC code generator:

   % llc -march=c hello.bc

6. Compile to binary using Microsoft C:

   % cl hello.cbe.c

   Note: this will only work for trivial C programs. Non-trivial programs (and any C++ program) will have dependencies on the GCC runtime that won't be satisfied by the Microsoft runtime libraries.

7. Execute the native code program:

   % hello.cbe.exe

Common Problems

• In Visual C++, if you are linking with the x86 target statically, the linker will remove the x86 target library from your generated executable or shared library because there are no references to it. You can force the linker to include these references by using "/INCLUDE:_X86TargetMachineModule" when linking. In the Visual Studio IDE, this can be added in Project Properties->Linker->Input->Force Symbol References.

If you are having problems building or using LLVM, or if you have any other general questions about LLVM, please consult the Frequently Asked Questions page.

Links

This document is just an introduction to how to use LLVM to do some simple things. In addition, there are many more interesting and complicated things that you can do that aren't documented here (but we'll gladly accept a patch if you want to write something up!). For more information about LLVM, check out:

• LLVM homepage
• LLVM doxygen tree
• Starting a Project that Uses LLVM

Jeff Cohen
The LLVM Compiler Infrastructure
Last modified: $Date: 2009-08-05 10:42:44 -0500 (Wed, 05 Aug 2009) $

LLVM Developer Policy

1. Introduction
2. Developer Policies
   1. Stay Informed
   2. Making a Patch
   3. Code Reviews
   4. Code Owners
   5. Test Cases
   6. Quality
   7. Obtaining Commit Access
   8. Making a Major Change
   9. Incremental Development
   10. Attribution of Changes
3. Copyright, License, and Patents
   1. Copyright
   2. License
   3. Patents
4. Developer Agreements

Written by the LLVM Oversight Team

Introduction

This document contains the LLVM Developer Policy which defines the project's policy towards developers and their contributions. The intent of this policy is to eliminate miscommunication, rework, and confusion that might arise from the distributed nature of LLVM's development. By stating the policy in clear terms, we hope each developer can know ahead of time what to expect when making LLVM contributions. This policy is also designed to accomplish the following objectives:

1. Attract both users and developers to the LLVM project.
2. Make life as simple and easy for contributors as possible.
3. Keep the top of Subversion trees as stable as possible.

This policy is aimed at frequent contributors to LLVM. People interested in contributing one-off patches can do so in an informal way by sending them to the llvm-commits mailing list and engaging another developer to see it through the process.

Developer Policies

This section contains policies that pertain to frequent LLVM developers. We always welcome one-off patches from people who do not routinely contribute to LLVM, but we expect more from frequent contributors to keep the system as efficient as possible for everyone. Frequent LLVM contributors are expected to meet the following requirements in order for LLVM to maintain a high standard of quality.

Stay Informed

Developers should stay informed by reading at least the llvmdev email list. If you are doing anything more than just casual work on LLVM, it is suggested that you also subscribe to the llvm-commits list and pay attention to changes being made by others. We recommend that active developers register an email account with LLVM Bugzilla and preferably subscribe to the llvm-bugs email list to keep track of bugs and enhancements occurring in LLVM.

Making a Patch

When making a patch for review, the goal is to make it as easy for the reviewer to read it as possible. As such, we recommend that you:

1. Make your patch against the Subversion trunk, not a branch, and not an old version of LLVM. This makes it easy to apply the patch. For information on how to check out SVN trunk, please see the Getting Started Guide.
2. Similarly, patches should be submitted soon after they are generated. Old patches may not apply correctly if the underlying code changes between the time the patch was created and the time it is applied.
3. Patches should be made with this command: svn diff, or with the utility utils/mkpatch, which makes it easy to read the diff.
4. Patches should not include differences in generated code such as the code generated by autoconf or tblgen. The utils/mkpatch utility takes care of this for you.
5. When sending a patch to a mailing list, it is a good idea to send it as an attachment to the message, not embedded into the text of the message. This ensures that your mailer will not mangle the patch when it sends it (e.g. by making whitespace changes or by wrapping lines).

For Thunderbird users: Before submitting a patch, please open Preferences → Advanced → General → Config Editor, find the key mail.content_disposition_type, and set its value to 1. Without this setting, Thunderbird sends your attachment using Content-Disposition: inline rather than Content-Disposition: attachment. Apple Mail gamely displays such a file inline, making it difficult to work with for reviewers using that program.

Code Reviews

LLVM has a code review policy. Code review is one way to increase the quality of software. We generally follow these policies:

1. All developers are required to have significant changes reviewed before they are committed to the repository.
2. Code reviews are conducted by email, usually on the llvm-commits list.
3. Code can be reviewed either before it is committed or after. We expect major changes to be reviewed before being committed, but smaller changes (or changes where the developer owns the component) can be reviewed after commit.
4. The developer responsible for a code change is also responsible for making all necessary review-related changes.
5. Code review can be an iterative process, which continues until the patch is ready to be committed.

Developers should participate in code reviews as both reviewers and reviewees. If someone is kind enough to review your code, you should return the favor for someone else. Note that anyone is welcome to review and give feedback on a patch, but only people with Subversion write access can approve it.

Code Owners

The LLVM Project relies on two features of its process to maintain rapid development in addition to the high quality of its source base: the combination of code review plus post-commit review for trusted maintainers. Having both is a great way for the project to take advantage of the fact that most people do the right thing most of the time, and only commit patches without pre-commit review when they are confident they are right.

The trick to this is that the project has to guarantee that all patches that are committed are reviewed after they go in: you don't want everyone to assume someone else will review it, allowing the patch to go unreviewed. To solve this problem, we have a notion of an 'owner' for a piece of the code. The sole responsibility of a code owner is to ensure that a commit to their area of the code is appropriately reviewed, either by themself or by someone else. The current code owners are:

1. Evan Cheng: Code generator and all targets.
2. Doug Gregor: Clang Basic, Lex, Parse, and Sema Libraries.
3. Anton Korobeynikov: Exception handling, debug information, and Windows codegen.
4. Ted Kremenek: Clang Static Analyzer.
5. Chris Lattner: Everything not covered by someone else.
6. Duncan Sands: llvm-gcc 4.2.

Note that code ownership is completely different than reviewers: anyone can review a piece of code, and we welcome code review from anyone who is interested. Being a code owner is a somewhat unglamorous position, but it is incredibly important for the ongoing success of the project. Because people get busy, interests change, and unexpected things happen, code ownership is purely opt-in, and anyone can choose to resign their "title" at any time. For now, we do not have an official policy on how one gets elected to be a code owner.

Test Cases

Developers are required to create test cases for any bugs fixed and any new features added. Some tips for getting your testcase approved:

1. All feature and regression test cases are added to the llvm/test directory. The appropriate sub-directory should be selected (see the Testing Guide for details).
2. Test cases should be written in LLVM assembly language unless the feature or regression being tested requires another language (e.g. the bug being fixed or feature being implemented is in the llvm-gcc C++ front-end, in which case it must be written in C++).
3. Test cases, especially for regressions, should be reduced as much as possible, by bugpoint or manually. It is unacceptable to place an entire failing program into llvm/test as this creates a time-to-test burden on all developers. Please keep them short.
4. More extensive test cases (e.g., entire applications, benchmarks, etc) should be added to the llvm-test test suite. The llvm-test suite is for coverage (correctness, performance, etc) testing, not feature or regression testing.

Quality

The minimum quality standards that any change must satisfy before being committed to the main development branch are:

1. Code must adhere to the LLVM Coding Standards.
2. Code must compile cleanly (no errors, no warnings) on at least one platform.
3. Bug fixes and new features should include a testcase so we know if the fix/feature ever regresses in the future.
4. Code must pass the dejagnu (llvm/test) test suite.
5. The code must not cause regressions on a reasonable subset of llvm-test, where "reasonable" depends on the contributor's judgement and the scope of the change (more invasive changes require more testing). A reasonable subset might be something like "llvm-test/MultiSource/Benchmarks".

Additionally, the committer is responsible for addressing any problems found in the future that the change is responsible for. For example:

• The code should compile cleanly on all supported platforms.
• The changes should not cause any correctness regressions in the llvm-test suite and must not cause any major performance regressions.
• The change set should not cause performance or correctness regressions for the LLVM tools.
• The changes should not cause performance or correctness regressions in code compiled by LLVM on all applicable targets.
• You are expected to address any bugzilla bugs that result from your change.

We prefer for this to be handled before submission but understand that it isn't possible to test all of this for every submission. Our build bots and nightly testing infrastructure normally finds these problems. A good rule of thumb is to check the nightly testers for regressions the day after your change. Build bots will directly email you if a group of commits that included yours caused a failure. You are expected to check the build bot messages to see if they are your fault and, if so, fix the breakage.

Commits that violate these quality standards (e.g. are very broken) may be reverted. This is necessary when the change blocks other developers from making progress. The developer is welcome to re-commit the change after the problem has been fixed.

Obtaining Commit Access

We grant commit access to contributors with a track record of submitting high quality patches. If you would like commit access, please send an email to Chris with the following information:

1. The user name you want to commit with, e.g. "hacker".
2. The full name and email address you want messages to llvm-commits to come from, e.g. "J. Random Hacker <hacker@yoyodyne.com>".
3. A "password hash" of the password you want to use, e.g. "2ACR96qjUqsyM". Note that you don't ever tell us what your password is; you just give it to us in an encrypted form. To get this, run "htpasswd" (a utility that comes with apache) in crypt mode (often enabled with "-d"), or find a web page that will do it for you.

Once you've been granted commit access, you should be able to check out an LLVM tree with an SVN URL of "https://username@llvm.org/..." instead of the normal anonymous URL of "http://llvm.org/...". The first time you commit you'll have to type in your password. Note that you may get a warning from SVN about an untrusted key; this is normal.

To verify that your commit access works, please do a test commit (e.g. change a comment or add a blank line). Your first commit to a repository may require the autogenerated email to be approved by a mailing list. This is normal, and will be done when the mailing list owner has time.

If you have recently been granted commit access, these policies apply:

1. You are granted commit-after-approval to all parts of LLVM. To get approval, submit a patch to llvm-commits. When approved you may commit it yourself.
2. You are allowed to commit patches without approval which you think are obvious. This is clearly a subjective decision; we simply expect you to use good judgement. Examples include: fixing build breakage, reverting obviously broken patches, documentation/comment changes, any other minor changes.

You are allowed to commit patches without approval to those portions of LLVM that you have contributed or maintain (i. If the branch development and mainline development occur in the same pieces of code. In any case. We have a strong dislike for huge changes or long-term development branches. keep the community informed about future changes to LLVM. resolving merge conflicts can take a lot of time. Huge changes (produced when a branch is merged back onto mainline) are extremely difficult to code review. with the proviso that such commits must not break the build. 4. These sorts of changes can often be done before the major change is done. it is a good idea to get consensus with the development community before you start working on it. 2. 2. Making a Major Change When a developer begins a major new project with the aim of contributing it back to LLVM. depending on the nature of the change). • The remaining inter-related work should be decomposed into unrelated sets of changes if possible. we do all significant changes as a series of incremental patches. your changes are still subject to code review (either before or after they are committed. This is a "trust but verify" policy and commits of this nature are reviewed after they are committed. Multiple violations of these policies or a single egregious violation may cause commit access to be revoked. 4. Branches are not routinely tested by our nightly tester infrastructure. You are encouraged to review other peoples' patches as well. If you plan to make a major change to the way LLVM works or want to add a major new extension. To address these problems.. API cleanup. 130 . not as a long-term development branch. Some tips: • Large/invasive changes usually have a number of secondary changes that are required before the big change can be made (e. Long-term development branches have a number of drawbacks: 1. Documentation for the LLVM System at SVN head 3. to the extent possible. 5. The reason for this is to: 1. Changes developed as monolithic large changes often don't work until the entire set of changes is done. Other people in the community tend to ignore work on branches. Breaking it down into a set of smaller changes increases the odds that any of the work will be committed to the main repository.g. etc). 3. LLVM uses an incremental development style and we require contributors to follow this practice when making a large/invasive change. ensure that any technical issues around the proposed work are discussed and resolved before any significant work is done. Once the design of the new feature is finalized.e. independently of that work. Branches must have mainline merged into them periodically. and 3. Incremental Development In the LLVM project. Once this is done. but you aren't required to. define the first increment and get consensus on what the end goal of the change is. The design of LLVM is carefully controlled to ensure that all the pieces fit together well and are as consistent as possible. s/he should inform the community with an email to the llvmdev email list. have been assigned responsibility for). avoid duplication of effort by preventing multiple parties working on the same thing and not knowing about it. the work itself should be done as a series of incremental changes.

please say "patch contributed by J. NOTE: This section deals with legal matters but does not provide legal advice.g. please make sure to first discuss the change/gather consensus then ask about the best way to go about making the change. • There's no warranty on LLVM at all. Although UIUC may eventually reassign the copyright of the software to another entity (e. the revision control system keeps a perfect history of who changed what. simplifies code review and reduces the chance that you will get negative feedback on the change. • Binaries derived from LLVM must reproduce the copyright notice (e. The goal of the LLVM project is to always keep the code open and licensed under a very liberal license. to fix a bug). please seek legal counsel from an attorney. We believe that having a single copyright holder is in the best interests of all developers and users as it greatly reduces the managerial burden for any kind of administrative or technical decisions about LLVM. We are not lawyers. or part of a planned series of changes that works towards the development goal. a dedicated non-profit "LLVM Organization") the intent for the project is to always have a single entity hold the copyrights to LLVM at any given time. it is much easier to replace the underlying implementation of the API. This implementation change is logically separate from the API change. Small increments also facilitate the maintenance of a high quality code base. we do not want the source code to be littered with random attributions "this code written by J. and Patents This section addresses the issues of copyright. • You must retain the copyright notice if you redistribute LLVM. Documentation for the LLVM System at SVN head • Each change in the set can be stand alone (e. Once the new API is in place and used. an independent precursor to a big change is to add a new API and slowly migrate clients to use the new API.txt file describes higher-level contributions. and the CREDITS. which boils down to this: • You can freely distribute LLVM. in an included readme file). If you are interested in making a large change. Attribution of Changes We believe in correct attribution of contributions to their contributors. Copyright For consistency and ease of management. Copyright. License We intend to keep LLVM perpetually open source and to use a liberal open source license. The current license is the University of Illinois/NCSA Open Source License. • Each change should be kept as small as possible.g. License. Random Hacker!" in the commit message. • Often. Currently. However. the University of Illinois is the LLVM copyright holder and the terms of its license to LLVM users and developers is the University of Illinois/NCSA Open Source License. and this scares you. This simplifies your work (into a logical progression). the project requires the copyright for all LLVM software to be held by a single copyright holder: the University of Illinois (UIUC). Overall. please do not add contributor names to the source code. license and patents for the LLVM project. If you commit a patch for someone else. Random Hacker" (this is noisy and distracting). • You can't use our names to promote your LLVM derived products. Each change to use the new API is often "obvious" and can be committed without review.g. In practice. 131 .

please contact the LLVM Oversight Group. This implies that any contributions can be licensed under the license that the project uses. It may be a problem if you intend to base commercial development on llvm-gcc without redistributing your source code. please raise this issue with the oversight group before the code is committed. LLVM does not infringe on any patents (we have actually removed code from LLVM in the past that was found to infringe). developers agree to assign their copyrights to UIUC for any contribution made so that the entire software base can be managed by a single copyright holder. If you or your employer own the rights to a patent and would like to contribute code to LLVM that relies on it. we require that the copyright owner sign an agreement that allows any other user of LLVM to freely use your patent. This means that anything "linked" into llvm-gcc must itself be compatible with the GPL. Written by the LLVM Oversight Group The LLVM Compiler Infrastructure Last modified: $Date: 2010-02-26 14:18:32 -0600 (Fri. Documentation for the LLVM System at SVN head We believe this fosters the widest adoption of LLVM because it allows commercial products to be derived from LLVM with few restrictions and without a requirement for making any derived works also open source (i. personally or on behalf of your employer. We suggest that you read the License if further clarification is needed. 26 Feb 2010) $ 132 . Having code in LLVM that infringes on patents would violate an important goal of the project by making it hard or impossible to reuse the code for arbitrary purposes (including commercial use). If the code belongs to some other entity. a proprietary code generator linked into llvm-gcc must be made available under the GPL). which is GPL. This implies that any code linked into llvm-gcc and distributed to others may be subject to the viral aspects of the GPL (for example. Note that the LLVM Project does distribute llvm-gcc. This is not a problem for code already distributed under a more liberal license (like the UIUC license).e. we expect contributors to notify us of any potential for patent-related trouble with their changes. When contributing code. LLVM's license is not a "copyleft" license like the GPL). We have no plans to change the license of LLVM. you also affirm that you are legally entitled to grant this copyright. When contributing code. Please contact the oversight group for more details. Patents To the best of our knowledge. If you have questions or comments about the license. Developer Agreements With regards to the LLVM copyright and licensing. and must be releasable under the terms of the GPL. and does not affect code generated by llvm-gcc.

Transform Passes 4. Introduction 2. The table below divides the passes that LLVM provides into three categories. For example passes to extract functions to bitcode or write a module to bitcode are neither analysis nor transform passes. Transform passes can use (or invalidate) the analysis passes. ANALYSIS PASSES Option Name -aa-eval Exhaustive Alias Analysis Precision Evaluator -basicaa Basic Alias Analysis (default AA impl) -basiccg Basic CallGraph Construction -codegenprepare Optimize for code generation -count-aa Count Alias Analysis Query Responses -debug-aa AA use debugger -domfrontier Dominance Frontier Construction -domtree Dominator Tree Construction -dot-callgraph Print Call Graph to 'dot' file -dot-cfg Print CFG of function to 'dot' file -dot-cfg-only Print CFG of function to 'dot' file (with no function bodies) -globalsmodref-aa Simple mod/ref analysis for globals -instcount Counts the various types of Instructions -intervals Interval Partition Construction -loops Natural Loop Construction -memdep Memory Dependence Analysis -no-aa No Alias Analysis (always returns 'may' alias) -no-profile No Profile Information -postdomfrontier Post-Dominance Frontier Construction -postdomtree Post-Dominator Tree Construction -print-alias-sets Alias Set Printer -print-callgraph Print a call graph 133 . Utility Passes Written by Reid Spencer and Gordon Henriksen Introduction This document serves as a high level summary of the optimization features that LLVM provides. Analysis Passes 3. Utility passes provides some utility but don't otherwise fit categorization. The table below provides a quick summary of each pass and links to the more complete pass description later in the document. Analysis passes compute information that other passes can use or for debugging or program visualization purposes. Documentation for the LLVM System at SVN head LLVM's Analysis and Transform Passes 1. Transform passes all mutate the program in some way. Optimizations are implemented as Passes that traverse some portion of a program to either collect information or transform the program.

Documentation for the LLVM System at SVN head -print-callgraph-sccs Print SCCs of the Call Graph -print-cfg-sccs Print SCCs of each function CFG -print-externalfnconstants Print external fn callsites passed constants -print-function Print function to stderr -print-module Print module to stderr -print-used-types Find Used Types -profile-loader Load profile information from llvmprof.out -scalar-evolution Scalar Evolution Analysis -targetdata Target Data Layout TRANSFORM PASSES Option Name -adce Aggressive Dead Code Elimination -argpromotion Promote 'by reference' arguments to scalars -block-placement Profile Guided Basic Block Placement -break-crit-edges Break critical edges in CFG -codegenprepare Prepare a function for code generation -condprop Conditional Propagation -constmerge Merge Duplicate Global Constants -constprop Simple constant propagation -dce Dead Code Elimination -deadargelim Dead Argument Elimination -deadtypeelim Dead Type Elimination -die Dead Instruction Elimination -dse Dead Store Elimination -globaldce Dead Global Elimination -globalopt Global Variable Optimizer -gvn Global Value Numbering -indmemrem Indirect Malloc and Free Removal -indvars Canonicalize Induction Variables -inline Function Integration/Inlining -insert-block-profiling Insert instrumentation for block profiling -insert-edge-profiling Insert instrumentation for edge profiling -insert-function-profiling Insert instrumentation for function profiling -insert-null-profiling-rs Measure profiling framework overhead -insert-rs-profiling-framework Insert random sampling instrumentation framework -instcombine Combine redundant instructions -internalize Internalize Global Symbols -ipconstprop Interprocedural constant propagation -ipsccp Interprocedural Sparse Conditional Constant Propagation -jump-threading Thread control through conditional blocks -lcssa Loop-Closed SSA Form Pass -licm Loop Invariant Code Motion 134 .

Documentation for the LLVM System at SVN head -loop-deletion Dead Loop Deletion Pass -loop-extract Extract loops into new functions -loop-extract-single Extract at most one loop into a new function -loop-index-split Index Split Loops -loop-reduce Loop Strength Reduction -loop-rotate Rotate Loops -loop-unroll Unroll loops -loop-unswitch Unswitch loops -loopsimplify Canonicalize natural loops -lowerallocs Lower allocations from instructions to calls -lowerinvoke Lower invoke and unwind. 135 . Exhaustive Alias Analysis Precision Evaluator This is a simple N^2 alias analysis accuracy evaluator. DO NOT USE) -extract-blocks Extract Basic Blocks From Module (for bugpoint use) -preverify Preliminary module verification -verify Module Verifier -view-cfg View CFG of function -view-cfg-only View CFG of function (with no function bodies) Analysis Passes This section describes the LLVM Analysis Passes. for unwindless code generators -lowersetjmp Lower Set Jump -lowerswitch Lower SwitchInst's to branches -mem2reg Promote Memory to Register -memcpyopt Optimize use of memcpy and friends -mergereturn Unify function exit nodes -prune-eh Remove unused exception handling info -reassociate Reassociate expressions -reg2mem Demote all values to stack slots -scalarrepl Scalar Replacement of Aggregates -sccp Sparse Conditional Constant Propagation -simplify-libcalls Simplify well-known library calls -simplifycfg Simplify the CFG -strip Strip all symbols from a module -strip-dead-prototypes Remove unused function declarations -sretpromotion Promote sret arguments -tailcallelim Tail Call Elimination -tailduplicate Tail Duplication UTILITY PASSES Option Name -deadarghaX0r Dead Argument Hacking (BUGPOINT USE ONLY. it simply queries to see how the alias analysis implementation answers alias queries between each pair of pointers in the function. Basically. for each function in the program.

It should eventually be removed. but this is a debugging pass. This works around limitations in it's basic-block-at-a-time approach. etc). prints the call graph into a . prints the control flow graph into a . Basic CallGraph Construction Yet to be written. Basic Alias Analysis (default AA impl) This is the default implementation of the Alias Analysis interface that simply implements a few identities (two different globals cannot alias. Print CFG of function to 'dot' file (with no function bodies) This pass. Optimize for code generation This pass munges the code in the input function to better prepare it for SelectionDAG-based code generation. but otherwise does no analysis.dot graph. Yes keeping track of every value in the program is expensive. Print Call Graph to 'dot' file This pass. Dominance Frontier Construction This pass is a simple dominator construction algorithm for finding forward dominator frontiers.dot graph. we can provide pretty accurate and useful information. they do not query AA without informing it of the value. This graph can then be processed with the "dot" tool to convert it to postscript or some other suitable format. and Wojciech Stryjewski. omitting the function bodies. Count Alias Analysis Query Responses A pass which can be used to count how many alias queries are being made and how the alias analysis implementation being used responds. only available in opt. Francesco Spadini. It acts as a shim over any other AA pass you want. Print CFG of function to 'dot' file This pass. AA use debugger This simple pass checks alias analysis users to ensure that if they create a new value. Documentation for the LLVM System at SVN head This is inspired and adapted from code by: Naveen Neelakantam. This graph can then be processed with the "dot" tool to convert it to postscript or some other suitable format. Counts the various types of Instructions This pass collects the count of all instructions and reports them 136 . and keeps track of whether functions read or write memory (are "pure"). For this simple (but very common) case. only available in opt. only available in opt.dot graph. This graph can then be processed with the "dot" tool to convert it to postscript or some other suitable format. Dominator Tree Construction This pass is a simple dominator construction algorithm for finding forward dominators. Simple mod/ref analysis for globals This simple pass provides alias and mod/ref information for global values that do not have their address taken. prints the control flow graph into a .
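To make the basic alias analysis identities above concrete, here is a small hand-written LLVM assembly example. The global and function names are invented for illustration, and the comments describe what -basicaa can conclude; this is a sketch, not output from the pass itself:

  ; Two distinct globals can never overlap, so -basicaa answers NoAlias
  ; for the pair (@counter_a, @counter_b) without any deeper analysis.
  @counter_a = internal global i32 0
  @counter_b = internal global i32 0

  define void @bump_both() {
  entry:
    %a = load i32* @counter_a
    %a.next = add i32 %a, 1
    store i32 %a.next, i32* @counter_a
    ; Because these accesses cannot touch @counter_b, clients of the alias
    ; analysis (for example -gvn or -licm) are free to simplify or reorder
    ; the code below independently of the code above.
    %b = load i32* @counter_b
    %b.next = add i32 %b, 1
    store i32 %b.next, i32* @counter_b
    ret void
  }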

only available in opt. In this way. This can be useful when looking for standard library functions we should constant fold or handle in alias analyses. only available in opt. and tries to provide a lazy. prints the SCCs of the call graph to standard output in a human-readable form. Post-Dominator Tree Construction This pass is a simple post-dominator construction algorithm for finding post-dominators. Post-Dominance Frontier Construction This pass is a simple post-dominator construction algorithm for finding post-dominator frontiers. caching interface to a common kind of alias information query.. Print external fn callsites passed constants This pass. No Profile Information The default "no profile" implementation of the abstract ProfileInfo interface. Print a call graph This pass. Print function to stderr 137 . not just a single natural loop. prints out call sites to external functions that are called with constant arguments. No Alias Analysis (always returns 'may' alias) Always returns "I don't know" for alias queries. Print SCCs of the Call Graph This pass. in that it does not chain to a previous analysis.. As such it doesn't follow many of the rules that other alias analyses must. prints the call graph to standard output in a human-readable form. Memory Dependence Analysis An analysis that determines. only available in opt. Natural Loop Construction This analysis is used to identify natural loops and determine the loop depth of various nodes of the CFG. only available in opt. NoAA is unlike other alias analysis implementations. prints the SCCs of each function CFG to standard output in a human-readable form. Documentation for the LLVM System at SVN head Interval Partition Construction This analysis calculates and represents the interval partition of a function. Print SCCs of each function CFG This pass. or a preexisting interval partition. what preceding memory operations it depends on. for a given memory operation. the interval partition may be used to reduce a flow graph down to its degenerate single node interval partition (unless it is irreducible). It builds on alias analysis information. Note that the loops identified may actually be several natural loops that share the same header node. Alias Set Printer Yet to be written.

out A concrete implementation of profiling information that loads the information from a profile dump file. If it can prove. this means looking for internal functions that have pointer arguments. Note that it refuses to scalarize aggregates which would require passing in more than three operands to the function. Scalar Evolution Analysis The ScalarEvolution analysis can be used to analyze and catagorize scalar expressions in loops. Note that this analysis explicitly does not include types only used by the symbol table. because passing thousands of operands for a large array or structure is unprofitable! Note that this transformation could also be done for arguments that are only stored to (returning the value instead). Transform Passes This section describes the LLVM Transform Passes. Profile Guided Basic Block Placement This pass is a very simple profile guided basic block placement algorithm. but does not currently. then it can pass the value into the function instead of the address of the value. trip counts of loops and other important properties can be obtained. Documentation for the LLVM System at SVN head The PrintFunctionPass class is designed to be pipelined with other FunctionPasses. This can cause recursive simplification of code and lead to the elimination of allocas (especially in C++ template code like the STL). This analysis is primarily useful for induction variable substitution and strength reduction. Target Data Layout Provides other passes access to information on how the size and alignment required by the the target ABI for various data types. scalarizing them if the elements of the aggregate are only loaded. Given this analysis. This case would be best handled when and if LLVM starts supporting multiple return values from functions. The idea is to put frequently executed blocks together at the start of the function and hopefully increase the number of fall-through conditional branches. that an argument is *only* loaded. This pass also handles aggregate arguments that are passed into a function. Print module to stderr This pass simply prints out the entire module when it is executed. this pass basically orders 138 . representing them with the abstract and opaque SCEV class. Promote 'by reference' arguments to scalars This pass promotes "by reference" arguments to be "by value" arguments. Aggressive Dead Code Elimination ADCE aggressively tries to eliminate code. If there is no profile information for a particular function. Find Used Types This pass is used to seek out all of the types in use by the program. This pass is similar to DCE but it assumes that values are dead until proven otherwise. Load profile information from llvmprof. except applied to the liveness of values. In practice. This is similar to SCCP. through the use of alias analysis. It specializes in recognizing general induction variables. and prints out the functions of the module as they are processed.

regardless of whether or not an existing string is available. It should eventually be removed. It may be "required" by passes that cannot deal with critical edges. It eliminate names for types that are unused in the entire translation unit. Dead Code Elimination Dead code elimination is similar to dead instruction elimination. but it rechecks instructions that were used by removed instructions to see if they are newly dead. This transformation obviously invalidates the CFG. 139 . Dead Argument Elimination This pass deletes dead arguments from internal functions. removing instructions that are obviously dead. This works around limitations in it's basic-block-at-a-time approach. and frontier) information. Conditional Propagation This pass propagates information about conditional expressions through the program. For example: add i32 1. as well as arguments only passed into function calls as dead arguments of other functions. It is a good idea to to run a DIE (Dead Instruction Elimination) pass sometime after running this pass. This is useful because some passes (ie TraceValues) insert a lot of string constants into the program. tree. Prepare a function for code generation This pass munges the code in the input function to better prepare it for SelectionDAG-based code generation. Merge Duplicate Global Constants Merges duplicate global constants together into a single constant that is shared. Dead Type Elimination This pass is used to cleanup the output of GCC. Dead Instruction Elimination Dead instruction elimination performs a single pass over the function. Break critical edges in CFG Break all of the critical edges in the CFG by inserting a dummy basic block. 2 becomes i32 3 NOTE: this pass has a habit of making definitions be dead. Dead argument elimination removes arguments which are directly dead. Documentation for the LLVM System at SVN head blocks in depth-first order. This pass is often useful as a cleanup pass to run after aggressive interprocedural passes. immediate dominators. but can update forward dominator (set. Simple constant propagation This file implements constant propagation and merging. This pass also deletes dead arguments in a similar way. which add possibly-dead arguments. It looks for instructions involving only constant operands and replaces them with a constant value instead of an instruction. using the find used types pass. allowing it to eliminate conditional branches in some cases.
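As a small illustration of the -constprop and -dce passes described above, consider the following hand-written function. The names are invented for this example, and the "after" form shows the combined effect of the two passes rather than the literal output of either one:

  define i32 @example() {
  entry:
    %a = add i32 1, 2        ; only constant operands, folded to 3
    %b = mul i32 %a, 4       ; becomes mul i32 3, 4 and folds to 12
    %unused = sub i32 %b, 5  ; has no users, so dead code elimination removes it
    ret i32 %b
  }

After constant propagation and dead code elimination the function reduces to:

  define i32 @example() {
  entry:
    ret i32 12
  }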

Dead Store Elimination

A trivial dead store elimination that only considers basic-block local redundant stores.

Dead Global Elimination

This transform is designed to eliminate unreachable internal globals from the program. It uses an aggressive algorithm, searching out globals that are known to be alive. After it finds all of the globals which are needed, it deletes whatever is left over. This allows it to delete recursive chunks of the program which are unreachable.

Global Variable Optimizer

This pass transforms simple global variables that never have their address taken. If obviously true, it marks read/write globals as constant, deletes variables only stored to, etc.

Global Value Numbering

This pass performs global value numbering to eliminate fully and partially redundant instructions. It also performs redundant load elimination.

Indirect Malloc and Free Removal

This pass finds places where memory allocation functions may escape into indirect land. Some transforms are much easier (aka possible) only if free or malloc are not called indirectly. Thus find places where the address of memory functions are taken and construct bounce functions with direct calls of those functions.

Canonicalize Induction Variables

This transformation analyzes and transforms the induction variables (and computations derived from them) into simpler forms suitable for subsequent analysis and transformation. This transformation makes the following changes to each loop with an identifiable induction variable:

1. All loops are transformed to have a single canonical induction variable which starts at zero and steps by one.
2. The canonical induction variable is guaranteed to be the first PHI node in the loop header block.
3. Any pointer arithmetic recurrences are raised to use array subscripts.

If the trip count of a loop is computable, this pass also makes the following changes:

1. The exit condition for the loop is canonicalized to compare the induction value against the exit value. This turns loops like:

   for (i = 7; i*i < 1000; ++i)

   into

   for (i = 0; i != 25; ++i)

2. Any use outside of the loop of an expression derived from the indvar is changed to compute the derived value outside of the loop, eliminating the dependence on the exit value of the induction variable. If the only purpose of the loop is to compute the exit value of some derived expression, this transformation will make the loop dead.
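The loop rewrite described above can be sketched directly in LLVM assembly. The snippets below are hand-written to mirror the C example (they are not literal -indvars output, and @use is an invented helper): the first loop counts i from 7 while i*i < 1000, and the canonicalized form runs a single induction variable from 0 to 25 and recomputes the original value from it.

  declare void @use(i32)

  define void @squares() {
  entry:
    br label %header
  header:
    %i = phi i32 [ 7, %entry ], [ %i.next, %latch ]
    %sq = mul i32 %i, %i
    %in.range = icmp slt i32 %sq, 1000
    br i1 %in.range, label %body, label %exit
  body:
    call void @use(i32 %i)
    br label %latch
  latch:
    %i.next = add i32 %i, 1
    br label %header
  exit:
    ret void
  }

After canonicalization, the induction variable starts at zero and steps by one, and the exit test compares it directly against the trip count:

  define void @squares() {
  entry:
    br label %header
  header:
    %iv = phi i32 [ 0, %entry ], [ %iv.next, %latch ]
    %done = icmp eq i32 %iv, 25
    br i1 %done, label %exit, label %body
  body:
    %i = add i32 %iv, 7        ; original induction value derived from the canonical one
    call void @use(i32 %i)
    br label %latch
  latch:
    %iv.next = add i32 %iv, 1
    br label %header
  exit:
    ret void
  }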

gdce. It inserts a counter for every edge in the program. duplicates all instructions in a function. Combine redundant instructions Combine instructions to form fewer. the loop could be transformed to count down to zero (the "do loop" optimization). which counts the number of times each basic block executes. but cannot reliably detect hot paths through the CFG. ignoring the profiling code. Insert instrumentation for function profiling This pass instruments the specified program with counters for function profiling. Function Integration/Inlining Bottom-up inlining of functions into callees. At each connection point a choice is made as to whether to jump to the profiled code (take a sample) or execute the unprofiled code. Documentation for the LLVM System at SVN head This transformation should be followed by strength reduction after all of the desired loop transformations have been performed. and dse also are good to run afterwards. Insert random sampling instrumentation framework The second stage of the random-sampling instrumentation framework. then connects the two versions together at the entry and at backedges. Control equivalent regions of the CFG should not require duplicate counters. simple instructions. Insert instrumentation for block profiling This pass instruments the specified program with counters for basic block profiling. instcombine. which can tell which blocks are hot. 1 %Z = add i32 %Y. This pass combines things like: %Y = add i32 %X. instead of using control flow information to prune the number of counters inserted. load-vn. It is the default profiler and thus terminates RSProfiler chains. it is highly recommended to runmem2reg and adce. which counts the number of times each function is called. Insert instrumentation for edge profiling This pass instruments the specified program with counters for edge profiling. Note that this implementation is very naïve. and is used for a wide variety of program transformations. on targets where it is profitable. After this pass. but it does put duplicate counters in. Measure profiling framework overhead The basic profiler that does nothing. Additionally. This is the most basic form of profiling. Note that this implementation is very naïve. Edge profiling can give a reasonable approximation of the hot paths through a program. It is useful for measuring framework overhead. 1 into: 141 . This pass does not modify the CFG This pass is where algebraic simplification happens.

it is moved to the right.. or ≥ to = or ≠if possible. X1 = . } if (X < 3) { In this case. • â—¦ etc..hand side. X2) X3 = phi(X1.. This pass makes arguments dead. X is represented as mul X. else else X2 = . • Bitwise operators with constant operands are always grouped so that shifts are performed first.. >.) for (. Documentation for the LLVM System at SVN head %Z = add i32 %X.. ≤. Thread control through conditional blocks Jump threading tries to find distinct threads of control flow running through a basic block. If a main function is found. Internalize Global Symbols This pass loops over all of the functions in the input module. 1 • Multiplies with a constant power-of-two argument are transformed into shifts. • Compare instructions are converted from <. 2 This is a simple worklist driven algorithm. For example.. An example of when this can occur is code like this: if () { ... then ands... Loop-Closed SSA Form Pass This pass transforms loops by placing phi nodes at the end of the loops for all values that are live across the loop boundary. The existing dead argument elimination pass should be run after this to clean up the mess. If one or more of the predecessors of the block can be proven to always cause a jump to one of the successors. then ors. X2) 142 . This pass guarantees that the following canonicalizations are performed on the program: • If a binary operator has a constant operand.. then xors.) if (c) if (c) X1 = . it turns the left into the right code: for (.. It could certainly be improved in many different ways. • All cmp instructions on boolean values are replaced with logical operations.. looking for a main function. like using a worklist. • add X. X3 = phi(X1. we forward the edge from the predecessor to the successor by duplicating the contents of this block. all other functions and all global variables with initializers are marked as internal. X2 = . X = 4. Interprocedural Sparse Conditional Constant Propagation An interprocedural variant of Sparse Conditional Constant Propagation. the unconditional branch at the end of the first if can be revectored to the false side of the second if. Interprocedural constant propagation This pass implements an extremely simple interprocedural constant propagation pass.. This pass looks at blocks that have multiple predecessors and multiple successors. 2 â✘’ shl X. but does not remove them.

or by sinking code to the exit blocks if it is safe. attempting to remove as much code from the body of a loop as possible. If we can determine that a load or call inside of a loop never aliases anything stored to.. Extract loops into new functions A pass wrapper around the ExtractLoop() scalar transformation to extract each top-level loop into its own new function. Dead Loop Deletion Pass This file implements the Dead Loop Deletion Pass. This is accomplished by creating a new value to hold the initial value of the array access for the first iteration. we can promote the loads and stores in the loop of the pointer to use a temporary alloca'd variable. This can only happen if a few conditions are true: ♦ The pointer stored through is loop invariant. This is a pass most useful for debugging via bugpoint. Loop Strength Reduction This pass performs a strength reduction on array references inside loops that have as one or more of their components the loop induction variable. If the loop is the only loop in a given function. This is used by bugpoint. It does this by either hoisting code into the preheader block. Loop Invariant Code Motion This pass performs loop invariant code motion. There are no calls in the loop which mod/ref the pointer. • Scalar Promotion of Memory . and do not contribute to the computation of the function's return value. we try to move the store to happen AFTER the loop instead of inside of the loop. Index Split Loops This pass divides loop's iteration range by spliting loop such that each individual loop is executed efficiently. and then creating a new GEP instruction in the loop to increment the value by the appropriate amount. Rotate Loops 143 . This pass uses alias analysis for two purposes: • Moving loop invariant loads and calls out of loops. Extract at most one loop into a new function Similar to Extract loops into new functions. it is not touched. This pass is responsible for eliminating loops with non-infinite computable trip counts that have no side effects or volatile instructions. This pass also promotes must-aliased memory locations in the loop to live in registers.. We then use the mem2reg functionality to construct the appropriate SSA form for the variable. = X4 + 4 This is still valid LLVM. we can hoist it or sink it like any other instruction. such as LoopUnswitching. The major benefit of this transformation is that it makes many other loop optimizations. = X3 + 4 X4 = phi(X3) . If these conditions are true. the extra phi nodes are purely redundant.If there is a store instruction inside of the loop.. Documentation for the LLVM System at SVN head . simpler.. and will be trivially eliminated by InstCombine. thus hoisting and sinking "invariant" loads and stores. this pass extracts one natural loop from the program into a function if it can. ♦ There are no stores or loads in the loop which may alias the pointer.

This simplifies a number of analyses and transformations. 144 .) if (lic) A. Lower invoke and unwind. This simplifies transformations such as store-sinking that are built into LICM. Canonicalize natural loops This pass performs several transformations to transform natural loops into a simpler form. It works best when loops have been canonicalized by the -indvars pass. Documentation for the LLVM System at SVN head A simple loop rotation transformation. the 'cheap' support and the 'expensive' support. so usage of this pass should not pessimize generated code.) A. to make the unswitching opportunity obvious.. This is a target-dependent tranformation because it depends on the size of data types and alignment constraints. such as LICM. Loop pre-header insertion guarantees that there is a single. Note that the simplifycfg pass will clean up blocks which are split out but end up being unnecessary. allowing it to determine the trip counts of loops easily. For example. This pass obviously modifies the CFG. Lower allocations from instructions to calls Turn malloc and free instructions into @malloc and @free calls. for unwindless code generators This transformation is designed for use by code generators which do not yet support stack unwinding. Unswitch loops This pass transforms loops that contain branches on loop-invariant conditions to have multiple loops. Loop exit-block insertion guarantees that all exit blocks from the loop (blocks which are outside of the loop that have predecessors inside of the loop) only have predecessors from inside of the loop (and are thus dominated by the loop header). it turns the left into the right code: for (. but updates loop information and dominator information. This pass supports two models of exception handling lowering. non-critical entry edge from outside of the loop to the loop header.. C This can increase the size of the code exponentially (doubling it every time a loop is unswitched) so we only unswitch if the resultant code will be smaller than a threshold. which makes subsequent analyses and transformations simpler and more effective...) if (lic) A for (.. B. This pass expects LICM to be run before it to hoist invariant conditions out of the loop. C B else C for (.. This pass also guarantees that loops will have exactly one backedge. Unroll loops This pass implements a simple loop unroller.

then it gets the value returned by the longjmp and goes to where the basic block was split. This is just the standard SSA construction algorithm to construct "pruned" SSA form. Unify function exit nodes Ensure that functions have at most one ret instruction in them. by turning 'invoke' instructions into calls and by turning 'unwind' instructions into calls to abort(). which allows targets to get away with not implementing the switch instruction until it is convenient. Remove unused exception handling info This file implements a simple interprocedural pass which walks the call-graph. Note that after this pass runs the CFG is not entirely accurate (exceptional control flow edges are not correct anymore) so only very simple things should be done after the lowerinvoke pass has run (like generation of native code). it keeps track of which node is the new exit node of the CFG. If it is. Lower Set Jump Lowers setjmp and longjmp to use the LLVM invoke and unwind instructions as necessary. Because the 'expensive' support slows down programs a lot. invoke instructions are handled in a similar fashion with the original except block being executed if it isn't a longjmp except that is handled by that function. At a setjmp call. It basically inserts setjmp/longjmp calls to emulate the exception handling as necessary. An alloca is transformed by using dominator frontiers to place phi nodes. and EH is only used for a subset of the programs. Promote Memory to Register This file promotes memory references to be register references. then traversing the function in depth-first order to rewrite loads and stores as appropriate. the program will print a message then abort. This unwinds the stack for us calling all of the destructors for objects allocated on the stack. turning invoke instructions into call instructions if and only if the callee cannot throw an exception. It promotes alloca instructions which only have loads and stores as uses. It implements this as a bottom-up 145 . Lowering of longjmp is fairly trivial. Additionally. or transforming sets of stores into memset's. if so. Documentation for the LLVM System at SVN head 'Cheap' exception handling support gives the program the ability to execute any program which does not "throw an exception". The calls in a function that have a setjmp are converted to invoke where the except part checks to see if it's a longjmp exception and. 'Expensive' exception handling support gives the full exception handling support to the program at the cost of making the 'invoke' instruction really expensive. We replace the call with a call to the LLVM library function __llvm_sjljeh_throw_longjmp(). If the program does dynamically use the unwind instruction. it must be specifically enabled by the -enable-correct-eh-support option. This should not be used as a general purpose "my LLVM-to-LLVM pass doesn't support the invoke instruction yet" lowering pass. if it's handled in the function. Lower SwitchInst's to branches Rewrites switch instructions with a sequence of branches. Optimize use of memcpy and friend This pass performs various transformations related to eliminating memcpy calls. the basic block is split and the setjmp removed.

traversal of the call-graph.

Reassociate expressions

This pass reassociates commutative expressions in an order that is designed to promote better constant propagation, GCSE, LICM, PRE, etc. For example: 4 + (x + 5) ⇒ x + (4 + 5)

In the implementation of this algorithm, constants are assigned rank = 0, function arguments are rank = 1, and other values are assigned ranks corresponding to the reverse post order traversal of the current function (starting at 2), which effectively gives values in deep loops higher rank than values not in loops.

Demote all values to stack slots

This file demotes all registers to memory references. It is intended to be the inverse of -mem2reg. By converting to load instructions, the only values live across basic blocks are alloca instructions and load instructions before phi nodes. It is intended that this should make CFG hacking much easier. To make later hacking easier, the entry block is split into two, such that all introduced alloca instructions (and nothing else) are in the entry block.

Scalar Replacement of Aggregates

The well-known scalar replacement of aggregates transformation. This transform breaks up alloca instructions of aggregate type (structure or array) into individual alloca instructions for each member if possible. Then, if possible, it transforms the individual alloca instructions into nice clean scalar SSA form.

This combines a simple scalar replacement of aggregates algorithm with the mem2reg algorithm because they often interact, especially for C++ programs. As such, iterating between scalarrepl, then mem2reg until we run out of things to promote works well.

Sparse Conditional Constant Propagation

Sparse conditional constant propagation and merging, which can be summarized as:

1. Assumes values are constant unless proven otherwise
2. Assumes BasicBlocks are dead unless proven otherwise
3. Proves values to be constant, and replaces them with constants
4. Proves conditional branches to be unconditional

Note that this pass has a habit of making definitions be dead. It is a good idea to run a DCE pass sometime after running this pass.

Simplify well-known library calls

Applies a variety of small optimizations for calls to specific well-known function calls (e.g. runtime library functions). For example, a call exit(3) that occurs within the main() function can be transformed into simply return 3.

Simplify the CFG

Performs dead code elimination and basic block merging. Specifically:

1. Removes basic blocks with no predecessors.
2. Merges a basic block into its predecessor if there is only one and the predecessor only has one successor.
3. Eliminates PHI nodes for basic blocks with a single predecessor.

Dead declarations are declarations of functions for which no implementation is available (i. debug information Note that this transformation makes code much less readable. • TRE is performed if the function returns void. and can still be TRE'd. Documentation for the LLVM System at SVN head 4. creating a loop. though currently the analysis cannot support moving any really useful instructions (only dead ones).e. that the return returns something else (like constant 0). This pass is necessary to straighten out loops created by the C front-end. Remove unused function declarations This pass loops over all of the functions in the input module. looking for dead declarations and removes them. but also is capable of making other code nicer. This transformation can delete: 1. if the return returns the result returned by the call. though unlikely. 147 . • This pass transforms functions that are prevented from being tail recursive by an associative expression to use an accumulator variable. thus compiling the typical naive factorial or fib implementation into efficient code. • If it can prove that callees do not access theier caller stack frame.. such as reducing code size or making it harder to reverse engineer code. It can be TRE'd if all other return instructions in the function return the exact same value. symbols for internal globals and functions 3. Eliminates a basic block that only contains an unconditional branch. the CFG simplify pass should be run to clean up the mess. It is possible. they are marked as eligible for tail call elimination (by the code generator). After this pass is run. so it should only be used in situations where the strip utility would be used. intended to simplify CFGs by removing some unconditional branches. names for virtual registers 2. Tail Duplication This pass performs a limited form of tail duplication. marked with the 'sret' attribute) and replaces them with a new function that simply returns each of the elements of that struct (using multiple return values). Strip all symbols from a module Performs code stripping. This pass works under a number of conditions: • The returned struct must not contain other structs • The returned struct must only be used to load values from • The placeholder struct passed in is the result of an alloca Tail Call Elimination This file transforms calls of the current function (self recursion) followed by a return instruction with a branch to the entry of the function. Promote sret arguments This pass finds functions that return a struct (using a pointer to the struct as the first argument of the function. This pass also implements the following extensions to the basic algorithm: • Trivial instructions between the call and return do not prevent the transformation from taking place. declarations for unused library functions). or if the function returns a run-time constant on all exits from the function.
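The self-recursion rewrite performed by the tail call elimination pass described above can be illustrated with a small hand-written example. The function and helper names are invented, and the transformed version shows roughly the shape the pass produces rather than its literal output:

  declare void @work(i32)

  define void @count_down(i32 %n) {
  entry:
    %done = icmp eq i32 %n, 0
    br i1 %done, label %exit, label %recurse
  recurse:
    call void @work(i32 %n)
    %n.minus1 = sub i32 %n, 1
    call void @count_down(i32 %n.minus1)   ; self call immediately followed by a return
    ret void
  exit:
    ret void
  }

Because the recursive call is in tail position, it can be replaced by a branch back to the top of the function, with a phi node carrying the updated argument:

  define void @count_down(i32 %n) {
  entry:
    br label %tailrecurse
  tailrecurse:
    %n.cur = phi i32 [ %n, %entry ], [ %n.minus1, %recurse ]
    %done = icmp eq i32 %n.cur, 0
    br i1 %done, label %exit, label %recurse
  recurse:
    call void @work(i32 %n.cur)
    %n.minus1 = sub i32 %n.cur, 1
    br label %tailrecurse
  exit:
    ret void
  }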

Note that this does not provide full security verification (like Java). • It is illegal to have a internal global value with no initializer. and also that malformed bitcode is likely to make LLVM crash. • Only phi nodes can be self referential: %x = add i32 %x. Dead Argument Hacking (BUGPOINT USE ONLY. Module Verifier Verifies an LLVM IR code. • All other things that are tested by asserts spread about the code. View CFG of function Displays the control flow graph using the GraphViz tool. so there should be no need to use it directly. This is only for use by bugpoint. • All basic blocks should only end with terminator insts. Note that llvm-as verifies its input before emitting bitcode. • The entry node to a function must not have predecessors. Documentation for the LLVM System at SVN head Utility Passes This section describes the LLVM Utility Passes. • Functions cannot take a void-typed parameter. • The code is in valid SSA form. %x is invalid. • Verify that the indices of mem access instructions match other operands. • All Instructions must be embedded into a basic block. This is useful to run after an optimization which is undergoing testing. • It is illegal to specify a name for a void value. • It is illegal to have a ret instruction that returns a value that does not agree with the function return value type. but deletes arguments to functions which are external. • It is illegal to put a label into any other type (like a structure) or to return one. • PHI nodes must have an entry for each predecessor. DO NOT USE) Same as dead argument elimination. • Verify that a function's argument list agrees with its declared type. Verify that shifts and logicals only happen on integrals f. • PHI nodes must be the first thing in a basic block. • Verify that arithmetic and other things are only performed on first-class types. Extract Basic Blocks From Module (for bugpoint use) This pass is used by bugpoint to extract all blocks from the module into their own functions. not contain them. Preliminary module verification Ensures that the module is in the form required by the Module Verifier pass. • Both of a binary operator's parameters are of the same type. • Function call argument types match the function prototype. all grouped together. Running the verifier runs this pass automatically.e. but instead just tries to ensure that code is well-formed. All language front-ends are therefore encouraged to verify their output before performing optimizing transformations. • All of the constants in a switch statement are of the correct type. with no extras. View CFG of function (with no function bodies) 148 . • PHI nodes must have at least one entry.

Displays the control flow graph using the GraphViz tool, but omitting function bodies.

Reid Spencer
LLVM Compiler Infrastructure
Last modified: $Date: 2010-03-01 13:24:17 -0600 (Mon, 01 Mar 2010) $

stuff that happens when I #include <iostream>? 2. I've upgraded to a new version of LLVM. it finds the wrong C compiler. When creating a dynamic library. 13. but the tests freeze. After Subversion update. Source Languages 1. Where did all of my code go?? 3. 5. Help! 5.3. When I use the test suite. but the resulting tools do not work. I get a strange GLIBC error. License 1. it fails. without redistributing the source? 2. Can I use LLVM to convert C++ code to C code? 5. How portable is the LLVM source code? 3. but my build tree keeps using the old version. How can I disable all optimizations when compiling code using the LLVM GCC front end? 4. but it uses the LLVM linker from a previous build. rebuilding gives the error "No rule to make target". Compiling LLVM with GCC 3. Why? 4. what can be wrong? 11. What do I do? 6. What do I do? 3. I'd like to write a self-hosting LLVM compiler.. I don't understand the GetElementPtr instruction. When I run configure. 8.2 fails. How do I get configure to work correctly? 2. 7. In what language is LLVM written? 2. all of the C Backend tests fail. Using the GCC Front End 1. 14. The configure script finds the right C compiler. Does the University of Illinois Open Source License really qualify as an "open source" license? 3.global_ctors and _GLOBAL__I__tmp_webcompile. Compiling LLVM with GCC succeeds. What is wrong? 12. and I get strange build errors. When I compile code using the LLVM GCC front end. I've built LLVM and am testing it. Source code 1. Why do test results differ when I perform different types of builds? 9. Can I modify LLVM source code and redistribute binaries or other tools based on it. Documentation for the LLVM System at SVN head LLVM: Frequently Asked Questions 1. Questions about code generated by the GCC front-end 1. 4. and now my build is trying to use a file/directory that doesn't exist. I've updated my source tree from Subversion. I've modified a Makefile in my source tree. Why are the LLVM source code and the front-end distributed under different licenses? 2. Can I compile C or C++ code to platform-independent LLVM bitcode? 6. What support is there for higher level source language constructs for building a compiler? 4. How should I interface with the LLVM middle-end optimizers and back-end code generators? 3. What source languages are supported? 2. What is this "undef" thing that shows up in my code? 150 . 2. it complains that it cannot find libcrtend. What is this llvm. the configure script thinks my system has all of the header files and libraries it is testing for. Can I modify LLVM source code and redistribute the modified source? 4. The llvmc program gives me errors/doesn't work.a? 3. When I compile software that uses a configure script. When I compile LLVM-GCC with srcdir == objdir. what should I do? 10. Build Problems 1..

Our aim is to distribute LLVM source code under a much less restrictive license. respectively. 151 . Source Code In what language is LLVM written? All of the LLVM tools and libraries are written in C++ with extensive use of the STL. Porting to systems without these tools (MacOS 9. Does the University of Illinois Open Source License really qualify as an "open source" license? Yes. This is why we distribute LLVM under a less restrictive license than GPL. in particular one that does not compel users who distribute tools based on modifying the source to redistribute the modified source code as well. The tools required to build and test LLVM have been ported to a plethora of platforms. as explained in the first question above. Plan 9) will require more effort. Build Problems When I run configure. the license is certified by the Open Source Initiative (OSI). Can I modify LLVM source code and redistribute the modified source? Yes. How portable is the LLVM source code? The LLVM source code should be portable to most modern UNIX-like operating systems. unless it finds compiler paths set in CC and CXX for the C and C++ compiler. Most of the code is written in standard C++ with operating system services abstracted to a support library. The modified source distribution must retain the copyright notice and follow the three bulletted conditions listed in the LLVM license. • The LLVM build system relies heavily on UNIX shell tools. Documentation for the LLVM System at SVN head 4. Some porting problems may exist in the following areas: • The GCC front end code is not as portable as the LLVM suite. without redistributing the source? Yes. like the Bourne Shell and sed. so it may not compile as well on unsupported platforms. it finds the wrong C compiler. Can I modify LLVM source code and redistribute binaries or other tools based on it. The configure script attempts to locate first gcc and then cc. Why does instcombine + simplifycfg turn a call to a function with a mismatched calling convention into "unreachable"? Why not make the verifier reject it? Written by The LLVM Team License Why are the LLVM source code and the front-end distributed under different licenses? The C/C++ front-ends are based on GCC and must be distributed under the GPL.

If configure finds the wrong compiler, either adjust your PATH environment variable or set CC and CXX explicitly.

The configure script finds the right C compiler, but it uses the LLVM linker from a previous build. What do I do?
The configure script uses the PATH to find executables, so if it's grabbing the wrong linker/assembler/etc, there are two ways to fix it:

1. Adjust your PATH environment variable so that the correct program appears first in the PATH. This may work, but may not be convenient when you want them first in your path for other work.
2. Run configure with an alternative PATH that is correct. In a Bourne compatible shell, the syntax would be:

% PATH=[the path without the bad program] ./configure ...

This is still somewhat inconvenient, but it allows configure to do its work without having to adjust your PATH permanently.

When creating a dynamic library, I get a strange GLIBC error.
Under some operating systems (i.e. Linux), libtool does not work correctly if GCC was compiled with the --disable-shared option. To work around this, install your own version of GCC that has shared libraries enabled by default.

I've updated my source tree from Subversion, and now my build is trying to use a file/directory that doesn't exist.
You need to re-run configure in your object directory. When new Makefiles are added to the source tree, they have to be copied over to the object tree in order to be used by the build.

I've modified a Makefile in my source tree, but my build tree keeps using the old version. What do I do?
If the Makefile already exists in your object tree, you can just run the following command in the top level directory of your object tree:

% ./config.status <relative path to Makefile>

If the Makefile is new, you will have to modify the configure script to copy it over.

I've upgraded to a new version of LLVM, and I get strange build errors.
Sometimes, changes to the LLVM source code alters how the build system works. Changes in libtool, autoconf, or header file dependencies are especially prone to this sort of problem. The best thing to try is to remove the old files and re-build. In most cases, this takes care of the problem. To do this, just type make clean and then make in the directory that fails to build.

I've built LLVM and am testing it, but the tests freeze.

This is most likely occurring because you built a profile or release (optimized) build of LLVM and have not specified the same information on the gmake command line. For example, if you built LLVM with the command:

% gmake ENABLE_PROFILING=1

...then you must run the tests with the following commands:

% cd llvm/test
% gmake ENABLE_PROFILING=1

Why do test results differ when I perform different types of builds?
The LLVM test suite is dependent upon several features of the LLVM tools and libraries. First, the debugging assertions in code are not enabled in optimized or profiling builds. Hence, tests that used to fail may pass. Second, some tests may rely upon debugging options or behavior that is only available in the debug build. These tests will fail in an optimized or profile build.

Compiling LLVM with GCC 3.3.2 fails, what should I do?
This is a bug in GCC, and affects projects other than LLVM. Try upgrading or downgrading your GCC.

Compiling LLVM with GCC succeeds, but the resulting tools do not work, what can be wrong?
Several versions of GCC have shown a weakness in miscompiling the LLVM codebase. Please consult your compiler version (gcc --version) to find out whether it is broken. If so, your only option is to upgrade GCC to a known good version.

After Subversion update, rebuilding gives the error "No rule to make target".
If the error is of the form:

gmake[2]: *** No rule to make target `/path/to/somefile',
needed by `/path/to/another/file.d'. Stop.

This may occur anytime files are moved within the Subversion repository or removed entirely. In this case, the best solution is to erase all .d files, which list dependencies for source files, and rebuild:

% cd $LLVM_OBJ_DIR
% rm -f `find . -name \*\.d`
% gmake

In other cases, it may be necessary to run make clean before rebuilding.

The llvmc program gives me errors/doesn't work.

llvmc is experimental and isn't really supported. We suggest using llvm-gcc instead.

When I compile LLVM-GCC with srcdir == objdir, it fails. Why?
The GNUmakefile in the top-level directory of LLVM-GCC is a special Makefile used by Apple to invoke the build_gcc script after setting up a special environment. This has the unfortunate side-effect that trying to build LLVM-GCC with srcdir == objdir in a "non-Apple way" invokes the GNUmakefile instead of Makefile. Because the environment isn't set up correctly to do this, the build fails.

People not building LLVM-GCC the "Apple way" need to build LLVM-GCC with srcdir != objdir, or simply remove the GNUmakefile entirely. We regret the inconvenience.

Source Languages

What source languages are supported?
LLVM currently has full support for C and C++ source languages. These are available through a special version of GCC that LLVM calls the C Front End.

There is an incomplete version of a Java front end available in the java module. There is no documentation on this yet so you'll need to download the code, compile it, and try it.

The PyPy developers are working on integrating LLVM into the PyPy backend so that PyPy language can translate to LLVM.

I'd like to write a self-hosting LLVM compiler. How should I interface with the LLVM middle-end optimizers and back-end code generators?
Your compiler front-end will communicate with LLVM by creating a module in the LLVM intermediate representation (IR) format. Assuming you want to write your language's compiler in the language itself (rather than C++), there are 3 major ways to tackle generating LLVM IR from a front-end:

• Call into the LLVM libraries code using your language's FFI (foreign function interface).
  ♦ for: best tracks changes to the LLVM IR, .ll syntax, and .bc format
  ♦ for: enables running LLVM optimization passes without an emit/parse overhead
  ♦ for: adapts well to a JIT context
  ♦ against: lots of ugly glue code to write
• Emit LLVM assembly from your compiler's native language.
  ♦ for: very straightforward to get started
  ♦ against: the .ll parser is slower than the bitcode reader when interfacing to the middle end
  ♦ against: you'll have to re-engineer the LLVM IR object model and asm writer in your language
  ♦ against: it may be harder to track changes to the IR
• Emit LLVM bitcode from your compiler's native language.
  ♦ for: can use the more-efficient bitcode reader when interfacing to the middle end
  ♦ against: you'll have to re-engineer the LLVM IR object model and bitcode writer in your language
  ♦ against: it may be harder to track changes to the IR

If you go with the first option, the C bindings in include/llvm-c should help a lot, since most languages have strong support for interfacing with C.
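To make the first option more concrete, here is a minimal illustrative sketch (not part of the original FAQ) of what driving LLVM through the C bindings looks like when called directly from C; a managed language would bind the same entry points through its FFI. The module and function names are placeholders, and the build flags (e.g. from llvm-config) depend on your installation.

#include <stdio.h>
#include <llvm-c/Core.h>

/* Build the IR for: define i32 @add(i32 %a, i32 %b) { ret i32 %a + %b } */
int main(void) {
  LLVMModuleRef mod = LLVMModuleCreateWithName("ffi_demo");

  LLVMTypeRef params[2] = { LLVMInt32Type(), LLVMInt32Type() };
  LLVMTypeRef fnty = LLVMFunctionType(LLVMInt32Type(), params, 2, 0);
  LLVMValueRef add = LLVMAddFunction(mod, "add", fnty);

  LLVMBasicBlockRef entry = LLVMAppendBasicBlock(add, "entry");
  LLVMBuilderRef b = LLVMCreateBuilder();
  LLVMPositionBuilderAtEnd(b, entry);

  /* %sum = add i32 %a, %b ; ret i32 %sum */
  LLVMValueRef sum = LLVMBuildAdd(b, LLVMGetParam(add, 0),
                                     LLVMGetParam(add, 1), "sum");
  LLVMBuildRet(b, sum);

  LLVMDumpModule(mod);      /* prints the textual IR for inspection */

  LLVMDisposeBuilder(b);
  LLVMDisposeModule(mod);
  return 0;
}

The point of the sketch is that no new IR object model has to be written in your language: the bindings construct real LLVM IR in memory, which can then be optimized, JIT-compiled, or written out as bitcode.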

The most common hurdle with calling C from managed code is interfacing with the garbage collector. The C interface was designed to require very little memory management, and so is straightforward in this regard.

What support is there for higher level source language constructs for building a compiler?
Currently, there isn't much. LLVM supports an intermediate representation which is useful for code representation but will not support the high level (abstract syntax tree) representation needed by most compilers. There are no facilities for lexical nor semantic analysis. There is, however, a mostly implemented configuration-driven compiler driver which simplifies the task of running optimizations, linking, and executable generation.

I don't understand the GetElementPtr instruction. Help!
See The Often Misunderstood GEP Instruction.

Using the GCC Front End

When I compile software that uses a configure script, the configure script thinks my system has all of the header files and libraries it is testing for. How do I get configure to work correctly?
The configure script is getting things wrong because the LLVM linker allows symbols to be undefined at link time (so that they can be resolved during JIT or translation to the C back end). That is why configure thinks your system "has everything."

To work around this, perform the following steps:

1. Make sure the CC and CXX environment variables contains the full path to the LLVM GCC front end.
2. Make sure that the regular C compiler is first in your PATH.
3. Add the string "-Wl,-native" to your CFLAGS environment variable.

This will allow the llvm-ld linker to create a native code executable instead of shell script that runs the JIT. Creating native code requires standard linkage, which in turn will allow the configure script to find out if code is not linking on your system because the feature isn't available on your system.

When I compile code using the LLVM GCC front end, it complains that it cannot find libcrtend.a.
The only way this can happen is if you haven't installed the runtime library. To correct this, do:

% cd llvm/runtime
% make clean ; make install-bytecode

How can I disable all optimizations when compiling code using the LLVM GCC front end?
Passing "-Wa,-disable-opt -Wl,-disable-opt" will disable *all* cleanup and optimizations done at the llvm level, leaving you with the truly horrible code that you desire.

Can I use LLVM to convert C++ code to C code?
Yes, you can use LLVM to convert code from any language LLVM supports to C. Note that the generated C code will be very low level (all loops are lowered to gotos, etc) and not very pretty (comments are stripped, original source formatting is totally lost, variables are renamed, expressions are regrouped), so this may not be what you're looking for. Also, there are several limitations noted below.

Use commands like this:

1. Compile your program as normal with llvm-g++:

% llvm-g++ x.cpp -o program

or:

% llvm-g++ a.cpp -c
% llvm-g++ b.cpp -c
% llvm-g++ a.o b.o -o program

With llvm-gcc3, this will generate program and program.bc. The .bc file is the LLVM version of the program all linked together.

2. Convert the LLVM code to C code, using the LLC tool with the C backend:

% llc -march=c program.bc -o program.c

3. Finally, compile the C file:

% cc x.c

Using LLVM does not eliminate the need for C++ library support. If you use the llvm-g++ front-end, the generated code will depend on g++'s C++ support libraries in the same way that code generated from g++ would. If you use another C++ front-end, the generated code will depend on whatever library that front-end would normally require.

If you are working on a platform that does not provide any C++ libraries, you may be able to manually compile libstdc++ to LLVM bitcode, statically link it into your program, then use the commands above to convert the whole result into C code. Alternatively, you might compile the libraries and your application into two different chunks of C code and link them.

Note that, by default, the C back end does not support exception handling. If you want/need it for a certain program, you can enable it by passing "-enable-correct-eh-support" to the llc program. The resultant code will use setjmp/longjmp to implement exception support that is relatively slow, and not C++-ABI-conforming on most platforms, but otherwise correct.

Finally, there are a number of other limitations of the C backend that cause it to produce code that does not fully conform to the C++ ABI on most platforms. Some of the C++ programs in LLVM's test suite are known to fail when compiled with the C back end because of ABI incompatibilities with standard C++ libraries.

Can I compile C or C++ code to platform-independent LLVM bitcode?
No. C and C++ are inherently platform-dependent languages. The most obvious example of this is the preprocessor. A very common way that C code is made portable is by using the preprocessor to include platform-specific code. In practice, information about other platforms is lost after preprocessing, so the result is inherently dependent on the platform that the preprocessing was targeting.
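As a small illustration of that point (this example is not from the FAQ itself), consider a source file that selects behaviour with the preprocessor; after preprocessing for one target, the other branch simply no longer exists in the bitcode:

#include <stdio.h>

/* Only one of these branches survives preprocessing, so the resulting
 * LLVM bitcode carries no information about the other platform. */
int main(void) {
#ifdef _WIN32
  printf("built for Windows\n");
#else
  printf("built for a Unix-like system\n");
#endif
  return 0;
}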

Another example is sizeof. It's common for sizeof(long) to vary between platforms. In most C front-ends, sizeof is expanded to a constant immediately, thus hard-wiring a platform-specific detail.

Also, since many platforms define their ABIs in terms of C, and since LLVM is lower-level than C, front-ends currently must emit platform-specific IR in order to have the result conform to the platform ABI.

Questions about code generated by the GCC front-end

What is this llvm.global_ctors and _GLOBAL__I__tmp_webcompile... stuff that happens when I #include <iostream>?
If you #include the <iostream> header into a C++ translation unit, the file will probably use the std::cin/std::cout/... global objects. However, C++ does not guarantee an order of initialization between static objects in different translation units, so if a static ctor/dtor in your .cpp file used std::cout, for example, the object would not necessarily be automatically initialized before your use.

To make std::cout and friends work correctly in these scenarios, the STL that we use declares a static object that gets created in every translation unit that includes <iostream>. This object has a static constructor and destructor that initializes and destroys the global iostream objects before they could possibly be used in the file. The code that you see in the .ll file corresponds to the constructor and destructor registration code.

If you would like to make it easier to understand the LLVM code generated by the compiler in the demo page, consider using printf() instead of iostreams to print values.

Where did all of my code go??
If you are using the LLVM demo page, you may often wonder what happened to all of the code that you typed in. Remember that the demo script is running the code through the LLVM optimizers, so if your code doesn't actually do anything useful, it might all be deleted.

To prevent this, make sure that the code is actually needed. For example, if you are computing some expression, return the value from the function instead of leaving it in a local variable. If you really want to constrain the optimizer, you can read from and assign to volatile global variables.

What is this "undef" thing that shows up in my code?
undef is the LLVM way of representing a value that is not defined. You can get these if you do not initialize a variable before you use it. For example, the C function:

int X() { int i; return i; }

is compiled to "ret i32 undef" because "i" never has a value specified for it.

Why does instcombine + simplifycfg turn a call to a function with a mismatched calling convention into "unreachable"? Why not make the verifier reject it?
This is a common problem run into by authors of front-ends that are using custom calling conventions: you need to make sure to set the right calling convention on both the function and on each call to the function. For example, this code:

define fastcc void @foo() {
  ret void
}
define void @bar() {
  call void @foo( )
  ret void
}

is optimized to:

define fastcc void @foo() {
  ret void
}
define void @bar() {
  unreachable
}

This often bites people because "all their code disappears", so people often ask why not make the verifier reject this sort of thing.

The answer is that this code has undefined behavior, but it is not illegal. If we made it illegal, then every transformation that could potentially create this would have to ensure that it doesn't, and there is valid code that can create this sort of construct (in dead code). The sorts of things that can cause this to happen are fairly contrived, but we still need to accept them. Setting the calling convention on the caller and callee is required for indirect calls to work. Here's an example:

define fastcc void @foo() {
  ret void
}
define internal void @bar(void()* %FP, i1 %cond) {
  br i1 %cond, label %T, label %F
T:
  call void %FP()
  ret void
F:
  call fastcc void %FP()
  ret void
}
define void @test() {
  %X = or i1 false, false
  call void @bar(void()* @foo, i1 %X)
  ret void
}

In this example, "test" always passes @foo/false into bar, which ensures that it is dynamically called with the right calling conv (thus, the code is perfectly well defined). If you run this through the inliner, you get this (the explicit "or" is there so that the inliner doesn't dead code eliminate a bunch of stuff):

define fastcc void @foo() {
  ret void
}
define void @test() {
  %X = or i1 false, false
  br i1 %X, label %T.i, label %F.i
T.i:
  call void @foo()
  br label %bar.exit
F.i:
  call fastcc void @foo()
  br label %bar.exit
bar.exit:
  ret void
}

Here you can see that the inlining pass made an undefined call to @foo with the wrong calling convention. We really don't want to make the inliner have to know about this sort of thing, so it needs to be valid code. In this case, dead code elimination can trivially remove the undefined code. However, if %X was an input argument to @test, the inliner would produce this:

define fastcc void @foo() {
  ret void
}
define void @test(i1 %X) {
  br i1 %X, label %T.i, label %F.i
T.i:
  call void @foo()
  br label %bar.exit
F.i:
  call fastcc void @foo()
  br label %bar.exit
bar.exit:
  ret void
}

The interesting thing about this is that %X must be false for the code to be well-defined, but no amount of dead code elimination will be able to delete the broken call as unreachable. However, since instcombine/simplifycfg turns the undefined call into unreachable, we end up with a branch on a condition that goes to unreachable: a branch to unreachable can never happen, so "-inline -instcombine -simplifycfg" is able to produce:

define fastcc void @foo() {
  ret void
}
define void @test(i1 %X) {
F.i:
  call fastcc void @foo()
  ret void
}

LLVM Compiler Infrastructure
Last modified: $Date: 2010-02-25 17:41:41 -0600 (Thu, 25 Feb 2010) $

LLVM 2.7 Release Notes

1. Introduction
2. Sub-project Status Update
3. External Projects Using LLVM 2.7
4. What's New in LLVM 2.7?
5. Installation Instructions
6. Portability and Supported Platforms
7. Known Problems
8. Additional Information

Written by the LLVM Team

Introduction
This document contains the release notes for the LLVM Compiler Infrastructure, release 2.7. Here we describe the status of LLVM, including major improvements from the previous release and significant known problems. All LLVM releases may be downloaded from the LLVM releases web site.

For more information about LLVM, including information about the latest release, please check out the main LLVM web site. If you have questions or comments, the LLVM Developer's Mailing List is a good place to send them.

Note that if you are reading this file from a Subversion checkout or the main LLVM web page, this document applies to the next release, not the current one. To see the release notes for a specific release, please see the releases page.

Sub-project Status Update
The LLVM 2.7 distribution currently consists of code from the core LLVM repository (which roughly includes the LLVM optimizers, code generators and supporting tools), the Clang repository and the llvm-gcc repository. In addition to this code, the LLVM Project includes other sub-projects that are in development. Here we include updates on these subprojects.

Clang: C/C++/Objective-C Frontend Toolkit
Clang is an LLVM front end for the C, C++, and Objective-C languages. Clang aims to provide a better user experience through expressive diagnostics, a high level of conformance to language standards, fast compilation, and low memory use. Like LLVM, Clang provides a modular, library-based architecture that makes it suitable for creating or integrating with other development tools. Clang is considered a production-quality compiler for C and Objective-C on x86 (32- and 64-bit).

In the LLVM 2.7 time-frame, the Clang team has made many improvements:

• C++ Support: Clang is now capable of self-hosting! While still alpha-quality, Clang's C++ support has matured enough to build LLVM and Clang, and C++ is now enabled by default. See the Clang C++ compatibility page for common C++ migration issues.
• Objective-C: Clang now includes experimental support for an updated Objective-C ABI on non-Darwin platforms. This includes support for non-fragile instance variables and accelerated proxies, as well as greater potential for future optimisations. The new ABI is used when compiling with the -fobjc-nonfragile-abi and -fgnu-runtime options. Code compiled with these options may be mixed with code compiled with GCC or clang using the old GNU ABI, but requires the libobjc2 runtime from the GNUstep project.
• New warnings: Clang contains a number of new warnings, including control-flow warnings (unreachable code, missing return statements in a non-void function, etc.), sign-comparison warnings, and improved format-string warnings.
• CIndex API and Python bindings: Clang now includes a C API as part of the CIndex library. Although we may make some changes to the API in the future, it is intended to be stable and has been designed for use by external projects. See the Clang doxygen CIndex documentation for more details. The CIndex API also includes a preliminary set of Python bindings.
• ARM Support: Clang now has ABI support for both the Darwin and Linux ARM ABIs. Coupled with many improvements to the LLVM ARM backend, Clang is now suitable for use as a beta quality ARM compiler.

Clang Static Analyzer
The Clang Static Analyzer project is an effort to use static source code analysis techniques to automatically find bugs in C and Objective-C programs (and hopefully C++ in the future!). The tool is very good at finding bugs that occur on specific paths through code, such as on error conditions.

In the LLVM 2.7 time-frame, the analyzer core has made several major and minor improvements, including better support for tracking the fields of structures, initial support (not enabled by default yet) for doing interprocedural (cross-function) analysis, and new checks have been added.

VMKit: JVM/CLI Virtual Machine Implementation
The VMKit project is an implementation of a JVM and a CLI Virtual Machine (Microsoft .NET is an implementation of the CLI) using LLVM for static and just-in-time compilation. With the release of LLVM 2.7, VMKit has shifted to a great framework for writing virtual machines. VMKit now offers precise and efficient garbage collection with multi-threading support, thanks to the MMTk memory management toolkit, as well as just in time and ahead of time compilation with LLVM.

The major changes in VMKit 0.27 are:

• Garbage collection: VMKit now uses the MMTk toolkit for garbage collectors. The first collector to be ported is the MarkSweep collector, which is precise, and drastically improves the performance of VMKit.
• Line number information in the JVM: by using the debug metadata of LLVM, the JVM now supports precise line number information, useful when printing a stack trace.
• Interface calls in the JVM: we implemented a variant of the Interface Method Table technique for interface calls in the JVM.

compiler-rt: Compiler Runtime Library
The new LLVM compiler-rt project is a simple library that provides an implementation of the low-level target-specific hooks required by code generation and other runtime components. For example, when compiling for a 32-bit target, converting a double to a 64-bit unsigned integer is compiled into a runtime call to the "__fixunsdfdi" function. The compiler-rt library provides highly optimized implementations of this and
VMKit: JVM/CLI Virtual Machine Implementation The VMKit project is an implementation of a JVM and a CLI Virtual Machine (Microsoft . VMKit has shifted to a great framework for writing virtual machines. The CIndex API also includes a preliminary set of Python bindings.

and only on linux and darwin (darwin needs an additional gcc patch). You can do this with something like: llvm-mc foo. This is thanks to the new gcc plugin architecture.7 includes major parts of the work required by the new MC Project.7.5! DragonEgg is still a work in progress.5 magically becomes llvm-gcc-4.5 modifications whatsoever (currently one small patch is needed). and gcc-4.5. or only work poorly. eager and lazy evaluation. and a number of other related areas that CPU instruction-set level tools work in. llvm-mc: Machine Code Toolkit The LLVM Machine Code (aka MC) sub-project of LLVM was created to solve a number of problems in the realm of assembly. Pure versions 0. One minor example of what MC can do is to transcode an AT&T syntax X86 .7: compiler_rt now supports ARM targets.7. Pure Pure is an algebraic/functional programming language based on term rewriting. To use it. This section lists some of the projects that have already been updated to work with LLVM 2.s file into intel syntax. For the moment only the x86-32 and x86-64 targets are supported.so" to the gcc-4. Ada and Fortran work fairly well. Programs are collections of equations which are used to evaluate expressions in a symbolic fashion.s External Open Source Projects Using LLVM 2. lexical closures. This work is not complete in LLVM 2.5). 162 . and the LLVM code generators instead of the gcc code generators. DragonEgg is a new project which is seeing its first release with llvm-2. A few targets have been refactored to support it. a "BSD-style" license.s -output-asm-variant=1 -o foo-intel. Unlike llvm-gcc. just like llvm-gcc. dragonegg in theory does not require any gcc-4. disassembly. you add "-fplugin=path/dragonegg.7 (and continue to work with older LLVM releases >= 2. which makes many intrusive changes to the underlying gcc-4. DragonEgg is a gcc plugin that causes the LLVM optimizers to be run instead of the gcc optimizers. while C++. For a gentle introduction. Pure offers dynamic typing.7.5 command line. It is a sub-project of LLVM which provides it with a number of advantages over other compilers that do not have tightly integrated assembly-level tools. object file format handling. New in LLVM 2. Documentation for the LLVM System at SVN head other low-level routines (some are 3x faster than the equivalent libgcc routines). All other languages either don't work at all. please see the Intro to the LLVM MC Project Blog Post. DragonEgg: llvm-gcc ported to gcc-4. which is nothing more than a dynamic library which conforms to the gcc plugin interface.7 An exciting aspect of LLVM is that it is used as an enabling technology for a lot of other language and tools projects. Currently C works very well. 2.43 and later have been tested and are known to work with LLVM 2. All of the code in the compiler-rt project is available under the standard LLVM License. which makes it possible to modify the behaviour of gcc at runtime by loading a plugin.2 code. a hygienic macro system (also based on term rewriting). and work is underway to support a native assembler in LLVM. built-in list and matrix support (including list and matrix comprehensions) and an easy-to-use C interface. The interpreter uses LLVM as a backend to JIT-compile Pure programs to fast native code.5 DragonEgg is a port of llvm-gcc to gcc-4. but it has made substantially more progress on LLVM mainline.

sponsored by Apple Inc. SAFECode Compiler SAFECode is a memory safe C compiler built using LLVM. analyzes the code to ensure that memory accesses and array indexing operations are safe.7.7 (and continue to work with older LLVM releases >= 2. state-of-the-art programming suite for Haskell. MacRuby MacRuby is an implementation of Ruby based on core Mac OS technologies. IcedTea Java Virtual Machine Implementation IcedTea provides a harness to build OpenJDK using only free software build tools and to provide replacements for the not-yet free parts of OpenJDK.2. and the interconnection network. function units. quick development. target independent optimizations and also for parts of code generation. a standard lazy functional programming language. Icedtea6 1. Unladen Swallow Unladen Swallow is a branch of Python intended to be fully compatible and significantly faster.0 have been tested and is known to work with LLVM 2. It generates new LLVM-based code generators "on the fly" for the designed TTA processors and loads them in to the compiler backend as runtime libraries to avoid per-target recompilation of larger parts of the compiler chain. and instruments the code with run-time checks when safety cannot be proven statically. together with an interactive system for convenient. It uses LLVM at runtime for optimization passes. JIT compilation and exception handling. unannotated C code. One of the extensions that IcedTea provides is a new JIT compiler named Shark which uses LLVM to provide native code generation without introducing processor-dependent code. It takes standard. TCE uses llvm-gcc/Clang and LLVM for C/C++ language support. then LLVM is used to compile the bytecode down to machine code. Lua bytecode is analyzed to remove type checks. It includes an optimizing static compiler generating good code for a variety of platforms.8 and later have been tested and are known to work with LLVM 2. LLVM-Lua LLVM-Lua uses LLVM to add JIT and static compiling support to the Lua VM. The upcoming MacRuby 0. LLVM-Lua 1. TTA-based Codesign Environment (TCE) TCE is a toolset for designing application-specific processors (ASP) based on the Transport triggered architecture (TTA). Glasgow Haskell Compiler (GHC) GHC is an open source. Processor customization points include the register files. Documentation for the LLVM System at SVN head Roadsend PHP Roadsend PHP (rphp) is an open source implementation of the PHP programming language that uses LLVM for its optimizer. supported operations. 163 . The toolset provides a complete co-design flow from C/C++ programs down to synthesizable VHDL and parallel program binaries.7.6 release works with LLVM 2. It also allows static (ahead-of-time) compilation of Ruby code straight to machine code. JIT and static compiler.6 as well). It uses LLVM's optimization passes and JIT compiler. This is a reimplementation of an earlier project that is now based on LLVM.


In addition to the existing C and native code generators, GHC now supports an LLVM code generator. GHC
supports LLVM 2.7.

What's New in LLVM 2.7?
This release includes a huge number of bug fixes, performance tweaks and minor improvements. Some of the
major improvements and new features are listed in this section.

LLVM Community Changes
In addition to changes to the code, between LLVM 2.6 and 2.7, a number of organization changes have
happened:

• LLVM has a new official logo!
• Ted Kremenek and Doug Gregor have stepped forward as Code Owners of the Clang static analyzer
and the Clang frontend, respectively.
• LLVM now has an official Blog at http://blog.llvm.org. This is a great way to learn about new
LLVM-related features as they are implemented. Several features in this release are already explained
on the blog.
• The LLVM web pages are now checked into the SVN server, in the "www", "www-pubs" and
"www-releases" SVN modules. Previously they were hidden in a largely inaccessible old CVS server.
• llvm.org is now hosted on a new (and much faster) server. It is still graciously hosted at the University
of Illinois at Urbana-Champaign.

Major New Features
LLVM 2.7 includes several major new capabilities:

• 2.7 includes initial support for the MicroBlaze target. MicroBlaze is a soft processor core designed for
Xilinx FPGAs.
• 2.7 includes a new LLVM IR "extensible metadata" feature. This feature supports many different use
cases, including allowing front-end authors to encode source level information into LLVM IR, which
is consumed by later language-specific passes. This is a great way to do high-level optimizations like
devirtualization, type-based alias analysis, etc. See the Extensible Metadata Blog Post for more
information.
• 2.7 encodes debug information in a completely new way, built on extensible metadata. The new
implementation is much more memory efficient and paves the way for improvements to optimized
code debugging experience.
• 2.7 now directly supports taking the address of a label and doing an indirect branch through a pointer.
This is particularly useful for interpreter loops, and is used to implement the GCC "address of label"
extension. For more information, see the Address of Label and Indirect Branches in LLVM IR Blog
Post.
• 2.7 is the first release to start supporting APIs for assembling and disassembling target machine code.
These APIs are useful for a variety of low level clients, and are surfaced in the new "enhanced
disassembly" API. For more information see the The X86 Disassembler Blog Post for more
information.
• 2.7 includes major parts of the work required by the new MC Project, see the MC update above for
more information.
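The address-of-label support in the list above is what lets the long-standing GCC "computed goto" extension map directly onto the new indirect-branch support in LLVM IR. The following C fragment is an illustrative example written for these notes, not taken from the release itself; the opcode encoding is arbitrary:

#include <stdio.h>

/* A tiny threaded dispatch loop using the GCC "address of label" extension.
 * The 'goto *' jumps below are what front ends can now lower to an
 * indirect branch through a label address in LLVM IR. */
static int run(const unsigned char *ops) {
  static void *dispatch[] = { &&op_inc, &&op_dec, &&op_halt };
  int acc = 0, pc = 0;

  goto *dispatch[ops[pc]];

op_inc:  acc++; pc++; goto *dispatch[ops[pc]];
op_dec:  acc--; pc++; goto *dispatch[ops[pc]];
op_halt: return acc;
}

int main(void) {
  unsigned char prog[] = { 0, 0, 1, 2 };   /* inc, inc, dec, halt */
  printf("%d\n", run(prog));               /* prints 1 */
  return 0;
}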

LLVM IR and Core Improvements
LLVM IR has several new features for better support of new targets and that expose new optimization
opportunities:


• LLVM IR now supports a 16-bit "half float" data type through two new intrinsics and APFloat
support.
• LLVM IR supports two new function attributes: inlinehint and alignstack(n). The former is a hint to
the optimizer that a function was declared 'inline' and thus the inliner should weight it higher when
considering inlining it. The latter indicates to the code generator that the function diverges from the
platform ABI on stack alignment.
• The new llvm.objectsize intrinsic allows the optimizer to infer the sizes of memory objects in some
cases. This intrinsic is used to implement the GCC __builtin_object_size extension.
• LLVM IR now supports marking load and store instructions with "non-temporal" hints (building on
the new metadata feature). This hint encourages the code generator to generate non-temporal accesses
when possible, which are useful for code that is carefully managing cache behavior. Currently, only
the X86 backend provides target support for this feature.
• LLVM 2.7 has pre-alpha support for unions in LLVM IR. Unfortunately, this support is not really
usable in 2.7, so if you're interested in pushing it forward, please help contribute to LLVM mainline.
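The llvm.objectsize intrinsic mentioned in the list above is what the GCC-compatible __builtin_object_size extension lowers to. A small illustrative C use follows (my example, not from the release notes); the values in the comments assume the compiler can see the allocation and optimization is enabled, otherwise the builtin reports "unknown":

#include <stdio.h>

int main(void) {
  char buf[32];
  /* Size of the whole object, when the optimizer can prove it: 32. */
  printf("%zu\n", __builtin_object_size(buf, 0));
  /* A pointer into the object: bytes remaining to the end, i.e. 24. */
  printf("%zu\n", __builtin_object_size(buf + 8, 0));
  /* Mode 0/1 report (size_t)-1 when the size cannot be determined. */
  return 0;
}

This is the mechanism behind the fortified string functions (_FORTIFY_SOURCE style checks), which is why exposing it to the optimizer as an intrinsic is useful.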

Optimizer Improvements
In addition to a large array of minor performance tweaks and bug fixes, this release includes a few major
enhancements and additions to the optimizers:

• The inliner now merges array stack objects in different callees when inlining multiple call
sites into one function. This reduces the stack size of the resultant function.
• The -basicaa alias analysis pass (which is the default) has been improved to be less dependent on
"type safe" pointers. It can now look through bitcasts and other constructs more aggressively,
allowing better load/store optimization.
• The load elimination optimization in the GVN Pass [intro blog post] has been substantially improved
to be more aggressive about partial redundancy elimination and do more aggressive phi translation.
Please see the Advanced Topics in Redundant Load Elimination with a Focus on PHI Translation
Blog Post for more details.
• The module target data string now includes a notion of 'native' integer data types for the target. This
helps mid-level optimizations avoid promoting complex sequences of operations to data types that are
not natively supported (e.g. converting i32 operations to i64 on 32-bit chips).
• The mid-level optimizer is now conservative when operating on a module with no target data.
Previously, it would default to SparcV9 settings, which is not what most people expected.
• Jump threading is now much more aggressive at simplifying correlated conditionals and threading
blocks with otherwise complex logic. It has subsumed the old "Conditional Propagation" pass, and
-condprop has been removed from LLVM 2.7.
• The -instcombine pass has been refactored from being one huge file to being a library of its own.
Internally, it uses a customized IRBuilder to clean it up and simplify it.
• The optimal edge profiling pass is reliable and much more complete than in 2.6. It can be used with
the llvm-prof tool but isn't wired up to the llvm-gcc and clang command line options yet.
• A new experimental alias analysis implementation, -scev-aa, has been added. It uses LLVM's Scalar
Evolution implementation to do symbolic analysis of pointer offset expressions to disambiguate
pointers. It can catch a few cases that basicaa cannot, particularly in complex loop nests.
• The default pass ordering has been tweaked for improved optimization effectiveness.

Interpreter and JIT Improvements

• The JIT now supports generating debug information and is compatible with the new GDB 7.0 (and
later) interfaces for registering dynamically generated debug info.
• The JIT now defaults to compiling eagerly to avoid a race condition in the lazy JIT. Clients that still
want the lazy JIT can switch it on by calling

ExecutionEngine::DisableLazyCompilation(false).
• It is now possible to create more than one JIT instance in the same process. These JITs can generate
machine code in parallel, although you still have to obey the other threading restrictions.

Target Independent Code Generator Improvements
We have put a significant amount of work into the code generator infrastructure, which allows us to
implement more aggressive algorithms and make it run faster:

• The 'llc -asm-verbose' option (which is now the default) has been enhanced to emit many useful
comments to .s files indicating information about spill slots and loop nest structure. This should make
it much easier to read and understand assembly files. This is wired up in llvm-gcc and clang to the
-fverbose-asm option.
• New LSR with "full strength reduction" mode, which can reduce address register pressure in loops
where address generation is important.
• A new codegen level Common Subexpression Elimination pass (MachineCSE) is available and
enabled by default. It catches redundancies exposed by lowering.
• A new pre-register-allocation tail duplication pass is available and enabled by default; it can
substantially improve branch prediction quality in some cases.
• A new sign and zero extension optimization pass (OptimizeExtsPass) is available and enabled by
default. This pass takes advantage of architecture features like x86-64 implicit zero extension
behavior and sub-registers.
• The code generator now supports a mode where it attempts to preserve the order of instructions in the
input code. This is important for source that is hand scheduled and extremely sensitive to scheduling.
It is compatible with the GCC -fno-schedule-insns option.
• The target-independent code generator now supports generating code with arbitrary numbers of result
values. Returning more values than was previously supported is handled by returning through a
hidden pointer. In 2.7, only the X86 and XCore targets have adopted support for this though.
• The code generator now supports generating code that follows the Glasgow Haskell Compiler Calling
Convention and ABI.
• The "DAG instruction selection" phase of the code generator has been largely rewritten for 2.7.
Previously, tblgen spit out tons of C++ code which was compiled and linked into the target to do the
pattern matching; now it emits a much smaller table which is read by the target-independent code. The
primary advantage of this approach is that the size and compile time of various targets are much
improved. The X86 code generator shrunk by 1.5MB of code, for example.
• Almost the entire code generator has switched to emitting code through the MC interfaces instead of
printing textually to the .s file. This led to a number of cleanups and speedups. In 2.7, debug and
exception handling information does not go through MC yet.

X86-32 and X86-64 Target Improvements
New features of the X86 target include:

• The X86 backend now optimizes tail calls much more aggressively for functions that use the
standard C calling convention.
• The X86 backend now models scalar SSE registers as subregs of the SSE vector registers, making the
code generator more aggressive in cases where scalars and vector types are mixed.

ARM Target Improvements
New features of the ARM target include:

• The ARM backend now generates instructions in unified assembly syntax.


• llvm-gcc now has complete support for the ARM v7 NEON instruction set. This support differs
slightly from the GCC implementation. Please see the ARM Advanced SIMD (NEON) Intrinsics and
Types in LLVM Blog Post for helpful information if migrating code from GCC to LLVM-GCC.
• The ARM and Thumb code generators now use register scavenging for stack object address
materialization. This allows the use of R3 as a general purpose register in Thumb1 code, as it was
previously reserved for use in stack address materialization. Secondly, sequential uses of the same value
will now re-use the materialized constant.
• The ARM backend now has good support for ARMv4 targets and has been tested on StrongARM
hardware. Previously, LLVM only supported ARMv4T and newer chips.
• Atomic builtins are now supported for ARMv6 and ARMv7 (__sync_synchronize,
__sync_fetch_and_add, etc.).
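For reference, the atomic builtins named in the last bullet are the GCC-style __sync primitives. The short C sketch below is illustrative only (not from the release notes) and compiles on any target that supports these builtins:

/* A tiny thread-safe counter built on the __sync builtins that the ARM
 * backend can now lower for ARMv6/ARMv7. */
static volatile int counter = 0;

int bump(void) {
  /* Atomically add 1 and return the previous value. */
  return __sync_fetch_and_add(&counter, 1);
}

void publish(void) {
  /* Full memory barrier: make prior writes visible to other threads. */
  __sync_synchronize();
}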

New Useful APIs
This release includes a number of new APIs that are used internally, which may also be useful for external
clients.

• The optimizer uses the new CodeMetrics class to measure the size of code. Various passes (like the
inliner, loop unswitcher, etc) all use this to make more accurate estimates of the code size impact of
various optimizations.
• A new llvm/Analysis/InstructionSimplify.h interface is available for doing symbolic simplification of
instructions (e.g. a+0 -> a) without requiring the instruction to exist. This centralizes a lot of ad-hoc
symbolic manipulation code scattered in various passes.
• The optimizer now uses a new SSAUpdater class which efficiently supports doing unstructured SSA
update operations. This centralized a bunch of code scattered throughout various passes (e.g. jump
threading, lcssa, loop rotate, etc) for doing this sort of thing. The code generator has a similar
MachineSSAUpdater class.
• The llvm/Support/Regex.h header exposes a platform independent regular expression API. Building
on this, the FileCheck utility now supports regular expressions.
• raw_ostream now supports a circular "debug stream" accessed with "dbgs()". By default, this stream
works the same way as "errs()", but if you pass -debug-buffer-size=1000 to opt, the debug
stream is capped to a fixed sized circular buffer and the output is printed at the end of the program's
execution. This is helpful if you have a long lived compiler process and you're interested in seeing
snapshots in time.

Other Improvements and New Features
Other miscellaneous features include:

• You can now build LLVM as a big dynamic library (e.g. "libllvm2.7.so"). To get this, configure
LLVM with the --enable-shared option.
• LLVM command line tools now overwrite their output by default. Previously, they would only do this
with -f. This makes them more convenient to use, and behave more like standard unix tools.
• The opt and llc tools now autodetect whether their input is a .ll or .bc file, and automatically do the
right thing. This means you don't need to explicitly use the llvm-as tool for most things.

Major Changes and Removed Features
If you're already an LLVM user or developer with out-of-tree changes based on LLVM 2.6, this section lists
some "gotchas" that you may run into upgrading from the previous release.

• The Andersen's alias analysis ("anders-aa") pass, the Predicate Simplifier ("predsimplify") pass, the
LoopVR pass, the GVNPRE pass, and the random sampling profiling ("rsprofiling") passes have all
been removed. They were not being actively maintained and had substantial problems. If you are

interested in these components, you are welcome to resurrect them from SVN, fix the correctness
problems, and resubmit them to mainline.
• LLVM now defaults to building most libraries with RTTI turned off, providing a code size reduction.
Packagers who are interested in building LLVM to support plugins that require RTTI information
should build with "make REQUIRE_RTTI=1" and should read the new Advice on Packaging LLVM
document.
• The LLVM interpreter now defaults to not using libffi even if you have it installed. This makes it
more likely that an LLVM built on one system will work when copied to a similar system. To use
libffi, configure with --enable-libffi.
• Debug information uses a completely different representation; an LLVM 2.6 .bc file should work with
LLVM 2.7, but debug info won't come forward.
• The LLVM 2.6 (and earlier) "malloc" and "free" instructions got removed, along with
LowerAllocations pass. Now you should just use a call to the malloc and free functions in libc. These
calls are optimized as well as the old instructions were.

In addition, many APIs have changed in this release. Some of the major LLVM API changes are:

• Just about everything has been converted to use raw_ostream instead of std::ostream.
• llvm/ADT/iterator.h has been removed, just use <iterator> instead.
• The Streams.h file and DOUT got removed, use DEBUG(errs() << ...); instead.
• The TargetAsmInfo interface was renamed to MCAsmInfo.
• ModuleProvider has been removed and its methods moved to Module and GlobalValue.
Most clients can remove uses of ExistingModuleProvider, replace
getBitcodeModuleProvider with getLazyBitcodeModule, and pass their Module to
functions that used to accept ModuleProvider. Clients who wrote their own ModuleProviders
will need to derive from GVMaterializer instead and use Module::setMaterializer to
attach it to a Module.
• GhostLinkage has given up the ghost. GlobalValues that have not yet been read from their
backing storage have the same linkage they will have after being read in. Clients must replace calls to
GlobalValue::hasNotBeenReadFromBitcode with
GlobalValue::isMaterializable.
• The isInteger, isIntOrIntVector, isFloatingPoint and isFPOrFPVector methods have been
renamed isIntegerTy, isIntOrIntVectorTy, isFloatingPointTy and isFPOrFPVectorTy
respectively.
• llvm::Instruction::clone() no longer takes an argument.
• raw_fd_ostream's constructor now takes a flag argument, not individual booleans (see
include/llvm/Support/raw_ostream.h for details).
• Some header files have been renamed:
♦ llvm/Support/AIXDataTypesFix.h to llvm/System/AIXDataTypesFix.h
♦ llvm/Support/DataTypes.h to llvm/System/DataTypes.h
♦ llvm/Transforms/Utils/InlineCost.h to llvm/Analysis/InlineCost.h
♦ llvm/Support/Mangler.h to llvm/Target/Mangler.h
♦ llvm/Analysis/Passes.h to llvm/CodeGen/Passes.h

Portability and Supported Platforms
LLVM is known to work on the following platforms:

• Intel and AMD machines (IA32, X86-64, AMD64, EMT-64) running Red Hat Linux, Fedora Core,
FreeBSD and AuroraUX (and probably other unix-like systems).
• PowerPC and X86-based Mac OS X systems, running 10.4 and above in 32-bit and 64-bit modes.
• Intel and AMD machines running on Win32 using MinGW libraries (native).


• Intel and AMD machines running on Win32 with the Cygwin libraries (limited support is available
for native builds with Visual C++).
• Sun x86 and AMD64 machines running Solaris 10, OpenSolaris 0906.
• Alpha-based machines running Debian GNU/Linux.

The core LLVM infrastructure uses GNU autoconf to adapt itself to the machine and operating system on
which it is built. However, minor porting may be required to get LLVM to work on new platforms. We
welcome your portability patches and reports of successful builds or error messages.

Known Problems
This section contains significant known problems with the LLVM system, listed by component. If you run
into a problem, please check the LLVM bug database and submit a bug if there isn't already one.

• LLVM will not correctly compile on Solaris and/or OpenSolaris using the stock GCC 3.x.x series 'out
of the box'; see Broken versions of GCC and other tools. However, A Modern GCC Build for
x86/x86-64 has been made available from the third party AuroraUX Project that has been
meticulously tested for bootstrapping LLVM & Clang.

Experimental features included with this release
The following components of this LLVM release are either untested, known to be broken or unreliable, or are
in early development. These components should not be relied on, and bugs should not be filed against them,
but they may be useful to some people. In particular, if you would like to work on one of these components,
please contact us on the LLVMdev list.

• The MSIL, Alpha, SPU, MIPS, PIC16, Blackfin, MSP430, SystemZ and MicroBlaze backends are
experimental.
• llc "-filetype=asm" (the default) is the only supported value for this option. The MachO writer
is experimental, and works much better in mainline SVN.

Known problems with the X86 back-end

• The X86 backend does not yet support all inline assembly that uses the X86 floating point stack. It
supports the 'f' and 't' constraints, but not 'u'.
• The X86 backend generates inefficient floating point code when configured to generate code for
systems that don't have SSE2.
• Win64 code generation wasn't widely tested. Everything should work, but we expect small issues to
happen. Also, llvm-gcc cannot build the mingw64 runtime currently due to lack of support for the 'u'
inline assembly constraint and for X87 floating point inline assembly.
• The X86-64 backend does not yet support the LLVM IR instruction va_arg. Currently, front-ends
support variadic argument constructs on X86-64 by lowering them manually.

Known problems with the PowerPC back-end

• The Linux PPC32/ABI support needs testing for the interpreter and static compilation, and lacks
support for debug information.

Known problems with the ARM back-end

• Thumb mode works only on ARMv6 or higher processors. On sub-ARMv6 processors, thumb
programs can crash or produce wrong results (PR1388).
• Compilation for ARM Linux OABI (old ABI) is supported but not fully tested.


Known problems with the SPARC back-end

• The SPARC backend only supports the 32-bit SPARC ABI (-m32); it does not support the 64-bit
SPARC ABI (-m64).

Known problems with the MIPS back-end

• 64-bit MIPS targets are not supported yet.

Known problems with the Alpha back-end

• On 21164s, some rare FP arithmetic sequences which may trap do not have the appropriate nops
inserted to ensure restartability.

Known problems with the C back-end

• The C backend has only basic support for inline assembly code.
• The C backend violates the ABI of common C++ programs, preventing intermixing between C++
compiled by the CBE and C++ code compiled with llc or native compilers.
• The C backend does not support all exception handling constructs.
• The C backend does not support arbitrary precision integers.

Known problems with the llvm-gcc C and C++ front-end
The only major language feature of GCC not supported by llvm-gcc is the __builtin_apply family of
builtins. However, some extensions are only supported on some targets. For example, trampolines are only
supported on some targets (these are used when you take the address of a nested function).

Known problems with the llvm-gcc Fortran front-end

• Fortran support generally works, but there are still several unresolved bugs in Bugzilla. Please see the
tools/gfortran component for details.

Known problems with the llvm-gcc Ada front-end
The llvm-gcc 4.2 Ada compiler works fairly well; however, this is not a mature technology, and problems
should be expected.

• The Ada front-end currently only builds on X86-32. This is mainly due to lack of trampoline support
(pointers to nested functions) on other platforms. However, it also fails to build on X86-64 which
does support trampolines.
• The Ada front-end fails to bootstrap. This is due to lack of LLVM support for setjmp/longjmp
style exception handling, which is used internally by the compiler. Workaround: configure with
--disable-bootstrap.
• The c380004, c393010 and cxg2021 ACATS tests fail (c380004 also fails with gcc-4.2 mainline). If
the compiler is built with checks disabled then c393010 causes the compiler to go into an infinite
loop, using up all system memory.
• Some GCC specific Ada tests continue to crash the compiler.
• The -E binder option (exception backtraces) does not work and will result in programs crashing if an
exception is raised. Workaround: do not use -E.
• Only discrete types are allowed to start or finish at a non-byte offset in a record. Workaround: do not
pack records or use representation clauses that result in a field of a non-discrete type starting or
finishing in the middle of a byte.

• The lli interpreter considers 'main' as generated by the Ada binder to be invalid. Workaround: hand
edit the file to use pointers for argv and envp rather than integers.
• The -fstack-check option is ignored.

Additional Information
A wide variety of additional information is available on the LLVM web page, in particular in the
documentation section. The web page also contains versions of the API documentation which is up-to-date
with the Subversion version of the source code. You can access versions of these documents specific to this
release by going into the "llvm/doc/" directory in the LLVM tree.

If you have any questions or comments about LLVM, please feel free to contact us via the mailing lists.

LLVM Compiler Infrastructure
Last modified: $Date: 2010-04-27 01:53:59 -0500 (Tue, 27 Apr 2010) $

How to submit an LLVM bug report

1. Introduction - Got bugs?
2. Crashing Bugs
♦ Front-end bugs
♦ Compile-time
optimization bugs
♦ Code generator
bugs
3. Miscompilations
4. Incorrect code generation
(JIT and LLC)

Written by Chris Lattner and Misha
Brukman

Introduction - Got bugs?
If you're working with LLVM and run into a bug, we definitely want to know about it. This document
describes what you can do to increase the odds of getting it fixed quickly.

Basically you have to do two things at a minimum. First, decide whether the bug crashes the compiler (or an
LLVM pass), or if the compiler is miscompiling the program (i.e., the compiler successfully produces an
executable, but it doesn't run right). Based on what type of bug it is, follow the instructions in the linked
section to narrow down the bug so that the person who fixes it will be able to find the problem more easily.

Once you have a reduced test-case, go to the LLVM Bug Tracking System and fill out the form with the
necessary details (note that you don't need to pick a category, just use the "new-bugs" category if you're not
sure). The bug description should contain the following information:

• All information necessary to reproduce the problem.
• The reduced test-case that triggers the bug.
• The location where you obtained LLVM (if not from our Subversion repository).

Thanks for helping us make LLVM better!

Crashing Bugs
More often than not, bugs in the compiler cause it to crash—often due to an assertion failure of some sort. The
most important piece of the puzzle is to figure out if it is crashing in the GCC front-end or if it is one of the
LLVM libraries (e.g. the optimizer or code generator) that has problems.

To figure out which component is crashing (the front-end, optimizer or code generator), run the llvm-gcc
command line as you were when the crash occurred, but with the following extra command line options:

• -O0 -emit-llvm: If llvm-gcc still crashes when passed these options (which disable the
optimizer and code generator), then the crash is in the front-end. Jump ahead to the section on
front-end bugs.
• -emit-llvm: If llvm-gcc crashes with this option (which disables the code generator), you
found an optimizer bug. Jump ahead to compile-time optimization bugs.


• Otherwise, you have a code generator crash. Jump ahead to code generator bugs.

Front-end bugs
If the problem is in the front-end, you should re-run the same llvm-gcc command that resulted in the crash, but add the -save-temps option. The compiler will crash again, but it will leave behind a foo.i file (containing preprocessed C source code) and possibly foo.s for each compiled foo.c file. Send us the foo.i file, along with the options you passed to llvm-gcc, and a brief description of the error it caused.

The delta tool helps to reduce the preprocessed file down to the smallest amount of code that still replicates the problem. You're encouraged to use delta to reduce the code to make the developers' lives easier. This website has instructions on the best way to use delta.

Compile-time optimization bugs
If you find that a bug crashes in the optimizer, compile your test-case to a .bc file by passing "-emit-llvm -O0 -c -o foo.bc" to llvm-gcc (in addition to the options you already pass). Then run:

opt -std-compile-opts -debug-pass=Arguments foo.bc -disable-output

This command should do two things: it should print out a list of passes, and then it should crash in the same way as llvm-gcc. If it doesn't crash, please follow the instructions for a front-end bug.

If this does crash, then you should be able to debug this with the following bugpoint command:

bugpoint foo.bc <list of passes printed by opt>

Please run this, then file a bug with the instructions and reduced .bc files that bugpoint emits. If something goes wrong with bugpoint, please submit the "foo.bc" file and the list of passes printed by opt.

Code generator bugs
If you find a bug that crashes llvm-gcc in the code generator, compile your source file to a .bc file by passing "-emit-llvm -c -o foo.bc" to llvm-gcc (in addition to the options you already pass). Once you have foo.bc, one of the following commands should fail:

1. llc foo.bc
2. llc foo.bc -relocation-model=pic
3. llc foo.bc -relocation-model=static
4. llc foo.bc -enable-eh
5. llc foo.bc -relocation-model=pic -enable-eh
6. llc foo.bc -relocation-model=static -enable-eh

If none of these crash, please follow the instructions for a front-end bug. If one of these do crash, you should be able to reduce this with one of the following bugpoint command lines (use the one corresponding to the command above that failed):

1. bugpoint -run-llc foo.bc
2. bugpoint -run-llc foo.bc --tool-args -relocation-model=pic
3. bugpoint -run-llc foo.bc --tool-args -relocation-model=static
4. bugpoint -run-llc foo.bc --tool-args -enable-eh
5. bugpoint -run-llc foo.bc --tool-args -relocation-model=pic -enable-eh
6. bugpoint -run-llc foo.bc --tool-args -relocation-model=static -enable-eh

Please run this, then file a bug with the instructions and reduced .bc file that bugpoint emits. If something goes wrong with bugpoint, please submit the "foo.bc" file and the option that llc crashes with.

Miscompilations
If llvm-gcc successfully produces an executable, but that executable doesn't run right, this is either a bug in the
code or a bug in the compiler. The first thing to check is to make sure it is not using undefined behavior (e.g.
reading a variable before it is defined). In particular, check to see if the program valgrinds clean, passes purify,
or some other memory checker tool. Many of the "LLVM bugs" that we have chased down ended up being bugs in the
program being compiled, not LLVM.

Once you determine that the program itself is not buggy, you should choose which code generator you wish to compile
the program with (e.g. the C backend, the JIT, or LLC) and optionally a series of LLVM passes to run. For example:

    bugpoint -run-cbe [... optzn passes ...] file-to-test.bc --args -- [program arguments]

bugpoint will try to narrow down your list of passes to the one pass that causes an error, and simplify the bitcode
file as much as it can to assist you. It will print a message letting you know how to reproduce the resulting error.

Incorrect code generation
Similarly to debugging incorrect compilation by mis-behaving passes, you can debug incorrect code generation by
either LLC or the JIT, using bugpoint. The process bugpoint follows in this case is to try to narrow the code down to
a function that is miscompiled by one or the other method, but since for correctness, the entire program must be run,
bugpoint will compile the code it deems to not be affected with the C Backend, and then link in the shared object it
generates.

To debug the JIT:

    bugpoint -run-jit -output=[correct output file] [bitcode file] \
             --tool-args -- [arguments to pass to lli] \
             --args -- [program arguments]

Similarly, to debug the LLC, one would run:

    bugpoint -run-llc -output=[correct output file] [bitcode file] \
             --tool-args -- [arguments to pass to llc] \
             --args -- [program arguments]

Special note: if you are debugging MultiSource or SPEC tests that already exist in the llvm/test hierarchy, there is
an easier way to debug the JIT, LLC, and CBE, using the pre-written Makefile targets, which will pass the program
options specified in the Makefiles:

    cd llvm/test/../../program
    make bugpoint-jit

At the end of a successful bugpoint run, you will be presented with two bitcode files: a safe file which can be
compiled with the C backend and the test file which either LLC or the JIT mis-codegenerates, and thus causes the
error.
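To make the -run-llc template concrete, a miscompilation hunt might be invoked as in the sketch below. The names
ref.out, program.bc and input.txt are hypothetical placeholders for your known-good reference output, your bitcode
file, and your program's arguments.

    % bugpoint -run-llc -output=ref.out program.bc \
               --tool-args -- -relocation-model=static \
               --args -- input.txt

On success, bugpoint reports how to reproduce the failure and leaves behind the safe and test bitcode files described
above, which the next section shows how to re-run by hand.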

To reproduce the error that bugpoint found, it is sufficient to do the following:

1. Regenerate the shared object from the safe bitcode file:

    llc -march=c safe.bc -o safe.c
    gcc -shared safe.c -o safe.so

2. If debugging LLC, compile the test bitcode natively and link with the shared object:

    llc test.bc -o test.s
    gcc test.s safe.so -o test.llc
    ./test.llc [program options]

3. If debugging the JIT, load the shared object and supply the test bitcode:

    lli -load=safe.so test.bc [program options]

Chris Lattner
The LLVM Compiler Infrastructure
Last modified: $Date: 2009-10-12 09:46:08 -0500 (Mon, 12 Oct 2009) $

LLVM Testing Infrastructure Guide

1. Overview
2. Requirements
3. LLVM testing infrastructure organization ♦ DejaGNU tests ♦ Test suite
4. Quick start ♦ DejaGNU tests ♦ Test suite
5. DejaGNU structure ♦ Writing new DejaGNU tests ♦ The FileCheck utility ♦ Variables and substitutions ♦ Other features
6. Test suite structure
7. Running the test suite ♦ Configuring External Tests ♦ Running different tests ♦ Generating test output ♦ Writing custom tests for llvm-test
8. Running the nightly tester

Written by John T. Criswell, Reid Spencer, and Tanya Lattner

Overview
This document is the reference manual for the LLVM testing infrastructure. It documents the structure of the LLVM
testing infrastructure, the tools needed to use it, and how to add and run tests.

Requirements
In order to use the LLVM testing infrastructure, you will need all of the software required to build LLVM, plus the
following:

DejaGNU
    The Feature and Regressions tests are organized and run by DejaGNU.
tcl
    Tcl is required by DejaGNU.
Expect
    Expect is required by DejaGNU.

LLVM testing infrastructure organization
The LLVM testing infrastructure contains two major categories of tests: code fragments and whole programs. Code
fragments are referred to as the "DejaGNU tests" and are in the llvm module in subversion under the llvm/test
directory. The whole programs tests are referred to as the "Test suite" and are in the test-suite module in
subversion.

DejaGNU tests
Code fragments are small pieces of code that test a specific feature of LLVM or trigger a specific bug in LLVM. They
are usually written in LLVM assembly language, but can be written in other languages if the test targets a particular
language front end (and the appropriate --with-llvmgcc options were used at configure time of the llvm module). These
tests are driven by the DejaGNU testing framework, which is hidden behind a few simple makefiles.
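The RUN-line conventions that drive these tests are described later in this guide; as a quick illustration, a typical
regression test is nothing more than a small .ll file placed under llvm/test. The file below is a hypothetical sketch
rather than an actual test from the tree:

    ; RUN: llvm-as < %s | llvm-dis | grep add
    define i32 @simple_add(i32 %a, i32 %b) {
    entry:
      %sum = add i32 %a, %b
      ret i32 %sum
    }

When the DejaGNU harness runs this file, the RUN line is executed as a pipeline, and the test fails if any command in
the pipeline returns a non-zero exit code.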

LLVM C backend. Alternatively. This module should be checked out to the llvm/projects directory (don't use another name then the default "test-suite". both in terms of the efficiency of the programs generated as well as the speed with which LLVM compiles. and generates code. Typically when a bug is found in LLVM. In most cases. These programs are generally written in high level languages such as C or C++. whole program tests serve as a way of benchmarking LLVM performance. DejaGNU tests To run all of the simple tests in LLVM using DejaGNU. Documentation for the LLVM System at SVN head hidden behind a few simple makefiles. which are pieces of code which can be compiled and linked into a stand-alone program that can be executed. use the master Makefile in the llvm/test directory: % gmake -C llvm/test or % gmake check To run only a subdirectory of tests in llvm/test using DejaGNU (ie. for then the test suite will be run every time you run make in the main llvm directory). These code fragment tests are located in the llvm/test directory. Transforms). Quick start The tests are located in two separate Subversion modules. These code fragments are not complete programs. LLVM JIT. optimizes. The code generated from them is never executed to determine correct behavior. In addition to compiling and executing programs. Test suite The test suite contains whole programs. just set the TESTSUITE variable to the path of the subdirectory (relative to llvm/test): % gmake TESTSUITE=Transforms check 177 . you can configure the test-suite module manually. often distilled from an actual application or benchmark. LLVM native code generation. The DejaGNU tests are in the main "llvm" module under the directory llvm/test (so you get these tests for free with the main llvm tree). but sometimes they are written straight in LLVM assembly. The output of these programs is compared to ensure that LLVM is compiling the program correctly. The more comprehensive test suite that includes whole programs in C and C++ is in the test-suite module. the test-suite directory will be automatically configured. The test-suite is located in the test-suite Subversion module. When you configure the llvm module. etc). a regression test containing just enough code to reproduce the problem should be written and placed somewhere underneath this directory. this will be a small piece of LLVM assembly language code. These programs are compiled and then executed using several different methods (native compiler.

178 . running the "nightly" set of tests is a good idea.ll check-one To run the tests with Valgrind (Memcheck by default).2 module was configured with --program-prefix=llvm-. • Bitcode: checks Bitcode reader/writer functionality.html Any of the above commands can also be run in a subdirectory of projects/test-suite to run the specified test only on the programs in that subdirectory. Then. Documentation for the LLVM System at SVN head Note: If you are running the tests with objdir != subdir. set TESTONE to its path (relative to llvm/test) and make the check-one target: % gmake TESTONE=Feature/basictest. To run only a single test. run the entire test suite by running make in the test-suite directory: % cd projects/test-suite % gmake Usually.org/svn/llvm-project/test-suite/trunk test-suite % cd . and therefore that the C and C++ compiler drivers are called llvm-gcc and llvm-g++ respectively. and you can also let it generate a report by running: % cd projects/test-suite % gmake TEST=nightly report report. each focused on a particular area of LLVM. % ./configure --with-llvmgccdir=$LLVM_GCC_DIR where $LLVM_GCC_DIR is the directory where you installed llvm-gcc.. The directory is broken into several sub-directories.: % gmake check VG=1 Test suite To run the comprehensive test suite (tests that compile and execute whole programs). This directory contains a large array of small tests that exercise various features of LLVM and to ensure that regressions do not occur. just append VG=1 to the commands above. you must have run the complete testsuite before you can specify a subdirectory. The --with-llvmgccdir option assumes that the llvm-gcc-4. not it's src or obj dir. use --with-llvmgcc/--with-llvmgxx to specify each executable's location. DejaGNU structure The LLVM DejaGNU tests are driven by DejaGNU together with GNU Make and are located in the llvm/test directory. • Archive: checks the Archive library.g. If this is not the case. • Assembler: checks Assembly reader/writer functionality. first checkout and setup the test-suite module: % cd llvm/projects % svn co http://llvm. A few of the important ones are: • Analysis: checks Analysis passes. e.

Below is an example of legal RUN lines in a . Any directory that contains only directories does not need the dg. these lines form the "script" that llvm-runtests executes to run the test case.exp from another directory to get running. just copy dg. the RUN: lines permit pipelines and I/O redirection to be used. RUN: llvm-dis < %s. the llvm-runtests function will issue an error and the test will fail.exp simply loads a Tcl library (test/lib/llvm.exp) and calls the llvm_runtests function defined in that library with a list of file names to run. but does require some information to be set. In order for DejaGNU to work. This file is just a Tcl script and it can do anything you want.ll file: . Consequently the syntax differs from normal shell script syntax in a few ways. However.exp file. RUN lines are interpreted directly by the Tcl exec command. They are never executed by a shell. You can specify as many RUN lines as needed. This are the "RUN" lines that specify how the test is to be run.exp in llvm/test. Writing new DejaGNU tests The DejaGNU structure is very simple. and lastly the command (pipeline) to execute. Tcl will substitute variables and arrange for the pipeline to be executed. This information is gathered via configure and is written to a file. IPO. DejaGNU looks for this file to determine how to run the tests. but we've standardized it for the LLVM regression tests. The names are obtained by using Tcl's glob command. they are not. RUN lines are specified in the comments of the test program using the keyword RUN followed by a colon.bc-13 > %t2 . RUN: diff %t1 %t2 As with a Unix shell. If any process in the pipeline fails. So. even though these lines may look like a shell script. If you're adding a directory of tests. Documentation for the LLVM System at SVN head • CodeGen: checks code generation and each target. see the documentation for the Tcl exec command and the tutorial. and utility transforms to ensure they make the right transformations. This continuation character causes the RUN line to be concatenated with the next one. The llvm-runtests function lookas at each file that is passed to it and gathers any lines together that match "RUN:". Each RUN line is executed on its own. each test script must contain RUN lines if it is to do anything. • Verifier: tests the IR verifier. Together. In this way you can build up long pipelines of commands without making huge line lengths. the entire line (and test case) fails too. each directory of tests must have a dg. The major differences are: 179 . • Linker: tests bitcode linking. If there are no RUN lines. This concatenated set of RUN lines then constitutes one execution. The standard dg. However. RUN: llvm-as < %s | llvm-dis > %t1 . site. The llvm/test Makefile does this work for you.exp file. • Features: checks various features of the LLVM language. The syntax of the RUN lines is similar to a shell's syntax for pipelines including I/O redirection and variable substitution. the usage is slightly different than for Bash. To check what's legal. • Transforms: tests each of the scalar. The lines ending in \ are concatenated until a RUN line that doesn't end in \ is found. distinct from other lines unless its last character is \.

| grep bb[2-8] This. • tcl supports redirecting to open files with the @ syntax but you shouldn't use that here. That's not likely to match anything.. | grep 'find this string' This will fail because the ' characters are passed to grep. This is another Tcl special character. Otherwise. |& grep • You can only redirect to a file. however.. Since these characters are often used in regular expressions this can have disastrous results and cause the entire test run in a directory to fail.. In general nothing needs to be quoted.... 180 . Instead. Tcl won't strip off any ' or " so they will get passed to the invoked program.. That will cause Tcl to write to a file named &1. not to another descriptor and not from a here document.. suppose you had: . you may get invalid results (both false positives and false negatives). They tell Tcl to interpret the content as a command to execute. then it must be doubled. | grep {find this string} Additionally. | grep {i32\\*} If your system includes GNU grep. | grep {bb\[2-8\]} Finally. make sure that GREP_OPTIONS is not set in your environment. the characters [ and ] are treated specially by Tcl. You can do that in tcl with |& so replace this idiom: . if you need to pass the \ character down to a program. For example: . So.. While standard (portable) unix tools like 'grep' work fine on run lines.. like this: . | grep 'i32\*' This will fail to match what you want (a pointer to i32). Second. To avoid this use curly braces to tell Tcl that it should treat everything enclosed as one value.. For example... So our example would become: . First. The FileCheck tool was designed to help with these problems. will cause Tcl to fail because its going to try to execute a program named "2-8". what you want is this: . a common idiom is to look for some basicblock number: . as you see above. and we want to make sure the run lines are portable to a wide range of systems.. 2>&1 | grep with . Another major problem is that grep is not very good at checking to verify that the output of a tools contains a series of different output in a specific order.. This would instruction grep to look for 'find in the files this and string'. the \ gets stripped off by Tcl so what grep sees is: 'i32*'. Usually this is done to get stderr to go through a pipe.. To resolve this you must use \\ and the {}. there are a lot of caveats due to interaction with Tcl syntax. The FileCheck utility A powerful feature of the RUN: lines is that it allows any arbitrary commands to be executed as part of the test harness. There are some quoting rules that you must pay attention to when writing your RUN lines. Documentation for the LLVM System at SVN head • You can't do 2>&1. the ' do not get stripped off.

<4 x i32> %tmp) nounwind { %tmp1 = insertelement <4 x i32> %tmp.load. %xmm0 181 . If it existed somewhere else in the file. it will not match unless there is a "subl" in between those labels. This means that FileCheck will be verifying its standard input (the llc output) against the filename argument specified (the original . FileCheck defaults to ignoring horizontal whitespace differences (e. Now you can see how the file is piped into llvm-as. RUN: | FileCheck %s -check-prefix=X32 . then llc. i32 %s. CHECK: subl %0 = tail call i32 @llvm. i32 %v) { entry: . The syntax of the CHECK: lines is very simple: they are fixed strings that must occur in order. the contents of the CHECK: line is required to match some thing in the test file exactly. RUN: llvm-as < %s | llc -march=x86-64 | FileCheck %s This syntax says to pipe the current file ("%s") into llvm-as. To see how this works.i64. because the test above is checking for the "sub1:" and "inc4:" labels.add. RUN: llvm-as < %s | llc -mtriple=x86_64-apple-darwin9 -mattr=sse41 \ .ll file. Documentation for the LLVM System at SVN head FileCheck (whose basic command line arguments are described in the FileCheck man page is designed to read a file to check from standard input. a space is allowed to match a tab) but otherwise.i32. A simple example of using FileCheck from a RUN line looks like this: . then pipe the output of llc into FileCheck.ll file (after the RUN line): define void @sub1(i32* %p. RUN: | FileCheck %s -check-prefix=X64 define <4 x i32> @pinsrd_1(i32 %s. CHECK: sub1: .p0i32(i32* %p. CHECK: incq %0 = tail call i64 @llvm.sub. X32: pinsrd $1.ll file specified by "%s").atomic. pipe that into llc. and the set of things to verify from a file specified as a command line argument. testing different architectural variants with llc. and the machine code output is what we are verifying. i64 1) ret void } Here you can see some "CHECK:" lines specified in comments. FileCheck checks the machine code output to verify that it matches what the "CHECK:" lines specify. i32 1 ret <4 x i32> %tmp1 . RUN: llvm-as < %s | llc -mtriple=i686-apple-darwin9 -mattr=sse41 \ . The FileCheck -check-prefix option The FileCheck -check-prefix option allows multiple test configurations to be driven from one . One nice thing about FileCheck (compared to grep) is that it allows merging test cases together into logical groups. lets look at the rest of the . CHECK: inc4: .load.g. This is useful in many circumstances. For example. that would not count: "grep subl" matches if subl exists anywhere in the file. for example.p0i64(i64* %p. i32 %v) ret void } define void @inc4(i64* %p) { entry: . Here's a simple example: . 4(%esp).atomic. X32: pinsrd_1: .

For example. you can use CHECK: and CHECK-NEXT: directives to specify this. CHECK: ret i8 } FileCheck Pattern Matching Syntax The CHECK: and CHECK-NOT: directives both take a pattern to match. CHECK-NOT: load . i32 2 > store <2 x double> %tmp9. %xmm0 . align 16 ret void . i32 0 %tmp9 = shufflevector <2 x double> %tmp3. i32* %P %P2 = bitcast i32* %P to i8* %P3 = getelementptr i8* %P2. %eax . a more flexible form of matching is desired. CHECK-NEXT: movapd (%eax). CHECK: movl 8(%esp). align 16 %tmp7 = insertelement <2 x double> undef. The "CHECK-NEXT:" directive Sometimes you want to match lines and would like to verify that matches happen on exactly consequtive lines with no other lines in between them. CHECK: t2: . something like this works as you'd expect: define void @t2(<2 x double>* %r. just use "<PREFIX>-NEXT:". i32 2 %A = load i8* %P3 ret i8 %A . Documentation for the LLVM System at SVN head . CHECK: @coerce_offset0 . To 182 . double %B) { %tmp3 = load <2 x double>* %A. <2 x double>* %A. (%eax) . CHECK-NEXT: movl 4(%esp). For example. <2 x double> %tmp7. %edi. %eax . to verify that a load is removed by a transformation. fixed string matching is perfectly sufficient. In this case. %xmm0 } In this case. The "CHECK-NOT:" directive The CHECK-NOT: directive is used to verify that a string doesn't occur between two matches (or the first match and the beginning of the file). If you specified a custom check prefix. <2 x double>* %r. X64: pinsrd_1: . For some things. <2 x i32> < i32 0. %xmm0 . A CHECK-NEXT cannot be the first directive in a file. we're testing that we get the expected code generation with both 32-bit and 64-bit code generation. X64: pinsrd $1. CHECK-NEXT: ret } CHECK-NEXT: directives reject the input unless there is exactly one newline between it an the previous directive. i32* %P) { store i32 %V. a test like this can be used: define i8 @coerce_offset0(i32 %V. For most uses of FileCheck. double %B. CHECK-NEXT: movhpd 12(%esp). CHECK-NEXT: movapd %xmm0.

you can use something ugly like {{[{][{]}} as your pattern. To do this. The second line verifies that whatever is in REGISTER occurs later in the file after an "andw". FileCheck Variables It is often useful to match a pattern and then verify that it occurs again later in the file. FileCheck has been designed to support mixing and matching fixed string matching with regular expressions. but verify that that register is used consistently later. this can be useful to allow any register. In the rare case that you want to match double braces explicitly from the input. These alternates are deprecated and may go away in a future version. FileCheck variables can be defined multiple times. any Tcl variable that is available in the substitute function (in test/lib/llvm. are named. This is suitable for passing on the command line as the input to an llvm tool. and uses always get the latest value. Here are the available variable names. $srcdir The source directory from where the "make check" was run. This allows you to write things like this: . objdir The object directory that corresponds to the $srcdir.*]]x[[XYZ]]" that the check line will read the previous value of the XYZ variable and define a new one after the match is performed.*}} [[REGISTER]] The first check line matches a regex (%[a-z]+) and captures it into the variables "REGISTER". Because regular expressions are enclosed with double braces. Additionally. This means that if you have something like "CHECK: [[XYZ:. and you don't need to use escape characters within the double braces like you would in C. {{%xmm[0-7]}} In this case. To make a substitution just write the variable's name preceded by a $. The alternate syntax is listed in parentheses.exp) can be substituted into a RUN line. and their names can be formed with the regex "[a-zA-Z][a-zA-Z0-9]*". Note that variables are all read at the start of a "CHECK" line and are all defined at the end. FileCheck allows named variables to be defined and substituted into patterns. Because we want to use fixed string matching for a majority of what we do. then it is a definition of the variable. and any xmm register will be allowed. any offset from the ESP register will be allowed. For codegen tests. if not. $test (%s) The full path to the test case's source. certain names can be accessed with an alternate syntax: a % prefix. CHECK: movhpd {{[0-9]+}}(%esp). CHECK: test5: . this allows you to define two separate CHECK lines that match on the same line. If a colon follows the name. FileCheck allows you to specify regular expressions in matching strings. If you need to do something like this you can probably take advantage of the fact that FileCheck is not actually line-oriented when it matches. Documentation for the LLVM System at SVN head support this. FileCheck variable references are always contained in [[ ]] pairs. 183 . Variables and substitutions With a RUN line there are a number of substitutions that are permitted. for compatibility reasons with previous versions of the test library. In general. CHECK: andw {{. they are visually distinct. it is a use. CHECK: notw [[REGISTER:%[a-z]+]] . Here is a simple example: . surrounded by double braces: {{yourregex}}.

path The path to the directory that contains the test case source. objroot The root directory of the LLVM object tree. in the substitute proc. That's it. the variable can then be used in test scripts. For example: ignore 184 . -D and optimization options. This could be the same as the srcroot. This has all the configured -I. target_triplet (%target_triplet) The target triplet that corresponds to the current host machine (the one running the test cases). You can append to it if you need multiple temporaries. but used by the test.exp file. Note that this might not be gcc. add a line in the test/Makefile that creates the site. -L and -l options. This directory is in the PATH when running tests. Documentation for the LLVM System at SVN head subdir A partial path from the test directory that contains the sub-directory that contains the test source being executed. llvmgcc (%llvmgcc) The full path to the llvm-gcc executable as specified in the configured LLVM environment llvmgxx (%llvmgxx) The full path to the llvm-gxx executable as specified in the configured LLVM environment gccpath The full path to the C compiler used to build LLVM. link (%link) This full link command used to link LLVM executables. This should probably be called "host". tmp The path to a temporary file name that could be used for this test case. First. compile_c (%compile_c) The full command line used to compile LLVM C source code. shlibext (%shlibext) The suffix for the host platforms share library (dll) files. Other Features To make RUN line writing easier. This is for locating any supporting files that are not generated by the test. llvmlibsdir (%llvmlibsdir) The directory where the LLVM libraries are located. Note that this might not be g++. compile_cxx (%compile_cxx) The full command used to compile LLVM C++ source code. This has all the configured -I. in the test/lib/llvm. Second. there are several shell scripts located in the llvm/test/Scripts directory. This will "set" the variable as a global in the site.exp file. The file name won't conflict with other test cases. This is useful as the destination of some redirected output. gxxpath The full path to the C++ compiler used to build LLVM. srcroot The root directory of the LLVM src tree.exp file. This has all the configured -I. so you can just call these scripts using their name. add the variable name to the list of "global" declarations at the beginning of the proc. -D and optimization options. To add more variables. two things need to be changed. This includes the period as the first character.

If there is a match. Non-zero result codes become 0. This is useful to invert the result of a grep. This has two side effects: (a) it prevents special interpretation of lines that are part of the test program. Zero result codes become 1. In addition for testing correctness. any line that contains "END.g. The regular expressions following the : are matched against the target triplet for the host machine.sun To make the output more useful. The regular expressions allow you to XFAIL the test conditionally by host platform. This signals that the test case should succeed if the test fails. XFAIL: darwin. Such test cases are counted separately by DejaGnu. and (b) it speeds things up for really big test cases by avoiding interpretation of the remainder of the file. any program in a pipeline that returns a non-zero result will cause the test to fail. This is useful in cases where the test needs to cause a tool to generate an error (e. You can easily mark a test as XFAIL just by including XFAIL: on a line near the top of the file. llvm-test tests are divided into three types of tests: MultiSource. SingleSource. simply change directory to the programs you want tested and run gmake there. the test is expected to succeed. it will be used in the pass/fail reporting. • llvm-test/SingleSource 185 . the llvm-test directory also performs timing tests of various LLVM optimizations. To XFAIL everywhere just specify XFAIL: *. When executing tests. use the XFAIL keyword in the comments of the test program followed by a colon and one or more regular expressions (separated by a comma). and External. the results from the other programs are compared to the native program output and pass if they match. If not. This is generally done right after the last RUN: line. Here is an example of an XFAIL line: . This makes test run times smaller at first and later on this is useful to investigate individual test failures. This information can be used to compare the effectiveness of LLVM's optimizations and code generation. you can run a different test using the TEST variable to change what tests or run on the selected programs (see below for more info). the llvm_runtest function wil scan the lines of the test case for ones that contain a pattern that matches PR[0-9]+. When a PR number is specified. to check the error output). These programs are compiled using the native compiler and various LLVM backends. it is usually a good idea to start out with a subset of the available tests or programs. However. This script overcomes that issue and nicely documents that the test case is purposefully ignoring the result code of the tool not This script runs its arguments and then inverts the result code from it. Test suite Structure The test-suite module contains a number of programs that can be compiled with LLVM and executed. To run some test only on a subset of programs. This is useful to quickly get some context when a test fails. Finally. The number after "PR" specifies the LLVM bugzilla number. Alternatively. To specify an expected fail. It also records compilation times for the compilers and the JIT." will cause the special interpretation of lines to terminate. Sometimes it is necessary to mark a test case as "expected fail" or XFAIL. Documentation for the LLVM System at SVN head This script runs its arguments and then always returns 0. The output from the program compiled with the native compiler is assumed correct. For example "not grep X" means succeed only if you don't find X in the input. 
This is the syntax for specifying a PR (Problem Report) number that is related to the test case. the test is expected to fail. not the instructions to the test case.

Documentation for the LLVM System at SVN head The SingleSource directory contains test programs that are only a single source file in size. just as you do before building LLVM. 4. you must either: (1) have llvm-gcc you just built in your path. benchmarks. code that is strange grammatically. Large benchmarks and whole applications go here. • llvm-test/External The External directory contains Makefiles for building code that is external to (i. the result for such tests will be XFAIL (eXpected FAILure). This is because the test suite creates temporary files during execution. They are not executed inside of the LLVM source tree. cd into the llvm/projects directory in your source tree. you need to use the following steps: 1. 3. etc. Re-configure llvm from the top level of each build tree (LLVM object directory tree) in which you want to run the test suite. others are features that we haven't added yet (or may never add). regression tests. Some are bugs that we have not fixed yet. not distributed with) LLVM. a large <program> FAILED message will be displayed. These organizations should be relatively self explanatory. The presence and location of these external programs is configured by the llvm-test configure script. The most prominent members of this directory are the SPEC 95 and SPEC 2000 benchmark suites. In DejaGNU. Check out the test-suite module with: % svn co http://llvm. The External directory does not contain these actual tests. including applications. • llvm-test/MultiSource The MultiSource directory contains subdirectories which contain entire programs with multiple source files. Running the test suite First. all tests are executed within the LLVM object directory tree. 6. In this way. These are usually small benchmark programs or small programs that calculate a particular value. only warnings and other miscellaneous output will be generated. 5. Configure and build llvm. If the test passes.. You must also tell the configure machinery that the test suite is available so it can be configured for your build tree: 186 . but only the Makefiles that know how to properly compile these programs from somewhere else. Configure and build llvm-gcc. Install llvm-gcc somewhere. The tests in the test suite have no such feature at this time. This will help you separate benign warnings from actual test failures. Several such programs are grouped together in each directory. or (2) specify the directory where your just-built llvm-gcc is installed using --with-llvmgccdir=$LLVM_GCC_DIR. To run the test suite. Each tree is then subdivided into several categories. 2. During the re-configuration. Some tests are known to fail.org/svn/llvm-project/test-suite/trunk test-suite This will get the test suite into llvm/projects/test-suite. If a test fails. you can tell the difference between an expected and unexpected failure.e.

They may still be valuable. the External tests won't work. You can now run the test suite from your build tree as follows: % cd $LLVM_OBJ_ROOT/projects/test-suite % make Note that the second and third steps only need to be done once. you must specify --with-externals. This will compile and run all programs in the tree using a number of different methods and compare results. The most simple one is simply running gmake with no arguments.nightly. but are likely drowned in the other output. This Makefile can modify build rules to yield different results. and the llvm re-configuration must recognize the previously-built llvm-gcc.Makefile to create the nightly test reports. --with-externals --with-externals=<directory> This tells LLVM where to find any external tests. Passes are not reported explicitely. as a guide to writing your own TEST Makefile for any optimization or analysis passes that you develop with LLVM. They are expected to be in specifically named subdirectories of <directory>. the test-suite module also provides a mechanism for compiling the programs in different ways. Configuring External Tests In order to run the External tests in the test-suite module. however. 187 . After you have the suite checked out and configured. the LLVM nightly tester uses TEST. There are several TEST Makefiles available in the tree. To run the nightly tests.Makefile. Documentation for the LLVM System at SVN head % cd $LLVM_OBJ_ROOT . the test system will include a Makefile named TEST. For example. not its src or obj directory. Subdirectory names known to LLVM include: spec95 speccpu2000 speccpu2006 povray31 Others are added from time to time.<value of TEST variable>. Some of them are designed for internal LLVM research and will not work outside of the LLVM research group. run gmake TEST=nightly. you don't need to do it again (unless the test code or configure script changes). Running different tests In addition to the regular "whole program" tests. If directory is left unspecified. Generating test output There are a number of ways to run the tests and generate output. This must be done during the re-configuration step (see above).] 7. $LLVM_SRC_ROOT/configure [--with-llvmgccdir=$LLVM_GCC_DIR] [Remember that $LLVM_GCC_DIR is the directory where you installed llvm-gcc. Any failures are reported in the output. If the variable TEST is defined on the gmake command line. If any of these is missing or neglected. and can be determined from configure. configure uses the default value /home/vadve/shared/benchmarks/speccpu2000/benchspec.

Though these lines are still drowned in the output.. (e.<type>.format file (when running with TEST=<type>). the nightly test explicitely outputs TEST-PASS or TEST-FAIL for every test after each program.. collecting statistics or running custom checks for correctness.XXX. similarly for report.format targets (where format is one of html.<type>.html" target to get the table in HTML form. this is how the nightly tester works. it's just one example of a general framework. Lets say that you have an LLVM optimization pass. It can be run like this: % cd llvm/projects/test-suite/MultiSource/Benchmarks # or some other level % make TEST=libcalls report This will do a bunch of stuff.Makefile" fragment (where XXX is the name of your test) and an "llvm-test/TEST. it is really easy to run optimizations or code generator components against every program in the tree. text or graphs).. you can set up a test and a report that collects these and formats them for easy viewing. "gmake TEST=nightly report" should work). For example. First thing you should do is add an LLVM statistic to your pass. This basically is grepping the -stats output and displaying it in a table. Following this. Documentation for the LLVM System at SVN head Somewhat better is running gmake TEST=sometest test.tex. it's easy to grep the output logs in the Output directories. and the framework is very general. If you are interested in testing an optimization pass.report" file that indicates how to format the output into a table.. FreeBench/analyzer/analyzer | 51 | 6 | FreeBench/fourinarow/fourinarow | 1 | 1 | FreeBench/neural/neural | 19 | 9 | FreeBench/pifft/pifft | 5 | 3 | MallocBench/cfrac/cfrac | 1 | * | MallocBench/espresso/espresso | 52 | 12 | MallocBench/gs/gs | 4 | * | Prolangs-C/TimberWolfMC/timberwolfmc | 302 | * | Prolangs-C/agrep/agrep | 33 | 12 | Prolangs-C/allroots/allroots | * | * | Prolangs-C/assembler/assembler | 47 | * | Prolangs-C/bison/mybison | 74 | * | . The report also generate a file called report. check out the "libcalls" test as an example. which runs the specified test and usually adds per-program summaries to the output (depending on which sometest you use). then eventually print a table like this: Name | total | #exit | .raw. There are many example reports of various levels of sophistication included with the test suite. but the text results are always shown at the end of the run and the results are always stored in the report. 188 . This consists of two files. which will tally counts of things you care about. Even better are the report and report. an "test-suite/TEST. At base.g.csv and report. csv.out containing the output of the entire test run. Writing custom tests for the test suite Assuming you can run the test suite.XXX. The exact contents of the report are dependent on which TEST you are running. and you want to see how many times it triggers. You can also use the "TEST=libcalls report.

nice . "opt -simplify-libcalls -stats"). You can do this by passing the command line option "-submit-server [server_address]" and "-submit-script [script_on_server]" to utils/NewNightlyTest. Reid Spencer. An email to llvm-testresults@cs. There are lots of example reports that can do fancy stuff. The optimized x86 Linux nightly test is run from just such a script: #!/bin/bash BASE=/proj/work/llvm/nightlytest export BUILDDIR=$BASE/build export WEBDIR=$BASE/testresults export LLVMGCCDIR=/proj/work/llvm/cfrontend/install export PATH=/proj/install/bin:$LLVMGCCDIR/bin:$PATH export LD_LIBRARY_PATH=/proj/install/lib cd $BASE cp /proj/work/llvm/llvm/utils/NewNightlyTest. If you start running the nightly tests. build it.pl file. If these options are not specified. If you decide to set up a nightly tester please choose a unique nickname and invoke utils/NewNightlyTest. delete the checked out tree. The format is pretty simple: the Makefile indicates how to run the test (in this case. and Tanya Lattner The LLVM Compiler Infrastructure Last modified: $Date: 2010-02-26 15:23:59 -0600 (Fri.org/nightlytest/. take a look at the comments at the top of the utils/NewNightlyTest. Take a look at the NewNightlyTest.pl. and then submit the results to http://llvm.edu summarizing the results is also generated.org -submit-script /nightlytest/NightlyTestAccept. The first value is the header for the column and the second is the regex to grep the output of the command for. For example. After test results are submitted to http://llvm. run the "nightly" program test (described above).pl file to see what all of the flags and strings do.pl -nice -release -verbose -parallel -enable-linscan \ -nickname NightlyTester -noexternals > output./NewNightlyTest.org nightly test results page. This testing scheme is designed to ensure that programs don't break as well as keep track of LLVM's progress over time.pl with the "-nickname [yournickname]" command line option. you would invoke the nightly test script with "-submit-server llvm.org/nightlytest/.libcalls. please let us know. Thanks! John T.pl .log 2>&1 It is also possible to specify the the location your nightly test results are submitted.*. 26 Feb 2010) $ 189 . run all of the DejaGNU tests. to submit to the llvm. Criswell. If you'd like to set up an instance of the nightly tester to run on your machine. they are processed and displayed on the tests page.cgi". and the report contains one line for each column of the output. Documentation for the LLVM System at SVN head The source for this is in test-suite/TEST.uiuc. Running the nightly tester The LLVM Nightly Testers automatically check out an LLVM tree. the nightly test script sends the results to the llvm.org nightly test results page. You can create a shell script to encapsulate the running of the script.

C and C++. It is also possible to download the sources of the llvm-gcc front end from a read-only mirror using subversion.ada There are some complications however: 1.2 code for first time use: svn co http://llvm. License Information Written by the LLVM Team Building llvm-gcc from Source This section describes how to acquire and build llvm-gcc 4.LLVM file for up-to-date instructions on how to build llvm-gcc.org/svn/llvm-project/llvm-gcc-4.gz archive from the LLVM web site. Objective-C and Objective-C++. 1. GNAT GPL 2008. 2006 and 2007 versions of the GNAT GPL Edition.2 and the 2005. which is based on the GCC 4.LLVM file. Building the Fortran front-end 4. Supported languages are Ada. The rest of gcc is written in C. gcc-4. Building llvm-gcc from Source 2.2/trunk dst-directory After that.3 and later will not work.2-version. The Ada front-end is written in Ada so an Ada compiler is needed to build it. the code can be be updated in the destination directory using: svn update The mirror is brought up to date every evening. one that supports Ada and C (such as the 2007 GNAT GPL Edition) and another which supports C++. for example: EXTRALANGS=. adding ". Because the Ada front-end is experimental.2. This causes it to run much slower. Follow the directions in the top-level README. See below for building with support for Ada or Fortran. Note that the instructions for building these front-ends are completely different (and much easier!) than those for building llvm-gcc3 in the past. 3. see below.5 release are gcc-4. It is unlikely to build for other systems without some work.tar.1 front-end. The LLVM parts of llvm-gcc are written in C++ so a C++ compiler is needed to build them. but helps catch mistakes in the compiler (please report any Building the Ada front-end 190 . Fortran. Some linux distributions provide a version of gcc that supports all three languages (the Ada part often comes as an add-on package to the rest of gcc). Otherwise it is possible to combine two versions of gcc. Compilers known to work with the LLVM 2. C++. The only platform for which the Ada front-end is known to build is 32 bit intel x86 running linux. To check out the 4.ada" to EXTRALANGS. The build requires having a compiler that supports Ada. Building the Ada front-end Building with support for Ada amounts to following the directions in the top-level README. it is wise to build the compiler with checking enabled. Building the Ada front-end 3. 2.2. Retrieve the appropriate llvm-gcc-4. 2.source. Documentation for the LLVM System at SVN head Building the LLVM GCC Front-End 1. C.

gz tar xzf llvm-2.tar. Make a build directory llvm-gcc-4.source..5/llvm-2.5 llvm or check out the latest version from subversion: svn co http://llvm. Configure llvm-gcc (here it is configured to install into /usr/local).2-2.5.5/llvm-gcc-4.tar. The Ada front-end fails to bootstrap.2 source and unpack it: wget http://llvm. Download the llvm-gcc-4.gz mv llvm-2. Download the LLVM source and unpack it: wget http://llvm./llvm/configure --prefix=/usr/local --enable-optimized --enable-assertions If you have a multi-compiler setup and the C++ compiler is not the default.org/svn/llvm-project/llvm/trunk llvm 2. 4./llvm/configure --prefix=/usr/local --enable-optimized --enabl To compile without checking (not recommended).5.2 3..2 or check out the latest version from subversion: svn co http://llvm. 5.5.2-2.2-objects 8. replace --enable-assertions with --disable-assertions.gz tar xzf llvm-gcc-4. then you can configure like this: CXX=PATH_TO_C++_COMPILER .org/releases/2. To turn off these checks Building the Ada front-end 191 . Install LLVM (optional): make install 7.gz mv llvm-gcc4.5.2/trunk llvm-gcc-4. Supposing appropriate compilers are available. Documentation for the LLVM System at SVN head problems using LLVM bugzilla).tar.org/svn/llvm-project/llvm-gcc-4..org/releases/2. due to lack of LLVM support for setjmp/longjmp style exception handling (used internally by the compiler). Make a build directory llvm-objects for llvm and make it the current directory: mkdir llvm-objects cd llvm-objects 4. Build LLVM: make 6. mkdir llvm-gcc-4.source. llvm-gcc with Ada support can be built on an x86-32 linux box using the following recipe: 1.source llvm-gcc-4.tar.5. so you must specify --disable-bootstrap. Configure LLVM (here it is configured to install into /usr/local): . The --enable-checking flag turns on sanity checks inside the compiler.2-objects for llvm-gcc and make it the current directory: cd .2-objects cd llvm-gcc-4.2-2.

/llvm-objects \ --disable-bootstrap --disable-multilib 9. adding ".c.c++. LLVM Compiler Infrastructure Last modified: $Date: 2009-07-05 07:01:44 -0500 (Sun. follow the directions in the top-level README.. More information is available in the FAQ. .fortran License Information The LLVM GCC frontend is licensed to you under the GNU General Public License and the GNU Lesser General Public License.c \ --enable-checking --enable-llvm=$PWD/.LIB for more details. Additional languages can be appended to the --enable-languages switch./llvm-objects \ --disable-bootstrap --disable-multilib If you have a multi-compiler setup. for example --enable-languages=ada./llvm-gcc-4. replace --enable-checking with --disable-checking.2/configure --prefix=/usr/local --enable-languages=ada. Build and install the compiler: make make install Building the Fortran front-end To build with support for Fortran. 05 Jul 2009) $ License Information 192 .LLVM file./llvm-gcc-4. Please see the files COPYING and COPYING.. then you can configure like this: export CC=PATH_TO_C_AND_ADA_COMPILER export CXX=PATH_TO_C++_COMPILER .2/configure --prefix=/usr/local --enable-languages=ada. for example: EXTRALANGS=..c \ --enable-checking --enable-llvm=$PWD/. Documentation for the LLVM System at SVN head (not recommended).fortran" to EXTRALANGS..

Documentation for the LLVM System at SVN head This page has moved here. LLVM Compiler Infrastructure Last modified: $Date: 2008-02-13 17:46:10 +0100 (Wed. 13 Feb 2008) $ License Information 193 .

These settings are not optimal for most desktop systems. and the lack of assertions makes it hard to debug problems in user code. This will allow users to build with RTTI enabled and still inherit from LLVM classes.7 installed at the same time to support apps developed against each. LLVM's API changes with each release. Also available by setting DISABLE_ASSERTIONS=0|1 in make's environment. Debian. Documentation for the LLVM System at SVN head Advice on Packaging LLVM 1. Shared Library 5. This defaults to enabled regardless of the optimization setting. turns off debug symbols.g. so you should turn it back on to let users debug their programs. Shared Library Configure with --enable-shared to build libLLVM-major. Overview 2. This saves lots of binary size at the cost of some startup time. Dependencies --enable-libffi Depend on libffi to allow the LLVM interpreter to call external functions. This defaults to enabled when not in a checkout. by default.. for example. C++ Features RTTI LLVM disables RTTI by default.(so|dylib) and link the tools against it. such a build is currently incompatible with users who build without defining NDEBUG. This defaults to disabled when optimizing. C++ Features 4. both LLVM-2. Compile Flags LLVM runs much more quickly when it's optimized and assertions are removed. Dependencies Overview LLVM sets certain default configure options to make sure our developers don't break things for constrained platforms. MacPorts.) will tweak them.minor. The following configure flags are relevant: --disable-assertions Builds LLVM with NDEBUG defined. Compile Flags 3. We recommend allowing users to install both optimized and debug versions of LLVM in parallel. Redhat.6 and LLVM-2. but it slows things down. --with-oprofile License Information 194 . and we hope that packagers (e. Also available by setting ENABLE_OPTIMIZED=0|1 in make's environment. Changes the LLVM ABI. etc. so users are likely to want. Add REQUIRES_RTTI=1 to your environment while running make to re-enable it. However. --enable-debug-symbols Builds LLVM with -g. --enable-optimized (For svn checkouts) Builds LLVM with -O2 and. This document lists settings we suggest you tweak. Also available by setting DEBUG_SYMBOLS=0|1 in make's environment.

26 Feb 2010) $ License Information 195 .4) to let the LLVM JIT tell oprofile about function addresses and line numbers. The LLVM Compiler Infrastructure Last modified: $Date: 2010-02-26 16:25:06 -0600 (Fri. Documentation for the LLVM System at SVN head Depend on libopagent (>=version 0.9.

Documentation for the LLVM System at SVN head The LLVM Lexicon NOTE: This document is a work in progress! Table Of Contents -A- ADCE -B- BURS -C- CSE -D- DAG Derived Pointer DSA DSE -G- GC -I- IPA IPO ISel -L- LCSSA LICM Load-VN -O- Object Pointer -P- PRE -R- RAUW Reassociation Root -S- Safe Point SCC SCCP SDISel SRoA Stack Map Definitions -A- ADCE Aggressive Dead Code Elimination -B- BURS Bottom Up Rewriting System—A method of instruction selection for code generation. For example (a+b)*(a+b) has two subexpressions that are the same: (a+b). License Information 196 . An optimization that removes common subexpression compuation. This optimization would perform the addition only once and then perform the multiply (but only if it's compulationally correct/safe). -C- CSE Common Subexpression Elimination. An example is the BURG tool.

While a derived pointer is live. otherwise the collector might free the referenced object. ISel Instruction Selection. This term is used in opposition to derived pointer. -L- LCSSA Loop-Closed Static Single Assignment Form LICM Loop Invariant Code Motion Load-VN Load Value Numbering -O- Object Pointer A pointer to an object such that the garbage collector is able to trace references contained within the object. With copying collectors. Documentation for the LLVM System at SVN head -D- DAG Directed Acyclic Graph Derived Pointer A pointer to the interior of an object. -I- IPA Inter-Procedural Analysis. derived pointers pose an additional hazard that they may be invalidated at any safe point. The practice of using reachability analysis instead of explicit memory management to reclaim unused memory. the region of memory which is managed using reachability analysis. IPO Inter-Procedural Optimization. -H- Heap In garbage collection. Refers to any variety of code optimization that occurs between procedures. such that a garbage collector is unable to use the pointer for reachability analysis. This term is used in opposition to object pointer. functions or compilation units (modules). the corresponding object pointer must be kept in a root. functions or compilation units (modules). License Information 197 . Refers to any variety of code analysis that occurs between procedures. DSA Data Structure Analysis DSE Dead Store Elimination -G- GC Garbage Collection.

a pointer variable lying outside of the heap from which the collector begins its reachability analysis. SDISel Selection DAG Instruction Selection. changing (A+B-A) into (B+A-A). The LLVM Team The LLVM Compiler Infrastructure Last modified: $Date: 2008-12-14 02:01:51 -0600 (Sun. and Constant::replaceUsesOfWithOnConstant() implement the replacement of one Value with another by iterating over its def/use chain and fixing up all of the pointers to point to the new value. permitting it to be optimized into (B+0) then (B). it is necessary to identify stack roots so that reachability analysis may proceed. Documentation for the LLVM System at SVN head -P- PRE Partial Redundancy Elimination -R- RAUW An abbreviation for Replace All Uses With. In the context of code generation. The functions User::replaceUsesOfWith(). so instead the information may is calculated only at designated safe points. Reassociation Rearranging associative expressions to promote better redundancy elimination and other optimization. derived pointers must not be retained across safe points and object pointers must be reloaded from stack roots. It may be infeasible to provide this information for every instruction. Value::replaceAllUsesWith(). -S- Safe Point In garbage collection. metadata emitted by the code generator which identifies roots within the stack frame of an executing function. For example. SCC Strongly Connected Component SCCP Sparse Conditional Constant Propagation SRoA Scalar Replacement of Aggregates SSA Static Single Assignment Stack Map In garbage collection. 14 Dec 2008) $ License Information 198 . See also def/use chains. Root In garbage collection. With a copying collector. "root" almost always refers to a "stack root" -- a local or temporary variable within an executing function.

h" ◊ <vector> ◊ <deque> ◊ <list> ◊ llvm/ADT/ilist. Important and useful LLVM APIs ♦ The isa<>. Picking the Right Data Structure for a Task ♦ Sequential Containers (std::vector.h" ◊ "llvm/ADT/DenseSet.h" ◊ "llvm/ADT/FoldingSet. SmallSet.h" ◊ "llvm/ADT/DenseMap. General Information ♦ The C++ Standard Template Library 3.h ◊ Other Sequential Container Options ♦ Set-Like Containers (std::set. std::list.h" ◊ <set> ◊ "llvm/ADT/SetVector. Helpful Hints for Common Operations ♦ Basic Inspection and Traversal Routines ◊ Iterating over the BasicBlocks in a Function License Information 199 .h" ◊ "llvm/ADT/IndexedMap. etc) ◊ A sorted 'vector' ◊ "llvm/ADT/StringMap.h" ◊ "llvm/ADT/ValueMap.h" ◊ <map> ◊ Other Map-Like Container Options ♦ String-like containers ♦ BitVector-like containers ◊ A dense bitvector ◊ A "small" dense bitvector ◊ A sparse bitvector 5. cast<> and dyn_cast<> templates ♦ Passing strings (the StringRef and Twine classes) ◊ The StringRef class ◊ The Twine class ♦ The DEBUG() macro and -debug option ◊ Fine grained debug info with DEBUG_TYPE and the -debug-only option ♦ The Statistic class & -stats option ♦ Viewing graphs while debugging code 4.h" ◊ Other Set-Like ContainerOptions ♦ Map-Like Containers (std::map. Documentation for the LLVM System at SVN head LLVM Programmer's Manual 1. DenseMap. SetVector. etc) ◊ A sorted 'vector' ◊ "llvm/ADT/SmallSet. etc) ◊ Fixed Size Arrays ◊ Heap Allocated Arrays ◊ "llvm/ADT/SmallVector.h" ◊ "llvm/ADT/UniqueVector.h" ◊ "llvm/ADT/SmallPtrSet. Introduction 2.

Joel Stanley. Note that this manual is not intended to serve as a replacement for reading the source code. Advanced Topics ♦ LLVM Type Resolution ◊ Basic Recursive Type Construction ◊ The refineAbstractTypeTo method ◊ The PATypeHolder Class ◊ The AbstractTypeUser Class ♦ The ValueSymbolTable and TypeSymbolTable classes ♦ The User and owned Use classes' memory layout 8. Threads and LLVM ♦ Entering and Exiting Multithreaded Mode ♦ Ending execution with llvm_shutdown() ♦ Lazy initialization with ManagedStatic ♦ Achieving Isolation with LLVMContext ♦ Threads and the JIT 7. License Information 200 . how it works. It assumes that you know the basics of LLVM and are interested in writing transformations or otherwise analyzing or manipulating the code. so if you think there should be a method in one of these classes to do something. Documentation for the LLVM System at SVN head ◊ Iterating over the Instructions in a BasicBlock ◊ Iterating over the Instructions in a Function ◊ Turning an iterator into a class pointer ◊ Finding call sites: a more complex example ◊ Treating calls and invokes the same way ◊ Iterating over def-use & use-def chains ◊ Iterating over predecessors & successors of blocks ♦ Making simple changes ◊ Creating and inserting new Instructions ◊ Deleting Instructions ◊ Replacing an Instruction with another Value ◊ Deleting GlobalVariables ♦ How to Create Types 6. The Core LLVM Class Hierarchy Reference ♦ The Type class ♦ The Module class ♦ The Value class ◊ The User class ⋅ The Instruction class ⋅ The Constant class • The GlobalValue class ♦ The Function class ♦ The GlobalVariable class ◊ The BasicBlock class ◊ The Argument class Written by Chris Lattner. This document should get you oriented so that you can find your way in the continuously growing source code that makes up the LLVM infrastructure. and what LLVM code looks like. Dinakar Dhurjati. This manual is not intended to explain what LLVM is. Reid Spencer and Owen Anderson Introduction This document is meant to highlight some of the important classes and interfaces available in the LLVM source-base. Gabor Greif.

CFG traversal routines.an excellent reference for the STL and other parts of the standard C++ library. you must know what they do and how they work. 3. CVS Branch and Tag Primer 2.This is an O'Reilly book in the making. and useful utilities like the InstVisitor template. 2nd ed. All of these templates are defined in the llvm/Support/Casting. The C++ Standard Template Library LLVM makes heavy use of the C++ Standard Template Library (STL).Contains a useful Introduction to the STL. C++ In a Nutshell . C++ Frequently Asked Questions 4. Documentation for the LLVM System at SVN head but it's not listed. Bruce Eckel's Thinking in C++. 5. Other useful references 1. so it will not be discussed in this document. and several books on the subject that you can get. There are many good pages that discuss the STL. Here are some useful links: 1. and the second describes the Core LLVM classes. cast<> and dyn_cast<> templates The LLVM source-base makes extensive use of a custom form of RTTI. 2. such as dominator information. or have seen before. perhaps much more than you are used to. but they don't have some drawbacks (primarily stemming from the fact that dynamic_cast<> only works on classes that have a v-table). get the book). Using static and shared libraries across platforms Important and useful LLVM APIs Here we highlight some LLVM APIs that are generally useful and good to know about when writing transformations. check the source. The first section of this document describes general information that is useful to know when working in the LLVM infrastructure. Volume 2 Revision 4. you might want to do a little background reading in the techniques used and capabilities of the library. The isa<>. General Information This section contains general information that is useful if you are working in the LLVM source-base. It has a decent Standard Library Reference that rivals Dinkumware's. You are also encouraged to take a look at the LLVM Coding Standards guide which focuses on how to write maintainable code more than where to put your curly braces.0 (even better. SGI's STL Programmer's Guide . Because they are used so often. isa<>: License Information 201 . Links to the doxygen sources are provided to make this as easy as possible. Bjarne Stroustrup's C++ Page 6.h file (note that you very rarely have to include this file directly). and is unfortunately no longer free since the book has been published. but that isn't specific to any particular API. These templates have many similarities to the C++ dynamic_cast<> operator. Because of this. Dinkumware C++ Library reference . In the future this manual will be extended with information describing how to use extension libraries.

The isa<> operator works exactly like the Java "instanceof" operator. It returns true or false depending on whether a reference or pointer points to an instance of the specified class. This can be very useful for constraint checking of various sorts (example below).

cast<>:

The cast<> operator is a "checked cast" operation. It converts a pointer or reference from a base class to a derived class, causing an assertion failure if it is not really an instance of the right type. This should be used in cases where you have some information that makes you believe that something is of the right type. An example of the isa<> and cast<> template is:

static bool isLoopInvariant(const Value *V, const Loop *L) {
  if (isa<Constant>(V) || isa<Argument>(V) || isa<GlobalValue>(V))
    return true;

  // Otherwise, it must be an instruction...
  return !L->contains(cast<Instruction>(V)->getParent());
}

Note that you should not use an isa<> test followed by a cast<>; for that use the dyn_cast<> operator.

dyn_cast<>:

The dyn_cast<> operator is a "checking cast" operation. It checks to see if the operand is of the specified type, and if so, returns a pointer to it (this operator does not work with references). If the operand is not of the correct type, a null pointer is returned. Thus, this works very much like the dynamic_cast<> operator in C++, and should be used in the same circumstances. Typically, the dyn_cast<> operator is used in an if statement or some other flow control statement like this:

if (AllocationInst *AI = dyn_cast<AllocationInst>(Val)) {
  // ...
}

This form of the if statement effectively combines together a call to isa<> and a call to cast<> into one statement, which is very convenient.

Note that the dyn_cast<> operator, like C++'s dynamic_cast<> or Java's instanceof operator, can be abused. In particular, you should not use big chained if/then/else blocks to check for lots of different variants of classes. If you find yourself wanting to do this, it is much cleaner and more efficient to use the InstVisitor class to dispatch over the instruction type directly.

cast_or_null<>:

The cast_or_null<> operator works just like the cast<> operator, except that it allows for a null pointer as an argument (which it then propagates). This can sometimes be useful, allowing you to combine several null checks into one.

dyn_cast_or_null<>:

The dyn_cast_or_null<> operator works just like the dyn_cast<> operator, except that it allows for a null pointer as an argument (which it then propagates). This can sometimes be useful, allowing you to combine several null checks into one.

These five templates can be used with any classes, whether they have a v-table or not. To add support for these templates, you simply need to add classof static methods to the class you are interested casting to. Describing this is currently outside the scope of this document, but there are lots of examples in the LLVM source base.
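The classof hooks themselves are small. As a rough, hypothetical sketch (the Shape and Circle classes and the ShapeKind tag below are invented for illustration and are not part of the LLVM tree), a hierarchy might opt into these templates like this:

class Shape {
public:
  enum ShapeKind { CircleKind, SquareKind };
private:
  ShapeKind Kind;
public:
  explicit Shape(ShapeKind K) : Kind(K) {}
  ShapeKind getKind() const { return Kind; }

  // Every Shape is trivially a Shape.
  static bool classof(const Shape *) { return true; }
};

class Circle : public Shape {
public:
  Circle() : Shape(CircleKind) {}

  // A Shape is a Circle exactly when its kind tag says so.
  static bool classof(const Circle *) { return true; }
  static bool classof(const Shape *S) { return S->getKind() == CircleKind; }
};

With the classof methods in place, client code can write isa<Circle>(S), cast<Circle>(S) and dyn_cast<Circle>(S) just as it does for the LLVM classes shown above.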

find("foo"). Two important examples are the Value class -..cmp"). These are generic classes. a C strings.. etc. Twines can be implicitly constructed as the result of the plus operator applied to strings (i. we do have several important APIs which take strings.find(std::string("bar")). You should rarely use the StringRef class directly. or explicitly with a character pointer and length. They are intended solely for use when defining a function which should be able to efficiently accept concatenated strings. The twine delays the actual concatenation of strings until it is actually required. a common LLVM paradigm is to name one instruction based on the name of another instruction with a suffix. but does not require heap allocation.. // Lookup "foo" Map. 4)). As with a StringRef. See "llvm/ADT/StringRef. // Lookup "bar" Map. Instead. The Twine class The Twine class is an efficient way for APIs to accept concatenated strings. at which point it can be efficiently rendered directly into a character array. Documentation for the LLVM System at SVN head Passing strings (the StringRef and Twine classes) Although LLVM generally does not do much string manipulation.find(StringRef("\0baz". because it contains pointers to external memory it is not generally safe to store an instance of the class (unless you know that the external storage will not be freed). See "llvm/ADT/Twine. // Lookup "\0baz" Similarly. -- and the StringMap class which is used extensively in LLVM and Clang. The StringRef class The StringRef data type represents a reference to a constant string (a character array and a length) and supports the common operations available on std:string. For example. which can be used directly or converted to an std::string using the str member function. It can be implicitly constructed using a C style null-terminated string. The Twine class is effectively a lightweight rope which points to temporary (stack allocated) objects. For example. Twine objects point to external memory and should almost never be stored or mentioned directly. License Information 203 . SO->getName() + ".which has names for instructions. an std::string. APIs which need to return a string may return a StringRef instance. they cannot simply take a const char *. many LLVM APIs use a const StringRef& or a const Twine& for passing strings efficiently. an std::string. functions. the StringRef find function is declared as: iterator find(const StringRef &Key)..h" for more information. and they need to be able to accept strings which may have embedded null characters. for example: New = CmpInst::Create(. The DEBUG() macro and -debug option Often when working on your pass you will put a bunch of debugging printouts and other code into your pass. Therefore.h" for more information. and taking a const std::string& requires clients to perform a heap allocation which is usually unnecessary. and clients can call it using any one of: Map. or a StringRef).e. This avoids unnecessary heap allocation involved in constructing the temporary results of string concatenation.

After you get it working, you want to remove it, but you may need it again in the future (to work out new bugs that you run across).

Naturally, because of this, you don't want to delete the debug printouts, but you don't want them to always be noisy. A standard compromise is to comment them out, allowing you to enable them if you need them in the future.

The "llvm/Support/Debug.h" file provides a macro named DEBUG() that is a much nicer solution to this problem. Basically, you can put arbitrary code into the argument of the DEBUG macro, and it is only executed if 'opt' (or any other tool) is run with the '-debug' command line argument:

DEBUG(errs() << "I am here!\n");

Then you can run your pass like this:

$ opt < a.bc > /dev/null -mypass
<no output>
$ opt < a.bc > /dev/null -mypass -debug
I am here!

Using the DEBUG() macro instead of a home-brewed solution allows you to not have to create "yet another" command line option for the debug output for your pass. Note that DEBUG() macros are disabled for optimized builds, so they do not cause a performance impact at all (for the same reason, they should also not contain side-effects!).

One additional nice thing about the DEBUG() macro is that you can enable or disable it directly in gdb. Just use "set DebugFlag=0" or "set DebugFlag=1" from gdb if the program is running. If the program hasn't been started yet, you can always just run it with -debug.

Fine grained debug info with DEBUG_TYPE and the -debug-only option

Sometimes you may find yourself in a situation where enabling -debug just turns on too much information (such as when working on the code generator). If you want to enable debug information with more fine-grained control, you can define the DEBUG_TYPE macro and use the -debug-only option as follows:

#undef  DEBUG_TYPE
DEBUG(errs() << "No debug type\n");
#define DEBUG_TYPE "foo"
DEBUG(errs() << "'foo' debug type\n");
#undef  DEBUG_TYPE
#define DEBUG_TYPE "bar"
DEBUG(errs() << "'bar' debug type\n");
#undef  DEBUG_TYPE
#define DEBUG_TYPE ""
DEBUG(errs() << "No debug type (2)\n");

Then you can run your pass like this:

$ opt < a.bc > /dev/null -mypass
<no output>
$ opt < a.bc > /dev/null -mypass -debug
No debug type
'foo' debug type
'bar' debug type
No debug type (2)
$ opt < a.bc > /dev/null -mypass -debug-only=foo

and you're interested to see how many times it makes a certain transformation. whose name is specified by the first argument. even if the source lives in multiple files. for example. and the description is taken from the second argument. because there is no system in place to ensure that names do not conflict. Whenever you make a transformation. Define your statistic like this: #define DEBUG_TYPE "mypassname" // This goes before any #includes.bc > /dev/null . the preceding example could be written as: DEBUG_WITH_TYPE("". If two different modules use the same string. errs() << "No debug type (2)\n").. The STATISTIC macro defines a static variable. It is useful to see what optimizations are contributing to making a particular program run faster.h". to specify the debug type for the entire module (if you do this before you #include "llvm/Support/Debug. It takes an additional first parameter.bc > /dev/null -mypass -debug-only=bar 'bar' debug type Of course. this is a real pain and not very useful for big programs. use the '-stats' option: $ opt -stats -mypassname < program. DEBUG_WITH_TYPE("foo". bump the counter: ++NumXForms. // I did stuff! That's all you have to do. Although you can do this with hand inspection. "The # of times I did stuff").. DEBUG_WITH_TYPE("bar". they will all be turned on when the name is specified. you should use names more meaningful than "foo" and "bar". Also.h" file provides a class named Statistic that is used as a unified way to keep track of what the LLVM compiler is doing and how effective various optimizations are. it gives a report that looks like this: License Information 205 . errs() << "'foo' debug type\n"). you should only set DEBUG_TYPE at the top of a file. This allows. The DEBUG_WITH_TYPE macro is also available for situations where you would like to set DEBUG_TYPE. and the calculated information is presented in a uniform manner with the rest of the passes being executed. errs() << "No debug type\n").. errs() << "'bar' debug type\n")). but only for one specific DEBUG statement. all debug information for instruction scheduling to be enabled with -debug-type=InstrSched. but the basics of using it are as follows: 1. which is the type to use. For example. Using the Statistic class makes it very easy to keep track of this information. To get 'opt' to print out the statistics gathered. There are many examples of Statistic uses. you don't have to insert the ugly #undef's). statistics output . When running opt on a C file from the SPEC benchmark suite. STATISTIC(NumXForms. The Statistic class & -stats option The "llvm/ADT/Statistic. or some ad-hoc method. Documentation for the LLVM System at SVN head 'foo' debug type $ opt < a. in practice. The pass name is taken from the DEBUG_TYPE macro. The variable defined ("NumXForms" in this case) acts like an unsigned integer. 2. Often you may run your pass on some big program. DEBUG_WITH_TYPE(""..

then the next call DAG. then you can call DAG. there also exists Function::viewCFGOnly() (does not include the instructions). and make sure 'dot' and 'gv' are in your path. you can usually use something like call DAG.) If you want to restart and clear all the current graph attributes. Number of dead inst eliminate 434 instcombine . LLVM provides several callbacks that are available in a debug build to do exactly that. Number of normal instructions 725 bitcodewriter . Alternatively. Number of setcc instruction eliminated 532 gcse . Number of global variables removed 2 adce . it is nice to instantly visualize these graphs. Number of cast-of-self removed 5046 raise .app/Contents/MacOS/ (or wherever you install it) to your path. and the SelectionDAG::viewGraph() methods. Similarly. Number of blocks simplified Obviously. Number of basic blocks removed 134 cee . If you call the Function::viewCFG() method. Number of branches revectored 49 cee . "color"). On Unix systems with X11. Number of expression trees converted 75 raise . you can sprinkle calls to these functions in your code in places you want to debug. Getting this to work requires a small amount of configuration. Number of load/store peepholes 42 deadtypeelim .viewGraph() would highlight the node in the specified color (choices of colors can be found at colors.setGraphColor(node.clearGraphAttrs().viewGraph() to pop up a window. and each node contains the instructions in the block. Viewing graphs while debugging code Several of the important data structures in LLVM are graphs: for example CFGs made out of LLVM BasicBlocks. License Information 206 . Documentation for the LLVM System at SVN head 7646 bitcodewriter . Number of aux indvars removed 25 instcombine . while debugging various parts of the compiler. SelectionDAG has been extended to make it easier to locate interesting nodes in large complex graphs. "attributes") (choices can be found at Graph Attributes. the current LLVM tool will pop up a window containing the CFG for the function where each basic block is a node in the graph. Number of bitcode bytes written 2817 raise . Number of insts DCEd or constprop'd 3213 raise . for example. CFGs made out of LLVM MachineBasicBlocks. Number of alloca's promoted 1444 cfgsimplify . In many cases. no loop pre-head 75 mem2reg . Number of load insts hoisted 1298 licm . install the graphviz toolkit. Number of unused typenames removed from symtab 392 funcresolve . If you are running on Mac OS/X. with so many optimizations. Number of other getelementptr's formed 138 raise . Number of instructions removed 86 indvars . rerun the LLVM configure script and rebuild LLVM to enable this functionality. Number of insts hoisted to multiple loop preds (bad.setGraphAttrs(node. Number of insts hoisted to a loop pre-header 3 licm . having a unified framework for this stuff is very nice. Number of canonical indvars added 87 indvars . if you call DAG. Number of loads removed 2919 gcse . Number of varargs functions resolved 27 globaldce . for example. Within GDB.) More complex node attributes can be provided with call DAG. Number of oversized instructions 129996 bitcodewriter . From gdb. Making your pass fit well into the framework makes it more maintainable and useful. Once in your system and path are set up. the MachineFunction::viewCFG() and MachineFunction::viewCFGOnly(). and add /Applications/Graphviz. and Instruction Selection DAGs. download and install the Mac OS/X Graphviz program. Number of insts combined 248 licm .
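One convenient way to use the graph-viewing hooks described above is to guard them with the DEBUG() macro so they only fire under -debug and are compiled out of optimized builds. A minimal sketch, with a hypothetical helper name:

#include "llvm/Function.h"
#include "llvm/Support/Debug.h"

void debugShowCFG(Function &F) {
  DEBUG(F.viewCFG());      // pop up a window with the full CFG
  DEBUG(F.viewCFGOnly());  // same, but without the instructions
}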

which dwarf the cost of adding the elements to the container. only use them if you need one of these capabilities. for example. Fixed Size Arrays Fixed size arrays are very simple and very fast. etc) There are a variety of sequential containers available for you. They are good if the number of elements is variable. Doing so avoids (relatively) expensive malloc/free calls. you can fine tune the memory use. if you know how many elements you will need before the array is allocated. while automatically eliminating duplicates. They are good if you know exactly how many elements you have. etc. based on your needs. If you need that. Map-like containers generally do not support efficient reverse mapping (values to keys). • a sequential container provides the most efficient way to add elements and keeps track of the order they are added to the collection. or you have a (low) upper bound on how many you have. but do not support efficient look-up based on a key. License Information 207 . use two maps. If you have a vector that usually only contains a few elements (but could contain many). you should use: • a map-like container if you need efficient look-up of an value based on another value. and cache behaviors of access by intelligently picking a member of the category. constant factors. Set-like containers are more expensive than sequential containers. They permit duplicates and support efficient iteration. Bit containers require a maximum of 1 bit for each identifier you want to store. Map-like containers are the most expensive sort. "llvm/ADT/SmallVector. supports efficient push_back/pop_back operations. N> is a simple class that looks and smells just like vector<Type>: it supports efficient iteration. Some map-like containers also support efficient iteration through the keys in sorted order. std::list. Sequential Containers (std::vector. lays out elements in memory order (so you can do pointer arithmetic between elements). or a map-like container? The most important thing when choosing a container is the algorithmic properties of how you plan to access the container. consider a SmallVector). The first step is a choose your own adventure: do you want a sequential container. Once the proper category of container is determined. Map-like containers also support efficient queries for containment (whether a key is in the map). Heap Allocated Arrays Heap allocated arrays (new[] + delete[]) are also simple. Documentation for the LLVM System at SVN head Picking the Right Data Structure for a Task LLVM has a plethora of data structures in the llvm/ADT/ directory. the constructor and destructors will be run for every element in the array (re-sizable vectors only construct those elements actually used). and if the array is usually large (if not. and we commonly use STL data structures. • a set-like container if you need to put a bunch of stuff into a container that automatically eliminates duplicates. Also note that if you are allocating an array of a type with a constructor. Note that constant factors and cache behavior can be a big deal. it's much better to use SmallVector than vector . supports efficient random access to its elements. The cost of a heap allocated array is the cost of the new/delete (aka malloc/free). Based on that.h" SmallVector<Type. Some set-like containers support efficient iteration through the elements in sorted order. • a bit container provides an efficient way to store and perform set operations on sets of numeric id's. Pick the first in this section that will do what you want. 
This section describes the trade-offs you should consider when you pick one. • a string container is a specialized sequential container or reference structure that is used for character or byte arrays. a set-like container.

SmallVector also provides a nice portable and efficient replacement for alloca. Documentation for the LLVM System at SVN head The advantage of SmallVector is that it allocates space for some number of elements (N) in the object itself. ) { std::vector<foo> V. One worthwhile note about std::vector: avoid code like this: for ( . Like std::vector. This can be a big win in cases where the malloc/free call is far more expensive than the code that fiddles around with the elements. This is good for vectors that are "usually small" (e.. use V. the iterator invalidation characteristics of std::list are stronger than that of a vector class: inserting or removing an element into the list does not invalidate iterator or pointers to other elements in the list. for ( . <deque> std::deque is. in some senses. <vector> std::vector is well loved and respected. Because of this. a generalized version of std::vector. write this as: std::vector<foo> V. License Information 208 . As such.. use std::vector or something cheaper. no malloc is performed. this makes the size of the SmallVector itself large. In exchange for this extra flexibility. V. but unlike std::vector or SmallVector). In addition. std::list supports efficient access to both ends of the list (like std::deque. It is useful when SmallVector isn't: when the size of the vector is often large (thus the small optimization will rarely be a benefit) or if you will be allocating many instances of the vector itself (which would waste space for elements that aren't in the container). <list> std::list is an extremely inefficient class that is rarely useful. It performs a heap allocation for every element inserted into it. not random access iteration.. so you don't want to allocate lots of them (doing so will waste a lot of space). vector is also useful when interfacing with code that expects vectors :). std::list also only supports bidirectional iteration. if the SmallVector is dynamically smaller than N. it provides constant time random access and other similar properties.. SmallVectors are most useful when on the stack. thus having an extremely high constant factor. particularly for small data types.g. std::deque has significantly higher constant factor costs than std::vector. } Doing so will save (at least) one heap allocation and free per iteration of the loop. On the other hand. In exchange for this high cost. but it also provides efficient access to the front of the list. It does not guarantee continuity of elements within memory. } Instead.clear(). If possible. ) { use V. the number of predecessors/successors of a block is usually less than 8).

and ilists are guaranteed to support a constant-time splice operation. but it provides some novel characteristics. ilist_node<T>s are meant to be embedded in the node type T. the memory overhead of the associated sentinels License Information 209 . ilist has the same drawbacks as std::list.h ilist_node<T> implements a the forward and backward links that are expected by the ilist<T> (and analogous containers) in the default manner.h ilist<T> implements an 'intrusive' doubly-linked list. These constraints allow for some implementation freedom to the ilist how to allocate and store the sentinel. providing the back-link to the last element. because it requires the element to store and provide access to the prev/next pointers for the list. etc. the traits class is informed when an element is inserted or removed from the list. it needs to support the standard container operations. ilist_traits<T> is a public base of this class and can be used for a wide variety of customizations. such as begin and end iterators. Related classes of interest are explained in the following subsections: • ilist_traits • iplist • llvm/ADT/ilist_node. iplist<T> (and consequently ilist<T>) publicly derive from this traits class. iplist iplist<T> is ilist<T>'s base and as such supports a slightly narrower interface. The only sensible solution to this problem is to allocate a so-called sentinel along with the intrusive list. In particular. To be a good citizen in the C++ ecosystem. it can efficiently store polymorphic objects. Notably. which serves as the end iterator. Sentinels ilists have another specialty that must be considered. inserters from T& are absent. it may break down when T does not provide a default constructor. the operator-- must work correctly on the end iterator in the case of non-empty ilists. usually T publicly derives from ilist_node<T>. llvm/ADT/ilist_node.h • Sentinels ilist_traits ilist_traits<T> is ilist<T>'s customization mechanism. Also. It is intrusive. Documentation for the LLVM System at SVN head llvm/ADT/ilist. The corresponding policy is dictated by ilist_traits<T>. However conforming to the C++ convention it is illegal to operator++ beyond the sentinel and it also must not be dereferenced. While the default policy is sufficient in most cases. Also. By default a T gets heap-allocated whenever the need for a sentinel arises. which is why these are implemented with ilists. in the case of many instances of ilists. These properties are exactly what we want for things like Instructions and basic blocks. and additionally requires an ilist_traits implementation for the element type.

std::priority_queue. To alleviate the situation with numerous and voluminous T-sentinels. a SmallSet<Type. There are several different choices for how to do this. "llvm/ADT/SmallPtrSet. which serves as the back-link of the sentinel. The drawback is that the interface is quite small: it supports insertion. Set-Like Containers (std::set. N> is a good choice. This set has space for N elements in place (thus. std::stack. no malloc traffic is required) and accesses them with a simple linear search. The ilist is augmented by an extra pointer. it falls back to std::set. etc) Set-like containers are useful when you need to canonicalize multiple values into a single representation.h" If you have a set-like data structure that is usually small and whose elements are reasonably small. If more than 'N' insertions are performed. Ghostly sentinels are obtained by specially-crafted ilist_traits<T> which superpose the sentinel with the ilist instance in memory. is easy to address (iterators in the final vector are just indices or pointers). SetVector. but for pointers it uses something far better. "llvm/ADT/DenseSet. leading to ghostly sentinels. unlike std::set. Documentation for the LLVM System at SVN head is wasted. sometimes a trick is employed. SmallPtrSet). Other Sequential Container options Other STL containers are available. Also. These provide simplified access to an underlying container but don't affect the cost of the container itself. providing various trade-offs. "llvm/ADT/SmallSet. a single quadratically probed hash table is allocated and grows as needed. if the set is dynamically smaller than N. such as std::string.h" DenseSet is a simple quadratically probed hash table. There are also various STL adapter classes such as std::queue. This approach works really well if your usage pattern has these two distinct phases (insert then query). Note that. DenseSet is a great way to unique small License Information 210 . it allocates a more expensive representation that guarantees efficient access (for most types. This combination provides the several nice properties: the result data is contiguous in memory (good for cache locality). A sorted 'vector' If you intend to insert a lot of elements. etc. This is the only field in the ghostly sentinel which can be legally accessed. the values visited by the iterators are not visited in sorted order. providing extremely efficient access (constant time insertion/deleting/queries with low constant factors) and is very stingy with malloc traffic. has few allocations. queries and erasing. the iterators of SmallPtrSet are invalidated whenever an insertion occurs. but does not support iteration. It excels at supporting small values: it uses a single allocation to hold all of the pairs that are currently inserted in the set. and can be efficiently queried with a standard binary or radix search.h" SmallPtrSet has all the advantages of SmallSet (and a SmallSet of pointers is transparently implemented with a SmallPtrSet). then do a lot of queries. which is relative to the ilist's this pointer. The magic of this class is that it handles small sets extremely efficiently. and can be coupled with a good choice of sequential container. Pointer arithmetic is used to obtain the sentinel. SmallSet. a great approach is to use a vector (or other sequential container) with std::sort+std::unique to remove duplicates. but gracefully handles extremely large sets without loss of efficiency. but also supports iterators. When the set grows beyond 'N' elements.

using the set-like container for uniquing and the sequential container for iteration. The important property that this provides is efficient insertion with uniquing (duplicate elements are ignored) with iteration support. Because FoldingSet uses intrusive links. The query either returns the element matching the ID or it returns an opaque ID that indicates where insertion should take place. Note that DenseSet has the same requirements for the value type that DenseMap has. which is faster. "llvm/ADT/SetVector. The difference between SetVector and other sets is that the order of iteration is guaranteed to match the order of insertion into the SetVector. then try inserting it into a set only to find out it already exists. "llvm/ADT/FoldingSet. To support this style of client. This property is really important for things like sets of pointers. insertion and removal. unless you use it's "pop_back" method. it can support polymorphic objects in the set (for example. Because pointer values are non-deterministic (e. The advantages of std::set are that its iterators are stable (deleting or inserting an element from the set does not affect iterators or pointers to other elements) and that iteration over the set is guaranteed to be in sorted order. pointers to the elements are stable: inserting or removing elements does not invalidate any pointers to other elements. which is not particularly fast from a complexity standpoint (particularly if the elements of the set are expensive to compare. then the relative overhead of the pointers and malloc traffic is not a big deal. vary across runs of the program on different machines). at which point we would have to delete it and return the node that already exists. Because the elements are individually allocated. a node in the code generator). The drawback of SetVector is that it requires twice as much space as a normal set and has the sum of constant factors from the set-like container and the sequential container that it uses. which is decent at many things but great at nothing. but we don't want to 'new' a node. you can have SDNode instances mixed with LoadSDNodes). <set> std::set is a reasonable all-around set class. License Information 211 . FoldingSet perform a query with a FoldingSetNodeID (which wraps SmallVector) that can be used to describe the element that we want to query for. Documentation for the LLVM System at SVN head values that are not simple pointers (use SmallPtrSet for pointers).h" LLVM's SetVector<Type> is an adapter class that combines your choice of a set-like container along with a Sequential Container. std::set allocates memory for each element inserted (thus it is very malloc intensive) and typically stores three pointers per element in the set (thus adding a large amount of per-element space overhead). It implements this by inserting elements into both a set-like container and the sequential container. like strings).g. The client has a description of *what* it wants to generate (it knows the opcode and all the operands). It is a combination of a chained hash table with intrusive links (uniqued objects are required to inherit from FoldingSetNode) that uses SmallVector as part of its ID process. but if the elements of the set are small. If the elements in the set are large. Consider a case where you want to implement a "getOrCreateFoo" method for a complex object (for example. std::set is almost never a good choice. iterating over the pointers in the set will not be in a well-defined order. 
and has extremely high constant factors for lookup. Construction of the ID usually does not require heap traffic. SetVector is also expensive to delete elements out of (linear time). It offers guaranteed log(n) performance.h" FoldingSet is an aggregate class that is really good at uniquing expensive-to-create or polymorphic objects. Use it *only* if you need to iterate over the elements in a deterministic order.

Documentation for the LLVM System at SVN head SetVector is an adapter class that defaults to using std::vector and std::set for the underlying containers. std::multiset is useful if you're not interested in elimination of duplicates. and it assigns a unique ID for each value inserted into the set. "llvm/ADT/IndexedMap. The only difference is that your query function (which uses std::lower_bound to get efficient log(n) lookup) should only compare the key. Map-Like Containers (std::map. It should be avoided. This yields the same advantages as sorted vectors for sets. :) A sorted 'vector' If your usage pattern follows a strict insert-then-query approach. DenseMap. etc) Map-like containers are useful when you want to associate data to a key.h" UniqueVector is similar to SetVector. This container guarantees the "(char*)(&Value+1)" points to the key string for a value. "llvm/ADT/UniqueVector. It internally contains a map and a vector. The StringMap is very fast for several reasons: quadratic probing is very cache efficient for lookups. StringMap also provides query methods that take byte ranges. such as std::multiset and the various "hash_set" like containers (whether from C++ TR1 or from the SGI library). the hash value of strings in buckets is not recomputed when lookup up an element. "llvm/ADT/SetVector. The entries in the map must be heap allocated because the strings are variable length. and they are difficult to support efficiently: they are variable length. which defaults to using a SmallVector and SmallSet of a specified size.h" Strings are commonly used as keys in maps. hash table growth does not recompute the hash values for strings already in the table. and each pair in the map is store in a single allocation (the string data is stored in the same allocation as the Value of a pair). and if your sets are dynamically smaller than N. Other Set-Like Container Options The STL provides several other options. so it only ever copies a string if a value is inserted into the table. expensive to copy.h" License Information 212 . but it retains a unique ID for each element inserted into the set. so it is quite expensive. However. It supports mapping an arbitrary range of bytes to an arbitrary other object. If you use this. there are a lot of different ways to do this. etc. where the buckets store a pointer to the heap allocated entries (and some other stuff). "llvm/ADT/StringMap. UniqueVector is very expensive: its cost is the sum of the cost of maintaining both the map and vector. The StringMap implementation uses a quadratically-probed hash table. StringMap rarely has to touch the memory for unrelated objects when looking up a value (even when hash collisions happen). StringMap is a specialized container designed to cope with these issues. The string data (key) and the element object (value) are stored in the same allocation with the string data immediately after the element object. it has high complexity. and produces a lot of malloc traffic. As usual. A sorted vector (where you don't delete duplicate entries) or some other approach is almost always better. but has all the drawbacks of std::set. We never use hash_set and unordered_set because they are generally very expensive (each insertion requires a malloc) and very non-portable. inefficient to hash and compare when long.h" also provides a SmallSetVector class. not both the key and value. you can trivially use the same approach as sorted vectors for set-like containers. you will save a lot of heap traffic. high constant factors.

A sorted vector or some other approach is almost always better. This is useful for cases like virtual registers in the LLVM code generator: they have a dense mapping that is offset by a compile-time constant (the first virtual register ID). however. and what else happens on these two events. Describe twine. When a Value is deleted or RAUW'ed. xref to #string_apis. This is required to tell DenseMap about two special marker values (which can never be inserted into the map) that it needs internally. you must implement a partial specialization of DenseMapInfo for the key that you want. there are only two bit storage containers. Finally. it offers log(n) lookup with an extremely large constant factor. ValueMap will update itself so the new version of the key is mapped to the same value. Other Map-Like Container Options The STL provides several other options. just as if the key were a WeakVH. etc. String-like containers TODO: const char* vs stringref vs smallstring vs std::string. and choosing when to use each is relatively straightforward. std::multimap is useful if you want to map a key to multiple values.g. It excels at supporting small keys and values: it uses a single allocation to hold all of the pairs that are currently inserted in the map. std::map is most useful when your keys or values are very large. The iterators in a densemap are invalidated whenever an insertion occurs. We never use hash_set and unordered_set because they are generally very expensive (each insertion requires a malloc) and very non-portable. or if you need stable iterators into the map (i. unlike map.e. Documentation for the LLVM System at SVN head IndexedMap is a specialized container for mapping small dense integers (or values that can be mapped to small dense integers) to some other type. There are several aspects of DenseMap that you should be aware of.h" ValueMap is a wrapper around a DenseMap mapping Value*s (or subclasses) to another type. if it isn't already supported. imposes a space penalty of 3 pointers per pair in the map. but has all the drawbacks of std::map.h" DenseMap is a simple quadratically probed hash table. DenseMap is a great way to map pointers to pointers. such as std::multimap and the various "hash_map" like containers (whether from C++ TR1 or from the SGI library). <map> std::map has similar characteristics to std::set: it uses a single allocation per pair inserted into the map. they don't get invalidated if an insertion or deletion of another element takes place). if you need to iterate over the collection in sorted order. or map other small types to each other. "llvm/ADT/DenseMap. because DenseMap allocates space for a large number of key/value pairs (it starts with 64 by default). One additional option is std::vector<bool>: we discourage its use for two reasons 1) the implementation in many common compilers (e. You can configure exactly how this happens. Bit storage containers (BitVector. SparseBitVector) Unlike the other containers. "llvm/ADT/ValueMap. It is internally implemented as a vector with a mapping function that maps the keys to the dense integer range. commonly available versions of GCC) is extremely License Information 213 . it will waste a lot of space if your keys or values are large. by passing a Config parameter to the ValueMap template. Also.

but operations are performed one word at a time. For a enumerable sequence of values. The Core LLVM Class Hierarchy Reference contains details and descriptions of the main classes that you should know about. Iterating over the BasicBlocks in a Function License Information 214 . but it is optimized for the case where only a small number of bits. or. As a general statement. testing/setting bits in a SparseBitVector is O(distance away from last set bit). The set operations take time O(size of bitvector). please don't use it. SparseBitVector The SparseBitVector container is much like BitVector. Use the BitVector when you expect the number of set bits to be high (IE a dense set). with one major difference: Only the bits that are set. the standard template library algorithms may be used on them. In our implementation. as well as set operations. xor). This makes the BitVector very fast for set operations compared to other containers. the techniques used to traverse these various data structures are all basically the same. are needed. In any case. the XXXend() function returns an iterator pointing to one past the last valid element of the sequence. Documentation for the LLVM System at SVN head inefficient and 2) the C++ standards committee is likely to deprecate this container and/or change it significantly somehow. this can be slower than BitVector. It supports individual bit setting/testing. setting or testing bits in sorted order (either forwards or reverse) is O(1) worst case. Because the pattern for iteration is common across many different aspects of the program representation. Other data structures are traversed in very similar ways. showing the practical side of LLVM transformations. At this time. and there is some XXXiterator data type that is common between the two operations. as well as making set operations O(number of set bits) instead of O(size of universe). First we show a few common examples of the data structures that need to be traversed. Basic Inspection and Traversal Routines The LLVM compiler infrastructure have many different data structures that may be traversed. are stored. This makes the SparseBitVector much more space efficient than BitVector when the set is sparse. Testing and setting bits within 128 bits (depends on size) of the current bit is also O(1). This is meant to give examples of common idioms used. you should also read about the main classes that you will be working with. and it is easier to remember how to iterate. less than 25 or so. The downside to the SparseBitVector is that setting and testing of random bits is O(N). instead of one bit at a time. Helpful Hints for Common Operations This section describes how to perform some very simple transformations of LLVM code. but slightly less efficiently than a plain BitVector. so SmallBitVector should only be used when larger counts are rare. and its operator[] does not provide an assignable lvalue. BitVector The BitVector container provides a dynamic size set of bits for manipulation. SmallBitVector The SmallBitVector container provides the same interface as BitVector. SmallBitVector does not support set operations (and. and on large SparseBitVectors. It also transparently supports larger bit counts. Following the example of the C++ standard template library. the XXXbegin() function (or method) returns an iterator to the start of the sequence. Because this is a "how-to" section.

It's quite common to have a Function instance that you'd like to transform in some way; in particular, you'd like to manipulate its BasicBlocks. To facilitate this, you'll need to iterate over all of the BasicBlocks that constitute the Function. The following is an example that prints the name of a BasicBlock and the number of Instructions it contains:

// func is a pointer to a Function instance
for (Function::iterator i = func->begin(), e = func->end(); i != e; ++i)
  // Print out the name of the basic block if it has one, and then the
  // number of instructions that it contains
  errs() << "Basic block (name=" << i->getName() << ") has "
         << i->size() << " instructions.\n";

Note that i can be used as if it were a pointer for the purposes of invoking member functions of the Instruction class. This is because the indirection operator is overloaded for the iterator classes. In the above code, the expression i->size() is exactly equivalent to (*i).size() just like you'd expect.

Iterating over the Instructions in a BasicBlock

Just like when dealing with BasicBlocks in Functions, it's easy to iterate over the individual instructions that make up BasicBlocks. Here's a code snippet that prints out each instruction in a BasicBlock:

// blk is a pointer to a BasicBlock instance
for (BasicBlock::iterator i = blk->begin(), e = blk->end(); i != e; ++i)
   // The next statement works since operator<<(ostream&,...)
   // is overloaded for Instruction&
   errs() << *i << "\n";

However, this isn't really the best way to print out the contents of a BasicBlock! Since the ostream operators are overloaded for virtually anything you'll care about, you could have just invoked the print routine on the basic block itself:

errs() << *blk << "\n";

Iterating over the Instructions in a Function

If you're finding that you commonly iterate over a Function's BasicBlocks and then that BasicBlock's Instructions, InstIterator should be used instead. You'll need to include llvm/Support/InstIterator.h, and then instantiate InstIterators explicitly in your code. Here's a small example that shows how to dump all instructions in a function to the standard error stream:

#include "llvm/Support/InstIterator.h"

// F is a pointer to a Function instance
for (inst_iterator I = inst_begin(F), E = inst_end(F); I != E; ++I)
  errs() << *I << "\n";

Easy, isn't it? You can also use InstIterators to fill a work list with its initial contents. For example, if you wanted to initialize a work list to contain all instructions in a Function F, all you would need to do is something like:

std::set<Instruction*> worklist;
// or better yet, SmallPtrSet<Instruction*, 64> worklist;

for (inst_iterator I = inst_begin(F), E = inst_end(F); I != E; ++I)
   worklist.insert(&*I);

The STL set worklist would now contain all instructions in the Function pointed to by F.
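The InstIterator idiom combines naturally with the dyn_cast<> operator described earlier. As a small sketch (the collectCalls helper is hypothetical), this collects every call instruction in a function:

#include "llvm/Function.h"
#include "llvm/Instructions.h"
#include "llvm/Support/InstIterator.h"
#include <vector>

// Walk every instruction in F and keep the ones that are CallInsts.
void collectCalls(Function *F, std::vector<CallInst*> &Calls) {
  for (inst_iterator I = inst_begin(F), E = inst_end(F); I != E; ++I)
    if (CallInst *CI = dyn_cast<CallInst>(&*I))
      Calls.push_back(CI);
}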

you may want to use an InstVisitor to accomplish this in a much more straight-forward manner. In pseudo-code. extracting a reference or a pointer from an iterator is very straight-forward. this is what we want to do: initialize callCounter to zero for each Function f in the Module for each BasicBlock b in f for each Instruction i in b if (i is a CallInst and calls the given function) increment callCounter And the actual code is (remember. The following code snippet illustrates use of the conversion constructors provided by LLVM iterators. it'll be useful to grab a reference (or pointer) to a class instance when all you've got at hand is an iterator.e. you can explicitly grab the iterator of something without actually obtaining it via iteration over some structure: void printNextInstruction(Instruction* inst) { BasicBlock::iterator it(inst).. across every Function) where a certain function (i. Documentation for the LLVM System at SVN head Turning an iterator into a class pointer (and vice-versa) Sometimes. because we're writing a FunctionPass. it refers to the instruction after *inst if (it != inst->getParent()->end()) errs() << *it << "\n". However. Instead of dereferencing the iterator and then taking the address of the result. Well. you can simply assign the iterator to the proper pointer type and you get the dereference and address-of operation as a result of the assignment (behind the scenes. Assuming that i is a BasicBlock::iterator and j is a BasicBlock::const_iterator: Instruction& inst = *i. Thus the last line of the last example. but this example will allow us to explore how you'd do it if you didn't have InstVisitor around. some Function*) is already in scope.. class OurFunctionPass : public FunctionPass { public: OurFunctionPass(): callCounter(0) { } License Information 216 . } Finding call sites: a slightly more complex example Say that you're writing a FunctionPass and would like to count all the locations in the entire module (that is. Instruction *pinst = &*i. the iterators you'll be working with in the LLVM framework are special: they will automatically convert to a ptr-to-instance type whenever they need to. // Grab reference to instruction reference Instruction* pinst = &*i. // Grab pointer to instruction reference const Instruction& inst = *j. By using these. is semantically equivalent to Instruction *pinst = i. As you'll learn later. and this is a constant time operation (very efficient). It's also possible to turn a class pointer into the corresponding iterator. this is a result of overloading casting mechanisms).. ++it. // After this line.. our FunctionPass-derived class simply has to override the runOnFunction method): Function* targetFunc = .

} Alternately. The list of all Values used by a User is known as a use-def chain. for (User::op_iterator i = pi->op_begin(). } } } } private: unsigned callCounter. let's say we have a Function* named F to a particular function foo. be = F. with costs equivalents to that of a bare pointer. you may find that you want to treat CallInsts and InvokeInsts the same way. for (Value::use_iterator i = F->use_begin(). e = pi->op_end().end(). For example. so we might want to iterate over all of the values that a particular instruction uses (that is. if (callInst->getCalledFunction() == targetFunc) ++callCounter. and in other situations... Finding all of the instructions that use foo is as simple as iterating over the def-use chain of F: Function *F = ... with some methods that provide functionality common to CallInsts and InvokeInsts. i != e. Treating calls and invokes the same way You may have noticed that the previous example was a bit oversimplified in that it did not deal with call sites generated by 'invoke' instructions. Iterating over def-use & use-def chains Frequently.. e = F->use_end(). This class has "value semantics": it should be passed by value. so we // need to determine if it's a call to the // function pointed to by m_func or not. } License Information 217 . i != ie.. assignable and constructable. }. ++i) { Value *v = *i. // .begin(). it's common to have an instance of the User Class and need to know what Values are used by it. ++b) { for (BasicBlock::iterator i = b->begin(). the operands of the particular Instruction): Instruction *pi = . i != e.. which includes lots of less closely-related things. ++i) if (Instruction *Inst = dyn_cast<Instruction>(*i)) { errs() << "F is used in instruction:\n". If you look at its definition. not by reference and it should not be dynamically allocated or deallocated using operator new or operator delete. errs() << *Inst << "\n". Instances of class Instruction are common Users. it has only a single pointer member.. The list of all Users of a particular Value is called a def-use chain. even though their most-specific common base class is Instruction. ie = b->end(). It is essentially a wrapper around an Instruction pointer. For these cases. In this. It is efficiently copyable. we might have an instance of the Value Class and we want to determine which Users use the Value. Documentation for the LLVM System at SVN head virtual runOnFunction(Function& F) { for (Function::iterator b = F. ++i) { if (CallInst* callInst = dyn_cast<CallInst>(&*i)) { // We know we've encountered a call instruction. b != be. LLVM provides a handy wrapper class called CallSite.
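A minimal sketch of the CallSite wrapper just described (the helper function is hypothetical, and this assumes the CallSite interface of this era, including the CallSite::get factory):

#include "llvm/Function.h"
#include "llvm/Instructions.h"
#include "llvm/Support/CallSite.h"

// Returns true if I is a call or invoke whose static callee is Target.
static bool callsTarget(Instruction *I, Function *Target) {
  CallSite CS = CallSite::get(I);   // empty CallSite if I is neither
  if (!CS.getInstruction())
    return false;
  return CS.getCalledFunction() == Target;
}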

at run time. and I'm intending to use it within the same Function. This section describes some of the common methods for doing so and gives example code. Making simple changes There are some primitive transformation operations present in the LLVM infrastructure that are worth knowing about. for (pred_iterator PI = pred_begin(BB). where indexLoc is now the logical name of the instruction's execution value. Each Instruction subclass is likely to have varying default parameters which change the semantics of the instruction. If you end up looking at generated LLVM machine code. which is a pointer to an integer on the run time stack.. // . it's fairly common to manipulate the contents of basic blocks. When performing transformations. } Similarly. as this facilitates the debugging of your transformations. PI != E.. an AllocaInst only requires a (const-ptr-to) Type.h". To accomplish this. Just use code like this to iterate over all predecessors of BB: #include "llvm/Support/CFG. I might do: AllocaInst* pa = new AllocaInst(Type::Int32Ty. Thus: AllocaInst* ai = new AllocaInst(Type::Int32Ty). so refer to the doxygen documentation for the subclass of Instruction that you're interested in instantiating. to iterate over successors use succ_iterator/succ_begin/succ_end. For example. E = pred_end(BB). Documentation for the LLVM System at SVN head Iterating over predecessors & successors of blocks Iterating over the predecessors and successors of a block is quite easy with the routines defined in "llvm/Support/CFG. Naming values It is very useful to name the values of instructions when you're able to.. Creating and inserting new Instructions Instantiating Instructions Creation of Instructions is straight-forward: simply call the constructor for the kind of instruction to instantiate and provide the necessary parameters. say that I'm writing a transformation that dynamically allocates space for an integer on the stack. 0.. For example. I place an AllocaInst at the first point in the first BasicBlock of some Function. and that integer is going to be used as some kind of index by some other code.h" BasicBlock *BB = . ++PI) { BasicBlock *Pred = *PI. you associate a logical name with the result of the instruction's execution at run time. "indexLoc"). Inserting instructions License Information 218 . you definitely want to have logical names associated with the results of instructions! By supplying a value for the Name (default) parameter of the Instruction constructor.. will create an AllocaInst instance that represents the allocation of one integer in the current stack frame.

There are essentially two ways to insert an Instruction into an existing sequence of instructions that form a BasicBlock:

• Insertion into an explicit instruction list

Given a BasicBlock* pb, an Instruction* pi within that BasicBlock, and a newly-created instruction we wish to insert before *pi, we do the following:

BasicBlock *pb = ...;
Instruction *pi = ...;
Instruction *newInst = new Instruction(...);

pb->getInstList().insert(pi, newInst); // Inserts newInst before pi in pb

Appending to the end of a BasicBlock is so common that the Instruction class and Instruction-derived classes provide constructors which take a pointer to a BasicBlock to be appended to. Thus, code that looked like:

BasicBlock *pb = ...;
Instruction *newInst = new Instruction(...);

pb->getInstList().push_back(newInst); // Appends newInst to pb

becomes:

BasicBlock *pb = ...;
Instruction *newInst = new Instruction(..., pb);

which is much cleaner, especially if you are creating long instruction streams.

• Insertion into an implicit instruction list

Instruction instances that are already in BasicBlocks are implicitly associated with an existing instruction list: the instruction list of the enclosing basic block. Thus, we could have accomplished the same thing as the above code without being given a BasicBlock by doing:

Instruction *pi = ...;
Instruction *newInst = new Instruction(...);

pi->getParent()->getInstList().insert(pi, newInst);

In fact, this sequence of steps occurs so frequently that the Instruction class and Instruction-derived classes provide constructors which take (as a default parameter) a pointer to an Instruction which the newly-created Instruction should precede. That is, Instruction constructors are capable of inserting the newly-created instance into the BasicBlock of a provided instruction, immediately before that instruction. Using an Instruction constructor with an insertBefore (default) parameter, the above code becomes:

Instruction* pi = ...;
Instruction* newInst = new Instruction(..., pi);

which is much cleaner, especially if you're creating a lot of instructions and adding them to BasicBlocks.
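The new Instruction(...) calls above are shorthand; Instruction itself is abstract, so real code instantiates a concrete subclass. A minimal sketch (hypothetical helper) using BinaryOperator with the insert-before-an-instruction form:

#include "llvm/Instructions.h"

// Create "tmp = add L, R" and insert it immediately before Pi, using the
// insertBefore form of the Create method described above.
void insertAddBefore(Instruction *Pi, Value *L, Value *R) {
  BinaryOperator::Create(Instruction::Add, L, R, "tmp", Pi);
}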

Deleting Instructions

Deleting an instruction from an existing sequence of instructions that form a BasicBlock is very straightforward. First, you must have a pointer to the instruction that you wish to delete. Second, you need to obtain the pointer to that instruction's basic block. You use the pointer to the basic block to get its list of instructions and then use the erase function to remove your instruction. For example:

Instruction *I = .. ;
I->eraseFromParent();

Replacing an Instruction with another Value

Replacing individual instructions

Including "llvm/Transforms/Utils/BasicBlockUtils.h" permits use of two very useful replace functions: ReplaceInstWithValue and ReplaceInstWithInst.

• ReplaceInstWithValue

This function replaces all uses of a given instruction with a value, and then removes the original instruction. The following example illustrates the replacement of the result of a particular AllocaInst that allocates memory for a single integer with a null pointer to an integer.

AllocaInst* instToReplace = ...;
BasicBlock::iterator ii(instToReplace);

ReplaceInstWithValue(instToReplace->getParent()->getInstList(), ii,
                     Constant::getNullValue(PointerType::getUnqual(Type::Int32Ty)));

• ReplaceInstWithInst

This function replaces a particular instruction with another instruction, inserting the new instruction into the basic block at the location where the old instruction was, and replacing any uses of the old instruction with the new instruction. The following example illustrates the replacement of one AllocaInst with another.

AllocaInst* instToReplace = ...;
BasicBlock::iterator ii(instToReplace);

ReplaceInstWithInst(instToReplace->getParent()->getInstList(), ii,
                    new AllocaInst(Type::Int32Ty, 0, "ptrToReplacedInt"));

Replacing multiple uses of Users and Values

You can use Value::replaceAllUsesWith and User::replaceUsesOfWith to change more than one use at a time. See the doxygen documentation for the Value Class and User Class, respectively, for more information.

Deleting GlobalVariables

Deleting a global variable from a module is just as easy as deleting an Instruction. First, you must have a pointer to the global variable that you wish to delete. You use this pointer to erase it from its parent, the module. For example:

GlobalVariable *GV = .. ;
GV->eraseFromParent();
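Returning to Value::replaceAllUsesWith and User::replaceUsesOfWith mentioned above, a minimal sketch of each call follows; OldVal, NewVal, and U are hypothetical and must have compatible types for the replacement to be legal.

// rewrite every use of OldVal, anywhere in the program, to use NewVal instead
OldVal->replaceAllUsesWith(NewVal);

// rewrite only the operands of one particular User
U->replaceUsesOfWith(OldVal, NewVal);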

How to Create Types

In generating IR, you may need some complex types. If you know these types statically, you can use TypeBuilder<...>::get(), defined in llvm/Support/TypeBuilder.h, to retrieve them. TypeBuilder has two forms depending on whether you're building types for cross-compilation or native library use. TypeBuilder<T, true> requires that T be independent of the host environment, meaning that it's built out of types from the llvm::types namespace and pointers, functions, arrays, etc. built of those. TypeBuilder<T, false> additionally allows native C types whose size may depend on the host compiler. For example,

FunctionType *ft = TypeBuilder<types::i<8>(types::i<32>*), true>::get();

is easier to read and write than the equivalent

std::vector<const Type*> params;
params.push_back(PointerType::getUnqual(Type::Int32Ty));
FunctionType *ft = FunctionType::get(Type::Int8Ty, params, false);

See the class comment for more details.

Threads and LLVM

This section describes the interaction of the LLVM APIs with multithreading, both on the part of client applications and in the JIT, in the hosted application. Note that LLVM's support for multithreading is still relatively young. Up through version 2.5, the execution of threaded hosted applications was supported, but not threaded client access to the APIs. While this use case is now supported, clients must adhere to the guidelines specified below to ensure proper operation in multithreaded mode.

Note that, on Unix-like platforms, LLVM requires the presence of GCC's atomic intrinsics in order to support threaded operation. If you need a multithreading-capable LLVM on a platform without a suitably modern system compiler, consider compiling LLVM and LLVM-GCC in single-threaded mode, and using the resultant compiler to build a copy of LLVM with multithreading support.

Entering and Exiting Multithreaded Mode

In order to properly protect its internal data structures while avoiding excessive locking overhead in the single-threaded case, LLVM must initialize certain data structures necessary to provide guards around its internals. To do so, the client program must invoke llvm_start_multithreaded() before making any concurrent LLVM API calls. To subsequently tear down these structures, use the llvm_stop_multithreaded() call. You can also use the llvm_is_multithreaded() call to check the status of multithreaded mode.

Note that both of these calls must be made in isolation. That is to say that no other LLVM API calls may be executing at any time during the execution of llvm_start_multithreaded() or llvm_stop_multithreaded(). It is the client's responsibility to enforce this isolation.

The return value of llvm_start_multithreaded() indicates the success or failure of the initialization. Failure typically indicates that your copy of LLVM was built without multithreading support, typically because GCC atomic intrinsics were not found in your system compiler. In this case, the LLVM API will not be safe for concurrent calls. However, it will be safe for hosting threaded applications in the JIT, though care must be taken to ensure that side exits and the like do not accidentally result in concurrent LLVM API calls.
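A minimal sketch of the call sequence just described, assuming a client that spawns its worker threads only after the API has been switched into multithreaded mode; the thread-management code itself is omitted and the declarations come from LLVM's threading header.

if (!llvm_start_multithreaded()) {
  // built without multithreading support: restrict yourself to single-threaded API use
}
// ... spawn threads and make concurrent LLVM API calls ...
// ... join all threads so that no other LLVM call is in flight ...
llvm_stop_multithreaded();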

Ending Execution with llvm_shutdown()

When you are done using the LLVM APIs, you should call llvm_shutdown() to deallocate memory used for internal structures. This will also invoke llvm_stop_multithreaded() if LLVM is operating in multithreaded mode. As such, llvm_shutdown() requires the same isolation guarantees as llvm_stop_multithreaded().

Note that, if you use scope-based shutdown, you can use the llvm_shutdown_obj class, which calls llvm_shutdown() in its destructor.

Lazy Initialization with ManagedStatic

ManagedStatic is a utility class in LLVM used to implement static initialization of static resources, such as the global type tables. Before the invocation of llvm_shutdown(), it implements a simple lazy initialization scheme. Once llvm_start_multithreaded() returns, however, it uses double-checked locking to implement thread-safe lazy initialization.

Note that, because no other threads are allowed to issue LLVM API calls before llvm_start_multithreaded() returns, it is possible to have ManagedStatics of llvm::sys::Mutexs.

The llvm_acquire_global_lock() and llvm_release_global_lock APIs provide access to the global lock used to implement the double-checked locking for lazy initialization. These should only be used internally to LLVM, and only if you know what you're doing!

Achieving Isolation with LLVMContext

LLVMContext is an opaque class in the LLVM API which clients can use to operate multiple, isolated instances of LLVM concurrently within the same address space. For instance, in a hypothetical compile-server, the compilation of an individual translation unit is conceptually independent from all the others, and it would be desirable to be able to compile incoming translation units concurrently on independent server threads. Fortunately, LLVMContext exists to enable just this kind of scenario!

Conceptually, LLVMContext provides isolation. Every LLVM entity (Modules, Values, Types, Constants, etc.) in LLVM's in-memory IR belongs to an LLVMContext. Entities in different contexts cannot interact with each other: Modules in different contexts cannot be linked together, Functions cannot be added to Modules in different contexts, etc. What this means is that it is safe to compile on multiple threads simultaneously, as long as no two threads operate on entities within the same context.

In practice, very few places in the API require the explicit specification of a LLVMContext, other than the Type creation/lookup APIs. Because every Type carries a reference to its owning context, most other entities can determine what context they belong to by looking at their own Type. If you are adding new entities to LLVM IR, please try to maintain this interface design.

For clients that do not require the benefits of isolation, LLVM provides a convenience API getGlobalContext(). This returns a global, lazily initialized LLVMContext that may be used in situations where isolation is not a concern.

Threads and the JIT

LLVM's "eager" JIT compiler is safe to use in threaded programs. Multiple threads can call ExecutionEngine::getPointerToFunction() or ExecutionEngine::runFunction() concurrently, and multiple threads can run code output by the JIT concurrently. The user must still ensure that only one thread accesses IR in a given LLVMContext while another thread might be modifying it. One way to do that is to always hold the JIT lock while accessing IR outside the JIT (the JIT modifies the IR by adding CallbackVHs). Another way is to only call getPointerToFunction() from the LLVMContext's thread.

When the JIT is configured to compile lazily (using ExecutionEngine::DisableLazyCompilation(false)), there is currently a race condition in updating call sites after a function is lazily-jitted. It's still possible to use the lazy JIT in a threaded program if you ensure that only one thread at a time can call any particular lazy stub and that the JIT lock guards any IR access, but we suggest using only the eager JIT in threaded programs.
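For clients that neither need isolation nor want to manage teardown by hand, the two conveniences above combine naturally. A sketch, assuming the client is content with the shared global context (the classes come from ManagedStatic.h and LLVMContext.h):

int main() {
  llvm::llvm_shutdown_obj ShutdownGuard;           // calls llvm_shutdown() on scope exit
  llvm::LLVMContext &Ctx = llvm::getGlobalContext();
  // ... create Modules, Types, and Values owned by Ctx ...
  return 0;                                        // ShutdownGuard cleans up here
}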

"{ i32. // Add a name for the type to the module symbol table (optional) License Information 223 . In particular. we need three concepts. Basic Recursive Type Construction Because the most common question is "how do I build a recursive type with LLVM". Second an "Abstract Type" is any type which includes an opaque type as part of its type graph (for example "{ opaque. and only need to be accessed in unusual circumstances. These API's tend manage the inner workings of the LLVM system. recursive types and late resolution of opaque types makes the situation very difficult to handle. Another way is to only call getPointerToFunction() from the LLVMContext's thread. the LLVM bitcode reader. a concrete type is a type that is not an abstract type (e. assembly parser. Unfortunately achieving this goal is not a simple matter. Here we include enough to cause this to be emitted to an output . an "Opaque Type" is exactly as defined in the language reference. StructType *NewSTy = StructType::get(Elts). use the following LLVM APIs: // Create the initial outer struct PATypeHolder StructTy = OpaqueType::get(). i32 }". In addition to this case.push_back(PointerType::getUnqual(StructTy)). Elts. we answer it now and explain it as we go. This goal makes clients much simpler and faster.g. It's still possible to use the lazy JIT in a threaded program if you ensure that only one thread at a time can call any particular lazy stub and that the JIT lock guards any IR access.ll file: %mylist = type { %mylist*. Third. Fortunately.get()). When the JIT is configured to compile lazily (using ExecutionEngine::DisableLazyCompilation(false)). there is currently a race condition in updating call sites after a function is lazily-jitted.get())->refineAbstractTypeTo(NewSTy). float }"). but StructTy (a PATypeHolder) is // kept up-to-date NewSTy = cast<StructType>(StructTy.push_back(Type::Int32Ty). Documentation for the LLVM System at SVN head CallbackVHs). Tell VMCore that // the struct and the opaque type are actually the same. The primary case where clients are exposed to the inner workings of it are when building a recursive type. First. LLVM Type Resolution The LLVM type system has a very simple goal: allow clients to compare types for structural equality with a simple pointer comparison (aka a shallow compare). for the most part. For our purposes below. our implementation makes most clients able to be completely unaware of the nasty internal details. and is used throughout the LLVM system. Elts. std::vector<const Type*> Elts. but we suggest using only the eager JIT in threaded programs. Advanced Topics This section describes some of the advanced or obscure API's that most clients do not need to be aware of. // NewSTy is potentially invalidated. // At this point. i32 }"). and linker also have to be aware of the inner workings of this system. NewSTy = "{ opaque*. i32 } To build this. cast<OpaqueType>(StructTy.

This code shows the basic approach used to build recursive types: build a non-recursive type using 'opaque', then use type unification to close the cycle. The type unification step is performed by the refineAbstractTypeTo method, which is described next.

The refineAbstractTypeTo method

The refineAbstractTypeTo method starts the type unification process. While this method is actually a member of the DerivedType class, it is most often used on OpaqueType instances. Type unification is actually a recursive process. After unification, types can become structurally isomorphic to existing types, and all duplicates are deleted (to preserve pointer equality).

In the example above, the OpaqueType object is definitely deleted. Additionally, if there is an "{ \2*, i32 }" type already created in the system, the pointer and struct type created are also deleted. Obviously, whenever a type is deleted, any "Type*" pointers in the program are invalidated. As such, it is safest to avoid having any "Type*" pointers to abstract types live across a call to refineAbstractTypeTo (note that non-abstract types can never move or be deleted). To deal with this, the PATypeHolder class is used to maintain a stable reference to a possibly refined type, and the AbstractTypeUser class is used to update more complex data structures.

The PATypeHolder Class

PATypeHolder is a form of a "smart pointer" for Type objects. When VMCore happily goes about nuking types that become isomorphic to existing types, it automatically updates all PATypeHolder objects to point to the new type. In the example above, this allows the code to maintain a pointer to the resultant resolved recursive type, even though the Type*'s are potentially invalidated.

PATypeHolder is an extremely light-weight object that uses a lazy union-find implementation to update pointers. For example, the pointer from a Value to its Type is maintained by PATypeHolder objects.

The AbstractTypeUser Class

Some data structures need to perform more complex updates when types get resolved. To support this, a class can derive from the AbstractTypeUser class, which allows it to get callbacks when certain types are resolved. To register to get callbacks for a particular type, the DerivedType::{add/remove}AbstractTypeUser methods can be called on a type. Note that these methods only work for abstract types. Concrete types (those that do not include any opaque objects) can never be refined.

The ValueSymbolTable and TypeSymbolTable classes

The ValueSymbolTable class provides a symbol table that the Function and Module classes use for naming value definitions. The symbol table can provide a name for any Value. The TypeSymbolTable class is used by the Module class to store names for types.

Note that the SymbolTable class should not be directly accessed by most clients. It should only be used when iteration over the symbol table names themselves is required, which is very special purpose. Note that not all LLVM Values have names, and those without names (i.e. they have an empty name) do not exist in the symbol table.

These symbol tables support iteration over the values/types in the symbol table with begin/end/iterator and support querying to see if a specific name is in the symbol table (with lookup). The ValueSymbolTable class exposes no public mutator methods; instead, simply call setName on a value, which will autoinsert it into the appropriate symbol table. For types, use the Module::addTypeName method to insert entries into the symbol table.

The User and owned Use classes' memory layout

The User class provides a basis for expressing the ownership of User towards other Values. The Use helper class is employed to do the bookkeeping and to facilitate O(1) addition and removal.

Interaction and relationship between User and Use objects

A subclass of User can choose between incorporating its Use objects or referring to them out-of-line by means of a pointer. A mixed variant (some Uses inline, others hung off) is impractical and breaks the invariant that the Use objects belonging to the same User form a contiguous array.

We have 2 different layouts in the User (sub)classes:

• Layout a) The Use object(s) are inside (resp. at fixed offset) of the User object and there are a fixed number of them.

• Layout b) The Use object(s) are referenced by a pointer to an array from the User object and there may be a variable number of them.

As of v2.4 each layout still possesses a direct pointer to the start of the array of Uses. Though not mandatory for layout a), we stick to this redundancy for the sake of simplicity. The User object also stores the number of Use objects it has. (Theoretically this information can also be calculated given the scheme presented below.)

Special forms of allocation operators (operator new) enforce the following memory layouts:

• Layout a) is modelled by prepending the User object by the Use[] array.

(figure: a contiguous Use[] array | P | P | P | P | placed immediately before the User object)

• Layout b) is modelled by pointing at the Use[] array.

(figure: the User object holds a pointer to a separately allocated Use[] array | P | P | P | P |)

(In the above figures 'P' stands for the Use** that is stored in each Use object in the member Use::Prev.)

The waymarking algorithm

Since the Use objects are deprived of the direct (back)pointer to their User objects, there must be a fast and exact method to recover it. This is accomplished by the following scheme: a bit-encoding in the 2 LSBits (least significant bits) of the Use::Prev allows to find the start of the User object:

• 00 -> binary digit 0
• 01 -> binary digit 1
• 10 -> stop and calculate (s)
• 11 -> full stop (S)

Given a Use*, all we have to do is to walk till we get a stop and we either have a User immediately behind or we have to walk to the next stop picking up digits and calculating the offset:

(figure: a Use[] array whose Prev tags spell "1 s 1 0 1 0 s 1 1 0 s 1 1 s 1 S", followed by the User; each stop encodes the offset accumulated from the digits gathered since the previous stop)

Only the significant number of bits need to be stored between the stops, so that the worst case is 20 memory accesses when there are 1000 Use objects associated with a User.

Reference implementation

The following literate Haskell fragment demonstrates the concept:

> import Test.QuickCheck
>
> digits :: Int -> [Char] -> [Char]
> digits 0 acc = '0' : acc
> digits 1 acc = '1' : acc
> digits n acc = digits (n `div` 2) $ digits (n `mod` 2) acc
>
> dist :: Int -> [Char] -> [Char]
> dist 0 [] = ['S']
> dist 0 acc = acc
> dist 1 acc = let r = dist 0 acc in 's' : digits (length r) r
> dist n acc = dist (n - 1) $ dist 1 acc
>
> takeLast n ss = reverse $ take n $ reverse ss
>
> test = takeLast 40 $ dist 20 []
>

Printing <test> gives: "1s100000s11010s10100s1111s1010s110s11s1S"

The reverse algorithm computes the length of the string just by examining a certain prefix:

> pref :: [Char] -> Int
> pref "S" = 1
> pref ('s':'1':rest) = decode 2 1 rest
> pref (_:rest) = 1 + pref rest
>
> decode walk acc ('0':rest) = decode (walk + 1) (acc * 2) rest
> decode walk acc ('1':rest) = decode (walk + 1) (acc * 2 + 1) rest
> decode walk acc _ = walk + acc
>

Now, as expected, printing <pref test> gives 40.

We can quickCheck this with the following property:

> testcase = dist 2000 []
> testcaseLength = length testcase
>
> identityProp n = n > 0 && n <= testcaseLength ==> length arr == pref arr
>     where arr = takeLast n testcase
>

As expected, <quickCheck identityProp> gives:

*Main> quickCheck identityProp
OK, passed 100 tests.

Let's be a bit more exhaustive:

> deepCheck p = check (defaultConfig { configMaxTest = 500 }) p
>

And here is the result of <deepCheck identityProp>:

*Main> deepCheck identityProp
OK, passed 500 tests.

Tagging considerations

To maintain the invariant that the 2 LSBits of each Use** in Use never change after being set up, setters of Use::Prev must re-tag the new Use** on every modification. Accordingly, getters must strip the tag bits.

For layout b), instead of the User we find a pointer (User* with LSBit set). Following this pointer brings us to the User. A portable trick ensures that the first bytes of User (if interpreted as a pointer) never have the LSBit set. (Portability is relying on the fact that all known compilers place the vptr in the first word of the instances.)

The Core LLVM Class Hierarchy Reference

The Core LLVM classes are the primary means of representing the program being inspected or transformed. The core LLVM classes are defined in header files in the include/llvm/ directory, and implemented in the lib/VMCore directory.

The Type class and Derived Types

#include "llvm/Type.h"
doxygen info: Type Class

Type is a superclass of all type classes. Every Value has a Type. Type cannot be instantiated directly but only through its subclasses. Certain primitive types (VoidType, LabelType, FloatType and DoubleType) have hidden subclasses. They are hidden because they offer no useful functionality beyond what the Type class offers except to distinguish themselves from other subclasses of Type.

All other types are subclasses of DerivedType. Types can be named, but this is not a requirement. There exists exactly one instance of a given shape at any one time. This allows type equality to be performed with address equality of the Type Instance. That is, given two Type* values, the types are identical if the pointers are identical.

Important Public Methods

• bool isIntegerTy() const: Returns true for any integer type.
• bool isFloatingPointTy(): Return true if this is one of the five floating point types.
• bool isAbstract(): Return true if the type is abstract (contains an OpaqueType anywhere in its definition).
• bool isSized(): Return true if the type has known size. Things that don't have a size are abstract types, labels and void.

Important Derived Types

IntegerType
Subclass of DerivedType that represents integer types of any bit width. Any bit width between IntegerType::MIN_INT_BITS (1) and IntegerType::MAX_INT_BITS (~8 million) can be represented.
◊ static const IntegerType* get(unsigned NumBits): get an integer type of a specific bit width.
◊ unsigned getBitWidth() const: Get the bit width of an integer type.

SequentialType
This is subclassed by ArrayType and PointerType.
◊ const Type * getElementType() const: Returns the type of each of the elements in the sequential type.

ArrayType
This is a subclass of SequentialType and defines the interface for array types.
◊ unsigned getNumElements() const: Returns the number of elements in the array.

PointerType
Subclass of SequentialType for pointer types.

VectorType
Subclass of SequentialType for vector types. A vector type is similar to an ArrayType but is distinguished because it is a first class type whereas ArrayType is not. Vector types are used for vector operations and are usually small vectors of an integer or floating point type.

StructType
Subclass of DerivedTypes for struct types.

FunctionType
Subclass of DerivedTypes for function types.
◊ bool isVarArg() const: Returns true if it's a vararg function.
◊ const Type * getReturnType() const: Returns the return type of the function.
◊ const Type * getParamType (unsigned i): Returns the type of the ith parameter.
◊ const unsigned getNumParams() const: Returns the number of formal parameters.

OpaqueType
Subclass of DerivedType for abstract types. This class defines no content and is used as a placeholder for some other type. Note that OpaqueType is used (temporarily) during type resolution for forward references of types. Once the referenced type is resolved, the OpaqueType is replaced with the actual type. OpaqueType can also be used for data abstraction. At link time opaque types can be resolved to actual types of the same name.
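As a short illustration of how a couple of these derived types are obtained in client code, the sketch below builds [24 x i32] and i32 (i8*), using the pre-LLVMContext spellings that the rest of this document uses; newer releases route the same getters through an LLVMContext.

const ArrayType *ATy = ArrayType::get(Type::Int32Ty, 24);            // [24 x i32]
std::vector<const Type*> Params;
Params.push_back(PointerType::getUnqual(Type::Int8Ty));               // i8*
const FunctionType *FTy =
    FunctionType::get(Type::Int32Ty, Params, /*isVarArg=*/false);     // i32 (i8*)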

The Module class

#include "llvm/Module.h"
doxygen info: Module Class

An LLVM module is effectively either a translation unit of the original program or a combination of several translation units merged by the linker. The Module class represents the top level structure present in LLVM programs. The Module class keeps track of a list of Functions, a list of GlobalVariables, and a SymbolTable. Additionally, it contains a few helpful member functions that try to make common operations easy.

Important Public Members of the Module class

• Module::Module(std::string name = "")
Constructing a Module is easy. You can optionally provide a name for it (probably based on the name of the translation unit).

• Module::iterator - Typedef for function list iterator
Module::const_iterator - Typedef for const_iterator.
begin(), end(), size(), empty()
These are forwarding methods that make it easy to access the contents of a Module object's Function list.

• Module::FunctionListType &getFunctionList()
Returns the list of Functions. This is necessary to use when you need to update the list or perform a complex action that doesn't have a forwarding method.

• Module::global_iterator - Typedef for global variable list iterator
Module::const_global_iterator - Typedef for const_iterator.
global_begin(), global_end(), global_size(), global_empty()
These are forwarding methods that make it easy to access the contents of a Module object's GlobalVariable list.

• Module::GlobalListType &getGlobalList()
Returns the list of GlobalVariables. This is necessary to use when you need to update the list or perform a complex action that doesn't have a forwarding method.

• SymbolTable *getSymbolTable()
Return a reference to the SymbolTable for this Module.

• Function *getFunction(const std::string &Name, const FunctionType *Ty)
Look up the specified function in the Module SymbolTable. If it does not exist, return null.

• Function *getOrInsertFunction(const std::string &Name, const FunctionType *T)
Look up the specified function in the Module SymbolTable. If it does not exist, add an external declaration for the function and return it.

• std::string getTypeName(const Type *Ty)
If there is at least one entry in the SymbolTable for the specified Type, return it. Otherwise return the empty string.

• bool addTypeName(const std::string &Name, const Type *Ty)
Insert an entry in the SymbolTable mapping Name to Ty. If there is already an entry for this name, true is returned and the SymbolTable is not modified.
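The sketch below strings a few of these members together: it walks a Module's function list and skips declarations. It assumes an existing Module *M and the iterator typedefs listed above; Module construction itself is not shown because the constructor signature varies between releases (newer ones also take an LLVMContext).

for (Module::iterator F = M->begin(), E = M->end(); F != E; ++F) {
  if (F->isDeclaration())
    continue;                 // no body in this translation unit
  // ... work on the defined function (&*F yields a Function*) ...
}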

The Value class

#include "llvm/Value.h"
doxygen info: Value Class

The Value class is the most important class in the LLVM Source base. It represents a typed value that may be used (among other things) as an operand to an instruction. There are many different types of Values, such as Constants and Arguments. Even Instructions and Functions are Values.

A particular Value may be used many times in the LLVM representation for a program. For example, an incoming argument to a function (represented with an instance of the Argument class) is "used" by every instruction in the function that references the argument. To keep track of this relationship, the Value class keeps a list of all of the Users that are using it (the User class is a base class for all nodes in the LLVM graph that can refer to Values). This use list is how LLVM represents def-use information in the program, and is accessible through the use_* methods, shown below.

Because LLVM is a typed representation, every LLVM Value is typed, and this Type is available through the getType() method. In addition, all LLVM values can be named. The "name" of the Value is a symbolic string printed in the LLVM code:

%foo = add i32 1, 2

The name of this instruction is "foo". NOTE that the name of any value may be missing (an empty string), so names should ONLY be used for debugging (making the source code easier to read, debugging printouts); they should not be used to keep track of values or map between them. For this purpose, use a std::map of pointers to the Value itself instead.

One important aspect of LLVM is that there is no distinction between an SSA variable and the operation that produces it. Because of this, any reference to the value produced by an instruction (or the value available as an incoming argument, for example) is represented as a direct pointer to the instance of the class that represents this value. Although this may take some getting used to, it simplifies the representation and makes it easier to manipulate.

Important Public Members of the Value class

• Value::use_iterator - Typedef for iterator over the use-list
Value::use_const_iterator - Typedef for const_iterator over the use-list
unsigned use_size() - Returns the number of users of the value.
bool use_empty() - Returns true if there are no users.
use_iterator use_begin() - Get an iterator to the start of the use-list.
use_iterator use_end() - Get an iterator to the end of the use-list.
User *use_back() - Returns the last element in the list.
These methods are the interface to access the def-use information in LLVM. As with all other iterators in LLVM, the naming conventions follow the conventions defined by the STL.

• Type *getType() const
This method returns the Type of the Value.

• bool hasName() const
std::string getName() const
void setName(const std::string &Name)
This family of methods is used to access and assign a name to a Value; be aware of the precaution above.

• void replaceAllUsesWith(Value *V)
This method traverses the use list of a Value changing all Users of the current value to refer to "V" instead. For example, if you detect that an instruction always produces a constant value (for example through constant folding), you can replace all uses of the instruction with the constant like this:

Inst->replaceAllUsesWith(ConstVal);
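A small sketch of walking a use list with the iterators listed above; V is a hypothetical existing Value, and in this era dereferencing a use_iterator yields the using User directly.

for (Value::use_iterator UI = V->use_begin(), E = V->use_end(); UI != E; ++UI) {
  User *U = *UI;     // an Instruction, Constant, etc. that refers to V
  // ... inspect U ...
}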

The User class

#include "llvm/User.h"
doxygen info: User Class
Superclass: Value

The User class is the common base class of all LLVM nodes that may refer to Values. It exposes a list of "Operands" that are all of the Values that the User is referring to. The User class itself is a subclass of Value.

The operands of a User point directly to the LLVM Value that it refers to. Because LLVM uses Static Single Assignment (SSA) form, there can only be one definition referred to, allowing this direct connection. This connection provides the use-def information in LLVM.

Important Public Members of the User class

The User class exposes the operand list in two ways: through an index access interface and through an iterator based interface.

• Value *getOperand(unsigned i)
unsigned getNumOperands()
These two methods expose the operands of the User in a convenient form for direct access.

• User::op_iterator - Typedef for iterator over the operand list
op_iterator op_begin() - Get an iterator to the start of the operand list.
op_iterator op_end() - Get an iterator to the end of the operand list.
Together, these methods make up the iterator based interface to the operands of a User.
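The index-based interface is usually the simpler of the two; a sketch, assuming a hypothetical existing User *U:

for (unsigned i = 0, e = U->getNumOperands(); i != e; ++i) {
  Value *Op = U->getOperand(i);   // the i'th Value this User refers to
  // ... inspect Op ...
}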

The Instruction class

#include "llvm/Instruction.h"
doxygen info: Instruction Class
Superclasses: User, Value

The Instruction class is the common base class for all LLVM instructions. It provides only a few methods, but is a very commonly used class. The primary data tracked by the Instruction class itself is the opcode (instruction type) and the parent BasicBlock the Instruction is embedded into. To represent a specific type of instruction, one of many subclasses of Instruction is used.

Because the Instruction class subclasses the User class, its operands can be accessed in the same way as for other Users (with the getOperand()/getNumOperands() and op_begin()/op_end() methods).

An important file for the Instruction class is the llvm/Instruction.def file. This file contains some meta-data about the various different types of instructions in LLVM. It describes the enum values that are used as opcodes (for example Instruction::Add and Instruction::ICmp), as well as the concrete sub-classes of Instruction that implement the instruction (for example BinaryOperator and CmpInst). Unfortunately, the use of macros in this file confuses doxygen, so these enum values don't show up correctly in the doxygen output.

Important Subclasses of the Instruction class

• BinaryOperator
This subclass represents all two operand instructions whose operands must be the same type, except for the comparison instructions.

• CastInst
This subclass is the parent of the 12 casting instructions. It provides common operations on cast instructions.

• CmpInst
This subclass represents the two comparison instructions, ICmpInst (integer operands) and FCmpInst (floating point operands).

• TerminatorInst
This subclass is the parent of all terminator instructions (those which can terminate a block).

Important Public Members of the Instruction class

• BasicBlock *getParent()
Returns the BasicBlock that this Instruction is embedded into.

• bool mayWriteToMemory()
Returns true if the instruction writes to memory, i.e. it is a call, free, invoke, or store.

• unsigned getOpcode()
Returns the opcode for the Instruction.

• Instruction *clone() const
Returns another instance of the specified instruction, identical in all ways to the original except that the instruction has no parent (i.e. it's not embedded into a BasicBlock), and it has no name.
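Opcode checks are usually combined with dyn_cast<> to reach a concrete subclass; a minimal sketch, assuming a hypothetical Instruction *I and the usual casting utilities from "llvm/Support/Casting.h":

if (BinaryOperator *BO = dyn_cast<BinaryOperator>(I))
  if (BO->getOpcode() == Instruction::Add) {
    Value *LHS = BO->getOperand(0);   // the two addends
    Value *RHS = BO->getOperand(1);
    // ... handle the add ...
  }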

The Constant class and subclasses

Constant represents a base class for different types of constants. It is subclassed by ConstantInt, ConstantArray, etc. for representing the various types of Constants. GlobalValue is also a subclass, which represents the address of a global variable or function.

Important Subclasses of Constant

• ConstantInt : This subclass of Constant represents an integer constant of any width.
♦ const APInt& getValue() const: Returns the underlying value of this constant, an APInt value.
♦ int64_t getSExtValue() const: Converts the underlying APInt value to an int64_t via sign extension. If the value (not the bit width) of the APInt is too large to fit in an int64_t, an assertion will result. For this reason, use of this method is discouraged.
♦ uint64_t getZExtValue() const: Converts the underlying APInt value to a uint64_t via zero extension. If the value (not the bit width) of the APInt is too large to fit in a uint64_t, an assertion will result. For this reason, use of this method is discouraged.
♦ static ConstantInt* get(const APInt& Val): Returns the ConstantInt object that represents the value provided by Val. The type is implied as the IntegerType that corresponds to the bit width of Val.
♦ static ConstantInt* get(const Type *Ty, uint64_t Val): Returns the ConstantInt object that represents the value provided by Val for integer type Ty.

• ConstantFP : This class represents a floating point constant.
♦ double getValue() const: Returns the underlying value of this constant.

• ConstantArray : This represents a constant array.
♦ const std::vector<Use> &getValues() const: Returns a vector of component constants that make up this array.

• ConstantStruct : This represents a constant struct.
♦ const std::vector<Use> &getValues() const: Returns a vector of component constants that make up this structure.

• GlobalValue : This represents either a global variable or a function. In either case, the value is a constant fixed address (after linking).
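A couple of the getters above in use, following the pre-LLVMContext spellings of this document (newer releases route these through an LLVMContext); the values are arbitrary:

Constant *FortyTwo = ConstantInt::get(Type::Int32Ty, 42);                            // i32 42
Constant *NullPtr  = Constant::getNullValue(PointerType::getUnqual(Type::Int32Ty));  // i32* null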

The GlobalValue class

#include "llvm/GlobalValue.h"
doxygen info: GlobalValue Class
Superclasses: Constant, User, Value

Global values (GlobalVariables or Functions) are the only LLVM values that are visible in the bodies of all Functions. Because they are visible at global scope, they are also subject to linking with other globals defined in different translation units. To control the linking process, GlobalValues know their linkage rules. Specifically, GlobalValues know whether they have internal or external linkage, as defined by the LinkageTypes enumeration.

If a GlobalValue has internal linkage (equivalent to being static in C), it is not visible to code outside the current translation unit, and does not participate in linking. If it has external linkage, it is visible to external code, and does participate in linking. In addition to linkage information, GlobalValues keep track of which Module they are currently part of.

Because GlobalValues are memory objects, they are always referred to by their address. As such, the Type of a global is always a pointer to its contents. It is important to remember this when using the GetElementPtrInst instruction because this pointer must be dereferenced first. For example, if you have a GlobalVariable (a subclass of GlobalValue) that is an array of 24 ints, type [24 x i32], then the GlobalVariable is a pointer to that array. Although the address of the first element of this array and the value of the GlobalVariable are the same, they have different types. The GlobalVariable's type is [24 x i32]. The first element's type is i32. Because of this, accessing a global value requires you to dereference the pointer with GetElementPtrInst first, then its elements can be accessed. This is explained in the LLVM Language Reference Manual.

Important Public Members of the GlobalValue class

• bool hasInternalLinkage() const
bool hasExternalLinkage() const
void setInternalLinkage(bool HasInternalLinkage)
These methods manipulate the linkage characteristics of the GlobalValue.

• Module *getParent()
This returns the Module that the GlobalValue is currently embedded into.

The Function class

#include "llvm/Function.h"
doxygen info: Function Class
Superclasses: GlobalValue, Constant, User, Value

The Function class represents a single procedure in LLVM. It is actually one of the more complex classes in the LLVM hierarchy because it must keep track of a large amount of data. The Function class keeps track of a list of BasicBlocks, a list of formal Arguments, and a SymbolTable.

The list of BasicBlocks is the most commonly used part of Function objects. The list imposes an implicit ordering of the blocks in the function, which indicates how the code will be laid out by the backend. Additionally, the first BasicBlock is the implicit entry node for the Function. It is not legal in LLVM to explicitly branch to this initial block. There are no implicit exit nodes, and in fact there may be multiple exit nodes from a single Function. If the BasicBlock list is empty, this indicates that the Function is actually a function declaration: the actual body of the function hasn't been linked in yet.

In addition to a list of BasicBlocks, the Function class also keeps track of the list of formal Arguments that the function receives. This container manages the lifetime of the Argument nodes, just like the BasicBlock list does for the BasicBlocks.

The SymbolTable is a very rarely used LLVM feature that is only used when you have to look up a value by name. Aside from that, the SymbolTable is used internally to make sure that there are not conflicts between the names of Instructions, BasicBlocks, or Arguments in the function body.

Note that Function is a GlobalValue and therefore also a Constant. The value of the function is its address (after linking), which is guaranteed to be constant.

Important Public Members of the Function class

• Function(const FunctionType *Ty, LinkageTypes Linkage, const std::string &N = "", Module* Parent = 0)
Constructor used when you need to create new Functions to add to the program. The constructor must specify the type of the function to create and what type of linkage the function should have. The FunctionType argument specifies the formal arguments and return value for the function. The same FunctionType value can be used to create multiple functions. The Parent argument specifies the Module in which the function is defined. If this argument is provided, the function will automatically be inserted into that module's list of functions.

• bool isDeclaration()
Return whether or not the Function has a body defined. If the function is "external", it does not have a body, and thus must be resolved by linking with a function defined in a different translation unit.

• Function::iterator - Typedef for basic block list iterator
Function::const_iterator - Typedef for const_iterator.
begin(), end(), size(), empty()
These are forwarding methods that make it easy to access the contents of a Function object's BasicBlock list.

• Function::BasicBlockListType &getBasicBlockList()
Returns the list of BasicBlocks. This is necessary to use when you need to update the list or perform a complex action that doesn't have a forwarding method.

• Function::arg_iterator - Typedef for the argument list iterator
Function::const_arg_iterator - Typedef for const_iterator.
arg_begin(), arg_end(), arg_size(), arg_empty()
These are forwarding methods that make it easy to access the contents of a Function object's Argument list.

• Function::ArgumentListType &getArgumentList()
Returns the list of Arguments. This is necessary to use when you need to update the list or perform a complex action that doesn't have a forwarding method.

• BasicBlock &getEntryBlock()
Returns the entry BasicBlock for the function. Because the entry block for the function is always the first block, this returns the first block of the Function.

• Type *getReturnType()
FunctionType *getFunctionType()
This traverses the Type of the Function and returns the return type of the function, or the FunctionType of the actual function.

• SymbolTable *getSymbolTable()
Return a pointer to the SymbolTable for this Function.
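A sketch tying the two forwarded lists together: visiting every formal argument and then every instruction of a hypothetical existing Function *F (taking &* of an ilist iterator yields the plain pointer).

for (Function::arg_iterator AI = F->arg_begin(), AE = F->arg_end(); AI != AE; ++AI) {
  Argument *A = &*AI;
  // ... inspect the formal argument A ...
}
for (Function::iterator BB = F->begin(), BE = F->end(); BB != BE; ++BB)
  for (BasicBlock::iterator I = BB->begin(), IE = BB->end(); I != IE; ++I) {
    Instruction *Inst = &*I;
    // ... inspect Inst ...
  }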

The GlobalVariable class

#include "llvm/GlobalVariable.h"
doxygen info: GlobalVariable Class
Superclasses: GlobalValue, Constant, User, Value

Global variables are represented with the (surprise surprise) GlobalVariable class. Like functions, GlobalVariables are also subclasses of GlobalValue, and as such are always referenced by their address (global values must live in memory, so their "name" refers to their constant address). See GlobalValue for more on this. Global variables may have an initial value (which must be a Constant), and if they have an initializer, they may be marked as "constant" themselves (indicating that their contents never change at runtime).

Important Public Members of the GlobalVariable class

• GlobalVariable(const Type *Ty, bool isConstant, LinkageTypes &Linkage, Constant *Initializer = 0, const std::string &Name = "", Module* Parent = 0)
Create a new global variable of the specified type. If isConstant is true then the global variable will be marked as unchanging for the program. The Linkage parameter specifies the type of linkage (internal, external, weak, linkonce, appending) for the variable. If the linkage is InternalLinkage, WeakAnyLinkage, WeakODRLinkage, LinkOnceAnyLinkage or LinkOnceODRLinkage, then the resultant global variable will have internal linkage. AppendingLinkage concatenates together all instances (in different translation units) of the variable into a single variable but is only applicable to arrays. See the LLVM Language Reference for further details on linkage types. Optionally an initializer, a name, and the module to put the variable into may be specified for the global variable as well.

• bool isConstant() const
Returns true if this is a global variable that is known not to be modified at runtime.

• bool hasInitializer()
Returns true if this GlobalVariable has an initializer.

• Constant *getInitializer()
Returns the initial value for a GlobalVariable. It is not legal to call this method if there is no initializer.
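A sketch of the constructor in use, following the parameter order listed above; the argument order and overload set have shifted between LLVM releases, so treat this as illustrative only. M is a hypothetical existing Module* and the initializer is an arbitrary i32 zero.

Constant *Init = ConstantInt::get(Type::Int32Ty, 0);
GlobalVariable *Counter =
    new GlobalVariable(Type::Int32Ty, /*isConstant=*/false,
                       GlobalValue::InternalLinkage, Init, "counter", M);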

The BasicBlock class

#include "llvm/BasicBlock.h"
doxygen info: BasicBlock Class
Superclass: Value

This class represents a single entry multiple exit section of the code, commonly known as a basic block by the compiler community. The BasicBlock class maintains a list of Instructions, which form the body of the block. Matching the language definition, the last element of this list of instructions is always a terminator instruction (a subclass of the TerminatorInst class).

In addition to tracking the list of instructions that make up the block, the BasicBlock class also keeps track of the Function that it is embedded into.

Note that BasicBlocks themselves are Values, because they are referenced by instructions like branches and can go in the switch tables. BasicBlocks have type label.

Important Public Members of the BasicBlock class

• BasicBlock(const std::string &Name = "", Function *Parent = 0)
The BasicBlock constructor is used to create new basic blocks for insertion into a function. The constructor optionally takes a name for the new block, and a Function to insert it into. If the Parent parameter is specified, the new BasicBlock is automatically inserted at the end of the specified Function; if not specified, the BasicBlock must be manually inserted into the Function.

• BasicBlock::iterator - Typedef for instruction list iterator
BasicBlock::const_iterator - Typedef for const_iterator.
begin(), end(), front(), back(), size(), empty()
STL-style functions for accessing the instruction list. These methods and typedefs are forwarding functions that have the same semantics as the standard library methods of the same names. They expose the underlying instruction list of a basic block in a way that is easy to manipulate. To get the full complement of container operations (including operations to update the list), you must use the getInstList() method.

• BasicBlock::InstListType &getInstList()
This method is used to get access to the underlying container that actually holds the Instructions. This method must be used when there isn't a forwarding function in the BasicBlock class for the operation that you would like to perform. Because there are no forwarding functions for "updating" operations, you need to use this if you want to update the contents of a BasicBlock.

• Function *getParent()
Returns a pointer to the Function the block is embedded into, or a null pointer if it is homeless.

• TerminatorInst *getTerminator()
Returns a pointer to the terminator instruction that appears at the end of the BasicBlock. If there is no terminator instruction, or if the last instruction in the block is not a terminator, then a null pointer is returned.

The Argument class

This subclass of Value defines the interface for incoming formal arguments to a function. A Function maintains a list of its formal arguments. An argument has a pointer to the parent Function.

Dinakar Dhurjati and Chris Lattner
The LLVM Compiler Infrastructure
Last modified: $Date: 2010-02-25 17:51:27 -0600 (Thu, 25 Feb 2010) $

Creating an LLVM Project

1. Overview
2. Create a project from the Sample Project
3. Source tree layout
4. Writing LLVM-style Makefiles
   1. Required Variables
   2. Variables for Building Subdirectories
   3. Variables for Building Libraries
   4. Variables for Building Programs
   5. Miscellaneous Variables
5. Placement of object code
6. Further help

Written by John Criswell

Overview

The LLVM build system is designed to facilitate the building of third party projects that use LLVM header files, libraries, and tools. In order to use these facilities, a Makefile from a project must do the following things:

1. Set make variables. There are several variables that a Makefile needs to set to use the LLVM build system:
♦ PROJECT_NAME - The name by which your project is known.
♦ LLVM_SRC_ROOT - The root of the LLVM source tree.
♦ LLVM_OBJ_ROOT - The root of the LLVM object tree.
♦ PROJ_SRC_ROOT - The root of the project's source tree.
♦ PROJ_OBJ_ROOT - The root of the project's object tree.
♦ PROJ_INSTALL_ROOT - The root installation directory.
♦ LEVEL - The relative path from the current directory to the project's root ($PROJ_OBJ_ROOT).
2. Include Makefile.config from $(LLVM_OBJ_ROOT).
3. Include Makefile.rules from $(LLVM_SRC_ROOT).

There are two ways that you can set all of these variables:
1. You can write your own Makefiles which hard-code these values.
2. You can use the pre-made LLVM sample project. This sample project includes Makefiles, a configure script that can be used to configure the location of LLVM, and the ability to support multiple object directories from a single source directory.

This document assumes that you will base your project on the LLVM sample project found in llvm/projects/sample. If you want to devise your own build system, studying the sample project and LLVM Makefiles will probably provide enough information on how to write your own Makefiles.

Create a Project from the Sample Project

Follow these simple steps to start your project:

1. Copy the llvm/projects/sample directory to any place of your choosing. You can place it anywhere you like. Rename the directory to match the name of your project.
2. If you downloaded LLVM using Subversion, remove all the directories named .svn (and all the files therein) from your project's new source tree. This will keep Subversion from thinking that your project is inside llvm/trunk/projects/sample.
3. Add your source code and Makefiles to your source tree.
4. If you want your project to be configured with the configure script then you need to edit autoconf/configure.ac as follows:
♦ AC_INIT. Place the name of your project, its version number and a contact email address for your project as the arguments to this macro.
♦ AC_CONFIG_AUX_DIR. If your project isn't in the llvm/projects directory then you might need to adjust this so that it specifies a relative path to the llvm/autoconf directory.
♦ LLVM_CONFIG_PROJECT. Just leave this alone.
♦ AC_CONFIG_SRCDIR. Specify a path to a file name that identifies your project, or just leave it at Makefile.common.in.
♦ AC_CONFIG_FILES. Do not change.
♦ AC_CONFIG_MAKEFILE. Use one of these macros for each Makefile that your project uses. This macro arranges for your makefiles to be copied from the source directory, unmodified, to the build directory.
5. After updating autoconf/configure.ac, regenerate the configure script with these commands:
% cd autoconf
% ./AutoRegen.sh
You must be using Autoconf version 2.59 or later and your aclocal version should be 1.9 or later.
6. Run configure in the directory in which you want to place object code. Use the following options to tell your project where it can find LLVM:
--with-llvmsrc=<directory> Tell your project where the LLVM source tree is located.
--with-llvmobj=<directory> Tell your project where the LLVM object tree is located.
--prefix=<directory> Tell your project where it should get installed.

That's it! Now all you have to do is type gmake (or make if you're on a GNU/Linux system) in the root of your object directory, and your project should build.

Source Tree Layout

In order to use the LLVM build system, you will want to organize your source code so that it can benefit from the build system's features. Mainly, you want your source tree layout to look similar to the LLVM source tree layout. The best way to do this is to just copy the project tree from llvm/projects/sample and modify it to meet your needs. Underneath your top level directory, you should have the following directories:

lib
This subdirectory should contain all of your library source code. For each library that you build, you will have one directory in lib that will contain that library's source code. Libraries can be object files, archives, or dynamic libraries. The lib directory is just a convenient place for libraries as it places them all in a directory from which they can be linked later.

include
This subdirectory should contain any header files that are global to your project. By global, we mean that they are used by more than one library or executable of your project. By placing your header files in include, they will be found automatically by the LLVM build system. For example, if you have a file include/jazz/note.h, then your source files can include it simply with #include "jazz/note.h".

tools
This subdirectory should contain all of your source code for executables. For each program that you build, you will have one directory in tools that will contain that program's source code.

test
This subdirectory should contain tests that verify that your code works correctly. Automated tests are especially useful.

Currently, the LLVM build system provides basic support for tests. The LLVM system provides the following:
◊ LLVM provides a tcl procedure that is used by Dejagnu to run tests. It can be found in llvm/lib/llvm-dg.exp. This test procedure uses RUN lines in the actual test case to determine how to run the test. See the TestingGuide for more details. You can easily write Makefile support similar to the Makefiles in llvm/test to use Dejagnu to run your project's tests.
◊ LLVM contains an optional package called llvm-test which provides benchmarks and programs that are known to compile with the LLVM GCC front ends. You can use these programs to test your code, gather statistics information, and compare it to the current LLVM performance statistics.

Currently, there is no way to hook your tests directly into the llvm/test testing harness. You will simply need to find a way to use the source provided within that directory on your own.

Typically, you will want to build your lib directory first followed by your tools directory.

Writing LLVM Style Makefiles

The LLVM build system provides a convenient way to build libraries and executables. Most of your project Makefiles will only need to define a few variables. Below is a list of the variables one can set and what they can do:

Required Variables

LEVEL
This variable is the relative path from this Makefile to the top directory of your project's source code. For example, if your source code is in /tmp/src, then the Makefile in /tmp/src/jump/high would set LEVEL to "../..".

Variables for Building Subdirectories

DIRS
This is a space separated list of subdirectories that should be built. They will be built, one at a time, in the order specified.

so. It is highly suggested that you append to CFLAGS and CPPFLAGS as opposed to overwriting them.a. USEDLIBS This variable holds a space separated list of libraries that should be linked into the program. The master Makefiles may already have useful options in them that you may not want to overwrite. to link libsample. To build an archive (also known as a static library). BUILD_ARCHIVE By default. For example.a. LIBRARYNAME should be set to sample. TOOLNAME should be set to sample. but will not cause an error if they do not exist. a shared (or dynamic) library will be built. to build a library named libsample. CFLAGS CPPFLAGS This variable can be used to add options to the C and C++ compiler. Variables for Building Programs TOOLNAME This variable contains the name of the program that will be built. For example. For example. to link libsample. The libraries must be specified by their base name. It is useful for including the output of Lex and Yacc programs.o file that is linked directly into a program. set the BUILD_ARCHIVE variable. Placement of Object Code License Information 241 . They are built serially in the order in which they are listed. SHARED_LIBRARY If SHARED_LIBRARY is defined in your Makefile. you would set USEDLIBS to sample. For example. OPTIONAL_DIRS This is a list of directories that can be built if they exist. you would have the following line in your Makefile: LIBS += -lsample Miscellaneous Variables ExtraSource This variable contains a space separated list of extra source files that need to be built. a library is a . add -l<library base name> to the LIBS variable. These will be built after the directories in DIRS have been built. to build an executable named sample. Note that this works only for statically linked libraries. These libraries must either be LLVM libraries or libraries that come from your lib directory. The LLVM build system will look in the same places for dynamic libraries as it does for static libraries. Variables for Building Libraries LIBRARYNAME This variable contains the base name of the library that will be built. It is typically used to add options that tell the compiler the location of additional directories to search for header files. Documentation for the LLVM System at SVN head This is a list of directories that can be built in parallel. respectively. LIBS To link dynamic libraries.

or Profile build. or profiled build. where type is Debug. optimized. or profiled build. Release. where type is Debug. 13 Aug 2009) $ License Information 242 . Release. You can always post your questions to the LLVM Developers Mailing List. Release. John Criswell The LLVM Compiler Infrastructure Last modified: $Date: 2009-08-13 15:08:52 -0500 (Thu. respectively. optimized. or Profile for a debug. Libraries All libraries (static and dynamic) will be stored in PROJ_OBJ_ROOT/<type>/lib. Executables All executables will be stored in PROJ_OBJ_ROOT/<type>/bin. Documentation for the LLVM System at SVN head The final location of built libraries and executables will depend upon whether you do a Debug. Further Help If you have any questions or need any help creating an LLVM project. the LLVM team would be more than happy to help. respectively. or Profile for a debug.

LLVM Makefile Guide

1. Introduction
2. General Concepts
1. Projects
2. Variable Values
3. Including Makefiles
◊ Makefile
◊ Makefile.common
◊ Makefile.config
◊ Makefile.rules
4. Comments
3. Tutorial
1. Libraries
◊ Bitcode Modules
◊ Loadable Modules
2. Tools
◊ JIT Tools
3. Projects
4. Targets Supported
1. all
2. all-local
3. check
4. check-local
5. clean
6. clean-local
7. dist
8. dist-check
9. dist-clean
10. install
11. preconditions
12. printvars
13. reconfigure
14. spotless
15. tags
16. uninstall
5. Using Variables
1. Control Variables
2. Override Variables
3. Readable Variables
4. Internal Variables

Written by Reid Spencer

Introduction
This document provides usage information about the LLVM makefile system. While loosely patterned after the BSD makefile system, LLVM has taken a departure from BSD in order to implement additional features needed by LLVM. Although makefile systems such as automake were attempted at one point, it has become clear that the features needed by LLVM and the Makefile norm are too great to use a more limited tool. Consequently, LLVM requires simply GNU Make 3.79, a widely portable makefile processor. LLVM unabashedly makes heavy use of the features of GNU Make so the dependency on GNU Make is firm. If you're not familiar with make, it is recommended that you read the GNU Makefile Manual.

While this document is rightly part of the LLVM Programmer's Manual, it is treated separately here because of the volume of content and because it is often an early source of bewilderment for new developers.

General Concepts
The LLVM Makefile System is the component of LLVM that is responsible for building the software, testing it, generating distributions, checking those distributions, installing and uninstalling, etc. It consists of several files throughout the source tree. These files and other general concepts are described in this section.

Projects
The LLVM Makefile System is quite generous. It not only builds its own software, but it can build yours too. Built into the system is knowledge of the llvm/projects directory. Any directory under projects that has both a configure script and a Makefile is assumed to be a project that uses the LLVM Makefile system. Building software that uses LLVM does not require the LLVM Makefile System nor even placement in the llvm/projects directory. However, doing so will allow your project to get up and running quickly by utilizing the built-in features that are used to compile LLVM. LLVM compiles itself using the same features of the makefile system as used for projects.

For complete details on setting up your project's configuration, simply mimic the llvm/projects/sample project or, for further details, consult the Projects.html page.

Variable Values
To use the makefile system, you simply create a file named Makefile in your directory and declare values for certain variables. The variables and values that you select determine what the makefile system will do. These variables enable rules and processing in the makefile system that automatically Do The Right Thing™.

Including Makefiles
Setting variables alone is not enough. You must include into your Makefile additional files that provide the rules of the LLVM Makefile system. The various files involved are described in the sections that follow.

Makefile
Each directory to participate in the build needs to have a file named Makefile. This is the file first read by make. It has three sections:
1. Settable Variables - Required variables that must be set first.
2. include $(LEVEL)/Makefile.common - include the LLVM Makefile system.
3. Override Variables - Override variables set by the LLVM Makefile system.

Makefile.common
Every project must have a Makefile.common file at its top source directory. This file serves three purposes:
1. It includes the project's configuration makefile to obtain values determined by the configure script. This is done by including the $(LEVEL)/Makefile.config file.
2. It specifies any other (static) values that are needed throughout the project. Only values that are used in all or a large proportion of the project's directories should be placed here.
3. It includes the standard rules for the LLVM Makefile system, $(LLVM_SRC_ROOT)/Makefile.rules. This file is the "guts" of the LLVM Makefile system.
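As a concrete, minimal sketch of how these pieces fit together (the lib and tools layout follows the project documentation; llvm/projects/sample remains the authoritative example), a project's top-level Makefile might look like this:

# Sketch of a hypothetical project's top-level Makefile.
# LEVEL is "." because this Makefile sits at the project's top directory.
LEVEL = .

# Build the libraries before the tools that link against them.
DIRS = lib tools

# Pull in the project's Makefile.common, which in turn includes
# Makefile.config and $(LLVM_SRC_ROOT)/Makefile.rules.
include $(LEVEL)/Makefile.common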

Makefile.config
Every project must have a Makefile.config at the top of its build directory. This file is generated by the configure script from the pattern provided by the Makefile.config.in file located at the top of the project's source directory. The contents of this file depend largely on what configuration items the project uses, however most projects can get what they need by just relying on LLVM's configuration found in $(LLVM_OBJ_ROOT)/Makefile.config.

Makefile.rules
This file, located at $(LLVM_SRC_ROOT)/Makefile.rules, is the heart of the LLVM Makefile System. It provides all the logic, dependencies, and rules for building the targets supported by the system. What it does largely depends on the values of make variables that have been set before Makefile.rules is included.

Comments
User Makefiles need not have comments in them unless the construction is unusual or it does not strictly follow the rules and patterns of the LLVM makefile system. Makefile comments are invoked with the pound (#) character. The # character and any text following it, to the end of the line, are ignored by make.

Tutorial
This section provides some examples of the different kinds of modules you can build with the LLVM makefile system. In general, each directory you provide will build a single object although that object may be composed of additionally compiled components.

Libraries
Only a few variable definitions are needed to build a regular library. Normally, the makefile system will build all the software into a single libname.o (pre-linked) object. This means the library is not searchable and that the distinction between compilation units has been dissolved. Optionally, you can ask for a shared library (.so) or archive library (.a) version. Archive libraries are the default. For example:

LIBRARYNAME = mylib
SHARED_LIBRARY = 1
ARCHIVE_LIBRARY = 1

says to build a library named "mylib" with both a shared library (mylib.so) and an archive library (mylib.a) version. The contents of all the libraries produced will be the same, they are just constructed differently. Note that you normally do not need to specify the sources involved. The LLVM Makefile system will infer the source files from the contents of the source directory.

The LOADABLE_MODULE=1 directive can be used in conjunction with SHARED_LIBRARY=1 to indicate that the resulting shared library should be openable with the dlopen function and searchable with the dlsym function (or your operating system's equivalents). While this isn't strictly necessary on Linux and a few other platforms, it is required on systems like HP-UX and Darwin. You should use LOADABLE_MODULE for any shared library that you intend to be loaded into a tool via the -load option. See the WritingAnLLVMPass.html document for an example of why you might want to do this.

Bitcode Modules
In some situations, it is desirable to build a single bitcode module from a variety of sources, instead of an archive, shared library, or bitcode library. Bitcode modules can be specified in addition to any of the other types of libraries by defining the MODULE_NAME variable. For example:

LIBRARYNAME = mylib
BYTECODE_LIBRARY = 1
MODULE_NAME = mymod

will build a module named mymod.bc from the sources in the directory. This module will be an aggregation of all the bitcode modules derived from the sources. The example will also build a bitcode archive containing a bitcode module for each compiled source file. The difference is subtle, but important depending on how the module or library is to be linked.

Loadable Modules
In some situations, you need to create a loadable module. Loadable modules can be loaded into programs like opt or llc to specify additional passes to run or targets to support. Loadable modules are also useful for debugging a pass or providing a pass with another package if that pass can't be included in LLVM.

LLVM provides complete support for building such a module. All you need to do is use the LOADABLE_MODULE variable in your Makefile. For example, to build a loadable module named MyMod that uses the LLVM libraries LLVMSupport.a and LLVMSystem.a, you would specify:

LIBRARYNAME := MyMod
LOADABLE_MODULE := 1
LINK_COMPONENTS := support system

Use of the LOADABLE_MODULE facility implies several things:
1. There will be no "lib" prefix on the module. This differentiates it from a standard shared library of the same name.
2. The SHARED_LIBRARY variable is turned on.
3. The LINK_LIBS_IN_SHARED variable is turned on.

A loadable module is loaded by LLVM via the facilities of libtool's libltdl library which is part of the lib/System implementation.

Tools
For building executable programs (tools), you must provide the name of the tool and the names of the libraries you wish to link with the tool. For example:

TOOLNAME = mytool
USEDLIBS = mylib
LINK_COMPONENTS = support system

says that we are to build a tool named mytool and that it requires three libraries: mylib, LLVMSupport.a and LLVMSystem.a.

Note that two different variables are used to indicate which libraries are linked: USEDLIBS and LLVMLIBS. USEDLIBS refers to the libraries built by your project. LLVMLIBS refers to the LLVM libraries found in the LLVM object directory. This distinction is necessary to support projects. In the case of building LLVM tools, USEDLIBS and LLVMLIBS can be used interchangeably since the "project" is LLVM itself and USEDLIBS refers to the same place as LLVMLIBS.

Also note that there are two different ways of specifying a library: with a .a suffix and without. Without the suffix, the entry refers to the re-linked (.o) file which will include all symbols of the library. This is useful, for example, to include all passes from a library of passes. If the .a suffix is used then the library is linked as a searchable library (with the -l option). In this case, only the symbols that are unresolved at that point will be resolved from the library, if they exist. Other (unreferenced) symbols will not be included when the .a syntax is used. Note that in order to use the .a suffix, the library in question must have been built with the ARCHIVE_LIBRARY option set.

JIT Tools
Many tools will want to use the JIT features of LLVM. To do this, you simply specify that you want an execution 'engine', and the makefiles will automatically link in the appropriate JIT for the host or an interpreter if none is available:

TOOLNAME = my_jit_tool
USEDLIBS = mylib
LINK_COMPONENTS = engine

Of course, any additional libraries may be listed as other components. To get a full understanding of how this changes the linker command, it is recommended that you:

cd examples/Fibonacci
make VERBOSE=1

Targets Supported
This section describes each of the targets that can be built using the LLVM Makefile system. Any target can be invoked from any directory but not all are applicable to a given directory (e.g. "check", "dist" and "install" will always operate as if invoked from the top level directory).

Target Name    Implied Targets   Target Description
all                              Compile the software recursively. Default target.
all-local                        Compile the software in the local directory only.
check                            Change to the test directory in a project and run the test suite there.
check-local                      Run a local test suite. Generally this is only defined in the Makefile of the project's test directory.
clean                            Remove built objects recursively.
clean-local                      Remove built objects from the local directory only.
dist           all               Prepare a source distribution tarball.
dist-check     all               Prepare a source distribution tarball and check that it builds.
dist-clean     clean             Clean source distribution tarball temporary files.
install        all               Copy built objects to installation directory.
preconditions  all               Check to make sure configuration and makefiles are up to date.
printvars      all               Prints variables defined by the makefile system (for debugging).
tags                             Make C and C++ tags files for emacs and vi.
uninstall                        Remove built objects from installation directory.

all (default)
When you invoke make with no arguments, you are implicitly instructing it to seek the "all" target (goal). This target is used for building the software recursively and will do different things in different directories. For example, in a lib directory, the "all" target will compile source files and generate libraries. But, in a tools directory, it will link libraries and generate executables.

all-local
This target is the same as all but it operates only on the current directory instead of recursively.

check
This target can be invoked from anywhere within a project's directories but always invokes the check-local target in the project's test directory, if it exists and has a Makefile. A warning is produced otherwise. If TESTSUITE is defined on the make command line, it will be passed down to the invocation of make check-local in the test directory. The intended usage for this is to assist in running specific suites of tests. If TESTSUITE is not set, the implementation of check-local should run all normal tests. It is up to the project to define what different values for TESTSUITE will do. See the TestingGuide for further details.

check-local
This target should be implemented by the Makefile in the project's test directory. It is invoked by the check target elsewhere. Each project is free to define the actions of check-local as appropriate for that project. The LLVM project itself uses dejagnu to run a suite of feature and regression tests. Other projects may choose to use dejagnu or any other testing mechanism.

clean
This target cleans the build directory, recursively removing all things that the Makefile builds. The cleaning rules have been made guarded so they shouldn't go awry (via rm -f $(UNSET_VARIABLE)/*, which would attempt to erase the entire directory structure).

clean-local
This target does the same thing as clean but only for the current (local) directory.

dist
This target builds a distribution tarball. It first builds the entire project using the all target and then tars up the necessary files and compresses it. The generated tarball is sufficient for a casual source distribution, but probably not for a release (see dist-check).

dist-check
This target does the same thing as the dist target but also checks the distribution tarball. The check is made by unpacking the tarball to a new directory, configuring it, building it, installing it, and then verifying that the installation results are correct (by comparing to the original build). This target can take a long time to run but should be done before a release goes out to make sure that the distributed tarball can actually be built into a working release.

dist-clean
This is a special form of the clean target. It performs a normal clean but also removes things pertaining to building the distribution.

install
This target finalizes shared objects and executables and copies all libraries, headers, executables and documentation to the directory given with the --prefix option to configure. When completed, the prefix directory will have everything needed to use LLVM.

The LLVM makefiles can generate complete internal documentation for all the classes by using doxygen. By default, this feature is not enabled because it takes a long time and generates a massive amount of data (>100MB). If you want this feature, you must configure LLVM with the --enable-doxygen switch and ensure that a modern version of doxygen (1.3.7 or later) is available in your PATH. You can download doxygen from here.

preconditions
This utility target checks to see if the Makefile in the object directory is older than the Makefile in the source directory and copies it if so. It also reruns the configure script if that needs to be done and rebuilds the Makefile.config file similarly. Users may overload this target to ensure that sanity checks are run before any building of targets as all the targets depend on preconditions.

printvars
This utility target just causes the LLVM makefiles to print out some of the makefile variables so that you can double check how things are set.

reconfigure
This utility target will force a reconfigure of LLVM or your project. It simply runs $(PROJ_OBJ_ROOT)/config.status --recheck to rerun the configuration tests and rebuild the configured files. This isn't generally useful as the makefiles will reconfigure themselves whenever it's necessary.

spotless
This utility target, only available when $(PROJ_OBJ_ROOT) is not the same as $(PROJ_SRC_ROOT), will completely clean the $(PROJ_OBJ_ROOT) directory by removing its content entirely and reconfiguring the directory. This returns the $(PROJ_OBJ_ROOT) directory to a completely fresh state. All content in the directory except configured files and top-level makefiles will be lost. Use with caution.

tags
This target will generate a TAGS file in the top-level source directory. It is meant for use with emacs, XEmacs, or ViM. The TAGS file provides an index of symbol definitions so that the editor can jump you to the definition quickly.

uninstall
This target is the opposite of the install target. It removes the header, library and executable files from the installation directories. Note that the directories themselves are not removed because it is not guaranteed that LLVM is the only thing installing there (e.g. --prefix=/usr).

Variables
Variables are used to tell the LLVM Makefile System what to do and to obtain information from it. Variables are also used internally by the LLVM Makefile System. Variable names that contain only the upper case alphabetic letters and underscore are intended for use by the end user. All other variables are internal to the LLVM Makefile System and should not be relied upon nor modified. The sections below describe how to use the LLVM Makefile variables.

Control Variables
Variables listed in the table below should be set before the inclusion of $(LEVEL)/Makefile.common. These variables provide input to the LLVM make system that tell it what to do for the current directory.

BUILD_ARCHIVE
If set to any value, causes an archive (.a) library to be built.
BUILT_SOURCES
Specifies a set of source files that are generated from other source files. These sources will be built before any other target processing to ensure they are present.
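As a hedged illustration of BUILT_SOURCES (the file names and the sed command here are invented; recipe lines must start with a tab), a directory that generates a source file with its own rule can list it so it exists before anything else is processed:

# Sketch: generate Version.inc before compiling anything else in this directory.
BUILT_SOURCES = Version.inc

Version.inc: $(PROJ_SRC_DIR)/VERSION.txt
	sed -e 's/^/#define PROJ_VERSION "/' -e 's/$$/"/' $< > $@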

BYTECODE_LIBRARY
If set to any value, causes a bitcode library (.bc) to be built.
CONFIG_FILES
Specifies a set of configuration files to be installed.
DEBUG_SYMBOLS
If set to any value, causes the build to include debugging symbols even in optimized objects, libraries and executables. This alters the flags specified to the compilers and linkers. Debugging isn't fun in an optimized build, but it is possible.
DIRS
Specifies a set of directories, usually children of the current directory, that should also be made using the same goal. These directories will be built serially.
DISABLE_ASSERTIONS
If set to any value, causes the build to disable assertions, even if building a release or profile build. This will exclude all assertion check code from the build. LLVM will execute faster, but with little help when things go wrong.
DISABLE_AUTO_DEPENDENCIES
If set to any value, causes the makefiles to not automatically generate dependencies when running the compiler. Use of this feature is discouraged and it may be removed at a later date.
ENABLE_OPTIMIZED
If set to any value, causes the build to generate optimized objects, libraries and executables. This alters the flags specified to the compilers and linkers. Generally debugging won't be a fun experience with an optimized build.
ENABLE_PROFILING
If set to any value, causes the build to generate both optimized and profiled objects, libraries and executables. This alters the flags specified to the compilers and linkers to ensure that profile data can be collected from the tools built. Use the gprof tool to analyze the output from the profiled tools (gmon.out).
EXPERIMENTAL_DIRS
Specify a set of directories that should be built, but if they fail, it should not cause the build to fail. Note that this should only be used temporarily while code is being written.
EXPORTED_SYMBOL_FILE
Specifies the name of a single file that contains a list of the symbols to be exported by the linker. One symbol per line.
EXPORTED_SYMBOL_LIST
Specifies a set of symbols to be exported by the linker.
EXTRA_DIST
Specifies additional files that should be distributed with LLVM. All source files, all built sources, all Makefiles, and most documentation files will be automatically distributed. Use this variable to distribute any files that are not automatically distributed.
KEEP_SYMBOLS
If set to any value, specifies that when linking executables the makefiles should retain debug symbols in the executable. Normally, symbols are stripped from the executable.
LEVEL(required)
Specify the level of nesting from the top level. This variable must be set in each makefile as it is used to find the top level and thus the other makefiles.
LIBRARYNAME
Specify the name of the library to be built. (Required For Libraries)
LINK_COMPONENTS
When specified for building a tool, the value of this variable will be passed to the llvm-config tool to generate a link line for the tool. Unlike USEDLIBS and LLVMLIBS, not all libraries need to be specified. The llvm-config tool will figure out the library dependencies and add any libraries that are needed. The USEDLIBS variable can still be used in conjunction with LINK_COMPONENTS so that additional project-specific libraries can be linked with the LLVM libraries specified by LINK_COMPONENTS.
LINK_LIBS_IN_SHARED
By default, shared library linking will ignore any libraries specified with the LLVMLIBS or USEDLIBS. This prevents shared libs from including things that will be in the LLVM tool the shared library will be loaded into. However, sometimes it is useful to link certain libraries into your shared library and this option enables that feature.
LLVMLIBS
Specifies the set of libraries from the LLVM $(ObjDir) that will be linked into the tool or library.
LOADABLE_MODULE
If set to any value, causes the shared library being built to also be a loadable module. Loadable modules can be opened with the dlopen() function and searched with dlsym (or the operating system's equivalent). Note that setting this variable without also setting SHARED_LIBRARY will have no effect.
MODULE_NAME
Specifies the name of a bitcode module to be created. A bitcode module can be specified in conjunction with other kinds of library builds or by itself. It constructs from the sources a single linked bitcode file.
NO_INSTALL
Specifies that the build products of the directory should not be installed but should be built even if the install target is given. This is handy for directories that build libraries or tools that are only used as part of the build process, such as code generators (e.g. tblgen).
OPTIONAL_DIRS
Specify a set of directories that may be built, if they exist, but it's not an error for them not to exist.
PARALLEL_DIRS
Specify a set of directories to build recursively and in parallel if the -j option was used with make.
SHARED_LIBRARY
If set to any value, causes a shared library (.so) to be built in addition to any other kinds of libraries. Note that this option will cause all source files to be built twice: once with options for position independent code and once without. Use it only where you really need a shared library.
SOURCES(optional)
Specifies the list of source files in the current directory to be built. Source files of any type may be specified (programs, documentation, config files, etc.). If not specified, the makefile system will infer the set of source files from the files present in the current directory.
SUFFIXES
Specifies a set of filename suffixes that occur in suffix match rules. Only set this if your local Makefile specifies additional suffix match rules.
TARGET
Specifies the name of the LLVM code generation target that the current directory builds. Setting this variable enables additional rules to build .inc files from .td files.
TESTSUITE
Specifies the directory of tests to run in llvm/test.
TOOLNAME
Specifies the name of the tool that the current directory should build.
TOOL_VERBOSE
Implies VERBOSE and also tells each tool invoked to be verbose. This is handy when you're trying to see the sub-tools invoked by each tool invoked by the makefile. For example, this will pass -v to the GCC compilers, which causes them to print out the command lines they use to invoke sub-tools (compiler, assembler, linker).
USEDLIBS
Specifies the list of project libraries that will be linked into the tool or library.
VERBOSE
Tells the Makefile system to produce detailed output of what it is doing instead of just summary comments. This will generate a LOT of output.
PROJ_SRC_DIR
The directory which contains the source files to be built.
PROJ_OBJ_DIR
The directory into which the products of build rules will be placed. This might be the same as PROJ_SRC_DIR but typically is not.

Override Variables
Override variables can be used to override the default values provided by the LLVM makefile system. These variables can be set in several ways:
• In the environment (e.g. setenv, export) -- not recommended.
• On the make command line -- recommended.
• On the configure command line
• In the Makefile (only after the inclusion of $(LEVEL)/Makefile.common)

The override variables are given below:

AR(defaulted)
Specifies the path to the ar tool.
BZIP2(configured)
The path to the bzip2 tool.
CC(configured)
The path to the 'C' compiler.
CFLAGS
Additional flags to be passed to the 'C' compiler.
CXX
Specifies the path to the C++ compiler.
CXXFLAGS
Additional flags to be passed to the C++ compiler.
DATE(configured)
Specifies the path to the date program or any program that can generate the current date and time on its standard output.
DOT(configured)
Specifies the path to the dot tool or false if there isn't one.
ECHO(configured)
Specifies the path to the echo tool for printing output.
EXEEXT(configured)
Provides the extension to be used on executables built by the makefiles. The value may be empty on platforms that do not use file extensions for executables (e.g. Unix).
INSTALL(configured)
Specifies the path to the install tool.
LDFLAGS(configured)
Allows users to specify additional flags to pass to the linker.
LIBS(configured)
The list of libraries that should be linked with each tool.
LIBTOOL(configured)
Specifies the path to the libtool tool. This tool is renamed mklib by the configure script and
always located in the
LLVMAS(defaulted)
Specifies the path to the llvm-as tool.
LLVMCC
Specifies the path to the LLVM capable compiler.
LLVMCXX
Specifies the path to the LLVM C++ capable compiler.
LLVMGCC(defaulted)
Specifies the path to the LLVM version of the GCC 'C' Compiler
LLVMGXX(defaulted)
Specifies the path to the LLVM version of the GCC C++ Compiler
LLVMLD(defaulted)
Specifies the path to the LLVM bitcode linker tool
LLVM_OBJ_ROOT(configured)
Specifies the top directory into which the output of the build is placed.
LLVM_SRC_ROOT(configured)
Specifies the top directory in which the sources are found.
LLVM_TARBALL_NAME (configured)
Specifies the name of the distribution tarball to create. This is configured from the name of the project
and its version number.
MKDIR(defaulted)
Specifies the path to the mkdir tool that creates directories.
ONLY_TOOLS
If set, specifies the list of tools to build.
PLATFORMSTRIPOPTS
The options to provide to the linker to specify that a stripped (no symbols) executable should be built.
RANLIB(defaulted)
Specifies the path to the ranlib tool.
RM(defaulted)
Specifies the path to the rm tool.
SED(defaulted)
Specifies the path to the sed tool.
SHLIBEXT(configured)
Provides the filename extension to use for shared libraries.
TBLGEN(defaulted)
Specifies the path to the tblgen tool.
TAR(defaulted)
Specifies the path to the tar tool.
ZIP(defaulted)
Specifies the path to the zip tool.
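For example (the compiler paths here are invented for illustration), a couple of these can be overridden directly on the make command line:

make CC=/opt/gcc-4.4/bin/gcc CXX=/opt/gcc-4.4/bin/g++ VERBOSE=1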

Readable Variables
Variables listed in the table below can be used by the user's Makefile but should not be changed. Changing the
value will generally cause the build to go wrong, so don't do it.

bindir
The directory into which executables will ultimately be installed. This value is derived from the
--prefix option given to configure.

BuildMode
The name of the type of build being performed: Debug, Release, or Profile
bytecode_libdir
The directory into which bitcode libraries will ultimately be installed. This value is derived from the
--prefix option given to configure.
ConfigureScriptFLAGS
Additional flags given to the configure script when reconfiguring.
DistDir
The current directory for which a distribution copy is being made.
Echo
The LLVM Makefile System output command. This provides the llvm[n] prefix and starts with @
so the command itself is not printed by make.
EchoCmd
Same as Echo but without the leading @.
includedir
The directory into which include files will ultimately be installed. This value is derived from the
--prefix option given to configure.
libdir
The directory into which native libraries will ultimately be installed. This value is derived from the
--prefix option given to configure.
LibDir
The configuration specific directory into which libraries are placed before installation.
MakefileConfig
Full path of the Makefile.config file.
MakefileConfigIn
Full path of the Makefile.config.in file.
ObjDir
The configuration and directory specific directory where build objects (compilation results) are
placed.
SubDirs
The complete list of sub-directories of the current directory as specified by other variables.
Sources
The complete list of source files.
sysconfdir
The directory into which configuration files will ultimately be installed. This value is derived from the
--prefix option given to configure.
ToolDir
The configuration specific directory into which executables are placed before they are installed.
TopDistDir
The top most directory into which the distribution files are copied.
Verb
Use this as the first thing on your build script lines to enable or disable verbose mode. It expands to
either an @ (quiet mode) or nothing (verbose mode).
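As a sketch of how Echo and Verb are typically combined in a project's own rules (the target name and the doxygen command are invented for illustration; recipe lines must start with a tab):

docs::
	$(Echo) Generating project documentation
	$(Verb) doxygen $(PROJ_SRC_DIR)/doxygen.cfg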

Internal Variables
Variables listed below are used by the LLVM Makefile System and considered internal. You should not use
these variables under any circumstances.

Archive AR.Flags BaseNameSources BCCompile.C BCCompile.CXX BCLinkLib
C.Flags Compile.C CompileCommonOpts Compile.CXX ConfigStatusScript
ConfigureScript CPP.Flags CXX.Flags DependFiles DestArchiveLib
DestBitcodeLib DestModule DestSharedLib DestTool DistAlways DistCheckDir
DistCheckTop DistFiles DistName DistOther DistSources DistSubDirs
DistTarBZ2 DistTarGZip DistZip ExtraLibs FakeSources INCFiles
InternalTargets LD.Flags LibName.A LibName.BC LibName.LA LibName.O
LibTool.Flags Link LinkModule LLVMLibDir LLVMLibsOptions LLVMLibsPaths
LLVMToolDir LLVMUsedLibs LocalTargets Module ObjectsBC ObjectsLO ObjectsO
ObjMakefiles ParallelTargets PreConditions ProjLibsOptions ProjLibsPaths
ProjUsedLibs Ranlib RecursiveTargets SrcMakefiles Strip StripWarnMsg
TableGen TDFiles ToolBuildPath TopLevelTargets UserTargets

Reid Spencer
The LLVM Compiler Infrastructure
Last modified: $Date: 2010-02-23 04:00:53 -0600 (Tue, 23 Feb 2010) $

CommandLine 2.0 Library Manual

1. Introduction
2. Quick Start Guide
1. Boolean Arguments
2. Argument Aliases
3. Selecting an alternative from a set of possibilities
4. Named alternatives
5. Parsing a list of options
6. Collecting options as a set of flags
7. Adding freeform text to help output
3. Reference Guide
1. Positional Arguments
◊ Specifying positional options with hyphens
◊ Determining absolute position with getPosition
◊ The cl::ConsumeAfter modifier
2. Internal vs External Storage
3. Option Attributes
4. Option Modifiers
◊ Hiding an option from -help output
◊ Controlling the number of occurrences required and allowed
◊ Controlling whether or not a value must be specified
◊ Controlling other formatting options
◊ Miscellaneous option modifiers
◊ Response files
5. Top-Level Classes and Functions
◊ The cl::ParseCommandLineOptions function
◊ The cl::ParseEnvironmentOptions function
◊ The cl::SetVersionPrinter function
◊ The cl::opt class
◊ The cl::list class
◊ The cl::bits class
◊ The cl::alias class
◊ The cl::extrahelp class
6. Builtin parsers
◊ The Generic parser<t> parser
◊ The parser<bool> specialization
◊ The parser<boolOrDefault> specialization
◊ The parser<string> specialization
◊ The parser<int> specialization
◊ The parser<double> and parser<float> specializations
4. Extension Guide
1. Writing a custom parser
2. Exploiting external storage
3. Dynamically adding command line options

Written by Chris Lattner

Introduction
This document describes the CommandLine argument processing library. It will show you how to use it, and
what it can do. The CommandLine library uses a declarative approach to specifying the command line options

that your program takes. By default, these options declarations implicitly hold the value parsed for the option
declared (of course this can be changed).

Although there are a lot of command line argument parsing libraries out there in many different languages,
none of them fit well with what I needed. By looking at the features and problems of other libraries, I
designed the CommandLine library to have the following features:

1. Speed: The CommandLine library is very quick and uses little resources. The parsing time of the
library is directly proportional to the number of arguments parsed, not the number of options
recognized. Additionally, command line argument values are captured transparently into user defined
global variables, which can be accessed like any other variable (and with the same performance).
2. Type Safe: As a user of CommandLine, you don't have to worry about remembering the type of
arguments that you want (is it an int? a string? a bool? an enum?) and keep casting it around. Not only
does this help prevent error prone constructs, it also leads to dramatically cleaner source code.
3. No subclasses required: To use CommandLine, you instantiate variables that correspond to the
arguments that you would like to capture, you don't subclass a parser. This means that you don't have
to write any boilerplate code.
4. Globally accessible: Libraries can specify command line arguments that are automatically enabled in
any tool that links to the library. This is possible because the application doesn't have to keep a list of
arguments to pass to the parser. This also makes supporting dynamically loaded options trivial.
5. Cleaner: CommandLine supports enum and other types directly, meaning that there is less error and
more security built into the library. You don't have to worry about whether your integral command
line argument accidentally got assigned a value that is not valid for your enum type.
6. Powerful: The CommandLine library supports many different types of arguments, from simple
boolean flags to scalar arguments (strings, integers, enums, doubles), to lists of arguments. This is
possible because CommandLine is...
7. Extensible: It is very simple to add a new argument type to CommandLine. Simply specify the parser
that you want to use with the command line option when you declare it. Custom parsers are no
problem.
8. Labor Saving: The CommandLine library cuts down on the amount of grunt work that you, the user,
have to do. For example, it automatically provides a -help option that shows the available command
line options for your tool. Additionally, it does most of the basic correctness checking for you.
9. Capable: The CommandLine library can handle lots of different forms of options often found in real
programs. For example, positional arguments, ls style grouping options (to allow processing 'ls
-lad' naturally), ld style prefix options (to parse '-lmalloc -L/usr/lib'), and interpreter style
options.

This document will hopefully let you jump in and start using CommandLine in your utility quickly and
painlessly. Additionally it should be a simple reference manual to figure out how stuff works. If it is failing in
some area (or you want an extension to the library), nag the author, Chris Lattner.

Quick Start Guide
This section of the manual runs through a simple CommandLine'ification of a basic compiler tool. This is
intended to show you how to jump into using the CommandLine library in your own program, and show you
some of the cool things it can do.

To start out, you need to include the CommandLine header file into your program:

#include "llvm/Support/CommandLine.h"

Additionally, you need to add this as the first line of your main program:

int main(int argc, char **argv) {
cl::ParseCommandLineOptions(argc, argv);
...
}

... which actually parses the arguments and fills in the variable declarations.

Now that you are ready to support command line arguments, we need to tell the system which ones we want,
and what type of arguments they are. The CommandLine library uses a declarative syntax to model command
line arguments with the global variable declarations that capture the parsed values. This means that for every
command line option that you would like to support, there should be a global variable declaration to capture
the result. For example, in a compiler, we would like to support the Unix-standard '-o <filename>' option
to specify where to put the output. With the CommandLine library, this is represented like this:

cl::opt<string> OutputFilename("o", cl::desc("Specify output filename"), cl::value_desc("filename"));

This declares a global variable "OutputFilename" that is used to capture the result of the "o" argument
(first parameter). We specify that this is a simple scalar option by using the "cl::opt" template (as opposed
to the "cl::list" template), and tell the CommandLine library that the data type that we are parsing is a
string.

The second and third parameters (which are optional) are used to specify what to output for the "-help"
option. In this case, we get a line that looks like this:

USAGE: compiler [options]

OPTIONS:
-help - display available options (-help-hidden for more)
-o <filename> - Specify output filename

Because we specified that the command line option should parse using the string data type, the variable
declared is automatically usable as a real string in all contexts that a normal C++ string object may be used.
For example:

...
std::ofstream Output(OutputFilename.c_str());
if (Output.good()) ...
...

There are many different options that you can use to customize the command line option handling library, but
the above example shows the general interface to these options. The options can be specified in any order, and
are specified with helper functions like cl::desc(...), so there are no positional dependencies to
remember. The available options are discussed in detail in the Reference Guide.
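To make the order independence concrete, the -o declaration shown above can just as well be written with its modifiers swapped; the two spellings are interchangeable:

// Equivalent to the earlier declaration of -o; only the modifier order differs.
cl::opt<string> OutputFilename("o", cl::value_desc("filename"),
                               cl::desc("Specify output filename"));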

Continuing the example, we would like to have our compiler take an input filename as well as an output
filename, but we do not want the input filename to be specified with a hyphen (ie, not -filename.c). To
support this style of argument, the CommandLine library allows for positional arguments to be specified for
the program. These positional arguments are filled with command line parameters that are not in option form.
We use this feature like this:

cl::opt<string> InputFilename(cl::Positional, cl::desc("<input file>"), cl::init("-"));

This declaration indicates that the first positional argument should be treated as the input filename. Here we
use the cl::init option to specify an initial value for the command line option, which is used if the option
is not specified (if you do not specify a cl::init modifier for an option, then the default constructor for the
data type is used to initialize the value). Command line options default to being optional, so if we would like
to require that the user always specify an input filename, we would add the cl::Required flag, and we
could eliminate the cl::init modifier, like this:

cl::opt<string> InputFilename(cl::Positional, cl::desc("<input file>"), cl::Required);

Again, the CommandLine library does not require the options to be specified in any particular order, so the
above declaration is equivalent to:

cl::opt<string> InputFilename(cl::Positional, cl::Required, cl::desc("<input file>"));

By simply adding the cl::Required flag, the CommandLine library will automatically issue an error if the
argument is not specified, which shifts all of the command line option verification code out of your
application into the library. This is just one example of how using flags can alter the default behaviour of the
library, on a per-option basis. By adding one of the declarations above, the -help option synopsis is now
extended to:

USAGE: compiler [options] <input file>

OPTIONS:
-help - display available options (-help-hidden for more)
-o <filename> - Specify output filename

... indicating that an input filename is expected.

Boolean Arguments
In addition to input and output filenames, we would like the compiler example to support three boolean flags:
"-f" to force writing binary output to a terminal, "--quiet" to enable quiet mode, and "-q" for backwards
compatibility with some of our users. We can support these by declaring options of boolean type like this:

cl::opt<bool> Force ("f", cl::desc("Enable binary output on terminals"));
cl::opt<bool> Quiet ("quiet", cl::desc("Don't print informational messages"));
cl::opt<bool> Quiet2("q", cl::desc("Don't print informational messages"), cl::Hidden);

This does what you would expect: it declares three boolean variables ("Force", "Quiet", and "Quiet2") to
recognize these options. Note that the "-q" option is specified with the "cl::Hidden" flag. This modifier
prevents it from being shown by the standard "-help" output (note that it is still shown in the
"-help-hidden" output).

The CommandLine library uses a different parser for different data types. For example, in the string case, the
argument passed to the option is copied literally into the content of the string variable... we obviously cannot
do that in the boolean case, however, so we must use a smarter parser. In the case of the boolean parser, it
allows no options (in which case it assigns the value of true to the variable), or it allows the values "true" or
"false" to be specified, allowing any of the following inputs:

compiler -f # No value, 'Force' == true
compiler -f=true # Value specified, 'Force' == true
compiler -f=TRUE # Value specified, 'Force' == true
compiler -f=FALSE # Value specified, 'Force' == false

... you get the idea. The bool parser just turns the string values into boolean values, and rejects things like
'compiler -f=foo'. Similarly, the float, double, and int parsers work like you would expect, using the
'strtol' and 'strtod' C library calls to parse the string value into the specified data type.
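For example (the option names here are invented for illustration), numeric options are declared exactly the same way as the string and boolean ones, and inputs such as '-threshold=10' are converted by the corresponding parser:

cl::opt<int>    Threshold("threshold", cl::desc("An integer-valued option"), cl::init(10));
cl::opt<double> Scale("scale", cl::desc("A floating point option"), cl::init(1.0));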

With the declarations above, "compiler -help" emits this:

USAGE: compiler [options] <input file>

OPTIONS:
-f - Enable binary output on terminals
-o - Override output filename
-quiet - Don't print informational messages
-help - display available options (-help-hidden for more)

and "compiler -help-hidden" prints this:

USAGE: compiler [options] <input file>

OPTIONS:
-f - Enable binary output on terminals
-o - Override output filename
-q - Don't print informational messages
-quiet - Don't print informational messages
-help - display available options (-help-hidden for more)

This brief example has shown you how to use the 'cl::opt' class to parse simple scalar command line
arguments. In addition to simple scalar arguments, the CommandLine library also provides primitives to
support CommandLine option aliases, and lists of options.

Argument Aliases
So far, the example works well, except for the fact that we need to check the quiet condition like this now:

...
if (!Quiet && !Quiet2) printInformationalMessage(...);
...

... which is a real pain! Instead of defining two values for the same condition, we can use the "cl::alias"
class to make the "-q" option an alias for the "-quiet" option, instead of providing a value itself:

cl::opt<bool> Force ("f", cl::desc("Overwrite output files"));
cl::opt<bool> Quiet ("quiet", cl::desc("Don't print informational messages"));
cl::alias QuietA("q", cl::desc("Alias for -quiet"), cl::aliasopt(Quiet));

The third line (which is the only one we modified from above) defines a "-q" alias that updates the "Quiet"
variable (as specified by the cl::aliasopt modifier) whenever it is specified. Because aliases do not hold
state, the only thing the program has to query is the Quiet variable now. Another nice feature of aliases is
that they automatically hide themselves from the -help output (although, again, they are still visible in the
-help-hidden output).

Now the application code can simply use:

...
if (!Quiet) printInformationalMessage(...);
...

... which is much nicer! The "cl::alias" can be used to specify an alternative name for any variable type,
and has many uses.
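For instance, the same mechanism can give the -o option from the earlier example a longer spelling (the alias name here is invented for illustration):

cl::alias OutputFilenameAlias("output", cl::desc("Alias for -o"),
                              cl::aliasopt(OutputFilename));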

Selecting an alternative from a set of possibilities
So far we have seen how the CommandLine library handles builtin types like std::string, bool and
int, but how does it handle things it doesn't know about, like enums or 'int*'s?

The answer is that it uses a table-driven generic parser (unless you specify your own parser, as described in
the Extension Guide). This parser maps literal strings to whatever type is required, and requires you to tell it
what this mapping should be.

Let's say that we would like to add four optimization levels to our optimizer, using the standard flags "-g",
"-O1", "-O2", and "-O3". We could easily implement this with boolean options like above, but there are
several problems with this strategy:

1. A user could specify more than one of the options at a time, for example, "compiler -O3 -O2".
The CommandLine library would not be able to catch this erroneous input for us.
2. We would have to test 4 different variables to see which ones are set.
3. This doesn't map to the numeric levels that we want... so we cannot easily see if some level >= "-O1"
is enabled.

To cope with these problems, we can use an enum value, and have the CommandLine library fill it in with the
appropriate level directly, which is used like this:

enum OptLevel {
g, O1, O2, O3
};

cl::opt<OptLevel> OptimizationLevel(cl::desc("Choose optimization level:"),
cl::values(
clEnumVal(g , "No optimizations, enable debugging"),
clEnumVal(O1, "Enable trivial optimizations"),
clEnumVal(O2, "Enable default optimizations"),
clEnumVal(O3, "Enable expensive optimizations"),
clEnumValEnd));

...
if (OptimizationLevel >= O2) doPartialRedundancyElimination(...);
...

This declaration defines a variable "OptimizationLevel" of the "OptLevel" enum type. This variable
can be assigned any of the values that are listed in the declaration (Note that the declaration list must be
terminated with the "clEnumValEnd" argument!). The CommandLine library enforces that the user can
only specify one of the options, and it ensure that only valid enum values can be specified. The
"clEnumVal" macros ensure that the command line arguments matched the enum values. With this option
added, our help output now is:

USAGE: compiler [options] <input file>

OPTIONS:
Choose optimization level:
-g - No optimizations, enable debugging
-O1 - Enable trivial optimizations
-O2 - Enable default optimizations
-O3 - Enable expensive optimizations

-f - Enable binary output on terminals
-help - display available options (-help-hidden for more)
-o <filename> - Specify output filename
-quiet - Don't print informational messages

In this case, it is sort of awkward that flag names correspond directly to enum names, because we probably don't want an enum definition named "g" in our program. Because of this, we can alternatively write this example like this:

enum OptLevel {
Debug, O1, O2, O3
};

cl::opt<OptLevel> OptimizationLevel(cl::desc("Choose optimization level:"),
cl::values(
clEnumValN(Debug, "g", "No optimizations, enable debugging"),
clEnumVal(O1, "Enable trivial optimizations"),
clEnumVal(O2, "Enable default optimizations"),
clEnumVal(O3, "Enable expensive optimizations"),
clEnumValEnd));

...
if (OptimizationLevel == Debug) outputDebugInfo(...);
...

By using the "clEnumValN" macro instead of "clEnumVal", we can directly specify the name that the flag should get. In general a direct mapping is nice, but sometimes you can't or don't want to preserve the mapping, which is when you would use it.

Named Alternatives
Another useful argument form is a named alternative style. We shall use this style in our compiler to specify different debug levels that can be used. Instead of each debug level being its own switch, we want to support the following options, of which only one can be specified at a time: "--debug-level=none", "--debug-level=quick", "--debug-level=detailed". To do this, we use the exact same format as our optimization level flags, but we also specify an option name. For this case, the code looks like this:

enum DebugLev {
nodebuginfo, quick, detailed
};

// Enable Debug Options to be specified on the command line
cl::opt<DebugLev> DebugLevel("debug_level", cl::desc("Set the debugging level:"),
cl::values(
clEnumValN(nodebuginfo, "none", "disable debug information"),
clEnumVal(quick, "enable quick debug information"),
clEnumVal(detailed, "enable detailed debug information"),
clEnumValEnd));

This definition defines an enumerated command line variable of type "enum DebugLev", which works exactly the same way as before. The difference here is just the interface exposed to the user of your program and the help output by the "-help" option:

USAGE: compiler [options] <input file>

OPTIONS:
Choose optimization level:
-g - No optimizations, enable debugging
-O1 - Enable trivial optimizations
-O2 - Enable default optimizations
-O3 - Enable expensive optimizations
-debug_level - Set the debugging level:
=none - disable debug information
=quick - enable quick debug information
=detailed - enable detailed debug information
-f - Enable binary output on terminals
-help - display available options (-help-hidden for more)
-o <filename> - Specify output filename
-quiet - Don't print informational messages

Again, the only structural difference between the debug level declaration and the optimization level declaration is that the debug level declaration includes an option name ("debug_level"), which automatically changes how the library processes the argument. The CommandLine library supports both forms so that you can choose the form most appropriate for your application.

Parsing a list of options
Now that we have the standard run-of-the-mill argument types out of the way, let's get a little wild and crazy. Let's say that we want our optimizer to accept a list of optimizations to perform, allowing duplicates. For example, we might want to run: "compiler -dce -constprop -inline -dce -strip". In this case, the order of the arguments and the number of appearances is very important. This is what the "cl::list" template is for. First, start by defining an enum of the optimizations that you would like to perform:

enum Opts {
// 'inline' is a C++ keyword, so name it 'inlining'
dce, constprop, inlining, strip
};

Then define your "cl::list" variable:

cl::list<Opts> OptimizationList(cl::desc("Available Optimizations:"),
cl::values(
clEnumVal(dce, "Dead Code Elimination"),
clEnumVal(constprop, "Constant Propagation"),
clEnumValN(inlining, "inline", "Procedure Integration"),
clEnumVal(strip, "Strip Symbols"),
clEnumValEnd));

This defines a variable that is conceptually of the type "std::vector<enum Opts>". Thus, you can access it with standard vector methods:

for (unsigned i = 0; i != OptimizationList.size(); ++i)
switch (OptimizationList[i])
...

... to iterate through the list of options specified.

Note that the "cl::list" template is completely general and may be used with any data types or other arguments that you can use with the "cl::opt" template. One especially useful way to use a list is to capture all of the positional arguments together if there may be more than one specified. In the case of a linker, for example, the linker takes several '.o' files, and needs to capture them into a list. This is naturally specified as:

we used the cl::OneOrMore modifier to inform the CommandLine library that it is an error if the user does not specify any . "Dead Code Elimination"). 0 otherwise. this just reduces the amount of checking we have to do. To test to see if constprop was specified. cl::list<std::string> InputFilenames(cl::Positional. we may decide to put summary information about what it does into the help output. "Constant Propagation"). "inline". License Information 264 . the resulting enum's bit is set in the option's bit vector: bits |= 1 <<(unsigned)enum. The help output is styled to look similar to a Unix man page. simply pass a third argument to the cl::ParseCommandLineOptions call in main. Any instances after the first are discarded.isSet(constprop)) { . Collecting options as a set of flags Instead of collecting sets of options in a list. just like above. Finally. . then the location specified must be of type unsigned.. As each specified value is parsed.. To add this to your CommandLine program. This variable works just like a "vector<string>" object.. . if external storage is used. Adding freeform text to help output As our program grows and becomes more mature. In all other ways a cl::bits option is equivalent to a cl::list option. cl::desc("<Input files>"). This additional argument is then printed as the overview information for your program.. we can use the cl:bits::isSet function: if (OptimizationBits. accessing the list is simple.getBits(). As such.. } It's also possible to get the raw bit vector using the cl::bits::getBits function: unsigned bits = OptimizationBits. however often have a description about what the program does. In this example. clEnumVal(strip . clEnumValEnd)). "Strip Symbols").. The representation used by the cl::bits class is an unsigned integer. it is also possible to gather information for enum values in a bit vector.. clEnumVal(constprop ..\n").. Documentation for the LLVM System at SVN head . " CommandLine compiler example\n\n" " This program blah blah blah. Reworking the above list example. 1 indicating that the enum was specified.o files on our command line. allowing you to include any additional information that you want. providing concise information about a program. cl::values( clEnumVal(dce . Unix man pages. clEnumValN(inlining. Options that are specified multiple times are redundant. cl::OneOrMore). argv. we could replace cl::list with cl::bits: cl::bits<Opts> OptimizationBits(cl::desc("Available Optimizations:"). "Procedure Integration"). Again. For example: int main(int argc.. char **argv) { cl::ParseCommandLineOptions(argc. An enum value is represented by a 0/1 in the enum's ordinal value bit position.

Adding freeform text to help output

As our program grows and becomes more mature, we may decide to put summary information about what it does into the help output. The help output is styled to look similar to a Unix man page, providing concise information about a program. Unix man pages, however, often have a description about what the program does. To add this to your CommandLine program, simply pass a third argument to the cl::ParseCommandLineOptions call in main. This additional argument is then printed as the overview information for your program, allowing you to include any additional information that you want. For example:

  int main(int argc, char **argv) {
    cl::ParseCommandLineOptions(argc, argv, " CommandLine compiler example\n\n"
                                "  This program blah blah blah...\n");
    ...
  }

would yield the help output:

  OVERVIEW: CommandLine compiler example

    This program blah blah blah...

  USAGE: compiler [options] <input file>

  OPTIONS:
    ...
    -help             - display available options (-help-hidden for more)
    -o <filename>     - Specify output filename

Reference Guide

Now that you know the basics of how to use the CommandLine library, this section will give you the detailed information you need to tune how command line options work, as well as information on more "advanced" command line option processing capabilities.

Positional Arguments

Positional arguments are those arguments that are not named, and are not specified with a hyphen. Positional arguments should be used when an option is specified by its position alone. For example, the standard Unix grep tool takes a regular expression argument, and an optional filename to search through (which defaults to standard input if a filename is not specified). Using the CommandLine library, this would be specified as:

  cl::opt<string> Regex   (cl::Positional, cl::desc("<regular expression>"), cl::Required);
  cl::opt<string> Filename(cl::Positional, cl::desc("<input file>"), cl::init("-"));

Given these two option declarations, the -help output for our grep replacement would look like this:

  USAGE: spiffygrep [options] <regular expression> <input file>

  OPTIONS:
    -help - display available options (-help-hidden for more)

and the resultant program could be used just like the standard grep tool.

Positional arguments are sorted by their order of construction. This means that command line options will be ordered according to how they are listed in a .cpp file, but will not have an ordering defined if the positional arguments are defined in multiple .cpp files. The fix for this problem is simply to define all of your positional arguments in one .cpp file.

Specifying positional options with hyphens

Sometimes you may want to specify a value to your positional argument that starts with a hyphen (for example, searching for '-foo' in a file). At first, you will have trouble doing this, because it will try to find an argument named '-foo', and will fail (and single quotes will not save you). Note that the system grep has the same problem:

  $ spiffygrep '-foo' test.txt
  Unknown command line argument '-foo'.  Try: spiffygrep -help'

  $ grep '-foo' test.txt
  grep: illegal option -- f
  grep: illegal option -- o
  grep: illegal option -- o
  Usage: grep -hblcnsviw pattern file . . .

The solution for this problem is the same for both your tool and the system version: use the '--' marker. When the user specifies '--' on the command line, it is telling the program that all options after the '--' should be treated as positional arguments, not options. Thus, we can use it like this:

  $ spiffygrep -- -foo test.txt
    ...output...

Determining absolute position with getPosition()

Sometimes an option can affect or modify the meaning of another option. For example, consider gcc's -x LANG option. This tells gcc to ignore the suffix of subsequent positional arguments and force the file to be interpreted as if it contained source code in language LANG. In order to handle this properly, you need to know the absolute position of each argument, especially those in lists, so their interaction(s) can be applied correctly. This is also useful for options like -llibname which is actually a positional argument that starts with a dash.

So, generally, the problem is that you have two cl::list variables that interact in some way. To ensure the correct interaction, you can use the cl::list::getPosition(optnum) method. This method returns the absolute position (as found on the command line) of the optnum item in the cl::list. The idiom for usage is like this:

  static cl::list<std::string> Files(cl::Positional, cl::OneOrMore);
  static cl::list<std::string> Libraries("l", cl::ZeroOrMore);

  int main(int argc, char **argv) {
    // ...
    std::vector<std::string>::iterator fileIt = Files.begin();
    std::vector<std::string>::iterator libIt  = Libraries.begin();
    unsigned libPos = 0, filePos = 0;
    while ( 1 ) {
      if ( libIt != Libraries.end() )
        libPos = Libraries.getPosition( libIt - Libraries.begin() );
      else
        libPos = 0;
      if ( fileIt != Files.end() )
        filePos = Files.getPosition( fileIt - Files.begin() );
      else
        filePos = 0;

      if ( filePos != 0 && (libPos == 0 || filePos < libPos) ) {
        // Source File Is next
        ++fileIt;
      }
      else if ( libPos != 0 && (filePos == 0 || libPos < filePos) ) {
        // Library is next
        ++libIt;
      }
      else {
        break; // we're done with the list
      }
    }
  }

Note that, for compatibility reasons, the cl::opt also supports an unsigned getPosition() option that will provide the absolute position of that option. You can apply the same approach as above with a cl::opt and a cl::list option as you can with two lists.
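As an illustrative sketch of mixing a cl::opt with a cl::list in this way, the fragment below applies a gcc-style "-x LANG" override only to input files that appeared after it on the command line. The option and helper names are invented for the example and not taken from an existing tool.

  static cl::opt<std::string>  Language("x", cl::desc("Override input language"));
  static cl::list<std::string> Inputs(cl::Positional, cl::desc("<inputs>"),
                                      cl::OneOrMore);

  static bool languageApplies(unsigned FileIdx) {
    // Language.getPosition() is 0 if -x was never given; otherwise the
    // override only affects files specified after the -x option itself.
    unsigned LangPos = Language.getPosition();
    return LangPos != 0 && Inputs.getPosition(FileIdx) > LangPos;
  }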

The cl::ConsumeAfter modifier

The cl::ConsumeAfter formatting option is used to construct programs that use "interpreter style" option processing. With this style of option processing, all arguments specified after the last positional argument are treated as special interpreter arguments that are not interpreted by the command line argument processor.

As a concrete example, let's say we are developing a replacement for the standard Unix Bourne shell (/bin/sh). To run /bin/sh, first you specify options to the shell itself (like -x which turns on trace output), then you specify the name of the script to run, then you specify arguments to the script. These arguments to the script are parsed by the Bourne shell command line option processor, but are not interpreted as options to the shell itself. Using the CommandLine library, we would specify this as:

  cl::opt<string>  Script(cl::Positional, cl::desc("<input script>"), cl::init("-"));
  cl::list<string> Argv(cl::ConsumeAfter, cl::desc("<program arguments>..."));
  cl::opt<bool>    Trace("x", cl::desc("Enable trace output"));

which automatically provides the help output:

  USAGE: spiffysh [options] <input script> <program arguments>...

  OPTIONS:
    -help - display available options (-help-hidden for more)
    -x    - Enable trace output

At runtime, if we run our new shell replacement as `spiffysh -x test.sh -a -x -y bar', the Trace variable will be set to true, the Script variable will be set to "test.sh", and the Argv list will contain ["-a", "-x", "-y", "bar"], because they were specified after the last positional argument (which is the script name).

There are several limitations to when cl::ConsumeAfter options can be specified. For example, only one cl::ConsumeAfter can be specified per program, there must be at least one positional argument specified, there must not be any cl::list positional arguments, and the cl::ConsumeAfter option should be a cl::list option.
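To make the runtime behavior above concrete, here is a hedged sketch of what spiffysh's main() might look like, assuming the Script, Argv, and Trace declarations just shown; it only prints what the parser handed it rather than actually interpreting a script.

  #include "llvm/Support/CommandLine.h"
  #include "llvm/Support/raw_ostream.h"
  using namespace llvm;

  int main(int argc, char **argv) {
    cl::ParseCommandLineOptions(argc, argv, " spiffy shell replacement\n");
    if (Trace)
      outs() << "tracing script: " << Script << "\n";
    // Everything after the script name lands in Argv, untouched by the parser.
    for (unsigned i = 0; i != Argv.size(); ++i)
      outs() << "  script argument " << i << ": " << Argv[i] << "\n";
    return 0;
  }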

Internal vs External Storage

By default, all command line options automatically hold the value that they parse from the command line. This is very convenient in the common case, especially when combined with the ability to define command line options in the files that use them. This is called the internal storage model.

Sometimes, however, it is nice to separate the command line option processing code from the storage of the value parsed. For example, let's say that we have a '-debug' option that we would like to use to enable debug information across the entire body of our program. In this case, the boolean value controlling the debug code should be globally accessible (in a header file, for example) yet the command line option processing code should not be exposed to all of these clients (requiring lots of .cpp files to #include CommandLine.h).

To do this, set up your .h file with your option, like this for example:

  // DebugFlag.h - Get access to the '-debug' command line option
  //
  // DebugFlag - This boolean is set to true if the '-debug' command line option
  // is specified.  This should probably not be referenced directly, instead, use
  // the DEBUG macro below.
  //
  extern bool DebugFlag;

  // DEBUG macro - This macro should be used by code to emit debug information.
  // If the '-debug' option is specified on the command line, and if this is a
  // debug build, then the code specified as the option to the macro will be
  // executed.  Otherwise it will not be.
  #ifdef NDEBUG
  #define DEBUG(X)
  #else
  #define DEBUG(X) do { if (DebugFlag) { X; } } while (0)
  #endif

This allows clients to blissfully use the DEBUG() macro, or the DebugFlag explicitly if they want to. Now we just need to be able to set the DebugFlag boolean when the option is set. To do this, we pass an additional argument to our command line argument processor, and we specify where to fill in with the cl::location attribute:

  bool DebugFlag;                // the actual value
  static cl::opt<bool, true>     // The parser
  Debug("debug", cl::desc("Enable debug output"), cl::Hidden, cl::location(DebugFlag));

In the above example, we specify "true" as the second argument to the cl::opt template, indicating that the template should not maintain a copy of the value itself. In addition to this, we specify the cl::location attribute, so that DebugFlag is automatically set.

Option Attributes

This section describes the basic attributes that you can specify on options.

• The option name attribute (which is required for all options, except positional options) specifies what the option name is. This option is specified in simple double quotes:

    cl::opt<bool> Quiet("quiet");

• The cl::desc attribute specifies a description for the option to be shown in the -help output for the program.
• The cl::value_desc attribute specifies a string that can be used to fine tune the -help output for a command line option. Look here for an example.
• The cl::init attribute specifies an initial value for a scalar option. If this attribute is not specified then the command line option value defaults to the value created by the default constructor for the type. Warning: If you specify both cl::init and cl::location for an option, you must specify cl::location first, so that when the command-line parser sees cl::init, it knows where to put the initial value. (You will get an error at runtime if you don't put them in the right order.)
• The cl::location attribute specifies where to store the value for a parsed command line option if using external storage. See the section on Internal vs External Storage for more information.
• The cl::aliasopt attribute specifies which option a cl::alias option is an alias for.
• The cl::values attribute specifies the string-to-value mapping to be used by the generic parser. It takes a clEnumValEnd terminated list of (option, value, description) triplets that specify the option name, the value mapped to, and the description shown in the -help for the tool. Because the generic parser is used most frequently with enum values, two macros are often useful:
  1. The clEnumVal macro is used as a nice simple way to specify a triplet for an enum. This macro automatically makes the option name be the same as the enum name. The first option to the macro is the enum, the second is the description for the command line option.
  2. The clEnumValN macro is used to specify macro options where the option name doesn't equal the enum name. For this macro, the first argument is the enum value, the second is the flag name, and the third is the description.
  You will get a compile time error if you try to use cl::values with a parser that does not support it.
• The cl::multi_val attribute specifies that this option takes multiple values (example: -sectalign segname sectname sectvalue). This attribute takes one unsigned argument - the number of values for the option. This attribute is valid only on cl::list options (and will fail with compile error if you try to use it with other option types). It is allowed to use all of the usual modifiers on multi-valued options (besides cl::ValueDisallowed, obviously).
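Several of these attributes are usually combined on one declaration. The following is a small sketch (assuming the usual #include of CommandLine.h) of a scalar option that uses a name, a description, a value description, and an initial value together; the option itself is just an example, not part of any particular tool.

  static cl::opt<std::string>
  OutputFilename("o", cl::desc("Specify output filename"),
                 cl::value_desc("filename"),   // shown as "-o <filename>" in -help
                 cl::init("-"));               // default to "-", meaning stdout

With this declaration the -help line reads "-o <filename> - Specify output filename", and code can simply test OutputFilename without worrying about whether the user passed -o at all.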

Option Modifiers

Option modifiers are the flags and expressions that you pass into the constructors for cl::opt and cl::list. These modifiers give you the ability to tweak how options are parsed and how -help output is generated to fit your application well.

These options fall into five main categories:

  1. Hiding an option from -help output
  2. Controlling the number of occurrences required and allowed
  3. Controlling whether or not a value must be specified
  4. Controlling other formatting options
  5. Miscellaneous option modifiers

It is not possible to specify two options from the same category (you'll get a runtime error) to a single option, except for options in the miscellaneous category. The CommandLine library specifies defaults for all of these settings that are the most useful in practice and the most common, which means that you usually shouldn't have to worry about these.

Hiding an option from -help output

The cl::NotHidden, cl::Hidden, and cl::ReallyHidden modifiers are used to control whether or not an option appears in the -help and -help-hidden output for the compiled program:

• The cl::NotHidden modifier (which is the default for cl::opt and cl::list options) indicates the option is to appear in both help listings.
• The cl::Hidden modifier (which is the default for cl::alias options) indicates that the option should not appear in the -help output, but should appear in the -help-hidden output.
• The cl::ReallyHidden modifier indicates that the option should not appear in any help output.

Controlling the number of occurrences required and allowed

This group of options is used to control how many times an option is allowed (or required) to be specified on the command line of your program. Specifying a value for this setting allows the CommandLine library to do error checking for you.

The allowed values for this option group are:

• The cl::Optional modifier (which is the default for the cl::opt and cl::alias classes) indicates that your program will allow either zero or one occurrence of the option to be specified.
• The cl::ZeroOrMore modifier (which is the default for the cl::list class) indicates that your program will allow the option to be specified zero or more times.
• The cl::Required modifier indicates that the specified option must be specified exactly one time.
• The cl::OneOrMore modifier indicates that the option must be specified at least one time.
• The cl::ConsumeAfter modifier is described in the Positional arguments section.

If an option is specified multiple times for an option of the cl::opt class, only the last value will be retained.
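As a brief sketch of how the hiding and occurrence modifiers tend to be used together, the declarations below (invented for illustration) require exactly one positional input while keeping a developer-only knob out of the normal -help listing.

  static cl::opt<std::string> InputFile(cl::Positional, cl::desc("<input file>"),
                                        cl::Required);
  static cl::opt<bool> DumpInternals("dump-internals",
                                     cl::desc("Dump internal state (developers only)"),
                                     cl::Hidden);

Running the tool with no input produces an error from the library, and -dump-internals only shows up under -help-hidden.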

Controlling whether or not a value must be specified

This group of options is used to control whether or not the option allows a value to be present. In the case of the CommandLine library, a value is either specified with an equal sign (e.g. '-index-depth=17') or as a trailing string (e.g. '-o a.out').

The allowed values for this option group are:

• The cl::ValueOptional modifier (which is the default for bool typed options) specifies that it is acceptable to have a value, or not. A boolean argument can be enabled just by appearing on the command line, or it can have an explicit '-foo=true'. If an option is specified with this mode, it is illegal for the value to be provided without the equal sign. Therefore '-foo true' is illegal. To get that behavior, you must use the cl::ValueRequired modifier.
• The cl::ValueRequired modifier (which is the default for all other types except for unnamed alternatives using the generic parser) specifies that a value must be provided. This mode informs the command line library that if an option is not provided with an equal sign, the next argument provided must be the value. This allows things like '-o a.out' to work.
• The cl::ValueDisallowed modifier (which is the default for unnamed alternatives using the generic parser) indicates that it is a runtime error for the user to specify a value. This can be provided to disallow users from providing options to boolean options (like '-foo=true').

In general, the default values for this option group work just like you would want them to. As mentioned above, you can specify the cl::ValueDisallowed modifier to a boolean argument to restrict your command line parser. If an option is not specified, then the value of the option is equal to the value specified by the cl::init attribute. If the cl::init attribute is not specified, the option value is initialized with the default constructor for the data type.

Controlling other formatting options

The formatting option group is used to specify that the command line option has special abilities and is otherwise different from other command line arguments. As usual, you can only specify one of these arguments at most.

• The cl::NormalFormatting modifier (which is the default for all options) specifies that this option is "normal".
• The cl::Positional modifier specifies that this is a positional argument that does not have a command line option associated with it. See the Positional Arguments section for more information.
• The cl::ConsumeAfter modifier specifies that this option is used to capture "interpreter style" arguments. See this section for more information.
• The cl::Prefix modifier specifies that this option prefixes its value. With 'Prefix' options, the equal sign does not separate the value from the option name specified. Instead, the value is everything after the prefix, including any equal sign if present. This is useful for processing odd arguments like -lmalloc and -L/usr/lib in a linker tool or -DNAME=value in a compiler tool. Here, the 'l', 'D' and 'L' options are normal string (or list) options, that have the cl::Prefix modifier added to allow the CommandLine library to recognize them. Note that cl::Prefix options must not have the cl::ValueDisallowed modifier specified.
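As a sketch of the linker-style case described above, the two declarations below (option names chosen to mirror the -l and -L convention, not taken from an existing tool) glue the value directly onto the option letter.

  static cl::list<std::string> Libraries("l", cl::Prefix,
                                         cl::desc("Library to link against"),
                                         cl::value_desc("library"));
  static cl::list<std::string> LibPaths("L", cl::Prefix,
                                        cl::desc("Library search path"),
                                        cl::value_desc("dir"));

With these declarations, a command line such as "-lmalloc -L/usr/lib" leaves Libraries containing "malloc" and LibPaths containing "/usr/lib".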

• The cl::Grouping modifier is used to implement Unix-style tools (like ls) that have lots of single letter arguments, but only require a single dash. For example, the 'ls -labF' command actually enables four different options, all of which are single letters. Note that cl::Grouping options cannot have values.

The CommandLine library does not restrict how you use the cl::Prefix or cl::Grouping modifiers, but it is possible to specify ambiguous argument settings. Thus, it is possible to have multiple letter options that are prefix or grouping options, and they will still work as designed. To do this, the CommandLine library uses a greedy algorithm to parse the input option into (potentially multiple) prefix and grouping options. The strategy basically looks like this:

  parse(string OrigInput) {
  1. string input = OrigInput;
  2. if (isOption(input)) return getOption(input).parse();              // Normal option
  3. while (!isOption(input) && !input.empty()) input.pop_back();       // Remove the last letter
  4. if (input.empty()) return error();                                 // No matching option
  5. if (getOption(input).isPrefix())
       return getOption(input).parse(input);
  6. while (!input.empty()) {                                           // Must be grouping options
       getOption(input).parse();
       OrigInput.erase(OrigInput.begin(), OrigInput.begin()+input.length());
       input = OrigInput;
       while (!isOption(input) && !input.empty()) input.pop_back();
     }
  7. if (!OrigInput.empty()) error();
  }

Miscellaneous option modifiers

The miscellaneous option modifiers are the only flags where you can specify more than one flag from the set: they are not mutually exclusive. These flags specify boolean properties that modify the option.

• The cl::CommaSeparated modifier indicates that any commas specified for an option's value should be used to split the value up into multiple values for the option. For example, these two options are equivalent when cl::CommaSeparated is specified: "-foo=a -foo=b -foo=c" and "-foo=a,b,c". This option only makes sense to be used in a case where the option is allowed to accept one or more values (i.e. it is a cl::list option).
• The cl::PositionalEatsArgs modifier (which only applies to positional arguments, and only makes sense for lists) indicates that positional argument should consume any strings after it (including strings that start with a "-") up until another recognized positional argument. For example, if you have two "eating" positional arguments, "pos1" and "pos2", the string "-pos1 -foo -bar baz -pos2 -bork" would cause the "-foo -bar -baz" strings to be applied to the "-pos1" option and the "-bork" string to be applied to the "-pos2" option.
• The cl::Sink modifier is used to handle unknown options. If there is at least one option with the cl::Sink modifier specified, the parser passes unrecognized option strings to it as values instead of signaling an error. As with cl::CommaSeparated, this modifier only makes sense with a cl::list option.

So far, these are the only three miscellaneous option modifiers.
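The following is a brief sketch of two of these modifiers in use; the option names are invented for illustration. The first accepts a comma-separated list in one argument, and the second quietly collects anything the parser would otherwise reject as an unknown option.

  static cl::list<std::string> Features("features", cl::CommaSeparated,
                                        cl::desc("Comma-separated feature list"),
                                        cl::value_desc("f1,f2,..."));
  static cl::list<std::string> Unrecognized(cl::Sink,
                                            cl::desc("<unrecognized options>..."));

After parsing "-features=a,b,c -frobnicate", Features holds "a", "b", "c" and Unrecognized holds "-frobnicate" instead of the library printing an error.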

Response files

Some systems, such as certain variants of Microsoft Windows and some older Unices, have a relatively low limit on command-line length. It is therefore customary to use the so-called 'response files' to circumvent this restriction. These files are mentioned on the command-line (using the "@file" syntax). The program reads these files and inserts the contents into argv, thereby working around the command-line length limits. Response files are enabled by an optional fourth argument to cl::ParseEnvironmentOptions and cl::ParseCommandLineOptions.

Top-Level Classes and Functions

Despite all of the built-in flexibility, the CommandLine option library really only consists of one function (cl::ParseCommandLineOptions) and three main classes: cl::opt, cl::list, and cl::alias. This section describes these three classes in detail.

The cl::ParseCommandLineOptions function

The cl::ParseCommandLineOptions function is designed to be called directly from main, and is used to fill in the values of all of the command line option variables once argc and argv are available. The cl::ParseCommandLineOptions function requires two parameters (argc and argv), but may also take an optional third parameter which holds additional extra text to emit when the -help option is invoked, and a fourth boolean parameter that enables response files.

The cl::ParseEnvironmentOptions function

The cl::ParseEnvironmentOptions function has mostly the same effects as cl::ParseCommandLineOptions, except that it is designed to take values for options from an environment variable, for those cases in which reading the command line is not convenient or desired. It fills in the values of all the command line option variables just like cl::ParseCommandLineOptions does.

It takes four parameters: the name of the program (since argv may not be available, it can't just look in argv[0]), the name of the environment variable to examine, the optional additional extra text to emit when the -help option is invoked, and the boolean switch that controls whether response files should be read.

cl::ParseEnvironmentOptions will break the environment variable's value up into words and then process them using cl::ParseCommandLineOptions. Note: Currently cl::ParseEnvironmentOptions does not support quoting, so an environment variable containing -option "foo bar" will be parsed as three words, -option, "foo, and bar", which is different from what you would get from the shell with the same input.

The cl::SetVersionPrinter function

The cl::SetVersionPrinter function is designed to be called directly from main and before cl::ParseCommandLineOptions. Its use is optional. It simply arranges for a function to be called in response to the --version option instead of having the CommandLine library print out the usual version string for LLVM. This is useful for programs that are not part of LLVM but wish to use the CommandLine facilities. Such programs should just define a small function that takes no arguments and returns void and that prints out whatever version information is appropriate for the program. Pass the address of that function to cl::SetVersionPrinter to arrange for it to be called when the --version option is given by the user.
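A minimal sketch of that arrangement follows; the tool name and version string are placeholders. The only requirement is that the printer is registered before the parse call.

  #include "llvm/Support/CommandLine.h"
  #include "llvm/Support/raw_ostream.h"
  using namespace llvm;

  static void PrintMyToolVersion() {
    outs() << "mytool version 1.2 (built on top of the LLVM CommandLine library)\n";
  }

  int main(int argc, char **argv) {
    cl::SetVersionPrinter(PrintMyToolVersion);   // must precede the parse call
    cl::ParseCommandLineOptions(argc, argv, " mytool overview\n");
    return 0;
  }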

The cl::opt class

The cl::opt class is the class used to represent scalar command line options, and is the one used most of the time. It is a templated class which can take up to three arguments (all except for the first have default values though):

  namespace cl {
    template <class DataType, bool ExternalStorage = false,
              class ParserClass = parser<DataType> >
    class opt;
  }

The first template argument specifies what underlying data type the command line argument is, and is used to select a default parser implementation. The second template argument is used to specify whether the option should contain the storage for the option (the default) or whether external storage should be used to contain the value parsed for the option (see Internal vs External Storage for more information). The third template argument specifies which parser to use. The default value selects an instantiation of the parser class based on the underlying data type of the option. In general, this default works well for most applications, so this option is only used when using a custom parser.

The cl::list class

The cl::list class is the class used to represent a list of command line options. It is also a templated class which can take up to three arguments:

  namespace cl {
    template <class DataType, class Storage = bool,
              class ParserClass = parser<DataType> >
    class list;
  }

This class works the exact same as the cl::opt class, except that the second argument is the type of the external storage, not a boolean value. For this class, the marker type 'bool' is used to indicate that internal storage should be used.

The cl::bits class

The cl::bits class is the class used to represent a list of command line options in the form of a bit vector. It is also a templated class which can take up to three arguments:

  namespace cl {
    template <class DataType, class Storage = bool,
              class ParserClass = parser<DataType> >
    class bits;
  }

This class works the exact same as the cl::list class, except that the second argument must be of type unsigned if external storage is used.

The cl::alias class

The cl::alias class is a nontemplated class that is used to form aliases for other arguments.

  namespace cl {
    class alias;
  }
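A short sketch of the alias class in use (the option itself is illustrative): the alias simply forwards to the option named by cl::aliasopt, giving -quiet a single-letter spelling.

  static cl::opt<bool> Quiet("quiet", cl::desc("Don't print informational messages"));
  static cl::alias     QuietA("q", cl::desc("Alias for -quiet"), cl::aliasopt(Quiet));

Both -quiet and -q set the same Quiet variable; only the real option needs to be tested in code.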

The cl::aliasopt attribute should be used to specify which option this is an alias for. Alias arguments default to being Hidden, and use the aliased option's parser to do the conversion from string to data.

The cl::extrahelp class

The cl::extrahelp class is a nontemplated class that allows extra help text to be printed out for the -help option.

  namespace cl {
    struct extrahelp;
  }

To use the extrahelp, simply construct one with a const char* parameter to the constructor. The text passed to the constructor will be printed at the bottom of the help message, verbatim. Note that multiple cl::extrahelp can be used, but this practice is discouraged. If your tool needs to print additional help information, put all that help into a single cl::extrahelp instance. For example:

  cl::extrahelp("\nADDITIONAL HELP:\n\n  This is the extra help\n");

Builtin parsers

Parsers control how the string value taken from the command line is translated into a typed value, suitable for use in a C++ program. By default, the CommandLine library uses an instance of parser<type> if the command line option specifies that it uses values of type 'type'. Because of this, custom option processing is specified with specializations of the 'parser' class.

The CommandLine library provides the following builtin parser specializations, which are sufficient for most applications. It can, however, also be extended to work with new data types and new ways of interpreting the same data. See the Writing a Custom Parser section for more details on this type of library extension.

• The generic parser<t> parser can be used to map strings values to any data type, through the use of the cl::values property, which specifies the mapping information. The most common use of this parser is for parsing enum values, which allows you to use the CommandLine library for all of the error checking to make sure that only valid enum values are specified (as opposed to accepting arbitrary strings). Despite this, the generic parser class can be used for any data type.
• The parser<bool> specialization is used to convert boolean strings to a boolean value. Currently accepted strings are "true", "TRUE", "True", "1", "false", "FALSE", "False", and "0".
• The parser<boolOrDefault> specialization is used for cases where the value is boolean, but we also need to know whether the option was specified at all. boolOrDefault is an enum with 3 values, BOU_UNSET, BOU_TRUE and BOU_FALSE. This parser accepts the same strings as parser<bool>.
• The parser<string> specialization simply stores the parsed string into the string value specified. No conversion or modification of the data is performed.
• The parser<int> specialization uses the C strtol function to parse the string input. As such, it will accept a decimal number (with an optional '+' or '-' prefix) which must start with a non-zero digit. It accepts octal numbers, which are identified with a '0' prefix digit, and hexadecimal numbers with a prefix of '0x' or '0X'.
• The parser<double> and parser<float> specializations use the standard C strtod function to convert floating point strings into floating point values. As such, a broad range of string formats is supported, including exponential notation (ex: 1.7e15), and locales are properly supported.
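Since the parser is chosen from the option's data type, no extra code is needed to get these specializations; the declarations below (illustrative names only) pick up parser<int>, parser<double>, and parser<std::string> automatically.

  static cl::opt<int>         Threshold("threshold", cl::desc("Cut-off value"),
                                        cl::init(0));
  static cl::opt<double>      Scale("scale", cl::desc("Scaling factor"),
                                    cl::init(1.0));
  static cl::opt<std::string> Label("label", cl::desc("Output label"));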

Extension Guide

Although the CommandLine library has a lot of functionality built into it already (as discussed previously), one of its true strengths lies in its extensibility. This section discusses how the CommandLine library works under the covers and illustrates how to do some simple, common, extensions.

Writing a custom parser

One of the simplest and most common extensions is the use of a custom parser. As discussed previously, parsers are the portion of the CommandLine library that turns string input from the user into a particular parsed data type, validating the input in the process.

There are two ways to use a new parser:

  1. Specialize the cl::parser template for your custom data type. This approach has the advantage that users of your custom data type will automatically use your custom parser whenever they define an option with a value type of your data type. The disadvantage of this approach is that it doesn't work if your fundamental data type is something that is already supported.
  2. Write an independent class, using it explicitly from options that need it. This approach works well in situations where you would like to parse an option using special syntax for a not-very-special data-type. The drawback of this approach is that users of your parser have to be aware that they are using your parser instead of the builtin ones.

To guide the discussion, we will discuss a custom parser that accepts file sizes, specified with an optional unit after the numeric size. For example, we would like to parse "102kb", "41M", "1G" into the appropriate integer value. In this case, the underlying data type we want to parse into is 'unsigned'. We choose approach #2 above because we don't want to make this the default for all unsigned options.

To start out, we declare our new FileSizeParser class:

  struct FileSizeParser : public cl::basic_parser<unsigned> {
    // parse - Return true on error.
    bool parse(cl::Option &O, const char *ArgName, const std::string &ArgValue,
               unsigned &Val);
  };

Our new class inherits from the cl::basic_parser template class to fill in the default, boiler plate code for us. We give it the data type that we parse into, the last argument to the parse method, so that clients of our custom parser know what object type to pass in to the parse method. (Here we declare that we parse into 'unsigned' variables.)

For most purposes, the only method that must be implemented in a custom parser is the parse method. The parse method is called whenever the option is invoked, passing in the option itself, the option name, the string to parse, and a reference to a return value. If the string to parse is not well-formed, the parser should output an error message and return true. Otherwise it should return false and set 'Val' to the parsed value. In our example, we implement parse as:

  bool FileSizeParser::parse(cl::Option &O, const char *ArgName,
                             const std::string &Arg, unsigned &Val) {
    const char *ArgStart = Arg.c_str();
    char *End;

    // Parse integer part, leaving 'End' pointing to the first non-integer char
    Val = (unsigned)strtol(ArgStart, &End, 0);

    while (1) {
      switch (*End++) {
      case 0: return false;   // No error
      case 'i':               // Ignore the 'i' in KiB if people use that
      case 'b': case 'B':     // Ignore B suffix
        break;

      case 'g': case 'G': Val *= 1024*1024*1024; break;
      case 'm': case 'M': Val *= 1024*1024;      break;
      case 'k': case 'K': Val *= 1024;           break;

      default:
        // Print an error message if unrecognized character!
        return O.error("'" + Arg + "' value invalid for file size argument!");
      }
    }
  }

This function implements a very simple parser for the kinds of strings we are interested in. Although it has some holes (it allows "123KKK" for example), it is good enough for this example. Note that we use the option itself to print out the error message (the error method always returns true) in order to get a nice error message (shown below). Now that we have our parser class, we can use it like this:

  static cl::opt<unsigned, false, FileSizeParser>
  MFS("max-file-size", cl::desc("Maximum file size to accept"),
      cl::value_desc("size"));

Which adds this to the output of our program:

  OPTIONS:
    -help                 - display available options (-help-hidden for more)
    ...
    -max-file-size=<size> - Maximum file size to accept

And we can test that our parse works correctly now (the test program just prints out the max-file-size argument value):

  $ ./test
  MFS: 0
  $ ./test -max-file-size=123MB
  MFS: 128974848
  $ ./test -max-file-size=3G
  MFS: 3221225472
  $ ./test -max-file-size=dog
  -max-file-size option: 'dog' value invalid for file size argument!

It looks like it works. The error message that we get is nice and helpful, and we seem to accept reasonable file sizes. This wraps up the "custom parser" tutorial.

Exploiting external storage

Several of the LLVM libraries define static cl::opt instances that will automatically be included in any program that links with that library. This is a feature. However, sometimes it is necessary to know the value of the command line option outside of the library. In these cases the library does or should provide an external storage location that is accessible to users of the library. Examples of this include the llvm::DebugFlag exported by the lib/Support/Debug.cpp file and the llvm::TimePassesIsEnabled flag exported by the lib/VMCore/Pass.cpp file.
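As a brief sketch of what a client of such external storage looks like, the fragment below reads llvm::DebugFlag through the header that declares it; the function name and the diagnostic message are invented for the example.

  #include "llvm/Support/Debug.h"
  #include "llvm/Support/raw_ostream.h"

  void maybeDumpState() {
    // llvm::DebugFlag is the external storage behind the '-debug' option, so a
    // client can key its own diagnostics off it without including CommandLine.h.
    if (llvm::DebugFlag)
      llvm::errs() << "dumping extra state because -debug was given\n";
  }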

TODO: complete this section

Dynamically adding command line options

TODO: fill in this section

Chris Lattner
LLVM Compiler Infrastructure
Last modified: $Date: 2010-02-26 14:18:32 -0600 (Fri, 26 Feb 2010) $

LLVM Coding Standards

1. Introduction
2. Mechanical Source Issues
   1. Source Code Formatting
      1. Commenting
      2. Comment Formatting
      3. #include Style
      4. Source Code Width
      5. Use Spaces Instead of Tabs
      6. Indent Code Consistently
   2. Compiler Issues
      1. Treat Compiler Warnings Like Errors
      2. Write Portable Code
      3. Use of class/struct Keywords
3. Style Issues
   1. The High Level Issues
      1. A Public Header File is a Module
      2. #include as Little as Possible
      3. Keep "internal" Headers Private
      4. Use Early Exits and 'continue' to Simplify Code
      5. Don't use "else" after a return
      6. Turn Predicate Loops into Predicate Functions
   2. The Low Level Issues
      1. Assert Liberally
      2. Do not use 'using namespace std'
      3. Provide a virtual method anchor for classes in headers
      4. Don't evaluate end() every time through a loop
      5. #include <iostream> is forbidden
      6. Avoid std::endl
      7. Use raw_ostream

Extending LLVM: Adding instructions, intrinsics, types, etc.

1. Introduction and Warning
2. Adding a new intrinsic function
3. Adding a new instruction
4. Adding a new SelectionDAG node
5. Adding a new type
   1. Adding a new fundamental type
   2. Adding a new derived type

Written by Misha Brukman, Brad Jones, Nate Begeman, and Chris Lattner

Introduction and Warning

During the course of using LLVM, you may wish to customize it for your research project or for experimentation. At this point, you may realize that you need to add something to LLVM, whether it be a new fundamental type, a new intrinsic function, or a whole new instruction. When you come to this realization, stop and think. Do you really need to extend LLVM? Is it a new fundamental capability that LLVM does not support at its current incarnation or can it be synthesized from already pre-existing LLVM elements? If you are not sure, ask on the LLVM-dev list. The reason is that extending LLVM will get involved as you need to update all the different passes that you intend to use with your extension, and there are many LLVM analyses and transformations, so it may be quite a bit of work.

Adding an intrinsic function is far easier than adding an instruction, and is transparent to optimization passes. If your added functionality can be expressed as a function call, an intrinsic function is the method of choice for LLVM extension.

Before you invest a significant amount of effort into a non-trivial extension, ask on the list if what you are looking to do can be done with already-existing infrastructure, or if maybe someone else is already working on it. You will save yourself a lot of time and effort by doing so.

Adding a new intrinsic function

Adding a new intrinsic function to LLVM is much easier than adding a new instruction. Almost all extensions to LLVM should start as an intrinsic function and then be turned into an instruction if warranted.

1. llvm/docs/LangRef.html: Document the intrinsic. Decide whether it is code generator specific and what the restrictions are. Talk to other people about it so that you are sure it's a good idea.
2. llvm/include/llvm/Intrinsics*.td: Add an entry for your intrinsic. Describe its memory access characteristics for optimization (this controls whether it will be DCE'd, CSE'd, etc). Note that any intrinsic using the llvm_int_ty type for an argument will be deemed by tblgen as overloaded and the corresponding suffix will be required on the intrinsic's name.
3. llvm/lib/Analysis/ConstantFolding.cpp: If it is possible to constant fold your intrinsic, add support to it in the canConstantFoldCallTo and ConstantFoldCall functions.
4. llvm/test/Regression/*: Add test cases for your intrinsic to the test suite.

Once the intrinsic has been added to the system, you must add code generator support for it. Generally you must do the following steps:

Add support to the C backend in lib/Target/CBackend/. Depending on the intrinsic, there are a few ways to implement this. First, if it makes sense to lower the intrinsic to an expanded sequence of C code in all cases, just emit the expansion in visitCallInst in Writer.cpp. Second, if the intrinsic has some way to express it with GCC (or any other compiler) extensions, it can be conditionally supported based on the compiler compiling the CBE output (see llvm.prefetch for an example). Third, if the intrinsic really has no way to be lowered, just have the code generator emit code that prints an error message and calls abort if executed.

Add support to the SelectionDAG instruction selector. For most intrinsics, it makes sense to add code to lower your intrinsic in LowerIntrinsicCall in lib/CodeGen/IntrinsicLowering.cpp.

Add support to the .td file for the target(s) of your choice in lib/Target/*/*.td. This is usually a matter of adding a pattern to the .td file that matches the intrinsic, though it may obviously require adding the instructions you want to generate as well. There are lots of examples in the PowerPC and X86 backend to follow.

Adding a new SelectionDAG node

As with intrinsics, adding a new SelectionDAG node to LLVM is much easier than adding a new instruction. New nodes are often added to help represent instructions common to many targets. These nodes often map to an LLVM instruction (add, sub) or intrinsic (byteswap, population count). In other cases, new nodes have been added to allow many targets to perform a common task (converting between floating point and integer representation) or capture more complicated behavior in a single node (rotate).

1. include/llvm/CodeGen/SelectionDAGNodes.h: Add an enum value for the new SelectionDAG node.
2. lib/CodeGen/SelectionDAG/SelectionDAG.cpp: Add code to print the node to getOperationName. If your new node can be evaluated at compile time when given constant arguments (such as an add of a constant with another constant), find the getNode method that takes the appropriate number of arguments, and add a case for your node to the switch statement that performs constant folding for nodes that take the same number of arguments as your new node.
3. lib/CodeGen/SelectionDAG/LegalizeDAG.cpp: Add code to legalize, promote, and expand the node as necessary. At a minimum, you will need to add a case statement for your node in LegalizeOp which calls LegalizeOp on the node's operands, and returns a new node if any of the operands changed as a result of being legalized. It is likely that not all targets supported by the SelectionDAG framework will natively support the new node. In this case, you must also add code in your node's case statement in LegalizeOp to Expand your node into simpler, legal operations. The case for ISD::UREM for expanding a remainder into a divide, multiply, and a subtract is a good example.
4. lib/CodeGen/SelectionDAG/LegalizeDAG.cpp: If targets may support the new node being added only at certain sizes, you will also need to add code to your node's case statement in LegalizeOp to Promote your node's operands to a larger size, and perform the correct operation. You will also need to add code to PromoteOp to do this as well. For a good example, see ISD::BSWAP, which promotes its operand to a wider size, performs the byteswap, and then shifts the correct bytes right to emulate the narrower byteswap in the wider type.
5. lib/CodeGen/SelectionDAG/LegalizeDAG.cpp: Add a case for your node in ExpandOp to teach the legalizer how to perform the action represented by the new node on a value that has been split into high and low halves. This case will be used to support your node with a 64 bit operand on a 32 bit target.
6. lib/CodeGen/SelectionDAG/DAGCombiner.cpp: If your node can be combined with itself, or other existing nodes in a peephole-like fashion, add a visit function for it. There are several good examples for simple combines you can do; visitFABS and visitSRL are good starting places.
7. lib/Target/PowerPC/PPCISelLowering.cpp: Each target has an implementation of the TargetLowering class, usually in its own file (although some targets include it in the same file as the DAGToDAGISel). The default behavior for a target is to assume that your new node is legal for all types that are legal for that target. If this target does not natively support your node, then tell the target to either Promote it (if it is supported at a larger type) or Expand it. This will cause the code you wrote in LegalizeOp above to decompose your new node into other legal nodes for this target.
8. lib/Target/TargetSelectionDAG.td: Most current targets supported by LLVM generate code using the DAGToDAG method, where SelectionDAG nodes are pattern matched to target-specific nodes, which represent individual instructions. In order for the targets to match an instruction to your new node, you must add a def for that node to the list in this file, with the appropriate type constraints. Look at add, bswap, and fadd for examples.
9. lib/Target/PowerPC/PPCInstrInfo.td: Each target has a tablegen file that describes the target's instruction set. For targets that use the DAGToDAG instruction selection framework, add a pattern for your new node that uses one or more target nodes. Documentation for this is a bit sparse right now, but there are several decent examples. See the patterns for rotl in PPCInstrInfo.td.
10. TODO: document complex patterns.
11. llvm/test/Regression/CodeGen/*: Add test cases for your new node to the test suite. llvm/test/Regression/CodeGen/X86/bswap.ll is a good example.

Adding a new instruction

WARNING: adding instructions changes the bitcode format, and it will take some effort to maintain compatibility with the previous version. Only add an instruction if it is absolutely necessary.

1. llvm/include/llvm/Instruction.def: add a number for your instruction and an enum name
2. llvm/include/llvm/Instructions.h: add a definition for the class that will represent your instruction
3. llvm/include/llvm/Support/InstVisitor.h: add a prototype for a visitor to your new instruction type
4. llvm/lib/AsmParser/Lexer.l: add a new token to parse your instruction from assembly text file
5. llvm/lib/AsmParser/llvmAsmParser.y: add the grammar on how your instruction can be read and what it will construct as a result
6. llvm/lib/Bitcode/Reader/Reader.cpp: add a case for your instruction and how it will be parsed from bitcode
7. llvm/lib/VMCore/Instruction.cpp: add a case for how your instruction will be printed out to assembly
8. llvm/lib/VMCore/Instructions.cpp: implement the class you defined in llvm/include/llvm/Instructions.h
9. Test your instruction
10. llvm/lib/Target/*: add support for your instruction to code generators, or add a lowering pass.
11. llvm/test/Regression/*: add your test cases to the test suite.

Also, you need to implement (or modify) any analyses or passes that you want to understand this new instruction.

Adding a new type

WARNING: adding new types changes the bitcode format, and will break compatibility with currently-existing LLVM installations. Only add new types if it is absolutely necessary.

Adding a fundamental type

1. llvm/include/llvm/Type.h: add enum for the new type; add static Type* for this type

2. llvm/lib/VMCore/Type.cpp: add mapping from TypeID => Type*; initialize the static Type*
3. llvm/lib/AsmReader/Lexer.l: add ability to parse in the type from text assembly
4. llvm/lib/AsmReader/llvmAsmParser.y: add a token for that type

Adding a derived type

1. llvm/include/llvm/Type.h: add enum for the new type; add a forward declaration of the type also
2. llvm/include/llvm/DerivedTypes.h: add new class to represent the new class in the hierarchy; add forward declaration to the TypeMap value type
3. llvm/lib/VMCore/Type.cpp: add support for the derived type to:
     std::string getTypeDescription(const Type &Ty, std::vector<const Type*> &TypeStack)
     bool TypesEqual(const Type *Ty, const Type *Ty2, std::map<const Type*, const Type*> &EqTypes)
   and add the necessary member functions for the type and its factory methods
4. llvm/lib/AsmReader/Lexer.l: add ability to parse in the type from text assembly
5. llvm/lib/BitCode/Writer/Writer.cpp: modify void BitcodeWriter::outputType(const Type *T) to serialize your type
6. llvm/lib/BitCode/Reader/Reader.cpp: modify const Type *BitcodeReader::ParseType() to read your data type
7. llvm/lib/VMCore/AsmWriter.cpp: modify void calcTypeName(const Type *Ty, std::vector<const Type*> &TypeStack, std::map<const Type*, std::string> &TypeNames, std::string &Result) to output the new derived type

The LLVM Compiler Infrastructure
Last modified: $Date: 2008-12-11 12:23:24 -0600 (Thu, 11 Dec 2008) $

Using The LLVM Libraries

1. Abstract
2. Introduction
3. Library Descriptions
4. Library Dependencies
5. Linkage Rules Of Thumb
   1. Always Link LLVMCore, LLVMSupport, LLVMSystem
   2. Never link both archive and re-linked libraries

Written by Reid Spencer

Warning: This document is out of date; please see llvm-config for more information.

Abstract

Amongst other things, LLVM is a toolkit for building compilers, linkers, runtime executives, virtual machines, and other program execution related tools. In addition to the LLVM tool set, the functionality of LLVM is available through a set of libraries. To use LLVM as a toolkit for constructing tools, a developer needs to understand what is contained in the various libraries, what they depend on, and how to use them. Fortunately, there is a tool, llvm-config, to aid with this. This document describes the contents of the libraries and how to use llvm-config to generate command line options.

Introduction

If you're writing a compiler, virtual machine, or any other utility based on LLVM, you'll need to figure out which of the many library files you will need to link with to be successful. An understanding of the contents of these libraries will be useful in coming up with an optimal specification for the libraries to link with. The purpose of this document is to reduce some of the trial and error that the author experienced in using LLVM.

LLVM produces two types of libraries: archives (ending in .a) and objects (ending in .o). However, both are libraries. Libraries ending in .o are known as re-linked libraries because they contain all the compilation units of the library linked together as a single .o file. Furthermore, several of the libraries have both forms of library. The re-linked libraries are used whenever you want to include all symbols from the library. The archive libraries are used whenever you want to only resolve outstanding symbols at that point in the link without including everything in the library.

If you're using the LLVM Makefile system to link your tools, you will use the LLVMLIBS make variable (see the Makefile Guide for details). This variable specifies which LLVM libraries to link into your tool and the order in which they will be linked. You specify re-linked libraries by naming the library without a suffix. You specify archive libraries by naming the library with a .a suffix but without the lib prefix. The order in which the libraries appear in the LLVMLIBS variable definition is the order in which they will be linked. Getting this order correct for your tool can sometimes be challenging. An understanding of what the libraries contain, and how they depend on one another, will help.

o Code generation for Sparc architecture LLVMTarget .o Code generation for Intel x86 architecture Runtime Libraries LLVMInterpreter .o Bitcode Interpreter LLVMJIT .o Aggressive instruction selector for directed acyclic graphs Target Libraries LLVMAlpha .o Code generation for ARM architecture LLVMCBackend . LLVMX86 . For complete details on this tool. LLVMipo .a Various analysis passes.a Source level debugging support LLVMLinker .o 'C' language code generator. the source or object directories used to build LLVM can be accessed by passing options to llvm-config. LLVMipa . For example.a All inter-procedural optimization passes. please see the manual page.a LLVM core intermediate representation LLVMDebugger .a General support utilities LLVMSystem . If all you know is that you want certain libraries to be available. as below: 1.a Transformation utilities used by many passes.o Bitcode JIT Compiler LLVMExecutionEngine . LLVMTransformUtils .o Virtual machine engine Using llvm-config The llvm-config tool is a perl script that produces on its output various kinds of information.a All scalar optimization passes.a Bitcode and archive linking interface LLVMSupport . LLVMScalarOpts .a Operating system abstraction layer LLVMbzip2 . the llvm-config can be very useful. LLVMPowerPC . This generates the command line options necessary to be passed to the ld tool in order to link with LLVM. License Information 284 .o Native code generation infrastructure LLVMSelectionDAG .a LLVM bitcode writing LLVMCore .a Generic code generation utilities.a BZip2 compression library Analysis Libraries LLVMAnalysis . --ldflags. Most notably. Code Generation Libraries LLVMCodeGen . you can generate the complete set of libraries to link with using one of four options. Transformation Libraries LLVMInstrumentation . To understand the relationships between libraries.o Code generation for Alpha architecture LLVMARM . Documentation for the LLVM System at SVN head LLVMBCWriter .o Code generation for PowerPC architecture LLVMSparc . LLVMDataStructure .a Instrumentation passes.o Data structure analysis passes. the -L option is provided to specify a library search directory that contains the LLVM libraries.a Inter-procedural analysis passes.

please see the tool named GenLibDeps. This generates the full path names of the LLVM library files. Documentation for the LLVM System at SVN head 2. Dependency Relationships Of Libraries 285 . If you wish to delve further into how llvm-config generates the correct order (based on library dependencies). 3. --libnames. Dependency Relationships Of Libraries This graph shows the dependency of archive libraries on other archive libraries or objects. If you know the directory in which these files reside (see --ldflags) then you can find the libraries there. --libs. 4. only the archive form is shown. libraries are given with a -l option and object files are given with a full path. Where a library has both archive and object forms. --libfiles. This generates a list of just the library file names.pl in the utils source directory of LLVM. This generates command line options suitable for use with a gcc-style linker. That is.

Documentation for the LLVM System at SVN head Dependency Relationships Of Libraries 286 .

Dependency Relationships Of Object Files

This graph shows the dependency of object files on archive libraries or other objects. Where a library has both object and archive forms, only the dependency to the archive form is shown.


The following list shows the dependency relationships between libraries in textual form. The information is the same as shown on the graphs, but arranged alphabetically. Each archive library (libLLVMAnalysis.a, libLLVMArchive.a, libLLVMAsmParser.a, libLLVMBCReader.a, libLLVMBCWriter.a, libLLVMCodeGen.a, libLLVMCore.a, libLLVMDebugger.a, libLLVMInstrumentation.a, libLLVMLinker.a, libLLVMScalarOpts.a, libLLVMSelectionDAG.a, libLLVMSupport.a, libLLVMSystem.a, libLLVMTarget.a, libLLVMTransformUtils.a, libLLVMbzip2.a, libLLVMipa.a, libLLVMipo.a, libLLVMlto.a) is listed together with the archive libraries it depends on; libLLVMCore.a, libLLVMSupport.a, and libLLVMSystem.a appear as dependencies of nearly every entry. The re-linked objects (LLVMARM, LLVMAlpha, LLVMCBackend, LLVMExecutionEngine, LLVMInterpreter, LLVMJIT, LLVMPowerPC, LLVMSparc, LLVMX86) are listed the same way, each with the archive libraries it depends on.

Linkage Rules Of Thumb
This section contains various "rules of thumb" about what files you should link into your programs.

Always Link LLVMCore, LLVMSupport, and LLVMSystem
No matter what you do with LLVM, the last three entries in the value of your LLVMLIBS make variable should always be: LLVMCore LLVMSupport.a LLVMSystem.a. There are no LLVM programs that don't depend on these three.

Never link both archive and re-linked library
There is never any point to linking both the re-linked (.o) and the archive (.a) versions of a library. Since the re-linked version includes the entire library, the archive version will not resolve any symbols. You could even end up with a link error if you place the archive version before the re-linked version on the linker's command line.

Reid Spencer
The LLVM Compiler Infrastructure
Last modified: $Date: 2009-07-23 19:30:09 -0500 (Thu, 23 Jul 2009) $

How To Release LLVM To The Public

1. Introduction
2. Release Timeline
3. Release Process
   1. Release Administrative Tasks
      1. Create Release Branch
      2. Update Version Numbers
   2. Building the Release
      1. Build the LLVM Source Distributions
      2. Build LLVM
      3. Build the LLVM-GCC Binary Distribution
      4. Build the Clang Binary Distribution
      5. Target Specific Build Details
   3. Release Qualification Criteria
      1. Qualify LLVM
      2. Qualify LLVM-GCC
      3. Qualify Clang
      4. Specific Target Qualification Details
   4. Community Testing
   5. Release Patch Rules
   6. Release Final Tasks
      1. Update Documentation
      2. Tag the LLVM Release Branch
      3. Update the LLVM Demo Page
      4. Update the LLVM Website
      5. Announce the Release

Written by Tanya Lattner, Reid Spencer, John Criswell

Introduction
This document collects information about successfully releasing LLVM (including subprojects llvm-gcc and Clang) to the public. It is the release manager's responsibility to ensure that a high quality build of LLVM is released.

Release Timeline
LLVM is released on a time based schedule (currently every 6 months). We do not have dot releases because of the nature of LLVM's incremental development philosophy. The release schedule is roughly as follows:

1. Set code freeze and branch creation date for 6 months after the last code freeze date.
2. Announce release schedule to the LLVM community and update the website.
3. Create release branch and begin release process.
4. Send out pre-release for first round of testing. Testing will last 7-10 days. During the first round of testing, regressions should be found and fixed. Patches are merged from mainline to the release branch. The release notes should be updated during the first and second round of pre-release testing.
5. Generate and send out second pre-release. Bugs found during this time will not be fixed unless absolutely critical. Bugs introduced by patches merged in will be fixed, and if so, a third round of testing is needed.
6. Finally, release!

Release Administrative Tasks
This section describes a few administrative tasks that need to be done for the release process to begin. Specifically, it involves creating the release branch, resetting version numbers, and creating the release tarballs for the release team to begin testing.

Create Release Branch
Branch the Subversion HEAD using the following procedure:

1. Verify that the current Subversion HEAD is in decent shape by examining nightly tester or buildbot results.
2. Request all developers to refrain from committing. Offenders get commit rights taken away (temporarily).
3. Create the release branch for llvm, llvm-gcc4.2, clang, and the test-suite. The branch name will be release_XX, where XX is the major and minor release numbers. These branches can be created without checking out anything from subversion:

   svn copy https://llvm.org/svn/llvm-project/llvm/trunk \
            https://llvm.org/svn/llvm-project/llvm/branches/release_XX
   svn copy https://llvm.org/svn/llvm-project/llvm-gcc-4.2/trunk \
            https://llvm.org/svn/llvm-project/llvm-gcc-4.2/branches/release_XX
   svn copy https://llvm.org/svn/llvm-project/test-suite/trunk \
            https://llvm.org/svn/llvm-project/test-suite/branches/release_XX
   svn copy https://llvm.org/svn/llvm-project/cfe/trunk \
            https://llvm.org/svn/llvm-project/cfe/branches/release_XX

4. Advise developers they can work on Subversion HEAD again.
5. The Release Manager should switch to the release branch (as all changes to the release will now be done in the branch). The easiest way to do this is to grab another working copy using the following commands:

   svn co https://llvm.org/svn/llvm-project/llvm/branches/release_XX
   svn co https://llvm.org/svn/llvm-project/llvm-gcc-4.2/branches/release_XX
   svn co https://llvm.org/svn/llvm-project/test-suite/branches/release_XX
   svn co https://llvm.org/svn/llvm-project/cfe/branches/release_XX

Update LLVM Version
After creating the LLVM release branch, update the release branches' autoconf/configure.ac version from X.Xsvn to just X.X. Update it on mainline as well to be the next version (X.X+1svn). Regenerate the configure script for both. This must be done for both llvm and the test-suite.

In addition, the version number of all the Bugzilla components must be updated for the next release.

Clang will have a different release number than llvm/llvm-gcc4 since its first release was years later (still deciding if this will be true or not). FIXME: Add a note about clang.

Build the LLVM Source Distributions
Create source distributions for LLVM, LLVM-GCC, clang, and the llvm test-suite by exporting the source from Subversion and archiving it. This can be done with the following commands:

svn export https://llvm.org/svn/llvm-project/llvm/branches/release_XX llvm-X.X

svn export https://llvm.org/svn/llvm-project/llvm-gcc-4.2/branches/release_XX llvm-gcc4.2-X.X.source
svn export https://llvm.org/svn/llvm-project/test-suite/branches/release_XX llvm-test-X.X
svn export https://llvm.org/svn/llvm-project/cfe/branches/release_XX clang-X.X
tar -cvf - llvm-X.X | gzip > llvm-X.X.tar.gz
tar -cvf - llvm-test-X.X | gzip > llvm-test-X.X.tar.gz
tar -cvf - llvm-gcc4.2-X.X.source | gzip > llvm-gcc-4.2-X.X.source.tar.gz
tar -cvf - clang-X.X | gzip > clang-X.X.tar.gz

Building the Release
The build of llvm, llvm-gcc, and clang must be free of errors and warnings in both debug, release (optimized), and release-asserts builds. If all builds are clean, then the release passes build qualification. The build configurations are:

1. debug: ENABLE_OPTIMIZED=0
2. release: ENABLE_OPTIMIZED=1
3. release-asserts: ENABLE_OPTIMIZED=1 DISABLE_ASSERTIONS=1

Build LLVM
Build both debug, release (optimized), and release-asserts versions of LLVM on all supported platforms. Be sure to build with LLVM_VERSION_INFO=X.X. Directions to build llvm are here.

Build the LLVM GCC Binary Distribution
Creating the LLVM GCC binary distribution (release/optimized) requires performing the following steps for each supported platform:

1. Build the LLVM GCC front-end by following the directions in the README.LLVM file. The frontend must be compiled with c, c++, objc (mac only), objc++ (mac only) and fortran support. Please bootstrap as well.
2. Copy the installation directory to a directory named for the specific target. For example on Red Hat Enterprise Linux, the directory would be named llvm-gcc4.2-2.6-x86-linux-RHEL4. Archive and compress the new directory.

Build Clang Binary Distribution
Creating the Clang binary distribution (debug/release/release-asserts) requires performing the following steps for each supported platform:

1. Build clang according to the directions here.
2. Build both a debug and release version of clang, but the binary will be a release build.
3. Package clang (details to follow).

Target Specific Build Details
The table below specifies which compilers are used for each arch/os combination when qualifying the build of llvm, llvm-gcc, and clang.

Architecture   OS            compiler
x86-32         Mac OS 10.5   gcc 4.0.1
x86-32         Linux         gcc 4.2.X, gcc 4.3.X
x86-32         FreeBSD       gcc 4.2.X
x86-32         mingw         gcc 3.4.5
x86-64         Mac OS 10.5   gcc 4.0.1
x86-64         Linux         gcc 4.2.X, gcc 4.3.X
x86-64         FreeBSD       gcc 4.2.X

Release Qualification Criteria
A release is qualified when it has no regressions from the previous release (or baseline). Regressions are related to correctness only and not performance at this time. Regressions are new failures in the set of tests that are used to qualify each product, and only include things on the list. Ultimately, there is no end to the number of possible bugs in a release. We need a very concrete and definitive release criteria that ensures we have monotonically improving quality on some metric. The metric we use is described below. This doesn't mean that we don't care about other things, but these are things that must be satisfied before a release can go out.

Qualify LLVM
LLVM is qualified when it has a clean dejagnu test run without a frontend and it has no regressions when using either llvm-gcc or clang with the test-suite from the previous release.

Qualify LLVM-GCC
LLVM-GCC is qualified when front-end specific tests in the llvm dejagnu test suite all pass and there are no regressions in the test-suite. We do not use the gcc dejagnu test suite as release criteria.

Qualify Clang
Clang is qualified when front-end specific tests in the llvm dejagnu test suite all pass, clang's own test suite passes cleanly, and there are no regressions in the test-suite.

Specific Target Qualification Details

Architecture   OS            llvm-gcc baseline   clang baseline   tests
x86-32         Linux         last release        none             llvm dejagnu, clang tests, test-suite (including spec)
x86-32         FreeBSD       none                none             llvm dejagnu, clang tests, test-suite
x86-32         mingw         last release        none             QT
x86-32         Mac OS 10.5   last release        none             llvm dejagnu, clang tests, test-suite (including spec)
x86-64         Linux         last release        none             llvm dejagnu, clang tests, test-suite (including spec)
x86-64         FreeBSD       none                none             llvm dejagnu, clang tests, test-suite
x86-64         Mac OS 10.5   last release        none             llvm dejagnu, clang tests, test-suite (including spec)

Community Testing
Once all testing has been completed and appropriate bugs filed, the pre-release tar balls may be put on the website and the LLVM community is notified. Ask that all LLVM developers test the release in 2 ways:

1. Download llvm-X.X, llvm-test-X.X, and the appropriate llvm-gcc4 and/or clang binary. Build LLVM. Run "make check" and the full llvm-test suite (make TEST=nightly report).
2. Download llvm-X.X, llvm-test-X.X, and the llvm-gcc4 and/or clang source. Compile everything. Run "make check" and the full llvm-test suite (make TEST=nightly report).

Ask LLVM developers to submit the report and make check results to the list. Attempt to verify that there are no regressions from the previous release. The results are not used to qualify a release, but to spot other potential problems.

For unsupported targets, verify that make check at least is clean.

During the first round of testing, all regressions must be fixed before the second pre-release is created. If this is the second round of testing, this is only to ensure the bug fixes previously merged in have not created new major problems. This is not the time to solve additional and unrelated bugs. If no patches are merged in, the release is determined to be ready and the release manager may move onto the next step.

Release Patch Rules
Below are the rules regarding patching the release branch.

• Patches applied to the release branch are only applied by the release manager.
• During the first round of testing, patches that fix regressions or that are small and relatively risk free (verified by the appropriate code owner) are applied to the branch. Code owners are asked to be very conservative in approving patches for the branch, and we reserve the right to reject any patch that does not fix a regression as previously defined.
• During the remaining rounds of testing, only patches that fix regressions may be applied.

Release Final Tasks
The final stages of the release process involve tagging the release branch, updating documentation that refers to the release, and updating the demo page.

Update Documentation
Review the documentation and ensure that it is up to date. The Release Notes must be updated to reflect bug fixes, new known issues, and changes in the list of supported platforms. The Getting Started Guide should be updated to reflect the new release version number tag available from Subversion and changes in basic system requirements. Merge both changes from mainline into the release branch.

Tag the Release Branch
Tag the release branch using the following procedure:

svn copy https://llvm.org/svn/llvm-project/llvm/branches/release_XX \
         https://llvm.org/svn/llvm-project/llvm/tags/RELEASE_XX
svn copy https://llvm.org/svn/llvm-project/llvm-gcc-4.2/branches/release_XX \
         https://llvm.org/svn/llvm-project/llvm-gcc-4.2/tags/RELEASE_XX
svn copy https://llvm.org/svn/llvm-project/test-suite/branches/release_XX \
         https://llvm.org/svn/llvm-project/test-suite/tags/RELEASE_XX

Update the LLVM Demo Page
The LLVM demo page must be updated to use the new release. This consists of using the llvm-gcc binary and building LLVM. Update the website demo page configuration to use the new release. FIXME: Add a note if anything needs to be done to the clang website. Eventually the websites will hopefully be merged.

Update the LLVM Website
The website must be updated before the release announcement is sent out. Here is what to do:

1. Check out the website module from CVS.
2. Create a new subdirectory X.X in the releases directory.
3. Commit the llvm and test-suite source, the llvm-gcc source and binaries, and the clang source and binaries in this new directory.

4. Copy and commit the llvm/docs and LICENSE.txt files into this new directory. The docs should be built with BUILD_FOR_WEBSITE=1.
5. Commit the index.html to the release/X.X directory to redirect (use from previous release).
6. Update the releases/index.html with the new release and link to release documentation.
7. Update the releases/download.html file with the new release.
8. Finally, update the main page (index.html and sidebar) to point to the new release and release announcement. Make sure this all gets committed back into Subversion.

Announce the Release
Have Chris send out the release announcement when everything is finished.

The LLVM Compiler Infrastructure
Last modified: $Date: 2009-10-12 09:46:08 -0500 (Mon, 12 Oct 2009) $

Writing an LLVM Pass

1. Introduction - What is a pass?
2. Quick Start - Writing hello world
   ♦ Setting up the build environment
   ♦ Basic code required
   ♦ Running a pass with opt
3. Pass classes and requirements
   ♦ The ImmutablePass class
   ♦ The ModulePass class
     ◊ The runOnModule method
   ♦ The CallGraphSCCPass class
     ◊ The doInitialization(CallGraph &) method
     ◊ The runOnSCC method
     ◊ The doFinalization(CallGraph &) method
   ♦ The FunctionPass class
     ◊ The doInitialization(Module &) method
     ◊ The runOnFunction method
     ◊ The doFinalization(Module &) method
   ♦ The LoopPass class
     ◊ The doInitialization(Loop *, LPPassManager &) method
     ◊ The runOnLoop method
     ◊ The doFinalization() method
   ♦ The BasicBlockPass class
     ◊ The doInitialization(Function &) method
     ◊ The runOnBasicBlock method
     ◊ The doFinalization(Function &) method
   ♦ The MachineFunctionPass class
     ◊ The runOnMachineFunction(MachineFunction &) method
4. Pass Registration
   ♦ The print method
5. Specifying interactions between passes
   ♦ The getAnalysisUsage method
   ♦ The AnalysisUsage::addRequired<> and AnalysisUsage::addRequiredTransitive<> methods
   ♦ The AnalysisUsage::addPreserved<> method
   ♦ Example implementations of getAnalysisUsage
   ♦ The getAnalysis<> and getAnalysisIfAvailable<> methods
6. Implementing Analysis Groups
   ♦ Analysis Group Concepts
   ♦ Using RegisterAnalysisGroup
7. Pass Statistics
8. What PassManager does
   ♦ The releaseMemory method
9. Registering dynamically loaded passes
   ♦ Using existing registries
   ♦ Creating new registries
10. Using GDB with dynamically loaded passes
   ♦ Setting a breakpoint in your pass
   ♦ Miscellaneous Problems
11. Future extensions planned
   ♦ Multithreaded LLVM

Written by Chris Lattner and Jim Laskey

Introduction - What is a pass?
The LLVM Pass Framework is an important part of the LLVM system, because LLVM passes are where most of the interesting parts of the compiler exist. Passes perform the transformations and optimizations that make up the compiler, they build the analysis results that are used by these transformations, and they are, above all, a structuring technique for compiler code.

All LLVM passes are subclasses of the Pass class, which implement functionality by overriding virtual methods inherited from Pass. Depending on how your pass works, you should inherit from the ModulePass, CallGraphSCCPass, FunctionPass, LoopPass, or BasicBlockPass classes, which gives the system more information about what your pass does, and how it can be combined with other passes. One of the main features of the LLVM Pass Framework is that it schedules passes to run in an efficient way based on the constraints that your pass meets (which are indicated by which class they derive from).

We start by showing you how to construct a pass, everything from setting up the code, to compiling, loading, and executing it. After the basics are down, more advanced features are discussed.

Quick Start - Writing hello world
Here we describe how to write the "hello world" of passes. The "Hello" pass is designed to simply print out the name of non-external functions that exist in the program being compiled. It does not modify the program at all, it just inspects it. The source code and files for this pass are available in the LLVM source tree in the lib/Transforms/Hello directory.

Setting up the build environment
First, you need to create a new directory somewhere in the LLVM source base. For this example, we'll assume that you made lib/Transforms/Hello. Next, you must set up a build script (Makefile) that will compile the source code for the new pass. To do this, copy the following into Makefile:

# Makefile for hello pass

# Path to top level of LLVM hierarchy
LEVEL = ../../..

# Name of the library to build
LIBRARYNAME = Hello

# Make the shared library become a loadable module so the tools can
# dlopen/dlsym on the resulting library.
LOADABLE_MODULE = 1

# Tell the build system which LLVM libraries your pass needs. You'll probably
# need at least LLVMSystem.a, LLVMSupport.a, LLVMCore.a but possibly several
# others too.
LLVMLIBS = LLVMCore.a LLVMSupport.a LLVMSystem.a

# Include the makefile implementation stuff
include $(LEVEL)/Makefile.common

This makefile specifies that all of the .cpp files in the current directory are to be compiled and linked together into a Debug/lib/Hello.so shared object that can be dynamically loaded by the opt or bugpoint tools via their -load options.

Basic code required
Now that we have the build scripts set up, we just need to write the code for the pass itself. Start out with:

#include "llvm/Pass.h"
#include "llvm/Function.h"
#include "llvm/Support/raw_ostream.h"

Which are needed because we are writing a Pass, we are operating on Function's, and we will be doing some printing.

Next we have:

using namespace llvm;

... which is required because the functions from the include files live in the llvm namespace.

Next we have:

namespace {

... which starts out an anonymous namespace. Anonymous namespaces are to C++ what the "static" keyword is to C (at global scope). It makes the things declared inside of the anonymous namespace visible only to the current file. If you're not familiar with them, consult a decent C++ book for more information.

Next, we declare our pass itself:

struct Hello : public FunctionPass {

This declares a "Hello" class that is a subclass of FunctionPass. The different builtin pass subclasses are described in detail later, but for now, know that FunctionPass's operate on a function at a time.

  static char ID;
  Hello() : FunctionPass(&ID) {}

This declares the pass identifier used by LLVM to identify the pass. This allows LLVM to avoid using expensive C++ runtime information.

  virtual bool runOnFunction(Function &F) {
    errs() << "Hello: " << F.getName() << "\n";
    return false;
  }
};  // end of struct Hello

We declare a "runOnFunction" method, which overloads an abstract virtual method inherited from FunctionPass. This is where we are supposed to do our thing, so we just print out our message with the name of each function.

char Hello::ID = 0;

We initialize the pass ID here. LLVM uses the ID's address to identify the pass, so the initialization value is not important.

"Hello World Pass"). we can use the opt command to run an LLVM program through your pass. for example dominator tree pass.. and a name "Hello World Pass"./Debug/lib/Hello. once loaded. } Now that it's all together.getName() << "\n". false /* Only looks at CFG */. we register our class Hello. any bitcode file will work): $ opt -load . We can now run the bitcode file (hello. then true is supplied as fourth argument. Hello() : FunctionPass(&ID) {} virtual bool runOnFunction(Function &F) { errs() << "Hello: " << F.. char Hello::ID = 0. RegisterPass<Hello> X("hello". As a whole.bc) for the program through our transformation like this (or course./. Because you registered your pass with the RegisterPass template. compile the file with a simple "gmake" command in the local directory and you should get a new "Debug/lib/Hello. } }./.so file. return false. follow the example at the end of the Getting Started Guide to compile "Hello World" to LLVM. the .h" using namespace llvm. If a pass walks CFG without modifying it then third argument is set to true. To test it. Their default value is false.h" #include "llvm/Support/raw_ostream. false /* Analysis Pass */). If a pass is an analysis pass. Running a pass with opt Now that you have a brand new shiny shared object file.h" #include "llvm/Function. RegisterPass<Hello> X("hello".cpp file looks like: #include "llvm/Pass. Documentation for the LLVM System at SVN head We initialize pass ID here. "Hello World Pass". namespace { struct Hello : public FunctionPass { static char ID.bc > /dev/null Hello: __main Hello: puts Hello: main Dependency Relationships Of Object Files 302 . you will be able to use the opt tool to access it.so -hello < hello. LLVM uses ID's address to identify pass so initialization value is not important. Last two RegisterPass arguments are optional. } // end of anonymous namespace Lastly.. giving it a command line argument "hello". Note that everything in this file is contained in an anonymous namespace: this reflects the fact that passes are self contained units that do not need external interfaces (although they can have them) to be useful.

The '-load' option specifies that 'opt' should load your pass as a shared object, which makes '-hello' a valid command line argument (which is one reason you need to register your pass). Because the hello pass does not modify the program in any interesting way, we just throw away the result of opt (sending it to /dev/null).

To see what happened to the other string you registered, try running opt with the -help option:

$ opt -load ../../../Debug/lib/Hello.so -help
OVERVIEW: llvm .bc -> .bc modular optimizer

USAGE: opt [options] <input bitcode>

OPTIONS:
  Optimizations available:
...
    -funcresolve  - Resolve Functions
    -gcse         - Global Common Subexpression Elimination
    -globaldce    - Dead Global Elimination
    -hello        - Hello World Pass
    -indvars      - Canonicalize Induction Variables
    -inline       - Function Integration/Inlining
    -instcombine  - Combine redundant instructions
...

The pass name gets added as the information string for your pass, giving some documentation to users of opt. Now that you have a working pass, you would go ahead and make it do the cool transformations you want. Once you get it all working and tested, it may become useful to find out how fast your pass is. The PassManager provides a nice command line option (--time-passes) that allows you to get information about the execution time of your pass along with the other passes you queue up. For example:

$ opt -load ../../../Debug/lib/Hello.so -hello -time-passes < hello.bc > /dev/null
Hello: __main
Hello: puts
Hello: main
===============================================================================
                      ... Pass execution timing report ...
===============================================================================
  Total Execution Time: 0.02 seconds (0.0479059 wall clock)

   ---User Time---   --System Time--   --User+System--   ---Wall Time---  --- Pass Name ---
   0.0100 (100.0%)   0.0000 (  0.0%)   0.0100 ( 50.0%)   0.0402 ( 84.0%)  Bitcode Writer
   0.0000 (  0.0%)   0.0100 (100.0%)   0.0100 ( 50.0%)   0.0033 (  6.9%)  Dominator Set Construction
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0031 (  6.4%)  Hello World Pass
   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0000 (  0.0%)   0.0013 (  2.7%)  Module Verifier
   0.0100 (100.0%)   0.0100 (100.0%)   0.0200 (100.0%)   0.0479 (100.0%)  TOTAL

As you can see, our implementation above is pretty fast :). The additional passes listed are automatically inserted by the 'opt' tool to verify that the LLVM emitted by your pass is still valid and well formed LLVM, which hasn't been broken somehow.

Now that you have seen the basics of the mechanics behind passes, we can talk about some more details of how they work and how to use them.

Pass classes and requirements
One of the first things that you should do when designing a new pass is to decide what class you should subclass for your pass. The Hello World example uses the FunctionPass class for its implementation, but we did not discuss why or when this should occur.

Here we talk about the classes available, from the most general to the most specific. When choosing a superclass for your Pass, you should choose the most specific class possible, while still being able to meet the requirements listed. This gives the LLVM Pass Infrastructure information necessary to optimize how passes are run, so that the resultant compiler isn't unnecessarily slow.

The ImmutablePass class
The most plain and boring type of pass is the "ImmutablePass" class. This pass type is used for passes that do not have to be run, do not change state, and are never updated. This is not a normal type of transformation or analysis, but can provide information about the current compiler configuration. Although this pass class is very infrequently used, it is important for providing information about the current target machine being compiled for, and other static information that can affect the various transformations. ImmutablePasses never invalidate other transformations, are never invalidated, and are never "run".

The ModulePass class
The "ModulePass" class is the most general of all superclasses that you can use. Deriving from ModulePass indicates that your pass uses the entire program as a unit, referring to function bodies in no predictable order, or adding and removing functions. Because nothing is known about the behavior of ModulePass subclasses, no optimization can be done for their execution.

A module pass can use function level passes (e.g. dominators) using the getAnalysis interface getAnalysis<DominatorTree>(llvm::Function *) to provide the function to retrieve the analysis result for, if the function pass does not require any module or immutable passes. Note that this can only be done for functions for which the analysis ran, e.g. in the case of dominators you should only ask for the DominatorTree for function definitions, not declarations.

To write a correct ModulePass subclass, derive from ModulePass and overload the runOnModule method with the following signature:

The runOnModule method
virtual bool runOnModule(Module &M) = 0;

The runOnModule method performs the interesting work of the pass. It should return true if the module was modified by the transformation and false otherwise.

The CallGraphSCCPass class
The "CallGraphSCCPass" is used by passes that need to traverse the program bottom-up on the call graph (callees before callers). Deriving from CallGraphSCCPass provides some mechanics for building and traversing the CallGraph, but also allows the system to optimize execution of CallGraphSCCPass's. If your pass meets the requirements outlined below, and doesn't meet the requirements of a FunctionPass or BasicBlockPass, you should derive from CallGraphSCCPass.

TODO: explain briefly what SCC, Tarjan's algo, and B-U mean.

To be explicit, CallGraphSCCPass subclasses are:

1. ... not allowed to modify any Functions that are not in the current SCC.
2. ... not allowed to inspect any Function's other than those in the current SCC and the direct callees of the SCC.
3. ... required to preserve the current CallGraph object, updating it to reflect any changes made to the program.
4. ... not allowed to add or remove SCC's from the current Module, though they may change the contents of an SCC.
5. ... allowed to add or remove global variables from the current Module.
6. ... allowed to maintain state across invocations of runOnSCC (including global data).

Implementing a CallGraphSCCPass is slightly tricky in some cases because it has to handle SCCs with more than one node in it. All of the virtual methods described below should return true if they modified the program, or false if they didn't.

The doInitialization(CallGraph &) method
virtual bool doInitialization(CallGraph &CG);

The doInitialization method is allowed to do most of the things that CallGraphSCCPass's are not allowed to do. They can add and remove functions, get pointers to functions, etc. The doInitialization method is designed to do simple initialization type of stuff that does not depend on the SCCs being processed. The doInitialization method call is not scheduled to overlap with any other pass executions (thus it should be very fast).

The runOnSCC method
virtual bool runOnSCC(const std::vector<CallGraphNode *> &SCCM) = 0;

The runOnSCC method performs the interesting work of the pass, and should return true if the module was modified by the transformation, false otherwise.

The doFinalization(CallGraph &) method
virtual bool doFinalization(CallGraph &CG);

The doFinalization method is an infrequently used method that is called when the pass framework has finished calling runOnSCC for every SCC in the program being compiled.

The FunctionPass class
In contrast to ModulePass subclasses, FunctionPass subclasses do have a predictable, local behavior that can be expected by the system. All FunctionPass's execute on each function in the program independent of all of the other functions in the program. FunctionPass's do not require that they are executed in a particular order, and FunctionPass's do not modify external functions.

To be explicit, FunctionPass subclasses are not allowed to:

1. Modify a Function other than the one currently being processed.
2. Add or remove Function's from the current Module.
3. Add or remove global variables from the current Module.
4. Maintain state across invocations of runOnFunction (including global data)

Implementing a FunctionPass is usually straightforward (see the Hello World pass for example). FunctionPass's may overload three virtual methods to do their work. All of these methods should return true if they modified the program, or false if they didn't.

The doInitialization(Module &) method
virtual bool doInitialization(Module &M);

The doInitialization method is allowed to do most of the things that FunctionPass's are not allowed to do. They can add and remove functions, get pointers to functions, etc. The doInitialization method is designed to do simple initialization type of stuff that does not depend on the functions being processed. The doInitialization method call is not scheduled to overlap with any other pass executions (thus it should be very fast).

A good example of how this method should be used is the LowerAllocations pass. This pass converts malloc and free instructions into platform dependent malloc() and free() function calls. It uses the doInitialization method to get a reference to the malloc and free functions that it needs, adding prototypes to the module if necessary.

The runOnFunction method
virtual bool runOnFunction(Function &F) = 0;

The runOnFunction method must be implemented by your subclass to do the transformation or analysis work of your pass. As usual, a true value should be returned if the function is modified.

The doFinalization(Module &) method
virtual bool doFinalization(Module &M);

The doFinalization method is an infrequently used method that is called when the pass framework has finished calling runOnFunction for every function in the program being compiled.

The LoopPass class
All LoopPass's execute on each loop in the function independent of all of the other loops in the function. A LoopPass processes loops in loop nest order such that the outermost loop is processed last. LoopPass subclasses are allowed to update the loop nest using the LPPassManager interface. Implementing a loop pass is usually straightforward. LoopPass's may overload three virtual methods to do their work. All these methods should return true if they modified the program, or false if they didn't.

The doInitialization(Loop *, LPPassManager &) method
virtual bool doInitialization(Loop *, LPPassManager &LPM);

The doInitialization method is designed to do simple initialization type of stuff that does not depend on the functions being processed. The doInitialization method call is not scheduled to overlap with any other pass executions (thus it should be very fast). The LPPassManager interface should be used to access Function or Module level analysis information.

The runOnLoop method
virtual bool runOnLoop(Loop *, LPPassManager &LPM) = 0;

The runOnLoop method must be implemented by your subclass to do the transformation or analysis work of your pass. As usual, a true value should be returned if the function is modified. The LPPassManager interface should be used to update the loop nest.

The doFinalization() method
virtual bool doFinalization();

The doFinalization method is an infrequently used method that is called when the pass framework has
finished calling runOnLoop for every loop in the program being compiled.
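To tie the three methods together, here is a minimal LoopPass sketch in the style of the Hello pass above. It is not code from the LLVM source tree; the pass and argument names are made up, and it only prints each loop's header block.

#include "llvm/Pass.h"
#include "llvm/Analysis/LoopPass.h"
#include "llvm/Analysis/LoopInfo.h"
#include "llvm/Support/raw_ostream.h"
using namespace llvm;

namespace {
  // A do-nothing loop pass: prints the name of each loop's header block.
  struct PrintLoops : public LoopPass {
    static char ID;
    PrintLoops() : LoopPass(&ID) {}

    virtual bool runOnLoop(Loop *L, LPPassManager &LPM) {
      errs() << "Loop header: " << L->getHeader()->getName() << "\n";
      return false;            // we did not modify the loop
    }
  };
}

char PrintLoops::ID = 0;
RegisterPass<PrintLoops> X("printloops", "Print Loop Headers Pass");

Because loops are visited innermost-first within each nest, the printed headers follow that order.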

The BasicBlockPass class
BasicBlockPass's are just like FunctionPass's, except that they must limit their scope of inspection
and modification to a single basic block at a time. As such, they are not allowed to do any of the following:

1. Modify or inspect any basic blocks outside of the current one
2. Maintain state across invocations of runOnBasicBlock
3. Modify the control flow graph (by altering terminator instructions)
4. Any of the things forbidden for FunctionPasses.

BasicBlockPasses are useful for traditional local and "peephole" optimizations. They may override the
same doInitialization(Module &) and doFinalization(Module &) methods that
FunctionPass's have, but also have the following virtual methods that may also be implemented:

The doInitialization(Function &) method
virtual bool doInitialization(Function &F);

The doInitialization method is allowed to do most of the things that BasicBlockPass's are not
allowed to do, but that FunctionPass's can. The doInitialization method is designed to do simple
initialization that does not depend on the BasicBlocks being processed. The doInitialization method
call is not scheduled to overlap with any other pass executions (thus it should be very fast).

The runOnBasicBlock method
virtual bool runOnBasicBlock(BasicBlock &BB) = 0;

Override this function to do the work of the BasicBlockPass. This function is not allowed to inspect or
modify basic blocks other than the parameter, and is not allowed to modify the CFG. A true value must be
returned if the basic block is modified.

The doFinalization(Function &) method
virtual bool doFinalization(Function &F);

The doFinalization method is an infrequently used method that is called when the pass framework has
finished calling runOnBasicBlock for every BasicBlock in the program being compiled. This can be used
to perform per-function finalization.
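As an illustration only (the pass name and statistic are invented, not part of LLVM), a BasicBlockPass that merely inspects its block might look like this:

#include "llvm/Pass.h"
#include "llvm/BasicBlock.h"
#include "llvm/Support/raw_ostream.h"
using namespace llvm;

namespace {
  // Reports the size of each basic block; never changes the block, so
  // runOnBasicBlock always returns false.
  struct CountInsts : public BasicBlockPass {
    static char ID;
    CountInsts() : BasicBlockPass(&ID) {}

    virtual bool runOnBasicBlock(BasicBlock &BB) {
      errs() << BB.getName() << ": " << BB.size() << " instructions\n";
      return false;            // nothing was modified
    }
  };
}

char CountInsts::ID = 0;
RegisterPass<CountInsts> X("countinsts", "Count Instructions Per Block");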

The MachineFunctionPass class
A MachineFunctionPass is a part of the LLVM code generator that executes on the machine-dependent
representation of each LLVM function in the program. A MachineFunctionPass is also a
FunctionPass, so all the restrictions that apply to a FunctionPass also apply to it.
MachineFunctionPasses also have additional restrictions. In particular, MachineFunctionPasses
are not allowed to do any of the following:

1. Modify any LLVM Instructions, BasicBlocks or Functions.
2. Modify a MachineFunction other than the one currently being processed.
3. Add or remove MachineFunctions from the current Module.
4. Add or remove global variables from the current Module.
5. Maintain state across invocations of runOnMachineFunction (including global data)

The runOnMachineFunction(MachineFunction &MF) method
virtual bool runOnMachineFunction(MachineFunction &MF) = 0;

runOnMachineFunction can be considered the main entry point of a MachineFunctionPass; that
is, you should override this method to do the work of your MachineFunctionPass.

The runOnMachineFunction method is called on every MachineFunction in a Module, so that the
MachineFunctionPass may perform optimizations on the machine-dependent representation of the
function. If you want to get at the LLVM Function for the MachineFunction you're working on, use
MachineFunction's getFunction() accessor method -- but remember, you may not modify the
LLVM Function or its contents from a MachineFunctionPass.
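A minimal sketch of such a pass follows. The class and variable names are invented; passes like this are normally added by a target's code generator pipeline rather than loaded into opt.

#include "llvm/CodeGen/MachineFunctionPass.h"
#include "llvm/CodeGen/MachineFunction.h"
#include "llvm/Function.h"
#include "llvm/Support/raw_ostream.h"
using namespace llvm;

namespace {
  // Prints how many machine basic blocks each function was lowered into.
  struct CountMBBs : public MachineFunctionPass {
    static char ID;
    CountMBBs() : MachineFunctionPass(&ID) {}

    virtual bool runOnMachineFunction(MachineFunction &MF) {
      errs() << MF.getFunction()->getName() << ": "
             << MF.size() << " machine basic blocks\n";
      return false;            // the machine code is left untouched
    }
  };
}

char CountMBBs::ID = 0;

Note that the pass only reads the LLVM Function (through getFunction()) and never writes to it, in keeping with the restrictions listed above.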

Pass registration
In the Hello World example pass we illustrated how pass registration works, and discussed some of the
reasons that it is used and what it does. Here we discuss how and why passes are registered.

As we saw above, passes are registered with the RegisterPass template, which requires you to pass at
least two parameters. The first parameter is the name of the pass that is to be used on the command line to
specify that the pass should be added to a program (for example, with opt or bugpoint). The second
argument is the name of the pass, which is to be used for the -help output of programs, as well as for debug
output generated by the --debug-pass option.

If you want your pass to be easily dumpable, you should implement the virtual print method:

The print method
virtual void print(std::ostream &O, const Module *M) const;

The print method must be implemented by "analyses" in order to print a human readable version of the
analysis results. This is useful for debugging an analysis itself, as well as for other people to figure out how an
analysis works. Use the opt -analyze argument to invoke this method.

The llvm::OStream parameter specifies the stream to write the results on, and the Module parameter
gives a pointer to the top level module of the program that has been analyzed. Note however that this pointer
may be null in certain circumstances (such as calling the Pass::dump() from a debugger), so it should
only be used to enhance debug output, it should not be depended on.
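For example, a trivial analysis might store its result as a member and dump it from print. This is a sketch with made-up names, not an existing LLVM pass:

#include "llvm/Pass.h"
#include "llvm/Module.h"
#include <ostream>
using namespace llvm;

namespace {
  // Counts the functions in the module and reports the number from print(),
  // so "opt -analyze -functioncount" has something readable to show.
  struct FunctionCount : public ModulePass {
    static char ID;
    unsigned Count;
    FunctionCount() : ModulePass(&ID), Count(0) {}

    virtual bool runOnModule(Module &M) {
      Count = 0;
      for (Module::iterator F = M.begin(), E = M.end(); F != E; ++F)
        ++Count;                       // declarations and definitions alike
      return false;                    // analyses never modify the module
    }

    virtual void print(std::ostream &O, const Module *M) const {
      O << "Module contains " << Count << " functions\n";
    }
  };
}

char FunctionCount::ID = 0;
RegisterPass<FunctionCount> Y("functioncount", "Count Functions", false, true);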

Specifying interactions between passes
One of the main responsibilities of the PassManager is to make sure that passes interact with each other
correctly. Because PassManager tries to optimize the execution of passes it must know how the passes
interact with each other and what dependencies exist between the various passes. To track this, each pass can
declare the set of passes that are required to be executed before the current pass, and the passes which are
invalidated by the current pass.

Typically this functionality is used to require that analysis results are computed before your pass is run.
Running arbitrary transformation passes can invalidate the computed analysis results, which is what the
invalidation set specifies. If a pass does not implement the getAnalysisUsage method, it defaults to not
having any prerequisite passes, and invalidating all other passes.

The getAnalysisUsage method
virtual void getAnalysisUsage(AnalysisUsage &Info) const;

By implementing the getAnalysisUsage method, the required and invalidated sets may be specified for
your transformation. The implementation should fill in the AnalysisUsage object with information about
which passes are required and not invalidated. To do this, a pass may call any of the following methods on the
AnalysisUsage object:

The AnalysisUsage::addRequired<> and AnalysisUsage::addRequiredTransitive<>
methods
If your pass requires a previous pass to be executed (an analysis for example), it can use one of these methods
to arrange for it to be run before your pass. LLVM has many different types of analyses and passes that can be
required, spanning the range from DominatorSet to BreakCriticalEdges. Requiring
BreakCriticalEdges, for example, guarantees that there will be no critical edges in the CFG when your
pass has been run.

Some analyses chain to other analyses to do their job. For example, an AliasAnalysis implementation is
required to chain to other alias analysis passes. In cases where analyses chain, the
addRequiredTransitive method should be used instead of the addRequired method. This informs
the PassManager that the transitively required pass should be alive as long as the requiring pass is.
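As a small sketch (following the document's own style of showing the method body in isolation), a chaining analysis would declare its requirement like this; the surrounding pass is assumed, not shown:

// An alias-analysis implementation forwards queries it cannot answer to the
// next alias analysis, so that analysis must stay alive as long as this one.
virtual void getAnalysisUsage(AnalysisUsage &AU) const {
  AU.addRequiredTransitive<AliasAnalysis>();   // the analysis we chain to
  AU.setPreservesAll();                        // we are an analysis; we modify nothing
}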

The AnalysisUsage::addPreserved<> method
One of the jobs of the PassManager is to optimize how and when analyses are run. In particular, it attempts to
avoid recomputing data unless it needs to. For this reason, passes are allowed to declare that they preserve
(i.e., they don't invalidate) an existing analysis if it's available. For example, a simple constant folding pass
would not modify the CFG, so it can't possibly affect the results of dominator analysis. By default, all passes
are assumed to invalidate all others.

The AnalysisUsage class provides several methods which are useful in certain circumstances that are
related to addPreserved. In particular, the setPreservesAll method can be called to indicate that the
pass does not modify the LLVM program at all (which is true for analyses), and the setPreservesCFG
method can be used by transformations that change instructions in the program but do not modify the CFG or
terminator instructions (note that this property is implicitly set for BasicBlockPass's).

addPreserved is particularly useful for transformations like BreakCriticalEdges. This pass knows
how to update a small set of loop and dominator related analyses if they exist, so it can preserve them, despite
the fact that it hacks on the CFG.

Example implementations of getAnalysisUsage
// This is an example implementation from an analysis, which does not modify
// the program at all, yet has a prerequisite.
void PostDominanceFrontier::getAnalysisUsage(AnalysisUsage &AU) const {
AU.setPreservesAll();
AU.addRequired<PostDominatorTree>();
}

and:

// This example modifies the program, but does not modify the CFG
void LICM::getAnalysisUsage(AnalysisUsage &AU) const {
AU.setPreservesCFG();
AU.addRequired<LoopInfo>();
}

The getAnalysis<> and getAnalysisIfAvailable<> methods

The Pass::getAnalysis<> method is automatically inherited by your class, providing you with access
to the passes that you declared that you required with the getAnalysisUsage method. It takes a single
template argument that specifies which pass class you want, and returns a reference to that pass. For example:

bool LICM::runOnFunction(Function &F) {
LoopInfo &LI = getAnalysis<LoopInfo>();
...
}

This method call returns a reference to the pass desired. You may get a runtime assertion failure if you attempt
to get an analysis that you did not declare as required in your getAnalysisUsage implementation. This
method can be called by your run* method implementation, or by any other local method invoked by your
run* method. A module level pass can use function level analysis info using this interface. For example:

bool ModuleLevelPass::runOnModule(Module &M) {
...
DominatorTree &DT = getAnalysis<DominatorTree>(Func);
...
}

In the above example, runOnFunction for DominatorTree is called by the pass manager before returning a reference
to the desired pass.

If your pass is capable of updating analyses if they exist (e.g., BreakCriticalEdges, as described
above), you can use the getAnalysisIfAvailable method, which returns a pointer to the analysis if it
is active. For example:

...
if (DominatorSet *DS = getAnalysisIfAvailable<DominatorSet>()) {
// A DominatorSet is active. This code will update it.
}
...

Implementing Analysis Groups
Now that we understand the basics of how passes are defined, how they are used, and how they are required
from other passes, it's time to get a little bit fancier. All of the pass relationships that we have seen so far are
very simple: one pass depends on one other specific pass to be run before it can run. For many applications,
this is great, for others, more flexibility is required.

In particular, some analyses are defined such that there is a single simple interface to the analysis results, but
multiple ways of calculating them. Consider alias analysis for example. The most trivial alias analysis returns
"may alias" for any alias query. The most sophisticated analysis a flow-sensitive, context-sensitive
interprocedural analysis that can take a significant amount of time to execute (and obviously, there is a lot of
room between these two extremes for other implementations). To cleanly support situations like this, the
LLVM Pass Infrastructure supports the notion of Analysis Groups.

Analysis Group Concepts
An Analysis Group is a single simple interface that may be implemented by multiple different passes.
Analysis Groups can be given human readable names just like passes, but unlike passes, they need not derive
from the Pass class. An analysis group may have one or more implementations, one of which is the "default"
implementation.

Analysis groups are used by client passes just like other passes are: the
AnalysisUsage::addRequired() and Pass::getAnalysis() methods. In order to resolve this
requirement, the PassManager scans the available passes to see if any implementations of the analysis group
are available. If none is available, the default implementation is created for the pass to use. All standard rules
for interaction between passes still apply.

Although Pass Registration is optional for normal passes, all analysis group implementations must be
registered, and must use the RegisterAnalysisGroup template to join the implementation pool. Also, a
default implementation of the interface must be registered with RegisterAnalysisGroup.

As a concrete example of an Analysis Group in action, consider the AliasAnalysis analysis group. The default
implementation of the alias analysis interface (the basicaa pass) just does a few simple checks that don't
require significant analysis to compute (such as: two different globals can never alias each other, etc). Passes
that use the AliasAnalysis interface (for example the gcse pass), do not care which implementation of
alias analysis is actually provided, they just use the designated interface.

From the user's perspective, commands work just like normal. Issuing the command 'opt -gcse ...' will
cause the basicaa class to be instantiated and added to the pass sequence. Issuing the command 'opt
-somefancyaa -gcse ...' will cause the gcse pass to use the somefancyaa alias analysis (which
doesn't actually exist, it's just a hypothetical example) instead.

Using RegisterAnalysisGroup
The RegisterAnalysisGroup template is used to register the analysis group itself as well as add pass
implementations to the analysis group. First, an analysis should be registered, with a human readable name
provided for it. Unlike registration of passes, there is no command line argument to be specified for the
Analysis Group Interface itself, because it is "abstract":

static RegisterAnalysisGroup<AliasAnalysis> A("Alias Analysis");

Once the analysis is registered, passes can declare that they are valid implementations of the interface by
using the following code:

namespace {
// Analysis Group implementations must be registered normally...
RegisterPass<FancyAA>
B("somefancyaa", "A more complex alias analysis implementation");

// Declare that we implement the AliasAnalysis interface
RegisterAnalysisGroup<AliasAnalysis> C(B);
}

This just shows a class FancyAA that is registered normally, then uses the RegisterAnalysisGroup
template to "join" the AliasAnalysis analysis group. Every implementation of an analysis group should
join using this template. A single pass may join multiple different analysis groups with no problem.

namespace {
// Analysis Group implementations must be registered normally...
RegisterPass<BasicAliasAnalysis>
D("basicaa", "Basic Alias Analysis (default AA impl)");

// Declare that we implement the AliasAnalysis interface
RegisterAnalysisGroup<AliasAnalysis, true> E(D);
}

Here we show how the default implementation is specified (using the extra argument to the
RegisterAnalysisGroup template). There must be exactly one default implementation available at all
times for an Analysis Group to be used. Only the default implementation can derive from ImmutablePass.
Here we declare that the BasicAliasAnalysis pass is the default implementation for the interface.
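From the client side, using the group is no different from using any other required pass. The following sketch (class name invented) consumes whichever AliasAnalysis implementation the PassManager selected, basicaa by default:

#include "llvm/Pass.h"
#include "llvm/Function.h"
#include "llvm/Analysis/AliasAnalysis.h"
using namespace llvm;

namespace {
  struct MyClientPass : public FunctionPass {
    static char ID;
    MyClientPass() : FunctionPass(&ID) {}

    virtual void getAnalysisUsage(AnalysisUsage &AU) const {
      AU.addRequired<AliasAnalysis>();      // require the group's interface
    }

    virtual bool runOnFunction(Function &F) {
      AliasAnalysis &AA = getAnalysis<AliasAnalysis>();
      // ... query AA.alias(...) while transforming F ...
      return false;
    }
  };
}

char MyClientPass::ID = 0;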

Pass Statistics
The Statistic class is designed to be an easy way to expose various success metrics from passes. These
statistics are printed at the end of a run, when the -stats command line option is enabled on the command line.
See the Statistics section in the Programmer's Manual for details.
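The usual pattern, following the Programmer's Manual (the statistic name here is arbitrary and taken from the Hello example above), is:

#define DEBUG_TYPE "hello"               // used to group the statistic
#include "llvm/ADT/Statistic.h"

STATISTIC(HelloCounter, "Counts number of functions greeted");

// ... then, inside Hello::runOnFunction(Function &F) ...
++HelloCounter;                          // reported when opt is run with -stats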

What PassManager does
The PassManager class takes a list of passes, ensures their prerequisites are set up correctly, and then
schedules passes to run efficiently. All of the LLVM tools that run passes use the PassManager for
execution of these passes.
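The same class is available to tools that embed LLVM. A minimal, hedged sketch (the function name and the particular pipeline are arbitrary; module loading and error handling are omitted):

#include "llvm/PassManager.h"
#include "llvm/Module.h"
#include "llvm/Analysis/Verifier.h"
#include "llvm/Transforms/Scalar.h"
using namespace llvm;

// Run a small fixed pipeline over an already-loaded Module.
void optimizeModule(Module &M) {
  PassManager PM;
  PM.add(createPromoteMemoryToRegisterPass());  // mem2reg
  PM.add(createGVNPass());                      // global value numbering
  PM.add(createCFGSimplificationPass());        // clean up the CFG
  PM.add(createVerifierPass());                 // check the result is well formed
  PM.run(M);                                    // PassManager schedules and runs everything
}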

The PassManager does two main things to try to reduce the execution time of a series of passes:

1. Share analysis results - The PassManager attempts to avoid recomputing analysis results as much as
possible. This means keeping track of which analyses are available already, which analyses get
invalidated, and which analyses are needed to be run for a pass. An important part of work is that the
PassManager tracks the exact lifetime of all analysis results, allowing it to free memory allocated
to holding analysis results as soon as they are no longer needed.
2. Pipeline the execution of passes on the program - The PassManager attempts to get better cache
and memory usage behavior out of a series of passes by pipelining the passes together. This means
that, given a series of consecutive FunctionPass's, it will execute all of the FunctionPass's on
the first function, then all of the FunctionPasses on the second function, etc... until the entire
program has been run through the passes.

This improves the cache behavior of the compiler, because it is only touching the LLVM program
representation for a single function at a time, instead of traversing the entire program. It reduces the
memory consumption of the compiler, because, for example, only one DominatorSet needs to be
calculated at a time. This also makes it possible to implement some interesting enhancements in the
future.

The effectiveness of the PassManager is influenced directly by how much information it has about the
behaviors of the passes it is scheduling. For example, the "preserved" set is intentionally conservative in the
face of an unimplemented getAnalysisUsage method. Not implementing when it should be implemented
will have the effect of not allowing any analysis results to live across the execution of your pass.

The PassManager class exposes a --debug-pass command line option that is useful for debugging
pass execution, seeing how things work, and diagnosing when you should be preserving more analyses than
you currently are (To get information about all of the variants of the --debug-pass option, just type 'opt
-help-hidden').

By using the --debug-pass=Structure option, for example, we can see how our Hello World pass
interacts with other passes. Lets try it out with the gcse and licm passes:

$ opt -load ../../../Debug/lib/Hello.so -gcse -licm --debug-pass=Structure < hello.bc > /dev/null
Module Pass Manager
Function Pass Manager
Dominator Set Construction
Immediate Dominators Construction

Global Common Subexpression Elimination
-- Immediate Dominators Construction
-- Global Common Subexpression Elimination
Natural Loop Construction
Loop Invariant Code Motion
-- Natural Loop Construction
-- Loop Invariant Code Motion
Module Verifier
-- Dominator Set Construction
-- Module Verifier
Bitcode Writer
--Bitcode Writer

This output shows us when passes are constructed and when the analysis results are known to be dead
(prefixed with '--'). Here we see that GCSE uses dominator and immediate dominator information to do its
job. The LICM pass uses natural loop information, which uses dominator sets, but not immediate dominators.
Because immediate dominators are no longer useful after the GCSE pass, it is immediately destroyed. The
dominator sets are then reused to compute natural loop information, which is then used by the LICM pass.

After the LICM pass, the module verifier runs (which is automatically added by the 'opt' tool), which uses
the dominator set to check that the resultant LLVM code is well formed. After it finishes, the dominator set
information is destroyed, after being computed once, and shared by three passes.

Lets see how this changes when we run the Hello World pass in between the two passes:

$ opt -load ../../../Debug/lib/Hello.so -gcse -hello -licm --debug-pass=Structure < hello.bc > /dev/null
Module Pass Manager
  Function Pass Manager
    Dominator Set Construction
    Immediate Dominators Construction
    Global Common Subexpression Elimination
--      Dominator Set Construction
--      Immediate Dominators Construction
--      Global Common Subexpression Elimination
    Hello World Pass
--      Hello World Pass
    Dominator Set Construction
    Natural Loop Construction
    Loop Invariant Code Motion
--      Natural Loop Construction
--      Loop Invariant Code Motion
    Module Verifier
--      Dominator Set Construction
--      Module Verifier
  Bitcode Writer
--    Bitcode Writer
Hello: __main
Hello: puts
Hello: main

Here we see that the Hello World pass has killed the Dominator Set pass, even though it doesn't modify the
code at all! To fix this, we need to add the following getAnalysisUsage method to our pass:

// We don't modify the program, so we preserve all analyses
virtual void getAnalysisUsage(AnalysisUsage &AU) const {
  AU.setPreservesAll();
}


Now when we run our pass, we get this output:

$ opt -load ../../../Debug/lib/Hello.so -gcse -hello -licm --debug-pass=Structure < hello.bc >
Pass Arguments: -gcse -hello -licm
Module Pass Manager
  Function Pass Manager
    Dominator Set Construction
    Immediate Dominators Construction
    Global Common Subexpression Elimination
--      Immediate Dominators Construction
--      Global Common Subexpression Elimination
    Hello World Pass
--      Hello World Pass
    Natural Loop Construction
    Loop Invariant Code Motion
--      Loop Invariant Code Motion
--      Natural Loop Construction
    Module Verifier
--      Dominator Set Construction
--      Module Verifier
  Bitcode Writer
--    Bitcode Writer
Hello: __main
Hello: puts
Hello: main

This shows that we no longer accidentally invalidate dominator information, and therefore do not have
to compute it twice.

The releaseMemory method
virtual void releaseMemory();

The PassManager automatically determines when to compute analysis results, and how long to keep them
around for. Because the lifetime of the pass object itself is effectively the entire duration of the compilation
process, we need some way to free analysis results when they are no longer useful. The releaseMemory
virtual method is the way to do this.

If you are writing an analysis or any other pass that retains a significant amount of state (for use by another
pass which "requires" your pass and uses the getAnalysis method) you should implement releaseMemory
to, well, release the memory allocated to maintain this internal state. This method is called after the run*
method for the class, before the next call of run* in your pass.
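As an illustrative sketch only (the DepthCache member is invented for this example and is not part of
any LLVM interface), an analysis that caches per-function results might implement it like this:

// Hypothetical cached state built by runOnFunction() and queried by passes
// that declare addRequired<...>() on this analysis.
std::map<const BasicBlock*, unsigned> DepthCache;

// Called by the PassManager once the cached results are no longer needed.
virtual void releaseMemory() {
  DepthCache.clear();  // drop the per-function state to free memory
}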

Registering dynamically loaded passes
Size matters when constructing production-quality tools using LLVM, both for the purposes of distribution and
for regulating the resident code size when running on the target system. Therefore, it becomes desirable to
selectively use some passes while omitting others, and to maintain the flexibility to change configurations later
on. You want to be able to do all this, and also provide feedback to the user. This is where pass registration comes
into play.

The fundamental mechanisms for pass registration are the MachinePassRegistry class and subclasses of
MachinePassRegistryNode.

An instance of MachinePassRegistry is used to maintain a list of MachinePassRegistryNode
objects. This instance maintains the list and communicates additions and deletions to the command line
interface.

An instance of a MachinePassRegistryNode subclass is used to maintain information about a
particular pass. This information includes the command line name, the command help string, and the address
of the function used to create an instance of the pass. A global static constructor of one of these instances
registers with a corresponding MachinePassRegistry; the static destructor unregisters. Thus a pass that
is statically linked into the tool will be registered at start-up. A dynamically loaded pass will register on load and
unregister at unload.

Using existing registries
There are predefined registries to track instruction scheduling (RegisterScheduler) and register
allocation (RegisterRegAlloc) machine passes. Here we will describe how to register a register allocator
machine pass.

Implement your register allocator machine pass. In your register allocator .cpp file, add the following include:

#include "llvm/CodeGen/RegAllocRegistry.h"

Also in your register allocator .cpp file, define a creator function in the form:

FunctionPass *createMyRegisterAllocator() {
  return new MyRegisterAllocator();
}

Note that the signature of this function should match the type of
RegisterRegAlloc::FunctionPassCtor. In the same file add the "installing" declaration, in the
form:

static RegisterRegAlloc myRegAlloc("myregalloc",
                                   "  my register allocator help string",
                                   createMyRegisterAllocator);

Note that the two spaces prior to the help string produce a tidy result on the -help query.

$ llc -help
...
-regalloc - Register allocator to use (default=linearscan)
=linearscan - linear scan register allocator
=local - local register allocator
=simple - simple register allocator
=myregalloc - my register allocator help string
...

And that's it. The user is now free to use -regalloc=myregalloc as an option. Registering instruction
schedulers is similar, except that you use the RegisterScheduler class. Note that
RegisterScheduler::FunctionPassCtor is significantly different from
RegisterRegAlloc::FunctionPassCtor.
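As a sketch of the scheduler case (createMyScheduler is a hypothetical creator function here, and its
signature must match RegisterScheduler::FunctionPassCtor rather than the register allocator's creator
type), the registration itself looks analogous:

// Sketch: registering a custom instruction scheduler with the scheduler
// registry; the creator must have the RegisterScheduler::FunctionPassCtor type.
static RegisterScheduler
  mySchedulerReg("mysched",
                 "  my scheduler help string",
                 createMyScheduler);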

To force the load/linking of your register allocator into the llc/lli tools, add your creator function's global
declaration to "Passes.h" and add a "pseudo" call line to
llvm/CodeGen/LinkAllCodegenComponents.h.
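As an illustration (reusing the createMyRegisterAllocator example from above), the "pseudo" call is
typically just a discarded reference that keeps the linker from dropping your allocator:

// A never-really-executed reference forces the creator (and therefore your
// register allocator library) to be linked into llc/lli.
(void) llvm::createMyRegisterAllocator();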

Creating new registries
The easiest way to get started is to clone one of the existing registries; we recommend
llvm/CodeGen/RegAllocRegistry.h. The key things to modify are the class name and the
FunctionPassCtor type.

Then you need to declare the registry. Example: if your pass registry is RegisterMyPasses then define:

MachinePassRegistry RegisterMyPasses::Registry;

And finally, declare the command line option for your passes. Example:

cl::opt<RegisterMyPasses::FunctionPassCtor, false,
        RegisterPassParser<RegisterMyPasses> >
MyPassOpt("mypass",
          cl::init(&createDefaultMyPass),
          cl::desc("my pass option help"));

Here the command option is "mypass", with createDefaultMyPass as the default creator.

Using GDB with dynamically loaded passes
Unfortunately, using GDB with dynamically loaded passes is not as easy as it should be. First of all, you
can't set a breakpoint in a shared object that has not been loaded yet, and second of all there are problems
with inlined functions in shared objects. Here are some suggestions to debugging your pass with GDB.

For sake of discussion, I'm going to assume that you are debugging a transformation invoked by opt,
although nothing described here depends on that.

Setting a breakpoint in your pass
First thing you do is start gdb on the opt process:

$ gdb opt
GNU gdb 5.0
Copyright 2000 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "sparc-sun-solaris2.6"...
(gdb)

Note that opt has a lot of debugging information in it, so it takes time to load. Be patient. Since we cannot
set a breakpoint in our pass yet (the shared object isn't loaded until runtime), we must execute the process,
and have it stop before it invokes our pass, but after it has loaded the shared object. The most foolproof way
of doing this is to set a breakpoint in PassManager::run and then run the process with the arguments you
want:

(gdb) break llvm::PassManager::run
Breakpoint 1 at 0x2413bc: file Pass.cpp, line 70.
(gdb) run test.bc -load $(LLVMTOP)/llvm/Debug/lib/[libname].so -[passoption]
Starting program: opt test.bc -load $(LLVMTOP)/llvm/Debug/lib/[libname].so -[passoption]
Breakpoint 1, PassManager::run (this=0xffbef174, M=@0x70b298) at Pass.cpp:70
70      bool PassManager::run(Module &M) { return PM->run(M); }
(gdb)

Once the opt stops in the PassManager::run method you are now free to set breakpoints in your pass so
that you can trace through execution or do other standard debugging stuff.

Miscellaneous Problems
Once you have the basics down, there are a couple of problems that GDB has, some with solutions, some
without.

• Inline functions have bogus stack information. In general, GDB does a pretty good job getting stack
traces and stepping through inline functions. When a pass is dynamically loaded however, it somehow
completely loses this capability. The only solution I know of is to de-inline a function (move it from the body
of a class to a .cpp file).
• Restarting the program breaks breakpoints. After following the information above, you have succeeded
in getting some breakpoints planted in your pass. Next thing you know, you restart the program (i.e., you type
'run' again), and you start getting errors about breakpoints being unsettable. The only way I have found to
"fix" this problem is to delete the breakpoints that are already set in your pass, run the program, and re-set
the breakpoints once execution stops in PassManager::run.

Hopefully these tips will help with common case debugging situations. If you'd like to contribute some tips of
your own, just contact Chris.

Future extensions planned
Although the LLVM Pass Infrastructure is very capable as it stands, and does some nifty stuff, there are things
we'd like to add in the future. Here is where we are going:

Multithreaded LLVM
Multiple CPU machines are becoming more common and compilation can never be fast enough: obviously we
should allow for a multithreaded compiler. Because of the semantics defined for passes above (specifically
they cannot maintain state across invocations of their run* methods), a nice clean way to implement a
multithreaded compiler would be for the PassManager class to create multiple instances of each pass
object, and allow the separate instances to be hacking on different parts of the program at the same time.

This implementation would prevent each of the passes from having to implement multithreaded constructs,
requiring only the LLVM core to have locking in a few places (for global resources). Although this is a simple
extension, we simply haven't had time (or multiprocessor machines, thus a reason) to implement this. Despite
that, we have kept the LLVM passes SMP ready, and you should too.

Chris Lattner
The LLVM Compiler Infrastructure
Last modified: $Date: 2010-02-18 08:37:52 -0600 (Thu, 18 Feb 2010) $

Writing an LLVM Compiler Backend

1. Introduction
♦ Audience
♦ Prerequisite Reading
♦ Basic Steps
♦ Preliminaries
2. Target Machine
3. Target Registration
4. Register Set and Register Classes
♦ Defining a Register
♦ Defining a Register Class
♦ Implement a subclass of TargetRegisterInfo
5. Instruction Set
♦ Instruction Operand Mapping
♦ Implement a subclass of TargetInstrInfo
♦ Branch Folding and If Conversion
6. Instruction Selector
♦ The SelectionDAG Legalize Phase
◊ Promote
◊ Expand
◊ Custom
◊ Legal
♦ Calling Conventions
7. Assembly Printer
8. Subtarget Support
9. JIT Support
♦ Machine Code Emitter
♦ Target JIT Info

Written by Mason Woo and Misha Brukman

Introduction
This document describes techniques for writing compiler backends that convert the LLVM Intermediate
Representation (IR) to code for a specified machine or other languages. Code intended for a specific machine
can take the form of either assembly code or binary code (usable for a JIT compiler).

The backend of LLVM features a target-independent code generator that may create output for several types
of target CPUs — including X86, PowerPC, Alpha, and SPARC. The backend may also be used to generate
code targeted at SPUs of the Cell processor or GPUs to support the execution of compute kernels.

The document focuses on existing examples found in subdirectories of llvm/lib/Target in a downloaded
LLVM release. In particular, this document focuses on the example of creating a static compiler (one that
emits text assembly) for a SPARC target, because SPARC has fairly standard characteristics, such as a RISC
instruction set and straightforward calling conventions.

Audience
The audience for this document is anyone who needs to write an LLVM backend to generate code for a
specific hardware or software target.

Prerequisite Reading

These essential documents must be read before reading this document:

• LLVM Language Reference Manual — a reference manual for the LLVM assembly language.
• The LLVM Target-Independent Code Generator — a guide to the components (classes and code
generation algorithms) for translating the LLVM internal representation into machine code for a specified
target. Pay particular attention to the descriptions of code generation stages: Instruction Selection,
Scheduling and Formation, SSA-based Optimization, Register Allocation, Prolog/Epilog Code Insertion, Late
Machine Code Optimizations, and Code Emission.
• TableGen Fundamentals — a document that describes the TableGen (tblgen) application that manages
domain-specific information to support LLVM code generation. TableGen processes input from a target
description file (.td suffix) and generates C++ code that can be used for code generation.
• Writing an LLVM Pass — The assembly printer is a FunctionPass, as are several SelectionDAG
processing steps.

To follow the SPARC examples in this document, have a copy of The SPARC Architecture Manual,
Version 8 for reference. For details about the ARM instruction set, refer to the ARM Architecture Reference
Manual. For more about the GNU Assembler format (GAS), see Using As, especially for the assembly
printer. Using As contains a list of target machine dependent features.

Basic Steps
To write a compiler backend for LLVM that converts the LLVM IR to code for a specified target (machine or
other language), follow these steps:

• Create a subclass of the TargetMachine class that describes characteristics of your target machine.
Copy existing examples of specific TargetMachine class and header files; for example, start with
SparcTargetMachine.cpp and SparcTargetMachine.h, but change the file names for your target.
Similarly, change code that references "Sparc" to reference your target.
• Describe the register set of the target. Use TableGen to generate code for register definition, register
aliases, and register classes from a target-specific RegisterInfo.td input file. You should also write
additional code for a subclass of the TargetRegisterInfo class that represents the class register file
data used for register allocation and also describes the interactions between registers.
• Describe the instruction set of the target. Use TableGen to generate code for target-specific instructions
from target-specific versions of TargetInstrFormats.td and TargetInstrInfo.td. You should write
additional code for a subclass of the TargetInstrInfo class to represent machine instructions supported
by the target machine.
• Describe the selection and conversion of the LLVM IR from a Directed Acyclic Graph (DAG)
representation of instructions to native target-specific instructions. Use TableGen to generate code that
matches patterns and selects instructions based on additional information in a target-specific version of
TargetInstrInfo.td. Write code for XXXISelDAGToDAG.cpp, where XXX identifies the specific target,
to perform pattern matching and DAG-to-DAG instruction selection. Also write code in
XXXISelLowering.cpp to replace or remove operations and data types that are not supported natively in
a SelectionDAG.
• Write code for an assembly printer that converts LLVM IR to a GAS format for your target machine. You
should add assembly strings to the instructions defined in your target-specific version of
TargetInstrInfo.td. You should also write code for a subclass of AsmPrinter that performs the
LLVM-to-assembly conversion and a trivial subclass of TargetAsmInfo.
• Optionally, add support for subtargets (i.e., variants with different capabilities). You should also write code
for a subclass of the TargetSubtarget class, which allows you to use the -mcpu= and -mattr=
command-line options.
• Optionally, add JIT support and create a machine code emitter (subclass of TargetJITInfo) that is used
to emit binary code directly into memory.

In the .cpp and .h files, initially stub up these methods and then implement them later. Initially, you may not
know which private members that the class will need and which components will need to be subclassed.

Preliminaries
To actually create your compiler backend, you need to create and modify a few files. The absolute minimum
is discussed here. But to actually use the LLVM target-independent code generator, you must perform the
steps described in the LLVM Target-Independent Code Generator document.

First, you should create a subdirectory under lib/Target to hold all the files related to your target. If your
target is called "Dummy," create the directory lib/Target/Dummy.

In this new directory, create a Makefile. It is easiest to copy a Makefile of another target and modify it. It
should at least contain the LEVEL, LIBRARYNAME and TARGET variables, and then include
$(LEVEL)/Makefile.common. The library can be named LLVMDummy (for example, see the MIPS target).
Alternatively, you can split the library into LLVMDummyCodeGen and LLVMDummyAsmPrinter, the latter of
which should be implemented in a subdirectory below lib/Target/Dummy (for example, see the PowerPC
target). Note that these two naming schemes are hardcoded into llvm-config. Using any other naming
scheme will confuse llvm-config and produce a lot of (seemingly unrelated) linker errors when linking llc.

To make your target actually do something, you need to implement a subclass of TargetMachine. This
implementation should typically be in the file lib/Target/DummyTargetMachine.cpp, but any file in the
lib/Target directory will be built and should work. To use LLVM's target independent code generator, you
should do what all current machine backends do: create a subclass of LLVMTargetMachine. (To create a
target from scratch, create a subclass of TargetMachine.)

To get LLVM to actually build and link your target, you need to add it to the TARGETS_TO_BUILD variable.
To do this, you modify the configure script to know about your target when parsing the --enable-targets
option. Search the configure script for TARGETS_TO_BUILD, add your target to the lists there (some
creativity required), and then reconfigure. Alternatively, you can change autotools/configure.ac and
regenerate configure by running ./autoconf/AutoRegen.sh.

Target Machine
LLVMTargetMachine is designed as a base class for targets implemented with the LLVM target-independent
code generator. The LLVMTargetMachine class should be specialized by a concrete target class that
implements the various virtual methods. LLVMTargetMachine is defined as a subclass of TargetMachine
in include/llvm/Target/TargetMachine.h. The TargetMachine class implementation
(TargetMachine.cpp) also processes numerous command-line options.

To create a concrete target-specific subclass of LLVMTargetMachine, start by copying an existing
TargetMachine class and header. You should name the files that you create to reflect your specific target.
For instance, for the SPARC target, name the files SparcTargetMachine.h and SparcTargetMachine.cpp.

For a target machine XXX, the implementation of XXXTargetMachine must have access methods to obtain
objects that represent target components. These methods are named get*Info, and are intended to obtain
the instruction set (getInstrInfo), register set (getRegisterInfo), stack frame layout (getFrameInfo),
and similar information. XXXTargetMachine must also implement the getTargetData method to access an
object with target-specific data characteristics, such as data type size and alignment requirements.

For instance, for the SPARC target, the header file SparcTargetMachine.h declares prototypes for several
get*Info and getTargetData methods that simply return a class member.

namespace llvm {

class Module;

class SparcTargetMachine : public LLVMTargetMachine {
  const TargetData DataLayout;       // Calculates type size & alignment
  SparcSubtarget Subtarget;
  SparcInstrInfo InstrInfo;
  TargetFrameInfo FrameInfo;

protected:
  virtual const TargetAsmInfo *createTargetAsmInfo() const;

public:
  SparcTargetMachine(const Module &M, const std::string &FS);

  virtual const SparcInstrInfo *getInstrInfo() const {return &InstrInfo; }
  virtual const TargetFrameInfo *getFrameInfo() const {return &FrameInfo; }
  virtual const TargetSubtarget *getSubtargetImpl() const{return &Subtarget; }
  virtual const TargetRegisterInfo *getRegisterInfo() const {
    return &InstrInfo.getRegisterInfo();
  }
  virtual const TargetData *getTargetData() const { return &DataLayout; }
  static unsigned getModuleMatchQuality(const Module &M);

  // Pass Pipeline Configuration
  virtual bool addInstSelector(PassManagerBase &PM, bool Fast);
  virtual bool addPreEmitPass(PassManagerBase &PM, bool Fast);
};

} // end namespace llvm

• getInstrInfo()
• getRegisterInfo()
• getFrameInfo()
• getTargetData()
• getSubtargetImpl()

For some targets, you also need to support the following methods:

• getTargetLowering()
• getJITInfo()

In addition, the XXXTargetMachine constructor should specify a TargetDescription string that determines
the data layout for the target machine, including characteristics such as pointer size, alignment, and
endianness. For example, the constructor for SparcTargetMachine contains the following:

a lower-case "e" indicates little-endian. All targets should declare a global Target object which is used to represent the target during registration. Target Registration You must also register your target with the TargetRegistry. For example. vector. "Sparc"). floating point. ABI alignment. in the target's TargetInfo library. 0) { } Hyphens separate portions of the TargetDescription string. which is what other LLVM tools use to be able to lookup and use your target at runtime. "i". Here is an example of registering the Sparc assembly printer: extern "C" void LLVMInitializeSparcAsmPrinter() { RegisterAsmPrinter<SparcAsmPrinter> X(TheSparcTarget). It also describes the interactions between registers. floating-point. "f". In addition. or vector registers. The TargetRegistry can be used directly. because some clients may wish to only link in some parts of the target -. "sparc". the Sparc registration code looks like this: Target llvm::TheSparcTarget. A register class should be added for groups of registers that are all treated the same way for some instruction. You also need to define register classes to categorize related registers. If only two figures follow "p:". for example. "v". or "a" (corresponding to integer. } For more information. and preferred alignment. and the second value is both ABI and preferred alignment. Then. Documentation for the LLVM System at SVN head FrameInfo(TargetFrameInfo::StackGrowsDown. and then ABI preferred alignment. or "a" are followed by ABI alignment and preferred alignment. 8. These registration steps are separate. extern "C" void LLVMInitializeSparcTargetInfo() { RegisterTarget<Triple::sparc.h". then the first value is pointer size. Typical examples are register classes for integer. A register allocator allows an instruction to use any register in a specified register class to perform the instruction in a similar manner. most targets will also register additional features which are available in separate libraries. • An upper-case "E" in the string indicates a big-endian target data model. the target should define that object and use the RegisterTarget template to register the target.the JIT code generator does not require the use of the assembler printer. "f" is followed by three values: the first indicates the size of a long double. Register Set and Register Classes You should describe a concrete target-specific class that represents the register file of a target machine. and register classes let the target-independent register allocator automatically Dependency Relationships Of Object Files 322 . "v". or aggregate). /*HasJIT=*/false> X(TheSparcTarget. • Then a letter for numeric type alignment: "i". but for most targets there are helper templates which should take care of the work for you. see "llvm/Target/TargetRegistry. This class is called XXXRegisterInfo (where XXX identifies the target) and represents the class register file data that is used for register allocation. then ABI alignment. • "p:" is followed by pointer information: size. } This allows the TargetRegistry to look up the target by name or by target triple. Register classes allocate virtual registers to instructions from these sets.

Much of the code for registers, including register definition, register aliases, and register classes, is generated
by TableGen from XXXRegisterInfo.td input files and placed in XXXGenRegisterInfo.h.inc and
XXXGenRegisterInfo.inc output files. Some of the code in the implementation of XXXRegisterInfo
requires hand-coding.

Defining a Register
The XXXRegisterInfo.td file typically starts with register definitions for a target machine. The Register
class (specified in Target.td) is used to define an object for each register. The specified string n becomes
the Name of the register. The basic Register object does not have any subregisters and does not specify
any aliases.

class Register<string n> {
  string Namespace = "";
  string AsmName = n;
  string Name = n;
  int SpillSize = 0;
  int SpillAlignment = 0;
  list<Register> Aliases = [];
  list<Register> SubRegs = [];
  list<int> DwarfNumbers = [];
}

For example, in the X86RegisterInfo.td file, there are register definitions that utilize the Register class,
such as:

def AL : Register<"AL">, DwarfRegNum<[0, 0, 0]>;

This defines the register AL and assigns it values (with DwarfRegNum) that are used by gcc, gdb, or a debug
information writer (such as DwarfWriter in llvm/lib/CodeGen/AsmPrinter) to identify a register. For
register AL, DwarfRegNum takes an array of 3 values representing 3 different modes: the first element is for
X86-64, the second for exception handling (EH) on X86-32, and the third is generic. -1 is a special Dwarf
number that indicates the gcc number is undefined, and -2 indicates the register number is invalid for this
mode.

From the previously described line in the X86RegisterInfo.td file, TableGen generates this code in the
X86GenRegisterInfo.inc file:

static const unsigned GR8[] = { X86::AL, ... };

const unsigned AL_AliasSet[] = { X86::AX, X86::EAX, X86::RAX, 0 };

const TargetRegisterDesc RegisterDescriptors[] = {
  ...
  { "AL", "AL", AL_AliasSet, Empty_SubRegsSet, Empty_SubRegsSet, AL_SuperRegsSet },
  ...

From the register info file, TableGen generates a TargetRegisterDesc object for each register.
TargetRegisterDesc is defined in include/llvm/Target/TargetRegisterInfo.h with the following
fields:

struct TargetRegisterDesc {
  const char     *AsmName;      // Assembly language name for the register
  const char     *Name;         // Printable name for the reg (for debugging)
  const unsigned *AliasSet;     // Register Alias Set
  const unsigned *SubRegs;      // Sub-register set
  const unsigned *ImmSubRegs;   // Immediate sub-register set
  const unsigned *SuperRegs;    // Super-register set
};

TableGen uses the entire target description file (.td) to determine text names for the register (in the AsmName
and Name fields of TargetRegisterDesc) and the relationships of other registers to the defined register (in
the other TargetRegisterDesc fields). In this example, other definitions establish the registers "AX",
"EAX", and "RAX" as aliases for one another, so TableGen generates a null-terminated array (AL_AliasSet)
for this register alias set.

The Register class is commonly used as a base class for more complex classes. In Target.td, the
Register class is the base for the RegisterWithSubRegs class that is used to define registers that need to
specify subregisters in the SubRegs list, as shown here:

class RegisterWithSubRegs<string n, list<Register> subregs> : Register<n> {
  let SubRegs = subregs;
}

In SparcRegisterInfo.td, additional register classes are defined for SPARC: a Register subclass,
SparcReg, and further subclasses: Ri, Rf, and Rd. SPARC registers are identified by 5-bit ID numbers, which
is a feature common to these subclasses.

class SparcReg<string n> : Register<n> {
  field bits<5> Num;
  let Namespace = "SP";
}

// Ri - 32-bit integer registers
class Ri<bits<5> num, string n> : SparcReg<n> {
  let Num = num;
}

// Rf - 32-bit floating-point registers
class Rf<bits<5> num, string n> : SparcReg<n> {
  let Num = num;
}

// Rd - Slots in the FP register file for 64-bit floating-point values.
class Rd<bits<5> num, string n, list<Register> subregs> : SparcReg<n> {
  let Num = num;
  let SubRegs = subregs;
}

In the SparcRegisterInfo.td file, there are register definitions that utilize these subclasses of Register,
such as:

def G0 : Ri< 0, "G0">, DwarfRegNum<[0]>;
def G1 : Ri< 1, "G1">, DwarfRegNum<[1]>;
...
def F0 : Rf< 0, "F0">, DwarfRegNum<[32]>;
def F1 : Rf< 1, "F1">, DwarfRegNum<[33]>;
...
def D0 : Rd< 0, "F0", [F0, F1]>, DwarfRegNum<[32]>;
def D1 : Rd< 2, "F2", [F2, F3]>, DwarfRegNum<[34]>;

Note the use of 'let' expressions to override values that are initially defined in a superclass (such as the
SubRegs field in the Rd class). The last two registers shown above (D0 and D1) are double-precision
floating-point registers that are aliases for pairs of single-precision floating-point sub-registers.

In addition to aliases, the sub-register and super-register relationships of the defined register are in fields of a
register's TargetRegisterDesc.

Defining a Register Class
The RegisterClass class (specified in Target.td) is used to define an object that represents a group of
related registers and also defines the default allocation order of the registers. A target description file
XXXRegisterInfo.td that uses Target.td can construct register classes using the following class:

class RegisterClass<string namespace,
                    list<ValueType> regTypes, int alignment,
                    list<Register> regList> {
  string Namespace = namespace;
  list<ValueType> RegTypes = regTypes;
  int Size = 0;            // spill size, in bits; zero lets tblgen pick the size
  int Alignment = alignment;

  // CopyCost is the cost of copying a value between two registers
  // default value 1 means a single instruction
  // A negative value means copying is extremely expensive or impossible
  int CopyCost = 1;
  list<Register> MemberList = regList;

  // for register classes that are subregisters of this class
  list<RegisterClass> SubRegClassList = [];

  code MethodProtos = [{}];  // to insert arbitrary code
  code MethodBodies = [{}];
}

To define a RegisterClass, use the following 4 arguments:

• The first argument of the definition is the name of the namespace.
• The second argument is a list of ValueType register type values that are defined in
include/llvm/CodeGen/ValueTypes.td. Defined values include integer types (such as i16, i32, and i1
for Boolean), floating-point types (f32, f64), and vector types (for example, v8i16 for an 8 x i16 vector). All
registers in a RegisterClass must have the same ValueType, but some registers may store vector data in
different configurations. For example a register that can process a 128-bit vector may be able to handle 16
8-bit integer elements, 8 16-bit integers, 4 32-bit integers, and so on.
• The third argument of the RegisterClass definition specifies the alignment required of the registers when
they are stored or loaded to memory.
• The final argument, regList, specifies which registers are in this class. If an allocation_order_*
method is not specified, then regList also defines the order of allocation used by the register allocator.

In SparcRegisterInfo.td, three RegisterClass objects are defined: FPRegs, DFPRegs, and IntRegs.
For all three register classes, the first argument defines the namespace with the string 'SP'.

FPRegs defines a group of 32 single-precision floating-point registers (F0 to F31); DFPRegs defines a group
of 16 double-precision registers (D0-D15). For IntRegs, the MethodProtos and MethodBodies methods are
used by TableGen to insert the specified code into generated output.

def FPRegs : RegisterClass<"SP", [f32], 32,
  [F0, F1, F2, F3, F4, F5, F6, F7, F8, F9, F10, F11, F12, F13, F14, F15,
   F16, F17, F18, F19, F20, F21, F22, F23, F24, F25, F26, F27, F28, F29, F30, F31]>;

def DFPRegs : RegisterClass<"SP", [f64], 64,
  [D0, D1, D2, D3, D4, D5, D6, D7, D8, D9, D10, D11, D12, D13, D14, D15]>;

def IntRegs : RegisterClass<"SP", [i32], 32,
    [L0, L1, L2, L3, L4, L5, L6, L7,
     I0, I1, I2, I3, I4, I5,
     O0, O1, O2, O3, O4, O5, O7,
     G1,
     // Non-allocatable regs:
     G2, G3, G4,
     O6,        // stack ptr
     I6,        // frame ptr
     I7,        // return address
     G0,        // constant zero
     G5, G6, G7 // reserved for kernel
    ]> {
  let MethodProtos = [{
    iterator allocation_order_end(const MachineFunction &MF) const;
  }];
  let MethodBodies = [{
    IntRegsClass::iterator
    IntRegsClass::allocation_order_end(const MachineFunction &MF) const {
      return end() - 10  // Don't allocate special registers
                    -1;
    }
  }];
}

Using SparcRegisterInfo.td with TableGen generates several output files that are intended for inclusion in
other source code that you write. SparcRegisterInfo.td generates SparcGenRegisterInfo.h.inc, which
should be included in the header file for the implementation of the SPARC register implementation that you
write (SparcRegisterInfo.h). In SparcGenRegisterInfo.h.inc a new structure is defined called
SparcGenRegisterInfo that uses TargetRegisterInfo as its base. It also specifies types, based upon the
defined register classes: DFPRegsClass, FPRegsClass, and IntRegsClass.

SparcRegisterInfo.td also generates SparcGenRegisterInfo.inc, which is included at the bottom of
SparcRegisterInfo.cpp, the SPARC register implementation. The code below shows only the generated
integer registers and associated register classes. The order of registers in IntRegs reflects the order in the
definition of IntRegs in the target description file. Take special note of the use of MethodBodies in
SparcRegisterInfo.td to create code in SparcGenRegisterInfo.inc; MethodProtos generates similar
code in SparcGenRegisterInfo.h.inc.

// IntRegs Register Class...
static const unsigned IntRegs[] = {
  SP::L0, SP::L1, SP::L2, SP::L3, SP::L4, SP::L5,
  SP::L6, SP::L7, SP::I0, SP::I1, SP::I2, SP::I3,
  SP::I4, SP::I5, SP::O0, SP::O1, SP::O2, SP::O3,
  SP::O4, SP::O5, SP::O7, SP::G1, SP::G2, SP::G3,
  SP::G4, SP::O6, SP::I6, SP::I7, SP::G0, SP::G5,
  SP::G6, SP::G7,
};

// IntRegsVTs Register Class Value Types...
static const MVT::ValueType IntRegsVTs[] = {
  MVT::i32, MVT::Other
};

namespace SP {   // Register class instances
  DFPRegsClass    DFPRegsRegClass;
  FPRegsClass     FPRegsRegClass;
  IntRegsClass    IntRegsRegClass;
...
  // IntRegs Sub-register Classes...
  static const TargetRegisterClass* const IntRegsSubRegClasses [] = {
    NULL
  };
...
  // IntRegs Super-register Classes...
  static const TargetRegisterClass* const IntRegsSuperRegClasses [] = {
    NULL
  };
...
  // IntRegs Register Class sub-classes...
  static const TargetRegisterClass* const IntRegsSubclasses [] = {
    NULL
  };
...
  // IntRegs Register Class super-classes...
  static const TargetRegisterClass* const IntRegsSuperclasses [] = {
    NULL
  };
...
  IntRegsClass::iterator
  IntRegsClass::allocation_order_end(const MachineFunction &MF) const {
    return end() - 10  // Don't allocate special registers
                  -1;
  }

  IntRegsClass::IntRegsClass() : TargetRegisterClass(IntRegsRegClassID,
    IntRegsVTs, IntRegsSubclasses, IntRegsSuperclasses, IntRegsSubRegClasses,
    IntRegsSuperRegClasses, 4, 4, 1, IntRegs, IntRegs + 32) {}
}

Implement a subclass of TargetRegisterInfo
The final step is to hand code portions of XXXRegisterInfo, which implements the interface described in
TargetRegisterInfo.h. These functions return 0, NULL, or false, unless overridden. Here is a list of
functions that are overridden for the SPARC implementation in SparcRegisterInfo.cpp:

• getCalleeSavedRegs — Returns a list of callee-saved registers in the order of the desired callee-save
stack frame offset.
• getCalleeSavedRegClasses — Returns a list of preferred register classes with which to spill each callee
saved register.
• getReservedRegs — Returns a bitset indexed by physical register numbers, indicating if a particular
register is unavailable.
• hasFP — Return a Boolean indicating if a function should have a dedicated frame pointer register.
• eliminateCallFramePseudoInstr — If call frame setup or destroy pseudo instructions are used, this can
be called to eliminate them.
• eliminateFrameIndex — Eliminate abstract frame indices from instructions that may use them.
• emitPrologue — Insert prologue code into the function.
• emitEpilogue — Insert epilogue code into the function.

Instruction Set
During the early stages of code generation, the LLVM IR code is converted to a SelectionDAG with nodes
that are instances of the SDNode class containing target instructions. An SDNode has an opcode, operands,
type requirements, and operation properties. For example, is an operation commutative, does an operation
load from memory. The various operation node types are described in the
include/llvm/CodeGen/SelectionDAGNodes.h file (values of the NodeType enum in the ISD namespace).

TableGen uses the following target description (.td) input files to generate much of the code for instruction
definition:

• Target.td — Where the Instruction, Operand, InstrInfo, and other fundamental classes are defined.
• TargetSelectionDAG.td — Used by SelectionDAG instruction selection generators, contains SDTC*
classes (selection DAG type constraint), definitions of SelectionDAG nodes (such as imm, cond, bb, add,
fadd, sub), and pattern support (Pattern, Pat, PatFrag, PatLeaf, ComplexPattern).
• XXXInstrFormats.td — Target-specific definitions of instruction templates, condition codes, and
instructions of an instruction set. For architecture modifications, a different file name may be used. For
example, for Pentium with SSE instruction, this file is X86InstrSSE.td, and for Pentium with MMX, this file is
X86InstrMMX.td.
• XXXInstrInfo.td — Patterns for definitions of target-specific instructions.

There is also a target-specific XXX.td file, where XXX is the name of the target. The XXX.td file includes the
other .td input files, but its contents are only directly important for subtargets.

You should describe a concrete target-specific class XXXInstrInfo that represents machine instructions
supported by a target machine. XXXInstrInfo contains an array of XXXInstrDescriptor objects, each of
which describes one instruction. An instruction descriptor defines:

• Opcode mnemonic
• Number of operands
• List of implicit register definitions and uses
• Target-independent properties (such as memory access, is commutable)
• Target-specific flags

The Instruction class (defined in Target.td) is mostly used as a base for more complex instruction classes.

class Instruction {
  string Namespace = "";
  dag OutOperandList;    // An dag containing the MI def operand list.
  dag InOperandList;     // An dag containing the MI use operand list.
  string AsmString = ""; // The .s format to print the instruction with.
  list<dag> Pattern;     // Set to the DAG pattern for this instruction.
  list<Register> Uses = [];
  list<Register> Defs = [];
  list<Predicate> Predicates = [];  // predicates turned into isel match code
  ... remainder not shown for space ...
}

A SelectionDAG node (SDNode) should contain an object representing a target-specific instruction that is
defined in XXXInstrInfo.td. The instruction objects should represent instructions from the architecture
manual of the target machine (such as the SPARC Architecture Manual for the SPARC target). A single
instruction from the architecture manual is often modeled as multiple target instructions, depending upon its
operands. For example, a manual might describe an add instruction that takes a register or an immediate
operand. An LLVM target could model this with two instructions named ADDri and ADDrr.

You should define a class for each instruction category and define each opcode as a subclass of the category
with appropriate parameters such as the fixed binary encoding of opcodes and extended opcodes. You should
map the register bits to the bits of the instruction in which they are encoded (for the JIT). Also you should
specify how the instruction should be printed when the automatic assembly printer is used.

As is described in the SPARC Architecture Manual, Version 8, there are three major 32-bit formats for
instructions. Format 1 is only for the CALL instruction. Format 2 is for branch on condition codes and SETHI
(set high bits of a register) instructions. Format 3 is for other instructions.

Each of these formats has corresponding classes in SparcInstrFormat.td. InstSP is a base class for other
instruction classes. Additional base classes are specified for more precise formats: for example in
SparcInstrFormat.td, F2_1 is for SETHI, and F2_2 is for branches. There are three other base classes:
F3_1 for register/register operations, F3_2 for register/immediate operations, and F3_3 for floating-point
operations. SparcInstrInfo.td also adds the base class Pseudo for synthetic SPARC instructions.

SparcInstrInfo.td largely consists of operand and instruction definitions for the SPARC target. In
SparcInstrInfo.td, the following target description file entry, LDrr, defines the Load Integer instruction for
a Word (the LD SPARC opcode) from a memory address to a register. The first parameter, the value 3
(binary 11), is the operation value for this category of operation. The second parameter (binary 000000) is the
specific operation value for LD/Load Word. The third parameter is the output destination, which is a register
operand and defined in the Register target description file (IntRegs).

def LDrr : F3_1 <3, 0b000000, (outs IntRegs:$dst), (ins MEMrr:$addr),
                 "ld [$addr], $dst",
                 [(set IntRegs:$dst, (load ADDRrr:$addr))]>;

The fourth parameter is the input source, which uses the address operand MEMrr that is defined earlier in
SparcInstrInfo.td:

def MEMrr : Operand<i32> {
  let PrintMethod = "printMemOperand";
  let MIOperandInfo = (ops IntRegs, IntRegs);
}

The fifth parameter is a string that is used by the assembly printer and can be left as an empty string until the
assembly printer interface is implemented. The sixth and final parameter is the pattern used to match the
instruction during the SelectionDAG Select Phase described in (The LLVM Target-Independent Code
Generator). This parameter is detailed in the next section, Instruction Selector.

A similar entry is used to perform a Load Integer instruction for a Word from an immediate operand to a
register:

def LDri : F3_2 <3, 0b000000, (outs IntRegs:$dst), (ins MEMri:$addr),
                 "ld [$addr], $dst",
                 [(set IntRegs:$dst, (load ADDRri:$addr))]>;

In SparcInstrInfo.td, the F3_12 multiclass defines both a register/register and a register/immediate form
for the same logical instruction, so that a single defm directive can create both at once:

multiclass F3_12 <string OpcStr, bits<6> Op3Val, SDNode OpNode> {
  def rr  : F3_1 <2, Op3Val,
                  (outs IntRegs:$dst), (ins IntRegs:$b, IntRegs:$c),
                  !strconcat(OpcStr, " $b, $c, $dst"),
                  [(set IntRegs:$dst, (OpNode IntRegs:$b, IntRegs:$c))]>;
  def ri  : F3_2 <2, Op3Val,
                  (outs IntRegs:$dst), (ins IntRegs:$b, i32imm:$c),
                  !strconcat(OpcStr, " $b, $c, $dst"),
                  [(set IntRegs:$dst, (OpNode IntRegs:$b, simm13:$c))]>;
}

So when the defm directive is used for the XOR and ADD instructions, as seen below, it creates four
instruction objects: XORrr, XORri, ADDrr, and ADDri.

defm XOR : F3_12<"xor", 0b000011, xor>;
defm ADD : F3_12<"add", 0b000000, add>;

SparcInstrInfo.td also includes definitions for condition codes that are referenced by branch instructions.
For example, the 10th bit represents the 'greater than' condition for integers, and the 22nd bit represents the
'greater than' condition for floats.

def ICC_NE  : ICC_VAL< 9>;  // Not Equal
def ICC_E   : ICC_VAL< 1>;  // Equal
def ICC_G   : ICC_VAL<10>;  // Greater
...
def FCC_G   : FCC_VAL<19>;  // Greater
def FCC_UG  : FCC_VAL<21>;  // Unordered or Greater
...

(Note that Sparc.h also defines enums that correspond to the same SPARC condition codes. Care must be
taken to ensure the values in Sparc.h correspond to the values in SparcInstrInfo.td. I.e., SPCC::ICC_NE
= 9, SPCC::FCC_U = 23 and so on.)

Instruction Operand Mapping
The code generator backend maps instruction operands to fields in the instruction. Operands are assigned to
unbound fields in the instruction in the order they are defined. For example, the Sparc target defines the
XNORrr instruction as a F3_1 format instruction having three operands.

def XNORrr  : F3_1<2, 0b000111,
                   (outs IntRegs:$dst), (ins IntRegs:$b, IntRegs:$c),
                   "xnor $b, $c, $dst",
                   [(set IntRegs:$dst, (not (xor IntRegs:$b, IntRegs:$c)))]>;