Abstract— Retargetable compilers are becoming more and more popular, as they are involved even in the processor design phase. The reduced time-to-market period poses a challenge for optimized retargetable compilers. An optimized retargetable compiler gives reliable feedback for tailoring a processor towards a certain application domain. The first choice when selecting a retargetable compiler may be an open-source one. This paper aims to compare two well-known open-source compilers: GCC and LLVM. The first is a mature compiler, retargeted for more than 100 processors, while the second is a newer one, retargeted for fewer than 10 processors, but built on a very promising approach; one big plus is that the latest release of Red Hat (a Linux operating system) replaced the previously used GCC with LLVM. The paper compares the two compilers in terms of both ease of retargetability and enablement of target-specific optimizations.

Keywords— optimized retargetable compiler; GCC; LLVM; retargetable code generation; retargetable optimizations;

I. Introduction

A retargetable compiler is a compiler that can be easily modified to generate code for different processors. Optimizing retargetable compilers are very common nowadays, when the time-to-market for new processors has become shorter and shorter. They bridge the gap between classic compilers and electronic processor design. In this context, the compilers have a double role: first, they are used in the design phase of the processor to explore its capabilities, and second, they are released with the new processor as part of the build tool chain.

The concern regarding retargetable compilers is their lack of machine-specific code optimization techniques, which prevents them from achieving the highest code quality. While this problem is partially inherent to the retargetable compilation approach, it can be circumvented by designing flexible, configurable code optimization techniques that apply to many target architectures and by defining an interface which can configure the optimizations using the machine description information.

Retargetable compilers are modular compared to traditional ones, having target-independent modules but also target-specific modules, mainly in the backend.

The scope of this paper is to analyze two open-source compilers: GCC, a mature retargetable compiler, and LLVM, a newer retargetable compiler built on different concepts. The GNU (GCC) compiler is one of the most widely used C/C++ compilers in the world. It is the basic tool for building all Embedded Linux and Android systems, as well as all desktop or server Linux operating systems and their applications. The GNU compiler is also used to build many commercial real-time operating systems, such as those from Enea, QNX, Wind River and others.

Chapter 2 gives an overview of the machine description representation for GCC and LLVM, pointing out the advantages and disadvantages of each one. Chapter 3 discusses the construction of the compilers and analyzes the code generation phase. Chapter 4 presents the two approaches to target-specific optimizations (e.g. register allocation) and the interaction between the machine description and the optimization queries. This chapter also outlines the possibility of adding a new target-specific optimization and estimates the effort of this task. Chapter 5 focuses on object code generation. Chapter 6 makes a final analysis and concludes.

Each chapter presents first the GCC approach and then the LLVM one, for maturity reasons.

II. Machine Description Representation

The mechanism of GCC machine descriptions has been quite successful, as demonstrated by the wide variety of targets for which GCC has been retargeted. The GNU Compiler Collection uses a retargetable compilation model which is adapted to a given target by reading a description of the target and instantiating the machine-dependent parts of the generated compiler.

The first step in the retargeting process is to understand the architecture of the target microprocessor. The key points in this understanding are the register file (general purpose registers and special purpose registers, if any), the pipeline model of
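To make the description mechanism concrete, the fragment below sketches the shape of a GCC machine description instruction pattern. It is a hypothetical example, not taken from any real backend: a define_insn gives the pattern name, the RTL template with operand predicates and constraints, an enabling condition, and the assembler output string.

```lisp
;; Hypothetical pattern (not from a real backend) illustrating the
;; structure of a GCC machine description instruction definition.
(define_insn "addsi3"
  [(set (match_operand:SI 0 "register_operand" "=r")
        (plus:SI (match_operand:SI 1 "register_operand" "r")
                 (match_operand:SI 2 "register_operand" "r")))]
  ""                              ;; condition: always available
  "add\t%0, %1, %2")              ;; assembler output template
```

From a file of such patterns, the GCC build generates the machine-dependent parts of the ported compiler, which is what makes the model retargetable.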
On the other hand, TargetLowering handles all cases of instruction nodes that cannot be lowered automatically but require manual intervention. These cases include calling conventions and special instructions and operands, which cannot be lowered automatically because of certain properties and restrictions of the architecture. One LLVM drawback in this area is that, for a CISC architecture, the code generation/lowering part is almost 80% hand-written.

One important issue that the code generator needs to be aware of is the presence of fixed registers. In particular, there are often places in the instruction stream where the register allocator must arrange for a particular value to be in a particular register. This can occur due to limitations of the instruction set (e.g., the x86 can only do a 32-bit divide with the EAX/EDX registers) or external factors like calling conventions. In any case, the instruction selector should emit code that copies a virtual register into or out of a physical register when needed.

By the end of code generation, the register allocator will coalesce the registers and delete the resulting identity moves.

MachineInstrs are initially selected in SSA form and are maintained in SSA form until register allocation happens. For the most part, this is trivially simple, since LLVM is already in SSA form: LLVM PHI nodes become machine code PHI nodes, and virtual registers are only allowed to have a single definition.

After register allocation, machine code is no longer in SSA form, because there are no virtual registers left in the code.

Instruction selection is arguably the most important part of the code generation phase. Its task is to convert a legal selection DAG into a new DAG of target machine code. In other words, the abstract, target-independent input has to be matched to concrete, target-dependent output. For this purpose LLVM uses an elaborate pattern-matching algorithm that consists of two major steps.

The first step happens "offline", when LLVM itself is being built, and involves the TableGen tool, which generates the pattern-matching tables from instruction definitions. TableGen is an important part of the LLVM ecosystem, and it plays an especially central role in instruction selection, so it is worthwhile to discuss it here in more depth.

As a conclusion to this chapter, LLVM has a good approach but, for optimized code, the user has to write more code than for GCC.

IV. Target Specific Optimization

GCC cannot generate target-specific optimizations, but it offers the possibility to configure several of them.

Peephole optimization gathers two or more consecutive instructions into a single one. Each peephole pattern is described in the machine description file. The peephole optimization is called between register allocation and instruction scheduling [6]. The description contains the input instructions, the output instruction and the additionally required scratch registers. Below is an example of a peephole pattern description:

Fig. 4. GCC – Peephole definition

Fig. 5. GCC – Peephole Transformation
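The peephole description mechanism can be sketched as follows. The pattern below is hypothetical (it does not come from any real backend): a define_peephole2 matches a register copy followed by an add into the copied register and replaces the pair with a single three-operand add. GCC applies such patterns after register allocation, so they operate on hard registers.

```lisp
;; Hypothetical define_peephole2 (not from any real backend): a register
;; copy followed by an add into the copy is rewritten as one instruction.
(define_peephole2
  [(set (match_operand:SI 0 "register_operand" "")
        (match_operand:SI 1 "register_operand" ""))
   (set (match_dup 0)
        (plus:SI (match_dup 0)
                 (match_operand:SI 2 "immediate_operand" "")))]
  ""                                        ;; no extra enabling condition
  [(set (match_dup 0)
        (plus:SI (match_dup 1) (match_dup 2)))])
```

The first bracketed group lists the input instructions to match, and the second gives the replacement sequence, mirroring the input/output structure described above.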