
The Essence of Compiling with Continuations

Cormac Flanagan Amr Sabry Bruce F. Duba Matthias Felleisen


Department of Computer Science
Rice University
Houston, TX 77251-1892

Abstract

In order to simplify the compilation process, many compilers for higher-order languages use the continuation-passing style (CPS) transformation in a first phase to generate an intermediate representation of the source program. The salient aspect of this intermediate form is that all procedures take an argument that represents the rest of the computation (the "continuation"). Since the naïve CPS transformation considerably increases the size of programs, CPS compilers perform reductions to produce a more compact intermediate representation. Although often implemented as a part of the CPS transformation, this step is conceptually a second phase. Finally, code generators for typical CPS compilers treat continuations specially in order to optimize the interpretation of continuation parameters.

A thorough analysis of the abstract machine for CPS terms shows that the actions of the code generator invert the naïve CPS translation step. Put differently, the combined effect of the three phases is equivalent to a source-to-source transformation that simulates the compaction phase. Thus, fully developed CPS compilers do not need to employ the CPS transformation but can achieve the same results with a simple source-level transformation.

Supported in part by NSF grants CCR 89-17022 and CCR 91-22518 and Texas ATP grant 91-003604014. To appear in: 1993 Conference on Programming Language Design and Implementation, June 21-25, 1993, Albuquerque, New Mexico.

1 Compiling with Continuations

A number of prominent compilers for applicative higher-order programming languages use the language of continuation-passing style (CPS) terms as their intermediate representation for programs [2, 14, 18, 19]. This strategy apparently offers two major advantages. First, Plotkin [16] showed that the $\lambda$-value calculus based on the $\beta$-value rule is an operational semantics for the source language, that the conventional full $\lambda$-calculus is a semantics for the intermediate language, and, most importantly, that the $\lambda$-calculus proves more equations between CPS terms than the $\lambda_v$-calculus does between corresponding terms of the source language. Translated into practice, a compiler can perform more transformations on the intermediate language than on the source language [2:4-5]. Second, the language of CPS terms is basically a stylized assembly language, for which it is easy to generate actual assembly programs for different machines [2, 13, 20]. In short, the CPS transformation provides an organizational principle that simplifies the construction of compilers.

To gain a better understanding of the role that the CPS transformation plays in the compilation process, we recently studied the precise connection between the $\lambda_v$-calculus for source terms and the $\lambda$-calculus for CPS terms. The result of this research [17] was an extended $\lambda_v$-calculus that precisely corresponds to the $\lambda$-calculus of the intermediate CPS language and that is still semantically sound for the source language. The extended calculus includes a set of reductions, called the A-reductions, that simplify source terms in the same manner as realistic CPS transformations simplify the output of the naïve transformation. The effect of these reductions is to name all intermediate results and to merge code blocks across declarations and conditionals. Direct compilers typically perform these reductions on an ad hoc and incomplete basis.¹

The goal of the present paper is to show that the true purpose of using CPS terms as an intermediate representation is also achieved by using A-normal forms. We base our argument on a formal development of the abstract machine for the intermediate code of a CPS-based compiler. The development shows that this machine is identical to a machine for A-normal forms. Thus, the back end of an A-normal form compiler can employ the same code generation techniques that a CPS compiler uses. In short, A-normalization provides an organizational principle for the construction of compilers that combines various stages of fully developed CPS compilers in one straightforward transformation.

The next section reviews the syntax and semantics of a typical higher-order applicative language. The following section analyses CPS compilers for this language. Section 4 introduces the A-reductions and describes A-normal form compilers. Section 5 proves the equivalence between A-normal form compilers and realistic CPS compilers. The benefits of using A-normal form terms as an intermediate representation for compilers are the topic of Section 6. The appendix includes a linear A-normalization algorithm.

¹ Personal communication: H. Boehm (also [4]), K. Dybvig, R. Hieb (April 92).

2 Core Scheme

The source language is a simple higher-order applicative language. For our purposes, it suffices to consider the language of abstract syntax trees that is produced by the lexical and syntactic analysis module of the compiler: see Figure 1 for the context-free grammar of this language. The terms of the language are either values or non-values. Values include constants, variables, and procedures. Non-values include let-expressions (blocks), conditionals, function applications and primitive operations.² The sets of constants and primitive procedures are intentionally unspecified. For our purposes, it is irrelevant whether the language is statically typed like ML or dynamically typed like Scheme.

\[
\begin{array}{rcl}
M & ::= & V \\
  & \mid & (\mathbf{let}\ (x\ M_1)\ M_2) \\
  & \mid & (\mathbf{if0}\ M_1\ M_2\ M_3) \\
  & \mid & (M\ M_1 \ldots M_n) \\
  & \mid & (O\ M_1 \ldots M_n) \\
V & ::= & c \mid x \mid (\lambda x_1 \ldots x_n . M)
\end{array}
\qquad
\begin{array}{rcl}
V & \in & \textit{Values} \\
c & \in & \textit{Constants} \\
x & \in & \textit{Variables} \\
O & \in & \textit{Primitive Operations}
\end{array}
\]

Figure 1: Abstract Syntax of Core Scheme (CS)

The language Core Scheme has the following context-sensitive properties, which are assumed to be checked by the front-end of the compiler. In the procedure $(\lambda x_1 \ldots x_n . M)$ the parameters $x_1, \ldots, x_n$ are mutually distinct and bound in the body $M$. Similarly, the expression $(\mathbf{let}\ (x\ M_1)\ M_2)$ binds $x$ in $M_2$. A variable that is not bound by a $\lambda$ or a let is free; the set of free variables in a term $M$ is $FV(M)$. Like Barendregt [3:ch 2,3], we identify terms modulo bound variables and we assume that free and bound variables of distinct terms do not interfere in definitions or theorems.

The semantics of the language is a partial function from programs to answers. A program is a term with no free variables and an answer is a member of the syntactic category of constants. Following conventional tradition [1], we specify the operational semantics of Core Scheme with an abstract machine. The machine we use, the CEK machine [10], has three components: a control string $C$, an environment $E$ that includes bindings for all free variables in $C$, and a continuation $K$ that represents the "rest of the computation".

The CEK machine changes state according to the transition function in Figure 2. For example, the state transition for the block $(\mathbf{let}\ (x\ M_1)\ M_2)$ starts the evaluation of $M_1$ in the current environment $E$ and modifies the continuation register to encode the rest of the computation $\langle \mathbf{lt}\ x, M_2, E, K \rangle$. When the new continuation receives a value, it extends the environment with a value for $x$ and proceeds with the evaluation of $M_2$. The remaining clauses have similarly intuitive explanations.

The relation $\longmapsto^*$ is the reflexive transitive closure of the transition function. The function $\gamma$ constructs machine values from syntactic values and environments. The notation $E(x)$ refers to an algorithm for looking up the value of $x$ in the environment $E$. The operation $E[x_1 := V_1^*, \ldots, x_n := V_n^*]$ extends the environment $E$ such that subsequent lookups of $x_i$ return the value $V_i^*$. The object $\langle \mathbf{cl}\ x_1 \ldots x_n, M, E \rangle$ is a closure, a record that contains the code for $M$ and values for the free variables of $(\lambda x_1 \ldots x_n . M)$. The partial function $\delta$ abstracts the semantics of the primitive operations.

The CEK machine provides a model for designing direct compilers [6, 11, 15]. A compiler based on the CEK machine implements an efficient representation for environments, e.g., displays, and for continuations, e.g., a stack.³ The machine code produced by such a compiler realizes the abstract operations specified by the CEK machine by manipulating these concrete representations of environments and continuations.

² The language is overly simple but contains all ingredients that are necessary to generate our result for full ML or Scheme. In particular, the introduction of assignments, and even control operators, is orthogonal to the analysis of the CPS-based compilation strategy.

³ The machine also characterizes compilers for first-order languages, e.g., Fortran. In this case, the creation and deletion of the environment and continuation components always follows a stack-like behavior. Hence the machine reduces to a traditional stack machine.

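Semantics: Let $M \in CS$,

\[
eval_d(M) = c \quad \text{if } \langle M, \emptyset, \mathbf{stop} \rangle \longmapsto^* \langle \mathbf{stop}, c \rangle .
\]

Data Specifications:

\[
\begin{array}{rcll}
S \in State_d & = & CS \times Env_d \times Cont_d \ \mid\ Cont_d \times Value_d & \text{(machine states)} \\
E \in Env_d & = & Variables \rightharpoonup Value_d & \text{(environments)} \\
V^* \in Value_d & = & c \ \mid\ \langle \mathbf{cl}\ x_1 \ldots x_n, M, E \rangle & \text{(machine values)} \\
K \in Cont_d & = & \mathbf{stop} \ \mid\ \langle \mathbf{ap}\ \langle \ldots, V^*, \bullet, M, \ldots \rangle, E, K \rangle \ \mid\ \langle \mathbf{lt}\ x, M, E, K \rangle & \text{(continuations)} \\
 & \mid & \langle \mathbf{if}\ M_1, M_2, E, K \rangle \ \mid\ \langle \mathbf{pr}\ O, \langle \ldots, V^*, \bullet, M, \ldots \rangle, E, K \rangle &
\end{array}
\]

Transition Rules:

\[
\begin{array}{rcl}
\langle V, E, K \rangle & \longmapsto & \langle K, \gamma(V, E) \rangle \\
\langle (\mathbf{let}\ (x\ M_1)\ M_2), E, K \rangle & \longmapsto & \langle M_1, E, \langle \mathbf{lt}\ x, M_2, E, K \rangle \rangle \\
\langle (\mathbf{if0}\ M_1\ M_2\ M_3), E, K \rangle & \longmapsto & \langle M_1, E, \langle \mathbf{if}\ M_2, M_3, E, K \rangle \rangle \\
\langle (M\ M_1 \ldots M_n), E, K \rangle & \longmapsto & \langle M, E, \langle \mathbf{ap}\ \langle \bullet, M_1, \ldots, M_n \rangle, E, K \rangle \rangle \\
\langle (O\ M_1\ M_2 \ldots M_n), E, K \rangle & \longmapsto & \langle M_1, E, \langle \mathbf{pr}\ O, \langle \bullet, M_2, \ldots, M_n \rangle, E, K \rangle \rangle \\[1ex]
\langle \langle \mathbf{lt}\ x, M, E, K \rangle, V^* \rangle & \longmapsto & \langle M, E[x := V^*], K \rangle \\
\langle \langle \mathbf{if}\ M_1, M_2, E, K \rangle, 0 \rangle & \longmapsto & \langle M_1, E, K \rangle \\
\langle \langle \mathbf{if}\ M_1, M_2, E, K \rangle, V^* \rangle & \longmapsto & \langle M_2, E, K \rangle \quad \text{where } V^* \neq 0 \\
\langle \langle \mathbf{ap}\ \langle \ldots, V_i^*, \bullet, M, \ldots \rangle, E, K \rangle, V_{i+1}^* \rangle & \longmapsto & \langle M, E, \langle \mathbf{ap}\ \langle \ldots, V_i^*, V_{i+1}^*, \bullet, \ldots \rangle, E, K \rangle \rangle \\
\langle \langle \mathbf{ap}\ \langle V^*, V_1^*, \ldots, \bullet \rangle, E, K \rangle, V_n^* \rangle & \longmapsto & \langle M', E'[x_1 := V_1^*, \ldots, x_n := V_n^*], K \rangle \\
 & & \quad \text{if } V^* = \langle \mathbf{cl}\ x_1 \ldots x_n, M', E' \rangle \\
\langle \langle \mathbf{pr}\ O, \langle \ldots, V_i^*, \bullet, M, \ldots \rangle, E, K \rangle, V_{i+1}^* \rangle & \longmapsto & \langle M, E, \langle \mathbf{pr}\ O, \langle \ldots, V_i^*, V_{i+1}^*, \bullet, \ldots \rangle, E, K \rangle \rangle \\
\langle \langle \mathbf{pr}\ O, \langle V_1^*, \ldots, \bullet \rangle, E, K \rangle, V_n^* \rangle & \longmapsto & \langle K, \delta(O, V_1^*, \ldots, V_n^*) \rangle \\
 & & \quad \text{if } \delta(O, V_1^*, \ldots, V_n^*) \text{ is defined}
\end{array}
\]

Converting syntactic values to machine values:

\[
\begin{array}{rcl}
\gamma(c, E) & = & c \\
\gamma(x, E) & = & E(x) \\
\gamma((\lambda x_1 \ldots x_n . M), E) & = & \langle \mathbf{cl}\ x_1 \ldots x_n, M, E \rangle
\end{array}
\]

Figure 2: The CEK-machine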
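The transition rules of Figure 2 can be read almost directly as an interpreter loop. The following Python sketch is one possible rendering, not the authors' implementation. The tuple encoding of terms (`("let", x, M1, M2)`, `("if0", ...)`, `("app", ...)`, `("prim", O, ...)`, `("lam", params, body)`), the `DELTA` table standing in for the unspecified function $\delta$, and the use of dictionaries for environments are all assumptions made for illustration.

```python
# Sketch of the CEK machine of Figure 2. The term encoding and the contents
# of DELTA are assumptions for this example; the paper leaves delta abstract.
DELTA = {"+": lambda a, b: a + b, "-": lambda a, b: a - b, "*": lambda a, b: a * b}

def is_value(m):
    # Syntactic values: constants (int), variables (str), procedures ("lam", ...).
    return isinstance(m, (int, str)) or m[0] == "lam"

def gamma(v, env):
    """Convert a syntactic value to a machine value (bottom of Figure 2)."""
    if isinstance(v, int):
        return v
    if isinstance(v, str):
        return env[v]
    _, params, body = v
    return ("cl", params, body, env)

def evaluate(m, env=None):
    # A state is either ("eval", C, E, K) or ("ret", K, V*).
    state = ("eval", m, dict(env or {}), ("stop",))
    while True:
        if state[0] == "eval":
            _, c, e, k = state
            if is_value(c):
                state = ("ret", k, gamma(c, e))
            elif c[0] == "let":
                _, x, m1, m2 = c
                state = ("eval", m1, e, ("lt", x, m2, e, k))
            elif c[0] == "if0":
                _, m1, m2, m3 = c
                state = ("eval", m1, e, ("if", m2, m3, e, k))
            elif c[0] == "app":
                state = ("eval", c[1], e, ("ap", (), c[2:], e, k))
            else:  # ("prim", O, M1, ..., Mn) with n >= 1
                state = ("eval", c[2], e, ("pr", c[1], (), c[3:], e, k))
        else:
            _, k, v = state
            if k[0] == "stop":
                return v
            if k[0] == "lt":
                _, x, m, e, k2 = k
                state = ("eval", m, {**e, x: v}, k2)
            elif k[0] == "if":
                _, m1, m2, e, k2 = k
                state = ("eval", m1 if v == 0 else m2, e, k2)
            elif k[0] == "ap":
                _, done, todo, e, k2 = k
                done = done + (v,)
                if todo:  # more subterms to evaluate
                    state = ("eval", todo[0], e, ("ap", done, todo[1:], e, k2))
                else:     # all evaluated: enter the closure body
                    _, params, body, e1 = done[0]
                    state = ("eval", body, {**e1, **dict(zip(params, done[1:]))}, k2)
            else:  # "pr"
                _, op, done, todo, e, k2 = k
                done = done + (v,)
                if todo:
                    state = ("eval", todo[0], e, ("pr", op, done, todo[1:], e, k2))
                else:
                    state = ("ret", k2, DELTA[op](*done))

# The code segment N used later in the paper, closed by a hypothetical binding for f.
N = ("prim", "+", ("prim", "+", 2, 2), ("let", "x", 1, ("app", "f", "x")))
program = ("let", "f", ("lam", ("y",), ("prim", "*", "y", 10)), N)
print(evaluate(program))  # 14
```

Note how every non-value subterm pushes a fresh continuation frame. Sections 4 and 5 contrast this with the machine for A-normal forms, which builds a frame only for non-tail calls.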

3 CPS Compilers

Several compilers map source terms to a CPS intermediate representation before generating machine code. The function $\mathcal{F}$ [12] in Figure 3 is the basis of CPS transformations used in various compilers [2, 14, 19]. It uses special $\lambda$-expressions or continuations to encode the rest of the computation, thus shifting the burden of maintaining control information from the abstract machine to the code. The notation $(\bar\lambda x. \cdots)$ marks the administrative $\lambda$-expressions introduced by the CPS transformation. The primitive operation $O'$ used in the CPS language is equivalent to the operation $O$ for the source language, except that $O'$ takes an extra continuation argument, which receives the result once it is computed.

\[
\begin{array}{rcl}
\mathcal{F}[\![V]\!] & = & \bar\lambda k.(k\ \Phi[\![V]\!]) \\
\mathcal{F}[\![(\mathbf{let}\ (x\ M_1)\ M_2)]\!] & = & \bar\lambda k.(\mathcal{F}[\![M_1]\!]\ \bar\lambda t.(\mathbf{let}\ (x\ t)\ (\mathcal{F}[\![M_2]\!]\ k))) \\
\mathcal{F}[\![(\mathbf{if0}\ M_1\ M_2\ M_3)]\!] & = & \bar\lambda k.(\mathcal{F}[\![M_1]\!]\ \bar\lambda t.(\mathbf{if0}\ t\ (\mathcal{F}[\![M_2]\!]\ k)\ (\mathcal{F}[\![M_3]\!]\ k))) \\
\mathcal{F}[\![(M\ M_1 \cdots M_n)]\!] & = & \bar\lambda k.(\mathcal{F}[\![M]\!]\ \bar\lambda t.(\mathcal{F}[\![M_1]\!]\ \bar\lambda t_1. \cdots (\mathcal{F}[\![M_n]\!]\ \bar\lambda t_n.(t\ k\ t_1 \ldots t_n)))) \\
\mathcal{F}[\![(O\ M_1 \cdots M_n)]\!] & = & \bar\lambda k.(\mathcal{F}[\![M_1]\!]\ \bar\lambda t_1. \cdots (\mathcal{F}[\![M_n]\!]\ \bar\lambda t_n.(O'\ k\ t_1 \ldots t_n))) \\[1ex]
\Phi[\![c]\!] & = & c \\
\Phi[\![x]\!] & = & x \\
\Phi[\![\lambda x_1 \cdots x_n . M]\!] & = & \lambda k x_1 \cdots x_n . (\mathcal{F}[\![M]\!]\ k)
\end{array}
\]

Figure 3: The CPS transformation $\mathcal{F}$

The transformation $\mathcal{F}$ introduces a large number of administrative $\lambda$-expressions. For example, $\mathcal{F}$ maps the code segment

\[
N \stackrel{df}{=} (+\ (+\ 2\ 2)\ (\mathbf{let}\ (x\ 1)\ (f\ x)))
\]

into the CPS term

\[
\begin{array}{l}
((\bar\lambda k_2.\,((\bar\lambda k_3.\,(k_3\ 2)) \\
\qquad (\bar\lambda t_1.\,((\bar\lambda k_4.\,(k_4\ 2)) \\
\qquad\qquad (\bar\lambda t_2.\,(+'\ k_2\ t_1\ t_2)))))) \\
\quad (\bar\lambda t_3.\,((\bar\lambda k_5. \\
\qquad\qquad ((\bar\lambda k_6.\,(k_6\ 1)) \\
\qquad\qquad\quad (\bar\lambda t_4.\,(\mathbf{let}\ (x\ t_4) \\
\qquad\qquad\qquad ((\bar\lambda k_7.\,((\bar\lambda k_8.\,(k_8\ f)) \\
\qquad\qquad\qquad\qquad (\bar\lambda t_5.\,((\bar\lambda k_9.\,(k_9\ x)) \\
\qquad\qquad\qquad\qquad\quad (\bar\lambda t_6.\,(t_5\ k_7\ t_6)))))) \\
\qquad\qquad\qquad\quad k_5))))) \\
\qquad\quad (\bar\lambda t_7.\,(+'\ k\ t_3\ t_7))))))
\end{array}
\]

By convention, we ignore the context $(\bar\lambda k.[\ ])$ enclosing all CPS programs.

To decrease the number of administrative $\lambda$-abstractions, realistic CPS compilers include a simplification phase for compacting CPS terms [2:68-69, 14:5-6, 19:49-51]. For an analysis of this simplification phase, its optimality, and how it can be combined with $\mathcal{F}$, we refer the reader to Danvy and Filinski [9] and Sabry and Felleisen [17]. This phase simplifies administrative redexes of the form $((\bar\lambda x.P)\ Q)$ according to the rule:

\[
((\bar\lambda x.P)\ Q) \longrightarrow P[x := Q] \qquad (\bar\beta)
\]

The term $P[x := Q]$ is the result of the capture-free substitution of all free occurrences of $x$ in $P$ by $Q$; for example, $(\lambda x.xz)[z := (\lambda y.x)] = (\lambda u.u(\lambda y.x))$. Applying the reduction $\bar\beta$ to all the administrative redexes in our previous example produces the following $\bar\beta$-normal form term:

\[
cps(N) = (+'\ (\lambda t_1.\,(\mathbf{let}\ (x\ 1)\ (f\ (\lambda t_2.\,(+'\ k\ t_1\ t_2))\ x)))\ 2\ 2)
\]

The reduction $\bar\beta$ is strongly-normalizing on the language of CPS terms [17]. Hence, the simplification phase of a CPS compiler can remove all $\bar\beta$-redexes from the output of the translation $\mathcal{F}$.⁴ After the simplification phase, we no longer need to distinguish between regular and administrative $\lambda$-expressions, and use the notation $(\lambda \ldots)$ for both classes of $\lambda$-expression. With this convention, the language of $\bar\beta$-normal forms, $CPS(CS)$, is the following [17]:

\[
\begin{array}{rcll}
P & ::= & (k\ W) & \text{(return)} \\
  & \mid & (\mathbf{let}\ (x\ W)\ P) & \text{(bind)} \\
  & \mid & (\mathbf{if0}\ W\ P_1\ P_2) & \text{(branch)} \\
  & \mid & (W\ k\ W_1 \ldots W_n) & \text{(tail call)} \\
  & \mid & (W\ (\lambda x.P)\ W_1 \ldots W_n) & \text{(call)} \\
  & \mid & (O'\ k\ W_1 \ldots W_n) & \text{(prim-op)} \\
  & \mid & (O'\ (\lambda x.P)\ W_1 \ldots W_n) & \text{(prim-op)} \\[1ex]
W & ::= & c \mid x \mid (\lambda k x_1 \ldots x_n . P) & \text{(values)}
\end{array}
\]

Indeed, this language is typical of the intermediate representation used by CPS compilers [2, 14, 19].

⁴ The CPS translation of a conditional expression contains two references to the continuation variable $k$. Thus, the $\bar\beta$-normalization phase can produce exponentially larger output. Modifying the CPS algorithm to avoid duplicating $k$ removes the potential for exponential growth. The rest of our technical development can be adapted mutatis mutandis.

Naïve CPS Compilers  The abstract machine that characterizes the code generator of a naïve CPS compiler is the $C_{cps}E$ machine. Since terms in $CPS(CS)$ contain an encoding of control-flow information, the machine does not require a continuation component ($K$) to record the rest of the computation. Evaluation proceeds according to the state transition function in Figure 4. For example, the state transition for the tail call $(W\ k\ W_1 \ldots W_n)$ computes a closure $\langle \mathbf{cl}\ k' x_1 \ldots x_n, P', E' \rangle$ corresponding to $W$, extends $E'$ with the values of $k, W_1, \ldots, W_n$ and starts the interpretation of $P'$.

Semantics: Let $P \in CPS(CS)$,

\[
eval_n(P) = c \quad \text{if } \langle P, \emptyset[k := \langle \mathbf{cl}\ x, (k\ x), \emptyset[k := \mathbf{stop}] \rangle] \rangle \longmapsto^* \langle (k\ x), \emptyset[x := c, k := \mathbf{stop}] \rangle .
\]

Data Specifications:

\[
\begin{array}{rcll}
S_n \in State_n & = & CPS(CS) \times Env_n & \text{(machine states)} \\
E \in Env_n & = & Variables \rightharpoonup Value_n & \text{(environments)} \\
W^* \in Value_n & = & c \ \mid\ \langle \mathbf{cl}\ k x_1 \ldots x_n, P, E \rangle \ \mid\ \langle \mathbf{cl}\ x, P, E \rangle \ \mid\ \mathbf{stop} & \text{(machine values)}
\end{array}
\]

Transition Rules (in each rule, $W_i^* = \mu(W_i, E)$ for $1 \le i \le n$):

\[
\begin{array}{rcl}
\langle (k\ W), E \rangle & \longmapsto & \langle P', E'[x := \mu(W, E)] \rangle \quad \text{where } E(k) = \langle \mathbf{cl}\ x, P', E' \rangle \\
\langle (\mathbf{let}\ (x\ W)\ P), E \rangle & \longmapsto & \langle P, E[x := \mu(W, E)] \rangle \\
\langle (\mathbf{if0}\ W\ P_1\ P_2), E \rangle & \longmapsto & \langle P_1, E \rangle \text{ where } \mu(W, E) = 0, \text{ or } \langle P_2, E \rangle \text{ where } \mu(W, E) \neq 0 \\
\langle (W\ k\ W_1 \ldots W_n), E \rangle & \longmapsto & \langle P', E'[k' := E(k), x_1 := W_1^*, \ldots, x_n := W_n^*] \rangle \\
 & & \quad \text{where } \mu(W, E) = \langle \mathbf{cl}\ k' x_1 \ldots x_n, P', E' \rangle \\
\langle (W\ (\lambda x.P)\ W_1 \ldots W_n), E \rangle & \longmapsto & \langle P', E'[k' := \langle \mathbf{cl}\ x, P, E \rangle, x_1 := W_1^*, \ldots, x_n := W_n^*] \rangle \\
 & & \quad \text{where } \mu(W, E) = \langle \mathbf{cl}\ k' x_1 \ldots x_n, P', E' \rangle \\
\langle (O'\ k\ W_1 \ldots W_n), E \rangle & \longmapsto & \langle P', E'[x := \delta_c(O', W_1^*, \ldots, W_n^*)] \rangle \\
 & & \quad \text{if } \delta_c(O', W_1^*, \ldots, W_n^*) \text{ is defined, where } E(k) = \langle \mathbf{cl}\ x, P', E' \rangle \\
\langle (O'\ (\lambda x.P)\ W_1 \ldots W_n), E \rangle & \longmapsto & \langle P, E[x := \delta_c(O', W_1^*, \ldots, W_n^*)] \rangle \\
 & & \quad \text{if } \delta_c(O', W_1^*, \ldots, W_n^*) \text{ is defined}
\end{array}
\]

Converting syntactic values to machine values:

\[
\begin{array}{rcl}
\mu(c, E) & = & c \\
\mu(x, E) & = & E(x) \\
\mu((\lambda k x_1 \ldots x_n . P), E) & = & \langle \mathbf{cl}\ k x_1 \ldots x_n, P, E \rangle
\end{array}
\]

Figure 4: The naïve CPS abstract machine: the $C_{cps}E$ machine.

Realistic CPS Compilers  Although the $C_{cps}E$ machine describes what a naïve CPS compiler would do, typical compilers deviate from this model in two regards.

First, the naïve abstract machine for CPS code represents the continuation as an ordinary closure. Yet, realistic CPS compilers "mark" the continuation closure as a special closure. For example, Shivers partitions procedures and continuations in order to improve the data flow analysis of CPS programs [18:sec 3.8.3]. Also, in both Orbit [14] and Rabbit [19], the allocation strategy of a closure changes if the closure is a continuation. Similarly, Appel [2:114-124] describes various techniques for closure allocation that treat the continuation closure in a special way. In order to reflect these changes in the machine, we tag continuation closures with a special marker "ar" that describes them as activation records.

Second, the CPS representation of any user-defined procedure receives a continuation argument. However, Steele [19] modifies the CPS transformation with a "continuation variable hack" [19:94] that recognizes instances of CPS terms like $((\lambda k_1 \cdots . P)\ k_2 \ldots)$ and transforms them to $((\lambda \cdots . P[k_1 := k_2]) \ldots)$. This "optimization" eliminates "some of the register shuffling" [19:94] during the evaluation of the term. Appel [2] achieves the same effect without modifying the CPS transformation by letting the variables $k_1$ and $k_2$ share the same register during the procedure call.

In terms of the CPS abstract machine, the optimization corresponds to a modification of the operation $E'[k' := E(k), x_1 := W_1^*, \ldots, x_n := W_n^*]$ to $E'[x_1 := W_1^*, \ldots, x_n := W_n^*]$ such that $E$ and $E'$ share the binding of $k$. In order to make the sharing explicit, we split the environment into two components: a component $E^k$ that includes the binding for the continuation, and a component $E^-$ that includes the rest of the bindings, and treat each component independently. This optimization relies on the fact that every control string has exactly one free continuation variable, which implies that the corresponding value can be held in a special register.⁵

Performing these modifications on the naïve abstract machine produces the realistic CPS abstract machine in Figure 5. The new $C_{cps}EK$ machine extracts the information regarding the continuation from the CPS terms and manages the continuation in an optimized way. For example, the state transition for the tail call $(W\ k\ W_1 \ldots W_n)$ evaluates $W$ to a closure $\langle \mathbf{cl}\ k x_1 \ldots x_n, P', E_1^- \rangle$, extends $E_1^-$ with the values of $W_1, \ldots, W_n$ and starts the execution of $P'$. In particular, there is no need to extend $E_1^-$ with the value of $k$ as this value remains in the environment component $E^k$.

⁵ This fact also holds in the presence of control operators as there is always one identifiable current continuation.

4 A-Normal Form Compilers

A close inspection of the $C_{cps}EK$ machine reveals that the control strings often contain redundant information considering the way instructions are executed. First, a return instruction, i.e., the transition $\longmapsto_c^{(1)}$, dispatches on the term $(k\ W)$, which informs the machine that the "return address" is denoted by the value of the variable $k$. The machine ignores this information since a return instruction automatically uses the value of register $E^k$ as the "return address". Second, the call instructions, i.e., transitions $\longmapsto_c^{(4)}$ and $\longmapsto_c^{(5)}$, invoke closures that expect, among other arguments, a continuation $k$. Again, the machine ignores the continuation parameter in the closures and manipulates the "global" register $E^k$ instead.

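Semantics: Let $P \in CPS(CS)$,

\[
eval_c(P) = c \quad \text{if } \langle P, \emptyset, \langle \mathbf{ar}\ x, (k\ x), \emptyset, \mathbf{stop} \rangle \rangle \longmapsto_c^* \langle (k\ x), \emptyset[x := c], \mathbf{stop} \rangle .
\]

Data Specifications:

\[
\begin{array}{rcll}
S_c \in State_c & = & CPS(CS) \times Env_c \times Cont_c & \text{(machine states)} \\
E^- \in Env_c & = & Variables \rightharpoonup Value_c & \text{(environments)} \\
W^* \in Value_c & = & c \ \mid\ \langle \mathbf{cl}\ k x_1 \ldots x_n, P, E^- \rangle & \text{(machine values)} \\
E^k \in Cont_c & = & \mathbf{stop} \ \mid\ \langle \mathbf{ar}\ x, P, E^-, E^k \rangle & \text{(continuations)}
\end{array}
\]

Transition Rules (in each rule, $W_i^* = \mu(W_i, E^-)$ for $1 \le i \le n$):

\[
\begin{array}{rcl}
\langle (k\ W), E^-, E^k \rangle & \longmapsto_c^{(1)} & \langle P', E_1^-[x := \mu(W, E^-)], E_1^k \rangle \quad \text{where } E^k = \langle \mathbf{ar}\ x, P', E_1^-, E_1^k \rangle \\
\langle (\mathbf{let}\ (x\ W)\ P), E^-, E^k \rangle & \longmapsto_c^{(2)} & \langle P, E^-[x := \mu(W, E^-)], E^k \rangle \\
\langle (\mathbf{if0}\ W\ P_1\ P_2), E^-, E^k \rangle & \longmapsto_c^{(3)} & \langle P_1, E^-, E^k \rangle \text{ where } \mu(W, E^-) = 0, \\
 & & \text{or } \langle P_2, E^-, E^k \rangle \text{ where } \mu(W, E^-) \neq 0 \\
\langle (W\ k\ W_1 \ldots W_n), E^-, E^k \rangle & \longmapsto_c^{(4)} & \langle P', E_1^-[x_1 := W_1^*, \ldots, x_n := W_n^*], E^k \rangle \\
 & & \quad \text{where } \mu(W, E^-) = \langle \mathbf{cl}\ k' x_1 \ldots x_n, P', E_1^- \rangle \\
\langle (W\ (\lambda x.P)\ W_1 \ldots W_n), E^-, E^k \rangle & \longmapsto_c^{(5)} & \langle P', E_1^-[x_1 := W_1^*, \ldots, x_n := W_n^*], \langle \mathbf{ar}\ x, P, E^-, E^k \rangle \rangle \\
 & & \quad \text{where } \mu(W, E^-) = \langle \mathbf{cl}\ k' x_1 \ldots x_n, P', E_1^- \rangle \\
\langle (O'\ k\ W_1 \ldots W_n), E^-, E^k \rangle & \longmapsto_c^{(6)} & \langle P', E_1^-[x := \delta_c(O', W_1^*, \ldots, W_n^*)], E_1^k \rangle \\
 & & \quad \text{if } \delta_c(O', W_1^*, \ldots, W_n^*) \text{ is defined, where } E^k = \langle \mathbf{ar}\ x, P', E_1^-, E_1^k \rangle \\
\langle (O'\ (\lambda x.P)\ W_1 \ldots W_n), E^-, E^k \rangle & \longmapsto_c^{(7)} & \langle P, E^-[x := \delta_c(O', W_1^*, \ldots, W_n^*)], E^k \rangle \\
 & & \quad \text{if } \delta_c(O', W_1^*, \ldots, W_n^*) \text{ is defined}
\end{array}
\]

Figure 5: The realistic CPS abstract machine: the $C_{cps}EK$ machine.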
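The first two phases of a CPS compiler described in Section 3, the naïve transformation of Figure 3 and the $\bar\beta$-normalization that compacts its output, can be made concrete with a short Python sketch. This is an illustration under stated assumptions rather than any compiler's actual code. The tuple encoding of terms, the tag `"alam"` for administrative $\lambda$-expressions, and the helpers `fresh`, `subst`, `normalize`, and `cps` are hypothetical names. Substitution is implemented without renaming, which is only safe under the paper's assumption that all bound variables are distinct.

```python
# Sketch: naive CPS transformation F (Figure 3) followed by beta-bar
# normalization. The encoding and all helper names are assumptions.
_counters = {}

def fresh(prefix):
    # Generate k1, k2, ... and t1, t2, ... (assumed distinct from user variables).
    _counters[prefix] = _counters.get(prefix, 0) + 1
    return f"{prefix}{_counters[prefix]}"

def is_value(m):
    return isinstance(m, (int, str)) or m[0] == "lam"

def phi(v):
    if isinstance(v, tuple):  # a procedure gains a continuation parameter k
        k = fresh("k")
        return ("lam", (k,) + v[1], ("app", F(v[2]), k))
    return v

def F(m):
    k = fresh("k")
    if is_value(m):
        return ("alam", k, ("app", k, phi(m)))
    if m[0] == "let":
        _, x, m1, m2 = m
        t = fresh("t")
        return ("alam", k, ("app", F(m1), ("alam", t, ("let", x, t, ("app", F(m2), k)))))
    if m[0] == "if0":
        _, m1, m2, m3 = m
        t = fresh("t")
        branch = ("if0", t, ("app", F(m2), k), ("app", F(m3), k))
        return ("alam", k, ("app", F(m1), ("alam", t, branch)))
    # Applications and primitive operations evaluate their subterms left to right.
    parts = m[2:] if m[0] == "prim" else m[1:]
    ts = tuple(fresh("t") for _ in parts)
    if m[0] == "prim":
        body = ("prim", m[1] + "'", k) + ts   # O' takes the continuation first
    else:
        body = ("app", ts[0], k) + ts[1:]
    for part, t in reversed(list(zip(parts, ts))):
        body = ("app", F(part), ("alam", t, body))
    return ("alam", k, body)

def subst(p, x, q):
    """P[x := Q]; no renaming, so all bound names must be distinct."""
    if p == x:
        return q
    if not isinstance(p, tuple):
        return p
    if p[0] in ("lam", "alam"):
        return (p[0], p[1], subst(p[2], x, q))
    if p[0] == "let":
        return ("let", p[1], subst(p[2], x, q), subst(p[3], x, q))
    return (p[0],) + tuple(subst(c, x, q) for c in p[1:])

def normalize(p):
    """Contract every administrative redex ((alam x. P) Q)."""
    if not isinstance(p, tuple):
        return p
    if p[0] == "app" and len(p) == 3 and isinstance(p[1], tuple) and p[1][0] == "alam":
        _, (_, x, body), q = p
        return normalize(subst(body, x, q))
    if p[0] in ("lam", "alam"):
        return (p[0], p[1], normalize(p[2]))
    return (p[0],) + tuple(normalize(c) for c in p[1:])

def cps(m):
    _counters.clear()
    return normalize(("app", F(m), "k"))  # "k" is the implicit top-level continuation

N = ("prim", "+", ("prim", "+", 2, 2), ("let", "x", 1, ("app", "f", "x")))
print(cps(N))
```

For the code segment N the printed term has the shape of cps(N) from Section 3: the outer `+'` receives a continuation binding `t1`, and the call to `f` receives a continuation binding `t2`. Administrative lambdas that survive as continuation arguments keep the `"alam"` tag only because this sketch does not rename them.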

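Undoing CPS  The crucial insight is that the elimination of the redundant information from the $C_{cps}EK$ machine corresponds to an inverse CPS transformation [7, 17] on the intermediate code. The function $\mathcal{U}$ in Figure 6 realizes such an inverse [17]. The inverse transformation formalizes our intuition about the redundancies in the $C_{cps}EK$ machine. It eliminates the variable $k$ from return instructions as well as the parameter $k$ from procedures. The latter change implies that continuations are not passed as arguments in function calls but rather become contexts surrounding the calls. For example, the code segment $cps(N)$ in Section 3 becomes:

\[
\begin{array}{l}
A(N) = (\mathbf{let}\ (t_1\ (+\ 2\ 2)) \\
\qquad\qquad (\mathbf{let}\ (x\ 1) \\
\qquad\qquad\quad (\mathbf{let}\ (t_2\ (f\ x)) \\
\qquad\qquad\qquad (+\ t_1\ t_2))))
\end{array}
\]

The inverse CPS transformation:

\[
\begin{array}{rcl}
\mathcal{U} & : & CPS(CS) \rightarrow A(CS) \\
\mathcal{U}[\![(k\ W)]\!] & = & \Psi[\![W]\!] \\
\mathcal{U}[\![(\mathbf{let}\ (x\ W)\ P)]\!] & = & (\mathbf{let}\ (x\ \Psi[\![W]\!])\ \mathcal{U}[\![P]\!]) \\
\mathcal{U}[\![(\mathbf{if0}\ W\ P_1\ P_2)]\!] & = & (\mathbf{if0}\ \Psi[\![W]\!]\ \mathcal{U}[\![P_1]\!]\ \mathcal{U}[\![P_2]\!]) \\
\mathcal{U}[\![(W\ k\ W_1 \ldots W_n)]\!] & = & (\Psi[\![W]\!]\ \Psi[\![W_1]\!] \ldots \Psi[\![W_n]\!]) \\
\mathcal{U}[\![(W\ (\lambda x.P)\ W_1 \ldots W_n)]\!] & = & (\mathbf{let}\ (x\ (\Psi[\![W]\!]\ \Psi[\![W_1]\!] \ldots \Psi[\![W_n]\!]))\ \mathcal{U}[\![P]\!]) \\
\mathcal{U}[\![(O'\ k\ W_1 \ldots W_n)]\!] & = & (O\ \Psi[\![W_1]\!] \ldots \Psi[\![W_n]\!]) \\
\mathcal{U}[\![(O'\ (\lambda x.P)\ W_1 \ldots W_n)]\!] & = & (\mathbf{let}\ (x\ (O\ \Psi[\![W_1]\!] \ldots \Psi[\![W_n]\!]))\ \mathcal{U}[\![P]\!]) \\[1ex]
\Psi & : & W \rightarrow V \\
\Psi[\![c]\!] & = & c \\
\Psi[\![x]\!] & = & x \\
\Psi[\![\lambda k x_1 \ldots x_n . P]\!] & = & \lambda x_1 \ldots x_n . \mathcal{U}[\![P]\!]
\end{array}
\]

The language $A(CS)$:

\[
\begin{array}{rcll}
M & ::= & V & \text{(return)} \\
  & \mid & (\mathbf{let}\ (x\ V)\ M) & \text{(bind)} \\
  & \mid & (\mathbf{if0}\ V\ M\ M) & \text{(branch)} \\
  & \mid & (V\ V_1 \ldots V_n) & \text{(tail call)} \\
  & \mid & (\mathbf{let}\ (x\ (V\ V_1 \ldots V_n))\ M) & \text{(call)} \\
  & \mid & (O\ V_1 \ldots V_n) & \text{(prim-op)} \\
  & \mid & (\mathbf{let}\ (x\ (O\ V_1 \ldots V_n))\ M) & \text{(prim-op)} \\[1ex]
V & ::= & c \mid x \mid (\lambda x_1 \ldots x_n . M) & \text{(values)}
\end{array}
\]

Figure 6: The inverse CPS transformation and its output

Based on the above argument, it appears that CPS compilers perform a sequence of three steps:

\[
\begin{array}{ccc}
CS & \xrightarrow{\ \text{CPS}\ } & \cdot \\
{\scriptstyle A}\ \downarrow & & \downarrow\ {\scriptstyle \bar\beta\text{-normalization}} \\
A(CS) & \xleftarrow{\ \text{un-CPS}\ } & CPS(CS)
\end{array}
\]

The diagram naturally suggests a direct translation $A$ that combines the effects of the three phases. The identification of the translation $A$ requires a theorem relating $\bar\beta$-reductions on CPS terms to reductions on the source language. This correspondence of reductions was the subject of our previous paper [17]. The resulting set of source reductions, the A-reductions, is in Figure 7.⁶ Since the A-reductions are strongly normalizing, we can characterize the translation $A$ as any function that applies the A-reductions to a source term until it reaches a normal form [17:Theorem 6.4].

⁶ Danvy [8] and Weise [21] also recognize that the compaction of CPS terms can be expressed in the source language, but do not explore this topic systematically.

The definition of the A-reductions refers to the concept of evaluation contexts. An evaluation context is a term with a "hole" (denoted by $[\ ]$) in the place of one subterm. The location of the hole points to the next subexpression to be evaluated according to the CEK semantics. For example, in an expression $(\mathbf{let}\ (x\ M_1)\ M_2)$, the next reducible expression must occur within $M_1$, hence the definition of evaluation contexts includes the clause $(\mathbf{let}\ (x\ \mathcal{E})\ M)$.

Evaluation Contexts:

\[
\mathcal{E} ::= [\ ] \ \mid\ (\mathbf{let}\ (x\ \mathcal{E})\ M) \ \mid\ (\mathbf{if0}\ \mathcal{E}\ M\ M) \ \mid\ (F\ V \ldots V\ \mathcal{E}\ M \ldots M) \qquad \text{where } F = V \text{ or } F = O
\]

The A-reductions:

\[
\begin{array}{rcll}
\mathcal{E}[(\mathbf{let}\ (x\ M)\ N)] & \longrightarrow & (\mathbf{let}\ (x\ M)\ \mathcal{E}[N]) & (A_1) \\
 & & \quad \text{where } \mathcal{E} \neq [\ ],\ x \notin FV(\mathcal{E}) & \\
\mathcal{E}[(\mathbf{if0}\ V\ M_1\ M_2)] & \longrightarrow & (\mathbf{if0}\ V\ \mathcal{E}[M_1]\ \mathcal{E}[M_2]) & (A_2) \\
 & & \quad \text{where } \mathcal{E} \neq [\ ] & \\
\mathcal{E}[(F\ V_1 \ldots V_n)] & \longrightarrow & (\mathbf{let}\ (t\ (F\ V_1 \ldots V_n))\ \mathcal{E}[t]) & (A_3) \\
 & & \quad \text{where } F = V \text{ or } F = O,\ \mathcal{E} \neq \mathcal{E}'[(\mathbf{let}\ (z\ [\ ])\ M)],\ \mathcal{E} \neq [\ ],\ t \notin FV(\mathcal{E}) &
\end{array}
\]

Figure 7: Evaluation contexts and the set of A-reductions

The A-reductions transform programs in a natural and intuitive manner. The first two reductions merge code segments across declarations and conditionals. The last reduction lifts redexes out of evaluation contexts and names intermediate results. Using evaluation contexts and the A-reductions, we can rewrite our sample code segment $N$ in Section 3 as follows. For clarity, we surround the reducible term with a box:

\[
\begin{array}{rll}
N = & \boxed{(+\ (+\ 2\ 2)\ (\mathbf{let}\ (x\ 1)\ (f\ x)))} & \\[1ex]
\longrightarrow & (\mathbf{let}\ (t_1\ (+\ 2\ 2))\ \ \boxed{(+\ t_1\ (\mathbf{let}\ (x\ 1)\ (f\ x)))}\ ) & (A_3) \\[1ex]
\longrightarrow & (\mathbf{let}\ (t_1\ (+\ 2\ 2))\ (\mathbf{let}\ (x\ 1)\ \ \boxed{(+\ t_1\ (f\ x))}\ )) & (A_1) \\[1ex]
\longrightarrow & (\mathbf{let}\ (t_1\ (+\ 2\ 2))\ (\mathbf{let}\ (x\ 1)\ (\mathbf{let}\ (t_2\ (f\ x))\ (+\ t_1\ t_2)))) & (A_3)
\end{array}
\]

The appendix includes a linear algorithm that maps Core Scheme terms to their normal form with respect to the A-reductions.

Compilers  In order to establish that the A-reductions generate the actual intermediate code of CPS compilers, we design an abstract machine for the language of A-normal forms, the $C_aEK$ machine, and prove that this machine is "equivalent" to the CPS machine in Figure 5.

The $C_aEK$ machine is a CEK machine specialized to the subset of Core Scheme in A-normal form (Figure 6). The machine (see Figure 8) has only two kinds of continuations: the continuation $\mathbf{stop}$, and continuations of the form $\langle \mathbf{ar}\ x, M, E, K \rangle$. Unlike the CEK machine, the $C_aEK$ machine only needs to build a continuation for the evaluation of a non-tail function call. For example, the transition rule for the tail call $(V\ V_1 \ldots V_n)$ evaluates $V$ to a closure $\langle \mathbf{cl}\ x_1 \ldots x_n, M', E' \rangle$, extends the environment $E'$ with the values of $V_1, \ldots, V_n$ and continues with the execution of $M'$. The continuation component remains in the register $K$. By comparison, the CEK machine would build a separate continuation for the evaluation of each sub-expression $V, V_1, \ldots, V_n$.

Semantics: Let $M \in A(CS)$,

\[
eval_a(M) = c \quad \text{if } \langle M, \emptyset, \langle \mathbf{ar}\ x, x, \emptyset, \mathbf{stop} \rangle \rangle \longmapsto_a^* \langle x, \emptyset[x := c], \mathbf{stop} \rangle .
\]

Data Specifications:

\[
\begin{array}{rcll}
S_a \in State_a & = & A(CS) \times Env_a \times Cont_a & \text{(machine states)} \\
E \in Env_a & = & Variables \rightharpoonup Value_a & \text{(environments)} \\
V^* \in Value_a & = & c \ \mid\ \langle \mathbf{cl}\ x_1 \ldots x_n, M, E \rangle & \text{(machine values)} \\
K \in Cont_a & = & \mathbf{stop} \ \mid\ \langle \mathbf{ar}\ x, M, E, K \rangle & \text{(continuations)}
\end{array}
\]

Transition Rules (in each rule, $V_i^* = \gamma(V_i, E)$ for $1 \le i \le n$):

\[
\begin{array}{rcl}
\langle V, E, K \rangle & \longmapsto_a^{(1)} & \langle M', E'[x := \gamma(V, E)], K' \rangle \quad \text{where } K = \langle \mathbf{ar}\ x, M', E', K' \rangle \\
\langle (\mathbf{let}\ (x\ V)\ M), E, K \rangle & \longmapsto_a^{(2)} & \langle M, E[x := \gamma(V, E)], K \rangle \\
\langle (\mathbf{if0}\ V\ M_1\ M_2), E, K \rangle & \longmapsto_a^{(3)} & \langle M_1, E, K \rangle \text{ where } \gamma(V, E) = 0, \\
 & & \text{or } \langle M_2, E, K \rangle \text{ where } \gamma(V, E) \neq 0 \\
\langle (V\ V_1 \ldots V_n), E, K \rangle & \longmapsto_a^{(4)} & \langle M', E'[x_1 := V_1^*, \ldots, x_n := V_n^*], K \rangle \\
 & & \quad \text{where } \gamma(V, E) = \langle \mathbf{cl}\ x_1 \ldots x_n, M', E' \rangle \\
\langle (\mathbf{let}\ (x\ (V\ V_1 \ldots V_n))\ M), E, K \rangle & \longmapsto_a^{(5)} & \langle M', E'[x_1 := V_1^*, \ldots, x_n := V_n^*], \langle \mathbf{ar}\ x, M, E, K \rangle \rangle \\
 & & \quad \text{where } \gamma(V, E) = \langle \mathbf{cl}\ x_1 \ldots x_n, M', E' \rangle \\
\langle (O\ V_1 \ldots V_n), E, K \rangle & \longmapsto_a^{(6)} & \langle M', E'[x := \delta(O, V_1^*, \ldots, V_n^*)], K' \rangle \\
 & & \quad \text{if } \delta(O, V_1^*, \ldots, V_n^*) \text{ is defined, where } K = \langle \mathbf{ar}\ x, M', E', K' \rangle \\
\langle (\mathbf{let}\ (x\ (O\ V_1 \ldots V_n))\ M), E, K \rangle & \longmapsto_a^{(7)} & \langle M, E[x := \delta(O, V_1^*, \ldots, V_n^*)], K \rangle \\
 & & \quad \text{if } \delta(O, V_1^*, \ldots, V_n^*) \text{ is defined}
\end{array}
\]

Figure 8: The $C_aEK$ machine

5 Equivalence of Compilation Strategies

A comparison of Figures 5 and 8 suggests a close relationship between the $C_{cps}EK$ machine and the $C_aEK$ machine. In fact, the two machines are identical modulo the syntax of the control strings, as corresponding state transitions on the two machines perform the same abstract operations. Currently, the transition rules for these machines are defined using pattern matching on the syntax of terms. Once we reformulate these rules using predicates and selectors for abstract syntax, we can see the correspondence more clearly.

For example, we can abstract the transition rules $\longmapsto_a^{(5)}$ and $\longmapsto_c^{(5)}$ from the term syntax as the higher-order functional $T_5$:

\[
\begin{array}{l}
T_5[\textit{call-var}, \textit{call-body}, \textit{call?}, \textit{call-args}, \textit{call-fn}] = \\
\qquad \langle C, E, K \rangle \longmapsto \cdots \quad \text{if } \textit{call?}(C) \\
\qquad\qquad \text{where } x = \textit{call-var}(C) \\
\qquad\qquad\qquad\ \ M = \textit{call-body}(C) \\
\qquad\qquad\qquad\ \ V = \textit{call-fn}(C) \\
\qquad\qquad\qquad\ \ V_1, \ldots, V_n = \textit{call-args}(C)
\end{array}
\]

The arguments to $T_5$ are abstract-syntax functions for manipulating terms in a syntax-independent manner. Applying $T_5$ to the appropriate functions produces either the transition rule $\longmapsto_a^{(5)}$ of the $C_aEK$ machine or the rule $\longmapsto_c^{(5)}$ of the $C_{cps}EK$ machine, i.e.,

\[
\begin{array}{rcl}
\longmapsto_a^{(5)} & = & T_5[\textit{A-call-var}, \ldots, \textit{A-call-fn}] \\
\longmapsto_c^{(5)} & = & T_5[\textit{cps-call-var}, \ldots, \textit{cps-call-fn}]
\end{array}
\]

Suitable definitions of the syntax-functions for the language $A(CS)$ are:

\[
\begin{array}{rcl}
\textit{A-call-var}[\![(\mathbf{let}\ (x\ (V\ V_1 \ldots V_n))\ M)]\!] & = & x \\
\textit{A-call-body}[\![(\mathbf{let}\ (x\ (V\ V_1 \ldots V_n))\ M)]\!] & = & M \\
 & \vdots & \\
\textit{A-call-fn}[\![(\mathbf{let}\ (x\ (V\ V_1 \ldots V_n))\ M)]\!] & = & V
\end{array}
\]

Definitions for the language $CPS(CS)$ follow a similar pattern:

\[
\begin{array}{rcl}
\textit{cps-call-var}[\![(W\ (\lambda x.P)\ W_1 \ldots W_n)]\!] & = & x \\
\textit{cps-call-body}[\![(W\ (\lambda x.P)\ W_1 \ldots W_n)]\!] & = & P \\
 & \vdots & \\
\textit{cps-call-fn}[\![(W\ (\lambda x.P)\ W_1 \ldots W_n)]\!] & = & W
\end{array}
\]

In the same manner, we can abstract each pair of transition rules $\longmapsto_a^{(n)}$ and $\longmapsto_c^{(n)}$ as a higher-order functional $T_n$.

Let $S_a$ and $S_c$ be abstract-syntax functions appropriate for A-normal forms and CPS terms, respectively. Then the following theorem characterizes the relationship between the two transition functions.

Theorem 5.1 (Machine Equivalence) For $1 \le n \le 7$, $\longmapsto_a^{(n)} = T_n[S_a]$ and $\longmapsto_c^{(n)} = T_n[S_c]$.

The theorem states that the transition functions of the $C_aEK$ and $C_{cps}EK$ machines are identical modulo syntax. However, in order to show that the evaluation of an A-normal form term $M$ and its CPS counterpart on the respective machines produces exactly the same behavior, we also need to prove that there exists a bijection $\mathcal{M}$ between machine states that commutes with the transition rules.

Definition 5.2. ($\mathcal{M}$, $\mathcal{R}$, $\mathcal{V}$, and $\mathcal{K}$)

\[
\begin{array}{rcl}
\mathcal{M} & : & State_c \longrightarrow State_a \\
\mathcal{M}(\langle P, E^-, E^k \rangle) & = & \langle \mathcal{U}[\![P]\!], \mathcal{R}(E^-), \mathcal{K}(E^k) \rangle \\[1ex]
\mathcal{R} & : & Env_c \longrightarrow Env_a \\
\mathcal{R}(E^-) & = & E \quad \text{where } E(x) = \mathcal{V}(E^-(x)) \\[1ex]
\mathcal{V} & : & Value_c \longrightarrow Value_a \\
\mathcal{V}(c) & = & c \\
\mathcal{V}(\langle \mathbf{cl}\ k x_1 \ldots x_n, P, E^- \rangle) & = & \langle \mathbf{cl}\ x_1 \ldots x_n, \mathcal{U}[\![P]\!], \mathcal{R}(E^-) \rangle \\[1ex]
\mathcal{K} & : & Cont_c \longrightarrow Cont_a \\
\mathcal{K}(\mathbf{stop}) & = & \mathbf{stop} \\
\mathcal{K}(\langle \mathbf{ar}\ x, P, E^-, E^k \rangle) & = & \langle \mathbf{ar}\ x, \mathcal{U}[\![P]\!], \mathcal{R}(E^-), \mathcal{K}(E^k) \rangle
\end{array}
\]

Intuitively, the function $\mathcal{M}$ maps $C_{cps}EK$ machine states to $C_aEK$ machine states, and $\mathcal{R}$, $\mathcal{V}$ and $\mathcal{K}$ perform a similar mapping for environments, machine values and continuations respectively. We can now formalize the previously stated requirement that $O$ and $O'$ behave in the same manner.

Requirement  For all $W_1^*, \ldots, W_n^* \in Value_c$,

\[
\mathcal{V}(\delta_c(O', W_1^*, \ldots, W_n^*)) = \delta(O, \mathcal{V}(W_1^*), \ldots, \mathcal{V}(W_n^*)).
\]

The function $\mathcal{M}$ commutes with the state transition functions.

Theorem 5.3 (Commutativity Theorem) Let $S \in State_c$: $S \longmapsto_c^{(n)} S'$ if and only if $\mathcal{M}(S) \longmapsto_a^{(n)} \mathcal{M}(S')$.

\[
\begin{array}{ccc}
S & \longmapsto_c^{(n)} & S' \\
\updownarrow\ {\scriptstyle \mathcal{M}} & & \updownarrow\ {\scriptstyle \mathcal{M}} \\
\mathcal{M}(S) & \longmapsto_a^{(n)} & \mathcal{M}(S')
\end{array}
\]

Proof: The inverse CPS transformation $\mathcal{U}$ is bijective [17]. Hence by structural induction, the functions $\mathcal{M}$, $\mathcal{R}$, $\mathcal{V}$ and $\mathcal{K}$ are also bijective. The proof proceeds by case analysis on the transition rules.

Intuitively, the evaluation of a CPS term $P$ on the $C_{cps}EK$ machine proceeds in the same fashion as the evaluation of $\mathcal{U}[\![P]\!]$ on the $C_aEK$ machine. Together with the machine equivalence theorem, this implies that both machines perform the same sequence of abstract operations, and hence compilers based on these abstract machines can produce identical code for the same input. The A-normal form compiler achieves its goal in fewer passes.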
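The claim that one source-level pass reaches the same intermediate code as the CPS route can be checked on the running example. The Python sketch below pairs a continuation-based A-normalizer with a rendering of the inverse transformation of Figure 6. It is written for illustration and is not a transcription of the paper's Scheme code. The tuple encodings, the tags `"ret"`, `"call"`, and `"cont"` for CPS terms, and the function names are assumptions. As in the algorithm the appendix describes, a conditional in non-tail position is named rather than having its context duplicated.

```python
import itertools

# Sketch of A-normalization plus the inverse CPS transformation U (Figure 6).
# Encodings and names are assumptions made for this example.

def is_value(m):
    return isinstance(m, (int, str)) or m[0] == "lam"

def a_normalize(m):
    counter = itertools.count(1)

    def norm_term(m):
        return norm(m, lambda n: n)

    def norm(m, k):
        # k is a Python function playing the role of the evaluation context.
        if is_value(m):
            if isinstance(m, tuple):
                return k(("lam", m[1], norm_term(m[2])))
            return k(m)
        if m[0] == "let":
            _, x, m1, m2 = m
            return norm(m1, lambda n1: ("let", x, n1, norm(m2, k)))
        if m[0] == "if0":
            _, m1, m2, m3 = m
            return norm_name(m1, lambda t: k(("if0", t, norm_term(m2), norm_term(m3))))
        if m[0] == "prim":
            return norm_names(m[2:], lambda ts: k(("prim", m[1]) + ts))
        return norm_names(m[1:], lambda ts: k(("app",) + ts))

    def norm_name(m, k):
        # Normalize m, then bind its result to a fresh name unless it is a value.
        def name(n):
            if is_value(n):
                return k(n)
            t = f"t{next(counter)}"
            return ("let", t, n, k(t))
        return norm(m, name)

    def norm_names(ms, k):
        if not ms:
            return k(())
        return norm_name(ms[0], lambda t: norm_names(ms[1:], lambda ts: k((t,) + ts)))

    return norm_term(m)

def psi(w):
    if isinstance(w, tuple):  # ("lam", (k, x1, ..., xn), P): drop the parameter k
        return ("lam", w[1][1:], un_cps(w[2]))
    return w

def un_cps(p):
    # CPS terms: ("ret", k, W), ("let", x, W, P), ("if0", W, P1, P2),
    # ("call", W, K, W1, ...), ("prim", "O'", K, W1, ...), where K is either a
    # continuation variable (str) or a continuation ("cont", x, P).
    if p[0] == "ret":
        return psi(p[2])
    if p[0] == "let":
        return ("let", p[1], psi(p[2]), un_cps(p[3]))
    if p[0] == "if0":
        return ("if0", psi(p[1]), un_cps(p[2]), un_cps(p[3]))
    head, cont, args = p[1], p[2], tuple(psi(w) for w in p[3:])
    call = ("prim", head.rstrip("'")) + args if p[0] == "prim" else ("app", psi(head)) + args
    if isinstance(cont, str):      # tail call or tail prim-op
        return call
    _, x, body = cont              # the continuation becomes the enclosing let
    return ("let", x, call, un_cps(body))

N = ("prim", "+", ("prim", "+", 2, 2), ("let", "x", 1, ("app", "f", "x")))
cps_N = ("prim", "+'",
         ("cont", "t1", ("let", "x", 1,
                         ("call", "f", ("cont", "t2", ("prim", "+'", "k", "t1", "t2")), "x"))),
         2, 2)
print(a_normalize(N) == un_cps(cps_N))  # True
```

Here `cps_N` is the term cps(N) of Section 3 written out by hand, so the comparison exercises the commuting square only on this one example.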

6 A-Normal Forms as an Intermediate Language

Our analysis suggests that the language of A-normal forms is a good intermediate representation for compilers. Indeed, most direct compilers use transformations similar to the A-reductions on an ad hoc and incomplete basis. It is therefore natural to modify such compilers to perform a complete A-normalization phase, and analyze the effects. We have conducted such an experiment with the non-optimizing, direct compiler CAML Light [15]. This compiler translates ML programs into bytecode via a $\lambda$-calculus based intermediate language, and then interprets this bytecode. By performing A-normalization on the intermediate language and rewriting the interpreter as a $C_aEK$ machine, we achieved speedups of between 50% and 100% for each of a dozen small benchmarks. Naturally, we expect the speedups to be smaller when modifying an optimizing compiler.

A major advantage of using a CPS-based intermediate representation is that many optimizations can be expressed as sequences of $\beta$ and $\eta$ reductions. For example, CPS compilers can transform the non-tail call $(W\ (\lambda x.kx)\ W_1 \ldots W_n)$ to the tail-recursive call $(W\ k\ W_1 \ldots W_n)$ using an $\eta$-reduction on the continuation [2]. An identical transformation [17] on the language of A-normal forms is the reduction $\beta_{id}$:

\[
(\mathbf{let}\ (x\ (V\ V_1 \ldots V_n))\ x) \longrightarrow (V\ V_1 \ldots V_n)
\]

where $V, V_1, \ldots, V_n$ are the A-normal forms corresponding to $W, W_1, \ldots, W_n$ respectively. Every other optimization on CPS terms that corresponds to a sequence of $\beta\eta$-reductions is also expressible on A-normal form terms [17].

A Linear A-Normalization

The linear A-normalization algorithm in Figure 9 is written in Scheme extended with a special form match, which performs pattern matching on the syntax of program terms. It employs a programming technique for CPS algorithms pioneered by Danvy and Filinski [9]. To prevent possible exponential growth in code size, the algorithm avoids duplicating the evaluation context enclosing a conditional expression. We assume the front-end uniquely renames all variables, which implies that the condition $x \notin FV(\mathcal{E})$ of the reduction $A_1$ holds.

Acknowledgments  We thank Olivier Danvy, Preston Briggs, and Keith Cooper for comments on an early version of the paper.

References

[1] Aho, A., Sethi, R., and Ullman, J. Compilers: Principles, Techniques, and Tools. Addison-Wesley, Reading, Mass., 1985.

[2] Appel, A. Compiling with Continuations. Cambridge University Press, 1992.

[3] Barendregt, H. The Lambda Calculus: Its Syntax and Semantics, revised ed. Studies in Logic and the Foundations of Mathematics 103. North-Holland, 1984.

[4] Boehm, H.-J., and Demers, A. Implementing Russel. In Proceedings of the ACM SIGPLAN 1986 Symposium on Compiler Construction (1986), vol. 21(7), Sigplan Notices, pp. 186-195.
The A-reductions also expose optimization opportu- 5] Bondorf, A. Improving binding times without
nities by merging code segments across block declara- explicit CPS-conversion. In Proceedings of the 1992
tions and conditionals. In particular, partial evaluators ACM Conference on Lisp and Functional Program-
rely on the A-reductions to improve their specializa- ming (1992), pp. 1{10.
tion phase 5]. For example, the addition operation and 6] Clinger, W. The Scheme 311 compiler: An ex-
the constant 0 are apparently unrelated in the following ercise in denotational semantics. In Proceedings of
term: the 1984 ACM Conference on Lisp and Functional
Programming (1984), pp. 356{364.
(add1 (let (x (f 5)) 0))
7] Danvy, O. Back to direct style. In Proceedings
The A-normalization phase produces: of the 4th European Symposium on Programming
(Rennes, 1992), Lecture Notes in Computer Sci-
(let (x (f 5)) (add1 0)), ence, 582, Springer Verlag, pp. 130{150.
8] Danvy, O. Three steps for the CPS transforma-
which specializes to (let (x (f 5)) 1). tion. Tech. Rep. CIS-92-2, Kansas State University,
In summary, compilation with A-normal forms char- 1992.
acterizes the critical aspects of the CPS transformation
relevant to compilation. Moreover, it formulates these 9] Danvy, O., and Filinski, A. Representing con-
aspects in a way that direct compilers can easily use. trol: A study of the CPS transformation. Mathe-
Thus, our result should lead to improvements for both matical Structures in Computer Science, 4 (1992),
traditional compilation strategies. 361{391.

; Entry point: normalize M with the identity continuation.
(define normalize-term (lambda (M) (normalize M (lambda (x) x))))

; normalize: M is the term, k builds the rest of the result around it.
(define normalize
  (lambda (M k)
    (match M
      [`(lambda ,params ,body)
       (k `(lambda ,params ,(normalize-term body)))]
      [`(let (,x ,M1) ,M2)
       (normalize M1 (lambda (N1) `(let (,x ,N1) ,(normalize M2 k))))]
      [`(if0 ,M1 ,M2 ,M3)
       (normalize-name M1 (lambda (t) (k `(if0 ,t ,(normalize-term M2) ,(normalize-term M3)))))]
      [`(,Fn . ,M*)
       (if (PrimOp? Fn)
           (normalize-name* M* (lambda (t*) (k `(,Fn . ,t*))))
           (normalize-name Fn (lambda (t) (normalize-name* M* (lambda (t*) (k `(,t . ,t*)))))))]
      [V (k V)])))

; normalize-name: like normalize, but binds a non-value result to a fresh variable.
(define normalize-name
  (lambda (M k)
    (normalize M (lambda (N) (if (Value? N) (k N) (let ([t (newvar)]) `(let (,t ,N) ,(k t))))))))

; normalize-name*: normalize-name over a list of terms, left to right.
(define normalize-name*
  (lambda (M* k)
    (if (null? M*)
        (k '())
        (normalize-name (car M*) (lambda (t) (normalize-name* (cdr M*) (lambda (t*) (k `(,t . ,t*)))))))))

Figure 9: A linear-time A-normalization algorithm
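
For readers without a Scheme system that provides match, the algorithm of Figure 9 can be approximated in Python. The block below is only a sketch under stated assumptions, not the authors' implementation: terms are encoded as nested Python lists (strings for variables and operators, integers for constants), the set PRIM_OPS and the fresh-name scheme t1, t2, ... are hypothetical choices, and the keywords lambda, let and if0 are assumed never to be used as variable names.

```python
import itertools

# Assumption: an illustrative set of primitive operators (not from the paper).
PRIM_OPS = {"add1", "sub1", "+", "-", "*"}

_fresh = itertools.count(1)


def newvar():
    """Return a fresh variable name; the t1, t2, ... scheme is hypothetical."""
    return f"t{next(_fresh)}"


def is_value(term):
    """Constants, variables and lambda abstractions count as values."""
    return not isinstance(term, list) or (len(term) > 0 and term[0] == "lambda")


def normalize_term(term):
    """Normalize a whole term, using the identity continuation."""
    return normalize(term, lambda x: x)


def normalize(term, k):
    """Normalize term; k receives the normalized term and builds the rest."""
    if not isinstance(term, list) or not term:
        return k(term)                       # constant or variable
    head = term[0]
    if head == "lambda":                     # (lambda params body)
        _, params, body = term
        return k(["lambda", params, normalize_term(body)])
    if head == "let":                        # (let (x M1) M2)
        _, (x, m1), m2 = term
        return normalize(m1, lambda n1: ["let", [x, n1], normalize(m2, k)])
    if head == "if0":                        # (if0 M1 M2 M3)
        _, m1, m2, m3 = term
        # The branches are normalized separately, so the context k
        # is applied once and never copied into both branches.
        return normalize_name(
            m1, lambda t: k(["if0", t, normalize_term(m2), normalize_term(m3)]))
    fn, args = term[0], term[1:]             # application (Fn . M*)
    if isinstance(fn, str) and fn in PRIM_OPS:
        return normalize_names(args, lambda ts: k([fn] + ts))
    return normalize_name(
        fn, lambda t: normalize_names(args, lambda ts: k([t] + ts)))


def normalize_name(term, k):
    """Normalize term and pass k a value, let-binding non-values to a fresh name."""
    def bind(n):
        if is_value(n):
            return k(n)
        t = newvar()
        return ["let", [t, n], k(t)]
    return normalize(term, bind)


def normalize_names(terms, k):
    """Apply normalize_name to each term, left to right."""
    if not terms:
        return k([])
    return normalize_name(
        terms[0],
        lambda t: normalize_names(terms[1:], lambda ts: k([t] + ts)))


# The example from Section 6: (add1 (let (x (f 5)) 0))
example = ["add1", ["let", ["x", ["f", 5]], 0]]
print(normalize_term(example))
```

On the Section 6 example the sketch lifts the let out of the argument position, giving the list encoding of (let (x (f 5)) (add1 0)). Each Python closure plays the role of the corresponding Scheme continuation, so the structure follows Figure 9 clause by clause.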

[10] Felleisen, M., and Friedman, D. Control operators, the SECD-machine, and the $\lambda$-calculus. In Formal Description of Programming Concepts III (Amsterdam, 1986), M. Wirsing, Ed., Elsevier Science Publishers B.V. (North-Holland), pp. 193–217.

[11] Fessenden, C., Clinger, W., Friedman, D. P., and Haynes, C. T. Scheme 311 version 4 reference manual. Computer Science Technical Report 137, Indiana University, Bloomington, Indiana, Feb. 1983.

[12] Fischer, M. Lambda calculus schemata. In Proceedings of the ACM Conference on Proving Assertions About Programs (1972), vol. 7(1), Sigplan Notices, pp. 104–109.

[13] Kelsey, R., and Hudak, P. Realistic compilation by program transformation. In Conference Record of the 16th Annual ACM Symposium on Principles of Programming Languages (Austin, TX, Jan. 1989), pp. 281–292.

[14] Kranz, D., Kelsey, R., Rees, J., Hudak, P., Philbin, J., and Adams, N. Orbit: An optimizing compiler for Scheme. In Proceedings of the ACM SIGPLAN 1986 Symposium on Compiler Construction (1986), vol. 21(7), Sigplan Notices, pp. 219–233.

[15] Leroy, X. The Zinc experiment: An economical implementation of the ML language. Tech. Rep. 117, INRIA, 1990.

[16] Plotkin, G. Call-by-name, call-by-value, and the $\lambda$-calculus. Theoretical Computer Science 1 (1975), 125–159.

[17] Sabry, A., and Felleisen, M. Reasoning about programs in continuation-passing style. In Proceedings of the 1992 ACM Conference on Lisp and Functional Programming (1992), pp. 288–298. Technical Report 92-180, Rice University.

[18] Shivers, O. Control-Flow Analysis of Higher-Order Languages or Taming Lambda. PhD thesis, Carnegie-Mellon University, 1991.

[19] Steele, G. L. RABBIT: A compiler for Scheme. MIT AI Memo 474, Massachusetts Institute of Technology, Cambridge, Mass., May 1978.

[20] Wand, M. Correctness of procedure representations in higher-order assembly language. In Proceedings of the 1991 Conference on the Mathematical Foundations of Programming Semantics (1992), S. Brookes, Ed., vol. 598 of Lecture Notes in Computer Science, Springer Verlag, pp. 294–311.

[21] Weise, D. Advanced compiling techniques. Course Notes at Stanford University, 1990.
