An introduction to Core Erlang

Richard Carlsson
Computing Science Department
Uppsala University
Box 311, 751 05 Uppsala, Sweden


Abstract

Core Erlang is a new, official core language for the concurrent functional language Erlang. Core Erlang is functional, strict, dynamically typed, pattern matching, higher order and concurrent. It has been developed in cooperation between the High-Performance Erlang project at Uppsala University and the Erlang/Open Telecom Platform development team at Ericsson, and is a central component in the new release of the Erlang/OTP compiler. This paper describes the core language, its relation to Erlang, and the history of its development. We show examples of translation from Erlang to Core Erlang, illustrating how analysis and transformation of Erlang programs are greatly simplified by being performed at the core language level.

1.1 The Erlang language

Erlang [1, 3] is a concurrent functional language developed by Ericsson, which is today being used in several large real-world applications, with a code base of millions of lines of code. The main design goals for Erlang have been increased productivity, error avoidance, concurrency and robustness; for these reasons, it is largely side effect free, the main source of side effects being the sending of messages. Erlang processes are very lightweight and are implemented at the abstract machine level, rather than as operating system threads, making it possible in practice to run thousands of processes on a single machine. Communication between processes is handled by asynchronous message passing, and is transparent over network connections. Attempting to receive a message from the process mailbox will suspend the process until a matching message has arrived, or some chosen timeout limit is exceeded.

The code executed by a process is a functional program; when the program terminates, so does the process. The value computed by such a program is always discarded upon termination; thus, unless the program sends a message to another process (or otherwise causes a side effect), its execution will not be visible to the rest of the system. Many programs are therefore written to run in a recursive loop, acting on incoming messages by performing some computation and sending off new messages, in a server-like manner. Since the number of iterations is unbounded, Erlang implementations must perform tail call optimisation, allowing a process to run forever in constant space.

The Erlang language is higher-order, strict, and dynamically typed. The primary basic types are atoms (names), arbitrarily-sized integers, floating-point numbers, the empty list (written [ ]), and process identifiers. The only data constructors are n-tuples, written {t1, ..., tn}, and cons cells, written [t1 | t2]. Strings are represented by lists of integers (character codes). A very common idiom in Erlang is to place an atom in the first element of a tuple, as a tag indicating its contents: for example, the nodes of a syntax tree could be represented by {constant, ...}, {lambda, ...}, {apply, ...}, etc.; a process could in a certain state be interested only in messages of the form {reply, Sender, Value}, and so on.

Erlang has for historical reasons picked up a few syntactical particularities from Prolog, the most prominent being that identifiers are variables if they begin with an uppercase letter, and are otherwise atoms. Atoms that do not begin with a lowercase letter, or contain strange characters, must be written within single quotes.

An Erlang program consists of a set of modules, where each module defines a number of functions. A module is uniquely identified by its name as an atom. Within a module, each function is identified by its name (also an atom) and its arity. It is thus possible (and often done) to have several function definitions with the same name but with different arities. For each module, only functions that are explicitly declared as exported will be accessible from other modules.
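The server-loop idiom described above can be sketched as a small Erlang module. This is only an illustration, not code from the paper: the module name counter, its functions, and the message formats increment, {get, Sender} and stop are all hypothetical. The reply uses the tagged-tuple form {reply, Sender, Value} mentioned in the text.

```erlang
-module(counter).
-export([start/0, increment/1, value/1]).

%% Spawn a lightweight process that runs the server loop with state 0.
start() ->
    spawn(fun() -> loop(0) end).

%% Asynchronous send: returns without waiting for the server.
increment(Pid) ->
    Pid ! increment,
    ok.

%% Request/reply: the reply is a tagged tuple carrying the server's pid,
%% so only a reply from this particular server is accepted.
value(Pid) ->
    Pid ! {get, self()},
    receive
        {reply, Pid, Value} -> Value
    after 1000 ->
        timeout
    end.

%% The server loop. Every recursive call is a tail call, so the
%% process can run indefinitely in constant space.
loop(N) ->
    receive
        increment ->
            loop(N + 1);
        {get, Sender} ->
            Sender ! {reply, self(), N},
            loop(N);
        stop ->
            ok
    end.
```

Nothing the loop computes is visible from outside except through the messages it sends. Once the stop message is handled, the function returns and the process terminates.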
The following is an example of a simple Erlang module:

    -module(list).
    -export([length/1, reverse/1]).

    length([X | Xs]) -> 1 + length(Xs);
    length([]) -> 0.

    reverse(Xs) -> reverse(Xs, []).

    reverse([X | Xs], Ys) ->
        reverse(Xs, [X | Ys]);
    reverse([], Ys) -> Ys.

(Note that the function whose name is reverse/2 is not exported.) Calls may be qualified with the name of the module, using the syntax Module:Function(...). An external call to the length function above could look like e.g. list:length(L).

An important feature of Erlang is dynamic code replacement: any module can be replaced at run-time with a new version. Whenever a module-qualified function call is made, the latest loaded version of the target module is used; this applies even if the target is the same as the caller. Thus, if the recursive loop of a server process always does a module-qualified tail call on each iteration, it will start executing the new code as soon as it loops back to handle the next request. Erlang implementations must allow for at least two versions of a module to be stored in the system simultaneously: the current and the previous. If later the same module is replaced again, any process that is still running old code may be automatically found and killed, so that that code can be safely purged from the system.

1.2 Core Erlang history

Being a functional language semantically not far removed from Scheme [10], Erlang should be well suited for the application of many interesting analyses and performance improving transformations, from the well-known to the experimental. However, this has not yet been very successfully done. There are several reasons for this, all following from the fact that the development of Erlang was always pragmatically oriented, rather than theoretically (or even very systematically).¹

¹ The positive side of that coin is however that the language is in fact very useful for writing heavy-duty real-world programs.

The first reason is that as Erlang has evolved, its syntax has become rather complex. It is still conceptually a small language, which makes it easy to learn, but the same function can often be expressed syntactically in many different ways. Thus, a program operating on source code must handle so many cases as to become impractical in general.

The second reason is that the original abstract machine used for Erlang, the JAM [2] stack machine, had an instruction set which was similar enough to the actual source-level language that no intermediate representation was ever used in the compiler. When the new abstract machine, the BEAM [7] (a WAM-like register machine), was introduced, the source language was already fixed, and even though intermediate representations became necessary for the translation, these went as straight as possible to the abstract machine level, in order to get correct, working code with the least amount of trouble. However, both JAM and BEAM code are imperative in nature, and most optimisations from the functional programming world such as algebraic transformations, advanced inlining techniques, and specialisation, are not suitable or even possible to apply at that level. Standard local optimisations are of course done on the BEAM code, but although important, these typically have too limited scope for any drastic improvements.

The third reason is that all attempts to simply reduce the language to a proper subset ended up in convoluted discussions about whether or not expression A really could be rewritten as expression B, since except in a few cases, neither was originally defined in terms of the other. Then, even when semantic equivalence could be determined, it was generally the case that the rewritten, expanded expression would result in much less efficient code when passed through the compiler, because the compiler did in fact need to identify each source code construct directly in order to generate reasonably good output.

In 1998, discussions were started between the High-Performance Erlang (HiPE) group at Uppsala University Computing Science Department and the OTP development team at Ericsson about a standard core language for Erlang. Several different prototype representations were quickly put together, but there was no agreement on the details.

In 1999, work began on creating a specification, initially by attempting to unify the existing suggestions, then by refinement and closer study of the many fine points of the semantics of Erlang. Lots of changes were necessary before all involved were able to agree on a common core language, and it was not until late 2000 that version 1.0 of the Core Erlang specification [4] was finally available as a technical report.

As of release 8 of the OTP/Open Source Erlang distribution, scheduled for October 2001, the compiler now uses Core Erlang as an intermediate representation, is able to read and write Core Erlang code in text form, and has hooks for adding user-defined transformations at the Core Erlang level.

1.3 Core Erlang design goals

The following were our main design goals for the Core Erlang language:

- It should be a strict, higher-order functional language, with a clear and simple semantics.

- It should be as regular as possible, to facilitate the development of code-walking tools.


- Translation from Erlang programs to equivalent Core Erlang programs should be straightforward, and equally important, it should be straightforward to translate from Core Erlang to the intermediate code used in Erlang implementations.

- There should be a well-defined textual representation of Core Erlang, with a simple, preferably unambiguous grammar, making it easy to construct tools for printing and reading programs. This representation should be possible to use for communication between tools using different internal representations of Core Erlang.

- The textual representation should be easy for humans to read and edit, in order to simplify debugging.

- The language should be easy to work with, supporting most kinds of familiar optimisations at the functional programming level.

Of these, the last was by far the most difficult to achieve, trying to accommodate the current or anticipated needs of everyone involved. In the end, we chose not to enforce any specific convention such as continuation-passing style or A-normal form; it is however an easy task to normalize any Core Erlang program according to preference.

2.1 Modules

As in Erlang, Core Erlang functions must reside within a module. A module declaration has the following form:

    module Atom [ fname_i1, ..., fname_ik ]
      attributes [ Atom_1 = const_1, ..., Atom_m = const_m ]
      fname_1 = fun_1
      ...
      fname_n = fun_n end

where each fname is a function variable with a special syntax:

    Atom / Integer

Function variables were not strictly necessary for the language, but make it easier to see the connection to the original program, and keep the export mechanism simple, since the external names are the same as the internal. The exported functions are listed after the module name, and must be a subset of those defined in the top level of the module.

Each module has a possibly empty set of attributes; the keys, which must be unique, are atoms and the corresponding values are any constant terms. The meaning of module attributes is implementation-dependent; Erlang uses them for things such as version information. Constant terms are:

    const ::= lit | [ const_1 | const_2 ]
            | { const_1, ..., const_n }
    lit   ::= Integer | Float | Atom
            | Char | String | [ ]

In Core Erlang, all atoms must be written within single quotes, to avoid any confusion with keywords.

2.1.1 Functions, lambdas and results of expressions

Lambda abstractions are for apparent reasons known as funs in the Erlang parlance. They were added to the language relatively late,² and play a much more central part in Core Erlang. The syntax is:

    fun (var_1, ..., var_n) -> exprs

where no variable may occur twice in var_1, ..., var_n.

² Erlang was originally made a first-order language, because it was thought that lambda abstractions would make programs too complicated, and could generally be done without.

To simplify generation of efficient code, and to allow certain optimisations to be performed at the core language level, a Core Erlang expression always produces a sequence of values, which could be empty. Value-sequences are written within angular brackets (less than/greater than); if the length is exactly 1, the brackets may be left out. Each component expression expr_1, ..., expr_n must produce exactly one value:

    < expr_1, ..., expr_n >

The use of an expression must match the number of values it produces, which should be well-defined; if the expression is an argument to a constructor or function call, this number should always be 1. However, it is not possible in general to determine at compile time that a given Core Erlang program is correct in this respect.

2.2 Expressions

Basic expressions in Core Erlang are variables (ordinary variables and function names), atomic literals, lambda expressions (funs), cons cell constructors, and n-tuple constructors; all obviously producing exactly one value. In addition there are let-expressions, which bind n variables simultaneously to the values produced by the right-hand side, and the similar sequencing do-expressions, which bind no variables. Furthermore, letrec-expressions allow local (recursive) function definitions, which Erlang itself does not have, but which are often useful in transformations.

    expr ::= var | fname | lit | fun
           | [ exprs_1 | exprs_2 ]
           | { exprs_1, ..., exprs_n }
           | let vars = exprs_1 in exprs_2
           | do exprs_1 exprs_2
           | letrec fname_1 = fun_1 ... fname_n = fun_n in exprs
           | apply exprs_0 (exprs_1, ..., exprs_n)
           | call exprs_n+1:exprs_n+2 (exprs_1, ..., exprs_n)
           | primop Atom (exprs_1, ..., exprs_n)
           | try exprs_1 catch (var_1, var_2) -> exprs_2
           | case exprs of clause_1 ... clause_n end
           | receive clause_1 ... clause_n
             after exprs_1 -> exprs_2
    vars ::= var | < var_1, ..., var_n >

Erlang has rather unusual scoping rules, making code traversal difficult, but in Core Erlang, scopes are nested as in ordinary lambda calculus.
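To illustrate why lambda-calculus scoping keeps code-walking tools simple, here is a minimal sketch of a free-variable analysis in Erlang over a small subset of the expression forms. The tagged-tuple term representation and the module name core_fv are assumptions made for this example; they are not the representation used by the Erlang/OTP compiler. Each binding construct needs only one rule: remove the bound names from the free variables of its body.

```erlang
-module(core_fv).
-export([free_vars/1]).

%% Hypothetical term representation, for illustration only:
%%   {var, Name} | {lit, Term} | {cons, Head, Tail} | {tuple, [Expr]}
%%   {'fun', [Name], Body} | {'let', [Name], Arg, Body}
%%   {apply, Op, [Arg]}

%% Returns the free variables of an expression as an ordered set.
free_vars({var, Name}) ->
    ordsets:from_list([Name]);
free_vars({lit, _}) ->
    ordsets:new();
free_vars({cons, Head, Tail}) ->
    ordsets:union(free_vars(Head), free_vars(Tail));
free_vars({tuple, Exprs}) ->
    free_vars_list(Exprs);
free_vars({'fun', Vars, Body}) ->
    %% The parameters are bound in the body only.
    ordsets:subtract(free_vars(Body), ordsets:from_list(Vars));
free_vars({'let', Vars, Arg, Body}) ->
    %% The let-bound variables scope over Body, but not over Arg.
    Bound = ordsets:from_list(Vars),
    ordsets:union(free_vars(Arg),
                  ordsets:subtract(free_vars(Body), Bound));
free_vars({apply, Op, Args}) ->
    ordsets:union(free_vars(Op), free_vars_list(Args)).

%% Union of the free variables of a list of expressions.
free_vars_list(Exprs) ->
    ordsets:union([free_vars(E) || E <- Exprs]).
```

For instance, the term for "let X = Y in {X, Z}" has the free variables Y and Z. No output environment has to be threaded through the traversal, which is what Erlang's own scoping rules would require.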


2.2.1 Function calls

Core Erlang function calls come in three flavours: applications, remote calls, and primitive operations. The first is simply the application of a functional value (such as a locally defined function) to a list of arguments.

    apply exprs_0 (exprs_1, ..., exprs_n)
    call exprs_n+1:exprs_n+2 (exprs_1, ..., exprs_n)
    primop Atom (exprs_1, ..., exprs_n)

The remote call (module-qualified call) has two subexpressions apart from the argument list: these must evaluate to atoms, giving the module and function name, respectively, of the remote function to be called. Usually, both are statically known at compile time, but they may be dynamically computed. For a remote call, the latest loaded version of the named module is always used; this allows migration from old code to new, as previously described.

Primitive operations are included to make a watertight distinction between operations that are internal to the compiler, and normal functions that are exported by some module. The name of the called operation must be a constant. E.g., it could be safe for the compiler in certain situations to introduce an operation at the Core Erlang level that destructively updates a part of a data structure, but programmers must be prevented from accessing such operations directly from Erlang, since unrestricted use could violate the language semantics.

2.2.2 Error handling

Core Erlang does not specify how exceptions are generated; only how they may be caught. An exception has two components: the tag and the value. These may be any values, but the tag is expected to signify the class of the exception; in Erlang, there are currently two possible tags, both atoms.

    try exprs_1 catch (var_1, var_2) -> exprs_2

In a try expression, if evaluation of the attempted expression should fail, raising an exception, then this will be caught and instead the alternative body of the try-expression will be evaluated with var_1 and var_2 bound to the tag and value of the exception.

2.2.3 Pattern matching

As in Erlang, the core language uses pattern matching for decomposition and branching, but only in two constructs (compared to six in Erlang), of which the first is the case switch. The selector expression produces a fixed number of values, and each clause, of the form "patterns when guard -> body", must contain the corresponding number of patterns. Variables may not be repeated in the patterns of a clause, and are always binding occurrences, whose scope is the guard and body of the clause.

    case exprs of clause_1 ... clause_n end

    clause ::= pats when exprs_1 -> exprs_2
    pats   ::= pat | < pat_1, ..., pat_n >
    pat    ::= var | lit | [ pat_1 | pat_2 ]
             | { pat_1, ..., pat_n }
             | var = pat

Patterns include the form var = pat, which is analogous to "as" patterns in ML [9]. Clauses are tried in order, until the first with a matching pattern is found whose guard evaluates to 'true'. If no clause should match, the behaviour is undefined.

A clause guard is a limited form of expression: it must produce a single boolean value (either of the atoms 'true' and 'false') and must not have side effects. Currently, only variables, atomic literals, constructors, let-, do- and try-expressions, and calls to statically named functions are allowed in guards. Called operations and functions must be guaranteed to exist and be free from side effects: typical examples are type tests, comparison operators and selectors.

Error handling is also restricted in guards. A try in a guard must have the following form:

    try exprs_1 catch (var_1, var_2) -> 'false'

In other words, if the evaluation of the tried expression should fail, the exception is caught and the value 'false' substituted for the result. This is used for supporting the quite unorthodox way Erlang treats exceptions in guard expressions. It is possible that in future versions of Core Erlang fully general expressions will be allowed in guards.

2.2.4 Receive expressions

Last, we describe the rather intricate asynchronous receive expression. Its syntax is very similar to a case switch, but it has no explicit selector, and adds a final clause for timeout:

    receive clause_1 ... clause_n
    after exprs_1 -> exprs_2

The receive expression implicitly contains a loop, which is the main reason why it cannot be reduced to simpler components in a useful way. The executing process will traverse its mailbox queue, always from the beginning, and either select the first message that matches one of the clauses, removing the message from the mailbox and evaluating the clause, or otherwise suspend until a matching message arrives or the timeout limit is exceeded. Time is given as an integer number of milliseconds, zero meaning immediate timeout if no message matches. The atom 'infinity' is used to signal that the timeout should never happen.

It is obvious that attempting to break down this construct will always expose the different possible states during traversal and matching, which could jeopardise the integrity of the mailbox. What is perhaps even more important, decomposition would make it much more difficult to use the core language for process communication analysis. Therefore, the receive is carried over practically as it is from Erlang to Core Erlang.
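The mailbox traversal described above can be observed from ordinary Erlang code, since the receive construct is carried over essentially unchanged. The following is a minimal sketch; the module name mailbox_demo and the message tags wanted and other are hypothetical. It assumes the calling process has no earlier messages with these tags in its mailbox.

```erlang
-module(mailbox_demo).
-export([run/0]).

run() ->
    Self = self(),
    %% Queue three messages in our own mailbox, in this order.
    Self ! {other, 1},
    Self ! {wanted, 2},
    Self ! {wanted, 3},
    %% Each receive scans the mailbox from the beginning: {other, 1} is
    %% skipped and left in place, and the first matching message is removed.
    First = receive {wanted, A} -> A after 0 -> none end,
    Second = receive {wanted, B} -> B after 0 -> none end,
    %% No matching message remains, so the zero timeout fires at once.
    Third = receive {wanted, C} -> C after 0 -> none end,
    %% The message that was skipped earlier is still in the mailbox.
    Rest = receive {other, D} -> D after 0 -> none end,
    {First, Second, Third, Rest}.
```

Called from a process with an otherwise empty mailbox, run/0 returns {2, 3, none, 1}. Replacing the timeout 0 with the atom infinity in the third receive would instead suspend the process until a matching message arrived.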

2.3 Syntax summary

Figure 1 shows a summary of the language syntax.

    module ::= module Atom [ fname_i1, ..., fname_ik ]
                 attributes [ Atom_1 = const_1, ..., Atom_m = const_m ]
                 fname_1 = fun_1 ... fname_n = fun_n end
    fname  ::= Atom / Integer
    const  ::= lit | [ const_1 | const_2 ] | { const_1, ..., const_n }
    lit    ::= Integer | Float | Atom
             | Char | String | [ ]
    fun    ::= fun (var_1, ..., var_n) -> exprs
    exprs  ::= expr | < expr_1, ..., expr_n >
    expr   ::= var | fname | lit | fun
             | [ exprs_1 | exprs_2 ] | { exprs_1, ..., exprs_n }
             | let vars = exprs_1 in exprs_2
             | do exprs_1 exprs_2
             | letrec fname_1 = fun_1 ... fname_n = fun_n in exprs
             | apply exprs_0 (exprs_1, ..., exprs_n)
             | call exprs_n+1:exprs_n+2 (exprs_1, ..., exprs_n)
             | primop Atom (exprs_1, ..., exprs_n)
             | try exprs_1 catch (var_1, var_2) -> exprs_2
             | case exprs of clause_1 ... clause_n end
             | receive clause_1 ... clause_n after exprs_1 -> exprs_2
    vars   ::= var | < var_1, ..., var_n >
    clause ::= pats when exprs_1 -> exprs_2
    pats   ::= pat | < pat_1, ..., pat_n >
    pat    ::= var | lit | [ pat_1 | pat_2 ] | { pat_1, ..., pat_n }
             | var = pat

Figure 1: Syntax summary

As mentioned, the scoping rules of Erlang are rather unusual; each expression has an input environment and an output environment. It is e.g. possible to write:

    f(X) ->
        case X of
            {foo, A} -> B = g(A);
            {bar, A} -> B = h(A)
        end,
        {A, B}.

where the values of A and B in the resulting tuple depend on the selected branch. Maintaining input and output environments while traversing Erlang code is cumbersome and error prone. Translating the above into Core Erlang might produce the following code:

    'f'/1 = fun (X) ->
      let <X1, X2> =
        case X of
          {'foo', A} when 'true' ->
            let B = apply 'g'/1(A)
            in <A, B>
          {'bar', A} when 'true' ->
            let B = apply 'h'/1(A)
            in <A, B>
        end
      in {X1, X2}

which is much easier to traverse, manipulate and reason about, following ordinary lambda calculus scoping rules. This example also illustrates one of the main reasons why Core Erlang expressions produce multiple values.

The Erlang module shown in Section 1.1 can be automatically translated to the following Core Erlang code, using the fact that in Erlang, infix operators such as + belong to the standard library module named erlang.

    module 'list' ['length'/1, 'reverse'/1]
      attributes []
    'length'/1 =
      fun (X1) ->
        case X1 of
          [X | Xs] when 'true' ->
            let X2 = apply 'length'/1(Xs)
            in call 'erlang':'+'(1, X2)
          [] when 'true' -> 0
          X2 when 'true' ->
            call 'erlang':'exit'('no_clause')
        end
    'reverse'/1 =
      fun (X1) -> apply 'reverse'/2(X1, [])
    'reverse'/2 =
      fun (X1, X2) ->
        case <X1, X2> of
          <[X | Xs], Ys> when 'true' ->
            let X3 = [X | Ys]
            in apply 'reverse'/2(Xs, X3)
          <[], Ys> when 'true' -> Ys
          <X3, X4> when 'true' ->
            call 'erlang':'exit'('no_clause')
        end
    end

We see that all pattern matching is moved to case expressions, and that some of the run-time error checking done by Erlang is made explicit by adding default clauses; in the reverse/1 function, this has been removed by eliminating unreachable code. The evaluation order of Erlang expressions is enforced by introducing let bindings during translation. (In the above example this was not strictly necessary.)

Translating operators like + to qualified calls to the module erlang is done in order to have a normalised representation of standard operators, making it possible to write program transformations that are not implementation dependent. Instead of presenting a list of official Core Erlang primops, we use the fact that these functions already have definitions in the Erlang standard library. There should be no run-time penalty induced by this, however, because the compiler is allowed to recognize calls to such built-in functions (BIFs) and generate more efficient code for these, using the assumption that they will not be redefined. In other words, a later transformation stage could rewrite such calls to primop applications, making the program representation dependent on the particular compiler implementation, as a prelude to low-level code generation.

Dam and Fredlund [5] and Huch [8] have used core fragments or subsets of Erlang for purposes of program analysis (also and independently using the name Core Erlang). These subsets are however incomplete with respect to representing all possible Erlang programs, and are not entirely suitable as an intermediate code representation in the compilation process.

Feeley and Larose [6] (the ETOS project) compile Erlang by translation to Scheme [10], but the generated Scheme code is too tightly coupled to their implementation to be useful as a generic intermediate format for Erlang. Furthermore, it does not seem feasible to perform e.g. program verification or process communication analysis using the output from their translation.

We have described the Core Erlang language, its relation to Erlang, and the history of its development. Core Erlang is today being used in the Erlang/OTP compiler,³ which is the standard compiler for the Erlang language. Although the compiler back end had to be modified to handle some of the constructs, this was not a major problem, and in several cases cleaned up the implementation. Furthermore, the core language makes it possible to continue the development of Erlang in a more systematic way than before.

³ The Erlang/OTP compiler, runtime system and libraries are available as Open Source.

Our experience (e.g., from the implementation of a recent, advanced inlining algorithm, likely to be incorporated in the standard compiler) is that analysis and transformation of Erlang programs are greatly simplified by being performed at the core language level, which has only a few, simple and well-defined constructs, and uses ordinary scoping rules.



Acknowledgements

I would like to thank Kostis Sagonas for his advice and comments on versions of this paper.

References

[1] J. Armstrong, R. Virding, C. Wikström, and M. Williams. Concurrent Programming in Erlang, Second Edition. Prentice Hall, 1996.

[2] J. Armstrong, B. Däcker, R. Virding, and M. Williams. Implementing a functional language for highly parallel real-time applications. In Software Engineering for Telecommunication Switching Systems, Florence, March 1992.

[3] J. Barklund and R. Virding. Erlang 4.7.3 reference manual. Draft version 0.7, June 1999.

[4] R. Carlsson, B. Gustavsson, E. Johansson, T. Lindgren, S. Nyström, M. Pettersson, and R. Virding. Core Erlang 1.0 language specification. Technical Report 2000-030, Department of Information Technology, Uppsala University, November 2000.

[5] M. Dam and L. Fredlund. On the verification of open distributed systems. Technical Report R97-01, Swedish Institute of Computer Science, 1997.

[6] M. Feeley and M. Larose. Compiling Erlang to Scheme. LNCS 1490, pages 261-272. Springer-Verlag, 1998.

[7] B. Hausman. Turbo Erlang: Approaching the speed of C. In E. Tick, editor, Implementations of Logic Programming Systems. Kluwer Academic Publishers.

[8] F. Huch. Verification of Erlang programs using abstract interpretation and model checking. In Proceedings of the ACM SIGPLAN International Conference on Functional Programming, pages 261-272. ACM Press, 1999.

[9] R. Milner, M. Tofte, R. Harper, and D. MacQueen. The Definition of Standard ML (Revised). MIT Press, 1997.

[10] G. J. Sussman and G. L. Steele Jr. SCHEME, an interpreter for extended lambda calculus. AI Memo 349, Mass. Inst. of Technology, Artificial Intelligence Laboratory, Cambridge, Mass., December 1975.
