Languages and Compilers

Johan Jeuring, Doaitse Swierstra
November 7, 2010
Copyright © 2001–2009 by Johan Jeuring and Doaitse Swierstra.
Contents
Preface for the 2009 version
Voorwoord
1. Goals
1.1. History
1.2. Grammar analysis of context-free grammars
1.3. Compositionality
1.4. Abstraction mechanisms
2. Context-Free Grammars
2.1. Languages
2.2. Grammars
2.2.1. Notational conventions
2.3. The language of a grammar
2.3.1. Examples of basic languages
2.4. Parse trees
2.5. Grammar transformations
2.5.1. Removing duplicate productions
2.5.2. Substituting right hand sides for nonterminals
2.5.3. Removing unreachable productions
2.5.4. Left factoring
2.5.5. Removing left recursion
2.5.6. Associative separator
2.5.7. Introduction of priorities
2.5.8. Discussion
2.6. Concrete and abstract syntax
2.7. Constructions on grammars
2.7.1. SL: an example
2.8. Parsing
2.9. Exercises
3. Parser combinators
3.1. The type of parsers
3.2. Elementary parsers
3.3. Parser combinators
3.3.1. Matching parentheses: an example
3.4. More parser combinators
3.4.1. Parser combinators for EBNF
3.4.2. Separators
3.5. Arithmetical expressions
3.6. Generalised expressions
3.7. Exercises
4. Grammar and Parser design
4.1. Step 1: Example sentences for the language
4.2. Step 2: A grammar for the language
4.3. Step 3: Testing the grammar
4.4. Step 4: Analysing the grammar
4.5. Step 5: Transforming the grammar
4.6. Step 6: Deciding on the types
4.7. Step 7: Constructing the basic parser
4.7.1. Basic parsers from strings
4.7.2. A basic parser from tokens
4.8. Step 8: Adding semantic functions
4.9. Step 9: Did you get what you expected
4.10. Exercises
5. Compositionality
5.1. Lists
5.1.1. Built-in lists
5.1.2. User-defined lists
5.1.3. Streams
5.2. Trees
5.2.1. Binary trees
5.2.2. Trees for matching parentheses
5.2.3. Expression trees
5.2.4. General trees
5.2.5. Efficiency
5.3. Algebraic semantics
5.4. Expressions
5.4.1. Evaluating expressions
5.4.2. Adding variables
5.4.3. Adding definitions
5.4.4. Compiling to a stack machine
5.5. Block structured languages
5.5.1. Blocks
5.5.2. Generating code
5.6. Exercises
6. Computing with parsers
6.1. Insert a semantic function in the parser
6.2. Apply a fold to the abstract syntax
6.3. Deforestation
6.4. Using a class instead of abstract syntax
6.5. Passing an algebra to the parser
7. Programming with higher-order folds
7.1. The rep min problem
7.1.1. A straightforward solution
7.1.2. Lambda lifting
7.1.3. Tupling computations
7.1.4. Merging tupled functions
7.2. A small compiler
7.2.1. The language
7.2.2. A stack machine
7.2.3. Compiling to the stackmachine
7.3. Attribute grammars
8. Regular Languages
8.1. Finite-state automata
8.1.1. Deterministic finite-state automata
8.1.2. Nondeterministic finite-state automata
8.1.3. Implementation
8.1.4. Constructing a DFA from an NFA
8.1.5. Partial evaluation of NFA's
8.2. Regular grammars
8.2.1. Equivalence of Regular grammars and Finite automata
8.3. Regular expressions
8.4. Proofs
8.4.1. Proof of Theorem 8.7
8.4.2. Proof of Theorem 8.16
8.4.3. Proof of Theorem 8.17
8.5. Exercises
9. Pumping Lemmas: the expressive power of languages
9.1. The Chomsky hierarchy
9.1.1. Type-0 grammars
9.1.2. Type-1 grammars
9.1.3. Type-2 grammars
9.1.4. Type-3 grammars
9.2. The pumping lemma for regular languages
9.3. The pumping lemma for context-free languages
9.4. Proofs of pumping lemmas
9.4.1. Proof of the Regular Pumping Lemma, Theorem 9.1
9.4.2. Proof of the Context-free Pumping Lemma, Theorem 9.2
9.5. Exercises
10. LL Parsing
10.1. LL Parsing: Background
10.1.1. A stack machine for parsing
10.1.2. Some example derivations
10.1.3. LL(1) grammars
10.2. LL Parsing: Implementation
10.2.1. Context-free grammars in Haskell
10.2.2. Parse trees in Haskell
10.2.3. LL(1) parsing
10.2.4. Implementation of isLL(1)
10.2.5. Implementation of lookahead
10.2.6. Implementation of empty
10.2.7. Implementation of first and last
10.2.8. Implementation of follow
11. LL versus LR parsing
11.1. LL(1) parser example
11.1.1. An LL(1) checker
11.1.2. An LL(1) grammar
11.1.3. Using the LL(1) parser
11.2. LR parsing
11.2.1. A stack machine for SLR parsing
11.3. LR parse example
11.3.1. An LR checker
11.3.2. LR action selection
11.3.3. LR optimizations and generalizations
A. The Stack module
B. Answers to exercises
Preface for the 2009 version
This is a work in progress. These lecture notes are largely identical to
the old lecture notes for “Grammars and Parsing” by Johan Jeuring and Doaitse
Swierstra.
The lecture notes have not only been used at Utrecht University, but also at the Open
University. Some modifications made by Manuela Witsiers have been reintegrated.
I have also started to make some modifications to style and content – mainly trying to
adapt the Haskell code contained in the lecture notes to currently established coding
guidelines. I also rearranged some of the material because I think the parts connect better
in the new order.
However, this work is not quite finished, and this means that some of the later
chapters look a bit different from the earlier chapters. I apologize in advance for any
inconsistencies or mistakes I may have introduced.
Please feel free to point out any mistakes you find in these lecture notes to me – any
other feedback is also welcome.
I hope to be able to update the online version of the lecture notes regularly.
Andres Löh
October 2009
Voorwoord
The following lecture notes are based on texts from previous years, which were written,
among other things, in the context of the project Kwaliteit en Studeerbaarheid.
The notes have been improved over the past years, but we warmly welcome suggestions
for further improvement, in particular where it concerns pointing out connections with
other courses.
Many people have contributed to these notes by writing a part of them, or by commenting
on (parts of) the notes. Special mention is due to Jeroen Fokker, Rik van Geldrop, and
Luc Duponcheel, who helped by writing one or more chapters of the notes. Comments were
provided by, among others: Arthur Baars, Arnoud Berendsen, Gijsbert Bol, Breght Boschker,
Martin Bravenboer, Pieter Eendebak, Rijk-Jan van Haaften, Graham Hutton, Daan Leijen,
Andres Löh, Erik Meijer, and Vincent Oostindië.
Finally, we would like to take this opportunity to give some study advice:
• It is our own experience that explaining the material to someone else is often what
makes clear which parts you do not yet master yourself. So if you think you understand
a chapter well, try to explain it in your own words.
• Practice makes perfect. The more attention is paid to the presentation of the material,
and the more examples are given, the more tempting it is to conclude after reading a
chapter that you actually master the subject. However, "understanding" is not the same
as "knowing", "knowing" is different from "mastering", and "mastering" is in turn
different from "being able to do something with it". So do the exercises included in
these notes yourself, and do not merely check whether you understand the solutions that
others have found. Try to keep track of the stage you have reached with respect to all
the stated learning goals. Ideally, you should be able to put together a nice exam for
your fellow students!
• Make sure you stay up to date. In contrast to some other courses, in this course it is
easy to lose the firm ground under your feet. It is not a matter of "new chances every
week". We have tried to address this somewhat through the organisation of the material,
but the overall structure does not leave much freedom here. If you have missed a week,
it is nearly impossible to understand the new material of the week after. The time you
then spend at the lectures and exercise classes is not very effective, and as a
consequence you often need a great deal of time before the exam (time that is not there)
to study everything on your own.
• We use the language Haskell to present many concepts and algorithms. If you still have
difficulties with Haskell, do not hesitate to do something about it right away, and to
ask for help if necessary. Otherwise you will make life very hard for yourself. Good
tools are half the work, and Haskell is our tool here.
Good luck, and hopefully also a lot of fun,
Johan Jeuring and Doaitse Swierstra
1. Goals
Introduction
Courses on Grammars, Parsing and Compilation of programming languages have
always been some of the core components of a computer science curriculum. The
reason for this is that from the very beginning of these curricula it has been one
of the few areas where the development of formal methods and the application of
formal techniques in actual program construction come together. For a long time the
construction of compilers has been one of the few areas where we had a methodology
available, where we had tools for generating parts of compilers out of formal descrip-
tions of the tasks to be performed, and where such program generators were indeed
generating programs which would have been impossible to create by hand. For many
practicing computer scientists the course on compiler construction still is one of the
highlights of their education.
One of the things that was not so clear, however, is where exactly this joy originated
from: the techniques taught definitely had a certain elegance, we could construct
programs someone else could not – thus giving us the feeling we had “the right
stuff” –, and when completing the practical exercises, which invariably consisted of
constructing a compiler for some toy language, we had the usual satisfied feeling.
This feeling was augmented by the fact that we would not have had the foggiest idea
how to complete such a product a few months before, and now we knew “how to do
it”.
This situation has remained so for years, and it is only in the last years that we
have started to discover and make explicit the reasons why this area attracted so
much interest. Many of the techniques which were taught on a “this is how you solve
this kind of problems” basis, have been provided with a theoretical underpinning
which explains why the techniques work. As a beneficial side-effect we also gradually
learned to see where the discovered concept further played a rôle, thus linking the
area with many other areas of computer science; and not only that, but also giving
us a means to explain such links, stress their importance, show correspondences and
transfer insights from one area of interest to the other.
Goals
The goals of these lecture notes can be split into primary goals, which are associated
with the specific subject studied, and secondary – but not less important – goals which
have to do with developing skills which one would expect every educated computer
scientist to have. The primary, somewhat more traditional, goals are:
• to understand how to describe structures (i.e., “formulas”) using grammars;
• to know how to parse, i.e., how to recognise (build) such structures in (from) a
sequence of symbols;
• to know how to analyse grammars to see whether or not specific properties
hold;
• to understand the concept of compositionality;
• to be able to apply these techniques in the construction of all kinds of programs;
• to familiarise oneself with the concept of computability.
The secondary, more far reaching, goals are:
• to develop the capability to abstract;
• to understand the concepts of abstract interpretation and partial evaluation;
• to understand the concept of domain specific languages;
• to show how proper formalisations can be used as a starting point for the
construction of useful tools;
• to improve the general programming skills;
• to show a wide variety of useful programming techniques;
• to show how to develop programs in a calculational style.
1.1. History
When at the end of the fifties the use of computers became more and more wide-
spread, and their reliability had increased enough to justify applying them to a wide
range of problems, it was no longer the actual hardware which posed most of the
problems. Writing larger and larger programs by more and more people sparked the
development of the first more or less machine-independent programming language
FORTRAN (FORmula TRANslator), which was soon to be followed by ALGOL-60
and COBOL.
For the developers of the FORTRAN language, of which John Backus was the prime
architect, the problem of how to describe the language was not a hot issue: much
more important problems were to be solved, such as, what should be in the language
and what not, how to construct a compiler for the language that would fit into the
small memories which were available at that time (kilobytes instead of megabytes),
and how to generate machine code that would not be ridiculed by programmers who
had thus far written such code by hand. As a result the language was very much
implicitly defined by what was accepted by the compiler and what not.
Soon after the development of FORTRAN an international working group started to
work on the design of a machine independent high-level programming language, to
become known under the name ALGOL-60. As a remarkable side-effect of this under-
taking, and probably caused by the need to exchange proposals in writing, not only
a language standard was produced, but also a notation for describing programming
languages was proposed by Naur and used to describe the language in the famous
Algol-60 report. Ever since it was introduced, this notation, which soon came to
be known as the Backus-Naur formalism (BNF), has been used as the primary tool
for describing the basic structure of programming languages.
It did not take long before computer scientists, and especially people writing compilers,
discovered that the formalism was not only useful to express what language should be
accepted by their compilers, but could also be used as a guideline for structuring their
compilers. Once this relationship between a piece of BNF and a compiler became
well understood, programs emerged which take such a piece of language description
as input, and produce a skeleton of the desired compiler. Such programs are now
known under the name parser generators.
Besides these very mundane goals, i.e., the construction of compilers, the BNF-
formalism soon also became a subject of study for the more theoretically oriented. It
appeared that the BNF-formalism actually was a member of a hierarchy of grammar
classes which had been formulated a number of years before by the linguist Noam
Chomsky in an attempt to capture the concept of a “language”. Questions arose
about the expressibility of BNF, i.e., which classes of languages can be expressed
by means of BNF and which not, and consequently how to express restrictions and
properties of languages for which the BNF-formalism is not powerful enough. In the
lectures we will see many examples of this.
1.2. Grammar analysis of context-free grammars
Nowadays the use of the word Backus-Naur is gradually diminishing, and, inspired
by the Chomsky hierarchy, we most often speak of context-free grammars. For the
construction of everyday compilers for everyday languages it appears that this class
is still a bit too large. If we use the full power of the context-free languages we
get compilers which in general are inefficient, and probably not so good in handling
erroneous input. This latter fact may not be so important from a theoretical point of
view, but it is from a pragmatical point of view. Most invocations of compilers still
have as their primary goal to discover mistakes made when typing the program, and
not so much generating actual code. This aspect is even more pronounced in strongly
typed languages, such as Java and Haskell, where the type checking performed by
the compilers is one of the main contributions to the increase in efficiency in the
programming process.
When constructing a recogniser for a language described by a context-free grammar
one often wants to check whether or not the grammar has specific desirable properties.
Unfortunately, for a human being it is not always easy, and quite often practically
impossible, to see whether or not a particular property holds. Furthermore, it may
be very expensive to check whether or not such a property holds. This has led to a
whole hierarchy of classes of context-free grammars, some of which are more powerful,
some are easy to check by machine, and some are easily checked by a simple human
inspection. In this course we will see many examples of such classes. The general
observation is that the more precise an answer one wants to a specific question, the
more computational effort is needed, and the sooner the question can no longer be
answered by a human being.
1.3. Compositionality
As we will see the structure of many compilers follows directly from the grammar
that describes the language to be compiled. Once this phenomenon was recognised it
went under the name syntax directed compilation. Under closer scrutiny, and under
the influence of the more functional oriented style of programming, it was recognised
that actually compilers are a special form of homomorphisms, a concept thus far only
familiar to mathematicians and more theoretically oriented computer scientists who
study the description of the meaning of a programming language.
This should not come as a surprise since this recognition is a direct consequence of the
tendency that ever greater parts of compilers are more or less automatically generated
from a formal description of some aspect of a programming language; e.g. by making
use of a description of their outer appearance or by making use of a description of the
semantics (meaning) of a language. We will see many examples of such mappings.
As a side effect you will acquire a special form of writing functional programs, which
makes it often surprisingly simple to solve at first sight rather complicated program-
ming assignments. We will see that the concept of lazy evaluation plays an important
rôle in making these efficient and straightforward implementations possible.
1.4. Abstraction mechanisms
One of the main reasons why what used to be an endeavour for a large team in
the past can now easily be done by a couple of first-year students in a matter of
days or weeks is that over the last thirty years we have discovered the right kind of
abstractions to be used, and an efficient way of partitioning a problem into smaller
components. Unfortunately there is no simple way to teach the techniques which
have led us thus far. The only way we see is to take a historians view and to compare
the old and the new situations.
Fortunately however there have also been some developments in programming lan-
guage design, of which we want to mention the developments in the area of functional
programming in particular. We claim that the combination of a modern, albeit quite
elaborate, type system, combined with the concept of lazy evaluation, provides an
ideal platform to develop and practice one's abstraction skills. There does not exist
another readily executable formalism which may serve as an equally powerful tool.
We hope that by presenting many algorithms, and fragments thereof, in a modern
functional language, we can show the real power of abstraction, and even find some
inspiration for further developments in language design: i.e., find clues about how to
extend such languages to enable us to make common patterns, which thus far have
only been demonstrated by giving examples, explicit.
2. Context-Free Grammars
Introduction
We often want to recognise a particular structure hidden in a sequence of symbols.
For example, when reading this sentence, you automatically structure it by means of
your understanding of the English language. Of course, not any sequence of symbols
is an English sentence. So how do we characterise English sentences? This is an old
question, which was posed long before computers were widely used; in the area of
natural language research the question has often been posed what actually constitutes
a “language”. The simplest definition one can come up with is to say that the English
language equals the set of all grammatically correct English sentences, and that a
sentence consists of a sequence of English words. This terminology has been carried
over to computer science: the programming language Java can be seen as the set of
all correct Java programs, whereas a Java program can be seen as a sequence of Java
symbols, such as identifiers, reserved words, specific operators etc.
This chapter introduces the most important notions of this course: the concept of
a language and a grammar. A language is a, possibly infinite, set of sentences and
sentences are sequences of symbols taken from a finite set (e. g., sequences of char-
acters, which are referred to as strings). Just as we say that the fact whether or not
a sentence belongs to the English language is determined by the English grammar
(remember that before we have used the phrase “grammatically correct”), we have a
grammatical formalism for describing artificial languages.
A difference with the grammars for natural languages is that this grammatical for-
malism is a completely formal one. This property may enable us to mathematically
prove that a sentence belongs to some language, and often such proofs can be con-
structed automatically by a computer in a process called parsing. Notice that this
is quite different from the grammars for natural languages, where one may easily
disagree about whether something is correct English or not. This completely formal
approach however also comes with a disadvantage; the expressiveness of the class of
grammars we are going to describe in this chapter is rather limited, and there are
many languages one might want to describe but which cannot be described, given
the limitations of the formalism.
Goals
The main goal of this chapter is to introduce and show the relation between the
main concepts for describing the parsing problem: languages and sentences, and
grammars.
In particular, after you have studied this chapter you will:
• know the concepts of language and sentence;
• know how to describe languages by means of context-free grammars;
• know the difference between a terminal symbol and a nonterminal symbol;
• be able to read and interpret the BNF notation;
• understand the derivation process used in describing languages;
• understand the rôle of parse trees;
• understand the relation between context-free grammars and datatypes;
• understand the EBNF formalism;
• understand the concepts of concrete and abstract syntax;
• be able to convert a grammar from EBNF-notation into BNF-notation by hand;
• be able to construct a simple context-free grammar in EBNF notation;
• be able to verify whether or not a simple grammar is ambiguous;
• be able to transform a grammar, for example for removing left recursion.
2.1. Languages
The goal of this section is to introduce the concepts of language and sentence.
In conventional texts about mathematics it is not uncommon to encounter a definition
of sequences that looks as follows:
Definition 2.1 (Sequence). Let X be a set. The set of sequences over X, called X∗, is defined as follows:
• ε is a sequence, called the empty sequence, and
• if z is a sequence and a is an element of X, then az is also a sequence.
The above definition is an instance of a very common definition pattern: it is a
definition by induction, i. e., the definition of the concept refers to the concept itself.
It is implicitly understood that nothing that cannot be formed by repeated, but finite
application of one of the two given rules is a sequence over X.
Furthermore, the definition corresponds almost exactly to the definition of the type
[a] of lists with elements of type a in Haskell. The one difference is that Haskell lists
can be infinite, whereas sequences are always finite.
In the following, we will introduce several concepts based on sequences. They can be
implemented easily in Haskell using lists.
Functions that operate on an inductively defined structure such as sequences are
typically structurally recursive, i. e., such definitions often follow a recursion pattern
which is similar to the definition of the structure itself. (Recall the function foldr
from the course on Functional Programming.)
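To make this concrete, here is a minimal sketch in Haskell (the names Sequence, Empty, Cons and toList are ours, not taken from the notes) of the inductive definition and of a structurally recursive function over it:

data Sequence x = Empty                 -- ε, the empty sequence
                | Cons x (Sequence x)   -- az: an element a in front of a sequence z

toList :: Sequence x -> [x]             -- conversion to the built-in list type [x];
toList Empty      = []                  -- the recursion follows the structure of the
toList (Cons a z) = a : toList z        -- definition, just as foldr does for lists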
Note that the Haskell notation for lists is generally more precise than the mathemat-
ical notation for sequences. When talking about languages and grammars, we often
leave the distinction between single symbols and sequences implicit.
We use letters from the beginning of the alphabet to represent single symbols, and
letters from the end of the alphabet to represent sequences. We write a to denote
both the single symbol a or the sequence aε, depending on context. We typically use
ε only when we want to explicitly emphasize that we are talking about the empty
sequence.
Furthermore, we denote concatenation of sequences and symbols in the same way,
i. e., az should be understood as the symbol a followed by the sequence z , whereas
xy is the concatenation of sequences x and y.
In Haskell, all these distinctions are explicit. Elements are distinguished from lists
by their type; there is a clear difference between a and [a]. Concatenation of lists
is handled by the operator (++), whereas a single element can be added to the front
of a list using (:). Also, Haskell identifiers often have longer names, so ab in Haskell
is to be understood as a single identifier with name ab, not as a combination of two
symbols a and b.
Now we move from individual sequences to finite or infinite sets of sequences. We
start with some terminology:
Definition 2.2 (Alphabet, Language, Sentence).
• An alphabet is a finite set of symbols.
• A language is a subset of T∗, for some alphabet T.
• A sentence (often also called word) is an element of a language.
Note that ‘word’ and ‘sentence’ in formal languages are used as synonyms.
Some examples of alphabets are:
• the conventional Roman alphabet: {a, b, c, . . . , z};
• the binary alphabet {0, 1};
• sets of reserved words {if, then, else};
• a set of characters l = {a, b, c, d, e, i, k, l, m, n, o, p, r, s, t, u, w, x};
• a set of English words {course, practical, exercise, exam}.
Examples of languages are:
• T∗, ∅ (the empty set), {ε} and T are languages over alphabet T;
• the set {course, practical, exercise, exam} is a language over the alphabet
l of characters and exam is a sentence in it.
The question that now arises is how to specify a language. Since a language is a set
we immediately see three different approaches:
• enumerate all the elements of the set explicitly;
• characterise the elements of the set by means of a predicate;
• define which elements belong to the set by means of induction.
We have just seen some examples of the first (the Roman alphabet) and third (the set
of sequences over an alphabet) approach. Examples of the second approach are:
• the even natural numbers {n | n ∈ {0, 1, . . . , 9}∗, n mod 2 = 0};
• the language PAL of palindromes, sequences which read the same forward as
backward, over the alphabet {a, b, c}: {s | s ∈ {a, b, c}∗, s = sᴿ}, where sᴿ
denotes the reverse of sequence s.
One of the fundamental differences between the predicative and the inductive ap-
proach to defining a language is that the latter approach is constructive, i. e., it
provides us with a way to enumerate all elements of a language. If we define a lan-
guage by means of a predicate we only have a means to decide whether or not an
element belongs to a language. A famous example of a language which is easily de-
fined in a predicative way, but for which the membership test is very hard, is the set
of prime numbers.
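As a small illustration (the function name is ours), the predicative definition of PAL can be written down almost literally as a Haskell membership test:

isPalindrome :: String -> Bool          -- s ∈ {a, b, c}∗ and s = sᴿ
isPalindrome s = all (`elem` "abc") s && s == reverse s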
Quite often we want to prove that a language L, which is defined by means of an
inductive definition, has a specific property P. If this property is of the form P(L) =
∀x ∈ L . P(x), then we want to prove that L ⊆ P.
Since languages are sets the usual set operators such as union, intersection and dif-
ference can be used to construct new languages from existing ones. The complement
of a language L over alphabet T is defined by L̄ = {x | x ∈ T∗, x ∉ L}.
In addition to these set operators, there are more specific operators, which apply only
to sets of sequences. We will use these operators mainly in the chapter on regular
languages, Chapter 8. Note that ∪ denotes set union, so {1, 2} ∪ {1, 3} = {1, 2, 3}.
Definition 2.3 (Language operations). Let L and M be languages over the same
alphabet T, then
L̄     = T∗ − L                     complement of L
Lᴿ     = {sᴿ | s ∈ L}               reverse of L
LM     = {st | s ∈ L, t ∈ M}        concatenation of L and M
L⁰     = {ε}                        0th power of L
Lⁿ⁺¹   = L Lⁿ                       (n+1)st power of L
L∗     = L⁰ ∪ L¹ ∪ L² ∪ . . .       star-closure of L (the union of Lⁱ over all i ∈ N)
L⁺     = L¹ ∪ L² ∪ . . .            positive closure of L (the union of Lⁱ over all i > 0)
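For finite languages, these operations can be tried out directly in Haskell. The following is only a sketch under the assumption that a language is represented as a finite list of sentences (all names are ours); the star-closure is truncated at a given depth, since L∗ is infinite in general.

import Data.List (nub)

type Language = [String]

concatLang :: Language -> Language -> Language   -- LM = {st | s ∈ L, t ∈ M}
concatLang l m = nub [s ++ t | s <- l, t <- m]

power :: Language -> Int -> Language             -- Lⁿ
power _ 0 = [""]                                 -- L⁰ = {ε}
power l n = concatLang l (power l (n - 1))

starUpTo :: Int -> Language -> Language          -- L⁰ ∪ L¹ ∪ . . . ∪ Lᵈ
starUpTo d l = nub (concatMap (power l) [0 .. d])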
The following equations follow immediately from the above definitions.
L∗ = {ε} ∪ LL∗
L⁺ = LL∗
Exercise 2.1. Let L = {ab, aa, baa}, where a and b are the terminals. Which of the following
strings are in L∗: abaabaaabaa, aaaabaaaa, baaaaabaaaab, baaaaabaa?
Exercise 2.2. What are the elements of ∅∗?
Exercise 2.3. For any language, prove
1. ∅L = L∅ = ∅
2. {ε}L = L{ε} = L
Exercise 2.4. In this section we defined two “star” operators: one for arbitrary sets (Defini-
tion 2.1) and one for languages (Definition 2.3). Is there a difference between these operators?
2.2. Grammars
The goal of this section is to introduce the concept of context-free grammars.
Working with sets might be fun, but it often is complicated to manipulate sets, and
to prove properties of sets. For these purposes we introduce syntactical definitions,
called grammars, of sets. This section will only discuss so-called context-free gram-
mars, a kind of grammars that are convenient for automatic processing, and that can
describe a large class of languages. But the class of languages that can be described
by context-free grammars is limited.
In the previous section we defined PAL, the language of palindromes, by means of
a predicate. Although this definition defines the language we want, it is hard to
use in proofs and programs. An important observation is the fact that the set of
palindromes can be defined inductively as follows.
Definition 2.4 (Palindromes by induction).
• The empty string, ε, is a palindrome;
• the strings consisting of just one character, a, b, and c, are palindromes;
• if P is a palindrome, then the strings obtained by prepending and appending
the same character, a, b, and c, to it are also palindromes, that is, the strings
aPa
bPb
cPc
are palindromes.
The first two parts of the definition cover the basic cases. The last part of the
definition covers the inductive cases. All strings which belong to the language PAL
inductively defined using the above definition read the same forwards and backwards.
Therefore this definition is said to be sound (every string in PAL is a palindrome).
Conversely, if a string consisting of a’s, b’s, and c’s reads the same forwards and
backwards then it belongs to the language PAL. Therefore this definition is said to
be complete (every palindrome is in PAL).
Finding an inductive definition for a language which is described by a predicate (like
the one for palindromes) is often a nontrivial task. Very often it is relatively easy
to find a definition that is sound, but you also have to convince yourself that the
definition is complete. A typical method for proving soundness and completeness of
an inductive definition is mathematical induction.
Now that we have an inductive definition for palindromes, we can proceed by giving
a formal representation of this inductive definition.
Inductive definitions like the one above can be represented formally by making use
of deduction rules which look like:
a₁, a₂, . . . , aₙ ⊢ a        or        ⊢ a
The first kind of deduction rule has to be read as follows:
if a₁, a₂, . . . , and aₙ are true, then a is true.
The second kind of deduction rule, called an axiom, has to be read as follows:
a is true.
Using these deduction rules we can now write down the inductive definition for PAL
as follows:
⊢ ε ∈ PAL′
⊢ a ∈ PAL′
⊢ b ∈ PAL′
⊢ c ∈ PAL′
P ∈ PAL′ ⊢ aPa ∈ PAL′
P ∈ PAL′ ⊢ bPb ∈ PAL′
P ∈ PAL′ ⊢ cPc ∈ PAL′
Although the definition of PAL′ is completely formal, it is still laborious to write.
Since in computer science we use many definitions which follow such a pattern, we
introduce a shorthand for it, called a grammar. A grammar consists of production
rules. We can give a grammar of PAL′ by translating the deduction rules given above
into production rules. The rule with which the empty string is constructed is:
P → ε
This rule corresponds to the axiom that states that the empty string ε is a palindrome.
A rule of the form s → α, where s is a symbol and α is a sequence of symbols, is called
a production rule, or production for short. A production rule can be considered as
a possible way to rewrite the symbol s. The symbol P to the left of the arrow is a
symbol which denotes palindromes. Such a symbol is an example of a nonterminal
symbol, or nonterminal for short. Nonterminal symbols are also called auxiliary
symbols: their only purpose is to denote structure, they are not part of the alphabet
of the language. Three other basic production rules are the rules for constructing
palindromes consisting of just one character. Each of the one element strings a, b,
and c is a palindrome, and gives rise to a production:
P → a
P → b
P → c
These production rules correspond to the axioms that state that the one element
strings a, b, and c are palindromes. If a string α is a palindrome, then we obtain a
new palindrome by prepending and appending an a, b, or c to it, that is, aαa, bαb,
and cαc are also palindromes. To obtain these palindromes we use the following
recursive productions:
P → aPa
P → bPb
P → cPc
These production rules correspond to the deduction rules that state that, if P is a
palindrome, then one can deduce that aPa, bPb and cPc are also palindromes. The
grammar we have presented so far consists of three components:
• the set of terminals {a, b, c};
• the set of nonterminals {P};
• and the set of productions (the seven productions that we have introduced so
far).
Note that the intersection of the set of terminals and the set of nonterminals is
empty. We complete the description of the grammar by adding a fourth component:
the nonterminal start symbol P. In this case we have only one choice for a start
symbol, but a grammar may have many nonterminal symbols, and we always have
to select one to start with.
To summarize, we obtain the following grammar for PAL:
P → ε
P → a
P → b
P → c
P → aPa
P → bPb
P → cPc
The definition of the set of terminals, {a, b, c}, and the set of nonterminals, {P}, is
often implicit. Also the start-symbol is implicitly defined here since there is only one
nonterminal.
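The productions of this grammar can also be read as a generator. The following sketch (the name pals is ours) produces the infinite list of all palindromes over {a, b, c}, in non-decreasing order of length:

pals :: [String]
pals = "" : ["a", "b", "c"]                               -- P → ε | a | b | c
          ++ [ [c] ++ p ++ [c] | p <- pals, c <- "abc" ]  -- P → aPa | bPb | cPc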
We conclude this example with the formal definition of a context-free grammar.
Definition 2.5 (Context-Free Grammar). A context-free grammar G is a four-tuple (T, N, R, S) where
• T is a finite set of terminal symbols;
• N is a finite set of nonterminal symbols (T and N are disjoint);
• R is a finite set of production rules. Each production has the form A → α,
where A is a nonterminal and α is a sequence of terminals and nonterminals;
• S is the start symbol, S ∈ N.
The adjective “context-free” in the above definition comes from the specific produc-
tion rules that are considered: exactly one nonterminal on the left hand side. Not
every language can be described via a context-free grammar. The standard example
here is {aⁿbⁿcⁿ | n ∈ N}. We will encounter this example again later in these lecture
notes.
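Definition 2.5 translates directly into a Haskell type. The sketch below is only illustrative (the names are ours; Chapter 10 develops its own representation of context-free grammars in Haskell), with the grammar for PAL as an example value:

type Grammar = ( [Char]             -- T, the terminal symbols
               , [Char]             -- N, the nonterminal symbols
               , [(Char, String)]   -- R, a production A → α as a pair (A, α)
               , Char )             -- S, the start symbol

palGrammar :: Grammar
palGrammar =
  ( "abc", "P"
  , [ ('P', ""), ('P', "a"), ('P', "b"), ('P', "c")
    , ('P', "aPa"), ('P', "bPb"), ('P', "cPc") ]
  , 'P' )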
2.2.1. Notational conventions
In the definition of the grammar for PAL we have written every production on a
single line. Since this takes up a lot of space, and since the production rules form the
heart of every grammar, we introduce the following shorthand. Instead of writing
S → α
S → β
we combine the two productions for S in one line, using the symbol |:
S → α | β
We may rewrite any number of rewrite rules for one nonterminal in this fashion, so
the grammar for PAL may also be written as follows:
P → ε | a | b | c | aPa | bPb | cPc
The notation we use for grammars is known as BNF – Backus Naur Form – after
Backus and Naur, who first used this notation for defining grammars.
Another notational convention concerns names of productions. Sometimes we want to
give names to production rules. The names will be written in front of the production.
So, for example,
Alpha rule: S → α
Beta rule: S → β
Finally, if we give a context-free grammar just by means of its productions, the start-
symbol is usually the nonterminal in the left hand side of the first production, and
the start-symbol is usually called S.
Exercise 2.5. Give a context free grammar for the set of sentences over alphabet X where
1. X = {a},
2. X = {a, b}.
Exercise 2.6. Give a context free grammar for the language
L = {aⁿbⁿ | n ∈ N}
Exercise 2.7. Give a grammar for palindromes over the alphabet {a, b}.
Exercise 2.8. Give a grammar for the language
L = {s sᴿ | s ∈ {a, b}∗}
This language is known as the language of mirror palindromes.
Exercise 2.9. A parity sequence is a sequence consisting of 0’s and 1’s that has an even
number of ones. Give a grammar for parity sequences.
Exercise 2.10. Give a grammar for the language
L = {w | w ∈ {a, b}∗ ∧ #(a, w) = #(b, w)}
where #(c, w) is the number of c-occurrences in w.
2.3. The language of a grammar
The goal of this section is to describe the relation between grammars and languages:
to show how to derive sentences of a language, given its grammar.
In the previous section, we have demonstrated in several examples how to construct
a grammar for a particular language. Now we consider the reverse question: how to
obtain a language from a given grammar? Before we can answer this question we
first have to say what we can do with a grammar. The answer is simple: we can
derive sequences with it.
How do we construct a palindrome? A palindrome is a sequence of terminals, in our
case the characters a, b and c, that can be derived in zero or more direct derivation
steps from the start symbol P using the productions of the grammar for palindromes
given before.
For example, the sequence bacab can be derived using the grammar for palindromes
as follows:
P ⇒ bPb ⇒ baPab ⇒ bacab
Such a construction is called a derivation. In the first step of this derivation
production P → bPb is used to rewrite P into bPb. In the second step production
P → aPa is used to rewrite bPb into baPab. Finally, in the last step production
P → c is used to rewrite baPab into bacab. Constructing a derivation can be seen
as a constructive proof that the string bacab is a palindrome.
We will now describe derivation steps more formally.
Definition 2.6 (Derivation). Suppose X → β is a production of a grammar, where
X is a nonterminal symbol and β is a sequence of (nonterminal or terminal) symbols.
Let αXγ be a sequence of (nonterminal or terminal) symbols. We say that αXγ
directly derives the sequence αβγ, which is obtained by replacing the left hand side
X of the production by the corresponding right hand side β. We write αXγ ⇒ αβγ
and we also say that αXγ rewrites to αβγ in one step. A sequence ϕₙ is derived
from a sequence ϕ₀, written ϕ₀ ⇒∗ ϕₙ, if there exist sequences ϕ₀, . . . , ϕₙ such that
∀i, 0 ≤ i < n : ϕᵢ ⇒ ϕᵢ₊₁
If n = 0, this statement is trivially true, and it follows that we can derive each
sentence ϕ from itself in zero steps:
ϕ ⇒∗ ϕ
A partial derivation is a derivation of a sequence that still contains nonterminals.
Finding a derivation ϕ₀ ⇒∗ ϕₙ is, in general, a nontrivial task. A derivation is only
one branch of a whole search tree which contains many more branches. Each branch
represents a (successful or unsuccessful) direction in which a possible derivation may
proceed. Another important challenge is to arrange things in such a way that finding
a derivation can be done in an efficient way.
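A single direct derivation step is easy to express in Haskell. In the sketch below (the names are ours), a sequence of symbols is a String and a production X → β is a pair; step returns every sequence that is directly derivable from its argument:

type Production = (Char, String)

step :: [Production] -> String -> [String]
step prods phi =
  [ pre ++ beta ++ post                    -- αβγ, obtained from αXγ
  | i <- [0 .. length phi - 1]
  , let (pre, x : post) = splitAt i phi
  , (lhs, beta) <- prods
  , lhs == x ]

pal :: [Production]                        -- the productions for palindromes
pal = [ ('P', ""), ('P', "a"), ('P', "b"), ('P', "c")
      , ('P', "aPa"), ('P', "bPb"), ('P', "cPc") ]

For example, step pal "bPb" contains, among others, "bcb" and "baPab".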
From the example derivation above it follows that
P ⇒∗ bacab
Because this derivation begins with the start symbol of the grammar and results in a
sequence consisting of terminals only (a terminal string), we say that the string bacab
belongs to the language generated by the grammar for palindromes. In general, we
define
Definition 2.7 (Language of a grammar). The language of a grammar G = (T, N, R, S),
usually denoted by L(G), is defined as
L(G) = {s | S ⇒∗ s, s ∈ T∗}
The language L(G) is also called the language generated by the grammar G. We
sometimes also talk about the language of a nonterminal A, which is defined by
L(A) = {s | A ⇒∗ s, s ∈ T∗}
Note that different grammars may have the same language. For example, if we extend
the grammar for PAL with the production P → bacab, we obtain a grammar with
exactly the same language as PAL. Two grammars that generate the same language
are called equivalent. So for a particular grammar there exists a unique language,
but the reverse is not true: given a language we can construct many grammars that
generate the language. To phrase it more mathematically: the mapping between a
grammar and its language is not a bijection.
Definition 2.8 (Context-free language). A context-free language is a language that
is generated by a context-free grammar.
All palindromes can be derived from the start symbol P. Thus, the language of
our grammar for palindromes is PAL, the set of all palindromes over the alphabet
{a, b, c}, and PAL is context-free.
2.3.1. Examples of basic languages
Digits occur in several programming languages and other languages, and so do
letters. In this subsection we will define some grammars that specify some basic
languages such as digits and letters. These grammars will be used frequently in later
sections.
• The language of single digits is specified by a grammar with ten production
rules for the nonterminal Dig.
Dig → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
• We obtain sequences of digits by means of the following grammar:
Digs → ε | Dig Digs
• Natural numbers are sequences of digits that start with a non-zero digit. So
in order to specify natural numbers, we first define the language of non-zero
digits.
Dig-0 → 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
Now we can define the language of natural numbers as follows.
Nat → 0 | Dig-0 Digs
• Integers are natural numbers preceded by a sign. If a natural number is not
preceded by a sign, it is supposed to be a positive number.
Sign → + | -
Int → Sign Nat | Nat
• The languages of small letters and capital letters are each specified by a gram-
mar with 26 productions:
SLetter → a | b | . . . | z
CLetter → A | B | . . . | Z
In the real definitions of these grammars we have to write each of the 26 letters,
of course. A letter is now either a small or a capital letter.
Letter → SLetter | CLetter
• Variable names, function names, datatypes, etc., are all represented by identi-
fiers in programming languages. The following grammar for identifiers might
be used in a programming language:
Identifier → Letter AlphaNums
AlphaNums → ε | Letter AlphaNums | Dig AlphaNums
An identifier starts with a letter, and is followed by a sequence of alphanumeric
characters, i. e., letters and digits. We might want to allow more symbols, such
as for example underscores and dollar symbols, but then we have to adjust the
grammar, of course.
• Dutch zip codes consist of four digits, of which the first digit is non-zero, fol-
lowed by two capital letters. So
ZipCode → Dig-0 Dig Dig Dig CLetter CLetter
Exercise 2.11. A terminal string that belongs to the language of a grammar is always
derived in one or more steps from the start symbol of the grammar. Why?
Exercise 2.12. What language is generated by the grammar with the single production rule
S → ε
Exercise 2.13. What language does the grammar with the following productions generate?
S → Aa
A → B
B → Aa
Exercise 2.14. Give a simple description of the language generated by the grammar with
productions
S → aA
A → bS
S → ε
Exercise 2.15. Is the language L defined in Exercise 2.1 context free?
2.4. Parse trees
The goal of this section is to introduce parse trees, and to show how parse trees relate
to derivations. Furthermore, this section defines (non)ambiguous grammars.
For any partial derivation, i. e., a derivation that contains nonterminals in its right
hand side, there may be several productions of the grammar that can be used to pro-
ceed the partial derivation with. As a consequence, there may be different derivations
for the same sentence.
However, if only the order in which the derivation steps are chosen differs between
two derivations, then the derivations are considered to be equivalent. If, however,
different derivation steps have been chosen in two derivations, then these derivations
are considered to be different.
Here is a simple example. Consider the grammar SequenceOfS with productions:
S → SS
S → s
Using this grammar, we can derive the sentence sss in at least the following two
ways (the nonterminal that is rewritten in each step appears underlined):
S ⇒ SS ⇒ SSS ⇒ SsS ⇒ ssS ⇒ sss
S ⇒ SS ⇒ sS ⇒ sSS ⇒ sSs ⇒ sss
These derivations are the same up to the order in which derivation steps are taken.
However, the following derivation does not use the same derivation steps:
S ⇒ SS ⇒ SSS ⇒ sSS ⇒ ssS ⇒ sss
In both derivation sequences above, the first S was rewritten to s. In this derivation,
however, the first S is rewritten to SS.
The set of all equivalent derivations can be represented by selecting a so-called canoni-
cal element. A good candidate for such a canonical element is the leftmost derivation.
In a leftmost derivation, the leftmost nonterminal is rewritten in each step. If there
exists a derivation of a sentence x using the productions of a grammar, then there
exists also a leftmost derivation of x. The last of the three derivation sequences for
the sentence sss given above is a leftmost derivation. The two equivalent derivation
sequences before, however, are both not leftmost. The leftmost derivation corre-
sponding to these two sequences above is
S ⇒ SS ⇒ sS ⇒ sSS ⇒ ssS ⇒ sss
There exists another convenient way for representing equivalent derivations: they all
correspond to the same parse tree (or derivation tree). A parse tree is a representation
of a derivation which abstracts from the order in which derivation steps are chosen.
The internal nodes of a parse tree are labelled with a nonterminal N, and the children
of such a node are the parse trees for symbols of the right hand side of a production for
N. The parse tree of a terminal symbol is a leaf labelled with the terminal symbol.
The resulting parse tree of the first two derivations of the sentence sss looks as
follows:
S
├── S
│   └── s
└── S
    ├── S
    │   └── s
    └── S
        └── s
The third derivation of the sentence sss results in a different parse tree:
S
S
├── S
│   ├── S
│   │   └── s
│   └── S
│       └── s
└── S
    └── s
As another example, all derivations of the string abba using the productions of the
grammar
P → ε
P → APA
P → BPB
A → a
B → b
are represented by the following derivation tree:
P
├── A
│   └── a
├── P
│   ├── B
│   │   └── b
│   ├── P
│   │   └── ε
│   └── B
│       └── b
└── A
    └── a
A derivation tree can be seen as a structural interpretation of the derived sentence.
Note that there might be more than one structural interpretation of a sentence with
respect to a given grammar. Such grammars are called ambiguous.
Definition 2.9 (ambiguous grammar, unambiguous grammar). A grammar is unambiguous
if every sentence has a unique leftmost derivation, or, equivalently, if every
sentence has a unique derivation tree. Otherwise it is called ambiguous.
The grammar SequenceOfS for constructing sequences of s’s is an example of an
ambiguous grammar, since there exist two parse trees for the sentence sss.
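The correspondence between these parse trees and Haskell datatypes (the relation between grammars and datatypes mentioned in the goals of this chapter) can be sketched as follows; the names are ours:

data S = Split S S   -- corresponds to the production S → SS
       | Leaf        -- corresponds to the production S → s

tree1, tree2 :: S
tree1 = Split Leaf (Split Leaf Leaf)   -- the first parse tree of sss shown above
tree2 = Split (Split Leaf Leaf) Leaf   -- the second parse tree of sss

yield :: S -> String                   -- both trees yield the same sentence sss
yield (Split l r) = yield l ++ yield r
yield Leaf        = "s"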
It is in general undecidable whether or not an arbitrary context-free grammar is
ambiguous. This implies that it is impossible to write a program that determines for
an arbitrary context-free grammar whether it is ambiguous or not.
It is usually rather difficult to translate languages with ambiguous grammars. There-
fore, you will find that most grammars of programming languages and other languages
that are used in processing information are unambiguous.
Grammars have proved very successful in the specification of artificial languages (such
as programming languages). They have proved less successful in the specification of
natural languages (such as English), partly because it is extremely difficult to construct
an unambiguous grammar that specifies a nontrivial part of the language. Take for
example the sentence ‘They are flying planes’. This sentence can be read in two
ways, with different meanings: ‘They – are – flying planes’, and ‘They – are flying
– planes’. While ambiguity of natural languages may perhaps be considered as an
advantage for their users (e. g., politicians), it certainly is considered a disadvantage
for language translators, because it is usually impossible to maintain an ambiguous
meaning in a translation.
2.5. Grammar transformations
In this section, we will first look at a number of properties of grammars. We then will
show how we can systematically transform grammars into grammars that describe
the same language and satisfy particular properties.
Here are some examples of properties that are worth considering for a particular
grammar:
• a grammar may be unambiguous, that is, every sentence of its language has a
unique parse tree;
• a grammar may have the property that only the start symbol can derive the
empty string; no other nonterminal can derive the empty string;
• a grammar may have the property that every production either has a single
terminal, or two nonterminals in its right hand side. Such a grammar is said
to be in Chomsky normal form.
So why are we interested in such properties? Some of these properties imply that it
is possible to build parse trees for sentences of the language of the grammar in only
one way. Some other properties imply that we can build these parse trees very fast.
Other properties are used to prove facts about grammars. Yet other properties are
used to efficiently compute certain information from parse trees of a grammar.
Properties are particularly interesting in combination with grammar transformations.
A grammar transformation is a procedure to obtain a grammar G′ from a grammar
G such that L(G′) = L(G).
Suppose, for example, that we have a program that builds parse trees for sentences
of grammars in Chomsky normal form, and that we can prove that every grammar
can be transformed into a grammar in Chomsky normal form. Then we can use this
program for building parse trees for any grammar.
Since it is sometimes convenient to have a grammar that satisfies a particular property
for a language, we would like to be able to transform grammars into other grammars
that generate the same language, but that possibly satisfy different properties. In
the following, we describe a number of grammar transformations:
• Removing duplicate productions.
• Substituting right hand sides for nonterminals.
• Removing unreachable productions.
• Left factoring.
• Removing left recursion.
• Associative separator.
• Introduction of priorities.
There are many more transformations than we describe here; we will only show a
small but useful set of grammar transformations. In the following, we will assume
that N is the set of nonterminals, T is the set of terminals, and that u, v, w, x, y and
z denote sequences of terminals and nonterminals, i. e., are elements of (N ∪ T)∗.
2.5.1. Removing duplicate productions
This transformation can be applied to any grammar that contains duplicate production
rules: if a grammar contains two occurrences of the same production rule, one of these
occurrences can be removed. For example,
A → u | u | v
can be transformed into
A → u | v
2.5.2. Substituting right hand sides for nonterminals
If a nonterminal X occurs in a right hand side of a production, the production may
be replaced by just as many productions as there exist productions for X, in which
X has been replaced by its right hand sides. For example, we can substitute B in
the right hand side of the first production in the following grammar:
A → uBv | z
B → x | w
The resulting grammar is:
A → uxv | uwv | z
B → x | w
2.5.3. Removing unreachable productions
Consider the result of the transformation above:
A → uxv | uwv | z
B → x | w
If A is the start symbol and B does not occur in any of the symbol sequences u, v,
w, x, z , then the second production can never occur in the derivation of a sentence
starting from A. In such a case, the unreachable production can be dropped:
A → uxv | uwv | z
2.5.4. Left factoring
Left factoring is a grammar transformation that is applicable when two productions
for the same nonterminal start with the same sequence of (terminal and/or nonter-
minal) symbols. These two productions can then be replaced by a single production,
that ends with a new nonterminal, replacing the part of the sequence after the com-
mon start sequence. Two productions for the new nonterminal are added: one for
each of the two different end sequences of the two productions. For example:
A → xy | xz | v
may be transformed into
A → xZ | v
Z → y | z
where Z is a new nonterminal. As we will see in Chapter 3, parsers can be constructed
systematically from a grammar, and left factoring the grammar before constructing
a parser can dramatically improve the performance of the resulting parser.
2.5.5. Removing left recursion
A production is called left-recursive if the right-hand side starts with the nonterminal
of the left-hand side. For example, the production
A → Az
is left-recursive. A grammar is left-recursive if we can derive A ⇒+ Az for some
nonterminal A of the grammar (i.e., if we can derive Az from A in one or more
steps).
Left-recursive grammars are sometimes undesirable – we will, for instance, see in
Chapter 3 that a parser constructed systematically from a left-recursive grammar
may loop. Fortunately, left recursion can be removed by transforming the grammar.
The following transformation removes left-recursive productions.
To remove the left-recursive productions of a nonterminal A, we divide the produc-
tions for A in sets of left-recursive and non left-recursive productions. We can thus
factorize the productions for A as follows:
A → Ax1 | Ax2 | . . . | Axn
A → y1 | y2 | . . . | ym
where none of the symbol sequences y1, . . . , ym starts with A. We now add a new
nonterminal Z, and replace the productions for A by:
A → y1 | y1 Z | . . . | ym | ym Z
Z → x1 | x1 Z | . . . | xn | xn Z
Note that this procedure only works for a grammar that is directly left-recursive, i. e.,
a grammar that contains a left-recursive production of the form A → Ax.
Grammars can also be indirectly left recursive. An example is
A → Bx
B → Ay
Neither of the two productions is left-recursive, but we can still derive A ⇒∗ Ayx.
Removing left recursion in an indirectly left-recursive grammar is also possible, but
a bit more complicated [1].
Here is an example of how to apply the procedure described above to a grammar that
is directly left recursive, namely the grammar SequenceOfS that we have introduced
in Section 2.4:
S → SS
S → s
The first production is left-recursive. The second is not. We can thus directly apply
the procedure for left recursion removal, and obtain the following productions:
S → s | sZ
Z → S | SZ
2.5.6. Associative separator
The following grammar fragment generates a list of declarations, separated by a
semicolon ‘;’:
Decls → Decls ; Decls
Decls → Decl
The productions for Decl , which generates a single declaration, have been omitted.
This grammar is ambiguous, for the same reason as SequenceOfS is ambiguous. The
operator ; is an associative separator in the generated language, that is, it does not
matter how we group the declarations; given three declarations d1, d2, and d3, the
meaning of d1 ; (d2 ; d3) and (d1 ; d2) ; d3 is the same. Therefore, we may use the
following unambiguous grammar for generating a language of declarations:
Decls → Decl ; Decls
Decls → Decl
Note that in this case, the transformed grammar is also no longer left-recursive.
An alternative unambiguous (but still left-recursive) grammar for the same language
is
Decls → Decls ; Decl
Decls → Decl
The grammar transformation just described should be handled with care: if the
separator is associative in the generated language, like the semicolon in this case,
applying the transformation is fine. However, if the separator is not associative, then
removing the ambiguity in favour of a particular nesting is dangerous.
This grammar transformation is often useful for expressions which are separated by
associative operators, for example natural numbers separated by addition.
2.5.7. Introduction of priorities
Another form of ambiguity often arises in the part of a grammar for a programming
language which describes expressions. For example, the following grammar generates
arithmetic expressions:
E → E + E
E → E * E
E → ( E )
E → Digs
where Digs generates a list of digits as described in Section 2.3.1.
This grammar is ambiguous: for example, the sentence 2+4*6 has two parse trees: one
corresponding to (2+4)*6, and one corresponding to 2+(4*6). If we make the usual
assumption that * has higher priority than +, the latter expression is the intended
reading of the sentence 2+4*6. In order to obtain parse trees that respect these
priorities, we transform the grammar as follows:
E → T
E → E + T
T → F
T → T * F
F → ( E )
F → Digs
This grammar generates the same language as the previous grammar for expressions,
but it respects the priorities of the operators.
In practice, often more than two levels of priority are used. Then, instead of writing
a large number of nearly identically formed production rules, we can abbreviate the
grammar by using parameterised nonterminals. For 1 ≤ i < n, we get productions
Ei → Ei+1
Ei → Ei Opi Ei+1
The nonterminal Opi is parameterised and generates operators of priority i. In
addition to the above productions, there should also be a production for expressions
of the highest priority, for example:
En → ( E1 ) | Digs
2.5.8. Discussion
We have presented several examples of grammar transformations. A grammar trans-
formation transforms a grammar into another grammar that generates the same
language. For each of the above transformations we should therefore prove that the
generated language remains the same. Since the proofs are too complicated at this
point, they are omitted. Proofs can be found in any of the theoretical books on
language and parsing theory [11].
There exist many other grammar transformations, but the ones given in this section
suffice for now. Note that everywhere we use ‘left’ (left-recursion, left factoring), we
can replace it by ‘right’, and obtain a dual grammar transformation. We will discuss
a larger example of a grammar transformation after the following section.
Exercise 2.16. Consider the following ambiguous grammar with start symbol A:
A → AaA
A → b | c
Transform the grammar by applying the rule for associative separators. Choose the trans-
formation such that the resulting grammar is also no longer left-recursive.
Exercise 2.17. The standard example of ambiguity in programming languages is the dan-
gling else. Let G be a grammar with terminal set {if, b, then, else, a} and the following
productions:
S → if b then S else S
S → if b then S
S → a
1. Give two parse trees for the sentence if b then if b then a else a.
2. Give an unambiguous grammar that generates the same language as G.
3. How does Java prevent this dangling else problem?
Exercise 2.18. A bit list is a nonempty list of bits separated by commas. A grammar for
bit lists is given by
L → B
L → L , L
B → 0 | 1
Remove the left recursion from this grammar.
Exercise 2.19. Consider the following grammar with start symbol S:
S → AB
A → ε | aaA
B → ε | Bb
1. What language does this grammar generate?
2. Give an equivalent non-left-recursive grammar.
2.6. Concrete and abstract syntax
In this section, we establish a connection between context-free grammars and Haskell
datatypes. To this end, we introduce the notion of abstract syntax, and show how
to obtain an abstract syntax from a concrete syntax.
For each context-free grammar we can define a corresponding datatype in Haskell.
Values of these datatypes represent parse trees of the context-free grammar. As an
example we take the grammar SequenceOfS:
S → SS
S → s
First, we give each of the productions of this grammar a name:
Beside: S → SS
Single: S → s
Now we interpret the start symbol of the grammar S as a datatype, using the names
of the productions as constructors:
data S = Beside S S
       | Single
Note that the nonterminals on the right hand side of Beside reappear as arguments of
the constructor Beside. On the other hand, the terminal symbol s in the production
Single is omitted in the definition of the constructor Single.
One may be tempted to make the following definition instead:
data S′ = Beside′ S′ S′
        | Single′ Char — too general
However, this datatype is too general for the given grammar. An argument of type
Char can be instantiated to any single character, but we know that this character
always has to be s. Since there is no choice anyway, there is no extra value in storing
that s, and the first datatype S serves the purpose of encoding the parse trees of the
grammar SequenceOfS just fine.
For example, the parse tree that corresponds to the first two derivations of the se-
quence sss is represented by the following value of the datatype S:
Beside Single (Beside Single Single)
The third derivation of the sentence sss produces the following parse tree:
Beside (Beside Single Single) Single
To emphasize that these representations contain sufficient information in order to re-
produce the original strings, we can write a function that performs this conversion:
sToString :: S → String
sToString (Beside l r) = sToString l ++ sToString r
sToString Single = "s"
Applying the function sToString to either Beside Single (Beside Single Single) or
Beside (Beside Single Single) Single yields the string "sss".
A concrete syntax of a language describes the appearance of the sentences of a lan-
guage. So the concrete syntax of the language of nonterminal S is given by the
grammar SequenceOfS.
On the other hand, an abstract syntax of a language describes the shapes of parse trees
of the language, without the need to refer to concrete terminal symbols. Parse trees
are therefore often also called abstract syntax trees. The datatype S is an example
of an abstract syntax for the language of SequenceOfS. The adjective ‘abstract’
indicates that values of the abstract syntax do not need to explicitly contain all
information about particular sentences, as long as that information is recoverable, as
for example by applying function sToString.
A function such as sToString is often called a semantic function. A semantic function
is a function that is defined on an abstract syntax of a language. Semantic functions
are used to give semantics (meaning) to values. In this example, the meaning of a
more abstract representation is expressed in terms of a concrete representation.
Using the removing left recursion grammar transformation, the grammar SequenceOfS
can be transformed into the grammar with the following productions:
S → sZ | s
Z → SZ | S
An abstract syntax of this grammar may be given by
data SA = ConsS Z | SingleS
data Z = ConsZ SA Z | SingleZ SA
For each nonterminal in the original grammar, we have introduced a corresponding
datatype. For each production for a particular nonterminal (expanding all the alter-
natives into separate productions), we have introduced a constructor and invented a
name for the constructor. The nonterminals on the right hand side of the production
rules appear as arguments of the constructors, but the terminals disappear, because
that information can be recovered by knowing which constructors have been used.
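For example, a semantic function analogous to sToString can be given for this abstract
syntax as well. The following sketch (the names saToString and zToString do not occur
elsewhere in these notes) reinserts the terminal symbol s that was omitted from the
constructors:

saToString :: SA → String
saToString (ConsS z) = "s" ++ zToString z      -- the production S → sZ
saToString SingleS   = "s"                     -- the production S → s

zToString :: Z → String
zToString (ConsZ s z) = saToString s ++ zToString z   -- the production Z → SZ
zToString (SingleZ s) = saToString s                  -- the production Z → S

Applying saToString to ConsS (SingleZ SingleS), for instance, yields the string "ss" again.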
If we look at the two different grammars for SequenceOfS, and the two abstract
syntaxes, we can conclude that the only important information about sequences of
s’s is how many occurrences of s there are. So the ultimate abstract syntax for
SequenceOfS is
data SS = Size Int
Using the abstract syntax SS, the sequence sss is represented by the parse tree
Size 3, and we can still recover the original string from a value of type SS by means of a
semantic function:
ssToString :: SS → String
ssToString (Size n) = replicate n ’s’
The SequenceOfS example shows that one may choose between many different ab-
stract syntaxes for a given grammar. The choice of an abstract syntax over another
should therefore be determined by the demands of the application, i.e., by what we
ultimately want to compute.
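To make the relation between the two abstract syntaxes explicit, one could for instance
convert the datatype S into the most abstract representation SS by counting occurrences
of s. The function abstract below is only a sketch; its name does not occur elsewhere in
these notes:

abstract :: S → SS
abstract (Beside l r) = Size (m + n)        -- a Beside node contributes the sizes of both subtrees
  where Size m = abstract l
        Size n = abstract r
abstract Single = Size 1                    -- a Single node stands for exactly one s

Both parse trees of the sentence sss are mapped to Size 3, which confirms that the number
of s’s is the only information they share.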
Exercise 2.20. The following Haskell datatype represents a limited form of arithmetic ex-
pressions
data Expr = Add Expr Expr
          | Mul Expr Expr
          | Con Int
Give a grammar for a suitable concrete syntax corresponding to this datatype.
Exercise 2.21. Consider the grammar for palindromes that you have constructed in Exer-
cise 2.7. Give parse trees for the palindromes pal1 = "abaaba" and pal2 = "baaab". Define a
datatype Pal corresponding to the grammar and represent the parse trees for pal1 and pal2
as values of Pal.
Exercise 2.22. Consider your answers to Exercises 2.7 and 2.21 where we have given a
grammar for palindromes over the alphabet ¦a, b¦ and a Haskell datatype describing the
abstract syntax of such palindromes.
1. Write a semantic function that transforms an abstract representation of a palindrome
into a concrete one. Test your function with the palindromes pal1 and pal2 from
Exercise 2.21.
2. Write a semantic function that counts the number of a’s occurring in a palindrome.
Test your function with the palindromes pal1 and pal2 from Exercise 2.21.
Exercise 2.23. Consider your answer to Exercise 2.8, which describes the concrete syntax
for mirror palindromes.
1. Define a datatype Mir that describes the abstract syntax corresponding to your gram-
mar. Give the two abstract mirror palindromes aMir1 and aMir2 that correspond to
the concrete mirror palindromes cMir1 = "abaaba" and cMir2 = "abbbba".
2. Write a semantic function that transforms an abstract representation of a mirror
palindrome into a concrete one. Test your function with the abstract mirror palin-
dromes aMir1 and aMir2.
3. Write a function that transforms an abstract representation of a mirror palindrome
into the corresponding abstract representation of a palindrome (using the datatype
from Exercise 2.21). Test your function with the abstract mirror palindromes aMir1
and aMir2.
Exercise 2.24. Consider your answer to Exercise 2.9, which describes the concrete syntax
for parity sequences.
1. Define a datatype Parity describing the abstract syntax corresponding to your gram-
mar. Give the two abstract parity sequences aEven1 and aEven2 that correspond to
the concrete parity sequences cEven1 = "00101" and cEven2 = "01010".
2. Write a semantic function that transforms an abstract representation of a parity se-
quence into a concrete one. Test your function with the abstract parity sequences
aEven1 and aEven2.
Exercise 2.25. Consider your answer to Exercise 2.18, which describes the concrete syntax
for bit lists by means of a grammar that is not left-recursive.
1. Define a datatype BitList that describes the abstract syntax corresponding to your
grammar. Give the two abstract bit lists aBitList1 and aBitList2 that correspond to
the concrete bit lists cBitList1 = "0,1,0" and cBitList2 = "0,0,1".
2. Write a semantic function that transforms an abstract representation of a bit list into
a concrete one. Test your function with the abstract bit lists aBitList1 and aBitList2.
3. Write a function that concatenates two abstract representations of bit lists into a
single bit list. Test your function with the abstract bit lists aBitList1 and aBitList2.
2.7. Constructions on grammars
This section introduces some constructions on grammars that are useful when spec-
ifying larger grammars, for example for programming languages. Furthermore, it
gives an example of a larger grammar that is transformed in several steps.
The BNF notation, introduced in Section 2.2.1, was first used in the early sixties when
the programming language ALGOL 60 was defined, and it is still the standard way of
defining the syntax of programming languages (see, for instance, the Java Language
Grammar). You may object that the Java grammar contains
more “syntactical sugar” than the grammars that we considered thus far (and to be
honest, this also holds for the ALGOL 60 grammar): one encounters nonterminals
with postfixes ‘?’, ‘+’ and ‘∗’.
This extended BNF notation, EBNF, is introduced in order to help abbreviate a
number of standard constructions that usually occur quite often in the syntax of a
programming language:
• one or zero occurrences of nonterminal P, abbreviated P?,
• one or more occurrences of nonterminal P, abbreviated P+,
• and zero or more occurrences of nonterminal P, abbreviated P∗.
We could easily express these constructions by adding additional nonterminals, but
that decreases the readability of the grammar. The notation for the EBNF con-
structions is not entirely standardized. In some texts, you will for instance find the
notation [P] instead of P?, and {P} for P∗. The same notation can be used for
languages, grammars, and sequences of terminal and nonterminal symbols instead of
just single nonterminals. In this section, we define the meaning of these constructs.
We introduced grammars as an alternative for the description of languages. Designing
a grammar for a specific language may not be a trivial task. One approach is to
decompose the language and to find grammars for each of its constituent parts.
In Definition 2.3, we have defined a number of operations on languages using op-
erations on sets. We now show that these operations can be expressed in terms of
operations on context-free grammars.
Theorem 2.10 (Language operations). Suppose we have grammars for the languages
L and M, say GL = (T, NL, RL, SL) and GM = (T, NM, RM, SM). We assume that
the nonterminal sets NL and NM are disjoint. Then
• the language L ∪ M is generated by the grammar (T, N, R, S) where S is a fresh
nonterminal, N = NL ∪ NM ∪ {S} and R = RL ∪ RM ∪ {S → SL, S → SM};
• the language L M is generated by the grammar (T, N, R, S) where S is a fresh
nonterminal, N = NL ∪ NM ∪ {S} and R = RL ∪ RM ∪ {S → SL SM};
• the language L∗ is generated by the grammar (T, N, R, S) where S is a fresh
nonterminal, N = NL ∪ {S} and R = RL ∪ {S → ε, S → SL S};
• the language L+ is generated by the grammar (T, N, R, S) where S is a fresh
nonterminal, N = NL ∪ {S} and R = RL ∪ {S → SL, S → SL S}.
The theorem above establishes that the set-theoretic operations at the level of lan-
guages (i. e., sets of sentences) have a direct counterpart at the level of grammatical
descriptions. A straightforward question to ask is now: can we also define languages
as the difference between two languages or as the intersection of two languages, and
translate these operations to operations on grammars? Unfortunately, the answer
is negative – there are no operations on grammars that correspond to the language
intersection and difference operators.
Two of the above constructions are important enough to actually define them as
grammar operations. Furthermore, we add a new grammar construction for an “op-
tional grammar”.
Definition 2.11 (Grammar operations). Let G = (T, N, R, S) be a context-free
grammar and let S′ be a fresh nonterminal. Then
G∗ = (T, N ∪ {S′}, R ∪ {S′ → ε, S′ → S S′}, S′)
G+ = (T, N ∪ {S′}, R ∪ {S′ → S, S′ → S S′}, S′)
G? = (T, N ∪ {S′}, R ∪ {S′ → ε, S′ → S}, S′)
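As an aside, these grammar operations can be phrased as Haskell functions on a very
simple, hypothetical representation of context-free grammars. The types Symbol,
Production and Grammar and the function star below are assumptions made for this
sketch only and do not appear elsewhere in these notes; the other two operations are
analogous:

type Symbol     = String                    -- name of a terminal or nonterminal
type Production = (Symbol, [Symbol])        -- left-hand side and right-hand side
type Grammar    = ([Symbol], [Symbol], [Production], Symbol)
                  -- (terminals, nonterminals, productions, start symbol)

-- The ∗ operation of Definition 2.11: add a fresh start symbol S′ with
-- productions S′ → ε and S′ → S S′.
star :: Grammar → Grammar
star (ts, ns, rs, s) = (ts, s′ : ns, (s′, []) : (s′, [s, s′]) : rs, s′)
  where s′ = s ++ "Star"                    -- assumed to be a fresh nonterminal name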
The definition of P?, P+, and P∗ for a sequence of symbols P is very similar to the
definitions of the operations on grammars. For example, P∗ denotes zero or more
concatenations of string P, so Dig∗ denotes the language consisting of zero or more
digits.
Definition 2.12 (EBNF for sequences). Let P be a sequence of nonterminals and
terminals, then
L(P∗) = L(Z) with Z → ε | PZ
L(P+) = L(Z) with Z → P | PZ
L(P?) = L(Z) with Z → ε | P
where Z is a new nonterminal in each definition.
Because the concatenation operator for sequences is associative, the operators ∗ and +
can also be defined symmetrically:
L(P∗) = L(Z) with Z → ε | ZP
L(P+) = L(Z) with Z → P | ZP
Many variations are possible on this theme:
L(P∗ Q) = L(Z) with Z → Q | PZ (2.1)
or also
L(P Q∗) = L(Z) with Z → P | ZQ (2.2)
2.7.1. SL: an example
To illustrate EBNF and some of the grammar transformations given in the previous
section, we give a larger example. The following grammar generates expressions in a
very small programming language, called SL.
Expr → if Expr then Expr else Expr
Expr → Expr where Decls
Expr → AppExpr
AppExpr → AppExpr Atomic | Atomic
Atomic → Var | Number | Bool | ( Expr )
Decls → Decl
Decls → Decls ; Decls
Decl → Var = Expr
where the nonterminals Var, Number, and Bool generate variables, number expres-
sions, and boolean expressions, respectively. Note that the brackets around the Expr
in the production for Atomic, and the semicolon in between the Decls in the second
production for Decls are also terminal symbols. The following ‘program’ is a sentence
of this language:
if true then funny true else false where funny = 7
It is clear that this is not a very convenient language to write programs in.
The above grammar is ambiguous (why?), and we introduce priorities to resolve
some of the ambiguities. Application binds stronger than if, and both application
and if bind stronger than where. Using the “introduction of priorities” grammar
transformation, we obtain:
Expr → Expr1
Expr → Expr1 where Decls
Expr1 → Expr2
Expr1 → if Expr1 then Expr1 else Expr1
Expr2 → Atomic
Expr2 → Expr2 Atomic
where Atomic and Decls have the same productions as before.
The nonterminal Expr2 is left-recursive. Removing left recursion gives the following
productions for Expr2:
Expr2 → Atomic | Atomic Expr′2
Expr′2 → Atomic | Atomic Expr′2
Since the new nonterminal Expr′2 has exactly the same productions as Expr2, these
productions can be replaced by
Expr2 → Atomic | Atomic Expr2
So Expr2 generates a nonempty sequence of atomics. Using the +-notation intro-
duced before, we can replace Expr2 by Atomic+.
Another source of ambiguity is the pair of productions for Decls. The nonterminal Decls
generates a nonempty list of declarations, and the separator ; is assumed to be asso-
ciative. Hence we can apply the “associative separator” transformation to obtain
Decls → Decl | Decls ; Decl
or, according to (2.2),
Decls → Decl (; Decl)∗
The last grammar transformation we apply is “left factoring”. This transformation
is applied to the productions for Expr, and yields
Expr → Expr1 Expr′1
Expr′1 → ε | where Decls
Since nonterminal Expr′1 generates either nothing or a where clause, we can replace
Expr′1 by an optional where clause in the production for Expr:
Expr → Expr1 (where Decls)?
After all these grammar transformations, we obtain the following grammar.
Expr → Expr1 (where Decls)?
Expr1 → Atomic+
Expr1 → if Expr1 then Expr1 else Expr1
Atomic → Var | Number | Bool | ( Expr )
Decls → Decl (; Decl)∗
Exercise 2.26. Give the EBNF notation for each of the basic languages defined in Sec-
tion 2.3.1.
Exercise 2.27. Let G be a grammar. Give the language that is generated by G? (i. e., the
? operation applied to G).
Exercise 2.28. Let
L1 = {a^m b^m c^n | m, n ∈ N}
L2 = {a^m b^n c^n | m, n ∈ N}
1. Give grammars for L1 and L2.
2. Is L1 ∩ L2 context-free, i. e., can you give a context-free grammar for this language?
2.8. Parsing
This section formulates the parsing problem, and discusses some of the future topics
of the course.
Definition 2.13 (Parsing problem). Given the grammar G and a string s, the
parsing problem answers the question whether or not s ∈ L(G). If s ∈ L(G), the
answer to this question may be either a parse tree or a derivation.
This question may not be easy to answer given an arbitrary grammar. Until now we
have only seen simple grammars for which it is relatively easy to determine whether
or not a string is a sentence of the grammar. For more complicated grammars this
may be more difficult. However, in the first part of this course we will show how –
given a grammar with certain reasonable properties – we can easily construct parsers
by hand. At the same time we will show how the parsing process can quite often be
combined with the algorithm we actually want to perform on the recognized object
(the semantic function). The techniques we describe comprise a simple, although
surprisingly efficient, introduction into the area of compiler construction.
A compiler for a programming language consists of several parts. Examples of such
parts are a scanner, a parser, a type checker, and a code generator. Usually, a parser
is preceded by a scanner (also called lexer), which splits an input sentence into a list
of so-called tokens. For example, given the sentence
if true then funny true else false where funny = 7
a scanner might return the following list of tokens:
["if", "true", "then", "funny", "true",
"else", "false", "where", "funny", "=", "7"]
So a token is a syntactical entity. A scanner usually performs the first step towards
an abstract syntax: it throws away layout information such as spacing and newlines.
In this course we will concentrate on parsers, but some of the concepts of scanners
will sometimes be used.
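For the example sentence above, splitting the input at white space already yields exactly
this list of tokens. A minimal sketch of such a scanner (not the scanner used later in these
notes, which does considerably more work) is therefore simply:

scanner :: String → [String]
scanner = words    -- split the input at white space; a realistic scanner also handles
                   -- punctuation, comments, string literals, and so on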
In the second part of this course we will take a look at more complicated grammars,
which do not always conform to the restrictions just referred to. By analysing the
grammar we may nevertheless be able to generate parsers as well. Such generated
parsers will be in such a form that it will be clear that writing such parsers by hand
is far from attractive, and actually impossible for all practical cases.
One of the problems we have not referred to yet in this rather formal chapter is of
a more practical nature. Quite often the sentence presented to the parser will not
be a sentence of the language since mistakes were made when typing the sentence.
This raises another interesting question: What are the minimal changes that have
to be made to the sentence in order to convert it into a sentence of the language?
It goes almost without saying that this is an important question to be answered in
practice; one would not be very happy with a compiler which, given an erroneous
input, would just reply that the “Input could not be recognised”. One of the most
important aspects here is to define a metric for deciding about the minimality of a
change; humans usually make certain mistakes more often than others. A semicolon
is easily forgotten, but it is far less likely that an if symbol is missing. This is where
grammar engineering starts to play a rôle.
Summary
Starting from a simple example, the language of palindromes, we have introduced
the concept of a context-free grammar. Associated concepts, such as derivations and
parse trees were introduced.
2.9. Exercises
Exercise 2.29. Do there exist languages L such that (L∗) = (L)∗?
Exercise 2.30. Give a language L such that L = L∗.
Exercise 2.31. Under which circumstances is L+ = L∗ − {ε}?
Exercise 2.32. Let L be a language over alphabet {a, b, c} such that L = L^R. Does L contain
only palindromes?
Exercise 2.33. Consider the grammar with productions
S → AA
A → AAA
A → a
A → bA
A → Ab
1. Which terminal strings can be produced by derivations of four or fewer steps?
2. Give at least two distinct derivations for the string babbab.
3. For any m, n, p ≥ 0, describe a derivation of the string b^m a b^n a b^p.
Exercise 2.34. Consider the grammar with productions
S → aaB
A → bBb
A → ε
B → Aa
Show that the string aabbaabba cannot be derived from S.
Exercise 2.35. Give a grammar for the language
L = {ωcω^R | ω ∈ {a, b}∗}
This language is known as the center-marked palindromes language. Give a derivation of the
sentence abcba.
Exercise 2.36. Describe the language generated by the grammar:
S → ε
S → A
A → aAb
A → ab
Can you find another (preferably simpler) grammar for the same language?
Exercise 2.37. Describe the languages generated by the grammars.
S → ε
S → A
A → Aa
A → a
and
S → ε
S → A
A → AaA
A → a
Can you find other (preferably simpler) grammars for the same languages?
Exercise 2.38. Show that the languages generated by the grammars G1, G2 and G3 are the
same.
G1:        G2:        G3:
S → ε      S → ε      S → ε
S → aS     S → Sa     S → a
                      S → SS
Exercise 2.39. Consider the following property of grammars:
1. the start symbol is the only nonterminal which may have an empty production (a
production of the form X → ε),
2. the start symbol does not occur in any alternative.
A grammar having this property is called non-contracting. The grammar A → aAb | ε does
not have this property. Give a non-contracting grammar which describes the same language.
Exercise 2.40. Describe the language L of the grammar
A → AaA | a
Give a grammar for L that has no left-recursive productions. Give a grammar for L that has
no right-recursive productions.
Exercise 2.41. Describe the language L of the grammar
X → a | Xb
Give a grammar for L that has no left-recursive productions. Give a grammar for L that has
no left-recursive productions and is non-contracting.
Exercise 2.42. Consider the language L of the grammar
S → T | US
T → aSa | Ua
U → S | SUT
Give a grammar for L which uses only productions with two or fewer symbols on the right
hand side. Give a grammar for L which uses only two nonterminals.
Exercise 2.43. Give a grammar for the language of all sequences of 0’s and 1’s which start
with a 1 and contain exactly one 0.
Exercise 2.44. Give a grammar for the language consisting of all nonempty sequences of
brackets,
{ (, ) }
in which the brackets match. An example sentence of the language is ( ) ( ( ) ) ( ). Give a
derivation for this sentence.
Exercise 2.45. Give a grammar for the language consisting of all nonempty sequences of
two kinds of brackets,
{ (, ), [, ] }
in which the brackets match. An example sentence in this language is [ ( ) ] ( ).
Exercise 2.46. This exercise shows an example (attributed to Noam Chomsky) of an am-
biguous English sentence. Consider the following grammar for a part of the English language:
Sentence → Subject Predicate .
Subject → they
Predicate → Verb NounPhrase
Predicate → AuxVerb Verb Noun
Verb → are
Verb → flying
AuxVerb → are
NounPhrase → Adjective Noun
Adjective → flying
Noun → planes
Give two different leftmost derivations for the sentence
they are flying planes.
Exercise 2.47. Try to find some ambiguous sentences in your own natural language. Here
are some ambiguous Dutch sentences seen in the newspapers:
Vliegen met hartafwijking niet gevaarlijk
Jessye Norman kan niet zingen
Alcohol is voor vrouwen schadelijker dan mannen
Exercise 2.48 (•). Is your grammar for Exercise 2.45 unambiguous? If not, find one which
is unambiguous.
Exercise 2.49. This exercise deals with a grammar that uses unusual terminal and non-
terminal symbols. Assume that , ⊗, and ⊕ are nonterminals, and the other symbols are
terminals.
→ ´⊗
→ ⊗
⊗ → ⊗3⊕
⊗ → ⊕
⊕ → ♣
⊕ → ♠
Find a derivation for the sentence ♣3♣´♠.
Exercise 2.50 (•). Prove, using induction, that the grammar G for palindromes in Sec-
tion 2.2 does indeed generate the language of palindromes.
Exercise 2.51 (••, no answer provided). Prove that the language generated by the grammar
of Exercise 2.33 contains all strings over {a, b} where the number of a’s is even and greater
than zero.
Exercise 2.52 (no answer provided). Consider the natural numbers in unary notation where
only the symbol I is used; thus 4 is represented as IIII. Write an algorithm that, given a
string w of I’s, determines whether or not w is divisible by 7.
Exercise 2.53 (no answer provided). Consider the natural numbers in reverse binary no-
tation; thus 4 is represented as 001. Write an algorithm that, given a string w of zeros and
ones, determines whether or not w is divisible by 7.
Exercise 2.54 (no answer provided). Let w be a string consisting of a’s and b’s only. Write
an algorithm that determines whether or not the number of a’s in w equals the number of
b’s in w.
3. Parser combinators
Introduction
This chapter is an informal introduction to writing parsers in a lazy functional lan-
guage using ‘parser combinators’. Parsers can be written using a small set of basic
parsing functions, and a number of functions that combine parsers into more com-
plicated parsers. The functions that combine parsers are called parser combinators.
The basic parsing functions do not combine parsers, and are therefore not parser
combinators in this sense, but they are usually also called parser combinators.
Parser combinators are used to write parsers that are very similar to the grammar of
a language. Thus writing a parser amounts to translating a grammar to a functional
program, which is often a simple task.
Parser combinators are built by means of standard functional language constructs
like higher-order functions, lists, and datatypes. List comprehensions are used in a
few places, but they are not essential, and could easily be rephrased using the map,
filter and concat functions. Type classes are only used for overloading the equality
and arithmetic operators.
We will start by motivating the definition of the type of parser functions. Using that
type, we can build parsers for the language of (possibly ambiguous) grammars. Next,
we will introduce some elementary parsers that can be used for parsing the terminal
symbols of a language.
In Section 3.3 the first parser combinators are introduced, which can be used for se-
quentially and alternatively combining parsers, and for calculating so-called semantic
functions during the parse. Semantic functions are used to give meaning to syntactic
structures. As an example, we construct a parser for strings of matching paren-
theses in Section 3.3.1. Different semantic values are calculated for the matching
parentheses: a tree describing the structure, and an integer indicating the nesting
depth.
In Section 3.4 we introduce some new parser combinators. Not only do these make life
easier later, but their definitions are also nice examples of using parser combinators.
A real application is given in Section 3.5, where a parser for arithmetical expressions
is developed. Finally, the expression parser is generalised to expressions with an
arbitrary number of precedence levels. This is done without coding the priorities of
operators as integers, and we will avoid using indices and ellipses.
It is not always possible to directly construct a parser from a context-free grammar
using parser combinators. If the grammar is left-recursive, it has to be transformed
into a non left-recursive grammar before we can construct a combinator parser. An-
other limitation of the parser combinator technique as described in this chapter is
that it is not trivial to write parsers for complex grammars that perform reasonably
efficiently. However, there do exist implementations of parser combinators that per-
form remarkably well, see [10, 12]. For example, there exist good parsers using parser
combinators for the Haskell language.
Most of the techniques introduced in this chapter have been described by Burge [4],
Wadler [13] and Hutton [7].
This chapter is a revised version of an article by Fokker [5].
Goals
This chapter introduces the first programs for parsing in these lecture notes. Parsers
are composed from simple parsers by means of parser combinators. Hence, important
primary goals of this chapter are:
• to understand how to parse, i. e., how to recognise structure in a sequence of
symbols, by means of parser combinators;
• to be able to construct a parser when given a grammar;
• to understand the concept of semantic functions.
Two secondary goals of this chapter are:
• to develop the capability to abstract;
• to understand the concept of domain specific language.
Required prior knowledge
To understand this chapter, you should be able to formulate the parsing problem,
and you should understand the concept of context-free grammar. Furthermore, you
should be familiar with functional programming concepts such as type, class, and
higher-order functions.
3.1. The type of parsers
The goals of this section are:
• develop a type Parser that is used to give the type of parsing functions;
• show how to obtain this type by means of several abstraction steps.
The parsing problem is (see Section 2.8): Given a grammar G and a string s, de-
termine whether or not s ∈ L(G). If s ∈ L(G), the answer to this question may
be either a parse tree or a derivation. For example, in Section 2.4 we have seen a
grammar for sequences of s’s:
S → SS | s
A parse tree of an expression of this language is a value of the datatype S (or a value
of several variants of that type, see Section 2.6, depending on what you want to do
with the result of the parser), which is defined by
data S = Beside S S | Single
A parser for expressions could be implemented as a function of the following type:
type Parser = String → S — preliminary
For parsing substructures, a parser can call other parsers, or call itself recursively.
These calls do not only have to communicate their result, but also the part of the
input string that is left unprocessed. For example, when parsing the string sss, a
parser will first build a parse tree Beside Single Single for ss, and only then build a
complete parse tree
Beside (Beside Single Single) Single
using the unprocessed part s of the input string. As this cannot be done using a global
variable, the unprocessed input string has to be part of the result of the parser. The
two results can be paired. A better definition for the type Parser is hence:
type Parser = String → (S, String) — still preliminary
Any parser of type Parser returns an S and a String. However, for different grammars
we want to return different parse trees: the type of tree that is returned depends on
the grammar for which we want to parse sentences. Therefore it is better to abstract
from the type S, and to turn the parser type into a polymorphic type. The type
Parser is parametrised with a type a, which represents the type of parse trees.
type Parser a = String → (a, String) — still preliminary
For example, a parser that returns a structure of type Oak (whatever that is) now
has type Parser Oak. A parser that parses sequences of s’s has type Parser S.
We might also define a parser that does not return a value of type S, but instead
the number of s’s in the input sequence. This parser would have type Parser Int.
Another instance of a parser is a parse function that recognises a string of digits, and
returns the number represented by it as a parse ‘tree’. In this case the function is
also of type Parser Int. Finally, a recogniser that either accepts or rejects sentences
of a grammar returns a boolean value, and will have type Parser Bool .
Until now, we have assumed that every string can be parsed in exactly one way. In
general, this need not be the case: it may be that a single string can be parsed in
various ways, or that there is no way to parse a string. For example, the string "sss"
has the following two parse trees:
Beside (Beside Single Single) Single
Beside Single (Beside Single Single)
As another refinement of the type Parser, instead of returning one parse tree (and
its associated rest string), we let a parser return a list of trees. Each element of the
result consists of a tree, paired with the rest string that was left unprocessed after
parsing. The type definition of Parser therefore becomes:
type Parser a = String → [(a, String)] — useful, but still suboptimal
If there is just one parse, the result of the parse function is a singleton list. If no
parse is possible, the result is an empty list. In case of an ambiguous grammar, the
result consists of all possible parses.
This method for parsing is called the list of successes method, described by Wadler [13].
It can be used in situations where in other languages you would use so-called back-
tracking techniques. In the Bird and Wadler textbook it is used to solve combinatorial
problems like the eight queens problem [3]. If only one solution is required rather
than all possible solutions, you can take the head of the list of successes. Thanks
to lazy evaluation, not all elements of the list are computed if only the first value is
needed, so there will be no loss of efficiency. Lazy evaluation provides a backtracking
approach to finding the first solution.
Parsers with the type described so far operate on strings, that is lists of characters.
There is however no reason for not allowing parsing strings of elements other than
characters. You may imagine a situation in which a preprocessor prepares a list of
tokens (see Section 2.8), which is subsequently parsed. To cater for this situation we
refine the parser type once more: we let the type of the elements of the input string
be an argument of the parser type. Calling the type of symbols s, and as before the
result type a, the type of parsers becomes
type Parser s a = [s] → [(a, [s])]
or, if you prefer meaningful identifiers over conciseness:
— The type of parsers
type Parser symbol result = [symbol ] → [(result, [symbol ])]
Listing 3.1: ParserType.hs
type Parser symbol result = [symbol ] → [(result, [symbol ])]
We will use this type definition in the rest of this chapter. This type is defined in
Listing 3.1, the first part of our parser library. The list of successes appears in the result
type of a parser. Each element of this list is a possible parsing of (an initial part of)
the input. We will hardly use the full generality provided by the Parser type: the
type of the input s (or symbol ) will almost always be Char.
3.2. Elementary parsers
The goals of this section are:
• introduce some very simple parsers for parsing sentences of grammars with rules
of the form:
A → ε
A → a
A → x
where x is a sequence of terminals;
• show how one can construct useful functions from simple, trivially correct func-
tions by means of generalisation and partial parametrisation.
This section defines parsers that can only be used to parse fixed sequences of terminal
symbols. For a grammar with a production that contains nonterminals in its right-
hand side we need techniques that will be introduced in the following section.
We will start with a very simple parse function that just recognises the terminal
symbol a. The type of the input string symbols is Char in this case, and as a parse
‘tree’ we also simply use a Char:
symbola :: Parser Char Char
symbola [] = []
symbola (x : xs) | x == ’a’   = [(’a’, xs)]
                 | otherwise  = []
— Elementary parsers
symbol :: Eq s ⇒ s → Parser s s
symbol a [] = []
symbol a (x : xs) | x == a     = [(x, xs)]
                  | otherwise  = []

satisfy :: (s → Bool) → Parser s s
satisfy p [] = []
satisfy p (x : xs) | p x        = [(x, xs)]
                   | otherwise  = []

token :: Eq s ⇒ [s] → Parser s [s]
token k xs | k == take n xs = [(k, drop n xs)]
           | otherwise      = []
  where n = length k

failp :: Parser s a
failp xs = []

succeed :: a → Parser s a
succeed r xs = [(r, xs)]
— Applications of elementary parsers
digit :: Parser Char Char
digit = satisfy isDigit
Listing 3.2: ParserType.hs
The list of successes method immediately pays off, because now we can return an
empty list if no parsing is possible (because the input is empty, or does not start
with an a).
In the same fashion, we can write parsers that recognise other symbols. As always,
rather than defining a lot of closely related functions, it is better to abstract from the
symbol to be recognised by making it an extra argument of the function. Further-
more, the function can operate on lists of characters, but also on lists of symbols of
other types, so that it can be used in other applications than character oriented ones.
The only prerequisite is that the symbols to be parsed can be tested for equality. In
Haskell, this is indicated by the Eq predicate in the type of the function.
Using these generalisations, we obtain the function symbol that is given in Listing 3.2.
The function symbol is a function that, given a symbol, returns a parser for that
symbol. A parser, in turn, is itself a function. This is why two arguments appear in
the definition of symbol .
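For example, assuming the definitions of Listing 3.2, the parser returned by symbol behaves
as follows on a few inputs (the bindings test1, test2 and test3 are only introduced for this
illustration):

test1 = symbol ’a’ "abc"   -- [(’a’, "bc")]  : the ’a’ is consumed
test2 = symbol ’a’ "bca"   -- []             : the input does not start with ’a’
test3 = symbol ’a’ ""      -- []             : there is no input left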
We will now define some elementary parsers that can do the work traditionally taken
care of by lexical analysers (see Section 2.8). For example, a useful parser is one
that recognises a fixed string of symbols, such as while or switch. We will call this
function token; it is defined in Listing 3.2. As in the case of the symbol function we
have parametrised this function with the string to be recognised, effectively making
it into a family of functions. Of course, this function is not confined to strings of
characters. However, we do need an equality test on the type of values in the input
string; the type of token is:
token :: Eq s ⇒ [s] → Parser s [s]
The function token is a generalisation of the symbol function, in that it recognises
a list of symbols instead of a single symbol. Note that we cannot define symbol in
terms of token: the two functions have incompatible types.
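As an illustration, again assuming Listing 3.2 (the name ifToken is made up for this
example), token recognises a fixed prefix of the input and returns it together with the
unprocessed rest:

ifToken :: Parser Char String
ifToken = token "if"
-- ifToken "if x then y"  evaluates to  [("if", " x then y")]
-- ifToken "then x"       evaluates to  []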
Another generalisation of symbol is a function which may, depending on the in-
put, return different parse results. Instead of specifying a specific symbol, we can
parametrise the function with a condition that the symbol should fulfill. Thus the
function satisfy has a function s → Bool as argument. Where symbol tests for equal-
ity to a specific value, the function satisfy tests for compliance with this predicate.
It is defined in Listing 3.2. This generalised function is for example useful when we
want to parse digits (characters in between ’0’ and ’9’):
digit :: Parser Char Char
digit = satisfy isDigit
where the function isDigit is the standard predicate that tests whether or not a
character is a digit:
isDigit :: Char → Bool
isDigit x = ’0’ ≤ x ∧ x ≤ ’9’
In books on grammar theory an empty string is often called ‘ε’. In this tradition,
we will define a function epsilon that ‘parses’ the empty string. It does not consume
any input, and hence always returns an empty parse tree and unmodified input. A
zero-tuple can be used as a result value: () is the only value of the type () – both the
type and the value are pronounced unit.
epsilon :: Parser s ()
epsilon xs = [((), xs)]
A more useful variant is the function succeed, which also doesn’t consume input,
but always returns a given, fixed value (or ‘parse tree’, if you can call the result of
processing zero symbols a parse tree). It is defined in Listing 3.2.
Dual to the function succeed is the function failp, which fails to recognise any symbol
on the input string. As the result list of a parser is a ‘list of successes’, and in the
case of failure there are no successes, the result list should be empty. Therefore the
function failp always returns the empty list of successes. It is defined in Listing 3.2.
Note the difference with epsilon, which does have one element in its list of successes
(albeit an empty one).
Do not confuse failp with epsilon: there is an important difference between returning
one solution (which contains the unchanged input as ‘rest’ string) and not returning
a solution at all!
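The difference shows up immediately when the three parsers are applied to the same input
string (a sketch of the expected results, using the definitions above):

-- epsilon    "abc"  evaluates to  [((), "abc")]   : one success, nothing consumed
-- succeed 42 "abc"  evaluates to  [(42, "abc")]   : one success, with a chosen result value
-- failp      "abc"  evaluates to  []              : no successes at all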
Exercise 3.1. Define a function capital :: Parser Char Char that parses capital letters.
Exercise 3.2. Since satisfy is a generalisation of symbol , the function symbol can be defined
as an instance of satisfy. How can this be done?
Exercise 3.3. Define the function epsilon using succeed.
3.3. Parser combinators
Using the elementary parsers from the previous section, parsers can be constructed
for terminal symbols from a grammar. More interesting are parsers for nonterminal
symbols. It is convenient to construct these parsers by partially parametrising higher-
order functions.
The goals of this section are:
• show how parsers can be constructed directly from the productions of a gram-
mar. The kind of productions for which parsers will be constructed are
A → x | y
A → x y
where x and y are sequences of nonterminal or terminal symbols;
• show how we can construct a small, powerful combinator language (a domain
specific language) for the purpose of parsing;
• understand and use the concept of semantic functions in parsers.
Let us have a look at the grammar for expressions again, see also Section 2.5:
E → T + E | T
T → F * T | F
F → Digs | ( E )
where Digs is a nonterminal that generates the language of sequences of digits, see
Section 2.3.1. An expression can be parsed according to any of the two rules for E.
This implies that we want to have a way to say that a parser consists of several alter-
native parsers. Furthermore, the first rule says that in order to parse an expression,
we should first parse a term, then a terminal symbol +, and then an expression. This
implies that we want to have a way to say that a parser consists of several parsers
that are applied sequentially.
So important operations on parsers are sequential and alternative composition: a
more complex construct can consist of a simple construct followed by another con-
struct (sequential composition), or by a choice between two constructs (alternative
composition). These operations correspond directly to their grammatical counter-
parts. We will develop two functions for this, which for notational convenience are
defined as operators: <∗> for sequential composition, and <|> for alternative com-
position. The names of these operators are chosen so that they can be easily remem-
bered: <∗> ‘multiplies’ two constructs together, and <|> can be pronounced as ‘or’.
Be careful, though, not to confuse the <|>-operator with Haskell’s built-in construct
|, which is used to distinguish cases in a function definition or as a separator for the
constructors of a datatype.
Priorities of these operators are defined so as to minimise parentheses in practical
situations:
infixl 6 <∗>
infixr 4 <|>
So <∗> has a higher priority – i. e., it binds stronger – than <|>.
Both operators take two parsers as argument, and return a parser as result. By
again combining the result with other parsers, you may construct even more involved
parsers.
In the definitions in Listing 3.3, the functions operate on parsers p and q. Apart
from the arguments p and q, the function operates on a string, which can be thought
of as the string that is parsed by the parser that is the result of combining p and
q.
We start with the definition of operator <∗>. For sequential composition, p must be
applied to the input first. After that, q is applied to the rest string of the result. The
first parser, p, returns a list of successes, each of which contains a value and a rest
string. The second parser, q, should be applied to the rest string, returning a second
value. Therefore we use a list comprehension, in which the second parser is applied
in all possible ways to the rest string of the first parser:
(p <∗> q) xs = [ (combine r1 r2, zs)
               | (r1, ys) ← p xs
               , (r2, zs) ← q ys
               ]
The rest string of the parser for the sequential composition of p and q is whatever
the second parser q leaves behind as rest string.
— Parser combinators
(<|>) :: Parser s a → Parser s a → Parser s a
(p <|> q) xs = p xs ++ q xs

(<∗>) :: Parser s (b → a) → Parser s b → Parser s a
(p <∗> q) xs = [ (f x, zs)
               | (f, ys) ← p xs
               , (x, zs) ← q ys
               ]

(<$>) :: (a → b) → Parser s a → Parser s b
(f <$> p) xs = [ (f y, ys)
               | (y, ys) ← p xs
               ]

— Applications of parser combinators

newdigit :: Parser Char Int
newdigit = f <$> digit
  where f c = ord c − ord ’0’
Listing 3.3: ParserCombinators.hs
Now, how should the results of the two parsings be combined? We could, of course,
parametrise the whole thing with an operator that describes how to combine the
parts (as is done in the zipWith function). However, we have chosen a different
approach, which nicely exploits the ability of functional languages to manipulate
functions. The function combine should combine the results of the two parse trees
recognised by p and q. In the past, we have interpreted the word ‘tree’ liberally:
simple values, like characters, may also be used as a parse ‘tree’. We will now also
accept functions as parse trees. That is, the result type of a parser may be a function
type.
If the first parser that is combined by <∗> would return a function of type b → a,
and the second parser a value of type b, a straightforward choice for the combine
function would be function application. That is exactly the approach taken in the
definition of <∗> in Listing 3.3. The first parser returns a function, the second parser
a value, and the combined parser returns the value that is obtained by applying the
function to the value.
Apart from ‘sequential composition’ we need a parser combinator for representing
‘choice’. For this, we have the parser combinator operator <|>. Thanks to the list
of successes method, both p1 and p2 return lists of possible parsings. To obtain all
possible parsings when applying p1 or p2, we only need to concatenate these two
lists.
By combining parsers with parser combinators we can construct new parsers. The
most important parser combinators are <∗> and <|>. The parser combinator <,>
in exercise 3.12 is just a variation of <∗>.
Sometimes we are not quite satisfied with the result value of a parser. The parser
might work well in that it consumes symbols from the input adequately (leaving the
unused symbols as rest-string in the tuples in the list of successes), but the result
value might need some postprocessing. For example, a parser that recognises one digit
is defined using the function satisfy: digit = satisfy isDigit. In some applications,
we may need a parser that recognises one digit character, but returns the result
as an integer, instead of a character. In a case like this, we can use a new parser
combinator: <$>. It takes a function and a parser as argument; the result is a parser
that recognises the same string as the original parser, but ‘postprocesses’ the result
using the function. We use the $ sign in the name of the combinator, because the
combinator resembles the operator that is used for normal function application in
Haskell: f $ x = f x. The definition of <$> is given in Listing 3.3. It is an infix
operator:
infixl 7 <$>
Using this postprocessing parser combinator, we can modify the parser digit that was
defined above:
newdigit :: Parser Char Int
newdigit = f <$>digit
where f c = ord c −ord ’0’
The auxiliary function f determines the ordinal number of a digit character; using
the parser combinator <$> it is applied to the result part of the digit parser.
In practice, the <$> operator is used to build a certain value during parsing (in the
case of parsing a computer program this value may be the generated code, or a list
of all variables with their types, etc.). Put more generally: using <$> we can add
semantic functions to parsers.
A parser for the SequenceOfS grammar that returns the abstract syntax tree of the
input, i.e., a value of type S, see Section 2.6, is defined as follows:
sequenceOfS :: Parser Char S
sequenceOfS = Beside <$> sequenceOfS <∗> sequenceOfS
          <|> const Single <$> symbol ’s’
But if you try to run this function, you will get a stack overflow! If you apply
sequenceOfS to a string, the first thing it does is to apply itself to the same string,
which loops. The problem stems from the fact that the underlying grammar is left-
recursive. For any left-recursive grammar, a systematically constructed parser using
parser combinators will exhibit the problem that it loops. However, in Section 2.5
we have shown how to remove the left recursion in the SequenceOfS grammar. The
resulting grammar is used to obtain the following parser:
sequenceOfS' :: Parser Char SA
sequenceOfS' = const ConsS <$> symbol ’s’ <∗> parseZ
           <|> const SingleS <$> symbol ’s’
  where parseZ = ConsZ <$> sequenceOfS' <∗> parseZ
             <|> SingleZ <$> sequenceOfS'
This example is a direct translation of the grammar obtained by applying the grammar
transformation that removes left recursion. There exists a much simpler parser for
parsing sequences of s’s, sketched below.
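One possibility is a parser that merely counts the s’s instead of building an abstract syntax tree (a sketch; the name countS is ours). It avoids the left recursion because a symbol is consumed before the recursive call:

countS :: Parser Char Int
countS = (λ_ n → n + 1) <$> symbol ’s’ <∗> (countS <|> succeed 0)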
Exercise 3.4. Prove for all f :: a → b that
f <$>succeed a = succeed (f a)
In the sequel we will often use this rule for constant functions f , i. e., f = λ_ → c for some
term c.
Exercise 3.5. Consider the parser (:) <$> symbol ’a’. Give its type and show its results
on inputs [] and x : xs.
Exercise 3.6. Consider the parser (:) <$> symbol ’a’ <∗> p. Give its type and show its
results on inputs [] and x : xs.
Exercise 3.7. Define a parser for Booleans.
Exercise 3.8 (no answer provided). Define parsers for each of the basic languages defined
in Section 2.3.1.
Exercise 3.9. Consider the grammar for palindromes that you have constructed in Exer-
cise 2.7.
1. Give the datatype Pal2 that corresponds to this grammar.
2. Define a parser palin2 that returns parse trees for palindromes. Test your function
with the palindromes cPal1 = "abaaba" and cPal2 = "baaab". Compare the results
with your answer to Exercise 2.21.
3. Define a parser palina that counts the number of a’s occurring in a palindrome.
Exercise 3.10. Consider the grammar for a part of the English language that is given in
Exercise 2.46.
1. Give the datatype English that corresponds to this grammar.
2. Define a parser english that returns parse trees for the English language. Test your
function with the sentence they are flying planes. Compare the result to your
answer of Exercise 2.46.
Exercise 3.11. When defining the priority of the <|> operator with the infixr keyword,
we also specified that the operator associates to the right. Why is this a better choice than
association to the left?
Exercise 3.12. Define a parser combinator <,> that combines two parsers. The value
returned by the combined parser is a tuple containing the results of the two component
parsers. What is the type of this parser combinator?
Exercise 3.13. The term ‘parser combinator’ is in fact not an adequate description for <$>.
Can you think of a better word?
Exercise 3.14. Compare the type of <$> with the type of the standard function map. Can
you describe your observations in an easy-to-remember, catchy phrase?
Exercise 3.15. Define <∗> in terms of <,> and <$>. Define <,> in terms of <∗> and <$>.
Exercise 3.16. If you examine the definitions of <∗> and <$> in Listing 3.3, you can observe
that <$> is in a sense a special case of <∗>. Can you define <$> in terms of <∗>?
3.3.1. Matching parentheses: an example
Using parser combinators, it is often fairly straightforward to construct a parser for
a language for which you have a grammar. Consider, for example, the grammar that
you wrote in Exercise 2.44:
S → ( S ) S | ε
This grammar can be directly translated to a parser, using the parser combinators
<∗> and <|>. We use <∗> when symbols are written next to each other, and <|>
when | appears in a production (or when there is more than one production for a
nonterminal).
parens :: Parser Char ??? — ill-typed
parens = symbol ’(’ <∗> parens <∗> symbol ’)’ <∗> parens
     <|> epsilon
However, this function is not correctly typed: the parsers in the first alternative
cannot be composed using <∗>, as for example symbol ’(’ is not a parser returning
a function.
But we can postprocess the parser symbol ’(’ so that, instead of a character, this
parser does return a function. So, what function should we use? This depends on the
kind of value that we want as a result of the parser. A nice result would be a tree-
like description of the parentheses that are parsed. For this purpose we introduce
an abstract syntax, see Section 2.6, for the parentheses grammar. We obtain the
following Haskell datatype:
data Parentheses = Match Parentheses Parentheses
                 | Empty
data Parentheses = Match Parentheses Parentheses
                 | Empty
  deriving Show

open  = symbol ’(’
close = symbol ’)’

parens :: Parser Char Parentheses
parens = (λ_ x _ y → Match x y)
         <$> open <∗> parens <∗> close <∗> parens
     <|> succeed Empty

nesting :: Parser Char Int
nesting = (λ_ x _ y → max (1 + x) y)
          <$> open <∗> nesting <∗> close <∗> nesting
      <|> succeed 0
Listing 3.4: ParseParentheses.hs
For example, the sentence ()() is represented by
Match Empty (Match Empty Empty)
Suppose we want to calculate the number of parentheses in a sentence. The number
of parentheses is calculated by the function nrofpars, which is defined by induction
on the datatype Parentheses.
nrofpars :: Parentheses → Int
nrofpars (Match pl pr) = 2 + nrofpars pl + nrofpars pr
nrofpars Empty = 0
Using the datatype Parentheses, we can add ‘semantic functions’ to the parser. We
then obtain the definition of parens in Listing 3.4.
By varying the function used as a first argument of <$> (the ‘semantic function’), we
can return other things than parse trees. As an example we construct a parser that
calculates the nesting depth of nested parentheses, see the function nesting defined
in Listing 3.4.
A session in which nesting is used may look like this:
? nesting "()(())()"
[(2,[]), (2,"()"), (1,"(())()"), (0,"()(())()")]
? nesting "())"
[(1,")"), (0,"())")]
As you can see, when there is a syntax error in the argument, there are no solutions
with empty rest string. It is fairly simple to test whether a given string belongs to
the language that is parsed by a given parser.
Exercise 3.17. What is the type of the function f = λ_ x _ y → Match x y which appears in
function parens in Listing 3.4? What is the type of the parser open? Using the type of <$>,
what is the type of f <$> open? Can f <$> open be used as a left hand side of <∗> parens?
What is the type of the result?
Exercise 3.18. What is a convenient way for <∗> to associate? Does it?
Exercise 3.19. Write a function test that determines whether or not a given string belongs
to the language parsed by a given parser.
3.4. More parser combinators
In principle you can build parsers for any context-free language using the combinators
<∗> and <|>, but in practice it is easier to have some more parser combinators
available. In traditional grammar formalisms, additional symbols are used to describe
for example optional or repeated constructions. Consider for example the BNF for-
malism, in which originally only sequential and alternative composition can be used
(denoted by juxtaposition and vertical bars, respectively), but which was later ex-
tended to EBNF to also allow for repetition, denoted by a star. The goal of this
section is to show how the set of parser combinators can be extended.
3.4.1. Parser combinators for EBNF
It is very easy to make new parser combinators for EBNF. As a first example we
consider repetition. Given a parser p for a construction, many p constructs a parser
for zero or more occurrences of that construction:
many :: Parser s a → Parser s [a]
many p = (:) <$> p <∗> many p
     <|> succeed []
So the EBNF expression P∗ is implemented by many P. The function (:) is just the
cons-operator for lists: it takes a head element and a tail list and combines them.
The order in which the alternatives are given only influences the order in which
solutions are placed in the list of successes.
For example, the many combinator can be used in parsing a natural number:
natural :: Parser Char Int
natural = foldl f 0 <$>many newdigit
where f a b = a ∗ 10 + b
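For example (a sketch of a session; the exact printing of the list of successes may differ), applying natural to an input that starts with digits yields the greedy parse first:

? natural "123ab"
[(123, "ab"), (12, "3ab"), (1, "23ab"), (0, "123ab")]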
— EBNF parser combinators
option :: Parser s a → a → Parser s a
option p d = p <|> succeed d

many :: Parser s a → Parser s [a]
many p = (:) <$> p <∗> many p <|> succeed []

many1 :: Parser s a → Parser s [a]
many1 p = (:) <$> p <∗> many p

pack :: Parser s a → Parser s b → Parser s c → Parser s b
pack p r q = (λ_ x _ → x) <$> p <∗> r <∗> q

listOf :: Parser s a → Parser s b → Parser s [a]
listOf p s = (:) <$> p <∗> many ((λ_ x → x) <$> s <∗> p)

— Auxiliary functions

first :: Parser s b → Parser s b
first p xs | null r    = []
           | otherwise = [head r]
  where r = p xs

greedy, greedy1 :: Parser s b → Parser s [b]
greedy  = first . many
greedy1 = first . many1
Listing 3.5: EBNF.hs
Defined in this way, the natural parser also accepts empty input as a number. If
this is not desired, we had better use the many1 parser combinator, which accepts
one or more occurrences of a construction, and corresponds to the EBNF expression
P+, see Section 2.7. It is defined in Listing 3.5. Another combinator from EBNF
is the option combinator P?. It takes a parser as argument, and returns a parser
that recognises the same construct, but which also succeeds if that construct is not
present in the input string. The definition is given in Listing 3.5. It has an additional
argument: the value that should be used as result in case the construct is not present.
It is a kind of ‘default’ value.
The use of the option and many functions introduces a large number of backtracking
possibilities. This is not always advantageous. For example, if we define a
parser for identifiers by

identifier = many1 (satisfy isAlpha)

a single word may also be parsed as two identifiers. Because of the order of the
alternatives in the definition of many (succeed [] appears as the second alternative),
the ‘greedy’ parsing, which accumulates as many letters as possible in the identifier,
is tried first; but if parsing fails elsewhere in the sentence, less greedy parsings
of the identifier are also tried – in vain. You will give a better definition of identifier in
Exercise 3.27.
In situations where we can predict, from the way the grammar is built, that it is
hopeless to try non-greedy results of many, we can define a parser transformer first
that transforms a parser into a parser that only returns the first possible parsing. It
does so by taking the first element of the list of successes.
first :: Parser a b → Parser a b
first p xs | null r    = []
           | otherwise = [head r]
  where r = p xs
Using this function, we can create a special ‘take all or nothing’ version of many:
greedy  = first . many
greedy1 = first . many1
If we compose the first function with the option parser combinator:
obligatory p d = first (option p d)
we get a parser which must accept a construction if it is present, but which does not
fail if it is not present.
3.4.2. Separators
The combinators many, many1 and option are classical in compiler construction
– there are notations for them in EBNF (∗, + and ?, respectively) – but there is no
need to leave it at that. For example, in many languages constructions are frequently
need to leave it at that. For example, in many languages constructions are frequently
enclosed between two meaningless symbols, most often some sort of parentheses. For
this case we design a parser combinator pack. Given a parser for an opening token,
a body, and a closing token, it constructs a parser for the enclosed body, as defined
in Listing 3.5. Special cases of this combinator are:
parenthesised p = pack (symbol ’(’) p (symbol ’)’)
bracketed p = pack (symbol ’[’) p (symbol ’]’)
compound p = pack (token "begin") p (token "end")
Another frequently occurring construction is repetition of a certain construction,
where the elements are separated by some symbol. You may think of lists of ar-
guments (expressions separated by commas), or compound statements (statements
separated by semicolons). For the parse trees, the separators are of no importance.
The function listOf below generates a parser for a non-empty list, given a parser for
the items and a parser for the separators:
listOf :: Parser s a → Parser s b → Parser s [a]
listOf p s = (:) <$> p <∗> many ((λ_ x → x) <$> s <∗> p)
Useful instantiations are:
commaList, semicList :: Parser Char a → Parser Char [a]
commaList p = listOf p (symbol ’,’)
semicList p = listOf p (symbol ’;’)
A somewhat more complicated variant of the function listOf is the case where the
separators carry a meaning themselves. For example, in arithmetical expressions
the operators that separate the subexpressions have to be part of the parse
tree. For this case we will develop the functions chainr and chainl . These functions
expect that the parser for the separators returns a function (!); that function is used
to combine parse trees for the items. In the case of chainr the operator is
applied right-to-left, in the case of chainl it is applied left-to-right. The functions
chainr and chainl are defined in Listing 3.6 (remember that $ is function application:
f $ x = f x).
The definitions look quite complicated, but when you look at the underlying grammar
they are quite straightforward. Suppose we apply operator ⊕ (⊕ is an operator
variable, it denotes an arbitrary right-associative operator) from right to left, so
e1 ⊕ e2 ⊕ e3 ⊕ e4 = e1 ⊕ (e2 ⊕ (e3 ⊕ e4)) = ((e1⊕) . (e2⊕) . (e3⊕)) e4
— Chain expression combinators
chainr :: Parser s a → Parser s (a → a → a) → Parser s a
chainr pe po = h <$> many (j <$> pe <∗> po) <∗> pe
  where j x op = (x ‘op‘)
        h fs x = foldr ($) x fs

chainl :: Parser s a → Parser s (a → a → a) → Parser s a
chainl pe po = h <$> pe <∗> many (j <$> po <∗> pe)
  where j op x = (‘op‘ x)
        h x fs = foldl (flip ($)) x fs
Listing 3.6: Chains.hs
It follows that we can parse such expressions by parsing many pairs of expressions
and operators, turning them into functions, and applying all those functions to the
last expression. This is done by function chainr, see Listing 3.6.
If operator ⊕ is applied from left to right, then

e1 ⊕ e2 ⊕ e3 ⊕ e4 = ((e1 ⊕ e2) ⊕ e3) ⊕ e4 = ((⊕e4) . (⊕e3) . (⊕e2)) e1
So such an expression can be parsed by first parsing a single expression (e1), and then
parsing many pairs of operators and expressions, turning them into functions, and
applying all those functions to the first expression. This is done by function chainl ,
see Listing 3.6.
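As an illustration (a sketch; the name subtractions is ours, and natural is the parser defined earlier), chainl can evaluate a left-associative operator directly while parsing:

subtractions :: Parser Char Int
subtractions = chainl natural (const (−) <$> symbol ’-’)

Its first success on the input "10-3-2" should be (5, ""), reflecting the left-to-right grouping (10 − 3) − 2.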
Functions chainl and chainr can be made more efficient by avoiding the construction
of the intermediate list of functions. The resulting definitions can be found in the
article by Fokker [5].
Note that functions chainl and chainr are very similar; the only difference is that
everything is ‘turned around’: function j of chainr takes a value and an operator,
and returns the function obtained by ‘left’ applying the operator; function j of chainl
takes an operator and a value, and returns the function obtained by ‘right’ applying
the operator to the value. Such functions are sometimes called dual .
Exercise 3.20.
1. Define a parser that analyses a string and recognises a list of digits separated by a
space character. The result is a list of integers.
2. Define a parser sumParser that recognises digits separated by the character ’+’ and
returns the sum of these integers.
3. Both parsers return a list of solutions. What should be changed in order to get only
one solution?
Exercise 3.21. What is the value of
many (symbol ’a’) xs
for xs ∈ {[], [’a’], [’b’], [’a’, ’b’], [’a’, ’a’, ’b’]}?
Exercise 3.22. Consider the application of the parser many (symbol ’a’) to the string aaa.
In what order do the four possible parsings appear in the list of successes?
Exercise 3.23 (no answer provided). Using the parser combinators option, many and many1,
define parsers for each of the basic languages defined in Section 2.3.1.
Exercise 3.24. As another variation on the theme ‘repetition’, define a parser combinator
psequence that transforms a list of parsers for some type into a parser returning a list of
elements of that type. What is the type of psequence? Also define a combinator choice that
iterates the operator <|>.
Exercise 3.25. As an application of psequence, define the function token that was discussed
in Section 3.2.
Exercise 3.26 (no answer provided). Carefully analyse the semantic functions in the defi-
nition of chainl in Listing 3.6.
Exercise 3.27. In real programming languages, identifiers follow rather flexible rules: the
first symbol must be a letter, but the symbols that follow (if any) may be a letter, digit, or
underscore symbol. Define a more realistic parser identifier.
3.5. Arithmetical expressions
The goal of this section is to use parser combinators in a concrete application. We
will develop a parser for arithmetical expressions, which have the following concrete
syntax:
E → E + E
  | E - E
  | E / E
  | ( E )
  | Digs
Besides these productions, we also have productions for identifiers and applications
of functions:
E → Identifier
  | Identifier ( Args )
Args → ε | E (, E)∗
The parse trees for this grammar are of type Expr:
data Expr = Con Int
          | Var String
          | Fun String [Expr]
          | Expr :+: Expr
          | Expr :−: Expr
          | Expr :∗: Expr
          | Expr :/: Expr
You can almost recognise the structure of the parser in this type definition. But in
order to account for the priorities of the operators, we will use a grammar with three
non-terminals ‘expression’, ‘term’ and ‘factor’: an expression is composed of terms
separated by + or −; a term is composed of factors separated by ∗ or /, and a factor
is a constant, variable, function call, or expression between parentheses.
This grammar appears as a parser in the functions in Listing 3.7.
The first parser, fact, parses factors.
fact :: Parser Char Expr
fact = Con <$> integer
   <|> Var <$> identifier
   <|> Fun <$> identifier <∗> parenthesised (commaList expr)
   <|> parenthesised expr
The first alternative is an integer parser which is postprocessed by the ‘semantic
function’ Con. The second and third alternatives yield a variable or a function call,
depending on the presence of an argument list: in the absence of an argument list the
function Var is applied, and in its presence the function Fun. For the fourth alternative there is no
semantic function, because the meaning of an expression between parentheses is the
meaning of the expression.
For the definition of a term as a list of factors separated by multiplicative operators
we use the function chainl . Recall that chainl repeatedly recognises its first argument
(fact), separated by its second argument (a ∗ or /). The parse trees for the individual
factors are joined by the constructor functions that appear before <$>. We use chainl
and not chainr because the operator ’/’ is considered to be left-associative.
The function expr is analogous to term, only with additive operators instead of
multiplicative operators, and with terms instead of factors.
— Type definition for parse tree
data Expr = Con Int
          | Var String
          | Fun String [Expr]
          | Expr :+: Expr
          | Expr :−: Expr
          | Expr :∗: Expr
          | Expr :/: Expr

— Parser for expressions with two priorities

fact :: Parser Char Expr
fact = Con <$> integer
   <|> Var <$> identifier
   <|> Fun <$> identifier <∗> parenthesised (commaList expr)
   <|> parenthesised expr

integer :: Parser Char Int
integer = (const negate <$> symbol ’-’) ‘option‘ id <∗> natural

term :: Parser Char Expr
term = chainl fact
         (    const (:∗:) <$> symbol ’*’
          <|> const (:/:) <$> symbol ’/’
         )

expr :: Parser Char Expr
expr = chainl term
         (    const (:+:) <$> symbol ’+’
          <|> const (:−:) <$> symbol ’-’
         )
Listing 3.7: ExpressionParser.hs
This example clearly shows the strength of parsing with parser combinators. There is
no need for a separate formalism for grammars; the production rules of the grammar
are combined with higher-order functions. Also, there is no need for a separate
parser generator (like ‘yacc’); the functions can be viewed both as description of the
grammar and as an executable parser.
Exercise 3.28. 1. Give the parse trees for the expressions "abc", "(abc)", "a*b+1",
"a*(b+1)", "-1-a", and "a(1,b)".
2. Why is the parse tree for the expression "a(1,b)" not the first solution of the parser?
Modify the functions in Listing 3.7 in such a way that it will be.
Exercise 3.29. A function with no arguments such as "f()" is not accepted by the parser.
Explain why, and modify the parser in such a way that it is accepted.
Exercise 3.30. Modify the functions in Listing 3.7, in such a way that + is parsed as a
right-associative operator, and - is parsed as a left-associative operator.
3.6. Generalised expressions
This section generalises the parser in the previous section with respect to priorities.
Arithmetical expressions in which operators have more than two levels of priority can
be parsed by writing more auxiliary functions between term and expr. The function
chainl is used in each definition, with as first argument the function of one priority
level lower.
If there are nine levels of priority, we obtain nine copies of almost the same text. This
is not as it should be. Functions that resemble each other are an indication that we
should write a generalised function, where the differences are described using extra
arguments. Therefore, let us inspect the differences in the definitions of term and
expr again. These are:
• The operators and associated tree constructors that are used in the second
argument of chainl
• The parser that is used as first argument of chainl
The generalised function will take these two differences as extra arguments: the first
in the form of a list of pairs, the second in the form of a parse function:
type Op a = (Char, a → a → a)
gen :: [Op a] → Parser Char a → Parser Char a
gen ops p = chainl p (choice (map f ops))
where f (s, c) = const c <$>symbol s
If furthermore we define as shorthand:
multis = [(’*’, (:∗:)), (’/’, (:/:))]
addis = [(’+’, (:+:)), (’-’, (:−:))]
then expr and term can be defined as partial parametrisations of gen:
expr = gen addis term
term = gen multis fact
By expanding the definition of term in that of expr we obtain:
expr = addis ‘gen‘ (multis ‘gen‘ fact)
which an experienced functional programmer immediately recognises as an applica-
tion of foldr:
expr = foldr gen fact [addis, multis]
From this definition a generalisation to more levels of priority is simply a matter of
extending the list of operator-lists.
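For instance (a sketch; the names calc, addOps and mulOps are ours), nothing forces the carrier type to be Expr: with Int as result type the same scheme yields a small calculator, and an extra priority level would be just one more element in the list. Here the ‘factor’ level is simply integer, so there are no parentheses or variables:

calc :: Parser Char Int
calc = foldr gen integer [addOps, mulOps]
  where addOps = [(’+’, (+)), (’-’, (−))]
        mulOps = [(’*’, (∗)), (’/’, div)]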
The very compact formulation of the parser for expressions with an arbitrary number
of priority levels is possible because the parser combinators can be used together with
the existing mechanisms for generalisation and partial parametrisation in Haskell.
Contrary to conventional approaches, the levels of priority need not be coded explicitly
with integers. The only thing that matters is the relative position of an operator
in the list of ‘list with operators of the same priority’. Also, the insertion of new
priority levels is very easy. The definitions are summarised in Listing 3.8.
Summary
This chapter shows how to construct parsers from simple combinators. It shows
how a small parser combinator library can be a powerful tool in the construction of
parsers. Furthermore, this chapter gives a rather basic implementation of the parser
combinator library. More advanced implementations are discussed elsewhere.
3.7. Exercises
Exercise 3.31. How should the parser of Section 3.6 be adapted to also allow raising an
expression to the power of an expression?
Exercise 3.32. Prove the following laws
h <$>(f <$>p) = (h . f ) <$>p (3.1)
h <$>(p <[>q) = (h <$>p) <[>(h <$>q) (3.2)
h <$>(p <∗>q) = ((h.) <$>p) <∗>q (3.3)
— Parser for expressions with arbitrarily many priorities
type Op a = (Char, a → a → a)
fact' :: Parser Char Expr
fact' = Con <$> integer
    <|> Var <$> identifier
    <|> Fun <$> identifier <∗> parenthesised (commaList expr')
    <|> parenthesised expr'

gen :: [Op a] → Parser Char a → Parser Char a
gen ops p = chainl p (choice (map f ops))
  where f (s, c) = const c <$> symbol s

expr' :: Parser Char Expr
expr' = foldr gen fact' [addis, multis]

multis = [(’*’, (:∗:)), (’/’, (:/:))]
addis  = [(’+’, (:+:)), (’-’, (:−:))]
Listing 3.8: GExpressionParser.hs
Exercise 3.33. Consider your answer to Exercise 2.23. Define a combinator parser pMir
that transforms a concrete representation of a mirror-palindrome into an abstract one. Test
your function with the concrete mirror-palindromes cMir1 and cMir2.
Exercise 3.34. Consider your answer to Exercise 2.25. Assuming the comma is an associative
operator, we can give the following abstract syntax for bit-lists:
data BitList = SingleB Bit | ConsB Bit BitList
Define a combinator parser pBitList that transforms a concrete representation of a bit-list
into an abstract one. Test your function with the concrete bit-lists cBitList1 and cBitList2.
Exercise 3.35. Define a parser for fixed-point numbers, that is, numbers like 12.34 and
-123.456. Integers are acceptable as well. Notice that the part following the decimal point
looks like an integer, but has a different semantics!
Exercise 3.36. Define a parser for floating point numbers, which are fixed-point numbers
followed by an optional E and a (positive or negative) integer exponent.
Exercise 3.37. Define a parser for Java assignments that consist of a variable, an = sign,
an expression and a semicolon.
Exercise 3.38 (no answer provided). Define a parser for (simplified) Java statements.
Exercise 3.39 (no answer provided). Outline the construction of a parser for Java programs.
4. Grammar and Parser design
The previous chapters have introduced many concepts related to grammars and
parsers. The goal of this chapter is to review these concepts, and to show how
they are used in the design of grammars and parsers.
The design of a grammar and parser for a language consists of several steps: you
have to
1. give example sentences of the language for which you want to design a grammar
and a parser;
2. give a grammar for the language for which you want to have a parser;
3. test that the grammar can indeed describe the example sentences;
4. analyse this grammar to find out whether or not it has some desirable proper-
ties;
5. possibly transform the grammar to obtain some of these desirable properties;
6. decide on the type of the parser: Parser a b, that is, decide on both the input
type a of the parser (which may be the result type of a scanner), and the result
type b of the parser.
7. construct a basic parser;
8. add semantic functions;
9. test that the parser can parse the example sentences you have given in the first
step, and that the parser returns what you expect.
We will describe and exemplify each of these steps in detail in the rest of this chapter.
As a running example we will construct a grammar and parser for travelling schemes
for day trips, of the following form:
Groningen 8:37 9:44 Zwolle 9:49 10:15 Utrecht 10:21 11:05 Den Haag
We might want to do several things with such a schema, for example:
1. compute the net travel time, i. e., the travel time minus the waiting time (2
hours and 17 minutes in the above example);
2. compute the total time one has to wait on the intermediate stations (11 min-
utes).
This chapter defines functions to perform these computations.
4.1. Step 1: Example sentences for the language
We have already given an example sentence above:
Groningen 8:37 9:44 Zwolle 9:49 10:15 Utrecht 10:21 11:05 Den Haag
Other example sentences are:
Utrecht Centraal 10:25 10:58 Amsterdam Centraal
Assen
4.2. Step 2: A grammar for the language
The starting point for designing a parser for your language is to define a grammar that
describes the language as precisely as possible. It is important to convince yourself
that the grammar you give really generates the desired language, since
the grammar will be the basis for grammar transformations, which might turn the
grammar into a set of incomprehensible productions.
For the language of travelling schemes, we can give several grammars. The following
grammar focuses on the fact that a trip consists of zero or more departures and
arrivals.
TS → TS Departure Arrival TS | Station
Station → Identifier+
Departure → Time
Arrival → Time
Time → Nat : Nat
where Identifier and Nat have been defined in Section 2.3.1. So a travelling scheme is
a sequence of departure and arrival times, separated by stations. Note that a single
station is also a travelling scheme with this grammar.
Another grammar focuses on changing at a station:
TS → Station Departure (Arrival Station Departure)∗ Arrival Station
   | Station
So each travelling scheme starts and ends at a station, and in between there is a list
of intermediate stations.
4.3. Step 3: Testing the grammar
Both grammars we have given in step 2 describe the example sentences given in
step 1. The derivation of these sentences using these grammars is easy.
4.4. Step 4: Analysing the grammar
To parse sentences of a language efficiently, we want to have an unambiguous grammar
that is left-factored and not left recursive. Depending on the parser we want to obtain,
we might desire other properties of our grammar. So a first step in designing a parser
is analysing the grammar, and determining which properties are (not) satisfied. We
have not yet developed tools for grammar analysis (we will do so in the chapter on
LL(1) parsing) but for some grammars it is easy to detect some properties.
The first example grammar is left and right recursive: the first production for TS
starts and ends with TS. Furthermore, the sequence Departure Arrival is an associative
separator in the generated language.
These properties may be used for transforming the grammar. Since we do not mind
right recursion, we will not make use of the fact that the grammar is right
recursive. The other properties will be used in grammar transformations in the
following subsection.
4.5. Step 5: Transforming the grammar
Since the sequence Departure Arrival is an associative separator in the generated
language, the productions for TS may be transformed into:
TS → Station | Station Departure Arrival TS (4.1)
Thus we have removed the left recursion in the grammar. Both productions for
TS start with the nonterminal Station, so TS can be left factored. The resulting
productions are:
TS → Station Z
Z → ε | Departure Arrival TS
We can also apply equivalence (2.1) to the two productions for TS from (4.1), and
obtain the following single production:
TS → (Station Departure Arrival)∗ Station (4.2)
So which productions do we take for TS? This depends on what we want to do with
the parsed sentences. We will show several choices in the next section.
4.6. Step 6: Deciding on the types
We want to write a parser for travel schemes, that is, we want to write a function ts
of type
ts :: Parser ? ?
The question marks should be replaced by the input type and the result type, respec-
tively. For the input type we can choose between at least two possibilities: characters,
Char or tokens Token. The type of tokens can be chosen as follows:
data Token = Station_Token Station | Time_Token Time
type Station = String
type Time = (Int, Int)
We will construct a parser for both input types in the next subsection. So ts has one
of the following two types.
ts :: Parser Char ?
ts :: Parser Token ?
For the result type we have many choices. If we just want to compute the total
travelling time, Int suffices for the result type. If we want to compute the total
travelling time, the total waiting time, and a nicely printed version of the travelling
scheme, we may do several things:
• define three parsers, with Int (total travelling time), Int (total waiting time),
and String (nicely printed version) as result type, respectively;
• define a single parser with the triple (Int, Int, String) as result type;
• define an abstract syntax for travelling schemes, say a datatype TS, and define
three functions on TS that compute the desired results.
The first alternative parses the input three times, and is rather inefficient compared
with the other alternatives. The second alternative is hard to extend if we want
to compute something extra, but in some cases it might be more efficient than the
third alternative. The third alternative needs an abstract syntax. There are several
ways to define an abstract syntax for travelling schemes. The first abstract syntax
corresponds to definition (4.1) of grammar TS.
data TS1 = Single1 Station
         | Cons1 Station Time Time TS1

where Station and Time are defined above. A second abstract syntax corresponds
to the grammar for travelling schemes defined in (4.2).

type TS2 = ([(Station, Time, Time)], Station)
So a travelling scheme is a tuple, the first component of which is a list of triples
consisting of a departure station, a departure time, and an arrival time, and the
second component of which is the final arrival station. A third abstract syntax
corresponds to the second grammar defined in Section 4.2:
data TS3 = Single3 Station
         | Cons3 (Station, Time, [(Time, Station, Time)], Time, Station)
Which abstract syntax should we take? Again, this depends on what we want to do
with the abstract syntax. Since TS2 and TS1 combine departure and arrival times
in a tuple, they are convenient to use when computing travelling times. TS3 is useful
when we want to compute waiting times since it combines arrival and departure times
in one constructor. Often we want to exactly mimic the productions of the grammar
in the abstract syntax, so if we use grammar (4.1) for travelling schemes, we use TS1
for the abstract syntax. Note that TS1 is a datatype, whereas TS2 is a type. TS1
cannot be defined as a type because of the two alternative productions for TS. TS2
can be defined as a datatype by adding a constructor. Types and datatypes each
have their advantages and disadvantages; the application determines which to use.
The result type of the parsing function ts may be one of the types mentioned earlier (Int,
etc.), or one of TS1, TS2, TS3.
4.7. Step 7: Constructing the basic parser
Converting a grammar to a parser is a mechanical process that consists of a set
of simple replacement rules. Functional programming languages offer some extra
flexibility that we sometimes use, but usually writing a parser is a simple translation.
We use the following replacement rules.
grammar construct               Haskell/parser construct
→                               =
|                               <|>
(space)                         <∗>
+                               many1
∗                               many
?                               option
terminal x                      symbol x
begin of sequence of symbols    undefined <$>
Note that we start each sequence of symbols with undefined <$>. The undefined has
to be replaced by an appropriate semantic function in Step 8, but putting undefined
here ensures type correctness of the parser. Of course, running the parser will result
in an error.
We construct a basic parser for each of the input types Char and Token.
4.7.1. Basic parsers from strings
Applying these rules to the grammar (4.2) for travelling schemes, we obtain the
following basic parser.
station :: Parser Char Station
station = undefined <$> many1 identifier

time :: Parser Char Time
time = undefined <$> natural <∗> symbol ’:’ <∗> natural

departure, arrival :: Parser Char Time
departure = undefined <$> time
arrival   = undefined <$> time

tsstring :: Parser Char ?
tsstring = undefined
           <$> many ( undefined
                      <$> spaces
                      <∗> station
                      <∗> spaces
                      <∗> departure
                      <∗> spaces
                      <∗> arrival
                    )
           <∗> spaces
           <∗> station

spaces :: Parser Char String
spaces = undefined <$> many (symbol ’ ’)
The only thing left to do is to add the semantic glue to the functions. The semantic
glue also determines the type of the function tsstring, which is denoted by ? for the
moment. For the other basic parsers we have chosen some reasonable return types.
The semantic functions are defined in the next and final step.
4.7.2. A basic parser from tokens
To obtain a basic parser from tokens, we first write a scanner that produces a list of
tokens from a string.
scanner :: String → [Token]
scanner = map mkToken . combine . words
combine :: [String] → [String]
combine [] = []
combine [x] = [x]
combine (x : y : xs) = if isAlpha (head x) ∧ isAlpha (head y)
                       then combine ((x ++ " " ++ y) : xs)
                       else x : combine (y : xs)

mkToken :: String → Token
mkToken xs = if isDigit (head xs)
             then Time_Token (mkTime xs)
             else Station_Token xs

parse_result :: [(a, b)] → a
parse_result xs
  | null xs   = error "parse_result: could not parse the input"
  | otherwise = fst (head xs)

mkTime :: String → Time
mkTime = parse_result . time
This is a basic scanner with very basic error messages, but it suffices for now. The
composition of the scanner with the function tstoken1 defined below gives the final
parser.
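A session with this scanner might look as follows (a sketch; it assumes a Show instance for Token and the semantic function for time from Step 8, and the exact rendering may differ):

? scanner "Utrecht Centraal 10:25 10:58 Amsterdam Centraal"
[Station_Token "Utrecht Centraal", Time_Token (10,25),
 Time_Token (10,58), Station_Token "Amsterdam Centraal"]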
tstoken1 :: Parser Token ?
tstoken1 = undefined
           <$> many ( undefined
                      <$> tstation
                      <∗> tdeparture
                      <∗> tarrival
                    )
           <∗> tstation

tstation :: Parser Token Station
tstation (Station_Token s : xs) = [(s, xs)]
tstation _                      = []

tdeparture, tarrival :: Parser Token Time
tdeparture (Time_Token (h, m) : xs) = [((h, m), xs)]
tdeparture _                        = []
tarrival (Time_Token (h, m) : xs) = [((h, m), xs)]
tarrival _                        = []
where again the semantic functions remain to be defined. Note that the functions
tdeparture and tarrival are identical; having both reflects the presence of both
nonterminals in the grammar.
Another basic parser from tokens is based on the second grammar of Section 4.2.
tstoken2 :: Parser Token ?
tstoken2 = undefined
           <$> tstation
           <∗> tdeparture
           <∗> many ( undefined
                      <$> tarrival
                      <∗> tstation
                      <∗> tdeparture
                    )
           <∗> tarrival
           <∗> tstation
       <|> undefined <$> tstation
4.8. Step 8: Adding semantic functions
Once we have the basic parsing functions, we need to add the semantic glue: the
functions that take the results of the elements in the right hand side of a production,
and convert them into the result of the left hand side. The basic rule is: Let the
types do the work!
First we add semantic functions to the basic parsing functions station, time, departure,
arrival , and spaces. Since the function many1 identifier returns a list of strings, and we
want to obtain the concatenation of these strings for the station name, we can take
the concatenation function concat for undefined in function station. To obtain a
value of type Time from an integer, a character, and an integer, we have to combine
the two integers in a tuple. So we take the following function

λx _ y → (x, y)
for undefined in time. Now, since function time returns a value of type Time, we can
take the identity function for undefined in departure and arrival , and then we replace
id <$>time by just time. Finally, the result of many is a string, so for undefined in
spaces we can take the identity function too.
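Filled in, the basic parsers then read as follows (a sketch of the result of this step):

station :: Parser Char Station
station = concat <$> many1 identifier

time :: Parser Char Time
time = (λx _ y → (x, y)) <$> natural <∗> symbol ’:’ <∗> natural

departure, arrival :: Parser Char Time
departure = time
arrival   = time

spaces :: Parser Char String
spaces = many (symbol ’ ’)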
The first semantic function for the basic parser tsstring defined in Section 4.7.1
returns an abstract syntax tree of type TS2. So the first undefined in tsstring should
return a tuple of a list of things of the correct type (the first component of the type
TS2) and a Station. Since many returns a list of things, we can construct such a
tuple by means of the function

λx _ y → (x, y)
provided many returns a value of the desired type: [(Station, Time, Time)]. Note
that this semantic function basically only throws away the value returned by the
spaces parser: we are not interested in the spaces between the components of our
travelling scheme. The many parser returns a value of the correct type if we replace
the second occurrence of undefined in tsstring by the function
λ_ x _ y _ z → (x, y, z)
Again, the results of spaces are thrown away. This completes a parser for travelling
schemes. The next semantic functions we define compute the net travel time. To
compute the net travel time, we have to compute the travel time of each trip from
a station to a station, and to add the travel times of all of these trips. We obtain
the travel time of a single trip if we replace the second occurrence of undefined in
tsstring by:
λ_ _ _ (xh, xm) _ (zh, zm) → (zh − xh) ∗ 60 + zm − xm
and Haskell’s prelude function sum sums these times, so for the first occurrence of
undefined we take:
λx _ _ → sum x
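Putting these two semantic functions into the skeleton of tsstring gives, for instance (a sketch; the name travelTime is ours):

travelTime :: Parser Char Int
travelTime = (λx _ _ → sum x)
             <$> many ( (λ_ _ _ (xh, xm) _ (zh, zm) → (zh − xh) ∗ 60 + zm − xm)
                        <$> spaces <∗> station <∗> spaces
                        <∗> departure <∗> spaces <∗> arrival
                      )
             <∗> spaces
             <∗> station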
The final set of semantic functions we define is used for computing the total waiting
time. Since the second grammar of Section 4.2 combines arrival times and departure
times, we use a parser based on this grammar: the basic parser tstoken2. We have
to give definitions of the three undefined semantic functions. If a trip consists of a
single station, there is no waiting time, so the last occurrence of undefined is the
function const 0. The second occurrence of undefined computes the waiting
time for one intermediate station:

λ(uh, um) _ (wh, wm) → (wh − uh) ∗ 60 + wm − um

Finally, the first occurrence of undefined sums the list of waiting times obtained by
means of the function that replaces the second occurrence of undefined:

λ_ _ x _ _ → sum x
4.9. Step 9: Did you get what you expected
In the last step you test your parser(s) to see whether or not you have obtained what
you expected, and whether or not you have made errors in the above process.
Summary
This chapter describes the different steps that have to be considered in the design of
a grammar and a parser for a language.
4.10. Exercises
Exercise 4.1. Write a parser floatLiteral for Java float-literals. The EBNF grammar for
float-literals is given by:
FloatLiteral → IntPart . FractPart? ExponentPart? FloatSuffix?
             | . FractPart ExponentPart? FloatSuffix?
             | IntPart ExponentPart FloatSuffix?
             | IntPart ExponentPart? FloatSuffix
IntPart → SignedInteger
FractPart → Digits
ExponentPart → ExponentIndicator SignedInteger
SignedInteger → Sign? Digits
Digits → Digits Digit | Digit
ExponentIndicator → e | E
Sign → + | -
FloatSuffix → f | F | d | D
To keep your parser simple, assume that all nonterminals, except for the nonterminal FloatLiteral ,
are represented by a String in the abstract syntax.
Exercise 4.2. Write an evaluator signedFloat for Java float-literals (the float-suffix may be
ignored).
Exercise 4.3. Up to the definition of the semantic functions, parsers constructed on a (fixed)
abstract syntax have the same shape. Give this parsing scheme for Java float literals.
5. Compositionality
Introduction
Many recursive functions follow a common pattern of recursion. These common
patterns of recursion can conveniently be captured by higher order functions. For
example: many recursive functions defined on lists are instances of the higher order
function foldr. It is possible to define a function such as foldr for a whole range of
datatypes other than lists. Such functions are called compositional. Compositional
functions on datatypes are defined in terms of algebras of semantic actions that
correspond to the constructors of the datatype. Compositional functions can typically
be used to define the semantics of programming language constructs. Such semantics
is referred to as algebraic semantics. Algebraic semantics often uses algebras that
contain functions from tuples to tuples. Such functions can be seen as computations
that read values from a component of the domain and write values to a component of
the codomain. The former values are called inherited attributes and the latter values
are called synthesised attributes. Attributes can be both inherited and synthesised.
As explained in Section 2.6, there is an important relationship between grammars
and compositionality: with every grammar, which describes the concrete syntax of a
language, one can associate a (possibly mutually recursive) datatype, which describes
the abstract syntax of the language. Compositional functions on these datatypes are
called syntax driven.
Goals
After studying this chapter and making the exercises you will
• know how to generalise constructors of a datatype to an algebra;
• know how to write compositional functions, also known as folds, on (possibly
mutually recursive) datatypes;
• understand the advantages of using folds, and have seen that many problems
can be solved with a fold;
• know that a fold applied to the constructors algebra is the identity function;
• have seen the notions of fusion and deforestation;
• know how to write syntax driven code;
• understand the connection between datatypes, abstract syntax and concrete
syntax;
• understand the notions of synthesised and inherited attributes;
• be able to associate inherited and synthesised attributes with the different alter-
natives of (possibly mutually recursive) datatypes (or the different nonterminals
of grammars);
• be able to define algebraic semantics in terms of compositional (or syntax
driven) code that is defined using algebras of computations which make use
of inherited and synthesised attributes.
Organisation
The chapter is organised as follows. Section 5.1 shows how to define compositional
recursive functions on built-in lists using a function which is similar to foldr and shows
how to do the same thing with user-defined lists and streams. Section 5.2 shows how
to define compositional recursive functions on several kinds of trees. Section 5.3
defines algebraic semantics. Section 5.4 shows the usefulness of algebraic semantics
by presenting an expression evaluator, an expression interpreter which makes use of
a stack, and an expression compiler to a stack machine. They only differ in the way
they handle basic expressions (variables and local definitions are handled in the same
way). All three examples use an algebra of computations which can read values from
and write values to an environment which binds names to values. In a second version
of the expression evaluator the use of inherited and synthesised attributes is made
more explicit by using tuples. Section 5.5 presents a relatively complex example of
the use of tuples in combination with compositionality. It deals with the problem of
variable scope when compiling block structured languages.
5.1. Lists
This section introduces compositional functions on the well known datatype of lists.
Compositional functions are defined on the built-in datatype [a] for lists (Section
5.1.1), on a user-defined datatype List a for lists (Section 5.1.2), and on streams or
infinite lists (Section 5.1.3). We also show how to construct an algebra that directly
corresponds to a datatype.
5.1.1. Built-in lists
The datatype of lists is perhaps the most important example of a datatype. A list is
either the empty list [] or a nonempty list (x : xs) consisting of an element x at the
head of the list, followed by a tail xs which itself is again a list. Thus, the type [x]
is recursive. In Haskell the type [x] for lists is built-in. The informal definition of
above corresponds to the following (pseudo) datatype.
data [x] = x : [x] | []
Many recursive functions on lists look very similar. Computing the sum of the el-
ements of a list of integers (sumL, where the L denotes that it is a sum function
defined on lists) is very similar to computing the product of the elements of a list of
integers (prodL).
sumL, prodL :: [Int] → Int
sumL (x : xs) = x + sumL xs
sumL [] = 0
prodL (x : xs) = x ∗ prodL xs
prodL [] = 1
The function sumL replaces the list constructor (:) by (+) and the list constructor
[] by 0 and the function prodL replaces the list constructor (:) by (∗) and the list
constructor [] by 1. Note that we have replaced the constructor (:) (a constructor
with two arguments) by binary operators (+) and (∗) (i.e. functions with two
arguments) and the constructor [] (a constructor with zero arguments) by constants 0 and
1 (i.e. ‘functions’ with zero arguments). The similarity between the definitions of the
functions sumL and prodL can be captured by the following higher order recursive
function foldL, which is nothing else but an uncurried version of the well known func-
tion foldr. Don’t confuse foldL with Haskell’s prelude function foldl , which works
the other way around.
foldL :: (x → l → l , l ) → [x] → l
foldL (op, c) = fold
where
fold (x : xs) = op x (fold xs)
fold [] = c
The function foldL recursively replaces the constructor (:) by an operator op and
the constructor [] by a constant c. We can now use foldL to compute the sum and
product of the elements of a list of integers as follows.
? foldL ((+),0) [1,2,3,4]
10
? foldL ((*),1) [1,2,3,4]
24
The pair (op, c) is often referred to as a list-algebra. More precisely, a list-algebra
consists of a type l (the carrier of the algebra), a binary operator op of type x → l → l
and a constant c of type l . Note that a type (like Int) can be the carrier of a list-
algebra in more than one way (for example using ((+), 0) and ((∗), 1)). Here is
another example of how to turn Int into a list-algebra.
? foldL (\_ n -> n+1,0) [1,2,3,4]
4
This list-algebra ignores the value at the head of a list, and increments the result
obtained thus far with one. It corresponds to the function sizeL defined by:
sizeL :: [x] → Int
sizeL (_ : xs) = 1 + sizeL xs
sizeL []       = 0
Note that the type of sizeL is more general than the types of sumL and prodL. The
type of the elements of the list does not play a role.
5.1.2. User-defined lists
In this subsection we present an example of a fold function defined on another
datatype than built-in lists. To keep things simple we redo the list example for
user-defined lists.
data List x = Cons x (List x) | Nil
User-defined lists are defined in the same way as built-in ones. The constructors
(:) and [] are replaced by constructors Cons and Nil . Here are the types of the
constructors Cons and Nil .
? :t Cons
Cons :: a -> List a -> List a
? :t Nil
Nil :: List a
An algebra type ListAlgebra corresponding to the datatype List directly follows the
structure of that datatype.
type ListAlgebra x l = (x → l → l , l )
The left hand side of the type definition is obtained from the left hand side of the
datatype as follows: a postfix Algebra is added at the end of the name List and a type
variable l is added at the end of the whole left hand side of the type definition. The
right hand side of the type definition is obtained from the right hand side of the data
definition as follows: all List x valued constructors are replaced by l valued functions
which have the same number of arguments (if any) as the corresponding constructors.
The types of recursive constructor arguments (i.e. arguments of type List x) are
replaced by recursive function arguments (i.e. arguments of type l ). The types of the
other arguments are simply left unchanged. In a similar way, the definition of a fold
function can be generated automatically from the data definition.
foldList :: ListAlgebra x l → List x → l
foldList (cons, nil ) = fold
where
fold (Cons x xs) = cons x (fold xs)
fold Nil = nil
The constructors Cons and Nil in the left hand sides of the definition of the local
function fold are replaced by functions cons and nil in the right hand sides. The
function fold is applied recursively to all recursive constructor arguments of type
List x to return a value of type l as required by the functions of the algebra (in this
case Cons and cons have one such recursive argument). The other arguments are
left unchanged. Recursive functions on user-defined lists which are defined by means
of foldList are called compositional. Every algebra defines a unique compositional
function. Here are three examples of compositional functions. They correspond to
the examples of Section 5.1.1.
sumList, prodList :: List Int → Int
sumList = foldList ((+), 0)
prodList = foldList ((∗), 1)
sizeList :: List x → Int
sizeList = foldList (const (1+), 0)
It is worth mentioning one particular ListAlgebra: the trivial ListAlgebra that re-
places Cons by Cons and Nil by Nil . This algebra defines the identity function on
user-defined lists.
idListAlgebra :: ListAlgebra x (List x)
idListAlgebra = (Cons, Nil )
idList :: List x → List x
idList = foldList idListAlgebra
? idList (Cons 1 (Cons 2 Nil))
Cons 1 (Cons 2 Nil)
5.1.3. Streams
In this section we consider streams (or infinite lists).
data Stream x = And x (Stream x)
Here is a standard example of a stream: the infinite list of fibonacci numbers.
fibStream :: Stream Int
fibStream = And 0 (And 1 (restOf fibStream))
where
restOf (And x stream@(And y _)) = And (x + y) (restOf stream)
The algebra type StreamAlgebra, and the fold function foldStream can be generated
automatically from the datatype Stream.
type StreamAlgebra x s = x → s → s
foldStream :: StreamAlgebra x s → Stream x → s
foldStream and = fold
where
fold (And x xs) = and x (fold xs)
Note that the algebra has only one component because Stream has only one construc-
tor. For the same reason the fold function is defined using only one equation. Here is
an example of using a compositional function on user defined streams. It computes
the first element of a monotone stream that is greater than or equal to a given value.

firstGreaterThan :: Ord x ⇒ x → Stream x → x
firstGreaterThan n = foldStream (λx y → if x ≥ n then x else y)
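For example, with the stream of Fibonacci numbers defined above, a session might look like this (output format approximate):

? firstGreaterThan 100 fibStream
144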
5.2. Trees
Now that we have seen how to generalise the foldr function on built-in lists to compo-
sitional functions on user-defined lists and streams we proceed by explaining another
common class of datatypes: trees. We will treat four different kinds of trees in the
subsections below:
• binary trees;
• trees for matching parentheses;
• expression trees;
• general trees.
Furthermore, we will briefly mention the concepts of fusion and deforestation.
5.2.1. Binary trees
A binary tree is either a node where the tree splits into two subtrees or a leaf which
holds a value.
data BinTree x = Bin (BinTree x) (BinTree x) | Leaf x
One can generate the corresponding algebra type BinTreeAlgebra and fold function
foldBinTree from the datatype automatically. Note that Bin has two recursive argu-
ments and that Leaf has one non-recursive argument.
type BinTreeAlgebra x t = (t → t → t, x → t)
foldBinTree :: BinTreeAlgebra x t → BinTree x → t
foldBinTree (bin, leaf ) = fold
where
fold (Bin l r) = bin (fold l ) (fold r)
fold (Leaf x) = leaf x
In the BinTreeAlgebra type, the bin part of the algebra has two arguments of type
t and the leaf part of the algebra has one argument of type x. Similarly, in the
foldBinTree function, the local fold function is applied recursively to both arguments
of bin and is not called on the argument of leaf . We can now define compositional
functions on binary trees much in the same way as we defined them on lists. Here is
an example: the function sizeBinTree computes the size of a binary tree.
sizeBinTree :: BinTree x → Int
sizeBinTree = foldBinTree ((+), const 1)
? sizeBinTree (Bin (Bin (Leaf 3) (Leaf 7)) (Leaf 11))
3
If a tree consists of a leaf, then sizeBinTree ignores the value at the leaf and returns
1 as the size of the tree. If a tree consists of two subtrees, then sizeBinTree returns
the sum of the sizes of those subtrees as the size of the tree. Functions for computing
the sum and the product of the integers at the leaves of a binary tree can be defined
in a similar way. It suffices to define appropriate semantic actions bin and leaf on a
type t (in this case Int) that correspond to the syntactic constructs Bin and Leaf of
the datatype BinTree.
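For instance, the following two folds (a small sketch; the names sumBinTree and prodBinTree are chosen here for illustration) compute the sum and the product of the values at the leaves:
sumBinTree, prodBinTree :: BinTree Int → Int
sumBinTree = foldBinTree ((+), id)
prodBinTree = foldBinTree ((∗), id)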
5.2.2. Trees for matching parentheses
Section 3.3.1 defines the datatype Parentheses for matching parentheses.
data Parentheses = Match Parentheses Parentheses
                  | Empty
For example, the sentence ()() of the concrete syntax for matching parentheses
is represented by the value Match Empty (Match Empty Empty) in the abstract
syntax Parentheses. Remember that the abstract syntax ignores the terminal bracket
symbols of the concrete syntax.
We can now define, in the same way as we did for lists and binary trees, an algebra
type ParenthesesAlgebra and a fold function foldParentheses, which can be used to
compute the depth (depthParentheses) and the width (widthParentheses) of match-
ing parentheses in a compositional way. The depth of a string of matching parentheses
s is the largest number of unmatched parentheses that occurs in a substring of s. For
example, the depth of the string ((()))() is 3. The width of a string of matching
parentheses s is the number of substrings that are matching parentheses them-
selves, which are not a substring of a surrounding string of matching parentheses.
For example, the width of the string ((()))() is 2. Compositional functions on
datatypes that describe the abstract syntax of a language are called syntax driven.
type ParenthesesAlgebra m = (m → m → m, m)
foldParentheses :: ParenthesesAlgebra m → Parentheses → m
foldParentheses (match, empty) = fold
where
fold (Match l r) = match (fold l ) (fold r)
fold Empty = empty
depthParenthesesAlgebra :: ParenthesesAlgebra Int
depthParenthesesAlgebra = (λx y → max (1 + x) y, 0)
widthParenthesesAlgebra :: ParenthesesAlgebra Int
widthParenthesesAlgebra = (λ_ y → 1 + y, 0)
depthParentheses, widthParentheses :: Parentheses → Int
depthParentheses = foldParentheses depthParenthesesAlgebra
widthParentheses = foldParentheses widthParenthesesAlgebra
parenthesesExample = Match (Match (Match Empty Empty) Empty)
(Match Empty
(Match (Match Empty Empty)
Empty
) )
? depthParentheses parenthesesExample
3
? widthParentheses parenthesesExample
3
Our example reveals that abstract syntax is not very well suited for interpretation
by human beings. What is the concrete representation of the matching parenthesis
example represented by parenthesesExample? It happens to be ((()))()(()). For-
tunately, we can easily write a program that computes the concrete representation
from the abstract one. We know exactly which terminals we have deleted when go-
ing from the concrete syntax to the abstract one. The algebra used by the function
a2cParentheses simply reinserts those terminals that we have deleted. Note that
a2cParentheses does not deal with layout such as blanks, indentation and newlines.
For a simple example layout does not really matter. For large examples layout is
very important: it can be used to let concrete representations look pretty.
a2cParenthesesAlgebra :: ParenthesesAlgebra String
a2cParenthesesAlgebra = (λxs ys → "(" ++ xs ++ ")" ++ ys, "")
a2cParentheses :: Parentheses → String
a2cParentheses = foldParentheses a2cParenthesesAlgebra
? a2cParentheses parenthesesExample
((()))()(())
This example illustrates that a computer can easily interpret abstract syntax (some-
thing human beings have difficulties with). Strangely enough, human beings can
easily interpret concrete syntax (something computers have difficulties with). What
we would really like is that computers can interpret concrete syntax as well. This
is the place where parsing enters the picture: computing an abstract representation
from a concrete one is precisely what parsers are used for.
Consider the functions parens and nesting of Section 3.3.1 again.
open = symbol '('
close = symbol ')'
parens :: Parser Char Parentheses
parens = (λ_ x _ y → Match x y)
  <$> open <∗> parens <∗> close <∗> parens
  <|> succeed Empty
nesting :: Parser Char Int
nesting = (λ_ x _ y → max (1 + x) y)
  <$> open <∗> nesting <∗> close <∗> nesting
  <|> succeed 0
Function nesting could have been defined by means of function parens and a fold:
nesting' :: Parser Char Int
nesting' = depthParentheses <$> parens
(Remember that depthParentheses has been defined as a fold.) Functions nesting
and nesting' compute exactly the same result. The function nesting is the fusion of
the fold function with the parser parens from the function nesting'. Using laws for
parsers and folds (not shown here) we can prove that the two functions are equal.
Note that function nesting' first builds a tree by means of function parens, and then
flattens it by means of the fold. Function nesting never builds a tree, and is thus
preferable for reasons of efficiency. On the other hand: in function nesting' we reuse
the parser parens and the function depthParentheses, in function nesting we have
to write our own parser, and convince ourselves that it is correct. So for reasons of
‘programming efficiency’ function nesting' is preferable. To obtain the best of both
worlds, we would like to write function nesting' and have our compiler figure out that
it is better to use function nesting in computations. The automatic transformation
of function nesting' into function nesting is called deforestation (trees are removed).
Some (very few) compilers are clever enough to perform this transformation
automatically.
5.2.3. Expression trees
The matching parentheses grammar has only one nonterminal. Therefore its ab-
stract syntax is described by a single datatype. In this section we look again at the
expression grammar of Section 3.5:
E → T
E → E + T
T → F
T → T * F
F → ( E )
F → Digs
This grammar has three nonterminals, E, T, and F. Using the approach from
Section 2.6 we transform the nonterminals to datatypes:
data E = E1 T | E2 E T
data T = T1 F | T2 T F
data F = F1 E | F2 Int
where we have translated Digs by the type Int. Note that this is a rather inconvenient
and clumsy abstract syntax for expressions; the following abstract syntax is more
convenient.
data Expr = Con Int | Add Expr Expr | Mul Expr Expr
However, to illustrate the concept of mutually recursive datatypes, we will study the
datatypes E, T, and F defined above. Since E uses T, T uses F, and F uses E,
these three types are mutually recursive. The main datatype of the three datatypes
is the one corresponding to the start symbol E. Since the datatypes are mutually
recursive, the algebra type EAlgebra consists of three tuples of functions and three
carriers (the main carrier is, as always, the one corresponding to the main datatype
and is therefore the one corresponding to the start-symbol).
type EAlgebra e t f = ((t → e, e → t → e)
, (f → t, t → f → t)
, (e → f , Int → f )
)
The fold function foldE for E also folds over T and F, so it uses three mutually
recursive local functions.
foldE :: EAlgebra e t f → E → e
foldE ((e1, e2), (t1, t2), (f1, f2)) = fold
  where
  fold  (E1 t)   = e1 (foldT t)
  fold  (E2 e t) = e2 (fold e) (foldT t)
  foldT (T1 f)   = t1 (foldF f)
  foldT (T2 t f) = t2 (foldT t) (foldF f)
  foldF (F1 e)   = f1 (fold e)
  foldF (F2 n)   = f2 n
We can now use foldE to write a syntax driven expression evaluator evalE. In the
algebra that is used in the foldE, all type variables e, f , and t are instantiated with
Int.
evalE :: E → Int
evalE = foldE ((id, (+)), (id, (∗)), (id, id))
exE = E2 (E1 (T2 (T1 (F2 2)) (F2 3))) (T1 (F2 1))
? evalE exE
7
Once again our example shows that abstract syntax cannot easily be interpreted by
human beings. Here is a function a2cE which does this job for us.
a2cE :: E → String
a2cE = foldE ((e1, e2), (t1, t2), (f1, f2))
  where e1 = λt → t
        e2 = λe t → e ++ "+" ++ t
        t1 = λf → f
        t2 = λt f → t ++ "*" ++ f
        f1 = λe → "(" ++ e ++ ")"
        f2 = λn → show n
? a2cE exE
"2*3+1"
5.2.4. General trees
A general tree consists of a node, holding a value, where the tree splits into a list of
subtrees. Notice that this list may be empty (in which case, of course, only the value
at the node is of interest). As usual, the type TreeAlgebra and the function foldTree
can be generated automatically from the data definition.
data Tree x = Node x [Tree x]
type TreeAlgebra x a = x → [a] → a
foldTree :: TreeAlgebra x a → Tree x → a
foldTree node = fold
where
fold (Node x gts) = node x (map fold gts)
Notice that the constructor Node has a list of recursive arguments. Therefore the
node function of the algebra has a corresponding list of recursive arguments. The local
fold function is recursively called on all elements of a list using the map function.
One can compute the sum of the values at the nodes of a general tree as follows:
sumTree :: Tree Int → Int
sumTree = foldTree (λx xs → x + sum xs)
Computing the product of the values at the nodes of a general tree and computing
the size of a general tree can be done in a similar way.
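For instance (a sketch; the names prodTree and sizeTree are chosen here for illustration, where the size of a general tree counts its nodes):
prodTree :: Tree Int → Int
prodTree = foldTree (λx xs → x ∗ product xs)
sizeTree :: Tree x → Int
sizeTree = foldTree (λ_ xs → 1 + sum xs)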
5.2.5. Efficiency
A fold takes a value of a datatype, and replaces its constructors by functions. If
the evaluation of each of these functions on their arguments takes constant time,
evaluation of the fold takes time linear in the number of constructors in its argument.
However, some functions require more than constant evaluation time. For example,
list concatenation is linear in its left argument, and it follows that if we define the
function reverse by
reverse :: [a] → [a]
reverse = foldL (λx xs → xs ++ [x], [])
then function reverse takes time quadratic in the length of its argument list. So,
folds are often efficient functions, but if the functions in the algebra are not constant,
the fold is usually not linear. Often such a nonlinear fold can be transformed into
a more efficient function. A technique that often can be used in such a case is the
accumulating parameter technique. For example, for the reverse function we have
reverse x = reverse' x []
reverse' :: [a] → [a] → [a]
reverse' [] ys = ys
reverse' (x : xs) ys = reverse' xs (x : ys)
The evaluation of reverse' xs takes time linear in the length of xs.
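The accumulating parameter version can itself be phrased as a fold whose carrier is a function: the fold builds a function that still expects the accumulated list. A sketch, assuming the fold foldL on built-in lists used above (the name reverseAcc is chosen only for illustration):
reverseAcc :: [a] → [a]
reverseAcc xs = foldL (λx f → f . (x:), id) xs []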
Exercise 5.1. Define an algebra type and a fold function for the following datatype.
data LNTree a b = Leaf a
              | Node (LNTree a b) b (LNTree a b)
Exercise 5.2. Define the following functions as folds on the datatype BinTree, see Sec-
tion 5.2.1.
1. height, which returns the height of a tree.
2. flatten, which returns the list of leaf values in left-to-right order.
3. maxBinTree, which returns the maximal value at the leaves.
4. sp, which returns the length of a shortest path.
5. mapBinTree, which maps a function over the elements at the leaves.
Exercise 5.3. A path through a binary tree describes the route from the root of the tree to
some leaf. We choose to represent paths by sequences of Direction’s:
data Direction = L | R
in such a way that taking the left subtree in an internal node will be encoded by L and
taking the right subtree will be encoded by R. Define a compositional function allPaths
which produces all paths of a given tree. Define this function first using explicit recursion,
and then using a fold.
Exercise 5.4. This exercise deals with resistors. There are some basic resistors with a fixed
(floating point) resistance and, given two resistors, they can be put in parallel (:|:) or in
sequence (:∗:).
1. Define the datatype Resist to represent resistors. Also, define the type ResistAlgebra
and the corresponding function foldResist.
2. Define a compositional function result which determines the resistance of a resistor.
(Recall the rules 1/r = 1/r1 + 1/r2 for parallel composition and r = r1 + r2 for
sequential composition.)
5.3. Algebraic semantics
This section summarizes what we have seen before, and establishes terminology.
In the previous section, we have seen how to associate an algebra type and a fold
function with every datatype, or family of mutually recursive datatypes. The algebra
is a tuple (one component per datatype) of tuples (one component per constructor
of the datatype) of semantic actions. The algebra is parameterized over the result
type of the computation, also called the carrier of the algebra. If we work with a
family of mutually recursive datatypes, we need one carrier per datatype, and the
algebra is parameterized over all of them. We call the carrier corresponding to the
main datatype of the family the main carrier, and the others auxiliary carriers.
The fold function traverses a value and recursively replaces syntactic constructors
of the datatypes by the corresponding semantic actions of the algebra. Functions
which are defined in terms of a fold function and an algebra are called compositional
functions.
There is one special algebra: the one whose components are the constructor functions
of the datatypes we operate on. This algebra, when applied via a fold, defines the
identity function, and is called the initial algebra.
The function resulting from applying an algebra to a fold function is sometimes also
called an algebra homomorphism, and the compositional function is said to define
algebraic semantics.
5.4. Expressions
The first part of this section presents a basic expression evaluator. The evaluator
is extended with variables in the second part and with local definitions in the third
part.
5.4.1. Evaluating expressions
In this subsection we start with a more involved example: an expression evaluator.
We will use a different datatype for expressions from the one introduced in Section 5.2.3:
here we will use a single, and hence not mutually recursive, datatype for expressions.
We restrict ourselves to float valued expressions on which addition, subtraction, mul-
tiplication and division are defined. The datatype and the corresponding algebra
type and fold function are as follows:
infixl 7 ‘Mul ‘
infix 7 ‘Dvd‘
infixl 6 ‘Add‘, ‘Min‘
data Expr = Expr ‘Add‘ Expr
          | Expr ‘Min‘ Expr
          | Expr ‘Mul‘ Expr
          | Expr ‘Dvd‘ Expr
          | Num Float
type ExprAlgebra a = (a → a → a — add
                     , a → a → a — min
                     , a → a → a — mul
                     , a → a → a — dvd
                     , Float → a) — num
foldExpr :: ExprAlgebra a → Expr → a
foldExpr (add, min, mul , dvd, num) = fold
where
  fold (expr1 ‘Add‘ expr2) = fold expr1 ‘add‘ fold expr2
  fold (expr1 ‘Min‘ expr2) = fold expr1 ‘min‘ fold expr2
  fold (expr1 ‘Mul‘ expr2) = fold expr1 ‘mul‘ fold expr2
  fold (expr1 ‘Dvd‘ expr2) = fold expr1 ‘dvd‘ fold expr2
  fold (Num n) = num n
There is nothing special to notice about these definitions except, perhaps, the fact
that Expr does not have an extra parameter x like the list and tree examples. Com-
puting the result of an expression now simply consists of replacing the constructors
by appropriate functions.
resultExpr :: Expr → Float
resultExpr = foldExpr ((+), (−), (∗), (/), id)
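For example, using the fixity declarations above (so that ‘Mul‘ binds stronger than ‘Add‘):
? resultExpr (Num 2 ‘Mul‘ Num 3 ‘Add‘ Num 1)
7.0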
5.4.2. Adding variables
Our next goal is to extend the evaluator of the previous subsection such that it
can handle variables as well. The values of variables are typically looked up in an
environment which binds the names of the variables to values. We implement an
environment as a list of name-value pairs. For our purposes names are strings and
values are floats. In the following programs we will use the following functions and
types:
type Env name value = [(name, value)]
(?) :: Eq name ⇒ Env name value → name → value
env ? x = head [v | (y, v) ← env, x == y]
type Name = String
type Value = Float
The datatype and the corresponding algebra type and fold function are now as follows.
Note that we use the same name (Expr) for the datatype, although it differs from
the previous Expr datatype.
data Expr = Expr ‘Add‘ Expr
          | Expr ‘Min‘ Expr
          | Expr ‘Mul‘ Expr
          | Expr ‘Dvd‘ Expr
          | Num Value
          | Var Name
type ExprAlgebra a = (a → a → a — add
, a → a → a — min
, a → a → a — mul
, a → a → a — dvd
, Value → a — num
, Name → a) — var
foldExpr :: ExprAlgebra a → Expr → a
foldExpr (add, min, mul , dvd, num, var) = fold
where
  fold (expr1 ‘Add‘ expr2) = fold expr1 ‘add‘ fold expr2
  fold (expr1 ‘Min‘ expr2) = fold expr1 ‘min‘ fold expr2
  fold (expr1 ‘Mul‘ expr2) = fold expr1 ‘mul‘ fold expr2
  fold (expr1 ‘Dvd‘ expr2) = fold expr1 ‘dvd‘ fold expr2
  fold (Num n) = num n
  fold (Var x) = var x
The datatype Expr now has an extra constructor: the unary constructor Var. Sim-
ilarly, the argument of foldExpr now has an extra component: the unary function
var which corresponds to the unary constructor Var. Computing the result of an
expression somehow needs to use an environment. Here is a first, bad way of doing
this: one can use it as an argument of a function that computes an algebra (we will
explain why this is a bad choice in the next subsection; the basic idea is that we use
the environment as a global variable here).
resultExprBad :: Env Name Value → Expr → Value
resultExprBad env = foldExpr ((+), (−), (∗), (/), id, (env?))
? resultExprBad [("x",3)] (Var "x" ‘Mul‘ Num 2)
6
The good way of using an environment is the following: instead of working with a
computation which, given an environment, yields an algebra of values it is better
to turn the computation itself into an algebra. Thus we turn the environment into a
‘local’ variable.
(|+|), (|−|), (|∗|), (|/|) :: (Env Name Value → Value) →
                             (Env Name Value → Value) →
                             (Env Name Value → Value)
f |+| g = λenv → f env + g env
f |−| g = λenv → f env − g env
f |∗| g = λenv → f env ∗ g env
f |/| g = λenv → f env / g env
resultExprGood :: Expr → (Env Name Value → Value)
resultExprGood = foldExpr ((|+|), (|−|), (|∗|), (|/|), const, flip (?))
? resultExprGood (Var "x" ‘Mul‘ Num 2) [("x",3)]
6
The actions (+), (−), (∗) and (/) on values are now replaced by corresponding actions
(|+|), (|−|), (|∗|) and (|/|) on computations. Computing the result of the sum of two
subexpressions within a given environment consists of computing the result of the
subexpressions within this environment and adding both results to yield a final result.
Computing the result of a constant does not need the environment at all. Computing
the result of a variable consists of looking it up in the environment. Thus, the
algebraic semantics of an expression is a computation which yields a value.
In this case the computation is of the form env → val . The value type is an example
of a synthesised attribute. The value of an expression is synthesised from values of its
subexpressions. The environment type is an example of an inherited attribute. The
environment which is used by the computation of a subexpression of an expression
is inherited from the computation of the expression. Since we are working with
abstract syntax we say that the synthesised and inherited attributes are attributes
of the datatype Expr. If Expr is one of the mutually recursive datatypes which are
generated from the nonterminals of a grammar, then we say that the synthesised and
inherited attributes are attributes of the nonterminal.
5.4.3. Adding definitions
Our next goal is to extend the evaluator of the previous subsection such that it can
handle definitions as well. A (local) definition is an expression of the form
Def name expr1 expr2
which should be interpreted as: let the value of name be equal to expr1 in expression
expr2. Variables are typically defined by updating the environment with an
appropriate name-value pair.
The datatype (called Expr again) and the corresponding algebra type and eval func-
tion are now as follows:
data Expr = Expr ‘Add‘ Expr
          | Expr ‘Min‘ Expr
          | Expr ‘Mul‘ Expr
          | Expr ‘Dvd‘ Expr
          | Num Value
          | Var Name
          | Def Name Expr Expr
type ExprAlgebra a = (a → a → a — add
, a → a → a — min
, a → a → a — mul
, a → a → a — dvd
, Value → a — num
, Name → a — var
, Name → a → a → a) — def
foldExpr :: ExprAlgebra a → Expr → a
foldExpr (add, min, mul , dvd, num, var, def ) = fold
where
  fold (expr1 ‘Add‘ expr2) = fold expr1 ‘add‘ fold expr2
  fold (expr1 ‘Min‘ expr2) = fold expr1 ‘min‘ fold expr2
  fold (expr1 ‘Mul‘ expr2) = fold expr1 ‘mul‘ fold expr2
  fold (expr1 ‘Dvd‘ expr2) = fold expr1 ‘dvd‘ fold expr2
fold (Num n) = num n
fold (Var x) = var x
fold (Def x value body) = def x (fold value) (fold body)
Expr now has an extra constructor: the ternary constructor Def , which can be used
to introduce a local variable. For example, the following expression can be used to
compute the number of seconds per year.
seconds = Def "days_per_year" (Num 365) (
Def "hours_per_day" (Num 24) (
Def "minutes_per_hour" (Num 60) (
Def "seconds_per_minute" (Num 60) (
Var "days_per_year" ‘Mul ‘
Var "hours_per_day" ‘Mul ‘
Var "minutes_per_hour" ‘Mul ‘
Var "seconds_per_minute" ))))
Similarly, the parameter of foldExpr now has an extra component: the ternary func-
tion def which corresponds to the ternary constructor Def . Notice that the last two
arguments are recursive ones. We can now explain why the first use of environments
is inferior to the second one. Trying to extend the first definition gives something
like:
resultExprBad :: Env Name Value → Expr → Value
resultExprBad env =
foldExpr ((+), (−), (∗), (/), id, (env?), error "def")
The last component causes a problem: a body that contains a local definition has to
be evaluated in an updated environment. We cannot update the environment in this
setting: we can read the environment but afterwards it is not accessible any more
in the algebra (which consists of values). Extending the second definition causes
no problems: the environment is now accessible in the algebra (which consists of
computations). We can easily add a new action which updates the environment.
The computation corresponding to the body of an expression with a local definition
can now be evaluated in the updated environment.
f |+| g = λenv → f env + g env
f |−| g = λenv → f env − g env
f |∗| g = λenv → f env ∗ g env
f |/| g = λenv → f env / g env
x |=| f = λg env → g ((x, f env) : env)
resultExprGood :: Expr → (Env Name Value → Value)
resultExprGood =
  foldExpr ((|+|), (|−|), (|∗|), (|/|), const, flip (?), (|=|))
? resultExprGood seconds []
31536000
Note that by prepending a pair (x, y) to an environment (in the definition of the
operator |=|), we add the pair to the environment. By definition of (?), the binding
for x hides possible other bindings for x.
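A small illustration of this shadowing:
? resultExprGood (Def "x" (Num 1) (Def "x" (Num 2) (Var "x"))) []
2.0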
5.4.4. Compiling to a stack machine
In this section we compile expressions to instructions on a stack machine. We can
then use this stack machine to evaluate compiled expressions. This section is inspired
by an example in the textbook by Bird and Wadler [3].
Imagine a simple computer for evaluating arithmetic expressions. This computer has
a ‘stack’ and can execute ‘instructions’ which change the value of the stack. The
class of possible instructions is defined by the following datatype.
data MachInstr v = Push v | Apply (v → v → v)
type MachProgr v = [MachInstr v]
An instruction either pushes a value of type v on the stack, or it executes an operator
that takes the two top values of the stack, applies the operator, and pushes the result
back on the stack. A stack (a value of type Stack v for some value type v) is a list of
values, from which you can pop values, on which you can push values, and from which
you can take the top value. A module Stack for stacks can be found in Appendix A.
The effect of executing an instruction of type MachInstr is defined by
execute :: MachInstr v → Stack v → Stack v
execute (Push x) s = push x s
execute (Apply op) s = let a = top s
t = pop s
b = top t
u = pop t
in push (op a b) u
A sequence of instructions is executed by the function run defined by
run :: MachProgr v → Stack v → Stack v
run [] s = s
run (x : xs) s = run xs (execute x s)
It follows that run can be defined as a foldl .
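For instance, run could equivalently have been defined as follows (a sketch):
run :: MachProgr v → Stack v → Stack v
run p s = foldl (flip execute) s p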
An expression can be translated (or compiled) into a list of instructions by the func-
tion compileExpr, defined by:
compileExpr :: Expr →
Env Name (MachProgr Value) →
MachProgr Value
compileExpr = foldExpr (add, min, mul , dvd, num, var, def )
where
f ‘add‘ g =λenv → f env ++ g env ++ [Apply (+)]
f ‘min‘ g =λenv → f env ++ g env ++ [Apply (−)]
f ‘mul ‘ g =λenv → f env ++ g env ++ [Apply (∗)]
f ‘dvd‘ g =λenv → f env ++ g env ++ [Apply (/)]
num v =λenv → [Push v]
var x =λenv → env ? x
def x fd fb =λenv → fb ((x, fd env) : env)
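As an illustration of what the compiler produces (the name compiledExample is chosen here only for this sketch; the instruction list contains functions, so it cannot be printed, and the expected instructions are shown in a comment):
compiledExample :: MachProgr Value
compiledExample = compileExpr (Def "x" (Num 2) (Var "x" ‘Mul‘ Num 3)) []
-- yields [Push 2.0, Push 3.0, Apply (∗)]: the code compiled for the definition
-- of x is inserted where x is used, and the multiplication is applied last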
Exercise 5.5. Define the following functions as folds on the datatype Expr that contains
definitions.
1. isSum, which determines whether or not an expression is a sum.
2. vars, which returns the list of variables that occur in the expression.
Exercise 5.6. This exercise deals with expressions without definitions. The function der is
defined by
der (e1 ‘Add‘ e2) dx = der e1 dx ‘Add‘ der e2 dx
der (e1 ‘Min‘ e2) dx = der e1 dx ‘Min‘ der e2 dx
der (e1 ‘Mul‘ e2) dx = (e1 ‘Mul‘ der e2 dx) ‘Add‘ (der e1 dx ‘Mul‘ e2)
der (e1 ‘Dvd‘ e2) dx = ((e2 ‘Mul‘ der e1 dx) ‘Min‘ (e1 ‘Mul‘ der e2 dx))
                       ‘Dvd‘ (e2 ‘Mul‘ e2)
der (Num f) dx = Num 0
der (Var s) dx = if s == dx then Num 1 else Num 0
1. Give an informal description of the function der.
2. Why is the function der not compositional?
3. Define a datatype Exp to represent expressions consisting of (floating point) constants,
variables, addition and subtraction. Also, define the type ExpAlgebra and the corre-
sponding foldExp.
4. Define the function der on Exp and show that this function is compositional.
Exercise 5.7. Define the function replace, which given a binary tree and an element m
replaces the elements at the leaves by m as a fold on the datatype BinTree, see Section 5.2.1.
It is easy to write a function with the required functionality if you swap the arguments, but
then it is impossible to write replace as a fold. Note that the fold returns a function, which
when given m replaces all the leaves by m.
Exercise 5.8. Consider the datatype of paths introduced in Exercise 5.3. A path in a tree
leads to a unique leaf. Define a compositional function path2Value which, given a tree and a
path in the tree, yields the element at the unique leaf.
5.5. Block structured languages
This section presents a more complex example of the use of tuples in combination
with compositionality. The example deals with the scope of variables in a block
structured language. A variable from a global scope is visible in a local scope only if
it is not hidden by a variable with the same name in the local scope.
5.5.1. Blocks
A block is a list of statements. A statement is a variable declaration, a variable
usage or a nested block. The concrete representation of an example block of our
block structured language looks as follows (dcl stands for declaration, use stands for
usage and x, y and z are variables).
use x ; dcl x ;
(use z ; use y ; dcl x ; dcl z ; use x) ;
dcl y ; use y
Statements are separated by semicolons. Nested blocks are surrounded by paren-
theses. The usage of z refers to the local declaration (the only declaration of z).
The usage of y refers to the global declaration (the only declaration of y). The lo-
cal usage of x refers to the local declaration and the global usage of x refers to the
global declaration. Note that it is allowed to use variables before they are declared.
Here are some mutually recursive (data)types, which describe the abstract syntax of
blocks, corresponding to the grammar that describes the concrete syntax of blocks
which is used above. We use meaningful names for data constructors and we use
built-in lists instead of user-defined lists for the block algebra. As usual, the algebra
type BlockAlgebra, which consists of two tuples of functions, and the fold function
foldBlock, which uses two mutually recursive local functions, can be generated from
the two mutually recursive (data)types.
type Block = [Statement]
data Statement = Dcl Idf | Use Idf | Blk Block
type Idf = String
type BlockAlgebra b s = ((s → b → b, b)
, (Idf → s, Idf → s, b → s)
)
foldBlock :: BlockAlgebra b s → Block → b
foldBlock ((cons, empty), (dcl , use, blk)) = fold
where
fold (s : b) = cons (foldS s) (fold b)
fold [] = empty
foldS (Dcl x) = dcl x
foldS (Use x) = use x
foldS (Blk b) = blk (fold b)
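As a small example of a compositional function on blocks (the name countDcls is chosen here only for illustration), the following fold counts the declarations occurring in a block, including those in nested blocks:
countDcls :: Block → Int
countDcls = foldBlock ((λs b → s + b, 0), (const 1, const 0, id))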
5.5.2. Generating code
The goal of this section is to generate code from a block. The code consists of a
sequence of instructions. There are three types of instructions.
• Enter (l , c): enters the l ’th nested block in which c local variables are declared.
• Leave (l , c): leaves the l ’th nested block in which c local variables were declared.
• Access (l , c): accesses the c’th variable of the l ’th nested block.
The code generated for the above example looks as follows.
[ Enter (0, 2), Access (0, 0)
, Enter (1, 2), Access (1, 1), Access (0, 1), Access (1, 0), Leave (1, 2)
, Access (0, 1), Leave (0, 2)
]
Note that we start numbering levels (l ) and counts (c) (which are sometimes called
displacements) from 0. The abstract syntax of the code to be generated is described
by the following datatype.
type Count = Int
type Level = Int
type Variable = (Level , Count)
type BlockInfo = (Level , Count)
data Instruction = Enter BlockInfo
                 | Leave BlockInfo
                 | Access Variable
type Code = [Instruction]
The function ab2ac, which generates abstract code (a value of type Code) from an
abstract block (a value of type Block), uses a compositional function block2Code.
For all syntactic constructs of Statement and Block we define appropriate semantic
actions on an algebra of computations. Here is a, somewhat simplified, description
of these semantic actions.
• Dcl : Every time we declare a local variable x we have to update the local
environment le of the block we are in by associating with x the current pair of
level and local-count (l , lc). Moreover we have to increment the local variable
count lc to lc + 1. Note that we do not generate any code for a declaration
statement. Instead we perform some computations which make it possible to
generate appropriate code for other statements.
dcl x (le, l, lc) = (le', lc')
  where
  le' = le ‘update‘ (x, (l, lc))
  lc' = lc + 1
where function update is defined in the AssociationList module.
• Use: Every time we use a local variable x we have to generate code cd' for it.
This code is of the form [Access (l, lc)]. The level–local-count pair (l, lc) of the
variable is looked up in the global environment e.
use x e = cd'
  where
  cd' = [Access (l, c)]
  (l, c) = e ? x
• Blk: Every time we enter a nested block we increment the global level l to l +1,
start with a fresh local variable count 0 and set the local environment of the
nested block we enter to the current global environment e. The computation for
the nested block results in a local variable count lcB and a local environment
leB. Furthermore we need to make sure that the global environment (the one
in which we look up variables) which is used by the computation for the nested
block is equal to leB. The code which is generated for the block is surrounded
by an appropriate [Enter lcB]-[Leave lcB] pair.
blk fB (e, l) = cd'
  where
  l' = l + 1
  (leB, lcB, cdB) = fB (leB, l', e, 0)
  cd' = [Enter (l', lcB)] ++ cdB ++ [Leave (l', lcB)]
• []: No action needs to be performed for an empty block.
• (:): For every nonempty block we perform the computation of the first state-
ment of the block which, given a local environment le and local variable count
lc, results in a local environment leS and local variable count lcS. This
environment-count pair is then used by the computation of the rest of the
block to result in a local environment le' and local variable count lc'. The code
cd' which is generated is the concatenation cdS ++ cdB of the code cdS which is
generated for the first statement and the code cdB which is generated for the
rest of the block.
cons fS fB (le, lc) = (le', lc', cd')
  where
  (leS, lcS, cdS) = fS (le, lc)
  (le', lc', cdB) = fB (leS, lcS)
  cd' = cdS ++ cdB
What does our actual computation type look like? For dcl we need three inherited
attributes: a global level, a local block environment and a local variable count. Two
of them: the local block environment and the local variable count are also synthesised
attributes. For use we need one inherited attribute: a global block environment, and
we compute one synthesised attribute: the generated code. For blk we need two
inherited attributes: a global block environment and a global level, and we com-
pute two synthesised attributes: the local variable count and the generated code.
Moreover there is one extra attribute: a local block environment which is both inher-
ited and synthesised. When processing the statements of a nested block we already
make use of the global block environment which we are synthesising (when looking
up variables). For cons we compute three synthesised attributes: the local block
environment, the local variable count and the generated code. Two of them, the
local block environment and the local variable count are also needed as inherited
attributes. It is clear from the considerations above that the following types fulfill
our needs.
type BlockEnv = [(Idf , Variable)]
type GlobalEnv = (BlockEnv, Level )
type LocalEnv = (BlockEnv, Count)
The implementation of block2Code is now a straightforward translation of the actions
described above. Attributes which are not mentioned in those actions are added as
extra components which do not contribute to the functionality.
block2Code :: Block → GlobalEnv → LocalEnv → (LocalEnv, Code)
block2Code = foldBlock ((cons, empty), (dcl , use, blk))
where
  cons fS fB (e, l) (le, lc) = ((le', lc'), cd')
    where
    ((leS, lcS), cdS) = fS (e, l) (le, lc)
    ((le', lc'), cdB) = fB (e, l) (leS, lcS)
    cd' = cdS ++ cdB
  empty (e, l) (le, lc) = ((le, lc), [])
  dcl x (e, l) (le, lc) = ((le', lc'), [])
    where
    le' = (x, (l, lc)) : le
    lc' = lc + 1
  use x (e, l) (le, lc) = ((le, lc), cd')
    where
    cd' = [Access (l, c)]
    (l, c) = e ? x
  blk fB (e, l) (le, lc) = ((le, lc), cd')
    where
    ((leB, lcB), cdB) = fB (leB, l') (e, 0)
    l' = l + 1
    cd' = [Enter (l', lcB)] ++ cdB ++ [Leave (l', lcB)]
The code generator starts with an empty local environment, a fresh level and a fresh
local variable count. The code is a synthesised attribute. The global environment
is an attribute which is both inherited and synthesised. When processing a block
we already use the global environment which we are synthesising (when looking up
variables).
ab2ac :: Block → Code
ab2ac b = [Enter (0, c)] ++ cd ++ [Leave (0, c)]
where
((e, c), cd) = block2Code b (e, 0) ([], 0)
aBlock
= [ Use "x", Dcl "x"
, Blk [Use "z", Use "y", Dcl "x", Dcl "z", Use "x"]
, Dcl "y", Use "y"]
? ab2ac aBlock
[Enter (0,2),Access (0,0)
,Enter (1,2),Access (1,1),Access (0,1),Access (1,0),Leave (1,2)
,Access (0,1),Leave (0,2)]
5.6. Exercises
Exercise 5.9. Consider your answer to Exercise 2.22, which gives an abstract syntax for
palindromes.
1. Define a type PalAlgebra that describes the type of the semantic actions that corre-
spond to the syntactic constructs of Pal .
2. Define the function foldPal, which describes how the semantic actions that correspond
to the syntactic constructs of Pal should be applied.
3. Define the functions a2cPal and aCountPal as foldPal ’s.
4. Define the parser pfoldPal which interprets its input in an arbitrary semantic PalAlgebra
without building the intermediate abstract syntax tree.
5. Describe the parsers pfoldPal m1 and pfoldPal m2 where m1 and m2 correspond to the
algebras of a2cPal and aCountPal respectively.
Exercise 5.10. Consider your answer to Exercise 2.23, which gives an abstract syntax for
mirror-palindromes.
1. Define the type MirAlgebra that describes the semantic actions that correspond to the
syntactic constructs of Mir.
2. Define the function foldMir, which describes how semantic actions that correspond to
the syntactic constructs of Mir should be applied.
3. Define the functions a2cMir and m2pMir as foldMir’s.
4. Define the parser pfoldMir, which interprets its input in an arbitrary semantic MirAlgebra
without building the intermediate abstract syntax tree.
5. Describe the parsers pfoldMir m1 and pfoldMir m2 where m1 and m2 correspond to
the algebras of a2cMir and m2pMir, respectively.
Exercise 5.11. Consider your answer to exercise 2.24, which gives an abstract syntax for
parity-sequences.
1. Define the type ParityAlgebra that describes the semantic actions that correspond to
the syntactic constructs of Parity.
2. Define the function foldParity, which describes how the semantic actions that corre-
spond to the syntactic constructs of Parity should be applied.
3. Define the function a2cParity as foldParity.
Exercise 5.12. Consider your answer to Exercise 2.25, which gives an abstract syntax for
bit-lists.
1. Define the type BitListAlgebra that describes the semantic actions that correspond to
the syntactic constructs of BitList.
2. Define the function foldBitList, which describes how the semantic actions that corre-
spond to the syntactic constructs of BitList should be applied.
3. Define the function a2cBitList as a foldBitList.
4. Define the parser pfoldBitList, which interprets its input in an arbitrary semantic
BitListAlgebra without building the intermediate abstract syntax tree.
Exercise 5.13. The following grammar describes the concrete syntax of a simple block-
structured programming language
B → S R (block)
R → ; S R | ε (rest)
S → D | U | N (statement)
D → x | y (declaration)
U → X | Y (usage)
N → ( B ) (nested block)
1. Define a datatype Block that describes the abstract syntax that corresponds to the
grammar. What is the abstract representation of x;(y;Y);X?
2. Define the type BlockAlgebra that describes the semantic actions that correspond to
the syntactic constructs of Block.
3. Define the function foldBlock, which describes how the semantic actions corresponding
to the syntactic constructs of Block should be applied.
4. Define the function a2cBlock, which converts an abstract block into a concrete one.
Write a2cBlock as a foldBlock
5. The function checkBlock tests whether or not each variable of a given abstract block
is declared before use (declared in the same or in a surrounding block).
6. Computing with parsers
Parsers produce results. For example, the parsers for travelling schemes given in
Chapter 4 return an abstract syntax, or an integer that represents the net travelling
time in minutes. The net travelling time is computed directly by inserting the cor-
rect semantic functions. Another way to compute the net travelling time is by first
computing the abstract syntax, and then applying a function to the abstract syntax
that computes the net travelling time. This section shows several ways to compute
results using parsers:
• insert a semantic function in the parser;
• apply a fold to the abstract syntax;
• use a class instead of abstract syntax;
• pass an algebra to the parser.
6.1. Insert a semantic function in the parser
In Chapter 4 we have defined two parsers: a parser that computes the abstract syntax
for a travelling schema, and a parser that computes the net travelling time. These
functions are obtained by inserting different functions in the basic parser. If we want
to compute the total travelling time, we have to insert different functions in the basic
parser. This approach works fine for a small parser, but it has some disadvantages
when building a larger parser:
• semantics is intertwined with the parsing process;
• it is difficult to locate all positions where semantic functions have to be inserted
in the parser.
6.2. Apply a fold to the abstract syntax
Instead of inserting operations in a basic parser, we can write a parser that parses
the input to an abstract syntax, and computes the desired result by applying a fold
to the abstract syntax.
An example of such an approach has been given in Section 5.2.2, where we defined
two functions with the same functionality: nesting and nesting'; both compute
the maximum nesting depth in a string of parentheses. Function nesting is defined
by inserting functions in the basic parser. Function nesting' is defined by applying
a fold to the abstract syntax. Each of these definitions has its own merits; we repeat
the main arguments below.
parens :: Parser Char Parentheses
parens = (\_ b _ d -> Match b d) <$>
open <*> parens <*> close <*> parens
<|> succeed Empty
nesting :: Parser Char Int
nesting = (\_ b _ d -> max (1+b) d) <$>
open <*> nesting <*> close <*> nesting
<|> succeed 0
nesting' :: Parser Char Int
nesting' = depthParentheses <$> parens
The first definition (nesting) is more efficient, because it does not build an inter-
mediate abstract syntax tree. On the other hand, it might be more difficult to write
because we have to insert functions in the correct places in the basic parser. The ad-
vantage of the second definition (nesting') is that we reuse both the parser parens,
which returns an abstract syntax tree, and the function depthParentheses (or the
function foldParentheses, which is used in the definition of depthParentheses),
which does recursion over an abstract syntax tree. The only thing we have to write
ourselves in the second definition is the depthParenthesesAlgebra. The disadvan-
tage of the second definition is that it builds an intermediate abstract syntax tree,
which is ‘flattened’ by the fold. We want to avoid building the abstract syntax
tree altogether. To obtain the best of both worlds, we would like to write function
nesting' and have our compiler figure out that it is better to use function nesting
in computations. The automatic transformation of function nesting' into function
nesting is called deforestation (trees are removed). Some (very few) compilers are
clever enough to perform this transformation automatically.
6.3. Deforestation
Deforestation removes intermediate trees in computations. The previous section gives
an example of deforestation on the datatype Parentheses. This section sketches the
general idea.
Suppose we have a datatype AbstractTree
data AbstractTree = ...
From this datatype we construct an algebra and a fold, see Chapter 5.
type AbstractTreeAlgebra a = ...
foldAbstractTree :: AbstractTreeAlgebra a -> AbstractTree -> a
A parser for the datatype AbstractTree (which returns a value of AbstractTree)
has the following type:
parseAbstractTree :: Parser Symbol AbstractTree
where Symbol is some type of input symbols (for example Char). Suppose now that
we define a function p that parses an AbstractTree, and then computes some value
by folding with an algebra f over this tree:
p = foldAbstractTree f . parseAbstractTree
Then deforestation says that p is equal to the function parseAbstractTree in which
occurrences of the constructors of the datatype AbstractTree have been replaced
by the corresponding components of the algebra f. The following two sections each
describe a way to implement such a deforestated function.
6.4. Using a class instead of abstract syntax
Classes can be used to implement the deforestated or fused computation of a fold
with a parser. This gives a solution of the desired efficiency.
For example, for the language of parentheses, we define the following class:
class Parens a where
match :: a -> a -> a
empty :: a
Note that types of the functions in the class Parens correspond exactly to the two
types that occur in the type ParenthesesAlgebra. This class is used in a parser for
parentheses:
parens :: Parens a => Parser Char a
parens = (\_ b _ d -> match b d) <$>
open <*> parens <*> close <*> parens
<|> succeed empty
The algebra is implicit in this function: the only thing we know is that there exist
functions empty and match of the correct type; we know nothing about their imple-
mentation. To obtain a function parens that returns a value of type Parentheses
we create the following instance of the class Parens.
instance Parens Parentheses where
match = Match
empty = Empty
Now we can write:
?(parens :: Parser Char Parentheses) "()()"
[(Match Empty (Match Empty Empty), "")
,(Match Empty Empty, "()")
,(Empty, "()()")
]
Note that we have to supply the type of parens in this expression, otherwise Haskell
doesn’t know which instance of Parens to use. This is how we turn the implicit
‘class’ algebra into an explicit ‘instance’ algebra. Another instance of Parens can be
used to compute the nesting depth of parentheses:
instance Parens Int where
match b d = max (1+b) d
empty = 0
And now we can write:
?(parens :: Parser Char Int) "()()"
[(1, ""), (1, "()"), (0, "()()")]
So the answer depends on the type we want our function parens to have. This
also immediately shows a problem of this, otherwise elegant, approach: it does not
work if we want to compute two different results of the same type, because Haskell
doesn’t allow you to define two (or more) instances with the same type. So once we
have defined the instance Parens Int as above, we cannot use function parens to
compute, for example, the width (also an Int) of a string of parentheses.
6.5. Passing an algebra to the parser
The previous section shows how to implement a parser with an implicit algebra. Since
this approach fails when we want to define different parsers with the same result type,
we make the algebras explicit. Thus we obtain the following definition of parens:
parens :: ParenthesesAlgebra a -> Parser Char a
parens (match,empty) = par where
par = (\_ b _ d -> match b d) <$>
open <*> par <*> close <*> par
<|> succeed empty
Note that it is now easy to define different parsers with the same result type:
nesting, breadth :: Parser Char Int
nesting = parens (\b d -> max (1+b) d,0)
breadth = parens (\b d -> d+1,0)
7. Programming with higher-order folds
Introduction
In the previous chapters we have seen that algebras play an important role when
describing the meaning of a recognised structure (a parse tree). For each recursive
datatype T we have a function foldT, and for each constructor of the datatype we
have a corresponding function as a component in the algebra. Chapter 5 introduces
a language in which local declarations are permitted. Evaluating expressions in this
language can be done by choosing an appropriate algebra. The domain of that algebra
is a higher order (data)type (a (data)type that contains functions). Unfortunately,
the resulting code comes as a surprise to many. In this chapter we will illustrate a
related formalism, which will make it easier to construct such involved algebras. This
related formalism is the attribute grammar formalism. We will not formally define
attribute grammars, but instead illustrate the formalism with some examples, and
give an informal definition.
We start with developing a somewhat unconventional way of looking at functional
programs, and especially those programs that use functions that recursively descend
over datatypes a lot. In our case one may think about these datatypes as abstract
syntax trees. When computing a property of such a recursive object (for example, a
program) we define two sets of functions: one set that describes how to recursively
visit the nodes of the tree, and one set of functions (an algebra) that describes what
to compute at each node when visited.
One of the most important steps in this process is deciding what the carrier type
of the algebras is going to be. Once this step has been taken, these types are a
guideline for further design steps. We will see that such carrier types may be functions
themselves, and that deciding on the type of such functions may not always be simple.
In this chapter we will present a view on recursive computations that will enable us
to “design” the carrier type in an incremental way. We will do so by constructing
algebras out of other algebras. In this way we define the meaning of a language in a
semantically compositional way.
We will start with the rep min example, which looks a bit artificial, and deals with
a non-interesting, highly specific problem. However, it has been chosen for its sim-
plicity, and so as not to distract our attention with specific, programming language related,
data Tree = Leaf Int
| Bin Tree Tree deriving Show
type TreeAlgebra a = (Int -> a, a -> a -> a)
foldTree :: TreeAlgebra a -> Tree -> a
foldTree alg@(leaf, _ ) (Leaf i) = leaf i
foldTree alg@(_ , bin) (Bin l r) = bin (foldTree alg l)
(foldTree alg r)
Listing 7.1: rm.start.hs
semantic issues. The second example of this chapter demonstrates the techniques on
a larger example: a small compiler for part of a programming language.
Goals
In this chapter you will learn:
• how to write ‘circular’ functional programs, or ‘higher-order folds’;
• how to combine algebras;
• (informally) the concept of an attribute grammar.
7.1. The rep min problem
One of the famous examples in which the power of lazy evaluation is demonstrated is
the so-called rep min problem [2]. Many have wondered how this program achieves
its goal, since at first sight it seems that it is impossible to compute anything with
this program. We will use this problem, and a sequence of different solutions, to
build up an understanding of a whole class of such programs.
In Listing 7.1 we present the datatype Tree, together with its associated algebra.
The carrier type of an algebra is the type that describes the objects of the algebra.
We represent it by a type parameter of the algebra type:
type TreeAlgebra a = (Int -> a, a -> a -> a)
The associated evaluation function foldTree systematically replaces the constructors
Leaf and Bin by corresponding operations from the algebra alg that is passed as an
argument.
minAlg :: TreeAlgebra Int
minAlg = (id, min :: Int->Int->Int)
rep_min :: Tree -> Tree
rep_min t = foldTree repAlg t
where m = foldTree minAlg t
repAlg = (const (Leaf m), Bin)
Listing 7.2: rm.sol1.hs
We now want to construct a function rep_min :: Tree -> Tree that returns a Tree
with the same “shape” as its argument Tree, but with the values in its leaves replaced
by the minimal value occurring in the original tree. For example,
?rep_min (Bin (Bin (Leaf 1) (Leaf 7)) (Leaf 11))
Bin (Bin (Leaf 1) (Leaf 1)) (Leaf 1)
7.1.1. A straightforward solution
A straightforward solution to the rep min problem consists of a function in which
foldTree is used twice: once for computing the minimal value of the leaf values,
and once for constructing the resulting Tree. The function rep_min that solves the
problem in this way is given in Listing 7.2. Notice that the variable m is a global
variable of the repAlg-algebra, that is used in the tree constructing call of foldTree.
One of the disadvantages of this solution is that in the course of the computation the
pattern matching associated with the inspection of the tree nodes is performed twice
for each node in the tree.
Although this solution as such is no problem, we will try to construct a solution that
calls foldTree only once.
7.1.2. Lambda lifting
We want to obtain a program for the rep min problem in which pattern matching is
used only once. Program Listing 7.3 is an intermediate step towards this goal. In
this program the global variable m has been removed and the second call of foldTree
does not construct a Tree anymore, but instead a function constructing a tree of type
Int -> Tree, which takes the computed minimal value as an argument. Notice how
we have emphasized the fact that a function is returned through some superfluous
notation: the first lambda in the function definitions constituting the algebra repAlg
is required by the signature of the algebra, the second lambda, which could have been
repAlg = ( \_ -> \m -> Leaf m
,\lfun rfun -> \m -> let lt = lfun m
rt = rfun m
in Bin lt rt
)
rep_min' t = (foldTree repAlg t) (foldTree minAlg t)
Listing 7.3: rm.sol2.hs
infix 9 ‘tuple‘
tuple :: TreeAlgebra a -> TreeAlgebra b -> TreeAlgebra (a,b)
(leaf1, bin1) ‘tuple‘ (leaf2, bin2) = (\i -> (leaf1 i, leaf2 i)
,\l r -> (bin1 (fst l) (fst r)
,bin2 (snd l) (snd r)
)
)
min_repAlg :: TreeAlgebra (Int, Int -> Tree)
min_repAlg = (minAlg ‘tuple‘ repAlg)
rep_min'' t = r m
where (m, r) = foldTree min_repAlg t
Listing 7.4: rm.sol3.hs
omitted, is there because the carrier set of the algebra contains functions of type Int
-> Tree. This process is done routinely by functional compilers and is known as
lambda-lifting.
7.1.3. Tupling computations
We are now ready to formulate a solution in which foldTree is called only once. Note
that in the last solution the two calls of foldTree don’t interfere with each other.
As a consequence we may perform both the computation of the tree constructing
function and the minimal value in one go, by tupling the results of the computations.
The solution is given in Listing 7.4. First a function tuple is defined. This function
takes two TreeAlgebras as arguments and constructs a third one, which has as its
carrier tuples of the carriers of the original algebras.
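Assuming the definitions of Listing 7.4, the tupled version yields the same result as the earlier solutions:
?rep_min'' (Bin (Bin (Leaf 1) (Leaf 7)) (Leaf 11))
Bin (Bin (Leaf 1) (Leaf 1)) (Leaf 1)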
7.1.4. Merging tupled functions
In the next step we transform the type of the carrier set in the previous example,
(Int, Int->Tree), into an equivalent type Int -> (Int, Tree). This transforma-
tion is not essential here, but we use it to demonstrate that if we compute a cartesian
product of functions, we may transform that type into a new type in which we com-
pute only one function, which takes as its arguments the cartesian product of all
the arguments of the functions in the tuple, and returns as its result the cartesian
product of the result types. In our example the computation of the minimal value
may be seen as a function of type ()->Int. As a consequence the argument of the
new type is ((), Int), which is isomorphic to just Int, and the result type becomes
(Int, Tree).
We want to mention here too that the reverse is in general not true; given a function
of type (a, b) -> (c, d), it is in general not possible to split this function into two
functions of type a -> c and b -> d, which together achieve the same effect. The
new version is given in Listing 7.5.
Notice how we have, in an attempt to make the different roles of the parameters
explicit, again introduced extra lambdas in the definition of the functions of the
algebra. The parameters after the second lambda are there because we construct
values in a higher order carrier set. The parameters after the first lambda are there
because we deal with a TreeAlgebra. A curious step taken here is that part of
the result, in our case the value m, is passed back as an argument to the result of
(foldTree mergedAlg t). Lazy evaluation makes this work.
That such programs were possible came originally as a great surprise to many func-
tional programmers, especially to those who used to program in LISP or ML, lan-
guages that require arguments to be evaluated completely before a call is evaluated
(so-called strict evaluation in contrast to lazy evaluation). Because of this surprising
behaviour this class of programs became known as circular programs. Notice however
that there is nothing circular in this program. Each value is defined in terms of other
values, and no value is defined in terms of itself (as in ones=1:ones).
Finally, Listing 7.6 shows the version of this program in which the function foldTree
has been unfolded. Thus we obtain the original solution as given in Bird [2].
Concluding, we have systematically transformed a program that inspects each node
twice into an equivalent program that inspects each node only once. The resulting
solution passes back part of the result of a call as an argument to that same call.
Lazy evaluation makes this possible.
Exercise 7.1. The deepest front problem is the problem of finding the so-called front of a
tree. The front of a tree is the list of all nodes that are at the deepest level. As in the
rep min problem, the trees involved are elements of the datatype Tree, see Listing 7.1. A
straightforward solution is to compute the height of the tree and to pass the result of this
function to a function frontAtLevel :: Tree -> Int -> [Int].
mergedAlg :: TreeAlgebra (Int -> (Int,Tree))
mergedAlg = (\i -> \m -> (i, Leaf m)
            ,\lfun rfun -> \m -> let (lm,lt) = lfun m
                                     (rm,rt) = rfun m
                                 in (lm `min` rm
                                    ,Bin lt rt
                                    )
            )
rep_min''' t = r
  where (m, r) = (foldTree mergedAlg t) m
Listing 7.5: rm.sol4.hs
rep_min'''' t = r
  where (m, r) = tree t m
        tree (Leaf i)  = \m -> (i, Leaf m)
        tree (Bin l r) = \m -> let (lm, lt) = tree l m
                                   (rm, rt) = tree r m
                               in (lm `min` rm, Bin lt rt)
Listing 7.6: rm.sol5.hs
1. Define the functions height and frontAtLevel.
2. Give four different solutions, analogous to the four solutions of the rep min problem.
Exercise 7.2. Redo the previous exercise for the highest front problem.
7.2. A small compiler
This section constructs a small compiler for (a part of) a small language. The com-
piler translates programs in this language into code for a hypothetical stack machine.
7.2.1. The language
The language we consider in this section has integers, booleans, function application,
and an if-then-else expression. A language with just these constructs is useless, and
you will extend the language in the exercises with some other constructs, which make
the language a bit more interesting. We take the following context-free grammar for
the concrete syntax of the language.
Expr0 → if Expr1 then Expr1 else Expr1 | Expr1
Expr1 → Expr2 Expr2∗
Expr2 → Int | Bool
where Int generates integers, and Bool booleans. An abstract syntax for our language
is given in Listing 7.7. Note that we use a single datatype for the abstract syntax
instead of three datatypes (one for each nonterminal); this simplifies the code a bit.
Listing 7.7 also contains a definition of a fold and an algebra type for the abstract
syntax.
A parser for expressions is given in Listing 7.8.
7.2.2. A stack machine
In section 5.4.4 we have defined a stack machine with which simple arithmetic ex-
pressions can be evaluated. Here we define a stack machine that has some more
instructions. The language of the previous section will be compiled into code for this
stack machine in the following section.
The stack machine we will use has the following instructions:
• it can load an integer;
• it can load a boolean;
• given an argument and a function on the stack, it can call the function on the
argument;
data ExprAS = If ExprAS ExprAS ExprAS
            | Apply ExprAS ExprAS
            | ConInt Int
            | ConBool Bool deriving Show

type ExprASAlgebra a = (a -> a -> a -> a
                       ,a -> a -> a
                       ,Int -> a
                       ,Bool -> a
                       )

foldExprAS :: ExprASAlgebra a -> ExprAS -> a
foldExprAS (iff,apply,conint,conbool) = fold
  where fold (If ce te ee) = iff (fold ce) (fold te) (fold ee)
        fold (Apply fe ae) = apply (fold fe) (fold ae)
        fold (ConInt i)    = conint i
        fold (ConBool b)   = conbool b
Listing 7.7: ExprAbstractSyntax.hs
• it can set a label in the code;
• given a boolean on the stack, it can jump to a label provided the boolean is
false;
• it can jump to a label (unconditionally).
The datatype for instructions is given in Listing 7.9.
7.2.3. Compiling to the stack machine
How do we compile the different expressions to stack machine code? We want to
define a function compile of type
compile :: ExprAS -> [InstructionSM]
• A ConInt i is compiled to a LoadInt i.
compile (ConInt i) = [LoadInt i]
• A ConBool b is compiled to a LoadBool b.
compile (ConBool b) = [LoadBool b]
sptoken :: String -> Parser Char String
sptoken s = (\_ b _ -> b) <$>
            many (symbol ' ') <*> token s <*> many1 (symbol ' ')

boolean = const True <$> token "True" <|> const False <$> token "False"

parseExpr :: Parser Char ExprAS
parseExpr = expr0
  where expr0 = (\a b c d e f -> If b d f) <$>
                sptoken "if"
                <*> parseExpr
                <*> sptoken "then"
                <*> parseExpr
                <*> sptoken "else"
                <*> parseExpr
                <|> expr1
        expr1 = chainl expr2 (const Apply <$> many1 (symbol ' '))
                <|> expr2
        expr2 = ConBool <$> boolean
                <|> ConInt <$> natural
Listing 7.8: ExprParser.hs
data InstructionSM = LoadInt Int
                   | LoadBool Bool
                   | Call
                   | SetLabel Label
                   | BrFalse Label
                   | BrAlways Label

type Label = Int
Listing 7.9: InstructionSM.hs
• An application Apply f x is compiled by first compiling the argument x, then
the ‘function’ f (at the moment it is impossible to define functions in our
language, hence the quotes around ‘function’), and finally putting a Call on
top of the stack.
compile (Apply f x) = compile x ++ compile f ++ [Call]
• An if-then-else expression If ce te ee is compiled by first compiling the con-
ditional expression ce. Then we jump to a label (which will be set before the
code of the else expression ee later) if the resulting boolean is false. Then we
compile the then expression te. After the then expression we always jump to
the end of the code of the if-then-else expression, for which we need another la-
bel. Then we set the label for the else expression, we compile the else expression
ee, and, finally, we set the label for the end of the if-then-else expression.
compile (If ce te ee) = compile ce
                        ++ [BrFalse ?lab1]
                        ++ compile te
                        ++ [BrAlways ?lab2]
                        ++ [SetLabel ?lab1]
                        ++ compile ee
                        ++ [SetLabel ?lab2]
Note that we use labels here, but where do these labels come from?
From the above description we see that we also need labels when compiling an ex-
pression. We add a label argument (an integer, used for the first label in the compiled
code) to function compile, and we want function compile to return the first unused
label. We change the type of function compile as follows:
compile :: ExprAS -> Label -> ([InstructionSM],Label)
type Label = Int
The four cases in the definition of compile have to take care of the labels. We obtain
the following definition of compile:
compile (ConInt i)  = \l -> ([LoadInt i],l)
compile (ConBool b) = \l -> ([LoadBool b],l)
compile (Apply f x) = \l ->
  let (xc,l')  = compile x l
      (fc,l'') = compile f l'
  in  (xc ++ fc ++ [Call],l'')
compile (If ce te ee) = \l ->
  let (cc,l')   = compile ce (l+2)
      (tc,l'')  = compile te l'
      (ec,l''') = compile ee l''
compile = foldExprAS compileAlgebra

compileAlgebra :: ExprASAlgebra (Label -> ([InstructionSM],Label))
compileAlgebra = (\cce cte cee -> \l ->
                    let (cc,l')   = cce (l+2)
                        (tc,l'')  = cte l'
                        (ec,l''') = cee l''
                    in  ( cc
                          ++ [BrFalse l]
                          ++ tc
                          ++ [BrAlways (l+1)]
                          ++ [SetLabel l]
                          ++ ec
                          ++ [SetLabel (l+1)]
                        ,l'''
                        )
                 ,\cf cx -> \l ->
                    let (xc,l')  = cx l
                        (fc,l'') = cf l'
                    in  (xc ++ fc ++ [Call],l'')
                 ,\i -> \l -> ([LoadInt i],l)
                 ,\b -> \l -> ([LoadBool b],l)
                 )
Listing 7.10: CompileExpr.hs
  in  ( cc
        ++ [BrFalse l]
        ++ tc
        ++ [BrAlways (l+1)]
        ++ [SetLabel l]
        ++ ec
        ++ [SetLabel (l+1)]
      ,l'''
      )
Function compile is a fold; the carrier type of its algebra is a function of type Label
-> ([InstructionSM],Label). The definition of function compile as a fold is given
in Listing 7.10.
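As a small usage sketch (the example expression is not from the text), compiling a
simple conditional expression starting with label 0 yields:

compile (If (ConBool True) (ConInt 1) (ConInt 2)) 0
  ==> ( [ LoadBool True, BrFalse 0, LoadInt 1, BrAlways 1
        , SetLabel 0, LoadInt 2, SetLabel 1 ]
      , 2 )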
Exercise 7.3 (no answer provided). Extend the code generation example by adding variables
to the datatype Expr.
Exercise 7.4 (no answer provided). Extend the code generation example by adding defini-
tions to the datatype Expr too.
7.3. Attribute grammars
In Section 7.1 we have written a program that solves the rep min problem. This
program computes the minimum of a tree, and it computes the tree in which all the
leaf values are replaced by the minimum value. The minimum is computed bottom-
up: it is synthesized from its children. The minimum value is then passed on to the
functions that build the tree with the minimum value in its leaves. These functions
receive the minimum value from their parent tree node: they inherit the minimum
value from their parent.
We can see the rep min computation as a computation on a value of type Tree,
on which two attributes are defined: the minimum and result tree attributes. The
minimum is computed bottom-up, and is then passed down to the result tree, and is
therefore a synthesized and inherited attribute. The result tree is computed bottom-
up, and is hence a synthesized attribute.
The formalism in which it is possible to specify such attributes and computations
on datatypes or grammars is called attribute grammars, and was originally proposed
by Donald Knuth in [9]. Attribute grammars provide a solution for the systematic
description of the phases of the compiler that come after scanning and parsing. Al-
though they look different from what we have encountered thus far and are probably
a little easier to write, they can straightforwardly be mapped onto a functional pro-
gram. The programs you have seen in this chapter could also have been obtained
by means of such a mapping from an attribute grammar specification. Traditionally
such attribute grammars are used as the input of a compiler generator. Just as we
have seen how by introducing a suitable set of parsing combinators one may avoid
the use of a special parser generator and even gain a lot of flexibility in extending
the grammatical formalism by introducing more complicated combinators, we have
shown how one can do without a special purpose attribute grammar processing sys-
tem. But, just as the concept of a context free grammar was useful in understanding
the fundamentals of parser combinators, understanding attribute grammars will help
significantly in describing the semantic part of the recognition and compilation pro-
cess. This chapter does not further introduce attribute grammars, but they will
appear again in the course in implementing programming languages.
8. Regular Languages
Introduction
The first phase of a compiler takes an input program, and splits the input into a
list of terminal symbols: keywords, identifiers, numbers, punctuation, etc. Regular
expressions are used for the description of the terminal symbols. A regular grammar
is a particular kind of context-free grammar that can be used to describe regular
expressions. Finite-state automata can be used to recognise sentences of regular
grammars. This chapter discusses all of these concepts, and is organised as follows.
Section 8.1 introduces finite-state automata. Finite-state automata appear in two
versions, nondeterministic and deterministic ones. Section 8.1.4 shows that a non-
deterministic finite-state automaton can be transformed into a deterministic finite-
state automaton, so you don’t have to worry about whether or not your automaton is
deterministic. Section 8.2 introduces regular grammars (context-free grammars of a
particular form), and regular languages. Furthermore, it shows their equivalence with
finite-state automata. Section 8.3 introduces regular expressions as finite descriptions
of regular languages and shows that such expressions are another, equivalent, tool
for regular languages. Finally, Section 8.4 gives some of the proofs of the results of
the previous sections.
Goals
After you have studied this chapter you will know that
• regular languages are a subset of context-free languages;
• it is not always possible to give a regular grammar for a context-free grammar;
• regular grammars, finite-state automata and regular expressions are three dif-
ferent tools for regular languages;
• regular grammars, finite-state automata and regular expressions have the same
expressive power;
• finite-state automata appear in two, equally expressive, versions: deterministic
and nondeterministic.
8.1. Finite-state automata
The classical approach to recognising sentences from a regular language uses finite-
state automata. A finite-state automaton can be viewed as a simple form of digital
computer with only a finite number of states, no temporary storage, an input file but
only the possibility to read it, and a control unit which records the state transitions.
A rather limited medium, but a useful tool in many practical subproblems. A finite-
state automaton can easily be implemented by a function that takes time linear in
the length of its input, and constant space. This implies that problems that can be
solved by means of a finite-state automaton, can be implemented by means of very
efficient programs.
8.1.1. Deterministic finite-state automata
Finite-state automata come in two flavours: deterministic and nondeterministic. We
start with a description of deterministic finite-state automata, the simplest form of
automata.
Definition 8.1 (Deterministic finite-state automaton, DFA). A deterministic finite-
state automaton (DFA) is a 5-tuple (X, Q, d, S, F) where
• X is the input alphabet,
• Q is a finite set of states,
• d :: Q → X → Q is the state transition function,
• S ∈ Q is the start state,
• F ⊆ Q is the set of accepting states.
As an example, consider the DFA M0 = (X, Q, d, S, F) with
X = {a, b, c}
Q = {S, A, B, C}
F = {C}
where state transition function d is defined by
d S a = C
d S b = A
d S c = S
d A a = B
d B c = C
For human beings, a finite-state automaton is more comprehensible in a graphical
representation. The following representation is customary: states are depicted as
the nodes in a graph; accepting states get a double circle; start states are explicitly
mentioned or indicated otherwise. The transition function is represented by the
edges: whenever d Qi x is a state Qj, then there is an arrow labelled x from Qi to Qj.
The input alphabet is implicit in the labels. For automaton M0 above, the pictorial
representation is:
[Diagram: automaton M0 drawn as a graph. The states S, A, B and C are nodes, the accepting state C has a double circle, and the transitions of d are the labelled edges: a loop labelled c on S, an edge labelled a from S to C, an edge labelled b from S to A, an edge labelled a from A to B, and an edge labelled c from B to C.]
Note that d is a partial function: for example d B a is not defined. We can make d
into a total function by introducing a new ‘sink’ state, the result state of all undefined
transitions. For example, in the above automaton we can introduce a sink state D
with d D x = D for all terminals x, and d E x = D for all states E and terminals
x for which d E x is undefined. The sink state and the transitions from/to it are
almost always omitted.
The action of a DFA on an input string is described as follows: given a sequence w
of input symbols, w can be ‘processed’ symbol by symbol (from left to right) and —
depending on the specific input symbol — the DFA (initially in the start state) moves
to the state as determined by its state transition function. If no move is possible, the
automaton blocks. When the complete input has been processed and the DFA is in
one of its accepting states, then we say that w is accepted by the automaton.
To illustrate the action of a DFA, we will show that the sentence bac is accepted
by M0. We do so by recording the successive configurations, i.e. the pairs of current
state and remaining input values.
(S, bac) → (A, ac) → (B, c) → (C, ε)
Because of the deterministic behaviour of a DFA the definition of acceptance by
a DFA is relatively easy. Informally, a sequence w ∈ X∗ is accepted by a DFA
(X, Q, d, S, F), if it is possible, when starting the DFA in S, to end in an accepting
state after processing w. This operational description of acceptance is formalised in
the predicate dfa accept. The predicate will be derived in a top-down fashion, i.e. we
formulate the predicate in terms of (“smaller”) subcomponents and afterwards we
give solutions to the subcomponents.
Suppose dfa is a function that reflects the behaviour of the DFA, i.e. a function which
given a transition function, a start state and a string, returns the unique state that is
reached after processing the string starting from the start state. Then the predicate
dfa accept is defined by:
dfa accept :: X∗ → (Q → X → Q, Q, {Q}) → Bool
dfa accept w (d, S, F) = (dfa d S w) ∈ F
It remains to construct a definition of function dfa that takes a transition function,
a start state, and a list of input symbols, and reflects the behaviour of a DFA. The
definition of dfa is straightforward
dfa :: (Q → X → Q) → Q → X∗ → Q
dfa d q ε = q
dfa d q (ax) = dfa d (d q a) x
Note that both the type and the definition of dfa match the pattern of the function
foldl , and it follows that the function dfa is actually identical to foldl .
dfa d q = foldl d q.
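For example (a small check, not in the text), unfolding both definitions on a
two-symbol input a1 a2 gives the same state:

dfa d q (a1 a2) = dfa d (d q a1) (a2) = dfa d (d (d q a1) a2) ε = d (d q a1) a2 = foldl d q (a1 a2)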
Definition 8.2 (Acceptance by a DFA). The sequence w ∈ X∗ is accepted by DFA
(X, Q, d, S, F) if
dfa accept w (d, S, F)
where
dfa accept w (d, qs, fs) = dfa d qs w ∈ fs
dfa d qs = foldl d qs
Using the predicate dfa accept, the language of a DFA is defined as follows.
Definition 8.3 (Language of a DFA). For DFA M = (X, Q, d, S, F), the language
of M, Ldfa(M), is defined by
Ldfa(M) = {w ∈ X∗ | dfa accept w (d, S, F)}
8.1.2. Nondeterministic finite-state automata
This subsection introduces nondeterministic finite-state automata and defines their
semantics, i.e. the language of a nondeterministic finite-state automaton.
The transition function of a DFA returns a state, which implies that for all terminal
symbols x and for all states t there can only be one edge starting in t labelled with
x. Sometimes it is convenient to have two or more edges labelled with the same
terminal symbol from a state. In these cases one can use a nondeterministic finite-
state automaton. Nondeterministic finite state automata are defined as follows.
Definition 8.4 (Nondeterministic finite-state automaton, NFA). A nondeterministic
finite-state automaton (NFA) is a 5-tuple (X, Q, d, Q0, F), where
• X is the input alphabet,
• Q is a finite set of states,
• d :: Q → X → {Q} is the state transition function,
• Q0 ⊆ Q is the set of start states,
• F ⊆ Q is the set of accepting states.
An NFA differs from a DFA in that there may be more than one start state and that
there may be more than one possible move for each state and input symbol. Here is
an example of an NFA:
[Diagram: the same automaton as before, drawn as a graph, but now with two edges labelled a leaving state A: one to S and one to B. As before, the accepting state C has a double circle.]
Note that this NFA is very similar to the DFA in the previous section: the only
difference is that there are two outgoing arrows labelled with a from state A. Thus
the DFA becomes an NFA.
Formally, this NFA is defined as M1 = (X, Q, d, Q0, F) with
X = {a, b, c}
Q = {S, A, B, C}
Q0 = {S}
F = {C}
where state transition function d is defined by
d S a = {C}
d S b = {A}
d S c = {S}
d A a = {S, B}
d B c = {C}
Again d is a partial function, which can be made total by adding d D x = { } for all
states D and all terminal symbols x for which d D x is undefined.
Since an NFA can make an arbitrary (nondeterministic) choice for one of its possible
moves, we have to be careful in defining what it means that a sequence is accepted
by an NFA. Informally, sequence w ∈ X∗ is accepted by NFA (X, Q, d, Q0, F), if it
is possible, when starting the NFA in a state from Q0, to end in an accepting state
after processing w. This operational description of acceptance is formalised in the
predicate nfa accept.
Assume that we have a function, say nfa, which reflects the behaviour of the NFA.
That is a function which given a transition function, a set of start states and a string,
returns all possible states that can be reached after processing the string starting in
some start state. Then the predicate nfa accept can be expressed as
nfa accept :: X∗ → (Q → X → {Q}, {Q}, {Q}) → Bool
nfa accept w (d, Q0, F) = nfa d Q0 w ∩ F ≠ ∅
Now it remains to find a function nfa d qs of type X∗ → {Q} that reflects the
behaviour of the NFA. For lists of length 1 such a function, called deltas, is defined
by
deltas :: (Q → X → {Q}) → {Q} → X → {Q}
deltas d qs a = {r | q ∈ qs, r ∈ d q a}
The behaviour of the NFA on X-sequences of arbitrary length follows from this “one
step” behaviour:
nfa :: (Q → X → {Q}) → {Q} → X∗ → {Q}
nfa d qs ε = qs
nfa d qs (ax) = nfa d (deltas d qs a) x
Again, it follows that nfa can be written as a foldl.
nfa d qs = foldl (deltas d) qs
This concludes the definition of predicate nfa accept. In summary we have derived
Definition 8.5 (Acceptance by an NFA). The sequence w ∈ X∗ is accepted by NFA
(X, Q, d, Q0, F) if
nfa accept w (d, Q0, F)
where
nfa accept w (d, qs, fs) = nfa d qs w ∩ fs ≠ ∅
nfa d qs = foldl (deltas d) qs
deltas d qs a = {r | q ∈ qs, r ∈ d q a}
Using the nfa accept-predicate, the language of an NFA is defined by
Definition 8.6 (Language of an NFA). For NFA M = (X, Q, d, Q0, F), the language
of M, Lnfa(M), is defined by
Lnfa(M) = {w ∈ X∗ | nfa accept w (d, Q0, F)}
Note that it is computationally expensive to determine whether or not a list is an
element of the language of a nondeterministic finite-state automaton. This is due to
the fact that all possible transitions have to be tried in order to determine whether or
not the automaton can end in an accepting state after reading the input. Determining
whether or not a list is an element of the language of a deterministic finite-state
automaton can be done in time linear in the length of the input list, so from a
computational view, deterministic finite-state automata are preferable. Fortunately,
for each nondeterministic finite-state automaton there exists a deterministic finite-
state automaton that accepts the same language. We will show how to construct a
DFA from an NFA in subsection 8.1.4.
8.1.3. Implementation
This section describes how to implement finite state machines. We start with imple-
menting DFA’s. Given a DFA M = (X, Q, d, S, F), we define two datatypes:
data StateM = ... deriving Eq
data SymbolM = ...
where the states of M (the elements of the set Q) are listed as constructors of StateM,
and the symbols of M (the elements of the set X) are listed as constructors of SymbolM.
Furthermore, we define three values (one of which a function):
start :: StateM
delta :: SymbolM -> StateM -> StateM
finals :: [StateM]
Note that the first two arguments of delta have changed places: this has been done in
order to be able to apply ‘partial evaluation’ later. The extended transition function
dfa and the accept function dfaAccept are now defined by:
dfa :: [SymbolM] -> StateM
dfa = foldl (flip delta) start
dfaAccept :: [SymbolM] -> Bool
dfaAccept xs = elem (dfa xs) finals
Given a list of symbols [x1,x2,...,xn], the computation of dfa [x1,x2,...,xn]
uses the following intermediate states:
start, delta x1 start, delta x2 (delta x1 start),...
This list of states is determined uniquely by the input [x1,x2,...,xn] and the start
state.
Since we want to use the same function names for different automata, we introduce
the following class:
class Eq a => DFA a b where
  start  :: a
  delta  :: b -> a -> a
  finals :: [a]

  dfa :: [b] -> a
  dfa = foldl (flip delta) start

  dfaAccept :: [b] -> Bool
  dfaAccept xs = elem (dfa xs) finals
Note that the functions dfa and dfaAccept are defined once and for all for all in-
stances of the class DFA.
As an example, we give the implementation of the example DFA (called MEX here)
given in the previous subsection.
data StateMEX = A | B | C | S deriving Eq
data SymbolMEX = SA | SB | SC
So the state A is represented by A, and the symbol a is represented by SA, and similarly
for the other states and symbols. The automaton is made an instance of class DFA
as follows:
instance DFA StateMEX SymbolMEX where
  start = S
  delta x S = case x of SA -> C
                        SB -> A
                        SC -> S
  delta SA A = B
  delta SC B = C
  finals = [C]
We can improve the performance of the automaton (function) dfa by means of partial
evaluation. The main idea of partial evaluation is to replace computations that
are performed often at run-time by a single computation that is performed only
once at compile-time. A very simple example is the replacement of the expression
if True then f1 else f2 by the expression f1. Partial evaluation applies to finite
automata in the following way.
dfa [x1,x2,...,xn]
=
foldl (flip delta) start [x1,x2,...,xn]
=
foldl (flip delta) (delta x1 start) [x2,...,xn]
=
case x1 of
SA -> foldl (flip delta) (delta SA start) [x2,...,xn]
SB -> foldl (flip delta) (delta SB start) [x2,...,xn]
SC -> foldl (flip delta) (delta SC start) [x2,...,xn]
All these equalities are simple transformation steps for functional programs. Note
that the first argument of foldl is always flip delta, and the second argument is
one of the four states S, A, B, or C (the result of delta). Since there are only a finite
number of states (four, to be precise), we can define a transition function for each
state:
dfaS, dfaA, dfaB, dfaC :: [SymbolMEX] -> StateMEX
Each of these functions is a case expression over the possible input symbols.
dfaS [] = S
dfaS (x:xs) = case x of SA -> dfaC xs
                        SB -> dfaA xs
                        SC -> dfaS xs

dfaA [] = A
dfaA (x:xs) = case x of SA -> dfaB xs

dfaB [] = B
dfaB (x:xs) = case x of SC -> dfaC xs

dfaC [] = C
With this definition of the finite automaton, the number of steps required for com-
puting the value of dfaS xs for some list of symbols xs is reduced considerably.
The implementation of NFA’s is similar to the implementation of DFA’s. The only
difference is that the transition and accept functions have to take care of sets (lists)
of states now. We will use the following class, in which we use some names that
also appear in the class DFA. This is a problem if the two classes appear in the same
module.
class Eq a => NFA a b where
  start  :: [a]
  delta  :: b -> a -> [a]
  finals :: [a]

  nfa :: [b] -> [a]
  nfa = foldl (flip deltas) start

  deltas :: b -> [a] -> [a]
  deltas a = union . map (delta a)

  nfaAccept :: [b] -> Bool
  nfaAccept xs = intersect (nfa xs) finals /= []
Here, functions union and intersect are implemented as follows:
union :: Eq a => [[a]] -> [a]
union = nub . concat

nub :: Eq a => [a] -> [a]
nub = foldr (\x xs -> x:filter (/=x) xs) []

intersect :: Eq a => [a] -> [a] -> [a]
intersect xs ys = intersect' (nub xs)
  where intersect' =
          foldr (\x xs -> if x `elem` ys then x:xs else xs) []
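As an illustration (a sketch, not given in the text), the example NFA M1 of
subsection 8.1.2 could be made an instance of this class as follows, reusing the types
StateMEX and SymbolMEX from the DFA example. Because of the name clash mentioned
above this instance should live in a separate module; the catch-all case makes the
transition function total by returning the empty list of states.

instance NFA StateMEX SymbolMEX where
  start = [S]
  delta x S = case x of SA -> [C]
                        SB -> [A]
                        SC -> [S]
  delta SA A = [S,B]
  delta SC B = [C]
  delta _  _ = []        -- no transition possible
  finals = [C]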
8.1.4. Constructing a DFA from an NFA
Is it possible to express more languages by means of nondeterministic finite-state
automata than by deterministic finite-state automata? For each nondeterministic
automaton it is possible to give a deterministic finite-state automaton such that
both automata accept the same language, so the answer to the above question is no.
Before we give the formal proof of this claim, we illustrate the construction of a DFA
for an NFA in an example.
Consider again the example nondeterministic finite-state automaton of subsection 8.1.2.
[Diagram: the example NFA, with states S, A, B, C (C accepting): a loop labelled c on S, an edge labelled a from S to C, an edge labelled b from S to A, two edges labelled a from A (to S and to B), and an edge labelled c from B to C.]
The nondeterminism of this automaton appears in state A: two outgoing arcs of A
are labelled with an a. Suppose we add a new state D, with an arc from A to D
labelled a, and we remove the arcs labelled a from A to S and from A to B. Since D
is a merge of the states S and B we have to merge the outgoing arcs from S and B
into outgoing arcs of D. We obtain the following automaton.
[Diagram: the automaton after this step, with states S, A, B, C, D (C accepting). State A now has a single outgoing arc, labelled a, to the new state D; D has an arc labelled a to C, an arc labelled b to A, and two arcs labelled c (to S and to C). The remaining arcs are as before.]
We omit the proof that the language of the latter automaton is equal to the language
of the former one. Although there is just one outgoing arc from A labelled with a,
this automaton is still nondeterministic: there are two outgoing arcs labelled with
c from D. We apply the same procedure as above. Add a new state E and an arc
labelled c from D to E, and remove the two outgoing arcs labelled c from D. Since
E is a merge of the states C and S we have to merge the outgoing arcs from C and
S into outgoing arcs of E. We obtain the following automaton.
[Diagram: the resulting automaton, with states S, A, B, C, D and E; its transition function d is listed below.]
Again, we do not prove that the language of this automaton is equal to the language
of the previous automaton, provided we add the state E to the set of accepting
states, which until now consisted just of state C. State E is added to the set of
accepting states because it is the merge of a set of states among which at least one
belongs to the set of accepting states. Note that in this automaton for each state, all
outgoing arcs are labelled differently, i.e. this automaton is deterministic. The DFA
constructed from the NFA above is the 5-tuple (X, Q, d, S, F) with
X = {a, b, c}
Q = {S, A, B, C, D, E}
F = {C, E}
where transition function d is defined by
d S a = C
d S b = A
d S c = S
d A a = D
d B c = C
d D a = C
d D b = A
d D c = E
d E a = C
d E b = A
d E c = S
This construction is called the ‘subset construction’. In general, the construction
works as follows. Suppose M = (X, Q, d, Q0, F) is a nondeterministic finite-state
automaton. Then the finite-state automaton M′ = (X′, Q′, d′, Q′0, F′), the components
of which are defined below, accepts the same language.

X′ = X
Q′ = subs Q

where subs returns all subsets of a set. subs Q is also called the powerset of Q. For
example,

subs :: {X} → {{X}}
subs {A, B} = {{ }, {A}, {A, B}, {B}}

For the other components of M′ we define

d′ q a = {t | t ∈ d r a, r ∈ q}
Q′0 = {Q0}
F′ = {p | p ∩ F ≠ ∅, p ∈ Q′}
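A small sketch of this construction on the list-based representation of Section 8.1.3
(the names subsetDelta and subsetFinals are not from the text):

-- The transition function of the DFA works on sets (lists) of NFA states.
subsetDelta :: Eq q => (q -> x -> [q]) -> [q] -> x -> [q]
subsetDelta d qs a = nub (concat [d r a | r <- qs])
  where nub = foldr (\y ys -> y : filter (/= y) ys) []

-- A set of NFA states is accepting if it contains at least one accepting state.
subsetFinals :: Eq q => [q] -> [[q]] -> [[q]]
subsetFinals fs = filter (any (`elem` fs))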
The proof of the following theorem is given in Section 8.4.
Theorem 8.7 (DFA for NFA). For every nondeterministic finite-state automaton M
there exists a finite-state automaton M′ such that
Lnfa(M) = Ldfa(M′)
Theorem 8.7 enables us to freely switch between NFA’s and DFA’s. Equipped with
this knowledge we continue the exploration of regular languages in the following
section. But first we show that the transformation from an NFA to a DFA is an
instance of partial evaluation.
8.1.5. Partial evaluation of NFA’s
Given a nondeterministic finite state automaton we can obtain a deterministic finite
state automaton not just by means of the above construction, but also by means of
partial evaluation.
Just as for function dfa, we can calculate as follows with function nfa.
nfa [x1,x2,...,xn]
=
foldl (flip deltas) start [x1,x2,...,xn]
=
foldl (flip deltas) (deltas x1 start) [x2,...,xn]
=
case x1 of
SA -> foldl (flip deltas) (deltas SA start) [x2,...,xn]
SB -> foldl (flip deltas) (deltas SB start) [x2,...,xn]
SC -> foldl (flip deltas) (deltas SC start) [x2,...,xn]
Note that the first argument of foldl is always flip deltas, and the second argu-
ment is one of the six sets of states [S], [A], [B], [C], [B,S], [C,S] (the possible
results of deltas). Since there are only a finite number of results of deltas (six, to
be precise), we can define a transition function for each state:
nfaS, nfaA, nfaB, nfaC, nfaBS, nfaCS :: [SymbolMEX] -> [StateMEX]
For example,
nfaA [] = [A]
nfaA (x:xs) = case x of
                SA -> nfaBS xs
                _  -> error "no transition possible"
Each of these functions is a case expression over the possible input symbols. By
partially evaluating the function nfa we have obtained a function that is the imple-
mentation of the deterministic finite state automaton corresponding to the nondeter-
ministic finite state automaton.
8.2. Regular grammars
This section defines regular grammars, a special kind of context-free grammars. Sub-
section 8.2.1 gives the correspondence between nondeterministic finite-state automata
and regular grammars.
Definition 8.8 (Regular Grammar). A regular grammar G is a context-free grammar
(T, N, R, S) in which all production rules in R are of one of the following two forms:
A → xB
A → x
with x ∈ T∗ and A, B ∈ N. So in every rule there is at most one nonterminal, and
if there is a nonterminal present, it occurs at the end.
The regular grammars as defined here are sometimes called right-regular grammars.
There is a symmetric definition for left-regular grammars.
Definition 8.9 (Regular Language). A regular language is a language that is generated
by a regular grammar.
From the definition it is clear that each regular language is context-free. The question
is now: Is each context-free language regular? The answer is: No. There are context-
free languages that are not regular; an example of such a language is {aⁿbⁿ | n ∈ N}.
To understand this, you have to know how to prove that a language is not regular.
Because of its subtlety, we postpone this kind of proofs until Chapter 9. Here it
suffices to know that regular languages form a proper subset of context-free languages
and that we will profit from their speciality in the recognition process.
A first similarity between regular languages and context-free languages is that both
are closed under union, concatenation and Kleene-star.
Theorem 8.10. Let L and M be regular languages, then
L ∪ M is regular
LM is regular
L∗ is regular
Proof. Let GL = (T, NL, RL, SL) and GM = (T, NM, RM, SM) be regular grammars
for L and M respectively, then
• for regular grammars, the well-known union construction for context-free grammars
is a regular grammar again;
• we obtain a regular grammar for LM if we replace, in GL, each production of
the form T → x and T → ε by T → xSM and T → SM, respectively;
• since L∗ = {ε} ∪ LL∗, it follows from the above that there exists a regular
grammar for L∗.
In addition to these closure properties, regular languages are closed under intersection
and complement too. See the exercises. This is remarkable because context-free
languages are not closed under these operations. Recall the language L = L1 ∩ L2
where L1 = {aⁿbⁿcᵐ | n, m ∈ N} and L2 = {aⁿbᵐcᵐ | n, m ∈ N}.
As for context-free languages, there may exist more than one regular grammar for
a given regular language and these regular grammars may be transformed into each
other.
We conclude this section with a grammar transformation:
Theorem 8.11. For each regular grammar G there exists a regular grammar G′ with
start-symbol S′ such that
L(G) = L(G′)
and such that G′ has no productions of the form U → V and W → ε, with W ≠ S′.
In other words: every regular grammar can be transformed to a form where every
production has a nonempty terminal string in its right hand side (with a possible
exception for S′ → ε).
The proof of this transformation is omitted; we only briefly describe the construction
of such a regular grammar, and illustrate the construction with an example.
Given a regular grammar G, a regular grammar with the same language but without
productions of the form U → V and W → ε for all U, V, and all W ≠ S is obtained
as follows. First, consider all pairs Y, Z of nonterminals of G such that Y ⇒∗ Z.
Add productions Y → z to the grammar, with Z → z a production of the original
grammar, and z not a single nonterminal. Remove all productions U → V from
G. Finally, remove all productions of the form W → ε for W ≠ S, and for each
production U → xW add the production U → x. The following example illustrates
this construction.
Consider the following regular grammar G.
S → aA
S → bB
S → A
S → C
A → bB
A → S
A → ε
B → bB
B → ε
C → c
The grammar G′ of the desired form is constructed in 3 steps.
Step 1
Let G′ equal G.
Step 2
Consider all pairs of nonterminals Y and Z. If Y ⇒∗ Z, add the productions Y → z to
G′, with Z → z a production of the original grammar, and z not a single nonterminal.
Furthermore, remove all productions of the form U → V from G′. In the example
we remove the productions S → A, S → C, A → S, and we add the productions
S → bB and S → ε since S ⇒∗ A, and the production S → c since S ⇒∗ C, and the
productions A → aA and A → bB since A ⇒∗ S, and the production A → c since
A ⇒∗ C. We obtain the grammar with the following productions.
S → aA
S → bB
S → c
S → ε
A → bB
A → aA
A → c
A → ε
B → bB
B → ε
C → c
This grammar generates the same language as G, and has no productions of the form
U → V. It remains to remove productions of the form W → ε for W ≠ S.
Step 3
Remove all productions of the form W → ε for W ≠ S, and for each production
U → xW add the production U → x. Applying this transformation to the above
grammar gives the following grammar.
S → aA
S → bB
S → a
S → b
S → c
S → ε
A → bB
A → aA
A → a
A → b
A → c
B → bB
B → b
C → c
Each production in this grammar is of one of the desired forms: U → x or U → xV ,
and the language of the grammar G′ we thus obtain is equal to the language of
grammar G.
8.2.1. Equivalence of Regular grammars and Finite automata
In the previous section, we introduced finite-state automata. Here we show that
regular grammars and nondeterministic finite-state automata are two sides of one
coin.
We will prove the equivalence using theorem 8.7. The equivalence consists of two
parts, formulated in the theorems 8.12 and 8.13 below. The basis for both theorems
is the direct correspondence between a production A → xB and a transition
A --x--> B
Theorem 8.12 (Regular grammar for NFA). For each NFA M there exists a regular
grammar G such that
Lnfa(M) = L(G)
Proof. We will just sketch the construction; the formal proof can be found in the
literature. Let (X, Q, d, S, F) be a DFA for NFA M. Construct the grammar G =
(X, Q, R, S) where
• the terminals of the grammar are the input alphabet of the automaton;
• the nonterminals of the grammar are the states of the automaton;
• the start symbol of the grammar is the start state of the automaton;
• the productions of the grammar correspond to the automaton transitions:
a rule A → xB for each transition
A --x--> B
a rule A → ε for each accepting state A.
In formulae:
R = {A → xB | A, B ∈ Q, x ∈ X, d A x = B}
  ∪ {A → ε | A ∈ F}
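For instance (an illustration not spelled out in the text), applying this construction
to the example DFA M0 of subsection 8.1.1 yields the regular grammar with start
symbol S and productions S → aC, S → bA, S → cS, A → aB, B → cC, and C → ε.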
Theorem 8.13 (NFA for regular grammar). For each regular grammar G there exists
a nondeterministic finite-state automaton M such that
L(G) = Lnfa(M)
Proof. Again, we will just sketch the construction; the formal proof can be found in
the literature. The construction consists of two steps: first we give a direct translation
of a regular grammar to an automaton and then we transform the automaton into a
suitable shape.
From a grammar to an automaton. Let G = (T, N, R, S) be a regular grammar
without productions of the form U → V and W → ε for W ≠ S.
Construct NFA M = (X, Q, d, {S}, F) where
• The input alphabet of the automaton consists of the nonempty terminal strings (!)
that occur in the rules of the grammar:
X = {x ∈ T+ | A, B ∈ N, A → xB ∈ R}
  ∪ {x ∈ T+ | A ∈ N, A → x ∈ R}
• The states of the automaton are the nonterminals of the grammar extended
with a new state Nf.
Q = N ∪ {Nf}
• The transitions of the automaton correspond to the grammar productions:
for each rule A → xB we get a transition
A --x--> B
for each rule A → x with nonempty x, we get a transition
A --x--> Nf
In formulae: For all A ∈ N and x ∈ X:
d A x = {B | B ∈ N, A → xB ∈ R}
      ∪ {Nf | A → x ∈ R}
• The final states of the automaton are Nf and possibly S, if S → ε is a grammar
production.
F = {Nf} ∪ {S | S → ε ∈ R}
Lnfa(M) = L(G), because of the direct correspondence between derivation steps in
G and transitions in M.
Transforming the automaton to a suitable shape. There is a minor flaw in the
automaton given above: the grammar and the automaton have different alphabets.
This shortcoming will be remedied by an automaton transformation which yields an
equivalent automaton with transitions labelled by elements of T (instead of T∗). The
transformation is relatively easy and is depicted in the diagram below. In order to
eliminate transition d q x = q′ where x = x1 x2 ... xk+1 with k > 0 and xi ∈ T for
all i, add new (nonfinal) states p1, ..., pk to the existing ones and new transitions
d q x1 = p1, d p1 x2 = p2, ..., d pk xk+1 = q′.
[Diagram: the single transition q --x1 x2 ... xk+1--> q′ on the left is replaced on the right by the chain q --x1--> p1 --x2--> p2 --> ... --> pk --xk+1--> q′.]
It is intuitively clear that the resulting automaton is equivalent to the original one.
Carry out this transformation for each M-transition d q x = q′ with |x| > 1 in order
to get an automaton for G with the same input alphabet T.
8.3. Regular expressions
Regular expressions are a classical and convenient way to describe, for example, the
structure of terminal words. This section defines regular expressions, defines the
language of a regular expression, and shows that regular expressions and regular
grammars are equally expressive formalisms. We do not discuss implementations of
(datatypes and functions for matching) regular expressions; implementations can be
found in the literature [8, 6].
Definition 8.14 (RET, regular expressions over alphabet T). The set RET of regular
expressions over alphabet T is inductively defined as follows: for regular expressions
R, S

∅ ∈ RET
ε ∈ RET
a ∈ RET
R + S ∈ RET
RS ∈ RET
R∗ ∈ RET
(R) ∈ RET
where a ∈ T. The operator + is associative, commutative, and idempotent; the
concatenation operator, written as juxtaposition (so x concatenated with y is denoted
by xy), is associative, and ε is the unit of it. In formulae this reads, for all regular
expressions R, S, and U,
R + (S + U) = (R + S) + U
R + S = S + R
R + R = R
R(SU) = (RS)U
Rε = R (= εR)
Furthermore, the star operator, ∗, binds stronger than concatenation, and concate-
nation binds stronger than +. Examples of regular expressions are:
(bc)∗ +∅
ε + b(ε∗)
The language (i.e. the “semantics”) of a regular expression over T is a set of T-sequences,
compositionally defined on the structure of regular expressions, as follows.
Definition 8.15 (Language of a regular expression). Function Lre :: RET → {T∗}
returns the language of a regular expression. It is defined inductively by:

Lre(∅) = ∅
Lre(ε) = {ε}
Lre(b) = {b}
Lre(x + y) = Lre(x) ∪ Lre(y)
Lre(xy) = Lre(x) Lre(y)
Lre(x∗) = (Lre(x))∗

Since ∪ is associative, commutative, and idempotent, set concatenation is associative
with {ε} as its unit, and function Lre is well defined. Note that the language Lre(b∗)
is the set consisting of zero, one or more concatenations of b, i.e., Lre(b∗) = ({b})∗.
As an example of a language of a regular expression, we compute the language of the
regular expression (ε + bc)d.

Lre((ε + bc)d)
= (Lre(ε + bc)) (Lre(d))
= (Lre(ε) ∪ Lre(bc)) {d}
= ({ε} ∪ (Lre(b))(Lre(c))) {d}
= {ε, bc} {d}
= {d, bcd}
Regular expressions are used to describe the tokens of a language. For example, the
list
if p then e1 else e2
contains six tokens, three of which are identifiers. An identifier is an element in the
language of the regular expression
letter(letter + digit)∗
where
letter = a + b + ... + z + A + B + ... + Z
digit = 0 + 1 + ... + 9
see subsection 2.3.1.
In the beginning of this section we claimed that regular expressions and regular
grammars are equivalent formalisms. We will prove this claim later, but first we
illustrate the construction of a regular grammar out of a regular expression in an
example. Consider the following regular expression.
R = a∗ + ε + (a + b)∗
We aim at a regular grammar G such that Lre(R) = L(G) and again we take a
top-down approach.
Suppose that nonterminal A generates the language Lre(a∗), nonterminal B generates
the language Lre(ε), and nonterminal C generates the language Lre((a + b)∗). Sup-
pose furthermore that the productions for A, B, and C satisfy the conditions imposed
upon regular grammars. Then we obtain a regular grammar G with L(G) = Lre(R)
by defining
S → A
S → B
S → C
where S is the start-symbol of G. It remains to construct productions for nontermi-
nals A, B, and C.
• The nonterminal A with productions
A → aA
A → ε
generates the language Lre(a∗).
• Since Lre(ε) = {ε}, the nonterminal B with production
B → ε
generates the language {ε}.
• Nonterminal C with productions
C → aC
C → bC
C → ε
generates the language Lre((a + b)∗).
For a specific example it is not difficult to construct a regular grammar for a regular
expression. We now give the general result.
Theorem 8.16 (Regular Grammar for Regular Expression). For each regular ex-
pression R there exists a regular grammar G such that
Lre(R) = L(G)
The proof of this theorem is given in Section 8.4.
To obtain a regular expression that generates the same language as a given regular
grammar we go via an automaton. Given a regular grammar G, we can use the
theorems from the previous sections to obtain a DFA D such that
L(G) = Ldfa(D)
So if we can obtain a regular expression for a DFA D, we have found a regular ex-
pression for a regular grammar. To obtain a regular expression for a DFA D, we
interpret each state of D as a regular expression defined as the sum of the concate-
nation of outgoing terminal symbols with the resulting state. For our example DFA
we obtain:
S = aC + bA + cS
A = aB
B = cC
C = ε
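For instance (a worked sketch, not spelled out in the text), these equations can be
solved from the bottom up: C = ε gives B = cC = c and A = aB = ac, so
S = aC + bA + cS = a + bac + cS; removing the recursion with the rule
A = xA + z ≡ A = x∗z stated in Section 8.4.3 gives S = c∗(a + bac), a regular
expression for the language of the example DFA.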
It is easy to merge these four regular expressions into a single regular expression,
partially because this is a simple example. Merging the regular expressions obtained
from a DFA that may loop is more complicated, as we will briefly explain in the proof
of the following theorem. In general, we have:
Theorem 8.17 (Regular Expression for Regular Grammar). For each regular gram-
mar G there exists a regular expression R such that
L(G) = Lre(R)
The proof of this theorem is given in Section 8.4.
8.4. Proofs
This section contains the proofs of some of the theorems given in this chapter.
8.4.1. Proof of Theorem 8.7
Suppose M = (X, Q, d, Q0, F) is a nondeterministic finite-state automaton. Define
the finite-state automaton M′ = (X′, Q′, d′, Q′0, F′) as follows.

X′ = X
Q′ = subs Q

where subs returns the powerset of a set. For example,

subs {A, B} = {{ }, {A}, {A, B}, {B}}

For the other components of M′ we define

d′ q a = {t | t ∈ d r a, r ∈ q}
Q′0 = Q0
F′ = {p | p ∩ F ≠ ∅, p ∈ Q′}
We have

Lnfa(M)
=   definition of Lnfa
{w | w ∈ X∗, nfa accept w (d, Q0, F)}
=   definition of nfa accept
{w | w ∈ X∗, (nfa d Q0 w ∩ F) ≠ ∅}
=   definition of F′
{w | w ∈ X∗, nfa d Q0 w ∈ F′}
=   assume nfa d Q0 w = dfa d′ Q′0 w
{w | w ∈ X∗, dfa d′ Q′0 w ∈ F′}
=   definition of dfa accept
{w | w ∈ X∗, dfa accept w (d′, Q′0, F′)}
=   definition of Ldfa
Ldfa(M′)

It follows that Lnfa(M) = Ldfa(M′) provided

nfa d Q0 w = dfa d′ Q′0 w    (8.1)
We prove this equation as follows.
nfa d Q0 w
=   definition of nfa
foldl (deltas d) Q0 w
=   Q0 = Q′0; assume deltas d = d′
foldl d′ Q′0 w
=   definition of dfa
dfa d′ Q′0 w

So if

deltas d = d′

equation (8.1) holds. This equality follows from the following calculation.

d′ q a
=   definition of d′
{t | r ∈ q, t ∈ d r a}
=   definition of deltas
deltas d q a
8.4.2. Proof of Theorem 8.16
The proof of this theorem is by induction on the structure of regular expressions. For
the three base cases we argue as follows.
The regular grammar without productions generates the language Lre(∅). The regular
grammar with the production S → ε generates the language Lre(ε). The regular
grammar with production S → b generates the language Lre(b).
For the other three cases the induction hypothesis is that there exists a regular grammar
with start-symbol S1 that generates the language Lre(x), and a regular grammar
with start-symbol S2 that generates the language Lre(y).
We obtain a regular grammar with start-symbol S that generates the language
Lre(x + y) by defining
S → S1
S → S2
We obtain a regular grammar with start-symbol S that generates the language
Lre(xy) by replacing, in the regular grammar that generates the language Lre(x),
each production of the form T → a and T → ε by T → aS2 and T → S2, respectively.
We obtain a regular grammar with start-symbol S that generates the language
Lre(x∗) by replacing, in the regular grammar that generates the language Lre(x),
each production of the form T → a and T → ε by T → aS and T → S, and by
adding the productions S → S1 and S → ε, where S1 is the start-symbol of the
regular grammar that generates the language Lre(x).
8.4.3. Proof of Theorem 8.17
In sections 8.1.4 and 8.2.1 we have shown that there exists a DFA D = (X, Q, d, S, F)
such that
L(G) = Ldfa(D)
So, if we can show that there exists a regular expression R such that
Ldfa(D) = Lre(R)
then the theorem follows.
Let D = (X, Q, d, S, F) be a DFA such that L(G) = Ldfa(D). We define a regular
expression R such that
Lre(R) = Ldfa(D)
For each state q ∈ Q we define a regular expression q̄, and we let R be S̄. We obtain
the definition of q̄ by combining all pairs c and C such that d q c = C.

q̄ = if q ∉ F
    then foldl (+) ∅ [cC | d q c = C]
    else ε + foldl (+) ∅ [cC | d q c = C]
This gives a set of possibly mutually recursive equations, which we have to solve. In
solving these equations we use the fact that concatenation distributes over the sum
operator:
z(x +y) = zx +zy
and that recursion can be removed by means of the star operator ∗:
A = xA + z (where A ∉ z) ≡ A = x∗z
The algorithm for solving such a set of equations is omitted.
We prove Ldfa(D) = Lre(S̄).
Ldfa(D)
=   definition of Ldfa
{w | w ∈ X∗, dfa accept w (d, S, F)}
=   definition of dfa accept
{w | w ∈ X∗, (dfa d S w) ∈ F}
=   definition of dfa
{w | w ∈ X∗, (foldl d S w) ∈ F}
=   assumption
{w | w ∈ Lre(S̄)}
=   equality for set-comprehensions
Lre(S̄)

It remains to prove the assumption in the above calculation: for w ∈ X∗,

(foldl d S w) ∈ F ≡ w ∈ Lre(S̄)

We prove a generalisation of this equation, namely, for arbitrary q,

(foldl d q w) ∈ F ≡ w ∈ Lre(q̄)

This equation is proved by induction on the length of w. For the base case w = ε we
calculate as follows.

(foldl d q ε) ∈ F
≡   definition of foldl
q ∈ F
≡   definition of q̄, E abbreviates the fold expression
q̄ = ε + E
≡   definition of Lre, definition of q̄
ε ∈ Lre(q̄)

The induction hypothesis is that for all lists w with |w| ≤ n we have (foldl d q w) ∈
F ≡ w ∈ Lre(q̄). Suppose ax is a list of length n+1.

(foldl d q (ax)) ∈ F
≡   definition of foldl
(foldl d (d q a) x) ∈ F
≡   induction hypothesis
x ∈ Lre(q̄′), where q′ = d q a
≡   definition of q̄, D is deterministic
ax ∈ Lre(q̄)
Summary
This chapter discusses methods for recognising sentences from regular languages, and
introduces several concepts related to describing and recognising regular languages.
Regular languages are used for describing simple languages like ‘the language of
identifiers’ and ‘the language of keywords’ and regular expressions are convenient for
the description of regular languages. The straightforward translation of a regular
expression into a recogniser for the language of that regular expression results in a
recogniser that is often very inefficient. By means of (non)deterministic finite-state
automata we construct a recogniser that requires time linear in the length of the
input list for recognising an input list.
8.5. Exercises
Exercise 8.1. Given a regular grammar G for language L, construct a regular grammar for
L∗.
Exercise 8.2. Transform the grammar with the following productions to a grammar without
productions of the form U → V and W → ε with W ≠ S.
S → aA
S → A
A → aS
A → B
B → C
B → ε
C → cC
C → a
Exercise 8.3. Suppose that the state transition function d in the definition of a nondeter-
ministic finite-state automaton has the following type
d :: {Q} → X → {Q}
Function d takes a set of states V and an element a, and returns the set of states that are
reachable from V with an arc labelled a. Define a function ndfsa of type
({Q} → X → {Q}) → {Q} → X∗ → {Q}
which given a function d, a set of start states, and an input list, returns the set of states in
which the nondeterministic finite-state automaton can end after reading the input list.
Exercise 8.4. Prove the converse of Theorem 8.7: show that for every deterministic finite-
state automaton M there exists a nondeterministic finite-state automaton M′ such that
Ldfa(M) = Lnfa(M′)
Exercise 8.5. Regular languages are closed under complementation. Prove this claim. Hint:
construct a finite automaton for the complement of L out of an automaton for regular language L.
Exercise 8.6. Regular languages are closed under intersection.
1. Prove this claim using the result from the previous exercise.
2. A direct proof of this claim is the following:
Let M1 = (X, Q1, d1, S1, F1) and M2 = (X, Q2, d2, S2, F2) be DFA's for the regular
languages L1 and L2 respectively. Define the (product) automaton
M = (X, Q1 × Q2, d, (S1, S2), F1 × F2) by d (q1, q2) x = (d1 q1 x, d2 q2 x).
Now prove that Ldfa(M) = Ldfa(M1) ∩ Ldfa(M2)
Exercise 8.7. Define nondeterministic finite-state automata that accept languages equal to
the languages of the following regular grammars.
1.
S → (A
S → ε
S → )A
A → )
A → (
2.
S → 0A
S → 0B
S → 1A
A → 1
A → 0
B → 0
B → ε
Exercise 8.8. Describe the language of the following regular expressions.
1. ε + b(ε∗)
2. (bc)∗ +∅
3. a(b∗) + c∗
Exercise 8.9. Prove that for arbitrary regular expressions R, S, and T the following equiv-
alences hold.
Lre(R(S +T)) = Lre(RS +RT)
Lre((R +S)T) = Lre(RT +ST)
Exercise 8.10. Give regular expressions S and R such that
Lre(RS) = Lre(SR)
Lre(RS) ≠ Lre(SR)
Exercise 8.11. Give regular expressions V and W, with Lre(V) ≠ Lre(W), such that for
all regular expressions R and S with S ≠ ∅
Lre(R(S + V)) = Lre(R(S + W))
V and W may be expressed in terms of R and S.
Exercise 8.12. Give a regular expression for the language that consists of all lists of zeros
and ones such that the segment 01 occurs nowhere in a list. Examples of sentences of this
language are 1110, and 000.
Exercise 8.13. Give regular grammars that generate the language of the following regular
expressions.
1. ((a + bb)∗ + c)∗
2. a∗ + b∗ + ab
Exercise 8.14. Give regular expressions of which the language equals the language of the
following regular grammars.
1.
S → bA
S → aC
S → ε
A → bA
A → ε
B → aC
B → bB
B → ε
C → bB
C → b
2.
S → 0S
S → 1T
S → ε
T → 0T
T → 1S
Exercise 8.15. Construct for each of the following regular expressions a nondeterministic
finite-state automaton that accepts the sentences of the language. Transform the nondeter-
ministic finite-state automata into deterministic finite-state automata.
1. a∗ + b∗ + (ab)
2. (1 + (12) + 0)∗(30)∗
Exercise 8.16. Define regular expressions for the languages of the following deterministic
finite-state automata.
1. Start state is S.
[Diagram of a deterministic finite-state automaton over the alphabet {a, b} with states S, A and B, where S and B are accepting; the picture cannot be reproduced here.]
2. Start state is S.
[Diagram of a deterministic finite-state automaton over the alphabet {a, b} with states S, A and B, where S is accepting; the picture cannot be reproduced here.]
9. Pumping Lemmas: the expressive power of languages
Introduction
In these lecture notes we have presented several ways to show that a language is
regular or context-free, but until now we did not give any means to show the non-
regularity or noncontext-freeness of a language. In this chapter we fill this gap by
introducing the so-called Pumping Lemmas. For example, the pumping lemma for
regular languages says
IF language L is regular,
THEN it has the following property P: each sufficiently long sentence w ∈ L has
a substring that can be repeated any number of times, every time yielding
another word of L
In applications, pumping lemmas are used in the contrapositive way. In the regular
case this means that one may conclude that L is not regular, if P does not hold.
Although the ideas behind pumping lemmas are very simple, a precise formulation
is not. As a consequence, it takes some effort to get familiar with applying pumping
lemmas. Regular grammars and context-free grammars are part of the Chomsky
hierarchy, which consists of four different kinds of grammars and their corresponding
languages. Pumping lemmas are used to show that the expressive power of the
different elements of the Chomsky hierarchy is different.
Goals
After you have studied this chapter you will be able to
• prove that a language is not regular;
• prove that a language is not context-free;
• identify languages and grammars as regular, context-free or none of these;
• give examples of languages that are not regular, and/or not context-free;
• explain the Chomsky hierarchy.
9.1. The Chomsky hierarchy
In the preceding chapters we have seen context-free grammars and regular grammars.
You may now wonder: is it possible to express any language with these grammars?
And: is it possible to obtain any context-free language from a regular grammar? The
answer to these questions is no. The Chomsky hierarchy explains why the answer
is no. The Chomsky hierarchy consists of four elements, each of which is explained
below.
9.1.1. Type-0 grammars
The most powerful grammars are the type-0 grammars, in which a production has
the form φ → ψ, where φ ∈ V⁺ and ψ ∈ V*, and where V is the set of symbols of the
grammar. So the left-hand side of a production may consist of a list of nonterminal
and terminal symbols, instead of a single nonterminal as in context-free grammars.
Type-0 grammars have the same expressive power as Turing machines, and the lan-
guages described by these grammars are the recursively enumerable languages. This
expressive power comes at a cost though: it is very difficult to parse sentences from
type-0 grammars.
9.1.2. Type-1 grammars
We can slightly restrict the form of the productions to obtain type-1 grammars. In
a type-1, or context-sensitive, grammar, each production has the form φAψ → φδψ,
where φ, ψ ∈ V* and δ ∈ V⁺. So a production describes how to rewrite a nonterminal
A, in the context of lists of symbols φ and ψ. A language generated by a context-
sensitive grammar is called a context-sensitive language. Although context-sensitive
grammars are less expressive than type-0 grammars, parsing is still very difficult for
context-sensitive grammars.
9.1.3. Type-2 grammars
The type-2 grammars are the context-free grammars which you have seen a lot in the
preceding chapters. As the name says, in a context-free grammar you can rewrite
nonterminals without looking at the context in which they appear. Actually, it is
impossible to look at the context when rewriting symbols. Context-free grammars
are less expressive than context-sensitive grammars. This statement can be proved
using the pumping lemma for context-free languages.
However, it is much easier to parse sentences from context-free languages: any sentence
of length n of a context-free language can be parsed in time at most O(n³) (or even
a bit less than this). And if we put some more restrictions
on context-free grammars (for example LL(1)), we obtain linear-time algorithms for
parsing sentences of such grammars.
9.1.4. Type-3 grammars
The type-3 grammars in the Chomsky hierarchy are the regular grammars. Any sen-
tence from a regular language can be processed by means of a finite-state automaton,
which takes linear time and constant space in the size of its input. The set of regular
languages is strictly smaller than the set of context-free languages, a fact we will
prove below by means of the pumping lemma for regular languages.
9.2. The pumping lemma for regular languages
In this section we give the pumping lemma for regular languages. The lemma gives
a property that is satisfied by all regular languages. The property is a statement
of the form: in sentences longer than a certain length a substring can be identified
that can be duplicated while retaining a sentence. The idea behind this property
is simple: regular languages are accepted by finite automata. Given a DFA for a
regular language, a sentence of the language describes a path from the start state to
some final state. When the length of such a sentence exceeds the number of states,
then at least one state is visited twice; consequently the path contains a cycle that
can be repeated as often as desired. The proof of the following lemma is given in
Section 9.4.
Theorem 9.1 (Regular Pumping Lemma). Let L be a regular language. Then
there exists n ∈ N :
for all x, y, z : xyz ∈ L and |y| ≥ n :
there exist u, v, w : y = uvw and |v| > 0 :
for all i ∈ N : xuv^i wz ∈ L
Note that |y| denotes the length of the string y. Also remember that ‘for all x ∈ X : ’
is true if X = ∅, and ‘there exists x ∈ X : ’ is false if X = ∅.
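To make the notation xuv^i wz concrete, here is a small Haskell sketch; the function pump below is ours and is not part of these notes. Given a decomposition of a sentence and a repetition count i, it builds the pumped sentence.

-- Build x ++ u ++ v^i ++ w ++ z for a given decomposition.
pump :: String -> String -> String -> String -> String -> Int -> String
pump x u v w z i = x ++ u ++ concat (replicate i v) ++ w ++ z

For the example automaton discussed next, pump "" "a" "bca" "" "bcd" 3 yields "abcabcabcabcd", that is, a(bca)^3 bcd.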
For example, consider the following automaton.
[Figure (not reproduced): a deterministic finite-state automaton with states S, A, B, C, and D (accepting); transitions S → A on a, A → B on b, B → C on c, C → A on a, and C → D on d.]
This automaton accepts abcabcd, abcabcabcd, and, in general, a(bca)*bcd. The
statement of the pumping lemma amounts to the following. Take for n the number
of states in the automaton (5). Let x, y, z be such that xyz ∈ L and |y| ≥ n.
Then we know that in order to accept y, the above automaton has to pass at least
twice through state A. The part that is accepted in between the two moments the
automaton passes through state A can be pumped up to create sentences that contain
an arbitrary number of copies of the string v = bca.
This pumping lemma is useful in showing that a language does not belong to the
family of regular languages. Its application is typical of pumping lemmas in general;
they are used negatively to show that a given language does not belong to some
family.
Theorem 9.1 enables us to prove that a language L is not regular by showing that
for all n ∈ N :
there exist x, y, z : xyz ∈ L and |y| ≥ n :
for all u, v, w : y = uvw and |v| > 0 :
there exists i ∈ N : xuv^i wz ∉ L
In all applications of the pumping lemma in this chapter, this is the formulation we
will use.
Note that if n = 0, we can choose y = ε, and since there is no v with |v| > 0 such
that y = uvw, the statement above holds for all such v (namely none!).
As an example, we will prove that the language L = {a^m b^m | m ≥ 0} is not regular.
Let n ∈ N.
Take s = a^n b^n with x = ε, y = a^n, and z = b^n.
Let u, v, w be such that y = uvw with v ≠ ε, that is, u = a^p, v = a^q and w = a^r with
p + q + r = n and q > 0.
Take i = 2, then
xuv^2 wz ∉ L
⇐ defn. x, u, v, w, z, calculus
a^(p+2q+r) b^n ∉ L
⇐ p + q + r = n
n + q ≠ n
⇐ arithmetic
q > 0
⇐ q > 0
true
Note that the language L = {a^m b^m | m ≥ 0} is context-free, and together with the
fact that each regular grammar is also a context-free grammar it follows immedi-
ately that the set of regular languages is strictly smaller than the set of context-free
languages.
Note that here we use the pumping lemma (and not the proof of the pumping lemma)
to prove that a language is not regular. This kind of proof can be viewed as a kind of
game: ‘for all’ is about an arbitrary element which can be chosen by the opponent;
‘there exists’ is about a particular element which you may choose. Choosing the right
elements helps you ‘win’ the game, where winning means proving that a language is
not regular.
Exercise 9.1. Prove that the following language is not regular
{a^(k²) | k ≥ 0}
Exercise 9.2. Show that the following language is not regular.
{x | x ∈ {a, b}^* ∧ nr a x < nr b x}
where nr a x is the number of occurrences of a in x.
Exercise 9.3. Prove that the following language is not regular
{a^k b^m | k ≤ m ≤ 2k}
Exercise 9.4. Show that the following language is not regular.
{a^k b^l a^m | k > 5 ∧ l > 3 ∧ m ≤ l}
9.3. The pumping lemma for context-free languages
The Pumping Lemma for context-free languages gives a property that is satisfied by
all context-free languages. This property is a statement of the form: in sentences ex-
ceeding a certain length, two sublists of bounded length can be identified that can be
duplicated while retaining a sentence. The idea behind this property is the following.
Context-free languages are described by context-free grammars. For each sentence
in the language there exists a derivation tree. When sentences have a derivation tree
that is higher than the number of nonterminals, then at least one nonterminal will
occur twice on a path of that tree; consequently a subtree can be inserted as often as desired.
As an example of an application of the Pumping Lemma, consider the context-free
grammar with the following productions.
S → aAb
A → cBd
A → e
B → fAg
The following parse tree represents the derivation of the sentence acfegdb.
[Parse tree (not reproduced): the root S has children a, A, and b; this A has children c, B, and d; B has children f, A, and g; the inner A has the single child e.]
If we replace the subtree rooted by the lower occurrence of nonterminal A by the
subtree rooted by the upper occurrence of A, we obtain the following parse tree.
[Parse tree (not reproduced): as above, but the inner A now again has children c, B, and d, where B has children f, A, and g, and this lowest A has the single child e.]
This parse tree represents the derivation of the sentence acfcfegdgdb. Thus we ‘pump’
the derivation of sentence acfegdb to the derivation of sentence acfcfegdgdb. Repeat-
ing this step once more, we obtain a parse tree for the sentence
acfcfcfegdgdgdb
We can repeatedly apply this process to obtain derivation trees for all sentences of
the form
a(cf)^i e(gd)^i b
for i ≥ 0. The case i = 0 is obtained if we replace in the parse tree for the sentence
acfegdb the subtree rooted by the upper occurrence of nonterminal A by the subtree
rooted by the lower occurrence of A:
[Parse tree (not reproduced): S has children a, A, and b, and A has the single child e.]
This is a derivation tree for the sentence aeb. This step can be viewed as a negative
pumping step.
The proof of the following lemma is given in Section 9.4.
Theorem 9.2 (Context-free Pumping Lemma). Let L be a context-free language.
Then
there exist c, d : c, d ∈ N :
for all z : z ∈ L and |z| ≥ c :
there exist u, v, w, x, y : z = uvwxy and |vx| > 0 and |vwx| ≤ d :
for all i ∈ N : uv^i wx^i y ∈ L
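The notation uv^i wx^i y can be made concrete in the same way as for the regular case; the function pumpCF below is ours and is not part of these notes.

-- Build u ++ v^i ++ w ++ x^i ++ y for a given five-way decomposition.
pumpCF :: String -> String -> String -> String -> String -> Int -> String
pumpCF u v w x y i = u ++ rep v ++ w ++ rep x ++ y
  where rep s = concat (replicate i s)

For the grammar of the example above, pumpCF "a" "cf" "e" "gd" "b" 2 yields "acfcfegdgdb".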
The Pumping Lemma is a tool with which we prove that a given language is not
context-free. The proof obligation is to show that the property shared by all context-
free languages does not hold for the language under consideration.
Theorem 9.2 enables us to prove that a language L is not context-free by showing
that
for all c, d : c, d ∈ N :
there exists z : z ∈ L and |z| ≥ c :
for all u, v, w, x, y : z = uvwxy and |vx| > 0 and |vwx| ≤ d :
there exists i ∈ N : uv^i wx^i y ∉ L
As an example, we will prove that the language T defined by
T = {a^n b^n c^n | n > 0}
is not context-free.
Proof. Let c, d ∈ N.
Take z = a^r b^r c^r with r = max(c, d).
Let u, v, w, x, y be such that z = uvwxy, |vx| > 0 and |vwx| ≤ d.
Note that our choice for r guarantees that substring vwx has one of the following
shapes:
• vwx consists of just a’s, or just b’s, or just c’s.
• vwx contains both a’s and b’s, or both b’s and c’s.
So vwx does not contain a’s, b’s, and c’s.
Take i = 0, then
• If vwx consists of just a’s, or just b’s, or just c’s, then it is impossible to write
the string uwy as a^s b^s c^s for some s, since only the number of terminals of one
kind is decreased.
• If vwx contains both a’s and b’s, or both b’s and c’s, it lies somewhere on the
border between a’s and b’s, or on the border between b’s and c’s. Then the
string uwy can be written as
uwy = a^s b^t c^r
uwy = a^r b^p c^q
for some s, t, p, q, respectively. At least one of s and t or of p and q is less than
r. Again this list is not an element of T.
Exercise 9.5. Why does vwx not contain a’s, b’s, and c’s?
Exercise 9.6. Prove that the following language is not context-free
{a^(k²) | k ≥ 0}
Exercise 9.7. Prove that the following language is not context-free
{a^i | i is a prime number}
Exercise 9.8. Prove that the following language is not context-free
{ww | w ∈ {a, b}^*}
9.4. Proofs of pumping lemmas
This section gives the proofs of the pumping lemmas.
9.4.1. Proof of the Regular Pumping Lemma, Theorem 9.1
Since L is a regular language, there exists a deterministic finite-state automaton D
such that L = Ldfa D.
Take for n the number of states of D.
Let s be an element of L with a sublist y such that |y| ≥ n, say s = xyz.
Consider the sequence of states D passes through while processing y. Since |y| ≥ n,
this sequence has more than n entries, hence at least one state, say state A, occurs
twice.
Take u, v, w as follows
• u is the initial part of y processed until the first occurrence of A,
• v is the (nonempty) part of y processed from the first to the second occurrence
of A,
• w is the remaining part of y
Note that D could have skipped processing v, and hence would have accepted xuwz .
Furthermore, D can repeat the processing in between the first occurrence of A and
the second occurrence of A as often as desired, and hence it accepts xuv^i wz for all
i ∈ N. Formally, a simple proof by induction shows that (∀i : i ≥ 0 : xuv^i wz ∈ L).
9.4.2. Proof of the Context-free Pumping Lemma, Theorem 9.2
Let G = (T, N, R, S) be a context-free grammar such that L = L(G). Let m be the
length of the longest right-hand side of any production, and let k be the number of
nonterminals of G.
Take c = m^k. In Lemma 9.3 below we prove that if z is a list with |z| > c, then in
all derivation trees for z there exists a path of length at least k+1.
Let z ∈ L such that |z| > c. Since grammar G has k nonterminals, there is at least
one nonterminal that occurs more than once in a path of length k+1 (which contains
k+2 symbols, of which at most one is a terminal, and all others are nonterminals)
of a derivation tree for z. Consider the nonterminal A that satisfies the following
requirements.
• A occurs at least twice in the path of length k+1 of a derivation tree for z.
Call the list corresponding to the derivation tree rooted at the lower A w, and call
the list corresponding to the derivation tree rooted at the upper A (which contains
the list w) vwx.
• A is chosen such that at most one of v and x equals ε.
• Finally, we suppose that below the upper occurrence of A no other nonterminal
that satisfies the same requirements occurs, that is, the two A’s are the lowest
pair of nonterminals satisfying the above requirements.
First we show that a nonterminal A satisfying the above requirements exists. We
prove this statement by contradiction. Suppose that for all nonterminals A that
occur twice on a path of length at least k+1 both v and x, the border lists of the list
vwx corresponding to the tree rooted at the upper occurrence of A, are both equal to
ε. Then we can replace the tree rooted at the upper A by the tree rooted at the lower
A without changing the list corresponding to the tree. Thus we can replace all paths
of length at least k+1 by a path of length at most k. But this contradicts Lemma
9.3 below. It follows that a nonterminal A satisfying
the above requirements exists.
There exists a derivation tree for z in which the path from the upper A to the leaf
has length at most k+1, since either below A no nonterminal occurs twice, or there is
one or more nonterminal B that occurs twice, but the border lists v′ and x′ from the
list v′w′x′ corresponding to the tree rooted at the upper occurrence of nonterminal
B are empty. Since the lists v′ and x′ are empty, we can replace the tree rooted at
the upper occurrence of B by the tree rooted at the lower occurrence of B without
changing the list corresponding to the derivation tree. Since we can do this for all
nonterminals B that occur twice below the upper occurrence of A, there exists a
derivation tree for z in which the path from the upper A to the leaf has length at
most k+1. It follows from Lemma 9.3 below that the length of vwx is at most m^(k+1),
so we define d = m^(k+1).
Suppose z = uvwxy, that is, the list corresponding to the subtree to the left (right)
of the upper occurrence of A is u (y). This situation is depicted as follows.
[Figure (not reproduced): a derivation tree with root S. The subtrees to the left and right of the upper occurrence of A yield u and y; the upper A yields vwx, and the lower occurrence of A inside it yields w, flanked by v on the left and x on the right.]
We prove by induction that (∀i : i ≥ 0 : uv^i wx^i y ∈ L). In this proof we apply the
tree substitution process described in the example before the lemma. For i = 0 we
have to show that the list uwy is a sentence in L. The list uwy is obtained if the
tree rooted at the upper A is replaced by the tree rooted at the lower A. Suppose
that for all i ≤ n we have uv^i wx^i y ∈ L. The list uv^(i+1) wx^(i+1) y is obtained if the
tree rooted at the lower A in the derivation tree for uv^i wx^i y is replaced by the tree
rooted at the upper A. This proves the induction step.
The proof of the Pumping Lemma above frequently refers to the following lemma.
Lemma 9.3. Let G be a context-free grammar, and suppose that the longest right-
hand side of any production has length m. Let t be a derivation tree for a sentence
z ∈ L(G). If height t ≤ j, then |z| ≤ m^j.
Proof. We prove a slightly stronger result: if t is a derivation tree for a list z, but
the root of t is not necessarily the start-symbol, and height t ≤ j, then |z| ≤ m^j. We
prove this statement by induction on j.
For the base case, suppose j = 1. Then tree t corresponds to a single production in
G, and since the longest right-hand side of any production has length m, we have
that |z| ≤ m = m^j.
For the induction step, assume that for all derivation trees t of height at most j
we have that |z| ≤ m^j, where z is the list corresponding to t. Suppose we have
a tree t of height j+1. Let A be the root of t, and suppose the top of the tree
corresponds to the production A → v in G. For all trees s rooted at the symbols of
v we have height s ≤ j, so the induction hypothesis applies to these trees, and the
lists corresponding to the trees rooted at the symbols of v all have length at most
m^j. Since A → v is a production of G, and since the longest right-hand side of any
production has length m, the list corresponding to the tree rooted at A has length
at most m·m^j = m^(j+1), which proves the induction step.
Summary
This chapter introduces pumping lemmas. Pumping lemmas are used to prove that
languages are not regular or not context-free.
9.5. Exercises
Exercise 9.9. Show that the following language is not regular.
{x | x ∈ {0, 1}^* ∧ nr 1 x = nr 0 x}
where function nr takes an element a and a list x, and returns the number of occurrences of
a in x.
Exercise 9.10. Consider the following language:
{a^i b^j | 0 ≤ i ≤ j}
1. Is this language context-free? If it is, give a context-free grammar and prove that this
grammar generates the language. If it is not, why not?
2. Is this language regular? If it is, give a regular grammar and prove that this grammar
generates the language. If it is not, why not?
Exercise 9.11. Consider the following language:
{wcw | w ∈ {a, b}^*}
1. Is this language context-free? If it is, give a context-free grammar and prove that this
grammar generates the language. If it is not, why not?
2. Is this language regular? If it is, give a regular grammar and prove that this grammar
generates the language. If it is not, why not?
Exercise 9.12. Consider the grammar G with the following productions.
S → {A}
S → ε
A → S
A → AA
A → {}
1. Is this grammar
• Context-free?
• Regular?
Why?
2. Give the language of G without referring to G itself. Prove that your description is
correct.
3. Is the language of G
• Context-free?
• Regular?
Why?
Exercise 9.13. Consider the grammar G with the following productions.
S → ε
S → 0
S → 1
S → S0
1. Is this grammar
• Context-free?
• Regular?
Why?
2. Give the language of G without referring to G itself. Prove that your description is
correct.
3. Is the language of G
• Context-free?
• Regular?
Why?
10. LL Parsing
Introduction
This chapter introduces LL(1) parsing. LL(1) parsing is an efficient (linear in the
length of the input string) method for parsing that can be used for all LL(1) gram-
mars. A grammar is LL(1) if at each step in a derivation the next symbol in the input
uniquely determines the production that should be applied. In order to determine
whether or not a grammar is LL(1), we introduce several kinds of grammar analyses,
such as determining whether or not a nonterminal can derive the empty string, and
determining the set of symbols that can appear as the first symbol in a derivation
from a nonterminal.
Goals
After studying this chapter you will
• know the definition of LL(1) grammars;
• know how to parse a sentence from an LL(1) grammar;
• be able to apply different kinds of grammar analyses in order to determine
whether or not a grammar is LL(1).
This chapter is organised as follows. Section 10.1 describes the background of LL(1)
parsing, and Section 10.2 describes an implementation in Haskell of LL(1) parsing
and the different kinds of grammar analyses needed for checking whether or not a
grammar is LL(1).
10.1. LL Parsing: Background
In the previous chapters we have shown how to construct parsers for sentences of
context-free languages using combinator parsers. Since these parsers may backtrack,
the resulting parsers are sometimes a bit slow. There are several ways in which we
can put extra restrictions on context-free grammars such that we can parse sentences
of the corresponding languages efficiently. This chapter discusses one such restriction:
LL(1). Other restrictions, not discussed in these lecture notes are LR(1), LALR(1),
SLR(1), etc.
10.1.1. A stack machine for parsing
This section presents a stack machine for parsing sentences of context-free grammars.
We will use this machine in the following subsections to illustrate why we need
grammar analysis.
The stack machine we use in this section differs from the stack machines introduced
in Sections 5.4.4 and 7.2. A stack machine for a grammar G has a stack and an input,
and performs one of the following two actions.
1. Expand: If the top stack symbol is a nonterminal, it is popped from the stack
and a right-hand side from a production of G for the nonterminal is pushed onto
the stack. The production is chosen nondeterministically.
2. Match: If the top stack symbol is a terminal, then it is popped from the stack
and compared with the next symbol of the input sequence. If they are equal,
then this terminal symbol is ‘read’. If the stack symbol and the next input
symbol do not match, the machine signals an error, and the input sentence
cannot be accepted.
These actions are performed until the stack is empty. A stack machine for G accepts
an input if it can terminate with an empty input when starting with the start-symbol
from G on the stack.
For example, let grammar G be the grammar with the productions:
S → aS | cS | b
The stack machine for G accepts the input string aab because it can perform the
following actions (the first component of the state (before the | in the picture below) is
the symbol stack and the second component of the state (after the |) is the unmatched
(remaining part of the) input string):
stack input
S aab
aS aab
S ab
aS ab
S b
b b
and end with an empty input. However, if the machine had chosen the production
S → cS in its first step, it would have been stuck. So not all possible sequences
of actions from the state (S,aab) lead to an empty input. If there is at least one
sequence of actions that ends with an empty input on an input string, the input
string is accepted. In this sense, the stack machine is similar to a nondeterministic
finite-state automaton.
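The behaviour of this nondeterministic stack machine can be sketched directly in Haskell. The code below is ours (it is not part of these notes); it assumes, as in the Symbol Char instance used later in this chapter, that nonterminals are capital letters, and it may loop on left-recursive grammars.

import Data.Char (isUpper)

-- A production maps a nonterminal to one of its right-hand sides.
type Production = (Char, String)

-- accepts prods s input: can some sequence of expand and match actions,
-- starting with s on the stack, consume the whole input?
accepts :: [Production] -> Char -> String -> Bool
accepts prods s input = go [s] input
  where
    go [] rest = null rest                      -- empty stack: accept iff no input left
    go (top:stack) rest
      | isUpper top =                           -- expand: try every production for top
          or [ go (rhs ++ stack) rest | (nt, rhs) <- prods, nt == top ]
      | (t:ts) <- rest, t == top = go stack ts  -- match: read one terminal
      | otherwise = False                       -- mismatch, or input exhausted

For the example grammar above, accepts [('S',"aS"),('S',"cS"),('S',"b")] 'S' "aab" evaluates to True.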
10.1.2. Some example derivations
This section gives three examples in which the stack machine for parsing is applied.
It turns out that, for all three examples, the nondeterministic stack machine can act
in a deterministic way by looking ahead one (or two) symbols of the sequence of input
symbols. Each of the examples exemplifies why different kinds of grammar analyses
are useful in parsing.
The first example
Our first example is gramm1. The set of terminal symbols of gramm1 is ¦a, b, c¦, the
set of nonterminal symbols is ¦S, A, B, C¦, the start symbol is S; and the productions
are
S → cA | b
A → cBC | bSA | a
B → cc | Cb
C → aS | ba
We want to know whether or not the string ccccba is a sentence of the language of
this grammar. The stack machine produces, amongst others, the following sequence,
corresponding with a leftmost derivation of ccccba.
stack input
S ccccba
cA ccccba
A cccba
cBC cccba
BC ccba
ccC ccba
cC cba
C ba
ba ba
a a
Starting with S the machine chooses between two productions:
S → cA | b
but, since the first symbol of the string ccccba to be recognised is c, the only ap-
plicable production is the first one. After expanding S a match-action removes the
leading c from ccccba and cA. So now we have to derive the string cccba from A.
The machine chooses between three productions:
A → cBC | bSA | a
and again, since the next symbol of the remaining string cccba to be recognised is
c, the only applicable production is the first one. After expanding A a match-action
removes the leading c from cccba and cBC. So now we have to derive the string
ccba from BC. The top stack symbol of BC is B. The machine chooses between
two productions:
B → cc | Cb
The next symbol of the remaining string ccba to be recognised is, once again, c. The
first production is applicable, but the second production may be applicable as well.
To decide whether it also applies we have to determine the symbols that can appear
as the first element of a string derived from B starting with the second production.
The first symbol of the alternative Cb is the nonterminal C. From the productions
C → aS | ba
it is immediately clear that a string derived from C starts with either an a or a b.
The set ¦a, b¦ is called the first set of the nonterminal C. Since the next symbol
in the remaining string to be recognised is a c, the second production cannot be
applied. After expanding B and performing two match-actions it remains to derive
the string ba from C. The machine chooses between two productions C → aS and
C → ba. Clearly, only the second one applies, and, after two match-actions, leads to
success.
From the above derivation we conclude the following.
Deriving the sentence ccccba using gramm1 is a deterministic computa-
tion: at each step of the derivation there is only one applicable alternative
for the nonterminal on top of the stack.
Determinism is obtained by looking at the set of firsts of the nonterminals.
The second example
A second example is the grammar gramm2 whose productions are
S → abA | aa
A → bb | bS
Now we want to know whether or not the string abbb is a sentence of the language of
this grammar. The stack machine produces, amongst others, the following sequence
stack input
S abbb
abA abbb
bA bbb
A bb
bb bb
b b
Starting with S the machine chooses between two productions:
S → abA | aa
Since both alternatives start with an a, it is not sufficient to look at the first symbol a
of the string to be recognised. The problem is that the lookahead sets (the lookahead
set of a production N → α is the set of terminal symbols that can appear as the first
symbol of a string that can be derived from N starting with the production N → α,
the definition is given in the following subsection) of the two productions for S both
contain a. However, if we look at the first two symbols ab, then we find that the
only applicable production is the first one. After expanding and matching it remains
to derive the string bb from A. Again, looking ahead one symbol in the input string
does not give sufficient information for choosing one of the two productions
A → bb | bS
for A. If we look at the first two symbols bb of the input string, then we find that the
first production applies (and, after matching, leads to success). Each string derived
from A starting with the second production starts with a b and, since it is not possible
to derive a string starting with another b from S, the second production does not
apply.
From the above derivation we conclude the following.
Deriving the string abbb using gramm2 is a deterministic computation: at
each step of the derivation there is only one applicable alternative for the
nonterminal on the top of the stack.
Again, determinism is obtained by analysing the set of firsts (of strings of length
2) of the nonterminals. Alternatively, we can left-factor the grammar to obtain a
grammar in which all productions for a nonterminal start with a different terminal
symbol.
The third example
A third example is grammar gramm3 with the following productions:
S → AaS | B
A → cS | ε
B → b
Now we want to know whether or not the string acbab is an element of the language
of this grammar. The stack machine produces the following sequence
stack input
S acbab
AaS acbab
aS acbab
Starting with S the machine chooses between two productions:
S → AaS | B
since each nonempty string derived from A starts with a c, and each nonempty string
derived from B starts with a b, there does not seem to be a candidate production
to start a leftmost derivation of acbab with. However, since A can also derive the
empty string, we can apply the first production, and then apply the empty string for
A, producing aS which, as required, starts with an a. We do not explain the rest of
the leftmost derivation since it does not use any empty strings any more.
Nonterminal symbols that can derive the empty sequence will play a central role in
the grammar analysis problems which we will consider in Section 10.2.
From the above derivation we conclude the following.
Deriving the string acbab using gramm3 is a deterministic computation:
at each step of the derivation there is only one applicable alternative for
the nonterminal on the top of the stack.
Determinism is obtained by analysing whether or not nonterminals can derive the
empty string, and which terminal symbols can follow upon a nonterminal in a deriva-
tion.
10.1.3. LL(1) grammars
The examples in the previous subsection show that the derivations of the example
sentences are deterministic, provided we can look ahead one or two symbols in the
input. An obvious question now is: for which grammars are all derivations determin-
istic? Of course, as the second example shows, the answer to this question depends
on the number of symbols we are allowed to look ahead. In the rest of this chapter we
assume that we may look 1 symbol ahead. A grammar for which all derivations are
deterministic with 1 symbol lookahead is called LL(1): Leftmost with a Lookahead
of 1. Since all derivations of sentences of LL(1) grammars are deterministic, LL(1)
is a desirable property of grammars.
To formalise this definition, we define lookAhead sets.
Definition 10.1 (lookAhead set). The lookahead set of a production N → α is
the set of terminal symbols that can appear as the first symbol of a string that can
be derived from Nδ (where Nδ appears as a tail substring in a derivation from the
start-symbol) starting with the production N → α. So
lookAhead (N → α) = {x | S ⇒* γNδ ⇒ γαδ ⇒* γxβ}
For example, for the productions of gramm1 we have
lookAhead (S → cA) = {c}
lookAhead (S → b) = {b}
lookAhead (A → cBC) = {c}
lookAhead (A → bSA) = {b}
lookAhead (A → a) = {a}
lookAhead (B → cc) = {c}
lookAhead (B → Cb) = {a, b}
lookAhead (C → aS) = {a}
lookAhead (C → ba) = {b}
We use lookAhead sets in the definition of LL(1) grammar.
Definition 10.2 (LL(1) grammar). A grammar G is LL(1) if all pairs of productions
of the same nonterminal have disjoint lookahead sets, that is: for all productions
N → α, N → β of G:
lookAhead (N → α) ∩ lookAhead (N → β) = ∅
Since all lookAhead sets for productions of the same nonterminal of gramm1 are
disjoint, gramm1 is an LL(1) grammar. For gramm2 we have:
lookAhead (S → abA) = {a}
lookAhead (S → aa) = {a}
lookAhead (A → bb) = {b}
lookAhead (A → bS) = {b}
Here, the lookAhead sets for both nonterminals S and A are not disjoint, and it
follows that gramm2 is not LL(1). gramm2 is an LL(2) grammar, where an LL(k)
grammar for k ≥ 2 is defined similarly to an LL(1) grammar: instead of one symbol
lookahead we have k symbols lookahead.
How do we determine whether or not a grammar is LL(1)? Clearly, to answer this
question we need to know the lookahead sets of the productions of the grammar. The
lookAhead set of a production N → α, where α starts with a terminal symbol x, is
simply {x}. But what if α starts with a nonterminal P, that is α = Pβ, for some β?
Then we have to determine the set of terminal symbols with which strings derived
from P can start. But if P can derive the empty string, we also have to determine
the set of terminal symbols with which a string derived from β can start. As you see,
in order to determine the lookAhead sets of productions, we are interested in
• whether or not a nonterminal can derive the empty string (empty);
• which terminal symbols can appear as the first symbol in a string derived from
a nonterminal (firsts);
• and which terminal symbols can follow upon a nonterminal in a derivation
(follow).
In each of the following definitions we assume that a grammar G is given.
Definition 10.3 (Empty). Function empty takes a nonterminal N, and determines
whether or not the empty string can be derived from the nonterminal:
empty N  =  N ⇒* ε
For example, for gramm3 we have:
empty S = False
empty A = True
empty B = False
Definition 10.4 (First). The set of firsts of a nonterminal N is the set of terminal
symbols that can appear as the first symbol of a string that can be derived from N:
firsts N = {x | N ⇒* xβ}
For example, for gramm3 we have:
firsts S = {a, b, c}
firsts A = {c}
firsts B = {b}
We could have given more restricted definitions of empty and firsts, by only looking
at derivations from the start-symbol, for example,
empty N  =  S ⇒* αNβ ⇒* αβ
but the simpler definition above suffices for our purposes.
Definition 10.5 (Follow). The follow set of a nonterminal N is the set of terminal
symbols that can follow on N in a derivation starting with the start-symbol S from
the grammar G:
follow N = {x | S ⇒* αNxβ}
For example, for gramm3 we have:
follow S = {a}
follow A = {a}
follow B = {a}
In the following section we will give programs with which lookahead, empty, firsts,
and follow are computed.
Exercise 10.1. Give the results of the function empty for the grammars gramm1 and gramm2.
Exercise 10.2. Give the results of the function firsts for the grammars gramm1 and gramm2.
Exercise 10.3. Give the results of the function follow for the grammars gramm1 and gramm2.
Exercise 10.4. Give the results of the function lookahead for grammar gramm3. Is gramm3
an LL(1) grammar?
Exercise 10.5. Grammar gramm2 is not LL(1), but it can be transformed into an LL(1)
grammar by left factoring. Give this equivalent grammar gramm2' and give the results of
the functions empty, firsts, follow and lookAhead on this grammar. Is gramm2' an LL(1)
grammar?
Exercise 10.6. A non-leftrecursive grammar for Bit-Lists is given by the following gram-
mar (see your answer to Exercise 2.18):
L → BR
R → ε | ,BR
B → 0 | 1
Give the results of functions empty, firsts, follow and lookAhead on this grammar. Is this
grammar LL(1)?
10.2. LL Parsing: Implementation
Until now we have written parsers with parser combinators. Parser combinators use
backtracking, and this is sometimes a cause of inefficiency. If a grammar is LL(1) we
do not need backtracking anymore: parsing is deterministic. We can use this fact by
either adjusting the parser combinators so that they don’t use backtracking anymore,
or by writing a special purpose LL(1) parsing program. We present the latter in this
section.
This section describes the implementation of a program that parses sentences of LL(1)
grammars. The program works for arbitrary context-free LL(1) grammars, so we first
describe how to represent context-free grammars in Haskell. Another consequence of
the fact that the program parses sentences of arbitrary context-free LL(1) grammars is
that we need a generic representation of parse trees in Haskell. The second subsection
defines a datatype for parse trees in Haskell. The third subsection presents the
program that parses sentences of LL(1) grammars. This program assumes that the
input grammar is LL(1), so in the fourth subsection we give a function that determines
whether or not a grammar is LL(1). Both this and the LL(1) parsing function
use a function that determines the lookahead of a production. This function is
presented in the fifth subsection. The last subsections of this section define functions
for determining the empty, first, and follow symbols of a nonterminal.
10.2.1. Context-free grammars in Haskell
A context-free grammar may be represented by a pair: its start symbol, and its
productions. How do we represent terminal and nonterminal symbols? There are at
least two possibilities.
• The rigorous approach uses a datatype Symbol:
data Symbol a b = N a | T b
The advantage of this approach is that nonterminals and terminals are strictly
separated, the disadvantage is the notational overhead of constructors that
has to be carried around. However, a rigorous implementation of context-free
grammars should keep terminals and nonterminals apart, so this is the preferred
implementation. But in this section we will use the following implementation:
• class Eq s => Symbol s where
    isT :: s -> Bool
    isN :: s -> Bool
    isT = not . isN
where isN and isT determine whether or not a symbol is a nonterminal or a
terminal, respectively. This notation is compact, but terminals and nontermi-
nals are no longer strictly separated, and symbols that are used as nonterminals
cannot be used as terminals anymore. For example, the type of characters is
made an instance of this class by defining:
instance Symbol Char where
  isN c = 'A' <= c && c <= 'Z'
that is, capitals are nonterminals, and, by definition, all other characters are
terminals.
A context-free grammar is a value of the type CFG:
type CFG s = (s,[(s,[s])])
where the list in the second component associates nonterminals to right-hand sides.
So an element of this list is a production. For example, the grammar with produc-
tions
S → AaS | B | CB
A → SC | ε
B → A | b
C → D
D → d
is represented as:
exGrammar :: CFG Char
exGrammar =
  ('S', [('S',"AaS"),('S',"B"),('S',"CB")
        ,('A',"SC"),('A',"")
        ,('B',"A"),('B',"b")
        ,('C',"D")
        ,('D',"d")
        ]
  )
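The grammars gramm1, gramm2, and gramm3 from Section 10.1.2 can be encoded in the same way; the encodings below are ours, but they follow the productions given earlier (the empty right-hand side of A in gramm3 is written as the empty string):

gramm1, gramm2, gramm3 :: CFG Char
gramm1 = ('S', [('S',"cA"),('S',"b")
               ,('A',"cBC"),('A',"bSA"),('A',"a")
               ,('B',"cc"),('B',"Cb")
               ,('C',"aS"),('C',"ba")])
gramm2 = ('S', [('S',"abA"),('S',"aa")
               ,('A',"bb"),('A',"bS")])
gramm3 = ('S', [('S',"AaS"),('S',"B")
               ,('A',"cS"),('A',"")
               ,('B',"b")])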
On this type we define some functions for extracting the productions, nonterminals,
terminals, etc. from a grammar.
start :: CFG s -> s
start = fst
prods :: CFG s -> [(s,[s])]
prods = snd
terminals :: (Symbol s, Ord s) => CFG s -> [s]
terminals = unions . map (filter isT . snd) . snd
where unions :: Ord s => [[s]] -> [s] returns the union of the ‘sets’ of symbols
in the lists.
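Function union can be taken from Data.List; unions is not a standard list function, so a possible definition is the following sketch (it assumes the argument lists are themselves duplicate-free):

import Data.List (union)

-- Fold the binary set union over a list of 'sets'.
unions :: Ord s => [[s]] -> [s]
unions = foldr union []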
nonterminals :: (Symbol s, Ord s) => CFG s -> [s]
nonterminals = nub . map fst . snd
Here, nub :: Ord s => [s] -> [s] removes duplicates from a list.
symbols :: (Symbol s, Ord s) => CFG s -> [s]
symbols grammar =
union (terminals grammar) (nonterminals grammar)
nt2prods :: Eq s => CFG s -> s -> [(s,[s])]
nt2prods grammar s =
filter (\(nt,rhs) -> nt==s) (prods grammar)
where function union returns the set union of two lists (removing duplicates). For
example, we have
? start exGrammar
'S'
? terminals exGrammar
"abd"
10.2.2. Parse trees in Haskell
A parse tree is a tree, in which each internal node is labelled with a nonterminal,
and has a list of children (corresponding with a right-hand side of a production of
the nonterminal). It follows that parse trees can be represented as rose trees with
symbols, where the datatype of rose trees is defined by:
data Rose a = Node a [Rose a] | Nil
The constructor Nil has been added to simplify error handling when parsing: when
a sentence cannot be parsed, the ‘parse tree’ Nil is returned. Strictly speaking it
should not occur in the datatype of rose trees.
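For example, using the Symbol Char conventions of the previous subsection, a parse tree for the sentence b of gramm3 (derived via S → B and B → b) could be represented by the following value; the name exampleTree is ours:

exampleTree :: Rose Char
exampleTree = Node 'S' [Node 'B' [Node 'b' []]]

Terminal symbols appear as leaves, that is, as nodes with an empty list of children, which is also how the parser gll1 below builds them.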
10.2.3. LL(1) parsing
This section defines a function ll1 that takes a grammar and a terminal string as
input, and returns one tuple: a parse tree, and the rest of the inputstring that has
not been parsed. So ll1 takes a grammar, and returns a parser with Rose s as
its result type. As mentioned in the beginning of this section, our parser doesn’t
need backtracking anymore, since parsing with an LL(1) grammar is deterministic.
Therefore, the parser type is adjusted as follows:
type Parser b a = [b] -> (a,[b])
Using this parser type, the type of the function ll1 is:
ll1 :: (Symbol s, Ord s) => CFG s -> Parser s (Rose s)
Function ll1 is defined in terms of two functions. Function isll1 :: CFG s ->
Bool is a function that checks whether or not a grammar is LL(1). And function
gll1 (for generalised LL(1)) produces a list of rose trees for a list of symbols. ll1 is
obtained from gll1 by giving gll1 the singleton list containing the start symbol of
the grammar as argument.
ll1 grammar input =
if isll1 grammar
then let ([rose], rest) = gll1 grammar [start grammar] input
in (rose, rest)
else error "ll1: grammar not LL(1)"
So now we have to implement functions isll1 and gll1. Function isll1 is imple-
mented in the following subsection. Function gll1 also uses two functions. Function
grammar2ll1table takes a grammar and returns the LL(1) table: the association list
that associates productions with their lookahead sets. And function choose takes a
terminal symbol, and chooses a production based on the LL(1) table.
gll1 :: (Symbol s, Ord s) => CFG s -> [s] -> Parser s [Rose s]
gll1 grammar =
let ll1table = grammar2ll1table grammar
-- The LL(1) table.
nt2prods nt = filter (\((n,l),r) -> n==nt) ll1table
-- nt2prods returns the productions for nonterminal
-- nt from the LL(1) table
selectprod nt t = choose t (nt2prods nt)
-- selectprod selects the production for nt from the
-- LL(1) table that should be taken when t is the next
-- symbol in the input.
in \stack input ->
case stack of
[] -> ([], input)
(s:ss) ->
if isT s
then -- match
let (rts,rest) = gll1 grammar ss (tail input)
in if s == head input
then (Node s []: rts, rest)
-- The parse tree is a leaf (a node with
-- no children).
else ([Nil], input)
-- The input cannot be parsed
else -- expand
let t = head input
(rts,zs) = gll1 grammar (selectprod s t) input
-- Try to parse according to the production
-- obtained from the LL(1) table from s.
(rrs,vs) = gll1 grammar ss zs
-- Parse the rest of the symbols on the
-- stack.
in ((Node s rts): rrs, vs)
Functions grammar2ll1table and choose, which are used in the above function gll1,
are defined as follows. These functions use function lookaheadp, which returns the
lookahead set of a production and is defined in one of the following subsections.
grammar2ll1table :: (Symbol s, Ord s) => CFG s -> [((s,[s]),[s])]
grammar2ll1table grammar =
map (\x -> (x,lookaheadp grammar x)) (prods grammar)
choose :: Eq a => a -> [((b,c),[a])] -> c
choose t l =
let [((s,rhs), ys)] = filter (\((x,p),q) -> t `elem` q) l
in rhs
Note that function choose assumes that there is exactly one element in the association
list in which the element t occurs.
10.2.4. Implementation of isLL(1)
Function isll1 checks whether or not a context-free grammar is LL(1). It does this
by computing for each nonterminal of its argument grammar the set of lookahead
sets (one set for each production of the nonterminal), and checking that all of these
sets are disjoint. It is defined in terms of a function lookaheadn, which computes the
lookahead sets of a nonterminal, and a function disjoint, which determines whether
or not all sets in a list of sets are disjoint. All sets in a list of sets are disjoint if
the length of the concatenation of these sets equals the length of the union of these
sets.
isll1 :: (Symbol s, Ord s) => CFG s -> Bool
isll1 grammar =
and (map (disjoint . lookaheadn grammar) (nonterminals grammar))
disjoint :: Ord s => [[s]] -> Bool
disjoint xss = length (concat xss) == length (unions xss)
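For example, with these definitions disjoint ["ab", "cd"] is True (four symbols in the concatenation and four in the union), whereas disjoint ["ab", "bc"] is False (four symbols in the concatenation, but only three in the union).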
Function lookaheadn computes the lookahead sets of a nonterminal by computing
all productions of a nonterminal, and computing the lookahead set of each of these
productions by means of function lookaheadp.
lookaheadn :: (Symbol s, Ord s) => CFG s -> s -> [[s]]
lookaheadn grammar =
map (lookaheadp grammar) . nt2prods grammar
10.2.5. Implementation of lookahead
Function lookaheadp takes a grammar and a production, and returns the lookahead
set of the production. It is defined in terms of four functions. Each of the first three
functions will be defined in a separate subsection below, the fourth function is defined
in this subsection.
• isEmpty :: (Ord s,Symbol s) => CFG s -> s -> Bool
Function isEmpty takes a grammar and a nonterminal and determines whether
or not the empty string can be derived from the nonterminal in the grammar.
(This function was called empty in Definition 10.3.)
• firsts :: (Ord s, Symbol s) => CFG s -> [(s,[s])]
Function firsts takes a grammar and computes the set of firsts of each symbol
(the set of firsts of a terminal is the terminal itself).
• follow :: (Ord s, Symbol s) => CFG s -> [(s,[s])]
Function follow takes a grammar and computes the follow set of each nonter-
minal (so it associates a list of symbols with each nonterminal).
• lookSet :: Ord s =>
(s -> Bool) -> -- isEmpty
(s -> [s]) -> -- firsts?
(s -> [s]) -> -- follow?
(s, [s]) -> -- production
[s] -- lookahead set
Note that we use the operator ?, see Section 5.4.2, on the firsts and follow
association lists. Function lookSet takes a predicate, two functions that given
a nonterminal return the first and follow set, respectively, and a production, and
returns the lookahead set of the production. Function lookSet is introduced
after the definition of function lookaheadp.
Now we define:
lookaheadp :: (Symbol s, Ord s) => CFG s -> (s,[s]) -> [s]
lookaheadp grammar =
lookSet (isEmpty grammar) ((firsts grammar)?) ((follow grammar)?)
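The operator ? used here looks up a key in an association list. Its definition is given in Section 5.4.2; the sketch below is only meant to make this section self-contained, and assumes the key is present in the list:

(?) :: Eq a => [(a,b)] -> a -> b
xs ? x = head [ b | (a,b) <- xs, a == x ]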
We will exemplify the definition of function lookSet with the grammar exGrammar,
with the following productions:
S → AaS [ B [ CB
A → SC [
B → A [ b
C → D
D → d
Consider the production S → AaS. The lookahead set of the production contains
the set of symbols which can appear as the first terminal symbol of a sequence of
symbols derived from A. But, since the nonterminal symbol A can derive the empty
string, the lookahead set also contains the symbol a.
Consider the production A → SC. The lookahead set of the production contains
the set of symbols which can appear as the first terminal symbol of a sequence of
symbols derived from S. But, since the nonterminal symbol S can derive the empty
string, the lookahead set also contains the set of symbols which can appear as the
first terminal symbol of a sequence of symbols derived from C.
Finally, consider the production B → A. The lookahead set of the production con-
tains the set of symbols which can appear as the first terminal symbol of a sequence
of symbols derived from A. But, since the nonterminal symbol A can derive the
empty string, the lookahead set also contains the set of terminal symbols which can
follow the nonterminal symbol B in some derivation.
The examples show that it is useful to have functions firsts and follow in which,
for every nonterminal symbol n, we can look up the terminal symbols which can
appear as the first terminal symbol of a sequence of symbols in some derivation from
n and the set of terminal symbols which can follow the nonterminal symbol n in a
sequence of symbols occurring in some derivation respectively. It turns out that the
definition of function follow also makes use of a function lasts which is similar to
the function firsts, but which deals with last nonterminal symbols rather than first
terminal ones.
The examples also illustrate a control structure which will be used very often in
the following algorithms: we will fold over right-hand sides. While doing so we
compute sets of symbols for all the symbols of the right-hand side which we encounter
and collect them into a final set of symbols. Whenever such a list for a symbol is
computed, there are always two possibilities:
• either we continue folding and return the result of taking the union of the set
obtained from the current element and the set obtained by recursively folding
over the rest of the right-hand side
• or we stop folding and immediately return the set obtained from the current
element.
We continue if the current symbol is a nonterminal which can derive the empty se-
quence and we stop if the current symbol is either a terminal symbol or a nonterminal
symbol which cannot derive the empty sequence. The following function makes this
statement more precise.
foldrRhs :: Ord s =>
(s -> Bool) ->
(s -> [s]) ->
[s] ->
[s] ->
[s]
foldrRhs p f start = foldr op start
where op x xs = f x `union` if p x then xs else []
The function foldrRhs is, of course, most naturally defined in terms of the function
foldr. This function is somewhere in between a general purpose and an application
specific function (we could easily have made it more general though). In the exercises
we give an alternative characterisation of foldrRhs. We will also need a function
scanrRhs which is like foldrRhs but accumulates intermediate results in a list. The
function scanrRhs is most naturally defined in terms of the function scanr.
scanrRhs :: Ord s =>
(s -> Bool) ->
(s -> [s]) ->
[s] ->
[s] ->
[[s]]
scanrRhs p f start = scanr op start
where op x xs = f x `union` if p x then xs else []
Finally, we will also need a function scanlRhs which does the same job as scanrRhs
but in the opposite direction. The easiest way to define scanlRhs is in terms of
scanrRhs and reverse.
scanlRhs p f start = reverse . scanrRhs p f start . reverse
We now return to the function lookSet.
lookSet :: Ord s =>
(s -> Bool) ->
(s -> [s]) ->
(s -> [s]) ->
(s,[s]) ->
[s]
lookSet p f g (nt,rhs) = foldrRhs p f (g nt) rhs
The function lookSet makes use of foldrRhs to fold over a right-hand side. As
stated above, the function foldrRhs continues processing a right-hand side only if
it encounters a nonterminal symbol for which p (so isEmpty in the lookSet in-
stance lookaheadp) holds. Thus, the set g nt (follow?nt in the lookSet instance
lookaheadp) is only important for those right-hand sides for nt that consist of non-
terminals that can all derive the empty sequence. We can now (assuming that the
definitions of the auxiliary functions are given) use the function lookaheadp instance
of lookSet to compute the lookahead sets of all productions.
look nt rhs = lookaheadp exGrammar (nt,rhs)
? look 'S' "AaS"
dba
? look 'S' "B"
dba
? look 'S' "CB"
d
? look 'A' "SC"
dba
? look 'A' ""
ad
? look 'B' "A"
dba
? look 'B' "b"
b
? look 'C' "D"
d
? look 'D' "d"
d
It is clear from this result that exGrammar is not an LL(1)-grammar. Let us have
a closer look at how these lookahead sets are obtained. We will have to use the
functions firsts and follow and the predicate isEmpty for computing intermediate
results. The corresponding subsections explain how to compute these intermediate
results.
For the lookahead set of the production S → AaS we fold over the right-hand side
AaS. Folding stops at 'a' and we obtain
firsts? 'A' `union` firsts? 'a'
==
"dba" `union` "a"
==
"dba"
For the lookahead set of the production A → SC we fold over the right-hand side
SC. Folding stops at C since it cannot derive the empty sequence, and we obtain
firsts? 'S' `union` firsts? 'C'
==
"dba" `union` "d"
==
"dba"
Finally, for the lookahead set of the production B → A we fold over the right-hand
side A. In this case we fold over the complete (one-element) list and we obtain
firsts? 'A' `union` follow? 'B'
==
"dba" `union` "d"
==
"dba"
The other lookahead sets are computed in a similar way.
10.2.6. Implementation of empty
Many functions defined in this chapter make use of a predicate isEmpty, which
tests whether or not the empty sequence can be derived from a nonterminal. This
subsection defines this function. Consider the grammar exGrammar. We are now only
interested in deriving sequences which contain only nonterminal symbols (since it is
impossible to derive the empty string if a terminal occurs). Therefore we only have
to consider the productions in which no terminal symbols appear in the right-hand
sides.
S → B | CB
A → SC | ε
B → A
C → D
One can immediately see from those productions that the nonterminal A derives the
empty string in one step. To know whether there are any nonterminals which derive
the empty string in more than one step we eliminate the productions for A and we
eliminate all occurrences of A in the right hand sides of the remaining productions
S → B | CB
B → ε
C → D
One can now conclude that the nonterminal B derives the empty string in two steps.
Doing the same with B as we did with A gives us the following productions
S → ε | C
C → D
One can now conclude that the nonterminal S derives the empty string in three steps.
Doing the same with S as we did with A and B gives us the following productions
C → D
At this stage we can conclude that there are no more new nonterminals which derive
the empty string.
We now give the Haskell implementation of the algorithm described above. The al-
gorithm is iterative: it does the same steps over and over again until some desired
condition is met. For this purpose we use function fixedPoint, which takes a func-
tion and a set, and repeatedly applies the function to the set, until the set does not
change anymore.
fixedPoint :: Ord a => ([a] -> [a]) -> [a] -> [a]
fixedPoint f xs | xs == nexts = xs
| otherwise = fixedPoint f nexts
where nexts = f xs
The result of fixedPoint f is sometimes called the fixed point of f. Function isEmpty determines
whether or not a nonterminal can derive the empty string. A nonterminal can derive
the empty string if it is a member of the emptySet of a grammar.
isEmpty :: (Symbol s, Ord s) => CFG s -> s -> Bool
isEmpty grammar = (`elem` emptySet grammar)
The emptySet of a grammar is obtained by the iterative process described in the
example above. We start with the empty set of nonterminals, and at each step n of
the computation of the emptySet as a fixedPoint, we add the nonterminals that
can derive the empty string in n steps. Function emptyStepf adds a nonterminal if
there exists a production for the nonterminal of which all elements can derive the
empty string.
emptySet :: (Symbol s, Ord s) => CFG s -> [s]
emptySet grammar = fixedPoint (emptyStepf grammar) []
emptyStepf :: (Symbol s, Ord s) => CFG s -> [s] -> [s]
emptyStepf grammar set =
nub (map fst (filter (\(nt,rhs) -> all (`elem` set) rhs)
(prods grammar)
) )
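As a quick check, a session could look as follows (a sketch; the order of the symbols in the result depends on the order in which the iteration finds them):

? emptySet exGrammar
"SAB"
? isEmpty exGrammar 'C'
False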
10.2.7. Implementation of first and last
Function firsts takes a grammar, and returns for each symbol of the grammar (so
also the terminal symbols) the set of terminal symbols with which a sentence derived
from that symbol can start. The first set of a terminal symbol is the terminal symbol
itself.
The set of firsts of each symbol consists of that symbol itself, plus the (first) symbols
that can be derived from that symbol in one or more steps. So the set of firsts can
be computed by an iterative process, just as the function isEmpty.
Consider the grammar exGrammar again. We start the iteration with
[(’S’,"S"),(’A’,"A"),(’B’,"B"),(’C’,"C"),(’D’,"D")
,(’a’,"a"),(’b’,"b"),(’d’,"d")
]
Using the productions of the grammar we can derive in one step the following lists
of first symbols.
[(’S’,"ABC"),(’A’,"S"),(’B’,"Ab"),(’C’,"D"),(’D’,"d")]
and the union of these two lists is
[(’S’,"SABC"),(’A’,"AS"),(’B’,"BAb"),(’C’,"CD"),(’D’,"Dd")
,(’a’,"a"),(’b’,"b"),(’d’,"d")]
In two steps we can derive
[(’S’,"SAbD"),(’A’,"ABC"),(’B’,"S"),(’C’,"d"),(’D’,"")]
and again we have to take the union of this list with the previous result. We repeat
this process until the list doesn’t change anymore. For exGrammar this happens
when:
[(’S’,"SABCDabd")
,(’A’,"SABCDabd")
,(’B’,"SABCDabd")
,(’C’,"CDd")
,(’D’,"Dd")
,(’a’,"a")
,(’b’,"b")
,(’d’,"d")
]
Function firsts is defined as the fixedPoint of a step function that iterates the
first computation one more step. The fixedPoint starts with the list that contains
all symbols paired with themselves.
firsts :: (Symbol s, Ord s) => CFG s -> [(s,[s])]
firsts grammar =
fixedPoint (firstStepf grammar) (startSingle grammar)
startSingle :: (Ord s, Symbol s) => CFG s -> [(s,[s])]
startSingle grammar = map (\x -> (x,[x])) (symbols grammar)
The step function takes the old approximation and performs one more iteration step.
At each of these iteration steps we have to add the start list with which the iteration
started again.
firstStepf :: (Ord s, Symbol s) =>
CFG s -> [(s,[s])] -> [(s,[s])]
firstStepf grammar approx = (startSingle grammar)
`combine` (compose (first1 grammar) approx)
combine :: Ord s => [(s,[s])] -> [(s,[s])] -> [(s,[s])]
combine xs = foldr insert xs
where insert (a,bs) [] = [(a,bs)]
insert (a,bs) ((c,ds):rest)
| a == c = (a, union bs ds) : rest
| otherwise = (c,ds) : (insert (a,bs) rest)
compose :: Ord a => [(a,[a])] -> [(a,[a])] -> [(a,[a])]
compose r1 r2 = [(a, unions (map (r2?) bs)) | (a,bs) <- r1]
Finally, function first1 computes the direct first symbols of all productions, taking
into account that some nonterminals can derive the empty string, and combines the
results for the different nonterminals.
first1 :: (Symbol s, Ord s) => CFG s -> [(s,[s])]
first1 grammar =
map (\(nt,fs) -> (nt,unions fs))
(group (map (\(nt,rhs) -> (nt,foldrRhs (isEmpty grammar)
single
[]
rhs
) )
(prods grammar)
) )
where group groups elements with the same first element together
group :: Eq a => [(a,b)] -> [(a,[b])]
group = foldr insertPair []
insertPair :: Eq a => (a,b) -> [(a,[b])] -> [(a,[b])]
insertPair (a,b) [] = [(a,[b])]
insertPair (a,b) ((c,ds):rest) =
if a==c then (c,(b:ds)):rest else (c,ds):(insertPair (a,b) rest)
function single takes an element and returns the set with the element, and unions
returns the union of a set of sets.
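The functions single and unions are defined elsewhere in the notes; plausible definitions (an assumption, since the notes' own definitions may differ slightly) are:

import Data.List (union)

single :: a -> [a]
single x = [x]

unions :: Eq a => [[a]] -> [a]
unions = foldr union []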
Function lasts is defined using function firsts. Suppose we reverse the right-hand
sides of all productions of a grammar. Then the set of firsts of this reversed grammar
is the set of lasts of the original grammar. This idea is implemented in the following
functions.
reverseGrammar :: Symbol s => CFG s -> CFG s
reverseGrammar =
\(s,al) -> (s,map (\(nt,rhs) -> (nt,reverse rhs)) al)
lasts :: (Symbol s, Ord s) => CFG s -> [(s,[s])]
lasts = firsts . reverseGrammar
10.2.8. Implementation of follow
The final function we have to implement is the function follow, which takes a gram-
mar, and returns an association list in which nonterminals are associated to symbols
that can follow upon the nonterminal in a derivation. A nonterminal n is associ-
ated to a list containing terminal t in follow if t can directly follow n in some
sequence of symbols occurring in some leftmost derivation. We can compute pairs
of such adjacent symbols by splitting up the right-hand sides with length at least 2
and, using lasts and firsts, compute the symbols which appear at the end resp.
at the beginning of strings which can be derived from the left resp. right part of the
split alternative. Our grammar exGrammar has three alternatives with length at least
2: "AaS", "CB" and "SC". Function follow uses the functions firsts and lasts
and the predicate isEmpty for intermediate results. The previous subsections explain
how to compute these functions.
Let’s see what happens with the alternative "AaS". The lists of all nonterminal
symbols that can appear at the end of sequences of symbols derived from "A" and
"Aa" are "ADC" and "" respectively. The lists of all terminal symbols which can
appear at the beginning of sequences of symbols derived from "aS" and "S" are "a"
and "dba" respectively. Zipping together those lists shows that an ’A’, a ’D’ and
a ’C’ can be followed by an ’a’. Splitting the alternative "CB" in the middle
produces sets of firsts and sets of lasts "CD" and "dba". Splitting the alternative
"SC" in the middle produces sets of firsts and sets of lasts "SDCAB" and "d". From
the first pair we can see that a ’C’ and a ’D’ can be followed by a ’d’, a ’b’ and
an ’a’. From the second pair we see that an ’S’, a ’D’, a ’C’, an ’A’, and a ’B’
can be followed by a ’d’. Combining all these results gives:
[(’S’,"d"),(’A’,"ad"),(’B’,"d"),(’C’,"adb"),(’D’,"adb")]
The function follow uses the functions scanrAlt and scanlAlt. The lists produced
by these functions are exactly the ones we need: using the function zip from the
standard prelude we can combine the lists. For example: for the alternative "AaS"
the functions scanlAlt and scanrAlt produce the following lists:
[[], "ADC", "DC", "SDCAB"]
["dba", "a", "dba", []]
Only the two middle elements of both lists are important (they correspond to the
nontrivial splittings of "AaS"). Thus, we only have to consider alternatives of length
at least 2. We start the computation of follow with assigning the empty follow set
to each symbol:
follow :: (Symbol s, Ord s) => CFG s -> [(s,[s])]
follow grammar = combine (followNE grammar) (startEmpty grammar)
startEmpty grammar = map (\x -> (x,[])) (symbols grammar)
The real work is done by functions followNE and function splitProds. Function
followNE passes the right arguments on to function splitProds, and removes all
nonterminals from the set of firsts, and all terminals from the set of lasts. Function
splitProds splits the productions of length at least 2, and pairs the last nonterminals
with the first terminals.
followNE :: (Symbol s, Ord s) => CFG s -> [(s,[s])]
followNE grammar = splitProds
(prods grammar)
(isEmpty grammar)
(isTfirsts grammar)
(isNlasts grammar)
where isTfirsts = map (\(x,xs) -> (x,filter isT xs)) . firsts
isNlasts = map (\(x,xs) -> (x,filter isN xs)) . lasts
splitProds :: (Symbol s, Ord s) =>
[(s,[s])] -> -- productions
(s -> Bool) -> -- isEmpty
[(s,[s])] -> -- terminal firsts
[(s,[s])] -> -- nonterminal lasts
[(s,[s])]
splitProds prods p fset lset =
map (\(nt,rhs) -> (nt,nub rhs)) (group pairs)
where pairs = [(l, f)
| rhs <- map snd prods
, length rhs >= 2
, (fs, ls) <- zip (rightscan rhs) (leftscan rhs)
, l <- ls
, f <- fs
]
leftscan = scanlRhs p (lset?) []
rightscan = scanrRhs p (fset?) []
Exercise 10.7. Give the Rose tree representation of the parse tree corresponding to the
derivation of the sentence ccccba using grammar gramm1.
Exercise 10.8. Give the Rose tree representation of the parse tree corresponding to the
derivation of the sentence abbb using grammar gramm2’ defined in the exercises of the previous
section.
Exercise 10.9. Give the Rose tree representation of the parse tree corresponding to the
derivation of the sentence acbab using grammar gramm3.
Exercise 10.10. In this exercise we will take a closer look at the functions foldrRhs and
scanrRhs which are the essential ingredients of the implementation of the grammar analysis
algorithms. From the definitions it is clear that grammar analysis is easily expressed via
a calculus for (finite) sets. A calculus for finite sets is implicit in the programs for LL(1)
parsing. Since the code in this module is obscured by several implementation details we will
derive the functions foldrRhs and scanrRhs in a stepwise fashion. In this derivation we will
use the following:
A (finite) set is implemented by a list with no duplicates. In order to construct a set, the
following operations may be used:
[] :: [a] the empty set of a-elements
union :: [a] → [a] → [a] the union of two sets
unions :: [[a]] → [a] the generalised union
single :: a → [a] the singleton function
These operations satisfy the well-known laws for set operations.
1. Define a function list2Set :: [a] → [a] which returns the set of elements occurring
in the argument.
2. Define list2Set as a foldr.
3. Define a function pref p :: [a] → [a] which given a list xs returns the set of
elements corresponding to the longest prefix of xs all of whose elements satisfy p.
4. Define a function prefplus p :: [a] → [a] which given a list xs returns the set of
elements in the longest prefix of xs all of whose elements satisfy p together with the
first element of xs that does not satisfy p (if this element exists at all).
5. Define prefplus p as a foldr.
6. Show that prefplus p = foldrRhs p single [].
7. It can be shown that
foldrRhs p f [] = unions . map f . prefplus p
for all set-valued functions f. Give an informal description of the functions foldrRhs
p f [] and foldrRhs p f start.
8. The standard function scanr is defined by
scanr f q0 = map (foldr f q0) . tails
where tails is a function which takes a list xs and returns a list with all tail segments
(postfixes) of xs in decreasing length. The function scanrRhs is defined in a similar
way
scanrRhs p f start = map (foldrRhs p f start) . tails
Give an informal description of the function scanrRhs.
Exercise 10.11. The computation of the functions empty and firsts is not restricted to
nonterminals only. For terminal symbols s these functions are defined by
empty s = False
firsts s = {s}
Using the definitions in the previous exercise, compute the following.
1. For the example grammar gramm1 and two of its productions A → bSA and B → Cb.
a) foldrRhs empty firsts [] bSA
b) foldrRhs empty firsts [] Cb
2. For the example grammar gramm3 and its production S → AaS
a) foldrRhs empty firsts [] AaS
b) scanrRhs empty firsts [] AaS
11. LL versus LR parsing
The parsers that were constructed using parser combinators in Chapter 3 are non-
deterministic. They can recognize sentences described by ambiguous grammars by
returning a list of solutions, rather than just one solution; each solution contains a
parse tree and the part of the input that was left unprocessed. Nondeterministic
parsers are of the following type:
type Parser a b = [a] -> [ (b, [a]) ]
where a denotes the alphabet type, and b denotes the parse tree type.
In chapter 10 we turned to deterministic parsers. Here, parsers have only one result,
consisting of a parse tree of type b and the remaining part of the input string:
type Parser a b = [a] -> (b, [a])
Ambiguous grammars are not allowed anymore. Also, in the case of parsing input
containing syntax errors, we cannot return an empty list of successes anymore; instead
there should be some mechanism for raising an error.
There are various algorithms for deterministic parsing. They impose some additional
constraints on the form of the grammar: not every context-free grammar is acceptable
to these algorithms. The parsing algorithms can be modelled by making use of a stack
machine. There are two fundamentally different deterministic parsing algorithms:
• LL parsing, also known as top-down parsing
• LR parsing, also known as bottom-up parsing
The first ‘L’ in these acronyms stands for ‘Left-to-right’, that is, input is processed in
the order it is read. So these algorithms are both suitable for reading an input stream,
e.g. from a file. The second ‘L’ in ‘LL-parsing’ stands for ‘Leftmost derivation’, as
the parsing mimics doing a leftmost derivation of a sentence (see Section 2.4). The
‘R’ in ‘LR-parsing’ stands for ‘Rightmost derivation’ of the sentences. The parsing
algorithms are normally referred to as LL(k) or LR(k), where k is the number of
unread symbols that the parser is allowed to ‘look ahead’. In most practical cases, k
is taken to be 1.
In Chapter 10 we studied the LL(1) parsing algorithm extensively, including the
so-called LL(1)-property by which grammars must abide. In this chapter we start
with an example application of the LL(1) algorithm. Next, we turn to the LR(1)
algorithm.
11.1. LL(1) parser example
The LL(1) parsing algorithm was implemented in Section 10.2.3. Function ll1 de-
fined there takes a grammar and an input string, and returns a single result consisting
of a parse tree and rest string. The function was implemented by calling a generalized
function named gll1; generalized in the sense that the function takes an additional
list of symbols that need to be recognized. That generalization is then called with a
singleton list containing only the root nonterminal.
11.1.1. An LL(1) checker
Here, we define a slightly simplified version of the algorithm: it doesn’t return a parse
tree, so it need not be concerned with building it; it merely checks whether or not
the input string is a sentence. Hence the result of the function is simply a boolean
value:
check :: String -> Bool
check input = run [’S’] input
As in Chapter 10, the function is implemented by calling a generalized function run
with a singleton list containing just the root nonterminal. The function run takes,
in addition to the input string, a list of symbols (initially the above-mentioned
singleton). That list is used as a kind of stack: elements are prepended at its front,
and removed from the front on other occasions. Therefore we refer to the whole
operation as a stack machine, and that is why the function is named run: it runs the
stack machine. Function run is defined as follows:
run :: Stack -> String -> Bool
run [] [] = True
run [] (x:xs) = False
run (a:as) input | isT a = not(null input)
&& a==hd input
&& run as (tl input)
| isN a = run (rs++as) input
where rs = select a (hd input)
So, when called with an empty stack and an empty string, the function succeeds. If
the stack is empty, but the input is not, it fails, as there is junk input present. If
the stack is nonempty, case distinction is done on a, the top of the stack. If it is a
terminal, the input should begin with it, and the machine is called recursively for the
rest of the input. In the case of a nonterminal, we push rs on the stack, and leave
the input unchanged in the recursive call. This is a simple tail recursive function,
and could imperatively be implemented in a single loop that runs while the stack is
non-empty.
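The checker relies on a handful of helpers that are defined elsewhere in the notes; for a self-contained experiment one could use definitions along the following lines (the Stack synonym and the uppercase-letter convention for nonterminals are assumptions made here, not the notes' actual code):

import Data.Char (isUpper)

type Stack = String

isN, isT :: Char -> Bool
isN = isUpper          -- nonterminals are written as uppercase letters
isT = not . isN        -- every other character counts as a terminal

hd :: [a] -> a
hd = head

tl :: [a] -> [a]
tl = tail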
The new symbols rs that are pushed on the stack form the right hand side of a production
rule for nonterminal a. That right hand side is selected from the available alternatives by function
select. For making the choice, the first input symbol is passed to select as well.
This is where you can see that the algorithm is LL(1): it is allowed to look ahead
one symbol.
This is how the alternative is selected:
select a x = (snd . hd . filter ok . prods) gram
where ok p@(n,_) = n==a && x ‘elem‘ lahP gram p
So, from all productions of the grammar returned by prods, the ok ones are taken, of
which there should be only one (this is ensured by the LL(1)-property); that single
production is retrieved by hd, and of it only the right hand side is needed, hence
the call to snd. Now a production is ok, if the nonterminal n is a, the one we are
looking for, and moreover the first input symbol x is a member of the lookahead set of
this production.
Determining the lookahead sets of all productions of a grammar is the tricky part
of the LL(1) parsing algorithm. It is described in Sections 10.2.5 to 10.2.8. Though
the general formulation of the algorithm is quite complex, its application in a simple
case is rather intuitive. So let’s study an example grammar: arithmetic expressions
with operators of two levels of precedence.
11.1.2. An LL(1) grammar
The idea for the grammar for arithmetical expressions was given in Section 2.5.7: we
need auxiliary notions of ‘Term’ and ‘Factor’ (actually, we need as many additional
notions as there are levels of precedence). The most straightforward definition of the
grammar is the first one shown below.
Unfortunately, this grammar does not satisfy the LL(1)-property. The reason for
this is that the grammar contains rules with common prefixes on the right hand side
(for E and T). A way out is the application of a grammar transformation known
as left factoring, as described in Section 2.5.4. The result is a grammar where there
is only one rule for E and for T, and the non-common part of the right hand sides is
described by additional nonterminals, P and M. The resulting grammar is the second
one shown below. To be able to treat end-of-input as if it were a character,
we also add an additional rule, which says that the input consists of an expression
followed by end-of-input (designated as ‘#’ here).
The original grammar:
E → T
E → T + E
T → F
T → F * T
F → N
F → ( E )
N → 1
N → 2
N → 3
The left-factored grammar:
S → E #
E → TP
P →
P → + E
T → FM
M →
M → * T
F → N
F → ( E )
N → 1
N → 2
N → 3
For determining the lookahead sets of each nonterminal, we also need to analyze
whether a nonterminal can produce the empty string, and which terminals can be the
first symbol of any string produced by each nonterminal. These properties are named
empty and first respectively. All properties for the example grammar are summarized
in the table below. Note that empty and first are properties of a nonterminal, whereas
lookahead is a property of a single production rule.
production    empty   first      lookahead
S → E #       no      ( 1 2 3    first(E) = ( 1 2 3
E → TP        no      ( 1 2 3    first(T) = ( 1 2 3
P →           yes     +          follow(P) = ) #
P → + E                          immediate: +
T → FM        no      ( 1 2 3    first(F) = ( 1 2 3
M →           yes     *          follow(M) = + ) #
M → * T                          immediate: *
F → N         no      ( 1 2 3    first(N) = 1 2 3
F → ( E )                        immediate: (
N → 1         no      1 2 3      immediate: 1
N → 2                            immediate: 2
N → 3                            immediate: 3
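For reference, the transformed grammar and the lookahead sets from the table can be written down as plain Haskell data. The representation below (a pair of a nonterminal and a right-hand-side string) is an assumption made purely for illustration, and is not necessarily the representation used by prods and lahP in the checker above:

type Production = (Char, String)

gramProds :: [Production]
gramProds =
  [ ('S',"E#")
  , ('E',"TP"), ('P',""), ('P',"+E")
  , ('T',"FM"), ('M',""), ('M',"*T")
  , ('F',"N"), ('F',"(E)")
  , ('N',"1"), ('N',"2"), ('N',"3") ]

-- lookahead sets, read off from the table above
lookaheads :: [(Production, String)]
lookaheads =
  [ (('S',"E#"), "(123"), (('E',"TP"), "(123")
  , (('P',""),   ")#"),   (('P',"+E"), "+")
  , (('T',"FM"), "(123"), (('M',""),   "+)#"),  (('M',"*T"), "*")
  , (('F',"N"),  "123"),  (('F',"(E)"), "(")
  , (('N',"1"),  "1"),    (('N',"2"),  "2"),    (('N',"3"),  "3") ]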
11.1.3. Using the LL(1) parser
A sentence of the language described by the example grammar is 1+2*3. Because of
the high precedence of * relative to +, the parse tree should reflect that the 2 and 3
should be multiplied, rather than the 1+2 and 3. Indeed, the parse tree does so: 2*3
ends up grouped under a single T node (the tree itself is not reproduced here).
Now when we do a step-by-step analysis of how the parse tree is constructed by the
stack machine, we notice that the parse tree is traversed in a depth-first fashion,
where the left subtrees are analysed first. Each node corresponds to the application
of a production rule. The order in which the production rules are applied is a pre-
order traversal of the tree. The tree is constructed top-down: first the root is visited,
and then each time the first remaining nonterminal is expanded. From the table in
Figure 11.1 it is clear that the contents of the stack describes what is still to be
expected on the input. Initially, of course, this is the root nonterminal S. Whenever
a terminal is on top of the stack, the corresponding symbol is read from the input.
11.2. LR parsing
Another algorithm for doing deterministic parsing using a stack machine is LR(1)-
parsing. Actually, what is described in this section is known as Simple LR parsing
or SLR(1)-parsing. It is still rather complicated, though.
A nice property of LR-parsing is that it is in many ways exactly the opposite, or
dual, of LL-parsing. Some of these ways are:
• LL does a leftmost derivation, LR does a rightmost derivation
• LL starts with only the root nonterminal on the stack, LR ends with only the
root nonterminal on the stack
• LL ends when the stack is empty, LR starts with an empty stack
• LL uses the stack for designating what is still to be expected, LR uses the stack
for designating what has already been seen
• LL builds the parse tree top down, LR builds the parse tree bottom up
• LL continuously pops a nonterminal off the stack, and pushes a corresponding
right hand side; LR tries to recognize a right hand side on the stack, pops it,
and pushes the corresponding nonterminal
• LL thus expands nonterminals, while LR reduces them
• LL reads a terminal when it pops one off the stack, LR reads terminals while it
pushes them on the stack
• LL uses grammar rules in an order which corresponds to pre-order traversal of
the parse tree, LR does a post-order traversal.
11.2.1. A stack machine for SLR parsing
As in Section 10.1.1 a stack machine is used to parse sentences. A stack machine for
a grammar G has a stack and an input. When the parsing starts the stack is empty.
The stack machine performs one of the following two actions.
1. Shift: Move the first symbol of the remaining input to the top of the stack.
2. Reduce: Choose a production rule N → α; pop the sequence α from the top
of the stack; push N onto the stack.
These actions are performed until the stack only contains the start symbol. A stack
machine for G accepts an input if it can terminate with only the start symbol on
the stack when the whole input has been processed. Let us take the example of
Section 10.1.1.
stack     input
          aab
a         ab
aa        b
baa
Saa
Sa
S
Note that with our notation the top of the stack is at the left side. To decide which
production rule is to be used for the reduction, the symbols on the stack must be
read in reverse order.
The stack machine for G accepts the input string aab because it terminates with the
start symbol on the stack and the input is completely processed. The stack machine
performs three shift actions and then three reduce actions.
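The two actions can also be captured in a few lines of Haskell. The sketch below is a naive, nondeterministic search that simply tries every possible reduction and every shift; it is not the SLR machine developed in the following sections, and it may fail to terminate for grammars with ε-rules or cyclic chain rules. All names and the example grammar in the comment are assumptions:

import Data.List (isPrefixOf)

type Prod = (Char, String)            -- (nonterminal, right hand side)

accepts :: Char -> [Prod] -> String -> Bool
accepts start ps = go []
  where
    -- the top of the stack is at the left, as in the trace above, so a
    -- right hand side appears reversed at the front of the stack
    go stack input =
         (stack == [start] && null input)
      || or [ go (n : drop (length rhs) stack) input          -- Reduce
            | (n, rhs) <- ps, not (null rhs)
            , reverse rhs `isPrefixOf` stack ]
      || (case input of                                       -- Shift
            (x:xs) -> go (x : stack) xs
            []     -> False)

-- For instance, with the (hypothetical) grammar S -> aS, S -> b:
--   accepts 'S' [('S',"aS"),('S',"b")] "aab"   evaluates to True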
The SLR parser only performs a reduction with a grammar rule N → α if the next
symbol on the input is in the follow set of N.
Exercise 11.1. Give for gramm1 of Section 10.1.2 all the states of the stack machine for a
derivation of the input ccccba.
Exercise 11.2. Give for gramm2 of Section 10.1.2 all the states of the stack machine for a
derivation of the input abbb.
Exercise 11.3. Give for gramm3 of Section 10.1.2 all the states of the stack machine for a
derivation of the input acbab.
11.3. LR parse example
11.3.1. An LR checker
For a start, the main function for LR parsing calls a generalized stack function with
an empty stack:
check’ :: String -> Bool
check’ input = run’ [] input
The stack machine terminates when it finds the stack containing just the root non-
terminal. Otherwise it either pushes the first input symbol on the stack (‘Shift’), or
it drops some symbols off the stack (which should be the right hand side of a rule)
and pushes the corresponding nonterminal (‘Reduce’).
run’ :: Stack -> String -> Bool
run’ [’S’] [] = True
run’ [’S’] (x:xs) = False
run’ stack (x:xs) = case action of
Shift -> run’ (x: stack) xs
Reduce a n -> run’ (a:drop n stack) (x:xs)
Error -> False
where action = select’ stack x
In the case of LL-parsing the hard part was selecting the right rule to expand; here
we have the hard decision of whether to reduce according to a (and which?) rule, or
to shift the next symbol. This is done by the select’ function, which is allowed to
inspect the first input symbol x and the entire stack: after all, it needs to find the
right hand side of a rule on the stack.
Figure 11.1 shows the LR derivation of the sentence 1+2*3 next to the LL derivation
of the same sentence. Compare the two closely, and note the duality of the processes.
LL derivation

rule      read     ↓stack    remaining
                   S         1+2*3
S → E              E         1+2*3
E → TP             TP        1+2*3
T → FM             FMP       1+2*3
F → N              NMP       1+2*3
N → 1              1MP       1+2*3
read      1        MP        +2*3
M →       1        P         +2*3
P → +E    1        +E        +2*3
read      1+       E         2*3
E → TP    1+       TP        2*3
T → FM    1+       FMP       2*3
F → N     1+       NMP       2*3
N → 2     1+       2MP       2*3
read      1+2      MP        *3
M → *T    1+2      *TP       *3
read      1+2*     TP        3
T → FM    1+2*     FMP       3
F → N     1+2*     NMP       3
N → 3     1+2*     3MP       3
read      1+2*3    MP
M →       1+2*3    P
P →       1+2*3

LR derivation

rule      read     stack↓    remaining
                             1+2*3
shift     1        1         +2*3
N → 1     1        N         +2*3
F → N     1        F         +2*3
M →       1        FM        +2*3
T → FM    1        T         +2*3
shift     1+       T+        2*3
shift     1+2      T+2       *3
N → 2     1+2      T+N       *3
F → N     1+2      T+F       *3
shift     1+2*     T+F*      3
shift     1+2*3    T+F*3
N → 3     1+2*3    T+F*N
F → N     1+2*3    T+F*F
M →       1+2*3    T+F*FM
T → FM    1+2*3    T+F*T
M → *T    1+2*3    T+FM
T → FM    1+2*3    T+T
P →       1+2*3    T+TP
E → TP    1+2*3    T+E
P → +E    1+2*3    TP
E → TP    1+2*3    E
S → E     1+2*3    S

Figure 11.1.: LL and LR derivations of 1+2*3
11.3.2. LR action selection
The choice whether to shift or to reduce is made by function select’. It is defined
as follows:
select’ as x
| null items = Error
| null redItems = Shift
| otherwise = Reduce a (length rs)
where items = dfa as
redItems = filter red items
(a,rs,_) = hd redItems
In the selection process a set (list) of so-called items plays a role. If the set is empty,
there is an error. If the set contains at least one item, we filter the red, or reducible
items from it. There should be only one, if the grammar has the LR-property. (Or
rather: it is the LR-property that there is only one element in this situation). The
reducible item is the production rule that can be reduced by the stack machine.
Now what are these items? An item is defined to be a production rule, augmented
with a ‘cursor position’ somewhere in the right hand side of the rule. So for the rule
F → (E), there are four possible items: F → •(E), F → (•E), F → (E•) and
F → (E)•, where • denotes the cursor.
For the rule F → 2 we have two items: one having the cursor in front of the single
symbol, and one with the cursor after it: F → •2 and F → 2•. For epsilon-rules
there is a single item, where the cursor is at position 0 in the right hand side.
In the Haskell representation, the cursor is an integer which is added as a third
element of the tuple, which already contains nonterminal and right hand side of the
rule.
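A possible concrete form of this representation (the names below are assumptions, not the notes' code) is:

type Item s = (s, [s], Int)           -- nonterminal, right hand side, cursor position

-- the four items of the rule F -> (E), with the cursor at positions 0..3
itemsF :: [Item Char]
itemsF = [ ('F', "(E)", c) | c <- [0 .. 3] ]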
The items thus constructed are taken to be the states of a NFA (Nondeterminis-
tic Finite-state Automaton), as described in Section 8.1.2. We have the following
transition relations in this NFA:
• The cursor can be ‘stepped’ to the next position. This transition is labeled
with the symbol that is hopped over
• If the cursor is on front of a nonterminal, we can jump to an item describing
the application of that nonterminal, where the cursor is at the beginning. This
relation is an epsilon-transition of the NFA.
As an example, let’s consider a simplification of the arithmetic expression gram-
mar:
E → T
E → T + E
T → N
T → N * T
T → ( E )
It skips the ‘factor’ notion as compared to the grammar earlier in this chapter, so it
misses some well-formed expressions, but it serves only as an example for creating
the states here. There are 18 states in this machine, depicted in the NFA picture that
the following exercises refer to (the picture itself is not reproduced here).
Exercise 11.4 (no answer provided). How can you predict the number of states from in-
specting the grammar?
Exercise 11.5 (no answer provided). In the picture, the transition arrow labels are not
shown. Add them.
Exercise 11.6 (no answer provided). What makes this FA nondeterministic?
As was shown in Section 8.1.4, another automaton can be constructed that is de-
terministic (a DFA). That construction involves defining states which are sets of the
original states. So in this case, the states of the DFA are sets of items (where items
are rules with a cursor position). In the worst case we would have 2^18 states for the
DFA-version of our example NFA, but it turns out that we are in luck: there are only
11 states in the DFA. Its states are rather complicated to depict, as they are sets of
items, but it can be done; the resulting DFA picture is likewise not reproduced here.
Exercise 11.7 (no answer provided). Check that this FA is indeed deterministic.
Given this DFA, let’s return to our function that selects whether to shift or to re-
duce:
select’ as x
| null items = Error
| null redItems = Shift
| otherwise = Reduce a (length rs)
where items = dfa as
redItems = filter red items
(a,rs,_) = hd redItems
It runs the DFA, using the contents of the stack (as) as input. From the state where
the DFA ends, which is by construction a set of items, the ‘red’ ones are filtered out.
An item is ‘red’ (that is: reducible) if the cursor is at the end of the rule.
Exercise 11.8 (no answer provided). Which of the states in the picture of the DFA contain
red items? How many?
This is not the only condition that makes an item reducible; the second condition
is that the lookahead input symbol is in the follow set of the nonterminal that is
reduced to. Function follow was also needed in the LL analysis in Section 10.2.8.
Both conditions are in the formal definition:
... where red (a,r,c) = c==length r && x ‘elem‘ follow a
Exercise 11.9 (no answer provided). How does this condition reflect that ‘the cursor is at
the end’ ?
11.3.3. LR optimizations and generalizations
The stack machine run’, by its nature, pushes and pops the stack continuously, and
does a recursive call afterwards. In each call, for making the shift/reduce decision,
the DFA is run on the (new) stack. In practice, it is considered a waste of time
to do the full DFA transitions each time, as most of the stack remains the same
after some popping and pushing at the top. Therefore, as an optimization, at each
stack position, the corresponding DFA state is also stored. The states of the DFA
can easily be numbered, so this amounts to just storing extra integers on the stack,
tupled with the symbols that used to be on the stack. (An artificial bottom element
should be placed on the stack initially, containing a dummy symbol and the number
of the initial DFA state).
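A sketch of such an annotated stack (the names and the dummy symbol are assumptions):

type DfaState       = Int
type AnnotatedStack = [(Char, DfaState)]    -- top of the stack at the front

-- the artificial bottom element: a dummy symbol paired with the initial state
bottom :: AnnotatedStack
bottom = [('#', 0)]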
By analysing the grammar, two tables can be precomputed:
• Shift, that decides what is the new state when pushing a terminal symbol on
the stack. This basically is the transition relation on the DFA.
• Action, that decides what action to take from a given state seeing a given input
symbol.
Both tables can be implemented as a two-dimensional table of integers, of dimensions
the number of states (typically, 100s to 1000s) times the number of symbols (typically,
under 100).
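In Haskell these tables could be represented, for instance, with immutable arrays indexed by a state and a symbol; the concrete types below are an assumption, since the notes only describe the tables abstractly:

import Data.Array

type State       = Int
type ShiftTable  = Array (State, Char) State   -- new state after pushing a symbol
type ActionTable = Array (State, Char) Int     -- encoded action: shift, reduce (rule number), or error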
As said in the introduction, the algorithm described here is a mere Simple LR parsing,
or SLR(1). Its simplicity is in the reducibility test, which says:
... where red (a,r,c) = c==length r && x ‘elem‘ follow a
The follow set is a rough approximation of what might follow a given nonterminal.
But this set does not depend on the context in which the nonterminal is used; maybe,
in some contexts, the set is smaller. So the SLR red function may designate items
as reducible where it actually should not. For some grammars this might lead to
a decision to reduce where it should have done a shift, with a failing parser as a
consequence.
An improvement, leading to a wider class of grammars that are allowable, would be
to make the follow set context dependent. This means that it should vary for each
item instead of for each nonterminal. It leads to a dramatic increase of the number
of states in the DFA. And the full power of LR parsing is rarely needed.
A compromise position is taken by the so-called LALR parsers, or Look Ahead LR
parsing. (A rather silly name, as all parsers look ahead. . . ). In LALR parsers,
the follow sets are context dependent, but when states in the DFA differ only with
respect to the follow-part of their set-members (and not with respect to the item-
part of them), the states are merged. Grammars that do not give rise to shift/reduce
conflicts in this situation are said to be LALR-grammars. It is not really a natural
notion; rather, it is a performance hack that gets most of LR power while keeping
the size of the goto- and action-tables reasonable.
A widely used parser generator tool named yacc (for ‘yet another compiler compiler’)
is based on an LALR engine. It comes with Unix, and was originally created for
implementing the first C compilers. A commonly used clone of yacc is named Bison.
Yacc is designed for doing the context-free aspects of analysing a language. The
micro structure of identifiers, numbers, keywords, comments etc. is not handled by
this grammar. Instead, it is described by regular expressions, which are analysed by an
accompanying tool to yacc named lex (for ‘lexical scanner’). Lex is a preprocessor to
yacc, that subdivides the input character stream into a stream of meaningful tokens,
such as numbers, identifiers, operators, keywords etc.
Bibliographical notes
The example DFA and NFA for LR parsing, and part of the description of the algo-
rithm were taken from course notes by Alex Aiken and George Necula, which can be
found at www.cs.wright.edu/~tkprasad/courses/cs780
Bibliography
[1] A.V. Aho, R. Sethi, and J.D. Ullman. Compilers — Principles, Techniques and
Tools. Addison-Wesley, 1986.
[2] R.S. Bird. Using circular programs to eliminate multiple traversals of data. Acta
Informatica, 21:239–250, 1984.
[3] R.S. Bird and P. Wadler. Introduction to Functional Programming. Prentice
Hall, 1988.
[4] W.H. Burge. Parsing. In Recursive Programming Techniques. Addison-Wesley,
1975.
[5] J. Fokker. Functional parsers. In J. Jeuring and E. Meijer, editors, Advanced
Functional Programming, volume 925 of Lecture Notes in Computer Science.
Springer-Verlag, 1995.
[6] R. Harper. Proof-directed debugging. Journal of Functional Programming, 1999.
To appear.
[7] G. Hutton. Higher-order functions for parsing. Journal of Functional Program-
ming, 2(3):323 – 343, 1992.
[8] B.W. Kernighan and R. Pike. Regular expressions — languages, algorithms, and
software. Dr. Dobb’s Journal, April:19 – 22, 1999.
[9] D.E. Knuth. Semantics of context-free languages. Math. Syst. Theory, 2(2):127–
145, 1968.
[10] Niklas Röjemo. Garbage collection and memory efficiency in lazy functional
languages. PhD thesis, Chalmers University of Technology, 1995.
[11] S. Sippu and E. Soisalon-Soininen. Parsing Theory, Vol. 1: Languages and
Parsing, volume 15 of EATCS Monographs on Theoretical Computer Science.
Springer-Verlag, 1988.
[12] S.D. Swierstra and P.R. Azero Alcocer. Fast, error correcting parser combinators:
a short tutorial. In SOFSEM’99, 1999.
[13] P. Wadler. How to replace failure by a list of successes: a method for excep-
tion handling, backtracking, and pattern matching in lazy functional languages.
In J.P. Jouannaud, editor, Functional Programming Languages and Computer
Architecture, pages 113 – 128. Springer, 1985. LNCS 201.
A. The Stack module
module Stack
( Stack
, emptyStack
, isEmptyStack
, push
, pushList
, pop
, popList
, top
, split
, mystack
)
where
data Stack x = MkS [x] deriving (Show,Eq)
emptyStack :: Stack x
emptyStack = MkS []
isEmptyStack :: Stack x -> Bool
isEmptyStack (MkS xs) = null xs
push :: x -> Stack x -> Stack x
push x (MkS xs) = MkS (x:xs)
pushList :: [x] -> Stack x -> Stack x
pushList xs (MkS ys) = MkS (xs ++ ys)
pop :: Stack x -> Stack x
pop (MkS xs) = if isEmptyStack (MkS xs)
then error "pop on emptyStack"
else MkS (tail xs)
popIf :: Eq x => x -> Stack x -> Stack x
popIf x stack = if top stack == x
then pop stack
else error "argument and top of stack don’t match"
popList :: Eq x => [x] -> Stack x -> Stack x
popList xs stack = foldr popIf stack (reverse xs)
top :: Stack x -> x
top (MkS xs) = if isEmptyStack (MkS xs)
then error "top on emptyStack"
else head xs
split :: Int -> Stack x -> ([x], Stack x)
split 0 stack = ([], stack)
split n (MkS []) = error "attempt to split the emptystack"
split n (MkS (x:xs)) = (x:ys, stack')
  where
  (ys, stack') = split (n-1) (MkS xs)
mystack = MkS [1,2,3,4,5,6,7]
B. Answers to exercises
2.1 Three of the four strings are elements of L∗: abaabaaabaa, aaaabaaaa,
baaaaabaa.
2.2 {ε}.
2.3
∅L
= { definition of concatenation of languages }
{ st | s ∈ ∅, t ∈ L }
= { there is no s with s ∈ ∅ }
∅
The other equalities can be proved in a similar fashion.
2.4 The star operator on sets injects the elements of a set in a list; the star operator
on languages concatenates the sentences of the language. The former star operator
preserves more structure.
2.5 Section 2.1 contains an inductive definition of the set of sequences over an arbi-
trary set X. Syntactical definitions for such sets follow immediately from this.
1. A grammar for X = {a} is given by
S → ε
S → aS
2. A grammar for X = {a, b} is given by
S → ε
S → X S
X → a | b
2.6 A context free grammar for L is given by
S → ε
S → aSb
2.7 Analogous to the construction of the grammar for PAL.
P → ε
  | a
  | b
  | aPa
  | bPb
2.8 Analogous to the construction of PAL.
M → ε
  | aMa
  | bMb
2.9 First establish an inductive definition for parity sequences. An example of a
grammar that can be derived from the inductive definition is:
P → ε | 1P1 | 0P | P0
There are many other solutions.
2.10 Again, establish an inductive definition for L. An example of a grammar that
can be derived from the inductive definition is:
S → ε | aSb | bSa | SS
Again, there are many other solutions.
2.11 A sentence is a sentential form consisting only of terminals which can be de-
rived in zero or more derivation steps from the start symbol (to be more precise:
the sentential form consisting only of the start symbol). The start symbol is a non-
terminal. The nonterminals of a grammar do not belong to the alphabet (the set
of terminals) of the language we describe using the grammar. Therefore the start
symbol cannot be a sentence of the language. As a consequence we have to perform
at least one derivation step from the start symbol before we end up with a sentence
of the language.
2.12 The language consisting of the empty string only, i.e., {ε}.
2.13 This grammar generates the empty language, i.e., ∅. In general, grammars such
as this one that have no production rules without nonterminals on the right hand
side, cannot produce any sentences with only terminal symbols. Each derivation will
always contain nonterminals, so no sentences can be derived.
2.14 The sentences in this language consist of zero or more concatenations of ab, i.e.,
the language is the set {ab}∗.
2.15 Yes. Each finite language is context free. A context free grammar can be
obtained by taking one nonterminal and adding a production rule for each sentence
in the language. For the language in Exercise 2.1, this procedure yields
S → ab
S → aa
S → baa
2.16 To bring the grammar into the form where we can directly apply the rule for
associative separators, we introduce a new nonterminal:
A → AaA
A → B
B → b | c
Now we can remove the ambiguity:
A → BaA
A → B
B → b | c
It is now (optionally) possible to undo the auxiliary step of introducing the additional
nonterminal by applying the rules for substituting right hand sides for nonterminals
and removing unreachable productions. We then obtain:
A → baA | caA
A → b | c
2.17
1. Here are two parse trees for the sentence if b then if b then a else a:
if b then (if b then a) else a       (the else part belongs to the outer if)
if b then (if b then a else a)       (the else part belongs to the inner if)
2. The rule we apply is: match else with the closest previous unmatched if.
This means we prefer the second of the two parse trees above. The disambiguating
rule is incorporated directly into the grammar:
S → MatchedS | UnmatchedS
MatchedS → if b then MatchedS else MatchedS
         | a
UnmatchedS → if b then S
           | if b then MatchedS else UnmatchedS
3. An else clause is always matched with the closest previous unmatched if.
2.18 An equivalent grammar for bit lists is
L → B Z | B
Z → , L Z | , L
B → 0 | 1
2.19
1. The grammar generates the language {a^(2n) b^m | m, n ∈ N}.
2. An equivalent non left recursive grammar is
S → AB
A → ε | aaA
B → ε | bB
2.20 Of course, we can choose how we want to represent the different operators in
concrete syntax. Choosing standard symbols, this is one possibility:
Expr → Expr + Expr
     | Expr * Expr
     | Int
where Int is a nonterminal that produces integers. Note that the grammar given
above is ambiguous. We could also give an unambiguous version, for example by
introducing operator priorities.
2.21 Recall the grammar for palindromes from Exercise 2.7, now with names for the
productions:
Empty: P → ε
A:     P → a
B:     P → b
A2:    P → aPa
B2:    P → bPb
We construct the datatype Pal by interpreting the nonterminal as a datatype and the
names of the productions as names of the constructors:
data Pal = Empty | A | B | A2 Pal | B2 Pal
Note that once again we keep the nonterminals on the right hand sides as arguments
to the constructors, but omit all the terminal symbols.
The strings abaaba and baaab can be derived as follows:
P ⇒ aPa ⇒ abPba ⇒ abaPaba ⇒ abaaba
P ⇒ bPb ⇒ baPab ⇒ baaab
The parse trees corresponding to the derivations follow the derivations directly: the
first nests P-nodes for a...a, b...b and a...a around an empty palindrome, the second
nests P-nodes for b...b and a...a around the single letter a.
Consequently, the desired Haskell definitions are
pal1 = A2 (B2 (A2 Empty))
pal2 = B2 (A2 A)
2.22
1.
printPal :: Pal → String
printPal Empty = ""
printPal A = "a"
printPal B = "b"
printPal (A2 p) = "a" ++ printPal p ++ "a"
printPal (B2 p) = "b" ++ printPal p ++ "b"
Note how the function follows the structure of the datatype Pal closely, and
calls itself recursively wherever a recursive value of Pal occurs in the datatype.
Such a pattern is typical for semantic functions.
2.
aCountPal :: Pal → Int
aCountPal Empty = 0
aCountPal A = 1
aCountPal B = 0
aCountPal (A2 p) = aCountPal p + 2
aCountPal (B2 p) = aCountPal p
2.23 Recall the grammar from Exercise 2.8, this time with names for the produc-
tions:
MEmpty: M → ε
MA: M → aMa
MB: M → bMb
1. By systematically transforming the grammar, we obtain the following datatype:
data Mir = MEmpty | MA Mir | MB Mir
The concrete mirror palindromes cMir1 and cMir2 correspond to the following
terms of type Mir:
aMir1 = MA (MB (MA MEmpty))
aMir2 = MA (MB (MB MEmpty))
2.
printMir :: Mir → String
printMir MEmpty = ""
printMir (MA m) = "a" ++ printMir m ++ "a"
printMir (MB m) = "b" ++ printMir m ++ "b"
3.
mirToPal :: Mir → Pal
mirToPal MEmpty = Empty
mirToPal (MA m) = A2 (mirToPal m)
mirToPal (MB m) = B2 (mirToPal m)
2.24 Recall the grammar from Exercise 2.9, this time with names for the produc-
tions:
Stop: P → ε
POne: P → 1P1
PZeroL: P → 0P
PZeroR: P → P0
1.
data Parity = Stop | POne Parity | PZeroL Parity | PZeroR Parity
aEven1 = PZeroL (PZeroL (POne (PZeroL Stop)))
aEven2 = PZeroL (PZeroR (POne (PZeroL Stop)))
Note that the grammar is ambiguous, and other representations for cEven1 and
cEven2 are possible, for instance:
aEven1' = PZeroL (PZeroL (POne (PZeroR Stop)))
aEven2' = PZeroR (PZeroL (POne (PZeroL Stop)))
2.
printParity :: Parity → String
printParity Stop = ""
printParity (POne p) = "1" ++ printParity p ++ "1"
printParity (PZeroL p) = "0" ++ printParity p
printParity (PZeroR p) = printParity p ++ "0"
2.25 A grammar for bit lists that is not left-recursive is the following:
L → B Z | B
Z → , L Z | , L
B → 0 | 1
1.
data BitList = ConsBit Bit Z | SingleBit Bit
data Z = ConsBitList BitList Z | SingleBitList BitList
data Bit = Bit0 | Bit1
aBitList1 = ConsBit Bit0 (ConsBitList (SingleBit Bit1)
                                      (SingleBitList (SingleBit Bit0)))
aBitList2 = ConsBit Bit0 (ConsBitList (SingleBit Bit0)
                                      (SingleBitList (SingleBit Bit1)))
2.
printBitList :: BitList → String
printBitList (ConsBit b z) = printBit b ++ printZ z
printBitList (SingleBit b) = printBit b
printZ :: Z → String
printZ (ConsBitList bs z) = "," ++ printBitList bs ++ printZ z
printZ (SingleBitList bs) = "," ++ printBitList bs
printBit :: Bit → String
printBit Bit0 = "0"
printBit Bit1 = "1"
When multiple datatypes are involved, semantic functions typically still follow
the structure of the datatypes closely. We get one function per datatype, and
the functions call each other recursively where appropriate – we say they are
mutually recursive.
3. We can still make the concatenation function structurally recursive in the first
of the two bit lists. We never have to match on the second bit list:
concatBitList :: BitList → BitList → BitList
concatBitList (ConsBit b z) cs = ConsBit b (concatZ z cs)
concatBitList (SingleBit b) cs = ConsBit b (SingleBitList cs)
concatZ :: Z → BitList → Z
concatZ (ConsBitList bs z) cs = ConsBitList bs (concatZ z cs)
concatZ (SingleBitList bs) cs = ConsBitList bs (SingleBitList cs)
2.26 We only give the EBNF notation for the productions that change.
Digs → Dig∗
Int → Sign? Nat
AlphaNums → AlphaNum∗
AlphaNum → Letter | Dig
2.27 L(G?) = L(G) ∪ {ε}
2.28
1. L1 is generated by:
S → ZC
Z → aZb | ε
C → c∗
and L2 is generated by:
S → AZ
A → a∗
Z → bZc | ε
2. We have that
L1 ∩ L2 = {a^n b^n c^n | n ∈ N}
However, this language is not context-free, i. e., there is no context-free grammar
that generates this language. We will see in Chapter 9 how to prove such a
statement.
2.29 No. Furthermore, for any language L, since ε ∈ L∗, we have that ε is not an
element of the complement of L∗. On the other hand, ε is an element of (the complement
of L)∗. Thus the complement of L∗ and the star of the complement of L cannot be equal.
2.30 For example L = {x^n | n ∈ N}. For any language L, it holds that
L∗ = (L∗)∗
so, given any language L, the language L∗ fulfills the desired property.
2.31 This is only the case when ε ∉ L.
2.32 No. The language L = {aab, baa} also satisfies L = L^R.
2.33
1. The shortest derivation is three steps long and yields the sentence aa. The
sentences baa, aba, and aab can all be derived in four steps.
2. Several derivations are possible for the string babbab. Two of them are
S ⇒ AA ⇒ bAA ⇒ bAbA ⇒ babA ⇒ babbA ⇒ babbAb ⇒ babbab
S ⇒ AA ⇒ AAb ⇒ bAAb ⇒ bAbAb ⇒ bAbbAb ⇒ babbAb ⇒ babbab
3. A leftmost derivation is:
S ⇒ AA ⇒∗ b^m AA ⇒∗ b^m Ab^n A ⇒ b^m ab^n A ⇒∗ b^m ab^n Ab^p ⇒ b^m ab^n ab^p
2.34 The grammar is equivalent to the grammar
S → aaB
B → bBba
B → a
This grammar generates the string aaa and the strings aab^m a(ba)^m for m ∈ N,
m ≥ 1. The string aabbaabba does not appear in this language.
2.35 The language L is generated by:
S → aSa | bSb | c
The derivation is:
S ⇒ aSa ⇒ abSba ⇒ abcba
2.36 The language generated by the grammar is
{a^n b^n | n ∈ N}
The same language is also generated by the grammar
S → aAb | ε
2.37 The first language is
{a^n | n ∈ N}
This language is also generated by the grammar
S → aS | ε
The second language is
{ε} ∪ {a^(2n+1) | n ∈ N}
This language is also generated by the grammar
S → A | ε
A → a | aAa
or using EBNF notation
S → (a(aa)∗)?
2.38 All three grammars generate the language
{a^n | n ∈ N}
2.39
S → A | ε
A → aAb | ab
2.40 The language is
L = {a^(2n+1) | n ∈ N}
A grammar for L without left-recursive productions is
A → aaA | a
And a grammar without right-recursive productions is
A → Aaa | a
2.41 The language is
{ab^n | n ∈ N}
A grammar for L without left-recursive productions is
X → aY
Y → bY | ε
A grammar for L without left-recursive productions that is also non-contracting is
X → aY | a
Y → bY | b
2.42 A grammar that uses only productions with two or less symbols on the right
hand side:
S → T | US
T → Xa | Ua
X → aS
U → S | YT
Y → SU
The sentential forms aS and SU have been abstracted to nonterminals X and Y.
A grammar for the same language with only two nonterminals:
S → aSa | Ua | US
U → S | SUaSa | SUUa
The nonterminal T has been substituted for its alternatives aSa and Ua.
2.43
S → 1O
O → 1O | 0N
N → 1∗
2.44 The language is generated by the grammar:
S → (A) | SS
A → S | ε
A derivation for ( ) ( ( ) ) ( ) is:
S ⇒ SS ⇒ SSS ⇒ (A)SS ⇒ ()SS ⇒ ()(A)S ⇒ ()(S)S ⇒ ()((A))S
⇒ ()(())S ⇒ ()(())(A) ⇒ ()(())()
2.45 The language is generated by the grammar
S → (A) | [A] | SS
A → S | ε
A derivation for [ ( ) ] ( ) is:
S ⇒ SS ⇒ [A]S ⇒ [S]S ⇒ [(A)]S ⇒ [()]S ⇒ [()](A) ⇒ [()]()
2.46 First leftmost derivation:
Sentence
⇒ Subject Predicate
⇒ they Predicate
⇒ they Verb NounPhrase
⇒ they are NounPhrase
⇒ they are Adjective Noun
⇒ they are flying Noun
⇒ they are flying planes
Second leftmost derivation:
Sentence
⇒ Subject Predicate
⇒ they Predicate
⇒ they AuxVerb Verb Noun
⇒ they are Verb Noun
⇒ they are flying Noun
⇒ they are flying planes
2.48 Here is an unambiguous grammar for the language from Exercise 2.45:
S → (E)E | [E]E
E → ε | S
2.49 Here is a leftmost derivation for ♣3♣´♠.
⇒ ´⊗ ⇒ ⊗´⊗ ⇒ ⊗3⊕´⊗ ⇒ ⊕3⊕´⊗ ⇒ ♣3⊕´⊗
⇒ ♣3♣´⊗ ⇒ ♣3♣´⊕ ⇒ ♣3♣´♠
Notice that the grammar of this exercise is the same, up to renaming, as the gram-
mar
E → E + T | T
T → T * F | F
F → 0 | 1
2.50 The palindrome ε with length 0 can be generated by the grammar with the
derivation P ⇒ ε. The palindromes a, b, and c are the palindromes with length 1.
They can be derived with P ⇒ a, P ⇒ b, P ⇒ c, respectively.
Suppose that the palindrome s has length 2 or more. Then s can be written as
ata or btb or ctc where the length of t is strictly smaller, and t also is a palindrome.
Thus, by the induction hypothesis, there is a derivation P ⇒∗ t. But then, there is also
a derivation for s, for example P ⇒ aPa ⇒∗ ata in the first situation – the other two
We have now proved that any palindrome can be generated by the grammar – we
still have to prove that anything generated by the grammar is a palindrome, but this
is easy to see by induction over the length of derivations. Certainly ε, a, b, and c
are palindromes. And if s is a palindrome that can be derived, so are asa, bsb, and
csc.
3.1 Either we use the predefined predicate isUpper in module Data.Char,
capital = satisfy isUpper
or we make use of the ordering defined on characters,
capital = satisfy (λs → (’A’ ≤ s) ∧ (s ≤ ’Z’))
3.2 A symbol equal to a satisfies the predicate (= = a):
symbol a = satisfy (= = a)
3.3 The function epsilon is a special case of succeed:
epsilon :: Parser s ()
epsilon = succeed ()
3.4 Let xs :: [s]. Then
(f <$>succeed a) xs
= { definition of <$> }
[(f x, ys) | (x, ys) ← succeed a xs]
= { definition of succeed }
[(f a, xs)]
= { definition of succeed }
succeed (f a) xs
3.5 The type and results of (:) <$>symbol ’a’ are (note that you cannot write this
as a definition in Haskell):
((:) <$>symbol ’a’) :: Parser Char (String → String)
((:) <$>symbol ’a’) [] = []
((:) <$>symbol ’a’) (x : xs) | x = = ’a’ = [((x:), xs)]
                             | otherwise = []
3.6 The type and results of (:) <$>symbol ’a’ <∗>p are:
((:) <$>symbol ’a’ <∗>p) :: Parser Char String
((:) <$>symbol ’a’ <∗>p) [] = []
((:) <$>symbol ’a’ <∗>p) (x : xs)
  | x = = ’a’ = [(’a’ : x, ys) | (x, ys) ← p xs]
  | otherwise = []
3.7
pBool :: Parser Char Bool
pBool = const True <$>token "True"
    <|> const False <$>token "False"
3.9
1.
data Pal2 = Nil | Leafa | Leafb | Twoa Pal2 | Twob Pal2
2.
palin2 :: Parser Char Pal2
palin2 = (λ_ y _ → Twoa y) <$>
             symbol ’a’ <∗>palin2 <∗>symbol ’a’
     <|> (λ_ y _ → Twob y) <$>
             symbol ’b’ <∗>palin2 <∗>symbol ’b’
     <|> const Leafa <$>symbol ’a’
     <|> const Leafb <$>symbol ’b’
     <|> succeed Nil
3.
palina :: Parser Char Int
palina = (λ_ y _ → y + 2) <$>
             symbol ’a’ <∗>palina <∗>symbol ’a’
     <|> (λ_ y _ → y) <$>
             symbol ’b’ <∗>palina <∗>symbol ’b’
     <|> const 1 <$>symbol ’a’
     <|> const 0 <$>symbol ’b’
     <|> succeed 0
3.10
1.
data English = E1 Subject Pred
data Subject = E2 String
data Pred = E3 Verb NounP | E4 AuxV Verb Noun
data Verb = E5 String | E6 String
data AuxV = E7 String
data NounP = E8 Adj Noun
data Adj = E9 String
data Noun = E10 String
2.
english :: Parser Char English
english = E1 <$>subject <∗>pred
subject = E2 <$>token "they"
pred = E3 <$>verb <∗>nounp
   <|> E4 <$>auxv <∗>verb <∗>noun
verb = E5 <$>token "are"
   <|> E6 <$>token "flying"
auxv = E7 <$>token "are"
nounp = E8 <$>adj <∗>noun
adj = E9 <$>token "flying"
noun = E10 <$>token "planes"
3.11 As <|> uses ++, it is more efficiently evaluated if right-associative.
3.12 The function is the same as <∗>, but instead of applying the result of the first
parser to that of the second, it pairs them together:
(<,>) :: Parser s a → Parser s b → Parser s (a, b)
(p <,>q) xs = [ ((x, y), zs)
              | (x, ys) ← p xs
              , (y, zs) ← q ys
              ]
3.13 ‘Parser transformator’, or ‘parser modifier’ or ‘parser postprocessor’, etcetera.
3.14 The transformator <$> does to the result part of parsers what map does to
the elements of a list.
3.15 The parser combinators <∗> and <,> can be defined in terms of each other:
p <∗>q = uncurry ($) <$>(p <,>q)
p <,>q = (, ) <$>p <∗>q
3.16 Yes. You can combine the parser parameter of <$> with a parser that consumes
no input and always yields the function parameter of <$>:
f <$>p = succeed f <∗>p
3.17
f ::
Char → Parentheses → Char → Parentheses → Parentheses
open ::
Parser Char Char
f <$>open ::
Parser Char (Parentheses → Char → Parentheses → Parentheses)
parens ::
Parser Char Parentheses
(f <$>open) <∗>parens ::
Parser Char (Char → Parentheses → Parentheses)
3.18 To the left. Yes.
3.19 The function has to check whether applying the parser p to input s returns at
least one result with an empty rest sequence:
test p s = not (null (filter (null . snd) (p s)))
3.20
1.
listofdigits :: Parser Char [Int]
listofdigits = listOf newdigit (symbol ’ ’)
?> listofdigits "1 2 3"
[([1,2,3],""),([1,2]," 3"),([1]," 2 3")]
?> listofdigits "1 2 a"
[([1,2]," a"),([1]," 2 a")]
2. In order to use the parser chainr we first define a parser plusParser that recog-
nises the character ’+’ and returns the function (+).
plusParser :: Num a ⇒ Parser Char (a → a → a)
plusParser [] = []
plusParser (x : xs) | x = = ’+’ = [((+), xs)]
                    | otherwise = []
The definition of the parser sumParser is:
sumParser :: Parser Char Int
sumParser = chainr newdigit plusParser
?> sumParser "1+2+3"
[(6,""),(3,"+3"),(1,"+2+3")]
?> sumParser "1+2+a"
[(3,"+a"),(1,"+2+a")]
?> sumParser "1"
[(1,"")]
Note that the parser also recognises a single integer.
3. The parser many should be replaced by the parser greedy in the definition of
listOf.
3.21 We introduce the abbreviation
listOfa = (:) <$>symbol ’a’
and use the results of Exercises 3.5 and 3.6.
xs = []:
many (symbol ’a’) []
= ¦ definition of many and listOfa ¦
(listOfa <∗>many (symbol ’a’) <[>succeed []) []
= ¦ definition of <[> ¦
(listOfa <∗>many (symbol ’a’)) [] ++ succeed [] []
= ¦ Exercise 3.6, definition of succeed ¦
[] ++ [([], [])]
= ¦ definition of ++ ¦
[([], [])]
xs = [’a’]:
many (symbol ’a’) [’a’]
= ¦ definition of many and listOfa ¦
(listOfa <∗>many (symbol ’a’) <[>succeed []) [’a’]
= ¦ definition of <[> ¦
(listOfa <∗>many (symbol ’a’)) [’a’] ++ succeed [] [’a’]
= ¦ Exercise 3.6, previous calculation ¦
[([’a’], []), ([], [’a’])]
xs = [’b’]:
many (symbol ’a’) [’b’]
= ¦ as before ¦
(listOfa <∗>many (symbol ’a’)) [’b’] ++ succeed [] [’b’]
= ¦ Exercise 3.6, previous calculation ¦
[([], [’b’])]
xs = [’a’, ’b’]:
many (symbol ’a’) [’a’, ’b’]
= ¦ as before ¦
(listOfa <∗>many (symbol ’a’)) [’a’, ’b’] ++ succeed [] [’a’, ’b’]
= ¦ Exercise 3.6, previous calculation ¦
[([’a’], [’b’]), ([], [’a’, ’b’])]
xs = [’a’, ’a’, ’b’]:
many (symbol ’a’) [’a’, ’a’, ’b’]
= ¦ as before ¦
(listOfa <∗>many (symbol ’a’)) [’a’, ’a’, ’b’] ++ succeed [] [’a’, ’a’, ’b’]
= ¦ Exercise 3.6, previous calculation ¦
[([’a’, ’a’], [’b’]), ([’a’], [’a’, ’b’]), ([], [’a’, ’a’, ’b’])]
3.22 The empty alternative is presented last, because the <[> combinator uses list
concatenation for concatenating lists of successes. This also holds for the recursive
calls; thus the ‘greedy’ parsing of all three a’s is presented first, then two a’s with a
singleton rest string, then one a, and finally the empty result with the original input
as rest string.
3.24
— Combinators for repetition
psequence :: [Parser s a] → Parser s [a]
psequence [] = succeed []
psequence (p : ps) = (:) <$>p <∗>psequence ps
psequence′ :: [Parser s a] → Parser s [a]
psequence′ = foldr f (succeed [])
  where f p q = (:) <$>p <∗>q
choice :: [Parser s a] → Parser s a
choice = foldr (<|>) failp
?> (psequence [digit, satisfy isUpper]) "1A"
[("1A","")]
?> (psequence [digit, satisfy isUpper]) "1Ab"
[("1A","b")]
?> (psequence [digit, satisfy isUpper]) "1ab"
[]
?> (choice [digit, satisfy isUpper]) "1ab"
[(’1’,"ab")]
?> (choice [digit, satisfy isUpper]) "Ab"
[(’A’,"b")]
?> (choice [digit, satisfy isUpper]) "ab"
[]
3.25
token :: Eq s ⇒ [s] → Parser s [s]
token = psequence . map symbol
3.27
identifier :: Parser Char String
identifier = (:) <$>satisfy isAlpha <∗>greedy (satisfy isAlphaNum)
3.28
1. As Haskell terms:
"abc": Var "abc"
"(abc)": Var "abc"
"a*b+1": Var "a" :∗: Var "b" :+: Con 1
"a*(b+1)": Var "a" :∗: (Var "b" :+: Con 1)
"-1-a": Con (−1) :−: Var "a"
"a(1,b)": Fun "a" [Con 1, Var "b"]
2. The parser fact first tries to parse an integer, then a variable, then a function
application and finally a parenthesised expression. A function application is a
variable followed by an argument list. When the parser encounters a function
application, a variable will first be recognised. This first solution will however
not lead to a parse tree for the complete expression because the list of arguments
that comes after the variable cannot be parsed.
If we swap the second and the third line in the definition of the parser fact, the
parse tree for a function application will be the first solution of the parser:
fact :: Parser Char Expr
fact = Con <$>integer
   <|> Fun <$>identifier <∗>parenthesised (commaList expr)
   <|> Var <$>identifier
   <|> parenthesised expr
?> expr "a(1,b)"
[(Fun "a" [Con 1,Var "b"],""),(Var "a","(1,b)")]
3.29 A function with no arguments is not accepted by the parser:
?> expr "f()"
[(Var "f","()")]
The parser parenthesised (commaList expr) that is used in the parser fact does not
accept an empty list of arguments because commaList does not. To accept an empty
list we modify the parser fact as follows:
fact :: Parser Char Expr
fact = Con <$>integer
   <|> Fun <$>identifier
       <∗> parenthesised (commaList expr <|> succeed [])
   <|> Var <$>identifier
   <|> parenthesised expr
?> expr "f()"
[(Fun "f" [],""),(Var "f","()")]
3.30
expr = chainr (chainl term (const (:−:) <$>symbol ’-’))
(const (:+:) <$>symbol ’+’)
3.31 The datatype Expr is extended as follows to allow raising an expression to the
power of an expression:
data Expr = Con Int
          | Var String
          | Fun String [Expr]
          | Expr :+: Expr
          | Expr :−: Expr
          | Expr :∗: Expr
          | Expr :/: Expr
          | Expr :^: Expr
          deriving Show
Now the parser expr′ of Listing 3.8 can be extended with a new level of priorities:
powis = [(’^’, (:^:))]
expr′ :: Parser Char Expr
expr′ = foldr gen fact′ [addis, multis, powis]
Note that because of the use of chainl all the operators listed in addis, multis and
powis are treated as left-associative.
3.32 The proofs can be given by using laws for list comprehension, but here we prefer
to exploit the following equation
(f <$>p) xs = map (f ∗∗∗ id) (p xs) (B.1)
where (∗∗∗) is defined by
(∗∗∗) :: (a → c) → (b → d) → (a, b) → (c, d)
(f ∗∗∗ g) (a, b) = (f a, g b)
It has the following property:
(f ∗∗∗ g) . (h ∗∗∗ k) = (f . h) ∗∗∗ (g . k) (B.2)
Furthermore, we will use the following laws about map in our proof: map distributes
over composition, concatenation, and the function concat:
map f . map g = map (f . g) (B.3)
map f (x ++ y) = map f x ++ map f y (B.4)
map f . concat = concat . map (map f ) (B.5)
1.
  (h <$> (f <$> p)) xs
=   { (B.1) }
  map (h *** id) ((f <$> p) xs)
=   { (B.1) }
  map (h *** id) (map (f *** id) (p xs))
=   { (B.3) }
  map ((h *** id) . (f *** id)) (p xs)
=   { (B.2) }
  map ((h . f) *** id) (p xs)
=   { (B.1) }
  ((h . f) <$> p) xs
2.
  (h <$> (p <|> q)) xs
=   { (B.1) }
  map (h *** id) ((p <|> q) xs)
=   { definition of <|> }
  map (h *** id) (p xs ++ q xs)
=   { (B.4) }
  map (h *** id) (p xs) ++ map (h *** id) (q xs)
=   { (B.1) }
  (h <$> p) xs ++ (h <$> q) xs
=   { definition of <|> }
  ((h <$> p) <|> (h <$> q)) xs
3. First note that (p <∗>q) xs can be written as
(p <∗>q) xs = concat (map (mc q) (p xs)) (B.6)
where
mc q (f , ys) = map (f ∗∗∗ id) (q ys)
Now we calculate
  (((h.) <$> p) <*> q) xs
=   { (B.6) }
  concat (map (mc q) (((h.) <$> p) xs))
=   { (B.1) }
  concat (map (mc q) (map ((h.) *** id) (p xs)))
=   { (B.3) }
  concat (map (mc q . ((h.) *** id)) (p xs))
=   { (B.7), see below }
  concat (map ((map (h *** id)) . mc q) (p xs))
=   { (B.3) }
  concat (map (map (h *** id)) (map (mc q) (p xs)))
=   { (B.5) }
  map (h *** id) (concat (map (mc q) (p xs)))
=   { (B.6) }
  map (h *** id) ((p <*> q) xs)
=   { (B.1) }
  (h <$> (p <*> q)) xs
It remains to prove the claim
mc q . ((h.) ∗∗∗ id) = map (h ∗∗∗ id) . mc q (B.7)
This claim is also proved by calculation:
  ((map (h *** id)) . mc q) (f, ys)
=   { definition of . }
  map (h *** id) (mc q (f, ys))
=   { definition of mc q }
  map (h *** id) (map (f *** id) (q ys))
=   { map and *** distribute over composition }
  map ((h . f) *** id) (q ys)
=   { definition of mc q }
  mc q (h . f, ys)
=   { definition of *** }
  (mc q . ((h.) *** id)) (f, ys)
3.33
pMir :: Parser Char Mir
pMir = (\_ m _ -> MB m) <$> symbol 'b' <*> pMir <*> symbol 'b'
   <|> (\_ m _ -> MA m) <$> symbol 'a' <*> pMir <*> symbol 'a'
   <|> succeed MEmpty
3.34
pBitList :: Parser Char BitList
pBitList = SingleB <$> pBit
       <|> (\b _ bs -> ConsB b bs) <$> pBit <*> symbol ',' <*> pBitList

pBit = const Bit0 <$> symbol '0'
   <|> const Bit1 <$> symbol '1'
3.35
-- Parser for floating point numbers
fixed :: Parser Char Float
fixed = (+) <$> (fromIntegral <$> greedy integer)
        <*> (((\_ y -> y) <$> symbol '.' <*> fractpart) `option` 0.0)

fractpart :: Parser Char Float
fractpart = foldr f 0.0 <$> greedy newdigit
  where f d n = (n + fromIntegral d) / 10.0
3.36
float :: Parser Char Float
float = f <$> fixed
          <*> (((\_ y -> y) <$> symbol 'E' <*> integer) `option` 0)
  where f m e = m * power e
        power e | e < 0     = 1.0 / power (-e)
                | otherwise = fromIntegral (10^e)
3.37 Parse trees for Java assignments are of type:
data JavaAssign = JAssign String Expr
deriving Show
The parser is defined as follows:
assign :: Parser Char JavaAssign
assign = JAssign
     <$> identifier
     <*> ((\_ y _ -> y) <$> symbol '=' <*> expr <*> symbol ';')
?> assign "x1=(a+1)*2;"
[(JAssign "x1" (Var "a" :+: Con 1 :*: Con 2),"")]
?> assign "x=a+1"
[]
Note that the second example is not recognised as an assignment because the string
does not end with a semicolon.
4.1
data FloatLiteral = FL1 IntPart FractPart ExponentPart FloatSuffix
                  | FL2 FractPart ExponentPart FloatSuffix
                  | FL3 IntPart ExponentPart FloatSuffix
                  | FL4 IntPart ExponentPart FloatSuffix
                  deriving Show

type ExponentPart      = String
type ExponentIndicator = String
type SignedInteger     = String
type IntPart           = String
type FractPart         = String
type FloatSuffix       = String

digit  = satisfy isDigit
digits = many1 digit

floatLiteral = (\a b c d e -> FL1 a c d e)
                 <$> intPart <*> period <*> optfract <*> optexp <*> optfloat
           <|> (\a b c d -> FL2 b c d)
                 <$> period <*> fractPart <*> optexp <*> optfloat
           <|> (\a b c -> FL3 a b c)
                 <$> intPart <*> exponentPart <*> optfloat
           <|> (\a b c -> FL4 a b c)
                 <$> intPart <*> optexp <*> floatSuffix

intPart           = signedInteger
fractPart         = digits
exponentPart      = (++) <$> exponentIndicator <*> signedInteger
signedInteger     = (++) <$> option sign "" <*> digits
exponentIndicator = token "e" <|> token "E"
sign              = token "+" <|> token "-"
floatSuffix       = token "f" <|> token "F"
                <|> token "d" <|> token "D"
period            = token "."
optexp            = option exponentPart ""
optfract          = option fractPart ""
optfloat          = option floatSuffix ""
4.2 The data and type definitions are the same as before, only the parsers return
another (semantic) result.
digit = f <$> satisfy isDigit
where f c = ord c - ord ’0’
digits = foldl f 0 <$> many1 digit
where f a b = 10*a + b
floatLiteral = (\a b c d e -> (fromIntegral a + c) * power d) <$>
intPart <*> period <*> optfract <*> optexp <*> optfloat
<|> (\a b c d -> b * power c) <$>
period <*> fractPart <*> optexp <*> optfloat
<|> (\a b c -> (fromIntegral a) * power b) <$>
intPart <*> exponentPart <*> optfloat
<|> (\a b c -> (fromIntegral a) * power b) <$>
intPart <*> optexp <*> floatSuffix
intPart = signedInteger
fractPart = foldr f 0.0 <$> many1 digit
where f a b = (fromIntegral a + b)/10
exponentPart = (\x y -> y) <$> exponentIndicator <*> signedInteger
signedInteger = (\ x y -> x y) <$> option sign id <*> digits
exponentIndicator = symbol ’e’ <|> symbol ’E’
sign = const id <$> symbol ’+’
<|> const negate <$> symbol ’-’
floatSuffix = symbol ’f’ <|> symbol ’F’<|> symbol ’d’ <|> symbol ’D’
period = symbol ’.’
optexp = option exponentPart 0
optfract = option fractPart 0.0
optfloat = option floatSuffix ’ ’
power e | e < 0 = 1 / power (-e)
| otherwise = fromIntegral (10^e)
4.3 The parsing scheme for Java floats is
digit = f <$> satisfy isDigit
where f c = .....
digits = f <$> many1 digit
where f ds = ..
floatLiteral = f1 <$>
intPart <*> period <*> optfract <*> optexp <*> optfloat
<|> f2 <$>
period <*> fractPart <*> optexp <*> optfloat
<|> f3 <$>
intPart <*> exponentPart <*> optfloat
<|> f4 <$>
intPart <*> optexp <*> floatSuffix
where
f1 a b c d e = .....
f2 a b c d = .....
f3 a b c = .....
f4 a b c = .....
intPart = signedInteger
fractPart = f <$> many1 digit
where f ds = ....
exponentPart = f <$> exponentIndicator <*> signedInteger
where f x y = .....
signedInteger = f <$> option sign ?? <*> digits
where f x y = .....
exponentIndicator = f1 <$> symbol ’e’ <|> f2 <$> symbol ’E’
where
f1 c = ....
f2 c = ..
sign = f1 <$> symbol ’+’ <|> f2 <$> symbol ’-’
where
f1 h = .....
f2 h = .....
floatSuffix = f1 <$> symbol ’f’
<|> f2 <$> symbol ’F’
<|> f3 <$> symbol ’d’
<|> f4 <$> symbol ’D’
where
f1 c = .....
f2 c = .....
f3 c = .....
f4 c = .....
period = symbol ’.’
optexp = option exponentPart ??
optfract = option fractPart ??
optfloat = option floatSuffix ??
5.1
type LNTreeAlgebra a b x = (a → x, x → b → x → x)
foldLNTree :: LNTreeAlgebra a b x → LNTree a b → x
foldLNTree (leaf , node) = fold
where
fold (Leaf a) = leaf a
fold (Node l m r) = node (fold l ) m (fold r)
5.2
1. Definition of height by case analysis:
height (Leaf x) = 0
height (Bin lt rt) = 1 + (height lt ‘max‘ height rt)
Definition as a fold:
height :: BinTree x → Int
height = foldBinTree heightAlgebra
heightAlgebra = (λu v → 1 + (u ‘max‘ v), const 0)
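For example (tree chosen by us):

height (Bin (Leaf 1) (Bin (Leaf 2) (Leaf 3)))  ==  2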
2. Definition of flatten by case analysis:
flatten (Leaf x) = [x]
flatten (Bin lt rt) = flatten lt ++ flatten rt
Definition as a fold:
flatten :: BinTree x → [x]
flatten = foldBinTree ((++), λx → [x])
3. Definition of maxBinTree by case analysis:
maxBinTree (Leaf x) = x
maxBinTree (Bin lt rt) = maxBinTree lt ‘max‘ maxBinTree rt
Definition as a fold:
maxBinTree :: Ord x ⇒ BinTree x → x
maxBinTree = foldBinTree (max, id)
4. Definition of sp by case analysis:
sp (Leaf x) = 0
sp (Bin lt rt) = 1 + (sp lt) ‘min‘ (sp rt)
Definition as a fold:
sp :: BinTree x → Int
sp = foldBinTree spAlgebra
spAlgebra = (λu v → 1 + u ‘min‘ v, const 0)
5. Definition of mapBinTree by case analysis:
mapBinTree f (Leaf x) = Leaf (f x)
mapBinTree f (Bin lt rt) = Bin (mapBinTree f lt) (mapBinTree f rt)
Definition as a fold:
mapBinTree :: (a → b) → BinTree a → BinTree b
mapBinTree f = foldBinTree (Bin, Leaf . f )
5.3 Using explicit recursion:
allPaths (Leaf x) = [[]]
allPaths (Bin lt rt) = map (L:) (allPaths lt)
++ map (R:) (allPaths rt)
As a fold:
allPaths :: BinTree a → [[Direction]]
allPaths = foldBinTree psAlgebra
psAlgebra :: BinTreeAlgebra a [[Direction]]
psAlgebra = (λu v → map (L:) u ++ map (R:) v, const [[]])
5.4
1.
data Resist = Resist :|: Resist
            | Resist :*: Resist
            | BasicR Float
            deriving Show

type ResistAlgebra a = (a -> a -> a, a -> a -> a, Float -> a)

foldResist :: ResistAlgebra a -> Resist -> a
foldResist (par, seq, basic) = fold
  where
    fold (r1 :|: r2) = par (fold r1) (fold r2)
    fold (r1 :*: r2) = seq (fold r1) (fold r2)
    fold (BasicR f)  = basic f
2.
result :: Resist → Float
result = foldResist resultAlgebra
resultAlgebra :: ResistAlgebra Float
resultAlgebra = (λu v → (u ∗ v) / (u + v), (+), id)
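A small usage sketch (resistor values chosen by us): two resistors of 100 in parallel, followed in series by one of 50, give

result ((BasicR 100 :|: BasicR 100) :*: BasicR 50)  ==  100.0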
5.5
1. isSum = foldExpr ((&&)
,\x y -> False
,\x y -> False
,\x y -> False
,const True
,const True
,\x -> (&&)
)
2. vars = foldExpr ((++)
,(++)
,(++)
,(++)
,const []
,\x -> [x]
,\x y z -> x : (y ++ z)
)
5.6
1. der computes a symbolic differentiation of an expression.
2. The function der is not a compositional function on Expr because the right-hand
sides of the `Mul` and `Dvd` expressions do not only use der e1 dx and der e2 dx,
but also e1 and e2 themselves.
3.
data Exp = Exp `Plus` Exp
         | Exp `Sub` Exp
         | Con Float
         | Idf String
         deriving Show

type ExpAlgebra a = (a -> a -> a
                    , a -> a -> a
                    , Float -> a
                    , String -> a
                    )

foldExp :: ExpAlgebra a -> Exp -> a
foldExp (plus, sub, con, idf) = fold
  where
    fold (e1 `Plus` e2) = plus (fold e1) (fold e2)
    fold (e1 `Sub` e2)  = sub (fold e1) (fold e2)
    fold (Con n)        = con n
    fold (Idf s)        = idf s
4. Using explicit recursion:

der :: Exp -> String -> Exp
der (e1 `Plus` e2) dx = der e1 dx `Plus` der e2 dx
der (e1 `Sub` e2)  dx = der e1 dx `Sub` der e2 dx
der (Con f)        dx = Con 0
der (Idf s)        dx = if s == dx then Con 1 else Con 0

Using a fold:

der = foldExp derAlgebra

derAlgebra :: ExpAlgebra (String -> Exp)
derAlgebra = (\f g -> \s -> f s `Plus` g s
             , \f g -> \s -> f s `Sub` g s
             , \n -> \s -> Con 0
             , \s -> \t -> if s == t then Con 1 else Con 0
             )
5.7 Using explicit recursion:
replace (Leaf x) y = Leaf y
replace (Bin lt rt) y = Bin (replace lt y) (replace rt y)
As a fold:
replace :: BinTree a → a → BinTree a
replace = foldBinTree repAlgebra
repAlgebra = (λf g → λy → Bin (f y) (g y), λx y → Leaf y)
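For example (tree chosen by us):

replace (Bin (Leaf 1) (Leaf 2)) 7  ==  Bin (Leaf 7) (Leaf 7)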
5.8 Using explicit recursion:
path2Value (Leaf x) =
λbs → if null bs then x else error "no roothpath"
path2Value (Bin lt rt) =
λbs → case bs of
[] → error "no roothpath"
(L : rs) → path2Value lt rs
(R : rs) → path2Value rt rs
Using a fold:
path2Value :: BinTree a → [Direction] → a
path2Value = foldBinTree pvAlgebra
pvAlgebra :: BinTreeAlgebra a ([Direction] → a)
pvAlgebra = (λfl fr → λbs → case bs of
[] → error "no roothpath"
(L : rs) → fl rs
(R : rs) → fr rs
, λx → λbs → if null bs then x
else error "no roothpath")
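For example (tree and path chosen by us):

path2Value (Bin (Leaf 1) (Bin (Leaf 2) (Leaf 3))) [R, L]  ==  2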
5.9
1.
type PalAlgebra p = (p, p, p, p → p, p → p)
2.
foldPal :: PalAlgebra p -> Pal -> p
foldPal (pal1, pal2, pal3, pal4, pal5) = fPal
  where
    fPal Pal1     = pal1
    fPal Pal2     = pal2
    fPal Pal3     = pal3
    fPal (Pal4 p) = pal4 (fPal p)
    fPal (Pal5 p) = pal5 (fPal p)
3.
a2cPal = foldPal (""
, "a"
, "b"
, λp → "a" ++ p ++ "a"
, λp → "b" ++ p ++ "b"
)
aCountPal = foldPal (0, 1, 0, λp → p + 2, λp → p)
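For instance, the palindrome abba is represented abstractly as Pal4 (Pal5 Pal1), and the two folds give (example ours):

a2cPal (Pal4 (Pal5 Pal1))     ==  "abba"
aCountPal (Pal4 (Pal5 Pal1))  ==  2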
4.
pfoldPal :: PalAlgebra p -> Parser Char p
pfoldPal (pal1, pal2, pal3, pal4, pal5) = pPal
  where
    pPal = const pal1 <$> epsilon
       <|> const pal2 <$> syma
       <|> const pal3 <$> symb
       <|> (\_ p _ -> pal4 p) <$> syma <*> pPal <*> syma
       <|> (\_ p _ -> pal5 p) <$> symb <*> pPal <*> symb
    syma = symbol 'a'
    symb = symbol 'b'
5. The parser pfoldPal m1 returns the concrete representation of a palindrome.
The parser pfoldPal m2 returns the number of a's occurring in a palindrome.
5.10
1. type MirAlgebra m = (m,m->m,m->m)
2. foldMir :: MirAlgebra m -> Mir -> m
foldMir (mir1,mir2,mir3) = fMir where
fMir Mir1 = mir1
fMir (Mir2 m) = mir2 (fMir m)
fMir (Mir3 m) = mir3 (fMir m)
3. a2cMir = foldMir ("",\m->"a"++m++"a",\m->"b"++m++"b")
m2pMir = foldMir (Pal1,Pal4,Pal5)
4. pfoldMir :: MirAlgebra m -> Parser Char m
pfoldMir (mir1, mir2, mir3) = pMir where
pMir = const mir1 <$> epsilon
<|> (\_ m _ -> mir2 m) <$> syma <*> pMir <*> syma
<|> (\_ m _ -> mir3 m) <$> symb <*> pMir <*> symb
syma = symbol ’a’
symb = symbol ’b’
5. The parser pfoldMir m1 returns the concrete representation of a palindrome.
The parser pfoldMir m2 returns the abstract representation of a palindrome.
5.11
1. type ParityAlgebra p = (p,p->p,p->p,p->p)
2. foldParity :: ParityAlgebra p -> Parity -> p
foldParity (empty,parity1,parityL0,parityR0) = fParity where
fParity Empty = empty
fParity (Parity1 p) = parity1 (fParity p)
fParity (ParityL0 p) = parityL0 (fParity p)
fParity (ParityR0 p) = parityR0 (fParity p)
3. a2cParity = foldParity (""
,\x->"1"++x++"1"
,\x->"0"++x
,\x->x++"0"
)
5.12
1. type BitListAlgebra bl z b = ((b->z->bl,b->bl),(bl->z->z,bl->z),(b,b))
2. foldBitList :: BitListAlgebra bl z b -> BitList -> bl
foldBitList ((consb,singleb),(consbl,singlebl),(bit0,bit1)) = fBitList
where
fBitList (ConsB b z) = consb (fBit b) (fZ z)
fBitList (SingleB b) = singleb (fBit b)
fZ (ConsBL bl z) = consbl (fBitList bl) (fZ z)
fZ (SingleBL bl) = singlebl (fBitList bl)
fBit Bit0 = bit0
fBit Bit1 = bit1
3. a2cBitList = foldBitList (((++),id)
,((++),id)
,("0","1")
)
4. pfoldBitList :: BitListAlgebra bl z b -> Parser Char bl
pfoldBitList ((consb,singleb),(consbl,singlebl),(bit0,bit1)) = pBitList
where
pBitList = consb <$> pBit <*> pZ
<|> singleb <$> pBit
pZ = (\_ bl z -> consbl bl z) <$> symbol ’,’
<*> pBitList
<*> pZ
<|> singlebl <$> pBitList
pBit = const bit0 <$> symbol ’0’
<|> const bit1 <$> symbol ’1’
5.13
1.
data Block = B1 Stat Rest
data Rest  = R1 Stat Rest | Nix
data Stat  = S1 Decl | S2 Use | S3 Nest
data Decl  = Dx | Dy
data Use   = UX | UY
data Nest  = N1 Block

The abstract representation of x;(y;Y);X is

B1 (S1 Dx) (R1 stat2 rest2)

where

stat2 = S3 (N1 block)
block = B1 (S1 Dy) (R1 (S2 UY) Nix)
rest2 = R1 (S2 UX) Nix
2.
type BlockAlgebra b r s d u n =
(s → r → b
, (s → r → r, r)
, (d → s, u → s, n → s)
, (d, d)
, (u, u)
, b → n
)
3.
foldBlock :: BlockAlgebra b r s d u n -> Block -> b
foldBlock (b1, (r1, nix), (s1, s2, s3), (dx, dy), (ux, uy), n1) = foldB
  where
    foldB (B1 stat rest) = b1 (foldS stat) (foldR rest)
    foldR (R1 stat rest) = r1 (foldS stat) (foldR rest)
    foldR Nix            = nix
    foldS (S1 decl)      = s1 (foldD decl)
    foldS (S2 use)       = s2 (foldU use)
    foldS (S3 nest)      = s3 (foldN nest)
    foldD Dx             = dx
    foldD Dy             = dy
    foldU UX             = ux
    foldU UY             = uy
    foldN (N1 block)     = n1 (foldB block)
4.
a2cBlock :: Block -> String
a2cBlock = foldBlock a2cBlockAlgebra

a2cBlockAlgebra :: BlockAlgebra String String String String String String
a2cBlockAlgebra =
  (b1, (r1, nix), (s1, s2, s3), (dx, dy), (ux, uy), n1)
  where
    b1 u v = u ++ v
    r1 u v = ";" ++ u ++ v
    nix    = ""
    s1 u   = u
    s2 u   = u
    s3 u   = u
    dx     = "x"
    dy     = "y"
    ux     = "X"
    uy     = "Y"
    n1 u   = "(" ++ u ++ ")"
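Applying a2cBlock to the abstract representation of x;(y;Y);X from part 1 indeed reproduces the concrete string:

a2cBlock (B1 (S1 Dx)
             (R1 (S3 (N1 (B1 (S1 Dy) (R1 (S2 UY) Nix))))
                 (R1 (S2 UX) Nix)))
  ==  "x;(y;Y);X"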
7.1 The type TreeAlgebra and the function foldTree are defined in the rep-min
problem.
1. height = foldTree heightAlgebra
heightAlgebra = (const 0, \l r -> (l ‘max‘ r) + 1)
--frontAtLevel :: Tree -> Int -> [Int]
--frontAtLevel (Leaf i) h = if h == 0 then [i] else []
--frontAtLevel (Bin l r) h =
-- if h > 0 then frontAtLevel l (h-1) ++ frontAtLevel r (h-1)
-- else []
frontAtLevel = foldTree frontAtLevelAlgebra
frontAtLevelAlgebra =
( \i h -> if h == 0 then [i] else []
, \f g h -> if h > 0 then f (h-1) ++ g (h-1) else []
)
2. a) Straightforward Solution: impossible to give a solution like rm.sol1.
heightAlgebra = (const 0, \l r -> (l ‘max‘ r) + 1)
front t = foldTree frontAtLevelAlgebra t h
where h = foldTree heightAlgebra t
b) Lambda Lifting
front’ t = foldTree
frontAtLevelAlgebra
t
(foldTree heightAlgebra t)
c) Tupling Computations
htupfr :: TreeAlgebra (Int, Int -> [Int])
htupfr
-- = (heightAlgebra ‘tuple‘ frontAtLevelAlgebra)
= ( \i -> ( 0
, \h -> if h == 0 then [i] else []
)
, \(lh,f) (rh,g) -> ( (lh ‘max‘ rh) + 1
, \h -> if h > 0
then f (h-1) ++ g (h-1)
else []
)
)
front’’ t = fr h
where (h, fr) = foldTree htupfr t
d) Merging Tupled Functions
It is helpful to note that the merged algebra is constructed such that
foldTree mergedAlgebra t i = (height t, frontAtLevel t i)
Therefore, the definitions corresponding to the fourth solution are
mergedAlgebra :: TreeAlgebra (Int -> (Int, [Int]))
mergedAlgebra =
(\i -> \h -> ( 0
, if h == 0 then [i] else []
)
, \lfun rfun -> \h -> let (lh, xs) = lfun (h-1)
(rh, ys) = rfun (h-1)
in
( (lh ‘max‘ rh) + 1
, if h > 0 then xs ++ ys else []
)
)
front’’’ t = fr
where (h, fr) = foldTree mergedAlgebra t h
7.2 The highest front problem is the problem of finding the first non-empty list of
nodes, that is, the nodes at the lowest level: the level closest to the root at which
a node occurs. The solutions are similar to those of the deepest front problem.
1. --lowest :: Tree -> Int
--lowest (Leaf i) = 0
--lowest (Bin l r) = ((lowest l) ‘min‘ (lowest r)) + 1
lowest = foldTree lowAlgebra
lowAlgebra = (const 0, \ l r -> (l ‘min‘ r) + 1)
2. a) Straightforward Solution: impossible to give a solution like rm.sol1.
lfront t = foldTree frontAtLevelAlgebra t l
where l = foldTree lowAlgebra t
b) Lambda Lifting
lfront’ t = foldTree
frontAtLevelAlgebra
t
(foldTree lowAlgebra t)
c) Tupling Computations
ltupfr :: TreeAlgebra (Int, Int -> [Int])
ltupfr =
( \i -> ( 0
, (\l -> if l == 0 then [i] else [])
)
, \ (lh,f) (rh,g) -> ( (lh ‘min‘ rh) + 1
, \l -> if l > 0
then f (l-1) ++ g (l-1)
else []
)
)
lfront’’ t = fr l
where (l, fr) = foldTree ltupfr t
d) Merging Tupled Functions
It is helpful to note that the merged algebra is constructed such that
foldTree mergedAlgebra t i = (lowest t, frontAtLevel t i)
Therefore, the definitions corresponding to the fourth solution are
lmergedAlgebra :: TreeAlgebra (Int -> (Int, [Int]))
lmergedAlgebra =
( \i -> \l -> ( 0
, if l == 0 then [i] else []
)
, \lfun rfun -> \l ->
let (ll, lres) = lfun (l-1)
(rl, rres) = rfun (l-1)
in
( (ll ‘min‘ rl) + 1
, if l > 0 then lres ++ rres else []
)
)
lfront’’’ t = fr
where (l, fr) = foldTree lmergedAlgebra t l
8.1 Let G = (T, N, R, S). Define the regular grammar G' = (T, N ∪ {S'}, R', S')
as follows. S' is a new nonterminal symbol. For the productions R', divide the
productions of R into two sets: the productions R1 of which the right-hand side consists
just of terminal symbols, and the productions R2 of which the right-hand side ends
with a nonterminal. Define a set R3 of productions by adding the nonterminal S'
at the right end of each production in R1. Define R' = R ∪ R3 ∪ {S' → S | ε}. The
grammar G' thus defined is regular, and generates the language L∗.
8.2 In step 2, add the productions
S → aS
S → ε
S → cC
S → a
A → ε
A → cC
A → a
B → cC
B → a
and remove the productions
S → A
A → B
B → C
In step 3, add the production S → a, and remove the productions A → ε, B → ε.
8.3
ndfsa d qs [ ] = qs
ndfsa d qs (x : xs) = ndfsa d (d qs x) xs
8.4 Let M = (X, Q, d, S, F) be a deterministic finite-state automaton. Then the
nondeterministic finite-state automaton M' defined by M' = (X, Q, d', {S}, F), where
d' q x = {d q x}, accepts the same language.
8.5 Let M = (X, Q, d, S, F) be a deterministic finite-state automaton that accepts
the language L. Then the deterministic finite-state automaton M' = (X, Q, d, S, Q − F),
where Q − F is the set of states Q from which all states in F have been removed,
accepts the complement of L. Here we assume that d q a is defined for all q and a.
8.6
1. L1 ∩ L2 = complement (complement L1 ∪ complement L2), so it follows that
regular languages are closed under intersection if they are closed under
complementation and union. Regular languages are closed under union, see
Theorem 8.10, and they are closed under complementation, see Exercise 8.5.
2.
  Ldfa M
=
  {w ∈ X∗ | dfa accept w (d, (S1, S2), F1 × F2)}
=
  {w ∈ X∗ | dfa d (S1, S2) w ∈ F1 × F2}
=   { this requires a proof by induction }
  {w ∈ X∗ | (dfa d1 S1 w, dfa d2 S2 w) ∈ F1 × F2}
=
  {w ∈ X∗ | dfa d1 S1 w ∈ F1 ∧ dfa d2 S2 w ∈ F2}
=
  {w ∈ X∗ | dfa d1 S1 w ∈ F1} ∩ {w ∈ X∗ | dfa d2 S2 w ∈ F2}
=
  Ldfa M1 ∩ Ldfa M2
8.7
1. (A transition diagram of a finite-state automaton over the alphabet {(, )} with
states S, A, B; the figure cannot be reproduced here.)
2. (A transition diagram of a finite-state automaton over the alphabet {0, 1} with
states S, A, B, C, D; the figure cannot be reproduced here.)
8.8 Use the definition of Lre to calculate the languages:
1. {ε, b}
2. (bc∗)
3. {a}b∗ ∪ c∗
8.9
1. Lre(R(S +T))
=
(Lre(R))(Lre(S +T))
=
(Lre(R))(Lre(S) ∪ Lre(T))
=
(Lre(R))(Lre(S)) ∪ (Lre(R))(Lre(T))
=
Lre(RS +RT)
2. Similar to the above calculation.
8.10 Take R = a, S = a, and R = a, S = b.
8.11 If both V and W are subsets of S, then Lre(R(S + V)) = Lre(R(S + W)). Since
S ≠ ∅, V = S and W = ∅ satisfy the requirement. Another solution is
V = S ∩ R
W = S ∩ (complement of R)
Since S ≠ ∅, at least one of these two sets is not empty, and it follows that V ≠ W.
8.12 The string 01 may not occur in a string in the language of the regular expression.
So when a 0 appears somewhere, only 0’s can follow. Take 1∗0∗.
8.13
1. If we can give a regular grammar for (a + bb)∗ + c, we can use the procedure
constructed in Exercise 8.1 to obtain a regular grammar for the complete regular
expression. The following regular grammar generates (a + bb)∗ + c:

S0 → S1 | c

provided S1 generates (a + bb)∗. Again, we use Exercise 8.1 and a regular
grammar for a + bb to obtain a regular grammar for (a + bb)∗. The language
of the regular expression a + bb is generated by the regular grammar

S2 → a | bb

2. The language of the regular expression a∗ is generated by the regular grammar

S0 → ε | aS0

The language of the regular expression b∗ is generated by the regular grammar

S1 → ε | bS1

The language of the regular expression ab is generated by the regular grammar

S2 → ab

It follows that the language of the regular expression a∗ + b∗ + ab is generated
by the regular grammar

S3 → S0 | S1 | S2
8.14 First construct a nondeterministic finite-state automaton for these grammars,
and use the automata to construct the following regular expressions.
1. a(b + b(b + ab)∗(ab + ε)) + bb∗ + ε
2. (0 + 10∗1)∗
8.16
1. b∗ + b∗(aa∗b(a + b)∗)
2. (b + a(bb)∗b)∗
9.1 We follow the proof pattern for non-regular languages given in the text.
Let n ∈ N.
Take s = a^(n^2) with x = ε, y = a^n, and z = a^(n^2 − n).
Let u, v, w be such that y = uvw with v ≠ ε, that is, u = a^p, v = a^q and w = a^r with
p + q + r = n and q > 0.
Take i = 2, then

  xuv^2 wz ∉ L
⇐   { defn. x, u, v, w, z, calculus }
  a^(p+2q+r) a^(n^2 − n) ∉ L
⇐   { p + q + r = n and q > 0 }
  n^2 + q is not a square
⇐
  n^2 < n^2 + q < (n + 1)^2
⇐
  q < 2n + 1
9.2 Using the proof pattern.
Let n ∈ N.
Take s = a^n b^(n+1) with x = a^n, y = b^n, and z = b.
Let u, v, w be such that y = uvw with v ≠ ε, that is, u = b^p, v = b^q and w = b^r with
p + q + r = n and q > 0.
Take i = 0, then

  xuwz ∉ L
⇐   { defn. x, u, v, w, z, calculus }
  a^n b^(p+r+1) ∉ L
⇐
  p + q + r ≥ p + r + 1
⇐
  q > 0
9.3 Using the proof pattern.
Let n ∈ N.
Take s = a^n b^(2n) with x = a^n, y = b^n, and z = b^n.
Let u, v, w be such that y = uvw with v ≠ ε, that is, u = b^p, v = b^q and w = b^r with
p + q + r = n and q > 0.
Take i = 2, then

  xuv^2 wz ∉ L
⇐   { defn. x, u, v, w, z, calculus }
  a^n b^(2n+q) ∉ L
⇐
  2n + q > 2n
⇐
  q > 0
9.4 Using the proof pattern.
Let n ∈ N.
Take s = a^6 b^(n+4) a^n with x = a^6 b^(n+4), y = a^n, and z = ε.
Let u, v, w be such that y = uvw with v ≠ ε, that is, u = a^p, v = a^q and w = a^r with
p + q + r = n and q > 0.
Take i = 6, then

  xuv^6 wz ∉ L
⇐   { defn. x, u, v, w, z, calculus }
  a^6 b^(n+4) a^(n+5q) ∉ L
⇐
  n + 5q > n + 4
⇐
  q > 0
9.5 The length of a substring containing a's, b's, and c's is at least r + 2, whereas
|vwx| ≤ d ≤ r.
9.6 We follow the proof pattern for proving that a language is not context-free given
in the text.
Let c, d ∈ N.
Take z = a^(k^2) with k = max(c, d).
Let u, v, w, x, y be such that z = uvwxy, |vx| > 0 and |vwx| ≤ d.
That is, u = a^p, v = a^q, w = a^r, x = a^s and y = a^t with p + q + r + s + t = k^2,
q + r + s ≤ d and q + s > 0.
Take i = 2, then

  uv^2 wx^2 y ∉ L
⇐   { defn. u, v, w, x, y, calculus }
  a^(k^2 + q + s) ∉ L
⇐   { q + s > 0 }
  k^2 + q + s is not a square
⇐
  k^2 < k^2 + q + s < (k + 1)^2
⇐
  q + s < 2k + 1
⇐   { defn. k }
  q + s ≤ d
9.7 Using the proof pattern.
Let c, d ∈ N.
Take z = a^k with k prime and k > max(c, d).
Let u, v, w, x, y be such that z = uvwxy, |vx| > 0 and |vwx| ≤ d.
That is, u = a^p, v = a^q, w = a^r, x = a^s and y = a^t with p + q + r + s + t = k,
q + r + s ≤ d and q + s > 0.
Take i = k + 1, then

  uv^(k+1) wx^(k+1) y ∉ L
⇐   { defn. u, v, w, x, y, calculus }
  a^(k + kq + ks) ∉ L
⇐
  k(1 + q + s) is not a prime
⇐
  q + s > 0
9.8 Using the proof pattern.
Let c, d ∈ N.
Take z = a^k b^k a^k b^k with k = max(c, d).
Let u, v, w, x, y be such that z = uvwxy, |vx| > 0 and |vwx| ≤ d.
Note that our choice for k guarantees that the substring vwx has one of the following
shapes:
• vwx consists of just a's, or just b's.
• vwx contains both a's and b's.
Take i = 0, then
• If vwx consists of just a's, or just b's, then it is impossible to write the string
uwy as ww for some string w, since only the number of terminals of one kind
is decreased.
• If vwx contains both a's and b's, it lies somewhere on the border between a's
and b's, or on the border between b's and a's. Then the string uwy can be
written as
uwy = a^s b^t a^p b^q
for some s, t, p, q. At least one of s, t, p and q is less than k,
while two of them are equal to k. Again this sentence is not an element of the
language.
9.9 Using the proof pattern.
Let n ∈ N.
Take s = 1^n 0^n with x = 1^n, y = 0^n, and z = ε.
Let u, v, w be such that y = uvw with v ≠ ε, that is, u = 0^p, v = 0^q and w = 0^r with
p + q + r = n and q > 0.
Take i = 2, then

  xuv^2 w ∉ L
⇐   { defn. x, u, v, w, z, calculus }
  1^n 0^(p+2q+r) ∉ L
⇐   { p + q + r = n }
  1^n 0^(n+q) ∉ L
⇐   { q > 0 }
  true
9.10
1. The language {a^i b^j | 0 ≤ i ≤ j} is context-free. The language can be generated
by the following context-free grammar:
S → aSbA
S → ε
A → bA
A → ε
This grammar generates the strings a^m b^m b^n for m, n ∈ N. These are exactly
the strings of the given language.
2. The language is not regular. We prove this using the proof pattern.
Let n ∈ N.
Take s = a^n b^(n+1) with x = ε, y = a^n, and z = b^(n+1).
Let u, v, w be such that y = uvw with v ≠ ε, that is, u = a^p, v = a^q and w = a^r
with p + q + r = n and q > 0.
Take i = 3, then

  xuv^3 wz ∉ L
⇐   { defn. x, u, v, w, z, calculus }
  a^(p+3q+r) b^(n+1) ∉ L
⇐   { p + q + r = n }
  n + 2q > n + 1
⇐   { q > 0 }
  true
9.11
1. The language {wcw | w ∈ {a, b}∗} is not context-free. We prove this using the
proof pattern.
Let c, d ∈ N.
Take z = a^k b^k c a^k b^k with k = max(c, d).
Let u, v, w, x, y be such that z = uvwxy, |vx| > 0 and |vwx| ≤ d.
Note that our choice for k guarantees that |vwx| ≤ k. The substring vwx has
one of the following shapes:
• vwx does not contain c.
• vwx contains c.
Take i = 0, then
• If vwx does not contain c, it is a substring at the left-hand side or at the
right-hand side of c. Then it is impossible to write the string uwy as scs
for some string s, since only the number of terminals on one side of c is
decreased.
• If vwx contains c, then the string vwx can be written as
vwx = b^s c a^t
for some s and t. For uwy there are two possibilities.
The string uwy does not contain c, so it is not an element of the language.
If the string uwy does contain c, then it has either fewer b's at the left-hand
side than at the right-hand side or fewer a's at the right-hand side than
at the left-hand side, or both. So it is impossible to write the string uwy
as scs for some string s.
2. The language is not regular because it is not context-free: the set of regular
languages is a subset of the set of context-free languages.
9.12
1. The grammar G is context-free because there is only one nonterminal at the
left-hand side of the production rules and the right-hand side is a sequence of
terminals and nonterminals. The grammar is not regular because there is a
production rule with two nonterminals at the right-hand side.
2. L(G) = { s t | s ∈ {∗, t ∈ }∗, |s| = |t| }, that is, L(G) consists of all sequences of
nested braces: a number of opening braces followed by the same number of
closing braces.
3. The language is context-free because it is generated by a context-free grammar.
The language is not regular. We use the proof pattern and choose
s = {^n }^n. The proof is exactly the same as the proof for non-regularity of
L = {a^m b^m | m ≥ 0} in Section 9.2.
9.13
1. The grammar is context-free. The grammar is not right-regular but left-regular,
because the nonterminal appears at the left-hand side in the last production rule.
2. L(G) = 0∗ ∪ 10∗
Note that the language is also generated by the following right-regular grammar:
S → ε
S → 1A
S → 0A
A → ε
A → 0
A → 0A
3. The language is context-free and regular because it is generated by a regular
grammar.
10.1 For both grammars we have:
empty = const False
10.2 For grammar1 we have:
firsts S = {b,c}
firsts A = {a,b,c}
firsts B = {a,b,c}
firsts C = {a,b}
For grammar2:
firsts S = {a}
firsts A = {b}
10.3 For gramm1 we have:
follow S = {a,b,c}
follow A = {a,b,c}
follow B = {a,b}
follow C = {a,b,c}
For gramm2:
follow = const {}
10.4 For the productions of gramm3 we have:
lookAhead (S → AaS) = {a, c}
lookAhead (S → B)   = {b}
lookAhead (A → cS)  = {c}
lookAhead (A → ε)   = {a}
lookAhead (B → b)   = {b}
Since all lookAhead sets for productions of the same nonterminal are disjoint, gramm3
is an LL(1) grammar.
10.5 After left factoring gramm2 we obtain gramm2’ with productions
S → aC
C → bA | a
A → bD
D → b | S
For this transformed grammar gramm2’ we have:
The empty function
empty = const False
The first function
firsts S = {a}
firsts C = {a,b}
firsts A = {b}
firsts D = {a,b}
The follow function
follow = const {}
The lookAhead function
lookAhead (S → aC) = {a}
lookAhead (C → bA) = {b}
lookAhead (C → a)  = {a}
lookAhead (A → bD) = {b}
lookAhead (D → b)  = {b}
lookAhead (D → S)  = {a}
Clearly, gramm2’ is an LL(1) grammar.
10.6 For the empty function we have:
empty R = True
empty _ = False
For the firsts function we have
firsts L = {0,1}
firsts R = {,}
firsts B = {0,1}
The follow function is
follow B = {,}
follow _ = {}
The lookAhead function is
lookAhead (L → B R)   = {0, 1}
lookAhead (R → ε)     = {}
lookAhead (R → , B R) = {,}
lookAhead (B → 0)     = {0}
lookAhead (B → 1)     = {1}
Since all lookAhead sets for productions of the same nonterminal are disjoint, the
grammar is LL(1).
10.7
Node S [ Node c []
, Node A [ Node c []
, Node B [ Node c [], Node c [] ]
, Node C [ Node b [], Node a [] ]
]
]
10.8
Node S [ Node a []
, Node C [ Node b []
, Node A [ Node b [], Node D [ Node b [] ] ]
]
]
10.9
Node S [ Node A []
, Node a []
, Node S [ Node A [ Node c [], Node S [ Node B [ Node b []]]]
, Node a []
, Node S [ Node B [ Node b [] ]]
]
]
10.10
1. list2Set :: Ord s => [s] -> [s]
list2Set = unions . map single
2. list2Set :: Ord s => [s] -> [s]
list2Set = foldr op []
where
op x xs = single x ‘union‘ xs
3. pref :: Ord s => (s -> Bool) -> [s] -> [s]
pref p = list2Set . takeWhile p
or
pref p [] = []
pref p (x:xs) = if p x then single x ‘union‘ pref p xs else []
or
pref p = foldr op []
where
op x xs = if p x then single x ‘union‘ xs else []
4. prefplus p [] = []
prefplus p (x:xs) = if p x then single x ‘union‘ prefplus p xs
else single x
5. prefplus p = foldr op []
where
op x us = if p x then single x ‘union‘ us
else single x
6. prefplus p
=
foldr op []
where
op x us = if p x then single x ‘union‘ us else single x
=
foldr op []
where
op x us = single x ‘union‘ rest
where
rest = if p x then us else []
=
foldrRhs p single []
7. The function foldrRhs p f [] takes a list xs and returns the set of all f-images
of the elements of the prefix of xs all of whose elements satisfy p together with
the f-image of the first element of xs that does not satisfy p (if this element
exists).
The function foldrRhs p f start takes a list xs and returns the set start
‘union‘ foldrRhs p f [] xs.
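For illustration (example ours, using the list representation of sets from this exercise): with p = even and f = single,

foldrRhs even single [] [2,4,5,7]

yields the set {2, 4, 5}: the images of the prefix [2,4], together with the image of 5, the first element that does not satisfy even.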
8.
10.11
1. For gramm1
a) foldrRhs empty first [] bSA
=
unions (map first (prefplus empty bSA))
= empty b = False
unions (map first [b])
=
unions [{b}]
=
{b}
b) foldrRhs empty first [] Cb
=
unions (map first (prefplus empty Cb))
= empty C = False
unions (map first [C])
=
unions [{a, b}]
=
{a, b}
2. For gramm3
a) foldrRhs empty first [] AaS
=
unions (map first (prefplus empty AaS))
= empty A = True
unions (map first ([A] ‘union‘ prefplus empty aS))
= empty a = False
unions (map first ([A] ‘union‘ [a]))
=
unions (map first ([A, a]))
=
unions [first A, first a]
=
unions [{c}, {a}]
=
{a, c}
b) scanrRhs empty first [] AaS
=
map (foldrRhs empty first []) (tails AaS)
=
map (foldrRhs empty first []) [AaS, aS, S, []]
=
[ foldrRhs empty first [] AaS
, foldrRhs empty first [] aS
, foldrRhs empty first [] S
, foldrRhs empty first [] []
]
= calculus
[ {a, c}, {a}, {a, b, c}, {} ]
11.1 Recall, for gramm1 we have:
follow S = {a,b,c}
follow A = {a,b,c}
follow B = {a,b}
follow C = {a,b,c}
The production rule B → cc cannot be used for a reduction when the input begins
with the symbol c.
stack input
ccccba
c cccba
cc ccba
ccc cba
cccc ba
Bcc ba
bBcc a
abBcc
CBcc
Ac
S
11.2 Recall, for gramm2 we have:
follow S = {}
follow A = {}
No reduction can be made until the input is empty.
stack input
abbb
a bbb
ba bb
bba b
bbba
Aba
S
11.3 Recall, for gramm3 we have:
follow S = {a}
follow A = {a}
follow B = {a}
At the beginning a reduction is applied using the production rule A → ε.
stack input
acbab
A acbab
aA cbab
caA bab
bcaA ab
BcaA ab
ScaA ab
AaA ab
aAaA b
baAaA
SaAaA
SaA
S
266

Copyright c 2001–2009 by Johan Jeuring and Doaitse Swierstra.

Contents
Preface for the 2009 version Voorwoord 1. Goals 1.1. History . . . . . . . . . . . . . . 1.2. Grammar analysis of context-free 1.3. Compositionality . . . . . . . . . 1.4. Abstraction mechanisms . . . . . v vii 1 2 3 4 4 7 8 11 14 15 17 19 22 23 23 23 24 24 25 26 27 28 31 34 36 37

. . . . . . grammars . . . . . . . . . . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

. . . .

2. Context-Free Grammars 2.1. Languages . . . . . . . . . . . . . . . . . . . . . . . . . 2.2. Grammars . . . . . . . . . . . . . . . . . . . . . . . . . 2.2.1. Notational conventions . . . . . . . . . . . . . . 2.3. The language of a grammar . . . . . . . . . . . . . . . 2.3.1. Examples of basic languages . . . . . . . . . . . 2.4. Parse trees . . . . . . . . . . . . . . . . . . . . . . . . 2.5. Grammar transformations . . . . . . . . . . . . . . . . 2.5.1. Removing duplicate productions . . . . . . . . 2.5.2. Substituting right hand sides for nonterminals 2.5.3. Removing unreachable productions . . . . . . . 2.5.4. Left factoring . . . . . . . . . . . . . . . . . . . 2.5.5. Removing left recursion . . . . . . . . . . . . . 2.5.6. Associative separator . . . . . . . . . . . . . . . 2.5.7. Introduction of priorities . . . . . . . . . . . . . 2.5.8. Discussion . . . . . . . . . . . . . . . . . . . . . 2.6. Concrete and abstract syntax . . . . . . . . . . . . . . 2.7. Constructions on grammars . . . . . . . . . . . . . . . 2.7.1. SL: an example . . . . . . . . . . . . . . . . . . 2.8. Parsing . . . . . . . . . . . . . . . . . . . . . . . . . . 2.9. Exercises . . . . . . . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . .

3. Parser combinators 41 3.1. The type of parsers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 3.2. Elementary parsers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45 3.3. Parser combinators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48

i

Contents

3.4.

3.5. 3.6. 3.7.

3.3.1. Matching parentheses: an example More parser combinators . . . . . . . . . . 3.4.1. Parser combinators for EBNF . . . 3.4.2. Separators . . . . . . . . . . . . . . Arithmetical expressions . . . . . . . . . . Generalised expressions . . . . . . . . . . Exercises . . . . . . . . . . . . . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

. . . . . . .

53 55 55 58 60 63 64 67 68 68 69 69 69 70 71 72 72 74 75 76 77 78 78 80 81 82 82 83 86 88 88 89 90 90 91 93 95 97 97 98 102

4. Grammar and Parser design 4.1. Step 1: Example sentences for the language 4.2. Step 2: A grammar for the language . . . . 4.3. Step 3: Testing the grammar . . . . . . . . 4.4. Step 4: Analysing the grammar . . . . . . . 4.5. Step 5: Transforming the grammar . . . . . 4.6. Step 6: Deciding on the types . . . . . . . . 4.7. Step 7: Constructing the basic parser . . . . 4.7.1. Basic parsers from strings . . . . . . 4.7.2. A basic parser from tokens . . . . . 4.8. Step 8: Adding semantic functions . . . . . 4.9. Step 9: Did you get what you expected . . 4.10. Exercises . . . . . . . . . . . . . . . . . . . 5. Compositionality 5.1. Lists . . . . . . . . . . . . . . . . . . . 5.1.1. Built-in lists . . . . . . . . . . . 5.1.2. User-defined lists . . . . . . . . 5.1.3. Streams . . . . . . . . . . . . . 5.2. Trees . . . . . . . . . . . . . . . . . . . 5.2.1. Binary trees . . . . . . . . . . . 5.2.2. Trees for matching parentheses 5.2.3. Expression trees . . . . . . . . 5.2.4. General trees . . . . . . . . . . 5.2.5. Efficiency . . . . . . . . . . . . 5.3. Algebraic semantics . . . . . . . . . . 5.4. Expressions . . . . . . . . . . . . . . . 5.4.1. Evaluating expressions . . . . . 5.4.2. Adding variables . . . . . . . . 5.4.3. Adding definitions . . . . . . . 5.4.4. Compiling to a stack machine . 5.5. Block structured languages . . . . . . 5.5.1. Blocks . . . . . . . . . . . . . . 5.5.2. Generating code . . . . . . . . 5.6. Exercises . . . . . . . . . . . . . . . .

. . . . . . . . . . . .

. . . . . . . . . . . .

. . . . . . . . . . . .

. . . . . . . . . . . .

. . . . . . . . . . . .

. . . . . . . . . . . .

. . . . . . . . . . . .

. . . . . . . . . . . .

. . . . . . . . . . . .

. . . . . . . . . . . .

. . . . . . . . . . . .

. . . . . . . . . . . .

. . . . . . . . . . . .

. . . . . . . . . . . .

. . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . . . . . . . .

ii

Contents

6. Computing with parsers 6.1. Insert a semantic function in the parser 6.2. Apply a fold to the abstract syntax . . . 6.3. Deforestation . . . . . . . . . . . . . . . 6.4. Using a class instead of abstract syntax 6.5. Passing an algebra to the parser . . . . 7. Programming with higher-order folds 7.1. The rep min problem . . . . . . . . . . 7.1.1. A straightforward solution . . . 7.1.2. Lambda lifting . . . . . . . . . 7.1.3. Tupling computations . . . . . 7.1.4. Merging tupled functions . . . 7.2. A small compiler . . . . . . . . . . . . 7.2.1. The language . . . . . . . . . . 7.2.2. A stack machine . . . . . . . . 7.2.3. Compiling to the stackmachine 7.3. Attribute grammars . . . . . . . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

. . . . .

105 . 105 . 105 . 106 . 107 . 109 111 112 113 113 114 115 117 117 117 118 122

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

. . . . . . . . . .

8. Regular Languages 8.1. Finite-state automata . . . . . . . . . . . . . . . . . . . . . . . 8.1.1. Deterministic finite-state automata . . . . . . . . . . . . 8.1.2. Nondeterministic finite-state automata . . . . . . . . . . 8.1.3. Implementation . . . . . . . . . . . . . . . . . . . . . . . 8.1.4. Constructing a DFA from an NFA . . . . . . . . . . . . 8.1.5. Partial evaluation of NFA’s . . . . . . . . . . . . . . . . 8.2. Regular grammars . . . . . . . . . . . . . . . . . . . . . . . . . 8.2.1. Equivalence of Regular grammars and Finite automata 8.3. Regular expressions . . . . . . . . . . . . . . . . . . . . . . . . . 8.4. Proofs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8.4.1. Proof of Theorem 8.7 . . . . . . . . . . . . . . . . . . . 8.4.2. Proof of Theorem 8.16 . . . . . . . . . . . . . . . . . . . 8.4.3. Proof of Theorem 8.17 . . . . . . . . . . . . . . . . . . . 8.5. Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9. Pumping Lemmas: the expressive power of languages 9.1. The Chomsky hierarchy . . . . . . . . . . . . . . . . . . 9.1.1. Type-0 grammars . . . . . . . . . . . . . . . . . . 9.1.2. Type-1 grammars . . . . . . . . . . . . . . . . . . 9.1.3. Type-2 grammars . . . . . . . . . . . . . . . . . . 9.1.4. Type-3 grammars . . . . . . . . . . . . . . . . . . 9.2. The pumping lemma for regular languages . . . . . . . . 9.3. The pumping lemma for context-free languages . . . . . 9.4. Proofs of pumping lemmas . . . . . . . . . . . . . . . . .

. . . . . . . . . . . . . .

. . . . . . . . . . . . . .

. . . . . . . . . . . . . .

123 . 123 . 124 . 126 . 129 . 132 . 135 . 136 . 139 . 142 . 145 . 146 . 147 . 148 . 150 153 154 154 154 154 155 155 157 161

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

. . . . . . . .

iii

1. . . . .1. . . . . 10. . . . . . . . LL(1) parsing . . LL(1) grammars . . . . . . . . . 11. . . . . . . 167 . . . 190 195 .1 . . . . . Implementation of isLL(1) . . . . A. 198 . .2. . 11.2. . Theorem 9. . . . . . .2. . . . . . . . 199 . . . .2. . . Exercises . . 179 . . . . . . . . . . . . . . . . . . . .5. .2. . 11.2. . Implementation of follow . LR parsing . 11. . Implementation of empty . . . . . . . An LL(1) grammar . .2. . . . . . . iv . . LL Parsing: Background . .Contents 9.1. . . . .LL Parsing 10. . . . . 202 . . . . . .3. . . . . . . . . 11.3. . . . . . . . . . . . . . 11. . Parse trees in Haskell . . . . . . . . . . .3. . .2. . . . . . . . . . . . . . 11. . . . . 162 9. . . . . . . .1. .4. . . . . . . . . . . . . . .2. A stack machine for SLR parsing . . . .1. LR optimizations and generalizations . . . 10. . . . . . . . . . . . . . . . . . . . . Proof of the Regular Pumping Lemma. .1. . . . . . . . . . .2. . The Stack module B. . . . . . . . . . . . . . . Implementation of lookahead . . . . . Using the LL(1) parser . . . .2. . . . . .3. . . 10. . . 181 . . . . . . . . . . . . . . .6. . . . . . . . An LR checker . . . 206 211 213 . . . . . Proof of the Context-free Pumping Lemma. . . . . . . . . . . . . . . . . .1. . .3. Some example derivations . .1. . . . . 197 . . . 10. . . . . 10. . . An LL(1) checker . . . . 11. . 181 . . . 196 . . . . . . . . . . . . . 176 . . . . 178 . . . . . . . 187 . . . 173 . 11. 176 . . . Answers to exercises . . . . . . . 201 .3. . . . . .2. . . . . . . . . 164 10. . . . .1. 10. . . . . Context-free grammars in Haskell . . . . . .1. . . . . . . . 196 . . . LR parse example . . . . . . . . . . . . . . 161 9. .2 . . . . . LR action selection . . . . . . . . . . . . . 168 . .7. . Implementation of first and last .2. . 10. . . . . . . . . . . . . . .LL versus LR parsing 11. 11. . .2. . . 200 . 186 . 10. . . . . 167 . . 201 . . . . . . . .4. .2. . . 169 . . . . . LL(1) parser example . . . A stack machine for parsing . . . . . . . .3. . . .2. . . .1. 10. . . .3. . . . . .8. . . . .1. . LL Parsing: Implementation . . . .1. . 10. .4. . . . . . . . . . . . . . . . . . . . . . . . . . .1. Theorem 9. . . . . . . .5. 10. 10. . . . . .

and this means that some of the later chapters look a bit different from the earlier chapters. but also at the Open University. I also rearranged some of the material because I think they connect better in the new order. Some modifications made by Manuela Witsiers have been reintegrated. However. I have also started to make some modifications to style and content – mainly trying to adapt the Haskell code contained in the lecture notes to currently established coding guidelines. These lecture notes are in large parts identical with the old lecture notes for “Grammars and Parsing” by Johan Jeuring and Doaitse Swierstra. I apologize in advance for any inconsistencies or mistakes I may have introduced. The lecture notes have not only been used at Utrecht University. Please feel free to point out any mistakes you find in these lecture notes to me – any other feedback is also welcome. Andres L¨h o October 2009 v . this work is not quite finished. I hope to be able to update the online version of the lecture notes regularly.Preface for the 2009 version This is a work in progress.

Preface for the 2009 version vi .

e Tenslotte willen we van de gelegenheid gebruik maken enige studeeraanwijzingen te geven: • Het is onze eigen ervaring dat het uitleggen van de stof aan iemand anders vaak pas duidelijk maakt welke onderdelen je zelf nog niet goed beheerst. We hebben geprobeerd door de indeling van vii . Andres L¨h. Rijk-Jan van Haaften. Veel mensen hebben een bijgedrage geleverd aan de totstandkoming van dit dictaat door een gedeelte te schrijven. Martin Bravenboer. die mee hebben geholpen door het schrijven van (een) hoofdstuk(ken) van het dictaat. na lezing van een hoofdstuk. of (een gedeelte van) het dictaat te becommentari¨ren. is het verleidelijker om. Breght Boschker. Commentaar is onder andere geleverd door: Arthur Baars. maar we houden ons van harte aanbevolen voor suggesties voor verdere verbetering. Gijsbert Bol. Maak dus de opgaven die in het dictaat opgenomen zijn zelf. Graham Hutton. In tegenstelling tot sommige andere vakken is het bij dit vak gemakkelijk de vaste grond onder je voeten kwijt te raken. en doe dat niet door te kijken of je de oplossingen die anderen gevonden hebben. Erik Meijer. en Vincent Oosto indi¨. Daan Leijen. met name daar waar het het aangeven van verbanden met andere vakken betreft. e Speciale vermelding verdienen Jeroen Fokker. Arnoud Berendsen. Het dictaat is de afgelopen jaren verbeterd. Rik van Geldrop. “kennen” is iets anders dan “beheersen” en “beheersen” is weer iets anders dan “er iets mee kunnen”. • Oefening baart kunst. begrijpt.Voorwoord Het hiervolgende dictaat is gebaseerd op teksten uit vorige jaren. Het is niet “elke week nieuwe kansen”. Naarmate er meer aandacht wordt besteed aan de presentatie van de stof. en naarmate er meer voorbeelden gegeven worden. “Begrijpen is echter niet hetzelfde als “kennen”. die onder andere geschreven zijn in het kader van het project Kwaliteit en Studeerbaarheid. Pieter Eendebak. Probeer voor jezelf bij te houden welk stadium je bereikt hebt met betrekking tot alle genoemde leerdoelen. en Luc Duponcheel. Als je dus van mening bent dat je een hoofdstuk goed begrijpt. de conclusie te trekken dat je een en ander daadwerkelijk beheerst. In het ideale geval zou je in staat moeten zijn een mooi tentamen in elkaar te zetten voor je mede-studenten! • Zorg dat je up-to-date bent. probeer dan eens in eigen woorden uiteen te zetten.

Voorwoord de stof hier wel iets aan te doen. Goed gereedschap is het halve werk. en zonodig hulp te vragen. Als je nog moeilijkheden hebt met de taal Haskell aarzel dan niet direct hier wat aan te doen. • We maken gebruik van de taal Haskell om veel concepten en algoritmen te presenteren. De tijd die je dan op college en werkcollege doorbrengt is dan weinig effectief. maar de totale opbouw laat hier niet heel veel vrijheid toe. Als je een week gemist hebt is het vrijwel onmogelijk de nieuwe stof van de week daarop te begrijpen. Veel sterkte. Johan Jeuring en Doaitse Swierstra viii . met als gevolg dat je vaak voor het tentamen heel veel tijd (die er dan niet is) kwijt bent om in je uppie alles te bestuderen. Anders maak je jezelf het leven heel moeilijk. en hopelijk ook veel plezier. en Haskell is hier ons gereedschap.

and it is only in the last years that we have started to discover and make explicit the reasons why this area attracted so much interest. and secondary – but not less important – goals which 1 . we could construct programs someone else could not – thus giving us the feeling we had “the right stuff” –. This feeling was augmented by the fact that we would not have had the foggiest idea how to complete such a product a few months before. thus linking the o area with many other areas of computer science. The reason for this is that from the very beginning of these curricula it has been one of the few areas where the development of formal methods and the application of formal techniques in actual program construction come together. One of the things which were not so clear however is where exactly this joy originated from: the techniques taught definitely had a certain elegance. For many practicing computer scientists the course on compiler construction still is one of the highlights of their education. have been provided with a theoretical underpinning which explains why the techniques work. Goals Introduction Courses on Grammars. which invariably consisted of constructing a compiler for some toy language. where we had tools for generating parts of compilers out of formal descriptions of the tasks to be performed. we had the usual satisfied feeling.1. and where such program generators were indeed generating programs which would have been impossible to create by hand. and now we knew “how to do it”. Goals The goals of these lecture notes can be split into primary goals. As a beneficial side-effect we also gradually learned to see where the discovered concept further played a rˆle. show correspondences and transfer insights from one area of interest to the other. stress their importance. Many of the techniques which were taught on a “this is how you solve this kind of problems” basis. For a long time the construction of compilers has been one of the few areas where we had a methodology available. and when completing the practical exercises. but also giving us a means to explain such links. which are associated with the specific subject studied. This situation has remained so for years. and not only that. Parsing and Compilation of programming languages have always been some of the core components of a computer science curriculum.

which was soon to be followed by ALGOL-60 and COBOL. to 2 . such as.e. the problem of how to describe the language was not a hot issue: much more important problems were to be solved. to understand the concepts of abstract interpretation and partial evaluation. goals are: to develop the capability to abstract. Writing larger and larger programs by more and more people sparked the development of the first more or less machine-independent programming language FORTRAN (FORmula TRANslator). • to show a wide variety of useful programming techniques. • to know how to parse. For the developers of the FORTRAN language. more far reaching. how to construct a compiler for the language that would fit into the small memories which were available at that time (kilobytes instead of megabytes). • • • • 1. “formulas”) using grammars. • to familiarise oneself with the concept of computability. • to be able to apply these techniques in the construction of all kinds of programs. and their reliability had increased enough to justify applying them to a wide range of problems. As a result the language was very much implicitly defined by what was accepted by the compiler and what not.1. i. • to know how to analyse grammars to see whether or not specific properties hold. of which John Backus was the prime architect. it was no longer the actual hardware which posed most of the problems. to understand the concept of domain specific languages. The secondary. what should be in the language and what not. The primary. and how to generate machine code that would not be ridiculed by programmers who had thus far written such code by hand.. goals are: • to understand how to describe structures (i. to show how proper formalisations can be used as a starting point for the construction of useful tools. how to recognise (build) such structures in (from) a sequence of symbols.e. • to improve the general programming skills. Soon after the development of FORTRAN an international working group started to work on the design of a machine independent high-level programming language. History When at the end of the fifties the use of computers became more and more widespread. • to understand the concept of compositionality. somewhat more traditional. • to show how to develop programs in a calculational style..1. Goals have to do with developing skills which one would expect every educated computer scientist to have.

If we use the full power of the context-free languages we get compilers which in general are inefficient. this notation.e. Ever since it was introduced. for a human being it is not always easy. Besides these very mundane goals. we most often speak of context-free grammars. Unfortunately. As a remarkable side-effect of this undertaking. When constructing a recogniser for a language described by a context-free grammar one often wants to check whether or not the grammar has specific desirable properties. and. It was not for long that computer scientists. It appeared that the BNF-formalism actually was a member of a hierarchy of grammar classes which had been formulated a number of years before by the linguist Noam Chomsky in an attempt to capture the concept of a “language”. i. the construction of compilers. Most invocations of compilers still have as their primary goal to discover mistakes made when typing the program. 1. and probably not so good in handling erroneous input. Grammar analysis of context-free grammars Nowadays the use of the word Backus-Naur is gradually diminishing. programs emerged which take such a piece of language description as input. Questions arose about the expressibility of BNF. where the type checking performed by the compilers is one of the main contributions to the increase in efficiency in the programming process.2. which classes of languages can be expressed by means of BNF and which not. i.e. Such programs are now known under the name parser generators. In the lectures we will see many examples of this. and not so much generating actual code.. such as Java and Haskell. and consequently how to express restrictions and properties of languages for which the BNF-formalism is not powerful enough. discovered that the formalism was not only useful to express what language should be accepted by their compilers. but could also be used as a guideline for structuring their compilers.1.. which soon became to be known as the Backus-Naur formalism (BNF). and produce a skeleton of the desired compiler.2. This latter fact may not be so important from a theoretical point of view. inspired by the Chomsky hierarchy. but also a notation for describing programming languages was proposed by Naur and used to describe the language in the famous Algol-60 report. Grammar analysis of context-free grammars become known under the name ALGOL-60. This aspect is even stronger present in strongly typed languages. the BNFformalism also became soon a subject of study for the more theoretically oriented. but it is from a pragmatical point of view. not only a language standard was produced. and especially people writing compilers. has been used as the primary tool for describing the basic structure of programming languages. and quite often practically 3 . For the construction of everyday compilers for everyday languages it appears that this class is still a bit too large. and probably caused by the need to exchange proposals in writing. Once this relationship between a piece of BNF and a compiler became well understood.

We will see many examples of such mappings. The only way we see is to take a historians view and to compare the old and the new situations. Under closer scrutiny.4. is that over the last thirty years we have discovered the right kind of abstractions to be used. albeit quite 4 . Furthermore. This should not come as a surprise since this recognition is a direct consequence of the tendency that ever greater parts of compilers are more or less automatically generated from a formal description of some aspect of a programming language. it may be very expensive to check whether or not such a property holds. We will see that the concept of lazy evaluation plays an important rˆle in making these efficient and straightforward implementations possible. Goals impossible.1. In this course we will see many examples of such classes. This has led to a whole hierarchy of context-free grammars classes. The general observation is that the more precise the answer to a specific question one wants to have. some are easy to check by machine. As a side effect you will acquire a special form of writing functional programs. 1. Compositionality As we will see the structure of many compilers follows directly from the grammar that describes the language to be compiled. e. Abstraction mechanisms One of the main reasons for that what used to be an endeavour for a large team in the past can now easily be done by a couple of first year’s students in a matter of days or weeks. a concept thus far only familiar to mathematicians and more theoretically oriented computer scientist that study the description of the meaning of a programming language. o 1. Fortunately however there have also been some developments in programming language design. to see whether or not a particular property holds. which makes it often surprisingly simple to solve at first sight rather complicated programming assignments. the more computational effort is needed and the sooner this question cannot be answered by a human being anymore. by making use of a description of their outer appearance or by making use of a description of the semantics (meaning) of a language. Once this phenomenon was recognised it went under the name syntax directed compilation. and an efficient way of partitioning a problem into smaller components. We claim that the combination of a modern. and some are easily checked by a simple human inspection.3. Unfortunately there is no simple way to teach the techniques which have led us thus far. of which we want to mention the developments in the area of functional programming in particular. some of which are more powerful.g. and under the influence of the more functional oriented style of programming. it was recognised that actually compilers are a special form of homomorphisms.

We claim that the combination of a modern, albeit quite elaborate, type system, combined with the concept of lazy evaluation, provides an ideal platform to develop and practice one's abstraction skills. There does not exist another readily executable formalism which may serve as an equally powerful tool. We hope that by presenting many algorithms, and fragments thereof, in a modern functional language, we can show the real power of abstraction, and even find some inspiration for further developments in language design: i.e. find clues about how to extend such languages so that common patterns, which thus far have only been demonstrated by giving examples, can be made explicit.


2. Context-Free Grammars

Introduction

We often want to recognise a particular structure hidden in a sequence of symbols. For example, when reading this sentence, you automatically structure it by means of your understanding of the English language. Of course, not any sequence of symbols is an English sentence. So how do we characterise English sentences? This is an old question, which was posed long before computers were widely used: in the area of natural language research the question has often been posed what actually constitutes a “language”. The simplest definition one can come up with is to say that the English language equals the set of all grammatically correct English sentences, and that a sentence consists of a sequence of English words. This terminology has been carried over to computer science: the programming language Java can be seen as the set of all correct Java programs, whereas a Java program can be seen as a sequence of Java symbols, such as identifiers, reserved words, specific operators, etc.

This chapter introduces the most important notions of this course: the concept of a language and a grammar. A language is a, possibly infinite, set of sentences and sentences are sequences of symbols taken from a finite set (e.g. sequences of characters, which are referred to as strings). Just as we say that the fact whether or not a sentence belongs to the English language is determined by the English grammar (remember that before we have used the phrase “grammatically correct”), we have a grammatical formalism for describing artificial languages. A difference with the grammars for natural languages is that this grammatical formalism is a completely formal one. This property may enable us to mathematically prove that a sentence belongs to some language, and often such proofs can be constructed automatically by a computer in a process called parsing. Notice that this is quite different from the grammars for natural languages, where one may easily disagree about whether something is correct English or not. This completely formal approach however also comes with a disadvantage: the expressiveness of the class of grammars we are going to describe in this chapter is rather limited, and there are many languages one might want to describe but which cannot be described, given the limitations of the formalism.

Goals

The main goal of this chapter is to introduce and show the relation between the main concepts for describing the parsing problem: languages and sentences, and grammars.

In particular, after you have studied this chapter you will:
• know the concepts of language and sentence;
• know how to describe languages by means of context-free grammars;
• know the difference between a terminal symbol and a nonterminal symbol;
• be able to read and interpret the BNF notation;
• understand the derivation process used in describing languages;
• understand the rôle of parse trees;
• understand the relation between context-free grammars and datatypes;
• understand the EBNF formalism;
• understand the concepts of concrete and abstract syntax;
• be able to convert a grammar from EBNF-notation into BNF-notation by hand;
• be able to construct a simple context-free grammar in EBNF notation;
• be able to verify whether or not a simple grammar is ambiguous;
• be able to transform a grammar, for example for removing left recursion.

2.1. Languages

The goal of this section is to introduce the concepts of language and sentence.

In conventional texts about mathematics it is not uncommon to encounter a definition of sequences that looks as follows:

Definition 2.1 (Sequence). Let X be a set. The set of sequences over X , called X ∗ , is defined as follows:
• ε is a sequence, called the empty sequence, and
• if z is a sequence and a is an element of X , then az is also a sequence.

It is implicitly understood that nothing that cannot be formed by repeated, but finite application of one of the two given rules is a sequence over X .

The above definition is an instance of a very common definition pattern: it is a definition by induction, i.e. the definition of the concept refers to the concept itself. In the following, we will introduce several concepts based on sequences. They can be implemented easily in Haskell using lists; the definition corresponds almost exactly to the definition of the type [a] of lists with elements of type a in Haskell. The one difference is that Haskell lists can be infinite, whereas sequences are always finite.
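The correspondence with Haskell can be made explicit. The following fragment is not part of the original notes; it is a small sketch of ours that transcribes Definition 2.1 as a Haskell datatype and defines concatenation and a conversion to the built-in list type by induction on the structure of the sequence.

data Sequence a = Empty                 -- the empty sequence, written eps above
                | Cons a (Sequence a)   -- a symbol followed by a sequence

-- Concatenation, defined by induction on the first argument.
append :: Sequence a -> Sequence a -> Sequence a
append Empty      ys = ys
append (Cons a z) ys = Cons a (append z ys)

-- The correspondence with Haskell lists, made explicit.
toList :: Sequence a -> [a]
toList Empty      = []
toList (Cons a z) = a : toList z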

Functions that operate on an inductively defined structure such as sequences are typically structurally recursive, i.e. such definitions often follow a recursion pattern which is similar to the definition of the structure itself. (Recall the function foldr from the course on Functional Programming.)

When talking about languages and grammars, we often leave the distinction between single symbols and sequences implicit. We write a to denote both the single symbol a or the sequence aε. We typically use ε only when we want to explicitly emphasize that we are talking about the empty sequence. Also, az should be understood as the symbol a followed by the sequence z , whereas xy is the concatenation of sequences x and y, depending on context. We use letters from the beginning of the alphabet to represent single symbols, and letters from the end of the alphabet to represent sequences.

Note that the Haskell notation for lists is generally more precise than the mathematical notation for sequences. In Haskell, all these distinctions are explicit. Elements are distinguished from lists by their type, i.e. there is a clear difference between a and [a]. Concatenation of lists is handled by the operator (++), whereas a single element can be added to the front of a list using (:). Furthermore, Haskell identifiers often have longer names, so ab in Haskell is to be understood as a single identifier with name ab, not as a combination of two symbols a and b.

Now we move from individual sequences to finite or infinite sets of sequences. We start with some terminology:

Definition 2.2 (Alphabet, Language, Sentence).
• An alphabet is a finite set of symbols.
• A language is a subset of T ∗ , for some alphabet T .
• A sentence (often also called word ) is an element of a language.

Note that ‘word’ and ‘sentence’ in formal languages are used as synonyms.

Some examples of alphabets are:
• the conventional Roman alphabet: {a, b, c, . . . , z};
• the binary alphabet {0, 1};
• sets of reserved words {if, then, else};
• a set of characters l = {a, b, c, d, e, i, k, l, m, n, o, p, r, s, t, u, w, x};
• a set of English words {course, exercise, exam}.

Examples of languages are:
• T ∗ , ∅ (the empty set), {ε} and T are languages over alphabet T ;

• the set {course, practical, exercise, exam} is a language over the alphabet l of characters, and exam is a sentence in it.

The question that now arises is how to specify a language. Since a language is a set we immediately see three different approaches:
• enumerate all the elements of the set explicitly;
• characterise the elements of the set by means of a predicate;
• define which elements belong to the set by means of induction.
We have just seen some examples of the first (the Roman alphabet) and third (the set of sequences over an alphabet) approach. Examples of the second approach are:
• the even natural numbers {n | n ∈ {0, 1, . . . , 9}∗ , n mod 2 = 0};
• the language PAL of palindromes, sequences which read the same forward as backward, over the alphabet {a, b, c}: {s | s ∈ {a, b, c}∗ , s = s R }, where s R denotes the reverse of sequence s.

One of the fundamental differences between the predicative and the inductive approach to defining a language is that the latter approach is constructive, i.e. it provides us with a way to enumerate all elements of a language. If we define a language by means of a predicate we only have a means to decide whether or not an element belongs to a language. A famous example of a language which is easily defined in a predicative way, but for which the membership test is very hard, is the set of prime numbers.

Quite often we want to prove that a language L, which is defined by means of an inductive definition, has a specific property P . If this property is of the form P (L) = ∀x ∈ L . P (x ), then we want to prove that L ⊆ P .

Since languages are sets, the usual set operators such as union, intersection and difference can be used to construct new languages from existing ones. Note that ∪ denotes set union, so {1, 2} ∪ {1, 3} = {1, 2, 3}. In addition to these set operators, there are more specific operators, which apply only to sets of sequences. We will use these operators mainly in the chapter on regular languages, Chapter 8.

Definition 2.3 (Language operations). Let L and M be languages over the same alphabet T , then

L = T ∗ − L                                  (complement of L)
LR = {s R | s ∈ L}                           (reverse of L)
LM = {st | s ∈ L, t ∈ M }                    (concatenation of L and M )
L0 = {ε}                                     (0th power of L)
Ln+1 = LLn                                   (n + 1st power of L)
L∗ = ∪i∈N Li = L0 ∪ L1 ∪ L2 ∪ . . .           (star-closure of L)
L+ = ∪i∈N,i>0 Li = L1 ∪ L2 ∪ . . .            (positive closure of L)

The complement of a language L over alphabet T is defined by L = {x | x ∈ T ∗ , x ∉ L}.
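To make these operations concrete, here is a small Haskell sketch of our own; it is not part of the notes. It only deals with finite languages, represented as lists of sentences, and because the star-closure is infinite in general, only the finite powers Ln are shown.

import Data.List (nub)

type Language = [String]

unionL :: Language -> Language -> Language
unionL l m = nub (l ++ m)

concatL :: Language -> Language -> Language          -- LM = { st | s in L, t in M }
concatL l m = nub [ s ++ t | s <- l, t <- m ]

power :: Language -> Int -> Language                 -- L^0 = {eps}, L^(n+1) = L L^n
power _ 0 = [""]
power l n = concatL l (power l (n - 1))

For instance, power ["ab", "ba"] 2 yields the four sentences of L2 for L = {ab, ba}.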

The following equations follow immediately from the above definitions.

L∗ = {ε} ∪ LL∗
L+ = LL∗

Exercise 2.1. Let L = {ab, aa, baa}, where a and b are the terminals. Which of the following strings are in L∗ : abaabaaabaa, aaaabaaaa, baaaaabaaaab, baaaaabaa?

Exercise 2.2. What are the elements of ∅∗ ?

Exercise 2.3. For any language, prove
1. ∅L = L∅ = ∅
2. {ε}L = L{ε} = L

Exercise 2.4. In this section we defined two “star” operators: one for arbitrary sets (Definition 2.1) and one for languages (Definition 2.3). Is there a difference between these operators?

2.2. Grammars

The goal of this section is to introduce the concept of context-free grammars.

Working with sets might be fun, but it often is complicated to manipulate sets, and to prove properties of sets. For these purposes we introduce syntactical definitions, called grammars, of sets. This section will only discuss so-called context-free grammars, a kind of grammars that are convenient for automatic processing, and that can describe a large class of languages. But the class of languages that can be described by context-free grammars is limited.

In the previous section we defined PAL, the language of palindromes, by means of a predicate. Although this definition defines the language we want, it is hard to use in proofs and programs. An important observation is the fact that the set of palindromes can be defined inductively as follows.

Definition 2.4 (Palindromes by induction).
• The empty string, ε, is a palindrome;
• the strings consisting of just one character, a, b, and c, are palindromes;
• if P is a palindrome, then the strings obtained by prepending and appending the same character, a, b, and c, to it are also palindromes, that is, the strings aP a, bP b, cP c are palindromes.
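Definition 2.4 can be turned directly into a Haskell test for membership of PAL. The function below is our own sketch, not part of the notes; it follows the three cases of the definition, peeling equal characters off both ends of the string.

isPal :: String -> Bool
isPal []     = True                          -- the empty string is a palindrome
isPal [x]    = x `elem` "abc"                -- the one-character palindromes a, b, c
isPal (x:xs) = x `elem` "abc"                -- a longer palindrome must have the form
            && last xs == x                  --   aPa, bPb or cPc, ...
            && isPal (init xs)               --   ... where P is again a palindrome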

The first two parts of the definition cover the basic cases. The last part of the definition covers the inductive cases.

All strings which belong to the language PAL inductively defined using the above definition read the same forwards and backwards. Therefore this definition is said to be sound (every string in PAL is a palindrome). Conversely, if a string consisting of a’s, b’s, and c’s reads the same forwards and backwards then it belongs to the language PAL. Therefore this definition is said to be complete (every palindrome is in PAL).

Finding an inductive definition for a language which is described by a predicate (like the one for palindromes) is often a nontrivial task. Very often it is relatively easy to find a definition that is sound, but you also have to convince yourself that the definition is complete. A typical method for proving soundness and completeness of an inductive definition is mathematical induction.

Now that we have an inductive definition for palindromes, we can proceed by giving a formal representation of this inductive definition. Inductive definitions like the one above can be represented formally by making use of deduction rules which look like:

a1 , a2 , . . . , an
--------------------            or            -----
          a                                      a

The first kind of deduction rule has to be read as follows: if a1 , a2 , . . . , and an are true, then a is true. The second kind of deduction rule, called an axiom, has to be read as follows: a is true.

Using these deduction rules we can now write down the inductive definition for PAL as follows:

---------      ---------      ---------      ---------
 ε ∈ PAL        a ∈ PAL        b ∈ PAL        c ∈ PAL

  P ∈ PAL          P ∈ PAL          P ∈ PAL
-----------      -----------      -----------
aP a ∈ PAL       bP b ∈ PAL       cP c ∈ PAL
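Read constructively, these rules describe how to build palindromes. The following Haskell fragment is our own illustration, not part of the notes: starting from the four axioms it applies the three remaining rules k times, so that every palindrome over {a, b, c} appears in pals k for a sufficiently large k.

pals :: Int -> [String]
pals 0 = ["", "a", "b", "c"]                        -- the four axioms
pals k = pals 0 ++ [ [x] ++ p ++ [x]                -- one more application of aPa, bPb, cPc
                   | p <- pals (k - 1), x <- "abc" ]

For instance, pals 1 contains aa, bab and cac.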

Although the definition of PAL is completely formal, it is still laborious to write. Since in computer science we use many definitions which follow such a pattern, we introduce a shorthand for it, called a grammar. A grammar consists of production rules. We can give a grammar of PAL by translating the deduction rules given above into production rules. The rule with which the empty string is constructed is:

P → ε

This rule corresponds to the axiom that states that the empty string ε is a palindrome. A rule of the form s → α, where s is a symbol and α is a sequence of symbols, is called a production rule, or production for short. A production rule can be considered as a possible way to rewrite the symbol s. The symbol P to the left of the arrow is a symbol which denotes palindromes. Such a symbol is an example of a nonterminal symbol , or nonterminal for short. Nonterminal symbols are also called auxiliary symbols: their only purpose is to denote structure, they are not part of the alphabet of the language.

Three other basic production rules are the rules for constructing palindromes consisting of just one character. Each of the one element strings a, b, and c is a palindrome, and gives rise to a production:

P → a
P → b
P → c

These production rules correspond to the axioms that state that the one element strings a, b, and c are palindromes. If a string α is a palindrome, then we obtain a new palindrome by prepending and appending an a, b, or c to it: aαa, bαb, and cαc are also palindromes. To obtain these palindromes we use the following recursive productions:

P → aP a
P → bP b
P → cP c

These production rules correspond to the deduction rules that state that, if P is a palindrome, then one can deduce that aP a, bP b and cP c are also palindromes.

The grammar we have presented so far consists of three components:
• the set of terminals {a, b, c};
• the set of nonterminals {P };
• and the set of productions (the seven productions that we have introduced so far).

Note that the intersection of the set of terminals and the set of nonterminals is empty. We complete the description of the grammar by adding a fourth component: the nonterminal start symbol P . In this case we have only one choice for a start symbol, but a grammar may have many nonterminal symbols, and we always have to select one to start with.

To summarize, we obtain the following grammar for PAL:

P → ε
P → a
P → b

P → c
P → aP a
P → bP b
P → cP c

The definition of the set of terminals, {a, b, c}, and the set of nonterminals, {P }, is often implicit. Also the start symbol is implicitly defined here, since there is only one nonterminal.

We conclude this example with the formal definition of a context-free grammar.

Definition 2.5 (Context-Free Grammar). A context-free grammar G is a four-tuple (T , N , R, S ) where
• T is a finite set of terminal symbols;
• N is a finite set of nonterminal symbols (T and N are disjunct);
• R is a finite set of production rules. Each production has the form A → α, where A is a nonterminal and α is a sequence of terminals and nonterminals;
• S is the start symbol, S ∈ N .

The adjective “context-free” in the above definition comes from the specific production rules that are considered: exactly one nonterminal on the left hand side. Not every language can be described via a context-free grammar. The standard example here is {an bn cn | n ∈ N}. We will encounter this example again later in these lecture notes.

2.2.1. Notational conventions

In the definition of the grammar for PAL we have written every production on a single line. Since this takes up a lot of space, and since the production rules form the heart of every grammar, we introduce the following shorthand. Instead of writing

S → α
S → β

we combine the two productions for S in one line, using the symbol |:

S → α | β

We may rewrite any number of rewrite rules for one nonterminal in this fashion, so the grammar for PAL may also be written as follows:

P → ε | a | b | c | aP a | bP b | cP c

The notation we use for grammars is known as BNF – Backus Naur Form – after Backus and Naur, who first used this notation for defining grammars.
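Definition 2.5 can be transcribed almost literally into Haskell. The representation below is a sketch of our own (the notes do not fix a particular encoding, and the names Symbol, Grammar and pal are ours); it records the four components of a grammar and gives the grammar for PAL as a value.

data Symbol = T Char | N Char                   -- a terminal or a nonterminal
  deriving (Eq, Show)

type Production = (Char, [Symbol])              -- A -> alpha

data Grammar = Grammar
  { terminals    :: [Char]
  , nonterminals :: [Char]
  , productions  :: [Production]
  , start        :: Char
  } deriving Show

-- The grammar for PAL as a value of this type.
pal :: Grammar
pal = Grammar
  { terminals    = "abc"
  , nonterminals = "P"
  , start        = 'P'
  , productions  =
      [ ('P', [])                               -- P -> eps
      , ('P', [T 'a']), ('P', [T 'b']), ('P', [T 'c'])
      , ('P', [T 'a', N 'P', T 'a'])            -- P -> aPa
      , ('P', [T 'b', N 'P', T 'b'])            -- P -> bPb
      , ('P', [T 'c', N 'P', T 'c'])            -- P -> cPc
      ]
  }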

b}. The names will be written in front of the production. Now we consider the reverse question: how to obtain a language from a given grammar? Before we can answer this question we first have to say what we can do with a grammar. w )} where #(c.7. the startsymbol is usually the nonterminal in the left hand side of the first production. 2. we have demonstrated in several examples how to construct a grammar for a particular language. The answer is simple: we can derive sequences with it. Give a context free grammar for the set of sentences over alphabet X where 1. So. Exercise 2. In the previous section.3. Give a grammar for palindromes over the alphabet {a. X = {a}.2. Exercise 2. 15 . w ) is the number of c-occurrences in w .8. Give a grammar for the language L = {s s R | s ∈ {a. Exercise 2. Give a context free grammar for the language L = {a n b n | n ∈ N} Exercise 2. if we give a context-free grammar just by means of its productions. and the start-symbol is usually called S . Alpha rule: S → α Beta rule: S → β Finally. Exercise 2.6. b} } This language is known as the language of mirror palindromes. The language of a grammar The goal of this section is to describe the relation between grammars and languages: to show how to derive sentences of a language. Give a grammar for parity sequences.5. Sometimes we want to give names to production rules. The language of a grammar Another notational convention concerns names of productions.10. Give a grammar for the language L = {w | w ∈ {a. for example. w ) = #(b. given its grammar. b} ∧ #(a. A parity sequence is a sequence consisting of 0’s and 1’s that has an even number of ones.3. Exercise 2. X = {a. ∗ ∗ 2. b}.9.

How do we construct a palindrome? A palindrome is a sequence of terminals, in our case the characters a, b and c, that can be derived in zero or more direct derivation steps from the start symbol P using the productions of the grammar for palindromes given before.

For example, the sequence bacab can be derived using the grammar for palindromes as follows:

P ⇒ bP b ⇒ baP ab ⇒ bacab

Such a construction is called a derivation. In the first step of this derivation production P → bP b is used to rewrite P into bP b. In the second step production P → aP a is used to rewrite bP b into baP ab. Finally, in the last step production P → c is used to rewrite baP ab into bacab. Constructing a derivation can be seen as a constructive proof that the string bacab is a palindrome.

We will now describe derivation steps more formally.

Definition 2.6 (Derivation). Suppose X → β is a production of a grammar, where X is a nonterminal symbol and β is a sequence of (nonterminal or terminal) symbols. Let αX γ be a sequence of (nonterminal or terminal) symbols. We say that αX γ directly derives the sequence αβγ, which is obtained by replacing the left hand side X of the production by the corresponding right hand side β. We write αX γ ⇒ αβγ and we also say that αX γ rewrites to αβγ in one step. A sequence ϕn is derived from a sequence ϕ0 , written ϕ0 ⇒∗ ϕn , if there exist sequences ϕ0 , . . . , ϕn such that

∀i , 0 ≤ i < n : ϕi ⇒ ϕi+1

If n = 0, this statement is trivially true, and it follows that we can derive each sentence ϕ from itself in zero steps:

ϕ ⇒∗ ϕ

A partial derivation is a derivation of a sequence that still contains nonterminals.

Finding a derivation ϕ0 ⇒∗ ϕn is, in general, a nontrivial task. A derivation is only one branch of a whole search tree which contains many more branches. Each branch represents a (successful or unsuccessful) direction in which a possible derivation may proceed. Another important challenge is to arrange things in such a way that finding a derivation can be done in an efficient way.
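Definition 2.6 also has a direct computational reading. The sketch below is ours, not part of the notes; it reuses the hypothetical Grammar and Symbol types introduced above, producing every sentential form reachable in one rewrite step, and from that all forms reachable in at most n steps.

step :: Grammar -> [Symbol] -> [[Symbol]]
step g phi =
  [ alpha ++ beta ++ gamma
  | (alpha, N x : gamma) <- splits phi          -- pick an occurrence of a nonterminal X
  , (a, beta) <- productions g, a == x          -- pick a production X -> beta
  ]
  where
    splits xs = [ splitAt i xs | i <- [0 .. length xs - 1] ]

-- All sentential forms derivable from the start symbol in at most n steps.
derive :: Int -> Grammar -> [[Symbol]]
derive 0 g = [[N (start g)]]
derive n g = derive 0 g ++ concatMap (step g) (derive (n - 1) g)

For instance, step pal [N 'P'] returns the seven right hand sides of the productions for P.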

s ∈ T ∗ } The language L(G) is also called the language generated by the grammar G.3. N . but the reverse is not true: given a language we can construct many grammars that generate the language. and so do letters. So for a particular grammar there exists a unique language. Thus.8 (Context-free language). The language of a grammar G=(T . For example.2. Dig → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 17 . we define usually denoted by L(G). All palindromes can be derived from the start symbol P . A context-free language is a language that is generated by a context-free grammar. Two grammars that generate the same language are called equivalent. b. we say that the string bacab belongs to the language generated by the grammar for palindromes. the set of all palindromes over the alphabet {a. grammar equivalent context-free language 2. • The language of single digits is specified by a grammar with ten production rules for the nonterminal Dig. S ).7 (Language of a grammar). c}. and PAL is context-free. the language of our grammar for palindromes is PAL. To phrase it more mathematically: the mapping between a grammar and its language is not a bijection. R. Definition 2. These grammars will be used frequently in later sections.1. In general. is defined as L(G) = {s | S ⇒∗ s. language of a Definition 2. Examples of basic languages Digits occur in a several programming languages and other languages. s ∈ T ∗ } Note that different grammars may have the same language. The language of a grammar From the example derivation above it follows that P ⇒∗ bacab Because this derivation begins with the start symbol of the grammar and results in a sequence consisting of terminals only (a terminal string).3. We sometimes also talk about the language of a nonterminal A. if we extend the grammar for PAL with the production P → bacab. we obtain a grammar with exactly the same language as PAL. In this subsection we will define some grammars that specify some basic languages such as digits and letters. which is defined by L(A) = {s | A ⇒∗ s.

• We obtain sequences of digits by means of the following grammar:

Digs → ε | Dig Digs

• Natural numbers are sequences of digits that start with a non-zero digit. So in order to specify natural numbers, we first define the language of non-zero digits.

Dig-0 → 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

Now we can define the language of natural numbers as follows.

Nat → 0 | Dig-0 Digs

• Integers are natural numbers preceded by a sign. If a natural number is not preceded by a sign, it is supposed to be a positive number.

Sign → + | -
Int → Sign Nat | Nat

• The languages of small letters and capital letters are each specified by a grammar with 26 productions:

SLetter → a | b | . . . | z
CLetter → A | B | . . . | Z

In the real definitions of these grammars we have to write each of the 26 letters, of course. A letter is now either a small or a capital letter.

Letter → SLetter | CLetter

• Variable names, function names, datatypes, etc., are all represented by identifiers in programming languages. The following grammar for identifiers might be used in a programming language:

Identifier → Letter AlphaNums
AlphaNums → ε | Letter AlphaNums | Dig AlphaNums

An identifier starts with a letter, and is followed by a sequence of alphanumeric characters, i.e. letters and digits. We might want to allow more symbols, such as for example underscores and dollar symbols, but then we have to adjust the grammar, of course.

• Dutch zip codes consist of four digits, of which the first digit is non-zero, followed by two capital letters. So

ZipCode → Dig-0 Dig Dig Dig CLetter CLetter
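As an aside that is not part of the notes, the Identifier and Nat grammars above translate directly into Haskell predicates; the names isIdentifier and isNat below are ours.

import Data.Char (isAlpha, isDigit)

-- Identifier -> Letter AlphaNums: a letter followed by letters and digits.
isIdentifier :: String -> Bool
isIdentifier []     = False
isIdentifier (c:cs) = isAlpha c && all (\x -> isAlpha x || isDigit x) cs

-- Nat -> 0 | Dig-0 Digs: either the single digit 0, or a non-zero digit followed by digits.
isNat :: String -> Bool
isNat "0"    = True
isNat (c:cs) = c `elem` "123456789" && all isDigit cs
isNat _      = False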

a derivation that contains nonterminals in its right hand side. Parse trees The goal of this section is to introduce parse trees. there may be different derivations for the same sentence. Consider the grammar SequenceOfS with productions: S → SS S →s Using this grammar.1 context free ? 2. For any partial derivation. e. Give a simple description of the language generated by the grammar with productions S → aA A → bS S →ε Exercise 2. As a consequence. i.. Furthermore. Is the language L defined in Exercise 2. and to show how parse trees relate to derivations. then the derivations are considered to be equivalent.13. we can derive the sentence sss in at least the following two ways (the nonterminal that is rewritten in each step appears underlined): S ⇒ S S ⇒ S S S ⇒ S sS ⇒ ssS ⇒ sss 19 .2. However. there may be several productions of the grammar that can be used to proceed the partial derivation with.4.14.4. Why? Exercise 2. then these derivations are considered to be different. however.15. Here is a simple example. If. Parse trees Exercise 2.12. different derivation steps have been chosen in two derivations. this section defines (non)ambiguous grammars.11. What language does the grammar with the following productions generate? S → Aa A→B B → Aa Exercise 2. A terminal string that belongs to the language of a grammar is always derived in one or more steps from the start symbol of the grammar. What language is generated by the grammar with the single production rule S →ε Exercise 2. if only the order in which the derivation steps are chosen differs between two derivations.

The set of all equivalent derivations can be represented by selecting a so-called canonical element. then there exists also a leftmost derivation of x . are both not leftmost. A good candidate for such a canonical element is the leftmost derivation. the following derivation does not use the same derivation steps: S ⇒ S S ⇒ S SS ⇒ sS S ⇒ ssS ⇒ sss In both derivation sequences above. however. The parse tree of a terminal symbol is a leaf labelled with the terminal symbol. Context-Free Grammars S ⇒ S S ⇒ sS ⇒ sS S ⇒ sS s ⇒ sss These derivations are the same up to the order in which derivation steps are taken. The leftmost derivation corresponding to these two sequences above is S ⇒ S S ⇒ sS ⇒ sS S ⇒ ssS ⇒ sss leftmost derivation parse tree There exists another convenient way for representing equivalent derivations: they all correspond to the same parse tree (or derivation tree). The internal nodes of a parse tree are labelled with a nonterminal N . In this derivation. If there exists a derivation of a sentence x using the productions of a grammar. A parse tree is a representation of a derivation which abstracts from the order in which derivation steps are chosen. The last of the three derivation sequences for the sentence sss given above is a leftmost derivation.2. the first S is rewritten to SS . the leftmost nonterminal is rewritten in each step. In a leftmost derivation. However. and the children of such a node are the parse trees for symbols of the right hand side of a production for N . The resulting parse tree of the first two derivations of the sentence sss looks as follows: S S s S s S S s The third derivation of the sentence sss results in a different parse tree: S S S s S s S s 20 . The two equivalent derivation sequences before. however. the first S was rewritten to s.

2. Otherwise it is called ambiguous. Take for example the sentence ‘They are flying planes’. Therefore. Such grammars are called ambiguous. all derivations of the string abba using the productions of the grammar P P P A B →ε → APA → BPB →a →b are represented by the following derivation tree: P A a B b P P ε B b A a A derivation tree can be seen as a structural interpretation of the derived sentence. This sentence can be read in two ways. you will find that most grammars of programming languages and other languages that are used in processing information are unambiguous. equivalently. politicians). Grammars have proved very successful in the specification of artificial languages (such as programming languages). g. This implies that is impossible to write a program that determines for an arbitrary context-free grammar whether it is ambiguous or not. Definition 2. unambiguous grammar). unambiguous ambiguous 21 . or. if every sentence has a unique derivation tree. because it is usually impossible to maintain an ambiguous meaning in a translation.9 (ambiguous grammar.. and ‘They – are flying – planes’. since there exist two parse trees for the sentence sss. partly because is extremely difficult to construct an unambiguous grammar that specifies a nontrivial part of the language. Note that there might be more than one structural interpretation of a sentence with respect to a given grammar.4. It is usually rather difficult to translate languages with ambiguous grammars. it certainly is considered a disadvantage for language translators. The grammar SequenceOfS for constructing sequences of s’s is an example of an ambiguous grammar. While ambiguity of natural languages may perhaps be considered as an advantage for their users (e. Parse trees As another example. A grammar is unambiguous if every sentence has a unique leftmost derivation. with different meanings: ‘They – are – flying planes’. It is in general undecidable whether or not an arbitrary context-free grammar is ambiguous. They have proved less successful in the specification of natural languages (such as English).

Properties are particularly interesting in combination with grammar transformations. Context-Free Grammars 2.5. no other nonterminal can derive the empty string. • a grammar may have the property that every production either has a single terminal. Then we can use this program for building parse trees for any grammar. So why are we interested in such properties? Some of these properties imply that it is possible to build parse trees for sentences of the language of the grammar in only one way. we will first look at a number of properties of grammars. or two nonterminals in its right hand side.2. Removing unreachable productions. Some other properties imply that we can build these parse trees very fast. that is. Other properties are used to prove facts about grammars. Here are some examples of properties that are worth considering for a particular grammar: • a grammar may be unambiguous. Removing left recursion. In the following. but that possibly satisfy different properties. grammar transformation 22 . we would like to be able to transform grammars into other grammars that generate the same language. Suppose. Since it is sometimes convenient to have a grammar that satisfies a particular property for a language. and that we can prove that every grammar can be transformed in a grammar in Chomsky normal form. Yet other properties are used to efficiently compute certain information from parse trees of a grammar. Associative separator. for example. we describe a number of grammar transformations: • • • • • • • Removing duplicate productions. • a grammar may have the property that only the start symbol can derive the empty string. Substituting right hand sides for nonterminals. We then will show how we can systematically transform grammars into grammars that describe the same language and satisfy particular properties. Introduction of priorities. that we have a program that builds parse trees for sentences of grammars in Chomsky normal form. A grammar transformationis a procedure to obtain a grammar G from a grammar G such that L(G ) = L(G). Such a grammar is said to be in Chomsky normal form. Grammar transformations In this section. Left factoring. every sentence of its language has a unique parse tree.

3.2. For example. i. In the following. Removing duplicate productions This grammar transformation is a transformation that can be applied to any grammar of the correct form. In such a case. If a grammar contains two occurrences of the same production rule. Grammar transformations There are many more transformations than we describe here. we will only show a small but useful set of grammar transformations. w . e.5.2. the unreachable production can be dropped: A → uxv | uwv | z 23 . we will assume that N is the set of nonterminals. then the second production can never occur in the derivation of a sentence starting from A. v . y and z denote sequences of terminals and nonterminals. are elements of (N ∪ T )∗ . we can substitute B in the right hand side of the first production in the following grammar: A → uBv | z B →x |w The resulting grammar is: A → uxv | uwv | z B →x |w 2. x . A→u |u |v can be transformed into A→u |v 2. For example. Removing unreachable productions Consider the result of the transformation above: A → uxv | uwv | z B →x |w If A is the start symbol and B does not occur in any of the symbol sequences u. Substituting right hand sides for nonterminals If a nonterminal X occurs in a right hand side of a production.5. 2. T is the set of terminals..5. in which X has been replaced by its right hand sides. x . w . z .5.1. the production may be replaced by just as many productions as there exist productions for X . one of these occurrences can be removed. and that u. v .
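The first of these transformations is simple enough to state directly as a program. The fragment below is not part of the notes; it removes duplicate productions from the hypothetical Grammar representation sketched in Section 2.2, keeping one copy of each rule. The language is unchanged, because a duplicated rule offers no derivations that the remaining copy does not already offer.

import Data.List (nub)

removeDuplicates :: Grammar -> Grammar
removeDuplicates g = g { productions = nub (productions g) }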

. . To remove the left-recursive productions of a nonterminal A.2.5. | Axn A → y1 | y2 | . and replace the productions for A by: 24 . Two productions for the new nonterminal are added: one for each of the two different end sequences of the two productions.5. ym starts with A. if we can derive Az from A in one or more steps). We now add a new nonterminal Z . For example: A → xy | xz | v may be transformed into A → xZ | v Z →y |z where Z is a new nonterminal. The following transformation removes left-recursive productions. . | ym where none of the symbol sequences y1 .5. . Left-recursive grammars are sometimes undesirable – we will. These two productions can then be replaced by a single production. replacing the part of the sequence after the common start sequence. A grammar is left-recursive if we can derive A => + Az for some nonterminal A of the grammar (i. . 2.. We can thus factorize the productions for A as follows: A → Ax1 | Ax2 | . Context-Free Grammars 2. Removing left recursion left recursion A production is called left-recursive if the right-hand side starts with the nonterminal of the left-hand side. left recursion can be removed by transforming the grammar. and left factoring the grammar before constructing a parser can dramatically improve the performance of the resulting parser. . .e. parsers can be constructed systematically from a grammar. that ends with a new nonterminal. Left factoring left factoring Left factoring is a grammar transformation that is applicable when two productions for the same nonterminal start with the same sequence of (terminal and/or nonterminal) symbols. For example. As we will see in Chapter 3. we divide the productions for A in sets of left-recursive and non left-recursive productions. the production A → Az is left-recursive. see in Chapter 3 that a parser constructed systematically from a left-recursive grammar may loop.4. . for instance. Fortunately.

Therefore. The second is not. . d2 . given three declarations d1 . .’: Decls → Decls .6.2. and obtain the following productions: S → s | sZ Z → S | SZ 2. but we can still derive A ⇒∗ Ayx . Here is an example of how to apply the procedure described above to a grammar that is directly left recursive. Associative separator The following grammar fragment generates a list of declarations. This grammar is ambiguous. We can thus directly apply the procedure for left recursion removal.5. have been omitted. Grammar transformations A → y1 | y1 Z | . e. a grammar that contains a left-recursive production of the form A → Ax . | xn | xn Z Note that this procedure only works for a grammar that is directly left-recursive.5. . but a bit more complicated [1]. Removing left recursion in an indirectly left-recursive grammar is also possible.4: S → SS S →s The first production is left-recursive. Decls Decls → Decl 25 . the meaning of d1 . that is. | ym | ym Z Z → x1 | x1 Z | . which generates a single declaration. d2 ) . d3 is the same. it does not matter how we group the declarations. is an associative separator in the generated language. and d3 . (d2 .. namely the grammar SequenceOfS that we have introduced in Section 2. An example is A → Bx B → Ay None of the two productions is left recursive. . i. Decls Decls → Decl The productions for Decl . separated by a semicolon ‘. for the same reason as SequenceOfS is ambiguous. we may use the following unambiguous grammar for generating a language of declarations: Decls → Decl . Grammars can also be indirectly left recursive. d3 ) and (d1 . The operator .

the sentence 2+4*6 has two parse trees: one corresponding to (2+4)*6. we transform the grammar as follows: E →T E →E +T T →F T →T *F F →(E ) F → Digs This grammar generates the same language as the previous grammar for expressions. 26 . then removing the ambiguity in favour of a particular nesting is dangerous. such as for example natural numbers and addition. This grammar is ambiguous: for example. the transformed grammar is also no longer left-recursive. Context-Free Grammars Note that in this case.7. if the separator is not associative.2. For example. 2. Decl Decls → Decl The grammar transformation just described should be handled with care: if the separator is associative in the generated language. the following grammar generates arithmetic expressions: E E E E →E +E →E *E →(E ) → Digs where Digs generates a list of digits as described in Section 2. applying the transformation is fine. An alternative unambiguous (but still left-recursive) grammar for the same language is Decls → Decls . the latter expression is the intended reading of the sentence 2+4*6. If we make the usual assumption that * has higher priority than +. and one corresponding to 2+(4*6). This grammar transformation is often useful for expressions which are separated by associative operators. Introduction of priorities Another form of ambiguity often arises in the part of a grammar for a programming language which describes expressions.5. In order to obtain parse trees that respect these priorities.3. However. but it respects the priorities of the operators.1. like the semicolon in this case.

a} and the following productions: S → if b then S else S S → if b then S S →a 1.17. we get productions Ei → Ei+1 Ei → Ei Op i Ei+1 The nonterminal Op i is parameterised and generates operators of priority i .8. Consider the following ambiguous grammar with start symbol A: A → AaA A→b|c Transform the grammar by applying the rule for associative separators.2. they are omitted. For each of the above transformations we should therefore prove that the generated language remains the same. For 1 i < n. else. There exist many other grammar transformations. and obtain a dual grammar transformation. instead of writing a large number of nearly identically formed production rules. Note that everywhere we use ‘left’ (left-recursion. there should also be a production for expressions of the highest priority. Grammar transformations In practice. Then. but the ones given in this section suffice for now. 3. often more than two levels of priority are used.5. b. we can replace it by ‘right’. Exercise 2. In addition to the above productions. we can abbreviate the grammar by using parameterised nonterminals.5. A grammar transformation transforms a grammar into another grammar that generates the same language. then. left factoring). The standard example of ambiguity in programming languages is the dangling else. Give two parse trees for the sentence if b then if b then a else a.16. We will discuss a larger example of a grammar transformation after the following section. Exercise 2. Since the proofs are too complicated at this point. Give an unambiguous grammar that generates the same language as G. Choose the transformation such that the resulting grammar is also no longer left-recursive. Discussion We have presented several examples of grammar transformations. How does Java prevent this dangling else problem? 27 . for example: En → ( E1 ) | Digs 2. 2. Proofs can be found in any of the theoretical books on language and parsing theory [11]. Let G be a grammar with terminal set {if.

we introduce the notion of abstract syntax. Concrete and abstract syntax In this section. using the names of the productions as constructors: data S = Beside S S | Single Note that the nonterminals on the right hand side of Beside reappear as arguments of the constructor Beside. On the other hand. Consider the follwing grammar with start symbol S : S → AB A → ε | aaA B → ε | Bb 1.19. As an example we take the grammar SequenceOfS: S → SS S →s First. One may be tempted to make the following definition instead: 28 . and show how to obtain an abstract syntax from a concrete syntax. Give an equivalent non left recursive grammar. we establish a connection between context-free grammars and Haskell datatypes. What language does this grammar generate? 2. A grammar for bit lists is given by L →B L →L. Values of these datatypes represent parse trees of the context-free grammar.L B →0|1 Remove the left recursion from this grammar. we give each of the productions of this grammar a name: Beside: S → SS Single: S → s Now we interpret the start symbol of the grammar S as a datatype. For each context-free grammar we can define a corresponding datatype in Haskell. A bit list is a nonempty list of bits separated by commas. the terminal symbol s in the production Single is omitted in the definition of the constructor Single.18.6. Context-Free Grammars Exercise 2. 2. Exercise 2.2. To this end.

data S = Beside S S | Single Char    -- too general

However, this datatype is too general for the given grammar. An argument of type Char can be instantiated to any single character, but we know that this character always has to be s. Since there is no choice anyway, there is no extra value in storing that s, and the first datatype S serves the purpose of encoding the parse trees of the grammar SequenceOfS just fine. For example, the parse tree that corresponds to the first two derivations of the sequence sss is represented by the following value of the datatype S :

Beside Single (Beside Single Single)

The third derivation of the sentence sss produces the following parse tree:

Beside (Beside Single Single) Single

To emphasize that these representations contain sufficient information in order to reproduce the original strings, we can write a function that performs this conversion:

sToString :: S → String
sToString (Beside l r ) = sToString l ++ sToString r
sToString Single = "s"

Applying the function sToString to either Beside Single (Beside Single Single) or Beside (Beside Single Single) Single yields the string "sss".

The datatype S is an example of an abstract syntax for the language of SequenceOfS. The adjective ‘abstract’ indicates that values of the abstract syntax do not need to explicitly contain all information about particular sentences, as long as that information is recoverable, as for example by applying function sToString. A concrete syntax of a language describes the appearance of the sentences of a language. So the concrete syntax of the language of nonterminal S is given by the grammar SequenceOfS. On the other hand, an abstract syntax of a language describes the shapes of parse trees of the language, without the need to refer to concrete terminal symbols. Parse trees are therefore often also called abstract syntax trees.

A function such as sToString is often called a semantic function. A semantic function is a function that is defined on an abstract syntax of a language. Semantic functions are used to give semantics (meaning) to values. In this example, the meaning of a more abstract representation is expressed in terms of a concrete representation.
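As a further illustration, and not part of the original notes, a semantic function does not have to reproduce the concrete string; it may just as well compute some other meaning of a parse tree, for instance the number of occurrences of s it represents:

sSize :: S -> Int
sSize (Beside l r) = sSize l + sSize r
sSize Single       = 1

Applied to either parse tree of sss it yields 3, which already hints at the even more compact abstract syntax introduced below.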

Using the removing left recursion grammar transformation, the grammar SequenceOfS can be transformed into the grammar with the following productions:

S → sZ | s
Z → SZ | S

An abstract syntax of this grammar may be given by

data SA = ConsS Z | SingleS
data Z = ConsZ SA Z | SingleZ SA

For each nonterminal in the original grammar, we have introduced a corresponding datatype. For each production for a particular nonterminal (expanding all the alternatives into separate productions), we have introduced a constructor and invented a name for the constructor. The nonterminals on the right hand side of the production rules appear as arguments of the constructors, but the terminals disappear, because that information can be recovered by knowing which constructors have been used.

If we look at the two different grammars for SequenceOfS, and the two abstract syntaxes, we can conclude that the only important information about sequences of s’s is how many occurrences of s there are. So the ultimate abstract syntax for SequenceOfS is

data SS = Size Int

Using the abstract syntax SS , the sequence sss is represented by the parse tree Size 3, and we can still recover the original string from a value SS by means of a semantic function:

ssToString :: SS → String
ssToString (Size n) = replicate n ’s’

The SequenceOfS example shows that one may choose between many different abstract syntaxes for a given grammar. The choice of an abstract syntax over another should therefore be determined by the demands of the application, i.e. by what we ultimately want to compute.

Exercise 2.20. The following Haskell datatype represents a limited form of arithmetic expressions

data Expr = Add Expr Expr | Mul Expr Expr | Con Int

Give a grammar for a suitable concrete syntax corresponding to this datatype.

Exercise 2.21. Consider the grammar for palindromes that you have constructed in Exercise 2.6. Give parse trees for the palindromes pal 1 = "abaaba" and pal 2 = "baaab". Define a datatype Pal corresponding to the grammar and represent the parse trees for pal 1 and pal 2 as values of Pal .

Exercise 2.22. Consider your answers to Exercises 2.6 and 2.21, where we have given a grammar for palindromes over the alphabet {a, b} and a Haskell datatype describing the abstract syntax of such palindromes.

Define a datatype Mir that describes the abstract syntax corresponding to your grammar. Test your function with the palindromes pal 1 and pal 2 from Exercise 2. Exercise 2. which describes the concrete syntax for mirror palindromes. Write a function that transforms an abstract representation of a mirror palindrome into the corresponding abstract representation of a palindrome (using the datatype from Exercise 2. Define a datatype BitList that describes the abstract syntax corresponding to your grammar. Constructions on grammars This section introduces some constructions on grammars that are useful when specifying larger grammars.1. Consider your answer to Exercise 2. Consider your answer to Exercise 2.1. Give the two abstract parity sequences aEven 1 and aEven 2 that correspond to the concrete parity sequences cEven 1 = "00101" and cEven 2 = "01010".8.24. 1.2.21). Test your function with the abstract bit lists aBitList 1 and aBitList 2 .7. 2. Write a semantic function that transforms an abstract representation of a bit list into a concrete one.0. Test your function with the abstract bit lists aBitList 1 and aBitList 2 .9. Write a semantic function that transforms an abstract representation of a mirror palindrome into a concrete one. 2. Test your function with the abstract mirror palindromes aMir 1 and aMir 2 . The BNF notation. Give the two abstract bit-lists aBitList 1 and aBitList 2 that correspond to the concrete bit-lists cBitList 1 = "0.2. was first used in the early sixties when the programming language ALGOL 60 was defined and until now it is the 31 . Exercise 2. Consider your anwer to Exercise 2. introduced in Section 2.21.21.18. Write a semantic function that transforms an abstract representation of a palindrome into a concrete one. Write a function that concatenates two abstract representations of a bit lists into a bit list.7. which describes the concrete syntax for bit lists by means of a grammar that is not left-recursive. 3. 2. Test your function with the abstract mirror palindromes aMir 1 and aMir 2 . which describes the concrete syntax for parity sequences. for example for programming languages. Furthermore. Test your function with the abstract parity sequences aEven 1 and aEven 2 . Write a semantic function that transforms an abstract representation of a parity sequence into a concrete one. 3. Test your function with the palindromes pal 1 and pal 2 from Exercise 2. Constructions on grammars 1. Exercise 2. Define a datatype Parity describing the abstract syntax corresponding to your grammar. Give the two abstract mirror palindromes aMir 1 and aMir 2 that correspond to the concrete mirror palindromes cMir 1 = "abaaba" and cMir 2 = "abbbba". 1.0" and cBitList 2 = "0. Write a semantic function that counts the number of a’s occurring in a palindrome.25.1". 1. it gives an example of a larger grammar that is transformed in several steps.23. 2. 2.

S → SL S }. and {P } for P ∗ . SM ). S ) where S is a fresh nonterminal. We assume that the nonterminal sets NL and NM are disjoint. we define the meaning of these constructs. RL . • the language L+ is generated by the grammar (T . Context-Free Grammars standard way of defining the syntax of programming languages (see. The notation for the EBNF constructions is not entirely standardized. abbreviated P ∗ . we have defined a number of operations on languages using operations on sets. • one or more occurrences of nonterminal P . One approach is to decompose the language and to find grammars for each of its constituent parts. In Definition 2. S → SL S }. N = NL ∪ {S } and R = RL ∪ {S → ε. S ) where S is a fresh nonterminal. N . N . We introduced grammars as an alternative for the description of languages.3. but that decreases the readability of the grammar. We now show that these operations can be expressed in terms of operations on context-free grammars. SL ) and GM = (T . EBNF This extended BNF notation.. RM .10 (Language operations). S ) where S is a fresh nonterminal. You may object that the Java grammar contains more “syntactical sugar” than the grammars that we considered thus far (and to be honest. Theorem 2. sets of sentences) have a direct counterpart at the level of grammatical descriptions. N = NL ∪ {S } and R = RL ∪ {S → SL . this also holds for the ALGOL 60 grammar): one encounters nonterminals with postfixes ‘?’. Suppose we have grammars for the languages L and M . In some texts. e. the Java Language Grammar). abbreviated P + . • and zero or more occurrences of nonterminal P . A straightforward question to ask is now: can we also define languages 32 . you will for instance find the notation [P ] instead of P ?. We could easily express these constructions by adding additional nonterminals. N = NL ∪ NM ∪ {S } and R = RL ∪ RM ∪ {S → SL . N . R. • the language L M is generated by the grammar (T . The theorem above establishes that the set-theoretic operations at the level of languages (i. In this section. EBNF . say GL = (T . is introduced in order to help abbreviate a number of standard constructions that usually occur quite often in the syntax of a programming language: • one or zero occurrences of nonterminal P . N = NL ∪ NM ∪ {S } and R = RL ∪ RM ∪ {S → SL SM }. S → SM }. abbreviated P ?.2. ‘+’ and ‘∗’. • the language L∗ is generated by the grammar (T . R.for instance. Then • the language L ∪ M is generated by the grammar (T . N . S ) where S is a fresh nonterminal. NL . Designing a grammar for a specific language may not be a trivial task. R. grammars. The same notation can be used for languages. R. and sequences of terminal and nonterminal symbols instead of just single nonterminals. NM .

and P ∗ for a sequence of symbols P is very similar to the definitions of the operations on grammars. N . S → S }.7. Because the concatenation operator for sequences is associative.2.2) Z → Q | PZ (2. the answer is negative – there are no operations on grammars that correspond to the language intersection and difference operators.11 (Grammar operations). we add a new grammar construction for an “optional grammar”. Two of the above constructions are important enough to actually define them as grammar operations. S ) be a context-free grammar and let S be a fresh nonterminal. Definition 2. N ∪ {S }. the operators ·∗ and ·+ can also be defined symmetrically: L(P ∗ ) L(P + ) = L(Z ) with = L(Z ) with Z → ε | ZP Z → P | ZP Many variations are possible on this theme: L(P ∗ Q) = L(Z ) with or also L(P Q ∗ ) = L(Z ) with Z → P | ZQ (2. Constructions on grammars as the difference between two languages or as the intersection of two languages. so Dig ∗ denotes the language consisting of zero or more digits. R ∪ {S → ε. Furthermore. Then G ∗ = (T . P + . Let P be a sequence of nonterminals and terminals. S ) G? = (T . N ∪ {S }. P ∗ denotes zero or more concatenations of string P . R. S → S S }. S → S S }. Definition 2. R ∪ {S → ε. For example. S ) G + = (T .1) 33 . R ∪ {S → S . S ) The definition of P ?. Let G = (T .12 (EBNF for sequences). and translate these operations to operations on grammars? Unfortunately. N ∪ {S }. then L(P ∗ ) L(P + ) L(P ?) = L(Z ) with = L(Z ) with = L(Z ) with Z → ε | PZ Z → P | PZ Z →ε |P where Z is a new nonterminal in each definition.

and we introduce priorities to resolve some of the ambiguities. Using the “introduction of priorities” grammar transformation.1. The following ‘program’ is a sentence of this language: if true then funny true else false where funny = 7 It is clear that this is not a very convenient language to write programs in. and Bool generate variables. SL: an example To illustrate EBNF and some of the grammar transformations given in the previous section. Decls → Var = Expr AppExpr → AppExpr Atomic | Atomic where the nonterminals Var . Removing left recursion gives the following productions for Expr 2 : Expr 2 → Atomic | Atomic Expr 2 Expr 2 → Atomic | Atomic Expr 2 34 . The following grammar generates expressions in a very small programming language. Note that the brackets around the Expr in the production for Atomic. and the semicolon in between the Decls in the second production for Decls are also terminal symbols.7. and both application and if bind stronger then where. number expressions. The above grammar is ambiguous (why?). we give a larger example. The nonterminal Expr 2 is left-recursive. respectively. Number . and boolean expressions. called SL.2. we obtain: Expr → Expr 1 Expr → Expr 1 where Decls Expr 1 → Expr 2 Expr 1 → if Expr 1 then Expr 1 else Expr 1 Expr 2 → Atomic Expr 2 → Expr 2 Atomic where Atomic and Decls have the same productions as before. Application binds stronger than if. Expr Expr Expr Atomic Decls Decls Decl → if Expr then Expr else Expr → Expr where Decls → AppExpr → Var | Number | Bool | ( Expr ) → Decl → Decls . Context-Free Grammars 2.

n ∈ N} 1. the ·? operation applied to G). This transformation is applied to the productions for Expr .. Give grammars for L1 and L2 . e. Let G be a grammar G. Expr Expr 1 Expr 1 Atomic Decls → Expr 1 (where Decls)? → Atomic + → if Expr 1 then Expr 1 else Expr 1 → Var | Number | Bool | ( Expr ) → Decl (. is assumed to be associative. Constructions on grammars Since the new nonterminal Expr 2 has exactly the same productions as Expr 2 . Give the language that is generated by G? (i. 2. Decl )∗ Exercise 2.2).1. we can replace Expr 1 by an optional where clause in the production for Expr : Expr → Expr 1 (where Decls)? After all these grammar transformations. Exercise 2.3. i.2. Another source of ambiguity are the productions for Decls. Give the EBNF notation for each of the basic languages defined in Section 2. and the separator . The nonterminal Decls generates a nonempty list of declarations.. these productions can be replaced by Expr 2 → Atomic | Atomic Expr 2 So Expr 2 generates a nonempty sequence of atomics. Decl )∗ The last grammar transformation we apply is “left factoring”. can you give a context-free grammar for this language? 35 .28.e. and yields Expr → Expr 1 Expr 1 Expr 1 → ε | where Decls Since nonterminal Expr 1 generates either nothing or a where clause. we can replace Expr 2 by Atomic + . Decls → Decl (. n ∈ N} L2 = {a m b n c n | m. we obtain the following grammar. Decl or. Exercise 2.7. Using the ·+ -notation introduced before. Let L1 = {a m b m c n | m. according to (2. Hence we can apply the “associative separator” transformation to obtain Decls → Decl | Decls .27.26. Is L1 ∩ L2 context-free.

2.8. Parsing

This section formulates the parsing problem, and discusses some of the future topics of the course.

Definition 2.13 (Parsing problem). Given the grammar G and a string s, the parsing problem answers the question whether or not s ∈ L(G). If s ∈ L(G), the answer to this question may be either a parse tree or a derivation.

This question may not be easy to answer given an arbitrary grammar. Until now we have only seen simple grammars for which it is relatively easy to determine whether or not a string is a sentence of the grammar. For more complicated grammars this may be more difficult. However, in the first part of this course we will show how – given a grammar with certain reasonable properties – we can easily construct parsers by hand. At the same time we will show how the parsing process can quite often be combined with the algorithm we actually want to perform on the recognized object (the semantic function). The techniques we describe comprise a simple, although surprisingly efficient, introduction into the area of compiler construction.

A compiler for a programming language consists of several parts. Examples of such parts are a scanner, a parser, a type checker, and a code generator. Usually, a parser is preceded by a scanner (also called lexer), which splits an input sentence into a list of so-called tokens. For example, given the sentence

if true then funny true else false where funny = 7

a scanner might return the following list of tokens:

["if", "true", "then", "funny", "true", "else", "false", "where", "funny", "=", "7"]

So a token is a syntactical entity. A scanner usually performs the first step towards an abstract syntax: it throws away layout information such as spacing and newlines. In this course we will concentrate on parsers, but some of the concepts of scanners will sometimes be used.

In the second part of this course we will take a look at more complicated grammars, which do not always conform to the restrictions just referred to. By analysing the grammar we may nevertheless be able to generate parsers as well. Such generated parsers will be in such a form that it will be clear that writing such parsers by hand is far from attractive, and actually impossible for all practical cases.

One of the problems we have not referred to yet in this rather formal chapter is of a more practical nature. Quite often the sentence presented to the parser will not be a sentence of the language since mistakes were made when typing the sentence. This raises another interesting question: What are the minimal changes that have

to be made to the sentence in order to convert it into a sentence of the language? It goes almost without saying that this is an important question to be answered in practice: one would not be very happy with a compiler which, given an erroneous input, would just reply that the "Input could not be recognised". One of the most important aspects here is to define a metric for deciding about the minimality of a change; humans usually make certain mistakes more often than others. A semicolon can easily be forgotten, but the chance that an if symbol is missing is far from likely. This is where grammar engineering starts to play a rôle.

Summary

Starting from a simple example, the language of palindromes, we have introduced the concept of a context-free grammar. Associated concepts, such as derivations and parse trees, were introduced.

2.9. Exercises

Exercise 2.29. Do there exist languages L such that (L∗) = (L)?

Exercise 2.30. Give a language L such that L = L∗.

Exercise 2.31. Under which circumstances is L+ = L∗ − {ε}?

Exercise 2.32. Let L be a language over alphabet {a, b, c} such that L = L^R. Does L contain only palindromes?

Exercise 2.33. Consider the grammar with productions

S → AA
A → AAA
A → a
A → bA
A → Ab

1. Which terminal strings can be produced by derivations of four or fewer steps?
2. Give at least two distinct derivations for the string babbab.
3. For any m, n, p ≥ 0, describe a derivation of the string b^m a b^n a b^p.

Exercise 2.34. Consider the grammar with productions

S → aaB
A → bBb
A → ε
B → Aa

Show that the string aabbaabba cannot be derived from S.

Exercise 2.35. Describe the language generated by the grammar:

S → ε
S → A
A → aAb
A → ab

Can you find another (preferably simpler) grammar for the same language?

Exercise 2.36. Describe the languages generated by the grammars.

S → ε          S → ε
S → A    and   S → A
A → Aa         A → AaA
A → a          A → a

Can you find other (preferably simpler) grammars for the same languages?

Exercise 2.37. Show that the languages generated by the grammars G1, G2, and G3 are the same.

G1: S → ε       G2: S → ε       G3: S → ε
    S → aS          S → Sa          S → a
                                    S → SS

Exercise 2.38. Consider the following property of grammars:
1. the start symbol does not occur in any alternative,
2. the start symbol is the only nonterminal which may have an empty production (a production of the form X → ε).
A grammar having this property is called non-contracting. The grammar A → aAb | ε does not have this property. Give a non-contracting grammar which describes the same language.

Exercise 2.39. Describe the language L of the grammar

A → AaA | a

Give a grammar for L that has no left-recursive productions. Give a grammar for L that has no right-recursive productions.

Exercise 2.40. Give a grammar for the language

L = {ω c ω^R | ω ∈ {a, b}∗}

This language is known as the center-marked palindromes language. Give a derivation of the sentence abcba.

[. An example sentence in this language is [ ( ) ] ( ).45. {(. Consider the language L of the grammar S → T | US T → aS a | U a U → S | SUT Give a grammar for L which uses only productions with two or less symbols on the right hand side. )} in which the brackets match. ]} in which the brackets match.44. Give a grammar for the language consisting of all nonempty sequences of two kinds of brackets. An example sentence of the language is ( ) ( ( ) ) ( ). Exercise 2. → they → Verb NounPhrase → AuxVerb Verb Noun → are → flying → are → Adjective Noun → flying → planes Give two different leftmost derivations for the sentence they are flying planes. {(. Exercises Exercise 2. 39 . Consider the following grammar for a part of the English language: Sentence Subject Predicate Predicate Verb Verb AuxVerb NounPhrase Adjective Noun → Subject Predicate . Exercise 2.46.42. Exercise 2.41. Give a grammar for L that has no left-recursive productions and is non-contracting. This exercise shows an example (attributed to Noam Chomsky) of an ambiguous English sentence. ). Give a grammar for the language consisting of all nonempty sequences of brackets. Exercise 2. Exercise 2. Give a grammar for the language of all sequences of 0’s and 1’s which start with a 1 and contain exactly one 0. Describe the language L of the grammar X → a | Xb Give a grammar for L that has no left-recursive productions.43.9. Give a derivation for this sentence.2. Give a grammar for L which uses only two nonterminals.

Exercise 2.51 (••. Exercise 2. given a string w of zeros and ones. Prove.45 unambiguous? If not.49.33 contains all strings over {a. Context-Free Grammars Exercise 2. Prove that the language generated by the grammar of Exercise 2. Assume that .54 (no answer provided).52 (no answer provided). → ⊗ →⊗ ⊗→⊗3⊕ ⊗→⊕ ⊕→♣ ⊕→♠ Find a derivation for the sentence ♣ 3 ♣ ♠. 40 .53 (no answer provided). thus 4 is represented as 001. and ⊕ are nonterminals. Write an algorithm that determines whether or not the number of a’s in w equals the number of b’s in w .47. and the other symbols are terminals.48 (•). b} where the number of a’s is even and greater than zero. Exercise 2. Let w be a string consisting of a’s and b’s only. Write an algorithm that. using induction. Exercise 2. no answer provided). given a string w of I’s. thus 4 is represented as IIII. Write an algorithm that. determines whether or not w is divisible by 7. Try to find some ambiguous sentences in your own natural language. Consider the natural numbers in unary notation where only the symbol I is used.2. Exercise 2. Here are some ambiguous Dutch sentences seen in the newspapers: Vliegen met hartafwijking niet gevaarlijk Jessye Norman kan niet zingen Alcohol is voor vrouwen schadelijker dan mannen Exercise 2.2 does indeed generate the language of palindromes. Consider the natural numbers in reverse binary notation. Exercise 2. find one which is unambiguous. Is your grammar for Exercise 2. ⊗. determines whether or not w is divisible by 7. that the grammar G for palindromes in Section 2.50 (•). This exercise deals with a grammar that uses unusual terminal and nonterminal symbols.

3. Parser combinators

Introduction

This chapter is an informal introduction to writing parsers in a lazy functional language using 'parser combinators'. Parser combinators are built by means of standard functional language constructs like higher-order functions, lists, and datatypes. List comprehensions are used in a few places, but they are not essential, and could easily be rephrased using the map, filter and concat functions. Type classes are only used for overloading the equality and arithmetic operators.

Parsers can be written using a small set of basic parsing functions, and a number of functions that combine parsers into more complicated parsers. The functions that combine parsers are called parser combinators. The basic parsing functions do not combine parsers, and are therefore not parser combinators in this sense, but they are usually also called parser combinators.

Parser combinators are used to write parsers that are very similar to the grammar of a language. Thus writing a parser amounts to translating a grammar to a functional program, which is often a simple task.

We will start by motivating the definition of the type of parser functions. Using that type, we can build parsers for the language of (possibly ambiguous) grammars. Next, we will introduce some elementary parsers that can be used for parsing the terminal symbols of a language.

In Section 3.3 the first parser combinators are introduced, which can be used for sequentially and alternatively combining parsers, and for calculating so-called semantic functions during the parse. Semantic functions are used to give meaning to syntactic structures. As an example, we construct a parser for strings of matching parentheses in Section 3.3.1. Different semantic values are calculated for the matching parentheses: a tree describing the structure, and an integer indicating the nesting depth.

In Section 3.4 we introduce some new parser combinators. Not only do these make life easier later, but their definitions are also nice examples of using parser combinators. A real application is given in Section 3.5, where a parser for arithmetical expressions is developed. Finally, the expression parser is generalised to expressions with an arbitrary number of precedence levels. This is done without coding the priorities of operators as integers, and we will avoid using indices and ellipses.

Parsers are composed from simple parsers by means of parser combinators. Parser combinators It is not always possible to directly construct a parser from a context-free grammar using parser combinators. • to understand the concept of semantic functions. However. • to understand the concept of domain specific language. i. there exist good parsers using parser combinators for the Haskell language. • to be able to construct a parser when given a grammar.3. 42 . you should be able to formulate the parsing problem. there do exist implementations of parser combinators that perform remarkably well. Furthermore. e. Goals This chapter introduces the first programs for parsing in these lecture notes. Hence. by means of parser combinators. how to recognise structure in a sequence of symbols. Wadler [13] and Hutton [7]. Another limitation of the parser combinator technique as described in this chapter is that it is not trivial to write parsers for complex grammars that perform reasonably efficient. Most of the techniques introduced in this chapter have been described by Burge [4]. and you should understand the concept of context-free grammar. it has to be transformed into a non left-recursive grammar before we can construct a combinator parser. class. important primary goals of this chapter are: • to understand how to parse. For example. and higher-order functions. Two secondary goals of this chapter are: • to develop the capability to abstract.. you should be familiar with functional programming concepts such as type. see [10. Required prior knowledge To understand this chapter. 12]. If the grammar is left-recursive. This chapter is a revised version of an article by Fokker [5].

3.1. The type of parsers

The goals of this section are:

• develop a type Parser that is used to give the type of parsing functions;
• show how to obtain this type by means of several abstraction steps.

The parsing problem is (see Section 2.8): Given a grammar G and a string s, determine whether or not s ∈ L(G). If s ∈ L(G), the answer to this question may be either a parse tree or a derivation. For example, in Section 2.4 we have seen a grammar for sequences of s's:

S → SS | s

A parse tree of an expression of this language is a value of the datatype S (or a value of several variants of that type, depending on what you want to do with the result of the parser), which is defined by

data S = Beside S S | Single

A parser for expressions could be implemented as a function of the following type:

type Parser = String → S — preliminary

For parsing substructures, a parser can call other parsers, or call itself recursively. These calls do not only have to communicate their result, but also the part of the input string that is left unprocessed. For example, when parsing the string sss, a parser will first build a parse tree Beside Single Single for ss, and only then build a complete parse tree

Beside (Beside Single Single) Single

using the unprocessed part s of the input string. As this cannot be done using a global variable, the unprocessed input string has to be part of the result of the parser. The two results can be paired. A better definition for the type Parser is hence:

type Parser = String → (S, String) — still preliminary

Any parser of type Parser returns an S and a String. However, for different grammars we want to return different parse trees: the type of tree that is returned depends on the grammar for which we want to parse sentences. Therefore it is better to abstract from the type S, and to turn the parser type into a polymorphic type. The type Parser is parametrised with a type a, which represents the type of parse trees:

type Parser a = String → (a, String) — still preliminary

For example, a parser that returns a structure of type Oak (whatever that is) now has type Parser Oak. A parser that parses sequences of s's has type Parser S.

We might also define a parser that does not return a value of type S, but instead the number of s's in the input sequence. This parser would have type Parser Int. Another instance of a parser is a parse function that recognises a string of digits, and returns the number represented by it as a parse 'tree'. In this case the function is also of type Parser Int. Finally, a recogniser that either accepts or rejects sentences of a grammar returns a boolean value, and will have type Parser Bool.

Until now, we have assumed that every string can be parsed in exactly one way. In general, this need not be the case: it may be that a single string can be parsed in various ways, or that there is no way to parse a string. For example, the string "sss" has the following two parse trees:

Beside (Beside Single Single) Single
Beside Single (Beside Single Single)

As another refinement of the type Parser, instead of returning one parse tree (and its associated rest string), we let a parser return a list of trees. Each element of the result consists of a tree, paired with the rest string that was left unprocessed after parsing. The type definition of Parser therefore becomes:

type Parser a = String → [(a, String)] — useful, but still suboptimal

If there is just one parse, the result of the parse function is a singleton list. If no parse is possible, the result is an empty list. In case of an ambiguous grammar, the result consists of all possible parses.

This method for parsing is called the list of successes method, described by Wadler [13]. It can be used in situations where in other languages you would use so-called backtracking techniques. In the Bird and Wadler textbook it is used to solve combinatorial problems like the eight queens problem [3]. If only one solution is required rather than all possible solutions, you can take the head of the list of successes. Thanks to lazy evaluation, not all elements of the list are computed if only the first value is needed, so there will be no loss of efficiency. Lazy evaluation provides a backtracking approach to finding the first solution.

Parsers with the type described so far operate on strings, that is lists of characters. There is however no reason for not allowing parsing strings of elements other than characters. You may imagine a situation in which a preprocessor prepares a list of tokens (see Section 2.8), which is subsequently parsed. To cater for this situation we refine the parser type once more: we let the type of the elements of the input string be an argument of the parser type. Calling the type of symbols s, and as before the result type a, the type of parsers becomes

type Parser s a = [s] → [(a, [s])]

or, if you prefer meaningful identifiers over conciseness:

type Parser symbol result = [symbol] → [(result, [symbol])]

We will use this type definition in the rest of this chapter. This type is defined in Listing 3.1, the first part of our parser library.

-- The type of parsers
type Parser symbol result = [symbol] → [(result, [symbol])]

Listing 3.1: ParserType.hs

The list of successes appears in the result type of a parser. Each element of this list is a possible parsing of (an initial part of) the input. We will hardly use the full generality provided by the Parser type: the type of the input s (or symbol) will almost always be Char.

3.2. Elementary parsers

The goals of this section are:

• introduce some very simple parsers for parsing sentences of grammars with rules of the form:

A → ε
A → a
A → x

where x is a sequence of terminals;

• show how one can construct useful functions from simple, trivially correct functions by means of generalisation and partial parametrisation.

This section defines parsers that can only be used to parse fixed sequences of terminal symbols. For a grammar with a production that contains nonterminals in its righthand side we need techniques that will be introduced in the following section.

We will start with a very simple parse function that just recognises the terminal symbol a. The type of the input string symbols is Char in this case, and as a parse 'tree' we also simply use a Char:

symbola :: Parser Char Char
symbola []                    = []
symbola (x : xs) | x == ’a’   = [(’a’, xs)]
                 | otherwise  = []
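To see such a parser in action it is convenient to have a small driver function. The following helper is not part of the library in these notes (the name run is made up here): it returns the first result that consumed all of the input, exploiting the list of successes method described above.

run :: Parser s a → [s] → Maybe a
run p xs = case [a | (a, rest) ← p xs, null rest ] of
             (a : _) → Just a
             []      → Nothing

-- For example, run symbola "a" yields Just ’a’, whereas
-- run symbola "b" and run symbola "ab" both yield Nothing,
-- because no parse consumes the complete input.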

rather than defining a lot of closely related functions. given a symbol. drop n xs)] | otherwise = [] where n = length k failp :: Parser s a failp xs = [] succeed :: a → Parser s a succeed r xs = [(r . In the same fashion. because now we can return an empty list if no parsing is possible (because the input is empty. we can write parsers that recognise other symbols. it is better to abstract from the symbol to be recognised by making it an extra argument of the function. xs)] — Applications of elementary parsers digit :: Parser Char Char digit = satisfy isDigit Listing 3. or does not start with an a). so that it can be used in other applications than character oriented ones. Parser combinators — Elementary parsers symbol :: Eq s ⇒ s → Parser s s symbol a [] = [] symbol a (x : xs) | x = = a = [(x . 46 . This is why two arguments appear in the definition of symbol .3. returns a parser for that symbol. The only prerequisite is that the symbols to be parsed can be tested for equality. we obtain the function symbol that is given in Listing 3. As always. xs)] | otherwise = [] satisfy :: (s → Bool ) → Parser s s satisfy p [] = [] satisfy p (x : xs) | p x = [(x . but also on lists of symbols of other types. Furthermore.2: ParserType. The function symbol is a function that. A parser on its turn is a function too.hs The list of successes method immediately pays off.2. the function can operate on lists of characters. In Haskell. Using these generalisations. xs)] | otherwise = [] token :: Eq s ⇒ [s] → Parser s [s] token k xs | k = = take n xs = [(k . this is indicated by the Eq predicate in the type of the function.

this function is not confined to strings of characters. a useful parser is one that recognises a fixed string of symbols. xs)] A more useful variant is the function succeed . we can parametrise the function with a condition that the symbol should fulfill. It does not consume any input. it is defined in Listing 3. Elementary parsers We will now define some elementary parsers that can do the work traditionally taken care of by lexical analysers (see Section 2. and in the unit type 47 .3. Of course. such as while or switch. epsilon :: Parser s () epsilon xs = [((). which fails to recognise any symbol on the input string. effectively making it into a family of functions. For example. As in the case of the symbol function we have parametrised this function with the string to be recognised. fixed value (or ‘parse tree’. In this tradition. It is defined in Listing 3.8). This generalised function is for example useful when we want to parse digits (characters in between ’0’ and ’9’): digit :: Parser Char Char digit = satisfy isDigit where the function isDigit is the standard predicate that tests whether or not a character is a digit: isDigit :: Char → Bool isDigit x = ’0’ x ∧ x ’9’ In books on grammar theory an empty string is often called ‘ε’. Thus the function satisfy has a function s → Bool as argument. It is defined in Listing 3. Note that we cannot define symbol in terms of token: the two functions have incompatible types. depending on the input. we will define a function epsilon that ‘parses’ the empty string.2. Where symbol tests for equality to a specific value. if you can call the result of processing zero symbols a parse tree). the function satisfy tests for compliance with this predicate.2. in that it recognises a list of symbols instead of a single symbol. We will call this function token. which also doesn’t consume input. Dual to the function succeed is the function failp. and hence always returns an empty parse tree and unmodified input. we do need an equality test on the type of values in the input string. As the result list of a parser is a ‘list of successes’. A zero-tuple can be used as a result value: () is the only value of the type () – both the type and the value are pronounced unit. return different parse results. but always returns a given. Instead of specifying a specific symbol. However. the type of token is: token :: Eq s ⇒ [s] → Parser s [s] The function token is a generalisation of the symbol function.2.2. Another generalisation of symbol is a function which may.

powerful combinator language (a domain specific language) for the purpose of parsing. The goals of this section are: • show how parsers can be constructed directly from the productions of a grammar. 3. More interesting are parsers for nonterminal symbols.3. The kind of productions for which parsers will be constructed are A→x |y A→x y where x and y are sequences of nonterminal or terminal symbols. see Section 2. Exercise 3. • understand and use the concept of semantic functions in parsers.1.1. the function symbol can be defined as an instance of satisfy.2. It is convenient to construct these parsers by partially parametrising higherorder functions. Parser combinators case of failure there are no successes. This implies that we want to have a way to say that a parser consists of several alternative parsers.3. the result list should be empty.2. Therefore the function failp always returns the empty list of successes.3. see also Section 2. the first rule says that in order to parse an expression. How can this be done? Exercise 3. Let us have a look at the grammar for expressions again. Define the function epsilon using succeed . Since satisfy is a generalisation of symbol . An expression can be parsed according to any of the two rules for E. • show how we can construct a small. Parser combinators Using the elementary parsers from the previous section. Note the difference with epsilon. which does have one element in its list of successes (albeit an empty one). Furthermore. Define a function capital :: Parser Char Char that parses capital letters. 48 . Do not confuse failp with epsilon: there is an important difference between returning one solution (which contains the unchanged input as ‘rest’ string) and not returning a solution at all! Exercise 3. parsers can be constructed for terminal symbols from a grammar. It is defined in Listing 3.5: E →T +E |T T →F *T |F F → Digs | ( E ) where Digs is a nonterminal that generates the language of sequences of digits.3.

we should first parse a term, then a terminal symbol +, and then an expression. This implies that we want to have a way to say that a parser consists of several parsers that are applied sequentially.

So important operations on parsers are sequential and alternative composition: a more complex construct can consist of a simple construct followed by another construct (sequential composition), or by a choice between two constructs (alternative composition). These operations correspond directly to their grammatical counterparts. We will develop two functions for this, which for notational convenience are defined as operators: <∗> for sequential composition, and <|> for alternative composition. The names of these operators are chosen so that they can be easily remembered: <∗> 'multiplies' two constructs together, and <|> can be pronounced as 'or'. Be careful, though, not to confuse the <|>-operator with Haskell's built-in construct |, which is used to distinguish cases in a function definition or as a separator for the constructors of a datatype.

Priorities of these operators are defined so as to minimise parentheses in practical situations:

infixl 6 <∗>
infixr 4 <|>

So <∗> has a higher priority – i.e., it binds stronger – than <|>.

Both operators take two parsers as argument, and return a parser as result. By again combining the result with other parsers, you may construct even more involved parsers. In the definitions in Listing 3.3, the functions operate on parsers p and q. Apart from the arguments p and q, the function operates on a string, which can be thought of as the string that is parsed by the parser that is the result of combining p and q.

We start with the definition of operator <∗>. For sequential composition, p must be applied to the input first. After that, q is applied to the rest string of the result. The first parser, p, returns a list of successes, each of which contains a value and a rest string. The second parser, q, should be applied to the rest string, returning a second value. Therefore we use a list comprehension, in which the second parser is applied in all possible ways to the rest string of the first parser:

(p <∗> q) xs = [ (combine r1 r2, zs)
               | (r1, ys) ← p xs
               , (r2, zs) ← q ys
               ]

The rest string of the parser for the sequential composition of p and q is whatever the second parser q leaves behind as rest string.

However. Apart from ‘sequential composition’ we need a parser combinator for representing ‘choice’. and the second parser a value of type b. In the past.hs Now.3. ys) ← p xs . how should the results of the two parsings be combined? We could. and the combined parser returns the value that is obtained by applying the function to the value. we have chosen for a different approach. the second parser a value. like characters.3: ParserCombinators. both p1 and p2 return lists of possible parsings. ys) ← p xs ] — Applications of parser combinators newdigit :: Parser Char Int newdigit = f <$> digit where f c = ord c − ord ’0’ Listing 3. of course. we have interpreted the word ‘tree’ liberally: simple values. Thanks to the list of successes method. The first parser returns a function. a straightforward choice for the combine function would be function application. the result type of a parser may be a function type. To obtain all possible parsings when applying p1 or p2 . zs ) ← q ys ] (<$>) :: (a → b) → Parser s a → Parser s b (f <$> p) xs = [ (f y. We will now also accept functions as parse trees. we only need to concatenate these two lists. zs ) | (f . That is. If the first parser that is combined by <∗> would return a function of type b → a. For this. we have the parser combinator operator <|>. which nicely exploits the ability of functional languages to manipulate functions. ys) | ( y. 50 . Parser combinators — Parser combinators (<|>) :: Parser s a → Parser s a → Parser s a (p <|> q) xs = p xs + q xs + (<∗>) :: Parser s (b → a) → Parser s b → Parser s a (p <∗> q) xs = [ (f x . That is exactly the approach taken in the definition of <∗> in Listing 3. may also be used as a parse ‘tree’. parametrise the whole thing with an operator that describes how to combine the parts (as is done in the zipWith function). ( x . The function combine should combine the results of the two parse trees recognised by p and q.3.
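A tiny example may help to see how <$> and <∗> work together. The following parser is only an illustration added here (it is not part of the listings): it recognises two digits and combines them into one number, using newdigit from Listing 3.3.

twoDigits :: Parser Char Int
twoDigits = f <$> newdigit <∗> newdigit
  where f t u = 10 ∗ t + u

-- Applied to "42x" this parser yields [(42, "x")]:
-- the first newdigit produces 4, the second produces 2,
-- and the semantic function f combines them into 42.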

you will get a stack overflow! If you apply sequenceOfS to a string. the <$> operator is used to build a certain value during parsing (in the case of parsing a computer program this value may be the generated code.5 51 . The most important parser combinators are <∗> and <|>. In practice. The parser might work well in that it consumes symbols from the input adequately (leaving the unused symbols as rest-string in the tuples in the list of successes). The problem stems from the fact that the underlying grammar is leftrecursive. because the combinator resembles the operator that is used for normal function application in Haskell: f $ x = f x ..12 is just a variation of <∗>. In a case like this. using the parser combinator <$> it is applied to the result part of the digit parser. see Section 2. It takes a function and a parser as argument.> in exercise 3. is defined as follows: sequenceOfS :: Parser Char S sequenceOfS = Beside <$> sequenceOfS <∗> sequenceOfS <|> const Single <$> symbol ’s’ But if you try to run this function. but ‘postprocesses’ the result using the function.e. a systematically constructed parser using parser combinators will exhibit the problem that it loops. but returns the result as an integer. we may need a parser that recognises one digit character. etc. a parser that recognises one digit is defined using the function satisfy: digit = satisfy isDigit. we can modify the parser digit that was defined above: newdigit :: Parser Char Int newdigit = f <$> digit where f c = ord c − ord ’0’ The auxiliary function f determines the ordinal number of a digit character. i. In some applications. For any left-recursive grammar.3. or a list of all variables with their types. The parser combinator <. It is an infix operator: infixl 7 <$> Using this postprocessing parser combinator.). the result is a parser that recognises the same string as the original parser. a value of type S . A parser for the SequenceOfS grammar that returns the abstract syntax tree of the input. We use the $ sign in the name of the combinator.3. which loops. Sometimes we are not quite satisfied with the result value of a parser. However. The definition of <$> is given in Listing 3.3. we can use a new parser combinator: <$>. instead of a character.6. Parser combinators By combining parsers with parser combinators we can construct new parsers. the first thing it does is to apply itself to the same string. For example. Put more generally: using <$> we can add semantic functions to parsers. but the result value might need some postprocessing. in Section 2.

1. 52 . The resulting grammar is used to obtain the following parser: sequenceOfS :: Parser Char SA sequenceOfS = const ConsS <$> symbol ’s’ <∗> parseZ <|> const SingleS <$> symbol ’s’ where parseZ = ConsZ <$> sequenceOfS <∗> parseZ <|> SingleZ <$> sequenceOfS This example is a direct translation of the grammar obtained by using the removing left recursion grammar transformation. 2. 3. Give the datatype English that corresponds to this grammar.7. Test your function with the sentence they are flying planes. Give its type and show its results on inputs [] and x : xs. Test your function with the palindromes cPal 1 = "abaaba" and cPal 2 = "baaab".5. Compare the result to your answer of Exercise 2. Define a parser for Booleans. i. f = λ → c for some term c.8 (no answer provivded).10.3. Define a parser english that returns parse trees for the English language. Consider the grammar for a part of the English language that is given in Exercise 2. Consider the grammar for palindromes that you have constructed in Exercise 2. Consider the parser (:) <$> symbol ’a’..21. Exercise 3. Exercise 3. Consider the parser (:) <$> symbol ’a’ <∗> p.46.3. There exists a much simpler parser for parsing sequences of s’s.6. Exercise 3. e. Compare the results with your answer to Exercise 2. 2.1. Define a parser palin 2 that returns parse trees for palindromes.7. Exercise 3. Exercise 3.4.9.46. Give its type and show its results on inputs [] and x : xs. Define a parser palina that counts the number of a’s occurring in a palindrome. Give the datatype Pal 2 that corresponds to this grammar. Parser combinators we have shown how to remove the left recursion in the SequenceOfS grammar. Exercise 3. Prove for all f :: a → b that f <$> succeed a = succeed (f a) In the sequel we will often use this rule for constant functions f . Define parsers for each of the basic languages defined in Section 2. 1. Exercise 3.

1.6. We obtain the following Haskell datatype: data Parentheses = Match Parentheses Parentheses | Empty 53 .13.3. for example.> and <$>. Can you describe your observations in an easy-to-remember. you can observe that <$> is in a sense a special case of <∗>.15. as for example symbol ’(’ is not a parser returning a function.44: S →(S )S |ε This grammar can be directly translated to a parser.3.3. The value returned by the combined parser is a tuple containing the results of the two component parsers. When defining the priority of the <|> operator with the infixr keyword. parens :: Parser Char ??? — ill-typed parens = symbol ’(’ <∗> parens <∗> symbol ’)’ <∗> parens <|> epsilon However. what function should we use? This depends on the kind of value that we want as a result of the parser.14.16. Exercise 3. using the parser combinators <∗> and <|>. instead of a character. Compare the type of <$> with the type of the standard function map. Consider. For this purpose we introduce an abstract syntax. Define <. But we can postprocess the parser symbol ’(’ so that. If you examine the definitions of <∗> and <$> in Listing 3. Matching parentheses: an example Using parser combinators. catchy phrase? Exercise 3. The term ‘parser combinator’ is in fact not an adequate description for <$>.12. it is often fairly straightforward to construct a parser for a language for which you have a grammar. What is the type of this parser combinator? Exercise 3. Can you think of a better word? Exercise 3. A nice result would be a treelike description of the parentheses that are parsed. and <|> when | appears in a production (or when there is more than one production for a nonterminal). the grammar that you wrote in Exercise 2.> that combines two parsers. this function is not correctly typed: the parsers in the first alternative cannot be composed using <∗>. We use <∗> when symbols are written next to each other.11. we also specified that the operator associates to the right. Define a parser combinator <.3. So. this parser does return a function. Can you define <$> in terms of <∗>? 3. for the parentheses grammar. Parser combinators Exercise 3. Define <∗> in terms of <. see Section 2. Why is this a better choice than association to the left? Exercise 3.> in terms of <∗> and <$>.

nrofpars :: Parentheses → Int nrofpars (Match pl pr ) = 2 + nrofpars pl + nrofpars pr nrofpars Empty =0 Using the datatype Parentheses. we can return other things than parse trees. As an example we construct a parser that calculates the nesting depth of nested parentheses."(())()"). We then obtain the definition of parens in Listing 3.4. which is defined by induction on the datatype Parentheses. (2. (0."()(())()")] ? nesting "())" [(1. the sentence ()() is represented by Match Empty (Match Empty Empty) Suppose we want to calculate the number of parentheses in a sentence.3. (1. see the function nesting defined in Listing 3. we can add ‘semantic functions’ to the parser."())")] 54 .4."()"). By varying the function used as a first argument of <$> (the ‘semantic function’). Parser combinators data Parentheses = Match Parentheses Parentheses | Empty deriving Show open = symbol ’(’ close = symbol ’)’ parens :: Parser Char Parentheses parens = (λ x y → Match x y) <$> open <∗> parens <∗> close <∗> parens <|> succeed Empty nesting :: Parser Char Int nesting = (λ x y → max (1 + x ) y) <$> open <∗> nesting <∗> close <∗> nesting <|> succeed 0 Listing 3.[]).hs For example. A session in which nesting is used may look like this: ? nesting "()(())()" [(2.")"). The number of parentheses is calculated by the function nrofpars. (0.4: ParseParentheses.
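In the same spirit as nesting, the number of parentheses can also be computed during parsing, without first building a Parentheses tree and then applying nrofpars. The following definition is a sketch added here for illustration (the name nrofparens is made up), using only the combinators introduced above:

nrofparens :: Parser Char Int
nrofparens = (λ_ x _ y → 2 + x + y) <$> open <∗> nrofparens <∗> close <∗> nrofparens
         <|> succeed 0

Since this parser has exactly the same structure as parens, it produces the same successes, except that each Parentheses tree t is replaced by the number nrofpars t.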

Exercise 3.17. What is the type of the function f = λx y → Match x y which appears in function parens in Listing 3.4? What is the type of the parser open? Using the type of <$>, what is the type of f <$> open? Can f <$> open be used as a left hand side of <∗> parens? What is the type of the result?

Exercise 3.18. What is a convenient way for <∗> to associate? Does it?

Exercise 3.19. It is fairly simple to test whether a given string belongs to the language that is parsed by a given parser: when there is a syntax error in the argument, there are no solutions with empty rest string. Write a function test that determines whether or not a given string belongs to the language parsed by a given parser.

3.4. More parser combinators

In principle you can build parsers for any context-free language using the combinators <∗> and <|>, but in practice it is easier to have some more parser combinators available. In traditional grammar formalisms, additional symbols are used to describe for example optional or repeated constructions. Consider for example the BNF formalism, in which originally only sequential and alternative composition can be used (denoted by juxtaposition and vertical bars, respectively), but which was later extended to EBNF to also allow for repetition, denoted by a star. The goal of this section is to show how the set of parser combinators can be extended.

3.4.1. Parser combinators for EBNF

It is very easy to make new parser combinators for EBNF. As a first example we consider repetition. Given a parser p for a construction, many p constructs a parser for zero or more occurrences of that construction:

many :: Parser s a → Parser s [a]
many p = (:) <$> p <∗> many p <|> succeed []

So the EBNF expression P∗ is implemented by many P. The function (:) is just the cons-operator for lists: it takes a head element and a tail list and combines them. The order in which the alternatives are given only influences the order in which solutions are placed in the list of successes.

For example, the many combinator can be used in parsing a natural number:

natural :: Parser Char Int
natural = foldl f 0 <$> many newdigit
  where f a b = a ∗ 10 + b
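The list of successes of natural follows directly from the definitions above. For instance, applied to "123abc" it produces, in this order, the complete number first and ever shorter prefixes afterwards (the exact rendering of the pairs depends on the interpreter):

? natural "123abc"
[(123, "abc"), (12, "3abc"), (1, "23abc"), (0, "123abc")]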

Parser combinators — EBNF parser combinators option :: Parser s a → a → Parser s a option p d = p <|> succeed d many :: Parser s a → Parser s [a] many p = (:) <$> p <∗> many p <|> succeed [] many 1 :: Parser s a → Parser s [a] many 1 p = (:) <$> p <∗> many p pack :: Parser s a → Parser s b → Parser s c → Parser s b pack p r q = (λ x → x ) <$> p <∗> r <∗> q listOf :: Parser s a → Parser s b → Parser s [a] listOf p s = (:) <$> p <∗> many ((λ x → x ) <$> s <∗> p) — Auxiliary functions first :: Parser s b → Parser s b first p xs | null r = [] | otherwise = [head r ] where r = p xs greedy.5: EBNF.3. many greedy 1 = first . greedy 1 :: Parser s b → Parser s [b] greedy = first .hs 56 . many 1 Listing 3.
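As a small illustration of the combinators in Listing 3.5 (an example added here, not part of the listing; the name naturals is made up), pack and listOf can be combined into a parser for a parenthesised, comma-separated list of natural numbers:

naturals :: Parser Char [Int ]
naturals = pack (symbol ’(’) (listOf natural (symbol ’,’)) (symbol ’)’)

-- Applied to "(1,23,456)" this yields the single success
-- [([1, 23, 456], "")]: the closing parenthesis rules out all
-- parses in which natural stops halfway through a number.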

the natural parser also accepts empty input as a number. For example. It is defined in Listing 3. In situations where from the way the grammar is built we can predict that it is hopeless to try non-greedy results of many. many greedy 1 = first .7. we had better use the many 1 parser combinator. and corresponds to the EBNF expression P + . but which also succeeds if that construct is not present in the input string. also less greedy parsings of the identifier are tried – in vain. You will give a better definition of identifier in Exercise 3. Another combinator from EBNF is the option combinator P ?. if we define a parser for identifiers by identifier = many 1 (satisfy isAlpha) a single word may also be parsed as two identifiers. It has an additional argument: the value that should be used as result in case the construct is not present. we can create a special ‘take all or nothing’ version of many: greedy = first .5. It is a kind of ‘default’ value. More parser combinators Defined in this way. which accepts one or more occurrences of a construction. first :: Parser a b → Parser a b first p xs | null r = [] | otherwise = [head r ] where r = p xs Using this function. but if parsing fails elsewhere in the sentence. It does so by taking the first element of the list of successes. The definition is given in Listing 3. By the use of the option and many functions.3.5. that transforms a parser into a parser that only returns the first possible parsing. and returns a parser that recognises the same construct. Caused by the order of the alternatives in the definition of many (succeed [] appears as the second alternative). many 1 If we compose the first function with the option parser combinator: obligatory p d = first (option p d ) we get a parser which must accept a construction if it is present.27. 57 . a large amount of backtracking possibilities are introduced.4. but which does not fail if it is not present. we can define a parser transformer first. It takes a parser as argument. This is not always advantageous. If this is not desired. the ‘greedy’ parsing. which accumulates as many letters as possible in the identifier is tried first. see Section 2.
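The difference between many and greedy can be seen in a small session; the results shown below follow from the definitions above (this example is added here for illustration):

? many (symbol ’a’) "aa"
[("aa", ""), ("a", "a"), ("", "aa")]
? greedy (symbol ’a’) "aa"
[("aa", "")]

So greedy commits to the longest possible repetition, cutting off the backtracking alternatives that many keeps around.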

4.’) A somewhat more complicated variant of the function listOf is the case where the separators carry a meaning themselves. so e1 ⊕ e2 ⊕ e3 ⊕ e4 = 58 . where the elements are separated by some symbol. but there is no need to leave it at that. most often some sort of parentheses. it constructs a parser for the enclosed body. Separators The combinators many.3. in the case of chainl it is applied left-to-right. The definitions look quite complicated. it denotes an arbitrary right-associative operator) from right to left. and a closing token. the separators are of no importance. that function is used by chain to combine parse trees for the items. where the operators that separate the subexpressions have to be part of the parse tree. For example. Special cases of this combinator are: parenthesised p = pack (symbol ’(’) p (symbol ’)’) bracketed p = pack (symbol ’[’) p (symbol ’]’) compound p = pack (token "begin") p (token "end") Another frequently occurring construction is repetition of a certain construction.’) semicList p = listOf p (symbol ’. respectively) –. many 1 and option are classical in compiler constructions – there are notations for it in EBNF (·∗ . In the case of chainr the operator is applied right-to-left. The function listOf below generates a parser for a non-empty list. or compound statements (statements separated by semicolons). For the parse trees. but when you look at the underlying grammar they are quite straightforward. as defined in Listing 3.6 (remember that $ is function application: f $ x = f x ). a body. For example. Given a parser for an opening token. semicList :: Parser Char a → Parser Char [a] commaList p = listOf p (symbol ’. The functions chainr and chainl are defined in Listing 3. Suppose we apply operator ⊕ (⊕ is an operator variable. Parser combinators 3. For this case we will develop the functions chainr and chainl . in many languages constructions are frequently enclosed between two meaningless symbols.2. in arithmetical expressions. You may think of lists of arguments (expressions separated by commas). given a parser for the items and a parser for the separators: listOf :: Parser s a → Parser s b → Parser s [a] listOf p s = (:) <$> p <∗> many ((λ x → x ) <$> s <∗> p) Useful instantiations are: commaList. For this case we design a parser combinator pack .5. These functions expect that the parser for the separators returns a function (!). ·+ and ·?.

4. function j of chainl takes an operator and a value. turning them into functions. dual 59 .hs e1 ⊕ (e2 ⊕ (e3 ⊕ e4 )) = ((e1 ⊕) · (e2 ⊕) · (e3 ⊕)) e4 It follows that we can parse such expressions by parsing many pairs of expressions and operators. Such functions are sometimes called dual . see Listing 3. Functions chainl and chainr can be made more efficient by avoiding the construction of the intermediate list of functions. This is done by function chainl .6.3. turning them into functions. If operator ⊕ is applied from left to right. and applying all those functions to the first expression. This is done by function chainr . Note that functions chainl and chainr are very similar. The resulting definitions can be found in the article by Fokker [5]. and returns the function obtained by ‘left’ applying the operator. and then parsing many pairs of operators and expressions. More parser combinators — Chain expression combinators chainr :: Parser s a → Parser s (a → a → a) → Parser s a chainr pe po = h <$> many (j <$> pe <∗> po) <∗> pe where j x op = (x ‘op‘) h fs x = foldr ($) x fs chainl :: Parser s a → Parser s (a → a → a) → Parser s a chainl pe po = h <$> pe <∗> many (j <$> po <∗> pe) where j op x = (‘op‘x ) h x fs = foldl (flip ($)) x fs Listing 3.6: Chains. and applying all those functions to the last expression.6. then e1 ⊕ e2 ⊕ e3 ⊕ e4 = ((e1 ⊕ e2 ) ⊕ e3 ) ⊕ e4 = ((⊕e4 ) · (⊕e3 ) · (⊕e2 )) e1 So such an expression can be parsed by first parsing a single expression (e1 ). see Listing 3. the only difference is that everything is ‘turned around’: function j of chainr takes a value and an operator. and returns the function obtained by ‘right’ applying the operator to the value.
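A small example of chainl in action (added here for illustration; the name sums is made up): a parser that evaluates sums and differences of natural numbers while parsing, instead of building a parse tree.

sums :: Parser Char Int
sums = chainl natural (  const (+) <$> symbol ’+’
                     <|> const (−) <$> symbol ’-’
                      )

-- Because chainl applies the operators left-to-right, the first
-- solution of sums "10-3+4" is (11, ""): (10 − 3) + 4.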

’b’]}? Exercise 3. 2. As an application of psequence. In real programming languages.1. ’b’]. Exercise 3. 1.27.5. What should be changed in order to get only one solution? Exercise 3. Arithmetical expressions The goal of this section is to use parser combinators in a concrete application. The result is a list of integers. or underscore symbol. digit. [’b’].22. [’a’.20. identifiers follow rather flexible rules: the first symbol must be a letter. Define a parser sumParser that recognises digits separated by the character ’+’ and returns the sum of these integers. [’a’. What is the value of many (symbol ’a’) xs for xs ∈ {[].23 (no answer provided). 3. Carefully analyse the semantic functions in the definition of chainl in Listing 3. 3. [’a’]. define a parser combinator psequence that transforms a list of parsers for some type into a parser returning a list of elements of that type.6.3. Exercise 3. Both parsers return a list of solutions. define the function token that was discussed in Section 3. but the symbols that follow (if any) may be a letter.26 (no answer provided).2. Using the parser combinators option.21. We will develop a parser for arithmetical expressions. As another variation on the theme ‘repetition’.25.3. which have the following concrete syntax: E →E +E | E -E | E /E | (E ) | Digs 60 . Define a parser that analyses a string and recognises a list of digits separated by a space character. Parser combinators Exercise 3. ’a’. In what order do the four possible parsings appear in the list of successes? Exercise 3. What is the type of psequence? Also define a combinator choice that iterates the operator <|>. Exercise 3. Define a more realistic parser identifier . many and many 1 define parsers for each of the basic languages defined in Section 2.24. Consider the application of the parser many (symbol ’a’) to the string aaa. Exercise 3.

Besides these productions, we also have productions for identifiers and applications of functions:

E → Identifier | Identifier ( Args )
Args → ε | E (, E)∗

The parse trees for this grammar are of type Expr:

data Expr = Con Int
          | Var String
          | Fun String [Expr]
          | Expr :+: Expr
          | Expr :−: Expr
          | Expr :∗: Expr
          | Expr :/: Expr

You can almost recognise the structure of the parser in this type definition. But in order to account for the priorities of the operators, we will use a grammar with three non-terminals 'expression', 'term' and 'factor': an expression is composed of terms separated by + or −, a term is composed of factors separated by ∗ or /, and a factor is a constant, variable, function call, or expression between parentheses. This grammar appears as a parser in the functions in Listing 3.7.

The first parser, fact, parses factors.

fact :: Parser Char Expr
fact = Con <$> integer
   <|> Var <$> identifier
   <|> Fun <$> identifier <∗> parenthesised (commaList expr)
   <|> parenthesised expr

The first alternative is an integer parser which is postprocessed by the 'semantic function' Con. The second and third alternatives are a variable or function call, depending on the presence of an argument list. In the absence of the latter, the function Var is applied; in its presence, the function Fun. The parse trees for the individual factors are joined by the constructor functions that appear before <$>. For the fourth alternative there is no semantic function, because the meaning of an expression between parentheses is the meaning of the expression.

For the definition of a term as a list of factors separated by multiplicative operators we use the function chainl. Recall that chainl repeatedly recognises its first argument (fact), separated by its second argument (a ∗ or /). We use chainl and not chainr because the operator ’/’ is considered to be left-associative.

The function expr is analogous to term, only with additive operators instead of multiplicative operators, and with terms instead of factors.

7: ExpressionParser. Parser combinators — Type definition for parse tree data Expr = Con Int | Var String | Fun String [Expr ] | Expr :+: Expr | Expr :−: Expr | Expr :∗: Expr | Expr :/: Expr — Parser for expressions with two priorities fact :: Parser Char Expr fact = Con <$> integer <|> Var <$> identifier <|> Fun <$> identifier <∗> parenthesised (commaList expr ) <|> parenthesised expr integer :: Parser Char Int integer = (const negate <$> (symbol ’-’)) ‘option‘ id <∗> natural term :: Parser Char Expr term = chainl fact ( const (:∗:) <$> symbol ’*’ <|> const (:/:) <$> symbol ’/’ ) expr :: Parser Char Expr expr = chainl term ( const (:+:) <$> symbol ’+’ <|> const (:−:) <$> symbol ’-’ ) Listing 3.3.hs 62 .
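The parse trees produced by expr can of course be processed further. The following evaluator is a sketch added here for illustration (it is not part of Listing 3.7): it assumes a simple environment of integer values for variables, and it deliberately does not handle function calls.

type Env = [(String, Int)]

evalExpr :: Env → Expr → Int
evalExpr env (Con n)     = n
evalExpr env (Var x)     = case lookup x env of
                             Just v  → v
                             Nothing → error ("unbound variable: " ++ x)
evalExpr env (Fun f es)  = error ("function calls are not handled: " ++ f)
evalExpr env (e1 :+: e2) = evalExpr env e1 + evalExpr env e2
evalExpr env (e1 :−: e2) = evalExpr env e1 − evalExpr env e2
evalExpr env (e1 :∗: e2) = evalExpr env e1 ∗ evalExpr env e2
evalExpr env (e1 :/: e2) = evalExpr env e1 ‘div‘ evalExpr env e2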

These are: • The operators and associated tree constructors that are used in the second argument of chainl • The parser that is used as first argument of chainl The generalised function will take these two differences as extra arguments: the first in the form of a list of pairs. let us inspect the differences in the definitions of term and expr again.b)" not the first solution of the parser? Modify the functions in Listing 3. there is no need for a separate parser generator (like ‘yacc’). "a*(b+1)". the production rules of the grammar are combined with higher-order functions. the second in the form of a parse function: type Op a = (Char .30. and "a(1.29. "(abc)". Therefore. Generalised expressions This example clearly shows the strength of parsing with parser combinators. Functions that resemble each other are an indication that we should write a generalised function. Explain why and modify the parser in such way that it will be. "-1-a".is parsed as a left-associative operator. Generalised expressions This section generalises the parser in the previous section with respect to priorities. This is not as it should be. Also. Give the parse tree for the expressions "abc". There is no need for a separate formalism for grammars. Why is the parse tree for the expression "a(1.7. the functions can be viewed both as description of the grammar and as an executable parser. where the differences are described using extra arguments. Exercise 3.6.7 in such way that it will be. A function with no arguments such as "f()" is not accepted by the parser. Exercise 3. If there are nine levels of priority. c) = const c <$> symbol s If furthermore we define as shorthand: 63 . a → a → a) gen :: [Op a] → Parser Char a → Parser Char a gen ops p = chainl p (choice (map f ops)) where f (s. with as first argument the function of one priority level lower.28.3. we obtain nine copies of almost the same text. Arithmetical expressions in which operators have more than two levels of priority can be parsed by writing more auxiliary functions between term and expr . Exercise 3. "a*b+1". and . in such a way that + is parsed as a right-associative operator. Modify the functions in Listing 3. The function chainl is used in each definition.b)" 2. 3.6. 1.

f ) <$> p h <$> (p <|> q) = (h <$> p) <|> (h <$> q) h <$> (p <∗> q) = ((h. (’-’. (:−:))] then expr and term can be defined as partial parametrisations of gen: expr = gen addis term term = gen multis fact By expanding the definition of term in that of expr we obtain: expr = addis ‘gen‘ (multis ‘gen‘ fact) which an experienced functional programmer immediately recognises as an application of foldr : expr = foldr gen fact [addis. More advanced implementations are discussed elsewhere. Summary This chapter shows how to construct parsers from simple combinators. The definitions are summarised in Listing 3. (’/’. Also. Prove the following laws h <$> (f <$> p) = (h . The very compact formulation of the parser for expressions with an arbitrary number of priority levels is possible because the parser combinators can be used together with the existing mechanisms for generalisation and partial parametrisation in Haskell. How should the parser of Section 3.8. (:∗:)). the levels of priority need not be coded explicitly with integers. The only thing that matters is the relative position of an operator in the list of ‘list with operators of the same priority’.2) (3. Furthermore. (:+:)). the insertion of new priority levels is very easy.6 be adapted to also allow raising an expression to the power of an expression? Exercise 3. It shows how a small parser combinator library can be a powerful tool in the construction of parsers. (:/:))] addis = [(’+’. multis] From this definition a generalisation to more levels of priority is simply a matter of extending the list of operator-lists. 3. this chapter gives a rather basic implementation of the parser combinator library.7. Exercises Exercise 3.3.31.32.) <$> p) <∗> q (3.3) 64 . Contrary to conventional approaches.1) (3. Parser combinators multis = [(’*’.
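The generalisation with gen is not tied to the Expr datatype: because gen is parametrised over the result type, the same list-of-priority-levels idea also gives an evaluating parser. The following sketch (added here for illustration; the name calc is made up) reuses gen and natural, so it handles plain naturals only, without variables or parentheses:

calc :: Parser Char Int
calc = foldr gen natural [ [(’+’, (+)), (’-’, (−))]
                         , [(’*’, (∗)), (’/’, div)]
                         ]

-- Its first solution on "2+3*4" is (14, ""), because the inner
-- priority level binds the multiplication before the addition.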

an expression and a semicolon. (:−:))] Listing 3. that is numbers like 12. Also integers are acceptable. 65 . Define a parser for floating point numbers.23. Assuming the comma is an associative operator. Exercise 3. (:/:))] addis = [(’+’. Exercise 3. Exercise 3. Outline the construction of a parser for Java programs. Define a parser for (simplified) Java statements. Test your function with the concrete mirror-palindromes cMir 1 and cMir 2 .34 and -123. Define a parser for fixed-point numbers.38 (no answer provided). we can give the following abstract syntax for bit-lists: data BitList = SingleB Bit | ConsB Bit BitList Define a combinator parser pBitList that transforms a concrete representation of a bit-list into an abstract one. (’-’. Define a combinator parser pMir that transforms a concrete representation of a mirror-palindrome into an abstract one. (’/’. (:+:)).456.33. Define a parser for Java assignments that consist of a variable.7.39 (no answer provided). multis] multis = [(’*’.37. Consider your answer to Exercise 2. (:∗:)).25. an = sign. Exercise 3. a → a → a) fact :: Parser Char Expr fact = Con <$> integer <|> Var <$> identifier <|> Fun <$> identifier <∗> parenthesised (commaList expr ) <|> parenthesised expr gen :: [Op a] → Parser Char a → Parser Char a gen ops p = chainl p (choice (map f ops)) where f (s.35. Exercise 3. Test your function with the concrete bit-lists cBitList 1 and cBitList 2 .3.hs Exercise 3. integer) exponent. Notice that the part following the decimal point looks like an integer. Exercises — Parser for expressions with aribitrary many priorities type Op a = (Char .36. c) = const c <$> symbol s expr :: Parser Char Expr expr = foldr gen fact [addis. but has a different semantics! Exercise 3.34.8: GExpressionParser. which are fixed point numbers followed by an optional E and an (positive or negative. Consider your answer to Exercise 2.


4. Grammar and Parser design

The previous chapters have introduced many concepts related to grammars and parsers. The goal of this chapter is to review these concepts, and to show how they are used in the design of grammars and parsers.

The design of a grammar and parser for a language consists of several steps: you have to

1. give example sentences of the language for which you want to design a grammar and a parser,
2. give a grammar for the language for which you want to have a parser,
3. test that the grammar can indeed describe the example sentences,
4. analyse this grammar to find out whether or not it has some desirable properties,
5. possibly transform the grammar to obtain some of these desirable properties,
6. decide on the type of the parser: Parser a b, that is, decide on both the input type a of the parser (which may be the result type of a scanner), and the result type b of the parser,
7. construct a basic parser,
8. add semantic functions,
9. test that the parser can parse the example sentences you have given in the first step, and that the parser returns what you expect.

We will describe and exemplify each of these steps in detail in the rest of this section.

As a running example we will construct a grammar and parser for travelling schemes for day trips, of the following form:

Groningen 8:37 9:44 Zwolle 9:49 10:15 Utrecht 10:21 11:05 Den Haag

We might want to do several things with such a schema, for example:

1. compute the net travel time, i.e. the travel time minus the waiting time (2 hours and 17 minutes in the above example),

2. compute the total time one has to wait on the intermediate stations (11 minutes).

This chapter defines functions to perform these computations.

4.1. Step 1: Example sentences for the language

We have already given an example sentence above:

Groningen 8:37 9:44 Zwolle 9:49 10:15 Utrecht 10:21 11:05 Den Haag

Other example sentences are:

Utrecht Centraal 10:25 10:58 Amsterdam Centraal
Assen

4.2. Step 2: A grammar for the language

The starting point for designing a parser for your language is to define a grammar that describes the language as precisely as possible. It is important to convince yourself of the fact that the grammar you give really generates the desired language, since the grammar will be the basis for grammar transformations, which might turn the grammar into a set of incomprehensible productions.

For the language of travelling schemes, we can give several grammars. The following grammar focuses on the fact that a trip consists of zero or more departures and arrivals:

TS        → TS Departure Arrival TS | Station
Station   → Identifier+
Departure → Time
Arrival   → Time
Time      → Nat : Nat

where Identifier and Nat have been defined in Section 2.3.1. So a travelling scheme is a sequence of departure and arrival times, separated by stations.

Another grammar focuses on changing at a station:

TS → Station Departure (Arrival Station Departure)∗ Arrival Station
   | Station

So each travelling scheme starts and ends at a station, and in between there is a list of intermediate stations. Note that a single station is also a travelling scheme with this grammar.

4.3. Step 3: Testing the grammar

Both grammars we have given in step 2 describe the example sentences given in step 1. The derivation of these sentences using these grammars is easy.

4.4. Step 4: Analysing the grammar

To parse sentences of a language efficiently, we want to have an unambiguous grammar that is left-factored and not left recursive. Furthermore, we might desire other properties of our grammar. So a first step in designing a parser is analysing the grammar, and determining which properties are (not) satisfied. We have not yet developed tools for grammar analysis (we will do so in the chapter on LL(1) parsing), but for some grammars it is easy to detect some properties.

The first example grammar is left and right recursive: the first production for TS starts and ends with TS. The sequence Departure Arrival is an associative separator in the generated language. These properties may be used for transforming the grammar. Since we don't mind about right recursion, we will not make use of the fact that the grammar is right recursive. The other properties will be used in grammar transformations in the following subsection.

4.5. Step 5: Transforming the grammar

Since the sequence Departure Arrival is an associative separator in the generated language, the productions for TS may be transformed into:

TS → Station | Station Departure Arrival TS        (4.1)

Thus we have removed the left recursion in the grammar. Both productions for TS start with the nonterminal Station, so TS can be left factored. The resulting productions are:

TS → Station Z
Z  → ε | Departure Arrival TS

We can also apply equivalence (2.1) to the two productions for TS from (4.1), and obtain the following single production:

TS → (Station Departure Arrival )∗ Station        (4.2)

So which productions do we take for TS? This depends on what we want to do with the parsed sentences. Depending on the parser we want to obtain, we will show several choices in the next section.

the total waiting time. Int suffices for the result type. say a datatype TS . Int. data TS 1 = Single 1 Station | Cons 1 Station Time Time TS 1 where Station and Time are defined above. Int (total waiting time). but in some cases it might be more efficient than the third alternative. and a nicely printed version of the travelling scheme. Char or tokens Token. respectively. • define an abstract syntax for travelling schemes. For the input type we can choose between at least two possibilities: characters. There are several ways to define an abstract syntax for travelling schemes. The type of tokens can be chosen as follows: data Token = Station Token Station | Time Token Time type Station = String type Time = (Int. The first alternative parses the input three times. The first abstract syntax corresponds to definition (4. Time. Step 6: Deciding on the types We want to write a parser for travel schemes. we may do several things: • define three parsers. ts :: Parser Char ? ts :: Parser Token ? For the result type we have many choices. Grammar and Parser design 4.2). respectively.1) of grammar TS . If we want to compute the total travelling time. that is. String) as result type. Int) We will construct a parser for both input types in the next subsection.6. • define a single parser with the triple (Int. and define three functions on TS that compute the desired results. The second alternative is hard to extend if we want to compute something extra. A second abstract syntax corresponds to the grammar for travelling schemes defined in (4. type TS 2 = ([(Station. and is rather inefficient compared with the other alternatives.4. If we just want to compute the total travelling time. So ts has one of the following two types. and String (nicely printed version) as result type. The third alternative needs an abstract syntax. Station) 70 . we want to write a function ts of type ts :: Parser ? ? The question marks should be replaced by the input type and the result type. with Int (total travelling time). Time)].
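To see how such a representation might be used, here is a small helper, not taken from the text, that computes the total travelling time in minutes from the TS 2 representation. It assumes the type synonyms given above, with Time a pair of hours and minutes, and writes TS 2 as TS2:

-- Assumed helper (not in the text): total travelling time, in minutes,
-- for a travelling scheme in the TS2 representation.
travelTime :: TS2 -> Int
travelTime (trips, _) =
  sum [ (ah - dh) * 60 + (am - dm) | (_, (dh, dm), (ah, am)) <- trips ]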

so if we use grammar (4. We use the following replacement rules.7. Station) Which abstract syntax should we take? Again. Since TS 2 and TS 1 combine departure and arrival times in a tuple. etc. A third abstract syntax corresponds to the second grammar defined in Section 4. TS 3 is useful when we want to compute waiting times since it combines arrival and departure times in one constructor. they are convenient to use when computing travelling times. The result type of the parsing function ts may be one of types mentioned earlier (Int. Time.1) for travelling schemes. Of course.4. we use TS 1 for the abstract syntax. this depends on what we want to do with the abstract syntax. running the parser will result in an error. but putting undefined here ensures type correctness of the parser. TS 3 . TS 2 can be defined as a datatype by adding a constructor. 4. TS 2 . [(Time. 71 .2: data TS 3 = Single 3 Station | Cons 3 (Station. Often we want to exactly mimic the productions of the grammar in the abstract syntax. but usually writing a parser is a simple translation. Functional programming languages offer some extra flexibility that we sometimes use. the application determines which to use. the first component of which is a list of triples consisting of a departure station. Step 7: Constructing the basic parser Converting a grammar to a parser is a mechanical process that consists of a set of simple replacement rules. Types and datatypes each have their advantages and disadvantages.). and an arrival time. TS 1 cannot be defined as a type because of the two alternative productions for TS . and the second component of which is the final arrival station.7. Station. or one of TS 1 . a departure time. The undefined has to be replaced by an appropriate semantic function in Step 6. whereas TS 2 is a type. grammar construct → | (space) ·+ ·∗ ·? terminal x begin of sequence of symbols Haskell/parser construct = <|> <∗> many 1 many option symbol x undefined <$> Note that we start each sequence of symbols by undefined <$>. Note that TS 1 is a datatype. Time)]. Step 7: Constructing the basic parser So a travelling scheme is a tuple. Time.

Basic parsers from strings Applying these rules to the grammar (4. which is denoted by ? for the moment. 4.1.2.2) for travelling schemes. For the other basic parsers we have chosen some reasonable return types. The semantic glue also determines the type of the function tsstring. scanner :: String → [Token] scanner = mkTokens . combine . Grammar and Parser design We construct a basic parser for each of the input types Char and Token. arrival :: Parser Char Time departure = undefined <$> time arrival = undefined <$> time tsstring :: Parser Char ? tsstring = undefined <$> many ( undefined <$> spaces <∗> station <∗> spaces <∗> departure <∗> spaces <∗> arrival ) <∗> spaces <∗> station spaces :: Parser Char String spaces = undefined <$> many (symbol ’ ’) The only thing left to do is to add the semantic glue to the functions. we obtain the following basic parser. station :: Parser Char Station station = undefined <$> many 1 identifier time :: Parser Char Time time = undefined <$> natural <∗> symbol ’:’ <∗> natural departure.7. The semantic functions are defined in the next and final step. 4. words combine :: [String] → [String] 72 .4. A basic parser from tokens To obtain a basic parser from tokens.7. we first write a scanner that produces a list of tokens from a string.

m) : xs) = [((h. Step 7: Constructing the basic parser combine [] = [] combine [x ] = [x ] combine (x : y : xs) = if isAlpha (head x ) ∧ isAlpha (head y) then combine ((x + " " + y) : xs) + + else x : combine (y : xs) mkToken :: String → Token mkToken xs = if isDigit (head xs) then Time Token (mkTime xs) else Station Token xs parse result :: [(a. Their presence reflects their presence in the grammar. The composition of the scanner with the function tstoken 1 defined below gives the final parser. but it suffices for now.7. time This is a basic scanner with very basic error messages.4. xs)] = [] tstation tdeparture. tstoken 1 :: Parser Token ? tstoken 1 = undefined <$> many ( undefined <$> tstation <∗> tdeparture <∗> tarrival ) <∗> tstation tstation :: Parser Token Station tstation (Station Token s : xs) = [(s. tarrival :: Parser Token Time tdeparture (Time Token (h. xs)] tdeparture = [] tarrival (Time Token (h. Note that functions tdeparture and tarrival are the same functions. m).2. xs)] tarrival = [] where again the semantic functions remain to be defined. m) : xs) = [((h. b)] → a parse result xs | null xs = error "parse_result: could not parse the input" | otherwise = fst (head xs) mkTime :: String → Time mkTime = parse result . Another basic parser from tokens is based on the second grammar of Section 4. m). tstoken 2 :: Parser Token ? tstoken 2 = undefined 73 .
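As an illustration of why the scanner merges adjacent alphabetic words, the following assumed session shows combine at work on one of the example sentences of Step 1; station names consisting of more than one word end up in a single string:

? combine (words "Utrecht Centraal 10:25 10:58 Amsterdam Centraal")
["Utrecht Centraal","10:25","10:58","Amsterdam Centraal"]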

Finally. Note that this semantic function basically only throws away the value returned by the spaces parser: we are not interested in the spaces between the components of our 74 . and then we replace id <$> time by just time.1 returns an abstract syntax tree of type TS 2 . the result of many is a string. since function time returns a value of type Time. time.8. we have to combine the two integers in a tuple. y) for undefined in time. Grammar and Parser design <$> tstation <∗> tdeparture <∗> many ( undefined <$> tarrival <∗> tstation <∗> tdeparture ) <∗> tarrival <∗> tstation <|> undefined <$> tstation 4. Time. arrival . a character. departure. and we want to obtain the concatenation of these strings for the station name. Time)]. To obtain a value of type Time from an integer. we can take the concatenation function concat for undefined in function station. Step 8: Adding semantic functions Once we have the basic parsing functions. y) provided many returns a value of the desired type: [(Station. we need to add the semantic glue: the functions that take the results of the elements in the right hand side of a production. Since many returns a list of things.4. The basic rule is: Let the types do the work! First we add semantic functions to the basic parsing functions station. and an integer. we can take the identity function for undefined in departure and arrival . The first semantic function for the basic parser tsstring defined in Section 4.7. So the first undefined in tsstring should return a tuple of a list of things of the correct type (the first component of the type TS 2 ) and a Station. Since function many 1 identifier returns a list of strings. so for undefined in spaces we can take the identity function too. So we take the following function λx y → (x . and spaces. Now. and convert them into the result of the left hand side. we can construct such a tuple by means of the function λx y → (x .
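Putting these choices together, the string parser might end up looking as follows. This is only a sketch, not necessarily the text's own final version: it fills in the undefineds of the basic parser from Step 7 so that the result has type TS 2 (written TS2 here), and it assumes that station, departure, arrival and spaces have already been given the semantic functions described above.

-- A sketch of tsstring with the undefineds of Step 7 replaced by semantic
-- functions; the underscores discard the results of the spaces parsers.
tsstring :: Parser Char TS2
tsstring = (\ts _ st -> (ts, st))
       <$> many ( (\_ st _ dep _ arr -> (st, dep, arr))
                  <$> spaces <*> station <*> spaces
                  <*> departure <*> spaces <*> arrival )
       <*> spaces
       <*> station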

so the last occurrence of undefined is the function const 0. Step 9: Did you get what you expected travelling scheme. we use a parser based on this grammar: the basic parser tstoken 2 . z ) Again. y. The many parser returns a value of the correct type if we replace the second occurrence of undefined in tsstring by the function λ x y z → (x .9.9. we have to compute the travel time of each trip from a station to a station. so for the first occurrence of undefined we take: λx → sum x The final set of semantic functions we define are used for computing the total waiting time. The next semantic functions we define compute the net travel time. This completes a parser for travelling schemes. To compute the net travel time. and whether or not you have made errors in the above process. wm) → (wh − uh) ∗ 60 + wm − um Finally. there is now waiting time. um) (wh.2 combines arrival times and departure times. Summary This chapter describes the different steps that have to be considered in the design of a grammar and a language.4. If a trip consists of a single station. zm) → (zh − xh) ∗ 60 + zm − xm and Haskell’s prelude function sum sums these times. We have to give definitions of the three undefined semantic functions. xm) (zh. We obtain the travel time of a single trip if we replace the second occurrence of undefined in tsstring by: λ (xh. Step 9: Did you get what you expected In the last step you test your parser(s) to see whether or not you have obtained what you expected. and to add the travel times of all of these trips. Since the second grammar of Section 4. The second occurrence of function undefined computes the waiting time for one intermediate station: λ(uh. the first occurrence of undefined sums the list of waiting time obtained by means of the function that replaces the second occurrence of undefined : λ x → sum x 4. 75 . the results of spaces are thrown away.

except for the nonterminal FloatLiteral . Exercises Exercise 4. Write an evaluator signedFloat for Java float-literals (the float-suffix may be ignored).4. Write a parser floatLiteral for Java float-literals. FractPart? ExponentPart? FloatSuffix ? | .10. Give this parsing scheme for Java float literals. Exercise 4.1.2. Grammar and Parser design 4. The EBNF grammar for float-literals is given by: FloatLiteral → IntPart . parsers constructed on a (fixed) abstract syntax have the same shape. Up to the definition of the semantic functions.3. are represented by a String in the abstract syntax. Exercise 4. FractPart ExponentPart? FloatSuffix ? | IntPart ExponentPart FloatSuffix ? | IntPart ExponentPart? FloatSuffix → SignedInteger → Digits → ExponentIndicator SignedInteger → Sign? Digits → Digits Digit | Digit →+|→f|F|d|D IntPart FractPart ExponentPart SignedInteger Digits Sign FloatSuffix ExponentIndicator → e | E To keep your parser simple. assume that all nonterminals. 76 .

• understand the notions of synthesised and inherited attributes. there is an important relationship between grammars and compositionality: with every grammar. Compositional functions on datatypes are defined in terms of algebras of semantic actions that correspond to the constructors of the datatype. • know how to write compositional functions. also known as folds. Algebraic semantics often uses algebras that contain functions from tuples to tuples. Goals After studying this chapter and making the exercises you will • know how to generalise constructors of a datatype to an algebra. For example: many recursive functions defined on lists are instances of the higher order function foldr . Compositional functions on these datatypes are called syntax driven. on (possibly mutually recursive) datatypes. The former values are called inherited attributes and the latter values are called synthesised attributes. • know how to write syntax driven code. and have seen that many problems can be solved with a fold. Such semantics is referred to as algebraic semantics.5. which describes the abstract syntax of the language. • know that a fold applied to the constructors algebra is the identity function. Compositionality Introduction Many recursive functions follow a common pattern of recursion.6. • understand the connection between datatypes. Attributes can be both inherited and synthesised. abstract syntax and concrete syntax. which describes the concrete syntax of a language. It is possible to define a function such as foldr for a whole range of datatypes other than lists. Such functions are called compositional. As explained in Section 2. These common patterns of recursion can conveniently be captured by higher order functions. • understand the advantages of using folds. 77 . Compositional functions can typically be used to define the semantics of programming languages constructs. Such functions can be seen as computations that read values from a component of the domain and write values to a component of the codomain. one can associate a (possibly mutually recursive) datatype. • have seen the notions of fusion and deforestation.

the type [x ] is recursive. Section 5.1). • be able to define algebraic semantics in terms of compositional (or syntax driven) code that is defined using algebras of computations which make use of inherited and synthesised attributes. The informal definition of above corresponds to the following (pseudo) datatype. on a user-defined datatype List a for lists (Section 5.1.2). followed by a tail xs which itself is again a list. It deals with the problem of variable scope when compiling block structured languages. Section 5. 5.2 shows how to define compositional recursive functions on several kinds of trees. Thus. Section 5. All three examples use an algebra of computations which can read values from and write values to an environment which binds names to values.1. 5. We also show how to construct an algebra that directly corresponds to a datatype. Section 5.3 defines algebraic semantics.1 shows how to define compositional recursive functions on built-in lists using a function which is similar to foldr and shows how to do the same thing with user-defined lists and streams.1.5 presents a relatively complex example of the use of tuples in combination with compositionality.1.4 shows the usefulness of algebraic semantics by presenting an expression evaluator.5.1.1. an expression interpreter which makes use of a stack and expression compiler to a stack machine. Lists This section introduces compositional functions on the well known datatype of lists. Compositionality • be able to associate inherited and synthesised attributes with the different alternatives of (possibly mutually recursive) datatypes (or the different nonterminals of grammars). Built-in lists The datatype of lists is perhaps the most important example of a datatype. and on streams or infinite lists (Section 5. In Haskell the type [x ] for lists is built-in. They only differ in the way they handle basic expressions (variables and local definitions are handled in the same way).3). In a second version of the expression evaluator the use of inherited and synthesised attributes is made more explicit by using tuples. data [x ] = x : [x ] | [] 78 . Organisation The chapter is organised as follows. A list is either the empty list [] or a nonempty list (x : xs) consisting of an element x at the head of the list. Section 5. Compositional functions are defined on the built-in datatype [a] for lists (Section 5.

3.3. ? foldL (\_ n -> n+1. The similarity between the definitions of the functions sumL and prodL can be captured by the following higher order recursive function foldL. Note that we have replaced the constructor (:) (a constructor with two arguments) by binary operators (+) and (∗) (i. Note that a type (like Int) can be the carrier of a listalgebra in more than one way (for example using ((+).4] 10 ? foldL ((*). Lists Many recursive functions on lists look very similar.e. foldL :: (x → l → l . Computing the sum of the elements of a list of integers (sumL. More precisely.e.1. prodL :: [Int] → Int sumL (x : xs) = x + sumL xs sumL [] =0 prodL (x : xs) = x ∗ prodL xs prodL [] =1 The function sumL replaces the list constructor (:) by (+) and the list constructor [] by 0 and the function prodL replaces the list constructor (:) by (∗) and the list constructor [] by 1. where the L denotes that it is a sum function defined on lists) is very similar to computing the product of the elements of a list of integers (prodL). which is nothing else but an uncurried version of the well known function foldr .1) [1. 1)).4] 4 79 . ? foldL ((+). c) = fold where fold (x : xs) = op x (fold xs) fold [] =c The function foldL recursively replaces the constructor (:) by an operator op and the constructor [] by a constant c.2.2. ‘functions’ with zero variables).4] 24 The pair (op. which works the other way around. a list-algebra consists of a type l (the carrier of the algebra). We can now use foldL to compute the sum and product of the elements of a list of integers as follows. l ) → [x ] → l foldL (op. c) is often referred to as a list-algebra. sumL.2. Here is another example of how to turn Int into a list-algebra. a binary operator op of type x → l → l and a constant c of type l .3. Don’t confuse foldL with Haskell’s prelude function foldl . 0) and ((∗).0) [1.0) [1. functions with two arguments) and the constructor [] (a constructor with zero variables) by constants 0 and 1 (i.5.
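Many other functions fit this pattern. As one more small example, not in the text, the conjunction of a list of Booleans is obtained by replacing (:) by (&&) and [] by True:

-- Another list-algebra on the carrier Bool.
andL :: [Bool] -> Bool
andL = foldL ((&&), True)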

The types of recursive constructor arguments (i. The right hand side of the type definition is obtained from the right hand side of the data definition as follows: all List x valued constructors are replaced by l valued functions which have the same number of arguments (if any) as the corresponding constructors.e. the definition of a fold function can be generated automatically from the data definition. 5. data List x = Cons x (List x ) | Nil User-defined lists are defined in the same way as built-in ones. type ListAlgebra x l = (x → l → l . l ) The left hand side of the type definition is obtained from the left hand side of the datatype as follows: a postfix Algebra is added at the end of the name List and a type variable l is added at the end of the whole left hand side of the type definition. foldList :: ListAlgebra x l → List x → l foldList (cons. The type of the elements of the list does not play a role. User-defined lists In this subsection we present an example of a fold function defined on another datatype than built-in lists. arguments of type List x ) are replaced by recursive function arguments (i.1. Here are the types of the constructors Cons and Nil .2. arguments of type l ). and increments the result obtained thus far with one.e. In a similar way. It corresponds to the function sizeL defined by: sizeL :: [x ] → Int sizeL ( : xs) = 1 + sizeL xs sizeL [] =0 Note that the type of sizeL is more general than the types of sumL and prodL. The constructors (:) and [] are replaced by constructors Cons and Nil . nil ) = fold 80 . The types of the other arguments are simply left unchanged. ? :t Cons Cons :: a -> List a -> List a ? :t Nil Nil :: List a A algebra type ListAlgebra corresponding to the datatype List directly follows the structure of that datatype.5. To keep things simple we redo the list example for user-defined lists. Compositionality This list-algebra ignores the value at the head of a list.

prodList :: List Int → Int sumList = foldList ((+).1.5. 1) sizeList :: List x → Int sizeList = foldList (const (1+). They correspond to the examples of Section 5.3.1. Nil ) idList :: List x → List x idList = foldList idListAlgebra ? idList (Cons 1 (Cons 2 Nil)) Cons 1 (Cons 2 Nil) 5.1. Every algebra defines a unique compositional function. idListAlgebra :: ListAlgebra x (List x ) idListAlgebra = (Cons. Recursive functions on user-defined lists which are defined by means of foldList are called compositional. data Stream x = And x (Stream x ) Here is a standard example of a stream: the infinite list of fibonacci numbers. fibStream :: Stream Int fibStream = And 0 (And 1 (restOf fibStream)) where restOf (And x stream@(And y )) = And (x + y) (restOf stream) streams 81 . Lists where fold (Cons x xs) = cons x (fold xs) fold Nil = nil The constructors Cons and Nil in the left hand sides of the definition of the local function fold are replaced by functions cons and nil in the right hand sides. This algebra defines the identity function on user-defined lists. The function fold is applied recursively to all recursive constructor arguments of type List x to return a value of type l as required by the functions of the algebra (in this case Cons and cons have one such recursive argument).1. 0) prodList = foldList ((∗). 0) It is worth mentioning one particular ListAlgebra: the trivial ListAlgebra that replaces Cons by Cons and Nil by Nil . sumList. The other arguments are left unchanged. Here are three examples of compositional functions. Streams In this section we consider streams (or infinite lists).
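For the compositional list functions defined earlier on this page, one would expect, for instance, the following assumed session:

? sumList (Cons 1 (Cons 2 (Cons 3 Nil)))
6
? sizeList (Cons 1 (Cons 2 (Cons 3 Nil)))
3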

The algebra type StreamAlgebra and the fold function foldStream can be generated automatically from the datatype Stream.

type StreamAlgebra x s = x → s → s

foldStream :: StreamAlgebra x s → Stream x → s
foldStream and = fold
  where fold (And x xs) = and x (fold xs)

Note that the algebra has only one component because Stream has only one constructor. For the same reason the fold function is defined using only one equation. Here is an example of using a compositional function on user defined streams. It computes the first element of a monotone stream that is greater than or equal to a given value.

firstGreaterThan :: Ord x ⇒ x → Stream x → x
firstGreaterThan n = foldStream (λx y → if x >= n then x else y)

5.2. Trees

Now that we have seen how to generalise the foldr function on built-in lists to compositional functions on user-defined lists and streams, we proceed by explaining another common class of datatypes: trees. We will treat four different kinds of trees in the subsections below:

• binary trees,
• trees for matching parentheses,
• expression trees,
• general trees.

Furthermore, we will briefly mention the concepts of fusion and deforestation.

5.2.1. Binary trees

A binary tree is either a node where the tree splits into two subtrees or a leaf which holds a value.

data BinTree x = Bin (BinTree x ) (BinTree x )
               | Leaf x

One can generate the corresponding algebra type BinTreeAlgebra and fold function foldBinTree from the datatype automatically. Note that Bin has two recursive arguments and that Leaf has one non-recursive argument.

type BinTreeAlgebra x t = (t → t → t, x → t)

Remember that the abstract syntax ignores the terminal bracket symbols of the concrete syntax. Similarly. const 1) ? sizeBinTree (Bin (Bin (Leaf 3) (Leaf 7)) (Leaf 11)) 3 If a tree consists of a leaf.2. sizeBinTree :: BinTree x → Int sizeBinTree = foldBinTree ((+).2. We can now define compositional functions on binary trees much in the same way as we defined them on lists.5. It suffices to define appropriate semantic actions bin and leaf on a type t (in this case Int) that correspond to the syntactic constructs Bin and Leaf of the datatype BinTree. Trees foldBinTree :: BinTreeAlgebra x t → BinTree x → t foldBinTree (bin. Trees for matching parentheses Section 3. which can be used to compute the depth (depthParentheses) and the width (widthParentheses) of matching parenthesis in a compositional way. We can now define.1 defines the datatype Parentheses for matching parentheses. Here is an example: the function sizeBinTree computes the size of a binary tree. If a tree consists of two subtrees. The depth of a string of matching parentheses 83 . Functions for computing the sum and the product of the integers at the leafs of a binary tree can be defined in a similar way. data Parentheses = Match Parentheses Parentheses | Empty For example. in the foldBinTree function. the sentence ()() of the concrete syntax for matching parentheses is represented by the value Match Empty (Match Empty Empty) in the abstract syntax Parentheses. then sizeBinTree returns the sum of the sizes of those subtrees as the size of the tree. leaf ) = fold where fold (Bin l r ) = bin (fold l ) (fold r ) fold (Leaf x ) = leaf x In the BinTreeAlgebra type.3. an algebra type ParenthesesAlgebra and a fold function foldParentheses. then sizeBinTree ignores the value at the leaf and returns 1 as the size of the tree. in the same way as we did for lists and binary trees.2. the local fold function is applied recursively to both arguments of bin and is not called on the argument of leaf . 5. the bin part of the algebra has two arguments of type t and the leaf part of the algebra has one argument of type x .
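As a sketch of the functions mentioned above, the sum and the product of the integers at the leaves can be defined with the same kind of algebra as sizeBinTree:

-- Sketches, not from the text: sum and product of the leaf values.
sumBinTree, prodBinTree :: BinTree Int -> Int
sumBinTree  = foldBinTree ((+), id)   -- add the results of the subtrees, a leaf is its own sum
prodBinTree = foldBinTree ((*), id)   -- multiply the results of the subtrees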

We know exactly which terminals we have deleted when going from the concrete syntax to the abstract one. For large examples layout is very important: it can be used to let concrete representations look pretty. For a simple example layout does not really matter. the depth of the string ((()))() is 3. which are not a substring of a surrounding string of matching parentheses. indentation and newlines. Compositional functions on datatypes that describe the abstract syntax of a language are called syntax driven. The algebra used by the function a2cParentheses simply reinserts those terminals that we have deleted. Compositionality s is the largest number of unmatched parentheses that occurs in a substring of s. 0) depthParentheses. Fortunately. m) foldParentheses :: ParenthesesAlgebra m → Parentheses → m foldParentheses (match. the width of the string ((()))() is 2. empty) = fold where fold (Match l r ) = match (fold l ) (fold r ) fold Empty = empty depthParenthesesAlgebra :: ParenthesesAlgebra Int depthParenthesesAlgebra = (λx y → max (1 + x ) y. What is the concrete representation of the matching parenthesis example represented by parenthesesExample? It happens to be ((()))()(()). type ParenthesesAlgebra m = (m → m → m. Note that a2cParentheses does not deal with layout such as blanks.5. we can easily write a program that computes the concrete representation from the abstract one. 84 . widthParentheses :: Parentheses → Int depthParentheses = foldParentheses depthParenthesesAlgebra widthParentheses = foldParentheses widthParenthesesAlgebra parenthesesExample = Match (Match (Match Empty Empty) Empty) (Match Empty (Match (Match Empty Empty) Empty ) ) ? depthParentheses parenthesesExample 3 ? widthParentheses parenthesesExample 3 Our example reveals that abstract syntax is not very well suited for interpretation by human beings. 0) widthParenthesesAlgebra :: ParenthesesAlgebra Int widthParenthesesAlgebra = (λ y → 1 + y . For example. The width of a string of matching parentheses s is the the number of substrings that are matching parentheses themselves. For example.

and is thus preferable for reasons of efficiency. Consider the functions parens and nesting of Section 3. and convince ourselves that it is correct.) Functions nesting and nesting compute exactly the same result.3. The function nesting is the fusion of the fold function with the parser parens from the function nesting . Function nesting never builds a tree.2. Trees a2cParenthesesAlgebra :: ParenthesesAlgebra String a2cParenthesesAlgebra = (λxs ys → "(" + xs + ")" + ys. What we would really like is that computers can interpret concrete syntax as well. Using laws for parsers and folds (not shown here) we can prove that the two functions are equal. human beings can easily interpret concrete syntax (something computers have difficulties with). Strangely enough. in function nesting we have to write our own parser. and then flattens it by means of the fold. So for reasons of ‘programming efficiency’ function nesting is preferable. On the other hand: in function nesting we reuse the parser parens and the function depthParentheses. open = symbol ’(’ close = symbol ’)’ parens :: Parser Char Parentheses parens = (λ x y → Match x y) <$> open <∗> parens <∗> close <∗> parens <|> succeed Empty nesting :: Parser Char Int nesting = (λ x y → max (1 + x ) y) <$> open <∗> nesting <∗> close <∗> nesting <|> succeed 0 Function nesting could have been defined by means of function parens and a fold: nesting :: Parser Char Int nesting = depthParentheses <$> parens (Remember that depthParentheses has been defined as a fold. To obtain the best of both 85 . This is the place where parsing enters the picture: computing an abstract representation from a concrete one is precisely what parsers are used for.5. Note that function nesting first builds a tree by means of function parens.1 again. "") + + + a2cParentheses :: Parentheses → String a2cParentheses = foldParentheses a2cParenthesesAlgebra ? a2cParentheses parenthesesExample ((()))()(()) This example illustrates that a computer can easily interpret abstract syntax (something human beings have difficulties with).

3. and F defined above.2. the following abstract syntax is more convenient. as always. and F uses E . these three types are mutually recursive. In this section we look again at the expression grammar of Section 3. Since E uses T . T . Some (very few) compilers are clever enough to perform this transformation automatically. Using the approach from Section 2. (f → t. the algebra type EAlgebra consists of three tuples of functions and three carriers (the main carrier is. t → f → t) . e → t → e) . Compositionality deforestation worlds.5. T uses F .5: E E T T F F →T →E +T →F →T *F →(E ) → Digs This grammar has three nonterminals. Int → f ) ) 86 . (e → f . Since the datatypes are mutually recursive. type EAlgebra e t f = ((t → e. E . Expression trees The matching parentheses grammar has only one nonterminal. 5. the one corresponding to the main datatype and is therefore the one corresponding to the start-symbol). The automatic transformation of function nesting into function nesting is called deforestation (trees are removed).6 we transform the nonterminals to datatypes: data E = E1 T | E2 E T data T = T1 F | T2 T F data F = F1 E | F2 Int where we have translated Digs by the type Int. T . we would like to write function nesting and have our compiler figure out that it is better to use function nesting in computations. and F . The main datatype of the three datatypes is the one corresponding to the start symbol E . Note that this is a rather inconvenient and clumsy abstract syntax for expressions. to illustrate the concept of mutual recursive datatypes. data Expr = Con Int | Add Expr Expr | Mul Expr Expr However. we will study the datatypes E . Therefore its abstract syntax is described by a single datatype.

so it uses three mutually recursive local functions. evalE :: E → Int evalE = foldE ((id . and t are instantiated with Int. (f1 . a2cE :: E → String a2cE = foldE ((e1 . (id . (∗)). f . (id . (+)). Trees The fold function foldE for E also folds over T and F . e2 ). f2 )) where e1 = λt → t e2 = λe t → e + "+" + t + + t1 = λf → f t2 = λt f → t + "*" + f + + f1 = λe → "(" + e + ")" + + f2 = λn → show n ? a2cE exE "2*3+1" 87 . Here is a function a2cE which does this job for us. t2 ).5. t2 ). foldE :: EAlgebra e t f → E → e foldE ((e1 . all type variables e. e2 ). In the algebra that is used in the foldE . f2 )) = fold where fold (E1 t) = e1 (foldT t) fold (E2 e t) = e2 (fold e) (foldT t) foldT (T1 f ) = t1 (foldF f ) foldT (T2 t f ) = t2 (foldT t) (foldF f ) foldF (F1 e) = f1 (fold e) foldF (F2 n) = f2 n We can now use foldE to write a syntax driven expression evaluator evalE . (f1 . (t1 . id )) exE = E2 (E1 (T2 (T1 (F2 2)) (F2 3))) (T1 (F2 1)) ? evalE exE 7 Once again our example shows that abstract syntax cannot easily be interpreted by human beings. (t1 .2.

and it follows that if we define the function reverse by reverse :: [a] → [a] reverse = foldL (λx xs → xs + [x ]. For example. only the value at the node is of interest).2.4. folds are often efficient functions. some functions require more than constant evaluation time. General trees A general tree consist of a node. list concatenation is linear in its left argument. the type TreeAlgebra and the function foldTree can be generated automatically from the data definition. However. 5. of course. where the tree splits into a list of subtrees. Notice that this list may be empty (in which case. As usual. []) + then function reverse takes time quadratic in the length of its argument list. Therefore the node function of the algebra has a corresponding list of recursive arguments. Compositionality 5. A technique that often can be used in such a case is the accumulating parameter technique. For example.5. data Tree x = Node x [Tree x ] type TreeAlgebra x a = x → [a] → a foldTree :: TreeAlgebra x a → Tree x → a foldTree node = fold where fold (Node x gts) = node x (map fold gts) Notice that the constructor Node has a list of recursive arguments. the fold is usually not linear. holding a value. If the evaluation of each of these functions on their arguments takes constant time. Efficiency A fold takes a value of a datatype.2. but if the functions in the algebra are not constant. One can compute the sum of the values at the nodes of a general tree as follows: sumTree :: Tree Int → Int sumTree = foldTree (λx xs → x + sum xs) Computing the product of the values at the nodes of a general tree and computing the size of a general tree can be done in a similar way. So. and replaces its constructors by functions. evaluation of the fold takes time linear in the number of constructors in its argument.5. Often such a nonlinear fold can be transformed into a more efficient function. for the reverse function we have reverse x = reverse x [] 88 . The local fold function is recursively called on all elements of a list using the map function.
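Returning to general trees for a moment, here are sketches of the product and size functions mentioned above, again as folds on Tree:

-- Sketches, not from the text.
prodTree :: Tree Int -> Int
prodTree = foldTree (\x xs -> x * product xs)   -- node value times the products of the subtrees

sizeTree :: Tree x -> Int
sizeTree = foldTree (\_ xs -> 1 + sum xs)       -- this node plus the sizes of the subtrees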

(Recall the rules 1 = r1 + r1 and r = r1 + r2 . given two resistors. Algebraic semantics This section summarizes what we have seen before.2. mapBinTree. which returns the maximal value at the leaves.1. Define the following functions as folds on the datatype BinTree. In the previous section. we have seen how to associate an algebra type and a fold function with every datatype. We choose to represent paths by sequences of Direction’s: data Direction = L | R in such a way that taking the left subtree in an internal node will be encoded by L and taking the right subtree will be encoded by R. which returns the list of leaf values in left-to-right order. The algebra is a tuple (one component per datatype) of tuples (one component per constructor 89 . which returns the length of a shortest path.1. 2.3. or family of mutually recursive datatypes. sp. and establishes terminology. Exercise 5.3. define the type ResistAlgebra and the corresponding function foldResist. 4. Define this function first using explicit recursion. Exercise 5. data LNTree a b = Leaf a | Node (LNTree a b) b (LNTree a b) Exercise 5. 1.3. Algebraic semantics reverse :: [a] → [a] → [a] reverse [] ys = ys reverse (x : xs) ys = reverse xs (x : ys) The evaluation of reverse xs takes time linear in the length of xs. Also. see Section 5. they can be put in parallel (:|:) or in sequence (:∗:). Define a compositonal function result which determines the resistance of a resistors. 3. 5. 2. which returns the height of a tree. and then using a fold.4. Define an algebra type and a fold function for the following datatype.) r 1 2 5. A path through a binary tree describes the route from the root of the tree to some leaf. which maps a function over the elements at the leaves.2. 1. flatten. Define a compositional function allPaths which produces all paths of a given tree. There are some basic resistors with a fixed (floating point) resistance and. Exercise 5. maxBinTree. Define the datatype Resist to represent resistors.5. This exercise deals with resistors. height.

We call the carrier corresponding to the main datatype of the family the main carrier. The datatype and the corresponding algebra type and fold function are as follows: infixl 7 ‘Mul ‘ infix 7 ‘Dvd ‘ infixl 6 ‘Add ‘. This algebra.2. when applied via a fold. and the algebra is parameterized over all of them. We restrict ourselves to float valued expressions on which addition. Expressions The first part of this section presents a basic expression evaluator. and hence non mutual recursive datatype for expressions. we need one carrier per datatype. The function resulting from applying an algebra to a fold function is sometimes also called an algebra homomorphism. and the others auxiliary carriers.3: here we will use a single. subtraction. and the compositional function is said to define algebraic semantics. Evaluating expressions In this subsection we start with a more involved example: an expression evaluator.1. compositional initial algebra algebra homomorphism algebraic semantics 5.4. The fold function traverses a value and recursively replaces syntactic constructors of the datatypes by the corresponding semantic actions of the algebra.4. The algebra is parameterized over the result type of the computation. ‘Min‘ data Expr = Expr | Expr | Expr | Expr | Num ‘Add ‘ Expr ‘Min‘ Expr ‘Mul ‘ Expr ‘Dvd ‘ Expr Float — add type ExprAlgebra a = (a → a → a 90 . 5. Functions which are defined in terms of a fold function and an algebra are called compositional functions. The evaluator is extended with variables in the second part and with local definitions in the third part. and is called the initial algebra. multiplication and division are defined.5. defines the identity function. We will use another datatype for expressions than the one introduced in Section 5. also called the carrier of the algebra. If we work with a family of mutually recursive datatypes. There is one special algebra: the one whose components are the constructor functions of the datatypes we operate on. Compositionality carrier of the datatype) of semantic actions.

For our purposes names are strings and values are floats. Float → a) — — — — min mul dvd num foldExpr :: ExprAlgebra a → Expr → a foldExpr (add . the fact that Expr does not have an extra parameter x like the list and tree examples. We implement an environment as a list of name-value pairs.5. (−).4. min. In the following programs we will use the following functions and types: type Env name value = [(name.2. The values of variables are typically looked up in an environment which binds the names of the variables to values. (∗). x = = y] type Name = String type Value = Float The datatype and the corresponding algebra type and fold function are now as follows.a → a → a . data Expr = Expr ‘Add ‘ Expr | Expr ‘Min‘ Expr | Expr ‘Mul ‘ Expr 91 . although it differs from the previous Expr datatype. v ) ← env . id ) 5.a → a → a . Expressions . Computing the result of an expression now simply consists of replacing the constructors by appropriate functions. num) = fold where fold (expr 1 ‘Add ‘ expr 2 ) = fold expr 1 ‘add ‘ fold fold (expr 1 ‘Min‘ expr 2 ) = fold expr 1 ‘min‘ fold fold (expr 1 ‘Mul ‘ expr 2 ) = fold expr 1 ‘mul ‘ fold fold (expr 1 ‘Dvd ‘ expr 2 ) = fold expr 1 ‘dvd ‘ fold fold (Num n) = num n expr 2 expr 2 expr 2 expr 2 There is nothing special to notice about these definitions except. Note that we use the same name (Expr ) for the datatype. value)] (?) :: Eq name ⇒ Env name value → name → value env ? x = head [v | (y. resultExpr :: Expr → Float resultExpr = foldExpr ((+). (/). perhaps. mul . dvd .4.a → a → a . Adding variables Our next goal is to extend the evaluator of the previous subsection such that it can handle variables as well.
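Returning to the evaluator resultExpr above, an assumed session could look as follows; recall that ‘Mul‘ binds stronger than ‘Add‘, and note that the exact formatting of the Float result may differ:

? resultExpr (Num 2 `Mul` Num 3 `Add` Num 1)
7.0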

bad way of doing this: one can use it as an argument of a function that computes an algebra (we will explain why this is a bad choice in the next subsection.a → a → a . (−).3)] (Var "x" ‘Mul‘ Num 2) 6 The good way of using an environment is the following: instead of working with a computation which. Name → a) — — — — — — add min mul dvd num var foldExpr :: ExprAlgebra a → Expr → a foldExpr (add . var ) = fold where fold (expr 1 ‘Add ‘ expr 2 ) = fold expr 1 ‘add ‘ fold fold (expr 1 ‘Min‘ expr 2 ) = fold expr 1 ‘min‘ fold fold (expr 1 ‘Mul ‘ expr 2 ) = fold expr 1 ‘mul ‘ fold fold (expr 1 ‘Dvd ‘ expr 2 ) = fold expr 1 ‘dvd ‘ fold fold (Num n) = num n fold (Var x ) = var x expr 2 expr 2 expr 2 expr 2 The datatype Expr now has an extra constructor: the unary constructor Var . (|/|) :: (Env Name Value → Value) → (Env Name Value → Value) → (Env Name Value → Value) f |+| g = λenv → f env + g env f |−| g = λenv → f env − g env 92 . num. (|+|). resultExprBad :: Env Name Value → Expr → Value resultExprBad env = foldExpr ((+). (env ?)) ? resultExprBad [("x".5. (/). min. given an environment. dvd .a → a → a . the argument of foldExpr now has an extra component: the unary function var which corresponds to the unary constructor Var . id . (|−|). (∗). Thus we turn the environment in a ‘local’ variable. Computing the result of an expression somehow needs to use an environment. (|∗|). the basic idea is that we use the environment as a global variable here).a → a → a . mul . Value → a . Here is a first. yields an algebra of values it is better to turn the computation itself into an algebra. Similarly. Compositionality | Expr ‘Dvd ‘ Expr | Num Value | Var Name type ExprAlgebra a = (a → a → a .

Expressions f |∗| g = λenv → f env ∗ g env f |/| g = λenv → f env / g env resultExprGood :: Expr → (Env Name Value → Value) resultExprGood = foldExpr ((|+|). (|∗|) and (|/|) on computations. (|/|). A (local) definition is an expression of the form Def name expr 1 expr 2 which should be interpreted as: let the value of name be equal to expr 1 in expression expr 2 . then we say that the synthesised and inherited attributes are attributes of the nonterminal. The value of an expression is synthesised from values of its subexpressions. Computing the result of a constant does not need the environment at all. Computing the result of a variable consists of looking it up in the environment. The datatype (called Expr again) and the corresponding algebra type and eval function are now as follows: data Expr = Expr ‘Add ‘ Expr | Expr ‘Min‘ Expr | Expr ‘Mul ‘ Expr 93 . (|∗|). Adding definitions Our next goal is to extend the evaluator of the previous subsection such that it can handle definitions as well. Computing the result of the sum of two subexpressions within a given environment consists of computing the result of the subexpressions within this environment and adding both results to yield a final result.4. synthesised attribute inherited attribute 5.5. The environment which is used by the computation of a subexpression of an expression is inherited from the computation of the expression. (|−|). The value type is an example of a synthesised attribute.3.4. If Expr is one of the mutually recursive datatypes which are generated from the nonterminals of a grammar. the algebraic semantics of an expression is a computation which yields a value. Thus. (|−|). Since we are working with abstract syntax we say that the synthesised and inherited attributes are attributes of the datatype Expr .3)] 6 The actions (+). (∗) and (/) on values are now replaced by corresponding actions (|+|). Variables are typically defined by updating the environment with an appropriate name-value pair. (−). In this case the computation is of the form env → val . The environment type is an example of an inherited attribute. const. flip (?)) ? resultExprGood (Var "x" ‘Mul‘ Num 2) [("x".

Name → a . Notice that the last two arguments are recursive ones. (−). var . def ) = fold where fold (expr 1 ‘Add ‘ expr 2 ) = fold expr 1 ‘add ‘ fold expr 2 fold (expr 1 ‘Min‘ expr 2 ) = fold expr 1 ‘min‘ fold expr 2 fold (expr 1 ‘Mul ‘ expr 2 ) = fold expr 1 ‘mul ‘ fold expr 2 fold (expr 1 ‘Dvd ‘ expr 2 ) = fold expr 1 ‘dvd ‘ fold expr 2 fold (Num n) = num n fold (Var x ) = var x fold (Def x value body) = def x (fold value) (fold body) Expr now has an extra constructor: the ternary constructor Def .a → a → a . which can be used to introduce a local variable. We can now explain why the first use of environments is inferior to the second one. dvd . num. Value → a . Compositionality | | | | Expr ‘Dvd ‘ Expr Num Value Var Name Def Name Expr Expr — — — — — — — add min mul dvd num var def type ExprAlgebra a = (a → a → a . mul . id . For example.a → a → a . the parameter of foldExpr now has an extra component: the ternary function def which corresponds to the ternary constructor Def .a → a → a .5. (∗). (/). Trying to extend the first definition gives something like: resultExprBad :: Env Name Value → Expr → Value resultExprBad env = foldExpr ((+). (env ?). Name → a → a → a) foldExpr :: ExprAlgebra a → Expr → a foldExpr (add . seconds = Def "days_per_year" (Num Def "hours_per_day" (Num Def "minutes_per_hour" (Num Def "seconds_per_minute" (Num Var "days_per_year" ‘Mul ‘ Var "hours_per_day" ‘Mul ‘ Var "minutes_per_hour" ‘Mul ‘ Var "seconds_per_minute" 365) ( 24) ( 60) ( 60) ( )))) Similarly. min. error "def") 94 . the following expression can be used to compute the number of seconds per year.

flip (?). Compiling to a stack machine In this section we compile expressions to instructions on a stack machine. We cannot update the environment in this setting: we can read the environment but afterwards it is not accessible any more in the algebra (which consists of values). Imagine a simple computer for evaluating arithmetic expressions. This computer has a ‘stack’ and can execute ‘instructions’ which change the value of the stack. Extending the second definition causes no problems: the environment is now accessible in the algebra (which consists of computations). and from which 95 . We can easily add a new action which updates the environment. y) to an environment (in the definition of the operator |=|). (|=|)) ? resultExprGood seconds [] 31536000 Note that by prepending a pair (x . f env ) : env ) resultExprGood :: Expr → (Env Name Value → Value) resultExprGood = foldExpr ((|+|). Expressions The last component causes a problem: a body that contains a local definition has to be evaluated in an updated environment. The class of possible instructions is defined by the following datatype. A stack (a value of type Stack v for some value type v ) is a list of values. (|/|). The computation corresponding to the body of an expression with a local definition can now be evaluated in the updated environment. the binding for x hides possible other bindings for x . This section is inspired by an example in the textbook by Bird and Wadler [3]. and pushes the result back on the stack. data MachInstr v = Push v | Apply (v → v → v ) type MachProgr v = [MachInstr v ] An instruction either pushes a value of type v on the stack. on which you can push values. applies the operator. we add the pair to the environment. By definition of (?). const.5. or it executes an operator that takes the two top values of the stack.4. We can then use this stack machine to evaluate compiled expressions. from which you can pop values. (|∗|).4.4. f f f f x |+| g |−| g |∗| g |/| g |=| f = λenv = λenv = λenv = λenv = λg env →f →f →f →f →g env + g env env − g env env ∗ g env env / g env ((x . 5. (|−|).

2. fd env ) : env ) Exercise 5. which returns the list of variables that occur in the expression. vars. var . num. This exercise deals with expressions without definitions. dvd . defined by: compileExpr :: Expr → Env Name (MachProgr Value) → MachProgr Value compileExpr = foldExpr (add .6.5. def ) where f ‘add ‘ g = λenv → f env + g env + [Apply (+)] + + f ‘min‘ g = λenv → f env + g env + [Apply (−)] + + f ‘mul ‘ g = λenv → f env + g env + [Apply (∗)] + + f ‘dvd ‘ g = λenv → f env + g env + [Apply (/)] + + num v = λenv → [Push v ] var x = λenv → env ? x def x fd fb = λenv → fb ((x . min. isSum. mul .5. Compositionality you can take the top value. A module Stack for stacks can be found in Appendix A. The function der is defined by der (e1 ‘Add ‘ e2 ) dx = der e1 dx ‘Add ‘ der e2 dx der (e1 ‘Min‘ e2 ) dx = der e1 dx ‘Min‘ der e2 dx der (e1 ‘Mul ‘ e2 ) dx = (e1 ‘Mul ‘ der e2 dx ) ‘Add ‘ (der e1 dx ‘Mul ‘ e2 ) 96 . The effect of executing an instruction of type MachInstr is defined by execute :: MachInstr v → Stack v → Stack v execute (Push x ) s = push x s execute (Apply op) s = let a = top s t = pop s b = top t u = pop t in push (op a b) u A sequence of instructions is executed by the function run defined by run :: MachProgr v → Stack v → Stack v run [] s =s run (x : xs) s = run xs (execute x s) It follows that run can be defined as a foldl . Define the following functions as folds on the datatype Expr that contains definitions. 1. which determines whether or not an expression is a sum. Exercise 5. An expression can be translated (or compiled) into a list of instructions by the function compile.
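As a small illustration, not from the text, of how compileExpr and run fit together, consider compiling a closed expression, so that the environment can be empty. It assumes that Stack v from Appendix A behaves like a list, so that [] can serve as the empty stack:

exampleProgr :: MachProgr Value
exampleProgr = compileExpr ((Num 2 `Mul` Num 3) `Add` Num 1) []
-- exampleProgr corresponds to [Push 2, Push 3, Apply (*), Push 1, Apply (+)],
-- and running it should leave 7 on top of the stack:
--   top (run exampleProgr [])  ==  7.0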

addition and substraction. use y 97 . It is easy to write a function with the required functionality if you swap the arguments.2. given a tree and a path in the tree. Blocks A block is a list of statements. Define the function der on Exp and show that this function is compositional. Exercise 5. but then it is impossible to write replace as a fold. which when given m replaces all the leaves by m. variables. Define a datatype Exp to represent expressions consisting of (floating point) constants.1.7.5.1. dcl x . The example deals with the scope of variables in a block structured language. dcl y . Why is the function der not compositional ? 3. see Section 5. The concrete representation of an example block of our block structured language looks as follows (dcl stands for declaration. Define the function replace. Also.5. Give an informal description of the function der . which given a binary tree and an element m replaces the elements at the leaves by m as a fold on the datatype BinTree. define the type ExpAlgebra and the corresponding foldExp. A variable from a global scope is visible in a local scope only if it is not hidden by a variable with the same name in the local scope. A path in a tree leads to a unique leaf. 4.3. A statement is a variable declaration. Block structured languages der (e1 ‘Dvd ‘ e2 ) dx = ((e2 ‘Mul ‘ der e1 dx ) ‘Min‘ (e1 ‘Mul ‘ der e2 dx )) ‘Dvd ‘ (e2 ‘Mul ‘ e2 ) der (Num f ) dx = Num 0 der (Var s) dx = if s = = dx then Num 1 else Num 0 1.5. Note that the fold returns a function. a variable usage or a nested block. use stands for usage and x. y and z are variables). 2. 5. use y . Block structured languages This section presents a more complex example of the use of tuples in combination with compositionality. yields the element at the unique leaf. Exercise 5. 5. dcl z . (use z . use x) .5. Define a compositonal function path2Value which.8. Consider the datatype of paths introduced in Exercise 5. use x . dcl x .

There are three types of instructions. Generating code The goal of this section is to generate code from a block. Nested blocks are surrounded by parentheses. Access (0. Here are some mutually recursive (data)types. empty). 2) 98 . which describe the abstract syntax of blocks. (dcl . • Enter (l . The code consists of a sequence of instructions. • Access (l . (Idf → s. c): leaves the l ’th nested block in which c local variables were declared. [Enter (0. which uses two mutually recursive local functions. b → s) ) foldBlock :: BlockAlgebra b s → Block → b foldBlock ((cons. 2).5. 1). 1). the algebra type BlockAlgebra. Access (1. 0) . which consists of two tuples of functions. use. Compositionality Statements are separated by semicolons. blk )) = fold where fold (s : b) = cons (foldS s) (fold b) fold [] = empty foldS (Dcl x ) = dcl x foldS (Use x ) = use x foldS (Blk b) = blk (fold b) 5. corresponding to the grammar that describes the concrete syntax of blocks which is used above.2. c): accesses the c’th variable of the l ’th nested block. Leave (1. Access (1. Enter (1. type Block = [Statement] data Statement = Dcl Idf | Use Idf | Blk Block type Idf = String type BlockAlgebra b s = ((s → b → b. 0). The usage of y refers to the global declaration (the only declaration of y). The code generated for the above example looks as follows. Idf → s. b) . We use meaningful names for data constructors and we use built-in lists instead of user-defined lists for the block algebra. can be generated from the two mutually recursive (data)types. • Leave (l . Access (0. 2). c): enters the l ’th nested block in which c local variables are declared. As usual. The usage of z refers to the local declaration (the only declaration of z).5. and the fold function foldBlock . Note that it is allowed to use variables before they are declared. The local usage of x refers to the local declaration and the global usage of x refers to the global declaration.
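As a small extra example, not in the text, of a fold over this algebra, the number of declarations in a block can be computed by taking Int for both carriers:

-- A declaration counts as 1, a usage as 0, and a nested block contributes
-- the count of its own statements.
countDcls :: Block -> Int
countDcls = foldBlock (((+), 0), (const 1, const 0, id))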

dcl x (le. Leave (0. For all syntactic constructs of Statement and Block we define appropriate semantic actions on an algebra of computations. lc)) lc = lc + 1 where function update is defined in the AssociationList module. lc ) where le = le ‘update‘ (x . type Count = Int type Level = Int type Variable = (Level . 2) ] Note that we start numbering levels (l ) and counts (c) (which are sometimes called displacements) from 0. • Dcl : Every time we declare a local variable x we have to update the local environment le of the block we are in by associating with x the current pair of level and local-count (l . Count) type BlockInfo = (Level . c) = e ? x 99 .5. lc) = (le .5. c)] (l . lc)]. Access (0. • Use: Every time we use a local variable x we have to generate code cd for it. Note that we do not generate any code for a declaration statement. Instead we perform some computations which make it possible to generate appropriate code for other statements. lc) of the variable is looked up in the global environment e. lc). Moreover we have to increment the local variable count lc to lc + 1. 1). The abstract syntax of the code to be generated is described by the following datatype. Here is a. The level–local-count pair (l . l . Block structured languages . use x e = cd where cd = [Access (l . (l . description of these semantic actions. somewhat simplified. This code is of the form [Access (l . uses a compositional function block2Code. Count) data Instruction = Enter BlockInfo | Leave BlockInfo | Access Variable type Code = [Instruction] The function ab2ac. which generates abstract code (a value of type Code) from an abstract block (a value of type Block ).

the local variable count and the generated code. given a local environment le and local variable count lc.5. • (:): For every nonempty block we perform the computation of the first statement of the block which. l ) = cd where l =l +1 (leB . The code cd which is generated is the concatenation cdS + cdB of the code cdS which is + generated for the first statement and the code cdB which is generated for the rest of the block. cdB ) = fB (leS . and we compute two synthesised attributes: the local variable count and the generated code. lc . e. For blk we need two inherited attributes: a global block environment and a global level. cdB ) = fB (leB . results in a local environment leS and local variable count lcS . For cons we compute three synthesised attributes: the local block environment. cd ) where (leS . lcB )] + + • []: No action need to be performed for an empty block. 0) cd = [Enter (l . cons fS fB (le. Furthermore we need to make sure that the global environment (the one in which we look up variables) which is used by the computation for the nested block is equal to leB . and we compute one synthesised attribute: the generated code. lcS . a local block environment and a local variable count. blk fB (e. Two of them: the local block environment and the local variable count are also synthesised attributes. lcB . For use we need one inherited attribute: a global block environment. The computation for the nested block results in a local variable count lcB and a local environment leB . Compositionality • Blk : Every time we enter a nested block we increment the global level l to l + 1. Two of them. The code which is generated for the block is surrounded by an appropriate [Enter lcB ]-[Leave lcB ] pair. l . start with a fresh local variable count 0 and set the local environment of the nested block we enter to the current global environment e. lc . lc) (le . This environment-count pair is then used by the computation of the rest of the block to result in a local environment le and local variable count lc . When processing the statements of a nested block we already make use of the global block environment which we are synthesising (when looking up variables). cdS ) = fS (le. the 100 . lcB )] + cdB + [Leave (l . lcS ) cd = cdS + cdB + What does our actual computation type look like? For dcl we need three inherited attributes: a global level. lc) = (le . Moreover there is one extra attribute: a local block environment which is both inherited and synthesised.

l ) (le. blk )) where cons fS fB (e. lcS ). It is clear from the considerations above that the following types fulfill our needs. lcS ) cd = cdS + cdB + empty (e.5. 0) l =l +1 cd = [Enter (l . lc) = ((le. lc ). lc)) : le lc = lc + 1 use x (e. l ) (e. lcB ). lc ). lc) ((le . l ) (le. lc). cdS ) = fS (e. lcB )] + cdB + [Leave (l . block2Code :: Block → GlobalEnv → LocalEnv → (LocalEnv . The global environment is an attribute which is both inherited and synthesised. lc ). use. lc) = ((le . The code is a synthesised attribute. c)] + cd + [Leave (0. lcB )] + + The code generator starts with an empty local environment. Block structured languages local block environment and the local variable count are also needed as inherited attributes. When processing a block we already use the global environment which we are synthesising (when looking up variables). a fresh level and a fresh local variable count. empty). Attributes which are not mentioned in those actions are added as extra components which do not contribute to the functionality. Level ) type LocalEnv = (BlockEnv . Count) The implementation of block2Code is now a straightforward translation of the actions described above. lc) = ((le .5. cd ) where cd = [Access (l . lc). []) where le = (x . type BlockEnv = [(Idf . lc) = ((le. l ) (le. c) = e ? x blk fB (e. l ) (le. lc) = ((le. cdB ) = fB (e. (dcl . Variable)] type GlobalEnv = (BlockEnv . l ) (le. []) dcl x (e. cd ) where ((leS . Code) block2Code = foldBlock ((cons. cd ) where ((leB . (l . ab2ac :: Block → Code ab2ac b = [Enter (0. c)] + + where 101 . l ) (le. cdB ) = fB (leB . lc). l ) (leS . c)] (l .

which gives an abstract syntax for mirror-palindromes.Enter (1.9. Define the type MirAlgebra that describes the semantic actions that correspond to the syntactic constructs of Mir . Dcl "x".Access (0. which describes how semantic actions that correspond to the syntactic constructs of Mir should be applied.Leave (1.11.Leave (0. 102 .0) .2) .Access (0. Blk [Use "z". Define the type ParityAlgebra that describes the semantic actions that correspond to the syntactic constructs of Parity.10.1). Define the function foldMir . 1. Compositionality ((e.2). which gives an abstract syntax for parity-sequences. Define the parser pfoldMir . Define the function foldPal . 2. 5. Use "y". Define the function a2cParity as foldParity. which describes how the semantic actions that correspond to the syntactic constructs of Parity should be applied. which interprets its input in an arbitrary semantic MirAlgebra without building the intermediate abstract syntax tree. Define the functions a2cMir and m2pMir as foldMir ’s. Use "x"] .1). c).22.Access (0. 3.2). Exercises Exercise 5. 3.Access (1.24. 4. 2.23.0). cd ) = block2Code b (e. Define the functions a2cPal and aCountPal as foldPal ’s. 1. 3. Define a type PalAlgebra that describes the type of the semantic actions that correspond to the syntactic constructs of Pal . 2. Exercise 5. Dcl "z". 0) ([]. 4.5.2)] 5. respectively. Exercise 5. which describes how the semantics actions that correspond to the syntactic constructs of Pal should be applied.1). Define the function foldParity. Describe the parsers pfoldPal m1 and pfoldPal m2 where m1 and m2 correspond to the algebras of a2cPal and aCountPal respectively. Dcl "y". 5. which gives an abstract syntax for palindromes. 1. Describe the parsers pfoldMir m1 and pfoldMir m2 where m1 and m2 correspond to the algebras of a2cMir and m2pMir . Consider your answer to Exercise 2. Define the parser pfoldPal which interprets its input in an arbitrary semantic PalAlgebra without building the intermediate abstract syntax tree. Dcl "x" .Access (1. Use "y"] ? ab2ac aBlock [Enter (0. Consider your answer to Exercise 2. Consider your answer to exercise 2. 0) aBlock = [ Use "x".6.

Exercises Exercise 5. Define the type BitListAlgebra that describes the semantic actions that correspond to the syntactic constructs of BitList. which converts an abstract block into a concrete one.13. Define the function foldBitList.5. Define the type BlockAlgebra that describes the semantic actions that correspond to the syntactic constructs of Block . 4.25. which interprets its input in an arbitrary semantic BitListAlgebra without building the intermediate abstract syntax tree. Write a2cBlock as a foldBlock 5. 1.12. Define the function foldBlock . 4. 3. Define the parser pfoldBitList. 2. What is the abstract representation of x.S R|ε →D |U |N →x|y →X|Y →(B ) (block ) (rest) (statement) (declaration) (usage) (nested block ) 1. which describes how the semantic actions corresponding to the syntactic constructs of Block should be applied. The following grammar describes the concrete syntax of a simple blockstructured programming language B R S D U N →S R →. Exercise 5.Y). Define the function a2cBitList as a foldBitList.6. The function checkBlock tests whether or not each variable of a given abstract block is declared before use (declared in the same or in a surrounding block). Consider your answer to Exercise 2. which describes how the semantic actions that correspond to the syntactic constructs of BitList should be applied. 3.X? 2. 103 . Define the function a2cBlock . which gives an abstract syntax for bit-lists. Define a datatype Block that describes the abstract syntax that corresponds to the grammar.(y.

Compositionality 104 .5.

and computes the desired result by applying a fold to the abstract syntax. These functions are obtained by inserting different functions in the basic parser. apply a fold to the abstract syntax. This approach works fine for a small parser.2. Apply a fold to the abstract syntax Instead of inserting operations in a basic parser.2. and then applying a function to the abstract syntax that computes the net travelling time.1. For example. we can write a parser that parses the input to an abstract syntax. Insert a semantic function in the parser In Chapter 4 we have defined two parsers: a parser that computes the abstract syntax for a travelling schema. both compute the maximum nesting depth in a string of parentheses. 6. If we want to compute the total travelling time. pass an algebra to the parser. the parsers for travelling schemes given in Chapter 4 return an abstract syntax. and a parser that computes the net travelling time. The net travelling time is computed directly by inserting the correct semantic functions. • it is difficult to locate all positions where semantic functions have to be inserted in the parser. or an integer that represents the net travelling time in minutes. but it has some disadvantages when building a larger parser: • semantics is intertwined with the parsing process. where we defined two functions with the same functionality: nesting and nesting’. Computing with parsers Parsers produce results. Function nesting is defined 105 . 6. An example of such an approach has been given in Section 5. use a class instead of abstract syntax. This section shows several ways to compute results using parsers: • • • • insert a semantic function in the parser.2. Another way to compute the net travelling time is by first computing the abstract syntax. we have to insert different functions in the basic parser.6.

3. Deforestation Deforestation removes intermediate trees in computations.6. which is ‘flattened’ by the fold.. which does recursion over an abstract syntax tree. which returns an abstract syntax tree. The advantage of the second definition (nesting’) is that we reuse both the parser parens.. Each of these definitions has its own merits. 106 . which is used in the definition of depthParentheses). 6. To obtain the best of both worlds. parens parens :: Parser Char Parentheses = (\_ b _ d -> Match b d) <$> open <*> parens <*> close <*> parens <|> succeed Empty :: Parser Char Int = (\_ b _ d -> max (1+b) d) <$> open <*> nesting <*> close <*> nesting <|> succeed 0 :: Parser Char Int = depthParentheses <$> parens nesting nesting nesting’ nesting’ The first definition (nesting) is more efficient. On the other hand. We want to avoid building the abstract syntax tree altogether. and the function depthParentheses (or the function foldParentheses. This section sketches the general idea. Some (very few) compilers are clever enough to perform this transformation automatically. The previous section gives an example of deforestation on the datatype Parentheses. Computing with parsers by inserting functions in the basic parser. because it does not build an intermediate abstract syntax tree. we would like to write function nesting’ and have our compiler figure out that it is better to use function nesting in computations. it might be more difficult to write because we have to insert functions in the correct places in the basic parser. The disadvantage of the second definition is that it builds an intermediate abstract syntax tree. Suppose we have a datatype AbstractTree data AbstractTree = . The only thing we have to write ourselves in the second definition is the depthParenthesesAlgebra. we repeat the main arguments below. Function nesting’ is defined by applying a fold to the abstract syntax. The automatic transformation of function nesting’ into function nesting is called deforestation (trees are removed).

For example. This class is used in a parser for parentheses: parens parens :: Parens a => Parser Char a = (\_ b _ d -> match b d) <$> open <*> parens <*> close <*> parens <|> succeed empty 107 . see Chapter 5. Using a class instead of abstract syntax From this datatype we construct an algebra and a fold.6. Using a class instead of abstract syntax Classes can be used to implement the deforestated or fused computation of a fold with a parser. The following two sections each describe a way to implement such a deforestated function. 6. and then computes some value by folding with an algebra f over this tree: p = foldAbstractTree f .4.. type AbstractTreeAlgebra a = .. foldAbstractTree :: AbstractTreeAlgebra a -> AbstractTree -> a A parser for the datatype AbstractTree (which returns a value of AbstractTree) has the following type: parseAbstractTree :: Parser Symbol AbstractTree where Symbol is some type of input symbols (for example Char). Suppose now that we define a function p that parses an AbstractTree. parseAbstractTree Then deforestation says that p is equal to the function parseAbstractTree in which occurrences of the constructors of the datatype AbstractTree have been replaced by the corresponding components of the algebra f. for the language of parentheses. This gives a solution of the desired efficiency. we define the following class: class Parens a where match :: a -> a -> a empty :: a Note that types of the functions in the class Parens correspond exactly to the two types that occur in the type ParenthesesAlgebra.4.

instance Parens Parentheses match = Match empty = Empty where Now we can write: ?(parens :: Parser Char Parentheses) "()()" [(Match Empty (Match Empty Empty). (0. "()") .(Empty. This also immediately shows a problem of this.6. ""). Another instance of Parens can be used to compute the nesting depth of parentheses: instance Parens Int where match b d = max (1+b) d empty = 0 And now we can write: ?(parens :: Parser Char Int) "()()" [(1. we cannot use function parens to compute. (1. To obtain a function parens that returns a value of type Parentheses we create the following instance of the class Parens. the width (also an Int) of a string of parentheses.(Match Empty Empty. approach: it does not work if we want to compute two different results of the same type. "()()") ] Note that we have to supply the type of parens in this expression. 108 . So once we have defined the instance Parens Int as above. "") . for example. "()"). because Haskell doesn’t allow you to define two (or more) instances with the same type. otherwise elegant. we know nothing about their implementation. This is how we turn the implicit ‘class’ algebra into an explicit ‘instance’ algebra. otherwise Haskell doesn’t know which instance of Parens to use. "()()")] So the answer depends on the type we want our function parens to have. Computing with parsers The algebra is implicit in this function: the only thing we know is that there exist functions empty and match of the correct type.

Since this approach fails when we want to define different parsers with the same result type.5.0) breadth = parens (\b d -> d+1. breadth :: Parser Char Int nesting = parens (\b d -> max (1+b) d.5. Thus we obtain the following definition of parens: parens :: ParenthesesAlgebra a -> Parser Char a parens (match. Passing an algebra to the parser The previous section shows how to implement a parser with an implicit algebra.0) 109 .empty) = par where par = (\_ b _ d -> match b d) <$> open <*> par <*> close <*> par <|> succeed empty Note that it is now easy to define different parsers with the same result type: nesting. Passing an algebra to the parser 6. we make the algebras explicit.6.

6. Computing with parsers 110 .

Once this step has been taken. We will do so by constructing algebras out of other algebras. In this chapter we will illustrate a related formalism. This related formalism is the attribute grammar formalism. highly specific problem. and especially those programs that use functions that recursively descend over datatypes a lot. 111 . and one set of functions (an algebra) that describes what to compute at each node when visited. For each recursive datatype T we have a function foldT. programming language related. Chapter 5 introduces a language in which local declarations are permitted. but instead illustrate the formalism with some examples. these types are a guideline for further design steps. and give an informal definition. which looks a bit artificial.7. In this chapter we will present a view on recursive computations that will enable us to “design” the carrier type in an incremental way. When computing a property of such a recursive object (for example. We start with developing a somewhat unconventional way of looking at functional programs. The domain of that algebra is a higher order (data)type (a (data)type that contains functions). In this way we define the meaning of a language in a semantically compositional way. We will not formally define attribute grammars. which will make it easier to construct such involved algebras. We will start with the rep min example. and deals with a non-interesting. and for each constructor of the datatype we have a corresponding function as a component in the algebra. the resulting code comes as a surprise to many. We will see that such carrier types may be functions themselves. One of the most important steps in this process is deciding what the carrier type of the algebras is going to be. and to not distract our attention to specific. and that deciding on the type of such functions may not always be simple. In our case one may think about these datatypes as abstract syntax trees. Programming with higher-order folds Introduction In the previous chapters we have seen that algebras play an important role when describing the meaning of a recognised structure (a parse tree). However. Evaluating expressions in this language can be done by choosing an appropriate algebra. Unfortunately. it has been chosen for its simplicity. a program) we define two sets of functions: one set that describes how to recursively visit the nodes of the tree.

Goals In this chapter you will learn: • how to write ‘circular’ functional programs. a -> a -> a) foldTree :: TreeAlgebra a -> Tree -> a foldTree alg@(leaf.1. We will use this problem. Programming with higher-order folds data Tree = | Leaf Int Bin Tree Tree = deriving Show type TreeAlgebra a (Int -> a. bin) (Bin l r) = bin (foldTree alg l) (foldTree alg r) Listing 7. to build up an understanding of a whole class of such programs.hs semantic issues. The second example of this chapter demonstrates the techniques on a larger example: a small compiler for part of a programming language. We represent it by a type parameter of the algebra type: type TreeAlgebra a = (Int -> a. The rep min problem One of the famous examples in which the power of lazy evaluation is demonstrated is the so-called rep min problem [2].7. _ ) (Leaf i) = leaf i foldTree alg@(_ . since at first sight it seems that it is impossible to compute anything with this program. • how to combine algebras. together with its associated algebra.1: rm. Many have wondered how this program achieves its goal.1 we present the datatype Tree. The carrier type of an algebra is the type that describes the objects of the algebra.start. In Listing 7. 7. • (informally) the concept of an attribute grammar. and a sequence of different solutions. a -> a -> a) The associated evaluation function foldTree systematically replaces the constructors Leaf and Bin by corresponding operations from the algebra alg that is passed as an argument. or ‘higher-order folds’. 112 .

Notice how we have emphasized the fact that a function is returned through some superfluous notation: the first lambda in the function definitions constituting the algebra repAlg is required by the signature of the algebra. ?rep_min (Bin (Bin (Leaf 1) (Leaf 7)) (Leaf 11)) Bin (Bin (Leaf 1) (Leaf 1)) (Leaf 1) 7. but with the values in its leaves replaced by the minimal value occurring in the original tree. The rep min problem minAlg minAlg :: TreeAlgebra Int = (id. One of the disadvantages of this solution is that in the course of the computation the pattern matching associated with the inspection of the tree nodes is performed twice for each node in the tree.3 is an intermediate step towards this goal. min :: Int->Int->Int) rep_min :: Tree -> Tree rep_min t = foldTree repAlg t where m = foldTree minAlg t repAlg = (const (Leaf m).2. that is used in the tree constructing call of foldTree. we will try to construct a solution that calls foldTree only once. For example.sol1.hs We now want to construct a function rep_min :: Tree -> Tree that returns a Tree with the same “shape” as its argument Tree.1. and once for constructing the resulting Tree.2: rm.7. which could have been 113 . In this program the global variable m has been removed and the second call of foldTree does not construct a Tree anymore.1. Lambda lifting We want to obtain a program for the rep min problem in which pattern matching is used only once. Program Listing 7. Notice that the variable m is a global variable of the repAlg-algebra. The function rep_min that solves the problem in this way is given in Listing 7. but instead a function constructing a tree of type Int -> Tree.2.1. 7.1. Bin) Listing 7. Although this solution as such is no problem. which takes the computed minimal value as an argument. A straightforward solution A straightforward solution to the rep min problem consists of a function in which foldTree is used twice: once for computing the minimal value of the leaf values. the second lambda.

by tupling the results of the computations. Tupling computations We are now ready to formulate a solution in which foldTree is called only once. Programming with higher-order folds repAlg = ( \_ -> \m -> Leaf m .b) (leaf1. As a consequence we may perform both the computation of the tree constructing function and the minimal value in one go. is there because the carrier set of the algebra contains functions of type Int -> Tree. This function takes two TreeAlgebras as arguments and constructs a third one. 114 .4. which has as its carrier tuples of the carriers of the original algebras.bin2 (snd l) (snd r) ) ) min_repAlg :: TreeAlgebra (Int.hs infix 9 ‘tuple‘ tuple :: TreeAlgebra a -> TreeAlgebra b -> TreeAlgebra (a.\lfun rfun -> \m -> let lt = lfun m rt = rfun m in Bin lt rt ) rep_min’ t = (foldTree repAlg t) (foldTree minAlg t) Listing 7.sol3.hs omitted. This process is done routinely by functional compilers and is known as lambda-lifting. Int -> Tree) min_repAlg = (minAlg ‘tuple‘ repAlg) rep_min’’ t = r m where (m. First a function tuple is defined. bin2) = (\i -> (leaf1 i.1.sol2. bin1) ‘tuple‘ (leaf2.3: rm.4: rm.\l r -> (bin1 (fst l) (fst r) . leaf2 i) . The solution is given in Listing 7. r) = foldTree min_repAlg t Listing 7. Note that in the last solution the two calls of foldTree don’t interfere with each other. 7.3.7.

The parameters after the second lambda are there because we construct values in a higher order carrier set.5. which together achieve the same effect. is passed back as an argument to the result of (foldTree mergedAlg t). and returns as its result the cartesian product of the result types. which takes as its arguments the cartesian product of all the arguments of the functions in the tuple.1. The rep min problem 7. Notice however that there is nothing circular in this program. languages that require arguments to be evaluated completely before a call is evaluated (so-called strict evaluation in contrast to lazy evaluation). again introduced extra lambdas in the definition of the functions of the algebra. The resulting solution passes back part of the result of a call as an argument to that same call. Tree). in our case the value m. Lazy evaluation makes this possible. in an attempt to make the different rˆles of the parameters o explicit. especially to those who used to program in LISP or ML. As a consequence the argument of the new type is ((). Because of this surprising behaviour this class of programs became known as circular programs. we may transform that type into a new type in which we compute only one function. Thus we obtain the original solution as given in Bird [2]. and the result type becomes (Int. A curious step taken here is that part of the result. it is in general not possible to split this function into two functions of type a -> c and b -> d. The parameters after the first lambda are there because we deal with a TreeAlgebra. Int->Tree). Lazy evaluation makes this work. Finally. Int). and no value is defined in terms of itself (as in ones=1:ones). 115 . As in the rep min problem. Notice how we have. we have systematically transformed a program that inspects each node twice into an equivalent program that inspects each node only once. The front of a tree is the list of all nodes that are at the deepest level. into an equivalent type Int -> (Int.1. b) -> (c. We want to mention here too that the reverse is in general not true. d).7.4. Merging tupled functions In the next step we transform the type of the carrier set in the previous example. Each value is defined in terms of other values. Tree). the trees involved are elements of the datatype Tree. This transformation is not essential here.6 shows the version of this program in which the function foldTree has been unfolded. Exercise 7. A straightforward solution is to compute the height of the tree and passing the result of this function to a function frontAtLevel :: Tree -> Int -> [Int]. (Int. but we use it to demonstrate that if we compute a cartesian product of functions. Concluding.1. That such programs were possible came originally as a great surprise to many functional programmers.1. In our example the computation of the minimal value may be seen as a function of type ()->Int. The deepest front problem is the problem of finding the so-called front of a tree. see Listing 7. which is isomorphic to just Int. The new version is given in Listing 7. Listing 7. given a function of type (a.

lt) = lfun m (rm. Leaf m) . lt) = tree l m (rm.sol4.Tree)) mergedAlg = (\i -> \m -> (i.sol5. rt) = tree r m in (lm ‘min‘ rm.5: rm.rt) = rfun m in (lm ‘min‘ rm .6: rm.hs 116 .hs rep_min’’’’ t = r where (m. Programming with higher-order folds mergedAlg :: TreeAlgebra (Int -> (Int. Bin lt rt) Listing 7. r) = tree t m tree (Leaf i) = \m -> (i. Bin lt rt ) ) rep_min’’’ t = r where (m. r) = (foldTree mergedAlg t) m Listing 7.\lfun rfun -> \m -> let (lm.7. Leaf m) tree (Bin l r) = \m -> let (lm.

it can call the function on the argument. A language with just these constructs is useless. An abstract syntax for our language is given in Listing 7. A stack machine In section 5. Exercise 7. Note that we use a single datatype for the abstract syntax instead of three datatypes (one for each nonterminal).8. We take the following context-free grammar for the concrete syntax of the language.4 we have defined a stack machine with which simple arithmetic expressions can be evaluated.7 also contains a definition of a fold and an algebra type for the abstract syntax.1. 7. this simplifies the code a bit. A small compiler 1. Here we define a stack machine that has some more instructions. A parser for expressions is given in Listing 7.7. The language of the previous section wil be compiled into code for this stack machine in the following section. 117 . The Listing 7. 7. and an if-then-else expression. The compiler compiles this code into code for a hypothetical stack machine. The language The language we consider in this section has integers.2.2.2. booleans. • it can load a boolean.4. Define the functions height and frontAtLevel 2. which make the language a bit more interesting.2.2. and Bool booleans. function application. A small compiler This section constructs a small compiler for (a part of) a small language. Expr0 → if Expr1 then Expr1 else Expr1 | Expr1 Expr1 → Expr2 Expr2∗ Expr2 → Int | Bool where Int generates integers. 7.2. Redo the previous exercise for the highest front problem. • given an argument and a function on the stack. Give the four different solutions as defined in the rep min problem. and you will extend the language in the exercises with some other constructs.7. The stack machine we will use has the following instructions: • it can load an integer.

Int -> a . Programming with higher-order folds data ExprAS = | | | If ExprAS ExprAS ExprAS Apply ExprAS ExprAS ConInt Int ConBool Bool deriving Show type ExprASAlgebra a = (a -> a -> a -> a . it can jump to a label provided the boolean is false.conint.a -> a -> a . compile (ConBool b) = [LoadBool b] 118 .9.apply.conbool) = fold where fold (If ce te ee) = iff (fold ce) (fold te) (fold ee) fold (Apply fe ae) = apply (fold fe) (fold ae) fold (ConInt i) = conint i fold (ConBool b) = conbool b Listing 7.Bool -> a ) foldExprAS :: ExprASAlgebra a -> ExprAS -> a foldExprAS (iff. • it can jump to a label (unconditionally).2. Compiling to the stackmachine How do we compile the different expressions to stack machine code? We want to define a function compile of type compile :: ExprAS -> [InstructionSM] • A ConInt i is compiled to a LoadInt i. 7. The datatype for instructions is given in Listing 7.3.7: ExprAbstractSyntax. compile (ConInt i) = [LoadInt i] • A ConBool b is compiled to a LoadBool b.hs • it can set a label in the code. • given a boolean on the stack.7.

9: InstructionSM.8: ExprParser.hs data InstructionSM = | | | | | LoadInt Int LoadBool Bool Call SetLabel Label BrFalse Label BrAlways Label type Label = Int Listing 7.2.hs 119 .7. A small compiler sptoken :: String -> Parser Char String sptoken s = (\_ b _ -> b) <$> many (symbol ’ ’) <*> token s <*> many1 (symbol ’ ’) boolean = const True <$> token "True" <|> const False <$> token "False" parseExpr :: Parser Char ExprAS parseExpr = expr0 where expr0 = (\a b c d e f -> If b d f) <$> sptoken "if" <*> parseExpr <*> sptoken "then" <*> parseExpr <*> sptoken "else" <*> parseExpr <|> expr1 expr1 = chainl expr2 (const Apply <$> many1 (symbol ’ ’)) <|> expr2 expr2 = ConBool <$> boolean <|> ConInt <$> natural Listing 7.

and we want function compile to return the first unused label.l) \l -> let (xc. but where do these labels come from? From the above description we see that we also need labels when compiling an expression.l’’) = compile te l’ (ec. we compile the else expression ee.l’) = compile ce (l+2) (tc. Programming with higher-order folds • An application Apply f x is compiled by first compiling the argument x. compile (If ce te ee) = ++ ++ ++ ++ ++ ++ compile ce [BrFalse ?lab1] compile te [BrAlways ?lab2] [SetLabel ?lab1] compile ee [SetLabel ?lab2] Note that we use labels here. Then we jump to a label (which will be set before the code of the else expression ee later) if the resulting boolean is false. We obtain the following definition of compile: compile (ConInt i) compile (ConBool b) compile (Apply f x) = = = \l -> ([LoadInt i].Label) = Int type Label The four cases in the definition of compile have to take care of the labels. and finally putting a Call on top of the stack. hence the quotes around ‘function’).l’’) \l -> let (cc.l) \l -> ([LoadBool b]. and.l’) = compile x l (fc.l’’) = compile f l’ in (xc ++ fc ++ [Call]. Then we set the label for the else expression. We change the type of function compile as follows: compile :: ExprAS -> Label -> ([InstructionSM]. then the ‘function’ f (at the moment it is impossible to define functions in our language. Then we compile the then expression te. for which we need another label. We add a label argument (an integer. we set the label for the end of the if-then-else expression. used for the first label in the compiled code) to function compile.l’’’) = compile ee l’’ compile (If ce te ee) = 120 . compile (Apply f x) = compile x ++ compile f ++ [Call] • An if-then-else expression If ce te ee is compiled by first compiling the conditional expression ce. finally. After the then expression we always jump to the end of the code of the if-then-else expression.7.

10: CompileExpr.l’’’) = cee l’’ in ( cc ++ [BrFalse l] ++ tc ++ [BrAlways (l+1)] ++ [SetLabel l] ++ ec ++ [SetLabel (l+1)] .hs in ( cc ++ [BrFalse l] ++ tc ++ [BrAlways (l+1)] ++ [SetLabel l] ++ ec ++ [SetLabel (l+1)] . Exercise 7.\i -> \l -> ([LoadInt i].\cf cx -> \l -> let (xc.l) .Label)) compileAlgebra = (\cce cte cee -> \l -> let (cc.l’’’ ) Function compile is a fold.l’’) = cte l’ (ec.l’’) = cf l’ in (xc ++ fc ++ [Call]. The definition of function compile as a fold is given in Listing 7.Label).10.\b -> \l -> ([LoadBool b].l’’’ ) . Exercise 7.l’) = cx l (fc.3 (no answer provided).2. 121 . Extend the code generation example by adding definitions to the datatype Expr too. A small compiler compile = foldExprAS compileAlgebra compileAlgebra :: ExprASAlgebra (Label -> ([InstructionSM].l’) = cce (l+2) (tc. the carrier type of its algebra is a function of type Label -> ([InstructionSM].7.4 (no answer provided). Extend the code generation example by adding variables to the datatype Expr.l’’) .l) ) Listing 7.

The minimum is computed bottom-up. The minimum is computed bottomup: it is synthesized from its children. But. We can see the rep min computation as a computation on a value of type Tree. Just as we have seen how by introducing a suitable set of parsing combinators one may avoid the use of a special parser generator and even gain a lot of flexibility in extending the grammatical formalism by introducing more complicated combinators. These functions receive the minimum value from their parent tree node: they inherit the minimum value from their parent.3. Although they look different from what we have encountered thus far and are probably a little easier to write. This program computes the minimum of a tree.7. The programs you have seen in this chapter could also have been obtained by means of such a mapping from an attribute grammar specification. they can straightforwardly be mapped onto a functional program. on which two attributes are defined: the minimum and result tree attributes. and was originally proposed by Donald Knuth in [9]. Traditionally such attribute grammars are used as the input of a compiler generator. 122 . The formalism in which it is possible to specify such attributes and computations on datatypes or grammars is called attribute grammars.1 we have written a program that solves the rep min problem. The minimum value is then passed on to the functions that build the tree with the minimum value in its leaves. and is therefore a synthesized and inherited attribute. Attribute grammars provide a solution for the systematic description of the phases of the compiler that come after scanning and parsing. just as the concept of a context free grammar was useful in understanding the fundamentals of parser combinators. and it computes the tree in which all the leaf values are replaced by the minimum value. understanding attribute grammars will help significantly in describing the semantic part of the recognition and compilation process. Programming with higher-order folds 7. and is then passed down to the result tree. but they will appear again in the course in implementing programming languages. Attribute grammars In Section 7. This chapter does not further introduce attribute grammars. and is hence a synthesized attribute. we have shown how one can do without a special purpose attribute grammar processing system. The result tree is computed bottomup.

• finite-state automata appear in two. equally expressive. and is organised as follows. A finite-state automaton can be viewed as a simple form of digital 123 . This chapter discusses all of these concepts. Finite-state automata The classical approach to recognising sentences from a regular language uses finitestate automata. punctuation.1.1 introduces finite-state automata. and splits the input into a list of terminal symbols: keywords. Furthermore. Regular Languages Introduction The first phase of a compiler takes an input program.3 introduces regular expressions as finite descriptions of regular languages and shows that such expressions are another. A regular grammar is a particular kind of context-free grammar that can be used to describe regular expressions. Section 8. Finite-state automata can be used to recognise sentences of regular grammars. Finite-state automata appear in two versions. nondeterministic and deterministic ones. finite-state automata and regular expressions are three different tools for regular languages. versions: deterministic and nondeterministic. Section 8.2 introduces regular grammars (context-free grammars of a particular form). Goals After you have studied this chapter you will know that • regular languages are a subset of context-free languages. finite-state automata and regular expressions have the same expressive power. • regular grammars.4 shows that a nondeterministic finite-state automaton can be transformed into a deterministic finitestate automaton. • it is not always possible to give a regular grammar for a context-free grammar. identifiers. equivalent. Section 8. and regular languages. tool for regular languages. Finally. so you don’t have to worry about whether or not your automaton is deterministic. etc. it shows their equivalence with finite-state automata. Section 8. Regular expressions are used for the description of the terminal symbols.4 gives some of the proofs of the results of the previous sections. 8. Section 8. • regular grammars. numbers.1.8.

8.1. For automaton M0 above. accepting states get a double circle. and constant space. consider the DFA M0 = (X. We start with a description of deterministic finite-state automata. B. a finite-state automaton is more comprehensible in a graphical representation. the simplest form of automata. A. no temporary storage. The transition function is represented by the edges: whenever d Qi x is a state Qj . but a useful tool in many practical subproblems. Regular Languages computer with only a finite number of states. Q. an input file but only the possibility to read it. start states are explicitly mentioned or indicated otherwise. The following representation is customary: states are depicted as the nodes in a graph. A deterministic finitestate automaton (DFA) is a 5-tuple (X. can be implemented by means of very efficient programs. Q is a finite set of states. A finitestate automaton can easily be implemented by a function that takes time linear in the length of its input.1. S ∈ Q is the start state. This implies that problems that can be solved by means of a finite-state automaton. then there is an arrow labelled x from Qi to Qj . DFA). Definition 8. S. d. Q. C} F = {C} where state transition function d is defined by dSa = C dSb = A dSc = S dAa = B dBc = C For human beings. b. d :: Q → X → Q is the state transition function. and a control unit which records the state transitions. deterministic finite-state automaton As an example. c} Q = {S.1 (Deterministic finite-state automaton. Deterministic finite-state automata Finite-state automata come in two flavours: deterministic and nondeterministic. F ) where • • • • • X is the input alphabet. S. F ) with X = {a. d. F ⊆ Q is the set of accepting states. 8. A rather limited medium. The input alphabet is implicit in the labels. the pictorial representation is: 124 .

-. The predicate will be derived in a top-down fashion. To illustrate the action of an DFA. d.8. We do so by recording the successive configurations. When the complete input has been processed and the DFA is in one of its accepting states. F ). the pairs of current state and remaining input values. bac) → (A. Informally. ()*+ / ?>=< C O c Note that d is a partial function: for example d B a is not defined. We can make d into a total function by introducing a new ‘sink’ state. and d E x = D for all states E and terminals x for which d E x is undefined. 125 . when starting the DFA in S. i. Finite-state automata @A GFED 89:. Q. This operational description of acceptance is formalised in the predicate dfa accept. / ?>=< S c a b a 89:. the result state of all undefined transitions. (S. The action of a DFA on an input string is described as follows: given a sequence w of input symbols. A 89:. If no move is possible. to end in an accepting state after processing w. The sink state and the transitions from/to it are almost always omitted.e.e. we will show that the sentence bac is accepted by M0 . a sequence w ∈ X ∗ is accepted by a DFA (X. w can be ‘processed’ symbol by symbol (from left to right) and — depending on the specific input symbol — the DFA (initially in the start state) moves to the state as determined by its state transition function. S. if it is possible. i. in the above automaton we can introduce a sink state D with d D x = D for all terminals x. the automaton blocks. we formulate the predicate in terms of (“smaller”) subcomponents and afterwards we give solutions to the subcomponents.1. )  ?>=< 89:. For example. /. ac) → (B. then we say that w is accepted by the automaton. c) → (C. / ?>=< B accept Because of the deterministic behaviour of a DFA the definition of acceptance by a DFA is relatively easy.

i. Regular Languages Suppose dfa is a function that reflects the behaviour of the DFA. Definition 8. the language of a nondeterministic finite-state automaton.8. a start state. F ).3 (Language of a DFA). S. The definition of dfa is straightforward dfa :: (Q → X → Q) → Q → X ∗ → Q dfa d q = q dfa d q (ax) = dfa d (d q a) x Note that both the type and the definition of dfa match the pattern of the function foldl . a function which given a transition function.e. dfa d q = foldl d q. returns the unique state that is reached after processing the string starting from the start state. d. 126 .1. i. and reflects the behaviour of a DFA. For DFA M = (X.e. The sequence w ∈ X ∗ is accepted by DFA (X. S. d. Q. Nondeterministic finite state automata are defined as follows. Sometimes it is convenient to have two or more edges labelled with the same terminal symbol from a state. Definition 8. S. Q. F ) = (dfa d S w) ∈ F It remains to construct a definition of function dfa that takes a transition function. F ) where dfa accept w (d. In these cases one can use a nondeterministic finitestate automaton. qs. {Q}) → Bool dfa accept w (d. fs) = dfa d qs w ∈ fs dfa d qs = foldl d qs Using the predicate dfa accept. a start state and a string. Ldfa(M ). S. S. is defined by Ldfa(M ) = {w ∈ X ∗ | dfa accept w (d. The transition function of a DFA returns a state. which implies that for all terminal symbols x and for all states t there can only be one edge starting in t labelled with x. Nondeterministic finite-state automata This subsection introduces nondeterministic finite-state automata and defines their semantics. F ) if dfa accept w (d. and a list of input symbols.2 (Acceptance by a DFA). Then the predicate dfa accept is defined by: dfa accept :: X ∗ → (Q → X → Q. the language of a DFA is defined as follows. and it follows that the function dfa is actually identical to foldl . the language of M . F )} 8. Q.2.

this NFA is defined as M1 = (X. where • • • • • X is the input alphabet. / ?>=< B where state transition function d is defined by d S a = {C} d S b = {A} d S c = {S} d A a = {S. ()*+ / ?>=< C O c a 89:.1. Formally. B. d. Q. which can be made total by adding d D x = { } for all states D and all terminal symbols x for which d D x is undefined.4 (Nondeterministic finite-state automaton. Thus the DFA becomes an NFA. c} Q = {S. F ) with X = {a. nondeterministic finite-state automaton An NFA differs from a DFA in that there may be more than one start state and that there may be more than one possible move for each state and input symbol. Q is a finite set of states.8. Finite-state automata Definition 8. /. Q. F ⊆ Q is the set of accepting states. Q0 . 127 . F ). b. C} Q0 = {S} F = {C}  BC ?>=< 89:. NFA). d :: Q → X → {Q} is the state transition function. A nondeterministic finite-state automaton (NFA) is a 5-tuple (X. A. Q0 . Q0 ⊆ Q is the set of start states. / ?>=< o S c Note that this NFA is very similar to the DFA in the previous section: the only difference is that there are two outgoing arrows labelled with a from state A. A ba ED a 89:. d. B} d B c = {C} Again d is a partial function.-. Here is an example of an NFA: @A GFED 89:.

fs) = nfa d qs w ∩ fs = ∅ nfa d qs = foldl (deltas d) qs deltas d qs a = {r | q ∈ qs. sequence w ∈ X ∗ is accepted by NFA (X.8. Q. That is a function which given a transition function. Q0 . F ) if nfa accept w (d. it follows that nfa can be written as a foldl. to end in an accepting state after processing w. Q0 . In summary we have derived Definition 8. Assume that we have a function. Q0 . we have to be careful in defining what it means that a sequence is accepted by an NFA. {Q}. the language of an NFA is defined by 128 . d. F ). returns all possible states that can be reached after processing the string starting in some start state. F ) where nfa accept w (d. Q0 . a set of start states and a string. called deltas. The sequence w ∈ X ∗ is accepted by NFA (X. F ) = nfa d Q0 w ∩ F = ∅ Now it remains to find a function nfa d qs of type X ∗ → {Q} that reflects the behaviour of the NFA. say nfa. Informally.5 (Acceptance by an NFA). which reflects the behaviour of the NFA. Regular Languages Since an NFA can make an arbitrary (nondeterministic) choice for one of its possible moves. if it is possible. For lists of length 1 such a function. is defined by deltas :: (Q → X → {Q}) → {Q} → X → {Q} deltas d qs a = {r | q ∈ qs. Q. qs. Then the predicate nfa accept can be expressed as :: X ∗ → (Q → X → {Q}. when starting the NFA in a state from Q0 . nfa d qs = foldl (deltas d) qs This concludes the definition of predicate nfa accept. This operational description of acceptance is formalised in the predicate nfa accept. d. {Q}) → Bool nfa accept nfa accept w (d. r ∈ d q a} Using the nfa accept-predicate. r ∈ d q a} The behaviour of the NFA on X-sequences of arbitrary length follows from this “one step” behaviour: nfa :: (Q → X → {Q}) → {Q} → X ∗ → {Q} nfa d qs = qs nfa d qs (ax) = nfa d (deltas d qs a) x Again.

.3. is defined by Lnfa(M ) = {w ∈ X ∗ | nfa accept w (d. Furthermore. 8. Q0 . deriving Eq where the states of M (the elements of the set Q) are listed as constructors of StateM.8. d.1.1. Implementation This section describes how to implement finite state machines. d. Q. for each nondeterministic finite-state automaton there exists a deterministic finitestate automaton that accepts the same language. Q0 . we define two datatypes: data StateM data SymbolM = = ..1. so from a computational view. F )} Note that it is computationally expensive to determine whether or not a list is an element of the language of a nondeterministic finite-state automaton. The extended transition function dfa and the accept function dfaAccept are now defined by: dfa dfa :: [SymbolM] -> StateM = foldl (flip delta) start :: [SymbolM] -> Bool = elem (dfa xs) finals dfaAccept dfaAccept xs 129 . Given a DFA M = (X. we define three values (one of which a function): start delta finals :: :: :: StateM SymbolM -> StateM -> StateM [StateM] Note that the first two arguments of delta have changed places: this has been done in order to be able to apply ‘partial evaluation’ later. Fortunately. We will show how to construct a DFA from an NFA in subsection 8. S. Lnfa(M ). . and the symbols of M (the elements of the set X) are listed as constructors of SymbolM. the language of M . Q. Determining whether or not a list is an element of the language of a deterministic finite-state automaton can be done in time linear in the length of the input list.6 (Language of an NFA). This is due to the fact that all possible transitions have to be tried in order to determine whether or not the automaton can end in an accepting state after reading the input. deterministic finite-state automata are preferable. For NFA M = (X. F ).. F )..4. Finite-state automata Definition 8. We start with implementing DFA’s.

data StateMEX data SymbolMEX = = A | B | C | S SA | SB | SC deriving Eq So the state A is represented by A. delta x2 (delta x1 start).. we give the implementation of the example DFA (called MEX here) given in the previous subsection.. delta x1 start...x2.x2. we introduce the following class: class Eq a => start :: delta :: finals :: dfa dfa DFA a b where a b -> a -> a [a] :: [b] -> a = foldl (flip delta) start :: [b] -> Bool = elem (dfa xs) finals dfaAccept dfaAccept xs Note that the functions dfa and dfaAccept are defined once and for all for all instances of the class DFA. Since we want to use the same function names for different automata....xn].. The automaton is made an instance of class DFA as follows: instance DFA StateMEX SymbolMEX start = S delta x S = where delta SA A delta SC B finals = = = [C] case x of SA -> C SB -> A SC -> S B C 130 . and the symbol a is represented by SA.xn] and the start state. Regular Languages Given a list of symbols [x1. and similar for the other states and symbols. the computation of dfa [x1...xn] uses the following intermediate states: start.x2.. As an example.. This list of states is determined uniquely by the input [x1....8.

.xn] = foldl (flip delta) (delta x1 start) [x2. we can define a transition function for each state: dfaS. dfa [x1...xn] = foldl (flip delta) start [x1. 131 .. dfaB.8.. A. dfaS [] dfaS (x:xs) = = S case x of SA -> dfaC xs SB -> dfaA xs SC -> dfaS xs A case x of SA -> dfaB xs B case x of SC -> dfaC xs C dfaA [] dfaA (x:xs) dfaB [] dfaB (x:xs) dfaC [] = = = = = With this definition of the finite automaton...xn] = case x1 of SA -> foldl (flip delta) (delta SA start) [x2....xn] SC -> foldl (flip delta) (delta SC start) [x2. Finite-state automata We can improve the performance of the automaton (function) dfa by means of partial evaluation... Since there are only a finite number of states (four.x2... the number of steps required for computing the value of dfaS xs for some list of symbols xs is reduced considerably... to be precise). Note that the first argument of foldl is always flip delta. dfaC :: [Symbol] -> State partial evaluation Each of these functions is a case expression over the possible input symbols.. The main idea of partial evaluation is to replace computations that are performed often at run-time by a single computation that is performed only once at compile-time. dfaA.xn] SB -> foldl (flip delta) (delta SB start) [x2..x2. B.xn] All these equalities are simple transformation steps for functional programs.. A very simple example is the replacement of the expression if True then f1 else f2 by the expression f1.... Partial evaluation applies to finite automatons in the following way.1. and the second argument is one of the four states S... or C (the result of delta).

Constructing a DFA from an NFA Is it possible to express more languages by means of nondeterministic finite-state automata than by deterministic finite-state automata? For each nondeterministic automaton it is possible to give a deterministic finite-state automaton such that both automata accept the same language. 132 .4. This is a problem if the two classes appear in the same module. Before we give the formal proof of this claim. class Eq a => start :: delta :: finals :: nfa nfa NFA a b where [a] b -> a -> [a] [a] :: [b] -> [a] = foldl (flip deltas) start deltas :: b -> [a] -> [a] deltas a = union . in which we use some names that also appear in the class DFA. map (delta a) nfaAccept nfaAccept xs :: [b] -> Bool = intersect (nfa xs) finals /= [] Here. so the answer to the above question is no. functions union and intersect are implemented as follows: union union nub nub :: Eq a => [[a]] -> [a] = nub . concat :: Eq a => [a] -> [a] = foldr (\x xs -> x:filter (/=x) xs) [] intersect :: Eq a => [a] -> [a] -> [a] intersect xs ys = intersect’ (nub xs) where intersect’ = foldr (\x xs -> if x ‘elem‘ ys then x:xs else xs) [] 8. Regular Languages The implementation of NFA’s is similar to the implementation of DFA’s. The only difference is that the transition and accept functions have to take care of sets (lists) of states now. we illustrate the construction of a DFA for an NFA in an example. We will use the following class.8.1.

1.-. @A GFED 89:. @A GFED 89:. 89:. /. / ?>=< B A D O @A BC b c a 89:. Since D is a merge of the states S and B we have to merge the outgoing arcs from S and B into outgoing arcs of D.-. /. @A GFED ?>=< / 89:. and remove the two outgoing arcs labelled c from D. Finite-state automata Consider the nondeterministic finite-state automaton corresponding with the example grammar of the previous subsection. /. We obtain the following automaton. o S c The nondeterminicity of this automaton appears in state A: two outgoing arcs of A are labelled with an a. ()*+ / ?>=< C euu {= F O uu uu {{ { uu {{  uu uu a {{{a  c u b {{  c uuuu uu {{{{  u {u   {{ c uu  a 89:. Since E is a merge of the states C and S we have to merge the outgoing arcs from C and S into outgoing arcs of E. with an arc from A to D labelled a. ()*+ / ?>=< C O c a 89:. / ?>=< S a 89:. this automaton is still nondeterministic: there are two outgoing arcs labelled with c from D. ?>=< 89:.c  c c b  c ccc cc  cc    a 89:. /.-.8. We apply the same procedure as above.-. We obtain the following automaton. ?>=< 89:. ?>=< / ?>=< / ?>=< A D E B O ED b @A BC BC b 133 . ()*+ 89:. and we remove the arcs labelled a from A to S and from A to B. ()*+ / ?>=< C _cc ? O cc  cc  cc a. / ?>=< S  BC 89:. Although there is just one outgoing arc from A labelled with a. ?>=< 89:. ?>=< A ba ED a 89:. Add a new state E and an arc labelled c from D to E. / ?>=< B c We omit the proof that the language of the latter automaton is equal to the language of the former one. Suppose we add a new state D.

For example. Suppose M = (X. Regular Languages Again. accepts the same language. Note that in this automaton for each state. subs :: {X} → {{X}} subs {A. d. Then the finite-state automaton M = (X . this automaton is deterministic. B} = {{ }. E} where transition function d is defined by dSa = C dSb = A dSc = S dAa = D dBc = C dDa = C dDb = A dDc = E dEa = C dEb = A dEc = S This construction is called the ‘subset construction’. The DFA constructed from the NFA above is the 5-tuple (X. A. i. b. {B}} For the other components of M we define d q a = {t | t ∈ d r a. {A}. State E is added to the set of accepting states because it is the merge of a set of states among which at least one belongs to the set of accepting states. c} Q = {S. provided we add the state E to the set of accepting states. Q0 . F ) is a nondeterministic finite-state automaton. the construction works as follows. which until now consisted just of state C. r ∈ q} Q0 = {Q0 } F = {p | p ∩ F = ∅. {A. E} F = {C. p ∈ Q } 134 . X Q = X = subs Q where subs returns all subsets of a set. Q . we do not prove that the language of this automaton is equal to the language of the previous automaton. Q0 . D. In general. F ). the components of which are defined below. Q. subs Q is also called the powerset of Q.8. d .e. d. all outgoing arcs are labelled differently. S. B. F ) with X = {a. C. Q. B}.

[A]. [B. nfaA [] nfaA (x:xs) = = A case x of SA -> nfaBS xs _ -> error "no transition possible" :: [Symbol] -> [State] 135 .1. Finite-state automata The proof of the following theorem is given in Section 8.x2.S] (the possible results of deltas). nfaCS For example. nfaA. [B]....7 enables us to freely switch between NFA’s and DFA’s.. For every nondeterministic finite-state automaton M there exists a finite-state automaton M such that Lnfa(M ) = Ldfa(M ) Theorem 8..xn] = case x1 of SA -> foldl (flip deltas) (deltas SA start) [x2..8. Since there are only a finite number of results of deltas (six. but also by means of partial evaluation. nfa [x1. to be precise).xn] SB -> foldl (flip deltas) (deltas SB start) [x2.. nfaBS. Just as for function dfa.... 8. and the second argument is one of the six sets of states [S]. Partial evaluation of NFA’s Given a nondeterministic finite state automaton we can obtain a deterministic finite state automaton not just by means of the above construction. we can define a transition function for each state: nfaS.4...xn] = foldl (flip deltas) (deltas x1 start) [x2. Equipped with this knowledge we continue the exploration of regular languages in the following section.S].5.. we can calculate as follows with function nfa.1.. But first we show that the transformation from an NFA to a DFA is an instance of partial evaluation.xn] SC -> foldl (flip deltas) (deltas SC start) [x2.xn] Note that the first argument of foldl is always flip deltas. Theorem 8...x2. [C].xn] = foldl (flip deltas) start [x1.... [C.. nfaC.7 (DFA for NFA).... nfaB..

Each of these functions is a case expression over the possible input symbols. By partially evaluating the function nfa we have obtained a function that is the implementation of the deterministic finite-state automaton corresponding to the nondeterministic finite-state automaton.

8.2. Regular grammars

This section defines regular grammars, a special kind of context-free grammars. Subsection 8.2.1 gives the correspondence between nondeterministic finite-state automata and regular grammars.

Definition 8.8 (Regular Grammar). A regular grammar G is a context-free grammar (T, N, R, S) in which all production rules in R are of one of the following two forms:

A → xB
A → x

with x ∈ T∗ and A, B ∈ N.

So in every rule there is at most one nonterminal, and if there is a nonterminal present, it occurs at the end. The regular grammars as defined here are sometimes called right-regular grammars. There is a symmetric definition for left-regular grammars.

Definition 8.9 (Regular Language). A regular language is a language that is generated by a regular grammar.

From the definition it is clear that each regular language is context-free. The question is now: Is each context-free language regular? The answer is: No. There are context-free languages that are not regular; an example of such a language is {a^n b^n | n ∈ N}. To understand this, you have to know how to prove that a language is not regular. Because of its subtlety, we postpone this kind of proof until Chapter 9. Here it suffices to know that regular languages form a proper subset of context-free languages and that we will profit from their speciality in the recognition process.

A first similarity between regular languages and context-free languages is that both are closed under union, concatenation and Kleene-star.

Theorem 8.10. Let L and M be regular languages, then

L ∪ M is regular
LM is regular
L∗ is regular

Proof. Let GL = (T, NL, RL, SL) and GM = (T, NM, RM, SM) be regular grammars for L and M respectively, then

• for regular grammars, the well-known union construction for context-free grammars is a regular grammar again;
• we obtain a regular grammar for LM if we replace, in GL, each production of the form T → x and T → ε by T → xSM and T → SM, respectively;
• since L∗ = {ε} ∪ LL∗, it follows from the above that there exists a regular grammar for L∗.

In addition to these closure properties, regular languages are closed under intersection and complement too. See the exercises. This is remarkable because context-free languages are not closed under these operations. Recall the language L = L1 ∩ L2 where L1 = {a^n b^n c^m | n, m ∈ N} and L2 = {a^n b^m c^m | n, m ∈ N}.

As for context-free languages, there may exist more than one regular grammar for a given regular language and these regular grammars may be transformed into each other. We conclude this section with a grammar transformation:

Theorem 8.11. For each regular grammar G there exists a regular grammar G′ with start-symbol S such that L(G) = L(G′) and such that G′ has no productions of the form U → V and W → ε, with W ≠ S.

In other words: every regular grammar can be transformed to a form where every production has a nonempty terminal string in its right hand side (with a possible exception for S → ε). The proof of this transformation is omitted; we only briefly describe the construction of such a regular grammar, and illustrate the construction with an example.

Given a regular grammar G, a regular grammar with the same language but without productions of the form U → V and W → ε, for all U, V, and all W ≠ S, is obtained as follows. First, consider all pairs Y, Z of nonterminals of G such that Y ⇒∗ Z. Add productions Y → z to the grammar, with Z → z a production of the original grammar, and z not a single nonterminal. Remove all productions U → V from G. Finally, for each production U → xW such that W → ε is a production, add the production U → x, and remove all productions of the form W → ε for W ≠ S. The following example illustrates this construction.

Consider the following regular grammar G.

S → aA
S → bB
S → A
S → C
A → bB
A → S
A → ε
B → bB
B → ε
C → c

The grammar G′ of the desired form is constructed in 3 steps.

Step 1. Let G′ equal G.

Step 2. Consider all pairs of nonterminals Y and Z such that Y ⇒∗ Z. Add the productions Y → z to G′, with Z → z a production of the original grammar, and z not a single nonterminal. Furthermore, remove all productions of the form U → V from G′.

In the example we remove the productions S → A, S → C, A → S, and we add the productions S → bB and S → ε since S ⇒∗ A, and the production S → c since S ⇒∗ C, and the productions A → aA and A → bB since A ⇒∗ S, and the production A → c since A ⇒∗ C. We obtain the grammar with the following productions.

S → aA
S → bB
S → c
S → ε
A → bB
A → aA
A → c
A → ε
B → bB
B → ε
C → c

This grammar generates the same language as G, and has no productions of the form U → V. It remains to remove productions of the form W → ε for W ≠ S.

Step 3. Remove all productions of the form W → ε for W ≠ S, and for each production U → xW such that W → ε was a production, add the production U → x. Applying this transformation to the above grammar gives the following grammar.

S → aA
S → bB
S → a
S → b
S → c
S → ε
A → bB
A → aA
A → a
A → b
A → c
B → bB
B → b
C → c

Each production in this grammar is of one of the desired forms: U → x or U → xV, and the language of the grammar G′ we thus obtain is equal to the language of grammar G.

8.2.1. Equivalence of Regular grammars and Finite automata

In the previous section, we introduced finite-state automata. Here we show that regular grammars and nondeterministic finite-state automata are two sides of one coin. The equivalence consists of two parts, formulated in the theorems 8.12 and 8.13 below. We will prove the equivalence using theorem 8.7. The basis for both theorems is the direct correspondence between a production A → xB and a transition from state A to state B labelled x.

Theorem 8.12 (Regular grammar for NFA). For each NFA M there exists a regular grammar G such that

Lnfa(M) = L(G)

Proof. We will just sketch the construction; the formal proof can be found in the literature. Let (X, Q, d, S, F) be a DFA for NFA M. Construct the grammar G = (X, Q, R, S) where

• the terminals of the grammar are the input alphabet of the automaton,
• the nonterminals of the grammar are the states of the automaton,
• the start state of the automaton is the start-symbol of the grammar,
• the productions of the grammar correspond to the automaton transitions: a rule A → xB for each transition from A to B labelled x, and a rule A → ε for each final state A.

In formulae:

R = {A → xB | A, B ∈ Q, x ∈ X, d A x = B} ∪ {A → ε | A ∈ F}

Theorem 8.13 (NFA for regular grammar). For each regular grammar G there exists a nondeterministic finite-state automaton M such that

L(G) = Lnfa(M)

Proof. Again, we will just sketch the construction; the formal proof can be found in the literature. The construction consists of two steps: first we give a direct translation of a regular grammar to an automaton, and then we transform the automaton into a suitable shape.

Let G = (T, N, R, S) be a regular grammar without productions of the form U → V and W → ε for W ≠ S.

From a grammar to an automaton. Construct NFA M = (X, Q, d, {S}, F) where

• The input alphabet of the automaton are the nonempty terminal strings (!) that occur in the rules of the grammar:

X = {x ∈ T+ | A, B ∈ N, A → xB ∈ R} ∪ {x ∈ T+ | A ∈ N, A → x ∈ R}

• The states of the automaton are the nonterminals of the grammar extended with a new state Nf:

Q = N ∪ {Nf}

• The transitions of the automaton correspond to the grammar productions: for each rule A → xB we get a transition from A to B labelled x, and for each rule A → x with nonempty x we get a transition from A to the new state Nf labelled x. In formulae, for all A ∈ N and x ∈ X:

d A x = {B | B ∈ N, A → xB ∈ R} ∪ {Nf | A → x ∈ R}

• The final states of the automaton are Nf and, if S → ε is a grammar production, also S:

F = {Nf} ∪ {S | S → ε ∈ R}

Lnfa(M) = L(G), because of the direct correspondence between derivation steps in G and transitions in M.

Transforming the automaton to a suitable shape. There is a minor flaw in the automaton given above: the grammar and the automaton have different alphabets. This shortcoming will be remedied by an automaton transformation which yields an equivalent automaton with transitions labelled by elements of T (instead of T∗). The transformation is relatively easy and is depicted in the diagram below. In order to eliminate a transition d q x = q′ where x = x1 x2 ... xk+1 with k > 0 and xi ∈ T for all i, add new (nonfinal) states p1, ..., pk to the existing ones and new transitions d q x1 = p1, d p1 x2 = p2, ..., d pk xk+1 = q′.

(diagram: the single transition from q to q′ labelled x1 x2 ... xk+1 is replaced by a chain of transitions q → p1 → p2 → ... → pk → q′ labelled x1, x2, ..., xk+1)

It is intuitively clear that the resulting automaton is equivalent to the original one. Carry out this transformation for each M-transition d q x = q′ with |x| > 1 in order to get an automaton for G with the same input alphabet T.
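The first step of this construction is little more than reading off the productions. A small Haskell illustration (the representation and the names RProd, RState and transitions are assumptions introduced here, not definitions from these notes):

-- A right-regular production A -> xB is (A, x, Just B); A -> x is (A, x, Nothing).
type RProd = (Char, String, Maybe Char)

data RState = NT Char | Nf deriving (Eq, Show)   -- nonterminals plus the new state Nf

-- One transition per production with a nonempty terminal string; a missing
-- continuation nonterminal leads to the new state Nf.
transitions :: [RProd] -> [(RState, String, RState)]
transitions prods =
  [ (NT a, x, maybe Nf NT mb) | (a, x, mb) <- prods, not (null x) ]

Splitting every transition whose label has length greater than one into a chain of single-symbol transitions, as described above, then yields an automaton over the alphabet T.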

8.3. Regular expressions

Regular expressions are a classical and convenient way to describe, for example, the structure of terminal words. This section defines regular expressions, defines the language of a regular expression, and shows that regular expressions and regular grammars are equally expressive formalisms. We do not discuss implementations of (datatypes and functions for matching) regular expressions; implementations can be found in the literature [8, 6].

Definition 8.14 (RET, regular expressions over alphabet T). The set RET of regular expressions over alphabet T is inductively defined as follows: for regular expressions R, S

∅ ∈ RET
ε ∈ RET
a ∈ RET
R + S ∈ RET
R S ∈ RET
R∗ ∈ RET
(R) ∈ RET

where a ∈ T. The operator + is associative, commutative, and idempotent; the concatenation operator, written as juxtaposition (so x concatenated with y is denoted by xy), is associative, and ε is the unit of it. In formulae this reads, for all regular expressions R, S, and U,

R + (S + U) = (R + S) + U
R + S = S + R
R + R = R
R (S U) = (R S) U
R ε = R (= ε R)

Furthermore, the star operator, ∗, binds stronger than concatenation, and concatenation binds stronger than +. Examples of regular expressions are:

(bc)∗ + ∅
ε + b(ε∗)

The language (i.e., the “semantics”) of a regular expression over T is a set of T-sequences compositionally defined on the structure of regular expressions, as follows.

Definition 8.15 (Language of a regular expression). Function Lre :: RET → {T∗} returns the language of a regular expression. It is defined inductively by:

Lre(∅) = ∅
Lre(ε) = {ε}
Lre(b) = {b}
Lre(x + y) = Lre(x) ∪ Lre(y)
Lre(xy) = Lre(x) Lre(y)
Lre(x∗) = (Lre(x))∗

Since ∪ is associative, commutative, and idempotent, and set concatenation is associative with {ε} as its unit, function Lre is well defined. Note that the language Lre(b∗) is the set consisting of zero, one or more concatenations of b, i.e., Lre(b∗) = ({b})∗.

As an example of a language of a regular expression, we compute the language of the regular expression (ε + bc)d.

Lre((ε + bc)d)
= (Lre(ε + bc)) (Lre(d))
= (Lre(ε) ∪ Lre(bc)) {d}
= ({ε} ∪ (Lre(b))(Lre(c))) {d}
= {ε, bc} {d}
= {d, bcd}

Regular expressions are used to describe the tokens of a language. For example, the list if p then e1 else e2 contains six tokens, three of which are identifiers. An identifier is an element in the language of the regular expression

letter (letter + digit)∗

where

letter = a + b + ... + z + A + B + ... + Z
digit = 0 + 1 + ... + 9

see subsection 2.3.1.
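Although these notes do not pursue an implementation, a datatype for RET together with a matcher is easy to sketch in Haskell. The following is only an illustration added here (the names RegExp, nullable, deriv and matches are assumptions, not definitions from these notes); it uses Brzozowski derivatives, which avoids enumerating the, in general infinite, language of an expression.

-- Regular expressions over an alphabet of characters.
data RegExp = Empty                  -- ∅
            | Epsilon                -- ε
            | Sym Char               -- a
            | RegExp :+: RegExp      -- R + S
            | RegExp :.: RegExp      -- R S
            | Star RegExp            -- R∗
  deriving Show

-- Does the language of the expression contain the empty word?
nullable :: RegExp -> Bool
nullable Empty     = False
nullable Epsilon   = True
nullable (Sym _)   = False
nullable (r :+: s) = nullable r || nullable s
nullable (r :.: s) = nullable r && nullable s
nullable (Star _)  = True

-- Brzozowski derivative: the language of  deriv c r  is { w | c w ∈ Lre(r) }.
deriv :: Char -> RegExp -> RegExp
deriv _ Empty     = Empty
deriv _ Epsilon   = Empty
deriv c (Sym a)   = if a == c then Epsilon else Empty
deriv c (r :+: s) = deriv c r :+: deriv c s
deriv c (r :.: s)
  | nullable r    = (deriv c r :.: s) :+: deriv c s
  | otherwise     = deriv c r :.: s
deriv c (Star r)  = deriv c r :.: Star r

-- A word is in the language if the expression derived by all its symbols is nullable.
matches :: RegExp -> String -> Bool
matches r = nullable . foldl (flip deriv) r

For instance, matches (Star (Sym 'b')) "bbb" evaluates to True, in accordance with Lre(b∗) above.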

In the beginning of this section we claimed that regular expressions and regular grammars are equivalent formalisms. We will prove this claim later; but first we illustrate the construction of a regular grammar out of a regular expression in an example. For a specific example it is not difficult to construct a regular grammar for a regular expression. Consider the following regular expression.

R = a∗ + ε + (a + b)∗

We aim at a regular grammar G such that Lre(R) = L(G) and again we take a top-down approach. Suppose that nonterminal A generates the language Lre(a∗), nonterminal B generates the language Lre(ε), and nonterminal C generates the language Lre((a + b)∗). Suppose furthermore that the productions for A, B, and C satisfy the conditions imposed upon regular grammars. Then we obtain a regular grammar G with L(G) = Lre(R) by defining

S → A
S → B
S → C

where S is the start-symbol of G. It remains to construct productions for nonterminals A, B, and C.

• The nonterminal A with productions

A → aA
A → ε

generates the language Lre(a∗).

• Since Lre(ε) = {ε}, the nonterminal B with production B → ε generates the language {ε}.

• Nonterminal C with productions

C → aC
C → bC
C → ε

generates the language Lre((a + b)∗).

We now give the general result.

We have:

Theorem 8.16 (Regular Grammar for Regular Expression). For each regular expression R there exists a regular grammar G such that

Lre(R) = L(G)

The proof of this theorem is given in Section 8.4.2.

To obtain a regular expression that generates the same language as a given regular grammar we go via an automaton. Given a regular grammar G, we can use the theorems from the previous sections to obtain a DFA D such that

L(G) = Ldfa(D)

So if we can obtain a regular expression for a DFA D, we have found a regular expression for a regular grammar. To obtain a regular expression for a DFA D, we interpret each state of D as a regular expression defined as the sum of the concatenation of outgoing terminal symbols with the resulting state. For our example DFA we obtain:

S = aC + bA + cS
A = aB
B = cC
C = ε

It is easy to merge these four regular expressions into a single regular expression, partially because this is a simple example. In general, merging the regular expressions obtained from a DFA that may loop is more complicated, as we will briefly explain in the proof of the following theorem.

Theorem 8.17 (Regular Expression for Regular Grammar). For each regular grammar G there exists a regular expression R such that

L(G) = Lre(R)

The proof of this theorem is given in Section 8.4.3.

8.4. Proofs

This section contains the proofs of some of the theorems given in this chapter.

nfa d Q0 w ∈ F } assume nfa d Q0 w = dfa d Q0 w {w | w ∈ X ∗ . {A. subs {A.8. F )} definition of Ldfa Ldfa(M ) It follows that Lnfa(M ) = Ldfa(M ) provided nfa d Q0 w = dfa d Q0 w We prove this equation as follows.7 Suppose M = (X.1. F ) is a nondeterministic finite-state automaton. B}.1) = {p | p ∩ F = ∅. X Q = X = subs Q where subs returns the powerset of a set. (nfa d Q0 w ∩ F ) = ∅} definition of F {w | w ∈ X ∗ . Q0 .4. dfa accept w (d . {B}} For the other components of M we define d q a = {t | t ∈ d r a. Proof of Theorem 8. Q0 . Q . Define the finite-state automaton M = (X . Regular Languages 8. {A}. Q0 . (8. nfa accept w (d. d. d . B} = {{ }. r ∈ q} Q0 = Q0 F We have Lnfa(M ) = = = = = = definition of Lnfa {w | w ∈ X ∗ . F )} definition of nfa accept {w | w ∈ X ∗ . For example. p ∈ Q } 146 . Q. Q0 . F ) as follows. dfa d Q0 w ∈ F } definition of dfa accept {w | w ∈ X ∗ .

Proof of Theorem 8. d qa = = definition of d {t | r ∈ q.16 The proof of this theorem is by induction to the structure of regular expressions. 147 . We obtain a regular grammar with start-symbol S that generates the language Lre(x + y) by defining S → S1 S → S2 We obtain a regular grammar with start-symbol S that generates the language Lre(xy) by replacing. Proofs nfa d Q0 w = = = definition of nfa foldl (deltas d) Q0 w Q0 = Q0 .1) holds. t ∈ d r a} definition of deltas deltas d q a 8. The regular grammar with production S → b generates the language Lre(b). assume deltas d = d foldl d Q0 w definition of dfa dfa d Q0 w So if deltas d = d equation (8. The regular grammar with the production S → generates the language Lre( ).4. The regular grammar without productions generates the language Lre(∅). For the three base cases we argue as follows. and a regular grammar with start-symbol S2 that generates the language Lre(y).2.4. For the other three cases the induction hypothesis is that there exist a regular grammar with start-symbol S1 that generates the language Lre(x).8. This equality follows from the following calculation. in the regular grammar that generates the language Lre(x).

1.1 we have shown that there exists a DFA D = (X. Let D = (X. d.3. if we can show that there exists a regular expression R such that Ldfa(D) = Lre(R) then the theorem follows. In solving these equations we use the fact that concatenation distributes over the sum operator: z(x + y) = zx + zy and that recursion can be removed by means of the star operator ∗: A = xA + z (where A ∈ z) ≡ A = x∗z / The algorithm for solving such a set of equations is omitted. F ) be a DFA such that L(G) = Ldfa(D). respec- We obtain a regular grammar with start-symbol S that generates the language Lre(x∗) by replacing. Q.8. d.2. in the regular grammar that generates the language Lre(x).4. where S1 is the start-symbol of the regular grammar that generates the language Lre(x). Q. 8. Proof of Theorem 8. F ) such that L(G) = Ldfa(D) So. S. which we have to solve. We define a regular expression R such that Lre(R) = Ldfa(D) ¯ For each state q ∈ Q we define a regular expression q .4 and 8. ¯ q = if q ∈ F ¯ / then foldl (+) ∅ [cC | d q c = C] else + foldl (+) ∅ [cC | d q c = C] This gives a set of possibly mutually recursive equations. S. and by adding the productions S → S1 and S → . Regular Languages each production of the form T → a and T → tively. 148 . ¯ We prove Ldfa(D) = Lre(S). by T → aS2 and T → S2 .17 In sections 8. each production of the form T → a and T → by T → aS and T → S. We obtain ¯ the definition of q by combining all pairs c and C such that d q c = C. and we let R be S.

namely. E abbreviates the fold expression ¯ q = +E ¯ definition of Lre. (foldl d q w) ∈ F ≡ w ∈ Lre(¯) q we This equation is proved by induction to the length of w. (foldl d S w) ∈ F } assumption ¯ {w | w ∈ Lre(S)} equality for set-comprehensions ¯ Lre(S) It remains to prove the assumption in the above calculation: for w ∈ X ∗ .8. for arbitrary q. q (foldl d q (ax)) ∈ F ≡ ≡ ≡ definition of foldl (foldl d (d q a) x) ∈ F induction hypothesis x ∈ Lre(d ¯ a) q definition of q . Suppose ax is a list of length n+1. (foldl d q ) ∈ F ≡ ≡ ≡ definition of foldl q∈F definition of q .4. dfa accept w (d. Proofs Ldfa(D) = = = = = definition of Ldfa {w | w ∈ X ∗ . ¯ (foldl d S w) ∈ F ≡ w ∈ Lre(S) We prove a generalisation of this equation. D is deterministic ¯ ax ∈ Lre(¯) q n we have (foldl d q w) ∈ 149 . F )} definition of dfa accept {w | w ∈ X ∗ . S. definition of q ¯ ∈ Lre(¯) q The induction hypothesis is that for all lists w with |w| F ≡ w ∈ Lre(¯). For the base case w = calculate as follows. (dfa d S w) ∈ F } definition of dfa {w | w ∈ X ∗ .

Hint: construct a finite automaton for L out of an automaton for regular language L.3. Suppose that the state transition function d in the definition of a nondeterministic finite-state automaton has the following type d :: {Q} → X → {Q} Function d takes a set of states V and an element a. 150 .1. returns the set of states in which the nondeterministic finite-state automaton can end after reading the input list. Prove this claim. 8. Exercise 8. Exercise 8. Regular languages are closed under complementation.5. Exercises Exercise 8.2.4.8. and introduces several concepts related to describing and recognising regular languages. and returns the set of states that are reachable from V with an arc labelled a. Prove the converse of Theorem 8. a set of start states. Transform the grammar with the following productions to a grammar without productions of the form U → V and W → with W = S. Define a function ndfsa of type ({Q} → X → {Q}) → {Q} → X ∗ → {Q} which given a function d. Regular Languages Summary This chapter discusses methods for recognising sentences from regular languages. and an input list. Regular languages are used for describing simple languages like ‘the language of identifiers’ and ‘the language of keywords’ and regular expressions are convenient for the description of regular languages. S S A A B B C C → → → → → → → → aA A aS B C cC a Exercise 8.5. The straightforward translation of a regular expression into a recogniser for the language of that regular expression results in a recogniser that is often very inefficient. Given a regular grammar G for language L.7: show that for every deterministic finitestate automaton M there exists a nondeterministic finite-state automaton M such that Lda(M ) = Lna(M ) Exercise 8. By means of (non)deterministic finite-state automata we construct a recogniser that requires time linear in the length of the input list for recognising an input list. construct a regular grammar for L∗ .

Examples of sentences of this language are 1110.   A → 0     B → 0    B → Exercise 8. A direct proof form this claim is the following: Let M1 = (X. (S1 . S2 ). Prove that for arbitrary regular expressions R.8. Regular languages are closed under intersection. Exercise 8. d1 .10. Describe the language of the following regular expressions.9.7. + b( ∗) 2. S2 . d2 . 2. F1 ) and M2 = (X. F2 ) be DFA’s for the regular languages L1 and L2 respectively. Q1 . Prove this claim using the result from the previous exercise. (bc)∗ + ∅ 3. with Lre(V ) = Lre(W ).11. Lre(R(S + T )) Lre((R + S)T ) = Lre(RS + RT ) = Lre(RT + ST ) Exercise 8. S1 . Exercises Exercise 8.6. and T the following equivalences hold. 1. 151 .12. Q2 .5. d. S. Give regular expressions S and R such that Lre(RS) = Lre(RS) = Lre(SR) Lre(SR) Exercise 8. Define the (product) automaton M = (X. a(b∗) + c∗ Exercise 8. such that for all regular expressions R and S with S = ∅ Lre(R(S + V )) = Lre(R(S + W )) V and W may be expressed in terms of R and S.   S → (A    S →  S → )A 1. d2 q2 x) Now prove that Ldfa(M ) = Ldfa(M1 ) ∩ Ldfa(M2 ) Exercise 8. Q1 × Q2 . and 000. Give regular expressions V and W .   A → )    A → (   S → 0A    S → 0B     S → 1A  A → 1 2. 1. Define nondeterministic finite-state automata that accept languages equal to the languages of the following regular grammars. q2 ) x = (d1 q1 x. Give a regular expression for the language that consists of all lists of zeros and ones such that the segment 01 occurs nowhere in a list. F1 × F2 ) by d (q1 .8.

1. ?>=< !"# / '&%$ B 89:.16.b @A GFED 89:. a∗ + b∗ + ab Exercise 8. a∗ + b∗ + (ab) 2.13.8.14. ?>=< !"# / '&%$ S a @A GFED 89:. Give regular grammars that generate the language of the following regular expressions. Regular Languages Exercise 8.  B → aC    B → bB     B →     C → bB    C → b   S → 0S    S → 1T  S → 2. 1. ?>=< !"# / '&%$ S O @A a b b b 152 . /?>=< A O BC@A a. Start state is S. 1.15. b @A GFED 89:. Give regular expressions of which the language equals the language of the following regular grammars. Construct for each of the following regular expressions a nondeterministic finite-state automaton that accepts the sentences of the language.   T → 0T    T → 1S Exercise 8. (1 + (12) + 0)∗(30)∗ Exercise 8. /?>=< A 89:. Start state is S. Transform the nondeterministic finite-state automata into deterministic finite-state automata. ?>=< /B BC a b 2. Define regular expressions for the languages of the following deterministic finite-state automata. ((a + bb)∗ + c)∗ 2. b @A GFED 89:.   S → bA    S → aC     S →     A → bA    A → 1.

9. Pumping Lemmas: the expressive power of languages

Introduction

In these lecture notes we have presented several ways to show that a language is regular or context-free, but until now we did not give any means to show the nonregularity or non-context-freeness of a language. In this chapter we fill this gap by introducing the so-called Pumping Lemmas. For example, the pumping lemma for regular languages says

IF language L is regular,
THEN it has the following property P: each sufficiently long sentence w ∈ L has a substring that can be repeated any number of times, every time yielding another word of L.

In applications, pumping lemmas are used in the contrapositive way. In the regular case this means that one may conclude that L is not regular, if P does not hold. Although the ideas behind pumping lemmas are very simple, a precise formulation is not. As a consequence, it takes some effort to get familiar with applying pumping lemmas.

Regular grammars and context-free grammars are part of the Chomsky hierarchy, which consists of four different kinds of grammars and their corresponding languages. Pumping lemmas are used to show that the expressive power of the different elements of the Chomsky hierarchy is different.

Goals

After you have studied this chapter you will be able to

• prove that a language is not regular,
• prove that a language is not context-free,
• identify languages and grammars as regular, context-free or none of these,
• explain the Chomsky hierarchy,
• give examples of languages that are not regular, and/or not context-free.

9.1. The Chomsky hierarchy

In the preceding chapters we have seen context-free grammars and regular grammars. You may now wonder: is it possible to express any language with these grammars? And: is it possible to obtain any context-free language from a regular grammar? The answer to these questions is no. The Chomsky hierarchy explains why. The Chomsky hierarchy consists of four elements, each of which is explained below.

9.1.1. Type-0 grammars

The most powerful grammars are the type-0 grammars, in which a production has the form φ → ψ, where φ ∈ V+, ψ ∈ V∗, and V is the set of symbols of the grammar. So the left-hand side of a production may consist of a list of nonterminal and terminal symbols, instead of a single nonterminal as in context-free grammars. Type-0 grammars have the same expressive power as Turing machines, and the languages described by these grammars are the recursively enumerable languages. This expressive power comes at a cost though: it is very difficult to parse sentences from type-0 grammars.

9.1.2. Type-1 grammars

We can slightly restrict the form of the productions to obtain type-1 grammars. In a type-1, or context-sensitive grammar, each production has the form φAψ → φδψ, where φ, ψ ∈ V∗ and δ ∈ V+. So a production describes how to rewrite a nonterminal A, in the context of lists of symbols φ and ψ. A language generated by a context-sensitive grammar is called a context-sensitive language. Although context-sensitive grammars are less expressive than type-0 grammars, parsing is still very difficult for context-sensitive grammars.

9.1.3. Type-2 grammars

The type-2 grammars are the context-free grammars which you have seen a lot in the preceding chapters. As the name says, in a context-free grammar you can rewrite nonterminals without looking at the context in which they appear. Actually, it is impossible to look at the context when rewriting symbols. Context-free grammars are less expressive than context-sensitive grammars. This statement can be proved using the pumping lemma for context-free languages. However, it is much easier to parse sentences from context-free languages. In fact, a sentence of length n can be parsed in time at most O(n³) (or even a bit less than this) for any sentence of a context-free language. And if we put some more restrictions

on context-free grammars (for example LL(1)), we obtain linear-time algorithms for parsing sentences of such grammars.

9.1.4. Type-3 grammars

The type-3 grammars in the Chomsky hierarchy are the regular grammars. Any sentence from a regular language can be processed by means of a finite-state automaton, which takes linear time and constant space in the size of its input. The set of regular languages is strictly smaller than the set of context-free languages, a fact we will prove below by means of the pumping lemma for regular languages.

9.2. The pumping lemma for regular languages

In this section we give the pumping lemma for regular languages. The lemma gives a property that is satisfied by all regular languages. The property is a statement of the form: in sentences longer than a certain length a substring can be identified that can be duplicated while retaining a sentence. The idea behind this property is simple: regular languages are accepted by finite automata. Given a DFA for a regular language, a sentence of the language describes a path from the start state to some final state. When the length of such a sentence exceeds the number of states, then at least one state is visited twice; consequently the path contains a cycle that can be repeated as often as desired. The proof of the following lemma is given in Section 9.4.

Theorem 9.1 (Regular Pumping Lemma). Let L be a regular language. Then

there exists n ∈ N :
for all x, y, z : xyz ∈ L and |y| ≥ n :
there exist u, v, w : y = uvw and |v| > 0 :
for all i ∈ N : xuv^i wz ∈ L

Note that |y| denotes the length of the string y. Also remember that ‘for all x ∈ X : · · ·’ is true if X = ∅.
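The cycle argument can also be made concrete. The run of a DFA on a word visits one state per symbol read plus the start state, so as soon as the word is at least as long as the number of states, some state repeats, and the symbols read between the two visits form a nonempty pumpable part v. The following Haskell sketch is only an illustration added here (the name pumpSplit and the representation of the automaton as a transition function are assumptions):

pumpSplit :: Eq q => (q -> s -> q) -> q -> [s] -> Maybe ([s], [s], [s])
pumpSplit delta q0 w =
  case [ (i, j) | (i, qi) <- zip [0 :: Int ..] run
                , (j, qj) <- zip [0 ..] run
                , i < j, qi == qj ] of
    []           -> Nothing
    ((i, j) : _) -> Just (take i w, take (j - i) (drop i w), drop j w)
  where
    run = scanl delta q0 w     -- the states visited while reading w

Applied to the part y of an accepted word, pumpSplit returns a split y = uvw with nonempty v as used in the theorem, or Nothing when no state repeats along the run.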

For example, consider the following automaton.

(automaton diagram: states S, A, B, C and a final state D; from S an arc labelled a leads to A, and the cycle A → B → C → A is labelled b, c, a, while A also has an arc labelled b towards the accepting part ending in D)

This automaton accepts abcabcd, abcabcabcd, and, in general, the sentences a(bca)∗bcd. Take for n the number of states in the automaton (5). Let x, y, z be such that xyz ∈ L and |y| ≥ n. Then we know that in order to accept y, the above automaton has to pass at least twice through state A. The part that is accepted in between the two moments the automaton passes through state A can be pumped up to create sentences that contain an arbitrary number of copies of the string v = bca.

This pumping lemma is useful in showing that a language does not belong to the family of regular languages; that is, pumping lemmas are used negatively to show that a given language does not belong to some family. Theorem 9.1 enables us to prove that a language L is not regular by showing that

for all n ∈ N :
there exist x, y, z : xyz ∈ L and |y| ≥ n :
for all u, v, w : y = uvw and |v| > 0 :
there exists i ∈ N : xuv^i wz ∉ L

Note that if n = 0, we can choose y = ε, and since there is no v with |v| > 0 such that y = uvw, the statement above holds for all such v (namely none!). In all applications of the pumping lemma in this chapter, this is the formulation we will use. Its application is typical of pumping lemmas in general.

As an example, we will prove that the language L = {a^m b^m | m ≥ 0} is not regular. The statement of the pumping lemma amounts to the following. Let n ∈ N. Take s = a^n b^n with x = ε, y = a^n, and z = b^n. Let u, v, w be such that y = uvw with v ≠ ε, that is, u = a^p, v = a^q and w = a^r with p + q + r = n and q > 0. Take i = 2, then

xuv²wz ∉ L
⇐    defn. L, calculus
a^(p+2q+r) b^n ∉ L
⇐    p + q + r = n
n + q ≠ n
⇐    arithmetic
q > 0
⇐    q > 0
true

Note that here we use the pumping lemma (and not the proof of the pumping lemma) to prove that a language is not regular. This kind of proof can be viewed as a kind of game: ‘for all’ is about an arbitrary element which can be chosen by the opponent; ‘there exists’ is about a particular element which you may choose. Choosing the right elements helps you ‘win’ the game, where winning means proving that a language is not regular.

Note that the language L = {a^m b^m | m ≥ 0} is context-free, and together with the fact that each regular grammar is also a context-free grammar it follows immediately that the set of regular languages is strictly smaller than the set of context-free languages.

Exercise 9.1. Show that the following language is not regular.
{x | x ∈ {a, b}∗ ∧ nr a x < nr b x}
where nr a x is the number of occurrences of a in x.

Exercise 9.2. Prove that the following language is not regular
{a^(k²) | k ≥ 0}

Exercise 9.3. Prove that the following language is not regular
{a^k b^m | k ≤ m ≤ 2k}

Exercise 9.4. Show that the following language is not regular.
{a^k b^l a^m | k > 5 ∧ l > 3 ∧ m ≤ l}

9.3. The pumping lemma for context-free languages

The Pumping Lemma for context-free languages gives a property that is satisfied by all context-free languages. This property is a statement of the form: in sentences exceeding a certain length, two sublists of bounded length can be identified that can be


duplicated while retaining a sentence. The idea behind this property is the following. Context-free languages are described by context-free grammars. For each sentence in the language there exists a derivation tree. When sentences have a derivation tree that is higher than the number of nonterminals, then at least one nonterminal will occur twice in a node; consequently a subtree can be inserted as often as desired.

As an example of an application of the Pumping Lemma, consider the context-free grammar with the following productions.

S → aAb
A → cBd
A → e
B → fAg

The following parse tree represents the derivation of the sentence acfegdb.

(parse tree for acfegdb: S has children a, A, b; this A has children c, B, d; B has children f, A, g; the lower A has the single child e)

If we replace the subtree rooted by the lower occurrence of nonterminal A by the



subtree rooted by the upper occurrence of A, we obtain the following parse tree.

(parse tree for acfcfegdgdb: the lower A-subtree has been replaced by a copy of the upper A-subtree, so the nonterminal A now occurs three times on the path, the lowest occurrence deriving e)

This parse tree represents the derivation of the sentence acfcfegdgdb. Thus we ‘pump’ the derivation of sentence acfegdb to the derivation of sentence acfcfegdgdb. Repeating this step once more, we obtain a parse tree for the sentence acfcfcfegdgdgdb. We can repeatedly apply this process to obtain derivation trees for all sentences of the form a(cf)^i e(gd)^i b for i ≥ 0. The case i = 0 is obtained if we replace in the parse tree for the sentence acfegdb the subtree rooted by the upper occurrence of nonterminal A by the subtree rooted by the lower occurrence of A:
(parse tree for aeb: S has children a, A, b, and A has the single child e)

This is a derivation tree for the sentence aeb. This step can be viewed as a negative pumping step. The proof of the following lemma is given in Section 9.4.



Theorem 9.2 (Context-free Pumping Lemma). Let L be a context-free language. Then

there exist c, d ∈ N :
for all z : z ∈ L and |z| ≥ c :
there exist u, v, w, x, y : z = uvwxy and |vx| > 0 and |vwx| ≤ d :
for all i ∈ N : uv^i wx^i y ∈ L

The Pumping Lemma is a tool with which we prove that a given language is not context-free. The proof obligation is to show that the property shared by all context-free languages does not hold for the language under consideration. Theorem 9.2 enables us to prove that a language L is not context-free by showing that

for all c, d ∈ N :
there exists z : z ∈ L and |z| ≥ c :
for all u, v, w, x, y : z = uvwxy and |vx| > 0 and |vwx| ≤ d :
there exists i ∈ N : uv^i wx^i y ∉ L

As an example, we will prove that the language T defined by

T = {a^n b^n c^n | n > 0}

is not context-free.

Proof. Let c, d ∈ N. Take z = a^r b^r c^r with r = max(c, d). Let u, v, w, x, y be such that z = uvwxy, |vx| > 0 and |vwx| ≤ d. Note that our choice for r guarantees that substring vwx has one of the following shapes:

• vwx consists of just a’s, or just b’s, or just c’s.
• vwx contains both a’s and b’s, or both b’s and c’s.

So vwx does not contain a’s, b’s, and c’s. Take i = 0, then

• If vwx consists of just a’s, or just b’s, or just c’s, then it is impossible to write the string uwy as a^s b^s c^s for some s, since only the number of terminals of one kind is decreased.



• If vwx contains both a’s and b’s, or both b’s and c’s, it lies somewhere on the border between a’s and b’s, or on the border between b’s and c’s. Then the string uwy can be written as

uwy = a^s b^t c^r    or    uwy = a^r b^p c^q

for some s, t, p, q, respectively. At least one of s and t or of p and q is less than r. Again this list is not an element of T .

Exercise 9.5. Why does vwx not contain a’s, b’s, and c’s?

Exercise 9.6. Prove that the following language is not context-free
{a^(k²) | k ≥ 0}

Exercise 9.7. Prove that the following language is not context-free {ai | i is a prime number } Exercise 9.8. Prove that the following language is not context-free {ww | w ∈ {a, b}∗ }

9.4. Proofs of pumping lemmas
This section gives the proof of the pumping lemmas.

9.4.1. Proof of the Regular Pumping Lemma, Theorem 9.1
Since L is a regular language, there exists a deterministic finite-state automaton D such that L = Ldfa D. Take for n the number of states of D. Let s be an element of L with sublist y such that |y| ≥ n, say s = xyz. Consider the sequence of states D passes through while processing y. Since |y| ≥ n, this sequence has more than n entries, hence at least one state, say state A, occurs twice. Take u, v, w as follows:

• u is the initial part of y processed until the first occurrence of A,
• v is the (nonempty) part of y processed from the first to the second occurrence of A,
• w is the remaining part of y.



Note that D could have skipped processing v, and hence would have accepted xuwz. Furthermore, D can repeat the processing in between the first occurrence of A and the second occurrence of A as often as desired, and hence it accepts xuv^i wz for all i ∈ N. Formally, a simple proof by induction shows that (∀i : i ≥ 0 : xuv^i wz ∈ L).

9.4.2. Proof of the Context-free Pumping Lemma, Theorem 9.2
Let G = (T, N, R, S) be a context-free grammar such that L = L(G). Let m be the length of the longest right-hand side of any production, and let k be the number of nonterminals of G. Take c = mk . In Lemma 9.3 below we prove that if z is a list with |z| > c, then in all derivation trees for z there exists a path of length at least k+1. Let z ∈ L such that |z| > c. Since grammar G has k nonterminals, there is at least one nonterminal that occurs more than once in a path of length k+1 (which contains k+2 symbols, of which at most one is a terminal, and all others are nonterminals) of a derivation tree for z. Consider the nonterminal A that satisfies the following requirements. • A occurs at least twice in the path of length k+1 of a derivation tree for z. Call the list corresponding to the derivation tree rooted at the lower A w, and call the list corresponding to the derivation tree rooted at the upper A (which contains the list w ) vwx . • A is chosen such that at most one of v and x equals . • Finally, we suppose that below the upper occurrence of A no other nonterminal that satisfies the same requirements occurs, that is, the two A’s are the lowest pair of nonterminals satisfying the above requirements. First we show that a nonterminal A satisfying the above requirements exists. We prove this statement by contradiction. Suppose that for all nonterminals A that occur twice on a path of length at least k+1 both v and x, the border lists of the list vwx corresponding to the tree rooted at the upper occurrence of A, are both equal to . Then we can replace the tree rooted at the upper A by the tree rooted at the lower A without changing the list corresponding to the tree. Thus we can replace all paths of length at least k+1 by a path of length at most k. But this contradicts Lemma 9.3 below, and we have a contradiction. It follows that a nonterminal A satisfying the above requirements exists. There exists a derivation tree for z in which the path from the upper A to the leaf has length at most k+1, since either below A no nonterminal occurs twice, or there is one or more nonterminal B that occurs twice, but the border lists v and x from the list v w x corresponding to the tree rooted at the upper occurrence of nonterminal B are empty. Since the lists v and x are empty, we can replace the tree rooted at


we have that |z| m = mj . and suppose the top of the tree 163 . This situation is depicted as follows. Then tree t corresponds to a single production in G. and since the longest right-hand side of any production has length m. The list uwy is obtained if the tree rooted at the upper A is replaced by the tree rooted at the lower A. then |z| mj . The proof of the Pumping Lemma above frequently refers to the following lemma. and height t j. For the induction step. and suppose that the longest righthand side of any production has length m. Proofs of pumping lemmas the upper occurrence of B by the tree rooted at the lower occurrence of B without changing the list corresponding to the derivation tree. then |z| mj .3. Let G be a context-free grammar. S vv rr ```vvv Ò rrrÒÒ `` vvv rr Ò `` vvv rrr ÒÒÒ vv ` Ò rrr vv y u r A `` vv `` vv Ò rrr Ò `` vvv rrr ÒÒÒ rr ÒÒ `` vvv rr Ò v rr Ac v x cc  cc  cc    w We prove by induction that (∀i : i 0 : uv i wx i y ∈ L).9. In this proof we apply the tree substitution process described in the example before the lemma. Suppose that for all i n we have uv i wx i y ∈ L.4. The list uv i+1 wx i+1 y is obtained if the tree rooted at the lower A in the derivation tree for uv i wx i y ∈ L is replaced by the tree rooted above it A. Since we can do this for all nonterminals B that occur twice below the upper occurrence of A. Let A be the root of t. suppose j = 1. We prove this statement by induction on j.3 below that the length of vwx is at most mk+1 . Suppose z = uvwxy. For i = 0 we have to show that the list uwy is a sentence in L. assume that for all derivation trees t of height at most j we have that |z| mj . where z is the list corresponding to t. For the base case. there exists a derivation tree for z in which the path from the upper A to the leaf has length at most k+1. the list corresponding to the subtree to the left (right) of the upper occurrence of A is u (y). so we define d = mk+1 . Proof. Let t be a derivation tree for a sentence z ∈ L(G). but the root of t is not necessarily the start-symbol. Theorem 9. that is. If height t j. This proves the induction step. We prove a slightly stronger result: if t is a derivation tree for a list z. It follows from Lemma 9. Suppose we have a tree t of height j+1.

Consider the following language: {wcw | w ∈ {a. why not? Exercise 9.   S →    S →  A →   A →    A → Consider the grammar G with the following productions. Show that the following language is not regular. Summary This section introduces pumping lemmas.9. Is this language regular? If it is. Is this language context-free? If it is. {A} S AA }{ 1. give a regular grammar and prove that this grammar generates the language. give a regular grammar and prove that this grammar generates the language. If it is not. why not? Exercise 9. and the lists corresponding to the trees rooted at the symbols of v all have length at most mj .12.9. Pumping lemmas are used to prove that languages are not regular or not context-free. give a context-free grammar and prove that this grammar generates the language. Exercises Exercise 9. and since the longest right-hand side of any production has length m.11. If it is not. 9. why not? 2. Consider the following language: {ai bj | 0 i j} 1. Is this language regular? If it is.5. give a context-free grammar and prove that this grammar generates the language. why not? 2.10. Exercise 9. which proves the induction step. 1}∗ ∧ nr 1 x = nr 0 x} where function nr takes an element a and a list x. b}∗ } 1. and returns the number of occurrences of a in x. For all trees s rooted at the symbols of v we have height s j. If it is not. so the induction hypothesis applies to these trees. Is this language context-free? If it is. Is this grammar 164 . {x | x ∈ {0. the list corresponding to the tree rooted at A has length at most m × mj = mj+1 . Since A → v is a production of G. If it is not. Pumping Lemmas: the expressive power of languages corresponds to the production A → v in G.

3. Prove that your description is correct. Prove that your description is correct. 3. Exercises • Context-free? • Regular? Why? 2. Is the language of G • Context-free? • Regular? Why? Exercise 9.   S →   S →  S →   S → Consider the grammar G with the following productions. Is the language of G • Context-free? • Regular? Why? 165 .9. Is this grammar • Context-free? • Regular? Why? 2.13.5. Give the language of G without referring to G itself. 0 1 S0 1. Give the language of G without referring to G itself.


10. LL Parsing

Introduction

This chapter introduces LL(1) parsing. LL(1) parsing is an efficient (linear in the length of the input string) method for parsing that can be used for all LL(1) grammars. A grammar is LL(1) if at each step in a derivation the next symbol in the input uniquely determines the production that should be applied. In order to determine whether or not a grammar is LL(1), we introduce several kinds of grammar analyses, such as determining whether or not a nonterminal can derive the empty string, and determining the set of symbols that can appear as the first symbol in a derivation from a nonterminal.

Goals

After studying this chapter you will

• know the definition of LL(1) grammars,
• know how to parse a sentence from an LL(1) grammar,
• be able to apply different kinds of grammar analyses in order to determine whether or not a grammar is LL(1).

This chapter is organised as follows. Section 10.1 describes the background of LL(1) parsing, and Section 10.2 describes an implementation in Haskell of LL(1) parsing and the different kinds of grammar analyses needed for checking whether or not a grammar is LL(1).

10.1. LL Parsing: Background

In the previous chapters we have shown how to construct parsers for sentences of context-free languages using combinator parsers. Since these parsers may backtrack, the resulting parsers are sometimes a bit slow. There are several ways in which we can put extra restrictions on context-free grammars such that we can parse sentences of the corresponding languages efficiently. This chapter discusses one such restriction: LL(1). Other restrictions, not discussed in these lecture notes, are LR(1), LALR(1), SLR(1), etc.

10.1.1. A stack machine for parsing

This section presents a stack machine for parsing sentences of context-free grammars. We will use this machine in the following subsections to illustrate why we need grammar analysis. The stack machine we use in this section differs from the stack machines introduced earlier in these notes.

A stack machine for a grammar G has a stack and an input, and performs one of the following two actions.

1. Expand: If the top stack symbol is a nonterminal, it is popped from the stack and a right-hand side from a production of G for the nonterminal is pushed onto the stack. The production is chosen nondeterministically.

2. Match: If the top stack symbol is a terminal, then it is popped from the stack and compared with the next symbol of the input sequence. If they are equal, then this terminal symbol is ‘read’. If the stack symbol and the next input symbol do not match, the machine signals an error, and the input sentence cannot be accepted.

These actions are performed until the stack is empty. A stack machine for G accepts an input if it can terminate with an empty input when starting with the start-symbol from G on the stack. For example, let grammar G be the grammar with the productions:

S → aS | cS | b

The stack machine for G accepts the input string aab because it can perform the following actions (the first component of the state, before the |, is the symbol stack and the second component of the state, after the |, is the unmatched, remaining part of the, input string):

stack   input
S       aab
aS      aab
S       ab
aS      ab
S       b
b       b

and end with an empty input. However, if the machine had chosen the production S → cS in its first step, it would have been stuck. So not all possible sequences of actions from the state (S, aab) lead to an empty input. If there is at least one sequence of actions that ends with an empty input on an input string, the input string is accepted.
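The two actions are easy to express directly in Haskell. The following sketch (the names Sym, run, accepts and the grammar representation are assumptions made here, not definitions from these notes) implements the nondeterministic machine by trying every right-hand side of the nonterminal on top of the stack; for grammars with left-recursive productions this search may not terminate.

-- Grammar symbols: terminals are characters, nonterminals are named.
data Sym = T Char | N String deriving Eq

-- A grammar maps a nonterminal to the list of its right-hand sides.
type Grammar = String -> [[Sym]]

run :: Grammar -> [Sym] -> String -> Bool
run _ []            input  = null input                         -- empty stack: accept iff input used up
run g (N a : stack) input  = or [ run g (rhs ++ stack) input | rhs <- g a ]   -- expand
run g (T c : stack) (x:xs) = c == x && run g stack xs            -- match
run _ (T _ : _)     []     = False                               -- terminal on stack, no input left

accepts :: Grammar -> String -> String -> Bool
accepts g start = run g [N start]

-- The example grammar S -> aS | cS | b:
exampleG :: Grammar
exampleG "S" = [[T 'a', N "S"], [T 'c', N "S"], [T 'b']]
exampleG _   = []

For instance, accepts exampleG "S" "aab" evaluates to True; the choice of S → cS in the first step corresponds to one of the failing branches of the search.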

10.1.2. Some example derivations

This section gives three examples in which the stack machine for parsing is applied. Each of the examples exemplifies why different kinds of grammar analyses are useful in parsing. It turns out that, for all three examples, the nondeterministic stack machine can act in a deterministic way by looking ahead one (or two) symbols of the sequence of input symbols. In this sense, the stack machine is similar to a nondeterministic finite-state automaton.

The first example

Our first example is gramm1. The set of terminal symbols of gramm1 is {a, b, c}, the set of nonterminal symbols is {S, A, B, C}, the start symbol is S, and the productions are

S → cA | b
A → cBC | bSA | a
B → cc | Cb
C → aS | ba

We want to know whether or not the string ccccba is a sentence of the language of this grammar. The stack machine produces, amongst others, the following sequence, corresponding with a leftmost derivation of ccccba.

stack    input
S        ccccba
cA       ccccba
A        cccba
cBC      cccba
BC       ccba
ccC      ccba
cC       cba
C        ba
ba       ba
a        a

Starting with S the machine chooses between two productions:

S → cA | b

Deriving the sentence ccccba using gramm1 is a deterministic computation: at each step of the derivation there is only one applicable alternative for the nonterminal on top of the stack. Since the next symbol in the remaining string to be recognised is a c. b} is called the first set of the nonterminal C. The top stack symbol of BC is B. The second example A second example is the grammar gramm2 whose productions are S → abA | aa A → bb | bS Now we want to know whether or not the string abbb is a sentence of the language of this grammar. From the above derivation we conclude the following. The first production is applicable. the only applicable production is the first one. the only applicable production is the first one. c. the second production cannot be applied. amongst others. leads to success. After expanding S a match-action removes the leading c from ccccba and cA. The machine chooses between three productions: A → cBC | bSA | a and again. since the first symbol of the string ccccba to be recognised is c. The machine chooses between two productions: B → cc | Cb The next symbol of the remaining string ccba to be recognised is. since the next symbol of the remaining string cccba to be recognised is c. So now we have to derive the string cccba from A. The stack machine produces. Determinicity is obtained by looking at the set of firsts of the nonterminals. From the productions C → aS | ba it is immediately clear that a string derived from C starts with either an a or a b. LL Parsing but. After expanding A a match-action removes the leading c from cccba and cBC. The set {a. The first symbol of the alternative Cb is the nonterminal C. after two match-actions. The machine chooses between two productions C → aS and C → ba. the following sequence 170 . To decide whether it also applies we have to determine the symbols that can appear as the first element of a string derived from B starting with the second production. After expanding B and performing two match-actions it remains to derive the string ba from C.10. but the second production may be applicable as well. So now we have to derive the string ccba from BC. only the second one applies. and. Clearly. once again.

if we look at the first two symbols ab. If we look at the first two symbols bb of the input string. Again. However. Again. after matching. The problem is that the lookahead sets (the lookahead set of a production N → α is the set of terminal symbols that can appear as the first symbol of a string that can be derived from N starting with the production N → α. then we find that the only applicable production is the first one. the definition is given in the following subsection) of the two productions for S both contain a. leads to success). then we find that the first production applies (and. LL Parsing: Background stack S abA bA A bb b input abbb abbb bbb bb bb b Starting with S the machine chooses between two productions: S → abA | aa since both alternatives start with an a.10. Each string derived from A starting with the second production starts with a b and. 171 . since it is not possible to derive a string starting with another b from S. it is not sufficient to look at the first symbol a of the string to be recognised. Deriving the string abbb using gramm2 is a deterministic computation: at each step of the derivation there is only one applicable alternative for the nonterminal on the top of the stack. the second production does not apply. we can left-factor the grammar to obtain a grammar in which all productions for a nonterminal start with a different terminal symbol. After expanding and matching it remains to derive the string bb from A. From the above derivation we conclude the following.1. determinicity is obtained by analysing the set of firsts (of strings of length 2) of the nonterminals. looking ahead one symbol in the input string does not give sufficient information for choosing one of the two productions A → bb | bS for A. Alternatively.

Nonterminal symbols that can derive the empty sequence will play a central role in the grammar analysis problems which we will consider in Section 10. and which terminal symbols can follow upon a nonterminal in a derivation. Deriving the string acbab using gramm3 is a deterministic computation: at each step of the derivation there is only one applicable alternative for the nonterminal on the top of the stack. as required. and each nonempty string derived from B starts with a b. However.10. we can apply the first production. The stack machine produces the following sequence stack S AaS aS input acbab acbab acbab Starting with S the machine chooses between two productions: S → AaS | B since each nonempty string derived from A starts with a c. producing aS which. We do not explain the rest of the leftmost derivation since it does not use any empty strings any more. since A can also derive the empty string.2. From the above derivation we conclude the following. 172 . Determinicity is obtained by analysing whether or not nonterminals can derive the empty string. starts with an a. and then apply the empty string for A. LL Parsing The third example A third example is grammar gramm3 with the following productions: S → AaS | B A → cS | B → b Now we want to know whether or not the string acbab is an element of the language of this grammar. there does not seem to be a candidate production to start a leftmost derivation of acabb with.

10.1.3. LL(1) grammars

The examples in the previous subsection show that the derivations of the example sentences are deterministic, provided we can look ahead one or two symbols in the input. An obvious question now is: for which grammars are all derivations deterministic? Of course, as the second example shows, the answer to this question depends on the number of symbols we are allowed to look ahead. In the rest of this chapter we assume that we may look 1 symbol ahead. A grammar for which all derivations are deterministic with 1 symbol lookahead is called LL(1): Leftmost with a Lookahead of 1. Since all derivations of sentences of LL(1) grammars are deterministic, LL(1) is a desirable property of grammars. To formalise this definition, we define lookAhead sets.

Definition 10.1 (lookAhead set). The lookahead set of a production N → α is the set of terminal symbols that can appear as the first symbol of a string that can be derived from Nδ (where Nδ appears as a tail substring in a derivation from the start-symbol) starting with the production N → α. So

lookAhead (N → α) = {x | S ⇒∗ γNδ ⇒ γαδ ⇒∗ γxβ}

For example, for the productions of gramm1 we have

lookAhead (S → cA) = {c}
lookAhead (S → b) = {b}
lookAhead (A → cBC) = {c}
lookAhead (A → bSA) = {b}
lookAhead (A → a) = {a}
lookAhead (B → cc) = {c}
lookAhead (B → Cb) = {a, b}
lookAhead (C → aS) = {a}
lookAhead (C → ba) = {b}

We use lookAhead sets in the definition of LL(1) grammar.

Definition 10.2 (LL(1) grammar). A grammar G is LL(1) if all pairs of productions of the same nonterminal have disjoint lookahead sets, that is: for all productions N → α, N → β of G:

lookAhead (N → α) ∩ lookAhead (N → β) = ∅

Since all lookAhead sets for productions of the same nonterminal of gramm1 are disjoint, gramm1 is an LL(1) grammar. For gramm2 we have:

lookAhead (S → abA) = {a}
lookAhead (S → aa) = {a}
lookAhead (A → bb) = {b}
lookAhead (A → bS) = {b}

Here, the lookAhead sets for both nonterminals S and A are not disjoint, and it follows that gramm2 is not LL(1). As you see, gramm2 is an LL(2) grammar, where an LL(k) grammar for k ≥ 2 is defined similarly to an LL(1) grammar: instead of one symbol lookahead we have k symbols lookahead.

How do we determine whether or not a grammar is LL(1)? Clearly, to answer this question we need to know the lookahead sets of the productions of the grammar. The lookAhead set of a production N → α, where α starts with a terminal symbol x, is simply x. But what if α starts with a nonterminal P, that is α = P β, for some β? Then we have to determine the set of terminal symbols with which strings derived from P can start. But if P can derive the empty string, we also have to determine the set of terminal symbols with which a string derived from β can start. So, in order to determine the lookAhead sets of productions, we are interested in

• whether or not a nonterminal can derive the empty string (empty),
• which terminal symbols can appear as the first symbol in a string derived from a nonterminal (firsts),
• and which terminal symbols can follow upon a nonterminal in a derivation (follow).

In each of the following definitions we assume that a grammar G is given.

Definition 10.3 (Empty). Function empty takes a nonterminal N, and determines whether or not the empty string can be derived from the nonterminal:

    empty N = N ⇒∗ ε

For example, for gramm3 we have:

    empty S = False
    empty A = True
    empty B = False

Definition 10.4 (First). The set of firsts of a nonterminal N is the set of terminal symbols that can appear as the first symbol of a string that can be derived from N:

    firsts N = {x | N ⇒∗ xβ}

For example, for gramm3 we have:

    firsts S = {a, b, c}
    firsts A = {c}
    firsts B = {b}
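As a small worked example (our own illustration of how empty and firsts interact; it is not taken from the text): the value firsts S for gramm3 can be computed production by production. The production S → AaS contributes firsts A, and also the terminal a, because empty A holds; the production S → B contributes firsts B. Hence

    firsts S = firsts A ∪ {a} ∪ firsts B = {c} ∪ {a} ∪ {b} = {a, b, c}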


We could have given more restricted definitions of empty and firsts, by only looking at derivations from the start-symbol, for example,

    empty N = S ⇒∗ αN β ⇒∗ αβ

but the simpler definition above suffices for our purposes.

Definition 10.5 (Follow). The follow set of a nonterminal N is the set of terminal symbols that can follow on N in a derivation starting with the start-symbol S from the grammar G:

    follow N = {x | S ⇒∗ αN xβ}

For example, for gramm3 we have:

    follow S = {a}
    follow A = {a}
    follow B = {a}

In the following section we will give programs with which lookahead, empty, firsts, and follow are computed.
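To see where such a follow set comes from, here is one derivation (our own illustration, not from the text): a ∈ follow S because

    S ⇒ AaS ⇒ cSaS

using S → AaS and then A → cS; in the sentential form cSaS the first S is directly followed by the terminal a.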
Exercise 10.1. Give the results of the function empty for the grammars gramm1 and gramm2.

Exercise 10.2. Give the results of the function firsts for the grammars gramm1 and gramm2.

Exercise 10.3. Give the results of the function follow for the grammars gramm1 and gramm2.

Exercise 10.4. Give the results of the function lookahead for grammar gramm3. Is gramm3 an LL(1) grammar?

Exercise 10.5. Grammar gramm2 is not LL(1), but it can be transformed into an LL(1) grammar by left factoring. Give this equivalent grammar gramm2’ and give the results of the functions empty, first, follow and lookAhead on this grammar. Is gramm2’ an LL(1) grammar?

Exercise 10.6. A non-leftrecursive grammar for Bit-Lists is given by the following grammar (see your answer to Exercise 2.18):

    L → BR
    R → ε | ,BR
    B → 0 | 1

Give the results of functions empty, firsts, follow and lookAhead on this grammar. Is this grammar LL(1)?



10.2. LL Parsing: Implementation
Until now we have written parsers with parser combinators. Parser combinators use backtracking, and this is sometimes a cause of inefficiency. If a grammar is LL(1) we do not need backtracking anymore: parsing is deterministic. We can use this fact by either adjusting the parser combinators so that they don’t use backtracking anymore, or by writing a special purpose LL(1) parsing program. We present the latter in this section. This section describes the implementation of a program that parses sentences of LL(1) grammars. The program works for arbitrary context-free LL(1) grammars, so we first describe how to represent context-free grammars in Haskell. Another consequence of the fact that the program parses sentences of arbitrary context-free LL(1) grammars is that we need a generic representation of parse trees in Haskell. The second subsection defines a datatype for parse trees in Haskell. The third subsection presents the program that parses sentences of LL(1) grammars. This program assumes that the input grammar is LL(1), so in the fourth subsection we give a function that determines whether or not a grammar is LL(1). Both this and the LL(1) parsing function use a function that determines the lookahead of a production. This function is presented in the fifth subsection. The last subsections of this section define functions for determining the empty, first, and follow symbols of a nonterminal.

10.2.1. Context-free grammars in Haskell
A context-free grammar may be represented by a pair: its start symbol, and its productions. How do we represent terminal and nonterminal symbols? There are at least two possibilities.

• The rigorous approach uses a datatype Symbol:

    data Symbol a b = N a | T b

  The advantage of this approach is that nonterminals and terminals are strictly separated, the disadvantage is the notational overhead of constructors that has to be carried around. However, a rigorous implementation of context-free grammars should keep terminals and nonterminals apart, so this is the preferred implementation. But in this section we will use the following implementation:

• A class-based representation:

    class Eq s => Symbol s where
      isT :: s -> Bool
      isN :: s -> Bool
      isT =  not . isN

where isN and isT determine whether or not a symbol is a nonterminal or a terminal, respectively. This notation is compact, but terminals and nonterminals are no longer strictly separated, and symbols that are used as nonterminals cannot be used as terminals anymore.
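To illustrate the difference between the two representations (this example is ours, not from the text): with the rigorous datatype, a right-hand side such as AaS — where A and S are nonterminals and a is a terminal — would be written with explicit constructors, whereas with the class-based representation it is simply the string "AaS".

    rhs :: [Symbol Char Char]
    rhs = [N 'A', T 'a', N 'S']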



For example, the type of characters is made an instance of this class by defining:

    instance Symbol Char where
      isN c = 'A' <= c && c <= 'Z'

that is, capitals are nonterminals, and, by definition, all other characters are terminals.

A context-free grammar is a value of the type CFG:

    type CFG s = (s,[(s,[s])])

where the list in the second component associates nonterminals to right-hand sides. So an element of this list is a production. For example, the grammar with productions

    S → AaS | B | CB
    A → SC | ε
    B → A | b
    C → D
    D → d

is represented as:

    exGrammar :: CFG Char
    exGrammar =
      ('S', [('S',"AaS"),('S',"B"),('S',"CB")
            ,('A',"SC"),('A',"")
            ,('B',"A"),('B',"b")
            ,('C',"D")
            ,('D',"d")
            ]
      )

On this type we define some functions for extracting the productions, nonterminals, terminals, etc. from a grammar.

    start     :: CFG s -> s
    start     =  fst

    prods     :: CFG s -> [(s,[s])]
    prods     =  snd

    terminals :: (Symbol s, Ord s) => CFG s -> [s]
    terminals =  unions . map (filter isT . snd) . snd



where unions :: Ord s => [[s]] -> [s] returns the union of the ‘sets’ of symbols in the lists.

    nonterminals :: (Symbol s, Ord s) => CFG s -> [s]
    nonterminals =  nub . map fst . snd

Here, nub :: Ord s => [s] -> [s] removes duplicates from a list.

    symbols :: (Symbol s, Ord s) => CFG s -> [s]
    symbols grammar = union (terminals grammar) (nonterminals grammar)

    nt2prods :: Eq s => CFG s -> s -> [(s,[s])]
    nt2prods grammar s = filter (\(nt,rhs) -> nt==s) (prods grammar)

where function union returns the set union of two lists (removing duplicates). For example, we have

    ? start exGrammar
    'S'
    ? terminals exGrammar
    "abd"
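The helper functions unions and union are not defined in this excerpt. A minimal sketch that matches the signatures used here (our assumption about their intended behaviour; the implementation in the notes may differ, for example by keeping the lists sorted) is:

    import Data.List (union)

    unions :: Ord s => [[s]] -> [s]
    unions = foldr union []

where union is the duplicate-avoiding list union from Data.List.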

10.2.2. Parse trees in Haskell
A parse tree is a tree, in which each internal node is labelled with a nonterminal, and has a list of children (corresponding with a right-hand side of a production of the nonterminal). It follows that parse trees can be represented as rose trees with symbols, where the datatype of rose trees is defined by:

    data Rose a = Node a [Rose a] | Nil

The constructor Nil has been added to simplify error handling when parsing: when a sentence cannot be parsed, the ‘parse tree’ Nil is returned. Strictly speaking it should not occur in the datatype of rose trees.
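For example (our own illustration), exGrammar derives the sentence b via S ⇒ B ⇒ b, and the corresponding parse tree would be represented by the rose tree

    Node 'S' [Node 'B' [Node 'b' []]]

where the terminal leaf is a node with an empty list of children.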



10.2.3. LL(1) parsing
This section defines a function ll1 that takes a grammar and a terminal string as input, and returns a single tuple: a parse tree, and the rest of the input string that has not been parsed. So ll1 takes a grammar, and returns a parser with Rose s as its result type. As mentioned in the beginning of this section, our parser doesn’t need backtracking anymore, since parsing with an LL(1) grammar is deterministic. Therefore, the parser type is adjusted as follows:

    type Parser b a = [b] -> (a,[b])

Using this parser type, the type of the function ll1 is:

    ll1 :: (Symbol s, Ord s) => CFG s -> Parser s (Rose s)

Function ll1 is defined in terms of two functions. Function isll1 :: CFG s -> Bool is a function that checks whether or not a grammar is LL(1). And function gll1 (for generalised LL(1)) produces a list of rose trees for a list of symbols. ll1 is obtained from gll1 by giving gll1 the singleton list containing the start symbol of the grammar as argument.

    ll1 grammar input =
      if isll1 grammar
      then let ([rose], rest) = gll1 grammar [start grammar] input
           in  (rose, rest)
      else error "ll1: grammar not LL(1)"

So now we have to implement functions isll1 and gll1. Function isll1 is implemented in the following subsection. Function gll1 also uses two functions. Function grammar2ll1table takes a grammar and returns the LL(1) table: the association list that associates productions with their lookahead sets. And function choose takes a terminal symbol, and chooses a production based on the LL(1) table.

    gll1 :: (Symbol s, Ord s) => CFG s -> [s] -> Parser s [Rose s]
    gll1 grammar =
      let ll1table        = grammar2ll1table grammar
            -- the LL(1) table
          nt2prods nt     = filter (\((n,l),r) -> n==nt) ll1table
            -- nt2prods returns the productions for nonterminal
            -- nt from the LL(1) table
          selectprod nt t = choose t (nt2prods nt)
            -- selectprod selects the production for nt from the
            -- LL(1) table that should be taken when t is the next
            -- symbol in the input
      in  \stack input ->
            case stack of
              []     -> ([], input)
              (s:ss) ->
                if isT s
                then -- match
                     let (rts, rest) = gll1 grammar ss (tail input)
                     in  if s == head input
                         then (Node s [] : rts, rest)
                              -- the parse tree is a leaf (a node with
                              -- no children)
                         else ([Nil], input)
                              -- the input cannot be parsed
                else -- expand
                     let t         = head input
                         (rts, zs) = gll1 grammar (selectprod s t) input
                           -- try to parse according to the production
                           -- obtained from the LL(1) table from s
                         (rrs, vs) = gll1 grammar ss zs
                           -- parse the rest of the symbols on the stack
                     in  ((Node s rts) : rrs, vs)

Functions grammar2ll1table and choose, which are used in the above function gll1, are defined as follows. These functions use function lookaheadp, which returns the lookahead set of a production and is defined in one of the following subsections.

    grammar2ll1table :: (Symbol s, Ord s) => CFG s -> [((s,[s]),[s])]
    grammar2ll1table grammar =
      map (\x -> (x, lookaheadp grammar x)) (prods grammar)

    choose :: Eq a => a -> [((b,c),[a])] -> c
    choose t l =
      let [((s,rhs), ys)] = filter (\((x,p),q) -> t `elem` q) l
      in  rhs

Note that function choose assumes that there is exactly one element in the association list in which the element t occurs.
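As a small illustration of choose (the association list below is made up for this illustration; it is not the LL(1) table of exGrammar):

    choose 'b' [ (('S',"abA"), "a") , (('A',"bb"), "b") ]   ==   "bb"

Only the second entry has 'b' in its lookahead set, so its right-hand side is returned.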

10.2.4. Implementation of isLL(1)

Function isll1 checks whether or not a context-free grammar is LL(1). It does this by computing for each nonterminal of its argument grammar the set of lookahead sets (one set for each production of the nonterminal), and checking that all of these sets are disjoint. It is defined in terms of a function lookaheadn, which computes the lookahead sets of a nonterminal, and a function disjoint, which determines whether or not all sets in a list of sets are disjoint. All sets in a list of sets are disjoint if the length of the concatenation of these sets equals the length of the union of these sets.

    isll1 :: (Symbol s, Ord s) => CFG s -> Bool
    isll1 grammar =
      and (map (disjoint . lookaheadn grammar) (nonterminals grammar))

    disjoint :: Ord s => [[s]] -> Bool
    disjoint xss = length (concat xss) == length (unions xss)

Function lookaheadn computes the lookahead sets of a nonterminal by computing all productions of a nonterminal, and computing the lookahead set of each of these productions by means of function lookaheadp.

    lookaheadn :: (Symbol s, Ord s) => CFG s -> s -> [[s]]
    lookaheadn grammar = map (lookaheadp grammar) . nt2prods grammar

10.2.5. Implementation of lookahead

Function lookaheadp takes a grammar and a production, and returns the lookahead set of the production. It is defined in terms of four functions. Each of the first three functions will be defined in a separate subsection below; the fourth function is defined in this subsection.

• isEmpty :: (Ord s, Symbol s) => CFG s -> s -> Bool
  Function isEmpty takes a grammar and a nonterminal and determines whether or not the empty string can be derived from the nonterminal in the grammar. (This function was called empty in Definition 10.3.)

• firsts :: (Ord s, Symbol s) => CFG s -> [(s,[s])]
  Function firsts takes a grammar and computes the set of firsts of each symbol (the set of firsts of a terminal is the terminal itself).

10. on the firsts and follow association lists. Consider the production A → SC. LL Parsing • follow :: (Ord s. with the following productions: S→ AaS | B | CB A → SC | B→ A|b C→ D D→ d Consider the production S → AaS.4. Ord s) => CFG s -> (s. The lookahead set of the production contains the set of symbols which can appear as the first terminal symbol of a sequence of symbols derived from S. The lookahead set of the production contains the set of symbols which can appear as the first terminal symbol of a sequence 182 . see Section 5. Symbol s) => CFG s -> [(s. Now we define: lookaheadp :: (Symbol s.2. the lookahead set also contains the set of symbols which can appear as the first terminal symbol of a sequence of symbols derived from C. respectively. the lookahead set also contains the symbol a. and returns the lookahead set of the production. Function lookSet takes a predicate. The lookahead set of the production contains the set of symbols which can appear as the first terminal symbol of a sequence of symbols derived from A. two functions that given a nonterminal return the first and follow set.[s])] Function follow takes a grammar and computes the follow set of each nonterminal (so it associates a list of symbols with each nonterminal). since the nonterminal symbol A can derive the empty string. Function lookSet is introduced after the definition of function lookaheadp. But. Finally. [s]) [s] => -> -> -> -> ------ isEmpty firsts? follow? production lookahead set Note that we use the operator ?. since the nonterminal symbol S can derive the empty string.[s]) -> [s] lookaheadp grammar = lookSet (isEmpty grammar) ((firsts grammar)?) ((follow grammar)?) We will exemplify the definition of function lookSet with the grammar exGrammar. • lookSet :: Ord s (s -> Bool) (s -> [s]) (s -> [s]) (s. But. consider the production B → A. and a production.

LL Parsing: Implementation of symbols derived from A. We continue if the current symbol is a nonterminal which can derive the empty sequence and we stop if the current symbol is either a terminal symbol or a nonterminal symbol which cannot derive the empty sequence. The following function makes this statement more precise. The function scanrRhs is most naturally defined in terms of the function scanr. It turns out that the definition of function follow also makes use of a function lasts which is similar to the function firsts. but which deals with last nonterminal symbols rather than first terminal ones. of course. In the exercises we give an alternative characterisation of foldRhs. While doing so we compute sets of symbols for all the symbols of the right-hand side which we encounter and collect them into a final set of symbols.10. The examples also illustrate a control structure which will be used very often in the following algorithms: we will fold over right-hand sides. This function is somewhere in between a general purpose and an application specific function (we could easily have made it more general though). most naturally defined in terms of the function foldr.2. We will also need a function scanrRhs which is like foldrRhs but accumulates intermediate results in a list. But. 183 . Whenever such a list for a symbol is computed. there are always two possibilities: • either we continue folding and return the result of taking the union of the set obtained from the current element and the set obtained by recursively folding over the rest of the right-hand side • or we stop folding and immediately return the set obtained from the current element. for every nonterminal symbol n. the lookahead set also contains the set of terminal symbols which can follow the nonterminal symbol B in some derivation. foldrRhs :: Ord s => (s -> Bool) -> (s -> [s]) -> [s] -> [s] -> [s] foldrRhs p f start = foldr op start where op x xs = f x ‘union‘ if p x then xs else [] The function foldrRhs is. The examples show that it is useful to have functions firsts and follow in which. since the nonterminal symbol A can derive the empty string. we can look up the terminal symbols which can appear as the first terminal symbol of a sequence of symbols in some derivation from n and the set of terminal symbols which can follow the nonterminal symbol n in a sequence of symbols occurring in some derivation respectively.

scanrRhs p f start .10. look nt rhs = lookaheadp exGrammar (nt.[s]) -> [s] lookSet p f g (nt.rhs) = foldrRhs p f (g nt) rhs The function lookSet makes use of foldrRhs to fold over a right-hand side. lookSet :: Ord s => (s -> Bool) -> (s -> [s]) -> (s -> [s]) -> (s. As stated above. the function foldrRhs continues processing a right-hand side only if it encounters a nonterminal symbol for which p (so isEmpty in the lookSet instance lookaheadp) holds. we will also need a function scanlRhs which does the same job as scanrRhs but in the opposite direction. We can now (assuming that the definitions of the auxiliary functions are given) use the function lookaheadp instance of lookSet to compute the lookahead sets of all productions. LL Parsing scanrRhs :: Ord s => (s -> Bool) -> (s -> [s]) -> [s] -> [s] -> [[s]] scanrRhs p f start = scanr op start where op x xs = f x ‘union‘ if p x then xs else [] Finally. Thus. The easiest way to define scanlRhs is in terms of scanrRhs and reverse. reverse We now return to the function lookSet. the set g nt (follow?nt in the lookSet instance lookaheadp) is only important for those right-hand sides for nt that consist of nonterminals that can all derive the empty sequence. scanlRhs p f start = reverse .rhs) ? look dba ? look dba ? look d ? look ’S’ "AaS" ’S’ "B" ’S’ "CB" ’A’ "SC" 184 .

2. Let us have a closer look at how these lookahead sets are obtained. The corresponding subsections explain how to compute these intermediate results.10. LL Parsing: Implementation dba ? look ad ? look dba ? look b ? look d ? look d ’A’ "" ’B’ "A" ’B’ "b" ’C’ "D" ’D’ "d" It is clear from this result that exGrammar is not an LL(1)-grammar. For the lookahead set of the production A → AaS we fold over the right-hand side AaS. We will have to use the functions firsts and follow and the predicate isEmpty for computing intermediate results. for the lookahead set of the production B → A we fold over the right-hand side A In this case we fold over the complete (one element) list and and we obtain firsts? ’A’ ‘union‘ follow? ’B’ == "dba" ‘union‘ "d" == "dba" 185 . and we obtain firsts? ’S’ ‘union‘ firsts? ’C’ == "dba" ‘union‘ "d" == "dba" Finally. Folding stops at ’a’ and we obtain firsts? ’A’ ‘union‘ firsts? ’a’ == "dba" ‘union‘ "a" == "dba" For the lookahead set of the production A → SC we fold over the right-hand side SC. Folding stops at C since it cannot derive the empty sequence.

The algorithm is iterative: it does the same steps over and over again until some desired condition is met.6. This subsection defines this function. For this purpose we use function fixedPoint. LL Parsing The other lookahead sets are computed in a similar way. which takes a function and a set.2. We are now only interested in deriving sequences which contain only nonterminal symbols (since it is impossible to derive the empty string if a terminal occurs). Consider the grammar exGrammar. Therefore we only have to consider the productions in which no terminal symbols appear in the right-hand sides. We now give the Haskell implementation of the algorithm described above. Implementation of empty Many functions defined in this chapter make use of a predicate isEmpty. S → B | CB A → SC | B → A C → D One can immediately see from those productions that the nonterminal A derives the empty string in one step. which tests whether or not the empty sequence can be derived from a nonterminal.10. until the set does not change anymore. Doing the same with B as we did with A gives us the following productions S → C → D One can now conclude that the nonterminal S derives the empty string in three steps. Doing the same with S as we did with A and B gives us the following productions C → D At this stage we can conclude that there are no more new nonterminals which derive the empty string. and repeatedly applies the function to the set. | C 186 . To know whether there are any nonterminals which derive the empty string in more than one step we eliminate the productions for A and we eliminate all occurrences of A in the right hand sides of the remaining productions S → B | CB B → C → D One can now conclude that the nonterminal B derives the empty string in two steps. 10.

(’b’. plus the (first) symbols that can be derived from that symbol in one or more steps.(’A’.(’D’. Implementation of first and last Function firsts takes a grammar.10.7. Ord s) => CFG s -> s -> Bool isEmpty grammar = (‘elem‘ emptySet grammar) The emptySet of a grammar is obtained by the iterative process described in the example above. Function isEmpty determines whether or not a nonterminal can derive the empty string. isEmpty :: (Symbol s. emptySet :: (Symbol s. The set of firsts each symbol consists of that symbol itself. Function emptyStepf adds a nonterminal if there exists a production for the nonterminal of which all elements can derive the empty string. Ord s) => CFG s -> [s] -> [s] emptyStepf grammar set = nub (map fst (filter (\(nt."a")."b"). A nonterminal can derive the empty string if it is a member of the emptySet of a grammar."D") .(’a’. Consider the grammar exGrammar again.rhs) -> all (‘elem‘ set) rhs) (prods grammar) ) ) 10."A").2. LL Parsing: Implementation fixedPoint fixedPoint f xs :: Ord a => ([a] -> [a]) -> [a] -> [a] | xs == nexts = xs | otherwise = fixedPoint f nexts where nexts = f xs fixedPoint f is sometimes called the fixed-point of f. Ord s) => CFG s -> [s] emptySet grammar = fixedPoint (emptyStepf grammar) [] emptyStepf :: (Symbol s.(’B’. and at each step n of the computation of the emptySet as a fixedPoint.(’d’. We start with the empty set of nonterminals.2. We start the iteration with [(’S’."B"). we add the nonterminals that can derive the empty string in n steps. and returns for each symbol of the grammar (so also the terminal symbols) the set of terminal symbols with which a sentence derived from that symbol can start.(’C’."d") ] 187 ."C"). The first set of a terminal symbol is the terminal symbol itself. So the set of firsts can be computed by an iterative process. just as the function isEmpty."S").

"CD").(’B’.(’C’.(’d’."d")."b") .10. Ord s) => CFG s -> [(s."S").(’C’. LL Parsing Using the productions of the grammar we can derive in one step the following lists of first symbols.(’b’.(’B’."a").[x])) (symbols grammar) The step function takes the old approximation and performs one more iteration step."Dd") .(’C’."BAb").(’b’."Ab").(’A’."d")] and the union of these two lists is [(’S’."b").[s])] firsts grammar = fixedPoint (firstStepf grammar) (startSingle grammar) startSingle :: (Ord s.(’A’.(’B’."S")."a") .(’d’. 188 ."SABCDabd") . We repeat this process until the list doesn’t change anymore."SABC").(’A’.(’a’."")] and again we have to take the union of this list with the previous result."Dd") .(’D’. [(’S’. firsts :: (Symbol s."d") ] Function firsts is defined as the fixedPoint of a step function that iterates the first computation one more step."CDd") .(’D’."ABC")."SABCDabd") ."d")] In two steps we can derive [(’S’.(’a’. The fixedPoint starts with the list that contains all symbols paired with themselves. For exGrammar this happens when: [(’S’.(’A’.(’D’.[s])] startSingle grammar = map (\x -> (x."D")."SABCDabd") ."ABC"). At each of these iteration steps we have to add the start list with which the iteration started again. Symbol s) => CFG s -> [(s."SAbD").(’D’.(’C’.(’B’."AS").

b) [] insertPair (a.2.bs) ((c. LL Parsing: Implementation firstStepf :: (Ord s.b) rest) insertPair :: Eq a => (a.rhs) -> (nt.bs) rest) compose :: Ord a => [(a.[a])] -> [(a.(b:ds)):rest function single takes an element and returns the set with the element. Symbol s) => CFG s -> [(s.[s])] first1 grammar = map (\(nt.bs) <.ds) : (insert (a.bs) [] = [(a.[s])] firstStepf grammar approx = (startSingle grammar) ‘combine‘ (compose (first1 grammar) approx) combine :: Ord s => [(s. taking into account that some nonterminals can derive the empty string.[s])] -> [(s.ds):rest) if a==c then (c.10.ds):(insertPair (a. function first1 computes the direct first symbols of all productions.[b])] = foldr insertPair [] [(a. and unions returns the union of a set of sets.[a])] compose r1 r2 = [(a.unions fs)) (group (map (\(nt.b) -> insertPair (a.[b])] = [(a.[s])] -> [(s.fs) -> (nt.[a])] -> [(a.bs)] insert (a. Ord s) => CFG s -> [(s. first1 :: (Symbol s.[b])] -> [(a. union bs ds) : rest | otherwise = (c.r1] Finally.ds):rest) | a == c = (a. 189 .[s])] -> [(s.b) ((c.[s])] combine xs = foldr insert xs where insert (a.[b])] = else (c. unions (map (r2?) bs)) | (a. and combines the results for the different nonterminals.b)] -> [(a.foldrRhs (isEmpty grammar) single [] rhs ) ) (prods grammar) ) ) where group groups elements with the same first element together group group :: Eq a => [(a.

The previous subsections explain how to compute these functions. Ord s) => CFG s -> [(s. compute the symbols which appear at the end resp. The lists of all nonterminal symbols that can appear at the end of sequences of symbols derived from "A" and "Aa" are "ADC" and "" respectively.al) -> (s. at the beginning of strings which can be derived from the left resp. a ’b’ and an ’a’ From the second pair we see that an ’S’. A nonterminal n is associated to a list containing terminal t in follow if n and t follow each other in some sequence of symbols occurring in some leftmost derivation. a ’D’. a ’D’ and a ’C’ can be be followed by an ’a’.map (\(nt. From the first pair we can see that a ’C’ and a ’D’ can be followed by a ’d’. right part of the split alternative. and a ’B’ can be followed by a ’d’. The lists of all terminal symbols which can appear at the beginning of sequences of symbols derived from "aS" and "S" are "a" and "dba" respectively. which takes a grammar. Combining all these results gives: [(’S’.rhs) -> (nt.[s])] = firsts . Splitting the alternative "CB" in the middle produces sets of firsts and sets of lasts "CD" and "dba".(’D’. Implementation of follow The final function we have to implement is the function follow. and returns an association list in which nonterminals are associated to symbols that can follow upon the nonterminal in a derivation."adb")] The function follow uses the functions scanrAlt and scanlAlt. Zipping together those lists shows that an ’A’.8.(’C’.(’A’."d"). Suppose we reverse the right-hand sides of all productions of a grammar. Let’s see what happens with the alternative "AaS".(’B’. an ’A’. Splitting the alternative "SC" in the middle produces sets of firsts and sets of lasts "SDCAB" and "d". The lists produced by these functions are exactly the ones we need: using the function zip from the 190 . reverseGrammar :: Symbol s => CFG s -> CFG s reverseGrammar = \(s.10. Function follow uses the functions firsts and lasts and the predicate isEmpty for intermediate results.reverse rhs)) al) lasts lasts :: (Symbol s. LL Parsing Function lasts is defined using function firsts. Our grammar exGrammar has three alternatives with length at least 2: "AaS".2. "CB" and "SC". Then the set of firsts of this reversed grammar is the set of lasts of the original grammar. This idea is implemented in the following functions."adb"). We can compute pairs of such adjacent symbols by splitting up the right-hand sides with length at least 2 and. reverseGrammar 10. a ’C’."ad"). using lasts and firsts."d").

"a". followNE :: (Symbol s. []] Only the two middle elements of both lists are important (they correspond to the nontrivial splittings of "AaS").map snd prods .isEmpty [(s.nonterminal lasts [(s. For example: for the alternative "AaS" the functions scanlAlt and scanrAlt produce the following lists: [[].xs) -> (x. we only have to consider alternatives of length at least 2. LL Parsing: Implementation standard prelude we can combine the lists.2. Function splitProds splits the productions of length at least 2.zip (rightscan rhs) (leftscan rhs) 191 .terminal firsts [(s. ls) <. and all terminals from the set of lasts. Ord s) => CFG s -> [(s.[s])] -> -.10. firsts isNlasts = map (\(x. "SDCAB"] ["dba". Ord s) => CFG s -> [(s.filter isT xs)) .nub rhs)) (group pairs) where pairs = [(l.[s])] -> -. f) | rhs <. Ord s) => [(s. "DC".[])) (symbols grammar) The real work is done by functions followNE and function splitProds.[s])] follow grammar = combine (followNE grammar) (startEmpty grammar) startEmpty grammar = map (\x -> (x. length rhs >= 2 . Function followNE passes the right arguments on to function splitProds. (fs. We start the computation of follow with assigning the empty follow set to each symbol: follow :: (Symbol s.xs) -> (x. and pairs the last nonterminals with the first terminals.filter isN xs)) .rhs) -> (nt. "ADC". and removes all nonterminals from the set of firsts.[s])] followNE grammar = splitProds (prods grammar) (isEmpty grammar) (isTfirsts grammar) (isNlasts grammar) where isTfirsts = map (\(x. lasts splitProds :: (Symbol s.[s])] splitProds prods p fset lset = map (\(nt. Thus.[s])] -> -.productions (s -> Bool) -> -. "dba".

10. 192 . Define a function pref p :: [a] → [a] which given a list xs returns the set of elements corresponding to the longest prefix of xs all of whose elements satisfy p. Give the Rose tree representation of the parse tree corresponding to the derivation of the sentence acbab using grammar gramm3.ls f <. Give the Rose tree representation of the parse tree corresponding to the derivation of the sentence ccccba using grammar gramm1. Since the code in this module is obscured by several implementation details we will derive the functions foldrRhs and scanrRhs in a stepwise fashion. In this derivation we will use the following: A (finite) set is implemented by a list with no duplicates. Give the Rose tree representation of the parse tree corresponding to the derivation of the sentence abbb using grammar gramm2’ defined in the exercises of the previous section. Define list2Set as a foldr. In order to construct a set. Define prefplus p as a foldr.fs = scanlRhs p (lset?) [] = scanrRhs p (fset?) [] Exercise 10. 1.7. 2. It can be shown that foldrRhs p f [] = unions . 4. Define a function list2Set :: [a] → [a] which returns the set of elements occurring in the argument. 3. map f . Exercise 10. Exercise 10. 5. A calculus for finite sets is implicit in the programs for LL(1) parsing. ] leftscan rightscan l <.10. the following operations may be used: [] union unions single :: :: :: :: [a] [a] → [a] → [a] [[a]] → [a] a → [a] the the the the empty set of a-elements union of two sets generalised union singleton function These operations satisfy the well-known laws for set operations. In this exercise we will take a closer look at the functions foldrRhs and scanrRhs which are the essential ingredients of the implementation of the grammar analysis algorithms. Exercise 10. prefplus p for all set-valued functions f. 7. LL Parsing . Define a function prefplus p :: [a] → [a] which given a list xs returns the set of elements in the longest prefix of xs all of whose elements satisfy p together with the first element of xs that does not satisfy p (if this element exists at all). From the definitions it is clear that grammar analysis is easily expressed via a calculus for (finite) sets. Give an informal description of the functions foldrRhs p f [] and foldrRhs p f start.8. 6. Show that prefplus p = foldrRhs p single [].9. .

Exercise 10. For the example grammar gramm3 and its production S → AaS a) foldrRhs empty firsts [] AaS b) scanrRhs empty firsts [] AaS tails 193 . For terminal symbols s these functions are defined by empty s = False firsts s = {s} Using the definitions in the previous exercise. Give an informal description of the function scanrRhs.2. The function scanrRhs is defined is a similar way scanrRhs p f start = map (foldrRhs p f start) .11. LL Parsing: Implementation 8.10. a) foldrRhs empty firsts [] bSA b) foldrRhs empty firsts [] Cb 2. compute the following. 1. tails where tails is a function which takes a list xs and returns a list with all tailsegments (postfixes) of xs in decreasing length. The standard function scanr is defined by scanr f q0 = map (foldr f q0) . For the example grammar gramm1 and two of its productions A → bSA and B → Cb. The computation of the functions empty and firsts is not restricted to nonterminals only.


11. LL versus LR parsing

The parsers that were constructed using parser combinators in Chapter 3 are nondeterministic. They can recognize sentences described by ambiguous grammars by returning a list of solutions, rather than just one solution. Nondeterministic parsers are of the following type:

    type Parser a b = [a] -> [ (b, [a]) ]

where a denotes the alphabet type, and b denotes the parse tree type. Each solution contains a parse tree and the part of the input that was left unprocessed.

In chapter 10 we turned to deterministic parsers. Here, parsers have only one result, consisting of a parse tree of type b and the remaining part of the input string:

    type Parser a b = [a] -> (b, [a])

Ambiguous grammars are not allowed anymore. Also, we cannot return an empty list of successes anymore in the case of parsing input containing syntax errors; instead there should be some mechanism of raising an error. Deterministic parsers impose some additional constraints on the form of the grammar: not every context free grammar is allowable by these algorithms, including the so-called LL(1)-property to which grammars must abide.

There are two fundamentally different deterministic parsing algorithms:

• LL parsing, also known as top-down parsing
• LR parsing, also known as bottom-up parsing

The first ‘L’ in these acronyms stands for ‘Left-to-right’, that is, input is processed in the order it is read. So these algorithms are both suitable for reading an input stream, e.g. from a file. The second ‘L’ in ‘LL-parsing’ stands for ‘Leftmost derivation’, as the parsing mimics doing a leftmost derivation of a sentence (see Section 2.4). The ‘R’ in ‘LR-parsing’ stands for ‘Rightmost derivation’ of the sentences. The parsing algorithms are normally referred to as LL(k) or LR(k), where k is the number of unread symbols that the parser is allowed to ‘look ahead’. In most practical cases, k is taken to be 1. The parsing algorithms can be modelled by making use of a stack machine.

In Chapter 10 we studied the LL(1) parsing algorithm extensively. In this chapter we start with an example application of the LL(1) algorithm. Next, we turn to the LR(1) algorithm.
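The text does not fix a particular error mechanism here; as an illustration only (this is our sketch, not a definition from these notes — the implementations in Chapter 10 instead return the dummy tree Nil or call error), the deterministic parser type could signal errors with Either:

    type Parser' a b = [a] -> Either String (b, [a])

A parser of this type either produces a message describing the syntax error, or exactly one parse tree together with the unconsumed rest of the input.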

11.1. LL(1) parser example

The LL(1) parsing algorithm was implemented in Section 10.2.3. Function ll1 defined there takes a grammar and an input string, and returns a single result consisting of a parse tree and rest string. The function was implemented by calling a generalized function named gll1, generalized in the sense that the function takes an additional list of symbols that need to be recognized, in addition to the input string. That generalization is then called with a singleton list containing only the root nonterminal.

11.1.1. An LL(1) checker

Here, we define a slightly simplified version of the algorithm: it doesn’t return a parse tree, so it need not be concerned with building it; it merely checks whether or not the input string is a sentence. Hence the result of the function is simply a boolean value:

    check :: String -> Bool
    check input = run ['S'] input

As in chapter 10, the function is implemented by calling a generalized function run with a singleton containing just the root nonterminal. Now the function run takes, in addition to the input string, a list of symbols (which is initially called with the above-mentioned singleton). That list is used as some kind of stack, as elements are prepended at its front, and removed at the front in other occasions. Therefore we refer to the whole operation as a stack machine.

Function run is defined as follows:

    run :: Stack -> String -> Bool
    run []     []     = True
    run []     (x:xs) = False
    run (a:as) input
      | isT a =    not (null input)
                && a == hd input
                && run as (tl input)
      | isN a =    run (rs ++ as) input
      where rs = select a (hd input)

So, when called with an empty stack and an empty string, the function succeeds. If the stack is empty but the input is not, it fails, as there is junk input present. If the stack is nonempty, case distinction is done on a, the top of the stack. If it is a terminal, the input should begin with it, and the machine is called recursively for the rest of the input. In the case of a nonterminal, we push rs on the stack, and leave the input unchanged in the recursive call. This is a simple tail recursive function, and could imperatively be implemented in a single loop that runs while the stack is non-empty, and that’s why the function is named run: it runs the stack machine.

The new symbols rs that are pushed on the stack is the right hand side of a production rule for nonterminal a. It is selected from the available alternatives by function select. For making the choice, the first input symbol is passed to select as well. This is where you can see that the algorithm is LL(1): it is allowed to look ahead one symbol. This is how the alternative is selected:

    select a x = (snd . hd . filter ok . prods) gram
      where ok p@(n,_) = n==a && x `elem` lahP gram p

So, from all productions of the grammar returned by prods, the ok ones are taken, of which there should be only one (this is ensured by the LL(1)-property). Now a production is ok, if the nonterminal n is a, the one we are looking for, and moreover the first input symbol x is member of the lookahead set of this production. That single production is retrieved by hd, and of it only the right hand side is needed, hence the call to snd.

11.1.2. An LL(1) grammar

Determining the lookahead sets of all productions of a grammar is the tricky part of the LL(1) parsing algorithm. It is described in Sections 10.2.5 to 10.2.8. Though the general formulation of the algorithm is quite complex, its application in a simple case is rather intuitive. So let’s study an example grammar: arithmetic expressions with operators of two levels of precedence.

The idea for the grammar for arithmetical expressions was given in Section 2.5: we need auxiliary notions of ‘Term’ and ‘Factor’ (actually, we need as much additional notions as there are levels of precedence). The most straightforward definition of the grammar is shown in the left column below. For being able to treat end-of-input as if it were a character, we also add an additional rule, which says that the input consists of an expression followed by end-of-input (designated as ‘#’ here).

Unfortunately, this grammar doesn’t abide to the LL(1)-property. The reason for this is that the grammar contains rules with common prefixes on the right hand side (for E and T). A way out is the application of a grammar transformation known as left factoring, as described in Section 2.5.4. The result is a grammar where there is only one rule for E and T, and the non-common part of the right hand sides is described by additional nonterminals, P and M. The resulting grammar is shown in the right column below.

    E → T              S → E #
    E → T + E          E → T P
    T → F              P → ε
    T → F * T          P → + E
    F → N              T → F M
    F → ( E )          M → ε
    N → 1              M → * T
    N → 2              F → N
    N → 3              F → ( E )
                       N → 1
                       N → 2
                       N → 3

For determining the lookahead sets of each nonterminal, we also need to analyze whether a nonterminal can produce the empty string, and which terminals can be the first symbol of any string produced by each nonterminal. These properties are named empty and first respectively. Note that empty and first are properties of a nonterminal, whereas lookahead is a property of a single production rule. All properties for the example grammar are summarized in the tables below.

    nonterminal   empty   first
    S             no      ( 1 2 3
    E             no      ( 1 2 3
    P             yes     +
    T             no      ( 1 2 3
    M             yes     *
    F             no      ( 1 2 3
    N             no      1 2 3

    production    lookahead
    S → E #       first(E)  = ( 1 2 3
    E → T P       first(T)  = ( 1 2 3
    P → ε         follow(P) = ) #
    P → + E       immediate   +
    T → F M       first(F)  = ( 1 2 3
    M → ε         follow(M) = + ) #
    M → * T       immediate   *
    F → N         first(N)  = 1 2 3
    F → ( E )     immediate   (
    N → 1         immediate   1
    N → 2         immediate   2
    N → 3         immediate   3

11.1.3. Using the LL(1) parser

A sentence of the language described by the example grammar is 1+2*3. Because of the high precedence of * relative to +, the parse tree should reflect that the 2 and 3 should be multiplied, rather than the 1+2 and 3. Indeed, the parse tree does so:
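The notes show this parse tree as a picture. As an additional illustration (our own rendering, not the drawing from the notes), the same tree can be written as a value of the Rose datatype from Section 10.2.2, taking E as the root and writing every grammar symbol as a character:

    Node 'E'
      [ Node 'T'
          [ Node 'F' [Node 'N' [Node '1' []]]
          , Node 'M' []                          -- M → ε
          ]
      , Node 'P'
          [ Node '+' []
          , Node 'E'
              [ Node 'T'
                  [ Node 'F' [Node 'N' [Node '2' []]]
                  , Node 'M'
                      [ Node '*' []
                      , Node 'T'
                          [ Node 'F' [Node 'N' [Node '3' []]]
                          , Node 'M' []          -- M → ε
                          ]
                      ]
                  ]
              , Node 'P' []                      -- P → ε
              ]
          ]
      ]

The subtree for the multiplication hangs below the M of the second term, so 2 and 3 are indeed grouped together.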

Now when we do a step-by-step analysis of how the parse tree is constructed by the stack machine, we notice that the parse tree is traversed in a depth-first fashion, where the left subtrees are analysed first. The tree is constructed top-down: first the root is visited, and then each time the first remaining nonterminal is expanded. Each node corresponds to the application of a production rule. The order in which the production rules are applied is a preorder traversal of the tree. From the table in Figure 11.1 it is clear that the contents of the stack describes what is still to be expected on the input. Initially, this is the root nonterminal S. Whenever a terminal is on top of the stack, the corresponding symbol is read from the input.

11.2. LR parsing

Another algorithm for doing deterministic parsing using a stack machine is LR(1) parsing. Actually, what is described in this section is known as Simple LR parsing or SLR(1)-parsing. It is still rather complicated, though. A nice property of LR-parsing is that it is in many ways exactly the opposite, or dual, of LL-parsing. Some of these ways are:

• LL does a leftmost derivation, LR does a rightmost derivation


• LL starts with only the root nonterminal on the stack, LR ends with only the root nonterminal on the stack
• LL ends when the stack is empty, LR starts with an empty stack
• LL uses the stack for designating what is still to be expected, LR uses the stack for designating what has already been seen
• LL builds the parse tree top down, LR builds the parse tree bottom up
• LL continuously pops a nonterminal off the stack, and pushes a corresponding right hand side; LR tries to recognize a right hand side on the stack, pops it, and pushes the corresponding nonterminal
• LL thus expands nonterminals, while LR reduces them
• LL reads terminals when it pops one off the stack, LR reads terminals while it pushes them on the stack
• LL uses grammar rules in an order which corresponds to pre-order traversal of the parse tree, LR does a post-order traversal.

11.2.1. A stack machine for SLR parsing
As in Section 10.1.1 a stack machine is used to parse sentences. A stack machine for a grammar G has a stack and an input. When the parsing starts the stack is empty. The stack machine performs one of the following two actions.

1. Shift: Move the first symbol of the remaining input to the top of the stack.
2. Reduce: Choose a production rule N → α; pop the sequence α from the top of the stack; push N onto the stack.

These actions are performed until the stack only contains the start symbol. A stack machine for G accepts an input if it can terminate with only the start symbol on the stack when the whole input has been processed. Let us take the example of Section 10.1.1.

    stack   input
            aab
    a       ab
    aa      b
    baa
    Saa
    Sa
    S

Note that with our notation the top of the stack is at the left side. To decide which production rule is to be used for the reduction, the symbols on the stack must be read in reverse order.



The stack machine for G accepts the input string aab because it terminates with the start symbol on the stack and the input is completely processed. The stack machine performs three shift actions and then three reduce actions. The SLR parser only performs a reduction with a grammar rule N → α if the next symbol on the input is in the follow set of N .
Exercise 11.1. Give for gramm1 of Section 10.1.2 all the states of the stack machine for a derivation of the input ccccba. Exercise 11.2. Give for gramm2 of Section 10.1.2 all the states of the stack machine for a derivation of the input abbb. Exercise 11.3. Give for gramm3 of Section 10.1.2 all the states of the stack machine for a derivation of the input acbab.

11.3. LR parse example
11.3.1. An LR checker
For a start, the main function for LR parsing calls a generalized stack function with an empty stack:
    check' :: String -> Bool
    check' input = run' [] input

The stack machine terminates when it finds the stack containing just the root nonterminal. Otherwise it either pushes the first input symbol on the stack (‘Shift’), or it drops some symbols off the stack (which should be the right hand side of a rule) and pushes the corresponding nonterminal (‘Reduce’).
    run' :: Stack -> String -> Bool
    run' ['S'] []     = True
    run' ['S'] (x:xs) = False
    run' stack (x:xs) =
      case action of
        Shift      -> run' (x:stack) xs
        Reduce a n -> run' (a : drop n stack) (x:xs)
        Error      -> False
      where action = select' stack x
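The code above pattern-matches on the constructors Shift, Reduce and Error; the corresponding datatype is not shown in this excerpt. A sketch of what is assumed (ours):

    data Action s = Shift          -- push the next input symbol
                  | Reduce s Int   -- push nonterminal s after popping Int symbols
                  | Error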

In the case of LL-parsing the hard part was selecting the right rule to expand; here we have the hard decision of whether to reduce according to a (and which?) rule, or to shift the next symbol. This is done by the select’ function, which is allowed to inspect the first input symbol x and the entire stack: after all, it needs to find the right hand side of a rule on the stack. In the right column in Figure 11.1 the LR derivation of sentence 1+2*3 is shown. Compare closely to the left column, which shows the LL derivation, and note the duality of the processes.


11. LL versus LR parsing LL derivation rule S→E E → TP T → FM F →N N →1 read M→ P → +E read E → TP T → FM F →N N →2 read M → *T read T → FM F →N N →3 read M→ P → read ↓stack S E TP FMP NMP 1M P MP P +E E TP FMP NMP 2M P MP *T P TP FMP NMP 3M P MP P remaining 1+2*3 1+2*3 1+2*3 1+2*3 1+2*3 1+2*3 +2*3 +2*3 +2*3 2*3 2*3 2*3 2*3 2*3 *3 *3 3 3 3 3 rule shift N →1 F →N M→ T → FM shift shift N →2 F →N shift shift N →3 F →N M→ T → FM M → *T T → FM P → E → TP P → +E E → TP S→E LR derivation read 1 1 1 1 1 1+ 1+2 1+2 1+2 1+2* 1+2*3 1+2*3 1+2*3 1+2*3 1+2*3 1+2*3 1+2*3 1+2*3 1+2*3 1+2*3 1+2*3 1+2*3 stack↓ 1 N F FM T T+ T +2 T +N T +F T +F * T +F *3 T +F *N T +F *F T +F *F M T +F *T T +F M T +T T +T P T +E TP E S remaining 1+2*3 +2*3 +2*3 +2*3 +2*3 +2*3 2*3 *3 *3 *3 3

1 1 1 1+ 1+ 1+ 1+ 1+ 1+2 1+2 1+2* 1+2* 1+2* 1+2* 1+2*3 1+2*3 1+2*3

Figure 11.1.: LL and LR derivations of 1+2*3

11.3.2. LR action selection
The choice whether to shift or to reduce is made by function select’. It is defined as follows:
    select' as x
      | null items    = Error
      | null redItems = Shift
      | otherwise     = Reduce a (length rs)
      where items    = dfa as
            redItems = filter red items
            (a,rs,_) = hd redItems

In the selection process a set (list) of so-called items plays a role. If the set is empty, there is an error. If the set contains at least one item, we filter the red, or reducible items from it. There should be only one, if the grammar has the LR-property. (Or rather: it is the LR-property that there is only one element in this situation). The reducible item is the production rule that can be reduced by the stack machine. Now what are these items? An item is defined to be a production rule, augmented



with a ‘cursor position’ somewhere in the right hand side of the rule. So for the rule F → (E), there are four possible items: F → ·(E), F → ( · E), F → (E · ) and F → (E)·, where · denotes the cursor. for the rule F → 2 we have two items: one having the cursor in front of the single symbol, and one with the cursor after it: F → · 2 and F → 2 ·. For epsilon-rules there is a single item, where the cursor is at position 0 in the right hand side. In the Haskell representation, the cursor is an integer which is added as a third element of the tuple, which already contains nonterminal and right hand side of the rule. The items thus constructed are taken to be the states of a NFA (Nondeterministic Finite-state Automaton), as described in Section 8.1.2. We have the following transition relations in this NFA: • The cursor can be ‘stepped’ to the next position. This transition is labeled with the symbol that is hopped over • If the cursor is on front of a nonterminal, we can jump to an item describing the application of that nonterminal, where the cursor is at the beginning. This relation is an epsilon-transition of the NFA.
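Under that representation — a tuple of nonterminal, right-hand side and cursor position — the two transition relations can be sketched as follows. This code is our own illustration; the names Item, step and jumps do not occur in the notes.

    type Item s = (s, [s], Int)

    -- Step the cursor over the next symbol; the symbol hopped over is
    -- the label of this transition.
    step :: Item s -> Maybe (s, Item s)
    step (n, rhs, c)
      | c < length rhs = Just (rhs !! c, (n, rhs, c + 1))
      | otherwise      = Nothing

    -- Epsilon-transitions: if the cursor is in front of a nonterminal,
    -- jump to every item for that nonterminal with the cursor at position 0.
    jumps :: (Symbol s, Eq s) => CFG s -> Item s -> [Item s]
    jumps grammar (n, rhs, c)
      | c < length rhs && isN (rhs !! c) =
          [ (nt, r, 0) | (nt, r) <- prods grammar, nt == rhs !! c ]
      | otherwise = []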

As an example, let’s consider a simplification of the arithmetic expression grammar:

    E → T
    E → T + E
    T → N
    T → N * T
    T → ( E )

It skips the ‘factor’ notion as compared to the grammar earlier in this chapter, so it misses some well-formed expressions, but it serves only as an example for creating the states here. There are 18 states in this machine, as depicted here:

203

11. LL versus LR parsing

Exercise 11.4 (no answer provided). How can you predict the number of states from inspecting the grammar?

Exercise 11.5 (no answer provided). In the picture, the transition arrow labels are not shown. Add them.

Exercise 11.6 (no answer provided). What makes this FA nondeterministic?

As was shown in Section 8.1.4, another automaton can be constructed that is deterministic (a DFA). That construction involves defining states which are sets of the original states. So in this case, the states of the DFA are sets of items (where items are rules with a cursor position). In the worst case we would have 218 states for the DFA-version of our example NFA, but it turns out that we are in luck: there are only 11 states in the DFA. Its states are rather complicated to depict, as they are sets of items, but it can be done:


Exercise 11.7 (no answer provided). Check that this FA is indeed deterministic.

Given this DFA, let’s return to our function that selects whether to shift or to reduce:

    select' as x
      | null items    = Error
      | null redItems = Shift
      | otherwise     = Reduce a (length rs)
      where items    = dfa as
            redItems = filter red items
            (a,rs,_) = hd redItems

It runs the DFA, using the contents of the stack (as) as input. From the state where the DFA ends, which is by construction a set of items, the ‘red’ ones are filtered out. An item is ‘red’ (that is: reducible) if the cursor is at the end of the rule.

Exercise 11.8 (no answer provided). Which of the states in the picture of the DFA contain red items? How many?

This is not the only condition that makes an item reducible; the second condition is that the lookahead input symbol is in the follow set of the nonterminal that is reduced to. Function follow was also needed in the LL analysis, in Section 10.2.8. Both conditions are in the formal definition:

    where red (a,r,c) = c==length r && x `elem` follow a

Exercise 11.9 (no answer provided). How does this condition reflect that ‘the cursor is at the end’?

11.3.3. LR optimizations and generalizations

The stack machine run', by its nature, pushes and pops the stack continuously, and does a recursive call afterwards. In each call, the DFA is run on the (new) stack. In practice, it is considered a waste of time to do the full DFA transitions each time, as most of the stack remains the same after some popping and pushing at the top. Therefore, as an optimization, at each stack position the corresponding DFA state is also stored. The states of the DFA can easily be numbered, so this amounts to just storing extra integers on the stack, tupled with the symbols that used to be on the stack. (An artificial bottom element should be placed on the stack initially, containing a dummy symbol and the number of the initial DFA state.)

By analysing the grammar, two tables can be precomputed:

• Shift, that decides what is the new state when pushing a terminal symbol on the stack. This basically is the transition relation on the DFA.
• Action, that decides what action to take from a given state seeing a given input symbol.

Both tables can be implemented as a two-dimensional table of integers, of dimensions the number of states (typically, 100s to 1000s) times the number of symbols (typically, under 100).

As said in the introduction, the algorithm described here is a mere Simple LR parsing, or SLR(1). Its simplicity is in the reducibility test, the SLR red function, which says:

    where red (a,r,c) = c==length r && x `elem` follow a

The follow set is a rough approximation of what might follow a given nonterminal. But this set is not dependent of the context in which the nonterminal is used. By its nature, the SLR red function may designate items as reducible in some contexts where it actually should not. For some grammars this might lead to a decision to reduce where it should have done a shift, with a failing parser as a consequence.

An improvement, maybe, would be to make the follow set context dependent. This means that it should vary for each item instead of for each nonterminal. In some contexts, the set is smaller, leading to a wider class of grammars that are allowable. It leads, however, to a dramatic increase of the number of states in the DFA. And the full power of LR parsing is rarely needed.

A compromise position is taken by the so-called LALR parsers, or Look Ahead LR parsing. (A rather silly name, as all parsers look ahead.) It is not really a natural notion; rather, it is a performance hack that gets most of LR power while keeping the size of the goto- and action-tables reasonable. In LALR parsers, when states in the DFA differ only with respect to the follow-part of their set-members (and not with respect to the item-part of them), the states are merged. Grammars that do not give rise to shift/reduce conflicts in this situation are said to be LALR-grammars.

A widely used parser generator tool named yacc (for ‘yet another compiler compiler’) is based on an LALR engine. It comes with Unix, and was originally created for implementing the first C compilers. A commonly used clone of yacc is named Bison. Yacc is designed for doing the context-free aspects of analysing a language. The micro structure of identifiers, numbers, keywords, comments etc. is not handled by this grammar. Instead, it is described by regular expressions, which are analysed by an accompanying tool to yacc named lex (for ‘lexical scanner’). Lex is a preprocessor to yacc, that subdivides the input character stream into a stream of meaningful tokens, such as numbers, identifiers, operators, keywords etc.

Bibliographical notes

The example DFA and NFA for LR parsing, and part of the description of the algorithm, were taken from course notes by Alex Aiken and George Necula, which can be found at www.cs.wright.edu/~tkprasad/courses/cs780




A. The Stack module

module Stack
  ( Stack
  , emptyStack
  , isEmptyStack
  , push
  , pushList
  , pop
  , popList
  , top
  , split
  , mystack
  ) where

data Stack x = MkS [x] deriving (Show, Eq)

emptyStack :: Stack x
emptyStack = MkS []

isEmptyStack :: Stack x -> Bool
isEmptyStack (MkS xs) = null xs

push :: x -> Stack x -> Stack x
push x (MkS xs) = MkS (x : xs)

pushList :: [x] -> Stack x -> Stack x
pushList xs (MkS ys) = MkS (xs ++ ys)

pop :: Stack x -> Stack x
pop (MkS xs) = if isEmptyStack (MkS xs)
               then error "pop on emptyStack"
               else MkS (tail xs)

popIf :: Eq x => x -> Stack x -> Stack x
popIf x stack = if top stack == x
                then pop stack
                else error "argument and top of stack don't match"

popList :: Eq x => [x] -> Stack x -> Stack x
popList xs stack = foldr popIf stack (reverse xs)

top :: Stack x -> x
top (MkS xs) = if isEmptyStack (MkS xs)
               then error "top on emptyStack"
               else head xs

split :: Int -> Stack x -> ([x], Stack x)
split 0     stack          = ([], stack)
split n     (MkS [])       = error "attempt to split the emptystack"
split (n+1) (MkS (x : xs)) = (x : ys, stack')
  where (ys, stack') = split n (MkS xs)

mystack = MkS [1, 2, 3, 4, 5, 6, 7]
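A few example evaluations, not part of the original module, illustrate the exported operations; the session below assumes the module has been loaded into an interpreter:

?> top mystack
1
?> split 2 mystack
([1,2],MkS [3,4,5,6,7])
?> top (pushList [8,9] mystack)
8
?> isEmptyStack emptyStack
True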

B. Answers to exercises

2.1 Three of the four strings are elements of L∗: abaabaaabaa, aaaabaaaa, baaaaabaa.

2.2 {ε}

2.3
∅L
=   { Definition of concatenation of languages }
{st | s ∈ ∅, t ∈ L}
=   { s ∈ ∅ }
∅
The other equalities can be proved in a similar fashion.

2.4 The star operator on sets injects the elements of a set in a list; the star operator on languages concatenates the sentences of the language. The former star operator preserves more structure.

2.5 Section 2.1 contains an inductive definition of the set of sequences over an arbitrary set X. Syntactical definitions for such sets follow immediately from this.
1. A grammar for X = {a} is given by
S → ε
S → aS
2. A grammar for X = {a, b} is given by
S → ε
S → X S
X → a | b

2.6 A context free grammar for L is given by
S → ε
S → aS b
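As an aside that is not part of the original answer, the grammar of 2.6 can be turned directly into a parser in the style of Chapter 3. The sketch below (in plain ASCII notation) returns the number of a's; it assumes the Parser type and the combinators symbol, succeed, <$>, <*> and <|> from that chapter are in scope:

anbn :: Parser Char Int
anbn =  (\_ n _ -> n + 1) <$> symbol 'a' <*> anbn <*> symbol 'b'
    <|> succeed 0

For example, anbn "aabb" yields [(2,""),(0,"aabb")]: the first result consumes the complete sentence, the second corresponds to the empty derivation S → ε.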

B.e. An example of a grammar that can be derived from the inductive definition is: S → ε | aS b | bS a | SS Again. An example of a grammar that can be derived from the inductive definition is: P → ε | 1P 1 | 0P | P 0 There are many other solutions. ∅. Therefore the start symbol cannot be a sentence of the language.9 First establish an inductive definition for parity sequences. so no sentences can be derived. i. 2. {ε}. The nonterminals of a grammar do not belong to the alphabet (the set of terminals) of the language we describe using the grammar. The start symbol is a nonterminal.. 2. 2.e. 214 . P →ε | a | b | aP a | bP b 2.10 Again.13 This grammar generates the empty language.12 The language consisting of the empty string only.. i. 2. the language is the set {ab}∗ . As a consequence we have to perform at least one derivation step from the start symbol before we end up with a sentence of the language. cannot produce any sentences with only terminal symbols. M →ε | aM a | bM b 2. 2.7 Analogous to the construction of the grammar for PAL. establish an inductive definition for L. Answers to exercises 2.11 A sentence is a sentential form consisting only of terminals which can be derived in zero or more derivation steps from the start symbol (to be more precise: the sentential form consisting only of the start symbol). grammars such as this one that have no production rules without nonterminals on the right hand side.e. In general. Each derivation will always contain nonterminals.. there are many other solutions. i.8 Analogous to the construction of PAL.14 The sentences in this language consist of zero or more concatenations of ab.

2.15 Yes. Each finite language is context free. A context free grammar can be obtained by taking one nonterminal and adding a production rule for each sentence in the language. For the language in Exercise 2.1, this procedure yields
S → ab
S → aa
S → baa

2.16 To bring the grammar into the form where we can directly apply the rule for associative separators, we introduce a new nonterminal:
A → AaA
A → B
B → b | c
Now we can remove the ambiguity:
A → B aA
A → B
B → b | c
It is now (optionally) possible to undo the auxiliary step of introducing the additional nonterminal by applying the rules for substituting right hand sides for nonterminals and removing unreachable productions. We then obtain:
A → baA | caA
A → b | c

2.17 1. Here are two parse trees for the sentence if b then if b then a else a:
(The two parse trees are not reproduced here; one attaches the else part to the inner if, the other to the outer if.)

now with names for the productions: Empty: A: B: A2 : B2 : P P P P P →ε →a →b → aP a → bP b 216 . 2.19 1. Answers to exercises 2.LZ |. Choosing standard symbols.20 Of course.18 An equivalent grammar for bit lists is L →B Z |B Z →. This means we prefer the second of the two parse trees above. We could also give an unambiguous version. The disambiguating rule is incorporated directly into the grammar: → MatchedS | UnmatchedS → if b then MatchedS else MatchedS | a UnmatchedS → if b then S | if b then MatchedS else UnmatchedS S MatchedS 3.B. The rule we apply is: match else with the closest previous unmatched else. The grammar generates the language {a 2n b m | m. we can choose how we want to represent the different operators in concrete syntax. this is one possibility: Expr → Expr + Expr | Expr * Expr | Int where Int is a nonterminal that produces integers. n ∈ N}. An equivalent non left recursive grammar is S → AB A → ε | aaA B → ε | bB 2. Note that the grammar given above is ambiguous. 2. An else clause is always matched with the closest previous unmatched if.L B →0|1 2.21 Recall the grammar for palindromes from Exercise 2. 2.7. for example by introducing operator priorities.

printPal printPal printPal printPal printPal printPal :: Pal → String Empty = "" A = "a" B = "b" (A2 p) = "a" + printPal p + "a" + + (B2 p) = "b" + printPal p + "b" + + a a b a and b a P P a b Note how the function follows the structure of the datatype Pal closely. and calls itself recursively wherever a recursive value of Pal occurs in the datatype.We construct the datatype Pal by interpreting the nonterminal as datatype and the names of the productions as names of the constructors: data Pal = Empty | A | B | A2 P | B2 P Note that once again we keep the nonterminals on the right hand sides as arguments to the constructors. the desired Haskell definitions are pal 1 = A2 (B2 (A2 Empty)) pal 2 = B2 (A2 A) 2. Such a pattern is typical for semantic functions. The strings abaaba and baaab can be derived as follows: P ⇒ aP a ⇒ abP ba ⇒ abaP aba ⇒ abaaba P ⇒ bP b ⇒ baP ba ⇒ baaab The parse trees corresponding to the derivations are the following: P P a b a P P P ε Consequently.22 1. 217 . but omit all the terminal symbols.

this time with names for the productions: MEmpty: M → ε MA: M → aM a MB: M → bM b 1.23 Recall the grammar from Exercise 2. printMir printMir printMir printMir 3.8. mirToPal mirToPal mirToPal mirToPal :: Mir → Pal MEmpty = Empty (MA m) = A2 (mirToPal m) (MB m) = B2 (mirToPal m) :: Mir → String MEmpty = "" (MA m) = "a" + printMir m + "a" + + (MB m) = "b" + printMir m + "b" + + 2.24 Recall the grammar from Exercise 2. we obtain the following datatype: data Mir = MEmpty | MA Mir | MB Mir The concrete mirror palindromes cMir 1 and cMir 2 correspond to the following terms of type Mir : aMir 1 = MA (MB (MA Empty)) aMir 2 = MA (MB (MB Empty)) 2.9. this time with names for the productions: Stop: POne: PZeroL: PZeroR: P P P P →ε → 1P 1 → 0P → P0 218 . Answers to exercises 2. By systematically transforming the grammar. aCountPal aCountPal aCountPal aCountPal aCountPal aCountPal :: Pal → Int Empty = 0 A =1 B =0 (A2 p) = aCountPal p + 2 (B2 p) = aCountPal p 2.B.

data BitList = ConsBit Bit Z | SingleBit Bit data Z = ConsBitList BitList Z | SingleBitList BitList data Bit = Bit 0 | Bit 1 aBitList 1 = ConsBit Bit 0 (ConsBitList (SingleBit Bit 1 ) (SingleBitList (SingleBit Bit 0 ))) aBitList 2 = ConsBit Bit 0 (ConsBitList (SingleBit Bit 0 ) (SingleBitList (SingleBit Bit 1 ))) 2." + printBitList bs + printZ z + + printZ (SingleBitList bs) = ". for instance: aEven 1 = PZeroL (PZeroL (POne (PZeroR Stop))) aEven 2 = PZeroR (PZeroL (POne (PZeroL Stop))) 2.25 A grammar for bit lists that is not left-recursive is the following: L →B Z |B Z →.1. data Parity = Stop | POne Parity | PZeroL Parity | PZeroR Parity aEven 1 = PZeroL (PZeroL (POne (PZeroL Stop))) aEven 2 = PZeroL (PZeroR (POne (PZeroL Stop))) Note that the grammar is ambiguous. printParity printParity printParity printParity printParity :: Parity → String Stop = "" (POne p) = "1" + printParity p + "1" + + (PZeroL p) = "0" + printParity p + (PZeroR p) = printParity p + "0" + 2.L B →0|1 1. printBitList :: BitList → String printBitList (ConsBit b z ) = printBit b + printZ z + printBitList (SingleBit b) = printBit b printZ :: Z → String printZ (ConsBitList bs z ) = ".LZ |. and other representations for cEven 1 and cEven 2 are possible." + printBitList bs + 219 .

e. this language is not context-free. We never have to match on the second bit list: concatBitList :: BitList → BitList → BitList concatBitList (ConsBit b z ) cs = ConsBit b (concatZ z cs) concatBitList (SingleBit b) cs = ConsBit b (SingleBitList cs) concatZ :: Z → BitList → Z concatZ (ConsBitList bs z ) cs = ConsBitList bs (concatZ z xs) concatZ (SingleBitList bs) cs = ConsBitList bs (SingleBitList cs) 2. We get one function per datatype. i. L1 is generated by: S → ZC Z → aZ b | ε C → c∗ and L2 is generated by: S → AZ A → a∗ Z → bZ c | ε 2.B.26 We only give the EBNF notation for the productions that change. We have that L1 ∩ L2 = {a n b n c n | n ∈ N} However.28 1. 3. semantic functions typically still follow the structure of the datatypes closely. and the functions call each other recursively where appropriate – we say they are mutually recursive.27 L(G?) = L(G) ∪ {ε} 2. Digs Int → Dig ∗ → Sign? Nat AlphaNums → AlphaNum ∗ AlphaNum → Letter | Dig 2. We will see in Chapter 9 how to prove such a statement.. Answers to exercises printBit :: Bit → String printBit Bit 0 printBit Bit 1 = "0" = "1" When multiple datatypes are involved. there is no context-free grammar that generates this language. 220 . We can still make the concatenation function structurally recursive in the first of the two bit lists.

for any language L. The sentences baa.30 For example L = {x n | n ∈ N}. Two of them are S ⇒ AA ⇒ bAA ⇒ bAbA ⇒ babA ⇒ babbA ⇒ babbAb ⇒ babbab S ⇒ AA ⇒ AAb ⇒ bAAb ⇒ bAbAb ⇒ bAbbAb ⇒ babbAb ⇒ babbab 3. Several derivations are possible for the string babbab. the language L∗ fulfills the desired property. ε ∈ (L) .2. 2. 2. / 2. other hand. 2. 2.33 1. The string aabbaabba does not appear in this language.34 The grammar is equivalent to the grammar S → aaB B → bB ba B →a This grammar generates the string aaa and the strings aabm a(ba)m for m ∈ N.35 The language L is generated by: S → aS a | bS b | c The derivation is: S ⇒ aS a ⇒ abS ba ⇒ abcba 2. On the / ∗ ∗ ) and (L)∗ cannot be equal. A leftmost derivation is: S ⇒ AA ⇒∗ bm AA ⇒∗ bm Abn A ⇒ bm abn A ⇒∗ bm abn Abp ⇒ bm abn abp 2. and aab can all be derived in four steps. the language L = {aab. Thus (L 2.31 This is only the case when ε ∈ L. m 1. given any language L. baa} also satisfies L = LR .36 The language generated by the grammar is {an bn | n ∈ N} 221 . since ε ∈ L∗ . it holds that L∗ = (L∗ )∗ so.29 No. The shortest derivation is three steps long and yields the sentence aa.32 No. aba. Furthermore. we have ε ∈ (L∗ ). For any language L.

B.38 All three grammars generate the language {an | n ∈ N} 2.40 The language is L = {a2n+1 | n ∈ N} A grammar for L without left-recursive productions is A → aaA | a And a grammar without right-recursive prodcutions is A → Aaa | a 2.39 S →A|ε A → aAb | ab 2.41 The language is 222 . Answers to exercises The same language is also generated by the grammar S → aAb | ε 2.37 The first language is {an | n ∈ N} This language is also generated by the grammar S → aS | ε The second language is {ε} ∪ {a2n+1 | n ∈ N} This language is also generated by the grammar S →A|ε A → a | aAa or using EBNF notation S → (a(aa)∗ )? 2.

44 The language is generated by the grammar: S → (A) | SS A→S |ε A derivation for ( ) ( ( ) ) ( ) is: S ⇒ S S ⇒ S SS ⇒ (A)SS ⇒ ()S S ⇒ ()(A)S ⇒ ()(S )S ⇒ ()((A))S ⇒ ()(())S ⇒ ()(())(A) ⇒ ()(())() 2.42 A grammar that uses only productions with two or less symbols on the right hand side: S T X U Y → T | US → Xa | Ua → aS → S | YT → SU The sentential forms aS and SU have been abstracted to nonterminals X and Y .45 The language is generated by the grammar 223 .{abn | n ∈ N} A grammar for L without left-recursive productions is X → aY Y → bY | ε A grammar for L without left-recursive productions that is also non-contracting is X → aY | a Y → bY | b 2.43 S → 1O O → 1O | 0N N → 1∗ 2. A grammar for the same language with only two nonterminals: S → aS a | U a | US U → S | SU aS a | SUU a The nonterminal T has been substituted for its alternatives aS a and U a. 2.

B. Answers to exercises S → (A) | [A] | SS A→S |ε A derivation for [ ( ) ] ( ) is: S ⇒ S S ⇒ [A]S ⇒ [S ]S ⇒ [(A)]S ⇒ [()]S ⇒ [()](A) ⇒ [()]() 2. as the grammar E →E +T |T T →T *F |F F →0|1 224 . ⊗⇒♣3⊕ ⊗ ⇒ ⊗⇒⊗ ⊗⇒⊗3⊕ ⊗⇒⊕3⊕ ⇒♣3♣ ⊗⇒♣3♣ ⊕⇒♣3♣ ♠ Notice that the grammar of this exercise is the same.48 Here is an unambiguous grammar for the language from Exercise 2. up to renaming.46 First leftmost derivation: Sentence ⇒ Subject Predicate ⇒ they Predicate ⇒ they Verb NounPhrase ⇒ they are NounPhrase ⇒ they are Adjective Noun ⇒ they are flying Noun ⇒ they are flying planes Second leftmost derivation: Sentence ⇒ Subject Predicate ⇒ they Predicate ⇒ they AuxVerb Verb Noun ⇒ they are Verb Noun ⇒ they are flying Noun ⇒ they are flying planes 2.49 Here is a leftmost derivation for ♣ 3 ♣ ♠.45: S → (E )E | [E ] E E →ε|S 2.

But then. for example P ⇒∗ t ⇒ ata in the first situation – the other two cases are analogous. The palindromes a. and b are the palindromes with length 1. a.Char . We have now proved that any palindrome can be generated by the grammar – we still have to prove that anything generated by the grammar is a palindrome.1 Either we use the predefined predicate isUpper in module Data. respectively. and t also is a palindrome. capital = satisfy (λs → (’A’ s) ∧ (s ’Z’)) 3.2 A symbol equal to a satisfies the predicate (= = a): symbol a = satisfy (= = a) 3. Then s can be written as ata or btb or ctc where the length of t is strictly smaller. by induction hypothesis. ys) | (x . bsb. P ⇒ c. and csc. and c are palindromes. Suppose that the palindrome s with a length of 2 or more. but this is easy to see by induction over the length of derivations. Certainly ε. there is also a derivation for s.50 The palindrome ε with length 0 can be generated by de grammar with the derivation P ⇒ ε.5 The type and results of (:) <$> symbol ’a’ are (note that you cannot write this as a definition in Haskell): 225 .3 The function epsilon is a special case of succeed : epsilon :: Parser s () epsilon = succeed () 3. And if s is a palindrome that can be derived. xs)] succeed (f a) xs 3. 3. so are asa.4 Let xs :: [s]. capital = satisfy isUpper or we make use of the ordering defined characters. Thus. there is a derivation P ⇒∗ t.2. ys) ← succeed a xs] [(f a. Then (f <$> succeed a) xs = = = { definition of <$> } { definition of succeed } { definition of succeed } [(f x . They can be derivated with P ⇒ a. P ⇒ b. b. b.

B. palina :: Parser Char Int palina = (λ y → y + 2) <$> symbol ’a’ <∗> palina <∗> symbol ’a’ <|> (λ y → y) <$> symbol ’b’ <∗> palina <∗> symbol ’b’ <|> const 1 <$> symbol ’a’ <|> const 0 <$> symbol ’b’ <|> succeed 0 226 . palin 2 :: Parser Char Pal 2 palin 2 = (\ y → Twoa y) <$> symbol ’a’ <∗> palin 2 <∗> symbol ’a’ <|> (\ y → Twob y) <$> symbol ’b’ <∗> palin 2 <∗> symbol ’b’ <|> const Leafa <$> symbol ’a’ <|> const Leafb <$> symbol ’b’ <|> succeed Nil 3. Answers to exercises ((:) <$> symbol ’a’) :: Parser Char (String → String) ((:) <$> symbol ’a’) [] = [] ((:) <$> symbol ’a’) (x : xs) | x = = ’a’ = [((x :). xs)] | otherwise = [] 3. data Pal 2 = Nil | Leafa | Leafb | Twoa Pal 2 | Twob Pal 2 2. ys) ← p xs] | otherwise = [] 3.7 pBool :: Parser Char Bool pBool = const True <$> token "True" <|> const False <$> token "False" 3.6 The type and results of (:) <$> symbol ’a’ <∗> p are: ((:) <$> symbol ’a’ <∗> p) :: Parser Char String ((:) <$> symbol ’a’ <∗> p) [] = [] ((:) <$> symbol ’a’ <∗> p) (x : xs) | x = = ’a’ = [(’a’ : x . ys) | (x .9 1.

3. 3. zs) |(x .3. +. etcetera. 3.11 As <|> uses + it is more efficiently evaluated if right-associative. data English = E1 Subject Pred data Subject = E2 String data Pred = E3 Verb NounP | E4 AuxV Verb Noun data Verb = E5 String | E6 String data AuxV = E7 String data NounP = E8 Adj Noun data Adj = E9 String data Noun = E10 String 2. it pairs them together: (<.12 The function is the same as <∗>.> q) p <.13 ‘Parser transformator’. zs) ← q ys ] 3.> q) xs = [((x . b) (p <. but instead of applying the result of the first parser to that of the second. english :: Parser Char English english = E1 <$> subject <∗> pred subject = E2 <$> token "they" pred = E3 <$> verb <∗> nounp <|> E4 <$> auxv <∗> verb <∗> noun verb = E5 <$> token "are" <|> E6 <$> token "flying" auxv = E7 <$> token "are" nounp = E8 <$> adj <∗> noun adj = E9 <$> token "flying" noun = E10 <$> token "planes" 3.>) :: Parser s a → Parser s b → Parser s (a.14 The transformator <$> does to the result part of parsers what map does to the elements of a list.10 1. (y.> q = (.> can be defined in terms of each other: p <∗> q = uncurry ($) <$> (p <. ys) ← p xs . or ‘parser modifier’ or ‘parser postprocessor’. y).15 The parser combinators <∗> and <. ) <$> p <∗> q 227 .

3].([1]." 2 a")] 2." a"). Answers to exercises 3.16 Yes. Yes. In order to use the parser chainr we first define a parser plusParser that recognises the character ’+’ and returns the function (+).B." 2 3")] ?> listofdigits "1 2 a" [([1. listofdigits :: Parser Char [Int] listofdigits = listOf newdigit (symbol ’ ’) ?> listofdigits "1 2 3" [([1.2]. snd ) (p s))) 3.([1. You can combine the parser parameter of <$> with a parser that consumes no input and always yields the function parameter of <$>: f <$> p = succeed f <∗> p 3. xs)] | otherwise = [] 228 .17 f :: Char → Parentheses → Char → Parentheses → Parentheses open Parser Char Char :: f <$> open :: Parser Char (Parentheses → Char → Parentheses → Parentheses) parens :: Parser Char Parentheses (f <$> open) <∗> parens :: Parser Char (Char → Parentheses → Parentheses) 3. plusParser :: Num a ⇒ Parser Char (a → a → a) plusParser [] = [] plusParser (x : xs) | x = = ’+’ = [((+).2]." 3").18 To the left.([1]. 3.20 1.2.19 The function has to check whether applying the parser p to input s returns at least one result with an empty rest sequence: test p s = not (null (filter (null ."").

(3.6."+a")."+3")."+2+a")] ?> sumParser "1" [(1."")] Note that the parser also recognises a single integer. definition of succeed } { definition of + } + (listOfa <∗> many (symbol ’a’) <|> succeed []) [] (listOfa <∗> many (symbol ’a’)) [] + succeed [] [] + [] + [([].5 and 3."+2+3")] ?> sumParser "1+2+a" [(3. [])] xs = [’a’]: many (symbol ’a’) [’a’] = = = { definition of many and listOfa } { definition of <|> } { Exercise 3.6.6.The definiton of the parser sumParser is: sumParser :: Parser Char Int sumParser = chainr newdigit plusParser ?> sumParser "1+2+3" [(6. xs = []: many (symbol ’a’) [] = = = = { definition of many and listOfa } { definition of <|> } { Exercise 3.21 We introduce the abbreviation listOfa = (:) <$> symbol ’a’ and use the results of Exercises 3.(1. 3. [])] + [([]. previous calculation } (listOfa <∗> many (symbol ’a’) <|> succeed []) [’a’] (listOfa <∗> many (symbol ’a’)) [’a’] + succeed [] [’a’] + 229 .""). 3.(1. The parser many should be replaced by the parser greedy in de definition of listOf .

thus the ‘greedy’ parsing of all three a’s is presented first. [’a’. ’b’] = = { as before } { Exercise 3. previous calculation } (listOfa <∗> many (symbol ’a’)) [’a’. ’a’. ’b’] = = { as before } { Exercise 3. []). ’a’. then one a. ([]. ’b’] + [([’a’]. ’b’] + succeed [] [’a’. ’b’])] 3. then two a’s with a singleton rest string. ’a’.6. and finally the empty result with the original input as rest string. ’b’])] xs = [’a’. previous calculation } (listOfa <∗> many (symbol ’a’)) [’b’] + succeed [] [’b’] + [([]. ’a’. [’a’. ’b’] + succeed [] [’a’. ’a’]. ’b’]).6. ([]. ’b’] + [([’a’. ’a’. because the <|> combinator uses list concatenation for concatenating lists of successes. Answers to exercises [([’a’]. ’b’]: many (symbol ’a’) [’a’.24 — Combinators for repetition psequence :: [Parser s a] → Parser s [a] psequence [] = succeed [] psequence (p : ps) = (:) <$> p <∗> psequence ps psequence :: [Parser s a] → Parser s [a] psequence = foldr f (succeed []) where f p q = (:) <$> p <∗> q 230 . ([’a’]. 3. [’a’])] xs = [’b’]: many (symbol ’a’) [’b’] = = { as before } { Exercise 3.B. [’b’]). [’b’])] xs = [’a’. This also holds for the recursive calls. ([].22 The empty alternative is presented last. ’b’]: many (symbol ’a’) [’a’.6. [’b’]). previous calculation } (listOfa <∗> many (symbol ’a’)) [’a’. [’a’.

"ab")] ?> (choice [digit.choice :: [Parser s a] → Parser s a choice = foldr (<|>) failp ?> (psequence [digit."")] ?> (psequence [digit. then a variable. satisfy isUpper]) "1A" [("1A".b)": Var "abc" Var "abc" Var "a" :∗: Var "b" :+: Con 1 Var "a" :∗: (Var "b" :+: Con 1) Con (−1) :−: Var "a" Fun "a" [Con 1. map symbol 3. satisfy isUpper]) "1ab" [(’1’. satisfy isUpper]) "Ab" [(’A’. This first solution will however 231 . Var "b"] 2. a variable will first be recognised. The parser fact first tries to parse an integer. satisfy isUpper]) "1Ab" [("1A". satisfy isUpper]) "ab" [] 3. then a function application and finally a parenthesised expression. satisfy isUpper]) "1ab" [] ?> (choice [digit.28 1."b")] ?> (choice [digit. When the parser encounters a function application.27 identifier :: Parser Char String identifier = (:) <$> satisfy isAlpha <∗> greedy (satisfy isAlphaNum) 3."b")] ?> (psequence [digit. A function application is a variable followed by an argument list. As Haskell terms: "abc": "(abc)": "a*b+1": "a*(b+1)": "-1-a": "a(1.25 token :: Eq s ⇒ [s] → Parser s [s] token = psequence .

If we swap the second and the third line in the definition of the parser fact."()")] 3."()")] The parser parenthesised (commaList expr ) that is used in the parser fact does not accept an empty list of arguments because commaList does not."").Var "b"].29 A function with no arguments is not accepted by the parser: ?> expr "f()" [(Var "f".(Var "a".30 expr = chainr (chainl term (const (:−:) <$> symbol ’-’)) (const (:+:) <$> symbol ’+’) 3.31 The datatype Expr is extended as follows to allow raising an expression to the power of an expression: data Expr = Con Int | Var String 232 .b)" [(Fun "a" [Con 1.(Var "f". the parse tree for a function application will be the first solution of the parser: fact :: Parser Char Expr fact = Con <$> integer <|> Fun <$> identifier <∗> parenthesised (commaList expr ) <|> Var <$> identifier <|> parenthesised expr ?> expr "a(1.b)")] 3. To accept an empty list we modify the parser fact as follows: fact :: Parser Char Expr fact = Con <$> integer <|> Fun <$> identifier <∗> parenthesised (commaList expr <|> succeed []) <|> Var <$> identifier <|> parenthesised expr ?> expr "f()" [(Fun "f" []."(1. Answers to exercises not lead to a parse tree for the complete expression because the list of arguments that comes after the variable cannot be parsed."").B.

1) } { (B.| Fun String [Expr ] | Expr :+: Expr | Expr :−: Expr | Expr :∗: Expr | Expr :/: Expr | Expr : ˆ : Expr deriving Show Now the parser expr of Listing 3. and the function concat: map f . b) → (c. g) map f (x + y) = map f x + map f y + + map f . (h ∗∗∗ k ) = (f . (h <$> (f <$> p)) xs = = { (B. multis and powis are treated as left-associative. (: ˆ :))] expr :: Parser Char Expr expr = foldr gen fact [addis.1) } map (h ∗∗∗ id ) ((f <$> p) xs) map (h ∗∗∗ id ) (map (f ∗∗∗ id ) (p xs)) (B. powis] Note that because of the use of chainl all the operators listed in addis. multis.32 The proofs can be given by using laws for list comprehension. map g = map (f . g b) It has the following property: (f ∗∗∗ g) . 3.4) (B. concatenation.1) Furthermore. map (map f ) 1. h) ∗∗∗ (g . b) = (f a. concat = concat . but here we prefer to exploit the following equation (f <$> p) xs = map (f ∗∗∗ id ) (p xs) where (∗∗∗) is defined by (∗∗∗) :: (a → c) → (b → d ) → (a. we will use the following laws about map in our proof: map distributes over composition.8 can be extended with a new level of priorities: powis = [(’^’. k ) (B.2) (B.5) 233 . d ) (f ∗∗∗ g) (a.3) (B.

1) } map ((h ∗∗∗ id ) .) ∗∗∗ id ) (p xs))) concat (map (map (h ∗∗∗ id ) (map (mc q) (p xs)))) concat (map ((map (h ∗∗∗ id )) .1) } { definition of <|> } map (h ∗∗∗ id ) ((p <|> q) xs) map (h ∗∗∗ id ) (p xs + q xs) + map (h ∗∗∗ id ) (p xs) + map (h ∗∗∗ id ) (q xs) + (h <$> p) xs + (h <$> q) xs + ((h <$> p) <|> (h <$> q)) xs 3. mc q) (p xs)) (B. Answers to exercises = = = { (B.3) } { (B.) <$> p) <∗> q) xs = = = = { (B.1) } { definition of <|> } { (B.1) } { (B.4) } { (B. f ) ∗∗∗ id ) (p xs) ((h . ys) = map (f ∗∗∗ id ) (q ys) Now we calculate (((h. f ) <$> p) xs 2.3) } { (B.2) } { (B.7). First note that (p <∗> q) xs can be written as (p <∗> q) xs = concat (map (mc q) (p xs)) where mc q (f .6) } { (B.6) 234 . (h <$> (p <|> q)) xs = = = = = { (B. (f ∗∗∗ id )) (p xs) map ((h . see below } concat (map (mc q) (((h.) <$> p) xs)) concat (map (mc q) (map ((h.B.

f .33 pMir :: Parser Char Mir pMir = (λ m → MB m) <$> symbol ’b’ <∗> pMir <∗> symbol ’b’ <|> (λ m → MA m) <$> symbol ’a’ <∗> pMir <∗> symbol ’a’ <|> succeed MEmpty 3. ys) 3. } { definition of mc q } { map and ∗∗∗ distribute over composition } { definition of mc q } { definition of ∗∗∗ } map (h ∗∗∗ id ) (mc q (f . f ) ∗∗∗ id ) (q y) mc q (h . ((h.) ∗∗∗ id )) (f . ((h.= = = = { (B.) ∗∗∗ id ) = map (h ∗∗∗ id ) .7) 235 . ys) (mc q .’ <∗> pBitList pBit = const Bit 0 <$> symbol ’0’ <|> const Bit 1 <$> symbol ’1’ (B.1) } concat (map (map (h ∗∗∗ id )) (map (mc q) (p xs))) map (h ∗∗∗ id ) (concat (map (mc q) (p xs))) map (h ∗∗∗ id ) ((p <∗> q) xs) (h <$> (p <∗> q)) xs It remains to prove the claim mc q . ys) = = = = = { definition of . mc q) (f . mc q This claim is also proved by calculation: ((map (h ∗∗∗ id )) .34 pBitList :: Parser Char BitList pBitList = SingleB <$> pBit <|> (λb bs → ConsB b bs) <$> pBit <∗> symbol ’. ys)) map (h ∗∗∗ id ) (map (f ∗∗∗ id ) (q ys)) map ((h .3) } { (B.6) } { (B.5) } { (B.

4. Answers to exercises 3.B." [(JAssign "x1" (Var "a" :+: Con 1 :*: Con 2).0) fractpart :: Parser Char Float fractpart = foldr f 0.1 data FloatLiteral = FL1 IntPart FractPart ExponentPart FloatSuffix | FL2 FractPart ExponentPart FloatSuffix | FL3 IntPart ExponentPart FloatSuffix 236 .0 <$> greedy newdigit where f d n = (n + fromIntegral d ) / 10.0 / power (−e) | otherwise = fromIntegral (10e) 3.36 float :: Parser Char Float float = f <$> fixed <∗> (((λ y → y) <$> symbol ’E’ <∗> integer ) ‘option‘ 0) where f m e = m ∗ power e power e | e < 0 = 1.0 3.35 — Parser for floating point numbers fixed :: Parser Char Float fixed = (+) <$> (fromIntegral <$> greedy integer ) <∗> (((λ y → y) <$> symbol ’.37 Parse trees for Java assignments are of type: data JavaAssign = JAssign String Expr deriving Show The parser is defined as follows: assign :: Parser Char JavaAssign assign = JAssign <$> identifier <∗> ((λ y → y) <$> symbol ’=’ <∗> expr <∗> symbol ’.’ <∗> fractpart) ‘option‘ 0.’) ?> assign "x1=(a+1)*2."")] ?> assign "x=a+1" [] Note that the second example is not recognised as an assignment because the string does not end with a semicolon.

ord ’0’ = foldl f 0 <$> many1 digit where f a b = 10*a + b = (\a b c d e -> (fromIntegral a + c) * power d) <$> intPart <*> period <*> optfract <*> optexp <*> optfloat <|> (\a b c d -> b * power c) <$> period <*> fractPart <*> optexp <*> optfloat 237 . digit digits floatLiteral = f <$> satisfy isDigit where f c = ord c . only the parsers return another (semantic) result." = option exponentPart "" = option fractPart "" = option floatSuffix "" intPart fractPart exponentPart signedInteger exponentIndicator sign floatSuffix period optexp optfract optfloat 4.2 The data and type definitions are the same as before.| deriving Show FL4 IntPart ExponentPart FloatSuffix = String = String = String = String = String = String type ExponentPart type ExponentIndicator type SignedInteger type IntPart type FractPart type FloatSuffix digit digits floatLiteral = satisfy isDigit = many 1 digit (λa b c d e → FL1 a c d e) <$> intPart <∗> period <∗> optfract <∗> optexp <∗> optfloat <|> (λa b c d → FL2 b c d ) <$> period <∗> fractPart <∗> optexp <∗> optfloat <|> (λa b c → FL3 a b c) <$> intPart <∗> exponentPart <∗> optfloat <|> (λa b c → FL4 a b c) <$> intPart <∗> optexp <∗> floatSuffix = = signedInteger = digits = (+ <$> exponentIndicator <∗> signedInteger +) = (+ <$> option sign "" <∗> digits +) = token "e" <|> token "E" = token "+" <|> token "-" = token "f" <|> token "F" <|> token "d" <|> token "D" = token ".

..... = signedInteger = f <$> many1 digit where f ds = . f2 a b c d = . = f <$> many1 digit where f ds = .0 <$> many1 digit where f a b = (fromIntegral a + b)/10 exponentPart = (\x y -> y) <$> exponentIndicator <*> signedInteger signedInteger = (\ x y -> x y) <$> option sign id <*> digits exponentIndicator = symbol ’e’ <|> symbol ’E’ sign = const id <$> symbol ’+’ <|> const negate <$> symbol ’-’ floatSuffix = symbol ’f’ <|> symbol ’F’<|> symbol ’d’ <|> symbol ’D’ period = symbol ’.. = f1 <$> intPart <*> period <*> optfract <*> optexp <*> optfloat <|> f2 <$> period <*> fractPart <*> optexp <*> optfloat <|> f3 <$> intPart <*> exponentPart <*> optfloat <|> f4 <$> intPart <*> optexp <*> floatSuffix where f1 a b c d e = ...0 optfloat = option floatSuffix ’ ’ power e | e < 0 = 1 / power (-e) | otherwise = fromIntegral (10^e) 4...B. f3 a b c = ......3 The parsing scheme for Java floats is digit digits floatLiteral = f <$> satisfy isDigit where f c = ... f4 a b c = . Answers to exercises <|> (\a b c -> (fromIntegral a) * power b) <$> intPart <*> exponentPart <*> optfloat <|> (\a b c -> (fromIntegral a) * power b) <$> intPart <*> optexp <*> floatSuffix intPart = signedInteger fractPart = foldr f 0..... intPart fractPart exponentPart 238 .. = f <$> exponentIndicator <*> signedInteger where f x y = ....’ optexp = option exponentPart 0 optfract = option fractPart 0....

.. const 0) signedInteger 239 . f3 c = .’ optexp = option exponentPart ?? optfract = option fractPart ?? optfloat = option floatSuffix ?? 5...1 type LNTreeAlgebra a b x = (a → x ... node) = fold where fold (Leaf a) = leaf a fold (Node l m r ) = node (fold l ) m (fold r ) 5....... f2 h = ... f2 c = ... sign = f1 <$> symbol ’+’ <|> f2 <$> symbol ’-’ where f1 h = ..2 1... period = symbol ’.. f4 c = ...= f <$> option sign ?? <*> digits where f x y = . exponentIndicator = f1 <$> symbol ’e’ <|> f2 <$> symbol ’E’ where f1 c = ... Definition of height by case analysis: height (Leaf x ) = 0 height (Bin lt rt) = 1 + (height lt ‘max ‘ height rt) Definition as a fold: height :: BinTree x → Int height = foldBinTree heightAlgebra heightAlgebra = (λu v → 1 + (u ‘max ‘ v )... x → b → x → x ) foldLNTree :: LNTreeAlgebra a b x → LNTree a b → x foldLNTree (leaf .. floatSuffix = f1 <$> symbol ’f’ <|> f2 <$> symbol ’F’ <|> f3 <$> symbol ’d’ <|> f4 <$> symbol ’D’ where f1 c = .. f2 c = .....

id ) 4.B. Leaf . Definition of mapBinTree by case analysis: mapBinTree f (Leaf x ) = Leaf (f x ) mapBinTree f (Bin lt rt) = Bin (mapBinTree f lt) (mapBinTree f rt) Definition as a fold: mapBinTree :: (a → b) → BinTree a → BinTree b mapBinTree f = foldBinTree (Bin. f ) 5. Definition of sp by case analysis: sp (Leaf x ) = 0 sp (Bin lt rt) = 1 + (sp lt) ‘min‘ (sp rt) Definition as a fold: sp :: BinTree x → Int sp = foldBinTree spAlgebra spAlgebra = (λu v → 1 + u ‘min‘ v . const 0) 5.3 Using explicit recursion: allPaths (Leaf x ) = [[]] allPaths (Bin lt rt) = map (L:) (allPaths lt) + map (R:) (allPaths rt) + 240 . 3. Answers to exercises 2. Definition of flatten by case analysis: flatten (Leaf x ) = [x ] flatten (Bin lt rt) = flatten lt + flatten rt + Definition as a fold: flatten :: BinTree x → [x ] flatten = foldBinTree ((+ λx → [x ]) +). Definition of maxBinTree by case analysis: maxBinTree (Leaf x ) = x maxBinTree (Bin lt rt) = maxBinTree lt ‘max ‘ maxBinTree rt Definition as a fold: maxBinTree :: Ord x ⇒ BinTree x → x maxBinTree = foldBinTree (max .

data Resist = Resist :|: Resist | Resist :∗: Resist | BasicR Float deriving Show type ResistAlgebra a = (a → a → a.(++) . Float → a) foldResist :: ResistAlgebra a → Resist → a foldResist (par .5 1.4 1.const True . (+).As a fold: allPaths :: BinTree a → [[Direction]] allPaths = foldBinTree psAlgebra psAlgebra :: BinTreeAlgebra a [[Direction]] psAlgebra = (λu v → map (L:) u + map (R:) v .\x y -> False . result :: Resist → Float result = foldResist resultAlgebra resultAlgebra :: ResistAlgebra Float resultAlgebra = (λu v → (u ∗ v ) / (u + v ). a → a → a. isSum = foldExpr ((&&) .\x y -> False .const True . 241 . seq. const [[]]) + 5. basic) = fold where fold (r1 :|: r2 ) = par (fold r1 ) (fold r2 ) fold (r1 :∗: r2 ) = seq (fold r1 ) (fold r2 ) fold (BasicR f ) = basic f 2.\x y -> False .\x -> (&&) ) vars = foldExpr ((++) .(++) 2. id ) 5.

but also e1 and e2 themselves. idf ) = fold where fold (e1 ‘Plus‘ e2 ) = plus (fold e1 ) (fold e2 ) fold (e1 ‘Sub‘ e2 ) = sub (fold e1 ) (fold e2 ) fold (Con n) = con n fold (Idf s) = idf s 4. con. 3. sub.const [] .\x -> [x] . 2. Answers to exercises . The function der is not a compositional function on Expr because the righthand sides of the ‘Mul ‘ and ‘Dvd ‘ expressions do not only use der e1 dx and der e2 dx .(++) . data Exp = Exp ‘Plus‘ Exp | Exp ‘Sub‘ Exp | Con Float | Idf String deriving Show type ExpAlgebra a = (a → a → a .B. Using explicit resursion: der der der der der :: Exp → String → Exp (e1 ‘Plus‘ e2 ) dx = der e1 dx ‘Plus‘ der e2 dx (e1 ‘Sub‘ e2 ) dx = der e1 dx ‘Sub‘ der e2 dx (Con f ) dx = Con 0 (Idf s) dx = if s = = dx then Con 1 else Con 0 Using a fold: der = foldExp derAlgebra derAlgebra :: ExpAlgebra (String → Exp) 242 . der computes a symbolic differentiation of an expression.a → a → a . Float → a .\x y z -> x : (y ++ z) ) 5.6 1. String → a ) foldExp :: ExpAlgebra a → Exp → a foldExp (plus.

p. λx y → Leaf y) 5.7 Using explicit recursion: replace (Leaf x ) y = Leaf y replace (Bin lt rt) y = Bin (replace lt y) (replace rt y) As a fold: replace :: BinTree a → a → BinTree a replace = foldBinTree repAlgebra repAlgebra = (λf g → λy → Bin (f y) (g y). λs → λt → if s = = t then Con 1 else Con 0 ) 5.derAlgebra = (λf g → λs → f s ‘Plus‘ g s . p → p. p.9 1. λx → λbs → if null bs then x else error "no roothpath") 5. p → p) 243 .8 Using explicit recursion: path2Value (Leaf x ) = λbs → if null bs then x else error "no roothpath" path2Value (Bin lt rt) = λbs → case bs of [] → error "no roothpath" (L : rs) → path2Value lt rs (R : rs) → path2Value rt rs Using a fold: path2Value :: BinTree a → [Direction] → a path2Value = foldBinTree pvAlgebra pvAlgebra :: BinTreeAlgebra a ([Direction] → a) pvAlgebra = (λfl fr → λbs → case bs of [] → error "no roothpath" (L : rs) → fl rs (R : rs) → fr rs . λn → λs → Con 0 . type PalAlgebra p = (p. λf g → λs → f s ‘Sub‘ g s .

The parser pfoldPal m2 returns the number of a’s occurring in a palindrome. λp → p + 2. foldPal :: PalAlgebra p → Pal → p foldPal (pal 1 . pfoldPal :: PalAlgebra p → Parser Char p pfoldPal (pal 1 . The parser pfoldPal m1 returns the concrete representation of a palindrome.B. pal 5 ) = fPal where fPal Pal 1 = pal 1 fPal Pal 2 = pal 2 fPal Pal 3 = pal 3 fPal (Pal 4 p) = pal 4 (fPal p) fPal (Pal 5 p) = pal 5 (fPal p) 3. pal 2 . pal 3 . a2cPal = foldPal ("" .Pal4. 1. pal 4 . "a" . 5. pal 4 . λp → p) 4.Pal5) 3. type MirAlgebra m = (m.mir2.m->m) foldMir :: MirAlgebra m -> Mir -> m foldMir (mir1. λp → "b" + p + "b" + + ) aCountPal = foldPal (0. Answers to exercises 2. λp → "a" + p + "a" + + .m->m.\m->"a"++m++"a". 0.10 1.mir3) = fMir where fMir Mir1 = mir1 fMir (Mir2 m) = mir2 (fMir m) fMir (Mir3 m) = mir3 (fMir m) a2cMir = foldMir ("". pal 3 . "b" .\m->"b"++m++"b") m2pMir = foldMir (Pal1. pal 5 ) = pPal where pPal = const pal 1 <$> ε <|> const pal 2 <$> syma <|> const pal 3 <$> symb <|> (\ p → pal 4 p) <$> syma <∗> pPal <∗> syma <|> (\ p → pal 5 p) <$> symb <∗> pPal <∗> symb syma = symbol ’a’ symb = symbol ’b’ 5. pal 2 . 244 . 2.

p->p.\x->x++"0" ) 3.11 1. The parser pfoldMir m1 returns the concrete representation of a palindrome.12 1.singleb).(b.(consbl.((++). 245 .parity1. 4.("0". mir3) = pMir where pMir = const mir1 <$> epsilon <|> (\_ m _ -> mir2 m) <$> syma <*> pMir <*> syma <|> (\_ m _ -> mir3 m) <$> symb <*> pMir <*> symb syma = symbol ’a’ symb = symbol ’b’ 5. 2.id) . type BitListAlgebra bl z b = ((b->z->bl.(consbl.singlebl). 5.\x->"0"++x .4.p->p.\x->"1"++x++"1" .b)) foldBitList :: BitListAlgebra bl z b -> BitList -> bl foldBitList ((consb.b->bl).parityL0.(bit0.id) . 5.bl->z).singleb). The parser pfoldMir m2 returns the abstract representation of a palindrome.p->p) foldParity :: ParityAlgebra p -> Parity -> p foldParity (empty.(bit0.bit1)) = fBitList where fBitList (ConsB b z) = consb (fBit b) (fZ z) fBitList (SingleB b) = singleb (fBit b) fZ (ConsBL bl z) = consbl (fBitList bl) (fZ z) fZ (SingleBL bl) = singlebl (fBitList bl) fBit Bit0 = bit0 fBit Bit1 = bit1 a2cBitList = foldBitList (((++).(bl->z->z.bit1)) = pBitList where pBitList = consb <$> pBit <*> pZ <|> singleb <$> pBit 3. type ParityAlgebra p = (p. pfoldMir :: MirAlgebra m -> Parser Char m pfoldMir (mir1.singlebl)."1") ) pfoldBitList :: BitListAlgebra bl z b -> Parser Char bl pfoldBitList ((consb. mir2. 2.parityR0) = fParity where fParity Empty = empty fParity (Parity1 p) = parity1 (fParity p) fParity (ParityL0 p) = parityL0 (fParity p) fParity (ParityR0 p) = parityR0 (fParity p) a2cParity = foldParity ("" .

d ) . uy).13 1. (s → r → r . (r1 . foldBlock :: BlockAlgebra b r s d u n → Block → b foldBlock (b1 . Answers to exercises pZ pBit 5. s3 ).b → n ) 3. (ux . n1 ) = foldB where foldB (B1 stat rest) = b1 (foldS stat) (foldR rest) foldR (R1 stat rest) = r1 (foldS stat) (foldR rest) 246 .’ <*> pBitList <*> pZ <|> singlebl <$> pBitList = const bit0 <$> symbol ’0’ <|> const bit1 <$> symbol ’1’ = = B1 Stat Rest = R1 Stat Rest | Nix = S1 Decl | S2 Use | S3 Nest = Dx | Dy = UX | UY = N1 Block The abstract representation of x.Y).X is B1 (S1 Dx ) (R1 stat 2 rest 2 ) where stat 2 = S3 (N1 block ) block = B1 (S1 Dy) (R1 (S2 UY ) Nix ) rest 2 = R1 (S2 UX ) Nix 2. (dx . s2 . r ) . u → s. (s1 . (d . dy). n → s) .B. u) . (d → s. data Block data Rest data Stat data Decl data Use data Nest (\_ bl z -> consbl bl z) <$> symbol ’.(y. (u. nix ). type BlockAlgebra b r s d u n = (s → r → b .

foldR Nix foldS (S1 decl ) foldS (S2 use) foldS (S3 nest) foldD Dx foldD Dy foldU UX foldU UY foldN (N1 block ) 4.

= nix = s1 (foldD decl ) = s2 (foldU use) = s3 (foldN nest) = dx = dy = ux = uy = n1 (foldB block )

a2cBlock :: Block → String a2cBlock = foldBlock a2cBlockAlgebra a2cBlockAlgebra :: BlockAlgebra String String String String String String a2cBlockAlgebra = (b1 , (r1 , nix ), (s1 , s2 , s3 ), (dx , dy), (ux , uy), n1 ) where b1 u v = u + v + r1 u v = ";" + u + v + + nix = "" s1 u = u s2 u = u s3 u = u dx = "x" dy = "y" ux = "X" uy = "Y" n1 u = "(" + u + ")" + + 7.1 The type TreeAlgebra and the function foldTree are defined in the rep-min problem. 1. height = foldTree heightAlgebra heightAlgebra = (const 0, \l r -> (l ‘max‘ r) + 1) --frontAtLevel :: Tree -> Int -> [Int] --frontAtLevel (Leaf i) h = if h == 0 then [i] else [] --frontAtLevel (Bin l r) h = -- if h > 0 then frontAtLevel l (h-1) ++ frontAtLevel r (h-1) -else [] frontAtLevel = foldTree frontAtLevelAlgebra frontAtLevelAlgebra = ( \i h -> if h == 0 then [i] else []



, \f g h ) 2.

-> if h > 0 then f (h-1) ++ g (h-1) else []

a) Straightforward Solution: impossible to give a solution like rm.sol1. heightAlgebra = (const 0, \l r -> (l ‘max‘ r) + 1) front t = foldTree frontAtLevelAlgebra t h where h = foldTree heightAlgebra t b) Lambda Lifting front’ t = foldTree frontAtLevelAlgebra t (foldTree heightAlgebra t) c) Tupling Computations htupfr :: TreeAlgebra (Int, Int -> [Int]) htupfr -= (heightAlgebra ‘tuple‘ frontAtLevelAlgebra) = ( \i -> ( 0 , \h -> if h == 0 then [i] else [] ) , \(lh,f) (rh,g) -> ( (lh ‘max‘ rh) + 1 , \h -> if h > 0 then f (h-1) ++ g (h-1) else [] ) ) front’’ t = fr h where (h, fr) = foldTree htupfr t d) Merging Tupled Functions It is helpful to note that the merged algebra is constructed such that foldTree mergedAlgebra t i = (height t, frontAtLevel t i) Therefore, the definitions corresponding to the fourth solution are mergedAlgebra :: TreeAlgebra (Int -> (Int, [Int])) mergedAlgebra = (\i -> \h -> ( 0 , if h == 0 then [i] else [] ) , \lfun rfun -> \h -> let (lh, xs) = lfun (h-1) (rh, ys) = rfun (h-1) in ( (lh ‘max‘ rh) + 1


, if h > 0 then xs ++ ys else [] ) ) front’’’ t = fr where (h, fr) = foldTree mergedAlgebra t h 7.2 The highest front problem is the problem of finding the first non-empty list of nodes which are at the lowest level. The solutions are similar to these of the deepest front problem. 1. --lowest :: Tree -> Int --lowest (Leaf i) = 0 --lowest (Bin l r) = ((lowest l) ‘min‘ (lowest r)) + 1 lowest = foldTree lowAlgebra lowAlgebra = (const 0, \ l r -> (l ‘min‘ r) + 1) 2. a) Straightforward Solution: impossible to give a solution like rm.sol1. lfront t = foldTree frontAtLevelAlgebra t l where l = foldTree lowAlgebra t b) Lambda Lifting lfront’ t = foldTree frontAtLevelAlgebra t (foldTree lowAlgebra t) c) Tupling Computations ltupfr :: TreeAlgebra (Int, Int -> [Int]) ltupfr = ( \i -> ( 0 , (\l -> if l == 0 then [i] else []) ) , \ (lh,f) (rh,g) -> ( (lh ‘min‘ rh) + 1 , \l -> if l > 0 then f (l-1) ++ g (l-1) else [] ) ) lfront’’ t = fr l where (l, fr) = foldTree ltupfr t d) Merging Tupled Functions It is helpful to note that the merged algebra is constructed such that foldTree mergedAlgebra t i = (lowest t, frontAtLevel t i)



Therefore, the definitions corresponding to the fourth solution are lmergedAlgebra :: TreeAlgebra (Int -> (Int, [Int])) lmergedAlgebra = ( \i -> \l -> ( 0 , if l == 0 then [i] else [] ) , \lfun rfun -> \l -> let (ll, lres) = lfun (l-1) (rl, rres) = rfun (l-1) in ( (ll ‘min‘ rl) + 1 , if l > 0 then lres ++ rres else [] ) ) lfront’’’ t = fr where (l, fr) = foldTree lmergedAlgebra t l 8.1 Let G = (T, N, R, S). Define the regular grammar G = (T, N ∪ {S }, R , S ) as follows. S is a new nonterminal symbol. For the productions R , divide the productions of R in two sets: the productions R1 of which the right-hand side consists just of terminal symbols, and the productions R2 of which the right-hand side ends with a nonterminal. Define a set R3 of productions by adding the nonterminal S at the right end of each production in R1. Define R = R ∪ R3 ∪ S → S | . The grammar G thus defined is regular, and generates the language L∗ . 8.2 In step 2, add the productions S → aS S→ S → cC S→a A→ A → cC A→a B → cC B→a and remove the productions S→A A→B B→C In step 3, add the production S → a, and remove the productions A → , B → .


8.3 ndfsa d qs [ ] = qs ndfsa d qs (x : xs) = ndfsa d (d qs x) xs
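Note (an observation that is not part of the original answer): this definition is exactly a left fold over the input word, so it can equivalently be written as

ndfsa d = foldl d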

8.4 Let M = (X, Q, d, S, F ) be a deterministic finite-state automaton. Then the nondeterministic finite-state automaton M defined by M = (X, Q, d , {S}, F ), where d q x = {d q x} accepts the same language. 8.5 Let M = (X, Q, d, S, F ) be a deterministic finite-state automaton that accepts language L. Then the deterministic finite-state automaton M = (X, Q, d, S, Q − F ), where Q − F is the set of states Q from which all states in F have been removed, accepts the language L. Here we assume that d q a is defined for all q and a. 8.6 1. L1 ∩ L2 = (L1 ∪ L2 ), so it follows that regular languages are closed under intersection if they are closed under complementation and union. Regular languages are closed under union, see Theorem 8.10, and they are closed under complementation, see Exercise 8.5. 2. = {w ∈ X ∗ | dfa accept w (d, (S1 , S2 ), (F1 × F2 ))} = {w ∈ X ∗ | dfa d (S1 , S2 ) w ∈ F1 × F2 } = = {w ∈ X ∗ | dfa d1 S1 w ∈ F1 ∧ dfa d2 S2 w ∈ F2 } = {w ∈ X ∗ | dfa d1 S1 w ∈ F1 } ∩ {w ∈ X ∗ | dfa d2 S2 w ∈ F2 } = Ldfa M1 ∩ Ldfa M2 8.7 1.
(The transition diagram of the automaton, with states S, A and B, is not reproduced here.)

Ldfa M

This requires a proof by induction {w ∈ X ∗ | (dfa d1 S1 w, dfa d2 S2 w) ∈ F1 × F2 }




So when a 0 appears somewhere. { . 8. ()*+ B 0 Since S = ∅. = (Lre(R))(Lre(S + T )) = (Lre(R))(Lre(S) ∪ Lre(T )) = (Lre(R))(Lre(S)) ∪ (Lre(R))(Lre(T )) = Lre(RS + RT ) 2.-.-. / ?>=< A  89:. S = a. ()*+ / ?>=< C 8. Similar to the above calculation.12 The string 01 may not occur in a string in the language of the regular expression. GF ?>=< 89:.8 Use the definition of Lre to calculate the languages: 1. S 1 0 0 ED 89:. {a}b∗ ∪ c∗ 8. S = b.11 If both V and W are subsets of S. ()*+ / ?>=< D 1 0 89:.-. (bc∗ ) 3. Another solution is V W = S∩R = S∩R Lre(R(S + T ))  ?>=< 89:. 8. /. /. then Lre(R(S + V )) = Lre(R(S + W )). Since S = ∅. b} 2. only 0’s can follow. and it follows that V = W . /. There exist other solutions than these. 8. Take 1∗0∗. V = S and W = ∅ satisfy the requirement. and R = a.10 Take R = a. Answers to exercises 2. 8. at least one of S ∩ R and S ∩ R is not empty.9 1.13 252 .B.

(b + a(bb) ∗ b)∗ 9.1. 2 2 Take s = an with x = . Let u. The language of the regular expression a∗ is generated by the regular grammar S0 → | aS0 The language of the regular expression b∗ is generated by the regular grammar S1 → | bS1 The language of the regular expression ab is generated by the regular grammar S2 → ab It follows that the language of the regular expression a∗ + b∗ + ab is generated by the regular grammar S3 → S0 | S1 | S2 8. (0 + 10∗1)∗ 8.1 to obtain a regular grammar for the complete regular expression. w be such that y = uvw with v = . y = an . Take i = 2. and use the automata to construct the following regular expressions. that is. a(b + b(b + ab)∗(ab + )) + bb∗ + 2. b ∗ +b ∗ (aa ∗ b(a + b)∗) 2.1 and a regular grammar for a + bb to obtain a regular grammar for (a + bb)∗. and z = an −n . v = aq and w = ar with p + q + r = n and q > 0. v. we can use the procedure constructed in Exercise 8.1 We follow the proof pattern for non-regular languages given in the text. Again. 1.14 First construct a nondeterministic finite-state automaton for these grammars. The language of the regular expression a + bb is generated by the regular grammar S2 → a | bb 2. u = ap . then 253 . Let n ∈ N.16 1. If we can give a regular grammar for (a + bb)∗ + c. The following regular grammar generates (a + bb)∗ + c: S0 → S1 | c provided S1 generates (a + bb)∗. we use Exercise 8.

calculus an b2n+q ∈L 254 . u = bp . v = bq and w = br with p + q + r = n and q > 0. Let n ∈ N.B. that is. u. Take s = an bn+1 with x = an . z. w be such that y = uvw with v = . v = bq and w = br with p + q + r = n and q > 0. v. u. then xuwz ∈ L ⇐ ⇐ p+q+r ⇐ q>0 p+r+1 defn. and z = bn . w be such that y = uvw with v = . v. w. Take s = an b2n with x = an . w. Answers to exercises xuv 2 wz ∈ L ⇐ ⇐ ⇐ n2 < n2 + q < (n + 1)2 ⇐ q < 2n + 1 defn. Let u. x. w. that is. u. x. y = bn . y = bn . v. z. Take i = 2.2 Using the proof pattern. v. calculus an bp+r+1 ∈L 9. x. u = bp .3 Using the proof pattern. Let n ∈ N. and z = b. Let u. z. then xuv 2 wz ∈ L ⇐ ⇐ 2n + q > 2n ⇐ q>0 defn. calculus 2 ap+2q+r an −n ∈L p + q + r = n and q > 0 n2 + q is not a square 9. v. Take i = 0.

y. |vx| > 0 and |vwx | d That is u = ap . w be such that y = uvw with v = . u. v = aq . v = aq and w = ar with p + q + r = n and q > 0. and c’s is at least r + 2. Take s = a6 bn+4 an with x = a6 bn+4 . v. u. and z = . 255 . x. d defn. b’s.9. calculus a6 bn+4 an+5q ∈ L 9. v. u = ap . Let c.5 The length of a substring with a’s. w.4 Using the proof pattern. d ∈ N. x. w. and |vwx| r. calculus 2 ak +q+s ∈L q+s>0 k 2 + q + s is not a square 9. k q+s d defn. Take i = 6. w. x = as and y = at with p + q + r + s + t = k 2 . then xuv 6 wz ∈ L ⇐ ⇐ n + 5q > n + 4 ⇐ q>0 9. Let u. d ∈ N. Let n ∈ N. v. Let c. z. 2 Take z = ak with k = max(c. Let u. x. q + r + s d and q + s > 0. w = ar . d). then uv 2 wx 2 y ∈ L ⇐ ⇐ ⇐ k 2 < k 2 + q + s < (k + 1)2 ⇐ q + s < 2k + 1 ⇐ defn.7 Using the proof pattern. y = an . y be such that z = uvwxy. that is.6 We follow the proof pattern for proving that a language is not context-free given in the text. Take i = 2. v.

Again this sentence is not an element of the language. Then the string uwy can be written as uwy = as bt ap bq defn. or just b’s. w. Let u.t. Let u. v. w = ar . Take i = 2. y be such that z = uvwxy. y be such that z = uvwxy. v. Answers to exercises Take z = ak with k is prime and k > max(c. x. Let u. p. that is. w. then it is impossible to write the string uwy as ww for some string w. d). v = bq and w = br with p + q + r = n and q > 0. w be such that y = uvw with v = . Take i = 0. u. Take i = k + 1. calculus ak+kq+ks ∈L for some s. Let c. x. w. • vwx contains both a’s and b’s. |vx| > 0 and |vwx | d That is u = ap . y. Take s = 1n 0n with x = 1n . x = as and y = at with p + q + r + s + t = k. and z = . At least one of s . then • If vwx consists of just a’s. it lies somewhere on the border between a’s and b’s. u = bp . d ∈ N. q + r + s d and q + s > 0. 9. Take z = ak bk ak bk with k = max(c. t.8 Using the proof pattern. then 256 . or on the border between b’s and a’s. d). Let n ∈ N. p and q is less than k. or just b’s. y = 0n . q. v. respectively. v = aq . since only the number of terminals of one kind is decreased. v. while two of them are equal to k. |vx| > 0 and |vwx | d Note that our choice for k guarantees that substring vwx has one of the following shapes: • vwx consists of just a’s. x.B. • If vwx contains both a’s and b’s.9 Using the proof pattern. then uv k+1 wx k+1 y ∈ L ⇐ ⇐ k(1 + q + s) is not a prime ⇐ q+s>0 9.

The language {ai bj | 0 i j} is context-free. v. We prove this using the proof pattern. x. w be such that y = uvw with v = . z. 257 . The language is not regular. v = aq and w = ar with p + q + r = n and q > 0. w. calculus 1n 0p+2q+r ∈L p+q+r =n 1n 0n+q ∈ L q>0 true 9. that is. w.xuv 2 w ∈ L ⇐ ⇐ ⇐ defn. u = ap . x. Take s = an bn+1 with x = . The language can be generated by the following context-free grammar: S → aSbA S → A → bA A → This grammar generates the strings am bm bn for m. Take i = 3. We prove this using the proof pattern. calculus ap+3q+r bn+1 ∈ L p+q+r =n n + 2q > n + 1 q>0 true 9. n ∈ N. The language {wcw | w ∈ {a. y = an . 2. z. Let u. v. and z = bn+1 . Let n ∈ N. Let c.11 1. These are exactly the strings of the given language. then xuv 3 wz ∈ L ⇐ ⇐ ⇐ defn. u.10 1. v. u. d ∈ N. b}∗ } is not context-free.

The string uwy does not contain c so it is not an element of the language. The grammar is not right-regular but left-regular because the nonterminal appears at the left hand side in the last production rule. then • If vwx does n