Programming in Standard ML
(DRAFT: VERSION 1.2 OF 11.02.11.)
Robert Harper
Carnegie Mellon University
Spring Semester, 2011
Copyright © 2011 by Robert Harper.
All Rights Reserved.
This work is licensed under the Creative Commons
Attribution-Noncommercial-No Derivative Works 3.0 United States
License. To view a copy of this license, visit
http://creativecommons.org/licenses/by-nc-nd/3.0/us/, or send a
letter to Creative Commons, 171 Second Street, Suite 300, San Francisco,
California, 94105, USA.
Preface
This book is an introduction to programming with the Standard ML
programming language. It began life as a set of lecture notes for Computer
Science 15–212: Principles of Programming, the second semester of the
introductory sequence in the undergraduate computer science curriculum at
Carnegie Mellon University. It has subsequently been used in many other
courses at Carnegie Mellon, and at a number of universities around the
world. It is intended to supersede my Introduction to Standard ML, which
has been widely circulated over the last ten years.
Standard ML is a formally defined programming language. The
Definition of Standard ML (Revised) is the formal definition of the
language. It is supplemented by the Standard ML Basis Library, which
defines a common basis of types that are shared by all implementations of
the language. Commentary on Standard ML discusses some of the decisions
that went into the design of the first version of the language.
There are several implementations of Standard ML available for a wide
variety of hardware and software platforms. The best-known compilers
are Standard ML of New Jersey, MLton, Moscow ML, MLKit, and Poly/ML.
These are all freely available on the worldwide web. Please refer to The
Standard ML Home Page for up-to-date information on Standard ML and
its implementations.
Numerous people have contributed directly and indirectly to this text.
I am especially grateful to the following people for their helpful
comments and suggestions: Brian Adkins, Nels Beckman, Marc Bezem, James
Bostock, Terrence Brannon, Franck van Breugel, Chris Capel, Matthew
William Cox, Karl Crary, Yaakov Eisenberg, Matt Elder, Mike Erdmann,
Matthias Felleisen, Andrei Formiga, Stephen Harris, Nils Jähnig, Joel Jones,
David Koppstein, John Lafferty, Johannes Laire, Flavio Lerda, Daniel R.
Licata, Adrian Moos, Bryce Nichols, Michael Norrish, Arthur J. O’Dwyer,
Frank Pfenning, Chris Stone, Dave Swasey, Michael Velten, Johan Wallen,
Scott Williams, and Jeannette Wing. Richard C. Cobbe helped with font
selection. I am also grateful to the many students of 15-212 who used these
notes and sent in their suggestions over the years.
These notes are a work in progress. Corrections, comments and
suggestions are most welcome.
Contents
Preface
I Overview
1 Programming in Standard ML
1.1 A Regular Expression Package
1.2 Sample Code
II The Core Language
2 Types, Values, and Effects
2.1 Evaluation and Execution
2.2 The ML Computation Model
2.2.1 Type Checking
2.2.2 Evaluation
2.3 Types, Types, Types
2.4 Type Errors
2.5 Sample Code
3 Declarations
3.1 Variables
3.2 Basic Bindings
3.2.1 Type Bindings
3.2.2 Value Bindings
3.3 Compound Declarations
3.4 Limiting Scope
3.5 Typing and Evaluation
3.6 Sample Code
4 Functions
4.1 Functions as Templates
4.2 Functions and Application
4.3 Binding and Scope, Revisited
4.4 Sample Code
5 Products and Records
5.1 Product Types
5.1.1 Tuples
5.1.2 Tuple Patterns
5.2 Record Types
5.3 Multiple Arguments and Multiple Results
5.4 Sample Code
6 Case Analysis
6.1 Homogeneous and Heterogeneous Types
6.2 Clausal Function Expressions
6.3 Booleans and Conditionals, Revisited
6.4 Exhaustiveness and Redundancy
6.5 Sample Code
7 Recursive Functions
7.1 Self-Reference and Recursion
7.2 Iteration
7.3 Inductive Reasoning
7.4 Mutual Recursion
7.5 Sample Code
8 Type Inference and Polymorphism
8.1 Type Inference
8.2 Polymorphic Definitions
8.3 Overloading
8.4 Sample Code
REVISED 11.02.11 DRAFT VERSION 1.2
9 Programming with Lists
9.1 List Primitives
9.2 Computing With Lists
9.3 Sample Code
10 Concrete Data Types
10.1 Datatype Declarations
10.2 Non-Recursive Datatypes
10.3 Recursive Datatypes
10.4 Heterogeneous Data Structures
10.5 Abstract Syntax
10.6 Sample Code
11 Higher-Order Functions
11.1 Functions as Values
11.2 Binding and Scope
11.3 Returning Functions
11.4 Patterns of Control
11.5 Staging
11.6 Sample Code
12 Exceptions
12.1 Exceptions as Errors
12.1.1 Primitive Exceptions
12.1.2 User-Defined Exceptions
12.2 Exception Handlers
12.3 Value-Carrying Exceptions
12.4 Sample Code
13 Mutable Storage
13.1 Reference Cells
13.2 Reference Patterns
13.3 Identity
13.4 Aliasing
13.5 Programming Well With References
13.5.1 Private Storage
13.5.2 Mutable Data Structures
13.6 Mutable Arrays
13.7 Sample Code
14 Input/Output
14.1 Textual Input/Output
14.2 Sample Code
15 Lazy Data Structures
15.1 Lazy Data Types
15.2 Lazy Function Definitions
15.3 Programming with Streams
15.4 Sample Code
16 Equality and Equality Types
16.1 Sample Code
17 Concurrency
17.1 Sample Code
III The Module Language
18 Signatures and Structures
18.1 Signatures
18.1.1 Basic Signatures
18.1.2 Signature Inheritance
18.2 Structures
18.2.1 Basic Structures
18.2.2 Long and Short Identifiers
18.3 Sample Code
19 Signature Matching
19.1 Principal Signatures
19.2 Matching
19.3 Satisfaction
19.4 Sample Code
20 Signature Ascription
20.1 Ascribed Structure Bindings
20.2 Opaque Ascription
20.3 Transparent Ascription
20.4 Sample Code
21 Module Hierarchies
21.1 Substructures
21.2 Sample Code
22 Sharing Specifications
22.1 Combining Abstractions
22.2 Sample Code
23 Parameterization
23.1 Functor Bindings and Applications
23.2 Functors and Sharing Specifications
23.3 Avoiding Sharing Specifications
23.4 Sample Code
IV Programming Techniques
24 Specifications and Correctness
24.1 Specifications
24.2 Correctness Proofs
24.3 Enforcement and Compliance
25 Induction and Recursion
25.1 Exponentiation
25.2 The GCD Algorithm
25.3 Sample Code
26 Structural Induction
26.1 Natural Numbers
26.2 Lists
26.3 Trees
26.4 Generalizations and Limitations
26.5 Abstracting Induction
26.6 Sample Code
27 Proof-Directed Debugging
27.1 Regular Expressions and Languages
27.2 Specifying the Matcher
27.3 Sample Code
28 Persistent and Ephemeral Data Structures
28.1 Persistent Queues
28.2 Amortized Analysis
28.3 Sample Code
29 Options, Exceptions, and Continuations
29.1 The n-Queens Problem
29.2 Solution Using Options
29.3 Solution Using Exceptions
29.4 Solution Using Continuations
29.5 Sample Code
30 Higher-Order Functions
30.1 Infinite Sequences
30.2 Circuit Simulation
30.3 Regular Expression Matching, Revisited
30.4 Sample Code
31 Memoization
31.1 Cacheing Results
31.2 Laziness
31.3 Lazy Data Types in SML/NJ
31.4 Recursive Suspensions
31.5 Sample Code
32 Data Abstraction
32.1 Dictionaries
32.2 Binary Search Trees
32.3 Balanced Binary Search Trees
32.4 Abstraction vs. Run-Time Checking
32.5 Sample Code
33 Representation Independence and ADT Correctness
33.1 Sample Code
34 Modularity and Reuse
34.1 Sample Code
35 Dynamic Typing and Dynamic Dispatch
35.1 Sample Code
36 Concurrency
36.1 Sample Code
V Appendices
A The Standard ML Basis Library
B Compilation Management
B.1 Overview of CM
B.2 Building Systems with CM
B.3 Sample Code
C Sample Programs
Part I
Overview
Standard ML is a type-safe programming language that embodies many
innovative ideas in programming language design. It is a statically typed
language, with an extensible type system. It supports polymorphic type
inference, which all but eliminates the burden of specifying types of
variables and greatly facilitates code reuse. It provides efficient automatic
storage management for data structures and functions. It encourages
functional (effect-free) programming where appropriate, but allows
imperative (effectful) programming where necessary. It facilitates programming
with recursive and symbolic data structures by supporting the definition
of functions by pattern matching. It features an extensible exception
mechanism for handling error conditions and effecting non-local transfers of
control. It provides a richly expressive and flexible module system for
structuring large programs, including mechanisms for enforcing
abstraction, imposing hierarchical structure, and building generic modules. It is
portable across platforms and implementations because it has a precise
definition. It provides a portable standard basis library that defines a rich
collection of commonly used types and routines.
Many implementations go beyond the standard to provide experimental
language features, extensive libraries of commonly used routines, and
useful program development tools. Details can be found in the
documentation for your compiler, but here’s some of what you may expect.
Most implementations provide an interactive system supporting online
program development, including tools for compiling, linking, and
analyzing the behavior of programs. A few implementations are “batch
compilers” that rely on the ambient operating system to manage the construction
of large programs from compiled parts. Nearly every compiler generates
native machine code, even when used interactively, but some also
generate code for a portable abstract machine. Most implementations
support separate compilation and provide tools for managing large systems
and shared libraries. Some implementations provide tools for tracing and
stepping programs; many provide tools for time and space profiling. Most
implementations supplement the standard basis library with a rich
collection of handy components such as dictionaries, hash tables, or interfaces to
the ambient operating system. Some implementations provide language
extensions such as support for concurrent programming (using message
passing or locking), richer forms of modularity constructs, and support for
“lazy” data structures.
Chapter 1
Programming in Standard ML
1.1 A Regular Expression Package
To develop a feel for the language and how it is used, let us consider the
implementation of a package for matching strings against regular
expressions. We’ll structure the implementation into two modules, an
implementation of regular expressions themselves and an implementation of a
matching algorithm for them.
These two modules are concisely described by the following signatures.
signature REGEXP = sig
  datatype regexp =
      Zero | One | Char of char
    | Plus of regexp * regexp
    | Times of regexp * regexp
    | Star of regexp
  exception SyntaxError of string
  val parse : string -> regexp
  val format : regexp -> string
end
signature MATCHER = sig
  structure RegExp : REGEXP
  val accepts : RegExp.regexp -> string -> bool
end
The signature REGEXP describes a module that implements regular
expressions. It consists of a description of the abstract syntax of regular
expressions, together with operations for parsing and unparsing them. The
signature MATCHER describes a module that implements a matcher for a given
notion of regular expression. It contains a function accepts that, when
given a regular expression, returns a function that determines whether or
not that expression accepts a given string. Obviously the matcher is
dependent on the implementation of regular expressions. This is expressed
by a structure specification that specifies a hierarchical dependence of an
implementation of a matcher on an implementation of regular expressions:
any implementation of the MATCHER signature must include an
implementation of regular expressions as a constituent module. This ensures that
the matcher is self-contained, and does not rely on implicit conventions
for determining which implementation of regular expressions it employs.
The definition of the abstract syntax of regular expressions in the
signature REGEXP takes the form of a datatype declaration that is reminiscent
of a context-free grammar, but which abstracts from matters of lexical
presentation (such as precedences of operators, parenthesization, conventions
for naming the operators, etc.). The abstract syntax consists of six clauses,
corresponding to the regular expressions 0, 1, a, r1 + r2, r1 r2, and r*.[1]
The functions parse and format specify the parser and unparser for regular
expressions. The parser takes a string as argument and yields a regular
expression; if the string is ill-formed, the parser raises the exception
SyntaxError with an associated string describing the source of the error. The
unparser takes a regular expression and yields a string that parses to that
regular expression. In general there are many strings that parse to the
same regular expression; the unparser generally tries to choose one that
is easiest to read.
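To make the correspondence between concrete strings and abstract syntax tangible, here is a minimal sketch that builds the abstract syntax of (a+b)* by hand and unparses it. The functions atom and show are hypothetical stand-ins for format, not the package's actual unparser; they assume the concrete symbols "@" and "%" for Zero and One that the tokenizer later in this chapter uses, and they make no attempt to quote reserved characters.

```sml
(* The abstract syntax from REGEXP, repeated so the sketch stands alone. *)
datatype regexp =
    Zero | One | Char of char
  | Plus of regexp * regexp
  | Times of regexp * regexp
  | Star of regexp

(* The abstract syntax of (a+b)*, built directly from constructors. *)
val r = Star (Plus (Char #"a", Char #"b"))

(* Hypothetical unparser.  Sums and products are always parenthesized;
   atom adds parentheses around an iteration that is itself iterated. *)
fun atom (r as Star _) = "(" ^ show r ^ ")"
  | atom r = show r
and show Zero = "@"
  | show One = "%"
  | show (Char c) = String.str c
  | show (Plus (r1, r2)) = "(" ^ show r1 ^ "+" ^ show r2 ^ ")"
  | show (Times (r1, r2)) = "(" ^ show r1 ^ "." ^ show r2 ^ ")"
  | show (Star r) = atom r ^ "*"

val s = show r   (* "(a+b)*" *)
```

A friendlier unparser would omit parentheses that the precedence conventions make redundant; this one trades readability for simplicity.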
The implementation of the matcher consists of two modules: an
implementation of regular expressions and an implementation of the matcher
itself. An implementation of a signature is called a structure. The
implementation of the matching package consists of two structures, one
implementing the signature REGEXP, the other implementing MATCHER. Thus the
overall package is implemented by the following two structure declarations:
structure RegExp :> REGEXP = ...
structure Matcher :> MATCHER = ...
[1] Some authors use ∅ for 0 and ε for 1.

The structure identifier RegExp is bound to an implementation of the REGEXP
signature. Conformance with the signature is ensured by the ascription of
the signature REGEXP to the binding of RegExp using the “:>” notation.
Not only does this check that the implementation (which has been elided
here) conforms with the requirements of the signature REGEXP, but it also
ensures that subsequent code cannot rely on any properties of the
implementation other than those explicitly specified in the signature. This helps
to ensure that modules are kept separate, facilitating subsequent changes
to the code.
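The effect of ascription with “:>” is easiest to see on a small example. The following sketch is not part of the regular expression package: the signature COUNTER and the structure Counter are hypothetical names chosen for illustration. The signature leaves the type counter unspecified, so clients cannot tell that it is implemented as an integer.

```sml
(* Hypothetical signature: the type counter is held abstract. *)
signature COUNTER = sig
  type counter
  val zero : counter
  val next : counter -> counter
  val value : counter -> int
end

(* The implementation happens to use int, but ":>" hides that fact. *)
structure Counter :> COUNTER = struct
  type counter = int
  val zero = 0
  fun next n = n + 1
  fun value n = n
end

val two = Counter.value (Counter.next (Counter.next Counter.zero))

(* Rejected by the type checker, because Counter.counter is not int
   as far as client code is concerned:
     val bad = Counter.zero + 1                                      *)
```

Because clients can use only what the signature reveals, the representation could later be changed without touching any code outside the structure.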
Similarly, the structure identifier Matcher is bound to a structure that
implements the matching algorithm in terms of the preceding
implementation RegExp of REGEXP. The ascribed signature specifies that the structure
Matcher must conform to the requirements of the signature MATCHER.
Notice that the structure Matcher refers to the structure RegExp in its
implementation.
Once these structure declarations have been processed, we may use the
package by referring to its components using paths, or long identifiers. The
function Matcher.accepts has type
Matcher.RegExp.regexp -> string -> bool,
reflecting the fact that it takes a regular expression as implemented within
the package itself and yields a matching function on strings. We may build
a regular expression by applying the parser, Matcher.RegExp.parse, to a
string representing a regular expression, then passing the resulting value
of type Matcher.RegExp.regexp to Matcher.accepts.[2]
Here’s an example of the matcher in action:
val regexp =
Matcher.RegExp.parse "(a+b)*"
val matches =
Matcher.accepts regexp
val ex1 = matches "aabba" (* yields true *)
val ex2 = matches "abac" (* yields false *)
[2] It might seem that one can apply Matcher.accepts to the output of
RegExp.parse, since Matcher.RegExp.parse is just RegExp.parse. However,
this relationship is not stated in the interface, so there is a pro forma
distinction between the two. See Chapter 22 for more information on the
subtle issue of sharing.
The use of long identifiers can get tedious at times. There are two
typical methods for alleviating the burden. One is to introduce a synonym for
a long package name. Here’s an example:
structure M = Matcher
structure R = M.RegExp
val regexp = R.parse "((a + %).(b + %))*"
val matches = M.accepts regexp
val ex1 = matches "aabba"
val ex2 = matches "abac"
Another is to “open” the structure, incorporating its bindings into the
current environment:
open Matcher Matcher.RegExp
val regexp = parse "(a+b)*"
val matches = accepts regexp
val ex1 = matches "aabba"
val ex2 = matches "abac"
It is advisable to be sparing in the use of open because it is often hard to
anticipate exactly which bindings are incorporated into the environment
by its use.
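A small sketch shows why caution is warranted. The structure Shapes below is hypothetical and has nothing to do with the matcher; it happens to bind the name size, which the standard top-level environment already uses for the length of a string. After open, the earlier meaning is silently shadowed.

```sml
(* Hypothetical structure whose binding clashes with a top-level name. *)
structure Shapes = struct
  val size = 3
  fun area n = n * n
end

val n1 = size "abcd"       (* top-level size : string -> int, gives 4 *)

(* A synonym keeps the origin of each name visible. *)
structure S = Shapes
val n2 = S.area S.size     (* 9 *)

(* open brings every binding of Shapes into scope, shadowing size. *)
open Shapes
val n3 = size              (* now the integer 3, no longer a function *)
```

With the synonym, a reader can always see where a name comes from; after open, that information is lost.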
Now let’s look at the internals of the structures RegExp and Matcher.
Here’s a bird’s eye view of RegExp:
structure RegExp :> REGEXP = struct
  datatype regexp =
      Zero | One | Char of char
    | Plus of regexp * regexp
    | Times of regexp * regexp
    | Star of regexp
  ...
  fun tokenize s = ...
  ...
  fun parse s =
      let
        val (r, s') =
          parse_rexp (tokenize (String.explode s))
      in
        case s' of
          nil => r
        | _ => raise SyntaxError "Bad input."
      end
      handle LexicalError =>
        raise SyntaxError "Bad input."
  ...
  fun format r =
      String.implode (format_exp r)
end
The elision indicates that portions of the code have been omitted so that
we can get a high-level view of the structure of the implementation.
The structure RegExp is bracketed by the keywords struct and end.
The type regexp is implemented precisely as specified by the datatype
declaration in the signature REGEXP. The parser is implemented by a
function that, when given a string, “explodes” it into a list of characters,
transforms the character list into a list of “tokens” (abstract symbols
representing lexical atoms), and finally parses the resulting list of tokens to obtain
its abstract syntax. If there is remaining input after the parse, or if the
tokenizer encountered an illegal token, an appropriate syntax error is
signalled. The formatter is implemented by a function that, when given a
piece of abstract syntax, formats it into a list of characters that are then
“imploded” to form a string. The parser and formatter work with
character lists, rather than strings, because it is easier to process lists
incrementally than it is to process strings.
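The following sketch illustrates the two conversions using String.explode and String.implode from the Standard ML Basis Library. The function countPlus is a hypothetical example, included only to show how naturally a character list is processed one element at a time by pattern matching.

```sml
(* Explode a string into its list of characters. *)
val cs = String.explode "a+b"            (* [#"a", #"+", #"b"] *)

(* Implode a character list back into a string. *)
val s = String.implode (List.rev cs)     (* "b+a" *)

(* Hypothetical example: count the plus signs, one character at a time. *)
fun countPlus nil = 0
  | countPlus (#"+" :: cs) = 1 + countPlus cs
  | countPlus (_ :: cs) = countPlus cs

val n = countPlus (String.explode "a+b+c")   (* 2 *)
```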
It is interesting to consider in more detail the structure of the parser
since it exemplifies well the use of pattern matching to define functions.
Let’s start with the tokenizer, which we present here in toto:
datatype token =
    AtSign | Percent | Literal of char
  | PlusSign | TimesSign
  | Asterisk | LParen | RParen
exception LexicalError
fun tokenize nil = nil
  | tokenize (#"+" :: cs) = PlusSign :: tokenize cs
  | tokenize (#"." :: cs) = TimesSign :: tokenize cs
  | tokenize (#"*" :: cs) = Asterisk :: tokenize cs
  | tokenize (#"(" :: cs) = LParen :: tokenize cs
  | tokenize (#")" :: cs) = RParen :: tokenize cs
  | tokenize (#"@" :: cs) = AtSign :: tokenize cs
  | tokenize (#"%" :: cs) = Percent :: tokenize cs
  | tokenize (#"\\" :: c :: cs) =
      Literal c :: tokenize cs
  | tokenize (#"\\" :: nil) = raise LexicalError
  | tokenize (#" " :: cs) = tokenize cs
  | tokenize (c :: cs) = Literal c :: tokenize cs
The symbol “@” stands for the empty regular expression and the symbol
“%” stands for the regular expression accepting only the null string.
Concatenation is indicated by “.”, alternation by “+”, and iteration by “*”.
We use a datatype declaration to introduce the type of tokens
corresponding to the symbols of the input language. The function tokenize
has type char list -> token list; it transforms a list of characters into
a list of tokens. It is defined by a series of clauses that dispatch on the first
character of the list of characters given as input, yielding a list of tokens.
The correspondence between characters and tokens is relatively
straightforward, the only non-trivial case being to admit the use of a backslash
to “quote” a reserved symbol as a character of input. (More sophisticated
languages have more sophisticated token structures; for example, words
(consecutive sequences of letters) are often regarded as a single token of
input.) Notice that it is quite natural to “look ahead” in the input stream
in the case of the backslash character, using a pattern that dispatches on
the first two characters (if there are such) of the input, and proceeding
accordingly. (It is a lexical error to have a backslash at the end of the input.)
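The two-character lookahead can be isolated in a smaller function. The sketch below is an illustration only: unquote and the exception Dangling are hypothetical names, not part of the package. It removes each quoting backslash from a character list, treating a trailing backslash as an error in the same way tokenize does.

```sml
(* Hypothetical exception for a backslash with nothing after it. *)
exception Dangling

(* The second clause looks two characters ahead; the third catches a
   backslash that is the last character of the input. *)
fun unquote nil = nil
  | unquote (#"\\" :: c :: cs) = c :: unquote cs
  | unquote (#"\\" :: nil) = raise Dangling
  | unquote (c :: cs) = c :: unquote cs

(* In the string literal, "\\" denotes a single backslash character. *)
val s = String.implode (unquote (String.explode "a\\+b"))   (* "a+b" *)
```

The order of the clauses matters: the clause for an arbitrary character comes last, so a backslash never reaches it.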
Let’s turn to the parser. It is a simple recursive-descent parser
implementing the precedence conventions for regular expressions given earlier.
These conventions may be formally specified by the following grammar,
which not only enforces precedence conventions, but also allows for the
use of parenthesization to override them.
rexp ::= rtrm | rtrm+rexp
rtrm ::= rfac | rfac.rtrm
rfac ::= ratm | ratm*
ratm ::= @ | % | a | (rexp)
The implementation of the parser follows this grammar quite closely.
It consists of four mutually recursive functions, parse_rexp, parse_rtrm,
parse_rfac, and parse_ratm. These implement what is known as a
recursive descent parser that dispatches on the head of the token list to determine
how to proceed.
fun parse_rexp ts =
    let
      val (r, ts') = parse_rtrm ts
    in
      case ts'
        of (PlusSign :: ts'') =>
             let
               val (r', ts''') = parse_rexp ts''
             in
               (Plus (r, r'), ts''')
             end
         | _ => (r, ts')
    end
and parse_rtrm ts = ...
and parse_rfac ts =
    let
      val (r, ts') = parse_ratm ts
    in
      case ts'
        of (Asterisk :: ts'') => (Star r, ts'')
         | _ => (r, ts')
    end
and parse_ratm nil =
      raise SyntaxError ("No atom")
  | parse_ratm (AtSign :: ts) = (Zero, ts)
  | parse_ratm (Percent :: ts) = (One, ts)
  | parse_ratm ((Literal c) :: ts) = (Char c, ts)
  | parse_ratm (LParen :: ts) =
      let
        val (r, ts') = parse_rexp ts
      in
        case ts'
          of (RParen :: ts'') => (r, ts'')
           | _ =>
               raise SyntaxError "No close paren"
      end
Notice that it is quite simple to implement “lookahead” using patterns that
inspect the token list for specified tokens. This parser makes no attempt
to recover from syntax errors, but one could imagine doing so, using
standard techniques.
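For readers who wish to experiment, here is a stand-alone sketch of the four functions that can be loaded without the rest of the package. Two parts of it are assumptions rather than the text's code: the body of parse_rtrm, which the text elides, is written here by analogy with parse_rexp following the grammar rule for rtrm, and the final catch-all clause of parse_ratm is added so that an unexpected token raises SyntaxError.

```sml
datatype regexp =
    Zero | One | Char of char
  | Plus of regexp * regexp
  | Times of regexp * regexp
  | Star of regexp

datatype token =
    AtSign | Percent | Literal of char
  | PlusSign | TimesSign
  | Asterisk | LParen | RParen

exception SyntaxError of string

(* rexp ::= rtrm | rtrm+rexp *)
fun parse_rexp ts =
      let val (r, ts') = parse_rtrm ts
      in case ts' of
           PlusSign :: ts'' =>
             let val (r', ts''') = parse_rexp ts''
             in (Plus (r, r'), ts''') end
         | _ => (r, ts')
      end
(* rtrm ::= rfac | rfac.rtrm
   Assumption: elided in the text; modeled on parse_rexp. *)
and parse_rtrm ts =
      let val (r, ts') = parse_rfac ts
      in case ts' of
           TimesSign :: ts'' =>
             let val (r', ts''') = parse_rtrm ts''
             in (Times (r, r'), ts''') end
         | _ => (r, ts')
      end
(* rfac ::= ratm | ratm* *)
and parse_rfac ts =
      let val (r, ts') = parse_ratm ts
      in case ts' of
           Asterisk :: ts'' => (Star r, ts'')
         | _ => (r, ts')
      end
(* ratm ::= @ | % | a | (rexp) *)
and parse_ratm nil = raise SyntaxError "No atom"
  | parse_ratm (AtSign :: ts) = (Zero, ts)
  | parse_ratm (Percent :: ts) = (One, ts)
  | parse_ratm (Literal c :: ts) = (Char c, ts)
  | parse_ratm (LParen :: ts) =
      let val (r, ts') = parse_rexp ts
      in case ts' of
           RParen :: ts'' => (r, ts'')
         | _ => raise SyntaxError "No close paren"
      end
  | parse_ratm _ = raise SyntaxError "No atom"  (* added catch-all *)

(* The tokens of "(a+b)*", written out by hand. *)
val (r, rest) =
  parse_rexp [LParen, Literal #"a", PlusSign, Literal #"b", RParen, Asterisk]
```

Each function returns the expression it recognized together with the unconsumed tokens, which is what allows its caller to inspect the next token and decide how to continue.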
This completes the implementation of regular expressions. Now for the
matcher. The matcher proceeds by a recursive analysis of the regular
expression. The main difficulty is to account for concatenation: to match a
string against the regular expression r1 r2 we must match some initial
segment against r1, then match the corresponding final segment against r2.
This suggests that we generalize the matcher to one that checks whether
some initial segment of a string matches a given regular expression, then
passes the remaining final segment to a continuation, a function that
determines what to do after the initial segment has been successfully matched.
This facilitates implementation of concatenation, but how do we ensure
that at the outermost call the entire string has been matched? We achieve
this by using an initial continuation that checks whether the final segment
is empty.
Here’s the code, written as a structure implementing the signature MATCHER:
structure Matcher :> MATCHER =
struct
  structure RegExp = RegExp
  open RegExp
  fun match Zero _ k = false
    | match One cs k = k cs
    | match (Char c) cs k =
        (case cs of nil => false | c'::cs' => (c=c') andalso (k cs'))
    | match (Plus (r1, r2)) cs k =
        match r1 cs k orelse match r2 cs k
    | match (Times (r1, r2)) cs k =
        match r1 cs (fn cs' => match r2 cs' k)
    | match (Star r) cs k =
        let
          fun mstar cs' = k cs' orelse match r cs' mstar
        in
          mstar cs
        end
  fun accepts regexp string =
      match regexp (String.explode string) (fn nil => true | _ => false)
end
Note that we incorporate the structure RegExp into the structure Matcher,
in accordance with the requirements of the signature. The function accepts
explodes the string into a list of characters (to facilitate sequential
processing of the input), then calls match with an initial continuation that ensures
that the remaining input is empty to determine the result. The type of
match is
RegExp.regexp -> char list -> (char list -> bool) -> bool.
That is, match takes in succession a regular expression, a list of characters,
and a continuation of type char list -> bool; it yields as result a value
of type bool. This is a fairly complicated type, but notice that nowhere did
we have to write this type in the code! The type inference mechanism of
ML took care of determining what that type must be based on an analysis
of the code itself.
Since match takes a function as argument, it is said to be a higher-order
function. The simplicity of the matcher is due in large measure to the ease
with which we can manipulate functions in ML. Notice that we create a
new, unnamed function to pass as a continuation in the case of
concatenation: it is the function that matches the second part of the
regular expression to the characters remaining after matching an initial
segment against the first part. We use a similar technique to implement
matching against an iterated regular expression: we attempt to match the
null string, but if this fails, we match against the regular expression
being iterated followed by the iteration once again. This neatly captures
the “zero or more times” interpretation of iteration of a regular
expression.
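To see the continuation-passing matcher at work, here is a minimal
stand-alone sketch. It assumes a local copy of the regular expression
constructors (in the text they come from the structure RegExp), and the
names ab, whole, leftover, and repeated are introduced only for
illustration.

```sml
(* Assumption: a stand-alone copy of the constructors used by match;
   in the text they are supplied by the structure RegExp. *)
datatype regexp =
    Zero
  | One
  | Char of char
  | Plus of regexp * regexp
  | Times of regexp * regexp
  | Star of regexp

(* match r cs k: does some initial segment of cs match r, with the
   continuation k accepting the remaining final segment? *)
fun match Zero _ _ = false
  | match One cs k = k cs
  | match (Char c) cs k =
      (case cs of nil => false | c' :: cs' => c = c' andalso k cs')
  | match (Plus (r1, r2)) cs k = match r1 cs k orelse match r2 cs k
  | match (Times (r1, r2)) cs k = match r1 cs (fn cs' => match r2 cs' k)
  | match (Star r) cs k =
      let
        fun mstar cs' = k cs' orelse match r cs' mstar
      in
        mstar cs
      end

(* The initial continuation insists that nothing is left over. *)
fun accepts r s = match r (String.explode s) (fn nil => true | _ => false)

(* Hypothetical test data: the concatenation of a and b. *)
val ab = Times (Char #"a", Char #"b")
val whole = accepts ab "ab"
val leftover = accepts ab "abc"
val repeated = accepts (Star ab) "abab"
```

Here whole and repeated are true, while leftover is false: the initial
segment "ab" matches, but the initial continuation rejects the non-empty
final segment "c".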
Important: the code given above contains a subtle error. Can
you ﬁnd it? If not, see chapter 27 for further discussion!
This completes our brief overview of Standard ML. The remainder of
these notes is structured into three parts. The first part is a detailed
introduction to the core language, the language in which we write programs
in ML. The second part is concerned with the module language, the means
by which we structure large programs in ML. The third is about programming
techniques, methods for building reliable and robust programs. I
hope you enjoy it!
1.2 Sample Code
Here is the complete code for this chapter.
Part II
The Core Language
All Standard ML is divided into two parts. The first part, the core
language, comprises the fundamental programming constructs of the
language: the primitive types and operations, the means of defining and
using functions, mechanisms for defining new types, and so on. The
second part, the module language, comprises the mechanisms for structuring
programs into separate units and is described in Part III. Here we
introduce the core language.
Chapter 2
Types, Values, and Effects
2.1 Evaluation and Execution
Most familiar programming languages, such as C or Java, are based on an
imperative model of computation. Programs are thought of as specifying
a sequence of commands that modify the memory of the computer. Each
step of execution examines the current contents of memory, performs a
simple computation, modifies the memory, and continues with the next
instruction. The individual commands are executed for their effect on the
memory (which we may take to include both the internal memory and
registers and the external input/output devices). The progress of the
computation is controlled by evaluation of expressions, such as boolean
tests or arithmetic operations, that are executed for their value.
Conditional commands branch according to the value of some expression. Many
languages maintain a distinction between expressions and commands, but
often (in C, for example) expressions may also modify the memory, so that
even expression evaluation has an effect.
Computation in ML is of a somewhat different nature. The emphasis
in ML is on computation by evaluation of expressions, rather than execution
of commands. The idea is a generalization of your experience from high
school algebra, in which you are given a polynomial in a variable x and
are asked to calculate its value at a given value of x. We proceed by
“plugging in” the given value for x, and then, using the rules of
arithmetic, determine the value of the polynomial. The evaluation model
of computation used in ML is based on the same idea, but rather than
restrict ourselves to arithmetic operations on the reals, we admit a richer
variety of values and a richer variety of primitive operations on them.
The evaluation model of computation enjoys several advantages over
the more familiar imperative model. Because of its close relationship to
mathematics, it is much easier to develop mathematical techniques for
reasoning about the behavior of programs. These techniques are important
tools for helping us to ensure that programs work properly without
having to resort to tedious testing and debugging that can only show the
presence of errors, never their absence. Moreover, they provide important
tools for documenting the reasoning that went into the formulation of a
program, making the code easier to understand and maintain.
What is more, the evaluation model subsumes the imperative model
as a special case. Execution of commands for the effect on memory can be
seen as a special case of evaluation of expressions by introducing
primitive operations for allocating, accessing, and modifying memory. Rather
than forcing all aspects of computation into the framework of memory
modification, we instead take expression evaluation as the primary
notion. Doing so allows us to support imperative programming without
destroying the mathematical elegance of the evaluation model for programs
that don’t use memory. As we will see, it is quite remarkable how seldom
memory modification is required. Nevertheless, the language provides for
storage-based computation for those few times that it is actually necessary.
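As a small preview, the primitive operations for memory might be used as
in the following sketch. It relies on the reference cells of Standard ML
(ref, !, and :=), which are not introduced in this chapter; the names
counter, old, and new are chosen only for illustration.

```sml
(* Allocation: ref creates a new memory cell holding 0. *)
val counter : int ref = ref 0

(* Access: ! reads the current contents of the cell. *)
val old : int = !counter

(* Modification: := stores a new value; the expression has type unit. *)
val () = counter := !counter + 1

val new : int = !counter
```

Each of these operations is an ordinary expression that is evaluated;
the modification of memory happens as an effect of that evaluation. Here
old is 0 and new is 1.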
2.2 The ML Computation Model
Computation in ML consists of evaluation of expressions. Each expression
has three important characteristics:
• It may or may not have a type.
• It may or may not have a value.
• It may or may not engender an effect.
These characteristics are all that you need to know to compute with an
expression.
The type of an expression is a description of the value it yields, should
it yield a value at all. For example, for an expression to have type int is
to say that its value (should it have one) is an integer, and for an
expression to have type real is to say that its value (if any) is a
floating point number. In general we can think of the type of an
expression as a “prediction” of the form of the value that it has, should
it have one. Every expression is required to have at least one type; those
that do are said to be well-typed. Those without a type are said to be
ill-typed; they are considered ineligible for evaluation. The type checker
determines whether or not an expression is well-typed, rejecting with an
error those that are not.
A well-typed expression is evaluated to determine its value, if indeed
it has one. An expression can fail to have a value because its evaluation
never terminates or because it raises an exception, either because of a
run-time fault such as division by zero or because some programmer-defined
condition is signalled during its evaluation. If an expression has a value,
the form of that value is predicted by its type. For example, if an
expression evaluates to a value v and its type is bool, then v must be
either true or false; it cannot be, say, 17 or 3.14. The soundness of the
type system ensures the accuracy of the predictions made by the type
checker.
Evaluation of an expression might also engender an effect. Effects
include such phenomena as raising an exception, modifying memory,
performing input or output, or sending a message on the network. It is
important to note that the type of an expression says nothing about its
possible effects! An expression of type int might well display a message on
the screen before returning an integer value. This possibility is not
accounted for in the type of the expression, which classifies only its
value. For this reason effects are sometimes called side effects, to stress
that they happen “off to the side” during evaluation, and are not part of
the value of the expression. We will ignore effects until chapter 13. For
the time being we will assume that all expressions are effect-free, or pure.
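A minimal sketch of such an expression follows. It uses the standard
print function and the sequencing form (exp; exp), neither of which has
been introduced yet; the name n and the message text are hypothetical.

```sml
(* The parenthesized expression has type int, yet evaluating it also
   displays a message; the effect does not appear in the type. *)
val n : int = (print "computing...\n"; 3 + 4)
```

The binding gives n the value 7, and the message is displayed as a side
effect of evaluating the right-hand side.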
2.2.1 Type Checking
What is a type? What types are there? Generally speaking, a type is
defined by specifying three things:
• a name for the type,
• the values of the type, and
• the operations that may be performed on values of the type.
Often the division of labor into values and operations is not completely
clear-cut, but it nevertheless serves as a very useful guideline for
describing types.
Let’s consider first the type of integers. Its name is int. The values
of type int are the numerals 0, 1, ~1, 2, ~2, and so on. (Note that
negative numbers are written with a prefix tilde, rather than a minus sign!)
Operations on integers include addition, +, subtraction, -, multiplication,
*, quotient, div, and remainder, mod. Arithmetic expressions are formed
in the familiar manner, for example, 3*2+6, governed by the usual rules
of precedence. Parentheses may be used to override the precedence
conventions, just as in ordinary mathematical practice. Thus the preceding
expression may be equivalently written as (3*2)+6, but we may also write
3*(2+6) to override the default precedences.
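The following sketch exercises these integer operations. It uses value
bindings of the form val var : typ = exp, which are introduced in
chapter 3; the variable names are chosen only for illustration.

```sml
val a : int = 3*2+6      (* multiplication binds tighter than addition *)
val b : int = (3*2)+6    (* the same expression, fully parenthesized *)
val c : int = 3*(2+6)    (* parentheses override the default precedence *)
val d : int = ~2 + 5     (* negative two is written ~2, not -2 *)
val q : int = 7 div 2    (* integer quotient *)
val r : int = 7 mod 2    (* remainder *)
```

Here a and b are both 12, c is 24, d is 3, q is 3, and r is 1.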
The formation of expressions is governed by a set of typing rules that
define the types of expressions in terms of the types of their constituent
expressions (if any). The typing rules are generally quite intuitive since
they are consistent with our experience in mathematics and in other
programming languages. In their full generality the rules are somewhat
involved, but we will sneak up on them by first considering only a small
fragment of the language, building up additional machinery as we go along.
Here are some simple arithmetic expressions, written using infix
notation for the operations (meaning that the operator comes between the
arguments, as is customary in mathematics):
3
3 + 4
4 div 3
4 mod 3
Each of these expressions is well-formed; in fact, they each have type
int. This is indicated by a typing assertion of the form exp : typ, which
states that the expression exp has the type typ. A typing assertion is said to
be valid iff the expression exp does indeed have the type typ. The following
are all valid typing assertions:
3 : int
3 + 4 : int
4 div 3 : int
4 mod 3 : int
Why are these typing assertions valid? In the case of the value 3, it
is an axiom that integer numerals have integer type. What about the
expression 3+4? The addition operation takes two arguments, each of which
must have type int. Since both arguments in fact have type int, it
follows that the entire expression is of type int. For more complex cases we
reason analogously, for example, deducing that (3+4) div (2+3) : int by
observing that (3+4) : int and (2+3) : int.
The reasoning involved in demonstrating the validity of a typing
assertion may be summarized by a typing derivation consisting of a nested
sequence of typing assertions, each justified either by an axiom, or a
typing rule for an operation. For example, the validity of the typing
assertion (3+7) div 5 : int is justified by the following derivation:
1. (3+7): int, because
(a) 3 : int because it is an axiom
(b) 7 : int because it is an axiom
(c) the arguments of + must be integers, and the result of + is an
integer
2. 5 : int because it is an axiom
3. the arguments of div must be integers, and the result is an integer
The outermost steps justify the assertion (3+7) div 5 : int by
demonstrating that the arguments each have type int. Recursively, the inner
steps justify that (3+7) : int.
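The steps of such a derivation can be mirrored in code by annotating
subexpressions with their types, since ML accepts a typing assertion
exp : typ as an expression. The following is only a sketch; the names sum
and quotient are hypothetical, and the val form is introduced in chapter 3.

```sml
(* Inner steps: each numeral has type int, hence so does the sum. *)
val sum : int = (3 : int) + (7 : int)

(* Outermost step: both arguments of div have type int, and so does
   the result. *)
val quotient : int = ((3 + 7) : int) div (5 : int)
```

The type checker verifies every annotation; changing any int to real
causes the binding to be rejected.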
2.2.2 Evaluation
Evaluation of expressions is defined by a set of evaluation rules that
determine how the value of a compound expression is determined as a function
of the values of its constituent expressions (if any). Since the value of an
operation is determined by the values of its arguments, ML is sometimes
said to be a call-by-value language. While this may seem like the only
sensible way to define evaluation, we will see in chapter 15 that this need
not be the case: some operations may yield a value without evaluating their
arguments. Such operations are sometimes said to be lazy, to distinguish
them from eager operations that require their arguments to be evaluated
before the operation is performed.
An evaluation assertion has the form exp⇓val. This assertion states that
the expression exp has value val. It should be intuitively clear that the
following evaluation assertions are valid.
5 ⇓ 5
2+3 ⇓ 5
(2+3) div (1+4) ⇓ 1
An evaluation assertion may be justiﬁed by an evaluation derivation,
which is similar to a typing derivation. For example, we may justify the
assertion (3+7) div 5 ⇓ 2 by the derivation
1. (3+7) ⇓ 10 because
(a) 3 ⇓ 3 because it is an axiom
(b) 7 ⇓ 7 because it is an axiom
(c) Adding 3 to 7 yields 10.
2. 5 ⇓ 5 because it is an axiom
3. Dividing 10 by 5 yields 2.
Note first that it is an axiom that a numeral evaluates to itself; numerals
are fully evaluated expressions, or values. Second, the rules of arithmetic
are used to determine that adding 3 and 7 yields 10.
Not every expression has a value. A simple example is the expression
5 div 0, which is undefined. If you attempt to evaluate this expression
it will incur a run-time error, reflecting the erroneous attempt to find the
number n that, when multiplied by 0, yields 5. The error is expressed
in ML by raising an exception; we will have more to say about exceptions
in chapter 12. Another reason that a well-typed expression might not have
a value is that the attempt to evaluate it leads to an infinite loop. We
don’t yet have the machinery in place to define such expressions, but we
will soon see that it is possible for an expression to diverge, or run
forever, when evaluated.
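To observe the run-time error without aborting the program, one might
intercept the exception, as in the following sketch. It assumes the
standard exception Div and the handle construct, both of which belong to
the later discussion of exceptions; the name safe and the fallback value 0
are hypothetical.

```sml
(* 5 div 0 is well-typed (it has type int) but has no value: its
   evaluation raises Div, which the handler replaces with 0. *)
val safe : int = (5 div 0) handle Div => 0
```

Without the handler the binding would bind nothing to safe, since the
right-hand side has no value.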
2.3 Types, Types, Types
What types are there besides the integers? Here are a few useful base types
of ML:
• Type name: real
– Values: 3.14, 2.17, 0.1E6, . . .
– Operations: +, -, *, /, =, <, . . .
• Type name: char
– Values: #"a", #"b", . . .
– Operations: ord, chr, =, <, . . .
• Type name: string
– Values: "abc", "1234", . . .
– Operations: ^, size, =, <, . . .
• Type name: bool
– Values: true, false
– Operations: if exp then exp_1 else exp_2
There are many, many (in fact, inﬁnitely many!) others, but these are
enough to get us started. (See Appendix A for a complete description of
the primitive types of ML, including the ones given above.)
Notice that some of the arithmetic operations for real numbers are
written the same way as the corresponding operations on integers. For
example, we may write 3.1+2.7 to perform a floating point addition of two
floating point numbers. This is called overloading; the addition operation
is said to be overloaded at the types int and real. In an expression
involving addition the type checker tries to resolve which form of addition
(fixed point or floating point) you mean. If the arguments are int’s, then
fixed point addition is used; if the arguments are real’s, then floating
point addition is used; otherwise an error is reported. (If the type of the
arguments cannot be determined, the type defaults to int.) Note that ML
does not perform any implicit conversions between types! For example, the
expression 3+3.14 is rejected as ill-formed! If you intend floating point
addition, you must write instead real(3)+3.14, which converts the integer
3 to its floating point representation before performing the addition. If,
on the other hand, you intend integer addition, you must write
3+round(3.14), which converts 3.14 to an integer by rounding before
performing the addition.
Finally, note that ﬂoating point division is a different operation from
integer quotient! Thus we write 3.1/2.7 for the result of dividing 3.1 by
2.7, which results in a ﬂoating point number. We reserve the operator div
for integers, and use / for ﬂoating point division.
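The following sketch gathers these points about overloading, explicit
conversion, and the two forms of division. The variable names are
hypothetical, and val bindings are introduced in chapter 3.

```sml
val i : int = 3 + 4              (* fixed point addition *)
val x : real = 3.1 + 2.7         (* floating point addition *)
val y : real = real 3 + 3.14     (* convert the integer before adding *)
val j : int = 3 + round 3.14     (* convert the real before adding *)
val q : int = 7 div 2            (* integer quotient *)
val z : real = 7.0 / 2.0         (* floating point division *)

(* Rejected by the type checker, since no implicit conversion occurs:
   val bad = 3 + 3.14 *)
```

Here i is 7, j is 6, q is 3, and z is 3.5.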
The conditional expression
if exp then exp_1 else exp_2
is used to discriminate on a Boolean value. It has type typ if exp has type
bool and both exp_1 and exp_2 have type typ. Notice that both “arms” of the
conditional must have the same type! It is evaluated by first evaluating
exp, then proceeding to evaluate either exp_1 or exp_2, according to whether
the value of exp is true or false. For example,
if 1<2 then "less" else "greater"
evaluates to "less" since the value of the expression 1<2 is true.
Note that the expression
if 1<2 then 0 else (1 div 0)
evaluates to 0, even though 1 div 0 incurs a run-time error. This is
because evaluation of the conditional proceeds either to the then clause or
to the else clause, depending on the outcome of the boolean test. Whichever
clause is evaluated, the other is simply discarded without further
consideration.
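Both examples may be checked directly, as in this short sketch (the names
s and z are chosen only for illustration):

```sml
val s : string = if 1 < 2 then "less" else "greater"

(* The else clause is never evaluated, so no exception is raised. *)
val z : int = if 1 < 2 then 0 else (1 div 0)
```

The first binding gives s the value "less"; the second gives z the value
0 without incurring a run-time error.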
Although we may, in fact, test equality of two boolean expressions, it
is rarely useful to do so. Beginners often write conditionals of the form
if exp = true then exp_1 else exp_2.
But this is equivalent to the simpler expression
if exp then exp_1 else exp_2.
Similarly, rather than write
if exp = false then exp_1 else exp_2,
it is better to write
if not exp then exp_1 else exp_2
or, better yet, just
if exp then exp_2 else exp_1.
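As a concrete instance, the equivalent forms might be compared as follows.
This is only a sketch; the test b and the strings "yes" and "no" are
hypothetical stand-ins for exp, exp_1, and exp_2.

```sml
val b : bool = 1 < 2

(* Testing against true, and the simpler equivalent. *)
val verbose : string = if b = true then "yes" else "no"
val direct : string = if b then "yes" else "no"

(* Testing against false, then with not, then with the arms swapped. *)
val viaFalse : string = if b = false then "no" else "yes"
val viaNot : string = if not b then "no" else "yes"
val swapped : string = if b then "yes" else "no"
```

All five bindings yield "yes".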
2.4 Type Errors
Now that we have more than one type, we have enough rope to hang
ourselves by forming ill-typed expressions. For example, the following
expressions are not well-typed:
size 45
#"1" + 1
#"2" ^ "1"
3.14 + 2
In each case we are “misusing” an operator with arguments of the wrong
type.
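Each of these expressions can be repaired by converting the offending
argument explicitly. The following is one possible set of repairs, a
sketch that guesses at the intended meaning of each expression; it uses
the standard functions ord, str, and real, and hypothetical variable names.

```sml
val a : int = size "45"            (* size expects a string *)
val b : int = ord #"1" + 1         (* ord yields the character code *)
val c : string = str #"2" ^ "1"    (* str turns a char into a string *)
val d : real = 3.14 + real 2       (* real converts an int to a real *)
```

Here a is 2, b is 50 (the code of #"1" is 49), and c is "21".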
This raises a natural question: is the following expression well-typed
or not?
if 1<2 then 0 else ("abc"+4)
Since the boolean test will come out true, the else clause will never be
executed, and hence need not be constrained to be well-typed. While this
reasoning is sensible for such a simple example, in general it is impossible
for the type checker to determine the outcome of the boolean test during
type checking. To be safe the type checker “assumes the worst” and insists
that both clauses of the conditional be well-typed, and in fact have the same
type, to ensure that the conditional expression can be given a type, namely
that of both of its clauses.
2.5 Sample Code
Here is the complete code for this chapter.
Chapter 3
Declarations
3.1 Variables
Just as in any other programming language, values may be assigned to
variables, which may then be used in expressions to stand for that value.
However, in sharp contrast to most familiar languages, variables in ML do
not vary! A value may be bound to a variable using a construct called a
value binding. Once a variable is bound to a value, it is bound to it for
life; there is no possibility of changing the binding of a variable once it has
been bound. In this respect variables in ML are more akin to variables in
mathematics than to variables in languages such as C.
A type may also be bound to a type constructor using a type binding.
A bound type constructor stands for the type bound to it, and can never
stand for any other type. For this reason a type binding is sometimes called
a type abbreviation: the type constructor stands for the type to which it is
bound. (By the same token a value binding might also be called a value
abbreviation, but for some reason it never is.)
A value or type binding introduces a “new” variable or type constructor,
distinct from all others of that class, for use within its range of
significance, or scope. Scoping in ML is static, or lexical, meaning that
the range of significance of a variable or type constructor is determined by
the program text, not by the order of evaluation of its constituent
expressions. (Languages with dynamic scope adopt the opposite convention.)
For the time being variables and type constructors have global scope,
meaning that the range of significance of the variable or type constructor
is the “rest” of the program, that is, the part that lexically follows the
binding. However, we will soon introduce mechanisms for limiting the scopes
of variables or type constructors to a given expression.
3.2 Basic Bindings
3.2.1 Type Bindings
Any type may be given a name using a type binding. At this stage we have
so few types that it is hard to justify binding type names to identiﬁers, but
we’ll do it anyway because we’ll need it later. Here are some examples of
type bindings:
type float = real
type count = int and average = real
The first type binding introduces the type constructor float, which
subsequently is synonymous with real. The second introduces two type
constructors, count and average, which stand for int and real, respectively.
In general a type binding introduces one or more new type constructors
simultaneously in the sense that the definitions of the type constructors
may not involve any of the type constructors being defined. Thus a
binding such as
type float = real and average = float
is nonsensical (in isolation) since the type constructors float and average
are introduced simultaneously, and hence cannot refer to one another.
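If the second type constructor is meant to refer to the first, the two
bindings can instead be written one after the other, as in the following
sketch (the variable mean is hypothetical):

```sml
type float = real

(* float is already in scope here, so this binding is sensible. *)
type average = float

val mean : average = (1.0 + 2.0) / 2.0
```

Since average stands for float, which in turn stands for real, the last
binding is accepted and mean has the value 1.5.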
The syntax for type bindings is
type tycon_1 = typ_1
and ...
and tycon_n = typ_n
where each tycon_i is a type constructor and each typ_i is a type expression.
3.2.2 Value Bindings
A value may be given a name using a value binding. Here are some
examples:
val m : int = 3+2
val pi : real = 3.14 and e : real = 2.718
The first binding introduces the variable m, specifying its type to be int
and its value to be 5. The second introduces two variables, pi and e,
simultaneously, both having type real, and with pi having value 3.14 and
e having value 2.718. Notice that a value binding specifies both the type
and the value of a variable.
The syntax of value bindings is
val var_1 : typ_1 = exp_1
and ...
and var_n : typ_n = exp_n,
where each var_i is a variable, each typ_i is a type expression, and each
exp_i is an expression.
A value binding of the form
val var : typ = exp
is type-checked by ensuring that the expression exp has type typ. If not,
the binding is rejected as ill-formed. If so, the binding is evaluated using
the bind-by-value rule: first exp is evaluated to obtain its value val, then
val is bound to var. If exp does not have a value, then the declaration does
not bind anything to the variable var.
The purpose of a binding is to make a variable or type constructor
available for use within its scope. In the case of a type binding we may
use the type constructor introduced by that binding in type expressions
occurring within its scope. For example, in the presence of the type
bindings above, we may write
val pi : float = 3.14
since the type constructor float is bound to the type real, the type of the
expression 3.14. Similarly, we may make use of the variable introduced
by a value binding in value expressions occurring within its scope.
Continuing from the preceding binding, we may use the expression
Math.sin pi
to stand for 0.0 (approximately), and we may bind this value to a variable
by writing
val x : float = Math.sin pi
As these examples illustrate, type checking and evaluation are
context-dependent in the presence of type and value bindings, since we must
refer to these bindings to determine the types and values of expressions.
For example, to determine that the above binding for x is well-formed, we
must consult the binding for pi to determine that it has type float, then
consult the binding for float to determine that it is synonymous with real,
the argument and result type of Math.sin. Only then can the binding of x
be given type float.
The rough-and-ready rule for both type checking and evaluation is that
a bound variable or type constructor is implicitly replaced by its binding
prior to type checking and evaluation. This is sometimes called the
substitution principle for bindings. For example, to evaluate the expression
Math.cos x in the scope of the above declarations, we first replace the
occurrence of x by its value (approximately 0.0), then compute as before,
yielding (approximately) 1.0. Later on we will have to refine this simple
principle to take account of more sophisticated language features, but it is
useful nonetheless to keep this simple idea in mind.
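The substitution principle can be checked on this very example by
comparing the expression written with bound names against the one obtained
by replacing each name by its binding. In the sketch below the names
viaBindings, substituted, and same are hypothetical, and Real.== is the
standard equality test on reals.

```sml
type float = real
val pi : float = 3.14
val x : float = Math.sin pi

(* Written using the bound variable x and the type constructor float. *)
val viaBindings : float = Math.cos x

(* After substitution: x replaced by Math.sin 3.14, float by real. *)
val substituted : real = Math.cos (Math.sin 3.14)

val same : bool = Real.== (viaBindings, substituted)
```

Both forms denote the same computation, so same is true, and the common
value is approximately 1.0.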
3.3 Compound Declarations
Bindings may be combined to form declarations. A binding is an atomic
declaration, even though it may introduce many variables simultaneously.
Two declarations may be combined by sequential composition by simply
writing them one after the other, optionally separated by a semicolon.
Thus we may write the declaration
val m : int = 3+2
val n : int = m*m
which binds m to 5 and n to 25. Subsequently, we may evaluate m+n to
obtain the value 30. In general a sequential composition of declarations has
the form dec_1 . . . dec_n, where n is at least 2. The scopes of these
declarations are nested within one another: the scope of dec_1 includes
dec_2, . . . , dec_n, the scope of dec_2 includes dec_3, . . . , dec_n, and
so on.
One thing to keep in mind is that binding is not assignment. The binding
of a variable never changes; once bound to a value, it is always bound to
that value (within the scope of the binding). However, we may shadow a
binding by introducing a second binding for a variable within the scope
of the ﬁrst binding. Continuing the above example, we may write
val n : real = 2.17
to introduce a new variable n with both a different type and a different
value than the earlier binding. The new binding eclipses the old one,
which may then be discarded since it is no longer accessible. (Later on, we
will see that in the presence of higher-order functions shadowed bindings
are not always discarded, but are preserved as private data in a closure.
One might say that old bindings never die, they just fade away.)
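The following sketch illustrates both shadowing and the survival of a
shadowed binding. It anticipates function declarations, which are
introduced in chapter 4; the names addN, viaOld, and viaNew are
hypothetical.

```sml
val m : int = 3+2
val n : int = m*m

(* addN refers to the binding of n in force here, namely 25. *)
fun addN (k : int) : int = k + n

(* Shadow n with a binding of a different type and value. *)
val n : real = 2.17

val viaOld : int = addN 1      (* still uses the earlier n *)
val viaNew : real = n + 1.0    (* uses the new n *)
```

Here viaOld is 26: the earlier binding of n is no longer accessible by
name, but it has not changed, and addN continues to use it.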
3.4 Limiting Scope
The scope of a variable or type constructor may be delimited by using let
expressions and local declarations. A let expression has the form
let dec in exp end
where dec is any declaration and exp is any expression. The scope of the
declaration dec is limited to the expression exp. The bindings introduced
by dec are discarded upon completion of evaluation of exp.
Similarly, we may limit the scope of one declaration to another
declaration by writing
local dec in dec' end
The scope of the bindings in dec is limited to the declaration dec'. After
processing dec', the bindings in dec may be discarded.
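For instance, a local declaration might be used as in this sketch (the
names side, area, and perimeter are chosen only for illustration):

```sml
local
  (* side is visible only between in and end. *)
  val side : int = 3
in
  val area : int = side * side
  val perimeter : int = 4 * side
end
```

After this declaration area is bound to 9 and perimeter to 12, but side is
no longer in scope; a subsequent reference to it is rejected by the type
checker unless some other binding of side is in force.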
The value of a let expression is determined by evaluating the
declaration part, then evaluating the expression relative to the bindings
introduced by the declaration, yielding this value as the overall value of
the let expression. An example will help clarify the idea:
let
val m : int = 3
val n : int = m*m
in
m*n
end
This expression has type int and value 27, as you can readily verify by
first calculating the bindings for m and n, then computing the value of m*n
relative to these bindings. The bindings for m and n are local to the
expression m*n, and are not accessible from outside the expression.
If the declaration part of a let expression eclipses earlier bindings, the
ambient bindings are restored upon completion of evaluation of the let
expression. Thus the following declarations bind r to 54:
val m : int = 2
val r : int =
let
val m : int = 3
val n : int = m*m
in
m*n
end * m
The binding of m is temporarily overridden during the evaluation of the
let expression, then restored upon completion of this evaluation.
3.5 Typing and Evaluation
To complete this chapter, let’s consider in more detail the
context-sensitivity of type checking and evaluation in the presence of
bindings. The key ideas are:
• Type checking must take account of the declared type of a variable.
• Evaluation must take account of the declared value of a variable.
This is achieved by maintaining environments for type checking and
evaluation. The type environment records the types of variables; the value
environment records their values. For example, after processing the
compound declaration
val m : int = 0
val x : real = Math.sqrt(2.0)
val c : char = #"a"
the type environment contains the information
val m : int
val x : real
val c : char
and the value environment contains the information
val m = 0
val x = 1.414
val c = #"a"
In a sense the value declarations have been divided in “half”, separating
the type from the value information.
Thus we see that value bindings have significance for both type checking
and evaluation. In contrast type bindings have significance only for
type checking, and hence contribute only to the type environment. A type
binding such as
type float = real
is recorded in its entirety in the type environment, and no change is made
to the value environment. Subsequently, whenever we encounter the type
constructor float in a type expression, it is replaced by real in accordance
with the type binding above.
In chapter 2 we said that a typing assertion has the form exp : typ, and
that an evaluation assertion has the form exp ⇓ val. While two-place typing
and evaluation assertions are sufficient for closed expressions (those
without variables), we must extend these relations to account for open
expressions (those with variables). Each must be equipped with an
environment recording information about type constructors and variables
introduced by declarations.
Typing assertions are generalized to have the form
typenv ⊢ exp : typ
where typenv is a type environment that records the bindings of type
constructors and the types of variables that may occur in exp. We may think
of typenv as a sequence of specifications of one of the following two forms:
1. type typvar = typ
2. val var : typ
Note that the second form does not include the binding for var, only its
type!
Evaluation assertions are generalized to have the form
valenv ⊢ exp ⇓ val
where valenv is a value environment that records the bindings of the
variables that may occur in exp. We may think of valenv as a sequence of
specifications of the form
val var = val
that bind the value val to the variable var.
Finally, we also need a new assertion, called type equivalence, that
determines when two types are equivalent, relative to a type environment.
This is written
typenv ⊢ typ_1 ≡ typ_2
Two types are equivalent iff they are the same when the type constructors
defined in typenv are replaced by their bindings.
The primary use of a type environment is to record the types of the
value variables that are available for use in a given expression. This is
expressed by the following axiom:
. . . val var : typ . . . ⊢ var : typ
(The turnstile symbol, “⊢”, is simply a punctuation mark separating the
type environment from the expression and its type.)
In words, if the speciﬁcation val var : typ occurs in the type environment,
then we may conclude that the variable var has type typ. This rule glosses
over an important point. In order to account for shadowing we require
that the rightmost speciﬁcation govern the type of a variable. That way
rebinding of variables with the same name but different types behaves as
expected.
Similarly, the evaluation relation must take account of the value
environment. Evaluation of variables is governed by the following axiom:
. . . val var = val . . . ⊢ var ⇓ val
Here again we assume that the val specification is the rightmost one
governing the variable var to ensure that the scoping rules are respected.
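The role of the rightmost specification can be modeled concretely. The
sketch below represents a value environment as a list of variable and
value pairs; it is an illustrative model only, not the way an ML
implementation is required to work. It is restricted to integer values,
and the names valenv, empty, extend, and lookup are all hypothetical. The
list is kept with the most recent binding first, so the head of the list
plays the part of the rightmost specification.

```sml
(* A model value environment: most recent binding at the front. *)
type valenv = (string * int) list

val empty : valenv = nil

(* Processing "val x = v" adds a binding without altering earlier ones. *)
fun extend (env : valenv) (x : string, v : int) : valenv = (x, v) :: env

(* Evaluating a variable finds its most recent binding, if any. *)
fun lookup (nil : valenv) (x : string) : int option = NONE
  | lookup ((y, v) :: rest) x = if x = y then SOME v else lookup rest x

val env0 = extend empty ("m", 2)
val env1 = extend env0 ("n", 9)
val env2 = extend env1 ("m", 3)    (* shadows the earlier m *)

val mNow = lookup env2 "m"
val mBefore = lookup env1 "m"
```

Here mNow is SOME 3 and mBefore is SOME 2: the new binding governs in
env2, while the environment env1 in force before the shadowing is
unchanged.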
The role of the type equivalence assertion is to ensure that type con
structors always stand for their bindings. This is expressed by the follow
ing axiom:
. . . type typvar = typ . . . ¬ typvar ≡ typ
Once again, the rightmost speciﬁcation for typvar governs the assertion.
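As a small illustration of type equivalence, consider the sketch below. The type constructor count and the variable names are hypothetical, introduced only for this example.

```sml
type count = int               (* count stands for int *)
val n : count = 3
val m : int = n + 1            (* accepted, since count is equivalent to int here *)
type count = string            (* the rightmost specification now governs count *)
val label : count = "three"    (* count is now equivalent to string *)
```

The variable n retains the type int after count is rebound, since the equivalence was resolved when n was declared.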
3.6 Sample Code
Here is the complete code for this chapter.
Chapter 4
Functions
4.1 Functions as Templates
So far we just have the means to calculate the values of expressions, and to
bind these values to variables for future reference. In this chapter we will
introduce the ability to abstract the data from a calculation, leaving behind
the bare pattern of the calculation. This pattern may then be instantiated
as often as you like so that the calculation may be repeated with speciﬁed
data values plugged in.
For example, consider the expression 2*(3+4). The data might be taken
to be the values 2, 3, and 4, leaving behind the pattern □ * (□ + □),
with “holes” where the data used to be. We might equally well take the
data to just be 2 and 3, and leave behind the pattern □ * (□ + 4). Or
we might even regard * and + as the data, leaving 2 □ (3 □ 4) as the
pattern! What is important is that a complete expression can be recovered
by filling in the holes with chosen data.
Since a pattern can contain many different holes that can be independently
instantiated, it is necessary to give names to the holes so that instantiation
consists of plugging in a given value for all occurrences of a name
in an expression. These names are, of course, just variables, and instantiation
is just the process of substituting a value for all occurrences of a
variable in a given expression. A pattern may therefore be viewed as a
function of the variables that occur within it; the pattern is instantiated by
applying the function to argument values.
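Looking ahead to notation introduced later in the text (function expressions and tuple patterns), the three ways of abstracting 2*(3+4) might be sketched as follows. The names pattern3, pattern2, and patternOps are assumptions made for this illustration.

```sml
(* Abstract all three numbers: the pattern with three named holes. *)
val pattern3 = fn (a:int, b:int, c:int) => a * (b + c)
(* Abstract only 2 and 3, leaving 4 fixed. *)
val pattern2 = fn (a:int, b:int) => a * (b + 4)
(* Abstract the two operators instead of the numbers. *)
val patternOps =
  fn (f : int * int -> int, g : int * int -> int) => f (2, g (3, 4))

(* Each instantiation recovers the original expression 2*(3+4). *)
val v3 = pattern3 (2, 3, 4)
val v2 = pattern2 (2, 3)
val vOps = patternOps (op *, op +)
```

Filling the holes with the original data yields the same value in all three cases.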
This view of functions is similar to our experience from high school
algebra. In algebra we manipulate polynomials such as x^2 + 2x + 1 as
a form of expression denoting a real number, with the variable x representing
a fixed, but unknown, quantity. (Indeed, variables in algebra are
sometimes called unknowns, or indeterminates.) It is also possible to think
of a polynomial as a function on the real line: given a real number x, a
polynomial determines a real number y computed by the given combination
of arithmetic operations. Indeed, we sometimes write equations
such as f(x) = x^2 + 2x + 1, to stand for the function f determined by
the polynomial. In the univariate case we can get away with just writing
the polynomial for the function, but in the multivariate case we must be
more careful: we may regard the polynomial x^2 + 2xy + y^2 to be a function
of x, a function of y, or a function of both x and y. In these cases
we write f(x) = x^2 + 2xy + y^2 when x varies and y is held fixed,
g(y) = x^2 + 2xy + y^2 when y varies for fixed x, and
h(x, y) = x^2 + 2xy + y^2 when both vary jointly.
In algebra it is usually left implicit that the variables x and y range
over the real numbers, and that f , g, and h are functions on the real line.
However, to be fully explicit, we sometimes write something like
f : R → R : x ∈ R ↦ x^2 + 2x + 1
to indicate that f is a function on the reals sending x ∈ R to x^2 + 2x + 1 ∈ R.
This notation has the virtue of separating the name of the function, f,
from the function itself, the mapping that sends x ∈ R to x^2 + 2x + 1.
It also emphasizes that functions are a kind of “value” in mathematics
(namely, a certain set of ordered pairs), and that the variable f is bound to
that value (i.e., that set) by the declaration. This viewpoint is especially
important once we consider operators, such as the differential operator, that
map functions to functions. For example, if f is a differentiable function
on the real line, the function D f is its ﬁrst derivative, another function on
the real line. In the case of the function f deﬁned above the function D f
sends x ∈ R to 2x + 2.
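To make the idea of an operator on functions concrete, here is a rough numerical sketch of D in ML, using notation introduced in the next section. It approximates the derivative by a central difference rather than computing it symbolically; the step size h and the names D and Df are assumptions of this sketch.

```sml
val h : real = 1.0E~4                       (* hypothetical step size *)
fun f (x:real):real = x * x + 2.0 * x + 1.0

(* D maps a function on the reals to an approximation of its derivative. *)
fun D (g : real -> real) : real -> real =
  fn (x:real) => (g (x + h) - g (x - h)) / (2.0 * h)

val Df : real -> real = D f
val slope : real = Df 1.0                   (* close to 2*1 + 2 = 4 *)
```

The point is only that D takes a function as argument and yields a function as result, just as the differential operator does.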
4.2 Functions and Application
The treatment of functions in ML is very similar, except that we stress
the algorithmic aspects of functions (how they determine values from arguments),
as well as the extensional aspects (what they compute). As in
mathematics, a function in ML is a kind of value, namely a value of function
type of the form typ -> typ'. The type typ is the domain type (the type of
arguments) of the function, and typ' is its range type (the type of its results).
We compute with a function by applying it to an argument value of its domain
type and calculating the result, a value of its range type. Function
application is indicated by juxtaposition: we simply write the argument
next to the function.
The values of function type consist of primitive functions, such as addition
and square root, and function expressions, which are also called lambda
expressions,[1] of the form
fn var : typ => exp
The variable var is called the parameter, and the expression exp is called
the body. The function expression has type typ -> typ' provided that exp has
type typ' under the assumption that the parameter var has the type typ.
To apply such a function expression to an argument value val, we add
the binding
val var = val
to the value environment, and evaluate exp, obtaining a value val'. Then
the value binding for the parameter is removed, and the result value, val',
is returned as the value of the application.
For example, Math.sqrt is a primitive function of type real -> real that
may be applied to a real number to obtain its square root. Thus the
expression Math.sqrt 2.0 evaluates to 1.414 (approximately). We can,
if we wish, parenthesize the argument, writing Math.sqrt (2.0) for the
sake of clarity; this is especially useful for expressions such as
Math.sqrt (Math.sqrt 2.0). The square root function is built in. We may write the
fourth root function as the following function expression:
fn x : real => Math.sqrt (Math.sqrt x)
It may be applied to an argument by writing an expression such as
(fn x : real => Math.sqrt (Math.sqrt x)) (16.0),
which calculates the fourth root of 16.0. The calculation proceeds by binding
the variable x to the argument 16.0, then evaluating the expression
Math.sqrt (Math.sqrt x) in the presence of this binding. When evaluation
completes, we drop the binding of x from the environment, since it is
no longer needed.
[1] For purely historical reasons.
Notice that we did not give the fourth root function a name; it is an
“anonymous” function. We may give it a name using the declaration
forms introduced in chapter 3. For example, we may bind the fourth root
function to the variable fourthroot using the following declaration:
val fourthroot : real -> real =
fn x : real => Math.sqrt (Math.sqrt x)
We may then write fourthroot 16.0 to compute the fourth root of 16.0.
This notation for deﬁning functions quickly becomes tiresome, so ML
provides a special syntax for function bindings that is more concise and
natural. Instead of using the val binding above to deﬁne fourthroot, we
may instead write
fun fourthroot (x:real):real = Math.sqrt (Math.sqrt x)
This declaration has the same meaning as the earlier val binding, namely
it binds fn x:real => Math.sqrt(Math.sqrt x) to the variable fourthroot.
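The two forms of declaration can be compared side by side. In this sketch the primed name fourthroot' is introduced only so that both definitions can coexist.

```sml
(* The val form: bind a function expression to a variable. *)
val fourthroot : real -> real =
  fn x : real => Math.sqrt (Math.sqrt x)

(* The fun form: the same function, written more concisely. *)
fun fourthroot' (x:real):real = Math.sqrt (Math.sqrt x)

val a : real = fourthroot 16.0
val b : real = fourthroot' 16.0
```

Both applications compute the fourth root of 16.0, namely 2.0 up to floating-point rounding.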
It is important to note that function applications in ML are evaluated
according to the call-by-value rule: the arguments to a function are evaluated
before the function is called. Put in other terms, functions are defined
to act on values, rather than on unevaluated expressions. Thus, to evaluate
an expression such as fourthroot (2.0+2.0), we proceed as follows:
1. Evaluate fourthroot to the function value
   fn x : real => Math.sqrt (Math.sqrt x).
2. Evaluate the argument 2.0+2.0 to its value 4.0.
3. Bind x to the value 4.0.
4. Evaluate Math.sqrt (Math.sqrt x) to 1.414 (approximately).
   (a) Evaluate Math.sqrt to a function value (the primitive square
       root function).
   (b) Evaluate the argument expression Math.sqrt x to its value,
       approximately 2.0.
       i. Evaluate Math.sqrt to a function value (the primitive square
          root function).
       ii. Evaluate x to its value, 4.0.
       iii. Compute the square root of 4.0, yielding 2.0.
   (c) Compute the square root of 2.0, yielding 1.414.
5. Drop the binding for the variable x.
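The steps above can be mimicked with explicit value bindings, one per intermediate value. The names arg, inner, and outer are hypothetical and serve only to label the stages.

```sml
fun fourthroot (x:real):real = Math.sqrt (Math.sqrt x)

val arg : real = 2.0 + 2.0           (* step 2: the argument is evaluated first *)
val inner : real = Math.sqrt arg     (* step 4(b): the inner square root *)
val outer : real = Math.sqrt inner   (* step 4(c): the outer square root *)

val direct : real = fourthroot (2.0 + 2.0)
```

Under call-by-value the function body never sees the expression 2.0+2.0, only its value, so outer and direct are computed from the same intermediate values.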
Notice that we evaluate both the function and argument positions of an
application expression — both the function and argument are expressions
yielding values of the appropriate type. The value of the function position
must be a value of function type, either a primitive function or a lambda
expression, and the value of the argument position must be a value of the
domain type of the function. In this case the result value (if any) will be of
the range type of the function. Functions in ML are first-class, meaning that
they may be computed as the value of an expression. We are not limited to
applying only named functions, but rather may compute “new” functions
on the fly and apply these to arguments. This is a source of considerable
expressive power, as we shall see in the sequel.
Using similar techniques we may define functions with arbitrary domain
and range. For example, the following are all valid function declarations:
fun srev (s:string):string = implode (rev (explode s))
fun pal (s:string):string = s ^ (srev s)
fun double (n:int):int = n + n
fun square (n:int):int = n * n
fun halve (n:int):int = n div 2
fun is_even (n:int):bool = (n mod 2 = 0)
Thus pal "ot" evaluates to the string "otto", and is_even 4 evaluates to
true.
4.3 Binding and Scope, Revisited
A function expression of the form
fn var:typ => exp
binds the variable var within the body exp of the function. Unlike val
bindings, function expressions bind a variable without giving it a speciﬁc
value. The value of the parameter is only determined when the function
is applied, and then only temporarily, for the duration of the evaluation of
its body.
It is worth reviewing the rules for binding and scope of variables that
we introduced in chapter 3 in the presence of function expressions. As before
we adhere to the principle of static scope, according to which variables
are taken to refer to the nearest enclosing binding of that variable, whether
by a val binding or by a fn expression.
Thus, in the following example, the occurrences of x in the body of the
function f refer to the parameter of f, whereas the occurrences of x in the
body of g refer to the preceding val binding.
val x:real = 2.0
fun f(x:real):real = x+x
fun g(y:real):real = x+y
Local val bindings may shadow parameters, as well as other val bindings.
For example, consider the following function declaration:
fun h(x:real):real =
let val x:real = 2.0 in x+x end * x
The inner binding of x by the val declaration shadows the parameter x of
h, but only within the body of the let expression. Thus the last occurrence
of x refers to the parameter of h, whereas the preceding two occurrences
refer to the inner binding of x to 2.0.
The phrases “inner” and “outer” binding refer to the logical structure,
or abstract syntax of an expression. In the preceding example, the body
of h lies “within” the scope of the parameter x, and the expression x+x
lies within the scope of the val binding for x. Since the occurrences of x
within the body of the let lie within the scope of the inner val binding,
they are taken to refer to that binding, rather than to the parameter. On
the other hand the last occurrence of x does not lie within the scope of the
val binding, and hence refers to the parameter of h.
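Applying the functions from this section to a sample argument makes the resolution of each occurrence of x visible. The argument 5.0 and the result names are chosen arbitrarily for this sketch.

```sml
val x : real = 2.0
fun f (x:real):real = x + x          (* both x's are the parameter *)
fun g (y:real):real = x + y          (* x is the outer val binding *)
fun h (x:real):real =
  let val x:real = 2.0 in x+x end * x  (* inner x's are 2.0; last x is the parameter *)

val fv : real = f 5.0    (* 5.0 + 5.0 *)
val gv : real = g 5.0    (* 2.0 + 5.0 *)
val hv : real = h 5.0    (* (2.0 + 2.0) * 5.0 *)
```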
In general the names of parameters do not matter; we can rename them
at will without affecting the meaning of the program, provided that we
simultaneously (and consistently) rename the binding occurrence and all
uses of that variable. Thus the functions f and g below are completely
equivalent to each other:
fun f(x:int):int = x*x
fun g(y:int):int = y*y
A parameter is just a placeholder; its name is not important.
Our ability to rename parameters is constrained by the static scoping
rule. We may rename a parameter to whatever we’d like, provided that
we don’t change the way in which uses of a variable are resolved. For
example, consider the following situation:
val x:real = 2.0
fun h(y:real):real = x+y
The parameter y to h may be renamed to z without affecting its meaning.
However, we may not rename it to x, for doing so changes its meaning!
That is, the function
fun h'(x:real):real = x+x
does not have the same meaning as h, because now both occurrences of x
in the body of h' refer to the parameter, whereas in h the variable x refers
to the outer val binding and the variable y refers to the parameter.
While this may seem like a minor technical issue, it is essential that you
master these concepts now, for they play a central, and rather subtle, role
later on.
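The difference between a harmless renaming and a meaning-changing one shows up as soon as the functions are applied. In this sketch hz is a hypothetical name for the version of h with its parameter renamed to z.

```sml
val x : real = 2.0
fun h  (y:real):real = x + y     (* x is the outer binding, y the parameter *)
fun hz (z:real):real = x + z     (* renaming y to z preserves the meaning *)
fun h' (x:real):real = x + x     (* renaming y to x captures the outer x *)

val r1 : real = h 5.0      (* 2.0 + 5.0 *)
val r2 : real = hz 5.0     (* 2.0 + 5.0 *)
val r3 : real = h' 5.0     (* 5.0 + 5.0 *)
```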
4.4 Sample Code
Here is the complete code for this chapter.
Chapter 5
Products and Records
5.1 Product Types
A distinguishing feature of ML is that aggregate data structures, such as
tuples, lists, arrays, or trees, may be created and manipulated with ease.
In contrast to most familiar languages it is not necessary in ML to be concerned
with allocation and deallocation of data structures, nor with any
particular representation strategy involving, say, pointers or address arithmetic.
Instead we may think of data structures as first-class values, on a
par with every other value in the language. Just as it is unnecessary to
think about “allocating” integers to evaluate an arithmetic expression, it is
unnecessary to think about allocating more complex data structures such
as tuples or lists.
5.1.1 Tuples
This chapter is concerned with the simplest form of aggregate data structure,
the n-tuple. An n-tuple is a finite ordered sequence of values of the
form
(val_1,...,val_n),
where each val_i is a value. A 2-tuple is usually called a pair, a 3-tuple a
triple, and so on.
An n-tuple is a value of a product type of the form
typ_1 * ... * typ_n.
Values of this type are n-tuples of the form
(val_1,...,val_n),
where val_i is a value of type typ_i (for each 1 ≤ i ≤ n).
Thus the following are well-formed bindings:
val pair : int * int = (2, 3)
val triple : int * real * string = (2, 2.0, "2")
val quadruple
  : int * int * real * real
  = (2,3,2.0,3.0)
val pair_of_pairs
  : (int * int) * (real * real)
  = ((2,3),(2.0,3.0))
The nesting of parentheses matters! A pair of pairs is not the same as
a quadruple, so the last two bindings are of distinct values with distinct
types.
There are two limiting cases, n = 0 and n = 1, that deserve special
attention. A 0-tuple, which is also known as a null tuple, is the empty
sequence of values, (). It is a value of type unit, which may be thought of
as the 0-tuple type.[1] The null tuple type is surprisingly useful, especially
when programming with effects. On the other hand there seems to be no
particular use for 1-tuples, and so they are absent from the language.
As a convenience, ML also provides a general tuple expression of the
form
(exp_1,...,exp_n),
where each exp_i is an arbitrary expression, not necessarily a value. Tuple
expressions are evaluated from left to right, so that the above tuple expression
evaluates to the tuple value
(val_1,...,val_n),
provided that exp_1 evaluates to val_1, exp_2 evaluates to val_2, and so on. For
example, the binding
val pair : int * int = (1+1, 5-2)
binds the value (2, 3) to the variable pair.
[1] In Java (and other languages) the type unit is misleadingly written void, which
suggests that the type has no members, but in fact it has exactly one!
Strictly speaking, it is not essential to have tuple expressions as a primitive
notion in the language. Rather than write
(exp_1,...,exp_n),
with the (implicit) understanding that the exp_i's are evaluated from left to
right, we may instead write
let val x1 = exp_1
    val x2 = exp_2
    ...
    val xn = exp_n
in (x1,...,xn) end
which makes the evaluation order explicit.
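For a concrete instance of this translation, take the pair expression with components 1+1 and 5-2. The name pair' is introduced here only to hold the translated form.

```sml
(* The tuple expression, with implicit left-to-right evaluation. *)
val pair : int * int = (1+1, 5-2)

(* The same computation with the evaluation order made explicit. *)
val pair' : int * int =
  let val x1 = 1+1
      val x2 = 5-2
  in (x1, x2) end
```

Both bindings yield the pair (2, 3).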
5.1.2 Tuple Patterns
One of the most powerful, and distinctive, features of ML is the use of
pattern matching to access components of aggregate data structures. For
example, suppose that val is a value of type
(int * string) * (real * char)
and we wish to retrieve the ﬁrst component of the second component of
val, a value of type real. Rather than explicitly “navigate” to this position
to retrieve it, we may simply use a generalized form of value binding in
which we select that component using a pattern:
val ((_, _), (r:real, _)) = val
The left-hand side of the val binding is a tuple pattern that describes a
pair of pairs, binding the ﬁrst component of the second component to the
variable r. The underscores indicate “don’t care” positions in the pattern
— their values are not bound to any variable. If we wish to give names to
all of the components, we may use the following value binding:
val ((i:int, s:string), (r:real, c:char)) = val
If we’d like we can even give names to the ﬁrst and second components of
the pair, without decomposing them into constituent parts:
val (is:int*string,rc:real*char) = val
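Here are the three bindings applied to a specific value. The value v and the name r2 are assumptions of this sketch; r2 is used in the second binding only to avoid rebinding r.

```sml
val v : (int * string) * (real * char) = ((1, "one"), (2.5, #"c"))

(* Select just the first component of the second component. *)
val ((_, _), (r:real, _)) = v

(* Name every component. *)
val ((i:int, s:string), (r2:real, c:char)) = v

(* Name the two inner pairs without decomposing them. *)
val (is:int*string, rc:real*char) = v
```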
The general form of a value binding is
val pat = exp,
where pat is a pattern and exp is an expression. A pattern is one of three
forms:
1. A variable pattern of the form var:typ.
2. A tuple pattern of the form (pat_1,...,pat_n), where each pat_i is a pattern.
   This includes as a special case the null-tuple pattern, ().
3. A wildcard pattern of the form _.
The type of a pattern is determined by an inductive analysis of the form
of the pattern:
1. A variable pattern var:typ is of type typ.
2. A tuple pattern (pat_1,...,pat_n) has type typ_1 * ... * typ_n, where each
   pat_i is a pattern of type typ_i. The null-tuple pattern () has type unit.
3. The wildcard pattern _ has any type whatsoever.
A value binding of the form
val pat = exp
is well-typed iff pat and exp have the same type; otherwise the binding is
ill-typed and is rejected.
For example, the following bindings are well-typed:
val (m:int, n:int) = (7+1,4 div 2)
val (m:int, r:real, s:string) = (7, 7.0, "7")
val ((m:int,n:int), (r:real, s:real)) = ((4,5),(3.1,2.7))
val (m:int, n:int, r:real, s:real) = (4,5,3.1,2.7)
In contrast, the following are ill-typed:
val (m:int,n:int,r:real,s:real) = ((4,5),(3.1,2.7))
val (m:int, r:real) = (7+1,4 div 2)
val (m:int, r:real) = (7, 7.0, "7")
Value bindings are evaluated using the bind-by-value principle discussed
earlier, except that the binding process is now more complex than before.
First, we evaluate the right-hand side of the binding to a value (if indeed
it has one). This happens regardless of the form of the pattern: the right-hand
side is always evaluated. Second, we perform pattern matching to
determine the bindings for the variables in the pattern.
The process of matching a value against a pattern is deﬁned by a set
of rules for reducing bindings with complex patterns to a set of bindings
with simpler patterns, stopping once we reach a binding with a variable
pattern. The rules are as follows:
1. The variable binding val var = val is irreducible.
2. The wildcard binding val _ = val is discarded.
3. The tuple binding
   val (pat_1,...,pat_n) = (val_1,...,val_n)
   is reduced to the set of n bindings
   val pat_1 = val_1
   ...
   val pat_n = val_n
   In the case that n = 0 the tuple binding is simply discarded.
These simpliﬁcations are repeated until all bindings are irreducible, which
leaves us with a set of variable bindings that constitute the result of pattern
matching.
For example, evaluation of the binding
val ((m:int,n:int), (r:real, s:real)) = ((2,3),(2.0,3.0))
proceeds as follows. First, we decompose this binding into the following two
bindings:
val (m:int, n:int) = (2,3)
and (r:real, s:real) = (2.0,3.0).
Then we decompose each of these bindings in turn, resulting in the following
set of four atomic bindings:
val m:int = 2
and n:int = 3
and r:real = 2.0
and s:real = 3.0
At this point the pattern-matching process is complete.
5.2 Record Types
Tuples are most useful when the number of positions is small. When the
number of components grows beyond a small number, it becomes difficult
to remember which position plays which role. In that case it is more natural
to attach a label to each component of the tuple that mediates access to
it. This is the notion of a record type.
A record type has the form
{lab_1:typ_1,...,lab_n:typ_n},
where n ≥ 0, and all of the labels lab_i are distinct. A record value has the
form
{lab_1=val_1,...,lab_n=val_n},
where val_i has type typ_i. A record pattern has the form
{lab_1=pat_1,...,lab_n=pat_n}
which has type
{lab_1:typ_1,...,lab_n:typ_n}
provided that each pat_i has type typ_i.
A record value binding of the form
val {lab_1=pat_1,...,lab_n=pat_n} = {lab_1=val_1,...,lab_n=val_n}
is decomposed into the following set of bindings
val pat_1 = val_1
and ...
and pat_n = val_n.
Since the components of a record are identiﬁed by name, not position,
the order in which they occur in a record value or record pattern is not
important. However, in a record expression (in which the components may
not be fully evaluated), the ﬁelds are evaluated from left to right in the
order written, just as for tuple expressions.
Here are some examples to help clarify the use of record types. First,
let us deﬁne the record type hyperlink as follows:
type hyperlink =
  { protocol : string,
    address : string,
    display : string }
The record binding
val mailto_rwh : hyperlink =
  { protocol="mailto",
    address="rwh@cs.cmu.edu",
    display="Robert Harper" }
defines a variable of type hyperlink. The record binding
val { protocol=prot, display=disp, address=addr } = mailto_rwh
decomposes into the three variable bindings
val prot = "mailto"
val addr = "rwh@cs.cmu.edu"
val disp = "Robert Harper"
which extract the values of the fields of mailto_rwh.
Using wild cards we can extract selected fields from a record. For example,
we may write
val {protocol=prot, address=_, display=_} = mailto_rwh
to bind the variable prot to the protocol field of the record value mailto_rwh.
It is quite common to encounter record types with tens of fields. In
such cases even the wild card notation doesn't help much when it comes
to selecting one or two fields from such a record. For this we often use
ellipsis patterns in records, as illustrated by the following example.
val {protocol=prot,...} = intro_home
The pattern {protocol=prot,...} stands for the expanded pattern
{protocol=prot, address=_, display=_}
in which the elided fields are implicitly bound to wildcard patterns.
In general the ellipsis is replaced by as many wildcard bindings as are
necessary to ﬁll out the pattern to be consistent with its type. In order for
this to occur the compiler must be able to determine unambiguously the type of
the record pattern. Here the right-hand side of the value binding determines
the type of the pattern, which then determines which additional ﬁelds to
ﬁll in. In some situations the context does not disambiguate, in which case
you must supply additional type information, or avoid the use of ellipsis
notation.
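The following sketch shows both situations: one in which the right-hand side fixes the record type, and one in which an explicit type annotation is supplied. The function name protocolOf is hypothetical.

```sml
type hyperlink =
  { protocol : string, address : string, display : string }

val mailto_rwh : hyperlink =
  { protocol = "mailto", address = "rwh@cs.cmu.edu", display = "Robert Harper" }

(* The type of mailto_rwh determines which fields the ellipsis stands for. *)
val {protocol = prot, ...} = mailto_rwh

(* In a function parameter there is no right-hand side to consult, so we
   annotate the pattern with the record type. *)
fun protocolOf ({protocol = p, ...} : hyperlink) : string = p

val prot2 : string = protocolOf mailto_rwh
```

Without the annotation on the parameter of protocolOf, the compiler would have no way to know the remaining fields of the record.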
Finally, ML provides a convenient abbreviated form of record pattern
{lab_1,...,lab_n}
which stands for the pattern
{lab_1=var_1,...,lab_n=var_n}
where the variables var_i are variables with the same name as the corresponding
label lab_i. For example, the binding
val { protocol, address, display } = mailto_rwh
decomposes into the sequence of atomic bindings
val protocol = "mailto"
val address = "rwh@cs.cmu.edu"
val display = "Robert Harper"
This avoids the need to think up a variable name for each ﬁeld; we can just
make the label do “double duty” as a variable.
5.3 Multiple Arguments and Multiple Results
A function may bind more than one argument by using a pattern, rather
than a variable, in the argument position. Function expressions are generalized
to have the form
fn pat => exp
where pat is a pattern and exp is an expression. Application of such a function
proceeds much as before, except that the argument value is matched
against the parameter pattern to determine the bindings of zero or more
variables, which are then used during the evaluation of the body of the
function.
For example, we may make the following deﬁnition of the Euclidean
distance function:
val dist
  : real * real -> real
  = fn (x:real, y:real) => Math.sqrt (x*x + y*y)
This function may then be applied to a pair (a two-tuple!) of arguments to
yield the distance from the origin to the point they determine. For example,
dist (2.0,3.0) evaluates to (approximately) 3.606.
Using fun notation, the distance function may be defined more concisely
as follows:
fun dist (x:real, y:real):real = Math.sqrt (x*x + y*y)
The meaning is the same as the more verbose val binding given earlier.
Keyword parameter passing is supported through the use of record patterns.
For example, we may define the distance function using keyword
parameters as follows:
fun dist' {x=x:real, y=y:real} = Math.sqrt (x*x + y*y)
The expression dist' {x=2.0,y=3.0} invokes this function with the indicated
x and y values.
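Because record fields are identified by label, the caller may write the keyword arguments in either order. A minimal sketch, with argument values chosen so that the result is exact:

```sml
fun dist' {x=x:real, y=y:real} = Math.sqrt (x*x + y*y)

val d1 : real = dist' {x=3.0, y=4.0}
val d2 : real = dist' {y=4.0, x=3.0}   (* same record value, fields reordered *)
```

Both applications compute the square root of 25.0.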
Functions with multiple results may be thought of as functions yielding
tuples (or records). For example, we might compute two different notions
of distance between two points at once as follows:
fun dist2 (x:real, y:real):real*real
  = (Math.sqrt (x*x+y*y), Real.abs (x-y))
Notice that the result type is a pair, which may be thought of as two results.
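The caller can take the two results apart again with a tuple pattern. In this sketch the absolute value is computed with Real.abs from the Basis Library, and the names euclid and diff are hypothetical.

```sml
fun dist2 (x:real, y:real):real*real =
  (Math.sqrt (x*x + y*y), Real.abs (x - y))

(* Bind both results at once by matching against a pair pattern. *)
val (euclid:real, diff:real) = dist2 (3.0, 4.0)
```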
These examples illustrate a pleasing regularity in the design of ML.
Rather than introduce ad hoc notions such as multiple arguments, multiple
results, or keyword parameters, we make use of the general mechanisms
of tuples, records, and pattern matching.
It is sometimes useful to have a function to select a particular component
from a tuple or record (e.g., the third component or the component
with a given label). Such functions may be easily defined using pattern
matching. But since they arise so frequently, they are predefined in ML
using sharp notation. For any tuple type
typ_1 * ... * typ_n,
and each 1 ≤ i ≤ n, there is a function #i of type
typ_1 * ... * typ_n -> typ_i
defined as follows:
fun #i (_, ..., _, x, _, ..., _) = x
where x occurs in the ith position of the tuple (and there are underscores
in the other n − 1 positions).
Thus we may refer to the second field of a three-tuple val by writing
#2(val). It is bad style, however, to overuse the sharp notation; code is
generally clearer and easier to maintain if you use patterns wherever possible.
Compare, for example, the following definition of the Euclidean
distance function written using sharp notation with the original.
fun dist (p:real*real):real
= Math.sqrt((#1 p)*(#1 p)+(#2 p)*(#2 p))
You can easily see that this gets out of hand very quickly, leading to unreadable
code. Use of the sharp notation is strongly discouraged!
A similar notation is provided for record ﬁeld selection. The following
function #lab selects the component of a record with label lab.
fun #lab {lab=x,...} = x
Notice the use of ellipsis! Bear in mind the disambiguation requirement:
any use of #lab must be in a context sufficient to determine the full record
type of its argument.
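For completeness, here is a small sketch of sharp notation on a tuple and on a record. The function addressOf is hypothetical; its parameter is annotated so that the record type of the argument of #address is fully determined.

```sml
val triple : int * real * string = (2, 2.0, "2")
val second : real = #2 triple

type hyperlink =
  { protocol : string, address : string, display : string }
val mailto_rwh : hyperlink =
  { protocol = "mailto", address = "rwh@cs.cmu.edu", display = "Robert Harper" }

val proto : string = #protocol mailto_rwh

(* The annotation h : hyperlink supplies the full record type. *)
fun addressOf (h : hyperlink) : string = #address h
```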
5.4 Sample Code
Here is the complete code for this chapter.
Chapter 6
Case Analysis
6.1 Homogeneous and Heterogeneous Types
Tuple types have the property that all values of that type have the same
form (n-tuples, for some n determined by the type); they are said to be
homogeneous. For example, all values of type int*real are pairs whose
first component is an integer and whose second component is a real. Any
type-correct pattern will match any value of that type; there is no possibility
of failure of pattern matching. The pattern (x:int,y:real) is of
type int*real and hence will match any value of that type. On the other
hand the pattern (x:int,y:real,z:string) is of type int*real*string
and cannot be used to match against values of type int*real; attempting
to do so fails at compile time.
Other types have values of more than one form; they are said to be heterogeneous
types. For example, a value of type int might be 0, 1, ~1, . . . or
a value of type char might be #"a" or #"z". (Other examples of heterogeneous
types will arise later on.) Corresponding to each of the values of
these types is a pattern that matches only that value. Attempting to match
any other value against that pattern fails at execution time with an error
condition called a bind failure.
Here are some examples of pattern-matching against values of a heterogeneous
type:
val 0 = 1-1
val (0,x) = (1-1, 34)
val (0, #"0") = (2-1, #"0")
The ﬁrst two bindings succeed, the third fails. In the case of the second,
the variable x is bound to 34 after the match. No variables are bound in
the ﬁrst or third examples.
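A bind failure can be observed without aborting the program by anticipating a feature treated later in the text, exception handlers. The sketch below assumes the standard Bind exception; the names ok and failed and the fallback value ~1 are arbitrary. A compiler will typically warn that these bindings are not exhaustive.

```sml
(* The match succeeds, since 1-1 evaluates to 0; x is bound to 34. *)
val ok : int =
  let val (0, x) = (1-1, 34) in x end

(* The match fails, since 2-1 evaluates to 1; the handler supplies ~1. *)
val failed : int =
  (let val (0, x) = (2-1, 34) in x end) handle Bind => ~1
```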
6.2 Clausal Function Expressions
The importance of constant patterns becomes clearer once we consider
how to define functions over heterogeneous types. This is achieved in
ML using a clausal function expression whose general form is
fn pat_1 => exp_1
 | ...
 | pat_n => exp_n
Each pat_i is a pattern and each exp_i is an expression involving the variables
of the pattern pat_i. Each component pat=>exp is called a clause, or a rule.
The entire assembly of rules is called a match.
The typing rules for matches ensure consistency of the clauses. Specifically,
there must exist types typ_1 and typ_2 such that
1. Each pattern pat_i has type typ_1.
2. Each expression exp_i has type typ_2, given the types of the variables in
   pattern pat_i.
If these requirements are satisfied, the function has the type typ_1 -> typ_2.
Application of a clausal function to a value val proceeds by considering
the clauses in the order written. At stage i, where 1 ≤ i ≤ n, the argument
value val is matched against the pattern pat_i; if the pattern match succeeds,
evaluation continues with the evaluation of expression exp_i, with the variables
of pat_i replaced by their values as determined by pattern matching.
Otherwise we proceed to stage i + 1. If no pattern matches (i.e., we reach
stage n + 1), then the application fails with an execution error called a
match failure.
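The staged process can be seen in a deliberately incomplete clausal function. This sketch assumes the standard Match exception and uses an exception handler, a feature treated later in the text; the function name and strings are hypothetical. A compiler will typically warn that the match is not exhaustive.

```sml
val name : int -> string =
  fn 0 => "zero"
   | 1 => "one"

(* Stage 1 fails (1 does not match 0); stage 2 succeeds. *)
val a : string = name 1

(* Both stages fail, so the application raises Match. *)
val b : string = (name 2) handle Match => "no clause matched"
```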
Here’s an example. Consider the following clausal function:
val recip : int -> int =
  fn 0 => 0 | n:int => 1 div n
This defines an integer-valued reciprocal function on the integers, where
the reciprocal of 0 is arbitrarily deﬁned to be 0. The function has two
clauses, one for the argument 0, the other for nonzero arguments n. (Note
that n is guaranteed to be nonzero because the patterns are considered in
order: we reach the pattern n:int only if the argument fails to match the
pattern 0.)
The fun notation is also generalized so that we may deﬁne recip using
the following more concise syntax:
fun recip 0 = 0
  | recip (n:int) = 1 div n
One annoying thing to watch out for is that the fun form uses an equal
sign to separate the pattern from the expression in a clause, whereas the
fn form uses a double arrow.
Case analysis on the values of a heterogeneous type is performed by
application of a clausally-defined function. The notation
case exp
of pat_1 => exp_1
 | ...
 | pat_n => exp_n
is short for the application
(fn pat_1 => exp_1
 | ...
 | pat_n => exp_n)
exp.
Evaluation proceeds by first evaluating exp, then matching its value
successively against the patterns in the match until one succeeds, and
continuing with evaluation of the corresponding expression. The case
expression fails if no pattern succeeds in matching the value.
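As a small sketch of this abbreviation, the two functions below should behave identically; describe and describe' are hypothetical names, and the list of test arguments is arbitrary:

```sml
(* Case analysis written with the case notation. *)
fun describe (n : int) : string =
  case n
    of 0 => "zero"
     | 1 => "one"
     | _ => "many"

(* The application of a clausal function that the case expression
   abbreviates. *)
fun describe' (n : int) : string =
  (fn 0 => "zero"
    | 1 => "one"
    | _ => "many") n

(* Compare the two on a few sample arguments. *)
val same = List.all (fn n => describe n = describe' n) [0, 1, 2, ~7]
```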
6.3 Booleans and Conditionals, Revisited
The type bool of booleans is perhaps the most basic example of a
heterogeneous type. Its values are true and false. Functions may be defined
on booleans using clausal deﬁnitions that match against the patterns true
and false.
For example, the negation function may be defined clausally as follows:
fun not true = false
  | not false = true
The conditional expression
if exp then exp_1 else exp_2
is shorthand for the case analysis
case exp
of true => exp_1
 | false => exp_2
which is itself shorthand for the application
(fn true => exp_1 | false => exp_2) exp.
The “short-circuit” conjunction and disjunction operations are defined
as follows. The expression exp_1 andalso exp_2 is short for
if exp_1 then exp_2 else false
and the expression exp_1 orelse exp_2 is short for
if exp_1 then true else exp_2.
You should expand these into case expressions and check that they behave
as expected. Pay particular attention to the evaluation order, and observe
that the call-by-value principle is not violated by these expressions.
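One way to carry out this check is sketched below. The functions my_andalso and my_orelse are hypothetical; because an ordinary function evaluates its arguments before it is called, the second operand is passed as a function of unit so that it is evaluated only if the case analysis selects it. This is an assumption of the sketch, not part of the definition of andalso and orelse:

```sml
(* Expansions of the short-circuit forms into case analyses. *)
fun my_andalso (b1 : bool, e2 : unit -> bool) : bool =
  case b1 of true => e2 () | false => false

fun my_orelse (b1 : bool, e2 : unit -> bool) : bool =
  case b1 of true => true | false => e2 ()

val zero = 0

(* 1 div zero would raise Div, but it is never evaluated in these uses,
   since the first operand already determines the result. *)
val a = false andalso (1 div zero = 0)
val b = my_andalso (false, fn () => 1 div zero = 0)
val c = true orelse (1 div zero = 0)
val d = my_orelse (true, fn () => 1 div zero = 0)
```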
6.4 Exhaustiveness and Redundancy
Matches are subject to two forms of “sanity check” as an aid to the ML
programmer. The first, called exhaustiveness checking, ensures that a
well-formed match covers its domain type in the sense that every value of the
domain must match one of its clauses. The second, called redundancy
checking, ensures that no clause of a match is subsumed by the clauses that
precede it. This means that the set of values covered by a clause in a match
must not be contained entirely within the set of values covered by the
preceding clauses of that match.
Redundant clauses are always a mistake: such a clause can never be
executed. Redundant rules often arise accidentally. For example, the
second rule of the following clausal function definition is redundant:
fun not True = false
  | not False = true
By capitalizing True we have turned it into a variable, rather than a
constant pattern. Consequently, every value matches the first clause, rendering
the second redundant.
Since the clauses of a match are considered in the order they are
written, redundancy checking is correspondingly order-sensitive. In
particular, changing the order of clauses in a well-formed, irredundant match
can make it redundant, as in the following example:
fun recip (n:int) = 1 div n
  | recip 0 = 0
The second clause is redundant because the ﬁrst matches any integer value,
including 0.
Inexhaustive matches may or may not be in error, depending on whether
the match might ever be applied to a value that is not covered by any
clause. Here is an example of a function with an inexhaustive match that
is plausibly in error:
fun is_numeric #"0" = true
  | is_numeric #"1" = true
  | is_numeric #"2" = true
  | is_numeric #"3" = true
  | is_numeric #"4" = true
  | is_numeric #"5" = true
  | is_numeric #"6" = true
  | is_numeric #"7" = true
  | is_numeric #"8" = true
  | is_numeric #"9" = true
When applied to, say, #"a", this function fails. Indeed, the function never
returns false for any argument!
Perhaps what was intended here is to include a catch-all clause at the
end:
fun is_numeric #"0" = true
  | is_numeric #"1" = true
  | is_numeric #"2" = true
  | is_numeric #"3" = true
  | is_numeric #"4" = true
  | is_numeric #"5" = true
  | is_numeric #"6" = true
  | is_numeric #"7" = true
  | is_numeric #"8" = true
  | is_numeric #"9" = true
  | is_numeric _ = false
The addition of a final catch-all clause renders the match exhaustive,
because any value not matched by the first ten clauses will surely be matched
by the eleventh.
Having said that, it is a very bad idea to simply add a catch-all clause
to the end of every match to suppress inexhaustiveness warnings from the
compiler. The exhaustiveness checker is your friend! Each such warning is
a suggestion to double-check that match to be sure that you’ve not made
a silly error of omission, but rather have intentionally left out cases that
are ruled out by the invariants of the program. In chapter 10 we will see
that the exhaustiveness checker is an extremely valuable tool for managing
code evolution.
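For this particular function one possible alternative, sketched here as an aside, is a single clause with a variable pattern, which is exhaustive by construction and so raises no question of a catch-all. The sketch assumes that the digits are contiguous in the character ordering (true of ASCII) and compares the result with Char.isDigit from the Standard ML Basis Library:

```sml
(* A variable pattern matches every character, so the match is
   exhaustive; the body tests whether c lies between #"0" and #"9". *)
fun is_numeric (c : char) : bool = #"0" <= c andalso c <= #"9"

(* Compare with the Basis Library function on a few sample characters,
   including the neighbours of the digit range. *)
val agree =
  List.all (fn c => is_numeric c = Char.isDigit c)
           [#"0", #"5", #"9", #"a", #" ", #"/", #":"]
```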
6.5 Sample Code
Here is the complete code for this chapter.
Chapter 7
Recursive Functions
So far we’ve only considered very simple functions (such as the reciprocal
function) whose value is computed by a simple composition of primitive
functions. In this chapter we introduce recursive functions, the principal
means of iterative computation in ML. Informally, a recursive function is
one that computes the result of a call by possibly making further calls to
itself. Obviously, to avoid infinite regress, some calls must return their
results without making any recursive calls. Those that do make recursive
calls must ensure that the arguments are, in some sense, “smaller” so that
the process will eventually terminate.
This informal description obscures a central point, namely the means
by which we may convince ourselves that a function computes the result
that we intend. In general we must prove that for all inputs of the
domain type, the body of the function computes the “correct” value of result
type. Usually the argument imposes some additional assumptions on the
inputs, called the preconditions. The correctness requirement for the
result is called a postcondition. Our burden is to prove that for every input
satisfying the preconditions, the body evaluates to a result satisfying the
postcondition. In fact we may carry out such an analysis for many different
pre- and postcondition pairs, according to our interest. For example,
the ML type checker proves that the body of a function yields a value of
the range type (if it terminates) whenever it is given an argument of the
domain type. Here the domain type is the precondition, and the range
type is the postcondition. In most cases we are interested in deeper
properties, examples of which we shall consider below.
To prove the correctness of a recursive function (with respect to given
pre- and postconditions) it is typically necessary to use some form of
inductive reasoning. The base cases of the induction correspond to those
cases that make no recursive calls; the inductive step corresponds to those
that do. The beauty of inductive reasoning is that we may assume that the
recursive calls work correctly when showing that a case involving recursive
calls is correct. We must separately show that the base cases satisfy
the given pre- and postconditions. Taken together, these two steps are
sufficient to establish the correctness of the function itself, by appeal to an
induction principle that justifies the particular pattern of recursion.
No doubt this all sounds fairly theoretical. The point of this chapter is
to show that it is also profoundly practical.
7.1 Self-Reference and Recursion
In order for a function to “call itself”, it must have a name by which it can
refer to itself. This is achieved by using a recursive value binding, which is
an ordinary value binding qualified by the keyword rec. The simplest form
of a recursive value binding is as follows:
val rec var:typ = val.
As in the non-recursive case, the left-hand side is a pattern, but here the
right-hand side must be a value. In fact the right-hand side must be a function
expression, since only functions may be defined recursively in ML. The
function may refer to itself by using the variable var.
Here’s an example of a recursive value binding:
val rec factorial : int->int =
fn 0 => 1 | n:int => n * factorial (n-1)
Using fun notation we may write the deﬁnition of factorial much more
clearly and concisely as follows:
fun factorial 0 = 1
  | factorial (n:int) = n * factorial (n-1)
There is obviously a close correspondence between this formulation of
factorial and the usual textbook deﬁnition of the factorial function in
terms of recursion equations:
0! = 1
n! = n × (n − 1)! (n > 0)
Recursive value bindings are typechecked in a manner that may, at
ﬁrst glance, seem paradoxical. To check that the binding
val rec var : typ = val
is well-formed, we ensure that the value val has type typ, assuming that
var has type typ. Since var refers to the value val itself, we are in effect
assuming what we intend to prove while proving it!
(Incidentally, since val is required to be a function expression, the type
typ will always be a function type.)
Let’s look at an example. To check that the binding for factorial
given above is well-formed, we assume that the variable factorial has
type int->int, then check that its definition, the function
fn 0 => 1 | n:int => n * factorial (n-1),
has type int->int. To do so we must check that each clause has type
int->int by checking for each clause that its pattern has type int and
that its expression has type int. This is clearly true for the first clause of
the definition. For the second, we assume that n has type int, then check
that n * factorial (n-1) has type int. This is so because of the rules for
the primitive arithmetic operations and because of our assumption that
factorial has type int->int.
How are applications of recursive functions evaluated? The rules are
almost the same as before, with one modiﬁcation. We must arrange that
all occurrences of the variable standing for the function are replaced by
the function itself before we evaluate the body. That way all references
to the variable standing for the function itself are indeed references to the
function itself!
Suppose that we have the following recursive function binding
val rec var : typ =
fn pat_1 => exp_1
 | ...
 | pat_n => exp_n
and we wish to apply var to a value val of its domain type. As before, we
consider each clause in turn, until we find the first pattern pat_i matching val.
We proceed, as before, by evaluating exp_i, replacing the variables in pat_i by
the bindings determined by pattern matching, but, in addition, we replace
all occurrences of var by its binding in exp_i before continuing evaluation.
For example, to evaluate factorial 3, we proceed by retrieving the
binding of factorial and evaluating
(fn 0=>1 | n:int => n*factorial(n-1))(3).
Considering each clause in turn, we find that the first doesn’t match, but
the second does. We therefore continue by evaluating its right-hand side,
the expression n * factorial(n-1), after replacing n by 3 and factorial
by its definition. We are left with the subproblem of evaluating the
expression
3 * (fn 0=>1 | n:int => n*factorial(n-1))(2)
Proceeding as before, we reduce this to the subproblem of evaluating
3 * (2 * (fn 0=>1 | n:int => n*factorial(n-1))(1)),
which reduces to the subproblem of evaluating
3 * (2 * (1 * (fn 0=>1 | n:int => n*factorial(n-1))(0))),
which reduces to
3 * (2 * (1 * 1)),
which then evaluates to 6, as desired.
Observe that the repeated substitution of factorial by its deﬁnition
ensures that the recursive calls really do refer to the factorial function itself.
Also observe that the size of the subproblems grows until there are no
more recursive calls, at which point the computation can complete. In
broad outline, the computation proceeds as follows:
1. factorial 3
2. 3 * factorial 2
3. 3 * 2 * factorial 1
4. 3 * 2 * 1 * factorial 0
5. 3 * 2 * 1 * 1
6. 3 * 2 * 1
7. 3 * 2
8. 6
Notice that the size of the expression first grows (in direct proportion to
the argument), then shrinks as the pending multiplications are completed.
This growth in expression size corresponds directly to a growth in
run-time storage required to record the state of the pending computation.
7.2 Iteration
The definition of factorial given above should be contrasted with the
following two-part definition:
fun helper (0,r:int) = r
  | helper (n:int,r:int) = helper (n-1,n*r)
fun factorial (n:int) = helper (n, 1)
First we define a “helper” function that takes two parameters, an integer
argument and an accumulator that records the running partial result of the
computation. The idea is that the accumulator re-associates the pending
multiplications in the evaluation trace given above so that they can be
performed prior to the recursive call, rather than after it completes. This
reduces the space required to keep track of those pending steps. Second, we
define factorial by calling helper with argument n and initial
accumulator value 1, corresponding to the product of zero terms (empty prefix).
As a matter of programming style, it is usual to conceal the deﬁnitions
of helper functions using a local declaration. In practice we would make
the following deﬁnition of the iterative version of factorial:
local
  fun helper (0,r:int) = r
    | helper (n:int,r:int) = helper (n-1,n*r)
in
  fun factorial (n:int) = helper (n,1)
end
This way the helper function is not visible; only the function of interest is
“exported” by the declaration.
The important thing to observe about helper is that it is iterative, or tail
recursive, meaning that the recursive call is the last step of evaluation of an
application of it to an argument. This means that the evaluation trace of a
call to helper with arguments (3,1) has the following general form:
1. helper (3, 1)
2. helper (2, 3)
3. helper (1, 6)
4. helper (0, 6)
5. 6
Notice that there is no growth in the size of the expression because there
are no pending computations to be resumed upon completion of the
recursive call. Consequently, there is no growth in the space required for an
application, in contrast to the first definition given above. Tail recursive
definitions are analogous to loops in imperative languages: they merely
iterate a computation, without requiring auxiliary storage.
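The same transformation applies to other simple recursions. As a sketch, here is a summation function written both ways; sum_to, sum_to_iter, and loop are hypothetical names introduced for this illustration, and both functions assume n ≥ 0:

```sml
(* Non-tail-recursive: the addition n + ... is pending during the
   recursive call, so the expression grows with n. *)
fun sum_to 0 = 0
  | sum_to (n : int) = n + sum_to (n - 1)

(* Tail-recursive: the accumulator r carries the partial sum, and the
   recursive call is the last step of each application. *)
local
  fun loop (0, r : int) = r
    | loop (n : int, r : int) = loop (n - 1, n + r)
in
  fun sum_to_iter (n : int) = loop (n, 0)
end

val s1 = sum_to 100        (* 5050 *)
val s2 = sum_to_iter 100   (* 5050 *)
```

The initial accumulator value is 0 here, the sum of zero terms, just as 1 is the product of zero terms in the factorial example.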
7.3 Inductive Reasoning
Time and space usage are important, but what is more important is that
the function compute the intended result. The key to the correctness of
a recursive function is an inductive argument establishing its correctness.
The critical ingredients are these:
1. An input-output specification of the intended behavior stating preconditions
on the arguments and a postcondition on the result.
2. A proof that the speciﬁcation holds for each clause of the function,
assuming that it holds for any recursive calls.
3. An induction principle that justiﬁes the correctness of the function as
a whole, given the correctness of its clauses.
We’ll illustrate the use of inductive reasoning by a graduated series
of examples. First consider the simple, non-tail recursive definition of
factorial given in section 7.1. One reasonable specification for factorial
is as follows:
1. Precondition: n ≥ 0.
2. Postcondition: factorial n evaluates to n!.
We are to establish the following statement of correctness of factorial:
if n ≥ 0, then factorial n evaluates to n!.
That is, we show that the preconditions imply the postcondition holds
for the result of any application. This is called a total correctness assertion
because it states not only that the postcondition holds of any result of
application, but, moreover, that every application in fact yields a result
(subject to the precondition on the argument).
In contrast, a partial correctness assertion does not insist on termination,
only that the postcondition holds whenever the application terminates.
This may be stated as the assertion
if n ≥ 0 and factorial n evaluates to p, then p = n!.
Notice that this statement is true of a function that diverges whenever it is
applied! In this sense a partial correctness assertion is weaker than a total
correctness assertion.
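Neither assertion says anything about arguments that violate the precondition: for n < 0 the recursion in factorial never reaches the clause for 0. One way to make the precondition explicit in code is sketched below; checked_factorial is a hypothetical wrapper, and the use of the Basis Library exception Domain is an assumption of this sketch (exceptions are treated later):

```sml
fun factorial 0 = 1
  | factorial (n : int) = n * factorial (n - 1)

(* Test the precondition n >= 0 before calling factorial. *)
fun checked_factorial (n : int) : int =
  if n < 0 then raise Domain else factorial n

val f5 = checked_factorial 5   (* 120 *)

(* A negative argument is rejected rather than passed on. *)
val rejected = (checked_factorial ~1; false) handle Domain => true
```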
Let us establish the total correctness of factorial using the pre- and
postconditions stated above. To do so, we apply the principle of
mathematical induction on the argument n. Recall that this means we are to
establish the specification for the case n = 0, and, assuming it to hold for
n ≥ 0, show that it holds for n + 1. The base case, n = 0, is trivial:
by definition factorial n evaluates to 1, which is 0!. Now suppose that
n = m + 1 for some m ≥ 0. By the inductive hypothesis we have that
factorial m evaluates to m! (since m ≥ 0), and so by definition factorial
n evaluates to
n × m! = (m + 1) × m!
= (m + 1)!
= n!,
as required. This completes the proof.
That was easy. What about the iterative definition of factorial? We
focus on the behavior of helper. A suitable specification is given as
follows:
1. Precondition: n ≥ 0.
2. Postcondition: helper (n, r) evaluates to n! × r.
To show the total correctness of helper with respect to this specification,
we once again proceed by mathematical induction on n. We leave it as an
exercise to give the details of the proof.
With this in hand it is easy to prove the correctness of factorial: if
n ≥ 0 then factorial n evaluates to the result of helper (n, 1), which
evaluates to n! × 1 = n!. This completes the proof.
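Testing is no substitute for the inductive proof, but the specification of helper can at least be spot-checked by running it. In the sketch below, spec_holds is a hypothetical function that uses the recursive factorial as the reference for n!, and the sample arguments are arbitrary:

```sml
(* Reference definition of n!, for n >= 0. *)
fun factorial 0 = 1
  | factorial (n : int) = n * factorial (n - 1)

fun helper (0, r : int) = r
  | helper (n : int, r : int) = helper (n - 1, n * r)

(* The postcondition: helper (n, r) evaluates to n! * r. *)
fun spec_holds (n : int, r : int) : bool =
  helper (n, r) = factorial n * r

(* Check the postcondition on a few arguments satisfying n >= 0. *)
val checked =
  List.all spec_holds [(0, 1), (0, 7), (1, 1), (3, 1), (4, 5), (6, ~2)]
```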
Helper functions correspond to lemmas; main functions correspond to
theorems. Just as we use lemmas to help us prove theorems, we use helper
functions to help us define main functions. The foregoing argument shows
that this is more than an analogy: it lies at the heart of good
programming style.
Here’s an example of a function defined by complete induction (or strong
induction), the Fibonacci function, defined on integers n ≥ 0:
(* for n>=0, fib n yields the nth Fibonacci number *)
fun fib 0 = 1
  | fib 1 = 1
  | fib (n:int) = fib (n-1) + fib (n-2)
The recursive calls are made not only on n-1, but also n-2, which is why
we must appeal to complete induction to justify the definition. This
definition of fib is very inefficient because it performs many redundant
computations: to compute fib n requires that we compute fib (n-1) and fib
(n-2). To compute fib (n-1) requires that we compute fib (n-2) a second
time, and fib (n-3). Computing fib (n-2) requires computing fib
(n-3) again, and fib (n-4). As you can see, there is considerable
redundancy here. It can be shown that the running time of fib is exponential in
its argument, which is quite awful.
Here’s a better solution: for each n ≥ 0 compute not only the nth
Fibonacci number, but also the (n − 1)st. (For n = 0 we define the
“−1st” Fibonacci number to be zero.) That way we can avoid redundant
recomputation, resulting in a linear-time algorithm. Here’s the code:
(* for n>=0, fib' n evaluates to (a, b), where
   a is the nth Fibonacci number, and
   b is the (n-1)st *)
fun fib' 0 = (1, 0)
  | fib' 1 = (1, 1)
  | fib' (n:int) =
    let
      val (a:int, b:int) = fib' (n-1)
    in
      (a+b, a)
    end
You might feel satisfied with this solution since it runs in time linear in
n. It turns out (see Graham, Knuth, and Patashnik, Concrete Mathematics
(Addison-Wesley 1989) for a derivation) that the recurrence
F_0 = 1
F_1 = 1
F_n = F_{n−1} + F_{n−2}
has a closed-form solution over the real numbers. This means that the
nth Fibonacci number can be calculated directly, without recursion, by
using floating point arithmetic. However, this is an unusual case. In most
instances recursively-defined functions have no known closed-form
solution, so that some form of iteration is inevitable.
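Returning to the linear-time version, the nth Fibonacci number is the first component of the pair computed by fib'. The sketch below repeats both definitions so that it stands alone; fast_fib is a hypothetical name, and the comparison against the exponential-time fib is limited to small arguments:

```sml
(* Exponential-time definition, used here as the reference. *)
fun fib 0 = 1
  | fib 1 = 1
  | fib (n : int) = fib (n - 1) + fib (n - 2)

(* Linear-time definition returning the nth and (n-1)st numbers. *)
fun fib' 0 = (1, 0)
  | fib' 1 = (1, 1)
  | fib' (n : int) =
      let
        val (a : int, b : int) = fib' (n - 1)
      in
        (a + b, a)
      end

(* Keep only the first component of the pair. *)
fun fast_fib (n : int) : int = #1 (fib' n)

val agree = List.all (fn n => fib n = fast_fib n) [0, 1, 2, 3, 10, 20]
val f10 = fast_fib 10   (* 89, with the convention F_0 = F_1 = 1 *)
```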
7.4 Mutual Recursion
It is often useful to deﬁne two functions simultaneously, each of which
calls the other (and possibly itself) to compute its result. Such functions
are said to be mutually recursive. Here’s a simple example to illustrate the
point, namely testing whether a natural number is odd or even. The most
obvious approach is to test whether the number is congruent to 0 mod
2, and indeed this is what one would do in practice. But to illustrate the
idea of mutual recursion we instead use the following inductive
characterization: 0 is even, and not odd; n > 0 is even iff n − 1 is odd; n > 0 is
odd iff n − 1 is even. This may be coded up using two mutually-recursive
procedures as follows:
fun even 0 = true
  | even n = odd (n-1)
and odd 0 = false
  | odd n = even (n-1)
Notice that even calls odd and odd calls even, so they are not deﬁnable
separately from one another. We join their deﬁnitions using the keyword
and to indicate that they are deﬁned simultaneously by mutual recursion.
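A short usage sketch follows, repeating the definitions so that it stands alone. The cross-check against n mod 2 is the congruence test mentioned above; the sample arguments are arbitrary natural numbers:

```sml
fun even 0 = true
  | even n = odd (n - 1)
and odd 0 = false
  | odd n = even (n - 1)

val e10 = even 10   (* true *)
val o10 = odd 10    (* false *)
val o7  = odd 7     (* true *)

(* Compare with the test one would use in practice. *)
val agree = List.all (fn n => even n = (n mod 2 = 0)) [0, 1, 2, 3, 8, 15]
```

As with factorial, both functions assume a non-negative argument; a negative argument never reaches either clause for 0.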
7.5 Sample Code
Here is the complete code for this chapter.
Chapter 8
Type Inference and Polymorphism
8.1 Type Inference
So far we’ve mostly written our programs in what is known as the
explicitly typed style. This means that whenever we’ve introduced a variable,
we’ve assigned it a type at its point of introduction. In particular every
variable in a pattern has a type associated with it. As you may have
noticed, this gets a little tedious after a while, especially when you’re using
clausal function definitions. A particularly pleasant feature of ML is that
it allows you to omit this type information whenever it can be determined
from context. This process is known as type inference since the compiler is
inferring the missing type information based on context.
For example, there is no need to give a type to the variable s in the
function
fn s:string => s ^ "\n".
The reason is that no other type for s makes sense, since s is used as an
argument to string concatenation. Consequently, you may write simply
fn s => s ^ "\n",
leaving ML to insert “:string” for you.
When is it allowable to omit this information? Almost always, with
very few exceptions. It is a deep, and important, result about ML that
missing type information can (almost) always be reconstructed completely
and unambiguously where it is omitted. This is called the principal typing
property of ML: whenever type information is omitted, there is always a
most general (i.e., least restrictive) way to recover the omitted type
information. If there is no way to recover the omitted type information, then the
expression is ill-typed. Otherwise there is a “best” way to fill in the blanks,
which will (almost) always be found by the compiler. This is an amazingly
useful, and widely underappreciated, property of ML. It means, for
example, that the programmer can enjoy the full benefits of a static type system
without paying any notational penalty whatsoever!
The prototypical example is the identity function, fn x=>x. The body
of the function places no constraints on the type of x, since it merely
returns x as the result without performing any computation on it. Since the
behavior of the identity function is the same for all possible choices of type
for its argument, it is said to be polymorphic. Therefore the identity function
has infinitely many types, one for each choice of the type of the parameter
x. Choosing the type of x to be typ, the type of the identity function is
typ->typ. In other words every type for the identity function has the form
typ->typ, where typ is the type of the argument.
Clearly there is a pattern here, which is captured by the notion of a type
scheme. A type scheme is a type expression involving one or more type
variables standing for an unknown, but arbitrary type expression. Type
variables are written 'a (pronounced “α”), 'b (pronounced “β”), 'c
(pronounced “γ”), etc. An instance of a type scheme is obtained by replacing
each of the type variables occurring in it with a type scheme, with the same
type scheme replacing each occurrence of a given type variable. For
example, the type scheme 'a->'a has instances int->int, string->string,
(int*int)->(int*int), and ('b->'b)->('b->'b), among infinitely many
others. However, it does not have the type int->string as an instance, since
we are constrained to replace all occurrences of a type variable by the same
type scheme. In contrast, the type scheme 'a->'b has both int->int and
int->string as instances since there are different type variables occurring
in the domain and range positions.
Type schemes are used to express the polymorphic behavior of
functions. For example, we may write fn x:'a=>x for the polymorphic
identity function of type 'a->'a, meaning that the behavior of the identity
function is independent of the type of x. Similarly, the behavior of the
function fn (x,y)=>x+1 is independent of the type of y, but constrains the
type of x to be int. This may be expressed using type schemes by
writing this function in the explicitly-typed form fn (x:int,y:'a)=>x+1 with
type int*'a->int.
In these examples we needed only one type variable to express the
polymorphic behavior of a function, but usually we need more than one.
For example, the function fn (x,y) => x constrains neither the type of x
nor the type of y. Consequently we may choose their types freely and
independently of one another. This may be expressed by writing this
function in the form fn (x:'a,y:'b)=>x with type scheme 'a*'b->'a. Notice
that while it is correct to assign the type 'a*'a->'a to this function, doing
so would be overly restrictive since the types of the two parameters need
not be the same. However, we could not assign the type 'a*'b->'c to this
function because the type of the result must be the same as the type of
the first parameter: it returns its first parameter when invoked! The type
scheme 'a*'b->'a precisely captures the constraints that must be
satisfied for the function to be type correct. It is said to be the most general or
principal type scheme for the function.
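A compiler can be used to confirm this. In the sketch below, fst and fst_same are hypothetical names; the first binding is left unannotated so that its principal type scheme is inferred, and each use then picks its own instance:

```sml
(* Inferred type scheme: 'a * 'b -> 'a. *)
val fst = fn (x, y) => x

val a : int    = fst (1, "one")            (* 'a = int,    'b = string *)
val b : string = fst ("two", 2.0)          (* 'a = string, 'b = real   *)
val c : bool   = fst (true, fn n => n + 1) (* 'b is a function type    *)

(* The less general type 'a * 'a -> 'a is accepted as an ascription,
   but then both components must have the same type. *)
val fst_same : 'a * 'a -> 'a = fn (x, y) => x
val d : int = fst_same (1, 2)
```

With the restricted ascription, an application such as fst_same (1, "one") would be rejected by the type checker.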
It is a remarkable fact about ML that every expression (with the
exception of a few pesky examples that we’ll discuss below) has a principal type
scheme. That is, there is (almost) always a best or most general way to infer
types for expressions that maximizes generality, and hence maximizes
flexibility in the use of the expression. Every expression “seeks its own depth”
in the sense that an occurrence of that expression is assigned a type that is
an instance of its principal type scheme determined by the context of use.
For example, if we write
(fn x=>x)(0),
the context forces the type of the identity function to be int->int, and if
we write
(fn x=>x)(fn x=>x)(0)
the context forces the instance (int->int)->(int->int) of the principal
type scheme for the identity at the first occurrence, and the instance int->int
for the second.
How is this achieved? Type inference is a process of constraint
satisfaction. First, the expression determines a set of equations governing the
relationships among the types of its subexpressions. For example, if a function
is applied to an argument, then a constraint equating the domain type of
the function with the type of the argument is generated. Second, the
constraints are solved using a process similar to Gaussian elimination, called
unification. The equations can be classified by their solution sets as follows:
1. Overconstrained: there is no solution. This corresponds to a type error.
2. Underconstrained: there are many solutions. There are two subcases:
ambiguous (due to overloading, which we will discuss further in
section 8.3), or polymorphic (there is a “best” solution).
3. Uniquely determined: there is precisely one solution. This corresponds
to a completely unambiguous type inference problem.
The free type variables in the solution to the system of equations may be
thought of as determining the “degrees of freedom” or “range of
polymorphism” of the type of an expression: the constraints are solvable for any
choice of types to substitute for these free type variables.
This description of type inference as a constraint satisfaction procedure
accounts for the notorious obscurity of type checking errors in ML. If a
program is not type correct, then the system of constraints associated with
it will not have a solution. The type inference procedure attempts to find
a solution to these constraints, and at some point discovers that it cannot
succeed. It is fundamentally impossible to attribute this inconsistency to
any particular constraint; all that can be said is that the constraint set as
a whole has no solution. The checker usually reports the first
unsatisfiable equation it encounters, but this may or may not correspond to the
“reason” (in the mind of the programmer) for the type error. The usual
method for finding the error is to insert sufficient type information to
narrow down the source of the inconsistency until the source of the difficulty
is uncovered.
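As a sketch of this method, consider a hypothetical function label whose parameter is used inconsistently. The ill-typed version is shown only in a comment; which of its two uses a given compiler reports is not specified here. The repaired version ascribes the intended types, so that any remaining error would be reported against the annotation:

```sml
(* Ill-typed as written, since x is used both as an int and as a string:

     fun label x = (x + 1, x ^ "!")

   The constraints "x is an int" and "x is a string" conflict, and the
   checker cannot know which of the two the programmer intended. *)

(* Ascribing types states the intention; the second component is then
   repaired by converting x with Int.toString from the Basis Library. *)
fun label (x : int) : int * string = (x + 1, Int.toString x ^ "!")

val result = label 41   (* (42, "41!") *)
```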
8.2 Polymorphic Deﬁnitions
There is an important interaction between polymorphic expressions and
value bindings that may be illustrated by the following example. Suppose
that we wish to bind the identity function to a variable I so that we may
refer to it by name. We’ve previously observed that the identity function
is polymorphic, with principal type scheme 'a->'a. This may be captured
by ascribing this type scheme to the variable I at the val binding. That is,
we may write
val I : 'a->'a = fn x=>x
to ascribe the type scheme 'a->'a to the variable I. (We may also write
fun I(x:'a):'a = x
for an equivalent binding of I.) Having done this, each use of I determines a
distinct instance of the ascribed type scheme 'a->'a. That is, both I 0 and I I
0 are well-formed expressions, the first assigning the type int->int to I,
the second assigning the types
(int->int)->(int->int)
and
int->int
to the two occurrences of I. Thus the variable I behaves precisely the same
as its definition, fn x=>x, in any expression where it is used.
As a convenience ML also provides a form of type inference on value
bindings that eliminates the need to ascribe a type scheme to the variable
when it is bound. If no type is ascribed to a variable introduced by a
val binding, then it is implicitly ascribed the principal type scheme of the
right-hand side. For example, we may write
val I = fn x=>x
or
fun I(x) = x
as a binding for the variable I. The type checker implicitly assigns the
principal type scheme, 'a->'a, of the binding to the variable I. In practice we
often allow the type checker to infer the principal type of a variable, but it
is often good form to explicitly indicate the intended type as a consistency
check and for documentation purposes.
The treatment of val bindings during type checking ensures that a
bound variable has precisely the same type as its binding. In other words
the type checker behaves as though all uses of the bound variable are
implicitly replaced by its binding before type checking. Since this may
involve replication of the binding, the meaning of a program is not
necessarily preserved by this transformation. (Think, for example, of any
expression that opens a window on your screen: if you replicate the expression
and evaluate it twice, it will open two windows. This is not the same as
evaluating it only once, which results in one window.)
To ensure semantic consistency, variables introduced by a val binding
are allowed to be polymorphic only if the right-hand side is a value. This
is called the value restriction on polymorphic declarations. For fun
bindings this restriction is always met since the right-hand side is
implicitly a lambda expression, which is a value. However, it might be thought
that the following declaration introduces a polymorphic variable of type
'a -> 'a, but in fact it is rejected by the compiler:
val J = I I
The reason is that the right-hand side is not a value; it requires
computation to determine its value. It is therefore ruled out as inadmissible
for polymorphism; the variable J may not be used polymorphically in the
remainder of the program. In this case the difficulty may be avoided by
writing instead
fun J x = I I x
because now the binding of J is a lambda, which is a value.
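The workaround can be tried directly. The sketch below assumes only the definition of I given earlier; the bindings n and s are hypothetical names used to show J at two different types.

```sml
val I = fn x => x

(* Eta-expanded: the binding of J is a function, hence a value,
   so J receives the polymorphic type 'a -> 'a. *)
fun J x = I I x

(* J may now be used at more than one type. *)
val n = J 5
val s = J "five"
```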
In some rare circumstances this is not possible, and some polymorphism
is lost. For example, the following declaration of a value of list
type (list types are introduced in chapter 9)
val l = nil @ nil
does not introduce an identifier with a polymorphic type, even though the
almost equivalent declaration
val l = nil
does do so. Since the right-hand side is a list, we cannot apply the “trick”
of defining l to be a function; we are stuck with a loss of polymorphism in
this case. This particular example is not very impressive, but occasionally
similar examples do arise in practice.
Why is the value restriction necessary? Later on, when we study mutable
storage, we’ll see that some restriction on polymorphism is essential
if the language is to be type safe. The value restriction is an easily
remembered sufficient condition for soundness, but as the examples above
illustrate, it is by no means necessary. The designers of ML were faced
with a choice of simplicity vs flexibility; in this case they opted for
simplicity at the expense of some expressiveness in the language.
8.3 Overloading
Type information cannot always be omitted. There are a few corner cases
that create problems for type inference, most of which arise because of
concessions that are motivated by long-standing, if dubious, notational
practices.
The main source of difficulty stems from overloading of arithmetic
operators. As a concession to long-standing practice in informal mathematics
and in many programming languages, the same notation is used for both
integer and floating point arithmetic operations. As long as we are
programming in an explicitly-typed style, this convention creates no
particular problems. For example, in the function
fn x:int => x+x
it is clear that integer addition is called for, whereas in the function
fn x:real => x+x
it is equally obvious that ﬂoating point addition is intended.
However, if we omit type information, then a problem arises. What are
we to make of the function
fn x => x+x ?
Does “+” stand for integer or floating point addition? There are two
distinct reconstructions of the missing type information in this example,
corresponding to the preceding two explicitly-typed programs. Which is the
compiler to choose?
When presented with such a program, the compiler has two choices:
1. Declare the expression ambiguous, and force the programmer to provide
enough explicit type information to resolve the ambiguity.
2. Arbitrarily choose a “default” interpretation, say the integer
arithmetic, that forces one interpretation or another.
Each approach has its advantages and disadvantages. Many compilers
choose the second approach, but issue a warning indicating that they have
done so. To avoid ambiguity, explicit type information is required from
the programmer.
The situation is actually a bit more subtle than the preceding discussion
implies. The reason is that the type inference process makes use of
the surrounding context of an expression to help resolve ambiguities. For
example, if the expression fn x=>x+x occurs in the following, larger
expression, there is in fact no ambiguity:
(fn x => x+x)(3)
Since the function is applied to an integer argument, there is no question
that the only possible resolution of the missing type information is to treat
x as having type int, and hence to treat + as integer addition.
The important question is: how much context is considered before the
situation is deemed ambiguous? The rule of thumb is that context is
considered up to the nearest enclosing function declaration. Consider
the following example:
let
val double = fn x => x+x
in
(double 3, double 4)
end
The function expression fn x=>x+x will be flagged as ambiguous, even
though its only uses are with integer arguments. The reason is that value
bindings are considered to be “units” of type inference for which all
ambiguity must be resolved before type checking continues. If your compiler
adopts the integer interpretation as default, the above program will be
accepted (with a warning), but the following one will be rejected:
let
val double = fn x => x+x
in
(double 3.0, double 4.0)
end
Finally, note that the following program must be rejected because no
resolution of the overloading of addition can render it meaningful:
let
val double = fn x => x+x
in
(double 3, double 3.0)
end
The ambiguity must be resolved at the val binding, which means that the
compiler must commit at that point to treating the addition operation as
either integer or ﬂoating point. No single choice can be correct, since we
subsequently use double at both types.
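One way to avoid the problem is to give each binding enough type information to fix a single meaning for addition. The following is a minimal sketch; doubleInt and doubleReal are hypothetical names.

```sml
(* Each binding carries enough type information to pick one addition. *)
val doubleInt  = fn x : int  => x + x   (* int -> int *)
val doubleReal = fn x : real => x + x   (* real -> real *)

(* Each function is then used only at the type it was committed to. *)
val pair = (doubleInt 3, doubleReal 3.0)
```

Because the overloading is resolved at each val binding, two separate bindings are needed if both interpretations are wanted.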
A closely related source of ambiguity arises from the “record elision”
notation described in chapter 5. Consider the function #name, defined by
fun #name {name=n:string, ...} = n
which selects the name field of a record. This definition is ambiguous
because the compiler cannot uniquely determine the domain type of the
function! Any of the following types are legitimate domain types for
#name, none of which is “best”:
{name:string}
{name:string,salary:real}
{name:string,salary:int}
{name:string,address:string}
Of course there are infinitely many such examples, none of which is clearly
preferable to the other. This function definition is therefore rejected as
ambiguous by the compiler; there is no one interpretation of the function
that suffices for all possible uses.
In chapter 5 we mentioned that functions such as #name are predefined
by the ML compiler, yet we just now claimed that such a function
definition is rejected as ambiguous. Isn’t this a contradiction? Not really,
because what happens is that each occurrence of #name is replaced by the
function
fn {name=n,...} => n
and then context is used to resolve the “local” ambiguity. This works well,
provided that the complete record type of the arguments to #name can be
determined from context. If not, the uses are rejected as ambiguous. Thus,
the following expression is well-typed
fn r : {name:string,address:string,salary:int} =>
  (#name r, #address r)
because the record type of r is explicitly given. If the type of r were
omitted, the expression would be rejected as ambiguous (unless the context
resolves the ambiguity).
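The same point can be made with a named record type. In this sketch the type name employee, the function contact, and the sample field values are all assumptions made for illustration.

```sml
type employee = {name : string, address : string, salary : int}

(* The annotation on r fixes the complete record type, so both
   selectors are unambiguous. *)
fun contact (r : employee) = (#name r, #address r)

val c = contact {name = "Ada", address = "12 Elm St", salary = 100}
```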
8.4 Sample Code
Here is the code for this chapter.
Chapter 9
Programming with Lists
9.1 List Primitives
In chapter 5 we noted that aggregate data structures are especially easy
to handle in ML. In this chapter we consider another important aggregate
type, the list type. In addition to being an important form of aggregate
type it also illustrates two other general features of the ML type system:
1. Type constructors, or parameterized types. The type of a list reveals the
type of its elements.
2. Recursive types. The set of values of a list type is given by an
inductive definition.
Informally, the values of type typ list are the finite lists of values of
type typ. More precisely, the values of type typ list are given by an
inductive definition, as follows:
1. nil is a value of type typ list.
2. if h is a value of type typ, and t is a value of type typ list, then h::t
is a value of type typ list.
3. Nothing else is a value of type typ list.
The type expression typ list is a postfix notation for the application
of the type constructor list to the type typ. Thus list is a kind of
“function” mapping types to types: given a type typ, we may apply list to it
to get another type, written typ list. The forms nil and :: are the value
constructors of type typ list. The nullary (no argument) constructor nil
may be thought of as the empty list. The binary (two argument) constructor
:: constructs a non-empty list from a value h of type typ and another
value t of type typ list; the resulting value, h::t, of type typ list, is
pronounced “h cons t” (for historical reasons). We say that “h is cons’d onto
t”, that h is the head of the list, and that t is its tail.
The definition of the values of type typ list given above is an example
of an inductive definition. The type is said to be recursive because this
definition is “self-referential” in the sense that the values of type typ list
are defined in terms of (other) values of the same type. This is especially
clear if we examine the types of the value constructors for the type typ list:
val nil : typ list
val (op ::) : typ * typ list -> typ list
The notation op :: is used to refer to the :: operator as a function, rather
than to use it to form a list, which requires infix notation.
Two things are notable here:
1. The :: operation takes as its second argument a value of type typ
list, and yields a result of type typ list. This self-referential aspect
is characteristic of an inductive definition.
2. Both nil and op :: are polymorphic in the type of the underlying
elements of the list. Thus nil is the empty list of type typ list for
any element type typ, and op :: constructs a non-empty list
independently of the type of the elements of that list.
It is easy to see that a value val of type typ list has the form
val_1 :: (val_2 :: ( ... :: (val_n :: nil) ... ))
for some n ≥ 0, where val_i is a value of type typ for each 1 ≤ i ≤ n.
For according to the inductive definition of the values of type typ list,
the value val must either be nil, which is of the above form, or
val_1 :: val', where val' is a value of type typ list. By induction val'
has the form
(val_2 :: ( ... :: (val_n :: nil) ... ))
and hence val again has the specified form.
By convention the operator :: is right-associative, so we may omit the
parentheses and just write
val_1 :: val_2 :: ... :: val_n :: nil
as the general form of val of type typ list. This may be further
abbreviated using list notation, writing
[ val_1, val_2, ..., val_n ]
for the same list. This notation emphasizes the interpretation of lists as
finite sequences of values, but it obscures the fundamental inductive
character of lists as being built up from nil using the :: operation.
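To make the three notations concrete, here is a short sketch that writes one list of integers in each form and checks that they coincide. The variable names are chosen only for this example.

```sml
(* The same list written three ways. *)
val a = 1 :: (2 :: (3 :: nil))   (* fully parenthesized *)
val b = 1 :: 2 :: 3 :: nil       (* relying on right associativity *)
val c = [1, 2, 3]                (* list notation *)

(* op :: used as an ordinary function on pairs. *)
val d = (op ::) (1, [2, 3])

val allEqual = (a = b) andalso (b = c) andalso (c = d)
```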
9.2 Computing With Lists
How do we compute with values of list type? Since the values are defined
inductively, it is natural that functions on lists be defined recursively,
using a clausal definition that analyzes the structure of a list. Here’s a
definition of the function length that computes the number of elements of a
list:
fun length nil = 0
  | length (_::t) = 1 + length t
The definition is given by induction on the structure of the list argument.
The base case is the empty list, nil. The inductive step is the non-empty
list _::t (notice that we do not need to give a name to the head). Its
definition is given in terms of the tail of the list t, which is “smaller” than
the list _::t. The type of length is 'a list -> int; it is defined for lists of
values of any type whatsoever.
We may define other functions following a similar pattern. Here’s the
function to append two lists:
fun append (nil, l) = l
  | append (h::t, l) = h :: append (t, l)
This function is built into ML; it is written using infix notation as
exp_1 @ exp_2. The running time of append is proportional to the length of
the first list, as should be obvious from its definition.
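A brief usage sketch follows, repeating the two definitions so that it stands alone. The sample lists and the names n, xs, and agrees are assumptions for illustration.

```sml
fun length nil = 0
  | length (_::t) = 1 + length t

fun append (nil, l) = l
  | append (h::t, l) = h :: append (t, l)

(* length works at any element type. *)
val n = length ["a", "b", "c"]

(* append agrees with the built-in infix operator @ on this input. *)
val xs = append ([1, 2], [3, 4])
val agrees = (xs = [1, 2] @ [3, 4])
```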
Here’s a function to reverse a list.
fun rev nil = nil
  | rev (h::t) = rev t @ [h]
Its running time is O(n^2), where n is the length of the argument list. This
can be demonstrated by writing down a recurrence that defines the running
time T(n) on a list of length n.
T(0) = O(1)
T(n + 1) = T(n) + O(n)
Solving the recurrence we obtain the result T(n) = O(n^2).
Can we do better? Oddly, we can take advantage of the non-associativity
of :: to give a tail-recursive definition of rev.
local
  fun helper (nil, a) = a
    | helper (h::t, a) = helper (t, h::a)
in
  fun rev' l = helper (l, nil)
end
The general idea of introducing an accumulator is the same as before,
except that by reordering the applications of :: we reverse the list! The
helper function reverses its first argument and prepends it to its second
argument. That is, helper (l, a) evaluates to (rev l) @ a, where we
assume here an independent definition of rev for the sake of the
specification. Notice that helper runs in time proportional to the length of
its first argument, and hence rev' runs in time proportional to the length of
its argument.
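The behavior of the accumulator is easiest to see on a small input. The sketch below repeats the definition and records, in a comment, the sequence of calls made for a three-element list; the binding r is a hypothetical name.

```sml
local
  fun helper (nil, a) = a
    | helper (h::t, a) = helper (t, h::a)
in
  fun rev' l = helper (l, nil)
end

(* Evaluation trace for rev' [1, 2, 3]:
     helper ([1, 2, 3], nil)
     helper ([2, 3], [1])
     helper ([3], [2, 1])
     helper (nil, [3, 2, 1])
     [3, 2, 1]
   One call is made per element, so the running time is linear. *)
val r = rev' [1, 2, 3]
```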
The correctness of functions defined on lists may be established using
the principle of structural induction. We illustrate this by establishing that
the function helper satisfies the following specification:
for every l and a of type typ list, helper (l, a) evaluates to
the result of appending a to the reversal of l.
That is, there are no preconditions on l and a, and we establish the
postcondition that helper (l, a) yields (rev l) @ a.
The proof is by structural induction on the list l. If l is nil, then helper
(l, a) evaluates to a, which fulfills the postcondition because (rev nil) @ a
is just a. If l is the list h::t, then the application helper (l, a) reduces
to the value of helper (t, (h::a)). By the inductive hypothesis this is just
(rev t) @ (h :: a), which is equivalent to (rev t) @ [h] @ a. But this is just
rev (h::t) @ a, which was to be shown.
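The specification can also be spot-checked on sample inputs. This is only a sanity check and does not replace the inductive proof; meetsSpec and samples are hypothetical names, and helper is defined at top level here so that it can be tested directly.

```sml
fun rev nil = nil
  | rev (h::t) = rev t @ [h]

fun helper (nil, a) = a
  | helper (h::t, a) = helper (t, h::a)

(* meetsSpec states the postcondition: helper (l, a) yields (rev l) @ a. *)
fun meetsSpec (l, a) = (helper (l, a) = rev l @ a)

(* A few sample pairs covering the base case and the inductive case. *)
val samples = [(nil, nil), ([1], nil), ([1, 2, 3], [4, 5]), (nil, [7])]
val ok = List.all meetsSpec samples
```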
The principle of structural induction may be summarized as follows.
To show that a function works correctly for every list l, it sufﬁces to show
1. The correctness of the function for the empty list, nil, and
2. The correctness of the function for h::t, assuming its correctness for
t.
As with mathematical induction over the natural numbers, structural
induction over lists allows us to focus on the basic and incremental behavior
of a function to establish its correctness for all lists.
9.3 Sample Code
Here is the code for this chapter.
Chapter 10
Concrete Data Types
10.1 Datatype Declarations
Lists are one example of the general notion of a recursive type. ML provides
a general mechanism, the datatype declaration, for introducing
programmer-defined recursive types. Earlier we introduced type declarations as
an abbreviation mechanism. Types are given names as documentation and as
a convenience to the programmer, but doing so is semantically
inconsequential: one could replace all uses of the type name by its definition
and not affect the behavior of the program. In contrast the datatype
declaration provides a means of introducing a new type that is distinct from
all other types and that does not merely stand for some other type. It is the
means by which the ML type system may be extended by the programmer.
The datatype declaration in ML has a number of facets. A datatype
declaration introduces
1. One or more new type constructors. The type constructors introduced
may, or may not, be mutually recursive.
2. One or more new value constructors for each of the type constructors
introduced by the declaration.
The type constructors may take zero or more arguments; a zero-argument,
or nullary, type constructor is just a type. Each value constructor may
also take zero or more arguments; a nullary value constructor is just a
constant. The type and value constructors introduced by the declaration
are “new” in the sense that they are distinct from all other type and value
constructors previously introduced; if a datatype redefines an “old” type
or value constructor, then the old definition is shadowed by the new one,
rendering the old one inaccessible in the scope of the new definition.
10.2 Non-Recursive Datatypes
Here’s a simple example of a nullary type constructor with four nullary
value constructors.
datatype suit = Spades | Hearts | Diamonds | Clubs
This declaration introduces a new type suit with four nullary value
constructors, Spades, Hearts, Diamonds, and Clubs. This declaration may be
read as introducing a type suit such that a value of type suit is either
Spades, or Hearts, or Diamonds, or Clubs. There is no significance to the
ordering of the constructors in the declaration; we could just as well have
written
datatype suit = Hearts | Diamonds | Spades | Clubs
(or any other ordering, for that matter). It is conventional to capitalize the
names of value constructors, but this is not required by the language.
Given the declaration of the type suit, we may define functions on it
by case analysis on the value constructors using a clausal function
definition. For example, we may define the suit ordering in the card game of
bridge by the function
fun outranks (Spades, Spades) = false
  | outranks (Spades, _) = true
  | outranks (Hearts, Spades) = false
  | outranks (Hearts, Hearts) = false
  | outranks (Hearts, _) = true
  | outranks (Diamonds, Clubs) = true
  | outranks (Diamonds, _) = false
  | outranks (Clubs, _) = false
This defines a function of type suit * suit -> bool that determines whether
or not the first suit outranks the second.
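The same ordering can be expressed, as an alternative sketch rather than the definition used in the text, by mapping each suit to a number and comparing. The helper rank and the name outranks' are hypothetical.

```sml
datatype suit = Spades | Hearts | Diamonds | Clubs

(* rank is a hypothetical helper: a larger number means a higher suit. *)
fun rank Clubs = 0
  | rank Diamonds = 1
  | rank Hearts = 2
  | rank Spades = 3

(* The first suit outranks the second exactly when its rank is greater. *)
fun outranks' (s, t) = rank s > rank t

val a = outranks' (Spades, Hearts)
val b = outranks' (Diamonds, Hearts)
val c = outranks' (Clubs, Clubs)
```

This version trades the explicit table of cases for a single comparison; the clausal version in the text makes every case visible at the cost of more clauses.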
Data types may be parameterized by a type. For example, the declaration
datatype 'a option = NONE | SOME of 'a
introduces the unary type constructor 'a option with two value
constructors, NONE, with no arguments, and SOME, with one. The values of type
typ option are
1. The constant NONE, and
2. Values of the form SOME val, where val is a value of type typ.
For example, some values of type string option are NONE, SOME "abc",
and SOME "def".
The option type constructor is predefined in Standard ML. One common
use of option types is to handle functions with an optional argument.
For example, here is a function to compute the base-b exponential function
for natural number exponents that defaults to base 2:
fun expt (NONE, n) = expt (SOME 2, n)
  | expt (SOME b, 0) = 1
  | expt (SOME b, n) =
      if n mod 2 = 0 then
        expt (SOME (b*b), n div 2)
      else
        b * expt (SOME b, n-1)
The advantage of the option type in this sort of situation is that it avoids
the need to make a special case of a particular argument, e.g., using 0 as
first argument to mean “use the default base”.
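A short usage sketch shows both ways of calling expt. The definition is repeated so the example stands alone; the names kilo and eightyOne are hypothetical.

```sml
fun expt (NONE, n) = expt (SOME 2, n)
  | expt (SOME b, 0) = 1
  | expt (SOME b, n) =
      if n mod 2 = 0 then
        expt (SOME (b*b), n div 2)
      else
        b * expt (SOME b, n-1)

(* No base supplied: the default base 2 is used, giving 2 to the 10th. *)
val kilo = expt (NONE, 10)

(* Base supplied explicitly: 3 to the 4th. *)
val eightyOne = expt (SOME 3, 4)
```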
A related use of option types is in aggregate data structures. For
example, an address book entry might have a record type with fields for various
bits of data about a person. But not all data is relevant to all people. For
example, someone may not have a spouse, but they all have a name. For
this we might use a type definition of the form
type entry = { name:string, spouse:string option }
so that one would create an entry for an unmarried person with a spouse
field of NONE.
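For instance, entries of this type might be created and inspected as in the sketch below. The sample names and the function spouseName are assumptions made for illustration.

```sml
type entry = { name : string, spouse : string option }

val alice : entry = { name = "Alice", spouse = SOME "Bob" }
val carol : entry = { name = "Carol", spouse = NONE }

(* spouseName is a hypothetical helper that handles both cases. *)
fun spouseName (e : entry) =
  case #spouse e
    of SOME s => s
     | NONE => "(none)"
```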
Option types may also be used to represent an optional result. For
example, we may wish to define a function reciprocal that returns the
reciprocal of an integer, if it has one, and otherwise indicates that it has
no reciprocal. This is achieved by defining reciprocal to have type int ->
int option as follows:
fun reciprocal 0 = NONE
  | reciprocal n = SOME (1 div n)
To use the result of a call to reciprocal we must perform a case analysis of
the form
case (reciprocal exp)
  of NONE => exp_1
   | SOME r => exp_2
where exp_1 covers the case that exp has no reciprocal, and exp_2 covers the
case that exp has reciprocal r.
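A concrete instance of this case analysis is sketched below. The function describe and the strings it returns are hypothetical choices for exp_1 and exp_2.

```sml
fun reciprocal 0 = NONE
  | reciprocal n = SOME (1 div n)

(* describe is a hypothetical client that handles both outcomes. *)
fun describe n =
  case reciprocal n
    of NONE => "no reciprocal"
     | SOME r => "reciprocal is " ^ Int.toString r
```

Because the result type is int option, a client cannot use the reciprocal without first saying what to do when there is none.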
10.3 Recursive Datatypes
The next level of generality is the recursive type deﬁnition. For example,
one may deﬁne a type typ tree of binary trees with values of type typ at
the nodes using the following declaration:
datatype 'a tree =
    Empty
  | Node of 'a tree * 'a * 'a tree
This declaration corresponds to the informal definition of binary trees with
values of type typ at the nodes:
1. The empty tree Empty is a binary tree.
2. If tree_1 and tree_2 are binary trees, and val is a value of type typ, then
Node (tree_1, val, tree_2) is a binary tree.
3. Nothing else is a binary tree.
The distinguishing feature of this deﬁnition is that it is recursive in the
sense that binary trees are constructed out of other binary trees, with the
empty tree serving as the base case.
(Incidentally, a leaf in a binary tree is here represented as a node both of
whose children are the empty tree. This definition of binary trees is
analogous to starting the natural numbers with zero, rather than one. One can
think of the children of a node in a binary tree as the “predecessors” of that
node, the only difference compared to the usual deﬁnition of predecessor
being that a node has two, rather than one, predecessors.)
To compute with a recursive type, use a recursive function. For
example, here is the function to compute the height of a binary tree:
fun height Empty = 0
  | height (Node (lft, _, rht)) =
      1 + Int.max (height lft, height rht)
Notice that height is called recursively on the children of a node, and is
defined outright on the empty tree. This pattern of definition is another
instance of structural induction (on the tree type). The function height
is said to be defined by induction on the structure of a tree. The general
idea is to define the function directly for the base cases of the recursive
type (i.e., value constructors with no arguments or whose arguments do
not involve values of the type being defined), and to define it for non-base
cases in terms of its definitions for the constituent values of that type. We
will see numerous examples of this as we go along.
Here’s another example. The size of a binary tree is the number of
nodes occurring in it. Here’s a straightforward deﬁnition in ML:
fun size Empty = 0
  | size (Node (lft, _, rht)) =
      1 + size lft + size rht
The function size is deﬁned by structural induction on trees.
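Both functions can be exercised on a small tree. In this sketch the helper leaf and the sample tree t are assumptions introduced for illustration, and Int.max from the Basis Library is used for the maximum.

```sml
datatype 'a tree =
    Empty
  | Node of 'a tree * 'a * 'a tree

fun height Empty = 0
  | height (Node (lft, _, rht)) =
      1 + Int.max (height lft, height rht)

fun size Empty = 0
  | size (Node (lft, _, rht)) =
      1 + size lft + size rht

(* leaf is a hypothetical helper: a node whose children are both empty. *)
fun leaf x = Node (Empty, x, Empty)

(* A tree with four nodes; the longest path from the root has three nodes. *)
val t = Node (leaf 1, 2, Node (leaf 3, 4, Empty))
val h = height t
val s = size t
```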
A word of warning. One reason to capitalize value constructors is to
avoid a pitfall in the ML syntax that we mentioned in chapter 2. Suppose
we gave the following definition of size:
fun size empty = 0
  | size (Node (lft, _, rht)) =
      1 + size lft + size rht
The compiler will warn us that the second clause of the definition is
redundant! Why? Because empty, spelled with a lower-case “e”, is a variable,
not a constructor, and hence matches any tree whatsoever. Consequently the
second clause never applies. By capitalizing constructors we can hope to
make mistakes such as these more evident, but in practice you are bound
to run into this sort of mistake.
The tree data type is appropriate for binary trees: those for which each
node has exactly two children. (Of course, either or both children might
be the empty tree, so we may consider this to deﬁne the type of trees with
at most two children; it’s a matter of terminology which interpretation you
prefer.) It should be obvious how to deﬁne the type of ternary trees, whose
nodes have at most three children, and so on for other fixed arities. But
what if we wished to define a type of trees with a variable number of
children? In a so-called variadic tree some nodes might have three children,
some might have two, and so on. This can be achieved in at least two
ways. One way combines lists and trees, as follows:
datatype 'a tree =
    Empty
  | Node of 'a * 'a tree list
Each node has a list of children, so that distinct nodes may have different
numbers of children. Notice that the empty tree is distinct from the tree
with one node and no children because there is no data associated with
the empty tree, whereas there is a value of type ’a at each node.
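With this representation, a function over a tree typically combines recursion on the tree with a traversal of the list of children. The following sketch counts nodes using foldl from the Basis Library; the sample tree is an assumption for illustration.

```sml
datatype 'a tree =
    Empty
  | Node of 'a * 'a tree list

(* Count one for the node itself, then add the sizes of all children. *)
fun size Empty = 0
  | size (Node (_, children)) =
      foldl (fn (child, total) => size child + total) 1 children

(* A root with three children: a leaf, a node with one child, and Empty. *)
val t = Node ("root", [Node ("a", nil), Node ("b", [Node ("c", nil)]), Empty])
val n = size t
```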
Another approach is to simultaneously deﬁne trees and “forests”. A
variadic tree is either empty, or a node gathering a “forest” to form a tree;
a forest is either empty or a variadic tree together with another forest. This
leads to the following deﬁnition:
datatype 'a tree =
    Empty
  | Node of 'a * 'a forest
and 'a forest =
    None
  | Tree of 'a tree * 'a forest
This example illustrates the introduction of two mutually recursive datatypes.
Mutually recursive datatypes beget mutually recursive functions. Here’s
a definition of the size (number of nodes) of a variadic tree:
fun size_tree Empty = 0
  | size_tree (Node (_, f)) = 1 + size_forest f
and size_forest None = 0
  | size_forest (Tree (t, f')) = size_tree t + size_forest f'
Notice that we deﬁne the size of a tree in terms of the size of a forest, and
vice versa, just as the type of trees is deﬁned in terms of the type of forests.
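As a usage sketch, the declarations can be applied to a root with two childless children. The sample tree t is an assumption for illustration.

```sml
datatype 'a tree =
    Empty
  | Node of 'a * 'a forest
and 'a forest =
    None
  | Tree of 'a tree * 'a forest

fun size_tree Empty = 0
  | size_tree (Node (_, f)) = 1 + size_forest f
and size_forest None = 0
  | size_forest (Tree (t, f')) = size_tree t + size_forest f'

(* Node 1 has a forest containing the two trees Node 2 and Node 3. *)
val t = Node (1, Tree (Node (2, None), Tree (Node (3, None), None)))
val n = size_tree t
```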
Many other variations are possible. Suppose we wish to deﬁne a notion
of binary tree in which data items are associated with branches, rather than
nodes. Here’s a datatype declaration for such trees:
datatype 'a tree =
    Empty
  | Node of 'a branch * 'a branch
and 'a branch =
    Branch of 'a * 'a tree
In contrast to our first definition of binary trees, in which the branches
from a node to its children were implicit, we now make the branches
themselves explicit, since data is attached to them.
For example, we can collect into a list the data items labelling the branches
of such a tree using the following code:
fun collect Empty = nil
  | collect (Node (Branch (ld, lt), Branch (rd, rt))) =
      ld :: rd :: (collect lt) @ (collect rt)
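The order in which collect lists the labels can be seen on a small example. The sample tree and its string labels are assumptions made for illustration.

```sml
datatype 'a tree =
    Empty
  | Node of 'a branch * 'a branch
and 'a branch =
    Branch of 'a * 'a tree

fun collect Empty = nil
  | collect (Node (Branch (ld, lt), Branch (rd, rt))) =
      ld :: rd :: (collect lt) @ (collect rt)

(* The left branch "l" leads to a node with branches "ll" and "lr";
   the right branch "r" leads to the empty tree. *)
val t =
  Node (Branch ("l", Node (Branch ("ll", Empty), Branch ("lr", Empty))),
        Branch ("r", Empty))

(* Labels at a node come first, then those below its left branch,
   then those below its right branch. *)
val labels = collect t
```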
10.4 Heterogeneous Data Structures
Returning to the original deﬁnition of binary trees (with data items at the
nodes), observe that the type of the data items at the nodes must be the
same for every node of the tree. For example, a value of type int tree has
an integer at every node, and a value of type string tree has a string at
every node. Therefore an expression such as
Node (Empty, 43, Node (Empty, "43", Empty))
is ill-typed. The type system insists that trees be homogeneous in the sense
that the type of the data items is the same at every node.
It is quite rare to encounter heterogeneous data structures in real
programs. For example, a dictionary with strings as keys might be
represented as a binary search tree with strings at the nodes; there is no need
for heterogeneity to represent such a data structure. But occasionally one
might wish to work with a heterogeneous tree, whose data values at each
node are of different types. How would one represent such a thing in ML?
To discover the answer, ﬁrst think about how one might manipulate
such a data structure. When accessing a node, we would need to check
at runtime whether the data item is an integer or a string; otherwise we
would not know whether to, say, add 1 to it, or concatenate "1" to the
end of it. This suggests that the data item must be labelled with sufﬁcient
information so that we may determine the type of the item at runtime. We
must also be able to recover the underlying data item itself so that familiar
operations (such as addition or string concatenation) may be applied to it.
The required labelling and discrimination is neatly achieved using a
datatype declaration. Suppose we wish to represent the type of
integer-or-string trees. First, we define the type of values to be integers or
strings, marked with a constructor indicating which:
datatype int_or_string =
    Int of int
  | String of string
Then we define the type of interest as follows:
type int_or_string_tree =
  int_or_string tree
Voila! Perfectly natural and easy: heterogeneity is really a special case of
homogeneity!
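The operations mentioned earlier (adding 1 to an integer, or concatenating "1" to a string) can then be written by case analysis on the label. In this sketch the functions bump and mapTree are hypothetical, and the binary tree type is repeated so that the example stands alone.

```sml
datatype 'a tree =
    Empty
  | Node of 'a tree * 'a * 'a tree

datatype int_or_string =
    Int of int
  | String of string

type int_or_string_tree = int_or_string tree

(* bump inspects the constructor to decide which operation applies. *)
fun bump (Int n) = Int (n + 1)
  | bump (String s) = String (s ^ "1")

(* mapTree applies f to the data item at every node. *)
fun mapTree f Empty = Empty
  | mapTree f (Node (l, x, r)) = Node (mapTree f l, f x, mapTree f r)

val t : int_or_string_tree =
  Node (Empty, Int 43, Node (Empty, String "43", Empty))
val t' = mapTree bump t
```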
10.5 Abstract Syntax
Datatype declarations and pattern matching are extremely useful for
defining and manipulating the abstract syntax of a language. For example, we
may define a small language of arithmetic expressions using the following
declaration:
datatype expr =
    Numeral of int
  | Plus of expr * expr
  | Times of expr * expr
This deﬁnition has only three clauses, but one could readily imagine adding
others. Here is the deﬁnition of a function to evaluate expressions of the
language of arithmetic expressions written using pattern matching:
fun eval (Numeral n) = Numeral n
  | eval (Plus (e1, e2)) =
      let
        val Numeral n1 = eval e1
        val Numeral n2 = eval e2
      in
        Numeral (n1+n2)
      end
  | eval (Times (e1, e2)) =
      let
        val Numeral n1 = eval e1
        val Numeral n2 = eval e2
      in
        Numeral (n1*n2)
      end
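As a point of comparison, and not as the definition used in the text, an evaluator may also return an integer directly, which avoids the pattern bindings against Numeral. The function value and the sample expression e are hypothetical.

```sml
datatype expr =
    Numeral of int
  | Plus of expr * expr
  | Times of expr * expr

(* value is a hypothetical variant of eval that returns an int directly. *)
fun value (Numeral n) = n
  | value (Plus (e1, e2)) = value e1 + value e2
  | value (Times (e1, e2)) = value e1 * value e2

(* Represents 1 + (2 * 3). *)
val e = Plus (Numeral 1, Times (Numeral 2, Numeral 3))
val v = value e
```

The version in the text keeps results inside the expr type, so that evaluation maps expressions to expressions in numeral form; the variant above gives up that property in exchange for shorter clauses.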
The combination of datatype declarations and pattern matching
contributes enormously to the readability of programs written in ML. A less
obvious, but more important, benefit is the error checking that the
compiler can perform for you if you use these mechanisms in tandem. As an
example, suppose that we extend the type expr with a new component for
the reciprocal of a number, yielding the following revised definition:
datatype expr =
    Numeral of int
  | Plus of expr * expr
  | Times of expr * expr
  | Recip of expr
First, observe that the “old” definition of eval is no longer applicable to
values of type expr! For example, the expression
eval (Plus (Numeral 1, Numeral 2))
is ill-typed, even though it doesn’t use the Recip constructor. The reason is
that the re-declaration of expr introduces a “new” type that just happens
to have the same name as the “old” type, but is in fact distinct from it. This
is a boon because it reminds us to recompile the old code relative to the
new definition of the expr type.
Second, upon recompiling the definition of eval we encounter an
inexhaustive match warning: the old code no longer applies to every value of
type expr according to its new definition! We are of course lacking a case
for Recip, which we may provide as follows:
fun eval (Numeral n) = Numeral n
  | eval (Plus (e1, e2)) = ... as before ...
  | eval (Times (e1, e2)) = ... as before ...
  | eval (Recip e) =
      let
        val Numeral n = eval e
      in
        Numeral (1 div n)
      end
The value of the checks provided by the compiler in such cases cannot be
overestimated. When recompiling a large program after making a change
to a datatype declaration the compiler will automatically point out every
line of code that must be changed to conform to the new deﬁnition; it is
impossible to forget to attend to even a single case. This is a tremendous
help to the developer, especially if she is not the original author of the code
being modified, and is another reason why the static type discipline of ML
is a positive benefit, rather than a hindrance, to programmers.
10.6 Sample Code
Here is the code for this chapter.
Chapter 11
Higher-Order Functions
11.1 Functions as Values
Values of function type are first-class, which means that they have the same
rights and privileges as values of any other type. In particular, functions
may be passed as arguments and returned as results of other functions,
and functions may be stored in and retrieved from data structures such
as lists and trees. We will see that first-class functions are an important
source of expressive power in ML.
Functions which take functions as arguments or yield functions as
results are known as higher-order functions (or, less often, as functionals or
operators). Higher-order functions arise frequently in mathematics. For
example, the differential operator is the higher-order function that, when
given a (differentiable) function on the real line, yields its first derivative
as a function on the real line. We also encounter functionals mapping
functions to real numbers, and real numbers to functions. An example of the
former is provided by the definite integral viewed as a function of its
integrand, and an example of the latter is the definite integral of a given
function on the interval [a, x], viewed as a function of a, that yields the
area under the curve from a to x as a function of x.
Higherorder functions are less familiar tools for many programmers
since the bestknown programming languages have only rudimentary mech
anisms to support their use. In contrast higherorder functions play a
prominent role in ML, with a variety of interesting applications. Their
use may be classiﬁed into two broad categories:
1. Abstracting patterns of control. Higher-order functions are design patterns
that “abstract out” the details of a computation to lay bare the
skeleton of the solution. The skeleton may be fleshed out to form a
solution of a problem by applying the general pattern to arguments
that isolate the specific problem instance.
2. Staging computation. It arises frequently that computation may be
staged by expending additional effort “early” to simplify the computation
of “later” results. Staging can be used both to improve efficiency
and, as we will see later, to control sharing of computational
resources.
11.2 Binding and Scope
Before discussing these programming techniques, we will review the critically
important concept of scope as it applies to function definitions. Recall
that Standard ML is a statically scoped language, meaning that identifiers
are resolved according to the static structure of the program. A use of the
variable var is considered to be a reference to the nearest lexically enclosing
declaration of var. We say “nearest” because of the possibility of shadowing;
if we redeclare a variable var, then subsequent uses of var refer to the
“most recent” (lexically!) declaration of it; any “previous” declarations are
temporarily shadowed by the latest one.
This principle is easy to apply when considering sequences of declarations.
For example, it should be clear by now that the variable y is bound
to 32 after processing the following sequence of declarations:
val x = 2 (* x=2 *)
val y = x*x (* y=4 *)
val x = y*x (* x=8 *)
val y = x*y (* y=32 *)
In the presence of function deﬁnitions the situation is the same, but it can
be a bit tricky to understand at ﬁrst.
Here’s an example to test your grasp of the lexical scoping principle:
val x = 2
fun f y = x+y
val x = 3
val z = f 4
After processing these declarations the variable z is bound to 6, not to 7!
The reason is that the occurrence of x in the body of f refers to the first
declaration of x, since it is the nearest lexically enclosing declaration of the
occurrence, even though x has been subsequently redeclared.
This example illustrates three important points:
1. Binding is not assignment! If we were to view the second binding of
x as an assignment statement, then the value of z would be 7, not 6.
2. Scope resolution is lexical, not temporal. We sometimes refer to the
“most recent” declaration of a variable, which has a temporal flavor,
but we always mean “nearest lexically enclosing at the point of occurrence”.
3. “Shadowed” bindings are not lost. The “old” binding for x is still
available (through calls to f), even though a more recent binding has
shadowed it.
One way to understand what’s going on here is through the concept
of a closure, a technique for implementing higher-order functions. When
a function expression is evaluated, a copy of the environment is attached
to the function. Subsequently, all free variables of the function (i.e., those
variables not occurring as parameters) are resolved with respect to the environment
attached to the function; the function is therefore said to be
“closed” with respect to the attached environment. This is achieved at
function application time by “swapping” the attached environment of the
function for the environment active at the point of the call. The swapped
environment is restored after the call is complete. Returning to the example
above, the environment associated with the function f contains the
declaration val x = 2 to record the fact that at the time the function was
evaluated, the variable x was bound to the value 2. The variable x is subsequently
rebound to 3, but when f is applied, we temporarily reinstate
the binding of x to 2, add a binding of y to 4, then evaluate the body of the
function, yielding 6. We then restore the binding of x and drop the binding
of y before yielding the result.
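The scoping behavior described in this section can be checked directly. The following sketch repeats the declarations of the example and adds a second function, g (a name introduced here only for illustration), that is declared after x has been rebound:

```sml
val x = 2
fun f y = x + y      (* the closure for f records x = 2 *)
val x = 3            (* shadows, but does not overwrite, the earlier x *)
fun g y = x + y      (* the closure for g records x = 3 *)

val z = f 4          (* 6: f still sees x = 2 *)
val w = g 4          (* 7: g sees x = 3 *)
```

The two functions have identical bodies; they differ only in the environment attached to them.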
11.3 Returning Functions
While seemingly very simple, the principle of lexical scope is the source of
considerable expressive power. We’ll demonstrate this through a series of
examples.
To warm up let’s consider some simple examples of passing functions
as arguments and yielding functions as results. The standard example of
passing a function as argument is the map' function, which applies a given
function to every element of a list. It is defined as follows:
fun map' (f, nil) = nil
  | map' (f, h::t) = (f h) :: map' (f, t)
For example, the application
map' (fn x => x+1, [1,2,3,4])
evaluates to the list [2,3,4,5].
Functions may also yield functions as results. What is surprising is that
we can create new functions during execution, not just return functions
that have been previously defined. The most basic (and deceptively simple)
example is the function constantly that creates constant functions:
given a value k, the application constantly k yields a function that yields
k whenever it is applied. Here’s a definition of constantly:
val constantly = fn k => (fn a => k)
The function constantly has type 'a -> ('b -> 'a). We used the fn notation
for clarity, but the declaration of the function constantly may also
be written using fun notation as follows:
fun constantly k a = k
Note well that a white space separates the two successive arguments to
constantly! The meaning of this declaration is precisely the same as the
earlier deﬁnition using fn notation.
The value of the application constantly 3 is the function that is constantly
3; i.e., it always yields 3 when applied. Yet nowhere have we defined
the function that always yields 3. The resulting function is “created”
by the application of constantly to the argument 3, rather than merely
“retrieved” off the shelf of previously-defined functions. In implementation
terms the result of the application constantly 3 is a closure consisting
of the function fn a => k with the environment val k = 3 attached to it.
The closure is a data structure (a pair) that is created by each application of
constantly to an argument; the closure is the representation of the “new”
function yielded by the application. Notice, however, that the only difference
between any two results of applying the function constantly lies in
the attached environment; the underlying function is always fn a => k. If
we think of the lambda as the “executable code” of the function, then this
amounts to the observation that no new code is created at run-time, just
new instances of existing code.
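As a small illustration of this point, the sketch below applies constantly to two different arguments. The names always3 and always_hi are introduced here for illustration, and the type annotations are an added precaution so that the results do not have ungeneralized type variables:

```sml
fun constantly k a = k

(* Each application builds a new closure: same code (fn a => k),
   different attached environment (k = 3 versus k = "hi"). *)
val always3   : string -> int = constantly 3
val always_hi : int -> string = constantly "hi"

val a = always3 "ignored"    (* 3 *)
val b = always_hi 42         (* "hi" *)
```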
This also points out why functions in ML are not the same as code
pointers in C. You may be familiar with the idea of passing a pointer to
a C function to another C function as a means of passing functions as arguments
or yielding functions as results. This may be considered to be a
form of “higher-order” function in C, but it must be emphasized that code
pointers are significantly less powerful than closures because in C there
are only statically many possibilities for a code pointer (it must point to one
of the functions defined in your code), whereas in ML we may generate dynamically
many different instances of a function, differing in the bindings
of the variables in its environment. The non-varying part of the closure,
the code, is directly analogous to a function pointer in C, but there is no
counterpart in C of the varying part of the closure, the dynamic environment.
The definition of the function map' given above takes a function and list
as arguments, yielding a new list as result. Often it occurs that we wish to
map the same function across several different lists. It is inconvenient (and
a tad inefficient) to keep passing the same function to map', with the list
argument varying each time. Instead we would prefer to create an instance
of map specialized to the given function that can then be applied to many
different lists. This leads to the following definition of the function map:
fun map f nil = nil
  | map f (h::t) = (f h) :: (map f t)
The function map so defined has type ('a -> 'b) -> 'a list -> 'b list.
It takes a function of type 'a -> 'b as argument, and yields another function
of type 'a list -> 'b list as result.
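A brief sketch of how the curried map might be used: the function is supplied once, and the resulting specialized function is applied to several lists. The name increment_all is hypothetical, and the local definition of map shadows the one provided by the Basis Library:

```sml
fun map f nil = nil
  | map f (h::t) = (f h) :: (map f t)

(* Specialize map to one function; the result has type int list -> int list. *)
val increment_all = map (fn x => x + 1)

val r1 = increment_all [1,2,3,4]   (* [2,3,4,5] *)
val r2 = increment_all [10,20]     (* [11,21] *)
```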
The passage from map' to map is called currying. We have changed a
two-argument function (more properly, a function taking a pair as argument)
into a function that takes two arguments in succession, yielding after
the first a function that takes the second as its sole argument. This
passage can be codified as follows:
fun curry f x y = f (x, y)
The type of curry is
('a * 'b -> 'c) -> ('a -> ('b -> 'c)).
Given a two-argument function, curry returns another function that, when
applied to the first argument, yields a function that, when applied to the
second, applies the original two-argument function to the first and second
arguments, given separately.
Observe that map may be alternatively defined by the binding
fun map f l = curry map' f l
Applications are implicitly left-associated, so that this definition is equivalent
to the more verbose declaration
fun map f l = ((curry map') f) l
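To make the effect of curry concrete, here is a minimal sketch applying it to a pair-taking addition function. The names plus and add3, and the inverse operation uncurry, do not appear in the text and are introduced here only for illustration:

```sml
fun curry f x y = f (x, y)

(* Hypothetical inverse of curry, shown for comparison. *)
fun uncurry g (x, y) = g x y

fun plus (x : int, y : int) = x + y

val add3  = curry plus 3                  (* int -> int *)
val eight = add3 5                        (* 8 *)
val nine  = uncurry (curry plus) (4, 5)   (* 9 *)
```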
11.4 Patterns of Control
We turn now to the idea of abstracting patterns of control. There is an
obvious similarity between the following two functions, one to add up the
numbers in a list, the other to multiply them.
fun add_up nil = 0
  | add_up (h::t) = h + add_up t
fun mul_up nil = 1
  | mul_up (h::t) = h * mul_up t
What precisely is the similarity? We will look at it from two points of view.
One view is that in each case we have a binary operation and a unit
element for it. The result on the empty list is the unit element, and the
result on a non-empty list is the operation applied to the head of the list
and the result on the tail. This pattern can be abstracted as the function
reduce deﬁned as follows:
fun reduce (unit, opn, nil) =
      unit
  | reduce (unit, opn, h::t) =
      opn (h, reduce (unit, opn, t))
Here is the type of reduce:
val reduce : 'b * ('a * 'b -> 'b) * 'a list -> 'b
The ﬁrst argument is the unit element, the second is the operation, and the
third is the list of values. Notice that the type of the operation admits the
possibility of the ﬁrst argument having a different type from the second
argument and result.
Using reduce, we may redefine add_up and mul_up as follows:
fun add_up l = reduce (0, op +, l)
fun mul_up l = reduce (1, op *, l)
To further check your understanding, consider the following declaration:
fun mystery l = reduce (nil, op ::, l)
(Recall that “op ::” is the function of type 'a * 'a list -> 'a list that
adds a given value to the front of a list.) What function does mystery
compute?
Another view of the commonality between add_up and mul_up is that
they are both defined by induction on the structure of the list argument,
with a base case for nil, and an inductive case for h::t, defined in terms of
its behavior on t. But this is really just another way of saying that they are
defined in terms of a unit element and a binary operation! The difference
is one of perspective: whether we focus on the pattern part of the clauses
(the inductive decomposition) or the result part of the clauses (the unit
and operation). The recursive structure of add_up and mul_up is abstracted
by the reduce functional, which is then specialized to yield add_up and
mul_up. Said another way, the function reduce abstracts the pattern of defining
a function by induction on the structure of a list.
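The type of reduce allows the list elements and the result to have different types. The sketch below illustrates this with three further instances; the names count, concat_all, and all_positive are hypothetical and are not part of the text:

```sml
fun reduce (unit, opn, nil) = unit
  | reduce (unit, opn, h::t) = opn (h, reduce (unit, opn, t))

(* Elements of any type, result of type int. *)
fun count l = reduce (0, fn (_, n) => n + 1, l)

(* Elements and result both of type string. *)
fun concat_all l = reduce ("", op ^, l)

(* Elements of type int, result of type bool. *)
fun all_positive l = reduce (true, fn (x, ok) => x > 0 andalso ok, l)

val n = count ["a", "b", "c"]        (* 3 *)
val s = concat_all ["ab", "c", "d"]  (* "abcd" *)
val p = all_positive [1, ~2, 3]      (* false *)
```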
The deﬁnition of reduce leaves something to be desired. One thing to
notice is that the arguments unit and opn are carried unchanged through
the recursion; only the list parameter changes on recursive calls. While
this might seem like a minor overhead, it’s important to remember that
multi-argument functions are really single-argument functions that take
a tuple as argument. This means that each time around the loop we are
constructing a new tuple whose first and second components remain fixed,
but whose third component varies. Is there a better way? Here’s another
definition that isolates the “inner loop” as an auxiliary function:
fun better_reduce (unit, opn, l) =
  let
    fun red nil = unit
      | red (h::t) = opn (h, red t)
  in
    red l
  end
Notice that each call to better_reduce creates a new function red that uses
the parameters unit and opn of the call to better_reduce. This means that
red is bound to a closure consisting of the code for the function together
with the environment active at the point of definition, which will provide
bindings for unit and opn arising from the application of better_reduce
to its arguments. Furthermore, the recursive calls to red no longer carry
bindings for unit and opn, saving the overhead of creating tuples on each
iteration of the loop.
11.5 Staging
An interesting variation on reduce may be obtained by staging the computation.
The motivation is that unit and opn often remain fixed for many
different lists (e.g., we may wish to sum the elements of many different
lists). In this case unit and opn are said to be “early” arguments and the
list is said to be a “late” argument. The idea of staging is to perform as
much computation as possible on the basis of the early arguments, yielding
a function of the late arguments alone.
In the case of the function reduce this amounts to building red on the
basis of unit and opn, yielding it as a function that may be later applied to
many different lists. Here’s the code:
fun staged_reduce (unit, opn) =
  let
    fun red nil = unit
      | red (h::t) = opn (h, red t)
  in
    red
  end
The definition of staged_reduce bears a close resemblance to the definition
of better_reduce; the only difference is that the creation of the closure
bound to red occurs as soon as unit and opn are known, rather than each
time the list argument is supplied. Thus the overhead of closure creation
is “factored out” of multiple applications of the resulting function to list
arguments.
We could just as well have replaced the body of the let expression with
the function
fn l => red l
but a moment’s thought reveals that the meaning is the same.
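A short usage sketch may help here: the early arguments are supplied once, and the resulting function is then applied to several lists. The names sum and product are introduced for illustration only:

```sml
fun staged_reduce (unit, opn) =
  let
    fun red nil = unit
      | red (h::t) = opn (h, red t)
  in
    red
  end

(* The closure for red is built once per call to staged_reduce ... *)
val sum     = staged_reduce (0, op +)     (* int list -> int *)
val product = staged_reduce (1, op * )    (* note the space before the parenthesis *)

(* ... and then reused for each list argument. *)
val s1 = sum [1,2,3]        (* 6 *)
val s2 = sum [10,20]        (* 30 *)
val p1 = product [2,3,4]    (* 24 *)
```

The space in "op * )" matters only lexically: without it the characters "*)" would be read as the end of a comment.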
Note well that we would not obtain the effect of staging were we to use
the following deﬁnition:
fun curried_reduce (unit, opn) nil = unit
  | curried_reduce (unit, opn) (h::t) =
      opn (h, curried_reduce (unit, opn) t)
If we unravel the fun notation, we see that while we are taking two arguments
in succession, we are not doing any useful work in between the
arrival of the first argument (a pair) and the second (a list). A curried function
does not take significant advantage of staging. Since staged_reduce
and curried_reduce have the same iterated function type, namely
('b * ('a * 'b -> 'b)) -> 'a list -> 'b
the contrast between these two examples may be summarized by saying
not every function of iterated function type is curried. Some are, and some
aren’t. The “interesting” examples (such as staged_reduce) are the ones
that aren’t curried. (This directly contradicts established terminology, but
it is necessary to deviate from standard practice to avoid a serious misapprehension.)
The time saved by staging the computation in the definition of staged_reduce
is admittedly minor. But consider the following definition of an append
function for lists that takes both arguments at once:
fun append (nil, l) = l
  | append (h::t, l) = h :: append (t, l)
Suppose that we will have occasion to append many lists to the end of a
given list. What we’d like is to build a specialized appender for the ﬁrst
list that, when applied to a second list, appends the second to the end of
the ﬁrst. Here’s a naive solution that merely curries append:
fun curried_append nil l = l
  | curried_append (h::t) l = h :: curried_append t l
Unfortunately this solution doesn’t exploit the fact that the first argument
is fixed for many second arguments. In particular, each application of the
result of applying curried_append to a list results in the first list being
traversed so that the second can be appended to it.
We can improve on this by staging the computation as follows:
fun staged_append nil = (fn l => l)
  | staged_append (h::t) =
      let
        val tail_appender = staged_append t
      in
        fn l => h :: tail_appender l
      end
Notice that the first list is traversed once for all applications to a second argument.
When applied to a list [v_1, ..., v_n], the function staged_append
yields a function that is equivalent to, but not quite as efficient as, the
function
fn l => v_1 :: v_2 :: ... :: v_n :: l.
This still takes time proportional to n, but a substantial savings accrues
from avoiding the pattern matching required to destructure the original
list argument on each call.
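The following sketch shows the intended pattern of use: the first list is processed once, and the resulting appender is applied to several second lists. The name after_123 is hypothetical:

```sml
fun staged_append nil = (fn l => l)
  | staged_append (h::t) =
      let
        val tail_appender = staged_append t
      in
        fn l => h :: tail_appender l
      end

(* The list [1,2,3] is destructured here, once. *)
val after_123 = staged_append [1,2,3]

(* Each application only conses the saved elements onto the argument. *)
val r1 = after_123 [4,5]    (* [1,2,3,4,5] *)
val r2 = after_123 []       (* [1,2,3] *)
```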
11.6 Sample Code
Here is the code for this chapter.
Chapter 12
Exceptions
In chapter 2 we mentioned that expressions in Standard ML always have a
type, may have a value, and may have an effect. So far we’ve concentrated
on typing and evaluation. In this chapter we will introduce the concept of
an effect. While it’s hard to give a precise general deﬁnition of what we
mean by an effect, the idea is that an effect is any action resulting from
evaluation of an expression other than returning a value. From this point
of view we might consider nontermination to be an effect, but we don’t
usually think of failure to terminate as a positive “action” in its own right,
rather as a failure to take any action.
The main examples of effects in ML are these:
1. Exceptions. Evaluation may be aborted by signaling an exceptional
condition.
2. Mutation. Storage may be allocated and modiﬁed during evaluation.
3. Input/output. It is possible to read from an input source and write to
an output sink during evaluation.
4. Communication. Data may be sent to and received from communication
channels.
This chapter is concerned with exceptions; the other forms of effects will
be considered later.
12.1 Exceptions as Errors
ML is a safe language in the sense that its execution behavior may be understood
entirely in terms of the constructs of the language itself. Behaviors
such as “dumping core” or incurring a “bus error” are extra-linguistic
notions that may only be explained by appeal to the underlying implementation
of the language. These cannot arise in ML. This is ensured by
a combination of a static type discipline, which rules out expressions that
are manifestly ill-defined (e.g., adding a string to an integer or casting an
integer as a function), and dynamic checks that rule out violations that
cannot be detected statically (e.g., division by zero or arithmetic overflow).
Static violations are signalled by type checking errors; dynamic violations
are signalled by raising exceptions.
12.1.1 Primitive Exceptions
The expression 3 + "3" is ill-typed, and hence cannot be evaluated. In
contrast the expression 3 div 0 is well-typed (with type int), but incurs
a run-time fault that is signalled by raising the exception Div. We will
indicate this by writing
3 div 0 ⇓ raise Div
An exception is a form of “answer” to the question “what is the value
of this expression?”. In most implementations an exception such as this
is reported by an error message of the form “Uncaught exception Div”,
together with the line number (or some other indication) of the point in
the program where the exception occurred.
Exceptions have names so that we may distinguish different sources
of error in a program. For example, evaluation of the expression maxint
* maxint (where maxint is the largest representable integer) causes the
exception Overflow to be raised, indicating that an arithmetic overﬂow
error arose in the attempt to carry out the multiplication. This is usefully
distinguished from the exception Div, corresponding to division by zero.
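The distinction between the two exceptions can be observed with a small sketch. It uses a handler, a construct that is only introduced later in this chapter, and a hypothetical helper named outcome. It also assumes that Int.maxInt is SOME m, which holds only for implementations with fixed-precision integers; where integers are unbounded no overflow occurs:

```sml
(* Run a suspended computation and report how it ended. *)
fun outcome thunk =
  (thunk (); "value")
  handle Div => "Div"
       | Overflow => "Overflow"

val r1 = outcome (fn () => 3 div 0)    (* "Div" *)
val r2 = outcome (fn () => 6 div 3)    (* "value" *)

(* Implementation-dependent: "Overflow" when int has fixed precision. *)
val r3 = outcome (fn () =>
           case Int.maxInt of
               SOME m => m * m
             | NONE => 0)
```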
(You may be wondering about the overhead of checking for arithmetic
faults. The compiler must generate instructions that ensure that an overflow
fault is caught before any subsequent operations are performed. This
can be quite expensive on pipelined processors, which sacrifice precise delivery
of arithmetic faults in the interest of speeding up execution in the
non-faulting case. Unfortunately it is necessary to incur this overhead if
we are to avoid having the behavior of an ML program depend on the
underlying processor on which it is implemented.)
Another source of run-time exceptions is an inexhaustive match. Suppose
we define the function hd as follows:
fun hd (h::_) = h
This definition is inexhaustive since it makes no provision for the possibility
of the argument being nil. What happens if we apply hd to nil? The
exception Match is raised, indicating the failure of the pattern-matching
process:
hd nil ⇓ raise Match
The occurrence of a Match exception at run-time is indicative of a violation
of a pre-condition to the invocation of a function somewhere in the
program. Recall that it is often sensible for a function to be inexhaustive,
provided that we take care to ensure that it is never applied to a value
outside of its domain. Should this occur (because of a programming mistake,
evidently), the result is nevertheless well-defined because ML checks
for, and signals, pattern match failure. That is, ML programs are implicitly
“bullet-proofed” against failures of pattern matching. The flip side is
that if no inexhaustive match warnings arise during type checking, then
the exception Match can never be raised during evaluation (and hence no
run-time checking need be performed).
A related situation is the use of a pattern in a val binding to destructure
a value. If the pattern can fail to match a value of this type, then a Bind
exception is raised at run-time. For example, evaluation of the binding
val h::_ = nil
raises the exception Bind since the pattern h::_ does not match the value
nil. Here again observe that a Bind exception cannot arise unless the compiler
has previously warned us of the possibility: no warning, no Bind
exception.
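Both situations can be reproduced in a few lines. The sketch below uses the hypothetical names first and head_or_zero (chosen to avoid shadowing the library function hd) and relies on handlers, which are described later in this chapter. A compiler is expected to warn about both inexhaustive patterns:

```sml
(* Inexhaustive clausal function: failure raises Match. *)
fun first (h::_) = h

val a = first [1,2,3]                              (* 1 *)
val b = first (nil : int list) handle Match => 0   (* 0 *)

(* Inexhaustive val binding: failure raises Bind. *)
fun head_or_zero (l : int list) =
  let
    val h::_ = l
  in
    h
  end
  handle Bind => 0

val c = head_or_zero nil      (* 0 *)
val d = head_or_zero [7,8]    (* 7 *)
```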
12.1.2 User-Defined Exceptions
So far we have considered examples of pre-defined exceptions that indicate
fatal error conditions. Since the built-in exceptions have a built-in
meaning, it is generally inadvisable to use these to signal program-specific
error conditions. Instead we introduce a new exception using an
exception declaration, and signal it using a raise expression when a run-time
violation occurs. That way we can associate specific exceptions with
specific pieces of code, easing the process of tracking down the source of
the error.
Suppose that we wish to define a “checked factorial” function that ensures
that its argument is non-negative. Here’s a first attempt at defining
such a function:
exception Factorial
fun checked_factorial n =
  if n < 0 then
    raise Factorial
  else if n=0 then
    1
  else n * checked_factorial (n-1)
The declaration exception Factorial introduces an exception Factorial,
which we raise in the case that checked_factorial is applied to a negative
number.
The definition of checked_factorial is unsatisfactory in at least two
respects. One, relatively minor, issue is that it does not make effective use
of pattern matching, but instead relies on explicit comparison operations.
To some extent this is unavoidable since we wish to check explicitly for
negative arguments, which cannot be done using a pattern. A more significant
problem is that checked_factorial repeatedly checks the validity
of its argument on each recursive call, even though we can prove that if
the initial argument is non-negative, then so must be the argument on each
recursive call. This fact is not reflected in the code. We can improve the
definition by introducing an auxiliary function:
exception Factorial
local
  fun fact 0 = 1
    | fact n = n * fact (n-1)
in
  fun checked_factorial n =
    if n >= 0 then
      fact n
    else
      raise Factorial
end
Notice that we perform the range check exactly once, and that the auxiliary
function makes effective use of pattern-matching.
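A short usage sketch of the improved definition follows. The bindings ok and bad are hypothetical, and the second one anticipates the handle construct of the next section:

```sml
exception Factorial

local
  (* fact is visible only inside this local declaration. *)
  fun fact 0 = 1
    | fact n = n * fact (n-1)
in
  fun checked_factorial n =
    if n >= 0 then fact n else raise Factorial
end

val ok  = checked_factorial 5                            (* 120 *)
val bad = checked_factorial ~3 handle Factorial => ~1    (* ~1 *)
```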
12.2 Exception Handlers
The use of exceptions to signal error conditions suggests that raising an
exception is fatal: execution of the program terminates with the raised
exception. But signaling an error is only one use of the exception mechanism.
More generally, exceptions can be used to effect non-local transfers
of control. By using an exception handler we may “catch” a raised exception
and continue evaluation along some other path. A very simple example
is provided by the following driver for the factorial function that accepts
numbers from the keyboard, computes their factorial, and prints the result.
fun factorial_driver () =
  let
    val input = read_integer ()
    val result =
      Int.toString (checked_factorial input)
  in
    print result
  end
  handle Factorial => print "Out of range."
An expression of the form exp handle match is called an exception handler.
It is evaluated by attempting to evaluate exp. If it returns a value, then
that is the value of the entire expression; the handler plays no role in this
case. If, however, exp raises an exception exc, then the exception value is
matched against the clauses of the match (exactly as in the application of a
clausal function to an argument) to determine how to proceed. If the pattern
of a clause matches the exception exc, then evaluation resumes with
the expression part of that clause. If no pattern matches, the exception
exc is re-raised so that outer exception handlers may dispatch on it. If no
handler handles the exception, then the uncaught exception is signaled
as the ﬁnal result of evaluation. That is, computation is aborted with the
uncaught exception exc.
In more operational terms, evaluation of exp handle match proceeds
by installing an exception handler determined by match, then evaluating
exp. The previous binding of the exception handler is preserved so that
it may be restored once the given handler is no longer needed. Raising
an exception consists of passing a value of type exn to the current exception
handler. Passing an exception to a handler de-installs that handler,
and re-installs the previously active handler. This ensures that if the handler
itself raises an exception, or fails to handle the given exception, then
the exception is propagated to the handler active prior to evaluation of the
handle expression. If the expression does not raise an exception, the previous
handler is restored as part of completing the evaluation of the handle
expression.
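The propagation rules just described can be observed with nested handlers. In the following sketch the exceptions A and B and the functions inner, outer, and handler_raises are all hypothetical names introduced for illustration:

```sml
exception A
exception B

(* The inner handler has a clause for A only. *)
fun inner (e : exn) : string =
  (raise e) handle A => "inner handled A"

(* An exception the inner handler does not match reaches the outer one. *)
fun outer (e : exn) : string =
  inner e handle B => "outer handled B"

(* An exception raised by a handler goes to the previously active handler. *)
fun handler_raises () : string =
  ((raise A) handle A => raise B)
  handle B => "outer handled B raised by the handler for A"

val r1 = outer A             (* "inner handled A" *)
val r2 = outer B             (* "outer handled B" *)
val r3 = handler_raises ()
```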
Returning to the function factorial_driver, we see that evaluation
proceeds by attempting to compute the factorial of a given number (read
from the keyboard by an unspecified function read_integer), printing the
result if the given number is in range, and otherwise reporting that the
number is out of range. The example is trivialized to focus on the role of
exceptions, but one could easily imagine generalizing it in a number of
ways that also make use of exceptions. For example, we might repeatedly
read integers until the user terminates the input stream (by typing the end-of-file
character). Termination of input might be signaled by an EndOfFile
exception, which is handled by the driver. Similarly, we might expect that
the function read_integer raises the exception SyntaxError in the case
that the input is not properly formatted. Again we would handle this
exception, print a suitable message, and resume.
Here’s a sketch of a more complicated factorial driver:
fun factorial_driver () =
  let
    val input = read_integer ()
    val result =
      Int.toString (checked_factorial input)
    val _ = print result
  in
    factorial_driver ()
  end
  handle EndOfFile => print "Done."
       | SyntaxError =>
           let
             val _ = print "Syntax error."
           in
             factorial_driver ()
           end
       | Factorial =>
           let
             val _ = print "Out of range."
           in
             factorial_driver ()
           end
We will return to a more detailed discussion of input/output later in these
notes. The point to notice here is that the code is structured with a completely
uncluttered “normal path” that reads an integer, computes its factorial,
formats it, prints it, and repeats. The exception handler takes care
of the exceptional cases: end of file, syntax error, and domain error. In the
latter two cases we report an error, and resume reading. In the first case we
simply report completion and we are done.
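The text leaves read_integer unspecified. One possible sketch, offered only as an assumption about how it might be written, uses the Basis Library functions TextIO.inputLine (taken here to return a string option, as in the current Basis specification) and Int.fromString. The helper parse_line is hypothetical, and the exceptions are declared without values, as in the driver above:

```sml
exception EndOfFile
exception SyntaxError

(* Interpret one line of input; NONE stands for end of file. *)
fun parse_line NONE = raise EndOfFile
  | parse_line (SOME line) =
      (case Int.fromString line of
           SOME n => n
         | NONE => raise SyntaxError)

(* Read one line from standard input and interpret it as an integer. *)
fun read_integer () = parse_line (TextIO.inputLine TextIO.stdIn)
```

Note that Int.fromString accepts any string with a valid integer prefix, so this sketch is more permissive than a careful parser would be.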
The reader is encouraged to imagine how one might structure this program
without the use of exceptions. The primary benefits of the exception
mechanism are as follows:
1. They force you to consider the exceptional case (if you don’t, you’ll
get an uncaught exception at run-time), and
2. They allow you to segregate the special case from the normal case in
the code (rather than clutter the code with explicit checks).
These aspects work hand-in-hand to facilitate writing robust programs.
A typical use of exceptions is to implement backtracking, a programming
technique based on exhaustive search of a state space. A very simple,
if somewhat artificial, example is provided by the following function
to compute change from an arbitrary list of coin values. What is at issue
is that the obvious “greedy” algorithm for making change that proceeds
by doling out as many coins as possible in decreasing order of value does
not always work. Given only a 5 cent and a 2 cent coin, we cannot make
16 cents in change by ﬁrst taking three 5’s and then proceeding to dole out
2’s. In fact we must use two 5’s and three 2’s to make 16 cents. Here’s a
method that works for any set of coins:
exception Change
fun change _ 0 = nil
  | change nil _ = raise Change
  | change (coin::coins) amt =
      if coin > amt then
        change coins amt
      else
        (coin :: change (coin::coins) (amt-coin))
        handle Change => change coins amt
The idea is to proceed greedily, but if we get “stuck”, we undo the most
recent greedy decision and proceed again from there. Simulate evaluation
of the example of change [5,2] 16 to see how the code works.
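For readers who want to check their simulation, the sketch below evaluates the example and one amount that cannot be made at all. The bindings c1 and c2 are hypothetical; c2 wraps the result in an option so that failure is visible as NONE:

```sml
exception Change

fun change _ 0 = nil
  | change nil _ = raise Change
  | change (coin::coins) amt =
      if coin > amt then
        change coins amt
      else
        (coin :: change (coin::coins) (amt-coin))
        handle Change => change coins amt

(* Three 5's leave 1 cent, which cannot be made, so Change is raised;
   the handler installed when 6 cents remained retries with 2's only. *)
val c1 = change [5,2] 16                               (* [5,5,2,2,2] *)

(* No combination of 5's and 2's makes 3 cents. *)
val c2 = SOME (change [5,2] 3) handle Change => NONE   (* NONE *)
```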
12.3 Value-Carrying Exceptions
So far exceptions are just “signals” that indicate that an exceptional condition
has arisen. Often it is useful to attach additional information that
is passed to the exception handler. This is achieved by attaching values to
exceptions.
For example, we might associate with a SyntaxError exception a string
indicating the precise nature of the error. In a parser for a language we
might write something like
raise SyntaxError "Integer expected"
to indicate a malformed expression in a situation where an integer is expected,
and write
raise SyntaxError "Identifier expected"
to indicate a badly-formed identifier.
To associate a string with the exception SyntaxError, we declare it as
exception SyntaxError of string.
This declaration introduces the exception constructor SyntaxError, whose
exceptions carry a string as value.
Exception constructors are in many ways similar to value constructors.
In particular they can be used in patterns, as in the following code fragment:
... handle SyntaxError msg => print ("Syntax error: " ^ msg)
Here we specify a pattern for SyntaxError exceptions that also binds the
string associated with the exception to the identiﬁer msg and prints that
string along with an error indication.
Recall that we may use value constructors in two ways:
1. We may use them to create values of a datatype (perhaps by applying
them to other values).
2. We may use them to match values of a datatype (perhaps also matching
a constituent value).
The situation with exception constructors is symmetric.
1. We may use them to create an exception (perhaps with an associated
value).
2. We may use them to match an exception (perhaps also matching the
associated value).
Value constructors have types, as we previously mentioned. For example,
the list constructors nil and :: have types 'a list and 'a * 'a list
-> 'a list, respectively. What about exception constructors? A “bare”
exception constructor (such as Factorial above) has type exn and a
value-carrying exception constructor (such as SyntaxError) has type
string -> exn. Thus Factorial is a value of type exn, and
SyntaxError "Integer expected"
is a value of type exn.
The type exn is the type of exception packets, the data values associated
with exceptions. The primitive operation raise takes any value of type
exn as argument and raises an exception with that value. The clauses of
a handler may be applied to any value of type exn using the rules of
pattern matching described earlier; if an exception constructor is no
longer in scope, then the handler cannot catch it (other than via a
wildcard pattern).
The type exn may be thought of as a kind of built-in datatype, except
that the constructors of this type are not determined once and for all (as
they are with a datatype declaration), but rather are incrementally
introduced as needed in a program. For this reason the type exn is
sometimes called an extensible datatype.
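The following sketch illustrates the point that exception packets are ordinary values of type exn. The declarations of Factorial and SyntaxError are repeated so that the fragment stands alone, and the function describe is a hypothetical name introduced for illustration.

```sml
exception Factorial
exception SyntaxError of string

(* Constructors are ordinary values: a bare one has type exn,
   a value-carrying one is a function into exn. *)
val e1 : exn = Factorial
val mk : string -> exn = SyntaxError
val e2 : exn = mk "Integer expected"

(* Packets can be stored in a list and matched like datatype values.
   The wildcard clause is needed because exn has no fixed set of
   constructors. *)
fun describe (Factorial : exn) : string = "Factorial"
  | describe (SyntaxError msg) = "SyntaxError: " ^ msg
  | describe _ = "some other exception"

val descriptions = map describe [e1, e2, Div]
```

Here Div is the exception constructor that the Basis Library provides for division by zero; it falls through to the wildcard clause.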
12.4 Sample Code
Here is the code for this chapter.
Chapter 13
Mutable Storage
In this chapter we consider a second form of effect, called a storage effect,
the allocation or mutation of storage during evaluation. The introduction
of storage effects has profound consequences, not all of which are desirable.
(Indeed, one connotation of the phrase side effect is an unintended
consequence of a medication!) While it is excessive to dismiss storage
effects as completely undesirable, it is advantageous to minimize the use of
storage effects except in situations where the task clearly demands them.
We will explore some techniques for programming with storage effects
later in this chapter, but ﬁrst we introduce the primitive mechanisms for
programming with mutable storage in ML.
13.1 Reference Cells
To support mutable storage the execution model that we described in
chapter 2 is enriched with a memory consisting of a finite set of mutable cells. A
mutable cell may be thought of as a container in which a data value of a
speciﬁed type is stored. During execution of a program the contents of
a cell may be retrieved or replaced by any other value of the appropriate
type. Since cells are used by issuing “commands” to modify and retrieve
their contents, programming with cells is called imperative programming.
Changing the contents of a mutable cell introduces a temporal aspect
to evaluation. We speak of the current contents of a cell, meaning the value
most recently assigned to it. We also speak of previous and future values of a
reference cell when discussing the behavior of a program. This is in sharp
contrast to the effect-free fragment of ML, for which no such concepts
apply. For example, the binding of a variable does not change while
evaluating within the scope of that variable, lending a “permanent” quality
to statements about variables — the “current” binding is the only binding
that variable will ever have.
The type typ ref is the type of reference cells containing values of type
typ. Reference cells are, like all values, ﬁrst class — they may be bound
to variables, passed as arguments to functions, returned as results of
functions, appear within data structures, and even be stored within other
reference cells.
A reference cell is created, or allocated, by the function ref of type typ ->
typ ref. When applied to a value val of type typ, ref allocates a “new” cell,
initializes its content to val, and returns a reference to the cell. By “new”
we mean that the allocated cell is distinct from all other cells previously
allocated, and does not share storage with them.
The contents of a cell of type typ is retrieved using the function ! of type
typ ref -> typ. Applying ! to a reference cell yields the current contents
of that cell. The contents of a cell is changed by applying the assignment
operator op :=, which has type typ ref * typ -> unit. Assignment is
usually written using infix syntax. When applied to a cell and a value, it
replaces the content of that cell with that value, and yields the null-tuple
as result.
Here are some examples:
val r = ref 0
val s = ref 0
val _ = r := 3
val x = !s + !r
val t = r
val _ = t := 5
val y = !s + !r
val z = !t + !r
After execution of these bindings, the variable x is bound to 3, the variable
y is bound to 5, and z is bound to 10.
Notice the use of a val binding of the form val _ = exp when exp is
to be evaluated purely for its effect. The value of exp is discarded by the
binding, since the left-hand side is a wildcard pattern. In most cases the
expression exp has type unit, so that its value is guaranteed to be the
null-tuple, (), if it has a value at all.
A wildcard binding is used to define sequential composition of
expressions in ML. The expression
exp1 ; exp2
is shorthand for the expression
let
  val _ = exp1
in
  exp2
end
that first evaluates exp1 for its effect, then evaluates exp2.
Functions of type typ -> unit are sometimes called procedures, because
they are executed purely for their effect. This is apparent from the type: it
is assured that the value of applying such a function is the null-tuple, (),
so the only point of applying it is for its effects on memory.
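As a sketch of these two ideas together, here is a small procedure used once with sequential composition and once with the equivalent wildcard bindings. The names trace and record are hypothetical.

```sml
val trace : string list ref = ref []

(* A procedure: the result type unit says it is run only for effect. *)
fun record (s : string) : unit = trace := s :: !trace

(* Sequencing with ";" ... *)
val a = (record "first"; record "second"; length (!trace))

(* ... and the same shape written with wildcard bindings. *)
val b =
  let
    val _ = record "third"
    val _ = record "fourth"
  in
    length (!trace)
  end
```

After these bindings a is 2 and b is 4, and the cell bound to trace holds the four strings, most recent first.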
13.2 Reference Patterns
It is a common mistake to omit the exclamation point when referring to
the content of a reference, especially when that cell is bound to a variable.
In more familiar languages such as C all variables are implicitly bound
to reference cells, and they are implicitly dereferenced whenever they are
used so that a variable always stands for its current contents. This is both a
boon and a bane. It is obviously helpful in many common cases since it
alleviates the burden of having to explicitly dereference variables whenever
their content is required. However, it shifts the burden to the programmer
in the case that the address, and not the content, is intended. In C
one writes &x for the address of (the cell bound to) x. Whether explicit or
implicit dereferencing is preferable is to a large extent a matter of taste.
The burden of explicit dereferencing is not nearly so onerous in ML as
it might be in other languages simply because reference cells are used so
infrequently in ML programs, whereas they are the sole means of binding
variables in more familiar languages.
An alternative to explicitly dereferencing cells is to use ref patterns. A
pattern of the form ref pat matches a reference cell whose content matches
the pattern pat. This means that the cell’s contents are implicitly retrieved
during pattern matching, and may be subsequently used without explicit
dereferencing. In fact, the function ! may be deﬁned using a ref pattern
as follows:
fun !(ref a) = a
When ! is called with a reference cell, the cell is dereferenced and its
contents are bound to a, which is returned as the result. In practice it is
common to use both explicit dereferencing and ref patterns, depending on
the situation.
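To compare the two styles side by side, here is a sketch of one function written both ways. The names bump1 and bump2 are hypothetical.

```sml
(* Explicit dereferencing. *)
fun bump1 (r : int ref) : unit = r := !r + 1

(* The same function with a ref pattern: n is bound to the current
   contents, and the layered pattern "r as ..." keeps the cell itself
   available for the assignment. *)
fun bump2 (r as ref n) : unit = r := n + 1

val cell = ref 10
val _ = bump1 cell
val _ = bump2 cell
val now = !cell    (* 12 *)
```

Note that n is bound once, when the pattern is matched; it does not track later assignments to the cell.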
13.3 Identity
Reference cells raise delicate issues of equality that considerably
complicate reasoning about programs. In general we say that two expressions
(of the same type) are equal iff they cannot be distinguished by any
operation in the language. That is, two expressions are distinct iff there
is some way within the language to tell them apart. This is called
Leibniz’s Principle of identity of indiscernibles (we equate everything
that we cannot tell apart) and the indiscernibility of identicals (that
which we deem equal cannot be told apart).
What makes Leibniz’s Principle tricky to grasp is that it hinges on what
we mean by a “way to tell expressions apart”. The crucial idea is that we
can tell two expressions apart iff there is a complete program containing one
of the expressions whose observable behavior changes when we replace that
expression by the other. That is, two expressions are considered equal iff
there is no such scenario that distinguishes them. But what do we mean by
“complete program”? And what do we mean by “observable behavior”?
For the present purposes we will consider a complete program to be
any expression of basic type (say, int or bool or string). The idea is that a
complete program is one that computes a concrete result such as a number.
The observable behavior of a complete program includes at least these
aspects:
1. Its value, or lack thereof, either by nontermination or by raising an
uncaught exception.
2. Its visible side effects, including visible modifications to mutable
storage or any input/output it may perform.
In contrast here are some behaviors that we will not count as observations:
1. Execution time or space usage.
2. “Private” uses of storage (e.g., internally allocated reference cells).
3. The name of uncaught exceptions (i.e., we will not distinguish between
terminating with the uncaught exception Bind and the uncaught exception
Match).
With these ideas in mind, it should be plausible that if we evaluate
these bindings
val r = ref 0
val s = ref 0
then r and s are not equivalent. Consider the following usage of r to
compute an integer result:
(s := 1 ; !r)
Clearly this expression evaluates to 0, and mutates the cell bound to s.
Now replace r by s to obtain
(s := 1 ; !s)
This expression evaluates to 1, and mutates s as before. These two
complete programs distinguish r from s, and therefore r and s must be
considered distinct.
Had we replaced the binding for s by the binding
val s = r
then the two expressions that formerly distinguished r from s no longer
do so; they are, after all, bound to the same reference cell! In fact, no
program can be concocted that would distinguish them. In this case r and
s are equivalent.
Now consider a third, very similar scenario. Let us declare r and s as
follows:
val r = ref ()
val s = ref ()
Are r and s equivalent or not? We might ﬁrst try to distinguish them by
a variant of the experiment considered above. This breaks down because
there is only one possible value we can assign to a variable of type unit
ref! Indeed, one may suspect that r and s are equivalent in this case,
but in fact there is a way to distinguish them! Here’s a complete program
involving r that we will use to distinguish r from s:
if r=r then "it’s r" else "it’s not"
Now replace the ﬁrst occurrence of r by s to obtain
if s=r then "it’s r" else "it’s not"
and the result is different.
This example hinges on the fact that ML deﬁnes equality for values
of reference type to be reference equality (or, occasionally, pointer equality).
Two reference cells (of the same type) are equal in this sense iff they both
arise from the exact same use of the ref operation to allocate that cell;
otherwise they are distinct. Thus the two cells bound to r and s above are
observably distinct (by testing reference equality), even though they can
only ever hold the value (). Had equality not been included as a primitive,
any two reference cells of unit type would have been equal.
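The distinction between equal cells and equal contents can be checked directly. The following sketch adds a third variable t, which is not in the text, to show an alias alongside the two separately allocated cells.

```sml
val r = ref ()
val s = ref ()
val t = r            (* t is another name for the cell bound to r *)

val same_r_r = (r = r)            (* true: the same allocation *)
val same_r_s = (r = s)            (* false: two separate uses of ref *)
val same_r_t = (r = t)            (* true: t and r name one cell *)
val same_contents = (!r = !s)     (* true: both cells hold () *)
```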
Why does ML provide such a fine-grained notion of equality? “True”
equality, as defined by Leibniz’s Principle, is, unfortunately, undecidable:
there is no computer program that determines whether two expressions
are equivalent in this sense. ML provides a useful, conservative
approximation to true equality that in some cases is not defined (you cannot
test two functions for equality) and in other cases is too picky (it
distinguishes reference cells that are otherwise indistinguishable). Such is life.
13.4 Aliasing
To see how reference cells complicate programming, let us consider the
problem of aliasing. Any two variables of the same reference type might
be bound to the same reference cell, or to two different reference cells. For
example, after the declarations
val r = ref 0
val s = ref 0
the variables r and s are not aliases, but after the declarations
val r = ref 0
val s = r
the variables r and s are aliases for the same reference cell.
These examples show that we must be careful when programming
with variables of reference type. This is particularly problematic in the
case of functions, because we cannot assume that two different argument
variables are bound to different reference cells. They might, in fact, be
bound to the same reference cell, in which case we say that the two
variables are aliases for one another. For example, in a function of the form
fn (x:typ ref, y:typ ref) => exp
we may not assume that x and y are bound to different reference cells. We
must always ask ourselves whether we’ve properly considered aliasing
when writing such a function. This is harder to do than it sounds. Aliasing
is a huge source of bugs in programs that work with reference cells.
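Here is a sketch of the kind of bug that aliasing causes. The function swap is a hypothetical example, not taken from the text: it exchanges the contents of two integer cells using arithmetic, which is correct only when its arguments are different cells.

```sml
(* Intended to exchange the contents of x and y without a temporary.
   The arithmetic silently assumes that x and y are distinct cells. *)
fun swap (x : int ref, y : int ref) : unit =
  (x := !x + !y;
   y := !x - !y;
   x := !x - !y)

val a = ref 3
val b = ref 4
val _ = swap (a, b)          (* a holds 4, b holds 3, as intended *)

val c = ref 7
val _ = swap (c, c)          (* aliased arguments: c ends up holding 0 *)
```

Nothing in the type of swap rules out the second call, which is why the question of aliasing has to be asked by the programmer.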
13.5 Programming Well With References
Using references it is possible to mimic the style of programming used in
imperative languages such as C. For example, we might define the
factorial function in imitation of such languages as follows:
fun imperative_fact (n:int) =
  let
    val result = ref 1
    val i = ref 0
    fun loop () =
      if !i = n then
        ()
      else
        (i := !i + 1;
         result := !result * !i;
         loop ())
  in
    loop (); !result
  end
Notice that the function loop is essentially just a while loop; it repeatedly
executes its body until the contents of the cell bound to i reaches n. The
tail call to loop is essentially just a goto statement to the top of the loop.
It is (appallingly) bad style to program in this fashion. The purpose of the
function imperative_fact is to compute a simple function on the natural
numbers. There is nothing about its definition that suggests that state
must be maintained, and so it is senseless to allocate and modify storage
to compute it. The definition we gave earlier is shorter, simpler, more
efficient, and hence more suitable to the task. This is not to suggest,
however, that there are no good uses of references. We will now discuss
some important uses of state in ML.
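For comparison, here is a sketch that places an effect-free definition next to the reference-based one and checks that they agree on a few inputs. The clausal definition of fact is assumed to match the earlier definition in spirit; it is not copied from it.

```sml
(* An effect-free definition, assumed similar to the earlier one. *)
fun fact 0 = 1
  | fact n = n * fact (n - 1)

(* The reference-based version, repeated so the fragment stands alone. *)
fun imperative_fact (n : int) =
  let
    val result = ref 1
    val i = ref 0
    fun loop () =
      if !i = n then ()
      else (i := !i + 1; result := !result * !i; loop ())
  in
    loop (); !result
  end

val agree = List.all (fn n => fact n = imperative_fact n) [0, 1, 2, 5, 10]
```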
13.5.1 Private Storage
The first example is the use of higher-order functions to manage shared
private state. This programming style is closely related to the use of
objects to manage state in object-oriented programming languages. Here’s
an example to frame the discussion:
local
val counter = ref 0
in
fun tick () = (counter := !counter + 1; !counter)
fun reset () = (counter := 0)
end
This declaration introduces two functions, tick of type unit -> int and
reset of type unit -> unit. Their definitions share a private variable
counter that is bound to a mutable cell containing the current value of
a shared counter. The tick operation increments the counter and returns
its new value, and the reset operation resets its value to zero. The types
of the operations suggest that implicit state is involved. In the absence
of exceptions and implicit state, there is only one useful function of type
unit -> unit, namely the function that always returns its argument (and
it’s debatable whether this is really useful!).
The declaration above deﬁnes two functions, tick and reset, that share
a single private counter. Suppose now that we wish to have several
different instances of a counter, that is, different pairs of functions tick and reset that
share different state. We can achieve this by deﬁning a counter generator (or
constructor) as follows:
fun new_counter () =
  let
    val counter = ref 0
    fun tick () = (counter := !counter + 1; !counter)
    fun reset () = (counter := 0)
  in
    { tick = tick, reset = reset }
  end
The type of new_counter is
unit -> { tick : unit -> int, reset : unit -> unit }.
We’ve packaged the two operations into a record containing two functions
that share private state. There is an obvious analogy with class-based
object-oriented programming. The function new_counter may be thought
of as a constructor for a class of counter objects. Each object has a
private instance variable counter that is shared between the methods tick
and reset of the object represented as a record with two fields.
Here’s how we use counters.
val c1 = new_counter ()
val c2 = new_counter ()
#tick c1 ();
(* 1 *)
#tick c1 ();
(* 2 *)
#tick c2 ();
(* 1 *)
#reset c1 ();
#tick c1 ();
(* 1 *)
#tick c2 ();
(* 2 *)
Notice that c1 and c2 are distinct counters that increment and reset
independently of one another.
13.5.2 Mutable Data Structures
A second important use of references is to build mutable data structures.
The data structures (such as lists and trees) we’ve considered so far are
immutable in the sense that it is impossible to change the structure of the
list or tree without building a modiﬁed copy of that structure. This is
both a beneﬁt and a drawback. The principal beneﬁt is that immutable
data structures are persistent in that operations performed on them do not
destroy the original structure — in ML we can eat our cake and have it too.
For example, we can simultaneously maintain a dictionary both before
and after insertion of a given word. The principal drawback is that if we
aren’t really relying on persistence, then it is wasteful to make a copy of a
structure if the original is going to be discarded anyway. What we’d like
in this case is to have an “update in place” operation to build an ephemeral
(opposite of persistent) data structure. To do this in ML we make use of
references.
A simple example is the type of possibly circular lists, or pcl’s. Informally,
a pcl is a ﬁnite graph in which every node has at most one neighbor, called
its predecessor, in the graph. In contrast to ordinary lists the predecessor
relation is not necessarily well-founded: there may be an infinite sequence
of nodes arranged in descending order of predecession. Since the graph
is finite, this can only happen if there is a cycle in the graph: some node
has an ancestor as predecessor. How can such a structure ever come into
existence? If the predecessors of a cell are needed to construct a cell, then
the ancestor that is to serve as predecessor in the cyclic case can never be
created! The “trick” is to employ backpatching: the predecessor is
initialized to Nil, so that the node and its ancestors can be constructed,
then it is reset to the appropriate ancestor to create the cycle.
This can be achieved in ML using the following datatype declaration:
datatype 'a pcl = Pcl of 'a pcell ref
and 'a pcell = Nil | Cons of 'a * 'a pcl;
A value of type typ pcl is essentially a reference to a value of type typ
pcell. A value of type typ pcell is either Nil, marking the end of a
possibly circular list that happens not to be circular, or Cons (h, t), where h
is a value of type typ and t is another such possibly circular list.
Here are some convenient functions for creating and taking apart possibly
circular lists:
fun cons (h, t) = Pcl (ref (Cons (h, t)));
fun nill () = Pcl (ref Nil);
fun phd (Pcl (ref (Cons (h, _)))) = h;
fun ptl (Pcl (ref (Cons (_, t)))) = t;
To implement backpatching, we need a way to “zap” the tail of a possibly
circular list.
fun stl (Pcl (r as ref (Cons (h, _))), u) =
  (r := Cons (h, u));
If you’d like, it would make sense to require that the tail of the Cons cell
be the empty pcl, so that you’re only allowed to backpatch at the end of a
ﬁnite pcl.
Here is a ﬁnite and an inﬁnite pcl.
val finite = cons (4, cons (3, cons (2, cons (1, nill ()))))
val tail = cons (1, nill());
val infinite = cons (4, cons (3, cons (2, tail)));
val _ = stl (tail, infinite)
The last step backpatches the tail of the last cell of infinite to be infinite
itself, creating a circular list.
Now let us deﬁne the size of a pcl to be the number of distinct nodes
occurring in it. It is an interesting problem to define a size function
for pcls that makes no use of auxiliary storage (e.g., no set of previously
encountered nodes) and runs in time proportional to the number of cells in
the pcl. The idea is to think of running a long race between a tortoise and a
hare. If the course is circular, then the hare, which quickly runs out ahead
of the tortoise, will eventually come from behind and pass it! Conversely,
if this happens, the course must be circular.
local
  fun race (Nil, Nil) = 0
    | race (Cons (_, Pcl (ref c)), Nil) =
      1 + race (c, Nil)
    | race (Cons (_, Pcl (ref c)), Cons (_, Pcl (ref Nil))) =
      1 + race (c, Nil)
    | race (Cons (_, l), Cons (_, Pcl (ref (Cons (_, m))))) =
      1 + race' (l, m)
  and race' (Pcl (r as ref c), Pcl (s as ref d)) =
    if r=s then 0 else race (c, d)
in
  fun size (Pcl (ref c)) = race (c, c)
end
The hare runs twice as fast as the tortoise. We let the tortoise do the
counting; the hare’s job is simply to detect cycles. If the hare reaches the
finish line, it simply waits for the tortoise to finish counting. This covers
the first three clauses of race. If the hare has not yet finished, we must
continue with the hare running at twice the pace, checking whether the hare
catches the tortoise from behind. Notice that it can never arise that the
tortoise reaches the end before the hare does! Consequently, the definition
of race is inexhaustive.
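To see the race at work, the pieces can be assembled into one self-contained fragment and applied to a finite and a circular pcl. This is a sketch: the catch-all clause of stl, which raises Fail, is an addition made here so that the match is exhaustive.

```sml
datatype 'a pcl = Pcl of 'a pcell ref
and 'a pcell = Nil | Cons of 'a * 'a pcl

fun cons (h, t) = Pcl (ref (Cons (h, t)))
fun nill () = Pcl (ref Nil)

(* Overwrite the tail of a Cons cell in place (backpatching). *)
fun stl (Pcl (r as ref (Cons (h, _))), u) = r := Cons (h, u)
  | stl _ = raise Fail "stl: not a Cons cell"   (* added for exhaustiveness *)

local
  (* First argument: the tortoise, which counts.
     Second argument: the hare, which looks for the end or a cycle. *)
  fun race (Nil, Nil) = 0
    | race (Cons (_, Pcl (ref c)), Nil) = 1 + race (c, Nil)
    | race (Cons (_, Pcl (ref c)), Cons (_, Pcl (ref Nil))) =
        1 + race (c, Nil)
    | race (Cons (_, l), Cons (_, Pcl (ref (Cons (_, m))))) =
        1 + race' (l, m)
  and race' (Pcl (r as ref c), Pcl (s as ref d)) =
    if r = s then 0 else race (c, d)
in
  fun size (Pcl (ref c)) = race (c, c)
end

val finite = cons (4, cons (3, cons (2, cons (1, nill ()))))
val tail = cons (1, nill ())
val infinite = cons (4, cons (3, cons (2, tail)))
val _ = stl (tail, infinite)

val n_finite = size finite        (* 4 *)
val n_infinite = size infinite    (* 4: the cycle has four distinct nodes *)
```

A compiler will typically warn that race is not exhaustive; as explained above, the missing case cannot arise.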
13.6 Mutable Arrays
In addition to reference cells, ML also provides mutable arrays as a
primitive data structure. The type typ array is the type of arrays carrying
values of type typ. The basic operations on arrays are these:
val array : int * 'a -> 'a array
val length : 'a array -> int
val sub : 'a array * int -> 'a
val update : 'a array * int * 'a -> unit
The function array creates a new array of a given length, with the given
value as the initial value of every element of the array. The function length
returns the length of an array. The function sub performs a subscript
operation, returning the ith element of an array A, where 0 ≤ i < length(A).
The function update replaces the ith element of an array with the given
value, subject to the same bounds on i. (These are just the basic operations
on arrays; please see Appendix A for complete information.)
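A short sketch of the four operations, using the Array structure of the Basis Library:

```sml
val a : int array = Array.array (5, 0)      (* five cells, all holding 0 *)

val _ = Array.update (a, 2, 42)
val n = Array.length a                       (* 5 *)
val x = Array.sub (a, 2)                     (* 42 *)
val y = Array.sub (a, 0)                     (* 0 *)

(* An index outside 0 <= i < length raises the Basis exception Subscript. *)
val outOfRange = (Array.sub (a, 5); false) handle Subscript => true
```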
One simple use of arrays is for memoization. Here’s a function to compute
the nth Catalan number, which may be thought of as the number
of distinct ways to parenthesize an arithmetic expression consisting of a
sequence of n consecutive multiplications. It makes use of an auxiliary
summation function that you can easily define for yourself. (Applying
sum to f and n computes the sum f 1 + ... + f n.)
fun C 1 = 1
  | C n = sum (fn k => (C k) * (C (n-k))) (n-1)
This definition of C is hugely inefficient because a given computation may
be repeated exponentially many times. For example, to compute C 10 we
must compute C 1, C 2, ..., C 9, and the computation of C i engenders the
computation of C 1, ..., C (i-1) for each 1 ≤ i ≤ 9. We can do better by
caching previously computed results in an array, leading to an enormous
improvement in execution speed. Here’s the code:
local
  val limit : int = 100
  val memopad : int option array =
    Array.array (limit, NONE)
in
  fun C' 1 = 1
    | C' n = sum (fn k => (C k) * (C (n-k))) (n-1)
  and C n =
    if n < limit then
      case Array.sub (memopad, n)
        of SOME r => r
         | NONE =>
           let
             val r = C' n
           in
             Array.update (memopad, n, SOME r);
             r
           end
    else
      C' n
end
Note carefully the structure of the solution. The function C is a memoized
version of the Catalan number function. When called it consults the
memopad to determine whether or not the required result has already been
computed. If so, the answer is simply retrieved from the memopad;
otherwise the result is computed, stored in the cache, and returned. The
function C' looks superficially similar to the earlier definition of C, with the
important difference that the recursive calls are to C, rather than C' itself.
This ensures that subcomputations are properly cached and that the cache
is consulted whenever possible.
The main weakness of this solution is that we must ﬁx an upper bound
on the size of the cache. This can be alleviated by implementing a more
sophisticated cache management scheme that dynamically adjusts the size
of the cache based on the calls made to it.
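To run the memoized function we need the auxiliary function sum, which the text leaves to the reader. The sketch below supplies one possible definition and checks the memoized C against the unmemoized version, renamed slowC here to avoid a clash of names.

```sml
(* sum f n computes f 1 + ... + f n; one possible definition. *)
fun sum f 0 = 0
  | sum f n = f n + sum f (n - 1)

(* The unmemoized definition, renamed slowC for this comparison. *)
fun slowC 1 = 1
  | slowC n = sum (fn k => slowC k * slowC (n - k)) (n - 1)

local
  val limit : int = 100
  val memopad : int option array = Array.array (limit, NONE)
in
  fun C' 1 = 1
    | C' n = sum (fn k => (C k) * (C (n - k))) (n - 1)
  and C n =
    if n < limit then
      case Array.sub (memopad, n) of
          SOME r => r
        | NONE =>
            let
              val r = C' n
            in
              Array.update (memopad, n, SOME r);
              r
            end
    else
      C' n
end

val firstFew = List.map C [1, 2, 3, 4, 5, 6]    (* [1, 1, 2, 5, 14, 42] *)
val agrees = (C 10 = slowC 10)
```

Only small arguments are used here; for larger ones the results exceed the range of fixed-precision integers long before the cache limit matters.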
13.7 Sample Code
Here is the code for this chapter.
Chapter 14
Input/Output
The Standard ML Basis Library (described in Appendix A) defines a
three-layer input and output facility for Standard ML. These modules provide
a rudimentary, platform-independent text I/O facility that we summarize
briefly here. The reader is referred to Appendix A for more details.
Unfortunately, there is at present no standard library for graphical user
interfaces; each implementation provides its own package. See your compiler’s
documentation for details.
14.1 Textual Input/Output
The text I/O primitives are based on the notions of an input stream and an
output stream, which are values of type instream and outstream,
respectively. An input stream is an unbounded sequence of characters arising
from some source. The source could be a disk ﬁle, an interactive user, or
another program (to name a few choices). Any source of characters can
be attached to an input stream. An input stream may be thought of as
a buffer containing zero or more characters that have already been read
from the source, together with a means of requesting more input from the
source should the program require it. Similarly, an output stream is an
unbounded sequence of characters leading to some sink. The sink could be a
disk ﬁle, an interactive user, or another program (to name a few choices).
Any sink for characters can be attached to an output stream. An output
stream may be thought of as a buffer containing zero or more characters
that have been produced by the program but have yet to be ﬂushed to the
sink.
Each program comes with one input stream and one output stream,
called stdIn and stdOut, respectively. These are ordinarily connected to
the user’s keyboard and screen, and are used for performing simple text
I/O in a program. The output stream stdErr is also predeﬁned, and is
used for error reporting. It is ordinarily connected to the user’s screen.
Textual input and output are performed on streams using a variety of
primitives. The simplest are inputLine and print. To read a line of input
from a stream, use the function inputLine of type instream -> string
option. It reads a line of input from the given stream and yields SOME of
that line as a string whose last character is the line terminator. If the
source is exhausted, it yields NONE. (Earlier versions of the Basis Library
gave inputLine the type instream -> string and returned the empty string
on exhaustion.) To write a line to stdOut, use the function print of type
string -> unit. To write to a specific stream, use the function output of
type outstream * string -> unit, which writes the given string to the
specified output stream. For interactive applications it is often
important to ensure that the output stream is flushed to the sink (e.g.,
so that it is displayed on the screen). This is achieved by calling
flushOut of type outstream -> unit. The print function is a composition of
output (to stdOut) and flushOut.
A new input stream may be created by calling the function openIn of
type string -> instream. When applied to a string, the system attempts
to open a file with that name (according to operating system-specific
naming conventions) and attaches it as a source to a new input stream.
Similarly, a new output stream may be created by calling the function
openOut of type string -> outstream. When applied to a string, the system
attempts to create a file with that name (according to operating
system-specific naming conventions) and attaches it as a sink for a new
output stream. An input stream may be closed using the function closeIn
of type instream -> unit. A closed input stream behaves as if there is no
further input available; requests for input from a closed input stream
yield the empty string. An output stream may be closed using closeOut of
type outstream -> unit. A closed output stream is unavailable for further
output; an attempt to write to a closed output stream raises the exception
IO.Io.
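The following sketch writes two lines to a file and reads them back. It assumes the current Basis Library, in which these functions live in the structure TextIO and inputLine yields a string option; the temporary file name comes from OS.FileSys.tmpName, and readLines is a hypothetical helper.

```sml
val path = OS.FileSys.tmpName ()

(* Write two lines and close the stream so the output reaches the file. *)
val out = TextIO.openOut path
val _ = TextIO.output (out, "first line\n")
val _ = TextIO.output (out, "second line\n")
val _ = TextIO.closeOut out

(* Read every remaining line of a stream into a list. *)
fun readLines (ins : TextIO.instream) : string list =
  case TextIO.inputLine ins of
      SOME line => line :: readLines ins
    | NONE => []

val ins = TextIO.openIn path
val lines = readLines ins
val atEnd = TextIO.endOfStream ins
val _ = TextIO.closeIn ins
val _ = OS.FileSys.remove path
```

Each string in lines keeps its line terminator, as described above.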
The function input of type instream -> string is a blocking read
operation that returns a string consisting of the characters currently available
from the source. If none are currently available, but the end of source has
not been reached, then the operation blocks until at least one character is
available from the source. If the source is exhausted or the input stream
is closed, input returns the null string. To test whether an input
operation would block, use the function canInput of type instream * int ->
int option. Given a stream s and a bound n, the function canInput
determines whether or not a call to input on s would immediately yield up
to n characters. If the input operation would block, canInput yields NONE;
otherwise it yields SOME k, with 0 ≤ k ≤ n being the number of characters
immediately available on the input stream. If canInput yields SOME 0, the
stream is either closed or exhausted. The function endOfStream of type
instream -> bool tests whether the input stream is currently at the end
(no further input is available from the source). This condition is
transient since, for example, another process might append data to an open
file in between calls to endOfStream.
The function output of type outstream * string -> unit writes a string
to an output stream. It may block until the sink is able to accept the
entire string. The function flushOut of type outstream -> unit forces any
pending output to the sink, blocking until the sink accepts the remaining
buffered output.
This collection of primitive I/O operations is sufficient for performing
rudimentary textual I/O. For further information on textual I/O, and
support for binary I/O and Posix I/O primitives, see the Standard ML Basis
Library.
14.2 Sample Code
Here is the code for this chapter.
Chapter 15
Lazy Data Structures
In ML all variables are bound by value, which means that the bindings of
variables are fully evaluated expressions, or values. This general principle
has several consequences:
1. The right-hand side of a val binding is evaluated before the binding
is effected. If the right-hand side has no value, the val binding does
not take effect.
2. In a function application the argument is evaluated before being passed
to the function by binding that value to the parameter of the function.
If the argument does not have a value, then neither does the
application.
3. The arguments to value constructors are evaluated before the
constructed value is created.
According to the by-value discipline, the bindings of variables are
evaluated, regardless of whether that variable is ever needed to complete
execution. For example, to compute the result of applying the function fn
x => 1 to an argument, we never actually need to evaluate the argument,
but we do anyway. For this reason ML is sometimes said to be an eager
language.
An alternative is to bind variables by name,[1] which means that the
binding of a variable is an unevaluated expression, known as a computation
or a suspension or a thunk.[2] This principle has several consequences:
1. The right-hand side of a val binding is not evaluated before the
binding is effected. The variable is bound to a computation (unevaluated
expression), not a value.
2. In a function application the argument is passed to the function in
unevaluated form by binding it directly to the parameter of the function.
This holds regardless of whether the argument has a value or not.
3. The arguments to value constructors are left unevaluated when the
constructed value is created.
[1] The terminology is historical, and not well-motivated. It is, however,
firmly established.
According to the by-name discipline, the bindings of variables are only
evaluated (if ever) when their values are required by a primitive operation.
For example, to evaluate the expression x+x, it is necessary to evaluate the
binding of x in order to perform the addition. Languages that adopt the
by-name discipline are, for this reason, said to be lazy.
This discussion glosses over another important aspect of lazy evaluation,
called memoization. In actual fact laziness is based on a refinement
of the by-name principle, called the by-need principle. According to the
by-name principle, variables are bound to unevaluated computations, and are
evaluated as often as the value of that variable’s binding is required
to complete the computation. In particular, to evaluate the expression x+x
the value of the binding of x is needed twice, and hence it is evaluated
twice. According to the by-need principle, the binding of a variable is
evaluated at most once: not at all, if it is never needed, and exactly once if
it is ever needed at all. Re-evaluation of the same computation is avoided by
memoization. Once a computation is evaluated, its value is saved for future
reference should that computation ever be needed again.
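The difference between by-name and by-need can be sketched in ordinary Standard ML, without any lazy extension, by representing a suspension explicitly. This is only an approximation under stated assumptions: the type cell and the functions delay and force are hypothetical helpers written for this example, not part of the language or of the SML/NJ lazy features.

```sml
(* By-name: a suspension is a function of unit; every use re-runs it. *)
val nameRuns = ref 0
fun x () = (nameRuns := !nameRuns + 1; 21)
val byName = x () + x ()          (* the body of x runs twice *)

(* By-need: a suspension is a reference cell that caches its value. *)
datatype 'a cell = Delayed of unit -> 'a | Forced of 'a

fun delay (f : unit -> 'a) : 'a cell ref = ref (Delayed f)

fun force (c : 'a cell ref) : 'a =
  case !c of
      Forced v => v
    | Delayed f =>
        let val v = f () in c := Forced v; v end

val needRuns = ref 0
val y = delay (fn () => (needRuns := !needRuns + 1; 21))
val byNeed = force y + force y    (* the body of y runs once *)
```

Both sums are 42, but the by-name suspension is evaluated twice while the memoized one is evaluated once and then read from the cell.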
The advantages and disadvantages of lazy versus eager languages have
been hotly debated. We will not enter into this debate here, but rather content
ourselves with the observation that laziness is a special case of eagerness.
(Recent versions of) ML have lazy data types that allow us to treat unevaluated
computations as values of such types, allowing us to incorporate
laziness into the language without disrupting its fundamental character
on which so much else depends. This affords the benefits of laziness, but
on a controlled basis: we can use it when it is appropriate, and ignore it
when it is not.
^2 For reasons that are lost in the mists of time.
The main benefit of laziness is that it supports demand-driven computation.
This is useful for representing on-line data structures that are created
only insofar as we examine them. Infinite data structures, such as the
sequence of all prime numbers in order of magnitude, are one example of
an on-line data structure. Clearly we cannot ever “finish” creating the
sequence of all prime numbers, but we can create as much of this sequence
as we need for a given run of a program. Interactive data structures, such
as the sequence of inputs provided by the user of an interactive system,
are another example of on-line data structures. In such a system the user’s
inputs are not pre-determined at the start of execution, but rather are created
“on demand” in response to the progress of computation up to that
point. The demand-driven nature of on-line data structures is precisely
what is needed to model this behavior.
Note: Lazy evaluation is a non-standard feature of ML that is supported
only by the SML/NJ compiler. The lazy evaluation features must be enabled
by executing the following at top level:
Compiler.Control.lazysml := true;
open Lazy;
15.1 Lazy Data Types
SML/NJ provides a general mechanism for introducing lazy data types by
simply attaching the keyword lazy to an ordinary datatype declaration.
The ideas are best illustrated by example. We will focus attention on the
type of infinite streams, which may be declared as follows:
datatype lazy 'a stream = Cons of 'a * 'a stream
Notice that this type definition has no “base case”! Had we omitted the
keyword lazy, such a datatype would not be very useful, since there
would be no way to create a value of that type!
Adding the keyword lazy makes all the difference. Doing so specifies
that the values of type typ stream are computations of values of the form
Cons (val, val'),
where val is of type typ, and val' is another such computation. Notice how
this description captures the “incremental” nature of lazy data structures.
The computation is not evaluated until we examine it. When we do, its
structure is revealed as consisting of an element val together with another
suspended computation of the same type. Should we inspect that computation,
it will again have this form, and so on ad infinitum.
Values of type typ stream are created using a val rec lazy declaration
that provides a means for building a “circular” data structure. Here
is a declaration of the infinite stream of 1’s as a value of type int stream:
val rec lazy ones = Cons (1, ones)
The keyword lazy indicates that we are binding ones to a computation,
rather than a value. The keyword rec indicates that the computation is
recursive (or self-referential or circular). It is the computation whose underlying
value is constructed using Cons (the only possibility) from the integer
1 and the very same computation itself.
We can inspect the underlying value of a computation by pattern matching.
For example, the binding
val Cons (h, t) = ones
extracts the “head” and “tail” of the stream ones. This is performed by
evaluating the computation bound to ones, yielding Cons (1, ones), then
performing ordinary pattern matching to bind h to 1 and t to ones.
Had the pattern been “deeper”, further evaluation would be required,
as in the following binding:
val Cons (h, Cons (h', t')) = ones
To evaluate this binding, we evaluate ones to Cons (1, ones), binding h
to 1 in the process, then evaluate ones again to Cons (1, ones), binding
h' to 1 and t' to ones. The general rule is that pattern matching forces evaluation
of a computation to the extent required by the pattern. This is the means by
which lazy data structures are evaluated only insofar as required.
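Since the lazy keyword is specific to SML/NJ, it may help to see a rough counterpart in plain Standard ML. The sketch below assumes a stream is represented as a function of unit that yields a Cons cell when applied; the type name front is hypothetical, and because nothing is memoized this models by-name rather than by-need evaluation.

```sml
(* A stream is a suspension that yields a Cons cell when forced. *)
datatype 'a front = Cons of 'a * 'a stream
withtype 'a stream = unit -> 'a front

(* Rough counterpart of: val rec lazy ones = Cons (1, ones) *)
fun ones () = Cons (1, ones)

(* Rough counterpart of: val Cons (h, t) = ones.
   Applying ones to () plays the role of forcing the computation. *)
val Cons (h, t) = ones ()

(* A deeper pattern corresponds to a second, explicit force of the tail. *)
val Cons (h', t') = t ()
```

In this encoding each force is written out by hand as an application to (), whereas with a lazy datatype the pattern match performs the forcing implicitly.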
15.2 Lazy Function Deﬁnitions
The combination of (recursive) lazy function definitions and decomposition
by pattern matching are the core mechanisms required to support lazy
evaluation. However, there is a subtlety about function definitions that
requires careful consideration, and a third new mechanism, the lazy function
declaration.
Using pattern matching we may easily define functions over lazy data
structures in a familiar manner. For example, we may define two functions
to extract the head and tail of a stream as follows:
fun shd (Cons (h, _)) = h
fun stl (Cons (_, s)) = s
These are functions that, when applied to a stream, evaluate it, and match
it against the given patterns to extract the head and tail, respectively.
While these functions are surely very natural, there is a subtle issue that
deserves careful discussion. The issue is whether these functions are “lazy
enough”. From one point of view, what we are doing is decomposing a
computation by evaluating it and retrieving its components. In the case
of the shd function there is no other interpretation: we are extracting a
value of type typ from a value of type typ stream, which is a computation
of a value of the form Cons (exp_h, exp_t). We can adopt a similar viewpoint
about stl, namely that it is simply extracting a component value
from a computation of a value of the form Cons (exp_h, exp_t).
However, in the case of stl, another point of view is also possible.
Rather than think of stl as extracting a value from a stream, we may instead
think of it as creating a stream out of another stream. Since streams
are computations, the stream created by stl (according to this view) should
also be suspended until its value is required. Under this interpretation the
argument to stl should not be evaluated until its result is required, rather
than at the time stl is applied. This leads to a variant notion of “tail” that
may be defined as follows:
fun lazy lstl (Cons (_, s)) = s
The keyword lazy indicates that an application of lstl to a stream does
not immediately perform pattern matching on its argument, but rather sets
up a stream computation that, when forced, forces the argument and extracts
the tail of the stream.
The behavior of the two forms of tail function can be distinguished
using print statements as follows:
val rec lazy s = (print "."; Cons (1, s))
val _ = stl s (* prints "." *)
val _ = stl s (* silent *)
val rec lazy s = (print "."; Cons (1, s));
val _ = lstl s (* silent *)
val _ = stl s (* prints "." *)
Since stl evaluates its argument when applied, the “.” is printed when it
is first called, but not if it is called again. However, since lstl only sets
up a computation, its argument is not evaluated when it is called, but only
when its result is evaluated.
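The distinction between the two tail functions can also be sketched in plain Standard ML, again assuming the hypothetical encoding of a stream as a function of unit. A counter named forces (an assumption of this example) replaces the print statement so that the effect can be inspected.

```sml
datatype 'a front = Cons of 'a * 'a stream
withtype 'a stream = unit -> 'a front

(* Counts how many times the stream s below is forced. *)
val forces = ref 0
fun s () = (forces := !forces + 1; Cons (1, s))

(* Eager tail: forces its argument as soon as it is applied. *)
fun stl (str : 'a stream) : 'a stream =
  let val Cons (_, t) = str () in t end

(* Lazy tail: returns a new suspension; the argument is forced
   only when that suspension is itself forced. *)
fun lstl (str : 'a stream) : 'a stream =
  fn () => let val Cons (_, t) = str () in t () end

val _ = lstl s                    (* s is not forced here *)
val afterLstl = !forces
val _ = stl s                     (* s is forced here *)
val afterStl = !forces
```

Because this encoding does not memoize, a second call to stl s would force s again; only the timing of the first force is modelled faithfully.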
15.3 Programming with Streams
Let’s define a function smap that applies a function to every element of a
stream, yielding another stream. The type of smap should be ('a -> 'b)
-> 'a stream -> 'b stream. The thing to keep in mind is that the application
of smap to a function and a stream should set up (but not compute)
another stream that, when forced, forces the argument stream to obtain
the head element, applies the given function to it, and yields this as the
head of the result.
Here’s the code:
fun smap f =
  let
    fun lazy loop (Cons (x, s)) =
      Cons (f x, loop s)
  in
    loop
  end
We have “staged” the computation so that the partial application of smap
to a function yields a function that loops over a given stream, applying
the given function to each element. This loop is a lazy function to ensure
that it merely sets up a stream computation, rather than evaluating
its argument when it is called. Had we dropped the keyword lazy from
the definition of the loop, then an application of smap to a function and a
stream would immediately force the computation of the head element of
the stream, rather than merely set up a future computation of the same
result.
To illustrate the use of smap, here’s a definition of the infinite stream
of natural numbers:
val one_plus = smap (fn n => n+1)
val rec lazy nats = Cons (0, one_plus nats)
Now let’s define a function sfilter of type
('a -> bool) -> 'a stream -> 'a stream
that filters out all elements of a stream that do not satisfy a given predicate.
fun sfilter pred =
  let
    fun lazy loop (Cons (x, s)) =
      if pred x then
        Cons (x, loop s)
      else
        loop s
  in
    loop
  end
We can use sfilter to define a function sieve that, when applied to a
stream of numbers, retains only those numbers that are not divisible by a
preceding number in the stream:
fun m mod n = m - n * (m div n)
fun divides m n = n mod m = 0
fun lazy sieve (Cons (x, s)) =
  Cons (x, sieve (sfilter (not o (divides x)) s))
(This example uses o for function composition.)
We may now define the infinite stream of primes by applying sieve to
the natural numbers greater than or equal to 2:
val nats2 = stl (stl nats)
val primes = sieve nats2
To inspect the values of a stream it is often useful to use the following
function that takes n ≥ 0 elements from a stream and builds a list of those
n values:
fun take 0 _ = nil
  | take n (Cons (x, s)) = x :: take (n-1) s
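For readers without access to the SML/NJ lazy extension, the stream functions of this section can be approximated in plain Standard ML. The sketch below assumes the hypothetical encoding of a stream as a function of unit (type name front), uses the built-in mod rather than redefining it, and does no memoization, so elements may be recomputed; it is meant to show the shape of the definitions, not their efficiency.

```sml
datatype 'a front = Cons of 'a * 'a stream
withtype 'a stream = unit -> 'a front

(* Each loop returns a suspension, mirroring "fun lazy". *)
fun smap f =
  let
    fun loop s = fn () =>
      let val Cons (x, s') = s () in Cons (f x, loop s') end
  in
    loop
  end

fun sfilter pred =
  let
    fun loop s = fn () =>
      let val Cons (x, s') = s ()
      in if pred x then Cons (x, loop s') else loop s' () end
  in
    loop
  end

fun divides m n = n mod m = 0

fun sieve s = fn () =>
  let val Cons (x, s') = s ()
  in Cons (x, sieve (sfilter (not o (divides x)) s')) end

(* Eager tail, as with stl in the text. *)
fun stl s = let val Cons (_, t) = s () in t end

fun take 0 _ = nil
  | take n s = let val Cons (x, s') = s () in x :: take (n-1) s' end

fun nats () = Cons (0, smap (fn n => n+1) nats)
val nats2 = stl (stl nats)
val primes = sieve nats2
val firstPrimes = take 5 primes
```

Here firstPrimes is the list [2, 3, 5, 7, 11]; only as much of the infinite stream is computed as take demands.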
Here’s an example to illustrate the effects of memoization:
val rec lazy s = Cons ((print "."; 1), s)
val Cons (h, _) = s;
(* prints ".", binds h to 1 *)
val Cons (h, _) = s;
(* silent, binds h to 1 *)
Replace print ".";1 by a time-consuming operation yielding 1 as result,
and you will see that the second time we force s the result is returned
instantly, taking advantage of the effort expended on the time-consuming
operation induced by the first force of s.
15.4 Sample Code
Here is the code for this chapter.
Chapter 16
Equality and Equality Types
16.1 Sample Code
Here is the code for this chapter.
Chapter 17
Concurrency
Concurrent ML (CML) is an extension of Standard ML with mechanisms
for concurrent programming. It is available as part of the Standard ML of
New Jersey compiler. The eXene Library for programming the X windows
system is based on CML.
17.1 Sample Code
Here is the code for this chapter.
Part III
The Module Language
The Standard ML module language comprises the mechanisms for structuring
programs into separate units. Program units are called structures. A
structure consists of a collection of components, including types and values,
that constitute the unit. Composition of units to form a larger unit
is mediated by a signature, which describes the components of that unit.
A signature may be thought of as the type of a unit. Large units may be
structured into hierarchies using substructures. Generic, or parameterized,
units may be defined as functors.
Chapter 18
Signatures and Structures
The fundamental constructs of the ML module system are signatures and
structures. A signature may be thought of as a description of a structure,
and a structure may correspondingly be thought of as an implementation
of a signature. Many languages have similar constructs, often with different
names. Signatures are often called interfaces or package specifications,
and structures are often called implementations or packages. Whatever
the terminology, the main idea is to assign a type to a body of code as a
whole.
Unlike many languages, however, the relationship between signatures
and structures is many-to-many, rather than many-to-one or one-to-one. A
signature may describe many different structures, and a structure may satisfy
many different signatures. Thus, strictly speaking, it does not make
sense to speak of the signature of a structure or the structure matching a
signature, because there can be more than one in each case. By contrast
many languages impose much more stringent conditions such as requiring
that each structure have a unique signature, or that each signature arise
from a unique structure. This is not the case for ML.
18.1 Signatures
A signature is a specification, or a description, of a program unit, or structure.
Structures consist of declarations of type constructors, exception constructors,
and value bindings. A signature specifies some requirements
on a structure, such as what type components it must have, and what
value components it must have and what must be their types. A structure
matches, or implements, a signature iff it meets these requirements in a
sense that will be made precise below. The requirements are to be thought
of as descriptive in that the structure may meet more stringent requirements
than are specified by a signature by, for example, having more components
than are specified, but the structure nevertheless matches any less
stringent specification. (As a limiting case, any structure matches the null
signature that imposes no requirements on it!)
18.1.1 Basic Signatures
A basic signature expression has the form sig specs end, where specs is a
sequence of specifications. There are four basic forms of specification that
may occur in specs:^1
1. A type specification of the form
type (tyvar_1,...,tyvar_n) tycon [ = typ ],
where the definition typ of tycon may or may not be present.
2. A datatype specification, which has precisely the same form as a datatype
declaration.
3. An exception specification of the form
exception excon [ of typ ],
where the type typ of excon may or may not be present.
4. A value specification of the form
val id : typ.
Each specification may refer to the type constructors introduced earlier in
the sequence. No component may be specified more than once.
^1 There are two other forms of specification beyond these four, substructure specifications
and sharing specifications. These will be discussed in chapter 21.
Signatures may be given names using a signature binding
signature sigid = sigexp,
where sigid is a signature identifier and sigexp is a signature expression.
Signature identifiers are abbreviations for the signatures to which they are
bound. In practice we nearly always bind signature expressions to identifiers
and refer to them by name.
Here is an illustrative example of a signature definition. We will refer
back to this definition often in the rest of this chapter.
signature QUEUE =
  sig
    type 'a queue
    exception Empty
    val empty : 'a queue
    val insert : 'a * 'a queue -> 'a queue
    val remove : 'a queue -> 'a * 'a queue
  end
The signature QUEUE specifies a structure that must provide
1. a unary type constructor 'a queue,
2. a nullary exception Empty,
3. a polymorphic value empty of type 'a queue,
4. two polymorphic functions, insert and remove, with the specified
type schemes.
Queues are polymorphic in the type of their entries: the same operations
are used regardless of the entry type.
18.1.2 Signature Inheritance
Signatures may be built up from one another using two principal tools,
signature inclusion and signature specialization. Each is a form of inheritance
in which a new signature is created by enriching another signature with
additional information.
Signature inclusion is used to add more components to an existing signature.
For example, if we wish to add an emptiness test to the signature
QUEUE we might define the augmented signature, QUEUE_WITH_EMPTY, using
the following signature binding:
signature QUEUE_WITH_EMPTY =
  sig
    include QUEUE
    val is_empty : 'a queue -> bool
  end
As the notation suggests, the signature QUEUE is included into the body of
the signature QUEUE_WITH_EMPTY, and an additional component is added.
It is not strictly necessary to use include to define this signature. Indeed,
we may define it directly using the following signature binding:
signature QUEUE_WITH_EMPTY =
  sig
    type 'a queue
    exception Empty
    val empty : 'a queue
    val insert : 'a * 'a queue -> 'a queue
    val remove : 'a queue -> 'a * 'a queue
    val is_empty : 'a queue -> bool
  end
There is no semantic difference between the two definitions of QUEUE_WITH_EMPTY.
Signature inclusion is a convenience that documents the “history” of how
the more refined signature was created.
Signature specialization is used to augment an existing signature with
additional type definitions. For example, if we wish to refine the signature
QUEUE to specify that the type constructor 'a queue must be defined as a
pair of lists, we may proceed as follows:
signature QUEUE_AS_LISTS =
  QUEUE where type 'a queue = 'a list * 'a list
The where type clause “patches” the signature QUEUE by adding a definition
for the type constructor 'a queue.
The signature QUEUE_AS_LISTS may also be defined directly as follows:
signature QUEUE_AS_LISTS =
  sig
    type 'a queue = 'a list * 'a list
    exception Empty
    val empty : 'a queue
    val insert : 'a * 'a queue -> 'a queue
    val remove : 'a queue -> 'a * 'a queue
  end
A where type clause may not be used to re-define a type that is already
defined in a signature. For example, the following is illegal:
signature QUEUE_AS_LISTS_AS_LIST =
  QUEUE_AS_LISTS where type 'a queue = 'a list
If you wish to replace the definition of a type constructor in a signature
with another definition using where type, you must go back to a common
ancestor in which that type is not yet defined.
signature QUEUE_AS_LIST =
  QUEUE where type 'a queue = 'a list
Two signatures are said to be equivalent iff they differ only up to the
type equivalences induced by type abbreviations. For example, the signature
QUEUE where type 'a queue = 'a list is equivalent to the signature
signature QUEUE_AS_LIST =
  sig
    type 'a queue = 'a list
    exception Empty
    val empty : 'a list
    val insert : 'a * 'a list -> 'a list
    val remove : 'a list -> 'a * 'a list
  end
Within the scope of the definition of the type 'a queue as 'a list, the two
are equivalent, and hence the specifications of the value components are
equivalent.
18.2 Structures
A structure is a unit of program consisting of a sequence of declarations
of types, exceptions, and values. Structures are implementations of signatures;
signatures are the types of structures.
18.2.1 Basic Structures
The basic form of structure is an encapsulated sequence of declarations
of the form struct decs end. The declarations in decs are of one of the
following four forms:
1. A type declaration defining a type constructor.
2. A datatype declaration defining a new datatype.
3. An exception declaration defining an exception with a specified argument
type (if any).
4. A value declaration defining a new value variable with a specified
type.
These are, of course, just the declarations that were introduced in Part II
of this book.
A structure expression is well-formed iff it consists of a well-formed sequence
of well-formed declarations (according to the rules given in Part II).
A structure expression is evaluated by evaluating each of the declarations
within it, in the order given. This amounts to evaluating the right-hand
side of each value declaration in turn to determine its value, which is
then bound to the corresponding value identifier. Any effects that arise
from evaluating a value binding occur when the structure expression is
evaluated. A structure value is a structure expression in which all bindings
are fully evaluated.
A structure may be bound to a structure identifier using a structure
binding of the form
structure strid = strexp
This declaration defines strid to stand for the value of strexp. Such a declaration
is well-formed exactly when strexp is well-formed. It is evaluated by
evaluating the right-hand side, and binding the resulting structure value
to strid.
Here is an example of a structure binding:
structure Queue =
  struct
    type 'a queue = 'a list * 'a list
    exception Empty
    val empty = (nil, nil)
    fun insert (x, (b,f)) = (x::b, f)
    fun remove (nil, nil) = raise Empty
      | remove (bs, nil) = remove (nil, rev bs)
      | remove (bs, f::fs) = (f, (bs, fs))
  end
Recall that a fun binding is really an abbreviation for a val rec binding,
and hence constitutes a value binding (of function type).
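A short usage sketch may make the behavior of this structure concrete. The structure is repeated so that the example stands alone; the names q, q', first, second, and emptyRaises are introduced only for illustration, and the type annotation on q is there to fix the otherwise undetermined element type of the second list.

```sml
structure Queue =
  struct
    type 'a queue = 'a list * 'a list
    exception Empty
    val empty = (nil, nil)
    fun insert (x, (b, f)) = (x::b, f)
    fun remove (nil, nil) = raise Empty
      | remove (bs, nil) = remove (nil, rev bs)
      | remove (bs, f::fs) = (f, (bs, fs))
  end

(* Insert 1, then 2, then 3. *)
val q : int Queue.queue =
  Queue.insert (3, Queue.insert (2, Queue.insert (1, Queue.empty)))

(* Elements come back in the order they were inserted. *)
val (first, q') = Queue.remove q
val (second, _) = Queue.remove q'

(* Removing from the empty queue raises Queue.Empty. *)
val emptyRaises =
  (Queue.remove Queue.empty; false) handle Queue.Empty => true
```

The first removal reverses the back list into the front list, so first is 1 and second is 2.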
18.2.2 Long and Short Identiﬁers
Once a structure has been bound to a structure identifier, we may access
its components using paths, or long identifiers, or qualified names. A path has
the form strid.id.^2 It stands for the id component of the structure bound
to strid. For example, Queue.empty refers to the empty component of the
structure Queue. It has type 'a Queue.queue (note well the syntax!), stating
that it is a polymorphic value whose type is built up from the unary type
constructor Queue.queue. Similarly, the function Queue.insert has type
'a * 'a Queue.queue -> 'a Queue.queue and Queue.remove has type 'a
Queue.queue -> 'a * 'a Queue.queue.
^2 In chapter 21, we will generalize this to admit an arbitrary sequence of strid’s
separated by a dot.
Type definitions permeate structure boundaries. For example, the type
'a Queue.queue is equivalent to the type 'a list * 'a list because it is defined to
be so in the structure Queue. We will shortly introduce the means for limiting
the visibility of type definitions. Unless special steps are taken, the definitions
of types within a structure determine the definitions of the long
identifiers that refer to those types within a structure. Consequently, it is
correct to write an expression such as
val q = Queue.insert (1, ([6,5,4],[1,2,3]))
even though the pair of lists ([6,5,4],[1,2,3]) was not obtained by using the
operations from the structure Queue. This is because the type int Queue.queue is
equivalent to the type int list * int list, and hence the call to insert is well-typed.
The use of long identifiers can get out of hand, cluttering the program,
rather than clarifying it. Suppose that we are frequently using the
Queue operations in a program, so that the code is cluttered with calls to
Queue.empty, Queue.insert, and Queue.remove. One way to reduce clutter
is to introduce a structure abbreviation of the form structure strid =
strid' that introduces one structure identifier as an abbreviation for another.
For example, after declaring
structure Q = Queue
we may write Q.empty, Q.insert, and Q.remove, rather than the more verbose
forms mentioned above.
Another way to reduce clutter is to open the structure Queue to incorporate
its bindings directly into the current environment. An open declaration
has the form
open strid_1 ... strid_n
which incorporates the bindings from the given structures in left-to-right
order (later structures override earlier ones when there is overlap). For
example, the declaration
open Queue
incorporates the body of the structure Queue into the current environment
so that we may write just empty, insert, and remove, without qualification,
to refer to the corresponding components of the structure Queue.
Although this is surely convenient, using open has its disadvantages.
One is that we cannot simultaneously open two structures that have a
component with the same name. For example, if we write
open Queue Stack
where the structure Stack also has a component empty, then uses of empty
(without qualification) will stand for Stack.empty, not for Queue.empty.
Another problem with open is that it is hard to control its behavior,
since it incorporates the entire body of a structure, and hence may inadvertently
shadow identifiers that happen to be also used in the structure. For
example, if the structure Queue happened to define an auxiliary function
helper, that function would also be incorporated into the current environment
by the declaration open Queue, which may not have been intended.
This turns out to be a source of many bugs; it is best to use open sparingly,
and only then in a let or local declaration (to limit the damage).
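The advice about limiting the scope of open can be sketched as follows. The Queue structure is repeated so that the example stands alone; the functions singleton and front and the values seven and five are hypothetical names used only here.

```sml
structure Queue =
  struct
    type 'a queue = 'a list * 'a list
    exception Empty
    val empty = (nil, nil)
    fun insert (x, (b, f)) = (x::b, f)
    fun remove (nil, nil) = raise Empty
      | remove (bs, nil) = remove (nil, rev bs)
      | remove (bs, f::fs) = (f, (bs, fs))
  end

(* A structure abbreviation: Q.insert is Queue.insert. *)
structure Q = Queue

(* open confined by local: only singleton is visible afterwards. *)
local
  open Queue
in
  fun singleton x = insert (x, empty)
end

(* open confined by let: the unqualified names exist only in the body. *)
fun front (q : 'a Queue.queue) : 'a =
  let
    open Queue
    val (x, _) = remove q
  in
    x
  end

val seven = front (singleton 7)
val five = front (Q.insert (5, Q.empty))
```

Outside the local and let blocks the identifiers empty, insert, and remove remain unbound at top level, so nothing in the surrounding program is shadowed.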
18.3 Sample Code
Here is the code for this chapter.
Chapter 19
Signature Matching
When does a structure implement a signature? The structure must provide
all of the components and satisfy all of the type definitions required by
the signature. Type components must be provided with the same number
of arguments and with an equivalent definition (if any) to that given in
the signature. Value components must be present with the type specified
in the signature, up to the definitions of any preceding types. Exception
components must be present with the type of argument (if any) specified
in the signature.
These simple principles have a number of important consequences.
• To minimize bureaucracy, a structure may provide more components
than are strictly required by the signature. If a signature requires
components x, y, and z, it is sufficient for the structure to provide x,
y, z, and w.
• To enhance reuse, a structure may provide values with more general
types than are required by the signature. If a signature demands a
function of type int->int, it is enough to provide a function of type
'a->'a.
• To avoid over-specification, a datatype may be provided where a type
is required, and a value constructor may be provided where a value
is required.
• To increase flexibility, a structure may consist of declarations presented
in any sensible order, not just the order specified in the signature,
provided that the requirements of the specification are met.
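The four consequences listed above can be seen together in one small sketch. The signature SHAPE and the structures Shape and Checked are hypothetical, and the example relies on signature ascription (structure strid : sigexp = strexp) to make the compiler perform the match, a construct that this section has not itself introduced.

```sml
signature SHAPE =
  sig
    type shape
    val Unit : shape          (* a value is required *)
    val id : int -> int       (* a monomorphic function is required *)
  end

structure Shape =
  struct
    (* Declared in a different order than in SHAPE, and with the
       more general type 'a -> 'a. *)
    fun id x = x
    (* A datatype where a type is required; the constructor Unit
       supplies the required value. *)
    datatype shape = Unit | Square of int
    (* An extra component not mentioned in SHAPE. *)
    val extra = "not mentioned in SHAPE"
  end

(* The ascription succeeds only if Shape matches SHAPE. *)
structure Checked : SHAPE = Shape
val three = Checked.id 3
```

If any of the four relaxations were not permitted, the ascription would be rejected at compile time.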
19.1 Principal Signatures
There is a most stringent, or most precise, signature for a structure, called its
principal signature. A structure may be considered to match a signature exactly
when the specified signature is no more restrictive than the principal
signature of the structure. To determine whether a structure matches a signature,
it is enough to check whether the principal signature of that structure
is no weaker than the specified signature. For the purposes of type
checking, the principal signature is the official proxy for the structure. We
need never examine the code of the structure during type checking, once
its principal signature has been determined.
A structure expression is assigned a principal signature by a component-by-component
analysis of its constituent declarations. The principal signature
of a structure is obtained as follows:^1
1. Corresponding to a declaration of the form
type (tyvar_1,...,tyvar_n) tycon = typ,
the principal signature contains the specification
type (tyvar_1,...,tyvar_n) tycon = typ
The principal signature includes the definition of tycon.
2. Corresponding to a declaration of the form
datatype (tyvar_1,...,tyvar_n) tycon =
con_1 of typ_1 | ... | con_k of typ_k
the principal signature contains the specification
datatype (tyvar_1,...,tyvar_n) tycon =
con_1 of typ_1 | ... | con_k of typ_k
The specification is identical to the declaration.
3. Corresponding to a declaration of the form
exception id of typ
the principal signature contains the specification
exception id of typ
4. Corresponding to a declaration of the form
val id = exp
the principal signature contains the specification
val id : typ
where typ is the principal type of the expression exp (relative to the
preceding declarations).
^1 These rules gloss over some technical complications that arise only in unusual
circumstances. See The Definition of Standard ML [3] for complete details.
In brief, the principal signature contains all of the type definitions, datatype
definitions, and exception bindings of the structure, plus the principal
types of its value bindings.
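As a worked instance of these rules, here is the Queue structure of chapter 18 together with a hand-written rendering of what its principal signature should be. The name QUEUE_PRINCIPAL is hypothetical, and the value specifications are my own reading of the principal types of the three bindings, so treat the sketch as illustrative. Signature ascription is used only to have the compiler check the match.

```sml
structure Queue =
  struct
    type 'a queue = 'a list * 'a list
    exception Empty
    val empty = (nil, nil)
    fun insert (x, (b, f)) = (x::b, f)
    fun remove (nil, nil) = raise Empty
      | remove (bs, nil) = remove (nil, rev bs)
      | remove (bs, f::fs) = (f, (bs, fs))
  end

(* The type definition and exception are copied over; each value gets
   its principal type, which is more general than the type in QUEUE. *)
signature QUEUE_PRINCIPAL =
  sig
    type 'a queue = 'a list * 'a list
    exception Empty
    val empty : 'a list * 'b list
    val insert : 'a * ('a list * 'b) -> 'a list * 'b
    val remove : 'a list * 'a list -> 'a * ('a list * 'a list)
  end

structure Checked : QUEUE_PRINCIPAL = Queue

(* The principal type of insert does not even require the second
   component to be a list. *)
val odd = Checked.insert (1, (nil, "not a list"))
```

Note how insert never inspects the front list, so its principal type leaves that component completely unconstrained.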
19.2 Matching
A candidate signature sigexp_c is said to match a target signature sigexp_t iff
sigexp_c has all of the components and all of the type equations specified by
sigexp_t. More precisely,
1. Every type constructor in the target must also be present in the candidate,
with the same arity (number of arguments) and an equivalent
definition (if any).
2. Every datatype in the target must be present in the candidate, with
equivalent types for the value constructors.
3. Every exception in the target must be present in the candidate, with
an equivalent argument type.
4. Every value in the target must be present in the candidate, with at
least as general a type.
The candidate may have additional components not mentioned in the target,
or satisfy additional type equations not required in the target, but it
cannot have fewer of either. The target signature may therefore be seen
as a weakening of the candidate signature, since all of the properties of the
former are true of the latter.
The matching relation is reflexive (every signature matches itself)
and transitive (if sigexp_1 matches sigexp_2 and sigexp_2 matches sigexp_3, then
sigexp_1 matches sigexp_3). Two signatures are equivalent (in the sense of
chapter 18) iff each matches the other, which is to say that they
are equivalently restrictive.
It will be helpful to consider some examples. Recall the following signatures
from chapter 18.
signature QUEUE =
  sig
    type 'a queue
    exception Empty
    val empty : 'a queue
    val insert : 'a * 'a queue -> 'a queue
    val remove : 'a queue -> 'a * 'a queue
  end
signature QUEUE_WITH_EMPTY =
  sig
    include QUEUE
    val is_empty : 'a queue -> bool
  end
signature QUEUE_AS_LISTS =
  QUEUE where type 'a queue = 'a list * 'a list
The signature QUEUE_WITH_EMPTY matches the signature QUEUE, because
all of the requirements of QUEUE are met by QUEUE_WITH_EMPTY. The converse
does not hold, because QUEUE lacks the component is_empty, which is required
by QUEUE_WITH_EMPTY.
The signature QUEUE_AS_LISTS matches the signature QUEUE. It is identical
to QUEUE, apart from the additional specification of the type 'a queue.
The converse fails, because the signature QUEUE does not satisfy the requirement
that 'a queue be equivalent to 'a list * 'a list.
Matching does not distinguish between equivalent signatures. For
example, consider the following signature:
signature QUEUE_AS_LIST =
  sig
    type 'a queue = 'a list
    exception Empty
    val empty : 'a list
    val insert : 'a * 'a list -> 'a list
    val remove : 'a list -> 'a * 'a list
    val is_empty : 'a list -> bool
  end
At first glance you might think that this signature does not match the
signature QUEUE, since the components of QUEUE_AS_LIST have superficially
dissimilar types from those in QUEUE. However, the signature QUEUE_AS_LIST
is equivalent to the signature
QUEUE_WITH_EMPTY where type 'a queue = 'a list, which matches
QUEUE_WITH_EMPTY, and hence QUEUE, for reasons noted earlier. Therefore,
QUEUE_AS_LIST matches QUEUE as well.
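As a rough illustration of this point, the sketch below ascribes QUEUE to a structure whose components are all written in terms of 'a list. The structure name ListQueue, its list-append implementation, and the binding Q are assumptions made for the example; they do not come from the text.

```sml
signature QUEUE =
sig
  type 'a queue
  exception Empty
  val empty : 'a queue
  val insert : 'a * 'a queue -> 'a queue
  val remove : 'a queue -> 'a * 'a queue
end

(* Hypothetical structure: every component mentions 'a list, not 'a queue. *)
structure ListQueue =
struct
  type 'a queue = 'a list
  exception Empty
  val empty : 'a list = nil
  fun insert (x : 'a, q : 'a list) : 'a list = q @ [x]
  fun remove (nil : 'a list) : 'a * 'a list = raise Empty
    | remove (x :: q) = (x, q)
  fun is_empty (q : 'a list) : bool = null q
end

(* Accepted: since 'a queue = 'a list, the types agree up to equivalence. *)
structure Q : QUEUE = ListQueue

val (first, rest) = Q.remove (Q.insert (2, Q.insert (1, Q.empty)))
```

The extra component is_empty is simply ignored by the match, since QUEUE does not mention it.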
Signature matching may also involve instantiation of polymorphic types.
The types of values in the candidate may be more general than required
by the target. For example, the signature
signature MERGEABLE_QUEUE =
  sig
    include QUEUE
    val merge : 'a queue * 'a queue -> 'a queue
  end
matches the signature
signature MERGEABLE_INT_QUEUE =
  sig
    include QUEUE
    val merge : int queue * int queue -> int queue
  end
because the polymorphic type of merge in MERGEABLE_QUEUE instantiates to
its type in MERGEABLE_INT_QUEUE.
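The same instantiation can be seen on a much smaller scale. In the following sketch the names INT_PAIRING, Pairing, and IntPairing are hypothetical, chosen only to show a polymorphic value matching a less general specification.

```sml
(* Target: swap is required only at type int * int -> int * int. *)
signature INT_PAIRING =
sig
  val swap : int * int -> int * int
end

(* Candidate: swap has the more general type 'a * 'b -> 'b * 'a. *)
structure Pairing =
struct
  fun swap (x, y) = (y, x)
end

(* Accepted: 'a * 'b -> 'b * 'a instantiates to int * int -> int * int. *)
structure IntPairing : INT_PAIRING = Pairing

val swapped = IntPairing.swap (1, 2)
```

After the ascription, IntPairing.swap may be used only at the type given in the signature, even though the underlying function is polymorphic.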
Finally, a datatype speciﬁcation matches a signature that speciﬁes a
type with the same name and arity (but no deﬁnition), and zero or more
value components corresponding to some (or all) of the value constructors
of the datatype. The types of the value components must match exactly the
types of the corresponding value constructors; no specialization is allowed
in this case. For example, the signature
signature RBT_DT =
  sig
    datatype 'a rbt =
        Empty
      | Red of 'a rbt * 'a * 'a rbt
      | Black of 'a rbt * 'a * 'a rbt
  end
matches the signature
signature RBT =
  sig
    type 'a rbt
    val Empty : 'a rbt
    val Red : 'a rbt * 'a * 'a rbt -> 'a rbt
  end
The signature RBT specifies the type 'a rbt as abstract, and includes two
value specifications that are met by value constructors in the signature
RBT_DT.
One way to understand this is to mentally rewrite the signature RBT_DT
in the (fanciful) form
signature RBT_DTS =
  sig
    type 'a rbt
    con Empty : 'a rbt
    con Red : 'a rbt * 'a * 'a rbt -> 'a rbt
    con Black : 'a rbt * 'a * 'a rbt -> 'a rbt
  end
(Unfortunately, the “signature” RBT_DTS is not a legal ML signature!)
The rule is simply that a val specification may be matched by a con
specification.
19.3 Satisfaction
Returning to the motivating question of this chapter, a candidate structure
implements a target signature iff the principal signature of the candidate
structure matches the target signature. By the reflexivity of the matching
relation it is immediate that a structure satisfies its principal signature.
Therefore any signature implemented by a structure is weaker than
the principal signature of that structure. That is, the principal signature is
the strongest signature implemented by a structure.
19.4 Sample Code
Here is the code for this chapter.
Chapter 20
Signature Ascription
Signature ascription imposes the requirement that a structure implement a
signature and, in so doing, weakens the signature of that structure for all
subsequent uses of it. There are two forms of ascription in ML. Both
require that a structure implement a signature; they differ in the extent to
which the assigned signature of the structure is weakened by the
ascription.
1. Transparent, or descriptive ascription. The structure is assigned the
signature obtained by propagating type definitions from the principal
signature to the target signature.
2. Opaque, or restrictive ascription. The structure is assigned the target
signature as is, without propagating any type deﬁnitions.
In either case the components of the structure are cut down to those
specified in the signature; the only difference is whether type definitions
are propagated from the principal signature or not.
20.1 Ascribed Structure Bindings
The most common form of signature ascription is in a structure binding.
There are two forms, the transparent
structure strid : sigexp = strexp
and the opaque
structure strid :> sigexp = strexp
Transparent ascription is written using a single colon, “:”, whereas opaque
ascription is written using “:>”.
Ascribed structure bindings are type checked as follows. First, we
check that strexp implements sigexp according to the rules given in
chapter 19. The principal signature sigexp_0 of strexp is determined, and
checked to match sigexp. The matching process determines an augmentation
sigexp' of sigexp in which we propagate type equations from the principal
signature, sigexp_0, to sigexp'. Second, the structure identifier is assigned
a signature based on the form of ascription. For an opaque ascription, it is
assigned the signature sigexp; for a transparent ascription, it is assigned
the signature sigexp'.
Ascribed structure bindings are evaluated by first evaluating strexp.
Then a view of the resulting value is formed by dropping all components
that are not present in the target signature, sigexp. The structure variable
strid is bound to the view. The formation of the view ensures that the
components of a structure may always be accessed in constant time, and
that there are no “space leaks” because of components that are present in
the structure, but not in the signature.
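The difference between the two forms can be sketched with a small example. The signature CELL and the structures CellImpl, Trans, and Opaq below are hypothetical names introduced only for illustration.

```sml
signature CELL =
sig
  type t
  val make : int -> t
  val get : t -> int
end

structure CellImpl =
struct
  type t = int
  val hidden = 0                     (* not specified in CELL *)
  fun make (n : int) : t = n
  fun get (c : t) : int = c
end

structure Trans : CELL = CellImpl    (* transparent: Trans.t = int is known *)
structure Opaq :> CELL = CellImpl    (* opaque: Opaq.t is abstract *)

val a = Trans.get 5                  (* accepted, because Trans.t = int *)
val b = Opaq.get (Opaq.make 5)       (* Opaq.t values come only from Opaq.make *)

(* Both Opaq.get 5 and Trans.hidden would be rejected by the type checker:
   the first because Opaq.t is not known to be int, the second because
   hidden is dropped from the view under either form of ascription. *)
```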
20.2 Opaque Ascription
The primary use of opaque ascription is to enforce data abstraction. A
good example is provided by the implementation of queues as, say, pairs
of lists.
structure Queue :> QUEUE =
  struct
    type 'a queue = 'a list * 'a list
    val empty = (nil, nil)
    fun insert (x, (bs, fs)) = (x::bs, fs)
    exception Empty
    fun remove (nil, nil) = raise Empty
      | remove (bs, f::fs) = (f, (bs, fs))
      | remove (bs, nil) = remove (nil, rev bs)
  end
The use of opaque ascription ensures that the type 'a Queue.queue is
abstract. No definition is provided for it in the signature QUEUE, and
therefore it has no definition in terms of other types of the language.
For the type 'a Queue.queue to be abstract means that the only
operations that may be performed on values of that type are empty, insert,
and remove. Importantly, we may not make use of the fact that a queue is
really a pair of lists on the grounds that it is implemented this way. We
have obscured this fact by opaquely ascribing a signature that does not
provide a definition for the type 'a queue. All clients of the structure
Queue are
insulated from the details of how queues are implemented. Consequently,
the implementation of queues can be changed without breaking any client
code, as long as the new implementation satisﬁes the same signature.
Hiding the representation of a type allows us to isolate the enforcement
of representation invariants on a data structure. We may think of the type
'a Queue.queue as the type of states of an abstract machine whose sole
instructions are empty (the initial state), insert, and remove. Internally
to the structure Queue we may wish to impose invariants on the internal
state of the machine. The beauty of data abstraction is that it provides
an elegant means of enforcing such invariants, called the assume-ensure,
or rely-guarantee, method. It reduces the enforcement of representation
invariants to these two requirements:
1. All initialization instructions must ensure that the invariant holds
true of the machine state after execution.
2. All state transition instructions may assume that the invariant holds
of the input states, and must ensure that it holds of the output state.
By induction on the number of “instructions” executed, the invariant must
hold for all states; it must really be invariant!
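To make the method concrete, here is a minimal sketch of an abstract type with a representation invariant. The signature INT_SET, the structure IntSet, and the chosen invariant (a strictly increasing list) are assumptions made for this example and are not part of the queue development.

```sml
signature INT_SET =
sig
  type set
  val empty : set
  val insert : int * set -> set
  val member : int * set -> bool
  val toList : set -> int list
end

structure IntSet :> INT_SET =
struct
  (* Representation invariant: the list is strictly increasing. *)
  type set = int list

  (* Initialization: the empty list satisfies the invariant trivially. *)
  val empty = nil

  (* Transition: assumes the invariant of its input, ensures it of its output. *)
  fun insert (x, nil) = [x]
    | insert (x, y :: ys) =
        if x < y then x :: y :: ys
        else if x = y then y :: ys
        else y :: insert (x, ys)

  (* May rely on the invariant: the search stops once x is below the head. *)
  fun member (x, nil) = false
    | member (x, y :: ys) = x = y orelse (x > y andalso member (x, ys))

  fun toList s = s
end

val s = IntSet.insert (2, IntSet.insert (3, IntSet.insert (2, IntSet.empty)))
```

Because the ascription is opaque, a client cannot hand IntSet.member an arbitrary unsorted list, so the early exit in member is safe.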
Suppose that we wish to implement an abstract type of priority queues
for an arbitrary element type. The queue operations are no longer
polymorphic in the element type because they actually “touch” the elements
to determine their relative priorities. (In chapter 21 we'll introduce better
means for structuring this module, but the central points discussed here
will not be affected.) Here is a possible signature for priority queues that
expresses this dependency:
signature PQ =
  sig
    type elt
    val lt : elt * elt -> bool
    type queue
    exception Empty
    val empty : queue
    val insert : elt * queue -> queue
    val remove : queue -> elt * queue
  end
Now let us consider an implementation of priority queues in which
the elements are taken to be strings. Since priority queues form an
abstract type, we would expect to use opaque ascription to ensure that its
representation is hidden. This suggests an implementation along these
lines:
structure PrioQueue :> PQ =
  struct
    type elt = string
    val lt : string * string -> bool = (op <)
    type queue = ...
    ...
  end
But not only is the type PrioQueue.queue abstract, so is PrioQueue.elt!
This leaves us no means of creating a value of type PrioQueue.elt, and
hence we can never call PrioQueue.insert. The problem is that the
interface is “too abstract”: it should only obscure the identity of the type
queue, and not that of the type elt.
The solution is to augment the signature PQ with a deﬁnition for the
type elt, then opaquely ascribe this to PrioQueue:
signature STRING_PQ = PQ where type elt = string
structure PrioQueue :> STRING_PQ = ...
Now the type PrioQueue.elt is equivalent to string, and we may call
PrioQueue.insert with a string, as expected.
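For concreteness, here is one way the elided implementation might be filled in. The choice of a sorted string list as the representation of queue is an assumption of this sketch; the text leaves the representation unspecified.

```sml
signature PQ =
sig
  type elt
  val lt : elt * elt -> bool
  type queue
  exception Empty
  val empty : queue
  val insert : elt * queue -> queue
  val remove : queue -> elt * queue
end

signature STRING_PQ = PQ where type elt = string

structure PrioQueue :> STRING_PQ =
struct
  type elt = string
  val lt : string * string -> bool = (op <)
  (* Assumed representation: a list kept in increasing order. *)
  type queue = string list
  exception Empty
  val empty = nil
  fun insert (x, nil) = [x]
    | insert (x, y :: ys) =
        if lt (x, y) then x :: y :: ys else y :: insert (x, ys)
  fun remove nil = raise Empty
    | remove (x :: xs) = (x, xs)
end

(* elt is known to be string, so string literals may be inserted. *)
val q = PrioQueue.insert ("pear", PrioQueue.insert ("apple", PrioQueue.empty))
val (least, _) = PrioQueue.remove q
```

The type PrioQueue.queue remains abstract, so a client still cannot pass an ordinary string list where a queue is expected.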
The moral is that there is always an element of judgement involved in
deciding which types to hold abstract, and which to make transparent. In the
case of priority queues, the determining factor is that we specified only the
operations on elt that were required for the implementation of priority
queues, and no others. This means that elt could not usefully be held
abstract, but must instead be specified in the signature. On the other hand
the operations on queues are intended to be complete, and so we hold the
type abstract.
20.3 Transparent Ascription
Transparent ascription cuts down on the need for explicit specification of
type definitions in signatures. As we remarked earlier, we can always
replace uses of transparent ascription by a use of opaque ascription with a
hand-crafted augmented signature. This can become burdensome. On the
other hand, excessive use of transparent ascription impedes modular
programming by exposing type information that would better be left abstract.
The prototypical use of transparent ascription is to form a view of a
structure that eliminates the components that are not necessary in a given
context without obscuring the identities of its type components. Consider
the signature ORDERED deﬁned as follows:
signature ORDERED =
  sig
    type t
    val lt : t * t -> bool
  end
This signature speciﬁes a type t equipped with a comparison operation
lt.
It should be clear that it would not make sense to opaquely ascribe
this signature to a structure. Doing so would preclude ever calling the lt
operation, for there would be no means of creating values of type t. Such
a signature is only useful once it has been augmented with a deﬁnition for
the type t. This is precisely what transparent ascription does for you.
For example, consider the following structure binding:
structure String : ORDERED =
  struct
    type t = string
    val clt = Char.<
    fun lt (s, t) = ... clt ...
  end
This structure implements string comparison in terms of character
comparison (say, to implement the lexicographic ordering of strings).
Ascription of the signature ORDERED ensures two things:
1. The auxiliary function clt is pruned out of the structure. It was
intended for internal use, and was not meant to be externally visible.
2. The type String.t is equivalent to string, even though this fact is
not present in the signature ORDERED. Transparent ascription
computes an augmentation of ORDERED with this definition exposed. The
“true” signature of String is the signature
ORDERED where type t = string
which makes clear the underlying definition of t.
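A complete version of such a structure might look as follows. This is only a sketch: the name StringOrd is used in place of String to avoid shadowing the Basis Library structure, and the helper go is a hypothetical way of filling in the elided body of lt.

```sml
signature ORDERED =
sig
  type t
  val lt : t * t -> bool
end

structure StringOrd : ORDERED =
struct
  type t = string
  val clt = Char.<
  (* Lexicographic comparison, character by character. *)
  fun lt (s, t) =
    let
      fun go (nil, nil) = false
        | go (nil, _ :: _) = true
        | go (_ :: _, nil) = false
        | go (c :: cs, d :: ds) =
            clt (c, d) orelse (c = d andalso go (cs, ds))
    in
      go (explode s, explode t)
    end
end

(* StringOrd.t = string is visible, so lt applies directly to strings. *)
val yes = StringOrd.lt ("abc", "abd")
val no = StringOrd.lt ("b", "a")

(* StringOrd.clt is not accessible: it was pruned by the ascription. *)
```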
A related use of transparent ascription is to document an interpretation
of a type without rendering it abstract. For example, we may wish
to consider the integers ordered in two different ways, one by the
standard arithmetic comparison, the other by divisibility. We might make the
following declarations to express this:
structure IntLt : ORDERED =
  struct
    type t = int
    val lt = (op <)
  end
structure IntDiv : ORDERED =
  struct
    type t = int
    fun lt (m, n) = (n mod m = 0)
  end
The ascription speciﬁes the interpretation of int as partially ordered, in
two senses, but does not hide the type of elements. In particular, IntLt.t
and IntDiv.t are both equivalent to int.
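A short usage sketch shows the two interpretations side by side. The bindings usual and divis are hypothetical names; the structures are repeated so that the fragment stands on its own.

```sml
signature ORDERED =
sig
  type t
  val lt : t * t -> bool
end

structure IntLt : ORDERED =
struct
  type t = int
  val lt = (op <)
end

structure IntDiv : ORDERED =
struct
  type t = int
  fun lt (m, n) = (n mod m = 0)
end

(* Both structures accept ordinary integers, since t = int is visible. *)
val usual = (IntLt.lt (2, 6), IntLt.lt (4, 6))
val divis = (IntDiv.lt (2, 6), IntDiv.lt (4, 6))
```

Here 4 is below 6 in the arithmetic ordering but not in the divisibility ordering, so the same pair of integers is related differently by the two structures.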
20.4 Sample Code
Here is the code for this chapter.
Chapter 21
Module Hierarchies
So far we have confined ourselves to considering “flat” modules
consisting of a linear sequence of declarations of types, exceptions, and
values. As programs grow in size and complexity, it becomes important to
introduce further structuring mechanisms to support their growth. The ML
module language also supports module hierarchies, tree-structured
configurations of modules that reflect the architecture of a large system.
21.1 Substructures
A substructure is a “structure within a structure”. Structure bindings
(either opaque or transparent) are admitted as components of other
structures. Structure specifications of the form
structure strid : sigexp
may appear in signatures. There is no distinction between transparent
and opaque specifications in a signature, because there is no structure to
ascribe!
The type checking and evaluation rules for structures are extended
to substructures recursively. The principal signature of a substructure
binding is determined according to the rules given in chapter 19. A
substructure binding in one signature matches the corresponding one in
another iff their signatures match according to the rules in chapter 19.
Evaluation of a substructure binding consists of evaluating the structure
expression, then binding the resulting structure value to that identifier.
To see how substructures arise in practice, consider the following
programming scenario. The first version of a system makes use of a
polymorphic dictionary data structure whose search keys are strings. The
signature for such a data structure might be as follows:
signature MY_STRING_DICT =
  sig
    type 'a dict
    val empty : 'a dict
    val insert : 'a dict * string * 'a -> 'a dict
    val lookup : 'a dict * string -> 'a option
  end
The return type of lookup is 'a option, since there may be no entry in the
dictionary with the specified key.
The implementation of this abstraction looks approximately like this:
structure MyStringDict :> MY_STRING_DICT =
  struct
    datatype 'a dict =
        Empty
      | Node of 'a dict * string * 'a * 'a dict
    val empty = Empty
    fun insert (d, k, v) = ...
    fun lookup (d, k) = ...
  end
The omitted implementations of insert and lookup make use of the
built-in lexicographic ordering of strings.
The second version of the system requires another dictionary whose
keys are integers, leading to another signature and implementation for
dictionaries.
signature MY_INT_DICT =
  sig
    type 'a dict
    val empty : 'a dict
    val insert : 'a dict * int * 'a -> 'a dict
    val lookup : 'a dict * int -> 'a option
  end
structure MyIntDict :> MY_INT_DICT =
  struct
    datatype 'a dict =
        Empty
      | Node of 'a dict * int * 'a * 'a dict
    val empty = Empty
    fun insert (d, k, v) = ...
    fun lookup (d, k) = ...
  end
The elided implementations of insert and lookup make use of the
primitive comparison operations on integers.
At this point we may observe an obvious pattern, that of a dictionary
with keys of a specific type. To avoid further repetition we decide to
abstract out the key type from the signature so that it can be filled in later.
signature MY_GEN_DICT =
  sig
    type key
    type 'a dict
    val empty : 'a dict
    val insert : 'a dict * key * 'a -> 'a dict
    val lookup : 'a dict * key -> 'a option
  end
Notice that the dictionary abstraction carries with it the type of its keys.
Specific instances of this generic dictionary signature are obtained
using where type.
signature MY_STRING_DICT =
  MY_GEN_DICT where type key = string
signature MY_INT_DICT =
  MY_GEN_DICT where type key = int
A string dictionary might then be implemented as follows:
structure MyStringDict :> MY_STRING_DICT =
  struct
    type key = string
    datatype 'a dict =
        Empty
      | Node of 'a dict * key * 'a * 'a dict
    val empty = Empty
    fun insert (Empty, k, v) = Node (Empty, k, v, Empty)
    fun lookup (Empty, _) = NONE
      | lookup (Node (dl, l, v, dr), k) =
          if k < l then (* string comparison *)
            lookup (dl, k)
          else if k > l then (* string comparison *)
            lookup (dr, k)
          else
            SOME v
  end
By a similar process we may build an implementation MyIntDict of the
signature MY_INT_DICT, with integer keys ordered by the standard integer
comparison operations.
Now suppose that we require a third dictionary, with integers as keys,
but ordered according to the divisibility ordering (for which m < n iff m
divides n evenly). This implementation, say MyIntDivDict, makes use of
modular arithmetic to compare keys, but has the same signature
MY_INT_DICT as MyIntDict.
structure MyIntDivDict :> MY_INT_DICT =
  struct
    type key = int
    datatype 'a dict =
        Empty
      | Node of 'a dict * key * 'a * 'a dict
    fun divides (k, l) = (l mod k = 0)
    val empty = Empty
    fun insert (Empty, k, v) = Node (Empty, k, v, Empty)
    fun lookup (Empty, _) = NONE
      | lookup (Node (dl, l, v, dr), k) =
          if divides (k, l) then (* divisibility test *)
            lookup (dl, k)
          else if divides (l, k) then (* divisibility test *)
            lookup (dr, k)
          else
            SOME v
  end
Notice that we required an auxiliary function, divides, to implement the
comparison in the required sense.
With this in mind, let us reconsider our initial attempt to consolidate
the signatures of the various versions of dictionaries in play. In one sense
there is nothing to do: the signature MY_GEN_DICT suffices. However,
as we've just seen, the instances of this signature, which are ascribed to
particular implementations, do not determine the interpretation. What
we'd like to do is to package the type with its interpretation so that the
dictionary module is self-contained. Not only does the dictionary module
carry with it the type of its keys, but it also carries the interpretation used
on that type.
This is achieved by introducing a substructure binding in the
dictionary structure. To begin with we first isolate the notion of an ordered
type.
signature ORDERED =
  sig
    type t
    val lt : t * t -> bool
    val eq : t * t -> bool
  end
This signature describes modules that contain a type t equipped with an
equality and comparison operation on it.
An implementation of this signature specifies the type and the
interpretation, as in the following examples.
(* Lexicographically ordered strings. *)
structure LexString : ORDERED =
  struct
    type t = string
    val eq = (op =)
    (* The annotation keeps the overloaded < from defaulting to int. *)
    val lt : string * string -> bool = (op <)
  end
(* Integers ordered conventionally. *)
structure LessInt : ORDERED =
  struct
    type t = int
    val eq = (op =)
    val lt = (op <)
  end
(* Integers ordered by divisibility. *)
structure DivInt : ORDERED =
  struct
    type t = int
    fun lt (m, n) = (n mod m = 0)
    fun eq (m, n) = lt (m, n) andalso lt (n, m)
  end
Notice that the use of transparent ascription is very natural here, since
ORDERED is not intended as a self-contained abstraction.
The signature of dictionaries is restructured as follows:
signature DICT =
  sig
    structure Key : ORDERED
    type 'a dict
    val empty : 'a dict
    val insert : 'a dict * Key.t * 'a -> 'a dict
    val lookup : 'a dict * Key.t -> 'a option
  end
The signature DICT includes as a substructure the key type together with
its interpretation as an ordered type.
To enforce abstraction we introduce specialized versions of this
signature that specify the key type using a where type clause.
signature STRING_DICT =
  DICT where type Key.t = string
signature INT_DICT =
  DICT where type Key.t = int
These are, respectively, signatures for the abstract type of dictionaries whose
keys are strings and integers.
How are these signatures to be implemented? Corresponding to the
layering of the signatures, we have a layering of the implementation.
structure StringDict :> STRING_DICT =
  struct
    structure Key : ORDERED = LexString
    datatype 'a dict =
        Empty
      | Node of 'a dict * Key.t * 'a * 'a dict
    val empty = Empty
    fun insert (Empty, k, v) = Node (Empty, k, v, Empty)
    fun lookup (Empty, _) = NONE
      | lookup (Node (dl, l, v, dr), k) =
          if Key.lt (k, l) then
            lookup (dl, k)
          else if Key.lt (l, k) then
            lookup (dr, k)
          else
            SOME v
  end
Observe that the implementations of insert and lookup make use of the
comparison operations supplied by the Key substructure, such as Key.lt.
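To see the layering at work end to end, the following self-contained sketch fills in the remaining case of insert and exercises the dictionary. The Node case of insert and the binding d are assumptions of this sketch, not code from the text.

```sml
signature ORDERED =
sig
  type t
  val lt : t * t -> bool
  val eq : t * t -> bool
end

structure LexString : ORDERED =
struct
  type t = string
  val eq : string * string -> bool = (op =)
  val lt : string * string -> bool = (op <)
end

signature DICT =
sig
  structure Key : ORDERED
  type 'a dict
  val empty : 'a dict
  val insert : 'a dict * Key.t * 'a -> 'a dict
  val lookup : 'a dict * Key.t -> 'a option
end

signature STRING_DICT = DICT where type Key.t = string

structure StringDict :> STRING_DICT =
struct
  structure Key : ORDERED = LexString
  datatype 'a dict =
      Empty
    | Node of 'a dict * Key.t * 'a * 'a dict
  val empty = Empty
  (* Assumed completion: descend by Key.lt, replace the value on an equal key. *)
  fun insert (Empty, k, v) = Node (Empty, k, v, Empty)
    | insert (Node (dl, l, w, dr), k, v) =
        if Key.lt (k, l) then Node (insert (dl, k, v), l, w, dr)
        else if Key.lt (l, k) then Node (dl, l, w, insert (dr, k, v))
        else Node (dl, k, v, dr)
  fun lookup (Empty, _) = NONE
    | lookup (Node (dl, l, v, dr), k) =
        if Key.lt (k, l) then lookup (dl, k)
        else if Key.lt (l, k) then lookup (dr, k)
        else SOME v
end

val d = StringDict.insert (StringDict.insert (StringDict.empty, "b", 2), "a", 1)
val found = StringDict.lookup (d, "a")
val missing = StringDict.lookup (d, "c")
```

Only the Key substructure mentions how keys are compared, so swapping LexString for another ordered type changes the behavior without touching insert or lookup.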
Similarly, we may implement LessIntDict, with the standard ordering, as
follows:
structure LessIntDict :> INT_DICT =
  struct
    structure Key : ORDERED = LessInt
    datatype 'a dict =
        Empty
      | Node of 'a dict * Key.t * 'a * 'a dict
    val empty = Empty
    fun insert (Empty, k, v) = Node (Empty, k, v, Empty)
    fun lookup (Empty, _) = NONE
      | lookup (Node (dl, l, v, dr), k) =
          if Key.lt (k, l) then
            lookup (dl, k)
          else if Key.lt (l, k) then
            lookup (dr, k)
          else
            SOME v
  end
Similarly, dictionaries with integer keys ordered by divisibility may be
implemented as follows:
structure IntDivDict :> INT_DICT =
  struct
    structure Key : ORDERED = DivInt
    datatype 'a dict =
        Empty
      | Node of 'a dict * Key.t * 'a * 'a dict
    val empty = Empty
    fun insert (Empty, k, v) = Node (Empty, k, v, Empty)
    fun lookup (Empty, _) = NONE
      | lookup (Node (dl, l, v, dr), k) =
          if Key.lt (k, l) then
            lookup (dl, k)
          else if Key.lt (l, k) then
            lookup (dr, k)
          else
            SOME v
  end
Taking stock of the development, what we have done is to structure
the signature of dictionaries to allow the type of keys, together with its
interpretation, to vary from one implementation to another. The Key
substructure may be viewed as a “parameter” of the signature DICT that is
“instantiated” by specialization to specific types of interest. In this sense
substructures subsume the notion of a parameterized signature found in some
languages. There are several advantages to this:
1. A signature with one or more substructures is still a complete
signature. Parameterized signatures, in contrast, are incomplete
signatures that must be completed to be used.
2. Any substructure of a signature may play the role of a “parameter”.
There is no need to designate in advance which are “arguments” and
which are “results”.
In chapter 23 we will introduce the mechanisms needed to build a
generic implementation of dictionaries that may be instantiated by the key
type and its ordering.
21.2 Sample Code
Here is the code for this chapter.
Chapter 22
Sharing Speciﬁcations
In chapter 21 we illustrated the use of substructures to express the
dependence of one abstraction on another. In this chapter we will consider
the problem of symmetric combination of modules to form larger modules.
22.1 Combining Abstractions
The discussion will be based on a representation of geometry in ML based
on the following (drastically simpliﬁed) signature.
signature GEOMETRY =
  sig
    structure Point : POINT
    structure Sphere : SPHERE
  end
For the purposes of this example, we have reduced geometry to two
concepts, that of a point in space and that of a sphere.
Points and vectors are fundamental to representing geometry. They are
described by the following (abbreviated) signatures:
signature VECTOR =
  sig
    type vector
    val zero : vector
    val scale : real * vector -> vector
    val add : vector * vector -> vector
    val dot : vector * vector -> real
  end
signature POINT =
  sig
    structure Vector : VECTOR
    type point
    (* move a point along a vector *)
    val translate : point * Vector.vector -> point
    (* the vector from a to b *)
    val ray : point * point -> Vector.vector
  end
The vector operations support addition, scalar multiplication, and inner
product, and include a unit element for addition. The point operations
support translation of a point along a vector and the creation of a vector as
the “difference” of two points (i.e., the vector from the ﬁrst to the second).
Spheres are implemented by a module implementing the following
(abbreviated) signature:
signature SPHERE =
  sig
    structure Vector : VECTOR
    structure Point : POINT
    type sphere
    val sphere : Point.point * Vector.vector -> sphere
  end
The operation sphere creates a sphere centered at a given point and with
the radius vector given.
These signatures are intentionally designed so that the dimension of
the space is not part of the specification. This allows us, using the
mechanisms to be introduced in chapter 23, to build packages that work in
an arbitrary dimension without requiring run-time conformance checks.
It is the structures, and not the signatures, that specify the dimension.
Two- and three-dimensional geometry are defined by structure bindings
like these:
structure Geom2D :> GEOMETRY = ...
structure Geom3D :> GEOMETRY = ...
As a consequence of the use of opaque ascription, the types
Geom2D.Point.point and Geom3D.Point.point are distinct. This means that
dimensional conformance is enforced by the type checker. For example, we
cannot apply Geom3D.Sphere.sphere to a point in two-space and a vector in
three-space.
This is a good thing: the more static checking we have, the better off
we are. Closer inspection reveals that, unfortunately, we have too much
of a good thing. Suppose that p and q are two-dimensional points of type
Geom2D.Point.point. We might expect to be able to form a sphere
centered at p with radius determined by the vector from p to q:
Geom2D.Sphere.sphere (p, Geom2D.Point.ray (p, q)).
But this expression is ill-typed! The reason is that the types
Geom2D.Point.Vector.vector and Geom2D.Sphere.Vector.vector are also
distinct from one another, which is not at all what we intend.
What has gone wrong? The situation is quite subtle. In keeping with
the guidelines discussed in section 21.1, we have incorporated as
substructures the structures on which a given structure depends. For example,
forming a ray from one point to another yields a vector, so an
implementation of POINT depends on an implementation of VECTOR. Thus, POINT
has a substructure implementing VECTOR, and, similarly, SPHERE has
substructures implementing VECTOR and POINT.
This leads to a proliferation of structures. Even in the very simplified
geometry signature given above, we have two “copies” of the point
abstraction, and three “copies” of the vector abstraction! Since we used
opaque ascription to define the two- and three-dimensional implementations
of the signature GEOMETRY, all of these abstractions are kept distinct
from one another, even though they may be implemented identically.
In a sense this is the correct state of affairs. The various “copies” of,
say, the vector abstraction might well be distinct from one another. In
the elided implementation of two-dimensional geometry, we might have
used completely incompatible notions of vector in each of the three places
where they are required. Of course, this may not be what is intended, but
(so far) there is nothing in the signature to prevent it. Hence, we are compelled
to keep these types distinct.
What is missing is the expression of the intention that the various
“copies” of vectors and points within the geometry abstraction be identical,
so that we can mix-and-match the vectors constructed in various components
of the package. To support this it is necessary to constrain the
implementation to use the same notion of vector throughout. This is achieved
using a type sharing constraint. The revised signatures for the geometry
package look like this:
signature SPHERE =
  sig
    structure Vector : VECTOR
    structure Point : POINT
    sharing type Point.Vector.vector = Vector.vector
    type sphere
    val sphere : Point.point * Vector.vector -> sphere
  end
signature GEOMETRY =
  sig
    structure Point : POINT
    structure Sphere : SPHERE
    sharing type Point.point = Sphere.Point.point
        and Point.Vector.vector = Sphere.Vector.vector
  end
These equations specify that the two “copies” of the point abstraction, and
the three “copies” of the vector abstraction must coincide. In the presence
of the above sharing specification, the ill-typed expression above becomes
well-typed, since now the required type equation holds by explicit
specification in the signature.
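The effect of the sharing constraints can be checked on a deliberately tiny model. Everything in the sketch below is an assumption made for illustration: the signatures are cut down to the components needed, POINT gains a hypothetical origin value so that a client can obtain a point, the one-dimensional structures represent points and vectors as reals, and each sharing equation is written as its own sharing type specification.

```sml
signature VECTOR =
sig
  type vector
  val zero : vector
end

signature POINT =
sig
  structure Vector : VECTOR
  type point
  val origin : point                       (* hypothetical addition *)
  val ray : point * point -> Vector.vector
end

signature SPHERE =
sig
  structure Vector : VECTOR
  structure Point : POINT
  sharing type Point.Vector.vector = Vector.vector
  type sphere
  val sphere : Point.point * Vector.vector -> sphere
end

signature GEOMETRY =
sig
  structure Point : POINT
  structure Sphere : SPHERE
  sharing type Point.point = Sphere.Point.point
  sharing type Point.Vector.vector = Sphere.Vector.vector
end

structure Vector1D : VECTOR =
struct
  type vector = real
  val zero = 0.0
end

structure Point1D : POINT =
struct
  structure Vector : VECTOR = Vector1D
  type point = real
  val origin = 0.0
  fun ray (a : point, b : point) : Vector.vector = b - a
end

structure Sphere1D : SPHERE =
struct
  structure Vector : VECTOR = Vector1D
  structure Point : POINT = Point1D
  type sphere = Point.point * Vector.vector
  fun sphere (c, r) = (c, r)
end

structure Geom1D :> GEOMETRY =
struct
  structure Point = Point1D
  structure Sphere = Sphere1D
end

(* Well-typed only because of the sharing specifications in GEOMETRY. *)
val p = Geom1D.Point.origin
val s = Geom1D.Sphere.sphere (p, Geom1D.Point.ray (p, p))
```

Deleting either sharing specification from GEOMETRY should cause the last binding to be rejected, since the opaque ascription would then keep the corresponding types distinct.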
As a notational convenience we may use a structure sharing constraint
instead to express the same requirements:
signature SPHERE =
  sig
    structure Vector : VECTOR
    structure Point : POINT
    sharing Point.Vector = Vector
    type sphere
    val sphere : Point.point * Vector.vector -> sphere
  end
signature GEOMETRY =
  sig
    structure Point : POINT
    structure Sphere : SPHERE
    sharing Point = Sphere.Point
        and Point.Vector = Sphere.Vector
  end
Rather than specify the required sharing type-by-type, we can instead
specify it structure-by-structure, with the meaning that corresponding types
of shared structures are required to share. Since each structure in our
example contains only one type, the effect of the structure sharing
specification above is identical to the preceding type sharing specification.
Not only does the sharing specification ensure that the desired
equations hold amongst the various components of an implementation of
GEOMETRY, but it also constrains the implementation to ensure that these
types are the same. It is easy to achieve this requirement by defining a
single implementation of points and vectors that is reused in the
higher-level abstractions.
structure Vector3D : VECTOR = ...
structure Point3D : POINT =
  struct
    structure Vector : VECTOR = Vector3D
    ...
  end
structure Sphere3D : SPHERE =
  struct
    structure Vector : VECTOR = Vector3D
    structure Point : POINT = Point3D
    ...
  end
structure Geom3D :> GEOMETRY =
  struct
    structure Point = Point3D
    structure Sphere = Sphere3D
  end
The required type sharing constraints are true by construction.
Had we instead replaced the above declaration of Geom3D by the
following one, the type checker would reject it on the grounds that the
required sharing between Sphere.Point and Point does not hold, because
Sphere2D.Point is distinct from Point3D.
structure Geom3D :> GEOMETRY =
  struct
    structure Point = Point3D
    structure Sphere = Sphere2D
  end
It is natural to wonder whether it might be possible to restructure the
GEOMETRY signature so that the duplication of the point and vector
components is avoided, thereby obviating the need for sharing specifications.
One can restructure the code in this manner, but doing so would do
violence to the overall structure of the program. This is why sharing
specifications are so important.
Let's try to reorganize the signature GEOMETRY so that duplication of
the point and vector structures is avoided. One step is to eliminate the
substructure Vector from SPHERE, replacing uses of Vector.vector by
Point.Vector.vector.
signature SPHERE =
  sig
    structure Point : POINT
    type sphere
    val sphere :
      Point.point * Point.Vector.vector -> sphere
  end
After all, since the structure Point comes equipped with a notion of vector,
why not use it?
This cuts down the number of sharing speciﬁcations to one:
signature GEOMETRY =
sig
  structure Point : POINT
  structure Sphere : SPHERE
  sharing Point = Sphere.Point
end
If we could further eliminate the substructure Point from the signature
SPHERE we would have only one copy of Point and no need for a sharing
speciﬁcation.
But what would the signature SPHERE look like in this case?
signature SPHERE =
sig
  type sphere
  val sphere :
    Point.point * Point.Vector.vector -> sphere
end
The problem now is that the signature SPHERE is no longer self-contained.
It makes reference to a structure Point, but which Point are we talking
about? Any commitment would tie the signature to a speciﬁc structure,
and hence a speciﬁc dimension, contrary to our intentions. Rather, the
notion of point must be a generic concept within SPHERE, and hence Point
must appear as a substructure. The substructure Point may be thought of
as a parameter of the signature SPHERE in the sense discussed earlier.
The only other move available to us is to eliminate the structure Point
from the signature GEOMETRY. This is indeed possible, and would eliminate
the need for any sharing speciﬁcations. But it only defers the problem,
rather than solving it. A full-scale geometry package would contain more
abstractions that involve points, so that there will still be copies in the
other abstractions. Sharing specifications would then be required to
ensure that these copies are, in fact, identical.
Here is an example. Let us introduce another geometric abstraction,
the semispace.
signature SEMI_SPACE =
sig
  structure Point : POINT
  type semispace
  val side : Point.point * semispace -> bool option
end
The function side determines (if possible) whether a given point lies in
one half of the semispace or the other.
The expanded GEOMETRY signature would look like this (with the
elimination of the Point structure in place).
signature EXTD_GEOMETRY =
sig
  structure Sphere : SPHERE
  structure SemiSpace : SEMI_SPACE
  sharing Sphere.Point = SemiSpace.Point
end
By an argument similar to the one we gave for the signature SPHERE, we
cannot eliminate the substructure Point from the signature SEMI_SPACE,
and hence we wind up with two copies. Therefore the sharing
specification is required.
What is at issue here is a fundamental tension in the very notion of
modular programming. On the one hand we wish to separate modules
from one another so that they may be treated independently. This
requires that the signatures of these modules be self-contained. Unbound
references to a structure, such as Point, tie that signature to a specific
implementation, in violation of our desire to treat modules separately
from one another. On the other hand we wish to combine modules together
to form programs. Doing so requires that the composition be coherent,
which is achieved by the use of sharing specifications. What sharing
specifications do for you is to provide an after-the-fact means of tying
together several different abstractions to form a coherent whole. This
approach to the problem of coherence is a unique, and uniquely
effective, feature of the ML module system.
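To make the role of the sharing specification concrete, here is a minimal
sketch. It assumes drastically simplified, one-dimensional versions of POINT
and SPHERE; the components origin, coord, and center, and the structures
Point1D, Sphere1D, and Geom1D, are hypothetical stand-ins rather than the
signatures developed in this chapter.

```sml
(* Hypothetical, simplified signatures; not the ones from the text. *)
signature POINT =
sig
  type point
  val origin : point
  val coord : point -> real
end

signature SPHERE =
sig
  structure Point : POINT
  type sphere
  val sphere : Point.point * real -> sphere
  val center : sphere -> Point.point
end

signature GEOMETRY =
sig
  structure Point : POINT
  structure Sphere : SPHERE
  sharing Point = Sphere.Point
end

structure Point1D : POINT =
struct
  type point = real
  val origin = 0.0
  fun coord p = p
end

structure Sphere1D : SPHERE =
struct
  structure Point = Point1D
  type sphere = Point.point * real
  fun sphere (c, r) = (c, r)
  fun center (c, _) = c
end

(* The sharing holds by construction: both copies of Point are Point1D. *)
structure Geom1D :> GEOMETRY =
struct
  structure Point = Point1D
  structure Sphere = Sphere1D
end

(* Point.point is abstract outside Geom1D, so the next two lines type
   check only because the signature equates it with Sphere.Point.point. *)
val s = Geom1D.Sphere.sphere (Geom1D.Point.origin, 2.0)
val c = Geom1D.Point.coord (Geom1D.Sphere.center s)
```

If the sharing specification were deleted from GEOMETRY, the opaque
ascription would make Geom1D.Point.point and Geom1D.Sphere.Point.point two
unrelated abstract types, and the binding of s would be rejected.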
22.2 Sample Code
Here is the code for this chapter.
Chapter 23
Parameterization
To support code reuse it is useful to define generic, or parameterized,
modules that leave unspecified some aspects of the implementation of a
module. The unspecified parts may be instantiated to determine specific
instances of the module. The common part is thereby implemented once and
shared among all instances.
In ML such generic modules are called functors. A functor is a module-level
function that takes a structure as argument and yields a structure
as result. Instances are created by applying the functor to an argument
specifying the interpretation of the parameters.
23.1 Functor Bindings and Applications
Functors are deﬁned using a functor binding. There are two forms, the
opaque and the transparent. A transparent functor has the form
functor funid(decs):sigexp = strexp
where the result signature, sigexp, is transparently ascribed; an opaque
functor has the form
functor funid(decs):>sigexp = strexp
where the result signature is opaquely ascribed. A functor is a module-level
function whose argument is a sequence of declarations, and whose
result is a structure.
Type checking of a functor consists of checking that the functor body
matches the ascribed result signature, given that the parameters of the
functor have the specified signatures. For a transparent functor the
result signature is the augmented signature determined by matching the
principal signature of the body (relative to the assumptions governing the
parameters) against the given result signature. For opaque functors the
ascribed result signature is used as is, without further augmentation by
type equations.
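The difference between the two forms can be seen in a small sketch. The
signature COUNTER and the functors TransparentCounter and OpaqueCounter are
hypothetical names invented for this illustration; the two functor bodies
are identical and differ only in the ascription.

```sml
(* Hypothetical signature used only for this illustration. *)
signature COUNTER =
sig
  type t
  val zero : t
  val next : t -> t
  val toInt : t -> int
end

(* Transparent ascription: the result signature is augmented with t = int. *)
functor TransparentCounter (val step : int) : COUNTER =
struct
  type t = int
  val zero = 0
  fun next n = n + step
  fun toInt n = n
end

(* Opaque ascription: the result signature is used as is, so t is abstract. *)
functor OpaqueCounter (val step : int) :> COUNTER =
struct
  type t = int
  val zero = 0
  fun next n = n + step
  fun toInt n = n
end

structure TC = TransparentCounter (val step = 2)
structure OC = OpaqueCounter (val step = 2)

(* Accepted: TC.t is known to be int, so an integer may be passed directly. *)
val a = TC.next 40

(* OC.t is abstract, so values must be built from OC.zero and OC.next. *)
val b = OC.toInt (OC.next OC.zero)

(* Rejected by the type checker, since OC.t is not known to be int:
   val bad = OC.next 40 *)
```

The parameter here is a single value declaration, which is one instance of
the general form in which the argument is a sequence of declarations.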
In chapter 21 we developed a modular implementation of dictionaries
in which the ordering of keys is made explicit as a substructure. The
implementation of dictionaries is the same for each choice of order
structure on the keys. This can be expressed by a single functor binding
that defines the implementation of dictionaries parametrically in the
choice of keys.
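functor DictFun
  (structure K : ORDERED) :>
  DICT where type Key.t = K.t =
struct
  structure Key : ORDERED = K
  datatype 'a dict =
    Empty
  | Node of 'a dict * Key.t * 'a * 'a dict
  val empty = Empty
  (* Insert into the subtree chosen by the key ordering; inserting an
     existing key replaces its value (one reasonable choice). *)
  fun insert (Empty, k, v) =
        Node (Empty, k, v, Empty)
    | insert (Node (dl, l, w, dr), k, v) =
        if Key.lt (k, l) then
          Node (insert (dl, k, v), l, w, dr)
        else if Key.lt (l, k) then
          Node (dl, l, w, insert (dr, k, v))
        else
          Node (dl, k, v, dr)
  fun lookup (Empty, _) = NONE
    | lookup (Node (dl, l, v, dr), k) =
        if Key.lt (k, l) then
          lookup (dl, k)
        else if Key.lt (l, k) then
          lookup (dr, k)
        else
          SOME v
end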
The functor DictFun takes as argument a single structure specifying the
ordered key type as a structure implementing signature ORDERED. The
opaquely ascribed result signature is
DICT where type Key.t = K.t.
This signature holds the type 'a dict abstract, but specifies that the key
type is the type K.t passed as argument to the functor. The body of the
functor has been written so that the comparison operations are obtained
from the Key substructure. This ensures that the dictionary code is
independent of the choice of key type and ordering.
Instances of functors are obtained by application. A functor application
has the form
funid(binds)
where binds is a sequence of bindings of the arguments of the functor.
The signature of a functor application is determined by the following
procedure. We assume we are given the signatures of the functor
parameters, and also the “true” result signature of the functor (the given
signature for opaque functors, the augmented signature for transparent
functors).
1. For each argument, match the argument signature against the
   corresponding parameter signature of the functor. This determines an
   augmentation of the parameter signature for each argument (as
   described in chapter 20).
2. For each reference to a type component of a functor parameter in
   the result signature, propagate the type definitions of the augmented
   parameter signature to the result signature.
The signature of the application determined by this procedure is then
opaquely ascribed to the application. This means that if a type is left
abstract in the result signature of a functor, that type is “new” in every
instance of that functor. This behavior is called generativity of the
functor.[1]
Returning to the example of the dictionary functor, the three versions of
dictionaries considered in chapter 21 may be obtained by applying DictFun
to appropriate arguments.
structure LtIntDict = DictFun (structure K = LessInt)
structure LexStringDict = DictFun (structure K = LexString)
structure DivIntDict = DictFun (structure K = DivInt)
[1] The alternative, called applicativity, means that there is one abstract
type shared by all instances of that functor.
In each case the functor DictFun is instantiated by specifying a binding for
its argument structure K. The argument structures are, as described in
chapter 21, implementations of the signature ORDERED. They specify the
type of keys and the sense in which they are ordered.
The signatures for the structures LtIntDict, LexStringDict, and DivIntDict
are determined by instantiating the result signature of the functor DictFun
according to the above procedure. Consider the application of DictFun to
LessInt. The augmented signature resulting from matching the signature
of LessInt against the parameter signature ORDERED is the signature
ORDERED where type t=int
Assigning this to the parameter K, we deduce that the type K.t is
equivalent to int, and hence the result signature of DictFun is
DICT where type Key.t = int
so that LtIntDict.Key.t is equivalent to int, as desired. By a similar
process we deduce that the signature of LexStringDict is
DICT where type Key.t = string
and that the signature of DivIntDict is
DICT where type Key.t = int.
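The following self-contained sketch puts the pieces together so that the
effect of the procedure can be observed. The signatures ORDERED and DICT are
reconstructed here from the way they are used in this chapter, the body of
LessInt is an assumption, and LtIntDict2 is a hypothetical second instance
introduced only to illustrate generativity.

```sml
(* ORDERED and DICT as assumed from their use in this chapter. *)
signature ORDERED =
sig
  type t
  val lt : t * t -> bool
end

signature DICT =
sig
  structure Key : ORDERED
  type 'a dict
  val empty : 'a dict
  val insert : 'a dict * Key.t * 'a -> 'a dict
  val lookup : 'a dict * Key.t -> 'a option
end

functor DictFun
  (structure K : ORDERED) :>
  DICT where type Key.t = K.t =
struct
  structure Key : ORDERED = K
  datatype 'a dict =
    Empty
  | Node of 'a dict * Key.t * 'a * 'a dict
  val empty = Empty
  fun insert (Empty, k, v) = Node (Empty, k, v, Empty)
    | insert (Node (dl, l, w, dr), k, v) =
        if Key.lt (k, l) then Node (insert (dl, k, v), l, w, dr)
        else if Key.lt (l, k) then Node (dl, l, w, insert (dr, k, v))
        else Node (dl, k, v, dr)
  fun lookup (Empty, _) = NONE
    | lookup (Node (dl, l, v, dr), k) =
        if Key.lt (k, l) then lookup (dl, k)
        else if Key.lt (l, k) then lookup (dr, k)
        else SOME v
end

(* Assumed implementation of the LessInt argument structure. *)
structure LessInt : ORDERED =
struct
  type t = int
  val lt = (op <) : int * int -> bool
end

structure LtIntDict = DictFun (structure K = LessInt)
structure LtIntDict2 = DictFun (structure K = LessInt)

(* Integer keys are accepted because Key.t = int was propagated. *)
val d = LtIntDict.insert (LtIntDict.insert (LtIntDict.empty, 3, "three"), 1, "one")
val found = LtIntDict.lookup (d, 3)
val missing = LtIntDict.lookup (d, 2)

(* Generativity: the following is rejected, because 'a LtIntDict.dict and
   'a LtIntDict2.dict are distinct abstract types:
   val bad = LtIntDict2.lookup (d, 3) *)
```

Both instances have signature DICT where type Key.t = int, yet their
dictionary types do not mix, which is what generativity predicts.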
23.2 Functors and Sharing Speciﬁcations
In chapter 22 we developed a signature of geometric primitives that
contained sharing specifications to ensure that the constituent abstractions
may be combined properly. The signature GEOMETRY is defined as follows:
signature GEOMETRY =
sig
  structure Point : POINT
  structure Sphere : SPHERE
  sharing Point = Sphere.Point
      and Point.Vector = Sphere.Vector
      and Sphere.Vector = Sphere.Point.Vector
end
The sharing clauses ensure that the Point and Sphere components are
compatible with each other.
Since we expect to define vectors, points, and spheres of various
dimensions, it makes sense to implement these as functors, according to the
following scheme:
functor PointFun
  (structure V : VECTOR) : POINT = ...
functor SphereFun
  (structure V : VECTOR
   structure P : POINT) : SPHERE =
struct
  structure Vector = V
  structure Point = P
  ...
end
functor GeomFun
  (structure P : POINT
   structure S : SPHERE) : GEOMETRY =
struct
  structure Point = P
  structure Sphere = S
end
A two-dimensional geometry package may then be defined as follows:
structure Vector2D : VECTOR = ...
structure Point2D : POINT =
PointFun (structure V = Vector2D)
structure Sphere2D : SPHERE =
SphereFun (structure V = Vector2D and P = Point2D)
structure Geom2D : GEOMETRY =
GeomFun (structure P = Point2D and S = Sphere2D)
A three-dimensional version is defined similarly.
There is only one problem: the functors SphereFun and GeomFun are
not well-typed! The reason is that in both cases their result signatures
require type equations that are not true of their parameters! For example,
the signature SPHERE requires that Point.Vector be the same as Vector,
which is not satisfied by the body of SphereFun. For this to be true, the
structures P.Vector and V must be equivalent. This is not true in general,
because the functor might be applied to arguments for which this is false.
Similar problems plague the functor GeomFun. The solution is to include
sharing constraints in the parameter list of the functors, as follows:
functor SphereFun
  (structure V : VECTOR
   structure P : POINT
   sharing P.Vector = V) : SPHERE =
struct
  structure Vector = V
  structure Point = P
  ...
end
functor GeomFun
  (structure P : POINT
   structure S : SPHERE
   sharing P.Vector = S.Vector and P = S.Point) : GEOMETRY =
struct
  structure Point = P
  structure Sphere = S
end
These equations preclude instantiations for which the required equations
do not hold, and are sufﬁcient to ensure that the requirements of the result
signatures of the functors are met.
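As a sketch of how such a constrained functor is written and applied,
consider the following one-dimensional instance. The contents of VECTOR,
POINT, and SPHERE are simplified assumptions, as are the components zero,
scale, origin, translate, and center, the representation chosen for spheres,
and the structures Vector1D, Point1D, and Sphere1D.

```sml
(* Simplified, assumed versions of the chapter's signatures. *)
signature VECTOR =
sig
  type vector
  val zero : vector
  val scale : real * vector -> vector
end

signature POINT =
sig
  structure Vector : VECTOR
  type point
  val origin : point
  val translate : point * Vector.vector -> point
end

signature SPHERE =
sig
  structure Vector : VECTOR
  structure Point : POINT
  sharing Point.Vector = Vector
  type sphere
  val sphere : Point.point * Vector.vector -> sphere
  val center : sphere -> Point.point
end

(* Without the sharing clause the body would not match SPHERE, since
   nothing would relate P.Vector to V. *)
functor SphereFun
  (structure V : VECTOR
   structure P : POINT
   sharing P.Vector = V) : SPHERE =
struct
  structure Vector = V
  structure Point = P
  type sphere = Point.point * Vector.vector   (* hypothetical representation *)
  fun sphere (c, r) = (c, r)
  fun center (c, _) = c
end

structure Vector1D : VECTOR =
struct
  type vector = real
  val zero = 0.0
  fun scale (k : real, v : real) = k * v
end

structure Point1D : POINT =
struct
  structure Vector = Vector1D
  type point = real
  val origin = 0.0
  fun translate (p : real, v : real) = p + v
end

(* Accepted: Point1D.Vector is Vector1D, so the sharing constraint holds. *)
structure Sphere1D =
  SphereFun (structure V = Vector1D structure P = Point1D)

val s = Sphere1D.sphere (Point1D.origin, 1.0)
val c = Sphere1D.center s
```

An application whose P was built over some other vector structure would be
rejected at the point of application, which is exactly the sense in which
the equations preclude unsuitable instantiations.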
23.3 Avoiding Sharing Speciﬁcations
As with sharing specifications in signatures, it is natural to wonder
whether they can be avoided in functor parameters. Once again, the answer
is “yes”, but doing so does violence to the structure of your program. The
chief virtue of sharing specifications is that they express directly and
concisely the required relationships without requiring that these
relationships be anticipated when defining the signatures of the
parameters. This greatly facilitates reuse of off-the-shelf code, for which
it is impossible to assume any sharing relationships that one may wish to
impose in an application.
To see what happens, let’s consider the best-case scenario from
chapter 22 in which we have minimized sharing specifications to one tying
together the sphere and semispace components. That is, we’re to implement
the following signature:
signature EXTD_GEOMETRY =
sig
  structure Sphere : SPHERE
  structure SemiSpace : SEMI_SPACE
  sharing Sphere.Point = SemiSpace.Point
end
The implementation is a functor of the form
functor ExtdGeomFun
  (structure Sp : SPHERE
   structure Ss : SEMI_SPACE
   sharing Sp.Point = Ss.Point) =
struct
  structure Sphere = Sp
  structure SemiSpace = Ss
end
To eliminate the sharing equation in the functor parameter, we must
arrange that the sharing equation in the signature EXTD_GEOMETRY holds.
Simply dropping the sharing specification will not do, because then there
is no reason to believe that it will hold as required in the signature. A
natural move is to “factor out” the implementation of POINT, and use it to
ensure that the required equation is true of the functor body. There are
two methods for doing this, each with disadvantages compared to the use
of sharing specifications.
One is to make the desired equation true by construction. Rather than
take implementations of SPHERE and SEMI_SPACE as arguments, the functor,
here called ExtdGeomFun1, takes only an implementation of POINT, then
creates in the functor body appropriate implementations of spheres and
semispaces.
functor SphereFun
  (structure P : POINT) : SPHERE =
struct
  structure Vector = P.Vector
  structure Point = P
  ...
end
functor SemiSpaceFun
  (structure P : POINT) : SEMI_SPACE =
struct
  ...
end
functor ExtdGeomFun1
  (structure P : POINT) : EXTD_GEOMETRY =
struct
  structure Sphere =
    SphereFun (structure P = P)
  structure SemiSpace =
    SemiSpaceFun (structure P = P)
end
The problems with this solution are these:
• The body of ExtdGeomFun1 makes use of the functors SphereFun and
  SemiSpaceFun. In effect we are limiting the geometry functor to
  arguments that are built from these specific functors, and no other.
  This is a significant loss of generality that is otherwise present in
  the functor ExtdGeomFun, which may be applied to any implementations of
  SPHERE and SEMI_SPACE.
• The functor ExtdGeomFun1 must have as parameter the common
  element(s) of the components of its body, which is then used to build
  up the appropriate substructures in a manner consistent with the
  required sharing. This approach does not scale well when many
  abstractions are layered atop one another. We must reconstruct the
  entire hierarchy, starting with the components that are conceptually
  “furthest away” as arguments.
• There is no inherent reason why ExtdGeomFun1 must take an
  implementation of POINT as argument. It does so only so that it can
  reconstruct the hierarchy so as to satisfy the sharing requirements of
  the result signature.
Another approach is to factor out the common component, and use
this to constrain the arguments to the functor to ensure that the possible
arguments are limited to situations for which the required sharing holds.
functor ExtdGeomFun2
  (structure P : POINT
   structure Sp : SPHERE where Point = P
   structure Ss : SEMI_SPACE where Point = P) =
struct
  structure Sphere = Sp
  structure SemiSpace = Ss
end
Now the sharing requirements are met, but it is also clear that this
approach has no particular advantages over just using a sharing
specification. It has the disadvantage of requiring a third argument, whose
only role is to make it possible to express the required sharing. An
application of this functor must provide not only implementations of SPHERE
and SEMI_SPACE, but also an implementation of POINT that is used to build
these!
A slightly more sophisticated version of this solution is as follows:
functor ExtdGeomFun3
  (structure Sp : SPHERE
   structure Ss : SEMI_SPACE where Point = Sp.Point) =
struct
  structure Sphere = Sp
  structure SemiSpace = Ss
end
The “extra” parameter to the functor has been eliminated by choosing one
of the components as a “representative” and insisting that the others be
compatible with it by using a where clause.[2] This solution has all of the
advantages of the direct use of sharing specifications, and no further
disadvantages. However, we are forced to violate arbitrarily the inherent
symmetry of the situation. We could just as well have written
[2] Officially, we must write where type Point.point = Sp.Point.point, but
many compilers accept the syntax above.
functor ExtdGeomFun4
  (structure Ss : SEMI_SPACE
   structure Sp : SPHERE where Point = Ss.Point) =
struct
  structure Sphere = Sp
  structure SemiSpace = Ss
end
without changing the meaning.
Here is the point: sharing specifications allow a symmetric situation to be
treated in a symmetric manner. The compiler breaks the symmetry by
choosing representatives arbitrarily in the manner illustrated above.
Sharing specifications offload the burden of making such tedious (because
arbitrary) decisions to the compiler, rather than imposing it on the
programmer.
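For comparison, here is a sketch of the asymmetric and symmetric
formulations side by side, written with the official where type and
sharing type forms so that it does not depend on compiler extensions. The
simplified signatures, the helper centerSide, and the structures P1, S1, H1,
G, and G3 are all hypothetical and serve only to make the sketch
self-contained.

```sml
(* Simplified, assumed signatures. *)
signature POINT =
sig
  type point
  val origin : point
end

signature SPHERE =
sig
  structure Point : POINT
  type sphere
  val sphere : Point.point * real -> sphere
  val center : sphere -> Point.point
end

signature SEMI_SPACE =
sig
  structure Point : POINT
  type semispace
  val side : Point.point * semispace -> bool option
end

(* Asymmetric: Sp is the representative and Ss must agree with it. *)
functor ExtdGeomFun3
  (structure Sp : SPHERE
   structure Ss : SEMI_SPACE where type Point.point = Sp.Point.point) =
struct
  structure Sphere = Sp
  structure SemiSpace = Ss
  (* Well-typed only because the two point types coincide. *)
  fun centerSide (s, h) = SemiSpace.side (Sphere.center s, h)
end

(* Symmetric: neither parameter is singled out. *)
functor ExtdGeomFun
  (structure Sp : SPHERE
   structure Ss : SEMI_SPACE
   sharing type Sp.Point.point = Ss.Point.point) =
struct
  structure Sphere = Sp
  structure SemiSpace = Ss
  fun centerSide (s, h) = SemiSpace.side (Sphere.center s, h)
end

structure P1 : POINT =
struct
  type point = real
  val origin = 0.0
end

structure S1 : SPHERE =
struct
  structure Point = P1
  type sphere = real * real
  fun sphere (c, r) = (c, r)
  fun center (c, _) = c
end

structure H1 : SEMI_SPACE =
struct
  structure Point = P1
  type semispace = real   (* position of the boundary *)
  fun side (p : real, b : real) =
    if Real.== (p, b) then NONE else SOME (p < b)
end

structure G3 = ExtdGeomFun3 (structure Sp = S1 structure Ss = H1)
structure G = ExtdGeomFun (structure Sp = S1 structure Ss = H1)

val r3 = G3.centerSide (S1.sphere (P1.origin, 1.0), 5.0)
val r = G.centerSide (S1.sphere (P1.origin, 1.0), 5.0)
```

Both functors accept exactly the same arguments and behave identically; the
only difference is whether the programmer or the compiler chooses the
representative.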
23.4 Sample Code
Here is the code for this chapter.
Part IV
Programming Techniques
In this part of the book we will explore the use of Standard ML to build
elegant, reliable, and efﬁcient programs. The discussion takes the form of
a series of worked examples illustrating various techniques for building
programs.
Chapter 24
Speciﬁcations and Correctness
The most important tools for getting programs right are specification and
verification. In this chapter we review the main ideas in preparation for
their subsequent use in the rest of the book.
24.1 Speciﬁcations
A specification is a description of the behavior of a piece of code.
Specifications take many forms:
• Typing. A type specification describes the “form” of the value of an
  expression, without saying anything about the value itself.
• Effect Behavior. An effect specification resembles a type specification,
  but instead of describing the value of an expression, it describes the
  effects it may engender when evaluated.
• Input-Output Behavior. An input-output specification is a mathematical
  formula, usually an implication, that describes the output of a
  function for all inputs satisfying some assumptions.
• Time and Space Complexity. A complexity specification states the time
  or space required to evaluate an expression. The specification is most
  often stated asymptotically in terms of the number of execution steps
  or the size of a data structure.
• Equational Specifications. An equational specification states that one
  code fragment is equivalent to another. This means that wherever
  one is used we may replace it with the other without changing the
  observable behavior of any program in which they occur.
This list by no means exhausts the possibilities. What speciﬁcations have
in common, however, is a descriptive, or declarative, ﬂavor, rather than a
prescriptive, or operational, one. A speciﬁcation states what a piece of code
does, not how it does it.
Good programmers use specifications to state precisely and concisely
their intentions when writing a piece of code. The code is written to solve
a problem; the specification records for future reference what problem the
code was intended to solve. The very act of formulating a precise
specification of a piece of code is often the key to finding a good
solution to a tricky problem. The guiding principle is this: if you are
unable to write a clear specification, you do not understand the problem
well enough to solve it correctly.
The greatest difficulty in using specifications is in knowing what to
say. A common misunderstanding is that a given piece of code has one
specification stating all there is to know about it. Rather you should see
a specification as describing one of perhaps many properties of interest.
In ML every program comes with at least one specification, its type. But
we may also consider other specifications, according to interest and need.
Sometimes only relatively weak properties are important: the function
square always yields a nonnegative result. Other times stronger
properties are needed: the function fib' applied to n yields the nth and
(n-1)st Fibonacci numbers. It’s a matter of taste and experience to know
what to say and how best to say it.
Recall the following two ML functions from chapter 7:
fun fib 0 = 1
  | fib 1 = 1
  | fib n = fib (n-1) + fib (n-2)
fun fib' 0 = (1, 0)
  | fib' 1 = (1, 1)
  | fib' n =
    let
      val (a, b) = fib' (n-1)
    in
      (a+b, a)
    end
Here are some specifications pertaining to these functions:
• Type specifications:
  – val fib : int -> int
  – val fib' : int -> int * int
• Effect specifications:
  – The application fib n may raise the exception Overflow.
  – The application fib' n may raise the exception Overflow.
• Input-output specifications:
  – If n ≥ 0, then fib n evaluates to the nth Fibonacci number.
  – If n ≥ 0, then fib' n evaluates to the nth and (n-1)st Fibonacci
    numbers, in that order.
• Time complexity specifications:
  – If n ≥ 0, then fib n terminates in O(2^n) steps.
  – If n ≥ 0, then fib' n terminates in O(n) steps.
• Equivalence specification:
  – For all n ≥ 0, fib n is equivalent to #1(fib' n).
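Some of these specifications can be spot-checked by running the code, which
is a useful sanity test though no substitute for proof. The sketch below
repeats the two definitions and compares them on the arguments 0 through 20;
the helper agree and the bound 20 are arbitrary choices made for
illustration.

```sml
fun fib 0 = 1
  | fib 1 = 1
  | fib n = fib (n-1) + fib (n-2)

fun fib' 0 = (1, 0)
  | fib' 1 = (1, 1)
  | fib' n =
    let
      val (a, b) = fib' (n-1)
    in
      (a+b, a)
    end

(* agree n is true iff fib k = #1 (fib' k) for every k with 0 <= k <= n.
   This tests finitely many instances of the equivalence specification. *)
fun agree n =
  List.all (fn k => fib k = #1 (fib' k)) (List.tabulate (n + 1, fn k => k))

val ok = agree 20
val pair5 = fib' 5
```

A successful run says nothing about arguments beyond those tested, and
nothing at all about the complexity or effect specifications.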
24.2 Correctness Proofs
A program satisﬁes, or meets, a speciﬁcation iff its execution behavior is
as described by the speciﬁcation. Veriﬁcation is the process of checking
that a program satisﬁes a speciﬁcation. This takes the form of proving a
mathematical theorem (by hand or with machine assistance) stating that
the program implements a speciﬁcation.
There are many misunderstandings in the literature about speciﬁcation
and veriﬁcation. It is worthwhile to take time out to address some of them
here.
It is often said that a program is correct if it meets a specification. The
verification of this fact is then called a correctness proof. While there is
nothing wrong with this usage, it invites misinterpretation. As we remarked
in section 24.1, there is in general no single preferred specification of a
piece of code. Verification is always relative to a specification, and hence
so is correctness. In particular, a program can be “correct” with respect to
one specification, and “incorrect” with respect to another. Consequently,
a program is never inherently correct or incorrect; it is only so relative
to a particular specification.
A related misunderstanding is the inference that a program “works”
from the fact that it is “correct” in this sense. Specifications usually
make assumptions about the context in which the program is used. If these
assumptions are not satisfied, then all bets are off: nothing can be said
about its behavior. For this reason it is entirely possible that a correct
program will malfunction when used, not because the specification is not
met, but rather because the specification is not relevant to the context in
which the code is used.
Another common misconception about specifications is that they can
always be implemented as runtime checks.[1] There are at least two
fallacies here:
1. The specification is stated at the level of the source code. It may
   not even be possible to test it at runtime. See chapter 32 for further
   discussion of this point.
2. The specification may not be mechanically checkable. For example,
   we might specify that a function f of type int->int always yields a
   nonnegative result. But there is no way to implement this as a
   runtime check: this is an undecidable, or uncomputable, problem.
For this reason specifications are strictly more powerful than runtime
checks. Correspondingly, it takes more work than a mere conditional
branch to ensure that a program satisfies a specification.
[1] This misconception is encouraged by the C assert macro, which introduces
an executable test that a certain computable condition holds. This is a fine
thing, but from this many people draw the conclusion that assertions
(specifications) are simply boolean tests. This is false.
Finally, it is important to note that specification, implementation, and
verification go hand in hand. It is unrealistic to propose to verify that an
arbitrary piece of code satisfies an arbitrary specification. Fundamental
computability and complexity results make clear that we can never
succeed in such an endeavor. Fortunately, it is also completely artificial.
In practice we specify, code, and verify simultaneously, with each activity
informing the other. If the verification breaks down, reconsider the code
or the specification (or both). If the code is difficult to write, look for
insights from the specification and verification. If the specification is
complex, rough out the code and proof to see what it should be. There is no
“magic bullet”, but these are some of the most important, and useful, tools
for building elegant, robust programs.
Veriﬁcations of speciﬁcations take many different forms.
• Type specifications are verified automatically by the ML compiler.
  If you put type annotations on a function stating the types of the
  parameters or results of that function, the correctness of these
  annotations is ensured by the compiler. For example, we may state the
  intended typing of fib as follows:
  fun fib (n:int):int =
    case n
      of 0 => 1
       | 1 => 1
       | n => fib (n-1) + fib (n-2)
  This ensures that the type of fib is int->int.
• Effect specifications must be checked by hand. These generally state
  that a piece of code “may raise” one or more exceptions. If an
  exception is not mentioned in such a specification, we cannot conclude
  that the code does not raise it, only that we have no information. Notice
  that a handler serves to eliminate a “may raise” specification. For
  example, the following function cannot raise the exception Overflow,
  even though fib might:
  fun protected_fib n =
    (fib n) handle Overflow => 0
  A handler makes it possible to specify that an exception may not be
  raised by a given expression (because the handler traps it).
• Input-output specifications require proof, typically using some form
  of induction. For example, in chapter 7 we proved that fib n yields
  the nth Fibonacci number by complete induction on n.
• Complexity specifications are often verified by solving a recurrence
  describing the execution time of a program. In the case of fib we
  may read off the following recurrence:
    T(0) = 1
    T(1) = 1
    T(n + 2) = T(n) + T(n + 1) + 1
  Solving this recurrence yields the proof that T(n) = O(2^n).
• Equivalence specifications also require proof. Since equivalence of
  expressions must account for all possible uses of them, these proofs
  are, in general, very tricky. One method that often works is
  “induction plus hand-simulation”. For example, it is not hard to prove by
  induction on n that fib n is equivalent to #1(fib' n). First, plug in
  n = 0 and n = 1, and calculate using the definitions. Then assume
  the result for n and n + 1, and consider n + 2, once again calculating
  based on the definitions, to obtain the result.
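The recurrence for fib can itself be transcribed as a function and compared
against a candidate solution. The sketch below assumes the closed form
T(n) = 2 * fib n - 1, which one can confirm by induction from the recurrence,
and the bound fib n ≤ 2^n, which together give T(n) < 2^(n+1). The names
steps, pow2, and check are hypothetical helpers.

```sml
fun fib 0 = 1
  | fib 1 = 1
  | fib n = fib (n-1) + fib (n-2)

(* steps n transcribes the recurrence T(n) read off from fib. *)
fun steps 0 = 1
  | steps 1 = 1
  | steps n = steps (n-2) + steps (n-1) + 1

(* pow2 n computes 2^n for n >= 0. *)
fun pow2 0 = 1
  | pow2 n = 2 * pow2 (n-1)

(* check n tests the candidate closed form and the exponential bound at n. *)
fun check n =
  steps n = 2 * fib n - 1 andalso steps n < pow2 (n + 1)

val ok = List.all check (List.tabulate (21, fn n => n))
```

As before, the run only samples the claim for n from 0 to 20; the inductive
argument is what establishes it for all n.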
24.3 Enforcement and Compliance
Most specifications have the form of an implication: if certain conditions
are met, then the program behaves a certain way. For example, a type
specification is conditional on the types of the free variables of the
expression: if x has type int, then x+1 has type int. Input-output
specifications are characteristically of this form. For example, if x is
nonnegative, then so is x+1.
Just as in ordinary mathematical reasoning, if the premises of such
a specification are not true, then all bets are off: nothing can be said
about the behavior of the code. The premises of the specification are
preconditions that must be met in order for the program to behave in the
manner described by the conclusion, or postcondition, of the specification.
This means that the preconditions impose obligations on the caller, the
user of the code, in order for the callee, the code itself, to be
well-behaved. A conditional specification is a contract between the caller
and the callee: if the caller meets the preconditions, the callee promises
to fulfill the postcondition.
In the case of type specifications the compiler enforces this obligation
by ruling out as ill-typed any attempt to use a piece of code in a
context that does not fulfill its typing assumptions. Returning to the
example above, if one attempts to use the expression x+1 in a context where
x is not an integer, one can hardly expect that x+1 will yield an integer.
Therefore it is rejected by the type checker as a violation of the stated
assumptions governing the types of its free variables.
What about specifications that are not mechanically enforced? For
example, if x is negative, then we cannot infer anything about x+1 from the
specification given above.[2] To make use of the specification in reasoning
about the use of the code in a larger program, it is essential that this
precondition be met in the context of its use.
Lacking mechanical enforcement of these obligations, it is all too easy
to neglect them when writing code. Many programming mistakes can be
traced to violation of assumptions made by the callee that are not met by
the caller.[3] What can be done about this?
A standard method, called bulletproofing, is to augment the callee with
runtime checks that ensure that its preconditions are met, raising an
exception if they are not. For example, we might write a “bulletproofed”
version of fib that ensures that its argument is nonnegative as follows:
local
  exception PreCond
  fun unchecked_fib 0 = 1
    | unchecked_fib 1 = 1
    | unchecked_fib n =
        unchecked_fib (n-1) + unchecked_fib (n-2)
in
  fun checked_fib n =
    if n < 0 then
      raise PreCond
    else
      unchecked_fib n
end
[2] There are obviously other specifications that carry more information,
but we’re only concerned here with the one given. Moreover, if f is an
unknown function, then we will, in general, only have the specification,
and not the code, to reason about.
[3] Sadly, these assumptions are often unstated and can only be culled from
the code with great effort, if at all.
It is worth noting that we have structured this program to take the
precondition check out of the loop. It would be poor practice to define
checked_fib as follows:
fun bad_checked_fib n =
  if n < 0 then
    raise PreCond
  else
    case n
      of 0 => 1
       | 1 => 1
       | n => bad_checked_fib (n-1) + bad_checked_fib (n-2)
Once we know that the initial argument is nonnegative, it is assured that
recursive calls also satisfy this requirement, provided that you’ve done the
inductive reasoning to validate the speciﬁcation of the function.
However, bulletproofing in this form has several drawbacks. First, it
imposes the overhead of checking on all callers, even those that have
ensured that the desired precondition is true. In truth the runtime overhead
is minor; the real overhead is requiring that the implementor of the callee
take the trouble to impose the checks.
Second, and far more importantly, bulletproofing only applies to
specifications that can be checked at runtime. As we remarked earlier, not
all specifications are amenable to runtime checks. For these cases there is
no question of inserting runtime checks to enforce the precondition. For
example, we may wish to impose the requirement that a function argument
of type int->int always yields a nonnegative result. There is no
runtime check for this condition: we cannot write a function nonneg of
type (int->int)->bool that determines whether or not a function f
always yields a nonnegative result. In chapter 32 we will consider the use
of data abstraction to enforce at compile time specifications that may not
be checked at runtime.
Chapter 25
Induction and Recursion
This chapter is concerned with the close relationship between recursion and
induction in programming. If a function is recursively defined, an inductive
proof is required to show that it meets a specification of its behavior.
The motto is
    when programming recursively, think inductively.
Doing so significantly reduces the time spent debugging, and often leads
to more efficient, robust, and elegant programs.
25.1 Exponentiation
Let’s start with a very simple series of examples, all involving the
computation of the integer exponential function. Our first example is to compute
$2^n$ for integers $n \geq 0$. We seek to define the function exp of type int->int
satisfying the specification
    if $n \geq 0$, then exp n evaluates to $2^n$.
The precondition, or assumption, is that the argument n is non-negative. The
postcondition, or guarantee, is that the result of applying exp to n is the
number $2^n$. The caller is required to establish the precondition before applying
exp; in exchange, the caller may assume that the result is $2^n$.
Here’s the code:
fun exp 0 = 1
  | exp n = 2 * exp (n-1)
Does this function satisfy the specification? It does, and we can prove
this by induction on n. If $n = 0$, then exp n evaluates to 1 (as you can
see from the first line of its definition), which is, of course, $2^0$. Otherwise,
assume that exp is correct for $n - 1 \geq 0$, and consider the value of exp
n. From the second line of its definition we can see that this is the value
of $2 \times p$, where p is the value of exp (n-1). Inductively, $p = 2^{n-1}$, so
$2 \times p = 2 \times 2^{n-1} = 2^n$, as desired. Notice that we need not consider
arguments $n < 0$, since the precondition of the specification rules them
out. We must, however, ensure that each recursive call satisfies this
requirement in order to apply the inductive hypothesis.
That was pretty simple. Now let us consider the running time of exp
expressed as a function of n. Assuming that arithmetic operations are
executed in constant time, we can read off a recurrence describing its
execution time as follows:
    T(0) = O(1)
    T(n + 1) = O(1) + T(n)
We are interested in solving a recurrence by finding a closed-form
expression for it. In this case the solution is easily obtained:
    T(n) = O(n)
Thus we have a linear time algorithm for computing the integer
exponential function.
What about space? This is a much more subtle issue than time
because it is much more difficult in a high-level language such as ML to see
where the space is used. Based on our earlier discussions of recursion and
iteration we can argue informally that the definition of exp given above
requires space given by the following recurrence:
    S(0) = O(1)
    S(n + 1) = O(1) + S(n)
The justification is that the implementation requires a constant amount of
storage to record the pending multiplication that must be performed upon
completion of the recursive call.
Solving this simple recurrence yields the equation
    S(n) = O(n)
expressing that exp is also a linear space algorithm for the integer
exponential function.
Can we do better? Yes, on both counts! Here’s how. Rather than count
down by ones, multiplying by two at each stage, we use successive
squaring to achieve logarithmic time and space requirements. The idea is that
if the exponent is even, we square the result of raising 2 to half the given
power; otherwise, we reduce the exponent by one and double the result,
ensuring that the next exponent will be even. Here’s the code:
fun square (n:int) = n*n
fun double (n:int) = n+n
fun fast_exp 0 = 1
  | fast_exp n =
      if n mod 2 = 0 then
        square (fast_exp (n div 2))
      else
        double (fast_exp (n-1))
Its specification is precisely the same as before. Does this code satisfy
the specification? Yes, and we can prove this by using complete induction,
a form of mathematical induction in which we may prove that $n > 0$ has
a desired property by assuming not only that the predecessor has it, but
that all preceding numbers have it, and arguing that therefore n must have
it. Here’s how it’s done. For $n = 0$ the argument is exactly as before.
Suppose, then, that $n > 0$. If n is even, the value of fast_exp n is the result
of squaring the value of fast_exp (n div 2). Inductively this value is
$2^{n \div 2}$, so squaring it yields
$2^{n \div 2} \times 2^{n \div 2} = 2^{2 \times (n \div 2)} = 2^n$, as required. If, on
the other hand, n is odd, the value is the result of doubling fast_exp (n-1).
Inductively the latter value is $2^{n-1}$, so doubling it yields $2^n$, as required.
Here’s a recurrence governing the running time of fast_exp as a
function of its argument:
    T(0) = O(1)
    T(2n) = O(1) + T(n)
    T(2n + 1) = O(1) + T(2n)
              = O(1) + T(n)
Solving this recurrence using standard techniques yields the solution
    T(n) = O(lg n)
You should convince yourself that fast_exp also requires logarithmic space
usage.
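As a rough check on the two recurrences, the following sketch counts recursive calls instead of computing powers. It is not part of the original development: exp_calls and fast_exp_calls are hypothetical names that mirror the clause structure of exp and fast_exp, each returning the number of calls the corresponding function would make on argument n.

```sml
(* Number of calls made by the linear-time exp on argument n. *)
fun exp_calls 0 = 1
  | exp_calls n = 1 + exp_calls (n-1)

(* Number of calls made by fast_exp on argument n: an even exponent
   is halved, an odd exponent is decremented to make it even. *)
fun fast_exp_calls 0 = 1
  | fast_exp_calls n =
      if n mod 2 = 0 then
        1 + fast_exp_calls (n div 2)
      else
        1 + fast_exp_calls (n-1)
```

For n = 1000 the first count is 1001 and the second is 16, in line with the O(n) and O(lg n) bounds.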
Can we do better? Well, it’s not possible to improve the time
requirement (at least not asymptotically), but we can reduce the space required to
O(1) by putting the function into iterative (tail recursive) form. However,
this may not be achieved in this case by simply adding an accumulator
argument, without also increasing the running time! The obvious approach
is to attempt to satisfy the specification
    if $n \geq 0$, then skinny_fast_exp (n, a) evaluates to $2^n \times a$.
Here’s some code that achieves this specification:
fun skinny_fast_exp (0, a) = a
  | skinny_fast_exp (n, a) =
      if n mod 2 = 0 then
        skinny_fast_exp (n div 2,
                         skinny_fast_exp (n div 2, a))
      else
        skinny_fast_exp (n-1, 2*a)
It is easy to see that this code works properly for $n = 0$ and for $n > 0$
when n is odd, but what if $n > 0$ is even? Then by induction we compute
$2^{n \div 2} \times 2^{n \div 2} \times a$ by two recursive calls to skinny_fast_exp.
This yields the desired result, but what is the running time? Here’s a
recurrence to describe its running time as a function of n:
    T(0) = 1
    T(2n) = O(1) + 2T(n)
    T(2n + 1) = O(1) + T(2n)
              = O(1) + 2T(n)
Here again we have a standard recurrence whose solution is
    T(n) = O(n).
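The doubling of work in the even case can be observed directly. The sketch below instruments skinny_fast_exp with a counter; the reference cell calls and the helper count are assumptions introduced only for this illustration, and the clausal definition has been rewritten as a conditional so that every call, including the base case, passes through the counter.

```sml
(* Assumed instrumentation: counts every call to skinny_fast_exp. *)
val calls = ref 0

fun skinny_fast_exp (n, a) =
  (calls := !calls + 1;
   if n = 0 then
     a
   else if n mod 2 = 0 then
     skinny_fast_exp (n div 2, skinny_fast_exp (n div 2, a))
   else
     skinny_fast_exp (n-1, 2*a))

(* Hypothetical helper: number of calls needed to compute 2^n. *)
fun count n =
  (calls := 0; ignore (skinny_fast_exp (n, 1)); !calls)
```

Here count 8 is 23 and count 16 is 47, so the number of calls roughly doubles when n doubles, as the recurrence predicts.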
Can we do better? The key is to recall the following important fact:
    $2^{2n} = (2^2)^n = 4^n$.
We can achieve a logarithmic time and constant space bound by a
change of base. Here’s the specification:
    if $n \geq 0$, then gen_skinny_fast_exp (b, n, a) evaluates to $b^n \times a$.
Here’s the code:
fun gen_skinny_fast_exp (b, 0, a) = a
  | gen_skinny_fast_exp (b, n, a) =
      if n mod 2 = 0 then
        gen_skinny_fast_exp (b*b, n div 2, a)
      else
        gen_skinny_fast_exp (b, n-1, b*a)
Let’s check its correctness by complete induction on n. The base case
is obvious. Assume the specification for arguments smaller than $n > 0$.
If n is even, then by induction the result is
$(b \times b)^{n \div 2} \times a = b^n \times a$, and
if n is odd, we obtain inductively the result
$b^{n-1} \times b \times a = b^n \times a$. This
completes the proof.
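A proof is the real guarantee, but it does no harm to test the generalized function against the original. The following sketch repeats the two definitions from the text and adds a hypothetical helper, agree_up_to, which is not part of the original development; the bound 20 used afterwards is chosen only to stay well within the range of machine integers.

```sml
fun exp 0 = 1
  | exp n = 2 * exp (n-1)

fun gen_skinny_fast_exp (b, 0, a) = a
  | gen_skinny_fast_exp (b, n, a) =
      if n mod 2 = 0 then
        gen_skinny_fast_exp (b*b, n div 2, a)
      else
        gen_skinny_fast_exp (b, n-1, b*a)

(* Hypothetical helper: true iff both functions agree on 0..limit. *)
fun agree_up_to limit =
  let
    fun check n =
      n > limit orelse
      (gen_skinny_fast_exp (2, n, 1) = exp n andalso check (n+1))
  in
    check 0
  end
```

The call agree_up_to 20 evaluates to true. The generalized specification can be spot-checked in the same way: gen_skinny_fast_exp (3, 5, 2) evaluates to 486, which is $3^5 \times 2$.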
The trick to achieving an efficient implementation of the exponential
function was to compute a more general function that can be implemented
using less time and space. Since this is a trick employed by the
implementor of the exponential function, it is important to insulate the client from
it. This is easily achieved by using a local declaration to “hide” the
generalized function, making available to the caller a function satisfying the
original specification. Here’s what the code looks like in this case:
local
  fun gen_skinny_fast_exp (b, 0, a) = ...
    | gen_skinny_fast_exp (b, n, a) = ...
in
  fun exp n = gen_skinny_fast_exp (2, n, 1)
end
(The elided code is the same as above.) The point here is to see how
induction and recursion go hand-in-hand, and how we used induction
not only to verify programs after the fact, but, more importantly, to help
discover the program in the first place. If the verification is performed
simultaneously with the coding, it is far more likely that the proof will go
through and the program will work the first time you run it.
It is important to notice the correspondence between strengthening
the specification by adding additional assumptions (and guarantees) and
adding accumulator arguments. What we observe is the apparent
paradox that it is often easier to do something (superficially) harder! In terms
of proving, it is often easier to push through an inductive argument for a
stronger specification, precisely because we get to assume the result as the
inductive hypothesis when arguing the inductive step(s). We are limited
only by the requirement that the specification be proved outright at the
base case(s); no inductive assumption is available to help us along here. In
terms of programming, it is often easier to compute a more complicated
function involving accumulator arguments, precisely because we get to
exploit the accumulator when making recursive calls. We are limited only
by the requirement that the result be defined outright for the base case(s);
no recursive calls are available to help us along here.
25.2 The GCD Algorithm
Let’s consider a more complicated example, the computation of the
greatest common divisor of a pair of non-negative integers. Recall that m is
a divisor of n, written $m \mid n$, iff n is a multiple of m, which is to say that
there is some $k \geq 0$ such that $n = k \times m$. The greatest common divisor of
non-negative integers m and n is the largest p such that $p \mid m$ and $p \mid n$. (By
convention the g.c.d. of 0 and 0 is taken to be 0.) Here’s the specification
of the gcd function:
    if $m, n \geq 0$, then gcd(m,n) evaluates to the g.c.d. of m and n.
Euclid’s algorithm for computing the g.c.d. of m and n is defined by
complete induction on the product $m \times n$. Here’s the algorithm, written in
ML:
fun gcd (m:int, 0):int = m
  | gcd (0, n:int):int = n
  | gcd (m:int, n:int):int =
      if m>n then
        gcd (m mod n, n)
      else
        gcd (m, n mod m)
Why is this algorithm correct? We may prove that gcd satisfies the
specification by complete induction on the product $m \times n$. If $m \times n$ is zero,
then either m or n is zero, in which case the answer is, correctly, the other
number. Otherwise the product is positive, and we proceed according to
whether $m > n$ or $m \leq n$. Suppose that $m > n$. Observe that
$m \bmod n = m - (m \div n) \times n$, so that
$(m \bmod n) \times n = m \times n - (m \div n) \times n^2 < m \times n$,
so that by induction we return the g.c.d. of $m \bmod n$ and n. It remains to
show that this is the g.c.d. of m and n. If d divides both $m \bmod n$ and n,
then $k \times d = (m \bmod n) = (m - (m \div n) \times n)$ and $l \times d = n$ for some
non-negative k and l. Consequently, $k \times d = m - (m \div n) \times l \times d$, so
$m = (k + (m \div n) \times l) \times d$, which is to say that d divides m. Now if $d'$ is any
other divisor of m and n, then it is also a divisor of $m \bmod n$ and n, so
$d > d'$. That is, d is the g.c.d. of m and n. The other case, $m \leq n$, follows
similarly. This completes the proof.
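The measure used in the proof, the product of the two arguments, can be watched as it shrinks. The sketch below follows the clause structure of gcd but records the product at each call instead of computing the divisor; the functions products and decreasing are hypothetical helpers introduced for this illustration only.

```sml
(* Products m*n of the successive argument pairs visited by gcd,
   ending with 0 when one argument reaches zero. *)
fun products (m, 0) = [0]
  | products (0, n) = [0]
  | products (m, n) =
      m * n ::
      (if m > n then products (m mod n, n) else products (m, n mod m))

(* True iff each element is strictly larger than the next. *)
fun decreasing (x :: y :: rest) = x > y andalso decreasing (y :: rest)
  | decreasing _ = true
```

For example, products (17, 5) evaluates to [85, 10, 2, 0], a strictly decreasing sequence, which is exactly what licenses the appeal to complete induction.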
At this point you may well be thinking that all this inductive reasoning
is surely helpful, but it’s no replacement for good old-fashioned
“bulletproofing”: conditional tests inserted at critical junctures to ensure that
key invariants do indeed hold at execution time. Sure, you may be
thinking, these checks have a run-time cost, but they can be turned off once
the code is in production, and anyway the cost is minimal compared to,
say, the time required to read and write from disk. It’s hard to complain
about this attitude, provided that sufficiently cheap checks can be put into
place and provided that you know where to put them to maximize their
effectiveness. For example, there’s no use checking i > 0 at the start of the
then clause of a test for i > 0. Barring compiler bugs, it can’t possibly be
anything other than the case at that point in the program. Or it may be
possible to insert a check whose computation is more expensive (or more
complicated) than the one we’re trying to perform, in which case we’re
defeating the purpose by including it!
This raises the question of where we should put such checks, and what
checks should be included to help ensure the correct operation (or, at least,
graceful malfunction) of our programs. This is an instance of the general
problem of writing self-checking programs. We’ll illustrate the idea by
elaborating on the g.c.d. example a bit further. Suppose we wish to write a
self-checking g.c.d. algorithm that computes the g.c.d., and then checks
the result to ensure that it really is the greatest common divisor of the two
given non-negative integers before returning it as result. The code might
look something like this:
exception GCD_ERROR
fun checked_gcd (m, n) =
  let
    val d = gcd (m, n)
  in
    if m mod d = 0 andalso
       n mod d = 0 andalso ???
    then
      d
    else
      raise GCD_ERROR
  end
It’s obviously no problem to check that the putative g.c.d., d, is in fact a
common divisor of m and n, but how do we check that it’s the greatest common
divisor? Well, one choice is to simply try all numbers between d and the
smaller of m and n to ensure that no intervening number is also a
divisor, refuting the maximality of d. But this is clearly so inefficient as to be
impractical. There’s a better way, which, it has to be emphasized,
relies on the kind of mathematical reasoning we’ve been considering right
along. Here’s an important fact:
    d is the g.c.d. of m and n iff d divides both m and n and can be written
    as a linear combination of m and n.
That is, d is the g.c.d. of m and n iff $m = k \times d$ for some $k \geq 0$, $n = l \times d$ for
some $l \geq 0$, and $d = a \times m + b \times n$ for some integers (possibly negative!)
a and b. We’ll prove this constructively by giving a program to compute
not only the g.c.d. d of m and n, but also the coefficients a and b such that
$d = a \times m + b \times n$. Here’s the specification:
    if $m, n \geq 0$, then ggcd (m, n) evaluates to (d, a, b) such that
    d divides m, d divides n, and $d = a \times m + b \times n$; consequently, d is
    the g.c.d. of m and n.
And here’s the code to compute it:
fun ggcd (0, n) = (n, 0, 1)
  | ggcd (m, 0) = (m, 1, 0)
  | ggcd (m, n) =
      if m>n then
        let
          val (d, a, b) = ggcd (m mod n, n)
        in
          (d, a, b - a * (m div n))
        end
      else
        let
          val (d, a, b) = ggcd (m, n mod m)
        in
          (d, a - b * (n div m), b)
        end
We may easily check that this code satisfies the specification by induction
on the product $m \times n$. If $m \times n = 0$, then either m or n is 0, in which case
the result follows immediately. Otherwise assume the result for smaller
products, and show it for $m \times n > 0$. Suppose $m > n$; the other case
is handled analogously. Inductively we obtain d, a, and b such that d is
the g.c.d. of $m \bmod n$ and n, and hence is the g.c.d. of m and n, and
$d = a \times (m \bmod n) + b \times n$. Since $m \bmod n = m - (m \div n) \times n$, it follows that
$d = a \times m + (b - a \times (m \div n)) \times n$, from which the result follows.
Now we can write a self-checking g.c.d. as follows:
exception GCD_ERROR
fun checked_gcd (m, n) =
  let
    val (d, a, b) = ggcd (m, n)
  in
    if m mod d = 0 andalso
       n mod d = 0 andalso d = a*m+b*n
    then
      d
    else
      raise GCD_ERROR
  end
This algorithm takes no more time (asymptotically) than the original, and,
moreover, ensures that the result is correct. This illustrates the power of
the interplay between mathematical reasoning methods such as induction
and number theory and programming methods such as bulletprooﬁng to
achieve robust, reliable, and, what is more important, elegant programs.
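For readers who wish to experiment, here is the self-checking g.c.d. assembled into a single runnable sketch, together with one sample evaluation. The definitions are those given above with the layout compressed; the bindings d, a, and b at the end are introduced only for illustration.

```sml
exception GCD_ERROR

(* Returns (d, a, b) with d = a*m + b*n, for m, n >= 0. *)
fun ggcd (0, n) = (n, 0, 1)
  | ggcd (m, 0) = (m, 1, 0)
  | ggcd (m, n) =
      if m > n then
        let val (d, a, b) = ggcd (m mod n, n)
        in (d, a, b - a * (m div n)) end
      else
        let val (d, a, b) = ggcd (m, n mod m)
        in (d, a - b * (n div m), b) end

(* Checks the certificate before returning the divisor. *)
fun checked_gcd (m, n) =
  let
    val (d, a, b) = ggcd (m, n)
  in
    if m mod d = 0 andalso n mod d = 0 andalso d = a*m + b*n
    then d
    else raise GCD_ERROR
  end

(* Sample evaluation: the coefficients certify that d is the g.c.d. *)
val (d, a, b) = ggcd (12, 18)
```

Here d is 6 and a*12 + b*18 is also 6. One caveat about this sketch as written: when both arguments are 0 the putative divisor d is 0, so the test m mod d raises Div rather than returning the conventional answer 0; a caller who cares about that case would need to treat it separately.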
25.3 Sample Code
Here is the code for this chapter.
Chapter 26
Structural Induction
The importance of induction and recursion is not limited to functions
defined over the integers. Rather, the familiar concept of mathematical
induction over the natural numbers is an instance of the more general notion
of structural induction over values of an inductively-defined type. Rather than
develop a general treatment of inductively-defined types, we will rely on a
few examples to illustrate the point. Let’s begin by considering the natural
numbers as an inductively defined type.
26.1 Natural Numbers
The set of natural numbers, N, may be thought of as the smallest set
containing 0 and closed under the formation of successors. In other words, n
is an element of N iff either n = 0 or n = m + 1 for some m in N. Still
another way of saying it is to define N by the following clauses:
1. 0 is an element of N.
2. If m is an element of N, then so is m + 1.
3. Nothing else is an element of N.
(The third clause is sometimes called the extremal clause; it ensures that
we are talking about N and not just some superset of it.) All of these
deﬁnitions are equivalent ways of saying the same thing.
Since N is inductively defined, we may prove properties of the natural
numbers by structural induction, which in this case is just ordinary
mathematical induction. Specifically, to prove that a property P(n) holds of
every n in N, it suffices to demonstrate the following facts:
1. Show that P(0) holds.
2. Assuming that P(m) holds, show that P(m + 1) holds.
The pattern of reasoning follows exactly the structure of the inductive
definition: the base case is handled outright, and the inductive step is
handled by assuming the property for the predecessor and showing that it holds
for its successor.
The principle of structural induction also licenses the definition of
functions by structural recursion. To define a function f with domain N, it
suffices to proceed as follows:
1. Give the value of f (0).
2. Give the value of f (m + 1) in terms of the value of f (m).
Given this information, there is a unique function f with domain N
satisfying these requirements. Speciﬁcally, we may show by induction on
n ≥ 0 that the value of f is uniquely determined on all values m ≤ n. If
n = 0, this is obvious since f (0) is deﬁned by the ﬁrst clause. If n = m +1,
then by induction the value of f is determined for all values k ≤ m. But the
value of f at n is determined as a function of f (m), and hence is uniquely
determined. Thus f is uniquely determined for all values of n in N, as was
to be shown.
The natural numbers, viewed as an inductively-defined type, may be
represented in ML using a datatype declaration, as follows:
datatype nat = Zero | Succ of nat
The constructors correspond one-for-one with the clauses of the
inductive definition. The extremal clause is implicit in the datatype declaration
since the given constructors are assumed to be all the ways of building
values of type nat. This assumption forms the basis for exhaustiveness
checking for clausal function definitions.
(You may object that this definition of the type nat amounts to a unary
(base 1) representation of natural numbers, an unnatural and space-wasting
representation. This is indeed true; in practice the natural numbers are
represented as non-negative machine integers to avoid excessive
overhead. Note, however, that this representation places a fixed upper bound
on the size of numbers, whereas the unary representation does not.
Hybrid representations that enjoy the benefits of both are, of course, possible
and occasionally used when enormous numbers are required.)
Functions defined by structural recursion are naturally represented by
clausal function definitions, as in the following example:
fun double Zero = Zero
  | double (Succ n) = Succ (Succ (double n))
fun exp Zero = Succ(Zero)
  | exp (Succ n) = double (exp n)
The type checker ensures that we have covered all cases, but it does not
ensure that the pattern of structural recursion is strictly followed: we may
accidentally define f(m + 1) in terms of itself or some f(k) where $k > m$,
breaking the pattern. The reason this is admitted is that the ML compiler
cannot always follow our reasoning: we may have a clever algorithm in
mind that isn’t easily expressed by a simple structural induction. To avoid
restricting the programmer, the language assumes the best and allows any
form of definition.
Using the principle of structural induction for the natural numbers, we
may prove properties of functions defined over the naturals. For example,
we may easily prove by structural induction over the type nat that for
every n in N, exp n evaluates to a positive number. (In previous chapters
we carried out proofs of more interesting program properties.)
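Unary numerals are tedious to read, so when experimenting it helps to convert to and from machine integers. The sketch below repeats the definitions of nat, double, and exp and adds two hypothetical conversion functions, to_int and from_int, which are assumptions of this example and not part of the text. Note that to_int is itself defined by structural recursion over nat, whereas from_int carries the precondition that its argument is non-negative.

```sml
datatype nat = Zero | Succ of nat

fun double Zero = Zero
  | double (Succ n) = Succ (Succ (double n))

fun exp Zero = Succ Zero
  | exp (Succ n) = double (exp n)

(* Hypothetical helper: one clause per constructor of nat. *)
fun to_int Zero = 0
  | to_int (Succ n) = 1 + to_int n

(* Hypothetical helper: precondition n >= 0, else it does not terminate. *)
fun from_int 0 = Zero
  | from_int n = Succ (from_int (n-1))
```

For instance, to_int (exp (from_int 5)) evaluates to 32, a positive number, as the property stated above requires.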
26.2 Lists
Generalizing a bit, we may think of the type 'a list as inductively
defined by the following clauses:
1. nil is a value of type 'a list.
2. If h is a value of type 'a, and t is a value of type 'a list, then h::t
is a value of type 'a list.
3. Nothing else is a value of type 'a list.
This deﬁnition licenses the following principle of structural induction
over lists. To prove that a property P holds of all lists l, it sufﬁces to pro
ceed as follows:
1. Show that P holds for nil.
2. Show that P holds for h::t, assuming that P holds for t.
Similarly, we may deﬁne functions by structural recursion over lists as
follows:
1. Deﬁne the function for nil.
2. Deﬁne the function for h::t in terms of its value for t.
The clauses of the inductive definition of lists correspond to the
following (built-in) datatype declaration in ML:
datatype 'a list = nil | :: of 'a * 'a list
(We are neglecting the fact that :: is regarded as an infix operator.)
The principle of structural recursion may be applied to define the
reverse function as follows:
fun reverse nil = nil
  | reverse (h::t) = reverse t @ [h]
There is one clause for each constructor, and the value of reverse for h::t is
deﬁned in terms of its value for t. (We have ignored questions of time and
space efﬁciency to avoid obscuring the induction principle underlying the
deﬁnition of reverse.)
Using the principle of structural induction over lists, we may prove
that reverse l evaluates to the reversal of l. First, we show that reverse
nil yields nil, as indeed it does and ought to. Second, we assume that
reverse t yields the reversal of t, and argue that reverse (h::t) yields
the reversal of h::t, as indeed it does since it returns reverse t @ [h].
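The inductive step can be followed on a small instance. The sketch below repeats the definition of reverse and evaluates it on a three-element list; the bindings r1 and r2 are introduced only for illustration, and the comment spells out how the result for h::t is assembled from the result for t.

```sml
fun reverse nil = nil
  | reverse (h::t) = reverse t @ [h]

(* reverse (1::[2,3]) = reverse [2,3] @ [1]
                      = [3,2] @ [1]
                      = [3,2,1]                 *)
val r1 = reverse [1, 2, 3]

(* Reversing twice should give back the original list. *)
val r2 = reverse (reverse [1, 2, 3])
```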
26.3 Trees
Generalizing even further, we can introduce new inductively-defined types
such as 2-3 trees in which interior nodes are either binary (have two
children) or ternary (have three children). Here’s a definition of 2-3 trees in
ML:
datatype 'a twth_tree =
    Empty
  | Bin of 'a * 'a twth_tree * 'a twth_tree
  | Ter of 'a * 'a twth_tree * 'a twth_tree * 'a twth_tree
How might one deﬁne the “size” of a value of this type? Your ﬁrst
thought should be to write down a template like the following:
fun size Empty = ???
  | size (Bin (_, t1, t2)) = ???
  | size (Ter (_, t1, t2, t3)) = ???
We have one clause per constructor, and will fill in the elided expressions
to complete the definition. In many cases (including this one) the function
is defined by structural recursion. Here’s the complete definition:
fun size Empty = 0
  | size (Bin (_, t1, t2)) =
      1 + size t1 + size t2
  | size (Ter (_, t1, t2, t3)) =
      1 + size t1 + size t2 + size t3
Obviously this function computes the number of nodes in the tree, as you
can readily verify by structural induction over the type 'a twth_tree.
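The same template serves for other functions over the type. As a sketch, here is size applied to a small tree, along with a second function, height, obtained by filling in the template differently. The names leaf, t, and height are assumptions made for this example; Int.max is the maximum function from the Standard ML Basis Library.

```sml
datatype 'a twth_tree =
    Empty
  | Bin of 'a * 'a twth_tree * 'a twth_tree
  | Ter of 'a * 'a twth_tree * 'a twth_tree * 'a twth_tree

fun size Empty = 0
  | size (Bin (_, t1, t2)) = 1 + size t1 + size t2
  | size (Ter (_, t1, t2, t3)) = 1 + size t1 + size t2 + size t3

(* Hypothetical second instance of the template: longest path to Empty. *)
fun height Empty = 0
  | height (Bin (_, t1, t2)) = 1 + Int.max (height t1, height t2)
  | height (Ter (_, t1, t2, t3)) =
      1 + Int.max (height t1, Int.max (height t2, height t3))

(* Hypothetical helper: a binary node with no children. *)
fun leaf v = Bin (v, Empty, Empty)

(* A tree with four nodes, labelled 1 through 4. *)
val t = Ter (1, leaf 2, Empty, Bin (3, leaf 4, Empty))
```

For this tree size t is 4 and height t is 3.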
26.4 Generalizations and Limitations
Does this pattern apply to every datatype declaration? Yes and no. No
matter what the form of the declaration it always makes sense to define a
function over it by a clausal function definition with one clause per
constructor. Such a definition is guaranteed to be exhaustive (cover all cases),
and serves as a valuable guide to structuring your code. (It is especially
valuable if you change the datatype declaration, because then the compiler
will inform you of what clauses need to be added or removed from
functions defined over that type in order to restore it to a sensible definition.)
The slogan is:
    To define functions over a datatype, use a clausal definition
    with one clause per constructor.
The catch is that not every datatype declaration supports a principle
of structural induction because it is not always clear what constitutes the
predecessor(s) of a constructed value. For example, the declaration
datatype D = Int of int | Fun of D->D
is problematic because a value of the form Fun(f) is not constructed
directly from another value of type D, and hence it is not clear what to regard
as its predecessor. In practice this sort of definition comes up only rarely;
in most cases datatypes are naturally viewed as inductively defined.
26.5 Abstracting Induction
It is interesting to observe that the pattern of structural recursion may be
directly codified in ML as a higher-order function. Specifically, we may
associate with each inductively-defined type a higher-order function that
takes as arguments values that determine the base case(s) and step case(s)
of the definition, and defines a function by structural induction based on
these arguments. An example will illustrate the point. The pattern of
structural induction over the type nat may be codified by the following
function:
fun nat_rec base step =
  let
    fun loop Zero = base
      | loop (Succ n) = step (loop n)
  in
    loop
  end
This function has the type 'a -> ('a -> 'a) -> nat -> 'a.
Given the first two arguments, nat_rec yields a function of type nat
-> 'a whose behavior is determined at the base case by the first argument
and at the inductive step by the second. Here’s an example of the use of
nat_rec to define the exponential function:
val double =
  nat_rec Zero (fn result => Succ (Succ result))
val exp =
  nat_rec (Succ Zero) double
Note well the pattern! The arguments to nat_rec are
1. The value for Zero.
2. The value for Succ n defined in terms of its value for n.
Similarly, the pattern of list recursion may be captured by the following
functional:
fun list_recursion base step =
  let
    fun loop nil = base
      | loop (h::t) = step (h, loop t)
  in
    loop
  end
The type of the function list_recursion is
    'a -> ('b * 'a -> 'a) -> 'b list -> 'a
It may be instantiated to define the reverse function as follows:
val reverse = list_recursion nil (fn (h, t) => t @ [h])
Finally, the principle of structural recursion for values of type 'a twth_tree
is given as follows:
fun twth_rec base bin_step ter_step =
  let
    fun loop Empty = base
      | loop (Bin (v, t1, t2)) =
          bin_step (v, loop t1, loop t2)
      | loop (Ter (v, t1, t2, t3)) =
          ter_step (v, loop t1, loop t2, loop t3)
  in
    loop
  end
Notice that we have two inductive steps, one for each form of node.
The type of twth_rec is
    'a -> ('b * 'a * 'a -> 'a) -> ('b * 'a * 'a * 'a -> 'a) -> 'b twth_tree -> 'a
We may instantiate it to define the function size as follows:
val size =
  twth_rec 0
    (fn (_, s1, s2) => 1+s1+s2)
    (fn (_, s1, s2, s3) => 1+s1+s2+s3)
Summarizing, the principle of structural induction over a recursive
datatype is naturally codified in ML using pattern matching and
higher-order functions. Whenever you’re programming with a datatype, you
should use the techniques outlined in this chapter to structure your code.
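Here is a small sketch of the recursors at work on further examples. The functions to_int and len are hypothetical instances added for illustration. One practical point, stated here as an assumption about how you will want to use the results: because of the value restriction of Standard ML, a val binding whose right-hand side is an application is not given a polymorphic type, so the polymorphic instances below are written with fun and an explicit list argument rather than with val.

```sml
datatype nat = Zero | Succ of nat

fun nat_rec base step =
  let
    fun loop Zero = base
      | loop (Succ n) = step (loop n)
  in
    loop
  end

fun list_recursion base step =
  let
    fun loop nil = base
      | loop (h::t) = step (h, loop t)
  in
    loop
  end

(* Monomorphic result type (nat -> int), so a val binding suffices. *)
val to_int = nat_rec 0 (fn r => r + 1)

(* Eta-expanded so that each function is polymorphic in the element type. *)
fun reverse l = list_recursion nil (fn (h, t) => t @ [h]) l
fun len l = list_recursion 0 (fn (_, n) => n + 1) l
```

With these definitions reverse may be applied to lists of integers and lists of strings alike, and len [true, false] evaluates to 2.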
26.6 Sample Code
Here is the code for this chapter.
Chapter 27
Proof-Directed Debugging
In this chapter we’ll put specification and verification techniques to work
in devising a regular expression matcher. The code is similar to that sketched
in chapter 1, but we will use verification techniques to detect and correct a
subtle error that may not be immediately apparent from inspecting or even
testing the code. We call this process proof-directed debugging.
The first task is to devise a precise specification of the regular
expression matcher. This is a difficult problem in itself. We then attempt to verify
that the matching program developed in chapter 1 satisfies this
specification. The proof attempt breaks down. Careful examination of the failure
reveals a counterexample to the specification: the program does not
satisfy it. We then consider how best to resolve the problem, not by change of
implementation, but instead by change of specification.
27.1 Regular Expressions and Languages
Before we begin work on the matcher, let us first define the set of
regular expressions and their meaning as a set of strings. The set of regular
expressions is given by the following grammar:
    $r ::= 0 \mid 1 \mid a \mid r_1\, r_2 \mid r_1 + r_2 \mid r^*$
Here a ranges over a given alphabet, a set of primitive “letters” that may
be used in a regular expression. A string is a finite sequence of letters of
the alphabet. We write $\varepsilon$ for the null string, the empty sequence of letters.
We write $s_1\, s_2$ for the concatenation of the strings $s_1$ and $s_2$, the string s
consisting of the letters in $s_1$ followed by those in $s_2$. The length of a string
is the number of letters in it. We do not distinguish between a character and
the unit-length string consisting solely of that character. Thus we write $a\, s$
for the extension of s with the letter a at the front.
A language is a set of strings. Every regular expression r stands for a
particular language L(r), the language of r, which is defined by induction
on the structure of r as follows:
    $L(0) = 0$
    $L(1) = 1$
    $L(a) = \{ a \}$
    $L(r_1\, r_2) = L(r_1)\, L(r_2)$
    $L(r_1 + r_2) = L(r_1) + L(r_2)$
    $L(r^*) = L(r)^*$
This definition employs the following operations on languages:
    $0 = \emptyset$
    $1 = \{ \varepsilon \}$
    $L_1 + L_2 = L_1 \cup L_2$
    $L_1\, L_2 = \{ s_1\, s_2 \mid s_1 \in L_1, s_2 \in L_2 \}$
    $L^{(0)} = 1$
    $L^{(i+1)} = L\, L^{(i)}$
    $L^* = \bigcup_{i \geq 0} L^{(i)}$
An important fact about $L^*$ is that it is the smallest language $L'$ such
that $1 + L\, L' \subseteq L'$. Spelled out, this means two things:
1. $1 + L\, L^* \subseteq L^*$, which is to say that
   (a) $\varepsilon \in L^*$, and
   (b) if $s \in L$ and $s' \in L^*$, then $s\, s' \in L^*$.
2. If $1 + L\, L' \subseteq L'$, then $L^* \subseteq L'$.
This means that $L^*$ is the smallest language (with respect to language
containment) that contains the null string and is closed under concatenation
on the left by L.
Let’s prove that this is the case. First, since $L^{(0)} = 1$, it follows
immediately that $\varepsilon \in L^*$. Second, if $l \in L$ and $l' \in L^*$, then
$l' \in L^{(i)}$ for some $i \geq 0$,
and hence $l\, l' \in L^{(i+1)}$ by definition of concatenation of languages. This
completes the first step. Now suppose that $L'$ is such that $1 + L\, L' \subseteq L'$.
We are to show that $L^* \subseteq L'$. We show by induction on $i \geq 0$ that
$L^{(i)} \subseteq L'$,
from which the result follows immediately. If $i = 0$, then it suffices to show
that $\varepsilon \in L'$. But this follows from the assumption that
$1 + L\, L' \subseteq L'$, which
implies that $1 \subseteq L'$. To show that $L^{(i+1)} \subseteq L'$, we observe that, by
definition, $L^{(i+1)} = L\, L^{(i)}$. By induction $L^{(i)} \subseteq L'$, and hence
$L\, L^{(i)} \subseteq L'$, since
$L\, L' \subseteq L'$ by assumption.
Having proved that $L^*$ is the smallest language $L'$ such that
$1 + L\, L' \subseteq L'$, it is not hard to prove that $L^*$ satisfies the
recurrence $L^* = 1 + L\, L^*$.
We just proved the right to left containment. For the converse, it suffices
to observe that $1 + L\, (1 + L\, L^*) \subseteq 1 + L\, L^*$, for then the result follows by
minimality. This is easily established by a simple case analysis.
Exercise 1
Give a full proof of the fact that $L^* = 1 + L\, L^*$.
Finally, a word about implementation. We will assume in what follows
that the alphabet is given by the type char of characters, that strings are
elements of the type string, and that regular expressions are deﬁned by
the following datatype declaration:
datatype regexp =
    Zero | One | Char of char
  | Plus of regexp * regexp
  | Times of regexp * regexp
  | Star of regexp
We will also work with lists of characters, values of type char list, using
ML notation for primitive operations on lists such as concatenation and
extension. Occasionally we will abuse notation and not distinguish (in the
informal discussion) between a string and a list of characters. In particular
we will speak of a character list as being a member of a language, when in
fact we mean that the corresponding string is a member of that language.
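As a small illustration of this representation, the sketch below builds the regular expression (a + b)* c as a value of type regexp and converts a string to the character list that a matcher would process. The names ab_star_c, input, and back are hypothetical, chosen only for this example.

```sml
datatype regexp =
    Zero | One | Char of char
  | Plus of regexp * regexp
  | Times of regexp * regexp
  | Star of regexp

(* (a + b)* c, written with the constructors of regexp *)
val ab_star_c : regexp =
  Times (Star (Plus (Char #"a", Char #"b")), Char #"c")

(* String.explode and String.implode convert between strings and char lists *)
val input : char list = String.explode "abc"
val back : string = String.implode input
```

Because regexp is built only from characters and its own constructors, it admits equality, which is convenient for testing but plays no role in matching.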
27.2 Specifying the Matcher
Let us begin by devising a speciﬁcation for the regular expression matcher.
As a first cut we write down a type specification. We seek to define a
function match of type regexp -> string -> bool that determines whether or
not a given string matches a given regular expression. More precisely, we
wish to satisfy the following specification:
For every regular expression r and every string s, match r s
terminates, and evaluates to true iff s ∈ L(r).
We saw in chapter 1 that a natural way to define the procedure match
is to use a technique called continuation passing. We defined an auxiliary
function match_is with the type
regexp -> char list -> (char list -> bool) -> bool
that takes a regular expression, a list of characters (essentially a string, but
in a form suitable for incremental processing), and a continuation, and
yields a boolean. The idea is that match_is takes a regular expression r, a
character list cs, and a continuation k, and determines whether or not some
initial segment of cs matches r, passing the remaining characters cs' to k in
the case that there is such an initial segment, and yields false otherwise.
Put more precisely,
For every regular expression r, character list cs, and continuation
k, if $cs = cs' @ cs''$ with $cs' \in L(r)$ and k cs'' evaluates to
true, then match_is r cs k evaluates to true; otherwise,
match_is r cs k evaluates to false.
Unfortunately, this specification is too strong to ever be satisfied by any
program! Can you see why? The difficulty is that if k is not guaranteed to
terminate for all inputs, then there is no way that match_is can behave as
required. For example, if there is no input on which k terminates, the
specification requires that match_is return false. It should be intuitively
clear that we can never implement such a function. Instead, we must restrict
attention to total continuations, those that always terminate with true or
false on any input. This leads to the following revised specification:
For every regular expression r, character list cs, and total
continuation k, if $cs = cs' @ cs''$ with $cs' \in L(r)$ and k cs''
evaluates to true, then match_is r cs k evaluates to true; otherwise,
match_is r cs k evaluates to false.
Observe that this specification makes use of an implicit existential
quantification. Written out in full, we might say “For all . . . , if there
exists cs' and cs'' such that $cs = cs' @ cs''$ with . . . , then . . . ”.
This observation makes clear that we must search for a suitable splitting
of cs into two parts such that the first part is in L(r) and the second is
accepted by k. There may, in general, be many ways to partition the input
so as to satisfy both of these requirements; we need only find one such way.
Note, however, that if $cs = cs' @ cs''$ with $cs' \in L(r)$ but k cs''
yielding false, we must reject this partitioning and search for another. In
other words we cannot simply accept any partitioning whose initial segment
matches r, but rather only those that also induce k to accept its
corresponding final segment. We may return false only if there is no such
splitting, not merely if a particular splitting fails to work.
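To make the implicit existential quantifier concrete, here is a minimal sketch of a function that enumerates every way of splitting a list into an initial and a final segment. The name splits is hypothetical; a continuation-passing matcher never builds this list, but it searches the same space of candidate splittings by backtracking.

```sml
(* splits cs lists every pair (cs', cs'') with cs = cs' @ cs'' *)
fun splits nil = [(nil, nil)]
  | splits (c :: cs) =
      (nil, c :: cs) :: map (fn (pre, post) => (c :: pre, post)) (splits cs)

(* a list of length n has n + 1 splittings, shortest initial segment first *)
val examples = splits (String.explode "ab")
```

A specification-level matcher could test each pair in turn, accepting as soon as the first component is in L(r) and k accepts the second, and rejecting only after every pair has failed.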
Suppose for the moment that match_is satisfies this specification. Does
it follow that match satisfies the original specification? Recall that the
function match is defined as follows:
fun match r s =
    match_is r
             (String.explode s)
             (fn nil => true | _ => false)
Notice that the initial continuation is indeed total, and that it yields true
(accepts) iff it is applied to the null string. Therefore match satisfies the
following property obtained from the specification of match_is by plugging
in the initial continuation:
For every regular expression r and string s, if s ∈ L(r), then
match r s evaluates to true, and otherwise match r s evaluates
to false.
This is precisely the property that we desire for match. Thus match is
correct (satisfies its specification) if match_is is correct.
So far so good. But does match_is satisfy its specification? If so, we are
done. How might we check this? Recall the definition of match_is given
in the overview:
fun match_is Zero _ k = false
  | match_is One cs k = k cs
  | match_is (Char c) nil k = false
  | match_is (Char c) (d::cs) k =
      if c=d then k cs else false
  | match_is (Times (r1, r2)) cs k =
      match_is r1 cs (fn cs' => match_is r2 cs' k)
  | match_is (Plus (r1, r2)) cs k =
      match_is r1 cs k orelse match_is r2 cs k
  | match_is (Star r) cs k =
      k cs orelse
      match_is r cs (fn cs' => match_is (Star r) cs' k)
Since match_is is defined by a recursive analysis of the regular expression
r, it is natural to proceed by induction on the structure of r. That is, we treat
the specification as a conjecture about match_is, and attempt to prove it by
structural induction on r.
We first consider the three base cases. Suppose that r is 0. Then no
string is in L(r), so match_is must return false, which indeed it does.
Suppose that r is 1. Since the null string is an initial segment of every
string, and the null string is in L(1), we must yield true iff k cs yields true,
and false otherwise. This is precisely how match_is is defined. Suppose
that r is a. Then to succeed cs must have the form a :: cs' with k cs'
evaluating to true; otherwise we must fail. The code for match_is checks
that cs has the required form and, if so, passes cs' to k to determine the
outcome, and otherwise yields false. Thus match_is behaves correctly for
each of the three base cases.
We now consider the three inductive steps. For $r = r_1 + r_2$, we observe
that some initial segment of cs matches r and causes k to accept the
corresponding final segment of cs iff either some initial segment matches
$r_1$ and drives k to accept the rest or some initial segment matches $r_2$
and drives k to accept the rest. By induction match_is works as specified
for $r_1$ and $r_2$, which is sufficient to justify the correctness of
match_is for $r = r_1 + r_2$.
For $r = r_1\,r_2$, the proof is slightly more complicated. By induction
match_is behaves according to the specification if it is applied to either
$r_1$ or $r_2$, provided that the continuation argument is total. Note that
the continuation k' given by fn cs' => match_is r2 cs' k is total, since by
induction the inner recursive call to match_is always terminates. Suppose
that there exists a partitioning $cs = cs' @ cs''$ with $cs' \in L(r)$ and
k cs'' evaluating to true. Then $cs' = cs'_1 @ cs'_2$ with
$cs'_1 \in L(r_1)$ and $cs'_2 \in L(r_2)$, by definition of $L(r_1\,r_2)$.
Consequently, match_is $r_2$ ($cs'_2 @ cs''$) k evaluates to true, and
hence match_is $r_1$ ($cs'_1 @ cs'_2 @ cs''$) k' evaluates to true, as
required. If, however, no such partitioning exists, then one of three
situations occurs:
1. either no initial segment of cs matches $r_1$, in which case the outer
recursive call yields false, as required, or
2. for every initial segment matching $r_1$, no initial segment of the
corresponding final segment matches $r_2$, in which case the inner recursive
call yields false on every call, and hence the outer call yields false,
as required, or
3. every pair of successive initial segments of cs matching $r_1$ and $r_2$
successively results in k evaluating to false, in which case the inner
recursive call always yields false, and hence the continuation k' always
yields false, and hence the outer recursive call yields false,
as required.
Be sure you understand the reasoning involved here; it is quite tricky to
get right!
We seem to be on track, with one more case to consider, $r = r_1^*$. This
case would appear to be a combination of the preceding two cases for
alternation and concatenation, with a similar argument sufficing to
establish correctness. But there is a snag: the second recursive call to
match_is leaves the regular expression unchanged! Consequently we cannot
apply the inductive hypothesis to establish that it behaves correctly in
this case, and the obvious proof attempt breaks down.
What to do? A moment’s thought suggests that we proceed by an inner
induction on the length of the string, based on the idea that if some
initial segment of cs matches r, then either that initial segment is the
null string (base case), or $cs = cs' @ cs''$ with $cs' \in L(r_1)$ and
$cs'' \in L(r)$ (induction step). We then handle the base case directly,
and handle the inductive case by assuming that match_is behaves correctly
for cs'' and showing that it behaves correctly for cs. But there is a flaw
in this argument: the string cs'' need not be shorter than cs in the case
that cs' is the null string! In that case the inductive hypothesis does not
apply, and we are once again unable to complete the proof.
This time we can use the failure of the proof to obtain a counterexample
to the specification! For if $r = 1^*$, for example, then match_is r cs k
does not terminate whenever k cs yields false! In general if $r = r_1^*$
with $\varepsilon \in L(r_1)$, then match_is r cs k can fail to terminate,
for instance when k rejects every final segment of cs. In other words,
match_is does not satisfy the specification we have given for it. Our
conjecture is false!
Our failure to establish that match_is satisfies its specification led to a
counterexample that refuted our conjecture and uncovered a genuine bug
in the program: the matcher may not terminate for some inputs. What
to do? One approach is to explicitly check for looping behavior during
matching by ensuring that each recursive call matches some non-empty
initial segment of the string. This will work, but at the expense of
cluttering the code and imposing additional run-time overhead. You should
write out a version of the matcher that works this way, and check that it
indeed satisfies the specification we’ve given above.
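One way to carry out this suggestion is sketched below. It is only an illustration, under the assumption that progress is checked by comparing list lengths: in the Star case the continuation goes on to another iteration only if the initial segment just matched was non-empty. The datatype and match are repeated so that the fragment stands alone.

```sml
datatype regexp =
    Zero | One | Char of char
  | Plus of regexp * regexp
  | Times of regexp * regexp
  | Star of regexp

fun match_is Zero _ k = false
  | match_is One cs k = k cs
  | match_is (Char c) nil k = false
  | match_is (Char c) (d::cs) k =
      if c=d then k cs else false
  | match_is (Times (r1, r2)) cs k =
      match_is r1 cs (fn cs' => match_is r2 cs' k)
  | match_is (Plus (r1, r2)) cs k =
      match_is r1 cs k orelse match_is r2 cs k
  | match_is (Star r) cs k =
      k cs orelse
      match_is r cs (fn cs' =>
        (* cs' is a final segment of cs; it is shorter iff r consumed input *)
        length cs' < length cs andalso match_is (Star r) cs' k)

fun match r s =
  match_is r (String.explode s) (fn nil => true | _ => false)

(* these calls terminate even though One matches the null string *)
val accepts_empty = match (Star One) ""
val rejects_a = match (Star One) "a"
```

Nothing is lost by insisting on progress, since any string in the iteration of a language is a concatenation of non-empty strings drawn from that language. The cost is the extra length computations on each iteration, which is the run-time overhead mentioned above.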
An alternative is to observe that the proof goes through under the
additional assumption that no iterated regular expression matches the null
string. Call a regular expression r standard iff whenever $(r')^*$ occurs
within r, the null string is not an element of $L(r')$. It is easy to check
that the proof given above goes through under the assumption that the
regular expression r is standard.
This says that the matcher works correctly for standard regular
expressions. But what about the non-standard ones? The key observation is
that every regular expression is equivalent to one in standard form. By
“equivalent” we mean “accepting the same language”. For example, the
regular expressions r + 0 and r are easily seen to be equivalent. Using
this observation we may avoid the need to consider non-standard regular
expressions. Instead we can pre-process the regular expression to put it
into standard form, then call the matcher on the standardized regular
expression.
The required pre-processing is based on the following definitions. We
will associate with each regular expression r two standard regular
expressions $\delta(r)$ and $r^-$ with the following properties:
1. $L(\delta(r)) = 1$ iff $\varepsilon \in L(r)$, and $L(\delta(r)) = 0$ otherwise.
2. $L(r^-) = L(r) \setminus 1$.
With these equations in mind, we see that every regular expression r may
be written in the form $\delta(r) + r^-$, which is in standard form.
The function $\delta$ mapping regular expressions to regular expressions is
defined by induction on regular expressions by the following equations:
$\delta(0) = 0$
$\delta(1) = 1$
$\delta(a) = 0$
$\delta(r_1 + r_2) = \delta(r_1) \oplus \delta(r_2)$
$\delta(r_1\,r_2) = \delta(r_1) \otimes \delta(r_2)$
$\delta(r^*) = 1$
Here we define $0 \oplus 1 = 1 \oplus 0 = 1 \oplus 1 = 1$ and
$0 \oplus 0 = 0$, and $0 \otimes 1 = 1 \otimes 0 = 0 \otimes 0 = 0$ and
$1 \otimes 1 = 1$.
Exercise 2
Show that L(δ(r)) = 1 iff ε ∈ L(r).
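The definition of $r^-$ is given by induction on the structure of r by the
following equations:
$0^- = 0$
$1^- = 0$
$a^- = a$
$(r_1 + r_2)^- = r_1^- + r_2^-$
$(r_1\,r_2)^- = \delta(r_1)\,r_2^- + r_1^-\,\delta(r_2) + r_1^-\,r_2^-$
$(r^*)^- = r^-\,(r^-)^*$
The only tricky case is the one for concatenation, which must take account
of the possibility that $r_1$ or $r_2$ accepts the null string. Note also
that $a^- = a$, since the one-character string a is non-empty, and that
iteration is applied only to $r^-$, which never accepts the null string, so
that the result is standard.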
Exercise 3
Show that $L(r^-) = L(r) \setminus 1$.
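The two translations can be transcribed into ML almost directly. The sketch below is one possible rendering, not code from the text: the names oplus, otimes, delta, minus, and standardize are hypothetical, and the clauses for characters and for iteration are chosen so that the required property $L(r^-) = L(r) \setminus 1$ holds, namely $a^- = a$ and $(r^*)^- = r^-\,(r^-)^*$.

```sml
datatype regexp =
    Zero | One | Char of char
  | Plus of regexp * regexp
  | Times of regexp * regexp
  | Star of regexp

(* the operations on 0 and 1 used in the definition of delta *)
fun oplus (Zero, Zero) = Zero
  | oplus _ = One
fun otimes (One, One) = One
  | otimes _ = Zero

(* delta r is One if r accepts the null string, and Zero otherwise *)
fun delta Zero = Zero
  | delta One = One
  | delta (Char _) = Zero
  | delta (Plus (r1, r2)) = oplus (delta r1, delta r2)
  | delta (Times (r1, r2)) = otimes (delta r1, delta r2)
  | delta (Star _) = One

(* minus r accepts exactly the non-empty strings accepted by r *)
fun minus Zero = Zero
  | minus One = Zero
  | minus (Char c) = Char c
  | minus (Plus (r1, r2)) = Plus (minus r1, minus r2)
  | minus (Times (r1, r2)) =
      Plus (Times (delta r1, minus r2),
            Plus (Times (minus r1, delta r2),
                  Times (minus r1, minus r2)))
  | minus (Star r) = Times (minus r, Star (minus r))

(* an equivalent regular expression in standard form *)
fun standardize r = Plus (delta r, minus r)
```

A matcher would then be applied to standardize r rather than to r itself. The result is larger than the input, and no attempt is made here to simplify away the occurrences of Zero and One that the translation introduces.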
27.3 Sample Code
Here is the code for this chapter.
Chapter 28
Persistent and Ephemeral Data Structures
This chapter is concerned with persistent and ephemeral abstract types. The
distinction is best explained in terms of the logical future of a value.
Whenever a value of an abstract type is created it may be subsequently acted
upon by the operations of the type (and, since the type is abstract, by no
other operations). Each of these operations may yield (other) values of that
abstract type, which may themselves be handed off to further operations
of the type. Ultimately a value of some other type, say a string or an
integer, is obtained as an observable outcome of the succession of operations
on the abstract value. The sequence of operations performed on a value of
an abstract type constitutes a logical future of that value: a computation
that starts with that value and ends with a value of some observable type.
We say that a type is ephemeral iff every value of that type has at most one
logical future, which is to say that it is handed off from one operation of
the type to another until an observable value is obtained from it. This is
the normal case in familiar imperative programming languages because
in such languages the operations of an abstract type destructively modify
the value upon which they operate; its original state is irretrievably lost by
the performance of an operation. It is therefore inherent in the imperative
programming model that a value have at most one logical future. In
contrast, values of an abstract type in functional languages such as ML may
have many different logical futures, precisely because the operations do
not “destroy” the value upon which they operate, but rather create fresh
values of that type to yield as results. Such values are said to be persistent
because they persist after application of an operation of the type, and in
fact may serve as arguments to further operations of that type.
Some examples will help to clarify the distinction. The primitive list
types of ML are persistent because the performance of an operation such
as cons’ing, appending, or reversing a list does not destroy the original
list. This leads naturally to the idea of multiple logical futures for a given
value, as illustrated by the following code sequence:
val l = [1,2,3]            (* original list *)
val m1 = tl l              (* first future of l *)
val n1 = rev m1
val m2 = l @ [4,5,6]       (* second future of l *)
Notice that the original list value, [1,2,3], has two distinct logical futures,
one in which we remove its head, then reverse the tail, and the other in
which we append the list [4,5,6] to it. The ability to easily handle
multiple logical futures for a data structure is a tremendous source of
flexibility and expressive power, alleviating the need to perform tedious
bookkeeping to manage “versions” or “copies” of a data structure to be
passed to different operations.
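A quick check, offered as a sketch rather than as part of the development, confirms that both futures leave the original list intact. The bindings first and second are hypothetical names for the results of the two futures.

```sml
val l = [1, 2, 3]                 (* original list *)
val first = rev (tl l)            (* first future: drop the head, reverse the tail *)
val second = l @ [4, 5, 6]        (* second future: append to the same list *)
(* l is still bound to the list [1, 2, 3]; neither future altered it *)
val unchanged = (l = [1, 2, 3])
```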
The prototypical ephemeral data structure in ML is the reference cell.
Performing an assignment operation on a reference cell changes it
irrevocably; the original contents of the cell are lost, even if we keep a
handle on it.
val r = ref 0              (* original cell *)
val s = r
val _ = (s := 1)
val x = !r                 (* 1! *)
Notice that the contents of (the cell bound to) r changes as a result of
performing an assignment to the underlying cell. There is only one future
for this cell; a reference to its original binding does not yield its original
contents.
More elaborate forms of ephemeral data structures are certainly possible.
For example, the following declaration defines a type of lists whose
tails are mutable. It is therefore a singly-linked list, one whose predecessor
relation may be changed dynamically by assignment:
datatype 'a mutable_list =
    Nil
  | Cons of 'a * 'a mutable_list ref
Values of this type are ephemeral in the sense that some operations
on values of this type are destructive, and hence are irreversible (so to
speak!). For example, here’s an implementation of a destructive reversal
of a mutable list. Given a mutable list l, this function reverses the links in
the cell so that the elements occur in reverse order of their occurrence in l.
local
  fun ipr (Nil, a) = a
    | ipr (this as (Cons (_, r as ref next)), a) =
        ipr (next, (r := a; this))
in
  (* destructively reverse a list *)
  fun inplace_reverse l = ipr (l, Nil)
end
As you can see, the code is quite tricky to understand! The idea is the
same as the iterative reverse function for pure lists, except that we reuse
the nodes of the original list, rather than generate new ones, when moving
elements onto the accumulator argument.
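The following sketch shows the destructive behavior in action. The helper functions from_list and to_list are hypothetical conveniences introduced only for this example; the datatype and the reversal function are repeated so that the fragment stands alone.

```sml
datatype 'a mutable_list =
    Nil
  | Cons of 'a * 'a mutable_list ref

(* hypothetical helpers for converting to and from ordinary lists *)
fun from_list nil = Nil
  | from_list (x :: xs) = Cons (x, ref (from_list xs))

fun to_list Nil = nil
  | to_list (Cons (x, ref rest)) = x :: to_list rest

local
  fun ipr (Nil, a) = a
    | ipr (this as (Cons (_, r as ref next)), a) =
        ipr (next, (r := a; this))
in
  fun inplace_reverse l = ipr (l, Nil)
end

val ml = from_list [1, 2, 3]
val before_rev = to_list ml       (* contents before the reversal *)
val rl = inplace_reverse ml
val after_rev = to_list rl        (* contents of the reversed list *)
(* the cell bound to ml was reused as the last cell of the reversed list *)
val leftover = to_list ml
```

After the call, ml no longer denotes the three-element list it started as: its first cell now ends the reversed list. This is exactly the sense in which the original value has only one logical future.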
The distinction between ephemeral and persistent data structures is
essentially the distinction between functional (effect-free) and imperative
(effect-ful) programming: functional data structures are persistent;
imperative data structures are ephemeral. However, this characterization is
oversimplified in two respects. First, it is possible to implement a
persistent data structure that exploits mutable storage. Such a use of
mutation is an example of what is called a benign effect because for all
practical purposes the data structure is “purely functional” (i.e.,
persistent), but is in fact implemented using mutable storage. As we will
see later the exploitation of benign effects is crucial for building
efficient implementations of persistent data structures. Second, it is
possible for a persistent data type to be used in such a way that
persistence is not exploited; rather, every value of the type has at most
one future in the program. Such a type is said to be single-threaded,
reflecting the linear, as opposed to branching, structure of the future uses
of values of that type. The significance of a single-threaded type is that
it may as well have been implemented as an ephemeral data structure (e.g.,
by having observable effects on values) without changing the behavior of
the program.
28.1 Persistent Queues
Here is a signature of persistent queues:
signature QUEUE = sig
  type 'a queue
  exception Empty
  val empty : 'a queue
  val insert : 'a * 'a queue -> 'a queue
  val remove : 'a queue -> 'a * 'a queue
end
This signature describes a structure providing a representation type for
queues, together with operations to create an empty queue, insert an
element onto the back of the queue, and to remove an element from the front
of the queue. It also provides an exception that is raised in response to
an attempt to remove an element from the empty queue. Notice that
removing an element from a queue yields both the element at the front of
the queue, and the queue resulting from removing that element. This is
a direct reflection of the persistence of queues specified by this
signature; the original queue remains available as an argument to further
queue operations.
By a sequence of queue operations we shall mean a succession of uses
of empty, insert, and remove operations in such a way that the queue
argument of one operation is obtained as a result of the immediately
preceding queue operation. Thus a sequence of queue operations represents
a single-threaded time-line in the life of a queue value. Here is an example
of a sequence of queue operations:
val q0 : int queue = empty
val q1 = insert (1, q0)
val q2 = insert (2, q1)
val (h1, q3) = remove q2     (* h1 = 1, q3 contains only 2 *)
val (h2, q4) = remove q3     (* h2 = 2, q4 = q0 *)
By contrast the following operations do not form a single thread, but
rather a branching development of the queue’s lifetime:
val q0 : int queue = empty
val q1 = insert (1, q0)
val q2 = insert (2, q0)      (* NB: q0, not q1! *)
val (h1, q3) = remove q1     (* h1 = 1, q3 = q0 *)
val (h2, q4) = remove q3     (* raises Empty *)
val (h2, q4) = remove q2     (* h2 = 2, q4 = q0 *)
In the remainder of this section we will be concerned with single-threaded
sequences of queue operations.
How might we implement the signature QUEUE? The most obvious approach
is to represent the queue as a list with, say, the head element of
the list representing the “back” (most recently enqueued element) of the
queue. With this representation enqueueing is a constant-time operation,
but dequeuing requires time proportional to the number of elements in
the queue. Thus in the worst case a sequence of n enqueue and dequeue
operations will take time $O(n^2)$, which is clearly excessive. We can make
dequeue simpler, at the expense of enqueue, by regarding the head of the
list as the “front” of the queue, but the time bound for n operations
remains the same in the worst case.
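For comparison, here is a minimal sketch of the first naive representation, in which the head of the list is the back of the queue. The structure name NaiveQueue is hypothetical, and the signature is repeated so that the fragment stands alone.

```sml
signature QUEUE = sig
  type 'a queue
  exception Empty
  val empty : 'a queue
  val insert : 'a * 'a queue -> 'a queue
  val remove : 'a queue -> 'a * 'a queue
end

(* naive representation: the head of the list is the back of the queue *)
structure NaiveQueue :> QUEUE = struct
  type 'a queue = 'a list
  exception Empty
  val empty = nil
  (* constant time: cons onto the back *)
  fun insert (x, q) = x :: q
  (* linear time: walk to the last cell, which holds the front element *)
  fun remove nil = raise Empty
    | remove [x] = (x, nil)
    | remove (y :: q) =
        let val (x, q') = remove q in (x, y :: q') end
end
```

Each remove traverses and rebuilds the whole list, so n inserts followed by n removes perform on the order of $n^2$ steps in total.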
Can we do better? A well-known “trick” achieves an O(n) worst-case
performance for any sequence of n operations, which means that each
operation takes O(1) steps if we amortize the cost over the entire sequence.
Notice that this is a worst-case bound for the sequence, yielding an amortized
bound for each operation of the sequence. This means that some operations
may be relatively expensive, but, in compensation, many operations will
be cheap.
How is this achieved? By combining the two naive solutions sketched
above. The idea is to represent the queue by two lists, one for the “back
half” consisting of recently inserted elements in reverse order of arrival
(most recent first), and one for the “front half” consisting of
soon-to-be-removed elements in order of arrival (i.e., in order of removal).
We speak loosely of the “halves” of the queue, because we will not, in
general, maintain an even split of elements between the front and the back
lists. Rather, we will arrange things so that the following representation
invariants hold true:
1. The elements of the queue listed in order of removal are the elements
of the front followed by the elements of the back in reverse order.
2. The front is empty only if the back is empty.
These invariants are maintained by using a “smart constructor” that
creates a queue from two lists representing the back and front parts of the
queue. This constructor ensures that the representation invariant holds
by ensuring that condition (2) is always true of the resulting queue. The
constructor proceeds by a case analysis on the back and front parts of the
queue. If the front list is nonempty, or both the front and back are empty,
the resulting queue consists of the back and front parts as given. If the
front is empty and the back is nonempty, the queue constructor yields
the queue consisting of an empty back part and a front part equal to the
reversal of the given back part. Observe that this is sufﬁcient to ensure
that the representation invariant holds of the resulting queue in all cases.
Observe also that the smart constructor runs either in constant time or in
time proportional to the length of the back part, according to whether the
front part is non-empty or empty, respectively.
Insertion of an element into a queue is achieved by cons’ing the element
onto the back of the queue, then calling the queue constructor to
ensure that the result is in conformance with the representation
invariant. Thus an insert can either take constant time, or time
proportional to the size of the back of the queue, depending on whether the
front part is empty. Removal of an element from a queue requires a case
analysis. If the front is empty, then by condition (2) the queue is empty,
so we raise an exception. If the front is non-empty, we simply return the
head element together with the queue created from the original back part
and the front part with the head element removed. Here again the time
required is either constant or proportional to the size of the back of the
queue, according to whether the front part becomes empty after the removal.
Notice that if an insertion or removal requires a reversal of k elements,
then the next k operations are constant-time. This is the fundamental
insight as to why we achieve O(n) time complexity over any sequence of n
operations. (We will give a more rigorous analysis shortly.)
Here’s the implementation of this idea in ML:
structure Queue :> QUEUE = struct
  type 'a queue = 'a list * 'a list
  fun make_queue (q as (nil, nil)) = q
    | make_queue (bs, nil) = (nil, rev bs)
    | make_queue (q as (bs, fs)) = q
  (* written as a value so that it generalizes to 'a queue *)
  val empty = (nil, nil)
  fun insert (x, (back,front)) =
      make_queue (x::back, front)
  exception Empty
  fun remove (_, nil) = raise Empty
    | remove (bs, f::fs) = (f, make_queue (bs, fs))
end
Notice that we call the “smart constructor” make_queue whenever we
wish to return a queue to ensure that the representation invariant holds.
Consequently, some queue operations are more expensive than others,
according to whether or not the queue needs to be reorganized to satisfy the
representation invariant. However, each such reorganization makes a
corresponding number of subsequent queue operations “cheap”
(constant-time), so the overall effort required evens out in the end to
constant-time per operation. More precisely, the running time of a sequence
of n queue operations is now O(n), rather than $O(n^2)$, as it was in the
naive implementation. Consequently, each operation takes O(1) (constant)
time “on average,” i.e., when the total effort is evenly apportioned among
each of the operations in the sequence. Note that this is a worst-case time
bound for each operation, amortized over the entire sequence, not an
average-case time bound based on assumptions about the distribution of the
operations.
28.2 Amortized Analysis
How can we prove this claim? First we give an informal argument, then
we tighten it up with a more rigorous analysis. We are to account for the
total work performed by a sequence of n operations by showing that any
sequence of n operations can be executed in cn steps for some constant
c. Dividing by n, we obtain the result that each operation takes c steps
when amortized over the entire sequence. The key is to observe first that
the work required to execute a sequence of queue operations may be
apportioned to the elements themselves, then that only a constant amount of
work is expended on each element. The “life” of a queue element may be
divided into three stages: its arrival in the queue, its transit time in the
queue, and its departure from the queue. In the worst case each element
passes through each of these stages (but may “die young”, never
participating in the second or third stage). Arrival requires constant time
to add the element to the back of the queue. Transit consists of being moved
from the back to the front by a reversal, which takes constant time per
element on the back. Departure takes constant time to pattern match and
extract the element. Thus at worst we require three steps per element to
account for the entire effort expended to perform a sequence of queue
operations. This is in fact a conservative upper bound in the sense that we
may need less than 3n steps for the sequence, but asymptotically the bound
is optimal: we cannot do better than constant time per operation! (You
might reasonably wonder whether there is a worst-case, non-amortized
constant-time implementation of persistent queues. The answer is “yes”,
but the code is far more complicated than the simple implementation we
are sketching here.)
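The informal accounting can be checked experimentally. The sketch below is an illustration only, under an assumed cost model that charges one step for each arrival, one for each element moved by a reversal, and one for each departure. The queue operations are unpacked from the structure and thread a running step count; the function run is a hypothetical driver.

```sml
exception Empty

(* A queue is a pair (back, front); the integer counts the steps so far. *)
fun make_queue ((bs, nil), steps) =
      ((nil, rev bs), steps + length bs)          (* transit: one step per element *)
  | make_queue (q, steps) = (q, steps)

fun insert (x, ((bs, fs), steps)) =
  make_queue ((x :: bs, fs), steps + 1)           (* arrival *)

fun remove ((_, nil), _) = raise Empty
  | remove ((bs, f :: fs), steps) =
      (f, make_queue ((bs, fs), steps + 1))       (* departure *)

(* insert 1..m, then remove all m elements, returning the final step count *)
fun run m =
  let
    fun fill (i, q) = if i > m then q else fill (i + 1, insert (i, q))
    fun drain (0, (_, steps)) = steps
      | drain (k, q) = let val (_, q') = remove q in drain (k - 1, q') end
  in
    drain (m, fill (1, ((nil, nil), 0)))
  end

(* the count never exceeds three steps per element *)
val within_bound = run 100 <= 3 * 100
```

Under this cost model each element is charged at most once per stage, so the total for m elements, and hence for the 2m operations performed on them, is at most 3m.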
This argument can be made rigorous as follows. The general idea is to
introduce the notion of a charge scheme that provides an upper bound on
the actual cost of executing a sequence of operations. An upper bound on
the charge will then provide an upper bound on the actual cost. Let T(n)
be the cumulative time required (in the worst case) to execute a sequence
of n queue operations. We will introduce a charge function, C(n), representing
the cumulative charge for executing a sequence of n operations and
show that T(n) ≤ C(n) = O(n). It is convenient to express this in terms
of a function R(n) = C(n) −T(n) representing the cumulative residual, or
overcharge, which is the amount that the charge for n operations exceeds
the actual cost of executing them. We will arrange things so that R(n) ≥ 0
and that C(n) = O(n), from which the result follows immediately.
Down to specifics. By charging 2 for each insert operation and 1 for
each remove, it follows that C(n) ≤ 2n for any sequence of n inserts and
removes. Thus C(n) = O(n). After any sequence of n ≥ 0 operations
has been performed, the queue contains 0 ≤ b ≤ n elements on the back
“half” and 0 ≤ f ≤ n elements on the front “half”. We claim that for every
n ≥ 0, R(n) = b. We prove this by induction on n ≥ 0. The condition
clearly holds after performing 0 operations, since T(0) = 0, C(0) = 0,
and hence R(0) = C(0) − T(0) = 0. Consider the (n + 1)st operation. If
it is an insert, and f > 0, then T(n + 1) = T(n) + 1, C(n + 1) = C(n) + 2,
and hence R(n + 1) = R(n) + 1 = b + 1. This is correct because an insert
operation adds one element to the back of the queue. If, on the other hand,
f = 0, then T(n + 1) = T(n) + b + 2 (charging one for the cons and one
for creating the new pair of lists), C(n + 1) = C(n) + 2, so R(n + 1) =
R(n) + 2 − b − 2 = b + 2 − b − 2 = 0. This is correct because the back is
now empty; we have used the residual overcharge to pay for the cost of the
reversal. If the (n + 1)st operation is a remove, and f > 0 (where, for a
remove, f counts the elements left on the front once the head has been
taken off), then T(n + 1) = T(n) + 1 and C(n + 1) = C(n) + 1 and hence
R(n + 1) = R(n) = b. This is correct because the remove doesn’t disturb the
back in this case. Finally, if we are performing a remove with f = 0, then
T(n + 1) = T(n) + b + 1, C(n + 1) = C(n) + 1, and hence
R(n + 1) = R(n) − b = b − b = 0. Here again we make use of the residual
overcharge to pay for the reversal of the back to the front. The result
follows immediately since R(n) = b ≥ 0, and hence C(n) ≥ T(n).
It is instructive to examine where this solution breaks down in the
multithreaded case (i.e., where persistence is fully exploited). Suppose
that we perform a sequence of n insert operations on the empty queue,
resulting in a queue with n elements on the back and none on the front.
Call this queue q. Let us suppose that we have n independent “futures”
for q, each of which removes an element from it, for a total of 2n opera
tions. How much time do these 2n operations take? Since each indepen
dent future must reverse all n elements onto the front of the queue before
performing the removal, the entire collection of 2n operations takes n + n
2
steps, or O(n) steps per operation, breaking the amortized constanttime
bound just derived for a singlethreaded sequence of queue operations.
Can we recover a constanttime amortized cost in the persistent case? We
can, provided that we share the cost of the reversal among all futures of
q — as soon as one performs the reversal, they all enjoy the beneﬁt of its
having been done. This may be achieved by using a benign side effect to
cache the result of the reversal in a reference cell that is shared among all
uses of the queue. We will return to this once we introduce memoization
and lazy evaluation.
REVISED 11.02.11 DRAFT VERSION 1.2
28.3 Sample Code
Here is the code for this chapter.
Chapter 29
Options, Exceptions, and
Continuations
In this chapter we discuss the close relationships between option types,
exceptions, and continuations. Each provides a means of handling the
failure of a computation to produce a value. Option types provide the
means of explicitly indicating in the type of a function the possibility that
it may fail to yield a “normal” result. The result type of the function forces
the caller to dispatch explicitly on whether or not it returned a normal
value. Exceptions provide the means of implicitly signalling failure to
return a normal result value, without sacrificing the requirement that an
application of such a function cannot ignore failure to yield a value.
Continuations provide another means of handling failure by providing a
function to invoke in the case that normal return is impossible.
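Before turning to a larger problem, the three styles can be compared on a very small one. The sketch below is illustrative only: the association-list lookup and the names lookupOpt, lookupExn, lookupK, NotFound, and table are assumptions made for this example and do not appear elsewhere in the text.

```sml
(* Option style: the possibility of failure is visible in the result type. *)
fun lookupOpt (k : string, nil) = NONE
  | lookupOpt (k, (k', v) :: rest) =
      if k = k' then SOME v else lookupOpt (k, rest)

(* Exception style: failure is signalled implicitly by raising NotFound. *)
exception NotFound
fun lookupExn (k : string, nil) = raise NotFound
  | lookupExn (k, (k', v) :: rest) =
      if k = k' then v else lookupExn (k, rest)

(* Continuation style: the caller passes the function to invoke on failure. *)
fun lookupK (k : string, nil, fc) = fc ()
  | lookupK (k, (k', v) :: rest, fc) =
      if k = k' then v else lookupK (k, rest, fc)

val table = [("one", 1), ("two", 2)]

(* Each caller supplies the default 0 in the manner its style requires. *)
val a = case lookupOpt ("three", table) of SOME v => v | NONE => 0
val b = lookupExn ("three", table) handle NotFound => 0
val c = lookupK ("three", table, fn () => 0)
```

In each case the caller decides what failure means; the styles differ only in how that decision reaches the point where the search gives up.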
29.1 The n-Queens Problem
We will explore the trade-offs between these three approaches by considering
three different implementations of the n-queens problem: find a way
to place n queens on an n × n chessboard in such a way that no two queens
attack one another. The general strategy is to place queens in successive
columns in such a way that each is not attacked by a previously placed queen.
Unfortunately it’s not possible to do this in one pass; we may find that we
can safely place k < n queens on the board, only to discover that there is
no way to place the next one. To find a solution we must reconsider earlier
decisions, and work forward from there. If all possible reconsiderations of
all previous decisions lead to failure, then the problem is unsolvable.
For example, there is no safe placement of three queens on a 3 × 3
chessboard. This trial-and-error approach to solving the n-queens problem is
called backtracking search.
A solution to the n-queens problem consists of an n × n chessboard
with n queens safely placed on it. The following signature defines a
chessboard abstraction:
signature BOARD =
sig
  type board
  val new : int -> board
  val complete : board -> bool
  val place : board * int -> board
  val safe : board * int -> bool
  val size : board -> int
  val positions : board -> (int * int) list
end
The operation new creates a new board of a given dimension n ≥ 0. The
operation complete checks whether the board contains a complete safe
placement of n queens. The function safe checks whether it is safe to
place a queen at row i in the next free column of a board B. The operation
place puts a queen at row i in the next available column of the board.
The function size returns the size of a board, and the function positions
returns the coordinates of the queens on the board.
The board abstraction may be implemented as follows:
structure Board :> BOARD =
struct
  (* rep: size, next free column, number placed, placements
     inv: size>=0, 1<=next_free<=size+1,
          length(placements) = number placed
  *)
  type board = int * int * int * (int * int) list
  fun new n = (n, 1, 0, nil)
  fun size (n, _, _, _) = n
  fun complete (n, _, k, _) = (k=n)
  fun positions (_, _, _, qs) = qs
  fun place ((n, i, k, qs), j) =
      (n, i+1, k+1, (i,j)::qs)
  fun threatens ((i,j), (i',j')) =
      i=i' orelse j=j' orelse
      i+j = i'+j' orelse
      i-j = i'-j'
  fun conflicts (q, nil) =
      false
    | conflicts (q, q'::qs) =
      threatens (q, q') orelse conflicts (q, qs)
  fun safe ((_, i, _, qs), j) =
      not (conflicts ((i,j), qs))
end
The representation type contains “redundant” information in order to make
the individual operations more efficient. The representation invariant
ensures that the components of the representation are properly related to one
another (e.g., the claimed number of placements is indeed the length of the
list of placed queens, and so on).
Our goal is to deﬁne a function
val queens : int -> Board.board option
such that if n ≥ 0, then queens n evaluates either to NONE if there is no safe
placement of n queens on an n × n board, or to SOME B otherwise, with B a
complete board containing a safe placement of n queens. We will consider
three different solutions, one using option types, one using exceptions,
and one using a failure continuation.
29.2 Solution Using Options
Here’s a solution based on option types:
(* addqueen bd yields SOME bd' with bd' a
   complete safe placement extending bd,
   if one exists, and yields NONE otherwise
*)
fun addqueen bd =
    let
      fun try j =
          if j > Board.size bd then
            NONE
          else if Board.safe (bd, j) then
            case addqueen (Board.place (bd, j))
              of NONE => try (j+1)
               | r as (SOME bd') => r
          else
            try (j+1)
    in
      if Board.complete bd then
        SOME bd
      else
        try 1
    end
fun queens n = addqueen (Board.new n)
The characteristic feature of this solution is that we must explicitly check
the result of each recursive call to addqueen to determine whether a safe
placement is possible from that position. If so, we simply return it; if not,
we must reconsider the placement of a queen in row j of the next available
column. If no placement is possible in the current column, the function
yields NONE, which forces reconsideration of the placement of a queen in
the preceding column. Eventually we either find a safe placement, or yield
NONE indicating that no solution is possible.
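To watch the search run without loading the Board structure, here is a self-contained sketch of the same option-based strategy. It is an assumption-laden simplification: the board is just its size paired with the list of (column, row) placements, and the local definitions of safe, place, and complete are hypothetical stand-ins for the operations of BOARD.

```sml
(* Hypothetical stand-in for Board: the size together with the queens
   placed so far, most recent column first. *)
type board = int * (int * int) list

fun threatens ((i, j), (i', j')) =
    i = i' orelse j = j' orelse i + j = i' + j' orelse i - j = i' - j'

(* safe: may a queen go in row j of the next free column? *)
fun safe ((n, qs) : board, j) =
    let val i = length qs + 1
    in not (List.exists (fn q => threatens ((i, j), q)) qs) end

fun place ((n, qs) : board, j) : board = (n, (length qs + 1, j) :: qs)
fun complete ((n, qs) : board) = (length qs = n)

(* Backtracking search in the option style, following the text. *)
fun addqueen (bd : board) =
    let
      val (n, _) = bd
      fun try j =
          if j > n then NONE
          else if safe (bd, j) then
            (case addqueen (place (bd, j)) of
                 NONE => try (j + 1)
               | r as SOME _ => r)
          else try (j + 1)
    in
      if complete bd then SOME bd else try 1
    end

(* queens n yields the placements only, or NONE if there are none *)
fun queens n = Option.map (fn (_, qs) => qs) (addqueen (n, nil))

val four = queens 4    (* SOME of a list of four (column, row) pairs *)
val three = queens 3   (* NONE: no safe placement exists *)
```

Every NONE returned by an inner call is inspected by the case expression in try, which is exactly the explicit check discussed above.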
29.3 Solution Using Exceptions
The explicit check on the result of each recursive call can be replaced by
the use of exceptions. Rather than have addqueen return a value of type
Board.board option, we instead have it return a value of type Board.board,
if possible, and otherwise raise an exception indicating failure. The case
analysis on the result is replaced by a use of an exception handler. Here’s
the code:
exception Fail
(* addqueen bd yields bd', where bd' is a complete safe
   placement extending bd, if one exists, and raises Fail otherwise
*)
fun addqueen bd =
    let
      fun try j =
          if j > Board.size bd then
            raise Fail
          else if Board.safe (bd, j) then
            (addqueen (Board.place (bd, j))
             handle Fail => try (j+1))
          else
            try (j+1)
    in
      if Board.complete bd then
        bd
      else
        try 1
    end
fun queens n =
    SOME (addqueen (Board.new n))
    handle Fail => NONE
The main difference between this solution and the previous one is that
both calls to addqueen must handle the possibility that it raises the
exception Fail. In the outermost call this corresponds to a complete failure to
find a safe placement, which means that queens must return NONE. If a safe
placement is indeed found, it is wrapped with the constructor SOME to
indicate success. In the recursive call within try, an exception handler is
required to handle the possibility of there being no safe placement starting
in the current position. This check corresponds directly to the case
analysis required in the solution based on option types.
What are the tradeoffs between the two solutions?
1. The solution based on option types makes explicit in the type of
   the function addqueen the possibility of failure. This forces the
   programmer to explicitly test for failure using a case analysis on the
   result of the call. The type checker will ensure that one cannot use a
   Board.board option where a Board.board is expected. The solution
   based on exceptions does not explicitly indicate failure in its type.
   However, the programmer is nevertheless forced to handle the failure,
   for otherwise an uncaught exception error would be raised at
   run-time, rather than compile-time.
2. The solution based on option types requires an explicit case analysis
   on the result of each recursive call. If “most” results are successful,
   the check is redundant and therefore excessively costly. The solution
   based on exceptions is free of this overhead: it is biased towards the
   “normal” case of returning a board, rather than the “failure” case
   of not returning a board at all. The implementation of exceptions
   ensures that the use of a handler is more efficient than an explicit
   case analysis in the case that failure is rare compared to success.
For the n-queens problem it is not clear which solution is preferable. In
general, if efficiency is paramount, we tend to prefer exceptions if failure
is a rarity, and to prefer options if failure is relatively common. If, on
the other hand, static checking is paramount, then it is advantageous to
use options since the type checker will enforce the requirement that the
programmer check for failure, rather than having the error arise only at
run-time.
29.4 Solution Using Continuations
We turn now to a third solution based on continuation-passing. The idea is
quite simple: an exception handler is essentially a function that we invoke
when we reach a blind alley. Ordinarily we achieve this invocation by
raising an exception and relying on the caller to catch it and pass control
to the handler. But we can, if we wish, pass the handler around as an
additional argument, the failure continuation of the computation. Here’s
how it’s done in the case of the n-queens problem:
(* addqueen bd yields bd', where bd' is a complete safe
   placement extending bd, if one exists, and otherwise,
   yields the value of fc ()
*)
fun addqueen (bd, fc) =
    let
      fun try j =
          if j > Board.size bd then
            fc ()
          else if Board.safe (bd, j) then
            addqueen
              (Board.place (bd, j),
               fn () => try (j+1))
          else
            try (j+1)
    in
      if Board.complete bd then
        SOME bd
      else
        try 1
    end
fun queens n =
    addqueen (Board.new n, fn () => NONE)
Here again the differences are small, but significant. The initial
continuation simply yields NONE, reflecting the ultimate failure to find a
safe placement. On a recursive call we pass to addqueen a continuation that
resumes search at the next row of the current column. Should we exceed the
number of rows on the board, we invoke the failure continuation of the most
recent call to addqueen.
The solution based on continuations is very close to the solution based
on exceptions, both in form and in terms of efficiency. Which is preferable?
Here again there is no easy answer; we can only offer general advice. First
off, as we’ve seen in the case of regular expression matching, failure
continuations are more powerful than exceptions; there is no obvious way to
replace the use of a failure continuation with a use of exceptions in the
matcher. However, in the case that exceptions would suffice, it is generally
preferable to use them since one may then avoid passing an explicit
failure continuation. More significantly, the compiler ensures that an
uncaught exception aborts the program gracefully, whereas failure to invoke
a continuation is not in itself a run-time fault. Using the right tool for the
right job makes life easier.
29.5 Sample Code
Here is the code for this chapter.
Chapter 30
Higher-Order Functions
Higher-order functions, those that take functions as arguments or return
functions as results, are powerful tools for building programs. An
interesting application of higher-order functions is to implement infinite
sequences of values as (total) functions from the natural numbers
(non-negative integers) to the type of values of the sequence. We will develop a
small package of operations for creating and manipulating sequences, all
of which are higher-order functions since they take sequences (functions!)
as arguments and/or return them as results. A natural way to define many
sequences is by recursion, or self-reference. Since sequences are functions,
we may use recursive function definitions to define such sequences.
Alternatively, we may think of such a sequence as arising from a “loopback”
or “feedback” construct. We will explore both approaches.
Sequences may be used to simulate digital circuits by thinking of a
“wire” as a sequence of bits developing over time. The ith value of the
sequence corresponds to the signal on the wire at time i. For simplicity
we will assume a perfect waveform: the signal is always either high or
low (or is undefined); we will not attempt to model electronic effects such
as attenuation or noise. Combinational logic elements (such as and gates
or inverters) are operations on wires: they take in one or more wires as
input and yield one or more wires as results. Digital logic elements (such
as flip-flops) are obtained from combinational logic elements by feedback,
or recursion: a flip-flop is a recursively-defined wire!
30.1 Inﬁnite Sequences
Let us begin by developing a sequence package. Here is a suitable
signature defining the type of sequences:
signature SEQUENCE =
sig
  type 'a seq = int -> 'a
  (* constant sequence *)
  val constantly : 'a -> 'a seq
  (* alternating values *)
  val alternately : 'a * 'a -> 'a seq
  (* insert at front *)
  val insert : 'a * 'a seq -> 'a seq
  val map : ('a -> 'b) -> 'a seq -> 'b seq
  val zip : 'a seq * 'b seq -> ('a * 'b) seq
  val unzip : ('a * 'b) seq -> 'a seq * 'b seq
  (* fair merge of a pair of sequences *)
  val merge : 'a seq * 'a seq -> 'a seq
  val stretch : int -> 'a seq -> 'a seq
  val shrink : int -> 'a seq -> 'a seq
  val take : int -> 'a seq -> 'a list
  val drop : int -> 'a seq -> 'a seq
  val shift : 'a seq -> 'a seq
  val loopback : ('a seq -> 'a seq) -> 'a seq
end
Observe that we expose the representation of sequences as functions. This
is done to simplify the definition of recursive sequences as recursive
functions. Alternatively we could have hidden the representation type, at the
expense of making it a bit more awkward to define recursive sequences.
In the absence of this exposure of representation, recursive sequences may
only be built using the loopback operation, which constructs a recursive
sequence by “looping back” the output of a sequence transformer to its
input. Most of the other operations of the signature are adaptations of
familiar operations on lists. Two exceptions to this rule are the functions
stretch and shrink, which dilate and contract the sequence by a given time
parameter: if a sequence is stretched by k, its value at i is the value of
the original sequence at i/k (rounded down), and dually a sequence shrunk
by k has at i the value of the original sequence at i*k.
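A small experiment makes the time dilation concrete. This is a sketch under stated assumptions: the definitions of stretch and shrink use integer division and multiplication as just described, the helper take is written here with List.tabulate purely for brevity, and nats is a hypothetical test sequence.

```sml
type 'a seq = int -> 'a

(* stretch k s holds each value of s for k time steps *)
fun stretch k (s : 'a seq) : 'a seq = fn n => s (n div k)

(* shrink k s keeps every kth value of s *)
fun shrink k (s : 'a seq) : 'a seq = fn n => s (n * k)

(* the first n values of a sequence, as a list *)
fun take n (s : 'a seq) = List.tabulate (n, s)

val nats : int seq = fn n => n

val stretched = take 6 (stretch 2 nats)              (* [0,0,1,1,2,2] *)
val shrunk = take 4 (shrink 2 nats)                  (* [0,2,4,6] *)
val roundTrip = take 4 (shrink 2 (stretch 2 nats))   (* [0,1,2,3] *)
```

Shrinking after stretching by the same factor recovers the original sequence, but stretching after shrinking does not, since shrinking discards values.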
Here’s an implementation of sequences as functions.
structure Sequence :> SEQUENCE =
struct
  type 'a seq = int -> 'a
  fun constantly c n = c
  fun alternately (c,d) n =
      if n mod 2 = 0 then c else d
  fun insert (x, s) 0 = x
    | insert (x, s) n = s (n-1)
  fun map f s = f o s
  fun zip (s1, s2) n = (s1 n, s2 n)
  fun unzip (s : ('a * 'b) seq) =
      (map #1 s, map #2 s)
  fun merge (s1, s2) n =
      (if n mod 2 = 0 then s1 else s2) (n div 2)
  fun stretch k s n = s (n div k)
  fun shrink k s n = s (n * k)
  fun drop k s n = s (n+k)
  fun shift s = drop 1 s
  fun take 0 _ = nil
    | take n s = s 0 :: take (n-1) (shift s)
  fun loopback loop n = loop (loopback loop) n
end
Most of this implementation is entirely straightforward, given the ease
with which we may manipulate higher-order functions in ML. The only
tricky function is loopback, which must arrange that the output of the
function loop is “looped back” to its input. This is achieved by a simple
recursive definition of a sequence whose value at n is the value at n of the
sequence resulting from applying the loop to this very sequence.
The sensibility of this definition of loopback relies on two separate
ideas. First, notice that we may not simplify the definition of loopback
as follows:
(* bad definition *)
fun loopback loop = loop (loopback loop)
The reason is that any application of loopback will immediately loop
forever! In contrast, the original definition is arranged so that application
of loopback immediately returns a function. This may be made more
apparent by writing it in the following form, which is entirely equivalent to
the definition given above:
fun loopback loop =
fn n => loop (loopback loop) n
This format makes it clear that loopback immediately returns a function
when applied to a loop functional.
Second, for an application of loopback to a loop to make sense, it must
be the case that the loop returns a sequence without “touching” the
argument sequence (i.e., without applying the argument to a natural
number). Otherwise accessing the sequence resulting from an application of
loopback would immediately loop forever. Some examples will help to
illustrate the point.
First, let’s build a few sequences without using the loopback function,
just to get familiar with using sequences:
val evens : int seq = fn n => 2*n
val odds : int seq = fn n => 2*n+1
val nats : int seq = merge (evens, odds)
fun fibs n =
(insert
(1, insert
(1, map (op +)
(zip (drop 1 fibs, fibs)))))(n)
We may “inspect” the sequence using take and drop, as follows:
take 10 nats (* [0,1,2,3,4,5,6,7,8,9] *)
take 5 (drop 5 nats) (* [5,6,7,8,9] *)
take 5 fibs (* [1,1,2,3,5] *)
Now let’s consider an alternative deﬁnition of fibs that uses the loopback
operation:
fun fibs_loop s =
    insert (1, insert (1,
      map (op +) (zip (drop 1 s, s))))
val fibs = loopback fibs_loop;
The definition of fibs_loop is exactly like the original definition of fibs,
except that the reference to fibs itself is replaced by a reference to the
argument s. Notice that the application of fibs_loop to an argument s
does not inspect the argument s!
One way to understand loopback is that it solves a system of equations
for an unknown sequence. In the case of the second definition of fibs, we
are solving the following system of equations for f:
  f 0 = 1
  f 1 = 1
  f (n + 2) = f (n + 1) + f (n)
These equations are derived by inspecting the definitions of insert, map,
zip, and drop given earlier. It is obvious that the solution is the Fibonacci
sequence; this is precisely the sequence obtained by applying loopback to
fibs_loop.
Here’s an example of a loop that, when looped back, yields an undefined
sequence: any attempt to access it results in an infinite loop.
fun bad_loop s n = s n + 1
val bad = loopback bad_loop
val _ = bad 0 (* infinite loop! *)
In this example we are, in effect, trying to solve the equation s n = s n + 1
for s, which has no solution (except the totally undefined sequence). The
problem is that the “next” element of the output is defined in terms of the
next element itself, rather than in terms of “previous” elements.
Consequently, no solution exists.
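For contrast, here is a loop whose equations do have a solution because each element depends only on earlier ones. The sketch is self-contained and restates insert, map, loopback, and take in the style of the Sequence structure; the names nats_loop and firstFive are hypothetical. The system being solved is f 0 = 0 and f (n + 1) = f n + 1.

```sml
type 'a seq = int -> 'a

fun insert (x, s) 0 = x
  | insert (x, s) n = s (n - 1)
fun map f s = f o s
fun loopback loop n = loop (loopback loop) n
fun take 0 s = nil
  | take n s = s 0 :: take (n - 1) (fn i => s (i + 1))

(* The loop never applies its argument s; it only wraps it with map
   and insert, so looping it back is well-defined. *)
fun nats_loop s = insert (0, map (fn x => x + 1) s)

val nats : int seq = loopback nats_loop
val firstFive = take 5 nats   (* [0,1,2,3,4] *)
```

Accessing nats at n unwinds the loop n times before reaching the inserted 0, so each access terminates; the cost of that unwinding is one motivation for the memoization techniques discussed later.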
30.2 Circuit Simulation
With these ideas in mind, we may apply the sequence package to build
an implementation of digital circuits. Let’s start with wires, which are
represented as sequences of levels:
datatype level = High | Low | Undef
type wire = level seq
type pair = (level * level) seq
val Zero : wire = constantly Low
val One : wire = constantly High
(* clock pulse with given duration of each pulse *)
fun clock (freq:int):wire =
    stretch freq (alternately (Low, High))
We include the “undefined” level to account for propagation delays and
settling times in circuit elements.
Combinational logic elements (gates) may be defined as follows. We
introduce an explicit unit time propagation delay for each gate: the output
is undefined initially, and is then determined as a function of its inputs. As
we build up layers of circuit elements, it takes longer and longer
(proportional to the length of the longest path through the circuit) for the
output to settle, exactly as in “real life”.
(* apply two functions in parallel *)
infixr **;
fun (f ** g) (x, y) = (f x, g y)
(* hardware logical and *)
fun logical_and (Low, _) = Low
  | logical_and (_, Low) = Low
  | logical_and (High, High) = High
  | logical_and _ = Undef
fun logical_not Undef = Undef
  | logical_not High = Low
  | logical_not Low = High
fun logical_nop l = l
(* a nor b = not a and not b *)
val logical_nor =
    logical_and o (logical_not ** logical_not)
type unary_gate = wire -> wire
type binary_gate = pair -> wire
(* logic gate with unit propagation delay *)
fun gate f w 0 = Undef
  | gate f w i = f (w (i-1))
val delay : unary_gate = gate logical_nop (* unit delay *)
val inverter : unary_gate = gate logical_not
val nor_gate : binary_gate = gate logical_nor
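The effect of the unit delay is easiest to see on a single gate. The following self-contained sketch restates only the definitions needed to drive an inverter from a clock; the helper take (written with List.tabulate) and the names input, output, and twice are assumptions made for this illustration.

```sml
type 'a seq = int -> 'a
datatype level = High | Low | Undef
type wire = level seq

fun alternately (c, d) n = if n mod 2 = 0 then c else d
fun stretch k s n = s (n div k)
fun take n (s : 'a seq) = List.tabulate (n, s)

fun clock (freq : int) : wire = stretch freq (alternately (Low, High))

fun logical_not Undef = Undef
  | logical_not High = Low
  | logical_not Low = High

(* logic gate with unit propagation delay *)
fun gate f w 0 = Undef
  | gate f w i = f (w (i - 1))

val inverter : wire -> wire = gate logical_not

(* [Low,Low,High,High,Low,Low] *)
val input = take 6 (clock 2)
(* [Undef,High,High,Low,Low,High]: inverted and one step late *)
val output = take 6 (inverter (clock 2))
(* [Undef,Undef,Low,Low,High,High]: two gates, two steps of delay *)
val twice = take 6 (inverter (inverter (clock 2)))
```

Each gate on a path contributes one leading Undef, which is why deeper circuits take longer to settle.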
It is a good exercise to build a one-bit adder out of these elements, then
to string them together to form an n-bit ripple-carry adder. Be sure to
present the inputs to the adder with sufficient pulse widths to ensure that
the circuit has time to settle!
Combining these basic logic elements with recursive definitions allows
us to define digital logic elements such as the RS flip-flop. The propagation
delay inherent in our definition of a gate is fundamental to ensuring
that the behavior of the flip-flop is well-defined! This is consistent with
“real life”: flip-flops depend on the existence of a hardware propagation
delay for their proper functioning. Note also that presentation of “illegal”
inputs (such as setting both the R and the S leads high) results in metastable
behavior of the flip-flop, here as in real life. Finally, observe that the
flip-flop exhibits a momentary “glitch” in its output before settling, exactly
as in hardware. All of these behaviors may be observed by using take and
drop to inspect the values on the circuit.
fun RS_ff (S : wire, R : wire) =
    let
      fun X n = nor_gate (zip (S, Y))(n)
      and Y n = nor_gate (zip (X, R))(n)
    in
      Y
    end
(* generate a pulse of b's n wide, followed by w *)
fun pulse b 0 w i = w i
  | pulse b n w 0 = b
  | pulse b n w i = pulse b (n-1) w (i-1)
val S = pulse Low 2 (pulse High 2 Zero);
val R = pulse Low 6 (pulse High 2 Zero);
val Q = RS_ff (S, R);
val _ = take 20 Q;
val X = RS_ff (S, S); (* unstable! *)
val _ = take 20 X;
It is a good exercise to derive a system of equations governing the RS
flip-flop from the definition we’ve given here, using the implementation of
the sequence operations given above. Observe that the delays arising from
the combinational logic elements ensure that a solution exists by ensuring
that the “next” element of the output refers only to the “previous” elements,
and not the “current” element.
Finally, we consider a variant implementation of an RS flip-flop using
the loopback operation:
fun loopback2 (f : wire * wire -> wire * wire) =
    unzip (loopback (zip o f o unzip))
fun RS_ff' (S : wire, R : wire) =
    let
      fun RS_loop (X, Y) =
          (nor_gate (zip (S, Y)),
           nor_gate (zip (X, R)))
    in
      loopback2 RS_loop
    end
Here we must define a “binary loopback” function to implement the
flip-flop. This is achieved by reducing binary loopback to unary loopback by
composing with zip and unzip.
30.3 Regular Expression Matching, Revisited
The regular expression matcher introduced in Chapter 1 may be elegantly
reformulated using higher-order functions. The code for the matcher may
be restructured to separate the high-level control structure of the matcher
from the details of how that control structure is implemented. A matcher
is a function that, given a string, matches an initial segment of that string.
A match combinator is a higher-order function that combines zero or more
matchers to form a compound matcher with the following behavior:
1. FAIL: always fails to match;
2. NULL: always matches the null string;
3. LITERALLY c: match exactly the character c at the start of the input
   string, yielding the remainder of the string;
4. m1 OR m2: match m1 on a prefix of the input, if possible; if not, match
   m2 on a prefix of the same input; otherwise fail;
5. m1 THEN m2: match m1 on a prefix of the input, and then match m2
   on the remaining suffix;
6. REPEATEDLY m: match m zero or more times on a prefix of the input.
Using these combinators we may present the regular expression matcher
in a particularly concise form:
fun match Zero = FAIL
  | match One = NULL
  | match (Char c) = LITERALLY c
  | match (Plus (r1, r2)) = match r1 OR match r2
  | match (Times (r1, r2)) = match r1 THEN match r2
  | match (Star r) = REPEATEDLY (match r)
fun accepts regexp =
    let
      val matcher = match regexp
      val done = fn nil => true | _ => false
    in
      fn str => matcher (String.explode str) done
    end
The auxiliary function, match, transforms each regular expression into a
matcher, using the combinators just described.
The deﬁnition of the function match illustrates the concept of staged
computation. The function match is deﬁned by induction on the structure
of the regular expression. It transforms each regular expression into a
matcher for that regular expression, using the combinators to build up the
matcher by inductive analysis of the regular expression. The deﬁnition of
accepts ensures that the regular expression is analyzed once, yielding a
matcher that may be applied many times to candidate strings, determining
whether or not they match the given regular expression.
It remains to define the matching combinators. A matcher is a function
of type
char list -> (char list -> bool) -> bool
that is given a character list and a continuation, and yields either true or
false (please see Chapter 1 to review how the matcher works). The match
combinators are defined as follows:
fun FAIL cs k = false
fun NULL cs k = k cs
fun LITERALLY c cs k =
    (case cs of nil => false | c'::cs' => (c=c') andalso (k cs'))
fun OR (m1, m2) cs k = m1 cs k orelse m2 cs k
infix 8 OR
fun THEN (m1, m2) cs k = m1 cs (fn cs' => m2 cs' k)
infix 9 THEN
fun REPEATEDLY m cs k =
    let
      fun mstar cs' = k cs' orelse m cs' mstar
    in
      mstar cs
    end
The details of the control flow, which is managed through the use of
continuations, are packaged into the match combinators, so that the matcher
itself is as perspicuous as possible.
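Putting the pieces in dependency order gives a program that can be run as is. The sketch below assumes the regexp datatype has the constructors used by match (Zero, One, Char, Plus, Times, Star); the sample expression re and the name accepts_abc are hypothetical. Note that the combinators and their infix declarations must precede match.

```sml
datatype regexp =
    Zero | One | Char of char
  | Plus of regexp * regexp
  | Times of regexp * regexp
  | Star of regexp

fun FAIL cs k = false
fun NULL cs k = k cs
fun LITERALLY c cs k =
    (case cs of nil => false | c' :: cs' => (c = c') andalso (k cs'))
fun OR (m1, m2) cs k = m1 cs k orelse m2 cs k
infix 8 OR
fun THEN (m1, m2) cs k = m1 cs (fn cs' => m2 cs' k)
infix 9 THEN
fun REPEATEDLY m cs k =
    let fun mstar cs' = k cs' orelse m cs' mstar
    in mstar cs end

fun match Zero = FAIL
  | match One = NULL
  | match (Char c) = LITERALLY c
  | match (Plus (r1, r2)) = match r1 OR match r2
  | match (Times (r1, r2)) = match r1 THEN match r2
  | match (Star r) = REPEATEDLY (match r)

fun accepts regexp =
    let
      val matcher = match regexp
      val done = fn nil => true | _ => false
    in
      fn str => matcher (String.explode str) done
    end

(* (a + b)* c : any string of a's and b's followed by a single c *)
val re = Times (Star (Plus (Char #"a", Char #"b")), Char #"c")

(* Staging: the regular expression is analyzed once, here ... *)
val accepts_abc = accepts re

(* ... and the resulting matcher is applied to several strings. *)
val r1 = accepts_abc "abac"   (* true *)
val r2 = accepts_abc "abca"   (* false *)
val r3 = accepts_abc "c"      (* true *)
```

One caveat of this sketch: REPEATEDLY m does not terminate on some inputs when m can match the empty string (for example Star One), so the examples avoid such expressions.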
30.4 Sample Code
Here is the code for this chapter.
Chapter 31
Memoization
In this chapter we will discuss memoization, a programming technique for
cacheing the results of previous computations so that they can be quickly
retrieved without repeated effort. Memoization is fundamental to the
implementation of lazy data structures, either “by hand” or using the
provisions of the SML/NJ compiler.
31.1 Cacheing Results
We begin with a discussion of memoization to increase the efficiency of
computing a recursively-defined function whose pattern of recursion
involves a substantial amount of redundant computation. The problem is
to compute the number of ways to parenthesize an expression consisting
of a product of n factors, as a function of n. For example, the
expression 2 ∗ 3 ∗ 4 ∗ 5 can be parenthesized in 5 ways:
((2 ∗ 3) ∗ 4) ∗ 5, (2 ∗ (3 ∗ 4)) ∗ 5, (2 ∗ 3) ∗ (4 ∗ 5), 2 ∗ (3 ∗ (4 ∗ 5)), 2 ∗ ((3 ∗ 4) ∗ 5).
A simple recurrence expresses the number of ways of parenthesizing a
product of n factors:
fun sum f 0 = 0
  | sum f n = (f n) + sum f (n-1)
fun p 1 = 1
  | p n = sum (fn k => (p k) * (p (n-k))) (n-1)
where sum f n computes the sum of values of a function f(k) with 1 ≤ k ≤
n. This program is extremely inefficient because of the redundancy in the
pattern of the recursive calls.
What can we do about this problem? One solution is to be clever and
solve the recurrence. As it happens this recurrence has a closed-form
solution (the Catalan numbers). But in many cases there is no known closed
form, and something else must be done to cut down the overhead. In this
case a simple cacheing technique proves effective. The idea is to maintain
a table of values of the function that is filled in whenever the function is
applied. If the function is called on an argument n, the table is consulted
to see whether the value has already been computed; if so, it is simply
returned. If not, we compute the value and store it in the table for future
use. This ensures that no redundant computations are performed. We will
maintain the table as an array so that its entries can be accessed in constant
time. The penalty is that the array has a fixed size, so we can only record
the values of the function at some pre-determined set of arguments. Once
we exceed the bounds of the table, we must compute the value the “hard
way”. An alternative is to use a dictionary (e.g., a balanced binary search
tree) which has no a priori size limitation, but which takes logarithmic time
to perform a lookup. For simplicity we’ll use a solution based on arrays.
Here’s the code to implement a memoized version of the parenthesization
function:
local
  val limit = 100
  val memopad : int option Array.array = Array.array (limit, NONE)
in
  fun p' 1 = 1
    | p' n = sum (fn k => (p k) * (p (n-k))) (n-1)
  and p n =
      if n < limit then
        case Array.sub (memopad, n) of
          SOME r => r
        | NONE =>
            let
              val r = p' n
            in
              Array.update (memopad, n, SOME r);
              r
            end
      else
        p' n
end
The main idea is to modify the original deﬁnition so that the recursive
calls consult and update the memopad. The “exported” version of the
function is the one that refers to the memo pad. Notice that the deﬁnitions
of p and p’ are mutually recursive!
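The saving can be measured by counting how often the body of the recurrence is evaluated. The following sketch is an illustration under stated assumptions: the counters naiveCalls and memoCalls and the name naive are hypothetical instrumentation added for this comparison, and the argument 10 is chosen small enough that the result fits comfortably in a machine integer.

```sml
fun sum f 0 = 0
  | sum f n = f n + sum f (n - 1)

(* Naive version, instrumented with a call counter. *)
val naiveCalls = ref 0
fun naive 1 = (naiveCalls := !naiveCalls + 1; 1)
  | naive n = (naiveCalls := !naiveCalls + 1;
               sum (fn k => naive k * naive (n - k)) (n - 1))

(* Memoized version; only evaluations of the recurrence are counted. *)
val memoCalls = ref 0
local
  val limit = 100
  val memopad : int option Array.array = Array.array (limit, NONE)
in
  fun p' 1 = (memoCalls := !memoCalls + 1; 1)
    | p' n = (memoCalls := !memoCalls + 1;
              sum (fn k => p k * p (n - k)) (n - 1))
  and p n =
      if n < limit then
        (case Array.sub (memopad, n) of
             SOME r => r
           | NONE =>
               let val r = p' n
               in Array.update (memopad, n, SOME r); r end)
      else p' n
end

val slow = naive 10   (* same value as fast, many more evaluations *)
val fast = p 10
```

After running this, !memoCalls is the number of distinct arguments encountered, whereas !naiveCalls grows exponentially with the argument.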
31.2 Laziness
Lazy evaluation is a combination of delayed evaluation and memoization.
Delayed evaluation is implemented using thunks, functions of type
unit -> 'a. To delay the evaluation of an expression exp of type 'a, simply
write fn () => exp. This is a value of type unit -> 'a; the expression exp is
effectively “frozen” until the function is applied. To “thaw” the expression,
simply apply the thunk to the null tuple, (). Here’s a simple example:
val thunk =
    fn () => print "hello" (* nothing printed *)
val _ = thunk () (* prints hello *)
While this example is especially simple-minded, remarkable effects can
be achieved by combining delayed evaluation with memoization. To do
so, we will consider the following signature of suspensions:
signature SUSP =
sig
  type 'a susp
  val force : 'a susp -> 'a
  val delay : (unit -> 'a) -> 'a susp
end
The function delay takes a suspended computation (in the form of a
thunk) and yields a suspension. Its job is to “memoize” the suspension
so that the suspended computation is evaluated at most once: once the
result is computed, the value is stored in a reference cell so that subsequent
forces are fast. The implementation is slick. Here’s the code to do it:
structure Susp :> SUSP =
struct
  type 'a susp = unit -> 'a
  fun force t = t ()
  fun delay (t : 'a susp) =
      let
        exception Impossible
        val memo : 'a susp ref =
            ref (fn () => raise Impossible)
        fun t' () =
            let val r = t ()
            in memo := (fn () => r); r end
      in
        memo := t';
        fn () => (!memo)()
      end
end
It’s worth discussing this code in detail because it is rather tricky.
Suspensions are just thunks; force simply applies the suspension to the null
tuple to force its evaluation. What about delay? When applied, delay
allocates a reference cell containing a thunk that, if forced, raises an internal
exception. This can never happen for reasons that will become apparent in
a moment; it is merely a placeholder with which we initialize the reference
cell. We then define another thunk t’ that, when forced, does three things:
1. It forces the thunk t to obtain its value r.
2. It replaces the contents of the memo pad with the constant function
   that immediately returns r.
3. It returns r as result.
We then assign t’ to the memo pad (hence obliterating the placeholder),
and return a thunk dt that, when forced, simply forces the contents of the
memo pad. Whenever dt is forced, it immediately forces the contents of
the memo pad. However, the contents of the memo pad change as a result
of forcing it, so that subsequent forces exhibit different behavior.
Specifically, the first time dt is forced, it forces the thunk t’, which then
forces t to obtain its value r, “zaps” the memo pad, and returns r. The second
time dt is forced, it forces the contents of the memo pad, as before, but this
time it contains the constant function that immediately returns r. Altogether
we have ensured that t is forced at most once by using a form of
“self-modifying” code.
Here’s an example to illustrate the effect of delaying a thunk:
val t = Susp.delay (fn () => print "hello")
val _ = Susp.force t (* prints hello *)
val _ = Susp.force t (* silent *)
Notice that hello is printed once, not twice! The reason is that the
suspended computation is evaluated at most once, so the message is
printed at most once on the screen.
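The same effect can be observed without printing by counting how often
the suspended computation runs. The following is a minimal sketch: the
SUSP signature is an assumption (it is not reproduced in this section),
and the names counter, s, first, second and runs are introduced here
purely for illustration.

```sml
(* Assumed signature; the text does not show SUSP in this section. *)
signature SUSP =
sig
  type 'a susp
  val force : 'a susp -> 'a
  val delay : (unit -> 'a) -> 'a susp
end

structure Susp :> SUSP =
struct
  type 'a susp = unit -> 'a
  fun force t = t ()
  fun delay (t : 'a susp) =
    let
      exception Impossible
      val memo : 'a susp ref = ref (fn () => raise Impossible)
      fun t' () =
        let val r = t ()
        in memo := (fn () => r); r end
    in
      memo := t';
      fn () => (!memo) ()
    end
end

(* counter records how many times the suspended computation runs. *)
val counter = ref 0
val s = Susp.delay (fn () => (counter := !counter + 1; 6 * 7))

val first = Susp.force s    (* runs the thunk and fills the memo pad *)
val second = Susp.force s   (* answered from the memo pad *)
val runs = !counter         (* 1: the thunk ran only once *)
```

Both forces yield 42, yet the counter is incremented only once, which is
exactly the “at most once” guarantee described above.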
31.3 Lazy Data Types in SML/NJ
The lazy datatype declaration (please see chapter 15 for a description
of the SML/NJ lazy data type mechanism)
datatype lazy 'a stream = Cons of 'a * 'a stream
expands into the following pair of type declarations
datatype 'a stream! = Cons of 'a * 'a stream
withtype 'a stream = 'a stream! Susp.susp
The first defines the type of stream values, the result of forcing a
stream computation; the second defines the type of stream computations,
which are suspensions yielding stream values. Thus streams are
represented by suspended (unevaluated, memoized) computations of stream
values, which are formed by applying the constructor Cons to a value
and another stream.
The value constructor Cons, when used to build a stream, automatically
suspends computation. This is achieved by regarding Cons e as shorthand
for Susp.delay (fn () => Cons e). When used in a pattern, the value
constructor Cons induces a use of force. For example, the binding
val Cons (h, t) = e
becomes
val Cons (h, t) = Susp.force e
which forces the right-hand side before performing pattern matching.
A similar transformation applies to non-lazy function definitions: the
argument is forced before pattern matching commences. Thus the “eager”
tail function
fun stl (Cons (_, t)) = t
expands into
fun stl! (Cons (_, t)) = t
and stl s = stl! (Susp.force s)
which forces the argument as soon as it is applied.
On the other hand, lazy function definitions defer pattern matching
until the result is forced. Thus the lazy tail function
fun lazy lstl (Cons (_, t)) = t
expands into
fun lstl! (Cons (_, t)) =
  t
and lstl s =
  Susp.delay (fn () => lstl! (Susp.force s))
which yields a suspension that, when forced, performs the pattern match.
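The difference between the eager and lazy tail functions can be
sketched by writing the expansion out by hand in ordinary SML. Several
things here are assumptions made for illustration: the SUSP signature,
the name stream_val (standing in for the internal name written with an
exclamation mark above, which is not a legal SML identifier), the helper
from, and the counter forced. The extra Susp.force in lstl is an
adjustment made in this sketch so that the result has type 'a stream.

```sml
(* Compact copy of Susp; the signature is assumed. *)
structure Susp :>
sig
  type 'a susp
  val force : 'a susp -> 'a
  val delay : (unit -> 'a) -> 'a susp
end =
struct
  type 'a susp = unit -> 'a
  fun force t = t ()
  fun delay (t : 'a susp) =
    let
      exception Impossible
      val memo : 'a susp ref = ref (fn () => raise Impossible)
      fun t' () = let val r = t () in memo := (fn () => r); r end
    in
      memo := t'; fn () => (!memo) ()
    end
end

(* Hand expansion of the lazy stream datatype. *)
datatype 'a stream_val = Cons of 'a * 'a stream
withtype 'a stream = 'a stream_val Susp.susp

(* forced counts how many stream cells have been computed so far. *)
val forced = ref 0

(* Hypothetical helper: the stream n, n+1, n+2, ... *)
fun from n =
  Susp.delay (fn () => (forced := !forced + 1; Cons (n, from (n + 1))))

(* Eager tail: forces its argument as soon as it is applied. *)
fun stl s = let val Cons (_, t) = Susp.force s in t end

(* Lazy tail: nothing is forced until the result itself is forced. *)
fun lstl s =
  Susp.delay (fn () =>
    let val Cons (_, t) = Susp.force s in Susp.force t end)

val s = from 0
val t1 = lstl s            (* no cell of s has been computed yet *)
val lazyCount = !forced
val t2 = stl s             (* computes the first cell of s *)
val eagerCount = !forced
val Cons (h, _) = Susp.force t1   (* head of the tail of s *)
```

Applying lstl leaves the counter at 0, whereas applying stl raises it to
1; forcing t1 afterwards yields the cell whose head is 1.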
Finally, the recursive stream deﬁnition
val rec lazy ones = Cons (1, ones)
expands into the following recursive function deﬁnition:
val rec ones = Susp.delay (fn () => Cons (1, ones))
Unfortunately this is not quite legal in SML since the right-hand side
involves an application of a function to another function. Such
definitions can be supported either by extending SML to admit them, or
by extending the Susp package to include an operation for building
recursive suspensions such as this one. Since it is an interesting
exercise in itself, we’ll explore the latter alternative.
31.4 Recursive Suspensions
We seek to add a function to the Susp package with signature
val loopback : ('a susp -> 'a susp) -> 'a susp
that, when applied to a function f mapping suspensions to suspensions,
yields a suspension s whose behavior is the same as f (s), the
application of f to the resulting suspension. In the above example the
function in question is
fun ones_loop s = Susp.delay (fn () => Cons (1, s))
We use loopback to define ones as follows:
val ones = Susp.loopback ones_loop
The idea is that ones should be equivalent to Susp.delay (fn () => Cons
(1, ones)), as in the original definition. This is indeed the result of
evaluating Susp.loopback ones_loop, assuming Susp.loopback is
implemented properly.
How is loopback implemented? We use a technique known as backpatching.
Here’s the code:
fun loopback f =
  let
    exception Circular
    val r = ref (fn () => raise Circular)
    val t = fn () => (!r) ()
  in
    r := f t; t
  end
First we allocate a reference cell which is initialized to a placeholder
that, if forced, raises the exception Circular. Then we deﬁne a thunk that,
when forced, forces the contents of this reference cell. This will be the
return value of loopback. But before returning, we assign to the reference
cell the result of applying the given function to the result thunk. This “ties
the knot” to ensure that the output is “looped back” to the input. Observe
that if the loop function touches its input suspension before yielding an
output suspension, the exception Circular will be raised.
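To see the knot being tied, here is a minimal self-contained sketch that
adds loopback to a copy of Susp and builds the stream of ones. The
signature, the type name stream_val, and the helper take are assumptions
introduced for illustration only.

```sml
structure Susp :>
sig
  type 'a susp
  val force : 'a susp -> 'a
  val delay : (unit -> 'a) -> 'a susp
  val loopback : ('a susp -> 'a susp) -> 'a susp
end =
struct
  type 'a susp = unit -> 'a
  fun force t = t ()
  fun delay (t : 'a susp) =
    let
      exception Impossible
      val memo : 'a susp ref = ref (fn () => raise Impossible)
      fun t' () = let val r = t () in memo := (fn () => r); r end
    in
      memo := t'; fn () => (!memo) ()
    end
  fun loopback f =
    let
      exception Circular
      val r = ref (fn () => raise Circular)
      val t = fn () => (!r) ()
    in
      r := f t; t
    end
end

datatype 'a stream_val = Cons of 'a * 'a stream
withtype 'a stream = 'a stream_val Susp.susp

fun ones_loop s = Susp.delay (fn () => Cons (1, s))
val ones : int stream = Susp.loopback ones_loop

(* Hypothetical helper: the first n elements of a stream, as a list. *)
fun take (0, _) = []
  | take (n, s) =
      let val Cons (h, t) = Susp.force s
      in h :: take (n - 1, t) end

val firstThree = take (3, ones)

(* A loop function that forces its input before yielding an output
   hits the placeholder, so an exception escapes from loopback. *)
val circular =
  (Susp.loopback (fn (s : int Susp.susp) => (Susp.force s; s)); false)
  handle _ => true
```

Here firstThree is [1, 1, 1]. The second binding illustrates the closing
remark: the ill-behaved loop function touches its input too early, so
circular is true.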
31.5 Sample Code
Here is the code for this chapter.
Chapter 32
Data Abstraction
An abstract data type (ADT) is a type equipped with a set of operations
for manipulating values of that type. An ADT is implemented by
providing a representation type for the values of the ADT and an
implementation for the operations defined on values of the
representation type. What makes an ADT abstract is that the
representation type is hidden from clients of the ADT. Consequently,
the only operations that may be performed on a value of the ADT are the
given ones. This ensures that the representation may be changed without
affecting the behavior of the client: since the representation is
hidden from it, the client cannot depend on it. This also facilitates
the implementation of efficient data structures by imposing a
condition, called a representation invariant, on the representation
that is preserved by the operations of the type. Each operation that
takes a value of the ADT as argument may assume that the representation
invariant holds. In compensation each operation that yields a value of
the ADT as result must guarantee that the representation invariant
holds of it. If the operations of the ADT preserve the representation
invariant, then it must truly be invariant: no other code in the system
could possibly disrupt it. Put another way, any violation of the
representation invariant may be localized to the implementation of one
of the operations. This significantly reduces the time required to find
an error in a program.
32.1 Dictionaries
To make these ideas concrete we will consider the abstract data type of
dictionaries. A dictionary is a mapping from keys to values. For
simplicity we take keys to be strings, but it is possible to define a
dictionary for any ordered type; the values associated with keys are
completely arbitrary. Viewed as an ADT, a dictionary is a type 'a dict
of dictionaries mapping strings to values of type 'a together with
empty, insert, and lookup operations that create a new dictionary,
insert a value with a given key, and retrieve the value associated with
a key (if any). In short a dictionary is an implementation of the
following signature:
signature DICT =
sig
  type key = string
  type 'a entry = key * 'a
  type 'a dict
  exception Lookup of key
  val empty : 'a dict
  val insert : 'a dict * 'a entry -> 'a dict
  val lookup : 'a dict * key -> 'a
end
Notice that the type 'a dict is not specified in the signature, whereas
the types key and 'a entry are defined to be string and string * 'a,
respectively.
32.2 Binary Search Trees
A simple implementation of a dictionary is a binary search tree. A
binary search tree is a binary tree with values of an ordered type at
the nodes arranged in such a way that for every node in the tree, the
value at that node is greater than the value at any node in the left
child of that node, and smaller than the value at any node in the right
child. It follows immediately that no two nodes in a binary search tree
are labelled with the same value. The binary search tree property is an
example of a representation invariant on an underlying data structure.
The underlying structure is a binary tree with values at the nodes; the
representation invariant isolates a set of structures satisfying some
additional, more stringent, conditions.
We may use a binary search tree to implement a dictionary as follows:
structure BinarySearchTree :> DICT =
struct
  type key = string
  type 'a entry = key * 'a
  (* Rep invariant: 'a tree is a binary search tree *)
  datatype 'a tree =
      Empty
    | Node of 'a tree * 'a entry * 'a tree
  type 'a dict = 'a tree
  exception Lookup of key
  val empty = Empty
  fun insert (Empty, entry) =
        Node (Empty, entry, Empty)
    | insert (n as Node (l, e as (k, _), r), e' as (k', _)) =
        (case String.compare (k', k)
           of LESS => Node (insert (l, e'), e, r)
            | GREATER => Node (l, e, insert (r, e'))
            | EQUAL => n)
  fun lookup (Empty, k) = raise (Lookup k)
    | lookup (Node (l, (k, v), r), k') =
        (case String.compare (k', k)
           of EQUAL => v
            | LESS => lookup (l, k')
            | GREATER => lookup (r, k'))
end
Notice that empty is deﬁned to be a valid binary search tree, that insert
yields a binary search tree if its argument is one, and that lookup relies
on its argument being a binary search tree (if not, it might fail to ﬁnd a
key that in fact occurs in the tree!). The structure BinarySearchTree is
sealed with the signature DICT to ensure that the representation type is
held abstract.
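A short client session may help to see the sealed structure in use. The
sketch below repeats the signature and structure so that it stands on
its own; the abbreviation D, the sample keys, and the bindings d0
through d3, pear and missing are illustrative assumptions.

```sml
signature DICT =
sig
  type key = string
  type 'a entry = key * 'a
  type 'a dict
  exception Lookup of key
  val empty : 'a dict
  val insert : 'a dict * 'a entry -> 'a dict
  val lookup : 'a dict * key -> 'a
end

structure BinarySearchTree :> DICT =
struct
  type key = string
  type 'a entry = key * 'a
  datatype 'a tree = Empty | Node of 'a tree * 'a entry * 'a tree
  type 'a dict = 'a tree
  exception Lookup of key
  val empty = Empty
  fun insert (Empty, entry) = Node (Empty, entry, Empty)
    | insert (n as Node (l, e as (k, _), r), e' as (k', _)) =
        (case String.compare (k', k)
           of LESS => Node (insert (l, e'), e, r)
            | GREATER => Node (l, e, insert (r, e'))
            | EQUAL => n)
  fun lookup (Empty, k) = raise (Lookup k)
    | lookup (Node (l, (k, v), r), k') =
        (case String.compare (k', k)
           of EQUAL => v
            | LESS => lookup (l, k')
            | GREATER => lookup (r, k'))
end

structure D = BinarySearchTree

val d0 : int D.dict = D.empty
val d1 = D.insert (d0, ("pear", 3))
val d2 = D.insert (d1, ("apple", 1))
val d3 = D.insert (d2, ("pear", 99))   (* key already present *)

val pear = D.lookup (d3, "pear")       (* the EQUAL case keeps 3 *)
val missing =
  (D.lookup (d3, "kiwi"); "found") handle D.Lookup k => "no " ^ k
```

Because the structure is sealed, a client cannot build a Node directly;
every dictionary it holds was produced by empty and insert, so the
binary search tree invariant that lookup relies on always holds.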
32.3 Balanced Binary Search Trees
The difficulty with binary search trees is that they may become
unbalanced. In particular if we insert keys in ascending order, the
representation is essentially just a list! The left child of each node
is empty; the right child is the rest of the dictionary. Consequently,
it takes O(n) time in the worst case to perform a lookup on a
dictionary containing n elements. Such a tree is said to be unbalanced
because the children of a node have widely varying heights. Were it to
be the case that the children of every node had roughly equal height,
then the lookup would take O(lg n) time, a considerable improvement.
Can we do better? Many approaches have been suggested. One that we will
consider here is an instance of what is called a self-adjusting tree,
called a red-black tree (the reason for the name will be apparent
shortly). The general idea of a self-adjusting tree is that operations
on the tree may cause a reorganization of its structure to ensure that
some invariant is maintained. In our case we will arrange things so
that the tree is self-balancing, meaning that the children of any node
have roughly the same height. As we just remarked, this ensures that
lookup is efficient.
How is this achieved? By imposing a clever representation invariant on
the binary search tree, called the red-black tree condition. A
red-black tree is a binary search tree in which every node is colored
either red or black (with the empty tree being regarded as black) and
such that the following properties hold:
1. The children of a red node are black.
2. For any node in the tree, the number of black nodes on any two paths
from that node to a leaf is the same. This number is called the black
height of the node.
These two conditions ensure that a red-black tree is a balanced binary
search tree. Here’s why. First, observe that a red-black tree of black
height $h$ has at least $2^h - 1$ nodes. We may prove this by induction
on the structure of the red-black tree. The empty tree has black height
$0$ (it contains no nodes, even though we regard it as black), and it
has $0 = 2^0 - 1$ nodes, as required. Suppose we have a red node of
black height $h$. The black height of both children must be $h$, hence
each has at least $2^h - 1$ nodes, yielding a total of at least
$2(2^h - 1) + 1 = 2^{h+1} - 1$ nodes, which is at least $2^h - 1$. If,
on the other hand, we have a black node of black height $h$, then the
black height of both children is $h - 1$, and each has at least
$2^{h-1} - 1$ nodes, for a total of at least
$2(2^{h-1} - 1) + 1 = 2^h - 1$ nodes. Now, observe that a red-black
tree of height $h$ with $n$ nodes has black height at least
$\lfloor h/2 \rfloor$, since no path contains two consecutive red
nodes, and hence has at least $2^{\lfloor h/2 \rfloor} - 1$ nodes.
Consequently, $\lg(n + 1) \ge \lfloor h/2 \rfloor \ge (h - 1)/2$, so
$h \le 2 \lg(n + 1) + 1$. In other words, its height is logarithmic in
the number of nodes, which implies that the tree is height balanced.
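The two red-black conditions can also be phrased as a function that
computes the black height of a tree when the conditions hold. This is a
sketch under stated assumptions: the datatype mirrors the one used for
the red-black implementation later in this section, the empty tree is
given black height 0, and the names same, isRed, blackHeight, ok and bad
are introduced here for illustration.

```sml
datatype 'a dict =
    Empty
  | Red of (string * 'a) * 'a dict * 'a dict
  | Black of (string * 'a) * 'a dict * 'a dict

fun isRed (Red _) = true
  | isRed _ = false

(* Combine the black heights of two children; inc is 1 for a black
   parent and 0 for a red one. *)
fun same (SOME a, SOME b, inc) = if a = b then SOME (a + inc) else NONE
  | same _ = NONE

(* blackHeight t is SOME h when both conditions hold throughout t and
   every path from the root to a leaf passes h black nodes; it is NONE
   when either condition fails somewhere in t. *)
fun blackHeight Empty = SOME 0
  | blackHeight (Red (_, l, r)) =
      if isRed l orelse isRed r then NONE   (* red node with red child *)
      else same (blackHeight l, blackHeight r, 0)
  | blackHeight (Black (_, l, r)) =
      same (blackHeight l, blackHeight r, 1)

val ok =
  Black (("b", 2), Red (("a", 1), Empty, Empty),
                   Red (("c", 3), Empty, Empty))
val bad = Red (("b", 2), Red (("a", 1), Empty, Empty), Empty)

val okHeight = blackHeight ok     (* SOME 1 *)
val badHeight = blackHeight bad   (* NONE: red-red violation *)
```

The function checks only the coloring conditions; the ordering of keys
required of a binary search tree is a separate matter.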
To ensure logarithmic behavior, all we have to do is to maintain the
red-black invariant. The empty tree is a red-black tree, so the only
question is how to perform an insert operation. First, we insert the
entry as usual for a binary search tree, with the fresh node starting
out colored red. In doing so we do not disturb the black height
condition, but we might introduce a red-red violation, a situation in
which a red node has a red child. We then remove the red-red violation
by propagating it upwards towards the root by a constant-time
transformation on the tree (one of several possibilities, which we’ll
discuss shortly). These transformations either eliminate the red-red
violation outright, or, in logarithmic time, push the violation to the
root where it is neatly resolved by recoloring the root black (which
preserves the black-height invariant!).
The violation is propagated upwards by one of four rotations. We will
maintain the invariant that there is at most one red-red violation in
the tree. The insertion may or may not create such a violation, and
each propagation step will preserve this invariant. It follows that the
parent of a red-red violation must be black. Consequently, the
situation must look like this: a black node has a red child, which in
turn has a red child. This represents four distinct situations,
according to whether the uppermost red node is a left or right child of
the black node, and whether the red child of the red node is itself a
left or right child. In each case the red-red violation is propagated
upwards by transforming it to look like this: a red node whose two
children are black. Notice that by making the uppermost node red we may
be introducing a red-red violation further up the tree (since the black
node’s parent might have been red), and that we are preserving the
black-height invariant since the great-grandchildren of the black node
in the original situation will appear as children of the two black
nodes in the reorganized situation. Notice as well that the binary
search tree conditions are also preserved by this transformation. As a
limiting case if the red-red violation is propagated to the root of the
entire tree, we recolor the root black, which preserves the
black-height condition, and we are done rebalancing the tree.
Let’s look in detail at two of the four cases of removing a red-red
violation, those in which the uppermost red node is the left child of
the black node; the other two cases are handled symmetrically. In the
notation of the code below, if the situation looks like
Black (z, Red (y, Red (x, d1, d2), d3), d4), we reorganize the tree to
look like Red (y, Black (x, d1, d2), Black (z, d3, d4)). You should
check that the black-height and binary search tree invariants are
preserved by this transformation. Similarly, if the situation looks
like Black (z, Red (x, d1, Red (y, d2, d3)), d4), then we reorganize
the tree to look like Red (y, Black (x, d1, d2), Black (z, d3, d4))
(precisely as before). Once again, the black-height and binary search
tree invariants are preserved by this transformation, and the red-red
violation is pushed further up the tree.
Here is the ML code to implement dictionaries using a red-black tree.
Notice that the tree rotations are neatly expressed using pattern
matching.
structure RedBlackTree :> DICT =
struct
  type key = string
  type 'a entry = string * 'a
  (* Inv: binary search tree + red-black conditions *)
  datatype 'a dict =
      Empty
    | Red of 'a entry * 'a dict * 'a dict
    | Black of 'a entry * 'a dict * 'a dict
  val empty = Empty
  exception Lookup of key
  fun lookup (dict, key) =
    let
      fun lk (Empty) = raise (Lookup key)
        | lk (Red tree) = lk' tree
        | lk (Black tree) = lk' tree
      and lk' ((key1, datum1), left, right) =
        (case String.compare (key, key1)
           of EQUAL => datum1
            | LESS => lk left
            | GREATER => lk right)
    in
      lk dict
    end
  fun restoreLeft
        (Black (z, Red (y, Red (x, d1, d2), d3), d4)) =
        Red (y, Black (x, d1, d2), Black (z, d3, d4))
    | restoreLeft
        (Black (z, Red (x, d1, Red (y, d2, d3)), d4)) =
        Red (y, Black (x, d1, d2), Black (z, d3, d4))
    | restoreLeft dict = dict
  fun restoreRight
        (Black (x, d1, Red (y, d2, Red (z, d3, d4)))) =
        Red (y, Black (x, d1, d2), Black (z, d3, d4))
    | restoreRight
        (Black (x, d1, Red (z, Red (y, d2, d3), d4))) =
        Red (y, Black (x, d1, d2), Black (z, d3, d4))
    | restoreRight dict = dict
  fun insert (dict, entry as (key, datum)) =
    let
      (* val ins : 'a dict -> 'a dict  inserts entry *)
      (* ins (Red _) may have red-red at root *)
      (* ins (Black _) or ins (Empty) is red/black *)
      (* ins preserves black height *)
      fun ins (Empty) = Red (entry, Empty, Empty)
        | ins (Red (entry1 as (key1, datum1), left, right)) =
            (case String.compare (key, key1)
               of EQUAL => Red (entry, left, right)
                | LESS => Red (entry1, ins left, right)
                | GREATER => Red (entry1, left, ins right))
        | ins (Black (entry1 as (key1, datum1), left, right)) =
            (case String.compare (key, key1)
               of EQUAL => Black (entry, left, right)
                | LESS => restoreLeft (Black (entry1, ins left, right))
                | GREATER => restoreRight (Black (entry1, left, ins right)))
    in
      case ins dict
        of Red (t as (_, Red _, _)) => Black t (* recolor *)
         | Red (t as (_, _, Red _)) => Black t (* recolor *)
         | dict => dict
    end
end
It is worthwhile to contemplate the role played by the red-black
invariant in ensuring the correctness of the implementation and the
time complexity of the operations.
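One way to contemplate the effect on time complexity is to insert keys
in ascending order, the worst case for a plain binary search tree, and
measure the height of the result. The sketch below uses an unsealed copy
of the insertion code so that the tree can be inspected; the function
height, the sample keys, and the bindings keys, t and h are assumptions
introduced for illustration.

```sml
datatype 'a dict =
    Empty
  | Red of (string * 'a) * 'a dict * 'a dict
  | Black of (string * 'a) * 'a dict * 'a dict

fun restoreLeft (Black (z, Red (y, Red (x, d1, d2), d3), d4)) =
      Red (y, Black (x, d1, d2), Black (z, d3, d4))
  | restoreLeft (Black (z, Red (x, d1, Red (y, d2, d3)), d4)) =
      Red (y, Black (x, d1, d2), Black (z, d3, d4))
  | restoreLeft dict = dict

fun restoreRight (Black (x, d1, Red (y, d2, Red (z, d3, d4)))) =
      Red (y, Black (x, d1, d2), Black (z, d3, d4))
  | restoreRight (Black (x, d1, Red (z, Red (y, d2, d3), d4))) =
      Red (y, Black (x, d1, d2), Black (z, d3, d4))
  | restoreRight dict = dict

fun insert (dict, entry as (key, _)) =
  let
    fun ins Empty = Red (entry, Empty, Empty)
      | ins (Red (entry1 as (key1, _), left, right)) =
          (case String.compare (key, key1)
             of EQUAL => Red (entry, left, right)
              | LESS => Red (entry1, ins left, right)
              | GREATER => Red (entry1, left, ins right))
      | ins (Black (entry1 as (key1, _), left, right)) =
          (case String.compare (key, key1)
             of EQUAL => Black (entry, left, right)
              | LESS => restoreLeft (Black (entry1, ins left, right))
              | GREATER => restoreRight (Black (entry1, left, ins right)))
  in
    case ins dict
      of Red (t as (_, Red _, _)) => Black t
       | Red (t as (_, _, Red _)) => Black t
       | dict => dict
  end

(* Number of nodes on the longest path from the root to a leaf. *)
fun height Empty = 0
  | height (Red (_, l, r)) = 1 + Int.max (height l, height r)
  | height (Black (_, l, r)) = 1 + Int.max (height l, height r)

(* Insert the fifteen keys "a" .. "o" in ascending order. *)
val keys = map str (explode "abcdefghijklmno")
val t = foldl (fn (k, d) => insert (d, (k, ()))) Empty keys
val h = height t   (* 4 when traced by hand *)
```

A plain binary search tree built from the same sequence would
degenerate into a list of height 15; here the rotations keep the height
at 4.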
32.4 Abstraction vs. Run-Time Checking
You might wonder whether we could equally well use run-time checks to
enforce representation invariants. The idea would be to introduce a
“debug flag” that, when set, causes the operations of the dictionary to
check that the representation invariant holds of their arguments and
results. In the case of a binary search tree this is surely possible,
but at considerable expense since the time required to check the binary
search tree invariant is proportional to the size of the binary search
tree itself, whereas an insert (for example) can be performed in
logarithmic time. But wouldn’t we turn off the debug flag before
shipping the production copy of the code? Yes, indeed, but then the
benefits of checking are lost for the code we care about most! By using
the type system to enforce abstraction, we can confine the possible
violations of the representation invariant to the dictionary package
itself, and, moreover, we need not turn off the check for production
code because there is no run-time penalty for doing so.
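To make the cost concrete, here is a sketch of what such a run-time
check might look like for the binary search tree representation. The
names above, below, isBST, debug and checked are hypothetical; the text
does not prescribe any particular checker.

```sml
datatype 'a tree = Empty | Node of 'a tree * (string * 'a) * 'a tree

(* Optional strict lower and upper bounds on a key. *)
fun above (NONE, _) = true
  | above (SOME lo, k) = String.compare (k, lo) = GREATER
fun below (NONE, _) = true
  | below (SOME hi, k) = String.compare (k, hi) = LESS

(* isBST visits every node once, so each check costs time
   proportional to the size of the tree. *)
fun isBST t =
  let
    fun chk (Empty, _, _) = true
      | chk (Node (l, (k, _), r), lo, hi) =
          above (lo, k) andalso below (hi, k)
          andalso chk (l, lo, SOME k) andalso chk (r, SOME k, hi)
  in
    chk (t, NONE, NONE)
  end

(* Hypothetical debug flag guarding the check. *)
val debug = ref true

fun checked t =
  if !debug andalso not (isBST t)
  then raise Fail "representation invariant violated"
  else t

val good = Node (Node (Empty, ("a", 1), Empty), ("b", 2), Empty)
val bad = Node (Node (Empty, ("c", 1), Empty), ("b", 2), Empty)

val goodOk = isBST good
val badOk = isBST bad
val caught = (checked bad; false) handle Fail _ => true
```

Wrapping every operation in checked would turn each logarithmic-time
insert into a linear-time one, which is the expense the paragraph above
refers to.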
A more subtle point is that it may not always be possible to enforce
data abstraction at run-time. Efficiency considerations aside, you
might think that we can always replace static localization of
representation errors by dynamic checks for violations of them. But
this is false! One reason is that the representation invariant might
not be computable. As an example, consider an abstract type of total
functions on the integers, those that are guaranteed to terminate when
called, without performing any I/O or having any other computational
effect. No run-time check can be defined that ensures that a given
integer-valued function is total, since totality is undecidable. Yet we
can define an abstract type of total functions that, while not
admitting every possible total function on the integers as values,
provides a useful set of such functions as elements of a structure. By
using these specified operations to create a total function, we are in
effect encoding a proof of totality in the program itself.
Here’s a sketch of such a package:
signature TIF = sig
  type tif
  val apply : tif -> (int -> int)
  val id : tif
  val compose : tif * tif -> tif
  val double : tif
  ...
end
structure Tif :> TIF = struct
  type tif = int -> int
  fun apply t n = t n
  fun id x = x
  fun compose (f, g) = f o g
  fun double x = 2 * x
  ...
end
Should the application of some value of type Tif.tif fail to terminate,
we know where to look for the error. No run-time check can assure us
that an arbitrary integer function is in fact total.
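A client of such a package can only combine the functions it provides.
The following sketch keeps just the operations listed explicitly (the
elided ones are left out) and builds a new total function from them; the
names quadruple and twelve are illustrative.

```sml
signature TIF =
sig
  type tif
  val apply : tif -> (int -> int)
  val id : tif
  val compose : tif * tif -> tif
  val double : tif
end

structure Tif :> TIF =
struct
  type tif = int -> int
  fun apply t n = t n
  fun id x = x
  fun compose (f, g) = f o g
  fun double x = 2 * x
end

(* Built solely from the operations of the package. *)
val quadruple = Tif.compose (Tif.double, Tif.double)
val twelve = Tif.apply quadruple 3
val same = Tif.apply Tif.id 7

(* Tif.apply (fn n => n + 1) 3 is rejected by the type checker:
   an arbitrary int -> int function is not a Tif.tif. *)
```

Because tif is held abstract, every value of that type has a
construction history consisting only of id, double and compose, which is
what justifies the claim of totality.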
Another reason why a run-time check to enforce data abstraction is
impossible is that it may not be possible to tell from looking at a
given value whether or not it is a legitimate value of the abstract
type. Here’s an example. In many operating systems processes are
“named” by integer-valued process identifiers. Using the process
identifier we may send messages to the process, cause it to terminate,
or perform any number of other operations on it. The thing to notice
here is that any integer at all is a possible process identifier; we
cannot tell by looking at the integer whether it is indeed valid. No
run-time check on the value will reveal whether a given integer is a
“real” or “bogus” process identifier. The only way to know is to
consider the “history” of how that integer came into being, and what
operations were performed on it. Using the abstraction mechanisms just
described, we can enforce the requirement that a value of type pid,
whose underlying representation is int, is indeed a process identifier.
You are invited to imagine how this might be achieved in ML.
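One possible answer, offered only as a sketch: seal a structure so that
the sole way to obtain a pid is through an operation that creates a
process. Everything below is hypothetical, including the signature PID,
the operations spawn, toString and same, and the counter that stands in
for a real operating-system call.

```sml
signature PID =
sig
  type pid
  val spawn : unit -> pid          (* the only source of pid values *)
  val toString : pid -> string
  val same : pid * pid -> bool
end

structure Pid :> PID =
struct
  type pid = int
  (* Stand-in for process creation: hand out fresh integers. *)
  val next = ref 0
  fun spawn () = (next := !next + 1; !next)
  fun toString p = "pid " ^ Int.toString p
  fun same (p : pid, q : pid) = (p = q)
end

val p = Pid.spawn ()
val q = Pid.spawn ()
val distinct = not (Pid.same (p, q))
val shown = Pid.toString p

(* Pid.toString 17 is rejected by the type checker: 17 is an int,
   and no client can turn an int into a pid. *)
```

The integer 17 may or may not name a process, but the question never
arises: the history of every pid value is guaranteed by the type system
to begin with spawn.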
32.5 Sample Code
Here is the code for this chapter.
Chapter 33
Representation Independence and
ADT Correctness
This chapter is concerned with proving correctness of ADT
implementations by exhibiting a simulation relation between a reference
implementation (taken, or known, to be correct) and a candidate
implementation (whose correctness is to be established). The
methodology generalizes Hoare’s notion of abstraction functions to an
arbitrary relation, and relies on Reynolds’ notion of parametricity to
conclude that related implementations engender the same observable
behavior in all clients.
33.1 Sample Code
Here is the code for this chapter.
Chapter 34
Modularity and Reuse
1. Naming conventions.
2. Exploiting structural subtyping (type t convention).
3. Impedance-matching functors.
34.1 Sample Code
Here is the code for this chapter.
Chapter 35
Dynamic Typing and Dynamic
Dispatch
This chapter is concerned with dynamic typing in a statically typed
language. It is commonly thought that there is an “opposition” between
statically-typed languages (such as Standard ML) and dynamically-typed
languages (such as Scheme). In fact, dynamically typed languages are a
special case of statically-typed languages! We will demonstrate this by
exhibiting a faithful representation of Scheme inside of ML.
35.1 Sample Code
Here is the code for this chapter.
Chapter 36
Concurrency
In this chapter we consider some fundamental techniques for concurrent
programming using CML.
36.1 Sample Code
Here is the code for this chapter.
Part V
Appendices
Appendix A
The Standard ML Basis Library
The Standard ML Basis Library is a collection of modules providing a
basic collection of abstract types that are shared by all
implementations of Standard ML. All of the primitive types of Standard
ML are defined in structures in the Standard Basis. It also defines a
variety of other commonly used abstract types.
Most implementations of Standard ML include module libraries
implementing a wide variety of services. These libraries are usually
not portable across implementations, particularly not those that are
concerned with the internals of the compiler or its interaction with
the host computer system. Please refer to the documentation of your
compiler for information on its libraries.
Appendix B
Compilation Management
All program-development environments provide tools to support building
systems out of collections of separately-developed modules. These tools
usually provide services such as:
1. Source code management such as version and revision control.
2. Separate compilation and linking to support simultaneous development
and to reduce build times.
3. Libraries of reusable modules with consistent conventions for
identifying modules and their components.
4. Release management for building and disseminating systems for
general use.
Different languages, and different vendors, support these activities in
different ways. Some rely on generic tools, such as the familiar Unix
tools, others provide proprietary tools, commonly known as IDEs
(integrated development environments).
Most implementations of Standard ML rely on a combination of generic
program development tools and tools specific to that implementation of
the language. Rather than attempt to summarize all of the known
implementations, we will instead consider the SML/NJ Compilation
Manager (CM) as a representative program development framework for ML.
Other compilers provide similar tools; please consult your compiler’s
documentation for details of how to use them.
B.1 Overview of CM
B.2 Building Systems with CM
B.3 Sample Code
Here is the code for this chapter.
Appendix C
Sample Programs
A number of example programs illustrating the concepts discussed in the
preceding chapters are available in the Sample Code directory on the
worldwide web.
Revision History
Revision  Date      Author(s)  Description
1.0       21.01.11  RH         Created
1.1       10.02.11  RH         Expanded HOF’s techniques
1.2       11.02.11  RH         Added matching combinators to
                               higher-order function techniques
Bibliography
[1] Emden R. Gansner and John H. Reppy, editors. The Standard ML Basis
Library. Cambridge University Press, 2000.
[2] Peter Lee. Standard ML at Carnegie Mellon. Available within CMU at
http://www.cs.cmu.edu/afs/cs/local/sml/common/smlguide.
[3] Robin Milner, Mads Tofte, Robert Harper, and David MacQueen. The
Deﬁnition of Standard ML (Revised). MIT Press, 1997.
. . . . . . . . . . . . 212 212 214 215 216 217 219 R EVISED 11. . . . . . . . . . . . . . 207 25. . . . . . . . .3 Transparent Ascription . . . . . . 23. . . . . . . . . . . . . . . . 24. 26.11 D RAFT V ERSION 1. . . . .1 Combining Abstractions . . . . . . . . . .3 Avoiding Sharing Speciﬁcations . .1 Exponentiation . . . . . . . . . . . . . . . . . 164 21 Module Hierarchies 165 21. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2 The GCD Algorithm . . . . . . . . 26. . . . . . . . . . . .02. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165 21. 173 22 Sharing Speciﬁcations 174 22. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .1 Natural Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5 Abstracting Induction . 181 23 Parameterization 23. . . . . . . . . . . . . . .3 Trees . .2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211 26 Structural Induction 26. . . . . . . . . . .1 Functor Bindings and Applications . . . . . 26. . . . . . . 174 22. . . . . . . . . . . . . . . . . . . . . . . . . .2 Correctness Proofs . . 192 194 194 196 199 25 Induction and Recursion 202 25.2 Sample Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182 182 185 187 191 IV Programming Techniques 24 Speciﬁcations and Correctness 24. . . . . . . . . . . . . . . . . . . . . . . .3 Enforcement and Compliance . . .CONTENTS ix 20. . . . . .6 Sample Code .3 Sample Code . . .1 Speciﬁcations . . . . . . . . . . . . .1 Substructures . . . . . . . . . . . . . . . . . . . . . . 23. . . . . . . . . . . 26. . . . . . . . .4 Generalizations and Limitations 26. . . .4 Sample Code . . . . . . . . . . . . . .2 Functors and Sharing Speciﬁcations 23. . . . . . . . . . . . . . . 162 20. . . . . 24. . . . . . 202 25. . . . . .2 Sample Code . . . . . . . . . . . . . . . . . . .2 Lists . . .4 Sample Code . . . . . . . . . .
. . . . . . . . . . . . . . . . . 29. . . . . . . . . . . . . . . . . . . . . . . . Revisited 30. . . 232 28. . . . . . . 30. .4 Sample Code . . . 239 239 241 242 244 246 247 248 251 254 256 257 257 259 261 263 264 265 266 266 268 272 273 30 HigherOrder Functions 30. . . . . . . . . . . and Continuations 29. . . . . . . . . . . . . . . . . . . . . . . . 30. . . . . . . . . . . . . . . . 33 Representation Independence and ADT Correctness 274 33. . . . . . . . . . .3 Sample Code . . . . . . . . . . . . . . . . . . . . . . . . . . RunTime Checking 32. . . . . . . . . . . . . . . . . . . . . . 32. .3 Balanced Binary Search Trees . . . . . . . . . .2 Specifying the Matcher . . . . . . . . . . . . . . . . . . . . . . . .CONTENTS x 27 ProofDirected Debugging 220 27. . . . .1 Dictionaries . . . . .1 Sample Code . 32. . . . . . . . . .4 Recursive Suspensions . . . .3 Sample Code . . . . . . . . . . . . . . . . . . . . . . . . . . 31. . . . . . . . . . . . . . . . . . . . . . . . . . . . .3 Solution Using Exceptions . . . . . . . . . . . . . . . . . . . . . . . . . .5 Sample Code . . . . . . . . . . . . . 29. . . . . . . . . . . . .1 Inﬁnite Sequences . . . . . . . . . .1 The nQueens Problem . . . . . . . . . 222 27. . . . . . . . . . . . . . . . . . . . .2 Amortized Analysis . . . . 31 Memoization 31. . . . . . . . 274 R EVISED 11. . . . .02. . . .5 Sample Code . . . . . . . . . . . 31. . .3 Regular Expression Matching. . . . . . . . . . . . . . Exceptions. . . . . . . . . . . . . . . . .11 D RAFT V ERSION 1. .3 Lazy Data Types in SML/NJ 31. . . .2 Laziness . . . . . . . . . . . . . . . .1 Persistent Queues . . .1 Cacheing Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31.5 Sample Code . . . . . . . . . . . . .2 . . . . . . . . . . . 220 27. . . . . . .2 Solution Using Options . . . 32 Data Abstraction 32. . . . .2 Binary Search Trees . . . . . . .4 Abstraction vs. . . 29. . . . . . . 29. . . . . . . . . . . . . . . . . . . . . 
228 28 Persistent and Ephemeral Data Structures 229 28. . . . . . . . . . . . . . . .2 Circuit Simulation . . . . . . . . . . . .1 Regular Expressions and Languages . . . .4 Solution Using Continuations . . . 235 28. . . . . . . . . . . . . . . 238 29 Options. . . . . 32. . . . . . . .
. . . . . . . . . . . . . . . .1 Sample Code . . . . . . . . . . . 281 B.2 Building Systems with CM .11 D RAFT V ERSION 1. . . . 275 35 Dynamic Typing and Dynamic Dispatch 276 35. . 276 36 Concurrency 277 36. .02. .3 Sample Code . . .1 Sample Code . . . . . . .1 Overview of CM . . . . . . . . . . . . . 277 V Appendices A The Standard ML Basis Library 278 279 B Compilation Management 280 B. . . . . . . . . . . . . . . . . . . . .1 Sample Code . . . . . . . . . . . . . . . . . . . . .2 . . . . . . .CONTENTS xi 34 Modularity and Reuse 275 34. . . . . . . . . . . . . . . . . . . . . . . . . . . . 281 C Sample Programs 282 R EVISED 11. . . . . . . . . . . . . . 281 B. . . . . . . . . . . . . . .
.
Part I Overview
Standard ML is a type-safe programming language that embodies many innovative ideas in programming language design. It is a statically typed language, with an extensible type system. It supports polymorphic type inference, which all but eliminates the burden of specifying types of variables and greatly facilitates code reuse. It provides efficient automatic storage management for data structures and functions. It encourages functional (effect-free) programming where appropriate, but allows imperative (effectful) programming where necessary. It facilitates programming with recursive and symbolic data structures by supporting the definition of functions by pattern matching. It features an extensible exception mechanism for handling error conditions and effecting nonlocal transfers of control. It provides a richly expressive and flexible module system for structuring large programs, including mechanisms for enforcing abstraction, imposing hierarchical structure, and building generic modules. It is portable across platforms and implementations because it has a precise definition. It provides a portable standard basis library that defines a rich collection of commonly used types and routines.

Many implementations go beyond the standard to provide experimental language features, extensive libraries of commonly used routines, and useful program development tools. Details can be found with the documentation for your compiler, but here's some of what you may expect. Most implementations provide an interactive system supporting online program development, including tools for compiling, linking, and analyzing the behavior of programs. A few implementations are "batch compilers" that rely on the ambient operating system to manage the construction of large programs from compiled parts. Nearly every compiler generates native machine code, even when used interactively, but some also generate code for a portable abstract machine. Most implementations support separate compilation and provide tools for managing large systems and shared libraries. Some implementations provide tools for tracing and stepping programs; many provide tools for time and space profiling. Most implementations supplement the standard basis library with a rich collection of handy components such as dictionaries, hash tables, or interfaces to the ambient operating system. Some implementations support language extensions such as support for concurrent programming (using message passing or locking), richer forms of modularity constructs, and support for "lazy" data structures.
Chapter 1 Programming in Standard ML

1.1 A Regular Expression Package

To develop a feel for the language and how it is used, let us consider the implementation of a package for matching strings against regular expressions. We'll structure the implementation into two modules, an implementation of regular expressions themselves and an implementation of a matching algorithm for them.

These two modules are concisely described by the following signatures.

    signature REGEXP = sig
      datatype regexp =
        Zero | One | Char of char |
        Plus of regexp * regexp |
        Times of regexp * regexp |
        Star of regexp
      exception SyntaxError of string
      val parse : string -> regexp
      val format : regexp -> string
    end

    signature MATCHER = sig
      structure RegExp : REGEXP
      val accepts : RegExp.regexp -> string -> bool
    end

The signature REGEXP describes a module that implements regular expressions. It consists of a description of the abstract syntax of regular expressions, together with operations for parsing and unparsing them. The signature MATCHER describes a module that implements a matcher for a given notion of regular expression. It contains a function accepts that, when given a regular expression, returns a function that determines whether or not that expression accepts a given string. Obviously the matcher is dependent on the implementation of regular expressions. This is expressed by a structure specification that specifies a hierarchical dependence of an implementation of a matcher on an implementation of regular expressions: any implementation of the MATCHER signature must include an implementation of regular expressions as a constituent module. This ensures that the matcher is self-contained, and does not rely on implicit conventions for determining which implementation of regular expressions it employs.

The definition of the abstract syntax of regular expressions in the signature REGEXP takes the form of a datatype declaration that is reminiscent of a context-free grammar, but which abstracts from matters of lexical presentation (such as precedences of operators, parenthesization, conventions for naming the operators, etc.). The abstract syntax consists of six clauses, corresponding to the regular expressions 0, 1, a, r1 + r2, r1 r2, and r*. [1] The functions parse and format specify the parser and unparser for regular expressions. The parser takes a string as argument and yields a regular expression; if the string is ill-formed, the parser raises the exception SyntaxError with an associated string describing the source of the error. The unparser takes a regular expression and yields a string that parses to that regular expression. In general there are many strings that parse to the same regular expression; the unparser generally tries to choose one that is easiest to read.

[1] Some authors use ∅ for 0 and ε for 1.

The implementation of the matcher consists of two modules: an implementation of regular expressions and an implementation of the matcher itself. An implementation of a signature is called a structure. The implementation of the matching package consists of two structures, one implementing the signature REGEXP, the other implementing MATCHER. Thus the overall package is implemented by the following two structure declarations:

    structure RegExp :> REGEXP = ...
    structure Matcher :> MATCHER = ...

The structure identifier RegExp is bound to an implementation of the REGEXP signature. Conformance with the signature is ensured by the ascription of the signature REGEXP to the binding of RegExp using the ":>" notation. Not only does this check that the implementation (which has been elided here) conforms with the requirements of the signature REGEXP, but it also ensures that subsequent code cannot rely on any properties of the implementation other than those explicitly specified in the signature. This helps to ensure that modules are kept separate, facilitating subsequent changes to the code.

Similarly, the structure identifier Matcher is bound to a structure that implements the matching algorithm in terms of the preceding implementation RegExp of REGEXP. The ascribed signature specifies that the structure Matcher must conform to the requirements of the signature MATCHER. Notice that the structure Matcher refers to the structure RegExp in its implementation.

Once these structure declarations have been processed, we may use the package by referring to its components using paths, or long identifiers. The function Matcher.accepts has type Matcher.RegExp.regexp -> string -> bool, reflecting the fact that it takes a regular expression as implemented within the package itself and yields a matching function on strings. We may build a regular expression by applying the parser, Matcher.RegExp.parse, to a string representing a regular expression, then passing the resulting value of type Matcher.RegExp.regexp to Matcher.accepts. [2]

Here's an example of the matcher in action:

    val regexp = Matcher.RegExp.parse "(a+b)*"
    val matches = Matcher.accepts regexp
    val ex1 = matches "aabba" (* yields true *)
    val ex2 = matches "abac" (* yields false *)

[2] It might seem that one can apply Matcher.accepts to the output of RegExp.parse, since Matcher.RegExp.parse is just RegExp.parse. However, this relationship is not stated in the interface, so there is a pro forma distinction between the two. See Chapter 22 for more information on the subtle issue of sharing.
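The effect of the ":>" ascription can also be seen in a much smaller setting. The following is only an illustrative sketch: the signature COUNTER and the structure Counter are hypothetical names introduced here and are not part of the regular expression package.

```sml
(* Hypothetical example: COUNTER and Counter are illustrative names only. *)
signature COUNTER = sig
  type counter                      (* representation is not revealed *)
  val zero  : counter
  val inc   : counter -> counter
  val value : counter -> int
end

(* Opaque ascription: clients see only what COUNTER specifies. *)
structure Counter :> COUNTER = struct
  type counter = int                (* hidden behind the signature *)
  val zero = 0
  fun inc n = n + 1
  fun value n = n
end

(* Components are reached through long identifiers. *)
val two = Counter.value (Counter.inc (Counter.inc Counter.zero))

(* The following would be rejected by the type checker, because
   outside the structure a counter is not known to be an int:
     val bad = Counter.zero + 1
*)
```

Because the representation is hidden, the implementation of Counter could presumably be replaced by another one matching COUNTER without affecting client code, which is the kind of separation described for RegExp.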
The use of long identifiers can get tedious at times. There are two typical methods for alleviating the burden. One is to introduce a synonym for a long package name. Here's an example:

    structure M = Matcher
    structure R = M.RegExp
    val regexp = R.parse "((a + %).(b + %))*"
    val matches = M.accepts regexp
    val ex1 = matches "aabba"
    val ex2 = matches "abac"

Another is to "open" the structure, incorporating its bindings into the current environment:

    open Matcher Matcher.RegExp
    val regexp = parse "(a+b)*"
    val matches = accepts regexp
    val ex1 = matches "aabba"
    val ex2 = matches "abac"

It is advisable to be sparing in the use of open because it is often hard to anticipate exactly which bindings are incorporated into the environment by its use.

Now let's look at the internals of the structures RegExp and Matcher. Here's a bird's eye view of RegExp:

    structure RegExp :> REGEXP = struct
      datatype regexp =
        Zero | One | Char of char |
        Plus of regexp * regexp |
        Times of regexp * regexp |
        Star of regexp
      ...
      fun tokenize s = ...
      ...
      fun parse s =
        let
          val (r, s') = parse_rexp (tokenize (String.explode s))
        in
          case s' of
            nil => r
          | _ => raise SyntaxError "Bad input."
        end
        handle LexicalError => raise SyntaxError "Bad input."
      ...
      fun format r = String.implode (format_exp r)
    end

The elision indicates that portions of the code have been omitted so that we can get a high-level view of the structure of the implementation.

The structure RegExp is bracketed by the keywords struct and end. The type regexp is implemented precisely as specified by the datatype declaration in the signature REGEXP. The parser is implemented by a function that, when given a string, "explodes" it into a list of characters, transforms the character list into a list of "tokens" (abstract symbols representing lexical atoms), and finally parses the resulting list of tokens to obtain its abstract syntax. If there is remaining input after the parse, or if the tokenizer encountered an illegal token, an appropriate syntax error is signalled. The formatter is implemented by a function that, when given a piece of abstract syntax, formats it into a list of characters that are then "imploded" to form a string. The parser and formatter work with character lists, rather than strings, because it is easier to process lists incrementally than it is to process strings.

It is interesting to consider in more detail the structure of the parser since it exemplifies well the use of pattern matching to define functions. Let's start with the tokenizer, which we present here in toto:

    datatype token =
      AtSign | Percent | Literal of char | PlusSign | TimesSign |
      Asterisk | LParen | RParen

    exception LexicalError

    fun tokenize nil = nil
      | tokenize (#"+" :: cs) = PlusSign :: tokenize cs
      | tokenize (#"." :: cs) = TimesSign :: tokenize cs
      | tokenize (#"*" :: cs) = Asterisk :: tokenize cs
      | tokenize (#"(" :: cs) = LParen :: tokenize cs
      | tokenize (#")" :: cs) = RParen :: tokenize cs
      | tokenize (#"@" :: cs) = AtSign :: tokenize cs
      | tokenize (#"%" :: cs) = Percent :: tokenize cs
      | tokenize (#"\\" :: c :: cs) = Literal c :: tokenize cs
      | tokenize (#"\\" :: nil) = raise LexicalError
      | tokenize (#" " :: cs) = tokenize cs
      | tokenize (c :: cs) = Literal c :: tokenize cs

The symbol "@" stands for the empty regular expression and the symbol "%" stands for the regular expression accepting only the null string. Concatenation is indicated by ".", alternation by "+", and iteration by "*".

We use a datatype declaration to introduce the type of tokens corresponding to the symbols of the input language. The function tokenize has type char list -> token list; it transforms a list of characters into a list of tokens. It is defined by a series of clauses that dispatch on the first character of the list of characters given as input, yielding a list of tokens. The correspondence between characters and tokens is relatively straightforward, the only non-trivial case being to admit the use of a backslash to "quote" a reserved symbol as a character of input. (More sophisticated languages have more sophisticated token structures; for example, words (consecutive sequences of letters) are often regarded as a single token of input.) Notice that it is quite natural to "look ahead" in the input stream in the case of the backslash character, using a pattern that dispatches on the first two characters (if there are such) of the input, and proceeding accordingly. (It is a lexical error to have a backslash at the end of the input.)

Let's turn to the parser. It is a simple recursive-descent parser implementing the precedence conventions for regular expressions given earlier. These conventions may be formally specified by the following grammar, which not only enforces precedence conventions, but also allows for the use of parenthesization to override them.

    rexp ::= rtrm | rtrm+rexp
    rtrm ::= rfac | rfac.rtrm
    rfac ::= ratm | ratm*
    ratm ::= @ | % | a | (rexp)

The implementation of the parser follows this grammar quite closely. It consists of four mutually recursive functions, parse_rexp, parse_rtrm, parse_rfac, and parse_ratm. These implement what is known as a recursive descent parser that dispatches on the head of the token list to determine how to proceed.

    fun parse_rexp ts =
      let
        val (r, ts') = parse_rtrm ts
      in
        case ts' of
          (PlusSign :: ts'') =>
            let
              val (r', ts''') = parse_rexp ts''
            in
              (Plus (r, r'), ts''')
            end
        | _ => (r, ts')
      end

    and parse_rtrm ts = ...

    and parse_rfac ts =
      let
        val (r, ts') = parse_ratm ts
      in
        case ts' of
          (Asterisk :: ts'') => (Star r, ts'')
        | _ => (r, ts')
      end

    and parse_ratm nil = raise SyntaxError ("No atom")
      | parse_ratm (AtSign :: ts) = (Zero, ts)
      | parse_ratm (Percent :: ts) = (One, ts)
      | parse_ratm ((Literal c) :: ts) = (Char c, ts)
      | parse_ratm (LParen :: ts) =
          let
            val (r, ts') = parse_rexp ts
          in
            case ts' of
              (RParen :: ts'') => (r, ts'')
            | _ => raise SyntaxError "No close paren"
          end

Notice that it is quite simple to implement "lookahead" using patterns that inspect the token list for specified tokens. This parser makes no attempt to recover from syntax errors, but one could imagine doing so, using standard techniques.

This completes the implementation of regular expressions. Now for the matcher. The matcher proceeds by a recursive analysis of the regular expression. The main difficulty is to account for concatenation: to match a string against the regular expression r1 r2 we must match some initial segment against r1, then match the corresponding final segment against r2. This suggests that we generalize the matcher to one that checks whether some initial segment of a string matches a given regular expression, then passes the remaining final segment to a continuation, a function that determines what to do after the initial segment has been successfully matched. This facilitates implementation of concatenation, but how do we ensure that at the outermost call the entire string has been matched? We achieve this by using an initial continuation that checks whether the final segment is empty.

Here's the code, written as a structure implementing the signature MATCHER:

    structure Matcher :> MATCHER = struct
      structure RegExp = RegExp
      open RegExp

      fun match Zero _ k = false
        | match One cs k = k cs
        | match (Char c) cs k =
            (case cs of
               nil => false
             | c'::cs' => (c=c') andalso (k cs'))
        | match (Plus (r1, r2)) cs k =
            match r1 cs k orelse match r2 cs k
        | match (Times (r1, r2)) cs k =
            match r1 cs (fn cs' => match r2 cs' k)
        | match (Star r) cs k =
            let
              fun mstar cs' = k cs' orelse match r cs' mstar
            in
              mstar cs
            end

      fun accepts regexp string =
        match regexp (String.explode string)
          (fn nil => true | _ => false)
    end

Note that we incorporate the structure RegExp into the structure Matcher, in accordance with the requirements of the signature. The function accepts explodes the string into a list of characters (to facilitate sequential processing of the input), then calls match with an initial continuation that ensures that the remaining input is empty to determine the result. The type of match is RegExp.regexp -> char list -> (char list -> bool) -> bool. That is, match takes in succession a regular expression, a list of characters, and a continuation of type char list -> bool; it yields as result a value of type bool. This is a fairly complicated type, but notice that nowhere did we have to write this type in the code! The type inference mechanism of ML took care of determining what that type must be based on an analysis of the code itself.

Since match takes a function as argument, it is said to be a higher-order function. The simplicity of the matcher is due in large measure to the ease with which we can manipulate functions in ML. Notice that we create a new, unnamed function to pass as a continuation in the case of concatenation: it is the function that matches the second part of the regular expression to the characters remaining after matching an initial segment against the first part. We use a similar technique to implement matching against an iterated regular expression: we attempt to match the null string, but if this fails, we match against the regular expression being iterated followed by the iteration once again. This neatly captures the "zero or more times" interpretation of iteration of a regular expression.

Important: the code given above contains a subtle error. Can you find it? If not, see chapter 27 for further discussion!

This completes our brief overview of Standard ML. The remainder of these notes is structured into three parts. The first part is a detailed introduction to the core language, the language in which we write programs in ML. The second part is concerned with the module language, the means by which we structure large programs in ML. The third is about programming techniques, methods for building reliable and robust programs. I hope you enjoy it!

1.2 Sample Code

Here is the complete code for this chapter.
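As a compact way to experiment with the matcher from this chapter, the following sketch drops the parser, the formatter, and the module structure, and builds the abstract syntax by hand. The names aOrBStar, ex1, and ex2 are introduced here only for illustration. The match function is transcribed from the chapter as presented, so it deliberately retains the subtle error the text mentions; this is a sketch for exploration, not a corrected implementation.

```sml
(* Abstract syntax of regular expressions, as in the REGEXP signature. *)
datatype regexp =
    Zero | One | Char of char
  | Plus of regexp * regexp
  | Times of regexp * regexp
  | Star of regexp

(* match r cs k: check whether some initial segment of cs matches r,
   passing the remaining characters to the continuation k.
   Transcribed from the chapter, including its known subtle flaw. *)
fun match Zero _ _ = false
  | match One cs k = k cs
  | match (Char c) cs k =
      (case cs of
         nil => false
       | c' :: cs' => c = c' andalso k cs')
  | match (Plus (r1, r2)) cs k =
      match r1 cs k orelse match r2 cs k
  | match (Times (r1, r2)) cs k =
      match r1 cs (fn cs' => match r2 cs' k)
  | match (Star r) cs k =
      let
        fun mstar cs' = k cs' orelse match r cs' mstar
      in
        mstar cs
      end

(* The initial continuation succeeds only if no input remains. *)
fun accepts r s =
  match r (String.explode s) (fn nil => true | _ => false)

(* The expression written "(a+b)*" in the concrete syntax. *)
val aOrBStar = Star (Plus (Char #"a", Char #"b"))

val ex1 = accepts aOrBStar "aabba"   (* true *)
val ex2 = accepts aOrBStar "abac"    (* false *)
```

Loading this into an interactive system should also display the inferred type of match, which can be compared with the type stated in the text.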
Part II The Core Language
All Standard ML is divided into two parts. The first part, the core language, comprises the fundamental programming constructs of the language: the primitive types and operations, the means of defining and using functions, mechanisms for defining new types, and so on. The second part, the module language, comprises the mechanisms for structuring programs into separate units and is described in Part III. Here we introduce the core language.
such as boolean tests or arithmetic operations. modiﬁes the memory. We proceed by “plugging in” the given value for x. but often (in C. rather than execution of commands. and continues with the next instruction. The emphasis in ML is on computation by evaluation of expressions.1 Evaluation and Execution Most familiar programming languages. The evaluation model of computation used in ML is based on the same idea. performs a simple computation. using the rules of arithmetic. but rather than re . such as C or Java. determine the value of the polynomial. are based on an imperative model of computation. Many languages maintain a distinction between expressions and commands.Chapter 2 Types. The idea of computation is as a generalization of your experience from high school algebra in which you are given a polynomial in a variable x and are asked to calculate its value at a given value of x. that are executed for their value. Each step of execution examines the current contents of memory. and Effects 2. Computation in ML is of a somewhat different nature. Programs are thought of as specifying a sequence of commands that modify the memory of the computer. so that even expression evaluation has an effect. Values. for example) expressions may also modify the memory. Conditional commands branch according to the value of some expression. and then. The individual commands are executed for their effect on the memory (which we may take to include both the internal memory and registers and the external input/output devices). The progress of the computation is controlled by evaluation of expressions.
the evaluation model subsumes the imperative model as a special case. For example. the language provides for storagebased computation for those few times that it is actually necessary.2 The ML Computation Model 16 strict ourselves to arithmetic operations on the reals. Nevertheless. it is much easier to develop mathematical techniques for reasoning about the behavior of programs. Execution of commands for the effect on memory can be seen as a special case of evaluation of expressions by introducing primitive operations for allocating. Each expression has three important characteristics: • It may or may not have a type. 2.11 D RAFT V ERSION 1.2. Because of its close relationship to mathematics. making the code easier to understand and maintain. Moreover. Doing so allows us to support imperative programming without destroying the mathematical elegance of the evaluation model for programs that don’t use memory. • It may or may not have a value. and modifying memory. Rather than forcing all aspects of computation into the framework of memory modiﬁcation.2 The ML Computation Model Computation in ML consists of evaluation of expressions. we admit a richer variety of values and a richer variety of primitive operations on them. • It may or may not engender an effect. they provide important tools for documenting the reasoning that went into the formulation of a program. never their absence. accessing. These techniques are important tools for helping us to ensure that programs work properly without having to resort to tedious testing and debugging that can only show the presence of errors.2 . As we will see. it is quite remarkable how seldom memory modiﬁcation is required. should it yield a value at all. for an expression to have type int is to R EVISED 11. The evaluation model of computation enjoys several advantages over the more familiar imperative model. What is more. The type of an expression is a description of the value it yields. 
These characteristics are all that you need to know to compute with an expression. we instead take expression evaluation as the primary notion.02.
For the time being we will assume that all expressions are effectfree. or pure. In general we can think of the type of an expression as a “prediction” of the form of the value that it has. and for an expression to have type real is to say that its value (if any) is a ﬂoating point number. either because of a runtime fault such as division by zero or because some programmerdeﬁned condition is signalled during its evaluation. The soundness of the type system ensures the accuracy of the predictions made by the type checker. or sending a message on the network. if an expression evaluates to a value v and its type is bool. Those without a type are said to be illtyped. Every expression is required to have at least one type.1 Type Checking What is a type? What types are there? Generally speaking. It is important to note that the type of an expression says nothing about its possible effects! An expression of type int might well display a message on the screen before returning an integer value. they are considered ineligible for evaluation. should it have one. • the values of the type. say. a type is deﬁned by specifying three things: • a name for the type. and • the operations that may be performed on values of the type.11 D RAFT V ERSION 1. Effects include such phenomena as raising an exception.2. those that do are said to be welltyped.2 . the form of that value is predicted by its type. 17 or 3. A welltyped expression is evaluated to determine its value. For this reason effects are sometimes called side effects. performing input or output. Evaluation of an expression might also engender an effect. then v must be either true or false. An expression can fail to have a value because its evaluation never terminates or because it raises an exception. and are not part of the value of the expression. 2. it cannot be. to stress that they happen “off to the side” during evaluation.2 The ML Computation Model 17 say that its value (should it have one) is a number. 
which classiﬁes only its value.14. For example.02. modifying memory. rejecting with an error those that are not. if indeed it has one. We will ignore effects until chapter 13. The type checker determines whether or not an expression is welltyped. R EVISED 11.2. This possibility is not accounted for in the type of the expression. If an expression has a value.
Often the division of labor into values and operations is not completely clear-cut, but it nevertheless serves as a very useful guideline for describing types.

Let’s consider first the type of integers. Its name is int. The values of type int are the numerals 0, 1, ~1, 2, ~2, and so on. (Note that negative numbers are written with a prefix tilde, rather than a minus sign!) Operations on integers include addition, +, subtraction, -, multiplication, *, quotient, div, and remainder, mod. Arithmetic expressions are formed in the familiar manner, for example, 3*2+6, governed by the usual rules of precedence. Parentheses may be used to override the precedence conventions, just as in ordinary mathematical practice. Thus the preceding expression may be equivalently written as (3*2)+6, but we may also write 3*(2+6) to override the default precedences.

The formation of expressions is governed by a set of typing rules that define the types of expressions in terms of the types of their constituent expressions (if any). The typing rules are generally quite intuitive since they are consistent with our experience in mathematics and in other programming languages. In their full generality the rules are somewhat involved, but we will sneak up on them by first considering only a small fragment of the language, building up additional machinery as we go along.

Here are some simple arithmetic expressions, written using infix notation for the operations (meaning that the operator comes between the arguments, as is customary in mathematics):

    3
    3 + 4
    4 div 3
    4 mod 3

Each of these expressions is well-formed; in fact, they each have type int. This is indicated by a typing assertion of the form exp : typ, which states that the expression exp has the type typ. A typing assertion is said to be valid iff the expression exp does indeed have the type typ. The following are all valid typing assertions:

    3 : int
    3 + 4 : int
    4 div 3 : int
    4 mod 3 : int
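The typing assertions above can be checked mechanically by attaching a type constraint to each expression. The following is a minimal sketch of that idea; the variable names (a, b, c, d, p, q, neg) are hypothetical and chosen only for illustration, and val bindings are not introduced until chapter 3. A compiler accepts the file only if each constraint agrees with the type it computes.

```sml
(* Each right-hand side is constrained to have type int. *)
(* The names a, b, c, d are hypothetical, for illustration only. *)
val a = (3 : int)
val b = (3 + 4 : int)
val c = (4 div 3 : int)
val d = (4 mod 3 : int)

(* Precedence: * binds more tightly than +; parentheses override. *)
val p = 3 * 2 + 6        (* same as (3*2)+6 *)
val q = 3 * (2 + 6)

(* Negative numerals are written with a prefix tilde. *)
val neg = ~1 + 1
```

Replacing any of the constraints by `: real` should cause the type checker to reject the corresponding binding.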
Why are these typing assertions valid? In the case of the value 3, it is an axiom that integer numerals have integer type. What about the expression 3+4? The addition operation takes two arguments, each of which must have type int. Since both arguments in fact have type int, it follows that the entire expression is of type int. For more complex cases we reason analogously, for example, deducing that (3+4) div (2+3) : int by observing that (3+4) : int and (2+3) : int.

The reasoning involved in demonstrating the validity of a typing assertion may be summarized by a typing derivation consisting of a nested sequence of typing assertions, each justified either by an axiom, or a typing rule for an operation. For example, the validity of the typing assertion (3+7) div 5 : int is justified by the following derivation:

1. (3+7) : int, because
   (a) 3 : int because it is an axiom
   (b) 7 : int because it is an axiom
   (c) the arguments of + must be integers, and the result of + is an integer
2. 5 : int because it is an axiom
3. the arguments of div must be integers, and the result is an integer

The outermost steps justify the assertion (3+7) div 5 : int by demonstrating that the arguments each have type int. Recursively, the inner steps justify that (3+7) : int.

2.2.2 Evaluation

Evaluation of expressions is defined by a set of evaluation rules that determine how the value of a compound expression is determined as a function of the values of its constituent expressions (if any). Since the value of an operator is determined by the values of its arguments, ML is sometimes said to be a call-by-value language. While this may seem like the only sensible way to define evaluation, we will see in chapter 15 that this need not be the case: some operations may yield a value without evaluating their arguments. Such operations are sometimes said to be lazy, to distinguish them from eager operations that require their arguments to be evaluated before the operation is performed.

An evaluation assertion has the form exp ⇓ val. This assertion states that the expression exp has value val. It should be intuitively clear that the following evaluation assertions are valid.

    5 ⇓ 5
    2+3 ⇓ 5
    (2+3) div (1+4) ⇓ 1

An evaluation assertion may be justified by an evaluation derivation, which is similar to a typing derivation. For example, we may justify the assertion (3+7) div 5 ⇓ 2 by the derivation

1. (3+7) ⇓ 10 because
   (a) 3 ⇓ 3 because it is an axiom
   (b) 7 ⇓ 7 because it is an axiom
   (c) Adding 3 to 7 yields 10.
2. 5 ⇓ 5 because it is an axiom
3. Dividing 10 by 5 yields 2.

Note first that it is an axiom that a numeral evaluates to itself; numerals are fully-evaluated expressions, or values. Second, the rules of arithmetic are used to determine that adding 3 and 7 yields 10.

Not every expression has a value. A simple example is the expression 5 div 0, which is undefined. If you attempt to evaluate this expression it will incur a run-time error, reflecting the erroneous attempt to find the number n that, when multiplied by 0, yields 5. The error is expressed in ML by raising an exception; we will have more to say about exceptions in chapter 12. Another reason that a well-typed expression might not have a value is that the attempt to evaluate it leads to an infinite loop. We don’t yet have the machinery in place to define such expressions, but we will soon see that it is possible for an expression to diverge, or run forever, when evaluated.
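A short sketch of the two outcomes discussed above: an expression that has a value, and a well-typed expression that has none. The handle construct and the exception name Div belong to the exception machinery that the text defers to chapter 12; they are used here only so that the file still loads, and the fallback value 0 and the variable names are assumptions made for illustration.

```sml
(* (3+7) div 5 evaluates to 2, as in the evaluation derivation. *)
val two = (3 + 7) div 5

(* 5 div 0 is well-typed (it has type int) but has no value:
   evaluating it raises the exception Div.  Here the exception is
   caught and replaced by a hypothetical default of 0. *)
val recovered = (5 div 0) handle Div => 0
```

Without the handler, loading the second binding would stop with an uncaught exception rather than bind anything to the variable.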
2.3 Types, Types, Types

What types are there besides the integers? Here are a few useful base types of ML:

• Type name: real
  – Values: 3.14, 2.17, 0.1E6, ...
  – Operations: +, -, *, /, =, <, ...
• Type name: char
  – Values: #"a", #"b", ...
  – Operations: ord, chr, =, <, ...
• Type name: string
  – Values: "abc", "1234", ...
  – Operations: ^, size, =, <, ...
• Type name: bool
  – Values: true, false
  – Operations: if exp then exp_1 else exp_2

There are many, many (in fact, infinitely many!) others, but these are enough to get us started. (See Appendix A for a complete description of the primitive types of ML, including the ones given above.)

Notice that some of the arithmetic operations for real numbers are written the same way as for the corresponding operation on integers. For example, we may write 3.1+2.7 to perform a floating point addition of two floating point numbers. This is called overloading; the addition operation is said to be overloaded at the types int and real. In an expression involving addition the type checker tries to resolve which form of addition (fixed point or floating point) you mean. If the arguments are of type int, then fixed point addition is used; if the arguments are of type real, then floating point addition is used; otherwise an error is reported.¹ Note that ML does not perform any implicit conversions between types! For example, the expression 3+3.14 is rejected as ill-formed! If you intend floating point addition, you must write instead real(3)+3.14, which converts the integer 3 to its floating point representation before performing the addition. If, on the other hand, you intend integer addition, you must write 3+round(3.14), which converts 3.14 to an integer by rounding before performing the addition.

¹ If the type of the arguments cannot be determined, the type defaults to int.

Finally, note that floating point division is a different operation from integer quotient! Thus we write 3.1/2.7 for the result of dividing 3.1 by 2.7, which results in a floating point number. We reserve the operator div for integers, and use / for floating point division.

The conditional expression

    if exp then exp_1 else exp_2

is used to discriminate on a Boolean value. It has type typ if exp has type bool and both exp_1 and exp_2 have type typ. Notice that both “arms” of the conditional must have the same type! It is evaluated by first evaluating exp, then proceeding to evaluate either exp_1 or exp_2, according to whether the value of exp is true or false. For example,

    if 1<2 then "less" else "greater"

evaluates to "less" since the value of the expression 1<2 is true.

Note that the expression

    if 1<2 then 0 else (1 div 0)

evaluates to 0, even though 1 div 0 incurs a run-time error. This is because evaluation of the conditional proceeds either to the then clause or to the else clause, depending on the outcome of the boolean test. Whichever clause is evaluated, the other is simply discarded without further consideration.

Although we may, in fact, test equality of two boolean expressions, it is rarely useful to do so. Beginners often write conditionals of the form

    if exp = true then exp_1 else exp_2.

But this is equivalent to the simpler expression

    if exp then exp_1 else exp_2.

Similarly, rather than write

    if exp = false then exp_1 else exp_2,

it is better to write

    if not exp then exp_1 else exp_2

or, better yet, just

    if exp then exp_2 else exp_1.
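The points about conversions, division, and conditionals can be tried out directly. The following is a small sketch using only the operations named in the text (real, round, div, /, if); the variable names are hypothetical, and the val bindings used to hold the results are explained in chapter 3.

```sml
(* No implicit conversions: 3 + 3.14 is rejected by the type checker. *)
val asReal = real 3 + 3.14        (* floating point addition *)
val asInt  = 3 + round 3.14       (* integer addition *)

(* div is reserved for integers, / for floating point numbers. *)
val quotient = 7 div 2
val ratio    = 7.0 / 2.0

(* Only the selected arm of a conditional is evaluated, so the
   division by zero in the else arm never takes place. *)
val safe = if 1 < 2 then 0 else (1 div 0)
val word = if 1 < 2 then "less" else "greater"

(* Comparing a boolean with true or false is redundant. *)
val flag = 1 < 2
val verbose  = if flag = true then "yes" else "no"
val concise  = if flag then "yes" else "no"
val inverted = if not flag then "no" else "yes"
```

All three of the last bindings produce the same string, which is the sense in which the shorter forms are equivalent to the longer one.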
2.4 Type Errors

Now that we have more than one type, we have enough rope to hang ourselves by forming ill-typed expressions. For example, the following expressions are not well-typed:

    size 45
    #"1" + 1
    #"2" ^ "1"
    3.14 + 2

In each case we are “misusing” an operator with arguments of the wrong type.

This raises a natural question: is the following expression well-typed or not?

    if 1<2 then 0 else ("abc"+4)

Since the boolean test will come out true, the else clause will never be executed, and hence need not be constrained to be well-typed. While this reasoning is sensible for such a simple example, in general it is impossible for the type checker to determine the outcome of the boolean test during type checking. To be safe the type checker “assumes the worst” and insists that both clauses of the conditional be well-typed, and in fact have the same type, to ensure that the conditional expression can be given a type, namely that of both of its clauses.

2.5 Sample Code

Here is the complete code for this chapter.
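The sample code itself is not reproduced in this text. As a stand-in, and explicitly not the author's code, the following sketch records the four ill-typed expressions of section 2.4 as comments and pairs each with one possible well-typed repair. The repairs, the use of the library function str, and the variable names r1 to r5 are assumptions made for illustration.

```sml
(* Ill-typed: size 45        (size expects a string) *)
val r1 = size "45"

(* Ill-typed: #"1" + 1       (+ is not defined on characters) *)
val r2 = ord #"1" + 1        (* ord #"1" is 49 *)

(* Ill-typed: #"2" ^ "1"     (^ expects two strings) *)
val r3 = str #"2" ^ "1"

(* Ill-typed: 3.14 + 2       (no implicit conversion from int) *)
val r4 = 3.14 + real 2

(* Both arms of a conditional must be well-typed and agree in type,
   even when one of them can never be evaluated. *)
val r5 = if 1 < 2 then 0 else size "abc" + 4
```

Uncommenting any of the ill-typed forms should make the type checker reject the file before anything is evaluated.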
Chapter 3

Declarations

3.1 Variables

Just as in any other programming language, values may be assigned to variables, which may then be used in expressions to stand for that value. However, in sharp contrast to most familiar languages, variables in ML do not vary! A value may be bound to a variable using a construct called a value binding. Once a variable is bound to a value, it is bound to it for life; there is no possibility of changing the binding of a variable once it has been bound. In this respect variables in ML are more akin to variables in mathematics than to variables in languages such as C.

A type may also be bound to a type constructor using a type binding. A bound type constructor stands for the type bound to it, and can never stand for any other type. For this reason a type binding is sometimes called a type abbreviation: the type constructor stands for the type to which it is bound.¹

A value or type binding introduces a “new” variable or type constructor, distinct from all others of that class, for use within its range of significance, or scope. Scoping in ML is static, or lexical, meaning that the range of significance of a variable or type constructor is determined by the program text, not by the order of evaluation of its constituent expressions. (Languages with dynamic scope adopt the opposite convention.) For the time being variables and type constructors have global scope, meaning that the range of significance of the variable or type constructor is the “rest” of the program, that is, the part that lexically follows the binding. However, we will soon introduce mechanisms for limiting the scopes of variables or type constructors to a given expression.

¹ By the same token a value binding might also be called a value abbreviation, but for some reason it never is.

3.2 Basic Bindings

3.2.1 Type Bindings

Any type may be given a name using a type binding. At this stage we have so few types that it is hard to justify binding type names to identifiers, but we’ll do it anyway because we’ll need it later. Here are some examples of type bindings:

    type float = real
    type count = int and average = real

The first type binding introduces the type constructor float, which subsequently is synonymous with real. The second introduces two type constructors, count and average, which stand for int and real, respectively.

In general a type binding introduces one or more new type constructors simultaneously in the sense that the definitions of the type constructors may not involve any of the type constructors being defined. Thus a binding such as

    type float = real and average = float

is nonsensical (in isolation) since the type constructors float and average are introduced simultaneously, and hence cannot refer to one another.

The syntax for type bindings is

    type tycon_1 = typ_1
    and ...
    and tycon_n = typ_n

where each tycon_i is a type constructor and each typ_i is a type expression.
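To see that a type constructor really is interchangeable with the type it abbreviates, the type bindings above can be exercised as follows. This is a sketch only: the value bindings anticipate section 3.2.2, and the names total, mean, same, ratio, and half are hypothetical.

```sml
type float = real
type count = int and average = real

(* An abbreviation may be used wherever its definition may be used. *)
val total : count = 12
val mean : average = 4.0
val same : real = mean       (* average and real are the same type *)

(* Within one simultaneous binding the new constructors cannot refer
   to one another, but a later, separate binding may refer to an
   earlier one. *)
type ratio = float
val half : ratio = 0.5
```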
3.2.2 Value Bindings

A value may be given a name using a value binding. Here are some examples:

    val m : int = 3+2
    val pi : real = 3.14 and e : real = 2.17

The first binding introduces the variable m, specifying its type to be int and its value to be 5. The second introduces two variables, pi and e, simultaneously, both having type real, and with pi having value 3.14 and e having value 2.17. Notice that a value binding specifies both the type and the value of a variable.

The syntax of value bindings is

    val var_1 : typ_1 = exp_1
    and ...
    and var_n : typ_n = exp_n,

where each var_i is a variable, each typ_i is a type expression, and each exp_i is an expression.

A value binding of the form

    val var : typ = exp

is type-checked by ensuring that the expression exp has type typ. If not, the binding is rejected as ill-formed. If so, the binding is evaluated using the bind-by-value rule: first exp is evaluated to obtain its value val, then val is bound to var. If exp does not have a value, then the declaration does not bind anything to the variable var.

The purpose of a binding is to make a variable available for use within its scope. In the case of a type binding we may use the type constructor introduced by that binding in type expressions occurring within its scope. For example, in the presence of the type bindings above, we may write

    val pi : float = 3.14

since the type constructor float is bound to the type real, the type of the expression 3.14. Similarly, we may make use of the variable introduced by a value binding in value expressions occurring within its scope.
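The value bindings discussed above can be loaded as they stand. The sketch below simply collects them in one place, together with the type binding for float on which the last one depends; nothing here goes beyond the bindings given in the text.

```sml
type float = real

(* m is bound to the value of 3+2, not to the expression itself. *)
val m : int = 3 + 2

(* Two variables introduced simultaneously. *)
val pi : real = 3.14 and e : real = 2.17

(* float abbreviates real, so this annotation is accepted as well. *)
val pi : float = 3.14
```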
Continuing from the preceding binding, we may use the expression

    Math.sin pi

to stand for 0.0 (approximately), and we may bind this value to a variable by writing

    val x : float = Math.sin pi

As these examples illustrate, type checking and evaluation are context dependent in the presence of type and value bindings since we must refer to these bindings to determine the types and values of expressions. For example, to determine that the above binding for x is well-formed, we must consult the binding for pi to determine that it has type float, and consult the binding for float to determine that it is synonymous with real, which is necessary for the binding of x to have type float.

The rough-and-ready rule for both type-checking and evaluation is that a bound variable or type constructor is implicitly replaced by its binding prior to type checking and evaluation. This is sometimes called the substitution principle for bindings. For example, to evaluate the expression Math.cos x in the scope of the above declarations, we first replace the occurrence of x by its value (approximately 0.0), then compute as before, yielding (approximately) 1.0. Later on we will have to refine this simple principle to take account of more sophisticated language features, but it is useful nonetheless to keep this simple idea in mind.

3.3 Compound Declarations

Bindings may be combined to form declarations. A binding is an atomic declaration, even though it may introduce many variables simultaneously. Two declarations may be combined by sequential composition by simply writing them one after the other, optionally separated by a semicolon. Thus we may write the declaration

    val m : int = 3+2
    val n : int = m*m

which binds m to 5 and n to 25. Subsequently, we may evaluate m+n to obtain the value 30. In general a sequential composition of declarations has the form dec_1 ... dec_n, where n is at least 2.
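A brief sketch of sequential composition and of the substitution principle follows. The names sum, u, v, and y are hypothetical additions; the remaining bindings are those of the text.

```sml
type float = real
val pi : float = 3.14

(* Substitution principle: pi stands for 3.14, x for its value. *)
val x : float = Math.sin pi      (* approximately 0.0 *)
val y : real = Math.cos x        (* approximately 1.0 *)

(* Sequential composition: m is in scope in the binding of n. *)
val m : int = 3 + 2
val n : int = m * m
val sum : int = m + n

(* A semicolon between declarations is optional. *)
val u : int = 1; val v : int = u + 1
```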
The scopes of these declarations are nested within one another: the scope of dec_1 includes dec_2, ..., dec_n, the scope of dec_2 includes dec_3, ..., dec_n, and so on.

One thing to keep in mind is that binding is not assignment. The binding of a variable never changes; once bound to a value, it is always bound to that value (within the scope of the binding). However, we may shadow a binding by introducing a second binding for a variable within the scope of the first binding. Continuing the above example, we may write

    val n : real = 2.17

to introduce a new variable n with both a different type and a different value than the earlier binding. The new binding eclipses the old one, which may then be discarded since it is no longer accessible. (Later on, we will see that in the presence of higher-order functions shadowed bindings are not always discarded, but are preserved as private data in a closure. One might say that old bindings never die, they just fade away.)

3.4 Limiting Scope

The scope of a variable or type constructor may be delimited by using let expressions and local declarations. A let expression has the form

    let dec in exp end

where dec is any declaration and exp is any expression. The scope of the declaration dec is limited to the expression exp. The bindings introduced by dec are discarded upon completion of evaluation of exp.

Similarly, we may limit the scope of one declaration to another declaration by writing

    local dec in dec' end

The scope of the bindings in dec is limited to the declaration dec'. After processing dec', the bindings in dec may be discarded.

The value of a let expression is determined by evaluating the declaration part, then evaluating the expression relative to the bindings introduced by the declaration, yielding this value as the overall value of the let expression.
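Shadowing and local declarations can be sketched in a few lines. The names oldUse, newUse, helper, and area are hypothetical, chosen only to make the scopes visible.

```sml
val n : int = 25
val oldUse = n + 1           (* refers to the int binding of n *)

(* A second binding of n, with a different type, shadows the first.
   oldUse is unaffected: binding is not assignment. *)
val n : real = 2.17
val newUse = n + 1.0         (* refers to the real binding of n *)

(* local: helper is visible only while area is being defined. *)
local
  val helper : real = 3.14
in
  val area : real = helper * 2.0 * 2.0
end
(* helper is out of scope here; area remains available. *)
```

A reference to helper after the end keyword should be rejected as an unbound variable, whereas area can be used freely.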
An example will help clarify how a let expression is evaluated:

    let
        val m : int = 3
        val n : int = m*m
    in
        m*n
    end

This expression has type int and value 27, as you can readily verify by first calculating the bindings for m and n, then computing the value of m*n relative to these bindings. The bindings for m and n are local to the expression m*n, and are not accessible from outside the expression.

If the declaration part of a let expression eclipses earlier bindings, the ambient bindings are restored upon completion of evaluation of the let expression. Thus the following declarations bind r to 54:

    val m : int = 2
    val r : int =
        let
            val m : int = 3
            val n : int = m*m
        in
            m*n
        end * m

The binding of m is temporarily overridden during the evaluation of the let expression, then restored upon completion of this evaluation.

3.5 Typing and Evaluation

To complete this chapter, let’s consider in more detail the context-sensitivity of type checking and evaluation in the presence of bindings. The key ideas are:

• Type checking must take account of the declared type of a variable.
• Evaluation must take account of the declared value of a variable.

This is achieved by maintaining environments for type checking and evaluation. The type environment records the types of variables; the value environment records their values. For example, after processing the compound declaration

    val m : int = 0
    val x : real = Math.sqrt(2.0)
    val c : char = #"a"

the type environment contains the information

    val m : int
    val x : real
    val c : char

and the value environment contains the information

    val m = 0
    val x = 1.414
    val c = #"a"

In a sense the value declarations have been divided in “half”, separating the type from the value information.

Thus we see that value bindings have significance for both type checking and evaluation. In contrast type bindings have significance only for type checking, and hence contribute only to the type environment. A type binding such as

    type float = real

is recorded in its entirety in the type environment, and no change is made to the value environment. Subsequently, whenever we encounter the type constructor float in a type expression, it is replaced by real in accordance with the type binding above.

In chapter 2 we said that a typing assertion has the form exp : typ, and that an evaluation assertion has the form exp ⇓ val. While two-place typing and evaluation assertions are sufficient for closed expressions (those without variables), we must extend these relations to account for open expressions (those with variables). Each must be equipped with an environment recording information about type constructors and variables introduced by declarations.

Typing assertions are generalized to have the form

    typenv ⊢ exp : typ

where typenv is a type environment that records the bindings of type constructors and the types of variables that may occur in exp.² We may think of typenv as a sequence of specifications of one of the following two forms:

1. type typvar = typ
2. val var : typ

Note that the second form does not include the binding for var, only its type!

² The turnstile symbol, “⊢”, is simply a punctuation mark separating the type environment from the expression and its type.

Evaluation assertions are generalized to have the form

    valenv ⊢ exp ⇓ val

where valenv is a value environment that records the bindings of the variables that may occur in exp. We may think of valenv as a sequence of specifications of the form

    val var = val

that bind the value val to the variable var.

Finally, we also need a new assertion, called type equivalence, that determines when two types are equivalent, relative to a type environment. This is written

    typenv ⊢ typ_1 ≡ typ_2

Two types are equivalent iff they are the same when the type constructors defined in typenv are replaced by their bindings.

The primary use of a type environment is to record the types of the value variables that are available for use in a given expression. This is expressed by the following axiom:

    ... val var : typ ... ⊢ var : typ

In words, if the specification val var : typ occurs in the type environment, then we may conclude that the variable var has type typ. This rule glosses over an important point. In order to account for shadowing we require that the rightmost specification govern the type of a variable. That way re-binding of variables with the same name but different types behaves as expected.

Similarly, the evaluation relation must take account of the value environment. Evaluation of variables is governed by the following axiom:

    ... val var = val ... ⊢ var ⇓ val

Here again we assume that the val specification is the rightmost one governing the variable var to ensure that the scoping rules are respected.

The role of the type equivalence assertion is to ensure that type constructors always stand for their bindings. This is expressed by the following axiom:

    ... type typvar = typ ... ⊢ typvar ≡ typ

Once again, the rightmost specification for typvar governs the assertion.

3.6 Sample Code

Here is the complete code for this chapter.
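The sample code itself is not reproduced in this text. As a separate illustration, and not the author's code, the rule that the rightmost specification governs a variable can be modelled by a toy value environment. The sketch below is an assumption-laden model: it restricts values to integers, represents the environment as a list with the newest binding at the front (playing the role of “rightmost”), and uses lists, options, and function declarations that the book introduces only in later chapters. The names valenv, empty, bind, and lookup are hypothetical.

```sml
(* A toy value environment: (variable name, value) pairs, with the
   most recent binding first. *)
type valenv = (string * int) list

val empty : valenv = []

(* Extending an environment never alters the old one. *)
fun bind (env : valenv, var : string, v : int) : valenv =
  (var, v) :: env

(* Find the most recent binding of var, if there is one. *)
fun lookup (env : valenv, var : string) : int option =
  case env of
      [] => NONE
    | (name, v) :: rest =>
        if name = var then SOME v else lookup (rest, var)

val env1 = bind (empty, "m", 2)
val env2 = bind (env1, "m", 3)      (* shadows the earlier m *)

val inner   = lookup (env2, "m")    (* the newer binding wins *)
val outer   = lookup (env1, "m")    (* env1 still sees the old m *)
val missing = lookup (env2, "n")    (* n was never bound *)
```

Returning to env1 after working in env2 mirrors the way the ambient binding of m is restored once a let expression has been evaluated.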
Chapter 4

Functions

4.1 Functions as Templates

So far we just have the means to calculate the values of expressions, and to bind these values to variables for future reference. In this chapter we will introduce the ability to abstract the data from a calculation, leaving behind the bare pattern of the calculation. This pattern may then be instantiated as often as you like so that the calculation may be repeated with specified data values plugged in.

For example, consider the expression 2*(3+4). The data might be taken to be the values 2, 3, and 4, leaving behind the pattern □ * (□ + □), with “holes” where the data used to be. We might equally well take the data to just be 2 and 3, and leave behind the pattern □ * (□ + 4). Or we might even regard * and + as the data, leaving 2 □ (3 □ 4) as the pattern! What is important is that a complete expression can be recovered by filling in the holes with chosen data.

Since a pattern can contain many different holes that can be independently instantiated, it is necessary to give names to the holes so that instantiation consists of plugging in a given value for all occurrences of a name in an expression. These names are, of course, just variables, and instantiation is just the process of substituting a value for all occurrences of a variable in a given expression. A pattern may therefore be viewed as a function of the variables that occur within it; the pattern is instantiated by applying the function to argument values.
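As a concrete sketch of this idea, the pattern □ * (□ + 4) can be written with its two holes named a and b. The fn notation used here is only introduced in section 4.2, and the names template, original, and another are hypothetical.

```sml
(* The pattern "hole * (hole + 4)" with the holes named a and b. *)
val template = fn (a : int) => fn (b : int) => a * (b + 4)

(* Filling the holes with 2 and 3 recovers 2*(3+4). *)
val original = template 2 3

(* The same pattern instantiated with different data. *)
val another = template 5 1
```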
This view of functions is similar to our experience from high school algebra. In algebra we manipulate polynomials such as x^2 + 2x + 1 as a form of expression denoting a real number, with the variable x representing a fixed, but unknown, quantity. (Indeed, variables in algebra are sometimes called unknowns, or indeterminates.) It is also possible to think of a polynomial as a function on the real line: given a real number x, a polynomial determines a real number y computed by the given combination of arithmetic operations. Indeed, we sometimes write equations such as f(x) = x^2 + 2x + 1, to stand for the function f determined by the polynomial. In the univariate case we can get away with just writing the polynomial for the function, but in the multivariate case we must be more careful: we may regard the polynomial x^2 + 2xy + y^2 to be a function of x, a function of y, or a function of both x and y. In these cases we write f(x) = x^2 + 2xy + y^2 when x varies and y is held fixed, g(y) = x^2 + 2xy + y^2 when y varies for fixed x, and h(x, y) = x^2 + 2xy + y^2 when both vary jointly.

In algebra it is usually left implicit that the variables x and y range over the real numbers, and that f, g, and h are functions on the real line. However, to be fully explicit, we sometimes write something like

    f : R → R : x ∈ R ↦ x^2 + 2x + 1

to indicate that f is a function on the reals sending x ∈ R to x^2 + 2x + 1 ∈ R. This notation has the virtue of separating the name of the function, f, from the function itself, the mapping that sends x ∈ R to x^2 + 2x + 1. It also emphasizes that functions are a kind of “value” in mathematics (namely, a certain set of ordered pairs), and that the variable f is bound to that value (i.e., that set) by the declaration. This viewpoint is especially important once we consider operators, such as the differential operator, that map functions to functions. For example, if f is a differentiable function on the real line, the function D f is its first derivative, another function on the real line. In the case of the function f defined above the function D f sends x ∈ R to 2x + 2.

4.2 Functions and Application

The treatment of functions in ML is very similar, except that we stress the algorithmic aspects of functions (how they determine values from arguments), as well as the extensional aspects (what they compute).
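The algebraic example carries over to ML almost verbatim. The sketch below defines f(x) = x^2 + 2x + 1 and its derivative by hand, and then an operator D that maps a function to an approximation of its derivative. This D is an assumption made for illustration: it is a numerical central-difference approximation with a hypothetical step size h, not a symbolic differential operator, and it relies on the fun and fn notations explained later in this chapter.

```sml
(* f(x) = x^2 + 2x + 1 *)
fun f (x : real) : real = x * x + 2.0 * x + 1.0

(* D f written out by hand: x is sent to 2x + 2. *)
fun df (x : real) : real = 2.0 * x + 2.0

(* A hypothetical step size for the numerical approximation. *)
val h : real = 1.0E~5

(* An operator on functions: takes g and returns a new function
   approximating the derivative of g by a central difference. *)
fun D (g : real -> real) : real -> real =
  fn x : real => (g (x + h) - g (x - h)) / (2.0 * h)

val exact  = df 1.0
val approx = D f 1.0
```

The point of the sketch is only that D f is itself a function, obtained as the value of an expression, just as the mathematical D f is.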
As in mathematics, a function in ML is a kind of value, namely a value of function type of the form typ -> typ'. The type typ is the domain type (the type of arguments) of the function, and typ' is its range type (the type of its results). We compute with a function by applying it to an argument value of its domain type and calculating the result, a value of its range type. Function application is indicated by juxtaposition: we simply write the argument next to the function.

The values of function type consist of primitive functions, such as addition and square root, and function expressions, which are also called lambda expressions,¹ of the form

    fn var : typ => exp

The variable var is called the parameter, and the expression exp is called the body. It has type typ->typ' provided that exp has type typ' under the assumption that the parameter var has the type typ.

¹ For purely historical reasons.

To apply such a function expression to an argument value val, we add the binding

    val var = val

to the value environment, and evaluate exp, obtaining a value val'. Then the value binding for the parameter is removed, and the result value, val', is returned as the value of the application.

For example, Math.sqrt is a primitive function of type real->real that may be applied to a real number to obtain its square root. Thus the expression Math.sqrt 2.0 evaluates to 1.414 (approximately). We can, if we wish, parenthesize the argument, writing Math.sqrt (2.0) for the sake of clarity; this is especially useful for expressions such as Math.sqrt (Math.sqrt 2.0). The square root function is built in. We may write the fourth root function as the following function expression:

    fn x : real => Math.sqrt (Math.sqrt x)

It may be applied to an argument by writing an expression such as

    (fn x : real => Math.sqrt (Math.sqrt x)) (16.0),

which calculates the fourth root of 16.0.
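Both forms of application, to a primitive function and to a function expression, can be tried directly. In this sketch the names root2 and fourth are hypothetical; everything else is taken from the text.

```sml
(* The primitive square root function has type real -> real. *)
val root2 = Math.sqrt 2.0

(* An anonymous fourth root function applied directly to 16.0. *)
val fourth = (fn x : real => Math.sqrt (Math.sqrt x)) (16.0)
```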
The calculation proceeds by binding the variable x to the argument 16.0, then evaluating the expression Math.sqrt (Math.sqrt x) in the presence of this binding. When evaluation completes, we drop the binding of x from the environment, since it is no longer needed.

Notice that we did not give the fourth root function a name; it is an “anonymous” function. We may give it a name using the declaration forms introduced in chapter 3. For example, we may bind the fourth root function to the variable fourthroot using the following declaration:

    val fourthroot : real -> real =
        fn x : real => Math.sqrt (Math.sqrt x)

We may then write fourthroot 16.0 to compute the fourth root of 16.0.

This notation for defining functions quickly becomes tiresome, so ML provides a special syntax for function bindings that is more concise and natural. Instead of using the val binding above to define fourthroot, we may instead write

    fun fourthroot (x:real):real = Math.sqrt (Math.sqrt x)

This declaration has the same meaning as the earlier val binding, namely it binds fn x:real => Math.sqrt(Math.sqrt x) to the variable fourthroot.

It is important to note that function applications in ML are evaluated according to the call-by-value rule: the arguments to a function are evaluated before the function is called. Put in other terms, functions are defined to act on values, rather than on unevaluated expressions. Thus, to evaluate an expression such as fourthroot (2.0+2.0), we proceed as follows:

1. Evaluate fourthroot to the function value fn x : real => Math.sqrt (Math.sqrt x).
2. Evaluate the argument 2.0+2.0 to its value 4.0.
3. Bind x to the value 4.0.
4. Evaluate Math.sqrt (Math.sqrt x) to 1.414 (approximately).
   (a) Evaluate Math.sqrt to a function value (the primitive square root function).
   (b) Evaluate the argument expression Math.sqrt x to its value, approximately 2.0.
       i. Evaluate Math.sqrt to a function value (the primitive square root function).
       ii. Evaluate x to its value, 4.0.
       iii. Compute the square root of 4.0, yielding 2.0.
   (c) Compute the square root of 2.0, yielding 1.414.
5. Drop the binding for the variable x.

Notice that we evaluate both the function and argument positions of an application expression: both the function and argument are expressions yielding values of the appropriate type. The value of the function position must be a value of function type, either a primitive function or a lambda expression, and the value of the argument position must be a value of the domain type of the function. In this case the result value (if any) will be of the range type of the function. Functions in ML are first-class, meaning that they may be computed as the value of an expression. We are not limited to applying only named functions, but rather may compute “new” functions on the fly and apply these to arguments. This is a source of considerable expressive power, as we shall see in the sequel.

Using similar techniques we may define functions with arbitrary domain and range. For example, the following are all valid function declarations:

    fun srev (s:string):string = implode (rev (explode s))
    fun pal (s:string):string = s ^ (srev s)
    fun double (n:int):int = n + n
    fun square (n:int):int = n * n
    fun halve (n:int):int = n div 2
    fun is_even (n:int):bool = (n mod 2 = 0)

Thus pal "ot" evaluates to the string "otto", and is_even 4 evaluates to true.
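A usage sketch for these declarations follows. The function definitions are those of the text; the result names (fourthrootV, r, otto, yes, ten) are hypothetical and exist only to hold the values.

```sml
(* The val form and the fun form bind equivalent functions. *)
val fourthrootV : real -> real =
  fn x : real => Math.sqrt (Math.sqrt x)
fun fourthroot (x : real) : real = Math.sqrt (Math.sqrt x)

(* Call-by-value: 2.0 + 2.0 is evaluated to 4.0 before the call. *)
val r = fourthroot (2.0 + 2.0)

fun srev (s : string) : string = implode (rev (explode s))
fun pal (s : string) : string = s ^ (srev s)
fun is_even (n : int) : bool = (n mod 2 = 0)

val otto = pal "ot"
val yes = is_even 4

(* Functions are first-class: an anonymous function may be applied
   without ever being given a name. *)
val ten = (fn n : int => n + n) 5
```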
    fn var:typ => exp

binds the variable var within the body exp of the function. Unlike val bindings, function expressions bind a variable without giving it a specific value. The value of the parameter is only determined when the function is applied, and then only temporarily, for the duration of the evaluation of its body.

It is worth reviewing the rules for binding and scope of variables that we introduced in chapter 3 in the presence of function expressions. As before we adhere to the principle of static scope, according to which variables are taken to refer to the nearest enclosing binding of that variable, whether by a val binding or by a fn expression.

Thus, in the following example, the occurrences of x in the body of the function f refer to the parameter of f, whereas the occurrences of x in the body of g refer to the preceding val binding.

    val x:real = 2.0
    fun f(x:real):real = x+x
    fun g(y:real):real = x+y

Local val bindings may shadow parameters, as well as other val bindings. For example, consider the following function declaration:

    fun h(x:real):real = let val x:real = 2.0 in x+x end * x

The inner binding of x by the val declaration shadows the parameter x of h, but only within the body of the let expression. Thus the last occurrence of x refers to the parameter of h, whereas the preceding two occurrences refer to the inner binding of x to 2.0.

The phrases "inner" and "outer" binding refer to the logical structure, or abstract syntax, of an expression. In the preceding example, the body of h lies "within" the scope of the parameter x, and the expression x+x lies within the scope of the val binding for x. Since the occurrences of x within the body of the let lie within the scope of the inner val binding, they are taken to refer to that binding, rather than to the parameter. On the other hand the last occurrence of x does not lie within the scope of the val binding, and hence refers to the parameter of h.

In general the names of parameters do not matter; we can rename them at will without affecting the meaning of the program, provided that we
simultaneously (and consistently) rename the binding occurrence and all uses of that variable. Thus the functions f and g below are completely equivalent to each other:

    fun f(x:int):int = x*x
    fun g(y:int):int = y*y

A parameter is just a placeholder; its name is not important.

Our ability to rename parameters is constrained by the static scoping rule. We may rename a parameter to whatever we'd like, provided that we don't change the way in which uses of a variable are resolved. For example, consider the following situation:

    val x:real = 2.0
    fun h(y:real):real = x+y

The parameter y to h may be renamed to z without affecting its meaning. However, we may not rename it to x, for doing so changes its meaning! That is, the function

    fun h'(x:real):real = x+x

does not have the same meaning as h, because now both occurrences of x in the body of h' refer to the parameter, whereas in h the variable x refers to the outer val binding and only the variable y refers to the parameter.

While this may seem like a minor technical issue, it is essential that you master these concepts now, for they play a central, and rather subtle, role later on.

4.4 Sample Code

Here is the complete code for this chapter.
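The declarations discussed in this chapter can be gathered into a single file and loaded into an interactive session. What follows is a sketch assembled from the examples in the text, not the author's own listing; the results noted in the comments follow from the call-by-value and static-scope rules described above.

```sml
(* Fourth root, written with fun notation (section 4.2). *)
fun fourthroot (x:real):real = Math.sqrt (Math.sqrt x)

(* Functions over strings and integers from section 4.2. *)
fun srev (s:string):string = implode (rev (explode s))
fun pal (s:string):string = s ^ (srev s)
fun is_even (n:int):bool = (n mod 2 = 0)

(* Static scope (section 4.3): x in the body of f is the parameter,
   x in the body of g is the preceding val binding. *)
val x:real = 2.0
fun f (x:real):real = x+x
fun g (y:real):real = x+y

(* The inner val binding shadows the parameter only inside the let;
   the final x is the parameter again. *)
fun h (x:real):real = let val x:real = 2.0 in x+x end * x

val r1 = f 3.0   (* 6.0: both occurrences of x are the parameter *)
val r2 = g 3.0   (* 5.0: x is the outer binding, 2.0 *)
val r3 = h 3.0   (* 12.0: (2.0 + 2.0) * 3.0 *)
```

Renaming the parameter of g from y to x would turn it into f, which is exactly the change of meaning that the static scoping rule warns about.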
Chapter 5
Products and Records

5.1 Product Types

A distinguishing feature of ML is that aggregate data structures, such as tuples, lists, arrays, or trees, may be created and manipulated with ease. In contrast to most familiar languages it is not necessary in ML to be concerned with allocation and deallocation of data structures, nor with any particular representation strategy involving, say, pointers or address arithmetic. Instead we may think of data structures as first-class values, on a par with every other value in the language. Just as it is unnecessary to think about "allocating" integers to evaluate an arithmetic expression, it is unnecessary to think about allocating more complex data structures such as tuples or lists.

5.1.1 Tuples

This chapter is concerned with the simplest form of aggregate data structure, the n-tuple. An n-tuple is a finite ordered sequence of values of the form (val1,...,valn), where each vali is a value. A 2-tuple is usually called a pair, a 3-tuple a triple, and so on.

An n-tuple is a value of a product type of the form typ1*...*typn.
Values of this type are n-tuples of the form (val1,...,valn), where vali is a value of type typi (for each 1 ≤ i ≤ n).

Thus the following are well-formed bindings:

    val pair : int * int = (2, 3)
    val triple : int * real * string = (2, 2.0, "2")
    val quadruple : int * int * real * real = (2,3,2.0,3.0)
    val pair_of_pairs : (int * int) * (real * real) = ((2,3),(2.0,3.0))

The nesting of parentheses matters! A pair of pairs is not the same as a quadruple, so the last two bindings are of distinct values with distinct types.

There are two limiting cases, n = 0 and n = 1, that deserve special attention. A 0-tuple, which is also known as a null tuple, is the empty sequence of values, (). It is a value of type unit, which may be thought of as the 0-tuple type.[1] The null tuple type is surprisingly useful, especially when programming with effects. On the other hand there seems to be no particular use for 1-tuples, and so they are absent from the language.

[1] In Java (and other languages) the type unit is misleadingly written void, which suggests that the type has no members, but in fact it has exactly one!

As a convenience, ML also provides a general tuple expression of the form (exp1,...,expn), where each expi is an arbitrary expression, not necessarily a value. Tuple expressions are evaluated from left to right, so that the above tuple expression evaluates to the tuple value (val1,...,valn), provided that exp1 evaluates to val1, exp2 evaluates to val2, and so on. For example, the binding
    val pair : int * int = (1+1, 5-2)

binds the value (2, 3) to the variable pair.

Strictly speaking, it is not essential to have tuple expressions as a primitive notion in the language. Rather than write (exp1,...,expn), with the (implicit) understanding that the expi's are evaluated from left to right, we may instead write

    let
      val x1 = exp1
      val x2 = exp2
      ...
      val xn = expn
    in
      (x1,...,xn)
    end

which makes the evaluation order explicit.

5.1.2 Tuple Patterns

One of the most powerful, and distinctive, features of ML is the use of pattern matching to access components of aggregate data structures. For example, suppose that val is a value of type

    (int * string) * (real * char)

and we wish to retrieve the first component of the second component of val, a value of type real. Rather than explicitly "navigate" to this position to retrieve it, we may simply use a generalized form of value binding in which we select that component using a pattern:

    val ((_, _), (r:real, _)) = val

The left-hand side of the val binding is a tuple pattern that describes a pair of pairs, binding the first component of the second component to the variable r. The underscores indicate "don't care" positions in the pattern; their values are not bound to any variable.

If we wish to give names to all of the components, we may use the following value binding:

    val ((i:int, s:string), (r:real, c:char)) = val
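To make the two pattern bindings above concrete, here is a small sketch in which a hypothetical value v stands in for the metavariable val. The name v, its components, and the variable r' are chosen purely for illustration.

```sml
(* v is a hypothetical value of type (int * string) * (real * char). *)
val v : (int * string) * (real * char) = ((1, "one"), (2.5, #"c"))

(* Select only the first component of the second component;
   the wildcards bind nothing. *)
val ((_, _), (r:real, _)) = v

(* Give a name to every component. *)
val ((i:int, s:string), (r':real, c:char)) = v
```

After these bindings r and r' are both 2.5, i is 1, s is "one", and c is #"c".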
If we'd like we can even give names to the first and second components of the pair, without decomposing them into constituent parts:

    val (is:int*string, rc:real*char) = val

The general form of a value binding is

    val pat = exp,

where pat is a pattern and exp is an expression. A pattern is one of three forms:

1. A variable pattern of the form var:typ.
2. A tuple pattern of the form (pat1,...,patn), where each pati is a pattern. This includes as a special case the null-tuple pattern, ().
3. A wildcard pattern of the form _.

The type of a pattern is determined by an inductive analysis of the form of the pattern:

1. A variable pattern var:typ is of type typ.
2. A tuple pattern (pat1,...,patn) has type typ1*···*typn, where each pati is a pattern of type typi. The null-tuple pattern () has type unit.
3. The wildcard pattern _ has any type whatsoever.

A value binding of the form val pat = exp is well-typed iff pat and exp have the same type; otherwise the binding is ill-typed and is rejected.

For example, the following bindings are well-typed:

    val (m:int, n:int) = (7+1,4 div 2)
    val (m:int, r:real, s:string) = (7, 7.0, "7")
    val ((m:int,n:int), (r:real, s:real)) = ((4,5),(3.1,2.7))
    val (m:int, n:int, r:real, s:real) = (4,5,3.1,2.7)

In contrast, the following are ill-typed:
    val (m:int,n:int,r:real,s:real) = ((4,5),(3.1,2.7))
    val (m:int, r:real) = (7+1,4 div 2)
    val (m:int, r:real) = (7, 7.0, "7")

Value bindings are evaluated using the bind-by-value principle discussed earlier, except that the binding process is now more complex than before. First, we evaluate the right-hand side of the binding to a value (if indeed it has one). This happens regardless of the form of the pattern: the right-hand side is always evaluated. Second, we perform pattern matching to determine the bindings for the variables in the pattern.

The process of matching a value against a pattern is defined by a set of rules for reducing bindings with complex patterns to a set of bindings with simpler patterns, stopping once we reach a binding with a variable pattern. The rules are as follows:

1. The variable binding val var = val is irreducible.
2. The wildcard binding val _ = val is discarded.
3. The tuple binding

       val (pat1,...,patn) = (val1,...,valn)

   is reduced to the set of n bindings

       val pat1 = val1
       ...
       val patn = valn

   In the case that n = 0 the tuple binding is simply discarded.

These simplifications are repeated until all bindings are irreducible, which leaves us with a set of variable bindings that constitute the result of pattern matching.

For example, evaluation of the binding

    val ((m:int,n:int), (r:real, s:real)) = ((2,3),(2.0,3.0))

proceeds as follows. First, we decompose this binding into the following two bindings:
    val (m:int, n:int) = (2,3)
    and (r:real, s:real) = (2.0,3.0).

Then we decompose each of these bindings in turn, resulting in the following set of four atomic bindings:

    val m:int = 2
    and n:int = 3
    and r:real = 2.0
    and s:real = 3.0

At this point the pattern-matching process is complete.

5.2 Record Types

Tuples are most useful when the number of positions is small. When the number of components grows beyond a small number, it becomes difficult to remember which position plays which role. In that case it is more natural to attach a label to each component of the tuple that mediates access to it. This is the notion of a record type.

A record type has the form

    {lab1:typ1,...,labn:typn},

where n ≥ 0, and all of the labels labi are distinct. A record value has the form

    {lab1=val1,...,labn=valn},

where vali has type typi. A record pattern has the form

    {lab1=pat1,...,labn=patn}

which has type

    {lab1:typ1,...,labn:typn}

provided that each pati has type typi.

A record value binding of the form
    val {lab1=pat1,...,labn=patn} = {lab1=val1,...,labn=valn}

is decomposed into the following set of bindings

    val pat1 = val1
    and ...
    and patn = valn.

Since the components of a record are identified by name, not position, the order in which they occur in a record value or record pattern is not important. However, in a record expression (in which the components may not be fully evaluated), the fields are evaluated from left to right in the order written, just as for tuple expressions.

Here are some examples to help clarify the use of record types. First, let us define the record type hyperlink as follows:

    type hyperlink =
      { protocol : string,
        address : string,
        display : string }

The record binding

    val mailto_rwh : hyperlink =
      { protocol="mailto",
        address="rwh@cs.cmu.edu",
        display="Robert Harper" }

defines a variable of type hyperlink. The record binding

    val { protocol=prot, display=disp, address=addr } = mailto_rwh

decomposes into the three variable bindings

    val prot = "mailto"
    val addr = "rwh@cs.cmu.edu"
    val disp = "Robert Harper"

which extract the values of the fields of mailto_rwh.

Using wild cards we can extract selected fields from a record. For example, we may write
    val {protocol=prot, address=_, display=_} = mailto_rwh

to bind the variable prot to the protocol field of the record value mailto_rwh.

It is quite common to encounter record types with tens of fields. In such cases even the wild card notation doesn't help much when it comes to selecting one or two fields from such a record. For this we often use ellipsis patterns in records, as illustrated by the following example.

    val {protocol=prot,...} = intro_home

The pattern {protocol=prot,...} stands for the expanded pattern

    {protocol=prot, address=_, display=_}

in which the elided fields are implicitly bound to wildcard patterns.

In general the ellipsis is replaced by as many wildcard bindings as are necessary to fill out the pattern to be consistent with its type. In order for this to occur the compiler must be able to determine unambiguously the type of the record pattern. Here the right-hand side of the value binding determines the type of the pattern, which then determines which additional fields to fill in. In some situations the context does not disambiguate, in which case you must supply additional type information, or avoid the use of ellipsis notation.

Finally, ML provides a convenient abbreviated form of record pattern

    {lab1,...,labn}

which stands for the pattern

    {lab1=var1,...,labn=varn}

where the variables vari are variables with the same name as the corresponding label labi. For example, the binding

    val { protocol, address, display } = mailto_rwh

decomposes into the sequence of atomic bindings

    val protocol = "mailto"
    val address = "rwh@cs.cmu.edu"
    val display = "Robert Harper"

This avoids the need to think up a variable name for each field; we can just make the label do "double duty" as a variable.
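The record forms described in this section can be tried together. The sketch below reuses the hyperlink type and the mailto_rwh value from the text; the function link_text is a hypothetical name added here to show one way of supplying the extra type information that an ellipsis pattern may need.

```sml
type hyperlink = { protocol : string, address : string, display : string }

val mailto_rwh : hyperlink =
  { protocol="mailto", address="rwh@cs.cmu.edu", display="Robert Harper" }

(* Ellipsis pattern: the type of mailto_rwh determines the elided fields. *)
val {protocol=prot, ...} = mailto_rwh

(* Abbreviated pattern: each label doubles as a variable name. *)
val {protocol, address, display} = mailto_rwh

(* Hypothetical helper: without the annotation the compiler could not tell
   which record type the ellipsis stands for. *)
fun link_text ({display=d, ...} : hyperlink) : string = d
```

Removing the ": hyperlink" annotation from link_text leaves the record type undetermined, which is the situation the disambiguation requirement rules out.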
5.3 Multiple Arguments and Multiple Results

A function may bind more than one argument by using a pattern, rather than a variable, in the argument position. Function expressions are generalized to have the form

    fn pat => exp

where pat is a pattern and exp is an expression. Application of such a function proceeds much as before, except that the argument value is matched against the parameter pattern to determine the bindings of zero or more variables, which are then used during the evaluation of the body of the function.

For example, we may make the following definition of the Euclidean distance function:

    val dist : real * real -> real =
      fn (x:real, y:real) => Math.sqrt (x*x + y*y)

This function may then be applied to a pair (a two-tuple!) of arguments to yield the distance between them. For example, dist (2.0,3.0) evaluates to (approximately) 3.6.

Using fun notation, the distance function may be defined more concisely as follows:

    fun dist (x:real, y:real):real = Math.sqrt (x*x + y*y)

The meaning is the same as the more verbose val binding given earlier.

Keyword parameter passing is supported through the use of record patterns. For example, we may define the distance function using keyword parameters as follows:

    fun dist' {x=x:real, y=y:real} = Math.sqrt (x*x + y*y)

The expression dist' {x=2.0,y=3.0} invokes this function with the indicated x and y values.

Functions with multiple results may be thought of as functions yielding tuples (or records). For example, we might compute two different notions of distance between two points at once as follows:
    fun dist2 (x:real, y:real):real*real =
      (Math.sqrt (x*x+y*y), Real.abs (x-y))

Notice that the result type is a pair, which may be thought of as two results.

These examples illustrate a pleasing regularity in the design of ML. Rather than introduce ad hoc notions such as multiple arguments, multiple results, or keyword parameters, we make use of the general mechanisms of tuples, records, and pattern matching.

It is sometimes useful to have a function to select a particular component from a tuple or record (e.g., the third component or the component with a given label). Such functions may be easily defined using pattern matching. But since they arise so frequently, they are pre-defined in ML using sharp notation. For any tuple type typ1*···*typn, and each 1 ≤ i ≤ n, there is a function #i of type typ1*···*typn->typi defined as follows:

    fun #i (_, ..., _, x, _, ..., _) = x

where x occurs in the ith position of the tuple (and there are underscores in the other n − 1 positions).

Thus we may refer to the second field of a three-tuple val by writing #2(val). It is bad style, however, to over-use the sharp notation; code is generally clearer and easier to maintain if you use patterns wherever possible. Compare, for example, the following definition of the Euclidean distance function written using sharp notation with the original.

    fun dist (p:real*real):real =
      Math.sqrt((#1 p)*(#1 p)+(#2 p)*(#2 p))

You can easily see that this gets out of hand very quickly, leading to unreadable code. Use of the sharp notation is strongly discouraged!

A similar notation is provided for record field selection. The following function #lab selects the component of a record with label lab.

    fun #lab {lab=x,...} = x

Notice the use of ellipsis! Bear in mind the disambiguation requirement: any use of #lab must be in a context sufficient to determine the full record type of its argument.
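The following sketch puts the three styles of this section side by side: a tuple pattern, keyword parameters via a record pattern, and a pair of results. It assumes Real.abs from the Basis Library for the absolute value, and the value t is a hypothetical three-tuple used only to show sharp notation.

```sml
(* Pattern in the argument position. *)
fun dist (x:real, y:real):real = Math.sqrt (x*x + y*y)

(* Keyword parameters via a record pattern. *)
fun dist' {x=x:real, y=y:real} = Math.sqrt (x*x + y*y)

(* Two results returned as a pair. *)
fun dist2 (x:real, y:real):real*real =
  (Math.sqrt (x*x + y*y), Real.abs (x-y))

(* Sharp notation: the annotation on t fixes the tuple type that #2 needs. *)
val t : int * string * bool = (1, "two", true)
val second = #2 t

val d  = dist (3.0, 4.0)            (* 5.0 *)
val d' = dist' {y=4.0, x=3.0}       (* 5.0; field order is irrelevant *)
val (norm, gap) = dist2 (3.0, 4.0)  (* norm = 5.0, gap = 1.0 *)
```

The last binding also shows how a caller receives "multiple results": a tuple pattern on the left-hand side names each component directly, with no need for #1 or #2.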
5.4 Sample Code

Here is the complete code for this chapter.
Chapter 6
Case Analysis

6.1 Homogeneous and Heterogeneous Types

Tuple types have the property that all values of that type have the same form (n-tuples, for some n determined by the type); they are said to be homogeneous. For example, all values of type int*real are pairs whose first component is an integer and whose second component is a real. Any type-correct pattern will match any value of that type; there is no possibility of failure of pattern matching. The pattern (x:int,y:real) is of type int*real and hence will match any value of that type. On the other hand the pattern (x:int,y:real,z:string) is of type int*real*string and cannot be used to match against values of type int*real; attempting to do so fails at compile time.

Other types have values of more than one form; they are said to be heterogeneous types. For example, a value of type int might be 0, 1, ~1, ... or a value of type char might be #"a" or #"z". (Other examples of heterogeneous types will arise later on.) Corresponding to each of the values of these types is a pattern that matches only that value. Attempting to match any other value against that pattern fails at execution time with an error condition called a bind failure.

Here are some examples of pattern-matching against values of a heterogeneous type:

    val 0 = 1-1
    val (0,x) = (1-1, 34)
    val (0, #"0") = (2-1, #"0")
The first two bindings succeed, the third fails. In the case of the second, the variable x is bound to 34 after the match. No variables are bound in the first or third examples.

6.2 Clausal Function Expressions

The importance of constant patterns becomes clearer once we consider how to define functions over heterogeneous types. This is achieved in ML using a clausal function expression whose general form is

    fn pat1 => exp1
     | ...
     | patn => expn

Each pati is a pattern and each expi is an expression involving the variables of the pattern pati. Each component pat=>exp is called a clause, or a rule. The entire assembly of rules is called a match.

The typing rules for matches ensure consistency of the clauses. Specifically, there must exist types typ1 and typ2 such that

1. Each pattern pati has type typ1.
2. Each expression expi has type typ2, given the types of the variables in pattern pati.

If these requirements are satisfied, the function has the type typ1->typ2.

Application of a clausal function to a value val proceeds by considering the clauses in the order written. At stage i, where 1 ≤ i ≤ n, the argument value val is matched against the pattern pati; if the pattern match succeeds, evaluation continues with the evaluation of expression expi, with the variables of pati replaced by their values as determined by pattern matching. Otherwise we proceed to stage i + 1. If no pattern matches (i.e., we reach stage n + 1), then the application fails with an execution error called a match failure.

Here's an example. Consider the following clausal function:

    val recip : int -> int =
      fn 0 => 0 | n:int => 1 div n
This defines an integer-valued reciprocal function on the integers, where the reciprocal of 0 is arbitrarily defined to be 0. The function has two clauses, one for the argument 0, the other for non-zero arguments n. (Note that n is guaranteed to be non-zero because the patterns are considered in order: we reach the pattern n:int only if the argument fails to match the pattern 0.)

The fun notation is also generalized so that we may define recip using the following more concise syntax:

    fun recip 0 = 0
      | recip (n:int) = 1 div n

One annoying thing to watch out for is that the fun form uses an equal sign to separate the pattern from the expression in a clause, whereas the fn form uses a double arrow.

Case analysis on the values of a heterogeneous type is performed by application of a clausally-defined function. The notation

    case exp
     of pat1 => exp1
      | ...
      | patn => expn

is short for the application

    (fn pat1 => exp1
      | ...
      | patn => expn)
    exp.

Evaluation proceeds by first evaluating exp, then matching its value successively against the patterns in the match until one succeeds, and continuing with evaluation of the corresponding expression. The case expression fails if no pattern succeeds in matching the value.

6.3 Booleans and Conditionals, Revisited

The type bool of booleans is perhaps the most basic example of a heterogeneous type. Its values are true and false. Functions may be defined
on booleans using clausal definitions that match against the patterns true and false. For example, the negation function may be defined clausally as follows:

    fun not true = false
      | not false = true

The conditional expression

    if exp then exp1 else exp2

is short-hand for the case analysis

    case exp
     of true => exp1
      | false => exp2

which is itself short-hand for the application

    (fn true => exp1 | false => exp2) exp.

The "short-circuit" conjunction and disjunction operations are defined as follows. The expression exp1 andalso exp2 is short for

    if exp1 then exp2 else false

and the expression exp1 orelse exp2 is short for

    if exp1 then true else exp2.

You should expand these into case expressions and check that they behave as expected. Pay particular attention to the evaluation order, and observe that the call-by-value principle is not violated by these expressions.

6.4 Exhaustiveness and Redundancy

Matches are subject to two forms of "sanity check" as an aid to the ML programmer. The first, called exhaustiveness checking, ensures that a well-formed match covers its domain type in the sense that every value of the
domain must match one of its clauses. The second, called redundancy checking, ensures that no clause of a match is subsumed by the clauses that precede it. This means that the set of values covered by a clause in a match must not be contained entirely within the set of values covered by the preceding clauses of that match.

Redundant clauses are always a mistake: such a clause can never be executed. Redundant rules often arise accidentally. For example, the second rule of the following clausal function definition is redundant:

    fun not True = false
      | not False = true

By capitalizing True we have turned it into a variable, rather than a constant pattern. Consequently, every value matches the first clause, rendering the second redundant.

Since the clauses of a match are considered in the order they are written, redundancy checking is correspondingly order-sensitive. In particular, changing the order of clauses in a well-formed, irredundant match can make it redundant, as in the following example:

    fun recip (n:int) = 1 div n
      | recip 0 = 0

The second clause is redundant because the first matches any integer value, including 0.

Inexhaustive matches may or may not be in error, depending on whether the match might ever be applied to a value that is not covered by any clause. Here is an example of a function with an inexhaustive match that is plausibly in error:

    fun is_numeric #"0" = true
      | is_numeric #"1" = true
      | is_numeric #"2" = true
      | is_numeric #"3" = true
      | is_numeric #"4" = true
      | is_numeric #"5" = true
      | is_numeric #"6" = true
      | is_numeric #"7" = true
      | is_numeric #"8" = true
      | is_numeric #"9" = true
When applied to, say, #"a", this function fails. Indeed, the function never returns false for any argument!

Perhaps what was intended here is to include a catch-all clause at the end:

    fun is_numeric #"0" = true
      | is_numeric #"1" = true
      | is_numeric #"2" = true
      | is_numeric #"3" = true
      | is_numeric #"4" = true
      | is_numeric #"5" = true
      | is_numeric #"6" = true
      | is_numeric #"7" = true
      | is_numeric #"8" = true
      | is_numeric #"9" = true
      | is_numeric _ = false

The addition of a final catch-all clause renders the match exhaustive, because any value not matched by the first ten clauses will surely be matched by the eleventh.

Having said that, it is a very bad idea to simply add a catch-all clause to the end of every match to suppress inexhaustiveness warnings from the compiler. The exhaustiveness checker is your friend! Each such warning is a suggestion to double-check that match to be sure that you've not made a silly error of omission, but rather have intentionally left out cases that are ruled out by the invariants of the program. In chapter 10 we will see that the exhaustiveness checker is an extremely valuable tool for managing code evolution.

6.5 Sample Code

Here is the complete code for this chapter.
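A few of the definitions from this chapter, gathered into a loadable sketch. It is assembled from the examples in the text rather than taken from the author's listing, and the names recip', safe_divides, and safe_divides' are hypothetical, introduced here to show a case expression and the expansion of andalso.

```sml
(* Clausal definition; the clauses are tried in the order written. *)
fun recip 0 = 0
  | recip (n:int) = 1 div n

(* The same analysis written as a case expression. *)
fun recip' (n:int):int =
  case n
   of 0 => 0
    | m => 1 div m

(* Short-circuit conjunction: when d is 0 the second operand is never
   evaluated, so no division by zero occurs. *)
fun safe_divides (d:int, n:int):bool =
  d <> 0 andalso n mod d = 0

(* andalso expanded by hand into the case analysis it abbreviates. *)
fun safe_divides' (d:int, n:int):bool =
  case d <> 0
   of true => n mod d = 0
    | false => false
```

Swapping the two clauses of recip would make the clause for 0 redundant, and the application recip 0 would then attempt 1 div 0.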
Chapter 7
Recursive Functions

So far we've only considered very simple functions (such as the reciprocal function) whose value is computed by a simple composition of primitive functions. In this chapter we introduce recursive functions, the principal means of iterative computation in ML. Informally, a recursive function is one that computes the result of a call by possibly making further calls to itself. Obviously, to avoid infinite regress, some calls must return their results without making any recursive calls. Those that do make recursive calls must ensure that the arguments are, in some sense, "smaller" so that the process will eventually terminate.

This informal description obscures a central point, namely the means by which we may convince ourselves that a function computes the result that we intend. In general we must prove that for all inputs of the domain type, the body of the function computes the "correct" value of result type. Usually the argument imposes some additional assumptions on the inputs, called the pre-conditions. The correctness requirement for the result is called a post-condition. Our burden is to prove that for every input satisfying the pre-conditions, the body evaluates to a result satisfying the post-condition. In fact we may carry out such an analysis for many different pre- and post-condition pairs, according to our interest. For example, the ML type checker proves that the body of a function yields a value of the range type (if it terminates) whenever it is given an argument of the domain type. Here the domain type is the pre-condition, and the range type is the post-condition. In most cases we are interested in deeper properties, examples of which we shall consider below.

To prove the correctness of a recursive function (with respect to given
pre- and post-conditions) it is typically necessary to use some form of inductive reasoning. The base cases of the induction correspond to those cases that make no recursive calls; the inductive step corresponds to those that do. The beauty of inductive reasoning is that we may assume that the recursive calls work correctly when showing that a case involving recursive calls is correct. We must separately show that the base cases satisfy the given pre- and post-conditions. Taken together, these two steps are sufficient to establish the correctness of the function itself, by appeal to an induction principle that justifies the particular pattern of recursion.

No doubt this all sounds fairly theoretical. The point of this chapter is to show that it is also profoundly practical.

7.1 Self-Reference and Recursion

In order for a function to "call itself", it must have a name by which it can refer to itself. This is achieved by using a recursive value binding, which is an ordinary value binding qualified by the keyword rec. The simplest form of a recursive value binding is as follows:

    val rec var:typ = val.

As in the non-recursive case, the left-hand side is a pattern, but here the right-hand side must be a value. In fact the right-hand side must be a function expression, since only functions may be defined recursively in ML. The function may refer to itself by using the variable var.

Here's an example of a recursive value binding:

    val rec factorial : int->int =
      fn 0 => 1 | n:int => n * factorial (n-1)

Using fun notation we may write the definition of factorial much more clearly and concisely as follows:

    fun factorial 0 = 1
      | factorial (n:int) = n * factorial (n-1)

There is obviously a close correspondence between this formulation of factorial and the usual textbook definition of the factorial function in
terms of recursion equations:

    0! = 1
    n! = n × (n − 1)!    (n > 0)

Recursive value bindings are type-checked in a manner that may, at first glance, seem paradoxical. To check that the binding

    val rec var : typ = val

is well-formed, we ensure that the value val has type typ, assuming that var has type typ. Since var refers to the value val itself, we are in effect assuming what we intend to prove while proving it!

(Incidentally, since val is required to be a function expression, the type typ will always be a function type.)

Let's look at an example. To check that the binding for factorial given above is well-formed, we assume that the variable factorial has type int->int, then check that its definition, the function

    fn 0 => 1 | n:int => n * factorial (n-1),

has type int->int. To do so we must check that each clause has type int->int by checking for each clause that its pattern has type int and that its expression has type int. This is clearly true for the first clause of the definition. For the second, we assume that n has type int, then check that n * factorial (n-1) has type int. This is so because of the rules for the primitive arithmetic operations and because of our assumption that factorial has type int->int.

How are applications of recursive functions evaluated? The rules are almost the same as before, with one modification. We must arrange that all occurrences of the variable standing for the function are replaced by the function itself before we evaluate the body. That way all references to the variable standing for the function itself are indeed references to the function itself!

Suppose that we have the following recursive function binding

    val rec var : typ =
      fn pat1 => exp1 | ... | patn => expn
and we wish to apply var to the value val of type typ. As before, we consider each clause in turn, until we find the first pattern pati matching val. We proceed, as before, by evaluating expi, replacing the variables in pati by the bindings determined by pattern matching, but, in addition, we replace all occurrences of var by its binding in expi before continuing evaluation.

For example, to evaluate factorial 3, we proceed by retrieving the binding of factorial and evaluating

    (fn 0=>1 | n:int => n*factorial(n-1))(3).

Considering each clause in turn, we find that the first doesn’t match, but the second does. We therefore continue by evaluating its right-hand side, the expression n * factorial(n-1), after replacing n by 3 and factorial by its definition. We are left with the sub-problem of evaluating the expression

    3 * (fn 0 => 1 | n:int => n*factorial(n-1))(2)

Proceeding as before, we reduce this to the sub-problem of evaluating

    3 * (2 * (fn 0=>1 | n:int => n*factorial(n-1))(1)),

which reduces to the sub-problem of evaluating

    3 * (2 * (1 * (fn 0=>1 | n:int => n*factorial(n-1))(0))),

which reduces to

    3 * (2 * (1 * 1)),

which then evaluates to 6, as desired.

Observe that the repeated substitution of factorial by its definition ensures that the recursive calls really do refer to the factorial function itself. Also observe that the size of the sub-problems grows until there are no more recursive calls, at which point the computation can complete. In broad outline, the computation proceeds as follows:

1. factorial 3
2. 3 * factorial 2
3. 3 * 2 * factorial 1
4. 3 * 2 * 1 * factorial 0
5. 3 * 2 * 1 * 1
6. 3 * 2 * 1
7. 3 * 2
8. 6

Notice that the size of the expression first grows (in direct proportion to the argument), then shrinks as the pending multiplications are completed. This growth in expression size corresponds directly to a growth in run-time storage required to record the state of the pending computation.

7.2 Iteration

The definition of factorial given above should be contrasted with the following two-part definition:

    fun helper (0,r:int) = r
      | helper (n:int,r:int) = helper (n-1,n*r)
    fun factorial (n:int) = helper (n,1)

First we define a “helper” function that takes two parameters, an integer argument and an accumulator that records the running partial result of the computation. The idea is that the accumulator re-associates the pending multiplications in the evaluation trace given above so that they can be performed prior to the recursive call, rather than after it completes. This reduces the space required to keep track of those pending steps. Second, we define factorial by calling helper with argument n and initial accumulator value 1, corresponding to the product of zero terms (empty prefix).

As a matter of programming style, it is usual to conceal the definitions of helper functions using a local declaration. In practice we would make the following definition of the iterative version of factorial:
    local
      fun helper (0,r:int) = r
        | helper (n:int,r:int) = helper (n-1,n*r)
    in
      fun factorial (n:int) = helper (n,1)
    end

This way the helper function is not visible; only the function of interest is “exported” by the declaration.

The important thing to observe about helper is that it is iterative, or tail recursive, meaning that the recursive call is the last step of evaluation of an application of it to an argument. This means that the evaluation trace of a call to helper with arguments (3,1) has the following general form:

1. helper (3, 1)
2. helper (2, 3)
3. helper (1, 6)
4. helper (0, 6)
5. 6

Notice that there is no growth in the size of the expression because there are no pending computations to be resumed upon completion of the recursive call. Consequently, there is no growth in the space required for an application, in contrast to the first definition given above. Tail recursive definitions are analogous to loops in imperative languages: they merely iterate a computation, without requiring auxiliary storage.

7.3 Inductive Reasoning

Time and space usage are important, but what is more important is that the function compute the intended result. The key to the correctness of a recursive function is an inductive argument establishing its correctness. The critical ingredients are these:

1. An input-output specification of the intended behavior stating pre-conditions on the arguments and a post-condition on the result.
2. A proof that the specification holds for each clause of the function, assuming that it holds for any recursive calls.
3. An induction principle that justifies the correctness of the function as a whole, given the correctness of its clauses.

We’ll illustrate the use of inductive reasoning by a graduated series of examples. First consider the simple, non-tail recursive definition of factorial given in section 7.1. One reasonable specification for factorial is as follows:

1. Pre-condition: n ≥ 0.
2. Post-condition: factorial n evaluates to n!.

We are to establish the following statement of correctness of factorial:

    if n ≥ 0, then factorial n evaluates to n!.

That is, we show that the pre-conditions imply the post-condition holds for the result of any application. This is called a total correctness assertion because it states not only that the post-condition holds of any result of application, but, moreover, that every application in fact yields a result (subject to the pre-condition on the argument).

In contrast, a partial correctness assertion does not insist on termination, only that the post-condition holds whenever the application terminates. This may be stated as the assertion

    if n ≥ 0 and factorial n evaluates to p, then p = n!.

Notice that this statement is true of a function that diverges whenever it is applied! In this sense a partial correctness assertion is weaker than a total correctness assertion.

Let us establish the total correctness of factorial using the pre- and post-conditions stated above. To do so, we apply the principle of mathematical induction on the argument n. Recall that this means we are to establish the specification for the case n = 0, and, assuming it to hold for n ≥ 0, show that it holds for n + 1. The base case, n = 0, is trivial: by definition factorial n evaluates to 1, which is 0!. Now suppose that n = m + 1 for some m ≥ 0. By the inductive hypothesis we have that
factorial m evaluates to m! (since m ≥ 0), and so by definition factorial n evaluates to

    n × m! = (m + 1) × m! = (m + 1)! = n!,

as required. This completes the proof.

That was easy. What about the iterative definition of factorial? We focus on the behavior of helper. A suitable specification is given as follows:

1. Pre-condition: n ≥ 0.
2. Post-condition: helper (n, r) evaluates to n! × r.

To show the total correctness of helper with respect to this specification, we once again proceed by mathematical induction on n. We leave it as an exercise to give the details of the proof.

With this in hand it is easy to prove the correctness of factorial: if n ≥ 0 then factorial n evaluates to the result of helper (n, 1), which evaluates to n! × 1 = n!. This completes the proof.

Helper functions correspond to lemmas, main functions correspond to theorems. Just as we use lemmas to help us prove theorems, we use helper functions to help us define main functions. The foregoing argument shows that this is more than an analogy, but lies at the heart of good programming style.

Here’s an example of a function defined by complete induction (or strong induction), the Fibonacci function, defined on integers n ≥ 0:

    (* for n>=0, fib n yields the nth Fibonacci number *)
    fun fib 0 = 1
      | fib 1 = 1
      | fib (n:int) = fib (n-1) + fib (n-2)

The recursive calls are made not only on n-1, but also n-2, which is why we must appeal to complete induction to justify the definition. This definition of fib is very inefficient because it performs many redundant computations: to compute fib n requires that we compute fib (n-1) and fib (n-2). To compute fib (n-1) requires that we compute fib (n-2) a second time, and fib (n-3). Computing fib (n-2) requires computing fib
(n-3) again, and fib (n-4). As you can see, there is considerable redundancy here. It can be shown that the running time of fib is exponential in its argument, which is quite awful.

Here’s a better solution: for each n ≥ 0 compute not only the nth Fibonacci number, but also the (n − 1)st. (For n = 0 we define the “−1st” Fibonacci number to be zero.) That way we can avoid redundant recomputation, resulting in a linear-time algorithm. Here’s the code:

    (* for n>=0, fib' n evaluates to (a, b), where a is the nth
       Fibonacci number, and b is the (n-1)st *)
    fun fib' 0 = (1, 0)
      | fib' 1 = (1, 1)
      | fib' (n:int) =
          let val (a:int, b:int) = fib' (n-1)
          in (a+b, a) end

You might feel satisfied with this solution since it runs in time linear in n. It turns out (see Graham, Knuth, and Patashnik, Concrete Mathematics (Addison-Wesley 1989) for a derivation) that the recurrence

    F_0 = 1
    F_1 = 1
    F_n = F_{n−1} + F_{n−2}

has a closed-form solution over the real numbers. This means that the nth Fibonacci number can be calculated directly, without recursion, by using floating point arithmetic. However, this is an unusual case. In most instances recursively-defined functions have no known closed-form solution, so that some form of iteration is inevitable.

7.4 Mutual Recursion

It is often useful to define two functions simultaneously, each of which calls the other (and possibly itself) to compute its result. Such functions
are said to be mutually recursive. Here’s a simple example to illustrate the point, namely testing whether a natural number is odd or even. The most obvious approach is to test whether the number is congruent to 0 mod 2, and indeed this is what one would do in practice. But to illustrate the idea of mutual recursion we instead use the following inductive characterization: 0 is even, and not odd; n > 0 is even iff n − 1 is odd; n > 0 is odd iff n − 1 is even. This may be coded up using two mutually-recursive procedures as follows:

    fun even 0 = true
      | even n = odd (n-1)
    and odd 0 = false
      | odd n = even (n-1)

Notice that even calls odd and odd calls even, so they are not definable separately from one another. We join their definitions using the keyword and to indicate that they are defined simultaneously by mutual recursion.

7.5 Sample Code

Here is the complete code for this chapter.
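As a minimal sketch, and not the distributed sample file itself, the definitions developed in this chapter can be gathered into one file as follows. The name factorial' for the iterative version is an assumption introduced here so that both versions of factorial can coexist; everything else follows the definitions given in the text.

```sml
(* Recursive factorial (section 7.1). Pre-condition: n >= 0. *)
fun factorial 0 = 1
  | factorial (n:int) = n * factorial (n-1)

(* Iterative, tail-recursive factorial (section 7.2).
   The name factorial' is hypothetical; the text calls both
   versions factorial. *)
local
  fun helper (0, r:int) = r
    | helper (n:int, r:int) = helper (n-1, n*r)
in
  fun factorial' (n:int) = helper (n, 1)
end

(* Naive Fibonacci: exponential running time. *)
fun fib 0 = 1
  | fib 1 = 1
  | fib (n:int) = fib (n-1) + fib (n-2)

(* Linear-time Fibonacci: fib' n evaluates to (a, b), where a is the
   nth Fibonacci number and b is the (n-1)st. *)
fun fib' 0 = (1, 0)
  | fib' 1 = (1, 1)
  | fib' (n:int) =
      let val (a:int, b:int) = fib' (n-1)
      in (a+b, a) end

(* Mutually recursive parity tests. Pre-condition: n >= 0. *)
fun even 0 = true
  | even n = odd (n-1)
and odd 0 = false
  | odd n = even (n-1)
```

With these definitions, factorial 5 and factorial' 5 both evaluate to 120, and fib 5 agrees with the first component of fib' 5. All of the functions diverge on negative arguments, which is why each specification carries the pre-condition n ≥ 0.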
Chapter 8

Type Inference and Polymorphism

8.1 Type Inference

So far we’ve mostly written our programs in what is known as the explicitly typed style. This means that whenever we’ve introduced a variable, we’ve assigned it a type at its point of introduction. In particular every variable in a pattern has a type associated with it. As you may have noticed, this gets a little tedious after a while, especially when you’re using clausal function definitions. A particularly pleasant feature of ML is that it allows you to omit this type information whenever it can be determined from context. This process is known as type inference since the compiler is inferring the missing type information based on context.

For example, there is no need to give a type to the variable s in the function fn s:string => s ^ "\n". The reason is that no other type for s makes sense, since s is used as an argument to string concatenation. Consequently, you may write simply fn s => s ^ "\n", leaving ML to insert “:string” for you.

When is it allowable to omit this information? Almost always, with very few exceptions. It is a deep, and important, result about ML that
missing type information can (almost) always be reconstructed completely and unambiguously where it is omitted. This is called the principal typing property of ML: whenever type information is omitted, there is always a most general (i.e., least restrictive) way to recover the omitted type information. If there is no way to recover the omitted type information, then the expression is ill-typed. Otherwise there is a “best” way to fill in the blanks, which will (almost) always be found by the compiler. This is an amazingly useful, and widely under-appreciated, property of ML. It means, for example, that the programmer can enjoy the full benefits of a static type system without paying any notational penalty whatsoever!

The prototypical example is the identity function, fn x=>x. The body of the function places no constraints on the type of x, since it merely returns x as the result without performing any computation on it. Since the behavior of the identity function is the same for all possible choices of type for its argument, it is said to be polymorphic. Therefore the identity function has infinitely many types, one for each choice of the type of the parameter x. Choosing the type of x to be typ, the type of the identity function is typ->typ. In other words every type for the identity function has the form typ->typ, where typ is the type of the argument.

Clearly there is a pattern here, which is captured by the notion of a type scheme. A type scheme is a type expression involving one or more type variables standing for an unknown, but arbitrary type expression. Type variables are written 'a (pronounced “α”), 'b (pronounced “β”), 'c (pronounced “γ”), etc. An instance of a type scheme is obtained by replacing each of the type variables occurring in it with a type scheme, with the same type scheme replacing each occurrence of a given type variable. For example, the type scheme 'a->'a has instances int->int, string->string, (int*int)->(int*int), and ('b->'b)->('b->'b), among infinitely many others. However, it does not have the type int->string as an instance, since we are constrained to replace all occurrences of a type variable by the same type scheme. By contrast, the type scheme 'a->'b has both int->int and int->string as instances since there are different type variables occurring in the domain and range positions.

Type schemes are used to express the polymorphic behavior of functions. For example, we may write fn x:'a=>x for the polymorphic identity function of type 'a->'a, meaning that the behavior of the identity function is independent of the type of x. Similarly, the behavior of the function fn (x,y)=>x+1 is independent of the type of y, but constrains the
type of x to be int. This may be expressed using type schemes by writing this function in the explicitly-typed form fn (x:int,y:'a)=>x+1 with type int*'a->int.

In these examples we needed only one type variable to express the polymorphic behavior of a function, but usually we need more than one. For example, the function fn (x,y) => x constrains neither the type of x nor the type of y. Consequently we may choose their types freely and independently of one another. This may be expressed by writing this function in the form fn (x:'a,y:'b)=>x with type scheme 'a*'b->'a. Notice that while it is correct to assign the type 'a*'a->'a to this function, doing so would be overly restrictive since the types of the two parameters need not be the same. However, we could not assign the type 'a*'b->'c to this function because the type of the result must be the same as the type of the first parameter: it returns its first parameter when invoked! The type scheme 'a*'b->'a precisely captures the constraints that must be satisfied for the function to be type correct. It is said to be the most general or principal type scheme for the function.

It is a remarkable fact about ML that every expression (with the exception of a few pesky examples that we’ll discuss below) has a principal type scheme. That is, there is (almost) always a best or most general way to infer types for expressions that maximizes generality, and hence maximizes flexibility in the use of the expression. Every expression “seeks its own depth” in the sense that an occurrence of that expression is assigned a type that is an instance of its principal type scheme determined by the context of use. For example, if we write (fn x=>x)(0), the context forces the type of the identity function to be int->int, and if we write (fn x=>x)(fn x=>x)(0) the context forces the instance (int->int)->(int->int) of the principal type scheme for the identity at the first occurrence, and the instance int->int for the second.

How is this achieved? Type inference is a process of constraint satisfaction. First, the expression determines a set of equations governing the relationships among the types of its sub-expressions. For example, if a function
is applied to an argument, then a constraint equating the domain type of the function with the type of the argument is generated. Second, the constraints are solved using a process similar to Gaussian elimination, called unification. The equations can be classified by their solution sets as follows:

1. Overconstrained: there is no solution. This corresponds to a type error.
2. Underconstrained: there are many solutions. There are two sub-cases: ambiguous (due to overloading, which we will discuss further in section 8.3), or polymorphic (there is a “best” solution).
3. Uniquely determined: there is precisely one solution. This corresponds to a completely unambiguous type inference problem.

The free type variables in the solution to the system of equations may be thought of as determining the “degrees of freedom” or “range of polymorphism” of the type of an expression: the constraints are solvable for any choice of types to substitute for these free type variables.

This description of type inference as a constraint satisfaction procedure accounts for the notorious obscurity of type checking errors in ML. If a program is not type correct, then the system of constraints associated with it will not have a solution. The type inference procedure attempts to find a solution to these constraints, and at some point discovers that it cannot succeed. It is fundamentally impossible to attribute this inconsistency to any particular constraint; all that can be said is that the constraint set as a whole has no solution. The checker usually reports the first unsatisfiable equation it encounters, but this may or may not correspond to the “reason” (in the mind of the programmer) for the type error. The usual method for finding the error is to insert sufficient type information to narrow down the source of the inconsistency until the source of the difficulty is uncovered.

8.2 Polymorphic Definitions

There is an important interaction between polymorphic expressions and value bindings that may be illustrated by the following example. Suppose that we wish to bind the identity function to a variable I so that we may refer to it by name. We’ve previously observed that the identity function
is polymorphic, with principal type scheme 'a->'a. This may be captured by ascribing this type scheme to the variable I at the val binding. That is, we may write

    val I : 'a->'a = fn x=>x

to ascribe the type scheme 'a->'a to the variable I. (We may also write fun I(x:'a):'a = x for an equivalent binding of I.) Having done this, each use of I determines a distinct instance of the ascribed type scheme 'a->'a. That is, both I 0 and I I 0 are well-formed expressions, the first assigning the type int->int to I, the second assigning the types (int->int)->(int->int) and int->int to the two occurrences of I. Thus the variable I behaves precisely the same as its definition, fn x=>x, in any expression where it is used.

As a convenience ML also provides a form of type inference on value bindings that eliminates the need to ascribe a type scheme to the variable when it is bound. If no type is ascribed to a variable introduced by a val binding, then it is implicitly ascribed the principal type scheme of the right-hand side. For example, we may write val I = fn x=>x or fun I(x) = x as a binding for the variable I. The type checker implicitly assigns the principal type scheme, 'a->'a, of the binding to the variable I. In practice we often allow the type checker to infer the principal type of a variable, but it is often good form to explicitly indicate the intended type as a consistency check and for documentation purposes.

The treatment of val bindings during type checking ensures that a bound variable has precisely the same type as its binding. In other words
the type checker behaves as though all uses of the bound variable are implicitly replaced by its binding before type checking. Since this may involve replication of the binding, the meaning of a program is not necessarily preserved by this transformation. (Think, for example, of any expression that opens a window on your screen: if you replicate the expression and evaluate it twice, it will open two windows. This is not the same as evaluating it only once, which results in one window.)

To ensure semantic consistency, variables introduced by a val binding are allowed to be polymorphic only if the right-hand side is a value. This is called the value restriction on polymorphic declarations. For fun bindings this restriction is always met since the right-hand side is implicitly a lambda expression, which is a value. However, it might be thought that the following declaration introduces a polymorphic variable of type 'a -> 'a, but in fact it is rejected by the compiler:

    val J = I I

The reason is that the right-hand side is not a value; it requires computation to determine its value. It is therefore ruled out as inadmissible for polymorphism; the variable J may not be used polymorphically in the remainder of the program. In this case the difficulty may be avoided by writing instead

    fun J x = I I x

because now the binding of J is a lambda, which is a value.

In some rare circumstances this is not possible, and some polymorphism is lost. For example, the following declaration of a value of list type (list types are introduced in chapter 9)

    val l = nil @ nil

does not introduce an identifier with a polymorphic type, even though the almost equivalent declaration

    val l = nil

does do so. Since the right-hand side is a list, we cannot apply the “trick” of defining l to be a function; we are stuck with a loss of polymorphism in
this case. This particular example is not very impressive, but occasionally similar examples do arise in practice.

Why is the value restriction necessary? Later on, when we study mutable storage, we’ll see that some restriction on polymorphism is essential if the language is to be type safe. The value restriction is an easily-remembered sufficient condition for soundness, but as the examples above illustrate, it is by no means necessary. The designers of ML were faced with a choice of simplicity vs flexibility; in this case they opted for simplicity at the expense of some expressiveness in the language.

8.3 Overloading

Type information cannot always be omitted. There are a few corner cases that create problems for type inference, most of which arise because of concessions that are motivated by long-standing, if dubious, notational practices.

The main source of difficulty stems from overloading of arithmetic operators. As a concession to long-standing practice in informal mathematics and in many programming languages, the same notation is used for both integer and floating point arithmetic operations. As long as we are programming in an explicitly-typed style, this convention creates no particular problems. For example, in the function fn x:int => x+x it is clear that integer addition is called for, whereas in the function fn x:real => x+x it is equally obvious that floating point addition is intended.

However, if we omit type information, then a problem arises. What are we to make of the function fn x => x+x ? Does “+” stand for integer or floating point addition? There are two distinct reconstructions of the missing type information in this example, corresponding to the preceding two explicitly-typed programs. Which is the compiler to choose?

When presented with such a program, the compiler has two choices:
1. Declare the expression ambiguous, and force the programmer to provide enough explicit type information to resolve the ambiguity.
2. Arbitrarily choose a “default” interpretation, say the integer arithmetic, that forces one interpretation or another.

Each approach has its advantages and disadvantages. Many compilers choose the second approach, but issue a warning indicating that they have done so. To avoid ambiguity, explicit type information is required from the programmer.

The situation is actually a bit more subtle than the preceding discussion implies. The reason is that the type inference process makes use of the surrounding context of an expression to help resolve ambiguities. For example, if the expression fn x=>x+x occurs in the following, larger expression, there is in fact no ambiguity:

    (fn x => x+x)(3).

Since the function is applied to an integer argument, there is no question that the only possible resolution of the missing type information is to treat x as having type int, and hence to treat + as integer addition.

The important question is how much context is considered before the situation is considered ambiguous? The rule of thumb is that context is considered up to the nearest enclosing function declaration. For example, consider the following example:

    let
      val double = fn x => x+x
    in
      (double 3, double 4)
    end

The function expression fn x=>x+x will be flagged as ambiguous, even though its only uses are with integer arguments. The reason is that value bindings are considered to be “units” of type inference for which all ambiguity must be resolved before type checking continues. If your compiler adopts the integer interpretation as default, the above program will be accepted (with a warning), but the following one will be rejected:
    let
      val double = fn x => x+x
    in
      (double 3.0, double 4.0)
    end

Finally, note that the following program must be rejected because no resolution of the overloading of addition can render it meaningful:

    let
      val double = fn x => x+x
    in
      (double 3, double 3.0)
    end

The ambiguity must be resolved at the val binding, which means that the compiler must commit at that point to treating the addition operation as either integer or floating point. No single choice can be correct, since we subsequently use double at both types.

A closely related source of ambiguity arises from the “record elision” notation described in chapter 5. Consider the function #name, defined by

    fun #name {name=n:string, ...} = n

which selects the name field of a record. This definition is ambiguous because the compiler cannot uniquely determine the domain type of the function! Any of the following types are legitimate domain types for #name, none of which is “best”:

    {name:string}
    {name:string,salary:int}
    {name:string,salary:real}
    {name:string,address:string}

Of course there are infinitely many such examples, none of which is clearly preferable to the other. This function definition is therefore rejected as ambiguous by the compiler: there is no one interpretation of the function that suffices for all possible uses.

In chapter 5 we mentioned that functions such as #name are pre-defined by the ML compiler, yet we just now claimed that such a function definition is rejected as ambiguous. Isn’t this a contradiction? Not really,
because what happens is that each occurrence of #name is replaced by the function

    fn {name=n,...} => n

and then context is used to resolve the “local” ambiguity. This works well, provided that the complete record type of the arguments to #name can be determined from context. If not, the uses are rejected as ambiguous. Thus, the following expression is well-typed

    fn r : {name:string,address:string,salary:int} =>
      (#name r, #address r)

because the record type of r is explicitly given. If the type of r were omitted, the expression would be rejected as ambiguous (unless the context resolves the ambiguity).

8.4 Sample Code

Here is the code for this chapter.
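As a rough sketch, and not the distributed sample file itself, the main points of this chapter can be exercised with the declarations below. The names addNewline, fst, zero, zero', three, hello, doubleInt, doubleReal, and nameAndAddress are assumptions introduced here for illustration; the comments state the types the text leads us to expect.

```sml
(* Type inference: the annotation on s may be omitted. *)
val addNewline = fn s => s ^ "\n"            (* string -> string *)

(* The polymorphic identity, with its principal type scheme ascribed. *)
val I : 'a -> 'a = fn x => x
val zero  = I 0        (* I used at the instance int -> int *)
val zero' = I I 0      (* first I at (int->int)->(int->int), second at int->int *)

(* Two type variables are needed: the principal type scheme is 'a * 'b -> 'a. *)
val fst = fn (x, y) => x

(* Value restriction: val J = I I is not a value binding, but the
   fun form makes the right-hand side a lambda, so J is polymorphic. *)
fun J x = I I x
val three = J 3
val hello = J "hello"

(* Overloading resolved by explicit type information. *)
val doubleInt  = fn x:int  => x + x
val doubleReal = fn x:real => x + x

(* Record selectors are unambiguous once the full record type is given. *)
val nameAndAddress =
  fn r : {name:string, address:string, salary:int} => (#name r, #address r)
```

The use of J at both int and string is the point of the fun form: had J been declared by val J = I I, the two uses could not both be accepted.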
Chapter 9

Programming with Lists

9.1 List Primitives

In chapter 5 we noted that aggregate data structures are especially easy to handle in ML. In this chapter we consider another important aggregate type, the list type. In addition to being an important form of aggregate type it also illustrates two other general features of the ML type system:

1. Type constructors, or parameterized types. The type of a list reveals the type of its elements.
2. Recursive types. The set of values of a list type is given by an inductive definition.

Informally, the values of type typ list are the finite lists of values of type typ. More precisely, the values of type typ list are given by an inductive definition, as follows:

1. nil is a value of type typ list.
2. if h is a value of type typ, and t is a value of type typ list, then h::t is a value of type typ list.
3. Nothing else is a value of type typ list.

The type expression typ list is a postfix notation for the application of the type constructor list to the type typ. Thus list is a kind of “function” mapping types to types: given a type typ, we may apply list to it
to get another type, written typ list. The forms nil and :: are the value constructors of type typ list. The nullary (no argument) constructor nil may be thought of as the empty list. The binary (two argument) constructor :: constructs a non-empty list from a value h of type typ and another value t of type typ list; the resulting value, h::t, of type typ list, is pronounced “h cons t” (for historical reasons). We say that “h is cons’d onto t”, that h is the head of the list, and that t is its tail.

The definition of the values of type typ list given above is an example of an inductive definition. The type is said to be recursive because this definition is “self-referential” in the sense that the values of type typ list are defined in terms of (other) values of the same type. This is especially clear if we examine the types of the value constructors for the type typ list:

    val nil : typ list
    val (op ::) : typ * typ list -> typ list

The notation op :: is used to refer to the :: operator as a function, rather than to use it to form a list, which requires infix notation.

Two things are notable here:

1. The :: operation takes as its second argument a value of type typ list, and yields a result of type typ list. This self-referential aspect is characteristic of an inductive definition.
2. Both nil and op :: are polymorphic in the type of the underlying elements of the list. Thus nil is the empty list of type typ list for any element type typ, and op :: constructs a non-empty list independently of the type of the elements of that list.

It is easy to see that a value val of type typ list has the form

    val1 ::(val2 :: (· · · ::(valn ::nil)· · ·))

for some n ≥ 0, where vali is a value of type typ for each 1 ≤ i ≤ n. For according to the inductive definition of the values of type typ list, the value val must either be nil, which is of the above form, or val1 ::val', where val' is a value of type typ list. By induction val' has the form

    (val2 :: (· · · ::(valn ::nil)· · ·))
and hence val again has the specified form.

By convention the operator :: is right-associative, so we may omit the parentheses and just write

    val1 ::val2 ::· · ·::valn ::nil

as the general form of val of type typ list. This may be further abbreviated using list notation, writing

    [ val1 , val2 , ..., valn ]

for the same list. This notation emphasizes the interpretation of lists as finite sequences of values, but it obscures the fundamental inductive character of lists as being built up from nil using the :: operation.

9.2 Computing With Lists

How do we compute with values of list type? Since the values are defined inductively, it is natural that functions on lists be defined recursively, using a clausal definition that analyzes the structure of a list. Here’s a definition of the function length that computes the number of elements of a list:

    fun length nil = 0
      | length (_::t) = 1 + length t

The definition is given by induction on the structure of the list argument. The base case is the empty list, nil. The inductive step is the non-empty list _::t (notice that we do not need to give a name to the head). Its definition is given in terms of the tail of the list t, which is “smaller” than the list _::t.

The type of length is 'a list -> int; it is defined for lists of values of any type whatsoever.

We may define other functions following a similar pattern. Here’s the function to append two lists:

    fun append (nil, l) = l
      | append (h::t, l) = h :: append (t, l)

This function is built into ML; it is written using infix notation as exp1 @ exp2. The running time of append is proportional to the length of the first list, as should be obvious from its definition.

Here’s a function to reverse a list.
    fun rev nil = nil
      | rev (h::t) = rev t @ [h]

Its running time is O(n^2), where n is the length of the argument list. This can be demonstrated by writing down a recurrence that defines the running time T(n) on a list of length n.

    T(0)     = O(1)
    T(n + 1) = T(n) + O(n)

Solving the recurrence we obtain the result T(n) = O(n^2).

Can we do better? Oddly, we can take advantage of the non-associativity of :: to give a tail-recursive definition of rev.

    local
      fun helper (nil, a) = a
        | helper (h::t, a) = helper (t, h::a)
    in
      fun rev' l = helper (l, nil)
    end

The general idea of introducing an accumulator is the same as before, except that by re-ordering the applications of :: we reverse the list! The helper function reverses its first argument and prepends it to its second argument. That is, helper (l, a) evaluates to (rev l) @ a, where we assume here an independent definition of rev for the sake of the specification. Notice that helper runs in time proportional to the length of its first argument, and hence rev' runs in time proportional to the length of its argument.

The correctness of functions defined on lists may be established using the principle of structural induction. We illustrate this by establishing that the function helper satisfies the following specification: for every l and a of type typ list, helper (l, a) evaluates to the result of appending a to the reversal of l. That is, there are no pre-conditions on l and a, and we establish the post-condition that helper (l, a) yields (rev l) @ a.

The proof is by structural induction on the list l. If l is nil, then helper (l, a) evaluates to a, which fulfills the post-condition. If l is the list h::t,
then the application helper (l, a) reduces to the value of helper (t, (h::a)). By the inductive hypothesis this is just (rev t) @ (h :: a), which is equivalent to (rev t) @ [h] @ a. But this is just rev (h::t) @ a, which was to be shown.

The principle of structural induction may be summarized as follows. To show that a function works correctly for every list l, it suffices to show

1. The correctness of the function for the empty list, nil, and

2. The correctness of the function for h::t, assuming its correctness for t.

As with mathematical induction over the natural numbers, structural induction over lists allows us to focus on the basic and incremental behavior of a function to establish its correctness for all lists.
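The specification proved above can also be spot-checked on concrete lists. The following is only a sketch under stated assumptions: the name helperSpecHolds is hypothetical and is exported from the local block purely so the property helper (l, a) = (rev l) @ a can be tested; it is restricted to int lists for simplicity. Testing a few instances is of course no substitute for the inductive proof.

```sml
(* Quadratic-time reversal, used here only as the specification. *)
fun rev nil = nil
  | rev (h::t) = rev t @ [h]

local
  (* Accumulator-based helper from the text. *)
  fun helper (nil, a) = a
    | helper (h::t, a) = helper (t, h::a)
in
  fun rev' l = helper (l, nil)

  (* Hypothetical checker: does helper (l, a) equal (rev l) @ a? *)
  fun helperSpecHolds (l : int list, a : int list) =
    helper (l, a) = rev l @ a
end

val baseCase = helperSpecHolds ([], [9])            (* the nil case *)
val stepCase = helperSpecHolds ([1, 2, 3], [4, 5])  (* an h::t case *)
val reversed = rev' [1, 2, 3]
```

Both checks should evaluate to true, and reversed should be the list [3, 2, 1].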
9.3 Sample Code
Here is the code for this chapter.
Chapter 10 Concrete Data Types
10.1 Datatype Declarations
Lists are one example of the general notion of a recursive type. ML provides a general mechanism, the datatype declaration, for introducing programmer-defined recursive types. Earlier we introduced type declarations as an abbreviation mechanism. Types are given names as documentation and as a convenience to the programmer, but doing so is semantically inconsequential: one could replace all uses of the type name by its definition and not affect the behavior of the program. In contrast the datatype declaration provides a means of introducing a new type that is distinct from all other types and that does not merely stand for some other type. It is the means by which the ML type system may be extended by the programmer.

The datatype declaration in ML has a number of facets. A datatype declaration introduces

1. One or more new type constructors. The type constructors introduced may, or may not, be mutually recursive.

2. One or more new value constructors for each of the type constructors introduced by the declaration.

The type constructors may take zero or more arguments; a zero-argument, or nullary, type constructor is just a type. Each value constructor may also take zero or more arguments; a nullary value constructor is just a constant. The type and value constructors introduced by the declaration are “new” in the sense that they are distinct from all other type and value
constructors previously introduced; if a datatype redefines an “old” type or value constructor, then the old definition is shadowed by the new one, rendering the old one inaccessible in the scope of the new definition.
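The contrast between a type abbreviation and a datatype declaration can be seen in a small sketch. All of the names below (point, sumCoords, meters, feet, addMeters) are hypothetical and chosen only for illustration.

```sml
(* A type declaration is only an abbreviation: point and int * int
   are interchangeable. *)
type point = int * int
fun sumCoords ((x, y) : point) = x + y
val p : int * int = (3, 4)
val s = sumCoords p

(* Each datatype declaration introduces a genuinely new type, even when
   the underlying representation is the same. *)
datatype meters = Meters of int
datatype feet = Feet of int
fun addMeters (Meters a, Meters b) = Meters (a + b)
val total = addMeters (Meters 2, Meters 5)

(* addMeters (Meters 2, Feet 5) would be rejected by the type checker,
   because feet and meters are distinct types. *)
```

Here s should be 7 and total should be Meters 7.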
10.2 NonRecursive Datatypes
Here’s a simple example of a nullary type constructor with four nullary value constructors.

    datatype suit = Spades | Hearts | Diamonds | Clubs

This declaration introduces a new type suit with four nullary value constructors, Spades, Hearts, Diamonds, and Clubs. This declaration may be read as introducing a type suit such that a value of type suit is either Spades, or Hearts, or Diamonds, or Clubs. There is no significance to the ordering of the constructors in the declaration; we could just as well have written

    datatype suit = Hearts | Diamonds | Spades | Clubs

(or any other ordering, for that matter). It is conventional to capitalize the names of value constructors, but this is not required by the language.

Given the declaration of the type suit, we may define functions on it by case analysis on the value constructors using a clausal function definition. For example, we may define the suit ordering in the card game of bridge by the function

    fun outranks (Spades, Spades) = false
      | outranks (Spades, _) = true
      | outranks (Hearts, Spades) = false
      | outranks (Hearts, Hearts) = false
      | outranks (Hearts, _) = true
      | outranks (Diamonds, Clubs) = true
      | outranks (Diamonds, _) = false
      | outranks (Clubs, _) = false
This defines a function of type suit * suit -> bool that determines whether or not the first suit outranks the second.

Data types may be parameterized by a type. For example, the declaration

    datatype 'a option = NONE | SOME of 'a
introduces the unary type constructor 'a option with two value constructors, NONE, with no arguments, and SOME, with one. The values of type typ option are

1. The constant NONE, and

2. Values of the form SOME val, where val is a value of type typ.

For example, some values of type string option are NONE, SOME "abc", and SOME "def".

The option type constructor is pre-defined in Standard ML. One common use of option types is to handle functions with an optional argument. For example, here is a function to compute the base-b exponential function for natural number exponents that defaults to base 2:

    fun expt (NONE, n) = expt (SOME 2, n)
      | expt (SOME b, 0) = 1
      | expt (SOME b, n) =
          if n mod 2 = 0 then
            expt (SOME (b*b), n div 2)
          else
            b * expt (SOME b, n-1)

The advantage of the option type in this sort of situation is that it avoids the need to make a special case of a particular argument, e.g., using 0 as first argument to mean “use the default base”.

A related use of option types is in aggregate data structures. For example, an address book entry might have a record type with fields for various bits of data about a person. But not all data is relevant to all people. For example, someone may not have a spouse, but they all have a name. For this we might use a type definition of the form

    type entry = { name:string, spouse:string option }

so that one would create an entry for an unmarried person with a spouse field of NONE.

Option types may also be used to represent an optional result. For example, we may wish to define a function reciprocal that returns the reciprocal of an integer, if it has one, and otherwise indicates that it has
no reciprocal. This is achieved by defining reciprocal to have type int -> int option as follows:

    fun reciprocal 0 = NONE
      | reciprocal n = SOME (1 div n)

To use the result of a call to reciprocal we must perform a case analysis of the form

    case (reciprocal exp) of
        NONE => exp1
      | SOME r => exp2

where exp1 covers the case that exp has no reciprocal, and exp2 covers the case that exp has reciprocal r.
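As a concrete instance of this case analysis, here is a small sketch in which exp1 and exp2 are replaced by strings. The function name describe and the message texts are hypothetical; Int.toString is the Basis Library conversion from int to string.

```sml
fun reciprocal 0 = NONE
  | reciprocal n = SOME (1 div n)

(* Hypothetical client: every use of the result must handle both cases. *)
fun describe n =
  case reciprocal n of
      NONE => "no reciprocal"
    | SOME r => "reciprocal is " ^ Int.toString r

val d0 = describe 0
val d1 = describe 1
```

Here d0 should be "no reciprocal" and d1 should be "reciprocal is 1". Omitting the NONE branch would draw a non-exhaustive match warning from the compiler.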
10.3 Recursive Datatypes
The next level of generality is the recursive type definition. For example, one may define a type typ tree of binary trees with values of type typ at the nodes using the following declaration:

    datatype 'a tree =
        Empty
      | Node of 'a tree * 'a * 'a tree

This declaration corresponds to the informal definition of binary trees with values of type typ at the nodes:

1. The empty tree Empty is a binary tree.

2. If tree1 and tree2 are binary trees, and val is a value of type typ, then Node (tree1, val, tree2) is a binary tree.

3. Nothing else is a binary tree.

The distinguishing feature of this definition is that it is recursive in the sense that binary trees are constructed out of other binary trees, with the empty tree serving as the base case.
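To make the inductive reading of the declaration concrete, the following sketch builds a small int tree bottom-up from Empty. The helper names leaf and inorder are hypothetical and are introduced here only to construct and inspect the example.

```sml
datatype 'a tree =
    Empty
  | Node of 'a tree * 'a * 'a tree

(* Hypothetical helper: a node whose two children are both empty. *)
fun leaf x = Node (Empty, x, Empty)

(* Built by rule 2 from smaller trees, which bottom out in Empty (rule 1). *)
val t = Node (leaf 1, 2, Node (leaf 3, 4, Empty))

(* Hypothetical helper: list the items left subtree, node, right subtree. *)
fun inorder Empty = nil
  | inorder (Node (l, x, r)) = inorder l @ (x :: inorder r)

val items = inorder t
```

For this tree, items should be [1, 2, 3, 4].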
(Incidentally, a leaf in a binary tree is here represented as a node both of whose children are the empty tree. This definition of binary trees is analogous to starting the natural numbers with zero, rather than one. One can think of the children of a node in a binary tree as the “predecessors” of that node, the only difference compared to the usual definition of predecessor being that a node has two, rather than one, predecessors.)

To compute with a recursive type, use a recursive function. For example, here is the function to compute the height of a binary tree:

    fun height Empty = 0
      | height (Node (lft, _, rht)) = 1 + Int.max (height lft, height rht)

Notice that height is called recursively on the children of a node, and is defined outright on the empty tree. This pattern of definition is another instance of structural induction (on the tree type). The function height is said to be defined by induction on the structure of a tree. The general idea is to define the function directly for the base cases of the recursive type (i.e., value constructors with no arguments or whose arguments do not involve values of the type being defined), and to define it for non-base cases in terms of its definitions for the constituent values of that type. We will see numerous examples of this as we go along.

Here’s another example. The size of a binary tree is the number of nodes occurring in it. Here’s a straightforward definition in ML:

    fun size Empty = 0
      | size (Node (lft, _, rht)) = 1 + size lft + size rht

The function size is defined by structural induction on trees.

A word of warning. One reason to capitalize value constructors is to avoid a pitfall in the ML syntax that we mentioned in chapter 2. Suppose we gave the following definition of size:

    fun size empty = 0
      | size (Node (lft, _, rht)) = 1 + size lft + size rht

The compiler will warn us that the second clause of the definition is redundant! Why? Because empty, spelled with a lower-case “e”, is a variable, not a constructor, and hence matches any tree whatsoever. Consequently the second clause never applies. By capitalizing constructors we can hope to make mistakes such as these more evident, but in practice you are bound to run into this sort of mistake.

The tree data type is appropriate for binary trees: those for which each node has exactly two children. (Of course, either or both children might be the empty tree, so we may consider this to define the type of trees with at most two children; it’s a matter of terminology which interpretation you prefer.) It should be obvious how to define the type of ternary trees, whose nodes have at most three children, and so on for other fixed arities. But what if we wished to define a type of trees with a variable number of children? In a so-called variadic tree some nodes might have three children, some might have two, and so on. This can be achieved in at least two ways. One way combines lists and trees, as follows:

    datatype 'a tree =
        Empty
      | Node of 'a * 'a tree list

Each node has a list of children, so that distinct nodes may have different numbers of children. Notice that the empty tree is distinct from the tree with one node and no children because there is no data associated with the empty tree, whereas there is a value of type 'a at each node.

Another approach is to simultaneously define trees and “forests”. A variadic tree is either empty, or a node gathering a “forest” to form a tree; a forest is either empty or a variadic tree together with another forest. This leads to the following definition:

    datatype 'a tree =
        Empty
      | Node of 'a * 'a forest
    and 'a forest =
        None
      | Tree of 'a tree * 'a forest

This example illustrates the introduction of two mutually recursive datatypes.

Mutually recursive datatypes beget mutually recursive functions. Here’s a definition of the size (number of nodes) of a variadic tree:
    fun size_tree Empty = 0
      | size_tree (Node (_, f)) = 1 + size_forest f
    and size_forest None = 0
      | size_forest (Tree (t, f')) = size_tree t + size_forest f'

Notice that we define the size of a tree in terms of the size of a forest, and vice versa, just as the type of trees is defined in terms of the type of forests.

Many other variations are possible. Suppose we wish to define a notion of binary tree in which data items are associated with branches, rather than nodes. Here’s a datatype declaration for such trees:

    datatype 'a tree =
        Empty
      | Node of 'a branch * 'a branch
    and 'a branch =
        Branch of 'a * 'a tree

In contrast to our first definition of binary trees, in which the branches from a node to its children were implicit, we now make the branches themselves explicit, since data is attached to them. For example, we can collect into a list the data items labelling the branches of such a tree using the following code:

    fun collect Empty = nil
      | collect (Node (Branch (ld, lt), Branch (rd, rt))) =
          ld :: rd :: (collect lt) @ (collect rt)

10.4 Heterogeneous Data Structures

Returning to the original definition of binary trees (with data items at the nodes), observe that the type of the data items at the nodes must be the same for every node of the tree. For example, a value of type int tree has an integer at every node, and a value of type string tree has a string at every node. Therefore an expression such as

    Node (Empty, 43, Node (Empty, "43", Empty))

is ill-typed. The type system insists that trees be homogeneous in the sense that the type of the data items is the same at every node.
It is quite rare to encounter heterogeneous data structures in real programs. For example, a dictionary with strings as keys might be represented as a binary search tree with strings at the nodes; there is no need for heterogeneity to represent such a data structure. But occasionally one might wish to work with a heterogeneous tree, whose data values at each node are of different types. How would one represent such a thing in ML?

To discover the answer, first think about how one might manipulate such a data structure. When accessing a node, we would need to check at run-time whether the data item is an integer or a string; otherwise we would not know whether to, say, add 1 to it, or concatenate "1" to the end of it. This suggests that the data item must be labelled with sufficient information so that we may determine the type of the item at run-time. We must also be able to recover the underlying data item itself so that familiar operations (such as addition or string concatenation) may be applied to it.

The required labelling and discrimination is neatly achieved using a datatype declaration. Suppose we wish to represent the type of integer-or-string trees. First, we define the type of values to be integers or strings, marked with a constructor indicating which:

    datatype int_or_string =
        Int of int
      | String of string

Then we define the type of interest as follows:

    type int_or_string_tree = int_or_string tree

Voila! Perfectly natural and easy: heterogeneity is really a special case of homogeneity!

10.5 Abstract Syntax

Datatype declarations and pattern matching are extremely useful for defining and manipulating the abstract syntax of a language. For example, we may define a small language of arithmetic expressions using the following declaration:
    datatype expr =
        Numeral of int
      | Plus of expr * expr
      | Times of expr * expr

This definition has only three clauses, but one could readily imagine adding others. Here is the definition of a function to evaluate expressions of the language of arithmetic expressions written using pattern matching:

    fun eval (Numeral n) = Numeral n
      | eval (Plus (e1, e2)) =
          let
            val Numeral n1 = eval e1
            val Numeral n2 = eval e2
          in
            Numeral (n1+n2)
          end
      | eval (Times (e1, e2)) =
          let
            val Numeral n1 = eval e1
            val Numeral n2 = eval e2
          in
            Numeral (n1*n2)
          end

The combination of datatype declarations and pattern matching contributes enormously to the readability of programs written in ML. A less obvious, but more important, benefit is the error checking that the compiler can perform for you if you use these mechanisms in tandem. As an example, suppose that we extend the type expr with a new component for the reciprocal of a number, yielding the following revised definition:

    datatype expr =
        Numeral of int
      | Plus of expr * expr
      | Times of expr * expr
      | Recip of expr

First, observe that the “old” definition of eval is no longer applicable to values of type expr! For example, the expression
    eval (Plus (Numeral 1, Numeral 2))

is ill-typed, even though it doesn’t use the Recip constructor. The reason is that the re-declaration of expr introduces a “new” type that just happens to have the same name as the “old” type, but is in fact distinct from it. This is a boon because it reminds us to recompile the old code relative to the new definition of the expr type.

Second, upon recompiling the definition of eval we encounter an inexhaustive match warning: the old code no longer applies to every value of type expr according to its new definition! We are of course lacking a case for Recip, which we may provide as follows:

    fun eval (Numeral n) = Numeral n
      | eval (Plus (e1, e2)) = ... as before ...
      | eval (Times (e1, e2)) = ... as before ...
      | eval (Recip e) =
          let val Numeral n = eval e in Numeral (1 div n) end

The value of the checks provided by the compiler in such cases cannot be overestimated. When recompiling a large program after making a change to a datatype declaration the compiler will automatically point out every line of code that must be changed to conform to the new definition; it is impossible to forget to attend to even a single case. This is a tremendous help to the developer, especially if she is not the original author of the code being modified, and is another reason why the static type discipline of ML is a positive benefit, rather than a hindrance, to programmers.

10.6 Sample Code

Here is the code for this chapter.
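For reference, here is one way the extended evaluator of section 10.5 might be written out in full. This is a sketch, not the chapter's own sample code: it swaps the val Numeral n = ... bindings used in the text for a hypothetical helper named value, so that the one place where a non-numeral result could arise is explicit. Fail is the standard top-level exception carrying a string.

```sml
datatype expr =
    Numeral of int
  | Plus of expr * expr
  | Times of expr * expr
  | Recip of expr

(* Hypothetical helper: extract the integer from an evaluated expression. *)
fun value (Numeral n) = n
  | value _ = raise Fail "expected a numeral"

(* One clause per constructor; deleting any clause draws a
   non-exhaustive match warning. *)
fun eval (Numeral n) = Numeral n
  | eval (Plus (e1, e2)) = Numeral (value (eval e1) + value (eval e2))
  | eval (Times (e1, e2)) = Numeral (value (eval e1) * value (eval e2))
  | eval (Recip e) = Numeral (1 div value (eval e))

val r1 = eval (Plus (Numeral 1, Numeral 2))
val r2 = eval (Times (Numeral 2, Recip (Numeral 1)))
```

Here r1 should be Numeral 3 and r2 should be Numeral 2. As in the text, Recip of an expression that evaluates to zero raises Div.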
Chapter 11 Higher-Order Functions

11.1 Functions as Values

Values of function type are first-class, which means that they have the same rights and privileges as values of any other type. In particular, functions may be passed as arguments and returned as results of other functions, and functions may be stored in and retrieved from data structures such as lists and trees. We will see that first-class functions are an important source of expressive power in ML.

Functions which take functions as arguments or yield functions as results are known as higher-order functions (or, less often, as functionals or operators). Higher-order functions arise frequently in mathematics. For example, the differential operator is the higher-order function that, when given a (differentiable) function on the real line, yields its first derivative as a function on the real line. We also encounter functionals mapping functions to real numbers, and real numbers to functions. An example of the former is provided by the definite integral viewed as a function of its integrand, and an example of the latter is the definite integral of a given function on the interval [a, x], viewed as a function of a, that yields the area under the curve from a to x as a function of x.

Higher-order functions are less familiar tools for many programmers since the best-known programming languages have only rudimentary mechanisms to support their use. In contrast higher-order functions play a prominent role in ML, with a variety of interesting applications. Their use may be classified into two broad categories:
1. Abstracting patterns of control. Higher-order functions are design patterns that “abstract out” the details of a computation to lay bare the skeleton of the solution. The skeleton may be fleshed out to form a solution of a problem by applying the general pattern to arguments that isolate the specific problem instance.

2. Staging computation. It arises frequently that computation may be staged by expending additional effort “early” to simplify the computation of “later” results. Staging can be used both to improve efficiency and, as we will see later, to control sharing of computational resources.

11.2 Binding and Scope

Before discussing these programming techniques, we will review the critically important concept of scope as it applies to function definitions. Recall that Standard ML is a statically scoped language, meaning that identifiers are resolved according to the static structure of the program. A use of the variable var is considered to be a reference to the nearest lexically enclosing declaration of var. We say “nearest” because of the possibility of shadowing; if we re-declare a variable var, then subsequent uses of var refer to the “most recent” (lexically!) declaration of it; any “previous” declarations are temporarily shadowed by the latest one.

This principle is easy to apply when considering sequences of declarations. For example, it should be clear by now that the variable y is bound to 32 after processing the following sequence of declarations:

    val x = 2       (* x=2 *)
    val y = x*x     (* y=4 *)
    val x = y*x     (* x=8 *)
    val y = x*y     (* y=32 *)

In the presence of function definitions the situation is the same, but it can be a bit tricky to understand at first. Here’s an example to test your grasp of the lexical scoping principle:
    val x = 2
    fun f y = x+y
    val x = 3
    val z = f 4

After processing these declarations the variable z is bound to 6, not to 7! The reason is that the occurrence of x in the body of f refers to the first declaration of x since it is the nearest lexically enclosing declaration of the occurrence, even though it has been subsequently re-declared.

This example illustrates three important points:

1. Binding is not assignment! If we were to view the second binding of x as an assignment statement, then the value of z would be 7, not 6.

2. Scope resolution is lexical, not temporal. We sometimes refer to the “most recent” declaration of a variable, which has a temporal flavor, but we always mean “nearest lexically enclosing at the point of occurrence”.

3. “Shadowed” bindings are not lost. The “old” binding for x is still available (through calls to f), even though a more recent binding has shadowed it.

One way to understand what’s going on here is through the concept of a closure, a technique for implementing higher-order functions. When a function expression is evaluated, a copy of the environment is attached to the function. Subsequently, all free variables of the function (i.e., those variables not occurring as parameters) are resolved with respect to the environment attached to the function; the function is therefore said to be “closed” with respect to the attached environment. This is achieved at function application time by “swapping” the attached environment of the function for the environment active at the point of the call. The swapped environment is restored after the call is complete.

Returning to the example above, the environment associated with the function f contains the declaration val x = 2 to record the fact that at the time the function was evaluated, the variable x was bound to the value 2. The variable x is subsequently re-bound to 3, but when f is applied, we temporarily reinstate the binding of x to 2, add a binding of y to 4, then evaluate the body of the function, yielding 6. We then restore the binding of x and drop the binding of y before yielding the result.
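The closure reading of lexical scope can be checked directly. The sketch below repeats the example from the text and adds a second, hypothetical function g whose free variable hidden is not in scope at all at the point of the call, so the value it uses can only come from the environment attached when the function expression was evaluated.

```sml
val x = 2
fun f y = x + y     (* this x refers to the binding x = 2 *)
val x = 3           (* shadows, but does not change, the earlier x *)
val z = f 4

(* hidden is local to the let; only the closure for g can see it. *)
val g =
  let
    val hidden = 10
  in
    fn y => hidden + y
  end

val w = g 5
```

Here z should be 6 and w should be 15, while x itself is 3.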
11.3 Returning Functions

While seemingly very simple, the principle of lexical scope is the source of considerable expressive power. We’ll demonstrate this through a series of examples.

To warm up let’s consider some simple examples of passing functions as arguments and yielding functions as results. The standard example of passing a function as argument is the map' function, which applies a given function to every element of a list. It is defined as follows:

    fun map' (f, nil) = nil
      | map' (f, h::t) = (f h) :: map' (f, t)

For example, the application

    map' (fn x => x+1, [1,2,3,4])

evaluates to the list [2,3,4,5].

Functions may also yield functions as results. What is surprising is that we can create new functions during execution, not just return functions that have been previously defined. The most basic (and deceptively simple) example is the function constantly that creates constant functions: given a value k, the application constantly k yields a function that yields k whenever it is applied. Here’s a definition of constantly:

    val constantly = fn k => (fn a => k)

The function constantly has type 'a -> ('b -> 'a). We used the fn notation for clarity, but the declaration of the function constantly may also be written using fun notation as follows:

    fun constantly k a = k

Note well that a white space separates the two successive arguments to constantly! The meaning of this declaration is precisely the same as the earlier definition using fn notation.

The value of the application constantly 3 is the function that is constantly 3; i.e., it always yields 3 when applied. Yet nowhere have we defined the function that always yields 3. The resulting function is “created” by the application of constantly to the argument 3, rather than merely
“retrieved” off the shelf of previously-defined functions. In implementation terms the result of the application constantly 3 is a closure consisting of the function fn a => k with the environment val k = 3 attached to it. The closure is a data structure (a pair) that is created by each application of constantly to an argument; the closure is the representation of the “new” function yielded by the application. Notice, however, that the only difference between any two results of applying the function constantly lies in the attached environment; the underlying function is always fn a => k. If we think of the lambda as the “executable code” of the function, then this amounts to the observation that no new code is created at run-time, just new instances of existing code.

This also points out why functions in ML are not the same as code pointers in C. You may be familiar with the idea of passing a pointer to a C function to another C function as a means of passing functions as arguments or yielding functions as results. This may be considered to be a form of “higher-order” function in C, but it must be emphasized that code pointers are significantly less powerful than closures because in C there are only statically many possibilities for a code pointer (it must point to one of the functions defined in your code), whereas in ML we may generate dynamically many different instances of a function, differing in the bindings of the variables in its environment. The non-varying part of the closure, the code, is directly analogous to a function pointer in C, but there is no counterpart in C of the varying part of the closure, the dynamic environment.

The definition of the function map' given above takes a function and list as arguments, yielding a new list as result. Often it occurs that we wish to map the same function across several different lists. It is inconvenient (and a tad inefficient) to keep passing the same function to map', with the list argument varying each time. Instead we would prefer to create an instance of map specialized to the given function that can then be applied to many different lists. This leads to the following definition of the function map:

    fun map f nil = nil
      | map f (h::t) = (f h) :: (map f t)

The function map so defined has type ('a->'b) -> 'a list -> 'b list. It takes a function of type 'a -> 'b as argument, and yields another function of type 'a list -> 'b list as result.
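A short sketch shows the intended use: map is applied once to a function, and the resulting specialized function is then applied to several lists. The names incrementAll and zeros are hypothetical; constantly is the function defined earlier in this section.

```sml
fun map f nil = nil
  | map f (h::t) = (f h) :: (map f t)

fun constantly k a = k

(* Specialize map once; reuse the result on different lists. *)
val incrementAll = map (fn x => x + 1)
val a = incrementAll [1, 2, 3]
val b = incrementAll [10, 20]

(* A function created at run-time, passed straight to map. *)
val zeros = map (constantly 0) ["a", "b", "c"]
```

Here a should be [2, 3, 4], b should be [11, 21], and zeros should be [0, 0, 0].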
The passage from map' to map is called currying. We have changed a two-argument function (more properly, a function taking a pair as argument) into a function that takes two arguments in succession, yielding after the first a function that takes the second as its sole argument. This passage can be codified as follows:

    fun curry f x y = f (x, y)

The type of curry is

    ('a*'b->'c) -> ('a -> ('b -> 'c)).

Given a two-argument function, curry returns another function that, when applied to the first argument, yields a function that, when applied to the second, applies the original two-argument function to the first and second arguments, given separately.

Observe that map may be alternately defined by the binding

    fun map f l = curry map' f l

Applications are implicitly left-associated, so that this definition is equivalent to the more verbose declaration

    fun map f l = ((curry map') f) l

11.4 Patterns of Control

We turn now to the idea of abstracting patterns of control. There is an obvious similarity between the following two functions, one to add up the numbers in a list, the other to multiply them.

    fun add_up nil = 0
      | add_up (h::t) = h + add_up t

    fun mul_up nil = 1
      | mul_up (h::t) = h * mul_up t

What precisely is the similarity? We will look at it from two points of view.

One view is that in each case we have a binary operation and a unit element for it. The result on the empty list is the unit element, and the result on a non-empty list is the operation applied to the head of the list and the result on the tail. This pattern can be abstracted as the function reduce defined as follows:
    fun reduce (unit, opn, nil) = unit
      | reduce (unit, opn, h::t) = opn (h, reduce (unit, opn, t))

Here is the type of reduce:

    val reduce : 'b * ('a*'b->'b) * 'a list -> 'b

The first argument is the unit element, the second is the operation, and the third is the list of values. Notice that the type of the operation admits the possibility of the first argument having a different type from the second argument and result.

Using reduce, we may re-define add_up and mul_up as follows:

    fun add_up l = reduce (0, op +, l)
    fun mul_up l = reduce (1, op *, l)

To further check your understanding, consider the following declaration:

    fun mystery l = reduce (nil, op ::, l)

(Recall that “op ::” is the function of type 'a * 'a list -> 'a list that adds a given value to the front of a list.) What function does mystery compute?

Another view of the commonality between add_up and mul_up is that they are both defined by induction on the structure of the list argument, with a base case for nil, and an inductive case for h::t, defined in terms of its behavior on t. But this is really just another way of saying that they are defined in terms of a unit element and a binary operation! The difference is one of perspective: whether we focus on the pattern part of the clauses (the inductive decomposition) or the result part of the clauses (the unit and operation). The recursive structure of add_up and mul_up is abstracted by the reduce functional, which is then specialized to yield add_up and mul_up. Said another way, the function reduce abstracts the pattern of defining a function by induction on the structure of a list.

The definition of reduce leaves something to be desired. One thing to notice is that the arguments unit and opn are carried unchanged through the recursion; only the list parameter changes on recursive calls. While this might seem like a minor overhead, it’s important to remember that
multi-argument functions are really single-argument functions that take a tuple as argument. This means that each time around the loop we are constructing a new tuple whose first and second components remain fixed, but whose third component varies. Is there a better way? Here’s another definition that isolates the “inner loop” as an auxiliary function:

    fun better_reduce (unit, opn, l) =
      let
        fun red nil = unit
          | red (h::t) = opn (h, red t)
      in
        red l
      end

Notice that each call to better_reduce creates a new function red that uses the parameters unit and opn of the call to better_reduce. This means that red is bound to a closure consisting of the code for the function together with the environment active at the point of definition, which will provide bindings for unit and opn arising from the application of better_reduce to its arguments. Furthermore, the recursive calls to red no longer carry bindings for unit and opn, saving the overhead of creating tuples on each iteration of the loop.

11.5 Staging

An interesting variation on reduce may be obtained by staging the computation. The motivation is that unit and opn often remain fixed for many different lists (e.g., we may wish to sum the elements of many different lists). In this case unit and opn are said to be “early” arguments and the list is said to be a “late” argument. The idea of staging is to perform as much computation as possible on the basis of the early arguments, yielding a function of the late arguments alone. In the case of the function reduce this amounts to building red on the basis of unit and opn, yielding it as a function that may be later applied to many different lists. Here’s the code:
    fun staged_reduce (unit, opn) =
      let
        fun red nil = unit
          | red (h::t) = opn (h, red t)
      in
        red
      end

The definition of staged_reduce bears a close resemblance to the definition of better_reduce; the only difference is that the creation of the closure bound to red occurs as soon as unit and opn are known, rather than each time the list argument is supplied. Thus the overhead of closure creation is “factored out” of multiple applications of the resulting function to list arguments.

We could just as well have replaced the body of the let expression with the function

    fn l => red l

but a moment’s thought reveals that the meaning is the same.

Note well that we would not obtain the effect of staging were we to use the following definition:

    fun curried_reduce (unit, opn) nil = unit
      | curried_reduce (unit, opn) (h::t) = opn (h, curried_reduce (unit, opn) t)

If we unravel the fun notation, we see that while we are taking two arguments in succession, we are not doing any useful work in between the arrival of the first argument (a pair) and the second (a list). A curried function does not take significant advantage of staging. Since staged_reduce and curried_reduce have the same iterated function type, namely

    ('b * ('a * 'b -> 'b)) -> 'a list -> 'b

the contrast between these two examples may be summarized by saying not every function of iterated function type is curried. Some are, and some aren’t. The “interesting” examples (such as staged_reduce) are the ones that aren’t curried. (This directly contradicts established terminology, but it is necessary to deviate from standard practice to avoid a serious misapprehension.)
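To make the contrast concrete, here is a minimal sketch that applies both definitions. The names sum, product, six, twentyFour and alsoSix are illustrative only and do not appear in the text; the two reduce functions are repeated so that the fragment stands on its own.

```sml
(* Repeated from the text so that the sketch is self-contained. *)
fun staged_reduce (unit, opn) =
  let
    fun red nil = unit
      | red (h::t) = opn (h, red t)
  in
    red
  end

fun curried_reduce (unit, opn) nil = unit
  | curried_reduce (unit, opn) (h::t) = opn (h, curried_reduce (unit, opn) t)

(* Early arguments are supplied once; the closure for red is built here. *)
val sum = staged_reduce (0, op +)
val product = staged_reduce (1, op * )

(* Late arguments: each application reuses the closure built above. *)
val six = sum [1, 2, 3]
val twentyFour = product [1, 2, 3, 4]

(* Same type and same answer, but no work is done before the list arrives. *)
val alsoSix = curried_reduce (0, op +) [1, 2, 3]
```

Both functions compute the same results; the difference lies only in when the closure for the inner loop is created.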
The time saved by staging the computation in the definition of staged_reduce is admittedly minor. But consider the following definition of an append function for lists that takes both arguments at once:

    fun append (nil, l) = l
      | append (h::t, l) = h :: append(t,l)

Suppose that we will have occasion to append many lists to the end of a given list. What we’d like is to build a specialized appender for the first list that, when applied to a second list, appends the second to the end of the first. Here’s a naive solution that merely curries append:

    fun curried_append nil l = l
      | curried_append (h::t) l = h :: curried_append t l

Unfortunately this solution doesn’t exploit the fact that the first argument is fixed for many second arguments. In particular, each application of the result of applying curried_append to a list results in the first list being traversed so that the second can be appended to it.

We can improve on this by staging the computation as follows:

    fun staged_append nil = (fn l => l)
      | staged_append (h::t) =
          let
            val tail_appender = staged_append t
          in
            fn l => h :: tail_appender l
          end

Notice that the first list is traversed once for all applications to a second argument. When applied to a list [v1,...,vn], the function staged_append yields a function that is equivalent to, but not quite as efficient as, the function

    fn l => v1 :: v2 :: ... :: vn :: l.

This still takes time proportional to n, but a substantial savings accrues from avoiding the pattern matching required to destructure the original list argument on each call.
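As a small illustration of the staged appender in use, the sketch below builds one specialized appender and applies it to several second arguments. The names prefix, a, b and c are hypothetical and chosen only for this example.

```sml
(* Repeated from the text so that the sketch is self-contained. *)
fun staged_append nil = (fn l => l)
  | staged_append (h::t) =
      let
        val tail_appender = staged_append t
      in
        fn l => h :: tail_appender l
      end

(* The first list is traversed here, once. *)
val prefix = staged_append [1, 2, 3]

(* Each application only runs the chain of closures built above. *)
val a = prefix [4, 5]
val b = prefix nil
val c = prefix [0]
```

Here a is [1, 2, 3, 4, 5], b is [1, 2, 3], and c is [1, 2, 3, 0]; the pattern matching on [1, 2, 3] happened only when prefix was defined.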
11.6 Sample Code

Here is the code for this chapter.
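The original sample code is not reproduced in this text. As a stand-in, the following is a minimal sketch of curry and reduce as defined in this chapter, with a few uses. The function total_length and the values add3 and ten are assumptions introduced for illustration; String.size is the Basis Library function returning the length of a string.

```sml
fun curry f x y = f (x, y)

fun reduce (unit, opn, nil) = unit
  | reduce (unit, opn, h::t) = opn (h, reduce (unit, opn, t))

fun add_up l = reduce (0, op +, l)
fun mul_up l = reduce (1, op *, l)

(* The operation's first argument may differ in type from its result:
   here the list elements are strings and the result is an int. *)
fun total_length l = reduce (0, fn (s, n) => String.size s + n, l)

(* curry turns the pair-taking op + into a function that takes
   its two arguments in succession. *)
val add3 = curry (op +) 3
val ten = add3 7
```

The total_length example exercises the remark that the type of reduce lets the element type 'a differ from the result type 'b.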
Chapter 12
Exceptions

In chapter 2 we mentioned that expressions in Standard ML always have a type, may have a value, and may have an effect. So far we’ve concentrated on typing and evaluation. In this chapter we will introduce the concept of an effect. While it’s hard to give a precise general definition of what we mean by an effect, the idea is that an effect is any action resulting from evaluation of an expression other than returning a value. From this point of view we might consider non-termination to be an effect, but we don’t usually think of failure to terminate as a positive “action” in its own right, rather as a failure to take any action.

The main examples of effects in ML are these:

1. Exceptions. Evaluation may be aborted by signaling an exceptional condition.
2. Mutation. Storage may be allocated and modified during evaluation.
3. Input/output. It is possible to read from an input source and write to an output sink during evaluation.
4. Communication. Data may be sent to and received from communication channels.

This chapter is concerned with exceptions; the other forms of effects will be considered later.
12.1 Exceptions as Errors

ML is a safe language in the sense that its execution behavior may be understood entirely in terms of the constructs of the language itself. Behavior such as “dumping core” or incurring a “bus error” are extra-linguistic notions that may only be explained by appeal to the underlying implementation of the language. These cannot arise in ML. This is ensured by a combination of a static type discipline, which rules out expressions that are manifestly ill-defined (e.g., adding a string to an integer or casting an integer as a function), and by dynamic checks that rule out violations that cannot be detected statically (e.g., division by zero or arithmetic overflow). Static violations are signalled by type checking errors; dynamic violations are signalled by raising exceptions.

12.1.1 Primitive Exceptions

The expression 3 + "3" is ill-typed, and hence cannot be evaluated. In contrast the expression 3 div 0 is well-typed (with type int), but incurs a run-time fault that is signalled by raising the exception Div. We will indicate this by writing

    3 div 0 ⇓ raise Div

An exception is a form of “answer” to the question “what is the value of this expression?”. In most implementations an exception such as this is reported by an error message of the form “Uncaught exception Div”, together with the line number (or some other indication) of the point in the program where the exception occurred.

Exceptions have names so that we may distinguish different sources of error in a program. For example, evaluation of the expression maxint * maxint (where maxint is the largest representable integer) causes the exception Overflow to be raised, indicating that an arithmetic overflow error arose in the attempt to carry out the multiplication. This is usefully distinguished from the exception Div, corresponding to division by zero.

(You may be wondering about the overhead of checking for arithmetic faults. The compiler must generate instructions that ensure that an overflow fault is caught before any subsequent operations are performed. This can be quite expensive on pipelined processors, which sacrifice precise delivery of arithmetic faults in the interest of speeding up execution in the
non-faulting case. Unfortunately it is necessary to incur this overhead if we are to avoid having the behavior of an ML program depend on the underlying processor on which it is implemented.)

Another source of run-time exceptions is an inexhaustive match. Suppose we define the function hd as follows

    fun hd (h::_) = h

This definition is inexhaustive since it makes no provision for the possibility of the argument being nil. What happens if we apply hd to nil? The exception Match is raised, indicating the failure of the pattern-matching process:

    hd nil ⇓ raise Match

The occurrence of a Match exception at run-time is indicative of a violation of a pre-condition to the invocation of a function somewhere in the program. Recall that it is often sensible for a function to be inexhaustive, provided that we take care to ensure that it is never applied to a value outside of its domain. Should this occur (because of a programming mistake, evidently), the result is nevertheless well-defined because ML checks for, and signals, pattern match failure. That is, ML programs are implicitly “bullet-proofed” against failures of pattern matching. The flip side is that if no inexhaustive match warnings arise during type checking, then the exception Match can never be raised during evaluation (and hence no run-time checking need be performed).

A related situation is the use of a pattern in a val binding to destructure a value. If the pattern can fail to match a value of this type, then a Bind exception is raised at run-time. For example, evaluation of the binding

    val h::_ = nil

raises the exception Bind since the pattern h::_ does not match the value nil. Here again observe that a Bind exception cannot arise unless the compiler has previously warned us of the possibility: no warning, no Bind exception.

12.1.2 User-Defined Exceptions

So far we have considered examples of pre-defined exceptions that indicate fatal error conditions. Since the built-in exceptions have a built-in meaning, it is generally inadvisable to use these to signal program-specific error conditions. Instead we introduce a new exception using an exception declaration, and signal it using a raise expression when a run-time violation occurs. That way we can associate specific exceptions with specific pieces of code, easing the process of tracking down the source of the error.

Suppose that we wish to define a “checked factorial” function that ensures that its argument is non-negative. Here’s a first attempt at defining such a function:

    exception Factorial

    fun checked_factorial n =
      if n < 0 then
        raise Factorial
      else if n=0 then
        1
      else
        n * checked_factorial (n-1)

The declaration exception Factorial introduces an exception Factorial, which we raise in the case that checked_factorial is applied to a negative number.

The definition of checked_factorial is unsatisfactory in at least two respects. One, relatively minor, issue is that it does not make effective use of pattern matching, but instead relies on explicit comparison operations. To some extent this is unavoidable since we wish to check explicitly for negative arguments, which cannot be done using a pattern. A more significant problem is that checked_factorial repeatedly checks the validity of its argument on each recursive call, even though we can prove that if the initial argument is non-negative, then so must be the argument on each recursive call. This fact is not reflected in the code. We can improve the definition by introducing an auxiliary function:

    exception Factorial

    local
      fun fact 0 = 1
        | fact n = n * fact (n-1)
    in
      fun checked_factorial n =
        if n >= 0 then
          fact n
        else
          raise Factorial
    end

Notice that we perform the range check exactly once, and that the auxiliary function makes effective use of pattern-matching.

12.2 Exception Handlers

The use of exceptions to signal error conditions suggests that raising an exception is fatal: execution of the program terminates with the raised exception. But signaling an error is only one use of the exception mechanism. More generally, exceptions can be used to effect non-local transfers of control. By using an exception handler we may “catch” a raised exception and continue evaluation along some other path. A very simple example is provided by the following driver for the factorial function that accepts numbers from the keyboard, computes their factorial, and prints the result.

    fun factorial_driver () =
      let
        val input = read_integer ()
        val result = toString (checked_factorial input)
      in
        print result
      end
      handle Factorial => print "Out of range."

An expression of the form exp handle match is called an exception handler. It is evaluated by attempting to evaluate exp. If it returns a value, then that is the value of the entire expression; the handler plays no role in this case. If, however, exp raises an exception exc, then the exception value is matched against the clauses of the match (exactly as in the application of a clausal function to an argument) to determine how to proceed. If the pattern of a clause matches the exception exc, then evaluation resumes with the expression part of that clause. If no pattern matches, the exception
exc is reraised so that outer exception handlers may dispatch on it. If no handler handles the exception, then the uncaught exception is signaled as the final result of evaluation. That is, computation is aborted with the uncaught exception exc.

In more operational terms, evaluation of exp handle match proceeds by installing an exception handler determined by match, then evaluating exp. The previous binding of the exception handler is preserved so that it may be restored once the given handler is no longer needed. Raising an exception consists of passing a value of type exn to the current exception handler. Passing an exception to a handler de-installs that handler, and re-installs the previously active handler. This ensures that if the handler itself raises an exception, or fails to handle the given exception, then the exception is propagated to the handler active prior to evaluation of the handle expression. If the expression does not raise an exception, the previous handler is restored as part of completing the evaluation of the handle expression.

Returning to the function factorial_driver, we see that evaluation proceeds by attempting to compute the factorial of a given number (read from the keyboard by an unspecified function read_integer), printing the result if the given number is in range, and otherwise reporting that the number is out of range. The example is trivialized to focus on the role of exceptions, but one could easily imagine generalizing it in a number of ways that also make use of exceptions. For example, we might repeatedly read integers until the user terminates the input stream (by typing the end of file character). Termination of input might be signaled by an EndOfFile exception, which is handled by the driver. Similarly, we might expect that the function read_integer raises the exception SyntaxError in the case that the input is not properly formatted. Again we would handle this exception, print a suitable message, and resume. Here’s a sketch of a more complicated factorial driver:

    fun factorial_driver () =
      let
        val input = read_integer ()
        val result = toString (checked_factorial input)
        val _ = print result
      in
        factorial_driver ()
      end
      handle EndOfFile => print "Done."
           | SyntaxError =>
               let val _ = print "Syntax error." in factorial_driver () end
           | Factorial =>
               let val _ = print "Out of range." in factorial_driver () end

We will return to a more detailed discussion of input/output later in these notes. The point to notice here is that the code is structured with a completely uncluttered “normal path” that reads an integer, computes its factorial, formats it, prints it, and repeats. The exception handler takes care of the exceptional cases: end of file, syntax error, and domain error. In the former we simply report completion and we are done. In the latter two cases we report an error, and resume reading.

The reader is encouraged to imagine how one might structure this program without the use of exceptions. The primary benefits of the exception mechanism are as follows:

1. They force you to consider the exceptional case (if you don’t, you’ll get an uncaught exception at run-time), and
2. They allow you to segregate the special case from the normal case in the code (rather than clutter the code with explicit checks).

These aspects work hand-in-hand to facilitate writing robust programs.

A typical use of exceptions is to implement backtracking, a programming technique based on exhaustive search of a state space. A very simple, if somewhat artificial, example is provided by the following function to compute change from an arbitrary list of coin values. What is at issue is that the obvious “greedy” algorithm for making change that proceeds
by doling out as many coins as possible in decreasing order of value does not always work. Given only a 5 cent and a 2 cent coin, we cannot make 16 cents in change by first taking three 5’s and then proceeding to dole out 2’s. In fact we must use two 5’s and three 2’s to make 16 cents. Here’s a method that works for any set of coins:

    exception Change

    fun change _ 0 = nil
      | change nil _ = raise Change
      | change (coin::coins) amt =
          if coin > amt then
            change coins amt
          else
            (coin :: change (coin::coins) (amt-coin))
            handle Change => change coins amt

The idea is to proceed greedily, but if we get “stuck”, we undo the most recent greedy decision and proceed again from there. Simulate evaluation of the example of change [5,2] 16 to see how the code works.

12.3 Value-Carrying Exceptions

So far exceptions are just “signals” that indicate that an exceptional condition has arisen. Often it is useful to attach additional information that is passed to the exception handler. This is achieved by attaching values to exceptions.

For example, we might associate with a SyntaxError exception a string indicating the precise nature of the error. In a parser for a language we might write something like

    raise SyntaxError "Integer expected"

to indicate a malformed expression in a situation where an integer is expected, and write

    raise SyntaxError "Identifier expected"

to indicate a badly-formed identifier.

To associate a string with the exception SyntaxError, we declare it as
02. We may use them to create an exception (perhaps with an associated value). Exception constructors are in many ways similar to value constructors.3 ValueCarrying Exceptions exception SyntaxError of string. For example. What about exception constructors? A “bare” exception constructor (such as Factorial above) has type exn and a valuecarrying exception constructor (such as SyntaxError) has type string > exn. This declaration introduces the exception constructor SyntaxError. as we previously mentioned. the list constructors nil and :: have types ’a list and ’a * ’a list > ’a list. and SyntaxError "Integer expected" is a value of type exn. The primitive operation raise takes any value of type R EVISED 11.12.11 D RAFT V ERSION 1. Recall that we may use value constructors in two ways: 1. respectively. Thus Factorial is a value of type exn. We may use them to match an exception (perhaps also matching the associated value). We may use them to match values of a datatype (perhaps also matching a constituent value). 2. 1. 111 This declaration introduces the exception SyntaxError as an exception carrying a string as value.. The type exn is the type of exception packets. the data values associated with exceptions. 2.. The situation with exception constructors is symmetric. handle SyntaxError msg => (print "Syntax error: " ^ msg) Here we specify a pattern for SyntaxError exceptions that also binds the string associated with the exception to the identiﬁer msg and prints that string along with an error indication. Value constructors have types.2 . In particular they can be used in patterns. as in the following code fragment: . We may use them to create values of a datatype (perhaps by applying them to other values).
exn as argument and raises an exception with that value. The clauses of a handler may be applied to any value of type exn using the rules of pattern matching described earlier; if an exception constructor is no longer in scope, then the handler cannot catch it (other than via a wild-card pattern).

The type exn may be thought of as a kind of built-in datatype, except that the constructors of this type are not determined once and for all (as they are with a datatype declaration), but rather are incrementally introduced as needed in a program. For this reason the type exn is sometimes called an extensible datatype.

12.4 Sample Code

Here is the code for this chapter.
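The original sample code is not reproduced in this text. The sketch below exercises two techniques from this chapter under stated assumptions: backtracking with the change function, and a value-carrying exception with a handler that binds the carried string. The helpers try_change, parse_digits and describe are hypothetical names introduced here; List.all, Char.isDigit, String.explode, Int.fromString, Int.toString and valOf come from the Basis Library.

```sml
exception Change

(* Repeated from the text: greedy change-making with backtracking. *)
fun change _ 0 = nil
  | change nil _ = raise Change
  | change (coin::coins) amt =
      if coin > amt then
        change coins amt
      else
        (coin :: change (coin::coins) (amt - coin))
        handle Change => change coins amt

(* Hypothetical wrapper: report failure as NONE rather than
   letting Change escape as an uncaught exception. *)
fun try_change coins amt =
  SOME (change coins amt) handle Change => NONE

val ok = try_change [5, 2] 16
val stuck = try_change [5, 2] 3

exception SyntaxError of string

(* Hypothetical parser fragment: accepts only non-empty digit strings. *)
fun parse_digits s =
  if s <> "" andalso List.all Char.isDigit (String.explode s) then
    valOf (Int.fromString s)
  else
    raise SyntaxError "Integer expected"

(* The handler pattern binds the carried string to msg. *)
fun describe s =
  Int.toString (parse_digits s)
  handle SyntaxError msg => "Syntax error: " ^ msg
```

With coins [5, 2], the amount 16 is made from two 5's and three 2's after one greedy choice is undone, while the amount 3 cannot be made at all, so try_change yields NONE.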
Chapter 13
Mutable Storage

In this chapter we consider a second form of effect, called a storage effect, the allocation or mutation of storage during evaluation. The introduction of storage effects has profound consequences, not all of which are desirable. (Indeed, one connotation of the phrase side effect is an unintended consequence of a medication!) While it is excessive to dismiss storage effects as completely undesirable, it is advantageous to minimize the use of storage effects except in situations where the task clearly demands them. We will explore some techniques for programming with storage effects later in this chapter, but first we introduce the primitive mechanisms for programming with mutable storage in ML.

13.1 Reference Cells

To support mutable storage the execution model that we described in chapter 2 is enriched with a memory consisting of a finite set of mutable cells. A mutable cell may be thought of as a container in which a data value of a specified type is stored. During execution of a program the contents of a cell may be retrieved or replaced by any other value of the appropriate type. Since cells are used by issuing “commands” to modify and retrieve their contents, programming with cells is called imperative programming.

Changing the contents of a mutable cell introduces a temporal aspect to evaluation. We speak of the current contents of a cell, meaning the value most recently assigned to it. We also speak of previous and future values of a reference cell when discussing the behavior of a program. This is in sharp
contrast to the effect-free fragment of ML, for which no such concepts apply. For example, the binding of a variable does not change while evaluating within the scope of that variable, lending a “permanent” quality to statements about variables: the “current” binding is the only binding that variable will ever have.

The type typ ref is the type of reference cells containing values of type typ. Reference cells are, like all values, first class: they may be bound to variables, passed as arguments to functions, returned as results of functions, appear within data structures, and even be stored within other reference cells.

A reference cell is created, or allocated, by the function ref of type typ -> typ ref. When applied to a value val of type typ, ref allocates a “new” cell, initializes its content to val, and returns a reference to the cell. By “new” we mean that the allocated cell is distinct from all other cells previously allocated, and does not share storage with them.

The contents of a cell of type typ is retrieved using the function ! of type typ ref -> typ. Applying ! to a reference cell yields the current contents of that cell. The contents of a cell is changed by applying the assignment operator op :=, which has type typ ref * typ -> unit. Assignment is usually written using infix syntax. When applied to a cell and a value, it replaces the content of that cell with that value, and yields the null-tuple as result.

Here are some examples:

    val r = ref 0
    val s = ref 0
    val _ = r := 3
    val x = !s + !r
    val t = r
    val _ = t := 5
    val y = !s + !r
    val z = !t + !r

After execution of these bindings, the variable x is bound to 3, the variable y is bound to 5, and z is bound to 10.

Notice the use of a val binding of the form val _ = exp when exp is to be evaluated purely for its effect. The value of exp is discarded by the binding, since the left-hand side is a wildcard pattern. In most cases the
expression exp has type unit, so that its value is guaranteed to be the null-tuple, (), if it has a value at all.

A wildcard binding is used to define sequential composition of expressions in ML. The expression

    exp1 ; exp2

is shorthand for the expression

    let val _ = exp1 in exp2 end

that first evaluates exp1 for its effect, then evaluates exp2.

Functions of type typ->unit are sometimes called procedures, because they are executed purely for their effect. This is apparent from the type: it is assured that the value of applying such a function is the null-tuple, (), so the only point of applying it is for its effects on memory.

13.2 Reference Patterns

It is a common mistake to omit the exclamation point when referring to the content of a reference, especially when that cell is bound to a variable. In more familiar languages such as C all variables are implicitly bound to reference cells, and they are implicitly dereferenced whenever they are used so that a variable always stands for its current contents. This is both a boon and a bane. It is obviously helpful in many common cases since it alleviates the burden of having to explicitly dereference variables whenever their content is required. However, it shifts the burden to the programmer in the case that the address, and not the content, is intended. In C one writes &x for the address of (the cell bound to) x. Whether explicit or implicit dereferencing is preferable is to a large extent a matter of taste. The burden of explicit dereferencing is not nearly so onerous in ML as it might be in other languages simply because reference cells are used so infrequently in ML programs, whereas they are the sole means of binding variables in more familiar languages.
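The points made so far about allocation, explicit dereferencing, procedures, and sequential composition can be seen together in a short sketch. The bindings of r, s, t, x, y and z repeat the example from the previous section; the procedure bump and the value count are hypothetical names introduced only for illustration.

```sml
val r = ref 0
val s = ref 0
val _ = r := 3
val x = !s + !r     (* s holds 0 and r holds 3 *)
val t = r           (* t and r are bound to the same cell *)
val _ = t := 5      (* so this assignment is visible through r *)
val y = !s + !r
val z = !t + !r

(* A procedure of type int ref -> unit, applied only for its effect.
   Writing c := c + 1 without the ! would be rejected by the type checker. *)
fun bump (c : int ref) = c := !c + 1

(* Sequential composition: run two effects, then read the cell. *)
val count = (bump s; bump s; !s)
```

Because s started at 0 and is bumped twice, count is 2; the earlier bindings x, y and z are unaffected, since they captured integers rather than cells.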
An alternative to explicitly dereferencing cells is to use ref patterns. A pattern of the form ref pat matches a reference cell whose content matches the pattern pat. This means that the cell’s contents are implicitly retrieved during pattern matching, and may be subsequently used without explicit dereferencing. In fact, the function ! may be defined using a ref pattern as follows:

    fun !(ref a) = a

When called with a reference cell, it is dereferenced and its contents is bound to a, which is returned as result. In practice it is common to use both explicit dereferencing and ref patterns, depending on the situation.

13.3 Identity

Reference cells raise delicate issues of equality that considerably complicate reasoning about programs. In general we say that two expressions (of the same type) are equal iff they cannot be distinguished by any operation in the language. That is, two expressions are distinct iff there is some way within the language to tell them apart. This is called Leibniz’s Principle of identity of indiscernibles (we equate everything that we cannot tell apart) and the indiscernibility of identicals (that which we deem equal cannot be told apart).

What makes Leibniz’s Principle tricky to grasp is that it hinges on what we mean by a “way to tell expressions apart”. The crucial idea is that we can tell two expressions apart iff there is a complete program containing one of the expressions whose observable behavior changes when we replace that expression by the other. That is, two expressions are considered equal iff there is no such scenario that distinguishes them. But what do we mean by “complete program”? And what do we mean by “observable behavior”?

For the present purposes we will consider a complete program to be any expression of basic type (say, int or bool or string). The idea is that a complete program is one that computes a concrete result such as a number. The observable behavior of a complete program includes at least these aspects:

1. Its value, or lack thereof, either by non-termination or by raising an uncaught exception.
2. Its visible side effects, including visible modifications to mutable storage or any input/output it may perform.

In contrast here are some behaviors that we will not count as observations:

1. Execution time or space usage.
2. “Private” uses of storage (e.g., internally-allocated reference cells).
3. The name of uncaught exceptions (i.e., we will not distinguish between terminating with the uncaught exception Bind and the uncaught exception Match).

With these ideas in mind, it should be plausible that if we evaluate these bindings

    val r = ref 0
    val s = ref 0

then r and s are not equivalent. Consider the following usage of r to compute an integer result:

    (s := 1 ; !r)

Clearly this expression evaluates to 0, and mutates the binding of s. Now replace r by s to obtain

    (s := 1 ; !s)

This expression evaluates to 1, and mutates s as before. These two complete programs distinguish r from s, and therefore r and s must be considered distinct.

Had we replaced the binding for s by the binding

    val s = r

then the two expressions that formerly distinguished r from s no longer do so. They are, after all, bound to the same reference cell! In fact, no program can be concocted that would distinguish them. In this case r and s are equivalent.

Now consider a third, very similar scenario. Let us declare r and s as follows:
    val r = ref ()
    val s = ref ()

Are r and s equivalent or not? We might first try to distinguish them by a variant of the experiment considered above. This breaks down because there is only one possible value we can assign to a variable of type unit ref! Indeed, one may suspect that r and s are equivalent in this case, but in fact there is a way to distinguish them! Here’s a complete program involving r that we will use to distinguish r from s:

    if r=r then "it's r" else "it's not"

Now replace the first occurrence of r by s to obtain

    if s=r then "it's r" else "it's not"

and the result is different.

This example hinges on the fact that ML defines equality for values of reference type to be reference equality (or, occasionally, pointer equality). Two reference cells (of the same type) are equal in this sense iff they both arise from the exact same use of the ref operation to allocate that cell; otherwise they are distinct. Thus the two cells bound to r and s above are observably distinct (by testing reference equality), even though they can only ever hold the value (). Had equality not been included as a primitive, any two reference cells of unit type would have been equal.

Why does ML provide such a fine-grained notion of equality? “True” equality, as defined by Leibniz’s Principle, is, unfortunately, undecidable: there is no computer program that determines whether two expressions are equivalent in this sense. ML provides a useful, conservative approximation to true equality that in some cases is not defined (you cannot test two functions for equality) and in other cases is too picky (it distinguishes reference cells that are otherwise indistinguishable). Such is life.

13.4 Aliasing

To see how reference cells complicate programming, let us consider the problem of aliasing. Any two variables of the same reference type might be bound to the same reference cell, or to two different reference cells. For example, after the declarations
For example. They might. 13.11 D RAFT V ERSION 1. but after the declaration val r = ref 0 val s = r 119 the variables r and s are aliases for the same reference cell. We must always ask ourselves whether we’ve properly considered aliasing when writing such a function.5 Programming Well With References Using references it is possible to mimic the style of programming used in imperative languages such as C. This is harder to do than it sounds. because we cannot assume that two different argument variables are bound to different reference cells. we might deﬁne the factorial function in imitation of such languages as follows: R EVISED 11. y:typ ref) => exp we may not assume that x and y are bound to different reference cells.02. Aliasing is a huge source of bugs in programs that work with reference cells. in fact. For example.2 . in which case we say that the two variables are aliases for one another.5 Programming Well With References val r = ref 0 val s = ref 0 the variables r and s are not aliases. This is particularly problematic in the case of functions.13. These examples show that we must be careful when programming with variables of reference type. in a function of the form fn (x:typ ref. be bound to the same reference cell.
    fun imperative_fact (n:int) =
      let
        val result = ref 1
        val i = ref 0
        fun loop () =
          if !i = n then ()
          else (i := !i + 1; result := !result * !i; loop ())
      in
        loop (); !result
      end

Notice that the function loop is essentially just a while loop; it repeatedly executes its body until the contents of the cell bound to i reaches n. The tail call to loop is essentially just a goto statement to the top of the loop.

It is (appallingly) bad style to program in this fashion. The purpose of the function imperative_fact is to compute a simple function on the natural numbers. There is nothing about its definition that suggests that state must be maintained, and so it is senseless to allocate and modify storage to compute it. The definition we gave earlier is shorter, simpler, more efficient, and hence more suitable to the task. This is not to suggest, however, that there are no good uses of references. We will now discuss some important uses of state in ML.

13.5.1 Private Storage

The first example is the use of higher-order functions to manage shared private state. This programming style is closely related to the use of objects to manage state in object-oriented programming languages. Here's an example to frame the discussion:
    local
      val counter = ref 0
    in
      fun tick () = (counter := !counter + 1; !counter)
      fun reset () = (counter := 0)
    end

This declaration introduces two functions, tick of type unit -> int and reset of type unit -> unit. Their definitions share a private variable counter that is bound to a mutable cell containing the current value of a shared counter. The tick operation increments the counter and returns its new value, and the reset operation resets its value to zero. The types of the operations suggest that implicit state is involved. In the absence of exceptions and implicit state, there is only one useful function of type unit->unit, namely the function that always returns its argument (and it's debatable whether this is really useful!).

The declaration above defines two functions, tick and reset, that share a single private counter. Suppose now that we wish to have several different instances of a counter, that is, different pairs of functions tick and reset that share different state. We can achieve this by defining a counter generator (or constructor) as follows:

    fun new_counter () =
      let
        val counter = ref 0
        fun tick () = (counter := !counter + 1; !counter)
        fun reset () = (counter := 0)
      in
        { tick = tick, reset = reset }
      end

The type of new_counter is unit -> { tick : unit->int, reset : unit->unit }.

We've packaged the two operations into a record containing two functions that share private state. There is an obvious analogy with class-based object-oriented programming. The function new_counter may be thought of as a constructor for a class of counter objects. Each object has a private instance variable counter that is shared between the methods tick and reset of the object, which is represented as a record with two fields.
Here's how we use counters.

    val c1 = new_counter ();
    val c2 = new_counter ();
    #tick c1 ();    (* 1 *)
    #tick c1 ();    (* 2 *)
    #tick c2 ();    (* 1 *)
    #reset c1 ();
    #tick c1 ();    (* 1 *)
    #tick c2 ();    (* 2 *)

Notice that c1 and c2 are distinct counters that increment and reset independently of one another.

13.5.2 Mutable Data Structures

A second important use of references is to build mutable data structures. The data structures (such as lists and trees) we've considered so far are immutable in the sense that it is impossible to change the structure of the list or tree without building a modified copy of that structure. This is both a benefit and a drawback. The principal benefit is that immutable data structures are persistent in that operations performed on them do not destroy the original structure; in ML we can eat our cake and have it too. For example, we can simultaneously maintain a dictionary both before and after insertion of a given word. The principal drawback is that if we aren't really relying on persistence, then it is wasteful to make a copy of a structure if the original is going to be discarded anyway. What we'd like in this case is to have an "update in place" operation to build an ephemeral (opposite of persistent) data structure. To do this in ML we make use of references.

A simple example is the type of possibly circular lists, or pcl's. Informally, a pcl is a finite graph in which every node has at most one neighbor, called its predecessor, in the graph. In contrast to ordinary lists the predecessor relation is not necessarily well-founded: there may be an infinite sequence of nodes arranged in descending order of predecession. Since the graph is finite, this can only happen if there is a cycle in the graph: some node has an ancestor as predecessor. How can such a structure ever come into existence? If the predecessors of a cell are needed to construct a cell, then the ancestor that is to serve as predecessor in the cyclic case can never be created! The "trick" is to employ backpatching: the predecessor is initialized to Nil, so that the node and its ancestors can be constructed, then it is reset to the appropriate ancestor to create the cycle.

This can be achieved in ML using the following datatype declaration:

    datatype 'a pcl = Pcl of 'a pcell ref
    and 'a pcell = Nil | Cons of 'a * 'a pcl;

A value of type typ pcl is essentially a reference to a value of type typ pcell. A value of type typ pcell is either Nil, marking the end of a possibly circular list that happens not to be circular, or Cons (h, t), where h is a value of type typ and t is another such possibly-circular list.

Here are some convenient functions for creating and taking apart possibly-circular lists:

    fun cons (h, t) = Pcl (ref (Cons (h, t)));
    fun nill () = Pcl (ref Nil);
    fun phd (Pcl (ref (Cons (h, _)))) = h;
    fun ptl (Pcl (ref (Cons (_, t)))) = t;

To implement backpatching, we need a way to "zap" the tail of a possibly-circular list.

    fun stl (Pcl (r as ref (Cons (h, _))), u) = (r := Cons (h, u));

If you'd like, it would make sense to require that the tail of the Cons cell be the empty pcl, so that you're only allowed to backpatch at the end of a finite pcl.

Here is a finite and an infinite pcl.

    val finite = cons (4, cons (3, cons (2, cons (1, nill ()))))
    val tail = cons (1, nill ());
    val infinite = cons (4, cons (3, cons (2, tail)));
    val _ = stl (tail, infinite)
The last step backpatches the tail of the last cell of infinite to be infinite itself, creating a circular list.

Now let us define the size of a pcl to be the number of distinct nodes occurring in it. It is an interesting problem to define a size function for pcls that makes no use of auxiliary storage (e.g., no set of previously-encountered nodes) and runs in time proportional to the number of cells in the pcl. The idea is to think of running a long race between a tortoise and a hare. If the course is circular, then the hare, which quickly runs out ahead of the tortoise, will eventually come from behind and pass it! Conversely, if this happens, the course must be circular.

    local
      fun race (Nil, Nil) = 0
        | race (Cons (_, Pcl (ref c)), Nil) = 1 + race (c, Nil)
        | race (Cons (_, Pcl (ref c)), Cons (_, Pcl (ref Nil))) = 1 + race (c, Nil)
        | race (Cons (_, l), Cons (_, Pcl (ref (Cons (_, m))))) = 1 + race' (l, m)
      and race' (Pcl (r as ref c), Pcl (s as ref d)) =
            if r=s then 0 else race (c, d)
    in
      fun size (Pcl (ref c)) = race (c, c)
    end

The hare runs twice as fast as the tortoise. We let the tortoise do the counting; the hare's job is simply to detect cycles. If the hare reaches the finish line, it simply waits for the tortoise to finish counting. This covers the first three clauses of race. If the hare has not yet finished, we must continue with the hare running at twice the pace, checking whether the hare catches the tortoise from behind. Notice that it can never arise that the tortoise reaches the end before the hare does! Consequently, the definition of race is inexhaustive.

13.6 Mutable Arrays

In addition to reference cells, ML also provides mutable arrays as a primitive data structure. The type typ array is the type of arrays carrying values of type typ. The basic operations on arrays are these:

    val array  : int * 'a -> 'a array
    val length : 'a array -> int
    val sub    : 'a array * int -> 'a
    val update : 'a array * int * 'a -> unit

The function array creates a new array of a given length, with the given value as the initial value of every element of the array. The function length returns the length of an array. The function sub performs a subscript operation, returning the ith element of an array A, where 0 ≤ i < length(A). (These are just the basic operations on arrays; please see Appendix A for complete information.)

One simple use of arrays is for memoization. Here's a function to compute the nth Catalan number, which may be thought of as the number of distinct ways to parenthesize an arithmetic expression consisting of a sequence of n consecutive multiplications. It makes use of an auxiliary summation function that you can easily define for yourself. (Applying sum to f and n computes the sum f 1 + · · · + f n.)

    fun C 1 = 1
      | C n = sum (fn k => (C k) * (C (n-k))) (n-1)

This definition of C is hugely inefficient because a given computation may be repeated exponentially many times. For example, to compute C 10 we must compute C 1, C 2, . . . , C 9, and the computation of C i engenders the computation of C 1, . . . , C (i−1) for each 1 ≤ i ≤ 9. We can do better by caching previously-computed results in an array, leading to an enormous improvement in execution speed. Here's the code:

    local
      val limit : int = 100
      val memopad : int option array = Array.array (limit, NONE)
    in
      fun C' 1 = 1
        | C' n = sum (fn k => (C k)*(C (n-k))) (n-1)
      and C n =
          if n < limit then
            case Array.sub (memopad, n)
              of SOME r => r
               | NONE =>
                 let
                   val r = C' n
                 in
                   Array.update (memopad, n, SOME r);
                   r
                 end
          else
            C' n
    end

Note carefully the structure of the solution. The function C is a memoized version of the Catalan number function. When called it consults the memopad to determine whether or not the required result has already been computed. If so, the answer is simply retrieved from the memopad; otherwise the result is computed, stored in the cache, and returned. The function C' looks superficially similar to the earlier definition of C, with the important difference that the recursive calls are to C, rather than C' itself. This ensures that subcomputations are properly cached and that the cache is consulted whenever possible.

The main weakness of this solution is that we must fix an upper bound on the size of the cache. This can be alleviated by implementing a more sophisticated cache management scheme that dynamically adjusts the size of the cache based on the calls made to it.

13.7 Sample Code

Here is the code for this chapter.
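As one illustration of the memoization technique of Section 13.6, the following is a minimal, self-contained sketch rather than the chapter's official sample code. The summation helper sum is left to the reader in the text; the definition given here is only one plausible choice. The extra lower-bound check on n is likewise an assumption added to keep Array.sub from raising Subscript on non-positive arguments.

```sml
(* sum f n computes f 1 + ... + f n, and is 0 when n < 1.
   This helper is an assumption: the text leaves it to the reader. *)
fun sum (f : int -> int) (n : int) : int =
    if n < 1 then 0 else f n + sum f (n - 1)

local
  val limit : int = 100
  (* memopad holds NONE for "not yet computed" and SOME r for a cached result *)
  val memopad : int option array = Array.array (limit, NONE)
in
  (* C' does the real work, but recurs through the memoized C *)
  fun C' 1 = 1
    | C' n = sum (fn k => C k * C (n - k)) (n - 1)
  and C n =
      if 0 <= n andalso n < limit then
        case Array.sub (memopad, n) of
            SOME r => r
          | NONE =>
              let
                val r = C' n
              in
                Array.update (memopad, n, SOME r);
                r
              end
      else
        C' n
end
```

With these definitions, evaluating C 5 computes each of C 1 through C 4 only once; later calls such as C 4 are answered directly from the memopad. Large arguments may still raise Overflow, depending on the default integer size of the implementation.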
Chapter 14 Input/Output
The Standard ML Basis Library (described in Appendix A) defines a three-layer input and output facility for Standard ML. These modules provide a rudimentary, platform-independent text I/O facility that we summarize briefly here. The reader is referred to Appendix A for more details. Unfortunately, there is at present no standard library for graphical user interfaces; each implementation provides its own package. See your compiler's documentation for details.
14.1 Textual Input/Output
The text I/O primitives are based on the notions of an input stream and an output stream, which are values of type instream and outstream, respectively. An input stream is an unbounded sequence of characters arising from some source. The source could be a disk file, an interactive user, or another program (to name a few choices). Any source of characters can be attached to an input stream. An input stream may be thought of as a buffer containing zero or more characters that have already been read from the source, together with a means of requesting more input from the source should the program require it. Similarly, an output stream is an unbounded sequence of characters leading to some sink. The sink could be a disk file, an interactive user, or another program (to name a few choices). Any sink for characters can be attached to an output stream. An output stream may be thought of as a buffer containing zero or more characters that have been produced by the program but have yet to be flushed to the
sink.

Each program comes with one input stream and one output stream, called stdIn and stdOut, respectively. These are ordinarily connected to the user's keyboard and screen, and are used for performing simple text I/O in a program. The output stream stdErr is also predefined, and is used for error reporting. It is ordinarily connected to the user's screen.

Textual input and output are performed on streams using a variety of primitives. The simplest are inputLine and print. To read a line of input from a stream, use the function inputLine of type instream -> string. It reads a line of input from the given stream and yields that line as a string whose last character is the line terminator. If the source is exhausted, it returns the empty string. To write a line to stdOut, use the function print of type string -> unit. To write to a specific stream, use the function output of type outstream * string -> unit, which writes the given string to the specified output stream. For interactive applications it is often important to ensure that the output stream is flushed to the sink (e.g., so that it is displayed on the screen). This is achieved by calling flushOut of type outstream -> unit. The print function is a composition of output (to stdOut) and flushOut.

A new input stream may be created by calling the function openIn of type string -> instream. When applied to a string, the system attempts to open a file with that name (according to operating system-specific naming conventions) and attaches it as a source to a new input stream. Similarly, a new output stream may be created by calling the function openOut of type string -> outstream. When applied to a string, the system attempts to create a file with that name (according to operating system-specific naming conventions) and attaches it as a sink for a new output stream. An input stream may be closed using the function closeIn of type instream -> unit.
A closed input stream behaves as if there is no further input available; requests for input from a closed input stream yield the empty string. An output stream may be closed using closeOut of type outstream -> unit. A closed output stream is unavailable for further output; an attempt to write to a closed output stream raises the exception IO.Io.

The function input of type instream -> string is a blocking read operation that returns a string consisting of the characters currently available from the source. If none are currently available, but the end of source has not been reached, then the operation blocks until at least one character is
available from the source. If the source is exhausted or the input stream is closed, input returns the null string. To test whether an input operation would block, use the function canInput of type instream * int -> int option. Given a stream s and a bound n, the function canInput determines whether or not a call to input on s would immediately yield up to n characters. If the input operation would block, canInput yields NONE; otherwise it yields SOME k, with 0 ≤ k ≤ n being the number of characters immediately available on the input stream. If canInput yields SOME 0, the stream is either closed or exhausted. The function endOfStream of type instream -> bool tests whether the input stream is currently at the end (no further input is available from the source). This condition is transient since, for example, another process might append data to an open file in between calls to endOfStream.

The function output of type outstream * string -> unit writes a string to an output stream. It may block until the sink is able to accept the entire string. The function flushOut of type outstream -> unit forces any pending output to the sink, blocking until the sink accepts the remaining buffered output.

This collection of primitive I/O operations is sufficient for performing rudimentary textual I/O. For further information on textual I/O, and support for binary I/O and Posix I/O primitives, see the Standard ML Basis Library.
14.2 Sample Code
Here is the code for this chapter.
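As a hedged illustration of the stream primitives described in Section 14.1, here is a small sketch that copies a text file line by line. It is not the chapter's official sample code. The function names copyLines, copyFile, writeFile and readFile are hypothetical. The sketch assumes the revised Basis Library signature, in which TextIO.inputLine returns a string option (NONE at end of stream); on an implementation that follows the instream -> string type quoted above, the case analysis would instead test for the empty string.

```sml
(* Copy every remaining line of ins to outs; return the number of lines copied.
   Assumes TextIO.inputLine : instream -> string option (revised Basis). *)
fun copyLines (ins : TextIO.instream, outs : TextIO.outstream) : int =
    let
      fun loop count =
          case TextIO.inputLine ins of
              NONE => count
            | SOME line => (TextIO.output (outs, line); loop (count + 1))
    in
      loop 0
    end

(* Open the named files, copy src to dst, and close both streams. *)
fun copyFile (src : string, dst : string) : int =
    let
      val ins = TextIO.openIn src
      val outs = TextIO.openOut dst
      val n = copyLines (ins, outs)
    in
      TextIO.closeIn ins;
      TextIO.closeOut outs;   (* closing flushes any buffered output *)
      n
    end

(* Small helpers used to exercise copyFile. *)
fun writeFile (name : string, contents : string) : unit =
    let
      val outs = TextIO.openOut name
    in
      TextIO.output (outs, contents);
      TextIO.closeOut outs
    end

fun readFile (name : string) : string =
    let
      val ins = TextIO.openIn name
      val s = TextIO.inputAll ins
    in
      TextIO.closeIn ins;
      s
    end
```

The same copyLines function works unchanged on the predefined streams, for example copyLines (TextIO.stdIn, TextIO.stdOut), since it depends only on the stream types and not on what source or sink is attached.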
Chapter 15 Lazy Data Structures
In ML all variables are bound by value, which means that the bindings of variables are fully evaluated expressions, or values. This general principle has several consequences:

1. The right-hand side of a val binding is evaluated before the binding is effected. If the right-hand side has no value, the val binding does not take effect.
2. In a function application the argument is evaluated before being passed to the function by binding that value to the parameter of the function. If the argument does not have a value, then neither does the application.
3. The arguments to value constructors are evaluated before the constructed value is created.

According to the by-value discipline, the bindings of variables are evaluated, regardless of whether that variable is ever needed to complete execution. For example, to compute the result of applying the function fn x => 1 to an argument, we never actually need to evaluate the argument, but we do anyway. For this reason ML is sometimes said to be an eager language.

An alternative is to bind variables by name,[1] which means that the binding of a variable is an unevaluated expression, known as a computation or a suspension or a thunk.[2] This principle has several consequences:

1. The right-hand side of a val binding is not evaluated before the binding is effected. The variable is bound to a computation (unevaluated expression), not a value.
2. In a function application the argument is passed to the function in unevaluated form by binding it directly to the parameter of the function. This holds regardless of whether the argument has a value or not.
3. The arguments to value constructors are left unevaluated when the constructed value is created.

According to the by-name discipline, the bindings of variables are only evaluated (if ever) when their values are required by a primitive operation. For example, to evaluate the expression x+x, it is necessary to evaluate the binding of x in order to perform the addition. Languages that adopt the by-name discipline are, for this reason, said to be lazy.

This discussion glosses over another important aspect of lazy evaluation, called memoization. In actual fact laziness is based on a refinement of the by-name principle, called the by-need principle. According to the by-name principle, variables are bound to unevaluated computations, and are evaluated only as often as the value of that variable's binding is required to complete the computation. In particular, to evaluate the expression x+x the value of the binding of x is needed twice, and hence it is evaluated twice. According to the by-need principle, the binding of a variable is evaluated at most once: not at all, if it is never needed, and exactly once if it is ever needed at all. Re-evaluation of the same computation is avoided by memoization. Once a computation is evaluated, its value is saved for future reference should that computation ever be needed again.

The advantages and disadvantages of lazy versus eager languages have been hotly debated. We will not enter into this debate here, but rather content ourselves with the observation that laziness is a special case of eagerness.

[1] The terminology is historical, and not well-motivated. It is, however, firmly established.
[2] For reasons that are lost in the mists of time.

(Recent versions of) ML have lazy data types that allow us to treat unevaluated computations as values of such types, allowing us to incorporate laziness into the language without disrupting its fundamental character on which so much else depends. This affords the benefits of laziness, but on a controlled basis: we can use it when it is appropriate, and ignore it when it is not.

The main benefit of laziness is that it supports demand-driven computation. This is useful for representing on-line data structures that are created only insofar as we examine them. Infinite data structures, such as the sequence of all prime numbers in order of magnitude, are one example of an on-line data structure. Clearly we cannot ever "finish" creating the sequence of all prime numbers, but we can create as much of this sequence as we need for a given run of a program. Interactive data structures, such as the sequence of inputs provided by the user of an interactive system, are another example of on-line data structures. In such a system the user's inputs are not pre-determined at the start of execution, but rather are created "on demand" in response to the progress of computation up to that point. The demand-driven nature of on-line data structures is precisely what is needed to model this behavior.

Note: Lazy evaluation is a non-standard feature of ML that is supported only by the SML/NJ compiler. The lazy evaluation features must be enabled by executing the following at top level:

    Compiler.Control.lazysml := true;
    open Lazy;

15.1 Lazy Data Types

SML/NJ provides a general mechanism for introducing lazy data types by simply attaching the keyword lazy to an ordinary datatype declaration. The ideas are best illustrated by example. We will focus attention on the type of infinite streams, which may be declared as follows:

    datatype lazy 'a stream = Cons of 'a * 'a stream

Notice that this type definition has no "base case"! Had we omitted the keyword lazy, such a datatype would not be very useful, since there would be no way to create a value of that type!

Adding the keyword lazy makes all the difference. Doing so specifies that the values of type typ stream are computations of values of the form Cons (val, val'), where val is of type typ, and val' is another such computation. Notice how this description captures the "incremental" nature of lazy data structures. The computation is not evaluated until we examine it. When we do, its structure is revealed as consisting of an element val together with another suspended computation of the same type. Should we inspect that computation, it will again have this form, and so on ad infinitum.

Values of type typ stream are created using a val rec lazy declaration that provides a means for building a "circular" data structure. Here is a declaration of the infinite stream of 1's as a value of type int stream:

    val rec lazy ones = Cons (1, ones)

The keyword lazy indicates that we are binding ones to a computation, rather than a value. The keyword rec indicates that the computation is recursive (or self-referential or circular). It is the computation whose underlying value is constructed using Cons (the only possibility) from the integer 1 and the very same computation itself.

We can inspect the underlying value of a computation by pattern matching. For example, the binding

    val Cons (h, t) = ones

extracts the "head" and "tail" of the stream ones. This is performed by evaluating the computation bound to ones, yielding Cons (1, ones), then performing ordinary pattern matching to bind h to 1 and t to ones.

Had the pattern been "deeper", further evaluation would be required, as in the following binding:

    val Cons (h, (Cons (h', t'))) = ones

To evaluate this binding, we evaluate ones to Cons (1, ones), binding h to 1 in the process, then evaluate ones again to Cons (1, ones), binding h' to 1 and t' to ones. The general rule is that pattern matching forces evaluation of a computation to the extent required by the pattern. This is the means by which lazy data structures are evaluated only insofar as required.

15.2 Lazy Function Definitions

The combination of (recursive) lazy function definitions and decomposition by pattern matching are the core mechanisms required to support lazy evaluation. However, there is a subtlety about function definitions that requires careful consideration, and a third new mechanism, the lazy function declaration.

Using pattern matching we may easily define functions over lazy data structures in a familiar manner. For example, we may define two functions to extract the head and tail of a stream as follows:

    fun shd (Cons (h, _)) = h
    fun stl (Cons (_, s)) = s

These are functions that, when applied to a stream, evaluate it, and match it against the given patterns to extract the head and tail, respectively.

While these functions are surely very natural, there is a subtle issue that deserves careful discussion. The issue is whether these functions are "lazy enough". From one point of view, what we are doing is decomposing a computation by evaluating it and retrieving its components. In the case of the shd function there is no other interpretation: we are extracting a value of type typ from a value of type typ stream, which is a computation of a value of the form Cons (exp_h, exp_t). We can adopt a similar viewpoint about stl, namely that it is simply extracting a component value from a computation of a value of the form Cons (exp_h, exp_t).

However, in the case of stl, another point of view is also possible. Rather than think of stl as extracting a value from a stream, we may instead think of it as creating a stream out of another stream. Since streams are computations, the stream created by stl (according to this view) should also be suspended until its value is required. Under this interpretation the argument to stl should not be evaluated until its result is required, rather than at the time stl is applied. This leads to a variant notion of "tail" that may be defined as follows:

    fun lazy lstl (Cons (_, s)) = s

The keyword lazy indicates that an application of lstl to a stream does not immediately perform pattern matching on its argument, but rather sets up a stream computation that, when forced, forces the argument and extracts the tail of the stream.

The behavior of the two forms of tail function can be distinguished using print statements as follows:

    val rec lazy s = (print "."; Cons (1, s));
    val _ = stl s;     (* prints "." *)
    val _ = stl s;     (* silent *)

    val rec lazy s = (print "."; Cons (1, s));
    val _ = lstl s;    (* silent *)
    val _ = stl s;     (* prints "." *)

Since stl evaluates its argument when applied, the "." is printed when it is first called, but not if it is called again. However, since lstl only sets up a computation, its argument is not evaluated when it is called, but only when its result is evaluated.

15.3 Programming with Streams

Let's define a function smap that applies a function to every element of a stream, yielding another stream. The type of smap should be ('a -> 'b) -> 'a stream -> 'b stream. The thing to keep in mind is that the application of smap to a function and a stream should set up (but not compute) another stream that, when forced, forces the argument stream to obtain the head element, applies the given function to it, and yields this as the head of the result.

Here's the code:

    fun smap f =
      let
        fun lazy loop (Cons (x, s)) = Cons (f x, loop s)
      in
        loop
      end

We have "staged" the computation so that the partial application of smap to a function yields a function that loops over a given stream, applying the given function to each element. This loop is a lazy function to ensure that it merely sets up a stream computation, rather than evaluating its argument when it is called. Had we dropped the keyword lazy from the definition of the loop, then an application of smap to a function and a stream would immediately force the computation of the head element of the stream, rather than merely set up a future computation of the same result.

To illustrate the use of smap, here's a definition of the infinite stream of natural numbers:

    val one_plus = smap (fn n => n+1)
    val rec lazy nats = Cons (0, one_plus nats)

Now let's define a function sfilter of type ('a -> bool) -> 'a stream -> 'a stream that filters out all elements of a stream that do not satisfy a given predicate.

    fun sfilter pred =
      let
        fun lazy loop (Cons (x, s)) =
          if pred x then Cons (x, loop s) else loop s
      in
        loop
      end

We can use sfilter to define a function sieve that, when applied to a stream of numbers, retains only those numbers that are not divisible by a preceding number in the stream:

    fun m mod n = m - n * (m div n)
    fun divides m n = n mod m = 0
    fun lazy sieve (Cons (x, s)) = Cons (x, sieve (sfilter (not o (divides x)) s))

(This example uses o for function composition.)

We may now define the infinite stream of primes by applying sieve to the natural numbers greater than or equal to 2:

    val nats2 = stl (stl nats)
    val primes = sieve nats2

To inspect the values of a stream it is often useful to use the following function that takes n ≥ 0 elements from a stream and builds a list of those n values:

    fun take 0 _ = nil
      | take n (Cons (x, s)) = x :: take (n-1) s

Here's an example to illustrate the effects of memoization:

    val rec lazy s = Cons ((print "."; 1), s)
    val Cons (h, _) = s;    (* prints ".", binds h to 1 *)
    val Cons (h, _) = s;    (* silent, binds h to 1 *)

Replace print "."; 1 by a time-consuming operation yielding 1 as result, and you will see that the second time we force s the result is returned instantly, taking advantage of the effort expended on the time-consuming operation induced by the first force of s.

15.4 Sample Code

Here is the code for this chapter.
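Because the lazy keyword is specific to SML/NJ, it may help to see how the same ideas can be approximated in portable Standard ML. The following is a sketch under stated assumptions, not the chapter's official sample code and not how SML/NJ implements laziness: a suspension is modeled as a reference cell holding either a delayed function or its memoized value, and every name introduced here (state, susp, delay, force, Stream, expose, from) is hypothetical. The stream functions mirror shd, stl, lstl, smap, sfilter, sieve and take from the text.

```sml
(* A suspension: either a computation not yet run, or its saved value. *)
datatype 'a state = Delayed of unit -> 'a | Forced of 'a
type 'a susp = 'a state ref

fun delay (f : unit -> 'a) : 'a susp = ref (Delayed f)

(* force runs the computation at most once and memoizes the result. *)
fun force (s : 'a susp) : 'a =
    case !s of
        Forced v => v
      | Delayed f => let val v = f () in s := Forced v; v end

(* A stream is a suspended Cons cell; there is no base case. *)
datatype 'a stream = Stream of 'a cell susp
and 'a cell = Cons of 'a * 'a stream

fun expose (Stream s) = force s

fun shd s = case expose s of Cons (h, _) => h
fun stl s = case expose s of Cons (_, t) => t

(* The "lazier" tail: nothing is forced until the result is examined. *)
fun lstl s = Stream (delay (fn () => expose (stl s)))

fun smap f s =
    Stream (delay (fn () =>
      case expose s of Cons (x, t) => Cons (f x, smap f t)))

fun sfilter pred s =
    Stream (delay (fn () =>
      let
        (* skip elements until one satisfies pred *)
        fun loop s' =
            case expose s' of
                Cons (x, t) =>
                  if pred x then Cons (x, sfilter pred t) else loop t
      in
        loop s
      end))

fun sieve s =
    Stream (delay (fn () =>
      case expose s of
          Cons (x, t) =>
            Cons (x, sieve (sfilter (fn n => n mod x <> 0) t))))

(* from n is the stream n, n+1, n+2, ... *)
fun from n = Stream (delay (fn () => Cons (n, from (n + 1))))

fun take 0 _ = []
  | take n s = case expose s of Cons (x, t) => x :: take (n - 1) t

val primes = sieve (from 2)
```

In this encoding the by-need behavior is explicit: force overwrites the cell with Forced v the first time it runs, so later inspections of the same stream cell do not repeat the work. Unlike val rec lazy, this sketch builds from n by ordinary recursion and does not create a truly circular structure.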
Chapter 16 Equality and Equality Types

16.1 Sample Code

Here is the code for this chapter.
Chapter 17 Concurrency

Concurrent ML (CML) is an extension of Standard ML with mechanisms for concurrent programming. It is available as part of the Standard ML of New Jersey compiler. The eXene Library for programming the X windows system is based on CML.

17.1 Sample Code

Here is the code for this chapter.
Part III
The Module Language
The Standard ML module language comprises the mechanisms for structuring programs into separate units. Program units are called structures. A structure consists of a collection of components, including types and values, that constitute the unit. Composition of units to form a larger unit is mediated by a signature, which describes the components of that unit. A signature may be thought of as the type of a unit. Large units may be structured into hierarchies using substructures. Generic, or parameterized, units may be defined as functors.
Chapter 18
Signatures and Structures

The fundamental constructs of the ML module system are signatures and structures. A signature may be thought of as a description of a structure, and a structure may correspondingly be thought of as an implementation of a signature. Many languages have similar constructs, often with different names. Signatures are often called interfaces or package specifications, and structures are often called implementations or packages. Whatever the terminology, the main idea is to assign a type to a body of code as a whole.

Unlike many languages, however, the relationship between signatures and structures is many-to-many, rather than many-to-one or one-to-one. A signature may describe many different structures, and a structure may satisfy many different signatures. Thus, strictly speaking, it does not make sense to speak of the signature of a structure or the structure matching a signature, because there can be more than one in each case. By contrast many languages impose much more stringent conditions such as requiring that each structure have a unique signature, or that each signature arise from a unique structure. This is not the case for ML.

18.1 Signatures

A signature is a specification, or a description, of a program unit, or structure. Structures consist of declarations of type constructors, exception constructors, and value bindings. A signature specifies some requirements on a structure, such as what type components it must have, and what
value components it must have and what must be their types. A structure matches, or implements, a signature iff it meets these requirements in a sense that will be made precise below. The requirements are to be thought of as descriptive in that the structure may meet more stringent requirements than are specified by a signature by, for example, having more components than are specified, but the structure nevertheless matches any less stringent specification. (As a limiting case, any structure matches the null signature that imposes no requirements on it!)

18.1.1 Basic Signatures

A basic signature expression has the form sig specs end, where specs is a sequence of specifications. There are four basic forms of specification that may occur in specs:¹

1. A type specification of the form

       type (tyvar1, ..., tyvarn) tycon [ = typ ],

   where the definition typ of tycon may or may not be present.

2. A datatype specification, which has precisely the same form as a datatype declaration.

3. An exception specification of the form

       exception excon [ of typ ],

   where the type typ of excon may or may not be present.

4. A value specification of the form val id : typ.

Each specification may refer to the type constructors introduced earlier in the sequence. No component may be specified more than once.

¹ There are two other forms of specification beyond these four, substructure specifications and sharing specifications. These will be discussed in chapter 21.

Signatures may be given names using a signature binding

    signature sigid = sigexp,
where sigid is a signature identifier and sigexp is a signature expression. Signature identifiers are abbreviations for the signatures to which they are bound. In practice we nearly always bind signature expressions to identifiers and refer to them by name.

Here is an illustrative example of a signature definition. We will refer back to this definition often in the rest of this chapter.

    signature QUEUE =
      sig
        type 'a queue
        exception Empty
        val empty : 'a queue
        val insert : 'a * 'a queue -> 'a queue
        val remove : 'a queue -> 'a * 'a queue
      end

The signature QUEUE specifies a structure that must provide

1. a unary type constructor 'a queue,
2. a nullary exception Empty,
3. a polymorphic value empty of type 'a queue,
4. two polymorphic functions, insert and remove, with the specified type schemes.

Queues are polymorphic in the type of their entries: the same operations are used regardless of the entry type.

18.1.2 Signature Inheritance

Signatures may be built up from one another using two principal tools, signature inclusion and signature specialization. Each is a form of inheritance in which a new signature is created by enriching another signature with additional information.

Signature inclusion is used to add more components to an existing signature. For example, if we wish to add an emptiness test to the signature QUEUE we might define the augmented signature, QUEUE_WITH_EMPTY, using the following signature binding:
    signature QUEUE_WITH_EMPTY =
      sig
        include QUEUE
        val is_empty : 'a queue -> bool
      end

As the notation suggests, the signature QUEUE is included into the body of the signature QUEUE_WITH_EMPTY, and an additional component is added.

It is not strictly necessary to use include to define this signature. Indeed, we may define it directly using the following signature binding:

    signature QUEUE_WITH_EMPTY =
      sig
        type 'a queue
        exception Empty
        val empty : 'a queue
        val insert : 'a * 'a queue -> 'a queue
        val remove : 'a queue -> 'a * 'a queue
        val is_empty : 'a queue -> bool
      end

There is no semantic difference between the two definitions of QUEUE_WITH_EMPTY. Signature inclusion is a convenience that documents the "history" of how the more refined signature was created.

Signature specialization is used to augment an existing signature with additional type definitions. For example, if we wish to refine the signature QUEUE to specify that the type constructor 'a queue must be defined as a pair of lists, we may proceed as follows:

    signature QUEUE_AS_LISTS =
      QUEUE where type 'a queue = 'a list * 'a list

The where type clause "patches" the signature QUEUE by adding a definition for the type constructor 'a queue.

The signature QUEUE_AS_LISTS may also be defined directly as follows:
    signature QUEUE_AS_LISTS =
      sig
        type 'a queue = 'a list * 'a list
        exception Empty
        val empty : 'a queue
        val insert : 'a * 'a queue -> 'a queue
        val remove : 'a queue -> 'a * 'a queue
      end

A where type clause may not be used to redefine a type that is already defined in a signature. For example, the following is illegal:

    signature QUEUE_AS_LISTS_AS_LIST =
      QUEUE_AS_LISTS where type 'a queue = 'a list

If you wish to replace the definition of a type constructor in a signature with another definition using where type, you must go back to a common ancestor in which that type is not yet defined.

    signature QUEUE_AS_LIST =
      QUEUE where type 'a queue = 'a list

Two signatures are said to be equivalent iff they differ only up to the type equivalences induced by type abbreviations. For example, the signature QUEUE where type 'a queue = 'a list is equivalent to the signature

    signature QUEUE_AS_LIST =
      sig
        type 'a queue = 'a list
        exception Empty
        val empty : 'a list
        val insert : 'a * 'a list -> 'a list
        val remove : 'a list -> 'a * 'a list
      end

Within the scope of the definition of the type 'a queue as 'a list, the two are equivalent, and hence the specifications of the value components are equivalent.
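To see inclusion and specialization working together, here is a small sketch under stated assumptions. The signature name LIST_QUEUE_WITH_EMPTY and the structure ListQueue are hypothetical, introduced only for illustration. The sketch also uses a structure with a transparently ascribed signature, both of which are explained later in the text.

```sml
signature QUEUE =
  sig
    type 'a queue
    exception Empty
    val empty : 'a queue
    val insert : 'a * 'a queue -> 'a queue
    val remove : 'a queue -> 'a * 'a queue
  end

(* Specialize the common ancestor QUEUE, in which 'a queue is not yet defined. *)
signature QUEUE_AS_LIST = QUEUE where type 'a queue = 'a list

(* Hypothetical: inclusion applied on top of the specialized signature. *)
signature LIST_QUEUE_WITH_EMPTY =
  sig
    include QUEUE_AS_LIST
    val is_empty : 'a queue -> bool
  end

(* Hypothetical implementation: a queue is a list with the front at the head. *)
structure ListQueue : LIST_QUEUE_WITH_EMPTY =
  struct
    type 'a queue = 'a list
    exception Empty
    val empty = nil
    fun insert (x, q) = q @ [x]
    fun remove nil = raise Empty
      | remove (x :: q) = (x, q)
    fun is_empty q = null q
  end

val (first, rest) =
  ListQueue.remove (ListQueue.insert (2, ListQueue.insert (1, ListQueue.empty)))
```

Because the type definition is part of the signature, the binding of rest may be used as an ordinary int list.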
18.2 Structures

A structure is a unit of program consisting of a sequence of declarations of types, exceptions, and values. Structures are implementations of signatures; signatures are the types of structures.

18.2.1 Basic Structures

The basic form of structure is an encapsulated sequence of declarations of the form struct decs end. The declarations in decs are of one of the following four forms:

1. A type declaration defining a type constructor.
2. A datatype declaration defining a new datatype.
3. An exception declaration defining an exception with a specified argument type (if any).
4. A value declaration defining a new value variable with a specified type.

These are, of course, just the declarations that were introduced in Part II of this book.

A structure expression is well-formed iff it consists of a well-formed sequence of well-formed declarations (according to the rules given in Part II). A structure expression is evaluated by evaluating each of the declarations within it, in the order given. This amounts to evaluating the right-hand sides of each value declaration in turn to determine its value, which is then bound to the corresponding value identifier. Any effects that arise from evaluating a value binding occur when the structure expression is evaluated. A structure value is a structure expression in which all bindings are fully evaluated.

A structure may be bound to a structure identifier using a structure binding of the form

    structure strid = strexp

This declaration defines strid to stand for the value of strexp. Such a declaration is well-formed exactly when strexp is well-formed. It is evaluated by
evaluating the right-hand side, and binding the resulting structure value to strid.

Here is an example of a structure binding:

    structure Queue =
      struct
        type 'a queue = 'a list * 'a list
        exception Empty
        val empty = (nil, nil)
        fun insert (x, (b,f)) = (x::b, f)
        fun remove (nil, nil) = raise Empty
          | remove (bs, nil) = remove (nil, rev bs)
          | remove (bs, f::fs) = (f, (bs, fs))
      end

Recall that a fun binding is really an abbreviation for a val rec binding, and hence constitutes a value binding (of function type).

18.2.2 Long and Short Identifiers

Once a structure has been bound to a structure identifier, we may access its components using paths, or long identifiers, or qualified names. A path has the form strid.id.² It stands for the id component of the structure bound to strid. For example, Queue.empty refers to the empty component of the structure Queue. It has type 'a Queue.queue (note well the syntax!), stating that it is a polymorphic value whose type is built up from the unary type constructor Queue.queue. Similarly, the function Queue.insert has type 'a * 'a Queue.queue -> 'a Queue.queue and Queue.remove has type 'a Queue.queue -> 'a * 'a Queue.queue.

Type definitions permeate structure boundaries. Unless special steps are taken, the definitions of types within a structure determine the definitions of the long identifiers that refer to those types within a structure. We will shortly introduce the means for limiting the visibility of type definitions. For example, the type 'a Queue.queue is equivalent to the type 'a list * 'a list because it is defined to be so in the structure Queue. Consequently, it is correct to write an expression such as

² In chapter 21, we will generalize this to admit an arbitrary sequence of strid's separated by a dot.
    val q = Queue.insert (1, ([6,5,4],[1,2,3]))

even though the list [6,5,4] was not obtained by using the operations from the structure Queue. This is because the type int Queue.queue is equivalent to the type int list * int list, and hence the call to insert is well-typed.

The use of long identifiers can get out of hand, cluttering the program, rather than clarifying it. Suppose that we are frequently using the Queue operations in a program, so that the code is cluttered with calls to Queue.empty, Queue.insert, and Queue.remove. One way to reduce clutter is to introduce a structure abbreviation of the form

    structure strid = strid'

that introduces one structure identifier as an abbreviation for another. For example, after declaring

    structure Q = Queue

we may write Q.empty, Q.insert, and Q.remove, rather than the more verbose forms mentioned above.

Another way to reduce clutter is to open the structure Queue to incorporate its bindings directly into the current environment. An open declaration has the form

    open strid1 ... stridn

which incorporates the bindings from the given structures in left-to-right order (later structures override earlier ones when there is overlap). For example, the declaration

    open Queue

incorporates the body of the structure Queue into the current environment so that we may write just empty, insert, and remove, without qualification, to refer to the corresponding components of the structure Queue.

Although this is surely convenient, using open has its disadvantages. One is that we cannot simultaneously open two structures that have a component with the same name. For example, if we write

    open Queue Stack

where the structure Stack also has a component empty, then uses of empty (without qualification) will stand for Stack.empty, not for Queue.empty.
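The following sketch puts long identifiers, a structure abbreviation, and open side by side. The structure Stack shown here is a hypothetical stand-in, since the text does not define one. The open is confined to a local declaration, which keeps the shadowing of empty from leaking into the rest of the program.

```sml
structure Queue =
  struct
    type 'a queue = 'a list * 'a list
    exception Empty
    val empty = (nil, nil)
    fun insert (x, (bs, fs)) = (x :: bs, fs)
    fun remove (nil, nil) = raise Empty
      | remove (bs, nil) = remove (nil, rev bs)
      | remove (bs, f :: fs) = (f, (bs, fs))
  end

(* Hypothetical second structure that also has a component named empty. *)
structure Stack =
  struct
    type 'a stack = 'a list
    val empty = nil
    fun push (x, s) = x :: s
  end

(* A structure abbreviation shortens the paths without opening anything. *)
structure Q = Queue
val q = Q.insert (2, Q.insert (1, Q.empty))
val (front, _) = Q.remove q

(* Stack is opened last, so the unqualified empty is Stack.empty. *)
local
  open Queue Stack
in
  val s = push (3, empty)
end
```

Had empty still referred to Queue.empty, the call to push would have been rejected by the type checker, since a pair of lists is not a list.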
Another problem with open is that it is hard to control its behavior, since it incorporates the entire body of a structure, and hence may inadvertently shadow identifiers that happen to be also used in the structure. For example, if the structure Queue happened to define an auxiliary function helper, that function would also be incorporated into the current environment by the declaration open Queue, which may not have been intended. This turns out to be a source of many bugs; it is best to use open sparingly, and only then in a let or local declaration (to limit the damage).

18.3 Sample Code

Here is the code for this chapter.
Chapter 19
Signature Matching

When does a structure implement a signature? The structure must provide all of the components and satisfy all of the type definitions required by the signature. Type components must be provided with the same number of arguments and with an equivalent definition (if any) to that given in the signature. Value components must be present with the type specified in the signature, up to the definitions of any preceding types. Exception components must be present with the type of argument (if any) specified in the signature.

These simple principles have a number of important consequences.

• To minimize bureaucracy, a structure may provide more components than are strictly required by the signature. If a signature requires components x, y, and z, it is sufficient for the structure to provide x, y, z, and w.

• To enhance reuse, a structure may provide values with more general types than are required by the signature. If a signature demands a function of type int->int, it is enough to provide a function of type 'a->'a.

• To avoid over-specification, a datatype may be provided where a type is required, and a value constructor may be provided where a value is required.

• To increase flexibility, a structure may consist of declarations presented in any sensible order, not just the order specified in the signature, provided that the requirements of the specification are met.
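A single small structure can exhibit all four consequences at once. In the sketch below the names SPEC, M, and its components are hypothetical, chosen only to mirror the bullets above. The structure is attached to the signature by transparent ascription, which is described in chapter 20.

```sml
signature SPEC =
  sig
    type t
    val A : t
    val f : int -> int
    val y : int
  end

structure M : SPEC =
  struct
    val y = 0                (* declared first, although specified last *)
    datatype t = A | B       (* a datatype where only a type is required;
                                the constructor A meets the specification val A : t *)
    fun f x = x              (* has type 'a -> 'a, where int -> int is required *)
    val w = "not in SPEC"    (* an extra component, not mentioned in SPEC *)
  end

val three = M.f 3
```

After the ascription only the components named in SPEC remain visible, so neither M.w nor M.B can be mentioned by clients.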
19.1 Principal Signatures

There is a most stringent, or most precise, signature for a structure, called its principal signature. A structure may be considered to match a signature exactly when the specified signature is no more restrictive than the principal signature of the structure. To determine whether a structure matches a signature, it is enough to check whether the principal signature of that structure is no weaker than the specified signature. For the purposes of type checking, the principal signature is the official proxy for the structure. We need never examine the code of the structure during type checking, once its principal signature has been determined.

A structure expression is assigned a principal signature by a component-by-component analysis of its constituent declarations. The principal signature of a structure is obtained as follows:¹

1. Corresponding to a declaration of the form

       type (tyvar1, ..., tyvarn) tycon = typ,

   the principal signature contains the specification

       type (tyvar1, ..., tyvarn) tycon = typ

   The principal signature includes the definition of tycon.

2. Corresponding to a declaration of the form

       datatype (tyvar1, ..., tyvarn) tycon =
         con1 of typ1 | ... | conk of typk

   the principal signature contains the specification

       datatype (tyvar1, ..., tyvarn) tycon =
         con1 of typ1 | ... | conk of typk

   The specification is identical to the declaration.

¹ These rules gloss over some technical complications that arise only in unusual circumstances. See The Definition of Standard ML [3] for complete details.

3. Corresponding to a declaration of the form
       exception id of typ

   the principal signature contains the specification

       exception id of typ

4. Corresponding to a declaration of the form

       val id = exp

   the principal signature contains the specification

       val id : typ

   where typ is the principal type of the expression exp (relative to the preceding declarations).

In brief, the principal signature contains all of the type definitions, datatype definitions, and exception bindings of the structure, plus the principal types of its value bindings.

19.2 Matching

A candidate signature sigexp_c is said to match a target signature sigexp_t iff sigexp_c has all of the components and all of the type equations specified by sigexp_t. More precisely,

1. Every type constructor in the target must also be present in the candidate, with the same arity (number of arguments) and an equivalent definition (if any).

2. Every datatype in the target must be present in the candidate, with equivalent types for the value constructors.

3. Every exception in the target must be present in the candidate, with an equivalent argument type.

4. Every value in the target must be present in the candidate, with at least as general a type.
The candidate may have additional components not mentioned in the target, or satisfy additional type equations not required in the target, but it cannot have fewer of either. The target signature may therefore be seen as a weakening of the candidate signature, since every property required by the target also holds of the candidate.

The matching relation is reflexive (every signature matches itself) and transitive (if sigexp1 matches sigexp2 and sigexp2 matches sigexp3, then sigexp1 matches sigexp3). Two signatures are equivalent (in the sense of chapter 18) iff each matches the other, which is to say that they are equivalently restrictive.

It will be helpful to consider some examples. Recall the following signatures from chapter 18.

    signature QUEUE =
      sig
        type 'a queue
        exception Empty
        val empty : 'a queue
        val insert : 'a * 'a queue -> 'a queue
        val remove : 'a queue -> 'a * 'a queue
      end

    signature QUEUE_WITH_EMPTY =
      sig
        include QUEUE
        val is_empty : 'a queue -> bool
      end

    signature QUEUE_AS_LISTS =
      QUEUE where type 'a queue = 'a list * 'a list

The signature QUEUE_WITH_EMPTY matches the signature QUEUE, because all of the requirements of QUEUE are met by QUEUE_WITH_EMPTY. The converse does not hold, because QUEUE lacks the component is_empty, which is required by QUEUE_WITH_EMPTY.

The signature QUEUE_AS_LISTS matches the signature QUEUE. It is identical to QUEUE, apart from the additional specification of the type 'a queue. The converse fails, because the signature QUEUE does not satisfy the requirement that 'a queue be equivalent to 'a list * 'a list.
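One way to have a compiler confirm these matching claims is to write an identity functor whose parameter has the candidate signature and whose result has the target signature; it type checks only if the candidate matches the target. This is merely a sketch: functors are treated later in the book, and the names WithEmptyToQueue, AsListsToQueue, and TwoLists are hypothetical.

```sml
signature QUEUE =
  sig
    type 'a queue
    exception Empty
    val empty : 'a queue
    val insert : 'a * 'a queue -> 'a queue
    val remove : 'a queue -> 'a * 'a queue
  end

signature QUEUE_WITH_EMPTY =
  sig
    include QUEUE
    val is_empty : 'a queue -> bool
  end

signature QUEUE_AS_LISTS =
  QUEUE where type 'a queue = 'a list * 'a list

(* Each functor is accepted only because its parameter signature
   matches its result signature. *)
functor WithEmptyToQueue (X : QUEUE_WITH_EMPTY) : QUEUE = X
functor AsListsToQueue (X : QUEUE_AS_LISTS) : QUEUE = X

(* The converse direction is rejected, since QUEUE lacks is_empty:
   functor QueueToWithEmpty (X : QUEUE) : QUEUE_WITH_EMPTY = X
*)

(* Hypothetical implementation matching both candidate signatures. *)
structure TwoLists =
  struct
    type 'a queue = 'a list * 'a list
    exception Empty
    val empty = (nil, nil)
    fun insert (x, (bs, fs)) = (x :: bs, fs)
    fun remove (nil, nil) = raise Empty
      | remove (bs, nil) = remove (nil, rev bs)
      | remove (bs, f :: fs) = (f, (bs, fs))
    fun is_empty (bs, fs) = null bs andalso null fs
  end

structure Q1 = WithEmptyToQueue (TwoLists)
structure Q2 = AsListsToQueue (TwoLists)
val (x1, _) = Q1.remove (Q1.insert (7, Q1.empty))
val (x2, _) = Q2.remove (Q2.insert (8, Q2.empty))
```

Uncommenting the third functor should produce a type error, which is the compiler's way of reporting that QUEUE does not match QUEUE_WITH_EMPTY.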
Matching does not distinguish between equivalent signatures. For example, consider the following signature:

    signature QUEUE_AS_LIST =
      sig
        type 'a queue = 'a list
        exception Empty
        val empty : 'a list
        val insert : 'a * 'a list -> 'a list
        val remove : 'a list -> 'a * 'a list
        val is_empty : 'a list -> bool
      end

At first glance you might think that this signature does not match the signature QUEUE, since the components of QUEUE_AS_LIST have superficially dissimilar types from those in QUEUE. However, the signature QUEUE_AS_LIST is equivalent to the signature QUEUE_WITH_EMPTY where type 'a queue = 'a list, which matches QUEUE for reasons noted earlier. Therefore, QUEUE_AS_LIST matches QUEUE as well.

Signature matching may also involve instantiation of polymorphic types. The types of values in the candidate may be more general than required by the target. For example, the signature

    signature MERGEABLE_QUEUE =
      sig
        include QUEUE
        val merge : 'a queue * 'a queue -> 'a queue
      end

matches the signature

    signature MERGEABLE_INT_QUEUE =
      sig
        include QUEUE
        val merge : int queue * int queue -> int queue
      end

because the polymorphic type of merge in MERGEABLE_QUEUE instantiates to its type in MERGEABLE_INT_QUEUE.

Finally, a datatype specification matches a signature that specifies a type with the same name and arity (but no definition), and zero or more
value components corresponding to some (or all) of the value constructors of the datatype. The types of the value components must match exactly the types of the corresponding value constructors; no specialization is allowed in this case. For example, the signature

    signature RBT_DT =
      sig
        datatype 'a rbt =
          Empty
        | Red of 'a rbt * 'a * 'a rbt
        | Black of 'a rbt * 'a * 'a rbt
      end

matches the signature

    signature RBT =
      sig
        type 'a rbt
        val Empty : 'a rbt
        val Red : 'a rbt * 'a * 'a rbt -> 'a rbt
      end

The signature RBT specifies the type 'a rbt as abstract, and includes two value specifications that are met by value constructors in the signature RBT_DT.

One way to understand this is to mentally rewrite the signature RBT_DT in the (fanciful) form²

    signature RBT_DTS =
      sig
        type 'a rbt
        con Empty : 'a rbt
        con Red : 'a rbt * 'a * 'a rbt -> 'a rbt
        con Black : 'a rbt * 'a * 'a rbt -> 'a rbt
      end

The rule is simply that a val specification may be matched by a con specification.

² Unfortunately, the "signature" RBT_DTS is not a legal ML signature!
19.3 Satisfaction

Returning to the motivating question of this chapter, a candidate structure implements a target signature iff the principal signature of the candidate structure matches the target signature. By the reflexivity of the matching relation it is immediate that a structure satisfies its principal signature. Therefore any signature implemented by a structure is weaker than the principal signature of that structure. That is, the principal signature is the strongest signature implemented by a structure.

19.4 Sample Code

Here is the code for this chapter.
Chapter 20
Signature Ascription

Signature ascription imposes the requirement that a structure implement a signature and, in so doing, weakens the signature of that structure for all subsequent uses of it. There are two forms of ascription in ML. Both require that a structure implement a signature; they differ in the extent to which the assigned signature of the structure is weakened by the ascription.

1. Transparent, or descriptive ascription. The structure is assigned the signature obtained by propagating type definitions from the principal signature to the target signature.

2. Opaque, or restrictive ascription. The structure is assigned the target signature as is, without propagating any type definitions.

In either case the components of the structure are cut down to those specified in the signature; the only difference is whether type definitions are propagated from the principal signature or not.

20.1 Ascribed Structure Bindings

The most common form of signature ascription is in a structure binding. There are two forms, the transparent

    structure strid : sigexp = strexp

and the opaque
    structure strid :> sigexp = strexp

Transparent ascription is written using a single colon, ":", whereas opaque ascription is written using ":>".

Ascribed structure bindings are type checked as follows. First, we check that strexp implements sigexp according to the rules given in chapter 19. The principal signature sigexp0 of strexp is determined, and checked to match sigexp. The matching process determines an augmentation sigexp' of sigexp in which we propagate type equations from the principal signature, sigexp0, to sigexp. Second, the structure identifier is assigned a signature based on the form of ascription. For an opaque ascription, it is assigned the signature sigexp; for a transparent ascription, it is assigned the signature sigexp'.

Ascribed structure bindings are evaluated by first evaluating strexp. Then a view of the resulting value is formed by dropping all components that are not present in the target signature, sigexp. The structure variable strid is bound to the view. The formation of the view ensures that the components of a structure may always be accessed in constant time, and that there are no "space leaks" because of components that are present in the structure, but not in the signature.

20.2 Opaque Ascription

The primary use of opaque ascription is to enforce data abstraction. A good example is provided by the implementation of queues as, say, pairs of lists.

    structure Queue :> QUEUE =
      struct
        type 'a queue = 'a list * 'a list
        val empty = (nil, nil)
        fun insert (x, (bs, fs)) = (x::bs, fs)
        exception Empty
        fun remove (nil, nil) = raise Empty
          | remove (bs, f::fs) = (f, (bs, fs))
          | remove (bs, nil) = remove (nil, rev bs)
      end
The use of opaque ascription ensures that the type 'a Queue.queue is abstract. No definition is provided for it in the signature QUEUE, and therefore it has no definition in terms of other types of the language. For the type 'a Queue.queue to be abstract means that the only operations that may be performed on values of that type are empty, insert, and remove. Importantly, we may not make use of the fact that a queue is really a pair of lists on the grounds that it is implemented this way. We have obscured this fact by opaquely ascribing a signature that does not provide a definition for the type 'a queue. All clients of the structure Queue are insulated from the details of how queues are implemented. Consequently, the implementation of queues can be changed without breaking any client code, as long as the new implementation satisfies the same signature.

Hiding the representation of a type allows us to isolate the enforcement of representation invariants on a data structure. We may think of the type 'a Queue.queue as the type of states of an abstract machine whose sole instructions are empty (the initial state), insert, and remove. Internally to the structure Queue we may wish to impose invariants on the internal state of the machine. The beauty of data abstraction is that it provides an elegant means of enforcing such invariants, called the assume-ensure, or rely-guarantee, method. It reduces the enforcement of representation invariants to these two requirements:

1. All initialization instructions must ensure that the invariant holds true of the machine state after execution.

2. All state transition instructions may assume that the invariant holds of the input states, and must ensure that it holds of the output state.

By induction on the number of "instructions" executed, the invariant must hold for all states; it must really be invariant!

Suppose that we wish to implement an abstract type of priority queues for an arbitrary element type. The queue operations are no longer polymorphic in the element type because they actually "touch" the elements to determine their relative priorities. Here is a possible signature for priority queues that expresses this dependency:¹

¹ In chapter 21 we'll introduce better means for structuring this module, but the central points discussed here will not be affected.
    signature PQ =
      sig
        type elt
        val lt : elt * elt -> bool
        type queue
        exception Empty
        val empty : queue
        val insert : elt * queue -> queue
        val remove : queue -> elt * queue
      end

Now let us consider an implementation of priority queues in which the elements are taken to be strings. Since priority queues form an abstract type, we would expect to use opaque ascription to ensure that its representation is hidden. This suggests an implementation along these lines:

    structure PrioQueue :> PQ =
      struct
        type elt = string
        val lt : string * string -> bool = (op <)
        type queue = ...
        ...
      end

But not only is the type PrioQueue.queue abstract, so is PrioQueue.elt! This leaves us no means of creating a value of type PrioQueue.elt, and hence we can never call PrioQueue.insert. The problem is that the interface is "too abstract": it should only obscure the identity of the type queue, and not that of the type elt. The solution is to augment the signature PQ with a definition for the type elt, then opaquely ascribe this to PrioQueue:

    signature STRING_PQ = PQ where type elt = string
    structure PrioQueue :> STRING_PQ = ...

Now the type PrioQueue.elt is equivalent to string, and we may call PrioQueue.insert with a string, as expected.

The moral is that there is always an element of judgement involved in deciding which types to hold abstract, and which to leave transparent. In the
case of priority queues, the determining factor is that we specified only the operations on elt that were required for the implementation of priority queues, and no others. This means that elt could not usefully be held abstract, but must instead be specified in the signature. On the other hand the operations on queues are intended to be complete, and so we hold the type abstract.

20.3 Transparent Ascription

Transparent ascription cuts down on the need for explicit specification of type definitions in signatures. As we remarked earlier, we can always replace uses of transparent ascription by a use of opaque ascription with a hand-crafted augmented signature. This can become burdensome. On the other hand, excessive use of transparent ascription impedes modular programming by exposing type information that would better be left abstract.

The prototypical use of transparent ascription is to form a view of a structure that eliminates the components that are not necessary in a given context without obscuring the identities of its type components. Consider the signature ORDERED defined as follows:

    signature ORDERED =
      sig
        type t
        val lt : t * t -> bool
      end

This signature specifies a type t equipped with a comparison operation lt. It should be clear that it would not make sense to opaquely ascribe this signature to a structure. Doing so would preclude ever calling the lt operation, for there would be no means of creating values of type t. Such a signature is only useful once it has been augmented with a definition for the type t. This is precisely what transparent ascription does for you. For example, consider the following structure binding:
    structure String : ORDERED =
      struct
        type t = string
        val clt = Char.<
        fun lt (s, t) = ... clt ...
      end

This structure implements string comparison in terms of character comparison (say, to implement the lexicographic ordering of strings). Ascription of the signature ORDERED ensures two things:

1. The auxiliary function clt is pruned out of the structure. It was intended for internal use, and was not meant to be externally visible.

2. The type String.t is equivalent to string, even though this fact is not present in the signature ORDERED. Transparent ascription computes an augmentation of ORDERED with this definition exposed.

The "true" signature of String is the signature

    ORDERED where type t = string

which makes clear the underlying definition of t.

A related use of transparent ascription is to document an interpretation of a type without rendering it abstract. For example, we may wish to consider the integers ordered in two different ways, one by the standard arithmetic comparison, the other by divisibility. We might make the following declarations to express this:

    structure IntLt : ORDERED =
      struct
        type t = int
        val lt = (op <)
      end

    structure IntDiv : ORDERED =
      struct
        type t = int
        fun lt (m, n) = (n mod m = 0)
      end
The ascription specifies the interpretation of int as partially ordered, in two senses, but does not hide the type of elements. In particular, IntLt.t and IntDiv.t are both equivalent to int.

20.4 Sample Code

Here is the code for this chapter.
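As a small illustration of the transparent ascription discussed in this chapter, the following sketch re-declares ORDERED, IntLt, and IntDiv from the text and adds one structure, IntLtOpaque, which is a hypothetical name introduced here only for contrast. It is meant to show, under those assumptions, that the transparently ascribed structures can be applied to ordinary integers, whereas the opaquely ascribed one cannot.

```sml
(* ORDERED, IntLt and IntDiv as given in the text. *)
signature ORDERED =
sig
  type t
  val lt : t * t -> bool
end

(* Transparent ascription: IntLt.t is still known to be int. *)
structure IntLt : ORDERED =
struct
  type t = int
  val lt = (op <)
end

structure IntDiv : ORDERED =
struct
  type t = int
  fun lt (m, n) = (n mod m = 0)
end

(* Hypothetical structure, not in the text: the same body, opaquely ascribed.
   Its type t is abstract, so no client can ever build an argument for lt. *)
structure IntLtOpaque :> ORDERED =
struct
  type t = int
  val lt = (op <)
end

val a = IntLt.lt (3, 4)     (* well-typed because IntLt.t = int *)
val b = IntDiv.lt (3, 12)   (* 3 divides 12 *)
val c = IntDiv.lt (5, 12)   (* 5 does not divide 12 *)

(* IntLtOpaque.lt (3, 4) would be rejected by the type checker,
   since 3 does not have the abstract type IntLtOpaque.t. *)
```

The commented-out application at the end is the situation the text describes as pointless: with opaque ascription there would be no means of creating values of type t.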
Chapter 21 Module Hierarchies

So far we have confined ourselves to considering “flat” modules consisting of a linear sequence of declarations of types, exceptions, and values. As programs grow in size and complexity, it becomes important to introduce further structuring mechanisms to support their growth. The ML module language also supports module hierarchies, tree-structured configurations of modules that reflect the architecture of a large system.

21.1 Substructures

A substructure is a “structure within a structure”. Structure bindings (either opaque or transparent) are admitted as components of other structures. Structure specifications of the form

    structure strid : sigexp

may appear in signatures. There is no distinction between transparent and opaque specifications in a signature, because there is no structure to ascribe!

The type checking and evaluation rules for structures are extended to substructures recursively. The principal signature of a substructure binding is determined according to the rules given in chapter 19. A substructure binding in one signature matches the corresponding one in another iff their signatures match according to the rules in chapter 19. Evaluation of a substructure binding consists of evaluating the structure expression, then binding the resulting structure value to that identifier.
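The following sketch shows what a substructure specification and a matching substructure binding can look like in practice. The names COUNTER, LOGGER, Logger, Count, and stamp are hypothetical and chosen only for illustration; they do not appear elsewhere in these notes.

```sml
(* A small signature that will be used as the signature of a substructure. *)
signature COUNTER =
sig
  type t
  val zero : t
  val next : t -> t
  val toInt : t -> int
end

signature LOGGER =
sig
  structure Count : COUNTER            (* substructure specification *)
  val stamp : Count.t * string -> string
end

structure Logger : LOGGER =
struct
  (* Substructure binding: evaluated first, then bound to Count. *)
  structure Count : COUNTER =
  struct
    type t = int
    val zero = 0
    fun next n = n + 1
    fun toInt n = n
  end

  fun stamp (n, msg) = Int.toString (Count.toInt n) ^ ": " ^ msg
end

(* Components of a substructure are reached with a longer qualified name. *)
val line = Logger.stamp (Logger.Count.next Logger.Count.zero, "started")
```

Matching Logger against LOGGER proceeds recursively: the Count component is matched against COUNTER by the same rules that apply to any structure.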
To see how substructures arise in practice, consider the following programming scenario. The first version of a system makes use of a polymorphic dictionary data structure whose search keys are strings. The signature for such a data structure might be as follows:

    signature MY_STRING_DICT =
      sig
        type 'a dict
        val empty : 'a dict
        val insert : 'a dict * string * 'a -> 'a dict
        val lookup : 'a dict * string -> 'a option
      end

The return type of lookup is 'a option, since there may be no entry in the dictionary with the specified key.

The implementation of this abstraction looks approximately like this:

    structure MyStringDict :> MY_STRING_DICT =
      struct
        datatype 'a dict =
            Empty
          | Node of 'a dict * string * 'a * 'a dict
        val empty = Empty
        fun insert (d, k, v) = ...
        fun lookup (d, k) = ...
      end

The omitted implementations of insert and lookup make use of the built-in lexicographic ordering of strings.

The second version of the system requires another dictionary whose keys are integers, leading to another signature and implementation for dictionaries.

    signature MY_INT_DICT =
      sig
        type 'a dict
        val empty : 'a dict
        val insert : 'a dict * int * 'a -> 'a dict
        val lookup : 'a dict * int -> 'a option
      end
    structure MyIntDict :> MY_INT_DICT =
      struct
        datatype 'a dict =
            Empty
          | Node of 'a dict * int * 'a * 'a dict
        val empty = Empty
        fun insert (d, k, v) = ...
        fun lookup (d, k) = ...
      end

The elided implementations of insert and lookup make use of the primitive comparison operations on integers.

At this point we may observe an obvious pattern, that of a dictionary with keys of a specific type. To avoid further repetition we decide to abstract out the key type from the signature so that it can be filled in later.

    signature MY_GEN_DICT =
      sig
        type key
        type 'a dict
        val empty : 'a dict
        val insert : 'a dict * key * 'a -> 'a dict
        val lookup : 'a dict * key -> 'a option
      end

Notice that the dictionary abstraction carries with it the type of its keys.

Specific instances of this generic dictionary signature are obtained using where type.

    signature MY_STRING_DICT = MY_GEN_DICT where type key = string
    signature MY_INT_DICT = MY_GEN_DICT where type key = int

A string dictionary might then be implemented as follows:

    structure MyStringDict :> MY_STRING_DICT =
      struct
        type key = string
        datatype 'a dict =
            Empty
          | Node of 'a dict * key * 'a * 'a dict
        val empty = Empty
        (* only the Empty case of insert is shown *)
        fun insert (Empty, k, v) = Node (Empty, k, v, Empty)
        fun lookup (Empty, _) = NONE
          | lookup (Node (dl, l, v, dr), k) =
              if k < l then (* string comparison *)
                lookup (dl, k)
              else if k > l then (* string comparison *)
                lookup (dr, k)
              else
                SOME v
      end

By a similar process we may build an implementation MyIntDict of the signature MY_INT_DICT, with integer keys ordered by the standard integer comparison operations.

Now suppose that we require a third dictionary, with integers as keys, but ordered according to the divisibility ordering [1]. This implementation, say MyIntDivDict, makes use of modular arithmetic to compare keys, but has the same signature MY_INT_DICT as MyIntDict.

    structure MyIntDivDict :> MY_INT_DICT =
      struct
        type key = int
        datatype 'a dict =
            Empty
          | Node of 'a dict * key * 'a * 'a dict
        fun divides (k, l) = (l mod k = 0)
        val empty = Empty
        (* only the Empty case of insert is shown *)
        fun insert (Empty, k, v) = Node (Empty, k, v, Empty)
        fun lookup (Empty, _) = NONE
          | lookup (Node (dl, l, v, dr), k) =
              if divides (k, l) then (* divisibility test *)
                lookup (dl, k)
              else if divides (l, k) then (* divisibility test *)
                lookup (dr, k)
              else
                SOME v
      end

Footnote 1: For which m < n iff m divides n evenly.
Notice that we required an auxiliary function, divides, to implement the comparison in the required sense.

With this in mind, let us reconsider our initial attempt to consolidate the signatures of the various versions of dictionaries in play. In one sense there is nothing to do: the signature MY_GEN_DICT suffices. However, as we've just seen, the instances of this signature, which are ascribed to particular implementations, do not determine the interpretation. What we'd like to do is to package the type with its interpretation so that the dictionary module is self-contained. Not only does the dictionary module carry with it the type of its keys, but it also carries the interpretation used on that type.

This is achieved by introducing a substructure binding in the dictionary structure. To begin with we first isolate the notion of an ordered type.

    signature ORDERED =
      sig
        type t
        val lt : t * t -> bool
        val eq : t * t -> bool
      end

This signature describes modules that contain a type t equipped with an equality and comparison operation on it. An implementation of this signature specifies the type and the interpretation, as in the following examples.

    (* Lexicographically ordered strings. *)
    structure LexString : ORDERED =
      struct
        type t = string
        val eq = (op =)
        (* the annotation selects the string instance of the overloaded < *)
        val lt : t * t -> bool = (op <)
      end

    (* Integers ordered conventionally. *)
    structure LessInt : ORDERED =
      struct
        type t = int
        val eq = (op =)
        val lt = (op <)
      end

    (* Integers ordered by divisibility. *)
    structure DivInt : ORDERED =
      struct
        type t = int
        fun lt (m, n) = (n mod m = 0)
        fun eq (m, n) = lt (m, n) andalso lt (n, m)
      end
Notice that the use of transparent ascription is very natural here, since ORDERED is not intended as a self-contained abstraction.

The signature of dictionaries is restructured as follows:

    signature DICT =
      sig
        structure Key : ORDERED
        type 'a dict
        val empty : 'a dict
        val insert : 'a dict * Key.t * 'a -> 'a dict
        val lookup : 'a dict * Key.t -> 'a option
      end

The signature DICT includes as a substructure the key type together with its interpretation as an ordered type.

To enforce abstraction we introduce specialized versions of this signature that specify the key type using a where type clause.

    signature STRING_DICT = DICT where type Key.t = string
    signature INT_DICT = DICT where type Key.t = int

These are, respectively, signatures for the abstract type of dictionaries whose keys are strings and integers.

How are these signatures to be implemented? Corresponding to the layering of the signatures, we have a layering of the implementation.
    structure StringDict :> STRING_DICT =
      struct
        structure Key : ORDERED = LexString
        datatype 'a dict =
            Empty
          | Node of 'a dict * Key.t * 'a * 'a dict
        val empty = Empty
        (* only the Empty case of insert is shown *)
        fun insert (Empty, k, v) = Node (Empty, k, v, Empty)
        fun lookup (Empty, _) = NONE
          | lookup (Node (dl, l, v, dr), k) =
              if Key.lt (k, l) then
                lookup (dl, k)
              else if Key.lt (l, k) then
                lookup (dr, k)
              else
                SOME v
      end
Observe that the implementations of insert and lookup make use of the comparison operations Key.lt and Key.eq.

Similarly, we may implement LessIntDict, with the standard ordering, as follows:

    structure LessIntDict :> INT_DICT =
      struct
        structure Key : ORDERED = LessInt
        datatype 'a dict =
            Empty
          | Node of 'a dict * Key.t * 'a * 'a dict
        val empty = Empty
        (* only the Empty case of insert is shown *)
        fun insert (Empty, k, v) = Node (Empty, k, v, Empty)
        fun lookup (Empty, _) = NONE
          | lookup (Node (dl, l, v, dr), k) =
              if Key.lt (k, l) then
                lookup (dl, k)
              else if Key.lt (l, k) then
                lookup (dr, k)
              else
                SOME v
      end
Similarly, dictionaries with integer keys ordered by divisibility may be implemented as follows:

    structure IntDivDict :> INT_DICT =
      struct
        structure Key : ORDERED = DivInt
        datatype 'a dict =
            Empty
          | Node of 'a dict * Key.t * 'a * 'a dict
        val empty = Empty
        (* only the Empty case of insert is shown *)
        fun insert (Empty, k, v) = Node (Empty, k, v, Empty)
        fun lookup (Empty, _) = NONE
          | lookup (Node (dl, l, v, dr), k) =
              if Key.lt (k, l) then
                lookup (dl, k)
              else if Key.lt (l, k) then
                lookup (dr, k)
              else
                SOME v
      end

Taking stock of the development, what we have done is to structure the signature of dictionaries to allow the type of keys, together with its interpretation, to vary from one implementation to another. The Key substructure may be viewed as a “parameter” of the signature DICT that is “instantiated” by specialization to specific types of interest. In this sense substructures subsume the notion of a parameterized signature found in some languages. There are several advantages to this:

1. A signature with one or more substructures is still a complete signature. Parameterized signatures, in contrast, are incomplete signatures that must be completed to be used.

2. Any substructure of a signature may play the role of a “parameter”. There is no need to designate in advance which are “arguments” and which are “results”.

In chapter 23 we will introduce the mechanisms needed to build a generic implementation of dictionaries that may be instantiated by the key type and its ordering.
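To tie the pieces of this section together, here is a self-contained sketch of a dictionary with a Key substructure that can be loaded and run as it stands. The signatures follow the text. The two-clause insert, the equality test performed first in both functions, and the explicit type annotations in LexString are choices made for this sketch and are assumptions rather than the code of these notes.

```sml
signature ORDERED =
sig
  type t
  val lt : t * t -> bool
  val eq : t * t -> bool
end

structure LexString : ORDERED =
struct
  type t = string
  (* Annotations pick the string instances of = and the overloaded <. *)
  val eq : string * string -> bool = (op =)
  val lt : string * string -> bool = (op <)
end

signature DICT =
sig
  structure Key : ORDERED
  type 'a dict
  val empty : 'a dict
  val insert : 'a dict * Key.t * 'a -> 'a dict
  val lookup : 'a dict * Key.t -> 'a option
end

signature STRING_DICT = DICT where type Key.t = string

structure StringDict :> STRING_DICT =
struct
  structure Key : ORDERED = LexString

  datatype 'a dict =
      Empty
    | Node of 'a dict * Key.t * 'a * 'a dict

  val empty = Empty

  (* Sketch: replace the value on an equal key, otherwise descend. *)
  fun insert (Empty, k, v) = Node (Empty, k, v, Empty)
    | insert (Node (dl, l, w, dr), k, v) =
        if Key.eq (k, l) then Node (dl, k, v, dr)
        else if Key.lt (k, l) then Node (insert (dl, k, v), l, w, dr)
        else Node (dl, l, w, insert (dr, k, v))

  fun lookup (Empty, _) = NONE
    | lookup (Node (dl, l, v, dr), k) =
        if Key.eq (k, l) then SOME v
        else if Key.lt (k, l) then lookup (dl, k)
        else lookup (dr, k)
end

val d = StringDict.insert (StringDict.insert (StringDict.empty, "b", 2), "a", 1)
val found = StringDict.lookup (d, "a")
val missing = StringDict.lookup (d, "z")
```

Because every comparison goes through Key, replacing LexString by another implementation of ORDERED with the same key type changes the ordering without touching the body of the dictionary.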
21.2 Sample Code
Here is the code for this chapter.
Chapter 22 Sharing Specifications

In chapter 21 we illustrated the use of substructures to express the dependence of one abstraction on another. In this chapter we will consider the problem of symmetric combination of modules to form larger modules.

22.1 Combining Abstractions

The discussion will be based on a representation of geometry in ML based on the following (drastically simplified) signature.

    signature GEOMETRY =
      sig
        structure Point : POINT
        structure Sphere : SPHERE
      end

For the purposes of this example, we have reduced geometry to two concepts, that of a point in space and that of a sphere.

Points and vectors are fundamental to representing geometry. They are described by the following (abbreviated) signatures:

    signature VECTOR =
      sig
        type vector
        val zero : vector
        val scale : real * vector -> vector
        val add : vector * vector -> vector
        val dot : vector * vector -> real
      end

    signature POINT =
      sig
        structure Vector : VECTOR
        type point
        (* move a point along a vector *)
        val translate : point * Vector.vector -> point
        (* the vector from a to b *)
        val ray : point * point -> Vector.vector
      end

The vector operations support addition, scalar multiplication, and inner product, and include a unit element for addition. The point operations support translation of a point along a vector and the creation of a vector as the “difference” of two points (i.e., the vector from the first to the second).

Spheres are implemented by a module implementing the following (abbreviated) signature:

    signature SPHERE =
      sig
        structure Vector : VECTOR
        structure Point : POINT
        type sphere
        val sphere : Point.point * Vector.vector -> sphere
      end

The operation sphere creates a sphere centered at a given point and with the radius vector given.

These signatures are intentionally designed so that the dimension of the space is not part of the specification. This allows us (using the mechanisms to be introduced in chapter 23) to build packages that work in an arbitrary dimension without requiring run-time conformance checks. It is the structures, and not the signatures, that specify the dimension. Two- and three-dimensional geometry are defined by structure bindings like these:

    structure Geom2D :> GEOMETRY = ...
    structure Geom3D :> GEOMETRY = ...
As a consequence of the use of opaque ascription, the types Geom2D.Point.point and Geom3D.Point.point are distinct. This means that dimensional conformance is enforced by the type checker. For example, we cannot apply Geom3D.Sphere.sphere to a point in two-space and a vector in three-space. This is a good thing: the more static checking we have, the better off we are.

Closer inspection reveals that, unfortunately, we have too much of a good thing. Suppose that p and q are two-dimensional points of type Geom2D.Point.point. We might expect to be able to form a sphere centered at p with radius determined by the vector from p to q:

    Geom2D.Sphere.sphere (p, Geom2D.Point.ray (p, q)).

But this expression is ill-typed! The reason is that the types Geom2D.Point.Vector.vector and Geom2D.Sphere.Vector.vector are also distinct from one another, which is not at all what we intend.

What has gone wrong? The situation is quite subtle. In keeping with the guidelines discussed in section 21.1, we have incorporated as substructures the structures on which a given structure depends. For example, forming a ray from one point to another yields a vector, so an implementation of POINT depends on an implementation of VECTOR. Thus, POINT has a substructure implementing VECTOR, and, similarly, SPHERE has substructures implementing VECTOR and POINT.

This leads to a proliferation of structures. Even in the very simplified geometry signature given above, we have two “copies” of the point abstraction, and three “copies” of the vector abstraction! Since we used opaque ascription to define the two- and three-dimensional implementations of the signature GEOMETRY, all of these abstractions are kept distinct from one another, even though they may be implemented identically.

In a sense this is the correct state of affairs. The various “copies” of, say, the vector abstraction might well be distinct from one another. In the elided implementation of two-dimensional geometry, we might have used completely incompatible notions of vector in each of the three places where they are required. Of course, this may not be what is intended, but (so far) there is nothing in the signature to prevent it. Hence, we are compelled to keep these types distinct.

What is missing is the expression of the intention that the various “copies” of vectors and points within the geometry abstraction be identical, so that we can mix-and-match the vectors constructed in various components of
the package. To support this it is necessary to constrain the implementation to use the same notion of vector throughout. This is achieved using a type sharing constraint. The revised signatures for the geometry package look like this:

    signature SPHERE =
      sig
        structure Vector : VECTOR
        structure Point : POINT
        sharing type Point.Vector.vector = Vector.vector
        type sphere
        val sphere : Point.point * Vector.vector -> sphere
      end

    signature GEOMETRY =
      sig
        structure Point : POINT
        structure Sphere : SPHERE
        sharing type Point.point = Sphere.Point.point
            and Point.Vector.vector = Sphere.Vector.vector
      end

These equations specify that the two “copies” of the point abstraction, and the three “copies” of the vector abstraction must coincide. In the presence of the above sharing specification, the ill-typed expression above becomes well-typed, since now the required type equation holds by explicit specification in the signature.

As a notational convenience we may use a structure sharing constraint instead to express the same requirements:

    signature SPHERE =
      sig
        structure Vector : VECTOR
        structure Point : POINT
        sharing Point.Vector = Vector
        type sphere
        val sphere : Point.point * Vector.vector -> sphere
      end

    signature GEOMETRY =
      sig
        structure Point : POINT
        structure Sphere : SPHERE
        sharing Point = Sphere.Point
            and Point.Vector = Sphere.Vector
      end

Rather than specify the required sharing type-by-type, we can instead specify it structure-by-structure, with the meaning that corresponding types of shared structures are required to share. Since each structure in our example contains only one type, the effect of the structure sharing specification above is identical to the preceding type sharing specification.

Not only does the sharing specification ensure that the desired equations hold amongst the various components of an implementation of GEOMETRY, but it also constrains the implementation to ensure that these types are the same. It is easy to achieve this requirement by defining a single implementation of points and vectors that is reused in the higher-level abstractions.

    structure Vector3D : VECTOR = ...

    structure Point3D : POINT =
      struct
        structure Vector : VECTOR = Vector3D
        ...
      end

    structure Sphere3D : SPHERE =
      struct
        structure Vector : VECTOR = Vector3D
        structure Point : POINT = Point3D
        ...
      end

    structure Geom3D :> GEOMETRY =
      struct
        structure Point = Point3D
        structure Sphere = Sphere3D
      end

The required type sharing constraints are true by construction.

Had we instead replaced the above declaration of Geom3D by the following one, the type checker would reject it on the grounds that the required sharing between Sphere.Point and Point does not hold, because Sphere2D.Point is distinct from Point3D.

    structure Geom3D :> GEOMETRY =
      struct
        structure Point = Point3D
        structure Sphere = Sphere2D
      end

It is natural to wonder whether it might be possible to restructure the GEOMETRY signature so that the duplication of the point and vector components is avoided, thereby obviating the need for sharing specifications. One can restructure the code in this manner, but doing so would do violence to the overall structure of the program. This is why sharing specifications are so important.

Let's try to reorganize the signature GEOMETRY so that duplication of the point and vector structures is avoided. One step is to eliminate the substructure Vector from SPHERE, replacing uses of Vector.vector by Point.Vector.vector.

    signature SPHERE =
      sig
        structure Point : POINT
        type sphere
        val sphere : Point.point * Point.Vector.vector -> sphere
      end

After all, since the structure Point comes equipped with a notion of vector, why not use it? This cuts down the number of sharing specifications to one:

    signature GEOMETRY =
      sig
        structure Point : POINT
        structure Sphere : SPHERE
        sharing Point = Sphere.Point
      end

If we could further eliminate the substructure Point from the signature SPHERE we would have only one copy of Point and no need for a sharing specification.
But what would the signature SPHERE look like in this case?

    signature SPHERE =
      sig
        type sphere
        val sphere : Point.point * Point.Vector.vector -> sphere
      end

The problem now is that the signature SPHERE is no longer self-contained. It makes reference to a structure Point, but which Point are we talking about? Any commitment would tie the signature to a specific structure, and hence a specific dimension, contrary to our intentions. Rather, the notion of point must be a generic concept within SPHERE, and hence Point must appear as a substructure. The substructure Point may be thought of as a parameter of the signature SPHERE in the sense discussed earlier.

The only other move available to us is to eliminate the structure Point from the signature GEOMETRY. This is indeed possible, and would eliminate the need for any sharing specifications. But it only defers the problem, rather than solving it. A full-scale geometry package would contain more abstractions that involve points, so that there will still be copies in the other abstractions. Sharing specifications would then be required to ensure that these copies are, in fact, identical.

Here is an example. Let us introduce another geometric abstraction, the semi-space.

    signature SEMI_SPACE =
      sig
        structure Point : POINT
        type semispace
        val side : Point.point * semispace -> bool option
      end

The function side determines (if possible) whether a given point lies in one half of the semi-space or the other.

The expanded GEOMETRY signature would look like this (with the elimination of the Point structure in place).

    signature EXTD_GEOMETRY =
      sig
        structure Sphere : SPHERE
        structure SemiSpace : SEMI_SPACE
        sharing Sphere.Point = SemiSpace.Point
      end

By an argument similar to the one we gave for the signature SPHERE, we cannot eliminate the substructure Point from the signature SEMI_SPACE, and hence we wind up with two copies. Therefore the sharing specification is required.

What is at issue here is a fundamental tension in the very notion of modular programming. On the one hand we wish to separate modules from one another so that they may be treated independently. This requires that the signatures of these modules be self-contained. Unbound references to a structure, such as Point, tie that signature to a specific implementation, in violation of our desire to treat modules separately from one another. On the other hand we wish to combine modules together to form programs. Doing so requires that the composition be coherent, which is achieved by the use of sharing specifications. What sharing specifications do for you is to provide an after-the-fact means of tying together several different abstractions to form a coherent whole. This approach to the problem of coherence is a unique, and uniquely effective, feature of the ML module system.

22.2 Sample Code

Here is the code for this chapter.
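As one way of seeing the sharing specification of this chapter at work, the sketch below fills in a two-dimensional geometry using the single-sharing form of SPHERE and GEOMETRY. The component origin in POINT and the bodies of Vector2D, Point2D, and Sphere2D are assumptions added here so that the example can actually construct points; the text leaves those implementations elided.

```sml
signature VECTOR =
sig
  type vector
  val zero : vector
  val scale : real * vector -> vector
  val add : vector * vector -> vector
  val dot : vector * vector -> real
end

signature POINT =
sig
  structure Vector : VECTOR
  type point
  val origin : point   (* hypothetical: added so that clients can build a point *)
  val translate : point * Vector.vector -> point
  val ray : point * point -> Vector.vector
end

signature SPHERE =
sig
  structure Point : POINT
  type sphere
  val sphere : Point.point * Point.Vector.vector -> sphere
end

signature GEOMETRY =
sig
  structure Point : POINT
  structure Sphere : SPHERE
  sharing Point = Sphere.Point
end

(* Assumed implementations; the result annotations select real arithmetic. *)
structure Vector2D : VECTOR =
struct
  type vector = real * real
  val zero = (0.0, 0.0)
  fun scale (c, (x, y)) : vector = (c * x, c * y)
  fun add ((x1, y1), (x2, y2)) : vector = (x1 + x2, y1 + y2)
  fun dot ((x1, y1), (x2, y2)) : real = x1 * x2 + y1 * y2
end

structure Point2D : POINT =
struct
  structure Vector = Vector2D
  type point = real * real
  val origin = (0.0, 0.0)
  fun translate ((x, y), (dx, dy)) : point = (x + dx, y + dy)
  fun ray ((x1, y1), (x2, y2)) : Vector.vector = (x2 - x1, y2 - y1)
end

structure Sphere2D : SPHERE =
struct
  structure Point = Point2D
  type sphere = Point.point * Point.Vector.vector
  fun sphere (center, radius) : sphere = (center, radius)
end

structure Geom2D :> GEOMETRY =
struct
  structure Point = Point2D
  structure Sphere = Sphere2D
end

val p = Geom2D.Point.origin
val q = Geom2D.Point.translate (p, Geom2D.Point.Vector.zero)

(* Accepted only because GEOMETRY equates Point with Sphere.Point. *)
val s = Geom2D.Sphere.sphere (p, Geom2D.Point.ray (p, q))
```

If the sharing line is deleted from GEOMETRY, the final binding should be rejected, since the opaque ascription would then keep Geom2D.Point.point and Geom2D.Sphere.Point.point apart.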
Chapter 23 Parameterization

To support code reuse it is useful to define generic, or parameterized, modules that leave unspecified some aspects of the implementation of a module. The unspecified parts may be instantiated to determine specific instances of the module. The common part is thereby implemented once and shared among all instances.

In ML such generic modules are called functors. A functor is a module-level function that takes a structure as argument and yields a structure as result. Instances are created by applying the functor to an argument specifying the interpretation of the parameters.

23.1 Functor Bindings and Applications

Functors are defined using a functor binding. There are two forms, the opaque and the transparent. A transparent functor has the form

    functor funid(decs):sigexp = strexp

where the result signature, sigexp, is transparently ascribed; an opaque functor has the form

    functor funid(decs):>sigexp = strexp

where the result signature is opaquely ascribed. A functor is a module-level function whose argument is a sequence of declarations, and whose result is a structure.
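The two forms of functor binding can be compared on a deliberately tiny example. In the sketch below, MAX, MaxFun, OpaqueMaxFun, IntMax, and HiddenIntMax are hypothetical names introduced only for illustration; ORDERED and IntLt are as in chapter 20.

```sml
signature ORDERED =
sig
  type t
  val lt : t * t -> bool
end

(* Hypothetical result signature for this sketch. *)
signature MAX =
sig
  type t
  val max : t * t -> t
end

(* Transparent form: the result signature is transparently ascribed. *)
functor MaxFun (structure O : ORDERED) : MAX =
struct
  type t = O.t
  fun max (x, y) = if O.lt (x, y) then y else x
end

(* Opaque form: the result signature is opaquely ascribed. *)
functor OpaqueMaxFun (structure O : ORDERED) :> MAX =
struct
  type t = O.t
  fun max (x, y) = if O.lt (x, y) then y else x
end

structure IntLt : ORDERED =
struct
  type t = int
  val lt = (op <)
end

(* Instances are created by applying the functor to a binding for O. *)
structure IntMax = MaxFun (structure O = IntLt)
structure HiddenIntMax = OpaqueMaxFun (structure O = IntLt)

val m = IntMax.max (3, 7)   (* well-typed: IntMax.t is known to be int *)

(* HiddenIntMax.max (3, 7) would be rejected: HiddenIntMax.t is abstract. *)
```

The only textual difference between the two functors is the ascription symbol, yet it decides whether clients of an instance can see that t is int.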
Type checking of a functor consists of checking that the functor body matches the ascribed result signature, given that the parameters of the functor have the specified signatures. For a transparent functor the result signature is the augmented signature determined by matching the principal signature of the body (relative to the assumptions governing the parameters) against the given result signature. For opaque functors the ascribed result signature is used as-is, without further augmentation by type equations.

In chapter 21 we developed a modular implementation of dictionaries in which the ordering of keys is made explicit as a substructure. The implementation of dictionaries is the same for each choice of order structure on the keys. This can be expressed by a single functor binding that defines the implementation of dictionaries parametrically in the choice of keys.

    functor DictFun
      (structure K : ORDERED) :>
        DICT where type Key.t = K.t =
      struct
        structure Key : ORDERED = K
        datatype 'a dict =
            Empty
          | Node of 'a dict * Key.t * 'a * 'a dict
        val empty = Empty
        (* only the Empty case of insert is shown *)
        fun insert (Empty, k, v) = Node (Empty, k, v, Empty)
        fun lookup (Empty, _) = NONE
          | lookup (Node (dl, l, v, dr), k) =
              if Key.lt (k, l) then
                lookup (dl, k)
              else if Key.lt (l, k) then
                lookup (dr, k)
              else
                SOME v
      end

The functor DictFun takes as argument a single structure specifying the ordered key type as a structure implementing signature ORDERED. The opaquely ascribed result signature is
    DICT where type Key.t = K.t.

This signature holds the type 'a dict abstract, but specifies that the key type is the type K.t passed as argument to the functor. The body of the functor has been written so that the comparison operations are obtained from the Key substructure. This ensures that the dictionary code is independent of the choice of key type and ordering.

Instances of functors are obtained by application. A functor application has the form

    funid(binds)

where binds is a sequence of bindings of the arguments of the functor.

The signature of a functor application is determined by the following procedure. We assume we are given the signatures of the functor parameters, and also the “true” result signature of the functor (the given signature for opaque functors, the augmented signature for the transparent functors).

1. For each argument, match the argument signature against the corresponding parameter signature of the functor. This determines an augmentation of the parameter signature for each argument (as described in chapter 20).

2. For each reference to a type component of a functor parameter in the result signature, propagate the type definitions of the augmented parameter signature to the result signature.

The signature of the application determined by this procedure is then opaquely ascribed to the application. This means that if a type is left abstract in the result signature of a functor, that type is “new” in every instance of that functor. This behavior is called generativity of the functor [1].

Returning to the example of the dictionary functor, the three versions of dictionaries considered in chapter 21 may be obtained by applying DictFun to appropriate arguments.

    structure LtIntDict = DictFun (structure K = LessInt)
    structure LexStringDict = DictFun (structure K = LexString)
    structure DivIntDict = DictFun (structure K = DivInt)

Footnote 1: The alternative, called applicativity, means that there is one abstract type shared by all instances of that functor.
In each case the functor DictFun is instantiated by specifying a binding for its argument structure K. The argument structures are, as described in chapter 21, implementations of the signature ORDERED. They specify the type of keys and the sense in which they are ordered.

The signatures for the structures LtIntDict, LexStringDict, and DivIntDict are determined by instantiating the result signature of the functor DictFun according to the above procedure. Consider the application of DictFun to LessInt. The augmented signature resulting from matching the signature of LessInt against the parameter signature ORDERED is the signature

    ORDERED where type t=int

Assigning this to the parameter K, we deduce that the type K.t is equivalent to int, and hence the result signature of DictFun is

    DICT where type Key.t = int

so that LtIntDict.Key.t is equivalent to int, as desired. By a similar process we deduce that the signature of LexStringDict is

    DICT where type Key.t = string

and that the signature of DivIntDict is

    DICT where type Key.t = int.

23.2 Functors and Sharing Specifications

In chapter 22 we developed a signature of geometric primitives that contained sharing specifications to ensure that the constituent abstractions may be combined properly. The signature GEOMETRY is defined as follows:

    signature GEOMETRY =
      sig
        structure Point : POINT
        structure Sphere : SPHERE
        sharing Point = Sphere.Point
            and Point.Vector = Sphere.Vector
            and Sphere.Vector = Sphere.Point.Vector
      end
The sharing clauses ensure that the Point and Sphere components are compatible with each other.

Since we expect to define vectors, points, and spheres of various dimensions, it makes sense to implement these as functors, according to the following scheme:

    functor PointFun
      (structure V : VECTOR) : POINT = ...

    functor SphereFun
      (structure V : VECTOR
       structure P : POINT) : SPHERE =
      struct
        structure Vector = V
        structure Point = P
        ...
      end

    functor GeomFun
      (structure P : POINT
       structure S : SPHERE) : GEOMETRY =
      struct
        structure Point = P
        structure Sphere = S
      end

A two-dimensional geometry package may then be defined as follows:

    structure Vector2D : VECTOR = ...
    structure Point2D : POINT =
      PointFun (structure V = Vector2D)
    structure Sphere2D : SPHERE =
      SphereFun (structure V = Vector2D and P = Point2D)
    structure Geom2D : GEOMETRY =
      GeomFun (structure P = Point2D and S = Sphere2D)

A three-dimensional version is defined similarly.

There is only one problem: the functors SphereFun and GeomFun are not well-typed! The reason is that in both cases their result signatures require type equations that are not true of their parameters! For example,
the signature SPHERE requires that Point.Vector be the same as Vector, which is not satisfied by the body of SphereFun. For these to be true, the structures P.Vector and V must be equivalent. This is not true in general, because the functors might be applied to arguments for which this is false. Similar problems plague the functor GeomFun.

The solution is to include sharing constraints in the parameter list of the functors, as follows:

    functor SphereFun
      (structure V : VECTOR
       structure P : POINT
       sharing P.Vector = V) : SPHERE =
      struct
        structure Vector = V
        structure Point = P
        ...
      end

    functor GeomFun
      (structure P : POINT
       structure S : SPHERE
       sharing P.Vector = S.Vector and P = S.Point) : GEOMETRY =
      struct
        structure Point = P
        structure Sphere = S
      end

These equations preclude instantiations for which the required equations do not hold, and are sufficient to ensure that the requirements of the result signatures of the functors are met.

23.3 Avoiding Sharing Specifications

As with sharing specifications in signatures, it is natural to wonder whether they can be avoided in functor parameters. Once again, the answer is “yes”, but doing so does violence to the structure of your program. The chief virtue of sharing specifications is that they express directly and concisely the required relationships without requiring that these relationships be anticipated when defining the signatures of the parameters. This greatly
facilitates re-use of off-the-shelf code, for which it is impossible to assume any sharing relationships that one may wish to impose in an application.

To see what happens, let's consider the best-case scenario from chapter 22 in which we have minimized sharing specifications to one tying together the sphere and semi-space components. That is, we're to implement the following signature:

    signature EXTD_GEOMETRY =
      sig
        structure Sphere : SPHERE
        structure SemiSpace : SEMI_SPACE
        sharing Sphere.Point = SemiSpace.Point
      end

The implementation is a functor of the form

    functor ExtdGeomFun
      (structure Sp : SPHERE
       structure Ss : SEMI_SPACE
       sharing Sp.Point = Ss.Point) =
      struct
        structure Sphere = Sp
        structure SemiSpace = Ss
      end

To eliminate the sharing equation in the functor parameter, we must arrange that the sharing equation in the signature EXTD_GEOMETRY holds. Simply dropping the sharing specification will not do, because then there is no reason to believe that it will hold as required in the signature. A natural move is to “factor out” the implementation of POINT, and use it to ensure that the required equation is true of the functor body. There are two methods for doing this, each with disadvantages compared to the use of sharing specifications.

One is to make the desired equation true by construction. Rather than take implementations of SPHERE and SEMI_SPACE as arguments, the functor ExtdGeomFun1 takes only an implementation of POINT, then creates in the functor body appropriate implementations of spheres and semi-spaces.

    functor SphereFun
      (structure P : POINT) : SPHERE =
• There is no inherent reason why ExtdGeomFun1 must take an implementation of POINT as argument. We must reconstruct the entire hierarchy. • The functor ExtdGeomFun1 must have as parameter the common element(s) of the components of its body. This is a signiﬁcant loss of generality that is otherwise present in the functor ExtdGeomFun.23. . which may be applied to any implementations of SPHERE and SEMI SPACE.3 Avoiding Sharing Speciﬁcations struct structure Vector = P.11 D RAFT V ERSION 1. .02.2 . . end functor ExtdGeomFun1 (structure P : POINT) : GEOMETRY = struct structure Sphere = SphereFun (structure P = Point) structure SemiSpace = SemiSpaceFun (structure P = Point) end The problems with this solution are these: 189 • The body of ExtdGeomFun1 makes use of the functors SphereFun and SemiSpaceFun. It does so only so that it can reconR EVISED 11.Vector structure Point = P . This approach does not scale well when many abstractions are layered atop one another. end functor SemiSpaceFun (structure P : POINT) : SEMI SPACE = struct . In effect we are limiting the geometry functor to arguments that are built from these speciﬁc functors. which is then used to build up the appropriate substructures in a manner consistent with the required sharing. . starting with the components that are conceptually “furthest away” as arguments. and no other.
and use this to constrain the arguments to the functor to ensure that the possible arguments are limited to situations for which the required sharing holds. 2 Ofﬁcially. Another approach is to factor out the common component. It has the disadvantage of requiring a third argument. An application of this functor must provide not only implementations of SPHERE and SEMI SPACE.2 we must write where type Point.3 Avoiding Sharing Speciﬁcations 190 struct the hierarchy so as to satisfy the sharing requirements of the result signature.2 .Point. but many compilers accept the syntax above.point = Sp. but also an implementation of POINT that is used to build these! A slightly more sophisticated version of this solution is as follows: functor ExtdGeomFun3 (structure Sp : SPHERE structure Ss : SEMI SPACE where Point = Sp. but it is also clear that this approach has no particular advantages over just using a sharing speciﬁcation.11 D RAFT V ERSION 1. functor ExtdGeomFun2 (structure P : POINT structure Sp : SPHERE where Point = P structure Ss : SEMI SPACE where Point = P) = struct structure Sphere = Sp structure SemiSpace = Ss end Now the required sharing requirements are met.02.point.23. whose only role is to make it possible to express the required sharing.Point) = struct structure Sphere = Sp structure SemiSpace = Ss end The “extra” parameter to the functor has been eliminated by choosing one of the components as a “representative” and insisting that the others be compatible with it by using a where clause. R EVISED 11.
This solution has all of the advantages of the direct use of sharing specifications, and no further disadvantages. However, we are forced to violate arbitrarily the inherent symmetry of the situation. We could just as well have written

    functor ExtdGeomFun4
      (structure Ss : SEMI_SPACE
       structure Sp : SPHERE where Point = Ss.Point) =
    struct
      structure Sphere = Sp
      structure SemiSpace = Ss
    end

without changing the meaning.

Here is the point: sharing specifications allow a symmetric situation to be treated in a symmetric manner. The compiler breaks the symmetry by choosing representatives arbitrarily in the manner illustrated above. Sharing specifications off-load the burden of making such tedious (because arbitrary) decisions to the compiler, rather than imposing it on the programmer.

23.4 Sample Code

Here is the code for this chapter.
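As a minimal sketch of the symmetric treatment described in this chapter, the following self-contained example applies a functor whose parameters are tied together by a sharing specification. The signatures here are toy stand-ins invented for illustration (the real POINT, SPHERE, and SEMI_SPACE carry more components), and the sketch uses the type-level form "sharing type" rather than the structure sharing written in the text, on the assumption that this form is accepted by all SML'97 compilers.

```sml
(* Toy stand-ins for the book's signatures (assumptions, not the originals). *)
signature POINT =
sig
  type point
  val origin : point
end

signature SPHERE =
sig
  structure Point : POINT
  val center : Point.point
end

signature SEMI_SPACE =
sig
  structure Point : POINT
  val boundary : Point.point
end

(* The sharing specification states that both arguments use the same points. *)
functor ExtdGeomFun
  (structure Sp : SPHERE
   structure Ss : SEMI_SPACE
   sharing type Sp.Point.point = Ss.Point.point) =
struct
  structure Sphere = Sp
  structure SemiSpace = Ss
  (* This list type-checks only because of the sharing specification. *)
  val points : Sp.Point.point list = [Sp.center, Ss.boundary]
end

structure IntPoint : POINT =
struct
  type point = int * int
  val origin = (0, 0)
end

structure MySphere : SPHERE =
struct
  structure Point = IntPoint
  val center = (1, 1)
end

structure MySemiSpace : SEMI_SPACE =
struct
  structure Point = IntPoint
  val boundary = (2, 2)
end

structure G = ExtdGeomFun (structure Sp = MySphere structure Ss = MySemiSpace)
```

Neither argument is singled out as the "representative"; swapping the order of Sp and Ss in the parameter list requires no change to the sharing specification.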
Part IV Programming Techniques

In this part of the book we will explore the use of Standard ML to build elegant, reliable, and efficient programs. The discussion takes the form of a series of worked examples illustrating various techniques for building programs.
Chapter 24 Specifications and Correctness

The most important tools for getting programs right are specification and verification. In this chapter we review the main ideas in preparation for their subsequent use in the rest of the book.

24.1 Specifications

A specification is a description of the behavior of a piece of code. Specifications take many forms:

• Typing. A type specification describes the "form" of the value of an expression, without saying anything about the value itself.

• Effect Behavior. An effect specification resembles a type specification, but instead of describing the value of an expression, it describes the effects it may engender when evaluated.

• Input-Output Behavior. An input-output specification is a mathematical formula, usually an implication, that describes the output of a function for all inputs satisfying some assumptions.

• Time and Space Complexity. A complexity specification states the time or space required to evaluate an expression. The specification is most often stated asymptotically in terms of the number of execution steps or the size of a data structure.
• Equational Specifications. An equational specification states that one code fragment is equivalent to another. This means that wherever one is used we may replace it with the other without changing the observable behavior of any program in which they occur.

This list by no means exhausts the possibilities. What specifications have in common, however, is a descriptive, or declarative, flavor, rather than a prescriptive, or operational, one. A specification states what a piece of code does, not how it does it. Good programmers use specifications to state precisely and concisely their intentions when writing a piece of code. The code is written to solve a problem; the specification records for future reference what problem the code was intended to solve. The very act of formulating a precise specification of a piece of code is often the key to finding a good solution to a tricky problem. The guiding principle is this: if you are unable to write a clear specification, you do not understand the problem well enough to solve it correctly.

The greatest difficulty in using specifications is in knowing what to say. A common misunderstanding is that a given piece of code has one specification stating all there is to know about it. Rather you should see a specification as describing one of perhaps many properties of interest. In ML every program comes with at least one specification, its type. But we may also consider other specifications, according to interest and need. Sometimes only relatively weak properties are important: the function square always yields a non-negative result. Other times stronger properties are needed: the function fib' applied to n yields the n-th and (n-1)-st Fibonacci numbers. It's a matter of taste and experience to know what to say and how best to say it.

Recall the following two ML functions from chapter 7:

    fun fib 0 = 1
      | fib 1 = 1
      | fib n = fib (n-1) + fib (n-2)

    fun fib' 0 = (1, 0)
      | fib' 1 = (1, 1)
      | fib' n =
          let
            val (a, b) = fib' (n-1)
          in
            (a+b, a)
          end

Here are some specifications pertaining to these functions:

• Type specifications:
  – val fib : int -> int
  – val fib' : int -> int * int

• Effect specifications:
  – The application fib n may raise the exception Overflow.
  – The application fib' n may raise the exception Overflow.

• Time complexity specifications:
  – If $n \ge 0$, then fib n terminates in $O(2^n)$ steps.
  – If $n \ge 0$, then fib' n terminates in $O(n)$ steps.

• Input-output specifications:
  – If $n \ge 0$, then fib n evaluates to the n-th Fibonacci number.
  – If $n \ge 0$, then fib' n evaluates to the n-th and (n-1)-st Fibonacci numbers, in that order.

• Equivalence specification: For all $n \ge 0$, fib n is equivalent to #1(fib' n).

24.2 Correctness Proofs

A program satisfies, or meets, a specification iff its execution behavior is as described by the specification. Verification is the process of checking that a program satisfies a specification. This takes the form of proving a mathematical theorem (by hand or with machine assistance) stating that the program implements a specification.
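Verification proper means proof, but the input-output and equivalence specifications listed above for fib and fib' can at least be spot-checked on a finite range of arguments before a proof is attempted. The sketch below does so; the helper agree_upto is a hypothetical name introduced here, and a successful run is evidence only, not a verification.

```sml
fun fib 0 = 1
  | fib 1 = 1
  | fib n = fib (n-1) + fib (n-2)

fun fib' 0 = (1, 0)
  | fib' 1 = (1, 1)
  | fib' n =
      let
        val (a, b) = fib' (n-1)
      in
        (a+b, a)
      end

(* agree_upto k is true iff fib n = #1 (fib' n) for every 0 <= n <= k.
   Hypothetical helper: it tests the equivalence specification on finitely
   many inputs and proves nothing about the remaining ones. *)
fun agree_upto k =
  List.all (fn n => fib n = #1 (fib' n)) (List.tabulate (k + 1, fn n => n))
```

Keeping k small matters in practice, since fib itself takes exponentially many steps.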
There are many misunderstandings in the literature about specification and verification. It is worthwhile to take time out to address some of them here.

It is often said that a program is correct if it meets a specification. The verification of this fact is then called a correctness proof. While there is nothing wrong with this usage, it invites misinterpretation. As we remarked in section 24.1, there is in general no single preferred specification of a piece of code. Verification is always relative to a specification, and hence so is correctness. In particular, a program can be "correct" with respect to one specification, and "incorrect" with respect to another. Consequently, a program is never inherently correct or incorrect; it is only so relative to a particular specification.

A related misunderstanding is the inference that a program "works" from the fact that it is "correct" in this sense. Specifications usually make assumptions about the context in which the program is used. If these assumptions are not satisfied, then all bets are off: nothing can be said about its behavior. For this reason it is entirely possible that a correct program will malfunction when used, not because the specification is not met, but rather because the specification is not relevant to the context in which the code is used.

Another common misconception about specifications is that they can always be implemented as run-time checks.[1] There are at least two fallacies here:

1. The specification is stated at the level of the source code. It may not even be possible to test it at run-time. See chapter 32 for further discussion of this point.

2. The specification may not be mechanically checkable. For example, we might specify that a function f of type int->int always yields a non-negative result. But there is no way to implement this as a run-time check; this is an undecidable, or uncomputable, problem.

For this reason specifications are strictly more powerful than run-time checks. Correspondingly, it takes more work than a mere conditional branch to ensure that a program satisfies a specification.

[1] This misconception is encouraged by the C assert macro, which introduces an executable test that a certain computable condition holds. This is a fine thing, but from this many people draw the conclusion that assertions (specifications) are simply boolean tests. This is false.

Finally, it is important to note that specification, implementation, and verification go hand-in-hand. It is unrealistic to propose to verify that an arbitrary piece of code satisfies an arbitrary specification. Fundamental computability and complexity results make clear that we can never succeed in such an endeavor. Fortunately, it is also completely artificial. In practice we specify, code, and verify simultaneously, with each activity informing the other. If the verification breaks down, reconsider the code or the specification (or both). If the code is difficult to write, look for insights from the specification and verification. If the specification is complex, rough out the code and proof to see what it should be. There is no "magic bullet", but these are some of the most important, and useful, tools for building elegant, robust programs.

Verifications of specifications take many different forms.

• Type specifications are verified automatically by the ML compiler. If you put type annotations on a function stating the types of the parameters or results of that function, the correctness of these annotations is ensured by the compiler. For example, we may state the intended typing of fib as follows:

    fun fib (n:int):int =
      case n
        of 0 => 1
         | 1 => 1
         | n => fib (n-1) + fib (n-2)

  This ensures that the type of fib is int->int.

• Effect specifications must be checked by hand. These generally state that a piece of code "may raise" one or more exceptions. If an exception is not mentioned in such a specification, we cannot conclude that the code does not raise it, only that we have no information. Notice that a handler serves to eliminate a "may raise" specification. For example, the following function may not raise the exception Overflow, even though fib might:

    fun protected_fib n = (fib n) handle Overflow => 0

  A handler makes it possible to specify that an exception may not be raised by a given expression (because the handler traps it).

• Input-output specifications require proof, typically using some form of induction. For example, in chapter 7 we proved that fib n yields the n-th Fibonacci number by complete induction on n.

• Complexity specifications are often verified by solving a recurrence describing the execution time of a program. In the case of fib we may read off the following recurrence:

    $T(0) = 1$
    $T(1) = 1$
    $T(n+2) = T(n) + T(n+1) + 1$

  Solving this recurrence yields the proof that $T(n) = O(2^n)$.

• Equivalence specifications also require proof. Since equivalence of expressions must account for all possible uses of them, these proofs are, in general, very tricky. One method that often works is "induction plus hand-simulation". For example, it is not hard to prove by induction on n that fib n is equivalent to #1(fib' n). First, plug in $n = 0$ and $n = 1$, and calculate using the definitions. Then assume the result for $n$ and $n + 1$, and consider $n + 2$, once again calculating based on the definitions, to obtain the result.

24.3 Enforcement and Compliance

Most specifications have the form of an implication: if certain conditions are met, then the program behaves a certain way. For example, the type specification of an expression is conditional on the types of its free variables: if x has type int, then x+1 has type int. Input-output specifications are characteristically of this form. For example, if x is non-negative, then so is x+1. Just as in ordinary mathematical reasoning, if the premises of such a specification are not true, then all bets are off: nothing can be said about the behavior of the code. The premises of the specification are preconditions that must be met in order for the program to behave in the manner described by the conclusion, or postcondition, of the specification. This
means that the preconditions impose obligations on the caller, the user of the code, in order for the callee, the code itself, to be well-behaved. A conditional specification is a contract between the caller and the callee: if the caller meets the preconditions, the callee promises to fulfill the postcondition.

In the case of type specifications the compiler enforces this obligation by ruling out as ill-typed any attempt to use a piece of code in a context that does not fulfill its typing assumptions. Returning to the example above, if one attempts to use the expression x+1 in a context where x is not an integer, one can hardly expect that x+1 will yield an integer. Therefore it is rejected by the type checker as a violation of the stated assumptions governing the types of its free variables.

What about specifications that are not mechanically enforced? For example, if x is negative, then we cannot infer anything about x+1 from the specification given above.[2] To make use of the specification in reasoning about its use in a larger program, it is essential that this precondition be met in the context of its use. Lacking mechanical enforcement of these obligations, it is all too easy to neglect them when writing code. Many programming mistakes can be traced to violation of assumptions made by the callee that are not met by the caller.[3]

What can be done about this? A standard method, called bullet-proofing, is to augment the callee with run-time checks that ensure that its preconditions are met, raising an exception if they are not. For example, we might write a "bullet-proofed" version of fib that ensures that its argument is non-negative as follows:

    local
      exception PreCond
      fun unchecked_fib 0 = 1
        | unchecked_fib 1 = 1
        | unchecked_fib n =
            unchecked_fib (n-1) + unchecked_fib (n-2)
    in
      fun checked_fib n =
        if n < 0 then
          raise PreCond
        else
          unchecked_fib n
    end

[2] There are obviously other specifications that carry more information, but we're only concerned here with the one given. Moreover, if f is an unknown function, then we will, in general, only have the specification, and not the code, to reason about.

[3] Sadly, these assumptions are often unstated and can only be culled from the code with great effort, if at all.
It is worth noting that we have structured this program to take the precondition check out of the loop. It would be poor practice to define checked_fib as follows:

    fun bad_checked_fib n =
      if n < 0 then
        raise PreCond
      else
        case n
          of 0 => 1
           | 1 => 1
           | n => bad_checked_fib (n-1) + bad_checked_fib (n-2)

Once we know that the initial argument is non-negative, it is assured that recursive calls also satisfy this requirement, provided that you've done the inductive reasoning to validate the specification of the function.

However, bullet-proofing in this form has several drawbacks. First, it imposes the overhead of checking on all callers, even those that have ensured that the desired precondition is true. In truth the run-time overhead is minor; the real overhead is requiring that the implementor of the callee take the trouble to impose the checks. Second, and far more importantly, bullet-proofing only applies to specifications that can be checked at run-time. As we remarked earlier, not all specifications are amenable to run-time checks. For these cases there is no question of inserting run-time checks to enforce the precondition. For example, we may wish to impose the requirement that a function argument of type int->int always yields a non-negative result. There is no run-time check for this condition: we cannot write a function nonneg of type (int->int)->bool that determines whether or not a function f always yields a non-negative result. In chapter 32 we will consider the use of data abstraction to enforce at compile time specifications that may not be checked at run-time.
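To make the caller's side of this contract concrete, here is a small sketch of a bullet-proofed fib together with a caller that recovers from a violated precondition. It departs from the text in one labeled respect: the exception PreCond is declared at top level rather than inside the local block, on the assumption that callers should be able to name it in a handler. The function try_fib is a hypothetical caller introduced only for illustration.

```sml
(* Declared at top level (unlike the text's version) so callers can handle it. *)
exception PreCond

local
  fun unchecked_fib 0 = 1
    | unchecked_fib 1 = 1
    | unchecked_fib n = unchecked_fib (n-1) + unchecked_fib (n-2)
in
  (* The precondition n >= 0 is checked once, outside the recursion. *)
  fun checked_fib n =
    if n < 0 then raise PreCond else unchecked_fib n
end

(* Hypothetical caller: a violated precondition becomes NONE. *)
fun try_fib n = SOME (checked_fib n) handle PreCond => NONE
```

If PreCond stays hidden inside the local declaration, a caller can still trap it with a wildcard handler, but cannot distinguish it from any other exception.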
Chapter 25 Induction and Recursion
This chapter is concerned with the close relationship between recursion and induction in programming. If a function is recursively-defined, an inductive proof is required to show that it meets a specification of its behavior. The motto is: when programming recursively, think inductively. Doing so significantly reduces the time spent debugging, and often leads to more efficient, robust, and elegant programs.
25.1 Exponentiation
Let's start with a very simple series of examples, all involving the computation of the integer exponential function. Our first example is to compute $2^n$ for integers $n \ge 0$. We seek to define the function exp of type int->int satisfying the specification

    if $n \ge 0$, then exp n evaluates to $2^n$.

The precondition, or assumption, is that the argument n is non-negative. The postcondition, or guarantee, is that the result of applying exp to n is the number $2^n$. The caller is required to establish the precondition before applying exp; in exchange, the caller may assume that the result is $2^n$.

Here's the code:

    fun exp 0 = 1
      | exp n = 2 * exp (n-1)
Does this function satisfy the specification? It does, and we can prove this by induction on n. If $n = 0$, then exp n evaluates to 1 (as you can see from the first line of its definition), which is, of course, $2^0$. Otherwise, assume that exp is correct for $n - 1 \ge 0$, and consider the value of exp n. From the second line of its definition we can see that this is the value of $2 \times p$, where $p$ is the value of exp (n-1). Inductively, $p = 2^{n-1}$, so $2 \times p = 2 \times 2^{n-1} = 2^n$, as desired. Notice that we need not consider arguments $n < 0$, since the precondition of the specification rules them out. We must, however, ensure that each recursive call satisfies this requirement in order to apply the inductive hypothesis.

That was pretty simple. Now let us consider the running time of exp expressed as a function of n. Assuming that arithmetic operations are executed in constant time, we can read off a recurrence describing its execution time as follows:

    $T(0) = O(1)$
    $T(n+1) = O(1) + T(n)$

We are interested in solving a recurrence by finding a closed-form expression for it. In this case the solution is easily obtained:

    $T(n) = O(n)$

Thus we have a linear time algorithm for computing the integer exponential function.

What about space? This is a much more subtle issue than time because it is much more difficult in a high-level language such as ML to see where the space is used. Based on our earlier discussions of recursion and iteration we can argue informally that the definition of exp given above requires space given by the following recurrence:

    $S(0) = O(1)$
    $S(n+1) = O(1) + S(n)$

The justification is that the implementation requires a constant amount of storage to record the pending multiplication that must be performed upon completion of the recursive call. Solving this simple recurrence yields the equation

    $S(n) = O(n)$
expressing that exp is also a linear space algorithm for the integer exponential function.

Can we do better? Yes, on both counts! Here's how. Rather than count down by ones, multiplying by two at each stage, we use successive squaring to achieve logarithmic time and space requirements. The idea is that if the exponent is even, we square the result of raising 2 to half the given power; otherwise, we reduce the exponent by one and double the result, ensuring that the next exponent will be even. Here's the code:

    fun square (n:int) = n*n
    fun double (n:int) = n+n

    fun fast_exp 0 = 1
      | fast_exp n =
          if n mod 2 = 0 then
            square (fast_exp (n div 2))
          else
            double (fast_exp (n-1))
Its specification is precisely the same as before. Does this code satisfy the specification? Yes, and we can prove this by using complete induction, a form of mathematical induction in which we may prove that $n > 0$ has a desired property by assuming not only that the predecessor has it, but that all preceding numbers have it, and arguing that therefore $n$ must have it. Here's how it's done. For $n = 0$ the argument is exactly as before. Suppose, then, that $n > 0$. If $n$ is even, the value of fast_exp n is the result of squaring the value of fast_exp (n div 2). Inductively this value is $2^{(n \div 2)}$, so squaring it yields $2^{(n \div 2)} \times 2^{(n \div 2)} = 2^{2 \times (n \div 2)} = 2^n$, as required. If, on the other hand, $n$ is odd, the value is the result of doubling fast_exp (n-1). Inductively the latter value is $2^{(n-1)}$, so doubling it yields $2^n$, as required.

Here's a recurrence governing the running time of fast_exp as a function of its argument:

    $T(0) = O(1)$
    $T(2n) = O(1) + T(n)$
    $T(2n+1) = O(1) + T(2n) = O(1) + T(n)$

Solving this recurrence using standard techniques yields the solution

    $T(n) = O(\lg n)$
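The logarithmic bound can be observed directly by counting calls. The following sketch repeats fast_exp and adds fast_exp_calls, a hypothetical helper (not part of the text) that mirrors the recursion of fast_exp without doing the arithmetic, so that large exponents can be examined without overflow.

```sml
fun square (n:int) = n*n
fun double (n:int) = n+n

fun fast_exp 0 = 1
  | fast_exp n =
      if n mod 2 = 0 then
        square (fast_exp (n div 2))
      else
        double (fast_exp (n-1))

(* Hypothetical helper: the number of calls that fast_exp makes on
   argument n (assuming n >= 0), including the initial call. *)
fun fast_exp_calls 0 = 1
  | fast_exp_calls n =
      1 + (if n mod 2 = 0 then fast_exp_calls (n div 2)
           else fast_exp_calls (n-1))
```

For comparison, the linear-time exp makes n+1 calls on argument n, whereas here an odd step is always followed by a halving step, so the count grows roughly as twice the number of binary digits of n.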
You should convince yourself that fast_exp also requires logarithmic space usage.

Can we do better? Well, it's not possible to improve the time requirement (at least not asymptotically), but we can reduce the space required to $O(1)$ by putting the function into iterative (tail recursive) form. However, this may not be achieved in this case by simply adding an accumulator argument, without also increasing the running time! The obvious approach is to attempt to satisfy the specification

    if $n \ge 0$, then skinny_fast_exp (n, a) evaluates to $2^n \times a$.

Here's some code that achieves this specification:

    fun skinny_fast_exp (0, a) = a
      | skinny_fast_exp (n, a) =
          if n mod 2 = 0 then
            skinny_fast_exp (n div 2, skinny_fast_exp (n div 2, a))
          else
            skinny_fast_exp (n-1, 2*a)

It is easy to see that this code works properly for $n = 0$ and for $n > 0$ when $n$ is odd, but what if $n > 0$ is even? Then by induction we compute $2^{(n \div 2)} \times 2^{(n \div 2)} \times a$ by two recursive calls to skinny_fast_exp. This yields the desired result, but what is the running time? Here's a recurrence to describe its running time as a function of n:

    $T(0) = 1$
    $T(2n) = O(1) + 2T(n)$
    $T(2n+1) = O(1) + T(2n) = O(1) + 2T(n)$

Here again we have a standard recurrence whose solution is $T(n) = O(n)$.

Can we do better? The key is to recall the following important fact:

    $2^{(2n)} = (2^2)^n = 4^n$.

We can achieve a logarithmic time and constant space bound by a change of base. Here's the specification:

    if $n \ge 0$, then gen_skinny_fast_exp (b, n, a) evaluates to $b^n \times a$.

Here's the code:

    fun gen_skinny_fast_exp (b, 0, a) = a
      | gen_skinny_fast_exp (b, n, a) =
          if n mod 2 = 0 then
            gen_skinny_fast_exp (b*b, n div 2, a)
          else
            gen_skinny_fast_exp (b, n-1, b*a)

Let's check its correctness by complete induction on n. The base case is obvious. Assume the specification for arguments smaller than $n > 0$. If $n$ is even, then by induction the result is $(b \times b)^{(n \div 2)} \times a = b^n \times a$, and if $n$ is odd, we obtain inductively the result $b^{(n-1)} \times b \times a = b^n \times a$. This completes the proof.

The trick to achieving an efficient implementation of the exponential function was to compute a more general function that can be implemented using less time and space. Since this is a trick employed by the implementor of the exponential function, it is important to insulate the client from it. This is easily achieved by using a local declaration to "hide" the generalized function, making available to the caller a function satisfying the original specification. Here's what the code looks like in this case:

    local
      fun gen_skinny_fast_exp (b, 0, a) = ...
        | gen_skinny_fast_exp (b, n, a) = ...
    in
      fun exp n = gen_skinny_fast_exp (2, n, 1)
    end

(The elided code is the same as above.)

The point here is to see how induction and recursion go hand-in-hand, and how we used induction not only to verify programs after-the-fact, but, more importantly, to help discover the program in the first place. If the verification is performed simultaneously with the coding, it is far more likely that the proof will go through and the program will work the first time you run it.

It is important to notice the correspondence between strengthening the specification by adding additional assumptions (and guarantees) and adding accumulator arguments. What we observe is the apparent paradox that it is often easier to do something (superficially) harder! In terms of proving, it is often easier to push through an inductive argument for a stronger specification, precisely because we get to assume the result as the inductive hypothesis when arguing the inductive step(s). We are limited only by the requirement that the specification be proved outright at the base case(s); no inductive assumption is available to help us along here. In terms of programming, it is often easier to compute a more complicated function involving accumulator arguments, precisely because we get to exploit the accumulator when making recursive calls. We are limited only by the requirement that the result be defined outright for the base case(s); no recursive calls are available to help us along here.

25.2 The GCD Algorithm

Let's consider a more complicated example, the computation of the greatest common divisor of a pair of non-negative integers. Recall that $m$ is a divisor of $n$, written $m \mid n$, iff $n$ is a multiple of $m$, which is to say that there is some $k \ge 0$ such that $n = k \times m$. The greatest common divisor of non-negative integers $m$ and $n$ is the largest $p$ such that $p \mid m$ and $p \mid n$. (By convention the g.c.d. of 0 and 0 is taken to be 0.) Here's the specification of the gcd function:

    if $m, n \ge 0$, then gcd(m,n) evaluates to the g.c.d. of $m$ and $n$.

Euclid's algorithm for computing the g.c.d. of $m$ and $n$ is defined by complete induction on the product $m \times n$. Here's the algorithm, written in ML:

    fun gcd (m:int, 0):int = m
      | gcd (0, n:int):int = n
      | gcd (m:int, n:int):int =
          if m>n then
            gcd (m mod n, n)
          else
            gcd (m, n mod m)

Why is this algorithm correct? We may prove that gcd satisfies the specification by complete induction on the product $m \times n$. If $m \times n$ is zero,
follows similarly.c. it can’t possibly be anything other than the case at that point in the program.d. but they can be turned off once the code is in production. which is to say that d divides m. these checks have a runtime cost.2 The GCD Algorithm 208 then either mor n is zero. of m and n. algorithm that computes the g. correctly. so m = (k + (m ÷ n) × l ) × d. d is the g. in which case the answer is. graceful malfunction) of our programs? This is an instance of the general problem of writing selfchecking programs. We’ll illustrate the idea by elaborating on the g. Observe that mmodn = m − (m ÷ n) × n. Suppose that m > n. Now if d’ is any other divisor of m and n.2 . so that (mmodn) × n = m × n − (m ÷ n)n2 < m × n.02.c. It remains to show that this is the g.d.c. at least. the time required to read and write from disk.c. That is. For example. and anyway the cost is minimal compared to. the other number. If d divides both mmodn and n. Consequently. say. you may be thinking. but it’s no replacement for good oldfashioned “bulletprooﬁng” — conditional tests inserted at critical junctures to ensure that key invariants do indeed hold at execution time. then k × d = (mmodn) = (m − (m ÷ n) × n)and l × d = n for some nonnegative k and l. Barring compiler bugs. provided that sufﬁciently cheap checks can be put into place and provided that you know where to put them to maximize their effectiveness. example a bit further. Sure. of m and n. Otherwise the product is positive. This completes the proof. At this point you may well be thinking that all this inductive reasoning is surely helpful. It’s hard to complain about this attitude. Or it may be possible to insert a check whose computation is more expensive (or more complicated) than the one we’re trying to perform.c.d.c.d. then it is also a divisor of (mmodn) and n.11 D RAFT V ERSION 1. and we proceed according to whether m > n or m ≤ n. Suppose we wish to write a selfchecking g. k × d = m − (m ÷ n) × l × d. 
and what checks should be included to help ensure the correct operation (or. in which case we’re defeating the purpose by including them! This raises the question of where should we put such checks. and then checks the result to ensure that it really is the greatest common divisor of the two given nonnegative integers before returning it as result.. m ≤ n.d. The code might look something like this: exception GCD ERROR R EVISED 11. The other case.25. there’s no use checking i > 0 at the start of the then clause of a test for i > 0. so d > d .d. so that by induction we return the g. of mmodn and n.
    fun checked_gcd (m, n) =
        let
            val d = gcd (m, n)
        in
            if m mod d = 0 andalso n mod d = 0 andalso ???
            then d
            else raise GCD_ERROR
        end

It’s obviously no problem to check that the putative g.c.d., d, is in fact a common divisor of m and n, but how do we check that it’s the greatest common divisor? Well, one choice is to simply try all numbers between d and the smaller of m and n to ensure that no intervening number is also a divisor, refuting the maximality of d. But this is clearly so inefficient as to be impractical. But there’s a better way, which, it has to be emphasized, relies on the kind of mathematical reasoning we’ve been considering right along. Here’s an important fact:

    d is the g.c.d. of m and n iff d divides both m and n and can be written as a linear combination of m and n.

That is, d is the g.c.d. of m and n iff m = k × d for some k ≥ 0, n = l × d for some l ≥ 0, and d = a × m + b × n for some integers (possibly negative!) a and b. We’ll prove this constructively by giving a program to compute not only the g.c.d. d of m and n, but also the coefficients a and b such that d = a × m + b × n. Here’s the specification:

    if m, n ≥ 0, then ggcd (m, n) evaluates to (d, a, b) such that d divides m, d divides n, and d = a × m + b × n; consequently, d is the g.c.d. of m and n.

And here’s the code to compute it:

    fun ggcd (0, n) = (n, 0, 1)
      | ggcd (m, 0) = (m, 1, 0)
      | ggcd (m, n) =
        if m>n then
            let
                val (d, a, b) = ggcd (m mod n, n)
            in
                (d, a, b - a * (m div n))
            end
        else
            let
                val (d, a, b) = ggcd (m, n mod m)
            in
                (d, a - b*(n div m), b)
            end

We may easily check that this code satisfies the specification by induction on the product m × n. If m × n = 0, then either m or n is 0, in which case the result follows immediately. Otherwise assume the result for smaller products, and show it for m × n > 0. Suppose m > n; the other case is handled analogously. Inductively we obtain d, a, and b such that d is the g.c.d. of m mod n and n, and hence is the g.c.d. of m and n, and d = a × (m mod n) + b × n. Since m mod n = m − (m ÷ n) × n, it follows that d = a × m + (b − a × (m ÷ n)) × n, from which the result follows.

Now we can write a self-checking g.c.d. as follows:

    exception GCD_ERROR
    fun checked_gcd (m, n) =
        let
            val (d, a, b) = ggcd (m, n)
        in
            if m mod d = 0 andalso n mod d = 0 andalso d = a*m+b*n
            then d
            else raise GCD_ERROR
        end

This algorithm takes no more time (asymptotically) than the original, and, moreover, ensures that the result is correct. This illustrates the power of the interplay between mathematical reasoning methods such as induction and number theory and programming methods such as bulletproofing to achieve robust, reliable, and, what is more important, elegant programs.
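To see the certificate at work, the following sketch evaluates ggcd on a small input and compares the certificate-based check with the brute-force search described earlier in this section. It repeats the definitions of ggcd and checked_gcd so that it can be loaded on its own. The helper no_larger_divisor and the sample inputs are assumptions made purely for illustration, and both inputs are assumed to be positive.

```sml
exception GCD_ERROR

(* ggcd (m, n) = (d, a, b) with d = a*m + b*n, as specified in the text *)
fun ggcd (0, n) = (n, 0, 1)
  | ggcd (m, 0) = (m, 1, 0)
  | ggcd (m, n) =
    if m > n then
        let val (d, a, b) = ggcd (m mod n, n)
        in (d, a, b - a * (m div n)) end
    else
        let val (d, a, b) = ggcd (m, n mod m)
        in (d, a - b * (n div m), b) end

(* self-checking g.c.d.: verify divisibility and the linear combination *)
fun checked_gcd (m, n) =
    let
        val (d, a, b) = ggcd (m, n)
    in
        if m mod d = 0 andalso n mod d = 0 andalso d = a*m + b*n
        then d
        else raise GCD_ERROR
    end

(* Hypothetical brute-force alternative (not from the text): true iff no
   number k with d < k <= min (m, n) divides both m and n. *)
fun no_larger_divisor (m, n, d) =
    let
        fun try k =
            k <= d orelse
            (not (m mod k = 0 andalso n mod k = 0) andalso try (k - 1))
    in
        try (Int.min (m, n))
    end

val (d, a, b) = ggcd (12, 18)            (* d = 6, a = ~1, b = 1 *)
val certified = (d = a * 12 + b * 18)    (* ~1 * 12 + 1 * 18 = 6 *)
val brute = no_larger_divisor (12, 18, d)
```

Here the linear combination is checked with two multiplications and an addition, whereas the brute-force test examines every candidate between d and the smaller input.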
25.3 Sample Code

Here is the code for this chapter.
Chapter 26
Structural Induction

The importance of induction and recursion is not limited to functions defined over the integers. Rather, the familiar concept of mathematical induction over the natural numbers is an instance of the more general notion of structural induction over values of an inductively-defined type. Rather than develop a general treatment of inductively-defined types, we will rely on a few examples to illustrate the point. Let’s begin by considering the natural numbers as an inductively defined type.

26.1 Natural Numbers

The set of natural numbers, N, may be thought of as the smallest set containing 0 and closed under the formation of successors. In other words, n is an element of N iff either n = 0 or n = m + 1 for some m in N. Still another way of saying it is to define N by the following clauses:

1. 0 is an element of N.
2. If m is an element of N, then so is m + 1.
3. Nothing else is an element of N.

(The third clause is sometimes called the extremal clause; it ensures that we are talking about N and not just some superset of it.) All of these definitions are equivalent ways of saying the same thing.
Since N is inductively defined, we may prove properties of the natural numbers by structural induction, which in this case is just ordinary mathematical induction. Specifically, to prove that a property P(n) holds of every n in N, it suffices to demonstrate the following facts:

1. Show that P(0) holds.
2. Assuming that P(m) holds, show that P(m + 1) holds.

The pattern of reasoning follows exactly the structure of the inductive definition: the base case is handled outright, and the inductive step is handled by assuming the property for the predecessor and showing that it then holds for its successor.

The principle of structural induction also licenses the definition of functions by structural recursion. To define a function f with domain N, it suffices to proceed as follows:

1. Give the value of f(0).
2. Give the value of f(m + 1) in terms of the value of f(m).

Given this information, there is a unique function f with domain N satisfying these requirements. Specifically, we may show by induction on n ≥ 0 that the value of f is uniquely determined on all values m ≤ n. If n = 0, this is obvious since f(0) is defined by the first clause. If n = m + 1, then by induction the value of f is determined for all values k ≤ m. But the value of f at n is determined as a function of f(m), and hence is uniquely determined. Thus f is uniquely determined for all values of n in N, as was to be shown.

The natural numbers, viewed as an inductively-defined type, may be represented in ML using a datatype declaration, as follows:

    datatype nat = Zero | Succ of nat

The constructors correspond one-for-one with the clauses of the inductive definition. The extremal clause is implicit in the datatype declaration since the given constructors are assumed to be all the ways of building values of type nat. This assumption forms the basis for exhaustiveness checking for clausal function definitions.

(You may object that this definition of the type nat amounts to a unary (base 1) representation of natural numbers, an unnatural and space-wasting representation. This is indeed true; in practice the natural numbers are represented as non-negative machine integers to avoid excessive overhead. Note, however, that this representation places a fixed upper bound on the size of numbers, whereas the unary representation does not. Hybrid representations that enjoy the benefits of both are, of course, possible and occasionally used when enormous numbers are required.)

Functions defined by structural recursion are naturally represented by clausal function definitions, as in the following example:

    fun double Zero = Zero
      | double (Succ n) = Succ (Succ (double n))

    fun exp Zero = Succ(Zero)
      | exp (Succ n) = double (exp n)

The type checker ensures that we have covered all cases, but it does not ensure that the pattern of structural recursion is strictly followed: we may accidentally define f(m + 1) in terms of itself or some f(k) where k > m, breaking the pattern. The reason this is admitted is that the ML compiler cannot always follow our reasoning: we may have a clever algorithm in mind that isn’t easily expressed by a simple structural induction. To avoid restricting the programmer, the language assumes the best and allows any form of definition.

Using the principle of structural induction for the natural numbers, we may prove properties of functions defined over the naturals. For example, we may easily prove by structural induction over the type nat that for every n ∈ N, exp n evaluates to a positive number. (In previous chapters we carried out proofs of more interesting program properties.)

26.2 Lists

Generalizing a bit, we may think of the type 'a list as inductively defined by the following clauses:

1. nil is a value of type 'a list.
2. If h is a value of type 'a, and t is a value of type 'a list, then h::t is a value of type 'a list.
3. Nothing else is a value of type 'a list.
Second. Similarly.11 D RAFT V ERSION 1.) Using the principle of structural induction over lists. as indeed it does and ought to.3 Trees Generalizing even further. Show that P holds for h::t. we may deﬁne functions by structural recursion over lists as follows: 1.26. we assume that reverse t yields the reversal of t. we show that reverse nil yields nil. Deﬁne the function for nil. The clauses of the inductive deﬁnition of lists correspond to the following (builtin) datatype declaration in ML: datatype ’a list = nil  :: of ’a * ’a list (We are neglecting the fact that :: is regarded as an inﬁx operator.2 . 26. To prove that a property P holds of all lists l. 2. Here’s a deﬁnition of 23 trees in ML: R EVISED 11. (We have ignored questions of time and space efﬁciency to avoid obscuring the induction principle underlying the deﬁnition of reverse.02. as indeed it does since it returns reverse (t @ [h]). 2. Deﬁne the function for h::t in terms of its value for t. we may prove that reverse l evaluates to the reversal of l. and argue that reverse (h::t) yields the reversal of h::t. First. it sufﬁces to proceed as follows: 1. we can introduce new inductivelydeﬁned types such as 23 trees in which interior nodes are either binary (have two children) or ternary (have three children). and the value of reverse for h::t is deﬁned in terms of its value for t. assuming that P holds for t.) The principle of structural recursion may be applied to deﬁne the reverse function as follows: fun reverse nil = nil  reverse (h::t) = reverse t @ [h] There is one clause for each constructor.3 Trees 215 This deﬁnition licenses the following principle of structural induction over lists. Show that P holds for nil.
t2. t1. Here’s the complete deﬁnition: fun size Empty = 0  size (Bin ( . t2)) = ???  size (Ter ( . In many cases (including this one) the function is deﬁned by structural recursion. use a clausal deﬁnition with one clause per constructor R EVISED 11. 26.) The slogan is: To deﬁne functions over a datatype.4 Generalizations and Limitations Does this pattern apply to every datatype declaration? Yes and no. Such a deﬁnition is guaranteed to be exhaustive (cover all cases).02. t2.4 Generalizations and Limitations datatype ’a twth tree = Empty  Bin of ’a * ’a twth tree * ’a twth tree  Ter of ’a * ’a twth tree * ’a twth tree * ’a twth tree 216 How might one deﬁne the “size” of a value of this type? Your ﬁrst thought should be to write down a template like the following: fun size Empty = ???  size (Bin ( . because then the compiler will inform you of what clauses need to be added or removed from functions deﬁned over that type in order to restore it to a sensible deﬁnition. t3)) = 1 + size t1 + size t2 + size t3 Obviously this function computes the number of nodes in the tree.2 . t3)) = ??? We have one clause per constructor. (It is especially valuable if you change the datatype declaration. t1.26. t1.11 D RAFT V ERSION 1. as you can readily verify by structural induction over the type ’a twth tree. t1. t2)) = 1 + size t1 + size t2  size (Ter ( . No matter what the form of the declaration it always makes sense to deﬁne a function over it by a clausal function deﬁnition with one clause per constructor. and serves as a valuable guide to structuring your code. and will ﬁll in the ellided expressions to complete the deﬁnition.
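As a further illustration of the slogan, here is a minimal sketch of two more functions over 2-3 trees, each written from the same three-clause template. The functions height and elements, and the sample tree, are hypothetical examples introduced only for this illustration; the datatype is repeated so that the fragment stands alone.

```sml
datatype 'a twth_tree =
    Empty
  | Bin of 'a * 'a twth_tree * 'a twth_tree
  | Ter of 'a * 'a twth_tree * 'a twth_tree * 'a twth_tree

(* height: length of the longest path from the root to an Empty leaf *)
fun height Empty = 0
  | height (Bin (_, t1, t2)) = 1 + Int.max (height t1, height t2)
  | height (Ter (_, t1, t2, t3)) =
    1 + Int.max (height t1, Int.max (height t2, height t3))

(* elements: the value at each node, node first, then subtrees left to right *)
fun elements Empty = nil
  | elements (Bin (v, t1, t2)) = v :: elements t1 @ elements t2
  | elements (Ter (v, t1, t2, t3)) =
    v :: elements t1 @ elements t2 @ elements t3

val sample =
    Ter (1, Bin (2, Empty, Empty), Empty, Bin (3, Bin (4, Empty, Empty), Empty))

val h = height sample       (* 3 *)
val es = elements sample    (* [1, 2, 3, 4] *)
```

If a constructor were later added to the datatype, the compiler would flag both functions as non-exhaustive, which is the practical benefit the slogan points to.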
The catch is that not every datatype declaration supports a principle of structural induction because it is not always clear what constitutes the predecessor(s) of a constructed value. For example, the declaration

    datatype D = Int of int | Fun of D->D

is problematic because a value of the form Fun(f) is not constructed directly from another value of type D, and hence it is not clear what to regard as its predecessor. In practice this sort of definition comes up only rarely; in most cases datatypes are naturally viewed as inductively defined.

26.5 Abstracting Induction

It is interesting to observe that the pattern of structural recursion may be directly codified in ML as a higher-order function. Specifically, we may associate with each inductively-defined type a higher-order function that takes as arguments values that determine the base case(s) and step case(s) of the definition, and defines a function by structural induction based on these arguments. An example will illustrate the point. The pattern of structural induction over the type nat may be codified by the following function:

    fun nat_rec base step =
        let
            fun loop Zero = base
              | loop (Succ n) = step (loop n)
        in
            loop
        end

This function has the type 'a -> ('a -> 'a) -> nat -> 'a. Given the first two arguments, nat_rec yields a function of type nat -> 'a whose behavior is determined at the base case by the first argument and at the inductive step by the second. Here’s an example of the use of nat_rec to define the exponential function:

    val double = nat_rec Zero (fn result => Succ (Succ result))
    val exp = nat_rec (Succ Zero) double
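The following sketch shows nat_rec being run on concrete values. It repeats the declarations of nat, nat_rec, double, and exp so that it can be loaded on its own; the helpers to_int, from_int, and plus are hypothetical additions used only to make the results observable as ordinary integers.

```sml
datatype nat = Zero | Succ of nat

fun nat_rec base step =
    let
        fun loop Zero = base
          | loop (Succ n) = step (loop n)
    in
        loop
    end

val double = nat_rec Zero (fn result => Succ (Succ result))
val exp = nat_rec (Succ Zero) double

(* Hypothetical helpers, not part of the text. *)

(* to_int: base case 0, step adds one for each Succ *)
val to_int = nat_rec 0 (fn k => k + 1)

(* plus m n: apply Succ to m once for every Succ in n *)
fun plus m n = nat_rec m Succ n

(* from_int k: unary representation of k; non-positive k is mapped to Zero *)
fun from_int k = if k <= 0 then Zero else Succ (from_int (k - 1))

val three = from_int 3
val eight = to_int (exp three)                (* 2 to the power 3 *)
val five = to_int (plus three (from_int 2))
```

Each instance supplies only a base value and a step function; the recursion itself lives entirely in nat_rec.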
Note well the pattern! The arguments to nat_rec are

1. The value for Zero.
2. The value for Succ n defined in terms of its value for n.

Similarly, the pattern of list recursion may be captured by the following functional:

    fun list_recursion base step =
        let
            fun loop nil = base
              | loop (h::t) = step (h, loop t)
        in
            loop
        end

The type of the function list_recursion is

    'a -> ('b * 'a -> 'a) -> 'b list -> 'a

It may be instantiated to define the reverse function as follows:

    (* written with an explicit argument so that reverse stays polymorphic
       under the value restriction *)
    fun reverse l = list_recursion nil (fn (h, t) => t @ [h]) l

Finally, the principle of structural recursion for values of type 'a twth_tree is given as follows:

    fun twth_rec base bin_step ter_step =
        let
            fun loop Empty = base
              | loop (Bin (v, t1, t2)) =
                bin_step (v, loop t1, loop t2)
              | loop (Ter (v, t1, t2, t3)) =
                ter_step (v, loop t1, loop t2, loop t3)
        in
            loop
        end

Notice that we have two inductive steps, one for each form of node. The type of twth_rec is

    'a -> ('b * 'a * 'a -> 'a) -> ('b * 'a * 'a * 'a -> 'a) -> 'b twth_tree -> 'a

We may instantiate it to define the function size as follows:

    (* explicit argument, for the same reason as reverse above *)
    fun size t =
        twth_rec 0
            (fn (_, s1, s2) => 1+s1+s2)
            (fn (_, s1, s2, s3) => 1+s1+s2+s3)
            t

Summarizing, the principle of structural induction over a recursive datatype is naturally codified in ML using pattern matching and higher-order functions. Whenever you’re programming with a datatype, you should use the techniques outlined in this chapter to structure your code.

26.6 Sample Code

Here is the code for this chapter.
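As an additional illustration of the list recursor from Section 26.5 (a sketch only, and not the sample code distributed with the chapter), here are a few more instances of list_recursion. The names len, sum, and append are hypothetical. Each is written with an explicit list argument, which keeps the polymorphic ones generalizable under the value restriction.

```sml
fun list_recursion base step =
    let
        fun loop nil = base
          | loop (h::t) = step (h, loop t)
    in
        loop
    end

(* len: ignore the head, add one to the length of the tail *)
fun len l = list_recursion 0 (fn (_, n) => n + 1) l

(* sum: add the head to the sum of the tail *)
fun sum l = list_recursion 0 (fn (h, s) => h + s) l

(* append: the base case is the second list; each step conses a head back on *)
fun append (l1, l2) = list_recursion l2 (fn (h, r) => h :: r) l1

(* reverse, as in the text *)
fun reverse l = list_recursion nil (fn (h, r) => r @ [h]) l

val n = len ["a", "b", "c"]
val s = sum [1, 2, 3, 4]
val a = append ([1, 2], [3])
val r = reverse [1, 2, 3]
```

Readers familiar with the Basis Library may notice that list_recursion base step behaves like foldr step base.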
Chapter 27
Proof-Directed Debugging

In this chapter we’ll put specification and verification techniques to work in devising a regular expression matcher. The code is similar to that sketched in chapter 1, but we will use verification techniques to detect and correct a subtle error that may not be immediately apparent from inspecting or even testing the code. We call this process proof-directed debugging.

The first task is to devise a precise specification of the regular expression matcher. This is a difficult problem in itself. We then attempt to verify that the matching program developed in chapter 1 satisfies this specification. The proof attempt breaks down. Careful examination of the failure reveals a counterexample to the specification: the program does not satisfy it. We then consider how best to resolve the problem, not by change of implementation, but instead by change of specification.

27.1 Regular Expressions and Languages

Before we begin work on the matcher, let us first define the set of regular expressions and their meaning as a set of strings. The set of regular expressions is given by the following grammar:

    r ::= 0 | 1 | a | r1 r2 | r1 + r2 | r∗

Here a ranges over a given alphabet, a set of primitive “letters” that may be used in a regular expression. A string is a finite sequence of letters of the alphabet. We write ε for the null string, the empty sequence of letters. We write s1 s2 for the concatenation of the strings s1 and s2, the string s consisting of the letters in s1 followed by those in s2. The length of a string is the number of letters in it. We do not distinguish between a character and the unit-length string consisting solely of that character. Thus we write a s for the extension of s with the letter a at the front.

A language is a set of strings. Every regular expression r stands for a particular language L(r), the language of r, which is defined by induction on the structure of r as follows:

    L(0)        =  0
    L(1)        =  1
    L(a)        =  {a}
    L(r1 r2)    =  L(r1) L(r2)
    L(r1 + r2)  =  L(r1) + L(r2)
    L(r∗)       =  L(r)∗

This definition employs the following operations on languages:

    0          =  ∅
    1          =  {ε}
    L1 + L2    =  L1 ∪ L2
    L1 L2      =  { s1 s2 | s1 ∈ L1, s2 ∈ L2 }
    L^(0)      =  1
    L^(i+1)    =  L L^(i)
    L∗         =  ⋃ (i ≥ 0) L^(i)

An important fact about L∗ is that it is the smallest language L′ such that 1 + L L′ ⊆ L′. Spelled out, this means two things:

1. 1 + L L∗ ⊆ L∗, which is to say that
   (a) ε ∈ L∗, and
   (b) if s ∈ L and s′ ∈ L∗, then s s′ ∈ L∗.
2. If 1 + L L′ ⊆ L′, then L∗ ⊆ L′.

This means that L∗ is the smallest language (with respect to language containment) that contains the null string and is closed under concatenation on the left by L.

Let’s prove that this is the case. First, since L^(0) = 1, it follows immediately that ε ∈ L∗. Second, if l ∈ L and l′ ∈ L∗, then l′ ∈ L^(i) for some i ≥ 0,
and hence l l′ ∈ L^(i+1) by definition of concatenation of languages. This completes the first step. Now suppose that L′ is such that 1 + L L′ ⊆ L′. We are to show that L∗ ⊆ L′. We show by induction on i ≥ 0 that L^(i) ⊆ L′, from which the result follows immediately. If i = 0, then it suffices to show that ε ∈ L′. But this follows from the assumption that 1 + L L′ ⊆ L′, which implies that 1 ⊆ L′. To show that L^(i+1) ⊆ L′, we observe that, by definition, L^(i+1) = L L^(i). By induction L^(i) ⊆ L′, and hence L L^(i) ⊆ L′, since L L′ ⊆ L′ by assumption.

Having proved that L∗ is the smallest language L′ such that 1 + L L′ ⊆ L′, it is not hard to prove that L∗ satisfies the recurrence L∗ = 1 + L L∗. We just proved the right to left containment. For the converse, it suffices to observe that 1 + L (1 + L L∗) ⊆ 1 + L L∗, for then the result follows by minimality. This is easily established by a simple case analysis.

Exercise 1
Give a full proof of the fact that L∗ = 1 + L L∗.

Finally, a word about implementation. We will assume in what follows that the alphabet is given by the type char of characters, that strings are elements of the type string, and that regular expressions are defined by the following datatype declaration:

    datatype regexp =
        Zero | One | Char of char
      | Plus of regexp * regexp | Times of regexp * regexp
      | Star of regexp

We will also work with lists of characters, values of type char list, using ML notation for primitive operations on lists such as concatenation and extension. Occasionally we will abuse notation and not distinguish (in the informal discussion) between a string and a list of characters. In particular we will speak of a character list as being a member of a language, when in fact we mean that the corresponding string is a member of that language.

27.2 Specifying the Matcher

Let us begin by devising a specification for the regular expression matcher. As a first cut we write down a type specification. We seek to define a function match of type regexp -> string -> bool that determines whether or not a given string matches a given regular expression. More precisely, we wish to satisfy the following specification:

    For every regular expression r and every string s, match r s terminates, and evaluates to true iff s ∈ L(r).

We saw in chapter 1 that a natural way to define the procedure match is to use a technique called continuation passing. We defined an auxiliary function match_is with the type

    regexp -> char list -> (char list -> bool) -> bool

that takes a regular expression, a list of characters (essentially a string, but in a form suitable for incremental processing), and a continuation, and yields a boolean. The idea is that match_is takes a regular expression r, a character list cs, and a continuation k, and determines whether or not some initial segment of cs matches r, passing the remaining characters cs′ to k in the case that there is such an initial segment, and yields false otherwise. Put more precisely,

    For every regular expression r, character list cs, and continuation k, if cs = cs′ @ cs″ with cs′ ∈ L(r) and k cs″ evaluates to true, then match_is r cs k evaluates to true; otherwise, match_is r cs k evaluates to false.

Unfortunately, this specification is too strong to ever be satisfied by any program! Can you see why? The difficulty is that if k is not guaranteed to terminate for all inputs, then there is no way that match_is can behave as required. For example, if there is no input on which k terminates, the specification requires that match_is return false. It should be intuitively clear that we can never implement such a function. Instead, we must restrict attention to total continuations, those that always terminate with true or false on any input. This leads to the following revised specification:

    For every regular expression r, character list cs, and total continuation k, if cs = cs′ @ cs″ with cs′ ∈ L(r) and k cs″ evaluates to true, then match_is r cs k evaluates to true; otherwise, match_is r cs k evaluates to false.
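The revised specification can be read as a search over all ways of splitting cs into an initial and a final segment. The sketch below transcribes that reading directly into ML. It is an inefficient reference point for what match_is is required to compute, not an implementation of the matcher. The names splits, spec_match_is, in_a_or_ab, and at_end are hypothetical, and membership in L(r) is supplied as an assumed predicate in_lang rather than derived from a regular expression.

```sml
(* All ways of writing cs as prefix @ suffix, shortest prefix first. *)
fun splits nil = [(nil, nil)]
  | splits (c :: cs) =
    (nil, c :: cs) :: map (fn (p, s) => (c :: p, s)) (splits cs)

(* The revised specification as a search: true iff SOME split has its
   prefix in the language and its suffix accepted by the continuation k.
   in_lang is an assumed, total membership test standing in for L(r);
   k is assumed to be total as well. *)
fun spec_match_is in_lang cs k =
    List.exists (fn (p, s) => in_lang p andalso k s) (splits cs)

(* Example language, given directly: the strings "a" and "ab". *)
fun in_a_or_ab cs = (cs = [#"a"]) orelse (cs = [#"a", #"b"])

(* A total continuation that accepts only the empty remainder. *)
fun at_end (cs : char list) = null cs

(* The split ("a", "b") has a matching prefix but is rejected by at_end,
   so the search must continue to the split ("ab", ""). *)
val ok = spec_match_is in_a_or_ab (explode "ab") at_end
val no = spec_match_is in_a_or_ab (explode "abc") at_end
```

The first example succeeds only because the search does not stop at the first split whose prefix is in the language.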
Observe that this specification makes use of an implicit existential quantification. Written out in full, we might say “For all ..., if there exist cs′ and cs″ such that cs = cs′ @ cs″ with ..., then ...”. This observation makes clear that we must search for a suitable splitting of cs into two parts such that the first part is in L(r) and the second is accepted by k. There may, in general, be many ways to partition the input so as to satisfy both of these requirements; we need only find one such way. Note, however, that if cs = cs′ @ cs″ with cs′ ∈ L(r) but k cs″ yielding false, we must reject this partitioning and search for another. In other words we cannot simply accept any partitioning whose initial segment matches r, but rather only those that also induce k to accept its corresponding final segment. We may return false only if there is no such splitting, not merely if a particular splitting fails to work.

Suppose for the moment that match_is satisfies this specification. Does it follow that match satisfies the original specification? Recall that the function match is defined as follows:

    fun match r s =
        match_is r (String.explode s) (fn nil => true | _ => false)

Notice that the initial continuation is indeed total, and that it yields true (accepts) iff it is applied to the null string. Therefore match satisfies the following property obtained from the specification of match_is by plugging in the initial continuation:

    For every regular expression r and string s, if s ∈ L(r), then match r s evaluates to true, and otherwise match r s evaluates to false.

This is precisely the property that we desire for match. Thus match is correct (satisfies its specification) if match_is is correct.

So far so good. But does match_is satisfy its specification? If so, we are done. How might we check this? Recall the definition of match_is given in the overview:

    fun match_is Zero _ k = false
      | match_is One cs k = k cs
      | match_is (Char c) nil k = false
      | match_is (Char c) (d::cs) k = if c=d then k cs else false
      | match_is (Times (r1, r2)) cs k =
        match_is r1 cs (fn cs' => match_is r2 cs' k)
      | match_is (Plus (r1, r2)) cs k =
        match_is r1 cs k orelse match_is r2 cs k
      | match_is (Star r) cs k =
        k cs orelse match_is r cs (fn cs' => match_is (Star r) cs' k)

Since match_is is defined by a recursive analysis of the regular expression r, it is natural to proceed by induction on the structure of r. That is, we treat the specification as a conjecture about match_is, and attempt to prove it by structural induction on r.

We first consider the three base cases. Suppose that r is 0. Then no string is in L(r), so match_is must return false, which indeed it does. Suppose that r is 1. Since the null string is an initial segment of every string, and the null string is in L(1), we must yield true iff k cs yields true, and false otherwise. This is precisely how match_is is defined. Suppose that r is a. Then to succeed cs must have the form a cs′ with k cs′ evaluating to true; otherwise we must fail. The code for match_is checks that cs has the required form and, if so, passes cs′ to k to determine the outcome, and otherwise yields false. Thus match_is behaves correctly for each of the three base cases.

We now consider the three inductive steps. For r = r1 + r2, we observe that some initial segment of cs matches r and causes k to accept the corresponding final segment of cs iff either some initial segment matches r1 and drives k to accept the rest or some initial segment matches r2 and drives k to accept the rest. By induction match_is works as specified for r1 and r2, which is sufficient to justify the correctness of match_is for r = r1 + r2.

For r = r1 r2, the proof is slightly more complicated. By induction match_is behaves according to the specification if it is applied to either r1 or r2, provided that the continuation argument is total. Note that the continuation k′ given by fn cs' => match_is r2 cs' k is total, since by induction the inner recursive call to match_is always terminates. Suppose that there exists a partitioning cs = cs′ @ cs″ with cs′ ∈ L(r) and k cs″ evaluating to true. Then cs′ = cs1′ cs2′ with cs1′ ∈ L(r1) and cs2′ ∈ L(r2), by definition of L(r1 r2). Consequently, match_is r2 (cs2′ cs″) k evaluates to
true, and hence match_is r1 (cs1′ cs2′ cs″) k′ evaluates to true, as required. If, however, no such partitioning exists, then one of three situations occurs:

1. either no initial segment of cs matches r1, in which case the outer recursive call yields false, as required, or
2. for every initial segment matching r1, no initial segment of the corresponding final segment matches r2, in which case the inner recursive call yields false on every call, and hence the outer call yields false, as required, or
3. every pair of successive initial segments of cs matching r1 and r2 successively results in k evaluating to false, in which case the inner recursive call always yields false, and hence the continuation k′ always yields false, and hence the outer recursive call yields false, as required.

Be sure you understand the reasoning involved here; it is quite tricky to get right!

We seem to be on track, with one more case to consider, r = r1∗. This case would appear to be a combination of the preceding two cases for alternation and concatenation, with a similar argument sufficing to establish correctness. But there is a snag: the second recursive call to match_is leaves the regular expression unchanged! Consequently we cannot apply the inductive hypothesis to establish that it behaves correctly in this case, and the obvious proof attempt breaks down.

What to do? A moment’s thought suggests that we proceed by an inner induction on the length of the string, based on the idea that if some initial segment of cs matches L(r), then either that initial segment is the null string (base case), or cs = cs′ @ cs″ with cs′ ∈ L(r1) and cs″ ∈ L(r) (induction step). We then handle the base case directly, and handle the inductive case by assuming that match_is behaves correctly for cs″ and showing that it behaves correctly for cs. But there is a flaw in this argument: the string cs″ need not be shorter than cs in the case that cs′ is the null string! In that case the inductive hypothesis does not apply, and we are once again unable to complete the proof.

This time we can use the failure of the proof to obtain a counterexample to the specification! For if r = 1∗, for example, then match_is r cs k does not terminate whenever k cs evaluates to false! In general if r = r1∗ with ε ∈ L(r1), then match_is r
cs k may fail to terminate. In other words, match_is does not satisfy the specification we have given for it. Our conjecture is false!

Our failure to establish that match_is satisfies its specification led to a counterexample that refuted our conjecture and uncovered a genuine bug in the program: the matcher may not terminate for some inputs. What to do? One approach is to explicitly check for looping behavior during matching by ensuring that each recursive call matches some non-empty initial segment of the string. This will work, but at the expense of cluttering the code and imposing additional run-time overhead. You should write out a version of the matcher that works this way, and check that it indeed satisfies the specification we’ve given above.

An alternative is to observe that the proof goes through under the additional assumption that no iterated regular expression matches the null string. Call a regular expression r standard iff whenever r′∗ occurs within r, the null string is not an element of L(r′). It is easy to check that the proof given above goes through under the assumption that the regular expression r is standard.

This says that the matcher works correctly for standard regular expressions. But what about the non-standard ones? The key observation is that every regular expression is equivalent to one in standard form. By “equivalent” we mean “accepting the same language”. For example, the regular expressions r + 0 and r are easily seen to be equivalent. Using this observation we may avoid the need to consider non-standard regular expressions. Instead we can pre-process the regular expression to put it into standard form, then call the matcher on the standardized regular expression.

The required pre-processing is based on the following definitions. We will associate with each regular expression r two standard regular expressions δ(r) and r⁻ with the following properties:

1. L(δ(r)) = 1 iff ε ∈ L(r) and L(δ(r)) = 0 otherwise.
2. L(r⁻) = L(r) \ 1.

With these equations in mind, we see that every regular expression r may be written in the form δ(r) + r⁻, which is in standard form.
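For comparison with the pre-processing approach, here is one possible sketch of the first approach mentioned above, in which the matcher itself insists that every iteration of a starred expression consumes at least one character. It is only one way of carrying out that idea and is not taken from the chapter's code. The length comparison in the Star clause is the assumed progress check, and the regexp datatype is repeated so that the fragment stands alone.

```sml
datatype regexp =
    Zero | One | Char of char
  | Plus of regexp * regexp | Times of regexp * regexp
  | Star of regexp

fun match_is Zero _ _ = false
  | match_is One cs k = k cs
  | match_is (Char _) nil _ = false
  | match_is (Char c) (d :: cs) k = (c = d) andalso k cs
  | match_is (Times (r1, r2)) cs k =
    match_is r1 cs (fn cs' => match_is r2 cs' k)
  | match_is (Plus (r1, r2)) cs k =
    match_is r1 cs k orelse match_is r2 cs k
  | match_is (Star r) cs k =
    k cs orelse
    match_is r cs
      (* progress check: iterate again only if r consumed some input,
         so the recursive call on Star r sees a strictly shorter list *)
      (fn cs' => length cs' < length cs andalso match_is (Star r) cs' k)

(* accept iff the whole string is consumed *)
fun match r s = match_is r (String.explode s) null

val t1 = match (Star One) "a"                          (* terminates *)
val t2 = match (Star (Plus (One, Char #"a"))) "aaa"
```

Recomputing the lengths on every iteration is the kind of run-time overhead the text warns about; discarding empty iterations does not change the language accepted, since they contribute nothing to the matched segment.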
The function δ mapping regular expressions to regular expressions is defined by induction on regular expressions by the following equations:

    δ(0)        =  0
    δ(1)        =  1
    δ(a)        =  0
    δ(r1 + r2)  =  δ(r1) ⊕ δ(r2)
    δ(r1 r2)    =  δ(r1) ⊗ δ(r2)
    δ(r∗)       =  1

Here we define 0 ⊕ 1 = 1 ⊕ 0 = 1 ⊕ 1 = 1 and 0 ⊕ 0 = 0 and 0 ⊗ 1 = 1 ⊗ 0 = 0 ⊗ 0 = 0 and 1 ⊗ 1 = 1.

Exercise 2
Show that L(δ(r)) = 1 iff ε ∈ L(r).

The definition of r⁻ is given by induction on the structure of r by the following equations:

    0⁻          =  0
    1⁻          =  0
    a⁻          =  a
    (r1 + r2)⁻  =  r1⁻ + r2⁻
    (r1 r2)⁻    =  δ(r1) r2⁻ + r1⁻ δ(r2) + r1⁻ r2⁻
    (r∗)⁻       =  r⁻ (r⁻)∗

The only tricky case is the one for concatenation, which must take account of the possibility that r1 or r2 accepts the null string.

Exercise 3
Show that L(r⁻) = L(r) \ 1.

27.3 Sample Code

Here is the code for this chapter.
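As an illustration only, and not the chapter's distributed sample code, the definitions of δ and r⁻ might be transcribed into ML along the following lines. The names oplus, otimes, delta, rminus, and standardize are hypothetical. The sketch assumes the clauses a⁻ = a and (r∗)⁻ = r⁻ (r⁻)∗, which are the choices consistent with the requirement L(r⁻) = L(r) \ 1 and with the result being standard.

```sml
datatype regexp =
    Zero | One | Char of char
  | Plus of regexp * regexp | Times of regexp * regexp
  | Star of regexp

(* The operations on 0 and 1; only meaningful when both arguments
   are Zero or One, which is all that delta ever passes to them. *)
fun oplus (Zero, Zero) = Zero
  | oplus _ = One
fun otimes (One, One) = One
  | otimes _ = Zero

(* delta r is One if the null string is in L(r), and Zero otherwise *)
fun delta Zero = Zero
  | delta One = One
  | delta (Char _) = Zero
  | delta (Plus (r1, r2)) = oplus (delta r1, delta r2)
  | delta (Times (r1, r2)) = otimes (delta r1, delta r2)
  | delta (Star _) = One

(* rminus r accepts L(r) with the null string removed *)
fun rminus Zero = Zero
  | rminus One = Zero
  | rminus (Char c) = Char c
  | rminus (Plus (r1, r2)) = Plus (rminus r1, rminus r2)
  | rminus (Times (r1, r2)) =
    Plus (Times (delta r1, rminus r2),
          Plus (Times (rminus r1, delta r2),
                Times (rminus r1, rminus r2)))
  | rminus (Star r) = Times (rminus r, Star (rminus r))

(* every r is equivalent to delta r + rminus r *)
fun standardize r = Plus (delta r, rminus r)

val std = standardize (Star One)
```

For the counterexample 1∗ discussed earlier, the result is 1 + 0 (0∗), in which the only starred subexpression is 0, whose language does not contain the null string.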
Chapter 28
Persistent and Ephemeral Data Structures

This chapter is concerned with persistent and ephemeral abstract types. The distinction is best explained in terms of the logical future of a value. Whenever a value of an abstract type is created it may be subsequently acted upon by the operations of the type (and, since the type is abstract, by no other operations). Each of these operations may yield (other) values of that abstract type, which may themselves be handed off to further operations of the type. Ultimately a value of some other type, say a string or an integer, is obtained as an observable outcome of the succession of operations on the abstract value. The sequence of operations performed on a value of an abstract type constitutes a logical future of that value: a computation that starts with that value and ends with a value of some observable type. We say that a type is ephemeral iff every value of that type has at most one logical future, which is to say that it is handed off from one operation of the type to another until an observable value is obtained from it. This is the normal case in familiar imperative programming languages because in such languages the operations of an abstract type destructively modify the value upon which they operate; its original state is irretrievably lost by the performance of an operation. It is therefore inherent in the imperative programming model that a value have at most one logical future. In contrast, values of an abstract type in functional languages such as ML may have many different logical futures, precisely because the operations do not “destroy” the value upon which they operate, but rather create fresh values of that type to yield as results. Such values are said to be persistent because they persist after application of an operation of the type, and in fact may serve as arguments to further operations of that type.

Some examples will help to clarify the distinction. The primitive list types of ML are persistent because the performance of an operation such as cons’ing, appending, or reversing a list does not destroy the original list. This leads naturally to the idea of multiple logical futures for a given value, as illustrated by the following code sequence:

    (* original list *)
    val l = [1,2,3]
    (* first future of l *)
    val m1 = tl l
    val n1 = rev m1
    (* second future of l *)
    val m2 = l @ [4,5,6]

Notice that the original list value, [1,2,3], has two distinct logical futures, one in which we remove its head, then reverse the tail, and the other in which we append the list [4,5,6] to it. The ability to easily handle multiple logical futures for a data structure is a tremendous source of flexibility and expressive power, alleviating the need to perform tedious bookkeeping to manage “versions” or “copies” of a data structure to be passed to different operations.

The prototypical ephemeral data structure in ML is the reference cell. Performing an assignment operation on a reference cell changes it irrevocably; the original contents of the cell are lost, even if we keep a handle on it.

    val r = ref 0     (* original cell *)
    val s = r
    val _ = (s := 1)
    val x = !r        (* 1! *)

Notice that the contents of (the cell bound to) r changes as a result of performing an assignment to the underlying cell. There is only one future for this cell; a reference to its original binding does not yield its original contents.
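The contrast can be observed directly. The following small sketch, with variable names chosen only for illustration, checks that a list is unchanged after two different futures have been computed from it, while a reference cell reached through an alias has lost its earlier contents unless they were copied out beforehand.

```sml
(* Persistent: both futures start from the same list, which is unchanged. *)
val l = [1, 2, 3]
val future1 = rev (tl l)          (* remove the head, reverse the tail *)
val future2 = l @ [4, 5, 6]       (* append to the original *)
val unchanged = (l = [1, 2, 3])

(* Ephemeral: assigning through the alias s overwrites the contents of r. *)
val r = ref 0
val s = r
val old = !r                      (* copy the contents out before assigning *)
val () = s := 1
val new = !r                      (* reads the updated contents *)
val same_cell = (r = s)           (* r and s denote one and the same cell *)
```

The binding old survives only because it holds a copy of the integer, not the cell; the cell itself has a single future.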
Second. it is possible to implement a persistent data structure that exploits mutable storage. when moving elements onto the accumulator argument.e. For example. here’s an implementation of a destructive reversal of a mutable list. Given a mutable list l. it is possible for a persistent data type R EVISED 11. For example. (r := a. First. this function reverses the links in the cell so that the elements occur in reverse order of their occurrence in l.11 D RAFT V ERSION 1. Such a use of mutation is an example of what is called a benign effect because for all practical purposes the data structure is “purely functional” (i. persistent). a) = ipr (next.2 . However. this)) in (* destructively reverse a list *) fun inplace reverse l = ipr (l. As we will see later the exploitation of benign effects is crucial for building efﬁcient implementations of persistent data structures.. the code is quite tricky to understand! The idea is the same as the iterative reverse function for pure lists. one whose predecessor relation may be changed dynamically by assignment: datatype ’a mutable list = Nil  Cons of ’a * ’a mutable list ref Values of this type are ephemeral in the sense that some operations on values of this type are destructive. rather than generate new ones. this characterization is oversimpliﬁed in two respects. the following declaration deﬁnes a type of lists whose tails are mutable. imperative data structures are ephemeral. r as ref next)). Nil) end As you can see. The distinction between ephemeral and persistent data structures is essentially the distinction between functional (effectfree) and imperative (effectful) programming — functional data structures are persistent. and hence are irreversible (so to speak!). It is therefore a singlylinked list.02. a) = a  ipr (this as (Cons ( . except that we reuse the nodes of the original list. local fun ipr (Nil.231 More elaborate forms of ephemeral data structures are certainly possible. 
but is in fact implemented using mutable storage.
Here is an example of a sequence of queue operations: val q0 : int queue = empty R EVISED 11.1 Persistent Queues Here is a signature of persistent queues: signature QUEUE = sig type ’a queue exception Empty val empty : ’a queue val insert : ’a * ’a queue > ’a queue val remove : ’a queue > ’a * ’a queue end This signature describes a structure providing a representation type for queues.11 D RAFT V ERSION 1. Notice that removing an element from a queue yields both the element at the front of the queue. Such a type is said to be singlethreaded. and remove operations in such a way that the queue argument of one operation is obtained as a result of the immediately preceding queue operation. the original queue remains available as an argument to further queue operations. insert an element onto the back of the queue. 28. every value of the type has at most one future in the program. This is a direct reﬂection of the persistence of queues implemented by this signature.g.2 . and to remove an element from the front of the queue.28. reﬂecting the linear. It also provides an exception that is raised in response to an attempt to remove an element from the empty queue.. By a sequence of queue operations we shall mean a succession of uses of empty. Thus a sequence of queue operations represents a singlethreaded timeline in the life of a queue value. as opposed to branching. structure of the future uses of values of that type. The signiﬁcance of a singlethreaded type is that it may as well have been implemented as an ephemeral data structure (e. insert. together with operations to create an empty queue.02.1 Persistent Queues 232 to be used in such a way that persistence is not exploited — rather. by having observable effects on values) without changing the behavior of the program. and the queue resulting from removing that element.
    val q1 = insert (1, q0)
    val q2 = insert (2, q1)
    val (h1, q3) = remove q2    (* h1 = 1, q3 contains only 2 *)
    val (h2, q4) = remove q3    (* h2 = 2, q4 = q0 *)

By contrast the following operations do not form a single thread, but rather a branching development of the queue's lifetime:

    val q0 : int queue = empty
    val q1 = insert (1, q0)
    val q2 = insert (2, q0)     (* NB: q0, not q1! *)
    val (h1, q3) = remove q1    (* h1 = 1, q3 = q0 *)
    val (h2, q4) = remove q3    (* raise Empty *)
    val (h2, q4) = remove q2    (* h2 = 2, q4 = q0 *)

In the remainder of this section we will be concerned with single-threaded sequences of queue operations.

How might we implement the signature QUEUE? The most obvious approach is to represent the queue as a list with, say, the head element of the list representing the “back” (most recently enqueued element) of the queue. With this representation enqueueing is a constant-time operation, but dequeuing requires time proportional to the number of elements in the queue. Thus in the worst case a sequence of n enqueue and dequeue operations will take time O(n^2), which is clearly excessive. We can make dequeue simpler, at the expense of enqueue, by regarding the head of the list as the “front” of the queue, but the time bound for n operations remains the same in the worst case.

Can we do better? A well-known “trick” achieves an O(n) worst-case performance for any sequence of n operations, which means that each operation takes O(1) steps if we amortize the cost over the entire sequence. Notice that this is a worst-case bound for the sequence, yielding an amortized bound for each operation of the sequence. This means that some operations may be relatively expensive, but, in compensation, many operations will be cheap.

How is this achieved? By combining the two naive solutions sketched above. The idea is to represent the queue by two lists, one for the “back half” consisting of recently inserted elements in the order of arrival, and one for the “front half” consisting of soon-to-be-removed elements in reverse order of arrival (i.e., in order of removal). We speak loosely of the “halves” of the queue, because we will not, in general, maintain an even split of elements between the front and the back lists. Rather, we will arrange things so that the following representation invariants hold true:

1. The elements of the queue listed in order of removal are the elements of the front followed by the elements of the back in reverse order.

2. The front is empty only if the back is empty.

These invariants are maintained by using a “smart constructor” that creates a queue from two lists representing the back and front parts of the queue. This constructor ensures that the representation invariant holds by ensuring that condition (2) is always true of the resulting queue. The constructor proceeds by a case analysis on the back and front parts of the queue. If the front is non-empty, or both the front and back are empty, the resulting queue consists of the back and front parts as given. If the front is empty and the back is non-empty, the queue constructor yields the queue consisting of an empty back part and a front part equal to the reversal of the given back part. Observe that this is sufficient to ensure that the representation invariant holds of the resulting queue in all cases. Observe also that the smart constructor either runs in constant time, or in time proportional to the length of the back part, according to whether the front part is empty or not.

Insertion of an element into a queue is achieved by cons'ing the element onto the back of the queue, then calling the queue constructor to ensure that the result is in conformance with the representation invariant. Thus an insert can either take constant time, or time proportional to the size of the back of the queue, depending on whether the front part is empty. Removal of an element from a queue requires a case analysis. If the front is empty, then by condition (2) the queue is empty, so we raise an exception. If the front is non-empty, we simply return the head element together with the queue created from the original back part and the front part with the head element removed. Here again the time required is either constant or proportional to the size of the back of the queue, according to whether the front part becomes empty after the removal. Notice that if an insertion or removal requires a reversal of k elements, then the next k operations are constant-time. This is the fundamental insight as to why we achieve O(n) time complexity over any sequence of n operations. (We will give a more rigorous analysis shortly.)
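Before turning to the implementation, it may help to see the two representation invariants stated as code over raw (back, front) pairs. The following is only a sketch: the names rep, contents, inv2, after_inserts, after_remove, and ill_formed are hypothetical and are not part of the QUEUE signature.

```sml
(* a queue representation: (back, front), as described in the text *)
type 'a rep = 'a list * 'a list

(* invariant (1): the order of removal is the front followed by the reversed back *)
fun contents ((back, front) : 'a rep) : 'a list = front @ rev back

(* invariant (2): the front is empty only if the back is empty *)
fun inv2 ((back, front) : 'a rep) : bool = not (null front) orelse null back

(* after inserting 1, 2, 3 into the empty queue: 1 is at the front *)
val after_inserts : int rep = ([3, 2], [1])

(* after removing 1: the front became empty, so the back was reversed onto it *)
val after_remove : int rep = (nil, [2, 3])

(* a pair that violates invariant (2): elements on the back, none on the front *)
val ill_formed : int rep = ([2, 1], nil)
```

Here contents after_inserts is [1, 2, 3] and contents after_remove is [2, 3]; inv2 holds of both, but fails for ill_formed, which is exactly the shape that the smart constructor repairs by reversing the back onto the front.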
Here's the implementation of this idea in ML:

    structure Queue :> QUEUE =
    struct
      type 'a queue = 'a list * 'a list
      fun make_queue (q as (nil, nil)) = q
        | make_queue (bs, nil) = (nil, rev bs)
        | make_queue (q as (bs, fs)) = q
      (* written as a value, since it satisfies the invariant directly;
         an application of make_queue here would not be polymorphic
         under the value restriction *)
      val empty = (nil, nil)
      fun insert (x, (back,front)) = make_queue (x::back, front)
      exception Empty
      fun remove (_, nil) = raise Empty
        | remove (bs, f::fs) = (f, make_queue (bs, fs))
    end

Notice that we call the “smart constructor” make_queue whenever we wish to return a queue to ensure that the representation invariant holds. Consequently, some queue operations are more expensive than others, according to whether or not the queue needs to be reorganized to satisfy the representation invariant. However, each such reorganization makes a corresponding number of subsequent queue operations “cheap” (constant-time), so the overall effort required evens out in the end to constant-time per operation. More precisely, the running time of a sequence of n queue operations is now O(n), rather than O(n^2), as it was in the naive implementation. Consequently, each operation takes O(1) (constant) time “on average”, i.e., when the total effort is evenly apportioned among each of the operations in the sequence. Note that this is a worst-case time bound for each operation, amortized over the entire sequence, not an average-case time bound based on assumptions about the distribution of the operations.

28.2 Amortized Analysis

How can we prove this claim? First we give an informal argument, then we tighten it up with a more rigorous analysis. We are to account for the total work performed by a sequence of n operations by showing that any sequence of n operations can be executed in cn steps for some constant c. Dividing by n, we obtain the result that each operation takes c steps when amortized over the entire sequence. The key is to observe first that the work required to execute a sequence of queue operations may be apportioned to the elements themselves, then that only a constant amount of work is expended on each element. The “life” of a queue element may be divided into three stages: its arrival in the queue, its transit time in the queue, and its departure from the queue. In the worst case each element passes through each of these stages (but may “die young”, never participating in the second or third stage). Arrival requires constant time to add the element to the back of the queue. Transit consists of being moved from the back to the front by a reversal, which takes constant time per element on the back. Departure takes constant time to pattern match and extract the element. Thus at worst we require three steps per element to account for the entire effort expended to perform a sequence of queue operations. This is in fact a conservative upper bound in the sense that we may need less than 3n steps for the sequence, but asymptotically the bound is optimal: we cannot do better than constant time per operation! (You might reasonably wonder whether there is a worst-case, non-amortized constant-time implementation of persistent queues. The answer is “yes”, but the code is far more complicated than the simple implementation we are sketching here.)

This argument can be made rigorous as follows. The general idea is to introduce the notion of a charge scheme that provides an upper bound on the actual cost of executing a sequence of operations. An upper bound on the charge will then provide an upper bound on the actual cost. Let T(n) be the cumulative time required (in the worst case) to execute a sequence of n queue operations. We will introduce a charge function, C(n), representing the cumulative charge for executing a sequence of n operations and show that T(n) ≤ C(n) = O(n). It is convenient to express this in terms of a function R(n) = C(n) − T(n) representing the cumulative residual, or overcharge, which is the amount by which the charge for n operations exceeds the actual cost of executing them. We will arrange things so that R(n) ≥ 0 and that C(n) = O(n), from which the result follows immediately.

Down to specifics. By charging 2 for each insert operation and 1 for each remove, it follows that C(n) ≤ 2n for any sequence of n inserts and removes. Thus C(n) = O(n). After any sequence of n ≥ 0 operations has been performed, the queue contains 0 ≤ b ≤ n elements on the back “half” and 0 ≤ f ≤ n elements on the front “half”. We claim that for every n ≥ 0, R(n) = b. We prove this by induction on n ≥ 0.

The condition clearly holds after performing 0 operations, since T(0) = 0, C(0) = 0, and hence R(0) = C(0) − T(0) = 0.

Consider the (n+1)st operation. If it is an insert, and f > 0, then T(n + 1) = T(n) + 1, C(n + 1) = C(n) + 2, and hence R(n + 1) = R(n) + 1 = b + 1. This is correct because an insert operation adds one element to the back of the queue. If, on the other hand, f = 0, then T(n + 1) = T(n) + b + 2 (charging one for the cons and one for creating the new pair of lists), C(n + 1) = C(n) + 2, so R(n + 1) = R(n) + 2 − b − 2 = b + 2 − b − 2 = 0. This is correct because the back is now empty; we have used the residual overcharge to pay for the cost of the reversal.

If the (n+1)st operation is a remove, and f > 0, then T(n + 1) = T(n) + 1 and C(n + 1) = C(n) + 1 and hence R(n + 1) = R(n) = b. This is correct because the remove doesn't disturb the back in this case. Finally, if we are performing a remove with f = 0, then T(n + 1) = T(n) + b + 1, C(n + 1) = C(n) + 1, and hence R(n + 1) = R(n) − b = b − b = 0. Here again we use the residual overcharge to pay for the reversal of the back to the front. The result follows immediately since R(n) = b ≥ 0, and hence C(n) ≥ T(n).

It is instructive to examine where this solution breaks down in the multi-threaded case (i.e., where persistence is fully exploited). Suppose that we perform a sequence of n insert operations on the empty queue, resulting in a queue with n elements on the back and none on the front. Call this queue q. Let us suppose that we have n independent “futures” for q, each of which removes an element from it, for a total of 2n operations. How much time do these 2n operations take? Since each independent future must reverse all n elements onto the front of the queue before performing the removal, the entire collection of 2n operations takes n + n^2 steps, or O(n) steps per operation, breaking the amortized constant-time bound just derived for a single-threaded sequence of queue operations. Can we recover a constant-time amortized cost in the persistent case? We can, provided that we share the cost of the reversal among all futures of q: as soon as one performs the reversal, they all enjoy the benefit of its having been done. This may be achieved by using a benign side effect to cache the result of the reversal in a reference cell that is shared among all uses of the queue. We will return to this once we introduce memoization and lazy evaluation.
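The difference between the single-threaded and the branching case can be observed experimentally. The following sketch instruments a copy of the two-list representation with a step counter; the counter steps, the helpers tick, counted_rev, build, and drain, and the cost model (one tick per cons or pattern match, one tick per element reversed) are all our own assumptions for illustration, not part of the Queue structure.

```sml
val steps = ref 0
fun tick k = steps := !steps + k

(* reverse a list, charging one step per element moved *)
fun counted_rev l = (tick (length l); rev l)

(* the smart constructor: reverse the back onto an empty front *)
fun make_queue (bs, nil) = (nil, counted_rev bs)
  | make_queue q = q

exception Empty
fun insert (x, (back, front)) = (tick 1; make_queue (x :: back, front))
fun remove (_, nil) = raise Empty
  | remove (bs, f :: fs) = (tick 1; (f, make_queue (bs, fs)))

(* insert 1, ..., n into the empty queue *)
fun build n =
  let fun loop (i, q) = if i > n then q else loop (i + 1, insert (i, q))
  in loop (1, (nil, nil)) end

(* remove elements until the queue is empty *)
fun drain (_, nil) = ()
  | drain q = drain (#2 (remove q))

val n = 100

(* one single-threaded timeline: n inserts followed by n removes *)
val () = steps := 0
val () = drain (build n)
val single_threaded = !steps        (* 300 = 3n *)

(* n independent futures of the same queue q, one remove each *)
val q = build n
val () = steps := 0
val () = List.app (fn _ => ignore (remove q)) (List.tabulate (n, fn i => i))
val branching = !steps              (* 10000 = n * n *)
```

Under this cost model the single-threaded run spends three ticks per element, as in the informal argument. In the branching run, q has n − 1 elements on the back and one on the front, so each of the n independent removals pays for the same reversal again, and the removals alone cost n * n ticks.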
28.3 Sample Code

Here is the code for this chapter.
Chapter 29 Options, Exceptions, and Continuations

In this chapter we discuss the close relationships between option types, exceptions, and continuations. They each provide the means for handling failure to produce a value in a computation. Option types provide the means of explicitly indicating in the type of a function the possibility that it may fail to yield a “normal” result. The result type of the function forces the caller to dispatch explicitly on whether or not it returned a normal value. Exceptions provide the means of implicitly signalling failure to return a normal result value, without sacrificing the requirement that an application of such a function cannot ignore failure to yield a value. Continuations provide another means of handling failure by providing a function to invoke in the case that normal return is impossible.

29.1 The n-Queens Problem

We will explore the trade-offs between these three approaches by considering three different implementations of the n-queens problem: find a way to place n queens on an n × n chessboard in such a way that no two queens attack one another. The general strategy is to place queens in successive columns in such a way that each is not attacked by a previously placed queen. Unfortunately it's not possible to do this in one pass; we may find that we can safely place k < n queens on the board, only to discover that there is no way to place the next one. To find a solution we must reconsider earlier decisions, and work forward from there. If all possible reconsiderations of all previous decisions lead to failure, then the problem is unsolvable. For example, there is no safe placement of three queens on a 3x3 chessboard. This trial-and-error approach to solving the n-queens problem is called backtracking search.

A solution to the n-queens problem consists of an n × n chessboard with n queens safely placed on it. The following signature defines a chessboard abstraction:

    signature BOARD =
    sig
      type board
      val new : int -> board
      val complete : board -> bool
      val place : board * int -> board
      val safe : board * int -> bool
      val size : board -> int
      val positions : board -> (int * int) list
    end

The operation new creates a new board of a given dimension n ≥ 0. The operation complete checks whether the board contains a complete safe placement of n queens. The function safe checks whether it is safe to place a queen at row i in the next free column of a board B. The operation place puts a queen at row i in the next available column of the board. The function size returns the size of a board, and the function positions returns the coordinates of the queens on the board.

The board abstraction may be implemented as follows:

    structure Board :> BOARD =
    struct
      (* rep: size, next free column, number placed, placements
         inv: size>=0, 1<=next free<=size+1,
              length(placements) = number placed *)
      type board = int * int * int * (int * int) list
      fun new n = (n, 1, 0, nil)
      fun size (n, _, _, _) = n
      fun complete (n, _, k, _) = (k=n)
      fun positions (_, _, _, qs) = qs
      fun place ((n, i, k, qs),j) = (n, i+1, k+1, (i,j)::qs)
      fun threatens ((i,j), (i',j')) =
          i=i' orelse j=j' orelse i+j = i'+j' orelse i-j = i'-j'
      fun conflicts (q, nil) = false
        | conflicts (q, q'::qs) =
            threatens (q, q') orelse conflicts (q, qs)
      fun safe ((_, i, _, qs), j) = not (conflicts ((i,j), qs))
    end

The representation type contains “redundant” information in order to make the individual operations more efficient. The representation invariant ensures that the components of the representation are properly related to one another (e.g., the claimed number of placements is indeed the length of the list of placed queens, and so on).

Our goal is to define a function

    val queens : int -> Board.board option

such that if n ≥ 0, then queens n evaluates either to NONE if there is no safe placement of n queens on an n × n board, or to SOME B otherwise, with B a complete board containing a safe placement of n queens. We will consider three different solutions, one using option types, one using exceptions, and one using a failure continuation.

29.2 Solution Using Options

Here's a solution based on option types:

    (* addqueen bd yields SOME bd', with bd' a complete safe
       placement extending bd, if one exists, and yields NONE
       otherwise *)
    fun addqueen bd =
        let
            fun try j =
                if j > Board.size bd then
                    NONE
                else if Board.safe (bd, j) then
                    (case addqueen (Board.place (bd, j))
                       of NONE => try (j+1)
                        | r as (SOME bd') => r)
                else
                    try (j+1)
        in
            if Board.complete bd then
                SOME bd
            else
                try 1
        end
    fun queens n = addqueen (Board.new n)

The characteristic feature of this solution is that we must explicitly check the result of each recursive call to addqueen to determine whether a safe placement is possible from that position. If so, we simply return it; if not, we must reconsider the placement of a queen in row j of the next available column. If no placement is possible in the current column, the function yields NONE, which forces reconsideration of the placement of a queen in the preceding row. Eventually we either find a safe placement, or yield NONE indicating that no solution is possible.

29.3 Solution Using Exceptions

The explicit check on the result of each recursive call can be replaced by the use of exceptions. Rather than have addqueen return a value of type Board.board option, we instead have it return a value of type Board.board, if possible, and otherwise raise an exception indicating failure. The case analysis on the result is replaced by a use of an exception handler. Here's the code:
    exception Fail
    (* addqueen bd yields bd', where bd' is a complete safe
       placement extending bd, if one exists, and raises Fail
       otherwise *)
    fun addqueen bd =
        let
            fun try j =
                if j > Board.size bd then
                    raise Fail
                else if Board.safe (bd, j) then
                    (addqueen (Board.place (bd, j))
                     handle Fail => try (j+1))
                else
                    try (j+1)
        in
            if Board.complete bd then
                bd
            else
                try 1
        end
    fun queens n =
        SOME (addqueen (Board.new n))
        handle Fail => NONE

The main difference between this solution and the previous one is that both calls to addqueen must handle the possibility that it raises the exception Fail. In the outermost call this corresponds to a complete failure to find a safe placement, which means that queens must return NONE. If a safe placement is indeed found, it is wrapped with the constructor SOME to indicate success. In the recursive call within try, an exception handler is required to handle the possibility of there being no safe placement starting in the current position. This check corresponds directly to the case analysis required in the solution based on option types.

What are the trade-offs between the two solutions?

1. The solution based on option types makes explicit in the type of the function addqueen the possibility of failure. This forces the programmer to explicitly test for failure using a case analysis on the result of the call. The type checker will ensure that one cannot use a Board.board option where a Board.board is expected. The solution based on exceptions does not explicitly indicate failure in its type. However, the programmer is nevertheless forced to handle the failure, for otherwise an uncaught exception error would be raised at run-time, rather than compile-time.

2. The solution based on option types requires an explicit case analysis on the result of each recursive call. If “most” results are successful, the check is redundant and therefore excessively costly. The solution based on exceptions is free of this overhead: it is biased towards the “normal” case of returning a board, rather than the “failure” case of not returning a board at all. The implementation of exceptions ensures that the use of a handler is more efficient than an explicit case analysis in the case that failure is rare compared to success.

For the n-queens problem it is not clear which solution is preferable. In general, if efficiency is paramount, we tend to prefer exceptions if failure is a rarity, and to prefer options if failure is relatively common. If, on the other hand, static checking is paramount, then it is advantageous to use options since the type checker will enforce the requirement that the programmer check for failure, rather than having the error arise only at run-time.

29.4 Solution Using Continuations

We turn now to a third solution based on continuation-passing. The idea is quite simple: an exception handler is essentially a function that we invoke when we reach a blind alley. Ordinarily we achieve this invocation by raising an exception and relying on the caller to catch it and pass control to the handler. But we can, if we wish, pass the handler around as an additional argument, the failure continuation of the computation. Here's how it's done in the case of the n-queens problem:

    (* addqueen (bd, fc) yields SOME bd', where bd' is a complete
       safe placement extending bd, if one exists, and otherwise
       yields the value of fc () *)
    fun addqueen (bd, fc) =
        let
            fun try j =
                if j > Board.size bd then
                    fc ()
                else if Board.safe (bd, j) then
                    addqueen
                      (Board.place (bd, j),
                       fn () => try (j+1))
                else
                    try (j+1)
        in
            if Board.complete bd then
                SOME bd
            else
                try 1
        end
    fun queens n =
        addqueen (Board.new n, fn () => NONE)

Here again the differences are small, but significant. The initial continuation simply yields NONE, reflecting the ultimate failure to find a safe placement. On a recursive call we pass to addqueen a continuation that resumes search at the next row of the current column. Should we exceed the number of rows on the board, we invoke the failure continuation of the most recent call to addqueen.

The solution based on continuations is very close to the solution based on exceptions, both in form and in terms of efficiency. Which is preferable? Here again there is no easy answer; we can only offer general advice. First off, as we've seen in the case of regular expression matching, failure continuations are more powerful than exceptions; there is no obvious way to replace the use of a failure continuation with a use of exceptions in the matcher. However, in the case that exceptions would suffice, it is generally preferable to use them since one may then avoid passing an explicit failure continuation. More significantly, the compiler ensures that an uncaught exception aborts the program gracefully, whereas failure to invoke a continuation is not in itself a run-time fault. Using the right tool for the right job makes life easier.
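To compare the three styles without the Board structure, here is a sketch of the same pattern on a smaller backtracking problem of our own choosing: select elements of a list of integers, in order, whose sum is a given target. The problem and the names pick_opt, pick_exn, pick_k, and pick_exn_opt are hypothetical and serve only to put the three forms of failure handling side by side.

```sml
(* 1. Options: every recursive result is inspected by a case analysis. *)
fun pick_opt (nil, 0) = SOME nil
  | pick_opt (nil, _) = NONE
  | pick_opt (x :: xs, t) =
      (case pick_opt (xs, t - x)            (* first try taking x *)
         of SOME ys => SOME (x :: ys)
          | NONE => pick_opt (xs, t))       (* otherwise skip x *)

(* 2. Exceptions: failure is signalled by raising Fail and caught by a handler. *)
exception Fail
fun pick_exn (nil, 0) = nil
  | pick_exn (nil, _) = raise Fail
  | pick_exn (x :: xs, t) =
      (x :: pick_exn (xs, t - x)) handle Fail => pick_exn (xs, t)

(* the outermost call converts the exception into an option *)
fun pick_exn_opt (xs, t) = SOME (pick_exn (xs, t)) handle Fail => NONE

(* 3. Failure continuation: fc is invoked when we reach a blind alley;
   acc holds the elements chosen so far, most recent first. *)
fun pick_k (nil, 0, acc, fc) = SOME (rev acc)
  | pick_k (nil, _, acc, fc) = fc ()
  | pick_k (x :: xs, t, acc, fc) =
      pick_k (xs, t - x, x :: acc, fn () => pick_k (xs, t, acc, fc))

val r1 = pick_opt ([3, 5, 7, 2], 9)                          (* SOME [7, 2] *)
val r2 = pick_exn_opt ([3, 5, 7, 2], 9)                      (* SOME [7, 2] *)
val r3 = pick_k ([3, 5, 7, 2], 9, nil, fn () => NONE)        (* SOME [7, 2] *)
val r4 = pick_k ([2, 4], 5, nil, fn () => NONE)              (* NONE *)
```

All three search the alternatives in the same order, so they find the same answer. As with addqueen, the handler in pick_exn and the continuation passed by pick_k play the role of the NONE branch of the case analysis in pick_opt.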
29.5 Sample Code

Here is the code for this chapter.
Chapter 30 Higher-Order Functions

Higher-order functions, those that take functions as arguments or return functions as results, are powerful tools for building programs. An interesting application of higher-order functions is to implement infinite sequences of values as (total) functions from the natural numbers (non-negative integers) to the type of values of the sequence. We will develop a small package of operations for creating and manipulating sequences, all of which are higher-order functions since they take sequences (functions!) as arguments and/or return them as results. A natural way to define many sequences is by recursion, or self-reference. Since sequences are functions, we may use recursive function definitions to define such sequences. Alternatively, we may think of such a sequence as arising from a “loopback” or “feedback” construct. We will explore both approaches.

Sequences may be used to simulate digital circuits by thinking of a “wire” as a sequence of bits developing over time. The ith value of the sequence corresponds to the signal on the wire at time i. For simplicity we will assume a perfect waveform: the signal is always either high or low (or is undefined); we will not attempt to model electronic effects such as attenuation or noise. Combinational logic elements (such as and gates or inverters) are operations on wires: they take in one or more wires as input and yield one or more wires as results. Digital logic elements (such as flip-flops) are obtained from combinational logic elements by feedback, or recursion: a flip-flop is a recursively-defined wire!
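As a first taste of the idea, here is a minimal sketch in which sequences are written directly as functions on indices. The names squares, first, square_wave, and inverter are hypothetical, and booleans stand in for signal levels only for the purposes of this preview; List.tabulate is from the Standard ML Basis Library.

```sml
(* an infinite sequence is a function from indices 0, 1, 2, ... to values *)
val squares : int -> int = fn i => i * i

(* the first n values of a sequence, as a list *)
fun first n s = List.tabulate (n, s)

(* a "wire" as a sequence of booleans: true = high, false = low *)
val square_wave : int -> bool = fn i => i mod 2 = 1

(* an inverter is an operation on wires, hence a higher-order function *)
fun inverter (w : int -> bool) : int -> bool = fn i => not (w i)

val some_squares = first 5 squares                  (* [0, 1, 4, 9, 16] *)
val inverted = first 4 (inverter square_wave)       (* [true, false, true, false] *)
```

Nothing is computed until a sequence is applied to an index, which is what allows a function to stand for an infinite sequence.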
30.1 Infinite Sequences

Let us begin by developing a sequence package. Here is a suitable signature defining the type of sequences:

    signature SEQUENCE =
    sig
      type 'a seq = int -> 'a
      (* constant sequence *)
      val constantly : 'a -> 'a seq
      (* alternating values *)
      val alternately : 'a * 'a -> 'a seq
      (* insert at front *)
      val insert : 'a * 'a seq -> 'a seq
      val map : ('a -> 'b) -> 'a seq -> 'b seq
      val zip : 'a seq * 'b seq -> ('a * 'b) seq
      val unzip : ('a * 'b) seq -> 'a seq * 'b seq
      (* fair merge *)
      val merge : 'a seq * 'a seq -> 'a seq
      val stretch : int -> 'a seq -> 'a seq
      val shrink : int -> 'a seq -> 'a seq
      val take : int -> 'a seq -> 'a list
      val drop : int -> 'a seq -> 'a seq
      val shift : 'a seq -> 'a seq
      val loopback : ('a seq -> 'a seq) -> 'a seq
    end

Observe that we expose the representation of sequences as functions. This is done to simplify the definition of recursive sequences as recursive functions. Alternatively we could have hidden the representation type, at the expense of making it a bit more awkward to define recursive sequences. In the absence of this exposure of representation, recursive sequences may only be built using the loopback operation which constructs a recursive sequence by “looping back” the output of a sequence transformer to its input. Most of the other operations of the signature are adaptations of familiar operations on lists. Two exceptions to this rule are the functions stretch and shrink that dilate and contract the sequence by a given time parameter: if a sequence is expanded by k, its value at i is the value of the original sequence at i/k, and dually for shrinking.

Here's an implementation of sequences as functions.

    structure Sequence :> SEQUENCE =
    struct
      type 'a seq = int -> 'a
      fun constantly c n = c
      fun alternately (c,d) n =
          if n mod 2 = 0 then c else d
      fun insert (x, s) 0 = x
        | insert (x, s) n = s (n-1)
      fun map f s = f o s
      fun zip (s1, s2) n = (s1 n, s2 n)
      fun unzip (s : ('a * 'b) seq) = (map #1 s, map #2 s)
      fun merge (s1, s2) n =
          (if n mod 2 = 0 then s1 else s2) (n div 2)
      fun stretch k s n = s (n div k)
      fun shrink k s n = s (n * k)
      fun drop k s n = s (n+k)
      fun shift s = drop 1 s
      fun take 0 _ = nil
        | take n s = s 0 :: take (n-1) (shift s)
      fun loopback loop n = loop (loopback loop) n
    end

Most of this implementation is entirely straightforward, given the ease with which we may manipulate higher-order functions in ML. The only tricky function is loopback, which must arrange that the output of the function loop is “looped back” to its input. This is achieved by a simple recursive definition of a sequence whose value at n is the value at n of the sequence resulting from applying the loop to this very sequence.

The sensibility of this definition of loopback relies on two separate ideas. First, notice that we may not simplify the definition of loopback as follows:
    (* bad definition *)
    fun loopback loop = loop (loopback loop)

The reason is that any application of loopback will immediately loop forever! In contrast, the original definition is arranged so that application of loopback immediately returns a function. This may be made more apparent by writing it in the following form, which is entirely equivalent to the definition given above:

    fun loopback loop = fn n => loop (loopback loop) n

This format makes it clear that loopback immediately returns a function when applied to a loop functional.

Second, for an application of loopback to a loop to make sense, it must be the case that the loop returns a sequence without “touching” the argument sequence (i.e., without applying the argument to a natural number). Otherwise accessing the sequence resulting from an application of loopback would immediately loop forever. Some examples will help to illustrate the point.

First, let's build a few sequences without using the loopback function, just to get familiar with using sequences:

    val evens : int seq = fn n => 2*n
    val odds : int seq = fn n => 2*n+1
    val nats : int seq = merge (evens, odds)
    fun fibs n =
        (insert
          (1, insert
               (1, map (op +) (zip (drop 1 fibs, fibs)))))(n)

We may “inspect” the sequence using take and drop, as follows:

    take 10 nats             (* [0,1,2,3,4,5,6,7,8,9] *)
    take 5 (drop 5 nats)     (* [5,6,7,8,9] *)
    take 5 fibs              (* [1,1,2,3,5] *)

Now let's consider an alternative definition of fibs that uses the loopback operation:
    fun fibs_loop s = insert (1, insert (1, map (op +) (zip (drop 1 s, s))))
    val fibs = loopback fibs_loop

The definition of fibs_loop is exactly like the original definition of fibs, except that the reference to fibs itself is replaced by a reference to the argument s. Notice that the application of fibs_loop to an argument s does not inspect the argument s!

One way to understand loopback is that it solves a system of equations for an unknown sequence. In the case of the second definition of fibs, we are solving the following system of equations for f:

    f 0 = 1
    f 1 = 1
    f (n+2) = f (n+1) + f (n)

These equations are derived by inspecting the definitions of insert, map, zip, and drop given earlier. It is obvious that the solution is the Fibonacci sequence; this is precisely the sequence obtained by applying loopback to fibs_loop.

Here's an example of a loop that, when looped back, yields an undefined sequence: any attempt to access it results in an infinite loop.

    fun bad_loop s n = s n + 1
    val bad = loopback bad_loop
    val _ = bad 0   (* infinite loop! *)

In this example we are, in effect, trying to solve the equation s n = s n + 1 for s, which has no solution (except the totally undefined sequence). The problem is that the “next” element of the output is defined in terms of the next element itself, rather than in terms of “previous” elements. Consequently, no solution exists.

30.2 Circuit Simulation

With these ideas in mind, we may apply the sequence package to build an implementation of digital circuits. Let's start with wires, which are represented as sequences of levels:
    datatype level = High | Low | Undef
    type wire = level seq
    type pair = (level * level) seq

    val Zero : wire = constantly Low
    val One : wire = constantly High

    (* clock pulse with given duration of each pulse *)
    fun clock (freq:int):wire = stretch freq (alternately (Low, High))

We include the “undefined” level to account for propagation delays and settling times in circuit elements.

Combinational logic elements (gates) may be defined as follows. We introduce an explicit unit time propagation delay for each gate: the output is undefined initially, and is then determined as a function of its inputs. As we build up layers of circuit elements, it takes longer and longer (proportional to the length of the longest path through the circuit) for the output to settle, exactly as in “real life”.

    (* apply two functions in parallel *)
    infixr **;
    fun (f ** g) (x, y) = (f x, g y)

    (* hardware logical and *)
    fun logical_and (Low, _) = Low
      | logical_and (_, Low) = Low
      | logical_and (High, High) = High
      | logical_and _ = Undef

    fun logical_not Undef = Undef
      | logical_not High = Low
      | logical_not Low = High

    fun logical_nop l = l

    (* a nor b = not a and not b *)
    val logical_nor = logical_and o (logical_not ** logical_not)

    type unary_gate = wire -> wire
    type binary_gate = pair -> wire

    (* logic gate with unit propagation delay *)
    fun gate f w 0 = Undef
      | gate f w i = f (w (i-1))

    val delay : unary_gate = gate logical_nop   (* unit delay *)
    val inverter : unary_gate = gate logical_not
    val nor_gate : binary_gate = gate logical_nor

It is a good exercise to build a one-bit adder out of these elements, then to string them together to form an n-bit ripple-carry adder. Be sure to present the inputs to the adder with sufficient pulse widths to ensure that the circuit has time to settle!

Combining these basic logic elements with recursive definitions allows us to define digital logic elements such as the RS flip-flop. The propagation delay inherent in our definition of a gate is fundamental to ensuring that the behavior of the flip-flop is well-defined! This is consistent with “real life”: flip-flops depend on the existence of a hardware propagation delay for their proper functioning. Note also that presentation of “illegal” inputs (such as setting both the R and the S leads high) results in metastable behavior of the flip-flop, here as in real life. Finally, observe that the flip-flop exhibits a momentary “glitch” in its output before settling, exactly as in hardware. (All of these behaviors may be observed by using take and drop to inspect the values on the circuit.)

    fun RS_ff (S : wire, R : wire) =
      let
        fun X n = nor_gate (zip (S, Y))(n)
        and Y n = nor_gate (zip (X, R))(n)
      in
        Y
      end

    (* generate a pulse of b's n wide, followed by w *)
    fun pulse b 0 w i = w i
      | pulse b n w 0 = b
      | pulse b n w i = pulse b (n-1) w (i-1)

    val S = pulse Low 2 (pulse High 2 Zero);
    val R = pulse Low 6 (pulse High 2 Zero);
    val Q = RS_ff (S, R);
    val _ = take 20 Q;
    val X = RS_ff (S, S);   (* unstable! *)
    val _ = take 20 X;

It is a good exercise to derive a system of equations governing the RS flip-flop from the definition we've given here, using the implementation of the sequence operations given above. Observe that the delays arising from the combinational logic elements ensure that a solution exists by ensuring that the “next” element of the output refers only to the “previous” elements, and not the “current” element.

Finally, we consider a variant implementation of an RS flip-flop using the loopback operation:

    fun loopback2 (f : wire * wire -> wire * wire) =
      unzip (loopback (zip o f o unzip))

    fun RS_ff' (S : wire, R : wire) =
      let
        fun RS_loop (X, Y) =
          (nor_gate (zip (S, Y)), nor_gate (zip (X, R)))
      in
        loopback2 RS_loop
      end

Here we must define a “binary loopback” function to implement the flip-flop. This is achieved by reducing binary loopback to unary loopback by composing with zip and unzip.

30.3 Regular Expression Matching, Revisited

The regular expression matcher introduced in Chapter 1 may be elegantly reformulated using higher-order functions. The code for the matcher may be restructured to separate the high-level control structure of the matcher from the details of how that control structure is implemented.

A matcher is a function that, given a string, matches an initial segment of that string. A match combinator is a higher-order function that combines zero or more matchers to form a compound matcher with the following behavior:

1. FAIL: always fails to match.
2. NULL: always matches the null string.
3. LITERALLY c: match exactly the character c at the start of the input string, if possible, yielding the remainder of the string; otherwise fail.
4. m1 THEN m2: match m1 on a prefix of the input, and then match m2 on the remaining suffix.
5. m1 OR m2: match m1 on a prefix of the input; if that fails, match m2 on a prefix of the same input.
6. REPEATEDLY m: match m zero or more times on a prefix of the input.

Using these combinators we may present the regular expression matcher in a particularly concise form:

    fun match Zero = FAIL
      | match One = NULL
      | match (Char c) = LITERALLY c
      | match (Plus (r1, r2)) = match r1 OR match r2
      | match (Times (r1, r2)) = match r1 THEN match r2
      | match (Star r) = REPEATEDLY (match r)

    fun accepts regexp =
      let
        val matcher = match regexp
        val done = fn nil => true | _ => false
      in
        fn str => matcher (String.explode str) done
      end

The auxiliary function, match, transforms each regular expression into a matcher, using the combinators just described. The definition of accepts ensures that the regular expression is analyzed once, yielding a matcher that may be applied many times to candidate strings, determining whether or not they match the given regular expression.

The definition of the function match illustrates the concept of staged computation. The function match is defined by induction on the structure of the regular expression. It transforms each regular expression into a matcher for that regular expression, using the combinators to build up the matcher by inductive analysis of the regular expression.

It remains to define the matching combinators. A matcher is a function of type
    char list -> (char list -> bool) -> bool

that is given a character list and a continuation, and yields either true or false (please see Chapter 1 to review how the matcher works). The match combinators are defined as follows:

    fun FAIL cs k = false
    fun NULL cs k = k cs
    fun LITERALLY c cs k =
      (case cs
         of nil => false
          | c'::cs' => (c=c') andalso (k cs'))
    fun OR (m1, m2) cs k = m1 cs k orelse m2 cs k
    infix 8 OR
    fun THEN (m1, m2) cs k = m1 cs (fn cs' => m2 cs' k)
    infix 9 THEN
    fun REPEATEDLY m cs k =
      let
        fun mstar cs' = k cs' orelse m cs' mstar
      in
        mstar cs
      end

The details of the control flow, which is managed through the use of continuations, are packaged into the match combinators, so that the matcher itself is as perspicuous as possible.

30.4 Sample Code

Here is the code for this chapter.
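As a small, self-contained illustration of the combinators of Section 30.3 (a sketch, not the chapter's own sample code), the following puts the pieces together and runs the matcher on a few strings. The declaration of the regexp datatype is an assumption: the text uses these constructor names, but the datatype itself is introduced in Chapter 1. The name acceptsABStarC is hypothetical.

```sml
(* Assumed datatype; constructor names follow the match function in the text. *)
datatype regexp =
    Zero | One | Char of char
  | Plus of regexp * regexp
  | Times of regexp * regexp
  | Star of regexp

(* The match combinators, in continuation-passing style. *)
fun FAIL cs k = false
fun NULL cs k = k cs
fun LITERALLY c cs k =
  (case cs
     of nil => false
      | c'::cs' => (c = c') andalso (k cs'))
fun OR (m1, m2) cs k = m1 cs k orelse m2 cs k
infix 8 OR
fun THEN (m1, m2) cs k = m1 cs (fn cs' => m2 cs' k)
infix 9 THEN
fun REPEATEDLY m cs k =
  let
    fun mstar cs' = k cs' orelse m cs' mstar
  in
    mstar cs
  end

(* Stage 1: translate a regular expression into a matcher. *)
fun match Zero = FAIL
  | match One = NULL
  | match (Char c) = LITERALLY c
  | match (Plus (r1, r2)) = match r1 OR match r2
  | match (Times (r1, r2)) = match r1 THEN match r2
  | match (Star r) = REPEATEDLY (match r)

(* Stage 2: apply the matcher to candidate strings. *)
fun accepts regexp =
  let
    val matcher = match regexp
    val done = fn nil => true | _ => false
  in
    fn str => matcher (String.explode str) done
  end

(* (a|b)*c, analyzed once and then applied several times *)
val acceptsABStarC =
  accepts (Times (Star (Plus (Char #"a", Char #"b")), Char #"c"))

val r1 = acceptsABStarC "abac"   (* true *)
val r2 = acceptsABStarC "c"      (* true *)
val r3 = acceptsABStarC "abca"   (* false *)
```

One caveat worth keeping in mind when experimenting: with this definition of REPEATEDLY, a starred expression whose body can match the empty string (for example Star One) may loop forever on inputs that do not match, so the examples above avoid that case.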
Chapter 31

Memoization

In this chapter we will discuss memoization, a programming technique for cacheing the results of previous computations so that they can be quickly retrieved without repeated effort. Memoization is fundamental to the implementation of lazy data structures, either “by hand” or using the provisions of the SML/NJ compiler.

31.1 Cacheing Results

We begin with a discussion of memoization to increase the efficiency of computing a recursively-defined function whose pattern of recursion involves a substantial amount of redundant computation. The problem is to compute the number of ways to parenthesize an expression consisting of a sequence of n multiplications as a function of n. For example, the expression 2 ∗ 3 ∗ 4 ∗ 5 can be parenthesized in 5 ways:

    ((2 ∗ 3) ∗ 4) ∗ 5, (2 ∗ (3 ∗ 4)) ∗ 5, (2 ∗ 3) ∗ (4 ∗ 5), 2 ∗ (3 ∗ (4 ∗ 5)), 2 ∗ ((3 ∗ 4) ∗ 5).

A simple recurrence expresses the number of ways of parenthesizing a sequence of n multiplications:

    fun sum f 0 = 0
      | sum f n = (f n) + sum f (n-1)

    fun p 1 = 1
      | p n = sum (fn k => (p k) * (p (n-k))) (n-1)
where sum f n computes the sum of values of a function f(k) with 1 ≤ k ≤ n. This program is extremely inefficient because of the redundancy in the pattern of the recursive calls.

What can we do about this problem? One solution is to be clever and solve the recurrence. As it happens this recurrence has a closed-form solution (the Catalan numbers). But in many cases there is no known closed form, and something else must be done to cut down the overhead. In this case a simple cacheing technique proves effective. The idea is to maintain a table of values of the function that is filled in whenever the function is applied. If the function is called on an argument n, the table is consulted to see whether the value has already been computed; if so, it is simply returned. If not, we compute the value and store it in the table for future use. This ensures that no redundant computations are performed.

We will maintain the table as an array so that its entries can be accessed in constant time. The penalty is that the array has a fixed size, so we can only record the values of the function at some predetermined set of arguments. Once we exceed the bounds of the table, we must compute the value the “hard way”. An alternative is to use a dictionary (e.g., a balanced binary search tree) which has no a priori size limitation, but which takes logarithmic time to perform a lookup. For simplicity we'll use a solution based on arrays.

Here's the code to implement a memoized version of the parenthesization function:

    local
      val limit = 100
      val memopad = Array.array (100, NONE)
    in
      fun p' 1 = 1
        | p' n = sum (fn k => (p k) * (p (n-k))) (n-1)
      and p n =
        if n < limit then
          case Array.sub (memopad, n) of
            SOME r => r
          | NONE =>
              let
                val r = p' n
              in
                Array.update (memopad, n, SOME r);
                r
              end
        else
          p' n
    end

The main idea is to modify the original definition so that the recursive calls consult and update the memopad. The “exported” version of the function is the one that refers to the memo pad. Notice that the definitions of p and p' are mutually recursive!

31.2 Laziness

Lazy evaluation is a combination of delayed evaluation and memoization. Delayed evaluation is implemented using thunks, functions of type unit -> 'a. To delay the evaluation of an expression exp of type 'a, simply write fn () => exp. This is a value of type unit -> 'a; the expression exp is effectively “frozen” until the function is applied. To “thaw” the expression, simply apply the thunk to the null tuple, (). Here's a simple example:

    val thunk = fn () => print "hello"   (* nothing printed *)
    val _ = thunk ()                     (* prints hello *)

While this example is especially simple-minded, remarkable effects can be achieved by combining delayed evaluation with memoization. To do so, we will consider the following signature of suspensions:

    signature SUSP =
    sig
      type 'a susp
      val force : 'a susp -> 'a
      val delay : (unit -> 'a) -> 'a susp
    end

The function delay takes a suspended computation (in the form of a thunk) and yields a suspension. Its job is to “memoize” the suspension
so that the suspended computation is evaluated at most once: once the result is computed, the value is stored in a reference cell so that subsequent forces are fast. The implementation is slick. Here's the code to do it:

    structure Susp :> SUSP =
    struct
      type 'a susp = unit -> 'a
      fun force t = t ()
      fun delay (t : 'a susp) =
        let
          exception Impossible
          val memo : 'a susp ref = ref (fn () => raise Impossible)
          fun t' () =
            let val r = t () in memo := (fn () => r); r end
        in
          memo := t';
          fn () => (!memo)()
        end
    end

It's worth discussing this code in detail because it is rather tricky. Suspensions are just thunks; force simply applies the suspension to the null tuple to force its evaluation. What about delay? When applied, delay allocates a reference cell containing a thunk that, if forced, raises an internal exception. This can never happen for reasons that will become apparent in a moment; it is merely a placeholder with which we initialize the reference cell. We then define another thunk t' that, when forced, does three things:

1. It forces the thunk t to obtain its value r.
2. It replaces the contents of the memopad with the constant function that immediately returns r.
3. It returns r as result.

We then assign t' to the memo pad (hence obliterating the placeholder), and return a thunk dt that, when forced, simply forces the contents of the memo pad. Whenever dt is forced, it immediately forces the contents of
the memo pad. However, the contents of the memo pad change as a result of forcing it, so that subsequent forces exhibit different behavior. Specifically, the first time dt is forced, it forces the thunk t', which then forces t to obtain its value r, “zaps” the memo pad, and returns r. The second time dt is forced, it forces the contents of the memo pad, as before, but this time it contains the constant function that immediately returns r. Altogether we have ensured that t is forced at most once by using a form of “self-modifying” code.

Here's an example to illustrate the effect of delaying a thunk:

    val t = Susp.delay (fn () => print "hello")
    val _ = Susp.force t   (* prints hello *)
    val _ = Susp.force t   (* silent *)

Notice that hello is printed once, not twice! The reason is that the suspended computation is evaluated at most once, so the message is printed at most once on the screen.

31.3 Lazy Data Types in SML/NJ

The lazy datatype declaration (please see Chapter 15 for a description of the SML/NJ lazy data type mechanism)

    datatype lazy 'a stream = Cons of 'a * 'a stream

expands into the following pair of type declarations

    datatype 'a stream! = Cons of 'a * 'a stream
    withtype 'a stream = 'a stream! Susp.susp

The first defines the type of stream values, the result of forcing a stream computation; the second defines the type of stream computations, which are suspensions yielding stream values. Thus streams are represented by suspended (unevaluated, memoized) computations of stream values, which are formed by applying the constructor Cons to a value and another stream.

The value constructor Cons, when used to build a stream, automatically suspends computation. This is achieved by regarding Cons e as shorthand for Susp.delay (fn () => Cons e). When used in a pattern, the value constructor Cons induces a use of force. For example, the binding
    val Cons (h, t) = e

becomes

    val Cons (h, t) = Susp.force e

which forces the right-hand side before performing pattern matching.

A similar transformation applies to non-lazy function definitions: the argument is forced before pattern matching commences. Thus the “eager” tail function

    fun stl (Cons (_, t)) = t

expands into

    fun stl! (Cons (_, t)) = t
    and stl s = stl! (Susp.force s)

which forces the argument as soon as it is applied. On the other hand, lazy function definitions defer pattern matching until the result is forced. Thus the lazy tail function

    fun lazy lstl (Cons (_, t)) = t

expands into

    fun lstl! (Cons (_, t)) = t
    and lstl s = Susp.delay (fn () => lstl! (Susp.force s))

which yields a suspension that, when forced, performs the pattern match.

Finally, the recursive stream definition

    val rec lazy ones = Cons (1, ones)

expands into the following recursive function definition:

    val rec ones = Susp.delay (fn () => Cons (1, ones))

Unfortunately this is not quite legal in SML since the right-hand side involves an application of a function to another function. This can either be provided by extending SML to admit such definitions, or by extending the Susp package to include an operation for building recursive suspensions such as this one. Since it is an interesting exercise in itself, we'll explore the latter alternative.
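The expansions described in this section can also be written out by hand in plain Standard ML, without the SML/NJ lazy extension. The following is a minimal sketch under some assumptions: the type name cell stands in for stream! (which is not a legal identifier in plain SML), and from, shd, and the counter evaluated are hypothetical names introduced only for illustration. The counter records how many stream cells have actually been computed, which makes the effect of memoization visible.

```sml
(* The suspension package from the text, sealed with an inline signature. *)
structure Susp :> sig
  type 'a susp
  val force : 'a susp -> 'a
  val delay : (unit -> 'a) -> 'a susp
end = struct
  type 'a susp = unit -> 'a
  fun force t = t ()
  fun delay (t : 'a susp) =
    let
      exception Impossible
      val memo : 'a susp ref = ref (fn () => raise Impossible)
      fun t' () =
        let val r = t () in memo := (fn () => r); r end
    in
      memo := t';
      fn () => (!memo) ()
    end
end

(* Hand-written counterpart of "datatype lazy 'a stream". *)
datatype 'a cell = Cons of 'a * 'a stream
withtype 'a stream = 'a cell Susp.susp

(* Counts how many cells have been computed so far. *)
val evaluated = ref 0

(* The stream n, n+1, n+2, ...; each cell is built only when forced. *)
fun from n =
  Susp.delay (fn () =>
    (evaluated := !evaluated + 1; Cons (n, from (n + 1))))

(* Eager head and tail: force the argument, then pattern match. *)
fun shd s = case Susp.force s of Cons (h, _) => h
fun stl s = case Susp.force s of Cons (_, t) => t

val s = from 0
val a = shd (stl s)      (* 1; computes the first two cells *)
val b = shd (stl s)      (* 1; both cells are already memoized *)
val count = !evaluated   (* 2 *)
```

Forcing the same prefix of the stream a second time does not increase the counter, since each suspension runs its thunk at most once.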
31.4 Recursive Suspensions

We seek to add a function to the Susp package with signature

    val loopback : ('a susp -> 'a susp) -> 'a susp

that, when applied to a function f mapping suspensions to suspensions, yields a suspension s whose behavior is the same as f(s), the application of f to the resulting suspension. In the above example the function in question is

    fun ones_loop s = Susp.delay (fn () => Cons (1, s))

We use loopback to define ones as follows:

    val ones = Susp.loopback ones_loop

The idea is that ones should be equivalent to Susp.delay (fn () => Cons (1, ones)), as in the original definition and which is the result of evaluating Susp.loopback ones_loop, assuming Susp.loopback is implemented properly.

How is loopback implemented? We use a technique known as backpatching. Here's the code

    fun loopback f =
      let
        exception Circular
        val r = ref (fn () => raise Circular)
        val t = fn () => (!r)()
      in
        r := f t ; t
      end

First we allocate a reference cell which is initialized to a placeholder that, if forced, raises the exception Circular. Then we define a thunk that, when forced, forces the contents of this reference cell. This will be the return value of loopback. But before returning, we assign to the reference cell the result of applying the given function to the result thunk. This “ties the knot” to ensure that the output is “looped back” to the input. Observe that if the loop function touches its input suspension before yielding an output suspension, the exception Circular will be raised.

31.5 Sample Code

Here is the code for this chapter.
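As one self-contained illustration of the backpatching technique of Section 31.4 (a sketch under stated assumptions, not the chapter's own sample code), the following adds loopback to a copy of the Susp structure and uses it to build the stream of ones. The inline signature, the type name cell (standing in for stream!), and the names takeS and raisedCircular are assumptions made for this example.

```sml
structure Susp :> sig
  type 'a susp
  val force : 'a susp -> 'a
  val delay : (unit -> 'a) -> 'a susp
  val loopback : ('a susp -> 'a susp) -> 'a susp
end = struct
  type 'a susp = unit -> 'a
  fun force t = t ()
  fun delay (t : 'a susp) =
    let
      exception Impossible
      val memo : 'a susp ref = ref (fn () => raise Impossible)
      fun t' () =
        let val r = t () in memo := (fn () => r); r end
    in
      memo := t';
      fn () => (!memo) ()
    end
  (* Backpatching: allocate a placeholder, then overwrite it with f t. *)
  fun loopback f =
    let
      exception Circular
      val r = ref (fn () => raise Circular)
      val t = fn () => (!r) ()
    in
      r := f t; t
    end
end

datatype 'a cell = Cons of 'a * 'a stream
withtype 'a stream = 'a cell Susp.susp

(* The loop does not force its argument; it only stores it in the cell. *)
fun ones_loop s = Susp.delay (fn () => Cons (1, s))
val ones : int stream = Susp.loopback ones_loop

(* First n elements of a stream, as a list. *)
fun takeS 0 _ = nil
  | takeS n s =
      (case Susp.force s of Cons (h, t) => h :: takeS (n - 1) t)

val firstThree = takeS 3 ones   (* [1, 1, 1] *)

(* A loop that touches its input before producing output hits the
   placeholder, so the internal exception escapes from loopback. *)
val raisedCircular =
  (Susp.force (Susp.loopback (fn (s : int Susp.susp) => (Susp.force s; s)));
   false)
  handle _ => true
```

The second binding shows the failure mode described in the text: forcing the input suspension inside the loop function runs the placeholder thunk before it has been overwritten.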
Chapter 32

Data Abstraction

An abstract data type (ADT) is a type equipped with a set of operations for manipulating values of that type. An ADT is implemented by providing a representation type for the values of the ADT and an implementation for the operations defined on values of the representation type. What makes an ADT abstract is that the representation type is hidden from clients of the ADT. Consequently, the only operations that may be performed on a value of the ADT are the given ones. This ensures that the representation may be changed without affecting the behavior of the client: since the representation is hidden from it, the client cannot depend on it.

This also facilitates the implementation of efficient data structures by imposing a condition, called a representation invariant, on the representation that is preserved by the operations of the type. Each operation that takes a value of the ADT as argument may assume that the representation invariant holds. In compensation each operation that yields a value of the ADT as result must guarantee that the representation invariant holds of it. If the operations of the ADT preserve the representation invariant, then it must truly be invariant: no other code in the system could possibly disrupt it. Put another way, any violation of the representation invariant may be localized to the implementation of one of the operations. This significantly reduces the time required to find an error in a program.
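To make the idea of a representation invariant concrete before turning to dictionaries, here is a minimal sketch of a different, deliberately tiny ADT: finite sets of integers represented as strictly increasing lists. The signature INT_SET, the structure IntSet, and all of their operations are hypothetical names chosen for this illustration; they do not appear elsewhere in the text.

```sml
signature INT_SET =
sig
  type set
  val empty : set
  val insert : int * set -> set
  val member : int * set -> bool
  val toList : set -> int list
end

structure IntSet :> INT_SET =
struct
  (* Rep invariant: the list is strictly increasing (sorted, no duplicates). *)
  type set = int list

  (* The empty list trivially satisfies the invariant. *)
  val empty = nil

  (* insert assumes the invariant of its argument and re-establishes it. *)
  fun insert (x, nil) = [x]
    | insert (x, s as y :: ys) =
        if x < y then x :: s
        else if x = y then s
        else y :: insert (x, ys)

  (* member relies on the invariant to stop as soon as x < y. *)
  fun member (x, nil) = false
    | member (x, y :: ys) = x = y orelse (x > y andalso member (x, ys))

  fun toList s = s
end

val s = IntSet.insert (3, IntSet.insert (1, IntSet.insert (2,
          IntSet.insert (3, IntSet.empty))))
val elems = IntSet.toList s        (* [1, 2, 3] *)
val has2 = IntSet.member (2, s)    (* true *)
val has4 = IntSet.member (4, s)    (* false *)
```

Because the structure is sealed with :>, a client cannot pass an arbitrary int list where a set is expected, so the only way to obtain a set is through empty and insert, both of which maintain the invariant that member depends on.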
32.1 Dictionaries

To make these ideas concrete we will consider the abstract data type of dictionaries. A dictionary is a mapping from keys to values. For simplicity we take keys to be strings, but it is possible to define a dictionary for any ordered type; the values associated with keys are completely arbitrary. Viewed as an ADT, a dictionary is a type 'a dict of dictionaries mapping strings to values of type 'a together with empty, insert, and lookup operations that create a new dictionary, insert a value with a given key, and retrieve the value associated with a key (if any). In short a dictionary is an implementation of the following signature:

    signature DICT =
    sig
      type key = string
      type 'a entry = key * 'a
      type 'a dict
      exception Lookup of key
      val empty : 'a dict
      val insert : 'a dict * 'a entry -> 'a dict
      val lookup : 'a dict * key -> 'a
    end

Notice that the type 'a dict is not specified in the signature, whereas the types key and 'a entry are defined to be string and string * 'a, respectively.

32.2 Binary Search Trees

A simple implementation of a dictionary is a binary search tree. A binary search tree is a binary tree with values of an ordered type at the nodes arranged in such a way that for every node in the tree, the value at that node is greater than the value at any node in the left child of that node, and smaller than the value at any node in the right child. It follows immediately that no two nodes in a binary search tree are labelled with the same value. The binary search tree property is an example of a representation invariant on an underlying data structure. The underlying structure is a
binary tree with values at the nodes; the representation invariant isolates a set of structures satisfying some additional, more stringent, conditions.

We may use a binary search tree to implement a dictionary as follows:

    structure BinarySearchTree :> DICT =
    struct
      type key = string
      type 'a entry = key * 'a

      (* Rep invariant: 'a tree is a binary search tree *)
      datatype 'a tree = Empty | Node of 'a tree * 'a entry * 'a tree
      type 'a dict = 'a tree

      exception Lookup of key

      val empty = Empty

      fun insert (Empty, entry) = Node (Empty, entry, Empty)
        | insert (n as Node (l, e as (k, _), r), e' as (k', _)) =
          (case String.compare (k', k)
             of LESS => Node (insert (l, e'), e, r)
              | GREATER => Node (l, e, insert (r, e'))
              | EQUAL => n)

      fun lookup (Empty, k) = raise (Lookup k)
        | lookup (Node (l, (k, v), r), k') =
          (case String.compare (k', k)
             of EQUAL => v
              | LESS => lookup (l, k')
              | GREATER => lookup (r, k'))
    end

Notice that empty is defined to be a valid binary search tree, that insert yields a binary search tree if its argument is one, and that lookup relies on its argument being a binary search tree (if not, it might fail to find a key that in fact occurs in the tree!). The structure BinarySearchTree is sealed with the signature DICT to ensure that the representation type is held abstract.
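A short client session may help show how the sealed structure is used. The sketch below repeats the DICT signature and the BinarySearchTree structure from the text so that it is self-contained; the bindings d, one, two, and missing are hypothetical names introduced for the example.

```sml
signature DICT =
sig
  type key = string
  type 'a entry = key * 'a
  type 'a dict
  exception Lookup of key
  val empty : 'a dict
  val insert : 'a dict * 'a entry -> 'a dict
  val lookup : 'a dict * key -> 'a
end

structure BinarySearchTree :> DICT =
struct
  type key = string
  type 'a entry = key * 'a
  (* Rep invariant: 'a tree is a binary search tree *)
  datatype 'a tree = Empty | Node of 'a tree * 'a entry * 'a tree
  type 'a dict = 'a tree
  exception Lookup of key
  val empty = Empty
  fun insert (Empty, entry) = Node (Empty, entry, Empty)
    | insert (n as Node (l, e as (k, _), r), e' as (k', _)) =
      (case String.compare (k', k)
         of LESS => Node (insert (l, e'), e, r)
          | GREATER => Node (l, e, insert (r, e'))
          | EQUAL => n)
  fun lookup (Empty, k) = raise (Lookup k)
    | lookup (Node (l, (k, v), r), k') =
      (case String.compare (k', k)
         of EQUAL => v
          | LESS => lookup (l, k')
          | GREATER => lookup (r, k'))
end

(* Clients build dictionaries only through empty and insert. *)
val d : int BinarySearchTree.dict =
  BinarySearchTree.insert
    (BinarySearchTree.insert (BinarySearchTree.empty, ("b", 2)), ("a", 1))

val one = BinarySearchTree.lookup (d, "a")   (* 1 *)
val two = BinarySearchTree.lookup (d, "b")   (* 2 *)

(* Looking up an absent key raises Lookup carrying that key. *)
val missing =
  (BinarySearchTree.lookup (d, "z"); false)
  handle BinarySearchTree.Lookup k => k = "z"
```

Because of the opaque ascription, the constructors Empty and Node are not visible outside the structure, so a client has no means of building a tree that violates the binary search tree invariant.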
32.3 Balanced Binary Search Trees

The difficulty with binary search trees is that they may become unbalanced. In particular if we insert keys in ascending order, the representation is essentially just a list! The left child of each node is empty; the right child is the rest of the dictionary. Consequently, it takes O(n) time in the worst case to perform a lookup on a dictionary containing n elements. Such a tree is said to be unbalanced because the children of a node have widely varying heights. Were it to be the case that the children of every node had roughly equal height, then the lookup would take O(lg n) time, a considerable improvement.

Can we do better? Many approaches have been suggested. One that we will consider here is an instance of what is called a self-adjusting tree, called a red-black tree (the reason for the name will be apparent shortly). The general idea of a self-adjusting tree is that operations on the tree may cause a reorganization of its structure to ensure that some invariant is maintained. In our case we will arrange things so that the tree is self-balancing, meaning that the children of any node have roughly the same height. As we just remarked, this ensures that lookup is efficient.

How is this achieved? By imposing a clever representation invariant on the binary search tree, called the red-black tree condition. A red-black tree is a binary search tree in which every node is colored either red or black (with the empty tree being regarded as black) and such that the following properties hold:

1. The children of a red node are black.
2. For any node in the tree, the number of black nodes on any two paths from that node to a leaf is the same. This number is called the black height of the node.

These two conditions ensure that a red-black tree is a balanced binary search tree. Here's why. First, observe that a red-black tree of black height h has at least 2^h - 1 nodes. We may prove this by induction on the structure of the red-black tree. The empty tree has black height 1 (since we consider it to be black); if we count the empty tree itself as a single node, it has 1 node, which is at least 2^1 - 1, as required. Suppose we have a red node. The black height of both children must be h, hence each has at least 2^h - 1 nodes, yielding a total of at least 2 × (2^h - 1) + 1 = 2^(h+1) - 1 nodes, which is at least 2^h - 1. If, on the other hand, we have a black node, then the black
height of both children is h - 1, and each has at least 2^(h-1) - 1 nodes, for a total of at least 2 × (2^(h-1) - 1) + 1 = 2^h - 1 nodes. Now, observe that a red-black tree of height h with n nodes has black height at least h/2, and hence has at least 2^(h/2) - 1 nodes. Consequently, lg(n + 1) ≥ h/2, so h ≤ 2 × lg(n + 1). In other words, its height is logarithmic in the number of nodes, which implies that the tree is height balanced.

To ensure logarithmic behavior, all we have to do is to maintain the red-black invariant. The empty tree is a red-black tree, so the only question is how to perform an insert operation. First, we insert the entry as usual for a binary search tree, with the fresh node starting out colored red. In doing so we do not disturb the black height condition, but we might introduce a red-red violation, a situation in which a red node has a red child. We then remove the red-red violation by propagating it upwards towards the root by a constant-time transformation on the tree (one of several possibilities, which we'll discuss shortly). These transformations either eliminate the red-red violation outright, or, in logarithmic time, push the violation to the root where it is neatly resolved by recoloring the root black (which preserves the black-height invariant!).

The violation is propagated upwards by one of four rotations. We will maintain the invariant that there is at most one red-red violation in the tree. The insertion may or may not create such a violation, and each propagation step will preserve this invariant. It follows that the parent of a red-red violation must be black. Consequently, the situation must look like this.

    [Diagram: a black node with a red child that itself has a red child.]

This diagram represents four distinct situations, according to whether the uppermost red node is a left or right child of the black node, and whether the red child of the red node is itself a left or right child. In each case the red-red violation is propagated upwards by transforming it to look like this.

    [Diagram: a red node with two black children.]

Notice that by making the uppermost node red we may be introducing a red-red violation further up the tree (since the black node's parent might have been red), and that we are preserving the black-height invariant since the great-grandchildren of the black node in the original situation will appear as children of the two black nodes in the reorganized situation. Notice as well that the binary search tree conditions are also preserved by this transformation. As a limiting case if the red-red violation is propagated to the root of the entire tree, we recolor the root black, which preserves the black-height condition, and we are done rebalancing the tree.

Let's look in detail at two of the four cases of removing a red-red violation, those in which the uppermost red node is the left child of the black node; the other two cases are handled symmetrically. If the situation looks like this,

    [Diagram, as a term of the datatype below: Black (z, Red (y, Red (x, d1, d2), d3), d4)]

we reorganize the tree to look like this.

    [Diagram, as a term: Red (y, Black (x, d1, d2), Black (z, d3, d4))]

You should check that the black-height and binary search tree invariants are preserved by this transformation. Similarly, if the situation looks like this,

    [Diagram, as a term: Black (z, Red (x, d1, Red (y, d2, d3)), d4)]

then we reorganize the tree to look like this (precisely as before).

    [Diagram, as a term: Red (y, Black (x, d1, d2), Black (z, d3, d4))]

Once again, the black-height and binary search tree invariants are preserved by this transformation, and the red-red violation is pushed further up the tree.

Here is the ML code to implement dictionaries using a red-black tree. Notice that the tree rotations are neatly expressed using pattern matching.

    structure RedBlackTree :> DICT =
    struct
      type key = string
      type 'a entry = string * 'a

      (* Inv: binary search tree + red-black conditions *)
      datatype 'a dict =
          Empty
        | Red of 'a entry * 'a dict * 'a dict
        | Black of 'a entry * 'a dict * 'a dict

      val empty = Empty

      exception Lookup of key

      fun lookup (dict, key) =
        let
          fun lk (Empty) = raise (Lookup key)
            | lk (Red tree) = lk' tree
            | lk (Black tree) = lk' tree
          and lk' ((key1, datum1), left, right) =
            (case String.compare(key,key1)
               of EQUAL => datum1
                | LESS => lk left
                | GREATER => lk right)
        in
          lk dict
        end

      fun restoreLeft
            (Black (z, Red (y, Red (x, d1, d2), d3), d4)) =
            Red (y, Black (x, d1, d2), Black (z, d3, d4))
        | restoreLeft
            (Black (z, Red (x, d1, Red (y, d2, d3)), d4)) =
            Red (y, Black (x, d1, d2), Black (z, d3, d4))
        | restoreLeft dict = dict

      fun restoreRight
            (Black (x, d1, Red (y, d2, Red (z, d3, d4)))) =
            Red (y, Black (x, d1, d2), Black (z, d3, d4))
        | restoreRight
            (Black (x, d1, Red (z, Red (y, d2, d3), d4))) =
            Red (y, Black (x, d1, d2), Black (z, d3, d4))
        | restoreRight dict = dict

      fun insert (dict, entry as (key, datum)) =
        let
          (* val ins : 'a dict -> 'a dict  insert entry *)
          (* ins (Red _) may have red-red at root *)
          (* ins (Black _) or ins (Empty) is red/black *)
          (* ins preserves black height *)
          fun ins (Empty) = Red (entry, Empty, Empty)
            | ins (Red (entry1 as (key1, datum1), left, right)) =
              (case String.compare (key, key1)
                 of EQUAL => Red (entry, left, right)
                  | LESS => Red (entry1, ins left, right)
                  | GREATER => Red (entry1, left, ins right))
            | ins (Black (entry1 as (key1, datum1), left, right)) =
              (case String.compare (key, key1)
                 of EQUAL => Black (entry, left, right)
                  | LESS => restoreLeft (Black (entry1, ins left, right))
                  | GREATER => restoreRight (Black (entry1, left, ins right)))
        in
          case ins dict
            of Red (t as (_, Red _, _)) => Black t   (* re-color *)
             | Red (t as (_, _, Red _)) => Black t   (* re-color *)
             | dict => dict
        end
    end

It is worthwhile to contemplate the role played by the red-black invariant in ensuring the correctness of the implementation and the time complexity of the operations.
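One way to contemplate the invariant is to write it down as a function. The sketch below checks the two red-black conditions on a standalone tree type; it is an illustration only, under these assumptions: the datatype rbt mirrors the shape of the dict datatype above but is not abstract, it carries bare values rather than entries, and the names isRed, check, good, redRed, and lopsided are hypothetical. The binary search tree ordering is not checked here.

```sml
(* A transparent copy of the tree shape: (value, left child, right child). *)
datatype 'a rbt =
    Empty
  | Red of 'a * 'a rbt * 'a rbt
  | Black of 'a * 'a rbt * 'a rbt

fun isRed (Red _) = true
  | isRed _ = false

(* check t = SOME h if t satisfies both red-black conditions and has
   black height h (the empty tree counts as black, with height 1);
   check t = NONE if some red node has a red child or if two paths
   from a node to a leaf contain different numbers of black nodes. *)
fun check Empty = SOME 1
  | check (Red (_, l, r)) =
      if isRed l orelse isRed r then NONE
      else
        (case (check l, check r)
           of (SOME hl, SOME hr) => if hl = hr then SOME hl else NONE
            | _ => NONE)
  | check (Black (_, l, r)) =
      (case (check l, check r)
         of (SOME hl, SOME hr) => if hl = hr then SOME (hl + 1) else NONE
          | _ => NONE)

(* A well-formed tree: black root with two red children. *)
val good = Black (2, Red (1, Empty, Empty), Red (3, Empty, Empty))

(* A red-red violation at the root. *)
val redRed = Red (2, Red (1, Empty, Empty), Empty)

(* Unequal black heights on the two sides of the root. *)
val lopsided = Black (2, Black (1, Empty, Empty), Empty)

val goodResult = check good           (* SOME 2 *)
val redRedResult = check redRed       (* NONE *)
val lopsidedResult = check lopsided   (* NONE *)
```

Note that check visits every node, so it takes time proportional to the size of the tree, whereas an insertion takes only logarithmic time.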
32.4 Abstraction vs. Run-Time Checking

You might wonder whether we could equally well use run-time checks to enforce representation invariants. The idea would be to introduce a “debug flag” that, when set, causes the operations of the dictionary to check that the representation invariant holds of their arguments and results. In the case of a binary search tree this is surely possible, but at considerable expense, since the time required to check the binary search tree invariant is proportional to the size of the binary search tree itself, whereas an insert (for example) can be performed in logarithmic time. But wouldn’t we turn off the debug flag before shipping the production copy of the code? Yes, indeed, but then the benefits of checking are lost for the code we care about most! By using the type system to enforce abstraction, we can confine the possible violations of the representation invariant to the dictionary package itself, and, moreover, we need not turn off the check for production code because there is no run-time penalty for doing so.

A more subtle point is that it may not always be possible to enforce data abstraction at run-time. Efficiency considerations aside, you might think that we can always replace static localization of representation errors by dynamic checks for violations of them. But this is false! One reason is that the representation invariant might not be computable. As an example, consider an abstract type of total functions on the integers, those that are guaranteed to terminate when called, without performing any I/O or having any other computational effect. No run-time check can be defined that ensures that a given integer-valued function is total. Yet we can define an abstract type of total functions that, while not admitting every possible total function on the integers as values, provides a useful set of such functions as elements of a structure. By using these specified operations to create a total function, we are in effect encoding a proof of totality in the program itself.

Here’s a sketch of such a package:

    signature TIF =
    sig
      type tif
      val apply : tif -> (int -> int)
      val id : tif
      val compose : tif * tif -> tif
      val double : tif
      ...
    end

    structure Tif :> TIF =
    struct
      type tif = int->int
      fun apply t n = t n
      fun id x = x
      fun compose (f, g) = f o g
      fun double x = 2 * x
      ...
    end

Should the application of some value of type Tif.tif fail to terminate, we know where to look for the error. No run-time check can assure us that an arbitrary integer function is in fact total.

Another reason why a run-time check to enforce data abstraction is impossible is that it may not be possible to tell from looking at a given value whether or not it is a legitimate value of the abstract type. Here’s an example. In many operating systems processes are “named” by integer-valued process identifiers. Using the process identifier we may send messages to the process, cause it to terminate, or perform any number of other operations on it. The thing to notice here is that any integer at all is a possible process identifier; we cannot tell by looking at the integer whether it is indeed valid. No run-time check on the value will reveal whether a given integer is a “real” or “bogus” process identifier. The only way to know is to consider the “history” of how that integer came into being, and what operations were performed on it. Using the abstraction mechanisms just described, we can enforce the requirement that a value of type pid, whose underlying representation is int, is indeed a process identifier. You are invited to imagine how this might be achieved in ML.

32.5 Sample Code

Here is the code for this chapter.
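As one way of taking up the invitation at the end of Section 32.4, the following is a minimal sketch, offered as an illustration only, of an abstract type of process identifiers. It assumes a simulated process table held in a reference cell rather than a real operating system interface, and the names PID, Pid, spawn, kill, and isAlive are all hypothetical.

```sml
signature PID =
sig
  type pid                     (* abstract: clients cannot forge a pid *)
  val spawn : unit -> pid      (* hypothetical: create a new "process" *)
  val kill : pid -> unit
  val isAlive : pid -> bool
end

structure Pid :> PID =
struct
  (* The representation is int, but only this structure knows it. *)
  type pid = int

  (* Simulated process table: the next unused identifier and the
     identifiers of the processes that are currently live. *)
  val next = ref 0
  val live : int list ref = ref []

  fun spawn () =
    let
      val p = !next
    in
      next := p + 1;
      live := p :: !live;
      p
    end

  fun kill p = live := List.filter (fn q => q <> p) (!live)

  fun isAlive p = List.exists (fn q => q = p) (!live)
end

val p1 = Pid.spawn ()
val p2 = Pid.spawn ()
val () = Pid.kill p1
(* Pid.kill 0 is rejected by the type checker: 0 has type int, not Pid.pid. *)
```

Because the ascription is opaque, every value of type Pid.pid must have come from Pid.spawn, so its “history” is guaranteed by the type rather than by inspecting the integer.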
Chapter 33
Representation Independence and ADT Correctness

This chapter is concerned with proving correctness of ADT implementations by exhibiting a simulation relation between a reference implementation (taken, or known, to be correct) and a candidate implementation (whose correctness is to be established). The methodology generalizes Hoare’s notion of abstraction functions to an arbitrary relation, and relies on Reynolds’ notion of parametricity to conclude that related implementations engender the same observable behavior in all clients.

33.1 Sample Code

Here is the code for this chapter.
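To give a feel for what a simulation relation looks like, here is a small sketch that is not taken from the chapter’s sample code. It assumes a hypothetical QUEUE signature, a one-list reference implementation, and a two-list candidate; the relation says that a reference queue l corresponds to a candidate (front, back) exactly when l equals front followed by the reversal of back. The structures are left unascribed so that the relation can inspect both representations.

```sml
(* Hypothetical interface shared by both implementations. *)
signature QUEUE =
sig
  type 'a queue
  val empty : 'a queue
  val insert : 'a * 'a queue -> 'a queue
  val remove : 'a queue -> ('a * 'a queue) option
end

(* Reference implementation: one list, oldest element first. *)
structure RefQueue =
struct
  type 'a queue = 'a list
  val empty = []
  fun insert (x, q) = q @ [x]
  fun remove [] = NONE
    | remove (x :: q) = SOME (x, q)
end

(* Candidate implementation: a front list and a reversed back list. *)
structure TwoListQueue =
struct
  type 'a queue = 'a list * 'a list
  val empty = ([], [])
  fun insert (x, (front, back)) = (front, x :: back)
  fun remove ([], []) = NONE
    | remove ([], back) = remove (rev back, [])
    | remove (x :: front, back) = SOME (x, (front, back))
end

(* Both structures match QUEUE. *)
structure CheckRef : QUEUE = RefQueue
structure CheckTwo : QUEUE = TwoListQueue

(* The simulation relation, specialized to integer queues. *)
fun related (l : int list, (front, back) : int list * int list) =
  (l = front @ rev back)

(* Related inputs are taken to related outputs by each operation. *)
val r1 = RefQueue.insert (2, RefQueue.insert (1, RefQueue.empty))
val c1 = TwoListQueue.insert (2, TwoListQueue.insert (1, TwoListQueue.empty))
```

A proof of correctness would show that empty is related to empty and that insert and remove preserve the relation; the bindings r1 and c1 merely exercise one instance of that argument.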
Chapter 34
Modularity and Reuse

1. Naming conventions.
2. Exploiting structural subtyping (type t convention).
3. Impedance-matching functors.

34.1 Sample Code

Here is the code for this chapter.
Chapter 35
Dynamic Typing and Dynamic Dispatch

This chapter is concerned with dynamic typing in a statically typed language. It is commonly thought that there is an “opposition” between statically-typed languages (such as Standard ML) and dynamically-typed languages (such as Scheme). In fact, dynamically typed languages are a special case of statically-typed languages! We will demonstrate this by exhibiting a faithful representation of Scheme inside of ML.

35.1 Sample Code

Here is the code for this chapter.
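The idea can be previewed with a small sketch, which is an illustration under stated assumptions rather than the chapter’s own representation of Scheme. It assumes a single datatype, here called value, whose constructors play the role of run-time tags; the names value, TypeError, add, car, and apply are hypothetical.

```sml
(* One static type; each constructor acts as a run-time tag. *)
datatype value =
    Num of int
  | Bool of bool
  | Nil
  | Cons of value * value
  | Fn of value -> value

exception TypeError of string

(* Each primitive inspects the tags of its arguments, as a dynamically
   typed language would at run time, and fails if they are wrong. *)
fun add (Num m, Num n) = Num (m + n)
  | add _ = raise TypeError "add: expected two numbers"

fun car (Cons (a, _)) = a
  | car _ = raise TypeError "car: expected a pair"

fun apply (Fn f, v) = f v
  | apply _ = raise TypeError "apply: expected a function"

(* A "dynamically typed" successor function and a heterogeneous list. *)
val succ = Fn (fn v => add (v, Num 1))
val mixed = Cons (Num 1, Cons (Bool true, Nil))
```

Every expression here has the one static type value, and what a dynamically typed language calls a type error appears as the TypeError exception, raised when a tag check fails.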
Chapter 36
Concurrency

In this chapter we consider some fundamental techniques for concurrent programming using CML.

36.1 Sample Code

Here is the code for this chapter.
Part V
Appendices
Appendix A
The Standard ML Basis Library

The Standard ML Basis Library is a collection of modules providing a basic collection of abstract types that are shared by all implementations of Standard ML. All of the primitive types of Standard ML are defined in structures in the Standard Basis. It also defines a variety of other commonly-used abstract types.

Most implementations of Standard ML include module libraries implementing a wide variety of services. These libraries are usually not portable across implementations, particularly not those that are concerned with the internals of the compiler or its interaction with the host computer system. Please refer to the documentation of your compiler for information on its libraries.
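As a brief illustration of how the primitive types are packaged as structures, the following sketch calls a few operations from the Basis structures Int, String, Char, List, and Real. The value names on the left are chosen only for this example.

```sml
(* Operations on primitive types are reached through their structures. *)
val answer = Int.toString 42                    (* int -> string *)
val len    = String.size "hello"                (* string -> int *)
val upper  = Char.toUpper #"a"                  (* char -> char *)
val order  = String.compare ("apple", "banana") (* string * string -> order *)

(* Real.fromInt converts explicitly; ML has no implicit coercions. *)
val halves = List.map (fn n => Real.fromInt n / 2.0) [1, 2, 3]
```

Since these structures belong to the Basis, code of this kind should behave the same way under any conforming implementation.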
Appendix B
Compilation Management

All program development environments provide tools to support building systems out of collections of separately-developed modules. These tools usually provide services such as:

1. Source code management such as version and revision control.
2. Separate compilation and linking to support simultaneous development and to reduce build times.
3. Libraries of re-usable modules with consistent conventions for identifying modules and their components.
4. Release management for building and disseminating systems for general use.

Different languages, and different vendors, support these activities in different ways. Some rely on generic tools, such as the familiar Unix tools; others provide proprietary tools, commonly known as IDE’s (integrated development environments).

Most implementations of Standard ML rely on a combination of generic program development tools and tools specific to that implementation of the language. Rather than attempt to summarize all of the known implementations, we will instead consider the SML/NJ Compilation Manager (CM) as a representative program development framework for ML. Other compilers provide similar tools; please consult your compiler’s documentation for details of how to use them.
B.1 Overview of CM

B.2 Building Systems with CM

B.3 Sample Code

Here is the code for this chapter.
Appendix C
Sample Programs

A number of example programs illustrating the concepts discussed in the preceding chapters are available in the Sample Code directory on the world-wide web.
Revision History

Revision  Date      Author(s)  Description
1.0       21.01.11  RH         Created
1.1       10.02.11  RH         Expanded HOF’s techniques
1.2       11.02.11  RH         Added matching combinators to higher-order function techniques
Bibliography

[1] Emden R. Gansner and John H. Reppy, editors. The Standard ML Basis Library. Cambridge University Press, 2000.

[2] Peter Lee. Standard ML at Carnegie Mellon. Available within CMU at http://www.cs.cmu.edu/afs/cs/local/sml/common/smlguide.

[3] Robin Milner, Mads Tofte, Robert Harper, and David MacQueen. The Definition of Standard ML (Revised). MIT Press, 1997.